linux-kernel.vger.kernel.org archive mirror
* Watchdog Reset on Idle CPU with a task on its runq
@ 2024-05-07 21:40 73% Vijay Balakrishna
  0 siblings, 0 replies; 200+ results
From: Vijay Balakrishna @ 2024-05-07 21:40 UTC (permalink / raw)
  To: Linux kernel mailing list, linux-arm-kernel; +Cc: Tyler Hicks, Allen Pais

Hello,

We are seeing a watchdog reset on an ARM64 SoC running the v5.10.178
(stable) kernel. CPU 0 is running the idle task even though there is a
runnable task on its CFS runqueue (rcu_sched in the output below). Why
would a task be left waiting to be scheduled on a CPU that is otherwise
idle? What does this indicate about the state of CPU 0, and what else
could we check in the kernel crash dump? Any pointers are appreciated.

Thanks,
Vijay

(crash tool output)

[530671.963762] Kernel panic - not syncing: SBSA Generic Watchdog timeout
[530671.970288] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G           O      5.10.178.13-microsoft-standard #1
[530671.980969] Hardware name: Overlake (DT)
[530671.984967] Call trace:
[530671.987499]  dump_backtrace+0x0/0x1f0
[530671.991238]  show_stack+0x1c/0x24
[530671.994630]  dump_stack+0xe0/0x13c
[530671.998107]  panic+0x198/0x3a4
[530672.001239]  sbsa_gwdt_set_timeout+0x0/0x7c
[530672.005498]  __handle_irq_event_percpu+0xf0/0x2ac
[530672.010277]  handle_irq_event+0x60/0x144
[530672.014275]  handle_fasteoi_irq+0x144/0x234
[530672.018533]  __handle_domain_irq+0x8c/0xcc
[530672.022704]  gic_handle_irq+0xc0/0x120
[530672.026527]  el1_irq+0xcc/0x180
[530672.029744]  cpuidle_enter_state+0x1fc/0x31c
[530672.034088]  cpuidle_enter+0x3c/0x50
[530672.037740]  do_idle+0x1e4/0x28c
[530672.041042]  cpu_startup_entry+0x28/0x2c
[530672.045042]  rest_init+0xc4/0xd0
[530672.048346]  arch_call_rest_init+0x14/0x1c
[530672.052517]  start_kernel+0x328/0x3a4
[530672.056267] SMP: stopping secondary CPUs
[530672.060450] Starting crashdump kernel...
[530672.064447] Bye!
crash> runq -c 0
CPU 0 RUNQUEUE: ffff07cf49233200
   CURRENT: PID: 0      TASK: ffffde8e444e8900  COMMAND: "swapper/0"
   RT PRIO_ARRAY: ffff07cf49233440
      [no tasks queued]
   CFS RB_ROOT: ffff07cf492332b0
      [120] PID: 11     TASK: ffff07ad40c10000  COMMAND: "rcu_sched"
crash> bt ffffde8e444e8900
PID: 0        TASK: ffffde8e444e8900  CPU: 0    COMMAND: "swapper/0"
  #0 [ffff800010003db0] __crash_kexec at ffffde8e4370b424
  #1 [ffff800010003e60] panic at ffffde8e4363b64c
  #2 [ffff800010003eb0] sbsa_gwdt_interrupt at ffffde8e43d92aa8
  #3 [ffff800010003ed0] __handle_irq_event_percpu at ffffde8e436b9720
  #4 [ffff800010003f40] handle_irq_event at ffffde8e436b99c4
  #5 [ffff800010003f70] handle_fasteoi_irq at ffffde8e436bff0c
  #6 [ffff800010003fa0] __handle_domain_irq at ffffde8e436b831c
  #7 [ffff800010003fe0] gic_handle_irq at ffffde8e43600974
--- <IRQ stack> ---
  #8 [ffffde8e444d3e50] el1_irq at ffffde8e43602288
  #9 [ffffde8e444d3e70] cpuidle_enter_state at ffffde8e43dd6190
#10 [ffffde8e444d3ed0] cpuidle_enter at ffffde8e43dd6314
#11 [ffffde8e444d3f10] do_idle at ffffde8e4368307c
#12 [ffffde8e444d3f70] cpu_startup_entry at ffffde8e4368314c
#13 [ffffde8e444d3f90] rest_init at ffffde8e4408d79c
#14 [ffffde8e444d3fb0] arch_call_rest_init at ffffde8e443b0730
#15 [ffffde8e444d3fe0] start_kernel at ffffde8e443b0a60
crash>

* [PATCH 0/1] Convert tasklets to BH workqueues in ethernet drivers
@ 2024-05-07 19:01 55% Allen Pais
  2024-05-07 19:01  7% ` [PATCH 1/1] [RFC] ethernet: Convert from tasklet to BH workqueue Allen Pais
  0 siblings, 1 reply; 200+ results
From: Allen Pais @ 2024-05-07 19:01 UTC (permalink / raw)
  To: netdev
  Cc: jes, davem, edumazet, kuba, pabeni, kda, cai.huoqing, dougmill,
	npiggin, christophe.leroy, aneesh.kumar, naveen.n.rao, nnac123,
	tlfalcon, cooldavid, marcin.s.wojtas, linux, mlindner, stephen,
	nbd, sean.wang, Mark-MC.Lee, lorenzo, matthias.bgg,
	angelogioacchino.delregno, borisp, bryan.whitehead,
	UNGLinuxDriver, louis.peens, richardcochran, linux-rdma,
	linux-kernel, linux-acenic, linux-arm-kernel, linuxppc-dev,
	linux-mediatek, oss-drivers, linux-net-drivers

This series focuses on converting the existing implementation of
tasklets to bottom half (BH) workqueues across various Ethernet
drivers under drivers/net/ethernet/*.

Impact:
 The conversion is expected to maintain or improve the performance
of the affected drivers. It also improves the maintainability and
readability of the driver code.

Testing:
 - Conducted standard network throughput and latency benchmarks
   to ensure performance parity or improvement.
 - Ran kernel regression tests to verify that changes do not introduce new issues.

I appreciate your review and feedback on this patch series.
Additional testing would be really helpful.

Allen Pais (1):
  [RFC] ethernet: Convert from tasklet to BH workqueue

 drivers/infiniband/hw/mlx4/cq.c               |  2 +-
 drivers/infiniband/hw/mlx5/cq.c               |  2 +-
 drivers/net/ethernet/alteon/acenic.c          | 26 +++----
 drivers/net/ethernet/alteon/acenic.h          |  7 +-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c      | 30 ++++----
 drivers/net/ethernet/amd/xgbe/xgbe-i2c.c      | 16 ++---
 drivers/net/ethernet/amd/xgbe/xgbe-mdio.c     | 16 ++---
 drivers/net/ethernet/amd/xgbe/xgbe-pci.c      |  4 +-
 drivers/net/ethernet/amd/xgbe/xgbe.h          | 11 +--
 drivers/net/ethernet/broadcom/cnic.c          | 19 ++---
 drivers/net/ethernet/broadcom/cnic.h          |  2 +-
 drivers/net/ethernet/cadence/macb.h           |  3 +-
 drivers/net/ethernet/cadence/macb_main.c      | 10 +--
 .../net/ethernet/cavium/liquidio/lio_core.c   |  4 +-
 .../net/ethernet/cavium/liquidio/lio_main.c   | 25 +++----
 .../ethernet/cavium/liquidio/lio_vf_main.c    | 10 +--
 .../ethernet/cavium/liquidio/octeon_droq.c    |  4 +-
 .../ethernet/cavium/liquidio/octeon_main.h    |  5 +-
 .../net/ethernet/cavium/octeon/octeon_mgmt.c  | 12 ++--
 drivers/net/ethernet/cavium/thunder/nic.h     |  5 +-
 .../net/ethernet/cavium/thunder/nicvf_main.c  | 24 +++----
 .../ethernet/cavium/thunder/nicvf_queues.c    |  5 +-
 .../ethernet/cavium/thunder/nicvf_queues.h    |  3 +-
 drivers/net/ethernet/chelsio/cxgb/sge.c       | 19 ++---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  9 +--
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  2 +-
 .../ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c  |  4 +-
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.c    |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/sge.c      | 41 +++++------
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c    |  6 +-
 drivers/net/ethernet/dlink/sundance.c         | 41 +++++------
 .../net/ethernet/huawei/hinic/hinic_hw_cmdq.c |  2 +-
 .../net/ethernet/huawei/hinic/hinic_hw_eqs.c  | 17 +++--
 .../net/ethernet/huawei/hinic/hinic_hw_eqs.h  |  2 +-
 drivers/net/ethernet/ibm/ehea/ehea.h          |  3 +-
 drivers/net/ethernet/ibm/ehea/ehea_main.c     | 14 ++--
 drivers/net/ethernet/ibm/ibmvnic.c            | 24 +++----
 drivers/net/ethernet/ibm/ibmvnic.h            |  2 +-
 drivers/net/ethernet/jme.c                    | 72 +++++++++----------
 drivers/net/ethernet/jme.h                    |  9 +--
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |  2 +-
 drivers/net/ethernet/marvell/skge.c           | 12 ++--
 drivers/net/ethernet/marvell/skge.h           |  3 +-
 drivers/net/ethernet/mediatek/mtk_wed_wo.c    | 12 ++--
 drivers/net/ethernet/mediatek/mtk_wed_wo.h    |  3 +-
 drivers/net/ethernet/mellanox/mlx4/cq.c       | 42 +++++------
 drivers/net/ethernet/mellanox/mlx4/eq.c       | 10 +--
 drivers/net/ethernet/mellanox/mlx4/mlx4.h     | 11 +--
 drivers/net/ethernet/mellanox/mlx5/core/cq.c  | 38 +++++-----
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  | 12 ++--
 .../ethernet/mellanox/mlx5/core/fpga/conn.c   | 15 ++--
 .../ethernet/mellanox/mlx5/core/fpga/conn.h   |  3 +-
 .../net/ethernet/mellanox/mlx5/core/lib/eq.h  | 11 +--
 drivers/net/ethernet/mellanox/mlxsw/pci.c     | 29 ++++----
 drivers/net/ethernet/micrel/ks8842.c          | 29 ++++----
 drivers/net/ethernet/micrel/ksz884x.c         | 37 +++++-----
 drivers/net/ethernet/microchip/lan743x_ptp.c  |  2 +-
 drivers/net/ethernet/natsemi/ns83820.c        | 10 +--
 drivers/net/ethernet/netronome/nfp/nfd3/dp.c  |  7 +-
 .../net/ethernet/netronome/nfp/nfd3/nfd3.h    |  2 +-
 drivers/net/ethernet/netronome/nfp/nfdk/dp.c  |  6 +-
 .../net/ethernet/netronome/nfp/nfdk/nfdk.h    |  3 +-
 drivers/net/ethernet/netronome/nfp/nfp_net.h  |  4 +-
 .../ethernet/netronome/nfp/nfp_net_common.c   | 12 ++--
 .../net/ethernet/netronome/nfp/nfp_net_dp.h   |  4 +-
 drivers/net/ethernet/ni/nixge.c               | 19 ++---
 drivers/net/ethernet/qlogic/qed/qed.h         |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_int.c     |  6 +-
 drivers/net/ethernet/qlogic/qed/qed_int.h     |  4 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c    | 20 +++---
 drivers/net/ethernet/sfc/falcon/farch.c       |  4 +-
 drivers/net/ethernet/sfc/falcon/net_driver.h  |  2 +-
 drivers/net/ethernet/sfc/falcon/selftest.c    |  2 +-
 drivers/net/ethernet/sfc/net_driver.h         |  2 +-
 drivers/net/ethernet/sfc/selftest.c           |  2 +-
 drivers/net/ethernet/sfc/siena/farch.c        |  4 +-
 drivers/net/ethernet/sfc/siena/net_driver.h   |  2 +-
 drivers/net/ethernet/sfc/siena/selftest.c     |  2 +-
 drivers/net/ethernet/silan/sc92031.c          | 47 ++++++------
 drivers/net/ethernet/smsc/smc91x.c            | 16 ++---
 drivers/net/ethernet/smsc/smc91x.h            |  3 +-
 include/linux/mlx4/device.h                   |  2 +-
 include/linux/mlx5/cq.h                       |  2 +-
 83 files changed, 501 insertions(+), 473 deletions(-)

-- 
2.17.1


* [PATCH 1/1] [RFC] ethernet: Convert from tasklet to BH workqueue
  2024-05-07 19:01 55% [PATCH 0/1] Convert tasklets to BH workqueues in ethernet drivers Allen Pais
@ 2024-05-07 19:01  7% ` Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-05-07 19:01 UTC (permalink / raw)
  To: netdev
  Cc: jes, davem, edumazet, kuba, pabeni, kda, cai.huoqing, dougmill,
	npiggin, christophe.leroy, aneesh.kumar, naveen.n.rao, nnac123,
	tlfalcon, cooldavid, marcin.s.wojtas, linux, mlindner, stephen,
	nbd, sean.wang, Mark-MC.Lee, lorenzo, matthias.bgg,
	angelogioacchino.delregno, borisp, bryan.whitehead,
	UNGLinuxDriver, louis.peens, richardcochran, linux-rdma,
	linux-kernel, linux-acenic, linux-arm-kernel, linuxppc-dev,
	linux-mediatek, oss-drivers, linux-net-drivers

The only generic interface to execute asynchronously in the BH context is
tasklet; however, it's marked deprecated and has some design flaws. To
replace tasklets, BH workqueue support was recently added. A BH workqueue
behaves similarly to regular workqueues except that the queued work items
are executed in the BH context.

This patch converts drivers/net/ethernet/* from tasklet to BH workqueue.
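
The hunks in this patch apply one mechanical mapping: tasklet_setup()
becomes INIT_WORK(), tasklet_schedule() becomes
queue_work(system_bh_wq, ...), tasklet_kill() becomes cancel_work_sync(),
and from_tasklet() becomes from_work(). The sketch below mimics that
lifecycle in userspace so it can be compiled outside the kernel. It is not
kernel code: struct work_struct, from_work(), and every mock_* helper are
simplified local stand-ins (assumptions for illustration only), and
struct demo_priv and demo_refill() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-ins for kernel types and helpers. These are assumptions
 * made for illustration; they are NOT the real kernel implementations.
 */
struct work_struct {
	void (*func)(struct work_struct *work);
	int pending;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as the from_work() calls in the patch (uses GNU __typeof__). */
#define from_work(var, work, field) \
	mock_container_of(work, __typeof__(*var), field)

/* Stand-in for INIT_WORK(), which replaces tasklet_setup(). */
static void mock_init_work(struct work_struct *w,
			   void (*fn)(struct work_struct *))
{
	w->func = fn;
	w->pending = 0;
}

/* Stand-in for queue_work(system_bh_wq, w); returns 0 if already pending. */
static int mock_queue_work(struct work_struct *w)
{
	if (w->pending)
		return 0;
	w->pending = 1;
	return 1;
}

/* Stand-in for the BH (softirq) pass that runs queued work items. */
static void mock_run_bh(struct work_struct *w)
{
	if (w->pending) {
		w->pending = 0;
		w->func(w);
	}
}

/* Stand-in for cancel_work_sync(), which replaces tasklet_kill(). */
static int mock_cancel_work_sync(struct work_struct *w)
{
	int was_pending = w->pending;

	w->pending = 0;
	return was_pending;
}

/* Hypothetical driver-private data, loosely modelled on the acenic hunk. */
struct demo_priv {
	int refills;
	struct work_struct refill_work;
};

/* Work handler: takes a work_struct and recovers its owner via from_work(). */
static void demo_refill(struct work_struct *t)
{
	struct demo_priv *p = from_work(p, t, refill_work);

	p->refills++;
}

/* Walk through setup, queue, run, and teardown; returns the handler run count. */
int demo_run(void)
{
	struct demo_priv priv = { 0 };
	int first, second;

	mock_init_work(&priv.refill_work, demo_refill);
	first = mock_queue_work(&priv.refill_work);	/* newly queued */
	second = mock_queue_work(&priv.refill_work);	/* already pending */
	assert(first == 1 && second == 0);
	(void)first;
	(void)second;

	mock_run_bh(&priv.refill_work);
	mock_cancel_work_sync(&priv.refill_work);
	return priv.refills;
}
```

Calling demo_run() from a main() exercises the whole sequence. In the
kernel the handler would run later in softirq context rather than from an
explicit mock_run_bh() call, so the mock only illustrates the call
mapping, not the scheduling behaviour.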

Based on the work done by Tejun Heo <tj@kernel.org>
Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git disable_work-v1

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
---
 drivers/infiniband/hw/mlx4/cq.c               |  2 +-
 drivers/infiniband/hw/mlx5/cq.c               |  2 +-
 drivers/net/ethernet/alteon/acenic.c          | 26 +++----
 drivers/net/ethernet/alteon/acenic.h          |  7 +-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c      | 30 ++++----
 drivers/net/ethernet/amd/xgbe/xgbe-i2c.c      | 16 ++---
 drivers/net/ethernet/amd/xgbe/xgbe-mdio.c     | 16 ++---
 drivers/net/ethernet/amd/xgbe/xgbe-pci.c      |  4 +-
 drivers/net/ethernet/amd/xgbe/xgbe.h          | 11 +--
 drivers/net/ethernet/broadcom/cnic.c          | 19 ++---
 drivers/net/ethernet/broadcom/cnic.h          |  2 +-
 drivers/net/ethernet/cadence/macb.h           |  3 +-
 drivers/net/ethernet/cadence/macb_main.c      | 10 +--
 .../net/ethernet/cavium/liquidio/lio_core.c   |  4 +-
 .../net/ethernet/cavium/liquidio/lio_main.c   | 25 +++----
 .../ethernet/cavium/liquidio/lio_vf_main.c    | 10 +--
 .../ethernet/cavium/liquidio/octeon_droq.c    |  4 +-
 .../ethernet/cavium/liquidio/octeon_main.h    |  5 +-
 .../net/ethernet/cavium/octeon/octeon_mgmt.c  | 12 ++--
 drivers/net/ethernet/cavium/thunder/nic.h     |  5 +-
 .../net/ethernet/cavium/thunder/nicvf_main.c  | 24 +++----
 .../ethernet/cavium/thunder/nicvf_queues.c    |  5 +-
 .../ethernet/cavium/thunder/nicvf_queues.h    |  3 +-
 drivers/net/ethernet/chelsio/cxgb/sge.c       | 19 ++---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  9 +--
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  2 +-
 .../ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c  |  4 +-
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.c    |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/sge.c      | 41 +++++------
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c    |  6 +-
 drivers/net/ethernet/dlink/sundance.c         | 41 +++++------
 .../net/ethernet/huawei/hinic/hinic_hw_cmdq.c |  2 +-
 .../net/ethernet/huawei/hinic/hinic_hw_eqs.c  | 17 +++--
 .../net/ethernet/huawei/hinic/hinic_hw_eqs.h  |  2 +-
 drivers/net/ethernet/ibm/ehea/ehea.h          |  3 +-
 drivers/net/ethernet/ibm/ehea/ehea_main.c     | 14 ++--
 drivers/net/ethernet/ibm/ibmvnic.c            | 24 +++----
 drivers/net/ethernet/ibm/ibmvnic.h            |  2 +-
 drivers/net/ethernet/jme.c                    | 72 +++++++++----------
 drivers/net/ethernet/jme.h                    |  9 +--
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |  2 +-
 drivers/net/ethernet/marvell/skge.c           | 12 ++--
 drivers/net/ethernet/marvell/skge.h           |  3 +-
 drivers/net/ethernet/mediatek/mtk_wed_wo.c    | 12 ++--
 drivers/net/ethernet/mediatek/mtk_wed_wo.h    |  3 +-
 drivers/net/ethernet/mellanox/mlx4/cq.c       | 42 +++++------
 drivers/net/ethernet/mellanox/mlx4/eq.c       | 10 +--
 drivers/net/ethernet/mellanox/mlx4/mlx4.h     | 11 +--
 drivers/net/ethernet/mellanox/mlx5/core/cq.c  | 38 +++++-----
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  | 12 ++--
 .../ethernet/mellanox/mlx5/core/fpga/conn.c   | 15 ++--
 .../ethernet/mellanox/mlx5/core/fpga/conn.h   |  3 +-
 .../net/ethernet/mellanox/mlx5/core/lib/eq.h  | 11 +--
 drivers/net/ethernet/mellanox/mlxsw/pci.c     | 29 ++++----
 drivers/net/ethernet/micrel/ks8842.c          | 29 ++++----
 drivers/net/ethernet/micrel/ksz884x.c         | 37 +++++-----
 drivers/net/ethernet/microchip/lan743x_ptp.c  |  2 +-
 drivers/net/ethernet/natsemi/ns83820.c        | 10 +--
 drivers/net/ethernet/netronome/nfp/nfd3/dp.c  |  7 +-
 .../net/ethernet/netronome/nfp/nfd3/nfd3.h    |  2 +-
 drivers/net/ethernet/netronome/nfp/nfdk/dp.c  |  6 +-
 .../net/ethernet/netronome/nfp/nfdk/nfdk.h    |  3 +-
 drivers/net/ethernet/netronome/nfp/nfp_net.h  |  4 +-
 .../ethernet/netronome/nfp/nfp_net_common.c   | 12 ++--
 .../net/ethernet/netronome/nfp/nfp_net_dp.h   |  4 +-
 drivers/net/ethernet/ni/nixge.c               | 19 ++---
 drivers/net/ethernet/qlogic/qed/qed.h         |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_int.c     |  6 +-
 drivers/net/ethernet/qlogic/qed/qed_int.h     |  4 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c    | 20 +++---
 drivers/net/ethernet/sfc/falcon/farch.c       |  4 +-
 drivers/net/ethernet/sfc/falcon/net_driver.h  |  2 +-
 drivers/net/ethernet/sfc/falcon/selftest.c    |  2 +-
 drivers/net/ethernet/sfc/net_driver.h         |  2 +-
 drivers/net/ethernet/sfc/selftest.c           |  2 +-
 drivers/net/ethernet/sfc/siena/farch.c        |  4 +-
 drivers/net/ethernet/sfc/siena/net_driver.h   |  2 +-
 drivers/net/ethernet/sfc/siena/selftest.c     |  2 +-
 drivers/net/ethernet/silan/sc92031.c          | 47 ++++++------
 drivers/net/ethernet/smsc/smc91x.c            | 16 ++---
 drivers/net/ethernet/smsc/smc91x.h            |  3 +-
 include/linux/mlx4/device.h                   |  2 +-
 include/linux/mlx5/cq.h                       |  2 +-
 83 files changed, 501 insertions(+), 473 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 4cd738aae53c..75ae9412c21d 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -253,7 +253,7 @@ int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		goto err_dbmap;
 
 	if (udata)
-		cq->mcq.tasklet_ctx.comp = mlx4_ib_cq_comp;
+		cq->mcq.work_ctx.comp = mlx4_ib_cq_comp;
 	else
 		cq->mcq.comp = mlx4_ib_cq_comp;
 	cq->mcq.event = mlx4_ib_cq_event;
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 9773d2a3d97f..d38a160928c0 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -1017,7 +1017,7 @@ int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 
 	mlx5_ib_dbg(dev, "cqn 0x%x\n", cq->mcq.cqn);
 	if (udata)
-		cq->mcq.tasklet_ctx.comp = mlx5_ib_cq_comp;
+		cq->mcq.work_ctx.comp = mlx5_ib_cq_comp;
 	else
 		cq->mcq.comp  = mlx5_ib_cq_comp;
 	cq->mcq.event = mlx5_ib_cq_event;
diff --git a/drivers/net/ethernet/alteon/acenic.c b/drivers/net/ethernet/alteon/acenic.c
index eafef84fe3be..9d0394ceeafa 100644
--- a/drivers/net/ethernet/alteon/acenic.c
+++ b/drivers/net/ethernet/alteon/acenic.c
@@ -1560,9 +1560,9 @@ static void ace_watchdog(struct net_device *data, unsigned int txqueue)
 }
 
 
-static void ace_tasklet(struct tasklet_struct *t)
+static void ace_work(struct work_struct *t)
 {
-	struct ace_private *ap = from_tasklet(ap, t, ace_tasklet);
+	struct ace_private *ap = from_work(ap, t, ace_work);
 	struct net_device *dev = ap->ndev;
 	int cur_size;
 
@@ -1595,7 +1595,7 @@ static void ace_tasklet(struct tasklet_struct *t)
 #endif
 		ace_load_jumbo_rx_ring(dev, RX_JUMBO_SIZE - cur_size);
 	}
-	ap->tasklet_pending = 0;
+	ap->work_pending = 0;
 }
 
 
@@ -1617,7 +1617,7 @@ static void ace_dump_trace(struct ace_private *ap)
  *
  * Loading rings is safe without holding the spin lock since this is
  * done only before the device is enabled, thus no interrupts are
- * generated and by the interrupt handler/tasklet handler.
+ * generated and by the interrupt handler/work handler.
  */
 static void ace_load_std_rx_ring(struct net_device *dev, int nr_bufs)
 {
@@ -2160,7 +2160,7 @@ static irqreturn_t ace_interrupt(int irq, void *dev_id)
 	 */
 	if (netif_running(dev)) {
 		int cur_size;
-		int run_tasklet = 0;
+		int run_work = 0;
 
 		cur_size = atomic_read(&ap->cur_rx_bufs);
 		if (cur_size < RX_LOW_STD_THRES) {
@@ -2172,7 +2172,7 @@ static irqreturn_t ace_interrupt(int irq, void *dev_id)
 				ace_load_std_rx_ring(dev,
 						     RX_RING_SIZE - cur_size);
 			} else
-				run_tasklet = 1;
+				run_work = 1;
 		}
 
 		if (!ACE_IS_TIGON_I(ap)) {
@@ -2188,7 +2188,7 @@ static irqreturn_t ace_interrupt(int irq, void *dev_id)
 					ace_load_mini_rx_ring(dev,
 							      RX_MINI_SIZE - cur_size);
 				} else
-					run_tasklet = 1;
+					run_work = 1;
 			}
 		}
 
@@ -2205,12 +2205,12 @@ static irqreturn_t ace_interrupt(int irq, void *dev_id)
 					ace_load_jumbo_rx_ring(dev,
 							       RX_JUMBO_SIZE - cur_size);
 				} else
-					run_tasklet = 1;
+					run_work = 1;
 			}
 		}
-		if (run_tasklet && !ap->tasklet_pending) {
-			ap->tasklet_pending = 1;
-			tasklet_schedule(&ap->ace_tasklet);
+		if (run_work && !ap->work_pending) {
+			ap->work_pending = 1;
+			queue_work(system_bh_wq, &ap->ace_work);
 		}
 	}
 
@@ -2267,7 +2267,7 @@ static int ace_open(struct net_device *dev)
 	/*
 	 * Setup the bottom half rx ring refill handler
 	 */
-	tasklet_setup(&ap->ace_tasklet, ace_tasklet);
+	INIT_WORK(&ap->ace_work, ace_work);
 	return 0;
 }
 
@@ -2301,7 +2301,7 @@ static int ace_close(struct net_device *dev)
 	cmd.idx = 0;
 	ace_issue_cmd(regs, &cmd);
 
-	tasklet_kill(&ap->ace_tasklet);
+	cancel_work_sync(&ap->ace_work);
 
 	/*
 	 * Make sure one CPU is not processing packets while
diff --git a/drivers/net/ethernet/alteon/acenic.h b/drivers/net/ethernet/alteon/acenic.h
index ca5ce0cbbad1..2ea5cd8005aa 100644
--- a/drivers/net/ethernet/alteon/acenic.h
+++ b/drivers/net/ethernet/alteon/acenic.h
@@ -2,6 +2,7 @@
 #ifndef _ACENIC_H_
 #define _ACENIC_H_
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 
 
 /*
@@ -667,8 +668,8 @@ struct ace_private
 	struct rx_desc		*rx_mini_ring;
 	struct rx_desc		*rx_return_ring;
 
-	int			tasklet_pending, jumbo;
-	struct tasklet_struct	ace_tasklet;
+	int			work_pending, jumbo;
+	struct work_struct	ace_work;
 
 	struct event		*evt_ring;
 
@@ -776,7 +777,7 @@ static int ace_open(struct net_device *dev);
 static netdev_tx_t ace_start_xmit(struct sk_buff *skb,
 				  struct net_device *dev);
 static int ace_close(struct net_device *dev);
-static void ace_tasklet(struct tasklet_struct *t);
+static void ace_work(struct work_struct *t);
 static void ace_dump_trace(struct ace_private *ap);
 static void ace_set_multicast_list(struct net_device *dev);
 static int ace_change_mtu(struct net_device *dev, int new_mtu);
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index 6b73648b3779..424dafaffc87 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -403,9 +403,9 @@ static bool xgbe_ecc_ded(struct xgbe_prv_data *pdata, unsigned long *period,
 	return false;
 }
 
-static void xgbe_ecc_isr_task(struct tasklet_struct *t)
+static void xgbe_ecc_isr_task(struct work_struct *t)
 {
-	struct xgbe_prv_data *pdata = from_tasklet(pdata, t, tasklet_ecc);
+	struct xgbe_prv_data *pdata = from_work(pdata, t, work_ecc);
 	unsigned int ecc_isr;
 	bool stop = false;
 
@@ -465,17 +465,17 @@ static irqreturn_t xgbe_ecc_isr(int irq, void *data)
 {
 	struct xgbe_prv_data *pdata = data;
 
-	if (pdata->isr_as_tasklet)
-		tasklet_schedule(&pdata->tasklet_ecc);
+	if (pdata->isr_as_work)
+		queue_work(system_bh_wq, &pdata->work_ecc);
 	else
-		xgbe_ecc_isr_task(&pdata->tasklet_ecc);
+		xgbe_ecc_isr_task(&pdata->work_ecc);
 
 	return IRQ_HANDLED;
 }
 
-static void xgbe_isr_task(struct tasklet_struct *t)
+static void xgbe_isr_task(struct work_struct *t)
 {
-	struct xgbe_prv_data *pdata = from_tasklet(pdata, t, tasklet_dev);
+	struct xgbe_prv_data *pdata = from_work(pdata, t, work_dev);
 	struct xgbe_hw_if *hw_if = &pdata->hw_if;
 	struct xgbe_channel *channel;
 	unsigned int dma_isr, dma_ch_isr;
@@ -582,7 +582,7 @@ static void xgbe_isr_task(struct tasklet_struct *t)
 
 	/* If there is not a separate ECC irq, handle it here */
 	if (pdata->vdata->ecc_support && (pdata->dev_irq == pdata->ecc_irq))
-		xgbe_ecc_isr_task(&pdata->tasklet_ecc);
+		xgbe_ecc_isr_task(&pdata->work_ecc);
 
 	/* If there is not a separate I2C irq, handle it here */
 	if (pdata->vdata->i2c_support && (pdata->dev_irq == pdata->i2c_irq))
@@ -604,10 +604,10 @@ static irqreturn_t xgbe_isr(int irq, void *data)
 {
 	struct xgbe_prv_data *pdata = data;
 
-	if (pdata->isr_as_tasklet)
-		tasklet_schedule(&pdata->tasklet_dev);
+	if (pdata->isr_as_work)
+		queue_work(system_bh_wq, &pdata->work_dev);
 	else
-		xgbe_isr_task(&pdata->tasklet_dev);
+		xgbe_isr_task(&pdata->work_dev);
 
 	return IRQ_HANDLED;
 }
@@ -1007,8 +1007,8 @@ static int xgbe_request_irqs(struct xgbe_prv_data *pdata)
 	unsigned int i;
 	int ret;
 
-	tasklet_setup(&pdata->tasklet_dev, xgbe_isr_task);
-	tasklet_setup(&pdata->tasklet_ecc, xgbe_ecc_isr_task);
+	INIT_WORK(&pdata->work_dev, xgbe_isr_task);
+	INIT_WORK(&pdata->work_ecc, xgbe_ecc_isr_task);
 
 	ret = devm_request_irq(pdata->dev, pdata->dev_irq, xgbe_isr, 0,
 			       netdev_name(netdev), pdata);
@@ -1078,8 +1078,8 @@ static void xgbe_free_irqs(struct xgbe_prv_data *pdata)
 
 	devm_free_irq(pdata->dev, pdata->dev_irq, pdata);
 
-	tasklet_kill(&pdata->tasklet_dev);
-	tasklet_kill(&pdata->tasklet_ecc);
+	cancel_work_sync(&pdata->work_dev);
+	cancel_work_sync(&pdata->work_ecc);
 
 	if (pdata->vdata->ecc_support && (pdata->dev_irq != pdata->ecc_irq))
 		devm_free_irq(pdata->dev, pdata->ecc_irq, pdata);
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-i2c.c b/drivers/net/ethernet/amd/xgbe/xgbe-i2c.c
index a9ccc4258ee5..8e1ec81a632e 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-i2c.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-i2c.c
@@ -274,9 +274,9 @@ static void xgbe_i2c_clear_isr_interrupts(struct xgbe_prv_data *pdata,
 		XI2C_IOREAD(pdata, IC_CLR_STOP_DET);
 }
 
-static void xgbe_i2c_isr_task(struct tasklet_struct *t)
+static void xgbe_i2c_isr_task(struct work_struct *t)
 {
-	struct xgbe_prv_data *pdata = from_tasklet(pdata, t, tasklet_i2c);
+	struct xgbe_prv_data *pdata = from_work(pdata, t, work_i2c);
 	struct xgbe_i2c_op_state *state = &pdata->i2c.op_state;
 	unsigned int isr;
 
@@ -321,10 +321,10 @@ static irqreturn_t xgbe_i2c_isr(int irq, void *data)
 {
 	struct xgbe_prv_data *pdata = (struct xgbe_prv_data *)data;
 
-	if (pdata->isr_as_tasklet)
-		tasklet_schedule(&pdata->tasklet_i2c);
+	if (pdata->isr_as_work)
+		queue_work(system_bh_wq, &pdata->work_i2c);
 	else
-		xgbe_i2c_isr_task(&pdata->tasklet_i2c);
+		xgbe_i2c_isr_task(&pdata->work_i2c);
 
 	return IRQ_HANDLED;
 }
@@ -369,7 +369,7 @@ static void xgbe_i2c_set_target(struct xgbe_prv_data *pdata, unsigned int addr)
 
 static irqreturn_t xgbe_i2c_combined_isr(struct xgbe_prv_data *pdata)
 {
-	xgbe_i2c_isr_task(&pdata->tasklet_i2c);
+	xgbe_i2c_isr_task(&pdata->work_i2c);
 
 	return IRQ_HANDLED;
 }
@@ -449,7 +449,7 @@ static void xgbe_i2c_stop(struct xgbe_prv_data *pdata)
 
 	if (pdata->dev_irq != pdata->i2c_irq) {
 		devm_free_irq(pdata->dev, pdata->i2c_irq, pdata);
-		tasklet_kill(&pdata->tasklet_i2c);
+		cancel_work_sync(&pdata->work_i2c);
 	}
 }
 
@@ -464,7 +464,7 @@ static int xgbe_i2c_start(struct xgbe_prv_data *pdata)
 
 	/* If we have a separate I2C irq, enable it */
 	if (pdata->dev_irq != pdata->i2c_irq) {
-		tasklet_setup(&pdata->tasklet_i2c, xgbe_i2c_isr_task);
+		INIT_WORK(&pdata->work_i2c, xgbe_i2c_isr_task);
 
 		ret = devm_request_irq(pdata->dev, pdata->i2c_irq,
 				       xgbe_i2c_isr, 0, pdata->i2c_name,
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
index 4a2dc705b528..8df27c6262bf 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
@@ -703,9 +703,9 @@ static void xgbe_an73_isr(struct xgbe_prv_data *pdata)
 	}
 }
 
-static void xgbe_an_isr_task(struct tasklet_struct *t)
+static void xgbe_an_isr_task(struct work_struct *t)
 {
-	struct xgbe_prv_data *pdata = from_tasklet(pdata, t, tasklet_an);
+	struct xgbe_prv_data *pdata = from_work(pdata, t, work_an);
 
 	netif_dbg(pdata, intr, pdata->netdev, "AN interrupt received\n");
 
@@ -727,17 +727,17 @@ static irqreturn_t xgbe_an_isr(int irq, void *data)
 {
 	struct xgbe_prv_data *pdata = (struct xgbe_prv_data *)data;
 
-	if (pdata->isr_as_tasklet)
-		tasklet_schedule(&pdata->tasklet_an);
+	if (pdata->isr_as_work)
+		queue_work(system_bh_wq, &pdata->work_an);
 	else
-		xgbe_an_isr_task(&pdata->tasklet_an);
+		xgbe_an_isr_task(&pdata->work_an);
 
 	return IRQ_HANDLED;
 }
 
 static irqreturn_t xgbe_an_combined_isr(struct xgbe_prv_data *pdata)
 {
-	xgbe_an_isr_task(&pdata->tasklet_an);
+	xgbe_an_isr_task(&pdata->work_an);
 
 	return IRQ_HANDLED;
 }
@@ -1454,7 +1454,7 @@ static void xgbe_phy_stop(struct xgbe_prv_data *pdata)
 
 	if (pdata->dev_irq != pdata->an_irq) {
 		devm_free_irq(pdata->dev, pdata->an_irq, pdata);
-		tasklet_kill(&pdata->tasklet_an);
+		cancel_work_sync(&pdata->work_an);
 	}
 
 	pdata->phy_if.phy_impl.stop(pdata);
@@ -1477,7 +1477,7 @@ static int xgbe_phy_start(struct xgbe_prv_data *pdata)
 
 	/* If we have a separate AN irq, enable it */
 	if (pdata->dev_irq != pdata->an_irq) {
-		tasklet_setup(&pdata->tasklet_an, xgbe_an_isr_task);
+		INIT_WORK(&pdata->work_an, xgbe_an_isr_task);
 
 		ret = devm_request_irq(pdata->dev, pdata->an_irq,
 				       xgbe_an_isr, 0, pdata->an_name,
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-pci.c b/drivers/net/ethernet/amd/xgbe/xgbe-pci.c
index f409d7bd1f1e..712c1f04925a 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-pci.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-pci.c
@@ -139,7 +139,7 @@ static int xgbe_config_multi_msi(struct xgbe_prv_data *pdata)
 		return ret;
 	}
 
-	pdata->isr_as_tasklet = 1;
+	pdata->isr_as_work = 1;
 	pdata->irq_count = ret;
 
 	pdata->dev_irq = pci_irq_vector(pdata->pcidev, 0);
@@ -176,7 +176,7 @@ static int xgbe_config_irqs(struct xgbe_prv_data *pdata)
 		return ret;
 	}
 
-	pdata->isr_as_tasklet = pdata->pcidev->msi_enabled ? 1 : 0;
+	pdata->isr_as_work = pdata->pcidev->msi_enabled ? 1 : 0;
 	pdata->irq_count = 1;
 	pdata->channel_irq_count = 1;
 
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h
index f01a1e566da6..b37231e637f7 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe.h
+++ b/drivers/net/ethernet/amd/xgbe/xgbe.h
@@ -133,6 +133,7 @@
 #include <linux/dcache.h>
 #include <linux/ethtool.h>
 #include <linux/list.h>
+#include <linux/workqueue.h>
 
 #define XGBE_DRV_NAME		"amd-xgbe"
 #define XGBE_DRV_DESC		"AMD 10 Gigabit Ethernet Driver"
@@ -1298,11 +1299,11 @@ struct xgbe_prv_data {
 
 	unsigned int lpm_ctrl;		/* CTRL1 for resume */
 
-	unsigned int isr_as_tasklet;
-	struct tasklet_struct tasklet_dev;
-	struct tasklet_struct tasklet_ecc;
-	struct tasklet_struct tasklet_i2c;
-	struct tasklet_struct tasklet_an;
+	unsigned int isr_as_work;
+	struct work_struct work_dev;
+	struct work_struct work_ecc;
+	struct work_struct work_i2c;
+	struct work_struct work_an;
 
 	struct dentry *xgbe_debugfs;
 
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index 3d63177e7e52..8664c873da4d 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -31,6 +31,7 @@
 #include <linux/if_vlan.h>
 #include <linux/prefetch.h>
 #include <linux/random.h>
+#include <linux/workqueue.h>
 #if IS_ENABLED(CONFIG_VLAN_8021Q)
 #define BCM_VLAN 1
 #endif
@@ -3015,9 +3016,9 @@ static int cnic_service_bnx2(void *data, void *status_blk)
 	return cnic_service_bnx2_queues(dev);
 }
 
-static void cnic_service_bnx2_msix(struct tasklet_struct *t)
+static void cnic_service_bnx2_msix(struct work_struct *t)
 {
-	struct cnic_local *cp = from_tasklet(cp, t, cnic_irq_task);
+	struct cnic_local *cp = from_work(cp, t, cnic_irq_task);
 	struct cnic_dev *dev = cp->dev;
 
 	cp->last_status_idx = cnic_service_bnx2_queues(dev);
@@ -3036,7 +3037,7 @@ static void cnic_doirq(struct cnic_dev *dev)
 		prefetch(cp->status_blk.gen);
 		prefetch(&cp->kcq1.kcq[KCQ_PG(prod)][KCQ_IDX(prod)]);
 
-		tasklet_schedule(&cp->cnic_irq_task);
+		queue_work(system_bh_wq, &cp->cnic_irq_task);
 	}
 }
 
@@ -3140,9 +3141,9 @@ static u32 cnic_service_bnx2x_kcq(struct cnic_dev *dev, struct kcq_info *info)
 	return last_status;
 }
 
-static void cnic_service_bnx2x_bh(struct tasklet_struct *t)
+static void cnic_service_bnx2x_bh(struct work_struct *t)
 {
-	struct cnic_local *cp = from_tasklet(cp, t, cnic_irq_task);
+	struct cnic_local *cp = from_work(cp, t, cnic_irq_task);
 	struct cnic_dev *dev = cp->dev;
 	struct bnx2x *bp = netdev_priv(dev->netdev);
 	u32 status_idx, new_status_idx;
@@ -4427,7 +4428,7 @@ static void cnic_free_irq(struct cnic_dev *dev)
 
 	if (ethdev->drv_state & CNIC_DRV_STATE_USING_MSIX) {
 		cp->disable_int_sync(dev);
-		tasklet_kill(&cp->cnic_irq_task);
+		cancel_work_sync(&cp->cnic_irq_task);
 		free_irq(ethdev->irq_arr[0].vector, dev);
 	}
 }
@@ -4440,7 +4441,7 @@ static int cnic_request_irq(struct cnic_dev *dev)
 
 	err = request_irq(ethdev->irq_arr[0].vector, cnic_irq, 0, "cnic", dev);
 	if (err)
-		tasklet_disable(&cp->cnic_irq_task);
+		disable_work_sync(&cp->cnic_irq_task);
 
 	return err;
 }
@@ -4463,7 +4464,7 @@ static int cnic_init_bnx2_irq(struct cnic_dev *dev)
 		CNIC_WR(dev, base + BNX2_HC_CMD_TICKS_OFF, (64 << 16) | 220);
 
 		cp->last_status_idx = cp->status_blk.bnx2->status_idx;
-		tasklet_setup(&cp->cnic_irq_task, cnic_service_bnx2_msix);
+		INIT_WORK(&cp->cnic_irq_task, cnic_service_bnx2_msix);
 		err = cnic_request_irq(dev);
 		if (err)
 			return err;
@@ -4872,7 +4873,7 @@ static int cnic_init_bnx2x_irq(struct cnic_dev *dev)
 	struct cnic_eth_dev *ethdev = cp->ethdev;
 	int err = 0;
 
-	tasklet_setup(&cp->cnic_irq_task, cnic_service_bnx2x_bh);
+	INIT_WORK(&cp->cnic_irq_task, cnic_service_bnx2x_bh);
 	if (ethdev->drv_state & CNIC_DRV_STATE_USING_MSIX)
 		err = cnic_request_irq(dev);
 
diff --git a/drivers/net/ethernet/broadcom/cnic.h b/drivers/net/ethernet/broadcom/cnic.h
index fedc84ada937..9b0a271c11d5 100644
--- a/drivers/net/ethernet/broadcom/cnic.h
+++ b/drivers/net/ethernet/broadcom/cnic.h
@@ -268,7 +268,7 @@ struct cnic_local {
 	u32				bnx2x_igu_sb_id;
 	u32				int_num;
 	u32				last_status_idx;
-	struct tasklet_struct		cnic_irq_task;
+	struct work_struct		cnic_irq_task;
 
 	struct kcqe		*completed_kcq[MAX_COMPLETED_KCQE];
 
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index aa5700ac9c00..a6d95a11b4a5 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -13,6 +13,7 @@
 #include <linux/net_tstamp.h>
 #include <linux/interrupt.h>
 #include <linux/phy/phy.h>
+#include <linux/workqueue.h>
 
 #if defined(CONFIG_ARCH_DMA_ADDR_T_64BIT) || defined(CONFIG_MACB_USE_HWSTAMP)
 #define MACB_EXT_DESC
@@ -1322,7 +1323,7 @@ struct macb {
 	spinlock_t rx_fs_lock;
 	unsigned int max_tuples;
 
-	struct tasklet_struct	hresp_err_tasklet;
+	struct work_struct	hresp_err_work;
 
 	int	rx_bd_rd_prefetch;
 	int	tx_bd_rd_prefetch;
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 898debfd4db3..08ceb51ca127 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1792,9 +1792,9 @@ static int macb_tx_poll(struct napi_struct *napi, int budget)
 	return work_done;
 }
 
-static void macb_hresp_error_task(struct tasklet_struct *t)
+static void macb_hresp_error_task(struct work_struct *t)
 {
-	struct macb *bp = from_tasklet(bp, t, hresp_err_tasklet);
+	struct macb *bp = from_work(bp, t, hresp_err_work);
 	struct net_device *dev = bp->dev;
 	struct macb_queue *queue;
 	unsigned int q;
@@ -1994,7 +1994,7 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 		}
 
 		if (status & MACB_BIT(HRESP)) {
-			tasklet_schedule(&bp->hresp_err_tasklet);
+			queue_work(system_bh_wq, &bp->hresp_err_work);
 			netdev_err(dev, "DMA bus error: HRESP not OK\n");
 
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
@@ -5150,7 +5150,7 @@ static int macb_probe(struct platform_device *pdev)
 		goto err_out_unregister_mdio;
 	}
 
-	tasklet_setup(&bp->hresp_err_tasklet, macb_hresp_error_task);
+	INIT_WORK(&bp->hresp_err_work, macb_hresp_error_task);
 
 	netdev_info(dev, "Cadence %s rev 0x%08x at 0x%08lx irq %d (%pM)\n",
 		    macb_is_gem(bp) ? "GEM" : "MACB", macb_readl(bp, MID),
@@ -5194,7 +5194,7 @@ static void macb_remove(struct platform_device *pdev)
 		mdiobus_free(bp->mii_bus);
 
 		unregister_netdev(dev);
-		tasklet_kill(&bp->hresp_err_tasklet);
+		cancel_work_sync(&bp->hresp_err_work);
 		pm_runtime_disable(&pdev->dev);
 		pm_runtime_dont_use_autosuspend(&pdev->dev);
 		if (!pm_runtime_suspended(&pdev->dev)) {
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_core.c b/drivers/net/ethernet/cavium/liquidio/lio_core.c
index f38d31bfab1b..ba09260e7ea7 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_core.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_core.c
@@ -925,7 +925,7 @@ int liquidio_schedule_msix_droq_pkt_handler(struct octeon_droq *droq, u64 ret)
 			if (OCTEON_CN23XX_VF(oct))
 				dev_err(&oct->pci_dev->dev,
 					"should not come here should not get rx when poll mode = 0 for vf\n");
-			tasklet_schedule(&oct_priv->droq_tasklet);
+			queue_work(system_bh_wq, &oct_priv->droq_work);
 			return 1;
 		}
 		/* this will be flushed periodically by check iq db */
@@ -975,7 +975,7 @@ static void liquidio_schedule_droq_pkt_handlers(struct octeon_device *oct)
 				droq->ops.napi_fn(droq);
 				oct_priv->napi_mask |= BIT_ULL(oq_no);
 			} else {
-				tasklet_schedule(&oct_priv->droq_tasklet);
+				queue_work(system_bh_wq, &oct_priv->droq_work);
 			}
 		}
 	}
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 34f02a8ec2ca..4d0aced1896b 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -21,6 +21,7 @@
 #include <linux/firmware.h>
 #include <net/vxlan.h>
 #include <linux/kthread.h>
+#include <linux/workqueue.h>
 #include "liquidio_common.h"
 #include "octeon_droq.h"
 #include "octeon_iq.h"
@@ -156,12 +157,12 @@ static int liquidio_set_vf_link_state(struct net_device *netdev, int vfidx,
 static struct handshake handshake[MAX_OCTEON_DEVICES];
 static struct completion first_stage;
 
-static void octeon_droq_bh(struct tasklet_struct *t)
+static void octeon_droq_bh(struct work_struct *t)
 {
 	int q_no;
 	int reschedule = 0;
-	struct octeon_device_priv *oct_priv = from_tasklet(oct_priv, t,
-							  droq_tasklet);
+	struct octeon_device_priv *oct_priv = from_work(oct_priv, t,
+							  droq_work);
 	struct octeon_device *oct = oct_priv->dev;
 
 	for (q_no = 0; q_no < MAX_OCTEON_OUTPUT_QUEUES(oct); q_no++) {
@@ -186,7 +187,7 @@ static void octeon_droq_bh(struct tasklet_struct *t)
 	}
 
 	if (reschedule)
-		tasklet_schedule(&oct_priv->droq_tasklet);
+		queue_work(system_bh_wq, &oct_priv->droq_work);
 }
 
 static int lio_wait_for_oq_pkts(struct octeon_device *oct)
@@ -205,7 +206,7 @@ static int lio_wait_for_oq_pkts(struct octeon_device *oct)
 		}
 		if (pkt_cnt > 0) {
 			pending_pkts += pkt_cnt;
-			tasklet_schedule(&oct_priv->droq_tasklet);
+			queue_work(system_bh_wq, &oct_priv->droq_work);
 		}
 		pkt_cnt = 0;
 		schedule_timeout_uninterruptible(1);
@@ -1136,7 +1137,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 		break;
 	}                       /* end switch (oct->status) */
 
-	tasklet_kill(&oct_priv->droq_tasklet);
+	cancel_work_sync(&oct_priv->droq_work);
 }
 
 /**
@@ -1240,7 +1241,7 @@ static void liquidio_destroy_nic_device(struct octeon_device *oct, int ifidx)
 	list_for_each_entry_safe(napi, n, &netdev->napi_list, dev_list)
 		netif_napi_del(napi);
 
-	tasklet_enable(&oct_priv->droq_tasklet);
+	enable_and_queue_work(system_bh_wq, &oct_priv->droq_work);
 
 	if (atomic_read(&lio->ifstate) & LIO_IFSTATE_REGISTERED)
 		unregister_netdev(netdev);
@@ -1776,7 +1777,7 @@ static int liquidio_open(struct net_device *netdev)
 	int ret = 0;
 
 	if (oct->props[lio->ifidx].napi_enabled == 0) {
-		tasklet_disable(&oct_priv->droq_tasklet);
+		disable_work_sync(&oct_priv->droq_work);
 
 		list_for_each_entry_safe(napi, n, &netdev->napi_list, dev_list)
 			napi_enable(napi);
@@ -1902,7 +1903,7 @@ static int liquidio_stop(struct net_device *netdev)
 		if (OCTEON_CN23XX_PF(oct))
 			oct->droq[0]->ops.poll_mode = 0;
 
-		tasklet_enable(&oct_priv->droq_tasklet);
+		enable_and_queue_work(system_bh_wq, &oct_priv->droq_work);
 	}
 
 	dev_info(&oct->pci_dev->dev, "%s interface is stopped\n", netdev->name);
@@ -4210,9 +4211,9 @@ static int octeon_device_init(struct octeon_device *octeon_dev)
 		}
 	}
 
-	/* Initialize the tasklet that handles output queue packet processing.*/
-	dev_dbg(&octeon_dev->pci_dev->dev, "Initializing droq tasklet\n");
-	tasklet_setup(&oct_priv->droq_tasklet, octeon_droq_bh);
+	/* Initialize the work that handles output queue packet processing.*/
+	dev_dbg(&octeon_dev->pci_dev->dev, "Initializing droq work\n");
+	INIT_WORK(&oct_priv->droq_work, octeon_droq_bh);
 
 	/* Setup the interrupt handler and record the INT SUM register address
 	 */
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index 62c2eadc33e3..54e402f18c4f 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -87,7 +87,7 @@ static int lio_wait_for_oq_pkts(struct octeon_device *oct)
 		}
 		if (pkt_cnt > 0) {
 			pending_pkts += pkt_cnt;
-			tasklet_schedule(&oct_priv->droq_tasklet);
+			queue_work(system_bh_wq, &oct_priv->droq_work);
 		}
 		pkt_cnt = 0;
 		schedule_timeout_uninterruptible(1);
@@ -584,7 +584,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 		break;
 	}
 
-	tasklet_kill(&oct_priv->droq_tasklet);
+	cancel_work_sync(&oct_priv->droq_work);
 }
 
 /**
@@ -687,7 +687,7 @@ static void liquidio_destroy_nic_device(struct octeon_device *oct, int ifidx)
 	list_for_each_entry_safe(napi, n, &netdev->napi_list, dev_list)
 		netif_napi_del(napi);
 
-	tasklet_enable(&oct_priv->droq_tasklet);
+	enable_and_queue_work(system_bh_wq, &oct_priv->droq_work);
 
 	if (atomic_read(&lio->ifstate) & LIO_IFSTATE_REGISTERED)
 		unregister_netdev(netdev);
@@ -911,7 +911,7 @@ static int liquidio_open(struct net_device *netdev)
 	int ret = 0;
 
 	if (!oct->props[lio->ifidx].napi_enabled) {
-		tasklet_disable(&oct_priv->droq_tasklet);
+		disable_work_sync(&oct_priv->droq_work);
 
 		list_for_each_entry_safe(napi, n, &netdev->napi_list, dev_list)
 			napi_enable(napi);
@@ -986,7 +986,7 @@ static int liquidio_stop(struct net_device *netdev)
 
 		oct->droq[0]->ops.poll_mode = 0;
 
-		tasklet_enable(&oct_priv->droq_tasklet);
+		enable_and_queue_work(system_bh_wq, &oct_priv->droq_work);
 	}
 
 	cancel_delayed_work_sync(&lio->stats_wk.work);
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_droq.c b/drivers/net/ethernet/cavium/liquidio/octeon_droq.c
index 0d6ee30affb9..ad673cc141dc 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_droq.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_droq.c
@@ -101,7 +101,7 @@ u32 octeon_droq_check_hw_for_pkts(struct octeon_droq *droq)
 	last_count = pkt_count - droq->pkt_count;
 	droq->pkt_count = pkt_count;
 
-	/* we shall write to cnts  at napi irq enable or end of droq tasklet */
+	/* we shall write to cnts  at napi irq enable or end of droq BH work */
 	if (last_count)
 		atomic_add(last_count, &droq->pkts_pending);
 
@@ -769,7 +769,7 @@ octeon_droq_process_packets(struct octeon_device *oct,
 				(u16)rdisp->rinfo->recv_pkt->rh.r.subcode));
 	}
 
-	/* If there are packets pending. schedule tasklet again */
+	/* If there are packets pending, queue the BH work again */
 	if (atomic_read(&droq->pkts_pending))
 		return 1;
 
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_main.h b/drivers/net/ethernet/cavium/liquidio/octeon_main.h
index 5b4cb725f60f..bbc60215d629 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_main.h
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_main.h
@@ -24,6 +24,7 @@
 #define  _OCTEON_MAIN_H_
 
 #include <linux/sched/signal.h>
+#include <linux/workqueue.h>
 
 #if BITS_PER_LONG == 32
 #define CVM_CAST64(v) ((long long)(v))
@@ -36,8 +37,8 @@
 #define DRV_NAME "LiquidIO"
 
 struct octeon_device_priv {
-	/** Tasklet structures for this device. */
-	struct tasklet_struct droq_tasklet;
+	/** Work structures for this device. */
+	struct work_struct droq_work;
 	unsigned long napi_mask;
 	struct octeon_device *dev;
 };
diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 007d4b06819e..f1d61c4a362c 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -11,6 +11,7 @@
 #include <linux/etherdevice.h>
 #include <linux/capability.h>
 #include <linux/net_tstamp.h>
+#include <linux/workqueue.h>
 #include <linux/interrupt.h>
 #include <linux/netdevice.h>
 #include <linux/spinlock.h>
@@ -144,7 +145,7 @@ struct octeon_mgmt {
 	unsigned int last_speed;
 	struct device *dev;
 	struct napi_struct napi;
-	struct tasklet_struct tx_clean_tasklet;
+	struct work_struct tx_clean_work;
 	struct device_node *phy_np;
 	resource_size_t mix_phys;
 	resource_size_t mix_size;
@@ -315,9 +316,9 @@ static void octeon_mgmt_clean_tx_buffers(struct octeon_mgmt *p)
 		netif_wake_queue(p->netdev);
 }
 
-static void octeon_mgmt_clean_tx_tasklet(struct tasklet_struct *t)
+static void octeon_mgmt_clean_tx_work(struct work_struct *t)
 {
-	struct octeon_mgmt *p = from_tasklet(p, t, tx_clean_tasklet);
+	struct octeon_mgmt *p = from_work(p, t, tx_clean_work);
 	octeon_mgmt_clean_tx_buffers(p);
 	octeon_mgmt_enable_tx_irq(p);
 }
@@ -684,7 +685,7 @@ static irqreturn_t octeon_mgmt_interrupt(int cpl, void *dev_id)
 	}
 	if (mixx_isr.s.orthresh) {
 		octeon_mgmt_disable_tx_irq(p);
-		tasklet_schedule(&p->tx_clean_tasklet);
+		queue_work(system_bh_wq, &p->tx_clean_work);
 	}
 
 	return IRQ_HANDLED;
@@ -1487,8 +1488,7 @@ static int octeon_mgmt_probe(struct platform_device *pdev)
 
 	skb_queue_head_init(&p->tx_list);
 	skb_queue_head_init(&p->rx_list);
-	tasklet_setup(&p->tx_clean_tasklet,
-		      octeon_mgmt_clean_tx_tasklet);
+	INIT_WORK(&p->tx_clean_work, octeon_mgmt_clean_tx_work);
 
 	netdev->priv_flags |= IFF_UNICAST_FLT;
 
diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 090d6b83982a..4ffe6e177d8c 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -9,6 +9,7 @@
 #include <linux/netdevice.h>
 #include <linux/interrupt.h>
 #include <linux/pci.h>
+#include <linux/workqueue.h>
 #include "thunder_bgx.h"
 
 /* PCI device IDs */
@@ -295,7 +296,7 @@ struct nicvf {
 	bool			rb_work_scheduled;
 	struct page		*rb_page;
 	struct delayed_work	rbdr_work;
-	struct tasklet_struct	rbdr_task;
+	struct work_struct	rbdr_task;
 
 	/* Secondary Qset */
 	u8			sqs_count;
@@ -319,7 +320,7 @@ struct nicvf {
 	bool			loopback_supported;
 	struct nicvf_rss_info	rss_info;
 	struct nicvf_pfc	pfc;
-	struct tasklet_struct	qs_err_task;
+	struct work_struct	qs_err_task;
 	struct work_struct	reset_task;
 	struct nicvf_work       rx_mode_work;
 	/* spinlock to protect workqueue arguments from concurrent access */
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index eff350e0bc2a..d2a68d12fca1 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -982,9 +982,9 @@ static int nicvf_poll(struct napi_struct *napi, int budget)
  *
  * As of now only CQ errors are handled
  */
-static void nicvf_handle_qs_err(struct tasklet_struct *t)
+static void nicvf_handle_qs_err(struct work_struct *t)
 {
-	struct nicvf *nic = from_tasklet(nic, t, qs_err_task);
+	struct nicvf *nic = from_work(nic, t, qs_err_task);
 	struct queue_set *qs = nic->qs;
 	int qidx;
 	u64 status;
@@ -1069,7 +1069,7 @@ static irqreturn_t nicvf_rbdr_intr_handler(int irq, void *nicvf_irq)
 		if (!nicvf_is_intr_enabled(nic, NICVF_INTR_RBDR, qidx))
 			continue;
 		nicvf_disable_intr(nic, NICVF_INTR_RBDR, qidx);
-		tasklet_hi_schedule(&nic->rbdr_task);
+		queue_work(system_bh_highpri_wq, &nic->rbdr_task);
 		/* Clear interrupt */
 		nicvf_clear_intr(nic, NICVF_INTR_RBDR, qidx);
 	}
@@ -1085,7 +1085,7 @@ static irqreturn_t nicvf_qs_err_intr_handler(int irq, void *nicvf_irq)
 
 	/* Disable Qset err interrupt and schedule softirq */
 	nicvf_disable_intr(nic, NICVF_INTR_QS_ERR, 0);
-	tasklet_hi_schedule(&nic->qs_err_task);
+	queue_work(system_bh_highpri_wq, &nic->qs_err_task);
 	nicvf_clear_intr(nic, NICVF_INTR_QS_ERR, 0);
 
 	return IRQ_HANDLED;
@@ -1364,8 +1364,8 @@ int nicvf_stop(struct net_device *netdev)
 	for (irq = 0; irq < nic->num_vec; irq++)
 		synchronize_irq(pci_irq_vector(nic->pdev, irq));
 
-	tasklet_kill(&nic->rbdr_task);
-	tasklet_kill(&nic->qs_err_task);
+	cancel_work_sync(&nic->rbdr_task);
+	cancel_work_sync(&nic->qs_err_task);
 	if (nic->rb_work_scheduled)
 		cancel_delayed_work_sync(&nic->rbdr_work);
 
@@ -1488,11 +1488,11 @@ int nicvf_open(struct net_device *netdev)
 		nicvf_hw_set_mac_addr(nic, netdev);
 	}
 
-	/* Init tasklet for handling Qset err interrupt */
-	tasklet_setup(&nic->qs_err_task, nicvf_handle_qs_err);
+	/* Init work for handling Qset err interrupt */
+	INIT_WORK(&nic->qs_err_task, nicvf_handle_qs_err);
 
-	/* Init RBDR tasklet which will refill RBDR */
-	tasklet_setup(&nic->rbdr_task, nicvf_rbdr_task);
+	/* Init RBDR work which will refill RBDR */
+	INIT_WORK(&nic->rbdr_task, nicvf_rbdr_task);
 	INIT_DELAYED_WORK(&nic->rbdr_work, nicvf_rbdr_work);
 
 	/* Configure CPI alorithm */
@@ -1561,8 +1561,8 @@ int nicvf_open(struct net_device *netdev)
 cleanup:
 	nicvf_disable_intr(nic, NICVF_INTR_MBOX, 0);
 	nicvf_unregister_interrupts(nic);
-	tasklet_kill(&nic->qs_err_task);
-	tasklet_kill(&nic->rbdr_task);
+	cancel_work_sync(&nic->qs_err_task);
+	cancel_work_sync(&nic->rbdr_task);
 napi_del:
 	for (qidx = 0; qidx < qs->cq_cnt; qidx++) {
 		cq_poll = nic->napi[qidx];
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 06397cc8bb36..79b80eb8c0b0 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -8,6 +8,7 @@
 #include <linux/ip.h>
 #include <linux/etherdevice.h>
 #include <linux/iommu.h>
+#include <linux/workqueue.h>
 #include <net/ip.h>
 #include <net/tso.h>
 #include <uapi/linux/bpf.h>
@@ -461,9 +462,9 @@ void nicvf_rbdr_work(struct work_struct *work)
 }
 
 /* In Softirq context, alloc rcv buffers in atomic mode */
-void nicvf_rbdr_task(struct tasklet_struct *t)
+void nicvf_rbdr_task(struct work_struct *t)
 {
-	struct nicvf *nic = from_tasklet(nic, t, rbdr_task);
+	struct nicvf *nic = from_work(nic, t, rbdr_task);
 
 	nicvf_refill_rbdr(nic, GFP_ATOMIC);
 	if (nic->rb_alloc_fail) {
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 8453defc296c..e167a065c7f6 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -8,6 +8,7 @@
 
 #include <linux/netdevice.h>
 #include <linux/iommu.h>
+#include <linux/workqueue.h>
 #include <net/xdp.h>
 #include "q_struct.h"
 
@@ -348,7 +349,7 @@ void nicvf_xdp_sq_doorbell(struct nicvf *nic, struct snd_queue *sq, int sq_num);
 
 struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic,
 				  struct cqe_rx_t *cqe_rx, bool xdp);
-void nicvf_rbdr_task(struct tasklet_struct *t);
+void nicvf_rbdr_task(struct work_struct *t);
 void nicvf_rbdr_work(struct work_struct *work);
 
 void nicvf_enable_intr(struct nicvf *nic, int int_type, int q_idx);
diff --git a/drivers/net/ethernet/chelsio/cxgb/sge.c b/drivers/net/ethernet/chelsio/cxgb/sge.c
index 861edff5ed89..3075a5c5c616 100644
--- a/drivers/net/ethernet/chelsio/cxgb/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb/sge.c
@@ -44,6 +44,7 @@
 #include <linux/if_arp.h>
 #include <linux/slab.h>
 #include <linux/prefetch.h>
+#include <linux/workqueue.h>
 
 #include "cpl5_cmd.h"
 #include "sge.h"
@@ -229,11 +230,11 @@ struct sched {
 	unsigned int	port;		/* port index (round robin ports) */
 	unsigned int	num;		/* num skbs in per port queues */
 	struct sched_port p[MAX_NPORTS];
-	struct tasklet_struct sched_tsk;/* tasklet used to run scheduler */
+	struct work_struct sched_tsk;/* work used to run scheduler */
 	struct sge *sge;
 };
 
-static void restart_sched(struct tasklet_struct *t);
+static void restart_sched(struct work_struct *t);
 
 
 /*
@@ -270,14 +271,14 @@ static const u8 ch_mac_addr[ETH_ALEN] = {
 };
 
 /*
- * stop tasklet and free all pending skb's
+ * cancel the work item and free all pending skb's
  */
 static void tx_sched_stop(struct sge *sge)
 {
 	struct sched *s = sge->tx_sched;
 	int i;
 
-	tasklet_kill(&s->sched_tsk);
+	cancel_work_sync(&s->sched_tsk);
 
 	for (i = 0; i < MAX_NPORTS; i++)
 		__skb_queue_purge(&s->p[s->port].skbq);
@@ -371,7 +372,7 @@ static int tx_sched_init(struct sge *sge)
 		return -ENOMEM;
 
 	pr_debug("tx_sched_init\n");
-	tasklet_setup(&s->sched_tsk, restart_sched);
+	INIT_WORK(&s->sched_tsk, restart_sched);
 	s->sge = sge;
 	sge->tx_sched = s;
 
@@ -1300,12 +1301,12 @@ static inline void reclaim_completed_tx(struct sge *sge, struct cmdQ *q)
 }
 
 /*
- * Called from tasklet. Checks the scheduler for any
+ * Called from the BH work item. Checks the scheduler for any
  * pending skbs that can be sent.
  */
-static void restart_sched(struct tasklet_struct *t)
+static void restart_sched(struct work_struct *t)
 {
-	struct sched *s = from_tasklet(s, t, sched_tsk);
+	struct sched *s = from_work(s, t, sched_tsk);
 	struct sge *sge = s->sge;
 	struct adapter *adapter = sge->adapter;
 	struct cmdQ *q = &sge->cmdQ[0];
@@ -1451,7 +1452,7 @@ static unsigned int update_tx_info(struct adapter *adapter,
 			writel(F_CMDQ0_ENABLE, adapter->regs + A_SG_DOORBELL);
 		}
 		if (sge->tx_sched)
-			tasklet_hi_schedule(&sge->tx_sched->sched_tsk);
+			queue_work(system_bh_highpri_wq, &sge->tx_sched->sched_tsk);
 
 		flags &= ~F_CMDQ0_ENABLE;
 	}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index fca9533bc011..ce9b8124495c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -54,6 +54,7 @@
 #include <linux/ptp_classify.h>
 #include <linux/crash_dump.h>
 #include <linux/thermal.h>
+#include <linux/workqueue.h>
 #include <asm/io.h>
 #include "t4_chip_type.h"
 #include "cxgb4_uld.h"
@@ -880,7 +881,7 @@ struct sge_uld_txq {               /* state for an SGE offload Tx queue */
 	struct sge_txq q;
 	struct adapter *adap;
 	struct sk_buff_head sendq;  /* list of backpressured packets */
-	struct tasklet_struct qresume_tsk; /* restarts the queue */
+	struct work_struct qresume_tsk; /* restarts the queue */
 	bool service_ofldq_running; /* service_ofldq() is processing sendq */
 	u8 full;                    /* the Tx ring is full */
 	unsigned long mapping_err;  /* # of I/O MMU packet mapping errors */
@@ -890,7 +891,7 @@ struct sge_ctrl_txq {               /* state for an SGE control Tx queue */
 	struct sge_txq q;
 	struct adapter *adap;
 	struct sk_buff_head sendq;  /* list of backpressured packets */
-	struct tasklet_struct qresume_tsk; /* restarts the queue */
+	struct work_struct qresume_tsk; /* restarts the queue */
 	u8 full;                    /* the Tx ring is full */
 } ____cacheline_aligned_in_smp;
 
@@ -946,7 +947,7 @@ struct sge_eosw_txq {
 
 	u32 hwqid; /* Underlying hardware queue index */
 	struct net_device *netdev; /* Pointer to netdevice */
-	struct tasklet_struct qresume_tsk; /* Restarts the queue */
+	struct work_struct qresume_tsk; /* Restarts the queue */
 	struct completion completion; /* completion for FLOWC rendezvous */
 };
 
@@ -2107,7 +2108,7 @@ void free_tx_desc(struct adapter *adap, struct sge_txq *q,
 void cxgb4_eosw_txq_free_desc(struct adapter *adap, struct sge_eosw_txq *txq,
 			      u32 ndesc);
 int cxgb4_ethofld_send_flowc(struct net_device *dev, u32 eotid, u32 tc);
-void cxgb4_ethofld_restart(struct tasklet_struct *t);
+void cxgb4_ethofld_restart(struct work_struct *t);
 int cxgb4_ethofld_rx_handler(struct sge_rspq *q, const __be64 *rsp,
 			     const struct pkt_gl *si);
 void free_txq(struct adapter *adap, struct sge_txq *q);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 2eb33a727bba..5d9b926aff7d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -589,7 +589,7 @@ static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
 			struct sge_uld_txq *oq;
 
 			oq = container_of(txq, struct sge_uld_txq, q);
-			tasklet_schedule(&oq->qresume_tsk);
+			queue_work(system_bh_wq, &oq->qresume_tsk);
 		}
 	} else if (opcode == CPL_FW6_MSG || opcode == CPL_FW4_MSG) {
 		const struct cpl_fw6_msg *p = (void *)rsp;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c
index 338b04f339b3..9f077841b309 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c
@@ -114,7 +114,7 @@ static int cxgb4_init_eosw_txq(struct net_device *dev,
 	eosw_txq->cred = adap->params.ofldq_wr_cred;
 	eosw_txq->hwqid = hwqid;
 	eosw_txq->netdev = dev;
-	tasklet_setup(&eosw_txq->qresume_tsk, cxgb4_ethofld_restart);
+	INIT_WORK(&eosw_txq->qresume_tsk, cxgb4_ethofld_restart);
 	return 0;
 }
 
@@ -143,7 +143,7 @@ static void cxgb4_free_eosw_txq(struct net_device *dev,
 	cxgb4_clean_eosw_txq(dev, eosw_txq);
 	kfree(eosw_txq->desc);
 	spin_unlock_bh(&eosw_txq->lock);
-	tasklet_kill(&eosw_txq->qresume_tsk);
+	cancel_work_sync(&eosw_txq->qresume_tsk);
 }
 
 static int cxgb4_mqprio_alloc_hw_resources(struct net_device *dev)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
index 17faac715882..388ade2ddca9 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
@@ -407,7 +407,7 @@ free_sge_txq_uld(struct adapter *adap, struct sge_uld_txq_info *txq_info)
 		struct sge_uld_txq *txq = &txq_info->uldtxq[i];
 
 		if (txq && txq->q.desc) {
-			tasklet_kill(&txq->qresume_tsk);
+			cancel_work_sync(&txq->qresume_tsk);
 			t4_ofld_eq_free(adap, adap->mbox, adap->pf, 0,
 					txq->q.cntxt_id);
 			free_tx_desc(adap, &txq->q, txq->q.in_use, false);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 49d5808b7d11..ffa74e45248d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -41,6 +41,7 @@
 #include <linux/jiffies.h>
 #include <linux/prefetch.h>
 #include <linux/export.h>
+#include <linux/workqueue.h>
 #include <net/xfrm.h>
 #include <net/ipv6.h>
 #include <net/tcp.h>
@@ -2769,15 +2770,15 @@ static int ctrl_xmit(struct sge_ctrl_txq *q, struct sk_buff *skb)
 
 /**
  *	restart_ctrlq - restart a suspended control queue
- *	@t: pointer to the tasklet associated with this handler
+ *	@t: pointer to the work item associated with this handler
  *
  *	Resumes transmission on a suspended Tx control queue.
  */
-static void restart_ctrlq(struct tasklet_struct *t)
+static void restart_ctrlq(struct work_struct *t)
 {
 	struct sk_buff *skb;
 	unsigned int written = 0;
-	struct sge_ctrl_txq *q = from_tasklet(q, t, qresume_tsk);
+	struct sge_ctrl_txq *q = from_work(q, t, qresume_tsk);
 
 	spin_lock(&q->sendq.lock);
 	reclaim_completed_tx_imm(&q->q);
@@ -2926,7 +2927,7 @@ static void ofldtxq_stop(struct sge_uld_txq *q, struct fw_wr_hdr *wr)
  *	left on the queue in case we experience DMA Mapping errors, etc.
  *	and need to give up and restart later.
  *
- *	service_ofldq() can be thought of as a task which opportunistically
+ *	service_ofldq() can be thought of as work which opportunistically
  *	uses other threads execution contexts.  We use the Offload Queue
  *	boolean "service_ofldq_running" to make sure that only one instance
  *	is ever running at a time ...
@@ -3075,13 +3076,13 @@ static int ofld_xmit(struct sge_uld_txq *q, struct sk_buff *skb)
 
 /**
  *	restart_ofldq - restart a suspended offload queue
- *	@t: pointer to the tasklet associated with this handler
+ *	@t: pointer to the work item associated with this handler
  *
  *	Resumes transmission on a suspended Tx offload queue.
  */
-static void restart_ofldq(struct tasklet_struct *t)
+static void restart_ofldq(struct work_struct *t)
 {
-	struct sge_uld_txq *q = from_tasklet(q, t, qresume_tsk);
+	struct sge_uld_txq *q = from_work(q, t, qresume_tsk);
 
 	spin_lock(&q->sendq.lock);
 	q->full = 0;            /* the queue actually is completely empty now */
@@ -4020,9 +4021,9 @@ static int napi_rx_handler(struct napi_struct *napi, int budget)
 	return work_done;
 }
 
-void cxgb4_ethofld_restart(struct tasklet_struct *t)
+void cxgb4_ethofld_restart(struct work_struct *t)
 {
-	struct sge_eosw_txq *eosw_txq = from_tasklet(eosw_txq, t,
+	struct sge_eosw_txq *eosw_txq = from_work(eosw_txq, t,
 						     qresume_tsk);
 	int pktcount;
 
@@ -4050,7 +4051,7 @@ void cxgb4_ethofld_restart(struct tasklet_struct *t)
  * @si: the gather list of packet fragments
  *
  * Process a ETHOFLD Tx completion. Increment the cidx here, but
- * free up the descriptors in a tasklet later.
+ * free up the descriptors in a work item later.
  */
 int cxgb4_ethofld_rx_handler(struct sge_rspq *q, const __be64 *rsp,
 			     const struct pkt_gl *si)
@@ -4117,10 +4118,10 @@ int cxgb4_ethofld_rx_handler(struct sge_rspq *q, const __be64 *rsp,
 
 		spin_unlock(&eosw_txq->lock);
 
-		/* Schedule a tasklet to reclaim SKBs and restart ETHOFLD Tx,
+		/* Queue a work item to reclaim SKBs and restart ETHOFLD Tx,
 		 * if there were packets waiting for completion.
 		 */
-		tasklet_schedule(&eosw_txq->qresume_tsk);
+		queue_work(system_bh_wq, &eosw_txq->qresume_tsk);
 	}
 
 out_done:
@@ -4279,7 +4280,7 @@ static void sge_tx_timer_cb(struct timer_list *t)
 			struct sge_uld_txq *txq = s->egr_map[id];
 
 			clear_bit(id, s->txq_maperr);
-			tasklet_schedule(&txq->qresume_tsk);
+			queue_work(system_bh_wq, &txq->qresume_tsk);
 		}
 
 	if (!is_t4(adap->params.chip)) {
@@ -4719,7 +4720,7 @@ int t4_sge_alloc_ctrl_txq(struct adapter *adap, struct sge_ctrl_txq *txq,
 	init_txq(adap, &txq->q, FW_EQ_CTRL_CMD_EQID_G(ntohl(c.cmpliqid_eqid)));
 	txq->adap = adap;
 	skb_queue_head_init(&txq->sendq);
-	tasklet_setup(&txq->qresume_tsk, restart_ctrlq);
+	INIT_WORK(&txq->qresume_tsk, restart_ctrlq);
 	txq->full = 0;
 	return 0;
 }
@@ -4809,7 +4810,7 @@ int t4_sge_alloc_uld_txq(struct adapter *adap, struct sge_uld_txq *txq,
 	txq->q.q_type = CXGB4_TXQ_ULD;
 	txq->adap = adap;
 	skb_queue_head_init(&txq->sendq);
-	tasklet_setup(&txq->qresume_tsk, restart_ofldq);
+	INIT_WORK(&txq->qresume_tsk, restart_ofldq);
 	txq->full = 0;
 	txq->mapping_err = 0;
 	return 0;
@@ -4952,7 +4953,7 @@ void t4_free_sge_resources(struct adapter *adap)
 		struct sge_ctrl_txq *cq = &adap->sge.ctrlq[i];
 
 		if (cq->q.desc) {
-			tasklet_kill(&cq->qresume_tsk);
+			cancel_work_sync(&cq->qresume_tsk);
 			t4_ctrl_eq_free(adap, adap->mbox, adap->pf, 0,
 					cq->q.cntxt_id);
 			__skb_queue_purge(&cq->sendq);
@@ -5002,7 +5003,7 @@ void t4_sge_start(struct adapter *adap)
  *	t4_sge_stop - disable SGE operation
  *	@adap: the adapter
  *
- *	Stop tasklets and timers associated with the DMA engine.  Note that
+ *	Stop work items and timers associated with the DMA engine.  Note that
  *	this is effective only if measures have been taken to disable any HW
  *	events that may restart them.
  */
@@ -5025,7 +5026,7 @@ void t4_sge_stop(struct adapter *adap)
 
 			for_each_ofldtxq(&adap->sge, i) {
 				if (txq->q.desc)
-					tasklet_kill(&txq->qresume_tsk);
+					cancel_work_sync(&txq->qresume_tsk);
 			}
 		}
 	}
@@ -5039,7 +5040,7 @@ void t4_sge_stop(struct adapter *adap)
 
 			for_each_ofldtxq(&adap->sge, i) {
 				if (txq->q.desc)
-					tasklet_kill(&txq->qresume_tsk);
+					cancel_work_sync(&txq->qresume_tsk);
 			}
 		}
 	}
@@ -5048,7 +5049,7 @@ void t4_sge_stop(struct adapter *adap)
 		struct sge_ctrl_txq *cq = &s->ctrlq[i];
 
 		if (cq->q.desc)
-			tasklet_kill(&cq->qresume_tsk);
+			cancel_work_sync(&cq->qresume_tsk);
 	}
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
index 5b1d746e6563..9a449fca079d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
@@ -2587,7 +2587,7 @@ void t4vf_free_sge_resources(struct adapter *adapter)
  *	t4vf_sge_start - enable SGE operation
  *	@adapter: the adapter
  *
- *	Start tasklets and timers associated with the DMA engine.
+ *	Start work items and timers associated with the DMA engine.
  */
 void t4vf_sge_start(struct adapter *adapter)
 {
@@ -2600,7 +2600,7 @@ void t4vf_sge_start(struct adapter *adapter)
  *	t4vf_sge_stop - disable SGE operation
  *	@adapter: the adapter
  *
- *	Stop tasklets and timers associated with the DMA engine.  Note that
+ *	Stop work items and timers associated with the DMA engine.  Note that
  *	this is effective only if measures have been taken to disable any HW
  *	events that may restart them.
  */
@@ -2692,7 +2692,7 @@ int t4vf_sge_init(struct adapter *adapter)
 	s->fl_starve_thres = s->fl_starve_thres * 2 + 1;
 
 	/*
-	 * Set up tasklet timers.
+	 * Set up timers.
 	 */
 	timer_setup(&s->rx_timer, sge_rx_timer_cb, 0);
 	timer_setup(&s->tx_timer, sge_tx_timer_cb, 0);
diff --git a/drivers/net/ethernet/dlink/sundance.c b/drivers/net/ethernet/dlink/sundance.c
index aaf0eda96292..44cd33facdab 100644
--- a/drivers/net/ethernet/dlink/sundance.c
+++ b/drivers/net/ethernet/dlink/sundance.c
@@ -97,6 +97,7 @@ static char *media[MAX_UNITS];
 #include <linux/crc32.h>
 #include <linux/ethtool.h>
 #include <linux/mii.h>
+#include <linux/workqueue.h>
 
 MODULE_AUTHOR("Donald Becker <becker@scyld.com>");
 MODULE_DESCRIPTION("Sundance Alta Ethernet driver");
@@ -395,8 +396,8 @@ struct netdev_private {
 	unsigned int an_enable:1;
 	unsigned int speed;
 	unsigned int wol_enabled:1;			/* Wake on LAN enabled */
-	struct tasklet_struct rx_tasklet;
-	struct tasklet_struct tx_tasklet;
+	struct work_struct rx_work;
+	struct work_struct tx_work;
 	int budget;
 	int cur_task;
 	/* Multicast and receive mode. */
@@ -430,8 +431,8 @@ static void init_ring(struct net_device *dev);
 static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev);
 static int reset_tx (struct net_device *dev);
 static irqreturn_t intr_handler(int irq, void *dev_instance);
-static void rx_poll(struct tasklet_struct *t);
-static void tx_poll(struct tasklet_struct *t);
+static void rx_poll(struct work_struct *t);
+static void tx_poll(struct work_struct *t);
 static void refill_rx (struct net_device *dev);
 static void netdev_error(struct net_device *dev, int intr_status);
 static void netdev_error(struct net_device *dev, int intr_status);
@@ -541,8 +542,8 @@ static int sundance_probe1(struct pci_dev *pdev,
 	np->msg_enable = (1 << debug) - 1;
 	spin_lock_init(&np->lock);
 	spin_lock_init(&np->statlock);
-	tasklet_setup(&np->rx_tasklet, rx_poll);
-	tasklet_setup(&np->tx_tasklet, tx_poll);
+	INIT_WORK(&np->rx_work, rx_poll);
+	INIT_WORK(&np->tx_work, tx_poll);
 
 	ring_space = dma_alloc_coherent(&pdev->dev, TX_TOTAL_SIZE,
 			&ring_dma, GFP_KERNEL);
@@ -965,7 +966,7 @@ static void tx_timeout(struct net_device *dev, unsigned int txqueue)
 	unsigned long flag;
 
 	netif_stop_queue(dev);
-	tasklet_disable_in_atomic(&np->tx_tasklet);
+	disable_work_sync(&np->tx_work);
 	iowrite16(0, ioaddr + IntrEnable);
 	printk(KERN_WARNING "%s: Transmit timed out, TxStatus %2.2x "
 		   "TxFrameId %2.2x,"
@@ -1006,7 +1007,7 @@ static void tx_timeout(struct net_device *dev, unsigned int txqueue)
 		netif_wake_queue(dev);
 	}
 	iowrite16(DEFAULT_INTR, ioaddr + IntrEnable);
-	tasklet_enable(&np->tx_tasklet);
+	enable_and_queue_work(system_bh_wq, &np->tx_work);
 }
 
 
@@ -1058,9 +1059,9 @@ static void init_ring(struct net_device *dev)
 	}
 }
 
-static void tx_poll(struct tasklet_struct *t)
+static void tx_poll(struct work_struct *t)
 {
-	struct netdev_private *np = from_tasklet(np, t, tx_tasklet);
+	struct netdev_private *np = from_work(np, t, tx_work);
 	unsigned head = np->cur_task % TX_RING_SIZE;
 	struct netdev_desc *txdesc =
 		&np->tx_ring[(np->cur_tx - 1) % TX_RING_SIZE];
@@ -1104,11 +1105,11 @@ start_tx (struct sk_buff *skb, struct net_device *dev)
 			goto drop_frame;
 	txdesc->frag.length = cpu_to_le32 (skb->len | LastFrag);
 
-	/* Increment cur_tx before tasklet_schedule() */
+	/* Increment cur_tx before queue_work() */
 	np->cur_tx++;
 	mb();
-	/* Schedule a tx_poll() task */
-	tasklet_schedule(&np->tx_tasklet);
+	/* Schedule a tx_poll() work */
+	queue_work(system_bh_wq, &np->tx_work);
 
 	/* On some architectures: explicitly flush cache lines here. */
 	if (np->cur_tx - np->dirty_tx < TX_QUEUE_LEN - 1 &&
@@ -1199,7 +1200,7 @@ static irqreturn_t intr_handler(int irq, void *dev_instance)
 					ioaddr + IntrEnable);
 			if (np->budget < 0)
 				np->budget = RX_BUDGET;
-			tasklet_schedule(&np->rx_tasklet);
+			queue_work(system_bh_wq, &np->rx_work);
 		}
 		if (intr_status & (IntrTxDone | IntrDrvRqst)) {
 			tx_status = ioread16 (ioaddr + TxStatus);
@@ -1315,9 +1316,9 @@ static irqreturn_t intr_handler(int irq, void *dev_instance)
 	return IRQ_RETVAL(handled);
 }
 
-static void rx_poll(struct tasklet_struct *t)
+static void rx_poll(struct work_struct *t)
 {
-	struct netdev_private *np = from_tasklet(np, t, rx_tasklet);
+	struct netdev_private *np = from_work(np, t, rx_work);
 	struct net_device *dev = np->ndev;
 	int entry = np->cur_rx % RX_RING_SIZE;
 	int boguscnt = np->budget;
@@ -1407,7 +1408,7 @@ static void rx_poll(struct tasklet_struct *t)
 	np->budget -= received;
 	if (np->budget <= 0)
 		np->budget = RX_BUDGET;
-	tasklet_schedule(&np->rx_tasklet);
+	queue_work(system_bh_wq, &np->rx_work);
 }
 
 static void refill_rx (struct net_device *dev)
@@ -1819,9 +1820,9 @@ static int netdev_close(struct net_device *dev)
 	struct sk_buff *skb;
 	int i;
 
-	/* Wait and kill tasklet */
-	tasklet_kill(&np->rx_tasklet);
-	tasklet_kill(&np->tx_tasklet);
+	/* Wait for and cancel pending work */
+	cancel_work_sync(&np->rx_work);
+	cancel_work_sync(&np->tx_work);
 	np->cur_tx = 0;
 	np->dirty_tx = 0;
 	np->cur_task = 0;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c
index d39eec9c62bf..02145cb1ebc4 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_cmdq.c
@@ -344,7 +344,7 @@ static int cmdq_sync_cmd_direct_resp(struct hinic_cmdq *cmdq,
 	struct hinic_hw_wqe *hw_wqe;
 	struct completion done;
 
-	/* Keep doorbell index correct. bh - for tasklet(ceq). */
+	/* Keep doorbell index correct. bh - for BH work(ceq). */
 	spin_lock_bh(&cmdq->cmdq_lock);
 
 	/* WQE_SIZE = WQEBB_SIZE, we will get the wq element and not shadow*/
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.c
index 045c47786a04..66c36c151294 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.c
@@ -368,12 +368,12 @@ static void eq_irq_work(struct work_struct *work)
 }
 
 /**
- * ceq_tasklet - the tasklet of the EQ that received the event
- * @t: the tasklet struct pointer
+ * ceq_work - the work of the EQ that received the event
+ * @t: the work struct pointer
  **/
-static void ceq_tasklet(struct tasklet_struct *t)
+static void ceq_work(struct work_struct *t)
 {
-	struct hinic_eq *ceq = from_tasklet(ceq, t, ceq_tasklet);
+	struct hinic_eq *ceq = from_work(ceq, t, ceq_work);
 
 	eq_irq_handler(ceq);
 }
@@ -413,7 +413,7 @@ static irqreturn_t ceq_interrupt(int irq, void *data)
 	/* clear resend timer cnt register */
 	hinic_msix_attr_cnt_clear(ceq->hwif, ceq->msix_entry.entry);
 
-	tasklet_schedule(&ceq->ceq_tasklet);
+	queue_work(system_bh_wq, &ceq->ceq_work);
 
 	return IRQ_HANDLED;
 }
@@ -782,7 +782,7 @@ static int init_eq(struct hinic_eq *eq, struct hinic_hwif *hwif,
 
 		INIT_WORK(&aeq_work->work, eq_irq_work);
 	} else if (type == HINIC_CEQ) {
-		tasklet_setup(&eq->ceq_tasklet, ceq_tasklet);
+		INIT_WORK(&eq->ceq_work, ceq_work);
 	}
 
 	/* set the attributes of the msix entry */
@@ -833,7 +833,7 @@ static void remove_eq(struct hinic_eq *eq)
 		hinic_hwif_write_reg(eq->hwif,
 				     HINIC_CSR_AEQ_CTRL_1_ADDR(eq->q_id), 0);
 	} else if (eq->type == HINIC_CEQ) {
-		tasklet_kill(&eq->ceq_tasklet);
+		cancel_work_sync(&eq->ceq_work);
 		/* clear ceq_len to avoid hw access host memory */
 		hinic_hwif_write_reg(eq->hwif,
 				     HINIC_CSR_CEQ_CTRL_1_ADDR(eq->q_id), 0);
@@ -968,9 +968,8 @@ void hinic_dump_ceq_info(struct hinic_hwdev *hwdev)
 		ci = hinic_hwif_read_reg(hwdev->hwif, addr);
 		addr = EQ_PROD_IDX_REG_ADDR(eq);
 		pi = hinic_hwif_read_reg(hwdev->hwif, addr);
-		dev_err(&hwdev->hwif->pdev->dev, "Ceq id: %d, ci: 0x%08x, sw_ci: 0x%08x, pi: 0x%x, tasklet_state: 0x%lx, wrap: %d, ceqe: 0x%x\n",
+		dev_err(&hwdev->hwif->pdev->dev, "Ceq id: %d, ci: 0x%08x, sw_ci: 0x%08x, pi: 0x%x, wrap: %d, ceqe: 0x%x\n",
 			q_id, ci, eq->cons_idx, pi,
-			eq->ceq_tasklet.state,
 			eq->wrapped, be32_to_cpu(*(__be32 *)(GET_CURR_CEQ_ELEM(eq))));
 	}
 }
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.h
index 2f3222174fc7..49c08bebc07f 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_eqs.h
@@ -193,7 +193,7 @@ struct hinic_eq {
 
 	struct hinic_eq_work    aeq_work;
 
-	struct tasklet_struct   ceq_tasklet;
+	struct work_struct ceq_work;
 };
 
 struct hinic_hw_event_cb {
diff --git a/drivers/net/ethernet/ibm/ehea/ehea.h b/drivers/net/ethernet/ibm/ehea/ehea.h
index 208c440a602b..eeb124b5f9c5 100644
--- a/drivers/net/ethernet/ibm/ehea/ehea.h
+++ b/drivers/net/ethernet/ibm/ehea/ehea.h
@@ -20,6 +20,7 @@
 #include <linux/vmalloc.h>
 #include <linux/if_vlan.h>
 #include <linux/platform_device.h>
+#include <linux/workqueue.h>
 
 #include <asm/ibmebus.h>
 #include <asm/io.h>
@@ -381,7 +382,7 @@ struct ehea_adapter {
 	struct platform_device *ofdev;
 	struct ehea_port *port[EHEA_MAX_PORTS];
 	struct ehea_eq *neq;       /* notification event queue */
-	struct tasklet_struct neq_tasklet;
+	struct work_struct neq_work;
 	struct ehea_mr mr;
 	u32 pd;                    /* protection domain */
 	u64 max_mc_mac;            /* max number of multicast mac addresses */
diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
index 1e29e5c9a2df..88db27778363 100644
--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
+++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -976,7 +976,7 @@ int ehea_sense_port_attr(struct ehea_port *port)
 	u64 hret;
 	struct hcp_ehea_port_cb0 *cb0;
 
-	/* may be called via ehea_neq_tasklet() */
+	/* may be called via ehea_neq_work() */
 	cb0 = (void *)get_zeroed_page(GFP_ATOMIC);
 	if (!cb0) {
 		pr_err("no mem for cb0\n");
@@ -1216,9 +1216,9 @@ static void ehea_parse_eqe(struct ehea_adapter *adapter, u64 eqe)
 	}
 }
 
-static void ehea_neq_tasklet(struct tasklet_struct *t)
+static void ehea_neq_work(struct work_struct *t)
 {
-	struct ehea_adapter *adapter = from_tasklet(adapter, t, neq_tasklet);
+	struct ehea_adapter *adapter = from_work(adapter, t, neq_work);
 	struct ehea_eqe *eqe;
 	u64 event_mask;
 
@@ -1243,7 +1243,7 @@ static void ehea_neq_tasklet(struct tasklet_struct *t)
 static irqreturn_t ehea_interrupt_neq(int irq, void *param)
 {
 	struct ehea_adapter *adapter = param;
-	tasklet_hi_schedule(&adapter->neq_tasklet);
+	queue_work(system_bh_highpri_wq, &adapter->neq_work);
 	return IRQ_HANDLED;
 }
 
@@ -3423,7 +3423,7 @@ static int ehea_probe_adapter(struct platform_device *dev)
 		goto out_free_ad;
 	}
 
-	tasklet_setup(&adapter->neq_tasklet, ehea_neq_tasklet);
+	INIT_WORK(&adapter->neq_work, ehea_neq_work);
 
 	ret = ehea_create_device_sysfs(dev);
 	if (ret)
@@ -3444,7 +3444,7 @@ static int ehea_probe_adapter(struct platform_device *dev)
 	}
 
 	/* Handle any events that might be pending. */
-	tasklet_hi_schedule(&adapter->neq_tasklet);
+	queue_work(system_bh_highpri_wq, &adapter->neq_work);
 
 	ret = 0;
 	goto out;
@@ -3485,7 +3485,7 @@ static void ehea_remove(struct platform_device *dev)
 	ehea_remove_device_sysfs(dev);
 
 	ibmebus_free_irq(adapter->neq->attr.ist1, adapter);
-	tasklet_kill(&adapter->neq_tasklet);
+	cancel_work_sync(&adapter->neq_work);
 
 	ehea_destroy_eq(adapter->neq);
 	ehea_remove_adapter_mr(adapter);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 30c47b8470ad..5e09fdd9b63b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2721,7 +2721,7 @@ static const char *reset_reason_to_string(enum ibmvnic_reset_reason reason)
 /*
  * Initialize the init_done completion and return code values. We
  * can get a transport event just after registering the CRQ and the
- * tasklet will use this to communicate the transport event. To ensure
+ * work will use this to communicate the transport event. To ensure
  * we don't miss the notification/error, initialize these _before_
  * regisering the CRQ.
  */
@@ -4425,7 +4425,7 @@ static void send_request_cap(struct ibmvnic_adapter *adapter, int retry)
 	int cap_reqs;
 
 	/* We send out 6 or 7 REQUEST_CAPABILITY CRQs below (depending on
-	 * the PROMISC flag). Initialize this count upfront. When the tasklet
+	 * the PROMISC flag). Initialize this count upfront. When the work
 	 * receives a response to all of these, it will send the next protocol
 	 * message (QUERY_IP_OFFLOAD).
 	 */
@@ -4961,7 +4961,7 @@ static void send_query_cap(struct ibmvnic_adapter *adapter)
 	int cap_reqs;
 
 	/* We send out 25 QUERY_CAPABILITY CRQs below.  Initialize this count
-	 * upfront. When the tasklet receives a response to all of these, it
+	 * upfront. When the work receives a response to all of these, it
 	 * can send out the next protocol messaage (REQUEST_CAPABILITY).
 	 */
 	cap_reqs = 25;
@@ -5473,7 +5473,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
 	int i;
 
 	/* CHECK: Test/set of login_pending does not need to be atomic
-	 * because only ibmvnic_tasklet tests/clears this.
+	 * because only ibmvnic_work tests/clears this.
 	 */
 	if (!adapter->login_pending) {
 		netdev_warn(netdev, "Ignoring unexpected login response\n");
@@ -6059,13 +6059,13 @@ static irqreturn_t ibmvnic_interrupt(int irq, void *instance)
 {
 	struct ibmvnic_adapter *adapter = instance;
 
-	tasklet_schedule(&adapter->tasklet);
+	queue_work(system_bh_wq, &adapter->work);
 	return IRQ_HANDLED;
 }
 
-static void ibmvnic_tasklet(struct tasklet_struct *t)
+static void ibmvnic_work(struct work_struct *t)
 {
-	struct ibmvnic_adapter *adapter = from_tasklet(adapter, t, tasklet);
+	struct ibmvnic_adapter *adapter = from_work(adapter, t, work);
 	struct ibmvnic_crq_queue *queue = &adapter->crq;
 	union ibmvnic_crq *crq;
 	unsigned long flags;
@@ -6146,7 +6146,7 @@ static void release_crq_queue(struct ibmvnic_adapter *adapter)
 
 	netdev_dbg(adapter->netdev, "Releasing CRQ\n");
 	free_irq(vdev->irq, adapter);
-	tasklet_kill(&adapter->tasklet);
+	cancel_work_sync(&adapter->work);
 	do {
 		rc = plpar_hcall_norets(H_FREE_CRQ, vdev->unit_address);
 	} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
@@ -6197,7 +6197,7 @@ static int init_crq_queue(struct ibmvnic_adapter *adapter)
 
 	retrc = 0;
 
-	tasklet_setup(&adapter->tasklet, (void *)ibmvnic_tasklet);
+	INIT_WORK(&adapter->work, ibmvnic_work);
 
 	netdev_dbg(adapter->netdev, "registering irq 0x%x\n", vdev->irq);
 	snprintf(crq->name, sizeof(crq->name), "ibmvnic-%x",
@@ -6219,12 +6219,12 @@ static int init_crq_queue(struct ibmvnic_adapter *adapter)
 	spin_lock_init(&crq->lock);
 
 	/* process any CRQs that were queued before we enabled interrupts */
-	tasklet_schedule(&adapter->tasklet);
+	queue_work(system_bh_wq, &adapter->work);
 
 	return retrc;
 
 req_irq_failed:
-	tasklet_kill(&adapter->tasklet);
+	cancel_work_sync(&adapter->work);
 	do {
 		rc = plpar_hcall_norets(H_FREE_CRQ, vdev->unit_address);
 	} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
@@ -6617,7 +6617,7 @@ static int ibmvnic_resume(struct device *dev)
 	if (adapter->state != VNIC_OPEN)
 		return 0;
 
-	tasklet_schedule(&adapter->tasklet);
+	queue_work(system_bh_wq, &adapter->work);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 94ac36b1408b..8afceba3b427 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1036,7 +1036,7 @@ struct ibmvnic_adapter {
 	u32 cur_rx_buf_sz;
 	u32 prev_rx_buf_sz;
 
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 	enum vnic_state state;
 	/* Used for serialization of state field. When taking both state
 	 * and rwi locks, take state lock first.
diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index 1732ec3c3dbd..1fa89d45be0a 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -1141,7 +1141,7 @@ jme_dynamic_pcc(struct jme_adapter *jme)
 
 	if (unlikely(dpi->attempt != dpi->cur && dpi->cnt > 5)) {
 		if (dpi->attempt < dpi->cur)
-			tasklet_schedule(&jme->rxclean_task);
+			queue_work(system_bh_wq, &jme->rxclean_task);
 		jme_set_rx_pcc(jme, dpi->attempt);
 		dpi->cur = dpi->attempt;
 		dpi->cnt = 0;
@@ -1182,9 +1182,9 @@ jme_shutdown_nic(struct jme_adapter *jme)
 }
 
 static void
-jme_pcc_tasklet(struct tasklet_struct *t)
+jme_pcc_work(struct work_struct *t)
 {
-	struct jme_adapter *jme = from_tasklet(jme, t, pcc_task);
+	struct jme_adapter *jme = from_work(jme, t, pcc_task);
 	struct net_device *netdev = jme->dev;
 
 	if (unlikely(test_bit(JME_FLAG_SHUTDOWN, &jme->flags))) {
@@ -1282,9 +1282,9 @@ static void jme_link_change_work(struct work_struct *work)
 		jme_stop_shutdown_timer(jme);
 
 	jme_stop_pcc_timer(jme);
-	tasklet_disable(&jme->txclean_task);
-	tasklet_disable(&jme->rxclean_task);
-	tasklet_disable(&jme->rxempty_task);
+	disable_work_sync(&jme->txclean_task);
+	disable_work_sync(&jme->rxclean_task);
+	disable_work_sync(&jme->rxempty_task);
 
 	if (netif_carrier_ok(netdev)) {
 		jme_disable_rx_engine(jme);
@@ -1304,7 +1304,7 @@ static void jme_link_change_work(struct work_struct *work)
 		rc = jme_setup_rx_resources(jme);
 		if (rc) {
 			pr_err("Allocating resources for RX error, Device STOPPED!\n");
-			goto out_enable_tasklet;
+			goto out_enable_work;
 		}
 
 		rc = jme_setup_tx_resources(jme);
@@ -1326,22 +1326,22 @@ static void jme_link_change_work(struct work_struct *work)
 		jme_start_shutdown_timer(jme);
 	}
 
-	goto out_enable_tasklet;
+	goto out_enable_work;
 
 err_out_free_rx_resources:
 	jme_free_rx_resources(jme);
-out_enable_tasklet:
-	tasklet_enable(&jme->txclean_task);
-	tasklet_enable(&jme->rxclean_task);
-	tasklet_enable(&jme->rxempty_task);
+out_enable_work:
+	enable_and_queue_work(system_bh_wq, &jme->txclean_task);
+	enable_and_queue_work(system_bh_wq, &jme->rxclean_task);
+	enable_and_queue_work(system_bh_wq, &jme->rxempty_task);
 out:
 	atomic_inc(&jme->link_changing);
 }
 
 static void
-jme_rx_clean_tasklet(struct tasklet_struct *t)
+jme_rx_clean_work(struct work_struct *t)
 {
-	struct jme_adapter *jme = from_tasklet(jme, t, rxclean_task);
+	struct jme_adapter *jme = from_work(jme, t, rxclean_task);
 	struct dynpcc_info *dpi = &(jme->dpi);
 
 	jme_process_receive(jme, jme->rx_ring_size);
@@ -1374,9 +1374,9 @@ jme_poll(JME_NAPI_HOLDER(holder), JME_NAPI_WEIGHT(budget))
 }
 
 static void
-jme_rx_empty_tasklet(struct tasklet_struct *t)
+jme_rx_empty_work(struct work_struct *t)
 {
-	struct jme_adapter *jme = from_tasklet(jme, t, rxempty_task);
+	struct jme_adapter *jme = from_work(jme, t, rxempty_task);
 
 	if (unlikely(atomic_read(&jme->link_changing) != 1))
 		return;
@@ -1386,7 +1386,7 @@ jme_rx_empty_tasklet(struct tasklet_struct *t)
 
 	netif_info(jme, rx_status, jme->dev, "RX Queue Full!\n");
 
-	jme_rx_clean_tasklet(&jme->rxclean_task);
+	jme_rx_clean_work(&jme->rxclean_task);
 
 	while (atomic_read(&jme->rx_empty) > 0) {
 		atomic_dec(&jme->rx_empty);
@@ -1410,9 +1410,9 @@ jme_wake_queue_if_stopped(struct jme_adapter *jme)
 
 }
 
-static void jme_tx_clean_tasklet(struct tasklet_struct *t)
+static void jme_tx_clean_work(struct work_struct *t)
 {
-	struct jme_adapter *jme = from_tasklet(jme, t, txclean_task);
+	struct jme_adapter *jme = from_work(jme, t, txclean_task);
 	struct jme_ring *txring = &(jme->txring[0]);
 	struct txdesc *txdesc = txring->desc;
 	struct jme_buffer_info *txbi = txring->bufinf, *ctxbi, *ttxbi;
@@ -1510,12 +1510,12 @@ jme_intr_msi(struct jme_adapter *jme, u32 intrstat)
 
 	if (intrstat & INTR_TMINTR) {
 		jwrite32(jme, JME_IEVE, INTR_TMINTR);
-		tasklet_schedule(&jme->pcc_task);
+		queue_work(system_bh_wq, &jme->pcc_task);
 	}
 
 	if (intrstat & (INTR_PCCTXTO | INTR_PCCTX)) {
 		jwrite32(jme, JME_IEVE, INTR_PCCTXTO | INTR_PCCTX | INTR_TX0);
-		tasklet_schedule(&jme->txclean_task);
+		queue_work(system_bh_wq, &jme->txclean_task);
 	}
 
 	if ((intrstat & (INTR_PCCRX0TO | INTR_PCCRX0 | INTR_RX0EMP))) {
@@ -1538,9 +1538,9 @@ jme_intr_msi(struct jme_adapter *jme, u32 intrstat)
 	} else {
 		if (intrstat & INTR_RX0EMP) {
 			atomic_inc(&jme->rx_empty);
-			tasklet_hi_schedule(&jme->rxempty_task);
+			queue_work(system_bh_highpri_wq, &jme->rxempty_task);
 		} else if (intrstat & (INTR_PCCRX0TO | INTR_PCCRX0)) {
-			tasklet_hi_schedule(&jme->rxclean_task);
+			queue_work(system_bh_highpri_wq, &jme->rxclean_task);
 		}
 	}
 
@@ -1826,9 +1826,9 @@ jme_open(struct net_device *netdev)
 	jme_clear_pm_disable_wol(jme);
 	JME_NAPI_ENABLE(jme);
 
-	tasklet_setup(&jme->txclean_task, jme_tx_clean_tasklet);
-	tasklet_setup(&jme->rxclean_task, jme_rx_clean_tasklet);
-	tasklet_setup(&jme->rxempty_task, jme_rx_empty_tasklet);
+	INIT_WORK(&jme->txclean_task, jme_tx_clean_work);
+	INIT_WORK(&jme->rxclean_task, jme_rx_clean_work);
+	INIT_WORK(&jme->rxempty_task, jme_rx_empty_work);
 
 	rc = jme_request_irq(jme);
 	if (rc)
@@ -1914,9 +1914,9 @@ jme_close(struct net_device *netdev)
 	JME_NAPI_DISABLE(jme);
 
 	cancel_work_sync(&jme->linkch_task);
-	tasklet_kill(&jme->txclean_task);
-	tasklet_kill(&jme->rxclean_task);
-	tasklet_kill(&jme->rxempty_task);
+	cancel_work_sync(&jme->txclean_task);
+	cancel_work_sync(&jme->rxclean_task);
+	cancel_work_sync(&jme->rxempty_task);
 
 	jme_disable_rx_engine(jme);
 	jme_disable_tx_engine(jme);
@@ -3020,7 +3020,7 @@ jme_init_one(struct pci_dev *pdev,
 	atomic_set(&jme->tx_cleaning, 1);
 	atomic_set(&jme->rx_empty, 1);
 
-	tasklet_setup(&jme->pcc_task, jme_pcc_tasklet);
+	INIT_WORK(&jme->pcc_task, jme_pcc_work);
 	INIT_WORK(&jme->linkch_task, jme_link_change_work);
 	jme->dpi.cur = PCC_P1;
 
@@ -3180,9 +3180,9 @@ jme_suspend(struct device *dev)
 	netif_stop_queue(netdev);
 	jme_stop_irq(jme);
 
-	tasklet_disable(&jme->txclean_task);
-	tasklet_disable(&jme->rxclean_task);
-	tasklet_disable(&jme->rxempty_task);
+	disable_work_sync(&jme->txclean_task);
+	disable_work_sync(&jme->rxclean_task);
+	disable_work_sync(&jme->rxempty_task);
 
 	if (netif_carrier_ok(netdev)) {
 		if (test_bit(JME_FLAG_POLL, &jme->flags))
@@ -3198,9 +3198,9 @@ jme_suspend(struct device *dev)
 		jme->phylink = 0;
 	}
 
-	tasklet_enable(&jme->txclean_task);
-	tasklet_enable(&jme->rxclean_task);
-	tasklet_enable(&jme->rxempty_task);
+	enable_and_queue_work(system_bh_wq, &jme->txclean_task);
+	enable_and_queue_work(system_bh_wq, &jme->rxclean_task);
+	enable_and_queue_work(system_bh_wq, &jme->rxempty_task);
 
 	jme_powersave_phy(jme);
 
diff --git a/drivers/net/ethernet/jme.h b/drivers/net/ethernet/jme.h
index 860494ff3714..f485258eebab 100644
--- a/drivers/net/ethernet/jme.h
+++ b/drivers/net/ethernet/jme.h
@@ -12,6 +12,7 @@
 #ifndef __JME_H_INCLUDED__
 #define __JME_H_INCLUDED__
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 
 #define DRV_NAME	"jme"
 #define DRV_VERSION	"1.0.8"
@@ -406,11 +407,11 @@ struct jme_adapter {
 	spinlock_t		phy_lock;
 	spinlock_t		macaddr_lock;
 	spinlock_t		rxmcs_lock;
-	struct tasklet_struct	rxempty_task;
-	struct tasklet_struct	rxclean_task;
-	struct tasklet_struct	txclean_task;
+	struct work_struct	rxempty_task;
+	struct work_struct	rxclean_task;
+	struct work_struct	txclean_task;
 	struct work_struct	linkch_task;
-	struct tasklet_struct	pcc_task;
+	struct work_struct	pcc_task;
 	unsigned long		flags;
 	u32			reg_txcs;
 	u32			reg_txpfc;
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 23adf53c2aa1..469312d15e6f 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -2629,7 +2629,7 @@ static u32 mvpp2_txq_desc_csum(int l3_offs, __be16 l3_proto,
  * Per-thread access
  *
  * Called only from mvpp2_txq_done(), called from mvpp2_tx()
- * (migration disabled) and from the TX completion tasklet (migration
+ * (migration disabled) and from the TX completion BH work (migration
  * disabled) so using smp_processor_id() is OK.
  */
 static inline int mvpp2_txq_sent_desc_proc(struct mvpp2_port *port,
diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c
index 1b43704baceb..337a5350c754 100644
--- a/drivers/net/ethernet/marvell/skge.c
+++ b/drivers/net/ethernet/marvell/skge.c
@@ -3342,13 +3342,13 @@ static void skge_error_irq(struct skge_hw *hw)
 }
 
 /*
- * Interrupt from PHY are handled in tasklet (softirq)
+ * Interrupts from PHY are handled in BH work (softirq)
  * because accessing phy registers requires spin wait which might
  * cause excess interrupt latency.
  */
-static void skge_extirq(struct tasklet_struct *t)
+static void skge_extirq(struct work_struct *t)
 {
-	struct skge_hw *hw = from_tasklet(hw, t, phy_task);
+	struct skge_hw *hw = from_work(hw, t, phy_task);
 	int port;
 
 	for (port = 0; port < hw->ports; port++) {
@@ -3389,7 +3389,7 @@ static irqreturn_t skge_intr(int irq, void *dev_id)
 	status &= hw->intr_mask;
 	if (status & IS_EXT_REG) {
 		hw->intr_mask &= ~IS_EXT_REG;
-		tasklet_schedule(&hw->phy_task);
+		queue_work(system_bh_wq, &hw->phy_task);
 	}
 
 	if (status & (IS_XA1_F|IS_R1_F)) {
@@ -3937,7 +3937,7 @@ static int skge_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	hw->pdev = pdev;
 	spin_lock_init(&hw->hw_lock);
 	spin_lock_init(&hw->phy_lock);
-	tasklet_setup(&hw->phy_task, skge_extirq);
+	INIT_WORK(&hw->phy_task, skge_extirq);
 
 	hw->regs = ioremap(pci_resource_start(pdev, 0), 0x4000);
 	if (!hw->regs) {
@@ -4035,7 +4035,7 @@ static void skge_remove(struct pci_dev *pdev)
 	dev0 = hw->dev[0];
 	unregister_netdev(dev0);
 
-	tasklet_kill(&hw->phy_task);
+	cancel_work_sync(&hw->phy_task);
 
 	spin_lock_irq(&hw->hw_lock);
 	hw->intr_mask = 0;
diff --git a/drivers/net/ethernet/marvell/skge.h b/drivers/net/ethernet/marvell/skge.h
index f72217348eb4..0e7ce19c692e 100644
--- a/drivers/net/ethernet/marvell/skge.h
+++ b/drivers/net/ethernet/marvell/skge.h
@@ -5,6 +5,7 @@
 #ifndef _SKGE_H
 #define _SKGE_H
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 
 /* PCI config registers */
 #define PCI_DEV_REG1	0x40
@@ -2418,7 +2419,7 @@ struct skge_hw {
 	u32	     	     ram_offset;
 	u16		     phy_addr;
 	spinlock_t	     phy_lock;
-	struct tasklet_struct phy_task;
+	struct work_struct phy_task;
 
 	char		     irq_name[]; /* skge@pci:000:04:00.0 */
 };
diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c
index 7063c78bd35f..93b6f6fba933 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c
@@ -71,7 +71,7 @@ static void
 mtk_wed_wo_irq_enable(struct mtk_wed_wo *wo, u32 mask)
 {
 	mtk_wed_wo_set_isr_mask(wo, 0, mask, false);
-	tasklet_schedule(&wo->mmio.irq_tasklet);
+	queue_work(system_bh_wq, &wo->mmio.irq_work);
 }
 
 static void
@@ -227,14 +227,14 @@ mtk_wed_wo_irq_handler(int irq, void *data)
 	struct mtk_wed_wo *wo = data;
 
 	mtk_wed_wo_set_isr(wo, 0);
-	tasklet_schedule(&wo->mmio.irq_tasklet);
+	queue_work(system_bh_wq, &wo->mmio.irq_work);
 
 	return IRQ_HANDLED;
 }
 
-static void mtk_wed_wo_irq_tasklet(struct tasklet_struct *t)
+static void mtk_wed_wo_irq_work(struct work_struct *t)
 {
-	struct mtk_wed_wo *wo = from_tasklet(wo, t, mmio.irq_tasklet);
+	struct mtk_wed_wo *wo = from_work(wo, t, mmio.irq_work);
 	u32 intr, mask;
 
 	/* disable interrupts */
@@ -395,7 +395,7 @@ mtk_wed_wo_hardware_init(struct mtk_wed_wo *wo)
 	wo->mmio.irq = irq_of_parse_and_map(np, 0);
 	wo->mmio.irq_mask = MTK_WED_WO_ALL_INT_MASK;
 	spin_lock_init(&wo->mmio.lock);
-	tasklet_setup(&wo->mmio.irq_tasklet, mtk_wed_wo_irq_tasklet);
+	INIT_WORK(&wo->mmio.irq_work, mtk_wed_wo_irq_work);
 
 	ret = devm_request_irq(wo->hw->dev, wo->mmio.irq,
 			       mtk_wed_wo_irq_handler, IRQF_TRIGGER_HIGH,
@@ -449,7 +449,7 @@ mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
 	/* disable interrupts */
 	mtk_wed_wo_set_isr(wo, 0);
 
-	tasklet_disable(&wo->mmio.irq_tasklet);
+	disable_work_sync(&wo->mmio.irq_work);
 
 	disable_irq(wo->mmio.irq);
 	devm_free_irq(wo->hw->dev, wo->mmio.irq, wo);
diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.h b/drivers/net/ethernet/mediatek/mtk_wed_wo.h
index 87a67fa3868d..d8e4cf594317 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed_wo.h
+++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.h
@@ -6,6 +6,7 @@
 
 #include <linux/skbuff.h>
 #include <linux/netdevice.h>
+#include <linux/workqueue.h>
 
 struct mtk_wed_hw;
 
@@ -247,7 +248,7 @@ struct mtk_wed_wo {
 		struct regmap *regs;
 
 		spinlock_t lock;
-		struct tasklet_struct irq_tasklet;
+		struct work_struct irq_work;
 		int irq;
 		u32 irq_mask;
 	} mmio;
diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
index e130e7259275..0427fddd506a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -52,23 +52,23 @@
 #define MLX4_CQ_STATE_ARMED_SOL		( 6 <<  8)
 #define MLX4_EQ_STATE_FIRED		(10 <<  8)
 
-#define TASKLET_MAX_TIME 2
-#define TASKLET_MAX_TIME_JIFFIES msecs_to_jiffies(TASKLET_MAX_TIME)
+#define BH_WORK_MAX_TIME 2
+#define BH_WORK_MAX_TIME_JIFFIES msecs_to_jiffies(BH_WORK_MAX_TIME)
 
-void mlx4_cq_tasklet_cb(struct tasklet_struct *t)
+void mlx4_cq_work_cb(struct work_struct *t)
 {
 	unsigned long flags;
-	unsigned long end = jiffies + TASKLET_MAX_TIME_JIFFIES;
-	struct mlx4_eq_tasklet *ctx = from_tasklet(ctx, t, task);
+	unsigned long end = jiffies + BH_WORK_MAX_TIME_JIFFIES;
+	struct mlx4_eq_work *ctx = from_work(ctx, t, work);
 	struct mlx4_cq *mcq, *temp;
 
 	spin_lock_irqsave(&ctx->lock, flags);
 	list_splice_tail_init(&ctx->list, &ctx->process_list);
 	spin_unlock_irqrestore(&ctx->lock, flags);
 
-	list_for_each_entry_safe(mcq, temp, &ctx->process_list, tasklet_ctx.list) {
-		list_del_init(&mcq->tasklet_ctx.list);
-		mcq->tasklet_ctx.comp(mcq);
+	list_for_each_entry_safe(mcq, temp, &ctx->process_list, work_ctx.list) {
+		list_del_init(&mcq->work_ctx.list);
+		mcq->work_ctx.comp(mcq);
 		if (refcount_dec_and_test(&mcq->refcount))
 			complete(&mcq->free);
 		if (time_after(jiffies, end))
@@ -76,29 +76,29 @@ void mlx4_cq_tasklet_cb(struct tasklet_struct *t)
 	}
 
 	if (!list_empty(&ctx->process_list))
-		tasklet_schedule(&ctx->task);
+		queue_work(system_bh_wq, &ctx->work);
 }
 
-static void mlx4_add_cq_to_tasklet(struct mlx4_cq *cq)
+static void mlx4_add_cq_to_work(struct mlx4_cq *cq)
 {
-	struct mlx4_eq_tasklet *tasklet_ctx = cq->tasklet_ctx.priv;
+	struct mlx4_eq_work *work_ctx = cq->work_ctx.priv;
 	unsigned long flags;
 	bool kick;
 
-	spin_lock_irqsave(&tasklet_ctx->lock, flags);
+	spin_lock_irqsave(&work_ctx->lock, flags);
 	/* When migrating CQs between EQs will be implemented, please note
 	 * that you need to sync this point. It is possible that
 	 * while migrating a CQ, completions on the old EQs could
 	 * still arrive.
 	 */
-	if (list_empty_careful(&cq->tasklet_ctx.list)) {
+	if (list_empty_careful(&cq->work_ctx.list)) {
 		refcount_inc(&cq->refcount);
-		kick = list_empty(&tasklet_ctx->list);
-		list_add_tail(&cq->tasklet_ctx.list, &tasklet_ctx->list);
+		kick = list_empty(&work_ctx->list);
+		list_add_tail(&cq->work_ctx.list, &work_ctx->list);
 		if (kick)
-			tasklet_schedule(&tasklet_ctx->task);
+			queue_work(system_bh_wq, &work_ctx->work);
 	}
-	spin_unlock_irqrestore(&tasklet_ctx->lock, flags);
+	spin_unlock_irqrestore(&work_ctx->lock, flags);
 }
 
 void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
@@ -412,10 +412,10 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	cq->uar        = uar;
 	refcount_set(&cq->refcount, 1);
 	init_completion(&cq->free);
-	cq->comp = mlx4_add_cq_to_tasklet;
-	cq->tasklet_ctx.priv =
-		&priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].tasklet_ctx;
-	INIT_LIST_HEAD(&cq->tasklet_ctx.list);
+	cq->comp = mlx4_add_cq_to_work;
+	cq->work_ctx.priv =
+		&priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].work_ctx;
+	INIT_LIST_HEAD(&cq->work_ctx.list);
 
 
 	cq->irq = priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].irq;
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 9572a45f6143..ca67bb2ffc41 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1055,10 +1055,10 @@ static int mlx4_create_eq(struct mlx4_dev *dev, int nent,
 
 	eq->cons_index = 0;
 
-	INIT_LIST_HEAD(&eq->tasklet_ctx.list);
-	INIT_LIST_HEAD(&eq->tasklet_ctx.process_list);
-	spin_lock_init(&eq->tasklet_ctx.lock);
-	tasklet_setup(&eq->tasklet_ctx.task, mlx4_cq_tasklet_cb);
+	INIT_LIST_HEAD(&eq->work_ctx.list);
+	INIT_LIST_HEAD(&eq->work_ctx.process_list);
+	spin_lock_init(&eq->work_ctx.lock);
+	INIT_WORK(&eq->work_ctx.work, mlx4_cq_work_cb);
 
 	return err;
 
@@ -1101,7 +1101,7 @@ static void mlx4_free_eq(struct mlx4_dev *dev,
 		mlx4_warn(dev, "HW2SW_EQ failed (%d)\n", err);
 
 	synchronize_irq(eq->irq);
-	tasklet_disable(&eq->tasklet_ctx.task);
+	disable_work_sync(&eq->work_ctx.work);
 
 	mlx4_mtt_cleanup(dev, &eq->mtt);
 	for (i = 0; i < npages; ++i)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index d7d856d1758a..f0029f68b5d3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -54,6 +54,7 @@
 #include <linux/mlx4/driver.h>
 #include <linux/mlx4/doorbell.h>
 #include <linux/mlx4/cmd.h>
+#include <linux/workqueue.h>
 #include "fw_qos.h"
 
 #define DRV_NAME	"mlx4_core"
@@ -382,11 +383,11 @@ struct mlx4_srq_context {
 	__be64			db_rec_addr;
 };
 
-struct mlx4_eq_tasklet {
+struct mlx4_eq_work {
 	struct list_head list;
 	struct list_head process_list;
-	struct tasklet_struct task;
-	/* lock on completion tasklet list */
+	struct work_struct work;
+	/* lock on completion work list */
 	spinlock_t lock;
 };
 
@@ -400,7 +401,7 @@ struct mlx4_eq {
 	int			nent;
 	struct mlx4_buf_list   *page_list;
 	struct mlx4_mtt		mtt;
-	struct mlx4_eq_tasklet	tasklet_ctx;
+	struct mlx4_eq_work	work_ctx;
 	struct mlx4_active_ports actv_ports;
 	u32			ref_count;
 	cpumask_var_t		affinity_mask;
@@ -1228,7 +1229,7 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev);
 int mlx4_comm_cmd(struct mlx4_dev *dev, u8 cmd, u16 param,
 		  u16 op, unsigned long timeout);
 
-void mlx4_cq_tasklet_cb(struct tasklet_struct *t);
+void mlx4_cq_work_cb(struct work_struct *t);
 void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn);
 void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cq.c b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
index 4caa1b6f40ba..78ad929d5270 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
@@ -38,14 +38,14 @@
 #include "mlx5_core.h"
 #include "lib/eq.h"
 
-#define TASKLET_MAX_TIME 2
-#define TASKLET_MAX_TIME_JIFFIES msecs_to_jiffies(TASKLET_MAX_TIME)
+#define BH_WORK_MAX_TIME 2
+#define BH_WORK_MAX_TIME_JIFFIES msecs_to_jiffies(BH_WORK_MAX_TIME)
 
-void mlx5_cq_tasklet_cb(struct tasklet_struct *t)
+void mlx5_cq_work_cb(struct work_struct *t)
 {
 	unsigned long flags;
-	unsigned long end = jiffies + TASKLET_MAX_TIME_JIFFIES;
-	struct mlx5_eq_tasklet *ctx = from_tasklet(ctx, t, task);
+	unsigned long end = jiffies + BH_WORK_MAX_TIME_JIFFIES;
+	struct mlx5_eq_work *ctx = from_work(ctx, t, work);
 	struct mlx5_core_cq *mcq;
 	struct mlx5_core_cq *temp;
 
@@ -54,35 +54,35 @@ void mlx5_cq_tasklet_cb(struct tasklet_struct *t)
 	spin_unlock_irqrestore(&ctx->lock, flags);
 
 	list_for_each_entry_safe(mcq, temp, &ctx->process_list,
-				 tasklet_ctx.list) {
-		list_del_init(&mcq->tasklet_ctx.list);
-		mcq->tasklet_ctx.comp(mcq, NULL);
+				 work_ctx.list) {
+		list_del_init(&mcq->work_ctx.list);
+		mcq->work_ctx.comp(mcq, NULL);
 		mlx5_cq_put(mcq);
 		if (time_after(jiffies, end))
 			break;
 	}
 
 	if (!list_empty(&ctx->process_list))
-		tasklet_schedule(&ctx->task);
+		queue_work(system_bh_wq, &ctx->work);
 }
 
-static void mlx5_add_cq_to_tasklet(struct mlx5_core_cq *cq,
-				   struct mlx5_eqe *eqe)
+static void mlx5_add_cq_to_work(struct mlx5_core_cq *cq,
+				struct mlx5_eqe *eqe)
 {
 	unsigned long flags;
-	struct mlx5_eq_tasklet *tasklet_ctx = cq->tasklet_ctx.priv;
+	struct mlx5_eq_work *work_ctx = cq->work_ctx.priv;
 
-	spin_lock_irqsave(&tasklet_ctx->lock, flags);
+	spin_lock_irqsave(&work_ctx->lock, flags);
 	/* When migrating CQs between EQs will be implemented, please note
 	 * that you need to sync this point. It is possible that
 	 * while migrating a CQ, completions on the old EQs could
 	 * still arrive.
 	 */
-	if (list_empty_careful(&cq->tasklet_ctx.list)) {
+	if (list_empty_careful(&cq->work_ctx.list)) {
 		mlx5_cq_hold(cq);
-		list_add_tail(&cq->tasklet_ctx.list, &tasklet_ctx->list);
+		list_add_tail(&cq->work_ctx.list, &work_ctx->list);
 	}
-	spin_unlock_irqrestore(&tasklet_ctx->lock, flags);
+	spin_unlock_irqrestore(&work_ctx->lock, flags);
 }
 
 /* Callers must verify outbox status in case of err */
@@ -113,10 +113,10 @@ int mlx5_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
 	refcount_set(&cq->refcount, 1);
 	init_completion(&cq->free);
 	if (!cq->comp)
-		cq->comp = mlx5_add_cq_to_tasklet;
+		cq->comp = mlx5_add_cq_to_work;
 	/* assuming CQ will be deleted before the EQ */
-	cq->tasklet_ctx.priv = &eq->tasklet_ctx;
-	INIT_LIST_HEAD(&cq->tasklet_ctx.list);
+	cq->work_ctx.priv = &eq->work_ctx;
+	INIT_LIST_HEAD(&cq->work_ctx.list);
 
 	/* Add to comp EQ CQ tree to recv comp events */
 	err = mlx5_eq_add_cq(&eq->core, cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 40a6cb052a2d..f5bb666f609a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -148,7 +148,7 @@ static int mlx5_eq_comp_int(struct notifier_block *nb,
 	eq_update_ci(eq, 1);
 
 	if (cqn != -1)
-		tasklet_schedule(&eq_comp->tasklet_ctx.task);
+		queue_work(system_bh_wq, &eq_comp->work_ctx.work);
 
 	return 0;
 }
@@ -979,7 +979,7 @@ static void destroy_comp_eq(struct mlx5_core_dev *dev, struct mlx5_eq_comp *eq,
 	if (destroy_unmap_eq(dev, &eq->core))
 		mlx5_core_warn(dev, "failed to destroy comp EQ 0x%x\n",
 			       eq->core.eqn);
-	tasklet_disable(&eq->tasklet_ctx.task);
+	disable_work_sync(&eq->work_ctx.work);
 	kfree(eq);
 	comp_irq_release(dev, vecidx);
 	table->curr_comp_eqs--;
@@ -1029,10 +1029,10 @@ static int create_comp_eq(struct mlx5_core_dev *dev, u16 vecidx)
 		goto clean_irq;
 	}
 
-	INIT_LIST_HEAD(&eq->tasklet_ctx.list);
-	INIT_LIST_HEAD(&eq->tasklet_ctx.process_list);
-	spin_lock_init(&eq->tasklet_ctx.lock);
-	tasklet_setup(&eq->tasklet_ctx.task, mlx5_cq_tasklet_cb);
+	INIT_LIST_HEAD(&eq->work_ctx.list);
+	INIT_LIST_HEAD(&eq->work_ctx.process_list);
+	spin_lock_init(&eq->work_ctx.lock);
+	INIT_WORK(&eq->work_ctx.work, mlx5_cq_work_cb);
 
 	irq = xa_load(&table->comp_irqs, vecidx);
 	eq->irq_nb.notifier_call = mlx5_eq_comp_int;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
index c4de6bf8d1b6..fe3edfac2b70 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
@@ -34,6 +34,7 @@
 #include <net/addrconf.h>
 #include <linux/etherdevice.h>
 #include <linux/mlx5/vport.h>
+#include <linux/workqueue.h>
 
 #include "mlx5_core.h"
 #include "lib/mlx5.h"
@@ -378,7 +379,7 @@ static inline void mlx5_fpga_conn_cqes(struct mlx5_fpga_conn *conn,
 		mlx5_cqwq_update_db_record(&conn->cq.wq);
 	}
 	if (!budget) {
-		tasklet_schedule(&conn->cq.tasklet);
+		queue_work(system_bh_wq, &conn->cq.work);
 		return;
 	}
 
@@ -388,9 +389,9 @@ static inline void mlx5_fpga_conn_cqes(struct mlx5_fpga_conn *conn,
 	mlx5_fpga_conn_arm_cq(conn);
 }
 
-static void mlx5_fpga_conn_cq_tasklet(struct tasklet_struct *t)
+static void mlx5_fpga_conn_cq_work(struct work_struct *t)
 {
-	struct mlx5_fpga_conn *conn = from_tasklet(conn, t, cq.tasklet);
+	struct mlx5_fpga_conn *conn = from_work(conn, t, cq.work);
 
 	if (unlikely(!conn->qp.active))
 		return;
@@ -476,7 +477,7 @@ static int mlx5_fpga_conn_create_cq(struct mlx5_fpga_conn *conn, int cq_size)
 	conn->cq.mcq.vector     = 0;
 	conn->cq.mcq.comp       = mlx5_fpga_conn_cq_complete;
 	conn->cq.mcq.uar        = fdev->conn_res.uar;
-	tasklet_setup(&conn->cq.tasklet, mlx5_fpga_conn_cq_tasklet);
+	INIT_WORK(&conn->cq.work, mlx5_fpga_conn_cq_work);
 
 	mlx5_fpga_dbg(fdev, "Created CQ #0x%x\n", conn->cq.mcq.cqn);
 
@@ -490,8 +491,8 @@ static int mlx5_fpga_conn_create_cq(struct mlx5_fpga_conn *conn, int cq_size)
 
 static void mlx5_fpga_conn_destroy_cq(struct mlx5_fpga_conn *conn)
 {
-	tasklet_disable(&conn->cq.tasklet);
-	tasklet_kill(&conn->cq.tasklet);
+	disable_work_sync(&conn->cq.work);
+	cancel_work_sync(&conn->cq.work);
 	mlx5_core_destroy_cq(conn->fdev->mdev, &conn->cq.mcq);
 	mlx5_wq_destroy(&conn->cq.wq_ctrl);
 }
@@ -933,7 +934,7 @@ struct mlx5_fpga_conn *mlx5_fpga_conn_create(struct mlx5_fpga_device *fdev,
 void mlx5_fpga_conn_destroy(struct mlx5_fpga_conn *conn)
 {
 	conn->qp.active = false;
-	tasklet_disable(&conn->cq.tasklet);
+	disable_work_sync(&conn->cq.work);
 	synchronize_irq(conn->cq.mcq.irqn);
 
 	mlx5_fpga_destroy_qp(conn->fdev->mdev, conn->fpga_qpn);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.h b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.h
index 5116e869a6e4..cb76cd681a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.h
@@ -36,6 +36,7 @@
 
 #include <linux/mlx5/cq.h>
 #include <linux/mlx5/qp.h>
+#include <linux/workqueue.h>
 
 #include "fpga/core.h"
 #include "fpga/sdk.h"
@@ -56,7 +57,7 @@ struct mlx5_fpga_conn {
 		struct mlx5_cqwq wq;
 		struct mlx5_wq_ctrl wq_ctrl;
 		struct mlx5_core_cq mcq;
-		struct tasklet_struct tasklet;
+		struct work_struct work;
 	} cq;
 
 	/* QP */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
index 4b7f7131c560..1ec59452b784 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
@@ -6,14 +6,15 @@
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/eq.h>
 #include <linux/mlx5/cq.h>
+#include <linux/workqueue.h>
 
 #define MLX5_EQE_SIZE       (sizeof(struct mlx5_eqe))
 
-struct mlx5_eq_tasklet {
+struct mlx5_eq_work {
 	struct list_head      list;
 	struct list_head      process_list;
-	struct tasklet_struct task;
-	spinlock_t            lock; /* lock completion tasklet list */
+	struct work_struct    work;
+	spinlock_t            lock; /* lock completion work list */
 };
 
 struct mlx5_cq_table {
@@ -44,7 +45,7 @@ struct mlx5_eq_async {
 struct mlx5_eq_comp {
 	struct mlx5_eq          core;
 	struct notifier_block   irq_nb;
-	struct mlx5_eq_tasklet  tasklet_ctx;
+	struct mlx5_eq_work     work_ctx;
 	struct list_head        list;
 };
 
@@ -84,7 +85,7 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq);
 void mlx5_eq_del_cq(struct mlx5_eq *eq, struct mlx5_core_cq *cq);
 struct mlx5_eq_comp *mlx5_eqn2comp_eq(struct mlx5_core_dev *dev, int eqn);
 struct mlx5_eq *mlx5_get_async_eq(struct mlx5_core_dev *dev);
-void mlx5_cq_tasklet_cb(struct tasklet_struct *t);
+void mlx5_cq_work_cb(struct work_struct *t);
 
 u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq);
 void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index af99bf17eb36..3e652a3c3bfa 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -14,6 +14,7 @@
 #include <linux/if_vlan.h>
 #include <linux/log2.h>
 #include <linux/string.h>
+#include <linux/workqueue.h>
 
 #include "pci_hw.h"
 #include "pci.h"
@@ -78,7 +79,7 @@ struct mlxsw_pci_queue {
 	u8 num; /* queue number */
 	u8 elem_size; /* size of one element */
 	enum mlxsw_pci_queue_type type;
-	struct tasklet_struct tasklet; /* queue processing tasklet */
+	struct work_struct work; /* queue processing work */
 	struct mlxsw_pci *pci;
 	union {
 		struct {
@@ -135,9 +136,9 @@ struct mlxsw_pci {
 	bool skip_reset;
 };
 
-static void mlxsw_pci_queue_tasklet_schedule(struct mlxsw_pci_queue *q)
+static void mlxsw_pci_queue_work(struct mlxsw_pci_queue *q)
 {
-	tasklet_schedule(&q->tasklet);
+	queue_work(system_bh_wq, &q->work);
 }
 
 static char *__mlxsw_pci_queue_elem_get(struct mlxsw_pci_queue *q,
@@ -714,9 +715,9 @@ static char *mlxsw_pci_cq_sw_cqe_get(struct mlxsw_pci_queue *q)
 	return elem;
 }
 
-static void mlxsw_pci_cq_tasklet(struct tasklet_struct *t)
+static void mlxsw_pci_cq_work(struct work_struct *t)
 {
-	struct mlxsw_pci_queue *q = from_tasklet(q, t, tasklet);
+	struct mlxsw_pci_queue *q = from_work(q, t, work);
 	struct mlxsw_pci *mlxsw_pci = q->pci;
 	char *cqe;
 	int items = 0;
@@ -827,9 +828,9 @@ static char *mlxsw_pci_eq_sw_eqe_get(struct mlxsw_pci_queue *q)
 	return elem;
 }
 
-static void mlxsw_pci_eq_tasklet(struct tasklet_struct *t)
+static void mlxsw_pci_eq_work(struct work_struct *t)
 {
-	struct mlxsw_pci_queue *q = from_tasklet(q, t, tasklet);
+	struct mlxsw_pci_queue *q = from_work(q, t, work);
 	struct mlxsw_pci *mlxsw_pci = q->pci;
 	u8 cq_count = mlxsw_pci_cq_count(mlxsw_pci);
 	unsigned long active_cqns[BITS_TO_LONGS(MLXSW_PCI_CQS_MAX)];
@@ -873,7 +874,7 @@ static void mlxsw_pci_eq_tasklet(struct tasklet_struct *t)
 		return;
 	for_each_set_bit(cqn, active_cqns, cq_count) {
 		q = mlxsw_pci_cq_get(mlxsw_pci, cqn);
-		mlxsw_pci_queue_tasklet_schedule(q);
+		mlxsw_pci_queue_work(q);
 	}
 }
 
@@ -886,7 +887,7 @@ struct mlxsw_pci_queue_ops {
 		    struct mlxsw_pci_queue *q);
 	void (*fini)(struct mlxsw_pci *mlxsw_pci,
 		     struct mlxsw_pci_queue *q);
-	void (*tasklet)(struct tasklet_struct *t);
+	void (*work)(struct work_struct *t);
 	u16 (*elem_count_f)(const struct mlxsw_pci_queue *q);
 	u8 (*elem_size_f)(const struct mlxsw_pci_queue *q);
 	u16 elem_count;
@@ -914,7 +915,7 @@ static const struct mlxsw_pci_queue_ops mlxsw_pci_cq_ops = {
 	.pre_init	= mlxsw_pci_cq_pre_init,
 	.init		= mlxsw_pci_cq_init,
 	.fini		= mlxsw_pci_cq_fini,
-	.tasklet	= mlxsw_pci_cq_tasklet,
+	.work		= mlxsw_pci_cq_work,
 	.elem_count_f	= mlxsw_pci_cq_elem_count,
 	.elem_size_f	= mlxsw_pci_cq_elem_size
 };
@@ -923,7 +924,7 @@ static const struct mlxsw_pci_queue_ops mlxsw_pci_eq_ops = {
 	.type		= MLXSW_PCI_QUEUE_TYPE_EQ,
 	.init		= mlxsw_pci_eq_init,
 	.fini		= mlxsw_pci_eq_fini,
-	.tasklet	= mlxsw_pci_eq_tasklet,
+	.work		= mlxsw_pci_eq_work,
 	.elem_count	= MLXSW_PCI_EQE_COUNT,
 	.elem_size	= MLXSW_PCI_EQE_SIZE
 };
@@ -948,8 +949,8 @@ static int mlxsw_pci_queue_init(struct mlxsw_pci *mlxsw_pci, char *mbox,
 	q->type = q_ops->type;
 	q->pci = mlxsw_pci;
 
-	if (q_ops->tasklet)
-		tasklet_setup(&q->tasklet, q_ops->tasklet);
+	if (q_ops->work)
+		INIT_WORK(&q->work, q_ops->work);
 
 	mem_item->size = MLXSW_PCI_AQ_SIZE;
 	mem_item->buf = dma_alloc_coherent(&mlxsw_pci->pdev->dev,
@@ -1436,7 +1437,7 @@ static irqreturn_t mlxsw_pci_eq_irq_handler(int irq, void *dev_id)
 
 	for (i = 0; i < MLXSW_PCI_EQS_COUNT; i++) {
 		q = mlxsw_pci_eq_get(mlxsw_pci, i);
-		mlxsw_pci_queue_tasklet_schedule(q);
+		mlxsw_pci_queue_work(q);
 	}
 	return IRQ_HANDLED;
 }
diff --git a/drivers/net/ethernet/micrel/ks8842.c b/drivers/net/ethernet/micrel/ks8842.c
index ddd87ef71caf..c88209d8f569 100644
--- a/drivers/net/ethernet/micrel/ks8842.c
+++ b/drivers/net/ethernet/micrel/ks8842.c
@@ -22,6 +22,7 @@
 #include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
 #include <linux/scatterlist.h>
+#include <linux/workqueue.h>
 
 #define DRV_NAME "ks8842"
 
@@ -140,7 +141,7 @@ struct ks8842_rx_dma_ctl {
 	struct dma_async_tx_descriptor *adesc;
 	struct sk_buff  *skb;
 	struct scatterlist sg;
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 	int channel;
 };
 
@@ -151,7 +152,7 @@ struct ks8842_adapter {
 	void __iomem	*hw_addr;
 	int		irq;
 	unsigned long	conf_flags;	/* copy of platform_device config */
-	struct tasklet_struct	tasklet;
+	struct work_struct	work;
 	spinlock_t	lock; /* spinlock to be interrupt safe */
 	struct work_struct timeout_work;
 	struct net_device *netdev;
@@ -589,9 +590,9 @@ static int __ks8842_start_new_rx_dma(struct net_device *netdev)
 	return err;
 }
 
-static void ks8842_rx_frame_dma_tasklet(struct tasklet_struct *t)
+static void ks8842_rx_frame_dma_work(struct work_struct *t)
 {
-	struct ks8842_adapter *adapter = from_tasklet(adapter, t, dma_rx.tasklet);
+	struct ks8842_adapter *adapter = from_work(adapter, t, dma_rx.work);
 	struct net_device *netdev = adapter->netdev;
 	struct ks8842_rx_dma_ctl *ctl = &adapter->dma_rx;
 	struct sk_buff *skb = ctl->skb;
@@ -722,9 +723,9 @@ static void ks8842_handle_rx_overrun(struct net_device *netdev,
 	netdev->stats.rx_fifo_errors++;
 }
 
-static void ks8842_tasklet(struct tasklet_struct *t)
+static void ks8842_work(struct work_struct *t)
 {
-	struct ks8842_adapter *adapter = from_tasklet(adapter, t, tasklet);
+	struct ks8842_adapter *adapter = from_work(adapter, t, work);
 	struct net_device *netdev = adapter->netdev;
 	u16 isr;
 	unsigned long flags;
@@ -813,8 +814,8 @@ static irqreturn_t ks8842_irq(int irq, void *devid)
 			/* disable IRQ */
 			ks8842_write16(adapter, 18, 0x00, REG_IER);
 
-		/* schedule tasklet */
-		tasklet_schedule(&adapter->tasklet);
+		/* schedule work */
+		queue_work(system_bh_wq, &adapter->work);
 
 		ret = IRQ_HANDLED;
 	}
@@ -835,9 +836,9 @@ static void ks8842_dma_rx_cb(void *data)
 	struct ks8842_adapter	*adapter = netdev_priv(netdev);
 
 	netdev_dbg(netdev, "RX DMA finished\n");
-	/* schedule tasklet */
+	/* schedule work */
 	if (adapter->dma_rx.adesc)
-		tasklet_schedule(&adapter->dma_rx.tasklet);
+		queue_work(system_bh_wq, &adapter->dma_rx.work);
 }
 
 static void ks8842_dma_tx_cb(void *data)
@@ -895,7 +896,7 @@ static void ks8842_dealloc_dma_bufs(struct ks8842_adapter *adapter)
 		dma_release_channel(rx_ctl->chan);
 	rx_ctl->chan = NULL;
 
-	tasklet_kill(&rx_ctl->tasklet);
+	cancel_work_sync(&rx_ctl->work);
 
 	if (sg_dma_address(&tx_ctl->sg))
 		dma_unmap_single(adapter->dev, sg_dma_address(&tx_ctl->sg),
@@ -955,7 +956,7 @@ static int ks8842_alloc_dma_bufs(struct net_device *netdev)
 		goto err;
 	}
 
-	tasklet_setup(&rx_ctl->tasklet, ks8842_rx_frame_dma_tasklet);
+	INIT_WORK(&rx_ctl->work, ks8842_rx_frame_dma_work);
 
 	return 0;
 err:
@@ -1178,7 +1179,7 @@ static int ks8842_probe(struct platform_device *pdev)
 		adapter->dma_tx.channel = -1;
 	}
 
-	tasklet_setup(&adapter->tasklet, ks8842_tasklet);
+	INIT_WORK(&adapter->work, ks8842_work);
 	spin_lock_init(&adapter->lock);
 
 	netdev->netdev_ops = &ks8842_netdev_ops;
@@ -1235,7 +1236,7 @@ static void ks8842_remove(struct platform_device *pdev)
 	struct resource *iomem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 
 	unregister_netdev(netdev);
-	tasklet_kill(&adapter->tasklet);
+	cancel_work_sync(&adapter->work);
 	iounmap(adapter->hw_addr);
 	free_netdev(netdev);
 	release_mem_region(iomem->start, resource_size(iomem));
diff --git a/drivers/net/ethernet/micrel/ksz884x.c b/drivers/net/ethernet/micrel/ksz884x.c
index c5aeeb964c17..978d220ea6ec 100644
--- a/drivers/net/ethernet/micrel/ksz884x.c
+++ b/drivers/net/ethernet/micrel/ksz884x.c
@@ -26,6 +26,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/micrel_phy.h>
+#include <linux/workqueue.h>
 
 
 /* DMA Registers */
@@ -1339,8 +1340,8 @@ struct ksz_counter_info {
  * @mtu:		Current MTU used.  The default is REGULAR_RX_BUF_SIZE;
  * 			the maximum is MAX_RX_BUF_SIZE.
  * @opened:		Counter to keep track of device open.
- * @rx_tasklet:		Receive processing tasklet.
- * @tx_tasklet:		Transmit processing tasklet.
+ * @rx_work:		Receive processing work.
+ * @tx_work:		Transmit processing work.
  * @wol_enable:		Wake-on-LAN enable set by ethtool.
  * @wol_support:	Wake-on-LAN support used by ethtool.
  * @pme_wait:		Used for KSZ8841 power management.
@@ -1368,8 +1369,8 @@ struct dev_info {
 	int mtu;
 	int opened;
 
-	struct tasklet_struct rx_tasklet;
-	struct tasklet_struct tx_tasklet;
+	struct work_struct rx_work;
+	struct work_struct tx_work;
 
 	int wol_enable;
 	int wol_support;
@@ -4792,9 +4793,9 @@ static int dev_rcv_special(struct dev_info *hw_priv)
 	return received;
 }
 
-static void rx_proc_task(struct tasklet_struct *t)
+static void rx_proc_task(struct work_struct *t)
 {
-	struct dev_info *hw_priv = from_tasklet(hw_priv, t, rx_tasklet);
+	struct dev_info *hw_priv = from_work(hw_priv, t, rx_work);
 	struct ksz_hw *hw = &hw_priv->hw;
 
 	if (!hw->enabled)
@@ -4804,26 +4805,26 @@ static void rx_proc_task(struct tasklet_struct *t)
 		/* In case receive process is suspended because of overrun. */
 		hw_resume_rx(hw);
 
-		/* tasklets are interruptible. */
+		/* BH work items are interruptible. */
 		spin_lock_irq(&hw_priv->hwlock);
 		hw_turn_on_intr(hw, KS884X_INT_RX_MASK);
 		spin_unlock_irq(&hw_priv->hwlock);
 	} else {
 		hw_ack_intr(hw, KS884X_INT_RX);
-		tasklet_schedule(&hw_priv->rx_tasklet);
+		queue_work(system_bh_wq, &hw_priv->rx_work);
 	}
 }
 
-static void tx_proc_task(struct tasklet_struct *t)
+static void tx_proc_task(struct work_struct *t)
 {
-	struct dev_info *hw_priv = from_tasklet(hw_priv, t, tx_tasklet);
+	struct dev_info *hw_priv = from_work(hw_priv, t, tx_work);
 	struct ksz_hw *hw = &hw_priv->hw;
 
 	hw_ack_intr(hw, KS884X_INT_TX_MASK);
 
 	tx_done(hw_priv);
 
-	/* tasklets are interruptible. */
+	/* BH work items are interruptible. */
 	spin_lock_irq(&hw_priv->hwlock);
 	hw_turn_on_intr(hw, KS884X_INT_TX);
 	spin_unlock_irq(&hw_priv->hwlock);
@@ -4879,12 +4880,12 @@ static irqreturn_t netdev_intr(int irq, void *dev_id)
 
 		if (unlikely(int_enable & KS884X_INT_TX_MASK)) {
 			hw_dis_intr_bit(hw, KS884X_INT_TX_MASK);
-			tasklet_schedule(&hw_priv->tx_tasklet);
+			queue_work(system_bh_wq, &hw_priv->tx_work);
 		}
 
 		if (likely(int_enable & KS884X_INT_RX)) {
 			hw_dis_intr_bit(hw, KS884X_INT_RX);
-			tasklet_schedule(&hw_priv->rx_tasklet);
+			queue_work(system_bh_wq, &hw_priv->rx_work);
 		}
 
 		if (unlikely(int_enable & KS884X_INT_RX_OVERRUN)) {
@@ -5013,11 +5014,11 @@ static int netdev_close(struct net_device *dev)
 		hw_disable(hw);
 		hw_clr_multicast(hw);
 
-		/* Delay for receive task to stop scheduling itself. */
+		/* Delay for receive work to stop scheduling itself. */
 		msleep(2000 / HZ);
 
-		tasklet_kill(&hw_priv->rx_tasklet);
-		tasklet_kill(&hw_priv->tx_tasklet);
+		cancel_work_sync(&hw_priv->rx_work);
+		cancel_work_sync(&hw_priv->tx_work);
 		free_irq(dev->irq, hw_priv->dev);
 
 		transmit_cleanup(hw_priv, 0);
@@ -5068,8 +5069,8 @@ static int prepare_hardware(struct net_device *dev)
 	rc = request_irq(dev->irq, netdev_intr, IRQF_SHARED, dev->name, dev);
 	if (rc)
 		return rc;
-	tasklet_setup(&hw_priv->rx_tasklet, rx_proc_task);
-	tasklet_setup(&hw_priv->tx_tasklet, tx_proc_task);
+	INIT_WORK(&hw_priv->rx_work, rx_proc_task);
+	INIT_WORK(&hw_priv->tx_work, tx_proc_task);
 
 	hw->promiscuous = 0;
 	hw->all_multi = 0;
diff --git a/drivers/net/ethernet/microchip/lan743x_ptp.c b/drivers/net/ethernet/microchip/lan743x_ptp.c
index 2801f08bf1c9..a8d2be7d2b65 100644
--- a/drivers/net/ethernet/microchip/lan743x_ptp.c
+++ b/drivers/net/ethernet/microchip/lan743x_ptp.c
@@ -1380,7 +1380,7 @@ void lan743x_ptp_isr(void *context)
 
 	if (ptp_int_sts & PTP_INT_BIT_TX_TS_) {
 		ptp_schedule_worker(ptp->ptp_clock, 0);
-		enable_flag = 0;/* tasklet will re-enable later */
+		enable_flag = 0;/* BH work will re-enable later */
 	}
 	if (ptp_int_sts & PTP_INT_BIT_TX_SWTS_ERR_) {
 		netif_err(adapter, drv, adapter->netdev,
diff --git a/drivers/net/ethernet/natsemi/ns83820.c b/drivers/net/ethernet/natsemi/ns83820.c
index 998586872599..8e7e723706e8 100644
--- a/drivers/net/ethernet/natsemi/ns83820.c
+++ b/drivers/net/ethernet/natsemi/ns83820.c
@@ -415,7 +415,7 @@ struct ns83820 {
 	struct net_device	*ndev;
 
 	struct rx_info		rx_info;
-	struct tasklet_struct	rx_tasklet;
+	struct work_struct	rx_work;
 
 	unsigned		ihr;
 	struct work_struct	tq_refill;
@@ -925,9 +925,9 @@ static void rx_irq(struct net_device *ndev)
 	spin_unlock_irqrestore(&info->lock, flags);
 }
 
-static void rx_action(struct tasklet_struct *t)
+static void rx_action(struct work_struct *t)
 {
-	struct ns83820 *dev = from_tasklet(dev, t, rx_tasklet);
+	struct ns83820 *dev = from_work(dev, t, rx_work);
 	struct net_device *ndev = dev->ndev;
 	rx_irq(ndev);
 	writel(ihr, dev->base + IHR);
@@ -1426,7 +1426,7 @@ static void ns83820_do_isr(struct net_device *ndev, u32 isr)
 		writel(dev->IMR_cache, dev->base + IMR);
 		spin_unlock_irqrestore(&dev->misc_lock, flags);
 
-		tasklet_schedule(&dev->rx_tasklet);
+		queue_work(system_bh_wq, &dev->rx_work);
 		//rx_irq(ndev);
 		//writel(4, dev->base + IHR);
 	}
@@ -1929,7 +1929,7 @@ static int ns83820_init_one(struct pci_dev *pci_dev,
 	SET_NETDEV_DEV(ndev, &pci_dev->dev);
 
 	INIT_WORK(&dev->tq_refill, queue_refill);
-	tasklet_setup(&dev->rx_tasklet, rx_action);
+	INIT_WORK(&dev->rx_work, rx_action);
 
 	err = pci_enable_device(pci_dev);
 	if (err) {
diff --git a/drivers/net/ethernet/netronome/nfp/nfd3/dp.c b/drivers/net/ethernet/netronome/nfp/nfd3/dp.c
index d215efc6cad0..75ed7587b401 100644
--- a/drivers/net/ethernet/netronome/nfp/nfd3/dp.c
+++ b/drivers/net/ethernet/netronome/nfp/nfd3/dp.c
@@ -4,6 +4,7 @@
 #include <linux/bpf_trace.h>
 #include <linux/netdevice.h>
 #include <linux/bitfield.h>
+#include <linux/workqueue.h>
 #include <net/xfrm.h>
 
 #include "../nfp_app.h"
@@ -1402,9 +1403,9 @@ static bool nfp_ctrl_rx(struct nfp_net_r_vector *r_vec)
 	return budget;
 }
 
-void nfp_nfd3_ctrl_poll(struct tasklet_struct *t)
+void nfp_nfd3_ctrl_poll(struct work_struct *t)
 {
-	struct nfp_net_r_vector *r_vec = from_tasklet(r_vec, t, tasklet);
+	struct nfp_net_r_vector *r_vec = from_work(r_vec, t, work);
 
 	spin_lock(&r_vec->lock);
 	nfp_nfd3_tx_complete(r_vec->tx_ring, 0);
@@ -1414,7 +1415,7 @@ void nfp_nfd3_ctrl_poll(struct tasklet_struct *t)
 	if (nfp_ctrl_rx(r_vec)) {
 		nfp_net_irq_unmask(r_vec->nfp_net, r_vec->irq_entry);
 	} else {
-		tasklet_schedule(&r_vec->tasklet);
+		queue_work(system_bh_wq, &r_vec->work);
 		nn_dp_warn(&r_vec->nfp_net->dp,
 			   "control message budget exceeded!\n");
 	}
diff --git a/drivers/net/ethernet/netronome/nfp/nfd3/nfd3.h b/drivers/net/ethernet/netronome/nfp/nfd3/nfd3.h
index 9c1c10dcbaee..972593f54ecd 100644
--- a/drivers/net/ethernet/netronome/nfp/nfd3/nfd3.h
+++ b/drivers/net/ethernet/netronome/nfp/nfd3/nfd3.h
@@ -97,7 +97,7 @@ netdev_tx_t nfp_nfd3_tx(struct sk_buff *skb, struct net_device *netdev);
 bool
 nfp_nfd3_ctrl_tx_one(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
 		     struct sk_buff *skb, bool old);
-void nfp_nfd3_ctrl_poll(struct tasklet_struct *t);
+void nfp_nfd3_ctrl_poll(struct work_struct *t);
 void nfp_nfd3_rx_ring_fill_freelist(struct nfp_net_dp *dp,
 				    struct nfp_net_rx_ring *rx_ring);
 void nfp_nfd3_xsk_tx_free(struct nfp_nfd3_tx_buf *txbuf);
diff --git a/drivers/net/ethernet/netronome/nfp/nfdk/dp.c b/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
index dae5af7d1845..9eb6b6e8e2f0 100644
--- a/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
+++ b/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
@@ -1564,9 +1564,9 @@ static bool nfp_ctrl_rx(struct nfp_net_r_vector *r_vec)
 	return budget;
 }
 
-void nfp_nfdk_ctrl_poll(struct tasklet_struct *t)
+void nfp_nfdk_ctrl_poll(struct work_struct *t)
 {
-	struct nfp_net_r_vector *r_vec = from_tasklet(r_vec, t, tasklet);
+	struct nfp_net_r_vector *r_vec = from_work(r_vec, t, work);
 
 	spin_lock(&r_vec->lock);
 	nfp_nfdk_tx_complete(r_vec->tx_ring, 0);
@@ -1576,7 +1576,7 @@ void nfp_nfdk_ctrl_poll(struct tasklet_struct *t)
 	if (nfp_ctrl_rx(r_vec)) {
 		nfp_net_irq_unmask(r_vec->nfp_net, r_vec->irq_entry);
 	} else {
-		tasklet_schedule(&r_vec->tasklet);
+		queue_work(system_bh_wq, &r_vec->work);
 		nn_dp_warn(&r_vec->nfp_net->dp,
 			   "control message budget exceeded!\n");
 	}
diff --git a/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h b/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h
index fe55980348e9..d9eef0b11746 100644
--- a/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h
+++ b/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h
@@ -6,6 +6,7 @@
 
 #include <linux/bitops.h>
 #include <linux/types.h>
+#include <linux/workqueue.h>
 
 #define NFDK_TX_DESC_PER_SIMPLE_PKT	2
 
@@ -122,7 +123,7 @@ netdev_tx_t nfp_nfdk_tx(struct sk_buff *skb, struct net_device *netdev);
 bool
 nfp_nfdk_ctrl_tx_one(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
 		     struct sk_buff *skb, bool old);
-void nfp_nfdk_ctrl_poll(struct tasklet_struct *t);
+void nfp_nfdk_ctrl_poll(struct work_struct *t);
 void nfp_nfdk_rx_ring_fill_freelist(struct nfp_net_dp *dp,
 				    struct nfp_net_rx_ring *rx_ring);
 #ifndef CONFIG_NFP_NET_IPSEC
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 46764aeccb37..8ab22ecd7813 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -339,7 +339,7 @@ struct nfp_net_rx_ring {
  * struct nfp_net_r_vector - Per ring interrupt vector configuration
  * @nfp_net:        Backpointer to nfp_net structure
  * @napi:           NAPI structure for this ring vec
- * @tasklet:        ctrl vNIC, tasklet for servicing the r_vec
+ * @work:           ctrl vNIC, work for servicing the r_vec
  * @queue:          ctrl vNIC, send queue
  * @lock:           ctrl vNIC, r_vec lock protects @queue
  * @tx_ring:        Pointer to TX ring
@@ -389,7 +389,7 @@ struct nfp_net_r_vector {
 	union {
 		struct napi_struct napi;
 		struct {
-			struct tasklet_struct tasklet;
+			struct work_struct work;
 			struct sk_buff_head queue;
 			spinlock_t lock;
 		};
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index f28e769e6fda..4f852613d50e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -463,7 +463,7 @@ static irqreturn_t nfp_ctrl_irq_rxtx(int irq, void *data)
 {
 	struct nfp_net_r_vector *r_vec = data;
 
-	tasklet_schedule(&r_vec->tasklet);
+	queue_work(system_bh_wq, &r_vec->work);
 
 	return IRQ_HANDLED;
 }
@@ -761,8 +761,8 @@ static void nfp_net_vecs_init(struct nfp_net *nn)
 
 			__skb_queue_head_init(&r_vec->queue);
 			spin_lock_init(&r_vec->lock);
-			tasklet_setup(&r_vec->tasklet, nn->dp.ops->ctrl_poll);
-			tasklet_disable(&r_vec->tasklet);
+			INIT_WORK(&r_vec->work, nn->dp.ops->ctrl_poll);
+			disable_work_sync(&r_vec->work);
 		}
 
 		cpumask_set_cpu(cpumask_local_spread(r, numa_node), &r_vec->affinity_mask);
@@ -776,7 +776,7 @@ nfp_net_napi_add(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec, int idx)
 		netif_napi_add(dp->netdev, &r_vec->napi,
 			       nfp_net_has_xsk_pool_slow(dp, idx) ? dp->ops->xsk_poll : dp->ops->poll);
 	else
-		tasklet_enable(&r_vec->tasklet);
+		enable_and_queue_work(system_bh_wq, &r_vec->work);
 }
 
 static void
@@ -785,7 +785,7 @@ nfp_net_napi_del(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec)
 	if (dp->netdev)
 		netif_napi_del(&r_vec->napi);
 	else
-		tasklet_disable(&r_vec->tasklet);
+		disable_work_sync(&r_vec->work);
 }
 
 static void
@@ -1148,7 +1148,7 @@ void nfp_ctrl_close(struct nfp_net *nn)
 
 	for (r = 0; r < nn->dp.num_r_vecs; r++) {
 		disable_irq(nn->r_vecs[r].irq_vector);
-		tasklet_disable(&nn->r_vecs[r].tasklet);
+		disable_work_sync(&nn->r_vecs[r].work);
 	}
 
 	nfp_net_clear_config_and_disable(nn);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_dp.h b/drivers/net/ethernet/netronome/nfp/nfp_net_dp.h
index 831c83ce0d3d..39dd7b00a3bb 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_dp.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_dp.h
@@ -122,7 +122,7 @@ enum nfp_nfd_version {
  * @dma_mask:			DMA addressing capability
  * @poll:			Napi poll for normal rx/tx
  * @xsk_poll:			Napi poll when xsk is enabled
- * @ctrl_poll:			Tasklet poll for ctrl rx/tx
+ * @ctrl_poll:			Work poll for ctrl rx/tx
  * @xmit:			Xmit for normal path
  * @ctrl_tx_one:		Xmit for ctrl path
  * @rx_ring_fill_freelist:	Give buffers from the ring to FW
@@ -141,7 +141,7 @@ struct nfp_dp_ops {
 
 	int (*poll)(struct napi_struct *napi, int budget);
 	int (*xsk_poll)(struct napi_struct *napi, int budget);
-	void (*ctrl_poll)(struct tasklet_struct *t);
+	void (*ctrl_poll)(struct work_struct *t);
 	netdev_tx_t (*xmit)(struct sk_buff *skb, struct net_device *netdev);
 	bool (*ctrl_tx_one)(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
 			    struct sk_buff *skb, bool old);
diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
index fa1f78b03cb2..9d9ec00acf37 100644
--- a/drivers/net/ethernet/ni/nixge.c
+++ b/drivers/net/ethernet/ni/nixge.c
@@ -17,6 +17,7 @@
 #include <linux/nvmem-consumer.h>
 #include <linux/ethtool.h>
 #include <linux/iopoll.h>
+#include <linux/workqueue.h>
 
 #define TX_BD_NUM		64
 #define RX_BD_NUM		128
@@ -184,7 +185,7 @@ struct nixge_priv {
 	void __iomem *ctrl_regs;
 	void __iomem *dma_regs;
 
-	struct tasklet_struct dma_err_tasklet;
+	struct work_struct dma_err_work;
 
 	int tx_irq;
 	int rx_irq;
@@ -732,7 +733,7 @@ static irqreturn_t nixge_tx_irq(int irq, void *_ndev)
 		/* Write to the Rx channel control register */
 		nixge_dma_write_reg(priv, XAXIDMA_RX_CR_OFFSET, cr);
 
-		tasklet_schedule(&priv->dma_err_tasklet);
+		queue_work(system_bh_wq, &priv->dma_err_work);
 		nixge_dma_write_reg(priv, XAXIDMA_TX_SR_OFFSET, status);
 	}
 out:
@@ -780,16 +781,16 @@ static irqreturn_t nixge_rx_irq(int irq, void *_ndev)
 		/* write to the Rx channel control register */
 		nixge_dma_write_reg(priv, XAXIDMA_RX_CR_OFFSET, cr);
 
-		tasklet_schedule(&priv->dma_err_tasklet);
+		queue_work(system_bh_wq, &priv->dma_err_work);
 		nixge_dma_write_reg(priv, XAXIDMA_RX_SR_OFFSET, status);
 	}
 out:
 	return IRQ_HANDLED;
 }
 
-static void nixge_dma_err_handler(struct tasklet_struct *t)
+static void nixge_dma_err_handler(struct work_struct *t)
 {
-	struct nixge_priv *lp = from_tasklet(lp, t, dma_err_tasklet);
+	struct nixge_priv *lp = from_work(lp, t, dma_err_work);
 	struct nixge_hw_dma_bd *cur_p;
 	struct nixge_tx_skb *tx_skb;
 	u32 cr, i;
@@ -878,8 +879,8 @@ static int nixge_open(struct net_device *ndev)
 
 	phy_start(phy);
 
-	/* Enable tasklets for Axi DMA error handling */
-	tasklet_setup(&priv->dma_err_tasklet, nixge_dma_err_handler);
+	/* Enable works for Axi DMA error handling */
+	INIT_WORK(&priv->dma_err_work, nixge_dma_err_handler);
 
 	napi_enable(&priv->napi);
 
@@ -902,7 +903,7 @@ static int nixge_open(struct net_device *ndev)
 	napi_disable(&priv->napi);
 	phy_stop(phy);
 	phy_disconnect(phy);
-	tasklet_kill(&priv->dma_err_tasklet);
+	cancel_work_sync(&priv->dma_err_work);
 	netdev_err(ndev, "request_irq() failed\n");
 	return ret;
 }
@@ -927,7 +928,7 @@ static int nixge_stop(struct net_device *ndev)
 	nixge_dma_write_reg(priv, XAXIDMA_TX_CR_OFFSET,
 			    cr & (~XAXIDMA_CR_RUNSTOP_MASK));
 
-	tasklet_kill(&priv->dma_err_tasklet);
+	cancel_work_sync(&priv->dma_err_work);
 
 	free_irq(priv->tx_irq, ndev);
 	free_irq(priv->rx_irq, ndev);
diff --git a/drivers/net/ethernet/qlogic/qed/qed.h b/drivers/net/ethernet/qlogic/qed/qed.h
index 1d719726f72b..208dda46204e 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -565,7 +565,7 @@ struct qed_hwfn {
 	struct qed_consq		*p_consq;
 
 	/* Slow-Path definitions */
-	struct tasklet_struct		sp_dpc;
+	struct work_struct		sp_dpc;
 	bool				b_sp_dpc_enabled;
 
 	struct qed_ptt			*p_main_ptt;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c b/drivers/net/ethernet/qlogic/qed/qed_int.c
index 2661c483c67e..a5b7937cca3c 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
@@ -1236,9 +1236,9 @@ static void qed_sb_ack_attn(struct qed_hwfn *p_hwfn,
 	barrier();
 }
 
-void qed_int_sp_dpc(struct tasklet_struct *t)
+void qed_int_sp_dpc(struct work_struct *t)
 {
-	struct qed_hwfn *p_hwfn = from_tasklet(p_hwfn, t, sp_dpc);
+	struct qed_hwfn *p_hwfn = from_work(p_hwfn, t, sp_dpc);
 	struct qed_pi_info *pi_info = NULL;
 	struct qed_sb_attn_info *sb_attn;
 	struct qed_sb_info *sb_info;
@@ -2305,7 +2305,7 @@ u64 qed_int_igu_read_sisr_reg(struct qed_hwfn *p_hwfn)
 
 static void qed_int_sp_dpc_setup(struct qed_hwfn *p_hwfn)
 {
-	tasklet_setup(&p_hwfn->sp_dpc, qed_int_sp_dpc);
+	INIT_WORK(&p_hwfn->sp_dpc, qed_int_sp_dpc);
 	p_hwfn->b_sp_dpc_enabled = true;
 }
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.h b/drivers/net/ethernet/qlogic/qed/qed_int.h
index 7e5127f61744..38034f5c2992 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.h
@@ -142,12 +142,12 @@ int qed_int_sb_release(struct qed_hwfn *p_hwfn,
  * qed_int_sp_dpc(): To be called when an interrupt is received on the
  *                   default status block.
  *
- * @t: Tasklet.
+ * @t: Work.
  *
  * Return: Void.
  *
  */
-void qed_int_sp_dpc(struct tasklet_struct *t);
+void qed_int_sp_dpc(struct work_struct *t);
 
 /**
  * qed_int_get_num_sbs(): Get the number of status blocks configured
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index c278f8893042..990a3199499d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -684,9 +684,9 @@ static void qed_simd_handler_clean(struct qed_dev *cdev, int index)
 	       sizeof(struct qed_simd_fp_handler));
 }
 
-static irqreturn_t qed_msix_sp_int(int irq, void *tasklet)
+static irqreturn_t qed_msix_sp_int(int irq, void *work)
 {
-	tasklet_schedule((struct tasklet_struct *)tasklet);
+	queue_work(system_bh_wq, (struct work_struct *)work);
 	return IRQ_HANDLED;
 }
 
@@ -708,7 +708,7 @@ static irqreturn_t qed_single_int(int irq, void *dev_instance)
 
 		/* Slowpath interrupt */
 		if (unlikely(status & 0x1)) {
-			tasklet_schedule(&hwfn->sp_dpc);
+			queue_work(system_bh_wq, &hwfn->sp_dpc);
 			status &= ~0x1;
 			rc = IRQ_HANDLED;
 		}
@@ -779,15 +779,15 @@ int qed_slowpath_irq_req(struct qed_hwfn *hwfn)
 	return rc;
 }
 
-static void qed_slowpath_tasklet_flush(struct qed_hwfn *p_hwfn)
+static void qed_slowpath_work_flush(struct qed_hwfn *p_hwfn)
 {
 	/* Calling the disable function will make sure that any
 	 * currently-running function is completed. The following call to the
 	 * enable function makes this sequence a flush-like operation.
 	 */
 	if (p_hwfn->b_sp_dpc_enabled) {
-		tasklet_disable(&p_hwfn->sp_dpc);
-		tasklet_enable(&p_hwfn->sp_dpc);
+		disable_work_sync(&p_hwfn->sp_dpc);
+		enable_and_queue_work(system_bh_wq, &p_hwfn->sp_dpc);
 	}
 }
 
@@ -803,7 +803,7 @@ void qed_slowpath_irq_sync(struct qed_hwfn *p_hwfn)
 	else
 		synchronize_irq(cdev->pdev->irq);
 
-	qed_slowpath_tasklet_flush(p_hwfn);
+	qed_slowpath_work_flush(p_hwfn);
 }
 
 static void qed_slowpath_irq_free(struct qed_dev *cdev)
@@ -834,10 +834,10 @@ static int qed_nic_stop(struct qed_dev *cdev)
 		struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
 
 		if (p_hwfn->b_sp_dpc_enabled) {
-			tasklet_disable(&p_hwfn->sp_dpc);
+			disable_work_sync(&p_hwfn->sp_dpc);
 			p_hwfn->b_sp_dpc_enabled = false;
 			DP_VERBOSE(cdev, NETIF_MSG_IFDOWN,
-				   "Disabled sp tasklet [hwfn %d] at %p\n",
+				   "Disabled sp work [hwfn %d] at %p\n",
 				   i, &p_hwfn->sp_dpc);
 		}
 	}
@@ -3115,7 +3115,7 @@ void qed_get_protocol_stats(struct qed_dev *cdev,
 int qed_mfw_tlv_req(struct qed_hwfn *hwfn)
 {
 	DP_VERBOSE(hwfn->cdev, NETIF_MSG_DRV,
-		   "Scheduling slowpath task [Flag: %d]\n",
+		   "Scheduling slowpath work [Flag: %d]\n",
 		   QED_SLOWPATH_MFW_TLV_REQ);
 	/* Memory barrier for setting atomic bit */
 	smp_mb__before_atomic();
diff --git a/drivers/net/ethernet/sfc/falcon/farch.c b/drivers/net/ethernet/sfc/falcon/farch.c
index c64623c2e80c..4a3b0fbd1d0e 100644
--- a/drivers/net/ethernet/sfc/falcon/farch.c
+++ b/drivers/net/ethernet/sfc/falcon/farch.c
@@ -766,7 +766,7 @@ void ef4_farch_finish_flr(struct ef4_nic *efx)
 /**************************************************************************
  *
  * Event queue processing
- * Event queues are processed by per-channel tasklets.
+ * Event queues are processed by per-channel works.
  *
  **************************************************************************/
 
@@ -1397,7 +1397,7 @@ void ef4_farch_rx_defer_refill(struct ef4_rx_queue *rx_queue)
  *
  * Hardware interrupts
  * The hardware interrupt handler does very little work; all the event
- * queue processing is carried out by per-channel tasklets.
+ * queue processing is carried out by per-channel works.
  *
  **************************************************************************/
 
diff --git a/drivers/net/ethernet/sfc/falcon/net_driver.h b/drivers/net/ethernet/sfc/falcon/net_driver.h
index a2c7139f2b32..4f37f853769f 100644
--- a/drivers/net/ethernet/sfc/falcon/net_driver.h
+++ b/drivers/net/ethernet/sfc/falcon/net_driver.h
@@ -361,7 +361,7 @@ struct ef4_rx_queue {
  * struct ef4_channel - An Efx channel
  *
  * A channel comprises an event queue, at least one TX queue, at least
- * one RX queue, and an associated tasklet for processing the event
+ * one RX queue, and an associated BH work for processing the event
  * queue.
  *
  * @efx: Associated Efx NIC
diff --git a/drivers/net/ethernet/sfc/falcon/selftest.c b/drivers/net/ethernet/sfc/falcon/selftest.c
index c3dc88e6c26c..530e0e22b2d8 100644
--- a/drivers/net/ethernet/sfc/falcon/selftest.c
+++ b/drivers/net/ethernet/sfc/falcon/selftest.c
@@ -26,7 +26,7 @@
  * - All IRQs may be disabled on a CPU for a *long* time by e.g. a
  *   slow serial console or an old IDE driver doing error recovery
  * - The PREEMPT_RT patches mostly deal with this, but also allow a
- *   tasklet or normal task to be given higher priority than our IRQ
+ *   BH work or normal task to be given higher priority than our IRQ
  *   threads
  * Try to avoid blaming the hardware for this.
  */
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index f2dd7feb0e0c..9f6d6dbb6fb4 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -421,7 +421,7 @@ enum efx_sync_events_state {
  * struct efx_channel - An Efx channel
  *
  * A channel comprises an event queue, at least one TX queue, at least
- * one RX queue, and an associated tasklet for processing the event
+ * one RX queue, and an associated BH work for processing the event
  * queue.
  *
  * @efx: Associated Efx NIC
diff --git a/drivers/net/ethernet/sfc/selftest.c b/drivers/net/ethernet/sfc/selftest.c
index 894fad0bb5ea..8f30fab454cf 100644
--- a/drivers/net/ethernet/sfc/selftest.c
+++ b/drivers/net/ethernet/sfc/selftest.c
@@ -29,7 +29,7 @@
  * - All IRQs may be disabled on a CPU for a *long* time by e.g. a
  *   slow serial console or an old IDE driver doing error recovery
  * - The PREEMPT_RT patches mostly deal with this, but also allow a
- *   tasklet or normal task to be given higher priority than our IRQ
+ *   BH work or normal task to be given higher priority than our IRQ
  *   threads
  * Try to avoid blaming the hardware for this.
  */
diff --git a/drivers/net/ethernet/sfc/siena/farch.c b/drivers/net/ethernet/sfc/siena/farch.c
index 89ccd65c978b..166482dd192a 100644
--- a/drivers/net/ethernet/sfc/siena/farch.c
+++ b/drivers/net/ethernet/sfc/siena/farch.c
@@ -766,7 +766,7 @@ void efx_farch_finish_flr(struct efx_nic *efx)
 /**************************************************************************
  *
  * Event queue processing
- * Event queues are processed by per-channel tasklets.
+ * Event queues are processed by per-channel works.
  *
  **************************************************************************/
 
@@ -1414,7 +1414,7 @@ void efx_farch_rx_defer_refill(struct efx_rx_queue *rx_queue)
  *
  * Hardware interrupts
  * The hardware interrupt handler does very little work; all the event
- * queue processing is carried out by per-channel tasklets.
+ * queue processing is carried out by per-channel works.
  *
  **************************************************************************/
 
diff --git a/drivers/net/ethernet/sfc/siena/net_driver.h b/drivers/net/ethernet/sfc/siena/net_driver.h
index 94152f595acd..4502c7ce5495 100644
--- a/drivers/net/ethernet/sfc/siena/net_driver.h
+++ b/drivers/net/ethernet/sfc/siena/net_driver.h
@@ -431,7 +431,7 @@ enum efx_sync_events_state {
  * struct efx_channel - An Efx channel
  *
  * A channel comprises an event queue, at least one TX queue, at least
- * one RX queue, and an associated tasklet for processing the event
+ * one RX queue, and an associated BH work for processing the event
  * queue.
  *
  * @efx: Associated Efx NIC
diff --git a/drivers/net/ethernet/sfc/siena/selftest.c b/drivers/net/ethernet/sfc/siena/selftest.c
index 526da43d4b61..e5a3f7300daf 100644
--- a/drivers/net/ethernet/sfc/siena/selftest.c
+++ b/drivers/net/ethernet/sfc/siena/selftest.c
@@ -29,7 +29,7 @@
  * - All IRQs may be disabled on a CPU for a *long* time by e.g. a
  *   slow serial console or an old IDE driver doing error recovery
  * - The PREEMPT_RT patches mostly deal with this, but also allow a
- *   tasklet or normal task to be given higher priority than our IRQ
+ *   BH work or normal task to be given higher priority than our IRQ
  *   threads
  * Try to avoid blaming the hardware for this.
  */
diff --git a/drivers/net/ethernet/silan/sc92031.c b/drivers/net/ethernet/silan/sc92031.c
index ff4197f5e46d..e449ac0c89a2 100644
--- a/drivers/net/ethernet/silan/sc92031.c
+++ b/drivers/net/ethernet/silan/sc92031.c
@@ -34,6 +34,7 @@
 #include <linux/ethtool.h>
 #include <linux/mii.h>
 #include <linux/crc32.h>
+#include <linux/workqueue.h>
 
 #include <asm/irq.h>
 
@@ -255,7 +256,7 @@ enum PMConfigBits {
  */
 
 /* Locking rules for the interrupt:
- * - the interrupt and the tasklet never run at the same time
+ * - the interrupt and the work never run at the same time
  * - neither run between sc92031_disable_interrupts and
  *   sc92031_enable_interrupt
  */
@@ -266,8 +267,8 @@ struct sc92031_priv {
 	void __iomem		*port_base;
 	/* pci device structure */
 	struct pci_dev		*pdev;
-	/* tasklet */
-	struct tasklet_struct	tasklet;
+	/* work */
+	struct work_struct	work;
 
 	/* CPU address of rx ring */
 	void			*rx_ring;
@@ -355,7 +356,7 @@ static void sc92031_disable_interrupts(struct net_device *dev)
 	struct sc92031_priv *priv = netdev_priv(dev);
 	void __iomem *port_base = priv->port_base;
 
-	/* tell the tasklet/interrupt not to enable interrupts */
+	/* tell the work/interrupt not to enable interrupts */
 	atomic_set(&priv->intr_mask, 0);
 	wmb();
 
@@ -363,9 +364,9 @@ static void sc92031_disable_interrupts(struct net_device *dev)
 	iowrite32(0, port_base + IntrMask);
 	_sc92031_dummy_read(port_base);
 
-	/* wait for any concurrent interrupt/tasklet to finish */
+	/* wait for any concurrent interrupt/work to finish */
 	synchronize_irq(priv->pdev->irq);
-	tasklet_disable(&priv->tasklet);
+	disable_work_sync(&priv->work);
 }
 
 static void sc92031_enable_interrupts(struct net_device *dev)
@@ -373,7 +374,7 @@ static void sc92031_enable_interrupts(struct net_device *dev)
 	struct sc92031_priv *priv = netdev_priv(dev);
 	void __iomem *port_base = priv->port_base;
 
-	tasklet_enable(&priv->tasklet);
+	enable_and_queue_work(system_bh_wq, &priv->work);
 
 	atomic_set(&priv->intr_mask, IntrBits);
 	wmb();
@@ -644,7 +645,7 @@ static void _sc92031_reset(struct net_device *dev)
 	ioread32(port_base + IntrStatus);
 }
 
-static void _sc92031_tx_tasklet(struct net_device *dev)
+static void _sc92031_tx_work(struct net_device *dev)
 {
 	struct sc92031_priv *priv = netdev_priv(dev);
 	void __iomem *port_base = priv->port_base;
@@ -692,8 +693,8 @@ static void _sc92031_tx_tasklet(struct net_device *dev)
 			netif_wake_queue(dev);
 }
 
-static void _sc92031_rx_tasklet_error(struct net_device *dev,
-				      u32 rx_status, unsigned rx_size)
+static void _sc92031_rx_work_error(struct net_device *dev,
+				   u32 rx_status, unsigned rx_size)
 {
 	if(rx_size > (MAX_ETH_FRAME_SIZE + 4) || rx_size < 16) {
 		dev->stats.rx_errors++;
@@ -717,7 +718,7 @@ static void _sc92031_rx_tasklet_error(struct net_device *dev,
 	}
 }
 
-static void _sc92031_rx_tasklet(struct net_device *dev)
+static void _sc92031_rx_work(struct net_device *dev)
 {
 	struct sc92031_priv *priv = netdev_priv(dev);
 	void __iomem *port_base = priv->port_base;
@@ -773,7 +774,7 @@ static void _sc92031_rx_tasklet(struct net_device *dev)
 			     rx_size > (MAX_ETH_FRAME_SIZE + 4) ||
 			     rx_size < 16 ||
 			     !(rx_status & RxStatesOK))) {
-			_sc92031_rx_tasklet_error(dev, rx_status, rx_size);
+			_sc92031_rx_work_error(dev, rx_status, rx_size);
 			break;
 		}
 
@@ -820,7 +821,7 @@ static void _sc92031_rx_tasklet(struct net_device *dev)
 	iowrite32(priv->rx_ring_tail, port_base + RxBufRPtr);
 }
 
-static void _sc92031_link_tasklet(struct net_device *dev)
+static void _sc92031_link_work(struct net_device *dev)
 {
 	if (_sc92031_check_media(dev))
 		netif_wake_queue(dev);
@@ -830,9 +831,9 @@ static void _sc92031_link_tasklet(struct net_device *dev)
 	}
 }
 
-static void sc92031_tasklet(struct tasklet_struct *t)
+static void sc92031_work(struct work_struct *t)
 {
-	struct  sc92031_priv *priv = from_tasklet(priv, t, tasklet);
+	struct  sc92031_priv *priv = from_work(priv, t, work);
 	struct net_device *dev = priv->ndev;
 	void __iomem *port_base = priv->port_base;
 	u32 intr_status, intr_mask;
@@ -845,10 +846,10 @@ static void sc92031_tasklet(struct tasklet_struct *t)
 		goto out;
 
 	if (intr_status & TxOK)
-		_sc92031_tx_tasklet(dev);
+		_sc92031_tx_work(dev);
 
 	if (intr_status & RxOK)
-		_sc92031_rx_tasklet(dev);
+		_sc92031_rx_work(dev);
 
 	if (intr_status & RxOverflow)
 		dev->stats.rx_errors++;
@@ -859,7 +860,7 @@ static void sc92031_tasklet(struct tasklet_struct *t)
 	}
 
 	if (intr_status & (LinkFail | LinkOK))
-		_sc92031_link_tasklet(dev);
+		_sc92031_link_work(dev);
 
 out:
 	intr_mask = atomic_read(&priv->intr_mask);
@@ -890,7 +891,7 @@ static irqreturn_t sc92031_interrupt(int irq, void *dev_id)
 		goto out_none;
 
 	priv->intr_status = intr_status;
-	tasklet_schedule(&priv->tasklet);
+	queue_work(system_bh_wq, &priv->work);
 
 	return IRQ_HANDLED;
 
@@ -1109,7 +1110,7 @@ static void sc92031_poll_controller(struct net_device *dev)
 
 	disable_irq(irq);
 	if (sc92031_interrupt(irq, dev) != IRQ_NONE)
-		sc92031_tasklet(&priv->tasklet);
+		sc92031_work(&priv->work);
 	enable_irq(irq);
 }
 #endif
@@ -1449,10 +1450,10 @@ static int sc92031_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	spin_lock_init(&priv->lock);
 	priv->port_base = port_base;
 	priv->pdev = pdev;
-	tasklet_setup(&priv->tasklet, sc92031_tasklet);
-	/* Fudge tasklet count so the call to sc92031_enable_interrupts at
+	INIT_WORK(&priv->work, sc92031_work);
+	/* Fudge work count so the call to sc92031_enable_interrupts at
 	 * sc92031_open will work correctly */
-	tasklet_disable_nosync(&priv->tasklet);
+	disable_work(&priv->work);
 
 	/* PCI PM Wakeup */
 	iowrite32((~PM_LongWF & ~PM_LWPTN) | PM_Enable, port_base + PMConfig);
diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c
index 78ff3af7911a..8cf8e2b44eef 100644
--- a/drivers/net/ethernet/smsc/smc91x.c
+++ b/drivers/net/ethernet/smsc/smc91x.c
@@ -245,7 +245,7 @@ static void smc_reset(struct net_device *dev)
 
 	DBG(2, dev, "%s\n", __func__);
 
-	/* Disable all interrupts, block TX tasklet */
+	/* Disable all interrupts, block TX work */
 	spin_lock_irq(&lp->lock);
 	SMC_SELECT_BANK(lp, 2);
 	SMC_SET_INT_MASK(lp, 0);
@@ -356,7 +356,7 @@ static void smc_enable(struct net_device *dev)
 	/*
 	 * From this point the register bank must _NOT_ be switched away
 	 * to something else than bank 2 without proper locking against
-	 * races with any tasklet or interrupt handlers until smc_shutdown()
+	 * races with any work or interrupt handlers until smc_shutdown()
 	 * or smc_reset() is called.
 	 */
 }
@@ -536,9 +536,9 @@ static inline void  smc_rcv(struct net_device *dev)
 /*
  * This is called to actually send a packet to the chip.
  */
-static void smc_hardware_send_pkt(struct tasklet_struct *t)
+static void smc_hardware_send_pkt(struct work_struct *t)
 {
-	struct smc_local *lp = from_tasklet(lp, t, tx_task);
+	struct smc_local *lp = from_work(lp, t, tx_task);
 	struct net_device *dev = lp->dev;
 	void __iomem *ioaddr = lp->base;
 	struct sk_buff *skb;
@@ -550,7 +550,7 @@ static void smc_hardware_send_pkt(struct tasklet_struct *t)
 
 	if (!smc_special_trylock(&lp->lock, flags)) {
 		netif_stop_queue(dev);
-		tasklet_schedule(&lp->tx_task);
+		queue_work(system_bh_wq, &lp->tx_task);
 		return;
 	}
 
@@ -1248,7 +1248,7 @@ static irqreturn_t smc_interrupt(int irq, void *dev_id)
 			smc_rcv(dev);
 		} else if (status & IM_ALLOC_INT) {
 			DBG(3, dev, "Allocation irq\n");
-			tasklet_hi_schedule(&lp->tx_task);
+			queue_work(system_bh_highpri_wq, &lp->tx_task);
 			mask &= ~IM_ALLOC_INT;
 		} else if (status & IM_TX_EMPTY_INT) {
 			DBG(3, dev, "TX empty\n");
@@ -1515,7 +1515,7 @@ static int smc_close(struct net_device *dev)
 
 	/* clear everything */
 	smc_shutdown(dev);
-	tasklet_kill(&lp->tx_task);
+	cancel_work_sync(&lp->tx_task);
 	smc_phy_powerdown(dev);
 	return 0;
 }
@@ -1968,7 +1968,7 @@ static int smc_probe(struct net_device *dev, void __iomem *ioaddr,
 	dev->netdev_ops = &smc_netdev_ops;
 	dev->ethtool_ops = &smc_ethtool_ops;
 
-	tasklet_setup(&lp->tx_task, smc_hardware_send_pkt);
+	INIT_WORK(&lp->tx_task, smc_hardware_send_pkt);
 	INIT_WORK(&lp->phy_configure, smc_phy_configure);
 	lp->dev = dev;
 	lp->mii.phy_id_mask = 0x1f;
diff --git a/drivers/net/ethernet/smsc/smc91x.h b/drivers/net/ethernet/smsc/smc91x.h
index 46eee747c699..a3e7bd8ba2e0 100644
--- a/drivers/net/ethernet/smsc/smc91x.h
+++ b/drivers/net/ethernet/smsc/smc91x.h
@@ -201,7 +201,7 @@ struct smc_local {
 	 * desired memory.  Then, I'll send it out and free it.
 	 */
 	struct sk_buff *pending_tx_skb;
-	struct tasklet_struct tx_task;
+	struct work_struct tx_task;
 
 	struct gpio_desc *power_gpio;
 	struct gpio_desc *reset_gpio;
@@ -260,6 +260,7 @@ struct smc_local {
  * as RX which can overrun memory and lose packets.
  */
 #include <linux/dma-mapping.h>
+#include <linux/workqueue.h>
 
 #ifdef SMC_insl
 #undef SMC_insl
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 27f42f713c89..e5605dbdb5ca 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -745,7 +745,7 @@ struct mlx4_cq {
 		struct list_head list;
 		void (*comp)(struct mlx4_cq *);
 		void		*priv;
-	} tasklet_ctx;
+	} work_ctx;
 	int		reset_notify_added;
 	struct list_head	reset_notify;
 	u8			usage;
diff --git a/include/linux/mlx5/cq.h b/include/linux/mlx5/cq.h
index cb15308b5cb0..fd4b7b6d5ca0 100644
--- a/include/linux/mlx5/cq.h
+++ b/include/linux/mlx5/cq.h
@@ -56,7 +56,7 @@ struct mlx5_core_cq {
 		struct list_head list;
 		void (*comp)(struct mlx5_core_cq *cq, struct mlx5_eqe *eqe);
 		void		*priv;
-	} tasklet_ctx;
+	} work_ctx;
 	int			reset_notify_added;
 	struct list_head	reset_notify;
 	struct mlx5_eq_comp	*eq;
-- 
2.17.1


^ permalink raw reply related	[relevance 7%]
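
Note on the conversion pattern in the patch above: each converted callback now takes a struct work_struct * and recovers its driver-private structure with from_work(), which is container_of() on the embedded work item. Below is a minimal userspace sketch of that pattern. The struct work_struct, container_of, and from_work definitions here are simplified stand-ins written for illustration (they rely on the GNU __typeof__ extension), and every demo_* name is hypothetical. demo_queue_work() runs the callback inline, so it says nothing about the softirq context or ordering that system_bh_wq provides in the kernel.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel type; illustrative only. */
struct work_struct {
	void (*func)(struct work_struct *work);
};

/* Simplified container_of(): step back from a member to its parent. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as the from_work() calls in the patch: recover the
 * enclosing object from the work item handed to the callback. */
#define from_work(var, callback_work, work_fieldname) \
	container_of(callback_work, __typeof__(*var), work_fieldname)

/* Hypothetical driver-private structure with an embedded work item. */
struct demo_priv {
	int dma_errors;
	struct work_struct dma_err_work;
};

/* Callback in the converted style: takes the work item, not a tasklet. */
static void demo_dma_err_handler(struct work_struct *t)
{
	struct demo_priv *lp = from_work(lp, t, dma_err_work);

	lp->dma_errors++;
}

/* Stand-in for INIT_WORK(): bind the callback to the work item. */
static void demo_init_work(struct work_struct *w,
			   void (*fn)(struct work_struct *))
{
	w->func = fn;
}

/* Stand-in for queue_work(): the callback simply runs inline here. */
static void demo_queue_work(struct work_struct *w)
{
	w->func(w);
}
```

The point of the sketch is only that the callback needs no separate context pointer: the work item's address is enough to find the owning structure, which is why the diff can swap from_tasklet() for from_work() one for one.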

* Re: [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive
  2024-05-03 18:13 22% ` [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
@ 2024-05-07 18:16 67%   ` Easwar Hariharan
  0 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-07 18:16 UTC (permalink / raw)
  To: Christian König, David Airlie, Daniel Vetter
  Cc: Wolfram Sang,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Pan, Xinhui, Harry Wentland, Leo Li, Rodrigo Siqueira, Evan Quan,
	Hawking Zhang, Candice Li, Ran Sun, Alexander Richards,
	Wolfram Sang, Andi Shyti, Dmitry Baryshkov, Heiko Stuebner,
	Heiner Kallweit, Hamza Mahfooz, Ruan Jinjie, Aurabindo Pillai,
	Wayne Lin, Samson Tam, Alvin Lee, Sohaib Nadeem, Charlene Liu,
	Tom Chung, Alan Liu, Bhawanpreet Lakha,
	Meenakshikumar Somasundaram, George Shen, Aric Cyr,
	Nicholas Kazlauskas, Qingqing Zhuo, Dillon Varone, Lijo Lazar,
	Asad kamal, Kenneth Feng, Ma Jun, Darren Powell, Yang Wang,
	Mario Limonciello, Yifan Zhang, Le Ma,
	open list:RADEON and AMDGPU DRM DRIVERS, open list:DRM DRIVERS,
	Alex Deucher, open list

On 5/3/2024 11:13 AM, Easwar Hariharan wrote:
> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
> with more appropriate terms. Inspired by and following on to Wolfram's
> series to fix drivers/i2c/[1], fix the terminology for users of
> I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
> in the specification.
> 
> Compile tested, no functionality changes intended
> 
> [1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
> 
> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c  |  8 +++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c       | 10 +++----
>  drivers/gpu/drm/amd/amdgpu/atombios_i2c.c     |  8 +++---
>  drivers/gpu/drm/amd/amdgpu/atombios_i2c.h     |  2 +-
>  drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    | 20 ++++++-------
>  .../gpu/drm/amd/display/dc/bios/bios_parser.c |  2 +-
>  .../drm/amd/display/dc/bios/bios_parser2.c    |  2 +-
>  .../drm/amd/display/dc/core/dc_link_exports.c |  4 +--
>  drivers/gpu/drm/amd/display/dc/dc.h           |  2 +-
>  drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c  |  4 +--
>  .../display/include/grph_object_ctrl_defs.h   |  2 +-
>  drivers/gpu/drm/amd/include/atombios.h        |  2 +-
>  drivers/gpu/drm/amd/include/atomfirmware.h    | 26 ++++++++---------
>  .../powerplay/hwmgr/vega20_processpptables.c  |  4 +--
>  .../amd/pm/powerplay/inc/smu11_driver_if.h    |  2 +-
>  .../inc/pmfw_if/smu11_driver_if_arcturus.h    |  2 +-
>  .../inc/pmfw_if/smu11_driver_if_navi10.h      |  2 +-
>  .../pmfw_if/smu11_driver_if_sienna_cichlid.h  |  2 +-
>  .../inc/pmfw_if/smu13_driver_if_aldebaran.h   |  2 +-
>  .../inc/pmfw_if/smu13_driver_if_v13_0_0.h     |  2 +-
>  .../inc/pmfw_if/smu13_driver_if_v13_0_7.h     |  2 +-
>  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  4 +--
>  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  8 +++---
>  drivers/gpu/drm/radeon/atombios.h             | 16 +++++------
>  drivers/gpu/drm/radeon/atombios_i2c.c         |  4 +--
>  drivers/gpu/drm/radeon/radeon_combios.c       | 28 +++++++++----------
>  drivers/gpu/drm/radeon/radeon_i2c.c           | 10 +++----
>  drivers/gpu/drm/radeon/radeon_mode.h          |  6 ++--
>  28 files changed, 93 insertions(+), 93 deletions(-)
>

<snip>

Hello Christian, Daniel, David, others,

Could you re-review v2 since the feedback provided in v0 [1] has now been addressed? I can send v3 with
all other feedback and signoffs from the other maintainers incorporated when I have something for amdgpu 
and radeon.

Thanks,
Easwar

[1] https://lore.kernel.org/all/53f3afba-4759-4ea1-b408-8a929b26280c@amd.com/

^ permalink raw reply	[relevance 67%]

* [PATCH rdma-next 3/3] RDMA/mana_ib: Modify QP state
  2024-05-07  9:53 79% [PATCH rdma-next 0/3] RDMA/mana_ib: Add support of RC QPs Konstantin Taranov
  2024-05-07  9:53 63% ` [PATCH rdma-next 1/3] RDMA/mana_ib: Create and destroy RC QP Konstantin Taranov
  2024-05-07  9:53 62% ` [PATCH rdma-next 2/3] RDMA/mana_ib: Implement uapi to create " Konstantin Taranov
@ 2024-05-07  9:53 64% ` Konstantin Taranov
  2 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-05-07  9:53 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement modify QP state for RC QPs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/mana_ib.h | 37 ++++++++++++++
 drivers/infiniband/hw/mana/qp.c      | 72 +++++++++++++++++++++++++++-
 2 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 5cccbe3..d29dee7 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -140,6 +140,7 @@ enum mana_ib_command_code {
 	MANA_IB_DESTROY_CQ      = 0x30009,
 	MANA_IB_CREATE_RC_QP    = 0x3000a,
 	MANA_IB_DESTROY_RC_QP   = 0x3000b,
+	MANA_IB_SET_QP_STATE	= 0x3000d,
 };
 
 struct mana_ib_query_adapter_caps_req {
@@ -286,6 +287,42 @@ struct mana_rnic_destroy_rc_qp_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+struct mana_ib_ah_attr {
+	u8 src_addr[16];
+	u8 dest_addr[16];
+	u8 src_mac[ETH_ALEN];
+	u8 dest_mac[ETH_ALEN];
+	u8 src_addr_type;
+	u8 dest_addr_type;
+	u8 hop_limit;
+	u8 traffic_class;
+	u16 src_port;
+	u16 dest_port;
+	u32 reserved;
+};
+
+struct mana_rnic_set_qp_state_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t qp_handle;
+	u64 attr_mask;
+	u32 qp_state;
+	u32 path_mtu;
+	u32 rq_psn;
+	u32 sq_psn;
+	u32 dest_qpn;
+	u32 max_dest_rd_atomic;
+	u32 retry_cnt;
+	u32 rnr_retry;
+	u32 min_rnr_timer;
+	u32 reserved;
+	struct mana_ib_ah_attr ah_attr;
+}; /* HW Data */
+
+struct mana_rnic_set_qp_state_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 14e6adb..5393b6f 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -492,11 +492,79 @@ int mana_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
 	return -EINVAL;
 }
 
+static int mana_ib_gd_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+				int attr_mask, struct ib_udata *udata)
+{
+	struct mana_ib_dev *mdev = container_of(ibqp->device, struct mana_ib_dev, ib_dev);
+	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
+	struct mana_rnic_set_qp_state_resp resp = {};
+	struct mana_rnic_set_qp_state_req req = {};
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_port_context *mpc;
+	struct net_device *ndev;
+	int err;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_SET_QP_STATE, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.qp_handle = qp->qp_handle;
+	req.qp_state = attr->qp_state;
+	req.attr_mask = attr_mask;
+	req.path_mtu = attr->path_mtu;
+	req.rq_psn = attr->rq_psn;
+	req.sq_psn = attr->sq_psn;
+	req.dest_qpn = attr->dest_qp_num;
+	req.max_dest_rd_atomic = attr->max_dest_rd_atomic;
+	req.retry_cnt = attr->retry_cnt;
+	req.rnr_retry = attr->rnr_retry;
+	req.min_rnr_timer = attr->min_rnr_timer;
+	if (attr_mask & IB_QP_AV) {
+		ndev = mana_ib_get_netdev(&mdev->ib_dev, ibqp->port);
+		if (!ndev) {
+			ibdev_dbg(&mdev->ib_dev, "Invalid port %u in RC QP %u\n",
+				  ibqp->port, ibqp->qp_num);
+			return -EINVAL;
+		}
+		mpc = netdev_priv(ndev);
+		copy_in_reverse(req.ah_attr.src_mac, mpc->mac_addr, ETH_ALEN);
+		copy_in_reverse(req.ah_attr.dest_mac, attr->ah_attr.roce.dmac, ETH_ALEN);
+		copy_in_reverse(req.ah_attr.src_addr, attr->ah_attr.grh.sgid_attr->gid.raw,
+				sizeof(union ib_gid));
+		copy_in_reverse(req.ah_attr.dest_addr, attr->ah_attr.grh.dgid.raw,
+				sizeof(union ib_gid));
+		if (rdma_gid_attr_network_type(attr->ah_attr.grh.sgid_attr) == RDMA_NETWORK_IPV4) {
+			req.ah_attr.src_addr_type = SGID_TYPE_IPV4;
+			req.ah_attr.dest_addr_type = SGID_TYPE_IPV4;
+		} else {
+			req.ah_attr.src_addr_type = SGID_TYPE_IPV6;
+			req.ah_attr.dest_addr_type = SGID_TYPE_IPV6;
+		}
+		req.ah_attr.dest_port = ROCE_V2_UDP_DPORT;
+		req.ah_attr.src_port = rdma_get_udp_sport(attr->ah_attr.grh.flow_label,
+							  ibqp->qp_num, attr->dest_qp_num);
+		req.ah_attr.traffic_class = attr->ah_attr.grh.traffic_class;
+		req.ah_attr.hop_limit = attr->ah_attr.grh.hop_limit;
+	}
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed modify qp err %d", err);
+		return err;
+	}
+
+	return 0;
+}
+
 int mana_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		      int attr_mask, struct ib_udata *udata)
 {
-	/* modify_qp is not supported by this version of the driver */
-	return -EOPNOTSUPP;
+	switch (ibqp->qp_type) {
+	case IB_QPT_RC:
+		return mana_ib_gd_modify_qp(ibqp, attr, attr_mask, udata);
+	default:
+		ibdev_dbg(ibqp->device, "Modify QP type %u not supported", ibqp->qp_type);
+		return -EOPNOTSUPP;
+	}
 }
 
 static int mana_ib_destroy_qp_rss(struct mana_ib_qp *qp,
-- 
2.43.0


^ permalink raw reply related	[relevance 64%]

* [PATCH rdma-next 1/3] RDMA/mana_ib: Create and destroy RC QP
  2024-05-07  9:53 79% [PATCH rdma-next 0/3] RDMA/mana_ib: Add support of RC QPs Konstantin Taranov
@ 2024-05-07  9:53 63% ` Konstantin Taranov
  2024-05-07  9:53 62% ` [PATCH rdma-next 2/3] RDMA/mana_ib: Implement uapi to create " Konstantin Taranov
  2024-05-07  9:53 64% ` [PATCH rdma-next 3/3] RDMA/mana_ib: Modify QP state Konstantin Taranov
  2 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-05-07  9:53 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement HW requests to create and destroy an RC QP.
An RC QP may have 5 queues.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    | 59 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h | 58 ++++++++++++++++++++++++++-
 2 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 2a41135..6bd6072 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -888,3 +888,62 @@ int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 
 	return 0;
 }
+
+int mana_ib_gd_create_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
+			    struct ib_qp_init_attr *attr, u32 doorbell, u64 flags)
+{
+	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct mana_ib_cq, ibcq);
+	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct mana_ib_cq, ibcq);
+	struct mana_ib_pd *pd = container_of(qp->ibqp.pd, struct mana_ib_pd, ibpd);
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_create_qp_resp resp = {};
+	struct mana_rnic_create_qp_req req = {};
+	int err, i;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CREATE_RC_QP, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.pd_handle = pd->pd_handle;
+	req.send_cq_handle = send_cq->cq_handle;
+	req.recv_cq_handle = recv_cq->cq_handle;
+	for (i = 0; i < MANA_RC_QUEUE_TYPE_MAX; i++)
+		req.dma_region[i] = qp->rc_qp.queues[i].gdma_region;
+	req.doorbell_page = doorbell;
+	req.max_send_wr = attr->cap.max_send_wr;
+	req.max_recv_wr = attr->cap.max_recv_wr;
+	req.max_send_sge = attr->cap.max_send_sge;
+	req.max_recv_sge = attr->cap.max_recv_sge;
+	req.flags = flags;
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create rc qp err %d", err);
+		return err;
+	}
+	qp->qp_handle = resp.rc_qp_handle;
+	for (i = 0; i < MANA_RC_QUEUE_TYPE_MAX; i++) {
+		qp->rc_qp.queues[i].id = resp.queue_ids[i];
+		/* The GDMA regions are now owned by the RNIC QP handle */
+		qp->rc_qp.queues[i].gdma_region = GDMA_INVALID_DMA_REGION;
+	}
+	return 0;
+}
+
+int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
+{
+	struct mana_rnic_destroy_rc_qp_resp resp = {0};
+	struct mana_rnic_destroy_rc_qp_req req = {0};
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	int err;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_DESTROY_RC_QP, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.rc_qp_handle = qp->qp_handle;
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to destroy rc qp err %d", err);
+		return err;
+	}
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 68c3b4f..a3e229c 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -95,11 +95,27 @@ struct mana_ib_cq {
 	mana_handle_t  cq_handle;
 };
 
+enum mana_rc_queue_type {
+	MANA_RC_SEND_QUEUE_REQUESTER = 0,
+	MANA_RC_SEND_QUEUE_RESPONDER,
+	MANA_RC_SEND_QUEUE_FMR,
+	MANA_RC_RECV_QUEUE_REQUESTER,
+	MANA_RC_RECV_QUEUE_RESPONDER,
+	MANA_RC_QUEUE_TYPE_MAX,
+};
+
+struct mana_ib_rc_qp {
+	struct mana_ib_queue queues[MANA_RC_QUEUE_TYPE_MAX];
+};
+
 struct mana_ib_qp {
 	struct ib_qp ibqp;
 
 	mana_handle_t qp_handle;
-	struct mana_ib_queue raw_sq;
+	union {
+		struct mana_ib_queue raw_sq;
+		struct mana_ib_rc_qp rc_qp;
+	};
 
 	/* The port on the IB device, starting with 1 */
 	u32 port;
@@ -122,6 +138,8 @@ enum mana_ib_command_code {
 	MANA_IB_CONFIG_MAC_ADDR	= 0x30005,
 	MANA_IB_CREATE_CQ       = 0x30008,
 	MANA_IB_DESTROY_CQ      = 0x30009,
+	MANA_IB_CREATE_RC_QP    = 0x3000a,
+	MANA_IB_DESTROY_RC_QP   = 0x3000b,
 };
 
 struct mana_ib_query_adapter_caps_req {
@@ -230,6 +248,40 @@ struct mana_rnic_destroy_cq_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+struct mana_rnic_create_qp_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t pd_handle;
+	mana_handle_t send_cq_handle;
+	mana_handle_t recv_cq_handle;
+	u64 dma_region[MANA_RC_QUEUE_TYPE_MAX];
+	u64 deprecated[2];
+	u64 flags;
+	u32 doorbell_page;
+	u32 max_send_wr;
+	u32 max_recv_wr;
+	u32 max_send_sge;
+	u32 max_recv_sge;
+	u32 reserved;
+}; /* HW Data */
+
+struct mana_rnic_create_qp_resp {
+	struct gdma_resp_hdr hdr;
+	mana_handle_t rc_qp_handle;
+	u32 queue_ids[MANA_RC_QUEUE_TYPE_MAX];
+	u32 reserved;
+}; /* HW Data*/
+
+struct mana_rnic_destroy_rc_qp_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t rc_qp_handle;
+}; /* HW Data */
+
+struct mana_rnic_destroy_rc_qp_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -354,4 +406,8 @@ int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8
 int mana_ib_gd_create_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq, u32 doorbell);
 
 int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
+
+int mana_ib_gd_create_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
+			    struct ib_qp_init_attr *attr, u32 doorbell, u64 flags);
+int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
 #endif
-- 
2.43.0


^ permalink raw reply related	[relevance 63%]

* [PATCH rdma-next 2/3] RDMA/mana_ib: Implement uapi to create and destroy RC QP
  2024-05-07  9:53 79% [PATCH rdma-next 0/3] RDMA/mana_ib: Add support of RC QPs Konstantin Taranov
  2024-05-07  9:53 63% ` [PATCH rdma-next 1/3] RDMA/mana_ib: Create and destroy RC QP Konstantin Taranov
@ 2024-05-07  9:53 62% ` Konstantin Taranov
  2024-05-07  9:53 64% ` [PATCH rdma-next 3/3] RDMA/mana_ib: Modify QP state Konstantin Taranov
  2 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-05-07  9:53 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement user requests to create and destroy an RC QP.
As the user does not have an FMR queue, it is skipped and NO_FMR flag
is used.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/mana_ib.h |  4 ++
 drivers/infiniband/hw/mana/qp.c      | 93 +++++++++++++++++++++++++++-
 include/uapi/rdma/mana-abi.h         |  9 +++
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index a3e229c..5cccbe3 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -248,6 +248,10 @@ struct mana_rnic_destroy_cq_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+enum mana_rnic_create_rc_flags {
+	MANA_RC_FLAG_NO_FMR = 2,
+};
+
 struct mana_rnic_create_qp_req {
 	struct gdma_req_hdr hdr;
 	mana_handle_t adapter;
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index ba13c5a..14e6adb 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -398,6 +398,78 @@ err_free_vport:
 	return err;
 }
 
+static int mana_ib_create_rc_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
+				struct ib_qp_init_attr *attr, struct ib_udata *udata)
+{
+	struct mana_ib_dev *mdev = container_of(ibpd->device, struct mana_ib_dev, ib_dev);
+	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
+	struct mana_ib_create_rc_qp_resp resp = {};
+	struct mana_ib_ucontext *mana_ucontext;
+	struct mana_ib_create_rc_qp ucmd = {};
+	int i, err, j;
+	u64 flags = 0;
+	u32 doorbell;
+
+	if (!udata || udata->inlen < sizeof(ucmd))
+		return -EINVAL;
+
+	mana_ucontext = rdma_udata_to_drv_context(udata, struct mana_ib_ucontext, ibucontext);
+	doorbell = mana_ucontext->doorbell;
+	flags = MANA_RC_FLAG_NO_FMR;
+	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
+	if (err) {
+		ibdev_dbg(&mdev->ib_dev, "Failed to copy from udata, %d\n", err);
+		return err;
+	}
+
+	for (i = 0, j = 0; i < MANA_RC_QUEUE_TYPE_MAX; ++i) {
+		// skip FMR for user-level RC QPs
+		if (i == MANA_RC_SEND_QUEUE_FMR) {
+			qp->rc_qp.queues[i].id = INVALID_QUEUE_ID;
+			qp->rc_qp.queues[i].gdma_region = GDMA_INVALID_DMA_REGION;
+			continue;
+		}
+		err = mana_ib_create_queue(mdev, ucmd.queue_buf[j], ucmd.queue_size[j],
+					   &qp->rc_qp.queues[i]);
+		if (err) {
+			ibdev_err(&mdev->ib_dev, "Failed to create queue %d, err %d\n", i, err);
+			goto destroy_queues;
+		}
+		j++;
+	}
+
+	err = mana_ib_gd_create_rc_qp(mdev, qp, attr, doorbell, flags);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create rc qp  %d\n", err);
+		goto destroy_queues;
+	}
+	qp->ibqp.qp_num = qp->rc_qp.queues[MANA_RC_RECV_QUEUE_RESPONDER].id;
+	qp->port = attr->port_num;
+
+	if (udata) {
+		for (i = 0, j = 0; i < MANA_RC_QUEUE_TYPE_MAX; ++i) {
+			if (i == MANA_RC_SEND_QUEUE_FMR)
+				continue;
+			resp.queue_id[j] = qp->rc_qp.queues[i].id;
+			j++;
+		}
+		err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
+		if (err) {
+			ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
+			goto destroy_qp;
+		}
+	}
+
+	return 0;
+
+destroy_qp:
+	mana_ib_gd_destroy_rc_qp(mdev, qp);
+destroy_queues:
+	while (i-- > 0)
+		mana_ib_destroy_queue(mdev, &qp->rc_qp.queues[i]);
+	return err;
+}
+
 int mana_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
 		      struct ib_udata *udata)
 {
@@ -409,6 +481,8 @@ int mana_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
 						     udata);
 
 		return mana_ib_create_qp_raw(ibqp, ibqp->pd, attr, udata);
+	case IB_QPT_RC:
+		return mana_ib_create_rc_qp(ibqp, ibqp->pd, attr, udata);
 	default:
 		/* Creating QP other than IB_QPT_RAW_PACKET is not supported */
 		ibdev_dbg(ibqp->device, "Creating QP type %u not supported\n",
@@ -473,6 +547,22 @@ static int mana_ib_destroy_qp_raw(struct mana_ib_qp *qp, struct ib_udata *udata)
 	return 0;
 }
 
+static int mana_ib_destroy_rc_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
+{
+	struct mana_ib_dev *mdev =
+		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
+	int i;
+
+	/* Ignore return code as there is not much we can do about it.
+	 * The error message is printed inside.
+	 */
+	mana_ib_gd_destroy_rc_qp(mdev, qp);
+	for (i = 0; i < MANA_RC_QUEUE_TYPE_MAX; ++i)
+		mana_ib_destroy_queue(mdev, &qp->rc_qp.queues[i]);
+
+	return 0;
+}
+
 int mana_ib_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 {
 	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
@@ -484,7 +574,8 @@ int mana_ib_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 						      udata);
 
 		return mana_ib_destroy_qp_raw(qp, udata);
-
+	case IB_QPT_RC:
+		return mana_ib_destroy_rc_qp(qp, udata);
 	default:
 		ibdev_dbg(ibqp->device, "Unexpected QP type %u\n",
 			  ibqp->qp_type);
diff --git a/include/uapi/rdma/mana-abi.h b/include/uapi/rdma/mana-abi.h
index 2c41cc3..45c2df6 100644
--- a/include/uapi/rdma/mana-abi.h
+++ b/include/uapi/rdma/mana-abi.h
@@ -45,6 +45,15 @@ struct mana_ib_create_qp_resp {
 	__u32 reserved;
 };
 
+struct mana_ib_create_rc_qp {
+	__aligned_u64 queue_buf[4];
+	__u32 queue_size[4];
+};
+
+struct mana_ib_create_rc_qp_resp {
+	__u32 queue_id[4];
+};
+
 struct mana_ib_create_wq {
 	__aligned_u64 wq_buf_addr;
 	__u32 wq_buf_size;
-- 
2.43.0


^ permalink raw reply related	[relevance 62%]
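
The commit message above says the FMR queue is skipped for user-level RC QPs, which is why the driver walks five kernel queue slots while the uapi structs carry only four entries. A minimal user-space sketch of that index mapping follows. It is not the driver code: the enum mirrors the order of mana_rc_queue_type from patch 1/3, while the names rc_queue_type, NO_USER_SLOT and map_kernel_to_user_slots() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same ordering as enum mana_rc_queue_type in patch 1/3 (names shortened). */
enum rc_queue_type {
	RC_SEND_QUEUE_REQUESTER = 0,
	RC_SEND_QUEUE_RESPONDER,
	RC_SEND_QUEUE_FMR,
	RC_RECV_QUEUE_REQUESTER,
	RC_RECV_QUEUE_RESPONDER,
	RC_QUEUE_TYPE_MAX,
};

#define USER_QUEUE_COUNT 4	/* size of queue_buf[] / queue_id[] in the uapi structs */
#define NO_USER_SLOT (-1)	/* hypothetical marker for "no user buffer" */

/* Every kernel slot except FMR must have exactly one user slot. */
static_assert(RC_QUEUE_TYPE_MAX - 1 == USER_QUEUE_COUNT,
	      "user array must cover all non-FMR queues");

/*
 * Fill map[i] with the user-side index j for kernel slot i, or
 * NO_USER_SLOT for the FMR slot. Returns the number of user slots used.
 */
static int map_kernel_to_user_slots(int map[RC_QUEUE_TYPE_MAX])
{
	int i, j = 0;

	for (i = 0; i < RC_QUEUE_TYPE_MAX; i++) {
		if (i == RC_SEND_QUEUE_FMR) {
			map[i] = NO_USER_SLOT;	/* skipped, as in the patch */
			continue;
		}
		map[i] = j++;
	}
	return j;
}
```

Calling map_kernel_to_user_slots() on a five-entry array yields 0, 1, NO_USER_SLOT, 2, 3, so the two receive queues land in user slots 2 and 3.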

* [PATCH rdma-next 0/3] RDMA/mana_ib: Add support of RC QPs
@ 2024-05-07  9:53 79% Konstantin Taranov
  2024-05-07  9:53 63% ` [PATCH rdma-next 1/3] RDMA/mana_ib: Create and destroy RC QP Konstantin Taranov
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Konstantin Taranov @ 2024-05-07  9:53 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

This patch series enables creation and destruction of RC QPs.
The RC QP can be transitioned to RTS and be used by rdma-core.

Later I will submit rdma-core patches with fully working RC QPs.

Konstantin Taranov (3):
  RDMA/mana_ib: Create and destroy RC QP
  RDMA/mana_ib: Implement uapi to create and destroy RC QP
  RDMA/mana_ib: Modify QP state

 drivers/infiniband/hw/mana/main.c    |  59 ++++++++++
 drivers/infiniband/hw/mana/mana_ib.h |  99 +++++++++++++++-
 drivers/infiniband/hw/mana/qp.c      | 165 ++++++++++++++++++++++++++-
 include/uapi/rdma/mana-abi.h         |   9 ++
 4 files changed, 328 insertions(+), 4 deletions(-)

-- 
2.43.0


^ permalink raw reply	[relevance 79%]

* [PATCH v4] fs/coredump: Enable dynamic configuration of max file note size
@ 2024-05-06 19:37 63% Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-05-06 19:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, linux-mm, viro, brauner, jack, ebiederm, keescook,
	mcgrof, j.granados, allen.lkml

Introduce the capability to dynamically configure the maximum file
note size for ELF core dumps via sysctl.

Why is this being done?
We have observed that during a crash when there are more than 65k mmaps
in memory, the existing fixed limit on the size of the ELF notes section
becomes a bottleneck. The notes section quickly reaches its capacity,
leading to incomplete memory segment information in the resulting coredump.
This truncation compromises the utility of the coredumps, as crucial
information about the memory state at the time of the crash might be
omitted.

This enhancement removes the previous static limit of 4MB, allowing
system administrators to adjust the size based on system-specific
requirements or constraints.

E.g.:
$ sysctl -a | grep core_file_note_size_min
kernel.core_file_note_size_min = 4194304

$ sysctl -n kernel.core_file_note_size_min
4194304

$ echo 519304 > /proc/sys/kernel/core_file_note_size_min

$ sysctl -n kernel.core_file_note_size_min
519304

Attempting to write beyond the ceiling value of 16MB:
$ echo 17194304 > /proc/sys/kernel/core_file_note_size_min
bash: echo: write error: Invalid argument

Signed-off-by: Vijay Nag <nagvijay@microsoft.com>
Signed-off-by: Allen Pais <apais@linux.microsoft.com>

---
Changes in v4:
   - Rename core_file_note_size_max to core_file_note_size_min [Kees]
   - Rename MAX_FILE_NOTE_SIZE to CORE_FILE_NOTE_SIZE_DEFAULT and
     MAX_ALLOWED_NOTE_SIZE to CORE_FILE_NOTE_SIZE_MAX [Kees]
   - change core_file_note_size_allowed to static and const [Kees]
Changes in v3:
   - Fix commit message to reflect the correct sysctl knob [Kees]
   - Add a ceiling for maximum possible note size (16M) [Allen]
   - Add a pr_warn_once() [Kees]
Changes in v2:
   - Move new sysctl to fs/coredump.c [Luis & Kees]
   - rename max_file_note_size to core_file_note_size_max [kees]
   - Capture "why this is being done?" in the commit message [Luis & Kees]
---
 fs/binfmt_elf.c          |  7 +++++--
 fs/coredump.c            | 15 +++++++++++++++
 include/linux/coredump.h |  1 +
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5397b552fbeb..4dc7eb265a97 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1564,7 +1564,6 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
 	fill_note(note, "CORE", NT_SIGINFO, sizeof(*csigdata), csigdata);
 }
 
-#define MAX_FILE_NOTE_SIZE (4*1024*1024)
 /*
  * Format of NT_FILE note:
  *
@@ -1592,8 +1591,12 @@ static int fill_files_note(struct memelfnote *note, struct coredump_params *cprm
 
 	names_ofs = (2 + 3 * count) * sizeof(data[0]);
  alloc:
-	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
+	/* paranoia check */
+	if (size >= core_file_note_size_min) {
+		pr_warn_once("coredump Note size too large: %u (does kernel.core_file_note_size_min sysctl need adjustment?)\n",
+			      size);
 		return -EINVAL;
+	}
 	size = round_up(size, PAGE_SIZE);
 	/*
 	 * "size" can be 0 here legitimately.
diff --git a/fs/coredump.c b/fs/coredump.c
index be6403b4b14b..20807c3c5477 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -56,10 +56,16 @@
 static bool dump_vma_snapshot(struct coredump_params *cprm);
 static void free_vma_snapshot(struct coredump_params *cprm);
 
+#define CORE_FILE_NOTE_SIZE_DEFAULT (4*1024*1024)
+/* Define a reasonable max cap */
+#define CORE_FILE_NOTE_SIZE_MAX (16*1024*1024)
+
 static int core_uses_pid;
 static unsigned int core_pipe_limit;
 static char core_pattern[CORENAME_MAX_SIZE] = "core";
 static int core_name_size = CORENAME_MAX_SIZE;
+static const unsigned int core_file_note_size_max = CORE_FILE_NOTE_SIZE_MAX;
+unsigned int core_file_note_size_min = CORE_FILE_NOTE_SIZE_DEFAULT;
 
 struct core_name {
 	char *corename;
@@ -1020,6 +1026,15 @@ static struct ctl_table coredump_sysctls[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname       = "core_file_note_size_min",
+		.data           = &core_file_note_size_min,
+		.maxlen         = sizeof(unsigned int),
+		.mode           = 0644,
+		.proc_handler	= proc_douintvec_minmax,
+		.extra1		= &core_file_note_size_min,
+		.extra2		= (unsigned int *) &core_file_note_size_max,
+	},
 };
 
 static int __init init_fs_coredump_sysctls(void)
diff --git a/include/linux/coredump.h b/include/linux/coredump.h
index d3eba4360150..f6be9fd2aea7 100644
--- a/include/linux/coredump.h
+++ b/include/linux/coredump.h
@@ -46,6 +46,7 @@ static inline void do_coredump(const kernel_siginfo_t *siginfo) {}
 #endif
 
 #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL)
+extern unsigned int core_file_note_size_min;
 extern void validate_coredump_safety(void);
 #else
 static inline void validate_coredump_safety(void) {}
-- 
2.17.1


^ permalink raw reply related	[relevance 63%]
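
To make the motivation in the commit message above concrete, here is a rough user-space sketch of the arithmetic and the two checks involved. It is not the kernel code. The fixed-size formula is taken from the names_ofs line visible in the diff context, and the 4 MiB and 16 MiB values come from the patch. The 64-bit word size, the average path length, the lower bound passed to set_note_limit(), and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NOTE_SIZE_DEFAULT (4u * 1024 * 1024)	/* default from the patch */
#define NOTE_SIZE_MAX     (16u * 1024 * 1024)	/* ceiling from the patch */

static_assert(NOTE_SIZE_DEFAULT < NOTE_SIZE_MAX,
	      "default must sit below the ceiling");

/*
 * Fixed part of an NT_FILE note: two header words plus three words
 * per mapping, as in "(2 + 3 * count) * sizeof(data[0])".
 * Assumes 64-bit words.
 */
static size_t nt_file_fixed_bytes(size_t count)
{
	return (2 + 3 * count) * sizeof(uint64_t);
}

/* Hypothetical estimate: fixed part plus an assumed average path length. */
static size_t nt_file_estimate(size_t count, size_t avg_path_len)
{
	return nt_file_fixed_bytes(count) + count * avg_path_len;
}

/* Same shape as the "size >= limit" paranoia check in fill_files_note(). */
static int check_note_size(size_t size, unsigned int limit)
{
	return size >= limit ? -EINVAL : 0;
}

/*
 * Range-checked update in the spirit of a min/max sysctl handler:
 * values outside [lower, NOTE_SIZE_MAX] are rejected and the limit is
 * left untouched. 'lower' is a hypothetical parameter.
 */
static int set_note_limit(unsigned int *limit, unsigned int val,
			  unsigned int lower)
{
	if (val < lower || val > NOTE_SIZE_MAX)
		return -EINVAL;
	*limit = val;
	return 0;
}
```

With 65536 mappings the fixed part alone is 1572880 bytes. Adding an assumed 64 bytes of path per mapping pushes the estimate past the 4 MiB default while staying under the 16 MiB ceiling, which is the situation the sysctl is meant to address.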

* Re: [PATCH v1 10/12] sfc: falcon: Make I2C terminology more inclusive
  @ 2024-05-06 15:54 76%     ` Easwar Hariharan
  0 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-06 15:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Edward Cree, Martin Habets, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, open list:SFC NETWORK DRIVER,
	open list:SFC NETWORK DRIVER, open list, Wolfram Sang,
	open list:RADEON and AMDGPU DRM DRIVERS, open list:DRM DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

On 5/3/2024 3:13 PM, Jakub Kicinski wrote:
> On Tue, 30 Apr 2024 17:38:09 +0000 Easwar Hariharan wrote:
>> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
>> with more appropriate terms. Inspired by and following on to Wolfram's
>> series to fix drivers/i2c/[1], fix the terminology for users of
>> I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
>> in the specification.
>>
>> Compile tested, no functionality changes intended
> 
> FWIW we're assuming someone (Wolfram?) will take all of these,
> instead of area maintainers picking them individually.
> Please let us know if that's incorrect.

I think, based on the trend in the v2 conversation[1], that's correct. If maintainers of
other areas disagree, please chime in.

Thanks,
Easwar

[1] https://lore.kernel.org/all/20240503181333.2336999-1-eahariha@linux.microsoft.com/

^ permalink raw reply	[relevance 76%]

* [PATCH v2] tools: hv: suppress the invalid warning for packed member alignment
@ 2024-05-06  5:38 79% Saurabh Sengar
  0 siblings, 0 replies; 200+ results
From: Saurabh Sengar @ 2024-05-06  5:38 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, gregkh, linux-kernel, linux-hyperv
  Cc: ssengar, maryhardy, longli

Packed struct vmbus_bufring is 4096-byte aligned, and the reported
warning is for the first member of that struct, which shouldn't add
any offset that would create an alignment issue.

Suppress the warning by adding the -Wno-address-of-packed-member flag
to gcc.

Fixes: 45bab4d74651 ("tools: hv: Add vmbus_bufring")
Reported-by: kernel test robot <yujie.liu@intel.com>
Closes: https://lore.kernel.org/all/202404121913.GhtSoKbW-lkp@intel.com/
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
---
[V2] Added 'Fixes' tag

 tools/hv/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index bb52871..2e60e2c 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -17,6 +17,7 @@ endif
 MAKEFLAGS += -r
 
 override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
+override CFLAGS += -Wno-address-of-packed-member
 
 ALL_TARGETS := hv_kvp_daemon hv_vss_daemon
 ifneq ($(ARCH), aarch64)
-- 
1.8.3.1


^ permalink raw reply related	[relevance 79%]
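
The argument in the commit message above (the first member of an aligned, packed struct sits at offset 0, so its address inherits the struct's alignment) can be checked with a small stand-alone sketch. The struct below is a hypothetical stand-in, not the real vmbus_bufring layout, and the attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_ALIGN 4096

/* Hypothetical stand-in for a packed, page-aligned ring header. */
struct demo_ring {
	uint32_t write_index;	/* first member: offset 0 */
	uint32_t read_index;
	uint8_t  reserved[RING_ALIGN - 8];
} __attribute__((packed, aligned(RING_ALIGN)));

static_assert(offsetof(struct demo_ring, write_index) == 0,
	      "first member must start the struct");

/*
 * Returns 1 if the first member's address satisfies uint32_t alignment.
 * The address is computed from the struct pointer plus offsetof(), so
 * no pointer to a packed member is ever formed.
 */
static int first_member_aligned(const struct demo_ring *ring)
{
	uintptr_t addr = (uintptr_t)ring +
			 offsetof(struct demo_ring, write_index);

	return addr % _Alignof(uint32_t) == 0;
}
```

Members at non-zero offsets in a packed struct get no such guarantee in general, which is why the compiler warning exists in the first place.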

* Re: [PATCH v18 20/21] Documentation: add ipe documentation
  @ 2024-05-04 20:13 79%     ` Fan Wu
  0 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-04 20:13 UTC (permalink / raw)
  To: Bagas Sanjaya, corbet, zohar, jmorris, serge, tytso, ebiggers,
	axboe, agk, snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers



On 5/4/2024 1:04 AM, Bagas Sanjaya wrote:
> On Fri, May 03, 2024 at 03:32:30PM -0700, Fan Wu wrote:
>> +IPE does not mitigate threats arising from malicious but authorized
>> +developers (with access to a signing certificate), or compromised
>> +developer tools used by them (i.e. return-oriented programming attacks).
>> +Additionally, IPE draws hard security boundary between userspace and
>> +kernelspace. As a result, IPE does not provide any protections against a
>> +kernel level exploit, and a kernel-level exploit can disable or tamper
>> +with IPE's protections.
> 
> So how to mitigate kernel-level exploits then?
>
One possible way is to use a hypervisor to protect kernel integrity.
https://github.com/heki-linux is one project in this direction. Perhaps
I should also add this link to the doc.

>> +Allow only initramfs
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> <snipped>...
>> +Allow any signed and validated dm-verity volume and the initramfs
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> <snipped>...
> 
> htmldocs build reports new warnings:
> 
> Documentation/admin-guide/LSM/ipe.rst:694: WARNING: Title underline too short.
> 
> Allow any signed and validated dm-verity volume and the initramfs
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Documentation/admin-guide/LSM/ipe.rst:694: WARNING: Title underline too short.
> 
> Allow any signed and validated dm-verity volume and the initramfs
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Documentation/arch/x86/resctrl.rst:577: WARNING: Title underline too short.
> 
> I have to match these sections underline length:
> 
> ---- >8 ----
> diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
> index 1a3bf1d8aa23f0..a47e14e024a90d 100644
> --- a/Documentation/admin-guide/LSM/ipe.rst
> +++ b/Documentation/admin-guide/LSM/ipe.rst
> @@ -681,7 +681,7 @@ Allow all
>      DEFAULT action=ALLOW
>   
>   Allow only initramfs
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +~~~~~~~~~~~~~~~~~~~~
>   
>   ::
>   
> @@ -691,7 +691,7 @@ Allow only initramfs
>      op=EXECUTE boot_verified=TRUE action=ALLOW
>   
>   Allow any signed and validated dm-verity volume and the initramfs
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   
>   ::
>   
> @@ -725,7 +725,7 @@ Allow only a specific dm-verity volume
>      op=EXECUTE dmverity_roothash=sha256:401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW
>   
>   Allow any fs-verity file with a valid built-in signature
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   
>   ::
>   
> @@ -735,7 +735,7 @@ Allow any fs-verity file with a valid built-in signature
>      op=EXECUTE fsverity_signature=TRUE action=ALLOW
>   
>   Allow execution of a specific fs-verity file
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   
>   ::
>   
> 
>> +Additional Information
>> +----------------------
>> +
>> +- `Github Repository <https://github.com/microsoft/ipe>`_
>> +- Documentation/security/ipe.rst
> 
> Link title to both this admin-side and developer docs can be added for
> disambiguation (to avoid confusion on readers):
> 
> ---- >8 ----
> diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
> index a47e14e024a90d..25b17e11559149 100644
> --- a/Documentation/admin-guide/LSM/ipe.rst
> +++ b/Documentation/admin-guide/LSM/ipe.rst
> @@ -7,7 +7,8 @@ Integrity Policy Enforcement (IPE)
>   
>      This is the documentation for admins, system builders, or individuals
>      attempting to use IPE. If you're looking for more developer-focused
> -   documentation about IPE please see Documentation/security/ipe.rst
> +   documentation about IPE please see :doc:`the design docs
> +   </security/ipe>`.
>   
>   Overview
>   --------
> @@ -748,7 +749,7 @@ Additional Information
>   ----------------------
>   
>   - `Github Repository <https://github.com/microsoft/ipe>`_
> -- Documentation/security/ipe.rst
> +- :doc:`Developer and design docs for IPE </security/ipe>`
>   
>   FAQ
>   ---
> diff --git a/Documentation/security/ipe.rst b/Documentation/security/ipe.rst
> index 07e3632241285d..fd1b1a852d2165 100644
> --- a/Documentation/security/ipe.rst
> +++ b/Documentation/security/ipe.rst
> @@ -7,7 +7,7 @@ Integrity Policy Enforcement (IPE) - Kernel Documentation
>   
>      This is documentation targeted at developers, instead of administrators.
>      If you're looking for documentation on the usage of IPE, please see
> -   Documentation/admin-guide/LSM/ipe.rst
> +   `IPE admin guide </admin-guide/LSM/ipe.rst>`_.
>   
>   Historical Motivation
>   ---------------------
> 
> Thanks.
> 

My apologies for these formatting issues, and thanks for the suggestions.
I will fix them.
-Fan

^ permalink raw reply	[relevance 79%]
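
The "Title underline too short" warnings quoted above are raised when a reStructuredText section underline is shorter than its title. A minimal sketch of that length rule follows, assuming ASCII titles so that byte length equals column width. Both helper names are hypothetical and this is not how docutils implements the check.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 when the underline is at least as long as the title,
 * 0 otherwise. Assumes ASCII text.
 */
static int rst_underline_ok(const char *title, const char *underline)
{
	assert(title && underline);
	return strlen(underline) >= strlen(title);
}

/* Number of extra underline characters needed (0 if long enough). */
static size_t rst_underline_deficit(const char *title, const char *underline)
{
	size_t t, u;

	assert(title && underline);
	t = strlen(title);
	u = strlen(underline);
	return u >= t ? 0 : t - u;
}
```

An underline longer than its title passes this check, so the hunks in the quoted diff that shorten over-long underlines are tidying rather than warning fixes.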

* [PATCH v18 21/21] MAINTAINERS: ipe: add ipe maintainer information
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (19 preceding siblings ...)
  2024-05-03 22:32 12% ` [PATCH v18 20/21] Documentation: add ipe documentation Fan Wu
@ 2024-05-03 22:32 77% ` Fan Wu
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

Update MAINTAINERS to include ipe maintainer information.

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

--
v1-v16:
  + Not present

v17:
  + Introduced

v18:
  + No changes
---
 MAINTAINERS | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index ec0284125e8f..def7116eba7b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10742,6 +10742,16 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git
 F:	security/integrity/
 F:	security/integrity/ima/
 
+INTEGRITY POLICY ENFORCEMENT (IPE)
+M:	Fan Wu <wufan@linux.microsoft.com>
+L:	linux-security-module@vger.kernel.org
+S:	Supported
+T:	git https://github.com/microsoft/ipe.git
+F:	Documentation/admin-guide/LSM/ipe.rst
+F:	Documentation/security/ipe.rst
+F:	scripts/ipe/
+F:	security/ipe/
+
 INTEL 810/815 FRAMEBUFFER DRIVER
 M:	Antonino Daplas <adaplas@gmail.com>
 L:	linux-fbdev@vger.kernel.org
-- 
2.44.0


^ permalink raw reply related	[relevance 77%]

* [PATCH v18 19/21] ipe: kunit test for parser
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (17 preceding siblings ...)
  2024-05-03 22:32 47% ` [PATCH v18 18/21] scripts: add boot policy generation program Fan Wu
@ 2024-05-03 22:32 48% ` Fan Wu
  2024-05-03 22:32 12% ` [PATCH v18 20/21] Documentation: add ipe documentation Fan Wu
  2024-05-03 22:32 77% ` [PATCH v18 21/21] MAINTAINERS: ipe: add ipe maintainer information Fan Wu
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Add various happy/unhappy unit tests for IPE's policy parser.

In addition, a test suite for IPE functionality is available at
https://github.com/microsoft/ipe/tree/test-suite

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v6:
  + Not present

v7:
  + Introduced

v8:
  + Remove the kunit tests with respect to the fsverity digest, as these
    require significant changes to work with the new method of acquiring
    the digest at runtime.

v9:
  + Remove the kunit tests related to ipe_context

v10:
  + No changes

v11:
  + No changes

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
---
 security/ipe/Kconfig        |  17 +++
 security/ipe/Makefile       |   3 +
 security/ipe/policy_tests.c | 296 ++++++++++++++++++++++++++++++++++++
 3 files changed, 316 insertions(+)
 create mode 100644 security/ipe/policy_tests.c

diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index 7a82778f93ae..6c4677d1880e 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -76,4 +76,21 @@ config IPE_PROP_FS_VERITY_BUILTIN_SIG
 
 endmenu
 
+config SECURITY_IPE_KUNIT_TEST
+	bool "Build KUnit tests for IPE" if !KUNIT_ALL_TESTS
+	depends on KUNIT=y
+	default KUNIT_ALL_TESTS
+	help
+	  This builds the IPE KUnit tests.
+
+	  KUnit tests run during boot and output the results to the debug log
+	  in TAP format (https://testanything.org/). Only useful for kernel devs
+	  running KUnit test harness and are not for inclusion into a
+	  production build.
+
+	  For more information on KUnit and unit tests in general please refer
+	  to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+	  If unsure, say N.
+
 endif
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 84ad76556170..5125b8357e2f 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -26,3 +26,6 @@ obj-$(CONFIG_SECURITY_IPE) += \
 	audit.o \
 
 clean-files := boot-policy.c \
+
+obj-$(CONFIG_SECURITY_IPE_KUNIT_TEST) += \
+	policy_tests.o \
diff --git a/security/ipe/policy_tests.c b/security/ipe/policy_tests.c
new file mode 100644
index 000000000000..89521f6b9994
--- /dev/null
+++ b/security/ipe/policy_tests.c
@@ -0,0 +1,296 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <kunit/test.h>
+#include "policy.h"
+struct policy_case {
+	const char *const policy;
+	int errno;
+	const char *const desc;
+};
+
+static const struct policy_case policy_cases[] = {
+	{
+		"policy_name=allowall policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW",
+		0,
+		"basic",
+	},
+	{
+		"policy_name=trailing_comment policy_version=152.0.0 #This is comment\n"
+		"DEFAULT action=ALLOW",
+		0,
+		"trailing comment",
+	},
+	{
+		"policy_name=allowallnewline policy_version=0.2.0\n"
+		"DEFAULT action=ALLOW\n"
+		"\n",
+		0,
+		"trailing newline",
+	},
+	{
+		"policy_name=carriagereturnlinefeed policy_version=0.0.1\n"
+		"DEFAULT action=ALLOW\n"
+		"\r\n",
+		0,
+		"crlf newline",
+	},
+	{
+		"policy_name=whitespace policy_version=0.0.0\n"
+		"DEFAULT\taction=ALLOW\n"
+		"     \t     DEFAULT \t    op=EXECUTE      action=DENY\n"
+		"op=EXECUTE boot_verified=TRUE action=ALLOW\n"
+		"# this is a\tcomment\t\t\t\t\n"
+		"DEFAULT \t op=KMODULE\t\t\t  action=DENY\r\n"
+		"op=KMODULE boot_verified=TRUE action=ALLOW\n",
+		0,
+		"various whitespaces and nested default",
+	},
+	{
+		"policy_name=boot_verified policy_version=-1236.0.0\n"
+		"DEFAULT\taction=ALLOW\n",
+		-EINVAL,
+		"negative version",
+	},
+	{
+		"policy_name=$@!*&^%%\\:;{}() policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW",
+		0,
+		"special characters",
+	},
+	{
+		"policy_name=test policy_version=999999.0.0\n"
+		"DEFAULT action=ALLOW",
+		-ERANGE,
+		"overflow version",
+	},
+	{
+		"policy_name=test policy_version=255.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"incomplete version",
+	},
+	{
+		"policy_name=test policy_version=111.0.0.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"extra version",
+	},
+	{
+		"",
+		-EBADMSG,
+		"0-length policy",
+	},
+	{
+		"policy_name=test\0policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"random null in header",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"\0DEFAULT action=ALLOW",
+		-EBADMSG,
+		"incomplete policy from NULL",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=DENY\n\0"
+		"op=EXECUTE dmverity_signature=TRUE action=ALLOW\n",
+		0,
+		"NULL truncates policy",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_signature=abc action=ALLOW",
+		-EBADMSG,
+		"invalid property type",
+	},
+	{
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"missing policy header",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n",
+		-EBADMSG,
+		"missing default definition",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"dmverity_signature=TRUE op=EXECUTE action=ALLOW",
+		-EBADMSG,
+		"invalid rule ordering"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"action=ALLOW op=EXECUTE dmverity_signature=TRUE",
+		-EBADMSG,
+		"invalid rule ordering (2)",
+	},
+	{
+		"policy_name=test policy_version=0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_signature=TRUE action=ALLOW",
+		-EBADMSG,
+		"invalid version",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=UNKNOWN dmverity_signature=TRUE action=ALLOW",
+		-EBADMSG,
+		"unknown operation",
+	},
+	{
+		"policy_name=asdvpolicy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n",
+		-EBADMSG,
+		"missing space after policy name",
+	},
+	{
+		"policy_name=test\xFF\xEF policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_signature=TRUE action=ALLOW",
+		0,
+		"expanded ascii",
+	},
+	{
+		"policy_name=test\xFF\xEF policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_roothash=GOOD_DOG action=ALLOW",
+		-EBADMSG,
+		"invalid property value (2)",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"policy_name=test policy_version=0.1.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"double header"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT action=ALLOW\n",
+		-EBADMSG,
+		"double default"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT op=EXECUTE action=DENY\n"
+		"DEFAULT op=EXECUTE action=ALLOW\n",
+		-EBADMSG,
+		"double operation default"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT op=EXECUTE action=DEN\n",
+		-EBADMSG,
+		"invalid action value"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT op=EXECUTE action\n",
+		-EBADMSG,
+		"invalid action value (2)"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"UNKNOWN value=true\n",
+		-EBADMSG,
+		"unrecognized statement"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_roothash=1c0d7ee1f8343b7fbe418378e8eb22c061d7dec7 action=DENY\n",
+		-EBADMSG,
+		"old-style digest"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE fsverity_digest=1c0d7ee1f8343b7fbe418378e8eb22c061d7dec7 action=DENY\n",
+		-EBADMSG,
+		"old-style digest"
+	}
+};
+
+static void pol_to_desc(const struct policy_case *c, char *desc)
+{
+	strscpy(desc, c->desc, KUNIT_PARAM_DESC_SIZE);
+}
+
+KUNIT_ARRAY_PARAM(ipe_policies, policy_cases, pol_to_desc);
+
+/**
+ * ipe_parser_unsigned_test - Test the parser by passing unsigned policies.
+ * @test: Supplies a pointer to a kunit structure.
+ *
+ * This is called by the kunit harness. This test does not check the correctness
+ * of the policy, but ensures that errors are handled correctly.
+ */
+static void ipe_parser_unsigned_test(struct kunit *test)
+{
+	const struct policy_case *p = test->param_value;
+	struct ipe_policy *pol;
+
+	pol = ipe_new_policy(p->policy, strlen(p->policy), NULL, 0);
+
+	if (p->errno) {
+		KUNIT_EXPECT_EQ(test, PTR_ERR(pol), p->errno);
+		return;
+	}
+
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pol);
+	KUNIT_EXPECT_NOT_ERR_OR_NULL(test, pol->parsed);
+	KUNIT_EXPECT_STREQ(test, pol->text, p->policy);
+	KUNIT_EXPECT_PTR_EQ(test, NULL, pol->pkcs7);
+	KUNIT_EXPECT_EQ(test, 0, pol->pkcs7len);
+
+	ipe_free_policy(pol);
+}
+
+/**
+ * ipe_parser_widestring_test - Ensure parser fails on a wide string policy.
+ * @test: Supplies a pointer to a kunit structure.
+ *
+ * This is called by the kunit harness.
+ */
+static void ipe_parser_widestring_test(struct kunit *test)
+{
+	const unsigned short policy[] = L"policy_name=Test policy_version=0.0.0\n"
+					L"DEFAULT action=ALLOW";
+	struct ipe_policy *pol = NULL;
+
+	pol = ipe_new_policy((const char *)policy, (ARRAY_SIZE(policy) - 1) * 2, NULL, 0);
+	KUNIT_EXPECT_TRUE(test, IS_ERR_OR_NULL(pol));
+
+	ipe_free_policy(pol);
+}
+
+static struct kunit_case ipe_parser_test_cases[] = {
+	KUNIT_CASE_PARAM(ipe_parser_unsigned_test, ipe_policies_gen_params),
+	KUNIT_CASE(ipe_parser_widestring_test), {} /* terminator */
+};
+
+static struct kunit_suite ipe_parser_test_suite = {
+	.name = "ipe-parser",
+	.test_cases = ipe_parser_test_cases,
+};
+
+kunit_test_suite(ipe_parser_test_suite);
-- 
2.44.0
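
The version-related cases in policy_cases above imply a validation order
for policy_version. The following is a minimal user-space sketch of that
behaviour in Python, not the kernel parser itself. check_version() is a
hypothetical name, the 16-bit limit per component is an assumption
inferred from the "overflow version" case, and returning -EINVAL for
non-numeric components is likewise an assumption.

```python
import errno

U16_MAX = 0xFFFF  # assumption: each version component fits in 16 bits


def check_version(text):
    """Return 0 or a negative errno, mirroring the test table's expectations."""
    parts = text.split(".")
    # "255.0" (incomplete) and "111.0.0.0" (extra) are both -EBADMSG.
    if len(parts) != 3:
        return -errno.EBADMSG
    for part in parts:
        # "-1236.0.0" is rejected with -EINVAL; other non-numeric input
        # is assumed to be rejected the same way.
        if not (part.isascii() and part.isdigit()):
            return -errno.EINVAL
        # "999999.0.0" is rejected with -ERANGE.
        if int(part) > U16_MAX:
            return -errno.ERANGE
    return 0
```

For example, check_version("152.0.0") returns 0, matching the "trailing
comment" case, while check_version("255.0") returns -errno.EBADMSG.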


^ permalink raw reply related	[relevance 48%]

* [PATCH v18 20/21] Documentation: add ipe documentation
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (18 preceding siblings ...)
  2024-05-03 22:32 48% ` [PATCH v18 19/21] ipe: kunit test for parser Fan Wu
@ 2024-05-03 22:32 12% ` Fan Wu
    2024-05-03 22:32 77% ` [PATCH v18 21/21] MAINTAINERS: ipe: add ipe maintainer information Fan Wu
  20 siblings, 1 reply; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Add IPE's admin and developer documentation to the kernel tree.

Co-developed-by: Fan Wu <wufan@linux.microsoft.com>
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
---
v2:
  + No Changes

v3:
  + Add Acked-by
  + Fixup code block syntax
  + Fix a minor grammatical issue.

v4:
  + Update documentation with the results of other
    code changes.

v5:
  + No changes

v6:
  + No changes

v7:
  + Add additional developer-level documentation
  + Update admin-guide docs to reflect changes.
  + Drop Acked-by due to significant changes
  + Added section about audit events in admin-guide

v8:
  + Correct terminology from "audit event" to "audit record"
  + Add associated documentation with the correct "audit event"
    terminology.
  + Add some context to the historical motivation for IPE and design
    philosophy.
  + Add some content about the securityfs layout in the policies
    directory.
  + Various spelling and grammatical corrections.

v9:
  + Correct spelling of "pitfalls"
  + Update the docs w.r.t the new parser and new audit formats

v10:
  + Refine user docs per upstream suggestions
  + Update audit events part

v11:
  + No changes

v12:
  + Update audit formats
  + Update initramfs related docs
  + Add test suite link

v13:
  + No changes

v14:
  + No changes

v15:
  + Update boot_verified part
  + Fix format issues
  + Add IPE doc link to fsverity.rst

v16:
  + Explicitly mention fsverity builtin signature

v17:
  + Rewrite many parts of Documentation/admin-guide/LSM/ipe.rst
  + Fix incorrect path name of policyfs interfaces

v18:
  + Improve policy examples
  + Remove insecure hash algorithms and adapt the documentation
    accordingly
  + Update the documentation regarding the new Kconfig switches
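
Note for reviewers: the admin guide in this patch describes rules as
evaluated top-to-bottom, with a global DEFAULT that per-operation DEFAULT
statements can override. The following is a minimal sketch of that
evaluation model in Python, not IPE's implementation. Rule and evaluate()
are hypothetical names, matching properties by string equality is an
assumption, and the roothash value is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Rule:
    """One policy rule: an operation, zero or more properties, an action."""
    op: str
    action: str
    props: Dict[str, str] = field(default_factory=dict)


def evaluate(rules: List[Rule], defaults: Dict[Optional[str], str],
             op: str, file_props: Dict[str, str]) -> str:
    """Return the action of the first matching rule, else the default."""
    for rule in rules:  # rules are walked top-to-bottom
        if rule.op != op:
            continue
        # Assumption: a rule matches when every property it names is equal.
        if all(file_props.get(k) == v for k, v in rule.props.items()):
            return rule.action
    # A per-operation default overrides the global default (key None).
    return defaults.get(op, defaults[None])


# The deny rule is placed first so it is seen before the broader allow rule.
rules = [
    Rule("EXECUTE", "DENY", {"dmverity_roothash": "sha256:deadbeef"}),
    Rule("EXECUTE", "ALLOW", {"dmverity_signature": "TRUE"}),
]
defaults = {None: "ALLOW", "EXECUTE": "DENY"}
```

With these rules, a file on a signed dm-verity volume is allowed to
execute unless its roothash matches the revoked placeholder, and an
operation with no specific default falls back to the global ALLOW.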
---
 Documentation/admin-guide/LSM/index.rst       |   1 +
 Documentation/admin-guide/LSM/ipe.rst         | 792 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  12 +
 Documentation/filesystems/fsverity.rst        |   5 +-
 Documentation/security/index.rst              |   1 +
 Documentation/security/ipe.rst                | 446 ++++++++++
 6 files changed, 1256 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/LSM/ipe.rst
 create mode 100644 Documentation/security/ipe.rst

diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index a6ba95fbaa9f..ce63be6d64ad 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -47,3 +47,4 @@ subdirectories.
    tomoyo
    Yama
    SafeSetID
+   ipe
diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
new file mode 100644
index 000000000000..1a3bf1d8aa23
--- /dev/null
+++ b/Documentation/admin-guide/LSM/ipe.rst
@@ -0,0 +1,792 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Integrity Policy Enforcement (IPE)
+==================================
+
+.. NOTE::
+
+   This is the documentation for admins, system builders, or individuals
+   attempting to use IPE. If you're looking for more developer-focused
+   documentation about IPE please see Documentation/security/ipe.rst
+
+Overview
+--------
+
+Integrity Policy Enforcement (IPE) is a Linux Security Module that takes a
+complementary approach to access control. Unlike traditional access control
+mechanisms that rely on labels and paths for decision-making, IPE focuses
+on the immutable security properties inherent to system components. These
+properties are fundamental attributes or features of a system component
+that cannot be altered, ensuring a consistent and reliable basis for
+security decisions.
+
+To elaborate, in the context of IPE, system components primarily refer to
+files or the devices these files reside on. However, this is just a
+starting point. The concept of system components is flexible and can be
+extended to include new elements as the system evolves. The immutable
+properties include the origin of a file, which remains constant and
+unchangeable over time. For example, IPE policies can be crafted to trust
+files originating from the initramfs. Since initramfs is typically verified
+by the bootloader, its files are deemed trustworthy; "file is from
+initramfs" becomes an immutable property under IPE's consideration.
+
+The immutable property concept extends to the security features enabled on
+a file's origin, such as dm-verity or fs-verity, which provide a layer of
+integrity and trust. For example, IPE allows the definition of policies
+that trust files from a dm-verity protected device. dm-verity ensures the
+integrity of an entire device by providing a verifiable and immutable state
+of its contents. Similarly, fs-verity offers filesystem-level integrity
+checks, allowing IPE to enforce policies that trust files protected by
+fs-verity. These two features cannot be turned off once established, so
+they are considered immutable properties. These examples demonstrate how
+IPE leverages immutable properties, such as a file's origin and its
+integrity protection mechanisms, to make access control decisions.
+
+For the IPE policy, specifically, it grants the ability to enforce
+stringent access controls by assessing security properties against
+reference values defined within the policy. This assessment can be based on
+the existence of a security property (e.g., verifying if a file originates
+from initramfs) or evaluating the internal state of an immutable security
+property. The latter includes checking the roothash of a dm-verity
+protected device, determining whether dm-verity possesses a valid
+signature, assessing the digest of a fs-verity protected file, or
+determining whether fs-verity possesses a valid built-in signature. This
+nuanced approach to policy enforcement enables a highly secure and
+customizable system defense mechanism, tailored to specific security
+requirements and trust models.
+
+To enable IPE, ensure that ``CONFIG_SECURITY_IPE`` (under
+:menuselection:`Security -> Integrity Policy Enforcement (IPE)`) config
+option is enabled.
+
+Use Cases
+---------
+
+IPE works best in fixed-function devices: devices whose purpose is
+clearly defined and not supposed to change (e.g. a network firewall
+device in a data center, an IoT device, etc.), where all software and
+configuration is built and provisioned by the system owner.
+
+IPE is a long way off from use in general-purpose computing: the Linux
+community as a whole tends to follow a decentralized trust model (known as
+the web of trust), which IPE does not yet support. Instead, IPE
+supports PKI (public key infrastructure), which generally designates a
+set of trusted entities that provide a measure of absolute trust.
+
+Additionally, while most packages are signed today, the files inside
+the packages (for instance, the executables), tend to be unsigned. This
+makes it difficult to utilize IPE in systems where a package manager is
+expected to be functional, without major changes to the package manager
+and ecosystem behind it.
+
+The digest_cache LSM [#digest_cache_lsm]_ is a system that, when combined with IPE,
+could be used to enable and support general-purpose computing use cases.
+
+Known Limitations
+-----------------
+
+IPE cannot verify the integrity of anonymous executable memory, such as
+the trampolines created by gcc closures and libffi (<3.4.2), or JIT'd code.
+Unfortunately, as this is dynamically generated code, there is no way
+for IPE to ensure the integrity of this code to form a trust basis. In all
+cases, the return result for these operations will be whatever the admin
+configures as the ``DEFAULT`` action for ``EXECUTE``.
+
+IPE cannot verify the integrity of programs written in interpreted
+languages when these scripts are invoked by passing these program files
+to the interpreter. This is because of the way interpreters execute these
+files; the scripts themselves are not evaluated as executable code
+through one of IPE's hooks, but they are merely text files that are read
+(as opposed to compiled executables) [#interpreters]_.
+
+Threat Model
+------------
+
+IPE specifically targets the risk of tampering with user-space executable
+code after the kernel has initially booted, including the kernel modules
+loaded from userspace via ``modprobe`` or ``insmod``.
+
+To illustrate, consider a scenario where an untrusted binary, possibly
+malicious, is downloaded along with all necessary dependencies, including a
+loader and libc. The primary function of IPE in this context is to prevent
+the execution of such binaries and their dependencies.
+
+IPE achieves this by verifying the integrity and authenticity of all
+executable code before allowing it to run. It conducts a thorough
+check to ensure that the code's integrity is intact and that it matches an
+authorized reference value (digest, signature, etc) as per the defined
+policy. If a binary does not pass this verification process, either
+because its integrity has been compromised or it does not meet the
+authorization criteria, IPE will deny its execution. Additionally, IPE
+generates audit logs which may be utilized to detect and analyze failures
+resulting from policy violation.
+
+Tampering threat scenarios include modification or replacement of
+executable code by a range of actors including:
+
+-  Actors with physical access to the hardware
+-  Actors with local network access to the system
+-  Actors with access to the deployment system
+-  Compromised internal systems under external control
+-  Malicious end users of the system
+-  Compromised end users of the system
+-  Remote (external) compromise of the system
+
+IPE does not mitigate threats arising from malicious but authorized
+developers (with access to a signing certificate), or compromised
+developer tools used by them (i.e. return-oriented programming attacks).
+Additionally, IPE draws a hard security boundary between userspace and
+kernelspace. As a result, IPE does not provide any protections against a
+kernel level exploit, and a kernel-level exploit can disable or tamper
+with IPE's protections.
+
+Policy
+------
+
+IPE policy is a plain-text [#devdoc]_ policy composed of multiple statements
+over several lines. There is one required line, at the top of the
+policy, indicating the policy name, and the policy version, for
+instance::
+
+   policy_name=Ex_Policy policy_version=0.0.0
+
+The policy name is a unique key identifying this policy in a human
+readable name. This is used to create nodes under securityfs as well as
+uniquely identify policies to deploy new policies vs update existing
+policies.
+
+The policy version indicates the current version of the policy (NOT the
+policy syntax version). This is used to prevent rollback of policy to
+potentially insecure previous versions of the policy.
+
+The next portion of an IPE policy is its rules. Rules are formed by key=value
+pairs, known as properties. IPE rules require two properties: ``action``,
+which determines what IPE does when it encounters a match against the
+rule, and ``op``, which determines when the rule should be evaluated.
+The ordering is significant: a rule must start with ``op``, and end with
+``action``. Thus, a minimal rule is::
+
+   op=EXECUTE action=ALLOW
+
+This example will allow any execution. Additional properties are used to
+assess immutable security properties about the files being evaluated.
+These properties are intended to be descriptions of systems within the
+kernel that can provide a measure of integrity verification, such that IPE
+can determine the trust of the resource based on the value of the property.
+
+Rules are evaluated top-to-bottom. As a result, any revocation rules,
+or denies should be placed early in the file to ensure that these rules
+are evaluated before a rule with ``action=ALLOW``.
+
+IPE policy supports comments. The character '#' will function as a
+comment, ignoring all characters to the right of '#' until the newline.
+
+The default behavior of IPE evaluations can also be expressed in policy,
+through the ``DEFAULT`` statement. This can be done at a global level,
+or a per-operation level::
+
+   # Global
+   DEFAULT action=ALLOW
+
+   # Operation Specific
+   DEFAULT op=EXECUTE action=ALLOW
+
+A default must be set for all known operations in IPE. To keep older
+policies compatible with newer kernels that may introduce
+new operations, set a global default of ``ALLOW``, then override the
+defaults on a per-operation basis (as above).
+
+With configurable policy-based LSMs, there are several issues with
+enforcing the configurable policies at startup, around reading and
+parsing the policy:
+
+1. The kernel *should* not read files from userspace, so directly reading
+   the policy file is prohibited.
+2. The kernel command line has a character limit, and one kernel module
+   should not reserve the entire character limit for its own
+   configuration.
+3. There are various boot loaders in the kernel ecosystem, so handing
+   off a memory block would be costly to maintain.
+
+As a result, IPE has addressed this problem through a concept of a "boot
+policy". A boot policy is a minimal policy which is compiled into the
+kernel. This policy is intended to get the system to a state where
+userspace is set up and ready to receive commands, at which point a more
+complex policy can be deployed via securityfs. The boot policy can be
+specified via ``SECURITY_IPE_BOOT_POLICY`` config option, which accepts
+a path to a plain-text version of the IPE policy to apply. This policy
+will be compiled into the kernel. If not specified, IPE will be disabled
+until a policy is deployed and activated through securityfs.
+
+Deploying Policies
+~~~~~~~~~~~~~~~~~~
+
+Policies can be deployed from userspace through securityfs. These policies
+are signed through the PKCS#7 message format to enforce some level of
+authorization of the policies (prohibiting an attacker from gaining
+unconstrained root, and deploying an "allow all" policy). These
+policies must be signed by a certificate that chains to the
+``SYSTEM_TRUSTED_KEYRING``. With openssl, the policy can be signed by::
+
+   openssl smime -sign \
+      -in "$MY_POLICY" \
+      -signer "$MY_CERTIFICATE" \
+      -inkey "$MY_PRIVATE_KEY" \
+      -noattr \
+      -nodetach \
+      -nosmimecap \
+      -outform der \
+      -out "$MY_POLICY.p7b"
+
+Deploying the policies is done through securityfs, through the
+``new_policy`` node. To deploy a policy, simply cat the file into the
+securityfs node::
+
+   cat "$MY_POLICY.p7b" > /sys/kernel/security/ipe/new_policy
+
+Upon success, this will create one subdirectory under
+``/sys/kernel/security/ipe/policies/``. The subdirectory will be the
+``policy_name`` field of the policy deployed, so for the example above,
+the directory will be ``/sys/kernel/security/ipe/policies/Ex_Policy``.
+Within this directory, there will be seven files: ``pkcs7``, ``policy``,
+``name``, ``version``, ``active``, ``update``, and ``delete``.
+
+The ``pkcs7`` file is read-only. Reading it returns the raw PKCS#7 data
+that was provided to the kernel, representing the policy. If the policy being
+read is the boot policy, this will return ``ENOENT``, as it is not signed.
+
+The ``policy`` file is read-only. Reading it returns the PKCS#7 inner
+content of the policy, which will be the plain text policy.
+
+The ``active`` file is used to set a policy as the currently active policy.
+This file is rw, and accepts a value of ``"1"`` to set the policy as active.
+Since only a single policy can be active at one time, all other policies
+will be marked inactive. The policy being marked active must have a policy
+version greater than or equal to the currently-running version.
+
+The ``update`` file is used to update a policy that is already present
+in the kernel. This file is write-only and accepts a PKCS#7 signed
+policy. Two checks will always be performed on this policy: First, the
+``policy_name`` of the updated version must match that of the existing
+version. Second, the updated policy must have a policy version greater than
+or equal to the currently-running version. This is to prevent rollback attacks.
+
+The ``delete`` file is used to remove a policy that is no longer needed.
+This file is write-only and accepts a value of ``1`` to delete the policy.
+On deletion, the securityfs node representing the policy will be removed.
+However, deleting the currently active policy is not allowed and will return
+an operation not permitted error.
+
+Similarly, writing to either ``update`` or ``new_policy`` could result in a
+bad message (policy syntax error) or file exists error. The latter error happens
+when trying to deploy a policy with a ``policy_name`` while the kernel already
+has a deployed policy with the same ``policy_name``.
+
+Deploying a policy will *not* cause IPE to start enforcing the policy. IPE will
+only enforce the policy marked active. Note that only one policy can be active
+at a time.
+
+Once deployment is successful, the policy can be activated by writing to the file
+``/sys/kernel/security/ipe/policies/$policy_name/active``.
+For example, the ``Ex_Policy`` can be activated by::
+
+   echo 1 > "/sys/kernel/security/ipe/policies/Ex_Policy/active"
+
+From this point on, ``Ex_Policy`` is the enforced policy on the
+system.
+
+IPE also provides a way to delete policies. This can be done via the
+``delete`` securityfs node,
+``/sys/kernel/security/ipe/policies/$policy_name/delete``.
+Writing ``1`` to that file deletes the policy::
+
+   echo 1 > "/sys/kernel/security/ipe/policies/$policy_name/delete"
+
+There is only one requirement to delete a policy: the policy being deleted
+must be inactive.
+
+.. NOTE::
+
+   If a traditional MAC system is enabled (SELinux, apparmor, smack), all
+   writes to ipe's securityfs nodes require ``CAP_MAC_ADMIN``.
+
+Modes
+~~~~~
+
+IPE supports two modes of operation: permissive (similar to SELinux's
+permissive mode) and enforced. In permissive mode, all events are
+checked and policy violations are logged, but the policy is not really
+enforced. This allows users to test policies before enforcing them.
+
+The default mode is enforced, and can be changed via the kernel command
+line parameter ``ipe.enforce=(0|1)``, or the securityfs node
+``/sys/kernel/security/ipe/enforce``.
+
+.. NOTE::
+
+   If a traditional MAC system is enabled (SELinux, apparmor, smack, etcetera),
+   all writes to ipe's securityfs nodes require ``CAP_MAC_ADMIN``.
+
+Audit Events
+~~~~~~~~~~~~
+
+1420 AUDIT_IPE_ACCESS
+^^^^^^^^^^^^^^^^^^^^^
+Event Examples::
+
+   type=1420 audit(1653364370.067:61): ipe_op=EXECUTE ipe_hook=MMAP enforcing=1 pid=2241 comm="ld-linux.so" path="/deny/lib/libc.so.6" dev="sda2" ino=14549020 rule="DEFAULT action=DENY"
+   type=1300 audit(1653364370.067:61): SYSCALL arch=c000003e syscall=9 success=no exit=-13 a0=7f1105a28000 a1=195000 a2=5 a3=812 items=0 ppid=2219 pid=2241 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="ld-linux.so" exe="/tmp/ipe-test/lib/ld-linux.so" subj=unconfined key=(null)
+   type=1327 audit(1653364370.067:61): 707974686F6E3300746573742F6D61696E2E7079002D6E00
+
+   type=1420 audit(1653364735.161:64): ipe_op=EXECUTE ipe_hook=MMAP enforcing=1 pid=2472 comm="mmap_test" path=? dev=? ino=? rule="DEFAULT action=DENY"
+   type=1300 audit(1653364735.161:64): SYSCALL arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=1000 a2=4 a3=21 items=0 ppid=2219 pid=2472 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="mmap_test" exe="/root/overlake_test/upstream_test/vol_fsverity/bin/mmap_test" subj=unconfined key=(null)
+   type=1327 audit(1653364735.161:64): 707974686F6E3300746573742F6D61696E2E7079002D6E00
+
+This event indicates that IPE made an access control decision; the IPE
+specific record (1420) is always emitted in conjunction with an
+``AUDITSYSCALL`` record.
+
+Whether IPE is in permissive or enforced mode can be derived from the
+``success`` property and exit code of the ``AUDITSYSCALL`` record.
+
+
+Field descriptions:
+
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| Field     | Value Type | Optional? | Description of Value                                                            |
++===========+============+===========+=================================================================================+
+| ipe_op    | string     | No        | The IPE operation name associated with the log                                  |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| ipe_hook  | string     | No        | The name of the LSM hook that triggered the IPE event                           |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| enforcing | integer    | No        | The current IPE enforcing state: 1 for enforcing mode, 0 for permissive mode    |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| pid       | integer    | No        | The pid of the process that triggered the IPE event.                            |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| comm      | string     | No        | The command line program name of the process that triggered the IPE event       |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| path      | string     | Yes       | The absolute path to the evaluated file                                         |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| ino       | integer    | Yes       | The inode number of the evaluated file                                          |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| dev       | string     | Yes       | The device name of the evaluated file, e.g. vda                                 |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| rule      | string     | No        | The matched policy rule                                                         |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+
+1421 AUDIT_IPE_CONFIG_CHANGE
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Event Example::
+
+   type=1421 audit(1653425583.136:54): old_active_pol_name="Allow_All" old_active_pol_version=0.0.0 old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855 new_active_pol_name="boot_verified" new_active_pol_version=0.0.0 new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F26765076DD8EED7B8F4DB auid=4294967295 ses=4294967295 lsm=ipe res=1
+   type=1300 audit(1653425583.136:54): SYSCALL arch=c000003e syscall=1 success=yes exit=2 a0=3 a1=5596fcae1fb0 a2=2 a3=2 items=0 ppid=184 pid=229 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="python3" exe="/usr/bin/python3.10" key=(null)
+   type=1327 audit(1653425583.136:54): PROCTITLE proctitle=707974686F6E3300746573742F6D61696E2E7079002D66002E2
+
+This event indicates that IPE switched the active policy from one to another,
+and records the name, version and hash digest of both policies.
+Note that IPE can have only one active policy at a time; all access decisions
+are evaluated against the currently active policy.
+The normal procedure to deploy a new policy is to first load the policy
+into the kernel, then switch the active policy to it.
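+
+A minimal sketch of that procedure from a shell, assuming securityfs is
+mounted at ``/sys/kernel/security`` and that IPE exposes a ``new_policy``
+node and a per-policy ``active`` node there; the signed policy path and the
+policy name ``Ex_Policy`` are placeholders::
+
+   cat /path/to/Ex_Policy.p7b > /sys/kernel/security/ipe/new_policy
+   echo 1 > /sys/kernel/security/ipe/policies/Ex_Policy/active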
+
+This record will always be emitted in conjunction with an ``AUDITSYSCALL`` record for the ``write`` syscall.
+
+Field descriptions:
+
++------------------------+------------+-----------+---------------------------------------------------+
+| Field                  | Value Type | Optional? | Description of Value                              |
++========================+============+===========+===================================================+
+| old_active_pol_name    | string     | No        | The name of previous active policy                |
++------------------------+------------+-----------+---------------------------------------------------+
+| old_active_pol_version | string     | No        | The version of previous active policy             |
++------------------------+------------+-----------+---------------------------------------------------+
+| old_policy_digest      | string     | No        | The hash of previous active policy                |
++------------------------+------------+-----------+---------------------------------------------------+
+| new_active_pol_name    | string     | No        | The name of current active policy                 |
++------------------------+------------+-----------+---------------------------------------------------+
+| new_active_pol_version | string     | No        | The version of current active policy              |
++------------------------+------------+-----------+---------------------------------------------------+
+| new_policy_digest      | string     | No        | The hash of current active policy                 |
++------------------------+------------+-----------+---------------------------------------------------+
+| auid                   | integer    | No        | The login user ID                                 |
++------------------------+------------+-----------+---------------------------------------------------+
+| ses                    | integer    | No        | The login session ID                              |
++------------------------+------------+-----------+---------------------------------------------------+
+| lsm                    | string     | No        | The lsm name associated with the event            |
++------------------------+------------+-----------+---------------------------------------------------+
+| res                    | integer    | No        | The result of the audited operation(success/fail) |
++------------------------+------------+-----------+---------------------------------------------------+
+
+1422 AUDIT_IPE_POLICY_LOAD
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Event Example::
+
+   type=1422 audit(1653425529.927:53): policy_name="boot_verified" policy_version=0.0.0 policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F26765076DD8EED7B8F4DB auid=4294967295 ses=4294967295 lsm=ipe res=1
+   type=1300 audit(1653425529.927:53): arch=c000003e syscall=1 success=yes exit=2567 a0=3 a1=5596fcae1fb0 a2=a07 a3=2 items=0 ppid=184 pid=229 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="python3" exe="/usr/bin/python3.10" key=(null)
+   type=1327 audit(1653425529.927:53): PROCTITLE proctitle=707974686F6E3300746573742F6D61696E2E7079002D66002E2E
+
+This record indicates a new policy has been loaded into the kernel with the policy name, policy version and policy hash.
+
+This record will always be emitted in conjunction with an ``AUDITSYSCALL`` record for the ``write`` syscall.
+
+Field descriptions:
+
++----------------+------------+-----------+---------------------------------------------------+
+| Field          | Value Type | Optional? | Description of Value                              |
++================+============+===========+===================================================+
+| policy_name    | string     | No        | The policy_name                                   |
++----------------+------------+-----------+---------------------------------------------------+
+| policy_version | string     | No        | The policy_version                                |
++----------------+------------+-----------+---------------------------------------------------+
+| policy_digest  | string     | No        | The policy hash                                   |
++----------------+------------+-----------+---------------------------------------------------+
+| auid           | integer    | No        | The login user ID                                 |
++----------------+------------+-----------+---------------------------------------------------+
+| ses            | integer    | No        | The login session ID                              |
++----------------+------------+-----------+---------------------------------------------------+
+| lsm            | string     | No        | The lsm name associated with the event            |
++----------------+------------+-----------+---------------------------------------------------+
+| res            | integer    | No        | The result of the audited operation(success/fail) |
++----------------+------------+-----------+---------------------------------------------------+
+
+
+1404 AUDIT_MAC_STATUS
+^^^^^^^^^^^^^^^^^^^^^
+
+Event Examples::
+
+   type=1404 audit(1653425689.008:55): enforcing=0 old_enforcing=1 auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
+   type=1300 audit(1653425689.008:55): arch=c000003e syscall=1 success=yes exit=2 a0=1 a1=55c1065e5c60 a2=2 a3=0 items=0 ppid=405 pid=441 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=)
+   type=1327 audit(1653425689.008:55): proctitle="-bash"
+
+   type=1404 audit(1653425689.008:55): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
+   type=1300 audit(1653425689.008:55): arch=c000003e syscall=1 success=yes exit=2 a0=1 a1=55c1065e5c60 a2=2 a3=0 items=0 ppid=405 pid=441 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=)
+   type=1327 audit(1653425689.008:55): proctitle="-bash"
+
+This record will always be emitted in conjunction with an ``AUDITSYSCALL`` record for the ``write`` syscall.
+
+Field descriptions:
+
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| Field         | Value Type | Optional? | Description of Value                                                                            |
++===============+============+===========+=================================================================================================+
+| enforcing     | integer    | No        | The enforcing state IPE is being switched to, 1 is in enforcing mode, 0 is in permissive mode   |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| old_enforcing | integer    | No        | The enforcing state IPE is being switched from, 1 is in enforcing mode, 0 is in permissive mode |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| auid          | integer    | No        | The login user ID                                                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| ses           | integer    | No        | The login session ID                                                                            |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| enabled       | integer    | No        | The new TTY audit enabled setting                                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| old-enabled   | integer    | No        | The old TTY audit enabled setting                                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| lsm           | string     | No        | The lsm name associated with the event                                                          |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| res           | integer    | No        | The result of the audited operation(success/fail)                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+
+
+Success Auditing
+^^^^^^^^^^^^^^^^
+
+IPE supports success auditing. When enabled, all events that pass IPE
+policy and are not blocked will emit an audit event. This is disabled by
+default, and can be enabled via the kernel command line
+``ipe.success_audit=(0|1)`` or the
+``/sys/kernel/security/ipe/success_audit`` securityfs file.
+
+This is *very* noisy, as IPE will check every userspace binary on the
+system, but is useful for debugging policies.
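+
+A minimal sketch of toggling it at runtime through the securityfs file named
+above, assuming securityfs is mounted at ``/sys/kernel/security`` and that
+the node accepts the same ``0``/``1`` values as the command-line option::
+
+   echo 1 > /sys/kernel/security/ipe/success_audit   # enable
+   echo 0 > /sys/kernel/security/ipe/success_audit   # disable again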
+
+.. NOTE::
+
+   If a traditional MAC system is enabled (SELinux, apparmor, smack, etcetera),
+   all writes to ipe's securityfs nodes require ``CAP_MAC_ADMIN``.
+
+Properties
+----------
+
+As explained above, IPE properties are ``key=value`` pairs expressed in IPE
+policy. Two properties are built into the policy parser: 'op' and 'action'.
+The other properties are used to restrict immutable security properties
+about the files being evaluated. Currently those properties are:
+'``boot_verified``', '``dmverity_signature``', '``dmverity_roothash``',
+'``fsverity_signature``', '``fsverity_digest``'. Descriptions of all
+properties supported by IPE are listed below:
+
+op
+~~
+
+Indicates the operation for a rule to apply to. Must be in every rule,
+as the first token. IPE supports the following operations:
+
+   ``EXECUTE``:
+
+      Pertains to any file attempting to be executed, or loaded as an
+      executable.
+
+   ``FIRMWARE``:
+
+      Pertains to firmware being loaded via the firmware_class interface.
+      This covers both the preallocated buffer and the firmware file
+      itself.
+
+   ``KMODULE``:
+
+      Pertains to loading kernel modules via ``modprobe`` or ``insmod``.
+
+   ``KEXEC_IMAGE``:
+
+      Pertains to kernel images loaded via ``kexec``.
+
+   ``KEXEC_INITRAMFS``:
+
+      Pertains to initrd images loaded via ``kexec --initrd``.
+
+   ``POLICY``:
+
+      Controls loading policies via a kernel-space initiated read.
+
+      An example of such is loading IMA policies by writing the path
+      to the policy file to ``$securityfs/ima/policy``.
+
+   ``X509_CERT``:
+
+      Controls loading IMA certificates through the Kconfigs,
+      ``CONFIG_IMA_X509_PATH`` and ``CONFIG_EVM_X509_PATH``.
+
+action
+~~~~~~
+
+   Determines what IPE should do when a rule matches. Must be in every
+   rule, as the final clause. Can be one of:
+
+   ``ALLOW``:
+
+      If the rule matches, explicitly allow access to the resource to proceed
+      without executing any more rules.
+
+   ``DENY``:
+
+      If the rule matches, explicitly prohibit access to the resource
+      without executing any more rules.
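+
+   As a sketch of how ``op`` and ``action`` frame a rule (the policy name is
+   a placeholder, and the rule reuses the ``dmverity_signature`` property
+   described below), a policy that lets kernel modules load only from signed
+   dm-verity volumes could look like the following. Every other operation,
+   including execution, falls through to the default ``DENY``, so a real
+   policy would need further rules::
+
+      policy_name=Example_Signed_Modules policy_version=0.0.0
+      DEFAULT action=DENY
+
+      op=KMODULE dmverity_signature=TRUE action=ALLOW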
+
+boot_verified
+~~~~~~~~~~~~~
+
+   This property can be utilized for authorization of files from initramfs.
+   The format of this property is::
+
+         boot_verified=(TRUE|FALSE)
+
+
+   .. WARNING::
+
+      This property will trust files from the initramfs (rootfs). It should
+      only be used during the early boot stage. Before mounting the real
+      rootfs on top of the initramfs, the initramfs script will recursively
+      remove all files and directories on the initramfs. This is typically
+      implemented by using switch_root(8) [#switch_root]_. Therefore the
+      initramfs will be empty and not accessible after the real
+      rootfs takes over. It is advised to switch to a different policy
+      that doesn't rely on the property after this point.
+      This ensures that the trust policies remain relevant and effective
+      throughout the system's operation.
+
+dmverity_roothash
+~~~~~~~~~~~~~~~~~
+
+   This property can be utilized for authorization or revocation of
+   specific dm-verity volumes, identified via their root hashes. It has a
+   dependency on the DM_VERITY module. This property is controlled by
+   the ``IPE_PROP_DM_VERITY`` config option, which is automatically
+   selected when ``SECURITY_IPE`` and ``DM_VERITY`` are both enabled.
+   The format of this property is::
+
+      dmverity_roothash=DigestName:HexadecimalString
+
+   The supported DigestNames for dmverity_roothash are [#dmveritydigests]_
+
+      + blake2b-512
+      + blake2s-256
+      + sha256
+      + sha384
+      + sha512
+      + sha3-224
+      + sha3-256
+      + sha3-384
+      + sha3-512
+      + sm3
+      + rmd160
+
+dmverity_signature
+~~~~~~~~~~~~~~~~~~
+
+   This property can be utilized for authorization of all dm-verity
+   volumes that have a signed roothash that was validated by a keyring
+   specified by dm-verity's configuration, either the system trusted
+   keyring, or the secondary keyring. It depends on the
+   ``DM_VERITY_VERIFY_ROOTHASH_SIG`` config option and is controlled by
+   the ``IPE_PROP_DM_VERITY_SIGNATURE`` config option, which is automatically
+   selected when ``SECURITY_IPE``, ``DM_VERITY`` and
+   ``DM_VERITY_VERIFY_ROOTHASH_SIG`` are all enabled.
+   The format of this property is::
+
+      dmverity_signature=(TRUE|FALSE)
+
+fsverity_digest
+~~~~~~~~~~~~~~~
+
+   This property can be utilized for authorization of specific fsverity
+   enabled files, identified via their fsverity digests.
+   It depends on the ``FS_VERITY`` config option and is controlled by
+   the ``IPE_PROP_FS_VERITY`` config option, which is automatically
+   selected when ``SECURITY_IPE`` and ``FS_VERITY`` are both enabled.
+   The format of this property is::
+
+      fsverity_digest=DigestName:HexadecimalString
+
+   The supported DigestNames for fsverity_digest are [#fsveritydigest]_
+
+      + sha256
+      + sha512
+
+fsverity_signature
+~~~~~~~~~~~~~~~~~~
+
+   This property is used to authorize all fs-verity enabled files that have
+   been verified by fs-verity's built-in signature mechanism. The signature
+   verification relies on a key stored within the ".fs-verity" keyring. It
+   depends on the ``FS_VERITY_BUILTIN_SIGNATURES`` config option and
+   is controlled by the ``IPE_PROP_FS_VERITY`` config option,
+   which is automatically selected when ``SECURITY_IPE``, ``FS_VERITY``
+   and ``FS_VERITY_BUILTIN_SIGNATURES`` are all enabled.
+   The format of this property is::
+
+      fsverity_signature=(TRUE|FALSE)
+
+Policy Examples
+---------------
+
+Allow all
+~~~~~~~~~
+
+::
+
+   policy_name=Allow_All policy_version=0.0.0
+   DEFAULT action=ALLOW
+
+Allow only initramfs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=Allow_Initramfs policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+
+Allow any signed and validated dm-verity volume and the initramfs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=Allow_Signed_DMV_And_Initramfs policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+   op=EXECUTE dmverity_signature=TRUE action=ALLOW
+
+Prohibit execution from a specific dm-verity volume
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=Deny_DMV_By_Roothash policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE dmverity_roothash=sha256:cd2c5bae7c6c579edaae4353049d58eb5f2e8be0244bf05345bc8e5ed257baff action=DENY
+
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+   op=EXECUTE dmverity_signature=TRUE action=ALLOW
+
+Allow only a specific dm-verity volume
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=Allow_DMV_By_Roothash policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE dmverity_roothash=sha256:401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW
+
+Allow any fs-verity file with a valid built-in signature
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=Allow_Signed_And_Validated_FSVerity policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE fsverity_signature=TRUE action=ALLOW
+
+Allow execution of a specific fs-verity file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=ALLOW_FSV_By_Digest policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE fsverity_digest=sha256:fd88f2b8824e197f850bf4c5109bea5cf0ee38104f710843bb72da796ba5af9e action=ALLOW
+
+Additional Information
+----------------------
+
+- `Github Repository <https://github.com/microsoft/ipe>`_
+- Documentation/security/ipe.rst
+
+FAQ
+---
+
+Q:
+   What's the difference between IPE and other LSMs which provide a measure
+   of trust-based access control?
+
+A:
+
+   In general, there are two other LSMs that can provide similar functionality:
+   IMA, and Loadpin.
+
+   IMA and IPE are functionally very similar. The significant difference between
+   the two is the policy. [#devdoc]_
+
+   Loadpin and IPE differ fairly dramatically, as Loadpin only covers IPE's
+   kernel read operations, whereas IPE is capable of controlling execution
+   on top of kernel read. The trust model is also different; Loadpin roots its
+   trust in the initial super-block, whereas trust in IPE stems from the kernel
+   itself (via ``SYSTEM_TRUSTED_KEYS``).
+
+-----------
+
+.. [#digest_cache_lsm] https://lore.kernel.org/lkml/20240415142436.2545003-1-roberto.sassu@huaweicloud.com/
+
+.. [#interpreters] There is `some interest in solving this issue <https://lore.kernel.org/lkml/20220321161557.495388-1-mic@digikod.net/>`_.
+
+.. [#devdoc] Please see Documentation/security/ipe.rst for more on this topic.
+
+.. [#switch_root] https://man7.org/linux/man-pages/man8/switch_root.8.html
+
+.. [#dmveritydigests] These hash algorithms are based on values accepted by
+                      the Linux crypto API; IPE does not impose any
+                      restrictions on the digest algorithm itself;
+                      thus, this list may be out of date.
+
+.. [#fsveritydigest] These hash algorithms are based on values accepted by the
+                     kernel's fsverity support; IPE does not impose any
+                     restrictions on the digest algorithm itself;
+                     thus, this list may be out of date.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 213d0719e2b7..bf7ae099db9c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2321,6 +2321,18 @@
 	ipcmni_extend	[KNL,EARLY] Extend the maximum number of unique System V
 			IPC identifiers from 32,768 to 16,777,216.
 
+	ipe.enforce=	[IPE]
+			Format: <bool>
+			Determine whether IPE starts in permissive (0) or
+			enforce (1) mode. The default is enforce.
+
+	ipe.success_audit=
+			[IPE]
+			Format: <bool>
+			Start IPE with success auditing enabled, emitting
+			an audit event when a binary is allowed. The default
+			is 0.
+
 	irqaffinity=	[SMP] Set the default irq affinity mask
 			The argument is a cpu list, as described above.
 
diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 362b7a5dc300..46ab280e1b13 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -92,7 +92,9 @@ authenticating fs-verity file hashes include:
   "IPE policy" specifically allows for the authorization of fs-verity
   files using properties ``fsverity_digest`` for identifying
   files by their verity digest, and ``fsverity_signature`` to authorize
-  files with a verified fs-verity's built-in signature.
+  files with a verified fs-verity built-in signature. For
+  details on configuring IPE policies and understanding its operational
+  modes, please refer to Documentation/admin-guide/LSM/ipe.rst.
 
 - Trusted userspace code in combination with `Built-in signature
   verification`_.  This approach should be used only with great care.
@@ -508,6 +510,7 @@ be carefully considered before using them:
   files with a verified fs-verity builtin signature to perform certain
   operations, such as execution. Note that IPE doesn't require
   fs.verity.require_signatures=1.
+  Please refer to Documentation/admin-guide/LSM/ipe.rst for more details.
 
 - A file's builtin signature can only be set at the same time that
   fs-verity is being enabled on the file.  Changing or deleting the
diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 59f8fc106cb0..3e0a7114a862 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -19,3 +19,4 @@ Security Documentation
    digsig
    landlock
    secrets/index
+   ipe
diff --git a/Documentation/security/ipe.rst b/Documentation/security/ipe.rst
new file mode 100644
index 000000000000..07e363224128
--- /dev/null
+++ b/Documentation/security/ipe.rst
@@ -0,0 +1,446 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Integrity Policy Enforcement (IPE) - Kernel Documentation
+=========================================================
+
+.. NOTE::
+
+   This documentation is targeted at developers rather than administrators.
+   If you're looking for documentation on the usage of IPE, please see
+   Documentation/admin-guide/LSM/ipe.rst.
+
+Historical Motivation
+---------------------
+
+The original issue that prompted IPE's implementation was the creation
+of a locked-down system. This system would be born-secure, and have
+strong integrity guarantees over both the executable code, and specific
+*data files* on the system, that were critical to its function. These
+specific data files would not be readable unless they passed integrity
+policy. A mandatory access control system would be present, and
+as a result, xattrs would have to be protected. This led to a selection
+of what would provide the integrity claims. At the time, there were two
+main mechanisms considered that could guarantee integrity for the system
+with these requirements:
+
+  1. IMA + EVM Signatures
+  2. DM-Verity
+
+Both options were carefully considered; however, DM-Verity was chosen
+over IMA+EVM as the *integrity mechanism* in the original use case of IPE
+for three main reasons:
+
+  1. Protection of additional attack vectors:
+
+    * With IMA+EVM, without an encryption solution, the system is vulnerable
+      to an offline attack against the aforementioned specific data files.
+
+      Unlike executables, read operations (like those on the protected data
+      files) cannot be enforced to be globally integrity verified. This means
+      there must be some form of selector to determine whether a read should
+      enforce the integrity policy or not.
+
+      At the time, this was done with mandatory access control labels. An IMA
+      policy would indicate what labels required integrity verification, which
+      presented an issue: EVM would protect the label, but if an attacker could
+      modify the filesystem offline, the attacker could wipe all the xattrs -
+      including the SELinux labels that would be used to determine whether the
+      file should be subject to integrity policy.
+
+      With DM-Verity, as the xattrs are saved as part of the Merkle tree, if
+      an offline mount occurs against the filesystem protected by dm-verity,
+      the checksum no longer matches and the file fails to be read.
+
+    * As userspace binaries are paged in Linux, dm-verity also offers the
+      additional protection against a hostile block device. In such an attack,
+      the block device reports the appropriate content for the IMA hash
+      initially, passing the required integrity check. Then, on the page fault
+      that accesses the real data, it reports the attacker's payload. Since
+      dm-verity will check the data when the page fault occurs (and the disk
+      access), this attack is mitigated.
+
+  2. Performance:
+
+    * dm-verity provides integrity verification on demand as blocks are
+      read, rather than requiring the entire file to be read into memory for
+      validation.
+
+  3. Simplicity of signing:
+
+    * No need for two signatures (IMA, then EVM): one signature covers
+      an entire block device.
+    * Signatures can be stored externally to the filesystem metadata.
+    * The signature supports an X.509-based signing infrastructure.
+
+The next step was to choose a *policy* to enforce the integrity mechanism.
+The minimum requirements for the policy were:
+
+  1. The policy itself must be integrity verified (preventing trivial
+     attack against it).
+  2. The policy itself must be resistant to rollback attacks.
+  3. The policy enforcement must have a permissive-like mode.
+  4. The policy must be able to be updated, in its entirety, without
+     a reboot.
+  5. Policy updates must be atomic.
+  6. The policy must support *revocations* of previously authored
+     components.
+  7. The policy must be auditable, at any point in time.
+
+IMA, as the only integrity policy mechanism at the time, was
+considered against this list of requirements, and did not fulfill
+all of the minimum requirements. Extending IMA to cover these
+requirements was considered, but ultimately discarded for
+two reasons:
+
+  1. Regression risk; many of these changes would result in
+     dramatic code changes to IMA, which is already present in the
+     kernel, and therefore might impact users.
+
+  2. IMA was used in the system for measurement and attestation;
+     separation of measurement policy from local integrity policy
+     enforcement was considered favorable.
+
+Due to these reasons, it was decided that a new LSM should be created,
+whose responsibility would be only the local integrity policy enforcement.
+
+Role and Scope
+--------------
+
+IPE, as its name implies, is fundamentally an integrity policy enforcement
+solution; IPE does not mandate how integrity is provided, but instead
+leaves that decision to the system administrator to set the security bar,
+via the mechanisms that they select that suit their individual needs.
+There are several different integrity solutions that provide different
+levels of security guarantees, and IPE allows sysadmins to express policy for
+theoretically all of them.
+
+IPE does not have an inherent mechanism to ensure integrity on its own.
+Instead, there are more effective layers available for building systems that
+can guarantee integrity. It's important to note that the mechanism for proving
+integrity is independent of the policy for enforcing that integrity claim.
+
+Therefore, IPE was designed around:
+
+  1. Easy integrations with integrity providers.
+  2. Ease of use for platform administrators/sysadmins.
+
+Design Rationale
+----------------
+
+IPE was designed after evaluating existing integrity policy solutions
+in other operating systems and environments. In this survey of other
+implementations, there were a few pitfalls identified:
+
+  1. Policies were not readable by humans, usually requiring a binary
+     intermediary format.
+  2. A single, non-customizable action was implicitly taken as a default.
+  3. Debugging the policy required manual steps to determine what rule was violated.
+  4. Authoring a policy required an in-depth knowledge of the larger system,
+     or operating system.
+
+IPE attempts to avoid all of these pitfalls.
+
+Policy
+~~~~~~
+
+Plain Text
+^^^^^^^^^^
+
+IPE's policy is plain-text. This introduces slightly larger policy files than
+other LSMs, but solves two major problems that occur with some integrity policy
+solutions on other platforms.
+
+The first issue is one of code maintenance and duplication. To author policies,
+the policy has to be some form of string representation (be it structured,
+through XML, JSON, YAML, etcetera), to allow the policy author to understand
+what is being written. In a hypothetical binary policy design, a serializer
+is necessary to write the policy from the human readable form, to the binary
+form, and a deserializer is needed to interpret the binary form into a data
+structure in the kernel.
+
+Eventually, another deserializer will be needed to transform the binary form
+back into the human-readable form with as much information preserved. This is because a
+user of this access control system will have to keep a lookup table of a checksum
+and the original file itself to try to understand what policies have been deployed
+on this system and what policies have not. For a single user, this may be alright,
+as old policies can be discarded almost immediately after the update takes hold.
+For users that manage computer fleets in the thousands, if not hundreds of thousands,
+with multiple different operating systems, and multiple different operational needs,
+this quickly becomes an issue, as stale policies from years ago may be present,
+quickly resulting in the need to recover the policy or fund extensive infrastructure
+to track what each policy contains.
+
+With three separate serializers/deserializers now required, maintenance becomes
+costly. If the policy avoids the binary format, there is only one required
+serializer: from the human-readable form to the data structure in kernel,
+saving on code maintenance, and retaining operability.
+
+The second issue with a binary format is one of transparency. As IPE controls
+access based on the trust of the system's resources, its policy must also be
+trusted to be changed. This is done through signatures, resulting in needing
+signing as a process. Signing, as a process, is typically done with a
+high security bar, as anything signed can be used to attack integrity
+enforcement systems. It is also important that, when signing something,
+the signer is aware of what they are signing. A binary policy can cause
+obfuscation of that fact; what signers see is an opaque binary blob. With a
+plain-text policy, on the other hand, the signers see the actual policy
+submitted for signing.
+
+Boot Policy
+~~~~~~~~~~~
+
+IPE, if configured appropriately, is able to enforce a policy as soon as a
+kernel is booted and usermode starts. That implies some level of storage
+of the policy to apply the minute usermode starts. Generally, that storage
+can be handled in one of three ways:
+
+  1. The policy file(s) live on disk and the kernel loads the policy prior
+     to any code path that would result in an enforcement decision.
+  2. The policy file(s) are passed by the bootloader to the kernel, which
+     parses the policy.
+  3. There is a policy file that is compiled into the kernel that is
+     parsed and enforced on initialization.
+
+The first option has problems: the kernel reading files from userspace
+is typically discouraged and very uncommon in the kernel.
+
+The second option also has problems: Linux supports a variety of bootloaders
+across its entire ecosystem - every bootloader would have to support this
+new methodology or there must be an independent source. It would likely
+result in more drastic changes to the kernel startup than necessary.
+
+The third option is the best but it's important to be aware that the policy
+will take disk space against the kernel it's compiled in. It's important to
+keep this policy generalized enough that userspace can load a new, more
+complicated policy, but restrictive enough that it will not overauthorize
+and cause security issues.
+
+The initramfs provides a way that this bootup path can be established. The
+kernel starts with a minimal policy, that trusts the initramfs only. Inside
+the initramfs, when the real rootfs is mounted, but not yet transferred to,
+it deploys and activates a policy that trusts the new root filesystem.
+This prevents overauthorization at any step, and keeps the kernel policy
+to a minimal size.
+
+Startup
+^^^^^^^
+
+Not every system, however, starts with an initramfs, so the startup policy
+compiled into the kernel will need some flexibility to express how trust
+is established for the next phase of the bootup. To this end, if we just
+make the compiled-in policy a full IPE policy, it allows system builders
+to express the first stage bootup requirements appropriately.
+
+Updatable, Rebootless Policy
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Requirements change over time (vulnerabilities are found in previously
+trusted applications, keys roll, etcetera). Updating a kernel to
+meet those security goals is not always a suitable option, as updates are not
+always risk-free, and blocking a security update leaves systems vulnerable.
+This means IPE requires a policy that can be completely updated (allowing
+revocations of existing policy) from a source external to the kernel (allowing
+policies to be updated without updating the kernel).
+
+Additionally, since the kernel is stateless between invocations, and reading
+policy files off the disk from kernel space is a bad idea(tm), the
+policy updates have to be done rebootlessly.
+
+An update from an external source could be potentially malicious,
+so this policy needs to have a way to be identified as trusted. This is
+done via a signature chained to a trust source in the kernel. Arbitrarily,
+this is the ``SYSTEM_TRUSTED_KEYRING``, a keyring that is initially
+populated at kernel compile-time, as this matches the expectation that the
+author of the compiled-in policy described above is the same entity that can
+deploy policy updates.
+
+Anti-Rollback / Anti-Replay
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Over time, vulnerabilities are found and trusted resources may not be
+trusted anymore. IPE's policy has no exception to this. There can be
+instances where a mistaken policy author deploys an insecure policy,
+before correcting it with a secure policy.
+
+Assuming that an attacker acquires the insecure policy as soon as it
+is signed, IPE needs a way to prevent rollback
+from the secure policy update to the insecure policy update.
+
+Initially, IPE's policy can have a policy_version that states the
+minimum required version across all policies that can be active on
+the system. This will prevent rollback while the system is live.
+
+.. WARNING::
+
+  However, since the kernel is stateless across boots, this policy
+  version will be reset to 0.0.0 on the next boot. System builders
+  need to be aware of this, and ensure the new secure policies are
+  deployed ASAP after a boot to ensure that the window of
+  opportunity is minimal for an attacker to deploy the insecure policy.
+
+Implicit Actions:
+~~~~~~~~~~~~~~~~~
+
+The issue of implicit actions only becomes visible when you consider
+a mixed level of security bars across multiple operations in a system.
+For example, consider a system that has strong integrity guarantees
+over both the executable code and specific *data files* on the system
+that are critical to its function. In this system, three types of policies
+are possible:
+
+  1. A policy in which failure to match any rules in the policy results
+     in the action being denied.
+  2. A policy in which failure to match any rules in the policy results
+     in the action being allowed.
+  3. A policy in which the action taken when no rules are matched is
+     specified by the policy author.
+
+The first option could make a policy like this::
+
+  op=EXECUTE integrity_verified=YES action=ALLOW
+
+In the example system, this works well for the executables, as all
+executables should have integrity guarantees, without exception. The
+issue arises with the second requirement about specific data files.
+This would result in a policy like this (assuming each line is
+evaluated in order)::
+
+  op=EXECUTE integrity_verified=YES action=ALLOW
+
+  op=READ integrity_verified=NO label=critical_t action=DENY
+  op=READ action=ALLOW
+
+This is somewhat clear if you read the docs, understand the policy
+is executed in order and that the default is a denial; however, the
+last line effectively changes that default to an ALLOW. This is
+required, because in a realistic system, there are some unverified
+reads (imagine appending to a log file).
+
+The second option, matching no rules results in an allow, is clearer
+for the specific data files::
+
+  op=READ integrity_verified=NO label=critical_t action=DENY
+
+And, like the first option, falls short with the opposite scenario,
+effectively needing to override the default::
+
+  op=EXECUTE integrity_verified=YES action=ALLOW
+  op=EXECUTE action=DENY
+
+  op=READ integrity_verified=NO label=critical_t action=DENY
+
+This leaves the third option. Instead of making users be clever
+and override the default with an empty rule, force the end-user
+to consider what the appropriate default should be for their
+scenario and explicitly state it::
+
+  DEFAULT op=EXECUTE action=DENY
+  op=EXECUTE integrity_verified=YES action=ALLOW
+
+  DEFAULT op=READ action=ALLOW
+  op=READ integrity_verified=NO label=critical_t action=DENY
+
+Policy Debugging:
+~~~~~~~~~~~~~~~~~
+
+When developing a policy, it is useful to know what line of the policy
+is being violated to reduce debugging costs; narrowing the scope of the
+investigation to the exact line that resulted in the action. Some integrity
+policy systems do not provide this information, instead providing the
+information that was used in the evaluation. This then requires a correlation
+with the policy to evaluate what went wrong.
+
+Instead, IPE just emits the rule that was matched. This limits the scope
+of the investigation to the exact policy line (in the case of a specific
+rule), or the section (in the case of a DEFAULT). This decreases iteration
+and investigation times when policy failures are observed while evaluating
+policies.
+
+IPE's policy engine is also designed in a way that makes it obvious to
+a human how to investigate a policy failure. Each line is evaluated in
+the sequence that is written, so the algorithm is simple enough
+for humans to recreate the steps that could have caused the failure. In other
+surveyed systems, optimizations occur (sorting rules, for instance) when loading
+the policy. In those systems, it requires multiple steps to debug, and the
+algorithm may not always be clear to the end-user without reading the code first.
+
+Simplified Policy:
+~~~~~~~~~~~~~~~~~~
+
+Finally, IPE's policy is designed for sysadmins, not kernel developers. Instead
+of covering individual LSM hooks (or syscalls), IPE covers operations. This means
+instead of sysadmins needing to know that the syscalls ``mmap``, ``mprotect``,
+``execve``, and ``uselib`` must have rules protecting them, they must simply know
+that they want to restrict code execution. This limits the number of bypasses that
+could occur due to a lack of knowledge of the underlying system; whereas the
+maintainers of IPE, being kernel developers, can make the correct choice to determine
+whether something maps to these operations, and under what conditions.
+
+Implementation Notes
+--------------------
+
+Anonymous Memory
+~~~~~~~~~~~~~~~~
+
+Anonymous memory isn't treated any differently from any other access in IPE.
+When anonymous memory is mapped with ``+X``, it still comes into the ``file_mmap``
+or ``file_mprotect`` hook, but with a ``NULL`` file object. This is submitted to
+the evaluation, like any other file; however, all current trust mechanisms will
+return false as there is nothing to evaluate. This means anonymous memory
+execution is subject to whatever the ``DEFAULT`` is for ``EXECUTE``.
+
+.. WARNING::
+
+  This also occurs with the ``kernel_load_data`` hook, which is used by signed
+  and compressed kernel modules. Using signed and compressed kernel modules with
+  IPE will always result in the ``DEFAULT`` action for ``KMODULE``.
+
+Securityfs Interface
+~~~~~~~~~~~~~~~~~~~~
+
+The per-policy securityfs tree is somewhat unique. For example, for
+a standard securityfs policy tree::
+
+  MyPolicy
+    |- active
+    |- delete
+    |- name
+    |- pkcs7
+    |- policy
+    |- update
+    |- version
+
+The policy is stored in the ``->i_private`` data of the MyPolicy inode.
+
+Tests
+-----
+
+IPE has KUnit Tests for the policy parser. Recommended kunitconfig::
+
+  CONFIG_KUNIT=y
+  CONFIG_SECURITY=y
+  CONFIG_SECURITYFS=y
+  CONFIG_PKCS7_MESSAGE_PARSER=y
+  CONFIG_SYSTEM_DATA_VERIFICATION=y
+  CONFIG_FS_VERITY=y
+  CONFIG_FS_VERITY_BUILTIN_SIGNATURES=y
+  CONFIG_BLOCK=y
+  CONFIG_MD=y
+  CONFIG_BLK_DEV_DM=y
+  CONFIG_DM_VERITY=y
+  CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
+  CONFIG_NET=y
+  CONFIG_AUDIT=y
+  CONFIG_AUDITSYSCALL=y
+  CONFIG_BLK_DEV_INITRD=y
+
+  CONFIG_SECURITY_IPE=y
+  CONFIG_IPE_PROP_DM_VERITY=y
+  CONFIG_IPE_PROP_DM_VERITY_SIGNATURE=y
+  CONFIG_IPE_PROP_FS_VERITY=y
+  CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG=y
+  CONFIG_SECURITY_IPE_KUNIT_TEST=y
+
+In addition, IPE has a Python-based integration
+`test suite <https://github.com/microsoft/ipe/tree/test-suite>`_ that
+can test both user interfaces and enforcement functionalities.
-- 
2.44.0



* [PATCH v18 18/21] scripts: add boot policy generation program
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (16 preceding siblings ...)
  2024-05-03 22:32 38% ` [PATCH v18 17/21] ipe: enable support for fs-verity as a trust provider Fan Wu
@ 2024-05-03 22:32 47% ` Fan Wu
  2024-05-03 22:32 48% ` [PATCH v18 19/21] ipe: kunit test for parser Fan Wu
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Enables an IPE policy to be enforced from kernel start, enabling access
control based on trust from kernel startup. This is accomplished by
transforming an IPE policy indicated by CONFIG_IPE_BOOT_POLICY into a
c-string literal that is parsed at kernel startup as an unsigned policy.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No Changes

v3:
  + No Changes

v4:
  + No Changes

v5:
  + No Changes

v6:
  + No Changes

v7:
  + Move from 01/11 to 14/16
  + Don't return errno directly.
  + Make output of script more user-friendly
  + Add escaping for tab and '?'
  + Mark argv pointer const
  + Invert return code check in the boot policy parsing code path.

v8:
  + No significant changes.

v9:
  + No changes

v10:
  + Update the init part code for rcu changes in the eval loop patch

v11:
  + Fix code style issues

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + Fix one grammar issue in Kconfig

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
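
Note for reviewers: the escaping that polgen performs can be tried
outside the kernel tree. The sketch below is a hypothetical standalone
helper, emit_c_literal(), which is not part of this patch; it mirrors the
switch statement in write_boot_policy() and only assumes a hosted C
environment.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the patch): write @size bytes of @buf to
 * @out as the body of a C string literal, using the same escapes as
 * write_boot_policy() in polgen.c. The single-quote case in polgen.c
 * emits the character unchanged, so it is covered by the default here.
 */
static void emit_c_literal(FILE *out, const char *buf, size_t size)
{
	size_t i;

	fputs("\t\"", out);
	for (i = 0; i < size; ++i) {
		switch (buf[i]) {
		case '"':
			fputs("\\\"", out);
			break;
		case '\n':
			/* Close this literal and open a new one on the next line. */
			fputs("\\n\"\n\t\"", out);
			break;
		case '\\':
			fputs("\\\\", out);
			break;
		case '\t':
			fputs("\\t", out);
			break;
		case '?':
			/* Escaped so that "??x" sequences cannot form trigraphs. */
			fputs("\\?", out);
			break;
		default:
			fputc(buf[i], out);
		}
	}
	fputs("\";\n", out);
}
```

Calling emit_c_literal(stdout, policy, len) on a buffer that ends in a
newline leaves a trailing empty "" literal on the last line. That is
harmless, since adjacent string literals are concatenated by the
compiler.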
---
 scripts/Makefile              |   1 +
 scripts/ipe/Makefile          |   2 +
 scripts/ipe/polgen/.gitignore |   2 +
 scripts/ipe/polgen/Makefile   |   5 ++
 scripts/ipe/polgen/polgen.c   | 145 ++++++++++++++++++++++++++++++++++
 security/ipe/.gitignore       |   2 +
 security/ipe/Kconfig          |  10 +++
 security/ipe/Makefile         |  11 +++
 security/ipe/fs.c             |   8 ++
 security/ipe/ipe.c            |  12 +++
 10 files changed, 198 insertions(+)
 create mode 100644 scripts/ipe/Makefile
 create mode 100644 scripts/ipe/polgen/.gitignore
 create mode 100644 scripts/ipe/polgen/Makefile
 create mode 100644 scripts/ipe/polgen/polgen.c
 create mode 100644 security/ipe/.gitignore

diff --git a/scripts/Makefile b/scripts/Makefile
index bc90520a5426..cae8a14fa40d 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -55,6 +55,7 @@ targets += module.lds
 subdir-$(CONFIG_GCC_PLUGINS) += gcc-plugins
 subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
+subdir-$(CONFIG_SECURITY_IPE) += ipe
 
 # Let clean descend into subdirs
 subdir-	+= basic dtc gdb kconfig mod
diff --git a/scripts/ipe/Makefile b/scripts/ipe/Makefile
new file mode 100644
index 000000000000..e87553fbb8d6
--- /dev/null
+++ b/scripts/ipe/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+subdir-y := polgen
diff --git a/scripts/ipe/polgen/.gitignore b/scripts/ipe/polgen/.gitignore
new file mode 100644
index 000000000000..b6f05cf3dc0e
--- /dev/null
+++ b/scripts/ipe/polgen/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+polgen
diff --git a/scripts/ipe/polgen/Makefile b/scripts/ipe/polgen/Makefile
new file mode 100644
index 000000000000..c20456a2f2e9
--- /dev/null
+++ b/scripts/ipe/polgen/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+hostprogs-always-y	:= polgen
+HOST_EXTRACFLAGS += \
+	-I$(srctree)/include \
+	-I$(srctree)/include/uapi \
diff --git a/scripts/ipe/polgen/polgen.c b/scripts/ipe/polgen/polgen.c
new file mode 100644
index 000000000000..c6283b3ff006
--- /dev/null
+++ b/scripts/ipe/polgen/polgen.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <errno.h>
+
+static void usage(const char *const name)
+{
+	printf("Usage: %s OutputFile (PolicyFile)\n", name);
+	exit(EINVAL);
+}
+
+static int policy_to_buffer(const char *pathname, char **buffer, size_t *size)
+{
+	size_t fsize;
+	size_t read;
+	char *lbuf;
+	int rc = 0;
+	FILE *fd;
+
+	fd = fopen(pathname, "r");
+	if (!fd) {
+		rc = errno;
+		goto out;
+	}
+
+	fseek(fd, 0, SEEK_END);
+	fsize = ftell(fd);
+	rewind(fd);
+
+	lbuf = malloc(fsize);
+	if (!lbuf) {
+		rc = ENOMEM;
+		goto out_close;
+	}
+
+	read = fread((void *)lbuf, sizeof(*lbuf), fsize, fd);
+	if (read != fsize) {
+		rc = -1;
+		goto out_free;
+	}
+
+	*buffer = lbuf;
+	*size = fsize;
+	fclose(fd);
+
+	return rc;
+
+out_free:
+	free(lbuf);
+out_close:
+	fclose(fd);
+out:
+	return rc;
+}
+
+static int write_boot_policy(const char *pathname, const char *buf, size_t size)
+{
+	int rc = 0;
+	FILE *fd;
+	size_t i;
+
+	fd = fopen(pathname, "w");
+	if (!fd) {
+		rc = errno;
+		goto err;
+	}
+
+	fprintf(fd, "/* This file is automatically generated.");
+	fprintf(fd, " Do not edit. */\n");
+	fprintf(fd, "#include <linux/stddef.h>\n");
+	fprintf(fd, "\nextern const char *const ipe_boot_policy;\n\n");
+	fprintf(fd, "const char *const ipe_boot_policy =\n");
+
+	if (!buf || size == 0) {
+		fprintf(fd, "\tNULL;\n");
+		fclose(fd);
+		return 0;
+	}
+
+	fprintf(fd, "\t\"");
+
+	for (i = 0; i < size; ++i) {
+		switch (buf[i]) {
+		case '"':
+			fprintf(fd, "\\\"");
+			break;
+		case '\'':
+			fprintf(fd, "'");
+			break;
+		case '\n':
+			fprintf(fd, "\\n\"\n\t\"");
+			break;
+		case '\\':
+			fprintf(fd, "\\\\");
+			break;
+		case '\t':
+			fprintf(fd, "\\t");
+			break;
+		case '\?':
+			fprintf(fd, "\\?");
+			break;
+		default:
+			fprintf(fd, "%c", buf[i]);
+		}
+	}
+	fprintf(fd, "\";\n");
+	fclose(fd);
+
+	return 0;
+
+err:
+	if (fd)
+		fclose(fd);
+	return rc;
+}
+
+int main(int argc, const char *const argv[])
+{
+	char *policy = NULL;
+	size_t len = 0;
+	int rc = 0;
+
+	if (argc < 2)
+		usage(argv[0]);
+
+	if (argc > 2) {
+		rc = policy_to_buffer(argv[2], &policy, &len);
+		if (rc != 0)
+			goto cleanup;
+	}
+
+	rc = write_boot_policy(argv[1], policy, len);
+cleanup:
+	if (policy)
+		free(policy);
+	if (rc != 0)
+		perror("An error occurred during policy conversion");
+	return rc;
+}
diff --git a/security/ipe/.gitignore b/security/ipe/.gitignore
new file mode 100644
index 000000000000..07313d3fd74a
--- /dev/null
+++ b/security/ipe/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+boot-policy.c
diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index 839d63698841..7a82778f93ae 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -21,6 +21,16 @@ menuconfig SECURITY_IPE
 	  If unsure, answer N.
 
 if SECURITY_IPE
+config IPE_BOOT_POLICY
+	string "Integrity policy to apply on system startup"
+	help
+	  This option specifies a filepath to an IPE policy that is compiled
+	  into the kernel. This policy will be enforced until a policy update
+	  is deployed via the $securityfs/ipe/policies/$policy_name/active
+	  interface.
+
+	  If unsure, leave blank.
+
 menu "IPE Trust Providers"
 
 config IPE_PROP_DM_VERITY
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index e1019bb9f0f3..84ad76556170 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -5,7 +5,16 @@
 # Makefile for building the IPE module as part of the kernel tree.
 #
 
+quiet_cmd_polgen = IPE_POL $(2)
+      cmd_polgen = scripts/ipe/polgen/polgen security/ipe/boot-policy.c $(2)
+
+targets += boot-policy.c
+
+$(obj)/boot-policy.c: scripts/ipe/polgen/polgen $(CONFIG_IPE_BOOT_POLICY) FORCE
+	$(call if_changed,polgen,$(CONFIG_IPE_BOOT_POLICY))
+
 obj-$(CONFIG_SECURITY_IPE) += \
+	boot-policy.o \
 	digest.o \
 	eval.o \
 	hooks.o \
@@ -15,3 +24,5 @@ obj-$(CONFIG_SECURITY_IPE) += \
 	policy_fs.o \
 	policy_parser.o \
 	audit.o \
+
+clean-files := boot-policy.c \
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
index b52fb6023904..5b6d19fb844a 100644
--- a/security/ipe/fs.c
+++ b/security/ipe/fs.c
@@ -190,6 +190,7 @@ static const struct file_operations enforce_fops = {
 static int __init ipe_init_securityfs(void)
 {
 	int rc = 0;
+	struct ipe_policy *ap;
 
 	if (!ipe_enabled)
 		return -EOPNOTSUPP;
@@ -220,6 +221,13 @@ static int __init ipe_init_securityfs(void)
 		goto err;
 	}
 
+	ap = rcu_access_pointer(ipe_active_policy);
+	if (ap) {
+		rc = ipe_new_policyfs_node(ap);
+		if (rc)
+			goto err;
+	}
+
 	np = securityfs_create_file("new_policy", 0200, root, NULL, &np_fops);
 	if (IS_ERR(np)) {
 		rc = PTR_ERR(np);
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index da79f66b0010..5e993cbf3428 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -9,6 +9,7 @@
 #include "hooks.h"
 #include "eval.h"
 
+extern const char *const ipe_boot_policy;
 bool ipe_enabled;
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
@@ -74,9 +75,20 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
  */
 static int __init ipe_init(void)
 {
+	struct ipe_policy *p = NULL;
+
 	security_add_hooks(ipe_hooks, ARRAY_SIZE(ipe_hooks), &ipe_lsmid);
 	ipe_enabled = true;
 
+	if (ipe_boot_policy) {
+		p = ipe_new_policy(ipe_boot_policy, strlen(ipe_boot_policy),
+				   NULL, 0);
+		if (IS_ERR(p))
+			return PTR_ERR(p);
+
+		rcu_assign_pointer(ipe_active_policy, p);
+	}
+
 	return 0;
 }
 
-- 
2.44.0



* [PATCH v18 17/21] ipe: enable support for fs-verity as a trust provider
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (15 preceding siblings ...)
  2024-05-03 22:32 48% ` [PATCH v18 16/21] fsverity: expose verified fsverity built-in signatures to LSMs Fan Wu
@ 2024-05-03 22:32 38% ` Fan Wu
  2024-05-03 22:32 47% ` [PATCH v18 18/21] scripts: add boot policy generation program Fan Wu
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu, Deven Bowers

Enable IPE policy authors to indicate trust for a singular fsverity
file, identified by the digest information, through "fsverity_digest"
and all files using valid fsverity builtin signatures via
"fsverity_signature".

This enables file-level integrity claims to be expressed in IPE,
allowing individual files to be authorized, giving some flexibility
for policy authors. Such file-level claims are important to be expressed
for enforcing the integrity of packages, as well as addressing some of the
scalability issues in a sole dm-verity based solution (# of loop back
devices, etc).

This solution cannot be done in userspace as the minimum threat that
IPE should mitigate is an attacker downloading a malicious payload with
all required dependencies. These dependencies can lack the userspace
check, bypassing the protection entirely. A similar attack succeeds if
the userspace component is replaced with a version that does not
perform the check. As a result, this can only be done in the common
entry point - the kernel.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v6:
  + Not present

v7:
  Introduced

v8:
  * Undo squash of 08/12, 10/12 - separating drivers/md/ from security/
  * Use common-audit function for fsverity_signature.
  + Change fsverity implementation to use fsverity_get_digest
  + prevent unnecessary copy of fs-verity signature data, instead
    just check for presence of signature data.
  + Remove free_inode_security hook, as the digest is now acquired
    at runtime instead of via LSM blob.

v9:
  + Adapt to the new parser

v10:
  + Update the fsverity get digest call

v11:
  + No changes

v12:
  + Fix audit format
  + Simplify property evaluation

v13:
  + Remove the CONFIG_IPE_PROP_FS_VERITY dependency inside the parser
    to make the policy grammar independent of the kernel config.

v14:
  + No changes

v15:
  + Fix one grammar issue in Kconfig
  + Switch hook to security_inode_setintegrity()

v16:
  + Rewrite fsverity signature part in Kconfig

v17:
  + Fix documentation issues
  + Use new enum name LSM_INT_FSVERITY_BUILTINSIG_VALID

v18:
  + Add Kconfig IPE_PROP_FS_VERITY_BUILTIN_SIG and make both FS_VERITY
    Kconfigs auto-selected
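
Note for reviewers: the fsverity_signature evaluation reduces to a small
truth table. The sketch below models it in userspace with hypothetical
mock_* structures standing in for struct inode, IS_VERITY() and the IPE
inode blob; none of these names exist in the kernel, and the sketch only
covers the case where CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG is enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects consulted by eval.c. */
struct mock_inode {
	bool verity_enabled;	/* models IS_VERITY(inode) */
};

struct mock_ipe_inode {
	bool fs_verity_signed;	/* models the flag set by ipe_inode_setintegrity() */
};

struct mock_ctx {
	const struct mock_inode *ino;		/* NULL models "no inode in the context" */
	const struct mock_ipe_inode *ipe_inode;	/* NULL models "no IPE inode blob" */
};

/* Same short-circuit chain as evaluate_fsv_sig_false() in the patch. */
static bool fsv_sig_false(const struct mock_ctx *ctx)
{
	return !ctx->ino ||
	       !ctx->ino->verity_enabled ||
	       !ctx->ipe_inode ||
	       !ctx->ipe_inode->fs_verity_signed;
}

/* fsverity_signature=TRUE is the strict negation of the FALSE case. */
static bool fsv_sig_true(const struct mock_ctx *ctx)
{
	return !fsv_sig_false(ctx);
}
```

With the option enabled, exactly one of the two properties matches for
any context, so a context without an inode (for example anonymous
memory) matches fsverity_signature=FALSE. With the option disabled, both
stubs in the patch return false and neither property matches.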
---
 security/ipe/Kconfig         |  25 +++++++
 security/ipe/audit.c         |  17 +++++
 security/ipe/eval.c          | 123 ++++++++++++++++++++++++++++++++++-
 security/ipe/eval.h          |  12 ++++
 security/ipe/hooks.c         |  28 ++++++++
 security/ipe/hooks.h         |   6 ++
 security/ipe/ipe.c           |  13 ++++
 security/ipe/ipe.h           |   3 +
 security/ipe/policy.h        |   3 +
 security/ipe/policy_parser.c |   6 ++
 10 files changed, 235 insertions(+), 1 deletion(-)

diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index 8279dddf92ad..839d63698841 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -10,6 +10,8 @@ menuconfig SECURITY_IPE
 	select SYSTEM_DATA_VERIFICATION
 	select IPE_PROP_DM_VERITY if DM_VERITY
 	select IPE_PROP_DM_VERITY_SIGNATURE if DM_VERITY && DM_VERITY_VERIFY_ROOTHASH_SIG
+	select IPE_PROP_FS_VERITY if FS_VERITY
+	select IPE_PROP_FS_VERITY_BUILTIN_SIG if FS_VERITY && FS_VERITY_BUILTIN_SIGNATURES
 	help
 	  This option enables the Integrity Policy Enforcement LSM
 	  allowing users to define a policy to enforce a trust-based access
@@ -39,6 +41,29 @@ config IPE_PROP_DM_VERITY_SIGNATURE
 	  volume, which has been mounted with a valid signed root hash,
 	  is evaluated.
 
+	  If unsure, answer Y.
+
+config IPE_PROP_FS_VERITY
+	bool "Enable support for fs-verity based on file digest"
+	depends on FS_VERITY
+	help
+	  This option enables the 'fsverity_digest' property within IPE
+	  policies. The property evaluates to TRUE when a file is fsverity
+	  enabled and its digest matches the supplied value in the policy.
+
+	  If unsure, answer Y.
+
+config IPE_PROP_FS_VERITY_BUILTIN_SIG
+	bool "Enable support for fs-verity based on builtin signature"
+	depends on FS_VERITY && FS_VERITY_BUILTIN_SIGNATURES
+	help
+	  This option enables the 'fsverity_signature' property within IPE
+	  policies. The property evaluates to TRUE when a file is fsverity
+	  enabled and it has a valid builtin signature whose signing cert
+	  is in the .fs-verity keyring.
+
+	  If unsure, answer Y.
+
 endmenu
 
 endif
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index 2c98520267c1..bd258f887e6f 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -53,6 +53,9 @@ static const char *const audit_prop_names[__IPE_PROP_MAX] = {
 	"dmverity_roothash=",
 	"dmverity_signature=FALSE",
 	"dmverity_signature=TRUE",
+	"fsverity_digest=",
+	"fsverity_signature=FALSE",
+	"fsverity_signature=TRUE",
 };
 
 /**
@@ -66,6 +69,17 @@ static void audit_dmv_roothash(struct audit_buffer *ab, const void *rh)
 	ipe_digest_audit(ab, rh);
 }
 
+/**
+ * audit_fsv_digest() - audit the digest of a fsverity_digest property.
+ * @ab: Supplies a pointer to the audit_buffer to append to.
+ * @d: Supplies a pointer to the digest structure.
+ */
+static void audit_fsv_digest(struct audit_buffer *ab, const void *d)
+{
+	audit_log_format(ab, "%s", audit_prop_names[IPE_PROP_FSV_DIGEST]);
+	ipe_digest_audit(ab, d);
+}
+
 /**
  * audit_rule() - audit an IPE policy rule.
  * @ab: Supplies a pointer to the audit_buffer to append to.
@@ -82,6 +96,9 @@ static void audit_rule(struct audit_buffer *ab, const struct ipe_rule *r)
 		case IPE_PROP_DMV_ROOTHASH:
 			audit_dmv_roothash(ab, ptr->value);
 			break;
+		case IPE_PROP_FSV_DIGEST:
+			audit_fsv_digest(ab, ptr->value);
+			break;
 		default:
 			audit_log_format(ab, "%s", audit_prop_names[ptr->type]);
 			break;
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 8f4f63088206..dca1b1f312b4 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -10,6 +10,7 @@
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
 #include <linux/moduleparam.h>
+#include <linux/fsverity.h>
 
 #include "ipe.h"
 #include "eval.h"
@@ -51,6 +52,36 @@ static void build_ipe_bdev_ctx(struct ipe_eval_ctx *ctx, const struct inode *con
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+static void build_ipe_inode_blob_ctx(struct ipe_eval_ctx *ctx,
+				     const struct inode *const ino)
+{
+	ctx->ipe_inode = ipe_inode(ctx->ino);
+}
+#else
+static inline void build_ipe_inode_blob_ctx(struct ipe_eval_ctx *ctx,
+					    const struct inode *const ino)
+{
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
+
+/**
+ * build_ipe_inode_ctx() - Build inode fields of an evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @ino: Supplies the inode struct of the file that triggered the IPE event.
+ */
+static void build_ipe_inode_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+	ctx->ino = ino;
+	build_ipe_inode_blob_ctx(ctx, ino);
+}
+#else
+static void build_ipe_inode_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
 /**
  * ipe_build_eval_ctx() - Build an ipe evaluation context.
  * @ctx: Supplies a pointer to the context to be populated.
@@ -63,13 +94,17 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 			enum ipe_op_type op,
 			enum ipe_hook_type hook)
 {
+	struct inode *ino;
+
 	ctx->file = file;
 	ctx->op = op;
 	ctx->hook = hook;
 
 	if (file) {
 		build_ipe_sb_ctx(ctx, file);
-		build_ipe_bdev_ctx(ctx, d_real_inode(file->f_path.dentry));
+		ino = d_real_inode(file->f_path.dentry);
+		build_ipe_bdev_ctx(ctx, ino);
+		build_ipe_inode_ctx(ctx, ino);
 	}
 }
 
@@ -150,6 +185,86 @@ static bool evaluate_dmv_sig_true(const struct ipe_eval_ctx *const ctx)
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY_SIGNATURE */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+/**
+ * evaluate_fsv_digest() - Evaluate @ctx against a fsv digest property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ * @p: Supplies a pointer to the property being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx match the @p
+ * * %false	- The current @ctx doesn't match the @p
+ */
+static bool evaluate_fsv_digest(const struct ipe_eval_ctx *const ctx,
+				struct ipe_prop *p)
+{
+	enum hash_algo alg;
+	u8 digest[FS_VERITY_MAX_DIGEST_SIZE];
+	struct digest_info info;
+
+	if (!ctx->ino)
+		return false;
+	if (!fsverity_get_digest((struct inode *)ctx->ino,
+				 digest,
+				 NULL,
+				 &alg))
+		return false;
+
+	info.alg = hash_algo_name[alg];
+	info.digest = digest;
+	info.digest_len = hash_digest_size[alg];
+
+	return ipe_digest_eval(p->value, &info);
+}
+#else
+static bool evaluate_fsv_digest(const struct ipe_eval_ctx *const ctx,
+				struct ipe_prop *p)
+{
+	return false;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+/**
+ * evaluate_fsv_sig_false() - Evaluate @ctx against a fsv sig false property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx match the property
+ * * %false	- The current @ctx doesn't match the property
+ */
+static bool evaluate_fsv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return !ctx->ino ||
+	       !IS_VERITY(ctx->ino) ||
+	       !ctx->ipe_inode ||
+	       !ctx->ipe_inode->fs_verity_signed;
+}
+
+/**
+ * evaluate_fsv_sig_true() - Evaluate @ctx against a fsv sig true property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true - The current @ctx match the property
+ * * %false - The current @ctx doesn't match the property
+ */
+static bool evaluate_fsv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return !evaluate_fsv_sig_false(ctx);
+}
+#else
+static bool evaluate_fsv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+
+static bool evaluate_fsv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
+
 /**
  * evaluate_property() - Analyze @ctx against a rule property.
  * @ctx: Supplies a pointer to the context to be evaluated.
@@ -176,6 +291,12 @@ static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 		return evaluate_dmv_sig_false(ctx);
 	case IPE_PROP_DMV_SIG_TRUE:
 		return evaluate_dmv_sig_true(ctx);
+	case IPE_PROP_FSV_DIGEST:
+		return evaluate_fsv_digest(ctx, p);
+	case IPE_PROP_FSV_SIG_FALSE:
+		return evaluate_fsv_sig_false(ctx);
+	case IPE_PROP_FSV_SIG_TRUE:
+		return evaluate_fsv_sig_true(ctx);
 	default:
 		return false;
 	}
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 4901df0e1369..fef65a36468c 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -31,6 +31,12 @@ struct ipe_bdev {
 };
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+struct ipe_inode {
+	bool fs_verity_signed;
+};
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
+
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
 	enum ipe_hook_type hook;
@@ -40,6 +46,12 @@ struct ipe_eval_ctx {
 #ifdef CONFIG_IPE_PROP_DM_VERITY
 	const struct ipe_bdev *ipe_bdev;
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+	const struct inode *ino;
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+	const struct ipe_inode *ipe_inode;
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
 };
 
 enum ipe_match {
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index bc0a7268179d..df5057b8670f 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -282,3 +282,31 @@ int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type typ
 	return -EINVAL;
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+/**
+ * ipe_inode_setintegrity() - save integrity data from an inode to IPE's LSM blob.
+ * @inode: The inode to source the security blob from.
+ * @type: Supplies the integrity type.
+ * @value: The value to be stored.
+ * @size: The size of @value.
+ *
+ * This hook is currently used to save the existence of a validated fs-verity
+ * builtin signature into LSM blob.
+ *
+ * Return: %0 on success. If an error occurs, the function will return the
+ * -errno.
+ */
+int ipe_inode_setintegrity(struct inode *inode, enum lsm_integrity_type type,
+			   const void *value, size_t size)
+{
+	struct ipe_inode *inode_sec = ipe_inode(inode);
+
+	if (type == LSM_INT_FSVERITY_BUILTINSIG_VALID) {
+		inode_sec->fs_verity_signed = size > 0 && value;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index 4d585fb6ada3..b45c0243107b 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -9,6 +9,7 @@
 #include <linux/binfmts.h>
 #include <linux/security.h>
 #include <linux/blk_types.h>
+#include <linux/fsverity.h>
 
 enum ipe_hook_type {
 	IPE_HOOK_BPRM_CHECK = 0,
@@ -43,4 +44,9 @@ int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type typ
 			  const void *value, size_t len);
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+int ipe_inode_setintegrity(struct inode *inode, enum lsm_integrity_type type,
+			   const void *value, size_t size);
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
+
 #endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 99cb42caa63a..da79f66b0010 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -16,6 +16,9 @@ static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 #ifdef CONFIG_IPE_PROP_DM_VERITY
 	.lbs_bdev = sizeof(struct ipe_bdev),
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+	.lbs_inode = sizeof(struct ipe_inode),
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
 };
 
 static const struct lsm_id ipe_lsmid = {
@@ -35,6 +38,13 @@ struct ipe_bdev *ipe_bdev(struct block_device *b)
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+struct ipe_inode *ipe_inode(const struct inode *inode)
+{
+	return inode->i_security + ipe_blobs.lbs_inode;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
+
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
 	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
@@ -46,6 +56,9 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bdev_free_security, ipe_bdev_free_security),
 	LSM_HOOK_INIT(bdev_setintegrity, ipe_bdev_setintegrity),
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+	LSM_HOOK_INIT(inode_setintegrity, ipe_inode_setintegrity),
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
 };
 
 /**
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index 01f46286e383..fb37513812dd 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -19,5 +19,8 @@ extern bool ipe_enabled;
 #ifdef CONFIG_IPE_PROP_DM_VERITY
 struct ipe_bdev *ipe_bdev(struct block_device *b);
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG
+struct ipe_inode *ipe_inode(const struct inode *inode);
+#endif /* CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG */
 
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 26776092c710..5bfbdbddeef8 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -36,6 +36,9 @@ enum ipe_prop_type {
 	IPE_PROP_DMV_ROOTHASH,
 	IPE_PROP_DMV_SIG_FALSE,
 	IPE_PROP_DMV_SIG_TRUE,
+	IPE_PROP_FSV_DIGEST,
+	IPE_PROP_FSV_SIG_FALSE,
+	IPE_PROP_FSV_SIG_TRUE,
 	__IPE_PROP_MAX
 };
 
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 71c84b293029..5a182c006b0e 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -278,6 +278,9 @@ static const match_table_t property_tokens = {
 	{IPE_PROP_DMV_ROOTHASH,		"dmverity_roothash=%s"},
 	{IPE_PROP_DMV_SIG_FALSE,	"dmverity_signature=FALSE"},
 	{IPE_PROP_DMV_SIG_TRUE,		"dmverity_signature=TRUE"},
+	{IPE_PROP_FSV_DIGEST,		"fsverity_digest=%s"},
+	{IPE_PROP_FSV_SIG_FALSE,	"fsverity_signature=FALSE"},
+	{IPE_PROP_FSV_SIG_TRUE,		"fsverity_signature=TRUE"},
 	{IPE_PROP_INVALID,		NULL}
 };
 
@@ -310,6 +313,7 @@ static int parse_property(char *t, struct ipe_rule *r)
 
 	switch (token) {
 	case IPE_PROP_DMV_ROOTHASH:
+	case IPE_PROP_FSV_DIGEST:
 		dup = match_strdup(&args[0]);
 		if (!dup) {
 			rc = -ENOMEM;
@@ -325,6 +329,8 @@ static int parse_property(char *t, struct ipe_rule *r)
 	case IPE_PROP_BOOT_VERIFIED_TRUE:
 	case IPE_PROP_DMV_SIG_FALSE:
 	case IPE_PROP_DMV_SIG_TRUE:
+	case IPE_PROP_FSV_SIG_FALSE:
+	case IPE_PROP_FSV_SIG_TRUE:
 		p->type = token;
 		break;
 	default:
-- 
2.44.0



* [PATCH v18 15/21] security: add security_inode_setintegrity() hook
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (13 preceding siblings ...)
  2024-05-03 22:32 31% ` [PATCH v18 14/21] ipe: add support for dm-verity as a trust provider Fan Wu
@ 2024-05-03 22:32 65% ` Fan Wu
  2024-05-03 22:32 48% ` [PATCH v18 16/21] fsverity: expose verified fsverity built-in signatures to LSMs Fan Wu
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

This patch introduces a new hook to save an inode's integrity
data. For example, for fsverity-enabled files, LSMs can use this hook to
save the verified fsverity builtin signature into the inode's security
blob, and LSMs can make access decisions based on the data inside
the signature, like the signer certificate.
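
As a rough illustration of how an LSM might consume this hook, here is a
minimal userspace sketch modeled on the IPE handler in patch 17/21. The names
model_inode_blob, model_inode_setintegrity() and the MODEL_INT_* values are
hypothetical stand-ins, not kernel symbols; the real hook takes a
struct inode * and an enum lsm_integrity_type.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for enum lsm_integrity_type. */
enum model_integrity_type {
	MODEL_INT_BUILTINSIG_VALID,
	MODEL_INT_OTHER,
};

/* Hypothetical per-inode blob owned by the LSM. */
struct model_inode_blob {
	bool fs_verity_signed;
};

/*
 * Record whether a verified signature was supplied. Integrity types this
 * LSM does not understand are rejected with -EINVAL.
 */
int model_inode_setintegrity(struct model_inode_blob *blob,
			     enum model_integrity_type type,
			     const void *value, size_t size)
{
	assert(blob != NULL);

	if (type == MODEL_INT_BUILTINSIG_VALID) {
		/* A NULL value or zero size clears the flag. */
		blob->fs_verity_signed = size > 0 && value;
		return 0;
	}

	return -EINVAL;
}
```

Passing a NULL value with size 0 resets the flag in this sketch, which is one
way an LSM could honor the "free previously saved data" convention described
in the hook's kernel-doc.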

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v14:
  + Not present

v15:
  + Introduced

v16:
  + Switch to call_int_hook()

v17:
  + Fix a typo

v18:
  + No changes
---
 include/linux/lsm_hook_defs.h |  2 ++
 include/linux/security.h      | 10 ++++++++++
 security/security.c           | 20 ++++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index b391a7f13053..6f746dfdb28b 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -177,6 +177,8 @@ LSM_HOOK(int, 0, inode_listsecurity, struct inode *inode, char *buffer,
 LSM_HOOK(void, LSM_RET_VOID, inode_getsecid, struct inode *inode, u32 *secid)
 LSM_HOOK(int, 0, inode_copy_up, struct dentry *src, struct cred **new)
 LSM_HOOK(int, -EOPNOTSUPP, inode_copy_up_xattr, const char *name)
+LSM_HOOK(int, 0, inode_setintegrity, struct inode *inode,
+	 enum lsm_integrity_type type, const void *value, size_t size)
 LSM_HOOK(int, 0, kernfs_init_security, struct kernfs_node *kn_dir,
 	 struct kernfs_node *kn)
 LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
diff --git a/include/linux/security.h b/include/linux/security.h
index d2ddd7c63b62..568d96012a48 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -410,6 +410,9 @@ int security_inode_listsecurity(struct inode *inode, char *buffer, size_t buffer
 void security_inode_getsecid(struct inode *inode, u32 *secid);
 int security_inode_copy_up(struct dentry *src, struct cred **new);
 int security_inode_copy_up_xattr(const char *name);
+int security_inode_setintegrity(struct inode *inode,
+				enum lsm_integrity_type type, const void *value,
+				size_t size);
 int security_kernfs_init_security(struct kernfs_node *kn_dir,
 				  struct kernfs_node *kn);
 int security_file_permission(struct file *file, int mask);
@@ -1026,6 +1029,13 @@ static inline int security_inode_copy_up(struct dentry *src, struct cred **new)
 	return 0;
 }
 
+static inline int security_inode_setintegrity(struct inode *inode,
+					      enum lsm_integrity_type type,
+					      const void *value, size_t size)
+{
+	return 0;
+}
+
 static inline int security_kernfs_init_security(struct kernfs_node *kn_dir,
 						struct kernfs_node *kn)
 {
diff --git a/security/security.c b/security/security.c
index 3a7724c3dd76..2c20635a589b 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2681,6 +2681,26 @@ int security_inode_copy_up_xattr(const char *name)
 }
 EXPORT_SYMBOL(security_inode_copy_up_xattr);
 
+/**
+ * security_inode_setintegrity() - Set the inode's integrity data
+ * @inode: inode
+ * @type: type of integrity, e.g. hash digest, signature, etc
+ * @value: the integrity value
+ * @size: size of the integrity value
+ *
+ * Register a verified integrity measurement of an inode with LSMs.
+ * LSMs should free the previously saved data if @value is NULL.
+ *
+ * Return: Returns 0 on success, negative values on failure.
+ */
+int security_inode_setintegrity(struct inode *inode,
+				enum lsm_integrity_type type, const void *value,
+				size_t size)
+{
+	return call_int_hook(inode_setintegrity, inode, type, value, size);
+}
+EXPORT_SYMBOL(security_inode_setintegrity);
+
 /**
  * security_kernfs_init_security() - Init LSM context for a kernfs node
  * @kn_dir: parent kernfs node
-- 
2.44.0



* [PATCH v18 16/21] fsverity: expose verified fsverity built-in signatures to LSMs
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (14 preceding siblings ...)
  2024-05-03 22:32 65% ` [PATCH v18 15/21] security: add security_inode_setintegrity() hook Fan Wu
@ 2024-05-03 22:32 48% ` Fan Wu
  2024-05-03 22:32 38% ` [PATCH v18 17/21] ipe: enable support for fs-verity as a trust provider Fan Wu
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu, Deven Bowers

This patch enhances fsverity's capabilities to support both integrity and
authenticity protection by introducing the exposure of built-in
signatures through a new LSM hook. This functionality allows LSMs,
e.g. IPE, to enforce policies based on the authenticity and integrity of
files, specifically focusing on built-in fsverity signatures. It enables
a policy enforcement layer within LSMs for fsverity, offering granular
control over the usage of authenticity claims. For instance, a policy
could be established to permit the execution of all files with verified
built-in fsverity signatures while restricting kernel module loading
from specified fsverity files via fsverity digests.
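
The signature property described above reduces to a small predicate. The
following userspace sketch mirrors the shape of evaluate_fsv_sig_false() and
evaluate_fsv_sig_true() from patch 17/21; the model_* types and field names
are hypothetical simplifications of the kernel context structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-inode blob set by the setintegrity hook. */
struct model_inode_blob {
	bool fs_verity_signed;
};

/* Hypothetical, simplified evaluation context. */
struct model_eval_ctx {
	bool has_inode;                      /* models ctx->ino != NULL */
	bool verity_enabled;                 /* models IS_VERITY(ctx->ino) */
	const struct model_inode_blob *blob; /* models ctx->ipe_inode */
};

/* True when the file cannot be shown to carry a verified signature. */
bool model_fsv_sig_false(const struct model_eval_ctx *ctx)
{
	assert(ctx != NULL);

	return !ctx->has_inode ||
	       !ctx->verity_enabled ||
	       !ctx->blob ||
	       !ctx->blob->fs_verity_signed;
}

/* The TRUE property is the exact negation of the FALSE property. */
bool model_fsv_sig_true(const struct model_eval_ctx *ctx)
{
	return !model_fsv_sig_false(ctx);
}
```

Any missing piece of evidence (no inode, verity not enabled, no blob) makes
the FALSE property match, so in this sketch a file is treated as signed only
when every check passes.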

The introduction of a security_inode_setintegrity() hook call within
fsverity's workflow ensures that the verified built-in signature of a file
is exposed to LSMs. This enables LSMs to recognize and label fsverity files
that contain a verified built-in fsverity signature. This hook is invoked
in fsverity_verify_signature() only after the signature has been verified
against fsverity's keyring. This mechanism is
crucial for maintaining system security, as it operates in kernel space,
effectively thwarting attempts by malicious binaries to bypass user space
stack interactions.
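
The ordering guarantee can be sketched in userspace as follows. This is an
illustrative model only: model_verify_then_expose() and the callbacks are
hypothetical names, and the real code path lives in
fsverity_verify_signature().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type shared by the verifier and the LSM hook. */
typedef int (*model_sig_fn)(const void *sig, size_t sig_size);

/* Counts how many times the expose step ran. */
int model_expose_calls;

int model_verify_ok(const void *sig, size_t sig_size)
{
	(void)sig;
	(void)sig_size;
	return 0;
}

int model_verify_bad(const void *sig, size_t sig_size)
{
	(void)sig;
	(void)sig_size;
	return -EPERM;
}

int model_expose_record(const void *sig, size_t sig_size)
{
	(void)sig;
	(void)sig_size;
	model_expose_calls++;
	return 0;
}

/*
 * Verify first, expose second. The expose callback is reached only when
 * verification returned 0; an error from either step is propagated.
 */
int model_verify_then_expose(const void *sig, size_t sig_size,
			     model_sig_fn verify, model_sig_fn expose)
{
	int err;

	assert(verify != NULL && expose != NULL);

	err = verify(sig, sig_size);
	if (err)
		return err;

	return expose(sig, sig_size);
}
```

Because the expose step sits behind the verification result, an LSM in this
model never observes a signature that failed verification.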

The second to last commit in this patch set will add a link to the IPE
documentation in fsverity.rst.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v6:
  + Not present

v7:
  + Introduced

v8:
  + Split fs/verity/ changes and security/ changes into separate patches
  + Change signature of fsverity_create_info to accept non-const inode
  + Change signature of fsverity_verify_signature to accept non-const inode
  + Don't cast-away const from inode.
  + Digest functionality dropped in favor of:
    ("fs-verity: define a function to return the integrity protected
      file digest")
  + Reworded commit description and title to match changes.
  + Fix a bug wherein no LSM implements the particular fsverity @name
    (or LSM is disabled), and returns -EOPNOTSUPP, causing errors.

v9:
  + No changes

v10:
  + Rename the signature blob key
  + Cleanup redundant code
  + Make the hook call depend on CONFIG_FS_VERITY_BUILTIN_SIGNATURES

v11:
  + No changes

v12:
  + Add constification to the hook call

v13:
  + No changes

v14:
  + Add doc/comment to built-in signature verification

v15:
  + Add more docs related to IPE
  + Switch the hook call to security_inode_setintegrity()

v16:
  + Explicitly mention "fsverity builtin signatures" in the commit
    message
  + Amend documentation in fsverity.rst
  + Fix format issue
  + Change enum name

v17:
  + Fix various documentation issues
  + Use new enum name LSM_INT_FSVERITY_BUILTINSIG_VALID

v18:
  + Fix typos
  + Move the inode_setintegrity hook call into fsverity_verify_signature()
---
 Documentation/filesystems/fsverity.rst | 23 +++++++++++++++++++++--
 fs/verity/signature.c                  | 21 ++++++++++++++++++++-
 include/linux/security.h               |  1 +
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 13e4b18e5dbb..362b7a5dc300 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -86,6 +86,14 @@ authenticating fs-verity file hashes include:
   signature in their "security.ima" extended attribute, as controlled
   by the IMA policy.  For more information, see the IMA documentation.
 
+- Integrity Policy Enforcement (IPE).  IPE supports enforcing access
+  control decisions based on immutable security properties of files,
+  including those protected by fs-verity's built-in signatures.
+  "IPE policy" specifically allows for the authorization of fs-verity
+  files using properties ``fsverity_digest`` for identifying
+  files by their verity digest, and ``fsverity_signature`` to authorize
+  files with a verified fs-verity built-in signature.
+
 - Trusted userspace code in combination with `Built-in signature
   verification`_.  This approach should be used only with great care.
 
@@ -457,7 +465,11 @@ Enabling this option adds the following:
    On success, the ioctl persists the signature alongside the Merkle
    tree.  Then, any time the file is opened, the kernel verifies the
    file's actual digest against this signature, using the certificates
-   in the ".fs-verity" keyring.
+   in the ".fs-verity" keyring. This verification happens as long as the
+   file's signature exists, regardless of the state of the sysctl variable
+   "fs.verity.require_signatures" described in the next item. The IPE LSM
+   relies on this behavior to recognize and label fsverity files
+   that contain a verified built-in fsverity signature.
 
 3. A new sysctl "fs.verity.require_signatures" is made available.
    When set to 1, the kernel requires that all verity files have a
@@ -481,7 +493,7 @@ be carefully considered before using them:
 
 - Builtin signature verification does *not* make the kernel enforce
   that any files actually have fs-verity enabled.  Thus, it is not a
-  complete authentication policy.  Currently, if it is used, the only
+  complete authentication policy.  Currently, if it is used, one
   way to complete the authentication policy is for trusted userspace
   code to explicitly check whether files have fs-verity enabled with a
   signature before they are accessed.  (With
@@ -490,6 +502,13 @@ be carefully considered before using them:
   could just store the signature alongside the file and verify it
   itself using a cryptographic library, instead of using this feature.
 
+- Another approach is to utilize fs-verity builtin signature
+  verification in conjunction with the IPE LSM, which supports defining
+  a kernel-enforced, system-wide authentication policy that allows only
+  files with a verified fs-verity builtin signature to perform certain
+  operations, such as execution. Note that IPE doesn't require
+  fs.verity.require_signatures=1.
+
 - A file's builtin signature can only be set at the same time that
   fs-verity is being enabled on the file.  Changing or deleting the
   builtin signature later requires re-creating the file.
diff --git a/fs/verity/signature.c b/fs/verity/signature.c
index 90c07573dd77..727e4a22c3d0 100644
--- a/fs/verity/signature.c
+++ b/fs/verity/signature.c
@@ -17,6 +17,7 @@
 
 #include <linux/cred.h>
 #include <linux/key.h>
+#include <linux/security.h>
 #include <linux/slab.h>
 #include <linux/verification.h>
 
@@ -41,7 +42,11 @@ static struct key *fsverity_keyring;
  * @sig_size: size of signature in bytes, or 0 if no signature
  *
  * If the file includes a signature of its fs-verity file digest, verify it
- * against the certificates in the fs-verity keyring.
+ * against the certificates in the fs-verity keyring. Note that signatures
+ * are verified regardless of the state of the 'fsverity_require_signatures'
+ * variable and the LSM subsystem relies on this behavior to help enforce
+ * file integrity policies. Please discuss changes with the LSM list
+ * (thank you!).
  *
  * Return: 0 on success (signature valid or not required); -errno on failure
  */
@@ -106,6 +111,20 @@ int fsverity_verify_signature(const struct fsverity_info *vi,
 		return err;
 	}
 
+	/*
+	 * We need to cast out const in order to set inode data
+	 */
+	err = security_inode_setintegrity((struct inode *) inode,
+					  LSM_INT_FSVERITY_BUILTINSIG_VALID,
+					  signature,
+					  le32_to_cpu(sig_size));
+
+	if (err) {
+		fsverity_err(inode, "Error %d exposing file signature to LSMs",
+			     err);
+		return err;
+	}
+
 	return 0;
 }
 
diff --git a/include/linux/security.h b/include/linux/security.h
index 568d96012a48..ab5f24040abc 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -92,6 +92,7 @@ struct dm_verity_digest {
 enum lsm_integrity_type {
 	LSM_INT_DMVERITY_SIG_VALID,
 	LSM_INT_DMVERITY_ROOTHASH,
+	LSM_INT_FSVERITY_BUILTINSIG_VALID,
 };
 
 /*
-- 
2.44.0



* [PATCH v18 12/21] dm: add finalize hook to target_type
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (10 preceding siblings ...)
  2024-05-03 22:32 44% ` [PATCH v18 11/21] block,lsm: add LSM blob and new LSM hooks for block device Fan Wu
@ 2024-05-03 22:32 70% ` Fan Wu
  2024-05-03 22:32 50% ` [PATCH v18 13/21] dm verity: expose root hash digest and signature data to LSMs Fan Wu
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

This patch adds a target finalize hook.

The hook is triggered just before activating an inactive table of a
mapped device. If it returns an error, __bind is cancelled.

The dm-verity target will use this hook to attach the dm-verity's
roothash metadata to the block_device struct of the mapped device.
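
The control flow added to __bind can be approximated by the userspace sketch
below. The model_* names are hypothetical; the real loop iterates over
struct dm_target entries obtained with dm_table_get_target().

```c
#include <assert.h>
#include <stddef.h>

struct model_target;
typedef int (*model_finalize_fn)(struct model_target *t);

/* Hypothetical, simplified target: the finalize callback is optional. */
struct model_target {
	model_finalize_fn finalize; /* may be NULL */
	int finalized;              /* set to 1 once finalize succeeded */
	int fail_with;              /* error finalize should return, 0 for none */
};

/* Example finalize callback used to exercise the loop. */
int model_finalize(struct model_target *t)
{
	if (t->fail_with)
		return t->fail_with;
	t->finalized = 1;
	return 0;
}

/*
 * Returns 0 if every target finalized and the table may be activated,
 * or the first error, in which case activation must be abandoned.
 */
int model_finalize_table(struct model_target *targets, size_t num)
{
	assert(targets != NULL || num == 0);

	for (size_t i = 0; i < num; i++) {
		struct model_target *t = &targets[i];
		int ret;

		if (!t->finalize)
			continue; /* targets without the hook are skipped */

		ret = t->finalize(t);
		if (ret)
			return ret;
	}

	return 0;
}
```

Targets that do not implement the hook are unaffected, so in this model
existing target types keep their behavior.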

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v10:
  + Not present

v11:
  + Introduced

v12:
  + No changes

v13:
  + No changes

v14:
  + Add documentation

v15:
  + No changes

v16:
  + No changes

v17:
  + No changes

v18:
  + No changes
---
 drivers/md/dm.c               | 12 ++++++++++++
 include/linux/device-mapper.h |  9 +++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7d0746b37c8e..a748c3735156 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2276,6 +2276,18 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		goto out;
 	}
 
+	for (unsigned int i = 0; i < t->num_targets; i++) {
+		struct dm_target *ti = dm_table_get_target(t, i);
+
+		if (ti->type->finalize) {
+			ret = ti->type->finalize(ti);
+			if (ret) {
+				old_map = ERR_PTR(ret);
+				goto out;
+			}
+		}
+	}
+
 	old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
 	rcu_assign_pointer(md->map, (void *)t);
 	md->immutable_target_type = dm_table_get_immutable_target_type(t);
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 82b2195efaca..ad368904b1d5 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -160,6 +160,14 @@ typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
  */
 typedef size_t (*dm_dax_recovery_write_fn)(struct dm_target *ti, pgoff_t pgoff,
 		void *addr, size_t bytes, struct iov_iter *i);
+/*
+ * This hook allows DM targets in an inactive table to complete their setup
+ * before the table is made active.
+ * Returns:
+ *  < 0 : error
+ *  = 0 : success
+ */
+typedef int (*dm_finalize_fn) (struct dm_target *target);
 
 void dm_error(const char *message);
 
@@ -210,6 +218,7 @@ struct target_type {
 	dm_dax_direct_access_fn direct_access;
 	dm_dax_zero_page_range_fn dax_zero_page_range;
 	dm_dax_recovery_write_fn dax_recovery_write;
+	dm_finalize_fn finalize;
 
 	/* For internal device-mapper use. */
 	struct list_head list;
-- 
2.44.0



* [PATCH v18 13/21] dm verity: expose root hash digest and signature data to LSMs
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (11 preceding siblings ...)
  2024-05-03 22:32 70% ` [PATCH v18 12/21] dm: add finalize hook to target_type Fan Wu
@ 2024-05-03 22:32 50% ` Fan Wu
  2024-05-03 22:32 31% ` [PATCH v18 14/21] ipe: add support for dm-verity as a trust provider Fan Wu
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

dm-verity provides a strong guarantee of a block device's integrity. As
a generic way to check the integrity of a block device, it provides
those integrity guarantees to its higher layers, including the filesystem
level.

An LSM that controls access to a resource on the system based on the
available integrity claims can use this transitive property of
dm-verity, by querying the underlying block_device of a particular
file.

The digest and signature information need to be stored in the block
device to fulfill the next requirement of authorization via LSM policy.
This will enable the LSM to perform revocation of devices that are still
mounted, prohibiting execution of files that are no longer authorized
by the LSM in question.

This patch adds two security hook calls in dm-verity to expose the
dm-verity roothash and the roothash signature to LSMs. The hook calls
depend on CONFIG_SECURITY.
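
The two hook calls are made in sequence, and a failure of the second one
undoes the first. A minimal userspace sketch of that pattern, using
hypothetical model_* names in place of security_bdev_setintegrity() and the
dm-verity structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the LSM-side state for one block device. */
struct model_bdev_blob {
	const void *roothash;
	const void *sig;
	int fail_sig; /* nonzero: the signature step fails with this error */
};

/* Store the digest; passing NULL clears it. */
int model_set_roothash(struct model_bdev_blob *b, const void *digest)
{
	b->roothash = digest;
	return 0;
}

/* Store the signature, or fail if the model is configured to do so. */
int model_set_signature(struct model_bdev_blob *b, const void *sig)
{
	if (b->fail_sig)
		return b->fail_sig;
	b->sig = sig;
	return 0;
}

/* Expose the root hash first, then the signature, rolling back on error. */
int model_verity_finalize(struct model_bdev_blob *b, const void *digest,
			  const void *sig)
{
	int r;

	assert(b != NULL);

	r = model_set_roothash(b, digest);
	if (r)
		return r;

	r = model_set_signature(b, sig);
	if (r) {
		/* Undo the first step so no half-populated state remains. */
		model_set_roothash(b, NULL);
		return r;
	}

	return 0;
}
```

The rollback keeps the model's blob either fully populated or empty, so a
policy never evaluates a root hash whose signature step failed.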

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
---
v2:
  + No Changes

v3:
  + No changes

v4:
  + No changes

v5:
  + No changes

v6:
  + Fix an improper cleanup that can result in
    a leak

v7:
  + Squash patch 08/12, 10/12 to [11/16]
  + Use part0 for block_device, to retrieve the block_device, when
    calling security_bdev_setsecurity

v8:
  + Undo squash of 08/12, 10/12 - separating drivers/md/ from
    security/ & block/
  + Use common-audit function for dmverity_signature.
  + Change implementation for storing the dm-verity digest to use the
    newly introduced dm_verity_digest structure introduced in patch
    14/20.
  + Create new structure, dm_verity_digest, containing digest algorithm,
    size, and digest itself to pass to the LSM layer. V7 was missing the
    algorithm.
  + Create an associated public header containing this new structure and
    the key values for the LSM hook, specific to dm-verity.
  + Additional information added to commit, discussing the layering of
    the changes and how the information passed will be used.

v9:
  + No changes

v10:
  + No changes

v11:
  + Add an optional field to save signature
  + Move the security hook call to the new finalize hook

v12:
  + No changes

v13:
  + No changes

v14:
  + Correct code format
  + Remove unnecessary header and switch to dm_disk()

v15:
  + Refactor security_bdev_setsecurity() to security_bdev_setintegrity()
  + Remove unnecessary headers

v16:
  + Use kmemdup to duplicate signature
  + Clean up lsm blob data in error case

v17:
  + Switch to depend on CONFIG_SECURITY
  + Use new enum name LSM_INT_DMVERITY_SIG_VALID

v18:
  + Amend commit title
  + Fix incorrect error handling
  + Make signature exposure depend on CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG
  + Fix inaccurate comment
  + Remove include/linux/dm-verity.h
  + use crypto_ahash_alg_name(v->tfm) instead of v->alg_name
---
 drivers/md/dm-verity-target.c | 100 ++++++++++++++++++++++++++++++++++
 drivers/md/dm-verity.h        |   6 ++
 include/linux/security.h      |   9 ++-
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index bb5da66da4c1..3afab8f2c509 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -22,6 +22,7 @@
 #include <linux/scatterlist.h>
 #include <linux/string.h>
 #include <linux/jump_label.h>
+#include <linux/security.h>
 
 #define DM_MSG_PREFIX			"verity"
 
@@ -1017,6 +1018,38 @@ static void verity_io_hints(struct dm_target *ti, struct queue_limits *limits)
 	blk_limits_io_min(limits, limits->logical_block_size);
 }
 
+#ifdef CONFIG_SECURITY
+
+static int verity_init_sig(struct dm_verity *v, const void *sig,
+			   size_t sig_size)
+{
+	v->sig_size = sig_size;
+	v->root_digest_sig = kmemdup(sig, v->sig_size, GFP_KERNEL);
+	if (!v->root_digest_sig)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void verity_free_sig(struct dm_verity *v)
+{
+	kfree(v->root_digest_sig);
+}
+
+#else
+
+static inline int verity_init_sig(struct dm_verity *v, const void *sig,
+				  size_t sig_size)
+{
+	return 0;
+}
+
+static inline void verity_free_sig(struct dm_verity *v)
+{
+}
+
+#endif /* CONFIG_SECURITY */
+
 static void verity_dtr(struct dm_target *ti)
 {
 	struct dm_verity *v = ti->private;
@@ -1035,6 +1068,7 @@ static void verity_dtr(struct dm_target *ti)
 	kfree(v->salt);
 	kfree(v->root_digest);
 	kfree(v->zero_digest);
+	verity_free_sig(v);
 
 	if (v->tfm)
 		crypto_free_ahash(v->tfm);
@@ -1434,6 +1468,13 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		ti->error = "Root hash verification failed";
 		goto bad;
 	}
+
+	r = verity_init_sig(v, verify_args.sig, verify_args.sig_size);
+	if (r < 0) {
+		ti->error = "Cannot allocate root digest signature";
+		goto bad;
+	}
+
 	v->hash_per_block_bits =
 		__fls((1 << v->hash_dev_block_bits) / v->digest_size);
 
@@ -1584,6 +1625,62 @@ int dm_verity_get_root_digest(struct dm_target *ti, u8 **root_digest, unsigned i
 	return 0;
 }
 
+#ifdef CONFIG_SECURITY
+
+#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG
+
+static int verity_security_set_signature(struct block_device *bdev,
+					 struct dm_verity *v)
+{
+	return security_bdev_setintegrity(bdev,
+					  LSM_INT_DMVERITY_SIG_VALID,
+					  v->root_digest_sig,
+					  v->sig_size);
+}
+
+#else
+
+static inline int verity_security_set_signature(struct block_device *bdev,
+						struct dm_verity *v)
+{
+	return 0;
+}
+
+#endif /* CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG */
+
+static int verity_finalize(struct dm_target *ti)
+{
+	struct block_device *bdev;
+	struct dm_verity_digest root_digest;
+	struct dm_verity *v;
+	int r;
+
+	v = ti->private;
+	bdev = dm_disk(dm_table_get_md(ti->table))->part0;
+	root_digest.digest = v->root_digest;
+	root_digest.digest_len = v->digest_size;
+	root_digest.alg = crypto_ahash_alg_name(v->tfm);
+
+	r = security_bdev_setintegrity(bdev, LSM_INT_DMVERITY_ROOTHASH, &root_digest,
+				       sizeof(root_digest));
+	if (r)
+		return r;
+
+	r = verity_security_set_signature(bdev, v);
+	if (r)
+		goto bad;
+
+	return 0;
+
+bad:
+
+	security_bdev_setintegrity(bdev, LSM_INT_DMVERITY_ROOTHASH, NULL, 0);
+
+	return r;
+}
+
+#endif /* CONFIG_SECURITY */
+
 static struct target_type verity_target = {
 	.name		= "verity",
 	.features	= DM_TARGET_SINGLETON | DM_TARGET_IMMUTABLE,
@@ -1596,6 +1693,9 @@ static struct target_type verity_target = {
 	.prepare_ioctl	= verity_prepare_ioctl,
 	.iterate_devices = verity_iterate_devices,
 	.io_hints	= verity_io_hints,
+#ifdef CONFIG_SECURITY
+	.finalize	= verity_finalize,
+#endif /* CONFIG_SECURITY */
 };
 module_dm(verity);
 
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index 20b1bcf03474..2de89e0d555b 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -43,6 +43,9 @@ struct dm_verity {
 	u8 *root_digest;	/* digest of the root block */
 	u8 *salt;		/* salt: its size is salt_size */
 	u8 *zero_digest;	/* digest for a zero block */
+#ifdef CONFIG_SECURITY
+	u8 *root_digest_sig;	/* signature of the root digest */
+#endif /* CONFIG_SECURITY */
 	unsigned int salt_size;
 	sector_t data_start;	/* data offset in 512-byte sectors */
 	sector_t hash_start;	/* hash start in blocks */
@@ -56,6 +59,9 @@ struct dm_verity {
 	bool hash_failed:1;	/* set if hash of any block failed */
 	bool use_bh_wq:1;	/* try to verify in BH wq before normal work-queue */
 	unsigned int digest_size;	/* digest size for the current hash algorithm */
+#ifdef CONFIG_SECURITY
+	unsigned int sig_size;	/* root digest signature size */
+#endif /* CONFIG_SECURITY */
 	unsigned int ahash_reqsize;/* the size of temporary space for crypto */
 	enum verity_mode mode;	/* mode for handling verification errors */
 	unsigned int corrupted_errs;/* Number of errors for corrupted blocks */
diff --git a/include/linux/security.h b/include/linux/security.h
index ac0985641611..d2ddd7c63b62 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -83,8 +83,15 @@ enum lsm_event {
 	LSM_POLICY_CHANGE,
 };
 
+struct dm_verity_digest {
+	const char *alg;
+	const u8 *digest;
+	size_t digest_len;
+};
+
 enum lsm_integrity_type {
-	__LSM_INT_MAX
+	LSM_INT_DMVERITY_SIG_VALID,
+	LSM_INT_DMVERITY_ROOTHASH,
 };
 
 /*
-- 
2.44.0



* [PATCH v18 11/21] block,lsm: add LSM blob and new LSM hooks for block device
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (9 preceding siblings ...)
  2024-05-03 22:32 47% ` [PATCH v18 10/21] ipe: add permissive toggle Fan Wu
@ 2024-05-03 22:32 44% ` Fan Wu
  2024-05-03 22:32 70% ` [PATCH v18 12/21] dm: add finalize hook to target_type Fan Wu
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Some block devices have valuable security properties that are only
accessible at creation time.

For example, when creating a dm-verity block device, the dm-verity
roothash and roothash signature, which are extremely important security
metadata, are passed to the kernel. However, the roothash is saved
privately in dm-verity, which prevents the security subsystem from
easily accessing that information. Worse, in the current implementation
the roothash signature is discarded after verification, making it
impossible for the security subsystem to use it.

With this patch, an LSM blob is added to the block_device structure.
This enables the security subsystem to store security-sensitive data
related to block devices within the security blob. For example, an LSM
can use the new blob to save the roothash signature of a dm-verity
device and make access decisions based on the data inside the signature,
such as the signer certificate.

The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct super_block.
The security subsystem initializes the security blob after the struct
block_device is created, and frees the blob before the struct
block_device is deallocated.

This patch also introduces a new hook, security_bdev_setintegrity(),
to save a block device's integrity data. For dm-verity, LSMs can use
this hook to save the roothash signature into the security blob and
later make access decisions based on the data inside the signature,
such as the signer certificate.
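
The blob lifecycle described above can be illustrated with a small
user-space model. This is only a sketch, not kernel code: struct
block_device is reduced to the new security field, the demo_* names and
the demo_bdev_blob layout are hypothetical, and the root hash is passed
as raw bytes purely for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the kernel structure; only the new field is modelled. */
struct block_device {
	void *security;
};

/* Enum values as they appear later in this series. */
enum lsm_integrity_type {
	LSM_INT_DMVERITY_SIG_VALID,
	LSM_INT_DMVERITY_ROOTHASH,
};

/* Hypothetical per-LSM blob layout (an assumption for illustration). */
struct demo_bdev_blob {
	int sig_valid;
	unsigned char *root_hash;
	size_t root_hash_len;
};

/* Models blob allocation: a zeroed blob is attached to the device. */
int demo_bdev_alloc(struct block_device *bdev)
{
	bdev->security = calloc(1, sizeof(struct demo_bdev_blob));
	return bdev->security ? 0 : -ENOMEM;
}

/* Models the setintegrity hook: copy the data into the blob. */
int demo_bdev_setintegrity(struct block_device *bdev,
			   enum lsm_integrity_type type,
			   const void *value, size_t size)
{
	struct demo_bdev_blob *blob = bdev->security;
	unsigned char *copy;

	assert(blob != NULL);	/* caller must have allocated the blob */

	switch (type) {
	case LSM_INT_DMVERITY_SIG_VALID:
		blob->sig_valid = value && size > 0;
		return 0;
	case LSM_INT_DMVERITY_ROOTHASH:
		if (!value || size == 0) {
			/* A NULL value drops the previously saved data. */
			free(blob->root_hash);
			blob->root_hash = NULL;
			blob->root_hash_len = 0;
			return 0;
		}
		copy = malloc(size);
		if (!copy)
			return -ENOMEM;
		memcpy(copy, value, size);
		free(blob->root_hash);
		blob->root_hash = copy;
		blob->root_hash_len = size;
		return 0;
	}
	return -EINVAL;
}

/* Models blob teardown: release LSM-owned data, then the blob itself. */
void demo_bdev_free(struct block_device *bdev)
{
	struct demo_bdev_blob *blob = bdev->security;

	if (!blob)
		return;
	free(blob->root_hash);
	free(blob);
	bdev->security = NULL;
}
```

A caller would pair demo_bdev_alloc() with demo_bdev_free() around the
device's lifetime and call demo_bdev_setintegrity() once per integrity
type; the model keeps its own copy of the data, so the caller's buffer
may be released afterwards.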

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No Changes

v3:
  + Minor style changes from checkpatch --strict

v4:
  + No Changes

v5:
  + Allow multiple callers to call security_bdev_setsecurity

v6:
  + Simplify security_bdev_setsecurity break condition

v7:
  + Squash all dm-verity related patches to two patches,
    the additions to dm-verity/fs, and the consumption of
    the additions.

v8:
  + Split dm-verity related patches squashed in v7 to 3 commits based on
    topic:
      + New LSM hook
      + Consumption of hook outside LSM
      + Consumption of hook inside LSM.

  + change return of security_bdev_alloc / security_bdev_setsecurity
    to LSM_RET_DEFAULT instead of 0.

  + Change return code to -EOPNOTSUPP, bringing it in line with other
    setsecurity hooks.

v9:
  + Add Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
  + Remove unlikely when calling LSM hook
  + Make the security field dependent on CONFIG_SECURITY

v10:
  + No changes

v11:
  + No changes

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + Drop security_bdev_setsecurity() for new hook
    security_bdev_setintegrity() in the next commit
  + Update call_int_hook() for 260017f

v16:
  + Drop Reviewed-by tag for the new changes
  + Squash the security_bdev_setintegrity() into this commit
  + Rename enum from lsm_intgr_type to lsm_integrity_type
  + Switch to use call_int_hook() for bdev_setintegrity()
  + Correct comment
  + Fix return in security_bdev_alloc()

v17:
  + Fix a typo
  + Improve the commit subject line

v18:
  + No changes
---
 block/bdev.c                  |  7 +++
 include/linux/blk_types.h     |  3 ++
 include/linux/lsm_hook_defs.h |  5 ++
 include/linux/lsm_hooks.h     |  1 +
 include/linux/security.h      | 26 ++++++++++
 security/security.c           | 89 +++++++++++++++++++++++++++++++++++
 6 files changed, 131 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index da2a167a4d08..38f9b0e54a49 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -24,6 +24,7 @@
 #include <linux/pseudo_fs.h>
 #include <linux/uio.h>
 #include <linux/namei.h>
+#include <linux/security.h>
 #include <linux/part_stat.h>
 #include <linux/uaccess.h>
 #include <linux/stat.h>
@@ -313,6 +314,11 @@ static struct inode *bdev_alloc_inode(struct super_block *sb)
 	if (!ei)
 		return NULL;
 	memset(&ei->bdev, 0, sizeof(ei->bdev));
+
+	if (security_bdev_alloc(&ei->bdev)) {
+		kmem_cache_free(bdev_cachep, ei);
+		return NULL;
+	}
 	return &ei->vfs_inode;
 }
 
@@ -322,6 +328,7 @@ static void bdev_free_inode(struct inode *inode)
 
 	free_percpu(bdev->bd_stats);
 	kfree(bdev->bd_meta_info);
+	security_bdev_free(bdev);
 
 	if (!bdev_is_partition(bdev)) {
 		if (bdev->bd_disk && bdev->bd_disk->bdi)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index cb1526ec44b5..effe3c4e6b35 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -70,6 +70,9 @@ struct block_device {
 #endif
 	bool			bd_ro_warned;
 	int			bd_writers;
+#ifdef CONFIG_SECURITY
+	void			*security;
+#endif
 	/*
 	 * keep this out-of-line as it's both big and not needed in the fast
 	 * path
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 7db99ae75651..b391a7f13053 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -452,3 +452,8 @@ LSM_HOOK(int, 0, uring_cmd, struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_IO_URING */
 
 LSM_HOOK(void, LSM_RET_VOID, initramfs_populated, void)
+
+LSM_HOOK(int, 0, bdev_alloc_security, struct block_device *bdev)
+LSM_HOOK(void, LSM_RET_VOID, bdev_free_security, struct block_device *bdev)
+LSM_HOOK(int, 0, bdev_setintegrity, struct block_device *bdev,
+	 enum lsm_integrity_type type, const void *value, size_t size)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a2ade0ffe9e7..f1692179aa56 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -78,6 +78,7 @@ struct lsm_blob_sizes {
 	int	lbs_msg_msg;
 	int	lbs_task;
 	int	lbs_xattr_count; /* number of xattr slots in new_xattrs array */
+	int	lbs_bdev;
 };
 
 /**
diff --git a/include/linux/security.h b/include/linux/security.h
index f35af7b6cfba..ac0985641611 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -83,6 +83,10 @@ enum lsm_event {
 	LSM_POLICY_CHANGE,
 };
 
+enum lsm_integrity_type {
+	__LSM_INT_MAX
+};
+
 /*
  * These are reasons that can be passed to the security_locked_down()
  * LSM hook. Lockdown reasons that protect kernel integrity (ie, the
@@ -509,6 +513,11 @@ int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
 int security_locked_down(enum lockdown_reason what);
 int lsm_fill_user_ctx(struct lsm_ctx __user *uctx, u32 *uctx_len,
 		      void *val, size_t val_len, u64 id, u64 flags);
+int security_bdev_alloc(struct block_device *bdev);
+void security_bdev_free(struct block_device *bdev);
+int security_bdev_setintegrity(struct block_device *bdev,
+			       enum lsm_integrity_type type, const void *value,
+			       size_t size);
 #else /* CONFIG_SECURITY */
 
 static inline int call_blocking_lsm_notifier(enum lsm_event event, void *data)
@@ -1483,6 +1492,23 @@ static inline int lsm_fill_user_ctx(struct lsm_ctx __user *uctx,
 {
 	return -EOPNOTSUPP;
 }
+
+static inline int security_bdev_alloc(struct block_device *bdev)
+{
+	return 0;
+}
+
+static inline void security_bdev_free(struct block_device *bdev)
+{
+}
+
+static inline int security_bdev_setintegrity(struct block_device *bdev,
+					     enum lsm_integrity_type type,
+					     const void *value, size_t size)
+{
+	return 0;
+}
+
 #endif	/* CONFIG_SECURITY */
 
 #if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
diff --git a/security/security.c b/security/security.c
index 0db5a6b32aab..3a7724c3dd76 100644
--- a/security/security.c
+++ b/security/security.c
@@ -29,6 +29,7 @@
 #include <linux/msg.h>
 #include <linux/overflow.h>
 #include <net/flow.h>
+#include <linux/fs.h>
 
 /* How many LSMs were built into the kernel? */
 #define LSM_COUNT (__end_lsm_info - __start_lsm_info)
@@ -232,6 +233,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
 	lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
 	lsm_set_blob_size(&needed->lbs_xattr_count,
 			  &blob_sizes.lbs_xattr_count);
+	lsm_set_blob_size(&needed->lbs_bdev, &blob_sizes.lbs_bdev);
 }
 
 /* Prepare LSM for initialization. */
@@ -405,6 +407,7 @@ static void __init ordered_lsm_init(void)
 	init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
 	init_debug("task blob size       = %d\n", blob_sizes.lbs_task);
 	init_debug("xattr slots          = %d\n", blob_sizes.lbs_xattr_count);
+	init_debug("bdev blob size       = %d\n", blob_sizes.lbs_bdev);
 
 	/*
 	 * Create any kmem_caches needed for blobs
@@ -737,6 +740,28 @@ static int lsm_msg_msg_alloc(struct msg_msg *mp)
 	return 0;
 }
 
+/**
+ * lsm_bdev_alloc - allocate a composite block_device blob
+ * @bdev: the block_device that needs a blob
+ *
+ * Allocate the block_device blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_bdev_alloc(struct block_device *bdev)
+{
+	if (blob_sizes.lbs_bdev == 0) {
+		bdev->security = NULL;
+		return 0;
+	}
+
+	bdev->security = kzalloc(blob_sizes.lbs_bdev, GFP_KERNEL);
+	if (!bdev->security)
+		return -ENOMEM;
+
+	return 0;
+}
+
 /**
  * lsm_early_task - during initialization allocate a composite task blob
  * @task: the task that needs a blob
@@ -5568,6 +5593,70 @@ int security_locked_down(enum lockdown_reason what)
 }
 EXPORT_SYMBOL(security_locked_down);
 
+/**
+ * security_bdev_alloc() - Allocate a block device LSM blob
+ * @bdev: block device
+ *
+ * Allocate and attach a security structure to @bdev->security.  The
+ * security field is initialized to NULL when the bdev structure is
+ * allocated.
+ *
+ * Return: 0 if the operation was successful.
+ */
+int security_bdev_alloc(struct block_device *bdev)
+{
+	int rc = 0;
+
+	rc = lsm_bdev_alloc(bdev);
+	if (unlikely(rc))
+		return rc;
+
+	rc = call_int_hook(bdev_alloc_security, bdev);
+	if (unlikely(rc))
+		security_bdev_free(bdev);
+
+	return rc;
+}
+EXPORT_SYMBOL(security_bdev_alloc);
+
+/**
+ * security_bdev_free() - Free a block device's LSM blob
+ * @bdev: block device
+ *
+ * Deallocate the bdev security structure and set @bdev->security to NULL.
+ */
+void security_bdev_free(struct block_device *bdev)
+{
+	if (!bdev->security)
+		return;
+
+	call_void_hook(bdev_free_security, bdev);
+
+	kfree(bdev->security);
+	bdev->security = NULL;
+}
+EXPORT_SYMBOL(security_bdev_free);
+
+/**
+ * security_bdev_setintegrity() - Set the device's integrity data
+ * @bdev: block device
+ * @type: type of integrity, e.g. hash digest, signature, etc
+ * @value: the integrity value
+ * @size: size of the integrity value
+ *
+ * Register a verified integrity measurement of a bdev with LSMs.
+ * LSMs should free the previously saved data if @value is NULL.
+ *
+ * Return: Returns 0 on success, negative values on failure.
+ */
+int security_bdev_setintegrity(struct block_device *bdev,
+			       enum lsm_integrity_type type, const void *value,
+			       size_t size)
+{
+	return call_int_hook(bdev_setintegrity, bdev, type, value, size);
+}
+EXPORT_SYMBOL(security_bdev_setintegrity);
+
 #ifdef CONFIG_PERF_EVENTS
 /**
  * security_perf_event_open() - Check if a perf event open is allowed
-- 
2.44.0



* [PATCH v18 14/21] ipe: add support for dm-verity as a trust provider
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (12 preceding siblings ...)
  2024-05-03 22:32 50% ` [PATCH v18 13/21] dm verity: expose root hash digest and signature data to LSMs Fan Wu
@ 2024-05-03 22:32 31% ` Fan Wu
  2024-05-03 22:32 65% ` [PATCH v18 15/21] security: add security_inode_setintegrity() hook Fan Wu
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Allow the author of an IPE policy to indicate trust for a single
dm-verity volume, identified by its roothash, through
"dmverity_roothash", and for all signed dm-verity volumes through
"dmverity_signature".
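
As an illustration of how a "dmverity_roothash" value in the
"<alg_name>:<hex>" form can be parsed and compared, here is a minimal
user-space sketch. It is not the code added by this patch: the demo_*
names are hypothetical, libc calls stand in for the kernel helpers, and
the explicit rejection of an empty algorithm name or odd-length hex
string is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space counterpart of a parsed digest. */
struct demo_digest {
	char *alg;
	unsigned char *digest;
	size_t digest_len;
};

/* Convert one hex character to its value, or -1 if it is not hex. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

void demo_digest_free(struct demo_digest *d)
{
	if (!d)
		return;
	free(d->alg);
	free(d->digest);
	free(d);
}

/* Parse "<alg_name>:<hex>"; returns NULL on malformed input or OOM. */
struct demo_digest *demo_digest_parse(const char *s)
{
	const char *sep = strchr(s, ':');
	struct demo_digest *d;
	size_t alg_len, hex_len, i;

	if (!sep || sep == s)
		return NULL;
	alg_len = (size_t)(sep - s);
	hex_len = strlen(sep + 1);
	if (hex_len == 0 || hex_len % 2)
		return NULL;

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->alg = malloc(alg_len + 1);
	d->digest = malloc(hex_len / 2);
	if (!d->alg || !d->digest)
		goto err;

	memcpy(d->alg, s, alg_len);
	d->alg[alg_len] = '\0';
	d->digest_len = hex_len / 2;

	/* Two hex characters form one digest byte. */
	for (i = 0; i < d->digest_len; i++) {
		int hi = hex_nibble(sep[1 + 2 * i]);
		int lo = hex_nibble(sep[2 + 2 * i]);

		if (hi < 0 || lo < 0)
			goto err;
		d->digest[i] = (unsigned char)((hi << 4) | lo);
	}
	return d;
err:
	demo_digest_free(d);
	return NULL;
}

/* Digests match only if length, algorithm name, and bytes all agree. */
bool demo_digest_eval(const struct demo_digest *expected,
		      const struct demo_digest *actual)
{
	assert(expected && actual);
	return expected->digest_len == actual->digest_len &&
	       !strcmp(expected->alg, actual->alg) &&
	       !memcmp(expected->digest, actual->digest,
		       expected->digest_len);
}
```

Comparing the algorithm name as well as the bytes means two digests of
equal length produced by different algorithms never match.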

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No Changes

v3:
  + No changes

v4:
  + No changes

v5:
  + No changes

v6:
  + Fix an improper cleanup that can result in
    a leak

v7:
  + Squash patch 08/12, 10/12 to [11/16]

v8:
  + Undo squash of 08/12, 10/12 - separating drivers/md/ from security/
    & block/
  + Use common-audit function for dmverity_signature.
  + Change implementation for storing the dm-verity digest to use the
    dm_verity_digest structure newly introduced in patch 14/20.

v9:
  + Adapt to the new parser

v10:
  + Select the Kconfig when all dependencies are enabled

v11:
  + No changes

v12:
  + Refactor to use struct digest_info* instead of void*
  + Correct audit format

v13:
  + Remove the CONFIG_IPE_PROP_DM_VERITY dependency inside the parser
    to make the policy grammar independent of the kernel config.

v14:
  + No changes

v15:
  + Fix one grammar issue in KCONFIG
  + Switch to use security_bdev_setintegrity() hook

v16:
  + Refactor for enum integrity type

v17:
  + Add years to license header
  + Fix code and documentation style issues
  + Return -EINVAL in ipe_bdev_setintegrity when passed type is not
    supported
  + Use new enum name LSM_INT_DMVERITY_SIG_VALID

v18:
  + Add Kconfig IPE_PROP_DM_VERITY_SIGNATURE and make both DM_VERITY
    config auto-selected
---
 security/ipe/Kconfig         |  27 ++++++++
 security/ipe/Makefile        |   1 +
 security/ipe/audit.c         |  29 ++++++++-
 security/ipe/digest.c        | 118 +++++++++++++++++++++++++++++++++++
 security/ipe/digest.h        |  26 ++++++++
 security/ipe/eval.c          |  93 ++++++++++++++++++++++++++-
 security/ipe/eval.h          |  12 ++++
 security/ipe/hooks.c         |  91 +++++++++++++++++++++++++++
 security/ipe/hooks.h         |   8 +++
 security/ipe/ipe.c           |  15 +++++
 security/ipe/ipe.h           |   4 ++
 security/ipe/policy.h        |   3 +
 security/ipe/policy_parser.c |  24 ++++++-
 13 files changed, 447 insertions(+), 4 deletions(-)
 create mode 100644 security/ipe/digest.c
 create mode 100644 security/ipe/digest.h

diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index ac4d558e69d5..8279dddf92ad 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -8,6 +8,8 @@ menuconfig SECURITY_IPE
 	depends on SECURITY && SECURITYFS && AUDIT && AUDITSYSCALL
 	select PKCS7_MESSAGE_PARSER
 	select SYSTEM_DATA_VERIFICATION
+	select IPE_PROP_DM_VERITY if DM_VERITY
+	select IPE_PROP_DM_VERITY_SIGNATURE if DM_VERITY && DM_VERITY_VERIFY_ROOTHASH_SIG
 	help
 	  This option enables the Integrity Policy Enforcement LSM
 	  allowing users to define a policy to enforce a trust-based access
@@ -15,3 +17,28 @@ menuconfig SECURITY_IPE
 	  admins to reconfigure trust requirements on the fly.
 
 	  If unsure, answer N.
+
+if SECURITY_IPE
+menu "IPE Trust Providers"
+
+config IPE_PROP_DM_VERITY
+	bool "Enable support for dm-verity based on root hash"
+	depends on DM_VERITY
+	help
+	  This option enables the 'dmverity_roothash' property within IPE
+	  policies. The property evaluates to TRUE when a file from a dm-verity
+	  volume is evaluated, and the volume's root hash matches the value
+	  supplied in the policy.
+
+config IPE_PROP_DM_VERITY_SIGNATURE
+	bool "Enable support for dm-verity based on root hash signature"
+	depends on DM_VERITY && DM_VERITY_VERIFY_ROOTHASH_SIG
+	help
+	  This option enables the 'dmverity_signature' property within IPE
+	  policies. The property evaluates to TRUE when a file from a dm-verity
+	  volume, which has been mounted with a valid signed root hash,
+	  is evaluated.
+
+endmenu
+
+endif
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 62caccba14b4..e1019bb9f0f3 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -6,6 +6,7 @@
 #
 
 obj-$(CONFIG_SECURITY_IPE) += \
+	digest.o \
 	eval.o \
 	hooks.o \
 	fs.o \
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index a416291ba477..2c98520267c1 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -13,6 +13,7 @@
 #include "hooks.h"
 #include "policy.h"
 #include "audit.h"
+#include "digest.h"
 
 #define ACTSTR(x) ((x) == IPE_ACTION_ALLOW ? "ALLOW" : "DENY")
 
@@ -49,8 +50,22 @@ static const char *const audit_hook_names[__IPE_HOOK_MAX] = {
 static const char *const audit_prop_names[__IPE_PROP_MAX] = {
 	"boot_verified=FALSE",
 	"boot_verified=TRUE",
+	"dmverity_roothash=",
+	"dmverity_signature=FALSE",
+	"dmverity_signature=TRUE",
 };
 
+/**
+ * audit_dmv_roothash() - audit the roothash of a dmverity_roothash property.
+ * @ab: Supplies a pointer to the audit_buffer to append to.
+ * @rh: Supplies a pointer to the digest structure.
+ */
+static void audit_dmv_roothash(struct audit_buffer *ab, const void *rh)
+{
+	audit_log_format(ab, "%s", audit_prop_names[IPE_PROP_DMV_ROOTHASH]);
+	ipe_digest_audit(ab, rh);
+}
+
 /**
  * audit_rule() - audit an IPE policy rule.
  * @ab: Supplies a pointer to the audit_buffer to append to.
@@ -62,8 +77,18 @@ static void audit_rule(struct audit_buffer *ab, const struct ipe_rule *r)
 
 	audit_log_format(ab, " rule=\"%s ", audit_op_names[r->op]);
 
-	list_for_each_entry(ptr, &r->props, next)
-		audit_log_format(ab, "%s ", audit_prop_names[ptr->type]);
+	list_for_each_entry(ptr, &r->props, next) {
+		switch (ptr->type) {
+		case IPE_PROP_DMV_ROOTHASH:
+			audit_dmv_roothash(ab, ptr->value);
+			break;
+		default:
+			audit_log_format(ab, "%s", audit_prop_names[ptr->type]);
+			break;
+		}
+
+		audit_log_format(ab, " ");
+	}
 
 	audit_log_format(ab, "action=%s\"", ACTSTR(r->action));
 }
diff --git a/security/ipe/digest.c b/security/ipe/digest.c
new file mode 100644
index 000000000000..493716370570
--- /dev/null
+++ b/security/ipe/digest.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include "digest.h"
+
+/**
+ * ipe_digest_parse() - parse a digest in IPE's policy.
+ * @valstr: Supplies the string parsed from the policy.
+ *
+ * Digests in IPE are defined in a standard way:
+ *	<alg_name>:<hex>
+ *
+ * Use this function to create a property to parse the digest
+ * consistently. The parsed digest will be saved in @value in IPE's
+ * policy.
+ *
+ * Return: The parsed digest_info structure on success. If an error occurs,
+ * the function will return the error value (via ERR_PTR).
+ */
+struct digest_info *ipe_digest_parse(const char *valstr)
+{
+	struct digest_info *info = NULL;
+	char *sep, *raw_digest;
+	size_t raw_digest_len;
+	u8 *digest = NULL;
+	char *alg = NULL;
+	int rc = 0;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	sep = strchr(valstr, ':');
+	if (!sep) {
+		rc = -EBADMSG;
+		goto err;
+	}
+
+	alg = kstrndup(valstr, sep - valstr, GFP_KERNEL);
+	if (!alg) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	raw_digest = sep + 1;
+	raw_digest_len = strlen(raw_digest);
+
+	info->digest_len = (raw_digest_len + 1) / 2;
+	digest = kzalloc(info->digest_len, GFP_KERNEL);
+	if (!digest) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	rc = hex2bin(digest, raw_digest, info->digest_len);
+	if (rc < 0) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	info->alg = alg;
+	info->digest = digest;
+	return info;
+
+err:
+	kfree(alg);
+	kfree(digest);
+	kfree(info);
+	return ERR_PTR(rc);
+}
+
+/**
+ * ipe_digest_eval() - evaluate an IPE digest against another digest.
+ * @expected: Supplies the policy-provided digest value.
+ * @digest: Supplies the digest to compare against the policy digest value.
+ *
+ * Return:
+ * * %true	- digests match
+ * * %false	- digests do not match
+ */
+bool ipe_digest_eval(const struct digest_info *expected,
+		     const struct digest_info *digest)
+{
+	return (expected->digest_len == digest->digest_len) &&
+	       (!strcmp(expected->alg, digest->alg)) &&
+	       (!memcmp(expected->digest, digest->digest, expected->digest_len));
+}
+
+/**
+ * ipe_digest_free() - free an IPE digest.
+ * @info: Supplies a pointer to the policy-provided digest to free.
+ */
+void ipe_digest_free(struct digest_info *info)
+{
+	if (IS_ERR_OR_NULL(info))
+		return;
+
+	kfree(info->alg);
+	kfree(info->digest);
+	kfree(info);
+}
+
+/**
+ * ipe_digest_audit() - audit a digest that was sourced from IPE's policy.
+ * @ab: Supplies the audit_buffer to append the formatted result.
+ * @info: Supplies a pointer to source the audit record from.
+ *
+ * Digests in IPE are audited in this format:
+ *	<alg_name>:<hex>
+ */
+void ipe_digest_audit(struct audit_buffer *ab, const struct digest_info *info)
+{
+	audit_log_untrustedstring(ab, info->alg);
+	audit_log_format(ab, ":");
+	audit_log_n_hex(ab, info->digest, info->digest_len);
+}
diff --git a/security/ipe/digest.h b/security/ipe/digest.h
new file mode 100644
index 000000000000..52c9b3844a38
--- /dev/null
+++ b/security/ipe/digest.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_DIGEST_H
+#define _IPE_DIGEST_H
+
+#include <linux/types.h>
+#include <linux/audit.h>
+
+#include "policy.h"
+
+struct digest_info {
+	const char *alg;
+	const u8 *digest;
+	size_t digest_len;
+};
+
+struct digest_info *ipe_digest_parse(const char *valstr);
+void ipe_digest_free(struct digest_info *digest_info);
+void ipe_digest_audit(struct audit_buffer *ab, const struct digest_info *val);
+bool ipe_digest_eval(const struct digest_info *expected,
+		     const struct digest_info *digest);
+
+#endif /* _IPE_DIGEST_H */
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index dd9064974be6..8f4f63088206 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -15,10 +15,12 @@
 #include "eval.h"
 #include "policy.h"
 #include "audit.h"
+#include "digest.h"
 
 struct ipe_policy __rcu *ipe_active_policy;
 bool success_audit;
 bool enforce = true;
+#define INO_BLOCK_DEV(ino) ((ino)->i_sb->s_bdev)
 
 #define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
 
@@ -32,6 +34,23 @@ static void build_ipe_sb_ctx(struct ipe_eval_ctx *ctx, const struct file *const
 	ctx->initramfs = ipe_sb(FILE_SUPERBLOCK(file))->initramfs;
 }
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+/**
+ * build_ipe_bdev_ctx() - Build ipe_bdev field of an evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @ino: Supplies the inode struct of the file that triggered the IPE event.
+ */
+static void build_ipe_bdev_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+	if (INO_BLOCK_DEV(ino))
+		ctx->ipe_bdev = ipe_bdev(INO_BLOCK_DEV(ino));
+}
+#else
+static void build_ipe_bdev_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 /**
  * ipe_build_eval_ctx() - Build an ipe evaluation context.
  * @ctx: Supplies a pointer to the context to be populated.
@@ -48,8 +67,10 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 	ctx->op = op;
 	ctx->hook = hook;
 
-	if (file)
+	if (file) {
 		build_ipe_sb_ctx(ctx, file);
+		build_ipe_bdev_ctx(ctx, d_real_inode(file->f_path.dentry));
+	}
 }
 
 /**
@@ -65,6 +86,70 @@ static bool evaluate_boot_verified(const struct ipe_eval_ctx *const ctx)
 	return ctx->initramfs;
 }
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+/**
+ * evaluate_dmv_roothash() - Evaluate @ctx against a dmv roothash property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ * @p: Supplies a pointer to the property being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches the @p
+ * * %false	- The current @ctx doesn't match the @p
+ */
+static bool evaluate_dmv_roothash(const struct ipe_eval_ctx *const ctx,
+				  struct ipe_prop *p)
+{
+	return !!ctx->ipe_bdev &&
+	       !!ctx->ipe_bdev->root_hash &&
+	       ipe_digest_eval(p->value,
+			       ctx->ipe_bdev->root_hash);
+}
+#else
+static bool evaluate_dmv_roothash(const struct ipe_eval_ctx *const ctx,
+				  struct ipe_prop *p)
+{
+	return false;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
+#ifdef CONFIG_IPE_PROP_DM_VERITY_SIGNATURE
+/**
+ * evaluate_dmv_sig_false() - Evaluate @ctx against a dmv sig false property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches the property
+ * * %false	- The current @ctx doesn't match the property
+ */
+static bool evaluate_dmv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return !ctx->ipe_bdev || (!ctx->ipe_bdev->dm_verity_signed);
+}
+
+/**
+ * evaluate_dmv_sig_true() - Evaluate @ctx against a dmv sig true property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches the property
+ * * %false	- The current @ctx doesn't match the property
+ */
+static bool evaluate_dmv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return !evaluate_dmv_sig_false(ctx);
+}
+#else
+static bool evaluate_dmv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+
+static bool evaluate_dmv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY_SIGNATURE */
+
 /**
  * evaluate_property() - Analyze @ctx against a rule property.
  * @ctx: Supplies a pointer to the context to be evaluated.
@@ -85,6 +170,12 @@ static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 		return !evaluate_boot_verified(ctx);
 	case IPE_PROP_BOOT_VERIFIED_TRUE:
 		return evaluate_boot_verified(ctx);
+	case IPE_PROP_DMV_ROOTHASH:
+		return evaluate_dmv_roothash(ctx, p);
+	case IPE_PROP_DMV_SIG_FALSE:
+		return evaluate_dmv_sig_false(ctx);
+	case IPE_PROP_DMV_SIG_TRUE:
+		return evaluate_dmv_sig_true(ctx);
 	default:
 		return false;
 	}
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 80b74f55fa69..4901df0e1369 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -22,12 +22,24 @@ struct ipe_superblock {
 	bool initramfs;
 };
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+struct ipe_bdev {
+#ifdef CONFIG_IPE_PROP_DM_VERITY_SIGNATURE
+	bool dm_verity_signed;
+#endif /* CONFIG_IPE_PROP_DM_VERITY_SIGNATURE */
+	struct digest_info *root_hash;
+};
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
 	enum ipe_hook_type hook;
 
 	const struct file *file;
 	bool initramfs;
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+	const struct ipe_bdev *ipe_bdev;
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
 };
 
 enum ipe_match {
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index b68719bf44fb..bc0a7268179d 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -8,10 +8,12 @@
 #include <linux/types.h>
 #include <linux/binfmts.h>
 #include <linux/mman.h>
+#include <linux/blk_types.h>
 
 #include "ipe.h"
 #include "hooks.h"
 #include "eval.h"
+#include "digest.h"
 
 /**
  * ipe_bprm_check_security() - ipe security hook function for bprm check.
@@ -191,3 +193,92 @@ void ipe_unpack_initramfs(void)
 {
 	ipe_sb(current->fs->root.mnt->mnt_sb)->initramfs = true;
 }
+
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+/**
+ * ipe_bdev_free_security() - Free IPE's LSM blob of block_devices.
+ * @bdev: Supplies a pointer to a block_device that contains the structure
+ *	  to free.
+ */
+void ipe_bdev_free_security(struct block_device *bdev)
+{
+	struct ipe_bdev *blob = ipe_bdev(bdev);
+
+	ipe_digest_free(blob->root_hash);
+}
+
+#ifdef CONFIG_IPE_PROP_DM_VERITY_SIGNATURE
+static void ipe_set_dmverity_signature(struct ipe_bdev *blob,
+				       const void *value,
+				       size_t size)
+{
+	blob->dm_verity_signed = size > 0 && value;
+}
+#else
+static inline void ipe_set_dmverity_signature(struct ipe_bdev *blob,
+					      const void *value,
+					      size_t size)
+{
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY_SIGNATURE */
+
+/**
+ * ipe_bdev_setintegrity() - Save integrity data from a bdev to IPE's LSM blob.
+ * @bdev: Supplies a pointer to a block_device that contains the LSM blob.
+ * @type: Supplies the integrity type.
+ * @value: Supplies the value to store.
+ * @size: The size of @value.
+ *
+ * This hook is currently used to save dm-verity's root hash or the existence
+ * of a validated signed dm-verity root hash into LSM blob.
+ *
+ * Return: %0 on success. If an error occurs, the function will return the
+ * -errno.
+ */
+int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type type,
+			  const void *value, size_t size)
+{
+	const struct dm_verity_digest *digest = NULL;
+	struct ipe_bdev *blob = ipe_bdev(bdev);
+	struct digest_info *info = NULL;
+
+	if (type == LSM_INT_DMVERITY_ROOTHASH) {
+		if (!value) {
+			ipe_digest_free(blob->root_hash);
+			blob->root_hash = NULL;
+
+			return 0;
+		}
+		digest = value;
+
+		info = kzalloc(sizeof(*info), GFP_KERNEL);
+		if (!info)
+			return -ENOMEM;
+
+		info->digest = kmemdup(digest->digest, digest->digest_len,
+				       GFP_KERNEL);
+		if (!info->digest)
+			goto dmv_roothash_err;
+
+		info->alg = kstrdup(digest->alg, GFP_KERNEL);
+		if (!info->alg)
+			goto dmv_roothash_err;
+
+		info->digest_len = digest->digest_len;
+
+		blob->root_hash = info;
+
+		return 0;
+dmv_roothash_err:
+		ipe_digest_free(info);
+
+		return -ENOMEM;
+	} else if (type == LSM_INT_DMVERITY_SIG_VALID) {
+		ipe_set_dmverity_signature(blob, value, size);
+
+		return 0;
+	}
+
+	return -EINVAL;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index f4f0b544ddcc..4d585fb6ada3 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -8,6 +8,7 @@
 #include <linux/fs.h>
 #include <linux/binfmts.h>
 #include <linux/security.h>
+#include <linux/blk_types.h>
 
 enum ipe_hook_type {
 	IPE_HOOK_BPRM_CHECK = 0,
@@ -35,4 +36,11 @@ int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents);
 
 void ipe_unpack_initramfs(void);
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+void ipe_bdev_free_security(struct block_device *bdev);
+
+int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type type,
+			  const void *value, size_t len);
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 #endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 53f2196b9bcc..99cb42caa63a 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -7,11 +7,15 @@
 #include "ipe.h"
 #include "eval.h"
 #include "hooks.h"
+#include "eval.h"
 
 bool ipe_enabled;
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 	.lbs_superblock = sizeof(struct ipe_superblock),
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+	.lbs_bdev = sizeof(struct ipe_bdev),
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
 };
 
 static const struct lsm_id ipe_lsmid = {
@@ -24,6 +28,13 @@ struct ipe_superblock *ipe_sb(const struct super_block *sb)
 	return sb->s_security + ipe_blobs.lbs_superblock;
 }
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+struct ipe_bdev *ipe_bdev(struct block_device *b)
+{
+	return b->security + ipe_blobs.lbs_bdev;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
 	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
@@ -31,6 +42,10 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(kernel_read_file, ipe_kernel_read_file),
 	LSM_HOOK_INIT(kernel_load_data, ipe_kernel_load_data),
 	LSM_HOOK_INIT(initramfs_populated, ipe_unpack_initramfs),
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+	LSM_HOOK_INIT(bdev_free_security, ipe_bdev_free_security),
+	LSM_HOOK_INIT(bdev_setintegrity, ipe_bdev_setintegrity),
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
 };
 
 /**
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index 4aa18d1d0525..01f46286e383 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -16,4 +16,8 @@ struct ipe_superblock *ipe_sb(const struct super_block *sb);
 
 extern bool ipe_enabled;
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+struct ipe_bdev *ipe_bdev(struct block_device *b);
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index ffd60cc7fda6..26776092c710 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -33,6 +33,9 @@ enum ipe_action_type {
 enum ipe_prop_type {
 	IPE_PROP_BOOT_VERIFIED_FALSE,
 	IPE_PROP_BOOT_VERIFIED_TRUE,
+	IPE_PROP_DMV_ROOTHASH,
+	IPE_PROP_DMV_SIG_FALSE,
+	IPE_PROP_DMV_SIG_TRUE,
 	__IPE_PROP_MAX
 };
 
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 84cc688be3a2..71c84b293029 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -11,6 +11,7 @@
 
 #include "policy.h"
 #include "policy_parser.h"
+#include "digest.h"
 
 #define START_COMMENT	'#'
 #define IPE_POLICY_DELIM " \t"
@@ -221,6 +222,7 @@ static void free_rule(struct ipe_rule *r)
 
 	list_for_each_entry_safe(p, t, &r->props, next) {
 		list_del(&p->next);
+		ipe_digest_free(p->value);
 		kfree(p);
 	}
 
@@ -273,6 +275,9 @@ static enum ipe_action_type parse_action(char *t)
 static const match_table_t property_tokens = {
 	{IPE_PROP_BOOT_VERIFIED_FALSE,	"boot_verified=FALSE"},
 	{IPE_PROP_BOOT_VERIFIED_TRUE,	"boot_verified=TRUE"},
+	{IPE_PROP_DMV_ROOTHASH,		"dmverity_roothash=%s"},
+	{IPE_PROP_DMV_SIG_FALSE,	"dmverity_signature=FALSE"},
+	{IPE_PROP_DMV_SIG_TRUE,		"dmverity_signature=TRUE"},
 	{IPE_PROP_INVALID,		NULL}
 };
 
@@ -295,6 +300,7 @@ static int parse_property(char *t, struct ipe_rule *r)
 	struct ipe_prop *p = NULL;
 	int rc = 0;
 	int token;
+	char *dup = NULL;
 
 	p = kzalloc(sizeof(*p), GFP_KERNEL);
 	if (!p)
@@ -303,8 +309,22 @@ static int parse_property(char *t, struct ipe_rule *r)
 	token = match_token(t, property_tokens, args);
 
 	switch (token) {
+	case IPE_PROP_DMV_ROOTHASH:
+		dup = match_strdup(&args[0]);
+		if (!dup) {
+			rc = -ENOMEM;
+			goto err;
+		}
+		p->value = ipe_digest_parse(dup);
+		if (IS_ERR(p->value)) {
+			rc = PTR_ERR(p->value);
+			goto err;
+		}
+		fallthrough;
 	case IPE_PROP_BOOT_VERIFIED_FALSE:
 	case IPE_PROP_BOOT_VERIFIED_TRUE:
+	case IPE_PROP_DMV_SIG_FALSE:
+	case IPE_PROP_DMV_SIG_TRUE:
 		p->type = token;
 		break;
 	default:
@@ -315,10 +335,12 @@ static int parse_property(char *t, struct ipe_rule *r)
 		goto err;
 	list_add_tail(&p->next, &r->props);
 
+out:
+	kfree(dup);
 	return rc;
 err:
 	kfree(p);
-	return rc;
+	goto out;
 }
 
 /**
-- 
2.44.0



* [PATCH v18 07/21] security: add new securityfs delete function
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (5 preceding siblings ...)
  2024-05-03 22:32 47% ` [PATCH v18 06/21] ipe: introduce 'boot_verified' as a trust provider Fan Wu
@ 2024-05-03 22:32 69% ` Fan Wu
  2024-05-03 22:32 28% ` [PATCH v18 08/21] ipe: add userspace interface Fan Wu
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

When deleting a directory in the security file system, the existing
securityfs_remove requires the directory to be empty; otherwise it
does nothing. As a result, the security file system can be left in an
unclean state when the intended deletion silently does not happen.

This commit introduces a new function, securityfs_recursive_remove,
which recursively deletes a directory so that no unclean state is left
behind.
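
The difference between the two removal behaviours can be sketched in
userspace C. This is only a toy model under stated assumptions: struct
node, remove_if_empty() and remove_recursive() are hypothetical names
invented for illustration and are not the kernel's dentry API or the
code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a securityfs entry; NOT the kernel's struct dentry. */
struct node {
	struct node *child;	/* first child, NULL for a file or empty directory */
	struct node *sibling;	/* next entry in the same directory */
	bool removed;
};

/* Models the existing behaviour described above: a non-empty directory is left alone. */
static bool remove_if_empty(struct node *n)
{
	assert(n != NULL);
	for (struct node *c = n->child; c; c = c->sibling)
		if (!c->removed)
			return false;	/* still has live children: do nothing */
	n->removed = true;
	return true;
}

/* Models the recursive variant: remove every descendant, then the node itself. */
static void remove_recursive(struct node *n)
{
	if (!n)
		return;
	for (struct node *c = n->child; c; c = c->sibling)
		remove_recursive(c);
	n->removed = true;
}
```

In this model, calling remove_if_empty() on a directory that still has
a live child returns false and changes nothing, while
remove_recursive() on the same directory marks both the child and the
directory as removed.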

Co-developed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v8:
  + Not present

v9:
  + Introduced

v10:
  + No changes

v11:
  + Fix code style issues

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + No changes

v18:
  + No changes
---
 include/linux/security.h |  1 +
 security/inode.c         | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/linux/security.h b/include/linux/security.h
index 14fff542f2e3..f35af7b6cfba 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -2089,6 +2089,7 @@ struct dentry *securityfs_create_symlink(const char *name,
 					 const char *target,
 					 const struct inode_operations *iops);
 extern void securityfs_remove(struct dentry *dentry);
+extern void securityfs_recursive_remove(struct dentry *dentry);
 
 #else /* CONFIG_SECURITYFS */
 
diff --git a/security/inode.c b/security/inode.c
index 9e7cde913667..f21847badb7d 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -313,6 +313,31 @@ void securityfs_remove(struct dentry *dentry)
 }
 EXPORT_SYMBOL_GPL(securityfs_remove);
 
+static void remove_one(struct dentry *victim)
+{
+	simple_release_fs(&mount, &mount_count);
+}
+
+/**
+ * securityfs_recursive_remove - recursively removes a file or directory
+ *
+ * @dentry: a pointer to the dentry of the file or directory to be removed.
+ *
+ * This function recursively removes a file or directory in securityfs that was
+ * previously created with a call to another securityfs function (like
+ * securityfs_create_file() or variants thereof.)
+ */
+void securityfs_recursive_remove(struct dentry *dentry)
+{
+	if (IS_ERR_OR_NULL(dentry))
+		return;
+
+	simple_pin_fs(&fs_type, &mount, &mount_count);
+	simple_recursive_removal(dentry, remove_one);
+	simple_release_fs(&mount, &mount_count);
+}
+EXPORT_SYMBOL_GPL(securityfs_recursive_remove);
+
 #ifdef CONFIG_SECURITY
 static struct dentry *lsm_dentry;
 static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
-- 
2.44.0



* [PATCH v18 05/21] initramfs|security: Add a security hook to do_populate_rootfs()
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (3 preceding siblings ...)
  2024-05-03 22:32 45% ` [PATCH v18 04/21] ipe: add LSM hooks on execution and kernel read Fan Wu
@ 2024-05-03 22:32 67% ` Fan Wu
  2024-05-03 22:32 47% ` [PATCH v18 06/21] ipe: introduce 'boot_verified' as a trust provider Fan Wu
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

This patch introduces a new hook to notify the security system that the
content of the initramfs has been unpacked into the rootfs.

Upon receiving this notification, the security system can activate
a policy to allow only files that originated from the initramfs to
execute or load into the kernel during the early stages of booting.

This approach is crucial for minimizing the attack surface by
ensuring that only trusted files from the initramfs are operational
in the critical boot phase.
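
The notification pattern described above can be sketched in userspace C.
This is a simplified model, not the LSM framework: the hook table,
register_populated_hook(), notify_initramfs_populated() and mark_rootfs()
are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS 4

/* A "void" hook: it takes no arguments and cannot fail or veto anything. */
typedef void (*populated_hook_t)(void);

static populated_hook_t hooks[MAX_HOOKS];
static size_t nr_hooks;

/* Hypothetical registration helper; returns 0 on success, -1 if rejected. */
static int register_populated_hook(populated_hook_t fn)
{
	if (!fn || nr_hooks == MAX_HOOKS)
		return -1;
	hooks[nr_hooks++] = fn;
	return 0;
}

/* Models the notification point: call every registered callback in order. */
static void notify_initramfs_populated(void)
{
	for (size_t i = 0; i < nr_hooks; i++)
		hooks[i]();
}

/* Example subscriber: remember that the rootfs now holds the initramfs content. */
static int rootfs_is_initramfs;

static void mark_rootfs(void)
{
	rootfs_is_initramfs = 1;
}
```

A subscriber would call register_populated_hook(mark_rootfs) once during
setup; the unpacking code then calls notify_initramfs_populated() after
it has finished, at which point rootfs_is_initramfs becomes 1.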

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v11:
  + Not present

v12:
  + Introduced

v13:
  + Rename the hook name to initramfs_populated()

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Fix documentation style issues

v18:
  + No changes
---
 include/linux/lsm_hook_defs.h |  2 ++
 include/linux/security.h      |  8 ++++++++
 init/initramfs.c              |  3 +++
 security/security.c           | 10 ++++++++++
 4 files changed, 23 insertions(+)

diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 334e00efbde4..7db99ae75651 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -450,3 +450,5 @@ LSM_HOOK(int, 0, uring_override_creds, const struct cred *new)
 LSM_HOOK(int, 0, uring_sqpoll, void)
 LSM_HOOK(int, 0, uring_cmd, struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_IO_URING */
+
+LSM_HOOK(void, LSM_RET_VOID, initramfs_populated, void)
diff --git a/include/linux/security.h b/include/linux/security.h
index 41a8f667bdfa..14fff542f2e3 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -2255,4 +2255,12 @@ static inline int security_uring_cmd(struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_IO_URING */
 
+#ifdef CONFIG_SECURITY
+extern void security_initramfs_populated(void);
+#else
+static inline void security_initramfs_populated(void)
+{
+}
+#endif /* CONFIG_SECURITY */
+
 #endif /* ! __LINUX_SECURITY_H */
diff --git a/init/initramfs.c b/init/initramfs.c
index a298a3854a80..feedb47d0f55 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -17,6 +17,7 @@
 #include <linux/namei.h>
 #include <linux/init_syscalls.h>
 #include <linux/umh.h>
+#include <linux/security.h>
 
 #include "do_mounts.h"
 
@@ -719,6 +720,8 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
 #endif
 	}
 
+	security_initramfs_populated();
+
 done:
 	/*
 	 * If the initrd region is overlapped with crashkernel reserved region,
diff --git a/security/security.c b/security/security.c
index 820e0d437452..0db5a6b32aab 100644
--- a/security/security.c
+++ b/security/security.c
@@ -5675,3 +5675,13 @@ int security_uring_cmd(struct io_uring_cmd *ioucmd)
 	return call_int_hook(uring_cmd, ioucmd);
 }
 #endif /* CONFIG_IO_URING */
+
+/**
+ * security_initramfs_populated() - Notify LSMs that initramfs has been loaded
+ *
+ * Tells the LSMs the initramfs has been unpacked into the rootfs.
+ */
+void security_initramfs_populated(void)
+{
+	call_void_hook(initramfs_populated);
+}
-- 
2.44.0



* [PATCH v18 06/21] ipe: introduce 'boot_verified' as a trust provider
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (4 preceding siblings ...)
  2024-05-03 22:32 67% ` [PATCH v18 05/21] initramfs|security: Add a security hook to do_populate_rootfs() Fan Wu
@ 2024-05-03 22:32 47% ` Fan Wu
  2024-05-03 22:32 69% ` [PATCH v18 07/21] security: add new securityfs delete function Fan Wu
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu, Deven Bowers

IPE is designed to provide system-level trust guarantees. This usually
implies that trust starts from bootup with a hardware root of trust,
which validates the bootloader. After this, the bootloader verifies
the kernel and the initramfs.

There is currently no supported integrity method for the initramfs, and
it is typically already verified by the bootloader. This patch therefore
introduces a new IPE property, `boot_verified`, which allows the author
of an IPE policy to indicate trust for files from the initramfs.

The implementation of this feature utilizes the newly added
`initramfs_populated` hook. This hook marks the superblock of the rootfs
after the initramfs has been unpacked into it.
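
The flow from the superblock mark to a property match can be sketched
in userspace C. This is a minimal model under stated assumptions: the
type and function names below (sb_blob, eval_ctx, build_ctx,
evaluate_property) are hypothetical simplifications, and treating a
missing superblock as "not initramfs" is an assumption of this sketch
rather than something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical property types mirroring boot_verified=FALSE / TRUE. */
enum prop_type {
	PROP_BOOT_VERIFIED_FALSE,
	PROP_BOOT_VERIFIED_TRUE,
	PROP_UNKNOWN
};

/* Per-superblock state: set once when the initramfs has been unpacked. */
struct sb_blob {
	bool initramfs;
};

/* Evaluation context: a snapshot of the state relevant to one event. */
struct eval_ctx {
	bool initramfs;
};

/* Copy the superblock mark into the context; no superblock means "not initramfs". */
static void build_ctx(struct eval_ctx *ctx, const struct sb_blob *sb)
{
	assert(ctx != NULL);
	ctx->initramfs = sb ? sb->initramfs : false;
}

/* A property matches when the context agrees with the value the rule asks for. */
static bool evaluate_property(const struct eval_ctx *ctx, enum prop_type type)
{
	switch (type) {
	case PROP_BOOT_VERIFIED_FALSE:
		return !ctx->initramfs;
	case PROP_BOOT_VERIFIED_TRUE:
		return ctx->initramfs;
	default:
		return false;	/* unknown properties never match */
	}
}
```

With a superblock whose initramfs flag is true, the TRUE property
matches and the FALSE property does not; for a context built without a
superblock the result is reversed.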

Before mounting the real rootfs on top of the initramfs, the initramfs
script will recursively remove all files and directories on the
initramfs. This is typically implemented by using switch_root(8)
(https://man7.org/linux/man-pages/man8/switch_root.8.html).
Therefore the initramfs will be empty and not accessible after the real
rootfs takes over. It is advised to switch to a different policy
that doesn't rely on the `boot_verified` property after this point.
This ensures that the trust policies remain relevant and effective
throughout the system's operation.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No changes

v3:
  + Remove useless caching system
  + Move ipe_load_properties to this patch
  + Minor changes from checkpatch --strict warnings

v4:
  + Remove comments from headers that were missed previously.
  + Grammatical corrections.

v5:
  + No significant changes

v6:
  + No changes

v7:
  + Reword and refactor patch 04/12 to [09/16], based on changes in
the underlying system.
  + Add common audit function for boolean values
  + Use common audit function as implementation.

v8:
  + No changes

v9:
  + No changes

v10:
  + Replace struct file with struct super_block

v11:
  + Fix code style issues

v12:
  + Switch to use unpack_initramfs hook and security blob

v13:
  + Update the hook name
  + Rename the security blob field to initramfs
  + Remove the dependency on CONFIG_BLK_DEV_INITRD

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Fix code and documentation style issues

v18:
  + No changes
---
 security/ipe/eval.c          | 41 +++++++++++++++++++++++++++++++++---
 security/ipe/eval.h          |  5 +++++
 security/ipe/hooks.c         |  9 ++++++++
 security/ipe/hooks.h         |  2 ++
 security/ipe/ipe.c           |  8 +++++++
 security/ipe/ipe.h           |  1 +
 security/ipe/policy.h        |  2 ++
 security/ipe/policy_parser.c | 39 +++++++++++++++++++++++++++++++---
 8 files changed, 101 insertions(+), 6 deletions(-)

diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index cc3b3f6583ad..28b3bded06c2 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -16,6 +16,18 @@
 
 struct ipe_policy __rcu *ipe_active_policy;
 
+#define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
+
+/**
+ * build_ipe_sb_ctx() - Build initramfs field of an ipe evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @file: Supplies the file struct of the file triggered IPE event.
+ */
+static void build_ipe_sb_ctx(struct ipe_eval_ctx *ctx, const struct file *const file)
+{
+	ctx->initramfs = ipe_sb(FILE_SUPERBLOCK(file))->initramfs;
+}
+
 /**
  * ipe_build_eval_ctx() - Build an ipe evaluation context.
  * @ctx: Supplies a pointer to the context to be populated.
@@ -28,6 +40,22 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 {
 	ctx->file = file;
 	ctx->op = op;
+
+	if (file)
+		build_ipe_sb_ctx(ctx, file);
+}
+
+/**
+ * evaluate_boot_verified() - Evaluate @ctx for the boot verified property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx match the @p
+ * * %false	- The current @ctx doesn't match the @p
+ */
+static bool evaluate_boot_verified(const struct ipe_eval_ctx *const ctx)
+{
+	return ctx->initramfs;
 }
 
 /**
@@ -35,8 +63,8 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
  * @ctx: Supplies a pointer to the context to be evaluated.
  * @p: Supplies a pointer to the property to be evaluated.
  *
- * This is a placeholder. The actual function will be introduced in the
- * latter commits.
+ * This function determines whether the specified @ctx
+ * matches the conditions defined by a rule property @p.
  *
  * Return:
  * * %true	- The current @ctx match the @p
@@ -45,7 +73,14 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 			      struct ipe_prop *p)
 {
-	return false;
+	switch (p->type) {
+	case IPE_PROP_BOOT_VERIFIED_FALSE:
+		return !evaluate_boot_verified(ctx);
+	case IPE_PROP_BOOT_VERIFIED_TRUE:
+		return evaluate_boot_verified(ctx);
+	default:
+		return false;
+	}
 }
 
 /**
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 00ed8ceca10e..0fa6492354dd 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -15,10 +15,15 @@
 
 extern struct ipe_policy __rcu *ipe_active_policy;
 
+struct ipe_superblock {
+	bool initramfs;
+};
+
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
 
 	const struct file *file;
+	bool initramfs;
 };
 
 void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index f2aaa749dd7b..76370919aac0 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/fs.h>
+#include <linux/fs_struct.h>
 #include <linux/types.h>
 #include <linux/binfmts.h>
 #include <linux/mman.h>
@@ -182,3 +183,11 @@ int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents)
 	ipe_build_eval_ctx(&ctx, NULL, op);
 	return ipe_evaluate_event(&ctx);
 }
+
+/**
+ * ipe_unpack_initramfs() - Mark the current rootfs as initramfs.
+ */
+void ipe_unpack_initramfs(void)
+{
+	ipe_sb(current->fs->root.mnt->mnt_sb)->initramfs = true;
+}
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index c22c3336d27c..4de5fabebd54 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -22,4 +22,6 @@ int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
 
 int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents);
 
+void ipe_unpack_initramfs(void);
+
 #endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 729334812636..28555eadb7f3 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -5,9 +5,11 @@
 #include <uapi/linux/lsm.h>
 
 #include "ipe.h"
+#include "eval.h"
 #include "hooks.h"
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
+	.lbs_superblock = sizeof(struct ipe_superblock),
 };
 
 static const struct lsm_id ipe_lsmid = {
@@ -15,12 +17,18 @@ static const struct lsm_id ipe_lsmid = {
 	.id = LSM_ID_IPE,
 };
 
+struct ipe_superblock *ipe_sb(const struct super_block *sb)
+{
+	return sb->s_security + ipe_blobs.lbs_superblock;
+}
+
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
 	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
 	LSM_HOOK_INIT(file_mprotect, ipe_file_mprotect),
 	LSM_HOOK_INIT(kernel_read_file, ipe_kernel_read_file),
 	LSM_HOOK_INIT(kernel_load_data, ipe_kernel_load_data),
+	LSM_HOOK_INIT(initramfs_populated, ipe_unpack_initramfs),
 };
 
 /**
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index adc3c45e9f53..7f1c818193a0 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -12,5 +12,6 @@
 #define pr_fmt(fmt) "ipe: " fmt
 
 #include <linux/lsm_hooks.h>
+struct ipe_superblock *ipe_sb(const struct super_block *sb);
 
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 8292ffaaff12..69ca8cdecd64 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -30,6 +30,8 @@ enum ipe_action_type {
 #define IPE_ACTION_INVALID __IPE_ACTION_MAX
 
 enum ipe_prop_type {
+	IPE_PROP_BOOT_VERIFIED_FALSE,
+	IPE_PROP_BOOT_VERIFIED_TRUE,
 	__IPE_PROP_MAX
 };
 
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 32064262348a..84cc688be3a2 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -270,13 +270,19 @@ static enum ipe_action_type parse_action(char *t)
 	return match_token(t, action_tokens, args);
 }
 
+static const match_table_t property_tokens = {
+	{IPE_PROP_BOOT_VERIFIED_FALSE,	"boot_verified=FALSE"},
+	{IPE_PROP_BOOT_VERIFIED_TRUE,	"boot_verified=TRUE"},
+	{IPE_PROP_INVALID,		NULL}
+};
+
 /**
  * parse_property() - Parse a rule property given a token string.
  * @t: Supplies the token string to be parsed.
  * @r: Supplies the ipe_rule the parsed property will be associated with.
  *
- * This is a placeholder. The actual function will be introduced in the
- * latter commits.
+ * This function parses and associates a property with an IPE rule based
+ * on a token string.
  *
  * Return:
  * * %0		- Success
@@ -285,7 +291,34 @@ static enum ipe_action_type parse_action(char *t)
  */
 static int parse_property(char *t, struct ipe_rule *r)
 {
-	return -EBADMSG;
+	substring_t args[MAX_OPT_ARGS];
+	struct ipe_prop *p = NULL;
+	int rc = 0;
+	int token;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return -ENOMEM;
+
+	token = match_token(t, property_tokens, args);
+
+	switch (token) {
+	case IPE_PROP_BOOT_VERIFIED_FALSE:
+	case IPE_PROP_BOOT_VERIFIED_TRUE:
+		p->type = token;
+		break;
+	default:
+		rc = -EBADMSG;
+		break;
+	}
+	if (rc)
+		goto err;
+	list_add_tail(&p->next, &r->props);
+
+	return rc;
+err:
+	kfree(p);
+	return rc;
 }
 
 /**
-- 
2.44.0



* [PATCH v18 10/21] ipe: add permissive toggle
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (8 preceding siblings ...)
  2024-05-03 22:32 26% ` [PATCH v18 09/21] uapi|audit|ipe: add ipe auditing support Fan Wu
@ 2024-05-03 22:32 47% ` Fan Wu
  2024-05-03 22:32 44% ` [PATCH v18 11/21] block,lsm: add LSM blob and new LSM hooks for block device Fan Wu
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

IPE, like SELinux, supports a permissive mode. This mode allows policy
authors to test and evaluate IPE policy without it affecting their
programs. When the mode is changed, a 1404 AUDIT_MAC_STATUS record
will be reported.

This patch adds the following audit records:

    audit: MAC_STATUS enforcing=0 old_enforcing=1 auid=4294967295
      ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
    audit: MAC_STATUS enforcing=1 old_enforcing=0 auid=4294967295
      ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1

The audit record is only emitted when the value from the user input is
different from the current enforce value.
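
The two behaviours described above (a deny decision is reported but not
enforced in permissive mode, and a status record is produced only on an
actual change) can be sketched in userspace C. This is a toy model: the
names evaluate(), set_enforce() and audit_records are hypothetical, and
the counter merely stands in for emitting an audit record.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum action { ACTION_ALLOW, ACTION_DENY };

static bool enforce = true;	/* start in enforcing mode */
static int audit_records;	/* stand-in for emitted MAC_STATUS records */

/* A deny becomes -EACCES only while enforcing; permissive mode always returns 0. */
static int evaluate(enum action a)
{
	int rc = 0;

	if (a == ACTION_DENY)
		rc = -EACCES;

	if (!enforce)
		rc = 0;

	return rc;
}

/* Record and apply a mode change; writing the current value is a no-op. */
static void set_enforce(bool new_value)
{
	if (new_value != enforce) {
		audit_records++;
		enforce = new_value;
	}
}
```

In this model, set_enforce(true) on a system that is already enforcing
leaves audit_records at 0, while set_enforce(false) bumps it to 1 and
makes evaluate(ACTION_DENY) return 0 instead of -EACCES.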

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation into a separate commit from the
    evaluation loop and audit system, for easier review.
  + Propagating changes to support the new ipe_context structure in the
    evaluation loop.
  + Split out permissive functionality into a separate patch for easier
    review.
  + Remove permissive switch compile-time configuration option - this
    is trivial to add later.

v8:
  + Remove "IPE" prefix from permissive audit record
  + align fields to the linux-audit field dictionary. This causes the
    following fields to change:
      enforce -> permissive

  + Remove duplicated information correlated with syscall record, that
    will always be present in the audit event.
  + Change audit types:
    + AUDIT_TRUST_STATUS -> AUDIT_MAC_STATUS
      + There is no significant difference in meaning between
        these types.

v9:
  + Clean up ipe_context related code

v10:
  + Change audit format to conform with the existing format selinux is
    using
  + Remove the audit record emission during init to align with selinux,
    which does not perform this action.

v11:
  + Remove redundant code

v12:
  + Remove redundant code

v13:
  + Remove audit format macro

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Fix code and documentation style issues

v18:
  + No changes
---
 security/ipe/audit.c | 27 ++++++++++++++++--
 security/ipe/audit.h |  1 +
 security/ipe/eval.c  | 11 ++++++--
 security/ipe/eval.h  |  1 +
 security/ipe/fs.c    | 66 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index 6a3f24665655..a416291ba477 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -93,8 +93,8 @@ void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
 	if (!ab)
 		return;
 
-	audit_log_format(ab, "ipe_op=%s ipe_hook=%s pid=%d comm=",
-			 op, audit_hook_names[ctx->hook],
+	audit_log_format(ab, "ipe_op=%s ipe_hook=%s enforcing=%d pid=%d comm=",
+			 op, audit_hook_names[ctx->hook], READ_ONCE(enforce),
 			 task_tgid_nr(current));
 	audit_log_untrustedstring(ab, get_task_comm(comm, current));
 
@@ -212,3 +212,26 @@ void ipe_audit_policy_load(const struct ipe_policy *const p)
 
 	audit_log_end(ab);
 }
+
+/**
+ * ipe_audit_enforce() - Audit a change in IPE's enforcement state.
+ * @new_enforce: The new value enforce to be set.
+ * @old_enforce: The old value currently in enforce.
+ */
+void ipe_audit_enforce(bool new_enforce, bool old_enforce)
+{
+	struct audit_buffer *ab;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_MAC_STATUS);
+	if (!ab)
+		return;
+
+	audit_log_format(ab,
+		  "enforcing=%d old_enforcing=%d auid=%u ses=%u"
+		  " enabled=1 old-enabled=1 lsm=ipe res=1",
+		  new_enforce, old_enforce,
+		  from_kuid(&init_user_ns, audit_get_loginuid(current)),
+		  audit_get_sessionid(current));
+
+	audit_log_end(ab);
+}
diff --git a/security/ipe/audit.h b/security/ipe/audit.h
index 3ba8b8a91541..ed2620846a79 100644
--- a/security/ipe/audit.h
+++ b/security/ipe/audit.h
@@ -14,5 +14,6 @@ void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
 void ipe_audit_policy_load(const struct ipe_policy *const p);
 void ipe_audit_policy_activation(const struct ipe_policy *const op,
 				 const struct ipe_policy *const np);
+void ipe_audit_enforce(bool new_enforce, bool old_enforce);
 
 #endif /* _IPE_AUDIT_H */
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 18fd5d8fa03e..dd9064974be6 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -18,6 +18,7 @@
 
 struct ipe_policy __rcu *ipe_active_policy;
 bool success_audit;
+bool enforce = true;
 
 #define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
 
@@ -108,6 +109,7 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	enum ipe_action_type action;
 	enum ipe_match match_type;
 	bool match = false;
+	int rc = 0;
 
 	rcu_read_lock();
 
@@ -160,9 +162,12 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	ipe_audit_match(ctx, match_type, action, rule);
 
 	if (action == IPE_ACTION_DENY)
-		return -EACCES;
+		rc = -EACCES;
 
-	return 0;
+	if (!READ_ONCE(enforce))
+		rc = 0;
+
+	return rc;
 }
 
 /* Set the right module name */
@@ -173,3 +178,5 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 
 module_param(success_audit, bool, 0400);
 MODULE_PARM_DESC(success_audit, "Start IPE with success auditing enabled");
+module_param(enforce, bool, 0400);
+MODULE_PARM_DESC(enforce, "Start IPE in enforce or permissive mode");
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 42b74a7a7c2b..80b74f55fa69 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -16,6 +16,7 @@
 
 extern struct ipe_policy __rcu *ipe_active_policy;
 extern bool success_audit;
+extern bool enforce;
 
 struct ipe_superblock {
 	bool initramfs;
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
index 9e410982b759..b52fb6023904 100644
--- a/security/ipe/fs.c
+++ b/security/ipe/fs.c
@@ -16,6 +16,7 @@ static struct dentry *np __ro_after_init;
 static struct dentry *root __ro_after_init;
 struct dentry *policy_root __ro_after_init;
 static struct dentry *audit_node __ro_after_init;
+static struct dentry *enforce_node __ro_after_init;
 
 /**
  * setaudit() - Write handler for the securityfs node, "ipe/success_audit"
@@ -65,6 +66,58 @@ static ssize_t getaudit(struct file *f, char __user *data,
 	return simple_read_from_buffer(data, len, offset, result, 1);
 }
 
+/**
+ * setenforce() - Write handler for the securityfs node, "ipe/enforce"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ */
+static ssize_t setenforce(struct file *f, const char __user *data,
+			  size_t len, loff_t *offset)
+{
+	int rc = 0;
+	bool new_value, old_value;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	old_value = READ_ONCE(enforce);
+	rc = kstrtobool_from_user(data, len, &new_value);
+	if (rc)
+		return rc;
+
+	if (new_value != old_value) {
+		ipe_audit_enforce(new_value, old_value);
+		WRITE_ONCE(enforce, new_value);
+	}
+
+	return len;
+}
+
+/**
+ * getenforce() - Read handler for the securityfs node, "ipe/enforce"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return: Length of buffer written
+ */
+static ssize_t getenforce(struct file *f, char __user *data,
+			  size_t len, loff_t *offset)
+{
+	const char *result;
+
+	result = ((READ_ONCE(enforce)) ? "1" : "0");
+
+	return simple_read_from_buffer(data, len, offset, result, 1);
+}
+
 /**
  * new_policy() - Write handler for the securityfs node, "ipe/new_policy".
  * @f: Supplies a file structure representing the securityfs node.
@@ -123,6 +176,11 @@ static const struct file_operations audit_fops = {
 	.read = getaudit,
 };
 
+static const struct file_operations enforce_fops = {
+	.write = setenforce,
+	.read = getenforce,
+};
+
 /**
  * ipe_init_securityfs() - Initialize IPE's securityfs tree at fsinit.
  *
@@ -149,6 +207,13 @@ static int __init ipe_init_securityfs(void)
 		goto err;
 	}
 
+	enforce_node = securityfs_create_file("enforce", 0600, root, NULL,
+					      &enforce_fops);
+	if (IS_ERR(enforce_node)) {
+		rc = PTR_ERR(enforce_node);
+		goto err;
+	}
+
 	policy_root = securityfs_create_dir("policies", root);
 	if (IS_ERR(policy_root)) {
 		rc = PTR_ERR(policy_root);
@@ -165,6 +230,7 @@ static int __init ipe_init_securityfs(void)
 err:
 	securityfs_remove(np);
 	securityfs_remove(policy_root);
+	securityfs_remove(enforce_node);
 	securityfs_remove(audit_node);
 	securityfs_remove(root);
 	return rc;
-- 
2.44.0



* [PATCH v18 08/21] ipe: add userspace interface
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (6 preceding siblings ...)
  2024-05-03 22:32 69% ` [PATCH v18 07/21] security: add new securityfs delete function Fan Wu
@ 2024-05-03 22:32 28% ` Fan Wu
  2024-05-03 22:32 26% ` [PATCH v18 09/21] uapi|audit|ipe: add ipe auditing support Fan Wu
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

As is typical with LSMs, IPE uses securityfs as its interface with
userspace. For a complete list of the interfaces and their respective
inputs/outputs, please see the documentation under
admin-guide/LSM/ipe.rst.
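
As a rough illustration of how userspace might drive these nodes, here is
a minimal sketch in C. The helper name ipe_write_node() is hypothetical
and not part of this series. It sends the whole buffer in one write(2)
call, because the write handlers in this patch parse each write as one
complete message.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write @len bytes from @buf
 * to the node at @path using a single write(2) call.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int ipe_write_node(const char *path, const void *buf, size_t len)
{
	ssize_t n;
	int fd;
	int rc = 0;

	assert(path && buf);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, buf, len);
	if (n < 0)
		rc = -errno;
	else if ((size_t)n != len)
		rc = -EIO;	/* a short write would split the message */

	if (close(fd) < 0 && rc == 0)
		rc = -errno;

	return rc;
}
```

Assuming securityfs is mounted at its conventional location,
/sys/kernel/security, a policy blob would be written to
ipe/new_policy under that mount, and the policy could then be activated
by writing "1" to ipe/policies/<name>/active, where <name> is the name
declared inside the policy. Both writes require CAP_MAC_ADMIN.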

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move policy load and activation audit event to 03/12
  + Fix a potential panic when a policy failed to load.
  + use pr_warn for a failure to parse instead of an
    audit record
  + Remove comments from headers
  + Add lockdep assertions to ipe_update_active_policy and
    ipe_activate_policy
  + Fix up warnings with checkpatch --strict
  + Use file_ns_capable for CAP_MAC_ADMIN for securityfs
    nodes.
  + Use memdup_user instead of kzalloc+simple_write_to_buffer.
  + Remove strict_parse command line parameter, as it is added
    by the sysctl command line.
  + Prefix extern variables with ipe_

v4:
  + Remove securityfs to reverse-dependency
  + Add SHA1 reverse dependency.
  + Add versioning scheme for IPE properties, and associated
    interface to query the versioning scheme.
  + Cause a parser to always return an error on unknown syntax.
  + Remove strict_parse option
  + Change active_policy interface from sysctl, to securityfs,
    and change scheme.

v5:
  + Cause an error if a default action is not defined for each
    operation.
  + Minor function renames

v6:
  + No changes

v7:
  + Propagating changes to support the new ipe_context structure in the
    evaluation loop.

  + Further split the parser and userspace interface changes into
    separate commits.

  + "raw" was renamed to "pkcs7" and made read only
  + "raw"'s write functionality (update a policy) moved to "update"
  + introduced "version", "policy_name" nodes.
  + "content" renamed to "policy"
  + changes to allow the compiled-in policy to be treated
    identical to deployed-after-the-fact policies.

v8:
  + Prevent securityfs initialization if the LSM is disabled

v9:
  + Switch to securityfs_recursive_remove for policy folder deletion

v10:
  + Simplify and correct concurrency
  + Fix typos

v11:
  + Correct code comments

v12:
  + Correct locking and remove redundant code

v13:
  + Move the free of old policy into the ipe_update_policy function

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
---
 security/ipe/Makefile    |   2 +
 security/ipe/fs.c        | 105 +++++++++
 security/ipe/fs.h        |  16 ++
 security/ipe/ipe.c       |   3 +
 security/ipe/ipe.h       |   2 +
 security/ipe/policy.c    | 121 ++++++++++
 security/ipe/policy.h    |   7 +
 security/ipe/policy_fs.c | 470 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 726 insertions(+)
 create mode 100644 security/ipe/fs.c
 create mode 100644 security/ipe/fs.h
 create mode 100644 security/ipe/policy_fs.c

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index e1c27e974c5c..b97f8c10fe01 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -8,6 +8,8 @@
 obj-$(CONFIG_SECURITY_IPE) += \
 	eval.o \
 	hooks.o \
+	fs.o \
 	ipe.o \
 	policy.o \
+	policy_fs.o \
 	policy_parser.o \
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
new file mode 100644
index 000000000000..49484c8feead
--- /dev/null
+++ b/security/ipe/fs.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/dcache.h>
+#include <linux/security.h>
+
+#include "ipe.h"
+#include "fs.h"
+#include "policy.h"
+
+static struct dentry *np __ro_after_init;
+static struct dentry *root __ro_after_init;
+struct dentry *policy_root __ro_after_init;
+
+/**
+ * new_policy() - Write handler for the securityfs node, "ipe/new_policy".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ * * %-ENOMEM			- Out of memory (OOM)
+ * * %-EBADMSG			- Policy is invalid
+ * * %-ERANGE			- Policy version number overflow
+ * * %-EINVAL			- Policy version parsing error
+ * * %-EEXIST			- Same name policy already deployed
+ */
+static ssize_t new_policy(struct file *f, const char __user *data,
+			  size_t len, loff_t *offset)
+{
+	struct ipe_policy *p = NULL;
+	char *copy = NULL;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	copy = memdup_user_nul(data, len);
+	if (IS_ERR(copy))
+		return PTR_ERR(copy);
+
+	p = ipe_new_policy(NULL, 0, copy, len);
+	if (IS_ERR(p)) {
+		rc = PTR_ERR(p);
+		goto out;
+	}
+
+	rc = ipe_new_policyfs_node(p);
+
+out:
+	if (rc < 0)
+		ipe_free_policy(p);
+	kfree(copy);
+	return (rc < 0) ? rc : len;
+}
+
+static const struct file_operations np_fops = {
+	.write = new_policy,
+};
+
+/**
+ * ipe_init_securityfs() - Initialize IPE's securityfs tree at fsinit.
+ *
+ * Return: %0 on success. If an error occurs, the function will return
+ * the -errno.
+ */
+static int __init ipe_init_securityfs(void)
+{
+	int rc = 0;
+
+	if (!ipe_enabled)
+		return -EOPNOTSUPP;
+
+	root = securityfs_create_dir("ipe", NULL);
+	if (IS_ERR(root)) {
+		rc = PTR_ERR(root);
+		goto err;
+	}
+
+	policy_root = securityfs_create_dir("policies", root);
+	if (IS_ERR(policy_root)) {
+		rc = PTR_ERR(policy_root);
+		goto err;
+	}
+
+	np = securityfs_create_file("new_policy", 0200, root, NULL, &np_fops);
+	if (IS_ERR(np)) {
+		rc = PTR_ERR(np);
+		goto err;
+	}
+
+	return 0;
+err:
+	securityfs_remove(np);
+	securityfs_remove(policy_root);
+	securityfs_remove(root);
+	return rc;
+}
+
+fs_initcall(ipe_init_securityfs);
diff --git a/security/ipe/fs.h b/security/ipe/fs.h
new file mode 100644
index 000000000000..0141ae8e86ec
--- /dev/null
+++ b/security/ipe/fs.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_FS_H
+#define _IPE_FS_H
+
+#include "policy.h"
+
+extern struct dentry *policy_root __ro_after_init;
+
+int ipe_new_policyfs_node(struct ipe_policy *p);
+void ipe_del_policyfs_node(struct ipe_policy *p);
+
+#endif /* _IPE_FS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 28555eadb7f3..53f2196b9bcc 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -8,6 +8,8 @@
 #include "eval.h"
 #include "hooks.h"
 
+bool ipe_enabled;
+
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 	.lbs_superblock = sizeof(struct ipe_superblock),
 };
@@ -45,6 +47,7 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
 static int __init ipe_init(void)
 {
 	security_add_hooks(ipe_hooks, ARRAY_SIZE(ipe_hooks), &ipe_lsmid);
+	ipe_enabled = true;
 
 	return 0;
 }
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index 7f1c818193a0..4aa18d1d0525 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -14,4 +14,6 @@
 #include <linux/lsm_hooks.h>
 struct ipe_superblock *ipe_sb(const struct super_block *sb);
 
+extern bool ipe_enabled;
+
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.c b/security/ipe/policy.c
index dd7b5b79903a..112913f83c6d 100644
--- a/security/ipe/policy.c
+++ b/security/ipe/policy.c
@@ -7,9 +7,36 @@
 #include <linux/verification.h>
 
 #include "ipe.h"
+#include "eval.h"
+#include "fs.h"
 #include "policy.h"
 #include "policy_parser.h"
 
+/* lock for synchronizing writers across ipe policy */
+DEFINE_MUTEX(ipe_policy_lock);
+
+/**
+ * ver_to_u64() - Convert an internal ipe_policy_version to a u64.
+ * @p: Policy to extract the version from.
+ *
+ * Bits (LSB is index 0):
+ *	[48,32] -> Major
+ *	[32,16] -> Minor
+ *	[16, 0] -> Revision
+ *
+ * Return: u64 version of the embedded version structure.
+ */
+static inline u64 ver_to_u64(const struct ipe_policy *const p)
+{
+	u64 r;
+
+	r = (((u64)p->parsed->version.major) << 32)
+	  | (((u64)p->parsed->version.minor) << 16)
+	  | ((u64)(p->parsed->version.rev));
+
+	return r;
+}
+
 /**
  * ipe_free_policy() - Deallocate a given IPE policy.
  * @p: Supplies the policy to free.
@@ -21,6 +48,7 @@ void ipe_free_policy(struct ipe_policy *p)
 	if (IS_ERR_OR_NULL(p))
 		return;
 
+	ipe_del_policyfs_node(p);
 	ipe_free_parsed_policy(p->parsed);
 	/*
 	 * p->text is allocated only when p->pkcs7 is not NULL
@@ -43,6 +71,66 @@ static int set_pkcs7_data(void *ctx, const void *data, size_t len,
 	return 0;
 }
 
+/**
+ * ipe_update_policy() - parse a new policy and replace old with it.
+ * @root: Supplies a pointer to the securityfs inode that stores the policy.
+ * @text: Supplies a pointer to the plain text policy.
+ * @textlen: Supplies the length of @text.
+ * @pkcs7: Supplies a pointer to a buffer containing a pkcs7 message.
+ * @pkcs7len: Supplies the length of @pkcs7.
+ *
+ * @text/@textlen is mutually exclusive with @pkcs7/@pkcs7len - see
+ * ipe_new_policy.
+ *
+ * Context: Requires root->i_rwsem to be held.
+ * Return: %0 on success. If an error occurs, the function will return
+ * the -errno.
+ */
+int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
+		      const char *pkcs7, size_t pkcs7len)
+{
+	struct ipe_policy *old, *ap, *new = NULL;
+	int rc = 0;
+
+	old = (struct ipe_policy *)root->i_private;
+	if (!old)
+		return -ENOENT;
+
+	new = ipe_new_policy(text, textlen, pkcs7, pkcs7len);
+	if (IS_ERR(new))
+		return PTR_ERR(new);
+
+	if (strcmp(new->parsed->name, old->parsed->name)) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	if (ver_to_u64(old) > ver_to_u64(new)) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	root->i_private = new;
+	swap(new->policyfs, old->policyfs);
+
+	mutex_lock(&ipe_policy_lock);
+	ap = rcu_dereference_protected(ipe_active_policy,
+				       lockdep_is_held(&ipe_policy_lock));
+	if (old == ap) {
+		rcu_assign_pointer(ipe_active_policy, new);
+		mutex_unlock(&ipe_policy_lock);
+		synchronize_rcu();
+	} else {
+		mutex_unlock(&ipe_policy_lock);
+	}
+	ipe_free_policy(old);
+
+	return 0;
+err:
+	ipe_free_policy(new);
+	return rc;
+}
+
 /**
  * ipe_new_policy() - Allocate and parse an ipe_policy structure.
  *
@@ -101,3 +189,36 @@ struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
 	ipe_free_policy(new);
 	return ERR_PTR(rc);
 }
+
+/**
+ * ipe_set_active_pol() - Make @p the active policy.
+ * @p: Supplies a pointer to the policy to make active.
+ *
+ * Context: Requires the i_rwsem of the inode whose i_private holds @p to be held.
+ * Return:
+ * * %0	- Success
+ * * %-EINVAL	- New active policy version is invalid
+ */
+int ipe_set_active_pol(const struct ipe_policy *p)
+{
+	struct ipe_policy *ap = NULL;
+
+	mutex_lock(&ipe_policy_lock);
+
+	ap = rcu_dereference_protected(ipe_active_policy,
+				       lockdep_is_held(&ipe_policy_lock));
+	if (ap == p) {
+		mutex_unlock(&ipe_policy_lock);
+		return 0;
+	}
+	if (ap && ver_to_u64(ap) > ver_to_u64(p)) {
+		mutex_unlock(&ipe_policy_lock);
+		return -EINVAL;
+	}
+
+	rcu_assign_pointer(ipe_active_policy, p);
+	mutex_unlock(&ipe_policy_lock);
+	synchronize_rcu();
+
+	return 0;
+}
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 69ca8cdecd64..ffd60cc7fda6 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -7,6 +7,7 @@
 
 #include <linux/list.h>
 #include <linux/types.h>
+#include <linux/fs.h>
 
 enum ipe_op_type {
 	IPE_OP_EXEC = 0,
@@ -76,10 +77,16 @@ struct ipe_policy {
 	size_t textlen;
 
 	struct ipe_parsed_policy *parsed;
+
+	struct dentry *policyfs;
 };
 
 struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
 				  const char *pkcs7, size_t pkcs7len);
 void ipe_free_policy(struct ipe_policy *pol);
+int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
+		      const char *pkcs7, size_t pkcs7len);
+int ipe_set_active_pol(const struct ipe_policy *p);
+extern struct mutex ipe_policy_lock;
 
 #endif /* _IPE_POLICY_H */
diff --git a/security/ipe/policy_fs.c b/security/ipe/policy_fs.c
new file mode 100644
index 000000000000..c19c06627efb
--- /dev/null
+++ b/security/ipe/policy_fs.c
@@ -0,0 +1,470 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/types.h>
+#include <linux/dcache.h>
+#include <linux/security.h>
+
+#include "ipe.h"
+#include "policy.h"
+#include "eval.h"
+#include "fs.h"
+
+#define MAX_VERSION_SIZE ARRAY_SIZE("65535.65535.65535")
+
+/**
+ * ipefs_file - defines a file in securityfs.
+ */
+struct ipefs_file {
+	const char *name;
+	umode_t access;
+	const struct file_operations *fops;
+};
+
+/**
+ * read_pkcs7() - Read handler for "ipe/policies/$name/pkcs7".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the pkcs7 blob representing the policy
+ * on success. If the policy is unsigned (like the boot policy), this
+ * will return -ENOENT.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted or is unsigned
+ */
+static ssize_t read_pkcs7(struct file *f, char __user *data,
+			  size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	if (!p->pkcs7) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = simple_read_from_buffer(data, len, offset, p->pkcs7, p->pkcs7len);
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * read_policy() - Read handler for "ipe/policies/$name/policy".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the plain-text version of the policy
+ * on success.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t read_policy(struct file *f, char __user *data,
+			   size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = simple_read_from_buffer(data, len, offset, p->text, p->textlen);
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * read_name() - Read handler for "ipe/policies/$name/name".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the policy_name attribute on success.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t read_name(struct file *f, char __user *data,
+			 size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = simple_read_from_buffer(data, len, offset, p->parsed->name,
+				     strlen(p->parsed->name));
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * read_version() - Read handler for "ipe/policies/$name/version".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the version string on success.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t read_version(struct file *f, char __user *data,
+			    size_t len, loff_t *offset)
+{
+	char buffer[MAX_VERSION_SIZE] = { 0 };
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	size_t strsize = 0;
+	ssize_t rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	strsize = scnprintf(buffer, ARRAY_SIZE(buffer), "%hu.%hu.%hu",
+			    p->parsed->version.major, p->parsed->version.minor,
+			    p->parsed->version.rev);
+
+	rc = simple_read_from_buffer(data, len, offset, buffer, strsize);
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * setactive() - Write handler for "ipe/policies/$name/active".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ * * %-EINVAL			- Invalid input
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t setactive(struct file *f, const char __user *data,
+			 size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	bool value = false;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	rc = kstrtobool_from_user(data, len, &value);
+	if (rc)
+		return rc;
+
+	if (!value)
+		return -EINVAL;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+	inode_lock(root);
+
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = ipe_set_active_pol(p);
+
+out:
+	inode_unlock(root);
+	return (rc < 0) ? rc : len;
+}
+
+/**
+ * getactive() - Read handler for "ipe/policies/$name/active".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with 1 or 0 depending on whether the
+ * corresponding policy is active.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t getactive(struct file *f, char __user *data,
+			 size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	const char *str;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		inode_unlock_shared(root);
+		return -ENOENT;
+	}
+	inode_unlock_shared(root);
+
+	str = (p == rcu_access_pointer(ipe_active_policy)) ? "1" : "0";
+	rc = simple_read_from_buffer(data, len, offset, str, 1);
+
+	return rc;
+}
+
+/**
+ * update_policy() - Write handler for "ipe/policies/$name/update".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * On success this updates the policy represented by $name,
+ * in-place.
+ *
+ * Return: Length of buffer written on success. If an error occurs,
+ * the function will return the -errno.
+ */
+static ssize_t update_policy(struct file *f, const char __user *data,
+			     size_t len, loff_t *offset)
+{
+	struct inode *root = NULL;
+	char *copy = NULL;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	copy = memdup_user(data, len);
+	if (IS_ERR(copy))
+		return PTR_ERR(copy);
+
+	root = d_inode(f->f_path.dentry->d_parent);
+	inode_lock(root);
+	rc = ipe_update_policy(root, NULL, 0, copy, len);
+	inode_unlock(root);
+
+	kfree(copy);
+	if (rc)
+		return rc;
+
+	return len;
+}
+
+/**
+ * delete_policy() - Write handler for "ipe/policies/$name/delete".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * On success this deletes the policy represented by $name.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission/deleting active policy
+ * * %-EINVAL			- Invalid input
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t delete_policy(struct file *f, const char __user *data,
+			     size_t len, loff_t *offset)
+{
+	struct ipe_policy *ap = NULL;
+	struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	bool value = false;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	rc = kstrtobool_from_user(data, len, &value);
+	if (rc)
+		return rc;
+
+	if (!value)
+		return -EINVAL;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+	inode_lock(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		inode_unlock(root);
+		return -ENOENT;
+	}
+
+	mutex_lock(&ipe_policy_lock);
+	ap = rcu_dereference_protected(ipe_active_policy,
+				       lockdep_is_held(&ipe_policy_lock));
+	if (p == ap) {
+		mutex_unlock(&ipe_policy_lock);
+		inode_unlock(root);
+		return -EPERM;
+	}
+	mutex_unlock(&ipe_policy_lock);
+
+	root->i_private = NULL;
+	inode_unlock(root);
+
+	ipe_free_policy(p);
+	return len;
+}
+
+static const struct file_operations content_fops = {
+	.read = read_policy,
+};
+
+static const struct file_operations pkcs7_fops = {
+	.read = read_pkcs7,
+};
+
+static const struct file_operations name_fops = {
+	.read = read_name,
+};
+
+static const struct file_operations ver_fops = {
+	.read = read_version,
+};
+
+static const struct file_operations active_fops = {
+	.write = setactive,
+	.read = getactive,
+};
+
+static const struct file_operations update_fops = {
+	.write = update_policy,
+};
+
+static const struct file_operations delete_fops = {
+	.write = delete_policy,
+};
+
+/**
+ * policy_subdir - files under a policy subdirectory
+ */
+static const struct ipefs_file policy_subdir[] = {
+	{ "pkcs7", 0444, &pkcs7_fops },
+	{ "policy", 0444, &content_fops },
+	{ "name", 0444, &name_fops },
+	{ "version", 0444, &ver_fops },
+	{ "active", 0600, &active_fops },
+	{ "update", 0200, &update_fops },
+	{ "delete", 0200, &delete_fops },
+};
+
+/**
+ * ipe_del_policyfs_node() - Delete a securityfs entry for @p.
+ * @p: Supplies a pointer to the policy to delete a securityfs entry for.
+ */
+void ipe_del_policyfs_node(struct ipe_policy *p)
+{
+	securityfs_recursive_remove(p->policyfs);
+	p->policyfs = NULL;
+}
+
+/**
+ * ipe_new_policyfs_node() - Create a securityfs entry for @p.
+ * @p: Supplies a pointer to the policy to create a securityfs entry for.
+ *
+ * Return: %0 on success. If an error occurs, the function will return
+ * the -errno.
+ */
+int ipe_new_policyfs_node(struct ipe_policy *p)
+{
+	const struct ipefs_file *f = NULL;
+	struct dentry *policyfs = NULL;
+	struct inode *root = NULL;
+	struct dentry *d = NULL;
+	size_t i = 0;
+	int rc = 0;
+
+	if (p->policyfs)
+		return 0;
+
+	policyfs = securityfs_create_dir(p->parsed->name, policy_root);
+	if (IS_ERR(policyfs))
+		return PTR_ERR(policyfs);
+
+	root = d_inode(policyfs);
+
+	for (i = 0; i < ARRAY_SIZE(policy_subdir); ++i) {
+		f = &policy_subdir[i];
+
+		d = securityfs_create_file(f->name, f->access, policyfs,
+					   NULL, f->fops);
+		if (IS_ERR(d)) {
+			rc = PTR_ERR(d);
+			goto err;
+		}
+	}
+
+	inode_lock(root);
+	p->policyfs = policyfs;
+	root->i_private = p;
+	inode_unlock(root);
+
+	return 0;
+err:
+	securityfs_recursive_remove(policyfs);
+	return rc;
+}
-- 
2.44.0



* [PATCH v18 09/21] uapi|audit|ipe: add ipe auditing support
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (7 preceding siblings ...)
  2024-05-03 22:32 28% ` [PATCH v18 08/21] ipe: add userspace interface Fan Wu
@ 2024-05-03 22:32 26% ` Fan Wu
  2024-05-03 22:32 47% ` [PATCH v18 10/21] ipe: add permissive toggle Fan Wu
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Users of IPE require a way to identify when and why an operation fails,
allowing them to both respond to violations of policy and be notified
of potentially malicious actions on their systems with respect to IPE
itself.

This patch introduces 3 new audit events.

AUDIT_IPE_ACCESS(1420) indicates the result of an IPE policy evaluation
of a resource.
AUDIT_IPE_CONFIG_CHANGE(1421) indicates the current active IPE policy
has been changed to another loaded policy.
AUDIT_IPE_POLICY_LOAD(1422) indicates a new IPE policy has been loaded
into the kernel.

This patch also adds support for success auditing, allowing users to
identify why an allow decision was made for a resource. However, it is
recommended to use this option with caution, as it is quite noisy.

Here are some examples of the new audit record types:

AUDIT_IPE_ACCESS(1420):

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=297 comm="sh" path="/root/vol/bin/hello" dev="tmpfs"
      ino=3897 rule="op=EXECUTE boot_verified=TRUE action=ALLOW"

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=299 comm="sh" path="/mnt/ipe/bin/hello" dev="dm-0"
      ino=2 rule="DEFAULT action=DENY"

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=300 path="/tmp/tmpdp2h1lub/deny/bin/hello" dev="tmpfs"
      ino=131 rule="DEFAULT action=DENY"

The above three records were generated when the active IPE policy only
allowed binaries from the initramfs to run. Three identical `hello`
binaries were placed at different locations; only the first `hello`,
from the rootfs (initramfs), was allowed.

Field ipe_op is followed by the IPE operation name associated with the
log.

Field ipe_hook is followed by the name of the LSM hook that triggered the
IPE event.

Field enforcing is followed by the enforcement state of IPE (it will be
introduced in the next commit).

Field pid is followed by the pid of the process that triggered the IPE
event.

Field comm is followed by the command line program name of the process
that triggered the IPE event.

Field path is followed by the file's path name.

Field dev is followed by the device name, as found in /dev, that the file
is from.
Note that for device mapper devices it will use the name `dm-X` instead
of the name in /dev/mapper.
For a file in a temporary file system, which is not backed by a device,
it will use `tmpfs` for the field.
This part of the implementation follows the existing
LSM_AUDIT_DATA_INODE use case in security/lsm_audit.c.

Field ino is followed by the file's inode number.

Field rule is followed by the IPE rule that made the access decision. The
whole rule must be audited because the decision is based on the
combination of all property conditions in the rule.
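
To make the record layout concrete, the following is a minimal userspace
sketch in C of how a log consumer might pull one field out of such a
record. The function audit_field() is hypothetical and not part of this
series. It assumes the key=value layout shown in the examples above and
treats a quoted value, such as the rule, as a single value even though it
contains spaces and '=' characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_ws(char c)
{
	return c == ' ' || c == '\t' || c == '\n';
}

/*
 * Hypothetical helper (not part of the patch): copy the value of the
 * top-level field @key from @record into @out as a NUL-terminated string.
 *
 * Returns 0 on success, -1 if @key is absent or @out is too small.
 */
static int audit_field(const char *record, const char *key,
		       char *out, size_t outlen)
{
	const char *p = record;
	const char *name, *val, *end;
	size_t klen, nlen, vlen;

	assert(record && key && out);
	klen = strlen(key);

	while (*p) {
		while (is_ws(*p))
			p++;
		if (!*p)
			break;

		/* Scan the name part of the current token. */
		name = p;
		while (*p && *p != '=' && !is_ws(*p))
			p++;
		if (*p != '=')
			continue;	/* bare word such as "audit:" */
		nlen = (size_t)(p - name);
		val = ++p;

		if (*val == '"') {
			/* Quoted value: runs to the closing quote. */
			val++;
			end = strchr(val, '"');
			if (!end)
				end = val + strlen(val);
			p = *end ? end + 1 : end;
		} else {
			/* Unquoted value: runs to the next whitespace. */
			end = val;
			while (*end && !is_ws(*end))
				end++;
			p = end;
		}

		if (nlen == klen && memcmp(name, key, klen) == 0) {
			vlen = (size_t)(end - val);
			if (vlen + 1 > outlen)
				return -1;
			memcpy(out, val, vlen);
			out[vlen] = '\0';
			return 0;
		}
	}

	return -1;
}
```

Because names are compared whole, a lookup of "pid" does not match
"ppid" in a SYSCALL record, and a lookup of "action" fails because that
text only appears inside the quoted rule value.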

Together with the syscall audit event, a user can tell why an operation
was blocked. For example:

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=2138 comm="bash" path="/mnt/ipe/bin/hello" dev="dm-0"
      ino=2 rule="DEFAULT action=DENY"
    audit[1956]: SYSCALL arch=c000003e syscall=59
      success=no exit=-13 a0=556790138df0 a1=556790135390 a2=5567901338b0
      a3=ab2a41a67f4f1f4e items=1 ppid=147 pid=1956 auid=4294967295 uid=0
      gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0
      ses=4294967295 comm="bash" exe="/usr/bin/bash" key=(null)

The above two records show that bash used execve to run "hello" and was
blocked by IPE. Note that the IPE records always precede the SYSCALL
record.

AUDIT_IPE_CONFIG_CHANGE(1421):

    audit: AUDIT1421
      old_active_pol_name="Allow_All" old_active_pol_version=0.0.0
      old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649
      new_active_pol_name="boot_verified" new_active_pol_version=0.0.0
      new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F
      auid=4294967295 ses=4294967295 lsm=ipe res=1

The above record shows the active IPE policy switching from
`Allow_All` to `boot_verified`, along with the version and the hash
digest of the two policies. Note that IPE can only have one policy active
at a time; all access decisions are evaluated against the currently
active policy.
The normal procedure to deploy a policy is to load it into the kernel
first, then switch the active policy to it.

AUDIT_IPE_POLICY_LOAD(1422):

    audit: AUDIT1422 policy_name="boot_verified" policy_version=0.0.0
      policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F2676
      auid=4294967295 ses=4294967295 lsm=ipe res=1

The above record shows that a new policy has been loaded into the kernel,
along with the policy name, policy version, and policy hash.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation, the audit system, the evaluation loop,
    and access control hooks into separate patches.
  + Further split audit system patch into two separate patches; one
    for include/uapi, and the usage of the new defines.
  + Split out the permissive functionality into another separate patch,
    for easier review.
  + Correct misuse of audit_log_n_untrusted string to audit_log_format
  + Use get_task_comm instead of comm directly.
  + Quote certain audit values
  + Remove unnecessary help text on choice options - these were
    previously indented at the wrong level
  + Correct a stale string constant (ctx_ns_enforce to ctx_enforce)

v8:

  + Change dependency for CONFIG_AUDIT to CONFIG_AUDITSYSCALL
  + Drop ctx_* prefix
  + Reuse, where appropriate, the audit fields from the field
    dictionary. This transforms:
      ctx_pathname  -> path
      ctx_ino       -> ino
      ctx_dev       -> dev

  + Add audit records and event examples to commit description.
  + Remove new_audit_ctx, replace with audit_log_start. All data that
    would provided by new_audit_ctx is already present in the syscall
    audit record, that is always emitted on these actions. The audit
    records should be correlated as such.
  + Change audit types:
    + AUDIT_TRUST_RESULT                -> AUDIT_IPE_ACCESS
      +  This prevents overloading of the AVC type.
    + AUDIT_TRUST_POLICY_ACTIVATE       -> AUDIT_MAC_CONFIG_CHANGE
    + AUDIT_TRUST_POLICY_LOAD           -> AUDIT_MAC_POLICY_LOAD
      + There was no significant difference in meaning between
        these types.

  + Remove enforcing parameter passed from the context structure
    for AUDIT_IPE_ACCESS.
    +  This field can be inferred from the SYSCALL audit event,
       based on the success field.

  + Remove all fields already captured in the syscall record. "hook",
    an IPE specific field, can be determined via the syscall field in
    the syscall record itself, so it has been removed.
      + ino, path, and dev in IPE's record refer to the subject of the
        syscall, while the syscall record refers to the calling process.

  + remove IPE prefix from policy load/policy activation events
  + fix a bug wherein a policy change audit record was not fired when
    updating a policy

v9:
  + Merge the AUDIT_IPE_ACCESS definition with the audit support commit
  + Change the audit format of policy load and switch
  + Remove the ipe audit kernel switch

v10:
  + Create AUDIT_IPE_CONFIG_CHANGE and AUDIT_IPE_POLICY_LOAD
  + Change field names per upstream feedback

v11:
  + Fix style issues

v12:
  + Add ipe_op, ipe_hook, and enforcing fields to AUDIT_IPE_ACCESS

v13:
  + Remove dependency on CONFIG_BLK_DEV_INITRD
  + Add field placeholders for anonymous files

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
---
 include/uapi/linux/audit.h |   3 +
 security/ipe/Kconfig       |   2 +-
 security/ipe/Makefile      |   1 +
 security/ipe/audit.c       | 214 +++++++++++++++++++++++++++++++++++++
 security/ipe/audit.h       |  18 ++++
 security/ipe/eval.c        |  44 ++++++--
 security/ipe/eval.h        |  13 ++-
 security/ipe/fs.c          |  68 ++++++++++++
 security/ipe/hooks.c       |  10 +-
 security/ipe/hooks.h       |  11 ++
 security/ipe/policy.c      |   5 +
 11 files changed, 372 insertions(+), 17 deletions(-)
 create mode 100644 security/ipe/audit.c
 create mode 100644 security/ipe/audit.h

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index d676ed2b246e..75e21a135483 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -143,6 +143,9 @@
 #define AUDIT_MAC_UNLBL_STCDEL	1417	/* NetLabel: del a static label */
 #define AUDIT_MAC_CALIPSO_ADD	1418	/* NetLabel: add CALIPSO DOI entry */
 #define AUDIT_MAC_CALIPSO_DEL	1419	/* NetLabel: del CALIPSO DOI entry */
+#define AUDIT_IPE_ACCESS	1420	/* IPE denial or grant */
+#define AUDIT_IPE_CONFIG_CHANGE	1421	/* IPE config change */
+#define AUDIT_IPE_POLICY_LOAD	1422	/* IPE policy load */
 
 #define AUDIT_FIRST_KERN_ANOM_MSG   1700
 #define AUDIT_LAST_KERN_ANOM_MSG    1799
diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index e4875fb04883..ac4d558e69d5 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -5,7 +5,7 @@
 
 menuconfig SECURITY_IPE
 	bool "Integrity Policy Enforcement (IPE)"
-	depends on SECURITY && SECURITYFS
+	depends on SECURITY && SECURITYFS && AUDIT && AUDITSYSCALL
 	select PKCS7_MESSAGE_PARSER
 	select SYSTEM_DATA_VERIFICATION
 	help
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index b97f8c10fe01..62caccba14b4 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_SECURITY_IPE) += \
 	policy.o \
 	policy_fs.o \
 	policy_parser.o \
+	audit.o \
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
new file mode 100644
index 000000000000..6a3f24665655
--- /dev/null
+++ b/security/ipe/audit.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/slab.h>
+#include <linux/audit.h>
+#include <linux/types.h>
+#include <crypto/hash.h>
+
+#include "ipe.h"
+#include "eval.h"
+#include "hooks.h"
+#include "policy.h"
+#include "audit.h"
+
+#define ACTSTR(x) ((x) == IPE_ACTION_ALLOW ? "ALLOW" : "DENY")
+
+#define IPE_AUDIT_HASH_ALG "sha256"
+
+#define AUDIT_POLICY_LOAD_FMT "policy_name=\"%s\" policy_version=%hu.%hu.%hu "\
+			      "policy_digest=" IPE_AUDIT_HASH_ALG ":"
+#define AUDIT_OLD_ACTIVE_POLICY_FMT "old_active_pol_name=\"%s\" "\
+				    "old_active_pol_version=%hu.%hu.%hu "\
+				    "old_policy_digest=" IPE_AUDIT_HASH_ALG ":"
+#define AUDIT_NEW_ACTIVE_POLICY_FMT "new_active_pol_name=\"%s\" "\
+				    "new_active_pol_version=%hu.%hu.%hu "\
+				    "new_policy_digest=" IPE_AUDIT_HASH_ALG ":"
+
+static const char *const audit_op_names[__IPE_OP_MAX + 1] = {
+	"EXECUTE",
+	"FIRMWARE",
+	"KMODULE",
+	"KEXEC_IMAGE",
+	"KEXEC_INITRAMFS",
+	"POLICY",
+	"X509_CERT",
+	"UNKNOWN",
+};
+
+static const char *const audit_hook_names[__IPE_HOOK_MAX] = {
+	"BPRM_CHECK",
+	"MMAP",
+	"MPROTECT",
+	"KERNEL_READ",
+	"KERNEL_LOAD",
+};
+
+static const char *const audit_prop_names[__IPE_PROP_MAX] = {
+	"boot_verified=FALSE",
+	"boot_verified=TRUE",
+};
+
+/**
+ * audit_rule() - audit an IPE policy rule.
+ * @ab: Supplies a pointer to the audit_buffer to append to.
+ * @r: Supplies a pointer to the ipe_rule to approximate a string form for.
+ */
+static void audit_rule(struct audit_buffer *ab, const struct ipe_rule *r)
+{
+	const struct ipe_prop *ptr;
+
+	audit_log_format(ab, " rule=\"%s ", audit_op_names[r->op]);
+
+	list_for_each_entry(ptr, &r->props, next)
+		audit_log_format(ab, "%s ", audit_prop_names[ptr->type]);
+
+	audit_log_format(ab, "action=%s\"", ACTSTR(r->action));
+}
+
+/**
+ * ipe_audit_match() - Audit a rule match in a policy evaluation.
+ * @ctx: Supplies a pointer to the evaluation context that was used in the
+ *	 evaluation.
+ * @match_type: Supplies the scope of the match: rule, operation default,
+ *		global default.
+ * @act: Supplies the IPE's evaluation decision, deny or allow.
+ * @r: Supplies a pointer to the rule that was matched, if possible.
+ */
+void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
+		     enum ipe_match match_type,
+		     enum ipe_action_type act, const struct ipe_rule *const r)
+{
+	const char *op = audit_op_names[ctx->op];
+	char comm[sizeof(current->comm)];
+	struct audit_buffer *ab;
+	struct inode *inode;
+
+	if (act != IPE_ACTION_DENY && !READ_ONCE(success_audit))
+		return;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_IPE_ACCESS);
+	if (!ab)
+		return;
+
+	audit_log_format(ab, "ipe_op=%s ipe_hook=%s pid=%d comm=",
+			 op, audit_hook_names[ctx->hook],
+			 task_tgid_nr(current));
+	audit_log_untrustedstring(ab, get_task_comm(comm, current));
+
+	if (ctx->file) {
+		audit_log_d_path(ab, " path=", &ctx->file->f_path);
+		inode = file_inode(ctx->file);
+		if (inode) {
+			audit_log_format(ab, " dev=");
+			audit_log_untrustedstring(ab, inode->i_sb->s_id);
+			audit_log_format(ab, " ino=%lu", inode->i_ino);
+		} else {
+			audit_log_format(ab, " dev=? ino=?");
+		}
+	} else {
+		audit_log_format(ab, " path=? dev=? ino=?");
+	}
+
+	if (match_type == IPE_MATCH_RULE)
+		audit_rule(ab, r);
+	else if (match_type == IPE_MATCH_TABLE)
+		audit_log_format(ab, " rule=\"DEFAULT op=%s action=%s\"", op,
+				 ACTSTR(act));
+	else
+		audit_log_format(ab, " rule=\"DEFAULT action=%s\"",
+				 ACTSTR(act));
+
+	audit_log_end(ab);
+}
+
+/**
+ * audit_policy() - Audit a policy's name, version and thumbprint to @ab.
+ * @ab: Supplies a pointer to the audit buffer to append to.
+ * @audit_format: Supplies a pointer to the audit format string
+ * @p: Supplies a pointer to the policy to audit.
+ */
+static void audit_policy(struct audit_buffer *ab,
+			 const char *audit_format,
+			 const struct ipe_policy *const p)
+{
+	SHASH_DESC_ON_STACK(desc, tfm);
+	struct crypto_shash *tfm;
+	u8 *digest = NULL;
+
+	tfm = crypto_alloc_shash(IPE_AUDIT_HASH_ALG, 0, 0);
+	if (IS_ERR(tfm))
+		return;
+
+	desc->tfm = tfm;
+
+	digest = kzalloc(crypto_shash_digestsize(tfm), GFP_KERNEL);
+	if (!digest)
+		goto out;
+
+	if (crypto_shash_init(desc))
+		goto out;
+
+	if (crypto_shash_update(desc, p->pkcs7, p->pkcs7len))
+		goto out;
+
+	if (crypto_shash_final(desc, digest))
+		goto out;
+
+	audit_log_format(ab, audit_format, p->parsed->name,
+			 p->parsed->version.major, p->parsed->version.minor,
+			 p->parsed->version.rev);
+	audit_log_n_hex(ab, digest, crypto_shash_digestsize(tfm));
+
+out:
+	kfree(digest);
+	crypto_free_shash(tfm);
+}
+
+/**
+ * ipe_audit_policy_activation() - Audit a policy being activated.
+ * @op: Supplies a pointer to the previously activated policy to audit.
+ * @np: Supplies a pointer to the newly activated policy to audit.
+ */
+void ipe_audit_policy_activation(const struct ipe_policy *const op,
+				 const struct ipe_policy *const np)
+{
+	struct audit_buffer *ab;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL,
+			     AUDIT_IPE_CONFIG_CHANGE);
+	if (!ab)
+		return;
+
+	audit_policy(ab, AUDIT_OLD_ACTIVE_POLICY_FMT, op);
+	audit_log_format(ab, " ");
+	audit_policy(ab, AUDIT_NEW_ACTIVE_POLICY_FMT, np);
+	audit_log_format(ab, " auid=%u ses=%u lsm=ipe res=1",
+			 from_kuid(&init_user_ns, audit_get_loginuid(current)),
+			 audit_get_sessionid(current));
+
+	audit_log_end(ab);
+}
+
+/**
+ * ipe_audit_policy_load() - Audit a policy being loaded into the kernel.
+ * @p: Supplies a pointer to the policy to audit.
+ */
+void ipe_audit_policy_load(const struct ipe_policy *const p)
+{
+	struct audit_buffer *ab;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL,
+			     AUDIT_IPE_POLICY_LOAD);
+	if (!ab)
+		return;
+
+	audit_policy(ab, AUDIT_POLICY_LOAD_FMT, p);
+	audit_log_format(ab, " auid=%u ses=%u lsm=ipe res=1",
+			 from_kuid(&init_user_ns, audit_get_loginuid(current)),
+			 audit_get_sessionid(current));
+
+	audit_log_end(ab);
+}
diff --git a/security/ipe/audit.h b/security/ipe/audit.h
new file mode 100644
index 000000000000..3ba8b8a91541
--- /dev/null
+++ b/security/ipe/audit.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_AUDIT_H
+#define _IPE_AUDIT_H
+
+#include "policy.h"
+
+void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
+		     enum ipe_match match_type,
+		     enum ipe_action_type act, const struct ipe_rule *const r);
+void ipe_audit_policy_load(const struct ipe_policy *const p);
+void ipe_audit_policy_activation(const struct ipe_policy *const op,
+				 const struct ipe_policy *const np);
+
+#endif /* _IPE_AUDIT_H */
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 28b3bded06c2..18fd5d8fa03e 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -9,12 +9,15 @@
 #include <linux/file.h>
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
+#include <linux/moduleparam.h>
 
 #include "ipe.h"
 #include "eval.h"
 #include "policy.h"
+#include "audit.h"
 
 struct ipe_policy __rcu *ipe_active_policy;
+bool success_audit;
 
 #define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
 
@@ -33,13 +36,16 @@ static void build_ipe_sb_ctx(struct ipe_eval_ctx *ctx, const struct file *const
  * @ctx: Supplies a pointer to the context to be populated.
  * @file: Supplies a pointer to the file to associated with the evaluation.
  * @op: Supplies the IPE policy operation associated with the evaluation.
+ * @hook: Supplies the LSM hook associated with the evaluation.
  */
 void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 			const struct file *file,
-			enum ipe_op_type op)
+			enum ipe_op_type op,
+			enum ipe_hook_type hook)
 {
 	ctx->file = file;
 	ctx->op = op;
+	ctx->hook = hook;
 
 	if (file)
 		build_ipe_sb_ctx(ctx, file);
@@ -100,6 +106,7 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	struct ipe_policy *pol = NULL;
 	struct ipe_prop *prop = NULL;
 	enum ipe_action_type action;
+	enum ipe_match match_type;
 	bool match = false;
 
 	rcu_read_lock();
@@ -111,14 +118,15 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	}
 
 	if (ctx->op == IPE_OP_INVALID) {
-		if (pol->parsed->global_default_action == IPE_ACTION_DENY) {
-			rcu_read_unlock();
-			return -EACCES;
-		}
-		if (pol->parsed->global_default_action == IPE_ACTION_INVALID)
+		if (pol->parsed->global_default_action == IPE_ACTION_INVALID) {
 			WARN(1, "no default rule set for unknown op, ALLOW it");
+			action = IPE_ACTION_ALLOW;
+		} else {
+			action = pol->parsed->global_default_action;
+		}
 		rcu_read_unlock();
-		return 0;
+		match_type = IPE_MATCH_GLOBAL;
+		goto eval;
 	}
 
 	rules = &pol->parsed->rules[ctx->op];
@@ -136,16 +144,32 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 			break;
 	}
 
-	if (match)
+	if (match) {
 		action = rule->action;
-	else if (rules->default_action != IPE_ACTION_INVALID)
+		match_type = IPE_MATCH_RULE;
+	} else if (rules->default_action != IPE_ACTION_INVALID) {
 		action = rules->default_action;
-	else
+		match_type = IPE_MATCH_TABLE;
+	} else {
 		action = pol->parsed->global_default_action;
+		match_type = IPE_MATCH_GLOBAL;
+	}
 
 	rcu_read_unlock();
+eval:
+	ipe_audit_match(ctx, match_type, action, rule);
+
 	if (action == IPE_ACTION_DENY)
 		return -EACCES;
 
 	return 0;
 }
+
+/* Set the right module name */
+#ifdef KBUILD_MODNAME
+#undef KBUILD_MODNAME
+#define KBUILD_MODNAME "ipe"
+#endif
+
+module_param(success_audit, bool, 0400);
+MODULE_PARM_DESC(success_audit, "Start IPE with success auditing enabled");
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 0fa6492354dd..42b74a7a7c2b 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -10,10 +10,12 @@
 #include <linux/types.h>
 
 #include "policy.h"
+#include "hooks.h"
 
 #define IPE_EVAL_CTX_INIT ((struct ipe_eval_ctx){ 0 })
 
 extern struct ipe_policy __rcu *ipe_active_policy;
+extern bool success_audit;
 
 struct ipe_superblock {
 	bool initramfs;
@@ -21,14 +23,23 @@ struct ipe_superblock {
 
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
+	enum ipe_hook_type hook;
 
 	const struct file *file;
 	bool initramfs;
 };
 
+enum ipe_match {
+	IPE_MATCH_RULE = 0,
+	IPE_MATCH_TABLE,
+	IPE_MATCH_GLOBAL,
+	__IPE_MATCH_MAX
+};
+
 void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 			const struct file *file,
-			enum ipe_op_type op);
+			enum ipe_op_type op,
+			enum ipe_hook_type hook);
 int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx);
 
 #endif /* _IPE_EVAL_H */
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
index 49484c8feead..9e410982b759 100644
--- a/security/ipe/fs.c
+++ b/security/ipe/fs.c
@@ -8,11 +8,62 @@
 
 #include "ipe.h"
 #include "fs.h"
+#include "eval.h"
 #include "policy.h"
+#include "audit.h"
 
 static struct dentry *np __ro_after_init;
 static struct dentry *root __ro_after_init;
 struct dentry *policy_root __ro_after_init;
+static struct dentry *audit_node __ro_after_init;
+
+/**
+ * setaudit() - Write handler for the securityfs node, "ipe/success_audit"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ */
+static ssize_t setaudit(struct file *f, const char __user *data,
+			size_t len, loff_t *offset)
+{
+	int rc = 0;
+	bool value;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	rc = kstrtobool_from_user(data, len, &value);
+	if (rc)
+		return rc;
+
+	WRITE_ONCE(success_audit, value);
+
+	return len;
+}
+
+/**
+ * getaudit() - Read handler for the securityfs node, "ipe/success_audit"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return: Length of buffer written
+ */
+static ssize_t getaudit(struct file *f, char __user *data,
+			size_t len, loff_t *offset)
+{
+	const char *result;
+
+	result = ((READ_ONCE(success_audit)) ? "1" : "0");
+
+	return simple_read_from_buffer(data, len, offset, result, 1);
+}
 
 /**
  * new_policy() - Write handler for the securityfs node, "ipe/new_policy".
@@ -51,6 +102,10 @@ static ssize_t new_policy(struct file *f, const char __user *data,
 	}
 
 	rc = ipe_new_policyfs_node(p);
+	if (rc)
+		goto out;
+
+	ipe_audit_policy_load(p);
 
 out:
 	if (rc < 0)
@@ -63,6 +118,11 @@ static const struct file_operations np_fops = {
 	.write = new_policy,
 };
 
+static const struct file_operations audit_fops = {
+	.write = setaudit,
+	.read = getaudit,
+};
+
 /**
  * ipe_init_securityfs() - Initialize IPE's securityfs tree at fsinit.
  *
@@ -82,6 +142,13 @@ static int __init ipe_init_securityfs(void)
 		goto err;
 	}
 
+	audit_node = securityfs_create_file("success_audit", 0600, root,
+					    NULL, &audit_fops);
+	if (IS_ERR(audit_node)) {
+		rc = PTR_ERR(audit_node);
+		goto err;
+	}
+
 	policy_root = securityfs_create_dir("policies", root);
 	if (IS_ERR(policy_root)) {
 		rc = PTR_ERR(policy_root);
@@ -98,6 +165,7 @@ static int __init ipe_init_securityfs(void)
 err:
 	securityfs_remove(np);
 	securityfs_remove(policy_root);
+	securityfs_remove(audit_node);
 	securityfs_remove(root);
 	return rc;
 }
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index 76370919aac0..b68719bf44fb 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -29,7 +29,7 @@ int ipe_bprm_check_security(struct linux_binprm *bprm)
 {
 	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
 
-	ipe_build_eval_ctx(&ctx, bprm->file, IPE_OP_EXEC);
+	ipe_build_eval_ctx(&ctx, bprm->file, IPE_OP_EXEC, IPE_HOOK_BPRM_CHECK);
 	return ipe_evaluate_event(&ctx);
 }
 
@@ -54,7 +54,7 @@ int ipe_mmap_file(struct file *f, unsigned long reqprot __always_unused,
 	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
 
 	if (prot & PROT_EXEC) {
-		ipe_build_eval_ctx(&ctx, f, IPE_OP_EXEC);
+		ipe_build_eval_ctx(&ctx, f, IPE_OP_EXEC, IPE_HOOK_MMAP);
 		return ipe_evaluate_event(&ctx);
 	}
 
@@ -86,7 +86,7 @@ int ipe_file_mprotect(struct vm_area_struct *vma,
 		return 0;
 
 	if (prot & PROT_EXEC) {
-		ipe_build_eval_ctx(&ctx, vma->vm_file, IPE_OP_EXEC);
+		ipe_build_eval_ctx(&ctx, vma->vm_file, IPE_OP_EXEC, IPE_HOOK_MPROTECT);
 		return ipe_evaluate_event(&ctx);
 	}
 
@@ -136,7 +136,7 @@ int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
 		WARN(1, "no rule setup for kernel_read_file enum %d", id);
 	}
 
-	ipe_build_eval_ctx(&ctx, file, op);
+	ipe_build_eval_ctx(&ctx, file, op, IPE_HOOK_KERNEL_READ);
 	return ipe_evaluate_event(&ctx);
 }
 
@@ -180,7 +180,7 @@ int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents)
 		WARN(1, "no rule setup for kernel_load_data enum %d", id);
 	}
 
-	ipe_build_eval_ctx(&ctx, NULL, op);
+	ipe_build_eval_ctx(&ctx, NULL, op, IPE_HOOK_KERNEL_LOAD);
 	return ipe_evaluate_event(&ctx);
 }
 
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index 4de5fabebd54..f4f0b544ddcc 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -9,6 +9,17 @@
 #include <linux/binfmts.h>
 #include <linux/security.h>
 
+enum ipe_hook_type {
+	IPE_HOOK_BPRM_CHECK = 0,
+	IPE_HOOK_MMAP,
+	IPE_HOOK_MPROTECT,
+	IPE_HOOK_KERNEL_READ,
+	IPE_HOOK_KERNEL_LOAD,
+	__IPE_HOOK_MAX
+};
+
+#define IPE_HOOK_INVALID __IPE_HOOK_MAX
+
 int ipe_bprm_check_security(struct linux_binprm *bprm);
 
 int ipe_mmap_file(struct file *f, unsigned long reqprot, unsigned long prot,
diff --git a/security/ipe/policy.c b/security/ipe/policy.c
index 112913f83c6d..5ac7640a8aef 100644
--- a/security/ipe/policy.c
+++ b/security/ipe/policy.c
@@ -11,6 +11,7 @@
 #include "fs.h"
 #include "policy.h"
 #include "policy_parser.h"
+#include "audit.h"
 
 /* lock for synchronizing writers across ipe policy */
 DEFINE_MUTEX(ipe_policy_lock);
@@ -112,6 +113,7 @@ int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
 
 	root->i_private = new;
 	swap(new->policyfs, old->policyfs);
+	ipe_audit_policy_load(new);
 
 	mutex_lock(&ipe_policy_lock);
 	ap = rcu_dereference_protected(ipe_active_policy,
@@ -120,6 +122,7 @@ int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
 		rcu_assign_pointer(ipe_active_policy, new);
 		mutex_unlock(&ipe_policy_lock);
 		synchronize_rcu();
+		ipe_audit_policy_activation(old, new);
 	} else {
 		mutex_unlock(&ipe_policy_lock);
 	}
@@ -220,5 +223,7 @@ int ipe_set_active_pol(const struct ipe_policy *p)
 	mutex_unlock(&ipe_policy_lock);
 	synchronize_rcu();
 
+	ipe_audit_policy_activation(ap, p);
+
 	return 0;
 }
-- 
2.44.0


^ permalink raw reply related	[relevance 26%]
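
The three shapes of the rule= field that ipe_audit_match() emits in the
patch above can be sketched in userspace as follows. This is only an
illustration under stated assumptions: format_rule(), enum match, and the
"props" string are hypothetical stand-ins, snprintf() replaces the kernel's
audit_log_format(), and the leading space the kernel places before rule=
is omitted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of enum ipe_match from the patch. */
enum match { MATCH_RULE, MATCH_TABLE, MATCH_GLOBAL };

/*
 * Format the rule= field for one of the three match scopes.
 * "props" stands in for the space-separated property list that the
 * kernel builds by walking the rule's property list; it may be NULL.
 * Returns the value returned by snprintf().
 */
static int format_rule(char *buf, size_t len, enum match m,
		       const char *op, const char *props, int deny)
{
	const char *act = deny ? "DENY" : "ALLOW";

	assert(buf != NULL && op != NULL);

	switch (m) {
	case MATCH_RULE:
		/* An explicit rule matched: reconstruct the rule itself. */
		if (props != NULL && props[0] != '\0')
			return snprintf(buf, len, "rule=\"%s %s action=%s\"",
					op, props, act);
		return snprintf(buf, len, "rule=\"%s action=%s\"", op, act);
	case MATCH_TABLE:
		/* No rule matched: the per-operation DEFAULT applied. */
		return snprintf(buf, len, "rule=\"DEFAULT op=%s action=%s\"",
				op, act);
	default:
		/* Neither matched: the global DEFAULT applied. */
		return snprintf(buf, len, "rule=\"DEFAULT action=%s\"", act);
	}
}
```

Calling format_rule() with MATCH_TABLE, op "EXECUTE" and deny set would
produce rule="DEFAULT op=EXECUTE action=DENY", which is the form an audit
consumer would need to parse when no explicit rule matched.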

* [PATCH v18 03/21] ipe: add evaluation loop
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
  2024-05-03 22:32 47% ` [PATCH v18 01/21] security: add ipe lsm Fan Wu
  2024-05-03 22:32 33% ` [PATCH v18 02/21] ipe: add policy parser Fan Wu
@ 2024-05-03 22:32 54% ` Fan Wu
  2024-05-03 22:32 45% ` [PATCH v18 04/21] ipe: add LSM hooks on execution and kernel read Fan Wu
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Introduce a core evaluation function in IPE that will be triggered by
various security hooks (e.g., mmap, bprm_check, kexec). This function
systematically assesses actions against the defined IPE policy, by
iterating over rules specific to the action being taken. This critical
addition enables IPE to enforce its security policies effectively,
ensuring that actions intercepted by these hooks are scrutinized for policy
compliance before they are allowed to proceed.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
+ Split evaluation loop, access control hooks, and evaluation loop from
  policy parser and userspace interface to pass mailing list character
  limit

v3:
+ Move ipe_load_properties to patch 04.
+ Remove useless 0-initializations
+ Prefix extern variables with ipe_
+ Remove kernel module parameters, as these are exposed through sysctls.
+ Add more prose to the IPE base config option help text.
+ Use GFP_KERNEL for audit_log_start.
+ Remove unnecessary caching system.
+ Remove comments from headers
+ Use rcu_access_pointer for rcu-pointer null check
+ Remove usage of reqprot; use prot only.
+ Move policy load and activation audit event to 03/12

v4:
+ Remove sysctls in favor of securityfs nodes
+ Re-add kernel module parameters, as these are now exposed through
  securityfs.
+ Refactor property audit loop to a separate function.

v5:
+ fix minor grammatical errors
+ do not group rule by curly-brace in audit record,
  reconstruct the exact rule.

v6:
+ No changes

v7:
+ Further split lsm creation into a separate commit from the evaluation
  loop and audit system, for easier review.
+ Propagate changes to support the new ipe_context structure in the
  evaluation loop.

v8:
+ Remove ipe_hook enumeration; hooks can be correlated via syscall record.

v9:
+ Remove ipe_context related code and simplify the evaluation loop.

v10:
+ Split eval part and boot_verified part

v11:
+ Fix code style issues

v12:
+ Correct an rcu_read_unlock usage
+ Add a WARN to unknown op during evaluation

v13:
+ No changes

v14:
+ No changes

v15:
+ No changes

v16:
+ No changes

v17:
+ Add years to license header
+ Fix code and documentation style issues

v18:
+ No changes
---
 security/ipe/Makefile |   1 +
 security/ipe/eval.c   | 102 ++++++++++++++++++++++++++++++++++++++++++
 security/ipe/eval.h   |  24 ++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 security/ipe/eval.c
 create mode 100644 security/ipe/eval.h

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 3093de1afd3e..4cc17eb92060 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -6,6 +6,7 @@
 #
 
 obj-$(CONFIG_SECURITY_IPE) += \
+	eval.o \
 	ipe.o \
 	policy.o \
 	policy_parser.o \
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
new file mode 100644
index 000000000000..41331afdef7c
--- /dev/null
+++ b/security/ipe/eval.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/file.h>
+#include <linux/sched.h>
+#include <linux/rcupdate.h>
+
+#include "ipe.h"
+#include "eval.h"
+#include "policy.h"
+
+struct ipe_policy __rcu *ipe_active_policy;
+
+/**
+ * evaluate_property() - Analyze @ctx against a rule property.
+ * @ctx: Supplies a pointer to the context to be evaluated.
+ * @p: Supplies a pointer to the property to be evaluated.
+ *
+ * This is a placeholder. The actual function will be introduced in the
+ * latter commits.
+ *
+ * Return:
+ * * %true	- The current @ctx match the @p
+ * * %false	- The current @ctx doesn't match the @p
+ */
+static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
+			      struct ipe_prop *p)
+{
+	return false;
+}
+
+/**
+ * ipe_evaluate_event() - Analyze @ctx against the current active policy.
+ * @ctx: Supplies a pointer to the context to be evaluated.
+ *
+ * This is the loop where all policy evaluation happens against IPE policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- @ctx did not pass evaluation
+ */
+int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
+{
+	const struct ipe_op_table *rules = NULL;
+	const struct ipe_rule *rule = NULL;
+	struct ipe_policy *pol = NULL;
+	struct ipe_prop *prop = NULL;
+	enum ipe_action_type action;
+	bool match = false;
+
+	rcu_read_lock();
+
+	pol = rcu_dereference(ipe_active_policy);
+	if (!pol) {
+		rcu_read_unlock();
+		return 0;
+	}
+
+	if (ctx->op == IPE_OP_INVALID) {
+		if (pol->parsed->global_default_action == IPE_ACTION_DENY) {
+			rcu_read_unlock();
+			return -EACCES;
+		}
+		if (pol->parsed->global_default_action == IPE_ACTION_INVALID)
+			WARN(1, "no default rule set for unknown op, ALLOW it");
+		rcu_read_unlock();
+		return 0;
+	}
+
+	rules = &pol->parsed->rules[ctx->op];
+
+	list_for_each_entry(rule, &rules->rules, next) {
+		match = true;
+
+		list_for_each_entry(prop, &rule->props, next) {
+			match = evaluate_property(ctx, prop);
+			if (!match)
+				break;
+		}
+
+		if (match)
+			break;
+	}
+
+	if (match)
+		action = rule->action;
+	else if (rules->default_action != IPE_ACTION_INVALID)
+		action = rules->default_action;
+	else
+		action = pol->parsed->global_default_action;
+
+	rcu_read_unlock();
+	if (action == IPE_ACTION_DENY)
+		return -EACCES;
+
+	return 0;
+}
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
new file mode 100644
index 000000000000..b137f2107852
--- /dev/null
+++ b/security/ipe/eval.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_EVAL_H
+#define _IPE_EVAL_H
+
+#include <linux/file.h>
+#include <linux/types.h>
+
+#include "policy.h"
+
+extern struct ipe_policy __rcu *ipe_active_policy;
+
+struct ipe_eval_ctx {
+	enum ipe_op_type op;
+
+	const struct file *file;
+};
+
+int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx);
+
+#endif /* _IPE_EVAL_H */
-- 
2.44.0


^ permalink raw reply related	[relevance 54%]
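
The decision order used by ipe_evaluate_event() in the patch above (first
matching rule, then the per-operation default, then the global default) can
be sketched in userspace as follows. This is a minimal sketch, not the
kernel code: the types and the evaluate() helper are hypothetical, a plain
array replaces the kernel's linked lists, the "matches" flag stands in for
"every property of the rule evaluated to true", and RCU protection of the
active policy is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's action type. */
enum action { ACTION_ALLOW, ACTION_DENY, ACTION_INVALID };

struct rule {
	bool matches;		/* stands in for "all properties matched" */
	enum action action;	/* decision if this rule matches */
};

struct op_table {
	const struct rule *rules;	/* rules for one operation, in order */
	size_t nrules;
	enum action default_action;	/* per-operation DEFAULT or INVALID */
};

/*
 * Return the action for one operation table:
 *   1. the first rule that matches wins;
 *   2. otherwise the per-operation default, if one was set;
 *   3. otherwise the policy-wide global default.
 */
static enum action evaluate(const struct op_table *t,
			    enum action global_default)
{
	assert(t != NULL);

	for (size_t i = 0; i < t->nrules; i++)
		if (t->rules[i].matches)
			return t->rules[i].action;

	if (t->default_action != ACTION_INVALID)
		return t->default_action;

	return global_default;
}
```

A caller would map ACTION_DENY to -EACCES and anything else to 0, as the
patch does at the end of ipe_evaluate_event().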

* [PATCH v18 01/21] security: add ipe lsm
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
@ 2024-05-03 22:32 47% ` Fan Wu
  2024-05-03 22:32 33% ` [PATCH v18 02/21] ipe: add policy parser Fan Wu
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Integrity Policy Enforcement (IPE) is an LSM that provides a
complementary approach to Mandatory Access Control compared with
existing LSMs.

Existing LSMs have centered around the concept that access to a resource
should be controlled by the current user's credentials. IPE's approach
is that access to a resource should be controlled by the system's trust
of a current resource.

The basis of this approach is defining a global policy to specify which
resources can be trusted.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation into a separate commit from the
    evaluation loop and audit system, for easier review.

  + Introduce the concept of an ipe_context, a scoped way to
    introduce execution policies, used initially for allowing for
    kunit tests in isolation.

v8:
  + Follow lsmname_hook_name convention for lsm hooks.
  + Move LSM blob accessors to ipe.c and mark LSM blobs as static.

v9:
  + Remove ipe_context for simplification

v10:
  + Add github url

v11:
  + Correct github url
  + Move ipe before bpf

v12:
  + Switch to use lsm_id instead of string for lsm name

v13:
  + No changes

v14:
  + No changes

v15:
  + Add missing code in tools/testing/selftests/lsm/lsm_list_modules_test.c

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
---
 include/uapi/linux/lsm.h                      |  1 +
 security/Kconfig                              | 11 ++---
 security/Makefile                             |  1 +
 security/ipe/Kconfig                          | 17 ++++++++
 security/ipe/Makefile                         |  9 ++++
 security/ipe/ipe.c                            | 42 +++++++++++++++++++
 security/ipe/ipe.h                            | 16 +++++++
 security/security.c                           |  3 +-
 .../selftests/lsm/lsm_list_modules_test.c     |  3 ++
 9 files changed, 97 insertions(+), 6 deletions(-)
 create mode 100644 security/ipe/Kconfig
 create mode 100644 security/ipe/Makefile
 create mode 100644 security/ipe/ipe.c
 create mode 100644 security/ipe/ipe.h

diff --git a/include/uapi/linux/lsm.h b/include/uapi/linux/lsm.h
index 33d8c9f4aa6b..938593dfd5da 100644
--- a/include/uapi/linux/lsm.h
+++ b/include/uapi/linux/lsm.h
@@ -64,6 +64,7 @@ struct lsm_ctx {
 #define LSM_ID_LANDLOCK		110
 #define LSM_ID_IMA		111
 #define LSM_ID_EVM		112
+#define LSM_ID_IPE		113
 
 /*
  * LSM_ATTR_XXX definitions identify different LSM attributes
diff --git a/security/Kconfig b/security/Kconfig
index 412e76f1575d..9fb8f9b14972 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -192,6 +192,7 @@ source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
 source "security/landlock/Kconfig"
+source "security/ipe/Kconfig"
 
 source "security/integrity/Kconfig"
 
@@ -231,11 +232,11 @@ endchoice
 
 config LSM
 	string "Ordered list of enabled LSMs"
-	default "landlock,lockdown,yama,loadpin,safesetid,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
-	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
-	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
-	default "landlock,lockdown,yama,loadpin,safesetid,bpf" if DEFAULT_SECURITY_DAC
-	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,bpf"
+	default "landlock,lockdown,yama,loadpin,safesetid,smack,selinux,tomoyo,apparmor,ipe,bpf" if DEFAULT_SECURITY_SMACK
+	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,ipe,bpf" if DEFAULT_SECURITY_APPARMOR
+	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,ipe,bpf" if DEFAULT_SECURITY_TOMOYO
+	default "landlock,lockdown,yama,loadpin,safesetid,ipe,bpf" if DEFAULT_SECURITY_DAC
+	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,bpf"
 	help
 	  A comma-separated list of LSMs, in initialization order.
 	  Any LSMs left off this list, except for those with order
diff --git a/security/Makefile b/security/Makefile
index 59f238490665..cc0982214b84 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown/
 obj-$(CONFIG_CGROUPS)			+= device_cgroup.o
 obj-$(CONFIG_BPF_LSM)			+= bpf/
 obj-$(CONFIG_SECURITY_LANDLOCK)		+= landlock/
+obj-$(CONFIG_SECURITY_IPE)		+= ipe/
 
 # Object integrity file lists
 obj-$(CONFIG_INTEGRITY)			+= integrity/
diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
new file mode 100644
index 000000000000..e4875fb04883
--- /dev/null
+++ b/security/ipe/Kconfig
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Integrity Policy Enforcement (IPE) configuration
+#
+
+menuconfig SECURITY_IPE
+	bool "Integrity Policy Enforcement (IPE)"
+	depends on SECURITY && SECURITYFS
+	select PKCS7_MESSAGE_PARSER
+	select SYSTEM_DATA_VERIFICATION
+	help
+	  This option enables the Integrity Policy Enforcement LSM
+	  allowing users to define a policy to enforce a trust-based access
+	  control. A key feature of IPE is a customizable policy to allow
+	  admins to reconfigure trust requirements on the fly.
+
+	  If unsure, answer N.
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
new file mode 100644
index 000000000000..5486398a69e9
--- /dev/null
+++ b/security/ipe/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+#
+# Makefile for building the IPE module as part of the kernel tree.
+#
+
+obj-$(CONFIG_SECURITY_IPE) += \
+	ipe.o \
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
new file mode 100644
index 000000000000..8d4ea372873e
--- /dev/null
+++ b/security/ipe/ipe.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#include <uapi/linux/lsm.h>
+
+#include "ipe.h"
+
+static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
+};
+
+static const struct lsm_id ipe_lsmid = {
+	.name = "ipe",
+	.id = LSM_ID_IPE,
+};
+
+static struct security_hook_list ipe_hooks[] __ro_after_init = {
+};
+
+/**
+ * ipe_init() - Entry point of IPE.
+ *
+ * This is called at LSM init, which occurs early during kernel
+ * start up. During this phase, IPE registers its hooks and loads the
+ * builtin boot policy.
+ *
+ * Return:
+ * * %0		- OK
+ * * %-ENOMEM	- Out of memory (OOM)
+ */
+static int __init ipe_init(void)
+{
+	security_add_hooks(ipe_hooks, ARRAY_SIZE(ipe_hooks), &ipe_lsmid);
+
+	return 0;
+}
+
+DEFINE_LSM(ipe) = {
+	.name = "ipe",
+	.init = ipe_init,
+	.blobs = &ipe_blobs,
+};
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
new file mode 100644
index 000000000000..adc3c45e9f53
--- /dev/null
+++ b/security/ipe/ipe.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_H
+#define _IPE_H
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) "ipe: " fmt
+
+#include <linux/lsm_hooks.h>
+
+#endif /* _IPE_H */
diff --git a/security/security.c b/security/security.c
index 0a9a0ac3f266..820e0d437452 100644
--- a/security/security.c
+++ b/security/security.c
@@ -51,7 +51,8 @@
 	(IS_ENABLED(CONFIG_BPF_LSM) ? 1 : 0) + \
 	(IS_ENABLED(CONFIG_SECURITY_LANDLOCK) ? 1 : 0) + \
 	(IS_ENABLED(CONFIG_IMA) ? 1 : 0) + \
-	(IS_ENABLED(CONFIG_EVM) ? 1 : 0))
+	(IS_ENABLED(CONFIG_EVM) ? 1 : 0) + \
+	(IS_ENABLED(CONFIG_SECURITY_IPE) ? 1 : 0))
 
 /*
  * These are descriptions of the reasons that can be passed to the
diff --git a/tools/testing/selftests/lsm/lsm_list_modules_test.c b/tools/testing/selftests/lsm/lsm_list_modules_test.c
index 06d24d4679a6..1cc8a977c711 100644
--- a/tools/testing/selftests/lsm/lsm_list_modules_test.c
+++ b/tools/testing/selftests/lsm/lsm_list_modules_test.c
@@ -128,6 +128,9 @@ TEST(correct_lsm_list_modules)
 		case LSM_ID_EVM:
 			name = "evm";
 			break;
+		case LSM_ID_IPE:
+			name = "ipe";
+			break;
 		default:
 			name = "INVALID";
 			break;
-- 
2.44.0


^ permalink raw reply related	[relevance 47%]

* [PATCH v18 04/21] ipe: add LSM hooks on execution and kernel read
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (2 preceding siblings ...)
  2024-05-03 22:32 54% ` [PATCH v18 03/21] ipe: add evaluation loop Fan Wu
@ 2024-05-03 22:32 45% ` Fan Wu
  2024-05-03 22:32 67% ` [PATCH v18 05/21] initramfs|security: Add a security hook to do_populate_rootfs() Fan Wu
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

IPE's initial goal is to control both execution and the loading of
kernel modules based on the system's definition of trust. It
accomplishes this by plugging into the security hooks for
bprm_check_security, file_mprotect, mmap_file, kernel_load_data,
and kernel_read_data.
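
As an illustration only (not part of the patch), the decision made by the
mmap_file and file_mprotect hooks below can be modelled in userspace. The
model_* names are hypothetical; the sketch assumes nothing beyond what the
hook bodies in this patch show: only requests that would add PROT_EXEC reach
the evaluation loop, and mprotect on an already-executable region is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/*
 * Hypothetical userspace model of the checks in ipe_mmap_file() and
 * ipe_file_mprotect(). Each helper returns true when the request would be
 * handed to the policy evaluation loop, and false when the hook returns 0
 * without evaluating anything.
 */

/* mmap: evaluate only when the effective protection includes PROT_EXEC. */
static bool model_mmap_needs_eval(unsigned long prot)
{
	return (prot & PROT_EXEC) != 0;
}

/*
 * mprotect: the patch skips regions that are already executable, so only
 * a transition from non-executable to PROT_EXEC is evaluated.
 */
static bool model_mprotect_needs_eval(bool vma_already_exec, unsigned long prot)
{
	if (vma_already_exec)
		return false;
	return (prot & PROT_EXEC) != 0;
}

/* Small usage example for the two helpers above. */
static void model_hooks_demo(void)
{
	assert(model_mmap_needs_eval(PROT_READ | PROT_EXEC));
	assert(!model_mmap_needs_eval(PROT_READ | PROT_WRITE));
	assert(!model_mprotect_needs_eval(true, PROT_READ | PROT_EXEC));
	assert(model_mprotect_needs_eval(false, PROT_READ | PROT_EXEC));
}
```

In the real hooks a true result means ipe_build_eval_ctx() is called with
IPE_OP_EXEC and the outcome of ipe_evaluate_event() is returned.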

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation, the audit system, the evaluation loop
    and access control hooks into separate commits.

v8:
  + Rename hook functions to follow the lsmname_hook_name convention
  + Remove ipe_hook enumeration, can be derived from correlation with
    syscall audit record.

v9:
  + Minor changes for adapting to the new parser

v10:
  + Remove @reqprot part

v11:
  + Fix code style issues

v12:
  + Correct WARN usages

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
---
 security/ipe/Makefile |   1 +
 security/ipe/eval.c   |  14 ++++
 security/ipe/eval.h   |   5 ++
 security/ipe/hooks.c  | 184 ++++++++++++++++++++++++++++++++++++++++++
 security/ipe/hooks.h  |  25 ++++++
 security/ipe/ipe.c    |   6 ++
 6 files changed, 235 insertions(+)
 create mode 100644 security/ipe/hooks.c
 create mode 100644 security/ipe/hooks.h

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 4cc17eb92060..e1c27e974c5c 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -7,6 +7,7 @@
 
 obj-$(CONFIG_SECURITY_IPE) += \
 	eval.o \
+	hooks.o \
 	ipe.o \
 	policy.o \
 	policy_parser.o \
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 41331afdef7c..cc3b3f6583ad 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -16,6 +16,20 @@
 
 struct ipe_policy __rcu *ipe_active_policy;
 
+/**
+ * ipe_build_eval_ctx() - Build an ipe evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @file: Supplies a pointer to the file associated with the evaluation.
+ * @op: Supplies the IPE policy operation associated with the evaluation.
+ */
+void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
+			const struct file *file,
+			enum ipe_op_type op)
+{
+	ctx->file = file;
+	ctx->op = op;
+}
+
 /**
  * evaluate_property() - Analyze @ctx against a rule property.
  * @ctx: Supplies a pointer to the context to be evaluated.
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index b137f2107852..00ed8ceca10e 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -11,6 +11,8 @@
 
 #include "policy.h"
 
+#define IPE_EVAL_CTX_INIT ((struct ipe_eval_ctx){ 0 })
+
 extern struct ipe_policy __rcu *ipe_active_policy;
 
 struct ipe_eval_ctx {
@@ -19,6 +21,9 @@ struct ipe_eval_ctx {
 	const struct file *file;
 };
 
+void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
+			const struct file *file,
+			enum ipe_op_type op);
 int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx);
 
 #endif /* _IPE_EVAL_H */
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
new file mode 100644
index 000000000000..f2aaa749dd7b
--- /dev/null
+++ b/security/ipe/hooks.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/binfmts.h>
+#include <linux/mman.h>
+
+#include "ipe.h"
+#include "hooks.h"
+#include "eval.h"
+
+/**
+ * ipe_bprm_check_security() - ipe security hook function for bprm check.
+ * @bprm: Supplies a pointer to a linux_binprm structure to source the file
+ *	  being evaluated.
+ *
+ * This LSM hook is called when a binary is loaded through the exec
+ * family of system calls.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_bprm_check_security(struct linux_binprm *bprm)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+
+	ipe_build_eval_ctx(&ctx, bprm->file, IPE_OP_EXEC);
+	return ipe_evaluate_event(&ctx);
+}
+
+/**
+ * ipe_mmap_file() - ipe security hook function for mmap check.
+ * @f: File being mmap'd. Can be NULL in the case of anonymous memory.
+ * @reqprot: The requested protection on the mmap, passed from usermode.
+ * @prot: The effective protection on the mmap, resolved from reqprot and
+ *	  system configuration.
+ * @flags: Unused.
+ *
+ * This hook is called when a file is loaded through the mmap
+ * family of system calls.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_mmap_file(struct file *f, unsigned long reqprot __always_unused,
+		  unsigned long prot, unsigned long flags)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+
+	if (prot & PROT_EXEC) {
+		ipe_build_eval_ctx(&ctx, f, IPE_OP_EXEC);
+		return ipe_evaluate_event(&ctx);
+	}
+
+	return 0;
+}
+
+/**
+ * ipe_file_mprotect() - ipe security hook function for mprotect check.
+ * @vma: Existing virtual memory area created by mmap or similar.
+ * @reqprot: The requested protection on the mmap, passed from usermode.
+ * @prot: The effective protection on the mmap, resolved from reqprot and
+ *	  system configuration.
+ *
+ * This LSM hook is called when a mmap'd region of memory is changing
+ * its protections via mprotect.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_file_mprotect(struct vm_area_struct *vma,
+		      unsigned long reqprot __always_unused,
+		      unsigned long prot)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+
+	/* Already Executable */
+	if (vma->vm_flags & VM_EXEC)
+		return 0;
+
+	if (prot & PROT_EXEC) {
+		ipe_build_eval_ctx(&ctx, vma->vm_file, IPE_OP_EXEC);
+		return ipe_evaluate_event(&ctx);
+	}
+
+	return 0;
+}
+
+/**
+ * ipe_kernel_read_file() - ipe security hook function for kernel read.
+ * @file: Supplies a pointer to the file structure being read in from disk.
+ * @id: Supplies the enumeration identifying the purpose of the read.
+ * @contents: Unused.
+ *
+ * This LSM hook is called when a file is being read in from disk from
+ * the kernel.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
+			 bool contents)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+	enum ipe_op_type op;
+
+	switch (id) {
+	case READING_FIRMWARE:
+		op = IPE_OP_FIRMWARE;
+		break;
+	case READING_MODULE:
+		op = IPE_OP_KERNEL_MODULE;
+		break;
+	case READING_KEXEC_INITRAMFS:
+		op = IPE_OP_KEXEC_INITRAMFS;
+		break;
+	case READING_KEXEC_IMAGE:
+		op = IPE_OP_KEXEC_IMAGE;
+		break;
+	case READING_POLICY:
+		op = IPE_OP_POLICY;
+		break;
+	case READING_X509_CERTIFICATE:
+		op = IPE_OP_X509;
+		break;
+	default:
+		op = IPE_OP_INVALID;
+		WARN(1, "no rule setup for kernel_read_file enum %d", id);
+	}
+
+	ipe_build_eval_ctx(&ctx, file, op);
+	return ipe_evaluate_event(&ctx);
+}
+
+/**
+ * ipe_kernel_load_data() - ipe security hook function for kernel load data.
+ * @id: Supplies the enumeration identifying the purpose of the read.
+ * @contents: Unused.
+ *
+ * This LSM hook is called when a buffer is being read in from disk.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+	enum ipe_op_type op;
+
+	switch (id) {
+	case LOADING_FIRMWARE:
+		op = IPE_OP_FIRMWARE;
+		break;
+	case LOADING_MODULE:
+		op = IPE_OP_KERNEL_MODULE;
+		break;
+	case LOADING_KEXEC_INITRAMFS:
+		op = IPE_OP_KEXEC_INITRAMFS;
+		break;
+	case LOADING_KEXEC_IMAGE:
+		op = IPE_OP_KEXEC_IMAGE;
+		break;
+	case LOADING_POLICY:
+		op = IPE_OP_POLICY;
+		break;
+	case LOADING_X509_CERTIFICATE:
+		op = IPE_OP_X509;
+		break;
+	default:
+		op = IPE_OP_INVALID;
+		WARN(1, "no rule setup for kernel_load_data enum %d", id);
+	}
+
+	ipe_build_eval_ctx(&ctx, NULL, op);
+	return ipe_evaluate_event(&ctx);
+}
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
new file mode 100644
index 000000000000..c22c3336d27c
--- /dev/null
+++ b/security/ipe/hooks.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#ifndef _IPE_HOOKS_H
+#define _IPE_HOOKS_H
+
+#include <linux/fs.h>
+#include <linux/binfmts.h>
+#include <linux/security.h>
+
+int ipe_bprm_check_security(struct linux_binprm *bprm);
+
+int ipe_mmap_file(struct file *f, unsigned long reqprot, unsigned long prot,
+		  unsigned long flags);
+
+int ipe_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+		      unsigned long prot);
+
+int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
+			 bool contents);
+
+int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents);
+
+#endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 8d4ea372873e..729334812636 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -5,6 +5,7 @@
 #include <uapi/linux/lsm.h>
 
 #include "ipe.h"
+#include "hooks.h"
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 };
@@ -15,6 +16,11 @@ static const struct lsm_id ipe_lsmid = {
 };
 
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
+	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
+	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
+	LSM_HOOK_INIT(file_mprotect, ipe_file_mprotect),
+	LSM_HOOK_INIT(kernel_read_file, ipe_kernel_read_file),
+	LSM_HOOK_INIT(kernel_load_data, ipe_kernel_load_data),
 };
 
 /**
-- 
2.44.0


^ permalink raw reply related	[relevance 45%]

* [PATCH v18 02/21] ipe: add policy parser
  2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
  2024-05-03 22:32 47% ` [PATCH v18 01/21] security: add ipe lsm Fan Wu
@ 2024-05-03 22:32 33% ` Fan Wu
  2024-05-03 22:32 54% ` [PATCH v18 03/21] ipe: add evaluation loop Fan Wu
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

IPE's interpretation of what the user trusts is accomplished through
its policy. IPE's design is not to support a single trust provider,
but to support multiple providers so that the end-user can choose the
one that best suits their needs.

This requires the policy to be rather flexible and modular so that
integrity providers, like fs-verity, dm-verity, dm-integrity, or
some other system, can plug into the policy with minimal code changes.
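
For illustration only (not part of the patch), the rule-line shape accepted
at this stage of the series can be modelled in userspace. The token strings
"DEFAULT", "op=EXECUTE", "op=KMODULE", "action=ALLOW" and "action=DENY" come
from the match tables in policy_parser.c below; every model_* name is
hypothetical, only two operations are modelled, and properties are treated
as syntax errors because parse_property() is still a placeholder in this
patch.

```c
#include <assert.h>
#include <string.h>

enum model_op { MODEL_OP_EXEC, MODEL_OP_KMODULE, MODEL_OP_INVALID };
enum model_action { MODEL_ALLOW, MODEL_DENY, MODEL_ACTION_INVALID };

struct model_rule {
	int is_default;
	enum model_op op;
	enum model_action action;
};

/*
 * Parse one rule line in place. Returns 0 on success, -1 on a syntax error.
 * Simplified grammar: [DEFAULT] [op=<OP>] action=<ACTION>
 */
static int model_parse_rule(char *line, struct model_rule *r)
{
	const char *delim = " \t";
	int seen = 0;		/* tokens consumed so far */
	int have_action = 0;

	r->is_default = 0;
	r->op = MODEL_OP_INVALID;
	r->action = MODEL_ACTION_INVALID;

	line += strspn(line, delim);
	while (*line) {
		char *tok = line;

		/* Terminate the current token and skip to the next one. */
		line += strcspn(line, delim);
		if (*line) {
			*line++ = '\0';
			line += strspn(line, delim);
		}

		if (have_action)
			return -1;	/* the action must be the last token */

		if (seen == 0 && !strcmp(tok, "DEFAULT")) {
			r->is_default = 1;
		} else if (!strcmp(tok, "action=ALLOW")) {
			r->action = MODEL_ALLOW;
			have_action = 1;
		} else if (!strcmp(tok, "action=DENY")) {
			r->action = MODEL_DENY;
			have_action = 1;
		} else if (r->op == MODEL_OP_INVALID && !strcmp(tok, "op=EXECUTE")) {
			r->op = MODEL_OP_EXEC;
		} else if (r->op == MODEL_OP_INVALID && !strcmp(tok, "op=KMODULE")) {
			r->op = MODEL_OP_KMODULE;
		} else {
			return -1;	/* unknown token or unsupported property */
		}
		seen++;
	}

	if (!have_action)
		return -1;
	/* Only a DEFAULT rule may omit the operation. */
	if (!r->is_default && r->op == MODEL_OP_INVALID)
		return -1;
	return 0;
}

/* Small usage example for the parser model above. */
static void model_rule_demo(void)
{
	char allow_exec[] = "op=EXECUTE action=ALLOW";
	char global_default[] = "DEFAULT action=DENY";
	struct model_rule r;

	assert(model_parse_rule(allow_exec, &r) == 0);
	assert(r.op == MODEL_OP_EXEC && r.action == MODEL_ALLOW);

	assert(model_parse_rule(global_default, &r) == 0);
	assert(r.is_default && r.op == MODEL_OP_INVALID);
}
```

A DEFAULT line without an operation corresponds to the global default
action, and a DEFAULT line with an operation to that operation's default.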

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move policy load and activation audit event to 03/12
  + Fix a potential panic when a policy failed to load.
  + use pr_warn for a failure to parse instead of an
    audit record
  + Remove comments from headers
  + Add lockdep assertions to ipe_update_active_policy and
    ipe_activate_policy
  + Fix up warnings with checkpatch --strict
  + Use file_ns_capable for CAP_MAC_ADMIN for securityfs
    nodes.
  + Use memdup_user instead of kzalloc+simple_write_to_buffer.
  + Remove strict_parse command line parameter, as it is added
    by the sysctl command line.
  + Prefix extern variables with ipe_

v4:
  + Remove securityfs to reverse-dependency
  + Add SHA1 reverse dependency.
  + Add versioning scheme for IPE properties, and associated
    interface to query the versioning scheme.
  + Cause a parser to always return an error on unknown syntax.
  + Remove strict_parse option
  + Change active_policy interface from sysctl, to securityfs,
    and change scheme.

v5:
  + Cause an error if a default action is not defined for each
    operation.
  + Minor function renames

v6:
  + No changes

v7:
  + Further split parser and userspace interface into two
    separate commits, for easier review.
  + Refactor policy parser to make code cleaner via introducing a
    more modular design, for easier extension of policy, and
    easier review.

v8:
  + remove unnecessary pr_info emission on parser loading
  + add explicit newline to the pr_err emitted when a parser
    fails to load.

v9:
  + switch to match table to parse policy
  + remove quote syntax and KERNEL_READ operation

v10:
  + Fix memory leaks in parser
  + Fix typos and change code styles

v11:
  + Fix code style issues

v12:
  + Add __always_unused to an unused parameter
  + Simplify error case handling

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues

v18:
  + No changes
---
 security/ipe/Makefile        |   2 +
 security/ipe/policy.c        | 103 ++++++++
 security/ipe/policy.h        |  83 ++++++
 security/ipe/policy_parser.c | 495 +++++++++++++++++++++++++++++++++++
 security/ipe/policy_parser.h |  11 +
 5 files changed, 694 insertions(+)
 create mode 100644 security/ipe/policy.c
 create mode 100644 security/ipe/policy.h
 create mode 100644 security/ipe/policy_parser.c
 create mode 100644 security/ipe/policy_parser.h

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 5486398a69e9..3093de1afd3e 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -7,3 +7,5 @@
 
 obj-$(CONFIG_SECURITY_IPE) += \
 	ipe.o \
+	policy.o \
+	policy_parser.o \
diff --git a/security/ipe/policy.c b/security/ipe/policy.c
new file mode 100644
index 000000000000..dd7b5b79903a
--- /dev/null
+++ b/security/ipe/policy.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/errno.h>
+#include <linux/verification.h>
+
+#include "ipe.h"
+#include "policy.h"
+#include "policy_parser.h"
+
+/**
+ * ipe_free_policy() - Deallocate a given IPE policy.
+ * @p: Supplies the policy to free.
+ *
+ * Safe to call on IS_ERR/NULL.
+ */
+void ipe_free_policy(struct ipe_policy *p)
+{
+	if (IS_ERR_OR_NULL(p))
+		return;
+
+	ipe_free_parsed_policy(p->parsed);
+	/*
+	 * p->text is allocated only when p->pkcs7 is NULL;
+	 * otherwise it points to the plaintext data inside the pkcs7
+	 */
+	if (!p->pkcs7)
+		kfree(p->text);
+	kfree(p->pkcs7);
+	kfree(p);
+}
+
+static int set_pkcs7_data(void *ctx, const void *data, size_t len,
+			  size_t asn1hdrlen __always_unused)
+{
+	struct ipe_policy *p = ctx;
+
+	p->text = (const char *)data;
+	p->textlen = len;
+
+	return 0;
+}
+
+/**
+ * ipe_new_policy() - Allocate and parse an ipe_policy structure.
+ *
+ * @text: Supplies a pointer to the plain-text policy to parse.
+ * @textlen: Supplies the length of @text.
+ * @pkcs7: Supplies a pointer to a pkcs7-signed IPE policy.
+ * @pkcs7len: Supplies the length of @pkcs7.
+ *
+ * @text/@textlen Should be NULL/0 if @pkcs7/@pkcs7len is set.
+ *
+ * Return:
+ * * a pointer to the ipe_policy structure	- Success
+ * * %-EBADMSG					- Policy is invalid
+ * * %-ENOMEM					- Out of memory (OOM)
+ * * %-ERANGE					- Policy version number overflow
+ * * %-EINVAL					- Policy version parsing error
+ */
+struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
+				  const char *pkcs7, size_t pkcs7len)
+{
+	struct ipe_policy *new = NULL;
+	int rc = 0;
+
+	new = kzalloc(sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+
+	if (!text) {
+		new->pkcs7len = pkcs7len;
+		new->pkcs7 = kmemdup(pkcs7, pkcs7len, GFP_KERNEL);
+		if (!new->pkcs7) {
+			rc = -ENOMEM;
+			goto err;
+		}
+
+		rc = verify_pkcs7_signature(NULL, 0, new->pkcs7, pkcs7len, NULL,
+					    VERIFYING_UNSPECIFIED_SIGNATURE,
+					    set_pkcs7_data, new);
+		if (rc)
+			goto err;
+	} else {
+		new->textlen = textlen;
+		new->text = kstrdup(text, GFP_KERNEL);
+		if (!new->text) {
+			rc = -ENOMEM;
+			goto err;
+		}
+	}
+
+	rc = ipe_parse_policy(new);
+	if (rc)
+		goto err;
+
+	return new;
+err:
+	ipe_free_policy(new);
+	return ERR_PTR(rc);
+}
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
new file mode 100644
index 000000000000..8292ffaaff12
--- /dev/null
+++ b/security/ipe/policy.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#ifndef _IPE_POLICY_H
+#define _IPE_POLICY_H
+
+#include <linux/list.h>
+#include <linux/types.h>
+
+enum ipe_op_type {
+	IPE_OP_EXEC = 0,
+	IPE_OP_FIRMWARE,
+	IPE_OP_KERNEL_MODULE,
+	IPE_OP_KEXEC_IMAGE,
+	IPE_OP_KEXEC_INITRAMFS,
+	IPE_OP_POLICY,
+	IPE_OP_X509,
+	__IPE_OP_MAX,
+};
+
+#define IPE_OP_INVALID __IPE_OP_MAX
+
+enum ipe_action_type {
+	IPE_ACTION_ALLOW = 0,
+	IPE_ACTION_DENY,
+	__IPE_ACTION_MAX
+};
+
+#define IPE_ACTION_INVALID __IPE_ACTION_MAX
+
+enum ipe_prop_type {
+	__IPE_PROP_MAX
+};
+
+#define IPE_PROP_INVALID __IPE_PROP_MAX
+
+struct ipe_prop {
+	struct list_head next;
+	enum ipe_prop_type type;
+	void *value;
+};
+
+struct ipe_rule {
+	enum ipe_op_type op;
+	enum ipe_action_type action;
+	struct list_head props;
+	struct list_head next;
+};
+
+struct ipe_op_table {
+	struct list_head rules;
+	enum ipe_action_type default_action;
+};
+
+struct ipe_parsed_policy {
+	const char *name;
+	struct {
+		u16 major;
+		u16 minor;
+		u16 rev;
+	} version;
+
+	enum ipe_action_type global_default_action;
+
+	struct ipe_op_table rules[__IPE_OP_MAX];
+};
+
+struct ipe_policy {
+	const char *pkcs7;
+	size_t pkcs7len;
+
+	const char *text;
+	size_t textlen;
+
+	struct ipe_parsed_policy *parsed;
+};
+
+struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
+				  const char *pkcs7, size_t pkcs7len);
+void ipe_free_policy(struct ipe_policy *pol);
+
+#endif /* _IPE_POLICY_H */
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
new file mode 100644
index 000000000000..32064262348a
--- /dev/null
+++ b/security/ipe/policy_parser.c
@@ -0,0 +1,495 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/parser.h>
+#include <linux/types.h>
+#include <linux/ctype.h>
+
+#include "policy.h"
+#include "policy_parser.h"
+
+#define START_COMMENT	'#'
+#define IPE_POLICY_DELIM " \t"
+#define IPE_LINE_DELIM "\n\r"
+
+/**
+ * new_parsed_policy() - Allocate and initialize a parsed policy.
+ *
+ * Return:
+ * * a pointer to the ipe_parsed_policy structure	- Success
+ * * %-ENOMEM						- Out of memory (OOM)
+ */
+static struct ipe_parsed_policy *new_parsed_policy(void)
+{
+	struct ipe_parsed_policy *p = NULL;
+	struct ipe_op_table *t = NULL;
+	size_t i = 0;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return ERR_PTR(-ENOMEM);
+
+	p->global_default_action = IPE_ACTION_INVALID;
+
+	for (i = 0; i < ARRAY_SIZE(p->rules); ++i) {
+		t = &p->rules[i];
+
+		t->default_action = IPE_ACTION_INVALID;
+		INIT_LIST_HEAD(&t->rules);
+	}
+
+	return p;
+}
+
+/**
+ * remove_comment() - Truncate all chars following START_COMMENT in a string.
+ *
+ * @line: Supplies a policy line string for preprocessing.
+ */
+static void remove_comment(char *line)
+{
+	line = strchr(line, START_COMMENT);
+
+	if (line)
+		*line = '\0';
+}
+
+/**
+ * remove_trailing_spaces() - Truncate all trailing spaces in a string.
+ *
+ * @line: Supplies a policy line string for preprocessing.
+ *
+ * Return: The length of truncated string.
+ */
+static size_t remove_trailing_spaces(char *line)
+{
+	size_t i = 0;
+
+	i = strlen(line);
+	while (i > 0 && isspace(line[i - 1]))
+		i--;
+
+	line[i] = '\0';
+
+	return i;
+}
+
+/**
+ * parse_version() - Parse policy version.
+ * @ver: Supplies a version string to be parsed.
+ * @p: Supplies the partial parsed policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Version string is invalid
+ * * %-ERANGE	- Version number overflow
+ * * %-EINVAL	- Parsing error
+ */
+static int parse_version(char *ver, struct ipe_parsed_policy *p)
+{
+	u16 *const cv[] = { &p->version.major, &p->version.minor, &p->version.rev };
+	size_t sep_count = 0;
+	char *token;
+	int rc = 0;
+
+	while ((token = strsep(&ver, ".")) != NULL) {
+		/* prevent overflow */
+		if (sep_count >= ARRAY_SIZE(cv))
+			return -EBADMSG;
+
+		rc = kstrtou16(token, 10, cv[sep_count]);
+		if (rc)
+			return rc;
+
+		++sep_count;
+	}
+
+	/* prevent underflow */
+	if (sep_count != ARRAY_SIZE(cv))
+		return -EBADMSG;
+
+	return 0;
+}
+
+enum header_opt {
+	IPE_HEADER_POLICY_NAME = 0,
+	IPE_HEADER_POLICY_VERSION,
+	__IPE_HEADER_MAX
+};
+
+static const match_table_t header_tokens = {
+	{IPE_HEADER_POLICY_NAME,	"policy_name=%s"},
+	{IPE_HEADER_POLICY_VERSION,	"policy_version=%s"},
+	{__IPE_HEADER_MAX,		NULL}
+};
+
+/**
+ * parse_header() - Parse policy header information.
+ * @line: Supplies header line to be parsed.
+ * @p: Supplies the partial parsed policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Header string is invalid
+ * * %-ENOMEM	- Out of memory (OOM)
+ * * %-ERANGE	- Version number overflow
+ * * %-EINVAL	- Version parsing error
+ */
+static int parse_header(char *line, struct ipe_parsed_policy *p)
+{
+	substring_t args[MAX_OPT_ARGS];
+	char *t, *ver = NULL;
+	size_t idx = 0;
+	int rc = 0;
+
+	while ((t = strsep(&line, IPE_POLICY_DELIM)) != NULL) {
+		int token;
+
+		if (*t == '\0')
+			continue;
+		if (idx >= __IPE_HEADER_MAX) {
+			rc = -EBADMSG;
+			goto out;
+		}
+
+		token = match_token(t, header_tokens, args);
+		if (token != idx) {
+			rc = -EBADMSG;
+			goto out;
+		}
+
+		switch (token) {
+		case IPE_HEADER_POLICY_NAME:
+			p->name = match_strdup(&args[0]);
+			if (!p->name)
+				rc = -ENOMEM;
+			break;
+		case IPE_HEADER_POLICY_VERSION:
+			ver = match_strdup(&args[0]);
+			if (!ver) {
+				rc = -ENOMEM;
+				break;
+			}
+			rc = parse_version(ver, p);
+			break;
+		default:
+			rc = -EBADMSG;
+		}
+		if (rc)
+			goto out;
+		++idx;
+	}
+
+	if (idx != __IPE_HEADER_MAX)
+		rc = -EBADMSG;
+
+out:
+	kfree(ver);
+	return rc;
+}
+
+/**
+ * token_default() - Determine if the given token is "DEFAULT".
+ * @token: Supplies the token string to be compared.
+ *
+ * Return:
+ * * %false	- The token is not "DEFAULT"
+ * * %true	- The token is "DEFAULT"
+ */
+static bool token_default(char *token)
+{
+	return !strcmp(token, "DEFAULT");
+}
+
+/**
+ * free_rule() - Free the supplied ipe_rule struct.
+ * @r: Supplies the ipe_rule struct to be freed.
+ *
+ * Free an ipe_rule struct @r. Note @r must be removed from any lists before
+ * calling this function.
+ */
+static void free_rule(struct ipe_rule *r)
+{
+	struct ipe_prop *p, *t;
+
+	if (IS_ERR_OR_NULL(r))
+		return;
+
+	list_for_each_entry_safe(p, t, &r->props, next) {
+		list_del(&p->next);
+		kfree(p);
+	}
+
+	kfree(r);
+}
+
+static const match_table_t operation_tokens = {
+	{IPE_OP_EXEC,			"op=EXECUTE"},
+	{IPE_OP_FIRMWARE,		"op=FIRMWARE"},
+	{IPE_OP_KERNEL_MODULE,		"op=KMODULE"},
+	{IPE_OP_KEXEC_IMAGE,		"op=KEXEC_IMAGE"},
+	{IPE_OP_KEXEC_INITRAMFS,	"op=KEXEC_INITRAMFS"},
+	{IPE_OP_POLICY,			"op=POLICY"},
+	{IPE_OP_X509,			"op=X509_CERT"},
+	{IPE_OP_INVALID,		NULL}
+};
+
+/**
+ * parse_operation() - Parse the operation type given a token string.
+ * @t: Supplies the token string to be parsed.
+ *
+ * Return: The parsed operation type.
+ */
+static enum ipe_op_type parse_operation(char *t)
+{
+	substring_t args[MAX_OPT_ARGS];
+
+	return match_token(t, operation_tokens, args);
+}
+
+static const match_table_t action_tokens = {
+	{IPE_ACTION_ALLOW,	"action=ALLOW"},
+	{IPE_ACTION_DENY,	"action=DENY"},
+	{IPE_ACTION_INVALID,	NULL}
+};
+
+/**
+ * parse_action() - Parse the action type given a token string.
+ * @t: Supplies the token string to be parsed.
+ *
+ * Return: The parsed action type.
+ */
+static enum ipe_action_type parse_action(char *t)
+{
+	substring_t args[MAX_OPT_ARGS];
+
+	return match_token(t, action_tokens, args);
+}
+
+/**
+ * parse_property() - Parse a rule property given a token string.
+ * @t: Supplies the token string to be parsed.
+ * @r: Supplies the ipe_rule the parsed property will be associated with.
+ *
+ * This is a placeholder. The actual function will be introduced in
+ * later commits.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-ENOMEM	- Out of memory (OOM)
+ * * %-EBADMSG	- The supplied token cannot be parsed
+ */
+static int parse_property(char *t, struct ipe_rule *r)
+{
+	return -EBADMSG;
+}
+
+/**
+ * parse_rule() - parse a policy rule line.
+ * @line: Supplies rule line to be parsed.
+ * @p: Supplies the partial parsed policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-ENOMEM	- Out of memory (OOM)
+ * * %-EBADMSG	- Policy syntax error
+ */
+static int parse_rule(char *line, struct ipe_parsed_policy *p)
+{
+	enum ipe_action_type action = IPE_ACTION_INVALID;
+	enum ipe_op_type op = IPE_OP_INVALID;
+	bool is_default_rule = false;
+	struct ipe_rule *r = NULL;
+	bool first_token = true;
+	bool op_parsed = false;
+	int rc = 0;
+	char *t;
+
+	r = kzalloc(sizeof(*r), GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&r->next);
+	INIT_LIST_HEAD(&r->props);
+
+	while (t = strsep(&line, IPE_POLICY_DELIM), line) {
+		if (*t == '\0')
+			continue;
+		if (first_token && token_default(t)) {
+			is_default_rule = true;
+		} else {
+			if (!op_parsed) {
+				op = parse_operation(t);
+				if (op == IPE_OP_INVALID)
+					rc = -EBADMSG;
+				else
+					op_parsed = true;
+			} else {
+				rc = parse_property(t, r);
+			}
+		}
+
+		if (rc)
+			goto err;
+		first_token = false;
+	}
+
+	action = parse_action(t);
+	if (action == IPE_ACTION_INVALID) {
+		rc = -EBADMSG;
+		goto err;
+	}
+
+	if (is_default_rule) {
+		if (!list_empty(&r->props)) {
+			rc = -EBADMSG;
+		} else if (op == IPE_OP_INVALID) {
+			if (p->global_default_action != IPE_ACTION_INVALID)
+				rc = -EBADMSG;
+			else
+				p->global_default_action = action;
+		} else {
+			if (p->rules[op].default_action != IPE_ACTION_INVALID)
+				rc = -EBADMSG;
+			else
+				p->rules[op].default_action = action;
+		}
+	} else if (op != IPE_OP_INVALID && action != IPE_ACTION_INVALID) {
+		r->op = op;
+		r->action = action;
+	} else {
+		rc = -EBADMSG;
+	}
+
+	if (rc)
+		goto err;
+	if (!is_default_rule)
+		list_add_tail(&r->next, &p->rules[op].rules);
+	else
+		free_rule(r);
+
+	return rc;
+err:
+	free_rule(r);
+	return rc;
+}
+
+/**
+ * ipe_free_parsed_policy() - free a parsed policy structure.
+ * @p: Supplies the parsed policy.
+ */
+void ipe_free_parsed_policy(struct ipe_parsed_policy *p)
+{
+	struct ipe_rule *pp, *t;
+	size_t i = 0;
+
+	if (IS_ERR_OR_NULL(p))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(p->rules); ++i)
+		list_for_each_entry_safe(pp, t, &p->rules[i].rules, next) {
+			list_del(&pp->next);
+			free_rule(pp);
+		}
+
+	kfree(p->name);
+	kfree(p);
+}
+
+/**
+ * validate_policy() - validate a parsed policy.
+ * @p: Supplies the fully parsed policy.
+ *
+ * Given a policy structure that was just parsed, validate that all
+ * operations have their default rules or a global default rule is set.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Policy is invalid
+ */
+static int validate_policy(const struct ipe_parsed_policy *p)
+{
+	size_t i = 0;
+
+	if (p->global_default_action != IPE_ACTION_INVALID)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(p->rules); ++i) {
+		if (p->rules[i].default_action == IPE_ACTION_INVALID)
+			return -EBADMSG;
+	}
+
+	return 0;
+}
+
+/**
+ * ipe_parse_policy() - Given a string, parse the string into an IPE policy.
+ * @p: partially filled ipe_policy structure to populate with the result.
+ *     it must have text and textlen set.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Policy is invalid
+ * * %-ENOMEM	- Out of Memory
+ * * %-ERANGE	- Policy version number overflow
+ * * %-EINVAL	- Policy version parsing error
+ */
+int ipe_parse_policy(struct ipe_policy *p)
+{
+	struct ipe_parsed_policy *pp = NULL;
+	char *policy = NULL, *dup = NULL;
+	bool header_parsed = false;
+	char *line = NULL;
+	size_t len;
+	int rc = 0;
+
+	if (!p->textlen)
+		return -EBADMSG;
+
+	policy = kmemdup_nul(p->text, p->textlen, GFP_KERNEL);
+	if (!policy)
+		return -ENOMEM;
+	dup = policy;
+
+	pp = new_parsed_policy();
+	if (IS_ERR(pp)) {
+		rc = PTR_ERR(pp);
+		goto out;
+	}
+
+	while ((line = strsep(&policy, IPE_LINE_DELIM)) != NULL) {
+		remove_comment(line);
+		len = remove_trailing_spaces(line);
+		if (!len)
+			continue;
+
+		if (!header_parsed) {
+			rc = parse_header(line, pp);
+			if (rc)
+				goto err;
+			header_parsed = true;
+		} else {
+			rc = parse_rule(line, pp);
+			if (rc)
+				goto err;
+		}
+	}
+
+	if (!header_parsed || validate_policy(pp)) {
+		rc = -EBADMSG;
+		goto err;
+	}
+
+	p->parsed = pp;
+
+out:
+	kfree(dup);
+	return rc;
+err:
+	ipe_free_parsed_policy(pp);
+	goto out;
+}
diff --git a/security/ipe/policy_parser.h b/security/ipe/policy_parser.h
new file mode 100644
index 000000000000..62b6209019a2
--- /dev/null
+++ b/security/ipe/policy_parser.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#ifndef _IPE_POLICY_PARSER_H
+#define _IPE_POLICY_PARSER_H
+
+int ipe_parse_policy(struct ipe_policy *p);
+void ipe_free_parsed_policy(struct ipe_parsed_policy *p);
+
+#endif /* _IPE_POLICY_PARSER_H */
-- 
2.44.0



* [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE)
@ 2024-05-03 22:32 24% Fan Wu
  2024-05-03 22:32 47% ` [PATCH v18 01/21] security: add ipe lsm Fan Wu
                   ` (20 more replies)
  0 siblings, 21 replies; 200+ results
From: Fan Wu @ 2024-05-03 22:32 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

IPE is a Linux Security Module that takes a complementary approach to
access control. Unlike traditional access control mechanisms that rely on
labels and paths for decision-making, IPE focuses on the immutable security
properties inherent to system components. These properties are fundamental
attributes or features of a system component that cannot be altered,
ensuring a consistent and reliable basis for security decisions.

To elaborate, in the context of IPE, system components primarily refer to
files or the devices these files reside on. However, this is just a
starting point. The concept of system components is flexible and can be
extended to include new elements as the system evolves. The immutable
properties include the origin of a file, which remains constant and
unchangeable over time. For example, IPE policies can be crafted to trust
files originating from the initramfs. Since initramfs is typically verified
by the bootloader, its files are deemed trustworthy; "file is from
initramfs" becomes an immutable property under IPE's consideration.

The immutable property concept extends to the security features enabled on
a file's origin, such as dm-verity or fs-verity, which provide a layer of
integrity and trust. For example, IPE allows the definition of policies
that trust files from a dm-verity protected device. dm-verity ensures the
integrity of an entire device by providing a verifiable and immutable state
of its contents. Similarly, fs-verity offers filesystem-level integrity
checks, allowing IPE to enforce policies that trust files protected by
fs-verity. These two features cannot be turned off once established, so
they are considered immutable properties. These examples demonstrate how
IPE leverages immutable properties, such as a file's origin and its
integrity protection mechanisms, to make access control decisions.

For the IPE policy, specifically, it grants the ability to enforce
stringent access controls by assessing security properties against
reference values defined within the policy. This assessment can be based on
the existence of a security property (e.g., verifying if a file originates
from initramfs) or evaluating the internal state of an immutable security
property. The latter includes checking the roothash of a dm-verity
protected device, determining whether dm-verity possesses a valid
signature, assessing the digest of a fs-verity protected file, or
determining whether fs-verity possesses a valid built-in signature. This
nuanced approach to policy enforcement enables a highly secure and
customizable system defense mechanism, tailored to specific security
requirements and trust models.

IPE is compiled under CONFIG_SECURITY_IPE.

Use Cases
---------

IPE works best in fixed-function devices: devices whose purpose is
clearly defined and not supposed to be changed (e.g. a network firewall
device in a data center, an IoT device, etc.), where all software and
configuration is built and provisioned by the system owner.

IPE is a long way off from use in general-purpose computing: the Linux
community as a whole tends to follow a decentralized trust model,
known as the web of trust, which IPE does not yet support.
There are exceptions, such as the case where a Linux distribution
vendor trusts only their own keys, where IPE can successfully be used
to enforce the trust requirement.

Additionally, while most packages are signed today, the files inside
the packages (for instance, the executables), tend to be unsigned. This
makes it difficult to utilize IPE in systems where a package manager is
expected to be functional, without major changes to the package manager
and ecosystem behind it.

The digest_cache LSM[1] is a system that, when combined with IPE, could be
used to enable general-purpose computing scenarios.

Policy
-------

IPE policy is a plain-text policy composed of multiple statements
over several lines. There is one required line, at the top of the
policy, indicating the policy name, and the policy version, for
instance:

  policy_name=Ex_Policy policy_version=0.0.0

The policy version indicates the current version of the policy. This is
used to prevent roll-back of policy to potentially insecure previous
versions of the policy.

The next portion of an IPE policy is the rules. Rules are formed by
key=value pairs, known as properties. IPE rules require two keys: "action",
which determines what IPE does when it encounters a match against the
policy, and "op", which determines when that rule should be evaluated.

Thus, a minimal rule is:

  op=EXECUTE action=ALLOW

This example rule will allow any execution. A rule is required to have the
"op" property as the first token of a rule, and the "action" as the last
token of the rule.
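
As a rough illustration of the ordering requirement above, the following
userspace sketch checks only the shape of a non-DEFAULT rule line. It is not
the in-kernel parser; the helper name rule_shape_ok() and the fixed 256-byte
buffer are assumptions made for this example, and property tokens between
"op" and "action" are not validated at all.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: returns true when the first
 * whitespace-separated token of a rule starts with "op=" and the last
 * token starts with "action=". DEFAULT lines are out of scope here.
 */
bool rule_shape_ok(const char *rule)
{
	char buf[256];
	char *tok, *first = NULL, *last = NULL;

	/* Work on a private copy because strtok() modifies its input. */
	if (strlen(rule) >= sizeof(buf))
		return false;
	strcpy(buf, rule);

	for (tok = strtok(buf, " \t"); tok; tok = strtok(NULL, " \t")) {
		if (!first)
			first = tok;
		last = tok;
	}

	/* A rule needs at least two distinct tokens: op first, action last. */
	return first && first != last &&
	       strncmp(first, "op=", 3) == 0 &&
	       strncmp(last, "action=", 7) == 0;
}
```

With this sketch, "op=EXECUTE action=ALLOW" is accepted, while
"action=ALLOW op=EXECUTE" and a lone "op=EXECUTE" are rejected.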

Additional properties are used to assess immutable security properties
about the files being evaluated. These properties are intended to be
deterministic attributes that are resident in the kernel.

For example:

  op=EXECUTE dmverity_signature=FALSE action=DENY

This rule, using the dmverity_signature property, will deny execution of
any file that does not come from a signed dm-verity volume.

All available properties for IPE are described in the documentation patch
of this series.

Rules are evaluated top-to-bottom. As a result, any revocation rules,
or denies should be placed early in the file to ensure that these rules
are evaluated before a rule with "action=ALLOW" is hit.
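
To show why ordering matters, here is a simplified first-match loop. It is
a sketch under stated assumptions, not IPE's evaluation code: the types,
the bit-flag properties (PROP_SIGNED, PROP_REVOKED) and the function
evaluate() are all hypothetical and exist only for this example.

```c
#include <stddef.h>

enum action { ACTION_ALLOW, ACTION_DENY };

/* Hypothetical properties a file may carry, modelled as bit flags. */
enum prop { PROP_SIGNED = 1 << 0, PROP_REVOKED = 1 << 1 };

/* Hypothetical rule: matches when the file carries the given property. */
struct rule {
	unsigned int prop;
	enum action action;
};

/*
 * Walk the rules top-to-bottom and return the action of the first rule
 * that matches; fall back to the default action when nothing matches.
 */
enum action evaluate(const struct rule *rules, size_t n,
		     unsigned int file_props, enum action default_action)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (file_props & rules[i].prop)
			return rules[i].action;
	}
	return default_action;
}
```

In this model, a file that is both signed and revoked is denied only when
the deny rule is listed before the allow rule; with the order reversed, the
allow rule matches first and the revocation is never consulted.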

Any unknown syntax in an IPE policy will result in a fatal error when
parsing the policy.

Additionally, a DEFAULT operation must be set for all understood
operations within IPE. For policies to remain completely forwards
compatible, it is recommended that users add a "DEFAULT action=ALLOW"
and override the defaults on a per-operation basis.

For more information about the policy syntax, see the kernel
documentation page.

Early Usermode Protection
--------------------------

IPE can be provided with a policy at startup to load and enforce.
This is intended to be a minimal policy to get the system to a state
where userspace is set up and ready to receive commands, at which
point a policy can be deployed via securityfs. This "boot policy" can be
specified via the config, SECURITY_IPE_BOOT_POLICY, which accepts a path
to a plain-text version of the IPE policy to apply. This policy will be
compiled into the kernel. If not specified, IPE will be disabled until a
policy is deployed and activated through the method above.

Policy Examples
----------------

Allow all:

  policy_name=Allow_All policy_version=0.0.0
  DEFAULT action=ALLOW

Allow only initramfs:

  policy_name=Allow_Initramfs policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW

Allow any signed and validated dm-verity volume and the initramfs:

  policy_name=Allow_Signed_DMV_And_Initramfs policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Prohibit execution from a specific dm-verity volume, while allowing
all signed volumes and the initramfs:

  policy_name=Deny_DMV_By_Roothash policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=sha256:cd2c5bae7c6c579edaae4353049d58eb5f2e8be0244bf05345bc8e5ed257baff action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Allow only a specific dm-verity volume:

  policy_name=Allow_DMV_By_Roothash policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=sha256:401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW

Allow any fs-verity file with a valid built-in signature:

  policy_name=Allow_Signed_And_Validated_FSVerity policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE fsverity_signature=TRUE action=ALLOW

Allow execution of a specific fs-verity file:

  policy_name=ALLOW_FSV_By_Digest policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE fsverity_digest=sha256:fd88f2b8824e197f850bf4c5109bea5cf0ee38104f710843bb72da796ba5af9e action=ALLOW

Deploying Policies
-------------------

First, sign a plain-text policy with a certificate that is present in
the SYSTEM_TRUSTED_KEYRING of your test machine. With openssl, the
signing can be done via:

  openssl smime -sign -in "$MY_POLICY" -signer "$MY_CERTIFICATE" \
    -inkey "$MY_PRIVATE_KEY" -outform der -noattr -nodetach \
    -out "$MY_POLICY.p7s"

Then, simply cat the file into IPE's "new_policy" securityfs node:

  cat "$MY_POLICY.p7s" > /sys/kernel/security/ipe/new_policy

The policy should now be present under the policies/ subdirectory, under
its "policy_name" attribute.

The policy is now present in the kernel and can be marked as active,
via the securityfs node:

  echo 1 > "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/active"

This will now mark the policy as active and the system will be enforcing
$MY_POLICY_NAME.

There is one requirement when marking a policy as active: its
policy_version attribute must be greater than or equal to that of the
currently running policy.
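
The version requirement above can be sketched as a simple comparison. This
is an illustrative userspace example, not the kernel implementation: the
struct layout, the function names, and the assumption that versions are
ordered by major, then minor, then revision are all made up for this
sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical representation of "policy_version=major.minor.rev". */
struct policy_version {
	unsigned int major, minor, rev;
};

/*
 * Parse "major.minor.rev". Returns false on malformed input, including
 * any trailing characters after the revision number.
 */
bool parse_version(const char *s, struct policy_version *v)
{
	char trailing;

	return sscanf(s, "%u.%u.%u%c", &v->major, &v->minor, &v->rev,
		      &trailing) == 3;
}

/* True when activating 'next' over 'cur' would not be a roll-back. */
bool version_allows_activation(const struct policy_version *cur,
			       const struct policy_version *next)
{
	if (next->major != cur->major)
		return next->major > cur->major;
	if (next->minor != cur->minor)
		return next->minor > cur->minor;
	return next->rev >= cur->rev;
}
```

Under these assumptions, moving from 1.2.3 to 1.3.0 is allowed, staying at
1.2.3 is allowed, and moving from 1.3.0 back to 1.2.3 is refused.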

Policies can be updated via:

  cat "$MY_UPDATED_POLICY.p7s" > \
    "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/update"

Additionally, policies can be deleted via the "delete" securityfs
node. Simply write "1" to the corresponding node in the policy folder:

  echo 1 > "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/delete"

There is only one requirement for deleting a policy: the policy being
deleted must not be the active policy.

NOTE: Any securityfs write to IPE's nodes will require CAP_MAC_ADMIN.

Integrations
-------------

This patch series adds support for fsverity via digest and signature
(fsverity_signature and fsverity_digest), dm-verity by digest and
signature (dmverity_signature and dmverity_roothash), and trust for
the initramfs (boot_verified).

Please see the documentation patch for more information about the
integrations available.

Testing
--------

KUnit Tests are available. Recommended kunitconfig:

    CONFIG_KUNIT=y
    CONFIG_SECURITY=y
    CONFIG_SECURITYFS=y
    CONFIG_PKCS7_MESSAGE_PARSER=y
    CONFIG_SYSTEM_DATA_VERIFICATION=y
    CONFIG_FS_VERITY=y
    CONFIG_FS_VERITY_BUILTIN_SIGNATURES=y
    CONFIG_BLOCK=y
    CONFIG_MD=y
    CONFIG_BLK_DEV_DM=y
    CONFIG_DM_VERITY=y
    CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
    CONFIG_NET=y
    CONFIG_AUDIT=y
    CONFIG_AUDITSYSCALL=y
    CONFIG_BLK_DEV_INITRD=y

    CONFIG_SECURITY_IPE=y
    CONFIG_IPE_PROP_DM_VERITY=y
    CONFIG_IPE_PROP_DM_VERITY_SIGNATURE=y
    CONFIG_IPE_PROP_FS_VERITY=y
    CONFIG_IPE_PROP_FS_VERITY_BUILTIN_SIG=y
    CONFIG_SECURITY_IPE_KUNIT_TEST=y

Simply run:

    make ARCH=um mrproper
    ./tools/testing/kunit/kunit.py run --kunitconfig <path/to/config>

And the tests will execute and report the result.

In addition, IPE has a Python-based integration test suite at
https://github.com/microsoft/ipe/tree/test-suite that can test both the
user interfaces and the enforcement functionality.

Documentation
--------------

Documentation is available both on GitHub at
https://microsoft.github.io/ipe and in this patch series, to be added
in-tree.

Known Gaps
-----------

IPE has two known gaps:

1. IPE cannot verify the integrity of anonymous executable memory, such as
  the trampolines created by gcc closures and libffi (<3.4.2), or JIT'd code.
  Unfortunately, as this is dynamically generated code, there is no way
  for IPE to ensure the integrity of this code to form a trust basis. In all
  cases, the result for these operations will be whatever the admin
  configures as the DEFAULT action for "EXECUTE".

2. IPE cannot verify the integrity of interpreted languages' programs when
  these scripts are invoked via ``<interpreter> <file>``. This is because of
  the way interpreters execute these files: the scripts themselves are not
  evaluated as executable code through one of IPE's hooks. Interpreters
  can be enlightened to the usage of IPE by trying to mmap a file into
  executable memory (+X), after opening the file and responding to the
  error code appropriately. This also applies to included files, or high
  value files, such as configuration files of critical system components.
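
As an illustration of the interpreter "enlightenment" described in gap 2,
the sketch below opens a script and tries to map it with PROT_EXEC before
interpreting it. It uses only standard POSIX calls; the function name and
the choice to treat an empty file as acceptable are assumptions made for
this example, and how an interpreter should react to a failure is left to
its authors.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 when 'path' could be mapped executable,
 * or a negative errno value otherwise. An interpreter could call this
 * before running a script and refuse to proceed on failure.
 */
int check_script_executable_mapping(const char *path)
{
	struct stat st;
	void *map;
	int fd, rc = 0;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (fstat(fd, &st) < 0) {
		rc = -errno;
		goto out;
	}
	if (st.st_size == 0)
		goto out;	/* mmap() rejects a zero length; nothing to check */

	/* The executable mapping is what gives an LSM a chance to deny. */
	map = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
		   MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		rc = -errno;
		goto out;
	}
	munmap(map, (size_t)st.st_size);
out:
	close(fd);
	return rc;
}
```

The mapping is discarded immediately; it exists only so that the kernel
evaluates the file as executable content on the interpreter's behalf.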

Appendix
---------

A. IPE Github Repository: https://github.com/microsoft/ipe
B. IPE Users' Guide: Documentation/admin-guide/LSM/ipe.rst

References
-----------

1: https://lore.kernel.org/lkml/20240415142436.2545003-1-roberto.sassu@huaweicloud.com/

FAQ:
----

Q: What is the difference between IMA and IPE?

A: See the documentation patch for more on this topic.

Previous Postings
-----------------

v1: https://lore.kernel.org/all/20200406181045.1024164-1-deven.desai@linux.microsoft.com/
v2: https://lore.kernel.org/all/20200406221439.1469862-1-deven.desai@linux.microsoft.com/
v3: https://lore.kernel.org/all/20200415162550.2324-1-deven.desai@linux.microsoft.com/
v4: https://lore.kernel.org/all/20200717230941.1190744-1-deven.desai@linux.microsoft.com/
v5: https://lore.kernel.org/all/20200728213614.586312-1-deven.desai@linux.microsoft.com/
v6: https://lore.kernel.org/all/20200730003113.2561644-1-deven.desai@linux.microsoft.com/
v7: https://lore.kernel.org/all/1634151995-16266-1-git-send-email-deven.desai@linux.microsoft.com/
v8: https://lore.kernel.org/all/1654714889-26728-1-git-send-email-deven.desai@linux.microsoft.com/
v9: https://lore.kernel.org/lkml/1675119451-23180-1-git-send-email-wufan@linux.microsoft.com/
v10: https://lore.kernel.org/lkml/1687986571-16823-1-git-send-email-wufan@linux.microsoft.com/
v11: https://lore.kernel.org/lkml/1696457386-3010-1-git-send-email-wufan@linux.microsoft.com/
v12: https://lore.kernel.org/lkml/1706654228-17180-1-git-send-email-wufan@linux.microsoft.com/
v13: https://lore.kernel.org/lkml/1709168102-7677-1-git-send-email-wufan@linux.microsoft.com/
v14: https://lore.kernel.org/lkml/1709768084-22539-1-git-send-email-wufan@linux.microsoft.com/
v15: https://lore.kernel.org/lkml/1710560151-28904-1-git-send-email-wufan@linux.microsoft.com/
v16: https://lore.kernel.org/lkml/1711657047-10526-1-git-send-email-wufan@linux.microsoft.com/
v17: https://lore.kernel.org/lkml/1712969764-31039-1-git-send-email-wufan@linux.microsoft.com/

Changelog
----------

v2:
  Split the second patch of the previous series into two.
  Minor corrections in the cover-letter and documentation
  comments regarding CAP_MAC_ADMIN checks in IPE.

v3:
  Address various comments by Jann Horn. Highlights:
    Switch various audit allocators to GFP_KERNEL.
    Utilize rcu_access_pointer() in various locations.
    Strip out the caching system for properties
    Strip comments from headers
    Move functions around in patches
    Remove kernel command line parameters
    Reconcile the race condition on the delete node for policy by
      expanding the policy critical section.

  Address a few comments by Jonathan Corbet around the documentation
    pages for IPE.

  Fix an issue with the initialization of IPE policy with a "-0"
    version, caused by not initializing the hlist entries before
    freeing.

v4:
  Address a concern around IPE's behavior with unknown syntax.
    Specifically, make any unknown syntax a fatal error instead of a
    warning, as suggested by Mickaël Salaün.
  Introduce a new securityfs node, $securityfs/ipe/property_config,
    which provides a listing of what properties are enabled by the
    kernel and their versions. This allows usermode to predict what
    policies should be allowed.
  Strip some comments from c files that I missed.
  Clarify some documentation comments around 'boot_verified'.
    While this currently does not functionally change the property
    itself, the distinction is important when IPE can enforce verified
    reads. Additionally, 'KERNEL_READ' was omitted from the documentation.
    This has been corrected.
  Change SecurityFS and SHA1 to a reverse dependency.
  Update the cover-letter with the updated behavior of unknown syntax.
  Remove all sysctls, making an equivalent function in securityfs.
  Rework the active/delete mechanism to be a node under the policy in
    $securityfs/ipe/policies.
  The kernel command line parameters ipe.enforce and ipe.success_audit
    have returned as this functionality is no longer exposed through
    sysfs.

v5:
  Correct some grammatical errors reported by Randy Dunlap.
  Fix some warnings reported by kernel test bot.
  Change convention around security_bdev_setsecurity. -ENOSYS
    is now expected if an LSM does not implement a particular @name,
    as suggested by Casey Schaufler.
  Minor string corrections related to the move from sysfs to securityfs
  Correct a spelling of an #ifdef for the permissive argument.
  Add the kernel parameters re-added to the documentation.
  Fix a minor bug where the mode being audited on permissive switch
    was the original mode, not the mode being swapped to.
  Cleanup doc comments, fix some whitespace alignment issues.

v6:
  Change if statement condition in security_bdev_setsecurity to be
    more concise, as suggested by Casey Schaufler and Al Viro
  Drop the 6th patch in the series, "dm-verity move signature check..."
    due to numerous issues, and it ultimately providing no real value.
  Fix the patch tree - the previous iteration appears to have been in a
    torn state (patches 8+9 were merged). This has since been corrected.

v7:
  * Reword cover letter to more accurately convey IPE's purpose
    and latest updates.
  * Refactor series to:
      1. Support a context structure, enabling:
          1. Easier Testing via KUNIT
          2. A better architecture for future designs
      2. Make parser code cleaner
  * Move patch 01/12 to [14/16] of the series
  * Split up patch 02/12 into five parts:
      1. context creation [01/16]
      2. audit [07/16]
      3. evaluation loop [03/16]
      4. access control hooks [05/16]
      5. permissive mode [08/16]
  * Split up patch 03/12 into two parts:
      1. parser [02/16]
      2. userspace interface [04/16]
  * Reword and refactor patch 04/12 to [09/16]
  * Squash patch 05/12, 07/12, 09/12 to [10/16]
  * Squash patch 08/12, 10/12 to [11/16]
  * Change audit records to MAC region (14XX) from Integrity region (18XX)
  * Add FSVerity Support
  * Interface changes:
      1. "raw" was renamed to "pkcs7" and made read only
      2. "raw"'s write functionality (update a policy) moved to "update"
      3. introduced "version", "policy_name" nodes.
      4. "content" renamed to "policy"
      5. The boot policy can now be updated like any other policy.
  * Add additional developer-level documentation
  * Update admin-guide docs to reflect changes.
  * Kunit tests
  * Dropped CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH - functionality can
    easily come later with a small patch.
  * Use partition0 for block_device for dm-verity patch

v8:
  * Add changelog information to individual commits
  * A large number of changes to the audit patch.
  * split fs/ & security/ changes to two separate patches.
  * split block/, security/ & drivers/md/ changes to separate patches.
  * Add some historical context to what led to the creation of IPE
    in the documentation patch.
  * Cover-letter changes suggested by Roberto Sassu.

v9:
  * Rewrite IPE parser to use kernel match_table parser.
  * Adapt existing IPE properties to the new parser.
  * Remove ipe_context, quote policy syntax, kernel_read for simplicity.
  * Add new function in the security file system to delete IPE policy.
  * Make IPE audit builtin and change several audit formats.
  * Make boot_verified property builtin

v10:
  * Address various code style/format issues
  * Correct the rcu locking for active policy
  * Fix memleak bugs in the parser, optimize the parser per upstream feedback
  * Adding new audit events for IPE and update audit formats
  * Make the dmverity property auto selected
  * Adding more context in the commit messages

v11:
  * Address various code style/format issues
  * Add finalize hook to device mapper
  * move the security hook for dm-verity to the new device mapper finalize hook

v12:
  * Address locking issues
  * Change the implementation of boot_verified to trust initramfs only
  * Update audit format for IPE decision events
  * Refactor code for lsm_id
  * Add IPE test suite link

v13:
  * Rename the new security hook in initramfs
  * Make the policy grammar independent of kernel config
  * Correct IPE audit format
  * Refactor policy update code

v14:
  * Add more code comments/docs for dmverity/fsverity
  * Fix incorrect code usage and format in dmverity
  * Drop one accepted commit of dmverity

v15:
  * Fix grammar issues
  * Add more documentation to fsverity
  * Switch security hooks from *_setsecurity() to *_setintegrity()
  * Cleanup unnecessary headers

v16:
  * Fix format issues, refactor names
  * Further improve documentation for fsverity
  * Fix bugs in dmverity implementation
  * Switch to use call_int_hook() for *_setintegrity()

v17:
  * Fix various code/Documentation style issues
  * Switch to use reverse christmas tree style
  * add ipe_ prefix to all non-static functions
  * Correct documentation for fsverity
  * Rewrite design concept part of IPE Documentation
  * Fix incorrect interface path in IPE Documentation

v18:
  * Add two new kconfigs and make them auto-selected
  * Fix incorrect error handling and switch to use crypto_ahash_alg_name() in dmverity
  * Move the inode_setintegrity hook call into fsverity_verify_signature() in fsverity
  * Fix typos and cleanup unnecessary code
  * Improve policy examples inside documentation
  * Remove insecure hash algorithms and adapt the documentation accordingly
  * Update the documentation regarding the new Kconfig switches

Deven Bowers (13):
  security: add ipe lsm
  ipe: add policy parser
  ipe: add evaluation loop
  ipe: add LSM hooks on execution and kernel read
  ipe: add userspace interface
  uapi|audit|ipe: add ipe auditing support
  ipe: add permissive toggle
  block,lsm: add LSM blob and new LSM hooks for block device
  dm verity: expose root hash digest and signature data to LSMs
  ipe: add support for dm-verity as a trust provider
  scripts: add boot policy generation program
  ipe: kunit test for parser
  Documentation: add ipe documentation

Fan Wu (8):
  initramfs|security: Add a security hook to do_populate_rootfs()
  ipe: introduce 'boot_verified' as a trust provider
  security: add new securityfs delete function
  dm: add finalize hook to target_type
  security: add security_inode_setintegrity() hook
  fsverity: expose verified fsverity built-in signatures to LSMs
  ipe: enable support for fs-verity as a trust provider
  MAINTAINERS: ipe: add ipe maintainer information

 Documentation/admin-guide/LSM/index.rst       |   1 +
 Documentation/admin-guide/LSM/ipe.rst         | 792 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  12 +
 Documentation/filesystems/fsverity.rst        |  26 +-
 Documentation/security/index.rst              |   1 +
 Documentation/security/ipe.rst                | 446 ++++++++++
 MAINTAINERS                                   |  10 +
 block/bdev.c                                  |   7 +
 drivers/md/dm-verity-target.c                 | 100 +++
 drivers/md/dm-verity.h                        |   6 +
 drivers/md/dm.c                               |  12 +
 fs/verity/signature.c                         |  21 +-
 include/linux/blk_types.h                     |   3 +
 include/linux/device-mapper.h                 |   9 +
 include/linux/lsm_hook_defs.h                 |   9 +
 include/linux/lsm_hooks.h                     |   1 +
 include/linux/security.h                      |  53 ++
 include/uapi/linux/audit.h                    |   3 +
 include/uapi/linux/lsm.h                      |   1 +
 init/initramfs.c                              |   3 +
 scripts/Makefile                              |   1 +
 scripts/ipe/Makefile                          |   2 +
 scripts/ipe/polgen/.gitignore                 |   2 +
 scripts/ipe/polgen/Makefile                   |   5 +
 scripts/ipe/polgen/polgen.c                   | 145 ++++
 security/Kconfig                              |  11 +-
 security/Makefile                             |   1 +
 security/inode.c                              |  25 +
 security/ipe/.gitignore                       |   2 +
 security/ipe/Kconfig                          |  96 +++
 security/ipe/Makefile                         |  31 +
 security/ipe/audit.c                          | 279 ++++++
 security/ipe/audit.h                          |  19 +
 security/ipe/digest.c                         | 118 +++
 security/ipe/digest.h                         |  26 +
 security/ipe/eval.c                           | 394 +++++++++
 security/ipe/eval.h                           |  70 ++
 security/ipe/fs.c                             | 247 ++++++
 security/ipe/fs.h                             |  16 +
 security/ipe/hooks.c                          | 312 +++++++
 security/ipe/hooks.h                          |  52 ++
 security/ipe/ipe.c                            |  99 +++
 security/ipe/ipe.h                            |  26 +
 security/ipe/policy.c                         | 229 +++++
 security/ipe/policy.h                         |  98 +++
 security/ipe/policy_fs.c                      | 470 +++++++++++
 security/ipe/policy_parser.c                  | 556 ++++++++++++
 security/ipe/policy_parser.h                  |  11 +
 security/ipe/policy_tests.c                   | 296 +++++++
 security/security.c                           | 122 ++-
 .../selftests/lsm/lsm_list_modules_test.c     |   3 +
 51 files changed, 5271 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/admin-guide/LSM/ipe.rst
 create mode 100644 Documentation/security/ipe.rst
 create mode 100644 scripts/ipe/Makefile
 create mode 100644 scripts/ipe/polgen/.gitignore
 create mode 100644 scripts/ipe/polgen/Makefile
 create mode 100644 scripts/ipe/polgen/polgen.c
 create mode 100644 security/ipe/.gitignore
 create mode 100644 security/ipe/Kconfig
 create mode 100644 security/ipe/Makefile
 create mode 100644 security/ipe/audit.c
 create mode 100644 security/ipe/audit.h
 create mode 100644 security/ipe/digest.c
 create mode 100644 security/ipe/digest.h
 create mode 100644 security/ipe/eval.c
 create mode 100644 security/ipe/eval.h
 create mode 100644 security/ipe/fs.c
 create mode 100644 security/ipe/fs.h
 create mode 100644 security/ipe/hooks.c
 create mode 100644 security/ipe/hooks.h
 create mode 100644 security/ipe/ipe.c
 create mode 100644 security/ipe/ipe.h
 create mode 100644 security/ipe/policy.c
 create mode 100644 security/ipe/policy.h
 create mode 100644 security/ipe/policy_fs.c
 create mode 100644 security/ipe/policy_parser.c
 create mode 100644 security/ipe/policy_parser.h
 create mode 100644 security/ipe/policy_tests.c

--
2.44.0


^ permalink raw reply	[relevance 24%]

* Re: [PATCH v2 03/12] drm/i915: Make I2C terminology more inclusive
  @ 2024-05-03 21:04 72%     ` Easwar Hariharan
  0 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 21:04 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, Zhenyu Wang, Zhi Wang,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL GVT-g DRIVERS (Intel GPU Virtualization),
	Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Zhi Wang

On 5/3/2024 12:34 PM, Rodrigo Vivi wrote:
> On Fri, May 03, 2024 at 06:13:24PM +0000, Easwar Hariharan wrote:
>> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
>> with more appropriate terms. Inspired by and following on to Wolfram's
>> series to fix drivers/i2c/[1], fix the terminology for users of
>> I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
>> in the specification.
>>
>> Compile tested, no functionality changes intended
>>
>> [1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
>>
>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> It looks like the ack is not needed since we are merging this through
> drm-intel-next. But I'm planning to merge this only after seeing the
> main drivers/i2c accept the new terminology. That way we don't risk
> the names getting pushback and changing there, and us having
> to rename everything once again.

Just to be explicit, did you want me to remove the Acked-by in v3, or will you when you pull
the patch into drm-intel-next?

> 
> (more below)
> 
>> Acked-by: Zhi Wang <zhiwang@kernel.org>
>> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
> 
> Cc: Jani Nikula <jani.nikula@intel.com>
> 
> Jani, which bits were you concerned were not necessarily i2c?
> Although they are not necessarily/directly i2c, I believe they
> are related and could benefit from the massive single-shot rename.
> Or do you have any better split to suggest here?
> 
> (more below)
> 
>> ---
>>  drivers/gpu/drm/i915/display/dvo_ch7017.c     | 14 ++++-----
>>  drivers/gpu/drm/i915/display/dvo_ch7xxx.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/dvo_ivch.c       | 16 +++++-----
>>  drivers/gpu/drm/i915/display/dvo_ns2501.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/dvo_sil164.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/dvo_tfp410.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/intel_bios.c     | 22 +++++++-------
>>  drivers/gpu/drm/i915/display/intel_ddi.c      |  2 +-
>>  .../gpu/drm/i915/display/intel_display_core.h |  2 +-
>>  drivers/gpu/drm/i915/display/intel_dsi.h      |  2 +-
>>  drivers/gpu/drm/i915/display/intel_dsi_vbt.c  | 20 ++++++-------
>>  drivers/gpu/drm/i915/display/intel_dvo.c      | 14 ++++-----
>>  drivers/gpu/drm/i915/display/intel_dvo_dev.h  |  2 +-
>>  drivers/gpu/drm/i915/display/intel_gmbus.c    |  4 +--
>>  drivers/gpu/drm/i915/display/intel_sdvo.c     | 30 +++++++++----------
>>  drivers/gpu/drm/i915/display/intel_vbt_defs.h |  4 +--
>>  drivers/gpu/drm/i915/gvt/edid.c               | 28 ++++++++---------
>>  drivers/gpu/drm/i915/gvt/edid.h               |  4 +--
>>  drivers/gpu/drm/i915/gvt/opregion.c           |  2 +-
>>  19 files changed, 119 insertions(+), 119 deletions(-)
>>

<snip>

>> diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
>> index c17462b4c2ac..64db211148a8 100644
>> --- a/drivers/gpu/drm/i915/display/intel_ddi.c
>> +++ b/drivers/gpu/drm/i915/display/intel_ddi.c
>> @@ -4332,7 +4332,7 @@ static int intel_ddi_compute_config_late(struct intel_encoder *encoder,
>>  									connector->tile_group->id);
>>  
>>  	/*
>> -	 * EDP Transcoders cannot be ensalved
>> +	 * EDP Transcoders cannot be slaves
> 
>                                      ^ here
> perhaps you meant 'targeted' ?
> 
>>  	 * make them a master always when present

<snip>

This is not actually I2C related as far as I could tell when I was making the change, so this was more of a typo fix.

If we want to improve this, a quick check with the eDP v1.5a spec suggests using primary/secondary instead,
though in a global fashion rather than specifically for eDP transcoders. There is also source/sink terminology
in the spec related to DP encoders.

Which would be a more acceptable change here?

Thanks,
Easwar

^ permalink raw reply	[relevance 72%]

* [PATCH v2 12/12] fbdev/viafb: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (10 preceding siblings ...)
  2024-05-03 18:13 70% ` [PATCH v2 11/12] fbdev/smscufx: " Easwar Hariharan
@ 2024-05-03 18:13 47% ` Easwar Hariharan
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Florian Tobias Schandinat, Helge Deller,
	open list:VIA UNICHROME(PRO)/CHROME9 FRAMEBUFFER DRIVER,
	open list:FRAMEBUFFER LAYER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/video/fbdev/via/chip.h    |  8 ++++----
 drivers/video/fbdev/via/dvi.c     | 24 ++++++++++++------------
 drivers/video/fbdev/via/lcd.c     |  6 +++---
 drivers/video/fbdev/via/via_aux.h |  2 +-
 drivers/video/fbdev/via/via_i2c.c | 12 ++++++------
 drivers/video/fbdev/via/vt1636.c  |  6 +++---
 6 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/via/chip.h b/drivers/video/fbdev/via/chip.h
index f0a19cbcb9e5..f81af13630e2 100644
--- a/drivers/video/fbdev/via/chip.h
+++ b/drivers/video/fbdev/via/chip.h
@@ -69,7 +69,7 @@
 #define     VT1632_TMDS             0x01
 #define     INTEGRATED_TMDS         0x42
 
-/* Definition TMDS Trasmitter I2C Slave Address */
+/* Definition TMDS Trasmitter I2C Target Address */
 #define     VT1632_TMDS_I2C_ADDR    0x10
 
 /**************************************************/
@@ -88,21 +88,21 @@
 #define     TX_DATA_DDR_MODE        0x04
 #define     TX_DATA_SDR_MODE        0x08
 
-/* Definition LVDS Trasmitter I2C Slave Address */
+/* Definition LVDS Trasmitter I2C Target Address */
 #define     VT1631_LVDS_I2C_ADDR    0x70
 #define     VT3271_LVDS_I2C_ADDR    0x80
 #define     VT1636_LVDS_I2C_ADDR    0x80
 
 struct tmds_chip_information {
 	int tmds_chip_name;
-	int tmds_chip_slave_addr;
+	int tmds_chip_target_addr;
 	int output_interface;
 	int i2c_port;
 };
 
 struct lvds_chip_information {
 	int lvds_chip_name;
-	int lvds_chip_slave_addr;
+	int lvds_chip_target_addr;
 	int output_interface;
 	int i2c_port;
 };
diff --git a/drivers/video/fbdev/via/dvi.c b/drivers/video/fbdev/via/dvi.c
index 13147e3066eb..27990a73bfa3 100644
--- a/drivers/video/fbdev/via/dvi.c
+++ b/drivers/video/fbdev/via/dvi.c
@@ -70,7 +70,7 @@ bool viafb_tmds_trasmitter_identify(void)
 	/* Check for VT1632: */
 	viaparinfo->chip_info->tmds_chip_info.tmds_chip_name = VT1632_TMDS;
 	viaparinfo->chip_info->
-		tmds_chip_info.tmds_chip_slave_addr = VT1632_TMDS_I2C_ADDR;
+		tmds_chip_info.tmds_chip_target_addr = VT1632_TMDS_I2C_ADDR;
 	viaparinfo->chip_info->tmds_chip_info.i2c_port = VIA_PORT_31;
 	if (check_tmds_chip(VT1632_DEVICE_ID_REG, VT1632_DEVICE_ID)) {
 		/*
@@ -128,14 +128,14 @@ bool viafb_tmds_trasmitter_identify(void)
 	viaparinfo->chip_info->
 		tmds_chip_info.tmds_chip_name = NON_TMDS_TRANSMITTER;
 	viaparinfo->chip_info->tmds_chip_info.
-		tmds_chip_slave_addr = VT1632_TMDS_I2C_ADDR;
+		tmds_chip_target_addr = VT1632_TMDS_I2C_ADDR;
 	return false;
 }
 
 static void tmds_register_write(int index, u8 data)
 {
 	viafb_i2c_writebyte(viaparinfo->chip_info->tmds_chip_info.i2c_port,
-			    viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr,
+			    viaparinfo->chip_info->tmds_chip_info.tmds_chip_target_addr,
 			    index, data);
 }
 
@@ -144,7 +144,7 @@ static int tmds_register_read(int index)
 	u8 data;
 
 	viafb_i2c_readbyte(viaparinfo->chip_info->tmds_chip_info.i2c_port,
-			   (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr,
+			   (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_target_addr,
 			   (u8) index, &data);
 	return data;
 }
@@ -152,7 +152,7 @@ static int tmds_register_read(int index)
 static int tmds_register_read_bytes(int index, u8 *buff, int buff_len)
 {
 	viafb_i2c_readbytes(viaparinfo->chip_info->tmds_chip_info.i2c_port,
-			    (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr,
+			    (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_target_addr,
 			    (u8) index, buff, buff_len);
 	return 0;
 }
@@ -256,14 +256,14 @@ static int viafb_dvi_query_EDID(void)
 
 	DEBUG_MSG(KERN_INFO "viafb_dvi_query_EDID!!\n");
 
-	restore = viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr;
-	viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr = 0xA0;
+	restore = viaparinfo->chip_info->tmds_chip_info.tmds_chip_target_addr;
+	viaparinfo->chip_info->tmds_chip_info.tmds_chip_target_addr = 0xA0;
 
 	data0 = (u8) tmds_register_read(0x00);
 	data1 = (u8) tmds_register_read(0x01);
 	if ((data0 == 0) && (data1 == 0xFF)) {
 		viaparinfo->chip_info->
-			tmds_chip_info.tmds_chip_slave_addr = restore;
+			tmds_chip_info.tmds_chip_target_addr = restore;
 		return EDID_VERSION_1;	/* Found EDID1 Table */
 	}
 
@@ -280,8 +280,8 @@ static void dvi_get_panel_size_from_DDCv1(
 
 	DEBUG_MSG(KERN_INFO "\n dvi_get_panel_size_from_DDCv1 \n");
 
-	restore = tmds_chip->tmds_chip_slave_addr;
-	tmds_chip->tmds_chip_slave_addr = 0xA0;
+	restore = tmds_chip->tmds_chip_target_addr;
+	tmds_chip->tmds_chip_target_addr = 0xA0;
 	for (i = 0x25; i < 0x6D; i++) {
 		switch (i) {
 		case 0x36:
@@ -306,7 +306,7 @@ static void dvi_get_panel_size_from_DDCv1(
 
 	DEBUG_MSG(KERN_INFO "DVI max pixelclock = %d\n",
 		tmds_setting->max_pixel_clock);
-	tmds_chip->tmds_chip_slave_addr = restore;
+	tmds_chip->tmds_chip_target_addr = restore;
 }
 
 /* If Disable DVI, turn off pad */
@@ -427,7 +427,7 @@ void viafb_dvi_enable(void)
 				viafb_i2c_writebyte(viaparinfo->chip_info->
 					tmds_chip_info.i2c_port,
 					viaparinfo->chip_info->
-					tmds_chip_info.tmds_chip_slave_addr,
+					tmds_chip_info.tmds_chip_target_addr,
 					0x08, data);
 			}
 		}
diff --git a/drivers/video/fbdev/via/lcd.c b/drivers/video/fbdev/via/lcd.c
index beec5c8d4d08..8673fced8749 100644
--- a/drivers/video/fbdev/via/lcd.c
+++ b/drivers/video/fbdev/via/lcd.c
@@ -147,7 +147,7 @@ bool viafb_lvds_trasmitter_identify(void)
 		return true;
 	/* Check for VT1631: */
 	viaparinfo->chip_info->lvds_chip_info.lvds_chip_name = VT1631_LVDS;
-	viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr =
+	viaparinfo->chip_info->lvds_chip_info.lvds_chip_target_addr =
 		VT1631_LVDS_I2C_ADDR;
 
 	if (check_lvds_chip(VT1631_DEVICE_ID_REG, VT1631_DEVICE_ID)) {
@@ -161,7 +161,7 @@ bool viafb_lvds_trasmitter_identify(void)
 
 	viaparinfo->chip_info->lvds_chip_info.lvds_chip_name =
 		NON_LVDS_TRANSMITTER;
-	viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr =
+	viaparinfo->chip_info->lvds_chip_info.lvds_chip_target_addr =
 		VT1631_LVDS_I2C_ADDR;
 	return false;
 }
@@ -327,7 +327,7 @@ static int lvds_register_read(int index)
 	u8 data;
 
 	viafb_i2c_readbyte(VIA_PORT_2C,
-			(u8) viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr,
+			(u8) viaparinfo->chip_info->lvds_chip_info.lvds_chip_target_addr,
 			(u8) index, &data);
 	return data;
 }
diff --git a/drivers/video/fbdev/via/via_aux.h b/drivers/video/fbdev/via/via_aux.h
index 0933bbf20e58..464723fd514c 100644
--- a/drivers/video/fbdev/via/via_aux.h
+++ b/drivers/video/fbdev/via/via_aux.h
@@ -24,7 +24,7 @@ struct via_aux_drv {
 	struct list_head chain;		/* chain to support multiple drivers */
 
 	struct via_aux_bus *bus;	/* the I2C bus used */
-	u8 addr;			/* the I2C slave address */
+	u8 addr;			/* the I2C target address */
 
 	const char *name;	/* human readable name of the driver */
 	void *data;		/* private data of this driver */
diff --git a/drivers/video/fbdev/via/via_i2c.c b/drivers/video/fbdev/via/via_i2c.c
index 582502810575..5edd3827ca27 100644
--- a/drivers/video/fbdev/via/via_i2c.c
+++ b/drivers/video/fbdev/via/via_i2c.c
@@ -104,7 +104,7 @@ static void via_i2c_setsda(void *data, int state)
 	spin_unlock_irqrestore(&i2c_vdev->reg_lock, flags);
 }
 
-int viafb_i2c_readbyte(u8 adap, u8 slave_addr, u8 index, u8 *pdata)
+int viafb_i2c_readbyte(u8 adap, u8 target_addr, u8 index, u8 *pdata)
 {
 	int ret;
 	u8 mm1[] = {0x00};
@@ -115,7 +115,7 @@ int viafb_i2c_readbyte(u8 adap, u8 slave_addr, u8 index, u8 *pdata)
 	*pdata = 0;
 	msgs[0].flags = 0;
 	msgs[1].flags = I2C_M_RD;
-	msgs[0].addr = msgs[1].addr = slave_addr / 2;
+	msgs[0].addr = msgs[1].addr = target_addr / 2;
 	mm1[0] = index;
 	msgs[0].len = 1; msgs[1].len = 1;
 	msgs[0].buf = mm1; msgs[1].buf = pdata;
@@ -128,7 +128,7 @@ int viafb_i2c_readbyte(u8 adap, u8 slave_addr, u8 index, u8 *pdata)
 	return ret;
 }
 
-int viafb_i2c_writebyte(u8 adap, u8 slave_addr, u8 index, u8 data)
+int viafb_i2c_writebyte(u8 adap, u8 target_addr, u8 index, u8 data)
 {
 	int ret;
 	u8 msg[2] = { index, data };
@@ -137,7 +137,7 @@ int viafb_i2c_writebyte(u8 adap, u8 slave_addr, u8 index, u8 data)
 	if (!via_i2c_par[adap].is_active)
 		return -ENODEV;
 	msgs.flags = 0;
-	msgs.addr = slave_addr / 2;
+	msgs.addr = target_addr / 2;
 	msgs.len = 2;
 	msgs.buf = msg;
 	ret = i2c_transfer(&via_i2c_par[adap].adapter, &msgs, 1);
@@ -149,7 +149,7 @@ int viafb_i2c_writebyte(u8 adap, u8 slave_addr, u8 index, u8 data)
 	return ret;
 }
 
-int viafb_i2c_readbytes(u8 adap, u8 slave_addr, u8 index, u8 *buff, int buff_len)
+int viafb_i2c_readbytes(u8 adap, u8 target_addr, u8 index, u8 *buff, int buff_len)
 {
 	int ret;
 	u8 mm1[] = {0x00};
@@ -159,7 +159,7 @@ int viafb_i2c_readbytes(u8 adap, u8 slave_addr, u8 index, u8 *buff, int buff_len
 		return -ENODEV;
 	msgs[0].flags = 0;
 	msgs[1].flags = I2C_M_RD;
-	msgs[0].addr = msgs[1].addr = slave_addr / 2;
+	msgs[0].addr = msgs[1].addr = target_addr / 2;
 	mm1[0] = index;
 	msgs[0].len = 1; msgs[1].len = buff_len;
 	msgs[0].buf = mm1; msgs[1].buf = buff;
diff --git a/drivers/video/fbdev/via/vt1636.c b/drivers/video/fbdev/via/vt1636.c
index 8d8cfdb05618..0d58ca144e19 100644
--- a/drivers/video/fbdev/via/vt1636.c
+++ b/drivers/video/fbdev/via/vt1636.c
@@ -44,7 +44,7 @@ u8 viafb_gpio_i2c_read_lvds(struct lvds_setting_information
 	u8 data;
 
 	viafb_i2c_readbyte(plvds_chip_info->i2c_port,
-			   plvds_chip_info->lvds_chip_slave_addr, index, &data);
+			   plvds_chip_info->lvds_chip_target_addr, index, &data);
 	return data;
 }
 
@@ -60,7 +60,7 @@ void viafb_gpio_i2c_write_mask_lvds(struct lvds_setting_information
 	data = (data & (~io_data.Mask)) | io_data.Data;
 
 	viafb_i2c_writebyte(plvds_chip_info->i2c_port,
-			    plvds_chip_info->lvds_chip_slave_addr, index, data);
+			    plvds_chip_info->lvds_chip_target_addr, index, data);
 }
 
 void viafb_init_lvds_vt1636(struct lvds_setting_information
@@ -113,7 +113,7 @@ bool viafb_lvds_identify_vt1636(u8 i2c_adapter)
 	DEBUG_MSG(KERN_INFO "viafb_lvds_identify_vt1636.\n");
 
 	/* Sense VT1636 LVDS Transmiter */
-	viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr =
+	viaparinfo->chip_info->lvds_chip_info.lvds_chip_target_addr =
 		VT1636_LVDS_I2C_ADDR;
 
 	/* Check vendor ID first: */
-- 
2.34.1


^ permalink raw reply related	[relevance 47%]
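
The viafb helpers renamed in the patch above pass addresses such as 0xA0 and 0x10 and then store "target_addr / 2" in msgs[].addr. That appears to be the usual conversion from an 8-bit style address (with the R/W bit position included) to the 7-bit address that struct i2c_msg carries. A minimal userspace sketch of that arithmetic follows; addr8_to_addr7() is a hypothetical name used for illustration only and is not part of the patch or the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: convert an 8-bit style
 * I2C address (R/W bit position included, as in 0xA0) to the 7-bit
 * form. This mirrors the "target_addr / 2" expression in via_i2c.c.
 */
static inline uint8_t addr8_to_addr7(uint8_t addr8)
{
	return addr8 / 2;	/* equivalent to addr8 >> 1 for unsigned values */
}
```

Under this reading, the 0xA0 used for the EDID query maps to 0x50, and VT1632_TMDS_I2C_ADDR (0x10) maps to 0x08.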

* [PATCH v2 11/12] fbdev/smscufx: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (9 preceding siblings ...)
  2024-05-03 18:13 70% ` [PATCH v2 10/12] sfc: falcon: " Easwar Hariharan
@ 2024-05-03 18:13 70% ` Easwar Hariharan
  2024-05-03 18:13 47% ` [PATCH v2 12/12] fbdev/viafb: " Easwar Hariharan
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Steve Glendinning, Helge Deller,
	open list:SMSC UFX6000 and UFX7000 USB to VGA DRIVER,
	open list:FRAMEBUFFER LAYER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/video/fbdev/smscufx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/fbdev/smscufx.c b/drivers/video/fbdev/smscufx.c
index 35d682b110c4..5f0dd01fd834 100644
--- a/drivers/video/fbdev/smscufx.c
+++ b/drivers/video/fbdev/smscufx.c
@@ -1292,7 +1292,7 @@ static int ufx_realloc_framebuffer(struct ufx_data *dev, struct fb_info *info)
 	return 0;
 }
 
-/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, master,
+/* sets up DDC channel for 100 Kbps, std. speed, 7-bit addr, controller mode,
  * restart enabled, but no start byte, enable controller */
 static int ufx_i2c_init(struct ufx_data *dev)
 {
@@ -1321,7 +1321,7 @@ static int ufx_i2c_init(struct ufx_data *dev)
 	/* 7-bit (not 10-bit) addressing */
 	tmp &= ~(0x10);
 
-	/* enable restart conditions and master mode */
+	/* enable restart conditions and controller mode */
 	tmp |= 0x21;
 
 	status = ufx_reg_write(dev, 0x1000, tmp);
-- 
2.34.1


^ permalink raw reply related	[relevance 70%]
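
The second smscufx hunk above shows a read-modify-write on a control value: one mask clears a bit for 7-bit addressing and one OR sets the bits for restart conditions and controller mode. The sketch below reproduces only those two visible steps in userspace C. ufx_i2c_ctrl_bits() is a hypothetical name, and the mapping of individual bits to features is taken from the comments in the hunk rather than from a datasheet.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring only the two steps visible in the hunk;
 * the real ufx_i2c_init() does more before and after these lines.
 */
static inline uint32_t ufx_i2c_ctrl_bits(uint32_t tmp)
{
	tmp &= ~(uint32_t)0x10;	/* clear bit 4: 7-bit (not 10-bit) addressing */
	tmp |= 0x21;		/* set bits 0 and 5: restart conditions and controller mode */
	return tmp;
}
```

Clearing before setting keeps the two masks independent: 0x10 and 0x21 share no bits, so the order does not change the result here.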

* [PATCH v2 10/12] sfc: falcon: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (8 preceding siblings ...)
  2024-05-03 18:13 58% ` [PATCH v2 09/12] media: cx23885: " Easwar Hariharan
@ 2024-05-03 18:13 70% ` Easwar Hariharan
  2024-05-03 18:13 70% ` [PATCH v2 11/12] fbdev/smscufx: " Easwar Hariharan
  2024-05-03 18:13 47% ` [PATCH v2 12/12] fbdev/viafb: " Easwar Hariharan
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Edward Cree, Martin Habets, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Easwar Hariharan,
	open list:SFC NETWORK DRIVER, open list:SFC NETWORK DRIVER,
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/net/ethernet/sfc/falcon/falcon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/falcon/falcon.c b/drivers/net/ethernet/sfc/falcon/falcon.c
index 7a1c9337081b..36114ce88034 100644
--- a/drivers/net/ethernet/sfc/falcon/falcon.c
+++ b/drivers/net/ethernet/sfc/falcon/falcon.c
@@ -367,7 +367,7 @@ static const struct i2c_algo_bit_data falcon_i2c_bit_operations = {
 	.getsda		= falcon_getsda,
 	.getscl		= falcon_getscl,
 	.udelay		= 5,
-	/* Wait up to 50 ms for slave to let us pull SCL high */
+	/* Wait up to 50 ms for target to let us pull SCL high */
 	.timeout	= DIV_ROUND_UP(HZ, 20),
 };
 
-- 
2.34.1


^ permalink raw reply related	[relevance 70%]
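
The comment touched by the falcon patch above says the timeout is "up to 50 ms", while the value itself is DIV_ROUND_UP(HZ, 20), a count of jiffies. The sketch below shows how that expression behaves for a few common tick rates. It is a userspace approximation: the macro is copied in the form the kernel uses, and timeout_jiffies_50ms() is a hypothetical helper that takes the tick rate as a parameter, whereas HZ is a build-time constant in the kernel.

```c
/* Userspace copy of the kernel's rounding-up integer division macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: number of jiffies covering at least 1/20 s
 * (50 ms) at a tick rate of hz ticks per second.
 */
static inline unsigned int timeout_jiffies_50ms(unsigned int hz)
{
	return DIV_ROUND_UP(hz, 20);
}
```

Rounding up matters when the tick rate is not a multiple of 20: at 250 ticks per second, plain division would give 12 jiffies (48 ms), whereas rounding up gives 13 jiffies (52 ms), so the wait is never shorter than intended.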

* [PATCH v2 09/12] media: cx23885: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (7 preceding siblings ...)
  2024-05-03 18:13 57% ` [PATCH v2 08/12] media: ivtv: " Easwar Hariharan
@ 2024-05-03 18:13 58% ` Easwar Hariharan
  2024-05-03 18:13 70% ` [PATCH v2 10/12] sfc: falcon: " Easwar Hariharan
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Mariusz Bialonczyk, Hans Verkuil,
	Easwar Hariharan, open list:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cx23885/cx23885-core.c | 6 +++---
 drivers/media/pci/cx23885/cx23885-f300.c | 8 ++++----
 drivers/media/pci/cx23885/cx23885-i2c.c  | 6 +++---
 drivers/media/pci/cx23885/cx23885.h      | 2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/media/pci/cx23885/cx23885-core.c b/drivers/media/pci/cx23885/cx23885-core.c
index c8705d786cdd..0adbdf529cec 100644
--- a/drivers/media/pci/cx23885/cx23885-core.c
+++ b/drivers/media/pci/cx23885/cx23885-core.c
@@ -942,7 +942,7 @@ static int cx23885_dev_setup(struct cx23885_dev *dev)
 	dev->pci_slot = PCI_SLOT(dev->pci->devfn);
 	cx23885_irq_add(dev, 0x001f00);
 
-	/* External Master 1 Bus */
+	/* External Controller 1 Bus */
 	dev->i2c_bus[0].nr = 0;
 	dev->i2c_bus[0].dev = dev;
 	dev->i2c_bus[0].reg_stat  = I2C1_STAT;
@@ -952,7 +952,7 @@ static int cx23885_dev_setup(struct cx23885_dev *dev)
 	dev->i2c_bus[0].reg_wdata = I2C1_WDATA;
 	dev->i2c_bus[0].i2c_period = (0x9d << 24); /* 100kHz */
 
-	/* External Master 2 Bus */
+	/* External Controller 2 Bus */
 	dev->i2c_bus[1].nr = 1;
 	dev->i2c_bus[1].dev = dev;
 	dev->i2c_bus[1].reg_stat  = I2C2_STAT;
@@ -962,7 +962,7 @@ static int cx23885_dev_setup(struct cx23885_dev *dev)
 	dev->i2c_bus[1].reg_wdata = I2C2_WDATA;
 	dev->i2c_bus[1].i2c_period = (0x9d << 24); /* 100kHz */
 
-	/* Internal Master 3 Bus */
+	/* Internal Controller 3 Bus */
 	dev->i2c_bus[2].nr = 2;
 	dev->i2c_bus[2].dev = dev;
 	dev->i2c_bus[2].reg_stat  = I2C3_STAT;
diff --git a/drivers/media/pci/cx23885/cx23885-f300.c b/drivers/media/pci/cx23885/cx23885-f300.c
index ac1c434e8e24..2ef7454e0f61 100644
--- a/drivers/media/pci/cx23885/cx23885-f300.c
+++ b/drivers/media/pci/cx23885/cx23885-f300.c
@@ -92,7 +92,7 @@ static u8 f300_xfer(struct dvb_frontend *fe, u8 *buf)
 	f300_set_line(dev, F300_RESET, 0);/* begin to send data */
 	msleep(1);
 
-	f300_send_byte(dev, 0xe0);/* the slave address is 0xe0, write */
+	f300_send_byte(dev, 0xe0);/* the target address is 0xe0, write */
 	msleep(1);
 
 	temp = buf[0];
@@ -112,10 +112,10 @@ static u8 f300_xfer(struct dvb_frontend *fe, u8 *buf)
 	}
 
 	if (i > 7) {
-		pr_err("%s: timeout, the slave no response\n",
+		pr_err("%s: timeout, the target no response\n",
 								__func__);
-		ret = 1; /* timeout, the slave no response */
-	} else { /* the slave not busy, prepare for getting data */
+		ret = 1; /* timeout, the target no response */
+	} else { /* the target not busy, prepare for getting data */
 		f300_set_line(dev, F300_RESET, 0);/*ready...*/
 		msleep(1);
 		f300_send_byte(dev, 0xe1);/* 0xe1 is Read */
diff --git a/drivers/media/pci/cx23885/cx23885-i2c.c b/drivers/media/pci/cx23885/cx23885-i2c.c
index f51fad33dc04..ddafeccb2b0a 100644
--- a/drivers/media/pci/cx23885/cx23885-i2c.c
+++ b/drivers/media/pci/cx23885/cx23885-i2c.c
@@ -34,7 +34,7 @@ MODULE_PARM_DESC(i2c_scan, "scan i2c bus at insmod time");
 #define I2C_EXTEND  (1 << 3)
 #define I2C_NOSTOP  (1 << 4)
 
-static inline int i2c_slave_did_ack(struct i2c_adapter *i2c_adap)
+static inline int i2c_target_did_ack(struct i2c_adapter *i2c_adap)
 {
 	struct cx23885_i2c *bus = i2c_adap->algo_data;
 	struct cx23885_dev *dev = bus->dev;
@@ -84,7 +84,7 @@ static int i2c_sendbytes(struct i2c_adapter *i2c_adap,
 		cx_write(bus->reg_ctrl, bus->i2c_period | (1 << 2));
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_target_did_ack(i2c_adap))
 			return -ENXIO;
 
 		dprintk(1, "%s() returns 0\n", __func__);
@@ -163,7 +163,7 @@ static int i2c_readbytes(struct i2c_adapter *i2c_adap,
 		cx_write(bus->reg_ctrl, bus->i2c_period | (1 << 2) | 1);
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_target_did_ack(i2c_adap))
 			return -ENXIO;
 
 
diff --git a/drivers/media/pci/cx23885/cx23885.h b/drivers/media/pci/cx23885/cx23885.h
index 349462ee2c48..c2d7a95933d5 100644
--- a/drivers/media/pci/cx23885/cx23885.h
+++ b/drivers/media/pci/cx23885/cx23885.h
@@ -368,7 +368,7 @@ struct cx23885_dev {
 	 * AV core so we see nice clean and stable video and audio. */
 	u32                        clk_freq;
 
-	/* I2C adapters: Master 1 & 2 (External) & Master 3 (Internal only) */
+	/* I2C adapters: Controller 1 & 2 (External) & Controller 3 (Internal only) */
 	struct cx23885_i2c         i2c_bus[3];
 
 	int                        nr;
-- 
2.34.1


^ permalink raw reply related	[relevance 58%]

* [PATCH v2 08/12] media: ivtv: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (6 preceding siblings ...)
  2024-05-03 18:13 63% ` [PATCH v2 07/12] media: cx25821: " Easwar Hariharan
@ 2024-05-03 18:13 57% ` Easwar Hariharan
  2024-05-03 18:13 58% ` [PATCH v2 09/12] media: cx23885: " Easwar Hariharan
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Andy Walls, Mauro Carvalho Chehab,
	open list:IVTV VIDEO4LINUX DRIVER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/ivtv/ivtv-i2c.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/media/pci/ivtv/ivtv-i2c.c b/drivers/media/pci/ivtv/ivtv-i2c.c
index c052c57c6dce..a22c7caa92f7 100644
--- a/drivers/media/pci/ivtv/ivtv-i2c.c
+++ b/drivers/media/pci/ivtv/ivtv-i2c.c
@@ -33,14 +33,14 @@
     Some more general comments about what we are doing:
 
     The i2c bus is a 2 wire serial bus, with clock (SCL) and data (SDA)
-    lines.  To communicate on the bus (as a master, we don't act as a slave),
+    lines.  To communicate on the bus (as a controller, we don't act as a target),
     we first initiate a start condition (ivtv_start).  We then write the
     address of the device that we want to communicate with, along with a flag
-    that indicates whether this is a read or a write.  The slave then issues
+    that indicates whether this is a read or a write.  The target then issues
     an ACK signal (ivtv_ack), which tells us that it is ready for reading /
     writing.  We then proceed with reading or writing (ivtv_read/ivtv_write),
     and finally issue a stop condition (ivtv_stop) to make the bus available
-    to other masters.
+    to other controllers.
 
     There is an additional form of transaction where a write may be
     immediately followed by a read.  In this case, there is no intervening
@@ -379,7 +379,7 @@ static int ivtv_waitsda(struct ivtv *itv, int val)
 	return 0;
 }
 
-/* Wait for the slave to issue an ACK */
+/* Wait for the target to issue an ACK */
 static int ivtv_ack(struct ivtv *itv)
 {
 	int ret = 0;
@@ -396,7 +396,7 @@ static int ivtv_ack(struct ivtv *itv)
 	ivtv_scldelay(itv);
 	ivtv_setscl(itv, 1);
 	if (!ivtv_waitsda(itv, 0)) {
-		IVTV_DEBUG_I2C("Slave did not ack\n");
+		IVTV_DEBUG_I2C("Target did not ack\n");
 		ret = -EREMOTEIO;
 	}
 	ivtv_setscl(itv, 0);
@@ -407,7 +407,7 @@ static int ivtv_ack(struct ivtv *itv)
 	return ret;
 }
 
-/* Write a single byte to the i2c bus and wait for the slave to ACK */
+/* Write a single byte to the i2c bus and wait for the target to ACK */
 static int ivtv_sendbyte(struct ivtv *itv, unsigned char byte)
 {
 	int i, bit;
@@ -427,7 +427,7 @@ static int ivtv_sendbyte(struct ivtv *itv, unsigned char byte)
 		}
 		ivtv_setscl(itv, 1);
 		if (!ivtv_waitscl(itv, 1)) {
-			IVTV_DEBUG_I2C("Slave not ready for bit\n");
+			IVTV_DEBUG_I2C("Target not ready for bit\n");
 			return -EREMOTEIO;
 		}
 	}
@@ -471,7 +471,7 @@ static int ivtv_readbyte(struct ivtv *itv, unsigned char *byte, int nack)
 	return 0;
 }
 
-/* Issue a start condition on the i2c bus to alert slaves to prepare for
+/* Issue a start condition on the i2c bus to alert targets to prepare for
    an address write */
 static int ivtv_start(struct ivtv *itv)
 {
@@ -534,7 +534,7 @@ static int ivtv_stop(struct ivtv *itv)
 	return 0;
 }
 
-/* Write a message to the given i2c slave.  do_stop may be 0 to prevent
+/* Write a message to the given i2c target.  do_stop may be 0 to prevent
    issuing the i2c stop condition (when following with a read) */
 static int ivtv_write(struct ivtv *itv, unsigned char addr, unsigned char *data, u32 len, int do_stop)
 {
@@ -558,7 +558,7 @@ static int ivtv_write(struct ivtv *itv, unsigned char addr, unsigned char *data,
 	return ret;
 }
 
-/* Read data from the given i2c slave.  A stop condition is always issued. */
+/* Read data from the given i2c target.  A stop condition is always issued. */
 static int ivtv_read(struct ivtv *itv, unsigned char addr, unsigned char *data, u32 len)
 {
 	int retry, ret = -EREMOTEIO;
-- 
2.34.1
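
For readers unfamiliar with the transaction that the ivtv comment describes
(start condition, address byte carrying a read/write bit, ACK from the target,
data bytes, stop condition), here is a minimal user-space sketch in C. It is
not taken from the driver: sim_target, i2c_addr_byte(), sim_sendbyte() and
sim_write() are hypothetical names, and the bus is modelled in memory instead
of by toggling SCL/SDA.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of one target on the bus. */
struct sim_target {
	uint8_t addr;      /* 7-bit address this target answers to */
	uint8_t last_byte; /* last data byte the target accepted */
	size_t received;   /* number of data bytes accepted so far */
};

/* First byte of a transaction: 7-bit address, then the R/W bit (1 = read). */
uint8_t i2c_addr_byte(uint8_t addr7, int read)
{
	return (uint8_t)((addr7 << 1) | (read ? 1 : 0));
}

/* Shift one byte out and return 1 if the simulated target ACKs it. */
int sim_sendbyte(struct sim_target *t, uint8_t byte, int is_addr)
{
	uint8_t shifted = 0;
	int i;

	/* Clock the byte out one bit at a time, most significant bit first. */
	for (i = 7; i >= 0; i--)
		shifted = (uint8_t)((shifted << 1) | ((byte >> i) & 1));

	if (is_addr)
		return (shifted >> 1) == t->addr; /* ACK only our own address */

	t->last_byte = shifted;
	t->received++;
	return 1;
}

/* Write len bytes to the target at addr7; 0 on success, -1 if not ACKed. */
int sim_write(struct sim_target *t, uint8_t addr7,
	      const uint8_t *data, size_t len)
{
	size_t i;

	/* A real controller issues the start condition here. */
	if (!sim_sendbyte(t, i2c_addr_byte(addr7, 0), 1))
		return -1; /* ivtv_ack() reports -EREMOTEIO in this case */

	for (i = 0; i < len; i++)
		if (!sim_sendbyte(t, data[i], 0))
			return -1;

	/* A real controller issues the stop condition here. */
	return 0;
}
```

Calling sim_write() with an address the simulated target does not own fails at
the address byte, which mirrors the "Target did not ack" path in ivtv_ack().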


^ permalink raw reply related	[relevance 57%]

* [PATCH v2 07/12] media: cx25821: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (5 preceding siblings ...)
  2024-05-03 18:13 56% ` [PATCH v2 06/12] media: cx18: " Easwar Hariharan
@ 2024-05-03 18:13 63% ` Easwar Hariharan
  2024-05-03 18:13 57% ` [PATCH v2 08/12] media: ivtv: " Easwar Hariharan
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Easwar Hariharan,
	open list:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/[1], fix the terminology for users
of the I2C_ALGOBIT bitbanging interface, now that the approved verbiage
exists in the specifications.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cx25821/cx25821-core.c         | 2 +-
 drivers/media/pci/cx25821/cx25821-i2c.c          | 6 +++---
 drivers/media/pci/cx25821/cx25821-medusa-video.c | 2 +-
 drivers/media/pci/cx25821/cx25821.h              | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
index 6627fa9166d3..a9af18910c1f 100644
--- a/drivers/media/pci/cx25821/cx25821-core.c
+++ b/drivers/media/pci/cx25821/cx25821-core.c
@@ -877,7 +877,7 @@ static int cx25821_dev_setup(struct cx25821_dev *dev)
 	dev->pci_slot = PCI_SLOT(dev->pci->devfn);
 	dev->pci_irqmask = 0x001f00;
 
-	/* External Master 1 Bus */
+	/* External Controller 1 Bus */
 	dev->i2c_bus[0].nr = 0;
 	dev->i2c_bus[0].dev = dev;
 	dev->i2c_bus[0].reg_stat = I2C1_STAT;
diff --git a/drivers/media/pci/cx25821/cx25821-i2c.c b/drivers/media/pci/cx25821/cx25821-i2c.c
index 0ef4cd6528a0..0000f3322dd2 100644
--- a/drivers/media/pci/cx25821/cx25821-i2c.c
+++ b/drivers/media/pci/cx25821/cx25821-i2c.c
@@ -33,7 +33,7 @@ do {									\
 #define I2C_EXTEND  (1 << 3)
 #define I2C_NOSTOP  (1 << 4)
 
-static inline int i2c_slave_did_ack(struct i2c_adapter *i2c_adap)
+static inline int i2c_target_did_ack(struct i2c_adapter *i2c_adap)
 {
 	struct cx25821_i2c *bus = i2c_adap->algo_data;
 	struct cx25821_dev *dev = bus->dev;
@@ -85,7 +85,7 @@ static int i2c_sendbytes(struct i2c_adapter *i2c_adap,
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
 
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_target_did_ack(i2c_adap))
 			return -EIO;
 
 		dprintk(1, "%s(): returns 0\n", __func__);
@@ -174,7 +174,7 @@ static int i2c_readbytes(struct i2c_adapter *i2c_adap,
 		cx_write(bus->reg_ctrl, bus->i2c_period | (1 << 2) | 1);
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_target_did_ack(i2c_adap))
 			return -EIO;
 
 		dprintk(1, "%s(): returns 0\n", __func__);
diff --git a/drivers/media/pci/cx25821/cx25821-medusa-video.c b/drivers/media/pci/cx25821/cx25821-medusa-video.c
index f0a1ac77f048..67a18add6ed3 100644
--- a/drivers/media/pci/cx25821/cx25821-medusa-video.c
+++ b/drivers/media/pci/cx25821/cx25821-medusa-video.c
@@ -659,7 +659,7 @@ int medusa_video_init(struct cx25821_dev *dev)
 	if (ret_val < 0)
 		goto error;
 
-	/* Turn off Master source switch enable */
+	/* Turn off Controller source switch enable */
 	value = cx25821_i2c_read(&dev->i2c_bus[0], MON_A_CTRL, &tmp);
 	value &= 0xFFFFFFDF;
 	ret_val = cx25821_i2c_write(&dev->i2c_bus[0], MON_A_CTRL, value);
diff --git a/drivers/media/pci/cx25821/cx25821.h b/drivers/media/pci/cx25821/cx25821.h
index 3aa7604fb944..e96be9127467 100644
--- a/drivers/media/pci/cx25821/cx25821.h
+++ b/drivers/media/pci/cx25821/cx25821.h
@@ -234,7 +234,7 @@ struct cx25821_dev {
 
 	u32 clk_freq;
 
-	/* I2C adapters: Master 1 & 2 (External) & Master 3 (Internal only) */
+	/* I2C adapters: Controller 1 & 2 (External) & Controller 3 (Internal only) */
 	struct cx25821_i2c i2c_bus[3];
 
 	int nr;
-- 
2.34.1


^ permalink raw reply related	[relevance 63%]

* [PATCH v2 06/12] media: cx18: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (4 preceding siblings ...)
  2024-05-03 18:13 71% ` [PATCH v2 05/12] media: cobalt: " Easwar Hariharan
@ 2024-05-03 18:13 56% ` Easwar Hariharan
  2024-05-03 18:13 63% ` [PATCH v2 07/12] media: cx25821: " Easwar Hariharan
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Andy Walls, Mauro Carvalho Chehab,
	open list:CX18 VIDEO4LINUX DRIVER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/[1], fix the terminology for users
of the I2C_ALGOBIT bitbanging interface, now that the approved verbiage
exists in the specifications.

The I2S specification has also updated its terms in v.3 to use "controller"
and "target", respectively. Make those changes in the relevant places as
well.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cx18/cx18-av-firmware.c | 8 ++++----
 drivers/media/pci/cx18/cx18-cards.c       | 6 +++---
 drivers/media/pci/cx18/cx18-cards.h       | 4 ++--
 drivers/media/pci/cx18/cx18-gpio.c        | 6 +++---
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/media/pci/cx18/cx18-av-firmware.c b/drivers/media/pci/cx18/cx18-av-firmware.c
index 61aeb8c9af7f..906e0b33cffc 100644
--- a/drivers/media/pci/cx18/cx18-av-firmware.c
+++ b/drivers/media/pci/cx18/cx18-av-firmware.c
@@ -140,22 +140,22 @@ int cx18_av_loadfw(struct cx18 *cx)
 	cx18_av_and_or4(cx, CXADEC_PIN_CTRL1, ~0, 0x78000);
 
 	/* Audio input control 1 set to Sony mode */
-	/* Audio output input 2 is 0 for slave operation input */
+	/* Audio output input 2 is 0 for target operation input */
 	/* 0xC4000914[5]: 0 = left sample on WS=0, 1 = left sample on WS=1 */
 	/* 0xC4000914[7]: 0 = Philips mode, 1 = Sony mode (1st SCK rising edge
 	   after WS transition for first bit of audio word. */
 	cx18_av_write4(cx, CXADEC_I2S_IN_CTL, 0x000000A0);
 
 	/* Audio output control 1 is set to Sony mode */
-	/* Audio output control 2 is set to 1 for master mode */
+	/* Audio output control 2 is set to 1 for controller mode */
 	/* 0xC4000918[5]: 0 = left sample on WS=0, 1 = left sample on WS=1 */
 	/* 0xC4000918[7]: 0 = Philips mode, 1 = Sony mode (1st SCK rising edge
 	   after WS transition for first bit of audio word. */
-	/* 0xC4000918[8]: 0 = slave operation, 1 = master (SCK_OUT and WS_OUT
+	/* 0xC4000918[8]: 0 = target operation, 1 = controller (SCK_OUT and WS_OUT
 	   are generated) */
 	cx18_av_write4(cx, CXADEC_I2S_OUT_CTL, 0x000001A0);
 
-	/* set alt I2s master clock to /0x16 and enable alt divider i2s
+	/* set alt I2s controller clock to /0x16 and enable alt divider i2s
 	   passthrough */
 	cx18_av_write4(cx, CXADEC_PIN_CFG3, 0x5600B687);
 
diff --git a/drivers/media/pci/cx18/cx18-cards.c b/drivers/media/pci/cx18/cx18-cards.c
index f5a30959a367..4a04bc984578 100644
--- a/drivers/media/pci/cx18/cx18-cards.c
+++ b/drivers/media/pci/cx18/cx18-cards.c
@@ -82,7 +82,7 @@ static const struct cx18_card cx18_card_hvr1600_esmt = {
 	},
 	.gpio_init.initial_value = 0x3001,
 	.gpio_init.direction = 0x3001,
-	.gpio_i2c_slave_reset = {
+	.gpio_i2c_target_reset = {
 		.active_lo_mask = 0x3001,
 		.msecs_asserted = 10,
 		.msecs_recovery = 40,
@@ -129,7 +129,7 @@ static const struct cx18_card cx18_card_hvr1600_s5h1411 = {
 	},
 	.gpio_init.initial_value = 0x3801,
 	.gpio_init.direction = 0x3801,
-	.gpio_i2c_slave_reset = {
+	.gpio_i2c_target_reset = {
 		.active_lo_mask = 0x3801,
 		.msecs_asserted = 10,
 		.msecs_recovery = 40,
@@ -176,7 +176,7 @@ static const struct cx18_card cx18_card_hvr1600_samsung = {
 	},
 	.gpio_init.initial_value = 0x3001,
 	.gpio_init.direction = 0x3001,
-	.gpio_i2c_slave_reset = {
+	.gpio_i2c_target_reset = {
 		.active_lo_mask = 0x3001,
 		.msecs_asserted = 10,
 		.msecs_recovery = 40,
diff --git a/drivers/media/pci/cx18/cx18-cards.h b/drivers/media/pci/cx18/cx18-cards.h
index ae9cf5bfdd59..a886ff735e89 100644
--- a/drivers/media/pci/cx18/cx18-cards.h
+++ b/drivers/media/pci/cx18/cx18-cards.h
@@ -69,7 +69,7 @@ struct cx18_gpio_init { /* set initial GPIO DIR and OUT values */
 	u32 initial_value;
 };
 
-struct cx18_gpio_i2c_slave_reset {
+struct cx18_gpio_i2c_target_reset {
 	u32 active_lo_mask; /* GPIO outputs that reset i2c chips when low */
 	u32 active_hi_mask; /* GPIO outputs that reset i2c chips when high */
 	int msecs_asserted; /* time period reset must remain asserted */
@@ -121,7 +121,7 @@ struct cx18_card {
 	/* GPIO card-specific settings */
 	u8 xceive_pin;		/* XCeive tuner GPIO reset pin */
 	struct cx18_gpio_init		 gpio_init;
-	struct cx18_gpio_i2c_slave_reset gpio_i2c_slave_reset;
+	struct cx18_gpio_i2c_target_reset gpio_i2c_target_reset;
 	struct cx18_gpio_audio_input    gpio_audio_input;
 
 	struct cx18_card_tuner tuners[CX18_CARD_MAX_TUNERS];
diff --git a/drivers/media/pci/cx18/cx18-gpio.c b/drivers/media/pci/cx18/cx18-gpio.c
index c85eb8d25837..85546e0a01c1 100644
--- a/drivers/media/pci/cx18/cx18-gpio.c
+++ b/drivers/media/pci/cx18/cx18-gpio.c
@@ -204,9 +204,9 @@ static int resetctrl_log_status(struct v4l2_subdev *sd)
 static int resetctrl_reset(struct v4l2_subdev *sd, u32 val)
 {
 	struct cx18 *cx = v4l2_get_subdevdata(sd);
-	const struct cx18_gpio_i2c_slave_reset *p;
+	const struct cx18_gpio_i2c_target_reset *p;
 
-	p = &cx->card->gpio_i2c_slave_reset;
+	p = &cx->card->gpio_i2c_target_reset;
 	switch (val) {
 	case CX18_GPIO_RESET_I2C:
 		gpio_reset_seq(cx, p->active_lo_mask, p->active_hi_mask,
@@ -309,7 +309,7 @@ void cx18_reset_ir_gpio(void *data)
 {
 	struct cx18 *cx = to_cx18(data);
 
-	if (cx->card->gpio_i2c_slave_reset.ir_reset_mask == 0)
+	if (cx->card->gpio_i2c_target_reset.ir_reset_mask == 0)
 		return;
 
 	CX18_DEBUG_INFO("Resetting IR microcontroller\n");
-- 
2.34.1


^ permalink raw reply related	[relevance 56%]

* [PATCH v2 05/12] media: cobalt: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (3 preceding siblings ...)
  2024-05-03 18:13 69% ` [PATCH v2 04/12] media: au0828: " Easwar Hariharan
@ 2024-05-03 18:13 71% ` Easwar Hariharan
  2024-05-03 18:13 56% ` [PATCH v2 06/12] media: cx18: " Easwar Hariharan
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Hans Verkuil, Mauro Carvalho Chehab,
	open list:COBALT MEDIA DRIVER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/[1], fix the terminology for users
of the I2C_ALGOBIT bitbanging interface, now that the approved verbiage
exists in the specifications.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cobalt/cobalt-i2c.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-i2c.c b/drivers/media/pci/cobalt/cobalt-i2c.c
index 10c9ee33f73e..011130aef2ca 100644
--- a/drivers/media/pci/cobalt/cobalt-i2c.c
+++ b/drivers/media/pci/cobalt/cobalt-i2c.c
@@ -45,10 +45,10 @@ struct cobalt_i2c_regs {
 /* I2C stop condition */
 #define M00018_CR_BITMAP_STO_MSK	(1 << 6)
 
-/* I2C read from slave */
+/* I2C read from target */
 #define M00018_CR_BITMAP_RD_MSK		(1 << 5)
 
-/* I2C write to slave */
+/* I2C write to target */
 #define M00018_CR_BITMAP_WR_MSK		(1 << 4)
 
 /* I2C ack */
@@ -59,7 +59,7 @@ struct cobalt_i2c_regs {
 
 /* SR[7:0] - Status register */
 
-/* Receive acknowledge from slave */
+/* Receive acknowledge from target */
 #define M00018_SR_BITMAP_RXACK_MSK	(1 << 7)
 
 /* Busy, I2C bus busy (as defined by start / stop bits) */
-- 
2.34.1
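
As an illustration of how the command-register masks touched by this patch
might be combined, here is a small sketch in C. Only the three bit positions
shown in the diff (STO, RD, WR) and the RXACK status bit are used. The helper
names are hypothetical, and the assumption that a set RXACK bit means "no
acknowledge received" is mine and is not confirmed by the hunk above.

```c
#include <stdint.h>

/* Bit positions as they appear in the cobalt-i2c.c hunk. */
#define CR_STO   (1u << 6) /* I2C stop condition */
#define CR_RD    (1u << 5) /* I2C read from target */
#define CR_WR    (1u << 4) /* I2C write to target */
#define SR_RXACK (1u << 7) /* receive acknowledge from target */

/* Command for writing one byte; add the stop bit on the last byte. */
uint8_t cr_for_write(int last)
{
	return (uint8_t)(CR_WR | (last ? CR_STO : 0));
}

/* Command for reading one byte; add the stop bit on the last byte. */
uint8_t cr_for_read(int last)
{
	return (uint8_t)(CR_RD | (last ? CR_STO : 0));
}

/* ASSUMPTION: RXACK set means the target did not acknowledge. */
int target_acked(uint8_t sr)
{
	return !(sr & SR_RXACK);
}
```

Keeping the mask arithmetic in small helpers like these makes the intent of
each register write visible at the call site.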


^ permalink raw reply related	[relevance 71%]

* [PATCH v2 04/12] media: au0828: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (2 preceding siblings ...)
  2024-05-03 18:13 23% ` [PATCH v2 03/12] drm/i915: " Easwar Hariharan
@ 2024-05-03 18:13 69% ` Easwar Hariharan
  2024-05-03 18:13 71% ` [PATCH v2 05/12] media: cobalt: " Easwar Hariharan
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Easwar Hariharan,
	open list:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/[1], fix the terminology for users
of the I2C_ALGOBIT bitbanging interface, now that the approved verbiage
exists in the specifications.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/usb/au0828/au0828-i2c.c   | 4 ++--
 drivers/media/usb/au0828/au0828-input.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/au0828/au0828-i2c.c b/drivers/media/usb/au0828/au0828-i2c.c
index 749f90d73b5b..743cb44f52aa 100644
--- a/drivers/media/usb/au0828/au0828-i2c.c
+++ b/drivers/media/usb/au0828/au0828-i2c.c
@@ -23,7 +23,7 @@ MODULE_PARM_DESC(i2c_scan, "scan i2c bus at insmod time");
 #define I2C_WAIT_DELAY 25
 #define I2C_WAIT_RETRY 1000
 
-static inline int i2c_slave_did_read_ack(struct i2c_adapter *i2c_adap)
+static inline int i2c_target_did_read_ack(struct i2c_adapter *i2c_adap)
 {
 	struct au0828_dev *dev = i2c_adap->algo_data;
 	return au0828_read(dev, AU0828_I2C_STATUS_201) &
@@ -35,7 +35,7 @@ static int i2c_wait_read_ack(struct i2c_adapter *i2c_adap)
 	int count;
 
 	for (count = 0; count < I2C_WAIT_RETRY; count++) {
-		if (!i2c_slave_did_read_ack(i2c_adap))
+		if (!i2c_target_did_read_ack(i2c_adap))
 			break;
 		udelay(I2C_WAIT_DELAY);
 	}
diff --git a/drivers/media/usb/au0828/au0828-input.c b/drivers/media/usb/au0828/au0828-input.c
index 3d3368202cd0..6c9e5ea795f2 100644
--- a/drivers/media/usb/au0828/au0828-input.c
+++ b/drivers/media/usb/au0828/au0828-input.c
@@ -30,7 +30,7 @@ struct au0828_rc {
 	int polling;
 	struct delayed_work work;
 
-	/* i2c slave address of external device (if used) */
+	/* i2c target address of external device (if used) */
 	u16 i2c_dev_addr;
 
 	int  (*get_key_i2c)(struct au0828_rc *ir);
-- 
2.34.1
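
The bounded polling pattern used by i2c_wait_read_ack() in this patch (check a
status predicate, delay, retry up to I2C_WAIT_RETRY times) can be sketched in
plain C as follows. This is a user-space illustration only: busy_fn,
poll_until_clear(), struct countdown and countdown_busy() are hypothetical
names, and the delay is left as a comment because udelay() is a kernel
facility.

```c
#include <stdbool.h>

/* Predicate that reports whether the hardware is still busy. */
typedef bool (*busy_fn)(void *ctx);

/*
 * Poll until busy() returns false or the retry budget is used up.
 * Returns 1 if the condition cleared in time, 0 on timeout.
 */
int poll_until_clear(busy_fn busy, void *ctx, int max_retry)
{
	int count;

	for (count = 0; count < max_retry; count++) {
		if (!busy(ctx))
			return 1;
		/* A driver would wait here, e.g. udelay(I2C_WAIT_DELAY). */
	}
	return 0;
}

/* Hypothetical status source: reports busy for the first N polls. */
struct countdown {
	int remaining;
};

bool countdown_busy(void *ctx)
{
	struct countdown *c = ctx;

	if (c->remaining > 0) {
		c->remaining--;
		return true;
	}
	return false;
}
```

With a countdown of 3 and a budget of 1000 retries the poll succeeds on the
fourth check; with a countdown of 5 and a budget of 3 it times out.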


^ permalink raw reply related	[relevance 69%]

* [PATCH v2 03/12] drm/i915: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
  2024-05-03 18:13 22% ` [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
  2024-05-03 18:13 46% ` [PATCH v2 02/12] drm/gma500: " Easwar Hariharan
@ 2024-05-03 18:13 23% ` Easwar Hariharan
    2024-05-03 18:13 69% ` [PATCH v2 04/12] media: au0828: " Easwar Hariharan
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Jani Nikula, Rodrigo Vivi, Joonas Lahtinen, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, Zhenyu Wang, Zhi Wang,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL GVT-g DRIVERS (Intel GPU Virtualization)
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan, Zhi Wang

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/[1], fix the terminology for users
of the I2C_ALGOBIT bitbanging interface, now that the approved verbiage
exists in the specifications.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Zhi Wang <zhiwang@kernel.org>
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/gpu/drm/i915/display/dvo_ch7017.c     | 14 ++++-----
 drivers/gpu/drm/i915/display/dvo_ch7xxx.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_ivch.c       | 16 +++++-----
 drivers/gpu/drm/i915/display/dvo_ns2501.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_sil164.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_tfp410.c     | 18 +++++------
 drivers/gpu/drm/i915/display/intel_bios.c     | 22 +++++++-------
 drivers/gpu/drm/i915/display/intel_ddi.c      |  2 +-
 .../gpu/drm/i915/display/intel_display_core.h |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi.h      |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c  | 20 ++++++-------
 drivers/gpu/drm/i915/display/intel_dvo.c      | 14 ++++-----
 drivers/gpu/drm/i915/display/intel_dvo_dev.h  |  2 +-
 drivers/gpu/drm/i915/display/intel_gmbus.c    |  4 +--
 drivers/gpu/drm/i915/display/intel_sdvo.c     | 30 +++++++++----------
 drivers/gpu/drm/i915/display/intel_vbt_defs.h |  4 +--
 drivers/gpu/drm/i915/gvt/edid.c               | 28 ++++++++---------
 drivers/gpu/drm/i915/gvt/edid.h               |  4 +--
 drivers/gpu/drm/i915/gvt/opregion.c           |  2 +-
 19 files changed, 119 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/dvo_ch7017.c b/drivers/gpu/drm/i915/display/dvo_ch7017.c
index d0c3880d7f80..493e730c685b 100644
--- a/drivers/gpu/drm/i915/display/dvo_ch7017.c
+++ b/drivers/gpu/drm/i915/display/dvo_ch7017.c
@@ -170,13 +170,13 @@ static bool ch7017_read(struct intel_dvo_device *dvo, u8 addr, u8 *val)
 {
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = &addr,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = val,
@@ -189,7 +189,7 @@ static bool ch7017_write(struct intel_dvo_device *dvo, u8 addr, u8 val)
 {
 	u8 buf[2] = { addr, val };
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = buf,
@@ -197,7 +197,7 @@ static bool ch7017_write(struct intel_dvo_device *dvo, u8 addr, u8 val)
 	return i2c_transfer(dvo->i2c_bus, &msg, 1) == 1;
 }
 
-/** Probes for a CH7017 on the given bus and slave address. */
+/** Probes for a CH7017 on the given bus and target address. */
 static bool ch7017_init(struct intel_dvo_device *dvo,
 			struct i2c_adapter *adapter)
 {
@@ -227,13 +227,13 @@ static bool ch7017_init(struct intel_dvo_device *dvo,
 		break;
 	default:
 		DRM_DEBUG_KMS("ch701x not detected, got %d: from %s "
-			      "slave %d.\n",
-			      val, adapter->name, dvo->slave_addr);
+			      "target %d.\n",
+			      val, adapter->name, dvo->target_addr);
 		goto fail;
 	}
 
 	DRM_DEBUG_KMS("%s detected on %s, addr %d\n",
-		      str, adapter->name, dvo->slave_addr);
+		      str, adapter->name, dvo->target_addr);
 	return true;
 
 fail:
diff --git a/drivers/gpu/drm/i915/display/dvo_ch7xxx.c b/drivers/gpu/drm/i915/display/dvo_ch7xxx.c
index 2e8e85da5a40..534b8544e0a4 100644
--- a/drivers/gpu/drm/i915/display/dvo_ch7xxx.c
+++ b/drivers/gpu/drm/i915/display/dvo_ch7xxx.c
@@ -153,13 +153,13 @@ static bool ch7xxx_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -176,7 +176,7 @@ static bool ch7xxx_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	if (!ch7xxx->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from %s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -188,7 +188,7 @@ static bool ch7xxx_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[2];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -202,7 +202,7 @@ static bool ch7xxx_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!ch7xxx->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -229,8 +229,8 @@ static bool ch7xxx_init(struct intel_dvo_device *dvo,
 
 	name = ch7xxx_get_id(vendor);
 	if (!name) {
-		DRM_DEBUG_KMS("ch7xxx not detected; got VID 0x%02x from %s slave %d.\n",
-			      vendor, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ch7xxx not detected; got VID 0x%02x from %s target %d.\n",
+			      vendor, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
@@ -240,8 +240,8 @@ static bool ch7xxx_init(struct intel_dvo_device *dvo,
 
 	devid = ch7xxx_get_did(device);
 	if (!devid) {
-		DRM_DEBUG_KMS("ch7xxx not detected; got DID 0x%02x from %s slave %d.\n",
-			      device, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ch7xxx not detected; got DID 0x%02x from %s target %d.\n",
+			      device, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
diff --git a/drivers/gpu/drm/i915/display/dvo_ivch.c b/drivers/gpu/drm/i915/display/dvo_ivch.c
index eef72bb3b767..0d5cce6051b1 100644
--- a/drivers/gpu/drm/i915/display/dvo_ivch.c
+++ b/drivers/gpu/drm/i915/display/dvo_ivch.c
@@ -198,7 +198,7 @@ static bool ivch_read(struct intel_dvo_device *dvo, int addr, u16 *data)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 0,
 		},
@@ -209,7 +209,7 @@ static bool ivch_read(struct intel_dvo_device *dvo, int addr, u16 *data)
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD | I2C_M_NOSTART,
 			.len = 2,
 			.buf = in_buf,
@@ -226,7 +226,7 @@ static bool ivch_read(struct intel_dvo_device *dvo, int addr, u16 *data)
 	if (!priv->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from "
 				"%s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -238,7 +238,7 @@ static bool ivch_write(struct intel_dvo_device *dvo, int addr, u16 data)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[3];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 3,
 		.buf = out_buf,
@@ -253,13 +253,13 @@ static bool ivch_write(struct intel_dvo_device *dvo, int addr, u16 data)
 
 	if (!priv->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
 }
 
-/* Probes the given bus and slave address for an ivch */
+/* Probes the given bus and target address for an ivch */
 static bool ivch_init(struct intel_dvo_device *dvo,
 		      struct i2c_adapter *adapter)
 {
@@ -283,10 +283,10 @@ static bool ivch_init(struct intel_dvo_device *dvo,
 	 * very unique, check that the value in the base address field matches
 	 * the address it's responding on.
 	 */
-	if ((temp & VR00_BASE_ADDRESS_MASK) != dvo->slave_addr) {
+	if ((temp & VR00_BASE_ADDRESS_MASK) != dvo->target_addr) {
 		DRM_DEBUG_KMS("ivch detect failed due to address mismatch "
 			  "(%d vs %d)\n",
-			  (temp & VR00_BASE_ADDRESS_MASK), dvo->slave_addr);
+			  (temp & VR00_BASE_ADDRESS_MASK), dvo->target_addr);
 		goto out;
 	}
 
diff --git a/drivers/gpu/drm/i915/display/dvo_ns2501.c b/drivers/gpu/drm/i915/display/dvo_ns2501.c
index 1df212fb000e..43fc0374fc7f 100644
--- a/drivers/gpu/drm/i915/display/dvo_ns2501.c
+++ b/drivers/gpu/drm/i915/display/dvo_ns2501.c
@@ -399,13 +399,13 @@ static bool ns2501_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-		 .addr = dvo->slave_addr,
+		 .addr = dvo->target_addr,
 		 .flags = 0,
 		 .len = 1,
 		 .buf = out_buf,
 		 },
 		{
-		 .addr = dvo->slave_addr,
+		 .addr = dvo->target_addr,
 		 .flags = I2C_M_RD,
 		 .len = 1,
 		 .buf = in_buf,
@@ -423,7 +423,7 @@ static bool ns2501_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 	if (!ns->quiet) {
 		DRM_DEBUG_KMS
 		    ("Unable to read register 0x%02x from %s:0x%02x.\n", addr,
-		     adapter->name, dvo->slave_addr);
+		     adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -442,7 +442,7 @@ static bool ns2501_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	u8 out_buf[2];
 
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -457,7 +457,7 @@ static bool ns2501_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!ns->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d\n",
-			      addr, adapter->name, dvo->slave_addr);
+			      addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -488,8 +488,8 @@ static bool ns2501_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (NS2501_VID & 0xff)) {
-		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Slave %d.\n",
-			      ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Target %d.\n",
+			      ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
@@ -497,8 +497,8 @@ static bool ns2501_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (NS2501_DID & 0xff)) {
-		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Slave %d.\n",
-			      ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Target %d.\n",
+			      ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 	ns->quiet = false;
diff --git a/drivers/gpu/drm/i915/display/dvo_sil164.c b/drivers/gpu/drm/i915/display/dvo_sil164.c
index 6c461024c8e3..a8dd40c00997 100644
--- a/drivers/gpu/drm/i915/display/dvo_sil164.c
+++ b/drivers/gpu/drm/i915/display/dvo_sil164.c
@@ -79,13 +79,13 @@ static bool sil164_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -102,7 +102,7 @@ static bool sil164_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	if (!sil->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from %s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -113,7 +113,7 @@ static bool sil164_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[2];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -127,7 +127,7 @@ static bool sil164_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!sil->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -153,8 +153,8 @@ static bool sil164_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (SIL164_VID & 0xff)) {
-		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Slave %d.\n",
-			  ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Target %d.\n",
+			  ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
@@ -162,8 +162,8 @@ static bool sil164_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (SIL164_DID & 0xff)) {
-		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Slave %d.\n",
-			  ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Target %d.\n",
+			  ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 	sil->quiet = false;
diff --git a/drivers/gpu/drm/i915/display/dvo_tfp410.c b/drivers/gpu/drm/i915/display/dvo_tfp410.c
index 0939e097f4f9..d9a0cd753a87 100644
--- a/drivers/gpu/drm/i915/display/dvo_tfp410.c
+++ b/drivers/gpu/drm/i915/display/dvo_tfp410.c
@@ -100,13 +100,13 @@ static bool tfp410_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -123,7 +123,7 @@ static bool tfp410_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	if (!tfp->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from %s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -134,7 +134,7 @@ static bool tfp410_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[2];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -148,7 +148,7 @@ static bool tfp410_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!tfp->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -183,15 +183,15 @@ static bool tfp410_init(struct intel_dvo_device *dvo,
 
 	if ((id = tfp410_getid(dvo, TFP410_VID_LO)) != TFP410_VID) {
 		DRM_DEBUG_KMS("tfp410 not detected got VID %X: from %s "
-				"Slave %d.\n",
-			  id, adapter->name, dvo->slave_addr);
+				"Target %d.\n",
+			  id, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
 	if ((id = tfp410_getid(dvo, TFP410_DID_LO)) != TFP410_DID) {
 		DRM_DEBUG_KMS("tfp410 not detected got DID %X: from %s "
-				"Slave %d.\n",
-			  id, adapter->name, dvo->slave_addr);
+				"Target %d.\n",
+			  id, adapter->name, dvo->target_addr);
 		goto out;
 	}
 	tfp->quiet = false;
diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 52bd3576835b..3d4ecfc9462a 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -69,8 +69,8 @@ struct intel_bios_encoder_data {
 	struct list_head node;
 };
 
-#define	SLAVE_ADDR1	0x70
-#define	SLAVE_ADDR2	0x72
+#define	TARGET_ADDR1	0x70
+#define	TARGET_ADDR2	0x72
 
 /* Get BDB block size given a pointer to Block ID. */
 static u32 _get_blocksize(const u8 *block_base)
@@ -1231,10 +1231,10 @@ parse_sdvo_device_mapping(struct drm_i915_private *i915)
 		const struct child_device_config *child = &devdata->child;
 		struct sdvo_device_mapping *mapping;
 
-		if (child->slave_addr != SLAVE_ADDR1 &&
-		    child->slave_addr != SLAVE_ADDR2) {
+		if (child->target_addr != TARGET_ADDR1 &&
+		    child->target_addr != TARGET_ADDR2) {
 			/*
-			 * If the slave address is neither 0x70 nor 0x72,
+			 * If the target address is neither 0x70 nor 0x72,
 			 * it is not a SDVO device. Skip it.
 			 */
 			continue;
@@ -1247,22 +1247,22 @@ parse_sdvo_device_mapping(struct drm_i915_private *i915)
 			continue;
 		}
 		drm_dbg_kms(&i915->drm,
-			    "the SDVO device with slave addr %2x is found on"
+			    "the SDVO device with target addr %2x is found on"
 			    " %s port\n",
-			    child->slave_addr,
+			    child->target_addr,
 			    (child->dvo_port == DEVICE_PORT_DVOB) ?
 			    "SDVOB" : "SDVOC");
 		mapping = &i915->display.vbt.sdvo_mappings[child->dvo_port - 1];
 		if (!mapping->initialized) {
 			mapping->dvo_port = child->dvo_port;
-			mapping->slave_addr = child->slave_addr;
+			mapping->target_addr = child->target_addr;
 			mapping->dvo_wiring = child->dvo_wiring;
 			mapping->ddc_pin = child->ddc_pin;
 			mapping->i2c_pin = child->i2c_pin;
 			mapping->initialized = 1;
 			drm_dbg_kms(&i915->drm,
 				    "SDVO device: dvo=%x, addr=%x, wiring=%d, ddc_pin=%d, i2c_pin=%d\n",
-				    mapping->dvo_port, mapping->slave_addr,
+				    mapping->dvo_port, mapping->target_addr,
 				    mapping->dvo_wiring, mapping->ddc_pin,
 				    mapping->i2c_pin);
 		} else {
@@ -1270,11 +1270,11 @@ parse_sdvo_device_mapping(struct drm_i915_private *i915)
 				    "Maybe one SDVO port is shared by "
 				    "two SDVO device.\n");
 		}
-		if (child->slave2_addr) {
+		if (child->target2_addr) {
 			/* Maybe this is a SDVO device with multiple inputs */
 			/* And the mapping info is not added */
 			drm_dbg_kms(&i915->drm,
-				    "there exists the slave2_addr. Maybe this"
+				    "there exists the target2_addr. Maybe this"
 				    " is a SDVO device with multiple inputs.\n");
 		}
 		count++;
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index c17462b4c2ac..64db211148a8 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4332,7 +4332,7 @@ static int intel_ddi_compute_config_late(struct intel_encoder *encoder,
 									connector->tile_group->id);
 
 	/*
-	 * EDP Transcoders cannot be ensalved
+	 * EDP Transcoders cannot be slaves
 	 * make them a master always when present
 	 */
 	if (port_sync_transcoders & BIT(TRANSCODER_EDP))
diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h b/drivers/gpu/drm/i915/display/intel_display_core.h
index 2167dbee5eea..5bfc91f0b563 100644
--- a/drivers/gpu/drm/i915/display/intel_display_core.h
+++ b/drivers/gpu/drm/i915/display/intel_display_core.h
@@ -236,7 +236,7 @@ struct intel_vbt_data {
 	struct sdvo_device_mapping {
 		u8 initialized;
 		u8 dvo_port;
-		u8 slave_addr;
+		u8 target_addr;
 		u8 dvo_wiring;
 		u8 i2c_pin;
 		u8 ddc_pin;
diff --git a/drivers/gpu/drm/i915/display/intel_dsi.h b/drivers/gpu/drm/i915/display/intel_dsi.h
index e99c94edfaae..e8ba4ccd99d3 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi.h
+++ b/drivers/gpu/drm/i915/display/intel_dsi.h
@@ -66,7 +66,7 @@ struct intel_dsi {
 	/* number of DSI lanes */
 	unsigned int lane_count;
 
-	/* i2c bus associated with the slave device */
+	/* i2c bus associated with the target device */
 	int i2c_bus_num;
 
 	/*
diff --git a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
index a5d7fc8418c9..fb0b02e30c8b 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+++ b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
@@ -56,7 +56,7 @@
 #define MIPI_PORT_SHIFT			3
 
 struct i2c_adapter_lookup {
-	u16 slave_addr;
+	u16 target_addr;
 	struct intel_dsi *intel_dsi;
 	acpi_handle dev_handle;
 };
@@ -443,7 +443,7 @@ static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
 	if (!i2c_acpi_get_i2c_resource(ares, &sb))
 		return 1;
 
-	if (lookup->slave_addr != sb->slave_address)
+	if (lookup->target_addr != sb->slave_address)
 		return 1;
 
 	status = acpi_get_handle(lookup->dev_handle,
@@ -460,12 +460,12 @@ static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
 }
 
 static void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
-				  const u16 slave_addr)
+				  const u16 target_addr)
 {
 	struct drm_device *drm_dev = intel_dsi->base.base.dev;
 	struct acpi_device *adev = ACPI_COMPANION(drm_dev->dev);
 	struct i2c_adapter_lookup lookup = {
-		.slave_addr = slave_addr,
+		.target_addr = target_addr,
 		.intel_dsi = intel_dsi,
 		.dev_handle = acpi_device_handle(adev),
 	};
@@ -476,7 +476,7 @@ static void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
 }
 #else
 static inline void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
-					 const u16 slave_addr)
+					 const u16 target_addr)
 {
 }
 #endif
@@ -488,17 +488,17 @@ static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
 	struct i2c_msg msg;
 	int ret;
 	u8 vbt_i2c_bus_num = *(data + 2);
-	u16 slave_addr = *(u16 *)(data + 3);
+	u16 target_addr = *(u16 *)(data + 3);
 	u8 reg_offset = *(data + 5);
 	u8 payload_size = *(data + 6);
 	u8 *payload_data;
 
-	drm_dbg_kms(&i915->drm, "bus %d client-addr 0x%02x reg 0x%02x data %*ph\n",
-		    vbt_i2c_bus_num, slave_addr, reg_offset, payload_size, data + 7);
+	drm_dbg_kms(&i915->drm, "bus %d target-addr 0x%02x reg 0x%02x data %*ph\n",
+		    vbt_i2c_bus_num, target_addr, reg_offset, payload_size, data + 7);
 
 	if (intel_dsi->i2c_bus_num < 0) {
 		intel_dsi->i2c_bus_num = vbt_i2c_bus_num;
-		i2c_acpi_find_adapter(intel_dsi, slave_addr);
+		i2c_acpi_find_adapter(intel_dsi, target_addr);
 	}
 
 	adapter = i2c_get_adapter(intel_dsi->i2c_bus_num);
@@ -514,7 +514,7 @@ static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
 	payload_data[0] = reg_offset;
 	memcpy(&payload_data[1], (data + 7), payload_size);
 
-	msg.addr = slave_addr;
+	msg.addr = target_addr;
 	msg.flags = 0;
 	msg.len = payload_size + 1;
 	msg.buf = payload_data;
diff --git a/drivers/gpu/drm/i915/display/intel_dvo.c b/drivers/gpu/drm/i915/display/intel_dvo.c
index c076da75b066..8d4c8f33f776 100644
--- a/drivers/gpu/drm/i915/display/intel_dvo.c
+++ b/drivers/gpu/drm/i915/display/intel_dvo.c
@@ -60,42 +60,42 @@ static const struct intel_dvo_device intel_dvo_devices[] = {
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "sil164",
 		.port = PORT_C,
-		.slave_addr = SIL164_ADDR,
+		.target_addr = SIL164_ADDR,
 		.dev_ops = &sil164_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "ch7xxx",
 		.port = PORT_C,
-		.slave_addr = CH7xxx_ADDR,
+		.target_addr = CH7xxx_ADDR,
 		.dev_ops = &ch7xxx_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "ch7xxx",
 		.port = PORT_C,
-		.slave_addr = 0x75, /* For some ch7010 */
+		.target_addr = 0x75, /* For some ch7010 */
 		.dev_ops = &ch7xxx_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_LVDS,
 		.name = "ivch",
 		.port = PORT_A,
-		.slave_addr = 0x02, /* Might also be 0x44, 0x84, 0xc4 */
+		.target_addr = 0x02, /* Might also be 0x44, 0x84, 0xc4 */
 		.dev_ops = &ivch_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "tfp410",
 		.port = PORT_C,
-		.slave_addr = TFP410_ADDR,
+		.target_addr = TFP410_ADDR,
 		.dev_ops = &tfp410_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_LVDS,
 		.name = "ch7017",
 		.port = PORT_C,
-		.slave_addr = 0x75,
+		.target_addr = 0x75,
 		.gpio = GMBUS_PIN_DPB,
 		.dev_ops = &ch7017_ops,
 	},
@@ -103,7 +103,7 @@ static const struct intel_dvo_device intel_dvo_devices[] = {
 		.type = INTEL_DVO_CHIP_LVDS_NO_FIXED,
 		.name = "ns2501",
 		.port = PORT_B,
-		.slave_addr = NS2501_ADDR,
+		.target_addr = NS2501_ADDR,
 		.dev_ops = &ns2501_ops,
 	},
 };
diff --git a/drivers/gpu/drm/i915/display/intel_dvo_dev.h b/drivers/gpu/drm/i915/display/intel_dvo_dev.h
index af7b04539b93..4bf476656b8c 100644
--- a/drivers/gpu/drm/i915/display/intel_dvo_dev.h
+++ b/drivers/gpu/drm/i915/display/intel_dvo_dev.h
@@ -38,7 +38,7 @@ struct intel_dvo_device {
 	enum port port;
 	/* GPIO register used for i2c bus to control this device */
 	u32 gpio;
-	int slave_addr;
+	int target_addr;
 
 	const struct intel_dvo_dev_ops *dev_ops;
 	void *dev_priv;
diff --git a/drivers/gpu/drm/i915/display/intel_gmbus.c b/drivers/gpu/drm/i915/display/intel_gmbus.c
index d3e03ed5b79c..fe9a3c1f0072 100644
--- a/drivers/gpu/drm/i915/display/intel_gmbus.c
+++ b/drivers/gpu/drm/i915/display/intel_gmbus.c
@@ -478,7 +478,7 @@ gmbus_xfer_read_chunk(struct drm_i915_private *i915,
 /*
  * HW spec says that 512Bytes in Burst read need special treatment.
  * But it doesn't talk about other multiple of 256Bytes. And couldn't locate
- * an I2C slave, which supports such a lengthy burst read too for experiments.
+ * an I2C target, which supports such a lengthy burst read too for experiments.
  *
  * So until things get clarified on HW support, to avoid the burst read length
  * in fold of 256Bytes except 512, max burst read length is fixed at 767Bytes.
@@ -701,7 +701,7 @@ do_gmbus_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs, int num,
 
 	/* Toggle the Software Clear Interrupt bit. This has the effect
 	 * of resetting the GMBUS controller and so clearing the
-	 * BUS_ERROR raised by the slave's NAK.
+	 * BUS_ERROR raised by the target's NAK.
 	 */
 	intel_de_write_fw(i915, GMBUS1(i915), GMBUS_SW_CLR_INT);
 	intel_de_write_fw(i915, GMBUS1(i915), 0);
diff --git a/drivers/gpu/drm/i915/display/intel_sdvo.c b/drivers/gpu/drm/i915/display/intel_sdvo.c
index 0cd9c183f621..cb50bf9c211d 100644
--- a/drivers/gpu/drm/i915/display/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/display/intel_sdvo.c
@@ -95,7 +95,7 @@ struct intel_sdvo {
 	struct intel_encoder base;
 
 	struct i2c_adapter *i2c;
-	u8 slave_addr;
+	u8 target_addr;
 
 	struct intel_sdvo_ddc ddc[3];
 
@@ -255,13 +255,13 @@ static bool intel_sdvo_read_byte(struct intel_sdvo *intel_sdvo, u8 addr, u8 *ch)
 	struct drm_i915_private *i915 = to_i915(intel_sdvo->base.base.dev);
 	struct i2c_msg msgs[] = {
 		{
-			.addr = intel_sdvo->slave_addr,
+			.addr = intel_sdvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = &addr,
 		},
 		{
-			.addr = intel_sdvo->slave_addr,
+			.addr = intel_sdvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = ch,
@@ -483,14 +483,14 @@ static bool __intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
 	intel_sdvo_debug_write(intel_sdvo, cmd, args, args_len);
 
 	for (i = 0; i < args_len; i++) {
-		msgs[i].addr = intel_sdvo->slave_addr;
+		msgs[i].addr = intel_sdvo->target_addr;
 		msgs[i].flags = 0;
 		msgs[i].len = 2;
 		msgs[i].buf = buf + 2 *i;
 		buf[2*i + 0] = SDVO_I2C_ARG_0 - i;
 		buf[2*i + 1] = ((u8*)args)[i];
 	}
-	msgs[i].addr = intel_sdvo->slave_addr;
+	msgs[i].addr = intel_sdvo->target_addr;
 	msgs[i].flags = 0;
 	msgs[i].len = 2;
 	msgs[i].buf = buf + 2*i;
@@ -499,12 +499,12 @@ static bool __intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
 
 	/* the following two are to read the response */
 	status = SDVO_I2C_CMD_STATUS;
-	msgs[i+1].addr = intel_sdvo->slave_addr;
+	msgs[i+1].addr = intel_sdvo->target_addr;
 	msgs[i+1].flags = 0;
 	msgs[i+1].len = 1;
 	msgs[i+1].buf = &status;
 
-	msgs[i+2].addr = intel_sdvo->slave_addr;
+	msgs[i+2].addr = intel_sdvo->target_addr;
 	msgs[i+2].flags = I2C_M_RD;
 	msgs[i+2].len = 1;
 	msgs[i+2].buf = &status;
@@ -2655,9 +2655,9 @@ intel_sdvo_select_i2c_bus(struct intel_sdvo *sdvo)
 	else
 		pin = GMBUS_PIN_DPB;
 
-	drm_dbg_kms(&dev_priv->drm, "[ENCODER:%d:%s] I2C pin %d, slave addr 0x%x\n",
+	drm_dbg_kms(&dev_priv->drm, "[ENCODER:%d:%s] I2C pin %d, target addr 0x%x\n",
 		    sdvo->base.base.base.id, sdvo->base.base.name,
-		    pin, sdvo->slave_addr);
+		    pin, sdvo->target_addr);
 
 	sdvo->i2c = intel_gmbus_get_adapter(dev_priv, pin);
 
@@ -2683,7 +2683,7 @@ intel_sdvo_is_hdmi_connector(struct intel_sdvo *intel_sdvo)
 }
 
 static u8
-intel_sdvo_get_slave_addr(struct intel_sdvo *sdvo)
+intel_sdvo_get_target_addr(struct intel_sdvo *sdvo)
 {
 	struct drm_i915_private *dev_priv = to_i915(sdvo->base.base.dev);
 	const struct sdvo_device_mapping *my_mapping, *other_mapping;
@@ -2697,15 +2697,15 @@ intel_sdvo_get_slave_addr(struct intel_sdvo *sdvo)
 	}
 
 	/* If the BIOS described our SDVO device, take advantage of it. */
-	if (my_mapping->slave_addr)
-		return my_mapping->slave_addr;
+	if (my_mapping->target_addr)
+		return my_mapping->target_addr;
 
 	/*
 	 * If the BIOS only described a different SDVO device, use the
 	 * address that it isn't using.
 	 */
-	if (other_mapping->slave_addr) {
-		if (other_mapping->slave_addr == 0x70)
+	if (other_mapping->target_addr) {
+		if (other_mapping->target_addr == 0x70)
 			return 0x72;
 		else
 			return 0x70;
@@ -3408,7 +3408,7 @@ bool intel_sdvo_init(struct drm_i915_private *dev_priv,
 			 "SDVO %c", port_name(port));
 
 	intel_sdvo->sdvo_reg = sdvo_reg;
-	intel_sdvo->slave_addr = intel_sdvo_get_slave_addr(intel_sdvo) >> 1;
+	intel_sdvo->target_addr = intel_sdvo_get_target_addr(intel_sdvo) >> 1;
 
 	intel_sdvo_select_i2c_bus(intel_sdvo);
 
diff --git a/drivers/gpu/drm/i915/display/intel_vbt_defs.h b/drivers/gpu/drm/i915/display/intel_vbt_defs.h
index a9f44abfc9fc..c0d5aae980a8 100644
--- a/drivers/gpu/drm/i915/display/intel_vbt_defs.h
+++ b/drivers/gpu/drm/i915/display/intel_vbt_defs.h
@@ -432,7 +432,7 @@ struct child_device_config {
 	u16 addin_offset;
 	u8 dvo_port; /* See DEVICE_PORT_* and DVO_PORT_* above */
 	u8 i2c_pin;
-	u8 slave_addr;
+	u8 target_addr;
 	u8 ddc_pin;
 	u16 edid_ptr;
 	u8 dvo_cfg; /* See DEVICE_CFG_* above */
@@ -441,7 +441,7 @@ struct child_device_config {
 		struct {
 			u8 dvo2_port;
 			u8 i2c2_pin;
-			u8 slave2_addr;
+			u8 target2_addr;
 			u8 ddc2_pin;
 		} __packed;
 		struct {
diff --git a/drivers/gpu/drm/i915/gvt/edid.c b/drivers/gpu/drm/i915/gvt/edid.c
index af9afdb53c7f..c022dc736045 100644
--- a/drivers/gpu/drm/i915/gvt/edid.c
+++ b/drivers/gpu/drm/i915/gvt/edid.c
@@ -42,8 +42,8 @@
 #define GMBUS1_TOTAL_BYTES_MASK 0x1ff
 #define gmbus1_total_byte_count(v) (((v) >> \
 	GMBUS1_TOTAL_BYTES_SHIFT) & GMBUS1_TOTAL_BYTES_MASK)
-#define gmbus1_slave_addr(v) (((v) & 0xff) >> 1)
-#define gmbus1_slave_index(v) (((v) >> 8) & 0xff)
+#define gmbus1_target_addr(v) (((v) & 0xff) >> 1)
+#define gmbus1_target_index(v) (((v) >> 8) & 0xff)
 #define gmbus1_bus_cycle(v) (((v) >> 25) & 0x7)
 
 /* GMBUS0 bits definitions */
@@ -54,7 +54,7 @@ static unsigned char edid_get_byte(struct intel_vgpu *vgpu)
 	struct intel_vgpu_i2c_edid *edid = &vgpu->display.i2c_edid;
 	unsigned char chr = 0;
 
-	if (edid->state == I2C_NOT_SPECIFIED || !edid->slave_selected) {
+	if (edid->state == I2C_NOT_SPECIFIED || !edid->target_selected) {
 		gvt_vgpu_err("Driver tries to read EDID without proper sequence!\n");
 		return 0;
 	}
@@ -179,7 +179,7 @@ static int gmbus1_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 		void *p_data, unsigned int bytes)
 {
 	struct intel_vgpu_i2c_edid *i2c_edid = &vgpu->display.i2c_edid;
-	u32 slave_addr;
+	u32 target_addr;
 	u32 wvalue = *(u32 *)p_data;
 
 	if (vgpu_vreg(vgpu, offset) & GMBUS_SW_CLR_INT) {
@@ -210,21 +210,21 @@ static int gmbus1_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 
 		i2c_edid->gmbus.total_byte_count =
 			gmbus1_total_byte_count(wvalue);
-		slave_addr = gmbus1_slave_addr(wvalue);
+		target_addr = gmbus1_target_addr(wvalue);
 
 		/* vgpu gmbus only support EDID */
-		if (slave_addr == EDID_ADDR) {
-			i2c_edid->slave_selected = true;
-		} else if (slave_addr != 0) {
+		if (target_addr == EDID_ADDR) {
+			i2c_edid->target_selected = true;
+		} else if (target_addr != 0) {
 			gvt_dbg_dpy(
-				"vgpu%d: unsupported gmbus slave addr(0x%x)\n"
+				"vgpu%d: unsupported gmbus target addr(0x%x)\n"
 				"	gmbus operations will be ignored.\n",
-					vgpu->id, slave_addr);
+					vgpu->id, target_addr);
 		}
 
 		if (wvalue & GMBUS_CYCLE_INDEX)
 			i2c_edid->current_edid_read =
-				gmbus1_slave_index(wvalue);
+				gmbus1_target_index(wvalue);
 
 		i2c_edid->gmbus.cycle_type = gmbus1_bus_cycle(wvalue);
 		switch (gmbus1_bus_cycle(wvalue)) {
@@ -523,7 +523,7 @@ void intel_gvt_i2c_handle_aux_ch_write(struct intel_vgpu *vgpu,
 			} else if (addr == EDID_ADDR) {
 				i2c_edid->state = I2C_AUX_CH;
 				i2c_edid->port = port_idx;
-				i2c_edid->slave_selected = true;
+				i2c_edid->target_selected = true;
 				if (intel_vgpu_has_monitor_on_port(vgpu,
 					port_idx) &&
 					intel_vgpu_port_is_dp(vgpu, port_idx))
@@ -542,7 +542,7 @@ void intel_gvt_i2c_handle_aux_ch_write(struct intel_vgpu *vgpu,
 			return;
 		if (drm_WARN_ON(&i915->drm, msg_length != 4))
 			return;
-		if (i2c_edid->edid_available && i2c_edid->slave_selected) {
+		if (i2c_edid->edid_available && i2c_edid->target_selected) {
 			unsigned char val = edid_get_byte(vgpu);
 
 			aux_data_for_write = (val << 16);
@@ -571,7 +571,7 @@ void intel_vgpu_init_i2c_edid(struct intel_vgpu *vgpu)
 	edid->state = I2C_NOT_SPECIFIED;
 
 	edid->port = -1;
-	edid->slave_selected = false;
+	edid->target_selected = false;
 	edid->edid_available = false;
 	edid->current_edid_read = 0;
 
diff --git a/drivers/gpu/drm/i915/gvt/edid.h b/drivers/gpu/drm/i915/gvt/edid.h
index dfe0cbc6aad8..c3b5a55aecb3 100644
--- a/drivers/gpu/drm/i915/gvt/edid.h
+++ b/drivers/gpu/drm/i915/gvt/edid.h
@@ -80,7 +80,7 @@ enum gmbus_cycle_type {
  *      R/W Protect
  *      Command and Status.
  *      bit0 is the direction bit: 1 is read; 0 is write.
- *      bit1 - bit7 is slave 7-bit address.
+ *      bit1 - bit7 is target 7-bit address.
  *      bit16 - bit24 total byte count (ignore?)
  *
  * GMBUS2:
@@ -130,7 +130,7 @@ struct intel_vgpu_i2c_edid {
 	enum i2c_state state;
 
 	unsigned int port;
-	bool slave_selected;
+	bool target_selected;
 	bool edid_available;
 	unsigned int current_edid_read;
 
diff --git a/drivers/gpu/drm/i915/gvt/opregion.c b/drivers/gpu/drm/i915/gvt/opregion.c
index d2bed466540a..908f910420c2 100644
--- a/drivers/gpu/drm/i915/gvt/opregion.c
+++ b/drivers/gpu/drm/i915/gvt/opregion.c
@@ -86,7 +86,7 @@ struct efp_child_device_config {
 	u8 skip2;
 	u8 dvo_port;
 	u8 i2c_pin; /* for add-in card */
-	u8 slave_addr; /* for add-in card */
+	u8 target_addr; /* for add-in card */
 	u8 ddc_pin;
 	u16 edid_ptr;
 	u8 dvo_config;
-- 
2.34.1


^ permalink raw reply related	[relevance 23%]
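
For readers following the gvt/edid.c hunks above, here is a minimal
user-space sketch of how the renamed gmbus1_target_addr() and
gmbus1_target_index() macros split a GMBUS1 write into fields. It is
not part of the patch. The helper names gmbus1_decode(), gmbus1_encode()
and struct gmbus1_fields are hypothetical. The shift value 16 is an
assumption taken from the "bit16 - bit24 total byte count" comment in
edid.h, and EDID_ADDR = 0x50 is assumed to be the usual DDC address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift is an assumption (from the "bit16 - bit24" comment); mask is from the patch context. */
#define GMBUS1_TOTAL_BYTES_SHIFT 16
#define GMBUS1_TOTAL_BYTES_MASK  0x1ff
#define EDID_ADDR                0x50	/* assumption: conventional DDC/EDID address */

/* Hypothetical container for the decoded GMBUS1 fields. */
struct gmbus1_fields {
	uint32_t target_addr;	/* bits 1..7: 7-bit I2C target address */
	bool is_read;		/* bit 0: 1 is read, 0 is write */
	uint32_t target_index;	/* bits 8..15 */
	uint32_t total_bytes;	/* bits 16..24 */
	uint32_t bus_cycle;	/* bits 25..27 */
};

/* Mirrors the macro arithmetic shown in the gvt/edid.c hunk. */
static struct gmbus1_fields gmbus1_decode(uint32_t v)
{
	struct gmbus1_fields f = {
		.target_addr = (v & 0xff) >> 1,
		.is_read = (v & 0x1) != 0,
		.target_index = (v >> 8) & 0xff,
		.total_bytes = (v >> GMBUS1_TOTAL_BYTES_SHIFT) &
			       GMBUS1_TOTAL_BYTES_MASK,
		.bus_cycle = (v >> 25) & 0x7,
	};

	return f;
}

/* Inverse of gmbus1_decode(); rejects values that do not fit their field. */
static uint32_t gmbus1_encode(uint8_t addr7, bool is_read, uint8_t index,
			      uint16_t total_bytes, uint8_t bus_cycle)
{
	assert(addr7 <= 0x7f);
	assert(total_bytes <= GMBUS1_TOTAL_BYTES_MASK);
	assert(bus_cycle <= 0x7);

	return ((uint32_t)addr7 << 1) | (is_read ? 1u : 0u) |
	       ((uint32_t)index << 8) |
	       ((uint32_t)total_bytes << GMBUS1_TOTAL_BYTES_SHIFT) |
	       ((uint32_t)bus_cycle << 25);
}
```

The low byte carries the address in its 8-bit wire form (7-bit address
plus direction bit), which is why the macro shifts right by one before
the comparison against EDID_ADDR.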

* [PATCH v2 02/12] drm/gma500: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
  2024-05-03 18:13 22% ` [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
@ 2024-05-03 18:13 46% ` Easwar Hariharan
  2024-05-03 18:13 23% ` [PATCH v2 03/12] drm/i915: " Easwar Hariharan
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Patrik Jakobsson, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter, dri-devel,
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/ [1], fix the terminology for
users of the I2C_ALGOBIT bitbanging interface, now that the approved
verbiage exists in the specifications.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/gpu/drm/gma500/cdv_intel_lvds.c |  2 +-
 drivers/gpu/drm/gma500/intel_bios.c     | 22 ++++++++++-----------
 drivers/gpu/drm/gma500/intel_bios.h     |  4 ++--
 drivers/gpu/drm/gma500/intel_gmbus.c    |  2 +-
 drivers/gpu/drm/gma500/psb_drv.h        |  2 +-
 drivers/gpu/drm/gma500/psb_intel_drv.h  |  2 +-
 drivers/gpu/drm/gma500/psb_intel_lvds.c |  4 ++--
 drivers/gpu/drm/gma500/psb_intel_sdvo.c | 26 ++++++++++++-------------
 8 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_intel_lvds.c b/drivers/gpu/drm/gma500/cdv_intel_lvds.c
index f08a6803dc18..c7652a02b42e 100644
--- a/drivers/gpu/drm/gma500/cdv_intel_lvds.c
+++ b/drivers/gpu/drm/gma500/cdv_intel_lvds.c
@@ -565,7 +565,7 @@ void cdv_intel_lvds_init(struct drm_device *dev,
 			dev->dev, "I2C bus registration failed.\n");
 		goto err_encoder_cleanup;
 	}
-	gma_encoder->i2c_bus->slave_addr = 0x2C;
+	gma_encoder->i2c_bus->target_addr = 0x2C;
 	dev_priv->lvds_i2c_bus = gma_encoder->i2c_bus;
 
 	/*
diff --git a/drivers/gpu/drm/gma500/intel_bios.c b/drivers/gpu/drm/gma500/intel_bios.c
index 8245b5603d2c..d5924ca3ed05 100644
--- a/drivers/gpu/drm/gma500/intel_bios.c
+++ b/drivers/gpu/drm/gma500/intel_bios.c
@@ -14,8 +14,8 @@
 #include "psb_intel_drv.h"
 #include "psb_intel_reg.h"
 
-#define	SLAVE_ADDR1	0x70
-#define	SLAVE_ADDR2	0x72
+#define	TARGET_ADDR1	0x70
+#define	TARGET_ADDR2	0x72
 
 static void *find_section(struct bdb_header *bdb, int section_id)
 {
@@ -357,10 +357,10 @@ parse_sdvo_device_mapping(struct drm_psb_private *dev_priv,
 			/* skip the device block if device type is invalid */
 			continue;
 		}
-		if (p_child->slave_addr != SLAVE_ADDR1 &&
-			p_child->slave_addr != SLAVE_ADDR2) {
+		if (p_child->target_addr != TARGET_ADDR1 &&
+			p_child->target_addr != TARGET_ADDR2) {
 			/*
-			 * If the slave address is neither 0x70 nor 0x72,
+			 * If the target address is neither 0x70 nor 0x72,
 			 * it is not a SDVO device. Skip it.
 			 */
 			continue;
@@ -371,22 +371,22 @@ parse_sdvo_device_mapping(struct drm_psb_private *dev_priv,
 			DRM_DEBUG_KMS("Incorrect SDVO port. Skip it\n");
 			continue;
 		}
-		DRM_DEBUG_KMS("the SDVO device with slave addr %2x is found on"
+		DRM_DEBUG_KMS("the SDVO device with target addr %2x is found on"
 				" %s port\n",
-				p_child->slave_addr,
+				p_child->target_addr,
 				(p_child->dvo_port == DEVICE_PORT_DVOB) ?
 					"SDVOB" : "SDVOC");
 		p_mapping = &(dev_priv->sdvo_mappings[p_child->dvo_port - 1]);
 		if (!p_mapping->initialized) {
 			p_mapping->dvo_port = p_child->dvo_port;
-			p_mapping->slave_addr = p_child->slave_addr;
+			p_mapping->target_addr = p_child->target_addr;
 			p_mapping->dvo_wiring = p_child->dvo_wiring;
 			p_mapping->ddc_pin = p_child->ddc_pin;
 			p_mapping->i2c_pin = p_child->i2c_pin;
 			p_mapping->initialized = 1;
 			DRM_DEBUG_KMS("SDVO device: dvo=%x, addr=%x, wiring=%d, ddc_pin=%d, i2c_pin=%d\n",
 				      p_mapping->dvo_port,
-				      p_mapping->slave_addr,
+				      p_mapping->target_addr,
 				      p_mapping->dvo_wiring,
 				      p_mapping->ddc_pin,
 				      p_mapping->i2c_pin);
@@ -394,10 +394,10 @@ parse_sdvo_device_mapping(struct drm_psb_private *dev_priv,
 			DRM_DEBUG_KMS("Maybe one SDVO port is shared by "
 					 "two SDVO device.\n");
 		}
-		if (p_child->slave2_addr) {
+		if (p_child->target2_addr) {
 			/* Maybe this is a SDVO device with multiple inputs */
 			/* And the mapping info is not added */
-			DRM_DEBUG_KMS("there exists the slave2_addr. Maybe this"
+			DRM_DEBUG_KMS("there exists the target2_addr. Maybe this"
 				" is a SDVO device with multiple inputs.\n");
 		}
 		count++;
diff --git a/drivers/gpu/drm/gma500/intel_bios.h b/drivers/gpu/drm/gma500/intel_bios.h
index 0e6facf21e33..b5adea2a20c3 100644
--- a/drivers/gpu/drm/gma500/intel_bios.h
+++ b/drivers/gpu/drm/gma500/intel_bios.h
@@ -186,13 +186,13 @@ struct child_device_config {
 	u16 addin_offset;
 	u8  dvo_port; /* See Device_PORT_* above */
 	u8  i2c_pin;
-	u8  slave_addr;
+	u8  target_addr;
 	u8  ddc_pin;
 	u16 edid_ptr;
 	u8  dvo_cfg; /* See DEVICE_CFG_* above */
 	u8  dvo2_port;
 	u8  i2c2_pin;
-	u8  slave2_addr;
+	u8  target2_addr;
 	u8  ddc2_pin;
 	u8  capabilities;
 	u8  dvo_wiring;/* See DEVICE_WIRE_* above */
diff --git a/drivers/gpu/drm/gma500/intel_gmbus.c b/drivers/gpu/drm/gma500/intel_gmbus.c
index aa45509859f2..ee8b047587f2 100644
--- a/drivers/gpu/drm/gma500/intel_gmbus.c
+++ b/drivers/gpu/drm/gma500/intel_gmbus.c
@@ -333,7 +333,7 @@ gmbus_xfer(struct i2c_adapter *adapter,
 clear_err:
 	/* Toggle the Software Clear Interrupt bit. This has the effect
 	 * of resetting the GMBUS controller and so clearing the
-	 * BUS_ERROR raised by the slave's NAK.
+	 * BUS_ERROR raised by the target's NAK.
 	 */
 	GMBUS_REG_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT);
 	GMBUS_REG_WRITE(GMBUS1 + reg_offset, 0);
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 83c17689c454..bddf89b82fec 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -202,7 +202,7 @@ struct psb_intel_opregion {
 struct sdvo_device_mapping {
 	u8 initialized;
 	u8 dvo_port;
-	u8 slave_addr;
+	u8 target_addr;
 	u8 dvo_wiring;
 	u8 i2c_pin;
 	u8 i2c_speed;
diff --git a/drivers/gpu/drm/gma500/psb_intel_drv.h b/drivers/gpu/drm/gma500/psb_intel_drv.h
index c111e933e1ed..2499fd6a80c9 100644
--- a/drivers/gpu/drm/gma500/psb_intel_drv.h
+++ b/drivers/gpu/drm/gma500/psb_intel_drv.h
@@ -80,7 +80,7 @@ struct psb_intel_mode_device {
 struct gma_i2c_chan {
 	struct i2c_adapter base;
 	struct i2c_algo_bit_data algo;
-	u8 slave_addr;
+	u8 target_addr;
 
 	/* for getting at dev. private (mmio etc.) */
 	struct drm_device *drm_dev;
diff --git a/drivers/gpu/drm/gma500/psb_intel_lvds.c b/drivers/gpu/drm/gma500/psb_intel_lvds.c
index 8486de230ec9..d1cd9a940395 100644
--- a/drivers/gpu/drm/gma500/psb_intel_lvds.c
+++ b/drivers/gpu/drm/gma500/psb_intel_lvds.c
@@ -97,7 +97,7 @@ static int psb_lvds_i2c_set_brightness(struct drm_device *dev,
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = lvds_i2c_bus->slave_addr,
+			.addr = lvds_i2c_bus->target_addr,
 			.flags = 0,
 			.len = 2,
 			.buf = out_buf,
@@ -707,7 +707,7 @@ void psb_intel_lvds_init(struct drm_device *dev,
 			dev->dev, "I2C bus registration failed.\n");
 		goto err_encoder_cleanup;
 	}
-	lvds_priv->i2c_bus->slave_addr = 0x2C;
+	lvds_priv->i2c_bus->target_addr = 0x2C;
 	dev_priv->lvds_i2c_bus =  lvds_priv->i2c_bus;
 
 	/*
diff --git a/drivers/gpu/drm/gma500/psb_intel_sdvo.c b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
index e4f914deceba..8dafff963ca8 100644
--- a/drivers/gpu/drm/gma500/psb_intel_sdvo.c
+++ b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
@@ -70,7 +70,7 @@ struct psb_intel_sdvo {
 	struct gma_encoder base;
 
 	struct i2c_adapter *i2c;
-	u8 slave_addr;
+	u8 target_addr;
 
 	struct i2c_adapter ddc;
 
@@ -259,13 +259,13 @@ static bool psb_intel_sdvo_read_byte(struct psb_intel_sdvo *psb_intel_sdvo, u8 a
 {
 	struct i2c_msg msgs[] = {
 		{
-			.addr = psb_intel_sdvo->slave_addr,
+			.addr = psb_intel_sdvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = &addr,
 		},
 		{
-			.addr = psb_intel_sdvo->slave_addr,
+			.addr = psb_intel_sdvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = ch,
@@ -463,14 +463,14 @@ static bool psb_intel_sdvo_write_cmd(struct psb_intel_sdvo *psb_intel_sdvo, u8 c
 	psb_intel_sdvo_debug_write(psb_intel_sdvo, cmd, args, args_len);
 
 	for (i = 0; i < args_len; i++) {
-		msgs[i].addr = psb_intel_sdvo->slave_addr;
+		msgs[i].addr = psb_intel_sdvo->target_addr;
 		msgs[i].flags = 0;
 		msgs[i].len = 2;
 		msgs[i].buf = buf + 2 *i;
 		buf[2*i + 0] = SDVO_I2C_ARG_0 - i;
 		buf[2*i + 1] = ((u8*)args)[i];
 	}
-	msgs[i].addr = psb_intel_sdvo->slave_addr;
+	msgs[i].addr = psb_intel_sdvo->target_addr;
 	msgs[i].flags = 0;
 	msgs[i].len = 2;
 	msgs[i].buf = buf + 2*i;
@@ -479,12 +479,12 @@ static bool psb_intel_sdvo_write_cmd(struct psb_intel_sdvo *psb_intel_sdvo, u8 c
 
 	/* the following two are to read the response */
 	status = SDVO_I2C_CMD_STATUS;
-	msgs[i+1].addr = psb_intel_sdvo->slave_addr;
+	msgs[i+1].addr = psb_intel_sdvo->target_addr;
 	msgs[i+1].flags = 0;
 	msgs[i+1].len = 1;
 	msgs[i+1].buf = &status;
 
-	msgs[i+2].addr = psb_intel_sdvo->slave_addr;
+	msgs[i+2].addr = psb_intel_sdvo->target_addr;
 	msgs[i+2].flags = I2C_M_RD;
 	msgs[i+2].len = 1;
 	msgs[i+2].buf = &status;
@@ -1899,7 +1899,7 @@ psb_intel_sdvo_is_hdmi_connector(struct psb_intel_sdvo *psb_intel_sdvo, int devi
 }
 
 static u8
-psb_intel_sdvo_get_slave_addr(struct drm_device *dev, int sdvo_reg)
+psb_intel_sdvo_get_target_addr(struct drm_device *dev, int sdvo_reg)
 {
 	struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
 	struct sdvo_device_mapping *my_mapping, *other_mapping;
@@ -1913,14 +1913,14 @@ psb_intel_sdvo_get_slave_addr(struct drm_device *dev, int sdvo_reg)
 	}
 
 	/* If the BIOS described our SDVO device, take advantage of it. */
-	if (my_mapping->slave_addr)
-		return my_mapping->slave_addr;
+	if (my_mapping->target_addr)
+		return my_mapping->target_addr;
 
 	/* If the BIOS only described a different SDVO device, use the
 	 * address that it isn't using.
 	 */
-	if (other_mapping->slave_addr) {
-		if (other_mapping->slave_addr == 0x70)
+	if (other_mapping->target_addr) {
+		if (other_mapping->target_addr == 0x70)
 			return 0x72;
 		else
 			return 0x70;
@@ -2446,7 +2446,7 @@ bool psb_intel_sdvo_init(struct drm_device *dev, int sdvo_reg)
 		return false;
 
 	psb_intel_sdvo->sdvo_reg = sdvo_reg;
-	psb_intel_sdvo->slave_addr = psb_intel_sdvo_get_slave_addr(dev, sdvo_reg) >> 1;
+	psb_intel_sdvo->target_addr = psb_intel_sdvo_get_target_addr(dev, sdvo_reg) >> 1;
 	psb_intel_sdvo_select_i2c_bus(dev_priv, psb_intel_sdvo, sdvo_reg);
 	if (!psb_intel_sdvo_init_ddc_proxy(psb_intel_sdvo, dev)) {
 		kfree(psb_intel_sdvo);
-- 
2.34.1


^ permalink raw reply related	[relevance 46%]

* [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive
  2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
@ 2024-05-03 18:13 22% ` Easwar Hariharan
  2024-05-07 18:16 67%   ` Easwar Hariharan
  2024-05-03 18:13 46% ` [PATCH v2 02/12] drm/gma500: " Easwar Hariharan
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  To: Alex Deucher, Christian König, Pan, Xinhui, David Airlie,
	Daniel Vetter, Harry Wentland, Leo Li, Rodrigo Siqueira,
	Evan Quan, Hawking Zhang, Candice Li, Ran Sun,
	Alexander Richards, Easwar Hariharan, Wolfram Sang, Andi Shyti,
	Dmitry Baryshkov, Heiko Stuebner, Heiner Kallweit, Hamza Mahfooz,
	Ruan Jinjie, Aurabindo Pillai, Wayne Lin, Samson Tam, Alvin Lee,
	Sohaib Nadeem, Charlene Liu, Tom Chung, Alan Liu,
	Bhawanpreet Lakha, Meenakshikumar Somasundaram, George Shen,
	Aric Cyr, Nicholas Kazlauskas, Qingqing Zhuo, Dillon Varone,
	Lijo Lazar, Asad kamal, Kenneth Feng, Ma Jun, Darren Powell,
	Yang Wang, Mario Limonciello, Yifan Zhang, Le Ma,
	open list:RADEON and AMDGPU DRM DRIVERS, open list:DRM DRIVERS,
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/ [1], fix the terminology for
users of the I2C_ALGOBIT bitbanging interface, now that the approved
verbiage exists in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
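Illustrative note, not part of the patch: a minimal userspace sketch of
the naming pattern the hunks below converge on. It assumes a simplified
stand-in for the kernel's struct i2c_msg; every sketch_* name and
SKETCH_I2C_M_RD are hypothetical and exist only for this example. The
logic mirrors the write-then-read pair in amdgpu_i2c_get_byte() and the
"target_addr << 1" conversion in the atombios_i2c.c hunks.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct i2c_msg; only the
 * fields touched by the hunks in this patch are modelled.
 */
#define SKETCH_I2C_M_RD 0x0001 /* read flag, same value as I2C_M_RD */

struct sketch_i2c_msg {
	uint16_t addr;   /* 7-bit target address */
	uint16_t flags;  /* 0 = write, SKETCH_I2C_M_RD = read */
	uint16_t len;    /* number of bytes in buf */
	uint8_t *buf;
};

/* Fill a write-then-read message pair addressed to a single target,
 * in the style of amdgpu_i2c_get_byte() after the rename.
 */
static void sketch_fill_get_byte(struct sketch_i2c_msg msgs[2],
				 uint8_t target_addr,
				 uint8_t *out_buf, uint8_t *in_buf)
{
	msgs[0].addr = target_addr;
	msgs[0].flags = 0;
	msgs[0].len = 1;
	msgs[0].buf = out_buf;

	msgs[1].addr = target_addr;
	msgs[1].flags = SKETCH_I2C_M_RD;
	msgs[1].len = 1;
	msgs[1].buf = in_buf;
}

/* Convert a 7-bit target address to its 8-bit (shifted) form. */
static uint8_t sketch_addr_to_8bit(uint8_t target_addr)
{
	return (uint8_t)(target_addr << 1);
}
```

Only identifiers change in the patch itself; the sketch is meant to show
that the address value and message layout stay the same whichever name
the field carries.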
 .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c  |  8 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c       | 10 +++----
 drivers/gpu/drm/amd/amdgpu/atombios_i2c.c     |  8 +++---
 drivers/gpu/drm/amd/amdgpu/atombios_i2c.h     |  2 +-
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    | 20 ++++++-------
 .../gpu/drm/amd/display/dc/bios/bios_parser.c |  2 +-
 .../drm/amd/display/dc/bios/bios_parser2.c    |  2 +-
 .../drm/amd/display/dc/core/dc_link_exports.c |  4 +--
 drivers/gpu/drm/amd/display/dc/dc.h           |  2 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c  |  4 +--
 .../display/include/grph_object_ctrl_defs.h   |  2 +-
 drivers/gpu/drm/amd/include/atombios.h        |  2 +-
 drivers/gpu/drm/amd/include/atomfirmware.h    | 26 ++++++++---------
 .../powerplay/hwmgr/vega20_processpptables.c  |  4 +--
 .../amd/pm/powerplay/inc/smu11_driver_if.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_arcturus.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_navi10.h      |  2 +-
 .../pmfw_if/smu11_driver_if_sienna_cichlid.h  |  2 +-
 .../inc/pmfw_if/smu13_driver_if_aldebaran.h   |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_0.h     |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_7.h     |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  4 +--
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  8 +++---
 drivers/gpu/drm/radeon/atombios.h             | 16 +++++------
 drivers/gpu/drm/radeon/atombios_i2c.c         |  4 +--
 drivers/gpu/drm/radeon/radeon_combios.c       | 28 +++++++++----------
 drivers/gpu/drm/radeon/radeon_i2c.c           | 10 +++----
 drivers/gpu/drm/radeon/radeon_mode.h          |  6 ++--
 28 files changed, 93 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
index 6857c586ded7..37f50fc5d496 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
@@ -614,7 +614,7 @@ bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev,
 		if ((frev == 3 && crev >= 4) || (frev > 3)) {
 			firmware_info = (union firmware_info *)
 				(mode_info->atom_context->bios + data_offset);
-			/* The ras_rom_i2c_slave_addr should ideally
+			/* The ras_rom_i2c_target_addr should ideally
 			 * be a 19-bit EEPROM address, which would be
 			 * used as is by the driver; see top of
 			 * amdgpu_eeprom.c.
@@ -625,13 +625,13 @@ bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev,
 			 * leave the check for the pointer.
 			 *
 			 * The reason this works right now is because
-			 * ras_rom_i2c_slave_addr contains the EEPROM
+			 * ras_rom_i2c_target_addr contains the EEPROM
 			 * device type qualifier 1010b in the top 4
 			 * bits.
 			 */
-			if (firmware_info->v34.ras_rom_i2c_slave_addr) {
+			if (firmware_info->v34.ras_rom_i2c_target_addr) {
 				if (i2c_address)
-					*i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr;
+					*i2c_address = firmware_info->v34.ras_rom_i2c_target_addr;
 				return true;
 			}
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index d79cb13e1aa8..070049c92e2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -280,7 +280,7 @@ amdgpu_i2c_lookup(struct amdgpu_device *adev,
 }
 
 static void amdgpu_i2c_get_byte(struct amdgpu_i2c_chan *i2c_bus,
-				 u8 slave_addr,
+				 u8 target_addr,
 				 u8 addr,
 				 u8 *val)
 {
@@ -288,13 +288,13 @@ static void amdgpu_i2c_get_byte(struct amdgpu_i2c_chan *i2c_bus,
 	u8 in_buf[2];
 	struct i2c_msg msgs[] = {
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -314,13 +314,13 @@ static void amdgpu_i2c_get_byte(struct amdgpu_i2c_chan *i2c_bus,
 }
 
 static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan *i2c_bus,
-				 u8 slave_addr,
+				 u8 target_addr,
 				 u8 addr,
 				 u8 val)
 {
 	uint8_t out_buf[2];
 	struct i2c_msg msg = {
-		.addr = slave_addr,
+		.addr = target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c b/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c
index a6501114322f..a7d3c3d2c633 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c
@@ -36,7 +36,7 @@
 #define ATOM_MAX_HW_I2C_READ  255
 
 static int amdgpu_atombios_i2c_process_i2c_ch(struct amdgpu_i2c_chan *chan,
-				       u8 slave_addr, u8 flags,
+				       u8 target_addr, u8 flags,
 				       u8 *buf, u8 num)
 {
 	struct drm_device *dev = chan->dev;
@@ -83,7 +83,7 @@ static int amdgpu_atombios_i2c_process_i2c_ch(struct amdgpu_i2c_chan *chan,
 	args.ucFlag = flags;
 	args.ucI2CSpeed = TARGET_HW_I2C_CLOCK;
 	args.ucTransBytes = num;
-	args.ucSlaveAddr = slave_addr << 1;
+	args.ucTargetAddr = target_addr << 1;
 	args.ucLineNumber = chan->rec.i2c_id;
 
 	amdgpu_atom_execute_table(adev->mode_info.atom_context, index, (uint32_t *)&args, sizeof(args));
@@ -159,7 +159,7 @@ u32 amdgpu_atombios_i2c_func(struct i2c_adapter *adap)
 	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
 }
 
-void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device *adev, u8 slave_addr, u8 line_number, u8 offset, u8 data)
+void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device *adev, u8 target_addr, u8 line_number, u8 offset, u8 data)
 {
 	PROCESS_I2C_CHANNEL_TRANSACTION_PS_ALLOCATION args;
 	int index = GetIndexIntoMasterTable(COMMAND, ProcessI2cChannelTransaction);
@@ -169,7 +169,7 @@ void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device *adev, u8 slave_addr
 	args.ucFlag = 1;
 	args.ucI2CSpeed = TARGET_HW_I2C_CLOCK;
 	args.ucTransBytes = 1;
-	args.ucSlaveAddr = slave_addr;
+	args.ucTargetAddr = target_addr;
 	args.ucLineNumber = line_number;
 
 	amdgpu_atom_execute_table(adev->mode_info.atom_context, index, (uint32_t *)&args, sizeof(args));
diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_i2c.h b/drivers/gpu/drm/amd/amdgpu/atombios_i2c.h
index 251aaf41f65d..13e683896ef6 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_i2c.h
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_i2c.h
@@ -28,6 +28,6 @@ int amdgpu_atombios_i2c_xfer(struct i2c_adapter *i2c_adap,
 		      struct i2c_msg *msgs, int num);
 u32 amdgpu_atombios_i2c_func(struct i2c_adapter *adap);
 void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device* adev,
-		u8 slave_addr, u8 line_number, u8 offset, u8 data);
+		u8 target_addr, u8 line_number, u8 offset, u8 data);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index dd2d66090d23..b91ed6050541 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -229,7 +229,7 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
 
 	reg_c_tx_abrt_source = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TX_ABRT_SOURCE);
 
-	/* If slave is not present */
+	/* If target is not present */
 	if (REG_GET_FIELD(reg_c_tx_abrt_source,
 			  CKSVII2C_IC_TX_ABRT_SOURCE,
 			  ABRT_7B_ADDR_NOACK) == 1) {
@@ -255,10 +255,10 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
 }
 
 /**
- * smu_v11_0_i2c_transmit - Send a block of data over the I2C bus to a slave device.
+ * smu_v11_0_i2c_transmit - Send a block of data over the I2C bus to a target device.
  *
  * @control: I2C adapter reference
- * @address: The I2C address of the slave device.
+ * @address: The I2C address of the target device.
  * @data: The data to transmit over the bus.
  * @numbytes: The amount of data to transmit.
  * @i2c_flag: Flags for transmission
@@ -284,7 +284,7 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 			       16, 1, data, numbytes, false);
 	}
 
-	/* Set the I2C slave address */
+	/* Set the I2C target address */
 	smu_v11_0_i2c_set_address(control, address);
 	/* Enable I2C */
 	smu_v11_0_i2c_enable(control, true);
@@ -354,10 +354,10 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 
 
 /**
- * smu_v11_0_i2c_receive - Receive a block of data over the I2C bus from a slave device.
+ * smu_v11_0_i2c_receive - Receive a block of data over the I2C bus from a target device.
  *
  * @control: I2C adapter reference
- * @address: The I2C address of the slave device.
+ * @address: The I2C address of the target device.
  * @data: Placeholder to store received data.
  * @numbytes: The amount of data to transmit.
  * @i2c_flag: Flags for transmission
@@ -374,7 +374,7 @@ static uint32_t smu_v11_0_i2c_receive(struct i2c_adapter *control,
 
 	bytes_received = 0;
 
-	/* Set the I2C slave address */
+	/* Set the I2C target address */
 	smu_v11_0_i2c_set_address(control, address);
 
 	/* Enable I2C */
@@ -509,7 +509,7 @@ static void smu_v11_0_i2c_init(struct i2c_adapter *control)
 	if (res != I2C_OK)
 		smu_v11_0_i2c_abort(control);
 
-	/* Configure I2C to operate as master and in standard mode */
+	/* Configure I2C to operate as controller and in standard mode */
 	smu_v11_0_i2c_configure(control);
 
 	/* Initialize the clock to 50 kHz default */
@@ -650,11 +650,11 @@ static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
 
 	smu_v11_0_i2c_init(i2c_adap);
 
-	/* From the client's point of view, this sequence of
+	/* From the target's point of view, this sequence of
 	 * messages-- the array i2c_msg *msg, is a single transaction
 	 * on the bus, starting with START and ending with STOP.
 	 *
-	 * The client is welcome to send any sequence of messages in
+	 * The target is welcome to send any sequence of messages in
 	 * this array, as processing under this function here is
 	 * striving to be agnostic.
 	 *
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
index 6450853fea94..51aa72e4eba4 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
@@ -1871,7 +1871,7 @@ static enum bp_result get_gpio_i2c_info(struct bios_parser *bp,
 	info->i2c_hw_assist = record->sucI2cId.bfHW_Capable;
 	info->i2c_line = record->sucI2cId.bfI2C_LineMux;
 	info->i2c_engine_id = record->sucI2cId.bfHW_EngineID;
-	info->i2c_slave_address = record->ucI2CAddr;
+	info->i2c_target_address = record->ucI2CAddr;
 
 	info->gpio_info.clk_mask_register_index =
 			le16_to_cpu(header->asGPIO_Info[info->i2c_line].usClkMaskRegisterIndex);
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index ab31643b1096..90abab6bd00a 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -511,7 +511,7 @@ static enum bp_result get_gpio_i2c_info(
 	info->i2c_hw_assist = (record->i2c_id & I2C_HW_CAP) ? true : false;
 	info->i2c_line = record->i2c_id & I2C_HW_LANE_MUX;
 	info->i2c_engine_id = (record->i2c_id & I2C_HW_ENGINE_ID_MASK) >> 4;
-	info->i2c_slave_address = record->i2c_slave_addr;
+	info->i2c_target_address = record->i2c_target_addr;
 
 	/* TODO: check how to get register offset for en, Y, etc. */
 	info->gpio_info.clk_a_register_index = le16_to_cpu(pin->data_a_reg_index);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
index c6c35037bdb8..9d2ec5fce4ae 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
@@ -141,13 +141,13 @@ bool dc_link_update_dsc_config(struct pipe_ctx *pipe_ctx)
 
 bool dc_is_oem_i2c_device_present(
 	struct dc *dc,
-	size_t slave_address)
+	size_t target_address)
 {
 	if (dc->res_pool->oem_device)
 		return dce_i2c_oem_device_present(
 			dc->res_pool,
 			dc->res_pool->oem_device,
-			slave_address);
+			target_address);
 
 	return false;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
index ee8453bf958f..21608f42879f 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -1803,7 +1803,7 @@ int dc_link_aux_transfer_raw(struct ddc_service *ddc,
 
 bool dc_is_oem_i2c_device_present(
 	struct dc *dc,
-	size_t slave_address
+	size_t target_address
 );
 
 /* return true if the connected receiver supports the hdcp version */
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c b/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c
index f5cd2392fc5f..f4c83d322350 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c
@@ -28,7 +28,7 @@
 bool dce_i2c_oem_device_present(
 	struct resource_pool *pool,
 	struct ddc_service *ddc,
-	size_t slave_address
+	size_t target_address
 )
 {
 	struct dc *dc = ddc->ctx->dc;
@@ -45,7 +45,7 @@ bool dce_i2c_oem_device_present(
 	if (dcb->funcs->get_i2c_info(dcb, id, &i2c_info) != BP_RESULT_OK)
 		return false;
 
-	if (i2c_info.i2c_slave_address != slave_address)
+	if (i2c_info.i2c_target_address != target_address)
 		return false;
 
 	return true;
diff --git a/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h b/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h
index 813463ffe15c..c30a2117a539 100644
--- a/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h
+++ b/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h
@@ -92,7 +92,7 @@ struct graphics_object_i2c_info {
 	bool i2c_hw_assist;
 	uint32_t i2c_line;
 	uint32_t i2c_engine_id;
-	uint32_t i2c_slave_address;
+	uint32_t i2c_target_address;
 };
 
 struct graphics_object_hpd_info {
diff --git a/drivers/gpu/drm/amd/include/atombios.h b/drivers/gpu/drm/amd/include/atombios.h
index b78360a71bc9..5644920f45e6 100644
--- a/drivers/gpu/drm/amd/include/atombios.h
+++ b/drivers/gpu/drm/amd/include/atombios.h
@@ -8503,7 +8503,7 @@ typedef struct _PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS
    USHORT  lpI2CDataOut;
   UCHAR   ucFlag;
   UCHAR   ucTransBytes;
-  UCHAR   ucSlaveAddr;
+  UCHAR   ucTargetAddr;
   UCHAR   ucLineNumber;
 }PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS;
 
diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h b/drivers/gpu/drm/amd/include/atomfirmware.h
index af3eebb4c9bc..0b76c3655df7 100644
--- a/drivers/gpu/drm/amd/include/atomfirmware.h
+++ b/drivers/gpu/drm/amd/include/atomfirmware.h
@@ -534,7 +534,7 @@ struct atom_firmware_info_v3_2 {
   uint32_t mc_baseaddr_low;
   uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
   uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-  uint8_t  board_i2c_feature_slave_addr;
+  uint8_t  board_i2c_feature_target_addr;
   uint8_t  reserved3;
   uint16_t bootup_mvddq_mv;
   uint16_t bootup_mvpp_mv;
@@ -562,7 +562,7 @@ struct atom_firmware_info_v3_3
   uint32_t mc_baseaddr_low;
   uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
   uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-  uint8_t  board_i2c_feature_slave_addr;
+  uint8_t  board_i2c_feature_target_addr;
   uint8_t  reserved3;
   uint16_t bootup_mvddq_mv;
   uint16_t bootup_mvpp_mv;
@@ -590,8 +590,8 @@ struct atom_firmware_info_v3_4 {
 	uint32_t mc_baseaddr_low;
 	uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
 	uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-	uint8_t  board_i2c_feature_slave_addr;
-	uint8_t  ras_rom_i2c_slave_addr;
+	uint8_t  board_i2c_feature_target_addr;
+	uint8_t  ras_rom_i2c_target_addr;
 	uint16_t bootup_mvddq_mv;
 	uint16_t bootup_mvpp_mv;
 	uint32_t zfbstartaddrin16mb;
@@ -626,8 +626,8 @@ struct atom_firmware_info_v3_5 {
   uint32_t mc_baseaddr_low;
   uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
   uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-  uint8_t  board_i2c_feature_slave_addr;
-  uint8_t  ras_rom_i2c_slave_addr;
+  uint8_t  board_i2c_feature_target_addr;
+  uint8_t  ras_rom_i2c_target_addr;
   uint32_t bootup_voltage_reserved1;
   uint32_t zfb_reserved;
   // if pplib_pptable_id!=0, pplib get powerplay table inside driver instead of from VBIOS
@@ -830,7 +830,7 @@ struct atom_i2c_record
 {
   struct atom_common_record_header record_header;   //record_type = ATOM_I2C_RECORD_TYPE
   uint8_t i2c_id; 
-  uint8_t i2c_slave_addr;                   //The slave address, it's 0 when the record is attached to connector for DDC
+  uint8_t i2c_target_addr;                   //The target address, it's 0 when the record is attached to connector for DDC
 };
 
 struct atom_hpd_int_record
@@ -2026,7 +2026,7 @@ struct atom_smu_info_v3_5
   uint16_t smuinitoffset;
   uint32_t bootup_dprefclk_10khz;
   uint32_t bootup_usbclk_10khz;
-  uint32_t smb_slave_address;
+  uint32_t smb_target_address;
   uint32_t cg_fdo_ctrl0_val;
   uint32_t cg_fdo_ctrl1_val;
   uint32_t cg_fdo_ctrl2_val;
@@ -2083,7 +2083,7 @@ struct atom_smu_info_v3_6
 	uint16_t smuinitoffset;
 	uint32_t bootup_gfxavsclk_10khz;
 	uint32_t bootup_mpioclk_10khz;
-	uint32_t smb_slave_address;
+	uint32_t smb_target_address;
 	uint32_t cg_fdo_ctrl0_val;
 	uint32_t cg_fdo_ctrl1_val;
 	uint32_t cg_fdo_ctrl2_val;
@@ -2138,7 +2138,7 @@ struct atom_smu_info_v4_0 {
 	uint16_t smuinitoffset;
 	uint32_t bootup_dprefclk_10khz;
 	uint32_t bootup_usbclk_10khz;
-	uint32_t smb_slave_address;
+	uint32_t smb_target_address;
 	uint32_t cg_fdo_ctrl0_val;
 	uint32_t cg_fdo_ctrl1_val;
 	uint32_t cg_fdo_ctrl2_val;
@@ -2349,7 +2349,7 @@ struct atom_smc_dpm_info_v4_3
 
 struct smudpm_i2ccontrollerconfig_t {
   uint32_t  enabled;
-  uint32_t  slaveaddress;
+  uint32_t  targetaddress;
   uint32_t  controllerport;
   uint32_t  controllername;
   uint32_t  thermalthrottler;
@@ -3510,7 +3510,7 @@ struct  atom_i2c_voltage_object_v4
    struct atom_voltage_object_header_v4 header;  // voltage mode = VOLTAGE_OBJ_VR_I2C_INIT_SEQ
    uint8_t  regulator_id;                        //Indicate Voltage Regulator Id
    uint8_t  i2c_id;
-   uint8_t  i2c_slave_addr;
+   uint8_t  i2c_target_addr;
    uint8_t  i2c_control_offset;       
    uint8_t  i2c_flag;                            // Bit0: 0 - One byte data; 1 - Two byte data
    uint8_t  i2c_speed;                           // =0, use default i2c speed, otherwise use it in unit of kHz. 
@@ -4152,7 +4152,7 @@ struct process_i2c_channel_transaction_parameters
   uint16_t  i2c_data_out;
   uint8_t   flag;                    /* enum atom_process_i2c_status */
   uint8_t   trans_bytes;
-  uint8_t   slave_addr;
+  uint8_t   target_addr;
   uint8_t   i2c_id;
 };
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c
index 79c817752a33..cb9ee5345745 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c
@@ -784,8 +784,8 @@ static int append_vbios_pptable(struct pp_hwmgr *hwmgr, PPTable_t *ppsmc_pptable
 	for (i = 0; i < I2C_CONTROLLER_NAME_COUNT; i++) {
 		ppsmc_pptable->I2cControllers[i].Enabled =
 			smc_dpm_table->i2ccontrollers[i].enabled;
-		ppsmc_pptable->I2cControllers[i].SlaveAddress =
-			smc_dpm_table->i2ccontrollers[i].slaveaddress;
+		ppsmc_pptable->I2cControllers[i].TargetAddress =
+			smc_dpm_table->i2ccontrollers[i].targetaddress;
 		ppsmc_pptable->I2cControllers[i].ControllerPort =
 			smc_dpm_table->i2ccontrollers[i].controllerport;
 		ppsmc_pptable->I2cControllers[i].ThermalThrottler =
diff --git a/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h b/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h
index c2efc70ef288..69d7ec6fd971 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h
@@ -287,7 +287,7 @@ typedef enum {
 
 typedef struct {
   uint32_t Enabled;
-  uint32_t SlaveAddress;
+  uint32_t TargetAddress;
   uint32_t ControllerPort;
   uint32_t ControllerName;
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h
index d518dee18e1b..5684e2a16e6c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h
@@ -263,7 +263,7 @@ typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
   uint8_t   Padding[2];
-  uint32_t  SlaveAddress;
+  uint32_t  TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h
index c5c1943fb6a1..1782b8e8fcd2 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h
@@ -267,7 +267,7 @@ typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
   uint8_t   Padding[2];
-  uint32_t  SlaveAddress;
+  uint32_t  TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h
index aa6d29de4002..6be89c6dd492 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h
@@ -342,7 +342,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;  
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h
index cddf45eebee8..c590f4557074 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h
@@ -167,7 +167,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ThermalThrotter;
   uint8_t   I2cProtocol;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
index b114d14fc053..ebe2d344bf5b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
@@ -319,7 +319,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h
index 8b1496f8ce58..8e9c7fa22b4f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h
@@ -320,7 +320,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 0c2d04f978ac..e2c6a4806e5c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1909,8 +1909,8 @@ static void arcturus_dump_pptable(struct smu_context *smu)
 		dev_info(smu->adev->dev, "I2cControllers[%d]:\n", i);
 		dev_info(smu->adev->dev, "                   .Enabled = %d\n",
 				pptable->I2cControllers[i].Enabled);
-		dev_info(smu->adev->dev, "                   .SlaveAddress = 0x%x\n",
-				pptable->I2cControllers[i].SlaveAddress);
+		dev_info(smu->adev->dev, "                   .TargetAddress = 0x%x\n",
+				pptable->I2cControllers[i].TargetAddress);
 		dev_info(smu->adev->dev, "                   .ControllerPort = %d\n",
 				pptable->I2cControllers[i].ControllerPort);
 		dev_info(smu->adev->dev, "                   .ControllerName = %d\n",
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 1f18b61884f3..eec4b9b9598c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -2988,8 +2988,8 @@ static void beige_goby_dump_pptable(struct smu_context *smu)
 				pptable->I2cControllers[i].Enabled);
 		dev_info(smu->adev->dev, "                   .Speed = 0x%x\n",
 				pptable->I2cControllers[i].Speed);
-		dev_info(smu->adev->dev, "                   .SlaveAddress = 0x%x\n",
-				pptable->I2cControllers[i].SlaveAddress);
+		dev_info(smu->adev->dev, "                   .TargetAddress = 0x%x\n",
+				pptable->I2cControllers[i].TargetAddress);
 		dev_info(smu->adev->dev, "                   .ControllerPort = 0x%x\n",
 				pptable->I2cControllers[i].ControllerPort);
 		dev_info(smu->adev->dev, "                   .ControllerName = 0x%x\n",
@@ -3627,8 +3627,8 @@ static void sienna_cichlid_dump_pptable(struct smu_context *smu)
 				pptable->I2cControllers[i].Enabled);
 		dev_info(smu->adev->dev, "                   .Speed = 0x%x\n",
 				pptable->I2cControllers[i].Speed);
-		dev_info(smu->adev->dev, "                   .SlaveAddress = 0x%x\n",
-				pptable->I2cControllers[i].SlaveAddress);
+		dev_info(smu->adev->dev, "                   .TargetAddress = 0x%x\n",
+				pptable->I2cControllers[i].TargetAddress);
 		dev_info(smu->adev->dev, "                   .ControllerPort = 0x%x\n",
 				pptable->I2cControllers[i].ControllerPort);
 		dev_info(smu->adev->dev, "                   .ControllerName = 0x%x\n",
diff --git a/drivers/gpu/drm/radeon/atombios.h b/drivers/gpu/drm/radeon/atombios.h
index 2db40789235c..40444af51d0a 100644
--- a/drivers/gpu/drm/radeon/atombios.h
+++ b/drivers/gpu/drm/radeon/atombios.h
@@ -1834,7 +1834,7 @@ typedef struct _READ_EDID_FROM_HW_I2C_DATA_PARAMETERS
   USHORT    usVRAMAddress;      //Address in Frame Buffer where to pace raw EDID
   USHORT    usStatus;           //When use output: lower byte EDID checksum, high byte hardware status
                                 //WHen use input:  lower byte as 'byte to read':currently limited to 128byte or 1byte
-  UCHAR     ucSlaveAddr;        //Read from which slave
+  UCHAR     ucTargetAddr;        //Read from which slave
   UCHAR     ucLineNumber;       //Read from which HW assisted line
 }READ_EDID_FROM_HW_I2C_DATA_PARAMETERS;
 #define READ_EDID_FROM_HW_I2C_DATA_PS_ALLOCATION  READ_EDID_FROM_HW_I2C_DATA_PARAMETERS
@@ -1858,7 +1858,7 @@ typedef struct _WRITE_ONE_BYTE_HW_I2C_DATA_PARAMETERS
                                 //blockID+counterID+offsetID
   UCHAR     ucData;             //PS data1
   UCHAR     ucStatus;           //Status byte 1=success, 2=failure, Also is used as PS data2
-  UCHAR     ucSlaveAddr;        //Write to which slave
+  UCHAR     ucTargetAddr;        //Write to which slave
   UCHAR     ucLineNumber;       //Write from which HW assisted line
 }WRITE_ONE_BYTE_HW_I2C_DATA_PARAMETERS;
 
@@ -1867,7 +1867,7 @@ typedef struct _WRITE_ONE_BYTE_HW_I2C_DATA_PARAMETERS
 typedef struct _SET_UP_HW_I2C_DATA_PARAMETERS
 {
   USHORT    usPrescale;         //Ratio between Engine clock and I2C clock
-  UCHAR     ucSlaveAddr;        //Write to which slave
+  UCHAR     ucTargetAddr;        //Write to which slave
   UCHAR     ucLineNumber;       //Write from which HW assisted line
 }SET_UP_HW_I2C_DATA_PARAMETERS;
 
@@ -4741,7 +4741,7 @@ typedef struct _ATOM_POWER_SOURCE_OBJECT
 	UCHAR	ucPwrSrcId;													// Power source
 	UCHAR	ucPwrSensorType;										// GPIO, I2C or none
 	UCHAR	ucPwrSensId;											  // if GPIO detect, it is GPIO id,  if I2C detect, it is I2C id
-	UCHAR	ucPwrSensSlaveAddr;									// Slave address if I2C detect
+	UCHAR	ucPwrSensTargetAddr;									// Target address if I2C detect
 	UCHAR ucPwrSensRegIndex;									// I2C register Index if I2C detect
 	UCHAR ucPwrSensRegBitMask;								// detect which bit is used if I2C detect
 	UCHAR	ucPwrSensActiveState;								// high active or low active
@@ -5449,7 +5449,7 @@ typedef struct _ATOM_I2C_DEVICE_SETUP_INFO
 {
   ATOM_I2C_ID_CONFIG_ACCESS       sucI2cId;               //I2C line and HW/SW assisted cap.
   UCHAR		                        ucSSChipID;             //SS chip being used
-  UCHAR		                        ucSSChipSlaveAddr;      //Slave Address to set up this SS chip
+  UCHAR		                        ucSSChipTargetAddr;      //Target Address to set up this SS chip
   UCHAR                           ucNumOfI2CDataRecords;  //number of data block
   ATOM_I2C_DATA_RECORD            asI2CData[];
 }ATOM_I2C_DEVICE_SETUP_INFO;
@@ -7229,7 +7229,7 @@ typedef struct _PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS
 	USHORT  lpI2CDataOut;
   UCHAR   ucFlag;               
   UCHAR   ucTransBytes;
-  UCHAR   ucSlaveAddr;
+  UCHAR   ucTargetAddr;
   UCHAR   ucLineNumber;
 }PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS;
 
@@ -7599,8 +7599,8 @@ typedef struct _ATOM_XTMDS_INFO
   UCHAR                      ucSupportedLink;    // Bit field, bit0=1, single link supported;bit1=1,dual link supported
   UCHAR                      ucSequnceAlterID;   // Even with the same external TMDS asic, it's possible that the program seqence alters 
                                                  // due to design. This ID is used to alert driver that the sequence is not "standard"!              
-  UCHAR                      ucMasterAddress;    // Address to control Master xTMDS Chip
-  UCHAR                      ucSlaveAddress;     // Address to control Slave xTMDS Chip
+  UCHAR                      ucControllerAddress;    // Address to control Controller xTMDS Chip
+  UCHAR                      ucTargetAddress;     // Address to control Target xTMDS Chip
 }ATOM_XTMDS_INFO;
 
 typedef struct _DFP_DPMS_STATUS_CHANGE_PARAMETERS
diff --git a/drivers/gpu/drm/radeon/atombios_i2c.c b/drivers/gpu/drm/radeon/atombios_i2c.c
index 730f0b25312b..3acae0b28122 100644
--- a/drivers/gpu/drm/radeon/atombios_i2c.c
+++ b/drivers/gpu/drm/radeon/atombios_i2c.c
@@ -34,7 +34,7 @@
 #define ATOM_MAX_HW_I2C_READ  255
 
 static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan,
-				 u8 slave_addr, u8 flags,
+				 u8 target_addr, u8 flags,
 				 u8 *buf, int num)
 {
 	struct drm_device *dev = chan->dev;
@@ -75,7 +75,7 @@ static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan,
 	args.ucFlag = flags;
 	args.ucI2CSpeed = TARGET_HW_I2C_CLOCK;
 	args.ucTransBytes = num;
-	args.ucSlaveAddr = slave_addr << 1;
+	args.ucTargetAddr = target_addr << 1;
 	args.ucLineNumber = chan->rec.i2c_id;
 
 	atom_execute_table_scratch_unlocked(rdev->mode_info.atom_context, index, (uint32_t *)&args, sizeof(args));
diff --git a/drivers/gpu/drm/radeon/radeon_combios.c b/drivers/gpu/drm/radeon/radeon_combios.c
index 6952b1273b0f..107638ec8c75 100644
--- a/drivers/gpu/drm/radeon/radeon_combios.c
+++ b/drivers/gpu/drm/radeon/radeon_combios.c
@@ -1398,7 +1398,7 @@ bool radeon_legacy_get_ext_tmds_info_from_table(struct radeon_encoder *encoder,
 	case CT_MINI_EXTERNAL:
 	default:
 		tmds->dvo_chip = DVO_SIL164;
-		tmds->slave_addr = 0x70 >> 1; /* 7 bit addressing */
+		tmds->target_addr = 0x70 >> 1; /* 7 bit addressing */
 		break;
 	}
 
@@ -1420,14 +1420,14 @@ bool radeon_legacy_get_ext_tmds_info_from_combios(struct radeon_encoder *encoder
 		i2c_bus = combios_setup_i2c_bus(rdev, DDC_MONID, 0, 0);
 		tmds->i2c_bus = radeon_i2c_lookup(rdev, &i2c_bus);
 		tmds->dvo_chip = DVO_SIL164;
-		tmds->slave_addr = 0x70 >> 1; /* 7 bit addressing */
+		tmds->target_addr = 0x70 >> 1; /* 7 bit addressing */
 	} else {
 		offset = combios_get_table_offset(dev, COMBIOS_EXT_TMDS_INFO_TABLE);
 		if (offset) {
 			ver = RBIOS8(offset);
 			DRM_DEBUG_KMS("External TMDS Table revision: %d\n", ver);
-			tmds->slave_addr = RBIOS8(offset + 4 + 2);
-			tmds->slave_addr >>= 1; /* 7 bit addressing */
+			tmds->target_addr = RBIOS8(offset + 4 + 2);
+			tmds->target_addr >>= 1; /* 7 bit addressing */
 			gpio = RBIOS8(offset + 4 + 3);
 			if (gpio == DDC_LCD) {
 				/* MM i2c */
@@ -2846,19 +2846,19 @@ void radeon_external_tmds_setup(struct drm_encoder *encoder)
 	case DVO_SIL164:
 		/* sil 164 */
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				    tmds->slave_addr,
+				    tmds->target_addr,
 				    0x08, 0x30);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				       tmds->slave_addr,
+				       tmds->target_addr,
 				       0x09, 0x00);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				    tmds->slave_addr,
+				    tmds->target_addr,
 				    0x0a, 0x90);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				    tmds->slave_addr,
+				    tmds->target_addr,
 				    0x0c, 0x89);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				       tmds->slave_addr,
+				       tmds->target_addr,
 				       0x08, 0x3b);
 		break;
 	case DVO_SIL1178:
@@ -2887,7 +2887,7 @@ bool radeon_combios_external_tmds_setup(struct drm_encoder *encoder)
 	struct radeon_device *rdev = dev->dev_private;
 	struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder);
 	uint16_t offset;
-	uint8_t blocks, slave_addr, rev;
+	uint8_t blocks, target_addr, rev;
 	uint32_t index, id;
 	uint32_t reg, val, and_mask, or_mask;
 	struct radeon_encoder_ext_tmds *tmds = radeon_encoder->enc_priv;
@@ -2934,15 +2934,15 @@ bool radeon_combios_external_tmds_setup(struct drm_encoder *encoder)
 						mdelay(val);
 						break;
 					case 6:
-						slave_addr = id & 0xff;
-						slave_addr >>= 1; /* 7 bit addressing */
+						target_addr = id & 0xff;
+						target_addr >>= 1; /* 7 bit addressing */
 						index++;
 						reg = RBIOS8(index);
 						index++;
 						val = RBIOS8(index);
 						index++;
 						radeon_i2c_put_byte(tmds->i2c_bus,
-								    slave_addr,
+								    target_addr,
 								    reg, val);
 						break;
 					default:
@@ -2997,7 +2997,7 @@ bool radeon_combios_external_tmds_setup(struct drm_encoder *encoder)
 					val = RBIOS8(index);
 					index += 1;
 					radeon_i2c_put_byte(tmds->i2c_bus,
-							    tmds->slave_addr,
+							    tmds->target_addr,
 							    reg, val);
 					break;
 				default:
diff --git a/drivers/gpu/drm/radeon/radeon_i2c.c b/drivers/gpu/drm/radeon/radeon_i2c.c
index 3d174390a8af..a2eb00229428 100644
--- a/drivers/gpu/drm/radeon/radeon_i2c.c
+++ b/drivers/gpu/drm/radeon/radeon_i2c.c
@@ -1038,7 +1038,7 @@ struct radeon_i2c_chan *radeon_i2c_lookup(struct radeon_device *rdev,
 }
 
 void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
-			 u8 slave_addr,
+			 u8 target_addr,
 			 u8 addr,
 			 u8 *val)
 {
@@ -1046,13 +1046,13 @@ void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
 	u8 in_buf[2];
 	struct i2c_msg msgs[] = {
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -1072,13 +1072,13 @@ void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
 }
 
 void radeon_i2c_put_byte(struct radeon_i2c_chan *i2c_bus,
-			 u8 slave_addr,
+			 u8 target_addr,
 			 u8 addr,
 			 u8 val)
 {
 	uint8_t out_buf[2];
 	struct i2c_msg msg = {
-		.addr = slave_addr,
+		.addr = target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h b/drivers/gpu/drm/radeon/radeon_mode.h
index 546381a5c918..701c5f9046a0 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -409,7 +409,7 @@ struct radeon_encoder_int_tmds {
 struct radeon_encoder_ext_tmds {
 	/* tmds over dvo */
 	struct radeon_i2c_chan *i2c_bus;
-	uint8_t slave_addr;
+	uint8_t target_addr;
 	enum radeon_dvo_chip dvo_chip;
 };
 
@@ -749,11 +749,11 @@ extern struct radeon_i2c_chan *radeon_i2c_create(struct drm_device *dev,
 						 const char *name);
 extern void radeon_i2c_destroy(struct radeon_i2c_chan *i2c);
 extern void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
-				u8 slave_addr,
+				u8 target_addr,
 				u8 addr,
 				u8 *val);
 extern void radeon_i2c_put_byte(struct radeon_i2c_chan *i2c,
-				u8 slave_addr,
+				u8 target_addr,
 				u8 addr,
 				u8 val);
 extern void radeon_router_select_ddc_port(struct radeon_connector *radeon_connector);
-- 
2.34.1


^ permalink raw reply related	[relevance 22%]

* [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers
@ 2024-05-03 18:13 51% Easwar Hariharan
  2024-05-03 18:13 22% ` [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
                   ` (11 more replies)
  0 siblings, 12 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 18:13 UTC (permalink / raw)
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

Please chime in with your opinions and suggestions.

This series is based on 3d25a941ea50 ("Merge tag 'block-6.9-20240503' of git://git.kernel.dk/linux")

[1]:
https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
----

changelog:
v1->v2:
- v1 link: https://lore.kernel.org/all/20240430173812.1423757-1-eahariha@linux.microsoft.com/ 
- Switch to specification verbiage master->controller, slave->target,
  drop usage of host/client [Thomas]
- Pick up Reviewed-bys and Acked-bys from Rodrigo, Zhi, and Thomas [gma500, i915]
- Fix up some straggler master/slave terms in amdgpu, cx25821, ivtv,
  cx23885

v0->v1:
- v0 link: https://lore.kernel.org/all/20240329170038.3863998-1-eahariha@linux.microsoft.com/
- Drop drivers/infiniband patches [Leon, Dennis]
- Switch to specification verbiage master->controller, slave->target,
  drop usage of client [Andi, Ville, Jani, Christian]
- Add I3C specification version in commit messages [Andi]
- Pick up Reviewed-bys from Martin and Simon [sfc]
- Drop i2c/treewide patch to make this series independent from Wolfram's
  ([1]) [Wolfram]
- Split away drm/nouveau patch to allow expansion into non-I2C
  non-inclusive terms
----

Easwar Hariharan (12):
  drm/amdgpu, drm/radeon: Make I2C terminology more inclusive
  drm/gma500: Make I2C terminology more inclusive
  drm/i915: Make I2C terminology more inclusive
  media: au0828: Make I2C terminology more inclusive
  media: cobalt: Make I2C terminology more inclusive
  media: cx18: Make I2C terminology more inclusive
  media: cx25821: Make I2C terminology more inclusive
  media: ivtv: Make I2C terminology more inclusive
  media: cx23885: Make I2C terminology more inclusive
  sfc: falcon: Make I2C terminology more inclusive
  fbdev/smscufx: Make I2C terminology more inclusive
  fbdev/viafb: Make I2C terminology more inclusive

 .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c  |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c       | 10 +++----
 drivers/gpu/drm/amd/amdgpu/atombios_i2c.c     |  8 ++---
 drivers/gpu/drm/amd/amdgpu/atombios_i2c.h     |  2 +-
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    | 20 ++++++-------
 .../gpu/drm/amd/display/dc/bios/bios_parser.c |  2 +-
 .../drm/amd/display/dc/bios/bios_parser2.c    |  2 +-
 .../drm/amd/display/dc/core/dc_link_exports.c |  4 +--
 drivers/gpu/drm/amd/display/dc/dc.h           |  2 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c  |  4 +--
 .../display/include/grph_object_ctrl_defs.h   |  2 +-
 drivers/gpu/drm/amd/include/atombios.h        |  2 +-
 drivers/gpu/drm/amd/include/atomfirmware.h    | 26 ++++++++--------
 .../powerplay/hwmgr/vega20_processpptables.c  |  4 +--
 .../amd/pm/powerplay/inc/smu11_driver_if.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_arcturus.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_navi10.h      |  2 +-
 .../pmfw_if/smu11_driver_if_sienna_cichlid.h  |  2 +-
 .../inc/pmfw_if/smu13_driver_if_aldebaran.h   |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_0.h     |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_7.h     |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  4 +--
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  8 ++---
 drivers/gpu/drm/gma500/cdv_intel_lvds.c       |  2 +-
 drivers/gpu/drm/gma500/intel_bios.c           | 22 +++++++-------
 drivers/gpu/drm/gma500/intel_bios.h           |  4 +--
 drivers/gpu/drm/gma500/intel_gmbus.c          |  2 +-
 drivers/gpu/drm/gma500/psb_drv.h              |  2 +-
 drivers/gpu/drm/gma500/psb_intel_drv.h        |  2 +-
 drivers/gpu/drm/gma500/psb_intel_lvds.c       |  4 +--
 drivers/gpu/drm/gma500/psb_intel_sdvo.c       | 26 ++++++++--------
 drivers/gpu/drm/i915/display/dvo_ch7017.c     | 14 ++++-----
 drivers/gpu/drm/i915/display/dvo_ch7xxx.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_ivch.c       | 16 +++++-----
 drivers/gpu/drm/i915/display/dvo_ns2501.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_sil164.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_tfp410.c     | 18 +++++------
 drivers/gpu/drm/i915/display/intel_bios.c     | 22 +++++++-------
 drivers/gpu/drm/i915/display/intel_ddi.c      |  2 +-
 .../gpu/drm/i915/display/intel_display_core.h |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi.h      |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c  | 20 ++++++-------
 drivers/gpu/drm/i915/display/intel_dvo.c      | 14 ++++-----
 drivers/gpu/drm/i915/display/intel_dvo_dev.h  |  2 +-
 drivers/gpu/drm/i915/display/intel_gmbus.c    |  4 +--
 drivers/gpu/drm/i915/display/intel_sdvo.c     | 30 +++++++++----------
 drivers/gpu/drm/i915/display/intel_vbt_defs.h |  4 +--
 drivers/gpu/drm/i915/gvt/edid.c               | 28 ++++++++---------
 drivers/gpu/drm/i915/gvt/edid.h               |  4 +--
 drivers/gpu/drm/i915/gvt/opregion.c           |  2 +-
 drivers/gpu/drm/radeon/atombios.h             | 16 +++++-----
 drivers/gpu/drm/radeon/atombios_i2c.c         |  4 +--
 drivers/gpu/drm/radeon/radeon_combios.c       | 28 ++++++++---------
 drivers/gpu/drm/radeon/radeon_i2c.c           | 10 +++----
 drivers/gpu/drm/radeon/radeon_mode.h          |  6 ++--
 drivers/media/pci/cobalt/cobalt-i2c.c         |  6 ++--
 drivers/media/pci/cx18/cx18-av-firmware.c     |  8 ++---
 drivers/media/pci/cx18/cx18-cards.c           |  6 ++--
 drivers/media/pci/cx18/cx18-cards.h           |  4 +--
 drivers/media/pci/cx18/cx18-gpio.c            |  6 ++--
 drivers/media/pci/cx23885/cx23885-core.c      |  6 ++--
 drivers/media/pci/cx23885/cx23885-f300.c      |  8 ++---
 drivers/media/pci/cx23885/cx23885-i2c.c       |  6 ++--
 drivers/media/pci/cx23885/cx23885.h           |  2 +-
 drivers/media/pci/cx25821/cx25821-core.c      |  2 +-
 drivers/media/pci/cx25821/cx25821-i2c.c       |  6 ++--
 .../media/pci/cx25821/cx25821-medusa-video.c  |  2 +-
 drivers/media/pci/cx25821/cx25821.h           |  2 +-
 drivers/media/pci/ivtv/ivtv-i2c.c             | 20 ++++++-------
 drivers/media/usb/au0828/au0828-i2c.c         |  4 +--
 drivers/media/usb/au0828/au0828-input.c       |  2 +-
 drivers/net/ethernet/sfc/falcon/falcon.c      |  2 +-
 drivers/video/fbdev/smscufx.c                 |  4 +--
 drivers/video/fbdev/via/chip.h                |  8 ++---
 drivers/video/fbdev/via/dvi.c                 | 24 +++++++--------
 drivers/video/fbdev/via/lcd.c                 |  6 ++--
 drivers/video/fbdev/via/via_aux.h             |  2 +-
 drivers/video/fbdev/via/via_i2c.c             | 12 ++++----
 drivers/video/fbdev/via/vt1636.c              |  6 ++--
 79 files changed, 321 insertions(+), 321 deletions(-)


base-commit: 3d25a941ea5013b552b96330c83052ccace73a48
-- 
2.34.1


^ permalink raw reply	[relevance 51%]

* Re: [PATCH v1 12/12] fbdev/viafb: Make I2C terminology more inclusive
  @ 2024-05-03 16:48 79%         ` Easwar Hariharan
  0 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-05-03 16:48 UTC (permalink / raw)
  To: Thomas Zimmermann, Florian Tobias Schandinat, Helge Deller,
	open list:VIA UNICHROME(PRO)/CHROME9 FRAMEBUFFER DRIVER,
	open list:FRAMEBUFFER LAYER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER

On 5/3/2024 12:39 AM, Thomas Zimmermann wrote:
> Hi
> 
> Am 03.05.24 um 00:26 schrieb Easwar Hariharan:
>> On 5/2/2024 3:46 AM, Thomas Zimmermann wrote:
>>>
>>> Am 30.04.24 um 19:38 schrieb Easwar Hariharan:
>>>> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
>>>> with more appropriate terms. Inspired by and following on to Wolfram's
>>>> series to fix drivers/i2c/[1], fix the terminology for users of
>>>> I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
>>>> in the specification.
>>>>
>>>> Compile tested, no functionality changes intended
>>>>
>>>> [1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
>>>>
>>>> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
>>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
>>>
>> Thanks for the ack! I had been addressing feedback as I got it on the v0 series, and it seems
>> I missed out on updating viafb and smscufx to spec-compliant controller/target terminology like
>> the v0->v1 changelog calls out before posting v1.
>>
>> For smscufx, I feel phrasing the following line (as an example)
>>
>>> -/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, host,
>>> +/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, *controller*,
>> would actually impact readability negatively, so I propose to leave smscufx as is.
> 
> Why? I don't see much of a difference.
> 
>>
>> For viafb, I propose making it compliant with the spec using the controller/target terminology and
>> posting a v2 respin (which I can send out as soon as you say) and ask you to review again.
>>
>> What do you think?
> 
> I think we should adopt the spec's language everywhere. That makes it possible to grep the spec for terms used in the source code. Using 'host' in smscufx appears to introduce yet another term. If you are worried about using 'I2C controller' and 'controller' in the same sentence, you can replace 'I2C controller' with 'DDC channel'. That's even more precise about the purpose of this code.

Great, thanks! That was exactly my concern. I will fix up smscufx and send a v2.

Thanks,
Easwar


^ permalink raw reply	[relevance 79%]

* Re: [PATCH] [RFC] scsi: Convert from tasklet to BH workqueue
  @ 2024-05-03 15:32 79%     ` Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-05-03 15:32 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-scsi, Linux Kernel Mailing List, linuxppc-dev,
	target-devel, megaraidlinux.pdl, jejb, hare, martin.petersen,
	linuxdrivers, tyreld, npiggin, christophe.leroy, aneesh.kumar,
	naveen.n.rao, artur.paszkiewicz, kashyap.desai, sumit.saxena,
	shivasharan.srikanteshwara, chandrakanth.patil, jinpu.wang



> On May 2, 2024, at 7:03 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Allen Pais <apais@linux.microsoft.com> writes:
>> The only generic interface to execute asynchronously in the BH context is
>> tasklet; however, it's marked deprecated and has some design flaws. To
>> replace tasklets, BH workqueue support was recently added. A BH workqueue
>> behaves similarly to regular workqueues except that the queued work items
>> are executed in the BH context.
>> 
>> This patch converts drivers/scsi/* from tasklet to BH workqueue.
>> 
>> Based on the work done by Tejun Heo <tj@kernel.org>
>> Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10
>> 
>> Signed-off-by: Allen Pais <allen.lkml@gmail.com>
>> ---
>> drivers/scsi/aic7xxx/aic7xxx_osm.c          |  2 +-
>> drivers/scsi/aic94xx/aic94xx_hwi.c          | 14 ++--
>> drivers/scsi/aic94xx/aic94xx_hwi.h          |  5 +-
>> drivers/scsi/aic94xx/aic94xx_scb.c          | 36 +++++-----
>> drivers/scsi/aic94xx/aic94xx_task.c         | 14 ++--
>> drivers/scsi/aic94xx/aic94xx_tmf.c          | 34 ++++-----
>> drivers/scsi/esas2r/esas2r.h                | 12 ++--
>> drivers/scsi/esas2r/esas2r_init.c           | 14 ++--
>> drivers/scsi/esas2r/esas2r_int.c            | 18 ++---
>> drivers/scsi/esas2r/esas2r_io.c             |  2 +-
>> drivers/scsi/esas2r/esas2r_main.c           | 16 ++---
>> drivers/scsi/ibmvscsi/ibmvfc.c              | 16 ++---
>> drivers/scsi/ibmvscsi/ibmvfc.h              |  3 +-
>> drivers/scsi/ibmvscsi/ibmvscsi.c            | 16 ++---
>> drivers/scsi/ibmvscsi/ibmvscsi.h            |  3 +-
>> drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c    | 15 ++--
>> drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h    |  3 +-
> 
> Something there is giving me a build failure (ppc64le_guest_defconfig):
> 
>  + make -s 'CC=ccache powerpc64le-linux-gnu-gcc' -j 4
>  /linux/drivers/scsi/ibmvscsi/ibmvscsi.c: In function 'ibmvscsi_init_crq_queue':
>  Error: /linux/drivers/scsi/ibmvscsi/ibmvscsi.c:370:331: error: 'ibmvscsi_work' undeclared (first use in this function)
>  /linux/drivers/scsi/ibmvscsi/ibmvscsi.c:370:331: note: each undeclared identifier is reported only once for each function it appears in
>  /linux/scripts/Makefile.build:244: recipe for target 'drivers/scsi/ibmvscsi/ibmvscsi.o' failed
>  /linux/scripts/Makefile.build:485: recipe for target 'drivers/scsi/ibmvscsi' failed
>  /linux/scripts/Makefile.build:485: recipe for target 'drivers/scsi' failed
>  /linux/scripts/Makefile.build:485: recipe for target 'drivers' failed
>  /linux/drivers/scsi/ibmvscsi/ibmvscsi.c: In function 'ibmvscsi_probe':
>  Error: /linux/drivers/scsi/ibmvscsi/ibmvscsi.c:2255:78: error: passing argument 1 of 'kthread_create_on_node' from incompatible pointer type [-Werror=incompatible-pointer-types]
>  In file included from /linux/drivers/scsi/ibmvscsi/ibmvscsi.c:56:0:
>  /linux/include/linux/kthread.h:11:21: note: expected 'int (*)(void *)' but argument is of type 'int (*)(struct work_struct *)'
>   struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
>                       ^
>  /linux/drivers/scsi/ibmvscsi/ibmvscsi.c: At top level:
>  Warning: /linux/drivers/scsi/ibmvscsi/ibmvscsi.c:212:13: warning: 'ibmvscsi_task' defined but not used [-Wunused-function]
>   static void ibmvscsi_task(void *data)
>               ^
>  Warning: cc1: warning: unrecognized command line option '-Wno-shift-negative-value'
>  Warning: cc1: warning: unrecognized command line option '-Wno-stringop-overflow'
>  cc1: some warnings being treated as errors
>  make[6]: *** [drivers/scsi/ibmvscsi/ibmvscsi.o] Error 1
>  make[5]: *** [drivers/scsi/ibmvscsi] Error 2
>  make[4]: *** [drivers/scsi] Error 2
>  make[3]: *** [drivers] Error 2
>  make[3]: *** Waiting for unfinished jobs....
> 
> Full log here: https://github.com/linuxppc/linux-snowpatch/actions/runs/8930174372/job/24529645923

 Thank you for testing it out. Unfortunately, I did not cross-compile it.
Will fix this in v2.

- Allen

> 
> Cross compile instructions if you're keen: https://github.com/linuxppc/wiki/wiki/Building-powerpc-kernels
> 
> cheers


^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next v2 0/2] Add sysfs attributes for MANA
  2024-04-30  5:31 79%   ` Shradha Gupta
@ 2024-05-03  8:48 79%     ` Shradha Gupta
  0 siblings, 0 replies; 200+ results
From: Shradha Gupta @ 2024-05-03  8:48 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Bjorn Helgaas, Jonathan Corbet, Randy Dunlap, Johannes Berg,
	Breno Leitao, linux-kernel, netdev, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Souradeep Chakrabarti, Konstantin Taranov, Yury Norov,
	linux-hyperv, shradhagupta

On Mon, Apr 29, 2024 at 10:31:38PM -0700, Shradha Gupta wrote:
> On Wed, Apr 24, 2024 at 04:48:06PM +0200, Jiri Pirko wrote:
> > Wed, Apr 24, 2024 at 12:32:54PM CEST, shradhagupta@linux.microsoft.com wrote:
> > >These patches include adding sysfs attributes for improving
> > >debuggability on MANA devices.
> > >
> > >The first patch consists of max_mtu, min_mtu attributes that are
> > >implemented generically for all devices
> > >
> > >The second patch has mana specific attributes max_num_msix and num_ports
> > 
> > 1) you implement only max, min is never implemented, no point
> > introducing it.
> Sure. I had added it for the sake of completeness.
> > 2) having driver implement sysfs entry feels *very wrong*, don't do that
> > 3) why DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MAX
> >    and DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MIN
> >    Are not what you want?
> Thanks for pointing this out. We are still evaluating if this devlink param
> could be used for our usecase where we only need a read-only msix value for VF.
> We will keep the thread updated.
The attribute that we want is the per-VF MSI-X maximum. The devlink params
above are per PF, so they would not be the right ones for our use case.
Do you have any other recommendations/suggestions around this?

Regards,
Shradha.
> > 
> > >
> > >Shradha Gupta (2):
> > >  net: Add sysfs atttributes for max_mtu min_mtu
> > >  net: mana: Add new device attributes for mana
> > >
> > > Documentation/ABI/testing/sysfs-class-net     | 16 ++++++++++
> > > .../net/ethernet/microsoft/mana/gdma_main.c   | 32 +++++++++++++++++++
> > > net/core/net-sysfs.c                          |  4 +++
> > > 3 files changed, 52 insertions(+)
> > >
> > >-- 
> > >2.34.1
> > >
> > >

^ permalink raw reply	[relevance 79%]

* [PATCH v3] fs/coredump: Enable dynamic configuration of max file note size
@ 2024-05-02 23:56 63% Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-05-02 23:56 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, linux-mm, viro, brauner, jack, ebiederm, keescook,
	mcgrof, j.granados, allen.lkml

Introduce the capability to dynamically configure the maximum file
note size for ELF core dumps via sysctl. This enhancement removes
the previous static limit of 4MB, allowing system administrators to
adjust the size based on system-specific requirements or constraints.

- Remove hardcoded `MAX_FILE_NOTE_SIZE` from `fs/binfmt_elf.c`.
- Define `core_file_note_size_max` in `fs/coredump.c` with an initial
  value set to 4MB.
- Declare `core_file_note_size_max` as an external variable in
  `include/linux/coredump.h`.
- Add a new sysctl entry in `fs/coredump.c` to manage this setting
  at runtime.

$ sysctl -a | grep core_file_note_size_max
kernel.core_file_note_size_max = 4194304

$ sysctl -n kernel.core_file_note_size_max
4194304

$ echo 519304 > /proc/sys/kernel/core_file_note_size_max

$ sysctl -n kernel.core_file_note_size_max
519304

Attempting to write beyond the ceiling value of 16MB:
$ echo 17194304 > /proc/sys/kernel/core_file_note_size_max
bash: echo: write error: Invalid argument

Why is this being done?
We have observed that during a crash when there are more than 65k mmaps
in memory, the existing fixed limit on the size of the ELF notes section
becomes a bottleneck. The notes section quickly reaches its capacity,
leading to incomplete memory segment information in the resulting coredump.
This truncation compromises the utility of the coredumps, as crucial
information about the memory state at the time of the crash might be
omitted.
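
A rough size estimate shows why the 4MB limit is reachable at this scale.
The sketch below uses the fixed-part formula from fill_files_note(),
(2 + 3 * count) * sizeof(data[0]), and adds one NUL-terminated file name
per mapping. The helper names, the 8-byte word size, and the 40-byte
average path length are assumptions for illustration only; it also
assumes every mapping is file-backed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOTE_LIMIT (4u * 1024 * 1024)	/* the previous static 4 MiB limit */

/* Fixed part of the note: two header words plus three words per mapping. */
static uint64_t nt_file_fixed_bytes(uint64_t count, size_t word_size)
{
	assert(word_size > 0);
	return (2 + 3 * count) * (uint64_t)word_size;
}

/*
 * Estimated total: fixed part plus one NUL-terminated file name per
 * mapping, each avg_path_len bytes long (a hypothetical average).
 */
static uint64_t nt_file_estimate(uint64_t count, size_t word_size,
				 size_t avg_path_len)
{
	return nt_file_fixed_bytes(count, word_size) +
	       count * (uint64_t)(avg_path_len + 1);
}
```

Under these assumptions, 65536 mappings need 1572880 bytes for the fixed
part alone and 4259856 bytes in total, which is already past the
4194304-byte limit.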

Signed-off-by: Vijay Nag <nagvijay@microsoft.com>
Signed-off-by: Allen Pais <apais@linux.microsoft.com>

---
Changes in v3:
   - Fix commit message to reflect the correct sysctl knob [Kees]
   - Add a ceiling for maximum possible note size (16M) [Allen]
   - Add a pr_warn_once() [Kees]
Changes in v2:
   - Move new sysctl to fs/coredump.c [Luis & Kees]
   - Rename max_file_note_size to core_file_note_size_max [Kees]
   - Capture "why this is being done?" in the commit message [Luis & Kees]
---
 fs/binfmt_elf.c          |  8 ++++++--
 fs/coredump.c            | 15 +++++++++++++++
 include/linux/coredump.h |  1 +
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5397b552fbeb..5294f8f3a9a8 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1564,7 +1564,6 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
 	fill_note(note, "CORE", NT_SIGINFO, sizeof(*csigdata), csigdata);
 }
 
-#define MAX_FILE_NOTE_SIZE (4*1024*1024)
 /*
  * Format of NT_FILE note:
  *
@@ -1592,8 +1591,13 @@ static int fill_files_note(struct memelfnote *note, struct coredump_params *cprm
 
 	names_ofs = (2 + 3 * count) * sizeof(data[0]);
  alloc:
-	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
+	/* paranoia check */
+	if (size >= core_file_note_size_max) {
+		pr_warn_once("coredump Note size too large: %u "
+		"(does kernel.core_file_note_size_max sysctl need adjustment?)\n",
+		size);
 		return -EINVAL;
+	}
 	size = round_up(size, PAGE_SIZE);
 	/*
 	 * "size" can be 0 here legitimately.
diff --git a/fs/coredump.c b/fs/coredump.c
index be6403b4b14b..ffaed8c1b3b0 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -56,10 +56,16 @@
 static bool dump_vma_snapshot(struct coredump_params *cprm);
 static void free_vma_snapshot(struct coredump_params *cprm);
 
+#define MAX_FILE_NOTE_SIZE (4*1024*1024)
+/* Define a reasonable max cap */
+#define MAX_ALLOWED_NOTE_SIZE (16*1024*1024)
+
 static int core_uses_pid;
 static unsigned int core_pipe_limit;
 static char core_pattern[CORENAME_MAX_SIZE] = "core";
 static int core_name_size = CORENAME_MAX_SIZE;
+unsigned int core_file_note_size_max = MAX_FILE_NOTE_SIZE;
+unsigned int core_file_note_size_allowed = MAX_ALLOWED_NOTE_SIZE;
 
 struct core_name {
 	char *corename;
@@ -1020,6 +1026,15 @@ static struct ctl_table coredump_sysctls[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname       = "core_file_note_size_max",
+		.data           = &core_file_note_size_max,
+		.maxlen         = sizeof(unsigned int),
+		.mode           = 0644,
+		.proc_handler	= proc_douintvec_minmax,
+		.extra1		= &core_file_note_size_max,
+		.extra2		= &core_file_note_size_allowed,
+	},
 };
 
 static int __init init_fs_coredump_sysctls(void)
diff --git a/include/linux/coredump.h b/include/linux/coredump.h
index d3eba4360150..14c057643e7f 100644
--- a/include/linux/coredump.h
+++ b/include/linux/coredump.h
@@ -46,6 +46,7 @@ static inline void do_coredump(const kernel_siginfo_t *siginfo) {}
 #endif
 
 #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL)
+extern unsigned int core_file_note_size_max;
 extern void validate_coredump_safety(void);
 #else
 static inline void validate_coredump_safety(void) {}
-- 
2.17.1


^ permalink raw reply related	[relevance 63%]

* Re: [PATCH v2 1/2] tracing/user_events: Fix non-spaced field matching
  @ 2024-05-02 22:58 79%     ` Beau Belgrave
  0 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-05-02 22:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: mhiramat, mathieu.desnoyers, linux-kernel, linux-trace-kernel, dcook

On Thu, May 02, 2024 at 05:16:34PM -0400, Steven Rostedt wrote:
> On Tue, 23 Apr 2024 16:23:37 +0000
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
> 
> > When the ABI was updated to prevent same name w/different args, it
> > missed an important corner case when fields don't end with a space.
> > Typically, space is used for fields to help separate them, like
> > "u8 field1; u8 field2". If no spaces are used, like
> > "u8 field1;u8 field2", then the parsing works for the first time.
> > However, the match check fails on a subsequent register, leading to
> > confusion.
> > 
> > This is because the match check uses argv_split() and assumes that all
> > fields will be split upon the space. When spaces are used, we get back
> > { "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }.
> > This causes a mismatch, and the user program gets back -EADDRINUSE.
> > 
> > Add a method to detect this case before calling argv_split(). If found
> > force a space after the field separator character ';'. This ensures all
> > cases work properly for matching.
> > 
> > With this fix, the following are all treated as matching:
> > u8 field1;u8 field2
> > u8 field1; u8 field2
> > u8 field1;\tu8 field2
> > u8 field1;\nu8 field2
> 
> I'm curious, what happens if you have: "u8 field1; u8 field2;" ?
> 

You'll get an extra whitespace during the copy, assuming it was really:
"u8 field1;u8 field2"

If it had spaces, this code wouldn't run.

> Do you care? As you will then create "u8 field1; u8 field2; "
> 
> but I'm guessing the extra whitespace at the end doesn't affect anything.
> 

Right, you get an extra byte allocated, but argv_split() will ignore
it. The compare will work correctly (I've verified this just now to
double check).

I.e., these all match:
"Test u8 a; u8 b; "
"Test u8 a; u8 b;"
"Test u8 a; u8 b"

> 
> > 
> > Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event")
> > Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
> > ---
> >  kernel/trace/trace_events_user.c | 76 +++++++++++++++++++++++++++++++-
> >  1 file changed, 75 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
> > index 70d428c394b6..82b191f33a28 100644
> > --- a/kernel/trace/trace_events_user.c
> > +++ b/kernel/trace/trace_events_user.c
> > @@ -1989,6 +1989,80 @@ static int user_event_set_tp_name(struct user_event *user)
> >  	return 0;
> >  }
> >  
> > +/*
> > + * Counts how many ';' without a trailing space are in the args.
> > + */
> > +static int count_semis_no_space(char *args)
> > +{
> > +	int count = 0;
> > +
> > +	while ((args = strchr(args, ';'))) {
> > +		args++;
> > +
> > +		if (!isspace(*args))
> > +			count++;
> 
> This will count that "..;" 
> 
> This is most likely not an issue, but since I didn't see this case
> anywhere, I figured I bring it up just to confirm that it's not an issue.
> 

It's not an issue on the matching/logic. However, you do get an extra
byte alloc (which doesn't bother me in this edge case).
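
For reference, a minimal userspace sketch of the two helpers from the quoted
patch. It assumes malloc() in place of kmalloc() and the libc isspace(), and
adds const qualifiers; it is only meant to illustrate the trailing ';' case
discussed here, not to stand in for the kernel code:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Count every ';' not followed by whitespace; a trailing ';' counts too. */
static int count_semis_no_space(const char *args)
{
	int count = 0;

	assert(args != NULL);

	while ((args = strchr(args, ';'))) {
		args++;

		if (!isspace((unsigned char)*args))
			count++;
	}

	return count;
}

/* Copy args, adding a space after each ';' that lacks one. Caller frees. */
static char *insert_space_after_semis(const char *args, int count)
{
	size_t len = strlen(args) + (size_t)count;
	char *fixed = malloc(len + 1);	/* userspace stand-in for kmalloc() */
	char *pos = fixed;

	if (!fixed)
		return NULL;

	while (*args) {
		*pos = *args++;

		if (*pos++ == ';' && !isspace((unsigned char)*args))
			*pos++ = ' ';
	}

	*pos = '\0';

	return fixed;
}
```

With this sketch, "u8 a; u8 b;" gives a count of 1 (only the trailing ';')
and is copied as "u8 a; u8 b; ", one byte longer than the input.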

Thanks,
-Beau

> -- Steve
> 
> 
> > +	}
> > +
> > +	return count;
> > +}
> > +
> > +/*
> > + * Copies the arguments while ensuring all ';' have a trailing space.
> > + */
> > +static char *insert_space_after_semis(char *args, int count)
> > +{
> > +	char *fixed, *pos;
> > +	int len;
> > +
> > +	len = strlen(args) + count;
> > +	fixed = kmalloc(len + 1, GFP_KERNEL);
> > +
> > +	if (!fixed)
> > +		return NULL;
> > +
> > +	pos = fixed;
> > +
> > +	/* Insert a space after ';' if there is no trailing space. */
> > +	while (*args) {
> > +		*pos = *args++;
> > +
> > +		if (*pos++ == ';' && !isspace(*args))
> > +			*pos++ = ' ';
> > +	}
> > +
> > +	*pos = '\0';
> > +
> > +	return fixed;
> > +}
> > +

^ permalink raw reply	[relevance 79%]

* Re: [PATCH v1 12/12] fbdev/viafb: Make I2C terminology more inclusive
  @ 2024-05-02 22:26 74%     ` Easwar Hariharan
    0 siblings, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-05-02 22:26 UTC (permalink / raw)
  To: Thomas Zimmermann, Florian Tobias Schandinat, Helge Deller,
	open list:VIA UNICHROME(PRO)/CHROME9 FRAMEBUFFER DRIVER,
	open list:FRAMEBUFFER LAYER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER

On 5/2/2024 3:46 AM, Thomas Zimmermann wrote:
> 
> 
> Am 30.04.24 um 19:38 schrieb Easwar Hariharan:
>> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
>> with more appropriate terms. Inspired by and following on to Wolfram's
>> series to fix drivers/i2c/[1], fix the terminology for users of
>> I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
>> in the specification.
>>
>> Compile tested, no functionality changes intended
>>
>> [1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
>>
>> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
> 
> Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
> 

Thanks for the ack! I had been addressing feedback on the v0 series as I got it, and it seems
I missed updating viafb and smscufx to the spec-compliant controller/target terminology, as
the v0->v1 changelog calls out, before posting v1.

For smscufx, I feel phrasing the following line (as an example)

> -/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, host, 
> +/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, *controller*, 

would actually impact readability negatively, so I propose to leave smscufx as is.

For viafb, I propose making it compliant with the spec using the controller/target terminology and
posting a v2 respin (which I can send out as soon as you say) and ask you to review again.

What do you think?

Thanks,
Easwar

>> ---
>>   drivers/video/fbdev/via/chip.h    |  8 ++++----
>>   drivers/video/fbdev/via/dvi.c     | 24 ++++++++++++------------
>>   drivers/video/fbdev/via/lcd.c     |  6 +++---
>>   drivers/video/fbdev/via/via_aux.h |  2 +-
>>   drivers/video/fbdev/via/via_i2c.c | 12 ++++++------
>>   drivers/video/fbdev/via/vt1636.c  |  6 +++---
>>   6 files changed, 29 insertions(+), 29 deletions(-)
>>

<snip>

^ permalink raw reply	[relevance 74%]

* [PATCH 0/1] Convert tasklets to bottom half workqueues
@ 2024-05-02 20:34 63% Allen Pais
  2024-05-02 20:34 14% ` [PATCH] [RFC] scsi: Convert from tasklet to BH workqueue Allen Pais
  0 siblings, 1 reply; 200+ results
From: Allen Pais @ 2024-05-02 20:34 UTC (permalink / raw)
  To: linux-scsi
  Cc: linux-kernel, linuxppc-dev, target-devel, megaraidlinux.pdl,
	jejb, hare, martin.petersen, linuxdrivers, tyreld, mpe, npiggin,
	christophe.leroy, aneesh.kumar, naveen.n.rao, artur.paszkiewicz,
	kashyap.desai, sumit.saxena, shivasharan.srikanteshwara,
	chandrakanth.patil, jinpu.wang

I am submitting this patch, which converts instances of tasklets
in drivers/scsi/* to bottom half workqueues. I would appreciate your
feedback and suggestions on the changes.

Note: The patch is only compile tested.

In the patchset, you will notice *FIXME* in two places:
1. pm8001/pm8001_init.c @ pm8001_work(struct work_struct *t)
2. pmcraid.c @ pmcraid_work_function(struct work_struct *t)

The current implementation limits context-aware processing
within work functions due to the lack of a mechanism to identify
the source work_struct in the array. The proposed solution wraps
each work_struct with a struct work_wrapper, adding crucial context
like the array index and a reference to the parent data structure.

Ex:

#define SOME_CONSTANT 10
struct xxx_data {

.....
struct work_struct work[SOME_CONSTANT];
.....
};

The xxx_data module currently uses an array of work_structs
for scheduling work, but it lacks the ability to identify which
array element is associated with a specific invocation of the work
function. This limitation prevents the execution of context-specific
actions based on the source of the work request.

The proposed solution is to introduce a struct work_wrapper that
encapsulates each work_struct along with additional metadata,
including an index and a pointer to the parent xxx_data structure.
This enhancement allows the work function to access necessary
context information.

Changes:

1. Definition of struct work_wrapper:

struct work_wrapper {
    struct work_struct work;
    struct xxx_data *data;
    int index;
};

struct xxx_data {
    struct work_wrapper work[SOME_CONSTANT];
};

During initialization:

for (int i = 0; i < SOME_CONSTANT; i++) {
    p->work[i].data = p;
    p->work[i].index = i;
    INIT_WORK(&p->work[i].work, work_func);
}

And its usage in the handler:

void work_func(struct work_struct *t)
{
    struct work_wrapper *wrapper = from_work(wrapper, t, work);
    struct xxx_data *a = wrapper->data;
    int index = wrapper->index;

    ....
}

If the above solution is acceptable, I can incorporate it in
version 2.
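
The wrapper lookup described above can be tried outside the kernel. The
sketch below is a userspace model, not driver code: struct work_struct,
container_of(), and the func pointer are stand-ins for the kernel's workqueue
API (assigning func stands in for INIT_WORK()), and last_index is a
hypothetical field added only to make the result observable. In the kernel,
from_work() should resolve to container_of() in the same way.

```c
#include <assert.h>
#include <stddef.h>

#define SOME_CONSTANT 10

/* Userspace stand-ins for the kernel's work_struct and container_of(). */
struct work_struct;
typedef void (*work_func_t)(struct work_struct *);

struct work_struct {
	work_func_t func;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct xxx_data;

/* Each array element carries its own index and a pointer to its parent. */
struct work_wrapper {
	struct work_struct work;
	struct xxx_data *data;
	int index;
};

struct xxx_data {
	struct work_wrapper work[SOME_CONSTANT];
	int last_index;	/* hypothetical: records which element last ran */
};

/* The handler recovers the wrapper, and through it the parent and index. */
static void work_func(struct work_struct *t)
{
	struct work_wrapper *wrapper =
		container_of(t, struct work_wrapper, work);

	assert(wrapper->data != NULL);
	wrapper->data->last_index = wrapper->index;
}

static void xxx_init(struct xxx_data *p)
{
	for (int i = 0; i < SOME_CONSTANT; i++) {
		p->work[i].data = p;
		p->work[i].index = i;
		p->work[i].work.func = work_func; /* stands in for INIT_WORK() */
	}
	p->last_index = -1;
}
```

Invoking the handler on &d.work[7].work after xxx_init(&d) leaves
d.last_index at 7, which is the context a bare work_struct array cannot
provide.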

Thanks.

Allen Pais (1):
  [RFC] scsi: Convert from tasklet to BH workqueue

 drivers/scsi/aic7xxx/aic7xxx_osm.c          |  2 +-
 drivers/scsi/aic94xx/aic94xx_hwi.c          | 14 ++--
 drivers/scsi/aic94xx/aic94xx_hwi.h          |  5 +-
 drivers/scsi/aic94xx/aic94xx_scb.c          | 36 +++++-----
 drivers/scsi/aic94xx/aic94xx_task.c         | 14 ++--
 drivers/scsi/aic94xx/aic94xx_tmf.c          | 34 +++++-----
 drivers/scsi/esas2r/esas2r.h                | 12 ++--
 drivers/scsi/esas2r/esas2r_init.c           | 14 ++--
 drivers/scsi/esas2r/esas2r_int.c            | 18 ++---
 drivers/scsi/esas2r/esas2r_io.c             |  2 +-
 drivers/scsi/esas2r/esas2r_main.c           | 16 ++---
 drivers/scsi/ibmvscsi/ibmvfc.c              | 16 ++---
 drivers/scsi/ibmvscsi/ibmvfc.h              |  3 +-
 drivers/scsi/ibmvscsi/ibmvscsi.c            | 16 ++---
 drivers/scsi/ibmvscsi/ibmvscsi.h            |  3 +-
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c    | 15 ++---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h    |  3 +-
 drivers/scsi/isci/host.c                    | 12 ++--
 drivers/scsi/isci/host.h                    |  8 +--
 drivers/scsi/isci/init.c                    |  4 +-
 drivers/scsi/megaraid/mega_common.h         |  5 +-
 drivers/scsi/megaraid/megaraid_mbox.c       | 21 +++---
 drivers/scsi/megaraid/megaraid_sas.h        |  4 +-
 drivers/scsi/megaraid/megaraid_sas_base.c   | 32 +++++----
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 16 ++---
 drivers/scsi/mvsas/mv_init.c                | 27 ++++----
 drivers/scsi/mvsas/mv_sas.h                 |  9 +--
 drivers/scsi/pm8001/pm8001_init.c           | 57 ++++++++--------
 drivers/scsi/pm8001/pm8001_sas.h            |  2 +-
 drivers/scsi/pmcraid.c                      | 75 ++++++++++-----------
 drivers/scsi/pmcraid.h                      |  5 +-
 31 files changed, 249 insertions(+), 251 deletions(-)

-- 
2.17.1


^ permalink raw reply	[relevance 63%]

* [PATCH] [RFC] scsi: Convert from tasklet to BH workqueue
  2024-05-02 20:34 63% [PATCH 0/1] Convert tasklets to bottom half workqueues Allen Pais
@ 2024-05-02 20:34 14% ` Allen Pais
    0 siblings, 1 reply; 200+ results
From: Allen Pais @ 2024-05-02 20:34 UTC (permalink / raw)
  To: linux-scsi
  Cc: linux-kernel, linuxppc-dev, target-devel, megaraidlinux.pdl,
	jejb, hare, martin.petersen, linuxdrivers, tyreld, mpe, npiggin,
	christophe.leroy, aneesh.kumar, naveen.n.rao, artur.paszkiewicz,
	kashyap.desai, sumit.saxena, shivasharan.srikanteshwara,
	chandrakanth.patil, jinpu.wang

The only generic interface to execute asynchronously in the BH context is
tasklet; however, it's marked deprecated and has some design flaws. To
replace tasklets, BH workqueue support was recently added. A BH workqueue
behaves similarly to regular workqueues except that the queued work items
are executed in the BH context.

This patch converts drivers/scsi/* from tasklet to BH workqueue.

Based on the work done by Tejun Heo <tj@kernel.org>
Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
---
 drivers/scsi/aic7xxx/aic7xxx_osm.c          |  2 +-
 drivers/scsi/aic94xx/aic94xx_hwi.c          | 14 ++--
 drivers/scsi/aic94xx/aic94xx_hwi.h          |  5 +-
 drivers/scsi/aic94xx/aic94xx_scb.c          | 36 +++++-----
 drivers/scsi/aic94xx/aic94xx_task.c         | 14 ++--
 drivers/scsi/aic94xx/aic94xx_tmf.c          | 34 ++++-----
 drivers/scsi/esas2r/esas2r.h                | 12 ++--
 drivers/scsi/esas2r/esas2r_init.c           | 14 ++--
 drivers/scsi/esas2r/esas2r_int.c            | 18 ++---
 drivers/scsi/esas2r/esas2r_io.c             |  2 +-
 drivers/scsi/esas2r/esas2r_main.c           | 16 ++---
 drivers/scsi/ibmvscsi/ibmvfc.c              | 16 ++---
 drivers/scsi/ibmvscsi/ibmvfc.h              |  3 +-
 drivers/scsi/ibmvscsi/ibmvscsi.c            | 16 ++---
 drivers/scsi/ibmvscsi/ibmvscsi.h            |  3 +-
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c    | 15 ++--
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h    |  3 +-
 drivers/scsi/isci/host.c                    | 12 ++--
 drivers/scsi/isci/host.h                    |  8 +--
 drivers/scsi/isci/init.c                    |  4 +-
 drivers/scsi/megaraid/mega_common.h         |  5 +-
 drivers/scsi/megaraid/megaraid_mbox.c       | 21 +++---
 drivers/scsi/megaraid/megaraid_sas.h        |  4 +-
 drivers/scsi/megaraid/megaraid_sas_base.c   | 32 ++++-----
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 16 ++---
 drivers/scsi/mvsas/mv_init.c                | 27 ++++---
 drivers/scsi/mvsas/mv_sas.h                 |  9 +--
 drivers/scsi/pm8001/pm8001_init.c           | 55 ++++++++-------
 drivers/scsi/pm8001/pm8001_sas.h            |  2 +-
 drivers/scsi/pmcraid.c                      | 78 +++++++++++----------
 drivers/scsi/pmcraid.h                      |  5 +-
 31 files changed, 251 insertions(+), 250 deletions(-)

diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c b/drivers/scsi/aic7xxx/aic7xxx_osm.c
index b0c4f2345321..42f76391f589 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
@@ -797,7 +797,7 @@ struct scsi_host_template aic7xxx_driver_template = {
 	.target_destroy		= ahc_linux_target_destroy,
 };
 
-/**************************** Tasklet Handler *********************************/
+/**************************** Work Handler *********************************/
 
 
 static inline unsigned int ahc_build_scsiid(struct ahc_softc *ahc,
diff --git a/drivers/scsi/aic94xx/aic94xx_hwi.c b/drivers/scsi/aic94xx/aic94xx_hwi.c
index 9dda296c0152..b08f0231e562 100644
--- a/drivers/scsi/aic94xx/aic94xx_hwi.c
+++ b/drivers/scsi/aic94xx/aic94xx_hwi.c
@@ -246,7 +246,7 @@ static void asd_get_max_scb_ddb(struct asd_ha_struct *asd_ha)
 
 /* ---------- Done List initialization ---------- */
 
-static void asd_dl_tasklet_handler(unsigned long);
+static void asd_dl_work_handler(struct work_struct *);
 
 static int asd_init_dl(struct asd_ha_struct *asd_ha)
 {
@@ -259,8 +259,7 @@ static int asd_init_dl(struct asd_ha_struct *asd_ha)
 	asd_ha->seq.dl = asd_ha->seq.actual_dl->vaddr;
 	asd_ha->seq.dl_toggle = ASD_DEF_DL_TOGGLE;
 	asd_ha->seq.dl_next = 0;
-	tasklet_init(&asd_ha->seq.dl_tasklet, asd_dl_tasklet_handler,
-		     (unsigned long) asd_ha);
+	INIT_WORK(&asd_ha->seq.dl_work, asd_dl_work_handler);
 
 	return 0;
 }
@@ -709,10 +708,9 @@ static void asd_chip_reset(struct asd_ha_struct *asd_ha)
 
 /* ---------- Done List Routines ---------- */
 
-static void asd_dl_tasklet_handler(unsigned long data)
+static void asd_dl_work_handler(struct work_struct *t)
 {
-	struct asd_ha_struct *asd_ha = (struct asd_ha_struct *) data;
-	struct asd_seq_data *seq = &asd_ha->seq;
+	struct asd_seq_data *seq = from_work(seq, t, dl_work);
 	unsigned long flags;
 
 	while (1) {
@@ -739,7 +737,7 @@ static void asd_dl_tasklet_handler(unsigned long data)
 		seq->pending--;
 		spin_unlock_irqrestore(&seq->pend_q_lock, flags);
 	out:
-		ascb->tasklet_complete(ascb, dl);
+		ascb->work_complete(ascb, dl);
 
 	next_1:
 		seq->dl_next = (seq->dl_next + 1) & (ASD_DL_SIZE-1);
@@ -756,7 +754,7 @@ static void asd_dl_tasklet_handler(unsigned long data)
  */
 static void asd_process_donelist_isr(struct asd_ha_struct *asd_ha)
 {
-	tasklet_schedule(&asd_ha->seq.dl_tasklet);
+	queue_work(system_bh_wq, &asd_ha->seq.dl_work);
 }
 
 /**
diff --git a/drivers/scsi/aic94xx/aic94xx_hwi.h b/drivers/scsi/aic94xx/aic94xx_hwi.h
index 930e192b1cd4..2cc6fb7aa1a7 100644
--- a/drivers/scsi/aic94xx/aic94xx_hwi.h
+++ b/drivers/scsi/aic94xx/aic94xx_hwi.h
@@ -12,6 +12,7 @@
 #include <linux/interrupt.h>
 #include <linux/pci.h>
 #include <linux/dma-mapping.h>
+#include <linux/workqueue.h>
 
 #include <scsi/libsas.h>
 
@@ -117,7 +118,7 @@ struct asd_ascb {
 	struct asd_dma_tok dma_scb;
 	struct asd_dma_tok *sg_arr;
 
-	void (*tasklet_complete)(struct asd_ascb *, struct done_list_struct *);
+	void (*work_complete)(struct asd_ascb *, struct done_list_struct *);
 	u8     uldd_timer:1;
 
 	/* internally generated command */
@@ -152,7 +153,7 @@ struct asd_seq_data {
 	void *tc_index_bitmap;
 	int   tc_index_bitmap_bits;
 
-	struct tasklet_struct dl_tasklet;
+	struct work_struct dl_work;
 	struct done_list_struct *dl; /* array of done list entries, equals */
 	struct asd_dma_tok *actual_dl; /* actual_dl->vaddr */
 	int    dl_toggle;
diff --git a/drivers/scsi/aic94xx/aic94xx_scb.c b/drivers/scsi/aic94xx/aic94xx_scb.c
index 68214a58b160..256800811553 100644
--- a/drivers/scsi/aic94xx/aic94xx_scb.c
+++ b/drivers/scsi/aic94xx/aic94xx_scb.c
@@ -64,7 +64,7 @@ static void get_lrate_mode(struct asd_phy *phy, u8 oob_mode)
 		phy->sas_phy.oob_mode = SATA_OOB_MODE;
 }
 
-static void asd_phy_event_tasklet(struct asd_ascb *ascb,
+static void asd_phy_event_work(struct asd_ascb *ascb,
 					 struct done_list_struct *dl)
 {
 	struct asd_ha_struct *asd_ha = ascb->ha;
@@ -215,7 +215,7 @@ static void asd_deform_port(struct asd_ha_struct *asd_ha, struct asd_phy *phy)
 	spin_unlock_irqrestore(&asd_ha->asd_ports_lock, flags);
 }
 
-static void asd_bytes_dmaed_tasklet(struct asd_ascb *ascb,
+static void asd_bytes_dmaed_work(struct asd_ascb *ascb,
 				    struct done_list_struct *dl,
 				    int edb_id, int phy_id)
 {
@@ -237,7 +237,7 @@ static void asd_bytes_dmaed_tasklet(struct asd_ascb *ascb,
 	sas_notify_port_event(&phy->sas_phy, PORTE_BYTES_DMAED, GFP_ATOMIC);
 }
 
-static void asd_link_reset_err_tasklet(struct asd_ascb *ascb,
+static void asd_link_reset_err_work(struct asd_ascb *ascb,
 				       struct done_list_struct *dl,
 				       int phy_id)
 {
@@ -290,7 +290,7 @@ static void asd_link_reset_err_tasklet(struct asd_ascb *ascb,
 	;
 }
 
-static void asd_primitive_rcvd_tasklet(struct asd_ascb *ascb,
+static void asd_primitive_rcvd_work(struct asd_ascb *ascb,
 				       struct done_list_struct *dl,
 				       int phy_id)
 {
@@ -361,7 +361,7 @@ static void asd_primitive_rcvd_tasklet(struct asd_ascb *ascb,
  *
  * After an EDB has been invalidated, if all EDBs in this ESCB have been
  * invalidated, the ESCB is posted back to the sequencer.
- * Context is tasklet/IRQ.
+ * Context is BH work/IRQ.
  */
 void asd_invalidate_edb(struct asd_ascb *ascb, int edb_id)
 {
@@ -396,7 +396,7 @@ void asd_invalidate_edb(struct asd_ascb *ascb, int edb_id)
 	}
 }
 
-static void escb_tasklet_complete(struct asd_ascb *ascb,
+static void escb_work_complete(struct asd_ascb *ascb,
 				  struct done_list_struct *dl)
 {
 	struct asd_ha_struct *asd_ha = ascb->ha;
@@ -546,21 +546,21 @@ static void escb_tasklet_complete(struct asd_ascb *ascb,
 	switch (sb_opcode) {
 	case BYTES_DMAED:
 		ASD_DPRINTK("%s: phy%d: BYTES_DMAED\n", __func__, phy_id);
-		asd_bytes_dmaed_tasklet(ascb, dl, edb, phy_id);
+		asd_bytes_dmaed_work(ascb, dl, edb, phy_id);
 		break;
 	case PRIMITIVE_RECVD:
 		ASD_DPRINTK("%s: phy%d: PRIMITIVE_RECVD\n", __func__,
 			    phy_id);
-		asd_primitive_rcvd_tasklet(ascb, dl, phy_id);
+		asd_primitive_rcvd_work(ascb, dl, phy_id);
 		break;
 	case PHY_EVENT:
 		ASD_DPRINTK("%s: phy%d: PHY_EVENT\n", __func__, phy_id);
-		asd_phy_event_tasklet(ascb, dl);
+		asd_phy_event_work(ascb, dl);
 		break;
 	case LINK_RESET_ERROR:
 		ASD_DPRINTK("%s: phy%d: LINK_RESET_ERROR\n", __func__,
 			    phy_id);
-		asd_link_reset_err_tasklet(ascb, dl, phy_id);
+		asd_link_reset_err_work(ascb, dl, phy_id);
 		break;
 	case TIMER_EVENT:
 		ASD_DPRINTK("%s: phy%d: TIMER_EVENT, lost dw sync\n",
@@ -600,7 +600,7 @@ int asd_init_post_escbs(struct asd_ha_struct *asd_ha)
 	int i;
 
 	for (i = 0; i < seq->num_escbs; i++)
-		seq->escb_arr[i]->tasklet_complete = escb_tasklet_complete;
+		seq->escb_arr[i]->work_complete = escb_work_complete;
 
 	ASD_DPRINTK("posting %d escbs\n", i);
 	return asd_post_escb_list(asd_ha, seq->escb_arr[0], seq->num_escbs);
@@ -613,7 +613,7 @@ int asd_init_post_escbs(struct asd_ha_struct *asd_ha)
 			    | CURRENT_OOB_ERROR)
 
 /**
- * control_phy_tasklet_complete -- tasklet complete for CONTROL PHY ascb
+ * control_phy_work_complete -- BH work complete for CONTROL PHY ascb
  * @ascb: pointer to an ascb
  * @dl: pointer to the done list entry
  *
@@ -623,7 +623,7 @@ int asd_init_post_escbs(struct asd_ha_struct *asd_ha)
  *  - if a device is connected to the LED, it is lit,
  *  - if no device is connected to the LED, is is dimmed (off).
  */
-static void control_phy_tasklet_complete(struct asd_ascb *ascb,
+static void control_phy_work_complete(struct asd_ascb *ascb,
 					 struct done_list_struct *dl)
 {
 	struct asd_ha_struct *asd_ha = ascb->ha;
@@ -758,9 +758,9 @@ static void set_speed_mask(u8 *speed_mask, struct asd_phy_desc *pd)
  *
  * This function builds a CONTROL PHY scb.  No allocation of any kind
  * is performed. @ascb is allocated with the list function.
- * The caller can override the ascb->tasklet_complete to point
+ * The caller can override the ascb->work_complete to point
  * to its own callback function.  It must call asd_ascb_free()
- * at its tasklet complete function.
+ * at its BH work complete function.
  * See the default implementation.
  */
 void asd_build_control_phy(struct asd_ascb *ascb, int phy_id, u8 subfunc)
@@ -806,14 +806,14 @@ void asd_build_control_phy(struct asd_ascb *ascb, int phy_id, u8 subfunc)
 
 	control_phy->conn_handle = cpu_to_le16(0xFFFF);
 
-	ascb->tasklet_complete = control_phy_tasklet_complete;
+	ascb->work_complete = control_phy_work_complete;
 }
 
 /* ---------- INITIATE LINK ADM TASK ---------- */
 
 #if 0
 
-static void link_adm_tasklet_complete(struct asd_ascb *ascb,
+static void link_adm_work_complete(struct asd_ascb *ascb,
 				      struct done_list_struct *dl)
 {
 	u8 opcode = dl->opcode;
@@ -842,7 +842,7 @@ void asd_build_initiate_link_adm_task(struct asd_ascb *ascb, int phy_id,
 	link_adm->sub_func = subfunc;
 	link_adm->conn_handle = cpu_to_le16(0xFFFF);
 
-	ascb->tasklet_complete = link_adm_tasklet_complete;
+	ascb->work_complete = link_adm_work_complete;
 }
 
 #endif  /*  0  */
diff --git a/drivers/scsi/aic94xx/aic94xx_task.c b/drivers/scsi/aic94xx/aic94xx_task.c
index 4bfd03724ad6..2e1e30ba5555 100644
--- a/drivers/scsi/aic94xx/aic94xx_task.c
+++ b/drivers/scsi/aic94xx/aic94xx_task.c
@@ -138,9 +138,9 @@ static void asd_unmap_scatterlist(struct asd_ascb *ascb)
 			     task->num_scatter, task->data_dir);
 }
 
-/* ---------- Task complete tasklet ---------- */
+/* ---------- Task complete BH work ---------- */
 
-static void asd_get_response_tasklet(struct asd_ascb *ascb,
+static void asd_get_response_work(struct asd_ascb *ascb,
 				     struct done_list_struct *dl)
 {
 	struct asd_ha_struct *asd_ha = ascb->ha;
@@ -194,7 +194,7 @@ static void asd_get_response_tasklet(struct asd_ascb *ascb,
 	asd_invalidate_edb(escb, edb_id);
 }
 
-static void asd_task_tasklet_complete(struct asd_ascb *ascb,
+static void asd_task_work_complete(struct asd_ascb *ascb,
 				      struct done_list_struct *dl)
 {
 	struct sas_task *task = ascb->uldd_task;
@@ -224,7 +224,7 @@ static void asd_task_tasklet_complete(struct asd_ascb *ascb,
 	case TC_ATA_RESP:
 		ts->resp = SAS_TASK_COMPLETE;
 		ts->stat = SAS_PROTO_RESPONSE;
-		asd_get_response_tasklet(ascb, dl);
+		asd_get_response_work(ascb, dl);
 		break;
 	case TF_OPEN_REJECT:
 		ts->resp = SAS_TASK_UNDELIVERED;
@@ -392,7 +392,7 @@ static int asd_build_ata_ascb(struct asd_ascb *ascb, struct sas_task *task,
 
 		scb->ata_task.flags = 0;
 	}
-	ascb->tasklet_complete = asd_task_tasklet_complete;
+	ascb->work_complete = asd_task_work_complete;
 
 	if (likely(!task->ata_task.device_control_reg_update))
 		res = asd_map_scatterlist(task, scb->ata_task.sg_element,
@@ -440,7 +440,7 @@ static int asd_build_smp_ascb(struct asd_ascb *ascb, struct sas_task *task,
 	scb->smp_task.conn_handle = cpu_to_le16((u16)
 						(unsigned long)dev->lldd_dev);
 
-	ascb->tasklet_complete = asd_task_tasklet_complete;
+	ascb->work_complete = asd_task_work_complete;
 
 	return 0;
 }
@@ -490,7 +490,7 @@ static int asd_build_ssp_ascb(struct asd_ascb *ascb, struct sas_task *task,
 	scb->ssp_task.data_dir = data_dir_flags[task->data_dir];
 	scb->ssp_task.retry_count = scb->ssp_task.retry_count;
 
-	ascb->tasklet_complete = asd_task_tasklet_complete;
+	ascb->work_complete = asd_task_work_complete;
 
 	res = asd_map_scatterlist(task, scb->ssp_task.sg_element, gfp_flags);
 
diff --git a/drivers/scsi/aic94xx/aic94xx_tmf.c b/drivers/scsi/aic94xx/aic94xx_tmf.c
index 27d32b8c2987..5eb0cc57ed2a 100644
--- a/drivers/scsi/aic94xx/aic94xx_tmf.c
+++ b/drivers/scsi/aic94xx/aic94xx_tmf.c
@@ -15,13 +15,13 @@
 /* ---------- Internal enqueue ---------- */
 
 static int asd_enqueue_internal(struct asd_ascb *ascb,
-		void (*tasklet_complete)(struct asd_ascb *,
+		void (*work_complete)(struct asd_ascb *,
 					 struct done_list_struct *),
 				void (*timed_out)(struct timer_list *t))
 {
 	int res;
 
-	ascb->tasklet_complete = tasklet_complete;
+	ascb->work_complete = work_complete;
 	ascb->uldd_timer = 1;
 
 	ascb->timer.function = timed_out;
@@ -37,7 +37,7 @@ static int asd_enqueue_internal(struct asd_ascb *ascb,
 
 /* ---------- CLEAR NEXUS ---------- */
 
-struct tasklet_completion_status {
+struct work_completion_status {
 	int	dl_opcode;
 	int	tmf_state;
 	u8	tag_valid:1;
@@ -45,7 +45,7 @@ struct tasklet_completion_status {
 };
 
 #define DECLARE_TCS(tcs) \
-	struct tasklet_completion_status tcs = { \
+	struct work_completion_status tcs = { \
 		.dl_opcode = 0, \
 		.tmf_state = 0, \
 		.tag_valid = 0, \
@@ -53,10 +53,10 @@ struct tasklet_completion_status {
 	}
 
 
-static void asd_clear_nexus_tasklet_complete(struct asd_ascb *ascb,
+static void asd_clear_nexus_work_complete(struct asd_ascb *ascb,
 					     struct done_list_struct *dl)
 {
-	struct tasklet_completion_status *tcs = ascb->uldd_task;
+	struct work_completion_status *tcs = ascb->uldd_task;
 	ASD_DPRINTK("%s: here\n", __func__);
 	if (!del_timer(&ascb->timer)) {
 		ASD_DPRINTK("%s: couldn't delete timer\n", __func__);
@@ -71,7 +71,7 @@ static void asd_clear_nexus_tasklet_complete(struct asd_ascb *ascb,
 static void asd_clear_nexus_timedout(struct timer_list *t)
 {
 	struct asd_ascb *ascb = from_timer(ascb, t, timer);
-	struct tasklet_completion_status *tcs = ascb->uldd_task;
+	struct work_completion_status *tcs = ascb->uldd_task;
 
 	ASD_DPRINTK("%s: here\n", __func__);
 	tcs->dl_opcode = TMF_RESP_FUNC_FAILED;
@@ -98,7 +98,7 @@ static void asd_clear_nexus_timedout(struct timer_list *t)
 
 #define CLEAR_NEXUS_POST        \
 	ASD_DPRINTK("%s: POST\n", __func__); \
-	res = asd_enqueue_internal(ascb, asd_clear_nexus_tasklet_complete, \
+	res = asd_enqueue_internal(ascb, asd_clear_nexus_work_complete, \
 				   asd_clear_nexus_timedout);              \
 	if (res)                \
 		goto out_err;   \
@@ -245,14 +245,14 @@ static int asd_clear_nexus_index(struct sas_task *task)
 static void asd_tmf_timedout(struct timer_list *t)
 {
 	struct asd_ascb *ascb = from_timer(ascb, t, timer);
-	struct tasklet_completion_status *tcs = ascb->uldd_task;
+	struct work_completion_status *tcs = ascb->uldd_task;
 
 	ASD_DPRINTK("tmf timed out\n");
 	tcs->tmf_state = TMF_RESP_FUNC_FAILED;
 	complete(ascb->completion);
 }
 
-static int asd_get_tmf_resp_tasklet(struct asd_ascb *ascb,
+static int asd_get_tmf_resp_work(struct asd_ascb *ascb,
 				    struct done_list_struct *dl)
 {
 	struct asd_ha_struct *asd_ha = ascb->ha;
@@ -270,7 +270,7 @@ static int asd_get_tmf_resp_tasklet(struct asd_ascb *ascb,
 	struct ssp_response_iu   *ru;
 	int res = TMF_RESP_FUNC_FAILED;
 
-	ASD_DPRINTK("tmf resp tasklet\n");
+	ASD_DPRINTK("tmf resp BH work\n");
 
 	spin_lock_irqsave(&asd_ha->seq.tc_index_lock, flags);
 	escb = asd_tc_index_find(&asd_ha->seq,
@@ -298,21 +298,21 @@ static int asd_get_tmf_resp_tasklet(struct asd_ascb *ascb,
 	return res;
 }
 
-static void asd_tmf_tasklet_complete(struct asd_ascb *ascb,
+static void asd_tmf_work_complete(struct asd_ascb *ascb,
 				     struct done_list_struct *dl)
 {
-	struct tasklet_completion_status *tcs;
+	struct work_completion_status *tcs;
 
 	if (!del_timer(&ascb->timer))
 		return;
 
 	tcs = ascb->uldd_task;
-	ASD_DPRINTK("tmf tasklet complete\n");
+	ASD_DPRINTK("tmf BH work complete\n");
 
 	tcs->dl_opcode = dl->opcode;
 
 	if (dl->opcode == TC_SSP_RESP) {
-		tcs->tmf_state = asd_get_tmf_resp_tasklet(ascb, dl);
+		tcs->tmf_state = asd_get_tmf_resp_work(ascb, dl);
 		tcs->tag_valid = ascb->tag_valid;
 		tcs->tag = ascb->tag;
 	}
@@ -452,7 +452,7 @@ int asd_abort_task(struct sas_task *task)
 	scb->abort_task.index = cpu_to_le16((u16)tascb->tc_index);
 	scb->abort_task.itnl_to = cpu_to_le16(ITNL_TIMEOUT_CONST);
 
-	res = asd_enqueue_internal(ascb, asd_tmf_tasklet_complete,
+	res = asd_enqueue_internal(ascb, asd_tmf_work_complete,
 				   asd_tmf_timedout);
 	if (res)
 		goto out_free;
@@ -600,7 +600,7 @@ static int asd_initiate_ssp_tmf(struct domain_device *dev, u8 *lun,
 	if (tmf == TMF_QUERY_TASK)
 		scb->ssp_tmf.index = cpu_to_le16(index);
 
-	res = asd_enqueue_internal(ascb, asd_tmf_tasklet_complete,
+	res = asd_enqueue_internal(ascb, asd_tmf_work_complete,
 				   asd_tmf_timedout);
 	if (res)
 		goto out_err;
diff --git a/drivers/scsi/esas2r/esas2r.h b/drivers/scsi/esas2r/esas2r.h
index ed63f7a9ea54..7c9db9e80576 100644
--- a/drivers/scsi/esas2r/esas2r.h
+++ b/drivers/scsi/esas2r/esas2r.h
@@ -900,7 +900,7 @@ struct esas2r_adapter {
 	struct esas2r_flash_context flash_context;
 	u32 num_targets_backend;
 	u32 ioctl_tunnel;
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 	struct pci_dev *pcid;
 	struct Scsi_Host *host;
 	unsigned int index;
@@ -992,7 +992,7 @@ int esas2r_write_vda(struct esas2r_adapter *a, const char *buf, long off,
 int esas2r_read_fs(struct esas2r_adapter *a, char *buf, long off, int count);
 int esas2r_write_fs(struct esas2r_adapter *a, const char *buf, long off,
 		    int count);
-void esas2r_adapter_tasklet(unsigned long context);
+void esas2r_adapter_work(struct work_struct *work);
 irqreturn_t esas2r_interrupt(int irq, void *dev_id);
 irqreturn_t esas2r_msi_interrupt(int irq, void *dev_id);
 void esas2r_kickoff_timer(struct esas2r_adapter *a);
@@ -1022,7 +1022,7 @@ bool esas2r_init_adapter_hw(struct esas2r_adapter *a, bool init_poll);
 void esas2r_start_request(struct esas2r_adapter *a, struct esas2r_request *rq);
 bool esas2r_send_task_mgmt(struct esas2r_adapter *a,
 			   struct esas2r_request *rqaux, u8 task_mgt_func);
-void esas2r_do_tasklet_tasks(struct esas2r_adapter *a);
+void esas2r_do_work_tasks(struct esas2r_adapter *a);
 void esas2r_adapter_interrupt(struct esas2r_adapter *a);
 void esas2r_do_deferred_processes(struct esas2r_adapter *a);
 void esas2r_reset_bus(struct esas2r_adapter *a);
@@ -1283,7 +1283,7 @@ static inline void esas2r_rq_destroy_request(struct esas2r_request *rq,
 	rq->data_buf = NULL;
 }
 
-static inline bool esas2r_is_tasklet_pending(struct esas2r_adapter *a)
+static inline bool esas2r_is_work_pending(struct esas2r_adapter *a)
 {
 
 	return test_bit(AF_BUSRST_NEEDED, &a->flags) ||
@@ -1327,11 +1327,11 @@ static inline void esas2r_enable_chip_interrupts(struct esas2r_adapter *a)
 /* Schedule a TASKLET to perform non-interrupt tasks that may require delays
  * or long completion times.
  */
-static inline void esas2r_schedule_tasklet(struct esas2r_adapter *a)
+static inline void esas2r_schedule_work(struct esas2r_adapter *a)
 {
 	/* make sure we don't schedule twice */
 	if (!test_and_set_bit(AF_TASKLET_SCHEDULED, &a->flags))
-		tasklet_hi_schedule(&a->tasklet);
+		queue_work(system_bh_highpri_wq, &a->work);
 }
 
 static inline void esas2r_enable_heartbeat(struct esas2r_adapter *a)
diff --git a/drivers/scsi/esas2r/esas2r_init.c b/drivers/scsi/esas2r/esas2r_init.c
index c1a5ab662dc8..cf149a69ec55 100644
--- a/drivers/scsi/esas2r/esas2r_init.c
+++ b/drivers/scsi/esas2r/esas2r_init.c
@@ -401,9 +401,7 @@ int esas2r_init_adapter(struct Scsi_Host *host, struct pci_dev *pcid,
 		return 0;
 	}
 
-	tasklet_init(&a->tasklet,
-		     esas2r_adapter_tasklet,
-		     (unsigned long)a);
+	INIT_WORK(&a->work, esas2r_adapter_work);
 
 	/*
 	 * Disable chip interrupts to prevent spurious interrupts
@@ -441,7 +439,7 @@ static void esas2r_adapter_power_down(struct esas2r_adapter *a,
 	    &&  (!test_bit(AF_DEGRADED_MODE, &a->flags))) {
 		if (!power_management) {
 			del_timer_sync(&a->timer);
-			tasklet_kill(&a->tasklet);
+			cancel_work_sync(&a->work);
 		}
 		esas2r_power_down(a);
 
@@ -1346,7 +1344,7 @@ bool esas2r_init_adapter_hw(struct esas2r_adapter *a, bool init_poll)
 		u32 deltatime;
 
 		/*
-		 * Block Tasklets from getting scheduled and indicate this is
+		 * Block Works from getting scheduled and indicate this is
 		 * polled discovery.
 		 */
 		set_bit(AF_TASKLET_SCHEDULED, &a->flags);
@@ -1394,8 +1392,8 @@ bool esas2r_init_adapter_hw(struct esas2r_adapter *a, bool init_poll)
 				nexttick -= deltatime;
 
 			/* Do any deferred processing */
-			if (esas2r_is_tasklet_pending(a))
-				esas2r_do_tasklet_tasks(a);
+			if (esas2r_is_work_pending(a))
+				esas2r_do_work_tasks(a);
 
 		}
 
@@ -1463,7 +1461,7 @@ void esas2r_reset_adapter(struct esas2r_adapter *a)
 {
 	set_bit(AF_OS_RESET, &a->flags);
 	esas2r_local_reset_adapter(a);
-	esas2r_schedule_tasklet(a);
+	esas2r_schedule_work(a);
 }
 
 void esas2r_reset_chip(struct esas2r_adapter *a)
diff --git a/drivers/scsi/esas2r/esas2r_int.c b/drivers/scsi/esas2r/esas2r_int.c
index 5281d9356327..54e6eea522f8 100644
--- a/drivers/scsi/esas2r/esas2r_int.c
+++ b/drivers/scsi/esas2r/esas2r_int.c
@@ -97,7 +97,7 @@ irqreturn_t esas2r_interrupt(int irq, void *dev_id)
 		return IRQ_NONE;
 
 	set_bit(AF2_INT_PENDING, &a->flags2);
-	esas2r_schedule_tasklet(a);
+	esas2r_schedule_work(a);
 
 	return IRQ_HANDLED;
 }
@@ -162,7 +162,7 @@ irqreturn_t esas2r_msi_interrupt(int irq, void *dev_id)
 	if (likely(atomic_read(&a->disable_cnt) == 0))
 		esas2r_do_deferred_processes(a);
 
-	esas2r_do_tasklet_tasks(a);
+	esas2r_do_work_tasks(a);
 
 	return 1;
 }
@@ -327,8 +327,8 @@ void esas2r_do_deferred_processes(struct esas2r_adapter *a)
 
 	/* Clear off the completed list to be processed later. */
 
-	if (esas2r_is_tasklet_pending(a)) {
-		esas2r_schedule_tasklet(a);
+	if (esas2r_is_work_pending(a)) {
+		esas2r_schedule_work(a);
 
 		startreqs = 0;
 	}
@@ -476,7 +476,7 @@ static void esas2r_process_bus_reset(struct esas2r_adapter *a)
 	esas2r_trace_exit();
 }
 
-static void esas2r_chip_rst_needed_during_tasklet(struct esas2r_adapter *a)
+static void esas2r_chip_rst_needed_during_work(struct esas2r_adapter *a)
 {
 
 	clear_bit(AF_CHPRST_NEEDED, &a->flags);
@@ -558,7 +558,7 @@ static void esas2r_chip_rst_needed_during_tasklet(struct esas2r_adapter *a)
 	}
 }
 
-static void esas2r_handle_chip_rst_during_tasklet(struct esas2r_adapter *a)
+static void esas2r_handle_chip_rst_during_work(struct esas2r_adapter *a)
 {
 	while (test_bit(AF_CHPRST_DETECTED, &a->flags)) {
 		/*
@@ -614,15 +614,15 @@ static void esas2r_handle_chip_rst_during_tasklet(struct esas2r_adapter *a)
 
 
 /* Perform deferred tasks when chip interrupts are disabled */
-void esas2r_do_tasklet_tasks(struct esas2r_adapter *a)
+void esas2r_do_work_tasks(struct esas2r_adapter *a)
 {
 
 	if (test_bit(AF_CHPRST_NEEDED, &a->flags) ||
 	    test_bit(AF_CHPRST_DETECTED, &a->flags)) {
 		if (test_bit(AF_CHPRST_NEEDED, &a->flags))
-			esas2r_chip_rst_needed_during_tasklet(a);
+			esas2r_chip_rst_needed_during_work(a);
 
-		esas2r_handle_chip_rst_during_tasklet(a);
+		esas2r_handle_chip_rst_during_work(a);
 	}
 
 	if (test_bit(AF_BUSRST_NEEDED, &a->flags)) {
diff --git a/drivers/scsi/esas2r/esas2r_io.c b/drivers/scsi/esas2r/esas2r_io.c
index a8df916cd57a..d45e6e16a858 100644
--- a/drivers/scsi/esas2r/esas2r_io.c
+++ b/drivers/scsi/esas2r/esas2r_io.c
@@ -851,7 +851,7 @@ void esas2r_reset_bus(struct esas2r_adapter *a)
 		set_bit(AF_BUSRST_PENDING, &a->flags);
 		set_bit(AF_OS_RESET, &a->flags);
 
-		esas2r_schedule_tasklet(a);
+		esas2r_schedule_work(a);
 	}
 }
 
diff --git a/drivers/scsi/esas2r/esas2r_main.c b/drivers/scsi/esas2r/esas2r_main.c
index f700a16cd885..e4e378adf7ed 100644
--- a/drivers/scsi/esas2r/esas2r_main.c
+++ b/drivers/scsi/esas2r/esas2r_main.c
@@ -1543,10 +1543,10 @@ void esas2r_complete_request_cb(struct esas2r_adapter *a,
 	esas2r_free_request(a, rq);
 }
 
-/* Run tasklet to handle stuff outside of interrupt context. */
-void esas2r_adapter_tasklet(unsigned long context)
+/* Run BH work to handle stuff outside of interrupt context. */
+void esas2r_adapter_work(struct work_struct *t)
 {
-	struct esas2r_adapter *a = (struct esas2r_adapter *)context;
+	struct esas2r_adapter *a = from_work(a, t, work);
 
 	if (unlikely(test_bit(AF2_TIMER_TICK, &a->flags2))) {
 		clear_bit(AF2_TIMER_TICK, &a->flags2);
@@ -1558,14 +1558,14 @@ void esas2r_adapter_tasklet(unsigned long context)
 		esas2r_adapter_interrupt(a);
 	}
 
-	if (esas2r_is_tasklet_pending(a))
-		esas2r_do_tasklet_tasks(a);
+	if (esas2r_is_work_pending(a))
+		esas2r_do_work_tasks(a);
 
-	if (esas2r_is_tasklet_pending(a)
+	if (esas2r_is_work_pending(a)
 	    || (test_bit(AF2_INT_PENDING, &a->flags2))
 	    || (test_bit(AF2_TIMER_TICK, &a->flags2))) {
 		clear_bit(AF_TASKLET_SCHEDULED, &a->flags);
-		esas2r_schedule_tasklet(a);
+		esas2r_schedule_work(a);
 	} else {
 		clear_bit(AF_TASKLET_SCHEDULED, &a->flags);
 	}
@@ -1589,7 +1589,7 @@ static void esas2r_timer_callback(struct timer_list *t)
 
 	set_bit(AF2_TIMER_TICK, &a->flags2);
 
-	esas2r_schedule_tasklet(a);
+	esas2r_schedule_work(a);
 
 	esas2r_kickoff_timer(a);
 }
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 05b126bfd18b..6a8ecd3358c4 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -899,7 +899,7 @@ static void ibmvfc_release_crq_queue(struct ibmvfc_host *vhost)
 
 	ibmvfc_dbg(vhost, "Releasing CRQ\n");
 	free_irq(vdev->irq, vhost);
-	tasklet_kill(&vhost->tasklet);
+	cancel_work_sync(&vhost->work);
 	do {
 		if (rc)
 			msleep(100);
@@ -3767,21 +3767,21 @@ static irqreturn_t ibmvfc_interrupt(int irq, void *dev_instance)
 
 	spin_lock_irqsave(vhost->host->host_lock, flags);
 	vio_disable_interrupts(to_vio_dev(vhost->dev));
-	tasklet_schedule(&vhost->tasklet);
+	queue_work(system_bh_wq, &vhost->work);
 	spin_unlock_irqrestore(vhost->host->host_lock, flags);
 	return IRQ_HANDLED;
 }
 
 /**
- * ibmvfc_tasklet - Interrupt handler tasklet
+ * ibmvfc_work - Interrupt handler work
- * @data:		ibmvfc host struct
+ * @t:		pointer to the work_struct
  *
  * Returns:
  *	Nothing
  **/
-static void ibmvfc_tasklet(void *data)
+static void ibmvfc_work(struct work_struct *t)
 {
-	struct ibmvfc_host *vhost = data;
+	struct ibmvfc_host *vhost = from_work(vhost, t, work);
 	struct vio_dev *vdev = to_vio_dev(vhost->dev);
 	struct ibmvfc_crq *crq;
 	struct ibmvfc_async_crq *async;
@@ -5885,7 +5885,7 @@ static int ibmvfc_init_crq(struct ibmvfc_host *vhost)
 
 	retrc = 0;
 
-	tasklet_init(&vhost->tasklet, (void *)ibmvfc_tasklet, (unsigned long)vhost);
+	INIT_WORK(&vhost->work, ibmvfc_work);
 
 	if ((rc = request_irq(vdev->irq, ibmvfc_interrupt, 0, IBMVFC_NAME, vhost))) {
 		dev_err(dev, "Couldn't register irq 0x%x. rc=%d\n", vdev->irq, rc);
@@ -5901,7 +5901,7 @@ static int ibmvfc_init_crq(struct ibmvfc_host *vhost)
 	return retrc;
 
 req_irq_failed:
-	tasklet_kill(&vhost->tasklet);
+	cancel_work_sync(&vhost->work);
 	do {
 		rc = plpar_hcall_norets(H_FREE_CRQ, vdev->unit_address);
 	} while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
@@ -6474,7 +6474,7 @@ static int ibmvfc_resume(struct device *dev)
 
 	spin_lock_irqsave(vhost->host->host_lock, flags);
 	vio_disable_interrupts(vdev);
-	tasklet_schedule(&vhost->tasklet);
+	queue_work(system_bh_wq, &vhost->work);
 	spin_unlock_irqrestore(vhost->host->host_lock, flags);
 	return 0;
 }
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 745ad5ac7251..42861ee62bf9 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -12,6 +12,7 @@
 
 #include <linux/list.h>
 #include <linux/types.h>
+#include <linux/workqueue.h>
 #include <scsi/viosrp.h>
 
 #define IBMVFC_NAME	"ibmvfc"
@@ -910,7 +911,7 @@ struct ibmvfc_host {
 	char partition_name[97];
 	void (*job_step) (struct ibmvfc_host *);
 	struct task_struct *work_thread;
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 	struct work_struct rport_add_work_q;
 	wait_queue_head_t init_wait_q;
 	wait_queue_head_t work_wait_q;
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 71f3e9563520..91e1600bf219 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -125,7 +125,7 @@ static irqreturn_t ibmvscsi_handle_event(int irq, void *dev_instance)
 	struct ibmvscsi_host_data *hostdata =
 	    (struct ibmvscsi_host_data *)dev_instance;
 	vio_disable_interrupts(to_vio_dev(hostdata->dev));
-	tasklet_schedule(&hostdata->srp_task);
+	queue_work(system_bh_wq, &hostdata->srp_task);
 	return IRQ_HANDLED;
 }
 
@@ -145,7 +145,7 @@ static void ibmvscsi_release_crq_queue(struct crq_queue *queue,
 	long rc = 0;
 	struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 	free_irq(vdev->irq, (void *)hostdata);
-	tasklet_kill(&hostdata->srp_task);
+	cancel_work_sync(&hostdata->srp_task);
 	do {
 		if (rc)
 			msleep(100);
@@ -367,8 +367,7 @@ static int ibmvscsi_init_crq_queue(struct crq_queue *queue,
 	queue->cur = 0;
 	spin_lock_init(&queue->lock);
 
-	tasklet_init(&hostdata->srp_task, (void *)ibmvscsi_task,
-		     (unsigned long)hostdata);
+	INIT_WORK(&hostdata->srp_task, ibmvscsi_work);
 
 	if (request_irq(vdev->irq,
 			ibmvscsi_handle_event,
@@ -387,7 +386,7 @@ static int ibmvscsi_init_crq_queue(struct crq_queue *queue,
 	return retrc;
 
       req_irq_failed:
-	tasklet_kill(&hostdata->srp_task);
+	cancel_work_sync(&hostdata->srp_task);
 	rc = 0;
 	do {
 		if (rc)
@@ -2194,9 +2193,10 @@ static int ibmvscsi_work_to_do(struct ibmvscsi_host_data *hostdata)
 	return rc;
 }
 
-static int ibmvscsi_work(void *data)
+static int ibmvscsi_work(struct work_struct *t)
 {
-	struct ibmvscsi_host_data *hostdata = data;
+	struct ibmvscsi_host_data *hostdata =
+		from_work(hostdata, t, srp_task);
 	int rc;
 
 	set_user_nice(current, MIN_NICE);
@@ -2371,7 +2371,7 @@ static int ibmvscsi_resume(struct device *dev)
 {
 	struct ibmvscsi_host_data *hostdata = dev_get_drvdata(dev);
 	vio_disable_interrupts(to_vio_dev(hostdata->dev));
-	tasklet_schedule(&hostdata->srp_task);
+	queue_work(system_bh_wq, &hostdata->srp_task);
 
 	return 0;
 }
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.h b/drivers/scsi/ibmvscsi/ibmvscsi.h
index e60916ef7a49..cfc0a70c434c 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.h
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.h
@@ -19,6 +19,7 @@
 #include <linux/list.h>
 #include <linux/completion.h>
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 #include <scsi/viosrp.h>
 
 struct scsi_cmnd;
@@ -90,7 +91,7 @@ struct ibmvscsi_host_data {
 	struct device *dev;
 	struct event_pool pool;
 	struct crq_queue queue;
-	struct tasklet_struct srp_task;
+	struct work_struct srp_task;
 	struct list_head sent;
 	struct Scsi_Host *host;
 	struct task_struct *work_thread;
diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
index 68b99924ee4f..204975fb61ba 100644
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -2948,7 +2948,7 @@ static irqreturn_t ibmvscsis_interrupt(int dummy, void *data)
 	struct scsi_info *vscsi = data;
 
 	vio_disable_interrupts(vscsi->dma_dev);
-	tasklet_schedule(&vscsi->work_task);
+	queue_work(system_bh_wq, &vscsi->work_task);
 
 	return IRQ_HANDLED;
 }
@@ -3309,7 +3309,7 @@ static int ibmvscsis_rdma(struct ibmvscsis_cmd *cmd, struct scatterlist *sg,
 
 /**
  * ibmvscsis_handle_crq() - Handle CRQ
- * @data:	Pointer to our adapter structure
+ * @t:	Pointer to work_struct
  *
  * Read the command elements from the command queue and copy the payloads
  * associated with the command elements to local memory and execute the
@@ -3317,9 +3317,9 @@ static int ibmvscsis_rdma(struct ibmvscsis_cmd *cmd, struct scatterlist *sg,
  *
  * Note: this is an edge triggered interrupt. It can not be shared.
  */
-static void ibmvscsis_handle_crq(unsigned long data)
+static void ibmvscsis_handle_crq(struct work_struct *t)
 {
-	struct scsi_info *vscsi = (struct scsi_info *)data;
+	struct scsi_info *vscsi = from_work(vscsi, t, work_task);
 	struct viosrp_crq *crq;
 	long rc;
 	bool ack = true;
@@ -3530,8 +3530,7 @@ static int ibmvscsis_probe(struct vio_dev *vdev,
 	dev_dbg(&vscsi->dev, "probe hrc %ld, client partition num %d\n",
 		hrc, vscsi->client_data.partition_number);
 
-	tasklet_init(&vscsi->work_task, ibmvscsis_handle_crq,
-		     (unsigned long)vscsi);
+	INIT_WORK(&vscsi->work_task, ibmvscsis_handle_crq);
 
 	init_completion(&vscsi->wait_idle);
 	init_completion(&vscsi->unconfig);
@@ -3565,7 +3564,7 @@ static int ibmvscsis_probe(struct vio_dev *vdev,
 free_buf:
 	kfree(vscsi->map_buf);
 destroy_queue:
-	tasklet_kill(&vscsi->work_task);
+	cancel_work_sync(&vscsi->work_task);
 	ibmvscsis_unregister_command_q(vscsi);
 	ibmvscsis_destroy_command_q(vscsi);
 free_timer:
@@ -3602,7 +3601,7 @@ static void ibmvscsis_remove(struct vio_dev *vdev)
 	dma_unmap_single(&vdev->dev, vscsi->map_ioba, PAGE_SIZE,
 			 DMA_BIDIRECTIONAL);
 	kfree(vscsi->map_buf);
-	tasklet_kill(&vscsi->work_task);
+	cancel_work_sync(&vscsi->work_task);
 	ibmvscsis_destroy_command_q(vscsi);
 	ibmvscsis_freetimer(vscsi);
 	ibmvscsis_free_cmds(vscsi);
diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
index 7ae074e5d7a1..e7dea32e4dbc 100644
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
@@ -18,6 +18,7 @@
 #define __H_IBMVSCSI_TGT
 
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 #include "libsrp.h"
 
 #define SYS_ID_NAME_LEN		64
@@ -295,7 +296,7 @@ struct scsi_info {
 	struct vio_dev *dma_dev;
 	struct srp_target target;
 	struct ibmvscsis_tport tport;
-	struct tasklet_struct work_task;
+	struct work_struct work_task;
 	struct work_struct proc_work;
 };
 
diff --git a/drivers/scsi/isci/host.c b/drivers/scsi/isci/host.c
index 35589b6af90d..d911dc159809 100644
--- a/drivers/scsi/isci/host.c
+++ b/drivers/scsi/isci/host.c
@@ -220,7 +220,7 @@ irqreturn_t isci_msix_isr(int vec, void *data)
 	struct isci_host *ihost = data;
 
 	if (sci_controller_isr(ihost))
-		tasklet_schedule(&ihost->completion_tasklet);
+		queue_work(system_bh_wq, &ihost->completion_work);
 
 	return IRQ_HANDLED;
 }
@@ -610,7 +610,7 @@ irqreturn_t isci_intx_isr(int vec, void *data)
 
 	if (sci_controller_isr(ihost)) {
 		writel(SMU_ISR_COMPLETION, &ihost->smu_registers->interrupt_status);
-		tasklet_schedule(&ihost->completion_tasklet);
+		queue_work(system_bh_wq, &ihost->completion_work);
 		ret = IRQ_HANDLED;
 	} else if (sci_controller_error_isr(ihost)) {
 		spin_lock(&ihost->scic_lock);
@@ -1106,14 +1106,14 @@ void ireq_done(struct isci_host *ihost, struct isci_request *ireq, struct sas_ta
 /**
  * isci_host_completion_routine() - This function is the delayed service
  *    routine that calls the sci core library's completion handler. It's
- *    scheduled as a tasklet from the interrupt service routine when interrupts
+ *    scheduled as BH work from the interrupt service routine when interrupts
  *    in use, or set as the timeout function in polled mode.
- * @data: This parameter specifies the ISCI host object
+ * @t: pointer to the work_struct
  *
  */
-void isci_host_completion_routine(unsigned long data)
+void isci_host_completion_routine(struct work_struct *t)
 {
-	struct isci_host *ihost = (struct isci_host *)data;
+	struct isci_host *ihost = from_work(ihost, t, completion_work);
 	u16 active;
 
 	spin_lock_irq(&ihost->scic_lock);
diff --git a/drivers/scsi/isci/host.h b/drivers/scsi/isci/host.h
index 52388374cf31..8350e70bfb3a 100644
--- a/drivers/scsi/isci/host.h
+++ b/drivers/scsi/isci/host.h
@@ -203,7 +203,7 @@ struct isci_host {
 	#define IHOST_IRQ_ENABLED 2
 	unsigned long flags;
 	wait_queue_head_t eventq;
-	struct tasklet_struct completion_tasklet;
+	struct work_struct completion_work;
 	spinlock_t scic_lock;
 	struct isci_request *reqs[SCI_MAX_IO_REQUESTS];
 	struct isci_remote_device devices[SCI_MAX_REMOTE_DEVICES];
@@ -478,7 +478,7 @@ void isci_tci_free(struct isci_host *ihost, u16 tci);
 void ireq_done(struct isci_host *ihost, struct isci_request *ireq, struct sas_task *task);
 
 int isci_host_init(struct isci_host *);
-void isci_host_completion_routine(unsigned long data);
+void isci_host_completion_routine(struct work_struct *t);
 void isci_host_deinit(struct isci_host *);
 void sci_controller_disable_interrupts(struct isci_host *ihost);
 bool sci_controller_has_remote_devices_stopping(struct isci_host *ihost);
diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index c582a3932cea..605e4d965e04 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -510,8 +510,8 @@ static struct isci_host *isci_host_alloc(struct pci_dev *pdev, int id)
 	init_waitqueue_head(&ihost->eventq);
 	ihost->sas_ha.dev = &ihost->pdev->dev;
 	ihost->sas_ha.lldd_ha = ihost;
-	tasklet_init(&ihost->completion_tasklet,
-		     isci_host_completion_routine, (unsigned long)ihost);
+	INIT_WORK(&ihost->completion_work,
+		  isci_host_completion_routine);
 
 	/* validate module parameters */
 	/* TODO: kill struct sci_user_parameters and reference directly */
diff --git a/drivers/scsi/megaraid/mega_common.h b/drivers/scsi/megaraid/mega_common.h
index 2ad0aa2f837d..cff3e98dbe31 100644
--- a/drivers/scsi/megaraid/mega_common.h
+++ b/drivers/scsi/megaraid/mega_common.h
@@ -24,6 +24,7 @@
 #include <linux/list.h>
 #include <linux/moduleparam.h>
 #include <linux/dma-mapping.h>
+#include <linux/workqueue.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
@@ -95,7 +96,7 @@ typedef struct {
 
 /**
  * struct adapter_t - driver's initialization structure
- * @aram dpc_h			: tasklet handle
+ * @dpc_h			: work handle
  * @pdev			: pci configuration pointer for kernel
  * @host			: pointer to host structure of mid-layer
  * @lock			: synchronization lock for mid-layer and driver
@@ -149,7 +150,7 @@ typedef struct {
 #define VERSION_SIZE	16
 
 typedef struct {
-	struct tasklet_struct	dpc_h;
+	struct work_struct	dpc_h;
 	struct pci_dev		*pdev;
 	struct Scsi_Host	*host;
 	spinlock_t		lock;
diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index bc867da650b6..4ce033cb9554 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -119,7 +119,7 @@ static void megaraid_mbox_prepare_epthru(adapter_t *, scb_t *,
 
 static irqreturn_t megaraid_isr(int, void *);
 
-static void megaraid_mbox_dpc(unsigned long);
+static void megaraid_mbox_dpc(struct work_struct *);
 
 static ssize_t megaraid_mbox_app_hndl_show(struct device *, struct device_attribute *attr, char *);
 static ssize_t megaraid_mbox_ld_show(struct device *, struct device_attribute *attr, char *);
@@ -879,9 +879,8 @@ megaraid_init_mbox(adapter_t *adapter)
 		}
 	}
 
-	// setup tasklet for DPC
-	tasklet_init(&adapter->dpc_h, megaraid_mbox_dpc,
-			(unsigned long)adapter);
+	/* Initialize the work for DPC */
+	INIT_WORK(&adapter->dpc_h, megaraid_mbox_dpc);
 
 	con_log(CL_DLEVEL1, (KERN_INFO
 		"megaraid mbox hba successfully initialized\n"));
@@ -917,7 +916,7 @@ megaraid_fini_mbox(adapter_t *adapter)
 	// flush all caches
 	megaraid_mbox_flush_cache(adapter);
 
-	tasklet_kill(&adapter->dpc_h);
+	cancel_work_sync(&adapter->dpc_h);
 
 	megaraid_sysfs_free_resources(adapter);
 
@@ -2127,7 +2126,7 @@ megaraid_ack_sequence(adapter_t *adapter)
 
 	// schedule the DPC if there is some work for it
 	if (handled)
-		tasklet_schedule(&adapter->dpc_h);
+		queue_work(system_bh_wq, &adapter->dpc_h);
 
 	return handled;
 }
@@ -2158,17 +2157,17 @@ megaraid_isr(int irq, void *devp)
 
 
 /**
- * megaraid_mbox_dpc - the tasklet to complete the commands from completed list
- * @devp	: pointer to HBA soft state
+ * megaraid_mbox_dpc - the work handler to complete the commands from completed list
+ * @t	: pointer to work_struct
  *
  * Pick up the commands from the completed list and send back to the owners.
  * This is a reentrant function and does not assume any locks are held while
  * it is being called.
  */
 static void
-megaraid_mbox_dpc(unsigned long devp)
+megaraid_mbox_dpc(struct work_struct *t)
 {
-	adapter_t		*adapter = (adapter_t *)devp;
+	adapter_t		*adapter = from_work(adapter, t, dpc_h);
 	mraid_device_t		*raid_dev;
 	struct list_head	clist;
 	struct scatterlist	*sgl;
@@ -3812,7 +3811,7 @@ megaraid_sysfs_free_resources(adapter_t *adapter)
  * megaraid_sysfs_get_ldmap_done - callback for get ldmap
  * @uioc	: completed packet
  *
- * Callback routine called in the ISR/tasklet context for get ldmap call
+ * Callback routine called in the ISR/BH context for get ldmap call
  */
 static void
 megaraid_sysfs_get_ldmap_done(uioc_t *uioc)
diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index 56624cbf7fa5..8de7a678e096 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2389,7 +2389,7 @@ struct megasas_instance {
 	atomic64_t high_iops_outstanding;
 
 	struct megasas_instance_template *instancet;
-	struct tasklet_struct isr_tasklet;
+	struct work_struct isr_work;
 	struct work_struct work_init;
 	struct delayed_work fw_fault_work;
 	struct workqueue_struct *fw_fault_work_q;
@@ -2551,7 +2551,7 @@ struct megasas_instance_template {
 	int (*check_reset)(struct megasas_instance *, \
 		struct megasas_register_set __iomem *);
 	irqreturn_t (*service_isr)(int irq, void *devp);
-	void (*tasklet)(unsigned long);
+	void (*work)(struct work_struct *);
 	u32 (*init_adapter)(struct megasas_instance *);
 	u32 (*build_and_issue_cmd) (struct megasas_instance *,
 				    struct scsi_cmnd *);
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 3d4f13da1ae8..dd935943ae4f 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -234,7 +234,7 @@ megasas_init_adapter_mfi(struct megasas_instance *instance);
 u32
 megasas_build_and_issue_cmd(struct megasas_instance *instance,
 			    struct scsi_cmnd *scmd);
-static void megasas_complete_cmd_dpc(unsigned long instance_addr);
+static void megasas_complete_cmd_dpc(struct work_struct *t);
 int
 wait_and_poll(struct megasas_instance *instance, struct megasas_cmd *cmd,
 	int seconds);
@@ -615,7 +615,7 @@ static struct megasas_instance_template megasas_instance_template_xscale = {
 	.adp_reset = megasas_adp_reset_xscale,
 	.check_reset = megasas_check_reset_xscale,
 	.service_isr = megasas_isr,
-	.tasklet = megasas_complete_cmd_dpc,
+	.work = megasas_complete_cmd_dpc,
 	.init_adapter = megasas_init_adapter_mfi,
 	.build_and_issue_cmd = megasas_build_and_issue_cmd,
 	.issue_dcmd = megasas_issue_dcmd,
@@ -754,7 +754,7 @@ static struct megasas_instance_template megasas_instance_template_ppc = {
 	.adp_reset = megasas_adp_reset_xscale,
 	.check_reset = megasas_check_reset_ppc,
 	.service_isr = megasas_isr,
-	.tasklet = megasas_complete_cmd_dpc,
+	.work = megasas_complete_cmd_dpc,
 	.init_adapter = megasas_init_adapter_mfi,
 	.build_and_issue_cmd = megasas_build_and_issue_cmd,
 	.issue_dcmd = megasas_issue_dcmd,
@@ -895,7 +895,7 @@ static struct megasas_instance_template megasas_instance_template_skinny = {
 	.adp_reset = megasas_adp_reset_gen2,
 	.check_reset = megasas_check_reset_skinny,
 	.service_isr = megasas_isr,
-	.tasklet = megasas_complete_cmd_dpc,
+	.work = megasas_complete_cmd_dpc,
 	.init_adapter = megasas_init_adapter_mfi,
 	.build_and_issue_cmd = megasas_build_and_issue_cmd,
 	.issue_dcmd = megasas_issue_dcmd,
@@ -1095,7 +1095,7 @@ static struct megasas_instance_template megasas_instance_template_gen2 = {
 	.adp_reset = megasas_adp_reset_gen2,
 	.check_reset = megasas_check_reset_gen2,
 	.service_isr = megasas_isr,
-	.tasklet = megasas_complete_cmd_dpc,
+	.work = megasas_complete_cmd_dpc,
 	.init_adapter = megasas_init_adapter_mfi,
 	.build_and_issue_cmd = megasas_build_and_issue_cmd,
 	.issue_dcmd = megasas_issue_dcmd,
@@ -2269,18 +2269,18 @@ megasas_check_and_restore_queue_depth(struct megasas_instance *instance)
 
 /**
  * megasas_complete_cmd_dpc	 -	Returns FW's controller structure
- * @instance_addr:			Address of adapter soft state
+ * @t:					pointer to the work_struct
  *
- * Tasklet to complete cmds
+ * Work to complete cmds
  */
-static void megasas_complete_cmd_dpc(unsigned long instance_addr)
+static void megasas_complete_cmd_dpc(struct work_struct *t)
 {
 	u32 producer;
 	u32 consumer;
 	u32 context;
 	struct megasas_cmd *cmd;
 	struct megasas_instance *instance =
-				(struct megasas_instance *)instance_addr;
+				from_work(instance, t, isr_work);
 	unsigned long flags;
 
 	/* If we have already declared adapter dead, donot complete cmds */
@@ -2825,7 +2825,7 @@ static int megasas_wait_for_outstanding(struct megasas_instance *instance)
 			 * Call cmd completion routine. Cmd to be
 			 * be completed directly without depending on isr.
 			 */
-			megasas_complete_cmd_dpc((unsigned long)instance);
+			megasas_complete_cmd_dpc(&instance->isr_work);
 		}
 
 		msleep(1000);
@@ -4073,7 +4073,7 @@ megasas_deplete_reply_queue(struct megasas_instance *instance,
 		}
 	}
 
-	tasklet_schedule(&instance->isr_tasklet);
+	queue_work(system_bh_wq, &instance->isr_work);
 	return IRQ_HANDLED;
 }
 
@@ -6313,8 +6313,7 @@ static int megasas_init_fw(struct megasas_instance *instance)
 	dev_info(&instance->pdev->dev,
 		"RDPQ mode\t: (%s)\n", instance->is_rdpq ? "enabled" : "disabled");
 
-	tasklet_init(&instance->isr_tasklet, instance->instancet->tasklet,
-		(unsigned long)instance);
+	INIT_WORK(&instance->isr_work, instance->instancet->work);
 
 	/*
 	 * Below are default value for legacy Firmware.
@@ -7757,7 +7756,7 @@ megasas_suspend(struct device *dev)
 		instance->ev = NULL;
 	}
 
-	tasklet_kill(&instance->isr_tasklet);
+	cancel_work_sync(&instance->isr_work);
 
 	pci_set_drvdata(instance->pdev, instance);
 	instance->instancet->disable_intr(instance);
@@ -7865,8 +7864,7 @@ megasas_resume(struct device *dev)
 	if (megasas_get_ctrl_info(instance) != DCMD_SUCCESS)
 		goto fail_init_mfi;
 
-	tasklet_init(&instance->isr_tasklet, instance->instancet->tasklet,
-		     (unsigned long)instance);
+	INIT_WORK(&instance->isr_work, instance->instancet->work);
 
 	if (instance->msix_vectors ?
 			megasas_setup_irqs_msix(instance, 0) :
@@ -7997,7 +7995,7 @@ static void megasas_detach_one(struct pci_dev *pdev)
 	/* cancel all wait events */
 	wake_up_all(&instance->int_cmd_wait_q);
 
-	tasklet_kill(&instance->isr_tasklet);
+	cancel_work_sync(&instance->isr_work);
 
 	/*
 	 * Take the instance off the instance array. Note that we will not
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index c60014e07b44..7dd036b31a0c 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -3821,15 +3821,15 @@ int megasas_irqpoll(struct irq_poll *irqpoll, int budget)
 
 /**
  * megasas_complete_cmd_dpc_fusion -	Completes command
- * @instance_addr:			Adapter soft state address
+ * @t:					pointer to the work_struct
  *
- * Tasklet to complete cmds
+ * Work to complete cmds
  */
 static void
-megasas_complete_cmd_dpc_fusion(unsigned long instance_addr)
+megasas_complete_cmd_dpc_fusion(struct work_struct *t)
 {
 	struct megasas_instance *instance =
-		(struct megasas_instance *)instance_addr;
+		from_work(instance, t, isr_work);
 	struct megasas_irq_context *irq_ctx = NULL;
 	u32 count, MSIxIndex;
 
@@ -4180,7 +4180,7 @@ megasas_wait_for_outstanding_fusion(struct megasas_instance *instance,
 	if (reason == MFI_IO_TIMEOUT_OCR) {
 		dev_info(&instance->pdev->dev,
 			"MFI command is timed out\n");
-		megasas_complete_cmd_dpc_fusion((unsigned long)instance);
+		megasas_complete_cmd_dpc_fusion(&instance->isr_work);
 		if (instance->snapdump_wait_time)
 			megasas_trigger_snap_dump(instance);
 		retval = 1;
@@ -4196,7 +4196,7 @@ megasas_wait_for_outstanding_fusion(struct megasas_instance *instance,
 				   "FW in FAULT state Fault code:0x%x subcode:0x%x func:%s\n",
 				   abs_state & MFI_STATE_FAULT_CODE,
 				   abs_state & MFI_STATE_FAULT_SUBCODE, __func__);
-			megasas_complete_cmd_dpc_fusion((unsigned long)instance);
+			megasas_complete_cmd_dpc_fusion(&instance->isr_work);
 			if (instance->requestorId && reason) {
 				dev_warn(&instance->pdev->dev, "SR-IOV Found FW in FAULT"
 				" state while polling during"
@@ -4240,7 +4240,7 @@ megasas_wait_for_outstanding_fusion(struct megasas_instance *instance,
 			}
 		}
 
-		megasas_complete_cmd_dpc_fusion((unsigned long)instance);
+		megasas_complete_cmd_dpc_fusion(&instance->isr_work);
 		outstanding = atomic_read(&instance->fw_outstanding);
 		if (!outstanding)
 			goto out;
@@ -5371,7 +5371,7 @@ struct megasas_instance_template megasas_instance_template_fusion = {
 	.adp_reset = megasas_adp_reset_fusion,
 	.check_reset = megasas_check_reset_fusion,
 	.service_isr = megasas_isr_fusion,
-	.tasklet = megasas_complete_cmd_dpc_fusion,
+	.work = megasas_complete_cmd_dpc_fusion,
 	.init_adapter = megasas_init_adapter_fusion,
 	.build_and_issue_cmd = megasas_build_and_issue_cmd_fusion,
 	.issue_dcmd = megasas_issue_dcmd_fusion,
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 43ebb331e216..c8b3c18cfc6c 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -144,14 +144,14 @@ static void mvs_free(struct mvs_info *mvi)
 	kfree(mvi);
 }
 
-#ifdef CONFIG_SCSI_MVSAS_TASKLET
-static void mvs_tasklet(unsigned long opaque)
+#ifdef CONFIG_SCSI_MVSAS_WORK
+static void mvs_work(struct work_struct *t)
 {
 	u32 stat;
 	u16 core_nr, i = 0;
 
-	struct mvs_info *mvi;
-	struct sas_ha_struct *sha = (struct sas_ha_struct *)opaque;
+	struct mvs_info *mvi;
+	struct sas_ha_struct *sha = container_of(t, struct mvs_prv_info, mv_work)->mvi[0]->sas;
 
 	core_nr = ((struct mvs_prv_info *)sha->lldd_ha)->n_host;
 	mvi = ((struct mvs_prv_info *)sha->lldd_ha)->mvi[0];
@@ -178,7 +178,7 @@ static irqreturn_t mvs_interrupt(int irq, void *opaque)
 	u32 stat;
 	struct mvs_info *mvi;
 	struct sas_ha_struct *sha = opaque;
-#ifndef CONFIG_SCSI_MVSAS_TASKLET
+#ifndef CONFIG_SCSI_MVSAS_WORK
 	u32 i;
 	u32 core_nr;
 
@@ -189,20 +189,20 @@ static irqreturn_t mvs_interrupt(int irq, void *opaque)
 
 	if (unlikely(!mvi))
 		return IRQ_NONE;
-#ifdef CONFIG_SCSI_MVSAS_TASKLET
+#ifdef CONFIG_SCSI_MVSAS_WORK
 	MVS_CHIP_DISP->interrupt_disable(mvi);
 #endif
 
 	stat = MVS_CHIP_DISP->isr_status(mvi, irq);
 	if (!stat) {
-	#ifdef CONFIG_SCSI_MVSAS_TASKLET
+	#ifdef CONFIG_SCSI_MVSAS_WORK
 		MVS_CHIP_DISP->interrupt_enable(mvi);
 	#endif
 		return IRQ_NONE;
 	}
 
-#ifdef CONFIG_SCSI_MVSAS_TASKLET
-	tasklet_schedule(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet);
+#ifdef CONFIG_SCSI_MVSAS_WORK
+	queue_work(system_bh_wq, &((struct mvs_prv_info *)sha->lldd_ha)->mv_work);
 #else
 	for (i = 0; i < core_nr; i++) {
 		mvi = ((struct mvs_prv_info *)sha->lldd_ha)->mvi[i];
@@ -553,12 +553,11 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent)
 		}
 		nhost++;
 	} while (nhost < chip->n_host);
-#ifdef CONFIG_SCSI_MVSAS_TASKLET
+#ifdef CONFIG_SCSI_MVSAS_WORK
 	{
 	struct mvs_prv_info *mpi = SHOST_TO_SAS_HA(shost)->lldd_ha;
 
-	tasklet_init(&(mpi->mv_tasklet), mvs_tasklet,
-		     (unsigned long)SHOST_TO_SAS_HA(shost));
+	INIT_WORK(&(mpi->mv_work), mvs_work);
 	}
 #endif
 
@@ -603,8 +602,8 @@ static void mvs_pci_remove(struct pci_dev *pdev)
 	core_nr = ((struct mvs_prv_info *)sha->lldd_ha)->n_host;
 	mvi = ((struct mvs_prv_info *)sha->lldd_ha)->mvi[0];
 
-#ifdef CONFIG_SCSI_MVSAS_TASKLET
-	tasklet_kill(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet);
+#ifdef CONFIG_SCSI_MVSAS_WORK
+	cancel_work_sync(&((struct mvs_prv_info *)sha->lldd_ha)->mv_work);
 #endif
 
 	sas_unregister_ha(sha);
diff --git a/drivers/scsi/mvsas/mv_sas.h b/drivers/scsi/mvsas/mv_sas.h
index 68df771e2975..2bf1af51e2a4 100644
--- a/drivers/scsi/mvsas/mv_sas.h
+++ b/drivers/scsi/mvsas/mv_sas.h
@@ -23,6 +23,7 @@
 #include <linux/irq.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/workqueue.h>
 #include <asm/unaligned.h>
 #include <scsi/libsas.h>
 #include <scsi/scsi.h>
@@ -402,7 +403,7 @@ struct mvs_prv_info{
 	u8 scan_finished;
 	u8 reserve;
 	struct mvs_info *mvi[2];
-	struct tasklet_struct mv_tasklet;
+	struct work_struct mv_work;
 };
 
 struct mvs_wq {
@@ -432,8 +433,8 @@ void mvs_set_sas_addr(struct mvs_info *mvi, int port_id, u32 off_lo,
 		      u32 off_hi, u64 sas_addr);
 void mvs_scan_start(struct Scsi_Host *shost);
 int mvs_scan_finished(struct Scsi_Host *shost, unsigned long time);
-int mvs_queue_command(struct sas_task *task, gfp_t gfp_flags);
-int mvs_abort_task(struct sas_task *task);
+int mvs_queue_command(struct sas_task *work, gfp_t gfp_flags);
+int mvs_abort_task(struct sas_task *work);
 void mvs_port_formed(struct asd_sas_phy *sas_phy);
 void mvs_port_deformed(struct asd_sas_phy *sas_phy);
 int mvs_dev_found(struct domain_device *dev);
@@ -441,7 +442,7 @@ void mvs_dev_gone(struct domain_device *dev);
 int mvs_lu_reset(struct domain_device *dev, u8 *lun);
 int mvs_slot_complete(struct mvs_info *mvi, u32 rx_desc, u32 flags);
 int mvs_I_T_nexus_reset(struct domain_device *dev);
-int mvs_query_task(struct sas_task *task);
+int mvs_query_task(struct sas_task *work);
 void mvs_release_task(struct mvs_info *mvi,
 			struct domain_device *dev);
 void mvs_do_release_task(struct mvs_info *mvi, int phy_no,
diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index ed6b7d954dda..bda175682785 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -60,8 +60,8 @@ bool pm8001_use_msix = true;
 module_param_named(use_msix, pm8001_use_msix, bool, 0444);
 MODULE_PARM_DESC(zoned, "Use MSIX interrupts. Default: true");
 
-static bool pm8001_use_tasklet = true;
-module_param_named(use_tasklet, pm8001_use_tasklet, bool, 0444);
+static bool pm8001_use_bh_work = true;
+module_param_named(use_bh_work, pm8001_use_bh_work, bool, 0444);
 MODULE_PARM_DESC(zoned, "Use MSIX interrupts. Default: true");
 
 static bool pm8001_read_wwn = true;
@@ -213,14 +213,17 @@ static void pm8001_free(struct pm8001_hba_info *pm8001_ha)
 }
 
 /**
- * pm8001_tasklet() - tasklet for 64 msi-x interrupt handler
- * @opaque: the passed general host adapter struct
- * Note: pm8001_tasklet is common for pm8001 & pm80xx
+ * pm8001_work() - BH work for 64 msi-x interrupt handler
+ * @t: pointer to work_struct
+ * Note: pm8001_work is common for pm8001 & pm80xx
  */
-static void pm8001_tasklet(unsigned long opaque)
+static void pm8001_work(struct work_struct *t)
 {
-	struct isr_param *irq_vector = (struct isr_param *)opaque;
-	struct pm8001_hba_info *pm8001_ha = irq_vector->drv_inst;
+	/*FIXME: Since we don't know the index, we need a
+	 * mechanism to determine it or always use index 0
+	 */
+	struct pm8001_hba_info *pm8001_ha = from_work(pm8001_ha, t, work[0]);
+	struct isr_param *irq_vector = pm8001_ha->irq_vector;
 
 	if (WARN_ON_ONCE(!pm8001_ha))
 		return;
@@ -228,41 +231,39 @@ static void pm8001_tasklet(unsigned long opaque)
 	PM8001_CHIP_DISP->isr(pm8001_ha, irq_vector->irq_id);
 }
 
-static void pm8001_init_tasklet(struct pm8001_hba_info *pm8001_ha)
+static void pm8001_init_work(struct pm8001_hba_info *pm8001_ha)
 {
 	int i;
 
-	if (!pm8001_use_tasklet)
+	if (!pm8001_use_bh_work)
 		return;
 
-	/*  Tasklet for non msi-x interrupt handler */
+	/*  Work for non msi-x interrupt handler */
 	if ((!pm8001_ha->pdev->msix_cap || !pci_msi_enabled()) ||
 	    (pm8001_ha->chip_id == chip_8001)) {
-		tasklet_init(&pm8001_ha->tasklet[0], pm8001_tasklet,
-			     (unsigned long)&(pm8001_ha->irq_vector[0]));
+		INIT_WORK(&pm8001_ha->work[0], pm8001_work);
 		return;
 	}
 	for (i = 0; i < PM8001_MAX_MSIX_VEC; i++)
-		tasklet_init(&pm8001_ha->tasklet[i], pm8001_tasklet,
-			     (unsigned long)&(pm8001_ha->irq_vector[i]));
+		INIT_WORK(&pm8001_ha->work[i], pm8001_work);
 }
 
-static void pm8001_kill_tasklet(struct pm8001_hba_info *pm8001_ha)
+static void pm8001_cancel_work(struct pm8001_hba_info *pm8001_ha)
 {
 	int i;
 
-	if (!pm8001_use_tasklet)
+	if (!pm8001_use_bh_work)
 		return;
 
 	/* For non-msix and msix interrupts */
 	if ((!pm8001_ha->pdev->msix_cap || !pci_msi_enabled()) ||
 	    (pm8001_ha->chip_id == chip_8001)) {
-		tasklet_kill(&pm8001_ha->tasklet[0]);
+		cancel_work_sync(&pm8001_ha->work[0]);
 		return;
 	}
 
 	for (i = 0; i < PM8001_MAX_MSIX_VEC; i++)
-		tasklet_kill(&pm8001_ha->tasklet[i]);
+		cancel_work_sync(&pm8001_ha->work[i]);
 }
 
 static irqreturn_t pm8001_handle_irq(struct pm8001_hba_info *pm8001_ha,
@@ -274,10 +275,10 @@ static irqreturn_t pm8001_handle_irq(struct pm8001_hba_info *pm8001_ha,
 	if (!PM8001_CHIP_DISP->is_our_interrupt(pm8001_ha))
 		return IRQ_NONE;
 
-	if (!pm8001_use_tasklet)
+	if (!pm8001_use_bh_work)
 		return PM8001_CHIP_DISP->isr(pm8001_ha, irq);
 
-	tasklet_schedule(&pm8001_ha->tasklet[irq]);
+	queue_work(system_bh_wq, &pm8001_ha->work[irq]);
 	return IRQ_HANDLED;
 }
 
@@ -580,7 +581,7 @@ static struct pm8001_hba_info *pm8001_pci_alloc(struct pci_dev *pdev,
 	else
 		pm8001_ha->iomb_size = IOMB_SIZE_SPC;
 
-	pm8001_init_tasklet(pm8001_ha);
+	pm8001_init_work(pm8001_ha);
 
 	if (pm8001_ioremap(pm8001_ha))
 		goto failed_pci_alloc;
@@ -1318,7 +1319,7 @@ static void pm8001_pci_remove(struct pci_dev *pdev)
 	PM8001_CHIP_DISP->chip_soft_rst(pm8001_ha);
 
 	pm8001_free_irq(pm8001_ha);
-	pm8001_kill_tasklet(pm8001_ha);
+	pm8001_cancel_work(pm8001_ha);
 	scsi_host_put(pm8001_ha->shost);
 
 	for (i = 0; i < pm8001_ha->ccb_count; i++) {
@@ -1361,7 +1362,7 @@ static int __maybe_unused pm8001_pci_suspend(struct device *dev)
 	PM8001_CHIP_DISP->chip_soft_rst(pm8001_ha);
 
 	pm8001_free_irq(pm8001_ha);
-	pm8001_kill_tasklet(pm8001_ha);
+	pm8001_cancel_work(pm8001_ha);
 
 	pm8001_info(pm8001_ha, "pdev=0x%p, slot=%s, entering "
 		      "suspended state\n", pdev,
@@ -1410,7 +1411,7 @@ static int __maybe_unused pm8001_pci_resume(struct device *dev)
 	if (rc)
 		goto err_out_disable;
 
-	pm8001_init_tasklet(pm8001_ha);
+	pm8001_init_work(pm8001_ha);
 
 	PM8001_CHIP_DISP->interrupt_enable(pm8001_ha, 0);
 	if (pm8001_ha->chip_id != chip_8001) {
@@ -1543,8 +1544,8 @@ static int __init pm8001_init(void)
 {
 	int rc = -ENOMEM;
 
-	if (pm8001_use_tasklet && !pm8001_use_msix)
-		pm8001_use_tasklet = false;
+	if (pm8001_use_bh_work && !pm8001_use_msix)
+		pm8001_use_bh_work = false;
 
 	pm8001_wq = alloc_workqueue("pm80xx", 0, 0);
 	if (!pm8001_wq)
diff --git a/drivers/scsi/pm8001/pm8001_sas.h b/drivers/scsi/pm8001/pm8001_sas.h
index 3ccb7371902f..08ab597406c7 100644
--- a/drivers/scsi/pm8001/pm8001_sas.h
+++ b/drivers/scsi/pm8001/pm8001_sas.h
@@ -522,7 +522,7 @@ struct pm8001_hba_info {
 	int			number_of_intr;/*will be used in remove()*/
 	char			intr_drvname[PM8001_MAX_MSIX_VEC]
 				[PM8001_NAME_LENGTH+1+3+1];
-	struct tasklet_struct	tasklet[PM8001_MAX_MSIX_VEC];
+	struct work_struct	work[PM8001_MAX_MSIX_VEC];
 	u32			logging_level;
 	u32			link_rate;
 	u32			fw_status;
diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
index e8bcc3a88732..be21c0ffe002 100644
--- a/drivers/scsi/pmcraid.c
+++ b/drivers/scsi/pmcraid.c
@@ -859,7 +859,7 @@ static void _pmcraid_fire_command(struct pmcraid_cmd *cmd)
 	/* Add this command block to pending cmd pool. We do this prior to
 	 * writting IOARCB to ioarrin because IOA might complete the command
 	 * by the time we are about to add it to the list. Response handler
-	 * (isr/tasklet) looks for cmd block in the pending pending list.
+	 * (isr/BH work) looks for cmd block in the pending list.
 	 */
 	spin_lock_irqsave(&pinstance->pending_pool_lock, lock_flags);
 	list_add_tail(&cmd->free_list, &pinstance->pending_cmd_pool);
@@ -1077,7 +1077,7 @@ static void pmcraid_identify_hrrq(struct pmcraid_cmd *cmd)
 
 	/* Subsequent commands require HRRQ identification to be successful.
 	 * Note that this gets called even during reset from SCSI mid-layer
-	 * or tasklet
+	 * or BH work
 	 */
 	pmcraid_send_cmd(cmd, done_function,
 			 PMCRAID_INTERNAL_TIMEOUT,
@@ -1843,7 +1843,7 @@ static void pmcraid_unregister_hcams(struct pmcraid_cmd *cmd)
 {
 	struct pmcraid_instance *pinstance = cmd->drv_inst;
 
-	/* During IOA bringdown, HCAM gets fired and tasklet proceeds with
+	/* During IOA bringdown, HCAM gets fired and BH work proceeds with
 	 * handling hcam response though it is not necessary. In order to
 	 * prevent this, set 'ignore', so that bring-down sequence doesn't
 	 * re-send any more hcams
@@ -1916,7 +1916,7 @@ static void pmcraid_soft_reset(struct pmcraid_cmd *cmd)
 	u32 doorbell;
 
 	/* There will be an interrupt when Transition to Operational bit is
-	 * set so tasklet would execute next reset task. The timeout handler
+	 * set so BH work would execute next reset task. The timeout handler
 	 * would re-initiate a reset
 	 */
 	cmd->cmd_done = pmcraid_ioa_reset;
@@ -2039,7 +2039,7 @@ static void pmcraid_fail_outstanding_cmds(struct pmcraid_instance *pinstance)
  * @cmd: pointer to the cmd block to be used for entire reset process
  *
  * This function executes most of the steps required for IOA reset. This gets
- * called by user threads (modprobe/insmod/rmmod) timer, tasklet and midlayer's
+ * called by user threads (modprobe/insmod/rmmod) timer, BH work and midlayer's
  * 'eh_' thread. Access to variables used for controlling the reset sequence is
  * synchronized using host lock. Various functions called during reset process
  * would make use of a single command block, pointer to which is also stored in
@@ -2199,7 +2199,7 @@ static void pmcraid_ioa_reset(struct pmcraid_cmd *cmd)
 		pinstance->ioa_state = IOA_STATE_IN_BRINGUP;
 
 		/* Initialization commands start with HRRQ identification. From
-		 * now on tasklet completes most of the commands as IOA is up
+		 * now on BH work completes most of the commands as IOA is up
 		 * and intrs are enabled
 		 */
 		pmcraid_identify_hrrq(cmd);
@@ -2261,7 +2261,7 @@ static void pmcraid_ioa_reset(struct pmcraid_cmd *cmd)
 
 /**
  * pmcraid_initiate_reset - initiates reset sequence. This is called from
- * ISR/tasklet during error interrupts including IOA unit check. If reset
+ * ISR/BH work during error interrupts including IOA unit check. If reset
  * is already in progress, it just returns, otherwise initiates IOA reset
  * to bring IOA up to operational state.
  *
@@ -2303,7 +2303,7 @@ static void pmcraid_initiate_reset(struct pmcraid_instance *pinstance)
  * @target_state: expected target state after reset
  *
  * Note: This command initiates reset and waits for its completion. Hence this
- * should not be called from isr/timer/tasklet functions (timeout handlers,
+ * should not be called from isr/timer/BH work functions (timeout handlers,
  * error response handlers and interrupt handlers).
  *
  * Return Value
@@ -2449,7 +2449,7 @@ static void pmcraid_request_sense(struct pmcraid_cmd *cmd)
 	ioadl->flags = IOADL_FLAGS_LAST_DESC;
 
 	/* request sense might be called as part of error response processing
-	 * which runs in tasklets context. It is possible that mid-layer might
+	 * which runs in BH work context. It is possible that mid-layer might
 	 * schedule queuecommand during this time, hence, writting to IOARRIN
 	 * must be protect by host_lock
 	 */
@@ -2566,7 +2566,7 @@ static void pmcraid_frame_auto_sense(struct pmcraid_cmd *cmd)
  * @cmd: pointer to pmcraid_cmd that has failed
  *
  * This function determines whether or not to initiate ERP on the affected
- * device. This is called from a tasklet, which doesn't hold any locks.
+ * device. This is called from a BH work, which doesn't hold any locks.
  *
  * Return value:
  *	 0 it caller can complete the request, otherwise 1 where in error
@@ -2825,7 +2825,7 @@ static int _pmcraid_io_done(struct pmcraid_cmd *cmd, int reslen, int ioasc)
  *
  * @cmd: pointer to pmcraid command struct
  *
- * This function is invoked by tasklet/mid-layer error handler to completing
+ * This function is invoked by BH work/mid-layer error handler to completing
  * the SCSI ops sent from mid-layer.
  *
  * Return value
@@ -3743,7 +3743,7 @@ static irqreturn_t pmcraid_isr_msix(int irq, void *dev_id)
 		}
 	}
 
-	tasklet_schedule(&(pinstance->isr_tasklet[hrrq_id]));
+	queue_work(system_bh_wq, &(pinstance->isr_work[hrrq_id]));
 
 	return IRQ_HANDLED;
 }
@@ -3811,8 +3811,8 @@ static irqreturn_t pmcraid_isr(int irq, void *dev_id)
 			ioread32(
 				pinstance->int_regs.ioa_host_interrupt_clr_reg);
 
-			tasklet_schedule(
-					&(pinstance->isr_tasklet[hrrq_id]));
+			queue_work(system_bh_wq,
+					&(pinstance->isr_work[hrrq_id]));
 		}
 	}
 
@@ -3918,14 +3918,14 @@ static void pmcraid_worker_function(struct work_struct *workp)
 }
 
 /**
- * pmcraid_tasklet_function - Tasklet function
+ * pmcraid_work_function - Work function
  *
- * @instance: pointer to msix param structure
+ * @t:  pointer to work_struct
  *
  * Return Value
  *	None
  */
-static void pmcraid_tasklet_function(unsigned long instance)
+static void pmcraid_work_function(struct work_struct *t)
 {
 	struct pmcraid_isr_param *hrrq_vector;
 	struct pmcraid_instance *pinstance;
@@ -3936,14 +3936,17 @@ static void pmcraid_tasklet_function(unsigned long instance)
 	int id;
 	u32 resp;
 
-	hrrq_vector = (struct pmcraid_isr_param *)instance;
-	pinstance = hrrq_vector->drv_inst;
+	/* FIXME: Since we don't know the index, we need a
+	 * mechanism to determine it or always use index 0
+	 */
+	pinstance = from_work(pinstance, t, isr_work[0]);
+	hrrq_vector = pinstance->hrrq_vector;
 	id = hrrq_vector->hrrq_id;
 	lockp = &(pinstance->hrrq_lock[id]);
 
 	/* loop through each of the commands responded by IOA. Each HRRQ buf is
 	 * protected by its own lock. Traversals must be done within this lock
-	 * as there may be multiple tasklets running on multiple CPUs. Note
+	 * as there may be multiple BH work items running on multiple CPUs. Note
 	 * that the lock is held just for picking up the response handle and
 	 * manipulating hrrq_curr/toggle_bit values.
 	 */
@@ -4416,35 +4419,34 @@ static int pmcraid_allocate_config_buffers(struct pmcraid_instance *pinstance)
 }
 
 /**
- * pmcraid_init_tasklets - registers tasklets for response handling
+ * pmcraid_init_works - registers works for response handling
  *
  * @pinstance: pointer adapter instance structure
  *
  * Return value
  *	none
  */
-static void pmcraid_init_tasklets(struct pmcraid_instance *pinstance)
+static void pmcraid_init_works(struct pmcraid_instance *pinstance)
 {
 	int i;
 	for (i = 0; i < pinstance->num_hrrq; i++)
-		tasklet_init(&pinstance->isr_tasklet[i],
-			     pmcraid_tasklet_function,
-			     (unsigned long)&pinstance->hrrq_vector[i]);
+		INIT_WORK(&pinstance->isr_work[i],
+			     pmcraid_work_function);
 }
 
 /**
- * pmcraid_kill_tasklets - destroys tasklets registered for response handling
+ * pmcraid_kill_works - destroys works registered for response handling
  *
  * @pinstance: pointer to adapter instance structure
  *
  * Return value
  *	none
  */
-static void pmcraid_kill_tasklets(struct pmcraid_instance *pinstance)
+static void pmcraid_kill_works(struct pmcraid_instance *pinstance)
 {
 	int i;
 	for (i = 0; i < pinstance->num_hrrq; i++)
-		tasklet_kill(&pinstance->isr_tasklet[i]);
+		cancel_work_sync(&pinstance->isr_work[i]);
 }
 
 /**
@@ -4770,7 +4772,7 @@ static void pmcraid_remove(struct pci_dev *pdev)
 	pmcraid_disable_interrupts(pinstance, ~0);
 	flush_work(&pinstance->worker_q);
 
-	pmcraid_kill_tasklets(pinstance);
+	pmcraid_kill_works(pinstance);
 	pmcraid_unregister_interrupt_handler(pinstance);
 	pmcraid_release_buffers(pinstance);
 	iounmap(pinstance->mapped_dma_addr);
@@ -4794,7 +4796,7 @@ static int __maybe_unused pmcraid_suspend(struct device *dev)
 
 	pmcraid_shutdown(pdev);
 	pmcraid_disable_interrupts(pinstance, ~0);
-	pmcraid_kill_tasklets(pinstance);
+	pmcraid_kill_works(pinstance);
 	pmcraid_unregister_interrupt_handler(pinstance);
 
 	return 0;
@@ -4836,7 +4838,7 @@ static int __maybe_unused pmcraid_resume(struct device *dev)
 		goto release_host;
 	}
 
-	pmcraid_init_tasklets(pinstance);
+	pmcraid_init_works(pinstance);
 	pmcraid_enable_interrupts(pinstance, PMCRAID_PCI_INTERRUPTS);
 
 	/* Start with hard reset sequence which brings up IOA to operational
@@ -4850,14 +4852,14 @@ static int __maybe_unused pmcraid_resume(struct device *dev)
 	if (pmcraid_reset_bringup(pinstance)) {
 		dev_err(&pdev->dev, "couldn't initialize IOA\n");
 		rc = -ENODEV;
-		goto release_tasklets;
+		goto release_works;
 	}
 
 	return 0;
 
-release_tasklets:
+release_works:
 	pmcraid_disable_interrupts(pinstance, ~0);
-	pmcraid_kill_tasklets(pinstance);
+	pmcraid_kill_works(pinstance);
 	pmcraid_unregister_interrupt_handler(pinstance);
 
 release_host:
@@ -4869,7 +4871,7 @@ static int __maybe_unused pmcraid_resume(struct device *dev)
 }
 
 /**
- * pmcraid_complete_ioa_reset - Called by either timer or tasklet during
+ * pmcraid_complete_ioa_reset - Called by either timer or BH work during
  *				completion of the ioa reset
  * @cmd: pointer to reset command block
  */
@@ -5014,7 +5016,7 @@ static void pmcraid_init_res_table(struct pmcraid_cmd *cmd)
 
 	/* resource list is protected by pinstance->resource_lock.
 	 * init_res_table can be called from probe (user-thread) or runtime
-	 * reset (timer/tasklet)
+	 * reset (timer/BH work)
 	 */
 	spin_lock_irqsave(&pinstance->resource_lock, lock_flags);
 
@@ -5281,7 +5283,7 @@ static int pmcraid_probe(struct pci_dev *pdev,
 		goto out_scsi_host_put;
 	}
 
-	pmcraid_init_tasklets(pinstance);
+	pmcraid_init_works(pinstance);
 
 	/* allocate verious buffers used by LLD.*/
 	rc = pmcraid_init_buffers(pinstance);
@@ -5337,7 +5339,7 @@ static int pmcraid_probe(struct pci_dev *pdev,
 	pmcraid_release_buffers(pinstance);
 
 out_unregister_isr:
-	pmcraid_kill_tasklets(pinstance);
+	pmcraid_kill_works(pinstance);
 	pmcraid_unregister_interrupt_handler(pinstance);
 
 out_scsi_host_put:
diff --git a/drivers/scsi/pmcraid.h b/drivers/scsi/pmcraid.h
index 9f59930e8b4f..2c17b1f4b57e 100644
--- a/drivers/scsi/pmcraid.h
+++ b/drivers/scsi/pmcraid.h
@@ -20,6 +20,7 @@
 #include <net/netlink.h>
 #include <net/genetlink.h>
 #include <linux/connector.h>
+#include <linux/workqueue.h>
 /*
  * Driver name   : string representing the driver name
  * Device file   : /dev file to be used for management interfaces
@@ -752,8 +753,8 @@ struct pmcraid_instance {
 	spinlock_t free_pool_lock;		/* free pool lock */
 	spinlock_t pending_pool_lock;		/* pending pool lock */
 
-	/* Tasklet to handle deferred processing */
-	struct tasklet_struct isr_tasklet[PMCRAID_NUM_MSIX_VECTORS];
+	/* BH work to handle deferred processing */
+	struct work_struct isr_work[PMCRAID_NUM_MSIX_VECTORS];
 
 	/* Work-queue (Shared) for deferred reset processing */
 	struct work_struct worker_q;
-- 
2.17.1
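
The two FIXME comments in the patch above (in pm8001_work() and
pmcraid_work_function()) point out that a work callback only receives the
work_struct pointer, so the per-vector index that the tasklet got through
its 'unsigned long' argument is lost. Resolving the container through
work[0] or isr_work[0] gives the right adapter pointer only when element 0
is the one that was queued. One common way around this is to embed each
work item in a small per-vector structure that also carries the index and a
back pointer. The following is a minimal userspace sketch of that idea, not
driver code: struct vec_work, struct hba, hba_init_work() and hba_work_fn()
are hypothetical names, and work_struct and container_of are simplified
stand-ins for the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct work_struct (assumption). */
struct work_struct { int pending; };

/* Same idea as the kernel macro: step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define MAX_VEC 4

struct hba;

/* Hypothetical per-vector wrapper: the work item sits next to its index. */
struct vec_work {
	struct work_struct work;
	struct hba *owner;	/* back pointer to the adapter */
	int irq_id;		/* vector index this work item serves */
};

/* Hypothetical adapter structure holding one wrapper per vector. */
struct hba {
	struct vec_work vec[MAX_VEC];
	int last_serviced;
};

/* Record owner and index once, at init time, for every vector. */
static void hba_init_work(struct hba *h)
{
	for (int i = 0; i < MAX_VEC; i++) {
		h->vec[i].owner = h;
		h->vec[i].irq_id = i;
		h->vec[i].work.pending = 0;
	}
	h->last_serviced = -1;
}

/* Work callback: recovers both the adapter and the vector index from 't'. */
static int hba_work_fn(struct work_struct *t)
{
	struct vec_work *vw = container_of(t, struct vec_work, work);

	vw->owner->last_serviced = vw->irq_id;
	return vw->irq_id;
}
```

With this layout the callback no longer depends on which array element was
queued: passing &h.vec[2].work to hba_work_fn() yields index 2 and the
owning adapter, without guessing at index 0.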


^ permalink raw reply related	[relevance 14%]

* RE: [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-05-01 14:01 79%     ` Konstantin Taranov
@ 2024-05-02 17:05 79%       ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-05-02 17:05 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Here is the PR to rdma-core with the changes:
> https://github.com/linux-rdma/rdma-core/pull/1455
> The code was tested in 4 variations (old + new kernels against old + new
> rdma-cores) to confirm compatibility.
>
> Konstantin

There are minor issues with the rdma-core change. This kernel patch looks good to me.

I have added "Reviewed-by" on this patch.

Long

^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-04-26 13:12 68% ` [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
  2024-04-29 18:08 79%   ` Long Li
@ 2024-05-02 17:04 79%   ` Long Li
  1 sibling, 0 replies; 200+ results
From: Long Li @ 2024-05-02 17:04 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation
> of rnic cq
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Enable users to create RNIC CQs using a corresponding flag.
> With the previous request size, an ethernet CQ is created.
> As a response, return ID of the created CQ.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/cq.c | 55 ++++++++++++++++++++++++++++++---
>  include/uapi/rdma/mana-abi.h    | 12 +++++++
>  2 files changed, 63 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index 688ffe6..c6a3fd5 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -9,17 +9,22 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
>  		      struct ib_udata *udata)
>  {
>  	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> +	struct mana_ib_create_cq_resp resp = {};
> +	struct mana_ib_ucontext *mana_ucontext;
>  	struct ib_device *ibdev = ibcq->device;
>  	struct mana_ib_create_cq ucmd = {};
>  	struct mana_ib_dev *mdev;
> +	bool is_rnic_cq;
> +	u32 doorbell;
>  	int err;
> 
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> 
> -	if (udata->inlen < sizeof(ucmd))
> -		return -EINVAL;
> -
>  	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> +	cq->cq_handle = INVALID_MANA_HANDLE;
> +
> +	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> +		return -EINVAL;
> 
>  	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
>  	if (err) {
> @@ -28,7 +33,9 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
>  		return err;
>  	}
> 
> -	if (attr->cqe > mdev->adapter_caps.max_qp_wr) {
> +	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> +
> +	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
>  		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
>  		return -EINVAL;
>  	}
> @@ -40,7 +47,41 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
>  		return err;
>  	}
> 
> +	mana_ucontext = rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
> +						  ibucontext);
> +	doorbell = mana_ucontext->doorbell;
> +
> +	if (is_rnic_cq) {
> +		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to create RNIC cq, %d\n", err);
> +			goto err_destroy_queue;
> +		}
> +
> +		err = mana_ib_install_cq_cb(mdev, cq);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to install cq callback, %d\n", err);
> +			goto err_destroy_rnic_cq;
> +		}
> +	}
> +
> +	resp.cqid = cq->queue.id;
> +	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
> +	if (err) {
> +		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
> +		goto err_remove_cq_cb;
> +	}
> +
>  	return 0;
> +
> +err_remove_cq_cb:
> +	mana_ib_remove_cq_cb(mdev, cq);
> +err_destroy_rnic_cq:
> +	mana_ib_gd_destroy_cq(mdev, cq);
> +err_destroy_queue:
> +	mana_ib_destroy_queue(mdev, &cq->queue);
> +
> +	return err;
>  }
> 
>  int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
> @@ -52,6 +93,12 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> 
>  	mana_ib_remove_cq_cb(mdev, cq);
> +
> +	/* Ignore return code as there is not much we can do about it.
> +	 * The error message is printed inside.
> +	 */
> +	mana_ib_gd_destroy_cq(mdev, cq);
> +
>  	mana_ib_destroy_queue(mdev, &cq->queue);
> 
>  	return 0;
> diff --git a/include/uapi/rdma/mana-abi.h b/include/uapi/rdma/mana-abi.h
> index 5fcb31b..2c41cc3 100644
> --- a/include/uapi/rdma/mana-abi.h
> +++ b/include/uapi/rdma/mana-abi.h
> @@ -16,8 +16,20 @@
> 
>  #define MANA_IB_UVERBS_ABI_VERSION 1
> 
> +enum mana_ib_create_cq_flags {
> +	MANA_IB_CREATE_RNIC_CQ	= 1 << 0,
> +};
> +
>  struct mana_ib_create_cq {
>  	__aligned_u64 buf_addr;
> +	__u16	flags;
> +	__u16	reserved0;
> +	__u32	reserved1;
> +};
> +
> +struct mana_ib_create_cq_resp {
> +	__u32 cqid;
> +	__u32 reserved;
>  };
> 
>  struct mana_ib_create_qp {
> --
> 2.43.0


^ permalink raw reply	[relevance 79%]

* [PATCH v2] fs/coredump: Enable dynamic configuration of max file note size
@ 2024-05-02 14:59 66% Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-05-02 14:59 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, linux-mm, viro, brauner, jack, ebiederm, keescook,
	mcgrof, j.granados

Introduce the capability to dynamically configure the maximum file
note size for ELF core dumps via sysctl. This enhancement removes
the previous static limit of 4MB, allowing system administrators to
adjust the size based on system-specific requirements or constraints.

- Remove hardcoded `MAX_FILE_NOTE_SIZE` from `fs/binfmt_elf.c`.
- Define `core_file_note_size_max` in `fs/coredump.c` with an initial
  value of 4MB.
- Declare `core_file_note_size_max` as an external variable in
  `include/linux/coredump.h`.
- Add a new sysctl entry in `fs/coredump.c` to manage this setting
  at runtime.

$ sysctl -a | grep max_file_note_size
kernel.max_file_note_size = 4194304

$ sysctl -n kernel.max_file_note_size
4194304

$ echo 519304 > /proc/sys/kernel/max_file_note_size

$ sysctl -n kernel.max_file_note_size
519304

Why is this being done?
We have observed that during a crash when there are more than 65k mmaps
in memory, the existing fixed limit on the size of the ELF notes section
becomes a bottleneck. The notes section quickly reaches its capacity,
leading to incomplete memory segment information in the resulting coredump.
This truncation compromises the utility of the coredumps, as crucial
information about the memory state at the time of the crash might be
omitted.

Signed-off-by: Vijay Nag <nagvijay@microsoft.com>
Signed-off-by: Allen Pais <apais@linux.microsoft.com>

---
Changes in v2:
   - Move new sysctl to fs/coredump.c [Luis & Kees]
   - rename max_file_note_size to core_file_note_size_max [kees]
   - Capture "why this is being done?" in the commit message [Luis & Kees]
---
 fs/binfmt_elf.c          |  3 +--
 fs/coredump.c            | 10 ++++++++++
 include/linux/coredump.h |  1 +
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5397b552fbeb..6aebd062b92b 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1564,7 +1564,6 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
 	fill_note(note, "CORE", NT_SIGINFO, sizeof(*csigdata), csigdata);
 }
 
-#define MAX_FILE_NOTE_SIZE (4*1024*1024)
 /*
  * Format of NT_FILE note:
  *
@@ -1592,7 +1591,7 @@ static int fill_files_note(struct memelfnote *note, struct coredump_params *cprm
 
 	names_ofs = (2 + 3 * count) * sizeof(data[0]);
  alloc:
-	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
+	if (size >= core_file_note_size_max) /* paranoia check */
 		return -EINVAL;
 	size = round_up(size, PAGE_SIZE);
 	/*
diff --git a/fs/coredump.c b/fs/coredump.c
index be6403b4b14b..a312be48030f 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -56,10 +56,13 @@
 static bool dump_vma_snapshot(struct coredump_params *cprm);
 static void free_vma_snapshot(struct coredump_params *cprm);
 
+#define MAX_FILE_NOTE_SIZE (4*1024*1024)
+
 static int core_uses_pid;
 static unsigned int core_pipe_limit;
 static char core_pattern[CORENAME_MAX_SIZE] = "core";
 static int core_name_size = CORENAME_MAX_SIZE;
+unsigned int core_file_note_size_max = MAX_FILE_NOTE_SIZE;
 
 struct core_name {
 	char *corename;
@@ -1020,6 +1023,13 @@ static struct ctl_table coredump_sysctls[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname       = "core_file_note_size_max",
+		.data           = &core_file_note_size_max,
+		.maxlen         = sizeof(unsigned int),
+		.mode           = 0644,
+		.proc_handler   = proc_douintvec,
+	},
 };
 
 static int __init init_fs_coredump_sysctls(void)
diff --git a/include/linux/coredump.h b/include/linux/coredump.h
index d3eba4360150..14c057643e7f 100644
--- a/include/linux/coredump.h
+++ b/include/linux/coredump.h
@@ -46,6 +46,7 @@ static inline void do_coredump(const kernel_siginfo_t *siginfo) {}
 #endif
 
 #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL)
+extern unsigned int core_file_note_size_max;
 extern void validate_coredump_safety(void);
 #else
 static inline void validate_coredump_safety(void) {}
-- 
2.17.1
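
To put the "more than 65k mmaps" remark in numbers, the names_ofs
expression in the fill_files_note() hunk above gives the fixed overhead of
the NT_FILE note before any file name is stored. The sketch below is only a
rough illustration of that arithmetic: it assumes a 64-bit target where
each data[] entry is 8 bytes, and names_ofs() and name_budget() are
hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit target, so sizeof(data[0]) is 8 bytes. */
#define ENTRY_SIZE 8u

/* Offset at which file names start: two header words plus three per mapping. */
static uint64_t names_ofs(uint64_t count)
{
	return (2 + 3 * count) * ENTRY_SIZE;
}

/*
 * Average number of bytes available per file name (including its NUL)
 * once the header is accounted for, under a note size limit of 'limit'.
 * Returns 0 if the header alone already reaches the limit.
 */
static uint64_t name_budget(uint64_t count, uint64_t limit)
{
	uint64_t hdr = names_ofs(count);

	if (count == 0 || hdr >= limit)
		return 0;
	return (limit - hdr) / count;
}
```

For 65536 mappings the header alone is 1572880 bytes, which leaves about 39
bytes per path under the 4MB default; raising the limit to 16MB leaves
about 231 bytes per path.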


^ permalink raw reply related	[relevance 66%]

* RE: [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-04-29 18:08 79%   ` Long Li
@ 2024-05-01 14:01 79%     ` Konstantin Taranov
  2024-05-02 17:05 79%       ` Long Li
  0 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-05-01 14:01 UTC (permalink / raw)
  To: Long Li, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> 
> For this review, it will be helpful if you can also post a link to the rdma-core
> changes.
> 
> Long

Here is the PR to rdma-core with the changes: https://github.com/linux-rdma/rdma-core/pull/1455
The code was tested in 4 variations (old + new kernels against old + new rdma-cores) to confirm compatibility.

Konstantin

^ permalink raw reply	[relevance 79%]

* Re: [PATCH v1 03/12] drm/i915: Make I2C terminology more inclusive
  @ 2024-04-30 21:40 75%     ` Easwar Hariharan
  0 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 21:40 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
	Daniel Vetter, Zhenyu Wang, Zhi Wang,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL GVT-g DRIVERS (Intel GPU Virtualization),
	Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

On 4/30/2024 1:29 PM, Rodrigo Vivi wrote:
> On Tue, Apr 30, 2024 at 05:38:02PM +0000, Easwar Hariharan wrote:
>> I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
>> with more appropriate terms. Inspired by and following on to Wolfram's
>> series to fix drivers/i2c/[1], fix the terminology for users of
>> I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
>> in the specification.
>>
>> Compile tested, no functionality changes intended
>>
>> [1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
>>
>> Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
> 
> I'm glad to see this change!
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
>> ---
>>  drivers/gpu/drm/i915/display/dvo_ch7017.c     | 14 ++++-----
>>  drivers/gpu/drm/i915/display/dvo_ch7xxx.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/dvo_ivch.c       | 16 +++++-----
>>  drivers/gpu/drm/i915/display/dvo_ns2501.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/dvo_sil164.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/dvo_tfp410.c     | 18 +++++------
>>  drivers/gpu/drm/i915/display/intel_bios.c     | 22 +++++++-------
>>  drivers/gpu/drm/i915/display/intel_ddi.c      |  2 +-
>>  .../gpu/drm/i915/display/intel_display_core.h |  2 +-
>>  drivers/gpu/drm/i915/display/intel_dsi.h      |  2 +-
>>  drivers/gpu/drm/i915/display/intel_dsi_vbt.c  | 20 ++++++-------
>>  drivers/gpu/drm/i915/display/intel_dvo.c      | 14 ++++-----
>>  drivers/gpu/drm/i915/display/intel_dvo_dev.h  |  2 +-
>>  drivers/gpu/drm/i915/display/intel_gmbus.c    |  4 +--
>>  drivers/gpu/drm/i915/display/intel_sdvo.c     | 30 +++++++++----------
>>  drivers/gpu/drm/i915/display/intel_vbt_defs.h |  4 +--
>>  drivers/gpu/drm/i915/gvt/edid.c               | 28 ++++++++---------
>>  drivers/gpu/drm/i915/gvt/edid.h               |  4 +--
>>  drivers/gpu/drm/i915/gvt/opregion.c           |  2 +-
>>  19 files changed, 119 insertions(+), 119 deletions(-)
> 
> The chances of conflicts are high with this many changes,
> but should be easy enough to deal with later, so feel free
> to move with this i915 patch on any other tree and we catch-up
> later.
> 
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 

Thanks for the review and ack! I had actually expected this series to go in as individual
patches via the respective trees, since dropping the final treewide patch makes it completely
independent of Wolfram's enabling series.

What do you think?

Thanks,
Easwar


^ permalink raw reply	[relevance 75%]

* [PATCH v1 12/12] fbdev/viafb: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (10 preceding siblings ...)
  2024-04-30 17:38 70% ` [PATCH v1 11/12] fbdev/smscufx: " Easwar Hariharan
@ 2024-04-30 17:38 47% ` Easwar Hariharan
    11 siblings, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Florian Tobias Schandinat, Helge Deller,
	open list:VIA UNICHROME(PRO)/CHROME9 FRAMEBUFFER DRIVER,
	open list:FRAMEBUFFER LAYER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/video/fbdev/via/chip.h    |  8 ++++----
 drivers/video/fbdev/via/dvi.c     | 24 ++++++++++++------------
 drivers/video/fbdev/via/lcd.c     |  6 +++---
 drivers/video/fbdev/via/via_aux.h |  2 +-
 drivers/video/fbdev/via/via_i2c.c | 12 ++++++------
 drivers/video/fbdev/via/vt1636.c  |  6 +++---
 6 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/via/chip.h b/drivers/video/fbdev/via/chip.h
index f0a19cbcb9e5..1ea6d4ce79e7 100644
--- a/drivers/video/fbdev/via/chip.h
+++ b/drivers/video/fbdev/via/chip.h
@@ -69,7 +69,7 @@
 #define     VT1632_TMDS             0x01
 #define     INTEGRATED_TMDS         0x42
 
-/* Definition TMDS Trasmitter I2C Slave Address */
+/* Definition TMDS Trasmitter I2C Client Address */
 #define     VT1632_TMDS_I2C_ADDR    0x10
 
 /**************************************************/
@@ -88,21 +88,21 @@
 #define     TX_DATA_DDR_MODE        0x04
 #define     TX_DATA_SDR_MODE        0x08
 
-/* Definition LVDS Trasmitter I2C Slave Address */
+/* Definition LVDS Trasmitter I2C Client Address */
 #define     VT1631_LVDS_I2C_ADDR    0x70
 #define     VT3271_LVDS_I2C_ADDR    0x80
 #define     VT1636_LVDS_I2C_ADDR    0x80
 
 struct tmds_chip_information {
 	int tmds_chip_name;
-	int tmds_chip_slave_addr;
+	int tmds_chip_client_addr;
 	int output_interface;
 	int i2c_port;
 };
 
 struct lvds_chip_information {
 	int lvds_chip_name;
-	int lvds_chip_slave_addr;
+	int lvds_chip_client_addr;
 	int output_interface;
 	int i2c_port;
 };
diff --git a/drivers/video/fbdev/via/dvi.c b/drivers/video/fbdev/via/dvi.c
index 13147e3066eb..db7db26416c3 100644
--- a/drivers/video/fbdev/via/dvi.c
+++ b/drivers/video/fbdev/via/dvi.c
@@ -70,7 +70,7 @@ bool viafb_tmds_trasmitter_identify(void)
 	/* Check for VT1632: */
 	viaparinfo->chip_info->tmds_chip_info.tmds_chip_name = VT1632_TMDS;
 	viaparinfo->chip_info->
-		tmds_chip_info.tmds_chip_slave_addr = VT1632_TMDS_I2C_ADDR;
+		tmds_chip_info.tmds_chip_client_addr = VT1632_TMDS_I2C_ADDR;
 	viaparinfo->chip_info->tmds_chip_info.i2c_port = VIA_PORT_31;
 	if (check_tmds_chip(VT1632_DEVICE_ID_REG, VT1632_DEVICE_ID)) {
 		/*
@@ -128,14 +128,14 @@ bool viafb_tmds_trasmitter_identify(void)
 	viaparinfo->chip_info->
 		tmds_chip_info.tmds_chip_name = NON_TMDS_TRANSMITTER;
 	viaparinfo->chip_info->tmds_chip_info.
-		tmds_chip_slave_addr = VT1632_TMDS_I2C_ADDR;
+		tmds_chip_client_addr = VT1632_TMDS_I2C_ADDR;
 	return false;
 }
 
 static void tmds_register_write(int index, u8 data)
 {
 	viafb_i2c_writebyte(viaparinfo->chip_info->tmds_chip_info.i2c_port,
-			    viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr,
+			    viaparinfo->chip_info->tmds_chip_info.tmds_chip_client_addr,
 			    index, data);
 }
 
@@ -144,7 +144,7 @@ static int tmds_register_read(int index)
 	u8 data;
 
 	viafb_i2c_readbyte(viaparinfo->chip_info->tmds_chip_info.i2c_port,
-			   (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr,
+			   (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_client_addr,
 			   (u8) index, &data);
 	return data;
 }
@@ -152,7 +152,7 @@ static int tmds_register_read(int index)
 static int tmds_register_read_bytes(int index, u8 *buff, int buff_len)
 {
 	viafb_i2c_readbytes(viaparinfo->chip_info->tmds_chip_info.i2c_port,
-			    (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr,
+			    (u8) viaparinfo->chip_info->tmds_chip_info.tmds_chip_client_addr,
 			    (u8) index, buff, buff_len);
 	return 0;
 }
@@ -256,14 +256,14 @@ static int viafb_dvi_query_EDID(void)
 
 	DEBUG_MSG(KERN_INFO "viafb_dvi_query_EDID!!\n");
 
-	restore = viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr;
-	viaparinfo->chip_info->tmds_chip_info.tmds_chip_slave_addr = 0xA0;
+	restore = viaparinfo->chip_info->tmds_chip_info.tmds_chip_client_addr;
+	viaparinfo->chip_info->tmds_chip_info.tmds_chip_client_addr = 0xA0;
 
 	data0 = (u8) tmds_register_read(0x00);
 	data1 = (u8) tmds_register_read(0x01);
 	if ((data0 == 0) && (data1 == 0xFF)) {
 		viaparinfo->chip_info->
-			tmds_chip_info.tmds_chip_slave_addr = restore;
+			tmds_chip_info.tmds_chip_client_addr = restore;
 		return EDID_VERSION_1;	/* Found EDID1 Table */
 	}
 
@@ -280,8 +280,8 @@ static void dvi_get_panel_size_from_DDCv1(
 
 	DEBUG_MSG(KERN_INFO "\n dvi_get_panel_size_from_DDCv1 \n");
 
-	restore = tmds_chip->tmds_chip_slave_addr;
-	tmds_chip->tmds_chip_slave_addr = 0xA0;
+	restore = tmds_chip->tmds_chip_client_addr;
+	tmds_chip->tmds_chip_client_addr = 0xA0;
 	for (i = 0x25; i < 0x6D; i++) {
 		switch (i) {
 		case 0x36:
@@ -306,7 +306,7 @@ static void dvi_get_panel_size_from_DDCv1(
 
 	DEBUG_MSG(KERN_INFO "DVI max pixelclock = %d\n",
 		tmds_setting->max_pixel_clock);
-	tmds_chip->tmds_chip_slave_addr = restore;
+	tmds_chip->tmds_chip_client_addr = restore;
 }
 
 /* If Disable DVI, turn off pad */
@@ -427,7 +427,7 @@ void viafb_dvi_enable(void)
 				viafb_i2c_writebyte(viaparinfo->chip_info->
 					tmds_chip_info.i2c_port,
 					viaparinfo->chip_info->
-					tmds_chip_info.tmds_chip_slave_addr,
+					tmds_chip_info.tmds_chip_client_addr,
 					0x08, data);
 			}
 		}
diff --git a/drivers/video/fbdev/via/lcd.c b/drivers/video/fbdev/via/lcd.c
index beec5c8d4d08..9a6e4ac9e551 100644
--- a/drivers/video/fbdev/via/lcd.c
+++ b/drivers/video/fbdev/via/lcd.c
@@ -147,7 +147,7 @@ bool viafb_lvds_trasmitter_identify(void)
 		return true;
 	/* Check for VT1631: */
 	viaparinfo->chip_info->lvds_chip_info.lvds_chip_name = VT1631_LVDS;
-	viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr =
+	viaparinfo->chip_info->lvds_chip_info.lvds_chip_client_addr =
 		VT1631_LVDS_I2C_ADDR;
 
 	if (check_lvds_chip(VT1631_DEVICE_ID_REG, VT1631_DEVICE_ID)) {
@@ -161,7 +161,7 @@ bool viafb_lvds_trasmitter_identify(void)
 
 	viaparinfo->chip_info->lvds_chip_info.lvds_chip_name =
 		NON_LVDS_TRANSMITTER;
-	viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr =
+	viaparinfo->chip_info->lvds_chip_info.lvds_chip_client_addr =
 		VT1631_LVDS_I2C_ADDR;
 	return false;
 }
@@ -327,7 +327,7 @@ static int lvds_register_read(int index)
 	u8 data;
 
 	viafb_i2c_readbyte(VIA_PORT_2C,
-			(u8) viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr,
+			(u8) viaparinfo->chip_info->lvds_chip_info.lvds_chip_client_addr,
 			(u8) index, &data);
 	return data;
 }
diff --git a/drivers/video/fbdev/via/via_aux.h b/drivers/video/fbdev/via/via_aux.h
index 0933bbf20e58..e2b617b1e6fd 100644
--- a/drivers/video/fbdev/via/via_aux.h
+++ b/drivers/video/fbdev/via/via_aux.h
@@ -24,7 +24,7 @@ struct via_aux_drv {
 	struct list_head chain;		/* chain to support multiple drivers */
 
 	struct via_aux_bus *bus;	/* the I2C bus used */
-	u8 addr;			/* the I2C slave address */
+	u8 addr;			/* the I2C client address */
 
 	const char *name;	/* human readable name of the driver */
 	void *data;		/* private data of this driver */
diff --git a/drivers/video/fbdev/via/via_i2c.c b/drivers/video/fbdev/via/via_i2c.c
index 582502810575..907c739475d0 100644
--- a/drivers/video/fbdev/via/via_i2c.c
+++ b/drivers/video/fbdev/via/via_i2c.c
@@ -104,7 +104,7 @@ static void via_i2c_setsda(void *data, int state)
 	spin_unlock_irqrestore(&i2c_vdev->reg_lock, flags);
 }
 
-int viafb_i2c_readbyte(u8 adap, u8 slave_addr, u8 index, u8 *pdata)
+int viafb_i2c_readbyte(u8 adap, u8 client_addr, u8 index, u8 *pdata)
 {
 	int ret;
 	u8 mm1[] = {0x00};
@@ -115,7 +115,7 @@ int viafb_i2c_readbyte(u8 adap, u8 slave_addr, u8 index, u8 *pdata)
 	*pdata = 0;
 	msgs[0].flags = 0;
 	msgs[1].flags = I2C_M_RD;
-	msgs[0].addr = msgs[1].addr = slave_addr / 2;
+	msgs[0].addr = msgs[1].addr = client_addr / 2;
 	mm1[0] = index;
 	msgs[0].len = 1; msgs[1].len = 1;
 	msgs[0].buf = mm1; msgs[1].buf = pdata;
@@ -128,7 +128,7 @@ int viafb_i2c_readbyte(u8 adap, u8 slave_addr, u8 index, u8 *pdata)
 	return ret;
 }
 
-int viafb_i2c_writebyte(u8 adap, u8 slave_addr, u8 index, u8 data)
+int viafb_i2c_writebyte(u8 adap, u8 client_addr, u8 index, u8 data)
 {
 	int ret;
 	u8 msg[2] = { index, data };
@@ -137,7 +137,7 @@ int viafb_i2c_writebyte(u8 adap, u8 slave_addr, u8 index, u8 data)
 	if (!via_i2c_par[adap].is_active)
 		return -ENODEV;
 	msgs.flags = 0;
-	msgs.addr = slave_addr / 2;
+	msgs.addr = client_addr / 2;
 	msgs.len = 2;
 	msgs.buf = msg;
 	ret = i2c_transfer(&via_i2c_par[adap].adapter, &msgs, 1);
@@ -149,7 +149,7 @@ int viafb_i2c_writebyte(u8 adap, u8 slave_addr, u8 index, u8 data)
 	return ret;
 }
 
-int viafb_i2c_readbytes(u8 adap, u8 slave_addr, u8 index, u8 *buff, int buff_len)
+int viafb_i2c_readbytes(u8 adap, u8 client_addr, u8 index, u8 *buff, int buff_len)
 {
 	int ret;
 	u8 mm1[] = {0x00};
@@ -159,7 +159,7 @@ int viafb_i2c_readbytes(u8 adap, u8 slave_addr, u8 index, u8 *buff, int buff_len
 		return -ENODEV;
 	msgs[0].flags = 0;
 	msgs[1].flags = I2C_M_RD;
-	msgs[0].addr = msgs[1].addr = slave_addr / 2;
+	msgs[0].addr = msgs[1].addr = client_addr / 2;
 	mm1[0] = index;
 	msgs[0].len = 1; msgs[1].len = buff_len;
 	msgs[0].buf = mm1; msgs[1].buf = buff;
diff --git a/drivers/video/fbdev/via/vt1636.c b/drivers/video/fbdev/via/vt1636.c
index 8d8cfdb05618..614e5c29a449 100644
--- a/drivers/video/fbdev/via/vt1636.c
+++ b/drivers/video/fbdev/via/vt1636.c
@@ -44,7 +44,7 @@ u8 viafb_gpio_i2c_read_lvds(struct lvds_setting_information
 	u8 data;
 
 	viafb_i2c_readbyte(plvds_chip_info->i2c_port,
-			   plvds_chip_info->lvds_chip_slave_addr, index, &data);
+			   plvds_chip_info->lvds_chip_client_addr, index, &data);
 	return data;
 }
 
@@ -60,7 +60,7 @@ void viafb_gpio_i2c_write_mask_lvds(struct lvds_setting_information
 	data = (data & (~io_data.Mask)) | io_data.Data;
 
 	viafb_i2c_writebyte(plvds_chip_info->i2c_port,
-			    plvds_chip_info->lvds_chip_slave_addr, index, data);
+			    plvds_chip_info->lvds_chip_client_addr, index, data);
 }
 
 void viafb_init_lvds_vt1636(struct lvds_setting_information
@@ -113,7 +113,7 @@ bool viafb_lvds_identify_vt1636(u8 i2c_adapter)
 	DEBUG_MSG(KERN_INFO "viafb_lvds_identify_vt1636.\n");
 
 	/* Sense VT1636 LVDS Transmiter */
-	viaparinfo->chip_info->lvds_chip_info.lvds_chip_slave_addr =
+	viaparinfo->chip_info->lvds_chip_info.lvds_chip_client_addr =
 		VT1636_LVDS_I2C_ADDR;
 
 	/* Check vendor ID first: */
-- 
2.34.1
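
As an aside for readers of the via_i2c.c hunks above: the driver stores
addresses such as VT1632_TMDS_I2C_ADDR (0x10) and the EDID address 0xA0 in
8-bit form, and viafb_i2c_readbyte()/viafb_i2c_writebyte() halve them before
filling in msgs[].addr. A minimal user-space sketch of that conversion
follows. The helper name wire_addr_to_7bit() is hypothetical and does not
exist in the driver; only the "/ 2" arithmetic is taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in viafb): convert an 8-bit "wire" address,
 * i.e. a 7-bit address shifted left by one with the R/W bit clear,
 * into the 7-bit form.
 */
static inline uint8_t wire_addr_to_7bit(uint8_t wire_addr)
{
	/* Same arithmetic as "client_addr / 2" in via_i2c.c. */
	return wire_addr / 2;
}
```

Under that assumption, wire_addr_to_7bit(0xA0) yields 0x50, which is the
7-bit address commonly used for EDID reads over DDC.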


^ permalink raw reply related	[relevance 47%]

* [PATCH v1 11/12] fbdev/smscufx: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (9 preceding siblings ...)
  2024-04-30 17:38 70% ` [PATCH v1 10/12] sfc: falcon: " Easwar Hariharan
@ 2024-04-30 17:38 70% ` Easwar Hariharan
  2024-04-30 17:38 47% ` [PATCH v1 12/12] fbdev/viafb: " Easwar Hariharan
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Steve Glendinning, Helge Deller,
	open list:SMSC UFX6000 and UFX7000 USB to VGA DRIVER,
	open list:FRAMEBUFFER LAYER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/video/fbdev/smscufx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/fbdev/smscufx.c b/drivers/video/fbdev/smscufx.c
index 35d682b110c4..1c80c1a3d516 100644
--- a/drivers/video/fbdev/smscufx.c
+++ b/drivers/video/fbdev/smscufx.c
@@ -1292,7 +1292,7 @@ static int ufx_realloc_framebuffer(struct ufx_data *dev, struct fb_info *info)
 	return 0;
 }
 
-/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, master,
+/* sets up I2C Controller for 100 Kbps, std. speed, 7-bit addr, host,
  * restart enabled, but no start byte, enable controller */
 static int ufx_i2c_init(struct ufx_data *dev)
 {
@@ -1321,7 +1321,7 @@ static int ufx_i2c_init(struct ufx_data *dev)
 	/* 7-bit (not 10-bit) addressing */
 	tmp &= ~(0x10);
 
-	/* enable restart conditions and master mode */
+	/* enable restart conditions and host mode */
 	tmp |= 0x21;
 
 	status = ufx_reg_write(dev, 0x1000, tmp);
-- 
2.34.1
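
For readers unfamiliar with the register update touched by the second hunk
above, the two visible statements in ufx_i2c_init() can be sketched in user
space as below. The function name ufx_i2c_ctrl_sketch() is hypothetical, and
the sketch covers only the two operations shown in the diff context; the
rest of the real function (speed setup, register I/O) is left out.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the two bit operations visible in
 * the ufx_i2c_init() hunk. The meaning of each bit is taken from the
 * driver's own comments, not from a datasheet.
 */
static inline uint32_t ufx_i2c_ctrl_sketch(uint32_t tmp)
{
	/* 7-bit (not 10-bit) addressing: clear bit 4 */
	tmp &= ~(uint32_t)0x10;

	/* enable restart conditions and host mode: set bits 5 and 0 */
	tmp |= 0x21;

	return tmp;
}
```

The result always has 0x10 cleared and 0x21 set, regardless of the value
read back from the controller beforehand.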


^ permalink raw reply related	[relevance 70%]

* [PATCH v1 10/12] sfc: falcon: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (8 preceding siblings ...)
  2024-04-30 17:38 65% ` [PATCH v1 09/12] media: cx23885: " Easwar Hariharan
@ 2024-04-30 17:38 70% ` Easwar Hariharan
    2024-04-30 17:38 70% ` [PATCH v1 11/12] fbdev/smscufx: " Easwar Hariharan
  2024-04-30 17:38 47% ` [PATCH v1 12/12] fbdev/viafb: " Easwar Hariharan
  11 siblings, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Edward Cree, Martin Habets, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Easwar Hariharan, Simon Horman,
	open list:SFC NETWORK DRIVER,
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/net/ethernet/sfc/falcon/falcon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/falcon/falcon.c b/drivers/net/ethernet/sfc/falcon/falcon.c
index 7a1c9337081b..36114ce88034 100644
--- a/drivers/net/ethernet/sfc/falcon/falcon.c
+++ b/drivers/net/ethernet/sfc/falcon/falcon.c
@@ -367,7 +367,7 @@ static const struct i2c_algo_bit_data falcon_i2c_bit_operations = {
 	.getsda		= falcon_getsda,
 	.getscl		= falcon_getscl,
 	.udelay		= 5,
-	/* Wait up to 50 ms for slave to let us pull SCL high */
+	/* Wait up to 50 ms for target to let us pull SCL high */
 	.timeout	= DIV_ROUND_UP(HZ, 20),
 };
 
-- 
2.34.1
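
The comment changed above says "up to 50 ms", while the value is expressed
in jiffies as DIV_ROUND_UP(HZ, 20). A small user-space sketch of that
arithmetic follows. DIV_ROUND_UP is redefined locally so the example builds
outside the kernel; timeout_jiffies() and sketch_jiffies_to_ms() are
hypothetical helper names, and the HZ values are passed in as parameters
rather than taken from a kernel config.

```c
/* Local copy of the kernel's round-up division, for user-space use. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Timeout in jiffies for a given tick rate, as in .timeout above. */
static inline unsigned int timeout_jiffies(unsigned int hz)
{
	return DIV_ROUND_UP(hz, 20);
}

/* Hypothetical helper: convert jiffies back to whole milliseconds. */
static inline unsigned int sketch_jiffies_to_ms(unsigned int j,
						unsigned int hz)
{
	return j * 1000 / hz;
}
```

With a tick rate of 100 or 1000 the result is exactly 50 ms (5 and 50
jiffies). With 250 the division rounds up to 13 jiffies, i.e. 52 ms, so the
timeout is never shorter than 50 ms.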


^ permalink raw reply related	[relevance 70%]

* [PATCH v1 09/12] media: cx23885: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (7 preceding siblings ...)
  2024-04-30 17:38 59% ` [PATCH v1 08/12] media: ivtv: " Easwar Hariharan
@ 2024-04-30 17:38 65% ` Easwar Hariharan
  2024-04-30 17:38 70% ` [PATCH v1 10/12] sfc: falcon: " Easwar Hariharan
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Easwar Hariharan,
	open list:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cx23885/cx23885-f300.c | 8 ++++----
 drivers/media/pci/cx23885/cx23885-i2c.c  | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/media/pci/cx23885/cx23885-f300.c b/drivers/media/pci/cx23885/cx23885-f300.c
index ac1c434e8e24..5f937c281793 100644
--- a/drivers/media/pci/cx23885/cx23885-f300.c
+++ b/drivers/media/pci/cx23885/cx23885-f300.c
@@ -92,7 +92,7 @@ static u8 f300_xfer(struct dvb_frontend *fe, u8 *buf)
 	f300_set_line(dev, F300_RESET, 0);/* begin to send data */
 	msleep(1);
 
-	f300_send_byte(dev, 0xe0);/* the slave address is 0xe0, write */
+	f300_send_byte(dev, 0xe0);/* the client address is 0xe0, write */
 	msleep(1);
 
 	temp = buf[0];
@@ -112,10 +112,10 @@ static u8 f300_xfer(struct dvb_frontend *fe, u8 *buf)
 	}
 
 	if (i > 7) {
-		pr_err("%s: timeout, the slave no response\n",
+		pr_err("%s: timeout, the client no response\n",
 								__func__);
-		ret = 1; /* timeout, the slave no response */
-	} else { /* the slave not busy, prepare for getting data */
+		ret = 1; /* timeout, the client no response */
+	} else { /* the client not busy, prepare for getting data */
 		f300_set_line(dev, F300_RESET, 0);/*ready...*/
 		msleep(1);
 		f300_send_byte(dev, 0xe1);/* 0xe1 is Read */
diff --git a/drivers/media/pci/cx23885/cx23885-i2c.c b/drivers/media/pci/cx23885/cx23885-i2c.c
index f51fad33dc04..385af2a893b4 100644
--- a/drivers/media/pci/cx23885/cx23885-i2c.c
+++ b/drivers/media/pci/cx23885/cx23885-i2c.c
@@ -34,7 +34,7 @@ MODULE_PARM_DESC(i2c_scan, "scan i2c bus at insmod time");
 #define I2C_EXTEND  (1 << 3)
 #define I2C_NOSTOP  (1 << 4)
 
-static inline int i2c_slave_did_ack(struct i2c_adapter *i2c_adap)
+static inline int i2c_client_did_ack(struct i2c_adapter *i2c_adap)
 {
 	struct cx23885_i2c *bus = i2c_adap->algo_data;
 	struct cx23885_dev *dev = bus->dev;
@@ -84,7 +84,7 @@ static int i2c_sendbytes(struct i2c_adapter *i2c_adap,
 		cx_write(bus->reg_ctrl, bus->i2c_period | (1 << 2));
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_client_did_ack(i2c_adap))
 			return -ENXIO;
 
 		dprintk(1, "%s() returns 0\n", __func__);
@@ -163,7 +163,7 @@ static int i2c_readbytes(struct i2c_adapter *i2c_adap,
 		cx_write(bus->reg_ctrl, bus->i2c_period | (1 << 2) | 1);
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_client_did_ack(i2c_adap))
 			return -ENXIO;
 
 
-- 
2.34.1


^ permalink raw reply related	[relevance 65%]

* [PATCH v1 08/12] media: ivtv: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (6 preceding siblings ...)
  2024-04-30 17:38 70% ` [PATCH v1 07/12] media: cx25821: " Easwar Hariharan
@ 2024-04-30 17:38 59% ` Easwar Hariharan
  2024-04-30 17:38 65% ` [PATCH v1 09/12] media: cx23885: " Easwar Hariharan
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Andy Walls, Mauro Carvalho Chehab,
	open list:IVTV VIDEO4LINUX DRIVER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/ivtv/ivtv-i2c.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/media/pci/ivtv/ivtv-i2c.c b/drivers/media/pci/ivtv/ivtv-i2c.c
index c052c57c6dce..967e6a025020 100644
--- a/drivers/media/pci/ivtv/ivtv-i2c.c
+++ b/drivers/media/pci/ivtv/ivtv-i2c.c
@@ -33,14 +33,14 @@
     Some more general comments about what we are doing:
 
     The i2c bus is a 2 wire serial bus, with clock (SCL) and data (SDA)
-    lines.  To communicate on the bus (as a master, we don't act as a slave),
+    lines.  To communicate on the bus (as a host, we don't act as a client),
     we first initiate a start condition (ivtv_start).  We then write the
     address of the device that we want to communicate with, along with a flag
-    that indicates whether this is a read or a write.  The slave then issues
+    that indicates whether this is a read or a write.  The client then issues
     an ACK signal (ivtv_ack), which tells us that it is ready for reading /
     writing.  We then proceed with reading or writing (ivtv_read/ivtv_write),
     and finally issue a stop condition (ivtv_stop) to make the bus available
-    to other masters.
+    to other hosts.
 
     There is an additional form of transaction where a write may be
     immediately followed by a read.  In this case, there is no intervening
@@ -379,7 +379,7 @@ static int ivtv_waitsda(struct ivtv *itv, int val)
 	return 0;
 }
 
-/* Wait for the slave to issue an ACK */
+/* Wait for the client to issue an ACK */
 static int ivtv_ack(struct ivtv *itv)
 {
 	int ret = 0;
@@ -407,7 +407,7 @@ static int ivtv_ack(struct ivtv *itv)
 	return ret;
 }
 
-/* Write a single byte to the i2c bus and wait for the slave to ACK */
+/* Write a single byte to the i2c bus and wait for the client to ACK */
 static int ivtv_sendbyte(struct ivtv *itv, unsigned char byte)
 {
 	int i, bit;
@@ -471,7 +471,7 @@ static int ivtv_readbyte(struct ivtv *itv, unsigned char *byte, int nack)
 	return 0;
 }
 
-/* Issue a start condition on the i2c bus to alert slaves to prepare for
+/* Issue a start condition on the i2c bus to alert clients to prepare for
    an address write */
 static int ivtv_start(struct ivtv *itv)
 {
@@ -534,7 +534,7 @@ static int ivtv_stop(struct ivtv *itv)
 	return 0;
 }
 
-/* Write a message to the given i2c slave.  do_stop may be 0 to prevent
+/* Write a message to the given i2c client.  do_stop may be 0 to prevent
    issuing the i2c stop condition (when following with a read) */
 static int ivtv_write(struct ivtv *itv, unsigned char addr, unsigned char *data, u32 len, int do_stop)
 {
@@ -558,7 +558,7 @@ static int ivtv_write(struct ivtv *itv, unsigned char addr, unsigned char *data,
 	return ret;
 }
 
-/* Read data from the given i2c slave.  A stop condition is always issued. */
+/* Read data from the given i2c client.  A stop condition is always issued. */
 static int ivtv_read(struct ivtv *itv, unsigned char addr, unsigned char *data, u32 len)
 {
 	int retry, ret = -EREMOTEIO;
-- 
2.34.1
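
The block comment edited in the first hunk describes writing "the address of
the device that we want to communicate with, along with a flag that
indicates whether this is a read or a write". As a rough illustration of
that first byte on the wire, assuming standard 7-bit I2C addressing, the
sketch below builds it in user space. The function name i2c_address_byte()
is hypothetical and is not taken from ivtv-i2c.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the first byte sent after a start condition.
 * The 7-bit address occupies bits 7..1; bit 0 is the R/W flag
 * (0 = write, 1 = read).
 */
static inline uint8_t i2c_address_byte(uint8_t addr7, bool is_read)
{
	return (uint8_t)((addr7 << 1) | (is_read ? 1 : 0));
}
```

For a device at 7-bit address 0x70, this gives 0xE0 for a write and 0xE1
for a read.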


^ permalink raw reply related	[relevance 59%]

* [PATCH v1 07/12] media: cx25821: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (5 preceding siblings ...)
  2024-04-30 17:38 56% ` [PATCH v1 06/12] media: cx18: " Easwar Hariharan
@ 2024-04-30 17:38 70% ` Easwar Hariharan
  2024-04-30 17:38 59% ` [PATCH v1 08/12] media: ivtv: " Easwar Hariharan
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Easwar Hariharan,
	open list:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cx25821/cx25821-i2c.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/pci/cx25821/cx25821-i2c.c b/drivers/media/pci/cx25821/cx25821-i2c.c
index 0ef4cd6528a0..bad8fb9f5319 100644
--- a/drivers/media/pci/cx25821/cx25821-i2c.c
+++ b/drivers/media/pci/cx25821/cx25821-i2c.c
@@ -33,7 +33,7 @@ do {									\
 #define I2C_EXTEND  (1 << 3)
 #define I2C_NOSTOP  (1 << 4)
 
-static inline int i2c_slave_did_ack(struct i2c_adapter *i2c_adap)
+static inline int i2c_client_did_ack(struct i2c_adapter *i2c_adap)
 {
 	struct cx25821_i2c *bus = i2c_adap->algo_data;
 	struct cx25821_dev *dev = bus->dev;
@@ -85,7 +85,7 @@ static int i2c_sendbytes(struct i2c_adapter *i2c_adap,
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
 
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_client_did_ack(i2c_adap))
 			return -EIO;
 
 		dprintk(1, "%s(): returns 0\n", __func__);
@@ -174,7 +174,7 @@ static int i2c_readbytes(struct i2c_adapter *i2c_adap,
 		cx_write(bus->reg_ctrl, bus->i2c_period | (1 << 2) | 1);
 		if (!i2c_wait_done(i2c_adap))
 			return -EIO;
-		if (!i2c_slave_did_ack(i2c_adap))
+		if (!i2c_client_did_ack(i2c_adap))
 			return -EIO;
 
 		dprintk(1, "%s(): returns 0\n", __func__);
-- 
2.34.1


^ permalink raw reply related	[relevance 70%]

* [PATCH v1 06/12] media: cx18: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (4 preceding siblings ...)
  2024-04-30 17:38 71% ` [PATCH v1 05/12] media: cobalt: " Easwar Hariharan
@ 2024-04-30 17:38 56% ` Easwar Hariharan
  2024-04-30 17:38 70% ` [PATCH v1 07/12] media: cx25821: " Easwar Hariharan
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Andy Walls, Mauro Carvalho Chehab,
	open list:CX18 VIDEO4LINUX DRIVER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

The I2S specification has also updated its terms in v.3 to use "controller"
and "target", respectively. Make those changes in the relevant places as
well.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cx18/cx18-av-firmware.c | 8 ++++----
 drivers/media/pci/cx18/cx18-cards.c       | 6 +++---
 drivers/media/pci/cx18/cx18-cards.h       | 4 ++--
 drivers/media/pci/cx18/cx18-gpio.c        | 6 +++---
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/media/pci/cx18/cx18-av-firmware.c b/drivers/media/pci/cx18/cx18-av-firmware.c
index 61aeb8c9af7f..906e0b33cffc 100644
--- a/drivers/media/pci/cx18/cx18-av-firmware.c
+++ b/drivers/media/pci/cx18/cx18-av-firmware.c
@@ -140,22 +140,22 @@ int cx18_av_loadfw(struct cx18 *cx)
 	cx18_av_and_or4(cx, CXADEC_PIN_CTRL1, ~0, 0x78000);
 
 	/* Audio input control 1 set to Sony mode */
-	/* Audio output input 2 is 0 for slave operation input */
+	/* Audio output input 2 is 0 for target operation input */
 	/* 0xC4000914[5]: 0 = left sample on WS=0, 1 = left sample on WS=1 */
 	/* 0xC4000914[7]: 0 = Philips mode, 1 = Sony mode (1st SCK rising edge
 	   after WS transition for first bit of audio word. */
 	cx18_av_write4(cx, CXADEC_I2S_IN_CTL, 0x000000A0);
 
 	/* Audio output control 1 is set to Sony mode */
-	/* Audio output control 2 is set to 1 for master mode */
+	/* Audio output control 2 is set to 1 for controller mode */
 	/* 0xC4000918[5]: 0 = left sample on WS=0, 1 = left sample on WS=1 */
 	/* 0xC4000918[7]: 0 = Philips mode, 1 = Sony mode (1st SCK rising edge
 	   after WS transition for first bit of audio word. */
-	/* 0xC4000918[8]: 0 = slave operation, 1 = master (SCK_OUT and WS_OUT
+	/* 0xC4000918[8]: 0 = target operation, 1 = controller (SCK_OUT and WS_OUT
 	   are generated) */
 	cx18_av_write4(cx, CXADEC_I2S_OUT_CTL, 0x000001A0);
 
-	/* set alt I2s master clock to /0x16 and enable alt divider i2s
+	/* set alt I2s controller clock to /0x16 and enable alt divider i2s
 	   passthrough */
 	cx18_av_write4(cx, CXADEC_PIN_CFG3, 0x5600B687);
 
diff --git a/drivers/media/pci/cx18/cx18-cards.c b/drivers/media/pci/cx18/cx18-cards.c
index f5a30959a367..d9b859ee4b1b 100644
--- a/drivers/media/pci/cx18/cx18-cards.c
+++ b/drivers/media/pci/cx18/cx18-cards.c
@@ -82,7 +82,7 @@ static const struct cx18_card cx18_card_hvr1600_esmt = {
 	},
 	.gpio_init.initial_value = 0x3001,
 	.gpio_init.direction = 0x3001,
-	.gpio_i2c_slave_reset = {
+	.gpio_i2c_client_reset = {
 		.active_lo_mask = 0x3001,
 		.msecs_asserted = 10,
 		.msecs_recovery = 40,
@@ -129,7 +129,7 @@ static const struct cx18_card cx18_card_hvr1600_s5h1411 = {
 	},
 	.gpio_init.initial_value = 0x3801,
 	.gpio_init.direction = 0x3801,
-	.gpio_i2c_slave_reset = {
+	.gpio_i2c_client_reset = {
 		.active_lo_mask = 0x3801,
 		.msecs_asserted = 10,
 		.msecs_recovery = 40,
@@ -176,7 +176,7 @@ static const struct cx18_card cx18_card_hvr1600_samsung = {
 	},
 	.gpio_init.initial_value = 0x3001,
 	.gpio_init.direction = 0x3001,
-	.gpio_i2c_slave_reset = {
+	.gpio_i2c_client_reset = {
 		.active_lo_mask = 0x3001,
 		.msecs_asserted = 10,
 		.msecs_recovery = 40,
diff --git a/drivers/media/pci/cx18/cx18-cards.h b/drivers/media/pci/cx18/cx18-cards.h
index ae9cf5bfdd59..86f41ec6ca2f 100644
--- a/drivers/media/pci/cx18/cx18-cards.h
+++ b/drivers/media/pci/cx18/cx18-cards.h
@@ -69,7 +69,7 @@ struct cx18_gpio_init { /* set initial GPIO DIR and OUT values */
 	u32 initial_value;
 };
 
-struct cx18_gpio_i2c_slave_reset {
+struct cx18_gpio_i2c_client_reset {
 	u32 active_lo_mask; /* GPIO outputs that reset i2c chips when low */
 	u32 active_hi_mask; /* GPIO outputs that reset i2c chips when high */
 	int msecs_asserted; /* time period reset must remain asserted */
@@ -121,7 +121,7 @@ struct cx18_card {
 	/* GPIO card-specific settings */
 	u8 xceive_pin;		/* XCeive tuner GPIO reset pin */
 	struct cx18_gpio_init		 gpio_init;
-	struct cx18_gpio_i2c_slave_reset gpio_i2c_slave_reset;
+	struct cx18_gpio_i2c_client_reset gpio_i2c_client_reset;
 	struct cx18_gpio_audio_input    gpio_audio_input;
 
 	struct cx18_card_tuner tuners[CX18_CARD_MAX_TUNERS];
diff --git a/drivers/media/pci/cx18/cx18-gpio.c b/drivers/media/pci/cx18/cx18-gpio.c
index c85eb8d25837..82c9104b9e85 100644
--- a/drivers/media/pci/cx18/cx18-gpio.c
+++ b/drivers/media/pci/cx18/cx18-gpio.c
@@ -204,9 +204,9 @@ static int resetctrl_log_status(struct v4l2_subdev *sd)
 static int resetctrl_reset(struct v4l2_subdev *sd, u32 val)
 {
 	struct cx18 *cx = v4l2_get_subdevdata(sd);
-	const struct cx18_gpio_i2c_slave_reset *p;
+	const struct cx18_gpio_i2c_client_reset *p;
 
-	p = &cx->card->gpio_i2c_slave_reset;
+	p = &cx->card->gpio_i2c_client_reset;
 	switch (val) {
 	case CX18_GPIO_RESET_I2C:
 		gpio_reset_seq(cx, p->active_lo_mask, p->active_hi_mask,
@@ -309,7 +309,7 @@ void cx18_reset_ir_gpio(void *data)
 {
 	struct cx18 *cx = to_cx18(data);
 
-	if (cx->card->gpio_i2c_slave_reset.ir_reset_mask == 0)
+	if (cx->card->gpio_i2c_client_reset.ir_reset_mask == 0)
 		return;
 
 	CX18_DEBUG_INFO("Resetting IR microcontroller\n");
-- 
2.34.1


^ permalink raw reply related	[relevance 56%]
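
As a hedged illustration of the register comments in the cx18-av-firmware.c
hunk above: the comments document bit 5 (left sample on WS=1), bit 7 (Sony
mode) and, for the output control register, bit 8 (controller operation).
The sketch below decodes the two values the driver writes, 0x000000A0 and
0x000001A0, against those bit positions. The macro and function names are
hypothetical and are not part of the cx18 driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions are taken from the comments in cx18_av_loadfw().
 * The macro and function names below are hypothetical; they do not
 * exist in the cx18 driver.
 */
#define I2S_CTL_LEFT_ON_WS1	(1u << 5)	/* [5]: 1 = left sample on WS=1 */
#define I2S_CTL_SONY_MODE	(1u << 7)	/* [7]: 1 = Sony mode, 0 = Philips mode */
#define I2S_OUT_CTL_CONTROLLER	(1u << 8)	/* [8]: 1 = controller, 0 = target */

/* True if the left sample is taken on WS=1. */
static inline bool i2s_left_on_ws1(uint32_t ctl)
{
	return (ctl & I2S_CTL_LEFT_ON_WS1) != 0;
}

/* True if the register selects Sony mode rather than Philips mode. */
static inline bool i2s_sony_mode(uint32_t ctl)
{
	return (ctl & I2S_CTL_SONY_MODE) != 0;
}

/* True if the output side generates SCK_OUT and WS_OUT (controller). */
static inline bool i2s_out_is_controller(uint32_t ctl)
{
	return (ctl & I2S_OUT_CTL_CONTROLLER) != 0;
}
```

With these helpers, 0x000001A0 (the CXADEC_I2S_OUT_CTL value) has all three
bits set, while 0x000000A0 (the CXADEC_I2S_IN_CTL value) has bits 5 and 7
set and bit 8 clear, which matches the "Sony mode" and "controller mode"
wording in the renamed comments.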

* [PATCH v1 05/12] media: cobalt: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (3 preceding siblings ...)
  2024-04-30 17:38 69% ` [PATCH v1 04/12] media: au0828: " Easwar Hariharan
@ 2024-04-30 17:38 71% ` Easwar Hariharan
  2024-04-30 17:38 56% ` [PATCH v1 06/12] media: cx18: " Easwar Hariharan
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Hans Verkuil, Mauro Carvalho Chehab,
	open list:COBALT MEDIA DRIVER, open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/ [1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/pci/cobalt/cobalt-i2c.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-i2c.c b/drivers/media/pci/cobalt/cobalt-i2c.c
index 10c9ee33f73e..d2963370f949 100644
--- a/drivers/media/pci/cobalt/cobalt-i2c.c
+++ b/drivers/media/pci/cobalt/cobalt-i2c.c
@@ -45,10 +45,10 @@ struct cobalt_i2c_regs {
 /* I2C stop condition */
 #define M00018_CR_BITMAP_STO_MSK	(1 << 6)
 
-/* I2C read from slave */
+/* I2C read from client */
 #define M00018_CR_BITMAP_RD_MSK		(1 << 5)
 
-/* I2C write to slave */
+/* I2C write to client */
 #define M00018_CR_BITMAP_WR_MSK		(1 << 4)
 
 /* I2C ack */
@@ -59,7 +59,7 @@ struct cobalt_i2c_regs {
 
 /* SR[7:0] - Status register */
 
-/* Receive acknowledge from slave */
+/* Receive acknowledge from client */
 #define M00018_SR_BITMAP_RXACK_MSK	(1 << 7)
 
 /* Busy, I2C bus busy (as defined by start / stop bits) */
-- 
2.34.1


^ permalink raw reply related	[relevance 71%]

* [PATCH v1 04/12] media: au0828: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
                   ` (2 preceding siblings ...)
  2024-04-30 17:38 23% ` [PATCH v1 03/12] drm/i915: " Easwar Hariharan
@ 2024-04-30 17:38 69% ` Easwar Hariharan
  2024-04-30 17:38 71% ` [PATCH v1 05/12] media: cobalt: " Easwar Hariharan
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Easwar Hariharan,
	open list:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/ [1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/media/usb/au0828/au0828-i2c.c   | 4 ++--
 drivers/media/usb/au0828/au0828-input.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/usb/au0828/au0828-i2c.c b/drivers/media/usb/au0828/au0828-i2c.c
index 749f90d73b5b..3e66d42bf134 100644
--- a/drivers/media/usb/au0828/au0828-i2c.c
+++ b/drivers/media/usb/au0828/au0828-i2c.c
@@ -23,7 +23,7 @@ MODULE_PARM_DESC(i2c_scan, "scan i2c bus at insmod time");
 #define I2C_WAIT_DELAY 25
 #define I2C_WAIT_RETRY 1000
 
-static inline int i2c_slave_did_read_ack(struct i2c_adapter *i2c_adap)
+static inline int i2c_client_did_read_ack(struct i2c_adapter *i2c_adap)
 {
 	struct au0828_dev *dev = i2c_adap->algo_data;
 	return au0828_read(dev, AU0828_I2C_STATUS_201) &
@@ -35,7 +35,7 @@ static int i2c_wait_read_ack(struct i2c_adapter *i2c_adap)
 	int count;
 
 	for (count = 0; count < I2C_WAIT_RETRY; count++) {
-		if (!i2c_slave_did_read_ack(i2c_adap))
+		if (!i2c_client_did_read_ack(i2c_adap))
 			break;
 		udelay(I2C_WAIT_DELAY);
 	}
diff --git a/drivers/media/usb/au0828/au0828-input.c b/drivers/media/usb/au0828/au0828-input.c
index 3d3368202cd0..98a57b6e02e2 100644
--- a/drivers/media/usb/au0828/au0828-input.c
+++ b/drivers/media/usb/au0828/au0828-input.c
@@ -30,7 +30,7 @@ struct au0828_rc {
 	int polling;
 	struct delayed_work work;
 
-	/* i2c slave address of external device (if used) */
+	/* i2c client address of external device (if used) */
 	u16 i2c_dev_addr;
 
 	int  (*get_key_i2c)(struct au0828_rc *ir);
-- 
2.34.1


^ permalink raw reply related	[relevance 69%]
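
The renamed helper in the au0828-i2c.c hunk above is used inside a bounded
polling loop: check a status predicate up to I2C_WAIT_RETRY times and stop
as soon as it clears. A minimal userspace sketch of that pattern follows.
The names wait_until_clear() and pending_for_n_polls() are hypothetical,
the delay is reduced to a comment, and the return convention is an
assumption, since the hunk does not show what the driver returns on timeout.

```c
#include <stdbool.h>

#define WAIT_RETRY 1000	/* mirrors I2C_WAIT_RETRY from the hunk */

/* Hypothetical status callback: returns true while the condition is pending. */
typedef bool (*pending_fn)(void *ctx);

/*
 * Poll the callback at most WAIT_RETRY times.
 * Returns true if the condition cleared in time, false on timeout
 * (assumed convention, not taken from the driver).
 */
static bool wait_until_clear(pending_fn pending, void *ctx)
{
	int count;

	for (count = 0; count < WAIT_RETRY; count++) {
		if (!pending(ctx))
			return true;
		/* the driver calls udelay(I2C_WAIT_DELAY) at this point */
	}
	return false;
}

/*
 * Demo stand-in for a status register read: reports "pending" for the
 * first *ctx polls, decrementing the counter each time.
 */
static bool pending_for_n_polls(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return true;
	}
	return false;
}
```

Calling wait_until_clear(pending_for_n_polls, &n) with n = 3 succeeds on the
fourth poll; with n larger than WAIT_RETRY it gives up after 1000 polls.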

* [PATCH v1 03/12] drm/i915: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
  2024-04-30 17:38 24% ` [PATCH v1 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
  2024-04-30 17:38 46% ` [PATCH v1 02/12] drm/gma500: " Easwar Hariharan
@ 2024-04-30 17:38 23% ` Easwar Hariharan
    2024-04-30 17:38 69% ` [PATCH v1 04/12] media: au0828: " Easwar Hariharan
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Jani Nikula, Rodrigo Vivi, Joonas Lahtinen, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, Zhenyu Wang, Zhi Wang,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL GVT-g DRIVERS (Intel GPU Virtualization)
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by, and following on from, Wolfram's
series to fix drivers/i2c/ [1], fix the terminology for users of the
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested; no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/gpu/drm/i915/display/dvo_ch7017.c     | 14 ++++-----
 drivers/gpu/drm/i915/display/dvo_ch7xxx.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_ivch.c       | 16 +++++-----
 drivers/gpu/drm/i915/display/dvo_ns2501.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_sil164.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_tfp410.c     | 18 +++++------
 drivers/gpu/drm/i915/display/intel_bios.c     | 22 +++++++-------
 drivers/gpu/drm/i915/display/intel_ddi.c      |  2 +-
 .../gpu/drm/i915/display/intel_display_core.h |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi.h      |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c  | 20 ++++++-------
 drivers/gpu/drm/i915/display/intel_dvo.c      | 14 ++++-----
 drivers/gpu/drm/i915/display/intel_dvo_dev.h  |  2 +-
 drivers/gpu/drm/i915/display/intel_gmbus.c    |  4 +--
 drivers/gpu/drm/i915/display/intel_sdvo.c     | 30 +++++++++----------
 drivers/gpu/drm/i915/display/intel_vbt_defs.h |  4 +--
 drivers/gpu/drm/i915/gvt/edid.c               | 28 ++++++++---------
 drivers/gpu/drm/i915/gvt/edid.h               |  4 +--
 drivers/gpu/drm/i915/gvt/opregion.c           |  2 +-
 19 files changed, 119 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/dvo_ch7017.c b/drivers/gpu/drm/i915/display/dvo_ch7017.c
index d0c3880d7f80..493e730c685b 100644
--- a/drivers/gpu/drm/i915/display/dvo_ch7017.c
+++ b/drivers/gpu/drm/i915/display/dvo_ch7017.c
@@ -170,13 +170,13 @@ static bool ch7017_read(struct intel_dvo_device *dvo, u8 addr, u8 *val)
 {
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = &addr,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = val,
@@ -189,7 +189,7 @@ static bool ch7017_write(struct intel_dvo_device *dvo, u8 addr, u8 val)
 {
 	u8 buf[2] = { addr, val };
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = buf,
@@ -197,7 +197,7 @@ static bool ch7017_write(struct intel_dvo_device *dvo, u8 addr, u8 val)
 	return i2c_transfer(dvo->i2c_bus, &msg, 1) == 1;
 }
 
-/** Probes for a CH7017 on the given bus and slave address. */
+/** Probes for a CH7017 on the given bus and target address. */
 static bool ch7017_init(struct intel_dvo_device *dvo,
 			struct i2c_adapter *adapter)
 {
@@ -227,13 +227,13 @@ static bool ch7017_init(struct intel_dvo_device *dvo,
 		break;
 	default:
 		DRM_DEBUG_KMS("ch701x not detected, got %d: from %s "
-			      "slave %d.\n",
-			      val, adapter->name, dvo->slave_addr);
+			      "target %d.\n",
+			      val, adapter->name, dvo->target_addr);
 		goto fail;
 	}
 
 	DRM_DEBUG_KMS("%s detected on %s, addr %d\n",
-		      str, adapter->name, dvo->slave_addr);
+		      str, adapter->name, dvo->target_addr);
 	return true;
 
 fail:
diff --git a/drivers/gpu/drm/i915/display/dvo_ch7xxx.c b/drivers/gpu/drm/i915/display/dvo_ch7xxx.c
index 2e8e85da5a40..534b8544e0a4 100644
--- a/drivers/gpu/drm/i915/display/dvo_ch7xxx.c
+++ b/drivers/gpu/drm/i915/display/dvo_ch7xxx.c
@@ -153,13 +153,13 @@ static bool ch7xxx_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -176,7 +176,7 @@ static bool ch7xxx_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	if (!ch7xxx->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from %s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -188,7 +188,7 @@ static bool ch7xxx_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[2];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -202,7 +202,7 @@ static bool ch7xxx_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!ch7xxx->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -229,8 +229,8 @@ static bool ch7xxx_init(struct intel_dvo_device *dvo,
 
 	name = ch7xxx_get_id(vendor);
 	if (!name) {
-		DRM_DEBUG_KMS("ch7xxx not detected; got VID 0x%02x from %s slave %d.\n",
-			      vendor, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ch7xxx not detected; got VID 0x%02x from %s target %d.\n",
+			      vendor, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
@@ -240,8 +240,8 @@ static bool ch7xxx_init(struct intel_dvo_device *dvo,
 
 	devid = ch7xxx_get_did(device);
 	if (!devid) {
-		DRM_DEBUG_KMS("ch7xxx not detected; got DID 0x%02x from %s slave %d.\n",
-			      device, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ch7xxx not detected; got DID 0x%02x from %s target %d.\n",
+			      device, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
diff --git a/drivers/gpu/drm/i915/display/dvo_ivch.c b/drivers/gpu/drm/i915/display/dvo_ivch.c
index eef72bb3b767..0d5cce6051b1 100644
--- a/drivers/gpu/drm/i915/display/dvo_ivch.c
+++ b/drivers/gpu/drm/i915/display/dvo_ivch.c
@@ -198,7 +198,7 @@ static bool ivch_read(struct intel_dvo_device *dvo, int addr, u16 *data)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 0,
 		},
@@ -209,7 +209,7 @@ static bool ivch_read(struct intel_dvo_device *dvo, int addr, u16 *data)
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD | I2C_M_NOSTART,
 			.len = 2,
 			.buf = in_buf,
@@ -226,7 +226,7 @@ static bool ivch_read(struct intel_dvo_device *dvo, int addr, u16 *data)
 	if (!priv->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from "
 				"%s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -238,7 +238,7 @@ static bool ivch_write(struct intel_dvo_device *dvo, int addr, u16 data)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[3];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 3,
 		.buf = out_buf,
@@ -253,13 +253,13 @@ static bool ivch_write(struct intel_dvo_device *dvo, int addr, u16 data)
 
 	if (!priv->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
 }
 
-/* Probes the given bus and slave address for an ivch */
+/* Probes the given bus and target address for an ivch */
 static bool ivch_init(struct intel_dvo_device *dvo,
 		      struct i2c_adapter *adapter)
 {
@@ -283,10 +283,10 @@ static bool ivch_init(struct intel_dvo_device *dvo,
 	 * very unique, check that the value in the base address field matches
 	 * the address it's responding on.
 	 */
-	if ((temp & VR00_BASE_ADDRESS_MASK) != dvo->slave_addr) {
+	if ((temp & VR00_BASE_ADDRESS_MASK) != dvo->target_addr) {
 		DRM_DEBUG_KMS("ivch detect failed due to address mismatch "
 			  "(%d vs %d)\n",
-			  (temp & VR00_BASE_ADDRESS_MASK), dvo->slave_addr);
+			  (temp & VR00_BASE_ADDRESS_MASK), dvo->target_addr);
 		goto out;
 	}
 
diff --git a/drivers/gpu/drm/i915/display/dvo_ns2501.c b/drivers/gpu/drm/i915/display/dvo_ns2501.c
index 1df212fb000e..43fc0374fc7f 100644
--- a/drivers/gpu/drm/i915/display/dvo_ns2501.c
+++ b/drivers/gpu/drm/i915/display/dvo_ns2501.c
@@ -399,13 +399,13 @@ static bool ns2501_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-		 .addr = dvo->slave_addr,
+		 .addr = dvo->target_addr,
 		 .flags = 0,
 		 .len = 1,
 		 .buf = out_buf,
 		 },
 		{
-		 .addr = dvo->slave_addr,
+		 .addr = dvo->target_addr,
 		 .flags = I2C_M_RD,
 		 .len = 1,
 		 .buf = in_buf,
@@ -423,7 +423,7 @@ static bool ns2501_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 	if (!ns->quiet) {
 		DRM_DEBUG_KMS
 		    ("Unable to read register 0x%02x from %s:0x%02x.\n", addr,
-		     adapter->name, dvo->slave_addr);
+		     adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -442,7 +442,7 @@ static bool ns2501_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	u8 out_buf[2];
 
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -457,7 +457,7 @@ static bool ns2501_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!ns->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d\n",
-			      addr, adapter->name, dvo->slave_addr);
+			      addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -488,8 +488,8 @@ static bool ns2501_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (NS2501_VID & 0xff)) {
-		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Slave %d.\n",
-			      ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Target %d.\n",
+			      ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
@@ -497,8 +497,8 @@ static bool ns2501_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (NS2501_DID & 0xff)) {
-		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Slave %d.\n",
-			      ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("ns2501 not detected got %d: from %s Target %d.\n",
+			      ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 	ns->quiet = false;
diff --git a/drivers/gpu/drm/i915/display/dvo_sil164.c b/drivers/gpu/drm/i915/display/dvo_sil164.c
index 6c461024c8e3..a8dd40c00997 100644
--- a/drivers/gpu/drm/i915/display/dvo_sil164.c
+++ b/drivers/gpu/drm/i915/display/dvo_sil164.c
@@ -79,13 +79,13 @@ static bool sil164_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -102,7 +102,7 @@ static bool sil164_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	if (!sil->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from %s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -113,7 +113,7 @@ static bool sil164_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[2];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -127,7 +127,7 @@ static bool sil164_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!sil->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -153,8 +153,8 @@ static bool sil164_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (SIL164_VID & 0xff)) {
-		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Slave %d.\n",
-			  ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Target %d.\n",
+			  ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
@@ -162,8 +162,8 @@ static bool sil164_init(struct intel_dvo_device *dvo,
 		goto out;
 
 	if (ch != (SIL164_DID & 0xff)) {
-		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Slave %d.\n",
-			  ch, adapter->name, dvo->slave_addr);
+		DRM_DEBUG_KMS("sil164 not detected got %d: from %s Target %d.\n",
+			  ch, adapter->name, dvo->target_addr);
 		goto out;
 	}
 	sil->quiet = false;
diff --git a/drivers/gpu/drm/i915/display/dvo_tfp410.c b/drivers/gpu/drm/i915/display/dvo_tfp410.c
index 0939e097f4f9..d9a0cd753a87 100644
--- a/drivers/gpu/drm/i915/display/dvo_tfp410.c
+++ b/drivers/gpu/drm/i915/display/dvo_tfp410.c
@@ -100,13 +100,13 @@ static bool tfp410_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = dvo->slave_addr,
+			.addr = dvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -123,7 +123,7 @@ static bool tfp410_readb(struct intel_dvo_device *dvo, int addr, u8 *ch)
 
 	if (!tfp->quiet) {
 		DRM_DEBUG_KMS("Unable to read register 0x%02x from %s:%02x.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 	return false;
 }
@@ -134,7 +134,7 @@ static bool tfp410_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 	struct i2c_adapter *adapter = dvo->i2c_bus;
 	u8 out_buf[2];
 	struct i2c_msg msg = {
-		.addr = dvo->slave_addr,
+		.addr = dvo->target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
@@ -148,7 +148,7 @@ static bool tfp410_writeb(struct intel_dvo_device *dvo, int addr, u8 ch)
 
 	if (!tfp->quiet) {
 		DRM_DEBUG_KMS("Unable to write register 0x%02x to %s:%d.\n",
-			  addr, adapter->name, dvo->slave_addr);
+			  addr, adapter->name, dvo->target_addr);
 	}
 
 	return false;
@@ -183,15 +183,15 @@ static bool tfp410_init(struct intel_dvo_device *dvo,
 
 	if ((id = tfp410_getid(dvo, TFP410_VID_LO)) != TFP410_VID) {
 		DRM_DEBUG_KMS("tfp410 not detected got VID %X: from %s "
-				"Slave %d.\n",
-			  id, adapter->name, dvo->slave_addr);
+				"Target %d.\n",
+			  id, adapter->name, dvo->target_addr);
 		goto out;
 	}
 
 	if ((id = tfp410_getid(dvo, TFP410_DID_LO)) != TFP410_DID) {
 		DRM_DEBUG_KMS("tfp410 not detected got DID %X: from %s "
-				"Slave %d.\n",
-			  id, adapter->name, dvo->slave_addr);
+				"Target %d.\n",
+			  id, adapter->name, dvo->target_addr);
 		goto out;
 	}
 	tfp->quiet = false;
diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index fe52c06271ef..35f48fbd9e3e 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -69,8 +69,8 @@ struct intel_bios_encoder_data {
 	struct list_head node;
 };
 
-#define	SLAVE_ADDR1	0x70
-#define	SLAVE_ADDR2	0x72
+#define	TARGET_ADDR1	0x70
+#define	TARGET_ADDR2	0x72
 
 /* Get BDB block size given a pointer to Block ID. */
 static u32 _get_blocksize(const u8 *block_base)
@@ -1231,10 +1231,10 @@ parse_sdvo_device_mapping(struct drm_i915_private *i915)
 		const struct child_device_config *child = &devdata->child;
 		struct sdvo_device_mapping *mapping;
 
-		if (child->slave_addr != SLAVE_ADDR1 &&
-		    child->slave_addr != SLAVE_ADDR2) {
+		if (child->target_addr != TARGET_ADDR1 &&
+		    child->target_addr != TARGET_ADDR2) {
 			/*
-			 * If the slave address is neither 0x70 nor 0x72,
+			 * If the target address is neither 0x70 nor 0x72,
 			 * it is not a SDVO device. Skip it.
 			 */
 			continue;
@@ -1247,22 +1247,22 @@ parse_sdvo_device_mapping(struct drm_i915_private *i915)
 			continue;
 		}
 		drm_dbg_kms(&i915->drm,
-			    "the SDVO device with slave addr %2x is found on"
+			    "the SDVO device with target addr %2x is found on"
 			    " %s port\n",
-			    child->slave_addr,
+			    child->target_addr,
 			    (child->dvo_port == DEVICE_PORT_DVOB) ?
 			    "SDVOB" : "SDVOC");
 		mapping = &i915->display.vbt.sdvo_mappings[child->dvo_port - 1];
 		if (!mapping->initialized) {
 			mapping->dvo_port = child->dvo_port;
-			mapping->slave_addr = child->slave_addr;
+			mapping->target_addr = child->target_addr;
 			mapping->dvo_wiring = child->dvo_wiring;
 			mapping->ddc_pin = child->ddc_pin;
 			mapping->i2c_pin = child->i2c_pin;
 			mapping->initialized = 1;
 			drm_dbg_kms(&i915->drm,
 				    "SDVO device: dvo=%x, addr=%x, wiring=%d, ddc_pin=%d, i2c_pin=%d\n",
-				    mapping->dvo_port, mapping->slave_addr,
+				    mapping->dvo_port, mapping->target_addr,
 				    mapping->dvo_wiring, mapping->ddc_pin,
 				    mapping->i2c_pin);
 		} else {
@@ -1270,11 +1270,11 @@ parse_sdvo_device_mapping(struct drm_i915_private *i915)
 				    "Maybe one SDVO port is shared by "
 				    "two SDVO device.\n");
 		}
-		if (child->slave2_addr) {
+		if (child->target2_addr) {
 			/* Maybe this is a SDVO device with multiple inputs */
 			/* And the mapping info is not added */
 			drm_dbg_kms(&i915->drm,
-				    "there exists the slave2_addr. Maybe this"
+				    "there exists the target2_addr. Maybe this"
 				    " is a SDVO device with multiple inputs.\n");
 		}
 		count++;
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index c587a8efeafc..c408daee412a 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4327,7 +4327,7 @@ static int intel_ddi_compute_config_late(struct intel_encoder *encoder,
 									connector->tile_group->id);
 
 	/*
-	 * EDP Transcoders cannot be ensalved
+	 * EDP Transcoders cannot be slaves
 	 * make them a master always when present
 	 */
 	if (port_sync_transcoders & BIT(TRANSCODER_EDP))
diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h b/drivers/gpu/drm/i915/display/intel_display_core.h
index 2167dbee5eea..5bfc91f0b563 100644
--- a/drivers/gpu/drm/i915/display/intel_display_core.h
+++ b/drivers/gpu/drm/i915/display/intel_display_core.h
@@ -236,7 +236,7 @@ struct intel_vbt_data {
 	struct sdvo_device_mapping {
 		u8 initialized;
 		u8 dvo_port;
-		u8 slave_addr;
+		u8 target_addr;
 		u8 dvo_wiring;
 		u8 i2c_pin;
 		u8 ddc_pin;
diff --git a/drivers/gpu/drm/i915/display/intel_dsi.h b/drivers/gpu/drm/i915/display/intel_dsi.h
index e99c94edfaae..e8ba4ccd99d3 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi.h
+++ b/drivers/gpu/drm/i915/display/intel_dsi.h
@@ -66,7 +66,7 @@ struct intel_dsi {
 	/* number of DSI lanes */
 	unsigned int lane_count;
 
-	/* i2c bus associated with the slave device */
+	/* i2c bus associated with the target device */
 	int i2c_bus_num;
 
 	/*
diff --git a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
index a5d7fc8418c9..fb0b02e30c8b 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+++ b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
@@ -56,7 +56,7 @@
 #define MIPI_PORT_SHIFT			3
 
 struct i2c_adapter_lookup {
-	u16 slave_addr;
+	u16 target_addr;
 	struct intel_dsi *intel_dsi;
 	acpi_handle dev_handle;
 };
@@ -443,7 +443,7 @@ static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
 	if (!i2c_acpi_get_i2c_resource(ares, &sb))
 		return 1;
 
-	if (lookup->slave_addr != sb->slave_address)
+	if (lookup->target_addr != sb->slave_address)
 		return 1;
 
 	status = acpi_get_handle(lookup->dev_handle,
@@ -460,12 +460,12 @@ static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
 }
 
 static void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
-				  const u16 slave_addr)
+				  const u16 target_addr)
 {
 	struct drm_device *drm_dev = intel_dsi->base.base.dev;
 	struct acpi_device *adev = ACPI_COMPANION(drm_dev->dev);
 	struct i2c_adapter_lookup lookup = {
-		.slave_addr = slave_addr,
+		.target_addr = target_addr,
 		.intel_dsi = intel_dsi,
 		.dev_handle = acpi_device_handle(adev),
 	};
@@ -476,7 +476,7 @@ static void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
 }
 #else
 static inline void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
-					 const u16 slave_addr)
+					 const u16 target_addr)
 {
 }
 #endif
@@ -488,17 +488,17 @@ static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
 	struct i2c_msg msg;
 	int ret;
 	u8 vbt_i2c_bus_num = *(data + 2);
-	u16 slave_addr = *(u16 *)(data + 3);
+	u16 target_addr = *(u16 *)(data + 3);
 	u8 reg_offset = *(data + 5);
 	u8 payload_size = *(data + 6);
 	u8 *payload_data;
 
-	drm_dbg_kms(&i915->drm, "bus %d client-addr 0x%02x reg 0x%02x data %*ph\n",
-		    vbt_i2c_bus_num, slave_addr, reg_offset, payload_size, data + 7);
+	drm_dbg_kms(&i915->drm, "bus %d target-addr 0x%02x reg 0x%02x data %*ph\n",
+		    vbt_i2c_bus_num, target_addr, reg_offset, payload_size, data + 7);
 
 	if (intel_dsi->i2c_bus_num < 0) {
 		intel_dsi->i2c_bus_num = vbt_i2c_bus_num;
-		i2c_acpi_find_adapter(intel_dsi, slave_addr);
+		i2c_acpi_find_adapter(intel_dsi, target_addr);
 	}
 
 	adapter = i2c_get_adapter(intel_dsi->i2c_bus_num);
@@ -514,7 +514,7 @@ static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
 	payload_data[0] = reg_offset;
 	memcpy(&payload_data[1], (data + 7), payload_size);
 
-	msg.addr = slave_addr;
+	msg.addr = target_addr;
 	msg.flags = 0;
 	msg.len = payload_size + 1;
 	msg.buf = payload_data;
diff --git a/drivers/gpu/drm/i915/display/intel_dvo.c b/drivers/gpu/drm/i915/display/intel_dvo.c
index c076da75b066..8d4c8f33f776 100644
--- a/drivers/gpu/drm/i915/display/intel_dvo.c
+++ b/drivers/gpu/drm/i915/display/intel_dvo.c
@@ -60,42 +60,42 @@ static const struct intel_dvo_device intel_dvo_devices[] = {
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "sil164",
 		.port = PORT_C,
-		.slave_addr = SIL164_ADDR,
+		.target_addr = SIL164_ADDR,
 		.dev_ops = &sil164_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "ch7xxx",
 		.port = PORT_C,
-		.slave_addr = CH7xxx_ADDR,
+		.target_addr = CH7xxx_ADDR,
 		.dev_ops = &ch7xxx_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "ch7xxx",
 		.port = PORT_C,
-		.slave_addr = 0x75, /* For some ch7010 */
+		.target_addr = 0x75, /* For some ch7010 */
 		.dev_ops = &ch7xxx_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_LVDS,
 		.name = "ivch",
 		.port = PORT_A,
-		.slave_addr = 0x02, /* Might also be 0x44, 0x84, 0xc4 */
+		.target_addr = 0x02, /* Might also be 0x44, 0x84, 0xc4 */
 		.dev_ops = &ivch_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_TMDS,
 		.name = "tfp410",
 		.port = PORT_C,
-		.slave_addr = TFP410_ADDR,
+		.target_addr = TFP410_ADDR,
 		.dev_ops = &tfp410_ops,
 	},
 	{
 		.type = INTEL_DVO_CHIP_LVDS,
 		.name = "ch7017",
 		.port = PORT_C,
-		.slave_addr = 0x75,
+		.target_addr = 0x75,
 		.gpio = GMBUS_PIN_DPB,
 		.dev_ops = &ch7017_ops,
 	},
@@ -103,7 +103,7 @@ static const struct intel_dvo_device intel_dvo_devices[] = {
 		.type = INTEL_DVO_CHIP_LVDS_NO_FIXED,
 		.name = "ns2501",
 		.port = PORT_B,
-		.slave_addr = NS2501_ADDR,
+		.target_addr = NS2501_ADDR,
 		.dev_ops = &ns2501_ops,
 	},
 };
diff --git a/drivers/gpu/drm/i915/display/intel_dvo_dev.h b/drivers/gpu/drm/i915/display/intel_dvo_dev.h
index af7b04539b93..4bf476656b8c 100644
--- a/drivers/gpu/drm/i915/display/intel_dvo_dev.h
+++ b/drivers/gpu/drm/i915/display/intel_dvo_dev.h
@@ -38,7 +38,7 @@ struct intel_dvo_device {
 	enum port port;
 	/* GPIO register used for i2c bus to control this device */
 	u32 gpio;
-	int slave_addr;
+	int target_addr;
 
 	const struct intel_dvo_dev_ops *dev_ops;
 	void *dev_priv;
diff --git a/drivers/gpu/drm/i915/display/intel_gmbus.c b/drivers/gpu/drm/i915/display/intel_gmbus.c
index d3e03ed5b79c..fe9a3c1f0072 100644
--- a/drivers/gpu/drm/i915/display/intel_gmbus.c
+++ b/drivers/gpu/drm/i915/display/intel_gmbus.c
@@ -478,7 +478,7 @@ gmbus_xfer_read_chunk(struct drm_i915_private *i915,
 /*
  * HW spec says that 512Bytes in Burst read need special treatment.
  * But it doesn't talk about other multiple of 256Bytes. And couldn't locate
- * an I2C slave, which supports such a lengthy burst read too for experiments.
+ * an I2C target, which supports such a lengthy burst read too for experiments.
  *
  * So until things get clarified on HW support, to avoid the burst read length
  * in fold of 256Bytes except 512, max burst read length is fixed at 767Bytes.
@@ -701,7 +701,7 @@ do_gmbus_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs, int num,
 
 	/* Toggle the Software Clear Interrupt bit. This has the effect
 	 * of resetting the GMBUS controller and so clearing the
-	 * BUS_ERROR raised by the slave's NAK.
+	 * BUS_ERROR raised by the target's NAK.
 	 */
 	intel_de_write_fw(i915, GMBUS1(i915), GMBUS_SW_CLR_INT);
 	intel_de_write_fw(i915, GMBUS1(i915), 0);
diff --git a/drivers/gpu/drm/i915/display/intel_sdvo.c b/drivers/gpu/drm/i915/display/intel_sdvo.c
index 5f9e748adc89..87052bd1c554 100644
--- a/drivers/gpu/drm/i915/display/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/display/intel_sdvo.c
@@ -95,7 +95,7 @@ struct intel_sdvo {
 	struct intel_encoder base;
 
 	struct i2c_adapter *i2c;
-	u8 slave_addr;
+	u8 target_addr;
 
 	struct intel_sdvo_ddc ddc[3];
 
@@ -255,13 +255,13 @@ static bool intel_sdvo_read_byte(struct intel_sdvo *intel_sdvo, u8 addr, u8 *ch)
 	struct drm_i915_private *i915 = to_i915(intel_sdvo->base.base.dev);
 	struct i2c_msg msgs[] = {
 		{
-			.addr = intel_sdvo->slave_addr,
+			.addr = intel_sdvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = &addr,
 		},
 		{
-			.addr = intel_sdvo->slave_addr,
+			.addr = intel_sdvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = ch,
@@ -483,14 +483,14 @@ static bool __intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
 	intel_sdvo_debug_write(intel_sdvo, cmd, args, args_len);
 
 	for (i = 0; i < args_len; i++) {
-		msgs[i].addr = intel_sdvo->slave_addr;
+		msgs[i].addr = intel_sdvo->target_addr;
 		msgs[i].flags = 0;
 		msgs[i].len = 2;
 		msgs[i].buf = buf + 2 *i;
 		buf[2*i + 0] = SDVO_I2C_ARG_0 - i;
 		buf[2*i + 1] = ((u8*)args)[i];
 	}
-	msgs[i].addr = intel_sdvo->slave_addr;
+	msgs[i].addr = intel_sdvo->target_addr;
 	msgs[i].flags = 0;
 	msgs[i].len = 2;
 	msgs[i].buf = buf + 2*i;
@@ -499,12 +499,12 @@ static bool __intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
 
 	/* the following two are to read the response */
 	status = SDVO_I2C_CMD_STATUS;
-	msgs[i+1].addr = intel_sdvo->slave_addr;
+	msgs[i+1].addr = intel_sdvo->target_addr;
 	msgs[i+1].flags = 0;
 	msgs[i+1].len = 1;
 	msgs[i+1].buf = &status;
 
-	msgs[i+2].addr = intel_sdvo->slave_addr;
+	msgs[i+2].addr = intel_sdvo->target_addr;
 	msgs[i+2].flags = I2C_M_RD;
 	msgs[i+2].len = 1;
 	msgs[i+2].buf = &status;
@@ -2659,9 +2659,9 @@ intel_sdvo_select_i2c_bus(struct intel_sdvo *sdvo)
 	else
 		pin = GMBUS_PIN_DPB;
 
-	drm_dbg_kms(&dev_priv->drm, "[ENCODER:%d:%s] I2C pin %d, slave addr 0x%x\n",
+	drm_dbg_kms(&dev_priv->drm, "[ENCODER:%d:%s] I2C pin %d, target addr 0x%x\n",
 		    sdvo->base.base.base.id, sdvo->base.base.name,
-		    pin, sdvo->slave_addr);
+		    pin, sdvo->target_addr);
 
 	sdvo->i2c = intel_gmbus_get_adapter(dev_priv, pin);
 
@@ -2687,7 +2687,7 @@ intel_sdvo_is_hdmi_connector(struct intel_sdvo *intel_sdvo)
 }
 
 static u8
-intel_sdvo_get_slave_addr(struct intel_sdvo *sdvo)
+intel_sdvo_get_target_addr(struct intel_sdvo *sdvo)
 {
 	struct drm_i915_private *dev_priv = to_i915(sdvo->base.base.dev);
 	const struct sdvo_device_mapping *my_mapping, *other_mapping;
@@ -2701,15 +2701,15 @@ intel_sdvo_get_slave_addr(struct intel_sdvo *sdvo)
 	}
 
 	/* If the BIOS described our SDVO device, take advantage of it. */
-	if (my_mapping->slave_addr)
-		return my_mapping->slave_addr;
+	if (my_mapping->target_addr)
+		return my_mapping->target_addr;
 
 	/*
 	 * If the BIOS only described a different SDVO device, use the
 	 * address that it isn't using.
 	 */
-	if (other_mapping->slave_addr) {
-		if (other_mapping->slave_addr == 0x70)
+	if (other_mapping->target_addr) {
+		if (other_mapping->target_addr == 0x70)
 			return 0x72;
 		else
 			return 0x70;
@@ -3412,7 +3412,7 @@ bool intel_sdvo_init(struct drm_i915_private *dev_priv,
 			 "SDVO %c", port_name(port));
 
 	intel_sdvo->sdvo_reg = sdvo_reg;
-	intel_sdvo->slave_addr = intel_sdvo_get_slave_addr(intel_sdvo) >> 1;
+	intel_sdvo->target_addr = intel_sdvo_get_target_addr(intel_sdvo) >> 1;
 
 	intel_sdvo_select_i2c_bus(intel_sdvo);
 
diff --git a/drivers/gpu/drm/i915/display/intel_vbt_defs.h b/drivers/gpu/drm/i915/display/intel_vbt_defs.h
index a9f44abfc9fc..c0d5aae980a8 100644
--- a/drivers/gpu/drm/i915/display/intel_vbt_defs.h
+++ b/drivers/gpu/drm/i915/display/intel_vbt_defs.h
@@ -432,7 +432,7 @@ struct child_device_config {
 	u16 addin_offset;
 	u8 dvo_port; /* See DEVICE_PORT_* and DVO_PORT_* above */
 	u8 i2c_pin;
-	u8 slave_addr;
+	u8 target_addr;
 	u8 ddc_pin;
 	u16 edid_ptr;
 	u8 dvo_cfg; /* See DEVICE_CFG_* above */
@@ -441,7 +441,7 @@ struct child_device_config {
 		struct {
 			u8 dvo2_port;
 			u8 i2c2_pin;
-			u8 slave2_addr;
+			u8 target2_addr;
 			u8 ddc2_pin;
 		} __packed;
 		struct {
diff --git a/drivers/gpu/drm/i915/gvt/edid.c b/drivers/gpu/drm/i915/gvt/edid.c
index af9afdb53c7f..c022dc736045 100644
--- a/drivers/gpu/drm/i915/gvt/edid.c
+++ b/drivers/gpu/drm/i915/gvt/edid.c
@@ -42,8 +42,8 @@
 #define GMBUS1_TOTAL_BYTES_MASK 0x1ff
 #define gmbus1_total_byte_count(v) (((v) >> \
 	GMBUS1_TOTAL_BYTES_SHIFT) & GMBUS1_TOTAL_BYTES_MASK)
-#define gmbus1_slave_addr(v) (((v) & 0xff) >> 1)
-#define gmbus1_slave_index(v) (((v) >> 8) & 0xff)
+#define gmbus1_target_addr(v) (((v) & 0xff) >> 1)
+#define gmbus1_target_index(v) (((v) >> 8) & 0xff)
 #define gmbus1_bus_cycle(v) (((v) >> 25) & 0x7)
 
 /* GMBUS0 bits definitions */
@@ -54,7 +54,7 @@ static unsigned char edid_get_byte(struct intel_vgpu *vgpu)
 	struct intel_vgpu_i2c_edid *edid = &vgpu->display.i2c_edid;
 	unsigned char chr = 0;
 
-	if (edid->state == I2C_NOT_SPECIFIED || !edid->slave_selected) {
+	if (edid->state == I2C_NOT_SPECIFIED || !edid->target_selected) {
 		gvt_vgpu_err("Driver tries to read EDID without proper sequence!\n");
 		return 0;
 	}
@@ -179,7 +179,7 @@ static int gmbus1_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 		void *p_data, unsigned int bytes)
 {
 	struct intel_vgpu_i2c_edid *i2c_edid = &vgpu->display.i2c_edid;
-	u32 slave_addr;
+	u32 target_addr;
 	u32 wvalue = *(u32 *)p_data;
 
 	if (vgpu_vreg(vgpu, offset) & GMBUS_SW_CLR_INT) {
@@ -210,21 +210,21 @@ static int gmbus1_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 
 		i2c_edid->gmbus.total_byte_count =
 			gmbus1_total_byte_count(wvalue);
-		slave_addr = gmbus1_slave_addr(wvalue);
+		target_addr = gmbus1_target_addr(wvalue);
 
 		/* vgpu gmbus only support EDID */
-		if (slave_addr == EDID_ADDR) {
-			i2c_edid->slave_selected = true;
-		} else if (slave_addr != 0) {
+		if (target_addr == EDID_ADDR) {
+			i2c_edid->target_selected = true;
+		} else if (target_addr != 0) {
 			gvt_dbg_dpy(
-				"vgpu%d: unsupported gmbus slave addr(0x%x)\n"
+				"vgpu%d: unsupported gmbus target addr(0x%x)\n"
 				"	gmbus operations will be ignored.\n",
-					vgpu->id, slave_addr);
+					vgpu->id, target_addr);
 		}
 
 		if (wvalue & GMBUS_CYCLE_INDEX)
 			i2c_edid->current_edid_read =
-				gmbus1_slave_index(wvalue);
+				gmbus1_target_index(wvalue);
 
 		i2c_edid->gmbus.cycle_type = gmbus1_bus_cycle(wvalue);
 		switch (gmbus1_bus_cycle(wvalue)) {
@@ -523,7 +523,7 @@ void intel_gvt_i2c_handle_aux_ch_write(struct intel_vgpu *vgpu,
 			} else if (addr == EDID_ADDR) {
 				i2c_edid->state = I2C_AUX_CH;
 				i2c_edid->port = port_idx;
-				i2c_edid->slave_selected = true;
+				i2c_edid->target_selected = true;
 				if (intel_vgpu_has_monitor_on_port(vgpu,
 					port_idx) &&
 					intel_vgpu_port_is_dp(vgpu, port_idx))
@@ -542,7 +542,7 @@ void intel_gvt_i2c_handle_aux_ch_write(struct intel_vgpu *vgpu,
 			return;
 		if (drm_WARN_ON(&i915->drm, msg_length != 4))
 			return;
-		if (i2c_edid->edid_available && i2c_edid->slave_selected) {
+		if (i2c_edid->edid_available && i2c_edid->target_selected) {
 			unsigned char val = edid_get_byte(vgpu);
 
 			aux_data_for_write = (val << 16);
@@ -571,7 +571,7 @@ void intel_vgpu_init_i2c_edid(struct intel_vgpu *vgpu)
 	edid->state = I2C_NOT_SPECIFIED;
 
 	edid->port = -1;
-	edid->slave_selected = false;
+	edid->target_selected = false;
 	edid->edid_available = false;
 	edid->current_edid_read = 0;
 
diff --git a/drivers/gpu/drm/i915/gvt/edid.h b/drivers/gpu/drm/i915/gvt/edid.h
index dfe0cbc6aad8..c3b5a55aecb3 100644
--- a/drivers/gpu/drm/i915/gvt/edid.h
+++ b/drivers/gpu/drm/i915/gvt/edid.h
@@ -80,7 +80,7 @@ enum gmbus_cycle_type {
  *      R/W Protect
  *      Command and Status.
  *      bit0 is the direction bit: 1 is read; 0 is write.
- *      bit1 - bit7 is slave 7-bit address.
+ *      bit1 - bit7 is target 7-bit address.
  *      bit16 - bit24 total byte count (ignore?)
  *
  * GMBUS2:
@@ -130,7 +130,7 @@ struct intel_vgpu_i2c_edid {
 	enum i2c_state state;
 
 	unsigned int port;
-	bool slave_selected;
+	bool target_selected;
 	bool edid_available;
 	unsigned int current_edid_read;
 
diff --git a/drivers/gpu/drm/i915/gvt/opregion.c b/drivers/gpu/drm/i915/gvt/opregion.c
index d2bed466540a..908f910420c2 100644
--- a/drivers/gpu/drm/i915/gvt/opregion.c
+++ b/drivers/gpu/drm/i915/gvt/opregion.c
@@ -86,7 +86,7 @@ struct efp_child_device_config {
 	u8 skip2;
 	u8 dvo_port;
 	u8 i2c_pin; /* for add-in card */
-	u8 slave_addr; /* for add-in card */
+	u8 target_addr; /* for add-in card */
 	u8 ddc_pin;
 	u16 edid_ptr;
 	u8 dvo_config;
-- 
2.34.1
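
For readers following the gvt/edid.c hunk above, here is a minimal,
stand-alone sketch of how the renamed GMBUS1 field macros decode a
register value. The macro bodies are copied from the hunk. Everything
else is an assumption made for illustration: the value 16 for
GMBUS1_TOTAL_BYTES_SHIFT is inferred from the "bit16 - bit24 total byte
count" comment in gvt/edid.h, EXAMPLE_EDID_ADDR is the usual 7-bit DDC
EDID address, and example_gmbus1_pack() is a hypothetical helper that
does not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors as they read after the rename in gvt/edid.c. */
#define GMBUS1_TOTAL_BYTES_SHIFT 16	/* assumption, from "bit16 - bit24" */
#define GMBUS1_TOTAL_BYTES_MASK 0x1ff
#define gmbus1_total_byte_count(v) (((v) >> \
	GMBUS1_TOTAL_BYTES_SHIFT) & GMBUS1_TOTAL_BYTES_MASK)
#define gmbus1_target_addr(v) (((v) & 0xff) >> 1)
#define gmbus1_target_index(v) (((v) >> 8) & 0xff)
#define gmbus1_bus_cycle(v) (((v) >> 25) & 0x7)

/* Illustrative name: the usual 7-bit DDC address of an EDID EEPROM. */
#define EXAMPLE_EDID_ADDR 0x50

/*
 * Hypothetical helper (not part of the patch): build a GMBUS1 value for
 * a read from a 7-bit target address. Bit 0 is the direction bit
 * (1 = read), bits 1-7 carry the target address.
 */
static uint32_t example_gmbus1_pack(uint8_t addr7, uint8_t index,
				    uint16_t nbytes, uint8_t cycle)
{
	assert(addr7 <= 0x7f);
	assert(nbytes <= GMBUS1_TOTAL_BYTES_MASK);
	assert(cycle <= 0x7);

	return ((uint32_t)cycle << 25) |
	       ((uint32_t)nbytes << GMBUS1_TOTAL_BYTES_SHIFT) |
	       ((uint32_t)index << 8) |
	       ((uint32_t)addr7 << 1) | 1u;
}
```

Packing EXAMPLE_EDID_ADDR and decoding the result with
gmbus1_target_addr() gives back the same 7-bit address, which is the
value gmbus1_mmio_write() compares against EDID_ADDR.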


^ permalink raw reply related	[relevance 23%]

* [PATCH v1 02/12] drm/gma500: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
  2024-04-30 17:38 24% ` [PATCH v1 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
@ 2024-04-30 17:38 46% ` Easwar Hariharan
  2024-04-30 17:38 23% ` [PATCH v1 03/12] drm/i915: " Easwar Hariharan
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Patrik Jakobsson, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter, dri-devel,
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/ [1], fix the terminology for
users of the I2C_ALGOBIT bitbanging interface, now that the approved
verbiage exists in the specifications.

Compile tested, no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 drivers/gpu/drm/gma500/cdv_intel_lvds.c |  2 +-
 drivers/gpu/drm/gma500/intel_bios.c     | 22 ++++++++++-----------
 drivers/gpu/drm/gma500/intel_bios.h     |  4 ++--
 drivers/gpu/drm/gma500/intel_gmbus.c    |  2 +-
 drivers/gpu/drm/gma500/psb_drv.h        |  2 +-
 drivers/gpu/drm/gma500/psb_intel_drv.h  |  2 +-
 drivers/gpu/drm/gma500/psb_intel_lvds.c |  4 ++--
 drivers/gpu/drm/gma500/psb_intel_sdvo.c | 26 ++++++++++++-------------
 8 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_intel_lvds.c b/drivers/gpu/drm/gma500/cdv_intel_lvds.c
index f08a6803dc18..c7652a02b42e 100644
--- a/drivers/gpu/drm/gma500/cdv_intel_lvds.c
+++ b/drivers/gpu/drm/gma500/cdv_intel_lvds.c
@@ -565,7 +565,7 @@ void cdv_intel_lvds_init(struct drm_device *dev,
 			dev->dev, "I2C bus registration failed.\n");
 		goto err_encoder_cleanup;
 	}
-	gma_encoder->i2c_bus->slave_addr = 0x2C;
+	gma_encoder->i2c_bus->target_addr = 0x2C;
 	dev_priv->lvds_i2c_bus = gma_encoder->i2c_bus;
 
 	/*
diff --git a/drivers/gpu/drm/gma500/intel_bios.c b/drivers/gpu/drm/gma500/intel_bios.c
index 8245b5603d2c..d5924ca3ed05 100644
--- a/drivers/gpu/drm/gma500/intel_bios.c
+++ b/drivers/gpu/drm/gma500/intel_bios.c
@@ -14,8 +14,8 @@
 #include "psb_intel_drv.h"
 #include "psb_intel_reg.h"
 
-#define	SLAVE_ADDR1	0x70
-#define	SLAVE_ADDR2	0x72
+#define	TARGET_ADDR1	0x70
+#define	TARGET_ADDR2	0x72
 
 static void *find_section(struct bdb_header *bdb, int section_id)
 {
@@ -357,10 +357,10 @@ parse_sdvo_device_mapping(struct drm_psb_private *dev_priv,
 			/* skip the device block if device type is invalid */
 			continue;
 		}
-		if (p_child->slave_addr != SLAVE_ADDR1 &&
-			p_child->slave_addr != SLAVE_ADDR2) {
+		if (p_child->target_addr != TARGET_ADDR1 &&
+			p_child->target_addr != TARGET_ADDR2) {
 			/*
-			 * If the slave address is neither 0x70 nor 0x72,
+			 * If the target address is neither 0x70 nor 0x72,
 			 * it is not a SDVO device. Skip it.
 			 */
 			continue;
@@ -371,22 +371,22 @@ parse_sdvo_device_mapping(struct drm_psb_private *dev_priv,
 			DRM_DEBUG_KMS("Incorrect SDVO port. Skip it\n");
 			continue;
 		}
-		DRM_DEBUG_KMS("the SDVO device with slave addr %2x is found on"
+		DRM_DEBUG_KMS("the SDVO device with target addr %2x is found on"
 				" %s port\n",
-				p_child->slave_addr,
+				p_child->target_addr,
 				(p_child->dvo_port == DEVICE_PORT_DVOB) ?
 					"SDVOB" : "SDVOC");
 		p_mapping = &(dev_priv->sdvo_mappings[p_child->dvo_port - 1]);
 		if (!p_mapping->initialized) {
 			p_mapping->dvo_port = p_child->dvo_port;
-			p_mapping->slave_addr = p_child->slave_addr;
+			p_mapping->target_addr = p_child->target_addr;
 			p_mapping->dvo_wiring = p_child->dvo_wiring;
 			p_mapping->ddc_pin = p_child->ddc_pin;
 			p_mapping->i2c_pin = p_child->i2c_pin;
 			p_mapping->initialized = 1;
 			DRM_DEBUG_KMS("SDVO device: dvo=%x, addr=%x, wiring=%d, ddc_pin=%d, i2c_pin=%d\n",
 				      p_mapping->dvo_port,
-				      p_mapping->slave_addr,
+				      p_mapping->target_addr,
 				      p_mapping->dvo_wiring,
 				      p_mapping->ddc_pin,
 				      p_mapping->i2c_pin);
@@ -394,10 +394,10 @@ parse_sdvo_device_mapping(struct drm_psb_private *dev_priv,
 			DRM_DEBUG_KMS("Maybe one SDVO port is shared by "
 					 "two SDVO device.\n");
 		}
-		if (p_child->slave2_addr) {
+		if (p_child->target2_addr) {
 			/* Maybe this is a SDVO device with multiple inputs */
 			/* And the mapping info is not added */
-			DRM_DEBUG_KMS("there exists the slave2_addr. Maybe this"
+			DRM_DEBUG_KMS("there exists the target2_addr. Maybe this"
 				" is a SDVO device with multiple inputs.\n");
 		}
 		count++;
diff --git a/drivers/gpu/drm/gma500/intel_bios.h b/drivers/gpu/drm/gma500/intel_bios.h
index 0e6facf21e33..b5adea2a20c3 100644
--- a/drivers/gpu/drm/gma500/intel_bios.h
+++ b/drivers/gpu/drm/gma500/intel_bios.h
@@ -186,13 +186,13 @@ struct child_device_config {
 	u16 addin_offset;
 	u8  dvo_port; /* See Device_PORT_* above */
 	u8  i2c_pin;
-	u8  slave_addr;
+	u8  target_addr;
 	u8  ddc_pin;
 	u16 edid_ptr;
 	u8  dvo_cfg; /* See DEVICE_CFG_* above */
 	u8  dvo2_port;
 	u8  i2c2_pin;
-	u8  slave2_addr;
+	u8  target2_addr;
 	u8  ddc2_pin;
 	u8  capabilities;
 	u8  dvo_wiring;/* See DEVICE_WIRE_* above */
diff --git a/drivers/gpu/drm/gma500/intel_gmbus.c b/drivers/gpu/drm/gma500/intel_gmbus.c
index aa45509859f2..ee8b047587f2 100644
--- a/drivers/gpu/drm/gma500/intel_gmbus.c
+++ b/drivers/gpu/drm/gma500/intel_gmbus.c
@@ -333,7 +333,7 @@ gmbus_xfer(struct i2c_adapter *adapter,
 clear_err:
 	/* Toggle the Software Clear Interrupt bit. This has the effect
 	 * of resetting the GMBUS controller and so clearing the
-	 * BUS_ERROR raised by the slave's NAK.
+	 * BUS_ERROR raised by the target's NAK.
 	 */
 	GMBUS_REG_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT);
 	GMBUS_REG_WRITE(GMBUS1 + reg_offset, 0);
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index c5edfa4aa4cc..eeab6afb42dc 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -203,7 +203,7 @@ struct psb_intel_opregion {
 struct sdvo_device_mapping {
 	u8 initialized;
 	u8 dvo_port;
-	u8 slave_addr;
+	u8 target_addr;
 	u8 dvo_wiring;
 	u8 i2c_pin;
 	u8 i2c_speed;
diff --git a/drivers/gpu/drm/gma500/psb_intel_drv.h b/drivers/gpu/drm/gma500/psb_intel_drv.h
index c111e933e1ed..2499fd6a80c9 100644
--- a/drivers/gpu/drm/gma500/psb_intel_drv.h
+++ b/drivers/gpu/drm/gma500/psb_intel_drv.h
@@ -80,7 +80,7 @@ struct psb_intel_mode_device {
 struct gma_i2c_chan {
 	struct i2c_adapter base;
 	struct i2c_algo_bit_data algo;
-	u8 slave_addr;
+	u8 target_addr;
 
 	/* for getting at dev. private (mmio etc.) */
 	struct drm_device *drm_dev;
diff --git a/drivers/gpu/drm/gma500/psb_intel_lvds.c b/drivers/gpu/drm/gma500/psb_intel_lvds.c
index 8486de230ec9..d1cd9a940395 100644
--- a/drivers/gpu/drm/gma500/psb_intel_lvds.c
+++ b/drivers/gpu/drm/gma500/psb_intel_lvds.c
@@ -97,7 +97,7 @@ static int psb_lvds_i2c_set_brightness(struct drm_device *dev,
 
 	struct i2c_msg msgs[] = {
 		{
-			.addr = lvds_i2c_bus->slave_addr,
+			.addr = lvds_i2c_bus->target_addr,
 			.flags = 0,
 			.len = 2,
 			.buf = out_buf,
@@ -707,7 +707,7 @@ void psb_intel_lvds_init(struct drm_device *dev,
 			dev->dev, "I2C bus registration failed.\n");
 		goto err_encoder_cleanup;
 	}
-	lvds_priv->i2c_bus->slave_addr = 0x2C;
+	lvds_priv->i2c_bus->target_addr = 0x2C;
 	dev_priv->lvds_i2c_bus =  lvds_priv->i2c_bus;
 
 	/*
diff --git a/drivers/gpu/drm/gma500/psb_intel_sdvo.c b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
index e4f914deceba..8dafff963ca8 100644
--- a/drivers/gpu/drm/gma500/psb_intel_sdvo.c
+++ b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
@@ -70,7 +70,7 @@ struct psb_intel_sdvo {
 	struct gma_encoder base;
 
 	struct i2c_adapter *i2c;
-	u8 slave_addr;
+	u8 target_addr;
 
 	struct i2c_adapter ddc;
 
@@ -259,13 +259,13 @@ static bool psb_intel_sdvo_read_byte(struct psb_intel_sdvo *psb_intel_sdvo, u8 a
 {
 	struct i2c_msg msgs[] = {
 		{
-			.addr = psb_intel_sdvo->slave_addr,
+			.addr = psb_intel_sdvo->target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = &addr,
 		},
 		{
-			.addr = psb_intel_sdvo->slave_addr,
+			.addr = psb_intel_sdvo->target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = ch,
@@ -463,14 +463,14 @@ static bool psb_intel_sdvo_write_cmd(struct psb_intel_sdvo *psb_intel_sdvo, u8 c
 	psb_intel_sdvo_debug_write(psb_intel_sdvo, cmd, args, args_len);
 
 	for (i = 0; i < args_len; i++) {
-		msgs[i].addr = psb_intel_sdvo->slave_addr;
+		msgs[i].addr = psb_intel_sdvo->target_addr;
 		msgs[i].flags = 0;
 		msgs[i].len = 2;
 		msgs[i].buf = buf + 2 *i;
 		buf[2*i + 0] = SDVO_I2C_ARG_0 - i;
 		buf[2*i + 1] = ((u8*)args)[i];
 	}
-	msgs[i].addr = psb_intel_sdvo->slave_addr;
+	msgs[i].addr = psb_intel_sdvo->target_addr;
 	msgs[i].flags = 0;
 	msgs[i].len = 2;
 	msgs[i].buf = buf + 2*i;
@@ -479,12 +479,12 @@ static bool psb_intel_sdvo_write_cmd(struct psb_intel_sdvo *psb_intel_sdvo, u8 c
 
 	/* the following two are to read the response */
 	status = SDVO_I2C_CMD_STATUS;
-	msgs[i+1].addr = psb_intel_sdvo->slave_addr;
+	msgs[i+1].addr = psb_intel_sdvo->target_addr;
 	msgs[i+1].flags = 0;
 	msgs[i+1].len = 1;
 	msgs[i+1].buf = &status;
 
-	msgs[i+2].addr = psb_intel_sdvo->slave_addr;
+	msgs[i+2].addr = psb_intel_sdvo->target_addr;
 	msgs[i+2].flags = I2C_M_RD;
 	msgs[i+2].len = 1;
 	msgs[i+2].buf = &status;
@@ -1899,7 +1899,7 @@ psb_intel_sdvo_is_hdmi_connector(struct psb_intel_sdvo *psb_intel_sdvo, int devi
 }
 
 static u8
-psb_intel_sdvo_get_slave_addr(struct drm_device *dev, int sdvo_reg)
+psb_intel_sdvo_get_target_addr(struct drm_device *dev, int sdvo_reg)
 {
 	struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
 	struct sdvo_device_mapping *my_mapping, *other_mapping;
@@ -1913,14 +1913,14 @@ psb_intel_sdvo_get_slave_addr(struct drm_device *dev, int sdvo_reg)
 	}
 
 	/* If the BIOS described our SDVO device, take advantage of it. */
-	if (my_mapping->slave_addr)
-		return my_mapping->slave_addr;
+	if (my_mapping->target_addr)
+		return my_mapping->target_addr;
 
 	/* If the BIOS only described a different SDVO device, use the
 	 * address that it isn't using.
 	 */
-	if (other_mapping->slave_addr) {
-		if (other_mapping->slave_addr == 0x70)
+	if (other_mapping->target_addr) {
+		if (other_mapping->target_addr == 0x70)
 			return 0x72;
 		else
 			return 0x70;
@@ -2446,7 +2446,7 @@ bool psb_intel_sdvo_init(struct drm_device *dev, int sdvo_reg)
 		return false;
 
 	psb_intel_sdvo->sdvo_reg = sdvo_reg;
-	psb_intel_sdvo->slave_addr = psb_intel_sdvo_get_slave_addr(dev, sdvo_reg) >> 1;
+	psb_intel_sdvo->target_addr = psb_intel_sdvo_get_target_addr(dev, sdvo_reg) >> 1;
 	psb_intel_sdvo_select_i2c_bus(dev_priv, psb_intel_sdvo, sdvo_reg);
 	if (!psb_intel_sdvo_init_ddc_proxy(psb_intel_sdvo, dev)) {
 		kfree(psb_intel_sdvo);
-- 
2.34.1
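
The address selection touched by the psb_intel_sdvo.c hunks above can
be sketched in isolation as follows. This is only an illustration under
stated assumptions: struct example_sdvo_mapping is a trimmed stand-in
for struct sdvo_device_mapping, and the "fallback" parameter is
hypothetical. It stands in for the tail of
psb_intel_sdvo_get_target_addr(), which the diff context does not show.

```c
#include <stdint.h>

/* Trimmed stand-in for struct sdvo_device_mapping (illustration only). */
struct example_sdvo_mapping {
	uint8_t target_addr;	/* 8-bit address from the BIOS, 0 if absent */
};

/*
 * Mirrors the visible part of psb_intel_sdvo_get_target_addr(): prefer
 * the address the BIOS gave for our device, otherwise pick whichever of
 * 0x70/0x72 the other device is not using. "fallback" is a hypothetical
 * parameter covering the case the hunk does not show.
 */
static uint8_t
example_sdvo_get_target_addr(const struct example_sdvo_mapping *my_mapping,
			     const struct example_sdvo_mapping *other_mapping,
			     uint8_t fallback)
{
	if (my_mapping->target_addr)
		return my_mapping->target_addr;

	if (other_mapping->target_addr) {
		if (other_mapping->target_addr == 0x70)
			return 0x72;
		else
			return 0x70;
	}

	return fallback;
}

/* The init path shifts the 8-bit value right by one to get a 7-bit address. */
static uint8_t
example_sdvo_addr7(const struct example_sdvo_mapping *my_mapping,
		   const struct example_sdvo_mapping *other_mapping,
		   uint8_t fallback)
{
	return example_sdvo_get_target_addr(my_mapping, other_mapping,
					    fallback) >> 1;
}
```

With this sketch, 0x70 and 0x72 become 0x38 and 0x39 after the shift,
matching the ">> 1" applied in psb_intel_sdvo_init() before the value
is stored in target_addr and later used as msgs[].addr.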


^ permalink raw reply related	[relevance 46%]

* [PATCH v1 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive
  2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
@ 2024-04-30 17:38 24% ` Easwar Hariharan
  2024-04-30 17:38 46% ` [PATCH v1 02/12] drm/gma500: " Easwar Hariharan
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:38 UTC (permalink / raw)
  To: Alex Deucher, Christian König, Pan, Xinhui, David Airlie,
	Daniel Vetter, Harry Wentland, Leo Li, Rodrigo Siqueira,
	Evan Quan, Hawking Zhang, Candice Li, Alexander Richards,
	Ran Sun, Easwar Hariharan, Thomas Zimmermann, Jani Nikula,
	Dmitry Baryshkov, AngeloGioacchino Del Regno, Andi Shyti,
	Heiner Kallweit, Hamza Mahfooz, Alan Liu, Ruan Jinjie,
	Aurabindo Pillai, Wayne Lin, Samson Tam, Alvin Lee,
	Sohaib Nadeem, Charlene Liu, Bhawanpreet Lakha,
	Meenakshikumar Somasundaram, Tom Chung, George Shen, Aric Cyr,
	Nicholas Kazlauskas, Qingqing Zhuo, Dillon Varone, Lijo Lazar,
	Asad kamal, Ma Jun, Kenneth Feng, Mario Limonciello,
	Darren Powell, Yang Wang, Yifan Zhang, Le Ma,
	open list:RADEON and AMDGPU DRM DRIVERS, open list:DRM DRIVERS,
	open list
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by and following on
from Wolfram's series to fix drivers/i2c/ [1], fix the terminology for
users of the I2C_ALGOBIT bitbanging interface, now that the approved
verbiage exists in the specifications.

Compile tested, no functional changes intended.

[1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c  |  8 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c       | 10 +++----
 drivers/gpu/drm/amd/amdgpu/atombios_i2c.c     |  8 +++---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    | 20 ++++++-------
 .../gpu/drm/amd/display/dc/bios/bios_parser.c |  2 +-
 .../drm/amd/display/dc/bios/bios_parser2.c    |  2 +-
 .../drm/amd/display/dc/core/dc_link_exports.c |  4 +--
 drivers/gpu/drm/amd/display/dc/dc.h           |  2 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c  |  4 +--
 .../display/include/grph_object_ctrl_defs.h   |  2 +-
 drivers/gpu/drm/amd/include/atombios.h        |  2 +-
 drivers/gpu/drm/amd/include/atomfirmware.h    | 26 ++++++++---------
 .../powerplay/hwmgr/vega20_processpptables.c  |  4 +--
 .../amd/pm/powerplay/inc/smu11_driver_if.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_arcturus.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_navi10.h      |  2 +-
 .../pmfw_if/smu11_driver_if_sienna_cichlid.h  |  2 +-
 .../inc/pmfw_if/smu13_driver_if_aldebaran.h   |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_0.h     |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_7.h     |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  4 +--
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  8 +++---
 drivers/gpu/drm/radeon/atombios.h             |  2 +-
 drivers/gpu/drm/radeon/atombios_i2c.c         |  4 +--
 drivers/gpu/drm/radeon/radeon_combios.c       | 28 +++++++++----------
 drivers/gpu/drm/radeon/radeon_i2c.c           | 10 +++----
 drivers/gpu/drm/radeon/radeon_mode.h          |  6 ++--
 27 files changed, 85 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
index 6857c586ded7..37f50fc5d496 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
@@ -614,7 +614,7 @@ bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev,
 		if ((frev == 3 && crev >= 4) || (frev > 3)) {
 			firmware_info = (union firmware_info *)
 				(mode_info->atom_context->bios + data_offset);
-			/* The ras_rom_i2c_slave_addr should ideally
+			/* The ras_rom_i2c_target_addr should ideally
 			 * be a 19-bit EEPROM address, which would be
 			 * used as is by the driver; see top of
 			 * amdgpu_eeprom.c.
@@ -625,13 +625,13 @@ bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev,
 			 * leave the check for the pointer.
 			 *
 			 * The reason this works right now is because
-			 * ras_rom_i2c_slave_addr contains the EEPROM
+			 * ras_rom_i2c_target_addr contains the EEPROM
 			 * device type qualifier 1010b in the top 4
 			 * bits.
 			 */
-			if (firmware_info->v34.ras_rom_i2c_slave_addr) {
+			if (firmware_info->v34.ras_rom_i2c_target_addr) {
 				if (i2c_address)
-					*i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr;
+					*i2c_address = firmware_info->v34.ras_rom_i2c_target_addr;
 				return true;
 			}
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index d79cb13e1aa8..070049c92e2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -280,7 +280,7 @@ amdgpu_i2c_lookup(struct amdgpu_device *adev,
 }
 
 static void amdgpu_i2c_get_byte(struct amdgpu_i2c_chan *i2c_bus,
-				 u8 slave_addr,
+				 u8 target_addr,
 				 u8 addr,
 				 u8 *val)
 {
@@ -288,13 +288,13 @@ static void amdgpu_i2c_get_byte(struct amdgpu_i2c_chan *i2c_bus,
 	u8 in_buf[2];
 	struct i2c_msg msgs[] = {
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -314,13 +314,13 @@ static void amdgpu_i2c_get_byte(struct amdgpu_i2c_chan *i2c_bus,
 }
 
 static void amdgpu_i2c_put_byte(struct amdgpu_i2c_chan *i2c_bus,
-				 u8 slave_addr,
+				 u8 target_addr,
 				 u8 addr,
 				 u8 val)
 {
 	uint8_t out_buf[2];
 	struct i2c_msg msg = {
-		.addr = slave_addr,
+		.addr = target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c b/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c
index a6501114322f..a7d3c3d2c633 100644
--- a/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_i2c.c
@@ -36,7 +36,7 @@
 #define ATOM_MAX_HW_I2C_READ  255
 
 static int amdgpu_atombios_i2c_process_i2c_ch(struct amdgpu_i2c_chan *chan,
-				       u8 slave_addr, u8 flags,
+				       u8 target_addr, u8 flags,
 				       u8 *buf, u8 num)
 {
 	struct drm_device *dev = chan->dev;
@@ -83,7 +83,7 @@ static int amdgpu_atombios_i2c_process_i2c_ch(struct amdgpu_i2c_chan *chan,
 	args.ucFlag = flags;
 	args.ucI2CSpeed = TARGET_HW_I2C_CLOCK;
 	args.ucTransBytes = num;
-	args.ucSlaveAddr = slave_addr << 1;
+	args.ucTargetAddr = target_addr << 1;
 	args.ucLineNumber = chan->rec.i2c_id;
 
 	amdgpu_atom_execute_table(adev->mode_info.atom_context, index, (uint32_t *)&args, sizeof(args));
@@ -159,7 +159,7 @@ u32 amdgpu_atombios_i2c_func(struct i2c_adapter *adap)
 	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
 }
 
-void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device *adev, u8 slave_addr, u8 line_number, u8 offset, u8 data)
+void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device *adev, u8 target_addr, u8 line_number, u8 offset, u8 data)
 {
 	PROCESS_I2C_CHANNEL_TRANSACTION_PS_ALLOCATION args;
 	int index = GetIndexIntoMasterTable(COMMAND, ProcessI2cChannelTransaction);
@@ -169,7 +169,7 @@ void amdgpu_atombios_i2c_channel_trans(struct amdgpu_device *adev, u8 slave_addr
 	args.ucFlag = 1;
 	args.ucI2CSpeed = TARGET_HW_I2C_CLOCK;
 	args.ucTransBytes = 1;
-	args.ucSlaveAddr = slave_addr;
+	args.ucTargetAddr = target_addr;
 	args.ucLineNumber = line_number;
 
 	amdgpu_atom_execute_table(adev->mode_info.atom_context, index, (uint32_t *)&args, sizeof(args));
diff --git a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
index dd2d66090d23..b91ed6050541 100644
--- a/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c
@@ -229,7 +229,7 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
 
 	reg_c_tx_abrt_source = RREG32_SOC15(SMUIO, 0, mmCKSVII2C_IC_TX_ABRT_SOURCE);
 
-	/* If slave is not present */
+	/* If target is not present */
 	if (REG_GET_FIELD(reg_c_tx_abrt_source,
 			  CKSVII2C_IC_TX_ABRT_SOURCE,
 			  ABRT_7B_ADDR_NOACK) == 1) {
@@ -255,10 +255,10 @@ static uint32_t smu_v11_0_i2c_poll_rx_status(struct i2c_adapter *control)
 }
 
 /**
- * smu_v11_0_i2c_transmit - Send a block of data over the I2C bus to a slave device.
+ * smu_v11_0_i2c_transmit - Send a block of data over the I2C bus to a target device.
  *
  * @control: I2C adapter reference
- * @address: The I2C address of the slave device.
+ * @address: The I2C address of the target device.
  * @data: The data to transmit over the bus.
  * @numbytes: The amount of data to transmit.
  * @i2c_flag: Flags for transmission
@@ -284,7 +284,7 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 			       16, 1, data, numbytes, false);
 	}
 
-	/* Set the I2C slave address */
+	/* Set the I2C target address */
 	smu_v11_0_i2c_set_address(control, address);
 	/* Enable I2C */
 	smu_v11_0_i2c_enable(control, true);
@@ -354,10 +354,10 @@ static uint32_t smu_v11_0_i2c_transmit(struct i2c_adapter *control,
 
 
 /**
- * smu_v11_0_i2c_receive - Receive a block of data over the I2C bus from a slave device.
+ * smu_v11_0_i2c_receive - Receive a block of data over the I2C bus from a target device.
  *
  * @control: I2C adapter reference
- * @address: The I2C address of the slave device.
+ * @address: The I2C address of the target device.
  * @data: Placeholder to store received data.
  * @numbytes: The amount of data to transmit.
  * @i2c_flag: Flags for transmission
@@ -374,7 +374,7 @@ static uint32_t smu_v11_0_i2c_receive(struct i2c_adapter *control,
 
 	bytes_received = 0;
 
-	/* Set the I2C slave address */
+	/* Set the I2C target address */
 	smu_v11_0_i2c_set_address(control, address);
 
 	/* Enable I2C */
@@ -509,7 +509,7 @@ static void smu_v11_0_i2c_init(struct i2c_adapter *control)
 	if (res != I2C_OK)
 		smu_v11_0_i2c_abort(control);
 
-	/* Configure I2C to operate as master and in standard mode */
+	/* Configure I2C to operate as controller and in standard mode */
 	smu_v11_0_i2c_configure(control);
 
 	/* Initialize the clock to 50 kHz default */
@@ -650,11 +650,11 @@ static int smu_v11_0_i2c_xfer(struct i2c_adapter *i2c_adap,
 
 	smu_v11_0_i2c_init(i2c_adap);
 
-	/* From the client's point of view, this sequence of
+	/* From the target's point of view, this sequence of
 	 * messages-- the array i2c_msg *msg, is a single transaction
 	 * on the bus, starting with START and ending with STOP.
 	 *
-	 * The client is welcome to send any sequence of messages in
+	 * The target is welcome to send any sequence of messages in
 	 * this array, as processing under this function here is
 	 * striving to be agnostic.
 	 *
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
index 6450853fea94..51aa72e4eba4 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
@@ -1871,7 +1871,7 @@ static enum bp_result get_gpio_i2c_info(struct bios_parser *bp,
 	info->i2c_hw_assist = record->sucI2cId.bfHW_Capable;
 	info->i2c_line = record->sucI2cId.bfI2C_LineMux;
 	info->i2c_engine_id = record->sucI2cId.bfHW_EngineID;
-	info->i2c_slave_address = record->ucI2CAddr;
+	info->i2c_target_address = record->ucI2CAddr;
 
 	info->gpio_info.clk_mask_register_index =
 			le16_to_cpu(header->asGPIO_Info[info->i2c_line].usClkMaskRegisterIndex);
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index 05f392501c0a..abc66f46bb31 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -511,7 +511,7 @@ static enum bp_result get_gpio_i2c_info(
 	info->i2c_hw_assist = (record->i2c_id & I2C_HW_CAP) ? true : false;
 	info->i2c_line = record->i2c_id & I2C_HW_LANE_MUX;
 	info->i2c_engine_id = (record->i2c_id & I2C_HW_ENGINE_ID_MASK) >> 4;
-	info->i2c_slave_address = record->i2c_slave_addr;
+	info->i2c_target_address = record->i2c_target_addr;
 
 	/* TODO: check how to get register offset for en, Y, etc. */
 	info->gpio_info.clk_a_register_index = le16_to_cpu(pin->data_a_reg_index);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
index c6c35037bdb8..9d2ec5fce4ae 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
@@ -141,13 +141,13 @@ bool dc_link_update_dsc_config(struct pipe_ctx *pipe_ctx)
 
 bool dc_is_oem_i2c_device_present(
 	struct dc *dc,
-	size_t slave_address)
+	size_t target_address)
 {
 	if (dc->res_pool->oem_device)
 		return dce_i2c_oem_device_present(
 			dc->res_pool,
 			dc->res_pool->oem_device,
-			slave_address);
+			target_address);
 
 	return false;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
index ee8453bf958f..21608f42879f 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -1803,7 +1803,7 @@ int dc_link_aux_transfer_raw(struct ddc_service *ddc,
 
 bool dc_is_oem_i2c_device_present(
 	struct dc *dc,
-	size_t slave_address
+	size_t target_address
 );
 
 /* return true if the connected receiver supports the hdcp version */
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c b/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c
index f5cd2392fc5f..f4c83d322350 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c
@@ -28,7 +28,7 @@
 bool dce_i2c_oem_device_present(
 	struct resource_pool *pool,
 	struct ddc_service *ddc,
-	size_t slave_address
+	size_t target_address
 )
 {
 	struct dc *dc = ddc->ctx->dc;
@@ -45,7 +45,7 @@ bool dce_i2c_oem_device_present(
 	if (dcb->funcs->get_i2c_info(dcb, id, &i2c_info) != BP_RESULT_OK)
 		return false;
 
-	if (i2c_info.i2c_slave_address != slave_address)
+	if (i2c_info.i2c_target_address != target_address)
 		return false;
 
 	return true;
diff --git a/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h b/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h
index 813463ffe15c..c30a2117a539 100644
--- a/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h
+++ b/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h
@@ -92,7 +92,7 @@ struct graphics_object_i2c_info {
 	bool i2c_hw_assist;
 	uint32_t i2c_line;
 	uint32_t i2c_engine_id;
-	uint32_t i2c_slave_address;
+	uint32_t i2c_target_address;
 };
 
 struct graphics_object_hpd_info {
diff --git a/drivers/gpu/drm/amd/include/atombios.h b/drivers/gpu/drm/amd/include/atombios.h
index b78360a71bc9..5644920f45e6 100644
--- a/drivers/gpu/drm/amd/include/atombios.h
+++ b/drivers/gpu/drm/amd/include/atombios.h
@@ -8503,7 +8503,7 @@ typedef struct _PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS
    USHORT  lpI2CDataOut;
   UCHAR   ucFlag;
   UCHAR   ucTransBytes;
-  UCHAR   ucSlaveAddr;
+  UCHAR   ucTargetAddr;
   UCHAR   ucLineNumber;
 }PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS;
 
diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h b/drivers/gpu/drm/amd/include/atomfirmware.h
index af3eebb4c9bc..0b76c3655df7 100644
--- a/drivers/gpu/drm/amd/include/atomfirmware.h
+++ b/drivers/gpu/drm/amd/include/atomfirmware.h
@@ -534,7 +534,7 @@ struct atom_firmware_info_v3_2 {
   uint32_t mc_baseaddr_low;
   uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
   uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-  uint8_t  board_i2c_feature_slave_addr;
+  uint8_t  board_i2c_feature_target_addr;
   uint8_t  reserved3;
   uint16_t bootup_mvddq_mv;
   uint16_t bootup_mvpp_mv;
@@ -562,7 +562,7 @@ struct atom_firmware_info_v3_3
   uint32_t mc_baseaddr_low;
   uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
   uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-  uint8_t  board_i2c_feature_slave_addr;
+  uint8_t  board_i2c_feature_target_addr;
   uint8_t  reserved3;
   uint16_t bootup_mvddq_mv;
   uint16_t bootup_mvpp_mv;
@@ -590,8 +590,8 @@ struct atom_firmware_info_v3_4 {
 	uint32_t mc_baseaddr_low;
 	uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
 	uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-	uint8_t  board_i2c_feature_slave_addr;
-	uint8_t  ras_rom_i2c_slave_addr;
+	uint8_t  board_i2c_feature_target_addr;
+	uint8_t  ras_rom_i2c_target_addr;
 	uint16_t bootup_mvddq_mv;
 	uint16_t bootup_mvpp_mv;
 	uint32_t zfbstartaddrin16mb;
@@ -626,8 +626,8 @@ struct atom_firmware_info_v3_5 {
   uint32_t mc_baseaddr_low;
   uint8_t  board_i2c_feature_id;            // enum of atom_board_i2c_feature_id_def
   uint8_t  board_i2c_feature_gpio_id;       // i2c id find in gpio_lut data table gpio_id
-  uint8_t  board_i2c_feature_slave_addr;
-  uint8_t  ras_rom_i2c_slave_addr;
+  uint8_t  board_i2c_feature_target_addr;
+  uint8_t  ras_rom_i2c_target_addr;
   uint32_t bootup_voltage_reserved1;
   uint32_t zfb_reserved;
   // if pplib_pptable_id!=0, pplib get powerplay table inside driver instead of from VBIOS
@@ -830,7 +830,7 @@ struct atom_i2c_record
 {
   struct atom_common_record_header record_header;   //record_type = ATOM_I2C_RECORD_TYPE
   uint8_t i2c_id; 
-  uint8_t i2c_slave_addr;                   //The slave address, it's 0 when the record is attached to connector for DDC
+  uint8_t i2c_target_addr;                   //The target address, it's 0 when the record is attached to connector for DDC
 };
 
 struct atom_hpd_int_record
@@ -2026,7 +2026,7 @@ struct atom_smu_info_v3_5
   uint16_t smuinitoffset;
   uint32_t bootup_dprefclk_10khz;
   uint32_t bootup_usbclk_10khz;
-  uint32_t smb_slave_address;
+  uint32_t smb_target_address;
   uint32_t cg_fdo_ctrl0_val;
   uint32_t cg_fdo_ctrl1_val;
   uint32_t cg_fdo_ctrl2_val;
@@ -2083,7 +2083,7 @@ struct atom_smu_info_v3_6
 	uint16_t smuinitoffset;
 	uint32_t bootup_gfxavsclk_10khz;
 	uint32_t bootup_mpioclk_10khz;
-	uint32_t smb_slave_address;
+	uint32_t smb_target_address;
 	uint32_t cg_fdo_ctrl0_val;
 	uint32_t cg_fdo_ctrl1_val;
 	uint32_t cg_fdo_ctrl2_val;
@@ -2138,7 +2138,7 @@ struct atom_smu_info_v4_0 {
 	uint16_t smuinitoffset;
 	uint32_t bootup_dprefclk_10khz;
 	uint32_t bootup_usbclk_10khz;
-	uint32_t smb_slave_address;
+	uint32_t smb_target_address;
 	uint32_t cg_fdo_ctrl0_val;
 	uint32_t cg_fdo_ctrl1_val;
 	uint32_t cg_fdo_ctrl2_val;
@@ -2349,7 +2349,7 @@ struct atom_smc_dpm_info_v4_3
 
 struct smudpm_i2ccontrollerconfig_t {
   uint32_t  enabled;
-  uint32_t  slaveaddress;
+  uint32_t  targetaddress;
   uint32_t  controllerport;
   uint32_t  controllername;
   uint32_t  thermalthrottler;
@@ -3510,7 +3510,7 @@ struct  atom_i2c_voltage_object_v4
    struct atom_voltage_object_header_v4 header;  // voltage mode = VOLTAGE_OBJ_VR_I2C_INIT_SEQ
    uint8_t  regulator_id;                        //Indicate Voltage Regulator Id
    uint8_t  i2c_id;
-   uint8_t  i2c_slave_addr;
+   uint8_t  i2c_target_addr;
    uint8_t  i2c_control_offset;       
    uint8_t  i2c_flag;                            // Bit0: 0 - One byte data; 1 - Two byte data
    uint8_t  i2c_speed;                           // =0, use default i2c speed, otherwise use it in unit of kHz. 
@@ -4152,7 +4152,7 @@ struct process_i2c_channel_transaction_parameters
   uint16_t  i2c_data_out;
   uint8_t   flag;                    /* enum atom_process_i2c_status */
   uint8_t   trans_bytes;
-  uint8_t   slave_addr;
+  uint8_t   target_addr;
   uint8_t   i2c_id;
 };
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c
index 79c817752a33..cb9ee5345745 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_processpptables.c
@@ -784,8 +784,8 @@ static int append_vbios_pptable(struct pp_hwmgr *hwmgr, PPTable_t *ppsmc_pptable
 	for (i = 0; i < I2C_CONTROLLER_NAME_COUNT; i++) {
 		ppsmc_pptable->I2cControllers[i].Enabled =
 			smc_dpm_table->i2ccontrollers[i].enabled;
-		ppsmc_pptable->I2cControllers[i].SlaveAddress =
-			smc_dpm_table->i2ccontrollers[i].slaveaddress;
+		ppsmc_pptable->I2cControllers[i].TargetAddress =
+			smc_dpm_table->i2ccontrollers[i].targetaddress;
 		ppsmc_pptable->I2cControllers[i].ControllerPort =
 			smc_dpm_table->i2ccontrollers[i].controllerport;
 		ppsmc_pptable->I2cControllers[i].ThermalThrottler =
diff --git a/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h b/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h
index c2efc70ef288..69d7ec6fd971 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/inc/smu11_driver_if.h
@@ -287,7 +287,7 @@ typedef enum {
 
 typedef struct {
   uint32_t Enabled;
-  uint32_t SlaveAddress;
+  uint32_t TargetAddress;
   uint32_t ControllerPort;
   uint32_t ControllerName;
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h
index d518dee18e1b..5684e2a16e6c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_arcturus.h
@@ -263,7 +263,7 @@ typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
   uint8_t   Padding[2];
-  uint32_t  SlaveAddress;
+  uint32_t  TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h
index c5c1943fb6a1..1782b8e8fcd2 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_navi10.h
@@ -267,7 +267,7 @@ typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
   uint8_t   Padding[2];
-  uint32_t  SlaveAddress;
+  uint32_t  TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h
index aa6d29de4002..6be89c6dd492 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h
@@ -342,7 +342,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;  
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h
index cddf45eebee8..c590f4557074 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_aldebaran.h
@@ -167,7 +167,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ThermalThrotter;
   uint8_t   I2cProtocol;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
index b114d14fc053..ebe2d344bf5b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
@@ -319,7 +319,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h
index 8b1496f8ce58..8e9c7fa22b4f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h
@@ -320,7 +320,7 @@ typedef enum {
 typedef struct {
   uint8_t   Enabled;
   uint8_t   Speed;
-  uint8_t   SlaveAddress;
+  uint8_t   TargetAddress;
   uint8_t   ControllerPort;
   uint8_t   ControllerName;
   uint8_t   ThermalThrotter;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 0c2d04f978ac..e2c6a4806e5c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -1909,8 +1909,8 @@ static void arcturus_dump_pptable(struct smu_context *smu)
 		dev_info(smu->adev->dev, "I2cControllers[%d]:\n", i);
 		dev_info(smu->adev->dev, "                   .Enabled = %d\n",
 				pptable->I2cControllers[i].Enabled);
-		dev_info(smu->adev->dev, "                   .SlaveAddress = 0x%x\n",
-				pptable->I2cControllers[i].SlaveAddress);
+		dev_info(smu->adev->dev, "                   .TargetAddress = 0x%x\n",
+				pptable->I2cControllers[i].TargetAddress);
 		dev_info(smu->adev->dev, "                   .ControllerPort = %d\n",
 				pptable->I2cControllers[i].ControllerPort);
 		dev_info(smu->adev->dev, "                   .ControllerName = %d\n",
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 1f18b61884f3..eec4b9b9598c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -2988,8 +2988,8 @@ static void beige_goby_dump_pptable(struct smu_context *smu)
 				pptable->I2cControllers[i].Enabled);
 		dev_info(smu->adev->dev, "                   .Speed = 0x%x\n",
 				pptable->I2cControllers[i].Speed);
-		dev_info(smu->adev->dev, "                   .SlaveAddress = 0x%x\n",
-				pptable->I2cControllers[i].SlaveAddress);
+		dev_info(smu->adev->dev, "                   .TargetAddress = 0x%x\n",
+				pptable->I2cControllers[i].TargetAddress);
 		dev_info(smu->adev->dev, "                   .ControllerPort = 0x%x\n",
 				pptable->I2cControllers[i].ControllerPort);
 		dev_info(smu->adev->dev, "                   .ControllerName = 0x%x\n",
@@ -3627,8 +3627,8 @@ static void sienna_cichlid_dump_pptable(struct smu_context *smu)
 				pptable->I2cControllers[i].Enabled);
 		dev_info(smu->adev->dev, "                   .Speed = 0x%x\n",
 				pptable->I2cControllers[i].Speed);
-		dev_info(smu->adev->dev, "                   .SlaveAddress = 0x%x\n",
-				pptable->I2cControllers[i].SlaveAddress);
+		dev_info(smu->adev->dev, "                   .TargetAddress = 0x%x\n",
+				pptable->I2cControllers[i].TargetAddress);
 		dev_info(smu->adev->dev, "                   .ControllerPort = 0x%x\n",
 				pptable->I2cControllers[i].ControllerPort);
 		dev_info(smu->adev->dev, "                   .ControllerName = 0x%x\n",
diff --git a/drivers/gpu/drm/radeon/atombios.h b/drivers/gpu/drm/radeon/atombios.h
index 2db40789235c..cdb266294894 100644
--- a/drivers/gpu/drm/radeon/atombios.h
+++ b/drivers/gpu/drm/radeon/atombios.h
@@ -7229,7 +7229,7 @@ typedef struct _PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS
 	USHORT  lpI2CDataOut;
   UCHAR   ucFlag;               
   UCHAR   ucTransBytes;
-  UCHAR   ucSlaveAddr;
+  UCHAR   ucTargetAddr;
   UCHAR   ucLineNumber;
 }PROCESS_I2C_CHANNEL_TRANSACTION_PARAMETERS;
 
diff --git a/drivers/gpu/drm/radeon/atombios_i2c.c b/drivers/gpu/drm/radeon/atombios_i2c.c
index 730f0b25312b..3acae0b28122 100644
--- a/drivers/gpu/drm/radeon/atombios_i2c.c
+++ b/drivers/gpu/drm/radeon/atombios_i2c.c
@@ -34,7 +34,7 @@
 #define ATOM_MAX_HW_I2C_READ  255
 
 static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan,
-				 u8 slave_addr, u8 flags,
+				 u8 target_addr, u8 flags,
 				 u8 *buf, int num)
 {
 	struct drm_device *dev = chan->dev;
@@ -75,7 +75,7 @@ static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan,
 	args.ucFlag = flags;
 	args.ucI2CSpeed = TARGET_HW_I2C_CLOCK;
 	args.ucTransBytes = num;
-	args.ucSlaveAddr = slave_addr << 1;
+	args.ucTargetAddr = target_addr << 1;
 	args.ucLineNumber = chan->rec.i2c_id;
 
 	atom_execute_table_scratch_unlocked(rdev->mode_info.atom_context, index, (uint32_t *)&args, sizeof(args));
diff --git a/drivers/gpu/drm/radeon/radeon_combios.c b/drivers/gpu/drm/radeon/radeon_combios.c
index 6952b1273b0f..107638ec8c75 100644
--- a/drivers/gpu/drm/radeon/radeon_combios.c
+++ b/drivers/gpu/drm/radeon/radeon_combios.c
@@ -1398,7 +1398,7 @@ bool radeon_legacy_get_ext_tmds_info_from_table(struct radeon_encoder *encoder,
 	case CT_MINI_EXTERNAL:
 	default:
 		tmds->dvo_chip = DVO_SIL164;
-		tmds->slave_addr = 0x70 >> 1; /* 7 bit addressing */
+		tmds->target_addr = 0x70 >> 1; /* 7 bit addressing */
 		break;
 	}
 
@@ -1420,14 +1420,14 @@ bool radeon_legacy_get_ext_tmds_info_from_combios(struct radeon_encoder *encoder
 		i2c_bus = combios_setup_i2c_bus(rdev, DDC_MONID, 0, 0);
 		tmds->i2c_bus = radeon_i2c_lookup(rdev, &i2c_bus);
 		tmds->dvo_chip = DVO_SIL164;
-		tmds->slave_addr = 0x70 >> 1; /* 7 bit addressing */
+		tmds->target_addr = 0x70 >> 1; /* 7 bit addressing */
 	} else {
 		offset = combios_get_table_offset(dev, COMBIOS_EXT_TMDS_INFO_TABLE);
 		if (offset) {
 			ver = RBIOS8(offset);
 			DRM_DEBUG_KMS("External TMDS Table revision: %d\n", ver);
-			tmds->slave_addr = RBIOS8(offset + 4 + 2);
-			tmds->slave_addr >>= 1; /* 7 bit addressing */
+			tmds->target_addr = RBIOS8(offset + 4 + 2);
+			tmds->target_addr >>= 1; /* 7 bit addressing */
 			gpio = RBIOS8(offset + 4 + 3);
 			if (gpio == DDC_LCD) {
 				/* MM i2c */
@@ -2846,19 +2846,19 @@ void radeon_external_tmds_setup(struct drm_encoder *encoder)
 	case DVO_SIL164:
 		/* sil 164 */
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				    tmds->slave_addr,
+				    tmds->target_addr,
 				    0x08, 0x30);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				       tmds->slave_addr,
+				       tmds->target_addr,
 				       0x09, 0x00);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				    tmds->slave_addr,
+				    tmds->target_addr,
 				    0x0a, 0x90);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				    tmds->slave_addr,
+				    tmds->target_addr,
 				    0x0c, 0x89);
 		radeon_i2c_put_byte(tmds->i2c_bus,
-				       tmds->slave_addr,
+				       tmds->target_addr,
 				       0x08, 0x3b);
 		break;
 	case DVO_SIL1178:
@@ -2887,7 +2887,7 @@ bool radeon_combios_external_tmds_setup(struct drm_encoder *encoder)
 	struct radeon_device *rdev = dev->dev_private;
 	struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder);
 	uint16_t offset;
-	uint8_t blocks, slave_addr, rev;
+	uint8_t blocks, target_addr, rev;
 	uint32_t index, id;
 	uint32_t reg, val, and_mask, or_mask;
 	struct radeon_encoder_ext_tmds *tmds = radeon_encoder->enc_priv;
@@ -2934,15 +2934,15 @@ bool radeon_combios_external_tmds_setup(struct drm_encoder *encoder)
 						mdelay(val);
 						break;
 					case 6:
-						slave_addr = id & 0xff;
-						slave_addr >>= 1; /* 7 bit addressing */
+						target_addr = id & 0xff;
+						target_addr >>= 1; /* 7 bit addressing */
 						index++;
 						reg = RBIOS8(index);
 						index++;
 						val = RBIOS8(index);
 						index++;
 						radeon_i2c_put_byte(tmds->i2c_bus,
-								    slave_addr,
+								    target_addr,
 								    reg, val);
 						break;
 					default:
@@ -2997,7 +2997,7 @@ bool radeon_combios_external_tmds_setup(struct drm_encoder *encoder)
 					val = RBIOS8(index);
 					index += 1;
 					radeon_i2c_put_byte(tmds->i2c_bus,
-							    tmds->slave_addr,
+							    tmds->target_addr,
 							    reg, val);
 					break;
 				default:
diff --git a/drivers/gpu/drm/radeon/radeon_i2c.c b/drivers/gpu/drm/radeon/radeon_i2c.c
index 3d174390a8af..a2eb00229428 100644
--- a/drivers/gpu/drm/radeon/radeon_i2c.c
+++ b/drivers/gpu/drm/radeon/radeon_i2c.c
@@ -1038,7 +1038,7 @@ struct radeon_i2c_chan *radeon_i2c_lookup(struct radeon_device *rdev,
 }
 
 void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
-			 u8 slave_addr,
+			 u8 target_addr,
 			 u8 addr,
 			 u8 *val)
 {
@@ -1046,13 +1046,13 @@ void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
 	u8 in_buf[2];
 	struct i2c_msg msgs[] = {
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = 0,
 			.len = 1,
 			.buf = out_buf,
 		},
 		{
-			.addr = slave_addr,
+			.addr = target_addr,
 			.flags = I2C_M_RD,
 			.len = 1,
 			.buf = in_buf,
@@ -1072,13 +1072,13 @@ void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
 }
 
 void radeon_i2c_put_byte(struct radeon_i2c_chan *i2c_bus,
-			 u8 slave_addr,
+			 u8 target_addr,
 			 u8 addr,
 			 u8 val)
 {
 	uint8_t out_buf[2];
 	struct i2c_msg msg = {
-		.addr = slave_addr,
+		.addr = target_addr,
 		.flags = 0,
 		.len = 2,
 		.buf = out_buf,
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h b/drivers/gpu/drm/radeon/radeon_mode.h
index 546381a5c918..701c5f9046a0 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -409,7 +409,7 @@ struct radeon_encoder_int_tmds {
 struct radeon_encoder_ext_tmds {
 	/* tmds over dvo */
 	struct radeon_i2c_chan *i2c_bus;
-	uint8_t slave_addr;
+	uint8_t target_addr;
 	enum radeon_dvo_chip dvo_chip;
 };
 
@@ -749,11 +749,11 @@ extern struct radeon_i2c_chan *radeon_i2c_create(struct drm_device *dev,
 						 const char *name);
 extern void radeon_i2c_destroy(struct radeon_i2c_chan *i2c);
 extern void radeon_i2c_get_byte(struct radeon_i2c_chan *i2c_bus,
-				u8 slave_addr,
+				u8 target_addr,
 				u8 addr,
 				u8 *val);
 extern void radeon_i2c_put_byte(struct radeon_i2c_chan *i2c,
-				u8 slave_addr,
+				u8 target_addr,
 				u8 addr,
 				u8 val);
 extern void radeon_router_select_ddc_port(struct radeon_connector *radeon_connector);
-- 
2.34.1



* [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers
@ 2024-04-30 17:37 54% Easwar Hariharan
  2024-04-30 17:38 24% ` [PATCH v1 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
                   ` (11 more replies)
  0 siblings, 12 replies; 200+ results
From: Easwar Hariharan @ 2024-04-30 17:37 UTC (permalink / raw)
  Cc: Wolfram Sang, open list:RADEON and AMDGPU DRM DRIVERS,
	open list:DRM DRIVERS, open list,
	open list:INTEL DRM DISPLAY FOR XE AND I915 DRIVERS,
	open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS,
	open list:I2C SUBSYSTEM HOST DRIVERS,
	open list:BTTV VIDEO4LINUX DRIVER, open list:FRAMEBUFFER LAYER,
	Easwar Hariharan

The I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced
"master/slave" with more appropriate terms. Inspired by, and following on
from, Wolfram's series to fix drivers/i2c/ [1], this series fixes the
terminology for users of the I2C_ALGOBIT bitbanging interface, now that the
approved verbiage exists in the specifications.

Compile tested; no functional changes intended.

Please chime in with your opinions and suggestions.

This series is based on v6.9-rc1.

[1]:
https://lore.kernel.org/all/20240322132619.6389-1-wsa+renesas@sang-engineering.com/
----

changelog:
v0->v1:
- Link: https://lore.kernel.org/all/20240329170038.3863998-1-eahariha@linux.microsoft.com/
- Drop drivers/infiniband patches [Leon, Dennis]
- Switch to specification verbiage master->controller, slave->target,
  drop usage of client [Andi, Ville, Jani, Christian]
- Add I3C specification version in commit messages [Andi]
- Pick up Reviewed-bys from Martin and Simon [sfc]
- Drop i2c/treewide patch to make this series independent from Wolfram's
  ([1]) [Wolfram]
- Split away drm/nouveau patch to allow expansion into non-I2C
  non-inclusive terms
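
As a rough illustration of the master->controller, slave->target switch at a
call site, here is a minimal user-space sketch. It is not code from this
series: struct demo_i2c_msg is a reduced stand-in for the kernel's
struct i2c_msg, and every demo_* name is hypothetical. It also assumes the
register offset goes in the first buffer byte and the value in the second,
and it reuses the "0x70 >> 1" conversion to 7-bit addressing seen in the
radeon combios hunks.

```c
/* Illustrative user-space sketch; demo_* names are hypothetical, not kernel APIs. */
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the kernel's struct i2c_msg. */
struct demo_i2c_msg {
	uint16_t addr;   /* 7-bit target address */
	uint16_t flags;  /* 0 means write */
	uint16_t len;    /* number of bytes in buf */
	uint8_t *buf;
};

/* Convert an 8-bit (shifted) bus address to its 7-bit form. */
static uint8_t demo_to_7bit(uint8_t addr8)
{
	return addr8 >> 1;
}

/*
 * Build a two-byte register write aimed at target_addr.
 * Assumption: out_buf[0] carries the register offset and out_buf[1] the value.
 * out_buf must hold at least two bytes.
 */
static struct demo_i2c_msg demo_put_byte_msg(uint8_t target_addr, uint8_t reg,
					     uint8_t val, uint8_t out_buf[2])
{
	struct demo_i2c_msg msg = {
		.addr = target_addr,
		.flags = 0,
		.len = 2,
		.buf = out_buf,
	};

	out_buf[0] = reg;
	out_buf[1] = val;
	return msg;
}
```

Only the identifier changes from slave_addr to target_addr; the value stored
in .addr and the bytes placed in the buffer are the same as before the rename.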

----

Easwar Hariharan (12):
  drm/amdgpu, drm/radeon: Make I2C terminology more inclusive
  drm/gma500: Make I2C terminology more inclusive
  drm/i915: Make I2C terminology more inclusive
  media: au0828: Make I2C terminology more inclusive
  media: cobalt: Make I2C terminology more inclusive
  media: cx18: Make I2C terminology more inclusive
  media: cx25821: Make I2C terminology more inclusive
  media: ivtv: Make I2C terminology more inclusive
  media: cx23885: Make I2C terminology more inclusive
  sfc: falcon: Make I2C terminology more inclusive
  fbdev/smscufx: Make I2C terminology more inclusive
  fbdev/viafb: Make I2C terminology more inclusive

 .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c  |  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c       | 10 +++----
 drivers/gpu/drm/amd/amdgpu/atombios_i2c.c     |  8 ++---
 drivers/gpu/drm/amd/amdgpu/smu_v11_0_i2c.c    | 20 ++++++-------
 .../gpu/drm/amd/display/dc/bios/bios_parser.c |  2 +-
 .../drm/amd/display/dc/bios/bios_parser2.c    |  2 +-
 .../drm/amd/display/dc/core/dc_link_exports.c |  4 +--
 drivers/gpu/drm/amd/display/dc/dc.h           |  2 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_i2c.c  |  4 +--
 .../display/include/grph_object_ctrl_defs.h   |  2 +-
 drivers/gpu/drm/amd/include/atombios.h        |  2 +-
 drivers/gpu/drm/amd/include/atomfirmware.h    | 26 ++++++++--------
 .../powerplay/hwmgr/vega20_processpptables.c  |  4 +--
 .../amd/pm/powerplay/inc/smu11_driver_if.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_arcturus.h    |  2 +-
 .../inc/pmfw_if/smu11_driver_if_navi10.h      |  2 +-
 .../pmfw_if/smu11_driver_if_sienna_cichlid.h  |  2 +-
 .../inc/pmfw_if/smu13_driver_if_aldebaran.h   |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_0.h     |  2 +-
 .../inc/pmfw_if/smu13_driver_if_v13_0_7.h     |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  4 +--
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  8 ++---
 drivers/gpu/drm/gma500/cdv_intel_lvds.c       |  2 +-
 drivers/gpu/drm/gma500/intel_bios.c           | 22 +++++++-------
 drivers/gpu/drm/gma500/intel_bios.h           |  4 +--
 drivers/gpu/drm/gma500/intel_gmbus.c          |  2 +-
 drivers/gpu/drm/gma500/psb_drv.h              |  2 +-
 drivers/gpu/drm/gma500/psb_intel_drv.h        |  2 +-
 drivers/gpu/drm/gma500/psb_intel_lvds.c       |  4 +--
 drivers/gpu/drm/gma500/psb_intel_sdvo.c       | 26 ++++++++--------
 drivers/gpu/drm/i915/display/dvo_ch7017.c     | 14 ++++-----
 drivers/gpu/drm/i915/display/dvo_ch7xxx.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_ivch.c       | 16 +++++-----
 drivers/gpu/drm/i915/display/dvo_ns2501.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_sil164.c     | 18 +++++------
 drivers/gpu/drm/i915/display/dvo_tfp410.c     | 18 +++++------
 drivers/gpu/drm/i915/display/intel_bios.c     | 22 +++++++-------
 drivers/gpu/drm/i915/display/intel_ddi.c      |  2 +-
 .../gpu/drm/i915/display/intel_display_core.h |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi.h      |  2 +-
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c  | 20 ++++++-------
 drivers/gpu/drm/i915/display/intel_dvo.c      | 14 ++++-----
 drivers/gpu/drm/i915/display/intel_dvo_dev.h  |  2 +-
 drivers/gpu/drm/i915/display/intel_gmbus.c    |  4 +--
 drivers/gpu/drm/i915/display/intel_sdvo.c     | 30 +++++++++----------
 drivers/gpu/drm/i915/display/intel_vbt_defs.h |  4 +--
 drivers/gpu/drm/i915/gvt/edid.c               | 28 ++++++++---------
 drivers/gpu/drm/i915/gvt/edid.h               |  4 +--
 drivers/gpu/drm/i915/gvt/opregion.c           |  2 +-
 drivers/gpu/drm/radeon/atombios.h             |  2 +-
 drivers/gpu/drm/radeon/atombios_i2c.c         |  4 +--
 drivers/gpu/drm/radeon/radeon_combios.c       | 28 ++++++++---------
 drivers/gpu/drm/radeon/radeon_i2c.c           | 10 +++----
 drivers/gpu/drm/radeon/radeon_mode.h          |  6 ++--
 drivers/media/pci/cobalt/cobalt-i2c.c         |  6 ++--
 drivers/media/pci/cx18/cx18-av-firmware.c     |  8 ++---
 drivers/media/pci/cx18/cx18-cards.c           |  6 ++--
 drivers/media/pci/cx18/cx18-cards.h           |  4 +--
 drivers/media/pci/cx18/cx18-gpio.c            |  6 ++--
 drivers/media/pci/cx23885/cx23885-f300.c      |  8 ++---
 drivers/media/pci/cx23885/cx23885-i2c.c       |  6 ++--
 drivers/media/pci/cx25821/cx25821-i2c.c       |  6 ++--
 drivers/media/pci/ivtv/ivtv-i2c.c             | 16 +++++-----
 drivers/media/usb/au0828/au0828-i2c.c         |  4 +--
 drivers/media/usb/au0828/au0828-input.c       |  2 +-
 drivers/net/ethernet/sfc/falcon/falcon.c      |  2 +-
 drivers/video/fbdev/smscufx.c                 |  4 +--
 drivers/video/fbdev/via/chip.h                |  8 ++---
 drivers/video/fbdev/via/dvi.c                 | 24 +++++++--------
 drivers/video/fbdev/via/lcd.c                 |  6 ++--
 drivers/video/fbdev/via/via_aux.h             |  2 +-
 drivers/video/fbdev/via/via_i2c.c             | 12 ++++----
 drivers/video/fbdev/via/vt1636.c              |  6 ++--
 73 files changed, 304 insertions(+), 304 deletions(-)


base-commit: 4cece764965020c22cff7665b18a012006359095
-- 
2.34.1


^ permalink raw reply	[relevance 54%]

* Re: [PATCH net-next v2 0/2] Add sysfs attributes for MANA
  @ 2024-04-30  5:31 79%   ` Shradha Gupta
  2024-05-03  8:48 79%     ` Shradha Gupta
  0 siblings, 1 reply; 200+ results
From: Shradha Gupta @ 2024-04-30  5:31 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Bjorn Helgaas, Jonathan Corbet, Randy Dunlap, Johannes Berg,
	Breno Leitao, linux-kernel, netdev, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Souradeep Chakrabarti, Konstantin Taranov, Yury Norov,
	linux-hyperv, shradhagupta

On Wed, Apr 24, 2024 at 04:48:06PM +0200, Jiri Pirko wrote:
> Wed, Apr 24, 2024 at 12:32:54PM CEST, shradhagupta@linux.microsoft.com wrote:
> >These patches include adding sysfs attributes for improving
> >debuggability on MANA devices.
> >
> >The first patch consists on max_mtu, min_mtu attributes that are
> >implemented generically for all devices
> >
> >The second patch has mana specific attributes max_num_msix and num_ports
> 
> 1) you implement only max, min is never implemented, no point
> introducing it.
Sure. I had added it for the sake of completeness.
> 2) having driver implement sysfs entry feels *very wrong*, don't do that
> 3) why DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MAX
>    and DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MIN
>    Are not what you want?
Thanks for pointing this out. We are still evaluating whether this devlink param
could be used for our use case, where we only need a read-only MSI-X value for the VF.
We will keep the thread updated.
> 
> >
> >Shradha Gupta (2):
> >  net: Add sysfs atttributes for max_mtu min_mtu
> >  net: mana: Add new device attributes for mana
> >
> > Documentation/ABI/testing/sysfs-class-net     | 16 ++++++++++
> > .../net/ethernet/microsoft/mana/gdma_main.c   | 32 +++++++++++++++++++
> > net/core/net-sysfs.c                          |  4 +++
> > 3 files changed, 52 insertions(+)
> >
> >-- 
> >2.34.1
> >
> >

^ permalink raw reply	[relevance 79%]

* RE: [EXTERNAL] [PATCH v3 05/12] cifs: drop usage of page_file_offset
  @ 2024-04-30  2:23 70%       ` Steven French
  0 siblings, 0 replies; 200+ results
From: Steven French @ 2024-04-30  2:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kairui Song, linux-mm, Andrew Morton, Huang, Ying, Chris Li,
	Barry Song, Ryan Roberts, Neil Brown, Minchan Kim, Hugh Dickins,
	David Hildenbrand, Yosry Ahmed, linux-fsdevel, linux-kernel,
	Namjae Jeon, Paulo Alcantara (SUSE),
	Shyam Prasad, Bharath S M

Makes sense - I will try to look at fixing swapon and reading from swap over SMB3.1.1 mounts in the next few weeks, but a good example of sample code (from one of the other filesystems that does this well) would help.

-----Original Message-----
From: Matthew Wilcox <willy@infradead.org> 
Sent: Monday, April 29, 2024 3:26 PM
To: Steven French <Steven.French@microsoft.com>
Cc: Kairui Song <kasong@tencent.com>; linux-mm@kvack.org; Andrew Morton <akpm@linux-foundation.org>; Huang, Ying <ying.huang@intel.com>; Chris Li <chrisl@kernel.org>; Barry Song <v-songbaohua@oppo.com>; Ryan Roberts <ryan.roberts@arm.com>; Neil Brown <neilb@suse.de>; Minchan Kim <minchan@kernel.org>; Hugh Dickins <hughd@google.com>; David Hildenbrand <david@redhat.com>; Yosry Ahmed <yosryahmed@google.com>; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; Namjae Jeon <linkinjeon@kernel.org>; Paulo Alcantara (SUSE) <pc@manguebit.com>; Shyam Prasad <Shyam.Prasad@microsoft.com>; Bharath S M <bharathsm@microsoft.com>
Subject: Re: [EXTERNAL] [PATCH v3 05/12] cifs: drop usage of page_file_offset

On Mon, Apr 29, 2024 at 08:19:31PM +0000, Steven French wrote:
> Wouldn't this make it harder to fix the regression when swap file support was temporarily removed from cifs.ko (due to the folio migration)?   I was hoping to come back to fixing swapfile support for cifs.ko in 6.10-rc (which used to pass the various xfstests for this but code got removed with folios/netfs changes).

It was neither the folio conversion nor the netfs conversion which removed the claim of swap support from cifs, but NeilBrown's introduction of ->swap_rw.  In commit e1209d3a7a67 he claims that

    Only two filesystems set SWP_FS_OPS:
    - cifs sets the flag, but ->direct_IO always fails so swap cannot work.
    - nfs sets the flag, but ->direct_IO calls generic_write_checks()
      which has failed on swap files for several releases.

As I recall the xfstests only checked that swapon/swapoff works; they don't actually test that writing to swap and reading back from it work.

^ permalink raw reply	[relevance 70%]

* RE: [EXTERNAL] [PATCH v3 05/12] cifs: drop usage of page_file_offset
  @ 2024-04-29 20:19 65%   ` Steven French
    0 siblings, 1 reply; 200+ results
From: Steven French @ 2024-04-29 20:19 UTC (permalink / raw)
  To: Kairui Song, linux-mm
  Cc: Andrew Morton, Huang, Ying, Matthew Wilcox, Chris Li, Barry Song,
	Ryan Roberts, Neil Brown, Minchan Kim, Hugh Dickins,
	David Hildenbrand, Yosry Ahmed, linux-fsdevel, linux-kernel,
	Namjae Jeon, Paulo Alcantara (SUSE),
	Shyam Prasad, Bharath S M

Wouldn't this make it harder to fix the regression when swap file support was temporarily removed from cifs.ko (due to the folio migration)?   I was hoping to come back to fixing swapfile support for cifs.ko in 6.10-rc (which used to pass the various xfstests for this but code got removed with folios/netfs changes).

-----Original Message-----
From: Kairui Song <ryncsn@gmail.com> 
Sent: Monday, April 29, 2024 2:05 PM
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>; Huang, Ying <ying.huang@intel.com>; Matthew Wilcox <willy@infradead.org>; Chris Li <chrisl@kernel.org>; Barry Song <v-songbaohua@oppo.com>; Ryan Roberts <ryan.roberts@arm.com>; Neil Brown <neilb@suse.de>; Minchan Kim <minchan@kernel.org>; Hugh Dickins <hughd@google.com>; David Hildenbrand <david@redhat.com>; Yosry Ahmed <yosryahmed@google.com>; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; Kairui Song <kasong@tencent.com>; Steven French <Steven.French@microsoft.com>; Namjae Jeon <linkinjeon@kernel.org>; Paulo Alcantara (SUSE) <pc@manguebit.com>; Shyam Prasad <Shyam.Prasad@microsoft.com>; Bharath S M <bharathsm@microsoft.com>
Subject: [EXTERNAL] [PATCH v3 05/12] cifs: drop usage of page_file_offset

From: Kairui Song <kasong@tencent.com>

page_file_offset is only needed for mixed usage of page cache and swap cache; for pure page cache usage, the caller can just use page_offset instead.

It can't be a swap cache page here, so just drop it and convert it to use folio.

Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Steve French <stfrench@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Paulo Alcantara <pc@manguebit.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Bharath SM <bharathsm@microsoft.com>
---
 fs/smb/client/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 9be37d0fe724..388343b0fceb 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -4828,7 +4828,7 @@ static int cifs_readpage_worker(struct file *file, struct page *page,
 static int cifs_read_folio(struct file *file, struct folio *folio)
 {
        struct page *page = &folio->page;
-       loff_t offset = page_file_offset(page);
+       loff_t offset = folio_pos(folio);
        int rc = -EACCES;
        unsigned int xid;

--
2.44.0


^ permalink raw reply	[relevance 65%]

* RE: [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-04-26 13:12 68% ` [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
@ 2024-04-29 18:08 79%   ` Long Li
  2024-05-01 14:01 79%     ` Konstantin Taranov
  2024-05-02 17:04 79%   ` Long Li
  1 sibling, 1 reply; 200+ results
From: Long Li @ 2024-04-29 18:08 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation
> of rnic cq
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Enable users to create RNIC CQs using a corresponding flag.
> With the previous request size, an ethernet CQ is created.
> As a response, return ID of the created CQ.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/cq.c | 55 ++++++++++++++++++++++++++++++---
>  include/uapi/rdma/mana-abi.h    | 12 +++++++
>  2 files changed, 63 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index 688ffe6..c6a3fd5 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -9,17 +9,22 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> ib_cq_init_attr *attr,
>  		      struct ib_udata *udata)
>  {
>  	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> +	struct mana_ib_create_cq_resp resp = {};
> +	struct mana_ib_ucontext *mana_ucontext;
>  	struct ib_device *ibdev = ibcq->device;
>  	struct mana_ib_create_cq ucmd = {};
>  	struct mana_ib_dev *mdev;
> +	bool is_rnic_cq;
> +	u32 doorbell;
>  	int err;
> 
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> 
> -	if (udata->inlen < sizeof(ucmd))
> -		return -EINVAL;
> -
>  	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> +	cq->cq_handle = INVALID_MANA_HANDLE;
> +
> +	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> +		return -EINVAL;
> 
>  	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata-
> >inlen));
>  	if (err) {
> @@ -28,7 +33,9 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> ib_cq_init_attr *attr,
>  		return err;
>  	}
> 
> -	if (attr->cqe > mdev->adapter_caps.max_qp_wr) {
> +	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> +
> +	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
>  		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
>  		return -EINVAL;
>  	}
> @@ -40,7 +47,41 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> ib_cq_init_attr *attr,
>  		return err;
>  	}
> 
> +	mana_ucontext = rdma_udata_to_drv_context(udata, struct
> mana_ib_ucontext,
> +						  ibucontext);
> +	doorbell = mana_ucontext->doorbell;
> +
> +	if (is_rnic_cq) {
> +		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to create RNIC cq, %d\n", err);
> +			goto err_destroy_queue;
> +		}
> +
> +		err = mana_ib_install_cq_cb(mdev, cq);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to install cq callback, %d\n",
> err);
> +			goto err_destroy_rnic_cq;
> +		}
> +	}
> +
> +	resp.cqid = cq->queue.id;
> +	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
> +	if (err) {
> +		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
> +		goto err_remove_cq_cb;
> +	}
> +
>  	return 0;
> +
> +err_remove_cq_cb:
> +	mana_ib_remove_cq_cb(mdev, cq);
> +err_destroy_rnic_cq:
> +	mana_ib_gd_destroy_cq(mdev, cq);
> +err_destroy_queue:
> +	mana_ib_destroy_queue(mdev, &cq->queue);
> +
> +	return err;
>  }
> 
>  int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) @@ -52,6
> +93,12 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> 
>  	mana_ib_remove_cq_cb(mdev, cq);
> +
> +	/* Ignore return code as there is not much we can do about it.
> +	 * The error message is printed inside.
> +	 */
> +	mana_ib_gd_destroy_cq(mdev, cq);
> +
>  	mana_ib_destroy_queue(mdev, &cq->queue);
> 
>  	return 0;
> diff --git a/include/uapi/rdma/mana-abi.h b/include/uapi/rdma/mana-abi.h
> index 5fcb31b..2c41cc3 100644
> --- a/include/uapi/rdma/mana-abi.h
> +++ b/include/uapi/rdma/mana-abi.h
> @@ -16,8 +16,20 @@
> 
>  #define MANA_IB_UVERBS_ABI_VERSION 1
> 
> +enum mana_ib_create_cq_flags {
> +	MANA_IB_CREATE_RNIC_CQ	= 1 << 0,
> +};
> +
>  struct mana_ib_create_cq {
>  	__aligned_u64 buf_addr;
> +	__u16	flags;
> +	__u16	reserved0;
> +	__u32	reserved1;
> +};
> +
> +struct mana_ib_create_cq_resp {
> +	__u32 cqid;
> +	__u32 reserved;
>  };
> 
>  struct mana_ib_create_qp {
> --
> 2.43.0

For this review, it will be helpful if you can also post a link to the rdma-core changes.

Long

^ permalink raw reply	[relevance 79%]

* [RFC PATCH] fs/coredump: Enable dynamic configuration of max file note size
@ 2024-04-29 17:21 68% Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-04-29 17:21 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel, linux-mm, viro, brauner, jack, ebiederm, keescook,
	mcgrof, j.granados

Introduce the capability to dynamically configure the maximum file
note size for ELF core dumps via sysctl. This enhancement removes
the previous static limit of 4MB, allowing system administrators to
adjust the size based on system-specific requirements or constraints.

- Remove hardcoded `MAX_FILE_NOTE_SIZE` from `fs/binfmt_elf.c`.
- Define `max_file_note_size` in `fs/coredump.c` with an initial value set to 4MB.
- Declare `max_file_note_size` as an external variable in `include/linux/coredump.h`.
- Add a new sysctl entry in `kernel/sysctl.c` to manage this setting at runtime.

$ sysctl -a | grep max_file_note_size
kernel.max_file_note_size = 4194304

$ sysctl -n kernel.max_file_note_size
4194304

$ echo 519304 > /proc/sys/kernel/max_file_note_size

$ sysctl -n kernel.max_file_note_size
519304

Signed-off-by: Vijay Nag <nagvijay@microsoft.com>
Signed-off-by: Allen Pais <apais@linux.microsoft.com>
---
 fs/binfmt_elf.c          | 3 +--
 fs/coredump.c            | 3 +++
 include/linux/coredump.h | 1 +
 kernel/sysctl.c          | 8 ++++++++
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5397b552fbeb..5fc7baa9ebf2 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1564,7 +1564,6 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
 	fill_note(note, "CORE", NT_SIGINFO, sizeof(*csigdata), csigdata);
 }
 
-#define MAX_FILE_NOTE_SIZE (4*1024*1024)
 /*
  * Format of NT_FILE note:
  *
@@ -1592,7 +1591,7 @@ static int fill_files_note(struct memelfnote *note, struct coredump_params *cprm
 
 	names_ofs = (2 + 3 * count) * sizeof(data[0]);
  alloc:
-	if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
+	if (size >= max_file_note_size) /* paranoia check */
 		return -EINVAL;
 	size = round_up(size, PAGE_SIZE);
 	/*
diff --git a/fs/coredump.c b/fs/coredump.c
index be6403b4b14b..a83c6cc893fc 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -56,10 +56,13 @@
 static bool dump_vma_snapshot(struct coredump_params *cprm);
 static void free_vma_snapshot(struct coredump_params *cprm);
 
+#define MAX_FILE_NOTE_SIZE (4*1024*1024)
+
 static int core_uses_pid;
 static unsigned int core_pipe_limit;
 static char core_pattern[CORENAME_MAX_SIZE] = "core";
 static int core_name_size = CORENAME_MAX_SIZE;
+unsigned int max_file_note_size = MAX_FILE_NOTE_SIZE;
 
 struct core_name {
 	char *corename;
diff --git a/include/linux/coredump.h b/include/linux/coredump.h
index d3eba4360150..e1ae7ab33d76 100644
--- a/include/linux/coredump.h
+++ b/include/linux/coredump.h
@@ -46,6 +46,7 @@ static inline void do_coredump(const kernel_siginfo_t *siginfo) {}
 #endif
 
 #if defined(CONFIG_COREDUMP) && defined(CONFIG_SYSCTL)
+extern unsigned int max_file_note_size;
 extern void validate_coredump_safety(void);
 #else
 static inline void validate_coredump_safety(void) {}
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 81cc974913bb..80cdc37f2fa2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -63,6 +63,7 @@
 #include <linux/mount.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/pid.h>
+#include <linux/coredump.h>
 
 #include "../lib/kstrtox.h"
 
@@ -1623,6 +1624,13 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname       = "max_file_note_size",
+		.data           = &max_file_note_size,
+		.maxlen         = sizeof(unsigned int),
+		.mode           = 0644,
+		.proc_handler   = proc_dointvec,
+	},
 #ifdef CONFIG_PROC_SYSCTL
 	{
 		.procname	= "tainted",
-- 
2.17.1
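
The effect of the tunable on the size check in fill_files_note() can be
sketched in userspace C. This is only an illustration under stated
assumptions: the sketch_* names and the 4096-byte page size are
hypothetical, and only the ">= limit" comparison and the round-up to a
page multiple are taken from the hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, not from the patch */

/* Round x up to the next multiple of a power-of-two alignment. */
static size_t sketch_round_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Store the page-rounded allocation size in *out and return 0, or return
 * -EINVAL when the requested size reaches the configured limit.
 */
static int sketch_note_alloc_size(size_t size, unsigned int limit, size_t *out)
{
	if (size >= limit)	/* same comparison as the paranoia check */
		return -EINVAL;
	*out = sketch_round_up(size, SKETCH_PAGE_SIZE);
	return 0;
}
```

With the limit lowered to 519304 as in the transcript above, a request for
600000 bytes is rejected in this sketch, while the same request passes
under the 4MB default.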


^ permalink raw reply related	[relevance 68%]

* [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-04-26 13:12 69% [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
                   ` (3 preceding siblings ...)
  2024-04-26 13:12 79% ` [PATCH rdma-next v2 4/5] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
@ 2024-04-26 13:12 68% ` Konstantin Taranov
  2024-04-29 18:08 79%   ` Long Li
  2024-05-02 17:04 79%   ` Long Li
  4 siblings, 2 replies; 200+ results
From: Konstantin Taranov @ 2024-04-26 13:12 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Enable users to create RNIC CQs using a corresponding flag.
With the previous request size, an ethernet CQ is created.
In the response, return the ID of the created CQ.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c | 55 ++++++++++++++++++++++++++++++---
 include/uapi/rdma/mana-abi.h    | 12 +++++++
 2 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 688ffe6..c6a3fd5 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -9,17 +9,22 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		      struct ib_udata *udata)
 {
 	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
+	struct mana_ib_create_cq_resp resp = {};
+	struct mana_ib_ucontext *mana_ucontext;
 	struct ib_device *ibdev = ibcq->device;
 	struct mana_ib_create_cq ucmd = {};
 	struct mana_ib_dev *mdev;
+	bool is_rnic_cq;
+	u32 doorbell;
 	int err;
 
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
 
-	if (udata->inlen < sizeof(ucmd))
-		return -EINVAL;
-
 	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
+	cq->cq_handle = INVALID_MANA_HANDLE;
+
+	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
+		return -EINVAL;
 
 	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
 	if (err) {
@@ -28,7 +33,9 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		return err;
 	}
 
-	if (attr->cqe > mdev->adapter_caps.max_qp_wr) {
+	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
+
+	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
 		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
 		return -EINVAL;
 	}
@@ -40,7 +47,41 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		return err;
 	}
 
+	mana_ucontext = rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
+						  ibucontext);
+	doorbell = mana_ucontext->doorbell;
+
+	if (is_rnic_cq) {
+		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to create RNIC cq, %d\n", err);
+			goto err_destroy_queue;
+		}
+
+		err = mana_ib_install_cq_cb(mdev, cq);
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to install cq callback, %d\n", err);
+			goto err_destroy_rnic_cq;
+		}
+	}
+
+	resp.cqid = cq->queue.id;
+	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
+	if (err) {
+		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
+		goto err_remove_cq_cb;
+	}
+
 	return 0;
+
+err_remove_cq_cb:
+	mana_ib_remove_cq_cb(mdev, cq);
+err_destroy_rnic_cq:
+	mana_ib_gd_destroy_cq(mdev, cq);
+err_destroy_queue:
+	mana_ib_destroy_queue(mdev, &cq->queue);
+
+	return err;
 }
 
 int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
@@ -52,6 +93,12 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
 
 	mana_ib_remove_cq_cb(mdev, cq);
+
+	/* Ignore return code as there is not much we can do about it.
+	 * The error message is printed inside.
+	 */
+	mana_ib_gd_destroy_cq(mdev, cq);
+
 	mana_ib_destroy_queue(mdev, &cq->queue);
 
 	return 0;
diff --git a/include/uapi/rdma/mana-abi.h b/include/uapi/rdma/mana-abi.h
index 5fcb31b..2c41cc3 100644
--- a/include/uapi/rdma/mana-abi.h
+++ b/include/uapi/rdma/mana-abi.h
@@ -16,8 +16,20 @@
 
 #define MANA_IB_UVERBS_ABI_VERSION 1
 
+enum mana_ib_create_cq_flags {
+	MANA_IB_CREATE_RNIC_CQ	= 1 << 0,
+};
+
 struct mana_ib_create_cq {
 	__aligned_u64 buf_addr;
+	__u16	flags;
+	__u16	reserved0;
+	__u32	reserved1;
+};
+
+struct mana_ib_create_cq_resp {
+	__u32 cqid;
+	__u32 reserved;
 };
 
 struct mana_ib_create_qp {
-- 
2.43.0
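
The patch initialises cq->cq_handle to INVALID_MANA_HANDLE before any
device request is made, and mana_ib_gd_destroy_cq() returns early for that
value, so the shared teardown path can run for ethernet CQs as well. A
minimal userspace sketch of that sentinel pattern follows; the sketch_*
names, the sentinel value and the call counter are hypothetical and only
stand in for the driver's handle and device request.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_INVALID_HANDLE UINT64_MAX	/* hypothetical sentinel value */

struct sketch_cq {
	uint64_t handle;
};

static int sketch_hw_destroy_calls;	/* counts simulated device requests */

/* Mark the CQ as having no device-side object yet. */
static void sketch_cq_init(struct sketch_cq *cq)
{
	cq->handle = SKETCH_INVALID_HANDLE;
}

/* Record the handle a (simulated) device returned for this CQ. */
static void sketch_cq_hw_create(struct sketch_cq *cq, uint64_t handle_from_device)
{
	assert(handle_from_device != SKETCH_INVALID_HANDLE);
	cq->handle = handle_from_device;
}

/* Safe to call for any initialised CQ: no request is sent without a handle. */
static int sketch_cq_hw_destroy(struct sketch_cq *cq)
{
	if (cq->handle == SKETCH_INVALID_HANDLE)
		return 0;
	sketch_hw_destroy_calls++;
	return 0;
}
```

The design choice is that callers do not need to remember which kind of CQ
they hold; the handle itself records whether there is anything to destroy.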


^ permalink raw reply related	[relevance 68%]

* [PATCH rdma-next v2 4/5] RDMA/mana_ib: boundary check before installing cq callbacks
  2024-04-26 13:12 69% [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
                   ` (2 preceding siblings ...)
  2024-04-26 13:12 64% ` [PATCH rdma-next v2 3/5] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
@ 2024-04-26 13:12 79% ` Konstantin Taranov
  2024-04-26 13:12 68% ` [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
  4 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-26 13:12 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Add a boundary check inside mana_ib_install_cq_cb to prevent index overflow.

Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function")
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 298e8f1..688ffe6 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -70,6 +70,8 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct gdma_queue *gdma_cq;
 
+	if (cq->queue.id >= gc->max_num_cqs)
+		return -EINVAL;
 	/* Create CQ table entry */
 	WARN_ON(gc->cq_table[cq->queue.id]);
 	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
-- 
2.43.0
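
The check added above can be illustrated with a small userspace sketch.
The sketch_* names and the fixed table size are hypothetical; the only
part taken from the patch is the "id >= max_num_cqs" test performed before
the table is indexed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_CQS 8u	/* hypothetical table size */

struct sketch_ctx {
	uint32_t max_num_cqs;
	void *cq_table[SKETCH_MAX_CQS];
};

/* Install an entry for queue 'id', refusing ids outside the table. */
static int sketch_install_cb(struct sketch_ctx *gc, uint32_t id, void *entry)
{
	assert(gc->max_num_cqs <= SKETCH_MAX_CQS);

	/* Valid indices are 0..max_num_cqs-1, hence >= rather than >. */
	if (id >= gc->max_num_cqs)
		return -EINVAL;

	gc->cq_table[id] = entry;
	return 0;
}
```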


^ permalink raw reply related	[relevance 79%]

* [PATCH rdma-next v2 1/5] RDMA/mana_ib: create EQs for RNIC CQs
  2024-04-26 13:12 69% [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
@ 2024-04-26 13:12 75% ` Konstantin Taranov
  2024-04-26 13:12 68% ` [PATCH rdma-next v2 2/5] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-26 13:12 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Create EQs within mana_ib device. Such EQs are required
for creation of RNIC CQs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    | 34 ++++++++++++++++++++++++++--
 drivers/infiniband/hw/mana/mana_ib.h |  1 +
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index f540147..546d059 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -658,7 +658,7 @@ int mana_ib_create_eqs(struct mana_ib_dev *mdev)
 {
 	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct gdma_queue_spec spec = {};
-	int err;
+	int err, i;
 
 	spec.type = GDMA_EQ;
 	spec.monitor_avl_buf = false;
@@ -672,12 +672,42 @@ int mana_ib_create_eqs(struct mana_ib_dev *mdev)
 	if (err)
 		return err;
 
+	mdev->eqs = kcalloc(mdev->ib_dev.num_comp_vectors, sizeof(struct gdma_queue *),
+			    GFP_KERNEL);
+	if (!mdev->eqs) {
+		err = -ENOMEM;
+		goto destroy_fatal_eq;
+	}
+
+	for (i = 0; i < mdev->ib_dev.num_comp_vectors; i++) {
+		spec.eq.msix_index = (i + 1) % gc->num_msix_usable;
+		err = mana_gd_create_mana_eq(mdev->gdma_dev, &spec, &mdev->eqs[i]);
+		if (err)
+			goto destroy_eqs;
+	}
+
 	return 0;
+
+destroy_eqs:
+	while (i-- > 0)
+		mana_gd_destroy_queue(gc, mdev->eqs[i]);
+	kfree(mdev->eqs);
+destroy_fatal_eq:
+	mana_gd_destroy_queue(gc, mdev->fatal_err_eq);
+	return err;
 }
 
 void mana_ib_destroy_eqs(struct mana_ib_dev *mdev)
 {
-	mana_gd_destroy_queue(mdev_to_gc(mdev), mdev->fatal_err_eq);
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	int i;
+
+	mana_gd_destroy_queue(gc, mdev->fatal_err_eq);
+
+	for (i = 0; i < mdev->ib_dev.num_comp_vectors; i++)
+		mana_gd_destroy_queue(gc, mdev->eqs[i]);
+
+	kfree(mdev->eqs);
 }
 
 int mana_ib_gd_create_rnic_adapter(struct mana_ib_dev *mdev)
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 4c1240d..bfcf6df 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -56,6 +56,7 @@ struct mana_ib_dev {
 	struct gdma_dev *gdma_dev;
 	mana_handle_t adapter_handle;
 	struct gdma_queue *fatal_err_eq;
+	struct gdma_queue **eqs;
 	struct mana_ib_adapter_caps adapter_caps;
 };
 
-- 
2.43.0
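
The error path in mana_ib_create_eqs() unwinds only the EQs that were
created before the failure. A minimal userspace sketch of that pattern is
shown below; the sketch_* names, the fail_at parameter and the live-object
counter are hypothetical test scaffolding, with malloc/free standing in
for the EQ create and destroy calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_eq {
	int msix_index;
};

static int sketch_live;	/* number of EQs currently allocated */

/* Hypothetical creator that fails on purpose for index 'fail_at'. */
static int sketch_create_eq(int idx, int fail_at, struct sketch_eq **out)
{
	struct sketch_eq *eq;

	if (idx == fail_at)
		return -EIO;
	eq = malloc(sizeof(*eq));
	if (!eq)
		return -ENOMEM;
	eq->msix_index = idx;
	sketch_live++;
	*out = eq;
	return 0;
}

static void sketch_destroy_eq(struct sketch_eq *eq)
{
	free(eq);
	sketch_live--;
}

/* Create n EQs; on failure, leave nothing allocated and *eqs_out untouched. */
static int sketch_create_eqs(struct sketch_eq ***eqs_out, int n, int fail_at)
{
	struct sketch_eq **eqs;
	int err, i;

	assert(n > 0);
	eqs = calloc((size_t)n, sizeof(*eqs));
	if (!eqs)
		return -ENOMEM;

	for (i = 0; i < n; i++) {
		err = sketch_create_eq(i, fail_at, &eqs[i]);
		if (err)
			goto destroy_eqs;
	}

	*eqs_out = eqs;
	return 0;

destroy_eqs:
	/* i is the index that failed; tear down 0..i-1, newest first. */
	while (i-- > 0)
		sketch_destroy_eq(eqs[i]);
	free(eqs);
	return err;
}
```

Because the loop index is reused for the unwind, no separate count of
successfully created EQs has to be kept.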


^ permalink raw reply related	[relevance 75%]

* [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs
@ 2024-04-26 13:12 69% Konstantin Taranov
  2024-04-26 13:12 75% ` [PATCH rdma-next v2 1/5] RDMA/mana_ib: create EQs for " Konstantin Taranov
                   ` (4 more replies)
  0 siblings, 5 replies; 200+ results
From: Konstantin Taranov @ 2024-04-26 13:12 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

This patch series implements creation and destruction of CQs
which can be used with RC QPs.

Patches with RC QPs will be sent in the next patch series.

To create a CQ for RNIC, mana_ib requires creation of EQs
within mana_ib device. An EQ of mana ethernet cannot be used.

To make the implementation of create_cq cleaner, this series
also introduces a helper to remove CQ callbacks.

Mana ethernet and mana_ib CQs are different entities which are
created in different isolation zones (ethernet vs rnic).
As a result, RNIC cannot use ethernet CQs and ethernet cannot
use RNIC CQs.
That is why we keep the existing udata request for creation of
ethernet CQs. If the request has an extra flag, then we create
an RNIC CQ. The kernel-level CQs will be RNIC CQs (in future
patches).

To preserve backward and forward compatibility with RDMA-CORE,
we will make the following changes to mana provider in RDMA-CORE:

The rdma-core will request RNIC CQs by default, with the proposed
request format and the special flag.
If the mana has installed an allocator with manadv_set_context_attr,
then the rdma-core understands that this is a DPDK use-case and
requests an ethernet CQ, by not setting the flag.

If the user has a new RDMA-core and an old kernel, then the user can
detect it, because the response to an RNIC CQ creation request will not
contain a queue ID.

If the user has an old RDMA-core, then the flags will be 0 and an ethernet
CQ will be created (as expected by the user).
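
The compatibility rule above can be sketched in userspace C. This is a
minimal illustration, not the driver code: the sketch_* names are
hypothetical, memcpy stands in for ib_copy_from_udata, and the struct only
mirrors the request layout proposed in patch 5/5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CREATE_RNIC_CQ (1u << 0)

/* Mirrors the proposed request: old callers send only buf_addr. */
struct sketch_create_cq {
	uint64_t buf_addr;
	uint16_t flags;
	uint16_t reserved0;
	uint32_t reserved1;
};

enum sketch_cq_kind {
	SKETCH_CQ_INVALID = -1,
	SKETCH_CQ_ETH = 0,
	SKETCH_CQ_RNIC = 1,
};

/* Decide which CQ type a request of 'inlen' bytes asks for. */
static enum sketch_cq_kind sketch_parse_create_cq(const void *in, size_t inlen)
{
	struct sketch_create_cq cmd;
	size_t n;

	assert(in != NULL);

	/* Everything before 'flags' is mandatory. */
	if (inlen < offsetof(struct sketch_create_cq, flags))
		return SKETCH_CQ_INVALID;

	/* Zero-fill first so fields an old caller did not send read as 0. */
	memset(&cmd, 0, sizeof(cmd));
	n = inlen < sizeof(cmd) ? inlen : sizeof(cmd);
	memcpy(&cmd, in, n);

	return (cmd.flags & SKETCH_CREATE_RNIC_CQ) ? SKETCH_CQ_RNIC
						    : SKETCH_CQ_ETH;
}
```

An old request that carries only the 8-byte buf_addr therefore still
selects an ethernet CQ, and a new request selects an RNIC CQ only when it
sets the flag.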

v1->v2:
1) removed the patch that replaces cqe with buf_size
2) added an additional check of queue id in the remove cb helper
3) removed buf_size from the uapi request and added flags instead. It seems
to be a better proposal that does not require increasing the ABI version.

Konstantin Taranov (5):
  RDMA/mana_ib: create EQs for RNIC CQs
  RDMA/mana_ib: create and destroy RNIC cqs
  RDMA/mana_ib: introduce a helper to remove cq callbacks
  RDMA/mana_ib: boundary check before installing cq callbacks
  RDMA/mana_ib: implement uapi for creation of rnic cq

 drivers/infiniband/hw/mana/cq.c      | 74 +++++++++++++++++++----
 drivers/infiniband/hw/mana/main.c    | 88 +++++++++++++++++++++++++++-
 drivers/infiniband/hw/mana/mana_ib.h | 34 +++++++++++
 drivers/infiniband/hw/mana/qp.c      | 26 ++------
 include/uapi/rdma/mana-abi.h         | 12 ++++
 5 files changed, 200 insertions(+), 34 deletions(-)

-- 
2.43.0


^ permalink raw reply	[relevance 69%]

* [PATCH rdma-next v2 2/5] RDMA/mana_ib: create and destroy RNIC cqs
  2024-04-26 13:12 69% [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
  2024-04-26 13:12 75% ` [PATCH rdma-next v2 1/5] RDMA/mana_ib: create EQs for " Konstantin Taranov
@ 2024-04-26 13:12 68% ` Konstantin Taranov
  2024-04-26 13:12 64% ` [PATCH rdma-next v2 3/5] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-26 13:12 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement RNIC requests for creation and destruction of RNIC CQs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    | 54 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h | 32 +++++++++++++++++
 2 files changed, 86 insertions(+)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 546d059..2a41135 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -834,3 +834,57 @@ int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8
 
 	return 0;
 }
+
+int mana_ib_gd_create_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq, u32 doorbell)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_create_cq_resp resp = {};
+	struct mana_rnic_create_cq_req req = {};
+	int err;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CREATE_CQ, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.gdma_region = cq->queue.gdma_region;
+	req.eq_id = mdev->eqs[cq->comp_vector]->id;
+	req.doorbell_page = doorbell;
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create cq err %d", err);
+		return err;
+	}
+
+	cq->queue.id  = resp.cq_id;
+	cq->cq_handle = resp.cq_handle;
+	/* The GDMA region is now owned by the CQ handle */
+	cq->queue.gdma_region = GDMA_INVALID_DMA_REGION;
+
+	return 0;
+}
+
+int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_destroy_cq_resp resp = {};
+	struct mana_rnic_destroy_cq_req req = {};
+	int err;
+
+	if (cq->cq_handle == INVALID_MANA_HANDLE)
+		return 0;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_DESTROY_CQ, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.cq_handle = cq->cq_handle;
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to destroy cq err %d", err);
+		return err;
+	}
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index bfcf6df..9162f29 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -92,6 +92,7 @@ struct mana_ib_cq {
 	struct mana_ib_queue queue;
 	int cqe;
 	u32 comp_vector;
+	mana_handle_t  cq_handle;
 };
 
 struct mana_ib_qp {
@@ -119,6 +120,8 @@ enum mana_ib_command_code {
 	MANA_IB_DESTROY_ADAPTER = 0x30003,
 	MANA_IB_CONFIG_IP_ADDR	= 0x30004,
 	MANA_IB_CONFIG_MAC_ADDR	= 0x30005,
+	MANA_IB_CREATE_CQ       = 0x30008,
+	MANA_IB_DESTROY_CQ      = 0x30009,
 };
 
 struct mana_ib_query_adapter_caps_req {
@@ -202,6 +205,31 @@ struct mana_rnic_config_mac_addr_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+struct mana_rnic_create_cq_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	u64 gdma_region;
+	u32 eq_id;
+	u32 doorbell_page;
+}; /* HW Data */
+
+struct mana_rnic_create_cq_resp {
+	struct gdma_resp_hdr hdr;
+	mana_handle_t cq_handle;
+	u32 cq_id;
+	u32 reserved;
+}; /* HW Data */
+
+struct mana_rnic_destroy_cq_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t cq_handle;
+}; /* HW Data */
+
+struct mana_rnic_destroy_cq_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -321,4 +349,8 @@ int mana_ib_gd_add_gid(const struct ib_gid_attr *attr, void **context);
 int mana_ib_gd_del_gid(const struct ib_gid_attr *attr, void **context);
 
 int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8 *mac);
+
+int mana_ib_gd_create_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq, u32 doorbell);
+
+int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
 #endif
-- 
2.43.0



* [PATCH rdma-next v2 3/5] RDMA/mana_ib: introduce a helper to remove cq callbacks
  2024-04-26 13:12 69% [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
  2024-04-26 13:12 75% ` [PATCH rdma-next v2 1/5] RDMA/mana_ib: create EQs for " Konstantin Taranov
  2024-04-26 13:12 68% ` [PATCH rdma-next v2 2/5] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
@ 2024-04-26 13:12 64% ` Konstantin Taranov
  2024-04-26 13:12 79% ` [PATCH rdma-next v2 4/5] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
  2024-04-26 13:12 68% ` [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
  4 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-26 13:12 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Introduce the mana_ib_remove_cq_cb helper to remove cq callbacks.
The helper removes duplicated code.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c      | 19 ++++++++++++-------
 drivers/infiniband/hw/mana/mana_ib.h |  1 +
 drivers/infiniband/hw/mana/qp.c      | 26 ++++----------------------
 3 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index dc931b9..298e8f1 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -48,16 +48,10 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
 	struct ib_device *ibdev = ibcq->device;
 	struct mana_ib_dev *mdev;
-	struct gdma_context *gc;
 
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
-	gc = mdev_to_gc(mdev);
-
-	if (cq->queue.id != INVALID_QUEUE_ID) {
-		kfree(gc->cq_table[cq->queue.id]);
-		gc->cq_table[cq->queue.id] = NULL;
-	}
 
+	mana_ib_remove_cq_cb(mdev, cq);
 	mana_ib_destroy_queue(mdev, &cq->queue);
 
 	return 0;
@@ -89,3 +83,14 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 	gc->cq_table[cq->queue.id] = gdma_cq;
 	return 0;
 }
+
+void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+
+	if (cq->queue.id >= gc->max_num_cqs || cq->queue.id == INVALID_QUEUE_ID)
+		return;
+
+	kfree(gc->cq_table[cq->queue.id]);
+	gc->cq_table[cq->queue.id] = NULL;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 9162f29..68c3b4f 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -255,6 +255,7 @@ static inline void copy_in_reverse(u8 *dst, const u8 *src, u32 size)
 }
 
 int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
+void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
 
 int mana_ib_create_zero_offset_dma_region(struct mana_ib_dev *dev, struct ib_umem *umem,
 					  mana_handle_t *gdma_region);
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 280e85a..ba13c5a 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -95,11 +95,9 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
 	struct mana_ib_dev *mdev =
 		container_of(pd->device, struct mana_ib_dev, ib_dev);
-	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct ib_rwq_ind_table *ind_tbl = attr->rwq_ind_tbl;
 	struct mana_ib_create_qp_rss_resp resp = {};
 	struct mana_ib_create_qp_rss ucmd = {};
-	struct gdma_queue **gdma_cq_allocated;
 	mana_handle_t *mana_ind_table;
 	struct mana_port_context *mpc;
 	unsigned int ind_tbl_size;
@@ -173,13 +171,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		goto fail;
 	}
 
-	gdma_cq_allocated = kcalloc(ind_tbl_size, sizeof(*gdma_cq_allocated),
-				    GFP_KERNEL);
-	if (!gdma_cq_allocated) {
-		ret = -ENOMEM;
-		goto fail;
-	}
-
 	qp->port = port;
 
 	for (i = 0; i < ind_tbl_size; i++) {
@@ -229,8 +220,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		ret = mana_ib_install_cq_cb(mdev, cq);
 		if (ret)
 			goto fail;
-
-		gdma_cq_allocated[i] = gc->cq_table[cq->queue.id];
 	}
 	resp.num_entries = i;
 
@@ -250,7 +239,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		goto fail;
 	}
 
-	kfree(gdma_cq_allocated);
 	kfree(mana_ind_table);
 
 	return 0;
@@ -262,13 +250,10 @@ fail:
 		wq = container_of(ibwq, struct mana_ib_wq, ibwq);
 		cq = container_of(ibcq, struct mana_ib_cq, ibcq);
 
-		gc->cq_table[cq->queue.id] = NULL;
-		kfree(gdma_cq_allocated[i]);
-
+		mana_ib_remove_cq_cb(mdev, cq);
 		mana_destroy_wq_obj(mpc, GDMA_RQ, wq->rx_object);
 	}
 
-	kfree(gdma_cq_allocated);
 	kfree(mana_ind_table);
 
 	return ret;
@@ -287,10 +272,8 @@ static int mana_ib_create_qp_raw(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	struct mana_ib_ucontext *mana_ucontext =
 		rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
 					  ibucontext);
-	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct mana_ib_create_qp_resp resp = {};
 	struct mana_ib_create_qp ucmd = {};
-	struct gdma_queue *gdma_cq = NULL;
 	struct mana_obj_spec wq_spec = {};
 	struct mana_obj_spec cq_spec = {};
 	struct mana_port_context *mpc;
@@ -395,14 +378,13 @@ static int mana_ib_create_qp_raw(struct ib_qp *ibqp, struct ib_pd *ibpd,
 		ibdev_dbg(&mdev->ib_dev,
 			  "Failed copy udata for create qp-raw, %d\n",
 			  err);
-		goto err_release_gdma_cq;
+		goto err_remove_cq_cb;
 	}
 
 	return 0;
 
-err_release_gdma_cq:
-	kfree(gdma_cq);
-	gc->cq_table[send_cq->queue.id] = NULL;
+err_remove_cq_cb:
+	mana_ib_remove_cq_cb(mdev, send_cq);
 
 err_destroy_wq_obj:
 	mana_destroy_wq_obj(mpc, GDMA_SQ, qp->qp_handle);
-- 
2.43.0



* Re: [PATCH net-next v2 1/2] net: Add sysfs atttributes for max_mtu min_mtu
  @ 2024-04-26 11:06 79%     ` Shradha Gupta
  0 siblings, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-26 11:06 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Bjorn Helgaas,
	Jonathan Corbet, Randy Dunlap, Johannes Berg, Breno Leitao,
	linux-kernel, netdev, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Souradeep Chakrabarti, Konstantin Taranov,
	Yury Norov, linux-hyperv, shradhagupta

On Wed, Apr 24, 2024 at 08:27:03PM -0700, Jakub Kicinski wrote:
> On Wed, 24 Apr 2024 03:33:37 -0700 Shradha Gupta wrote:
> > Add sysfs attributes to read max_mtu and min_mtu value for
> > network devices
> 
> Absolutely pointless. You posted v1, dumping this as a driver
> specific value, even tho it's already reported by the core...
> And you can't even produce a meaningful commit message longer
> than one sentence.
> 
> This is not meeting the bar. Please get your patches reviewed
> internally at Microsoft by someone with good understanding of
> Linux networking before you post.
Noted, I'll do the needful going forward. Apologies.


* RE: [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing cq callbacks
  2024-04-18 16:52 79% ` [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
  2024-04-23 23:45 79%   ` Long Li
@ 2024-04-25 20:31 79%   ` Long Li
  1 sibling, 0 replies; 200+ results
From: Long Li @ 2024-04-25 20:31 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before
> installing cq callbacks
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Add a boundary check inside mana_ib_install_cq_cb to prevent index overflow.
> 
> Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper
> function")
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/cq.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index 6c3bb8c..8323085 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -70,6 +70,8 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev,
> struct mana_ib_cq *cq)
>  	struct gdma_context *gc = mdev_to_gc(mdev);
>  	struct gdma_queue *gdma_cq;
> 
> +	if (cq->queue.id >= gc->max_num_cqs)
> +		return -EINVAL;
>  	/* Create CQ table entry */
>  	WARN_ON(gc->cq_table[cq->queue.id]);
>  	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> --
> 2.43.0



* RE: [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks
  2024-04-24  8:50 79%     ` Konstantin Taranov
@ 2024-04-25 20:29 79%       ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-25 20:29 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> > > +void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct
> > > mana_ib_cq
> > > +*cq) {
> > > +	struct gdma_context *gc = mdev_to_gc(mdev);
> > > +
> > > +	if (cq->queue.id >= gc->max_num_cqs)
> > > +		return;
> > > +
> > > +	kfree(gc->cq_table[cq->queue.id]);
> > > +	gc->cq_table[cq->queue.id] = NULL;
> >
> > Why the check for (cq->queue.id != INVALID_QUEUE_ID) is removed?
> 
> As max_num_cqs is always less than INVALID_QUEUE_ID, it is included in the "if".
> I can add " || cq->queue.id == INVALID_QUEUE_ID " to the condition if you want.

Okay, can you add a comment before if (cq->queue.id >= gc->max_num_cqs) saying it also works with INVALID_QUEUE_ID?
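
The point discussed above, that a single upper-bound check also rejects the invalid-id sentinel, can be sketched in userspace C. This is only an illustration under stated assumptions: demo_ctx, demo_remove_cq_cb, DEMO_MAX_NUM_CQS and DEMO_INVALID_QUEUE_ID are hypothetical stand-ins, and the sentinel is assumed to be the largest u32 value, so it is always >= max_num_cqs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the driver's definitions. */
#define DEMO_INVALID_QUEUE_ID UINT32_MAX	/* assumed sentinel value */
#define DEMO_MAX_NUM_CQS 8u

struct demo_ctx {
	uint32_t max_num_cqs;
	void *cq_table[DEMO_MAX_NUM_CQS];
};

/* Returns 1 if a table entry was cleared, 0 if the id was rejected. */
static int demo_remove_cq_cb(struct demo_ctx *gc, uint32_t id)
{
	/*
	 * This check also covers DEMO_INVALID_QUEUE_ID, because the
	 * sentinel is larger than any valid table index.
	 */
	if (id >= gc->max_num_cqs)
		return 0;

	free(gc->cq_table[id]);		/* free(NULL) is a no-op */
	gc->cq_table[id] = NULL;
	return 1;
}
```

Calling demo_remove_cq_cb() with DEMO_INVALID_QUEUE_ID returns 0 without touching the table, which is why an explicit "== INVALID_QUEUE_ID" test is redundant as long as the sentinel stays above max_num_cqs. Keeping the explicit test, as v2 does, documents the intent at the cost of one extra comparison.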


* Re: [PATCH v17 13/21] dm verity: consume root hash digest and expose signature data via LSM hook
  @ 2024-04-25 20:23 76%     ` Fan Wu
  0 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-25 20:23 UTC (permalink / raw)
  To: Eric Biggers
  Cc: corbet, zohar, jmorris, serge, tytso, axboe, agk, snitzer,
	eparis, paul, linux-doc, linux-integrity, linux-security-module,
	fsverity, linux-block, dm-devel, audit, linux-kernel,
	Deven Bowers



On 4/24/2024 8:56 PM, Eric Biggers wrote:
> On Fri, Apr 12, 2024 at 05:55:56PM -0700, Fan Wu wrote:
>> dm verity: consume root hash digest and expose signature data via LSM hook
> 
> As in the fsverity patch, nothing is being "consumed" here.  This patch adds a
> supplier, not a consumer.  I think you mean something like: expose root digest
> and signature to LSMs.
> 
Thanks for the suggestion.

>> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
>> index bb5da66da4c1..fbb83c6fd99c 100644
>> --- a/drivers/md/dm-verity-target.c
>> +++ b/drivers/md/dm-verity-target.c
>> @@ -22,6 +22,8 @@
>>   #include <linux/scatterlist.h>
>>   #include <linux/string.h>
>>   #include <linux/jump_label.h>
>> +#include <linux/security.h>
>> +#include <linux/dm-verity.h>
>>   
>>   #define DM_MSG_PREFIX			"verity"
>>   
>> @@ -1017,6 +1019,38 @@ static void verity_io_hints(struct dm_target *ti, struct queue_limits *limits)
>>   	blk_limits_io_min(limits, limits->logical_block_size);
>>   }
>>   
>> +#ifdef CONFIG_SECURITY
>> +
>> +static int verity_init_sig(struct dm_verity *v, const void *sig,
>> +			   size_t sig_size)
>> +{
>> +	v->sig_size = sig_size;
>> +	v->root_digest_sig = kmemdup(sig, v->sig_size, GFP_KERNEL);
>> +	if (!v->root_digest)
>> +		return -ENOMEM;
> 
> root_digest_sig, not root_digest
> 
Thanks for pointing that out!

>> +#ifdef CONFIG_SECURITY
>> +
>> +static int verity_finalize(struct dm_target *ti)
>> +{
>> +	struct block_device *bdev;
>> +	struct dm_verity_digest root_digest;
>> +	struct dm_verity *v;
>> +	int r;
>> +
>> +	v = ti->private;
>> +	bdev = dm_disk(dm_table_get_md(ti->table))->part0;
>> +	root_digest.digest = v->root_digest;
>> +	root_digest.digest_len = v->digest_size;
>> +	root_digest.alg = v->alg_name;
>> +
>> +	r = security_bdev_setintegrity(bdev, LSM_INT_DMVERITY_ROOTHASH, &root_digest,
>> +				       sizeof(root_digest));
>> +	if (r)
>> +		return r;
>> +
>> +	r = security_bdev_setintegrity(bdev,
>> +				       LSM_INT_DMVERITY_SIG_VALID,
>> +				       v->root_digest_sig,
>> +				       v->sig_size);
> 
> The signature is only checked if CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y, whereas
> this code is built whenever CONFIG_SECURITY=y.
> 
> So this seems like the same issue that has turned up elsewhere in the IPE
> patchset, where IPE is (apparently) happy with any signature, even one that
> hasn't been checked...
> 

Yes, I agree the second hook call should depend on 
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y.

However, the current implementation is not happy with just any signature.

With CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y, any signature 
provided to dm-verity is checked against the configured keyring, and 
the hook call is not reached if the check fails. If no 
signature is provided and !DM_VERITY_IS_SIG_FORCE_ENABLED(), the hook 
is called with a NULL signature value.

With CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=n, dm-verity does not 
accept a signature. In addition, dm-verity support as a whole is 
disabled for IPE in that configuration.

>> diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
>> index 20b1bcf03474..89e862f0cdf6 100644
>> --- a/drivers/md/dm-verity.h
>> +++ b/drivers/md/dm-verity.h
>> @@ -43,6 +43,9 @@ struct dm_verity {
>>   	u8 *root_digest;	/* digest of the root block */
>>   	u8 *salt;		/* salt: its size is salt_size */
>>   	u8 *zero_digest;	/* digest for a zero block */
>> +#ifdef CONFIG_SECURITY
>> +	u8 *root_digest_sig;	/* digest signature of the root block */
>> +#endif /* CONFIG_SECURITY */
> 
> No, it's not a signature of the root block, at least not directly.  It's a
> signature of the root digest (the digest of the root block).
> 
>> diff --git a/include/linux/dm-verity.h b/include/linux/dm-verity.h
>> new file mode 100644
>> index 000000000000..a799a8043d85
>> --- /dev/null
>> +++ b/include/linux/dm-verity.h
>> @@ -0,0 +1,12 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _LINUX_DM_VERITY_H
>> +#define _LINUX_DM_VERITY_H
>> +
>> +struct dm_verity_digest {
>> +	const char *alg;
>> +	const u8 *digest;
>> +	size_t digest_len;
>> +};
>> +
>> +#endif /* _LINUX_DM_VERITY_H */
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index ac0985641611..9e46b13a356c 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -84,7 +84,8 @@ enum lsm_event {
>>   };
>>   
>>   enum lsm_integrity_type {
>> -	__LSM_INT_MAX
>> +	LSM_INT_DMVERITY_SIG_VALID,
>> +	LSM_INT_DMVERITY_ROOTHASH,
>>   };
> 
> Shouldn't struct dm_verity_digest be defined next to LSM_INT_DMVERITY_ROOTHASH?
> It's the struct that's associated with it.
> 
> It seems weird to create a brand new header <linux/dm-verity.h> that just
> contains this one LSM related definition, when there's already a header for the
> LSM definitions that even includes the related value LSM_INT_DMVERITY_ROOTHASH.
> 
> - Eric

Yes they can just be in the same header. Thanks for the suggestion.

-Fan


* RE: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size
  2024-04-24  8:43 79%     ` Konstantin Taranov
@ 2024-04-25 20:17 79%       ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-25 20:17 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: RE: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with
> buf_size
> 
> > From: Long Li <longli@microsoft.com>
> > Sent: Wednesday, 24 April 2024 01:35
> > To: Konstantin Taranov <kotaranov@linux.microsoft.com>; Konstantin
> > Taranov <kotaranov@microsoft.com>; sharmaajay@microsoft.com;
> > jgg@ziepe.ca; leon@kernel.org
> > Cc: linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: RE: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe
> > with buf_size
> >
> > > Subject: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe
> > > with buf_size
> >
> > I don't understand this commit message on "duplicate" cqe. I couldn't
> > find a duplicate of it in the existing code.
> 
> If we need cqe, we could use it at cq->ibcq.cqe. The patch does not assign it as it
> is not used, but if you want I can add "cq->ibcq.cqe = attr->cqe;" in v2.
> 
> - Konstantin

I see. We don't need buf_size because it can be computed from cq->ibcq.cqe?

The commit message is confusing enough to make people think cqe is a duplicate of buf_size.

Long 


* [PATCH v3] media/*: Convert from tasklet to BH workqueue
@ 2024-04-25 18:31 19% Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-04-25 18:31 UTC (permalink / raw)
  To: linux-media
  Cc: linux-kernel, hverkuil-cisco, sean, patrice.chotard,
	andrey.utkin, anton, maintainers, mchehab

The only generic interface to execute asynchronously in the BH context is
tasklet; however, it's marked deprecated and has some design flaws. To
replace tasklets, BH workqueue support was recently added. A BH workqueue
behaves similarly to regular workqueues except that the queued work items
are executed in the BH context.

This patch converts drivers/media/* from tasklet to BH workqueue.
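
As a rough illustration of the conversion pattern used throughout this patch (embed a work item in the device struct, initialise it with a callback, recover the device inside the callback), here is a userspace sketch. It does not use the kernel API: demo_work, demo_init_work, demo_queue_work and demo_container_of are hypothetical stand-ins, and demo_queue_work simply runs the callback immediately instead of deferring it to BH context. It assumes from_work() behaves like container_of() on the embedded member.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a work item; NOT the kernel's struct work_struct. */
struct demo_work {
	void (*func)(struct demo_work *w);
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical device with an embedded work item. */
struct demo_dev {
	int finished_block;
	int handled;
	struct demo_work work;
};

/* Callback: takes the work item and walks back to the device. */
static void demo_task(struct demo_work *w)
{
	struct demo_dev *dev = demo_container_of(w, struct demo_dev, work);

	dev->handled = dev->finished_block;
}

/* Stand-in for work initialisation: records the callback. */
static void demo_init_work(struct demo_work *w,
			   void (*fn)(struct demo_work *))
{
	w->func = fn;
}

/* Stand-in for queueing: runs the callback synchronously. */
static void demo_queue_work(struct demo_work *w)
{
	if (w->func)
		w->func(w);
}
```

The member name passed when initialising the work item has to match the field embedded in the struct; otherwise the callback recovers the wrong address or the code does not compile.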

Based on the work done by Tejun Heo <tj@kernel.org>
Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
---
 drivers/media/pci/bt8xx/bt878.c               |  8 ++--
 drivers/media/pci/bt8xx/bt878.h               |  3 +-
 drivers/media/pci/bt8xx/dvb-bt8xx.c           |  9 ++--
 drivers/media/pci/ddbridge/ddbridge.h         |  2 +-
 drivers/media/pci/mantis/hopper_cards.c       |  2 +-
 drivers/media/pci/mantis/mantis_cards.c       |  2 +-
 drivers/media/pci/mantis/mantis_common.h      |  2 +-
 drivers/media/pci/mantis/mantis_dma.c         |  5 ++-
 drivers/media/pci/mantis/mantis_dma.h         |  2 +-
 drivers/media/pci/mantis/mantis_dvb.c         | 12 +++---
 drivers/media/pci/ngene/ngene-core.c          | 23 ++++++-----
 drivers/media/pci/ngene/ngene.h               |  5 ++-
 drivers/media/pci/smipcie/smipcie-main.c      | 18 ++++----
 drivers/media/pci/smipcie/smipcie.h           |  3 +-
 drivers/media/pci/ttpci/budget-av.c           |  3 +-
 drivers/media/pci/ttpci/budget-ci.c           | 27 ++++++------
 drivers/media/pci/ttpci/budget-core.c         | 10 ++---
 drivers/media/pci/ttpci/budget.h              |  5 ++-
 drivers/media/pci/tw5864/tw5864-core.c        |  2 +-
 drivers/media/pci/tw5864/tw5864-video.c       | 13 +++---
 drivers/media/pci/tw5864/tw5864.h             |  7 ++--
 drivers/media/platform/intel/pxa_camera.c     | 15 +++----
 drivers/media/platform/marvell/mcam-core.c    | 11 ++---
 drivers/media/platform/marvell/mcam-core.h    |  3 +-
 .../st/sti/c8sectpfe/c8sectpfe-core.c         | 15 +++----
 .../st/sti/c8sectpfe/c8sectpfe-core.h         |  2 +-
 drivers/media/radio/wl128x/fmdrv.h            |  7 ++--
 drivers/media/radio/wl128x/fmdrv_common.c     | 41 ++++++++++---------
 drivers/media/rc/mceusb.c                     |  2 +-
 drivers/media/usb/ttusb-dec/ttusb_dec.c       | 21 +++++-----
 30 files changed, 149 insertions(+), 131 deletions(-)

diff --git a/drivers/media/pci/bt8xx/bt878.c b/drivers/media/pci/bt8xx/bt878.c
index 90972d6952f1..ec780e037bee 100644
--- a/drivers/media/pci/bt8xx/bt878.c
+++ b/drivers/media/pci/bt8xx/bt878.c
@@ -300,8 +300,8 @@ static irqreturn_t bt878_irq(int irq, void *dev_id)
 		}
 		if (astat & BT878_ARISCI) {
 			bt->finished_block = (stat & BT878_ARISCS) >> 28;
-			if (bt->tasklet.callback)
-				tasklet_schedule(&bt->tasklet);
+			if (bt->work.func)
+				queue_work(system_bh_wq, &bt->work);
 			break;
 		}
 		count++;
@@ -478,8 +478,8 @@ static int bt878_probe(struct pci_dev *dev, const struct pci_device_id *pci_id)
 	btwrite(0, BT878_AINT_MASK);
 	bt878_num++;
 
-	if (!bt->tasklet.func)
-		tasklet_disable(&bt->tasklet);
+	if (!bt->work.func)
+		disable_work_sync(&bt->work);
 
 	return 0;
 
diff --git a/drivers/media/pci/bt8xx/bt878.h b/drivers/media/pci/bt8xx/bt878.h
index fde8db293c54..b9ce78e5116b 100644
--- a/drivers/media/pci/bt8xx/bt878.h
+++ b/drivers/media/pci/bt8xx/bt878.h
@@ -14,6 +14,7 @@
 #include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/mutex.h>
+#include <linux/workqueue.h>
 
 #include "bt848.h"
 #include "bttv.h"
@@ -120,7 +121,7 @@ struct bt878 {
 	dma_addr_t risc_dma;
 	u32 risc_pos;
 
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 	int shutdown;
 };
 
diff --git a/drivers/media/pci/bt8xx/dvb-bt8xx.c b/drivers/media/pci/bt8xx/dvb-bt8xx.c
index 390cbba6c065..8c0e1fa764a4 100644
--- a/drivers/media/pci/bt8xx/dvb-bt8xx.c
+++ b/drivers/media/pci/bt8xx/dvb-bt8xx.c
@@ -15,6 +15,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/i2c.h>
+#include <linux/workqueue.h>
 
 #include <media/dmxdev.h>
 #include <media/dvbdev.h>
@@ -39,9 +40,9 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
 
 #define IF_FREQUENCYx6 217    /* 6 * 36.16666666667MHz */
 
-static void dvb_bt8xx_task(struct tasklet_struct *t)
+static void dvb_bt8xx_task(struct work_struct *t)
 {
-	struct bt878 *bt = from_tasklet(bt, t, tasklet);
+	struct bt878 *bt = from_work(bt, t, work);
 	struct dvb_bt8xx_card *card = dev_get_drvdata(&bt->adapter->dev);
 
 	dprintk("%d\n", card->bt->finished_block);
@@ -782,7 +783,7 @@ static int dvb_bt8xx_load_card(struct dvb_bt8xx_card *card, u32 type)
 		goto err_disconnect_frontend;
 	}
 
-	tasklet_setup(&card->bt->tasklet, dvb_bt8xx_task);
+	INIT_WORK(&card->bt->work, dvb_bt8xx_task);
 
 	frontend_init(card, type);
 
@@ -922,7 +923,7 @@ static void dvb_bt8xx_remove(struct bttv_sub_device *sub)
 	dprintk("dvb_bt8xx: unloading card%d\n", card->bttv_nr);
 
 	bt878_stop(card->bt);
-	tasklet_kill(&card->bt->tasklet);
+	cancel_work_sync(&card->bt->work);
 	dvb_net_release(&card->dvbnet);
 	card->demux.dmx.remove_frontend(&card->demux.dmx, &card->fe_mem);
 	card->demux.dmx.remove_frontend(&card->demux.dmx, &card->fe_hw);
diff --git a/drivers/media/pci/ddbridge/ddbridge.h b/drivers/media/pci/ddbridge/ddbridge.h
index f3699dbd193f..6044f3085fad 100644
--- a/drivers/media/pci/ddbridge/ddbridge.h
+++ b/drivers/media/pci/ddbridge/ddbridge.h
@@ -298,7 +298,7 @@ struct ddb_link {
 	spinlock_t             lock; /* lock link access */
 	struct mutex           flash_mutex; /* lock flash access */
 	struct ddb_lnb         lnb;
-	struct tasklet_struct  tasklet;
+	struct work_struct work;
 	struct ddb_ids         ids;
 
 	spinlock_t             temp_lock; /* lock temp chip access */
diff --git a/drivers/media/pci/mantis/hopper_cards.c b/drivers/media/pci/mantis/hopper_cards.c
index c0bd5d7e148b..869ea88c4893 100644
--- a/drivers/media/pci/mantis/hopper_cards.c
+++ b/drivers/media/pci/mantis/hopper_cards.c
@@ -116,7 +116,7 @@ static irqreturn_t hopper_irq_handler(int irq, void *dev_id)
 	if (stat & MANTIS_INT_RISCI) {
 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[8]);
 		mantis->busy_block = (stat & MANTIS_INT_RISCSTAT) >> 28;
-		tasklet_schedule(&mantis->tasklet);
+		queue_work(system_bh_wq, &mantis->work);
 	}
 	if (stat & MANTIS_INT_I2CDONE) {
 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[9]);
diff --git a/drivers/media/pci/mantis/mantis_cards.c b/drivers/media/pci/mantis/mantis_cards.c
index 906e4500d87d..cb124b19e36e 100644
--- a/drivers/media/pci/mantis/mantis_cards.c
+++ b/drivers/media/pci/mantis/mantis_cards.c
@@ -125,7 +125,7 @@ static irqreturn_t mantis_irq_handler(int irq, void *dev_id)
 	if (stat & MANTIS_INT_RISCI) {
 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[8]);
 		mantis->busy_block = (stat & MANTIS_INT_RISCSTAT) >> 28;
-		tasklet_schedule(&mantis->tasklet);
+		queue_work(system_bh_wq, &mantis->work);
 	}
 	if (stat & MANTIS_INT_I2CDONE) {
 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[9]);
diff --git a/drivers/media/pci/mantis/mantis_common.h b/drivers/media/pci/mantis/mantis_common.h
index d88ac280226c..cf4b604b55f5 100644
--- a/drivers/media/pci/mantis/mantis_common.h
+++ b/drivers/media/pci/mantis/mantis_common.h
@@ -125,7 +125,7 @@ struct mantis_pci {
 	__le32			*risc_cpu;
 	dma_addr_t		risc_dma;
 
-	struct tasklet_struct	tasklet;
+	struct work_struct	work;
 	spinlock_t		intmask_lock;
 
 	struct i2c_adapter	adapter;
diff --git a/drivers/media/pci/mantis/mantis_dma.c b/drivers/media/pci/mantis/mantis_dma.c
index 80c843936493..c85f9b84a2c6 100644
--- a/drivers/media/pci/mantis/mantis_dma.c
+++ b/drivers/media/pci/mantis/mantis_dma.c
@@ -15,6 +15,7 @@
 #include <linux/signal.h>
 #include <linux/sched.h>
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 
 #include <media/dmxdev.h>
 #include <media/dvbdev.h>
@@ -200,9 +201,9 @@ void mantis_dma_stop(struct mantis_pci *mantis)
 }
 
 
-void mantis_dma_xfer(struct tasklet_struct *t)
+void mantis_dma_xfer(struct work_struct *t)
 {
-	struct mantis_pci *mantis = from_tasklet(mantis, t, tasklet);
+	struct mantis_pci *mantis = from_work(mantis, t, work);
 	struct mantis_hwconfig *config = mantis->hwconfig;
 
 	while (mantis->last_block != mantis->busy_block) {
diff --git a/drivers/media/pci/mantis/mantis_dma.h b/drivers/media/pci/mantis/mantis_dma.h
index 37da982c9c29..5db0d3728f15 100644
--- a/drivers/media/pci/mantis/mantis_dma.h
+++ b/drivers/media/pci/mantis/mantis_dma.h
@@ -13,6 +13,6 @@ extern int mantis_dma_init(struct mantis_pci *mantis);
 extern int mantis_dma_exit(struct mantis_pci *mantis);
 extern void mantis_dma_start(struct mantis_pci *mantis);
 extern void mantis_dma_stop(struct mantis_pci *mantis);
-extern void mantis_dma_xfer(struct tasklet_struct *t);
+extern void mantis_dma_xfer(struct work_struct *t);
 
 #endif /* __MANTIS_DMA_H */
diff --git a/drivers/media/pci/mantis/mantis_dvb.c b/drivers/media/pci/mantis/mantis_dvb.c
index c7ba4a76e608..f640635de170 100644
--- a/drivers/media/pci/mantis/mantis_dvb.c
+++ b/drivers/media/pci/mantis/mantis_dvb.c
@@ -105,7 +105,7 @@ static int mantis_dvb_start_feed(struct dvb_demux_feed *dvbdmxfeed)
 	if (mantis->feeds == 1)	 {
 		dprintk(MANTIS_DEBUG, 1, "mantis start feed & dma");
 		mantis_dma_start(mantis);
-		tasklet_enable(&mantis->tasklet);
+		enable_and_queue_work(system_bh_wq, &mantis->work);
 	}
 
 	return mantis->feeds;
@@ -125,7 +125,7 @@ static int mantis_dvb_stop_feed(struct dvb_demux_feed *dvbdmxfeed)
 	mantis->feeds--;
 	if (mantis->feeds == 0) {
 		dprintk(MANTIS_DEBUG, 1, "mantis stop feed and dma");
-		tasklet_disable(&mantis->tasklet);
+		disable_work_sync(&mantis->work);
 		mantis_dma_stop(mantis);
 	}
 
@@ -205,8 +205,8 @@ int mantis_dvb_init(struct mantis_pci *mantis)
 	}
 
 	dvb_net_init(&mantis->dvb_adapter, &mantis->dvbnet, &mantis->demux.dmx);
-	tasklet_setup(&mantis->tasklet, mantis_dma_xfer);
-	tasklet_disable(&mantis->tasklet);
+	INIT_WORK(&mantis->work, mantis_dma_xfer);
+	disable_work_sync(&mantis->work);
 	if (mantis->hwconfig) {
 		result = config->frontend_init(mantis, mantis->fe);
 		if (result < 0) {
@@ -235,7 +235,7 @@ int mantis_dvb_init(struct mantis_pci *mantis)
 
 	/* Error conditions ..	*/
 err5:
-	tasklet_kill(&mantis->tasklet);
+	cancel_work_sync(&mantis->work);
 	dvb_net_release(&mantis->dvbnet);
 	if (mantis->fe) {
 		dvb_unregister_frontend(mantis->fe);
@@ -273,7 +273,7 @@ int mantis_dvb_exit(struct mantis_pci *mantis)
 		dvb_frontend_detach(mantis->fe);
 	}
 
-	tasklet_kill(&mantis->tasklet);
+	cancel_work_sync(&mantis->work);
 	dvb_net_release(&mantis->dvbnet);
 
 	mantis->demux.dmx.remove_frontend(&mantis->demux.dmx, &mantis->fe_mem);
diff --git a/drivers/media/pci/ngene/ngene-core.c b/drivers/media/pci/ngene/ngene-core.c
index 7481f553f959..5211d6796748 100644
--- a/drivers/media/pci/ngene/ngene-core.c
+++ b/drivers/media/pci/ngene/ngene-core.c
@@ -21,6 +21,7 @@
 #include <linux/byteorder/generic.h>
 #include <linux/firmware.h>
 #include <linux/vmalloc.h>
+#include <linux/workqueue.h>
 
 #include "ngene.h"
 
@@ -50,9 +51,9 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
 /* nGene interrupt handler **************************************************/
 /****************************************************************************/
 
-static void event_tasklet(struct tasklet_struct *t)
+static void event_work(struct work_struct *t)
 {
-	struct ngene *dev = from_tasklet(dev, t, event_tasklet);
+	struct ngene *dev = from_work(dev, t, event_work);
 
 	while (dev->EventQueueReadIndex != dev->EventQueueWriteIndex) {
 		struct EVENT_BUFFER Event =
@@ -68,9 +69,9 @@ static void event_tasklet(struct tasklet_struct *t)
 	}
 }
 
-static void demux_tasklet(struct tasklet_struct *t)
+static void demux_work(struct work_struct *t)
 {
-	struct ngene_channel *chan = from_tasklet(chan, t, demux_tasklet);
+	struct ngene_channel *chan = from_work(chan, t, demux_work);
 	struct device *pdev = &chan->dev->pci_dev->dev;
 	struct SBufferHeader *Cur = chan->nextBuffer;
 
@@ -204,7 +205,7 @@ static irqreturn_t irq_handler(int irq, void *dev_id)
 			dev->EventQueueOverflowFlag = 1;
 		}
 		dev->EventBuffer->EventStatus &= ~0x80;
-		tasklet_schedule(&dev->event_tasklet);
+		queue_work(system_bh_wq, &dev->event_work);
 		rc = IRQ_HANDLED;
 	}
 
@@ -217,8 +218,8 @@ static irqreturn_t irq_handler(int irq, void *dev_id)
 			     ngeneBuffer.SR.Flags & 0xC0) == 0x80) {
 				dev->channel[i].nextBuffer->
 					ngeneBuffer.SR.Flags |= 0x40;
-				tasklet_schedule(
-					&dev->channel[i].demux_tasklet);
+				queue_work(system_bh_wq,
+					&dev->channel[i].demux_work);
 				rc = IRQ_HANDLED;
 			}
 		}
@@ -1181,7 +1182,7 @@ static void ngene_init(struct ngene *dev)
 	struct device *pdev = &dev->pci_dev->dev;
 	int i;
 
-	tasklet_setup(&dev->event_tasklet, event_tasklet);
+	INIT_WORK(&dev->event_work, event_work);
 
 	memset_io(dev->iomem + 0xc000, 0x00, 0x220);
 	memset_io(dev->iomem + 0xc400, 0x00, 0x100);
@@ -1395,7 +1396,7 @@ static void release_channel(struct ngene_channel *chan)
 	if (chan->running)
 		set_transfer(chan, 0);
 
-	tasklet_kill(&chan->demux_tasklet);
+	cancel_work_sync(&chan->demux_work);
 
 	if (chan->ci_dev) {
 		dvb_unregister_device(chan->ci_dev);
@@ -1445,7 +1446,7 @@ static int init_channel(struct ngene_channel *chan)
 	struct ngene_info *ni = dev->card_info;
 	int io = ni->io_type[nr];
 
-	tasklet_setup(&chan->demux_tasklet, demux_tasklet);
+	INIT_WORK(&chan->demux_work, demux_work);
 	chan->users = 0;
 	chan->type = io;
 	chan->mode = chan->type;	/* for now only one mode */
@@ -1647,7 +1648,7 @@ void ngene_remove(struct pci_dev *pdev)
 	struct ngene *dev = pci_get_drvdata(pdev);
 	int i;
 
-	tasklet_kill(&dev->event_tasklet);
+	cancel_work_sync(&dev->event_work);
 	for (i = MAX_STREAM - 1; i >= 0; i--)
 		release_channel(&dev->channel[i]);
 	if (dev->ci.en)
diff --git a/drivers/media/pci/ngene/ngene.h b/drivers/media/pci/ngene/ngene.h
index d1d7da84cd9d..c2a23f6dbe09 100644
--- a/drivers/media/pci/ngene/ngene.h
+++ b/drivers/media/pci/ngene/ngene.h
@@ -16,6 +16,7 @@
 #include <linux/scatterlist.h>
 
 #include <linux/dvb/frontend.h>
+#include <linux/workqueue.h>
 
 #include <media/dmxdev.h>
 #include <media/dvbdev.h>
@@ -621,7 +622,7 @@ struct ngene_channel {
 	int                   users;
 	struct video_device  *v4l_dev;
 	struct dvb_device    *ci_dev;
-	struct tasklet_struct demux_tasklet;
+	struct work_struct demux_work;
 
 	struct SBufferHeader *nextBuffer;
 	enum KSSTATE          State;
@@ -717,7 +718,7 @@ struct ngene {
 	struct EVENT_BUFFER   EventQueue[EVENT_QUEUE_SIZE];
 	int                   EventQueueOverflowCount;
 	int                   EventQueueOverflowFlag;
-	struct tasklet_struct event_tasklet;
+	struct work_struct event_work;
 	struct EVENT_BUFFER  *EventBuffer;
 	int                   EventQueueWriteIndex;
 	int                   EventQueueReadIndex;
diff --git a/drivers/media/pci/smipcie/smipcie-main.c b/drivers/media/pci/smipcie/smipcie-main.c
index 0c300d019d9c..7da6bb55660b 100644
--- a/drivers/media/pci/smipcie/smipcie-main.c
+++ b/drivers/media/pci/smipcie/smipcie-main.c
@@ -279,10 +279,10 @@ static void smi_port_clearInterrupt(struct smi_port *port)
 		(port->_dmaInterruptCH0 | port->_dmaInterruptCH1));
 }
 
-/* tasklet handler: DMA data to dmx.*/
-static void smi_dma_xfer(struct tasklet_struct *t)
+/* work handler: DMA data to dmx.*/
+static void smi_dma_xfer(struct work_struct *t)
 {
-	struct smi_port *port = from_tasklet(port, t, tasklet);
+	struct smi_port *port = from_work(port, t, work);
 	struct smi_dev *dev = port->dev;
 	u32 intr_status, finishedData, dmaManagement;
 	u8 dmaChan0State, dmaChan1State;
@@ -426,8 +426,8 @@ static int smi_port_init(struct smi_port *port, int dmaChanUsed)
 	}
 
 	smi_port_disableInterrupt(port);
-	tasklet_setup(&port->tasklet, smi_dma_xfer);
-	tasklet_disable(&port->tasklet);
+	INIT_WORK(&port->work, smi_dma_xfer);
+	disable_work_sync(&port->work);
 	port->enable = 1;
 	return 0;
 err:
@@ -438,7 +438,7 @@ static int smi_port_init(struct smi_port *port, int dmaChanUsed)
 static void smi_port_exit(struct smi_port *port)
 {
 	smi_port_disableInterrupt(port);
-	tasklet_kill(&port->tasklet);
+	cancel_work_sync(&port->work);
 	smi_port_dma_free(port);
 	port->enable = 0;
 }
@@ -452,7 +452,7 @@ static int smi_port_irq(struct smi_port *port, u32 int_status)
 		smi_port_disableInterrupt(port);
 		port->_int_status = int_status;
 		smi_port_clearInterrupt(port);
-		tasklet_schedule(&port->tasklet);
+		queue_work(system_bh_wq, &port->work);
 		handled = 1;
 	}
 	return handled;
@@ -823,7 +823,7 @@ static int smi_start_feed(struct dvb_demux_feed *dvbdmxfeed)
 		smi_port_clearInterrupt(port);
 		smi_port_enableInterrupt(port);
 		smi_write(port->DMA_MANAGEMENT, dmaManagement);
-		tasklet_enable(&port->tasklet);
+		enable_and_queue_work(system_bh_wq, &port->work);
 	}
 	return port->users;
 }
@@ -837,7 +837,7 @@ static int smi_stop_feed(struct dvb_demux_feed *dvbdmxfeed)
 	if (--port->users)
 		return port->users;
 
-	tasklet_disable(&port->tasklet);
+	disable_work_sync(&port->work);
 	smi_port_disableInterrupt(port);
 	smi_clear(port->DMA_MANAGEMENT, 0x30003);
 	return 0;
diff --git a/drivers/media/pci/smipcie/smipcie.h b/drivers/media/pci/smipcie/smipcie.h
index 2b5e0154814c..f124d2cdead6 100644
--- a/drivers/media/pci/smipcie/smipcie.h
+++ b/drivers/media/pci/smipcie/smipcie.h
@@ -17,6 +17,7 @@
 #include <linux/pci.h>
 #include <linux/dma-mapping.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 #include <media/rc-core.h>
 
 #include <media/demux.h>
@@ -257,7 +258,7 @@ struct smi_port {
 	u32 _dmaInterruptCH0;
 	u32 _dmaInterruptCH1;
 	u32 _int_status;
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 	/* dvb */
 	struct dmx_frontend hw_frontend;
 	struct dmx_frontend mem_frontend;
diff --git a/drivers/media/pci/ttpci/budget-av.c b/drivers/media/pci/ttpci/budget-av.c
index a47c5850ef87..6e43b1a01191 100644
--- a/drivers/media/pci/ttpci/budget-av.c
+++ b/drivers/media/pci/ttpci/budget-av.c
@@ -37,6 +37,7 @@
 #include <linux/interrupt.h>
 #include <linux/input.h>
 #include <linux/spinlock.h>
+#include <linux/workqueue.h>
 
 #include <media/dvb_ca_en50221.h>
 
@@ -55,7 +56,7 @@ struct budget_av {
 	struct video_device vd;
 	int cur_input;
 	int has_saa7113;
-	struct tasklet_struct ciintf_irq_tasklet;
+	struct work_struct ciintf_irq_work;
 	int slot_status;
 	struct dvb_ca_en50221 ca;
 	u8 reinitialise_demod:1;
diff --git a/drivers/media/pci/ttpci/budget-ci.c b/drivers/media/pci/ttpci/budget-ci.c
index 66e1a004ee43..11e0ed62707e 100644
--- a/drivers/media/pci/ttpci/budget-ci.c
+++ b/drivers/media/pci/ttpci/budget-ci.c
@@ -17,6 +17,7 @@
 #include <linux/slab.h>
 #include <linux/interrupt.h>
 #include <linux/spinlock.h>
+#include <linux/workqueue.h>
 #include <media/rc-core.h>
 
 #include "budget.h"
@@ -80,7 +81,7 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
 
 struct budget_ci_ir {
 	struct rc_dev *dev;
-	struct tasklet_struct msp430_irq_tasklet;
+	struct work_struct msp430_irq_work;
 	char name[72]; /* 40 + 32 for (struct saa7146_dev).name */
 	char phys[32];
 	int rc5_device;
@@ -91,7 +92,7 @@ struct budget_ci_ir {
 
 struct budget_ci {
 	struct budget budget;
-	struct tasklet_struct ciintf_irq_tasklet;
+	struct work_struct ciintf_irq_work;
 	int slot_status;
 	int ci_irq;
 	struct dvb_ca_en50221 ca;
@@ -99,9 +100,9 @@ struct budget_ci {
 	u8 tuner_pll_address; /* used for philips_tdm1316l configs */
 };
 
-static void msp430_ir_interrupt(struct tasklet_struct *t)
+static void msp430_ir_interrupt(struct work_struct *t)
 {
-	struct budget_ci_ir *ir = from_tasklet(ir, t, msp430_irq_tasklet);
+	struct budget_ci_ir *ir = from_work(ir, t, msp430_irq_work);
 	struct budget_ci *budget_ci = container_of(ir, typeof(*budget_ci), ir);
 	struct rc_dev *dev = budget_ci->ir.dev;
 	u32 command = ttpci_budget_debiread(&budget_ci->budget, DEBINOSWAP, DEBIADDR_IR, 2, 1, 0) >> 8;
@@ -230,7 +231,7 @@ static int msp430_ir_init(struct budget_ci *budget_ci)
 
 	budget_ci->ir.dev = dev;
 
-	tasklet_setup(&budget_ci->ir.msp430_irq_tasklet, msp430_ir_interrupt);
+	INIT_WORK(&budget_ci->ir.msp430_irq_work, msp430_ir_interrupt);
 
 	SAA7146_IER_ENABLE(saa, MASK_06);
 	saa7146_setgpio(saa, 3, SAA7146_GPIO_IRQHI);
@@ -244,7 +245,7 @@ static void msp430_ir_deinit(struct budget_ci *budget_ci)
 
 	SAA7146_IER_DISABLE(saa, MASK_06);
 	saa7146_setgpio(saa, 3, SAA7146_GPIO_INPUT);
-	tasklet_kill(&budget_ci->ir.msp430_irq_tasklet);
+	cancel_work_sync(&budget_ci->ir.msp430_irq_work);
 
 	rc_unregister_device(budget_ci->ir.dev);
 }
@@ -348,10 +349,10 @@ static int ciintf_slot_ts_enable(struct dvb_ca_en50221 *ca, int slot)
 	return 0;
 }
 
-static void ciintf_interrupt(struct tasklet_struct *t)
+static void ciintf_interrupt(struct work_struct *t)
 {
-	struct budget_ci *budget_ci = from_tasklet(budget_ci, t,
-						   ciintf_irq_tasklet);
+	struct budget_ci *budget_ci = from_work(budget_ci, t,
+						   ciintf_irq_work);
 	struct saa7146_dev *saa = budget_ci->budget.dev;
 	unsigned int flags;
 
@@ -492,7 +493,7 @@ static int ciintf_init(struct budget_ci *budget_ci)
 
 	// Setup CI slot IRQ
 	if (budget_ci->ci_irq) {
-		tasklet_setup(&budget_ci->ciintf_irq_tasklet, ciintf_interrupt);
+		INIT_WORK(&budget_ci->ciintf_irq_work, ciintf_interrupt);
 		if (budget_ci->slot_status != SLOTSTATUS_NONE) {
 			saa7146_setgpio(saa, 0, SAA7146_GPIO_IRQLO);
 		} else {
@@ -532,7 +533,7 @@ static void ciintf_deinit(struct budget_ci *budget_ci)
 	if (budget_ci->ci_irq) {
 		SAA7146_IER_DISABLE(saa, MASK_03);
 		saa7146_setgpio(saa, 0, SAA7146_GPIO_INPUT);
-		tasklet_kill(&budget_ci->ciintf_irq_tasklet);
+		cancel_work_sync(&budget_ci->ciintf_irq_work);
 	}
 
 	// reset interface
@@ -558,13 +559,13 @@ static void budget_ci_irq(struct saa7146_dev *dev, u32 * isr)
 	dprintk(8, "dev: %p, budget_ci: %p\n", dev, budget_ci);
 
 	if (*isr & MASK_06)
-		tasklet_schedule(&budget_ci->ir.msp430_irq_tasklet);
+		queue_work(system_bh_wq, &budget_ci->ir.msp430_irq_work);
 
 	if (*isr & MASK_10)
 		ttpci_budget_irq10_handler(dev, isr);
 
 	if ((*isr & MASK_03) && (budget_ci->budget.ci_present) && (budget_ci->ci_irq))
-		tasklet_schedule(&budget_ci->ciintf_irq_tasklet);
+		queue_work(system_bh_wq, &budget_ci->ciintf_irq_work);
 }
 
 static u8 philips_su1278_tt_inittab[] = {
diff --git a/drivers/media/pci/ttpci/budget-core.c b/drivers/media/pci/ttpci/budget-core.c
index 25f44c3eebf3..3443c12dc9f2 100644
--- a/drivers/media/pci/ttpci/budget-core.c
+++ b/drivers/media/pci/ttpci/budget-core.c
@@ -171,9 +171,9 @@ static int budget_read_fe_status(struct dvb_frontend *fe,
 	return ret;
 }
 
-static void vpeirq(struct tasklet_struct *t)
+static void vpeirq(struct work_struct *t)
 {
-	struct budget *budget = from_tasklet(budget, t, vpe_tasklet);
+	struct budget *budget = from_work(budget, t, vpe_work);
 	u8 *mem = (u8 *) (budget->grabbing);
 	u32 olddma = budget->ttbp;
 	u32 newdma = saa7146_read(budget->dev, PCI_VDP3);
@@ -520,7 +520,7 @@ int ttpci_budget_init(struct budget *budget, struct saa7146_dev *dev,
 	/* upload all */
 	saa7146_write(dev, GPIO_CTRL, 0x000000);
 
-	tasklet_setup(&budget->vpe_tasklet, vpeirq);
+	INIT_WORK(&budget->vpe_work, vpeirq);
 
 	/* frontend power on */
 	if (bi->type != BUDGET_FS_ACTIVY)
@@ -557,7 +557,7 @@ int ttpci_budget_deinit(struct budget *budget)
 
 	budget_unregister(budget);
 
-	tasklet_kill(&budget->vpe_tasklet);
+	cancel_work_sync(&budget->vpe_work);
 
 	saa7146_vfree_destroy_pgtable(dev->pci, budget->grabbing, &budget->pt);
 
@@ -575,7 +575,7 @@ void ttpci_budget_irq10_handler(struct saa7146_dev *dev, u32 * isr)
 	dprintk(8, "dev: %p, budget: %p\n", dev, budget);
 
 	if (*isr & MASK_10)
-		tasklet_schedule(&budget->vpe_tasklet);
+		queue_work(system_bh_wq, &budget->vpe_work);
 }
 
 void ttpci_budget_set_video_port(struct saa7146_dev *dev, int video_port)
diff --git a/drivers/media/pci/ttpci/budget.h b/drivers/media/pci/ttpci/budget.h
index bd87432e6cde..a3ee75e326b4 100644
--- a/drivers/media/pci/ttpci/budget.h
+++ b/drivers/media/pci/ttpci/budget.h
@@ -12,6 +12,7 @@
 
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/workqueue.h>
 
 #include <media/drv-intf/saa7146.h>
 
@@ -49,8 +50,8 @@ struct budget {
 	unsigned char *grabbing;
 	struct saa7146_pgtable pt;
 
-	struct tasklet_struct fidb_tasklet;
-	struct tasklet_struct vpe_tasklet;
+	struct work_struct fidb_work;
+	struct work_struct vpe_work;
 
 	struct dmxdev dmxdev;
 	struct dvb_demux demux;
diff --git a/drivers/media/pci/tw5864/tw5864-core.c b/drivers/media/pci/tw5864/tw5864-core.c
index 560ff1ddcc83..a58c268e94a8 100644
--- a/drivers/media/pci/tw5864/tw5864-core.c
+++ b/drivers/media/pci/tw5864/tw5864-core.c
@@ -144,7 +144,7 @@ static void tw5864_h264_isr(struct tw5864_dev *dev)
 		cur_frame->gop_seqno = input->frame_gop_seqno;
 
 		dev->h264_buf_w_index = next_frame_index;
-		tasklet_schedule(&dev->tasklet);
+		queue_work(system_bh_wq, &dev->work);
 
 		cur_frame = next_frame;
 
diff --git a/drivers/media/pci/tw5864/tw5864-video.c b/drivers/media/pci/tw5864/tw5864-video.c
index 8b1aae4b6319..ac2249626506 100644
--- a/drivers/media/pci/tw5864/tw5864-video.c
+++ b/drivers/media/pci/tw5864/tw5864-video.c
@@ -6,6 +6,7 @@
  */
 
 #include <linux/module.h>
+#include <linux/workqueue.h>
 #include <media/v4l2-common.h>
 #include <media/v4l2-event.h>
 #include <media/videobuf2-dma-contig.h>
@@ -175,7 +176,7 @@ static const unsigned int intra4x4_lambda3[] = {
 static v4l2_std_id tw5864_get_v4l2_std(enum tw5864_vid_std std);
 static enum tw5864_vid_std tw5864_from_v4l2_std(v4l2_std_id v4l2_std);
 
-static void tw5864_handle_frame_task(struct tasklet_struct *t);
+static void tw5864_handle_frame_task(struct work_struct *t);
 static void tw5864_handle_frame(struct tw5864_h264_frame *frame);
 static void tw5864_frame_interval_set(struct tw5864_input *input);
 
@@ -1062,7 +1063,7 @@ int tw5864_video_init(struct tw5864_dev *dev, int *video_nr)
 	dev->irqmask |= TW5864_INTR_VLC_DONE | TW5864_INTR_TIMER;
 	tw5864_irqmask_apply(dev);
 
-	tasklet_setup(&dev->tasklet, tw5864_handle_frame_task);
+	INIT_WORK(&dev->work, tw5864_handle_frame_task);
 
 	for (i = 0; i < TW5864_INPUTS; i++) {
 		dev->inputs[i].root = dev;
@@ -1079,7 +1080,7 @@ int tw5864_video_init(struct tw5864_dev *dev, int *video_nr)
 	for (i = last_input_nr_registered; i >= 0; i--)
 		tw5864_video_input_fini(&dev->inputs[i]);
 
-	tasklet_kill(&dev->tasklet);
+	cancel_work_sync(&dev->work);
 
 free_dma:
 	for (i = last_dma_allocated; i >= 0; i--) {
@@ -1198,7 +1199,7 @@ void tw5864_video_fini(struct tw5864_dev *dev)
 {
 	int i;
 
-	tasklet_kill(&dev->tasklet);
+	cancel_work_sync(&dev->work);
 
 	for (i = 0; i < TW5864_INPUTS; i++)
 		tw5864_video_input_fini(&dev->inputs[i]);
@@ -1315,9 +1316,9 @@ static int tw5864_is_motion_triggered(struct tw5864_h264_frame *frame)
 	return detected;
 }
 
-static void tw5864_handle_frame_task(struct tasklet_struct *t)
+static void tw5864_handle_frame_task(struct work_struct *t)
 {
-	struct tw5864_dev *dev = from_tasklet(dev, t, tasklet);
+	struct tw5864_dev *dev = from_work(dev, t, work);
 	unsigned long flags;
 	int batch_size = H264_BUF_CNT;
 
diff --git a/drivers/media/pci/tw5864/tw5864.h b/drivers/media/pci/tw5864/tw5864.h
index a8b6fbd5b710..278373859098 100644
--- a/drivers/media/pci/tw5864/tw5864.h
+++ b/drivers/media/pci/tw5864/tw5864.h
@@ -12,6 +12,7 @@
 #include <linux/mutex.h>
 #include <linux/io.h>
 #include <linux/interrupt.h>
+#include <linux/workqueue.h>
 
 #include <media/v4l2-common.h>
 #include <media/v4l2-ioctl.h>
@@ -85,7 +86,7 @@ struct tw5864_input {
 	int nr; /* input number */
 	struct tw5864_dev *root;
 	struct mutex lock; /* used for vidq and vdev */
-	spinlock_t slock; /* used for sync between ISR, tasklet & V4L2 API */
+	spinlock_t slock; /* used for sync between ISR, work & V4L2 API */
 	struct video_device vdev;
 	struct v4l2_ctrl_handler hdl;
 	struct vb2_queue vidq;
@@ -142,7 +143,7 @@ struct tw5864_h264_frame {
 
 /* global device status */
 struct tw5864_dev {
-	spinlock_t slock; /* used for sync between ISR, tasklet & V4L2 API */
+	spinlock_t slock; /* used for sync between ISR, work & V4L2 API */
 	struct v4l2_device v4l2_dev;
 	struct tw5864_input inputs[TW5864_INPUTS];
 #define H264_BUF_CNT 4
@@ -150,7 +151,7 @@ struct tw5864_dev {
 	int h264_buf_r_index;
 	int h264_buf_w_index;
 
-	struct tasklet_struct tasklet;
+	struct work_struct work;
 
 	int encoder_busy;
 	/* Input number to check next for ready raw picture (in RR fashion) */
diff --git a/drivers/media/platform/intel/pxa_camera.c b/drivers/media/platform/intel/pxa_camera.c
index d904952bf00e..fdbf363237d6 100644
--- a/drivers/media/platform/intel/pxa_camera.c
+++ b/drivers/media/platform/intel/pxa_camera.c
@@ -43,6 +43,7 @@
 #include <linux/videodev2.h>
 
 #include <linux/platform_data/media/camera-pxa.h>
+#include <linux/workqueue.h>
 
 #define PXA_CAM_VERSION "0.0.6"
 #define PXA_CAM_DRV_NAME "pxa27x-camera"
@@ -683,7 +684,7 @@ struct pxa_camera_dev {
 	unsigned int		buf_sequence;
 
 	struct pxa_buffer	*active;
-	struct tasklet_struct	task_eof;
+	struct work_struct	task_eof;
 
 	u32			save_cicr[5];
 };
@@ -1146,9 +1147,9 @@ static void pxa_camera_deactivate(struct pxa_camera_dev *pcdev)
 	clk_disable_unprepare(pcdev->clk);
 }
 
-static void pxa_camera_eof(struct tasklet_struct *t)
+static void pxa_camera_eof(struct work_struct *t)
 {
-	struct pxa_camera_dev *pcdev = from_tasklet(pcdev, t, task_eof);
+	struct pxa_camera_dev *pcdev = from_work(pcdev, t, task_eof);
 	unsigned long cifr;
 	struct pxa_buffer *buf;
 
@@ -1185,7 +1186,7 @@ static irqreturn_t pxa_camera_irq(int irq, void *data)
 	if (status & CISR_EOF) {
 		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_EOFM;
 		__raw_writel(cicr0, pcdev->base + CICR0);
-		tasklet_schedule(&pcdev->task_eof);
+		queue_work(system_bh_wq, &pcdev->task_eof);
 	}
 
 	return IRQ_HANDLED;
@@ -2383,7 +2384,7 @@ static int pxa_camera_probe(struct platform_device *pdev)
 		}
 	}
 
-	tasklet_setup(&pcdev->task_eof, pxa_camera_eof);
+	INIT_WORK(&pcdev->task_eof, pxa_camera_eof);
 
 	pxa_camera_activate(pcdev);
 
@@ -2409,7 +2410,7 @@ static int pxa_camera_probe(struct platform_device *pdev)
 	return 0;
 exit_deactivate:
 	pxa_camera_deactivate(pcdev);
-	tasklet_kill(&pcdev->task_eof);
+	cancel_work_sync(&pcdev->task_eof);
 exit_free_dma:
 	dma_release_channel(pcdev->dma_chans[2]);
 exit_free_dma_u:
@@ -2428,7 +2429,7 @@ static void pxa_camera_remove(struct platform_device *pdev)
 	struct pxa_camera_dev *pcdev = platform_get_drvdata(pdev);
 
 	pxa_camera_deactivate(pcdev);
-	tasklet_kill(&pcdev->task_eof);
+	cancel_work_sync(&pcdev->task_eof);
 	dma_release_channel(pcdev->dma_chans[0]);
 	dma_release_channel(pcdev->dma_chans[1]);
 	dma_release_channel(pcdev->dma_chans[2]);
diff --git a/drivers/media/platform/marvell/mcam-core.c b/drivers/media/platform/marvell/mcam-core.c
index 66688b4aece5..d6b96a7039be 100644
--- a/drivers/media/platform/marvell/mcam-core.c
+++ b/drivers/media/platform/marvell/mcam-core.c
@@ -25,6 +25,7 @@
 #include <linux/clk-provider.h>
 #include <linux/videodev2.h>
 #include <linux/pm_runtime.h>
+#include <linux/workqueue.h>
 #include <media/v4l2-device.h>
 #include <media/v4l2-ioctl.h>
 #include <media/v4l2-ctrls.h>
@@ -439,9 +440,9 @@ static void mcam_ctlr_dma_vmalloc(struct mcam_camera *cam)
 /*
  * Copy data out to user space in the vmalloc case
  */
-static void mcam_frame_tasklet(struct tasklet_struct *t)
+static void mcam_frame_work(struct work_struct *t)
 {
-	struct mcam_camera *cam = from_tasklet(cam, t, s_tasklet);
+	struct mcam_camera *cam = from_work(cam, t, s_work);
 	int i;
 	unsigned long flags;
 	struct mcam_vb_buffer *buf;
@@ -493,7 +494,7 @@ static int mcam_check_dma_buffers(struct mcam_camera *cam)
 
 static void mcam_vmalloc_done(struct mcam_camera *cam, int frame)
 {
-	tasklet_schedule(&cam->s_tasklet);
+	queue_work(system_bh_wq, &cam->s_work);
 }
 
 #else /* MCAM_MODE_VMALLOC */
@@ -1305,7 +1306,7 @@ static int mcam_setup_vb2(struct mcam_camera *cam)
 		break;
 	case B_vmalloc:
 #ifdef MCAM_MODE_VMALLOC
-		tasklet_setup(&cam->s_tasklet, mcam_frame_tasklet);
+		INIT_WORK(&cam->s_work, mcam_frame_work);
 		vq->ops = &mcam_vb2_ops;
 		vq->mem_ops = &vb2_vmalloc_memops;
 		cam->dma_setup = mcam_ctlr_dma_vmalloc;
diff --git a/drivers/media/platform/marvell/mcam-core.h b/drivers/media/platform/marvell/mcam-core.h
index 51e66db45af6..0d4b953dbb23 100644
--- a/drivers/media/platform/marvell/mcam-core.h
+++ b/drivers/media/platform/marvell/mcam-core.h
@@ -9,6 +9,7 @@
 
 #include <linux/list.h>
 #include <linux/clk-provider.h>
+#include <linux/workqueue.h>
 #include <media/v4l2-common.h>
 #include <media/v4l2-ctrls.h>
 #include <media/v4l2-dev.h>
@@ -167,7 +168,7 @@ struct mcam_camera {
 	unsigned int dma_buf_size;	/* allocated size */
 	void *dma_bufs[MAX_DMA_BUFS];	/* Internal buffer addresses */
 	dma_addr_t dma_handles[MAX_DMA_BUFS]; /* Buffer bus addresses */
-	struct tasklet_struct s_tasklet;
+	struct work_struct s_work;
 #endif
 	unsigned int sequence;		/* Frame sequence number */
 	unsigned int buf_seq[MAX_DMA_BUFS]; /* Sequence for individual bufs */
diff --git a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c
index e4cf27b5a072..dc817cff4121 100644
--- a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c
+++ b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c
@@ -33,6 +33,7 @@
 #include <linux/time.h>
 #include <linux/usb.h>
 #include <linux/wait.h>
+#include <linux/workqueue.h>
 
 #include "c8sectpfe-common.h"
 #include "c8sectpfe-core.h"
@@ -73,16 +74,16 @@ static void c8sectpfe_timer_interrupt(struct timer_list *t)
 
 		/* is this descriptor initialised and TP enabled */
 		if (channel->irec && readl(channel->irec + DMA_PRDS_TPENABLE))
-			tasklet_schedule(&channel->tsklet);
+			queue_work(system_bh_wq, &channel->work);
 	}
 
 	fei->timer.expires = jiffies +	msecs_to_jiffies(POLL_MSECS);
 	add_timer(&fei->timer);
 }
 
-static void channel_swdemux_tsklet(struct tasklet_struct *t)
+static void channel_swdemux_work(struct work_struct *t)
 {
-	struct channel_info *channel = from_tasklet(channel, t, tsklet);
+	struct channel_info *channel = from_work(channel, t, work);
 	struct c8sectpfei *fei;
 	unsigned long wp, rp;
 	int pos, num_packets, n, size;
@@ -211,7 +212,7 @@ static int c8sectpfe_start_feed(struct dvb_demux_feed *dvbdmxfeed)
 
 		dev_dbg(fei->dev, "Starting channel=%p\n", channel);
 
-		tasklet_setup(&channel->tsklet, channel_swdemux_tsklet);
+		INIT_WORK(&channel->work, channel_swdemux_work);
 
 		/* Reset the internal inputblock sram pointers */
 		writel(channel->fifo,
@@ -304,7 +305,7 @@ static int c8sectpfe_stop_feed(struct dvb_demux_feed *dvbdmxfeed)
 		/* disable this channels descriptor */
 		writel(0,  channel->irec + DMA_PRDS_TPENABLE);
 
-		tasklet_disable(&channel->tsklet);
+		disable_work_sync(&channel->work);
 
 		/* now request memdma channel goes idle */
 		idlereq = (1 << channel->tsin_id) | IDLEREQ;
@@ -631,8 +632,8 @@ static int configure_memdma_and_inputblock(struct c8sectpfei *fei,
 	writel(tsin->back_buffer_busaddr, tsin->irec + DMA_PRDS_BUSWP_TP(0));
 	writel(tsin->back_buffer_busaddr, tsin->irec + DMA_PRDS_BUSRP_TP(0));
 
-	/* initialize tasklet */
-	tasklet_setup(&tsin->tsklet, channel_swdemux_tsklet);
+	/* initialize work */
+	INIT_WORK(&tsin->work, channel_swdemux_work);
 
 	return 0;
 
diff --git a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h
index bf377cc82225..284d62a90987 100644
--- a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h
+++ b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h
@@ -51,7 +51,7 @@ struct channel_info {
 	unsigned long  fifo;
 
 	struct completion idle_completion;
-	struct tasklet_struct tsklet;
+	struct work_struct work;
 
 	struct c8sectpfei *fei;
 	void __iomem *irec;
diff --git a/drivers/media/radio/wl128x/fmdrv.h b/drivers/media/radio/wl128x/fmdrv.h
index da8920169df8..85282f638c4a 100644
--- a/drivers/media/radio/wl128x/fmdrv.h
+++ b/drivers/media/radio/wl128x/fmdrv.h
@@ -15,6 +15,7 @@
 #include <sound/core.h>
 #include <sound/initval.h>
 #include <linux/timer.h>
+#include <linux/workqueue.h>
 #include <media/v4l2-ioctl.h>
 #include <media/v4l2-common.h>
 #include <media/v4l2-device.h>
@@ -200,15 +201,15 @@ struct fmdev {
 	int streg_cbdata; /* status of ST registration */
 
 	struct sk_buff_head rx_q;	/* RX queue */
-	struct tasklet_struct rx_task;	/* RX Tasklet */
+	struct work_struct rx_task;	/* RX Work */
 
 	struct sk_buff_head tx_q;	/* TX queue */
-	struct tasklet_struct tx_task;	/* TX Tasklet */
+	struct work_struct tx_task;	/* TX Work */
 	unsigned long last_tx_jiffies;	/* Timestamp of last pkt sent */
 	atomic_t tx_cnt;	/* Number of packets can send at a time */
 
 	struct sk_buff *resp_skb;	/* Response from the chip */
 	/* Main task completion handler */
 	struct completion maintask_comp;
 	/* Opcode of last command sent to the chip */
 	u8 pre_op;
diff --git a/drivers/media/radio/wl128x/fmdrv_common.c b/drivers/media/radio/wl128x/fmdrv_common.c
index 3da8e5102bec..2528ce6c1cda 100644
--- a/drivers/media/radio/wl128x/fmdrv_common.c
+++ b/drivers/media/radio/wl128x/fmdrv_common.c
@@ -9,7 +9,7 @@
  *     one Channel-8 command to be sent to the chip).
  *  2) Sending each Channel-8 command to the chip and reading
  *     response back over Shared Transport.
- *  3) Managing TX and RX Queues and Tasklets.
+ *  3) Managing TX and RX Queues and Work items.
  *  4) Handling FM Interrupt packet and taking appropriate action.
  *  5) Loading FM firmware to the chip (common, FM TX, and FM RX
  *     firmware files based on mode selection)
@@ -29,6 +29,7 @@
 #include "fmdrv_v4l2.h"
 #include "fmdrv_common.h"
 #include <linux/ti_wilink_st.h>
+#include <linux/workqueue.h>
 #include "fmdrv_rx.h"
 #include "fmdrv_tx.h"
 
@@ -244,10 +245,10 @@ void fmc_update_region_info(struct fmdev *fmdev, u8 region_to_set)
 }
 
 /*
- * FM common sub-module will schedule this tasklet whenever it receives
+ * FM common sub-module will queue this work whenever it receives
  * FM packet from ST driver.
  */
-static void recv_tasklet(struct tasklet_struct *t)
+static void recv_work(struct work_struct *t)
 {
 	struct fmdev *fmdev;
 	struct fm_irq *irq_info;
@@ -256,7 +257,7 @@ static void recv_tasklet(struct tasklet_struct *t)
 	u8 num_fm_hci_cmds;
 	unsigned long flags;
 
-	fmdev = from_tasklet(fmdev, t, tx_task);
+	fmdev = from_work(fmdev, t, rx_task);
 	irq_info = &fmdev->irq_info;
 	/* Process all packets in the RX queue */
 	while ((skb = skb_dequeue(&fmdev->rx_q))) {
@@ -322,22 +323,22 @@ static void recv_tasklet(struct tasklet_struct *t)
 
 		/*
 		 * Check flow control field. If Num_FM_HCI_Commands field is
-		 * not zero, schedule FM TX tasklet.
+		 * not zero, queue FM TX work.
 		 */
 		if (num_fm_hci_cmds && atomic_read(&fmdev->tx_cnt))
 			if (!skb_queue_empty(&fmdev->tx_q))
-				tasklet_schedule(&fmdev->tx_task);
+				queue_work(system_bh_wq, &fmdev->tx_task);
 	}
 }
 
-/* FM send tasklet: is scheduled when FM packet has to be sent to chip */
-static void send_tasklet(struct tasklet_struct *t)
+/* FM send work: queued when an FM packet has to be sent to the chip */
+static void send_work(struct work_struct *t)
 {
 	struct fmdev *fmdev;
 	struct sk_buff *skb;
 	int len;
 
-	fmdev = from_tasklet(fmdev, t, tx_task);
+	fmdev = from_work(fmdev, t, tx_task);
 
 	if (!atomic_read(&fmdev->tx_cnt))
 		return;
@@ -366,7 +367,7 @@ static void send_tasklet(struct tasklet_struct *t)
 	if (len < 0) {
 		kfree_skb(skb);
 		fmdev->resp_comp = NULL;
-		fmerr("TX tasklet failed to send skb(%p)\n", skb);
+		fmerr("TX work failed to send skb(%p)\n", skb);
 		atomic_set(&fmdev->tx_cnt, 1);
 	} else {
 		fmdev->last_tx_jiffies = jiffies;
@@ -374,7 +375,7 @@ static void send_tasklet(struct tasklet_struct *t)
 }
 
 /*
- * Queues FM Channel-8 packet to FM TX queue and schedules FM TX tasklet for
+ * Adds FM Channel-8 packet to FM TX queue and queues FM TX work for
  * transmission
  */
 static int fm_send_cmd(struct fmdev *fmdev, u8 fm_op, u16 type,	void *payload,
@@ -440,7 +441,7 @@ static int fm_send_cmd(struct fmdev *fmdev, u8 fm_op, u16 type,	void *payload,
 
 	fm_cb(skb)->completion = wait_completion;
 	skb_queue_tail(&fmdev->tx_q, skb);
-	tasklet_schedule(&fmdev->tx_task);
+	queue_work(system_bh_wq, &fmdev->tx_task);
 
 	return 0;
 }
@@ -462,7 +463,7 @@ int fmc_send_cmd(struct fmdev *fmdev, u8 fm_op, u16 type, void *payload,
 
 	if (!wait_for_completion_timeout(&fmdev->maintask_comp,
 					 FM_DRV_TX_TIMEOUT)) {
-		fmerr("Timeout(%d sec),didn't get regcompletion signal from RX tasklet\n",
+		fmerr("Timeout(%d sec),didn't get regcompletion signal from RX work\n",
 			   jiffies_to_msecs(FM_DRV_TX_TIMEOUT) / 1000);
 		return -ETIMEDOUT;
 	}
@@ -1455,7 +1456,7 @@ static long fm_st_receive(void *arg, struct sk_buff *skb)
 
 	memcpy(skb_push(skb, 1), &skb->cb[0], 1);
 	skb_queue_tail(&fmdev->rx_q, skb);
-	tasklet_schedule(&fmdev->rx_task);
+	queue_work(system_bh_wq, &fmdev->rx_task);
 
 	return 0;
 }
@@ -1537,13 +1538,13 @@ int fmc_prepare(struct fmdev *fmdev)
 	spin_lock_init(&fmdev->rds_buff_lock);
 	spin_lock_init(&fmdev->resp_skb_lock);
 
-	/* Initialize TX queue and TX tasklet */
+	/* Initialize TX queue and TX work */
 	skb_queue_head_init(&fmdev->tx_q);
-	tasklet_setup(&fmdev->tx_task, send_tasklet);
+	INIT_WORK(&fmdev->tx_task, send_work);
 
-	/* Initialize RX Queue and RX tasklet */
+	/* Initialize RX Queue and RX work */
 	skb_queue_head_init(&fmdev->rx_q);
-	tasklet_setup(&fmdev->rx_task, recv_tasklet);
+	INIT_WORK(&fmdev->rx_task, recv_work);
 
 	fmdev->irq_info.stage = 0;
 	atomic_set(&fmdev->tx_cnt, 1);
@@ -1589,8 +1590,8 @@ int fmc_release(struct fmdev *fmdev)
 	/* Service pending read */
 	wake_up_interruptible(&fmdev->rx.rds.read_queue);
 
-	tasklet_kill(&fmdev->tx_task);
-	tasklet_kill(&fmdev->rx_task);
+	cancel_work_sync(&fmdev->tx_task);
+	cancel_work_sync(&fmdev->rx_task);
 
 	skb_queue_purge(&fmdev->tx_q);
 	skb_queue_purge(&fmdev->rx_q);
diff --git a/drivers/media/rc/mceusb.c b/drivers/media/rc/mceusb.c
index c76ba24c1f55..a2e2e58b7506 100644
--- a/drivers/media/rc/mceusb.c
+++ b/drivers/media/rc/mceusb.c
@@ -774,7 +774,7 @@ static void mceusb_dev_printdata(struct mceusb_dev *ir, u8 *buf, int buf_len,
 
 /*
  * Schedule work that can't be done in interrupt handlers
- * (mceusb_dev_recv() and mce_write_callback()) nor tasklets.
+ * (mceusb_dev_recv() and mce_write_callback()) nor work items.
  * Invokes mceusb_deferred_kevent() for recovering from
  * error events specified by the kevent bit field.
  */
diff --git a/drivers/media/usb/ttusb-dec/ttusb_dec.c b/drivers/media/usb/ttusb-dec/ttusb_dec.c
index 79faa2560613..c8bc84f4aefb 100644
--- a/drivers/media/usb/ttusb-dec/ttusb_dec.c
+++ b/drivers/media/usb/ttusb-dec/ttusb_dec.c
@@ -19,6 +19,7 @@
 #include <linux/input.h>
 
 #include <linux/mutex.h>
+#include <linux/workqueue.h>
 
 #include <media/dmxdev.h>
 #include <media/dvb_demux.h>
@@ -139,7 +140,7 @@ struct ttusb_dec {
 	int			v_pes_postbytes;
 
 	struct list_head	urb_frame_list;
-	struct tasklet_struct	urb_tasklet;
+	struct work_struct	urb_work;
 	spinlock_t		urb_frame_list_lock;
 
 	struct dvb_demux_filter	*audio_filter;
@@ -766,9 +767,9 @@ static void ttusb_dec_process_urb_frame(struct ttusb_dec *dec, u8 *b,
 	}
 }
 
-static void ttusb_dec_process_urb_frame_list(struct tasklet_struct *t)
+static void ttusb_dec_process_urb_frame_list(struct work_struct *t)
 {
-	struct ttusb_dec *dec = from_tasklet(dec, t, urb_tasklet);
+	struct ttusb_dec *dec = from_work(dec, t, urb_work);
 	struct list_head *item;
 	struct urb_frame *frame;
 	unsigned long flags;
@@ -822,7 +823,7 @@ static void ttusb_dec_process_urb(struct urb *urb)
 				spin_unlock_irqrestore(&dec->urb_frame_list_lock,
 						       flags);
 
-				tasklet_schedule(&dec->urb_tasklet);
+				queue_work(system_bh_wq, &dec->urb_work);
 			}
 		}
 	} else {
@@ -1198,11 +1199,11 @@ static int ttusb_dec_alloc_iso_urbs(struct ttusb_dec *dec)
 	return 0;
 }
 
-static void ttusb_dec_init_tasklet(struct ttusb_dec *dec)
+static void ttusb_dec_init_work(struct ttusb_dec *dec)
 {
 	spin_lock_init(&dec->urb_frame_list_lock);
 	INIT_LIST_HEAD(&dec->urb_frame_list);
-	tasklet_setup(&dec->urb_tasklet, ttusb_dec_process_urb_frame_list);
+	INIT_WORK(&dec->urb_work, ttusb_dec_process_urb_frame_list);
 }
 
 static int ttusb_init_rc( struct ttusb_dec *dec)
@@ -1588,12 +1589,12 @@ static void ttusb_dec_exit_usb(struct ttusb_dec *dec)
 	ttusb_dec_free_iso_urbs(dec);
 }
 
-static void ttusb_dec_exit_tasklet(struct ttusb_dec *dec)
+static void ttusb_dec_exit_work(struct ttusb_dec *dec)
 {
 	struct list_head *item;
 	struct urb_frame *frame;
 
-	tasklet_kill(&dec->urb_tasklet);
+	cancel_work_sync(&dec->urb_work);
 
 	while ((item = dec->urb_frame_list.next) != &dec->urb_frame_list) {
 		frame = list_entry(item, struct urb_frame, urb_frame_list);
@@ -1703,7 +1704,7 @@ static int ttusb_dec_probe(struct usb_interface *intf,
 
 	ttusb_dec_init_v_pes(dec);
 	ttusb_dec_init_filters(dec);
-	ttusb_dec_init_tasklet(dec);
+	ttusb_dec_init_work(dec);
 
 	dec->active = 1;
 
@@ -1729,7 +1730,7 @@ static void ttusb_dec_disconnect(struct usb_interface *intf)
 	dprintk("%s\n", __func__);
 
 	if (dec->active) {
-		ttusb_dec_exit_tasklet(dec);
+		ttusb_dec_exit_work(dec);
 		ttusb_dec_exit_filters(dec);
 		if(enable_rc)
 			ttusb_dec_exit_rc(dec);
-- 
2.17.1


^ permalink raw reply related	[relevance 19%]
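
Note on the conversion pattern in the patch above: each converted callback now
takes a struct work_struct pointer and recovers its driver state from the
embedded member, which is what the from_work() calls in the diff do. The sketch
below is a userspace model only, not kernel code. The struct work_struct
stand-in, demo_dec, demo_init_work() and demo_queue_work() are hypothetical
names invented for illustration, and it assumes from_work() behaves like a
container_of() wrapper. It does not model BH context or deferred execution.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct work_struct (assumption). */
struct work_struct {
	void (*func)(struct work_struct *work);
};

/* Same idea as the kernel's container_of(): member pointer -> owner pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver state, loosely modelled on struct ttusb_dec. */
struct demo_dec {
	int frames_pending;
	int frames_done;
	struct work_struct urb_work;
};

/* Work callback: recover the owning struct from the embedded work item. */
static void demo_process_frames(struct work_struct *t)
{
	struct demo_dec *dec = container_of(t, struct demo_dec, urb_work);

	dec->frames_done += dec->frames_pending;
	dec->frames_pending = 0;
}

/* Stand-in for INIT_WORK(): only records the callback. */
static void demo_init_work(struct work_struct *work,
			   void (*func)(struct work_struct *))
{
	work->func = func;
}

/* Stand-in for queue_work(): runs the callback immediately, no deferral. */
static void demo_queue_work(struct work_struct *work)
{
	assert(work->func != NULL);
	work->func(work);
}

/* Initialise, queue once, and report how many frames the callback handled. */
int demo_run(int pending)
{
	struct demo_dec dec = { .frames_pending = pending, .frames_done = 0 };

	demo_init_work(&dec.urb_work, demo_process_frames);
	demo_queue_work(&dec.urb_work);
	return dec.frames_done;
}
```

Calling demo_run(3) walks the same three steps as the driver: initialise the
work item, queue it, and let the callback find its owner from the member
pointer. In the real driver the callback runs later in BH context, and
cancel_work_sync() takes the place of tasklet_kill() on teardown.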

* Re: [PATCH 8/9] drivers/media/*: Convert from tasklet to BH workqueue
  @ 2024-04-24 16:48 65%     ` Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-04-24 16:48 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Linux Kernel Mailing List, Tejun Heo, Kees Cook, Vinod Koul,
	marcan, sven, florian.fainelli, Ray Jui, Scott Branden,
	Paul Cercueil, Eugeniy.Paltsev, manivannan.sadhasivam,
	Viresh Kumar, Frank.Li, Leo Li, zw, Zhou Wang, haijie1,
	Shawn Guo, Sascha Hauer, Sean Wang, Matthias Brugger,
	angelogioacchino.delregno, Andreas Färber, Logan Gunthorpe,
	Daniel Mack, Haojian Zhuang, Robert Jarzmik, andersson,
	konrad.dybcio, Orson Zhai, baolin.wang, Lyra Zhang,
	Patrice CHOTARD, Linus Walleij, Chen-Yu Tsai,
	Jernej Škrabec, peter.ujfalusi, kys, haiyangz, wei.liu,
	decui, jassisinghbrar, mchehab, maintainers, aubin.constans,
	ulf.hansson, manuel.lauss, mirq-linux, jh80.chung, oakad,
	hayashi.kunihiko, mhiramat, brucechang, HaraldWelte, pierre,
	duncan.sands, stern, oneukum, openipmi-developer, dmaengine,
	asahi, linux-arm-kernel, linux-rpi-kernel, linux-mips, imx,
	linuxppc-dev, linux-mediatek, linux-actions, linux-arm-msm,
	linux-riscv, linux-sunxi, linux-tegra, linux-hyperv, linux-rdma,
	linux-media, linux-mmc, linux-omap, linux-renesas-soc,
	linux-s390, netdev, linux-usb



> On Apr 24, 2024, at 2:12 AM, Hans Verkuil <hverkuil@xs4all.nl> wrote:
> 
> On 27/03/2024 17:03, Allen Pais wrote:
>> The only generic interface to execute asynchronously in the BH context is
>> tasklet; however, it's marked deprecated and has some design flaws. To
>> replace tasklets, BH workqueue support was recently added. A BH workqueue
>> behaves similarly to regular workqueues except that the queued work items
>> are executed in the BH context.
>> 
>> This patch converts drivers/media/* from tasklet to BH workqueue.
>> 
>> Based on the work done by Tejun Heo <tj@kernel.org>
>> Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10
>> 
>> Signed-off-by: Allen Pais <allen.lkml@gmail.com>
>> ---
>> drivers/media/pci/bt8xx/bt878.c               |  8 ++--
>> drivers/media/pci/bt8xx/bt878.h               |  3 +-
>> drivers/media/pci/bt8xx/dvb-bt8xx.c           |  9 ++--
>> drivers/media/pci/ddbridge/ddbridge.h         |  3 +-
>> drivers/media/pci/mantis/hopper_cards.c       |  2 +-
>> drivers/media/pci/mantis/mantis_cards.c       |  2 +-
>> drivers/media/pci/mantis/mantis_common.h      |  3 +-
>> drivers/media/pci/mantis/mantis_dma.c         |  5 ++-
>> drivers/media/pci/mantis/mantis_dma.h         |  2 +-
>> drivers/media/pci/mantis/mantis_dvb.c         | 12 +++---
>> drivers/media/pci/ngene/ngene-core.c          | 23 ++++++-----
>> drivers/media/pci/ngene/ngene.h               |  5 ++-
>> drivers/media/pci/smipcie/smipcie-main.c      | 18 ++++----
>> drivers/media/pci/smipcie/smipcie.h           |  3 +-
>> drivers/media/pci/ttpci/budget-av.c           |  3 +-
>> drivers/media/pci/ttpci/budget-ci.c           | 27 ++++++------
>> drivers/media/pci/ttpci/budget-core.c         | 10 ++---
>> drivers/media/pci/ttpci/budget.h              |  5 ++-
>> drivers/media/pci/tw5864/tw5864-core.c        |  2 +-
>> drivers/media/pci/tw5864/tw5864-video.c       | 13 +++---
>> drivers/media/pci/tw5864/tw5864.h             |  7 ++--
>> drivers/media/platform/intel/pxa_camera.c     | 15 +++----
>> drivers/media/platform/marvell/mcam-core.c    | 11 ++---
>> drivers/media/platform/marvell/mcam-core.h    |  3 +-
>> .../st/sti/c8sectpfe/c8sectpfe-core.c         | 15 +++----
>> .../st/sti/c8sectpfe/c8sectpfe-core.h         |  2 +-
>> drivers/media/radio/wl128x/fmdrv.h            |  7 ++--
>> drivers/media/radio/wl128x/fmdrv_common.c     | 41 ++++++++++---------
>> drivers/media/rc/mceusb.c                     |  2 +-
>> drivers/media/usb/ttusb-dec/ttusb_dec.c       | 21 +++++-----
>> 30 files changed, 151 insertions(+), 131 deletions(-)
>> 
>> diff --git a/drivers/media/pci/bt8xx/bt878.c b/drivers/media/pci/bt8xx/bt878.c
>> index 90972d6952f1..983ec29108f0 100644
>> --- a/drivers/media/pci/bt8xx/bt878.c
>> +++ b/drivers/media/pci/bt8xx/bt878.c
>> @@ -300,8 +300,8 @@ static irqreturn_t bt878_irq(int irq, void *dev_id)
>> 		}
>> 		if (astat & BT878_ARISCI) {
>> 			bt->finished_block = (stat & BT878_ARISCS) >> 28;
>> -			if (bt->tasklet.callback)
>> -				tasklet_schedule(&bt->tasklet);
>> +			if (bt->work.func)
>> +				queue_work(system_bh_wq,
> 
> I stopped reviewing here: this clearly has not been compile tested.
> 
> Also please check the patch with 'checkpatch.pl --strict' and fix the reported issues.
> 
> Regards,
> 
> 	Hans

Hans,

Thanks for taking the time to review. This was a mistake; I sent out a v2 that had
this fixed. I am working on a v3 based on some of the comments I received recently and
would appreciate your review when it is sent out.

Allen

> 
>> 			break;
>> 		}
>> 		count++;
>> @@ -478,8 +478,8 @@ static int bt878_probe(struct pci_dev *dev, const struct pci_device_id *pci_id)
>> 	btwrite(0, BT878_AINT_MASK);
>> 	bt878_num++;
>> 
>> -	if (!bt->tasklet.func)
>> -		tasklet_disable(&bt->tasklet);
>> +	if (!bt->work.func)
>> +		disable_work_sync(&bt->work);
>> 
>> 	return 0;
>> 
>> diff --git a/drivers/media/pci/bt8xx/bt878.h b/drivers/media/pci/bt8xx/bt878.h
>> index fde8db293c54..b9ce78e5116b 100644
>> --- a/drivers/media/pci/bt8xx/bt878.h
>> +++ b/drivers/media/pci/bt8xx/bt878.h
>> @@ -14,6 +14,7 @@
>> #include <linux/sched.h>
>> #include <linux/spinlock.h>
>> #include <linux/mutex.h>
>> +#include <linux/workqueue.h>
>> 
>> #include "bt848.h"
>> #include "bttv.h"
>> @@ -120,7 +121,7 @@ struct bt878 {
>> 	dma_addr_t risc_dma;
>> 	u32 risc_pos;
>> 
>> -	struct tasklet_struct tasklet;
>> +	struct work_struct work;
>> 	int shutdown;
>> };
>> 
>> diff --git a/drivers/media/pci/bt8xx/dvb-bt8xx.c b/drivers/media/pci/bt8xx/dvb-bt8xx.c
>> index 390cbba6c065..8c0e1fa764a4 100644
>> --- a/drivers/media/pci/bt8xx/dvb-bt8xx.c
>> +++ b/drivers/media/pci/bt8xx/dvb-bt8xx.c
>> @@ -15,6 +15,7 @@
>> #include <linux/delay.h>
>> #include <linux/slab.h>
>> #include <linux/i2c.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/dmxdev.h>
>> #include <media/dvbdev.h>
>> @@ -39,9 +40,9 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>> 
>> #define IF_FREQUENCYx6 217    /* 6 * 36.16666666667MHz */
>> 
>> -static void dvb_bt8xx_task(struct tasklet_struct *t)
>> +static void dvb_bt8xx_task(struct work_struct *t)
>> {
>> -	struct bt878 *bt = from_tasklet(bt, t, tasklet);
>> +	struct bt878 *bt = from_work(bt, t, work);
>> 	struct dvb_bt8xx_card *card = dev_get_drvdata(&bt->adapter->dev);
>> 
>> 	dprintk("%d\n", card->bt->finished_block);
>> @@ -782,7 +783,7 @@ static int dvb_bt8xx_load_card(struct dvb_bt8xx_card *card, u32 type)
>> 		goto err_disconnect_frontend;
>> 	}
>> 
>> -	tasklet_setup(&card->bt->tasklet, dvb_bt8xx_task);
>> +	INIT_WORK(&card->bt->work, dvb_bt8xx_task);
>> 
>> 	frontend_init(card, type);
>> 
>> @@ -922,7 +923,7 @@ static void dvb_bt8xx_remove(struct bttv_sub_device *sub)
>> 	dprintk("dvb_bt8xx: unloading card%d\n", card->bttv_nr);
>> 
>> 	bt878_stop(card->bt);
>> -	tasklet_kill(&card->bt->tasklet);
>> +	cancel_work_sync(&card->bt->work);
>> 	dvb_net_release(&card->dvbnet);
>> 	card->demux.dmx.remove_frontend(&card->demux.dmx, &card->fe_mem);
>> 	card->demux.dmx.remove_frontend(&card->demux.dmx, &card->fe_hw);
>> diff --git a/drivers/media/pci/ddbridge/ddbridge.h b/drivers/media/pci/ddbridge/ddbridge.h
>> index f3699dbd193f..037d1d13ef0f 100644
>> --- a/drivers/media/pci/ddbridge/ddbridge.h
>> +++ b/drivers/media/pci/ddbridge/ddbridge.h
>> @@ -35,6 +35,7 @@
>> #include <linux/uaccess.h>
>> #include <linux/vmalloc.h>
>> #include <linux/workqueue.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <asm/dma.h>
>> #include <asm/irq.h>
>> @@ -298,7 +299,7 @@ struct ddb_link {
>> 	spinlock_t             lock; /* lock link access */
>> 	struct mutex           flash_mutex; /* lock flash access */
>> 	struct ddb_lnb         lnb;
>> -	struct tasklet_struct  tasklet;
>> +	struct work_struct work;
>> 	struct ddb_ids         ids;
>> 
>> 	spinlock_t             temp_lock; /* lock temp chip access */
>> diff --git a/drivers/media/pci/mantis/hopper_cards.c b/drivers/media/pci/mantis/hopper_cards.c
>> index c0bd5d7e148b..869ea88c4893 100644
>> --- a/drivers/media/pci/mantis/hopper_cards.c
>> +++ b/drivers/media/pci/mantis/hopper_cards.c
>> @@ -116,7 +116,7 @@ static irqreturn_t hopper_irq_handler(int irq, void *dev_id)
>> 	if (stat & MANTIS_INT_RISCI) {
>> 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[8]);
>> 		mantis->busy_block = (stat & MANTIS_INT_RISCSTAT) >> 28;
>> -		tasklet_schedule(&mantis->tasklet);
>> +		queue_work(system_bh_wq, &mantis->work);
>> 	}
>> 	if (stat & MANTIS_INT_I2CDONE) {
>> 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[9]);
>> diff --git a/drivers/media/pci/mantis/mantis_cards.c b/drivers/media/pci/mantis/mantis_cards.c
>> index 906e4500d87d..cb124b19e36e 100644
>> --- a/drivers/media/pci/mantis/mantis_cards.c
>> +++ b/drivers/media/pci/mantis/mantis_cards.c
>> @@ -125,7 +125,7 @@ static irqreturn_t mantis_irq_handler(int irq, void *dev_id)
>> 	if (stat & MANTIS_INT_RISCI) {
>> 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[8]);
>> 		mantis->busy_block = (stat & MANTIS_INT_RISCSTAT) >> 28;
>> -		tasklet_schedule(&mantis->tasklet);
>> +		queue_work(system_bh_wq, &mantis->work);
>> 	}
>> 	if (stat & MANTIS_INT_I2CDONE) {
>> 		dprintk(MANTIS_DEBUG, 0, "<%s>", label[9]);
>> diff --git a/drivers/media/pci/mantis/mantis_common.h b/drivers/media/pci/mantis/mantis_common.h
>> index d88ac280226c..f2247148f268 100644
>> --- a/drivers/media/pci/mantis/mantis_common.h
>> +++ b/drivers/media/pci/mantis/mantis_common.h
>> @@ -12,6 +12,7 @@
>> #include <linux/interrupt.h>
>> #include <linux/mutex.h>
>> #include <linux/workqueue.h>
>> +#include <linux/workqueue.h>
>> 
>> #include "mantis_reg.h"
>> #include "mantis_uart.h"
>> @@ -125,7 +126,7 @@ struct mantis_pci {
>> 	__le32			*risc_cpu;
>> 	dma_addr_t		risc_dma;
>> 
>> -	struct tasklet_struct	tasklet;
>> +	struct work_struct	work;
>> 	spinlock_t		intmask_lock;
>> 
>> 	struct i2c_adapter	adapter;
>> diff --git a/drivers/media/pci/mantis/mantis_dma.c b/drivers/media/pci/mantis/mantis_dma.c
>> index 80c843936493..c85f9b84a2c6 100644
>> --- a/drivers/media/pci/mantis/mantis_dma.c
>> +++ b/drivers/media/pci/mantis/mantis_dma.c
>> @@ -15,6 +15,7 @@
>> #include <linux/signal.h>
>> #include <linux/sched.h>
>> #include <linux/interrupt.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/dmxdev.h>
>> #include <media/dvbdev.h>
>> @@ -200,9 +201,9 @@ void mantis_dma_stop(struct mantis_pci *mantis)
>> }
>> 
>> 
>> -void mantis_dma_xfer(struct tasklet_struct *t)
>> +void mantis_dma_xfer(struct work_struct *t)
>> {
>> -	struct mantis_pci *mantis = from_tasklet(mantis, t, tasklet);
>> +	struct mantis_pci *mantis = from_work(mantis, t, work);
>> 	struct mantis_hwconfig *config = mantis->hwconfig;
>> 
>> 	while (mantis->last_block != mantis->busy_block) {
>> diff --git a/drivers/media/pci/mantis/mantis_dma.h b/drivers/media/pci/mantis/mantis_dma.h
>> index 37da982c9c29..5db0d3728f15 100644
>> --- a/drivers/media/pci/mantis/mantis_dma.h
>> +++ b/drivers/media/pci/mantis/mantis_dma.h
>> @@ -13,6 +13,6 @@ extern int mantis_dma_init(struct mantis_pci *mantis);
>> extern int mantis_dma_exit(struct mantis_pci *mantis);
>> extern void mantis_dma_start(struct mantis_pci *mantis);
>> extern void mantis_dma_stop(struct mantis_pci *mantis);
>> -extern void mantis_dma_xfer(struct tasklet_struct *t);
>> +extern void mantis_dma_xfer(struct work_struct *t);
>> 
>> #endif /* __MANTIS_DMA_H */
>> diff --git a/drivers/media/pci/mantis/mantis_dvb.c b/drivers/media/pci/mantis/mantis_dvb.c
>> index c7ba4a76e608..f640635de170 100644
>> --- a/drivers/media/pci/mantis/mantis_dvb.c
>> +++ b/drivers/media/pci/mantis/mantis_dvb.c
>> @@ -105,7 +105,7 @@ static int mantis_dvb_start_feed(struct dvb_demux_feed *dvbdmxfeed)
>> 	if (mantis->feeds == 1)	 {
>> 		dprintk(MANTIS_DEBUG, 1, "mantis start feed & dma");
>> 		mantis_dma_start(mantis);
>> -		tasklet_enable(&mantis->tasklet);
>> +		enable_and_queue_work(system_bh_wq, &mantis->work);
>> 	}
>> 
>> 	return mantis->feeds;
>> @@ -125,7 +125,7 @@ static int mantis_dvb_stop_feed(struct dvb_demux_feed *dvbdmxfeed)
>> 	mantis->feeds--;
>> 	if (mantis->feeds == 0) {
>> 		dprintk(MANTIS_DEBUG, 1, "mantis stop feed and dma");
>> -		tasklet_disable(&mantis->tasklet);
>> +		disable_work_sync(&mantis->work);
>> 		mantis_dma_stop(mantis);
>> 	}
>> 
>> @@ -205,8 +205,8 @@ int mantis_dvb_init(struct mantis_pci *mantis)
>> 	}
>> 
>> 	dvb_net_init(&mantis->dvb_adapter, &mantis->dvbnet, &mantis->demux.dmx);
>> -	tasklet_setup(&mantis->tasklet, mantis_dma_xfer);
>> -	tasklet_disable(&mantis->tasklet);
>> +	INIT_WORK(&mantis->work, mantis_dma_xfer);
>> +	disable_work_sync(&mantis->work);
>> 	if (mantis->hwconfig) {
>> 		result = config->frontend_init(mantis, mantis->fe);
>> 		if (result < 0) {
>> @@ -235,7 +235,7 @@ int mantis_dvb_init(struct mantis_pci *mantis)
>> 
>> 	/* Error conditions ..	*/
>> err5:
>> -	tasklet_kill(&mantis->tasklet);
>> +	cancel_work_sync(&mantis->work);
>> 	dvb_net_release(&mantis->dvbnet);
>> 	if (mantis->fe) {
>> 		dvb_unregister_frontend(mantis->fe);
>> @@ -273,7 +273,7 @@ int mantis_dvb_exit(struct mantis_pci *mantis)
>> 		dvb_frontend_detach(mantis->fe);
>> 	}
>> 
>> -	tasklet_kill(&mantis->tasklet);
>> +	cancel_work_sync(&mantis->work);
>> 	dvb_net_release(&mantis->dvbnet);
>> 
>> 	mantis->demux.dmx.remove_frontend(&mantis->demux.dmx, &mantis->fe_mem);
>> diff --git a/drivers/media/pci/ngene/ngene-core.c b/drivers/media/pci/ngene/ngene-core.c
>> index 7481f553f959..5211d6796748 100644
>> --- a/drivers/media/pci/ngene/ngene-core.c
>> +++ b/drivers/media/pci/ngene/ngene-core.c
>> @@ -21,6 +21,7 @@
>> #include <linux/byteorder/generic.h>
>> #include <linux/firmware.h>
>> #include <linux/vmalloc.h>
>> +#include <linux/workqueue.h>
>> 
>> #include "ngene.h"
>> 
>> @@ -50,9 +51,9 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>> /* nGene interrupt handler **************************************************/
>> /****************************************************************************/
>> 
>> -static void event_tasklet(struct tasklet_struct *t)
>> +static void event_work(struct work_struct *t)
>> {
>> -	struct ngene *dev = from_tasklet(dev, t, event_tasklet);
>> +	struct ngene *dev = from_work(dev, t, event_work);
>> 
>> 	while (dev->EventQueueReadIndex != dev->EventQueueWriteIndex) {
>> 		struct EVENT_BUFFER Event =
>> @@ -68,9 +69,9 @@ static void event_tasklet(struct tasklet_struct *t)
>> 	}
>> }
>> 
>> -static void demux_tasklet(struct tasklet_struct *t)
>> +static void demux_work(struct work_struct *t)
>> {
>> -	struct ngene_channel *chan = from_tasklet(chan, t, demux_tasklet);
>> +	struct ngene_channel *chan = from_work(chan, t, demux_work);
>> 	struct device *pdev = &chan->dev->pci_dev->dev;
>> 	struct SBufferHeader *Cur = chan->nextBuffer;
>> 
>> @@ -204,7 +205,7 @@ static irqreturn_t irq_handler(int irq, void *dev_id)
>> 			dev->EventQueueOverflowFlag = 1;
>> 		}
>> 		dev->EventBuffer->EventStatus &= ~0x80;
>> -		tasklet_schedule(&dev->event_tasklet);
>> +		queue_work(system_bh_wq, &dev->event_work);
>> 		rc = IRQ_HANDLED;
>> 	}
>> 
>> @@ -217,8 +218,8 @@ static irqreturn_t irq_handler(int irq, void *dev_id)
>> 			     ngeneBuffer.SR.Flags & 0xC0) == 0x80) {
>> 				dev->channel[i].nextBuffer->
>> 					ngeneBuffer.SR.Flags |= 0x40;
>> -				tasklet_schedule(
>> -					&dev->channel[i].demux_tasklet);
>> +				queue_work(system_bh_wq,
>> +					&dev->channel[i].demux_work);
>> 				rc = IRQ_HANDLED;
>> 			}
>> 		}
>> @@ -1181,7 +1182,7 @@ static void ngene_init(struct ngene *dev)
>> 	struct device *pdev = &dev->pci_dev->dev;
>> 	int i;
>> 
>> -	tasklet_setup(&dev->event_tasklet, event_tasklet);
>> +	INIT_WORK(&dev->event_work, event_work);
>> 
>> 	memset_io(dev->iomem + 0xc000, 0x00, 0x220);
>> 	memset_io(dev->iomem + 0xc400, 0x00, 0x100);
>> @@ -1395,7 +1396,7 @@ static void release_channel(struct ngene_channel *chan)
>> 	if (chan->running)
>> 		set_transfer(chan, 0);
>> 
>> -	tasklet_kill(&chan->demux_tasklet);
>> +	cancel_work_sync(&chan->demux_work);
>> 
>> 	if (chan->ci_dev) {
>> 		dvb_unregister_device(chan->ci_dev);
>> @@ -1445,7 +1446,7 @@ static int init_channel(struct ngene_channel *chan)
>> 	struct ngene_info *ni = dev->card_info;
>> 	int io = ni->io_type[nr];
>> 
>> -	tasklet_setup(&chan->demux_tasklet, demux_tasklet);
>> +	INIT_WORK(&chan->demux_work, demux_work);
>> 	chan->users = 0;
>> 	chan->type = io;
>> 	chan->mode = chan->type;	/* for now only one mode */
>> @@ -1647,7 +1648,7 @@ void ngene_remove(struct pci_dev *pdev)
>> 	struct ngene *dev = pci_get_drvdata(pdev);
>> 	int i;
>> 
>> -	tasklet_kill(&dev->event_tasklet);
>> +	cancel_work_sync(&dev->event_work);
>> 	for (i = MAX_STREAM - 1; i >= 0; i--)
>> 		release_channel(&dev->channel[i]);
>> 	if (dev->ci.en)
>> diff --git a/drivers/media/pci/ngene/ngene.h b/drivers/media/pci/ngene/ngene.h
>> index d1d7da84cd9d..c2a23f6dbe09 100644
>> --- a/drivers/media/pci/ngene/ngene.h
>> +++ b/drivers/media/pci/ngene/ngene.h
>> @@ -16,6 +16,7 @@
>> #include <linux/scatterlist.h>
>> 
>> #include <linux/dvb/frontend.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/dmxdev.h>
>> #include <media/dvbdev.h>
>> @@ -621,7 +622,7 @@ struct ngene_channel {
>> 	int                   users;
>> 	struct video_device  *v4l_dev;
>> 	struct dvb_device    *ci_dev;
>> -	struct tasklet_struct demux_tasklet;
>> +	struct work_struct demux_work;
>> 
>> 	struct SBufferHeader *nextBuffer;
>> 	enum KSSTATE          State;
>> @@ -717,7 +718,7 @@ struct ngene {
>> 	struct EVENT_BUFFER   EventQueue[EVENT_QUEUE_SIZE];
>> 	int                   EventQueueOverflowCount;
>> 	int                   EventQueueOverflowFlag;
>> -	struct tasklet_struct event_tasklet;
>> +	struct work_struct event_work;
>> 	struct EVENT_BUFFER  *EventBuffer;
>> 	int                   EventQueueWriteIndex;
>> 	int                   EventQueueReadIndex;
>> diff --git a/drivers/media/pci/smipcie/smipcie-main.c b/drivers/media/pci/smipcie/smipcie-main.c
>> index 0c300d019d9c..7da6bb55660b 100644
>> --- a/drivers/media/pci/smipcie/smipcie-main.c
>> +++ b/drivers/media/pci/smipcie/smipcie-main.c
>> @@ -279,10 +279,10 @@ static void smi_port_clearInterrupt(struct smi_port *port)
>> 		(port->_dmaInterruptCH0 | port->_dmaInterruptCH1));
>> }
>> 
>> -/* tasklet handler: DMA data to dmx.*/
>> -static void smi_dma_xfer(struct tasklet_struct *t)
>> +/* work handler: DMA data to dmx.*/
>> +static void smi_dma_xfer(struct work_struct *t)
>> {
>> -	struct smi_port *port = from_tasklet(port, t, tasklet);
>> +	struct smi_port *port = from_work(port, t, work);
>> 	struct smi_dev *dev = port->dev;
>> 	u32 intr_status, finishedData, dmaManagement;
>> 	u8 dmaChan0State, dmaChan1State;
>> @@ -426,8 +426,8 @@ static int smi_port_init(struct smi_port *port, int dmaChanUsed)
>> 	}
>> 
>> 	smi_port_disableInterrupt(port);
>> -	tasklet_setup(&port->tasklet, smi_dma_xfer);
>> -	tasklet_disable(&port->tasklet);
>> +	INIT_WORK(&port->work, smi_dma_xfer);
>> +	disable_work_sync(&port->work);
>> 	port->enable = 1;
>> 	return 0;
>> err:
>> @@ -438,7 +438,7 @@ static int smi_port_init(struct smi_port *port, int dmaChanUsed)
>> static void smi_port_exit(struct smi_port *port)
>> {
>> 	smi_port_disableInterrupt(port);
>> -	tasklet_kill(&port->tasklet);
>> +	cancel_work_sync(&port->work);
>> 	smi_port_dma_free(port);
>> 	port->enable = 0;
>> }
>> @@ -452,7 +452,7 @@ static int smi_port_irq(struct smi_port *port, u32 int_status)
>> 		smi_port_disableInterrupt(port);
>> 		port->_int_status = int_status;
>> 		smi_port_clearInterrupt(port);
>> -		tasklet_schedule(&port->tasklet);
>> +		queue_work(system_bh_wq, &port->work);
>> 		handled = 1;
>> 	}
>> 	return handled;
>> @@ -823,7 +823,7 @@ static int smi_start_feed(struct dvb_demux_feed *dvbdmxfeed)
>> 		smi_port_clearInterrupt(port);
>> 		smi_port_enableInterrupt(port);
>> 		smi_write(port->DMA_MANAGEMENT, dmaManagement);
>> -		tasklet_enable(&port->tasklet);
>> +		enable_and_queue_work(system_bh_wq, &port->work);
>> 	}
>> 	return port->users;
>> }
>> @@ -837,7 +837,7 @@ static int smi_stop_feed(struct dvb_demux_feed *dvbdmxfeed)
>> 	if (--port->users)
>> 		return port->users;
>> 
>> -	tasklet_disable(&port->tasklet);
>> +	disable_work_sync(&port->work);
>> 	smi_port_disableInterrupt(port);
>> 	smi_clear(port->DMA_MANAGEMENT, 0x30003);
>> 	return 0;
>> diff --git a/drivers/media/pci/smipcie/smipcie.h b/drivers/media/pci/smipcie/smipcie.h
>> index 2b5e0154814c..f124d2cdead6 100644
>> --- a/drivers/media/pci/smipcie/smipcie.h
>> +++ b/drivers/media/pci/smipcie/smipcie.h
>> @@ -17,6 +17,7 @@
>> #include <linux/pci.h>
>> #include <linux/dma-mapping.h>
>> #include <linux/slab.h>
>> +#include <linux/workqueue.h>
>> #include <media/rc-core.h>
>> 
>> #include <media/demux.h>
>> @@ -257,7 +258,7 @@ struct smi_port {
>> 	u32 _dmaInterruptCH0;
>> 	u32 _dmaInterruptCH1;
>> 	u32 _int_status;
>> -	struct tasklet_struct tasklet;
>> +	struct work_struct work;
>> 	/* dvb */
>> 	struct dmx_frontend hw_frontend;
>> 	struct dmx_frontend mem_frontend;
>> diff --git a/drivers/media/pci/ttpci/budget-av.c b/drivers/media/pci/ttpci/budget-av.c
>> index a47c5850ef87..6e43b1a01191 100644
>> --- a/drivers/media/pci/ttpci/budget-av.c
>> +++ b/drivers/media/pci/ttpci/budget-av.c
>> @@ -37,6 +37,7 @@
>> #include <linux/interrupt.h>
>> #include <linux/input.h>
>> #include <linux/spinlock.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/dvb_ca_en50221.h>
>> 
>> @@ -55,7 +56,7 @@ struct budget_av {
>> 	struct video_device vd;
>> 	int cur_input;
>> 	int has_saa7113;
>> -	struct tasklet_struct ciintf_irq_tasklet;
>> +	struct work_struct ciintf_irq_work;
>> 	int slot_status;
>> 	struct dvb_ca_en50221 ca;
>> 	u8 reinitialise_demod:1;
>> diff --git a/drivers/media/pci/ttpci/budget-ci.c b/drivers/media/pci/ttpci/budget-ci.c
>> index 66e1a004ee43..11e0ed62707e 100644
>> --- a/drivers/media/pci/ttpci/budget-ci.c
>> +++ b/drivers/media/pci/ttpci/budget-ci.c
>> @@ -17,6 +17,7 @@
>> #include <linux/slab.h>
>> #include <linux/interrupt.h>
>> #include <linux/spinlock.h>
>> +#include <linux/workqueue.h>
>> #include <media/rc-core.h>
>> 
>> #include "budget.h"
>> @@ -80,7 +81,7 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>> 
>> struct budget_ci_ir {
>> 	struct rc_dev *dev;
>> -	struct tasklet_struct msp430_irq_tasklet;
>> +	struct work_struct msp430_irq_work;
>> 	char name[72]; /* 40 + 32 for (struct saa7146_dev).name */
>> 	char phys[32];
>> 	int rc5_device;
>> @@ -91,7 +92,7 @@ struct budget_ci_ir {
>> 
>> struct budget_ci {
>> 	struct budget budget;
>> -	struct tasklet_struct ciintf_irq_tasklet;
>> +	struct work_struct ciintf_irq_work;
>> 	int slot_status;
>> 	int ci_irq;
>> 	struct dvb_ca_en50221 ca;
>> @@ -99,9 +100,9 @@ struct budget_ci {
>> 	u8 tuner_pll_address; /* used for philips_tdm1316l configs */
>> };
>> 
>> -static void msp430_ir_interrupt(struct tasklet_struct *t)
>> +static void msp430_ir_interrupt(struct work_struct *t)
>> {
>> -	struct budget_ci_ir *ir = from_tasklet(ir, t, msp430_irq_tasklet);
>> +	struct budget_ci_ir *ir = from_work(ir, t, msp430_irq_work);
>> 	struct budget_ci *budget_ci = container_of(ir, typeof(*budget_ci), ir);
>> 	struct rc_dev *dev = budget_ci->ir.dev;
>> 	u32 command = ttpci_budget_debiread(&budget_ci->budget, DEBINOSWAP, DEBIADDR_IR, 2, 1, 0) >> 8;
>> @@ -230,7 +231,7 @@ static int msp430_ir_init(struct budget_ci *budget_ci)
>> 
>> 	budget_ci->ir.dev = dev;
>> 
>> -	tasklet_setup(&budget_ci->ir.msp430_irq_tasklet, msp430_ir_interrupt);
>> +	INIT_WORK(&budget_ci->ir.msp430_irq_work, msp430_ir_interrupt);
>> 
>> 	SAA7146_IER_ENABLE(saa, MASK_06);
>> 	saa7146_setgpio(saa, 3, SAA7146_GPIO_IRQHI);
>> @@ -244,7 +245,7 @@ static void msp430_ir_deinit(struct budget_ci *budget_ci)
>> 
>> 	SAA7146_IER_DISABLE(saa, MASK_06);
>> 	saa7146_setgpio(saa, 3, SAA7146_GPIO_INPUT);
>> -	tasklet_kill(&budget_ci->ir.msp430_irq_tasklet);
>> +	cancel_work_sync(&budget_ci->ir.msp430_irq_work);
>> 
>> 	rc_unregister_device(budget_ci->ir.dev);
>> }
>> @@ -348,10 +349,10 @@ static int ciintf_slot_ts_enable(struct dvb_ca_en50221 *ca, int slot)
>> 	return 0;
>> }
>> 
>> -static void ciintf_interrupt(struct tasklet_struct *t)
>> +static void ciintf_interrupt(struct work_struct *t)
>> {
>> -	struct budget_ci *budget_ci = from_tasklet(budget_ci, t,
>> -						   ciintf_irq_tasklet);
>> +	struct budget_ci *budget_ci = from_work(budget_ci, t,
>> +						   ciintf_irq_work);
>> 	struct saa7146_dev *saa = budget_ci->budget.dev;
>> 	unsigned int flags;
>> 
>> @@ -492,7 +493,7 @@ static int ciintf_init(struct budget_ci *budget_ci)
>> 
>> 	// Setup CI slot IRQ
>> 	if (budget_ci->ci_irq) {
>> -		tasklet_setup(&budget_ci->ciintf_irq_tasklet, ciintf_interrupt);
>> +		INIT_WORK(&budget_ci->ciintf_irq_work, ciintf_interrupt);
>> 		if (budget_ci->slot_status != SLOTSTATUS_NONE) {
>> 			saa7146_setgpio(saa, 0, SAA7146_GPIO_IRQLO);
>> 		} else {
>> @@ -532,7 +533,7 @@ static void ciintf_deinit(struct budget_ci *budget_ci)
>> 	if (budget_ci->ci_irq) {
>> 		SAA7146_IER_DISABLE(saa, MASK_03);
>> 		saa7146_setgpio(saa, 0, SAA7146_GPIO_INPUT);
>> -		tasklet_kill(&budget_ci->ciintf_irq_tasklet);
>> +		cancel_work_sync(&budget_ci->ciintf_irq_work);
>> 	}
>> 
>> 	// reset interface
>> @@ -558,13 +559,13 @@ static void budget_ci_irq(struct saa7146_dev *dev, u32 * isr)
>> 	dprintk(8, "dev: %p, budget_ci: %p\n", dev, budget_ci);
>> 
>> 	if (*isr & MASK_06)
>> -		tasklet_schedule(&budget_ci->ir.msp430_irq_tasklet);
>> +		queue_work(system_bh_wq, &budget_ci->ir.msp430_irq_work);
>> 
>> 	if (*isr & MASK_10)
>> 		ttpci_budget_irq10_handler(dev, isr);
>> 
>> 	if ((*isr & MASK_03) && (budget_ci->budget.ci_present) && (budget_ci->ci_irq))
>> -		tasklet_schedule(&budget_ci->ciintf_irq_tasklet);
>> +		queue_work(system_bh_wq, &budget_ci->ciintf_irq_work);
>> }
>> 
>> static u8 philips_su1278_tt_inittab[] = {
>> diff --git a/drivers/media/pci/ttpci/budget-core.c b/drivers/media/pci/ttpci/budget-core.c
>> index 25f44c3eebf3..3443c12dc9f2 100644
>> --- a/drivers/media/pci/ttpci/budget-core.c
>> +++ b/drivers/media/pci/ttpci/budget-core.c
>> @@ -171,9 +171,9 @@ static int budget_read_fe_status(struct dvb_frontend *fe,
>> 	return ret;
>> }
>> 
>> -static void vpeirq(struct tasklet_struct *t)
>> +static void vpeirq(struct work_struct *t)
>> {
>> -	struct budget *budget = from_tasklet(budget, t, vpe_tasklet);
>> +	struct budget *budget = from_work(budget, t, vpe_work);
>> 	u8 *mem = (u8 *) (budget->grabbing);
>> 	u32 olddma = budget->ttbp;
>> 	u32 newdma = saa7146_read(budget->dev, PCI_VDP3);
>> @@ -520,7 +520,7 @@ int ttpci_budget_init(struct budget *budget, struct saa7146_dev *dev,
>> 	/* upload all */
>> 	saa7146_write(dev, GPIO_CTRL, 0x000000);
>> 
>> -	tasklet_setup(&budget->vpe_tasklet, vpeirq);
>> +	INIT_WORK(&budget->vpe_work, vpeirq);
>> 
>> 	/* frontend power on */
>> 	if (bi->type != BUDGET_FS_ACTIVY)
>> @@ -557,7 +557,7 @@ int ttpci_budget_deinit(struct budget *budget)
>> 
>> 	budget_unregister(budget);
>> 
>> -	tasklet_kill(&budget->vpe_tasklet);
>> +	cancel_work_sync(&budget->vpe_work);
>> 
>> 	saa7146_vfree_destroy_pgtable(dev->pci, budget->grabbing, &budget->pt);
>> 
>> @@ -575,7 +575,7 @@ void ttpci_budget_irq10_handler(struct saa7146_dev *dev, u32 * isr)
>> 	dprintk(8, "dev: %p, budget: %p\n", dev, budget);
>> 
>> 	if (*isr & MASK_10)
>> -		tasklet_schedule(&budget->vpe_tasklet);
>> +		queue_work(system_bh_wq, &budget->vpe_work);
>> }
>> 
>> void ttpci_budget_set_video_port(struct saa7146_dev *dev, int video_port)
>> diff --git a/drivers/media/pci/ttpci/budget.h b/drivers/media/pci/ttpci/budget.h
>> index bd87432e6cde..a3ee75e326b4 100644
>> --- a/drivers/media/pci/ttpci/budget.h
>> +++ b/drivers/media/pci/ttpci/budget.h
>> @@ -12,6 +12,7 @@
>> 
>> #include <linux/module.h>
>> #include <linux/mutex.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/drv-intf/saa7146.h>
>> 
>> @@ -49,8 +50,8 @@ struct budget {
>> 	unsigned char *grabbing;
>> 	struct saa7146_pgtable pt;
>> 
>> -	struct tasklet_struct fidb_tasklet;
>> -	struct tasklet_struct vpe_tasklet;
>> +	struct work_struct fidb_work;
>> +	struct work_struct vpe_work;
>> 
>> 	struct dmxdev dmxdev;
>> 	struct dvb_demux demux;
>> diff --git a/drivers/media/pci/tw5864/tw5864-core.c b/drivers/media/pci/tw5864/tw5864-core.c
>> index 560ff1ddcc83..a58c268e94a8 100644
>> --- a/drivers/media/pci/tw5864/tw5864-core.c
>> +++ b/drivers/media/pci/tw5864/tw5864-core.c
>> @@ -144,7 +144,7 @@ static void tw5864_h264_isr(struct tw5864_dev *dev)
>> 		cur_frame->gop_seqno = input->frame_gop_seqno;
>> 
>> 		dev->h264_buf_w_index = next_frame_index;
>> -		tasklet_schedule(&dev->tasklet);
>> +		queue_work(system_bh_wq, &dev->work);
>> 
>> 		cur_frame = next_frame;
>> 
>> diff --git a/drivers/media/pci/tw5864/tw5864-video.c b/drivers/media/pci/tw5864/tw5864-video.c
>> index 8b1aae4b6319..ac2249626506 100644
>> --- a/drivers/media/pci/tw5864/tw5864-video.c
>> +++ b/drivers/media/pci/tw5864/tw5864-video.c
>> @@ -6,6 +6,7 @@
>>  */
>> 
>> #include <linux/module.h>
>> +#include <linux/workqueue.h>
>> #include <media/v4l2-common.h>
>> #include <media/v4l2-event.h>
>> #include <media/videobuf2-dma-contig.h>
>> @@ -175,7 +176,7 @@ static const unsigned int intra4x4_lambda3[] = {
>> static v4l2_std_id tw5864_get_v4l2_std(enum tw5864_vid_std std);
>> static enum tw5864_vid_std tw5864_from_v4l2_std(v4l2_std_id v4l2_std);
>> 
>> -static void tw5864_handle_frame_task(struct tasklet_struct *t);
>> +static void tw5864_handle_frame_task(struct work_struct *t);
>> static void tw5864_handle_frame(struct tw5864_h264_frame *frame);
>> static void tw5864_frame_interval_set(struct tw5864_input *input);
>> 
>> @@ -1062,7 +1063,7 @@ int tw5864_video_init(struct tw5864_dev *dev, int *video_nr)
>> 	dev->irqmask |= TW5864_INTR_VLC_DONE | TW5864_INTR_TIMER;
>> 	tw5864_irqmask_apply(dev);
>> 
>> -	tasklet_setup(&dev->tasklet, tw5864_handle_frame_task);
>> +	INIT_WORK(&dev->work, tw5864_handle_frame_task);
>> 
>> 	for (i = 0; i < TW5864_INPUTS; i++) {
>> 		dev->inputs[i].root = dev;
>> @@ -1079,7 +1080,7 @@ int tw5864_video_init(struct tw5864_dev *dev, int *video_nr)
>> 	for (i = last_input_nr_registered; i >= 0; i--)
>> 		tw5864_video_input_fini(&dev->inputs[i]);
>> 
>> -	tasklet_kill(&dev->tasklet);
>> +	cancel_work_sync(&dev->work);
>> 
>> free_dma:
>> 	for (i = last_dma_allocated; i >= 0; i--) {
>> @@ -1198,7 +1199,7 @@ void tw5864_video_fini(struct tw5864_dev *dev)
>> {
>> 	int i;
>> 
>> -	tasklet_kill(&dev->tasklet);
>> +	cancel_work_sync(&dev->work);
>> 
>> 	for (i = 0; i < TW5864_INPUTS; i++)
>> 		tw5864_video_input_fini(&dev->inputs[i]);
>> @@ -1315,9 +1316,9 @@ static int tw5864_is_motion_triggered(struct tw5864_h264_frame *frame)
>> 	return detected;
>> }
>> 
>> -static void tw5864_handle_frame_task(struct tasklet_struct *t)
>> +static void tw5864_handle_frame_task(struct work_struct *t)
>> {
>> -	struct tw5864_dev *dev = from_tasklet(dev, t, tasklet);
>> +	struct tw5864_dev *dev = from_work(dev, t, work);
>> 	unsigned long flags;
>> 	int batch_size = H264_BUF_CNT;
>> 
>> diff --git a/drivers/media/pci/tw5864/tw5864.h b/drivers/media/pci/tw5864/tw5864.h
>> index a8b6fbd5b710..278373859098 100644
>> --- a/drivers/media/pci/tw5864/tw5864.h
>> +++ b/drivers/media/pci/tw5864/tw5864.h
>> @@ -12,6 +12,7 @@
>> #include <linux/mutex.h>
>> #include <linux/io.h>
>> #include <linux/interrupt.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/v4l2-common.h>
>> #include <media/v4l2-ioctl.h>
>> @@ -85,7 +86,7 @@ struct tw5864_input {
>> 	int nr; /* input number */
>> 	struct tw5864_dev *root;
>> 	struct mutex lock; /* used for vidq and vdev */
>> -	spinlock_t slock; /* used for sync between ISR, tasklet & V4L2 API */
>> +	spinlock_t slock; /* used for sync between ISR, work & V4L2 API */
>> 	struct video_device vdev;
>> 	struct v4l2_ctrl_handler hdl;
>> 	struct vb2_queue vidq;
>> @@ -142,7 +143,7 @@ struct tw5864_h264_frame {
>> 
>> /* global device status */
>> struct tw5864_dev {
>> -	spinlock_t slock; /* used for sync between ISR, tasklet & V4L2 API */
>> +	spinlock_t slock; /* used for sync between ISR, work & V4L2 API */
>> 	struct v4l2_device v4l2_dev;
>> 	struct tw5864_input inputs[TW5864_INPUTS];
>> #define H264_BUF_CNT 4
>> @@ -150,7 +151,7 @@ struct tw5864_dev {
>> 	int h264_buf_r_index;
>> 	int h264_buf_w_index;
>> 
>> -	struct tasklet_struct tasklet;
>> +	struct work_struct work;
>> 
>> 	int encoder_busy;
>> 	/* Input number to check next for ready raw picture (in RR fashion) */
>> diff --git a/drivers/media/platform/intel/pxa_camera.c b/drivers/media/platform/intel/pxa_camera.c
>> index d904952bf00e..df0a3c559287 100644
>> --- a/drivers/media/platform/intel/pxa_camera.c
>> +++ b/drivers/media/platform/intel/pxa_camera.c
>> @@ -43,6 +43,7 @@
>> #include <linux/videodev2.h>
>> 
>> #include <linux/platform_data/media/camera-pxa.h>
>> +#include <linux/workqueue.h>
>> 
>> #define PXA_CAM_VERSION "0.0.6"
>> #define PXA_CAM_DRV_NAME "pxa27x-camera"
>> @@ -683,7 +684,7 @@ struct pxa_camera_dev {
>> 	unsigned int		buf_sequence;
>> 
>> 	struct pxa_buffer	*active;
>> -	struct tasklet_struct	task_eof;
>> +	struct work_struct 	task_eof;
>> 
>> 	u32			save_cicr[5];
>> };
>> @@ -1146,9 +1147,9 @@ static void pxa_camera_deactivate(struct pxa_camera_dev *pcdev)
>> 	clk_disable_unprepare(pcdev->clk);
>> }
>> 
>> -static void pxa_camera_eof(struct tasklet_struct *t)
>> +static void pxa_camera_eof(struct work_struct *t)
>> {
>> -	struct pxa_camera_dev *pcdev = from_tasklet(pcdev, t, task_eof);
>> +	struct pxa_camera_dev *pcdev = from_work(pcdev, t, task_eof);
>> 	unsigned long cifr;
>> 	struct pxa_buffer *buf;
>> 
>> @@ -1185,7 +1186,7 @@ static irqreturn_t pxa_camera_irq(int irq, void *data)
>> 	if (status & CISR_EOF) {
>> 		cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_EOFM;
>> 		__raw_writel(cicr0, pcdev->base + CICR0);
>> -		tasklet_schedule(&pcdev->task_eof);
>> +		queue_work(system_bh_wq, &pcdev->task_eof);
>> 	}
>> 
>> 	return IRQ_HANDLED;
>> @@ -2383,7 +2384,7 @@ static int pxa_camera_probe(struct platform_device *pdev)
>> 		}
>> 	}
>> 
>> -	tasklet_setup(&pcdev->task_eof, pxa_camera_eof);
>> +	INIT_WORK(&pcdev->task_eof, pxa_camera_eof);
>> 
>> 	pxa_camera_activate(pcdev);
>> 
>> @@ -2409,7 +2410,7 @@ static int pxa_camera_probe(struct platform_device *pdev)
>> 	return 0;
>> exit_deactivate:
>> 	pxa_camera_deactivate(pcdev);
>> -	tasklet_kill(&pcdev->task_eof);
>> +	cancel_work_sync(&pcdev->task_eof);
>> exit_free_dma:
>> 	dma_release_channel(pcdev->dma_chans[2]);
>> exit_free_dma_u:
>> @@ -2428,7 +2429,7 @@ static void pxa_camera_remove(struct platform_device *pdev)
>> 	struct pxa_camera_dev *pcdev = platform_get_drvdata(pdev);
>> 
>> 	pxa_camera_deactivate(pcdev);
>> -	tasklet_kill(&pcdev->task_eof);
>> +	cancel_work_sync(&pcdev->task_eof);
>> 	dma_release_channel(pcdev->dma_chans[0]);
>> 	dma_release_channel(pcdev->dma_chans[1]);
>> 	dma_release_channel(pcdev->dma_chans[2]);
>> diff --git a/drivers/media/platform/marvell/mcam-core.c b/drivers/media/platform/marvell/mcam-core.c
>> index 66688b4aece5..d6b96a7039be 100644
>> --- a/drivers/media/platform/marvell/mcam-core.c
>> +++ b/drivers/media/platform/marvell/mcam-core.c
>> @@ -25,6 +25,7 @@
>> #include <linux/clk-provider.h>
>> #include <linux/videodev2.h>
>> #include <linux/pm_runtime.h>
>> +#include <linux/workqueue.h>
>> #include <media/v4l2-device.h>
>> #include <media/v4l2-ioctl.h>
>> #include <media/v4l2-ctrls.h>
>> @@ -439,9 +440,9 @@ static void mcam_ctlr_dma_vmalloc(struct mcam_camera *cam)
>> /*
>>  * Copy data out to user space in the vmalloc case
>>  */
>> -static void mcam_frame_tasklet(struct tasklet_struct *t)
>> +static void mcam_frame_work(struct work_struct *t)
>> {
>> -	struct mcam_camera *cam = from_tasklet(cam, t, s_tasklet);
>> +	struct mcam_camera *cam = from_work(cam, t, s_work);
>> 	int i;
>> 	unsigned long flags;
>> 	struct mcam_vb_buffer *buf;
>> @@ -480,7 +481,7 @@ static void mcam_frame_tasklet(struct tasklet_struct *t)
>> 
>> 
>> /*
>> - * Make sure our allocated buffers are up to the task.
>> + * Make sure our allocated buffers are up to the work.
>>  */
>> static int mcam_check_dma_buffers(struct mcam_camera *cam)
>> {
>> @@ -493,7 +494,7 @@ static int mcam_check_dma_buffers(struct mcam_camera *cam)
>> 
>> static void mcam_vmalloc_done(struct mcam_camera *cam, int frame)
>> {
>> -	tasklet_schedule(&cam->s_tasklet);
>> +	queue_work(system_bh_wq, &cam->s_work);
>> }
>> 
>> #else /* MCAM_MODE_VMALLOC */
>> @@ -1305,7 +1306,7 @@ static int mcam_setup_vb2(struct mcam_camera *cam)
>> 		break;
>> 	case B_vmalloc:
>> #ifdef MCAM_MODE_VMALLOC
>> -		tasklet_setup(&cam->s_tasklet, mcam_frame_tasklet);
>> +		INIT_WORK(&cam->s_work, mcam_frame_work);
>> 		vq->ops = &mcam_vb2_ops;
>> 		vq->mem_ops = &vb2_vmalloc_memops;
>> 		cam->dma_setup = mcam_ctlr_dma_vmalloc;
>> diff --git a/drivers/media/platform/marvell/mcam-core.h b/drivers/media/platform/marvell/mcam-core.h
>> index 51e66db45af6..0d4b953dbb23 100644
>> --- a/drivers/media/platform/marvell/mcam-core.h
>> +++ b/drivers/media/platform/marvell/mcam-core.h
>> @@ -9,6 +9,7 @@
>> 
>> #include <linux/list.h>
>> #include <linux/clk-provider.h>
>> +#include <linux/workqueue.h>
>> #include <media/v4l2-common.h>
>> #include <media/v4l2-ctrls.h>
>> #include <media/v4l2-dev.h>
>> @@ -167,7 +168,7 @@ struct mcam_camera {
>> 	unsigned int dma_buf_size;	/* allocated size */
>> 	void *dma_bufs[MAX_DMA_BUFS];	/* Internal buffer addresses */
>> 	dma_addr_t dma_handles[MAX_DMA_BUFS]; /* Buffer bus addresses */
>> -	struct tasklet_struct s_tasklet;
>> +	struct work_struct s_work;
>> #endif
>> 	unsigned int sequence;		/* Frame sequence number */
>> 	unsigned int buf_seq[MAX_DMA_BUFS]; /* Sequence for individual bufs */
>> diff --git a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c
>> index e4cf27b5a072..22b359569a10 100644
>> --- a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c
>> +++ b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.c
>> @@ -33,6 +33,7 @@
>> #include <linux/time.h>
>> #include <linux/usb.h>
>> #include <linux/wait.h>
>> +#include <linux/workqueue.h>
>> 
>> #include "c8sectpfe-common.h"
>> #include "c8sectpfe-core.h"
>> @@ -73,16 +74,16 @@ static void c8sectpfe_timer_interrupt(struct timer_list *t)
>> 
>> 		/* is this descriptor initialised and TP enabled */
>> 		if (channel->irec && readl(channel->irec + DMA_PRDS_TPENABLE))
>> -			tasklet_schedule(&channel->tsklet);
>> +			queue_work(system_bh_wq, &channel->tsklet);
>> 	}
>> 
>> 	fei->timer.expires = jiffies +	msecs_to_jiffies(POLL_MSECS);
>> 	add_timer(&fei->timer);
>> }
>> 
>> -static void channel_swdemux_tsklet(struct tasklet_struct *t)
>> +static void channel_swdemux_tsklet(struct work_struct *t)
>> {
>> -	struct channel_info *channel = from_tasklet(channel, t, tsklet);
>> +	struct channel_info *channel = from_work(channel, t, tsklet);
>> 	struct c8sectpfei *fei;
>> 	unsigned long wp, rp;
>> 	int pos, num_packets, n, size;
>> @@ -211,7 +212,7 @@ static int c8sectpfe_start_feed(struct dvb_demux_feed *dvbdmxfeed)
>> 
>> 		dev_dbg(fei->dev, "Starting channel=%p\n", channel);
>> 
>> -		tasklet_setup(&channel->tsklet, channel_swdemux_tsklet);
>> +		INIT_WORK(&channel->tsklet, channel_swdemux_tsklet);
>> 
>> 		/* Reset the internal inputblock sram pointers */
>> 		writel(channel->fifo,
>> @@ -304,7 +305,7 @@ static int c8sectpfe_stop_feed(struct dvb_demux_feed *dvbdmxfeed)
>> 		/* disable this channels descriptor */
>> 		writel(0,  channel->irec + DMA_PRDS_TPENABLE);
>> 
>> -		tasklet_disable(&channel->tsklet);
>> +		disable_work_sync(&channel->tsklet);
>> 
>> 		/* now request memdma channel goes idle */
>> 		idlereq = (1 << channel->tsin_id) | IDLEREQ;
>> @@ -631,8 +632,8 @@ static int configure_memdma_and_inputblock(struct c8sectpfei *fei,
>> 	writel(tsin->back_buffer_busaddr, tsin->irec + DMA_PRDS_BUSWP_TP(0));
>> 	writel(tsin->back_buffer_busaddr, tsin->irec + DMA_PRDS_BUSRP_TP(0));
>> 
>> -	/* initialize tasklet */
>> -	tasklet_setup(&tsin->tsklet, channel_swdemux_tsklet);
>> +	/* initialize work */
>> +	INIT_WORK(&tsin->tsklet, channel_swdemux_tsklet);
>> 
>> 	return 0;
>> 
>> diff --git a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h
>> index bf377cc82225..d63f0ee83615 100644
>> --- a/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h
>> +++ b/drivers/media/platform/st/sti/c8sectpfe/c8sectpfe-core.h
>> @@ -51,7 +51,7 @@ struct channel_info {
>> 	unsigned long  fifo;
>> 
>> 	struct completion idle_completion;
>> -	struct tasklet_struct tsklet;
>> +	struct work_struct tsklet;
>> 
>> 	struct c8sectpfei *fei;
>> 	void __iomem *irec;
>> diff --git a/drivers/media/radio/wl128x/fmdrv.h b/drivers/media/radio/wl128x/fmdrv.h
>> index da8920169df8..85282f638c4a 100644
>> --- a/drivers/media/radio/wl128x/fmdrv.h
>> +++ b/drivers/media/radio/wl128x/fmdrv.h
>> @@ -15,6 +15,7 @@
>> #include <sound/core.h>
>> #include <sound/initval.h>
>> #include <linux/timer.h>
>> +#include <linux/workqueue.h>
>> #include <media/v4l2-ioctl.h>
>> #include <media/v4l2-common.h>
>> #include <media/v4l2-device.h>
>> @@ -200,15 +201,15 @@ struct fmdev {
>> 	int streg_cbdata; /* status of ST registration */
>> 
>> 	struct sk_buff_head rx_q;	/* RX queue */
>> -	struct tasklet_struct rx_task;	/* RX Tasklet */
>> +	struct work_struct rx_task;	/* RX Work */
>> 
>> 	struct sk_buff_head tx_q;	/* TX queue */
>> -	struct tasklet_struct tx_task;	/* TX Tasklet */
>> +	struct work_struct tx_task;	/* TX Work */
>> 	unsigned long last_tx_jiffies;	/* Timestamp of last pkt sent */
>> 	atomic_t tx_cnt;	/* Number of packets can send at a time */
>> 
>> 	struct sk_buff *resp_skb;	/* Response from the chip */
>> -	/* Main task completion handler */
>> +	/* Main work completion handler */
>> 	struct completion maintask_comp;
>> 	/* Opcode of last command sent to the chip */
>> 	u8 pre_op;
>> diff --git a/drivers/media/radio/wl128x/fmdrv_common.c b/drivers/media/radio/wl128x/fmdrv_common.c
>> index 3da8e5102bec..52290bb4a4ad 100644
>> --- a/drivers/media/radio/wl128x/fmdrv_common.c
>> +++ b/drivers/media/radio/wl128x/fmdrv_common.c
>> @@ -9,7 +9,7 @@
>>  *     one Channel-8 command to be sent to the chip).
>>  *  2) Sending each Channel-8 command to the chip and reading
>>  *     response back over Shared Transport.
>> - *  3) Managing TX and RX Queues and Tasklets.
>> + *  3) Managing TX and RX Queues and Works.
>>  *  4) Handling FM Interrupt packet and taking appropriate action.
>>  *  5) Loading FM firmware to the chip (common, FM TX, and FM RX
>>  *     firmware files based on mode selection)
>> @@ -29,6 +29,7 @@
>> #include "fmdrv_v4l2.h"
>> #include "fmdrv_common.h"
>> #include <linux/ti_wilink_st.h>
>> +#include <linux/workqueue.h>
>> #include "fmdrv_rx.h"
>> #include "fmdrv_tx.h"
>> 
>> @@ -244,10 +245,10 @@ void fmc_update_region_info(struct fmdev *fmdev, u8 region_to_set)
>> }
>> 
>> /*
>> - * FM common sub-module will schedule this tasklet whenever it receives
>> + * FM common sub-module will schedule this work whenever it receives
>>  * FM packet from ST driver.
>>  */
>> -static void recv_tasklet(struct tasklet_struct *t)
>> +static void recv_work(struct work_struct *t)
>> {
>> 	struct fmdev *fmdev;
>> 	struct fm_irq *irq_info;
>> @@ -256,7 +257,7 @@ static void recv_tasklet(struct tasklet_struct *t)
>> 	u8 num_fm_hci_cmds;
>> 	unsigned long flags;
>> 
>> -	fmdev = from_tasklet(fmdev, t, tx_task);
>> +	fmdev = from_work(fmdev, t, tx_task);
>> 	irq_info = &fmdev->irq_info;
>> 	/* Process all packets in the RX queue */
>> 	while ((skb = skb_dequeue(&fmdev->rx_q))) {
>> @@ -322,22 +323,22 @@ static void recv_tasklet(struct tasklet_struct *t)
>> 
>> 		/*
>> 		 * Check flow control field. If Num_FM_HCI_Commands field is
>> -		 * not zero, schedule FM TX tasklet.
>> +		 * not zero, schedule FM TX work.
>> 		 */
>> 		if (num_fm_hci_cmds && atomic_read(&fmdev->tx_cnt))
>> 			if (!skb_queue_empty(&fmdev->tx_q))
>> -				tasklet_schedule(&fmdev->tx_task);
>> +				queue_work(system_bh_wq, &fmdev->tx_task);
>> 	}
>> }
>> 
>> -/* FM send tasklet: is scheduled when FM packet has to be sent to chip */
>> -static void send_tasklet(struct tasklet_struct *t)
>> +/* FM send work: is scheduled when FM packet has to be sent to chip */
>> +static void send_work(struct work_struct *t)
>> {
>> 	struct fmdev *fmdev;
>> 	struct sk_buff *skb;
>> 	int len;
>> 
>> -	fmdev = from_tasklet(fmdev, t, tx_task);
>> +	fmdev = from_work(fmdev, t, tx_task);
>> 
>> 	if (!atomic_read(&fmdev->tx_cnt))
>> 		return;
>> @@ -366,7 +367,7 @@ static void send_tasklet(struct tasklet_struct *t)
>> 	if (len < 0) {
>> 		kfree_skb(skb);
>> 		fmdev->resp_comp = NULL;
>> -		fmerr("TX tasklet failed to send skb(%p)\n", skb);
>> +		fmerr("TX work failed to send skb(%p)\n", skb);
>> 		atomic_set(&fmdev->tx_cnt, 1);
>> 	} else {
>> 		fmdev->last_tx_jiffies = jiffies;
>> @@ -374,7 +375,7 @@ static void send_tasklet(struct tasklet_struct *t)
>> }
>> 
>> /*
>> - * Queues FM Channel-8 packet to FM TX queue and schedules FM TX tasklet for
>> + * Queues FM Channel-8 packet to FM TX queue and schedules FM TX work for
>>  * transmission
>>  */
>> static int fm_send_cmd(struct fmdev *fmdev, u8 fm_op, u16 type,	void *payload,
>> @@ -440,7 +441,7 @@ static int fm_send_cmd(struct fmdev *fmdev, u8 fm_op, u16 type,	void *payload,
>> 
>> 	fm_cb(skb)->completion = wait_completion;
>> 	skb_queue_tail(&fmdev->tx_q, skb);
>> -	tasklet_schedule(&fmdev->tx_task);
>> +	queue_work(system_bh_wq, &fmdev->tx_task);
>> 
>> 	return 0;
>> }
>> @@ -462,7 +463,7 @@ int fmc_send_cmd(struct fmdev *fmdev, u8 fm_op, u16 type, void *payload,
>> 
>> 	if (!wait_for_completion_timeout(&fmdev->maintask_comp,
>> 					 FM_DRV_TX_TIMEOUT)) {
>> -		fmerr("Timeout(%d sec),didn't get regcompletion signal from RX tasklet\n",
>> +		fmerr("Timeout(%d sec),didn't get regcompletion signal from RX work\n",
>> 			   jiffies_to_msecs(FM_DRV_TX_TIMEOUT) / 1000);
>> 		return -ETIMEDOUT;
>> 	}
>> @@ -1455,7 +1456,7 @@ static long fm_st_receive(void *arg, struct sk_buff *skb)
>> 
>> 	memcpy(skb_push(skb, 1), &skb->cb[0], 1);
>> 	skb_queue_tail(&fmdev->rx_q, skb);
>> -	tasklet_schedule(&fmdev->rx_task);
>> +	queue_work(system_bh_wq, &fmdev->rx_task);
>> 
>> 	return 0;
>> }
>> @@ -1537,13 +1538,13 @@ int fmc_prepare(struct fmdev *fmdev)
>> 	spin_lock_init(&fmdev->rds_buff_lock);
>> 	spin_lock_init(&fmdev->resp_skb_lock);
>> 
>> -	/* Initialize TX queue and TX tasklet */
>> +	/* Initialize TX queue and TX work */
>> 	skb_queue_head_init(&fmdev->tx_q);
>> -	tasklet_setup(&fmdev->tx_task, send_tasklet);
>> +	INIT_WORK(&fmdev->tx_task, send_work);
>> 
>> -	/* Initialize RX Queue and RX tasklet */
>> +	/* Initialize RX Queue and RX work */
>> 	skb_queue_head_init(&fmdev->rx_q);
>> -	tasklet_setup(&fmdev->rx_task, recv_tasklet);
>> +	INIT_WORK(&fmdev->rx_task, recv_work);
>> 
>> 	fmdev->irq_info.stage = 0;
>> 	atomic_set(&fmdev->tx_cnt, 1);
>> @@ -1589,8 +1590,8 @@ int fmc_release(struct fmdev *fmdev)
>> 	/* Service pending read */
>> 	wake_up_interruptible(&fmdev->rx.rds.read_queue);
>> 
>> -	tasklet_kill(&fmdev->tx_task);
>> -	tasklet_kill(&fmdev->rx_task);
>> +	cancel_work_sync(&fmdev->tx_task);
>> +	cancel_work_sync(&fmdev->rx_task);
>> 
>> 	skb_queue_purge(&fmdev->tx_q);
>> 	skb_queue_purge(&fmdev->rx_q);
>> diff --git a/drivers/media/rc/mceusb.c b/drivers/media/rc/mceusb.c
>> index c76ba24c1f55..a2e2e58b7506 100644
>> --- a/drivers/media/rc/mceusb.c
>> +++ b/drivers/media/rc/mceusb.c
>> @@ -774,7 +774,7 @@ static void mceusb_dev_printdata(struct mceusb_dev *ir, u8 *buf, int buf_len,
>> 
>> /*
>>  * Schedule work that can't be done in interrupt handlers
>> - * (mceusb_dev_recv() and mce_write_callback()) nor tasklets.
>> + * (mceusb_dev_recv() and mce_write_callback()) nor works.
>>  * Invokes mceusb_deferred_kevent() for recovering from
>>  * error events specified by the kevent bit field.
>>  */
>> diff --git a/drivers/media/usb/ttusb-dec/ttusb_dec.c b/drivers/media/usb/ttusb-dec/ttusb_dec.c
>> index 79faa2560613..55eeb00f1126 100644
>> --- a/drivers/media/usb/ttusb-dec/ttusb_dec.c
>> +++ b/drivers/media/usb/ttusb-dec/ttusb_dec.c
>> @@ -19,6 +19,7 @@
>> #include <linux/input.h>
>> 
>> #include <linux/mutex.h>
>> +#include <linux/workqueue.h>
>> 
>> #include <media/dmxdev.h>
>> #include <media/dvb_demux.h>
>> @@ -139,7 +140,7 @@ struct ttusb_dec {
>> 	int			v_pes_postbytes;
>> 
>> 	struct list_head	urb_frame_list;
>> -	struct tasklet_struct	urb_tasklet;
>> +	struct work_struct 	urb_work;
>> 	spinlock_t		urb_frame_list_lock;
>> 
>> 	struct dvb_demux_filter	*audio_filter;
>> @@ -766,9 +767,9 @@ static void ttusb_dec_process_urb_frame(struct ttusb_dec *dec, u8 *b,
>> 	}
>> }
>> 
>> -static void ttusb_dec_process_urb_frame_list(struct tasklet_struct *t)
>> +static void ttusb_dec_process_urb_frame_list(struct work_struct *t)
>> {
>> -	struct ttusb_dec *dec = from_tasklet(dec, t, urb_tasklet);
>> +	struct ttusb_dec *dec = from_work(dec, t, urb_work);
>> 	struct list_head *item;
>> 	struct urb_frame *frame;
>> 	unsigned long flags;
>> @@ -822,7 +823,7 @@ static void ttusb_dec_process_urb(struct urb *urb)
>> 				spin_unlock_irqrestore(&dec->urb_frame_list_lock,
>> 						       flags);
>> 
>> -				tasklet_schedule(&dec->urb_tasklet);
>> +				queue_work(system_bh_wq, &dec->urb_work);
>> 			}
>> 		}
>> 	} else {
>> @@ -1198,11 +1199,11 @@ static int ttusb_dec_alloc_iso_urbs(struct ttusb_dec *dec)
>> 	return 0;
>> }
>> 
>> -static void ttusb_dec_init_tasklet(struct ttusb_dec *dec)
>> +static void ttusb_dec_init_work(struct ttusb_dec *dec)
>> {
>> 	spin_lock_init(&dec->urb_frame_list_lock);
>> 	INIT_LIST_HEAD(&dec->urb_frame_list);
>> -	tasklet_setup(&dec->urb_tasklet, ttusb_dec_process_urb_frame_list);
>> +	INIT_WORK(&dec->urb_work, ttusb_dec_process_urb_frame_list);
>> }
>> 
>> static int ttusb_init_rc( struct ttusb_dec *dec)
>> @@ -1588,12 +1589,12 @@ static void ttusb_dec_exit_usb(struct ttusb_dec *dec)
>> 	ttusb_dec_free_iso_urbs(dec);
>> }
>> 
>> -static void ttusb_dec_exit_tasklet(struct ttusb_dec *dec)
>> +static void ttusb_dec_exit_work(struct ttusb_dec *dec)
>> {
>> 	struct list_head *item;
>> 	struct urb_frame *frame;
>> 
>> -	tasklet_kill(&dec->urb_tasklet);
>> +	cancel_work_sync(&dec->urb_work);
>> 
>> 	while ((item = dec->urb_frame_list.next) != &dec->urb_frame_list) {
>> 		frame = list_entry(item, struct urb_frame, urb_frame_list);
>> @@ -1703,7 +1704,7 @@ static int ttusb_dec_probe(struct usb_interface *intf,
>> 
>> 	ttusb_dec_init_v_pes(dec);
>> 	ttusb_dec_init_filters(dec);
>> -	ttusb_dec_init_tasklet(dec);
>> +	ttusb_dec_init_work(dec);
>> 
>> 	dec->active = 1;
>> 
>> @@ -1729,7 +1730,7 @@ static void ttusb_dec_disconnect(struct usb_interface *intf)
>> 	dprintk("%s\n", __func__);
>> 
>> 	if (dec->active) {
>> -		ttusb_dec_exit_tasklet(dec);
>> +		ttusb_dec_exit_work(dec);
>> 		ttusb_dec_exit_filters(dec);
>> 		if(enable_rc)
>> 			ttusb_dec_exit_rc(dec);
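
For readers following the conversion pattern in the quoted diff: each handler now takes a struct work_struct pointer and recovers its device structure with from_work(), which, as far as I can tell, is a thin container_of() wrapper. The sketch below is a userspace model of that recovery step, not kernel code. struct demo_dev, demo_init_work() and demo_run_work() are made-up stand-ins for the driver structure, INIT_WORK() and queue_work(). The model also shows that the member named in the handler has to be the one the work was initialised on. In the quoted wl128x hunk, recv_work() names tx_task while INIT_WORK() registers it on rx_task, which may be worth a second look.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel type; not the real definition. */
struct work_struct {
	void (*func)(struct work_struct *work);
};

/*
 * Same idea as the kernel's container_of(): step back from a pointer
 * to an embedded member to the structure that contains it.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical device with two embedded work items. */
struct demo_dev {
	int rx_seen;
	int tx_seen;
	struct work_struct rx_task;
	struct work_struct tx_task;
};

static void demo_recv_work(struct work_struct *t)
{
	/* The member named here must match the field that was queued. */
	struct demo_dev *dev = container_of(t, struct demo_dev, rx_task);

	dev->rx_seen++;
}

static void demo_send_work(struct work_struct *t)
{
	struct demo_dev *dev = container_of(t, struct demo_dev, tx_task);

	dev->tx_seen++;
}

/* Stand-in for INIT_WORK(): only records the callback. */
static void demo_init_work(struct work_struct *w,
			   void (*fn)(struct work_struct *))
{
	w->func = fn;
}

/* Stand-in for queue_work(): runs the callback immediately. */
static void demo_run_work(struct work_struct *w)
{
	w->func(w);
}
```

Naming the wrong member in container_of() would subtract the wrong offset and hand the handler a pointer that is not the start of the device structure.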



* [PATCH net-next v2 2/2] net: mana: Add new device attributes for mana
  2024-04-24 10:32 78% [PATCH net-next v2 0/2] Add sysfs attributes for MANA Shradha Gupta
  2024-04-24 10:33 72% ` [PATCH net-next v2 1/2] net: Add sysfs atttributes for max_mtu min_mtu Shradha Gupta
@ 2024-04-24 10:34 72% ` Shradha Gupta
    2 siblings, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-24 10:34 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Bjorn Helgaas, Jonathan Corbet, Randy Dunlap, Johannes Berg,
	Breno Leitao, linux-kernel, netdev, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Souradeep Chakrabarti, Konstantin Taranov, Yury Norov,
	linux-hyperv
  Cc: Shradha Gupta, shradhagupta

Add new device attributes to read the num_ports and max_num_msix settings
of a MANA device.

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
---
 Changes in v2
 * Used the method suggested in the v1 discussion to implement sysfs device
   parameters for the MANA device
 * Implemented attributes max_mtu and min_mtu generically for all device
   drivers
---
 .../net/ethernet/microsoft/mana/gdma_main.c   | 32 +++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 1332db9a08eb..e35f984e34ce 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1471,6 +1471,37 @@ static bool mana_is_pf(unsigned short dev_id)
 	return dev_id == MANA_PF_DEVICE_ID;
 }
 
+static ssize_t num_ports_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct gdma_context *gc = pci_get_drvdata(pdev);
+	struct mana_context *ac = gc->mana.driver_data;
+
+	return sysfs_emit(buf, "%d\n", ac->num_ports);
+}
+
+static DEVICE_ATTR_RO(num_ports);
+
+static ssize_t max_num_msix_show(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct gdma_context *gc = pci_get_drvdata(pdev);
+
+	return sysfs_emit(buf, "%d\n", gc->max_num_msix);
+}
+
+static DEVICE_ATTR_RO(max_num_msix);
+
+static struct attribute *mana_gd_device_attrs[] = {
+	&dev_attr_num_ports.attr,
+	&dev_attr_max_num_msix.attr,
+	NULL,
+};
+
+ATTRIBUTE_GROUPS(mana_gd_device);
+
 static int mana_gd_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct gdma_context *gc;
@@ -1613,6 +1644,7 @@ static const struct pci_device_id mana_id_table[] = {
 };
 
 static struct pci_driver mana_driver = {
+	.dev_groups	= mana_gd_device_groups,
 	.name		= "mana",
 	.id_table	= mana_id_table,
 	.probe		= mana_gd_probe,
-- 
2.34.1
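
The show callbacks in this patch each format one decimal value followed by a newline. A rough userspace approximation of that convention is sketched below. demo_show_int(), struct demo_mana_state and DEMO_PAGE_SIZE are hypothetical names, and snprintf() only stands in for sysfs_emit(); the sketch assumes the usual one-page sysfs buffer.

```c
#include <stdio.h>
#include <sys/types.h>

#define DEMO_PAGE_SIZE 4096 /* assumption: sysfs buffers are one page */

/* Hypothetical stand-in for the driver state the callbacks read. */
struct demo_mana_state {
	int num_ports;
	int max_num_msix;
};

/*
 * Approximation of a sysfs show callback: write one decimal value plus
 * a newline into buf and return the number of bytes written, or -1 if
 * the text would not fit.
 */
static ssize_t demo_show_int(char *buf, int value)
{
	int len = snprintf(buf, DEMO_PAGE_SIZE, "%d\n", value);

	if (len < 0 || len >= DEMO_PAGE_SIZE)
		return -1;
	return len;
}
```

With the patch applied, reading the attribute from the PCI device's sysfs directory would be expected to return text of this shape.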



* [PATCH net-next v2 1/2] net: Add sysfs atttributes for max_mtu min_mtu
  2024-04-24 10:32 78% [PATCH net-next v2 0/2] Add sysfs attributes for MANA Shradha Gupta
@ 2024-04-24 10:33 72% ` Shradha Gupta
    2024-04-24 10:34 72% ` [PATCH net-next v2 2/2] net: mana: Add new device attributes for mana Shradha Gupta
    2 siblings, 1 reply; 200+ results
From: Shradha Gupta @ 2024-04-24 10:33 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Bjorn Helgaas, Jonathan Corbet, Randy Dunlap, Johannes Berg,
	Breno Leitao, linux-kernel, netdev, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Souradeep Chakrabarti, Konstantin Taranov, Yury Norov,
	linux-hyperv
  Cc: Shradha Gupta, shradhagupta

Add sysfs attributes to read the max_mtu and min_mtu values of
network devices.

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
---
 Changes in v2:
 * Created a new patch for generic attributes
---
 Documentation/ABI/testing/sysfs-class-net | 16 ++++++++++++++++
 net/core/net-sysfs.c                      |  4 ++++
 2 files changed, 20 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index ebf21beba846..f68f3b9be6ec 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -352,3 +352,19 @@ Description:
 		0  threaded mode disabled for this dev
 		1  threaded mode enabled for this dev
 		== ==================================
+
+What:           /sys/class/net/<iface>/max_mtu
+Date:           April 2024
+KernelVersion:  6.10
+Contact:        netdev@vger.kernel.org
+Description:
+                Indicates the interface's maximum supported MTU value, in
+                bytes, and in decimal format.
+
+What:           /sys/class/net/<iface>/min_mtu
+Date:           April 2024
+KernelVersion:  6.10
+Contact:        netdev@vger.kernel.org
+Description:
+                Indicates the interface's minimum supported MTU value, in
+                bytes, and in decimal format.
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index e3d7a8cfa20b..525b85d47676 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -114,6 +114,8 @@ NETDEVICE_SHOW_RO(addr_len, fmt_dec);
 NETDEVICE_SHOW_RO(ifindex, fmt_dec);
 NETDEVICE_SHOW_RO(type, fmt_dec);
 NETDEVICE_SHOW_RO(link_mode, fmt_dec);
+NETDEVICE_SHOW_RO(max_mtu, fmt_dec);
+NETDEVICE_SHOW_RO(min_mtu, fmt_dec);
 
 static ssize_t iflink_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
@@ -671,6 +673,8 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_carrier_up_count.attr,
 	&dev_attr_carrier_down_count.attr,
 	&dev_attr_threaded.attr,
+	&dev_attr_max_mtu.attr,
+	&dev_attr_min_mtu.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(net_class);
-- 
2.34.1
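
A userspace consumer of the proposed attributes might look roughly like the sketch below. It assumes a kernel with this patch applied, so that /sys/class/net/<iface>/max_mtu and min_mtu exist and return decimal text as the ABI entry describes. demo_parse_mtu() and demo_read_mtu_attr() are hypothetical helper names.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text a decimal sysfs attribute returns, e.g. "1500\n".
 * Returns 0 and stores the value on success, -1 on malformed input.
 */
static int demo_parse_mtu(const char *text, unsigned long *out)
{
	char *end;
	unsigned long val;

	/* Reject signs, whitespace and empty strings up front. */
	if (text[0] < '0' || text[0] > '9')
		return -1;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno)
		return -1;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -1;

	*out = val;
	return 0;
}

/* Read /sys/class/net/<iface>/<attr>, where attr is max_mtu or min_mtu. */
static int demo_read_mtu_attr(const char *iface, const char *attr,
			      unsigned long *out)
{
	char path[256];
	char line[32];
	FILE *f;
	int ret = -1;

	if (snprintf(path, sizeof(path), "/sys/class/net/%s/%s",
		     iface, attr) >= (int)sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;	/* attribute or interface not present */

	if (fgets(line, sizeof(line), f))
		ret = demo_parse_mtu(line, out);
	fclose(f);
	return ret;
}
```

On a kernel without the patch the open simply fails, so a caller can treat -1 as "attribute not available".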



* [PATCH net-next v2 0/2] Add sysfs attributes for MANA
@ 2024-04-24 10:32 78% Shradha Gupta
  2024-04-24 10:33 72% ` [PATCH net-next v2 1/2] net: Add sysfs atttributes for max_mtu min_mtu Shradha Gupta
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Shradha Gupta @ 2024-04-24 10:32 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Bjorn Helgaas, Jonathan Corbet, Randy Dunlap, Johannes Berg,
	Breno Leitao, linux-kernel, netdev, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Souradeep Chakrabarti, Konstantin Taranov, Yury Norov,
	linux-hyperv
  Cc: Shradha Gupta, shradhagupta

These patches add sysfs attributes to improve debuggability of MANA
devices.

The first patch adds the max_mtu and min_mtu attributes, which are
implemented generically for all network devices.

The second patch adds the MANA-specific attributes max_num_msix and
num_ports.

Shradha Gupta (2):
  net: Add sysfs atttributes for max_mtu min_mtu
  net: mana: Add new device attributes for mana

 Documentation/ABI/testing/sysfs-class-net     | 16 ++++++++++
 .../net/ethernet/microsoft/mana/gdma_main.c   | 32 +++++++++++++++++++
 net/core/net-sysfs.c                          |  4 +++
 3 files changed, 52 insertions(+)

-- 
2.34.1



* RE: [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing cq callbacks
  2024-04-23 23:45 79%   ` Long Li
@ 2024-04-24  8:58 79%     ` Konstantin Taranov
  0 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-24  8:58 UTC (permalink / raw)
  To: Long Li, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> > Add a boundary check inside mana_ib_install_cq_cb to prevent index
> > overflow.
> 
> How is this condition possible that we are getting an out of bound queue id
> from SOC?
> 

It should not happen, since the HW reports the upper limit on CQ_ID,
but I think the check is a cheap safeguard against bugs or faulty HW.
Better safe than sorry.
The same check appears throughout the mana.ko module.
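
To make the defensive check concrete, here is a small userspace sketch of installing an entry into a fixed-size table only after validating the index. struct demo_gdma, demo_install_cq() and DEMO_MAX_CQS are hypothetical stand-ins for the gdma context, the helper and gc->max_num_cqs. Returning -EEXIST for an occupied slot is a choice made for this sketch; the quoted patch only has a WARN_ON() there.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_MAX_CQS 4u /* assumption: stands in for gc->max_num_cqs */

struct demo_queue {
	unsigned int id;
};

struct demo_gdma {
	unsigned int max_num_cqs;
	struct demo_queue *cq_table[DEMO_MAX_CQS];
};

static int demo_install_cq(struct demo_gdma *gc, unsigned int id)
{
	struct demo_queue *q;

	/* Reject ids the table cannot hold before touching the table. */
	if (id >= gc->max_num_cqs)
		return -EINVAL;

	/* Sketch-only policy: refuse to overwrite an existing entry. */
	if (gc->cq_table[id])
		return -EEXIST;

	q = calloc(1, sizeof(*q));
	if (!q)
		return -ENOMEM;

	q->id = id;
	gc->cq_table[id] = q;
	return 0;
}
```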


> >
> > Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb
> > helper function")
> > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > ---
> >  drivers/infiniband/hw/mana/cq.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/infiniband/hw/mana/cq.c
> > b/drivers/infiniband/hw/mana/cq.c index 6c3bb8c..8323085 100644
> > --- a/drivers/infiniband/hw/mana/cq.c
> > +++ b/drivers/infiniband/hw/mana/cq.c
> > @@ -70,6 +70,8 @@ int mana_ib_install_cq_cb(struct mana_ib_dev
> *mdev,
> > struct mana_ib_cq *cq)
> >  	struct gdma_context *gc = mdev_to_gc(mdev);
> >  	struct gdma_queue *gdma_cq;
> >
> > +	if (cq->queue.id >= gc->max_num_cqs)
> > +		return -EINVAL;
> >  	/* Create CQ table entry */
> >  	WARN_ON(gc->cq_table[cq->queue.id]);
> >  	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> > --
> > 2.43.0



* RE: [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks
  2024-04-23 23:42 79%   ` Long Li
@ 2024-04-24  8:50 79%     ` Konstantin Taranov
  2024-04-25 20:29 79%       ` Long Li
  0 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-24  8:50 UTC (permalink / raw)
  To: Long Li, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> > +void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct
> > mana_ib_cq
> > +*cq) {
> > +	struct gdma_context *gc = mdev_to_gc(mdev);
> > +
> > +	if (cq->queue.id >= gc->max_num_cqs)
> > +		return;
> > +
> > +	kfree(gc->cq_table[cq->queue.id]);
> > +	gc->cq_table[cq->queue.id] = NULL;
> 
> Why the check for (cq->queue.id != INVALID_QUEUE_ID) is removed?

Since max_num_cqs is always less than INVALID_QUEUE_ID, the new "if" already covers that case.
I can add " || cq->queue.id == INVALID_QUEUE_ID " to the condition if you want.
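
The argument can be illustrated with a short userspace sketch. It assumes the sentinel is the all-ones 32-bit value and that queue ids are unsigned, so a single "id >= max" comparison rejects the sentinel too. struct demo_cq_table, demo_remove_cq() and DEMO_INVALID_QUEUE_ID are hypothetical names, and the int return value exists only so the behaviour can be observed; the helper in the patch returns void.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_INVALID_QUEUE_ID UINT32_MAX /* assumption: all-ones sentinel */
#define DEMO_TABLE_SIZE 8u

struct demo_cq_table {
	uint32_t max_num_cqs;	/* must not exceed DEMO_TABLE_SIZE */
	void *entries[DEMO_TABLE_SIZE];
};

/* Returns 1 if the id was in range and the slot was cleared, else 0. */
static int demo_remove_cq(struct demo_cq_table *t, uint32_t id)
{
	/*
	 * Unsigned comparison: the sentinel is the largest uint32_t, so
	 * it is rejected here along with every other out-of-range id.
	 */
	if (id >= t->max_num_cqs)
		return 0;

	free(t->entries[id]);	/* free(NULL) is a no-op */
	t->entries[id] = NULL;
	return 1;
}
```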

> > @@ -173,13 +171,6 @@ static int mana_ib_create_qp_rss(struct ib_qp
> > *ibqp, struct ib_pd *pd,
> >  		goto fail;
> >  	}
> >
> > -	gdma_cq_allocated = kcalloc(ind_tbl_size,
> > sizeof(*gdma_cq_allocated),
> > -				    GFP_KERNEL);
> > -	if (!gdma_cq_allocated) {
> > -		ret = -ENOMEM;
> > -		goto fail;
> > -	}
> > -
> 
> Why the allocation for CQs is removed? This is not related to this patch.

It becomes dead code once the helper is added. gdma_cq_allocated was allocated to
temporarily store the gdma_queue pointers so that they could be deallocated later.
The new helper frees them directly via kfree(gc->cq_table[cq->queue.id]), which
leaves gdma_cq_allocated unused.



* RE: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size
  2024-04-23 23:34 79%   ` Long Li
@ 2024-04-24  8:43 79%     ` Konstantin Taranov
  2024-04-25 20:17 79%       ` Long Li
  0 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-24  8:43 UTC (permalink / raw)
  To: Long Li, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> From: Long Li <longli@microsoft.com>
> Sent: Wednesday, 24 April 2024 01:35
> To: Konstantin Taranov <kotaranov@linux.microsoft.com>; Konstantin
> Taranov <kotaranov@microsoft.com>; sharmaajay@microsoft.com;
> jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe
> with buf_size
> 
> > Subject: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe
> > with buf_size
> 
> I don't understand this commit message on "duplicate" cqe. I couldn't find a
> duplicate of it in the existing code.

If we need cqe, it is available as cq->ibcq.cqe. The patch does not assign it
because it is not used, but if you want I can add "cq->ibcq.cqe = attr->cqe;"
in v2.

- Konstantin

^ permalink raw reply	[relevance 79%]

* RE: [PATCH v1 1/1] RDMA/mana_ib: Fix compilation error
  @ 2024-04-23 23:58 79% ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-23 23:58 UTC (permalink / raw)
  To: Andy Shevchenko, Leon Romanovsky, Konstantin Taranov, linux-rdma,
	linux-kernel
  Cc: Ajay Sharma, Jason Gunthorpe

> Subject: [PATCH v1 1/1] RDMA/mana_ib: Fix compilation error
> 
> The compilation with CONFIG_WERROR=y is broken:
> 
> .../hw/mana/device.c:88:6: error: variable 'ret' is used uninitialized whenever
> 'if' condition is true [-Werror,-Wsometimes-uninitialized]
> 	if (!upper_ndev) {
> 	    ^~~~~~~~~~~
> 
> Fix this by assigning the ret to -ENODEV in respective condition.
> 
> Fixes: 8b184e4f1c32 ("RDMA/mana_ib: Enable RoCE on port 1")
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/device.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/infiniband/hw/mana/device.c
> b/drivers/infiniband/hw/mana/device.c
> index fca4d0d85c64..4c45f8681e7f 100644
> --- a/drivers/infiniband/hw/mana/device.c
> +++ b/drivers/infiniband/hw/mana/device.c
> @@ -88,6 +88,7 @@ static int mana_ib_probe(struct auxiliary_device *adev,
>  	if (!upper_ndev) {
>  		rcu_read_unlock();
>  		ibdev_err(&dev->ib_dev, "Failed to get master netdev");
> +		ret = -ENODEV;
>  		goto free_ib_device;
>  	}
>  	ether_addr_copy(mac_addr, upper_ndev->dev_addr);
> --
> 2.43.0.rc1.1336.g36b5255a03ac


^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-04-18 16:52 67% ` [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
@ 2024-04-23 23:57 79%   ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-23 23:57 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation
> of rnic cq
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Enable users to create RNIC CQs.
> With the previous request size, an ethernet CQ is created.
> Use the cq_buf_size from the user to create an RNIC CQ and return its ID.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/cq.c | 56 ++++++++++++++++++++++++++++++---
>  include/uapi/rdma/mana-abi.h    |  7 +++++
>  2 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c
> b/drivers/infiniband/hw/mana/cq.c index 8323085..a62bda7 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -9,17 +9,25 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> ib_cq_init_attr *attr,
>  		      struct ib_udata *udata)
>  {
>  	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> +	struct mana_ib_create_cq_resp resp = {};
> +	struct mana_ib_ucontext *mana_ucontext;
>  	struct ib_device *ibdev = ibcq->device;
>  	struct mana_ib_create_cq ucmd = {};
>  	struct mana_ib_dev *mdev;
> +	bool is_rnic_cq = true;
> +	u32 doorbell;
>  	int err;
> 
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> 
> -	if (udata->inlen < sizeof(ucmd))
> +	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> +	cq->cq_handle = INVALID_MANA_HANDLE;
> +
> +	if (udata->inlen < offsetof(struct mana_ib_create_cq, cq_buf_size))
>  		return -EINVAL;
> 
> -	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> +	if (udata->inlen == offsetof(struct mana_ib_create_cq, cq_buf_size))
> +		is_rnic_cq = false;

I think it's okay to check the offset in the uapi message to decide whether this is a newer/updated RNIC uverb.

But increasing MANA_IB_UVERBS_ABI_VERSION may make the code simpler. I have a feeling that you may need to increase it anyway, because a new uapi message "mana_ib_create_cq_resp" is introduced.

Jason or Leon may have a better idea on this.
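
For reference, the length-based detection discussed above can be sketched in
user-space C roughly as follows. The struct layout, the enum, and all
sketch_* names are hypothetical and are not the real mana-abi definitions;
the only point is how offsetof() on an appended field separates a legacy
request from an extended one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout: the legacy form ends before new_field. */
struct sketch_req {
	uint64_t buf_addr;
	uint32_t new_field;	/* stands in for cq_buf_size */
	uint32_t reserved;
};

enum sketch_req_kind {
	SKETCH_REQ_INVALID,	/* too short even for the legacy form */
	SKETCH_REQ_LEGACY,	/* exactly the legacy length */
	SKETCH_REQ_EXTENDED,	/* carries at least part of the new tail */
};

/* Classify a request purely by the number of bytes the caller passed. */
static enum sketch_req_kind sketch_classify(size_t inlen)
{
	const size_t legacy_len = offsetof(struct sketch_req, new_field);

	if (inlen < legacy_len)
		return SKETCH_REQ_INVALID;
	if (inlen == legacy_len)
		return SKETCH_REQ_LEGACY;
	return SKETCH_REQ_EXTENDED;
}
```

An explicit ABI version bump would replace the length comparison with a
version comparison; which of the two is preferable here is exactly the open
question above.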


^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing cq callbacks
  2024-04-18 16:52 79% ` [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
@ 2024-04-23 23:45 79%   ` Long Li
  2024-04-24  8:58 79%     ` Konstantin Taranov
  2024-04-25 20:31 79%   ` Long Li
  1 sibling, 1 reply; 200+ results
From: Long Li @ 2024-04-23 23:45 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before
> installing cq callbacks
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Add a boundary check inside mana_ib_install_cq_cb to prevent index
> overflow.

How is it possible for us to get an out-of-bounds queue id from the SoC?

> 
> Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb
> helper function")
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/cq.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c
> b/drivers/infiniband/hw/mana/cq.c index 6c3bb8c..8323085 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -70,6 +70,8 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev,
> struct mana_ib_cq *cq)
>  	struct gdma_context *gc = mdev_to_gc(mdev);
>  	struct gdma_queue *gdma_cq;
> 
> +	if (cq->queue.id >= gc->max_num_cqs)
> +		return -EINVAL;
>  	/* Create CQ table entry */
>  	WARN_ON(gc->cq_table[cq->queue.id]);
>  	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> --
> 2.43.0


^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks
  2024-04-18 16:52 64% ` [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
@ 2024-04-23 23:42 79%   ` Long Li
  2024-04-24  8:50 79%     ` Konstantin Taranov
  0 siblings, 1 reply; 200+ results
From: Long Li @ 2024-04-23 23:42 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to
> remove cq callbacks
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Introduce the mana_ib_remove_cq_cb helper to remove cq callbacks.
> The helper removes code duplicates.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/cq.c      | 19 ++++++++++++-------
>  drivers/infiniband/hw/mana/mana_ib.h |  1 +
>  drivers/infiniband/hw/mana/qp.c      | 26 ++++----------------------
>  3 files changed, 17 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c
> b/drivers/infiniband/hw/mana/cq.c index 0467ee8..6c3bb8c 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -48,16 +48,10 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct
> ib_udata *udata)
>  	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
>  	struct ib_device *ibdev = ibcq->device;
>  	struct mana_ib_dev *mdev;
> -	struct gdma_context *gc;
> 
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> -	gc = mdev_to_gc(mdev);
> -
> -	if (cq->queue.id != INVALID_QUEUE_ID) {
> -		kfree(gc->cq_table[cq->queue.id]);
> -		gc->cq_table[cq->queue.id] = NULL;
> -	}
> 
> +	mana_ib_remove_cq_cb(mdev, cq);
>  	mana_ib_destroy_queue(mdev, &cq->queue);
> 
>  	return 0;
> @@ -89,3 +83,14 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev,
> struct mana_ib_cq *cq)
>  	gc->cq_table[cq->queue.id] = gdma_cq;
>  	return 0;
>  }
> +
> +void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct
> mana_ib_cq
> +*cq) {
> +	struct gdma_context *gc = mdev_to_gc(mdev);
> +
> +	if (cq->queue.id >= gc->max_num_cqs)
> +		return;
> +
> +	kfree(gc->cq_table[cq->queue.id]);
> +	gc->cq_table[cq->queue.id] = NULL;

Why is the check for (cq->queue.id != INVALID_QUEUE_ID) removed?

> +}
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h
> b/drivers/infiniband/hw/mana/mana_ib.h
> index 9c07021..6c19f4f 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -255,6 +255,7 @@ static inline void copy_in_reverse(u8 *dst, const u8
> *src, u32 size)  }
> 
>  int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq
> *cq);
> +void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct
> mana_ib_cq
> +*cq);
> 
>  int mana_ib_create_zero_offset_dma_region(struct mana_ib_dev *dev,
> struct ib_umem *umem,
>  					  mana_handle_t *gdma_region);
> diff --git a/drivers/infiniband/hw/mana/qp.c
> b/drivers/infiniband/hw/mana/qp.c index c4fb8b4..169b286 100644
> --- a/drivers/infiniband/hw/mana/qp.c
> +++ b/drivers/infiniband/hw/mana/qp.c
> @@ -95,11 +95,9 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp,
> struct ib_pd *pd,
>  	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp,
> ibqp);
>  	struct mana_ib_dev *mdev =
>  		container_of(pd->device, struct mana_ib_dev, ib_dev);
> -	struct gdma_context *gc = mdev_to_gc(mdev);
>  	struct ib_rwq_ind_table *ind_tbl = attr->rwq_ind_tbl;
>  	struct mana_ib_create_qp_rss_resp resp = {};
>  	struct mana_ib_create_qp_rss ucmd = {};
> -	struct gdma_queue **gdma_cq_allocated;
>  	mana_handle_t *mana_ind_table;
>  	struct mana_port_context *mpc;
>  	unsigned int ind_tbl_size;
> @@ -173,13 +171,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp,
> struct ib_pd *pd,
>  		goto fail;
>  	}
> 
> -	gdma_cq_allocated = kcalloc(ind_tbl_size,
> sizeof(*gdma_cq_allocated),
> -				    GFP_KERNEL);
> -	if (!gdma_cq_allocated) {
> -		ret = -ENOMEM;
> -		goto fail;
> -	}
> -

Why is the allocation for CQs removed? This is not related to this patch.


^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size
  2024-04-18 16:52 73% ` [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size Konstantin Taranov
@ 2024-04-23 23:34 79%   ` Long Li
  2024-04-24  8:43 79%     ` Konstantin Taranov
  0 siblings, 1 reply; 200+ results
From: Long Li @ 2024-04-23 23:34 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with
> buf_size

I don't understand this commit message on "duplicate" cqe. I couldn't find a duplicate of it in the existing code.

^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs
  2024-04-18 16:52 68% ` [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
@ 2024-04-23 23:30 79%   ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-23 23:30 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement RNIC requests for creation and destruction of RNIC CQs.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>


^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for RNIC CQs
  2024-04-18 16:52 75% ` [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for " Konstantin Taranov
@ 2024-04-23 23:24 79%   ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-23 23:24 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for RNIC CQs
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Create EQs within mana_ib device. Such EQs are required for creation of RNIC
> CQs.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>



^ permalink raw reply	[relevance 79%]

* [PATCH v2 2/2] selftests/user_events: Add non-spacing separator check
  2024-04-23 16:23 74% [PATCH v2 0/2] tracing/user_events: Fix non-spaced field matching Beau Belgrave
  2024-04-23 16:23 67% ` [PATCH v2 1/2] " Beau Belgrave
@ 2024-04-23 16:23 77% ` Beau Belgrave
  1 sibling, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-23 16:23 UTC (permalink / raw)
  To: rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, dcook

The ABI documentation indicates that field separators do not need a
space between them, only a ';'. When no spacing is used, the register
must work. Any subsequent register, with or without spaces, must match
and not return -EADDRINUSE.

Add a non-spacing separator case to our self-test register case to ensure
it works going forward.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 tools/testing/selftests/user_events/ftrace_test.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/testing/selftests/user_events/ftrace_test.c b/tools/testing/selftests/user_events/ftrace_test.c
index dcd7509fe2e0..0bb46793dcd4 100644
--- a/tools/testing/selftests/user_events/ftrace_test.c
+++ b/tools/testing/selftests/user_events/ftrace_test.c
@@ -261,6 +261,12 @@ TEST_F(user, register_events) {
 	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
 	ASSERT_EQ(0, reg.write_index);
 
+	/* Register without separator spacing should still match */
+	reg.enable_bit = 29;
+	reg.name_args = (__u64)"__test_event u32 field1;u32 field2";
+	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
+	ASSERT_EQ(0, reg.write_index);
+
 	/* Multiple registers to same name but different args should fail */
 	reg.enable_bit = 29;
 	reg.name_args = (__u64)"__test_event u32 field1;";
@@ -288,6 +294,8 @@ TEST_F(user, register_events) {
 	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, &unreg));
 	unreg.disable_bit = 30;
 	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, &unreg));
+	unreg.disable_bit = 29;
+	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, &unreg));
 
 	/* Delete should have been auto-done after close and unregister */
 	close(self->data_fd);
-- 
2.34.1


^ permalink raw reply related	[relevance 77%]

* [PATCH v2 0/2] tracing/user_events: Fix non-spaced field matching
@ 2024-04-23 16:23 74% Beau Belgrave
  2024-04-23 16:23 67% ` [PATCH v2 1/2] " Beau Belgrave
  2024-04-23 16:23 77% ` [PATCH v2 2/2] selftests/user_events: Add non-spacing separator check Beau Belgrave
  0 siblings, 2 replies; 200+ results
From: Beau Belgrave @ 2024-04-23 16:23 UTC (permalink / raw)
  To: rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, dcook

When the ABI was updated to prevent same name w/different args, it
missed an important corner case when fields don't end with a space.
Typically, space is used for fields to help separate them, like
"u8 field1; u8 field2". If no spaces are used, like
"u8 field1;u8 field2", then the parsing works for the first time.
However, the match check fails on a subsequent register, leading to
confusion.

This is because the match check uses argv_split() and assumes that all
fields will be split upon the space. When spaces are used, we get back
{ "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }.
This causes a mismatch, and the user program gets back -EADDRINUSE.

Add a method to detect this case before calling argv_split(). If found
force a space after the field separator character ';'. This ensures all
cases work properly for matching.

I could not find an existing function to accomplish this, so I had to
hand code a copy with this logic. If there is a better way to achieve
this, I'm all ears.

This series also adds a selftest to ensure this doesn't break again.

With this fix, the following are all treated as matching:
u8 field1;u8 field2
u8 field1; u8 field2
u8 field1;\tu8 field2
u8 field1;\nu8 field2
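
As a rough user-space illustration of the normalization described above (not
the kernel code itself, which uses kmalloc() and then argv_split()), the
following sketch copies a string and inserts a space after every ';' that is
not already followed by whitespace. The name sketch_space_after_semis() is
hypothetical, and malloc() stands in for the kernel allocator.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a malloc'd copy of args in which every ';' is followed by
 * whitespace, or NULL if the allocation fails. The caller frees it.
 */
static char *sketch_space_after_semis(const char *args)
{
	size_t extra = 0;
	const char *p;
	char *fixed, *pos;

	/* First pass: count the ';' that need a space added after them. */
	for (p = args; *p; p++)
		if (*p == ';' && !isspace((unsigned char)p[1]))
			extra++;

	fixed = malloc(strlen(args) + extra + 1);
	if (!fixed)
		return NULL;

	/* Second pass: copy, adding a space where one is missing. */
	pos = fixed;
	for (p = args; *p; p++) {
		*pos++ = *p;
		if (*p == ';' && !isspace((unsigned char)p[1]))
			*pos++ = ' ';
	}
	*pos = '\0';

	return fixed;
}
```

After this step a whitespace-based split sees "field1;" and "u8" as separate
tokens in both the spaced and the non-spaced form. The tab and newline
variants already contain whitespace after ';', so they pass through
unchanged.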

V2 changes:
  Renamed fix_semis_no_space() to insert_space_after_semis().
  Have user_event_argv_split() return fast in no-split case.
  Pulled in Masami's shorter loop in insert_space_after_semis().

Beau Belgrave (2):
  tracing/user_events: Fix non-spaced field matching
  selftests/user_events: Add non-spacing separator check

 kernel/trace/trace_events_user.c              | 76 ++++++++++++++++++-
 .../selftests/user_events/ftrace_test.c       |  8 ++
 2 files changed, 83 insertions(+), 1 deletion(-)


base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680
-- 
2.34.1


^ permalink raw reply	[relevance 74%]

* [PATCH v2 1/2] tracing/user_events: Fix non-spaced field matching
  2024-04-23 16:23 74% [PATCH v2 0/2] tracing/user_events: Fix non-spaced field matching Beau Belgrave
@ 2024-04-23 16:23 67% ` Beau Belgrave
    2024-04-23 16:23 77% ` [PATCH v2 2/2] selftests/user_events: Add non-spacing separator check Beau Belgrave
  1 sibling, 1 reply; 200+ results
From: Beau Belgrave @ 2024-04-23 16:23 UTC (permalink / raw)
  To: rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, dcook

When the ABI was updated to prevent same name w/different args, it
missed an important corner case when fields don't end with a space.
Typically, space is used for fields to help separate them, like
"u8 field1; u8 field2". If no spaces are used, like
"u8 field1;u8 field2", then the parsing works for the first time.
However, the match check fails on a subsequent register, leading to
confusion.

This is because the match check uses argv_split() and assumes that all
fields will be split upon the space. When spaces are used, we get back
{ "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }.
This causes a mismatch, and the user program gets back -EADDRINUSE.

Add a method to detect this case before calling argv_split(). If found
force a space after the field separator character ';'. This ensures all
cases work properly for matching.

With this fix, the following are all treated as matching:
u8 field1;u8 field2
u8 field1; u8 field2
u8 field1;\tu8 field2
u8 field1;\nu8 field2

Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event")
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 kernel/trace/trace_events_user.c | 76 +++++++++++++++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 70d428c394b6..82b191f33a28 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1989,6 +1989,80 @@ static int user_event_set_tp_name(struct user_event *user)
 	return 0;
 }
 
+/*
+ * Counts how many ';' without a trailing space are in the args.
+ */
+static int count_semis_no_space(char *args)
+{
+	int count = 0;
+
+	while ((args = strchr(args, ';'))) {
+		args++;
+
+		if (!isspace(*args))
+			count++;
+	}
+
+	return count;
+}
+
+/*
+ * Copies the arguments while ensuring all ';' have a trailing space.
+ */
+static char *insert_space_after_semis(char *args, int count)
+{
+	char *fixed, *pos;
+	int len;
+
+	len = strlen(args) + count;
+	fixed = kmalloc(len + 1, GFP_KERNEL);
+
+	if (!fixed)
+		return NULL;
+
+	pos = fixed;
+
+	/* Insert a space after ';' if there is no trailing space. */
+	while (*args) {
+		*pos = *args++;
+
+		if (*pos++ == ';' && !isspace(*args))
+			*pos++ = ' ';
+	}
+
+	*pos = '\0';
+
+	return fixed;
+}
+
+static char **user_event_argv_split(char *args, int *argc)
+{
+	char **split;
+	char *fixed;
+	int count;
+
+	/* Count how many ';' without a trailing space */
+	count = count_semis_no_space(args);
+
+	/* No fixup is required */
+	if (!count)
+		return argv_split(GFP_KERNEL, args, argc);
+
+	/* We must fixup 'field;field' to 'field; field' */
+	fixed = insert_space_after_semis(args, count);
+
+	if (!fixed)
+		return NULL;
+
+	/* We do a normal split afterwards */
+	split = argv_split(GFP_KERNEL, fixed, argc);
+
+	/* We can free since argv_split makes a copy */
+	kfree(fixed);
+
+	return split;
+}
+
 /*
  * Parses the event name, arguments and flags then registers if successful.
  * The name buffer lifetime is owned by this method for success cases only.
@@ -2012,7 +2086,7 @@ static int user_event_parse(struct user_event_group *group, char *name,
 		return -EPERM;
 
 	if (args) {
-		argv = argv_split(GFP_KERNEL, args, &argc);
+		argv = user_event_argv_split(args, &argc);
 
 		if (!argv)
 			return -ENOMEM;
-- 
2.34.1


^ permalink raw reply related	[relevance 67%]

* [PATCH rdma-next 1/1] RDMA/mana_ib: fix missing ret value
@ 2024-04-23 14:15 79% Konstantin Taranov
  0 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-23 14:15 UTC (permalink / raw)
  To: nathan, kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Set ret to -ENODEV when netdev_master_upper_dev_get_rcu
returns NULL.

Fixes: 8b184e4f1c32 ("RDMA/mana_ib: Enable RoCE on port 1")
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index fca4d0d85c64..7e09ceb3da53 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -87,6 +87,7 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 	upper_ndev = netdev_master_upper_dev_get_rcu(mc->ports[0]);
 	if (!upper_ndev) {
 		rcu_read_unlock();
+		ret = -ENODEV;
 		ibdev_err(&dev->ib_dev, "Failed to get master netdev");
 		goto free_ib_device;
 	}
-- 
2.43.0


^ permalink raw reply related	[relevance 79%]

* Re: [PATCH rdma-next v3 4/6] RDMA/mana_ib: enable RoCE on port 1
  @ 2024-04-23  7:15 79%     ` Konstantin Taranov
  0 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-23  7:15 UTC (permalink / raw)
  To: Nathan Chancellor, Konstantin Taranov
  Cc: sharmaajay, Long Li, jgg, leon, linux-rdma, linux-kernel

> Hi Konstantin,
> 
> On Wed, Apr 10, 2024 at 01:42:29AM -0700, Konstantin Taranov wrote:
> > From: Konstantin Taranov <kotaranov@microsoft.com>
> >
> > Set netdev and RoCEv2 flag to enable GID population on port 1.
> > Use GIDs of the master netdev. As mc->ports[] stores slave devices,
> > use a helper to get the master netdev.
> >
> > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > ---
> >  drivers/infiniband/hw/mana/device.c | 15 +++++++++++++++
> >  drivers/infiniband/hw/mana/main.c   | 15 +++++++++++----
> >  2 files changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mana/device.c
> > b/drivers/infiniband/hw/mana/device.c
> > index 47547a962b19..e7981301d10b 100644
> > --- a/drivers/infiniband/hw/mana/device.c
> > +++ b/drivers/infiniband/hw/mana/device.c
> > @@ -53,6 +53,7 @@ static int mana_ib_probe(struct auxiliary_device
> > *adev,  {
> >  	struct mana_adev *madev = container_of(adev, struct mana_adev,
> adev);
> >  	struct gdma_dev *mdev = madev->mdev;
> > +	struct net_device *upper_ndev;
> >  	struct mana_context *mc;
> >  	struct mana_ib_dev *dev;
> >  	int ret;
> > @@ -79,6 +80,20 @@ static int mana_ib_probe(struct auxiliary_device
> *adev,
> >  	dev->ib_dev.num_comp_vectors = 1;
> >  	dev->ib_dev.dev.parent = mdev->gdma_context->dev;
> >
> > +	rcu_read_lock(); /* required to get upper dev */
> > +	upper_ndev = netdev_master_upper_dev_get_rcu(mc->ports[0]);
> > +	if (!upper_ndev) {
> > +		rcu_read_unlock();
> > +		ibdev_err(&dev->ib_dev, "Failed to get master netdev");
> > +		goto free_ib_device;
> > +	}
> 
> Clang now warns (or errors with CONFIG_WERROR):
> 
>   drivers/infiniband/hw/mana/device.c:88:6: error: variable 'ret' is used
> uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-
> uninitialized]
>      88 |         if (!upper_ndev) {
>         |             ^~~~~~~~~~~
>   drivers/infiniband/hw/mana/device.c:150:9: note: uninitialized use occurs
> here
>     150 |         return ret;
>         |                ^~~
>   drivers/infiniband/hw/mana/device.c:88:2: note: remove the 'if' if its
> condition is always false
>      88 |         if (!upper_ndev) {
>         |         ^~~~~~~~~~~~~~~~~~
>      89 |                 rcu_read_unlock();
>         |                 ~~~~~~~~~~~~~~~~~~
>      90 |                 ibdev_err(&dev->ib_dev, "Failed to get master netdev");
>         |
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>      91 |                 goto free_ib_device;
>         |                 ~~~~~~~~~~~~~~~~~~~~
>      92 |         }
>         |         ~
>   drivers/infiniband/hw/mana/device.c:62:9: note: initialize the variable 'ret'
> to silence this warning
>      62 |         int ret;
>         |                ^
>         |                 = 0
>   1 error generated.
> 
> I could not really find a consistent return code for when
> netdev_master_upper_dev_get_rcu() fails. -ENODEV?

Thanks for catching this! Yes, I think "ret = -ENODEV;" is the appropriate fix.
Should I send a patch to rdma-next, or is there another way I should fix this?
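
For illustration only, the pattern behind the warning can be reduced to a
small user-space sketch. sketch_probe() and its parameter are hypothetical;
the pointer stands in for the result of the upper-device lookup. Every path
that jumps to the shared exit label has to assign ret first.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical probe: upper stands in for the looked-up master netdev. */
static int sketch_probe(const void *upper)
{
	int ret;

	if (!upper) {
		/* Without this assignment, ret is read uninitialized below. */
		ret = -ENODEV;
		goto out;
	}

	ret = 0;
out:
	return ret;
}
```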

Konstantin

> 
> Cheers,
> Nathan
> 


^ permalink raw reply	[relevance 79%]

* [PATCH v4] Drivers: hv: Cosmetic changes for hv.c and balloon.c
@ 2024-04-23  3:18 38% Aditya Nagesh
  0 siblings, 0 replies; 200+ results
From: Aditya Nagesh @ 2024-04-23  3:18 UTC (permalink / raw)
  To: adityanagesh, kys, haiyangz, wei.liu, decui, linux-hyperv, linux-kernel
  Cc: Aditya Nagesh

Fix issues reported by checkpatch.pl script in hv.c and
balloon.c
 - Remove unnecessary parentheses
 - Remove extra newlines
 - Remove extra spaces
 - Add spaces between comparison operators
 - Remove comparison with NULL in if statements

No functional changes intended

Signed-off-by: Aditya Nagesh <adityanagesh@linux.microsoft.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
---
[V4]
Fix Alignment issue and revert a line since 100 characters are allowed in a line

[V3]
Fix alignment issues in multiline function parameters.

[V2]
Change Subject from "Drivers: hv: Fix Issues reported by checkpatch.pl script"
 to "Drivers: hv: Cosmetic changes for hv.c and balloon.c"

 drivers/hv/hv.c         |  37 +++++++--------
 drivers/hv/hv_balloon.c | 102 +++++++++++++++-------------------------
 2 files changed, 55 insertions(+), 84 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index a8ad728354cb..e0d676c74f14 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -45,8 +45,8 @@ int hv_init(void)
  * This involves a hypercall.
  */
 int hv_post_message(union hv_connection_id connection_id,
-		  enum hv_message_type message_type,
-		  void *payload, size_t payload_size)
+			enum hv_message_type message_type,
+			void *payload, size_t payload_size)
 {
 	struct hv_input_post_message *aligned_msg;
 	unsigned long flags;
@@ -86,7 +86,7 @@ int hv_post_message(union hv_connection_id connection_id,
 			status = HV_STATUS_INVALID_PARAMETER;
 	} else {
 		status = hv_do_hypercall(HVCALL_POST_MESSAGE,
-				aligned_msg, NULL);
+					 aligned_msg, NULL);
 	}
 
 	local_irq_restore(flags);
@@ -111,7 +111,7 @@ int hv_synic_alloc(void)
 
 	hv_context.hv_numa_map = kcalloc(nr_node_ids, sizeof(struct cpumask),
 					 GFP_KERNEL);
-	if (hv_context.hv_numa_map == NULL) {
+	if (!hv_context.hv_numa_map) {
 		pr_err("Unable to allocate NUMA map\n");
 		goto err;
 	}
@@ -120,11 +120,11 @@ int hv_synic_alloc(void)
 		hv_cpu = per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		tasklet_init(&hv_cpu->msg_dpc,
-			     vmbus_on_msg_dpc, (unsigned long) hv_cpu);
+			     vmbus_on_msg_dpc, (unsigned long)hv_cpu);
 
 		if (ms_hyperv.paravisor_present && hv_isolation_type_tdx()) {
 			hv_cpu->post_msg_page = (void *)get_zeroed_page(GFP_ATOMIC);
-			if (hv_cpu->post_msg_page == NULL) {
+			if (!hv_cpu->post_msg_page) {
 				pr_err("Unable to allocate post msg page\n");
 				goto err;
 			}
@@ -147,14 +147,14 @@ int hv_synic_alloc(void)
 		if (!ms_hyperv.paravisor_present && !hv_root_partition) {
 			hv_cpu->synic_message_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
-			if (hv_cpu->synic_message_page == NULL) {
+			if (!hv_cpu->synic_message_page) {
 				pr_err("Unable to allocate SYNIC message page\n");
 				goto err;
 			}
 
 			hv_cpu->synic_event_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
-			if (hv_cpu->synic_event_page == NULL) {
+			if (!hv_cpu->synic_event_page) {
 				pr_err("Unable to allocate SYNIC event page\n");
 
 				free_page((unsigned long)hv_cpu->synic_message_page);
@@ -203,14 +203,13 @@ int hv_synic_alloc(void)
 	return ret;
 }
 
-
 void hv_synic_free(void)
 {
 	int cpu, ret;
 
 	for_each_present_cpu(cpu) {
-		struct hv_per_cpu_context *hv_cpu
-			= per_cpu_ptr(hv_context.cpu_context, cpu);
+		struct hv_per_cpu_context *hv_cpu =
+			per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		/* It's better to leak the page if the encryption fails. */
 		if (ms_hyperv.paravisor_present && hv_isolation_type_tdx()) {
@@ -262,8 +261,8 @@ void hv_synic_free(void)
  */
 void hv_synic_enable_regs(unsigned int cpu)
 {
-	struct hv_per_cpu_context *hv_cpu
-		= per_cpu_ptr(hv_context.cpu_context, cpu);
+	struct hv_per_cpu_context *hv_cpu =
+		per_cpu_ptr(hv_context.cpu_context, cpu);
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
 	union hv_synic_sint shared_sint;
@@ -277,8 +276,8 @@ void hv_synic_enable_regs(unsigned int cpu)
 		/* Mask out vTOM bit. ioremap_cache() maps decrypted */
 		u64 base = (simp.base_simp_gpa << HV_HYP_PAGE_SHIFT) &
 				~ms_hyperv.shared_gpa_boundary;
-		hv_cpu->synic_message_page
-			= (void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
+		hv_cpu->synic_message_page =
+			(void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
 		if (!hv_cpu->synic_message_page)
 			pr_err("Fail to map synic message page.\n");
 	} else {
@@ -296,8 +295,8 @@ void hv_synic_enable_regs(unsigned int cpu)
 		/* Mask out vTOM bit. ioremap_cache() maps decrypted */
 		u64 base = (siefp.base_siefp_gpa << HV_HYP_PAGE_SHIFT) &
 				~ms_hyperv.shared_gpa_boundary;
-		hv_cpu->synic_event_page
-			= (void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
+		hv_cpu->synic_event_page =
+			(void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
 		if (!hv_cpu->synic_event_page)
 			pr_err("Fail to map synic event page.\n");
 	} else {
@@ -348,8 +347,8 @@ int hv_synic_init(unsigned int cpu)
  */
 void hv_synic_disable_regs(unsigned int cpu)
 {
-	struct hv_per_cpu_context *hv_cpu
-		= per_cpu_ptr(hv_context.cpu_context, cpu);
+	struct hv_per_cpu_context *hv_cpu =
+		per_cpu_ptr(hv_context.cpu_context, cpu);
 	union hv_synic_sint shared_sint;
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index e000fa3b9f97..c3c16756a0fb 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -41,8 +41,6 @@
  * Begin protocol definitions.
  */
 
-
-
 /*
  * Protocol versions. The low word is the minor version, the high word the major
  * version.
@@ -71,8 +69,6 @@ enum {
 	DYNMEM_PROTOCOL_VERSION_CURRENT = DYNMEM_PROTOCOL_VERSION_WIN10
 };
 
-
-
 /*
  * Message Types
  */
@@ -101,7 +97,6 @@ enum dm_message_type {
 	DM_VERSION_1_MAX		= 12
 };
 
-
 /*
  * Structures defining the dynamic memory management
  * protocol.
@@ -115,7 +110,6 @@ union dm_version {
 	__u32 version;
 } __packed;
 
-
 union dm_caps {
 	struct {
 		__u64 balloon:1;
@@ -148,8 +142,6 @@ union dm_mem_page_range {
 	__u64  page_range;
 } __packed;
 
-
-
 /*
  * The header for all dynamic memory messages:
  *
@@ -174,7 +166,6 @@ struct dm_message {
 	__u8 data[]; /* enclosed message */
 } __packed;
 
-
 /*
  * Specific message types supporting the dynamic memory protocol.
  */
@@ -271,7 +262,6 @@ struct dm_status {
 	__u32 io_diff;
 } __packed;
 
-
 /*
  * Message to ask the guest to allocate memory - balloon up message.
  * This message is sent from the host to the guest. The guest may not be
@@ -286,14 +276,13 @@ struct dm_balloon {
 	__u32 reservedz;
 } __packed;
 
-
 /*
  * Balloon response message; this message is sent from the guest
  * to the host in response to the balloon message.
  *
  * reservedz: Reserved; must be set to zero.
  * more_pages: If FALSE, this is the last message of the transaction.
- * if TRUE there will atleast one more message from the guest.
+ * if TRUE there will be at least one more message from the guest.
  *
  * range_count: The number of ranges in the range array.
  *
@@ -314,7 +303,7 @@ struct dm_balloon_response {
  * to the guest to give guest more memory.
  *
  * more_pages: If FALSE, this is the last message of the transaction.
- * if TRUE there will atleast one more message from the guest.
+ * if TRUE there will be at least one more message from the guest.
  *
  * reservedz: Reserved; must be set to zero.
  *
@@ -342,7 +331,6 @@ struct dm_unballoon_response {
 	struct dm_header hdr;
 } __packed;
 
-
 /*
  * Hot add request message. Message sent from the host to the guest.
  *
@@ -390,7 +378,6 @@ enum dm_info_type {
 	MAX_INFO_TYPE
 };
 
-
 /*
  * Header for the information message.
  */
@@ -480,10 +467,10 @@ static unsigned long last_post_time;
 
 static int hv_hypercall_multi_failure;
 
-module_param(hot_add, bool, (S_IRUGO | S_IWUSR));
+module_param(hot_add, bool, 0644);
 MODULE_PARM_DESC(hot_add, "If set attempt memory hot_add");
 
-module_param(pressure_report_delay, uint, (S_IRUGO | S_IWUSR));
+module_param(pressure_report_delay, uint, 0644);
 MODULE_PARM_DESC(pressure_report_delay, "Delay in secs in reporting pressure");
 static atomic_t trans_id = ATOMIC_INIT(0);
 
@@ -502,7 +489,6 @@ enum hv_dm_state {
 	DM_INIT_ERROR
 };
 
-
 static __u8 recv_buffer[HV_HYP_PAGE_SIZE];
 static __u8 balloon_up_send_buffer[HV_HYP_PAGE_SIZE];
 #define PAGES_IN_2M (2 * 1024 * 1024 / PAGE_SIZE)
@@ -595,12 +581,12 @@ static inline bool has_pfn_is_backed(struct hv_hotadd_state *has,
 	struct hv_hotadd_gap *gap;
 
 	/* The page is not backed. */
-	if ((pfn < has->covered_start_pfn) || (pfn >= has->covered_end_pfn))
+	if (pfn < has->covered_start_pfn || pfn >= has->covered_end_pfn)
 		return false;
 
 	/* Check for gaps. */
 	list_for_each_entry(gap, &has->gap_list, list) {
-		if ((pfn >= gap->start_pfn) && (pfn < gap->end_pfn))
+		if (pfn >= gap->start_pfn && pfn < gap->end_pfn)
 			return false;
 	}
 
@@ -724,7 +710,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 	unsigned long processed_pfn;
 	unsigned long total_pfn = pfn_count;
 
-	for (i = 0; i < (size/HA_CHUNK); i++) {
+	for (i = 0; i < (size / HA_CHUNK); i++) {
 		start_pfn = start + (i * HA_CHUNK);
 
 		scoped_guard(spinlock_irqsave, &dm_device.ha_lock) {
@@ -745,7 +731,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 
 		nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
 		ret = add_memory(nid, PFN_PHYS((start_pfn)),
-				(HA_CHUNK << PAGE_SHIFT), MHP_MERGE_RESOURCE);
+				 (HA_CHUNK << PAGE_SHIFT), MHP_MERGE_RESOURCE);
 
 		if (ret) {
 			pr_err("hot_add memory failed error is %d\n", ret);
@@ -787,8 +773,8 @@ static void hv_online_page(struct page *pg, unsigned int order)
 	guard(spinlock_irqsave)(&dm_device.ha_lock);
 	list_for_each_entry(has, &dm_device.ha_region_list, list) {
 		/* The page belongs to a different HAS. */
-		if ((pfn < has->start_pfn) ||
-				(pfn + (1UL << order) > has->end_pfn))
+		if (pfn < has->start_pfn ||
+		    (pfn + (1UL << order) > has->end_pfn))
 			continue;
 
 		hv_bring_pgs_online(has, pfn, 1UL << order);
@@ -855,7 +841,7 @@ static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 }
 
 static unsigned long handle_pg_range(unsigned long pg_start,
-					unsigned long pg_count)
+				     unsigned long pg_count)
 {
 	unsigned long start_pfn = pg_start;
 	unsigned long pfn_cnt = pg_count;
@@ -866,7 +852,7 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 	unsigned long res = 0, flags;
 
 	pr_debug("Hot adding %lu pages starting at pfn 0x%lx.\n", pg_count,
-		pg_start);
+		 pg_start);
 
 	spin_lock_irqsave(&dm_device.ha_lock, flags);
 	list_for_each_entry(has, &dm_device.ha_region_list, list) {
@@ -902,10 +888,9 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 			if (start_pfn > has->start_pfn &&
 			    online_section_nr(pfn_to_section_nr(start_pfn)))
 				hv_bring_pgs_online(has, start_pfn, pgs_ol);
-
 		}
 
-		if ((has->ha_end_pfn < has->end_pfn) && (pfn_cnt > 0)) {
+		if (has->ha_end_pfn < has->end_pfn && pfn_cnt > 0) {
 			/*
 			 * We have some residual hot add range
 			 * that needs to be hot added; hot add
@@ -1010,7 +995,7 @@ static void hot_add_req(struct work_struct *dummy)
 	rg_start = dm->ha_wrk.ha_region_range.finfo.start_page;
 	rg_sz = dm->ha_wrk.ha_region_range.finfo.page_cnt;
 
-	if ((rg_start == 0) && (!dm->host_specified_ha_region)) {
+	if (rg_start == 0 && !dm->host_specified_ha_region) {
 		unsigned long region_size;
 		unsigned long region_start;
 
@@ -1033,7 +1018,7 @@ static void hot_add_req(struct work_struct *dummy)
 
 	if (do_hot_add)
 		resp.page_count = process_hot_add(pg_start, pfn_cnt,
-						rg_start, rg_sz);
+						  rg_start, rg_sz);
 
 	dm->num_pages_added += resp.page_count;
 #endif
@@ -1211,11 +1196,10 @@ static void post_status(struct hv_dynmem_device *dm)
 				sizeof(struct dm_status),
 				(unsigned long)NULL,
 				VM_PKT_DATA_INBAND, 0);
-
 }
 
 static void free_balloon_pages(struct hv_dynmem_device *dm,
-			 union dm_mem_page_range *range_array)
+			       union dm_mem_page_range *range_array)
 {
 	int num_pages = range_array->finfo.page_cnt;
 	__u64 start_frame = range_array->finfo.start_page;
@@ -1231,8 +1215,6 @@ static void free_balloon_pages(struct hv_dynmem_device *dm,
 	}
 }
 
-
-
 static unsigned int alloc_balloon_pages(struct hv_dynmem_device *dm,
 					unsigned int num_pages,
 					struct dm_balloon_response *bl_resp,
@@ -1278,7 +1260,6 @@ static unsigned int alloc_balloon_pages(struct hv_dynmem_device *dm,
 			page_to_pfn(pg);
 		bl_resp->range_array[i].finfo.page_cnt = alloc_unit;
 		bl_resp->hdr.size += sizeof(union dm_mem_page_range);
-
 	}
 
 	return i * alloc_unit;
@@ -1332,7 +1313,7 @@ static void balloon_up(struct work_struct *dummy)
 
 		if (num_ballooned == 0 || num_ballooned == num_pages) {
 			pr_debug("Ballooned %u out of %u requested pages.\n",
-				num_pages, dm_device.balloon_wrk.num_pages);
+				 num_pages, dm_device.balloon_wrk.num_pages);
 
 			bl_resp->more_pages = 0;
 			done = true;
@@ -1366,16 +1347,15 @@ static void balloon_up(struct work_struct *dummy)
 
 			for (i = 0; i < bl_resp->range_count; i++)
 				free_balloon_pages(&dm_device,
-						 &bl_resp->range_array[i]);
+						   &bl_resp->range_array[i]);
 
 			done = true;
 		}
 	}
-
 }
 
 static void balloon_down(struct hv_dynmem_device *dm,
-			struct dm_unballoon_request *req)
+			 struct dm_unballoon_request *req)
 {
 	union dm_mem_page_range *range_array = req->range_array;
 	int range_count = req->range_count;
@@ -1389,7 +1369,7 @@ static void balloon_down(struct hv_dynmem_device *dm,
 	}
 
 	pr_debug("Freed %u ballooned pages.\n",
-		prev_pages_ballooned - dm->num_pages_ballooned);
+		 prev_pages_ballooned - dm->num_pages_ballooned);
 
 	if (req->more_pages == 1)
 		return;
@@ -1414,8 +1394,7 @@ static int dm_thread_func(void *dm_dev)
 	struct hv_dynmem_device *dm = dm_dev;
 
 	while (!kthread_should_stop()) {
-		wait_for_completion_interruptible_timeout(
-						&dm_device.config_event, 1*HZ);
+		wait_for_completion_interruptible_timeout(&dm_device.config_event, 1 * HZ);
 		/*
 		 * The host expects us to post information on the memory
 		 * pressure every second.
@@ -1439,9 +1418,8 @@ static int dm_thread_func(void *dm_dev)
 	return 0;
 }
 
-
 static void version_resp(struct hv_dynmem_device *dm,
-			struct dm_version_response *vresp)
+			 struct dm_version_response *vresp)
 {
 	struct dm_version_request version_req;
 	int ret;
@@ -1502,7 +1480,7 @@ static void version_resp(struct hv_dynmem_device *dm,
 }
 
 static void cap_resp(struct hv_dynmem_device *dm,
-			struct dm_capabilities_resp_msg *cap_resp)
+		     struct dm_capabilities_resp_msg *cap_resp)
 {
 	if (!cap_resp->is_accepted) {
 		pr_err("Capabilities not accepted by host\n");
@@ -1535,7 +1513,7 @@ static void balloon_onchannelcallback(void *context)
 		switch (dm_hdr->type) {
 		case DM_VERSION_RESPONSE:
 			version_resp(dm,
-				 (struct dm_version_response *)dm_msg);
+				     (struct dm_version_response *)dm_msg);
 			break;
 
 		case DM_CAPABILITIES_RESPONSE:
@@ -1565,7 +1543,7 @@ static void balloon_onchannelcallback(void *context)
 
 			dm->state = DM_BALLOON_DOWN;
 			balloon_down(dm,
-				 (struct dm_unballoon_request *)recv_buffer);
+				     (struct dm_unballoon_request *)recv_buffer);
 			break;
 
 		case DM_MEM_HOT_ADD_REQUEST:
@@ -1603,17 +1581,15 @@ static void balloon_onchannelcallback(void *context)
 
 		default:
 			pr_warn_ratelimited("Unhandled message: type: %d\n", dm_hdr->type);
-
 		}
 	}
-
 }
 
 #define HV_LARGE_REPORTING_ORDER	9
 #define HV_LARGE_REPORTING_LEN (HV_HYP_PAGE_SIZE << \
 		HV_LARGE_REPORTING_ORDER)
 static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
-		    struct scatterlist *sgl, unsigned int nents)
+			       struct scatterlist *sgl, unsigned int nents)
 {
 	unsigned long flags;
 	struct hv_memory_hint *hint;
@@ -1648,7 +1624,7 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
 		 */
 
 		/* page reporting for pages 2MB or higher */
-		if (order >= HV_LARGE_REPORTING_ORDER ) {
+		if (order >= HV_LARGE_REPORTING_ORDER) {
 			range->page.largepage = 1;
 			range->page_size = HV_GPA_PAGE_RANGE_PAGE_SIZE_2MB;
 			range->base_large_pfn = page_to_hvpfn(
@@ -1662,23 +1638,21 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
 			range->page.additional_pages =
 				(sg->length / HV_HYP_PAGE_SIZE) - 1;
 		}
-
 	}
 
 	status = hv_do_rep_hypercall(HV_EXT_CALL_MEMORY_HEAT_HINT, nents, 0,
 				     hint, NULL);
 	local_irq_restore(flags);
 	if (!hv_result_success(status)) {
-
 		pr_err("Cold memory discard hypercall failed with status %llx\n",
-				status);
+		       status);
 		if (hv_hypercall_multi_failure > 0)
 			hv_hypercall_multi_failure++;
 
 		if (hv_result(status) == HV_STATUS_INVALID_PARAMETER) {
 			pr_err("Underlying Hyper-V does not support order less than 9. Hypercall failed\n");
 			pr_err("Defaulting to page_reporting_order %d\n",
-					pageblock_order);
+			       pageblock_order);
 			page_reporting_order = pageblock_order;
 			hv_hypercall_multi_failure++;
 			return -EINVAL;
@@ -1712,7 +1686,7 @@ static void enable_page_reporting(void)
 		pr_err("Failed to enable cold memory discard: %d\n", ret);
 	} else {
 		pr_info("Cold memory discard hint enabled with order %d\n",
-				page_reporting_order);
+			page_reporting_order);
 	}
 }
 
@@ -1795,7 +1769,7 @@ static int balloon_connect_vsp(struct hv_device *dev)
 	if (ret)
 		goto out;
 
-	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
+	t = wait_for_completion_timeout(&dm_device.host_event, 5 * HZ);
 	if (t == 0) {
 		ret = -ETIMEDOUT;
 		goto out;
@@ -1850,7 +1824,7 @@ static int balloon_connect_vsp(struct hv_device *dev)
 	if (ret)
 		goto out;
 
-	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
+	t = wait_for_completion_timeout(&dm_device.host_event, 5 * HZ);
 	if (t == 0) {
 		ret = -ETIMEDOUT;
 		goto out;
@@ -1891,8 +1865,8 @@ static int hv_balloon_debug_show(struct seq_file *f, void *offset)
 	char *sname;
 
 	seq_printf(f, "%-22s: %u.%u\n", "host_version",
-				DYNMEM_MAJOR_VERSION(dm->version),
-				DYNMEM_MINOR_VERSION(dm->version));
+			DYNMEM_MAJOR_VERSION(dm->version),
+			DYNMEM_MINOR_VERSION(dm->version));
 
 	seq_printf(f, "%-22s:", "capabilities");
 	if (ballooning_enabled())
@@ -1941,10 +1915,10 @@ static int hv_balloon_debug_show(struct seq_file *f, void *offset)
 	seq_printf(f, "%-22s: %u\n", "pages_ballooned", dm->num_pages_ballooned);
 
 	seq_printf(f, "%-22s: %lu\n", "total_pages_committed",
-				get_pages_committed(dm));
+		   get_pages_committed(dm));
 
 	seq_printf(f, "%-22s: %llu\n", "max_dynamic_page_count",
-				dm->max_dynamic_page_count);
+		   dm->max_dynamic_page_count);
 
 	return 0;
 }
@@ -1954,7 +1928,7 @@ DEFINE_SHOW_ATTRIBUTE(hv_balloon_debug);
 static void  hv_balloon_debugfs_init(struct hv_dynmem_device *b)
 {
 	debugfs_create_file("hv-balloon", 0444, NULL, b,
-			&hv_balloon_debug_fops);
+			    &hv_balloon_debug_fops);
 }
 
 static void  hv_balloon_debugfs_exit(struct hv_dynmem_device *b)
@@ -2097,7 +2071,6 @@ static int balloon_suspend(struct hv_device *hv_dev)
 	tasklet_enable(&hv_dev->channel->callback_event);
 
 	return 0;
-
 }
 
 static int balloon_resume(struct hv_device *dev)
@@ -2156,7 +2129,6 @@ static  struct hv_driver balloon_drv = {
 
 static int __init init_balloon_drv(void)
 {
-
 	return vmbus_driver_register(&balloon_drv);
 }
 
-- 
2.34.1


^ permalink raw reply related	[relevance 38%]

* Re: [PATCH 1/2] tracing/user_events: Fix non-spaced field matching
  @ 2024-04-22 21:55 79%         ` Beau Belgrave
  0 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-22 21:55 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: rostedt, mathieu.desnoyers, linux-kernel, linux-trace-kernel, dcook

On Sat, Apr 20, 2024 at 09:50:52PM +0900, Masami Hiramatsu wrote:
> On Fri, 19 Apr 2024 14:13:34 -0700
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
> 
> > On Fri, Apr 19, 2024 at 11:33:05AM +0900, Masami Hiramatsu wrote:
> > > On Tue, 16 Apr 2024 22:41:01 +0000
> > > Beau Belgrave <beaub@linux.microsoft.com> wrote:

*SNIP*

> > > nit: This loop can be simpler, because we are sure fixed has enough length;
> > > 
> > > /* insert a space after ';' if there is no space. */
> > > while(*args) {
> > > 	*pos = *args++;
> > > 	if (*pos++ == ';' && !isspace(*args))
> > > 		*pos++ = ' ';
> > > }
> > > 
> > 
> > I was worried that if count_semis_no_space() ever had different logic
> > (maybe after this commit), a wrong count could cause an overflow.
> > 
> > I don't have an issue making it shorter, but I was trying to be more on
> > the safe side, since this isn't a fast path (event register).
> 
> OK, anyway the current code looks correct. But note that I don't think
> "pos++; len--;" is safer, since it is not atomic. In my experience, it is
> easy to lose the "len--;" with this pattern. So please use it carefully ;)
> 

I'll stick with your loop. Perhaps others will chime in on the v2 and
state a stronger opinion.

You scared me with the atomic comment; I went back and looked at all the
paths for this. In the user_events IOCTL the buffer is copied from user
to kernel, so it cannot change (and no other threads access it). I also
checked trace_parse_run_command() which is the same. So at least in this
context the non-atomic part is OK.
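
For reference, a rough user-space sketch of the shorter loop, assuming
malloc() in place of kmalloc() and using the insert_space_after_semis()
name suggested earlier in the thread. It is only an illustration of the
idea, not the code that was merged:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Counts ';' not followed by whitespace (a trailing ';' counts too). */
static int count_semis_no_space(const char *args)
{
	int count = 0;

	while ((args = strchr(args, ';'))) {
		args++;
		if (!isspace((unsigned char)*args))
			count++;
	}

	return count;
}

/* Returns a malloc'd copy of args with a space after each such ';'. */
static char *insert_space_after_semis(const char *args)
{
	/* One extra byte per inserted space, plus one for the null. */
	size_t len = strlen(args) + count_semis_no_space(args);
	char *fixed = malloc(len + 1);
	char *pos = fixed;

	if (!fixed)
		return NULL;

	while (*args) {
		*pos = *args++;
		if (*pos++ == ';' && !isspace((unsigned char)*args))
			*pos++ = ' ';
	}
	*pos = '\0';

	return fixed;
}
```

The copy loop and the counter use the same test (';' followed by a
non-space character), so the buffer size and the number of inserted
spaces stay in step as long as both are changed together.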

> > 
> > > > +
> > > > +	/*
> > > > +	 * len is the length of the copy excluding the null.
> > > > +	 * This ensures we always have room for a null.
> > > > +	 */
> > > > +	*pos = '\0';
> > > > +
> > > > +	return fixed;
> > > > +}
> > > > +
> > > > +static char **user_event_argv_split(char *args, int *argc)
> > > > +{
> > > > +	/* Count how many ';' without a trailing space */
> > > > +	int count = count_semis_no_space(args);
> > > > +
> > > > +	if (count) {
> > > 
> > > nit: it is better to exit fast, so 
> > > 
> > > 	if (!count)
> > > 		return argv_split(GFP_KERNEL, args, argc);
> > > 
> > > 	...
> > 
> > Sure, will fix in a v2.
> > 
> > > 
> > > Thank you,
> > > 
> > > OT: BTW, can this also simplify synthetic events?
> > > 
> > 
> > I'm not sure, I'll check when I have some time. I want to get this fix
> > in sooner rather than later.
> 
> Ah, never mind. Synthetic events parse the fields with strsep(';') first
> and then argv_split(). So they do not have this issue.
> 

Ok, seems unrelated. Thanks for checking.

Thanks,
-Beau

> Thank you,
> 
> > 
> > Thanks,
> > -Beau
> > 

*SNIP* 

> > > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
  @ 2024-04-22 10:08 79%                 ` Shradha Gupta
  0 siblings, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-22 10:08 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jason Gunthorpe, Zhu Yanjun, linux-kernel, linux-hyperv,
	linux-rdma, netdev, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Ajay Sharma, Leon Romanovsky, Thomas Gleixner,
	Sebastian Andrzej Siewior, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Long Li, Michael Kelley, Shradha Gupta,
	Yury Norov, Konstantin Taranov, Souradeep Chakrabarti

On Fri, Apr 19, 2024 at 08:51:02PM +0200, Andrew Lunn wrote:
> On Fri, Apr 19, 2024 at 09:59:26AM -0700, Shradha Gupta wrote:
> > On Thu, Apr 18, 2024 at 08:42:59PM +0200, Andrew Lunn wrote:
> > > > From an RDMA perspective this is all available from other APIs already
> > > > at least and I wouldn't want to see new sysfs unless there is a netdev
> > > > justification.
> > > 
> > > It is unlikely there is a netdev justification. Configuration happens
> > > via netlink, not sysfs.
> > > 
> > >     Andrew
> > 
> > Thanks. Sure, it makes sense to make the generic attribute configurable
> > through the netdevice ops or netlink implementation. I will keep that in
> > mind while adding the next set of configuration attributes for the driver.
> > These attributes (from the patch), however, are hardware specific (they show
> > the maximum values supported by the hardware in most cases).
> 
>         ndev->max_mtu = gc->adapter_mtu - ETH_HLEN;
>         ndev->min_mtu = ETH_MIN_MTU;
> 
> This does not appear to be specific to your device. This is very
> generic. We already have /sys/class/net/eth42/mtu, why not add
> /sys/class/net/eth42/max_mtu and /sys/class/net/eth42/min_mtu for
> every driver?
> 
> Are these values really hardware specific? Are they really unique to
> your hardware? I have to wonder because you clearly did not think
> much about MTU, and how it is actually generic...
> 
>      Andrew
That makes sense. I will make these generic attributes in the next version.
Thanks.

^ permalink raw reply	[relevance 79%]

* Re: [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr
  @ 2024-04-22  9:12 79%         ` Konstantin Taranov
  0 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-22  9:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Konstantin Taranov, sharmaajay, Long Li, leon, linux-rdma, linux-kernel

> From: Jason Gunthorpe <jgg@ziepe.ca>
> On Fri, Apr 19, 2024 at 09:14:14AM +0000, Konstantin Taranov wrote:
> > > From: Jason Gunthorpe <jgg@ziepe.ca> On Wed, Apr 17, 2024 at
> > > 07:20:59AM -0700, Konstantin Taranov wrote:
> > > > From: Konstantin Taranov <kotaranov@microsoft.com>
> > > >
> > > > Implement allocation of DMA-mapped memory regions.
> > > >
> > > > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > > > ---
> > > >  drivers/infiniband/hw/mana/device.c |  1 +
> > > >  drivers/infiniband/hw/mana/mr.c     | 36
> > > +++++++++++++++++++++++++++++
> > > >  include/net/mana/gdma.h             |  5 ++++
> > > >  3 files changed, 42 insertions(+)
> > >
> > > What is the point of doing this without supporting enough verbs to
> > > allow a kernel ULP?
> > >
> >
> > True, the proposed code is useless in this state.
> > Nevertheless, the mana_ib team aims to send kernel ULP patches after we
> > are done with uverbs paths (i.e., udata is not null). As this change
> > does not conflict with the current effort, I decided to send this
> > patch now. I can extend the series to make it more useful.
> >
> > Jason, could you suggest a minimal list of ib_device_ops methods,
> > that includes get_dma_mr, which can be approved?
> 
> Is there any chance you can send a single series to support a ULP? NVMe or
> something like that?

Sure, I can. I will investigate how to get get_dma_mr used with fewer changes.

Generally, I am wondering what would be easier for reviewers.
Should I try to send short patch series enabling one feature, or should I actually try
to produce long patch series that enable a use-case consisting of several features?

Konstantin

^ permalink raw reply	[relevance 79%]

* Re: [PATCH 1/2] tracing/user_events: Fix non-spaced field matching
  @ 2024-04-19 21:13 79%     ` Beau Belgrave
    0 siblings, 1 reply; 200+ results
From: Beau Belgrave @ 2024-04-19 21:13 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: rostedt, mathieu.desnoyers, linux-kernel, linux-trace-kernel, dcook

On Fri, Apr 19, 2024 at 11:33:05AM +0900, Masami Hiramatsu wrote:
> On Tue, 16 Apr 2024 22:41:01 +0000
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
> 
> > When the ABI was updated to prevent same name w/different args, it
> > missed an important corner case when fields don't end with a space.
> > Typically, space is used for fields to help separate them, like
> > "u8 field1; u8 field2". If no spaces are used, like
> > "u8 field1;u8 field2", then the parsing works for the first time.
> > However, the match check fails on a subsequent register, leading to
> > confusion.
> > 
> > This is because the match check uses argv_split() and assumes that all
> > fields will be split upon the space. When spaces are used, we get back
> > { "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }.
> > This causes a mismatch, and the user program gets back -EADDRINUSE.
> > 
> > Add a method to detect this case before calling argv_split(). If found
> > force a space after the field separator character ';'. This ensures all
> > cases work properly for matching.
> > 
> > With this fix, the following are all treated as matching:
> > u8 field1;u8 field2
> > u8 field1; u8 field2
> > u8 field1;\tu8 field2
> > u8 field1;\nu8 field2
> 
> Sounds good to me. I just have some nits.
> 
> > 
> > Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event")
> > Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
> > ---
> >  kernel/trace/trace_events_user.c | 88 +++++++++++++++++++++++++++++++-
> >  1 file changed, 87 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
> > index 70d428c394b6..9184d3962b2a 100644
> > --- a/kernel/trace/trace_events_user.c
> > +++ b/kernel/trace/trace_events_user.c
> > @@ -1989,6 +1989,92 @@ static int user_event_set_tp_name(struct user_event *user)
> >  	return 0;
> >  }
> >  
> > +/*
> > + * Counts how many ';' without a trailing space are in the args.
> > + */
> > +static int count_semis_no_space(char *args)
> > +{
> > +	int count = 0;
> > +
> > +	while ((args = strchr(args, ';'))) {
> > +		args++;
> > +
> > +		if (!isspace(*args))
> > +			count++;
> > +	}
> > +
> > +	return count;
> > +}
> > +
> > +/*
> > + * Copies the arguments while ensuring all ';' have a trailing space.
> > + */
> > +static char *fix_semis_no_space(char *args, int count)
> 
> nit: This name does not represent what it does. 'insert_space_after_semis()'
> is more self-described.
> 

Sure, will fix in a v2.

> > +{
> > +	char *fixed, *pos;
> > +	char c, last;
> > +	int len;
> > +
> > +	len = strlen(args) + count;
> > +	fixed = kmalloc(len + 1, GFP_KERNEL);
> > +
> > +	if (!fixed)
> > +		return NULL;
> > +
> > +	pos = fixed;
> > +	last = '\0';
> > +
> > +	while (len > 0) {
> > +		c = *args++;
> > +
> > +		if (last == ';' && !isspace(c)) {
> > +			*pos++ = ' ';
> > +			len--;
> > +		}
> > +
> > +		if (len > 0) {
> > +			*pos++ = c;
> > +			len--;
> > +		}
> > +
> > +		last = c;
> > +	}
> 
> nit: This loop can be simpler, because we are sure fixed has enough length;
> 
> /* insert a space after ';' if there is no space. */
> while(*args) {
> 	*pos = *args++;
> 	if (*pos++ == ';' && !isspace(*args))
> 		*pos++ = ' ';
> }
> 

I was worried that if count_semis_no_space() ever had different logic
(maybe after this commit), a wrong count could cause an overflow.

I don't have an issue making it shorter, but I was trying to be more on
the safe side, since this isn't a fast path (event register).
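
To make the "safe side" concern concrete, here is a small user-space
sketch of a copy that checks the remaining capacity before every write,
so a wrong count would truncate the output instead of overrunning the
buffer. The helper name copy_with_semi_spaces() and its signature are
made up for this illustration and are not part of the patch:

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Copies src into dst (capacity cap bytes, including the null),
 * adding a space after any ';' that is not followed by whitespace.
 * Returns the number of characters written, excluding the null.
 * If cap is too small the output is truncated, never overrun.
 */
static size_t copy_with_semi_spaces(char *dst, size_t cap, const char *src)
{
	size_t used = 0;

	if (cap == 0)
		return 0;

	/* Always keep one byte in reserve for the terminating null. */
	while (*src && used + 1 < cap) {
		char c = *src++;

		dst[used++] = c;
		if (c == ';' && !isspace((unsigned char)*src) &&
		    used + 1 < cap)
			dst[used++] = ' ';
	}
	dst[used] = '\0';

	return used;
}
```

The trade-off is the one discussed above: the bounds checks cost a few
extra comparisons, which is acceptable on a slow path such as event
registration.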

> > +
> > +	/*
> > +	 * len is the length of the copy excluding the null.
> > +	 * This ensures we always have room for a null.
> > +	 */
> > +	*pos = '\0';
> > +
> > +	return fixed;
> > +}
> > +
> > +static char **user_event_argv_split(char *args, int *argc)
> > +{
> > +	/* Count how many ';' without a trailing space */
> > +	int count = count_semis_no_space(args);
> > +
> > +	if (count) {
> 
> nit: it is better to exit fast, so 
> 
> 	if (!count)
> 		return argv_split(GFP_KERNEL, args, argc);
> 
> 	...

Sure, will fix in a v2.

> 
> Thank you,
> 
> OT: BTW, can this also simplify synthetic events?
> 

I'm not sure, I'll check when I have some time. I want to get this fix
in sooner rather than later.

Thanks,
-Beau

> > +		/* We must fixup 'field;field' to 'field; field' */
> > +		char *fixed = fix_semis_no_space(args, count);
> > +		char **split;
> > +
> > +		if (!fixed)
> > +			return NULL;
> > +
> > +		/* We do a normal split afterwards */
> > +		split = argv_split(GFP_KERNEL, fixed, argc);
> > +
> > +		/* We can free since argv_split makes a copy */
> > +		kfree(fixed);
> > +
> > +		return split;
> > +	}
> > +
> > +	/* No fixup is required */
> > +	return argv_split(GFP_KERNEL, args, argc);
> > +}
> > +
> >  /*
> >   * Parses the event name, arguments and flags then registers if successful.
> >   * The name buffer lifetime is owned by this method for success cases only.
> > @@ -2012,7 +2098,7 @@ static int user_event_parse(struct user_event_group *group, char *name,
> >  		return -EPERM;
> >  
> >  	if (args) {
> > -		argv = argv_split(GFP_KERNEL, args, &argc);
> > +		argv = user_event_argv_split(args, &argc);
> >  
> >  		if (!argv)
> >  			return -ENOMEM;
> > -- 
> > 2.34.1
> > 
> 
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
  @ 2024-04-19 16:59 76%             ` Shradha Gupta
    0 siblings, 1 reply; 200+ results
From: Shradha Gupta @ 2024-04-19 16:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jason Gunthorpe, Zhu Yanjun, linux-kernel, linux-hyperv,
	linux-rdma, netdev, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Ajay Sharma, Leon Romanovsky, Thomas Gleixner,
	Sebastian Andrzej Siewior, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Long Li, Michael Kelley, Shradha Gupta,
	Yury Norov, Konstantin Taranov, Souradeep Chakrabarti

On Thu, Apr 18, 2024 at 08:42:59PM +0200, Andrew Lunn wrote:
> > From an RDMA perspective this is all available from other APIs already
> > at least and I wouldn't want to see new sysfs unless there is a netdev
> > justification.
> 
> It is unlikely there is a netdev justification. Configuration happens
> via netlink, not sysfs.
> 
>     Andrew

Thanks. Sure, it makes sense to make the generic attribute configurable
through the netdevice ops or netlink implementation. I will keep that in
mind while adding the next set of configuration attributes for the driver.
These attributes (from the patch), however, are hardware specific (they show
the maximum values supported by the hardware in most cases). We want them
to be a part of sysfs so that they are readily available in production
for improving debuggability. I will change the names of these attributes to
indicate the same to avoid possible confusion.

Regards,
Shradha.

^ permalink raw reply	[relevance 76%]

* Re: [PATCH v2] Add a header in ifcfg and nm keyfiles describing the owner of the files
    2024-04-18 16:15 79% ` Easwar Hariharan
@ 2024-04-19 16:54 79% ` Shradha Gupta
  1 sibling, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-19 16:54 UTC (permalink / raw)
  To: Ani Sinha
  Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, eahariha,
	linux-hyperv, linux-kernel

On Thu, Apr 18, 2024 at 05:35:49PM +0530, Ani Sinha wrote:
> A comment describing the source of writing the contents of the ifcfg and
> network manager keyfiles (hyperv kvp daemon) is useful. It is valuable both
> for debugging as well as for preventing users from modifying them.
> 
> CC: shradhagupta@linux.microsoft.com
> CC: eahariha@linux.microsoft.com
> CC: wei.liu@kernel.org
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>  tools/hv/hv_kvp_daemon.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> changelog:
> v2: simplify and fix issues with error handling.
> 
> diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
> index ae57bf69ad4a..014e45be6981 100644
> --- a/tools/hv/hv_kvp_daemon.c
> +++ b/tools/hv/hv_kvp_daemon.c
> @@ -94,6 +94,8 @@ static char *lic_version = "Unknown version";
>  static char full_domain_name[HV_KVP_EXCHANGE_MAX_VALUE_SIZE];
>  static struct utsname uts_buf;
>  
> +#define CFG_HEADER "# Generated by hyperv key-value pair daemon. Please do not modify.\n"
> +
>  /*
>   * The location of the interface configuration file.
>   */
> @@ -1435,6 +1437,18 @@ static int kvp_set_ip_info(char *if_name, struct hv_kvp_ipaddr_value *new_val)
>  		return HV_E_FAIL;
>  	}
>  
> +	/* Write the config file headers */
> +	error = fprintf(ifcfg_file, CFG_HEADER);
> +	if (error < 0) {
> +		error = HV_E_FAIL;
> +		goto setval_error;
> +	}
> +	error = fprintf(nmfile, CFG_HEADER);
> +	if (error < 0) {
> +		error = HV_E_FAIL;
> +		goto setval_error;
> +	}
> +
>  	/*
>  	 * First write out the MAC address.
>  	 */
> -- 
> 2.42.0
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
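
For anyone reading the hunk above in isolation, the header write boils
down to the pattern sketched below. This is a user-space illustration
only; the helper name write_cfg_header() is hypothetical, and the sketch
passes the header through "%s" rather than as the format string, which
is a stylistic choice here and not what the patch does:

```c
#include <stdio.h>

#define CFG_HEADER "# Generated by hyperv key-value pair daemon. Please do not modify.\n"

/* Returns 0 on success, -1 if the header could not be written. */
static int write_cfg_header(FILE *f)
{
	/* fprintf() returns a negative value on an output error. */
	return fprintf(f, "%s", CFG_HEADER) < 0 ? -1 : 0;
}
```

Because the stream is buffered, a write error may only surface at a
later fflush() or fclose(), so callers that care should check those
return values as well.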

^ permalink raw reply	[relevance 79%]

* Re: [PATCH v2] Add a header in ifcfg and nm keyfiles describing the owner of the files
  2024-04-18 19:01 79%   ` Dexuan Cui
@ 2024-04-19 16:51 79%     ` Shradha Gupta
  0 siblings, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-19 16:51 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Easwar Hariharan, Ani Sinha, KY Srinivasan, Haiyang Zhang,
	Wei Liu, linux-hyperv, linux-kernel

On Thu, Apr 18, 2024 at 07:01:20PM +0000, Dexuan Cui wrote:
> > From: Easwar Hariharan <eahariha@linux.microsoft.com>
> > Sent: Thursday, April 18, 2024 9:16 AM
> > 
> > On 4/18/2024 5:05 AM, Ani Sinha wrote:
> > > A comment describing the source of writing the contents of the ifcfg and
> > > network manager keyfiles (hyperv kvp daemon) is useful. It is valuable
> 
> s/hyperv/Hyper-V/
> 
> > > +#define CFG_HEADER "# Generated by hyperv key-value pair daemon.
> > Please do not modify.\n"
> 
> s/hyperv/Hyper-V/
> 
> > Looks good to me, I'll defer to other folks on the recipient list on whether
> > "hyperv" should be capitalized as HyperV or other such feedback.
> 
> It's recommended to use "Hyper-V". Wei can help fix this so I guess
> there is no need to resend the patch :-)
Sounds good!

^ permalink raw reply	[relevance 79%]

* RE: [PATCH] PCI: Add a mutex to protect the global list pci_domain_busn_res_list
  2024-04-19  1:53 76% [PATCH] PCI: Add a mutex to protect the global list pci_domain_busn_res_list Dexuan Cui
@ 2024-04-19 15:07 79% ` Haiyang Zhang
  0 siblings, 0 replies; 200+ results
From: Haiyang Zhang @ 2024-04-19 15:07 UTC (permalink / raw)
  To: Dexuan Cui, bhelgaas, wei.liu, KY Srinivasan, lpieralisi, linux-pci
  Cc: linux-hyperv, linux-kernel, Boqun Feng, Sunil Muthuswamy,
	Saurabh Singh Sengar



> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Thursday, April 18, 2024 9:53 PM
> To: bhelgaas@google.com; wei.liu@kernel.org; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> lpieralisi@kernel.org; linux-pci@vger.kernel.org
> Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; Boqun
> Feng <Boqun.Feng@microsoft.com>; Sunil Muthuswamy
> <sunilmut@microsoft.com>; Saurabh Singh Sengar <ssengar@microsoft.com>;
> Dexuan Cui <decui@microsoft.com>
> Subject: [PATCH] PCI: Add a mutex to protect the global list
> pci_domain_busn_res_list
> 
> There has been an effort to make the pci-hyperv driver support
> async-probing to reduce the boot time. With async-probing, multiple
> kernel threads can be running hv_pci_probe() -> create_root_hv_pci_bus()
> ->
> pci_scan_root_bus_bridge() -> pci_bus_insert_busn_res() at the same time
> to
> update the global list, causing list corruption.
> 
> Add a mutex to protect the list.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  drivers/pci/probe.c | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index e19b79821dd6..1327fd820b24 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -37,6 +37,7 @@ LIST_HEAD(pci_root_buses);
>  EXPORT_SYMBOL(pci_root_buses);
> 
>  static LIST_HEAD(pci_domain_busn_res_list);
> +static DEFINE_MUTEX(pci_domain_busn_res_list_lock);
> 
>  struct pci_domain_busn_res {
>  	struct list_head list;
> @@ -47,14 +48,22 @@ struct pci_domain_busn_res {
>  static struct resource *get_pci_domain_busn_res(int domain_nr)
>  {
>  	struct pci_domain_busn_res *r;
> +	struct resource *ret;
> 
> -	list_for_each_entry(r, &pci_domain_busn_res_list, list)
> -		if (r->domain_nr == domain_nr)
> -			return &r->res;
> +	mutex_lock(&pci_domain_busn_res_list_lock);
> +
> +	list_for_each_entry(r, &pci_domain_busn_res_list, list) {
> +		if (r->domain_nr == domain_nr) {
> +			ret = &r->res;
> +			goto out;
> +		}
> +	}
> 
>  	r = kzalloc(sizeof(*r), GFP_KERNEL);
> -	if (!r)
> -		return NULL;
> +	if (!r) {
> +		ret = NULL;
> +		goto out;
> +	}
> 
>  	r->domain_nr = domain_nr;
>  	r->res.start = 0;
> @@ -62,8 +71,10 @@ static struct resource *get_pci_domain_busn_res(int
> domain_nr)
>  	r->res.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED;
> 
>  	list_add_tail(&r->list, &pci_domain_busn_res_list);
> -
> -	return &r->res;
> +	ret = &r->res;
> +out:
> +	mutex_unlock(&pci_domain_busn_res_list_lock);
> +	return ret;
>  }

The patch is for common pci code. So, this bug has been there for a while?
Do you have a sample stack trace of the crash?

I checked pci-hyperv, it doesn't define the .driver.probe_type, so 
PROBE_DEFAULT_STRATEGY is in effect. driver_allows_async_probing() returns 
false unless a kernel/module parameter requests async. So async probing hasn't 
been exercised here.

If we change pci-hyperv's probe_type to PROBE_PREFER_ASYNCHRONOUS in the future, 
how will it affect the probes of the underlying PCI devices within the same 
device type?
For example, MANA driver doesn't set probe_type. Will pci-hyperv's async 
probing cause async probing or potentially nondeterministic naming for 
MANA devices?

Thanks,
- Haiyang


^ permalink raw reply	[relevance 79%]

* Re: [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr
  @ 2024-04-19  9:14 79%     ` Konstantin Taranov
    0 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-19  9:14 UTC (permalink / raw)
  To: Jason Gunthorpe, Konstantin Taranov
  Cc: sharmaajay, Long Li, leon, linux-rdma, linux-kernel

> From: Jason Gunthorpe <jgg@ziepe.ca>
> On Wed, Apr 17, 2024 at 07:20:59AM -0700, Konstantin Taranov wrote:
> > From: Konstantin Taranov <kotaranov@microsoft.com>
> >
> > Implement allocation of DMA-mapped memory regions.
> >
> > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > ---
> >  drivers/infiniband/hw/mana/device.c |  1 +
> >  drivers/infiniband/hw/mana/mr.c     | 36
> +++++++++++++++++++++++++++++
> >  include/net/mana/gdma.h             |  5 ++++
> >  3 files changed, 42 insertions(+)
> 
> What is the point of doing this without supporting enough verbs to allow a
> kernel ULP?
> 

True, the proposed code is useless in its current state.
Nevertheless, the mana_ib team aims to send kernel ULP patches after we are done
with the uverbs paths (i.e., udata is not null). As this change does not conflict with the
current effort, I decided to send this patch now. I can extend the series to make
it more useful.

Jason, could you suggest a minimal list of ib_device_ops methods, including
get_dma_mr, that could be approved?

Thanks,
Konstantin

> Jason

^ permalink raw reply	[relevance 79%]

* Re: [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr
  @ 2024-04-19  9:02 79%     ` Konstantin Taranov
  0 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-19  9:02 UTC (permalink / raw)
  To: Zhu Yanjun, Konstantin Taranov, sharmaajay, Long Li, jgg, leon
  Cc: linux-rdma, linux-kernel

> From: Zhu Yanjun <zyjzyj2000@gmail.com>
> On 17.04.24 16:20, Konstantin Taranov wrote:
> > From: Konstantin Taranov <kotaranov@microsoft.com>
> >
> > Implement allocation of DMA-mapped memory regions.
> >
> > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > ---
> >   drivers/infiniband/hw/mana/device.c |  1 +
> >   drivers/infiniband/hw/mana/mr.c     | 36
> +++++++++++++++++++++++++++++
> >   include/net/mana/gdma.h             |  5 ++++
> >   3 files changed, 42 insertions(+)
> >
> > diff --git a/drivers/infiniband/hw/mana/device.c
> > b/drivers/infiniband/hw/mana/device.c
> > index 6fa902ee80a6..043cef09f1c2 100644
> > --- a/drivers/infiniband/hw/mana/device.c
> > +++ b/drivers/infiniband/hw/mana/device.c
> > @@ -29,6 +29,7 @@ static const struct ib_device_ops mana_ib_dev_ops =
> {
> >     .destroy_rwq_ind_table = mana_ib_destroy_rwq_ind_table,
> >     .destroy_wq = mana_ib_destroy_wq,
> >     .disassociate_ucontext = mana_ib_disassociate_ucontext,
> > +   .get_dma_mr = mana_ib_get_dma_mr,
> >     .get_port_immutable = mana_ib_get_port_immutable,
> >     .mmap = mana_ib_mmap,
> >     .modify_qp = mana_ib_modify_qp,
> > diff --git a/drivers/infiniband/hw/mana/mr.c
> > b/drivers/infiniband/hw/mana/mr.c index 4f13423ecdbd..7c9394926a18
> > 100644
> > --- a/drivers/infiniband/hw/mana/mr.c
> > +++ b/drivers/infiniband/hw/mana/mr.c
> > @@ -8,6 +8,8 @@
> >   #define VALID_MR_FLAGS                                                         \
> >     (IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE |
> > IB_ACCESS_REMOTE_READ)
> >
> > +#define VALID_DMA_MR_FLAGS IB_ACCESS_LOCAL_WRITE
> > +
> >   static enum gdma_mr_access_flags
> >   mana_ib_verbs_to_gdma_access_flags(int access_flags)
> >   {
> > @@ -39,6 +41,8 @@ static int mana_ib_gd_create_mr(struct mana_ib_dev
> *dev, struct mana_ib_mr *mr,
> >     req.mr_type = mr_params->mr_type;
> >
> >     switch (mr_params->mr_type) {
> > +   case GDMA_MR_TYPE_GPA:
> > +           break;
> >     case GDMA_MR_TYPE_GVA:
> >             req.gva.dma_region_handle = mr_params-
> >gva.dma_region_handle;
> >             req.gva.virtual_address = mr_params->gva.virtual_address;
> @@
> > -168,6 +172,38 @@ struct ib_mr *mana_ib_reg_user_mr(struct ib_pd
> *ibpd, u64 start, u64 length,
> >     return ERR_PTR(err);
> >   }
> >
>
> Not sure if the following function needs comments or not.
> If yes, the kernel doc
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/doc-guide/kernel-doc.rst?h=v6.9-rc4#n67
> can provide a good example.
>

Thanks! I will have a look and see how I can improve the comments.

> Best Regards,
> Zhu Yanjun
>
> > +struct ib_mr *mana_ib_get_dma_mr(struct ib_pd *ibpd, int
> > +access_flags) {
> > +   struct mana_ib_pd *pd = container_of(ibpd, struct mana_ib_pd,
> ibpd);
> > +   struct gdma_create_mr_params mr_params = {};
> > +   struct ib_device *ibdev = ibpd->device;
> > +   struct mana_ib_dev *dev;
> > +   struct mana_ib_mr *mr;
> > +   int err;
> > +
> > +   dev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> > +
> > +   if (access_flags & ~VALID_DMA_MR_FLAGS)
> > +           return ERR_PTR(-EINVAL);
> > +
> > +   mr = kzalloc(sizeof(*mr), GFP_KERNEL);
> > +   if (!mr)
> > +           return ERR_PTR(-ENOMEM);
> > +
> > +   mr_params.pd_handle = pd->pd_handle;
> > +   mr_params.mr_type = GDMA_MR_TYPE_GPA;
> > +
> > +   err = mana_ib_gd_create_mr(dev, mr, &mr_params);
> > +   if (err)
> > +           goto err_free;
> > +
> > +   return &mr->ibmr;
> > +
> > +err_free:
> > +   kfree(mr);
> > +   return ERR_PTR(err);
> > +}
> > +
> >   int mana_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> >   {
> >     struct mana_ib_mr *mr = container_of(ibmr, struct mana_ib_mr,
> > ibmr); diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
> > index 8d796a30ddde..dc19b5cb33a6 100644
> > --- a/include/net/mana/gdma.h
> > +++ b/include/net/mana/gdma.h
> > @@ -788,6 +788,11 @@ struct gdma_destory_pd_resp {
> >   };/* HW DATA */
> >
> >   enum gdma_mr_type {
> > +   /*
> > +    * Guest Physical Address - MRs of this type allow access
> > +    * to any DMA-mapped memory using bus-logical address
> > +    */
> > +   GDMA_MR_TYPE_GPA = 1,
> >     /* Guest Virtual Address - MRs of this type allow access
> >      * to memory mapped by PTEs associated with this MR using a virtual
> >      * address that is set up in the MST


^ permalink raw reply	[relevance 79%]

* [PATCH] hv/vmbus_drv: rename hv_acpi_init() to vmbus_init()
@ 2024-04-19  5:56 79% Erni Sri Satya Vennela
  0 siblings, 0 replies; 200+ results
From: Erni Sri Satya Vennela @ 2024-04-19  5:56 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, linux-hyperv, linux-kernel
  Cc: ernis, Erni Sri Satya Vennela

As the driver now supports both ACPI and DeviceTree,
rename hv_acpi_init() to vmbus_init() and
change comments accordingly.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
 drivers/hv/vmbus_drv.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 12a707ab73f8..e140cbf8d4c7 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2609,7 +2609,7 @@ static struct syscore_ops hv_synic_syscore_ops = {
 	.resume = hv_synic_resume,
 };
 
-static int __init hv_acpi_init(void)
+static int __init vmbus_init(void)
 {
 	int ret;
 
@@ -2620,7 +2620,7 @@ static int __init hv_acpi_init(void)
 		return 0;
 
 	/*
-	 * Get ACPI resources first.
+	 * Get VMBus resources first.
 	 */
 	ret = platform_driver_register(&vmbus_platform_driver);
 	if (ret)
@@ -2707,5 +2707,5 @@ static void __exit vmbus_exit(void)
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Microsoft Hyper-V VMBus Driver");
 
-subsys_initcall(hv_acpi_init);
+subsys_initcall(vmbus_init);
 module_exit(vmbus_exit);
-- 
2.34.1


^ permalink raw reply related	[relevance 79%]

* [PATCH] PCI: Add a mutex to protect the global list pci_domain_busn_res_list
@ 2024-04-19  1:53 76% Dexuan Cui
  2024-04-19 15:07 79% ` Haiyang Zhang
  0 siblings, 1 reply; 200+ results
From: Dexuan Cui @ 2024-04-19  1:53 UTC (permalink / raw)
  To: bhelgaas, wei.liu, kys, haiyangz, lpieralisi, linux-pci
  Cc: linux-hyperv, linux-kernel, Boqun.Feng, sunilmut, ssengar, Dexuan Cui

There has been an effort to make the pci-hyperv driver support
async-probing to reduce the boot time. With async-probing, multiple
kernel threads can be running hv_pci_probe() -> create_root_hv_pci_bus() ->
pci_scan_root_bus_bridge() -> pci_bus_insert_busn_res() at the same time to
update the global list, causing list corruption.

Add a mutex to protect the list.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/pci/probe.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e19b79821dd6..1327fd820b24 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -37,6 +37,7 @@ LIST_HEAD(pci_root_buses);
 EXPORT_SYMBOL(pci_root_buses);
 
 static LIST_HEAD(pci_domain_busn_res_list);
+static DEFINE_MUTEX(pci_domain_busn_res_list_lock);
 
 struct pci_domain_busn_res {
 	struct list_head list;
@@ -47,14 +48,22 @@ struct pci_domain_busn_res {
 static struct resource *get_pci_domain_busn_res(int domain_nr)
 {
 	struct pci_domain_busn_res *r;
+	struct resource *ret;
 
-	list_for_each_entry(r, &pci_domain_busn_res_list, list)
-		if (r->domain_nr == domain_nr)
-			return &r->res;
+	mutex_lock(&pci_domain_busn_res_list_lock);
+
+	list_for_each_entry(r, &pci_domain_busn_res_list, list) {
+		if (r->domain_nr == domain_nr) {
+			ret = &r->res;
+			goto out;
+		}
+	}
 
 	r = kzalloc(sizeof(*r), GFP_KERNEL);
-	if (!r)
-		return NULL;
+	if (!r) {
+		ret = NULL;
+		goto out;
+	}
 
 	r->domain_nr = domain_nr;
 	r->res.start = 0;
@@ -62,8 +71,10 @@ static struct resource *get_pci_domain_busn_res(int domain_nr)
 	r->res.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED;
 
 	list_add_tail(&r->list, &pci_domain_busn_res_list);
-
-	return &r->res;
+	ret = &r->res;
+out:
+	mutex_unlock(&pci_domain_busn_res_list_lock);
+	return ret;
 }
 
 /*
-- 
2.25.1
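
Illustration (not part of the patch): the race being closed is a
lookup-then-insert on a shared list. A minimal userspace sketch of the same
locking shape follows, assuming pthreads in place of the kernel mutex API.
The names domain_res, domain_list and get_domain_res are hypothetical, and
the sketch prepends to a singly linked list, whereas the patch uses
list_add_tail() on a struct list_head.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct pci_domain_busn_res. */
struct domain_res {
	struct domain_res *next;
	int domain_nr;
	int res;	/* placeholder for the embedded struct resource */
};

struct domain_res *domain_list;
pthread_mutex_t domain_list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Look up the entry for domain_nr, creating it if absent.  The lookup and
 * the insertion happen under one lock hold, so two concurrent callers
 * cannot both miss and then both insert an entry for the same domain.
 */
struct domain_res *get_domain_res(int domain_nr)
{
	struct domain_res *r;

	pthread_mutex_lock(&domain_list_lock);

	for (r = domain_list; r; r = r->next)
		if (r->domain_nr == domain_nr)
			goto out;

	r = calloc(1, sizeof(*r));
	if (!r)
		goto out;	/* r is NULL here; the caller sees the failure */

	r->domain_nr = domain_nr;
	r->next = domain_list;
	domain_list = r;
out:
	pthread_mutex_unlock(&domain_list_lock);
	return r;
}
```

As in the patch, every exit path funnels through a single unlock, which is
why the early returns become "goto out".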


^ permalink raw reply related	[relevance 76%]

* RE: [PATCH net-next] net: mana: Add new device attributes for mana
  2024-04-15  9:49 63% [PATCH net-next] net: mana: Add new device attributes for mana Shradha Gupta
    2024-04-15 16:38 79% ` Saurabh Singh Sengar
@ 2024-04-18 21:29 79% ` Haiyang Zhang
  2 siblings, 0 replies; 200+ results
From: Haiyang Zhang @ 2024-04-18 21:29 UTC (permalink / raw)
  To: Shradha Gupta, linux-kernel, linux-hyperv, linux-rdma, netdev
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Leon Romanovsky,
	Thomas Gleixner, Sebastian Andrzej Siewior, KY Srinivasan,
	Wei Liu, Dexuan Cui, Long Li, Michael Kelley, Shradha Gupta,
	Yury Norov, Konstantin Taranov, Souradeep Chakrabarti



> -----Original Message-----
> From: Shradha Gupta <shradhagupta@linux.microsoft.com>
> Sent: Monday, April 15, 2024 5:50 AM
> To: linux-kernel@vger.kernel.org; linux-hyperv@vger.kernel.org; linux-
> rdma@vger.kernel.org; netdev@vger.kernel.org
> Cc: Shradha Gupta <shradhagupta@linux.microsoft.com>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Ajay Sharma <sharmaajay@microsoft.com>; Leon
> Romanovsky <leon@kernel.org>; Thomas Gleixner <tglx@linutronix.de>;
> Sebastian Andrzej Siewior <bigeasy@linutronix.de>; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Wei Liu
> <wei.liu@kernel.org>; Dexuan Cui <decui@microsoft.com>; Long Li
> <longli@microsoft.com>; Michael Kelley <mikelley@microsoft.com>; Shradha
> Gupta <shradhagupta@microsoft.com>; Yury Norov <yury.norov@gmail.com>;
> Konstantin Taranov <kotaranov@microsoft.com>; Souradeep Chakrabarti
> <schakrabarti@linux.microsoft.com>
> Subject: [PATCH net-next] net: mana: Add new device attributes for mana
> 
> Add new device attributes to view multiport, msix, and adapter MTU
> setting for MANA device.
> 
> Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> ---
>  .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
>  include/net/mana/gdma.h                       |  9 +++
>  2 files changed, 83 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 1332db9a08eb..6674a02cff06 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
>  	return dev_id == MANA_PF_DEVICE_ID;
>  }
> 
> +static ssize_t mana_attr_show(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct gdma_context *gc = pci_get_drvdata(pdev);
> +	struct mana_context *ac = gc->mana.driver_data;
> +
> +	if (strcmp(attr->attr.name, "mport") == 0)
> +		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
> +	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
> +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
> +	else if (strcmp(attr->attr.name, "msix") == 0)
> +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
> +	else
> +		return -EINVAL;
> +}
> +
> +static int mana_gd_setup_sysfs(struct pci_dev *pdev)
> +{
> +	struct gdma_context *gc = pci_get_drvdata(pdev);
> +	int retval = 0;
> +
> +	gc->mana_attributes.mana_mport_attr.attr.name = "mport";
> +	gc->mana_attributes.mana_mport_attr.attr.mode = 0444;
> +	gc->mana_attributes.mana_mport_attr.show = mana_attr_show;
> +	sysfs_attr_init(&gc->mana_attributes.mana_mport_attr);
> +	retval = device_create_file(&pdev->dev,
> +				    &gc->mana_attributes.mana_mport_attr);
> +	if (retval < 0)
> +		return retval;
> +
> +	gc->mana_attributes.mana_adapter_mtu_attr.attr.name =
> "adapter_mtu";
> +	gc->mana_attributes.mana_adapter_mtu_attr.attr.mode = 0444;
> +	gc->mana_attributes.mana_adapter_mtu_attr.show = mana_attr_show;
> +	sysfs_attr_init(&gc->mana_attributes.mana_adapter_mtu_attr);
> +	retval = device_create_file(&pdev->dev,
> +				    &gc->mana_attributes.mana_adapter_mtu_attr);
> +	if (retval < 0)
> +		goto mtu_attr_error;
> +
> +	gc->mana_attributes.mana_msix_attr.attr.name = "msix";
> +	gc->mana_attributes.mana_msix_attr.attr.mode = 0444;
> +	gc->mana_attributes.mana_msix_attr.show = mana_attr_show;
> +	sysfs_attr_init(&gc->mana_attributes.mana_msix_attr);
> +	retval = device_create_file(&pdev->dev,
> +				    &gc->mana_attributes.mana_msix_attr);
> +	if (retval < 0)
> +		goto msix_attr_error;
> +
> +	return retval;
> +msix_attr_error:
> +	device_remove_file(&pdev->dev,
> +			   &gc->mana_attributes.mana_adapter_mtu_attr);
> +mtu_attr_error:
> +	device_remove_file(&pdev->dev,
> +			   &gc->mana_attributes.mana_mport_attr);
> +	return retval;
> +}
> +
>  static int mana_gd_probe(struct pci_dev *pdev, const struct
> pci_device_id *ent)
>  {
>  	struct gdma_context *gc;
> @@ -1519,6 +1578,10 @@ static int mana_gd_probe(struct pci_dev *pdev,
> const struct pci_device_id *ent)
>  	gc->bar0_va = bar0_va;
>  	gc->dev = &pdev->dev;
> 
> +	err = mana_gd_setup_sysfs(pdev);
> +	if (err < 0)
> +		goto free_gc;
> +

For examples of the usual pattern, vmbus_drv.c defines a number of sysfs attributes:

static ssize_t in_read_bytes_avail_show(struct device *dev,
                                        struct device_attribute *dev_attr,
                                        char *buf)
{
        struct hv_device *hv_dev = device_to_hv_device(dev);
        struct hv_ring_buffer_debug_info inbound;
        int ret;

        if (!hv_dev->channel)
                return -ENODEV;

        ret = hv_ringbuffer_get_debuginfo(&hv_dev->channel->inbound, &inbound);
        if (ret < 0)
                return ret;

        return sprintf(buf, "%d\n", inbound.bytes_avail_toread);
}
static DEVICE_ATTR_RO(in_read_bytes_avail);

Thanks,
- Haiyang

^ permalink raw reply	[relevance 79%]

* RE: [PATCH v2] Add a header in ifcfg and nm keyfiles describing the owner of the files
  2024-04-18 16:15 79% ` Easwar Hariharan
@ 2024-04-18 19:01 79%   ` Dexuan Cui
  2024-04-19 16:51 79%     ` Shradha Gupta
  0 siblings, 1 reply; 200+ results
From: Dexuan Cui @ 2024-04-18 19:01 UTC (permalink / raw)
  To: Easwar Hariharan, Ani Sinha, KY Srinivasan, Haiyang Zhang, Wei Liu
  Cc: shradhagupta, linux-hyperv, linux-kernel

> From: Easwar Hariharan <eahariha@linux.microsoft.com>
> Sent: Thursday, April 18, 2024 9:16 AM
> 
> On 4/18/2024 5:05 AM, Ani Sinha wrote:
> > A comment describing the source of writing the contents of the ifcfg and
> > network manager keyfiles (hyperv kvp daemon) is useful. It is valuable

s/hyperv/Hyper-V/

> > +#define CFG_HEADER "# Generated by hyperv key-value pair daemon.
> Please do not modify.\n"

s/hyperv/Hyper-V/

> Looks good to me, I'll defer to other folks on the recipient list on whether
> "hyperv" should be capitalized as HyperV or other such feedback.

It's recommended to use "Hyper-V". Wei can help fix this so I guess
there is no need to resend the patch :-)

^ permalink raw reply	[relevance 79%]

* [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing cq callbacks
  2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
                   ` (3 preceding siblings ...)
  2024-04-18 16:52 64% ` [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
@ 2024-04-18 16:52 79% ` Konstantin Taranov
  2024-04-23 23:45 79%   ` Long Li
  2024-04-25 20:31 79%   ` Long Li
  2024-04-18 16:52 67% ` [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
  5 siblings, 2 replies; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:52 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Add a boundary check inside mana_ib_install_cq_cb to prevent index overflow.

Fixes: 2a31c5a7e0d8 ("RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function")
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 6c3bb8c..8323085 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -70,6 +70,8 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct gdma_queue *gdma_cq;
 
+	if (cq->queue.id >= gc->max_num_cqs)
+		return -EINVAL;
 	/* Create CQ table entry */
 	WARN_ON(gc->cq_table[cq->queue.id]);
 	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
-- 
2.43.0
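
Illustration (not part of the patch): the fix is a range check on an index
before it is used to address a fixed-size table. A minimal userspace
sketch, assuming a hypothetical table size MAX_NUM_CQS and a hypothetical
install_cq() helper. Unlike the patch, which only warns with WARN_ON() when
the slot is occupied, this sketch returns -EEXIST.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_NUM_CQS 8	/* hypothetical table size for this sketch */

struct cq_entry {
	int id;
};

struct cq_entry *cq_table[MAX_NUM_CQS];

/* Install cq at slot id, rejecting ids that fall outside the table. */
int install_cq(unsigned int id, struct cq_entry *cq)
{
	if (id >= MAX_NUM_CQS)
		return -EINVAL;	/* would index past the end of cq_table */
	if (cq_table[id])
		return -EEXIST;	/* sketch-only choice; the patch warns instead */

	cq_table[id] = cq;
	return 0;
}
```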


^ permalink raw reply related	[relevance 79%]

* [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation of rnic cq
  2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
                   ` (4 preceding siblings ...)
  2024-04-18 16:52 79% ` [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
@ 2024-04-18 16:52 67% ` Konstantin Taranov
  2024-04-23 23:57 79%   ` Long Li
  5 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:52 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Enable users to create RNIC CQs.
With the previous request size, an ethernet CQ is created.
Use the cq_buf_size from the user to create an RNIC CQ and return its ID.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c | 56 ++++++++++++++++++++++++++++++---
 include/uapi/rdma/mana-abi.h    |  7 +++++
 2 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 8323085..a62bda7 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -9,17 +9,25 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		      struct ib_udata *udata)
 {
 	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
+	struct mana_ib_create_cq_resp resp = {};
+	struct mana_ib_ucontext *mana_ucontext;
 	struct ib_device *ibdev = ibcq->device;
 	struct mana_ib_create_cq ucmd = {};
 	struct mana_ib_dev *mdev;
+	bool is_rnic_cq = true;
+	u32 doorbell;
 	int err;
 
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
 
-	if (udata->inlen < sizeof(ucmd))
+	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
+	cq->cq_handle = INVALID_MANA_HANDLE;
+
+	if (udata->inlen < offsetof(struct mana_ib_create_cq, cq_buf_size))
 		return -EINVAL;
 
-	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
+	if (udata->inlen == offsetof(struct mana_ib_create_cq, cq_buf_size))
+		is_rnic_cq = false;
 
 	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
 	if (err) {
@@ -28,19 +36,53 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		return err;
 	}
 
-	if (attr->cqe > mdev->adapter_caps.max_qp_wr) {
+	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
 		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
 		return -EINVAL;
 	}
 
-	cq->buf_size = attr->cqe * COMP_ENTRY_SIZE;
+	cq->buf_size = is_rnic_cq ? ucmd.cq_buf_size : attr->cqe * COMP_ENTRY_SIZE;
 	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->buf_size, &cq->queue);
 	if (err) {
 		ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n", err);
 		return err;
 	}
 
+	mana_ucontext = rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
+						  ibucontext);
+	doorbell = mana_ucontext->doorbell;
+
+	if (is_rnic_cq) {
+		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to create RNIC cq, %d\n", err);
+			goto err_destroy_queue;
+		}
+
+		err = mana_ib_install_cq_cb(mdev, cq);
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to install cq callback, %d\n", err);
+			goto err_destroy_rnic_cq;
+		}
+	}
+
+	resp.cqid = cq->queue.id;
+	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
+	if (err) {
+		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
+		goto err_remove_cq_cb;
+	}
+
 	return 0;
+
+err_remove_cq_cb:
+	mana_ib_remove_cq_cb(mdev, cq);
+err_destroy_rnic_cq:
+	mana_ib_gd_destroy_cq(mdev, cq);
+err_destroy_queue:
+	mana_ib_destroy_queue(mdev, &cq->queue);
+
+	return err;
 }
 
 int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
@@ -52,6 +94,12 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
 
 	mana_ib_remove_cq_cb(mdev, cq);
+
+	/* Ignore return code as there is not much we can do about it.
+	 * The error message is printed inside.
+	 */
+	mana_ib_gd_destroy_cq(mdev, cq);
+
 	mana_ib_destroy_queue(mdev, &cq->queue);
 
 	return 0;
diff --git a/include/uapi/rdma/mana-abi.h b/include/uapi/rdma/mana-abi.h
index 5fcb31b..8fc9d32 100644
--- a/include/uapi/rdma/mana-abi.h
+++ b/include/uapi/rdma/mana-abi.h
@@ -18,6 +18,13 @@
 
 struct mana_ib_create_cq {
 	__aligned_u64 buf_addr;
+	__u32 cq_buf_size;
+	__u32 reserved;
+};
+
+struct mana_ib_create_cq_resp {
+	__u32 cqid;
+	__u32 reserved;
 };
 
 struct mana_ib_create_qp {
-- 
2.43.0
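
Illustration (not part of the patch): the request type is inferred from the
length of the user buffer. A request that ends exactly where cq_buf_size
would begin is treated as the old (ethernet) layout, and a longer one as
the new (RNIC) layout. A userspace sketch of that decision follows. The
names create_cq_req, cq_kind and classify_request are hypothetical, and
fixed-width stdint types stand in for __aligned_u64 and __u32.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the field order of struct mana_ib_create_cq after the patch. */
struct create_cq_req {
	uint64_t buf_addr;
	uint32_t cq_buf_size;
	uint32_t reserved;
};

enum cq_kind {
	CQ_INVALID = -1,	/* too short even for the old layout */
	CQ_ETHERNET = 0,	/* old layout: buf_addr only */
	CQ_RNIC = 1,		/* new layout: carries cq_buf_size */
};

/* Classify a request by the number of bytes the caller supplied. */
enum cq_kind classify_request(size_t inlen)
{
	size_t old_len = offsetof(struct create_cq_req, cq_buf_size);

	if (inlen < old_len)
		return CQ_INVALID;
	if (inlen == old_len)
		return CQ_ETHERNET;
	return CQ_RNIC;
}
```

Because the new fields are appended after buf_addr, old userspace keeps
working unchanged, which is the property the offsetof() comparison in the
patch relies on.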


^ permalink raw reply related	[relevance 67%]

* [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for RNIC CQs
  2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
@ 2024-04-18 16:52 75% ` Konstantin Taranov
  2024-04-23 23:24 79%   ` Long Li
  2024-04-18 16:52 68% ` [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:52 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Create EQs within mana_ib device. Such EQs are required
for creation of RNIC CQs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    | 34 ++++++++++++++++++++++++++--
 drivers/infiniband/hw/mana/mana_ib.h |  1 +
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index f540147..546d059 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -658,7 +658,7 @@ int mana_ib_create_eqs(struct mana_ib_dev *mdev)
 {
 	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct gdma_queue_spec spec = {};
-	int err;
+	int err, i;
 
 	spec.type = GDMA_EQ;
 	spec.monitor_avl_buf = false;
@@ -672,12 +672,42 @@ int mana_ib_create_eqs(struct mana_ib_dev *mdev)
 	if (err)
 		return err;
 
+	mdev->eqs = kcalloc(mdev->ib_dev.num_comp_vectors, sizeof(struct gdma_queue *),
+			    GFP_KERNEL);
+	if (!mdev->eqs) {
+		err = -ENOMEM;
+		goto destroy_fatal_eq;
+	}
+
+	for (i = 0; i < mdev->ib_dev.num_comp_vectors; i++) {
+		spec.eq.msix_index = (i + 1) % gc->num_msix_usable;
+		err = mana_gd_create_mana_eq(mdev->gdma_dev, &spec, &mdev->eqs[i]);
+		if (err)
+			goto destroy_eqs;
+	}
+
 	return 0;
+
+destroy_eqs:
+	while (i-- > 0)
+		mana_gd_destroy_queue(gc, mdev->eqs[i]);
+	kfree(mdev->eqs);
+destroy_fatal_eq:
+	mana_gd_destroy_queue(gc, mdev->fatal_err_eq);
+	return err;
 }
 
 void mana_ib_destroy_eqs(struct mana_ib_dev *mdev)
 {
-	mana_gd_destroy_queue(mdev_to_gc(mdev), mdev->fatal_err_eq);
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	int i;
+
+	mana_gd_destroy_queue(gc, mdev->fatal_err_eq);
+
+	for (i = 0; i < mdev->ib_dev.num_comp_vectors; i++)
+		mana_gd_destroy_queue(gc, mdev->eqs[i]);
+
+	kfree(mdev->eqs);
 }
 
 int mana_ib_gd_create_rnic_adapter(struct mana_ib_dev *mdev)
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 4c1240d..bfcf6df 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -56,6 +56,7 @@ struct mana_ib_dev {
 	struct gdma_dev *gdma_dev;
 	mana_handle_t adapter_handle;
 	struct gdma_queue *fatal_err_eq;
+	struct gdma_queue **eqs;
 	struct mana_ib_adapter_caps adapter_caps;
 };
 
-- 
2.43.0
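
Illustration (not part of the patch): the destroy_eqs label unwinds only
the queues that were created before the failure, using "while (i-- > 0)".
A userspace sketch of the same unwind shape follows, assuming malloc() as a
stand-in for queue creation. The create_all() helper and its fail_at
parameter, which simulates a failure at a chosen index, are hypothetical.

```c
#include <stdlib.h>

/*
 * Fill objs[0..n-1].  If creation fails at index i, free objs[0..i-1]
 * in reverse order and report the error.  fail_at < 0 means no simulated
 * failure.
 */
int create_all(void **objs, int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		objs[i] = (i == fail_at) ? NULL : malloc(16);
		if (!objs[i])
			goto unwind;
	}
	return 0;

unwind:
	/* i is the failed index; the post-decrement visits i-1 down to 0. */
	while (i-- > 0) {
		free(objs[i]);
		objs[i] = NULL;
	}
	return -1;
}
```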


^ permalink raw reply related	[relevance 75%]

* [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size
  2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
  2024-04-18 16:52 75% ` [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for " Konstantin Taranov
  2024-04-18 16:52 68% ` [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
@ 2024-04-18 16:52 73% ` Konstantin Taranov
  2024-04-23 23:34 79%   ` Long Li
  2024-04-18 16:52 64% ` [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:52 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Replace cqe with buf_size in struct mana_ib_cq.
The cqe field is already present in struct ib_cq and can be accessed there.
The buf_size field is required for mana RNIC CQs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c      | 4 ++--
 drivers/infiniband/hw/mana/mana_ib.h | 2 +-
 drivers/infiniband/hw/mana/qp.c      | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index dc931b9..0467ee8 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -33,8 +33,8 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		return -EINVAL;
 	}
 
-	cq->cqe = attr->cqe;
-	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe * COMP_ENTRY_SIZE, &cq->queue);
+	cq->buf_size = attr->cqe * COMP_ENTRY_SIZE;
+	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->buf_size, &cq->queue);
 	if (err) {
 		ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n", err);
 		return err;
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 9162f29..9c07021 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -90,7 +90,7 @@ struct mana_ib_mr {
 struct mana_ib_cq {
 	struct ib_cq ibcq;
 	struct mana_ib_queue queue;
-	int cqe;
+	u32 buf_size;
 	u32 comp_vector;
 	mana_handle_t  cq_handle;
 };
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 280e85a..c4fb8b4 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -196,7 +196,7 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		wq_spec.queue_size = wq->wq_buf_size;
 
 		cq_spec.gdma_region = cq->queue.gdma_region;
-		cq_spec.queue_size = cq->cqe * COMP_ENTRY_SIZE;
+		cq_spec.queue_size = cq->buf_size;
 		cq_spec.modr_ctx_id = 0;
 		eq = &mpc->ac->eqs[cq->comp_vector];
 		cq_spec.attached_eq = eq->eq->id;
@@ -355,7 +355,7 @@ static int mana_ib_create_qp_raw(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	wq_spec.queue_size = ucmd.sq_buf_size;
 
 	cq_spec.gdma_region = send_cq->queue.gdma_region;
-	cq_spec.queue_size = send_cq->cqe * COMP_ENTRY_SIZE;
+	cq_spec.queue_size = send_cq->buf_size;
 	cq_spec.modr_ctx_id = 0;
 	eq_vec = send_cq->comp_vector;
 	eq = &mpc->ac->eqs[eq_vec];
-- 
2.43.0


^ permalink raw reply related	[relevance 73%]

* [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs
@ 2024-04-18 16:51 72% Konstantin Taranov
  2024-04-18 16:52 75% ` [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for " Konstantin Taranov
                   ` (5 more replies)
  0 siblings, 6 replies; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:51 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

This patch series implements creation and destruction of CQs
which can be used with RC QPs.

Patches with RC QPs will be sent in the next patch series.

To create a CQ for RNIC, mana_ib requires creation of EQs
within mana_ib device. An EQ of mana ethernet cannot be used.

To make the implementation of create_cq cleaner, this series
also introduces minor changes to mana_cq structure (cqe->buf_size)
and adds a helper to remove CQ callbacks.

Mana ethernet and mana_ib CQs are different entities which are
created in different isolation zones (ethernet vs rnic).
As a result, RNIC cannot use ethernet CQs and ethernet cannot
use RNIC CQs.
That is why we use the existing udata request for creation of
ethernet CQs. If the request has extra fields, then we create
an RNIC CQ. The kernel-level CQs will be RNIC CQs (in future
patches).

To preserve backward and forward compatibility with RDMA-CORE,
we will make the following changes to mana provider in RDMA-CORE:

The rdma-core will request RNIC CQs by default, with the proposed
request format.
If the mana has installed an allocator with manadv_set_context_attr,
then the rdma-core understands that this is a DPDK use-case and
requests an ethernet CQ, using old short request format.

Konstantin Taranov (6):
  RDMA/mana_ib: create EQs for RNIC CQs
  RDMA/mana_ib: create and destroy RNIC cqs
  RDMA/mana_ib: replace duplicate cqe with buf_size
  RDMA/mana_ib: introduce a helper to remove cq callbacks
  RDMA/mana_ib: boundary check before installing cq callbacks
  RDMA/mana_ib: implement uapi for creation of rnic cq

 drivers/infiniband/hw/mana/cq.c      | 77 ++++++++++++++++++++----
 drivers/infiniband/hw/mana/main.c    | 88 +++++++++++++++++++++++++++-
 drivers/infiniband/hw/mana/mana_ib.h | 36 +++++++++++-
 drivers/infiniband/hw/mana/qp.c      | 30 ++--------
 include/uapi/rdma/mana-abi.h         |  7 +++
 5 files changed, 200 insertions(+), 38 deletions(-)

-- 
2.43.0


^ permalink raw reply	[relevance 72%]

* [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs
  2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
  2024-04-18 16:52 75% ` [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for " Konstantin Taranov
@ 2024-04-18 16:52 68% ` Konstantin Taranov
  2024-04-23 23:30 79%   ` Long Li
  2024-04-18 16:52 73% ` [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size Konstantin Taranov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:52 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement RNIC requests for creation and destruction of RNIC CQs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    | 54 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h | 32 +++++++++++++++++
 2 files changed, 86 insertions(+)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 546d059..2a41135 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -834,3 +834,57 @@ int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8
 
 	return 0;
 }
+
+int mana_ib_gd_create_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq, u32 doorbell)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_create_cq_resp resp = {};
+	struct mana_rnic_create_cq_req req = {};
+	int err;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CREATE_CQ, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.gdma_region = cq->queue.gdma_region;
+	req.eq_id = mdev->eqs[cq->comp_vector]->id;
+	req.doorbell_page = doorbell;
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create cq err %d", err);
+		return err;
+	}
+
+	cq->queue.id  = resp.cq_id;
+	cq->cq_handle = resp.cq_handle;
+	/* The GDMA region is now owned by the CQ handle */
+	cq->queue.gdma_region = GDMA_INVALID_DMA_REGION;
+
+	return 0;
+}
+
+int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_destroy_cq_resp resp = {};
+	struct mana_rnic_destroy_cq_req req = {};
+	int err;
+
+	if (cq->cq_handle == INVALID_MANA_HANDLE)
+		return 0;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_DESTROY_CQ, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.cq_handle = cq->cq_handle;
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to destroy cq err %d", err);
+		return err;
+	}
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index bfcf6df..9162f29 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -92,6 +92,7 @@ struct mana_ib_cq {
 	struct mana_ib_queue queue;
 	int cqe;
 	u32 comp_vector;
+	mana_handle_t  cq_handle;
 };
 
 struct mana_ib_qp {
@@ -119,6 +120,8 @@ enum mana_ib_command_code {
 	MANA_IB_DESTROY_ADAPTER = 0x30003,
 	MANA_IB_CONFIG_IP_ADDR	= 0x30004,
 	MANA_IB_CONFIG_MAC_ADDR	= 0x30005,
+	MANA_IB_CREATE_CQ       = 0x30008,
+	MANA_IB_DESTROY_CQ      = 0x30009,
 };
 
 struct mana_ib_query_adapter_caps_req {
@@ -202,6 +205,31 @@ struct mana_rnic_config_mac_addr_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+struct mana_rnic_create_cq_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	u64 gdma_region;
+	u32 eq_id;
+	u32 doorbell_page;
+}; /* HW Data */
+
+struct mana_rnic_create_cq_resp {
+	struct gdma_resp_hdr hdr;
+	mana_handle_t cq_handle;
+	u32 cq_id;
+	u32 reserved;
+}; /* HW Data */
+
+struct mana_rnic_destroy_cq_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t cq_handle;
+}; /* HW Data */
+
+struct mana_rnic_destroy_cq_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -321,4 +349,8 @@ int mana_ib_gd_add_gid(const struct ib_gid_attr *attr, void **context);
 int mana_ib_gd_del_gid(const struct ib_gid_attr *attr, void **context);
 
 int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8 *mac);
+
+int mana_ib_gd_create_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq, u32 doorbell);
+
+int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
 #endif
-- 
2.43.0


^ permalink raw reply related	[relevance 68%]

* [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks
  2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
                   ` (2 preceding siblings ...)
  2024-04-18 16:52 73% ` [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size Konstantin Taranov
@ 2024-04-18 16:52 64% ` Konstantin Taranov
  2024-04-23 23:42 79%   ` Long Li
  2024-04-18 16:52 79% ` [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
  2024-04-18 16:52 67% ` [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
  5 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-18 16:52 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Introduce the mana_ib_remove_cq_cb helper to remove cq callbacks.
The helper removes code duplicates.
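
For illustration only, the shape of such a helper can be modelled in
userspace C. This is a sketch under assumptions: struct demo_ctx and
demo_remove_cq_cb() are hypothetical stand-ins, not the driver's
gdma_context or mana_ib_remove_cq_cb(), and free() stands in for kfree().
It shows the two properties the helper relies on: an out-of-range id is
ignored, and a slot is cleared after it is freed so that a second call is
harmless.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-device CQ callback table. */
struct demo_ctx {
	uint32_t max_num_cqs;	/* number of slots in cq_table */
	void **cq_table;	/* one heap object (or NULL) per queue id */
};

/* Free and clear one slot; ids outside the table are silently ignored. */
void demo_remove_cq_cb(struct demo_ctx *ctx, uint32_t id)
{
	assert(ctx != NULL && ctx->cq_table != NULL);

	if (id >= ctx->max_num_cqs)
		return;

	free(ctx->cq_table[id]);	/* free(NULL) is a no-op */
	ctx->cq_table[id] = NULL;	/* makes a repeated call safe */
}
```

Because the range check also rejects any "invalid id" sentinel larger than
the table, callers on error paths can invoke the helper unconditionally.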

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c      | 19 ++++++++++++-------
 drivers/infiniband/hw/mana/mana_ib.h |  1 +
 drivers/infiniband/hw/mana/qp.c      | 26 ++++----------------------
 3 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 0467ee8..6c3bb8c 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -48,16 +48,10 @@ int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
 	struct ib_device *ibdev = ibcq->device;
 	struct mana_ib_dev *mdev;
-	struct gdma_context *gc;
 
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
-	gc = mdev_to_gc(mdev);
-
-	if (cq->queue.id != INVALID_QUEUE_ID) {
-		kfree(gc->cq_table[cq->queue.id]);
-		gc->cq_table[cq->queue.id] = NULL;
-	}
 
+	mana_ib_remove_cq_cb(mdev, cq);
 	mana_ib_destroy_queue(mdev, &cq->queue);
 
 	return 0;
@@ -89,3 +83,14 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 	gc->cq_table[cq->queue.id] = gdma_cq;
 	return 0;
 }
+
+void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+
+	if (cq->queue.id >= gc->max_num_cqs)
+		return;
+
+	kfree(gc->cq_table[cq->queue.id]);
+	gc->cq_table[cq->queue.id] = NULL;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 9c07021..6c19f4f 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -255,6 +255,7 @@ static inline void copy_in_reverse(u8 *dst, const u8 *src, u32 size)
 }
 
 int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
+void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
 
 int mana_ib_create_zero_offset_dma_region(struct mana_ib_dev *dev, struct ib_umem *umem,
 					  mana_handle_t *gdma_region);
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index c4fb8b4..169b286 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -95,11 +95,9 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
 	struct mana_ib_dev *mdev =
 		container_of(pd->device, struct mana_ib_dev, ib_dev);
-	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct ib_rwq_ind_table *ind_tbl = attr->rwq_ind_tbl;
 	struct mana_ib_create_qp_rss_resp resp = {};
 	struct mana_ib_create_qp_rss ucmd = {};
-	struct gdma_queue **gdma_cq_allocated;
 	mana_handle_t *mana_ind_table;
 	struct mana_port_context *mpc;
 	unsigned int ind_tbl_size;
@@ -173,13 +171,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		goto fail;
 	}
 
-	gdma_cq_allocated = kcalloc(ind_tbl_size, sizeof(*gdma_cq_allocated),
-				    GFP_KERNEL);
-	if (!gdma_cq_allocated) {
-		ret = -ENOMEM;
-		goto fail;
-	}
-
 	qp->port = port;
 
 	for (i = 0; i < ind_tbl_size; i++) {
@@ -229,8 +220,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		ret = mana_ib_install_cq_cb(mdev, cq);
 		if (ret)
 			goto fail;
-
-		gdma_cq_allocated[i] = gc->cq_table[cq->queue.id];
 	}
 	resp.num_entries = i;
 
@@ -250,7 +239,6 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		goto fail;
 	}
 
-	kfree(gdma_cq_allocated);
 	kfree(mana_ind_table);
 
 	return 0;
@@ -262,13 +250,10 @@ fail:
 		wq = container_of(ibwq, struct mana_ib_wq, ibwq);
 		cq = container_of(ibcq, struct mana_ib_cq, ibcq);
 
-		gc->cq_table[cq->queue.id] = NULL;
-		kfree(gdma_cq_allocated[i]);
-
+		mana_ib_remove_cq_cb(mdev, cq);
 		mana_destroy_wq_obj(mpc, GDMA_RQ, wq->rx_object);
 	}
 
-	kfree(gdma_cq_allocated);
 	kfree(mana_ind_table);
 
 	return ret;
@@ -287,10 +272,8 @@ static int mana_ib_create_qp_raw(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	struct mana_ib_ucontext *mana_ucontext =
 		rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
 					  ibucontext);
-	struct gdma_context *gc = mdev_to_gc(mdev);
 	struct mana_ib_create_qp_resp resp = {};
 	struct mana_ib_create_qp ucmd = {};
-	struct gdma_queue *gdma_cq = NULL;
 	struct mana_obj_spec wq_spec = {};
 	struct mana_obj_spec cq_spec = {};
 	struct mana_port_context *mpc;
@@ -395,14 +378,13 @@ static int mana_ib_create_qp_raw(struct ib_qp *ibqp, struct ib_pd *ibpd,
 		ibdev_dbg(&mdev->ib_dev,
 			  "Failed copy udata for create qp-raw, %d\n",
 			  err);
-		goto err_release_gdma_cq;
+		goto err_remove_cq_cb;
 	}
 
 	return 0;
 
-err_release_gdma_cq:
-	kfree(gdma_cq);
-	gc->cq_table[send_cq->queue.id] = NULL;
+err_remove_cq_cb:
+	mana_ib_remove_cq_cb(mdev, send_cq);
 
 err_destroy_wq_obj:
 	mana_destroy_wq_obj(mpc, GDMA_SQ, qp->qp_handle);
-- 
2.43.0


^ permalink raw reply related	[relevance 64%]

* Re: [PATCH v2] Add a header in ifcfg and nm keyfiles describing the owner of the files
  @ 2024-04-18 16:15 79% ` Easwar Hariharan
  2024-04-18 19:01 79%   ` Dexuan Cui
  2024-04-19 16:54 79% ` Shradha Gupta
  1 sibling, 1 reply; 200+ results
From: Easwar Hariharan @ 2024-04-18 16:15 UTC (permalink / raw)
  To: Ani Sinha, K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui
  Cc: shradhagupta, linux-hyperv, linux-kernel

On 4/18/2024 5:05 AM, Ani Sinha wrote:
> A comment describing the source of writing the contents of the ifcfg and
> network manager keyfiles (hyperv kvp daemon) is useful. It is valuable both
> for debugging as well as for preventing users from modifying them.
> 
> CC: shradhagupta@linux.microsoft.com
> CC: eahariha@linux.microsoft.com
> CC: wei.liu@kernel.org
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>  tools/hv/hv_kvp_daemon.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> changelog:
> v2: simplify and fix issues with error handling.
> 
> diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
> index ae57bf69ad4a..014e45be6981 100644
> --- a/tools/hv/hv_kvp_daemon.c
> +++ b/tools/hv/hv_kvp_daemon.c
> @@ -94,6 +94,8 @@ static char *lic_version = "Unknown version";
>  static char full_domain_name[HV_KVP_EXCHANGE_MAX_VALUE_SIZE];
>  static struct utsname uts_buf;
>  
> +#define CFG_HEADER "# Generated by hyperv key-value pair daemon. Please do not modify.\n"
> +
>  /*
>   * The location of the interface configuration file.
>   */
> @@ -1435,6 +1437,18 @@ static int kvp_set_ip_info(char *if_name, struct hv_kvp_ipaddr_value *new_val)
>  		return HV_E_FAIL;
>  	}
>  
> +	/* Write the config file headers */
> +	error = fprintf(ifcfg_file, CFG_HEADER);
> +	if (error < 0) {
> +		error = HV_E_FAIL;
> +		goto setval_error;
> +	}
> +	error = fprintf(nmfile, CFG_HEADER);
> +	if (error < 0) {
> +		error = HV_E_FAIL;
> +		goto setval_error;
> +	}
> +
>  	/*
>  	 * First write out the MAC address.
>  	 */


Looks good to me, I'll defer to other folks on the recipient list on whether "hyperv" should be capitalized
as HyperV or other such feedback.

Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>

Thanks,
Easwar

^ permalink raw reply	[relevance 79%]

* Re: [PATCH v3] Drivers: hv: Cosmetic changes for hv.c and balloon.c
  2024-04-12  5:28 38% [PATCH v3] Drivers: hv: Cosmetic changes for hv.c and balloon.c Aditya Nagesh
@ 2024-04-18 15:06 79% ` Saurabh Singh Sengar
  0 siblings, 0 replies; 200+ results
From: Saurabh Singh Sengar @ 2024-04-18 15:06 UTC (permalink / raw)
  To: Aditya Nagesh
  Cc: adityanagesh, kys, haiyangz, wei.liu, decui, linux-hyperv, linux-kernel

On Thu, Apr 11, 2024 at 10:28:03PM -0700, Aditya Nagesh wrote:
> Fix issues reported by checkpatch.pl script in hv.c and
> balloon.c
>  - Remove unnecessary parentheses
>  - Remove extra newlines
>  - Remove extra spaces
>  - Add spaces between comparison operators
>  - Remove comparison with NULL in if statements
> 
> No functional changes intended
> 
> Signed-off-by: Aditya Nagesh <adityanagesh@linux.microsoft.com>
> ---
> [V3]
> Fix alignment issues in multiline function parameters.
> 
> [V2]
> Change Subject from "Drivers: hv: Fix Issues reported by checkpatch.pl script"
>  to "Drivers: hv: Cosmetic changes for hv.c and balloon.c"
> 
>  drivers/hv/hv.c         |  35 +++++++-------
>  drivers/hv/hv_balloon.c | 101 +++++++++++++++-------------------------
>  2 files changed, 54 insertions(+), 82 deletions(-)
> 
> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> index a8ad728354cb..4906611475fb 100644
> --- a/drivers/hv/hv.c
> +++ b/drivers/hv/hv.c
> @@ -45,7 +45,7 @@ int hv_init(void)
>   * This involves a hypercall.
>   */
>  int hv_post_message(union hv_connection_id connection_id,
> -		  enum hv_message_type message_type,
> +		    enum hv_message_type message_type,

This line is fixed now, but now below line is unaligned.

>  		  void *payload, size_t payload_size)
>  {

<snip>

>  
>  	if (req->more_pages == 1)
>  		return;
> @@ -1415,7 +1395,7 @@ static int dm_thread_func(void *dm_dev)
>  
>  	while (!kthread_should_stop()) {
>  		wait_for_completion_interruptible_timeout(
> -						&dm_device.config_event, 1*HZ);
> +						&dm_device.config_event, 1 * HZ);

IMO, we can move this to previous line, as now 100 characters are allowed in single line.


Once fixed above,
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>

- Saurabh


^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
  @ 2024-04-18  6:01 78%       ` Shradha Gupta
    0 siblings, 1 reply; 200+ results
From: Shradha Gupta @ 2024-04-18  6:01 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Zhu Yanjun, Jason Gunthorpe, linux-kernel, linux-hyperv,
	linux-rdma, netdev, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Ajay Sharma, Leon Romanovsky, Thomas Gleixner,
	Sebastian Andrzej Siewior, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Long Li, Michael Kelley, Shradha Gupta,
	Yury Norov, Konstantin Taranov, Souradeep Chakrabarti

On Tue, Apr 16, 2024 at 08:09:35PM +0200, Andrew Lunn wrote:
> On Tue, Apr 16, 2024 at 06:27:04AM +0200, Zhu Yanjun wrote:
> > On 2024/4/15 18:13, Jason Gunthorpe wrote:
> > > On Mon, Apr 15, 2024 at 02:49:49AM -0700, Shradha Gupta wrote:
> > > > Add new device attributes to view multiport, msix, and adapter MTU
> > > > setting for MANA device.
> > > > 
> > > > Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> > > > ---
> > > >   .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
> > > >   include/net/mana/gdma.h                       |  9 +++
> > > >   2 files changed, 83 insertions(+)
> > > > 
> > > > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > > > index 1332db9a08eb..6674a02cff06 100644
> > > > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > > > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > > > @@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
> > > >   	return dev_id == MANA_PF_DEVICE_ID;
> > > >   }
> > > > +static ssize_t mana_attr_show(struct device *dev,
> > > > +			      struct device_attribute *attr, char *buf)
> > > > +{
> > > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > > +	struct gdma_context *gc = pci_get_drvdata(pdev);
> > > > +	struct mana_context *ac = gc->mana.driver_data;
> > > > +
> > > > +	if (strcmp(attr->attr.name, "mport") == 0)
> > > > +		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
> > > > +	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
> > > > +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
> > > > +	else if (strcmp(attr->attr.name, "msix") == 0)
> > > > +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
> > > > +	else
> > > > +		return -EINVAL;
> > > > +
> > > 
> > > That is not how sysfs should be implemented at all, please find a
> > > good example to copy from. Every attribute should use its own function
> > > with the macros to link it into an attributes group and sysfs_emit
> > > should be used for printing
> > 
> > Not sure if this file drivers/infiniband/hw/usnic/usnic_ib_sysfs.c is a good
> > example or not.
> 
> The first question should be, what are these values used for? You can
> then decide on debugfs or sysfs. debugfs is easier to use, and you
> avoid any ABI, which will make long term support easier.

Hi Andrew,
We want to eventually use these attributes to make the device settings configurable
and also improve debuggability for MANA devices. I feel having these attributes 
in sysfs would make more sense as we plan to extend the attribute list and also make
them settable.

Regards,
Shradha.
> 
>       Andrew
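
As a rough illustration of the layout Jason asks for above (one show
function per attribute instead of a single function that compares
attribute names), here is a userspace model in C. Everything in it is
hypothetical: struct demo_dev, struct demo_attr and demo_read_attr() are
made-up names, and snprintf() stands in for sysfs_emit(). In the kernel,
the table would normally be built with DEVICE_ATTR_RO() and
ATTRIBUTE_GROUPS(), and the sysfs core, not the driver, would do the
name lookup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical device state; field names follow the patch, the struct does not. */
struct demo_dev {
	int num_ports;
	int adapter_mtu;
	int max_num_msix;
};

/* One named attribute with its own show callback. */
struct demo_attr {
	const char *name;
	int (*show)(const struct demo_dev *dev, char *buf, size_t len);
};

int mport_show(const struct demo_dev *dev, char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", dev->num_ports);
}

int adapter_mtu_show(const struct demo_dev *dev, char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", dev->adapter_mtu);
}

int msix_show(const struct demo_dev *dev, char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", dev->max_num_msix);
}

/* The attribute "group": adding an attribute means adding one row here. */
const struct demo_attr demo_attrs[] = {
	{ "mport", mport_show },
	{ "adapter_mtu", adapter_mtu_show },
	{ "msix", msix_show },
};

/* Generic dispatch; this is the only place that compares names. */
int demo_read_attr(const struct demo_dev *dev, const char *name,
		   char *buf, size_t len)
{
	size_t i;

	assert(dev != NULL && name != NULL && buf != NULL);

	for (i = 0; i < sizeof(demo_attrs) / sizeof(demo_attrs[0]); i++)
		if (strcmp(demo_attrs[i].name, name) == 0)
			return demo_attrs[i].show(dev, buf, len);

	return -1;	/* unknown attribute */
}
```

Each show function stays trivial and independent, so making one attribute
writable later would only touch that attribute's row and callbacks.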

^ permalink raw reply	[relevance 78%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
    @ 2024-04-18  5:51 79%     ` Shradha Gupta
  1 sibling, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-18  5:51 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Jason Gunthorpe, linux-kernel, linux-hyperv, linux-rdma, netdev,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Ajay Sharma,
	Leon Romanovsky, Thomas Gleixner, Sebastian Andrzej Siewior,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Michael Kelley, Shradha Gupta, Yury Norov, Konstantin Taranov,
	Souradeep Chakrabarti

On Tue, Apr 16, 2024 at 06:27:04AM +0200, Zhu Yanjun wrote:
> On 2024/4/15 18:13, Jason Gunthorpe wrote:
> >On Mon, Apr 15, 2024 at 02:49:49AM -0700, Shradha Gupta wrote:
> >>Add new device attributes to view multiport, msix, and adapter MTU
> >>setting for MANA device.
> >>
> >>Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> >>---
> >>  .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
> >>  include/net/mana/gdma.h                       |  9 +++
> >>  2 files changed, 83 insertions(+)
> >>
> >>diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> >>index 1332db9a08eb..6674a02cff06 100644
> >>--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> >>+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> >>@@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
> >>  	return dev_id == MANA_PF_DEVICE_ID;
> >>  }
> >>+static ssize_t mana_attr_show(struct device *dev,
> >>+			      struct device_attribute *attr, char *buf)
> >>+{
> >>+	struct pci_dev *pdev = to_pci_dev(dev);
> >>+	struct gdma_context *gc = pci_get_drvdata(pdev);
> >>+	struct mana_context *ac = gc->mana.driver_data;
> >>+
> >>+	if (strcmp(attr->attr.name, "mport") == 0)
> >>+		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
> >>+	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
> >>+		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
> >>+	else if (strcmp(attr->attr.name, "msix") == 0)
> >>+		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
> >>+	else
> >>+		return -EINVAL;
> >>+
> >
> >That is not how sysfs should be implemented at all, please find a
> >good example to copy from. Every attribute should use its own function
> >with the macros to link it into an attributes group and sysfs_emit
> >should be used for printing
> 
> Not sure if this file drivers/infiniband/hw/usnic/usnic_ib_sysfs.c
> is a good example or not.
> 
> Zhu Yanjun
Thanks for the reference, Zhu.
> 
> >
> >Jason

^ permalink raw reply	[relevance 79%]

* [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr
  2024-04-17 14:20 79% [PATCH rdma-next 0/2] RDMA/mana_ib: Enable DMA-mapped memory regions Konstantin Taranov
  2024-04-17 14:20 79% ` [PATCH rdma-next 1/2] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
@ 2024-04-17 14:20 70% ` Konstantin Taranov
      1 sibling, 2 replies; 200+ results
From: Konstantin Taranov @ 2024-04-17 14:20 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement allocation of DMA-mapped memory regions.
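
The access-flag check in the new function is a plain allow-list mask
test. A minimal userspace sketch of that test follows; the DEMO_* names
and bit values are illustrative assumptions, not the kernel's
IB_ACCESS_* definitions.

```c
#include <assert.h>	/* used by the usage note below */
#include <stdbool.h>

/* Illustrative bit values only; the kernel defines the real access flags. */
enum {
	DEMO_ACCESS_LOCAL_WRITE  = 1 << 0,
	DEMO_ACCESS_REMOTE_WRITE = 1 << 1,
	DEMO_ACCESS_REMOTE_READ  = 1 << 2,
};

/* Only local write is accepted for a DMA MR in this sketch. */
#define DEMO_VALID_DMA_MR_FLAGS DEMO_ACCESS_LOCAL_WRITE

/* True when no bit outside the allow-list is set. */
bool demo_dma_mr_flags_ok(int access_flags)
{
	return (access_flags & ~DEMO_VALID_DMA_MR_FLAGS) == 0;
}
```

With these definitions, demo_dma_mr_flags_ok(0) and
demo_dma_mr_flags_ok(DEMO_ACCESS_LOCAL_WRITE) hold, while any request
that includes a remote access bit is rejected.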

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/device.c |  1 +
 drivers/infiniband/hw/mana/mr.c     | 36 +++++++++++++++++++++++++++++
 include/net/mana/gdma.h             |  5 ++++
 3 files changed, 42 insertions(+)

diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 6fa902ee80a6..043cef09f1c2 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -29,6 +29,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.destroy_rwq_ind_table = mana_ib_destroy_rwq_ind_table,
 	.destroy_wq = mana_ib_destroy_wq,
 	.disassociate_ucontext = mana_ib_disassociate_ucontext,
+	.get_dma_mr = mana_ib_get_dma_mr,
 	.get_port_immutable = mana_ib_get_port_immutable,
 	.mmap = mana_ib_mmap,
 	.modify_qp = mana_ib_modify_qp,
diff --git a/drivers/infiniband/hw/mana/mr.c b/drivers/infiniband/hw/mana/mr.c
index 4f13423ecdbd..7c9394926a18 100644
--- a/drivers/infiniband/hw/mana/mr.c
+++ b/drivers/infiniband/hw/mana/mr.c
@@ -8,6 +8,8 @@
 #define VALID_MR_FLAGS                                                         \
 	(IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE | IB_ACCESS_REMOTE_READ)
 
+#define VALID_DMA_MR_FLAGS IB_ACCESS_LOCAL_WRITE
+
 static enum gdma_mr_access_flags
 mana_ib_verbs_to_gdma_access_flags(int access_flags)
 {
@@ -39,6 +41,8 @@ static int mana_ib_gd_create_mr(struct mana_ib_dev *dev, struct mana_ib_mr *mr,
 	req.mr_type = mr_params->mr_type;
 
 	switch (mr_params->mr_type) {
+	case GDMA_MR_TYPE_GPA:
+		break;
 	case GDMA_MR_TYPE_GVA:
 		req.gva.dma_region_handle = mr_params->gva.dma_region_handle;
 		req.gva.virtual_address = mr_params->gva.virtual_address;
@@ -168,6 +172,38 @@ struct ib_mr *mana_ib_reg_user_mr(struct ib_pd *ibpd, u64 start, u64 length,
 	return ERR_PTR(err);
 }
 
+struct ib_mr *mana_ib_get_dma_mr(struct ib_pd *ibpd, int access_flags)
+{
+	struct mana_ib_pd *pd = container_of(ibpd, struct mana_ib_pd, ibpd);
+	struct gdma_create_mr_params mr_params = {};
+	struct ib_device *ibdev = ibpd->device;
+	struct mana_ib_dev *dev;
+	struct mana_ib_mr *mr;
+	int err;
+
+	dev = container_of(ibdev, struct mana_ib_dev, ib_dev);
+
+	if (access_flags & ~VALID_DMA_MR_FLAGS)
+		return ERR_PTR(-EINVAL);
+
+	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+	if (!mr)
+		return ERR_PTR(-ENOMEM);
+
+	mr_params.pd_handle = pd->pd_handle;
+	mr_params.mr_type = GDMA_MR_TYPE_GPA;
+
+	err = mana_ib_gd_create_mr(dev, mr, &mr_params);
+	if (err)
+		goto err_free;
+
+	return &mr->ibmr;
+
+err_free:
+	kfree(mr);
+	return ERR_PTR(err);
+}
+
 int mana_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
 	struct mana_ib_mr *mr = container_of(ibmr, struct mana_ib_mr, ibmr);
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 8d796a30ddde..dc19b5cb33a6 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -788,6 +788,11 @@ struct gdma_destory_pd_resp {
 };/* HW DATA */
 
 enum gdma_mr_type {
+	/*
+	 * Guest Physical Address - MRs of this type allow access
+	 * to any DMA-mapped memory using bus-logical address
+	 */
+	GDMA_MR_TYPE_GPA = 1,
 	/* Guest Virtual Address - MRs of this type allow access
 	 * to memory mapped by PTEs associated with this MR using a virtual
 	 * address that is set up in the MST
-- 
2.43.0


^ permalink raw reply related	[relevance 70%]

* [PATCH rdma-next 0/2] RDMA/mana_ib: Enable DMA-mapped memory regions
@ 2024-04-17 14:20 79% Konstantin Taranov
  2024-04-17 14:20 79% ` [PATCH rdma-next 1/2] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
  2024-04-17 14:20 70% ` [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr Konstantin Taranov
  0 siblings, 2 replies; 200+ results
From: Konstantin Taranov @ 2024-04-17 14:20 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

This patch series enables creation of DMA-mapped memory regions.
It allows GPA creation in kernel-level PDs and implements get_dma_mr().
Note, mana_ib_get_dma_mr was already declared, but not implemented.

Konstantin Taranov (2):
  RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
  RDMA/mana_ib: Implement get_dma_mr

 drivers/infiniband/hw/mana/device.c |  1 +
 drivers/infiniband/hw/mana/main.c   |  3 +++
 drivers/infiniband/hw/mana/mr.c     | 36 +++++++++++++++++++++++++++++
 include/net/mana/gdma.h             |  6 +++++
 4 files changed, 46 insertions(+)


base-commit: dfcdb38b21e4fb92a49acdbdf6afa82c07c8eba0
-- 
2.43.0


^ permalink raw reply	[relevance 79%]

* [PATCH rdma-next 1/2] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
  2024-04-17 14:20 79% [PATCH rdma-next 0/2] RDMA/mana_ib: Enable DMA-mapped memory regions Konstantin Taranov
@ 2024-04-17 14:20 79% ` Konstantin Taranov
  2024-04-17 14:20 70% ` [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr Konstantin Taranov
  1 sibling, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-17 14:20 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Allow the HW to register DMA-mapped memory for kernel-level PDs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c | 3 +++
 include/net/mana/gdma.h           | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index b31dcff32699..820af42d1fe1 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -82,6 +82,9 @@ int mana_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	mana_gd_init_req_hdr(&req.hdr, GDMA_CREATE_PD, sizeof(req),
 			     sizeof(resp));
 
+	if (!udata)
+		flags |= GDMA_PD_FLAG_ALLOW_GPA_MR;
+
 	req.flags = flags;
 	err = mana_gd_send_request(gc, sizeof(req), &req,
 				   sizeof(resp), &resp);
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 27684135bb4d..8d796a30ddde 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -762,6 +762,7 @@ struct gdma_destroy_dma_region_req {
 
 enum gdma_pd_flags {
 	GDMA_PD_FLAG_INVALID = 0,
+	GDMA_PD_FLAG_ALLOW_GPA_MR = 1,
 };
 
 struct gdma_create_pd_req {
-- 
2.43.0


^ permalink raw reply related	[relevance 79%]

* Re: [PATCH] tools: hv: suppress the invalid warning for packed member alignment
  @ 2024-04-17  8:21 79%   ` Saurabh Singh Sengar
  0 siblings, 0 replies; 200+ results
From: Saurabh Singh Sengar @ 2024-04-17  8:21 UTC (permalink / raw)
  To: Greg KH
  Cc: kys, haiyangz, wei.liu, decui, linux-kernel, linux-hyperv, ssengar

On Wed, Apr 17, 2024 at 10:17:21AM +0200, Greg KH wrote:
> On Wed, Apr 17, 2024 at 01:00:48AM -0700, Saurabh Sengar wrote:
> > Packed struct vmbus_bufring is 4096 byte aligned and the reporting
> > warning is for the first member of that struct which shouldn't add
> > any offset to create alignment issue.
> > 
> > Suppress the warning by adding -Wno-address-of-packed-member flag to
> > gcc.
> > 
> > Reported-by: kernel test robot <yujie.liu@intel.com>
> > Closes: https://lore.kernel.org/all/202404121913.GhtSoKbW-lkp@intel.com/
> > Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> 
> What commit id does this fix?

Fixes: 45bab4d74651 ("tools: hv: Add vmbus_bufring")

- Saurabh

^ permalink raw reply	[relevance 79%]

* [PATCH] tools: hv: suppress the invalid warning for packed member alignment
@ 2024-04-17  8:00 79% Saurabh Sengar
    0 siblings, 1 reply; 200+ results
From: Saurabh Sengar @ 2024-04-17  8:00 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, gregkh, linux-kernel, linux-hyperv; +Cc: ssengar

Packed struct vmbus_bufring is 4096 byte aligned and the reporting
warning is for the first member of that struct which shouldn't add
any offset to create alignment issue.

Suppress the warning by adding -Wno-address-of-packed-member flag to
gcc.

Reported-by: kernel test robot <yujie.liu@intel.com>
Closes: https://lore.kernel.org/all/202404121913.GhtSoKbW-lkp@intel.com/
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
---
 tools/hv/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index bb52871da341..2e60e2c212cd 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -17,6 +17,7 @@ endif
 MAKEFLAGS += -r
 
 override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
+override CFLAGS += -Wno-address-of-packed-member
 
 ALL_TARGETS := hv_kvp_daemon hv_vss_daemon
 ifneq ($(ARCH), aarch64)
-- 
2.34.1
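
The alignment argument in the commit message can be checked with a small
user-space sketch. It assumes a GCC/Clang-style __attribute__ syntax; the
struct demo_ring layout, demo_instance, member_is_aligned() and
check_layout() are hypothetical names for illustration and do not reproduce
the real struct vmbus_bufring. The first member of a packed struct sits at
offset 0, so it inherits the alignment of the struct itself, while later
members may not be naturally aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration; not the real struct vmbus_bufring. */
struct demo_ring {
	uint32_t write_index;	/* first member, offset 0 */
	uint8_t  flag;		/* offset 4 */
	uint32_t read_index;	/* offset 5, because packing removes padding */
} __attribute__((packed, aligned(4096)));

static struct demo_ring demo_instance;

/* Returns 1 if the member at 'offset' within 'r' sits on an 'align' boundary. */
static int member_is_aligned(const struct demo_ring *r, size_t offset,
			     size_t align)
{
	return ((uintptr_t)r + offset) % align == 0;
}

/* The first member is aligned; a later packed member need not be. */
void check_layout(void)
{
	assert(offsetof(struct demo_ring, write_index) == 0);
	assert(member_is_aligned(&demo_instance, 0, _Alignof(uint32_t)));
	assert(!member_is_aligned(&demo_instance,
				  offsetof(struct demo_ring, read_index),
				  _Alignof(uint32_t)));
}
```

Only the offset-0 member is guaranteed to be aligned in this sketch, which is
the case the commit message describes.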


^ permalink raw reply related	[relevance 79%]

* [PATCH 2/2] selftests/user_events: Add non-spacing separator check
  2024-04-16 22:41 75% [PATCH 0/2] tracing/user_events: Fix non-spaced field matching Beau Belgrave
  2024-04-16 22:41 66% ` [PATCH 1/2] " Beau Belgrave
@ 2024-04-16 22:41 77% ` Beau Belgrave
  1 sibling, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-16 22:41 UTC (permalink / raw)
  To: rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, dcook

The ABI documentation indicates that field separators do not need a
space between them, only a ';'. When no spacing is used, the register
must work. Any subsequent register, with or without spaces, must match
and not return -EADDRINUSE.

Add a non-spacing separator case to our self-test register case to ensure
it works going forward.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 tools/testing/selftests/user_events/ftrace_test.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/testing/selftests/user_events/ftrace_test.c b/tools/testing/selftests/user_events/ftrace_test.c
index dcd7509fe2e0..0bb46793dcd4 100644
--- a/tools/testing/selftests/user_events/ftrace_test.c
+++ b/tools/testing/selftests/user_events/ftrace_test.c
@@ -261,6 +261,12 @@ TEST_F(user, register_events) {
 	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
 	ASSERT_EQ(0, reg.write_index);
 
+	/* Register without separator spacing should still match */
+	reg.enable_bit = 29;
+	reg.name_args = (__u64)"__test_event u32 field1;u32 field2";
+	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSREG, &reg));
+	ASSERT_EQ(0, reg.write_index);
+
 	/* Multiple registers to same name but different args should fail */
 	reg.enable_bit = 29;
 	reg.name_args = (__u64)"__test_event u32 field1;";
@@ -288,6 +294,8 @@ TEST_F(user, register_events) {
 	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, &unreg));
 	unreg.disable_bit = 30;
 	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, &unreg));
+	unreg.disable_bit = 29;
+	ASSERT_EQ(0, ioctl(self->data_fd, DIAG_IOCSUNREG, &unreg));
 
 	/* Delete should have been auto-done after close and unregister */
 	close(self->data_fd);
-- 
2.34.1


^ permalink raw reply related	[relevance 77%]

* [PATCH 0/2] tracing/user_events: Fix non-spaced field matching
@ 2024-04-16 22:41 75% Beau Belgrave
  2024-04-16 22:41 66% ` [PATCH 1/2] " Beau Belgrave
  2024-04-16 22:41 77% ` [PATCH 2/2] selftests/user_events: Add non-spacing separator check Beau Belgrave
  0 siblings, 2 replies; 200+ results
From: Beau Belgrave @ 2024-04-16 22:41 UTC (permalink / raw)
  To: rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, dcook

When the ABI was updated to prevent same name w/different args, it
missed an important corner case when fields don't end with a space.
Typically, space is used for fields to help separate them, like
"u8 field1; u8 field2". If no spaces are used, like
"u8 field1;u8 field2", then the parsing works for the first time.
However, the match check fails on a subsequent register, leading to
confusion.

This is because the match check uses argv_split() and assumes that all
fields will be split upon the space. When spaces are used, we get back
{ "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }.
This causes a mismatch, and the user program gets back -EADDRINUSE.

Add a method to detect this case before calling argv_split(). If found
force a space after the field separator character ';'. This ensures all
cases work properly for matching.

I could not find an existing function to accomplish this, so I had to
hand code a copy with this logic. If there is a better way to achieve
this, I'm all ears.

This series also adds a selftest to ensure this doesn't break again.

With this fix, the following are all treated as matching:
u8 field1;u8 field2
u8 field1; u8 field2
u8 field1;\tu8 field2
u8 field1;\nu8 field2
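
The normalization described above can be sketched in user space as follows.
This is a minimal illustration under stated assumptions, not the kernel
code: normalize_semis(), same_tokens(), fields_match() and
check_spellings() are hypothetical names, malloc() stands in for kmalloc(),
and a whitespace-token comparison stands in for the argv_split() based
match. As a choice of this sketch, a ';' at the very end of the string is
left untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy args, adding a space after any ';' that is
 * directly followed by a non-whitespace character. */
static char *normalize_semis(const char *args)
{
	size_t extra = 0;
	const char *p;
	char *fixed, *out;

	for (p = args; *p; p++)
		if (*p == ';' && p[1] && !isspace((unsigned char)p[1]))
			extra++;

	fixed = malloc(strlen(args) + extra + 1);
	if (!fixed)
		return NULL;

	out = fixed;
	for (p = args; *p; p++) {
		*out++ = *p;
		if (*p == ';' && p[1] && !isspace((unsigned char)p[1]))
			*out++ = ' ';
	}
	*out = '\0';
	return fixed;
}

/* Compare two strings token by token, treating any whitespace run as one
 * separator (a stand-in for splitting on whitespace and comparing). */
static int same_tokens(const char *a, const char *b)
{
	for (;;) {
		while (isspace((unsigned char)*a))
			a++;
		while (isspace((unsigned char)*b))
			b++;
		if (!*a || !*b)
			return *a == *b;
		while (*a && !isspace((unsigned char)*a)) {
			if (*a != *b)
				return 0;
			a++;
			b++;
		}
		if (*b && !isspace((unsigned char)*b))
			return 0;
	}
}

/* Returns 1 if both field strings match after normalization. */
static int fields_match(const char *a, const char *b)
{
	char *na = normalize_semis(a);
	char *nb = normalize_semis(b);
	int same = na && nb && same_tokens(na, nb);

	free(na);
	free(nb);
	return same;
}

/* The four spellings listed above all match each other in this sketch. */
void check_spellings(void)
{
	assert(fields_match("u8 field1;u8 field2", "u8 field1; u8 field2"));
	assert(fields_match("u8 field1;u8 field2", "u8 field1;\tu8 field2"));
	assert(fields_match("u8 field1;u8 field2", "u8 field1;\nu8 field2"));
}
```

Without the normalization step, splitting "u8 field1;u8 field2" on
whitespace yields the token "field1;u8", which is why the unfixed comparison
fails.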

Beau Belgrave (2):
  tracing/user_events: Fix non-spaced field matching
  selftests/user_events: Add non-spacing separator check

 kernel/trace/trace_events_user.c              | 88 ++++++++++++++++++-
 .../selftests/user_events/ftrace_test.c       |  8 ++
 2 files changed, 95 insertions(+), 1 deletion(-)


base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680
-- 
2.34.1


^ permalink raw reply	[relevance 75%]

* [PATCH 1/2] tracing/user_events: Fix non-spaced field matching
  2024-04-16 22:41 75% [PATCH 0/2] tracing/user_events: Fix non-spaced field matching Beau Belgrave
@ 2024-04-16 22:41 66% ` Beau Belgrave
    2024-04-16 22:41 77% ` [PATCH 2/2] selftests/user_events: Add non-spacing separator check Beau Belgrave
  1 sibling, 1 reply; 200+ results
From: Beau Belgrave @ 2024-04-16 22:41 UTC (permalink / raw)
  To: rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, dcook

When the ABI was updated to prevent same name w/different args, it
missed an important corner case when fields don't end with a space.
Typically, space is used for fields to help separate them, like
"u8 field1; u8 field2". If no spaces are used, like
"u8 field1;u8 field2", then the parsing works for the first time.
However, the match check fails on a subsequent register, leading to
confusion.

This is because the match check uses argv_split() and assumes that all
fields will be split upon the space. When spaces are used, we get back
{ "u8", "field1;" }, without spaces we get back { "u8", "field1;u8" }.
This causes a mismatch, and the user program gets back -EADDRINUSE.

Add a method to detect this case before calling argv_split(). If found
force a space after the field separator character ';'. This ensures all
cases work properly for matching.

With this fix, the following are all treated as matching:
u8 field1;u8 field2
u8 field1; u8 field2
u8 field1;\tu8 field2
u8 field1;\nu8 field2

Fixes: ba470eebc2f6 ("tracing/user_events: Prevent same name but different args event")
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 kernel/trace/trace_events_user.c | 88 +++++++++++++++++++++++++++++++-
 1 file changed, 87 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
index 70d428c394b6..9184d3962b2a 100644
--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -1989,6 +1989,92 @@ static int user_event_set_tp_name(struct user_event *user)
 	return 0;
 }
 
+/*
+ * Counts how many ';' without a trailing space are in the args.
+ */
+static int count_semis_no_space(char *args)
+{
+	int count = 0;
+
+	while ((args = strchr(args, ';'))) {
+		args++;
+
+		if (!isspace(*args))
+			count++;
+	}
+
+	return count;
+}
+
+/*
+ * Copies the arguments while ensuring all ';' have a trailing space.
+ */
+static char *fix_semis_no_space(char *args, int count)
+{
+	char *fixed, *pos;
+	char c, last;
+	int len;
+
+	len = strlen(args) + count;
+	fixed = kmalloc(len + 1, GFP_KERNEL);
+
+	if (!fixed)
+		return NULL;
+
+	pos = fixed;
+	last = '\0';
+
+	while (len > 0) {
+		c = *args++;
+
+		if (last == ';' && !isspace(c)) {
+			*pos++ = ' ';
+			len--;
+		}
+
+		if (len > 0) {
+			*pos++ = c;
+			len--;
+		}
+
+		last = c;
+	}
+
+	/*
+	 * len is the length of the copy excluding the null.
+	 * This ensures we always have room for a null.
+	 */
+	*pos = '\0';
+
+	return fixed;
+}
+
+static char **user_event_argv_split(char *args, int *argc)
+{
+	/* Count how many ';' without a trailing space */
+	int count = count_semis_no_space(args);
+
+	if (count) {
+		/* We must fixup 'field;field' to 'field; field' */
+		char *fixed = fix_semis_no_space(args, count);
+		char **split;
+
+		if (!fixed)
+			return NULL;
+
+		/* We do a normal split afterwards */
+		split = argv_split(GFP_KERNEL, fixed, argc);
+
+		/* We can free since argv_split makes a copy */
+		kfree(fixed);
+
+		return split;
+	}
+
+	/* No fixup is required */
+	return argv_split(GFP_KERNEL, args, argc);
+}
+
 /*
  * Parses the event name, arguments and flags then registers if successful.
  * The name buffer lifetime is owned by this method for success cases only.
@@ -2012,7 +2098,7 @@ static int user_event_parse(struct user_event_group *group, char *name,
 		return -EPERM;
 
 	if (args) {
-		argv = argv_split(GFP_KERNEL, args, &argc);
+		argv = user_event_argv_split(args, &argc);
 
 		if (!argv)
 			return -ENOMEM;
-- 
2.34.1


^ permalink raw reply related	[relevance 66%]

* Re: [PATCH] ACPI: CPPC: Fix bit_offset shift in MASK_VAL macro
  @ 2024-04-16 17:24 79% ` Jarred White
  0 siblings, 0 replies; 200+ results
From: Jarred White @ 2024-04-16 17:24 UTC (permalink / raw)
  To: Rafael J. Wysocki, Len Brown, Easwar Hariharan, open list:ACPI,
	open list
  Cc: Vanshidhar Konda, stable

On 4/8/2024 10:23 PM, Jarred White wrote:
> Commit 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for
> system memory accesses") neglected to properly wrap the bit_offset shift
> when it comes to applying the mask. This may cause incorrect values to be
> read and may cause the cpufreq module not be loaded.
> 
> [   11.059751] cpu_capacity: CPU0 missing/invalid highest performance.
> [   11.066005] cpu_capacity: partial information: fallback to 1024 for all CPUs
> 
> Also, corrected the bitmask generation in GENMASK (extra bit being added).
> 
> Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
> Signed-off-by: Jarred White <jarredwhite@linux.microsoft.com>
> CC: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
> CC: stable@vger.kernel.org #5.15+
> ---
>   drivers/acpi/cppc_acpi.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 4bfbe55553f4..00a30ca35e78 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -170,8 +170,8 @@ show_cppc_data(cppc_get_perf_ctrs, cppc_perf_fb_ctrs, wraparound_time);
>   #define GET_BIT_WIDTH(reg) ((reg)->access_width ? (8 << ((reg)->access_width - 1)) : (reg)->bit_width)
>   
>   /* Shift and apply the mask for CPC reads/writes */
> -#define MASK_VAL(reg, val) ((val) >> ((reg)->bit_offset & 			\
> -					GENMASK(((reg)->bit_width), 0)))
> +#define MASK_VAL(reg, val) (((val) >> (reg)->bit_offset) & 			\
> +					GENMASK(((reg)->bit_width) - 1, 0))
>   
>   static ssize_t show_feedback_ctrs(struct kobject *kobj,
>   		struct kobj_attribute *attr, char *buf)

Hi Vanshi,

Could you review please?


Thanks,
Jarred
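
For readers following the macro change quoted above, here is a minimal
user-space sketch of the two forms. The names GENMASK64, struct reg_desc,
mask_val_old(), mask_val_fixed() and check_mask_val() are hypothetical
stand-ins, not kernel definitions; GENMASK64 only approximates the kernel's
GENMASK(h, l) for 64-bit values, and the sketch assumes
1 <= bit_width <= 63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GENMASK(h, l): bits l..h set, 64-bit wide. */
#define GENMASK64(h, l) \
	(((~UINT64_C(0)) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* Hypothetical, reduced register descriptor with only the fields used. */
struct reg_desc {
	unsigned int bit_offset;
	unsigned int bit_width;
};

/* Old form: the mask is applied to the shift count, not to the value,
 * and the mask itself is one bit too wide. */
static uint64_t mask_val_old(const struct reg_desc *reg, uint64_t val)
{
	return val >> (reg->bit_offset & GENMASK64(reg->bit_width, 0));
}

/* Fixed form: shift first, then keep exactly bit_width low bits. */
static uint64_t mask_val_fixed(const struct reg_desc *reg, uint64_t val)
{
	return (val >> reg->bit_offset) & GENMASK64(reg->bit_width - 1, 0);
}

/* An 8-bit field at bit offset 8 of 0xABCD12 should read back as 0xCD. */
void check_mask_val(void)
{
	struct reg_desc reg = { .bit_offset = 8, .bit_width = 8 };

	assert(mask_val_old(&reg, 0xABCD12) == 0xABCD);	/* upper bits leak */
	assert(mask_val_fixed(&reg, 0xABCD12) == 0xCD);
}
```

In the old form the value is shifted but never masked, so bits above the
field leak into the result.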

^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
  2024-04-15 16:38 79% ` Saurabh Singh Sengar
@ 2024-04-16  4:26 79%   ` Shradha Gupta
  0 siblings, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-16  4:26 UTC (permalink / raw)
  To: Saurabh Singh Sengar
  Cc: linux-kernel, linux-hyperv, linux-rdma, netdev, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Ajay Sharma, Leon Romanovsky,
	Thomas Gleixner, Sebastian Andrzej Siewior, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Michael Kelley,
	Shradha Gupta, Yury Norov, Konstantin Taranov,
	Souradeep Chakrabarti

On Mon, Apr 15, 2024 at 09:38:32AM -0700, Saurabh Singh Sengar wrote:
> On Mon, Apr 15, 2024 at 02:49:49AM -0700, Shradha Gupta wrote:
> > Add new device attributes to view multiport, msix, and adapter MTU
> > setting for MANA device.
> > 
> > Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> > ---
> >  .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
> >  include/net/mana/gdma.h                       |  9 +++
> >  2 files changed, 83 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 1332db9a08eb..6674a02cff06 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
> >  	return dev_id == MANA_PF_DEVICE_ID;
> >  }
> >  
> > +static ssize_t mana_attr_show(struct device *dev,
> > +			      struct device_attribute *attr, char *buf)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct gdma_context *gc = pci_get_drvdata(pdev);
> > +	struct mana_context *ac = gc->mana.driver_data;
> > +
> > +	if (strcmp(attr->attr.name, "mport") == 0)
> > +		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
> > +	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
> > +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
> > +	else if (strcmp(attr->attr.name, "msix") == 0)
> > +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
> > +	else
> > +		return -EINVAL;
> > +}
> > +
> > +static int mana_gd_setup_sysfs(struct pci_dev *pdev)
> > +{
> > +	struct gdma_context *gc = pci_get_drvdata(pdev);
> > +	int retval = 0;
> > +
> > +	gc->mana_attributes.mana_mport_attr.attr.name = "mport";
> > +	gc->mana_attributes.mana_mport_attr.attr.mode = 0444;
> > +	gc->mana_attributes.mana_mport_attr.show = mana_attr_show;
> > +	sysfs_attr_init(&gc->mana_attributes.mana_mport_attr);
> > +	retval = device_create_file(&pdev->dev,
> > +				    &gc->mana_attributes.mana_mport_attr);
> 
> If you can use .dev_groups, sysfs creation and removal will be much
> simpler for the driver.
Sure Saurabh, I think we can do this too.
> 
> - Saurabh

^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
  @ 2024-04-16  4:25 79%   ` Shradha Gupta
    1 sibling, 0 replies; 200+ results
From: Shradha Gupta @ 2024-04-16  4:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-kernel, linux-hyperv, linux-rdma, netdev, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Ajay Sharma, Leon Romanovsky,
	Thomas Gleixner, Sebastian Andrzej Siewior, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Michael Kelley,
	Shradha Gupta, Yury Norov, Konstantin Taranov,
	Souradeep Chakrabarti

On Mon, Apr 15, 2024 at 01:13:05PM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 15, 2024 at 02:49:49AM -0700, Shradha Gupta wrote:
> > Add new device attributes to view multiport, msix, and adapter MTU
> > setting for MANA device.
> > 
> > Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> > ---
> >  .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
> >  include/net/mana/gdma.h                       |  9 +++
> >  2 files changed, 83 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 1332db9a08eb..6674a02cff06 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
> >  	return dev_id == MANA_PF_DEVICE_ID;
> >  }
> >  
> > +static ssize_t mana_attr_show(struct device *dev,
> > +			      struct device_attribute *attr, char *buf)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct gdma_context *gc = pci_get_drvdata(pdev);
> > +	struct mana_context *ac = gc->mana.driver_data;
> > +
> > +	if (strcmp(attr->attr.name, "mport") == 0)
> > +		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
> > +	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
> > +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
> > +	else if (strcmp(attr->attr.name, "msix") == 0)
> > +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
> > +	else
> > +		return -EINVAL;
> > +
> 
> That is not how sysfs should be implemented at all, please find a
> good example to copy from. Every attribute should use its own function
> with the macros to link it into an attributes group and sysfs_emit
> should be used for printing
> 
> Jason
Thanks Jason, I will make the appropriate changes in the next version.

Regards,
Shradha.

^ permalink raw reply	[relevance 79%]

* Re: [PATCH 5.15 00/45] 5.15.156-rc1 review
  @ 2024-04-15 23:53 79% ` Kelsey Steele
  0 siblings, 0 replies; 200+ results
From: Kelsey Steele @ 2024-04-15 23:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie

On Mon, Apr 15, 2024 at 04:21:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.156 release.
> There are 45 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
> Anything received after that time might be too late.
> 
No regressions found on WSL (x86 and arm64).

Built, booted, and reviewed dmesg.

Thank you. :)

Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> 

^ permalink raw reply	[relevance 79%]

* Re: [PATCH 6.1 00/69] 6.1.87-rc1 review
  @ 2024-04-15 23:53 79% ` Kelsey Steele
  0 siblings, 0 replies; 200+ results
From: Kelsey Steele @ 2024-04-15 23:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie

On Mon, Apr 15, 2024 at 04:20:31PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.87 release.
> There are 69 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
> Anything received after that time might be too late.
> 
No regressions found on WSL (x86 and arm64).

Built, booted, and reviewed dmesg.

Thank you. :)

Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> 

^ permalink raw reply	[relevance 79%]

* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
  @ 2024-04-15 23:52 79% ` Kelsey Steele
  0 siblings, 0 replies; 200+ results
From: Kelsey Steele @ 2024-04-15 23:52 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie

On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.28 release.
> There are 122 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
> Anything received after that time might be too late.
> 
No regressions found on WSL (x86 and arm64).

Built, booted, and reviewed dmesg.

Thank you. :)

Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> 

^ permalink raw reply	[relevance 79%]

* RE: [EXTERNAL] Re: [PATCH] PCI/sysfs: Fix race in pci sysfs creation
  @ 2024-04-15 18:15 79%     ` Saurabh Singh Sengar
  0 siblings, 0 replies; 200+ results
From: Saurabh Singh Sengar @ 2024-04-15 18:15 UTC (permalink / raw)
  To: Krzysztof Wilczyński, Bjorn Helgaas
  Cc: Saurabh Singh Sengar, bhelgaas, linux-pci, linux-kernel,
	alexander.stein, Dexuan Cui



> -----Original Message-----
> From: Krzysztof Wilczyński <kwilczynski@kernel.org>
> Sent: Wednesday, February 28, 2024 10:53 PM
> To: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Saurabh Singh Sengar <ssengar@linux.microsoft.com>;
> bhelgaas@google.com; linux-pci@vger.kernel.org;
> linux-kernel@vger.kernel.org; alexander.stein@ew.tq-group.com;
> Dexuan Cui <decui@microsoft.com>
> Subject: [EXTERNAL] Re: [PATCH] PCI/sysfs: Fix race in pci sysfs creation
> 
> Hello,
> 
> Sorry for late reply.
> 
> [...]
> > > > > > Krzysztof has done a ton of work to convert these files to
> > > > > > static attributes, where the device model prevents most of these
> > > > > > races:
> > > > > >
> > > > > >   506140f9c06b ("PCI/sysfs: Convert "index", "acpi_index", "label" to static attributes")
> > > > > >   d93f8399053d ("PCI/sysfs: Convert "vpd" to static attribute")
> > > > > >   f42c35ea3b13 ("PCI/sysfs: Convert "reset" to static attribute")
> > > > > >   527139d738d7 ("PCI/sysfs: Convert "rom" to static attribute")
> > > > > >   e1d3f3268b0e ("PCI/sysfs: Convert "config" to static attribute")
> > > > >
> > > > > and he even posted a series to do the same for the resource files:
> > > > >
> > > > >
> > > > > >   https://lore.kernel.org/linux-pci/20210910202623.2293708-1-kw@linux.com/
> > > > >
> > > > > I can't remember why we didn't apply that at the time, and it no
> > > > > longer applies cleanly, but I think that's the direction we should go.
> > > >
> > > > Thanks for you review.
> > > >
> > > > Please inform me if there's existing feedback explaining why this
> > > > series hasn't been merged yet. I am willing to further improve it
> > > > if necessary.
> > >
> > > Let us know your opinion so that we can move ahead in fixing this
> > > long pending bug.
> 
> I really thought you were asking me about your patch.  So, I didn't reply
> since Bjorn added his review.
> 
> > There's no feedback on the mailing list (I checked the link above), so
> > the way forward is to update the series so it applies cleanly again
> > and post it as a v3.
> 
> Start with a review, if you have some time.  Perhaps we can make it better
> before sending another revision.
> 
> There are two areas which this series decided not to tackle initially:
> 
>   - Support for the Alpha platform
>   - Support for legacy PCI platforms
> 
> Feel free to have a look at the above.  Perhaps you will have some ideas on
> how to best convert both of these to use static attributes, so that we could
> convert everything at the same time.
> 
> > There's no need to wait for Krzysztof to refresh it, and if you have
> > time to do it, it would be very welcomed!  The best base would be
> > v6.8-rc1.
> 
> That I can do, perhaps this coming weekend.  Or even sooner when I find
> some time this week.
> 
>         Krzysztof

Krzysztof,
Are you still planning to send the revised version of it?

^ permalink raw reply	[relevance 79%]

* Re: [PATCH v3] ACPI: CPPC: Fix access width used for PCC registers
  @ 2024-04-15 16:59 79% ` Jarred White
  0 siblings, 0 replies; 200+ results
From: Jarred White @ 2024-04-15 16:59 UTC (permalink / raw)
  To: Vanshidhar Konda, Easwar Hariharan
  Cc: Rafael J . Wysocki, linux-acpi, linux-kernel, 5 . 15+

On 4/11/2024 4:18 PM, Vanshidhar Konda wrote:
> commit 2f4a4d63a193be6fd530d180bb13c3592052904c modified
> cpc_read/cpc_write to use access_width to read CPC registers. For PCC
> registers the access width field in the ACPI register macro specifies
> the PCC subspace id. For non-zero PCC subspace id the access width is
> incorrectly treated as access width. This causes errors when reading
> from PCC registers in the CPPC driver.
> 
> For PCC registers base the size of read/write on the bit width field.
> The debug message in cpc_read/cpc_write is updated to print relevant
> information for the address space type used to read the register.
> 
> Signed-off-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
> Tested-by: Jarred White <jarredwhite@linux.microsoft.com>
> Reviewed-by: Jarred White <jarredwhite@linux.microsoft.com>
> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
> Cc: 5.15+ <stable@vger.kernel.org> # 5.15+

Hi Vanshi,

v3 changes are good. Thanks again for catching this!


Thanks,
Jarred

^ permalink raw reply	[relevance 79%]

* Re: [PATCH net-next] net: mana: Add new device attributes for mana
  2024-04-15  9:49 63% [PATCH net-next] net: mana: Add new device attributes for mana Shradha Gupta
  @ 2024-04-15 16:38 79% ` Saurabh Singh Sengar
  2024-04-16  4:26 79%   ` Shradha Gupta
  2024-04-18 21:29 79% ` Haiyang Zhang
  2 siblings, 1 reply; 200+ results
From: Saurabh Singh Sengar @ 2024-04-15 16:38 UTC (permalink / raw)
  To: Shradha Gupta
  Cc: linux-kernel, linux-hyperv, linux-rdma, netdev, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Ajay Sharma, Leon Romanovsky,
	Thomas Gleixner, Sebastian Andrzej Siewior, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Michael Kelley,
	Shradha Gupta, Yury Norov, Konstantin Taranov,
	Souradeep Chakrabarti

On Mon, Apr 15, 2024 at 02:49:49AM -0700, Shradha Gupta wrote:
> Add new device attributes to view multiport, msix, and adapter MTU
> setting for MANA device.
> 
> Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
> ---
>  .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
>  include/net/mana/gdma.h                       |  9 +++
>  2 files changed, 83 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 1332db9a08eb..6674a02cff06 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
>  	return dev_id == MANA_PF_DEVICE_ID;
>  }
>  
> +static ssize_t mana_attr_show(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct gdma_context *gc = pci_get_drvdata(pdev);
> +	struct mana_context *ac = gc->mana.driver_data;
> +
> +	if (strcmp(attr->attr.name, "mport") == 0)
> +		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
> +	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
> +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
> +	else if (strcmp(attr->attr.name, "msix") == 0)
> +		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
> +	else
> +		return -EINVAL;
> +}
> +
> +static int mana_gd_setup_sysfs(struct pci_dev *pdev)
> +{
> +	struct gdma_context *gc = pci_get_drvdata(pdev);
> +	int retval = 0;
> +
> +	gc->mana_attributes.mana_mport_attr.attr.name = "mport";
> +	gc->mana_attributes.mana_mport_attr.attr.mode = 0444;
> +	gc->mana_attributes.mana_mport_attr.show = mana_attr_show;
> +	sysfs_attr_init(&gc->mana_attributes.mana_mport_attr);
> +	retval = device_create_file(&pdev->dev,
> +				    &gc->mana_attributes.mana_mport_attr);

If you can use .dev_groups, sysfs creation and removal will be much
simpler for the driver.

- Saurabh

^ permalink raw reply	[relevance 79%]

* [PATCH net-next] net: mana: Add new device attributes for mana
@ 2024-04-15  9:49 63% Shradha Gupta
                     ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Shradha Gupta @ 2024-04-15  9:49 UTC (permalink / raw)
  To: linux-kernel, linux-hyperv, linux-rdma, netdev
  Cc: Shradha Gupta, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Ajay Sharma, Leon Romanovsky, Thomas Gleixner,
	Sebastian Andrzej Siewior, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Long Li, Michael Kelley, Shradha Gupta,
	Yury Norov, Konstantin Taranov, Souradeep Chakrabarti

Add new device attributes to view multiport, msix, and adapter MTU
setting for MANA device.

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
---
 .../net/ethernet/microsoft/mana/gdma_main.c   | 74 +++++++++++++++++++
 include/net/mana/gdma.h                       |  9 +++
 2 files changed, 83 insertions(+)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 1332db9a08eb..6674a02cff06 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1471,6 +1471,65 @@ static bool mana_is_pf(unsigned short dev_id)
 	return dev_id == MANA_PF_DEVICE_ID;
 }
 
+static ssize_t mana_attr_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct gdma_context *gc = pci_get_drvdata(pdev);
+	struct mana_context *ac = gc->mana.driver_data;
+
+	if (strcmp(attr->attr.name, "mport") == 0)
+		return snprintf(buf, PAGE_SIZE, "%d\n", ac->num_ports);
+	else if (strcmp(attr->attr.name, "adapter_mtu") == 0)
+		return snprintf(buf, PAGE_SIZE, "%d\n", gc->adapter_mtu);
+	else if (strcmp(attr->attr.name, "msix") == 0)
+		return snprintf(buf, PAGE_SIZE, "%d\n", gc->max_num_msix);
+	else
+		return -EINVAL;
+}
+
+static int mana_gd_setup_sysfs(struct pci_dev *pdev)
+{
+	struct gdma_context *gc = pci_get_drvdata(pdev);
+	int retval = 0;
+
+	gc->mana_attributes.mana_mport_attr.attr.name = "mport";
+	gc->mana_attributes.mana_mport_attr.attr.mode = 0444;
+	gc->mana_attributes.mana_mport_attr.show = mana_attr_show;
+	sysfs_attr_init(&gc->mana_attributes.mana_mport_attr);
+	retval = device_create_file(&pdev->dev,
+				    &gc->mana_attributes.mana_mport_attr);
+	if (retval < 0)
+		return retval;
+
+	gc->mana_attributes.mana_adapter_mtu_attr.attr.name = "adapter_mtu";
+	gc->mana_attributes.mana_adapter_mtu_attr.attr.mode = 0444;
+	gc->mana_attributes.mana_adapter_mtu_attr.show = mana_attr_show;
+	sysfs_attr_init(&gc->mana_attributes.mana_adapter_mtu_attr);
+	retval = device_create_file(&pdev->dev,
+				    &gc->mana_attributes.mana_adapter_mtu_attr);
+	if (retval < 0)
+		goto mtu_attr_error;
+
+	gc->mana_attributes.mana_msix_attr.attr.name = "msix";
+	gc->mana_attributes.mana_msix_attr.attr.mode = 0444;
+	gc->mana_attributes.mana_msix_attr.show = mana_attr_show;
+	sysfs_attr_init(&gc->mana_attributes.mana_msix_attr);
+	retval = device_create_file(&pdev->dev,
+				    &gc->mana_attributes.mana_msix_attr);
+	if (retval < 0)
+		goto msix_attr_error;
+
+	return retval;
+msix_attr_error:
+	device_remove_file(&pdev->dev,
+			   &gc->mana_attributes.mana_adapter_mtu_attr);
+mtu_attr_error:
+	device_remove_file(&pdev->dev,
+			   &gc->mana_attributes.mana_mport_attr);
+	return retval;
+}
+
 static int mana_gd_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct gdma_context *gc;
@@ -1519,6 +1578,10 @@ static int mana_gd_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	gc->bar0_va = bar0_va;
 	gc->dev = &pdev->dev;
 
+	err = mana_gd_setup_sysfs(pdev);
+	if (err < 0)
+		goto free_gc;
+
 	err = mana_gd_setup(pdev);
 	if (err)
 		goto unmap_bar;
@@ -1544,6 +1607,15 @@ static int mana_gd_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return err;
 }
 
+static void mana_cleanup_sysfs_files(struct pci_dev *pdev,
+				     struct gdma_context *gc)
+{
+	device_remove_file(&pdev->dev, &gc->mana_attributes.mana_msix_attr);
+	device_remove_file(&pdev->dev,
+			   &gc->mana_attributes.mana_adapter_mtu_attr);
+	device_remove_file(&pdev->dev, &gc->mana_attributes.mana_mport_attr);
+}
+
 static void mana_gd_remove(struct pci_dev *pdev)
 {
 	struct gdma_context *gc = pci_get_drvdata(pdev);
@@ -1552,6 +1624,8 @@ static void mana_gd_remove(struct pci_dev *pdev)
 
 	mana_gd_cleanup(pdev);
 
+	mana_cleanup_sysfs_files(pdev, gc);
+
 	pci_iounmap(pdev, gc->bar0_va);
 
 	vfree(gc);
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 27684135bb4d..ea636959164c 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -354,6 +354,12 @@ struct gdma_irq_context {
 	char name[MANA_IRQ_NAME_SZ];
 };
 
+struct mana_device_attributes {
+	struct device_attribute mana_mport_attr;
+	struct device_attribute mana_adapter_mtu_attr;
+	struct device_attribute mana_msix_attr;
+};
+
 struct gdma_context {
 	struct device		*dev;
 
@@ -395,6 +401,9 @@ struct gdma_context {
 
 	/* Azure RDMA adapter */
 	struct gdma_dev		mana_ib;
+
+	/* device attributes */
+	struct mana_device_attributes mana_attributes;
 };
 
 #define MAX_NUM_GDMA_DEVICES	4
-- 
2.34.1


^ permalink raw reply related	[relevance 63%]

* Re: [PATCH] x86/Kconfig: Allow NR_CPUS between 512 and 8192
  @ 2024-04-14 16:31 79%   ` Saurabh Singh Sengar
  0 siblings, 0 replies; 200+ results
From: Saurabh Singh Sengar @ 2024-04-14 16:31 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, x86, hpa, linux-kernel, sgeorgejohn
  Cc: ssengar, libo.chen, mhklinux

On Mon, Mar 04, 2024 at 08:13:23AM -0800, Saurabh Singh Sengar wrote:
> On Tue, Feb 20, 2024 at 01:50:13AM -0800, Saurabh Sengar wrote:
> > Today there is no way to choose a value between 512 and 8192 for
> > NR_CPUS directly. NR_CPUS is bounded by NR_CPUS_RANGE_END, which in turn
> > depends on CPUMASK_OFFSTACK to allow NR_CPUS > 512.
> > 
> > For x86, CPUMASK_OFFSTACK can only be enabled by selecting either MAXSMP
> > or DEBUG_PER_CPU_MAPS. Both of these options have a cost. MAXSMP
> > raises NR_CPUS to 8192, which increases the kernel image
> > size, whereas DEBUG_PER_CPU_MAPS adds run-time overhead.
> > Thus there is no good way to set NR_CPUS to anything between 512 and 8192.
> > 
> > Fix this by selecting CPUMASK_OFFSTACK if NR_CPUS > 512 and
> > let NR_CPUS_RANGE_END set to 8192.
> > 
> > On a Hyper-V system where the maximum number of CPUs is only 2048, this
> > patch saves around 1 MB of kernel image size compared to MAXSMP.
> > 
> > Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> > ---
> > 
> > I want to mention that on ARM and other archs it's very simple
> > to select any value for NR_CPUS. This is an attempt to have more
> > flexibility in the x86 arch as well when choosing NR_CPUS.
> > 
> > Some of the earlier discussions related to it which could be of interest:
> > https://lore.kernel.org/lkml/1708092603-14504-1-git-send-email-ssengar@linux.microsoft.com/
> > https://lore.kernel.org/lkml/794a1211-630b-3ee5-55a3-c06f10df1490@linux.com/
> > 
> > Another approach I can think of is to allow CPUMASK_OFFSTACK to be enabled
> > more freely, as in the patch below from Libo Chen, which would also solve the
> > problem I am addressing. But I feel that patch may have an impact on other
> > archs as well, and I am not sure that is in the best interest of all the archs.
> > 
> > https://lore.kernel.org/lkml/20220412231508.32629-2-libo.chen@oracle.com/
> > 
> >  arch/x86/Kconfig | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 07a0c8d4e9c7..458f3f250d7f 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -34,6 +34,7 @@ config X86_64
> >  	select SWIOTLB
> >  	select ARCH_HAS_ELFCORE_COMPAT
> >  	select ZONE_DMA32
> > +	select CPUMASK_OFFSTACK if NR_CPUS > 512
> >  
> >  config FORCE_DYNAMIC_FTRACE
> >  	def_bool y
> > @@ -1006,8 +1007,7 @@ config NR_CPUS_RANGE_END
> >  config NR_CPUS_RANGE_END
> >  	int
> >  	depends on X86_64
> > -	default 8192 if  SMP && CPUMASK_OFFSTACK
> > -	default  512 if  SMP && !CPUMASK_OFFSTACK
> > +	default 8192 if  SMP
> >  	default    1 if !SMP
> >  
> >  config NR_CPUS_DEFAULT
> > 
> > -- 
> > 2.34.1
> > 
> 
> x86 Maintainers,
> 
> A kind reminder: could you please give your feedback on this patch?
> 
> - Saurabh
> 
ping
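
For context on why NR_CPUS affects image size: a cpumask is a bitmap with one
bit per possible CPU, so anything sized statically by NR_CPUS grows with it.
A minimal sketch of that arithmetic in userspace C follows. `cpumask_bytes` is
a hypothetical helper, not a kernel function; it only mirrors the
round-up-to-whole-longs sizing of a kernel bitmap.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): bytes needed for a bitmap
 * holding one bit per possible CPU, rounded up to whole unsigned longs.
 */
static size_t cpumask_bytes(unsigned int nr_cpus)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs;

	assert(nr_cpus > 0);

	/* Round up so a partial long still gets a full word of storage. */
	longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

Under these assumptions one mask costs 64 bytes at NR_CPUS=512, 256 bytes at
2048 and 1024 bytes at 8192. The 1 MB figure quoted above is the author's
measurement and is not derived here.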

^ permalink raw reply	[relevance 79%]

* [PATCH v17 21/21] MAINTAINERS: ipe: add ipe maintainer information
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (19 preceding siblings ...)
  2024-04-13  0:56 12% ` [PATCH v17 20/21] Documentation: add ipe documentation Fan Wu
@ 2024-04-13  0:56 77% ` Fan Wu
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:56 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

Update MAINTAINERS to include ipe maintainer information.

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

--
v1-v16:
  + Not present

v17:
  + Introduced
---
 MAINTAINERS | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b5b89687680b..93eb4e12a789 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10745,6 +10745,16 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git
 F:	security/integrity/
 F:	security/integrity/ima/
 
+INTEGRITY POLICY ENFORCEMENT (IPE)
+M:	Fan Wu <wufan@linux.microsoft.com>
+L:	linux-security-module@vger.kernel.org
+S:	Supported
+T:	git https://github.com/microsoft/ipe.git
+F:	Documentation/admin-guide/LSM/ipe.rst
+F:	Documentation/security/ipe.rst
+F:	scripts/ipe/
+F:	security/ipe/
+
 INTEL 810/815 FRAMEBUFFER DRIVER
 M:	Antonino Daplas <adaplas@gmail.com>
 L:	linux-fbdev@vger.kernel.org
-- 
2.44.0


^ permalink raw reply related	[relevance 77%]

* [PATCH v17 19/21] ipe: kunit test for parser
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (17 preceding siblings ...)
  2024-04-13  0:56 47% ` [PATCH v17 18/21] scripts: add boot policy generation program Fan Wu
@ 2024-04-13  0:56 48% ` Fan Wu
  2024-04-13  0:56 12% ` [PATCH v17 20/21] Documentation: add ipe documentation Fan Wu
  2024-04-13  0:56 77% ` [PATCH v17 21/21] MAINTAINERS: ipe: add ipe maintainer information Fan Wu
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:56 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Add various happy/unhappy unit tests for IPE's policy parser.

In addition, a test suite for IPE functionality is available at
https://github.com/microsoft/ipe/tree/test-suite

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v6:
  + Not present

v7:
  + Introduced

v8:
  + Remove the kunit tests with respect to the fsverity digest, as these
    require significant changes to work with the new method of acquiring
    the digest at runtime.

v9:
  + Remove the kunit tests related to ipe_context

v10:
  + No changes

v11:
  + No changes

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
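
A note on the embedded-NUL cases in the test table below: the parameterised
test passes strlen(p->policy) as the policy length, so a policy string is cut
at its first NUL byte before the parser sees it. The userspace sketch below
shows only that C behaviour; `parsed_len` is a hypothetical name and no IPE
code is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same text as the "NULL truncates policy" case in the test table. */
static const char truncated_policy[] =
	"policy_name=test policy_version=0.0.0\n"
	"DEFAULT action=DENY\n\0"
	"op=EXECUTE dmverity_signature=TRUE action=ALLOW\n";

/*
 * Hypothetical helper: the length a strlen()-based caller hands to the
 * parser. Everything after the first NUL byte is invisible to it.
 */
static size_t parsed_len(const char *policy)
{
	assert(policy);
	return strlen(policy);
}
```

Here parsed_len(truncated_policy) covers only the header and the DEFAULT
line, while sizeof(truncated_policy) still counts the trailing rule. That is
consistent with the table expecting 0 for that case and -EBADMSG when the NUL
comes before the DEFAULT line.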
---
 security/ipe/Kconfig        |  17 +++
 security/ipe/Makefile       |   3 +
 security/ipe/policy_tests.c | 296 ++++++++++++++++++++++++++++++++++++
 3 files changed, 316 insertions(+)
 create mode 100644 security/ipe/policy_tests.c

diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index ab7e7b9235bc..fb769ce12580 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -56,4 +56,21 @@ config IPE_PROP_FS_VERITY
 
 endmenu
 
+config SECURITY_IPE_KUNIT_TEST
+	bool "Build KUnit tests for IPE" if !KUNIT_ALL_TESTS
+	depends on KUNIT=y
+	default KUNIT_ALL_TESTS
+	help
+	  This builds the IPE KUnit tests.
+
+	  KUnit tests run during boot and output the results to the debug log
+	  in TAP format (https://testanything.org/). Only useful for kernel devs
+	  running KUnit test harness and are not for inclusion into a
+	  production build.
+
+	  For more information on KUnit and unit tests in general please refer
+	  to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+	  If unsure, say N.
+
 endif
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 84ad76556170..5125b8357e2f 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -26,3 +26,6 @@ obj-$(CONFIG_SECURITY_IPE) += \
 	audit.o \
 
 clean-files := boot-policy.c \
+
+obj-$(CONFIG_SECURITY_IPE_KUNIT_TEST) += \
+	policy_tests.o \
diff --git a/security/ipe/policy_tests.c b/security/ipe/policy_tests.c
new file mode 100644
index 000000000000..89521f6b9994
--- /dev/null
+++ b/security/ipe/policy_tests.c
@@ -0,0 +1,296 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <kunit/test.h>
+#include "policy.h"
+struct policy_case {
+	const char *const policy;
+	int errno;
+	const char *const desc;
+};
+
+static const struct policy_case policy_cases[] = {
+	{
+		"policy_name=allowall policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW",
+		0,
+		"basic",
+	},
+	{
+		"policy_name=trailing_comment policy_version=152.0.0 #This is comment\n"
+		"DEFAULT action=ALLOW",
+		0,
+		"trailing comment",
+	},
+	{
+		"policy_name=allowallnewline policy_version=0.2.0\n"
+		"DEFAULT action=ALLOW\n"
+		"\n",
+		0,
+		"trailing newline",
+	},
+	{
+		"policy_name=carriagereturnlinefeed policy_version=0.0.1\n"
+		"DEFAULT action=ALLOW\n"
+		"\r\n",
+		0,
+		"clrf newline",
+	},
+	{
+		"policy_name=whitespace policy_version=0.0.0\n"
+		"DEFAULT\taction=ALLOW\n"
+		"     \t     DEFAULT \t    op=EXECUTE      action=DENY\n"
+		"op=EXECUTE boot_verified=TRUE action=ALLOW\n"
+		"# this is a\tcomment\t\t\t\t\n"
+		"DEFAULT \t op=KMODULE\t\t\t  action=DENY\r\n"
+		"op=KMODULE boot_verified=TRUE action=ALLOW\n",
+		0,
+		"various whitespaces and nested default",
+	},
+	{
+		"policy_name=boot_verified policy_version=-1236.0.0\n"
+		"DEFAULT\taction=ALLOW\n",
+		-EINVAL,
+		"negative version",
+	},
+	{
+		"policy_name=$@!*&^%%\\:;{}() policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW",
+		0,
+		"special characters",
+	},
+	{
+		"policy_name=test policy_version=999999.0.0\n"
+		"DEFAULT action=ALLOW",
+		-ERANGE,
+		"overflow version",
+	},
+	{
+		"policy_name=test policy_version=255.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"incomplete version",
+	},
+	{
+		"policy_name=test policy_version=111.0.0.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"extra version",
+	},
+	{
+		"",
+		-EBADMSG,
+		"0-length policy",
+	},
+	{
+		"policy_name=test\0policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"random null in header",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"\0DEFAULT action=ALLOW",
+		-EBADMSG,
+		"incomplete policy from NULL",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=DENY\n\0"
+		"op=EXECUTE dmverity_signature=TRUE action=ALLOW\n",
+		0,
+		"NULL truncates policy",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_signature=abc action=ALLOW",
+		-EBADMSG,
+		"invalid property type",
+	},
+	{
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"missing policy header",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n",
+		-EBADMSG,
+		"missing default definition",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"dmverity_signature=TRUE op=EXECUTE action=ALLOW",
+		-EBADMSG,
+		"invalid rule ordering"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"action=ALLOW op=EXECUTE dmverity_signature=TRUE",
+		-EBADMSG,
+		"invalid rule ordering (2)",
+	},
+	{
+		"policy_name=test policy_version=0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_signature=TRUE action=ALLOW",
+		-EBADMSG,
+		"invalid version",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=UNKNOWN dmverity_signature=TRUE action=ALLOW",
+		-EBADMSG,
+		"unknown operation",
+	},
+	{
+		"policy_name=asdvpolicy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n",
+		-EBADMSG,
+		"missing space after policy name",
+	},
+	{
+		"policy_name=test\xFF\xEF policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_signature=TRUE action=ALLOW",
+		0,
+		"expanded ascii",
+	},
+	{
+		"policy_name=test\xFF\xEF policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_roothash=GOOD_DOG action=ALLOW",
+		-EBADMSG,
+		"invalid property value (2)",
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"policy_name=test policy_version=0.1.0\n"
+		"DEFAULT action=ALLOW",
+		-EBADMSG,
+		"double header"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT action=ALLOW\n",
+		-EBADMSG,
+		"double default"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT op=EXECUTE action=DENY\n"
+		"DEFAULT op=EXECUTE action=ALLOW\n",
+		-EBADMSG,
+		"double operation default"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT op=EXECUTE action=DEN\n",
+		-EBADMSG,
+		"invalid action value"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"DEFAULT op=EXECUTE action\n",
+		-EBADMSG,
+		"invalid action value (2)"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"UNKNOWN value=true\n",
+		-EBADMSG,
+		"unrecognized statement"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE dmverity_roothash=1c0d7ee1f8343b7fbe418378e8eb22c061d7dec7 action=DENY\n",
+		-EBADMSG,
+		"old-style digest"
+	},
+	{
+		"policy_name=test policy_version=0.0.0\n"
+		"DEFAULT action=ALLOW\n"
+		"op=EXECUTE fsverity_digest=1c0d7ee1f8343b7fbe418378e8eb22c061d7dec7 action=DENY\n",
+		-EBADMSG,
+		"old-style digest"
+	}
+};
+
+static void pol_to_desc(const struct policy_case *c, char *desc)
+{
+	strscpy(desc, c->desc, KUNIT_PARAM_DESC_SIZE);
+}
+
+KUNIT_ARRAY_PARAM(ipe_policies, policy_cases, pol_to_desc);
+
+/**
+ * ipe_parser_unsigned_test - Test the parser by passing unsigned policies.
+ * @test: Supplies a pointer to a kunit structure.
+ *
+ * This is called by the kunit harness. This test does not check the correctness
+ * of the policy, but ensures that errors are handled correctly.
+ */
+static void ipe_parser_unsigned_test(struct kunit *test)
+{
+	const struct policy_case *p = test->param_value;
+	struct ipe_policy *pol;
+
+	pol = ipe_new_policy(p->policy, strlen(p->policy), NULL, 0);
+
+	if (p->errno) {
+		KUNIT_EXPECT_EQ(test, PTR_ERR(pol), p->errno);
+		return;
+	}
+
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pol);
+	KUNIT_EXPECT_NOT_ERR_OR_NULL(test, pol->parsed);
+	KUNIT_EXPECT_STREQ(test, pol->text, p->policy);
+	KUNIT_EXPECT_PTR_EQ(test, NULL, pol->pkcs7);
+	KUNIT_EXPECT_EQ(test, 0, pol->pkcs7len);
+
+	ipe_free_policy(pol);
+}
+
+/**
+ * ipe_parser_widestring_test - Ensure the parser fails on a wide string policy.
+ * @test: Supplies a pointer to a kunit structure.
+ *
+ * This is called by the kunit harness.
+ */
+static void ipe_parser_widestring_test(struct kunit *test)
+{
+	const unsigned short policy[] = L"policy_name=Test policy_version=0.0.0\n"
+					L"DEFAULT action=ALLOW";
+	struct ipe_policy *pol = NULL;
+
+	pol = ipe_new_policy((const char *)policy, (ARRAY_SIZE(policy) - 1) * 2, NULL, 0);
+	KUNIT_EXPECT_TRUE(test, IS_ERR_OR_NULL(pol));
+
+	ipe_free_policy(pol);
+}
+
+static struct kunit_case ipe_parser_test_cases[] = {
+	KUNIT_CASE_PARAM(ipe_parser_unsigned_test, ipe_policies_gen_params),
+	KUNIT_CASE(ipe_parser_widestring_test),
+};
+
+static struct kunit_suite ipe_parser_test_suite = {
+	.name = "ipe-parser",
+	.test_cases = ipe_parser_test_cases,
+};
+
+kunit_test_suite(ipe_parser_test_suite);
-- 
2.44.0


^ permalink raw reply related	[relevance 48%]

* [PATCH v17 18/21] scripts: add boot policy generation program
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (16 preceding siblings ...)
  2024-04-13  0:56 39% ` [PATCH v17 17/21] ipe: enable support for fs-verity as a trust provider Fan Wu
@ 2024-04-13  0:56 47% ` Fan Wu
  2024-04-13  0:56 48% ` [PATCH v17 19/21] ipe: kunit test for parser Fan Wu
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:56 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Enable an IPE policy to be enforced from kernel start, so that access
control based on trust applies from kernel startup. This is accomplished by
transforming the IPE policy file named by CONFIG_IPE_BOOT_POLICY into a
C string literal that is parsed at kernel startup as an unsigned policy.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No Changes

v3:
  + No Changes

v4:
  + No Changes

v5:
  + No Changes

v6:
  + No Changes

v7:
  + Move from 01/11 to 14/16
  + Don't return errno directly.
  + Make output of script more user-friendly
  + Add escaping for tab and '?'
  + Mark argv pointer const
  + Invert return code check in the boot policy parsing code path.

v8:
  + No significant changes.

v9:
  + No changes

v10:
  + Update the init part code for rcu changes in the eval loop patch

v11:
  + Fix code style issues

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + Fix one grammar issue in Kconfig

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
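
The policy-to-string-literal transformation described above can be sketched
in isolation. This is a simplified userspace illustration, not the polgen
source: `append` and `escape_policy` are hypothetical helpers that write into
a caller-supplied buffer instead of a file, and embedded NUL bytes are
dropped here for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to dst (capacity cap, current length *len); -1 if it will not fit. */
static int append(char *dst, size_t cap, size_t *len, const char *src)
{
	size_t n = strlen(src);

	if (*len + n + 1 > cap)
		return -1;
	memcpy(dst + *len, src, n + 1);	/* copies the terminating NUL too */
	*len += n;
	return 0;
}

/*
 * Hypothetical helper: escape the n bytes at in so they can sit between
 * the quotes of a C string literal. A policy newline closes the current
 * literal and opens a new one on the next source line, so the generated
 * file stays readable. Returns 0 on success, -1 if out is too small.
 */
static int escape_policy(const char *in, size_t n, char *out, size_t cap)
{
	size_t len = 0, i;
	int rc = 0;

	assert(in && out);
	if (cap == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < n && rc == 0; ++i) {
		char one[2] = { in[i], '\0' };

		switch (in[i]) {
		case '"':
			rc = append(out, cap, &len, "\\\"");
			break;
		case '\\':
			rc = append(out, cap, &len, "\\\\");
			break;
		case '\t':
			rc = append(out, cap, &len, "\\t");
			break;
		case '?':	/* escaped so no trigraph can form */
			rc = append(out, cap, &len, "\\?");
			break;
		case '\n':
			rc = append(out, cap, &len, "\\n\"\n\t\"");
			break;
		default:
			rc = append(out, cap, &len, one);
			break;
		}
	}
	return rc;
}
```

A caller would wrap the result in an opening `"` and a closing `";` to form
the initializer of the generated variable, as the real tool does when it
writes boot-policy.c.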
---
 scripts/Makefile              |   1 +
 scripts/ipe/Makefile          |   2 +
 scripts/ipe/polgen/.gitignore |   2 +
 scripts/ipe/polgen/Makefile   |   5 ++
 scripts/ipe/polgen/polgen.c   | 145 ++++++++++++++++++++++++++++++++++
 security/ipe/.gitignore       |   2 +
 security/ipe/Kconfig          |  10 +++
 security/ipe/Makefile         |  11 +++
 security/ipe/fs.c             |   8 ++
 security/ipe/ipe.c            |  12 +++
 10 files changed, 198 insertions(+)
 create mode 100644 scripts/ipe/Makefile
 create mode 100644 scripts/ipe/polgen/.gitignore
 create mode 100644 scripts/ipe/polgen/Makefile
 create mode 100644 scripts/ipe/polgen/polgen.c
 create mode 100644 security/ipe/.gitignore

diff --git a/scripts/Makefile b/scripts/Makefile
index bc90520a5426..cae8a14fa40d 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -55,6 +55,7 @@ targets += module.lds
 subdir-$(CONFIG_GCC_PLUGINS) += gcc-plugins
 subdir-$(CONFIG_MODVERSIONS) += genksyms
 subdir-$(CONFIG_SECURITY_SELINUX) += selinux
+subdir-$(CONFIG_SECURITY_IPE) += ipe
 
 # Let clean descend into subdirs
 subdir-	+= basic dtc gdb kconfig mod
diff --git a/scripts/ipe/Makefile b/scripts/ipe/Makefile
new file mode 100644
index 000000000000..e87553fbb8d6
--- /dev/null
+++ b/scripts/ipe/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+subdir-y := polgen
diff --git a/scripts/ipe/polgen/.gitignore b/scripts/ipe/polgen/.gitignore
new file mode 100644
index 000000000000..b6f05cf3dc0e
--- /dev/null
+++ b/scripts/ipe/polgen/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+polgen
diff --git a/scripts/ipe/polgen/Makefile b/scripts/ipe/polgen/Makefile
new file mode 100644
index 000000000000..c20456a2f2e9
--- /dev/null
+++ b/scripts/ipe/polgen/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+hostprogs-always-y	:= polgen
+HOST_EXTRACFLAGS += \
+	-I$(srctree)/include \
+	-I$(srctree)/include/uapi \
diff --git a/scripts/ipe/polgen/polgen.c b/scripts/ipe/polgen/polgen.c
new file mode 100644
index 000000000000..c6283b3ff006
--- /dev/null
+++ b/scripts/ipe/polgen/polgen.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <errno.h>
+
+static void usage(const char *const name)
+{
+	printf("Usage: %s OutputFile (PolicyFile)\n", name);
+	exit(EINVAL);
+}
+
+static int policy_to_buffer(const char *pathname, char **buffer, size_t *size)
+{
+	size_t fsize;
+	size_t read;
+	char *lbuf;
+	int rc = 0;
+	FILE *fd;
+
+	fd = fopen(pathname, "r");
+	if (!fd) {
+		rc = errno;
+		goto out;
+	}
+
+	fseek(fd, 0, SEEK_END);
+	fsize = ftell(fd);
+	rewind(fd);
+
+	lbuf = malloc(fsize);
+	if (!lbuf) {
+		rc = ENOMEM;
+		goto out_close;
+	}
+
+	read = fread((void *)lbuf, sizeof(*lbuf), fsize, fd);
+	if (read != fsize) {
+		rc = -1;
+		goto out_free;
+	}
+
+	*buffer = lbuf;
+	*size = fsize;
+	fclose(fd);
+
+	return rc;
+
+out_free:
+	free(lbuf);
+out_close:
+	fclose(fd);
+out:
+	return rc;
+}
+
+static int write_boot_policy(const char *pathname, const char *buf, size_t size)
+{
+	int rc = 0;
+	FILE *fd;
+	size_t i;
+
+	fd = fopen(pathname, "w");
+	if (!fd) {
+		rc = errno;
+		goto err;
+	}
+
+	fprintf(fd, "/* This file is automatically generated.");
+	fprintf(fd, " Do not edit. */\n");
+	fprintf(fd, "#include <linux/stddef.h>\n");
+	fprintf(fd, "\nextern const char *const ipe_boot_policy;\n\n");
+	fprintf(fd, "const char *const ipe_boot_policy =\n");
+
+	if (!buf || size == 0) {
+		fprintf(fd, "\tNULL;\n");
+		fclose(fd);
+		return 0;
+	}
+
+	fprintf(fd, "\t\"");
+
+	for (i = 0; i < size; ++i) {
+		switch (buf[i]) {
+		case '"':
+			fprintf(fd, "\\\"");
+			break;
+		case '\'':
+			fprintf(fd, "'");
+			break;
+		case '\n':
+			fprintf(fd, "\\n\"\n\t\"");
+			break;
+		case '\\':
+			fprintf(fd, "\\\\");
+			break;
+		case '\t':
+			fprintf(fd, "\\t");
+			break;
+		case '\?':
+			fprintf(fd, "\\?");
+			break;
+		default:
+			fprintf(fd, "%c", buf[i]);
+		}
+	}
+	fprintf(fd, "\";\n");
+	fclose(fd);
+
+	return 0;
+
+err:
+	if (fd)
+		fclose(fd);
+	return rc;
+}
+
+int main(int argc, const char *const argv[])
+{
+	char *policy = NULL;
+	size_t len = 0;
+	int rc = 0;
+
+	if (argc < 2)
+		usage(argv[0]);
+
+	if (argc > 2) {
+		rc = policy_to_buffer(argv[2], &policy, &len);
+		if (rc != 0)
+			goto cleanup;
+	}
+
+	rc = write_boot_policy(argv[1], policy, len);
+cleanup:
+	if (policy)
+		free(policy);
+	if (rc != 0)
+		perror("An error occurred during policy conversion: ");
+	return rc;
+}
diff --git a/security/ipe/.gitignore b/security/ipe/.gitignore
new file mode 100644
index 000000000000..07313d3fd74a
--- /dev/null
+++ b/security/ipe/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+boot-policy.c
diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index 20fe88deb756..ab7e7b9235bc 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -18,6 +18,16 @@ menuconfig SECURITY_IPE
 	  If unsure, answer N.
 
 if SECURITY_IPE
+config IPE_BOOT_POLICY
+	string "Integrity policy to apply on system startup"
+	help
+	  This option specifies a filepath to an IPE policy that is compiled
+	  into the kernel. This policy will be enforced until a policy update
+	  is deployed via the $securityfs/ipe/policies/$policy_name/active
+	  interface.
+
+	  If unsure, leave blank.
+
 menu "IPE Trust Providers"
 
 config IPE_PROP_DM_VERITY
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index e1019bb9f0f3..84ad76556170 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -5,7 +5,16 @@
 # Makefile for building the IPE module as part of the kernel tree.
 #
 
+quiet_cmd_polgen = IPE_POL $(2)
+      cmd_polgen = scripts/ipe/polgen/polgen security/ipe/boot-policy.c $(2)
+
+targets += boot-policy.c
+
+$(obj)/boot-policy.c: scripts/ipe/polgen/polgen $(CONFIG_IPE_BOOT_POLICY) FORCE
+	$(call if_changed,polgen,$(CONFIG_IPE_BOOT_POLICY))
+
 obj-$(CONFIG_SECURITY_IPE) += \
+	boot-policy.o \
 	digest.o \
 	eval.o \
 	hooks.o \
@@ -15,3 +24,5 @@ obj-$(CONFIG_SECURITY_IPE) += \
 	policy_fs.o \
 	policy_parser.o \
 	audit.o \
+
+clean-files := boot-policy.c \
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
index b52fb6023904..5b6d19fb844a 100644
--- a/security/ipe/fs.c
+++ b/security/ipe/fs.c
@@ -190,6 +190,7 @@ static const struct file_operations enforce_fops = {
 static int __init ipe_init_securityfs(void)
 {
 	int rc = 0;
+	struct ipe_policy *ap;
 
 	if (!ipe_enabled)
 		return -EOPNOTSUPP;
@@ -220,6 +221,13 @@ static int __init ipe_init_securityfs(void)
 		goto err;
 	}
 
+	ap = rcu_access_pointer(ipe_active_policy);
+	if (ap) {
+		rc = ipe_new_policyfs_node(ap);
+		if (rc)
+			goto err;
+	}
+
 	np = securityfs_create_file("new_policy", 0200, root, NULL, &np_fops);
 	if (IS_ERR(np)) {
 		rc = PTR_ERR(np);
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 3896a8da4213..bfd71ba47f45 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -9,6 +9,7 @@
 #include "hooks.h"
 #include "eval.h"
 
+extern const char *const ipe_boot_policy;
 bool ipe_enabled;
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
@@ -74,9 +75,20 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
  */
 static int __init ipe_init(void)
 {
+	struct ipe_policy *p = NULL;
+
 	security_add_hooks(ipe_hooks, ARRAY_SIZE(ipe_hooks), &ipe_lsmid);
 	ipe_enabled = true;
 
+	if (ipe_boot_policy) {
+		p = ipe_new_policy(ipe_boot_policy, strlen(ipe_boot_policy),
+				   NULL, 0);
+		if (IS_ERR(p))
+			return PTR_ERR(p);
+
+		rcu_assign_pointer(ipe_active_policy, p);
+	}
+
 	return 0;
 }
 
-- 
2.44.0


^ permalink raw reply related	[relevance 47%]

* [PATCH v17 17/21] ipe: enable support for fs-verity as a trust provider
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (15 preceding siblings ...)
  2024-04-13  0:55 45% ` [PATCH v17 16/21] fsverity: expose verified fsverity built-in signatures to LSMs Fan Wu
@ 2024-04-13  0:56 39% ` Fan Wu
  2024-04-13  0:56 47% ` [PATCH v17 18/21] scripts: add boot policy generation program Fan Wu
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:56 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu, Deven Bowers

Enable IPE policy authors to indicate trust for a single fsverity
file, identified by its digest, through "fsverity_digest", and for
all files with valid fsverity builtin signatures through
"fsverity_signature".

This enables file-level integrity claims to be expressed in IPE,
allowing individual files to be authorized and giving policy authors
more flexibility. Such file-level claims are important for enforcing
the integrity of packages, and they address some of the scalability
issues of a solution based solely on dm-verity (number of loopback
devices, etc.).

This solution cannot be implemented in userspace, because the minimum
threat that IPE should mitigate is an attacker who downloads a malicious
payload together with all required dependencies. These dependencies can
lack the userspace check, bypassing the protection entirely. A similar
attack succeeds if the userspace component is replaced with a version
that does not perform the check. As a result, this can only be done at
the common entry point: the kernel.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v6:
  + Not present

v7:
  + Introduced

v8:
  * Undo squash of 08/12, 10/12 - separating drivers/md/ from security/
  * Use common-audit function for fsverity_signature.
  + Change fsverity implementation to use fsverity_get_digest
  + prevent unnecessary copy of fs-verity signature data, instead
    just check for presence of signature data.
  + Remove free_inode_security hook, as the digest is now acquired
    at runtime instead of via LSM blob.

v9:
  + Adapt to the new parser

v10:
  + Update the fsverity get digest call

v11:
  + No changes

v12:
  + Fix audit format
  + Simplify property evaluation

v13:
  + Remove the CONFIG_IPE_PROP_FS_VERITY dependency inside the parser
    to make the policy grammar independent of the kernel config.

v14:
  + No changes

v15:
  + Fix one grammar issue in Kconfig
  + Switch hook to security_inode_setintegrity()

v16:
  + Rewrite fsverity signature part in Kconfig

v17:
  + Fix documentation issues
  + Use new enum name LSM_INT_FSVERITY_BUILTINSIG_VALID
---
 security/ipe/Kconfig         |  14 +++++
 security/ipe/audit.c         |  17 ++++++
 security/ipe/eval.c          | 108 ++++++++++++++++++++++++++++++++++-
 security/ipe/eval.h          |  10 ++++
 security/ipe/hooks.c         |  28 +++++++++
 security/ipe/hooks.h         |   6 ++
 security/ipe/ipe.c           |  13 +++++
 security/ipe/ipe.h           |   3 +
 security/ipe/policy.h        |   3 +
 security/ipe/policy_parser.c |   6 ++
 10 files changed, 207 insertions(+), 1 deletion(-)

diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index 6179752c614f..20fe88deb756 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -30,6 +30,20 @@ config IPE_PROP_DM_VERITY
 	  that was mounted with a valid signed root-hash or the
 	  volume's root hash matches the supplied value in the policy.
 
+	  If unsure, answer Y.
+
+config IPE_PROP_FS_VERITY
+	bool "Enable property for fs-verity files"
+	depends on FS_VERITY && FS_VERITY_BUILTIN_SIGNATURES
+	help
+	  This option enables the usage of properties "fsverity_signature"
+	  and "fsverity_digest". These properties evaluate to TRUE when
+	  a file is fsverity enabled and has a valid builtin signature
+	  whose signing cert is in the .fs-verity keyring or its
+	  digest matches the supplied value in the policy.
+
+	  if unsure, answer Y.
+
 endmenu
 
 endif
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index 2c98520267c1..bd258f887e6f 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -53,6 +53,9 @@ static const char *const audit_prop_names[__IPE_PROP_MAX] = {
 	"dmverity_roothash=",
 	"dmverity_signature=FALSE",
 	"dmverity_signature=TRUE",
+	"fsverity_digest=",
+	"fsverity_signature=FALSE",
+	"fsverity_signature=TRUE",
 };
 
 /**
@@ -66,6 +69,17 @@ static void audit_dmv_roothash(struct audit_buffer *ab, const void *rh)
 	ipe_digest_audit(ab, rh);
 }
 
+/**
+ * audit_fsv_digest() - audit the digest of a fsverity_digest property.
+ * @ab: Supplies a pointer to the audit_buffer to append to.
+ * @d: Supplies a pointer to the digest structure.
+ */
+static void audit_fsv_digest(struct audit_buffer *ab, const void *d)
+{
+	audit_log_format(ab, "%s", audit_prop_names[IPE_PROP_FSV_DIGEST]);
+	ipe_digest_audit(ab, d);
+}
+
 /**
  * audit_rule() - audit an IPE policy rule.
  * @ab: Supplies a pointer to the audit_buffer to append to.
@@ -82,6 +96,9 @@ static void audit_rule(struct audit_buffer *ab, const struct ipe_rule *r)
 		case IPE_PROP_DMV_ROOTHASH:
 			audit_dmv_roothash(ab, ptr->value);
 			break;
+		case IPE_PROP_FSV_DIGEST:
+			audit_fsv_digest(ab, ptr->value);
+			break;
 		default:
 			audit_log_format(ab, "%s", audit_prop_names[ptr->type]);
 			break;
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 477f0d0ffda8..200003871417 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -10,6 +10,7 @@
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
 #include <linux/moduleparam.h>
+#include <linux/fsverity.h>
 
 #include "ipe.h"
 #include "eval.h"
@@ -51,6 +52,23 @@ static void build_ipe_bdev_ctx(struct ipe_eval_ctx *ctx, const struct inode *con
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+/**
+ * build_ipe_inode_ctx() - Build inode fields of an evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @ino: Supplies the inode struct of the file that triggered the IPE event.
+ */
+static void build_ipe_inode_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+	ctx->ino = ino;
+	ctx->ipe_inode = ipe_inode(ctx->ino);
+}
+#else
+static void build_ipe_inode_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
 /**
  * ipe_build_eval_ctx() - Build an ipe evaluation context.
  * @ctx: Supplies a pointer to the context to be populated.
@@ -63,13 +81,17 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 			enum ipe_op_type op,
 			enum ipe_hook_type hook)
 {
+	struct inode *ino;
+
 	ctx->file = file;
 	ctx->op = op;
 	ctx->hook = hook;
 
 	if (file) {
 		build_ipe_sb_ctx(ctx, file);
-		build_ipe_bdev_ctx(ctx, d_real_inode(file->f_path.dentry));
+		ino = d_real_inode(file->f_path.dentry);
+		build_ipe_bdev_ctx(ctx, ino);
+		build_ipe_inode_ctx(ctx, ino);
 	}
 }
 
@@ -148,6 +170,84 @@ static bool evaluate_dmv_sig_true(const struct ipe_eval_ctx *const ctx)
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+/**
+ * evaluate_fsv_digest() - Evaluate @ctx against a fsv digest property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ * @p: Supplies a pointer to the property being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches @p
+ * * %false	- The current @ctx doesn't match @p
+ */
+static bool evaluate_fsv_digest(const struct ipe_eval_ctx *const ctx,
+				struct ipe_prop *p)
+{
+	enum hash_algo alg;
+	u8 digest[FS_VERITY_MAX_DIGEST_SIZE];
+	struct digest_info info;
+
+	if (!ctx->ino)
+		return false;
+	if (!fsverity_get_digest((struct inode *)ctx->ino,
+				 digest,
+				 NULL,
+				 &alg))
+		return false;
+
+	info.alg = hash_algo_name[alg];
+	info.digest = digest;
+	info.digest_len = hash_digest_size[alg];
+
+	return ipe_digest_eval(p->value, &info);
+}
+
+/**
+ * evaluate_fsv_sig_false() - Evaluate @ctx against a fsv sig false property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches the property
+ * * %false	- The current @ctx doesn't match the property
+ */
+static bool evaluate_fsv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return !ctx->ino ||
+	       !IS_VERITY(ctx->ino) ||
+	       !ctx->ipe_inode ||
+	       !ctx->ipe_inode->fs_verity_signed;
+}
+
+/**
+ * evaluate_fsv_sig_true() - Evaluate @ctx against a fsv sig true property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true - The current @ctx matches the property
+ * * %false - The current @ctx doesn't match the property
+ */
+static bool evaluate_fsv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return !evaluate_fsv_sig_false(ctx);
+}
+#else
+static bool evaluate_fsv_digest(const struct ipe_eval_ctx *const ctx,
+				struct ipe_prop *p)
+{
+	return false;
+}
+
+static bool evaluate_fsv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+
+static bool evaluate_fsv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
 /**
  * evaluate_property() - Analyze @ctx against a rule property.
  * @ctx: Supplies a pointer to the context to be evaluated.
@@ -174,6 +274,12 @@ static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 		return evaluate_dmv_sig_false(ctx);
 	case IPE_PROP_DMV_SIG_TRUE:
 		return evaluate_dmv_sig_true(ctx);
+	case IPE_PROP_FSV_DIGEST:
+		return evaluate_fsv_digest(ctx, p);
+	case IPE_PROP_FSV_SIG_FALSE:
+		return evaluate_fsv_sig_false(ctx);
+	case IPE_PROP_FSV_SIG_TRUE:
+		return evaluate_fsv_sig_true(ctx);
 	default:
 		return false;
 	}
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index aa29e8036c48..be41b96881ff 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -29,6 +29,12 @@ struct ipe_bdev {
 };
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+struct ipe_inode {
+	bool fs_verity_signed;
+};
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
 	enum ipe_hook_type hook;
@@ -38,6 +44,10 @@ struct ipe_eval_ctx {
 #ifdef CONFIG_IPE_PROP_DM_VERITY
 	const struct ipe_bdev *ipe_bdev;
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+	const struct inode *ino;
+	const struct ipe_inode *ipe_inode;
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
 };
 
 enum ipe_match {
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index 5d4a9abb9c44..3a2968d6d363 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -269,3 +269,31 @@ int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type typ
 	return -EINVAL;
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+/**
+ * ipe_inode_setintegrity() - save integrity data from an inode to IPE's LSM blob.
+ * @inode: The inode to source the security blob from.
+ * @type: Supplies the integrity type.
+ * @value: The value to be stored.
+ * @size: The size of @value.
+ *
+ * This hook is currently used to save the existence of a validated fs-verity
+ * builtin signature into the LSM blob.
+ *
+ * Return: %0 on success. If an error occurs, the function will return the
+ * -errno.
+ */
+int ipe_inode_setintegrity(struct inode *inode, enum lsm_integrity_type type,
+			   const void *value, size_t size)
+{
+	struct ipe_inode *inode_sec = ipe_inode(inode);
+
+	if (type == LSM_INT_FSVERITY_BUILTINSIG_VALID) {
+		inode_sec->fs_verity_signed = size > 0 && value;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index 4d585fb6ada3..095968fc7bbc 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -9,6 +9,7 @@
 #include <linux/binfmts.h>
 #include <linux/security.h>
 #include <linux/blk_types.h>
+#include <linux/fsverity.h>
 
 enum ipe_hook_type {
 	IPE_HOOK_BPRM_CHECK = 0,
@@ -43,4 +44,9 @@ int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type typ
 			  const void *value, size_t len);
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+int ipe_inode_setintegrity(struct inode *inode, enum lsm_integrity_type type,
+			   const void *value, size_t size);
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
 #endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 99cb42caa63a..3896a8da4213 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -16,6 +16,9 @@ static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 #ifdef CONFIG_IPE_PROP_DM_VERITY
 	.lbs_bdev = sizeof(struct ipe_bdev),
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+	.lbs_inode = sizeof(struct ipe_inode),
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
 };
 
 static const struct lsm_id ipe_lsmid = {
@@ -35,6 +38,13 @@ struct ipe_bdev *ipe_bdev(struct block_device *b)
 }
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
 
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+struct ipe_inode *ipe_inode(const struct inode *inode)
+{
+	return inode->i_security + ipe_blobs.lbs_inode;
+}
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
+
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
 	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
@@ -46,6 +56,9 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bdev_free_security, ipe_bdev_free_security),
 	LSM_HOOK_INIT(bdev_setintegrity, ipe_bdev_setintegrity),
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+	LSM_HOOK_INIT(inode_setintegrity, ipe_inode_setintegrity),
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
 };
 
 /**
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index 01f46286e383..9abb12e5e47c 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -19,5 +19,8 @@ extern bool ipe_enabled;
 #ifdef CONFIG_IPE_PROP_DM_VERITY
 struct ipe_bdev *ipe_bdev(struct block_device *b);
 #endif /* CONFIG_IPE_PROP_DM_VERITY */
+#ifdef CONFIG_IPE_PROP_FS_VERITY
+struct ipe_inode *ipe_inode(const struct inode *inode);
+#endif /* CONFIG_IPE_PROP_FS_VERITY */
 
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 26776092c710..5bfbdbddeef8 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -36,6 +36,9 @@ enum ipe_prop_type {
 	IPE_PROP_DMV_ROOTHASH,
 	IPE_PROP_DMV_SIG_FALSE,
 	IPE_PROP_DMV_SIG_TRUE,
+	IPE_PROP_FSV_DIGEST,
+	IPE_PROP_FSV_SIG_FALSE,
+	IPE_PROP_FSV_SIG_TRUE,
 	__IPE_PROP_MAX
 };
 
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 71c84b293029..5a182c006b0e 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -278,6 +278,9 @@ static const match_table_t property_tokens = {
 	{IPE_PROP_DMV_ROOTHASH,		"dmverity_roothash=%s"},
 	{IPE_PROP_DMV_SIG_FALSE,	"dmverity_signature=FALSE"},
 	{IPE_PROP_DMV_SIG_TRUE,		"dmverity_signature=TRUE"},
+	{IPE_PROP_FSV_DIGEST,		"fsverity_digest=%s"},
+	{IPE_PROP_FSV_SIG_FALSE,	"fsverity_signature=FALSE"},
+	{IPE_PROP_FSV_SIG_TRUE,		"fsverity_signature=TRUE"},
 	{IPE_PROP_INVALID,		NULL}
 };
 
@@ -310,6 +313,7 @@ static int parse_property(char *t, struct ipe_rule *r)
 
 	switch (token) {
 	case IPE_PROP_DMV_ROOTHASH:
+	case IPE_PROP_FSV_DIGEST:
 		dup = match_strdup(&args[0]);
 		if (!dup) {
 			rc = -ENOMEM;
@@ -325,6 +329,8 @@ static int parse_property(char *t, struct ipe_rule *r)
 	case IPE_PROP_BOOT_VERIFIED_TRUE:
 	case IPE_PROP_DMV_SIG_FALSE:
 	case IPE_PROP_DMV_SIG_TRUE:
+	case IPE_PROP_FSV_SIG_FALSE:
+	case IPE_PROP_FSV_SIG_TRUE:
 		p->type = token;
 		break;
 	default:
-- 
2.44.0


^ permalink raw reply related	[relevance 39%]

* [PATCH v17 20/21] Documentation: add ipe documentation
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (18 preceding siblings ...)
  2024-04-13  0:56 48% ` [PATCH v17 19/21] ipe: kunit test for parser Fan Wu
@ 2024-04-13  0:56 12% ` Fan Wu
  2024-04-13  0:56 77% ` [PATCH v17 21/21] MAINTAINERS: ipe: add ipe maintainer information Fan Wu
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:56 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Add IPE's admin and developer documentation to the kernel tree.

Co-developed-by: Fan Wu <wufan@linux.microsoft.com>
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
---
v2:
  + No Changes

v3:
  + Add Acked-by
  + Fixup code block syntax
  + Fix a minor grammatical issue.

v4:
  + Update documentation with the results of other
    code changes.

v5:
  + No changes

v6:
  + No changes

v7:
  + Add additional developer-level documentation
  + Update admin-guide docs to reflect changes.
  + Drop Acked-by due to significant changes
  + Added section about audit events in admin-guide

v8:
  + Correct terminology from "audit event" to "audit record"
  + Add associated documentation with the correct "audit event"
    terminology.
  + Add some context to the historical motivation for IPE and design
    philosophy.
  + Add some content about the securityfs layout in the policies
    directory.
  + Various spelling and grammatical corrections.

v9:
  + Correct spelling of "pitfalls"
  + Update the docs w.r.t the new parser and new audit formats

v10:
  + Refine user docs per upstream suggestions
  + Update audit events part

v11:
  + No changes

v12:
  + Update audit formats
  + Update initramfs related docs
  + Add test suite link

v13:
  + No changes

v14:
  + No changes

v15:
  + Update boot_verified part
  + Fix format issues
  + Add IPE doc link to fsverity.rst

v16:
  + Explicitly mention fsverity builtin signature

v17:
  + Rewrite many parts of Documentation/admin-guide/LSM/ipe.rst
  + Fix incorrect path name of policyfs interfaces
---
 Documentation/admin-guide/LSM/index.rst       |   1 +
 Documentation/admin-guide/LSM/ipe.rst         | 797 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  12 +
 Documentation/filesystems/fsverity.rst        |   5 +-
 Documentation/security/index.rst              |   1 +
 Documentation/security/ipe.rst                | 444 ++++++++++
 6 files changed, 1259 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/LSM/ipe.rst
 create mode 100644 Documentation/security/ipe.rst

diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index a6ba95fbaa9f..ce63be6d64ad 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -47,3 +47,4 @@ subdirectories.
    tomoyo
    Yama
    SafeSetID
+   ipe
diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst
new file mode 100644
index 000000000000..d2bdd6e5b662
--- /dev/null
+++ b/Documentation/admin-guide/LSM/ipe.rst
@@ -0,0 +1,797 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Integrity Policy Enforcement (IPE)
+==================================
+
+.. NOTE::
+
+   This is the documentation for admins, system builders, or individuals
+   attempting to use IPE. If you're looking for more developer-focused
+   documentation about IPE please see Documentation/security/ipe.rst
+
+Overview
+--------
+
+Integrity Policy Enforcement (IPE) is a Linux Security Module that takes a
+complementary approach to access control. Unlike traditional access control
+mechanisms that rely on labels and paths for decision-making, IPE focuses
+on the immutable security properties inherent to system components. These
+properties are fundamental attributes or features of a system component
+that cannot be altered, ensuring a consistent and reliable basis for
+security decisions.
+
+To elaborate, in the context of IPE, system components primarily refer to
+files or the devices these files reside on. However, this is just a
+starting point. The concept of system components is flexible and can be
+extended to include new elements as the system evolves. The immutable
+properties include the origin of a file, which remains constant and
+unchangeable over time. For example, IPE policies can be crafted to trust
+files originating from the initramfs. Since initramfs is typically verified
+by the bootloader, its files are deemed trustworthy; "file is from
+initramfs" becomes an immutable property under IPE's consideration.
+
+The immutable property concept extends to the security features enabled on
+a file's origin, such as dm-verity or fs-verity, which provide a layer of
+integrity and trust. For example, IPE allows the definition of policies
+that trust files from a dm-verity protected device. dm-verity ensures the
+integrity of an entire device by providing a verifiable and immutable state
+of its contents. Similarly, fs-verity offers filesystem-level integrity
+checks, allowing IPE to enforce policies that trust files protected by
+fs-verity. These two features cannot be turned off once established, so
+they are considered immutable properties. These examples demonstrate how
+IPE leverages immutable properties, such as a file's origin and its
+integrity protection mechanisms, to make access control decisions.
+
+The IPE policy, specifically, grants the ability to enforce
+stringent access controls by assessing security properties against
+reference values defined within the policy. This assessment can be based on
+the existence of a security property (e.g., verifying if a file originates
+from initramfs) or evaluating the internal state of an immutable security
+property. The latter includes checking the roothash of a dm-verity
+protected device, determining whether dm-verity possesses a valid
+signature, assessing the digest of a fs-verity protected file, or
+determining whether fs-verity possesses a valid built-in signature. This
+nuanced approach to policy enforcement enables a highly secure and
+customizable system defense mechanism, tailored to specific security
+requirements and trust models.
+
+To enable IPE, ensure that ``CONFIG_SECURITY_IPE`` (under
+:menuselection:`Security -> Integrity Policy Enforcement (IPE)`) config
+option is enabled.
+
+Use Cases
+---------
+
+IPE works best in fixed-function devices: devices whose purpose
+is clearly defined and not supposed to be changed (e.g. network firewall
+device in a data center, an IoT device, etcetera), where all software and
+configuration is built and provisioned by the system owner.
+
+IPE is a long way off from use in general-purpose computing: the Linux
+community as a whole tends to follow a decentralized trust model (known as
+the web of trust), which IPE does not yet support. Instead, IPE
+supports PKI (public key infrastructure), which generally designates a
+set of trusted entities that provide a measure of absolute trust.
+
+Additionally, while most packages are signed today, the files inside
+the packages (for instance, the executables), tend to be unsigned. This
+makes it difficult to utilize IPE in systems where a package manager is
+expected to be functional, without major changes to the package manager
+and ecosystem behind it.
+
+DIGLIM [#diglim]_ is a system that, when combined with IPE, could be used to
+enable and support general-purpose computing use cases.
+
+Known Limitations
+-----------------
+
+IPE cannot verify the integrity of anonymous executable memory, such as
+the trampolines created by gcc closures and libffi (<3.4.2), or JIT'd code.
+Unfortunately, as this is dynamically generated code, there is no way
+for IPE to ensure the integrity of this code to form a trust basis. In all
+cases, the return result for these operations will be whatever the admin
+configures as the ``DEFAULT`` action for ``EXECUTE``.
+
+IPE cannot verify the integrity of programs written in interpreted
+languages when these scripts are invoked by passing these program files
+to the interpreter. This is because of the way interpreters execute these
+files: the scripts themselves are not evaluated as executable code
+through one of IPE's hooks, but are merely text files that are read
+(as opposed to compiled executables) [#interpreters]_.
+
+Threat Model
+------------
+
+IPE specifically targets the risk of tampering with user-space executable
+code after the kernel has initially booted, including the kernel modules
+loaded from userspace via ``modprobe`` or ``insmod``.
+
+To illustrate, consider a scenario where an untrusted binary, possibly
+malicious, is downloaded along with all necessary dependencies, including a
+loader and libc. The primary function of IPE in this context is to prevent
+the execution of such binaries and their dependencies.
+
+IPE achieves this by verifying the integrity and authenticity of all
+executable code before allowing it to run. It conducts a thorough
+check to ensure that the code's integrity is intact and that it matches an
+authorized reference value (digest, signature, etc) as per the defined
+policy. If a binary does not pass this verification process, either
+because its integrity has been compromised or it does not meet the
+authorization criteria, IPE will deny its execution. Additionally, IPE
+generates audit logs which may be utilized to detect and analyze failures
+resulting from policy violation.
+
+Tampering threat scenarios include modification or replacement of
+executable code by a range of actors including:
+
+-  Actors with physical access to the hardware
+-  Actors with local network access to the system
+-  Actors with access to the deployment system
+-  Compromised internal systems under external control
+-  Malicious end users of the system
+-  Compromised end users of the system
+-  Remote (external) compromise of the system
+
+IPE does not mitigate threats arising from malicious but authorized
+developers (with access to a signing certificate), or compromised
+developer tools used by them (i.e. return-oriented programming attacks).
+Additionally, IPE draws a hard security boundary between userspace and
+kernelspace. As a result, IPE does not provide any protections against a
+kernel-level exploit, and a kernel-level exploit can disable or tamper
+with IPE's protections.
+
+Policy
+------
+
+IPE policy is a plain-text [#devdoc]_ policy composed of multiple statements
+over several lines. There is one required line, at the top of the
+policy, indicating the policy name, and the policy version, for
+instance::
+
+   policy_name=Ex_Policy policy_version=0.0.0
+
+The policy name is a unique key identifying this policy in a human
+readable name. This is used to create nodes under securityfs as well as
+uniquely identify policies to deploy new policies vs update existing
+policies.
+
+The policy version indicates the current version of the policy (NOT the
+policy syntax version). This is used to prevent rollback of policy to
+potentially insecure previous versions of the policy.
+
+The next portion of an IPE policy consists of rules. Rules are formed by key=value
+pairs, known as properties. IPE rules require two properties: ``action``,
+which determines what IPE does when it encounters a match against the
+rule, and ``op``, which determines when the rule should be evaluated.
+The ordering is significant: a rule must start with ``op``, and end with
+``action``. Thus, a minimal rule is::
+
+   op=EXECUTE action=ALLOW
+
+This example will allow any execution. Additional properties are used to
+assess immutable security properties about the files being evaluated.
+These properties are intended to be descriptions of systems within the
+kernel that can provide a measure of integrity verification, such that IPE
+can determine the trust of the resource based on the value of the property.
+
+Rules are evaluated top-to-bottom. As a result, any revocation rules
+or denies should be placed early in the file to ensure that these rules
+are evaluated before a rule with ``action=ALLOW``.
+
+IPE policy supports comments. The character '#' will function as a
+comment, ignoring all characters to the right of '#' until the newline.
+
+The default behavior of IPE evaluations can also be expressed in policy,
+through the ``DEFAULT`` statement. This can be done at a global level,
+or a per-operation level::
+
+   # Global
+   DEFAULT action=ALLOW
+
+   # Operation Specific
+   DEFAULT op=EXECUTE action=ALLOW
+
+A default must be set for all known operations in IPE. If you want older
+policies to remain compatible with newer kernels that may introduce
+new operations, set a global default of ``ALLOW``, then override the
+defaults on a per-operation basis (as above).
+
+With configurable policy-based LSMs, there are several issues with
+enforcing the configurable policies at startup, around reading and
+parsing the policy:
+
+1. The kernel *should* not read files from userspace, so directly reading
+   the policy file is prohibited.
+2. The kernel command line has a character limit, and one kernel module
+   should not reserve the entire character limit for its own
+   configuration.
+3. There are various boot loaders in the kernel ecosystem, so handing
+   off a memory block would be costly to maintain.
+
+As a result, IPE has addressed this problem through a concept of a "boot
+policy". A boot policy is a minimal policy which is compiled into the
+kernel. This policy is intended to get the system to a state where
+userspace is set up and ready to receive commands, at which point a more
+complex policy can be deployed via securityfs. The boot policy can be
+specified via ``SECURITY_IPE_BOOT_POLICY`` config option, which accepts
+a path to a plain-text version of the IPE policy to apply. This policy
+will be compiled into the kernel. If not specified, IPE will be disabled
+until a policy is deployed and activated through securityfs.
+
+Deploying Policies
+~~~~~~~~~~~~~~~~~~
+
+Policies can be deployed from userspace through securityfs. These policies
+are signed through the PKCS#7 message format to enforce some level of
+authorization of the policies (prohibiting an attacker from gaining
+unconstrained root, and deploying an "allow all" policy). These
+policies must be signed by a certificate that chains to the
+``SYSTEM_TRUSTED_KEYRING``. With openssl, the policy can be signed by::
+
+   openssl smime -sign \
+      -in "$MY_POLICY" \
+      -signer "$MY_CERTIFICATE" \
+      -inkey "$MY_PRIVATE_KEY" \
+      -noattr \
+      -nodetach \
+      -nosmimecap \
+      -outform der \
+      -out "$MY_POLICY.p7b"
+
+Deploying the policies is done through securityfs, through the
+``new_policy`` node. To deploy a policy, simply cat the file into the
+securityfs node::
+
+   cat "$MY_POLICY.p7b" > /sys/kernel/security/ipe/new_policy
+
+Upon success, this will create one subdirectory under
+``/sys/kernel/security/ipe/policies/``. The subdirectory will be the
+``policy_name`` field of the policy deployed, so for the example above,
+the directory will be ``/sys/kernel/security/ipe/policies/Ex_Policy``.
+Within this directory, there will be seven files: ``pkcs7``, ``policy``,
+``name``, ``version``, ``active``, ``update``, and ``delete``.
+
+The ``pkcs7`` file is read-only. Reading it returns the raw PKCS#7 data
+that was provided to the kernel, representing the policy. If the policy being
+read is the boot policy, this will return ``ENOENT``, as it is not signed.
+
+The ``policy`` file is read-only. Reading it returns the PKCS#7 inner
+content of the policy, which will be the plain text policy.
+
+The ``active`` file is used to set a policy as the currently active policy.
+This file is read-write, and accepts a value of ``"1"`` to set the policy as active.
+Since only a single policy can be active at one time, all other policies
+will be marked inactive. The policy being marked active must have a policy
+version greater than or equal to the currently-running version.
+
+The ``update`` file is used to update a policy that is already present
+in the kernel. This file is write-only and accepts a PKCS#7 signed
+policy. Two checks will always be performed on this policy: First, the
+``policy_name`` of the updated version must match that of the existing
+version. Second, the updated policy must have a policy version greater than
+or equal to the currently-running version. This is to prevent rollback attacks.
+
+The ``delete`` file is used to remove a policy that is no longer needed.
+This file is write-only and accepts a value of ``1`` to delete the policy.
+On deletion, the securityfs node representing the policy will be removed.
+However, deleting the currently active policy is not allowed and will return
+an operation not permitted error.
+
+Similarly, writing to either ``update`` or ``new_policy`` could result in a
+bad message (policy syntax error) or file exists error. The latter error happens
+when trying to deploy a policy with a ``policy_name`` while the kernel already
+has a deployed policy with the same ``policy_name``.
+
+Deploying a policy will *not* cause IPE to start enforcing the policy. IPE will
+only enforce the policy marked active. Note that only one policy can be active
+at a time.
+
+Once deployment is successful, the policy can be activated by writing to the file
+``/sys/kernel/security/ipe/policies/$policy_name/active``.
+For example, the ``Ex_Policy`` can be activated by::
+
+   echo 1 > "/sys/kernel/security/ipe/policies/Ex_Policy/active"
+
+From this point on, ``Ex_Policy`` is now the enforced policy on the
+system.
+
+IPE also provides a way to delete policies. This can be done via the
+``delete`` securityfs node,
+``/sys/kernel/security/ipe/policies/$policy_name/delete``.
+Writing ``1`` to that file deletes the policy::
+
+   echo 1 > "/sys/kernel/security/ipe/policies/$policy_name/delete"
+
+There is only one requirement to delete a policy: the policy being deleted
+must be inactive.
+
+.. NOTE::
+
+   If a traditional MAC system is enabled (SELinux, apparmor, smack), all
+   writes to ipe's securityfs nodes require ``CAP_MAC_ADMIN``.
+
+Modes
+~~~~~
+
+IPE supports two modes of operation: permissive (similar to SELinux's
+permissive mode) and enforced. In permissive mode, all events are
+checked and policy violations are logged, but the policy is not really
+enforced. This allows users to test policies before enforcing them.
+
+The default mode is enforced, and can be changed via the kernel command
+line parameter ``ipe.enforce=(0|1)``, or the securityfs node
+``/sys/kernel/security/ipe/enforce``.
+
+.. NOTE::
+
+   If a traditional MAC system is enabled (SELinux, apparmor, smack, etcetera),
+   all writes to ipe's securityfs nodes require ``CAP_MAC_ADMIN``.
+
+Audit Events
+~~~~~~~~~~~~
+
+1420 AUDIT_IPE_ACCESS
+^^^^^^^^^^^^^^^^^^^^^
+Event Examples::
+
+   type=1420 audit(1653364370.067:61): ipe_op=EXECUTE ipe_hook=MMAP enforcing=1 pid=2241 comm="ld-linux.so" path="/deny/lib/libc.so.6" dev="sda2" ino=14549020 rule="DEFAULT action=DENY"
+   type=1300 audit(1653364370.067:61): SYSCALL arch=c000003e syscall=9 success=no exit=-13 a0=7f1105a28000 a1=195000 a2=5 a3=812 items=0 ppid=2219 pid=2241 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="ld-linux.so" exe="/tmp/ipe-test/lib/ld-linux.so" subj=unconfined key=(null)
+   type=1327 audit(1653364370.067:61): 707974686F6E3300746573742F6D61696E2E7079002D6E00
+
+   type=1420 audit(1653364735.161:64): ipe_op=EXECUTE ipe_hook=MMAP enforcing=1 pid=2472 comm="mmap_test" path=? dev=? ino=? rule="DEFAULT action=DENY"
+   type=1300 audit(1653364735.161:64): SYSCALL arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=1000 a2=4 a3=21 items=0 ppid=2219 pid=2472 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="mmap_test" exe="/root/overlake_test/upstream_test/vol_fsverity/bin/mmap_test" subj=unconfined key=(null)
+   type=1327 audit(1653364735.161:64): 707974686F6E3300746573742F6D61696E2E7079002D6E00
+
+This event indicates that IPE made an access control decision; the IPE
+specific record (1420) is always emitted in conjunction with an
+``AUDITSYSCALL`` record.
+
+Determining whether IPE is in permissive or enforced mode can be derived
+from ``success`` property and exit code of the ``AUDITSYSCALL`` record.
+
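A minimal sketch of that derivation, in Python, is given here. It pulls ``success`` and ``exit`` out of a type=1300 record and treats a failed syscall with ``exit=-13`` (``EACCES``) as an enforced denial. That mapping is an assumption drawn from the example records; other LSMs can also return ``EACCES``, so the SYSCALL record must be paired with the 1420 record of the same event. The function names are hypothetical.

```python
import errno
import re
from typing import Tuple


def syscall_outcome(record: str) -> Tuple[bool, int]:
    """Extract (success, exit) from a type=1300 SYSCALL record."""
    match = re.search(r"\bsuccess=(yes|no)\s+exit=(-?\d+)", record)
    if match is None:
        raise ValueError("no success/exit fields found")
    return match.group(1) == "yes", int(match.group(2))


def denial_was_enforced(record: str) -> bool:
    """True when the syscall failed with EACCES, as in the examples above.

    Assumption: an enforced IPE denial surfaces as exit=-13 (EACCES).
    """
    success, exit_code = syscall_outcome(record)
    return not success and exit_code == -errno.EACCES


EXAMPLE = (
    "type=1300 audit(1653364370.067:61): SYSCALL arch=c000003e syscall=9 "
    "success=no exit=-13 a0=7f1105a28000 a1=195000 a2=5 a3=812 items=0"
)
```

With the first example event above, ``denial_was_enforced(EXAMPLE)`` is true, which agrees with ``enforcing=1`` in the accompanying 1420 record.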
+
+Field descriptions:
+
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| Field     | Value Type | Optional? | Description of Value                                                            |
++===========+============+===========+=================================================================================+
+| ipe_op    | string     | No        | The IPE operation name associated with the log                                  |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| ipe_hook  | string     | No        | The name of the LSM hook that triggered the IPE event                           |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| enforcing | integer    | No        | The current IPE enforcing state: 1 for enforcing mode, 0 for permissive mode    |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| pid       | integer    | No        | The pid of the process that triggered the IPE event                             |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| comm      | string     | No        | The command line program name of the process that triggered the IPE event       |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| path      | string     | Yes       | The absolute path to the evaluated file                                         |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| ino       | integer    | Yes       | The inode number of the evaluated file                                          |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| dev       | string     | Yes       | The device name of the evaluated file, e.g. vda                                 |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+| rule      | string     | No        | The matched policy rule                                                         |
++-----------+------------+-----------+---------------------------------------------------------------------------------+
+
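To make the field table concrete, here is a small Python sketch that splits one audit record into the fields listed above. It is only an illustration built on the example lines in this section: the regular expression, the ``parse_record`` name and the choice to map ``?`` to ``None`` for optional fields are assumptions of the sketch, not an official parser.

```python
import re
import shlex

# Matches the "type=N audit(TIMESTAMP:SERIAL): body" layout of the examples.
RECORD_RE = re.compile(r"type=(\d+) audit\((\d+\.\d+):(\d+)\): (.*)")


def parse_record(line: str) -> dict:
    """Split an audit record into its type, event serial and key=value fields."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        raise ValueError("not an audit record")
    rec_type, timestamp, serial, body = match.groups()
    fields = {}
    # shlex keeps quoted values such as rule="DEFAULT action=DENY" together.
    for token in shlex.split(body):
        key, sep, value = token.partition("=")
        if sep:
            # "?" appears in the examples where an optional field has no value.
            fields[key] = None if value == "?" else value
    return {
        "type": int(rec_type),
        "timestamp": timestamp,
        "serial": int(serial),
        "fields": fields,
    }


EXAMPLE = (
    'type=1420 audit(1653364370.067:61): ipe_op=EXECUTE ipe_hook=MMAP '
    'enforcing=1 pid=2241 comm="ld-linux.so" path="/deny/lib/libc.so.6" '
    'dev="sda2" ino=14549020 rule="DEFAULT action=DENY"'
)
```

Records that share the same timestamp and serial (``1653364370.067:61`` here) belong to one event, which is how the 1420 record can be matched with its ``AUDITSYSCALL`` record.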
+1421 AUDIT_IPE_CONFIG_CHANGE
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Event Example::
+
+   type=1421 audit(1653425583.136:54): old_active_pol_name="Allow_All" old_active_pol_version=0.0.0 old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855 new_active_pol_name="boot_verified" new_active_pol_version=0.0.0 new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F26765076DD8EED7B8F4DB auid=4294967295 ses=4294967295 lsm=ipe res=1
+   type=1300 audit(1653425583.136:54): SYSCALL arch=c000003e syscall=1 success=yes exit=2 a0=3 a1=5596fcae1fb0 a2=2 a3=2 items=0 ppid=184 pid=229 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="python3" exe="/usr/bin/python3.10" key=(null)
+   type=1327 audit(1653425583.136:54): PROCTITLE proctitle=707974686F6E3300746573742F6D61696E2E7079002D66002E2
+
+This event indicates that IPE switched the active policy from one to another.
+The record carries the name, version, and hash digest of both policies.
+Note that IPE can only have one policy active at a time; all access decisions
+are evaluated against the currently active policy.
+The normal procedure to deploy a new policy is to load it into the kernel
+first, then switch the active policy to it.
+
+This record will always be emitted in conjunction with an ``AUDITSYSCALL``
+record for the ``write`` syscall.
+
+Field descriptions:
+
++------------------------+------------+-----------+---------------------------------------------------+
+| Field                  | Value Type | Optional? | Description of Value                              |
++========================+============+===========+===================================================+
+| old_active_pol_name    | string     | No        | The name of the previous active policy            |
++------------------------+------------+-----------+---------------------------------------------------+
+| old_active_pol_version | string     | No        | The version of the previous active policy         |
++------------------------+------------+-----------+---------------------------------------------------+
+| old_policy_digest      | string     | No        | The hash of the previous active policy            |
++------------------------+------------+-----------+---------------------------------------------------+
+| new_active_pol_name    | string     | No        | The name of the current active policy             |
++------------------------+------------+-----------+---------------------------------------------------+
+| new_active_pol_version | string     | No        | The version of the current active policy          |
++------------------------+------------+-----------+---------------------------------------------------+
+| new_policy_digest      | string     | No        | The hash of the current active policy             |
++------------------------+------------+-----------+---------------------------------------------------+
+| auid                   | integer    | No        | The login user ID                                 |
++------------------------+------------+-----------+---------------------------------------------------+
+| ses                    | integer    | No        | The login session ID                              |
++------------------------+------------+-----------+---------------------------------------------------+
+| lsm                    | string     | No        | The lsm name associated with the event            |
++------------------------+------------+-----------+---------------------------------------------------+
+| res                    | integer    | No        | The result of the audited operation(success/fail) |
++------------------------+------------+-----------+---------------------------------------------------+
+
+1422 AUDIT_IPE_POLICY_LOAD
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Event Example::
+
+   type=1422 audit(1653425529.927:53): policy_name="boot_verified" policy_version=0.0.0 policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F26765076DD8EED7B8F4DB auid=4294967295 ses=4294967295 lsm=ipe res=1
+   type=1300 audit(1653425529.927:53): arch=c000003e syscall=1 success=yes exit=2567 a0=3 a1=5596fcae1fb0 a2=a07 a3=2 items=0 ppid=184 pid=229 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="python3" exe="/usr/bin/python3.10" key=(null)
+   type=1327 audit(1653425529.927:53): PROCTITLE proctitle=707974686F6E3300746573742F6D61696E2E7079002D66002E2E
+
+This record indicates that a new policy has been loaded into the kernel, along
+with its policy name, policy version and policy hash.
+
+This record will always be emitted in conjunction with an ``AUDITSYSCALL``
+record for the ``write`` syscall.
+
+Field descriptions:
+
++----------------+------------+-----------+---------------------------------------------------+
+| Field          | Value Type | Optional? | Description of Value                              |
++================+============+===========+===================================================+
+| policy_name    | string     | No        | The name of the loaded policy                     |
++----------------+------------+-----------+---------------------------------------------------+
+| policy_version | string     | No        | The version of the loaded policy                  |
++----------------+------------+-----------+---------------------------------------------------+
+| policy_digest  | string     | No        | The hash of the loaded policy                     |
++----------------+------------+-----------+---------------------------------------------------+
+| auid           | integer    | No        | The login user ID                                 |
++----------------+------------+-----------+---------------------------------------------------+
+| ses            | integer    | No        | The login session ID                              |
++----------------+------------+-----------+---------------------------------------------------+
+| lsm            | string     | No        | The lsm name associated with the event            |
++----------------+------------+-----------+---------------------------------------------------+
+| res            | integer    | No        | The result of the audited operation(success/fail) |
++----------------+------------+-----------+---------------------------------------------------+
+
+
+1404 AUDIT_MAC_STATUS
+^^^^^^^^^^^^^^^^^^^^^
+
+Event Examples::
+
+   type=1404 audit(1653425689.008:55): enforcing=0 old_enforcing=1 auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
+   type=1300 audit(1653425689.008:55): arch=c000003e syscall=1 success=yes exit=2 a0=1 a1=55c1065e5c60 a2=2 a3=0 items=0 ppid=405 pid=441 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=)
+   type=1327 audit(1653425689.008:55): proctitle="-bash"
+
+   type=1404 audit(1653425689.008:55): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
+   type=1300 audit(1653425689.008:55): arch=c000003e syscall=1 success=yes exit=2 a0=1 a1=55c1065e5c60 a2=2 a3=0 items=0 ppid=405 pid=441 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=)
+   type=1327 audit(1653425689.008:55): proctitle="-bash"
+
+This record will always be emitted in conjunction with an ``AUDITSYSCALL``
+record for the ``write`` syscall.
+
+Field descriptions:
+
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| Field         | Value Type | Optional? | Description of Value                                                                            |
++===============+============+===========+=================================================================================================+
+| enforcing     | integer    | No        | The enforcing state IPE is being switched to, 1 is in enforcing mode, 0 is in permissive mode   |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| old_enforcing | integer    | No        | The enforcing state IPE is being switched from, 1 is in enforcing mode, 0 is in permissive mode |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| auid          | integer    | No        | The login user ID                                                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| ses           | integer    | No        | The login session ID                                                                            |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| enabled       | integer    | No        | The new TTY audit enabled setting                                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| old-enabled   | integer    | No        | The old TTY audit enabled setting                                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| lsm           | string     | No        | The lsm name associated with the event                                                          |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+| res           | integer    | No        | The result of the audited operation(success/fail)                                               |
++---------------+------------+-----------+-------------------------------------------------------------------------------------------------+
+
+
+Success Auditing
+^^^^^^^^^^^^^^^^
+
+IPE supports success auditing. When enabled, all events that pass IPE
+policy and are not blocked will emit an audit event. This is disabled by
+default, and can be enabled via the kernel command line parameter
+``ipe.success_audit=(0|1)`` or the
+``/sys/kernel/security/ipe/success_audit`` securityfs file.
+
+This is *very* noisy, as IPE will check every userspace binary on the
+system, but is useful for debugging policies.
+
+.. NOTE::
+
+   If a traditional MAC system is enabled (SELinux, AppArmor, Smack, etcetera),
+   all writes to IPE's securityfs nodes require ``CAP_MAC_ADMIN``.
+
+Properties
+----------
+
+As explained above, IPE properties are ``key=value`` pairs expressed in IPE
+policy. Two properties are built into the policy parser: 'op' and 'action'.
+The other properties are used to restrict access based on immutable security
+properties of the files being evaluated. Currently those properties are:
+'``boot_verified``', '``dmverity_signature``', '``dmverity_roothash``',
+'``fsverity_signature``', '``fsverity_digest``'. All properties supported
+by IPE are described below:
+
+op
+~~
+
+Indicates the operation for a rule to apply to. Must be in every rule,
+as the first token. IPE supports the following operations:
+
+   ``EXECUTE``:
+
+      Pertains to any file attempting to be executed, or loaded as an
+      executable.
+
+   ``FIRMWARE``:
+
+      Pertains to firmware being loaded via the firmware_class interface.
+      This covers both the preallocated buffer and the firmware file
+      itself.
+
+   ``KMODULE``:
+
+      Pertains to loading kernel modules via ``modprobe`` or ``insmod``.
+
+   ``KEXEC_IMAGE``:
+
+      Pertains to kernel images loading via ``kexec``.
+
+   ``KEXEC_INITRAMFS``:
+
+      Pertains to initrd images loading via ``kexec --initrd``.
+
+   ``POLICY``:
+
+      Controls loading policies via a kernel-space initiated read.
+
+      An example of such is loading IMA policies by writing the path
+      to the policy file to ``$securityfs/ima/policy``.
+
+   ``X509_CERT``:
+
+      Controls loading IMA certificates through the Kconfigs,
+      ``CONFIG_IMA_X509_PATH`` and ``CONFIG_EVM_X509_PATH``.
+
+action
+~~~~~~
+
+   Determines what IPE should do when a rule matches. Must be in every
+   rule, as the final clause. Can be one of:
+
+   ``ALLOW``:
+
+      If the rule matches, explicitly allow access to the resource
+      without evaluating any more rules.
+
+   ``DENY``:
+
+      If the rule matches, explicitly deny access to the resource
+      without evaluating any more rules.
+
+boot_verified
+~~~~~~~~~~~~~
+
+   This property can be utilized for authorization of files from initramfs.
+   The format of this property is::
+
+         boot_verified=(TRUE|FALSE)
+
+
+   .. WARNING::
+
+      This property will trust files from the initramfs (rootfs). It should
+      only be used during the early boot stage. Before mounting the real
+      rootfs on top of the initramfs, the initramfs script will recursively
+      remove all files and directories on the initramfs. This is typically
+      implemented by using switch_root(8) [#switch_root]_. Therefore the
+      initramfs will be empty and not accessible after the real
+      rootfs takes over. It is advised to switch to a different policy
+      that doesn't rely on the property after this point.
+      This ensures that the trust policies remain relevant and effective
+      throughout the system's operation.
+
+dmverity_roothash
+~~~~~~~~~~~~~~~~~
+
+   This property can be utilized for authorization or revocation of
+   specific dm-verity volumes, identified via their root hashes. It has a
+   dependency on the DM_VERITY module. This property is controlled by
+   the ``IPE_PROP_DM_VERITY`` config option, which will be automatically
+   selected when ``IPE_SECURITY``, ``DM_VERITY`` and
+   ``DM_VERITY_VERIFY_ROOTHASH_SIG`` are all enabled.
+   The format of this property is::
+
+      dmverity_roothash=DigestName:HexadecimalString
+
+   The supported DigestNames for dmverity_roothash are [#dmveritydigests]_ [#securedigest]_ :
+
+      + blake2b-512
+      + blake2s-256
+      + sha1
+      + sha256
+      + sha384
+      + sha512
+      + sha3-224
+      + sha3-256
+      + sha3-384
+      + sha3-512
+      + md4
+      + md5
+      + sm3
+      + rmd160
+
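Since the property value is a plain ``DigestName:HexadecimalString`` pair, a policy author may want to lint it before deployment. The Python sketch below does that for a few of the algorithm names listed above. The length table and the ``lint_digest`` helper are assumptions of this sketch; as the footnote states, IPE itself does not impose restrictions on the digest algorithm, so this is an authoring aid and not a description of kernel behavior.

```python
import string

# Expected hex lengths for a few names from the list above.
# Deliberately incomplete: unknown names are only checked for hex syntax.
HEX_LEN = {"sha256": 64, "sha384": 96, "sha512": 128}


def lint_digest(value: str) -> list:
    """Return a list of problems found in a DigestName:HexadecimalString value."""
    name, sep, hexdigest = value.partition(":")
    if not sep or not name or not hexdigest:
        return ["expected DigestName:HexadecimalString"]
    problems = []
    if any(c not in string.hexdigits for c in hexdigest):
        problems.append("digest is not hexadecimal")
    expected = HEX_LEN.get(name)
    if expected is not None and len(hexdigest) != expected:
        problems.append(
            f"{name} digest should have {expected} hex characters, "
            f"got {len(hexdigest)}"
        )
    return problems
```

An empty list means no problem was found, so a value can be checked with ``if lint_digest(value):`` before it is written into a rule.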
+dmverity_signature
+~~~~~~~~~~~~~~~~~~
+
+   This property can be utilized for authorization of all dm-verity
+   volumes that have a signed roothash that was validated by a keyring
+   specified by dm-verity's configuration, either the system trusted
+   keyring, or the secondary keyring. It depends on the
+   ``DM_VERITY_VERIFY_ROOTHASH_SIG`` config option and is controlled by
+   the ``IPE_PROP_DM_VERITY`` config option, which will be automatically
+   selected when ``IPE_SECURITY``, ``DM_VERITY`` and
+   ``DM_VERITY_VERIFY_ROOTHASH_SIG`` are all enabled.
+   The format of this property is::
+
+      dmverity_signature=(TRUE|FALSE)
+
+fsverity_digest
+~~~~~~~~~~~~~~~
+
+   This property can be utilized for authorization or revocation of
+   a specific fs-verity enabled file, identified via its fs-verity digest.
+   It depends on the ``FS_VERITY`` config option and is controlled by
+   ``CONFIG_IPE_PROP_FS_VERITY``. The format of this property is::
+
+      fsverity_digest=DigestName:HexadecimalString
+
+   The supported DigestNames for fsverity_digest are [#fsveritydigest]_ [#securedigest]_ :
+
+      + sha256
+      + sha512
+
+fsverity_signature
+~~~~~~~~~~~~~~~~~~
+
+   This property is used to authorize all fs-verity enabled files that have
+   been verified by fs-verity's built-in signature mechanism. The signature
+   verification relies on a key stored within the ".fs-verity" keyring. It
+   depends on ``CONFIG_FS_VERITY_BUILTIN_SIGNATURES`` and is controlled by
+   the Kconfig ``CONFIG_IPE_PROP_FS_VERITY``. The format of this
+   property is::
+
+      fsverity_signature=(TRUE|FALSE)
+
+Policy Examples
+---------------
+
+Allow all
+~~~~~~~~~
+
+::
+
+   policy_name=Allow_All policy_version=0.0.0
+   DEFAULT action=ALLOW
+
+Allow only initramfs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=Allow_All_Initramfs policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+
+Allow any signed dm-verity volume and the initramfs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=AllowSignedAndInitramfs policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+   op=EXECUTE dmverity_signature=TRUE action=ALLOW
+
+Prohibit execution from a specific dm-verity volume
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=AllowSignedAndInitramfs policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE dmverity_roothash=sha256:cd2c5bae7c6c579edaae4353049d58eb5f2e8be0244bf05345bc8e5ed257baff action=DENY
+
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+   op=EXECUTE dmverity_signature=TRUE action=ALLOW
+
+Allow only a specific dm-verity volume
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=AllowSignedAndInitramfs policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE dmverity_roothash=sha256:401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW
+
+Allow any signed fs-verity file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=AllowSignedFSVerity policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE fsverity_signature=TRUE action=ALLOW
+
+Prohibit execution of a specific fs-verity file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+   policy_name=ProhibitSpecificFSVF policy_version=0.0.0
+   DEFAULT action=DENY
+
+   op=EXECUTE fsverity_digest=sha256:fd88f2b8824e197f850bf4c5109bea5cf0ee38104f710843bb72da796ba5af9e action=DENY
+   op=EXECUTE boot_verified=TRUE action=ALLOW
+   op=EXECUTE dmverity_signature=TRUE action=ALLOW
+
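The examples above rely on rule order: a ``DENY`` rule placed before the ``ALLOW`` rules takes effect first, and ``DEFAULT`` applies when nothing matches. The Python sketch below models that first-match behavior. It is a simplified illustration and not the kernel's parser: it handles only the header line, a single global ``DEFAULT`` line and ``op=... action=...`` rules as they appear in these examples, and the names ``Rule``, ``parse_policy`` and ``evaluate`` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    op: str
    conditions: Dict[str, str]
    action: str


def parse_policy(text: str) -> Tuple[Optional[str], List[Rule]]:
    """Parse policy text into (default_action, rules); the header is skipped."""
    default_action = None
    rules = []
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens or tokens[0].startswith("policy_name="):
            continue
        if tokens[0] == "DEFAULT":
            default_action = dict(t.split("=", 1) for t in tokens[1:])["action"]
            continue
        pairs = [t.split("=", 1) for t in tokens]
        (op_key, op), (action_key, action) = pairs[0], pairs[-1]
        # 'op' must be the first token and 'action' the final clause.
        if op_key != "op" or action_key != "action":
            raise ValueError(f"rule must start with op= and end with action=: {raw!r}")
        rules.append(Rule(op, dict(pairs[1:-1]), action))
    return default_action, rules


def evaluate(default_action, rules, op, properties):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.op == op and all(
            properties.get(key) == value for key, value in rule.conditions.items()
        ):
            return rule.action
    return default_action


POLICY = """\
policy_name=AllowSignedAndInitramfs policy_version=0.0.0
DEFAULT action=DENY

op=EXECUTE dmverity_roothash=sha256:cd2c5bae7c6c579edaae4353049d58eb5f2e8be0244bf05345bc8e5ed257baff action=DENY

op=EXECUTE boot_verified=TRUE action=ALLOW
op=EXECUTE dmverity_signature=TRUE action=ALLOW
"""
```

With this policy, a file on a signed volume whose root hash is the revoked one hits the ``DENY`` rule before the ``dmverity_signature=TRUE`` rule is reached, which is why the revocation has to be listed first.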
+Additional Information
+----------------------
+
+- `GitHub Repository <https://github.com/microsoft/ipe>`_
+- Documentation/security/ipe.rst
+
+FAQ
+---
+
+Q:
+   What's the difference between IPE and other LSMs which provide a measure
+   of trust-based access control?
+
+A:
+
+   In general, there are two other LSMs that can provide similar functionality:
+   IMA and Loadpin.
+
+   IMA and IPE are functionally very similar. The significant difference between
+   the two is the policy. [#devdoc]_
+
+   Loadpin and IPE differ fairly dramatically, as Loadpin only covers IPE's
+   kernel read operations, whereas IPE is capable of controlling execution
+   on top of kernel read. The trust model is also different; Loadpin roots its
+   trust in the initial super-block, whereas trust in IPE stems from the kernel
+   itself (via ``SYSTEM_TRUSTED_KEYS``).
+
+-----------
+
+.. [#diglim] https://lore.kernel.org/bpf/4d6932e96d774227b42721d9f645ba51@huawei.com/T/
+
+.. [#interpreters] There is `some interest in solving this issue <https://lore.kernel.org/lkml/20220321161557.495388-1-mic@digikod.net/>`_.
+
+.. [#devdoc] Please see Documentation/security/ipe.rst for more on this topic.
+
+.. [#switch_root] https://man7.org/linux/man-pages/man8/switch_root.8.html
+
+.. [#fsveritydigest] These hash algorithms are based on values accepted by fsverity-utils;
+                     IPE does not impose any restrictions on the digest algorithm itself;
+                     thus, this list may be out of date.
+
+.. [#dmveritydigests] These hash algorithms are based on values accepted by dm-verity,
+                      specifically ``crypto_alloc_ahash`` in ``verity_ctr``; ``veritysetup``
+                      does support more algorithms than the list above. IPE does not impose
+                      any restrictions on the digest algorithm itself; thus, this list
+                      may be out of date.
+
+.. [#securedigest] Please ensure you are using cryptographically secure hash functions;
+                   just because something is *supported* does not mean it is *secure*.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 70046a019d42..7b7a24a59747 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2321,6 +2321,18 @@
 	ipcmni_extend	[KNL,EARLY] Extend the maximum number of unique System V
 			IPC identifiers from 32,768 to 16,777,216.
 
+	ipe.enforce=	[IPE]
+			Format: <bool>
+			Determine whether IPE starts in permissive (0) or
+			enforce (1) mode. The default is enforce.
+
+	ipe.success_audit=
+			[IPE]
+			Format: <bool>
+			Start IPE with success auditing enabled, emitting
+			an audit event when a binary is allowed. The default
+			is 0.
+
 	irqaffinity=	[SMP] Set the default irq affinity mask
 			The argument is a cpu list, as described above.
 
diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 362b7a5dc300..46ab280e1b13 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -92,7 +92,9 @@ authenticating fs-verity file hashes include:
   "IPE policy" specifically allows for the authorization of fs-verity
   files using properties ``fsverity_digest`` for identifying
   files by their verity digest, and ``fsverity_signature`` to authorize
-  files with a verified fs-verity's built-in signature.
+  files with a verified fs-verity's built-in signature. For
+  details on configuring IPE policies and understanding its operational
+  modes, please refer to Documentation/admin-guide/LSM/ipe.rst.
 
 - Trusted userspace code in combination with `Built-in signature
   verification`_.  This approach should be used only with great care.
@@ -508,6 +510,7 @@ be carefully considered before using them:
   files with a verified fs-verity builtin signature to perform certain
   operations, such as execution. Note that IPE doesn't require
   fs.verity.require_signatures=1.
+  Please refer to Documentation/admin-guide/LSM/ipe.rst for more details.
 
 - A file's builtin signature can only be set at the same time that
   fs-verity is being enabled on the file.  Changing or deleting the
diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 59f8fc106cb0..3e0a7114a862 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -19,3 +19,4 @@ Security Documentation
    digsig
    landlock
    secrets/index
+   ipe
diff --git a/Documentation/security/ipe.rst b/Documentation/security/ipe.rst
new file mode 100644
index 000000000000..674827982a72
--- /dev/null
+++ b/Documentation/security/ipe.rst
@@ -0,0 +1,444 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Integrity Policy Enforcement (IPE) - Kernel Documentation
+=========================================================
+
+.. NOTE::
+
+   This is documentation targeted at developers, rather than administrators.
+   If you're looking for documentation on the usage of IPE, please see
+   Documentation/admin-guide/LSM/ipe.rst
+
+Historical Motivation
+---------------------
+
+The original issue that prompted IPE's implementation was the creation
+of a locked-down system. This system would be born-secure, and have
+strong integrity guarantees over both the executable code, and specific
+*data files* on the system, that were critical to its function. These
+specific data files would not be readable unless they passed integrity
+policy. A mandatory access control system would be present, and
+as a result, xattrs would have to be protected. This led to a selection
+of what would provide the integrity claims. At the time, there were two
+main mechanisms considered that could guarantee integrity for the system
+with these requirements:
+
+  1. IMA + EVM Signatures
+  2. DM-Verity
+
+Both options were carefully considered; however, the choice to use DM-Verity
+over IMA+EVM as the *integrity mechanism* in the original use case of IPE
+was due to three main reasons:
+
+  1. Protection of additional attack vectors:
+
+    * With IMA+EVM, without an encryption solution, the system is vulnerable
+      to an offline attack against the aforementioned specific data files.
+
+      Unlike executables, read operations (like those on the protected data
+      files), cannot be enforced to be globally integrity verified. This means
+      there must be some form of selector to determine whether a read should
+      enforce the integrity policy, or it should not.
+
+      At the time, this was done with mandatory access control labels. An IMA
+      policy would indicate what labels required integrity verification, which
+      presented an issue: EVM would protect the label, but if an attacker could
+      modify the filesystem offline, the attacker could wipe all the xattrs -
+      including the SELinux labels that would be used to determine whether the
+      file should be subject to integrity policy.
+
+      With DM-Verity, as the xattrs are saved as part of the Merkle tree, if
+      an offline mount occurs against the filesystem protected by dm-verity,
+      the checksum no longer matches and the file fails to be read.
+
+    * As userspace binaries are paged in Linux, dm-verity also offers
+      additional protection against a hostile block device. In such an attack,
+      the block device reports the appropriate content for the IMA hash
+      initially, passing the required integrity check. Then, on the page fault
+      that accesses the real data, it reports the attacker's payload. Since
+      dm-verity checks the data when the page fault (and the disk access)
+      occurs, this attack is mitigated.
+
+  2. Performance:
+
+    * dm-verity provides integrity verification on demand as blocks are
+      read, rather than requiring the entire file to be read into memory for
+      validation.
+
+  3. Simplicity of signing:
+
+    * No need for two signatures (IMA, then EVM): one signature covers
+      an entire block device.
+    * Signatures can be stored externally to the filesystem metadata.
+    * The signature supports an X.509-based signing infrastructure.
+
+The next step was to choose a *policy* to enforce the integrity mechanism.
+The minimum requirements for the policy were:
+
+  1. The policy itself must be integrity verified (preventing trivial
+     attack against it).
+  2. The policy itself must be resistant to rollback attacks.
+  3. The policy enforcement must have a permissive-like mode.
+  4. The policy must be able to be updated, in its entirety, without
+     a reboot.
+  5. Policy updates must be atomic.
+  6. The policy must support *revocations* of previously authored
+     components.
+  7. The policy must be auditable, at any point in time.
+
+IMA, as the only integrity policy mechanism at the time, was
+considered against this list of requirements, and did not fulfill
+all of the minimum requirements. Extending IMA to cover these
+requirements was considered, but ultimately discarded for
+two reasons:
+
+  1. Regression risk; many of these changes would result in
+     dramatic code changes to IMA, which is already present in the
+     kernel, and therefore might impact users.
+
+  2. IMA was used in the system for measurement and attestation;
+     separation of measurement policy from local integrity policy
+     enforcement was considered favorable.
+
+Due to these reasons, it was decided that a new LSM should be created,
+whose responsibility would be only the local integrity policy enforcement.
+
+Role and Scope
+--------------
+
+IPE, as its name implies, is fundamentally an integrity policy enforcement
+solution; IPE does not mandate how integrity is provided, but instead
+leaves that decision to the system administrator, who sets the security bar
+by selecting the mechanisms that suit their individual needs.
+There are several different integrity solutions that provide different
+levels of security guarantees, and IPE allows sysadmins to express policy for
+theoretically all of them.
+
+IPE does not have an inherent mechanism to ensure integrity on its own.
+Instead, there are more effective layers available for building systems that
+can guarantee integrity. It's important to note that the mechanism for proving
+integrity is independent of the policy for enforcing that integrity claim.
+
+Therefore, IPE was designed around:
+
+  1. Easy integration with integrity providers.
+  2. Ease of use for platform administrators/sysadmins.
+
+Design Rationale
+----------------
+
+IPE was designed after evaluating existing integrity policy solutions
+in other operating systems and environments. In this survey of other
+implementations, there were a few pitfalls identified:
+
+  1. Policies were not readable by humans, usually requiring a binary
+     intermediary format.
+  2. A single, non-customizable action was implicitly taken as a default.
+  3. Debugging the policy required manual steps to determine what rule was violated.
+  4. Authoring a policy required an in-depth knowledge of the larger system,
+     or operating system.
+
+IPE attempts to avoid all of these pitfalls.
+
+Policy
+~~~~~~
+
+Plain Text
+^^^^^^^^^^
+
+IPE's policy is plain-text. This introduces slightly larger policy files than
+other LSMs, but solves two major problems that occur with some integrity policy
+solutions on other platforms.
+
+The first issue is one of code maintenance and duplication. To author policies,
+the policy has to be some form of string representation (be it structured,
+through XML, JSON, YAML, etcetera), to allow the policy author to understand
+what is being written. In a hypothetical binary policy design, a serializer
+is necessary to write the policy from the human readable form, to the binary
+form, and a deserializer is needed to interpret the binary form into a data
+structure in the kernel.
+
+Eventually, another deserializer will be needed to transform the binary form
+back into the human-readable form with as much information preserved. This is because a
+user of this access control system will have to keep a lookup table of a checksum
+and the original file itself to try to understand what policies have been deployed
+on this system and what policies have not. For a single user, this may be alright,
+as old policies can be discarded almost immediately after the update takes hold.
+For users that manage computer fleets in the thousands, if not hundreds of thousands,
+with multiple different operating systems, and multiple different operational needs,
+this quickly becomes an issue, as stale policies from years ago may be present,
+quickly resulting in the need to recover the policy or fund extensive infrastructure
+to track what each policy contains.
+
+With three separate serializers/deserializers, maintenance becomes costly. If the
+policy avoids the binary format, only one is required: from the
+human-readable form to the data structure in the kernel, saving on code maintenance
+and retaining operability.
+
+The second issue with a binary format is one of transparency. As IPE controls
+access based on the trust of the system's resources, its policy must also be
+trusted to be changed. This is done through signatures, resulting in needing
+signing as a process. Signing, as a process, is typically done with a
+high security bar, as anything signed can be used to attack integrity
+enforcement systems. It is also important that, when signing something,
+the signer is aware of what they are signing. A binary policy can cause
+obfuscation of that fact; what signers see is an opaque binary blob. With a
+plain-text policy, on the other hand, the signers see the actual policy
+submitted for signing.
+
+Boot Policy
+~~~~~~~~~~~
+
+IPE, if configured appropriately, is able to enforce a policy as soon as a
+kernel is booted and usermode starts. That implies some level of storage
+of the policy to apply the minute usermode starts. Generally, that storage
+can be handled in one of three ways:
+
+  1. The policy file(s) live on disk and the kernel loads the policy prior
+     to any code path that would result in an enforcement decision.
+  2. The policy file(s) are passed by the bootloader to the kernel, which
+     parses the policy.
+  3. There is a policy file that is compiled into the kernel that is
+     parsed and enforced on initialization.
+
+The first option has problems: the kernel reading files from userspace
+is typically discouraged and very uncommon in the kernel.
+
+The second option also has problems: Linux supports a variety of bootloaders
+across its entire ecosystem - every bootloader would have to support this
+new methodology or there must be an independent source. It would likely
+result in more drastic changes to the kernel startup than necessary.
+
+The third option is the best, but it's important to be aware that the policy
+will take up space in the kernel image it's compiled into. It's important to
+keep this policy generalized enough that userspace can load a new, more
+complicated policy, but restrictive enough that it will not overauthorize
+and cause security issues.
+
+The initramfs provides a way that this bootup path can be established. The
+kernel starts with a minimal policy that trusts the initramfs only. Inside the
+initramfs, once the real rootfs is mounted but before control is transferred to
+it, the initramfs deploys and activates a policy that trusts the new root filesystem.
+This prevents overauthorization at any step, and keeps the kernel policy
+to a minimal size.
+
+Startup
+^^^^^^^
+
+Not every system, however, starts with an initramfs, so the startup policy
+compiled into the kernel will need some flexibility to express how trust
+is established for the next phase of the bootup. To this end, if we just
+make the compiled-in policy a full IPE policy, it allows system builders
+to express the first stage bootup requirements appropriately.
+
+Updatable, Rebootless Policy
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Requirements change over time (vulnerabilities are found in previously
+trusted applications, keys roll, etcetera). Updating a kernel to
+meet those security goals is not always a suitable option, as updates are not
+always risk-free, and blocking a security update leaves systems vulnerable.
+This means IPE requires a policy that can be completely updated (allowing
+revocations of existing policy) from a source external to the kernel (allowing
+policies to be updated without updating the kernel).
+
+Additionally, since the kernel is stateless between invocations, and reading
+policy files off the disk from kernel space is a bad idea(tm), the
+policy updates have to be done rebootlessly.
+
+An update from an external source could potentially be malicious,
+so this policy needs to have a way to be identified as trusted. This is
+done via a signature chained to a trust source in the kernel. Arbitrarily,
+this is the ``SYSTEM_TRUSTED_KEYRING``, a keyring that is initially
+populated at kernel compile-time, as this matches the expectation that the
+author of the compiled-in policy described above is the same entity that can
+deploy policy updates.
+
+Anti-Rollback / Anti-Replay
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Over time, vulnerabilities are found and trusted resources may not be
+trusted anymore. IPE's policy has no exception to this. There can be
+instances where a mistaken policy author deploys an insecure policy,
+before correcting it with a secure policy.
+
+Assuming that an attacker acquires the insecure policy as soon as it is
+signed, IPE needs a way to prevent rollback from the secure policy update
+to the insecure policy update.
+
+Initially, IPE's policy can have a policy_version that states the
+minimum required version across all policies that can be active on
+the system. This will prevent rollback while the system is live.
+
+.. WARNING::
+
+  However, since the kernel is stateless across boots, this policy
+  version will be reset to 0.0.0 on the next boot. System builders
+  need to be aware of this, and ensure the new secure policies are
+  deployed ASAP after a boot to ensure that the window of
+  opportunity is minimal for an attacker to deploy the insecure policy.
+
+Implicit Actions:
+~~~~~~~~~~~~~~~~~
+
+The issue of implicit actions only becomes visible when you consider
+a mixed level of security bars across multiple operations in a system.
+For example, consider a system that has strong integrity guarantees
+over both the executable code and specific *data files* on the system
+that are critical to its function. In this system, three types of policies
+are possible:
+
+  1. A policy in which failure to match any rules in the policy results
+     in the action being denied.
+  2. A policy in which failure to match any rules in the policy results
+     in the action being allowed.
+  3. A policy in which the action taken when no rules are matched is
+     specified by the policy author.
+
+The first option could make a policy like this::
+
+  op=EXECUTE integrity_verified=YES action=ALLOW
+
+In the example system, this works well for the executables, as all
+executables should have integrity guarantees, without exception. The
+issue arises with the second requirement about specific data files.
+This would result in a policy like this (assuming each line is
+evaluated in order)::
+
+  op=EXECUTE integrity_verified=YES action=ALLOW
+
+  op=READ integrity_verified=NO label=critical_t action=DENY
+  op=READ action=ALLOW
+
+This is somewhat clear if you read the docs, understand the policy
+is executed in order and that the default is a denial; however, the
+last line effectively changes that default to an ALLOW. This is
+required, because in a realistic system, there are some unverified
+reads (imagine appending to a log file).
+
+The second option, matching no rules results in an allow, is clearer
+for the specific data files::
+
+  op=READ integrity_verified=NO label=critical_t action=DENY
+
+And, like the first option, falls short with the opposite scenario,
+effectively needing to override the default::
+
+  op=EXECUTE integrity_verified=YES action=ALLOW
+  op=EXECUTE action=DENY
+
+  op=READ integrity_verified=NO label=critical_t action=DENY
+
+This leaves the third option. Instead of making users be clever
+and override the default with an empty rule, force the end-user
+to consider what the appropriate default should be for their
+scenario and explicitly state it::
+
+  DEFAULT op=EXECUTE action=DENY
+  op=EXECUTE integrity_verified=YES action=ALLOW
+
+  DEFAULT op=READ action=ALLOW
+  op=READ integrity_verified=NO label=critical_t action=DENY
+
+Policy Debugging:
+~~~~~~~~~~~~~~~~~
+
+When developing a policy, it is useful to know what line of the policy
+is being violated to reduce debugging costs; narrowing the scope of the
+investigation to the exact line that resulted in the action. Some integrity
+policy systems do not provide this information, instead providing the
+information that was used in the evaluation. This then requires a correlation
+with the policy to evaluate what went wrong.
+
+Instead, IPE just emits the rule that was matched. This limits the scope
+of the investigation to the exact policy line (in the case of a specific
+rule), or the section (in the case of a DEFAULT). This decreases iteration
+and investigation times when policy failures are observed while evaluating
+policies.
+
+IPE's policy engine is also designed so that it is obvious to a human
+how to investigate a policy failure. Each line is evaluated in
+the sequence that is written, so the algorithm is simple enough for
+humans to recreate the steps that could have caused the failure. In other
+surveyed systems, optimizations occur (sorting rules, for instance) when loading
+the policy. In those systems, it requires multiple steps to debug, and the
+algorithm may not always be clear to the end-user without reading the code first.
+
+Simplified Policy:
+~~~~~~~~~~~~~~~~~~
+
+Finally, IPE's policy is designed for sysadmins, not kernel developers. Instead
+of covering individual LSM hooks (or syscalls), IPE covers operations. This means
+instead of sysadmins needing to know that the syscalls ``mmap``, ``mprotect``,
+``execve``, and ``uselib`` must have rules protecting them, they must simply know
+that they want to restrict code execution. This limits the number of bypasses that
+could occur due to a lack of knowledge of the underlying system, whereas the
+maintainers of IPE, being kernel developers, can make the correct choice to determine
+whether something maps to these operations, and under what conditions.
+
+Implementation Notes
+--------------------
+
+Anonymous Memory
+~~~~~~~~~~~~~~~~
+
+Anonymous memory isn't treated any differently from any other access in IPE.
+When anonymous memory is mapped with ``+X``, it still comes into the ``file_mmap``
+or ``file_mprotect`` hook, but with a ``NULL`` file object. This is submitted to
+the evaluation, like any other file; however, all current trust mechanisms will
+return false as there is nothing to evaluate. This means anonymous memory
+execution is subject to whatever the ``DEFAULT`` is for ``EXECUTE``.
+
+.. WARNING::
+
+  This also occurs with the ``kernel_load_data`` hook, which is used by signed
+  and compressed kernel modules. Using signed and compressed kernel modules with
+  IPE will always result in the ``DEFAULT`` action for ``KMODULE``.
+
+Securityfs Interface
+~~~~~~~~~~~~~~~~~~~~
+
+The per-policy securityfs tree is somewhat unique. For example, for
+a standard securityfs policy tree::
+
+  MyPolicy
+    |- active
+    |- delete
+    |- name
+    |- pkcs7
+    |- policy
+    |- update
+    |- version
+
+The policy is stored in the ``->i_private`` data of the MyPolicy inode.
+
+Tests
+-----
+
+IPE has KUnit Tests for the policy parser. Recommended kunitconfig::
+
+  CONFIG_KUNIT=y
+  CONFIG_SECURITY=y
+  CONFIG_SECURITYFS=y
+  CONFIG_PKCS7_MESSAGE_PARSER=y
+  CONFIG_SYSTEM_DATA_VERIFICATION=y
+  CONFIG_FS_VERITY=y
+  CONFIG_FS_VERITY_BUILTIN_SIGNATURES=y
+  CONFIG_BLOCK=y
+  CONFIG_MD=y
+  CONFIG_BLK_DEV_DM=y
+  CONFIG_DM_VERITY=y
+  CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
+  CONFIG_NET=y
+  CONFIG_AUDIT=y
+  CONFIG_AUDITSYSCALL=y
+  CONFIG_BLK_DEV_INITRD=y
+
+  CONFIG_SECURITY_IPE=y
+  CONFIG_IPE_PROP_DM_VERITY=y
+  CONFIG_IPE_PROP_FS_VERITY=y
+  CONFIG_SECURITY_IPE_KUNIT_TEST=y
+
+In addition, IPE has a Python-based integration
+`test suite <https://github.com/microsoft/ipe/tree/test-suite>`_ that
+can test both user interfaces and enforcement functionalities.
-- 
2.44.0
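
The anti-rollback behaviour described in the "Anti-Rollback / Anti-Replay"
section of the document above can be sketched in userspace C. This is only an
illustrative model, not the IPE implementation: the struct layout, the
function names, and the choice to raise the minimum version on every accepted
update are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a policy_version triple (major.minor.rev). */
struct policy_version {
	uint16_t major;
	uint16_t minor;
	uint16_t rev;
};

/* Returns <0, 0 or >0 when a is older than, equal to, or newer than b. */
int version_cmp(const struct policy_version *a,
		const struct policy_version *b)
{
	if (a->major != b->major)
		return a->major < b->major ? -1 : 1;
	if (a->minor != b->minor)
		return a->minor < b->minor ? -1 : 1;
	if (a->rev != b->rev)
		return a->rev < b->rev ? -1 : 1;
	return 0;
}

/*
 * Assumed rule: a candidate older than the running minimum is refused.
 * An accepted candidate becomes the new minimum, so later rollbacks to
 * anything older are rejected while the system stays up.
 */
bool policy_update_allowed(struct policy_version *min_version,
			   const struct policy_version *candidate)
{
	if (version_cmp(candidate, min_version) < 0)
		return false;
	*min_version = *candidate;
	return true;
}
```

Starting from a minimum of 0.0.0, as after a fresh boot, any signed policy is
accepted, which is the window the WARNING in the document refers to. Once a
2.0.0 policy has been accepted, a 1.9.9 policy is refused until the next boot.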


^ permalink raw reply related	[relevance 12%]
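
The in-order, first-match evaluation with an explicit per-operation DEFAULT,
as described in the "Implicit Actions" and "Policy Debugging" sections of the
document above, can be sketched as a small userspace C model. It is not the
IPE policy engine: all type and function names are hypothetical, and
integrity_verified and label are the illustrative properties used in the
document's examples, not real IPE properties.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum ipe_op { OP_EXECUTE, OP_READ, OP_MAX };
enum ipe_action { ACTION_ALLOW, ACTION_DENY };
enum tristate { ANY, NO, YES };	/* ANY: property not named in the rule */

struct request {
	enum ipe_op op;
	int integrity_verified;		/* 0 or 1 */
	const char *label;		/* may be NULL */
};

struct rule {
	enum ipe_op op;
	enum tristate integrity_verified;
	const char *label;		/* NULL: label not named in the rule */
	enum ipe_action action;
};

struct policy {
	enum ipe_action defaults[OP_MAX];	/* one DEFAULT per operation */
	const struct rule *rules;
	size_t nrules;
};

/* A rule matches when every property it names agrees with the request. */
int rule_matches(const struct rule *r, const struct request *req)
{
	if (r->op != req->op)
		return 0;
	if (r->integrity_verified != ANY &&
	    (r->integrity_verified == YES) != (req->integrity_verified != 0))
		return 0;
	if (r->label && (!req->label || strcmp(r->label, req->label) != 0))
		return 0;
	return 1;
}

/* Rules are tried in written order; the first match decides. */
enum ipe_action evaluate(const struct policy *p, const struct request *req)
{
	for (size_t i = 0; i < p->nrules; i++)
		if (rule_matches(&p->rules[i], req))
			return p->rules[i].action;
	return p->defaults[req->op];
}

/* The "third option" example policy from the document. */
const struct rule example_rules[] = {
	{ OP_EXECUTE, YES, NULL, ACTION_ALLOW },
	{ OP_READ, NO, "critical_t", ACTION_DENY },
};

const struct policy example_policy = {
	.defaults = { [OP_EXECUTE] = ACTION_DENY, [OP_READ] = ACTION_ALLOW },
	.rules = example_rules,
	.nrules = sizeof(example_rules) / sizeof(example_rules[0]),
};
```

Because the loop returns at the first matching rule and otherwise falls back
to the operation's DEFAULT, the rule (or DEFAULT) responsible for a decision
is always a single, identifiable policy line, which is the debugging property
the document argues for.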

* [PATCH v17 15/21] security: add security_inode_setintegrity() hook
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (13 preceding siblings ...)
  2024-04-13  0:55 32% ` [PATCH v17 14/21] ipe: add support for dm-verity as a trust provider Fan Wu
@ 2024-04-13  0:55 65% ` Fan Wu
  2024-04-13  0:55 45% ` [PATCH v17 16/21] fsverity: expose verified fsverity built-in signatures to LSMs Fan Wu
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

This patch introduces a new hook to save inode's integrity
data. For example, for fsverity enabled files, LSMs can use this hook to
save the verified fsverity builtin signature into the inode's security
blob, and LSMs can make access decisions based on the data inside
the signature, like the signer certificate.

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v14:
  + Not present

v15:
  + Introduced

v16:
  + Switch to call_int_hook()

v17:
  + Fix a typo
---
 include/linux/lsm_hook_defs.h |  2 ++
 include/linux/security.h      | 10 ++++++++++
 security/security.c           | 20 ++++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index b391a7f13053..6f746dfdb28b 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -177,6 +177,8 @@ LSM_HOOK(int, 0, inode_listsecurity, struct inode *inode, char *buffer,
 LSM_HOOK(void, LSM_RET_VOID, inode_getsecid, struct inode *inode, u32 *secid)
 LSM_HOOK(int, 0, inode_copy_up, struct dentry *src, struct cred **new)
 LSM_HOOK(int, -EOPNOTSUPP, inode_copy_up_xattr, const char *name)
+LSM_HOOK(int, 0, inode_setintegrity, struct inode *inode,
+	 enum lsm_integrity_type type, const void *value, size_t size)
 LSM_HOOK(int, 0, kernfs_init_security, struct kernfs_node *kn_dir,
 	 struct kernfs_node *kn)
 LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
diff --git a/include/linux/security.h b/include/linux/security.h
index 9e46b13a356c..703762b0c4ad 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -404,6 +404,9 @@ int security_inode_listsecurity(struct inode *inode, char *buffer, size_t buffer
 void security_inode_getsecid(struct inode *inode, u32 *secid);
 int security_inode_copy_up(struct dentry *src, struct cred **new);
 int security_inode_copy_up_xattr(const char *name);
+int security_inode_setintegrity(struct inode *inode,
+				enum lsm_integrity_type type, const void *value,
+				size_t size);
 int security_kernfs_init_security(struct kernfs_node *kn_dir,
 				  struct kernfs_node *kn);
 int security_file_permission(struct file *file, int mask);
@@ -1020,6 +1023,13 @@ static inline int security_inode_copy_up(struct dentry *src, struct cred **new)
 	return 0;
 }
 
+static inline int security_inode_setintegrity(struct inode *inode,
+					      enum lsm_integrity_type type,
+					      const void *value, size_t size)
+{
+	return 0;
+}
+
 static inline int security_kernfs_init_security(struct kernfs_node *kn_dir,
 						struct kernfs_node *kn)
 {
diff --git a/security/security.c b/security/security.c
index 3a7724c3dd76..2c20635a589b 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2681,6 +2681,26 @@ int security_inode_copy_up_xattr(const char *name)
 }
 EXPORT_SYMBOL(security_inode_copy_up_xattr);
 
+/**
+ * security_inode_setintegrity() - Set the inode's integrity data
+ * @inode: inode
+ * @type: type of integrity, e.g. hash digest, signature, etc
+ * @value: the integrity value
+ * @size: size of the integrity value
+ *
+ * Register a verified integrity measurement of an inode with LSMs.
+ * LSMs should free the previously saved data if @value is NULL.
+ *
+ * Return: Returns 0 on success, negative values on failure.
+ */
+int security_inode_setintegrity(struct inode *inode,
+				enum lsm_integrity_type type, const void *value,
+				size_t size)
+{
+	return call_int_hook(inode_setintegrity, inode, type, value, size);
+}
+EXPORT_SYMBOL(security_inode_setintegrity);
+
 /**
  * security_kernfs_init_security() - Init LSM context for a kernfs node
  * @kn_dir: parent kernfs node
-- 
2.44.0


^ permalink raw reply related	[relevance 65%]
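
The contract in the kernel-doc of security_inode_setintegrity() above (save
the value, and free any previously saved data when @value is NULL) can be
illustrated with a userspace model of an LSM-side handler. This is a sketch
under stated assumptions, not code from the series: struct demo_inode_blob
and demo_inode_setintegrity() are hypothetical names, and malloc()/free()
stand in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the per-inode LSM security blob; hypothetical layout. */
struct demo_inode_blob {
	void *sig;
	size_t sig_len;
};

/*
 * Model of an inode_setintegrity handler for a single integrity type.
 * A non-NULL value is copied into the blob, replacing older data.
 * A NULL value only drops what was saved before.
 */
int demo_inode_setintegrity(struct demo_inode_blob *blob,
			    const void *value, size_t size)
{
	void *copy = NULL;

	if (value && size) {
		copy = malloc(size);
		if (!copy)
			return -ENOMEM;
		memcpy(copy, value, size);
	}

	free(blob->sig);	/* free(NULL) is a no-op */
	blob->sig = copy;
	blob->sig_len = copy ? size : 0;
	return 0;
}
```

Allocating the copy before freeing the old data means a failed allocation
leaves the previously saved value untouched.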

* [PATCH v17 16/21] fsverity: expose verified fsverity built-in signatures to LSMs
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (14 preceding siblings ...)
  2024-04-13  0:55 65% ` [PATCH v17 15/21] security: add security_inode_setintegrity() hook Fan Wu
@ 2024-04-13  0:55 45% ` Fan Wu
  2024-04-13  0:56 39% ` [PATCH v17 17/21] ipe: enable support for fs-verity as a trust provider Fan Wu
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu, Deven Bowers

This patch enhances fsverity's capabilities to support both integrity and
authenticity protection by introducing the consumption of built-in
signatures through a new LSM hook. This functionality allows LSMs,
e.g. IPE, to enforce policies based on the authenticity and integrity of
files, specifically focusing on built-in fsverity signatures. It enables
a policy enforcement layer within LSMs for fsverity, offering granular
control over the usage of authenticity claims. For instance, a policy
could be established to permit the execution of all files with verified
built-in fsverity signatures while restricting kernel module loading
from specified fsverity files via fsverity digests.

The introduction of a security_inode_setintegrity() hook call within
fsverity's workflow ensures that the verified built-in signature of a file
is exposed to LSMs. This enables LSMs to recognize and label fsverity files
that contain a verified built-in fsverity signature. This hook is invoked
subsequent to the fsverity_verify_signature() process, guaranteeing the
signature's verification against fsverity's keyring. This mechanism is
crucial for maintaining system security, as it operates in kernel space,
effectively thwarting attempts by malicious binaries to bypass user space
stack interactions.

The second to last commit in this patch set will add a link to the IPE
documentation in fsverity.rst.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v6:
  + Not present

v7:
  + Introduced

v8:
  + Split fs/verity/ changes and security/ changes into separate patches
  + Change signature of fsverity_create_info to accept non-const inode
  + Change signature of fsverity_verify_signature to accept non-const inode
  + Don't cast-away const from inode.
  + Digest functionality dropped in favor of:
    ("fs-verity: define a function to return the integrity protected
      file digest")
  + Reworded commit description and title to match changes.
  + Fix a bug wherein no LSM implements the particular fsverity @name
    (or LSM is disabled), and returns -EOPNOTSUPP, causing errors.

v9:
  + No changes

v10:
  + Rename the signature blob key
  + Cleanup redundant code
  + Make the hook call depends on CONFIG_FS_VERITY_BUILTIN_SIGNATURES

v11:
  + No changes

v12:
  + Add constification to the hook call

v13:
  + No changes

v14:
  + Add doc/comment to built-in signature verification

v15:
  + Add more docs related to IPE
  + Switch the hook call to security_inode_setintegrity()

v16:
  + Explicitly mention "fsverity builtin signatures" in the commit
    message
  + Amend documentation in fsverity.rst
  + Fix format issue
  + Change enum name

v17:
  + Fix various documentation issues
  + Use new enum name LSM_INT_FSVERITY_BUILTINSIG_VALID
---
 Documentation/filesystems/fsverity.rst | 23 +++++++++++++++++++++--
 fs/verity/fsverity_private.h           |  2 +-
 fs/verity/open.c                       | 24 +++++++++++++++++++++++-
 fs/verity/signature.c                  |  6 +++++-
 include/linux/security.h               |  1 +
 5 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 13e4b18e5dbb..362b7a5dc300 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -86,6 +86,14 @@ authenticating fs-verity file hashes include:
   signature in their "security.ima" extended attribute, as controlled
   by the IMA policy.  For more information, see the IMA documentation.
 
+- Integrity Policy Enforcement (IPE).  IPE supports enforcing access
+  control decisions based on immutable security properties of files,
+  including those protected by fs-verity's built-in signatures.
+  "IPE policy" specifically allows for the authorization of fs-verity
+  files using properties ``fsverity_digest`` for identifying
+  files by their verity digest, and ``fsverity_signature`` to authorize
+  files with a verified fs-verity built-in signature.
+
 - Trusted userspace code in combination with `Built-in signature
   verification`_.  This approach should be used only with great care.
 
@@ -457,7 +465,11 @@ Enabling this option adds the following:
    On success, the ioctl persists the signature alongside the Merkle
    tree.  Then, any time the file is opened, the kernel verifies the
    file's actual digest against this signature, using the certificates
-   in the ".fs-verity" keyring.
+   in the ".fs-verity" keyring. This verification happens as long as the
+   file's signature exists, regardless of the state of the sysctl variable
+   "fs.verity.require_signatures" described in the next item. The IPE LSM
+   relies on this behavior to recognize and label fsverity files
+   that contain a verified built-in fsverity signature.
 
 3. A new sysctl "fs.verity.require_signatures" is made available.
    When set to 1, the kernel requires that all verity files have a
@@ -481,7 +493,7 @@ be carefully considered before using them:
 
 - Builtin signature verification does *not* make the kernel enforce
   that any files actually have fs-verity enabled.  Thus, it is not a
-  complete authentication policy.  Currently, if it is used, the only
+  complete authentication policy.  Currently, if it is used, one
   way to complete the authentication policy is for trusted userspace
   code to explicitly check whether files have fs-verity enabled with a
   signature before they are accessed.  (With
@@ -490,6 +502,13 @@ be carefully considered before using them:
   could just store the signature alongside the file and verify it
   itself using a cryptographic library, instead of using this feature.
 
+- Another approach is to utilize fs-verity builtin signature
+  verification in conjunction with the IPE LSM, which supports defining
+  a kernel-enforced, system-wide authentication policy that allows only
+  files with a verified fs-verity builtin signature to perform certain
+  operations, such as execution. Note that IPE doesn't require
+  fs.verity.require_signatures=1.
+
 - A file's builtin signature can only be set at the same time that
   fs-verity is being enabled on the file.  Changing or deleting the
   builtin signature later requires re-creating the file.
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index b3506f56e180..a0e786c611c9 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -117,7 +117,7 @@ int fsverity_init_merkle_tree_params(struct merkle_tree_params *params,
 				     unsigned int log_blocksize,
 				     const u8 *salt, size_t salt_size);
 
-struct fsverity_info *fsverity_create_info(const struct inode *inode,
+struct fsverity_info *fsverity_create_info(struct inode *inode,
 					   struct fsverity_descriptor *desc);
 
 void fsverity_set_info(struct inode *inode, struct fsverity_info *vi);
diff --git a/fs/verity/open.c b/fs/verity/open.c
index fdeb95eca3af..e04fffa6b274 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -8,6 +8,7 @@
 #include "fsverity_private.h"
 
 #include <linux/mm.h>
+#include <linux/security.h>
 #include <linux/slab.h>
 
 static struct kmem_cache *fsverity_info_cachep;
@@ -172,12 +173,29 @@ static int compute_file_digest(const struct fsverity_hash_alg *hash_alg,
 	return err;
 }
 
+#ifdef CONFIG_FS_VERITY_BUILTIN_SIGNATURES
+static int fsverity_inode_setintegrity(struct inode *inode,
+				       const struct fsverity_descriptor *desc)
+{
+	return security_inode_setintegrity(inode,
+					   LSM_INT_FSVERITY_BUILTINSIG_VALID,
+					   desc->signature,
+					   le32_to_cpu(desc->sig_size));
+}
+#else
+static inline int fsverity_inode_setintegrity(struct inode *inode,
+					      const struct fsverity_descriptor *desc)
+{
+	return 0;
+}
+#endif /* CONFIG_FS_VERITY_BUILTIN_SIGNATURES */
+
 /*
  * Create a new fsverity_info from the given fsverity_descriptor (with optional
  * appended builtin signature), and check the signature if present.  The
  * fsverity_descriptor must have already undergone basic validation.
  */
-struct fsverity_info *fsverity_create_info(const struct inode *inode,
+struct fsverity_info *fsverity_create_info(struct inode *inode,
 					   struct fsverity_descriptor *desc)
 {
 	struct fsverity_info *vi;
@@ -241,6 +259,10 @@ struct fsverity_info *fsverity_create_info(const struct inode *inode,
 		}
 	}
 
+	err = fsverity_inode_setintegrity(inode, desc);
+	if (err)
+		goto fail;
+
 	return vi;
 
 fail:
diff --git a/fs/verity/signature.c b/fs/verity/signature.c
index 90c07573dd77..fd60e9704e78 100644
--- a/fs/verity/signature.c
+++ b/fs/verity/signature.c
@@ -41,7 +41,11 @@ static struct key *fsverity_keyring;
  * @sig_size: size of signature in bytes, or 0 if no signature
  *
  * If the file includes a signature of its fs-verity file digest, verify it
- * against the certificates in the fs-verity keyring.
+ * against the certificates in the fs-verity keyring. Note that signatures
+ * are verified regardless of the state of the 'fsverity_require_signatures'
+ * variable and the LSM subsystem relies on this behavior to help enforce
+ * file integrity policies. Please discuss changes with the LSM list
+ * (thank you!).
  *
  * Return: 0 on success (signature valid or not required); -errno on failure
  */
diff --git a/include/linux/security.h b/include/linux/security.h
index 703762b0c4ad..2d6752333d6f 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -86,6 +86,7 @@ enum lsm_event {
 enum lsm_integrity_type {
 	LSM_INT_DMVERITY_SIG_VALID,
 	LSM_INT_DMVERITY_ROOTHASH,
+	LSM_INT_FSVERITY_BUILTINSIG_VALID,
 };
 
 /*
-- 
2.44.0


^ permalink raw reply related	[relevance 45%]

* [PATCH v17 12/21] dm: add finalize hook to target_type
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (10 preceding siblings ...)
  2024-04-13  0:55 44% ` [PATCH v17 11/21] block,lsm: add LSM blob and new LSM hooks for block device Fan Wu
@ 2024-04-13  0:55 70% ` Fan Wu
  2024-04-13  0:55 50% ` [PATCH v17 13/21] dm verity: consume root hash digest and expose signature data via LSM hook Fan Wu
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

This patch adds a target finalize hook.

The hook is triggered just before activating an inactive table of a
mapped device. If it returns an error, the __bind gets cancelled.

The dm-verity target will use this hook to attach the dm-verity's
roothash metadata to the block_device struct of the mapped device.

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v1-v10:
  + Not present

v11:
  + Introduced

v12:
  + No changes

v13:
  + No changes

v14:
  + Add documentation

v15:
  + No changes

v16:
  + No changes

v17:
  + No changes
---
 drivers/md/dm.c               | 12 ++++++++++++
 include/linux/device-mapper.h |  9 +++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 56aa2a8b9d71..16d3fd644176 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2270,6 +2270,18 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		goto out;
 	}
 
+	for (unsigned int i = 0; i < t->num_targets; i++) {
+		struct dm_target *ti = dm_table_get_target(t, i);
+
+		if (ti->type->finalize) {
+			ret = ti->type->finalize(ti);
+			if (ret) {
+				old_map = ERR_PTR(ret);
+				goto out;
+			}
+		}
+	}
+
 	old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
 	rcu_assign_pointer(md->map, (void *)t);
 	md->immutable_target_type = dm_table_get_immutable_target_type(t);
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 82b2195efaca..ad368904b1d5 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -160,6 +160,14 @@ typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
  */
 typedef size_t (*dm_dax_recovery_write_fn)(struct dm_target *ti, pgoff_t pgoff,
 		void *addr, size_t bytes, struct iov_iter *i);
+/*
+ * This hook allows DM targets in an inactive table to complete their setup
+ * before the table is made active.
+ * Returns:
+ *  < 0 : error
+ *  = 0 : success
+ */
+typedef int (*dm_finalize_fn) (struct dm_target *target);
 
 void dm_error(const char *message);
 
@@ -210,6 +218,7 @@ struct target_type {
 	dm_dax_direct_access_fn direct_access;
 	dm_dax_zero_page_range_fn dax_zero_page_range;
 	dm_dax_recovery_write_fn dax_recovery_write;
+	dm_finalize_fn finalize;
 
 	/* For internal device-mapper use. */
 	struct list_head list;
-- 
2.44.0


^ permalink raw reply related	[relevance 70%]
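
The control flow that the dm.c hunk above adds to __bind() (call each
target's optional finalize hook and abort on the first error) can be modelled
in userspace as follows. The demo_* names are hypothetical and only mirror the
shape of dm_finalize_fn; this is a sketch of the loop, not device-mapper code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_target;

/* Same shape as dm_finalize_fn: < 0 on error, 0 on success. */
typedef int (*demo_finalize_fn)(struct demo_target *target);

struct demo_target {
	demo_finalize_fn finalize;	/* optional, may be NULL */
	int finalized;
};

int demo_finalize_ok(struct demo_target *t)
{
	t->finalized = 1;
	return 0;
}

int demo_finalize_fail(struct demo_target *t)
{
	(void)t;
	return -EIO;
}

/*
 * Walk the targets of a not-yet-active table in order. Targets without a
 * hook are skipped. The first failure stops the walk and is returned, so
 * the caller can refuse to activate the table.
 */
int demo_finalize_table(struct demo_target *targets, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (targets[i].finalize) {
			int ret = targets[i].finalize(&targets[i]);

			if (ret)
				return ret;
		}
	}
	return 0;
}
```

In this model, targets finalized before the failing one are not rolled back
and targets after it are never called.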

* [PATCH v17 13/21] dm verity: consume root hash digest and expose signature data via LSM hook
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (11 preceding siblings ...)
  2024-04-13  0:55 70% ` [PATCH v17 12/21] dm: add finalize hook to target_type Fan Wu
@ 2024-04-13  0:55 50% ` Fan Wu
    2024-04-13  0:55 32% ` [PATCH v17 14/21] ipe: add support for dm-verity as a trust provider Fan Wu
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

dm-verity provides a strong guarantee of a block device's integrity. As
a generic way to check the integrity of a block device, it provides
those integrity guarantees to its higher layers, including the filesystem
level.

An LSM that controls access to a resource on the system based on the
available integrity claims can use this transitive property of
dm-verity, by querying the underlying block_device of a particular
file.

The digest and signature information need to be stored in the block
device to fulfill the next requirement of authorization via LSM policy.
This will enable the LSM to perform revocation of devices that are still
mounted, prohibiting execution of files that are no longer authorized
by the LSM in question.

This patch adds two security hook calls in dm-verity to save the
dm-verity roothash and the roothash signature to the block device's
LSM blobs. The hook calls depend on CONFIG_SECURITY.
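
The two hook calls and the rollback between them can be sketched in
userspace as follows. This is only an illustrative model: fake_bdev,
fake_setintegrity() and fake_finalize() are hypothetical stand-ins, not
kernel symbols, and the fail_on field exists solely to simulate an LSM
returning an error.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum integrity_type { INT_ROOTHASH, INT_SIG_VALID, INT_TYPE_COUNT };

struct fake_bdev {
	const void *slot[INT_TYPE_COUNT];	/* models the per-device LSM blob */
	size_t size[INT_TYPE_COUNT];
	int fail_on;				/* type that should fail, or -1 */
};

/* Models security_bdev_setintegrity(): a NULL value clears the slot. */
static int fake_setintegrity(struct fake_bdev *bdev, enum integrity_type type,
			     const void *value, size_t size)
{
	if (value && bdev->fail_on == (int)type)
		return -ENOMEM;

	bdev->slot[type] = value;
	bdev->size[type] = value ? size : 0;
	return 0;
}

/* Mirrors the control flow of verity_finalize() in this patch. */
static int fake_finalize(struct fake_bdev *bdev,
			 const void *digest, size_t digest_len,
			 const void *sig, size_t sig_len)
{
	int r;

	r = fake_setintegrity(bdev, INT_ROOTHASH, digest, digest_len);
	if (r)
		return r;

	r = fake_setintegrity(bdev, INT_SIG_VALID, sig, sig_len);
	if (r) {
		/* Roll back the root hash so no partial state survives. */
		fake_setintegrity(bdev, INT_ROOTHASH, NULL, 0);
		return r;
	}

	return 0;
}
```

The point of the rollback is that an LSM never sees a device with a
recorded root hash but a failed signature registration.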

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
---
v2:
  + No Changes

v3:
  + No changes

v4:
  + No changes

v5:
  + No changes

v6:
  + Fix an improper cleanup that can result in
    a leak

v7:
  + Squash patch 08/12, 10/12 to [11/16]
  + Use part0 for block_device, to retrieve the block_device, when
    calling security_bdev_setsecurity

v8:
  + Undo squash of 08/12, 10/12 - separating drivers/md/ from
    security/ & block/
  + Use common-audit function for dmverity_signature.
  + Change implementation for storing the dm-verity digest to use the
    newly introduced dm_verity_digest structure introduced in patch
    14/20.
  + Create new structure, dm_verity_digest, containing digest algorithm,
    size, and digest itself to pass to the LSM layer. V7 was missing the
    algorithm.
  + Create an associated public header containing this new structure and
    the key values for the LSM hook, specific to dm-verity.
  + Additional information added to commit, discussing the layering of
    the changes and how the information passed will be used.

v9:
  + No changes

v10:
  + No changes

v11:
  + Add an optional field to save signature
  + Move the security hook call to the new finalize hook

v12:
  + No changes

v13:
  + No changes

v14:
  + Correct code format
  + Remove unnecessary header and switch to dm_disk()

v15:
  + Refactor security_bdev_setsecurity() to security_bdev_setintegrity()
  + Remove unnecessary headers

v16:
  + Use kmemdup to duplicate signature
  + Clean up lsm blob data in error case

v17:
  + Switch to depend on CONFIG_SECURITY
  + Use new enum name LSM_INT_DMVERITY_SIG_VALID
---
 drivers/md/dm-verity-target.c | 83 +++++++++++++++++++++++++++++++++++
 drivers/md/dm-verity.h        |  6 +++
 include/linux/dm-verity.h     | 12 +++++
 include/linux/security.h      |  3 +-
 4 files changed, 103 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/dm-verity.h

diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index bb5da66da4c1..fbb83c6fd99c 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -22,6 +22,8 @@
 #include <linux/scatterlist.h>
 #include <linux/string.h>
 #include <linux/jump_label.h>
+#include <linux/security.h>
+#include <linux/dm-verity.h>
 
 #define DM_MSG_PREFIX			"verity"
 
@@ -1017,6 +1019,38 @@ static void verity_io_hints(struct dm_target *ti, struct queue_limits *limits)
 	blk_limits_io_min(limits, limits->logical_block_size);
 }
 
+#ifdef CONFIG_SECURITY
+
+static int verity_init_sig(struct dm_verity *v, const void *sig,
+			   size_t sig_size)
+{
+	v->sig_size = sig_size;
+	v->root_digest_sig = kmemdup(sig, v->sig_size, GFP_KERNEL);
+	if (!v->root_digest_sig)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void verity_free_sig(struct dm_verity *v)
+{
+	kfree(v->root_digest_sig);
+}
+
+#else
+
+static inline int verity_init_sig(struct dm_verity *v, const void *sig,
+				  size_t sig_size)
+{
+	return 0;
+}
+
+static inline void verity_free_sig(struct dm_verity *v)
+{
+}
+
+#endif /* CONFIG_SECURITY */
+
 static void verity_dtr(struct dm_target *ti)
 {
 	struct dm_verity *v = ti->private;
@@ -1035,6 +1069,7 @@ static void verity_dtr(struct dm_target *ti)
 	kfree(v->salt);
 	kfree(v->root_digest);
 	kfree(v->zero_digest);
+	verity_free_sig(v);
 
 	if (v->tfm)
 		crypto_free_ahash(v->tfm);
@@ -1434,6 +1469,13 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		ti->error = "Root hash verification failed";
 		goto bad;
 	}
+
+	r = verity_init_sig(v, verify_args.sig, verify_args.sig_size);
+	if (r < 0) {
+		ti->error = "Cannot allocate root digest signature";
+		goto bad;
+	}
+
 	v->hash_per_block_bits =
 		__fls((1 << v->hash_dev_block_bits) / v->digest_size);
 
@@ -1584,6 +1626,44 @@ int dm_verity_get_root_digest(struct dm_target *ti, u8 **root_digest, unsigned i
 	return 0;
 }
 
+#ifdef CONFIG_SECURITY
+
+static int verity_finalize(struct dm_target *ti)
+{
+	struct block_device *bdev;
+	struct dm_verity_digest root_digest;
+	struct dm_verity *v;
+	int r;
+
+	v = ti->private;
+	bdev = dm_disk(dm_table_get_md(ti->table))->part0;
+	root_digest.digest = v->root_digest;
+	root_digest.digest_len = v->digest_size;
+	root_digest.alg = v->alg_name;
+
+	r = security_bdev_setintegrity(bdev, LSM_INT_DMVERITY_ROOTHASH, &root_digest,
+				       sizeof(root_digest));
+	if (r)
+		return r;
+
+	r = security_bdev_setintegrity(bdev,
+				       LSM_INT_DMVERITY_SIG_VALID,
+				       v->root_digest_sig,
+				       v->sig_size);
+	if (r)
+		goto bad;
+
+	return 0;
+
+bad:
+
+	security_bdev_setintegrity(bdev, LSM_INT_DMVERITY_ROOTHASH, NULL, 0);
+
+	return r;
+}
+
+#endif /* CONFIG_SECURITY */
+
 static struct target_type verity_target = {
 	.name		= "verity",
 	.features	= DM_TARGET_SINGLETON | DM_TARGET_IMMUTABLE,
@@ -1596,6 +1676,9 @@ static struct target_type verity_target = {
 	.prepare_ioctl	= verity_prepare_ioctl,
 	.iterate_devices = verity_iterate_devices,
 	.io_hints	= verity_io_hints,
+#ifdef CONFIG_SECURITY
+	.finalize	= verity_finalize,
+#endif /* CONFIG_SECURITY */
 };
 module_dm(verity);
 
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index 20b1bcf03474..89e862f0cdf6 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -43,6 +43,9 @@ struct dm_verity {
 	u8 *root_digest;	/* digest of the root block */
 	u8 *salt;		/* salt: its size is salt_size */
 	u8 *zero_digest;	/* digest for a zero block */
+#ifdef CONFIG_SECURITY
+	u8 *root_digest_sig;	/* digest signature of the root block */
+#endif /* CONFIG_SECURITY */
 	unsigned int salt_size;
 	sector_t data_start;	/* data offset in 512-byte sectors */
 	sector_t hash_start;	/* hash start in blocks */
@@ -56,6 +59,9 @@ struct dm_verity {
 	bool hash_failed:1;	/* set if hash of any block failed */
 	bool use_bh_wq:1;	/* try to verify in BH wq before normal work-queue */
 	unsigned int digest_size;	/* digest size for the current hash algorithm */
+#ifdef CONFIG_SECURITY
+	unsigned int sig_size;	/* digest signature size */
+#endif /* CONFIG_SECURITY */
 	unsigned int ahash_reqsize;/* the size of temporary space for crypto */
 	enum verity_mode mode;	/* mode for handling verification errors */
 	unsigned int corrupted_errs;/* Number of errors for corrupted blocks */
diff --git a/include/linux/dm-verity.h b/include/linux/dm-verity.h
new file mode 100644
index 000000000000..a799a8043d85
--- /dev/null
+++ b/include/linux/dm-verity.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_DM_VERITY_H
+#define _LINUX_DM_VERITY_H
+
+struct dm_verity_digest {
+	const char *alg;
+	const u8 *digest;
+	size_t digest_len;
+};
+
+#endif /* _LINUX_DM_VERITY_H */
diff --git a/include/linux/security.h b/include/linux/security.h
index ac0985641611..9e46b13a356c 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -84,7 +84,8 @@ enum lsm_event {
 };
 
 enum lsm_integrity_type {
-	__LSM_INT_MAX
+	LSM_INT_DMVERITY_SIG_VALID,
+	LSM_INT_DMVERITY_ROOTHASH,
 };
 
 /*
-- 
2.44.0



* [PATCH v17 10/21] ipe: add permissive toggle
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (8 preceding siblings ...)
  2024-04-13  0:55 26% ` [PATCH v17 09/21] uapi|audit|ipe: add ipe auditing support Fan Wu
@ 2024-04-13  0:55 47% ` Fan Wu
  2024-04-13  0:55 44% ` [PATCH v17 11/21] block,lsm: add LSM blob and new LSM hooks for block device Fan Wu
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

IPE, like SELinux, supports a permissive mode. This mode allows policy
authors to test and evaluate IPE policy without it affecting their
programs. When the mode is changed, a 1404 AUDIT_MAC_STATUS record will
be reported.

This patch adds the following audit records:

    audit: MAC_STATUS enforcing=0 old_enforcing=1 auid=4294967295
      ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
    audit: MAC_STATUS enforcing=1 old_enforcing=0 auid=4294967295
      ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1

The audit record is only emitted when the value from the user input
differs from the current enforce value.
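
The behaviour described above can be modelled in a few lines of
userspace C. This is a sketch under stated assumptions: evaluate() and
set_enforce() are hypothetical names that mirror the tail of
ipe_evaluate_event() and the body of setenforce(), and a plain counter
stands in for the audit subsystem. The kernel code additionally uses
READ_ONCE()/WRITE_ONCE() and a CAP_MAC_ADMIN check, which are omitted
here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of the IPE enforce switch. */
static bool enforce = true;
static int audit_records;	/* counts MAC_STATUS records emitted */

enum action { ACTION_ALLOW, ACTION_DENY };

/* Models the tail of ipe_evaluate_event(): deny only while enforcing. */
static int evaluate(enum action action)
{
	int rc = 0;

	if (action == ACTION_DENY)
		rc = -EACCES;

	/* Permissive mode: the decision is still audited, but not enforced. */
	if (!enforce)
		rc = 0;

	return rc;
}

/* Models setenforce(): audit only when the value actually changes. */
static void set_enforce(bool new_value)
{
	if (new_value != enforce) {
		audit_records++;
		enforce = new_value;
	}
}
```

Writing the value that is already set is a no-op in this model, which
matches the "only emitted when different" rule in the commit message.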

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation into a separate commit from the
    evaluation loop and audit system, for easier review.
  + Propagating changes to support the new ipe_context structure in the
    evaluation loop.
  + Split out permissive functionality into a separate patch for easier
    review.
  + Remove permissive switch compile-time configuration option - this
    is trivial to add later.

v8:
  + Remove "IPE" prefix from permissive audit record
  + align fields to the linux-audit field dictionary. This causes the
    following fields to change:
      enforce -> permissive

  + Remove duplicated information correlated with syscall record, that
    will always be present in the audit event.
  + Change audit types:
    + AUDIT_TRUST_STATUS -> AUDIT_MAC_STATUS
      + There is no significant difference in meaning between
        these types.

v9:
  + Clean up ipe_context related code

v10:
  + Change audit format to conform with the existing format selinux is
    using
  + Remove the audit record emission during init to align with selinux,
    which does not perform this action.

v11:
  + Remove redundant code

v12:
  + Remove redundant code

v13:
  + Remove audit format macro

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Fix code and documentation style issues
---
 security/ipe/audit.c | 27 ++++++++++++++++--
 security/ipe/audit.h |  1 +
 security/ipe/eval.c  | 11 ++++++--
 security/ipe/eval.h  |  1 +
 security/ipe/fs.c    | 66 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index 6a3f24665655..a416291ba477 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -93,8 +93,8 @@ void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
 	if (!ab)
 		return;
 
-	audit_log_format(ab, "ipe_op=%s ipe_hook=%s pid=%d comm=",
-			 op, audit_hook_names[ctx->hook],
+	audit_log_format(ab, "ipe_op=%s ipe_hook=%s enforcing=%d pid=%d comm=",
+			 op, audit_hook_names[ctx->hook], READ_ONCE(enforce),
 			 task_tgid_nr(current));
 	audit_log_untrustedstring(ab, get_task_comm(comm, current));
 
@@ -212,3 +212,26 @@ void ipe_audit_policy_load(const struct ipe_policy *const p)
 
 	audit_log_end(ab);
 }
+
+/**
+ * ipe_audit_enforce() - Audit a change in IPE's enforcement state.
+ * @new_enforce: The new value enforce to be set.
+ * @old_enforce: The old value currently in enforce.
+ */
+void ipe_audit_enforce(bool new_enforce, bool old_enforce)
+{
+	struct audit_buffer *ab;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_MAC_STATUS);
+	if (!ab)
+		return;
+
+	audit_log_format(ab,
+			 "enforcing=%d old_enforcing=%d auid=%u ses=%u"
+			 " enabled=1 old-enabled=1 lsm=ipe res=1",
+			 new_enforce, old_enforce,
+			 from_kuid(&init_user_ns, audit_get_loginuid(current)),
+			 audit_get_sessionid(current));
+
+	audit_log_end(ab);
+}
diff --git a/security/ipe/audit.h b/security/ipe/audit.h
index 3ba8b8a91541..ed2620846a79 100644
--- a/security/ipe/audit.h
+++ b/security/ipe/audit.h
@@ -14,5 +14,6 @@ void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
 void ipe_audit_policy_load(const struct ipe_policy *const p);
 void ipe_audit_policy_activation(const struct ipe_policy *const op,
 				 const struct ipe_policy *const np);
+void ipe_audit_enforce(bool new_enforce, bool old_enforce);
 
 #endif /* _IPE_AUDIT_H */
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 18fd5d8fa03e..dd9064974be6 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -18,6 +18,7 @@
 
 struct ipe_policy __rcu *ipe_active_policy;
 bool success_audit;
+bool enforce = true;
 
 #define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
 
@@ -108,6 +109,7 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	enum ipe_action_type action;
 	enum ipe_match match_type;
 	bool match = false;
+	int rc = 0;
 
 	rcu_read_lock();
 
@@ -160,9 +162,12 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	ipe_audit_match(ctx, match_type, action, rule);
 
 	if (action == IPE_ACTION_DENY)
-		return -EACCES;
+		rc = -EACCES;
 
-	return 0;
+	if (!READ_ONCE(enforce))
+		rc = 0;
+
+	return rc;
 }
 
 /* Set the right module name */
@@ -173,3 +178,5 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 
 module_param(success_audit, bool, 0400);
 MODULE_PARM_DESC(success_audit, "Start IPE with success auditing enabled");
+module_param(enforce, bool, 0400);
+MODULE_PARM_DESC(enforce, "Start IPE in enforce or permissive mode");
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 42b74a7a7c2b..80b74f55fa69 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -16,6 +16,7 @@
 
 extern struct ipe_policy __rcu *ipe_active_policy;
 extern bool success_audit;
+extern bool enforce;
 
 struct ipe_superblock {
 	bool initramfs;
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
index 9e410982b759..b52fb6023904 100644
--- a/security/ipe/fs.c
+++ b/security/ipe/fs.c
@@ -16,6 +16,7 @@ static struct dentry *np __ro_after_init;
 static struct dentry *root __ro_after_init;
 struct dentry *policy_root __ro_after_init;
 static struct dentry *audit_node __ro_after_init;
+static struct dentry *enforce_node __ro_after_init;
 
 /**
  * setaudit() - Write handler for the securityfs node, "ipe/success_audit"
@@ -65,6 +66,58 @@ static ssize_t getaudit(struct file *f, char __user *data,
 	return simple_read_from_buffer(data, len, offset, result, 1);
 }
 
+/**
+ * setenforce() - Write handler for the securityfs node, "ipe/enforce"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ */
+static ssize_t setenforce(struct file *f, const char __user *data,
+			  size_t len, loff_t *offset)
+{
+	int rc = 0;
+	bool new_value, old_value;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	old_value = READ_ONCE(enforce);
+	rc = kstrtobool_from_user(data, len, &new_value);
+	if (rc)
+		return rc;
+
+	if (new_value != old_value) {
+		ipe_audit_enforce(new_value, old_value);
+		WRITE_ONCE(enforce, new_value);
+	}
+
+	return len;
+}
+
+/**
+ * getenforce() - Read handler for the securityfs node, "ipe/enforce"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return: Length of buffer written
+ */
+static ssize_t getenforce(struct file *f, char __user *data,
+			  size_t len, loff_t *offset)
+{
+	const char *result;
+
+	result = ((READ_ONCE(enforce)) ? "1" : "0");
+
+	return simple_read_from_buffer(data, len, offset, result, 1);
+}
+
 /**
  * new_policy() - Write handler for the securityfs node, "ipe/new_policy".
  * @f: Supplies a file structure representing the securityfs node.
@@ -123,6 +176,11 @@ static const struct file_operations audit_fops = {
 	.read = getaudit,
 };
 
+static const struct file_operations enforce_fops = {
+	.write = setenforce,
+	.read = getenforce,
+};
+
 /**
  * ipe_init_securityfs() - Initialize IPE's securityfs tree at fsinit.
  *
@@ -149,6 +207,13 @@ static int __init ipe_init_securityfs(void)
 		goto err;
 	}
 
+	enforce_node = securityfs_create_file("enforce", 0600, root, NULL,
+					      &enforce_fops);
+	if (IS_ERR(enforce_node)) {
+		rc = PTR_ERR(enforce_node);
+		goto err;
+	}
+
 	policy_root = securityfs_create_dir("policies", root);
 	if (IS_ERR(policy_root)) {
 		rc = PTR_ERR(policy_root);
@@ -165,6 +230,7 @@ static int __init ipe_init_securityfs(void)
 err:
 	securityfs_remove(np);
 	securityfs_remove(policy_root);
+	securityfs_remove(enforce_node);
 	securityfs_remove(audit_node);
 	securityfs_remove(root);
 	return rc;
-- 
2.44.0



* [PATCH v17 11/21] block,lsm: add LSM blob and new LSM hooks for block device
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (9 preceding siblings ...)
  2024-04-13  0:55 47% ` [PATCH v17 10/21] ipe: add permissive toggle Fan Wu
@ 2024-04-13  0:55 44% ` Fan Wu
  2024-04-13  0:55 70% ` [PATCH v17 12/21] dm: add finalize hook to target_type Fan Wu
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Some block devices have valuable security properties that are only
accessible at creation time.

For example, when creating a dm-verity block device, the dm-verity
roothash and roothash signature, which are extremely important security
metadata, are passed to the kernel. However, the roothash is saved
privately in dm-verity, which prevents the security subsystem from
easily accessing that information. Worse, in the current implementation
the roothash signature is discarded after verification, making it
impossible for the security subsystem to use it.

With this patch, an LSM blob is added to the block_device structure.
This enables the security subsystem to store security-sensitive data
related to block devices within the security blob. For example, an LSM
can use the new blob to save the roothash signature of a dm-verity
device and make access decisions based on the data inside the signature,
such as the signer certificate.

The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct super_block.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.
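
The blob lifecycle can be sketched in userspace as below. The names
fake_bdev, fake_bdev_alloc() and fake_bdev_free() are hypothetical, and
calloc()/free() stand in for kzalloc()/kfree(); the sketch only models
the allocation logic of lsm_bdev_alloc() and security_bdev_free(), not
the per-LSM hook calls.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace model of struct block_device and blob_sizes. */
struct fake_bdev {
	void *security;
};

static size_t lbs_bdev;	/* total blob bytes requested by all LSMs */

/* Models lsm_bdev_alloc(): if no LSM wants a blob, leave it NULL. */
static int fake_bdev_alloc(struct fake_bdev *bdev)
{
	if (lbs_bdev == 0) {
		bdev->security = NULL;
		return 0;
	}

	bdev->security = calloc(1, lbs_bdev);	/* kzalloc() in the kernel */
	if (!bdev->security)
		return -ENOMEM;

	return 0;
}

/* Models security_bdev_free(): safe to call when no blob exists. */
static void fake_bdev_free(struct fake_bdev *bdev)
{
	if (!bdev->security)
		return;

	free(bdev->security);
	bdev->security = NULL;
}
```

Resetting the pointer to NULL after freeing keeps a second free call
harmless, which matters on error paths that may run the free twice.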

This patch also introduces a new hook to save a block device's integrity
data. For dm-verity, LSMs can use this hook to save the roothash
signature of a dm-verity device into the security blob and then make
access decisions based on the data inside the signature, such as the
signer certificate.
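
As a rough illustration of the LSM side of this hook, the following
userspace sketch keeps a private copy of the signature and drops it
when called with a NULL value, as the hook's documentation asks. The
fake_blob layout and fake_set_sig() are assumptions made for this
example and do not reflect how any in-tree LSM stores the data.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-device blob an LSM might keep; not IPE's real layout. */
struct fake_blob {
	unsigned char *sig;
	size_t sig_len;
};

/*
 * Models an LSM's bdev_setintegrity handler for signature data:
 * keep a private copy, and free the old copy when value is NULL.
 */
static int fake_set_sig(struct fake_blob *blob, const void *value, size_t size)
{
	unsigned char *copy = NULL;

	if (value && size) {
		copy = malloc(size);
		if (!copy)
			return -ENOMEM;
		memcpy(copy, value, size);
	}

	/* Replace any previously saved signature. */
	free(blob->sig);
	blob->sig = copy;
	blob->sig_len = copy ? size : 0;

	return 0;
}
```

Copying the data means the blob stays valid even if the caller later
frees its own buffer.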

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No Changes

v3:
  + Minor style changes from checkpatch --strict

v4:
  + No Changes

v5:
  + Allow multiple callers to call security_bdev_setsecurity

v6:
  + Simplify security_bdev_setsecurity break condition

v7:
  + Squash all dm-verity related patches to two patches,
    the additions to dm-verity/fs, and the consumption of
    the additions.

v8:
  + Split dm-verity related patches squashed in v7 to 3 commits based on
    topic:
      + New LSM hook
      + Consumption of hook outside LSM
      + Consumption of hook inside LSM.

  + change return of security_bdev_alloc / security_bdev_setsecurity
    to LSM_RET_DEFAULT instead of 0.

  + Change return code to -EOPNOTSUPP, bringing it in line with other
    setsecurity hooks.

v9:
  + Add Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
  + Remove unlikely when calling LSM hook
  + Make the security field dependent on CONFIG_SECURITY

v10:
  + No changes

v11:
  + No changes

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + Drop security_bdev_setsecurity() for new hook
    security_bdev_setintegrity() in the next commit
  + Update call_int_hook() for 260017f

v16:
  + Drop Reviewed-by tag for the new changes
  + Squash the security_bdev_setintegrity() into this commit
  + Rename enum from lsm_intgr_type to lsm_integrity_type
  + Switch to use call_int_hook() for bdev_setintegrity()
  + Correct comment
  + Fix return in security_bdev_alloc()

v17:
  + Fix a typo
  + Improve the commit subject line
---
 block/bdev.c                  |  7 +++
 include/linux/blk_types.h     |  3 ++
 include/linux/lsm_hook_defs.h |  5 ++
 include/linux/lsm_hooks.h     |  1 +
 include/linux/security.h      | 26 ++++++++++
 security/security.c           | 89 +++++++++++++++++++++++++++++++++++
 6 files changed, 131 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index b8e32d933a63..df7c71a34472 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -24,6 +24,7 @@
 #include <linux/pseudo_fs.h>
 #include <linux/uio.h>
 #include <linux/namei.h>
+#include <linux/security.h>
 #include <linux/part_stat.h>
 #include <linux/uaccess.h>
 #include <linux/stat.h>
@@ -313,6 +314,11 @@ static struct inode *bdev_alloc_inode(struct super_block *sb)
 	if (!ei)
 		return NULL;
 	memset(&ei->bdev, 0, sizeof(ei->bdev));
+
+	if (security_bdev_alloc(&ei->bdev)) {
+		kmem_cache_free(bdev_cachep, ei);
+		return NULL;
+	}
 	return &ei->vfs_inode;
 }
 
@@ -322,6 +328,7 @@ static void bdev_free_inode(struct inode *inode)
 
 	free_percpu(bdev->bd_stats);
 	kfree(bdev->bd_meta_info);
+	security_bdev_free(bdev);
 
 	if (!bdev_is_partition(bdev)) {
 		if (bdev->bd_disk && bdev->bd_disk->bdi)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index cb1526ec44b5..effe3c4e6b35 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -70,6 +70,9 @@ struct block_device {
 #endif
 	bool			bd_ro_warned;
 	int			bd_writers;
+#ifdef CONFIG_SECURITY
+	void			*security;
+#endif
 	/*
 	 * keep this out-of-line as it's both big and not needed in the fast
 	 * path
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 7db99ae75651..b391a7f13053 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -452,3 +452,8 @@ LSM_HOOK(int, 0, uring_cmd, struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_IO_URING */
 
 LSM_HOOK(void, LSM_RET_VOID, initramfs_populated, void)
+
+LSM_HOOK(int, 0, bdev_alloc_security, struct block_device *bdev)
+LSM_HOOK(void, LSM_RET_VOID, bdev_free_security, struct block_device *bdev)
+LSM_HOOK(int, 0, bdev_setintegrity, struct block_device *bdev,
+	 enum lsm_integrity_type type, const void *value, size_t size)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a2ade0ffe9e7..f1692179aa56 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -78,6 +78,7 @@ struct lsm_blob_sizes {
 	int	lbs_msg_msg;
 	int	lbs_task;
 	int	lbs_xattr_count; /* number of xattr slots in new_xattrs array */
+	int	lbs_bdev;
 };
 
 /**
diff --git a/include/linux/security.h b/include/linux/security.h
index f35af7b6cfba..ac0985641611 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -83,6 +83,10 @@ enum lsm_event {
 	LSM_POLICY_CHANGE,
 };
 
+enum lsm_integrity_type {
+	__LSM_INT_MAX
+};
+
 /*
  * These are reasons that can be passed to the security_locked_down()
  * LSM hook. Lockdown reasons that protect kernel integrity (ie, the
@@ -509,6 +513,11 @@ int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
 int security_locked_down(enum lockdown_reason what);
 int lsm_fill_user_ctx(struct lsm_ctx __user *uctx, u32 *uctx_len,
 		      void *val, size_t val_len, u64 id, u64 flags);
+int security_bdev_alloc(struct block_device *bdev);
+void security_bdev_free(struct block_device *bdev);
+int security_bdev_setintegrity(struct block_device *bdev,
+			       enum lsm_integrity_type type, const void *value,
+			       size_t size);
 #else /* CONFIG_SECURITY */
 
 static inline int call_blocking_lsm_notifier(enum lsm_event event, void *data)
@@ -1483,6 +1492,23 @@ static inline int lsm_fill_user_ctx(struct lsm_ctx __user *uctx,
 {
 	return -EOPNOTSUPP;
 }
+
+static inline int security_bdev_alloc(struct block_device *bdev)
+{
+	return 0;
+}
+
+static inline void security_bdev_free(struct block_device *bdev)
+{
+}
+
+static inline int security_bdev_setintegrity(struct block_device *bdev,
+					     enum lsm_integrity_type type,
+					     const void *value, size_t size)
+{
+	return 0;
+}
+
 #endif	/* CONFIG_SECURITY */
 
 #if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
diff --git a/security/security.c b/security/security.c
index 0db5a6b32aab..3a7724c3dd76 100644
--- a/security/security.c
+++ b/security/security.c
@@ -29,6 +29,7 @@
 #include <linux/msg.h>
 #include <linux/overflow.h>
 #include <net/flow.h>
+#include <linux/fs.h>
 
 /* How many LSMs were built into the kernel? */
 #define LSM_COUNT (__end_lsm_info - __start_lsm_info)
@@ -232,6 +233,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
 	lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
 	lsm_set_blob_size(&needed->lbs_xattr_count,
 			  &blob_sizes.lbs_xattr_count);
+	lsm_set_blob_size(&needed->lbs_bdev, &blob_sizes.lbs_bdev);
 }
 
 /* Prepare LSM for initialization. */
@@ -405,6 +407,7 @@ static void __init ordered_lsm_init(void)
 	init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
 	init_debug("task blob size       = %d\n", blob_sizes.lbs_task);
 	init_debug("xattr slots          = %d\n", blob_sizes.lbs_xattr_count);
+	init_debug("bdev blob size       = %d\n", blob_sizes.lbs_bdev);
 
 	/*
 	 * Create any kmem_caches needed for blobs
@@ -737,6 +740,28 @@ static int lsm_msg_msg_alloc(struct msg_msg *mp)
 	return 0;
 }
 
+/**
+ * lsm_bdev_alloc - allocate a composite block_device blob
+ * @bdev: the block_device that needs a blob
+ *
+ * Allocate the block_device blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_bdev_alloc(struct block_device *bdev)
+{
+	if (blob_sizes.lbs_bdev == 0) {
+		bdev->security = NULL;
+		return 0;
+	}
+
+	bdev->security = kzalloc(blob_sizes.lbs_bdev, GFP_KERNEL);
+	if (!bdev->security)
+		return -ENOMEM;
+
+	return 0;
+}
+
 /**
  * lsm_early_task - during initialization allocate a composite task blob
  * @task: the task that needs a blob
@@ -5568,6 +5593,70 @@ int security_locked_down(enum lockdown_reason what)
 }
 EXPORT_SYMBOL(security_locked_down);
 
+/**
+ * security_bdev_alloc() - Allocate a block device LSM blob
+ * @bdev: block device
+ *
+ * Allocate and attach a security structure to @bdev->security.  The
+ * security field is initialized to NULL when the bdev structure is
+ * allocated.
+ *
+ * Return: Return 0 if operation was successful.
+ */
+int security_bdev_alloc(struct block_device *bdev)
+{
+	int rc = 0;
+
+	rc = lsm_bdev_alloc(bdev);
+	if (unlikely(rc))
+		return rc;
+
+	rc = call_int_hook(bdev_alloc_security, bdev);
+	if (unlikely(rc))
+		security_bdev_free(bdev);
+
+	return rc;
+}
+EXPORT_SYMBOL(security_bdev_alloc);
+
+/**
+ * security_bdev_free() - Free a block device's LSM blob
+ * @bdev: block device
+ *
+ * Deallocate the bdev security structure and set @bdev->security to NULL.
+ */
+void security_bdev_free(struct block_device *bdev)
+{
+	if (!bdev->security)
+		return;
+
+	call_void_hook(bdev_free_security, bdev);
+
+	kfree(bdev->security);
+	bdev->security = NULL;
+}
+EXPORT_SYMBOL(security_bdev_free);
+
+/**
+ * security_bdev_setintegrity() - Set the device's integrity data
+ * @bdev: block device
+ * @type: type of integrity, e.g. hash digest, signature, etc
+ * @value: the integrity value
+ * @size: size of the integrity value
+ *
+ * Register a verified integrity measurement of a bdev with LSMs.
+ * LSMs should free the previously saved data if @value is NULL.
+ *
+ * Return: Returns 0 on success, negative values on failure.
+ */
+int security_bdev_setintegrity(struct block_device *bdev,
+			       enum lsm_integrity_type type, const void *value,
+			       size_t size)
+{
+	return call_int_hook(bdev_setintegrity, bdev, type, value, size);
+}
+EXPORT_SYMBOL(security_bdev_setintegrity);
+
 #ifdef CONFIG_PERF_EVENTS
 /**
  * security_perf_event_open() - Check if a perf event open is allowed
-- 
2.44.0



* [PATCH v17 14/21] ipe: add support for dm-verity as a trust provider
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (12 preceding siblings ...)
  2024-04-13  0:55 50% ` [PATCH v17 13/21] dm verity: consume root hash digest and expose signature data via LSM hook Fan Wu
@ 2024-04-13  0:55 32% ` Fan Wu
  2024-04-13  0:55 65% ` [PATCH v17 15/21] security: add security_inode_setintegrity() hook Fan Wu
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Allow the author of an IPE policy to indicate trust for a single
dm-verity volume, identified by its root hash, through
"dmverity_roothash", and for all signed dm-verity volumes through
"dmverity_signature".
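
As an illustration only, the "<alg_name>:<hex>" value that follows
"dmverity_roothash=" can be parsed and compared along the lines of the
userspace sketch below. It is not the kernel code from this patch: the
names struct digest, digest_parse(), digest_equal() and digest_free() are
hypothetical, and unlike the in-kernel parser the sketch assumes that an
odd-length hex string should be rejected.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the patch's struct digest_info. */
struct digest {
	char *alg;
	unsigned char *bytes;
	size_t len;
};

/* Convert one hex character to its value, or -1 if it is not hex. */
int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Parse "<alg>:<hex>". Returns 0 on success, -1 on bad input or OOM. */
int digest_parse(const char *s, struct digest *out)
{
	const char *sep = strchr(s, ':');
	size_t alglen, hexlen, i;

	if (!sep || sep == s)
		return -1;
	alglen = (size_t)(sep - s);
	hexlen = strlen(sep + 1);
	/* Assumption of this sketch: require a whole number of bytes. */
	if (hexlen == 0 || hexlen % 2)
		return -1;

	out->alg = malloc(alglen + 1);
	out->bytes = malloc(hexlen / 2);
	if (!out->alg || !out->bytes)
		goto err;
	memcpy(out->alg, s, alglen);
	out->alg[alglen] = '\0';

	for (i = 0; i < hexlen / 2; i++) {
		int hi = hex_val(sep[1 + 2 * i]);
		int lo = hex_val(sep[2 + 2 * i]);

		if (hi < 0 || lo < 0)
			goto err;
		out->bytes[i] = (unsigned char)(hi << 4 | lo);
	}
	out->len = hexlen / 2;
	return 0;
err:
	free(out->alg);
	free(out->bytes);
	out->alg = NULL;
	out->bytes = NULL;
	out->len = 0;
	return -1;
}

/* Same three checks as ipe_digest_eval(): length, algorithm, bytes. */
bool digest_equal(const struct digest *a, const struct digest *b)
{
	return a->len == b->len && !strcmp(a->alg, b->alg) &&
	       !memcmp(a->bytes, b->bytes, a->len);
}

void digest_free(struct digest *d)
{
	free(d->alg);
	free(d->bytes);
}
```

Comparing parsed bytes rather than the policy string means the case of
the hex digits in the policy does not affect the match.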

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + No Changes

v3:
  + No changes

v4:
  + No changes

v5:
  + No changes

v6:
  + Fix an improper cleanup that can result in
    a leak

v7:
  + Squash patch 08/12, 10/12 to [11/16]

v8:
  + Undo squash of 08/12, 10/12 - separating drivers/md/ from security/
    & block/
  + Use common-audit function for dmverity_signature.
  + Change implementation for storing the dm-verity digest to use the
    dm_verity_digest structure newly introduced in patch 14/20.

v9:
  + Adapt to the new parser

v10:
  + Select the Kconfig when all dependencies are enabled

v11:
  + No changes

v12:
  + Refactor to use struct digest_info* instead of void*
  + Correct audit format

v13:
  + Remove the CONFIG_IPE_PROP_DM_VERITY dependency inside the parser
    to make the policy grammar independent of the kernel config.

v14:
  + No changes

v15:
  + Fix one grammar issue in KCONFIG
  + Switch to use security_bdev_setintegrity() hook

v16:
  + Refactor for enum integrity type

v17:
  + Add years to license header
  + Fix code and documentation style issues
  + Return -EINVAL in ipe_bdev_setintegrity when passed type is not
    supported
  + Use new enum name LSM_INT_DMVERITY_SIG_VALID
---
 security/ipe/Kconfig         |  18 ++++++
 security/ipe/Makefile        |   1 +
 security/ipe/audit.c         |  29 ++++++++-
 security/ipe/digest.c        | 118 +++++++++++++++++++++++++++++++++++
 security/ipe/digest.h        |  26 ++++++++
 security/ipe/eval.c          |  91 ++++++++++++++++++++++++++-
 security/ipe/eval.h          |  10 +++
 security/ipe/hooks.c         |  78 +++++++++++++++++++++++
 security/ipe/hooks.h         |   8 +++
 security/ipe/ipe.c           |  15 +++++
 security/ipe/ipe.h           |   4 ++
 security/ipe/policy.h        |   3 +
 security/ipe/policy_parser.c |  24 ++++++-
 13 files changed, 421 insertions(+), 4 deletions(-)
 create mode 100644 security/ipe/digest.c
 create mode 100644 security/ipe/digest.h

diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index ac4d558e69d5..6179752c614f 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -8,6 +8,7 @@ menuconfig SECURITY_IPE
 	depends on SECURITY && SECURITYFS && AUDIT && AUDITSYSCALL
 	select PKCS7_MESSAGE_PARSER
 	select SYSTEM_DATA_VERIFICATION
+	select IPE_PROP_DM_VERITY if DM_VERITY && DM_VERITY_VERIFY_ROOTHASH_SIG
 	help
 	  This option enables the Integrity Policy Enforcement LSM
 	  allowing users to define a policy to enforce a trust-based access
@@ -15,3 +16,20 @@ menuconfig SECURITY_IPE
 	  admins to reconfigure trust requirements on the fly.
 
 	  If unsure, answer N.
+
+if SECURITY_IPE
+menu "IPE Trust Providers"
+
+config IPE_PROP_DM_VERITY
+	bool "Enable support for dm-verity volumes"
+	depends on DM_VERITY && DM_VERITY_VERIFY_ROOTHASH_SIG
+	help
+	  This option enables the properties 'dmverity_signature' and
+	  'dmverity_roothash' in IPE policy. These properties evaluate
+	  to TRUE when a file is evaluated against a dm-verity volume
+	  that was mounted with a valid signed root-hash or the
+	  volume's root hash matches the supplied value in the policy.
+
+endmenu
+
+endif
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 62caccba14b4..e1019bb9f0f3 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -6,6 +6,7 @@
 #
 
 obj-$(CONFIG_SECURITY_IPE) += \
+	digest.o \
 	eval.o \
 	hooks.o \
 	fs.o \
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
index a416291ba477..2c98520267c1 100644
--- a/security/ipe/audit.c
+++ b/security/ipe/audit.c
@@ -13,6 +13,7 @@
 #include "hooks.h"
 #include "policy.h"
 #include "audit.h"
+#include "digest.h"
 
 #define ACTSTR(x) ((x) == IPE_ACTION_ALLOW ? "ALLOW" : "DENY")
 
@@ -49,8 +50,22 @@ static const char *const audit_hook_names[__IPE_HOOK_MAX] = {
 static const char *const audit_prop_names[__IPE_PROP_MAX] = {
 	"boot_verified=FALSE",
 	"boot_verified=TRUE",
+	"dmverity_roothash=",
+	"dmverity_signature=FALSE",
+	"dmverity_signature=TRUE",
 };
 
+/**
+ * audit_dmv_roothash() - audit the roothash of a dmverity_roothash property.
+ * @ab: Supplies a pointer to the audit_buffer to append to.
+ * @rh: Supplies a pointer to the digest structure.
+ */
+static void audit_dmv_roothash(struct audit_buffer *ab, const void *rh)
+{
+	audit_log_format(ab, "%s", audit_prop_names[IPE_PROP_DMV_ROOTHASH]);
+	ipe_digest_audit(ab, rh);
+}
+
 /**
  * audit_rule() - audit an IPE policy rule.
  * @ab: Supplies a pointer to the audit_buffer to append to.
@@ -62,8 +77,18 @@ static void audit_rule(struct audit_buffer *ab, const struct ipe_rule *r)
 
 	audit_log_format(ab, " rule=\"%s ", audit_op_names[r->op]);
 
-	list_for_each_entry(ptr, &r->props, next)
-		audit_log_format(ab, "%s ", audit_prop_names[ptr->type]);
+	list_for_each_entry(ptr, &r->props, next) {
+		switch (ptr->type) {
+		case IPE_PROP_DMV_ROOTHASH:
+			audit_dmv_roothash(ab, ptr->value);
+			break;
+		default:
+			audit_log_format(ab, "%s", audit_prop_names[ptr->type]);
+			break;
+		}
+
+		audit_log_format(ab, " ");
+	}
 
 	audit_log_format(ab, "action=%s\"", ACTSTR(r->action));
 }
diff --git a/security/ipe/digest.c b/security/ipe/digest.c
new file mode 100644
index 000000000000..493716370570
--- /dev/null
+++ b/security/ipe/digest.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include "digest.h"
+
+/**
+ * ipe_digest_parse() - parse a digest in IPE's policy.
+ * @valstr: Supplies the string parsed from the policy.
+ *
+ * Digests in IPE are defined in a standard way:
+ *	<alg_name>:<hex>
+ *
+ * Use this function to create a property to parse the digest
+ * consistently. The parsed digest will be saved in @value in IPE's
+ * policy.
+ *
+ * Return: The parsed digest_info structure on success. If an error occurs,
+ * the function will return the error value (via ERR_PTR).
+ */
+struct digest_info *ipe_digest_parse(const char *valstr)
+{
+	struct digest_info *info = NULL;
+	char *sep, *raw_digest;
+	size_t raw_digest_len;
+	u8 *digest = NULL;
+	char *alg = NULL;
+	int rc = 0;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	sep = strchr(valstr, ':');
+	if (!sep) {
+		rc = -EBADMSG;
+		goto err;
+	}
+
+	alg = kstrndup(valstr, sep - valstr, GFP_KERNEL);
+	if (!alg) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	raw_digest = sep + 1;
+	raw_digest_len = strlen(raw_digest);
+
+	info->digest_len = (raw_digest_len + 1) / 2;
+	digest = kzalloc(info->digest_len, GFP_KERNEL);
+	if (!digest) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	rc = hex2bin(digest, raw_digest, info->digest_len);
+	if (rc < 0) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	info->alg = alg;
+	info->digest = digest;
+	return info;
+
+err:
+	kfree(alg);
+	kfree(digest);
+	kfree(info);
+	return ERR_PTR(rc);
+}
+
+/**
+ * ipe_digest_eval() - evaluate an IPE digest against another digest.
+ * @expected: Supplies the policy-provided digest value.
+ * @digest: Supplies the digest to compare against the policy digest value.
+ *
+ * Return:
+ * * %true	- digests match
+ * * %false	- digests do not match
+ */
+bool ipe_digest_eval(const struct digest_info *expected,
+		     const struct digest_info *digest)
+{
+	return (expected->digest_len == digest->digest_len) &&
+	       (!strcmp(expected->alg, digest->alg)) &&
+	       (!memcmp(expected->digest, digest->digest, expected->digest_len));
+}
+
+/**
+ * ipe_digest_free() - free an IPE digest.
+ * @info: Supplies a pointer to the policy-provided digest to free.
+ */
+void ipe_digest_free(struct digest_info *info)
+{
+	if (IS_ERR_OR_NULL(info))
+		return;
+
+	kfree(info->alg);
+	kfree(info->digest);
+	kfree(info);
+}
+
+/**
+ * ipe_digest_audit() - audit a digest that was sourced from IPE's policy.
+ * @ab: Supplies the audit_buffer to append the formatted result.
+ * @info: Supplies a pointer to source the audit record from.
+ *
+ * Digests in IPE are audited in this format:
+ *	<alg_name>:<hex>
+ */
+void ipe_digest_audit(struct audit_buffer *ab, const struct digest_info *info)
+{
+	audit_log_untrustedstring(ab, info->alg);
+	audit_log_format(ab, ":");
+	audit_log_n_hex(ab, info->digest, info->digest_len);
+}
diff --git a/security/ipe/digest.h b/security/ipe/digest.h
new file mode 100644
index 000000000000..52c9b3844a38
--- /dev/null
+++ b/security/ipe/digest.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_DIGEST_H
+#define _IPE_DIGEST_H
+
+#include <linux/types.h>
+#include <linux/audit.h>
+
+#include "policy.h"
+
+struct digest_info {
+	const char *alg;
+	const u8 *digest;
+	size_t digest_len;
+};
+
+struct digest_info *ipe_digest_parse(const char *valstr);
+void ipe_digest_free(struct digest_info *digest_info);
+void ipe_digest_audit(struct audit_buffer *ab, const struct digest_info *val);
+bool ipe_digest_eval(const struct digest_info *expected,
+		     const struct digest_info *digest);
+
+#endif /* _IPE_DIGEST_H */
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index dd9064974be6..477f0d0ffda8 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -15,10 +15,12 @@
 #include "eval.h"
 #include "policy.h"
 #include "audit.h"
+#include "digest.h"
 
 struct ipe_policy __rcu *ipe_active_policy;
 bool success_audit;
 bool enforce = true;
+#define INO_BLOCK_DEV(ino) ((ino)->i_sb->s_bdev)
 
 #define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
 
@@ -32,6 +34,23 @@ static void build_ipe_sb_ctx(struct ipe_eval_ctx *ctx, const struct file *const
 	ctx->initramfs = ipe_sb(FILE_SUPERBLOCK(file))->initramfs;
 }
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+/**
+ * build_ipe_bdev_ctx() - Build ipe_bdev field of an evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @ino: Supplies the inode struct of the file that triggered the IPE event.
+ */
+static void build_ipe_bdev_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+	if (INO_BLOCK_DEV(ino))
+		ctx->ipe_bdev = ipe_bdev(INO_BLOCK_DEV(ino));
+}
+#else
+static void build_ipe_bdev_ctx(struct ipe_eval_ctx *ctx, const struct inode *const ino)
+{
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 /**
  * ipe_build_eval_ctx() - Build an ipe evaluation context.
  * @ctx: Supplies a pointer to the context to be populated.
@@ -48,8 +67,10 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 	ctx->op = op;
 	ctx->hook = hook;
 
-	if (file)
+	if (file) {
 		build_ipe_sb_ctx(ctx, file);
+		build_ipe_bdev_ctx(ctx, d_real_inode(file->f_path.dentry));
+	}
 }
 
 /**
@@ -65,6 +86,68 @@ static bool evaluate_boot_verified(const struct ipe_eval_ctx *const ctx)
 	return ctx->initramfs;
 }
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+/**
+ * evaluate_dmv_roothash() - Evaluate @ctx against a dmv roothash property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ * @p: Supplies a pointer to the property being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches @p
+ * * %false	- The current @ctx doesn't match @p
+ */
+static bool evaluate_dmv_roothash(const struct ipe_eval_ctx *const ctx,
+				  struct ipe_prop *p)
+{
+	return !!ctx->ipe_bdev &&
+	       !!ctx->ipe_bdev->root_hash &&
+	       ipe_digest_eval(p->value,
+			       ctx->ipe_bdev->root_hash);
+}
+
+/**
+ * evaluate_dmv_sig_false() - Evaluate @ctx against a dmv sig false property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches the property
+ * * %false	- The current @ctx doesn't match the property
+ */
+static bool evaluate_dmv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return !ctx->ipe_bdev || (!ctx->ipe_bdev->dm_verity_signed);
+}
+
+/**
+ * evaluate_dmv_sig_true() - Evaluate @ctx against a dmv sig true property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx matches the property
+ * * %false	- The current @ctx doesn't match the property
+ */
+static bool evaluate_dmv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return !evaluate_dmv_sig_false(ctx);
+}
+#else
+static bool evaluate_dmv_roothash(const struct ipe_eval_ctx *const ctx,
+				  struct ipe_prop *p)
+{
+	return false;
+}
+
+static bool evaluate_dmv_sig_false(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+
+static bool evaluate_dmv_sig_true(const struct ipe_eval_ctx *const ctx)
+{
+	return false;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 /**
  * evaluate_property() - Analyze @ctx against a rule property.
  * @ctx: Supplies a pointer to the context to be evaluated.
@@ -85,6 +168,12 @@ static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 		return !evaluate_boot_verified(ctx);
 	case IPE_PROP_BOOT_VERIFIED_TRUE:
 		return evaluate_boot_verified(ctx);
+	case IPE_PROP_DMV_ROOTHASH:
+		return evaluate_dmv_roothash(ctx, p);
+	case IPE_PROP_DMV_SIG_FALSE:
+		return evaluate_dmv_sig_false(ctx);
+	case IPE_PROP_DMV_SIG_TRUE:
+		return evaluate_dmv_sig_true(ctx);
 	default:
 		return false;
 	}
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 80b74f55fa69..aa29e8036c48 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -22,12 +22,22 @@ struct ipe_superblock {
 	bool initramfs;
 };
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+struct ipe_bdev {
+	bool dm_verity_signed;
+	struct digest_info *root_hash;
+};
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
 	enum ipe_hook_type hook;
 
 	const struct file *file;
 	bool initramfs;
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+	const struct ipe_bdev *ipe_bdev;
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
 };
 
 enum ipe_match {
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index b68719bf44fb..5d4a9abb9c44 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -8,10 +8,14 @@
 #include <linux/types.h>
 #include <linux/binfmts.h>
 #include <linux/mman.h>
+#include <linux/blk_types.h>
+#include <linux/dm-verity.h>
+#include <crypto/hash_info.h>
 
 #include "ipe.h"
 #include "hooks.h"
 #include "eval.h"
+#include "digest.h"
 
 /**
  * ipe_bprm_check_security() - ipe security hook function for bprm check.
@@ -191,3 +195,77 @@ void ipe_unpack_initramfs(void)
 {
 	ipe_sb(current->fs->root.mnt->mnt_sb)->initramfs = true;
 }
+
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+/**
+ * ipe_bdev_free_security() - Free IPE's LSM blob of block_devices.
+ * @bdev: Supplies a pointer to a block_device that contains the structure
+ *	  to free.
+ */
+void ipe_bdev_free_security(struct block_device *bdev)
+{
+	struct ipe_bdev *blob = ipe_bdev(bdev);
+
+	ipe_digest_free(blob->root_hash);
+}
+
+/**
+ * ipe_bdev_setintegrity() - Save integrity data from a bdev to IPE's LSM blob.
+ * @bdev: Supplies a pointer to a block_device that contains the LSM blob.
+ * @type: Supplies the integrity type.
+ * @value: Supplies the value to store.
+ * @size: The size of @value.
+ *
+ * This hook is currently used to save dm-verity's root hash or the existence
+ * of a validated signed dm-verity root hash into LSM blob.
+ *
+ * Return: %0 on success. If an error occurs, the function will return the
+ * -errno.
+ */
+int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type type,
+			  const void *value, size_t size)
+{
+	const struct dm_verity_digest *digest = NULL;
+	struct ipe_bdev *blob = ipe_bdev(bdev);
+	struct digest_info *info = NULL;
+
+	if (type == LSM_INT_DMVERITY_ROOTHASH) {
+		if (!value) {
+			ipe_digest_free(blob->root_hash);
+			blob->root_hash = NULL;
+
+			return 0;
+		}
+		digest = value;
+
+		info = kzalloc(sizeof(*info), GFP_KERNEL);
+		if (!info)
+			return -ENOMEM;
+
+		info->digest = kmemdup(digest->digest, digest->digest_len,
+				       GFP_KERNEL);
+		if (!info->digest)
+			goto dmv_roothash_err;
+
+		info->alg = kstrdup(digest->alg, GFP_KERNEL);
+		if (!info->alg)
+			goto dmv_roothash_err;
+
+		info->digest_len = digest->digest_len;
+
+		blob->root_hash = info;
+
+		return 0;
+dmv_roothash_err:
+		ipe_digest_free(info);
+
+		return -ENOMEM;
+	} else if (type == LSM_INT_DMVERITY_SIG_VALID) {
+		blob->dm_verity_signed = size > 0 && value;
+
+		return 0;
+	}
+
+	return -EINVAL;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index f4f0b544ddcc..4d585fb6ada3 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -8,6 +8,7 @@
 #include <linux/fs.h>
 #include <linux/binfmts.h>
 #include <linux/security.h>
+#include <linux/blk_types.h>
 
 enum ipe_hook_type {
 	IPE_HOOK_BPRM_CHECK = 0,
@@ -35,4 +36,11 @@ int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents);
 
 void ipe_unpack_initramfs(void);
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+void ipe_bdev_free_security(struct block_device *bdev);
+
+int ipe_bdev_setintegrity(struct block_device *bdev, enum lsm_integrity_type type,
+			  const void *value, size_t len);
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 #endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 53f2196b9bcc..99cb42caa63a 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -7,11 +7,15 @@
 #include "ipe.h"
 #include "eval.h"
 #include "hooks.h"
+#include "eval.h"
 
 bool ipe_enabled;
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 	.lbs_superblock = sizeof(struct ipe_superblock),
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+	.lbs_bdev = sizeof(struct ipe_bdev),
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
 };
 
 static const struct lsm_id ipe_lsmid = {
@@ -24,6 +28,13 @@ struct ipe_superblock *ipe_sb(const struct super_block *sb)
 	return sb->s_security + ipe_blobs.lbs_superblock;
 }
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+struct ipe_bdev *ipe_bdev(struct block_device *b)
+{
+	return b->security + ipe_blobs.lbs_bdev;
+}
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
 	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
@@ -31,6 +42,10 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(kernel_read_file, ipe_kernel_read_file),
 	LSM_HOOK_INIT(kernel_load_data, ipe_kernel_load_data),
 	LSM_HOOK_INIT(initramfs_populated, ipe_unpack_initramfs),
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+	LSM_HOOK_INIT(bdev_free_security, ipe_bdev_free_security),
+	LSM_HOOK_INIT(bdev_setintegrity, ipe_bdev_setintegrity),
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
 };
 
 /**
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index 4aa18d1d0525..01f46286e383 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -16,4 +16,8 @@ struct ipe_superblock *ipe_sb(const struct super_block *sb);
 
 extern bool ipe_enabled;
 
+#ifdef CONFIG_IPE_PROP_DM_VERITY
+struct ipe_bdev *ipe_bdev(struct block_device *b);
+#endif /* CONFIG_IPE_PROP_DM_VERITY */
+
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index ffd60cc7fda6..26776092c710 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -33,6 +33,9 @@ enum ipe_action_type {
 enum ipe_prop_type {
 	IPE_PROP_BOOT_VERIFIED_FALSE,
 	IPE_PROP_BOOT_VERIFIED_TRUE,
+	IPE_PROP_DMV_ROOTHASH,
+	IPE_PROP_DMV_SIG_FALSE,
+	IPE_PROP_DMV_SIG_TRUE,
 	__IPE_PROP_MAX
 };
 
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 84cc688be3a2..71c84b293029 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -11,6 +11,7 @@
 
 #include "policy.h"
 #include "policy_parser.h"
+#include "digest.h"
 
 #define START_COMMENT	'#'
 #define IPE_POLICY_DELIM " \t"
@@ -221,6 +222,7 @@ static void free_rule(struct ipe_rule *r)
 
 	list_for_each_entry_safe(p, t, &r->props, next) {
 		list_del(&p->next);
+		ipe_digest_free(p->value);
 		kfree(p);
 	}
 
@@ -273,6 +275,9 @@ static enum ipe_action_type parse_action(char *t)
 static const match_table_t property_tokens = {
 	{IPE_PROP_BOOT_VERIFIED_FALSE,	"boot_verified=FALSE"},
 	{IPE_PROP_BOOT_VERIFIED_TRUE,	"boot_verified=TRUE"},
+	{IPE_PROP_DMV_ROOTHASH,		"dmverity_roothash=%s"},
+	{IPE_PROP_DMV_SIG_FALSE,	"dmverity_signature=FALSE"},
+	{IPE_PROP_DMV_SIG_TRUE,		"dmverity_signature=TRUE"},
 	{IPE_PROP_INVALID,		NULL}
 };
 
@@ -295,6 +300,7 @@ static int parse_property(char *t, struct ipe_rule *r)
 	struct ipe_prop *p = NULL;
 	int rc = 0;
 	int token;
+	char *dup = NULL;
 
 	p = kzalloc(sizeof(*p), GFP_KERNEL);
 	if (!p)
@@ -303,8 +309,22 @@ static int parse_property(char *t, struct ipe_rule *r)
 	token = match_token(t, property_tokens, args);
 
 	switch (token) {
+	case IPE_PROP_DMV_ROOTHASH:
+		dup = match_strdup(&args[0]);
+		if (!dup) {
+			rc = -ENOMEM;
+			goto err;
+		}
+		p->value = ipe_digest_parse(dup);
+		if (IS_ERR(p->value)) {
+			rc = PTR_ERR(p->value);
+			goto err;
+		}
+		fallthrough;
 	case IPE_PROP_BOOT_VERIFIED_FALSE:
 	case IPE_PROP_BOOT_VERIFIED_TRUE:
+	case IPE_PROP_DMV_SIG_FALSE:
+	case IPE_PROP_DMV_SIG_TRUE:
 		p->type = token;
 		break;
 	default:
@@ -315,10 +335,12 @@ static int parse_property(char *t, struct ipe_rule *r)
 		goto err;
 	list_add_tail(&p->next, &r->props);
 
+out:
+	kfree(dup);
 	return rc;
 err:
 	kfree(p);
-	return rc;
+	goto out;
 }
 
 /**
-- 
2.44.0



* [PATCH v17 09/21] uapi|audit|ipe: add ipe auditing support
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (7 preceding siblings ...)
  2024-04-13  0:55 28% ` [PATCH v17 08/21] ipe: add userspace interface Fan Wu
@ 2024-04-13  0:55 26% ` Fan Wu
  2024-04-13  0:55 47% ` [PATCH v17 10/21] ipe: add permissive toggle Fan Wu
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Users of IPE require a way to identify when and why an operation fails,
allowing them to both respond to violations of policy and be notified
of potentially malicious actions on their systems with respect to IPE
itself.

This patch introduces 3 new audit events.

AUDIT_IPE_ACCESS(1420) indicates the result of an IPE policy evaluation
of a resource.
AUDIT_IPE_CONFIG_CHANGE(1421) indicates the current active IPE policy
has been changed to another loaded policy.
AUDIT_IPE_POLICY_LOAD(1422) indicates a new IPE policy has been loaded
into the kernel.

This patch also adds support for success auditing, allowing users to
identify why an allow decision was made for a resource. However, it is
recommended to use this option with caution, as it is quite noisy.

Here are some examples of the new audit record types:

AUDIT_IPE_ACCESS(1420):

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=297 comm="sh" path="/root/vol/bin/hello" dev="tmpfs"
      ino=3897 rule="op=EXECUTE boot_verified=TRUE action=ALLOW"

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=299 comm="sh" path="/mnt/ipe/bin/hello" dev="dm-0"
      ino=2 rule="DEFAULT action=DENY"

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
     pid=300 path="/tmp/tmpdp2h1lub/deny/bin/hello" dev="tmpfs"
      ino=131 rule="DEFAULT action=DENY"

The above three records were generated when the active IPE policy only
allowed binaries from the initramfs to run. Three identical `hello`
binaries were placed at different locations; only the first `hello`, from
the rootfs (initramfs), was allowed.

The ipe_op field is followed by the name of the IPE operation associated
with the log.

The ipe_hook field is followed by the name of the LSM hook that triggered
the IPE event.

The enforcing field is followed by the enforcement state of IPE. (It will
be introduced in the next commit.)

The pid field is followed by the pid of the process that triggered the IPE
event.

The comm field is followed by the command line program name of the process
that triggered the IPE event.

The path field is followed by the file's path name.

The dev field is followed by the device name, as found in /dev, that the
file is from.
Note that for device mapper devices it will use the name `dm-X` instead of
the name in /dev/mapper.
For a file in a temporary file system, which is not backed by a device, it
will use `tmpfs` for the field.
The implementation of this part follows an existing use case,
LSM_AUDIT_DATA_INODE in security/lsm_audit.c.

The ino field is followed by the file's inode number.

The rule field is followed by the IPE rule that made the access decision.
The whole rule must be audited because the decision is based on the
combination of all property conditions in the rule.
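
For log consumers, the fields above can be pulled out of a record with a
small userspace helper. The following is a minimal sketch and not part of
this patch: audit_field() is a hypothetical name, and the sketch assumes
that values are either bare words or double-quoted strings. It does not
decode values that the audit subsystem emits as unquoted hex for
untrusted strings. Quoted values are skipped as a unit, so the "op="
inside the rule field is never mistaken for a top-level field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_ws(char c)
{
	return c == ' ' || c == '\t' || c == '\n';
}

/*
 * Copy the value of top-level field @key from audit record @rec into @out.
 * Returns 0 on success, -1 if the field is missing, the record is
 * malformed, or @out is too small.
 */
int audit_field(const char *rec, const char *key, char *out, size_t outsz)
{
	size_t klen = strlen(key);
	const char *p = rec;

	while (*p) {
		const char *eq, *val, *end, *next;
		bool match;

		while (is_ws(*p))
			p++;
		if (!*p)
			break;

		/* Scan the token name up to '=' or whitespace. */
		eq = p;
		while (*eq && *eq != '=' && !is_ws(*eq))
			eq++;
		if (*eq != '=') {
			/* Bare word such as "audit:" or "AUDIT1420". */
			p = eq;
			continue;
		}
		match = (size_t)(eq - p) == klen && !strncmp(p, key, klen);

		val = eq + 1;
		if (*val == '"') {
			/* Quoted value: runs to the closing quote. */
			val++;
			end = strchr(val, '"');
			if (!end)
				return -1;
			next = end + 1;
		} else {
			/* Bare value: runs to the next whitespace. */
			end = val;
			while (*end && !is_ws(*end))
				end++;
			next = end;
		}

		if (match) {
			size_t n = (size_t)(end - val);

			if (n >= outsz)
				return -1;
			memcpy(out, val, n);
			out[n] = '\0';
			return 0;
		}
		p = next;
	}
	return -1;
}
```

Calling audit_field(rec, "rule", buf, sizeof(buf)) on the first record
above would yield the whole rule string, which is what is needed to see
the combination of properties behind the decision.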

Along with the syscall audit event, the user can learn why an operation
was blocked. For example:

    audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
      pid=2138 comm="bash" path="/mnt/ipe/bin/hello" dev="dm-0"
      ino=2 rule="DEFAULT action=DENY"
    audit[1956]: SYSCALL arch=c000003e syscall=59
      success=no exit=-13 a0=556790138df0 a1=556790135390 a2=5567901338b0
      a3=ab2a41a67f4f1f4e items=1 ppid=147 pid=1956 auid=4294967295 uid=0
      gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0
      ses=4294967295 comm="bash" exe="/usr/bin/bash" key=(null)

The above two records show that bash used execve to run "hello" and was
blocked by IPE. Note that the IPE records always precede the SYSCALL
record.

AUDIT_IPE_CONFIG_CHANGE(1421):

    audit: AUDIT1421
      old_active_pol_name="Allow_All" old_active_pol_version=0.0.0
      old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649
      new_active_pol_name="boot_verified" new_active_pol_version=0.0.0
      new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F
      auid=4294967295 ses=4294967295 lsm=ipe res=1

The above record shows the active IPE policy switching from
`Allow_All` to `boot_verified`, along with the version and the hash
digest of the two policies. Note that IPE can only have one policy active
at a time; all access decisions are evaluated against the currently
active policy.
The normal procedure for deploying a policy is to first load it into the
kernel, then switch the active policy to it.

AUDIT_IPE_POLICY_LOAD(1422):

    audit: AUDIT1422 policy_name="boot_verified" policy_version=0.0.0
      policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F2676
      auid=4294967295 ses=4294967295 lsm=ipe res=1

The above record shows that a new policy has been loaded into the kernel,
along with the policy name, policy version and policy hash.
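
To make the layout of the policy fields concrete, the userspace sketch
below builds the "policy_name=... policy_version=... policy_digest=..."
portion of such a record. It is an approximation for illustration, not
the kernel implementation: format_policy_load() is a hypothetical name,
the digest algorithm is assumed to be sha256 as in the examples, and the
trailing auid/ses/lsm/res fields are left out.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the policy fields of a policy-load record into @buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if @buf is too small.
 */
int format_policy_load(char *buf, size_t bufsz, const char *name,
		       unsigned short major, unsigned short minor,
		       unsigned short rev,
		       const unsigned char *digest, size_t digest_len)
{
	size_t off, i;
	int n;

	n = snprintf(buf, bufsz,
		     "policy_name=\"%s\" policy_version=%hu.%hu.%hu "
		     "policy_digest=sha256:",
		     name, major, minor, rev);
	if (n < 0 || (size_t)n >= bufsz)
		return -1;
	off = (size_t)n;

	/* Append the digest as uppercase hex, two characters per byte. */
	for (i = 0; i < digest_len; i++) {
		if (off + 3 > bufsz)
			return -1;
		snprintf(buf + off, bufsz - off, "%02X", digest[i]);
		off += 2;
	}
	return (int)off;
}
```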

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation, the audit system, the evaluation loop,
    and access control hooks into separate patches.
  + Further split audit system patch into two separate patches; one
    for include/uapi, and the usage of the new defines.
  + Split out the permissive functionality into another separate patch,
    for easier review.
  + Correct misuse of audit_log_n_untrusted string to audit_log_format
  + Use get_task_comm instead of comm directly.
  + Quote certain audit values
  + Remove unnecessary help text on choice options - these were
    previously indented at the wrong level
  + Correct a stale string constant (ctx_ns_enforce to ctx_enforce)

v8:

  + Change dependency for CONFIG_AUDIT to CONFIG_AUDITSYSCALL
  + Drop ctx_* prefix
  + Reuse, where appropriate, the audit fields from the field
    dictionary. This transforms:
      ctx_pathname  -> path
      ctx_ino       -> ino
      ctx_dev       -> dev

  + Add audit records and event examples to commit description.
  + Remove new_audit_ctx, replace with audit_log_start. All data that
    would be provided by new_audit_ctx is already present in the syscall
    audit record, that is always emitted on these actions. The audit
    records should be correlated as such.
  + Change audit types:
    + AUDIT_TRUST_RESULT                -> AUDIT_IPE_ACCESS
      +  This prevents overloading of the AVC type.
    + AUDIT_TRUST_POLICY_ACTIVATE       -> AUDIT_MAC_CONFIG_CHANGE
    + AUDIT_TRUST_POLICY_LOAD           -> AUDIT_MAC_POLICY_LOAD
      + There was no significant difference in meaning between
        these types.

  + Remove enforcing parameter passed from the context structure
    for AUDIT_IPE_ACCESS.
    +  This field can be inferred from the SYSCALL audit event,
       based on the success field.

  + Remove all fields already captured in the syscall record. "hook",
    an IPE specific field, can be determined via the syscall field in
    the syscall record itself, so it has been removed.
      + ino, path, and dev in IPE's record refer to the subject of the
        syscall, while the syscall record refers to the calling process.

  + remove IPE prefix from policy load/policy activation events
  + fix a bug wherein a policy change audit record was not fired when
    updating a policy

v9:
  + Merge the AUDIT_IPE_ACCESS definition with the audit support commit
  + Change the audit format of policy load and switch
  + Remove the ipe audit kernel switch

v10:
  + Create AUDIT_IPE_CONFIG_CHANGE and AUDIT_IPE_POLICY_LOAD
  + Change field names per upstream feedback

v11:
  + Fix style issues

v12:
  + Add ipe_op, ipe_hook, and enforcing fields to AUDIT_IPE_ACCESS

v13:
  + Remove dependency on CONFIG_BLK_DEV_INITRD
  + Add field placeholders for anonymous files

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
---
 include/uapi/linux/audit.h |   3 +
 security/ipe/Kconfig       |   2 +-
 security/ipe/Makefile      |   1 +
 security/ipe/audit.c       | 214 +++++++++++++++++++++++++++++++++++++
 security/ipe/audit.h       |  18 ++++
 security/ipe/eval.c        |  44 ++++++--
 security/ipe/eval.h        |  13 ++-
 security/ipe/fs.c          |  68 ++++++++++++
 security/ipe/hooks.c       |  10 +-
 security/ipe/hooks.h       |  11 ++
 security/ipe/policy.c      |   5 +
 11 files changed, 372 insertions(+), 17 deletions(-)
 create mode 100644 security/ipe/audit.c
 create mode 100644 security/ipe/audit.h

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index d676ed2b246e..75e21a135483 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -143,6 +143,9 @@
 #define AUDIT_MAC_UNLBL_STCDEL	1417	/* NetLabel: del a static label */
 #define AUDIT_MAC_CALIPSO_ADD	1418	/* NetLabel: add CALIPSO DOI entry */
 #define AUDIT_MAC_CALIPSO_DEL	1419	/* NetLabel: del CALIPSO DOI entry */
+#define AUDIT_IPE_ACCESS	1420	/* IPE denial or grant */
+#define AUDIT_IPE_CONFIG_CHANGE	1421	/* IPE config change */
+#define AUDIT_IPE_POLICY_LOAD	1422	/* IPE policy load */
 
 #define AUDIT_FIRST_KERN_ANOM_MSG   1700
 #define AUDIT_LAST_KERN_ANOM_MSG    1799
diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
index e4875fb04883..ac4d558e69d5 100644
--- a/security/ipe/Kconfig
+++ b/security/ipe/Kconfig
@@ -5,7 +5,7 @@
 
 menuconfig SECURITY_IPE
 	bool "Integrity Policy Enforcement (IPE)"
-	depends on SECURITY && SECURITYFS
+	depends on SECURITY && SECURITYFS && AUDIT && AUDITSYSCALL
 	select PKCS7_MESSAGE_PARSER
 	select SYSTEM_DATA_VERIFICATION
 	help
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index b97f8c10fe01..62caccba14b4 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_SECURITY_IPE) += \
 	policy.o \
 	policy_fs.o \
 	policy_parser.o \
+	audit.o \
diff --git a/security/ipe/audit.c b/security/ipe/audit.c
new file mode 100644
index 000000000000..6a3f24665655
--- /dev/null
+++ b/security/ipe/audit.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/slab.h>
+#include <linux/audit.h>
+#include <linux/types.h>
+#include <crypto/hash.h>
+
+#include "ipe.h"
+#include "eval.h"
+#include "hooks.h"
+#include "policy.h"
+#include "audit.h"
+
+#define ACTSTR(x) ((x) == IPE_ACTION_ALLOW ? "ALLOW" : "DENY")
+
+#define IPE_AUDIT_HASH_ALG "sha256"
+
+#define AUDIT_POLICY_LOAD_FMT "policy_name=\"%s\" policy_version=%hu.%hu.%hu "\
+			      "policy_digest=" IPE_AUDIT_HASH_ALG ":"
+#define AUDIT_OLD_ACTIVE_POLICY_FMT "old_active_pol_name=\"%s\" "\
+				    "old_active_pol_version=%hu.%hu.%hu "\
+				    "old_policy_digest=" IPE_AUDIT_HASH_ALG ":"
+#define AUDIT_NEW_ACTIVE_POLICY_FMT "new_active_pol_name=\"%s\" "\
+				    "new_active_pol_version=%hu.%hu.%hu "\
+				    "new_policy_digest=" IPE_AUDIT_HASH_ALG ":"
+
+static const char *const audit_op_names[__IPE_OP_MAX + 1] = {
+	"EXECUTE",
+	"FIRMWARE",
+	"KMODULE",
+	"KEXEC_IMAGE",
+	"KEXEC_INITRAMFS",
+	"POLICY",
+	"X509_CERT",
+	"UNKNOWN",
+};
+
+static const char *const audit_hook_names[__IPE_HOOK_MAX] = {
+	"BPRM_CHECK",
+	"MMAP",
+	"MPROTECT",
+	"KERNEL_READ",
+	"KERNEL_LOAD",
+};
+
+static const char *const audit_prop_names[__IPE_PROP_MAX] = {
+	"boot_verified=FALSE",
+	"boot_verified=TRUE",
+};
+
+/**
+ * audit_rule() - audit an IPE policy rule.
+ * @ab: Supplies a pointer to the audit_buffer to append to.
+ * @r: Supplies a pointer to the ipe_rule to approximate a string form for.
+ */
+static void audit_rule(struct audit_buffer *ab, const struct ipe_rule *r)
+{
+	const struct ipe_prop *ptr;
+
+	audit_log_format(ab, " rule=\"%s ", audit_op_names[r->op]);
+
+	list_for_each_entry(ptr, &r->props, next)
+		audit_log_format(ab, "%s ", audit_prop_names[ptr->type]);
+
+	audit_log_format(ab, "action=%s\"", ACTSTR(r->action));
+}
+
+/**
+ * ipe_audit_match() - Audit a rule match in a policy evaluation.
+ * @ctx: Supplies a pointer to the evaluation context that was used in the
+ *	 evaluation.
+ * @match_type: Supplies the scope of the match: rule, operation default,
+ *		global default.
+ * @act: Supplies the IPE's evaluation decision, deny or allow.
+ * @r: Supplies a pointer to the rule that was matched, if possible.
+ */
+void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
+		     enum ipe_match match_type,
+		     enum ipe_action_type act, const struct ipe_rule *const r)
+{
+	const char *op = audit_op_names[ctx->op];
+	char comm[sizeof(current->comm)];
+	struct audit_buffer *ab;
+	struct inode *inode;
+
+	if (act != IPE_ACTION_DENY && !READ_ONCE(success_audit))
+		return;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_IPE_ACCESS);
+	if (!ab)
+		return;
+
+	audit_log_format(ab, "ipe_op=%s ipe_hook=%s pid=%d comm=",
+			 op, audit_hook_names[ctx->hook],
+			 task_tgid_nr(current));
+	audit_log_untrustedstring(ab, get_task_comm(comm, current));
+
+	if (ctx->file) {
+		audit_log_d_path(ab, " path=", &ctx->file->f_path);
+		inode = file_inode(ctx->file);
+		if (inode) {
+			audit_log_format(ab, " dev=");
+			audit_log_untrustedstring(ab, inode->i_sb->s_id);
+			audit_log_format(ab, " ino=%lu", inode->i_ino);
+		} else {
+			audit_log_format(ab, " dev=? ino=?");
+		}
+	} else {
+		audit_log_format(ab, " path=? dev=? ino=?");
+	}
+
+	if (match_type == IPE_MATCH_RULE)
+		audit_rule(ab, r);
+	else if (match_type == IPE_MATCH_TABLE)
+		audit_log_format(ab, " rule=\"DEFAULT op=%s action=%s\"", op,
+				 ACTSTR(act));
+	else
+		audit_log_format(ab, " rule=\"DEFAULT action=%s\"",
+				 ACTSTR(act));
+
+	audit_log_end(ab);
+}
+
+/**
+ * audit_policy() - Audit a policy's name, version and thumbprint to @ab.
+ * @ab: Supplies a pointer to the audit buffer to append to.
+ * @audit_format: Supplies a pointer to the audit format string
+ * @p: Supplies a pointer to the policy to audit.
+ */
+static void audit_policy(struct audit_buffer *ab,
+			 const char *audit_format,
+			 const struct ipe_policy *const p)
+{
+	SHASH_DESC_ON_STACK(desc, tfm);
+	struct crypto_shash *tfm;
+	u8 *digest = NULL;
+
+	tfm = crypto_alloc_shash(IPE_AUDIT_HASH_ALG, 0, 0);
+	if (IS_ERR(tfm))
+		return;
+
+	desc->tfm = tfm;
+
+	digest = kzalloc(crypto_shash_digestsize(tfm), GFP_KERNEL);
+	if (!digest)
+		goto out;
+
+	if (crypto_shash_init(desc))
+		goto out;
+
+	if (crypto_shash_update(desc, p->pkcs7, p->pkcs7len))
+		goto out;
+
+	if (crypto_shash_final(desc, digest))
+		goto out;
+
+	audit_log_format(ab, audit_format, p->parsed->name,
+			 p->parsed->version.major, p->parsed->version.minor,
+			 p->parsed->version.rev);
+	audit_log_n_hex(ab, digest, crypto_shash_digestsize(tfm));
+
+out:
+	kfree(digest);
+	crypto_free_shash(tfm);
+}
+
+/**
+ * ipe_audit_policy_activation() - Audit a policy being activated.
+ * @op: Supplies a pointer to the previously activated policy to audit.
+ * @np: Supplies a pointer to the newly activated policy to audit.
+ */
+void ipe_audit_policy_activation(const struct ipe_policy *const op,
+				 const struct ipe_policy *const np)
+{
+	struct audit_buffer *ab;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL,
+			     AUDIT_IPE_CONFIG_CHANGE);
+	if (!ab)
+		return;
+
+	audit_policy(ab, AUDIT_OLD_ACTIVE_POLICY_FMT, op);
+	audit_log_format(ab, " ");
+	audit_policy(ab, AUDIT_NEW_ACTIVE_POLICY_FMT, np);
+	audit_log_format(ab, " auid=%u ses=%u lsm=ipe res=1",
+			 from_kuid(&init_user_ns, audit_get_loginuid(current)),
+			 audit_get_sessionid(current));
+
+	audit_log_end(ab);
+}
+
+/**
+ * ipe_audit_policy_load() - Audit a policy being loaded into the kernel.
+ * @p: Supplies a pointer to the policy to audit.
+ */
+void ipe_audit_policy_load(const struct ipe_policy *const p)
+{
+	struct audit_buffer *ab;
+
+	ab = audit_log_start(audit_context(), GFP_KERNEL,
+			     AUDIT_IPE_POLICY_LOAD);
+	if (!ab)
+		return;
+
+	audit_policy(ab, AUDIT_POLICY_LOAD_FMT, p);
+	audit_log_format(ab, " auid=%u ses=%u lsm=ipe res=1",
+			 from_kuid(&init_user_ns, audit_get_loginuid(current)),
+			 audit_get_sessionid(current));
+
+	audit_log_end(ab);
+}
diff --git a/security/ipe/audit.h b/security/ipe/audit.h
new file mode 100644
index 000000000000..3ba8b8a91541
--- /dev/null
+++ b/security/ipe/audit.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_AUDIT_H
+#define _IPE_AUDIT_H
+
+#include "policy.h"
+
+void ipe_audit_match(const struct ipe_eval_ctx *const ctx,
+		     enum ipe_match match_type,
+		     enum ipe_action_type act, const struct ipe_rule *const r);
+void ipe_audit_policy_load(const struct ipe_policy *const p);
+void ipe_audit_policy_activation(const struct ipe_policy *const op,
+				 const struct ipe_policy *const np);
+
+#endif /* _IPE_AUDIT_H */
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 28b3bded06c2..18fd5d8fa03e 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -9,12 +9,15 @@
 #include <linux/file.h>
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
+#include <linux/moduleparam.h>
 
 #include "ipe.h"
 #include "eval.h"
 #include "policy.h"
+#include "audit.h"
 
 struct ipe_policy __rcu *ipe_active_policy;
+bool success_audit;
 
 #define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
 
@@ -33,13 +36,16 @@ static void build_ipe_sb_ctx(struct ipe_eval_ctx *ctx, const struct file *const
  * @ctx: Supplies a pointer to the context to be populated.
  * @file: Supplies a pointer to the file to associated with the evaluation.
  * @op: Supplies the IPE policy operation associated with the evaluation.
+ * @hook: Supplies the LSM hook associated with the evaluation.
  */
 void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 			const struct file *file,
-			enum ipe_op_type op)
+			enum ipe_op_type op,
+			enum ipe_hook_type hook)
 {
 	ctx->file = file;
 	ctx->op = op;
+	ctx->hook = hook;
 
 	if (file)
 		build_ipe_sb_ctx(ctx, file);
@@ -100,6 +106,7 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	struct ipe_policy *pol = NULL;
 	struct ipe_prop *prop = NULL;
 	enum ipe_action_type action;
+	enum ipe_match match_type;
 	bool match = false;
 
 	rcu_read_lock();
@@ -111,14 +118,15 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 	}
 
 	if (ctx->op == IPE_OP_INVALID) {
-		if (pol->parsed->global_default_action == IPE_ACTION_DENY) {
-			rcu_read_unlock();
-			return -EACCES;
-		}
-		if (pol->parsed->global_default_action == IPE_ACTION_INVALID)
+		if (pol->parsed->global_default_action == IPE_ACTION_INVALID) {
 			WARN(1, "no default rule set for unknown op, ALLOW it");
+			action = IPE_ACTION_ALLOW;
+		} else {
+			action = pol->parsed->global_default_action;
+		}
 		rcu_read_unlock();
-		return 0;
+		match_type = IPE_MATCH_GLOBAL;
+		goto eval;
 	}
 
 	rules = &pol->parsed->rules[ctx->op];
@@ -136,16 +144,32 @@ int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
 			break;
 	}
 
-	if (match)
+	if (match) {
 		action = rule->action;
-	else if (rules->default_action != IPE_ACTION_INVALID)
+		match_type = IPE_MATCH_RULE;
+	} else if (rules->default_action != IPE_ACTION_INVALID) {
 		action = rules->default_action;
-	else
+		match_type = IPE_MATCH_TABLE;
+	} else {
 		action = pol->parsed->global_default_action;
+		match_type = IPE_MATCH_GLOBAL;
+	}
 
 	rcu_read_unlock();
+eval:
+	ipe_audit_match(ctx, match_type, action, rule);
+
 	if (action == IPE_ACTION_DENY)
 		return -EACCES;
 
 	return 0;
 }
+
+/* Set the right module name */
+#ifdef KBUILD_MODNAME
+#undef KBUILD_MODNAME
+#define KBUILD_MODNAME "ipe"
+#endif
+
+module_param(success_audit, bool, 0400);
+MODULE_PARM_DESC(success_audit, "Start IPE with success auditing enabled");
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 0fa6492354dd..42b74a7a7c2b 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -10,10 +10,12 @@
 #include <linux/types.h>
 
 #include "policy.h"
+#include "hooks.h"
 
 #define IPE_EVAL_CTX_INIT ((struct ipe_eval_ctx){ 0 })
 
 extern struct ipe_policy __rcu *ipe_active_policy;
+extern bool success_audit;
 
 struct ipe_superblock {
 	bool initramfs;
@@ -21,14 +23,23 @@ struct ipe_superblock {
 
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
+	enum ipe_hook_type hook;
 
 	const struct file *file;
 	bool initramfs;
 };
 
+enum ipe_match {
+	IPE_MATCH_RULE = 0,
+	IPE_MATCH_TABLE,
+	IPE_MATCH_GLOBAL,
+	__IPE_MATCH_MAX
+};
+
 void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 			const struct file *file,
-			enum ipe_op_type op);
+			enum ipe_op_type op,
+			enum ipe_hook_type hook);
 int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx);
 
 #endif /* _IPE_EVAL_H */
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
index 49484c8feead..9e410982b759 100644
--- a/security/ipe/fs.c
+++ b/security/ipe/fs.c
@@ -8,11 +8,62 @@
 
 #include "ipe.h"
 #include "fs.h"
+#include "eval.h"
 #include "policy.h"
+#include "audit.h"
 
 static struct dentry *np __ro_after_init;
 static struct dentry *root __ro_after_init;
 struct dentry *policy_root __ro_after_init;
+static struct dentry *audit_node __ro_after_init;
+
+/**
+ * setaudit() - Write handler for the securityfs node, "ipe/success_audit"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ */
+static ssize_t setaudit(struct file *f, const char __user *data,
+			size_t len, loff_t *offset)
+{
+	int rc = 0;
+	bool value;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	rc = kstrtobool_from_user(data, len, &value);
+	if (rc)
+		return rc;
+
+	WRITE_ONCE(success_audit, value);
+
+	return len;
+}
+
+/**
+ * getaudit() - Read handler for the securityfs node, "ipe/success_audit"
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return: Length of buffer written
+ */
+static ssize_t getaudit(struct file *f, char __user *data,
+			size_t len, loff_t *offset)
+{
+	const char *result;
+
+	result = ((READ_ONCE(success_audit)) ? "1" : "0");
+
+	return simple_read_from_buffer(data, len, offset, result, 1);
+}
 
 /**
  * new_policy() - Write handler for the securityfs node, "ipe/new_policy".
@@ -51,6 +102,10 @@ static ssize_t new_policy(struct file *f, const char __user *data,
 	}
 
 	rc = ipe_new_policyfs_node(p);
+	if (rc)
+		goto out;
+
+	ipe_audit_policy_load(p);
 
 out:
 	if (rc < 0)
@@ -63,6 +118,11 @@ static const struct file_operations np_fops = {
 	.write = new_policy,
 };
 
+static const struct file_operations audit_fops = {
+	.write = setaudit,
+	.read = getaudit,
+};
+
 /**
  * ipe_init_securityfs() - Initialize IPE's securityfs tree at fsinit.
  *
@@ -82,6 +142,13 @@ static int __init ipe_init_securityfs(void)
 		goto err;
 	}
 
+	audit_node = securityfs_create_file("success_audit", 0600, root,
+					    NULL, &audit_fops);
+	if (IS_ERR(audit_node)) {
+		rc = PTR_ERR(audit_node);
+		goto err;
+	}
+
 	policy_root = securityfs_create_dir("policies", root);
 	if (IS_ERR(policy_root)) {
 		rc = PTR_ERR(policy_root);
@@ -98,6 +165,7 @@ static int __init ipe_init_securityfs(void)
 err:
 	securityfs_remove(np);
 	securityfs_remove(policy_root);
+	securityfs_remove(audit_node);
 	securityfs_remove(root);
 	return rc;
 }
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index 76370919aac0..b68719bf44fb 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -29,7 +29,7 @@ int ipe_bprm_check_security(struct linux_binprm *bprm)
 {
 	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
 
-	ipe_build_eval_ctx(&ctx, bprm->file, IPE_OP_EXEC);
+	ipe_build_eval_ctx(&ctx, bprm->file, IPE_OP_EXEC, IPE_HOOK_BPRM_CHECK);
 	return ipe_evaluate_event(&ctx);
 }
 
@@ -54,7 +54,7 @@ int ipe_mmap_file(struct file *f, unsigned long reqprot __always_unused,
 	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
 
 	if (prot & PROT_EXEC) {
-		ipe_build_eval_ctx(&ctx, f, IPE_OP_EXEC);
+		ipe_build_eval_ctx(&ctx, f, IPE_OP_EXEC, IPE_HOOK_MMAP);
 		return ipe_evaluate_event(&ctx);
 	}
 
@@ -86,7 +86,7 @@ int ipe_file_mprotect(struct vm_area_struct *vma,
 		return 0;
 
 	if (prot & PROT_EXEC) {
-		ipe_build_eval_ctx(&ctx, vma->vm_file, IPE_OP_EXEC);
+		ipe_build_eval_ctx(&ctx, vma->vm_file, IPE_OP_EXEC, IPE_HOOK_MPROTECT);
 		return ipe_evaluate_event(&ctx);
 	}
 
@@ -136,7 +136,7 @@ int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
 		WARN(1, "no rule setup for kernel_read_file enum %d", id);
 	}
 
-	ipe_build_eval_ctx(&ctx, file, op);
+	ipe_build_eval_ctx(&ctx, file, op, IPE_HOOK_KERNEL_READ);
 	return ipe_evaluate_event(&ctx);
 }
 
@@ -180,7 +180,7 @@ int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents)
 		WARN(1, "no rule setup for kernel_load_data enum %d", id);
 	}
 
-	ipe_build_eval_ctx(&ctx, NULL, op);
+	ipe_build_eval_ctx(&ctx, NULL, op, IPE_HOOK_KERNEL_LOAD);
 	return ipe_evaluate_event(&ctx);
 }
 
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index 4de5fabebd54..f4f0b544ddcc 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -9,6 +9,17 @@
 #include <linux/binfmts.h>
 #include <linux/security.h>
 
+enum ipe_hook_type {
+	IPE_HOOK_BPRM_CHECK = 0,
+	IPE_HOOK_MMAP,
+	IPE_HOOK_MPROTECT,
+	IPE_HOOK_KERNEL_READ,
+	IPE_HOOK_KERNEL_LOAD,
+	__IPE_HOOK_MAX
+};
+
+#define IPE_HOOK_INVALID __IPE_HOOK_MAX
+
 int ipe_bprm_check_security(struct linux_binprm *bprm);
 
 int ipe_mmap_file(struct file *f, unsigned long reqprot, unsigned long prot,
diff --git a/security/ipe/policy.c b/security/ipe/policy.c
index 112913f83c6d..5ac7640a8aef 100644
--- a/security/ipe/policy.c
+++ b/security/ipe/policy.c
@@ -11,6 +11,7 @@
 #include "fs.h"
 #include "policy.h"
 #include "policy_parser.h"
+#include "audit.h"
 
 /* lock for synchronizing writers across ipe policy */
 DEFINE_MUTEX(ipe_policy_lock);
@@ -112,6 +113,7 @@ int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
 
 	root->i_private = new;
 	swap(new->policyfs, old->policyfs);
+	ipe_audit_policy_load(new);
 
 	mutex_lock(&ipe_policy_lock);
 	ap = rcu_dereference_protected(ipe_active_policy,
@@ -120,6 +122,7 @@ int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
 		rcu_assign_pointer(ipe_active_policy, new);
 		mutex_unlock(&ipe_policy_lock);
 		synchronize_rcu();
+		ipe_audit_policy_activation(old, new);
 	} else {
 		mutex_unlock(&ipe_policy_lock);
 	}
@@ -220,5 +223,7 @@ int ipe_set_active_pol(const struct ipe_policy *p)
 	mutex_unlock(&ipe_policy_lock);
 	synchronize_rcu();
 
+	ipe_audit_policy_activation(ap, p);
+
 	return 0;
 }
-- 
2.44.0


^ permalink raw reply related	[relevance 26%]

* [PATCH v17 07/21] security: add new securityfs delete function
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (5 preceding siblings ...)
  2024-04-13  0:55 47% ` [PATCH v17 06/21] ipe: introduce 'boot_verified' as a trust provider Fan Wu
@ 2024-04-13  0:55 69% ` Fan Wu
  2024-04-13  0:55 28% ` [PATCH v17 08/21] ipe: add userspace interface Fan Wu
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

When deleting a directory in the security file system, the existing
securityfs_remove() requires the directory to be empty; otherwise it
does nothing. This risks leaving the security file system in an unclean
state when the intended deletion did not happen.

This commit introduces a new function, securityfs_recursive_remove(),
to recursively delete a directory without leaving an unclean state.

Co-developed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
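The difference between the two removal behaviours described in the commit
message can be sketched in userspace. This is only an illustrative model,
not the kernel implementation: struct node and the node_*() helpers are
hypothetical stand-ins for a dentry tree, and node_remove() returns a bool
purely so the "did nothing" case is observable (securityfs_remove() itself
returns void).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a securityfs dentry; not a kernel type. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/* Allocate a node and link it under @parent (NULL for a root). */
static struct node *node_create(struct node *parent)
{
	struct node *n = calloc(1, sizeof(*n));

	assert(n);
	n->parent = parent;
	if (parent) {
		n->next_sibling = parent->first_child;
		parent->first_child = n;
	}
	return n;
}

/* Unlink @n from its parent's child list, if it has a parent. */
static void node_unlink(struct node *n)
{
	struct node **pp;

	if (!n->parent)
		return;
	for (pp = &n->parent->first_child; *pp; pp = &(*pp)->next_sibling) {
		if (*pp == n) {
			*pp = n->next_sibling;
			break;
		}
	}
}

/* Models the non-recursive removal: a non-empty directory is left untouched. */
static bool node_remove(struct node *n)
{
	if (!n || n->first_child)
		return false;
	node_unlink(n);
	free(n);
	return true;
}

/* Models the recursive variant: frees the whole subtree, returns nodes removed. */
static int node_recursive_remove(struct node *n)
{
	int removed = 0;

	if (!n)
		return 0;
	/* Each child unlinks itself, so first_child advances until empty. */
	while (n->first_child)
		removed += node_recursive_remove(n->first_child);
	node_unlink(n);
	free(n);
	return removed + 1;
}
```

In this model, calling node_remove() on a directory that still has children
leaves the tree unchanged, which is the "unclean state" the patch is
concerned with, while node_recursive_remove() clears the subtree in one call.
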
v1-v8:
  + Not present

v9:
  + Introduced

v10:
  + No changes

v11:
  + Fix code style issues

v12:
  + No changes

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + No changes
---
 include/linux/security.h |  1 +
 security/inode.c         | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/linux/security.h b/include/linux/security.h
index 14fff542f2e3..f35af7b6cfba 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -2089,6 +2089,7 @@ struct dentry *securityfs_create_symlink(const char *name,
 					 const char *target,
 					 const struct inode_operations *iops);
 extern void securityfs_remove(struct dentry *dentry);
+extern void securityfs_recursive_remove(struct dentry *dentry);
 
 #else /* CONFIG_SECURITYFS */
 
diff --git a/security/inode.c b/security/inode.c
index 9e7cde913667..f21847badb7d 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -313,6 +313,31 @@ void securityfs_remove(struct dentry *dentry)
 }
 EXPORT_SYMBOL_GPL(securityfs_remove);
 
+static void remove_one(struct dentry *victim)
+{
+	simple_release_fs(&mount, &mount_count);
+}
+
+/**
+ * securityfs_recursive_remove - recursively removes a file or directory
+ *
+ * @dentry: a pointer to the dentry of the file or directory to be removed.
+ *
+ * This function recursively removes a file or directory in securityfs that was
+ * previously created with a call to another securityfs function (like
+ * securityfs_create_file() or variants thereof.)
+ */
+void securityfs_recursive_remove(struct dentry *dentry)
+{
+	if (IS_ERR_OR_NULL(dentry))
+		return;
+
+	simple_pin_fs(&fs_type, &mount, &mount_count);
+	simple_recursive_removal(dentry, remove_one);
+	simple_release_fs(&mount, &mount_count);
+}
+EXPORT_SYMBOL_GPL(securityfs_recursive_remove);
+
 #ifdef CONFIG_SECURITY
 static struct dentry *lsm_dentry;
 static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
-- 
2.44.0


^ permalink raw reply related	[relevance 69%]

* [PATCH v17 05/21] initramfs|security: Add a security hook to do_populate_rootfs()
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (3 preceding siblings ...)
  2024-04-13  0:55 45% ` [PATCH v17 04/21] ipe: add LSM hooks on execution and kernel read Fan Wu
@ 2024-04-13  0:55 67% ` Fan Wu
  2024-04-13  0:55 47% ` [PATCH v17 06/21] ipe: introduce 'boot_verified' as a trust provider Fan Wu
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

This patch introduces a new hook to notify the security system that the
contents of the initramfs have been unpacked into the rootfs.

Upon receiving this notification, the security system can activate
a policy to allow only files that originated from the initramfs to
execute or load into the kernel during the early stages of booting.

This approach is crucial for minimizing the attack surface by
ensuring that only trusted files from the initramfs are operational
in the critical boot phase.

Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
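The notification pattern described in the commit message can be modelled
in userspace as a table of callbacks that is walked once when the event
fires. This is a rough sketch only: the table, register_initramfs_populated()
and notify_initramfs_populated() are hypothetical names that stand in for
the LSM hook list and call_void_hook(), and the real kernel machinery is
different.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS 8

/* A void hook takes no arguments and returns nothing, as in the patch. */
typedef void (*populated_hook_t)(void);

static populated_hook_t hooks[MAX_HOOKS];
static size_t nr_hooks;

/* Register a callback; returns 0 on success, -1 if the table is full. */
static int register_initramfs_populated(populated_hook_t fn)
{
	assert(fn);
	if (nr_hooks == MAX_HOOKS)
		return -1;
	hooks[nr_hooks++] = fn;
	return 0;
}

/* Models security_initramfs_populated(): call every registered callback. */
static void notify_initramfs_populated(void)
{
	for (size_t i = 0; i < nr_hooks; i++)
		hooks[i]();
}

/* Example subscriber: remember that rootfs now holds the initramfs contents. */
static int rootfs_is_initramfs;

static void mark_rootfs(void)
{
	rootfs_is_initramfs = 1;
}
```

A subscriber only learns about the event when the notifier runs, so state
such as rootfs_is_initramfs stays unset until the unpacking step has
finished.
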
v1-v11:
  + Not present

v12:
  + Introduced

v13:
  + Rename the hook name to initramfs_populated()

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Fix documentation style issues
---
 include/linux/lsm_hook_defs.h |  2 ++
 include/linux/security.h      |  8 ++++++++
 init/initramfs.c              |  3 +++
 security/security.c           | 10 ++++++++++
 4 files changed, 23 insertions(+)

diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 334e00efbde4..7db99ae75651 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -450,3 +450,5 @@ LSM_HOOK(int, 0, uring_override_creds, const struct cred *new)
 LSM_HOOK(int, 0, uring_sqpoll, void)
 LSM_HOOK(int, 0, uring_cmd, struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_IO_URING */
+
+LSM_HOOK(void, LSM_RET_VOID, initramfs_populated, void)
diff --git a/include/linux/security.h b/include/linux/security.h
index 41a8f667bdfa..14fff542f2e3 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -2255,4 +2255,12 @@ static inline int security_uring_cmd(struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_IO_URING */
 
+#ifdef CONFIG_SECURITY
+extern void security_initramfs_populated(void);
+#else
+static inline void security_initramfs_populated(void)
+{
+}
+#endif /* CONFIG_SECURITY */
+
 #endif /* ! __LINUX_SECURITY_H */
diff --git a/init/initramfs.c b/init/initramfs.c
index a298a3854a80..feedb47d0f55 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -17,6 +17,7 @@
 #include <linux/namei.h>
 #include <linux/init_syscalls.h>
 #include <linux/umh.h>
+#include <linux/security.h>
 
 #include "do_mounts.h"
 
@@ -719,6 +720,8 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
 #endif
 	}
 
+	security_initramfs_populated();
+
 done:
 	/*
 	 * If the initrd region is overlapped with crashkernel reserved region,
diff --git a/security/security.c b/security/security.c
index 820e0d437452..0db5a6b32aab 100644
--- a/security/security.c
+++ b/security/security.c
@@ -5675,3 +5675,13 @@ int security_uring_cmd(struct io_uring_cmd *ioucmd)
 	return call_int_hook(uring_cmd, ioucmd);
 }
 #endif /* CONFIG_IO_URING */
+
+/**
+ * security_initramfs_populated() - Notify LSMs that initramfs has been loaded
+ *
+ * Tells the LSMs the initramfs has been unpacked into the rootfs.
+ */
+void security_initramfs_populated(void)
+{
+	call_void_hook(initramfs_populated);
+}
-- 
2.44.0


^ permalink raw reply related	[relevance 67%]

* [PATCH v17 06/21] ipe: introduce 'boot_verified' as a trust provider
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (4 preceding siblings ...)
  2024-04-13  0:55 67% ` [PATCH v17 05/21] initramfs|security: Add a security hook to do_populate_rootfs() Fan Wu
@ 2024-04-13  0:55 47% ` Fan Wu
  2024-04-13  0:55 69% ` [PATCH v17 07/21] security: add new securityfs delete function Fan Wu
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu, Deven Bowers

IPE is designed to provide system level trust guarantees. This usually
implies that trust starts from bootup with a hardware root of trust,
which validates the bootloader. After this, the bootloader verifies
the kernel and the initramfs.

As there is currently no supported integrity method for initramfs, and
it is typically already verified by the bootloader, this patch introduces
a new IPE property `boot_verified`, which allows the author of an IPE
policy to indicate trust for files from the initramfs.

The implementation of this feature utilizes the newly added
`initramfs_populated` hook. This hook marks the superblock of the rootfs
after the initramfs has been unpacked into it.

Before mounting the real rootfs on top of the initramfs, the initramfs
script will recursively remove all files and directories on the
initramfs. This is typically implemented by using switch_root(8)
(https://man7.org/linux/man-pages/man8/switch_root.8.html).
Therefore the initramfs will be empty and not accessible after the real
rootfs takes over. It is advised to switch to a different policy
that doesn't rely on the `boot_verified` property after this point.
This ensures that the trust policies remain relevant and effective
throughout the system's operation.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
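How the `boot_verified` property resolves against a marked superblock can
be sketched in userspace as follows. This is a simplified model under
stated assumptions: struct superblock, struct file, struct eval_ctx and the
PROP_* constants are hypothetical stand-ins whose names loosely mirror the
patch, and they are not the kernel types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; names mirror the patch but are not kernel types. */
struct superblock {
	bool initramfs;	/* set once the initramfs has been unpacked into it */
};

struct file {
	struct superblock *sb;
};

enum prop_type {
	PROP_BOOT_VERIFIED_FALSE,
	PROP_BOOT_VERIFIED_TRUE,
};

struct eval_ctx {
	const struct file *file;
	bool initramfs;
};

/* Copy the superblock flag into the context; no file means not verified. */
static void build_eval_ctx(struct eval_ctx *ctx, const struct file *file)
{
	assert(ctx);
	ctx->file = file;
	ctx->initramfs = file ? file->sb->initramfs : false;
}

/* A property matches when the context flag agrees with the requested value. */
static bool evaluate_property(const struct eval_ctx *ctx, enum prop_type type)
{
	switch (type) {
	case PROP_BOOT_VERIFIED_FALSE:
		return !ctx->initramfs;
	case PROP_BOOT_VERIFIED_TRUE:
		return ctx->initramfs;
	default:
		return false;
	}
}
```

In this model a file on the marked rootfs satisfies `boot_verified=TRUE`,
while a file on any other superblock, or an event with no file at all,
satisfies only `boot_verified=FALSE`.
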
v2:
  + No changes

v3:
  + Remove useless caching system
  + Move ipe_load_properties to this match
  + Minor changes from checkpatch --strict warnings

v4:
  + Remove comments from headers that were missed previously.
  + Grammatical corrections.

v5:
  + No significant changes

v6:
  + No changes

v7:
  + Reword and refactor patch 04/12 to [09/16], based on changes in
    the underlying system.
  + Add common audit function for boolean values
  + Use common audit function as implementation.

v8:
  + No changes

v9:
  + No changes

v10:
  + Replace struct file with struct super_block

v11:
  + Fix code style issues

v12:
  + Switch to use unpack_initramfs hook and security blob

v13:
  + Update the hook name
  + Rename the security blob field to initramfs
  + Remove the dependency on CONFIG_BLK_DEV_INITRD

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Fix code and documentation style issues
---
 security/ipe/eval.c          | 41 +++++++++++++++++++++++++++++++++---
 security/ipe/eval.h          |  5 +++++
 security/ipe/hooks.c         |  9 ++++++++
 security/ipe/hooks.h         |  2 ++
 security/ipe/ipe.c           |  8 +++++++
 security/ipe/ipe.h           |  1 +
 security/ipe/policy.h        |  2 ++
 security/ipe/policy_parser.c | 39 +++++++++++++++++++++++++++++++---
 8 files changed, 101 insertions(+), 6 deletions(-)

diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index cc3b3f6583ad..28b3bded06c2 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -16,6 +16,18 @@
 
 struct ipe_policy __rcu *ipe_active_policy;
 
+#define FILE_SUPERBLOCK(f) ((f)->f_path.mnt->mnt_sb)
+
+/**
+ * build_ipe_sb_ctx() - Build initramfs field of an ipe evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @file: Supplies the file struct of the file triggered IPE event.
+ */
+static void build_ipe_sb_ctx(struct ipe_eval_ctx *ctx, const struct file *const file)
+{
+	ctx->initramfs = ipe_sb(FILE_SUPERBLOCK(file))->initramfs;
+}
+
 /**
  * ipe_build_eval_ctx() - Build an ipe evaluation context.
  * @ctx: Supplies a pointer to the context to be populated.
@@ -28,6 +40,22 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 {
 	ctx->file = file;
 	ctx->op = op;
+
+	if (file)
+		build_ipe_sb_ctx(ctx, file);
+}
+
+/**
+ * evaluate_boot_verified() - Evaluate @ctx for the boot verified property.
+ * @ctx: Supplies a pointer to the context being evaluated.
+ *
+ * Return:
+ * * %true	- The current @ctx is boot verified
+ * * %false	- The current @ctx is not boot verified
+ */
+static bool evaluate_boot_verified(const struct ipe_eval_ctx *const ctx)
+{
+	return ctx->initramfs;
 }
 
 /**
@@ -35,8 +63,8 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
  * @ctx: Supplies a pointer to the context to be evaluated.
  * @p: Supplies a pointer to the property to be evaluated.
  *
- * This is a placeholder. The actual function will be introduced in the
- * latter commits.
+ * This function determines whether the specified @ctx
+ * matches the conditions defined by a rule property @p.
  *
  * Return:
  * * %true	- The current @ctx match the @p
@@ -45,7 +73,14 @@ void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
 static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
 			      struct ipe_prop *p)
 {
-	return false;
+	switch (p->type) {
+	case IPE_PROP_BOOT_VERIFIED_FALSE:
+		return !evaluate_boot_verified(ctx);
+	case IPE_PROP_BOOT_VERIFIED_TRUE:
+		return evaluate_boot_verified(ctx);
+	default:
+		return false;
+	}
 }
 
 /**
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index 00ed8ceca10e..0fa6492354dd 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -15,10 +15,15 @@
 
 extern struct ipe_policy __rcu *ipe_active_policy;
 
+struct ipe_superblock {
+	bool initramfs;
+};
+
 struct ipe_eval_ctx {
 	enum ipe_op_type op;
 
 	const struct file *file;
+	bool initramfs;
 };
 
 void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
index f2aaa749dd7b..76370919aac0 100644
--- a/security/ipe/hooks.c
+++ b/security/ipe/hooks.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/fs.h>
+#include <linux/fs_struct.h>
 #include <linux/types.h>
 #include <linux/binfmts.h>
 #include <linux/mman.h>
@@ -182,3 +183,11 @@ int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents)
 	ipe_build_eval_ctx(&ctx, NULL, op);
 	return ipe_evaluate_event(&ctx);
 }
+
+/**
+ * ipe_unpack_initramfs() - Mark the current rootfs as initramfs.
+ */
+void ipe_unpack_initramfs(void)
+{
+	ipe_sb(current->fs->root.mnt->mnt_sb)->initramfs = true;
+}
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
index c22c3336d27c..4de5fabebd54 100644
--- a/security/ipe/hooks.h
+++ b/security/ipe/hooks.h
@@ -22,4 +22,6 @@ int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
 
 int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents);
 
+void ipe_unpack_initramfs(void);
+
 #endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 729334812636..28555eadb7f3 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -5,9 +5,11 @@
 #include <uapi/linux/lsm.h>
 
 #include "ipe.h"
+#include "eval.h"
 #include "hooks.h"
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
+	.lbs_superblock = sizeof(struct ipe_superblock),
 };
 
 static const struct lsm_id ipe_lsmid = {
@@ -15,12 +17,18 @@ static const struct lsm_id ipe_lsmid = {
 	.id = LSM_ID_IPE,
 };
 
+struct ipe_superblock *ipe_sb(const struct super_block *sb)
+{
+	return sb->s_security + ipe_blobs.lbs_superblock;
+}
+
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
 	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
 	LSM_HOOK_INIT(file_mprotect, ipe_file_mprotect),
 	LSM_HOOK_INIT(kernel_read_file, ipe_kernel_read_file),
 	LSM_HOOK_INIT(kernel_load_data, ipe_kernel_load_data),
+	LSM_HOOK_INIT(initramfs_populated, ipe_unpack_initramfs),
 };
 
 /**
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index adc3c45e9f53..7f1c818193a0 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -12,5 +12,6 @@
 #define pr_fmt(fmt) "ipe: " fmt
 
 #include <linux/lsm_hooks.h>
+struct ipe_superblock *ipe_sb(const struct super_block *sb);
 
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 8292ffaaff12..69ca8cdecd64 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -30,6 +30,8 @@ enum ipe_action_type {
 #define IPE_ACTION_INVALID __IPE_ACTION_MAX
 
 enum ipe_prop_type {
+	IPE_PROP_BOOT_VERIFIED_FALSE,
+	IPE_PROP_BOOT_VERIFIED_TRUE,
 	__IPE_PROP_MAX
 };
 
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
index 32064262348a..84cc688be3a2 100644
--- a/security/ipe/policy_parser.c
+++ b/security/ipe/policy_parser.c
@@ -270,13 +270,19 @@ static enum ipe_action_type parse_action(char *t)
 	return match_token(t, action_tokens, args);
 }
 
+static const match_table_t property_tokens = {
+	{IPE_PROP_BOOT_VERIFIED_FALSE,	"boot_verified=FALSE"},
+	{IPE_PROP_BOOT_VERIFIED_TRUE,	"boot_verified=TRUE"},
+	{IPE_PROP_INVALID,		NULL}
+};
+
 /**
  * parse_property() - Parse a rule property given a token string.
  * @t: Supplies the token string to be parsed.
  * @r: Supplies the ipe_rule the parsed property will be associated with.
  *
- * This is a placeholder. The actual function will be introduced in the
- * latter commits.
+ * This function parses and associates a property with an IPE rule based
+ * on a token string.
  *
  * Return:
  * * %0		- Success
@@ -285,7 +291,34 @@ static enum ipe_action_type parse_action(char *t)
  */
 static int parse_property(char *t, struct ipe_rule *r)
 {
-	return -EBADMSG;
+	substring_t args[MAX_OPT_ARGS];
+	struct ipe_prop *p = NULL;
+	int rc = 0;
+	int token;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return -ENOMEM;
+
+	token = match_token(t, property_tokens, args);
+
+	switch (token) {
+	case IPE_PROP_BOOT_VERIFIED_FALSE:
+	case IPE_PROP_BOOT_VERIFIED_TRUE:
+		p->type = token;
+		break;
+	default:
+		rc = -EBADMSG;
+		break;
+	}
+	if (rc)
+		goto err;
+	list_add_tail(&p->next, &r->props);
+
+	return rc;
+err:
+	kfree(p);
+	return rc;
 }
 
 /**
-- 
2.44.0
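
The property handling added in the patch above can be modeled outside the kernel. The following is a minimal userspace sketch, not the kernel code: `parse_prop()`, `eval_prop()`, `struct eval_ctx` and the `PROP_*` names are hypothetical stand-ins for `parse_property()`, `evaluate_property()`, `struct ipe_eval_ctx` and the `IPE_PROP_*` values, and plain `strcmp()` stands in for the kernel's `match_token()`.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of enum ipe_prop_type from the patch. */
enum prop_type {
	PROP_BOOT_VERIFIED_FALSE,
	PROP_BOOT_VERIFIED_TRUE,
	PROP_INVALID
};

/* Hypothetical mirror of the one ipe_eval_ctx field used here. */
struct eval_ctx {
	bool initramfs;
};

/* Map a policy token to a property type; unknown tokens are invalid. */
static enum prop_type parse_prop(const char *token)
{
	if (strcmp(token, "boot_verified=FALSE") == 0)
		return PROP_BOOT_VERIFIED_FALSE;
	if (strcmp(token, "boot_verified=TRUE") == 0)
		return PROP_BOOT_VERIFIED_TRUE;
	return PROP_INVALID;
}

/* Return true when the context satisfies the property. */
static bool eval_prop(const struct eval_ctx *ctx, enum prop_type type)
{
	switch (type) {
	case PROP_BOOT_VERIFIED_FALSE:
		return !ctx->initramfs;
	case PROP_BOOT_VERIFIED_TRUE:
		return ctx->initramfs;
	default:
		/* Unknown properties never match, as in the patch. */
		return false;
	}
}
```

As in the patch, the TRUE and FALSE variants are separate property types rather than one property with an argument, so evaluation is a plain switch with no value parsing at evaluation time.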



* [PATCH v17 08/21] ipe: add userspace interface
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (6 preceding siblings ...)
  2024-04-13  0:55 69% ` [PATCH v17 07/21] security: add new securityfs delete function Fan Wu
@ 2024-04-13  0:55 28% ` Fan Wu
  2024-04-13  0:55 26% ` [PATCH v17 09/21] uapi|audit|ipe: add ipe auditing support Fan Wu
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

As is typical with LSMs, IPE uses securityfs as its interface with
userspace. For a complete list of the interfaces and their respective
inputs/outputs, please see the documentation under
admin-guide/LSM/ipe.rst.
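
For illustration only, a userspace caller might drive these nodes roughly as follows. This is a sketch under assumptions: `ipe_write_node()` is a hypothetical helper that is not part of this series, and the mount point `/sys/kernel/security` is the conventional securityfs location rather than something this patch defines. The write handlers in this patch parse each write() buffer as one complete message, so the helper issues a single write(2) and treats a short write as an error.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand one complete buffer to a securityfs-style
 * node with a single write(2) call.
 * Returns 0 on success or a negative errno value.
 */
static int ipe_write_node(const char *path, const void *buf, size_t len)
{
	ssize_t n;
	int fd, err;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, buf, len);
	err = (n < 0) ? -errno : 0;
	close(fd);

	if (err)
		return err;

	/* A short write would have handed over a truncated message. */
	return ((size_t)n == len) ? 0 : -EIO;
}
```

Assuming securityfs is mounted at /sys/kernel/security, deploying a policy would be `ipe_write_node("/sys/kernel/security/ipe/new_policy", blob, blob_len)`, where `blob` is the PKCS#7 message that new_policy() passes on to ipe_new_policy(). Activating it would be a write of "1" to `ipe/policies/$name/active`. Both writes require CAP_MAC_ADMIN.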

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move policy load and activation audit event to 03/12
  + Fix a potential panic when a policy failed to load.
  + use pr_warn for a failure to parse instead of an
    audit record
  + Remove comments from headers
  + Add lockdep assertions to ipe_update_active_policy and
    ipe_activate_policy
  + Fix up warnings with checkpatch --strict
  + Use file_ns_capable for CAP_MAC_ADMIN for securityfs
    nodes.
  + Use memdup_user instead of kzalloc+simple_write_to_buffer.
  + Remove strict_parse command line parameter, as it is added
    by the sysctl command line.
  + Prefix extern variables with ipe_

v4:
  + Remove securityfs to reverse-dependency
  + Add SHA1 reverse dependency.
  + Add versioning scheme for IPE properties, and associated
    interface to query the versioning scheme.
  + Cause a parser to always return an error on unknown syntax.
  + Remove strict_parse option
  + Change active_policy interface from sysctl, to securityfs,
    and change scheme.

v5:
  + Cause an error if a default action is not defined for each
    operation.
  + Minor function renames

v6:
  + No changes

v7:
  + Propagating changes to support the new ipe_context structure in the
    evaluation loop.

  + Further split the parser and userspace interface changes into
    separate commits.

  + "raw" was renamed to "pkcs7" and made read only
  + "raw"'s write functionality (update a policy) moved to "update"
  + introduced "version", "policy_name" nodes.
  + "content" renamed to "policy"
  + changes to allow the compiled-in policy to be treated
    identical to deployed-after-the-fact policies.

v8:
  + Prevent securityfs initialization if the LSM is disabled

v9:
  + Switch to securityfs_recursive_remove for policy folder deletion

v10:
  + Simplify and correct concurrency
  + Fix typos

v11:
  + Correct code comments

v12:
  + Correct locking and remove redundant code

v13:
  + Move the free of old policy into the ipe_update_policy function

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
---
 security/ipe/Makefile    |   2 +
 security/ipe/fs.c        | 105 +++++++++
 security/ipe/fs.h        |  16 ++
 security/ipe/ipe.c       |   3 +
 security/ipe/ipe.h       |   2 +
 security/ipe/policy.c    | 121 ++++++++++
 security/ipe/policy.h    |   7 +
 security/ipe/policy_fs.c | 470 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 726 insertions(+)
 create mode 100644 security/ipe/fs.c
 create mode 100644 security/ipe/fs.h
 create mode 100644 security/ipe/policy_fs.c

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index e1c27e974c5c..b97f8c10fe01 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -8,6 +8,8 @@
 obj-$(CONFIG_SECURITY_IPE) += \
 	eval.o \
 	hooks.o \
+	fs.o \
 	ipe.o \
 	policy.o \
+	policy_fs.o \
 	policy_parser.o \
diff --git a/security/ipe/fs.c b/security/ipe/fs.c
new file mode 100644
index 000000000000..49484c8feead
--- /dev/null
+++ b/security/ipe/fs.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/dcache.h>
+#include <linux/security.h>
+
+#include "ipe.h"
+#include "fs.h"
+#include "policy.h"
+
+static struct dentry *np __ro_after_init;
+static struct dentry *root __ro_after_init;
+struct dentry *policy_root __ro_after_init;
+
+/**
+ * new_policy() - Write handler for the securityfs node, "ipe/new_policy".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ * * %-ENOMEM			- Out of memory (OOM)
+ * * %-EBADMSG			- Policy is invalid
+ * * %-ERANGE			- Policy version number overflow
+ * * %-EINVAL			- Policy version parsing error
+ * * %-EEXIST			- Same name policy already deployed
+ */
+static ssize_t new_policy(struct file *f, const char __user *data,
+			  size_t len, loff_t *offset)
+{
+	struct ipe_policy *p = NULL;
+	char *copy = NULL;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	copy = memdup_user_nul(data, len);
+	if (IS_ERR(copy))
+		return PTR_ERR(copy);
+
+	p = ipe_new_policy(NULL, 0, copy, len);
+	if (IS_ERR(p)) {
+		rc = PTR_ERR(p);
+		goto out;
+	}
+
+	rc = ipe_new_policyfs_node(p);
+
+out:
+	if (rc < 0)
+		ipe_free_policy(p);
+	kfree(copy);
+	return (rc < 0) ? rc : len;
+}
+
+static const struct file_operations np_fops = {
+	.write = new_policy,
+};
+
+/**
+ * ipe_init_securityfs() - Initialize IPE's securityfs tree at fsinit.
+ *
+ * Return: %0 on success. If an error occurs, the function will return
+ * the -errno.
+ */
+static int __init ipe_init_securityfs(void)
+{
+	int rc = 0;
+
+	if (!ipe_enabled)
+		return -EOPNOTSUPP;
+
+	root = securityfs_create_dir("ipe", NULL);
+	if (IS_ERR(root)) {
+		rc = PTR_ERR(root);
+		goto err;
+	}
+
+	policy_root = securityfs_create_dir("policies", root);
+	if (IS_ERR(policy_root)) {
+		rc = PTR_ERR(policy_root);
+		goto err;
+	}
+
+	np = securityfs_create_file("new_policy", 0200, root, NULL, &np_fops);
+	if (IS_ERR(np)) {
+		rc = PTR_ERR(np);
+		goto err;
+	}
+
+	return 0;
+err:
+	securityfs_remove(np);
+	securityfs_remove(policy_root);
+	securityfs_remove(root);
+	return rc;
+}
+
+fs_initcall(ipe_init_securityfs);
diff --git a/security/ipe/fs.h b/security/ipe/fs.h
new file mode 100644
index 000000000000..0141ae8e86ec
--- /dev/null
+++ b/security/ipe/fs.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_FS_H
+#define _IPE_FS_H
+
+#include "policy.h"
+
+extern struct dentry *policy_root __ro_after_init;
+
+int ipe_new_policyfs_node(struct ipe_policy *p);
+void ipe_del_policyfs_node(struct ipe_policy *p);
+
+#endif /* _IPE_FS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 28555eadb7f3..53f2196b9bcc 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -8,6 +8,8 @@
 #include "eval.h"
 #include "hooks.h"
 
+bool ipe_enabled;
+
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 	.lbs_superblock = sizeof(struct ipe_superblock),
 };
@@ -45,6 +47,7 @@ static struct security_hook_list ipe_hooks[] __ro_after_init = {
 static int __init ipe_init(void)
 {
 	security_add_hooks(ipe_hooks, ARRAY_SIZE(ipe_hooks), &ipe_lsmid);
+	ipe_enabled = true;
 
 	return 0;
 }
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
index 7f1c818193a0..4aa18d1d0525 100644
--- a/security/ipe/ipe.h
+++ b/security/ipe/ipe.h
@@ -14,4 +14,6 @@
 #include <linux/lsm_hooks.h>
 struct ipe_superblock *ipe_sb(const struct super_block *sb);
 
+extern bool ipe_enabled;
+
 #endif /* _IPE_H */
diff --git a/security/ipe/policy.c b/security/ipe/policy.c
index dd7b5b79903a..112913f83c6d 100644
--- a/security/ipe/policy.c
+++ b/security/ipe/policy.c
@@ -7,9 +7,36 @@
 #include <linux/verification.h>
 
 #include "ipe.h"
+#include "eval.h"
+#include "fs.h"
 #include "policy.h"
 #include "policy_parser.h"
 
+/* lock for synchronizing writers across ipe policy */
+DEFINE_MUTEX(ipe_policy_lock);
+
+/**
+ * ver_to_u64() - Convert an internal ipe_policy_version to a u64.
+ * @p: Policy to extract the version from.
+ *
+ * Bits (LSB is index 0):
+ *	[48,32] -> Major
+ *	[32,16] -> Minor
+ *	[16, 0] -> Revision
+ *
+ * Return: u64 version of the embedded version structure.
+ */
+static inline u64 ver_to_u64(const struct ipe_policy *const p)
+{
+	u64 r;
+
+	r = (((u64)p->parsed->version.major) << 32)
+	  | (((u64)p->parsed->version.minor) << 16)
+	  | ((u64)(p->parsed->version.rev));
+
+	return r;
+}
+
 /**
  * ipe_free_policy() - Deallocate a given IPE policy.
  * @p: Supplies the policy to free.
@@ -21,6 +48,7 @@ void ipe_free_policy(struct ipe_policy *p)
 	if (IS_ERR_OR_NULL(p))
 		return;
 
+	ipe_del_policyfs_node(p);
 	ipe_free_parsed_policy(p->parsed);
 	/*
 	 * p->text is allocated only when p->pkcs7 is not NULL
@@ -43,6 +71,66 @@ static int set_pkcs7_data(void *ctx, const void *data, size_t len,
 	return 0;
 }
 
+/**
+ * ipe_update_policy() - parse a new policy and replace old with it.
+ * @root: Supplies a pointer to the securityfs inode that stores the policy.
+ * @text: Supplies a pointer to the plain text policy.
+ * @textlen: Supplies the length of @text.
+ * @pkcs7: Supplies a pointer to a buffer containing a pkcs7 message.
+ * @pkcs7len: Supplies the length of @pkcs7.
+ *
+ * @text/@textlen is mutually exclusive with @pkcs7/@pkcs7len - see
+ * ipe_new_policy.
+ *
+ * Context: Requires root->i_rwsem to be held.
+ * Return: %0 on success. If an error occurs, the function will return
+ * the -errno.
+ */
+int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
+		      const char *pkcs7, size_t pkcs7len)
+{
+	struct ipe_policy *old, *ap, *new = NULL;
+	int rc = 0;
+
+	old = (struct ipe_policy *)root->i_private;
+	if (!old)
+		return -ENOENT;
+
+	new = ipe_new_policy(text, textlen, pkcs7, pkcs7len);
+	if (IS_ERR(new))
+		return PTR_ERR(new);
+
+	if (strcmp(new->parsed->name, old->parsed->name)) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	if (ver_to_u64(old) > ver_to_u64(new)) {
+		rc = -EINVAL;
+		goto err;
+	}
+
+	root->i_private = new;
+	swap(new->policyfs, old->policyfs);
+
+	mutex_lock(&ipe_policy_lock);
+	ap = rcu_dereference_protected(ipe_active_policy,
+				       lockdep_is_held(&ipe_policy_lock));
+	if (old == ap) {
+		rcu_assign_pointer(ipe_active_policy, new);
+		mutex_unlock(&ipe_policy_lock);
+		synchronize_rcu();
+	} else {
+		mutex_unlock(&ipe_policy_lock);
+	}
+	ipe_free_policy(old);
+
+	return 0;
+err:
+	ipe_free_policy(new);
+	return rc;
+}
+
 /**
  * ipe_new_policy() - Allocate and parse an ipe_policy structure.
  *
@@ -101,3 +189,36 @@ struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
 	ipe_free_policy(new);
 	return ERR_PTR(rc);
 }
+
+/**
+ * ipe_set_active_pol() - Make @p the active policy.
+ * @p: Supplies a pointer to the policy to make active.
+ *
+ * Context: Requires the i_rwsem of the inode whose i_private holds @p.
+ * Return:
+ * * %0	- Success
+ * * %-EINVAL	- New active policy version is invalid
+ */
+int ipe_set_active_pol(const struct ipe_policy *p)
+{
+	struct ipe_policy *ap = NULL;
+
+	mutex_lock(&ipe_policy_lock);
+
+	ap = rcu_dereference_protected(ipe_active_policy,
+				       lockdep_is_held(&ipe_policy_lock));
+	if (ap == p) {
+		mutex_unlock(&ipe_policy_lock);
+		return 0;
+	}
+	if (ap && ver_to_u64(ap) > ver_to_u64(p)) {
+		mutex_unlock(&ipe_policy_lock);
+		return -EINVAL;
+	}
+
+	rcu_assign_pointer(ipe_active_policy, p);
+	mutex_unlock(&ipe_policy_lock);
+	synchronize_rcu();
+
+	return 0;
+}
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
index 69ca8cdecd64..ffd60cc7fda6 100644
--- a/security/ipe/policy.h
+++ b/security/ipe/policy.h
@@ -7,6 +7,7 @@
 
 #include <linux/list.h>
 #include <linux/types.h>
+#include <linux/fs.h>
 
 enum ipe_op_type {
 	IPE_OP_EXEC = 0,
@@ -76,10 +77,16 @@ struct ipe_policy {
 	size_t textlen;
 
 	struct ipe_parsed_policy *parsed;
+
+	struct dentry *policyfs;
 };
 
 struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
 				  const char *pkcs7, size_t pkcs7len);
 void ipe_free_policy(struct ipe_policy *pol);
+int ipe_update_policy(struct inode *root, const char *text, size_t textlen,
+		      const char *pkcs7, size_t pkcs7len);
+int ipe_set_active_pol(const struct ipe_policy *p);
+extern struct mutex ipe_policy_lock;
 
 #endif /* _IPE_POLICY_H */
diff --git a/security/ipe/policy_fs.c b/security/ipe/policy_fs.c
new file mode 100644
index 000000000000..c19c06627efb
--- /dev/null
+++ b/security/ipe/policy_fs.c
@@ -0,0 +1,470 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/types.h>
+#include <linux/dcache.h>
+#include <linux/security.h>
+
+#include "ipe.h"
+#include "policy.h"
+#include "eval.h"
+#include "fs.h"
+
+#define MAX_VERSION_SIZE ARRAY_SIZE("65535.65535.65535")
+
+/**
+ * ipefs_file - defines a file in securityfs.
+ */
+struct ipefs_file {
+	const char *name;
+	umode_t access;
+	const struct file_operations *fops;
+};
+
+/**
+ * read_pkcs7() - Read handler for "ipe/policies/$name/pkcs7".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the pkcs7 blob representing the policy
+ * on success. If the policy is unsigned (like the boot policy), this
+ * will return -ENOENT.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted or is unsigned
+ */
+static ssize_t read_pkcs7(struct file *f, char __user *data,
+			  size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	if (!p->pkcs7) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = simple_read_from_buffer(data, len, offset, p->pkcs7, p->pkcs7len);
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * read_policy() - Read handler for "ipe/policies/$name/policy".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the plain-text version of the policy
+ * on success.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t read_policy(struct file *f, char __user *data,
+			   size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = simple_read_from_buffer(data, len, offset, p->text, p->textlen);
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * read_name() - Read handler for "ipe/policies/$name/name".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the policy_name attribute on success.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t read_name(struct file *f, char __user *data,
+			 size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = simple_read_from_buffer(data, len, offset, p->parsed->name,
+				     strlen(p->parsed->name));
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * read_version() - Read handler for "ipe/policies/$name/version".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with the version string on success.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t read_version(struct file *f, char __user *data,
+			    size_t len, loff_t *offset)
+{
+	char buffer[MAX_VERSION_SIZE] = { 0 };
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	size_t strsize = 0;
+	ssize_t rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	strsize = scnprintf(buffer, ARRAY_SIZE(buffer), "%hu.%hu.%hu",
+			    p->parsed->version.major, p->parsed->version.minor,
+			    p->parsed->version.rev);
+
+	rc = simple_read_from_buffer(data, len, offset, buffer, strsize);
+
+out:
+	inode_unlock_shared(root);
+
+	return rc;
+}
+
+/**
+ * setactive() - Write handler for "ipe/policies/$name/active".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission
+ * * %-EINVAL			- Invalid input
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t setactive(struct file *f, const char __user *data,
+			 size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	bool value = false;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	rc = kstrtobool_from_user(data, len, &value);
+	if (rc)
+		return rc;
+
+	if (!value)
+		return -EINVAL;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+	inode_lock(root);
+
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		rc = -ENOENT;
+		goto out;
+	}
+
+	rc = ipe_set_active_pol(p);
+
+out:
+	inode_unlock(root);
+	return (rc < 0) ? rc : len;
+}
+
+/**
+ * getactive() - Read handler for "ipe/policies/$name/active".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the read syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * @data will be populated with 1 or 0 depending on whether the
+ * corresponding policy is active.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t getactive(struct file *f, char __user *data,
+			 size_t len, loff_t *offset)
+{
+	const struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	const char *str;
+	int rc = 0;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+
+	inode_lock_shared(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		inode_unlock_shared(root);
+		return -ENOENT;
+	}
+	inode_unlock_shared(root);
+
+	str = (p == rcu_access_pointer(ipe_active_policy)) ? "1" : "0";
+	rc = simple_read_from_buffer(data, len, offset, str, 1);
+
+	return rc;
+}
+
+/**
+ * update_policy() - Write handler for "ipe/policies/$name/update".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * On success this updates the policy represented by $name,
+ * in-place.
+ *
+ * Return: Length of buffer written on success. If an error occurs,
+ * the function will return the -errno.
+ */
+static ssize_t update_policy(struct file *f, const char __user *data,
+			     size_t len, loff_t *offset)
+{
+	struct inode *root = NULL;
+	char *copy = NULL;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	copy = memdup_user(data, len);
+	if (IS_ERR(copy))
+		return PTR_ERR(copy);
+
+	root = d_inode(f->f_path.dentry->d_parent);
+	inode_lock(root);
+	rc = ipe_update_policy(root, NULL, 0, copy, len);
+	inode_unlock(root);
+
+	kfree(copy);
+	if (rc)
+		return rc;
+
+	return len;
+}
+
+/**
+ * delete_policy() - Write handler for "ipe/policies/$name/delete".
+ * @f: Supplies a file structure representing the securityfs node.
+ * @data: Supplies a buffer passed to the write syscall.
+ * @len: Supplies the length of @data.
+ * @offset: unused.
+ *
+ * On success this deletes the policy represented by $name.
+ *
+ * Return:
+ * * Length of buffer written	- Success
+ * * %-EPERM			- Insufficient permission/deleting active policy
+ * * %-EINVAL			- Invalid input
+ * * %-ENOENT			- Policy initializing/deleted
+ */
+static ssize_t delete_policy(struct file *f, const char __user *data,
+			     size_t len, loff_t *offset)
+{
+	struct ipe_policy *ap = NULL;
+	struct ipe_policy *p = NULL;
+	struct inode *root = NULL;
+	bool value = false;
+	int rc = 0;
+
+	if (!file_ns_capable(f, &init_user_ns, CAP_MAC_ADMIN))
+		return -EPERM;
+
+	rc = kstrtobool_from_user(data, len, &value);
+	if (rc)
+		return rc;
+
+	if (!value)
+		return -EINVAL;
+
+	root = d_inode(f->f_path.dentry->d_parent);
+	inode_lock(root);
+	p = (struct ipe_policy *)root->i_private;
+	if (!p) {
+		inode_unlock(root);
+		return -ENOENT;
+	}
+
+	mutex_lock(&ipe_policy_lock);
+	ap = rcu_dereference_protected(ipe_active_policy,
+				       lockdep_is_held(&ipe_policy_lock));
+	if (p == ap) {
+		mutex_unlock(&ipe_policy_lock);
+		inode_unlock(root);
+		return -EPERM;
+	}
+	mutex_unlock(&ipe_policy_lock);
+
+	root->i_private = NULL;
+	inode_unlock(root);
+
+	ipe_free_policy(p);
+	return len;
+}
+
+static const struct file_operations content_fops = {
+	.read = read_policy,
+};
+
+static const struct file_operations pkcs7_fops = {
+	.read = read_pkcs7,
+};
+
+static const struct file_operations name_fops = {
+	.read = read_name,
+};
+
+static const struct file_operations ver_fops = {
+	.read = read_version,
+};
+
+static const struct file_operations active_fops = {
+	.write = setactive,
+	.read = getactive,
+};
+
+static const struct file_operations update_fops = {
+	.write = update_policy,
+};
+
+static const struct file_operations delete_fops = {
+	.write = delete_policy,
+};
+
+/**
+ * policy_subdir - files under a policy subdirectory
+ */
+static const struct ipefs_file policy_subdir[] = {
+	{ "pkcs7", 0444, &pkcs7_fops },
+	{ "policy", 0444, &content_fops },
+	{ "name", 0444, &name_fops },
+	{ "version", 0444, &ver_fops },
+	{ "active", 0600, &active_fops },
+	{ "update", 0200, &update_fops },
+	{ "delete", 0200, &delete_fops },
+};
+
+/**
+ * ipe_del_policyfs_node() - Delete a securityfs entry for @p.
+ * @p: Supplies a pointer to the policy to delete a securityfs entry for.
+ */
+void ipe_del_policyfs_node(struct ipe_policy *p)
+{
+	securityfs_recursive_remove(p->policyfs);
+	p->policyfs = NULL;
+}
+
+/**
+ * ipe_new_policyfs_node() - Create a securityfs entry for @p.
+ * @p: Supplies a pointer to the policy to create a securityfs entry for.
+ *
+ * Return: %0 on success. If an error occurs, the function will return
+ * the -errno.
+ */
+int ipe_new_policyfs_node(struct ipe_policy *p)
+{
+	const struct ipefs_file *f = NULL;
+	struct dentry *policyfs = NULL;
+	struct inode *root = NULL;
+	struct dentry *d = NULL;
+	size_t i = 0;
+	int rc = 0;
+
+	if (p->policyfs)
+		return 0;
+
+	policyfs = securityfs_create_dir(p->parsed->name, policy_root);
+	if (IS_ERR(policyfs))
+		return PTR_ERR(policyfs);
+
+	root = d_inode(policyfs);
+
+	for (i = 0; i < ARRAY_SIZE(policy_subdir); ++i) {
+		f = &policy_subdir[i];
+
+		d = securityfs_create_file(f->name, f->access, policyfs,
+					   NULL, f->fops);
+		if (IS_ERR(d)) {
+			rc = PTR_ERR(d);
+			goto err;
+		}
+	}
+
+	inode_lock(root);
+	p->policyfs = policyfs;
+	root->i_private = p;
+	inode_unlock(root);
+
+	return 0;
+err:
+	securityfs_recursive_remove(policyfs);
+	return rc;
+}
-- 
2.44.0
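
The version ordering enforced by ipe_update_policy() and ipe_set_active_pol() in the patch above can be checked with a small userspace model. This is a sketch, not the kernel code: `struct policy_version`, `ver_pack()` and `ver_update_allowed()` are hypothetical names, and the 16-bit field width is an assumption based on the "%hu" format and the "65535.65535.65535" size bound in policy_fs.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the parsed policy version; 16-bit fields assumed. */
struct policy_version {
	uint16_t major;
	uint16_t minor;
	uint16_t rev;
};

/* Pack major/minor/rev so that integer order equals version order. */
static uint64_t ver_pack(struct policy_version v)
{
	return ((uint64_t)v.major << 32)
	     | ((uint64_t)v.minor << 16)
	     | (uint64_t)v.rev;
}

/*
 * Mirrors the check in the patch: the operation is rejected only when
 * the current version is strictly greater than the incoming one, so an
 * equal version is accepted.
 */
static bool ver_update_allowed(struct policy_version cur,
			       struct policy_version next)
{
	return ver_pack(cur) <= ver_pack(next);
}
```

Packing the three fields into one integer lets a single comparison replace a three-level lexicographic compare. With these shifts, major occupies bits 32-47, minor bits 16-31 and revision bits 0-15.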



* [PATCH v17 03/21] ipe: add evaluation loop
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
  2024-04-13  0:55 47% ` [PATCH v17 01/21] security: add ipe lsm Fan Wu
  2024-04-13  0:55 33% ` [PATCH v17 02/21] ipe: add policy parser Fan Wu
@ 2024-04-13  0:55 54% ` Fan Wu
  2024-04-13  0:55 45% ` [PATCH v17 04/21] ipe: add LSM hooks on execution and kernel read Fan Wu
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Introduce a core evaluation function in IPE that will be triggered by
various security hooks (e.g., mmap, bprm_check, kexec). This function
systematically assesses actions against the defined IPE policy, by
iterating over rules specific to the action being taken. This critical
addition enables IPE to enforce its security policies effectively,
ensuring that actions intercepted by these hooks are scrutinized for policy
compliance before they are allowed to proceed.
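
As a rough illustration of the idea, the loop can be modeled in userspace as below. This is a simplified sketch and not the code in eval.c: every type and function name is hypothetical, and the ordering it encodes (first matching rule decides, then the per-operation default, then the global default) is an assumption about the policy semantics rather than something stated in this message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; they only model the idea. */
enum action { ACTION_ALLOW, ACTION_DENY, ACTION_UNSET };

struct ctx {
	bool initramfs;
};

typedef bool (*prop_fn)(const struct ctx *c);

struct rule {
	enum action action;
	const prop_fn *props;	/* every property must match */
	size_t nprops;
};

struct op_table {
	const struct rule *rules;
	size_t nrules;
	enum action default_action;
};

/* Example property: true when the file came from the initramfs. */
static bool is_initramfs(const struct ctx *c)
{
	return c->initramfs;
}

static bool rule_matches(const struct rule *r, const struct ctx *c)
{
	for (size_t i = 0; i < r->nprops; i++)
		if (!r->props[i](c))
			return false;
	return true;
}

/* Returns 0 when the event is allowed and -EACCES when it is denied. */
static int evaluate(const struct op_table *t, enum action global_default,
		    const struct ctx *c)
{
	enum action a = ACTION_UNSET;

	/* Assumed semantics: the first matching rule decides. */
	for (size_t i = 0; i < t->nrules; i++) {
		if (rule_matches(&t->rules[i], c)) {
			a = t->rules[i].action;
			break;
		}
	}
	if (a == ACTION_UNSET)
		a = t->default_action;
	if (a == ACTION_UNSET)
		a = global_default;

	return (a == ACTION_DENY) ? -EACCES : 0;
}
```

In this model a rule with no properties matches every context, and an event that ends with no action set is allowed. Both are choices of the sketch, not claims about IPE.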

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
+ Split evaluation loop, access control hooks, and evaluation loop from policy parser and userspace interface to pass mailing list character limit

v3:
+ Move ipe_load_properties to patch 04.
+ Remove useless 0-initializations
+ Prefix extern variables with ipe_
+ Remove kernel module parameters, as these are exposed through sysctls.
+ Add more prose to the IPE base config option help text.
+ Use GFP_KERNEL for audit_log_start.
+ Remove unnecessary caching system.
+ Remove comments from headers
+ Use rcu_access_pointer for rcu-pointer null check
+ Remove usage of reqprot; use prot only.
+ Move policy load and activation audit event to 03/12

v4:
+ Remove sysctls in favor of securityfs nodes
+ Re-add kernel module parameters, as these are now exposed through securityfs.
+ Refactor property audit loop to a separate function.

v5:
+ fix minor grammatical errors
+ do not group rule by curly-brace in audit record,
  reconstruct the exact rule.

v6:
+ No changes

v7:
+ Further split lsm creation into a separate commit from the evaluation loop and audit system, for easier review.
+ Propagating changes to support the new ipe_context structure in the evaluation loop.

v8:
+ Remove ipe_hook enumeration; hooks can be correlated via syscall record.

v9:
+ Remove ipe_context related code and simplify the evaluation loop.

v10:
+ Split eval part and boot_verified part

v11:
+ Fix code style issues

v12:
+ Correct an rcu_read_unlock usage
+ Add a WARN to unknown op during evaluation

v13:
+ No changes

v14:
+ No changes

v15:
+ No changes

v16:
+ No changes

v17:
+ Add years to license header
+ Fix code and documentation style issues
---
 security/ipe/Makefile |   1 +
 security/ipe/eval.c   | 102 ++++++++++++++++++++++++++++++++++++++++++
 security/ipe/eval.h   |  24 ++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 security/ipe/eval.c
 create mode 100644 security/ipe/eval.h

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 3093de1afd3e..4cc17eb92060 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -6,6 +6,7 @@
 #
 
 obj-$(CONFIG_SECURITY_IPE) += \
+	eval.o \
 	ipe.o \
 	policy.o \
 	policy_parser.o \
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
new file mode 100644
index 000000000000..41331afdef7c
--- /dev/null
+++ b/security/ipe/eval.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/file.h>
+#include <linux/sched.h>
+#include <linux/rcupdate.h>
+
+#include "ipe.h"
+#include "eval.h"
+#include "policy.h"
+
+struct ipe_policy __rcu *ipe_active_policy;
+
+/**
+ * evaluate_property() - Analyze @ctx against a rule property.
+ * @ctx: Supplies a pointer to the context to be evaluated.
+ * @p: Supplies a pointer to the property to be evaluated.
+ *
+ * This is a placeholder. The actual function will be introduced in
+ * later commits.
+ *
+ * Return:
+ * * %true	- @ctx matches @p
+ * * %false	- @ctx doesn't match @p
+ */
+static bool evaluate_property(const struct ipe_eval_ctx *const ctx,
+			      struct ipe_prop *p)
+{
+	return false;
+}
+
+/**
+ * ipe_evaluate_event() - Analyze @ctx against the current active policy.
+ * @ctx: Supplies a pointer to the context to be evaluated.
+ *
+ * This is the loop where all policy evaluation happens against IPE policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- @ctx did not pass evaluation
+ */
+int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx)
+{
+	const struct ipe_op_table *rules = NULL;
+	const struct ipe_rule *rule = NULL;
+	struct ipe_policy *pol = NULL;
+	struct ipe_prop *prop = NULL;
+	enum ipe_action_type action;
+	bool match = false;
+
+	rcu_read_lock();
+
+	pol = rcu_dereference(ipe_active_policy);
+	if (!pol) {
+		rcu_read_unlock();
+		return 0;
+	}
+
+	if (ctx->op == IPE_OP_INVALID) {
+		if (pol->parsed->global_default_action == IPE_ACTION_DENY) {
+			rcu_read_unlock();
+			return -EACCES;
+		}
+		if (pol->parsed->global_default_action == IPE_ACTION_INVALID)
+			WARN(1, "no default rule set for unknown op, ALLOW it");
+		rcu_read_unlock();
+		return 0;
+	}
+
+	rules = &pol->parsed->rules[ctx->op];
+
+	list_for_each_entry(rule, &rules->rules, next) {
+		match = true;
+
+		list_for_each_entry(prop, &rule->props, next) {
+			match = evaluate_property(ctx, prop);
+			if (!match)
+				break;
+		}
+
+		if (match)
+			break;
+	}
+
+	if (match)
+		action = rule->action;
+	else if (rules->default_action != IPE_ACTION_INVALID)
+		action = rules->default_action;
+	else
+		action = pol->parsed->global_default_action;
+
+	rcu_read_unlock();
+	if (action == IPE_ACTION_DENY)
+		return -EACCES;
+
+	return 0;
+}
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
new file mode 100644
index 000000000000..b137f2107852
--- /dev/null
+++ b/security/ipe/eval.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_EVAL_H
+#define _IPE_EVAL_H
+
+#include <linux/file.h>
+#include <linux/types.h>
+
+#include "policy.h"
+
+extern struct ipe_policy __rcu *ipe_active_policy;
+
+struct ipe_eval_ctx {
+	enum ipe_op_type op;
+
+	const struct file *file;
+};
+
+int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx);
+
+#endif /* _IPE_EVAL_H */
-- 
2.44.0


^ permalink raw reply related	[relevance 54%]

* [PATCH v17 04/21] ipe: add LSM hooks on execution and kernel read
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
                   ` (2 preceding siblings ...)
  2024-04-13  0:55 54% ` [PATCH v17 03/21] ipe: add evaluation loop Fan Wu
@ 2024-04-13  0:55 45% ` Fan Wu
  2024-04-13  0:55 67% ` [PATCH v17 05/21] initramfs|security: Add a security hook to do_populate_rootfs() Fan Wu
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

IPE's initial goal is to control both execution and the loading of
kernel modules based on the system's definition of trust. It
accomplishes this by plugging into the security hooks for
bprm_check_security, file_mprotect, mmap_file, kernel_load_data,
and kernel_read_file.
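
As a rough illustration of when the two memory-mapping hooks consult the
policy, the gating they apply can be sketched in userspace as follows. The
function names are hypothetical, and a plain bool stands in for the
VM_EXEC test on the existing mapping; only the PROT_* constants are real.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* mmap: the policy is consulted only when the mapping will be executable. */
static bool mmap_needs_eval(unsigned long prot)
{
	return (prot & PROT_EXEC) != 0;
}

/*
 * mprotect: a mapping that is already executable was evaluated when it
 * became executable, so it is skipped; otherwise the policy is consulted
 * only when PROT_EXEC is being added.
 */
static bool mprotect_needs_eval(bool vma_already_exec, unsigned long prot)
{
	if (vma_already_exec)
		return false;

	return (prot & PROT_EXEC) != 0;
}

/* A few example calls, written as assertions. */
static void check_examples(void)
{
	assert(mmap_needs_eval(PROT_READ | PROT_EXEC));
	assert(!mmap_needs_eval(PROT_READ));
	assert(mprotect_needs_eval(false, PROT_EXEC));
	assert(!mprotect_needs_eval(true, PROT_EXEC));
}
```

In both cases a true result corresponds to building an evaluation context
for the EXEC operation and running the evaluation loop.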

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation, the audit system, the evaluation loop
    and access control hooks into separate commits.

v8:
  + Rename hook functions to follow the lsmname_hook_name convention
  + Remove ipe_hook enumeration, can be derived from correlation with
    syscall audit record.

v9:
  + Minor changes for adapting to the new parser

v10:
  + Remove @reqprot part

v11:
  + Fix code style issues

v12:
  + Correct WARN usages

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
---
 security/ipe/Makefile |   1 +
 security/ipe/eval.c   |  14 ++++
 security/ipe/eval.h   |   5 ++
 security/ipe/hooks.c  | 184 ++++++++++++++++++++++++++++++++++++++++++
 security/ipe/hooks.h  |  25 ++++++
 security/ipe/ipe.c    |   6 ++
 6 files changed, 235 insertions(+)
 create mode 100644 security/ipe/hooks.c
 create mode 100644 security/ipe/hooks.h

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 4cc17eb92060..e1c27e974c5c 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -7,6 +7,7 @@
 
 obj-$(CONFIG_SECURITY_IPE) += \
 	eval.o \
+	hooks.o \
 	ipe.o \
 	policy.o \
 	policy_parser.o \
diff --git a/security/ipe/eval.c b/security/ipe/eval.c
index 41331afdef7c..cc3b3f6583ad 100644
--- a/security/ipe/eval.c
+++ b/security/ipe/eval.c
@@ -16,6 +16,20 @@
 
 struct ipe_policy __rcu *ipe_active_policy;
 
+/**
+ * ipe_build_eval_ctx() - Build an ipe evaluation context.
+ * @ctx: Supplies a pointer to the context to be populated.
+ * @file: Supplies a pointer to the file associated with the evaluation.
+ * @op: Supplies the IPE policy operation associated with the evaluation.
+ */
+void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
+			const struct file *file,
+			enum ipe_op_type op)
+{
+	ctx->file = file;
+	ctx->op = op;
+}
+
 /**
  * evaluate_property() - Analyze @ctx against a rule property.
  * @ctx: Supplies a pointer to the context to be evaluated.
diff --git a/security/ipe/eval.h b/security/ipe/eval.h
index b137f2107852..00ed8ceca10e 100644
--- a/security/ipe/eval.h
+++ b/security/ipe/eval.h
@@ -11,6 +11,8 @@
 
 #include "policy.h"
 
+#define IPE_EVAL_CTX_INIT ((struct ipe_eval_ctx){ 0 })
+
 extern struct ipe_policy __rcu *ipe_active_policy;
 
 struct ipe_eval_ctx {
@@ -19,6 +21,9 @@ struct ipe_eval_ctx {
 	const struct file *file;
 };
 
+void ipe_build_eval_ctx(struct ipe_eval_ctx *ctx,
+			const struct file *file,
+			enum ipe_op_type op);
 int ipe_evaluate_event(const struct ipe_eval_ctx *const ctx);
 
 #endif /* _IPE_EVAL_H */
diff --git a/security/ipe/hooks.c b/security/ipe/hooks.c
new file mode 100644
index 000000000000..f2aaa749dd7b
--- /dev/null
+++ b/security/ipe/hooks.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/binfmts.h>
+#include <linux/mman.h>
+
+#include "ipe.h"
+#include "hooks.h"
+#include "eval.h"
+
+/**
+ * ipe_bprm_check_security() - ipe security hook function for bprm check.
+ * @bprm: Supplies a pointer to a linux_binprm structure to source the file
+ *	  being evaluated.
+ *
+ * This LSM hook is called when a binary is loaded through the exec
+ * family of system calls.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_bprm_check_security(struct linux_binprm *bprm)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+
+	ipe_build_eval_ctx(&ctx, bprm->file, IPE_OP_EXEC);
+	return ipe_evaluate_event(&ctx);
+}
+
+/**
+ * ipe_mmap_file() - ipe security hook function for mmap check.
+ * @f: File being mmap'd. Can be NULL in the case of anonymous memory.
+ * @reqprot: The requested protection on the mmap, passed from usermode.
+ * @prot: The effective protection on the mmap, resolved from reqprot and
+ *	  system configuration.
+ * @flags: Unused.
+ *
+ * This hook is called when a file is loaded through the mmap
+ * family of system calls.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_mmap_file(struct file *f, unsigned long reqprot __always_unused,
+		  unsigned long prot, unsigned long flags)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+
+	if (prot & PROT_EXEC) {
+		ipe_build_eval_ctx(&ctx, f, IPE_OP_EXEC);
+		return ipe_evaluate_event(&ctx);
+	}
+
+	return 0;
+}
+
+/**
+ * ipe_file_mprotect() - ipe security hook function for mprotect check.
+ * @vma: Existing virtual memory area created by mmap or similar.
+ * @reqprot: The requested protection on the mmap, passed from usermode.
+ * @prot: The effective protection on the mmap, resolved from reqprot and
+ *	  system configuration.
+ *
+ * This LSM hook is called when a mmap'd region of memory is changing
+ * its protections via mprotect.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_file_mprotect(struct vm_area_struct *vma,
+		      unsigned long reqprot __always_unused,
+		      unsigned long prot)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+
+	/* Already Executable */
+	if (vma->vm_flags & VM_EXEC)
+		return 0;
+
+	if (prot & PROT_EXEC) {
+		ipe_build_eval_ctx(&ctx, vma->vm_file, IPE_OP_EXEC);
+		return ipe_evaluate_event(&ctx);
+	}
+
+	return 0;
+}
+
+/**
+ * ipe_kernel_read_file() - ipe security hook function for kernel read.
+ * @file: Supplies a pointer to the file structure being read in from disk.
+ * @id: Supplies the enumeration identifying the purpose of the read.
+ * @contents: Unused.
+ *
+ * This LSM hook is called when a file is being read in from disk from
+ * the kernel.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
+			 bool contents)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+	enum ipe_op_type op;
+
+	switch (id) {
+	case READING_FIRMWARE:
+		op = IPE_OP_FIRMWARE;
+		break;
+	case READING_MODULE:
+		op = IPE_OP_KERNEL_MODULE;
+		break;
+	case READING_KEXEC_INITRAMFS:
+		op = IPE_OP_KEXEC_INITRAMFS;
+		break;
+	case READING_KEXEC_IMAGE:
+		op = IPE_OP_KEXEC_IMAGE;
+		break;
+	case READING_POLICY:
+		op = IPE_OP_POLICY;
+		break;
+	case READING_X509_CERTIFICATE:
+		op = IPE_OP_X509;
+		break;
+	default:
+		op = IPE_OP_INVALID;
+		WARN(1, "no rule setup for kernel_read_file enum %d", id);
+	}
+
+	ipe_build_eval_ctx(&ctx, file, op);
+	return ipe_evaluate_event(&ctx);
+}
+
+/**
+ * ipe_kernel_load_data() - ipe security hook function for kernel load data.
+ * @id: Supplies the enumeration identifying the purpose of the read.
+ * @contents: Unused.
+ *
+ * This LSM hook is called when a buffer is being read in from disk.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EACCES	- Did not pass IPE policy
+ */
+int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents)
+{
+	struct ipe_eval_ctx ctx = IPE_EVAL_CTX_INIT;
+	enum ipe_op_type op;
+
+	switch (id) {
+	case LOADING_FIRMWARE:
+		op = IPE_OP_FIRMWARE;
+		break;
+	case LOADING_MODULE:
+		op = IPE_OP_KERNEL_MODULE;
+		break;
+	case LOADING_KEXEC_INITRAMFS:
+		op = IPE_OP_KEXEC_INITRAMFS;
+		break;
+	case LOADING_KEXEC_IMAGE:
+		op = IPE_OP_KEXEC_IMAGE;
+		break;
+	case LOADING_POLICY:
+		op = IPE_OP_POLICY;
+		break;
+	case LOADING_X509_CERTIFICATE:
+		op = IPE_OP_X509;
+		break;
+	default:
+		op = IPE_OP_INVALID;
+		WARN(1, "no rule setup for kernel_load_data enum %d", id);
+	}
+
+	ipe_build_eval_ctx(&ctx, NULL, op);
+	return ipe_evaluate_event(&ctx);
+}
diff --git a/security/ipe/hooks.h b/security/ipe/hooks.h
new file mode 100644
index 000000000000..c22c3336d27c
--- /dev/null
+++ b/security/ipe/hooks.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#ifndef _IPE_HOOKS_H
+#define _IPE_HOOKS_H
+
+#include <linux/fs.h>
+#include <linux/binfmts.h>
+#include <linux/security.h>
+
+int ipe_bprm_check_security(struct linux_binprm *bprm);
+
+int ipe_mmap_file(struct file *f, unsigned long reqprot, unsigned long prot,
+		  unsigned long flags);
+
+int ipe_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+		      unsigned long prot);
+
+int ipe_kernel_read_file(struct file *file, enum kernel_read_file_id id,
+			 bool contents);
+
+int ipe_kernel_load_data(enum kernel_load_data_id id, bool contents);
+
+#endif /* _IPE_HOOKS_H */
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
index 8d4ea372873e..729334812636 100644
--- a/security/ipe/ipe.c
+++ b/security/ipe/ipe.c
@@ -5,6 +5,7 @@
 #include <uapi/linux/lsm.h>
 
 #include "ipe.h"
+#include "hooks.h"
 
 static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
 };
@@ -15,6 +16,11 @@ static const struct lsm_id ipe_lsmid = {
 };
 
 static struct security_hook_list ipe_hooks[] __ro_after_init = {
+	LSM_HOOK_INIT(bprm_check_security, ipe_bprm_check_security),
+	LSM_HOOK_INIT(mmap_file, ipe_mmap_file),
+	LSM_HOOK_INIT(file_mprotect, ipe_file_mprotect),
+	LSM_HOOK_INIT(kernel_read_file, ipe_kernel_read_file),
+	LSM_HOOK_INIT(kernel_load_data, ipe_kernel_load_data),
 };
 
 /**
-- 
2.44.0


^ permalink raw reply related	[relevance 45%]

* [PATCH v17 01/21] security: add ipe lsm
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
@ 2024-04-13  0:55 47% ` Fan Wu
  2024-04-13  0:55 33% ` [PATCH v17 02/21] ipe: add policy parser Fan Wu
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

Integrity Policy Enforcement (IPE) is an LSM that provides an approach
to Mandatory Access Control that is complementary to existing LSMs.

Existing LSMs have centered around the concept that access to a resource
should be controlled by the current user's credentials. IPE's approach
is that access to a resource should be controlled by the system's trust
of the current resource.

The basis of this approach is defining a global policy to specify which
resource can be trusted.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move ipe_load_properties to patch 04.
  + Remove useless 0-initializations
  + Prefix extern variables with ipe_
  + Remove kernel module parameters, as these are
    exposed through sysctls.
  + Add more prose to the IPE base config option
    help text.
  + Use GFP_KERNEL for audit_log_start.
  + Remove unnecessary caching system.
  + Remove comments from headers
  + Use rcu_access_pointer for rcu-pointer null check
  + Remove usage of reqprot; use prot only.
  + Move policy load and activation audit event to 03/12

v4:
  + Remove sysctls in favor of securityfs nodes
  + Re-add kernel module parameters, as these are now
    exposed through securityfs.
  + Refactor property audit loop to a separate function.

v5:
  + fix minor grammatical errors
  + do not group rule by curly-brace in audit record,
    reconstruct the exact rule.

v6:
  + No changes

v7:
  + Further split lsm creation into a separate commit from the
    evaluation loop and audit system, for easier review.

  + Introduce the concept of an ipe_context, a scoped way to
    introduce execution policies, used initially for allowing for
    kunit tests in isolation.

v8:
  + Follow lsmname_hook_name convention for lsm hooks.
  + Move LSM blob accessors to ipe.c and mark LSM blobs as static.

v9:
  + Remove ipe_context for simplification

v10:
  + Add github url

v11:
  + Correct github url
  + Move ipe before bpf

v12:
  + Switch to use lsm_id instead of string for lsm name

v13:
  + No changes

v14:
  + No changes

v15:
  + Add missing code in tools/testing/selftests/lsm/lsm_list_modules_test.c

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
---
 include/uapi/linux/lsm.h                      |  1 +
 security/Kconfig                              | 11 ++---
 security/Makefile                             |  1 +
 security/ipe/Kconfig                          | 17 ++++++++
 security/ipe/Makefile                         |  9 ++++
 security/ipe/ipe.c                            | 42 +++++++++++++++++++
 security/ipe/ipe.h                            | 16 +++++++
 security/security.c                           |  3 +-
 .../selftests/lsm/lsm_list_modules_test.c     |  3 ++
 9 files changed, 97 insertions(+), 6 deletions(-)
 create mode 100644 security/ipe/Kconfig
 create mode 100644 security/ipe/Makefile
 create mode 100644 security/ipe/ipe.c
 create mode 100644 security/ipe/ipe.h

diff --git a/include/uapi/linux/lsm.h b/include/uapi/linux/lsm.h
index 33d8c9f4aa6b..938593dfd5da 100644
--- a/include/uapi/linux/lsm.h
+++ b/include/uapi/linux/lsm.h
@@ -64,6 +64,7 @@ struct lsm_ctx {
 #define LSM_ID_LANDLOCK		110
 #define LSM_ID_IMA		111
 #define LSM_ID_EVM		112
+#define LSM_ID_IPE		113
 
 /*
  * LSM_ATTR_XXX definitions identify different LSM attributes
diff --git a/security/Kconfig b/security/Kconfig
index 412e76f1575d..9fb8f9b14972 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -192,6 +192,7 @@ source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
 source "security/landlock/Kconfig"
+source "security/ipe/Kconfig"
 
 source "security/integrity/Kconfig"
 
@@ -231,11 +232,11 @@ endchoice
 
 config LSM
 	string "Ordered list of enabled LSMs"
-	default "landlock,lockdown,yama,loadpin,safesetid,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
-	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
-	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
-	default "landlock,lockdown,yama,loadpin,safesetid,bpf" if DEFAULT_SECURITY_DAC
-	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,bpf"
+	default "landlock,lockdown,yama,loadpin,safesetid,smack,selinux,tomoyo,apparmor,ipe,bpf" if DEFAULT_SECURITY_SMACK
+	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,ipe,bpf" if DEFAULT_SECURITY_APPARMOR
+	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,ipe,bpf" if DEFAULT_SECURITY_TOMOYO
+	default "landlock,lockdown,yama,loadpin,safesetid,ipe,bpf" if DEFAULT_SECURITY_DAC
+	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,bpf"
 	help
 	  A comma-separated list of LSMs, in initialization order.
 	  Any LSMs left off this list, except for those with order
diff --git a/security/Makefile b/security/Makefile
index 59f238490665..cc0982214b84 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown/
 obj-$(CONFIG_CGROUPS)			+= device_cgroup.o
 obj-$(CONFIG_BPF_LSM)			+= bpf/
 obj-$(CONFIG_SECURITY_LANDLOCK)		+= landlock/
+obj-$(CONFIG_SECURITY_IPE)		+= ipe/
 
 # Object integrity file lists
 obj-$(CONFIG_INTEGRITY)			+= integrity/
diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig
new file mode 100644
index 000000000000..e4875fb04883
--- /dev/null
+++ b/security/ipe/Kconfig
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Integrity Policy Enforcement (IPE) configuration
+#
+
+menuconfig SECURITY_IPE
+	bool "Integrity Policy Enforcement (IPE)"
+	depends on SECURITY && SECURITYFS
+	select PKCS7_MESSAGE_PARSER
+	select SYSTEM_DATA_VERIFICATION
+	help
+	  This option enables the Integrity Policy Enforcement LSM
+	  allowing users to define a policy to enforce a trust-based access
+	  control. A key feature of IPE is a customizable policy to allow
+	  admins to reconfigure trust requirements on the fly.
+
+	  If unsure, answer N.
diff --git a/security/ipe/Makefile b/security/ipe/Makefile
new file mode 100644
index 000000000000..5486398a69e9
--- /dev/null
+++ b/security/ipe/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+#
+# Makefile for building the IPE module as part of the kernel tree.
+#
+
+obj-$(CONFIG_SECURITY_IPE) += \
+	ipe.o \
diff --git a/security/ipe/ipe.c b/security/ipe/ipe.c
new file mode 100644
index 000000000000..8d4ea372873e
--- /dev/null
+++ b/security/ipe/ipe.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#include <uapi/linux/lsm.h>
+
+#include "ipe.h"
+
+static struct lsm_blob_sizes ipe_blobs __ro_after_init = {
+};
+
+static const struct lsm_id ipe_lsmid = {
+	.name = "ipe",
+	.id = LSM_ID_IPE,
+};
+
+static struct security_hook_list ipe_hooks[] __ro_after_init = {
+};
+
+/**
+ * ipe_init() - Entry point of IPE.
+ *
+ * This is called at LSM init, which occurs early during kernel
+ * start up. During this phase, IPE registers its hooks and loads the
+ * builtin boot policy.
+ *
+ * Return:
+ * * %0		- OK
+ * * %-ENOMEM	- Out of memory (OOM)
+ */
+static int __init ipe_init(void)
+{
+	security_add_hooks(ipe_hooks, ARRAY_SIZE(ipe_hooks), &ipe_lsmid);
+
+	return 0;
+}
+
+DEFINE_LSM(ipe) = {
+	.name = "ipe",
+	.init = ipe_init,
+	.blobs = &ipe_blobs,
+};
diff --git a/security/ipe/ipe.h b/security/ipe/ipe.h
new file mode 100644
index 000000000000..adc3c45e9f53
--- /dev/null
+++ b/security/ipe/ipe.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _IPE_H
+#define _IPE_H
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) "ipe: " fmt
+
+#include <linux/lsm_hooks.h>
+
+#endif /* _IPE_H */
diff --git a/security/security.c b/security/security.c
index 0a9a0ac3f266..820e0d437452 100644
--- a/security/security.c
+++ b/security/security.c
@@ -51,7 +51,8 @@
 	(IS_ENABLED(CONFIG_BPF_LSM) ? 1 : 0) + \
 	(IS_ENABLED(CONFIG_SECURITY_LANDLOCK) ? 1 : 0) + \
 	(IS_ENABLED(CONFIG_IMA) ? 1 : 0) + \
-	(IS_ENABLED(CONFIG_EVM) ? 1 : 0))
+	(IS_ENABLED(CONFIG_EVM) ? 1 : 0) + \
+	(IS_ENABLED(CONFIG_SECURITY_IPE) ? 1 : 0))
 
 /*
  * These are descriptions of the reasons that can be passed to the
diff --git a/tools/testing/selftests/lsm/lsm_list_modules_test.c b/tools/testing/selftests/lsm/lsm_list_modules_test.c
index 06d24d4679a6..1cc8a977c711 100644
--- a/tools/testing/selftests/lsm/lsm_list_modules_test.c
+++ b/tools/testing/selftests/lsm/lsm_list_modules_test.c
@@ -128,6 +128,9 @@ TEST(correct_lsm_list_modules)
 		case LSM_ID_EVM:
 			name = "evm";
 			break;
+		case LSM_ID_IPE:
+			name = "ipe";
+			break;
 		default:
 			name = "INVALID";
 			break;
-- 
2.44.0


^ permalink raw reply related	[relevance 47%]

* [PATCH v17 02/21] ipe: add policy parser
  2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
  2024-04-13  0:55 47% ` [PATCH v17 01/21] security: add ipe lsm Fan Wu
@ 2024-04-13  0:55 33% ` Fan Wu
  2024-04-13  0:55 54% ` [PATCH v17 03/21] ipe: add evaluation loop Fan Wu
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Deven Bowers, Fan Wu

From: Deven Bowers <deven.desai@linux.microsoft.com>

IPE's interpretation of what the user trusts is expressed through its
policy. IPE is designed not to support a single trust provider, but to
support multiple providers so that the end user can choose the one that
best suits their needs.

This requires the policy to be rather flexible and modular so that
integrity providers, like fs-verity, dm-verity, dm-integrity, or
some other system, can plug into the policy with minimal code changes.

Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>

---
v2:
  + Split evaluation loop, access control hooks,
    and evaluation loop from policy parser and userspace
    interface to pass mailing list character limit

v3:
  + Move policy load and activation audit event to 03/12
  + Fix a potential panic when a policy failed to load.
  + use pr_warn for a failure to parse instead of an
    audit record
  + Remove comments from headers
  + Add lockdep assertions to ipe_update_active_policy and
    ipe_activate_policy
  + Fix up warnings with checkpatch --strict
  + Use file_ns_capable for CAP_MAC_ADMIN for securityfs
    nodes.
  + Use memdup_user instead of kzalloc+simple_write_to_buffer.
  + Remove strict_parse command line parameter, as it is added
    by the sysctl command line.
  + Prefix extern variables with ipe_

v4:
  + Remove securityfs to reverse-dependency
  + Add SHA1 reverse dependency.
  + Add versioning scheme for IPE properties, and associated
    interface to query the versioning scheme.
  + Cause a parser to always return an error on unknown syntax.
  + Remove strict_parse option
  + Change active_policy interface from sysctl, to securityfs,
    and change scheme.

v5:
  + Cause an error if a default action is not defined for each
    operation.
  + Minor function renames

v6:
  + No changes

v7:
  + Further split parser and userspace interface into two
    separate commits, for easier review.
  + Refactor policy parser to make code cleaner via introducing a
    more modular design, for easier extension of policy, and
    easier review.

v8:
  + remove unnecessary pr_info emission on parser loading
  + add explicit newline to the pr_err emitted when a parser
    fails to load.

v9:
  + switch to match table to parse policy
  + remove quote syntax and KERNEL_READ operation

v10:
  + Fix memory leaks in parser
  + Fix typos and change code styles

v11:
  + Fix code style issues

v12:
  + Add __always_unused to an unused parameter
  + Simplify error case handling

v13:
  + No changes

v14:
  + No changes

v15:
  + No changes

v16:
  + No changes

v17:
  + Add years to license header
  + Fix code and documentation style issues
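
Note: the ownership rule that ipe_free_policy() in this patch relies on
(the policy text is a separate allocation only in the plain-text case; in
the signed case it is a view into the pkcs7 buffer) can be sketched in
userspace as below. This is a simplified sketch under stated assumptions:
all names are hypothetical, malloc/free replace the kernel allocators, and
an explicit payload offset and length stand in for what signature
verification would report.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct policy {
	char *pkcs7;		/* owned copy of the signed blob, or NULL */
	size_t pkcs7len;
	const char *text;	/* owned iff pkcs7 is NULL, else points into pkcs7 */
	size_t textlen;
};

/* Safe to call on NULL. */
static void policy_free(struct policy *p)
{
	if (!p)
		return;
	if (!p->pkcs7)
		free((void *)p->text);	/* plain-text case: text is our own copy */
	free(p->pkcs7);			/* signed case: text was a view into this */
	free(p);
}

/* Plain-text case: duplicate the text so the policy owns it. */
static struct policy *policy_from_text(const char *text)
{
	struct policy *p = calloc(1, sizeof(*p));
	char *copy;

	if (!p)
		return NULL;
	p->textlen = strlen(text);
	copy = malloc(p->textlen + 1);
	if (!copy) {
		free(p);
		return NULL;
	}
	memcpy(copy, text, p->textlen + 1);
	p->text = copy;
	return p;
}

/* Signed case: copy the blob and point text at the payload inside it. */
static struct policy *policy_from_blob(const char *blob, size_t len,
				       size_t off, size_t payload_len)
{
	struct policy *p;

	assert(off <= len && payload_len <= len - off);

	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;
	p->pkcs7 = malloc(len ? len : 1);
	if (!p->pkcs7) {
		free(p);
		return NULL;
	}
	memcpy(p->pkcs7, blob, len);
	p->pkcs7len = len;
	p->text = p->pkcs7 + off;
	p->textlen = payload_len;
	return p;
}
```

Freeing text unconditionally would release a pointer into the middle of
the pkcs7 buffer in the signed case, which is why the free is conditional.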
---
 security/ipe/Makefile        |   2 +
 security/ipe/policy.c        | 103 ++++++++
 security/ipe/policy.h        |  83 ++++++
 security/ipe/policy_parser.c | 495 +++++++++++++++++++++++++++++++++++
 security/ipe/policy_parser.h |  11 +
 5 files changed, 694 insertions(+)
 create mode 100644 security/ipe/policy.c
 create mode 100644 security/ipe/policy.h
 create mode 100644 security/ipe/policy_parser.c
 create mode 100644 security/ipe/policy_parser.h

diff --git a/security/ipe/Makefile b/security/ipe/Makefile
index 5486398a69e9..3093de1afd3e 100644
--- a/security/ipe/Makefile
+++ b/security/ipe/Makefile
@@ -7,3 +7,5 @@
 
 obj-$(CONFIG_SECURITY_IPE) += \
 	ipe.o \
+	policy.o \
+	policy_parser.o \
diff --git a/security/ipe/policy.c b/security/ipe/policy.c
new file mode 100644
index 000000000000..dd7b5b79903a
--- /dev/null
+++ b/security/ipe/policy.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/errno.h>
+#include <linux/verification.h>
+
+#include "ipe.h"
+#include "policy.h"
+#include "policy_parser.h"
+
+/**
+ * ipe_free_policy() - Deallocate a given IPE policy.
+ * @p: Supplies the policy to free.
+ *
+ * Safe to call on IS_ERR/NULL.
+ */
+void ipe_free_policy(struct ipe_policy *p)
+{
+	if (IS_ERR_OR_NULL(p))
+		return;
+
+	ipe_free_parsed_policy(p->parsed);
+	/*
+	 * p->text is allocated only when p->pkcs7 is NULL;
+	 * otherwise it points to the plaintext data inside the pkcs7
+	 */
+	if (!p->pkcs7)
+		kfree(p->text);
+	kfree(p->pkcs7);
+	kfree(p);
+}
+
+static int set_pkcs7_data(void *ctx, const void *data, size_t len,
+			  size_t asn1hdrlen __always_unused)
+{
+	struct ipe_policy *p = ctx;
+
+	p->text = (const char *)data;
+	p->textlen = len;
+
+	return 0;
+}
+
+/**
+ * ipe_new_policy() - Allocate and parse an ipe_policy structure.
+ *
+ * @text: Supplies a pointer to the plain-text policy to parse.
+ * @textlen: Supplies the length of @text.
+ * @pkcs7: Supplies a pointer to a pkcs7-signed IPE policy.
+ * @pkcs7len: Supplies the length of @pkcs7.
+ *
+ * @text/@textlen Should be NULL/0 if @pkcs7/@pkcs7len is set.
+ *
+ * Return:
+ * * a pointer to the ipe_policy structure	- Success
+ * * %-EBADMSG					- Policy is invalid
+ * * %-ENOMEM					- Out of memory (OOM)
+ * * %-ERANGE					- Policy version number overflow
+ * * %-EINVAL					- Policy version parsing error
+ */
+struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
+				  const char *pkcs7, size_t pkcs7len)
+{
+	struct ipe_policy *new = NULL;
+	int rc = 0;
+
+	new = kzalloc(sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+
+	if (!text) {
+		new->pkcs7len = pkcs7len;
+		new->pkcs7 = kmemdup(pkcs7, pkcs7len, GFP_KERNEL);
+		if (!new->pkcs7) {
+			rc = -ENOMEM;
+			goto err;
+		}
+
+		rc = verify_pkcs7_signature(NULL, 0, new->pkcs7, pkcs7len, NULL,
+					    VERIFYING_UNSPECIFIED_SIGNATURE,
+					    set_pkcs7_data, new);
+		if (rc)
+			goto err;
+	} else {
+		new->textlen = textlen;
+		new->text = kstrdup(text, GFP_KERNEL);
+		if (!new->text) {
+			rc = -ENOMEM;
+			goto err;
+		}
+	}
+
+	rc = ipe_parse_policy(new);
+	if (rc)
+		goto err;
+
+	return new;
+err:
+	ipe_free_policy(new);
+	return ERR_PTR(rc);
+}
diff --git a/security/ipe/policy.h b/security/ipe/policy.h
new file mode 100644
index 000000000000..8292ffaaff12
--- /dev/null
+++ b/security/ipe/policy.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#ifndef _IPE_POLICY_H
+#define _IPE_POLICY_H
+
+#include <linux/list.h>
+#include <linux/types.h>
+
+enum ipe_op_type {
+	IPE_OP_EXEC = 0,
+	IPE_OP_FIRMWARE,
+	IPE_OP_KERNEL_MODULE,
+	IPE_OP_KEXEC_IMAGE,
+	IPE_OP_KEXEC_INITRAMFS,
+	IPE_OP_POLICY,
+	IPE_OP_X509,
+	__IPE_OP_MAX,
+};
+
+#define IPE_OP_INVALID __IPE_OP_MAX
+
+enum ipe_action_type {
+	IPE_ACTION_ALLOW = 0,
+	IPE_ACTION_DENY,
+	__IPE_ACTION_MAX
+};
+
+#define IPE_ACTION_INVALID __IPE_ACTION_MAX
+
+enum ipe_prop_type {
+	__IPE_PROP_MAX
+};
+
+#define IPE_PROP_INVALID __IPE_PROP_MAX
+
+struct ipe_prop {
+	struct list_head next;
+	enum ipe_prop_type type;
+	void *value;
+};
+
+struct ipe_rule {
+	enum ipe_op_type op;
+	enum ipe_action_type action;
+	struct list_head props;
+	struct list_head next;
+};
+
+struct ipe_op_table {
+	struct list_head rules;
+	enum ipe_action_type default_action;
+};
+
+struct ipe_parsed_policy {
+	const char *name;
+	struct {
+		u16 major;
+		u16 minor;
+		u16 rev;
+	} version;
+
+	enum ipe_action_type global_default_action;
+
+	struct ipe_op_table rules[__IPE_OP_MAX];
+};
+
+struct ipe_policy {
+	const char *pkcs7;
+	size_t pkcs7len;
+
+	const char *text;
+	size_t textlen;
+
+	struct ipe_parsed_policy *parsed;
+};
+
+struct ipe_policy *ipe_new_policy(const char *text, size_t textlen,
+				  const char *pkcs7, size_t pkcs7len);
+void ipe_free_policy(struct ipe_policy *pol);
+
+#endif /* _IPE_POLICY_H */
diff --git a/security/ipe/policy_parser.c b/security/ipe/policy_parser.c
new file mode 100644
index 000000000000..32064262348a
--- /dev/null
+++ b/security/ipe/policy_parser.c
@@ -0,0 +1,495 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/parser.h>
+#include <linux/types.h>
+#include <linux/ctype.h>
+
+#include "policy.h"
+#include "policy_parser.h"
+
+#define START_COMMENT	'#'
+#define IPE_POLICY_DELIM " \t"
+#define IPE_LINE_DELIM "\n\r"
+
+/**
+ * new_parsed_policy() - Allocate and initialize a parsed policy.
+ *
+ * Return:
+ * * a pointer to the ipe_parsed_policy structure	- Success
+ * * %-ENOMEM						- Out of memory (OOM)
+ */
+static struct ipe_parsed_policy *new_parsed_policy(void)
+{
+	struct ipe_parsed_policy *p = NULL;
+	struct ipe_op_table *t = NULL;
+	size_t i = 0;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return ERR_PTR(-ENOMEM);
+
+	p->global_default_action = IPE_ACTION_INVALID;
+
+	for (i = 0; i < ARRAY_SIZE(p->rules); ++i) {
+		t = &p->rules[i];
+
+		t->default_action = IPE_ACTION_INVALID;
+		INIT_LIST_HEAD(&t->rules);
+	}
+
+	return p;
+}
+
+/**
+ * remove_comment() - Truncate all chars following START_COMMENT in a string.
+ *
+ * @line: Supplies a policy line string for preprocessing.
+ */
+static void remove_comment(char *line)
+{
+	line = strchr(line, START_COMMENT);
+
+	if (line)
+		*line = '\0';
+}
+
+/**
+ * remove_trailing_spaces() - Truncate all trailing spaces in a string.
+ *
+ * @line: Supplies a policy line string for preprocessing.
+ *
+ * Return: The length of truncated string.
+ */
+static size_t remove_trailing_spaces(char *line)
+{
+	size_t i = 0;
+
+	i = strlen(line);
+	while (i > 0 && isspace(line[i - 1]))
+		i--;
+
+	line[i] = '\0';
+
+	return i;
+}
+
+/**
+ * parse_version() - Parse policy version.
+ * @ver: Supplies a version string to be parsed.
+ * @p: Supplies the partial parsed policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Version string is invalid
+ * * %-ERANGE	- Version number overflow
+ * * %-EINVAL	- Parsing error
+ */
+static int parse_version(char *ver, struct ipe_parsed_policy *p)
+{
+	u16 *const cv[] = { &p->version.major, &p->version.minor, &p->version.rev };
+	size_t sep_count = 0;
+	char *token;
+	int rc = 0;
+
+	while ((token = strsep(&ver, ".")) != NULL) {
+		/* prevent overflow */
+		if (sep_count >= ARRAY_SIZE(cv))
+			return -EBADMSG;
+
+		rc = kstrtou16(token, 10, cv[sep_count]);
+		if (rc)
+			return rc;
+
+		++sep_count;
+	}
+
+	/* prevent underflow */
+	if (sep_count != ARRAY_SIZE(cv))
+		return -EBADMSG;
+
+	return 0;
+}
+
+enum header_opt {
+	IPE_HEADER_POLICY_NAME = 0,
+	IPE_HEADER_POLICY_VERSION,
+	__IPE_HEADER_MAX
+};
+
+static const match_table_t header_tokens = {
+	{IPE_HEADER_POLICY_NAME,	"policy_name=%s"},
+	{IPE_HEADER_POLICY_VERSION,	"policy_version=%s"},
+	{__IPE_HEADER_MAX,		NULL}
+};
+
+/**
+ * parse_header() - Parse policy header information.
+ * @line: Supplies header line to be parsed.
+ * @p: Supplies the partial parsed policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Header string is invalid
+ * * %-ENOMEM	- Out of memory (OOM)
+ * * %-ERANGE	- Version number overflow
+ * * %-EINVAL	- Version parsing error
+ */
+static int parse_header(char *line, struct ipe_parsed_policy *p)
+{
+	substring_t args[MAX_OPT_ARGS];
+	char *t, *ver = NULL;
+	size_t idx = 0;
+	int rc = 0;
+
+	while ((t = strsep(&line, IPE_POLICY_DELIM)) != NULL) {
+		int token;
+
+		if (*t == '\0')
+			continue;
+		if (idx >= __IPE_HEADER_MAX) {
+			rc = -EBADMSG;
+			goto out;
+		}
+
+		token = match_token(t, header_tokens, args);
+		if (token != idx) {
+			rc = -EBADMSG;
+			goto out;
+		}
+
+		switch (token) {
+		case IPE_HEADER_POLICY_NAME:
+			p->name = match_strdup(&args[0]);
+			if (!p->name)
+				rc = -ENOMEM;
+			break;
+		case IPE_HEADER_POLICY_VERSION:
+			ver = match_strdup(&args[0]);
+			if (!ver) {
+				rc = -ENOMEM;
+				break;
+			}
+			rc = parse_version(ver, p);
+			break;
+		default:
+			rc = -EBADMSG;
+		}
+		if (rc)
+			goto out;
+		++idx;
+	}
+
+	if (idx != __IPE_HEADER_MAX)
+		rc = -EBADMSG;
+
+out:
+	kfree(ver);
+	return rc;
+}
+
+/**
+ * token_default() - Determine if the given token is "DEFAULT".
+ * @token: Supplies the token string to be compared.
+ *
+ * Return:
+ * * %false	- The token is not "DEFAULT"
+ * * %true	- The token is "DEFAULT"
+ */
+static bool token_default(char *token)
+{
+	return !strcmp(token, "DEFAULT");
+}
+
+/**
+ * free_rule() - Free the supplied ipe_rule struct.
+ * @r: Supplies the ipe_rule struct to be freed.
+ *
+ * Free an ipe_rule struct @r. Note @r must be removed from any lists before
+ * calling this function.
+ */
+static void free_rule(struct ipe_rule *r)
+{
+	struct ipe_prop *p, *t;
+
+	if (IS_ERR_OR_NULL(r))
+		return;
+
+	list_for_each_entry_safe(p, t, &r->props, next) {
+		list_del(&p->next);
+		kfree(p);
+	}
+
+	kfree(r);
+}
+
+static const match_table_t operation_tokens = {
+	{IPE_OP_EXEC,			"op=EXECUTE"},
+	{IPE_OP_FIRMWARE,		"op=FIRMWARE"},
+	{IPE_OP_KERNEL_MODULE,		"op=KMODULE"},
+	{IPE_OP_KEXEC_IMAGE,		"op=KEXEC_IMAGE"},
+	{IPE_OP_KEXEC_INITRAMFS,	"op=KEXEC_INITRAMFS"},
+	{IPE_OP_POLICY,			"op=POLICY"},
+	{IPE_OP_X509,			"op=X509_CERT"},
+	{IPE_OP_INVALID,		NULL}
+};
+
+/**
+ * parse_operation() - Parse the operation type given a token string.
+ * @t: Supplies the token string to be parsed.
+ *
+ * Return: The parsed operation type.
+ */
+static enum ipe_op_type parse_operation(char *t)
+{
+	substring_t args[MAX_OPT_ARGS];
+
+	return match_token(t, operation_tokens, args);
+}
+
+static const match_table_t action_tokens = {
+	{IPE_ACTION_ALLOW,	"action=ALLOW"},
+	{IPE_ACTION_DENY,	"action=DENY"},
+	{IPE_ACTION_INVALID,	NULL}
+};
+
+/**
+ * parse_action() - Parse the action type given a token string.
+ * @t: Supplies the token string to be parsed.
+ *
+ * Return: The parsed action type.
+ */
+static enum ipe_action_type parse_action(char *t)
+{
+	substring_t args[MAX_OPT_ARGS];
+
+	return match_token(t, action_tokens, args);
+}
+
+/**
+ * parse_property() - Parse a rule property given a token string.
+ * @t: Supplies the token string to be parsed.
+ * @r: Supplies the ipe_rule the parsed property will be associated with.
+ *
+ * This is a placeholder. The actual function will be introduced in
+ * later commits.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-ENOMEM	- Out of memory (OOM)
+ * * %-EBADMSG	- The supplied token cannot be parsed
+ */
+static int parse_property(char *t, struct ipe_rule *r)
+{
+	return -EBADMSG;
+}
+
+/**
+ * parse_rule() - parse a policy rule line.
+ * @line: Supplies rule line to be parsed.
+ * @p: Supplies the partial parsed policy.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-ENOMEM	- Out of memory (OOM)
+ * * %-EBADMSG	- Policy syntax error
+ */
+static int parse_rule(char *line, struct ipe_parsed_policy *p)
+{
+	enum ipe_action_type action = IPE_ACTION_INVALID;
+	enum ipe_op_type op = IPE_OP_INVALID;
+	bool is_default_rule = false;
+	struct ipe_rule *r = NULL;
+	bool first_token = true;
+	bool op_parsed = false;
+	int rc = 0;
+	char *t;
+
+	r = kzalloc(sizeof(*r), GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&r->next);
+	INIT_LIST_HEAD(&r->props);
+
+	while (t = strsep(&line, IPE_POLICY_DELIM), line) {
+		if (*t == '\0')
+			continue;
+		if (first_token && token_default(t)) {
+			is_default_rule = true;
+		} else {
+			if (!op_parsed) {
+				op = parse_operation(t);
+				if (op == IPE_OP_INVALID)
+					rc = -EBADMSG;
+				else
+					op_parsed = true;
+			} else {
+				rc = parse_property(t, r);
+			}
+		}
+
+		if (rc)
+			goto err;
+		first_token = false;
+	}
+
+	action = parse_action(t);
+	if (action == IPE_ACTION_INVALID) {
+		rc = -EBADMSG;
+		goto err;
+	}
+
+	if (is_default_rule) {
+		if (!list_empty(&r->props)) {
+			rc = -EBADMSG;
+		} else if (op == IPE_OP_INVALID) {
+			if (p->global_default_action != IPE_ACTION_INVALID)
+				rc = -EBADMSG;
+			else
+				p->global_default_action = action;
+		} else {
+			if (p->rules[op].default_action != IPE_ACTION_INVALID)
+				rc = -EBADMSG;
+			else
+				p->rules[op].default_action = action;
+		}
+	} else if (op != IPE_OP_INVALID && action != IPE_ACTION_INVALID) {
+		r->op = op;
+		r->action = action;
+	} else {
+		rc = -EBADMSG;
+	}
+
+	if (rc)
+		goto err;
+	if (!is_default_rule)
+		list_add_tail(&r->next, &p->rules[op].rules);
+	else
+		free_rule(r);
+
+	return rc;
+err:
+	free_rule(r);
+	return rc;
+}
+
+/**
+ * ipe_free_parsed_policy() - free a parsed policy structure.
+ * @p: Supplies the parsed policy.
+ */
+void ipe_free_parsed_policy(struct ipe_parsed_policy *p)
+{
+	struct ipe_rule *pp, *t;
+	size_t i = 0;
+
+	if (IS_ERR_OR_NULL(p))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(p->rules); ++i)
+		list_for_each_entry_safe(pp, t, &p->rules[i].rules, next) {
+			list_del(&pp->next);
+			free_rule(pp);
+		}
+
+	kfree(p->name);
+	kfree(p);
+}
+
+/**
+ * validate_policy() - validate a parsed policy.
+ * @p: Supplies the fully parsed policy.
+ *
+ * Given a policy structure that was just parsed, validate that all
+ * operations have their default rules or a global default rule is set.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Policy is invalid
+ */
+static int validate_policy(const struct ipe_parsed_policy *p)
+{
+	size_t i = 0;
+
+	if (p->global_default_action != IPE_ACTION_INVALID)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(p->rules); ++i) {
+		if (p->rules[i].default_action == IPE_ACTION_INVALID)
+			return -EBADMSG;
+	}
+
+	return 0;
+}
+
+/**
+ * ipe_parse_policy() - Given a string, parse the string into an IPE policy.
+ * @p: partially filled ipe_policy structure to populate with the result.
+ *     it must have text and textlen set.
+ *
+ * Return:
+ * * %0		- Success
+ * * %-EBADMSG	- Policy is invalid
+ * * %-ENOMEM	- Out of Memory
+ * * %-ERANGE	- Policy version number overflow
+ * * %-EINVAL	- Policy version parsing error
+ */
+int ipe_parse_policy(struct ipe_policy *p)
+{
+	struct ipe_parsed_policy *pp = NULL;
+	char *policy = NULL, *dup = NULL;
+	bool header_parsed = false;
+	char *line = NULL;
+	size_t len;
+	int rc = 0;
+
+	if (!p->textlen)
+		return -EBADMSG;
+
+	policy = kmemdup_nul(p->text, p->textlen, GFP_KERNEL);
+	if (!policy)
+		return -ENOMEM;
+	dup = policy;
+
+	pp = new_parsed_policy();
+	if (IS_ERR(pp)) {
+		rc = PTR_ERR(pp);
+		goto out;
+	}
+
+	while ((line = strsep(&policy, IPE_LINE_DELIM)) != NULL) {
+		remove_comment(line);
+		len = remove_trailing_spaces(line);
+		if (!len)
+			continue;
+
+		if (!header_parsed) {
+			rc = parse_header(line, pp);
+			if (rc)
+				goto err;
+			header_parsed = true;
+		} else {
+			rc = parse_rule(line, pp);
+			if (rc)
+				goto err;
+		}
+	}
+
+	if (!header_parsed || validate_policy(pp)) {
+		rc = -EBADMSG;
+		goto err;
+	}
+
+	p->parsed = pp;
+
+out:
+	kfree(dup);
+	return rc;
+err:
+	ipe_free_parsed_policy(pp);
+	goto out;
+}
diff --git a/security/ipe/policy_parser.h b/security/ipe/policy_parser.h
new file mode 100644
index 000000000000..62b6209019a2
--- /dev/null
+++ b/security/ipe/policy_parser.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020-2024 Microsoft Corporation. All rights reserved.
+ */
+#ifndef _IPE_POLICY_PARSER_H
+#define _IPE_POLICY_PARSER_H
+
+int ipe_parse_policy(struct ipe_policy *p);
+void ipe_free_parsed_policy(struct ipe_parsed_policy *p);
+
+#endif /* _IPE_POLICY_PARSER_H */
-- 
2.44.0


* [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE)
@ 2024-04-13  0:55 24% Fan Wu
  2024-04-13  0:55 47% ` [PATCH v17 01/21] security: add ipe lsm Fan Wu
                   ` (20 more replies)
  0 siblings, 21 replies; 200+ results
From: Fan Wu @ 2024-04-13  0:55 UTC (permalink / raw)
  To: corbet, zohar, jmorris, serge, tytso, ebiggers, axboe, agk,
	snitzer, eparis, paul
  Cc: linux-doc, linux-integrity, linux-security-module, fsverity,
	linux-block, dm-devel, audit, linux-kernel, Fan Wu

IPE is a Linux Security Module that takes a complementary approach to
access control. Unlike traditional access control mechanisms that rely on
labels and paths for decision-making, IPE focuses on the immutable security
properties inherent to system components. These properties are fundamental
attributes or features of a system component that cannot be altered,
ensuring a consistent and reliable basis for security decisions.

To elaborate, in the context of IPE, system components primarily refer to
files or the devices these files reside on. However, this is just a
starting point. The concept of system components is flexible and can be
extended to include new elements as the system evolves. The immutable
properties include the origin of a file, which remains constant and
unchangeable over time. For example, IPE policies can be crafted to trust
files originating from the initramfs. Since initramfs is typically verified
by the bootloader, its files are deemed trustworthy; "file is from
initramfs" becomes an immutable property under IPE's consideration.

The immutable property concept extends to the security features enabled on
a file's origin, such as dm-verity or fs-verity, which provide a layer of
integrity and trust. For example, IPE allows the definition of policies
that trust files from a dm-verity protected device. dm-verity ensures the
integrity of an entire device by providing a verifiable and immutable state
of its contents. Similarly, fs-verity offers filesystem-level integrity
checks, allowing IPE to enforce policies that trust files protected by
fs-verity. These two features cannot be turned off once established, so
they are considered immutable properties. These examples demonstrate how
IPE leverages immutable properties, such as a file's origin and its
integrity protection mechanisms, to make access control decisions.

The IPE policy, specifically, grants the ability to enforce
stringent access controls by assessing security properties against
reference values defined within the policy. This assessment can be based on
the existence of a security property (e.g., verifying if a file originates
from initramfs) or evaluating the internal state of an immutable security
property. The latter includes checking the roothash of a dm-verity
protected device, determining whether dm-verity possesses a valid
signature, assessing the digest of a fs-verity protected file, or
determining whether fs-verity possesses a valid built-in signature. This
nuanced approach to policy enforcement enables a highly secure and
customizable system defense mechanism, tailored to specific security
requirements and trust models.

IPE is compiled under CONFIG_SECURITY_IPE.

Use Cases
---------

IPE works best in fixed-function devices: devices whose purpose is
clearly defined and not supposed to change (e.g. a network firewall
device in a data center, an IoT device, etc.), where all software and
configuration is built and provisioned by the system owner.

IPE is a long way off from use in general-purpose computing: the Linux
community as a whole tends to follow a decentralized trust model,
known as the web of trust, which IPE does not support yet.
There are exceptions, such as the case where a Linux distribution
vendor trusts only their own keys, where IPE can successfully be used
to enforce the trust requirement.

Additionally, while most packages are signed today, the files inside
the packages (for instance, the executables) tend to be unsigned. This
makes it difficult to utilize IPE in systems where a package manager is
expected to be functional, without major changes to the package manager
and ecosystem behind it.

DIGLIM[1] is a system that, when combined with IPE, could be used to
enable general-purpose computing scenarios.

Policy
-------

IPE policy is a plain-text policy composed of multiple statements
over several lines. There is one required line, at the top of the
policy, indicating the policy name, and the policy version, for
instance:

  policy_name=Ex_Policy policy_version=0.0.0

The policy version indicates the current version of the policy. This is
used to prevent roll-back of policy to potentially insecure previous
versions of the policy.
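
As a rough illustration of the version field, the following userspace
sketch parses a "major.minor.rev" string into three 16-bit numbers. The
names struct policy_version and parse_policy_version() are hypothetical
and exist only for this example; the error codes are also simplified
compared with the in-kernel parser in this series.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace mirror of a three-part policy version. */
struct policy_version {
	uint16_t major, minor, rev;
};

/*
 * Parse "major.minor.rev" into *out.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 * On failure, *out may have been partially written.
 */
static int parse_policy_version(const char *s, struct policy_version *out)
{
	uint16_t *const fields[] = { &out->major, &out->minor, &out->rev };
	size_t i;

	for (i = 0; i < 3; i++) {
		char *end;
		unsigned long v;

		/* Each field must start with a digit: no sign, no whitespace. */
		if (*s < '0' || *s > '9')
			return -EINVAL;

		errno = 0;
		v = strtoul(s, &end, 10);
		if (errno == ERANGE || v > UINT16_MAX)
			return -ERANGE;

		/* Fields are separated by '.'; the last one ends the string. */
		if (*end != (i < 2 ? '.' : '\0'))
			return -EINVAL;

		*fields[i] = (uint16_t)v;
		s = end + 1;
	}

	return 0;
}
```

With this sketch, "0.0.0" and "1.2.3" parse successfully, while "1.2",
"1.2.3.4" and "65536.0.0" are rejected.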

The next portion of an IPE policy is the rules. Rules are formed by
key=value pairs, known as properties. IPE rules require two keys: "op",
which determines when the rule should be evaluated, and "action", which
determines what IPE does when the rule matches.

Thus, a minimal rule is:

  op=EXECUTE action=ALLOW

This example rule will allow any execution. A rule is required to have the
"op" property as the first token of a rule, and the "action" as the last
token of the rule.
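
The ordering requirement can be sketched in userspace C as follows.
rule_has_valid_shape() is a hypothetical helper written for this
example only: it checks token positions, does not validate the values,
and does not handle DEFAULT lines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RULE_DELIM " \t"

/*
 * Hypothetical helper: return true if @line has "op=..." as its first
 * token and "action=..." as its last token. Values are not validated,
 * and "DEFAULT" lines are deliberately out of scope.
 */
static bool rule_has_valid_shape(const char *line)
{
	const char *first, *last = NULL, *p;
	size_t ntokens = 0;

	/* Skip leading delimiters to find the first token. */
	first = line + strspn(line, RULE_DELIM);

	/* Walk the tokens, remembering where the last one starts. */
	for (p = first; *p != '\0'; p += strspn(p, RULE_DELIM)) {
		last = p;
		ntokens++;
		p += strcspn(p, RULE_DELIM);
	}

	/* A rule needs at least an operation and an action. */
	if (ntokens < 2)
		return false;

	return strncmp(first, "op=", 3) == 0 &&
	       strncmp(last, "action=", 7) == 0;
}
```

Under these assumptions, "op=EXECUTE action=ALLOW" is accepted, while
"action=ALLOW op=EXECUTE" and a lone "op=EXECUTE" are rejected.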

Additional properties are used to assess immutable security properties
about the files being evaluated. These properties are intended to be
deterministic attributes that are resident in the kernel.

For example:

  op=EXECUTE dmverity_signature=FALSE action=DENY

This rule, with the property dmverity_signature, denies execution of
any file that is not from a signed dm-verity volume.

All available properties for IPE are described in the documentation
patch of this series.

Rules are evaluated top-to-bottom. As a result, any revocation rules,
or denies should be placed early in the file to ensure that these rules
are evaluated before a rule with "action=ALLOW" is hit.
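
A minimal sketch of this ordering, assuming that the first matching
rule decides the outcome and that the DEFAULT action applies when no
rule matches. All types and names here (struct eval_ctx, struct rule,
evaluate()) are hypothetical simplifications for illustration, not the
evaluation loop from this series; each rule carries at most one
boolean property.

```c
#include <stdbool.h>
#include <stddef.h>

enum action { ACTION_ALLOW, ACTION_DENY };
enum prop { PROP_NONE, PROP_BOOT_VERIFIED, PROP_DMVERITY_SIGNATURE };

/* Hypothetical, simplified view of the file being evaluated. */
struct eval_ctx {
	bool boot_verified;
	bool dmverity_signature;
};

/* One rule with at most one boolean property, for brevity. */
struct rule {
	enum prop prop;
	bool expected;
	enum action action;
};

static bool rule_matches(const struct rule *r, const struct eval_ctx *ctx)
{
	switch (r->prop) {
	case PROP_BOOT_VERIFIED:
		return ctx->boot_verified == r->expected;
	case PROP_DMVERITY_SIGNATURE:
		return ctx->dmverity_signature == r->expected;
	case PROP_NONE:
	default:
		return true; /* a rule without properties matches everything */
	}
}

/* Walk rules in policy order; the first match decides, else the default. */
static enum action evaluate(const struct rule *rules, size_t nrules,
			    enum action default_action,
			    const struct eval_ctx *ctx)
{
	size_t i;

	for (i = 0; i < nrules; i++) {
		if (rule_matches(&rules[i], ctx))
			return rules[i].action;
	}

	return default_action;
}
```

In this model, placing a property-less ALLOW rule ahead of a DENY rule
shadows the DENY rule entirely, which is why denies belong first.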

Any unknown syntax in an IPE policy is a fatal error when parsing the
policy.

Additionally, a DEFAULT operation must be set for all understood
operations within IPE. For policies to remain completely forwards
compatible, it is recommended that users add a "DEFAULT action=ALLOW"
and override the defaults on a per-operation basis.
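
The DEFAULT requirement can be sketched as below, loosely following
the validation step in the parser patch: a policy is accepted if a
global default is set, or if every operation has its own default. The
types are hypothetical, the operation list is abbreviated to three
entries, and the lookup helper assumes that a per-operation default
takes precedence over the global one.

```c
#include <stdbool.h>
#include <stddef.h>

enum action { ACTION_ALLOW, ACTION_DENY, ACTION_UNSET };
enum op { OP_EXECUTE, OP_FIRMWARE, OP_KMODULE, OP_MAX };

/* Hypothetical container for the DEFAULT statements of one policy. */
struct policy_defaults {
	enum action global;		/* "DEFAULT action=..." */
	enum action per_op[OP_MAX];	/* "DEFAULT op=... action=..." */
};

/* A policy is acceptable if a global default exists, or every op has one. */
static bool defaults_are_complete(const struct policy_defaults *d)
{
	size_t i;

	if (d->global != ACTION_UNSET)
		return true;

	for (i = 0; i < (size_t)OP_MAX; i++) {
		if (d->per_op[i] == ACTION_UNSET)
			return false;
	}

	return true;
}

/* Assumption: a per-operation default, when present, wins over the global. */
static enum action effective_default(const struct policy_defaults *d,
				     enum op op)
{
	if (d->per_op[op] != ACTION_UNSET)
		return d->per_op[op];

	return d->global;
}
```

This also shows why a global "DEFAULT action=ALLOW" helps forward
compatibility: an operation added later still resolves to a defined
default without any change to the policy text.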

For more information about the policy syntax, see the kernel
documentation page.

Early Usermode Protection
--------------------------

IPE can be provided with a policy at startup to load and enforce.
This is intended to be a minimal policy to get the system to a state
where userspace is set up and ready to receive commands, at which
point a policy can be deployed via securityfs. This "boot policy" can be
specified via the config, SECURITY_IPE_BOOT_POLICY, which accepts a path
to a plain-text version of the IPE policy to apply. This policy will be
compiled into the kernel. If not specified, IPE will be disabled until a
policy is deployed and activated through the method above.

Policy Examples
----------------

Allow all:

  policy_name=Allow_All policy_version=0.0.0
  DEFAULT action=ALLOW

Allow only initramfs:

  policy_name=Allow_All_Initramfs policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW

Allow any signed dm-verity volume and the initramfs:

  policy_name=AllowSignedAndInitramfs policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Prohibit execution from a specific dm-verity volume, while allowing
all signed volumes and the initramfs:

  policy_name=ProhibitSingleVolume policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=sha256:401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=DENY
  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Allow only a specific dm-verity volume:

  policy_name=AllowSpecific policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE dmverity_roothash=sha256:401fcec5944823ae12f62726e8184407a5fa9599783f030dec146938 action=ALLOW

Allow any signed fs-verity file:

  policy_name=AllowSignedFSVerity policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE fsverity_signature=TRUE action=ALLOW

Deny a specific fs-verity file:

  policy_name=ProhibitSpecificFSVF policy_version=0.0.0
  DEFAULT action=DENY

  op=EXECUTE fsverity_digest=sha256:fd88f2b8824e197f850bf4c5109bea5cf0ee38104f710843bb72da796ba5af9e action=DENY
  op=EXECUTE boot_verified=TRUE action=ALLOW
  op=EXECUTE dmverity_signature=TRUE action=ALLOW

Deploying Policies
-------------------

First, sign a plain-text policy with a certificate that is present in
the SYSTEM_TRUSTED_KEYRING of your test machine. With openssl, the
signing can be done via:

  openssl smime -sign -in "$MY_POLICY" -signer "$MY_CERTIFICATE" \
    -inkey "$MY_PRIVATE_KEY" -outform der -noattr -nodetach \
    -out "$MY_POLICY.p7s"

Then, simply cat the file into the IPE's "new_policy" securityfs node:

  cat "$MY_POLICY.p7s" > /sys/kernel/security/ipe/new_policy

The policy should now be present under the policies/ subdirectory, under
its "policy_name" attribute.

The policy is now present in the kernel and can be marked as active,
via the securityfs node:

  echo 1 > "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/active"

This marks the policy as active, and the system will enforce
$MY_POLICY_NAME.

There is one requirement when marking a policy as active: its
policy_version attribute must be greater than or equal to that of the
currently running policy.
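
A sketch of this activation check, assuming versions are ordered by
major, then minor, then rev. The names struct policy_version,
policy_version_cmp() and may_activate() are hypothetical and are not
taken from this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of a three-part policy version. */
struct policy_version {
	uint16_t major, minor, rev;
};

/* Return <0, 0 or >0 if @a is older than, equal to or newer than @b. */
static int policy_version_cmp(const struct policy_version *a,
			      const struct policy_version *b)
{
	if (a->major != b->major)
		return a->major < b->major ? -1 : 1;
	if (a->minor != b->minor)
		return a->minor < b->minor ? -1 : 1;
	if (a->rev != b->rev)
		return a->rev < b->rev ? -1 : 1;

	return 0;
}

/* A candidate may replace the active policy only if it is not older. */
static bool may_activate(const struct policy_version *candidate,
			 const struct policy_version *active)
{
	return policy_version_cmp(candidate, active) >= 0;
}
```

Under this ordering, 1.0.0 may replace 0.9.9, and a policy may replace
another of the same version, but 0.9.9 may not replace 1.0.0.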

Policies can be updated via:

  cat "$MY_UPDATED_POLICY.p7s" > \
    "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/update"

Additionally, policies can be deleted via the "delete" securityfs
node. Simply write "1" to the corresponding node in the policy folder:

  echo 1 > "/sys/kernel/security/ipe/policies/$MY_POLICY_NAME/delete"

There is only one requirement for deleting a policy: the policy being
deleted must not be the active policy.

NOTE: Any securityfs write to IPE's nodes will require CAP_MAC_ADMIN.

Integrations
-------------

This patch series adds support for fs-verity via digest and signature
(fsverity_digest and fsverity_signature), dm-verity via root hash and
signature (dmverity_roothash and dmverity_signature), and trust for
the initramfs (boot_verified).

Please see the documentation patch for more information about the
integrations available.

Testing
--------

KUnit tests are available. Recommended kunitconfig:

    CONFIG_KUNIT=y
    CONFIG_SECURITY=y
    CONFIG_SECURITYFS=y
    CONFIG_PKCS7_MESSAGE_PARSER=y
    CONFIG_SYSTEM_DATA_VERIFICATION=y
    CONFIG_FS_VERITY=y
    CONFIG_FS_VERITY_BUILTIN_SIGNATURES=y
    CONFIG_BLOCK=y
    CONFIG_MD=y
    CONFIG_BLK_DEV_DM=y
    CONFIG_DM_VERITY=y
    CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
    CONFIG_NET=y
    CONFIG_AUDIT=y
    CONFIG_AUDITSYSCALL=y
    CONFIG_BLK_DEV_INITRD=y

    CONFIG_SECURITY_IPE=y
    CONFIG_IPE_PROP_DM_VERITY=y
    CONFIG_IPE_PROP_FS_VERITY=y
    CONFIG_SECURITY_IPE_KUNIT_TEST=y

Simply run:

    make ARCH=um mrproper
    ./tools/testing/kunit/kunit.py run --kunitconfig <path/to/config>

And the tests will execute and report the result.

In addition, IPE has a Python-based integration test suite,
https://github.com/microsoft/ipe/tree/test-suite, that can test both
the user interfaces and the enforcement functionality.

Documentation
--------------

Documentation is available both on GitHub at
https://microsoft.github.io/ipe and in this patch series, to be added
in-tree.

Known Gaps
-----------

IPE has two known gaps:

1. IPE cannot verify the integrity of anonymous executable memory, such as
  the trampolines created by gcc closures and libffi (<3.4.2), or JIT'd code.
  Unfortunately, as this is dynamically generated code, there is no way
  for IPE to ensure the integrity of this code to form a trust basis. In all
  cases, the result for these operations will be whatever the admin
  configures as the DEFAULT action for "EXECUTE".

2. IPE cannot verify the integrity of interpreted languages' programs when
  these scripts are invoked via ``<interpreter> <file>``. Because of the
  way interpreters execute these files, the scripts themselves are not
  evaluated as executable code through one of IPE's hooks. Interpreters
  can be enlightened to the usage of IPE by trying to mmap a file into
  executable memory (+X) after opening the file, and responding to the
  error code appropriately. This also applies to included files, or high
  value files, such as configuration files of critical system components.
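
The interpreter-side check described in gap 2 might look roughly like
the sketch below. check_script_exec_allowed() is a hypothetical helper
name; how an interpreter reacts to a failure, and which errno value a
denial produces, are left open here.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical interpreter-side check: try to map @path as executable
 * memory and report whether the kernel allowed it.
 * Returns 0 if the mapping succeeded, or a negative errno value.
 */
static int check_script_exec_allowed(const char *path)
{
	struct stat st;
	void *map;
	int fd, rc = 0;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (fstat(fd, &st) < 0) {
		rc = -errno;
		goto out;
	}

	/* Nothing to map; this sketch treats an empty script as allowed. */
	if (st.st_size == 0)
		goto out;

	/* If an LSM denies the executable mapping, mmap() fails here. */
	map = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
		   MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		rc = -errno;
		goto out;
	}

	munmap(map, (size_t)st.st_size);
out:
	close(fd);
	return rc;
}
```

An interpreter could call this before evaluating a script and refuse
to run it when the result is non-zero. Note that the mapping can also
fail for reasons unrelated to IPE, such as a noexec mount.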

Appendix
---------

A. IPE Github Repository: https://github.com/microsoft/ipe
B. IPE Users' Guide: Documentation/admin-guide/LSM/ipe.rst

References
-----------

1: https://lore.kernel.org/bpf/4d6932e96d774227b42721d9f645ba51@huawei.com/

FAQ:
----

Q: What is the difference between IMA and IPE?

A: See the documentation patch for more on this topic.

Previous Postings
-----------------

v1: https://lore.kernel.org/all/20200406181045.1024164-1-deven.desai@linux.microsoft.com/
v2: https://lore.kernel.org/all/20200406221439.1469862-1-deven.desai@linux.microsoft.com/
v3: https://lore.kernel.org/all/20200415162550.2324-1-deven.desai@linux.microsoft.com/
v4: https://lore.kernel.org/all/20200717230941.1190744-1-deven.desai@linux.microsoft.com/
v5: https://lore.kernel.org/all/20200728213614.586312-1-deven.desai@linux.microsoft.com/
v6: https://lore.kernel.org/all/20200730003113.2561644-1-deven.desai@linux.microsoft.com/
v7: https://lore.kernel.org/all/1634151995-16266-1-git-send-email-deven.desai@linux.microsoft.com/
v8: https://lore.kernel.org/all/1654714889-26728-1-git-send-email-deven.desai@linux.microsoft.com/
v9: https://lore.kernel.org/lkml/1675119451-23180-1-git-send-email-wufan@linux.microsoft.com/
v10: https://lore.kernel.org/lkml/1687986571-16823-1-git-send-email-wufan@linux.microsoft.com/
v11: https://lore.kernel.org/lkml/1696457386-3010-1-git-send-email-wufan@linux.microsoft.com/
v12: https://lore.kernel.org/lkml/1706654228-17180-1-git-send-email-wufan@linux.microsoft.com/
v13: https://lore.kernel.org/lkml/1709168102-7677-1-git-send-email-wufan@linux.microsoft.com/
v14: https://lore.kernel.org/lkml/1709768084-22539-1-git-send-email-wufan@linux.microsoft.com/
v15: https://lore.kernel.org/lkml/1710560151-28904-1-git-send-email-wufan@linux.microsoft.com/
v16: https://lore.kernel.org/lkml/1711657047-10526-1-git-send-email-wufan@linux.microsoft.com/

Changelog
----------

v2:
  Split the second patch of the previous series into two.
  Minor corrections in the cover-letter and documentation
  comments regarding CAP_MAC_ADMIN checks in IPE.

v3:
  Address various comments by Jann Horn. Highlights:
    Switch various audit allocators to GFP_KERNEL.
    Utilize rcu_access_pointer() in various locations.
    Strip out the caching system for properties
    Strip comments from headers
    Move functions around in patches
    Remove kernel command line parameters
    Reconcile the race condition on the delete node for policy by
      expanding the policy critical section.

  Address a few comments by Jonathan Corbet around the documentation
    pages for IPE.

  Fix an issue with the initialization of IPE policy with a "-0"
    version, caused by not initializing the hlist entries before
    freeing.

v4:
  Address a concern around IPE's behavior with unknown syntax.
    Specifically, make any unknown syntax a fatal error instead of a
    warning, as suggested by Mickaël Salaün.
  Introduce a new securityfs node, $securityfs/ipe/property_config,
    which provides a listing of what properties are enabled by the
    kernel and their versions. This allows usermode to predict what
    policies should be allowed.
  Strip some comments from c files that I missed.
  Clarify some documentation comments around 'boot_verified'.
    While this currently does not functionally change the property
    itself, the distinction is important when IPE can enforce verified
    reads. Additionally, 'KERNEL_READ' was omitted from the documentation.
    This has been corrected.
  Change SecurityFS and SHA1 to a reverse dependency.
  Update the cover-letter with the updated behavior of unknown syntax.
  Remove all sysctls, making an equivalent function in securityfs.
  Rework the active/delete mechanism to be a node under the policy in
    $securityfs/ipe/policies.
  The kernel command line parameters ipe.enforce and ipe.success_audit
    have returned as this functionality is no longer exposed through
    sysfs.

v5:
  Correct some grammatical errors reported by Randy Dunlap.
  Fix some warnings reported by kernel test bot.
  Change convention around security_bdev_setsecurity. -ENOSYS
    is now expected if an LSM does not implement a particular @name,
    as suggested by Casey Schaufler.
  Minor string corrections related to the move from sysfs to securityfs
  Correct a spelling of an #ifdef for the permissive argument.
  Add the kernel parameters re-added to the documentation.
  Fix a minor bug where the mode being audited on permissive switch
    was the original mode, not the mode being swapped to.
  Cleanup doc comments, fix some whitespace alignment issues.

v6:
  Change if statement condition in security_bdev_setsecurity to be
    more concise, as suggested by Casey Schaufler and Al Viro
  Drop the 6th patch in the series, "dm-verity move signature check..."
    due to numerous issues, and it ultimately providing no real value.
  Fix the patch tree - the previous iteration appears to have been in a
    torn state (patches 8+9 were merged). This has since been corrected.

v7:
  * Reword cover letter to more accurately convey IPE's purpose
    and latest updates.
  * Refactor series to:
      1. Support a context structure, enabling:
          1. Easier Testing via KUNIT
          2. A better architecture for future designs
      2. Make parser code cleaner
  * Move patch 01/12 to [14/16] of the series
  * Split up patch 02/12 into five parts:
      1. context creation [01/16]
      2. audit [07/16]
      3. evaluation loop [03/16]
      4. access control hooks [05/16]
      5. permissive mode [08/16]
  * Split up patch 03/12 into two parts:
      1. parser [02/16]
      2. userspace interface [04/16]
  * Reword and refactor patch 04/12 to [09/16]
  * Squash patch 05/12, 07/12, 09/12 to [10/16]
  * Squash patch 08/12, 10/12 to [11/16]
  * Change audit records to MAC region (14XX) from Integrity region (18XX)
  * Add FSVerity Support
  * Interface changes:
      1. "raw" was renamed to "pkcs7" and made read only
      2. "raw"'s write functionality (update a policy) moved to "update"
      3. introduced "version", "policy_name" nodes.
      4. "content" renamed to "policy"
      5. The boot policy can now be updated like any other policy.
  * Add additional developer-level documentation
  * Update admin-guide docs to reflect changes.
  * Kunit tests
  * Dropped CONFIG_SECURITY_IPE_PERMISSIVE_SWITCH - functionality can
    easily come later with a small patch.
  * Use partition0 for block_device for dm-verity patch

v8:
  * Add changelog information to individual commits
  * A large number of changes to the audit patch.
  * Split fs/ & security/ changes to two separate patches.
  * Split block/, security/ & drivers/md/ changes to separate patches.
  * Add some historical context to what led to the creation of IPE
    in the documentation patch.
  * Cover-letter changes suggested by Roberto Sassu.

v9:
  * Rewrite IPE parser to use kernel match_table parser.
  * Adapt existing IPE properties to the new parser.
  * Remove ipe_context, quote policy syntax, kernel_read for simplicity.
  * Add new function in the security file system to delete IPE policy.
  * Make IPE audit builtin and change several audit formats.
  * Make boot_verified property builtin

v10:
  * Address various code style/format issues
  * Correct the rcu locking for active policy
  * Fix memleak bugs in the parser, optimize the parser per upstream feedback
  * Add new audit events for IPE and update audit formats
  * Make the dmverity property auto selected
  * Add more context in the commit messages

v11:
  * Address various code style/format issues
  * Add finalize hook to device mapper
  * Move the security hook for dm-verity to the new device mapper finalize hook

v12:
  * Address locking issues
  * Change the implementation of boot_verified to trust initramfs only
  * Update audit format for IPE decision events
  * Refactor code for lsm_id
  * Add IPE test suite link

v13:
  * Rename the new security hook in initramfs
  * Make the policy grammar independent of kernel config
  * Correct IPE audit format
  * Refactor policy update code

v14:
  * Add more code comments/docs for dmverity/fsverity
  * Fix incorrect code usage and format in dmverity
  * Drop one accepted commit of dmverity

v15:
  * Fix grammar issues
  * Add more documentation to fsverity
  * Switch security hooks from *_setsecurity() to *_setintegrity()
  * Cleanup unnecessary headers

v16:
  * Fix format issues, refactor names
  * Further improve documentation for fsverity
  * Fix bugs in dmverity implementation
  * Switch to use call_int_hook() for *_setintegrity()

v17:
  * Fix various code/Documentation style issues
  * Switch to use reverse christmas tree style
  * Add ipe_ prefix to all non-static functions
  * Correct documentation for fsverity
  * Rewrite design concept part of IPE Documentation
  * Fix incorrect interface path in IPE Documentation

Deven Bowers (13):
  security: add ipe lsm
  ipe: add policy parser
  ipe: add evaluation loop
  ipe: add LSM hooks on execution and kernel read
  ipe: add userspace interface
  uapi|audit|ipe: add ipe auditing support
  ipe: add permissive toggle
  block,lsm: add LSM blob and new LSM hooks for block device
  dm verity: consume root hash digest and expose signature data via LSM
    hook
  ipe: add support for dm-verity as a trust provider
  scripts: add boot policy generation program
  ipe: kunit test for parser
  Documentation: add ipe documentation

Fan Wu (8):
  initramfs|security: Add a security hook to do_populate_rootfs()
  ipe: introduce 'boot_verified' as a trust provider
  security: add new securityfs delete function
  dm: add finalize hook to target_type
  security: add security_inode_setintegrity() hook
  fsverity: expose verified fsverity built-in signatures to LSMs
  ipe: enable support for fs-verity as a trust provider
  MAINTAINERS: ipe: add ipe maintainer information

 Documentation/admin-guide/LSM/index.rst       |   1 +
 Documentation/admin-guide/LSM/ipe.rst         | 797 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  12 +
 Documentation/filesystems/fsverity.rst        |  26 +-
 Documentation/security/index.rst              |   1 +
 Documentation/security/ipe.rst                | 444 ++++++++++
 MAINTAINERS                                   |  10 +
 block/bdev.c                                  |   7 +
 drivers/md/dm-verity-target.c                 |  83 ++
 drivers/md/dm-verity.h                        |   6 +
 drivers/md/dm.c                               |  12 +
 fs/verity/fsverity_private.h                  |   2 +-
 fs/verity/open.c                              |  24 +-
 fs/verity/signature.c                         |   6 +-
 include/linux/blk_types.h                     |   3 +
 include/linux/device-mapper.h                 |   9 +
 include/linux/dm-verity.h                     |  12 +
 include/linux/lsm_hook_defs.h                 |   9 +
 include/linux/lsm_hooks.h                     |   1 +
 include/linux/security.h                      |  47 ++
 include/uapi/linux/audit.h                    |   3 +
 include/uapi/linux/lsm.h                      |   1 +
 init/initramfs.c                              |   3 +
 scripts/Makefile                              |   1 +
 scripts/ipe/Makefile                          |   2 +
 scripts/ipe/polgen/.gitignore                 |   2 +
 scripts/ipe/polgen/Makefile                   |   5 +
 scripts/ipe/polgen/polgen.c                   | 145 ++++
 security/Kconfig                              |  11 +-
 security/Makefile                             |   1 +
 security/inode.c                              |  25 +
 security/ipe/.gitignore                       |   2 +
 security/ipe/Kconfig                          |  76 ++
 security/ipe/Makefile                         |  31 +
 security/ipe/audit.c                          | 279 ++++++
 security/ipe/audit.h                          |  19 +
 security/ipe/digest.c                         | 118 +++
 security/ipe/digest.h                         |  26 +
 security/ipe/eval.c                           | 377 +++++++++
 security/ipe/eval.h                           |  66 ++
 security/ipe/fs.c                             | 247 ++++++
 security/ipe/fs.h                             |  16 +
 security/ipe/hooks.c                          | 299 +++++++
 security/ipe/hooks.h                          |  52 ++
 security/ipe/ipe.c                            |  99 +++
 security/ipe/ipe.h                            |  26 +
 security/ipe/policy.c                         | 229 +++++
 security/ipe/policy.h                         |  98 +++
 security/ipe/policy_fs.c                      | 470 +++++++++++
 security/ipe/policy_parser.c                  | 556 ++++++++++++
 security/ipe/policy_parser.h                  |  11 +
 security/ipe/policy_tests.c                   | 296 +++++++
 security/security.c                           | 122 ++-
 .../selftests/lsm/lsm_list_modules_test.c     |   3 +
 54 files changed, 5218 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/admin-guide/LSM/ipe.rst
 create mode 100644 Documentation/security/ipe.rst
 create mode 100644 include/linux/dm-verity.h
 create mode 100644 scripts/ipe/Makefile
 create mode 100644 scripts/ipe/polgen/.gitignore
 create mode 100644 scripts/ipe/polgen/Makefile
 create mode 100644 scripts/ipe/polgen/polgen.c
 create mode 100644 security/ipe/.gitignore
 create mode 100644 security/ipe/Kconfig
 create mode 100644 security/ipe/Makefile
 create mode 100644 security/ipe/audit.c
 create mode 100644 security/ipe/audit.h
 create mode 100644 security/ipe/digest.c
 create mode 100644 security/ipe/digest.h
 create mode 100644 security/ipe/eval.c
 create mode 100644 security/ipe/eval.h
 create mode 100644 security/ipe/fs.c
 create mode 100644 security/ipe/fs.h
 create mode 100644 security/ipe/hooks.c
 create mode 100644 security/ipe/hooks.h
 create mode 100644 security/ipe/ipe.c
 create mode 100644 security/ipe/ipe.h
 create mode 100644 security/ipe/policy.c
 create mode 100644 security/ipe/policy.h
 create mode 100644 security/ipe/policy_fs.c
 create mode 100644 security/ipe/policy_parser.c
 create mode 100644 security/ipe/policy_parser.h
 create mode 100644 security/ipe/policy_tests.c

--
2.44.0


^ permalink raw reply	[relevance 24%]

* Re: [PATCH 6.6 000/114] 6.6.27-rc1 review
  @ 2024-04-12 22:24 79% ` Kelsey Steele
  0 siblings, 0 replies; 200+ results
From: Kelsey Steele @ 2024-04-12 22:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie

On Thu, Apr 11, 2024 at 11:55:27AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.27 release.
> There are 114 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000.
> Anything received after that time might be too late.

No regressions found on WSL (x86 and arm64).

Built, booted, and reviewed dmesg.

Thank you. :)

Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> 

^ permalink raw reply	[relevance 79%]

* Re: [PATCH 6.1 00/83] 6.1.86-rc1 review
  @ 2024-04-12 22:23 79% ` Kelsey Steele
  0 siblings, 0 replies; 200+ results
From: Kelsey Steele @ 2024-04-12 22:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie

On Thu, Apr 11, 2024 at 11:56:32AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.86 release.
> There are 83 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000.
> Anything received after that time might be too late.

No regressions found on WSL (x86 and arm64).

Built, booted, and reviewed dmesg.

Thank you. :)

Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> 

^ permalink raw reply	[relevance 79%]

* Re: [PATCH 5.15 00/57] 5.15.155-rc1 review
    2024-04-11 18:36 79% ` Easwar Hariharan
@ 2024-04-12 22:22 79% ` Kelsey Steele
  1 sibling, 0 replies; 200+ results
From: Kelsey Steele @ 2024-04-12 22:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie

On Thu, Apr 11, 2024 at 11:57:08AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.155 release.
> There are 57 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000.
> Anything received after that time might be too late.

No regressions found on WSL (x86 and arm64).

Built, booted, and reviewed dmesg.

Thank you. :)

Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> 

^ permalink raw reply	[relevance 79%]

* Re: [RFC PATCH 0/4] perf: Correlating user process data to samples
  @ 2024-04-12 16:37 79%   ` Beau Belgrave
  0 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-12 16:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, acme, namhyung, rostedt, mhiramat, mathieu.desnoyers,
	linux-kernel, linux-trace-kernel, linux-perf-users, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, primiano,
	aahringo, dcook

On Fri, Apr 12, 2024 at 09:12:45AM +0200, Peter Zijlstra wrote:
> 
> On Fri, Apr 12, 2024 at 12:17:28AM +0000, Beau Belgrave wrote:
> 
> > An idea flow would look like this:
> > User Task		Profile
> > do_work();		sample() -> IP + No activity
> > ...
> > set_activity(123);
> > ...
> > do_work();		sample() -> IP + activity (123)
> > ...
> > set_activity(124);
> > ...
> > do_work();		sample() -> IP + activity (124)
> 
> This, start with this, because until I saw this, I was utterly confused
> as to what the heck you were on about.
> 

Will do.

> I started by thinking we already have TID in samples so you can already
> associate back to user processes and got increasingly confused the
> further I went.
> 
> What you seem to want to do however is have some task-state included so
> you can see what the thread is doing.
> 

Yeah, there is typically an external context (not on the machine) that
wants to be tied to each sample. The context could be a simple integer,
UUID, or something else entirely. For OTel, this is a 16-byte array [1].

> Anyway, since we typically run stuff from NMI context, accessing user
> data is 'interesting'. As such I would really like to make this work
> depend on the call-graph rework that pushes all the user access bits
> into return-to-user.

Cool, I assume that's the SFRAME work? Are there pointers to work I
could look at and think about what a rebase looks like? Or do you have
someone in mind I should work with for this?

Thanks,
-Beau

1. https://www.w3.org/TR/trace-context/#version-format

^ permalink raw reply	[relevance 79%]

* Re: [RFC PATCH 0/4] perf: Correlating user process data to samples
  @ 2024-04-12 16:28 71%   ` Beau Belgrave
  0 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-12 16:28 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, acme, namhyung, rostedt, mhiramat,
	mathieu.desnoyers, linux-kernel, linux-trace-kernel,
	linux-perf-users, mark.rutland, alexander.shishkin, jolsa,
	adrian.hunter, primiano, aahringo, dcook

On Thu, Apr 11, 2024 at 09:52:22PM -0700, Ian Rogers wrote:
> On Thu, Apr 11, 2024 at 5:17 PM Beau Belgrave <beaub@linux.microsoft.com> wrote:
> >
> > In the Open Telemetry profiling SIG [1], we are trying to find a way to
> > grab a tracing association quickly on a per-sample basis. The team at
> > Elastic has a bespoke way to do this [2], however, I'd like to see a
> > more general way to achieve this. The folks I've been talking with seem
> > open to the idea of just having a TLS value for this we could capture
> 
> Presumably TLS == Thread Local Storage.
> 

Yes, the initial idea is to use thread local storage (TLS). It seems to
be the fastest option to save a per-thread value that changes at a fast
rate.
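
A minimal userspace sketch of the TLS idea, for illustration only. The names
set_activity(), get_activity(), current_activity and ACTIVITY_ID_LEN are
hypothetical and not part of this series; the 16-byte size is an assumption
borrowed from the OTel trace ID mentioned elsewhere in the thread. The point
is only that publishing an ID is a plain memcpy into per-thread storage, with
no syscall involved:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed ID size: 16 bytes, like the OTel trace ID discussed in the thread. */
#define ACTIVITY_ID_LEN 16

/* One slot per thread (C11 thread-local storage); starts out all zero. */
static _Thread_local uint8_t current_activity[ACTIVITY_ID_LEN];

/* Hypothetical helper: publish a new activity ID for the calling thread. */
static inline void set_activity(const uint8_t *id)
{
	assert(id != NULL);
	memcpy(current_activity, id, ACTIVITY_ID_LEN);
}

/* Hypothetical helper: read back what a sampler would see for this thread. */
static inline void get_activity(uint8_t *out)
{
	assert(out != NULL);
	memcpy(out, current_activity, ACTIVITY_ID_LEN);
}
```

How a profiler would locate current_activity from kernel context is exactly
the open question in this thread; the sketch does not answer it.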

> > upon each sample. We could then just state, Open Telemetry SDKs should
> > have a TLS value for span correlation. However, we need a way to sample
> > the TLS or other value(s) when a sampling event is generated. This is
> > supported today on Windows via EventActivityIdControl() [3]. Since
> > Open Telemetry works on both Windows and Linux, ideally we can do
> > something as efficient for Linux based workloads.
> >
> > This series is to explore how it would be best possible to collect
> > supporting data from a user process when a profile sample is collected.
> > Having a value stored in TLS makes a lot of sense for this however
> > there are other ways to explore. Whatever is chosen, kernel samples
> > taken in process context should be able to get this supporting data.
> > In these patches on X64 the fsbase and gsbase are used for this.
> >
> > An option to explore suggested by Mathieu Desnoyers is to utilize rseq
> > for processes to register a value location that can be included when
> > profiling if desired. This would allow a tighter contract between user
> > processes and a profiler.  It would allow better labeling/categorizing
> > the correlation values.
> 
> It is hard to understand this idea. Are you saying stash a cookie in
> TLS for samples to capture to indicate an activity? Restartable
> sequences are about preemption on a CPU not of a thread, so at least
> my intuition is that they feel different. You could stash information
> like this today by changing the thread name which generates comm
> events. I've wondered about having similar information in some form of
> reserved for profiling stack slot, for example, to stash a pointer to
> the name of a function being interpreted. Snapshotting all of a stack
> is bad performance wise and for security. A stack slot would be able
> to deal with nesting.
> 

You are getting the idea. A slot or tag for a thread would be great! I'm
not a fan of overriding the thread comm name (as that already has a
use). TLS would be fine, if we could also pass an offset + size + type.

Maybe a stack slot that just points to parts of TLS? That way you could
have a set of slots that don't require much memory and selectively copy
them out of TLS (or wherever those slots point to in user memory).
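
To make the "offset + size + type" idea concrete, here is a rough sketch
written as ordinary userspace C. struct profile_slot, enum slot_type and
copy_slot() are hypothetical names invented for this example, not an existing
or proposed ABI. It only shows the shape of a descriptor that points into a
per-thread region, plus the bounds checks a consumer would need before
copying:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value types a slot could describe. */
enum slot_type { SLOT_U64 = 1, SLOT_BYTES = 2 };

/* Hypothetical descriptor: where the value lives in the region and how big it is. */
struct profile_slot {
	uint32_t offset;	/* byte offset into the per-thread region */
	uint32_t size;		/* number of bytes to copy out */
	uint32_t type;		/* enum slot_type, carried along for the consumer */
};

/*
 * Copy the bytes a slot describes out of a region.
 * Returns the number of bytes copied, or 0 if the slot does not fit in the
 * region or in the output buffer.
 */
static size_t copy_slot(const uint8_t *region, size_t region_len,
			const struct profile_slot *slot,
			uint8_t *out, size_t out_len)
{
	assert(region != NULL && slot != NULL && out != NULL);

	if (slot->size > out_len)
		return 0;
	/* Written this way to avoid overflow in offset + size. */
	if (slot->offset > region_len || slot->size > region_len - slot->offset)
		return 0;

	memcpy(out, region + slot->offset, slot->size);
	return slot->size;
}
```

A small set of such descriptors would keep the per-thread cost low, since only
the described bytes get copied rather than a whole stack or TLS block.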

When I was talking to Mathieu about this, it seems that rseq already had
a place to potentially put these slots. I'm unsure though how the per
thread aspects would work.

Mathieu, can you post your ideas here about that?

> > An idea flow would look like this:
> > User Task               Profile
> > do_work();              sample() -> IP + No activity
> > ...
> > set_activity(123);
> > ...
> > do_work();              sample() -> IP + activity (123)
> > ...
> > set_activity(124);
> > ...
> > do_work();              sample() -> IP + activity (124)
> >
> > Ideally, the set_activity() method would not be a syscall. It needs to
> > be very cheap as this should not bottleneck work. Ideally this is just
> > a memcpy of 16-20 bytes as it is on Windows via EventActivityIdControl()
> > using EVENT_ACTIVITY_CTRL_SET_ID.
> >
> > For those not aware, Open Telemetry allows collecting data from multiple
> > machines and show where time was spent. The tracing context is already
> > available for logs, but not for profiling samples. The idea is to show
> > where slowdowns occur and have profile samples to explain why they
> > slowed down. This must be possible without having to track context
> > switches to do this correlation. This is because the profiling rates
> > are typically 20hz - 1Khz, while the context switching rates are much
> > higher. We do not want to have to consume high context switch rates
> > just to know a correlation for a 20hz signal. Often these 20hz signals
> > are always enabled in some environments.
> >
> > Regardless if TLS, rseq, or other source is used I believe we will need
> > a way for perf_events to include it within a sample. The changes in this
> > series show how it could be done with TLS. There is some factoring work
> > under perf to make it easier to add more dump types using the existing
> > ABI. This is mostly to make the patches clearer, certainly the refactor
> > parts could get dropped and we could have duplicated/specialized paths.
> 
> fs and gs may be used for more than just the C runtime's TLS. For
> example, they may be used by emulators or managed runtimes. I'm not
> clear why this specific case couldn't be handled through BPF.
> 

Agree about the fs/gs possibly being used for other things. If we had a
stack slot we could avoid the confusion and have tighter couplings.

You can do this in eBPF (see [2]). However, it's very clunky and depends
on specific SDKs per-language/runtime. We ourselves don't run our profilers
with anything other than CAP_PERFMON, and also have environments without
BPF enabled due to various reasons. It'd be great if we could get this data
directly from perf. At the very least, I'd love to get a standardized
way to attribute thread values accessible to other performance systems
(like eBPF and perf).

Thanks,
-Beau

> Thanks,
> Ian
> 
> > 1. https://opentelemetry.io/blog/2024/profiling/
> > 2. https://www.elastic.co/blog/continuous-profiling-distributed-tracing-correlation
> > 3. https://learn.microsoft.com/en-us/windows/win32/api/evntprov/nf-evntprov-eventactivityidcontrol
> >
> > Beau Belgrave (4):
> >   perf/core: Introduce perf_prepare_dump_data()
> >   perf: Introduce PERF_SAMPLE_TLS_USER sample type
> >   perf/core: Factor perf_output_sample_udump()
> >   perf/x86/core: Add tls dump support
> >
> >  arch/Kconfig                      |   7 ++
> >  arch/x86/Kconfig                  |   1 +
> >  arch/x86/events/core.c            |  14 +++
> >  arch/x86/include/asm/perf_event.h |   5 +
> >  include/linux/perf_event.h        |   7 ++
> >  include/uapi/linux/perf_event.h   |   5 +-
> >  kernel/events/core.c              | 166 +++++++++++++++++++++++-------
> >  kernel/events/internal.h          |  16 +++
> >  8 files changed, 180 insertions(+), 41 deletions(-)
> >
> >
> > base-commit: fec50db7033ea478773b159e0e2efb135270e3b7
> > --
> > 2.34.1
> >

^ permalink raw reply	[relevance 71%]

* [PATCH rdma-next 1/1] RDMA/mana_ib: Use num_comp_vectors of ib_device
@ 2024-04-12  8:47 71% Konstantin Taranov
  0 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-12  8:47 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Use num_comp_vectors of struct ib_device instead of max_num_queues
from gdma_context.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
---
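
Illustration only, not part of the patch: a standalone sketch of the wrap-around
that the cq.c hunk applies at CQ creation time. wrap_comp_vector() is a
hypothetical helper name; the driver does the modulo inline. Once the stored
value is always below num_comp_vectors, the later EQ lookups in qp.c can index
with it directly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map any requested completion vector onto the range
 * [0, num_comp_vectors) instead of rejecting out-of-range requests.
 */
static inline uint32_t wrap_comp_vector(uint32_t requested,
					uint32_t num_comp_vectors)
{
	assert(num_comp_vectors > 0);	/* modulo by zero is undefined */
	return requested % num_comp_vectors;
}
```

For example, with 4 vectors a request for vector 5 lands on vector 1. Note that
the removed check used '>' rather than '>=', so a request equal to
max_num_queues was accepted and only wrapped later in qp.c.
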
 drivers/infiniband/hw/mana/cq.c     | 7 +------
 drivers/infiniband/hw/mana/device.c | 2 +-
 drivers/infiniband/hw/mana/qp.c     | 4 ++--
 3 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index c9129218f1be..dc931b9c3491 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -12,19 +12,14 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	struct ib_device *ibdev = ibcq->device;
 	struct mana_ib_create_cq ucmd = {};
 	struct mana_ib_dev *mdev;
-	struct gdma_context *gc;
 	int err;
 
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
-	gc = mdev_to_gc(mdev);
 
 	if (udata->inlen < sizeof(ucmd))
 		return -EINVAL;
 
-	if (attr->comp_vector > gc->max_num_queues)
-		return -EINVAL;
-
-	cq->comp_vector = attr->comp_vector;
+	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
 
 	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
 	if (err) {
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 6fa902ee80a6..07e97de31886 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -74,7 +74,7 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 	 * num_comp_vectors needs to set to the max MSIX index
 	 * when interrupts and event queues are implemented
 	 */
-	dev->ib_dev.num_comp_vectors = 1;
+	dev->ib_dev.num_comp_vectors = mdev->gdma_context->max_num_queues;
 	dev->ib_dev.dev.parent = mdev->gdma_context->dev;
 
 	ret = mana_gd_register_device(&mdev->gdma_context->mana_ib);
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index ef0a6dc664d0..f5c743db3a94 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -200,7 +200,7 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
 		cq_spec.gdma_region = cq->queue.gdma_region;
 		cq_spec.queue_size = cq->cqe * COMP_ENTRY_SIZE;
 		cq_spec.modr_ctx_id = 0;
-		eq = &mpc->ac->eqs[cq->comp_vector % gc->max_num_queues];
+		eq = &mpc->ac->eqs[cq->comp_vector];
 		cq_spec.attached_eq = eq->eq->id;
 
 		ret = mana_create_wq_obj(mpc, mpc->port_handle, GDMA_RQ,
@@ -359,7 +359,7 @@ static int mana_ib_create_qp_raw(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	cq_spec.gdma_region = send_cq->queue.gdma_region;
 	cq_spec.queue_size = send_cq->cqe * COMP_ENTRY_SIZE;
 	cq_spec.modr_ctx_id = 0;
-	eq_vec = send_cq->comp_vector % gc->max_num_queues;
+	eq_vec = send_cq->comp_vector;
 	eq = &mpc->ac->eqs[eq_vec];
 	cq_spec.attached_eq = eq->eq->id;
 

base-commit: f10242b3da908dc9d4bfa040e6511a5b86522499
-- 
2.43.0


^ permalink raw reply related	[relevance 71%]

* [PATCH v3] Drivers: hv: Cosmetic changes for hv.c and balloon.c
@ 2024-04-12  5:28 38% Aditya Nagesh
  2024-04-18 15:06 79% ` Saurabh Singh Sengar
  0 siblings, 1 reply; 200+ results
From: Aditya Nagesh @ 2024-04-12  5:28 UTC (permalink / raw)
  To: adityanagesh, kys, haiyangz, wei.liu, decui, linux-hyperv, linux-kernel
  Cc: Aditya Nagesh

Fix issues reported by checkpatch.pl script in hv.c and
balloon.c
 - Remove unnecessary parentheses
 - Remove extra newlines
 - Remove extra spaces
 - Add spaces between comparison operators
 - Remove comparison with NULL in if statements

No functional changes intended.

Signed-off-by: Aditya Nagesh <adityanagesh@linux.microsoft.com>
---
[V3]
Fix alignment issues in multiline function parameters.

[V2]
Change Subject from "Drivers: hv: Fix Issues reported by checkpatch.pl script"
 to "Drivers: hv: Cosmetic changes for hv.c and balloon.c"
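
Illustration only, not part of the patch: a small userspace check that the
octal mode used for the module parameters matches the symbolic form it
replaces. EXAMPLE_S_IRUGO is a stand-in defined here, on the assumption that
the kernel's S_IRUGO is S_IRUSR | S_IRGRP | S_IROTH; userspace <sys/stat.h>
does not provide S_IRUGO itself. hot_add_mode() is a hypothetical helper.

```c
#include <assert.h>
#include <sys/stat.h>

/* Userspace stand-in for the kernel's S_IRUGO (read for user, group, other). */
#define EXAMPLE_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* Compile-time check: the symbolic form and the octal literal are the same bits. */
static_assert((EXAMPLE_S_IRUGO | S_IWUSR) == 0644,
	      "S_IRUGO | S_IWUSR should equal 0644");

/* Hypothetical helper returning the mode the old module_param() lines used. */
static inline unsigned int hot_add_mode(void)
{
	return EXAMPLE_S_IRUGO | S_IWUSR;
}
```
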

 drivers/hv/hv.c         |  35 +++++++-------
 drivers/hv/hv_balloon.c | 101 +++++++++++++++-------------------------
 2 files changed, 54 insertions(+), 82 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index a8ad728354cb..4906611475fb 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -45,7 +45,7 @@ int hv_init(void)
  * This involves a hypercall.
  */
 int hv_post_message(union hv_connection_id connection_id,
-		  enum hv_message_type message_type,
+		    enum hv_message_type message_type,
 		  void *payload, size_t payload_size)
 {
 	struct hv_input_post_message *aligned_msg;
@@ -86,7 +86,7 @@ int hv_post_message(union hv_connection_id connection_id,
 			status = HV_STATUS_INVALID_PARAMETER;
 	} else {
 		status = hv_do_hypercall(HVCALL_POST_MESSAGE,
-				aligned_msg, NULL);
+					 aligned_msg, NULL);
 	}
 
 	local_irq_restore(flags);
@@ -111,7 +111,7 @@ int hv_synic_alloc(void)
 
 	hv_context.hv_numa_map = kcalloc(nr_node_ids, sizeof(struct cpumask),
 					 GFP_KERNEL);
-	if (hv_context.hv_numa_map == NULL) {
+	if (!hv_context.hv_numa_map) {
 		pr_err("Unable to allocate NUMA map\n");
 		goto err;
 	}
@@ -120,11 +120,11 @@ int hv_synic_alloc(void)
 		hv_cpu = per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		tasklet_init(&hv_cpu->msg_dpc,
-			     vmbus_on_msg_dpc, (unsigned long) hv_cpu);
+			     vmbus_on_msg_dpc, (unsigned long)hv_cpu);
 
 		if (ms_hyperv.paravisor_present && hv_isolation_type_tdx()) {
 			hv_cpu->post_msg_page = (void *)get_zeroed_page(GFP_ATOMIC);
-			if (hv_cpu->post_msg_page == NULL) {
+			if (!hv_cpu->post_msg_page) {
 				pr_err("Unable to allocate post msg page\n");
 				goto err;
 			}
@@ -147,14 +147,14 @@ int hv_synic_alloc(void)
 		if (!ms_hyperv.paravisor_present && !hv_root_partition) {
 			hv_cpu->synic_message_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
-			if (hv_cpu->synic_message_page == NULL) {
+			if (!hv_cpu->synic_message_page) {
 				pr_err("Unable to allocate SYNIC message page\n");
 				goto err;
 			}
 
 			hv_cpu->synic_event_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
-			if (hv_cpu->synic_event_page == NULL) {
+			if (!hv_cpu->synic_event_page) {
 				pr_err("Unable to allocate SYNIC event page\n");
 
 				free_page((unsigned long)hv_cpu->synic_message_page);
@@ -203,14 +203,13 @@ int hv_synic_alloc(void)
 	return ret;
 }
 
-
 void hv_synic_free(void)
 {
 	int cpu, ret;
 
 	for_each_present_cpu(cpu) {
-		struct hv_per_cpu_context *hv_cpu
-			= per_cpu_ptr(hv_context.cpu_context, cpu);
+		struct hv_per_cpu_context *hv_cpu =
+			per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		/* It's better to leak the page if the encryption fails. */
 		if (ms_hyperv.paravisor_present && hv_isolation_type_tdx()) {
@@ -262,8 +261,8 @@ void hv_synic_free(void)
  */
 void hv_synic_enable_regs(unsigned int cpu)
 {
-	struct hv_per_cpu_context *hv_cpu
-		= per_cpu_ptr(hv_context.cpu_context, cpu);
+	struct hv_per_cpu_context *hv_cpu =
+		per_cpu_ptr(hv_context.cpu_context, cpu);
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
 	union hv_synic_sint shared_sint;
@@ -277,8 +276,8 @@ void hv_synic_enable_regs(unsigned int cpu)
 		/* Mask out vTOM bit. ioremap_cache() maps decrypted */
 		u64 base = (simp.base_simp_gpa << HV_HYP_PAGE_SHIFT) &
 				~ms_hyperv.shared_gpa_boundary;
-		hv_cpu->synic_message_page
-			= (void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
+		hv_cpu->synic_message_page =
+			(void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
 		if (!hv_cpu->synic_message_page)
 			pr_err("Fail to map synic message page.\n");
 	} else {
@@ -296,8 +295,8 @@ void hv_synic_enable_regs(unsigned int cpu)
 		/* Mask out vTOM bit. ioremap_cache() maps decrypted */
 		u64 base = (siefp.base_siefp_gpa << HV_HYP_PAGE_SHIFT) &
 				~ms_hyperv.shared_gpa_boundary;
-		hv_cpu->synic_event_page
-			= (void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
+		hv_cpu->synic_event_page =
+			(void *)ioremap_cache(base, HV_HYP_PAGE_SIZE);
 		if (!hv_cpu->synic_event_page)
 			pr_err("Fail to map synic event page.\n");
 	} else {
@@ -348,8 +347,8 @@ int hv_synic_init(unsigned int cpu)
  */
 void hv_synic_disable_regs(unsigned int cpu)
 {
-	struct hv_per_cpu_context *hv_cpu
-		= per_cpu_ptr(hv_context.cpu_context, cpu);
+	struct hv_per_cpu_context *hv_cpu =
+		per_cpu_ptr(hv_context.cpu_context, cpu);
 	union hv_synic_sint shared_sint;
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index e000fa3b9f97..29abed90badf 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -41,8 +41,6 @@
  * Begin protocol definitions.
  */
 
-
-
 /*
  * Protocol versions. The low word is the minor version, the high word the major
  * version.
@@ -71,8 +69,6 @@ enum {
 	DYNMEM_PROTOCOL_VERSION_CURRENT = DYNMEM_PROTOCOL_VERSION_WIN10
 };
 
-
-
 /*
  * Message Types
  */
@@ -101,7 +97,6 @@ enum dm_message_type {
 	DM_VERSION_1_MAX		= 12
 };
 
-
 /*
  * Structures defining the dynamic memory management
  * protocol.
@@ -115,7 +110,6 @@ union dm_version {
 	__u32 version;
 } __packed;
 
-
 union dm_caps {
 	struct {
 		__u64 balloon:1;
@@ -148,8 +142,6 @@ union dm_mem_page_range {
 	__u64  page_range;
 } __packed;
 
-
-
 /*
  * The header for all dynamic memory messages:
  *
@@ -174,7 +166,6 @@ struct dm_message {
 	__u8 data[]; /* enclosed message */
 } __packed;
 
-
 /*
  * Specific message types supporting the dynamic memory protocol.
  */
@@ -271,7 +262,6 @@ struct dm_status {
 	__u32 io_diff;
 } __packed;
 
-
 /*
  * Message to ask the guest to allocate memory - balloon up message.
  * This message is sent from the host to the guest. The guest may not be
@@ -286,14 +276,13 @@ struct dm_balloon {
 	__u32 reservedz;
 } __packed;
 
-
 /*
  * Balloon response message; this message is sent from the guest
  * to the host in response to the balloon message.
  *
  * reservedz: Reserved; must be set to zero.
  * more_pages: If FALSE, this is the last message of the transaction.
- * if TRUE there will atleast one more message from the guest.
+ * if TRUE there will be at least one more message from the guest.
  *
  * range_count: The number of ranges in the range array.
  *
@@ -314,7 +303,7 @@ struct dm_balloon_response {
  * to the guest to give guest more memory.
  *
  * more_pages: If FALSE, this is the last message of the transaction.
- * if TRUE there will atleast one more message from the guest.
+ * if TRUE there will be at least one more message from the guest.
  *
  * reservedz: Reserved; must be set to zero.
  *
@@ -342,7 +331,6 @@ struct dm_unballoon_response {
 	struct dm_header hdr;
 } __packed;
 
-
 /*
  * Hot add request message. Message sent from the host to the guest.
  *
@@ -390,7 +378,6 @@ enum dm_info_type {
 	MAX_INFO_TYPE
 };
 
-
 /*
  * Header for the information message.
  */
@@ -480,10 +467,10 @@ static unsigned long last_post_time;
 
 static int hv_hypercall_multi_failure;
 
-module_param(hot_add, bool, (S_IRUGO | S_IWUSR));
+module_param(hot_add, bool, 0644);
 MODULE_PARM_DESC(hot_add, "If set attempt memory hot_add");
 
-module_param(pressure_report_delay, uint, (S_IRUGO | S_IWUSR));
+module_param(pressure_report_delay, uint, 0644);
 MODULE_PARM_DESC(pressure_report_delay, "Delay in secs in reporting pressure");
 static atomic_t trans_id = ATOMIC_INIT(0);
 
@@ -502,7 +489,6 @@ enum hv_dm_state {
 	DM_INIT_ERROR
 };
 
-
 static __u8 recv_buffer[HV_HYP_PAGE_SIZE];
 static __u8 balloon_up_send_buffer[HV_HYP_PAGE_SIZE];
 #define PAGES_IN_2M (2 * 1024 * 1024 / PAGE_SIZE)
@@ -595,12 +581,12 @@ static inline bool has_pfn_is_backed(struct hv_hotadd_state *has,
 	struct hv_hotadd_gap *gap;
 
 	/* The page is not backed. */
-	if ((pfn < has->covered_start_pfn) || (pfn >= has->covered_end_pfn))
+	if (pfn < has->covered_start_pfn || pfn >= has->covered_end_pfn)
 		return false;
 
 	/* Check for gaps. */
 	list_for_each_entry(gap, &has->gap_list, list) {
-		if ((pfn >= gap->start_pfn) && (pfn < gap->end_pfn))
+		if (pfn >= gap->start_pfn && pfn < gap->end_pfn)
 			return false;
 	}
 
@@ -724,7 +710,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 	unsigned long processed_pfn;
 	unsigned long total_pfn = pfn_count;
 
-	for (i = 0; i < (size/HA_CHUNK); i++) {
+	for (i = 0; i < (size / HA_CHUNK); i++) {
 		start_pfn = start + (i * HA_CHUNK);
 
 		scoped_guard(spinlock_irqsave, &dm_device.ha_lock) {
@@ -745,7 +731,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 
 		nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
 		ret = add_memory(nid, PFN_PHYS((start_pfn)),
-				(HA_CHUNK << PAGE_SHIFT), MHP_MERGE_RESOURCE);
+				 (HA_CHUNK << PAGE_SHIFT), MHP_MERGE_RESOURCE);
 
 		if (ret) {
 			pr_err("hot_add memory failed error is %d\n", ret);
@@ -787,8 +773,8 @@ static void hv_online_page(struct page *pg, unsigned int order)
 	guard(spinlock_irqsave)(&dm_device.ha_lock);
 	list_for_each_entry(has, &dm_device.ha_region_list, list) {
 		/* The page belongs to a different HAS. */
-		if ((pfn < has->start_pfn) ||
-				(pfn + (1UL << order) > has->end_pfn))
+		if (pfn < has->start_pfn ||
+		    (pfn + (1UL << order) > has->end_pfn))
 			continue;
 
 		hv_bring_pgs_online(has, pfn, 1UL << order);
@@ -855,7 +841,7 @@ static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
 }
 
 static unsigned long handle_pg_range(unsigned long pg_start,
-					unsigned long pg_count)
+				     unsigned long pg_count)
 {
 	unsigned long start_pfn = pg_start;
 	unsigned long pfn_cnt = pg_count;
@@ -866,7 +852,7 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 	unsigned long res = 0, flags;
 
 	pr_debug("Hot adding %lu pages starting at pfn 0x%lx.\n", pg_count,
-		pg_start);
+		 pg_start);
 
 	spin_lock_irqsave(&dm_device.ha_lock, flags);
 	list_for_each_entry(has, &dm_device.ha_region_list, list) {
@@ -902,10 +888,9 @@ static unsigned long handle_pg_range(unsigned long pg_start,
 			if (start_pfn > has->start_pfn &&
 			    online_section_nr(pfn_to_section_nr(start_pfn)))
 				hv_bring_pgs_online(has, start_pfn, pgs_ol);
-
 		}
 
-		if ((has->ha_end_pfn < has->end_pfn) && (pfn_cnt > 0)) {
+		if (has->ha_end_pfn < has->end_pfn && pfn_cnt > 0) {
 			/*
 			 * We have some residual hot add range
 			 * that needs to be hot added; hot add
@@ -1010,7 +995,7 @@ static void hot_add_req(struct work_struct *dummy)
 	rg_start = dm->ha_wrk.ha_region_range.finfo.start_page;
 	rg_sz = dm->ha_wrk.ha_region_range.finfo.page_cnt;
 
-	if ((rg_start == 0) && (!dm->host_specified_ha_region)) {
+	if (rg_start == 0 && !dm->host_specified_ha_region) {
 		unsigned long region_size;
 		unsigned long region_start;
 
@@ -1033,7 +1018,7 @@ static void hot_add_req(struct work_struct *dummy)
 
 	if (do_hot_add)
 		resp.page_count = process_hot_add(pg_start, pfn_cnt,
-						rg_start, rg_sz);
+						  rg_start, rg_sz);
 
 	dm->num_pages_added += resp.page_count;
 #endif
@@ -1211,11 +1196,10 @@ static void post_status(struct hv_dynmem_device *dm)
 				sizeof(struct dm_status),
 				(unsigned long)NULL,
 				VM_PKT_DATA_INBAND, 0);
-
 }
 
 static void free_balloon_pages(struct hv_dynmem_device *dm,
-			 union dm_mem_page_range *range_array)
+			       union dm_mem_page_range *range_array)
 {
 	int num_pages = range_array->finfo.page_cnt;
 	__u64 start_frame = range_array->finfo.start_page;
@@ -1231,8 +1215,6 @@ static void free_balloon_pages(struct hv_dynmem_device *dm,
 	}
 }
 
-
-
 static unsigned int alloc_balloon_pages(struct hv_dynmem_device *dm,
 					unsigned int num_pages,
 					struct dm_balloon_response *bl_resp,
@@ -1278,7 +1260,6 @@ static unsigned int alloc_balloon_pages(struct hv_dynmem_device *dm,
 			page_to_pfn(pg);
 		bl_resp->range_array[i].finfo.page_cnt = alloc_unit;
 		bl_resp->hdr.size += sizeof(union dm_mem_page_range);
-
 	}
 
 	return i * alloc_unit;
@@ -1332,7 +1313,7 @@ static void balloon_up(struct work_struct *dummy)
 
 		if (num_ballooned == 0 || num_ballooned == num_pages) {
 			pr_debug("Ballooned %u out of %u requested pages.\n",
-				num_pages, dm_device.balloon_wrk.num_pages);
+				 num_pages, dm_device.balloon_wrk.num_pages);
 
 			bl_resp->more_pages = 0;
 			done = true;
@@ -1366,16 +1347,15 @@ static void balloon_up(struct work_struct *dummy)
 
 			for (i = 0; i < bl_resp->range_count; i++)
 				free_balloon_pages(&dm_device,
-						 &bl_resp->range_array[i]);
+						   &bl_resp->range_array[i]);
 
 			done = true;
 		}
 	}
-
 }
 
 static void balloon_down(struct hv_dynmem_device *dm,
-			struct dm_unballoon_request *req)
+			 struct dm_unballoon_request *req)
 {
 	union dm_mem_page_range *range_array = req->range_array;
 	int range_count = req->range_count;
@@ -1389,7 +1369,7 @@ static void balloon_down(struct hv_dynmem_device *dm,
 	}
 
 	pr_debug("Freed %u ballooned pages.\n",
-		prev_pages_ballooned - dm->num_pages_ballooned);
+		 prev_pages_ballooned - dm->num_pages_ballooned);
 
 	if (req->more_pages == 1)
 		return;
@@ -1415,7 +1395,7 @@ static int dm_thread_func(void *dm_dev)
 
 	while (!kthread_should_stop()) {
 		wait_for_completion_interruptible_timeout(
-						&dm_device.config_event, 1*HZ);
+						&dm_device.config_event, 1 * HZ);
 		/*
 		 * The host expects us to post information on the memory
 		 * pressure every second.
@@ -1439,9 +1419,8 @@ static int dm_thread_func(void *dm_dev)
 	return 0;
 }
 
-
 static void version_resp(struct hv_dynmem_device *dm,
-			struct dm_version_response *vresp)
+			 struct dm_version_response *vresp)
 {
 	struct dm_version_request version_req;
 	int ret;
@@ -1502,7 +1481,7 @@ static void version_resp(struct hv_dynmem_device *dm,
 }
 
 static void cap_resp(struct hv_dynmem_device *dm,
-			struct dm_capabilities_resp_msg *cap_resp)
+		     struct dm_capabilities_resp_msg *cap_resp)
 {
 	if (!cap_resp->is_accepted) {
 		pr_err("Capabilities not accepted by host\n");
@@ -1535,7 +1514,7 @@ static void balloon_onchannelcallback(void *context)
 		switch (dm_hdr->type) {
 		case DM_VERSION_RESPONSE:
 			version_resp(dm,
-				 (struct dm_version_response *)dm_msg);
+				     (struct dm_version_response *)dm_msg);
 			break;
 
 		case DM_CAPABILITIES_RESPONSE:
@@ -1565,7 +1544,7 @@ static void balloon_onchannelcallback(void *context)
 
 			dm->state = DM_BALLOON_DOWN;
 			balloon_down(dm,
-				 (struct dm_unballoon_request *)recv_buffer);
+				     (struct dm_unballoon_request *)recv_buffer);
 			break;
 
 		case DM_MEM_HOT_ADD_REQUEST:
@@ -1603,17 +1582,15 @@ static void balloon_onchannelcallback(void *context)
 
 		default:
 			pr_warn_ratelimited("Unhandled message: type: %d\n", dm_hdr->type);
-
 		}
 	}
-
 }
 
 #define HV_LARGE_REPORTING_ORDER	9
 #define HV_LARGE_REPORTING_LEN (HV_HYP_PAGE_SIZE << \
 		HV_LARGE_REPORTING_ORDER)
 static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
-		    struct scatterlist *sgl, unsigned int nents)
+			       struct scatterlist *sgl, unsigned int nents)
 {
 	unsigned long flags;
 	struct hv_memory_hint *hint;
@@ -1648,7 +1625,7 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
 		 */
 
 		/* page reporting for pages 2MB or higher */
-		if (order >= HV_LARGE_REPORTING_ORDER ) {
+		if (order >= HV_LARGE_REPORTING_ORDER) {
 			range->page.largepage = 1;
 			range->page_size = HV_GPA_PAGE_RANGE_PAGE_SIZE_2MB;
 			range->base_large_pfn = page_to_hvpfn(
@@ -1662,23 +1639,21 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
 			range->page.additional_pages =
 				(sg->length / HV_HYP_PAGE_SIZE) - 1;
 		}
-
 	}
 
 	status = hv_do_rep_hypercall(HV_EXT_CALL_MEMORY_HEAT_HINT, nents, 0,
 				     hint, NULL);
 	local_irq_restore(flags);
 	if (!hv_result_success(status)) {
-
 		pr_err("Cold memory discard hypercall failed with status %llx\n",
-				status);
+		       status);
 		if (hv_hypercall_multi_failure > 0)
 			hv_hypercall_multi_failure++;
 
 		if (hv_result(status) == HV_STATUS_INVALID_PARAMETER) {
 			pr_err("Underlying Hyper-V does not support order less than 9. Hypercall failed\n");
 			pr_err("Defaulting to page_reporting_order %d\n",
-					pageblock_order);
+			       pageblock_order);
 			page_reporting_order = pageblock_order;
 			hv_hypercall_multi_failure++;
 			return -EINVAL;
@@ -1712,7 +1687,7 @@ static void enable_page_reporting(void)
 		pr_err("Failed to enable cold memory discard: %d\n", ret);
 	} else {
 		pr_info("Cold memory discard hint enabled with order %d\n",
-				page_reporting_order);
+			page_reporting_order);
 	}
 }
 
@@ -1795,7 +1770,7 @@ static int balloon_connect_vsp(struct hv_device *dev)
 	if (ret)
 		goto out;
 
-	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
+	t = wait_for_completion_timeout(&dm_device.host_event, 5 * HZ);
 	if (t == 0) {
 		ret = -ETIMEDOUT;
 		goto out;
@@ -1850,7 +1825,7 @@ static int balloon_connect_vsp(struct hv_device *dev)
 	if (ret)
 		goto out;
 
-	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
+	t = wait_for_completion_timeout(&dm_device.host_event, 5 * HZ);
 	if (t == 0) {
 		ret = -ETIMEDOUT;
 		goto out;
@@ -1891,8 +1866,8 @@ static int hv_balloon_debug_show(struct seq_file *f, void *offset)
 	char *sname;
 
 	seq_printf(f, "%-22s: %u.%u\n", "host_version",
-				DYNMEM_MAJOR_VERSION(dm->version),
-				DYNMEM_MINOR_VERSION(dm->version));
+			DYNMEM_MAJOR_VERSION(dm->version),
+			DYNMEM_MINOR_VERSION(dm->version));
 
 	seq_printf(f, "%-22s:", "capabilities");
 	if (ballooning_enabled())
@@ -1941,10 +1916,10 @@ static int hv_balloon_debug_show(struct seq_file *f, void *offset)
 	seq_printf(f, "%-22s: %u\n", "pages_ballooned", dm->num_pages_ballooned);
 
 	seq_printf(f, "%-22s: %lu\n", "total_pages_committed",
-				get_pages_committed(dm));
+		   get_pages_committed(dm));
 
 	seq_printf(f, "%-22s: %llu\n", "max_dynamic_page_count",
-				dm->max_dynamic_page_count);
+		   dm->max_dynamic_page_count);
 
 	return 0;
 }
@@ -1954,7 +1929,7 @@ DEFINE_SHOW_ATTRIBUTE(hv_balloon_debug);
 static void  hv_balloon_debugfs_init(struct hv_dynmem_device *b)
 {
 	debugfs_create_file("hv-balloon", 0444, NULL, b,
-			&hv_balloon_debug_fops);
+			    &hv_balloon_debug_fops);
 }
 
 static void  hv_balloon_debugfs_exit(struct hv_dynmem_device *b)
@@ -2097,7 +2072,6 @@ static int balloon_suspend(struct hv_device *hv_dev)
 	tasklet_enable(&hv_dev->channel->callback_event);
 
 	return 0;
-
 }
 
 static int balloon_resume(struct hv_device *dev)
@@ -2156,7 +2130,6 @@ static  struct hv_driver balloon_drv = {
 
 static int __init init_balloon_drv(void)
 {
-
 	return vmbus_driver_register(&balloon_drv);
 }
 
-- 
2.34.1


^ permalink raw reply related	[relevance 38%]

* [RFC PATCH 4/4] perf/x86/core: Add tls dump support
  2024-04-12  0:17 60% [RFC PATCH 0/4] perf: Correlating user process data to samples Beau Belgrave
                   ` (2 preceding siblings ...)
  2024-04-12  0:17 68% ` [RFC PATCH 3/4] perf/core: Factor perf_output_sample_udump() Beau Belgrave
@ 2024-04-12  0:17 72% ` Beau Belgrave
      5 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-12  0:17 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, linux-perf-users, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, primiano,
	aahringo, dcook

Now that perf supports TLS dumps, x86-64 can provide the details for how
to get TLS data for user threads.

Enable HAVE_PERF_USER_TLS_DUMP Kconfig only for x86-64. I do not have
access to x86 to validate 32-bit.

Utilize mmap_is_ia32() to determine 32/64 bit threads. Use fsbase for
64-bit and gsbase for 32-bit with appropriate size.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 arch/x86/Kconfig                  |  1 +
 arch/x86/events/core.c            | 14 ++++++++++++++
 arch/x86/include/asm/perf_event.h |  5 +++++
 3 files changed, 20 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4fff6ed46e90..8d46ec8ded0c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -263,6 +263,7 @@ config X86
 	select HAVE_PCI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
+	select HAVE_PERF_USER_TLS_DUMP		if X86_64
 	select MMU_GATHER_RCU_TABLE_FREE	if PARAVIRT
 	select MMU_GATHER_MERGE_VMAS
 	select HAVE_POSIX_CPU_TIMERS_TASK_WORK
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 09050641ce5d..3f851db4c591 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -41,6 +41,7 @@
 #include <asm/desc.h>
 #include <asm/ldt.h>
 #include <asm/unwind.h>
+#include <asm/elf.h>
 
 #include "perf_event.h"
 
@@ -3002,3 +3003,16 @@ u64 perf_get_hw_event_config(int hw_event)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(perf_get_hw_event_config);
+
+#ifdef CONFIG_X86_64
+void arch_perf_user_tls_pointer(struct perf_tls *tls)
+{
+	if (!mmap_is_ia32()) {
+		tls->base = current->thread.fsbase;
+		tls->size = sizeof(u64);
+	} else {
+		tls->base = current->thread.gsbase;
+		tls->size = sizeof(u32);
+	}
+}
+#endif
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 3736b8a46c04..d0f65e572c20 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -628,4 +628,9 @@ static __always_inline void perf_lopwr_cb(bool lopwr_in)
 
 #define arch_perf_out_copy_user copy_from_user_nmi
 
+#ifdef CONFIG_HAVE_PERF_USER_TLS_DUMP
+struct perf_tls;
+extern void arch_perf_user_tls_pointer(struct perf_tls *tls);
+#endif
+
 #endif /* _ASM_X86_PERF_EVENT_H */
-- 
2.34.1


^ permalink raw reply related	[relevance 72%]

* [RFC PATCH 3/4] perf/core: Factor perf_output_sample_udump()
  2024-04-12  0:17 60% [RFC PATCH 0/4] perf: Correlating user process data to samples Beau Belgrave
  2024-04-12  0:17 63% ` [RFC PATCH 1/4] perf/core: Introduce perf_prepare_dump_data() Beau Belgrave
  2024-04-12  0:17 52% ` [RFC PATCH 2/4] perf: Introduce PERF_SAMPLE_TLS_USER sample type Beau Belgrave
@ 2024-04-12  0:17 68% ` Beau Belgrave
  2024-04-12  0:17 72% ` [RFC PATCH 4/4] perf/x86/core: Add tls dump support Beau Belgrave
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-12  0:17 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, linux-perf-users, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, primiano,
	aahringo, dcook

We now have two user dump sources (stack and tls). Both use the same
logic to ensure the user dump ABI output is properly handled. The only
difference is that one gets the address within the method, while the
other is passed the address.

Add perf_output_sample_udump() and utilize it for both stack and tls
sample dumps. The sp register is now fetched outside of this method and
passed to it. This allows both stack and tls to utilize the same code.

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 kernel/events/core.c | 68 +++++++++++++-------------------------------
 1 file changed, 19 insertions(+), 49 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f848bf4be9bd..6b3cf5afdd32 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6998,47 +6998,10 @@ perf_sample_dump_size(u16 dump_size, u16 header_size, u64 task_size)
 }
 
 static void
-perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
-			  struct pt_regs *regs)
-{
-	/* Case of a kernel thread, nothing to dump */
-	if (!regs) {
-		u64 size = 0;
-		perf_output_put(handle, size);
-	} else {
-		unsigned long sp;
-		unsigned int rem;
-		u64 dyn_size;
-
-		/*
-		 * We dump:
-		 * static size
-		 *   - the size requested by user or the best one we can fit
-		 *     in to the sample max size
-		 * data
-		 *   - user stack dump data
-		 * dynamic size
-		 *   - the actual dumped size
-		 */
-
-		/* Static size. */
-		perf_output_put(handle, dump_size);
-
-		/* Data. */
-		sp = perf_user_stack_pointer(regs);
-		rem = __output_copy_user(handle, (void *) sp, dump_size);
-		dyn_size = dump_size - rem;
-
-		perf_output_skip(handle, rem);
-
-		/* Dynamic size. */
-		perf_output_put(handle, dyn_size);
-	}
-}
-
-static void
-perf_output_sample_utls(struct perf_output_handle *handle, u64 addr,
-			u64 dump_size, struct pt_regs *regs)
+perf_output_sample_udump(struct perf_output_handle *handle,
+			 unsigned long addr,
+			 u64 dump_size,
+			 struct pt_regs *regs)
 {
 	/* Case of a kernel thread, nothing to dump */
 	if (!regs) {
@@ -7054,7 +7017,7 @@ perf_output_sample_utls(struct perf_output_handle *handle, u64 addr,
 		 *   - the size requested by user or the best one we can fit
 		 *     in to the sample max size
 		 * data
-		 *   - user tls dump data
+		 *   - user dump data
 		 * dynamic size
 		 *   - the actual dumped size
 		 */
@@ -7507,9 +7470,16 @@ void perf_output_sample(struct perf_output_handle *handle,
 	}
 
 	if (sample_type & PERF_SAMPLE_STACK_USER) {
-		perf_output_sample_ustack(handle,
-					  data->stack_user_size,
-					  data->regs_user.regs);
+		struct pt_regs *regs = data->regs_user.regs;
+		unsigned long sp = 0;
+
+		if (regs)
+			sp = perf_user_stack_pointer(regs);
+
+		perf_output_sample_udump(handle,
+					 sp,
+					 data->stack_user_size,
+					 regs);
 	}
 
 	if (sample_type & PERF_SAMPLE_WEIGHT_TYPE)
@@ -7551,10 +7521,10 @@ void perf_output_sample(struct perf_output_handle *handle,
 		perf_output_put(handle, data->code_page_size);
 
 	if (sample_type & PERF_SAMPLE_TLS_USER) {
-		perf_output_sample_utls(handle,
-					data->tls_addr,
-					data->tls_user_size,
-					data->regs_user.regs);
+		perf_output_sample_udump(handle,
+					 data->tls_addr,
+					 data->tls_user_size,
+					 data->regs_user.regs);
 	}
 
 	if (sample_type & PERF_SAMPLE_AUX) {
-- 
2.34.1


^ permalink raw reply related	[relevance 68%]

* [RFC PATCH 1/4] perf/core: Introduce perf_prepare_dump_data()
  2024-04-12  0:17 60% [RFC PATCH 0/4] perf: Correlating user process data to samples Beau Belgrave
@ 2024-04-12  0:17 63% ` Beau Belgrave
  2024-04-12  0:17 52% ` [RFC PATCH 2/4] perf: Introduce PERF_SAMPLE_TLS_USER sample type Beau Belgrave
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-12  0:17 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, linux-perf-users, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, primiano,
	aahringo, dcook

Factor out perf_prepare_dump_data() so that the same logic is used for
dumping stack data as other types, such as TLS.

Slightly refactor perf_sample_ustack_size() to perf_sample_dump_size().
Move reg checks up into perf_ustack_task_size() since the task size
must now be calculated before preparing dump data.
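
The clamping arithmetic can be followed in a standalone userspace
sketch. It mirrors the steps of perf_sample_dump_size() from the diff
in this patch using fixed-width types. The names sample_dump_size() and
round_up8() are hypothetical stand-ins for the kernel function and the
kernel's round_up() macro, and the assert() is a guard added only for
this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of 8, truncated to 16 bits. */
static uint16_t round_up8(uint16_t x)
{
	return (uint16_t)((x + 7u) & ~7u);
}

/* Userspace mirror of the clamp; task_size is the room left below TASK_SIZE. */
static uint16_t sample_dump_size(uint16_t dump_size, uint16_t header_size,
				 uint64_t task_size)
{
	/* Sketch-only guard (not in the patch): leave room for size fields. */
	assert(header_size <= USHRT_MAX - 3 * sizeof(uint64_t));

	/* Limit the request to what the task address space can provide. */
	if (task_size > USHRT_MAX)
		task_size = USHRT_MAX;
	if (dump_size > (uint16_t)task_size)
		dump_size = (uint16_t)task_size;

	/* Current header size plus static size and dynamic size fields. */
	header_size += 2 * sizeof(uint64_t);

	/* A 16-bit wraparound means the sample would exceed its max size. */
	if ((uint16_t)(header_size + dump_size) < header_size) {
		dump_size = (uint16_t)(USHRT_MAX - header_size -
				       sizeof(uint64_t));
		dump_size = round_up8(dump_size);
	}

	return dump_size;
}
```

With a 100-byte header, a 4096-byte request is returned unchanged when
the task size allows it, is cut to 512 when only 512 bytes remain below
TASK_SIZE, and a 65528-byte request is reduced to 65416 so the sample
still fits in a u16 size.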

Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 kernel/events/core.c | 79 ++++++++++++++++++++++++++------------------
 1 file changed, 47 insertions(+), 32 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 724e6d7e128f..07de5cc2aa25 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6912,7 +6912,13 @@ static void perf_sample_regs_intr(struct perf_regs *regs_intr,
  */
 static u64 perf_ustack_task_size(struct pt_regs *regs)
 {
-	unsigned long addr = perf_user_stack_pointer(regs);
+	unsigned long addr;
+
+	/* No regs, no stack pointer, no dump. */
+	if (!regs)
+		return 0;
+
+	addr = perf_user_stack_pointer(regs);
 
 	if (!addr || addr >= TASK_SIZE)
 		return 0;
@@ -6921,42 +6927,35 @@ static u64 perf_ustack_task_size(struct pt_regs *regs)
 }
 
 static u16
-perf_sample_ustack_size(u16 stack_size, u16 header_size,
-			struct pt_regs *regs)
+perf_sample_dump_size(u16 dump_size, u16 header_size, u64 task_size)
 {
-	u64 task_size;
-
-	/* No regs, no stack pointer, no dump. */
-	if (!regs)
-		return 0;
-
 	/*
-	 * Check if we fit in with the requested stack size into the:
+	 * Check if we fit in with the requested dump size into the:
 	 * - TASK_SIZE
 	 *   If we don't, we limit the size to the TASK_SIZE.
 	 *
 	 * - remaining sample size
-	 *   If we don't, we customize the stack size to
+	 *   If we don't, we customize the dump size to
 	 *   fit in to the remaining sample size.
 	 */
 
-	task_size  = min((u64) USHRT_MAX, perf_ustack_task_size(regs));
-	stack_size = min(stack_size, (u16) task_size);
+	task_size  = min((u64) USHRT_MAX, task_size);
+	dump_size = min(dump_size, (u16) task_size);
 
 	/* Current header size plus static size and dynamic size. */
 	header_size += 2 * sizeof(u64);
 
-	/* Do we fit in with the current stack dump size? */
-	if ((u16) (header_size + stack_size) < header_size) {
+	/* Do we fit in with the current dump size? */
+	if ((u16) (header_size + dump_size) < header_size) {
 		/*
 		 * If we overflow the maximum size for the sample,
-		 * we customize the stack dump size to fit in.
+		 * we customize the dump size to fit in.
 		 */
-		stack_size = USHRT_MAX - header_size - sizeof(u64);
-		stack_size = round_up(stack_size, sizeof(u64));
+		dump_size = USHRT_MAX - header_size - sizeof(u64);
+		dump_size = round_up(dump_size, sizeof(u64));
 	}
 
-	return stack_size;
+	return dump_size;
 }
 
 static void
@@ -7648,6 +7647,32 @@ static __always_inline u64 __cond_set(u64 flags, u64 s, u64 d)
 	return d * !!(flags & s);
 }
 
+static inline u16
+perf_prepare_dump_data(struct perf_sample_data *data,
+		       struct perf_event *event,
+		       struct pt_regs *regs,
+		       u16 dump_size,
+		       u64 task_size)
+{
+	u16 header_size = perf_sample_data_size(data, event);
+	u16 size = sizeof(u64);
+
+	dump_size = perf_sample_dump_size(dump_size, header_size,
+					  task_size);
+
+	/*
+	 * If there is something to dump, add space for the dump
+	 * itself and for the field that tells the dynamic size,
+	 * which is how many have been actually dumped.
+	 */
+	if (dump_size)
+		size += sizeof(u64) + dump_size;
+
+	data->dyn_size += size;
+
+	return dump_size;
+}
+
 void perf_prepare_sample(struct perf_sample_data *data,
 			 struct perf_event *event,
 			 struct pt_regs *regs)
@@ -7725,22 +7750,12 @@ void perf_prepare_sample(struct perf_sample_data *data,
 		 * up the rest of the sample size.
 		 */
 		u16 stack_size = event->attr.sample_stack_user;
-		u16 header_size = perf_sample_data_size(data, event);
-		u16 size = sizeof(u64);
-
-		stack_size = perf_sample_ustack_size(stack_size, header_size,
-						     data->regs_user.regs);
+		u64 task_size = perf_ustack_task_size(regs);
 
-		/*
-		 * If there is something to dump, add space for the dump
-		 * itself and for the field that tells the dynamic size,
-		 * which is how many have been actually dumped.
-		 */
-		if (stack_size)
-			size += sizeof(u64) + stack_size;
+		stack_size = perf_prepare_dump_data(data, event, regs,
+						    stack_size, task_size);
 
 		data->stack_user_size = stack_size;
-		data->dyn_size += size;
 		data->sample_flags |= PERF_SAMPLE_STACK_USER;
 	}
 
-- 
2.34.1


^ permalink raw reply related	[relevance 63%]

* [RFC PATCH 0/4] perf: Correlating user process data to samples
@ 2024-04-12  0:17 60% Beau Belgrave
  2024-04-12  0:17 63% ` [RFC PATCH 1/4] perf/core: Introduce perf_prepare_dump_data() Beau Belgrave
                   ` (5 more replies)
  0 siblings, 6 replies; 200+ results
From: Beau Belgrave @ 2024-04-12  0:17 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, linux-perf-users, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, primiano,
	aahringo, dcook

In the OpenTelemetry profiling SIG [1], we are trying to find a way to
grab a tracing association quickly on a per-sample basis. The team at
Elastic has a bespoke way to do this [2]; however, I'd like to see a
more general way to achieve it. The folks I've been talking with seem
open to the idea of just having a TLS value for this that we could
capture upon each sample. We could then simply state that OpenTelemetry
SDKs should have a TLS value for span correlation. However, we need a
way to sample the TLS or other value(s) when a sampling event is
generated. This is supported today on Windows via
EventActivityIdControl() [3]. Since OpenTelemetry works on both Windows
and Linux, ideally we can do something as efficient for Linux-based
workloads.

This series explores how best to collect supporting data from a user
process when a profile sample is collected. Having a value stored in TLS
makes a lot of sense for this; however, there are other ways to explore.
Whatever is chosen, kernel samples taken in process context should be
able to get this supporting data. In these patches, fsbase and gsbase
are used for this on x86-64.

An option to explore, suggested by Mathieu Desnoyers, is to utilize rseq
for processes to register a value location that can be included when
profiling if desired. This would allow a tighter contract between user
processes and a profiler. It would also allow better labeling and
categorizing of the correlation values.

An ideal flow would look like this:
User Task		Profile
do_work();		sample() -> IP + No activity
...
set_activity(123);
...
do_work();		sample() -> IP + activity (123)
...
set_activity(124);
...
do_work();		sample() -> IP + activity (124)

Ideally, the set_activity() method would not be a syscall. It needs to
be very cheap as this should not bottleneck work. Ideally this is just
a memcpy of 16-20 bytes as it is on Windows via EventActivityIdControl()
using EVENT_ACTIVITY_CTRL_SET_ID.
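
As a rough illustration of how cheap set_activity() could be, here is a
minimal userspace sketch that keeps the activity ID in a thread-local
buffer and updates it with a plain memcpy, with no syscall involved.
The names ACTIVITY_ID_LEN, activity_id, and get_activity() are
hypothetical and not part of this series, and the 16-byte length is an
assumption taken from the 16-20 byte range mentioned above. Where the
toolchain places the buffer relative to the thread pointer is not
addressed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed ID length; the text above only says 16-20 bytes. */
#define ACTIVITY_ID_LEN 16

/* One activity ID per thread, zero-initialized ("no activity"). */
static _Thread_local uint8_t activity_id[ACTIVITY_ID_LEN];

/* Publish a new activity ID for the calling thread: a single memcpy. */
static inline void set_activity(const uint8_t id[ACTIVITY_ID_LEN])
{
	assert(id != NULL);
	memcpy(activity_id, id, ACTIVITY_ID_LEN);
}

/* Read back the calling thread's current activity ID. */
static inline const uint8_t *get_activity(void)
{
	return activity_id;
}
```

A profiler that can read this per-thread buffer at sample time would see
whichever ID the thread set last, which is the association the flow
above is after.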

For those not aware, OpenTelemetry allows collecting data from multiple
machines and showing where time was spent. The tracing context is
already available for logs, but not for profiling samples. The idea is
to show where slowdowns occur and have profile samples to explain why
they occurred. This must be possible without having to track context
switches to do this correlation, because profiling rates are typically
20 Hz - 1 kHz, while context switching rates are much higher. We do not
want to have to consume high context switch rates just to know a
correlation for a 20 Hz signal. In some environments these 20 Hz
signals are always enabled.

Regardless of whether TLS, rseq, or another source is used, I believe we
will need a way for perf_events to include it within a sample. The
changes in this series show how it could be done with TLS. There is some
factoring work under perf to make it easier to add more dump types using
the existing ABI. This is mostly to make the patches clearer; the
refactoring parts could certainly be dropped in favor of
duplicated/specialized paths.
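
To make the existing dump ABI concrete from the consumer side, here is
a minimal sketch of a parser for one user dump field: a u64 static
size, then that many bytes of data, then a u64 dynamic size. It assumes
host-endian data and that a zero static size is followed by neither
data nor a dynamic size, which matches the size accounting in
perf_prepare_dump_data(). The names struct udump and parse_udump() are
hypothetical and not part of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct udump {
	uint64_t size;		/* static size: bytes reserved for the dump */
	const uint8_t *data;	/* points into the caller's buffer, or NULL */
	uint64_t dyn_size;	/* dynamic size: bytes actually copied */
};

/* Parse one dump field; returns bytes consumed, or 0 if malformed. */
static size_t parse_udump(const uint8_t *buf, size_t len, struct udump *out)
{
	uint64_t size;

	assert(buf != NULL && out != NULL);

	if (len < sizeof(size))
		return 0;
	memcpy(&size, buf, sizeof(size));

	out->size = size;
	out->data = NULL;
	out->dyn_size = 0;

	/* Nothing was dumped (e.g. a kernel thread): only the size field. */
	if (size == 0)
		return sizeof(size);

	/* The data and the trailing dynamic size must both fit. */
	len -= sizeof(size);
	if (size > len || len - size < sizeof(uint64_t))
		return 0;

	out->data = buf + sizeof(size);
	memcpy(&out->dyn_size, out->data + size, sizeof(uint64_t));

	/* The dynamic size can never exceed the reserved space. */
	if (out->dyn_size > size)
		return 0;

	return (size_t)(2 * sizeof(uint64_t) + size);
}
```

Only the first dyn_size bytes of data are meaningful; the remainder of
the reserved space is skipped by the kernel when the copy from user
memory falls short.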

1. https://opentelemetry.io/blog/2024/profiling/
2. https://www.elastic.co/blog/continuous-profiling-distributed-tracing-correlation
3. https://learn.microsoft.com/en-us/windows/win32/api/evntprov/nf-evntprov-eventactivityidcontrol

Beau Belgrave (4):
  perf/core: Introduce perf_prepare_dump_data()
  perf: Introduce PERF_SAMPLE_TLS_USER sample type
  perf/core: Factor perf_output_sample_udump()
  perf/x86/core: Add tls dump support

 arch/Kconfig                      |   7 ++
 arch/x86/Kconfig                  |   1 +
 arch/x86/events/core.c            |  14 +++
 arch/x86/include/asm/perf_event.h |   5 +
 include/linux/perf_event.h        |   7 ++
 include/uapi/linux/perf_event.h   |   5 +-
 kernel/events/core.c              | 166 +++++++++++++++++++++++-------
 kernel/events/internal.h          |  16 +++
 8 files changed, 180 insertions(+), 41 deletions(-)


base-commit: fec50db7033ea478773b159e0e2efb135270e3b7
-- 
2.34.1


^ permalink raw reply	[relevance 60%]

* [RFC PATCH 2/4] perf: Introduce PERF_SAMPLE_TLS_USER sample type
  2024-04-12  0:17 60% [RFC PATCH 0/4] perf: Correlating user process data to samples Beau Belgrave
  2024-04-12  0:17 63% ` [RFC PATCH 1/4] perf/core: Introduce perf_prepare_dump_data() Beau Belgrave
@ 2024-04-12  0:17 52% ` Beau Belgrave
  2024-04-12  0:17 68% ` [RFC PATCH 3/4] perf/core: Factor perf_output_sample_udump() Beau Belgrave
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-12  0:17 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, rostedt, mhiramat, mathieu.desnoyers
  Cc: linux-kernel, linux-trace-kernel, linux-perf-users, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, primiano,
	aahringo, dcook

When samples are generated, there is no way via the perf_event ABI to
fetch per-thread data. This data is very useful in tracing scenarios
that involve correlation IDs, such as OpenTelemetry. It is also useful
for tracking per-thread performance details directly within a
cooperating user process.

The newly established OpenTelemetry profiling group requires a way to
get tracing correlations on both Linux and Windows. On Windows this
correlation is on a per-thread basis directly via ETW. On Linux we need
a fast mechanism to store these details, and TLS seems like the best
option; see the links for more details.

Add a new sample type (PERF_SAMPLE_TLS_USER) that fetches TLS data up to
X bytes per-sample. Use the existing PERF_SAMPLE_STACK_USER ABI for
outputting data to consumers. Store the data size requested by the user
in the previously reserved u16 (__reserved_2) within perf_event_attr.

Add tls_addr and tls_user_size to perf_sample_data and calculate them
during sample preparation. This allows the output side to know if
truncation is going to occur and avoids having to re-fetch the TLS value
from the user process a second time.

Add CONFIG_HAVE_PERF_USER_TLS_DUMP so that architectures can specify if
they have a TLS-specific register (or other logic) that can be used for
dumping. This does not yet enable any architecture to do TLS dumps; it
simply makes it possible by allowing an arch-defined method named
arch_perf_user_tls_pointer().

Add a perf_tls struct that arch_perf_user_tls_pointer() uses to set the
TLS address and size (for 32-bit on 64-bit compat cases).

Link: https://opentelemetry.io/blog/2024/profiling/
Link: https://www.elastic.co/blog/continuous-profiling-distributed-tracing-correlation
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
---
 arch/Kconfig                    |   7 +++
 include/linux/perf_event.h      |   7 +++
 include/uapi/linux/perf_event.h |   5 +-
 kernel/events/core.c            | 105 +++++++++++++++++++++++++++++++-
 kernel/events/internal.h        |  16 +++++
 5 files changed, 137 insertions(+), 3 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 9f066785bb71..6afaf5f46e2f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -430,6 +430,13 @@ config HAVE_PERF_USER_STACK_DUMP
 	  access to the user stack pointer which is not unified across
 	  architectures.
 
+config HAVE_PERF_USER_TLS_DUMP
+	bool
+	help
+	  Support user tls dumps for perf event samples. This needs
+	  access to the user tls pointer which is not unified across
+	  architectures.
+
 config HAVE_ARCH_JUMP_LABEL
 	bool
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d2a15c0c6f8a..7fac81929eed 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1202,8 +1202,15 @@ struct perf_sample_data {
 	u64				data_page_size;
 	u64				code_page_size;
 	u64				aux_size;
+	u64				tls_addr;
+	u64				tls_user_size;
 } ____cacheline_aligned;
 
+struct perf_tls {
+	unsigned long base; /* Base address for TLS */
+	unsigned long size; /* Size of base address */
+};
+
 /* default value for data source */
 #define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
 		    PERF_MEM_S(LVL, NA)   |\
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 3a64499b0f5d..b62669cfe581 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -162,8 +162,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_DATA_PAGE_SIZE		= 1U << 22,
 	PERF_SAMPLE_CODE_PAGE_SIZE		= 1U << 23,
 	PERF_SAMPLE_WEIGHT_STRUCT		= 1U << 24,
+	PERF_SAMPLE_TLS_USER			= 1U << 25,
 
-	PERF_SAMPLE_MAX = 1U << 25,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 26,		/* non-ABI */
 };
 
 #define PERF_SAMPLE_WEIGHT_TYPE	(PERF_SAMPLE_WEIGHT | PERF_SAMPLE_WEIGHT_STRUCT)
@@ -509,7 +510,7 @@ struct perf_event_attr {
 	 */
 	__u32	aux_watermark;
 	__u16	sample_max_stack;
-	__u16	__reserved_2;
+	__u16	sample_tls_user; /* Size of TLS data to dump on samples */
 	__u32	aux_sample_size;
 	__u32	__reserved_3;
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 07de5cc2aa25..f848bf4be9bd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6926,6 +6926,45 @@ static u64 perf_ustack_task_size(struct pt_regs *regs)
 	return TASK_SIZE - addr;
 }
 
+/*
+ * Get remaining task size from user tls pointer.
+ *
+ * Outputs the address to use for the dump to avoid doing
+ * this twice (prepare and output).
+ */
+static u64
+perf_utls_task_size(struct pt_regs *regs, u64 dump_size, u64 *tls_addr)
+{
+	struct perf_tls tls;
+	unsigned long addr;
+
+	*tls_addr = 0;
+
+	/* No regs, no tls pointer, no dump. */
+	if (!regs)
+		return 0;
+
+	perf_user_tls_pointer(&tls);
+
+	if (WARN_ONCE(tls.size > sizeof(addr), "perf: Bad TLS size.\n"))
+		return 0;
+
+	addr = 0;
+	arch_perf_out_copy_user(&addr, (void *)tls.base, tls.size);
+
+	if (addr < dump_size)
+		return 0;
+
+	addr -= dump_size;
+
+	if (!addr || addr >= TASK_SIZE)
+		return 0;
+
+	*tls_addr = addr;
+
+	return TASK_SIZE - addr;
+}
+
 static u16
 perf_sample_dump_size(u16 dump_size, u16 header_size, u64 task_size)
 {
@@ -6997,6 +7036,43 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
 	}
 }
 
+static void
+perf_output_sample_utls(struct perf_output_handle *handle, u64 addr,
+			u64 dump_size, struct pt_regs *regs)
+{
+	/* Case of a kernel thread, nothing to dump */
+	if (!regs) {
+		u64 size = 0;
+		perf_output_put(handle, size);
+	} else {
+		unsigned int rem;
+		u64 dyn_size;
+
+		/*
+		 * We dump:
+		 * static size
+		 *   - the size requested by user or the best one we can fit
+		 *     in to the sample max size
+		 * data
+		 *   - user tls dump data
+		 * dynamic size
+		 *   - the actual dumped size
+		 */
+
+		/* Static size. */
+		perf_output_put(handle, dump_size);
+
+		/* Data. */
+		rem = __output_copy_user(handle, (void *)addr, dump_size);
+		dyn_size = dump_size - rem;
+
+		perf_output_skip(handle, rem);
+
+		/* Dynamic size. */
+		perf_output_put(handle, dyn_size);
+	}
+}
+
 static unsigned long perf_prepare_sample_aux(struct perf_event *event,
 					  struct perf_sample_data *data,
 					  size_t size)
@@ -7474,6 +7550,13 @@ void perf_output_sample(struct perf_output_handle *handle,
 	if (sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)
 		perf_output_put(handle, data->code_page_size);
 
+	if (sample_type & PERF_SAMPLE_TLS_USER) {
+		perf_output_sample_utls(handle,
+					data->tls_addr,
+					data->tls_user_size,
+					data->regs_user.regs);
+	}
+
 	if (sample_type & PERF_SAMPLE_AUX) {
 		perf_output_put(handle, data->aux_size);
 
@@ -7759,6 +7842,19 @@ void perf_prepare_sample(struct perf_sample_data *data,
 		data->sample_flags |= PERF_SAMPLE_STACK_USER;
 	}
 
+	if (filtered_sample_type & PERF_SAMPLE_TLS_USER) {
+		u16 tls_size = event->attr.sample_tls_user;
+		u64 task_size = perf_utls_task_size(data->regs_user.regs,
+						    tls_size,
+						    &data->tls_addr);
+
+		tls_size = perf_prepare_dump_data(data, event, regs,
+						  tls_size, task_size);
+
+		data->tls_user_size = tls_size;
+		data->sample_flags |= PERF_SAMPLE_TLS_USER;
+	}
+
 	if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
 		data->weight.full = 0;
 		data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
@@ -12159,7 +12255,7 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 
 	attr->size = size;
 
-	if (attr->__reserved_1 || attr->__reserved_2 || attr->__reserved_3)
+	if (attr->__reserved_1 || attr->__reserved_3)
 		return -EINVAL;
 
 	if (attr->sample_type & ~(PERF_SAMPLE_MAX-1))
@@ -12225,6 +12321,13 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			return -EINVAL;
 	}
 
+	if (attr->sample_type & PERF_SAMPLE_TLS_USER) {
+		if (!arch_perf_have_user_tls_dump())
+			return -ENOSYS;
+		else if (!IS_ALIGNED(attr->sample_tls_user, sizeof(u64)))
+			return -EINVAL;
+	}
+
 	if (!attr->sample_max_stack)
 		attr->sample_max_stack = sysctl_perf_event_max_stack;
 
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 5150d5f84c03..b42747b1eb04 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -243,4 +243,20 @@ static inline bool arch_perf_have_user_stack_dump(void)
 #define perf_user_stack_pointer(regs) 0
 #endif /* CONFIG_HAVE_PERF_USER_STACK_DUMP */
 
+#ifdef CONFIG_HAVE_PERF_USER_TLS_DUMP
+static inline bool arch_perf_have_user_tls_dump(void)
+{
+	return true;
+}
+
+#define perf_user_tls_pointer(tls) arch_perf_user_tls_pointer(tls)
+#else
+static inline bool arch_perf_have_user_tls_dump(void)
+{
+	return false;
+}
+
+#define perf_user_tls_pointer(tls) memset(tls, 0, sizeof(*tls))
+#endif /* CONFIG_HAVE_PERF_USER_TLS_DUMP */
+
 #endif /* _KERNEL_EVENTS_INTERNAL_H */
-- 
2.34.1
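
For readers following the address arithmetic in perf_utls_task_size()
above, here is a minimal userspace sketch of the same checks. It is
illustration only: SKETCH_TASK_SIZE and sketch_utls_task_size() are
made-up names, the TASK_SIZE value is an assumption, and the pointer
that the patch reads from tls.base is passed in directly as tls_ptr.

```c
#include <stdint.h>

/* Stand-in for the kernel's TASK_SIZE; the value is an assumption. */
#define SKETCH_TASK_SIZE ((uint64_t)1 << 47)

/*
 * Mirrors the checks in perf_utls_task_size(). tls_ptr is the value
 * already read from the user TLS base. Returns the number of bytes
 * between the dump start and the end of the user address range, or 0
 * if nothing should be dumped. *dump_addr receives the dump start.
 */
static uint64_t sketch_utls_task_size(uint64_t tls_ptr, uint64_t dump_size,
				      uint64_t *dump_addr)
{
	uint64_t addr = tls_ptr;

	*dump_addr = 0;

	/* The dump region ends at the pointer; refuse if it would underflow. */
	if (addr < dump_size)
		return 0;

	addr -= dump_size;

	/* Reject a NULL start or one outside the user address range. */
	if (!addr || addr >= SKETCH_TASK_SIZE)
		return 0;

	*dump_addr = addr;
	return SKETCH_TASK_SIZE - addr;
}
```

With tls_ptr = 0x1000 and dump_size = 0x100 the dump would start at
0xf00. A pointer smaller than dump_size, or one that lands on 0 after
the subtraction, yields no dump at all.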


^ permalink raw reply related	[relevance 52%]

* Re: [PATCH v2 1/2] hyperv: Convert from tasklet to BH workqueue
  @ 2024-04-11 23:58 79%   ` Allen Pais
  0 siblings, 0 replies; 200+ results
From: Allen Pais @ 2024-04-11 23:58 UTC (permalink / raw)
  To: Michael Kelley
  Cc: linux-kernel, tj, keescook, kys, haiyangz, wei.liu, decui, linux-hyperv



> On Apr 10, 2024, at 11:08 AM, Michael Kelley <mhklinux@outlook.com> wrote:
> 
> From: Allen Pais <apais@linux.microsoft.com> Sent: Wednesday, April 3, 2024 9:56 AM
>> 
>> The only generic interface to execute asynchronously in the BH context is
>> tasklet; however, it's marked deprecated and has some design flaws. To
>> replace tasklets, BH workqueue support was recently added. A BH workqueue
>> behaves similarly to regular workqueues except that the queued work items
>> are executed in the BH context.
>> 
>> This patch converts drivers/hv/* from tasklet to BH workqueue.
>> 
>> Based on the work done by Tejun Heo <tj@kernel.org>
>> Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10
>> 
>> Signed-off-by: Allen Pais <allen.lkml@gmail.com>
>> ---
>> drivers/hv/channel.c      |  8 ++++----
>> drivers/hv/channel_mgmt.c |  5 ++---
>> drivers/hv/connection.c   |  9 +++++----
>> drivers/hv/hv.c           |  3 +--
>> drivers/hv/hv_balloon.c   |  4 ++--
>> drivers/hv/hv_fcopy.c     |  8 ++++----
>> drivers/hv/hv_kvp.c       |  8 ++++----
>> drivers/hv/hv_snapshot.c  |  8 ++++----
>> drivers/hv/hyperv_vmbus.h |  9 +++++----
>> drivers/hv/vmbus_drv.c    | 20 +++++++++++---------
>> include/linux/hyperv.h    |  2 +-
>> 11 files changed, 43 insertions(+), 41 deletions(-)
>> 
>> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
>> index adbf674355b2..876d78eb4dce 100644
>> --- a/drivers/hv/channel.c
>> +++ b/drivers/hv/channel.c
>> @@ -859,7 +859,7 @@ void vmbus_reset_channel_cb(struct vmbus_channel
>> *channel)
>> 	unsigned long flags;
>> 
>> 	/*
>> -	 * vmbus_on_event(), running in the per-channel tasklet, can race
>> +	 * vmbus_on_event(), running in the per-channel work, can race
>> 	 * with vmbus_close_internal() in the case of SMP guest, e.g., when
>> 	 * the former is accessing channel->inbound.ring_buffer, the latter
>> 	 * could be freeing the ring_buffer pages, so here we must stop it
>> @@ -871,7 +871,7 @@ void vmbus_reset_channel_cb(struct vmbus_channel *channel)
>> 	 * and that the channel ring buffer is no longer being accessed, cf.
>> 	 * the calls to napi_disable() in netvsc_device_remove().
>> 	 */
>> -	tasklet_disable(&channel->callback_event);
>> +	disable_work_sync(&channel->callback_event);
>> 
>> 	/* See the inline comments in vmbus_chan_sched(). */
>> 	spin_lock_irqsave(&channel->sched_lock, flags);
>> @@ -880,8 +880,8 @@ void vmbus_reset_channel_cb(struct vmbus_channel *channel)
>> 
>> 	channel->sc_creation_callback = NULL;
>> 
>> -	/* Re-enable tasklet for use on re-open */
>> -	tasklet_enable(&channel->callback_event);
>> +	/* Re-enable work for use on re-open */
>> +	enable_and_queue_work(system_bh_wq, &channel->callback_event);
> 
> In this case and in several other cases in the Hyper-V related code, you've
> used enable_and_queue_work() as the replacement for tasklet_enable().
> I would have expected just enable_work() as the equivalent.  tasklet_enable()
> just re-enables the tasklet; it does not do tasklet_schedule().

Thank you. I see your point. Let me update the call accordingly and send out
a new version.

> 
> Doing the additional queue_work() shouldn't break anything; the work
> function will run and find nothing to do, which is benign.  But it seems
> conceptually wrong to have these places in the code queueing the work
> to run.

Okay.
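
The difference Michael points out can be modelled in a few lines of
userspace C. This is a toy sketch, not kernel code: struct sketch_work
and the sketch_*() helpers are invented for illustration and only
approximate the disable-depth behaviour of enable_work() and
enable_and_queue_work().

```c
#include <stdbool.h>

/* Toy model of a work item; not the kernel's struct work_struct. */
struct sketch_work {
	int disable_depth;	/* > 0 means disabled */
	bool pending;		/* queued and waiting to run */
	int runs;		/* how many times the handler ran */
};

/* Queueing is refused while the item is disabled or already pending. */
static bool sketch_queue(struct sketch_work *w)
{
	if (w->disable_depth > 0 || w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Run the handler once if the item is pending. */
static void sketch_run_pending(struct sketch_work *w)
{
	if (w->pending) {
		w->pending = false;
		w->runs++;
	}
}

/* Modelled on disable_work_sync(): drop pending work, bump the depth. */
static void sketch_disable(struct sketch_work *w)
{
	w->pending = false;
	w->disable_depth++;
}

/* Modelled on enable_work(): drop one level; true once fully enabled. */
static bool sketch_enable(struct sketch_work *w)
{
	if (w->disable_depth > 0)
		w->disable_depth--;
	return w->disable_depth == 0;
}

/* Modelled on enable_and_queue_work(): enable, then queue if enabled. */
static bool sketch_enable_and_queue(struct sketch_work *w)
{
	if (!sketch_enable(w))
		return false;
	sketch_queue(w);
	return true;
}
```

In this model a disable/enable pair leaves the handler run count
unchanged, while a disable/enable-and-queue pair causes one extra
handler run with nothing to do, which is the benign but conceptually
odd behaviour described above.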

> 
> Other than that, the code looks good to me.  I can see that there's
> considerably more overhead in using a workqueue instead of a
> tasklet.  Tasklets access only per-CPU data and have no spin locks,
> whereas the workqueue code reads some global data and does
> a spin lock obtain/release on per-CPU data.  I haven't done any
> perf testing, and won't be able to at least over the next week. But
> the key scenario will be to test VMs with high CPU counts and lots
> of synthetic and/or storage interrupts.  I suspect the additional
> overhead won't be noticeable/measurable, but I agree with your
> initial statement that this should be checked.

I will try to get hold of a VM with a high CPU count and run some tests.
Thanks for the quick review.

- Allen

> 
> Michael
> 
>> }
>> 
>> static int vmbus_close_internal(struct vmbus_channel *channel)
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index 2f4d09ce027a..58397071a0de 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -353,8 +353,7 @@ static struct vmbus_channel *alloc_channel(void)
>> 
>> 	INIT_LIST_HEAD(&channel->sc_list);
>> 
>> -	tasklet_init(&channel->callback_event,
>> -		     vmbus_on_event, (unsigned long)channel);
>> +	INIT_WORK(&channel->callback_event, vmbus_on_event);
>> 
>> 	hv_ringbuffer_pre_init(channel);
>> 
>> @@ -366,7 +365,7 @@ static struct vmbus_channel *alloc_channel(void)
>>  */
>> static void free_channel(struct vmbus_channel *channel)
>> {
>> -	tasklet_kill(&channel->callback_event);
>> +	cancel_work_sync(&channel->callback_event);
>> 	vmbus_remove_channel_attr_group(channel);
>> 
>> 	kobject_put(&channel->kobj);
>> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
>> index 3cabeeabb1ca..f2a3394a8303 100644
>> --- a/drivers/hv/connection.c
>> +++ b/drivers/hv/connection.c
>> @@ -372,12 +372,13 @@ struct vmbus_channel *relid2channel(u32 relid)
>>  * 3. Once we return, enable signaling from the host. Once this
>>  *    state is set we check to see if additional packets are
>>  *    available to read. In this case we repeat the process.
>> - *    If this tasklet has been running for a long time
>> + *    If this work has been running for a long time
>>  *    then reschedule ourselves.
>>  */
>> -void vmbus_on_event(unsigned long data)
>> +void vmbus_on_event(struct work_struct *t)
>> {
>> -	struct vmbus_channel *channel = (void *) data;
>> +	struct vmbus_channel *channel = from_work(channel, t,
>> +						callback_event);
>> 	void (*callback_fn)(void *context);
>> 
>> 	trace_vmbus_on_event(channel);
>> @@ -401,7 +402,7 @@ void vmbus_on_event(unsigned long data)
>> 		return;
>> 
>> 	hv_begin_read(&channel->inbound);
>> -	tasklet_schedule(&channel->callback_event);
>> +	queue_work(system_bh_wq, &channel->callback_event);
>> }
>> 
>> /*
>> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
>> index a8ad728354cb..2af92f08f9ce 100644
>> --- a/drivers/hv/hv.c
>> +++ b/drivers/hv/hv.c
>> @@ -119,8 +119,7 @@ int hv_synic_alloc(void)
>> 	for_each_present_cpu(cpu) {
>> 		hv_cpu = per_cpu_ptr(hv_context.cpu_context, cpu);
>> 
>> -		tasklet_init(&hv_cpu->msg_dpc,
>> -			     vmbus_on_msg_dpc, (unsigned long) hv_cpu);
>> +		INIT_WORK(&hv_cpu->msg_dpc, vmbus_on_msg_dpc);
>> 
>> 		if (ms_hyperv.paravisor_present && hv_isolation_type_tdx())
>> {
>> 			hv_cpu->post_msg_page = (void *)get_zeroed_page(GFP_ATOMIC);
>> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
>> index e000fa3b9f97..c7efa2ff4cdf 100644
>> --- a/drivers/hv/hv_balloon.c
>> +++ b/drivers/hv/hv_balloon.c
>> @@ -2083,7 +2083,7 @@ static int balloon_suspend(struct hv_device *hv_dev)
>> {
>> 	struct hv_dynmem_device *dm = hv_get_drvdata(hv_dev);
>> 
>> -	tasklet_disable(&hv_dev->channel->callback_event);
>> +	disable_work_sync(&hv_dev->channel->callback_event);
>> 
>> 	cancel_work_sync(&dm->balloon_wrk.wrk);
>> 	cancel_work_sync(&dm->ha_wrk.wrk);
>> @@ -2094,7 +2094,7 @@ static int balloon_suspend(struct hv_device *hv_dev)
>> 		vmbus_close(hv_dev->channel);
>> 	}
>> 
>> -	tasklet_enable(&hv_dev->channel->callback_event);
>> +	enable_and_queue_work(system_bh_wq, &hv_dev->channel->callback_event);
>> 
>> 	return 0;
>> 
>> diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
>> index 922d83eb7ddf..fd6799293c17 100644
>> --- a/drivers/hv/hv_fcopy.c
>> +++ b/drivers/hv/hv_fcopy.c
>> @@ -71,7 +71,7 @@ static void fcopy_poll_wrapper(void *channel)
>> {
>> 	/* Transaction is finished, reset the state here to avoid races. */
>> 	fcopy_transaction.state = HVUTIL_READY;
>> -	tasklet_schedule(&((struct vmbus_channel *)channel)->callback_event);
>> +	queue_work(system_bh_wq, &((struct vmbus_channel *)channel)->callback_event);
>> }
>> 
>> static void fcopy_timeout_func(struct work_struct *dummy)
>> @@ -391,7 +391,7 @@ int hv_fcopy_pre_suspend(void)
>> 	if (!fcopy_msg)
>> 		return -ENOMEM;
>> 
>> -	tasklet_disable(&channel->callback_event);
>> +	disable_work_sync(&channel->callback_event);
>> 
>> 	fcopy_msg->operation = CANCEL_FCOPY;
>> 
>> @@ -404,7 +404,7 @@ int hv_fcopy_pre_suspend(void)
>> 
>> 	fcopy_transaction.state = HVUTIL_READY;
>> 
>> -	/* tasklet_enable() will be called in hv_fcopy_pre_resume(). */
>> +	/* enable_and_queue_work() will be called in hv_fcopy_pre_resume(). */
>> 	return 0;
>> }
>> 
>> @@ -412,7 +412,7 @@ int hv_fcopy_pre_resume(void)
>> {
>> 	struct vmbus_channel *channel = fcopy_transaction.recv_channel;
>> 
>> -	tasklet_enable(&channel->callback_event);
>> +	enable_and_queue_work(system_bh_wq, &channel->callback_event);
>> 
>> 	return 0;
>> }
>> diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
>> index d35b60c06114..85b8fb4a3d2e 100644
>> --- a/drivers/hv/hv_kvp.c
>> +++ b/drivers/hv/hv_kvp.c
>> @@ -113,7 +113,7 @@ static void kvp_poll_wrapper(void *channel)
>> {
>> 	/* Transaction is finished, reset the state here to avoid races. */
>> 	kvp_transaction.state = HVUTIL_READY;
>> -	tasklet_schedule(&((struct vmbus_channel *)channel)->callback_event);
>> +	queue_work(system_bh_wq, &((struct vmbus_channel *)channel)->callback_event);
>> }
>> 
>> static void kvp_register_done(void)
>> @@ -160,7 +160,7 @@ static void kvp_timeout_func(struct work_struct *dummy)
>> 
>> static void kvp_host_handshake_func(struct work_struct *dummy)
>> {
>> -	tasklet_schedule(&kvp_transaction.recv_channel->callback_event);
>> +	queue_work(system_bh_wq, &kvp_transaction.recv_channel->callback_event);
>> }
>> 
>> static int kvp_handle_handshake(struct hv_kvp_msg *msg)
>> @@ -786,7 +786,7 @@ int hv_kvp_pre_suspend(void)
>> {
>> 	struct vmbus_channel *channel = kvp_transaction.recv_channel;
>> 
>> -	tasklet_disable(&channel->callback_event);
>> +	disable_work_sync(&channel->callback_event);
>> 
>> 	/*
>> 	 * If there is a pending transtion, it's unnecessary to tell the host
>> @@ -809,7 +809,7 @@ int hv_kvp_pre_resume(void)
>> {
>> 	struct vmbus_channel *channel = kvp_transaction.recv_channel;
>> 
>> -	tasklet_enable(&channel->callback_event);
>> +	enable_and_queue_work(system_bh_wq, &channel->callback_event);
>> 
>> 	return 0;
>> }
>> diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
>> index 0d2184be1691..46c2263d2591 100644
>> --- a/drivers/hv/hv_snapshot.c
>> +++ b/drivers/hv/hv_snapshot.c
>> @@ -83,7 +83,7 @@ static void vss_poll_wrapper(void *channel)
>> {
>> 	/* Transaction is finished, reset the state here to avoid races. */
>> 	vss_transaction.state = HVUTIL_READY;
>> -	tasklet_schedule(&((struct vmbus_channel *)channel)->callback_event);
>> +	queue_work(system_bh_wq, &((struct vmbus_channel *)channel)->callback_event);
>> }
>> 
>> /*
>> @@ -421,7 +421,7 @@ int hv_vss_pre_suspend(void)
>> 	if (!vss_msg)
>> 		return -ENOMEM;
>> 
>> -	tasklet_disable(&channel->callback_event);
>> +	disable_work_sync(&channel->callback_event);
>> 
>> 	vss_msg->vss_hdr.operation = VSS_OP_THAW;
>> 
>> @@ -435,7 +435,7 @@ int hv_vss_pre_suspend(void)
>> 
>> 	vss_transaction.state = HVUTIL_READY;
>> 
>> -	/* tasklet_enable() will be called in hv_vss_pre_resume(). */
>> +	/* enable_and_queue_work() will be called in hv_vss_pre_resume(). */
>> 	return 0;
>> }
>> 
>> @@ -443,7 +443,7 @@ int hv_vss_pre_resume(void)
>> {
>> 	struct vmbus_channel *channel = vss_transaction.recv_channel;
>> 
>> -	tasklet_enable(&channel->callback_event);
>> +	enable_and_queue_work(system_bh_wq, &channel->callback_event);
>> 
>> 	return 0;
>> }
>> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
>> index f6b1e710f805..95ca570ac7af 100644
>> --- a/drivers/hv/hyperv_vmbus.h
>> +++ b/drivers/hv/hyperv_vmbus.h
>> @@ -19,6 +19,7 @@
>> #include <linux/atomic.h>
>> #include <linux/hyperv.h>
>> #include <linux/interrupt.h>
>> +#include <linux/workqueue.h>
>> 
>> #include "hv_trace.h"
>> 
>> @@ -136,10 +137,10 @@ struct hv_per_cpu_context {
>> 
>> 	/*
>> 	 * Starting with win8, we can take channel interrupts on any CPU;
>> -	 * we will manage the tasklet that handles events messages on a per CPU
>> +	 * we will manage the work that handles events messages on a per CPU
>> 	 * basis.
>> 	 */
>> -	struct tasklet_struct msg_dpc;
>> +	struct work_struct msg_dpc;
>> };
>> 
>> struct hv_context {
>> @@ -366,8 +367,8 @@ void vmbus_disconnect(void);
>> 
>> int vmbus_post_msg(void *buffer, size_t buflen, bool can_sleep);
>> 
>> -void vmbus_on_event(unsigned long data);
>> -void vmbus_on_msg_dpc(unsigned long data);
>> +void vmbus_on_event(struct work_struct *t);
>> +void vmbus_on_msg_dpc(struct work_struct *t);
>> 
>> int hv_kvp_init(struct hv_util_service *srv);
>> void hv_kvp_deinit(void);
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index 4cb17603a828..28490068cacc 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -1025,9 +1025,9 @@ static void vmbus_onmessage_work(struct work_struct *work)
>> 	kfree(ctx);
>> }
>> 
>> -void vmbus_on_msg_dpc(unsigned long data)
>> +void vmbus_on_msg_dpc(struct work_struct *t)
>> {
>> -	struct hv_per_cpu_context *hv_cpu = (void *)data;
>> +	struct hv_per_cpu_context *hv_cpu = from_work(hv_cpu, t, msg_dpc);
>> 	void *page_addr = hv_cpu->synic_message_page;
>> 	struct hv_message msg_copy, *msg = (struct hv_message *)page_addr +
>> 				  VMBUS_MESSAGE_SINT;
>> @@ -1131,7 +1131,7 @@ void vmbus_on_msg_dpc(unsigned long data)
>> 			 * before sending the rescind message of the same
>> 			 * channel.  These messages are sent to the guest's
>> 			 * connect CPU; the guest then starts processing them
>> -			 * in the tasklet handler on this CPU:
>> +			 * in the work handler on this CPU:
>> 			 *
>> 			 * VMBUS_CONNECT_CPU
>> 			 *
>> @@ -1276,7 +1276,7 @@ static void vmbus_chan_sched(struct hv_per_cpu_context *hv_cpu)
>> 			hv_begin_read(&channel->inbound);
>> 			fallthrough;
>> 		case HV_CALL_DIRECT:
>> -			tasklet_schedule(&channel->callback_event);
>> +			queue_work(system_bh_wq, &channel->callback_event);
>> 		}
>> 
>> sched_unlock:
>> @@ -1304,7 +1304,7 @@ static void vmbus_isr(void)
>> 			hv_stimer0_isr();
>> 			vmbus_signal_eom(msg, HVMSG_TIMER_EXPIRED);
>> 		} else
>> -			tasklet_schedule(&hv_cpu->msg_dpc);
>> +			queue_work(system_bh_wq, &hv_cpu->msg_dpc);
>> 	}
>> 
>> 	add_interrupt_randomness(vmbus_interrupt);
>> @@ -2371,10 +2371,12 @@ static int vmbus_bus_suspend(struct device *dev)
>> 			hv_context.cpu_context, VMBUS_CONNECT_CPU);
>> 	struct vmbus_channel *channel, *sc;
>> 
>> -	tasklet_disable(&hv_cpu->msg_dpc);
>> +	disable_work_sync(&hv_cpu->msg_dpc);
>> 	vmbus_connection.ignore_any_offer_msg = true;
>> -	/* The tasklet_enable() takes care of providing a memory barrier */
>> -	tasklet_enable(&hv_cpu->msg_dpc);
>> +	/* The enable_and_queue_work() takes care of
>> +	 * providing a memory barrier
>> +	 */
>> +	enable_and_queue_work(system_bh_wq, &hv_cpu->msg_dpc);
>> 
>> 	/* Drain all the workqueues as we are in suspend */
>> 	drain_workqueue(vmbus_connection.rescind_work_queue);
>> @@ -2692,7 +2694,7 @@ static void __exit vmbus_exit(void)
>> 		struct hv_per_cpu_context *hv_cpu
>> 			= per_cpu_ptr(hv_context.cpu_context, cpu);
>> 
>> -		tasklet_kill(&hv_cpu->msg_dpc);
>> +		cancel_work_sync(&hv_cpu->msg_dpc);
>> 	}
>> 	hv_debug_rm_all_dir();
>> 
>> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
>> index 6ef0557b4bff..db3d85ea5ce6 100644
>> --- a/include/linux/hyperv.h
>> +++ b/include/linux/hyperv.h
>> @@ -882,7 +882,7 @@ struct vmbus_channel {
>> 	bool out_full_flag;
>> 
>> 	/* Channel callback's invoked in softirq context */
>> -	struct tasklet_struct callback_event;
>> +	struct work_struct callback_event;
>> 	void (*onchannel_callback)(void *context);
>> 	void *channel_callback_context;
>> 
>> --
>> 2.17.1
>> 


^ permalink raw reply	[relevance 79%]

* [PATCH] rust: remove unneeded `kernel::prelude` imports from doctests
@ 2024-04-11 22:53 70% Nell Shamrell-Harrington
  0 siblings, 0 replies; 200+ results
From: Nell Shamrell-Harrington @ 2024-04-11 22:53 UTC (permalink / raw)
  To: ojeda, alex.gaynor, wedsonaf
  Cc: boqun.feng, gary, bjorn3_gh, benno.lossin, a.hindborg, aliceryhl,
	fujita.tomonori, tmgross, yakoyoku, kent.overstreet,
	matthew.brost, kernel, netdev, rust-for-linux, linux-kernel

Rust doctests implicitly include `kernel::prelude::*`.

Remove the explicit `kernel::prelude` imports from doctests.

Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://github.com/Rust-for-Linux/linux/issues/1064
Signed-off-by: Nell Shamrell-Harrington <nells@linux.microsoft.com>
---
 rust/kernel/init.rs      | 6 +++---
 rust/kernel/net/phy.rs   | 1 -
 rust/kernel/workqueue.rs | 3 ---
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
index 424257284d16..8f0380697c09 100644
--- a/rust/kernel/init.rs
+++ b/rust/kernel/init.rs
@@ -87,7 +87,7 @@
 //!
 //! ```rust
 //! # #![allow(clippy::disallowed_names)]
-//! # use kernel::{sync::Mutex, prelude::*, new_mutex, init::PinInit, try_pin_init};
+//! # use kernel::{sync::Mutex, new_mutex, init::PinInit, try_pin_init};
 //! #[pin_data]
 //! struct DriverData {
 //!     #[pin]
@@ -121,7 +121,7 @@
 //!
 //! ```rust
 //! # #![allow(unreachable_pub, clippy::disallowed_names)]
-//! use kernel::{prelude::*, init, types::Opaque};
+//! use kernel::{init, types::Opaque};
 //! use core::{ptr::addr_of_mut, marker::PhantomPinned, pin::Pin};
 //! # mod bindings {
 //! #     #![allow(non_camel_case_types)]
@@ -412,7 +412,7 @@ macro_rules! stack_try_pin_init {
 ///
 /// ```rust
 /// # #![allow(clippy::disallowed_names)]
-/// # use kernel::{init, pin_init, prelude::*, init::*};
+/// # use kernel::{init, pin_init, init::*};
 /// # use core::pin::Pin;
 /// # #[pin_data]
 /// # struct Foo {
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 96e09c6e8530..d10a415c376f 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -766,7 +766,6 @@ const fn as_int(&self) -> u32 {
 /// # mod module_phy_driver_sample {
 /// use kernel::c_str;
 /// use kernel::net::phy::{self, DeviceId};
-/// use kernel::prelude::*;
 ///
 /// kernel::module_phy_driver! {
 ///     drivers: [PhySample],
diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index c22504d5c8ad..7884f0007b38 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -33,7 +33,6 @@
 //! we do not need to specify ids for the fields.
 //!
 //! ```
-//! use kernel::prelude::*;
 //! use kernel::sync::Arc;
 //! use kernel::workqueue::{self, impl_has_work, new_work, Work, WorkItem};
 //!
@@ -75,7 +74,6 @@
 //! The following example shows how multiple `work_struct` fields can be used:
 //!
 //! ```
-//! use kernel::prelude::*;
 //! use kernel::sync::Arc;
 //! use kernel::workqueue::{self, impl_has_work, new_work, Work, WorkItem};
 //!
@@ -411,7 +409,6 @@ pub unsafe fn raw_get(ptr: *const Self) -> *mut bindings::work_struct {
 /// like this:
 ///
 /// ```no_run
-/// use kernel::prelude::*;
 /// use kernel::workqueue::{impl_has_work, Work};
 ///
 /// struct MyWorkItem {
-- 
2.34.1


^ permalink raw reply related	[relevance 70%]

* Re: [PATCH v2] ACPI: CPPC: Fix access width used for PCC registers
  @ 2024-04-11 22:07 79% ` Easwar Hariharan
  0 siblings, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-11 22:07 UTC (permalink / raw)
  To: Vanshidhar Konda, Jarred White
  Cc: Rafael J . Wysocki, linux-acpi, linux-kernel, 5 . 15+

On 4/11/2024 2:23 PM, Vanshidhar Konda wrote:
> commit 2f4a4d63a193be6fd530d180bb13c3592052904c modified
> cpc_read/cpc_write to use access_width to read CPC registers. For PCC
> registers the access width field in the ACPI register macro specifies
> the PCC subspace id. For a non-zero PCC subspace id, that field is
> incorrectly treated as an access width. This causes errors when reading
> from PCC registers in the CPPC driver.
> 
> For PCC registers base the size of read/write on the bit width field.
> The debug message in cpc_read/cpc_write is updated to print relevant
> information for the address space type used to read the register.
> 
> Signed-off-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
> Tested-by: Jarred White <jarredwhite@linux.microsoft.com>
> Reviewed-by: Jarred White <jarredwhite@linux.microsoft.com>
> Cc: 5.15+ <stable@vger.kernel.org> # 5.15+
> ---
> 
> When testing v6.9-rc1 kernel on AmpereOne system dmesg showed that
> cpufreq policy had failed to initialize on some cores during boot because
> cpufreq->get() always returned 0. On this system CPPC registers are in PCC
> subspace index 2 that are 32 bits wide. With this patch the CPPC driver
> interpreted the access width field as 16 bits, causing the register read
> to roll over too quickly to provide valid values during frequency
> computation.
> 
> v2:
> - Use size variable in debug print message
> - Use size instead of reg->bit_width for acpi_os_read_memory and
>   acpi_os_write_memory
> 
>  drivers/acpi/cppc_acpi.c | 53 ++++++++++++++++++++++++++++------------
>  1 file changed, 37 insertions(+), 16 deletions(-)

Thanks for adding the CC: stable tag. Couple of nits, assuming those are fixed:

Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
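
To make the failure mode in the commit message concrete, here is a small
userspace sketch of the width selection. The struct and function names
are invented for illustration, and sketch_get_bit_width() is written
from memory of the GET_BIT_WIDTH() macro in drivers/acpi/cppc_acpi.c, so
treat its exact form as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for struct cpc_reg; only the fields used here. */
struct sketch_cpc_reg {
	uint8_t bit_width;
	uint8_t access_width;	/* for PCC registers: the subspace id */
};

/*
 * Width taken from access_width when it is set (1 = 8 bits, 2 = 16,
 * 3 = 32, 4 = 64), otherwise from bit_width.
 */
static unsigned int sketch_get_bit_width(const struct sketch_cpc_reg *reg)
{
	return reg->access_width ? 8u << (reg->access_width - 1)
				 : reg->bit_width;
}

/* Size selection as the patch describes it: PCC uses bit_width only. */
static unsigned int sketch_reg_size(const struct sketch_cpc_reg *reg,
				    bool is_pcc)
{
	return is_pcc ? reg->bit_width : sketch_get_bit_width(reg);
}
```

For a 32-bit register in PCC subspace 2 (bit_width = 32,
access_width = 2), the access_width path gives 16 bits, which matches
the truncated reads reported on AmpereOne; selecting bit_width for PCC
registers gives the intended 32.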

> 
> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
> index 4bfbe55553f4..a037e9d15f48 100644
> --- a/drivers/acpi/cppc_acpi.c
> +++ b/drivers/acpi/cppc_acpi.c
> @@ -1002,14 +1002,14 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
>  	}
>  
>  	*val = 0;
> +	size = GET_BIT_WIDTH(reg);
>  
>  	if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
> -		u32 width = GET_BIT_WIDTH(reg);
>  		u32 val_u32;
>  		acpi_status status;
>  
>  		status = acpi_os_read_port((acpi_io_address)reg->address,
> -					   &val_u32, width);
> +					   &val_u32, size);
>  		if (ACPI_FAILURE(status)) {
>  			pr_debug("Error: Failed to read SystemIO port %llx\n",
>  				 reg->address);
> @@ -1018,17 +1018,22 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
>  
>  		*val = val_u32;
>  		return 0;
> -	} else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
> +	} else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
> +		/*
> +		 * For registers in PCC space, the register size is determined
> +		 * by the bit width field; the access size is used to indicate
> +		 * the PCC subspace id.
> +		 */
> +		size = reg->bit_width;
>  		vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
> +	}
>  	else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
>  		vaddr = reg_res->sys_mem_vaddr;
>  	else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
>  		return cpc_read_ffh(cpu, reg, val);
>  	else
>  		return acpi_os_read_memory((acpi_physical_address)reg->address,
> -				val, reg->bit_width);
> -
> -	size = GET_BIT_WIDTH(reg);
> +				val, size);
>  
>  	switch (size) {
>  	case 8:
> @@ -1044,8 +1049,13 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
>  		*val = readq_relaxed(vaddr);
>  		break;
>  	default:
> -		pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
> -			 reg->bit_width, pcc_ss_id);
> +		if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
> +			pr_debug("Error: Cannot read %u width from for system memory: 0x%llx\n",
> +				size, reg->address);

Nit: from for? There might be a missing word there, or just an extra. Ditto for cpc_write() below.

> +		} else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
> +			pr_debug("Error: Cannot read %u bit width to PCC for ss: %d\n",
> +				size, pcc_ss_id);
> +		}
>  		return -EFAULT;
>  	}
>  
> @@ -1063,12 +1073,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
>  	int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
>  	struct cpc_reg *reg = &reg_res->cpc_entry.reg;
>  
> +	size = GET_BIT_WIDTH(reg);
> +
>  	if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
> -		u32 width = GET_BIT_WIDTH(reg);
>  		acpi_status status;
>  
>  		status = acpi_os_write_port((acpi_io_address)reg->address,
> -					    (u32)val, width);
> +					    (u32)val, size);
>  		if (ACPI_FAILURE(status)) {
>  			pr_debug("Error: Failed to write SystemIO port %llx\n",
>  				 reg->address);
> @@ -1076,17 +1087,22 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
>  		}
>  
>  		return 0;
> -	} else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
> +	} else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
> +		/*
> +		 * For registers in PCC space, the register size is determined
> +		 * by the bit width field; the access size is used to indicate
> +		 * the PCC subspace id.
> +		 */
> +		size = reg->bit_width;
>  		vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
> +	}
>  	else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
>  		vaddr = reg_res->sys_mem_vaddr;
>  	else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
>  		return cpc_write_ffh(cpu, reg, val);
>  	else
>  		return acpi_os_write_memory((acpi_physical_address)reg->address,
> -				val, reg->bit_width);
> -
> -	size = GET_BIT_WIDTH(reg);
> +				val, size);
>  
>  	if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
>  		val = MASK_VAL(reg, val);
> @@ -1105,8 +1121,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
>  		writeq_relaxed(val, vaddr);
>  		break;
>  	default:
> -		pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
> -			 reg->bit_width, pcc_ss_id);
> +		if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
> +			pr_debug("Error: Cannot write %u width from for system memory: 0x%llx\n",
> +				size, reg->address);
> +		} else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
> +			pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
> +				size, pcc_ss_id);
> +		}
>  		ret_val = -EFAULT;
>  		break;
>  	}


^ permalink raw reply	[relevance 79%]

* Re: [PATCH 5.15 00/57] 5.15.155-rc1 review
  @ 2024-04-11 18:36 79% ` Easwar Hariharan
  2024-04-12 22:22 79% ` Kelsey Steele
  1 sibling, 0 replies; 200+ results
From: Easwar Hariharan @ 2024-04-11 18:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor, allen.lkml, broonie

On 4/11/2024 2:57 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.155 release.
> There are 57 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.155-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

<snip>

I wanted to repeat here my request from another thread [1]: please revert
commit 4949affd5288 ("ACPI: CPPC: Use access_width over bit_width for
system memory accesses") in 5.15.155 because of known problems with the
patch. I am repeating it so that it is not lost in the mail storm.

Thanks,
Easwar

[1] https://lore.kernel.org/all/97d25ef7-dee9-4cc5-842a-273f565869b3@linux.microsoft.com/

^ permalink raw reply	[relevance 79%]

* Re: Copying TLS/user register data per perf-sample?
  @ 2024-04-11 15:58 79%       ` Beau Belgrave
  0 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-11 15:58 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Namhyung Kim, linux-trace-kernel, linux-kernel

On Fri, Apr 12, 2024 at 12:55:19AM +0900, Masami Hiramatsu wrote:
> On Wed, 10 Apr 2024 08:35:42 -0700
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
> 
> > On Wed, Apr 10, 2024 at 10:06:28PM +0900, Masami Hiramatsu wrote:
> > > On Thu, 4 Apr 2024 12:26:41 -0700
> > > Beau Belgrave <beaub@linux.microsoft.com> wrote:
> > > 
> > > > Hello,
> > > > 
> > > > I'm looking into the possibility of capturing user data that is pointed
> > > > to by a user register (IE: fs/gs for TLS on x86/64) for each sample via
> > > > perf_events.
> > > > 
> > > > I was hoping to find a way to do this similar to PERF_SAMPLE_STACK_USER.
> > > > I think it could even use roughly the same ABI in the perf ring buffer.
> > > > Or it may be possible by some kprobe linked to the perf sample function.
> > > > 
> > > > This would allow a profiler to collect TLS (or other values) on x64. In
> > > > the Open Telemetry profiling SIG [1], we are trying to find a fast way
> > > > to grab a tracing association quickly on a per-thread basis. The team
> > > > at Elastic has a bespoke way to do this [2], however, I'd like to see a
> > > > more general way to achieve this. The folks I've been talking with seem
> > > > open to the idea of just having a TLS value for this we could capture
> > > > upon each sample. We could then just state, Open Telemetry SDKs should
> > > > have a TLS value for span correlation. However, we need a way to sample
> > > > the TLS value(s) when a sampling event is generated.
> > > > 
> > > > Is this already possible via some other means? It'd be great to be able
> > > > to do this directly at the perf_event sample via the ABI or a probe.
> > > > 
> > > 
> > > Have you tried to use uprobes? It should be able to access user-space
> > > registers including fs/gs.
> > > 
> > 
> > We need to get fs/gs during a sample interrupt from perf. If the sample
> > interrupt lands during kernel code (IE: syscall) we would also like to
> > get these TLS values when in process context.
> 
> OK, those are not directly accessible from pt_regs.
> 

Yeah, it's a per-arch thread attribute.

> > 
> > I have some patches into the kernel to make this possible via
> > perf_events that works well, however, I don't want to reinvent the wheel
> > if there is some way to get these via perf samples already.
> 
> I would like to see it. I think it is possible to introduce a helper
> to get a base address of user TLS for probe events, and start supporting
> from x86.
> 

For sure, I'm hoping the patches start the right conversations.

> > 
> > In OTel, we are trying to attribute samples to transactions that are
> > occurring. So the TLS fetch has to be aligned exactly with the sample.
> > You can do this via eBPF when it's available, however, we have
> > environments where eBPF is not available.
> > 
> > It's sounding like to do this properly without eBPF a new feature would
> > be required. If so, I do have some patches I can share in a bit as an
> > RFC.
> 
> It is better to be shared in RFC stage, so that we can discuss it from
> the direction level.
> 

Agreed. It may be that the ability to run a probe on each sample is the
better option; I'm not sure.

Thanks,
-Beau

> Thank you,
> 
> > 
> > Thanks,
> > -Beau
> > 
> > > Thank you,
> > > 
> > > -- 
> > > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[relevance 79%]

* RE: [PATCH rdma-next 1/1] RDMA/mana_ib: remove useless return values from dbg prints
  @ 2024-04-11  4:18 79% ` Long Li
  0 siblings, 0 replies; 200+ results
From: Long Li @ 2024-04-11  4:18 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, sharmaajay, jgg, leon
  Cc: linux-rdma, linux-kernel

> Subject: [PATCH rdma-next 1/1] RDMA/mana_ib: remove useless return
> values from dbg prints
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Remove printing ret value on success as it was always 0.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>

^ permalink raw reply	[relevance 79%]

* Re: Copying TLS/user register data per perf-sample?
  @ 2024-04-10 15:37 79%   ` Beau Belgrave
  0 siblings, 0 replies; 200+ results
From: Beau Belgrave @ 2024-04-10 15:37 UTC (permalink / raw)
  To: Namhyung Kim; +Cc: Masami Hiramatsu, linux-trace-kernel, linux-kernel

On Tue, Apr 09, 2024 at 04:32:46PM -0700, Namhyung Kim wrote:
> Hello,
> 
> On Thu, Apr 4, 2024 at 12:26 PM Beau Belgrave <beaub@linux.microsoft.com> wrote:
> >
> > Hello,
> >
> > I'm looking into the possibility of capturing user data that is pointed
> > to by a user register (IE: fs/gs for TLS on x86/64) for each sample via
> > perf_events.
> >
> > I was hoping to find a way to do this similar to PERF_SAMPLE_STACK_USER.
> > I think it could even use roughly the same ABI in the perf ring buffer.
> > Or it may be possible by some kprobe linked to the perf sample function.
> >
> > This would allow a profiler to collect TLS (or other values) on x64. In
> > the Open Telemetry profiling SIG [1], we are trying to find a fast way
> > to grab a tracing association quickly on a per-thread basis. The team
> > at Elastic has a bespoke way to do this [2], however, I'd like to see a
> > more general way to achieve this. The folks I've been talking with seem
> > open to the idea of just having a TLS value for this we could capture
> > upon each sample. We could then just state, Open Telemetry SDKs should
> > have a TLS value for span correlation. However, we need a way to sample
> > the TLS value(s) when a sampling event is generated.
> >
> > Is this already possible via some other means? It'd be great to be able
> > to do this directly at the perf_event sample via the ABI or a probe.
> 
> I don't think the current perf ABI allows capturing %fs/%gs + offset.
> IIRC kprobes/uprobes don't have that too but I could be wrong.
> 

Yeah, I didn't see it either. I have some patches that I will submit in
a bit as RFC that enable this functionality. I was hoping there was
already an easy way to do this.

Thanks,
-Beau

> Thanks,
> Namhyung
> 
> >
> > 1. https://opentelemetry.io/blog/2024/profiling/
> > 2. https://www.elastic.co/blog/continuous-profiling-distributed-tracing-correlation

^ permalink raw reply	[relevance 79%]

* Re: Copying TLS/user register data per perf-sample?
  @ 2024-04-10 15:35 79%   ` Beau Belgrave
    0 siblings, 1 reply; 200+ results
From: Beau Belgrave @ 2024-04-10 15:35 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Namhyung Kim, linux-trace-kernel, linux-kernel

On Wed, Apr 10, 2024 at 10:06:28PM +0900, Masami Hiramatsu wrote:
> On Thu, 4 Apr 2024 12:26:41 -0700
> Beau Belgrave <beaub@linux.microsoft.com> wrote:
> 
> > Hello,
> > 
> > I'm looking into the possibility of capturing user data that is pointed
> > to by a user register (IE: fs/gs for TLS on x86/64) for each sample via
> > perf_events.
> > 
> > I was hoping to find a way to do this similar to PERF_SAMPLE_STACK_USER.
> > I think it could even use roughly the same ABI in the perf ring buffer.
> > Or it may be possible by some kprobe linked to the perf sample function.
> > 
> > This would allow a profiler to collect TLS (or other values) on x64. In
> > the Open Telemetry profiling SIG [1], we are trying to find a fast way
> > to grab a tracing association quickly on a per-thread basis. The team
> > at Elastic has a bespoke way to do this [2], however, I'd like to see a
> > more general way to achieve this. The folks I've been talking with seem
> > open to the idea of just having a TLS value for this we could capture
> > upon each sample. We could then just state, Open Telemetry SDKs should
> > have a TLS value for span correlation. However, we need a way to sample
> > the TLS value(s) when a sampling event is generated.
> > 
> > Is this already possible via some other means? It'd be great to be able
> > to do this directly at the perf_event sample via the ABI or a probe.
> > 
> 
> Have you tried to use uprobes? It should be able to access user-space
> registers including fs/gs.
> 

We need to get fs/gs during a sample interrupt from perf. If the sample
interrupt lands in kernel code (e.g., during a syscall), we would also
like to get these TLS values when in process context.

I have some kernel patches that make this possible via perf_events and
work well; however, I don't want to reinvent the wheel if there is
already some way to get these values via perf samples.

In OTel, we are trying to attribute samples to transactions that are
occurring. So the TLS fetch has to be aligned exactly with the sample.
You can do this via eBPF when it's available, however, we have
environments where eBPF is not available.

It sounds like doing this properly without eBPF would require a new
feature. If so, I have some patches I can share shortly as an RFC.

Thanks,
-Beau

> Thank you,
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
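
To make the "roughly the same ABI as PERF_SAMPLE_STACK_USER" idea from
this thread concrete, here is a minimal user-space sketch of parsing such
a dump from a sample record. It assumes the layout I understand
PERF_SAMPLE_STACK_USER to use (a u64 size, then size bytes of data, then
a u64 dyn_size that is present only when size is non-zero). Whether a
TLS dump would reuse that layout is only the proposal under discussion;
the struct and function names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parsed view of one user-memory dump inside a sample. */
struct user_dump {
	const unsigned char *data;	/* points into the record, not copied */
	uint64_t size;			/* bytes reserved in the record */
	uint64_t dyn_size;		/* bytes that were actually captured */
};

/*
 * Parse one dump at buf. Returns the number of bytes consumed, or 0 if
 * the record is truncated or inconsistent.
 */
static size_t parse_user_dump(const unsigned char *buf, size_t len,
			      struct user_dump *out)
{
	uint64_t size, dyn_size;

	out->data = NULL;
	out->size = 0;
	out->dyn_size = 0;

	if (len < sizeof(size))
		return 0;
	memcpy(&size, buf, sizeof(size));

	/* Nothing was dumped: only the size field is present. */
	if (size == 0)
		return sizeof(size);

	/* The data and the trailing dyn_size must both fit in the record. */
	if (size > len - sizeof(size) ||
	    len - sizeof(size) - size < sizeof(dyn_size))
		return 0;

	memcpy(&dyn_size, buf + sizeof(size) + (size_t)size, sizeof(dyn_size));
	if (dyn_size > size)
		return 0;

	out->data = buf + sizeof(size);
	out->size = size;
	out->dyn_size = dyn_size;
	return sizeof(size) + (size_t)size + sizeof(dyn_size);
}
```

A profiler would read only the first dyn_size bytes of data; the rest of
the reserved area is padding when the copy from user memory came up short.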

^ permalink raw reply	[relevance 79%]

* [PATCH rdma-next v3 6/6] RDMA/mana_ib: Configure mac address in RNIC
    2024-04-10  8:42 73% ` [PATCH rdma-next v3 4/6] RDMA/mana_ib: enable RoCE on port 1 Konstantin Taranov
  2024-04-10  8:42 63% ` [PATCH rdma-next v3 5/6] RDMA/mana_ib: adding and deleting GIDs Konstantin Taranov
@ 2024-04-10  8:42 68% ` Konstantin Taranov
  2 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-10  8:42 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Set local mac address in RNIC, which is required by the HW.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/device.c  |  9 +++++++++
 drivers/infiniband/hw/mana/main.c    | 22 ++++++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h | 15 +++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 71923e5d0570..97a9f7a2d185 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -58,6 +58,7 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 	struct net_device *upper_ndev;
 	struct mana_context *mc;
 	struct mana_ib_dev *dev;
+	u8 mac_addr[ETH_ALEN];
 	int ret;
 
 	mc = mdev->driver_data;
@@ -89,6 +90,7 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 		ibdev_err(&dev->ib_dev, "Failed to get master netdev");
 		goto free_ib_device;
 	}
+	ether_addr_copy(mac_addr, upper_ndev->dev_addr);
 	ret = ib_device_set_netdev(&dev->ib_dev, upper_ndev, 1);
 	rcu_read_unlock();
 	if (ret) {
@@ -121,6 +123,13 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 	if (ret)
 		goto destroy_eqs;
 
+	ret = mana_ib_gd_config_mac(dev, ADDR_OP_ADD, mac_addr);
+	if (ret) {
+		ibdev_err(&dev->ib_dev, "Failed to add Mac address, ret %d",
+			  ret);
+		goto destroy_rnic;
+	}
+
 	ret = ib_register_device(&dev->ib_dev, "mana_%d",
 				 mdev->gdma_context->dev);
 	if (ret)
diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 09d29cf538dc..5e037603d130 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -784,3 +784,25 @@ int mana_ib_gd_del_gid(const struct ib_gid_attr *attr, void **context)
 
 	return 0;
 }
+
+int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8 *mac)
+{
+	struct mana_rnic_config_mac_addr_resp resp = {};
+	struct mana_rnic_config_mac_addr_req req = {};
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	int err;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CONFIG_MAC_ADDR, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.op = op;
+	copy_in_reverse(req.mac_addr, mac, ETH_ALEN);
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to config Mac addr err %d", err);
+		return err;
+	}
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 89ac5b39dbce..4c1240da0c5f 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -117,6 +117,7 @@ enum mana_ib_command_code {
 	MANA_IB_CREATE_ADAPTER  = 0x30002,
 	MANA_IB_DESTROY_ADAPTER = 0x30003,
 	MANA_IB_CONFIG_IP_ADDR	= 0x30004,
+	MANA_IB_CONFIG_MAC_ADDR	= 0x30005,
 };
 
 struct mana_ib_query_adapter_caps_req {
@@ -188,6 +189,18 @@ struct mana_rnic_config_addr_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+struct mana_rnic_config_mac_addr_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	enum mana_ib_addr_op op;
+	u8 mac_addr[ETH_ALEN];
+	u8 reserved[6];
+}; /* HW Data */
+
+struct mana_rnic_config_mac_addr_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -305,4 +318,6 @@ enum rdma_link_layer mana_ib_get_link_layer(struct ib_device *device, u32 port_n
 int mana_ib_gd_add_gid(const struct ib_gid_attr *attr, void **context);
 
 int mana_ib_gd_del_gid(const struct ib_gid_attr *attr, void **context);
+
+int mana_ib_gd_config_mac(struct mana_ib_dev *mdev, enum mana_ib_addr_op op, u8 *mac);
 #endif
-- 
2.43.0


^ permalink raw reply related	[relevance 68%]

* [PATCH rdma-next v3 5/6] RDMA/mana_ib: adding and deleting GIDs
    2024-04-10  8:42 73% ` [PATCH rdma-next v3 4/6] RDMA/mana_ib: enable RoCE on port 1 Konstantin Taranov
@ 2024-04-10  8:42 63% ` Konstantin Taranov
  2024-04-10  8:42 68% ` [PATCH rdma-next v3 6/6] RDMA/mana_ib: Configure mac address in RNIC Konstantin Taranov
  2 siblings, 0 replies; 200+ results
From: Konstantin Taranov @ 2024-04-10  8:42 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement add_gid and del_gid for RNIC.
IPv4 and IPv6 addresses are supported.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/device.c  |  2 +
 drivers/infiniband/hw/mana/main.c    | 60 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h | 35 ++++++++++++++++
 3 files changed, 97 insertions(+)

diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index e7981301d10b..71923e5d0570 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -15,6 +15,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.driver_id = RDMA_DRIVER_MANA,
 	.uverbs_abi_ver = MANA_IB_UVERBS_ABI_VERSION,
 
+	.add_gid = mana_ib_gd_add_gid,
 	.alloc_pd = mana_ib_alloc_pd,
 	.alloc_ucontext = mana_ib_alloc_ucontext,
 	.create_cq = mana_ib_create_cq,
@@ -23,6 +24,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.create_wq = mana_ib_create_wq,
 	.dealloc_pd = mana_ib_dealloc_pd,
 	.dealloc_ucontext = mana_ib_dealloc_ucontext,
+	.del_gid = mana_ib_gd_del_gid,
 	.dereg_mr = mana_ib_dereg_mr,
 	.destroy_cq = mana_ib_destroy_cq,
 	.destroy_qp = mana_ib_destroy_qp,
diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 7a9d7e13b7b1..09d29cf538dc 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -724,3 +724,63 @@ int mana_ib_gd_destroy_rnic_adapter(struct mana_ib_dev *mdev)
 
 	return 0;
 }
+
+int mana_ib_gd_add_gid(const struct ib_gid_attr *attr, void **context)
+{
+	struct mana_ib_dev *mdev = container_of(attr->device, struct mana_ib_dev, ib_dev);
+	enum rdma_network_type ntype = rdma_gid_attr_network_type(attr);
+	struct mana_rnic_config_addr_resp resp = {};
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_config_addr_req req = {};
+	int err;
+
+	if (ntype != RDMA_NETWORK_IPV4 && ntype != RDMA_NETWORK_IPV6) {
+		ibdev_dbg(&mdev->ib_dev, "Unsupported rdma network type %d", ntype);
+		return -EINVAL;
+	}
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CONFIG_IP_ADDR, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.op = ADDR_OP_ADD;
+	req.sgid_type = (ntype == RDMA_NETWORK_IPV6) ? SGID_TYPE_IPV6 : SGID_TYPE_IPV4;
+	copy_in_reverse(req.ip_addr, attr->gid.raw, sizeof(union ib_gid));
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to config IP addr err %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
+
+int mana_ib_gd_del_gid(const struct ib_gid_attr *attr, void **context)
+{
+	struct mana_ib_dev *mdev = container_of(attr->device, struct mana_ib_dev, ib_dev);
+	enum rdma_network_type ntype = rdma_gid_attr_network_type(attr);
+	struct mana_rnic_config_addr_resp resp = {};
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_config_addr_req req = {};
+	int err;
+
+	if (ntype != RDMA_NETWORK_IPV4 && ntype != RDMA_NETWORK_IPV6) {
+		ibdev_dbg(&mdev->ib_dev, "Unsupported rdma network type %d", ntype);
+		return -EINVAL;
+	}
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CONFIG_IP_ADDR, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.op = ADDR_OP_REMOVE;
+	req.sgid_type = (ntype == RDMA_NETWORK_IPV6) ? SGID_TYPE_IPV6 : SGID_TYPE_IPV4;
+	copy_in_reverse(req.ip_addr, attr->gid.raw, sizeof(union ib_gid));
+
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to config IP addr err %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index b9117cbc7629..89ac5b39dbce 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -116,6 +116,7 @@ enum mana_ib_command_code {
 	MANA_IB_GET_ADAPTER_CAP = 0x30001,
 	MANA_IB_CREATE_ADAPTER  = 0x30002,
 	MANA_IB_DESTROY_ADAPTER = 0x30003,
+	MANA_IB_CONFIG_IP_ADDR	= 0x30004,
 };
 
 struct mana_ib_query_adapter_caps_req {
@@ -165,6 +166,28 @@ struct mana_rnic_destroy_adapter_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+enum mana_ib_addr_op {
+	ADDR_OP_ADD = 1,
+	ADDR_OP_REMOVE = 2,
+};
+
+enum sgid_entry_type {
+	SGID_TYPE_IPV4 = 1,
+	SGID_TYPE_IPV6 = 2,
+};
+
+struct mana_rnic_config_addr_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	enum mana_ib_addr_op op;
+	enum sgid_entry_type sgid_type;
+	u8 ip_addr[16];
+}; /* HW Data */
+
+struct mana_rnic_config_addr_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -181,6 +204,14 @@ static inline struct net_device *mana_ib_get_netdev(struct ib_device *ibdev, u32
 	return mc->ports[port - 1];
 }
 
+static inline void copy_in_reverse(u8 *dst, const u8 *src, u32 size)
+{
+	u32 i;
+
+	for (i = 0; i < size; i++)
+		dst[size - 1 - i] = src[i];
+}
+
 int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
 
 int mana_ib_create_zero_offset_dma_region(struct mana_ib_dev *dev, struct ib_umem *umem,
@@ -270,4 +301,8 @@ int mana_ib_gd_destroy_rnic_adapter(struct mana_ib_dev *mdev);
 int mana_ib_query_pkey(struct ib_device *ibdev, u32 port, u16 index, u16 *pkey);
 
 enum rdma_link_layer mana_ib_get_link_layer(struct ib_device *device, u32 port_num);
+
+int mana_ib_gd_add_gid(const struct ib_gid_attr *attr, void **context);
+
+int mana_ib_gd_del_gid(const struct ib_gid_attr *attr, void **context);
 #endif
-- 
2.43.0
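
As an illustration of the copy_in_reverse() helper added in this patch,
here is a small user-space sketch using <stdint.h> types in place of the
kernel's u8/u32. The patch does not say why the bytes are reversed; my
assumption is that the hardware expects the address in the opposite byte
order from the one the GID table and netdev use. The wrapper function and
the sample address are made up for the example.

```c
#include <stdint.h>

/* Same logic as the helper in the patch: dst gets src in reverse order. */
static inline void copy_in_reverse(uint8_t *dst, const uint8_t *src,
				   uint32_t size)
{
	uint32_t i;

	/* dst and src must not overlap, or bytes are clobbered mid-copy. */
	for (i = 0; i < size; i++)
		dst[size - 1 - i] = src[i];
}

/* Hypothetical usage: reverse a 6-byte MAC address into a request field. */
static void fill_reversed_mac(uint8_t out[6])
{
	static const uint8_t mac[6] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };

	copy_in_reverse(out, mac, 6);
}
```

After fill_reversed_mac(), out holds 55:44:33:22:11:00. The same helper
is applied to the 16-byte GID in add_gid/del_gid above.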


^ permalink raw reply related	[relevance 63%]

* [PATCH rdma-next v3 4/6] RDMA/mana_ib: enable RoCE on port 1
  @ 2024-04-10  8:42 73% ` Konstantin Taranov
    2024-04-10  8:42 63% ` [PATCH rdma-next v3 5/6] RDMA/mana_ib: adding and deleting GIDs Konstantin Taranov
  2024-04-10  8:42 68% ` [PATCH rdma-next v3 6/6] RDMA/mana_ib: Configure mac address in RNIC Konstantin Taranov
  2 siblings, 1 reply; 200+ results
From: Konstantin Taranov @ 2024-04-10  8:42 UTC (permalink / raw)
  To: kotaranov, sharmaajay, longli, jgg, leon; +Cc: linux-rdma, linux-kernel

From: Konstantin Taranov <kotaranov@microsoft.com>

Set netdev and RoCEv2 flag to enable GID population on port 1.
Use GIDs of the master netdev. As mc->ports[] stores slave devices,
use a helper to get the master netdev.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/device.c | 15 +++++++++++++++
 drivers/infiniband/hw/mana/main.c   | 15 +++++++++++----
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 47547a962b19..e7981301d10b 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -53,6 +53,7 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 {
 	struct mana_adev *madev = container_of(adev, struct mana_adev, adev);
 	struct gdma_dev *mdev = madev->mdev;
+	struct net_device *upper_ndev;
 	struct mana_context *mc;
 	struct mana_ib_dev *dev;
 	int ret;
@@ -79,6 +80,20 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 	dev->ib_dev.num_comp_vectors = 1;
 	dev->ib_dev.dev.parent = mdev->gdma_context->dev;
 
+	rcu_read_lock(); /* required to get upper dev */
+	upper_ndev = netdev_master_upper_dev_get_rcu(mc->ports[0]);
+	if (!upper_ndev) {
+		rcu_read_unlock();
+		ibdev_err(&dev->ib_dev, "Failed to get master netdev");
+		goto free_ib_device;
+	}
+	ret = ib_device_set_netdev(&dev->ib_dev, upper_ndev, 1);
+	rcu_read_unlock();
+	if (ret) {
+		ibdev_err(&dev->ib_dev, "Failed to set ib netdev, ret %d", ret);
+		goto free_ib_device;
+	}
+
 	ret = mana_gd_register_device(&mdev->gdma_context->mana_ib);
 	if (ret) {
 		ibdev_err(&dev->ib_dev, "Failed to register device, ret %d",
diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 29550f2173ff..7a9d7e13b7b1 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -527,11 +527,18 @@ int mana_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vma)
 int mana_ib_get_port_immutable(struct ib_device *ibdev, u32 port_num,
 			       struct ib_port_immutable *immutable)
 {
-	/*
-	 * This version only support RAW_PACKET
-	 * other values need to be filled for other types
-	 */
+	struct ib_port_attr attr;
+	int err;
+
+	err = ib_query_port(ibdev, port_num, &attr);
+	if (err)
+		return err;
+
+	immutable->pkey_tbl_len = attr.pkey_tbl_len;
+	immutable->gid_tbl_len = attr.gid_tbl_len;
 	immutable->core_cap_flags = RDMA_CORE_PORT_RAW_PACKET;
+	if (port_num == 1)
+		immutable->core_cap_flags |= RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
 
 	return 0;
 }
-- 
2.43.0
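
The capability logic in mana_ib_get_port_immutable() above boils down to
"every port supports raw packet, and port 1 additionally advertises
RoCEv2". A minimal sketch of that flag composition follows; the bit
values and names are placeholders for illustration, not the real
RDMA_CORE_PORT_* definitions.

```c
#include <stdint.h>

/* Placeholder bit values; the kernel's RDMA_CORE_PORT_* flags differ. */
#define CAP_RAW_PACKET		(1u << 0)
#define CAP_ROCE_UDP_ENCAP	(1u << 1)

/* Compose the per-port capability mask the way the patch does. */
static uint32_t port_core_cap_flags(uint32_t port_num)
{
	uint32_t flags = CAP_RAW_PACKET;

	/* Only port 1 is backed by the RNIC, so only it gets RoCEv2. */
	if (port_num == 1)
		flags |= CAP_ROCE_UDP_ENCAP;

	return flags;
}
```

Keeping the raw-packet bit unconditional preserves the behaviour of the
other ports, which is why the patch ORs in the RoCE flag rather than
replacing the mask.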


^ permalink raw reply related	[relevance 73%]

2024-02-20  9:50     [PATCH] x86/Kconfig: Allow NR_CPUS between 512 and 8192 Saurabh Sengar
2024-03-04 16:13     ` Saurabh Singh Sengar
2024-04-14 16:31 79%   ` Saurabh Singh Sengar
2024-02-27 17:14     [PATCH] PCI/sysfs: Fix race in pci sysfs creation Saurabh Singh Sengar
2024-02-28 15:22     ` Bjorn Helgaas
2024-02-28 17:22       ` Krzysztof Wilczyński
2024-04-15 18:15 79%     ` [EXTERNAL] " Saurabh Singh Sengar
2024-03-27 16:03     [PATCH 0/9] Convert Tasklets to BH Workqueues Allen Pais
2024-03-27 16:03     ` [PATCH 8/9] drivers/media/*: Convert from tasklet to BH workqueue Allen Pais
2024-04-24  9:12       ` Hans Verkuil
2024-04-24 16:48 65%     ` Allen Pais
2024-04-03 16:55     [PATCH v2 1/2] hyperv: " Allen Pais
2024-04-10 18:08     ` Michael Kelley
2024-04-11 23:58 79%   ` Allen Pais
2024-04-04 19:26     Copying TLS/user register data per perf-sample? Beau Belgrave
2024-04-09 23:32     ` Namhyung Kim
2024-04-10 15:37 79%   ` Beau Belgrave
2024-04-10 13:06     ` Masami Hiramatsu
2024-04-10 15:35 79%   ` Beau Belgrave
2024-04-11 15:55         ` Masami Hiramatsu
2024-04-11 15:58 79%       ` Beau Belgrave
2024-04-09  5:23     [PATCH] ACPI: CPPC: Fix bit_offset shift in MASK_VAL macro Jarred White
2024-04-16 17:24 79% ` Jarred White
2024-04-09 14:21     [PATCH rdma-next 1/1] RDMA/mana_ib: remove useless return values from dbg prints Konstantin Taranov
2024-04-11  4:18 79% ` Long Li
2024-04-10  8:42     [PATCH rdma-next v3 0/6] RDMA/mana_ib: Enable RNIC adapter and populate it with GIDs Konstantin Taranov
2024-04-10  8:42 73% ` [PATCH rdma-next v3 4/6] RDMA/mana_ib: enable RoCE on port 1 Konstantin Taranov
2024-04-22 19:37       ` Nathan Chancellor
2024-04-23  7:15 79%     ` Konstantin Taranov
2024-04-10  8:42 63% ` [PATCH rdma-next v3 5/6] RDMA/mana_ib: adding and deleting GIDs Konstantin Taranov
2024-04-10  8:42 68% ` [PATCH rdma-next v3 6/6] RDMA/mana_ib: Configure mac address in RNIC Konstantin Taranov
2024-04-11  9:55     [PATCH 6.6 000/114] 6.6.27-rc1 review Greg Kroah-Hartman
2024-04-12 22:24 79% ` Kelsey Steele
2024-04-11  9:56     [PATCH 6.1 00/83] 6.1.86-rc1 review Greg Kroah-Hartman
2024-04-12 22:23 79% ` Kelsey Steele
2024-04-11  9:57     [PATCH 5.15 00/57] 5.15.155-rc1 review Greg Kroah-Hartman
2024-04-11 18:36 79% ` Easwar Hariharan
2024-04-12 22:22 79% ` Kelsey Steele
2024-04-11 21:23     [PATCH v2] ACPI: CPPC: Fix access width used for PCC registers Vanshidhar Konda
2024-04-11 22:07 79% ` Easwar Hariharan
2024-04-11 22:53 70% [PATCH] rust: remove unneeded `kernel::prelude` imports from doctests Nell Shamrell-Harrington
2024-04-11 23:18     [PATCH v3] ACPI: CPPC: Fix access width used for PCC registers Vanshidhar Konda
2024-04-15 16:59 79% ` Jarred White
2024-04-12  0:17 60% [RFC PATCH 0/4] perf: Correlating user process data to samples Beau Belgrave
2024-04-12  0:17 63% ` [RFC PATCH 1/4] perf/core: Introduce perf_prepare_dump_data() Beau Belgrave
2024-04-12  0:17 52% ` [RFC PATCH 2/4] perf: Introduce PERF_SAMPLE_TLS_USER sample type Beau Belgrave
2024-04-12  0:17 68% ` [RFC PATCH 3/4] perf/core: Factor perf_output_sample_udump() Beau Belgrave
2024-04-12  0:17 72% ` [RFC PATCH 4/4] perf/x86/core: Add tls dump support Beau Belgrave
2024-04-12  4:52     ` [RFC PATCH 0/4] perf: Correlating user process data to samples Ian Rogers
2024-04-12 16:28 71%   ` Beau Belgrave
2024-04-12  7:12     ` Peter Zijlstra
2024-04-12 16:37 79%   ` Beau Belgrave
2024-04-12  5:28 38% [PATCH v3] Drivers: hv: Cosmetic changes for hv.c and balloon.c Aditya Nagesh
2024-04-18 15:06 79% ` Saurabh Singh Sengar
2024-04-12  8:47 71% [PATCH rdma-next 1/1] RDMA/mana_ib: Use num_comp_vectors of ib_device Konstantin Taranov
2024-04-13  0:55 24% [PATCH v17 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
2024-04-13  0:55 47% ` [PATCH v17 01/21] security: add ipe lsm Fan Wu
2024-04-13  0:55 33% ` [PATCH v17 02/21] ipe: add policy parser Fan Wu
2024-04-13  0:55 54% ` [PATCH v17 03/21] ipe: add evaluation loop Fan Wu
2024-04-13  0:55 45% ` [PATCH v17 04/21] ipe: add LSM hooks on execution and kernel read Fan Wu
2024-04-13  0:55 67% ` [PATCH v17 05/21] initramfs|security: Add a security hook to do_populate_rootfs() Fan Wu
2024-04-13  0:55 47% ` [PATCH v17 06/21] ipe: introduce 'boot_verified' as a trust provider Fan Wu
2024-04-13  0:55 69% ` [PATCH v17 07/21] security: add new securityfs delete function Fan Wu
2024-04-13  0:55 28% ` [PATCH v17 08/21] ipe: add userspace interface Fan Wu
2024-04-13  0:55 26% ` [PATCH v17 09/21] uapi|audit|ipe: add ipe auditing support Fan Wu
2024-04-13  0:55 47% ` [PATCH v17 10/21] ipe: add permissive toggle Fan Wu
2024-04-13  0:55 44% ` [PATCH v17 11/21] block,lsm: add LSM blob and new LSM hooks for block device Fan Wu
2024-04-13  0:55 70% ` [PATCH v17 12/21] dm: add finalize hook to target_type Fan Wu
2024-04-13  0:55 50% ` [PATCH v17 13/21] dm verity: consume root hash digest and expose signature data via LSM hook Fan Wu
2024-04-25  3:56       ` Eric Biggers
2024-04-25 20:23 76%     ` Fan Wu
2024-04-13  0:55 32% ` [PATCH v17 14/21] ipe: add support for dm-verity as a trust provider Fan Wu
2024-04-13  0:55 65% ` [PATCH v17 15/21] security: add security_inode_setintegrity() hook Fan Wu
2024-04-13  0:55 45% ` [PATCH v17 16/21] fsverity: expose verified fsverity built-in signatures to LSMs Fan Wu
2024-04-13  0:56 39% ` [PATCH v17 17/21] ipe: enable support for fs-verity as a trust provider Fan Wu
2024-04-13  0:56 47% ` [PATCH v17 18/21] scripts: add boot policy generation program Fan Wu
2024-04-13  0:56 48% ` [PATCH v17 19/21] ipe: kunit test for parser Fan Wu
2024-04-13  0:56 12% ` [PATCH v17 20/21] Documentation: add ipe documentation Fan Wu
2024-04-13  0:56 77% ` [PATCH v17 21/21] MAINTAINERS: ipe: add ipe maintainer information Fan Wu
2024-04-15  9:49 63% [PATCH net-next] net: mana: Add new device attributes for mana Shradha Gupta
2024-04-15 16:13     ` Jason Gunthorpe
2024-04-16  4:25 79%   ` Shradha Gupta
2024-04-16  4:27       ` Zhu Yanjun
2024-04-16 18:09         ` Andrew Lunn
2024-04-18  6:01 78%       ` Shradha Gupta
2024-04-18 17:50             ` Jason Gunthorpe
2024-04-18 18:42               ` Andrew Lunn
2024-04-19 16:59 76%             ` Shradha Gupta
2024-04-19 18:51                   ` Andrew Lunn
2024-04-22 10:08 79%                 ` Shradha Gupta
2024-04-18  5:51 79%     ` Shradha Gupta
2024-04-15 16:38 79% ` Saurabh Singh Sengar
2024-04-16  4:26 79%   ` Shradha Gupta
2024-04-18 21:29 79% ` Haiyang Zhang
2024-04-15 14:19     [PATCH 6.6 000/122] 6.6.28-rc1 review Greg Kroah-Hartman
2024-04-15 23:52 79% ` Kelsey Steele
2024-04-15 14:20     [PATCH 6.1 00/69] 6.1.87-rc1 review Greg Kroah-Hartman
2024-04-15 23:53 79% ` Kelsey Steele
2024-04-15 14:21     [PATCH 5.15 00/45] 5.15.156-rc1 review Greg Kroah-Hartman
2024-04-15 23:53 79% ` Kelsey Steele
2024-04-16 22:41 75% [PATCH 0/2] tracing/user_events: Fix non-spaced field matching Beau Belgrave
2024-04-16 22:41 66% ` [PATCH 1/2] " Beau Belgrave
2024-04-19  2:33       ` Masami Hiramatsu
2024-04-19 21:13 79%     ` Beau Belgrave
2024-04-20 12:50           ` Masami Hiramatsu
2024-04-22 21:55 79%         ` Beau Belgrave
2024-04-16 22:41 77% ` [PATCH 2/2] selftests/user_events: Add non-spacing separator check Beau Belgrave
2024-04-17  8:00 79% [PATCH] tools: hv: suppress the invalid warning for packed member alignment Saurabh Sengar
2024-04-17  8:17     ` Greg KH
2024-04-17  8:21 79%   ` Saurabh Singh Sengar
2024-04-17 14:20 79% [PATCH rdma-next 0/2] RDMA/mana_ib: Enable DMA-mapped memory regions Konstantin Taranov
2024-04-17 14:20 79% ` [PATCH rdma-next 1/2] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
2024-04-17 14:20 70% ` [PATCH rdma-next 2/2] RDMA/mana_ib: Implement get_dma_mr Konstantin Taranov
2024-04-17 14:51       ` Jason Gunthorpe
2024-04-19  9:14 79%     ` Konstantin Taranov
2024-04-19 12:13           ` Jason Gunthorpe
2024-04-22  9:12 79%         ` Konstantin Taranov
2024-04-18 10:28       ` Zhu Yanjun
2024-04-19  9:02 79%     ` Konstantin Taranov
2024-04-18 12:05     [PATCH v2] Add a header in ifcfg and nm keyfiles describing the owner of the files Ani Sinha
2024-04-18 16:15 79% ` Easwar Hariharan
2024-04-18 19:01 79%   ` Dexuan Cui
2024-04-19 16:51 79%     ` Shradha Gupta
2024-04-19 16:54 79% ` Shradha Gupta
2024-04-18 16:51 72% [PATCH rdma-next 0/6] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
2024-04-18 16:52 75% ` [PATCH rdma-next 1/6] RDMA/mana_ib: create EQs for " Konstantin Taranov
2024-04-23 23:24 79%   ` Long Li
2024-04-18 16:52 68% ` [PATCH rdma-next 2/6] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
2024-04-23 23:30 79%   ` Long Li
2024-04-18 16:52 73% ` [PATCH rdma-next 3/6] RDMA/mana_ib: replace duplicate cqe with buf_size Konstantin Taranov
2024-04-23 23:34 79%   ` Long Li
2024-04-24  8:43 79%     ` Konstantin Taranov
2024-04-25 20:17 79%       ` Long Li
2024-04-18 16:52 64% ` [PATCH rdma-next 4/6] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
2024-04-23 23:42 79%   ` Long Li
2024-04-24  8:50 79%     ` Konstantin Taranov
2024-04-25 20:29 79%       ` Long Li
2024-04-18 16:52 79% ` [PATCH rdma-next 5/6] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
2024-04-23 23:45 79%   ` Long Li
2024-04-24  8:58 79%     ` Konstantin Taranov
2024-04-25 20:31 79%   ` Long Li
2024-04-18 16:52 67% ` [PATCH rdma-next 6/6] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
2024-04-23 23:57 79%   ` Long Li
2024-04-19  1:53 76% [PATCH] PCI: Add a mutex to protect the global list pci_domain_busn_res_list Dexuan Cui
2024-04-19 15:07 79% ` Haiyang Zhang
2024-04-19  5:56 79% [PATCH] hv/vmbus_drv: rename hv_acpi_init() to vmbus_init() Erni Sri Satya Vennela
2024-04-23  3:18 38% [PATCH v4] Drivers: hv: Cosmetic changes for hv.c and balloon.c Aditya Nagesh
2024-04-23 14:15 79% [PATCH rdma-next 1/1] RDMA/mana_ib: fix missing ret value Konstantin Taranov
2024-04-23 16:23 74% [PATCH v2 0/2] tracing/user_events: Fix non-spaced field matching Beau Belgrave
2024-04-23 16:23 67% ` [PATCH v2 1/2] " Beau Belgrave
2024-05-02 21:16       ` Steven Rostedt
2024-05-02 22:58 79%     ` Beau Belgrave
2024-04-23 16:23 77% ` [PATCH v2 2/2] selftests/user_events: Add non-spacing separator check Beau Belgrave
2024-04-23 20:42     [PATCH v1 1/1] RDMA/mana_ib: Fix compilation error Andy Shevchenko
2024-04-23 23:58 79% ` Long Li
2024-04-24 10:32 78% [PATCH net-next v2 0/2] Add sysfs attributes for MANA Shradha Gupta
2024-04-24 10:33 72% ` [PATCH net-next v2 1/2] net: Add sysfs atttributes for max_mtu min_mtu Shradha Gupta
2024-04-25  3:27       ` Jakub Kicinski
2024-04-26 11:06 79%     ` Shradha Gupta
2024-04-24 10:34 72% ` [PATCH net-next v2 2/2] net: mana: Add new device attributes for mana Shradha Gupta
2024-04-24 14:48     ` [PATCH net-next v2 0/2] Add sysfs attributes for MANA Jiri Pirko
2024-04-30  5:31 79%   ` Shradha Gupta
2024-05-03  8:48 79%     ` Shradha Gupta
2024-04-25 18:31 19% [PATCH v3] media/*: Convert from tasklet to BH workqueue Allen Pais
2024-04-26 13:12 69% [PATCH rdma-next v2 0/5] RDMA/mana_ib: Implement RNIC CQs Konstantin Taranov
2024-04-26 13:12 75% ` [PATCH rdma-next v2 1/5] RDMA/mana_ib: create EQs for " Konstantin Taranov
2024-04-26 13:12 68% ` [PATCH rdma-next v2 2/5] RDMA/mana_ib: create and destroy RNIC cqs Konstantin Taranov
2024-04-26 13:12 64% ` [PATCH rdma-next v2 3/5] RDMA/mana_ib: introduce a helper to remove cq callbacks Konstantin Taranov
2024-04-26 13:12 79% ` [PATCH rdma-next v2 4/5] RDMA/mana_ib: boundary check before installing " Konstantin Taranov
2024-04-26 13:12 68% ` [PATCH rdma-next v2 5/5] RDMA/mana_ib: implement uapi for creation of rnic cq Konstantin Taranov
2024-04-29 18:08 79%   ` Long Li
2024-05-01 14:01 79%     ` Konstantin Taranov
2024-05-02 17:05 79%       ` Long Li
2024-05-02 17:04 79%   ` Long Li
2024-04-29 17:21 68% [RFC PATCH] fs/coredump: Enable dynamic configuration of max file note size Allen Pais
2024-04-29 19:04     [PATCH v3 00/12] mm/swap: clean up and optimize swap cache index Kairui Song
2024-04-29 19:04     ` [PATCH v3 05/12] cifs: drop usage of page_file_offset Kairui Song
2024-04-29 20:19 65%   ` [EXTERNAL] " Steven French
2024-04-29 20:26         ` Matthew Wilcox
2024-04-30  2:23 70%       ` Steven French
2024-04-30 17:37 54% [PATCH v1 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
2024-04-30 17:38 24% ` [PATCH v1 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
2024-04-30 17:38 46% ` [PATCH v1 02/12] drm/gma500: " Easwar Hariharan
2024-04-30 17:38 23% ` [PATCH v1 03/12] drm/i915: " Easwar Hariharan
2024-04-30 20:29       ` Rodrigo Vivi
2024-04-30 21:40 75%     ` Easwar Hariharan
2024-04-30 17:38 69% ` [PATCH v1 04/12] media: au0828: " Easwar Hariharan
2024-04-30 17:38 71% ` [PATCH v1 05/12] media: cobalt: " Easwar Hariharan
2024-04-30 17:38 56% ` [PATCH v1 06/12] media: cx18: " Easwar Hariharan
2024-04-30 17:38 70% ` [PATCH v1 07/12] media: cx25821: " Easwar Hariharan
2024-04-30 17:38 59% ` [PATCH v1 08/12] media: ivtv: " Easwar Hariharan
2024-04-30 17:38 65% ` [PATCH v1 09/12] media: cx23885: " Easwar Hariharan
2024-04-30 17:38 70% ` [PATCH v1 10/12] sfc: falcon: " Easwar Hariharan
2024-05-03 22:13       ` Jakub Kicinski
2024-05-06 15:54 76%     ` Easwar Hariharan
2024-04-30 17:38 70% ` [PATCH v1 11/12] fbdev/smscufx: " Easwar Hariharan
2024-04-30 17:38 47% ` [PATCH v1 12/12] fbdev/viafb: " Easwar Hariharan
2024-05-02 10:46       ` Thomas Zimmermann
2024-05-02 22:26 74%     ` Easwar Hariharan
2024-05-03  7:39           ` Thomas Zimmermann
2024-05-03 16:48 79%         ` Easwar Hariharan
2024-05-02 14:59 66% [PATCH v2] fs/coredump: Enable dynamic configuration of max file note size Allen Pais
2024-05-02 20:34 63% [PATCH 0/1] Convert tasklets to bottom half workqueues Allen Pais
2024-05-02 20:34 14% ` [PATCH] [RFC] scsi: Convert from tasklet to BH workqueue Allen Pais
2024-05-03  2:03       ` Michael Ellerman
2024-05-03 15:32 79%     ` Allen Pais
2024-05-02 23:56 63% [PATCH v3] fs/coredump: Enable dynamic configuration of max file note size Allen Pais
2024-05-03 18:13 51% [PATCH v2 00/12] Make I2C terminology more inclusive for I2C Algobit and consumers Easwar Hariharan
2024-05-03 18:13 22% ` [PATCH v2 01/12] drm/amdgpu, drm/radeon: Make I2C terminology more inclusive Easwar Hariharan
2024-05-07 18:16 67%   ` Easwar Hariharan
2024-05-03 18:13 46% ` [PATCH v2 02/12] drm/gma500: " Easwar Hariharan
2024-05-03 18:13 23% ` [PATCH v2 03/12] drm/i915: " Easwar Hariharan
2024-05-03 19:34       ` Rodrigo Vivi
2024-05-03 21:04 72%     ` Easwar Hariharan
2024-05-03 18:13 69% ` [PATCH v2 04/12] media: au0828: " Easwar Hariharan
2024-05-03 18:13 71% ` [PATCH v2 05/12] media: cobalt: " Easwar Hariharan
2024-05-03 18:13 56% ` [PATCH v2 06/12] media: cx18: " Easwar Hariharan
2024-05-03 18:13 63% ` [PATCH v2 07/12] media: cx25821: " Easwar Hariharan
2024-05-03 18:13 57% ` [PATCH v2 08/12] media: ivtv: " Easwar Hariharan
2024-05-03 18:13 58% ` [PATCH v2 09/12] media: cx23885: " Easwar Hariharan
2024-05-03 18:13 70% ` [PATCH v2 10/12] sfc: falcon: " Easwar Hariharan
2024-05-03 18:13 70% ` [PATCH v2 11/12] fbdev/smscufx: " Easwar Hariharan
2024-05-03 18:13 47% ` [PATCH v2 12/12] fbdev/viafb: " Easwar Hariharan
2024-05-03 22:32 24% [PATCH v18 00/21] Integrity Policy Enforcement LSM (IPE) Fan Wu
2024-05-03 22:32 47% ` [PATCH v18 01/21] security: add ipe lsm Fan Wu
2024-05-03 22:32 33% ` [PATCH v18 02/21] ipe: add policy parser Fan Wu
2024-05-03 22:32 54% ` [PATCH v18 03/21] ipe: add evaluation loop Fan Wu
2024-05-03 22:32 45% ` [PATCH v18 04/21] ipe: add LSM hooks on execution and kernel read Fan Wu
2024-05-03 22:32 67% ` [PATCH v18 05/21] initramfs|security: Add a security hook to do_populate_rootfs() Fan Wu
2024-05-03 22:32 47% ` [PATCH v18 06/21] ipe: introduce 'boot_verified' as a trust provider Fan Wu
2024-05-03 22:32 69% ` [PATCH v18 07/21] security: add new securityfs delete function Fan Wu
2024-05-03 22:32 28% ` [PATCH v18 08/21] ipe: add userspace interface Fan Wu
2024-05-03 22:32 26% ` [PATCH v18 09/21] uapi|audit|ipe: add ipe auditing support Fan Wu
2024-05-03 22:32 47% ` [PATCH v18 10/21] ipe: add permissive toggle Fan Wu
2024-05-03 22:32 44% ` [PATCH v18 11/21] block,lsm: add LSM blob and new LSM hooks for block device Fan Wu
2024-05-03 22:32 70% ` [PATCH v18 12/21] dm: add finalize hook to target_type Fan Wu
2024-05-03 22:32 50% ` [PATCH v18 13/21] dm verity: expose root hash digest and signature data to LSMs Fan Wu
2024-05-03 22:32 31% ` [PATCH v18 14/21] ipe: add support for dm-verity as a trust provider Fan Wu
2024-05-03 22:32 65% ` [PATCH v18 15/21] security: add security_inode_setintegrity() hook Fan Wu
2024-05-03 22:32 48% ` [PATCH v18 16/21] fsverity: expose verified fsverity built-in signatures to LSMs Fan Wu
2024-05-03 22:32 38% ` [PATCH v18 17/21] ipe: enable support for fs-verity as a trust provider Fan Wu
2024-05-03 22:32 47% ` [PATCH v18 18/21] scripts: add boot policy generation program Fan Wu
2024-05-03 22:32 48% ` [PATCH v18 19/21] ipe: kunit test for parser Fan Wu
2024-05-03 22:32 12% ` [PATCH v18 20/21] Documentation: add ipe documentation Fan Wu
2024-05-04  8:04       ` Bagas Sanjaya
2024-05-04 20:13 79%     ` Fan Wu
2024-05-03 22:32 77% ` [PATCH v18 21/21] MAINTAINERS: ipe: add ipe maintainer information Fan Wu
2024-05-06  5:38 79% [PATCH v2] tools: hv: suppress the invalid warning for packed member alignment Saurabh Sengar
2024-05-06 19:37 63% [PATCH v4] fs/coredump: Enable dynamic configuration of max file note size Allen Pais
2024-05-07  9:53 79% [PATCH rdma-next 0/3] RDMA/mana_ib: Add support of RC QPs Konstantin Taranov
2024-05-07  9:53 63% ` [PATCH rdma-next 1/3] RDMA/mana_ib: Create and destroy RC QP Konstantin Taranov
2024-05-07  9:53 62% ` [PATCH rdma-next 2/3] RDMA/mana_ib: Implement uapi to create " Konstantin Taranov
2024-05-07  9:53 64% ` [PATCH rdma-next 3/3] RDMA/mana_ib: Modify QP state Konstantin Taranov
2024-05-07 19:01 55% [PATCH 0/1] Convert tasklets to BH workqueues in ethernet drivers Allen Pais
2024-05-07 19:01  7% ` [PATCH 1/1] [RFC] ethernet: Convert from tasklet to BH workqueue Allen Pais
2024-05-07 21:40 73% Watchdog Reset on Idle CPU with a task on its runq Vijay Balakrishna
