* stop using ->read and ->write for kernel access v3
@ 2020-07-07 17:47 Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 01/23] cachefiles: switch to kernel_write Christoph Hellwig
                   ` (24 more replies)
  0 siblings, 25 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Hi Al and Linus (and Stephen, see below),

as part of removing set_fs entirely (for which I have a working
prototype), we need to stop calling ->read and ->write with kernel
pointers under set_fs.

This removes the option to call ->read and ->write with kernel pointers
entirely.  The replacements are the existing ->read_iter and ->write_iter
methods which cope with kvecs and kernel pointers just fine and are
already used by many instances including all the "real" file systems.
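
To illustrate why an iterator-based method does not care where the data
comes from, here is a minimal user-space sketch.  It is not kernel code:
kvec_model, iter_model and sink_write_iter are hypothetical stand-ins for
struct kvec, struct iov_iter and a ->write_iter instance, and the real
iov_iter carries far more state.  The point is only that the method sees an
iterator over (base, length) segments and never a raw user pointer.

```c
#include <stddef.h>
#include <string.h>

/* User-space stand-in for struct kvec: one segment of plain memory. */
struct kvec_model {
	const void *base;
	size_t len;
};

/* Hypothetical, heavily reduced iterator over an array of segments. */
struct iter_model {
	const struct kvec_model *vec;	/* current segment */
	size_t nr_segs;			/* segments left, including current */
	size_t seg_off;			/* bytes already consumed in current */
};

/* Copy up to count bytes out of the iterator, advancing it as we go. */
static size_t iter_model_copy(struct iter_model *it, void *dst, size_t count)
{
	size_t done = 0;

	while (done < count && it->nr_segs > 0) {
		size_t avail = it->vec->len - it->seg_off;
		size_t n = count - done < avail ? count - done : avail;

		memcpy((char *)dst + done,
		       (const char *)it->vec->base + it->seg_off, n);
		done += n;
		it->seg_off += n;
		if (it->seg_off == it->vec->len) {
			/* segment exhausted, move to the next one */
			it->vec++;
			it->nr_segs--;
			it->seg_off = 0;
		}
	}
	return done;
}

/* A "write_iter"-style method: consumes whatever the iterator describes. */
static size_t sink_write_iter(char *sink, size_t sink_size,
			      struct iter_model *it)
{
	return iter_model_copy(it, sink, sink_size);
}
```

A caller holding ordinary in-memory buffers builds the segment array and
passes the iterator to the method; no address-space switch is involved in
this model.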

A git branch is available here:

    git://git.infradead.org/users/hch/misc.git set_fs-rw

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/set_fs-rw

Given that this has been out and cooking for a while, is there a chance to
add the above as a temp branch to linux-next to get a little more exposure
until Al reviews it and (hopefully) picks it up?

Changes since v2:
 - dropped an already merged sysctl fix
 - merged with the prep series
 - simplify warn_unsupported()
 - make splice do the right thing on files with the iter ops by
   default instead of requiring manual setup

Changes since v1:
 - drop the read_uptr/write_uptr methods
 - pick up a bunch of sysctl fixes and iter conversion from Matthew Wilcox

Diffstat:
 Documentation/filesystems/seq_file.rst                            |    2 
 Documentation/process/clang-format.rst                            |    4 
 Documentation/translations/it_IT/process/clang-format.rst         |    4 
 arch/alpha/kernel/srm_env.c                                       |    2 
 arch/arm/mm/alignment.c                                           |    2 
 arch/arm/mm/ptdump_debugfs.c                                      |    2 
 arch/arm64/kvm/vgic/vgic-debug.c                                  |    2 
 arch/c6x/platforms/pll.c                                          |    2 
 arch/mips/cavium-octeon/oct_ilm.c                                 |    2 
 arch/mips/kernel/segment.c                                        |    2 
 arch/mips/ralink/bootrom.c                                        |    2 
 arch/powerpc/kernel/rtas-proc.c                                   |   10 
 arch/powerpc/kvm/book3s_xive_native.c                             |    2 
 arch/powerpc/kvm/timing.c                                         |    2 
 arch/powerpc/mm/numa.c                                            |    2 
 arch/powerpc/mm/ptdump/bats.c                                     |    2 
 arch/powerpc/mm/ptdump/hashpagetable.c                            |    2 
 arch/powerpc/mm/ptdump/ptdump.c                                   |    2 
 arch/powerpc/mm/ptdump/segment_regs.c                             |    2 
 arch/powerpc/platforms/cell/spufs/file.c                          |    8 
 arch/powerpc/platforms/pseries/hvCall_inst.c                      |    2 
 arch/powerpc/platforms/pseries/lpar.c                             |    4 
 arch/powerpc/platforms/pseries/lparcfg.c                          |    2 
 arch/s390/kernel/diag.c                                           |    2 
 arch/s390/mm/dump_pagetables.c                                    |    2 
 arch/s390/pci/pci_debug.c                                         |    2 
 arch/sh/mm/alignment.c                                            |    2 
 arch/sh/mm/asids-debugfs.c                                        |    2 
 arch/sh/mm/cache-debugfs.c                                        |    2 
 arch/sh/mm/pmb.c                                                  |    2 
 arch/sh/mm/tlb-debugfs.c                                          |    2 
 arch/sparc/kernel/led.c                                           |    2 
 arch/um/kernel/exitcode.c                                         |    2 
 arch/um/kernel/process.c                                          |    2 
 arch/x86/kernel/cpu/mce/severity.c                                |    2 
 arch/x86/kernel/cpu/mtrr/if.c                                     |    2 
 arch/x86/mm/pat/memtype.c                                         |    2 
 arch/x86/mm/pat/set_memory.c                                      |    2 
 arch/x86/platform/uv/tlb_uv.c                                     |    2 
 arch/x86/xen/p2m.c                                                |    2 
 block/blk-mq-debugfs.c                                            |    2 
 drivers/acpi/battery.c                                            |    2 
 drivers/acpi/proc.c                                               |    2 
 drivers/base/power/wakeup.c                                       |    2 
 drivers/block/aoe/aoeblk.c                                        |    2 
 drivers/block/drbd/drbd_debugfs.c                                 |   10 
 drivers/block/nbd.c                                               |    4 
 drivers/block/pktcdvd.c                                           |    2 
 drivers/block/rsxx/core.c                                         |    4 
 drivers/bus/mvebu-mbus.c                                          |    4 
 drivers/char/tpm/eventlog/common.c                                |    2 
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c                 |    2 
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c                 |    2 
 drivers/crypto/amlogic/amlogic-gxl-core.c                         |    2 
 drivers/crypto/caam/dpseci-debugfs.c                              |    2 
 drivers/crypto/cavium/zip/zip_main.c                              |    6 
 drivers/crypto/hisilicon/qm.c                                     |    2 
 drivers/crypto/qat/qat_common/adf_cfg.c                           |    2 
 drivers/crypto/qat/qat_common/adf_transport_debug.c               |    4 
 drivers/firmware/tegra/bpmp-debugfs.c                             |    2 
 drivers/gpio/gpiolib.c                                            |    2 
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c                          |    4 
 drivers/gpu/drm/arm/display/komeda/komeda_dev.c                   |    2 
 drivers/gpu/drm/arm/malidp_drv.c                                  |    2 
 drivers/gpu/drm/armada/armada_debugfs.c                           |    2 
 drivers/gpu/drm/drm_debugfs.c                                     |    6 
 drivers/gpu/drm/drm_debugfs_crc.c                                 |    2 
 drivers/gpu/drm/drm_mipi_dbi.c                                    |    2 
 drivers/gpu/drm/i915/display/intel_display_debugfs.c              |   16 
 drivers/gpu/drm/i915/gt/debugfs_gt.h                              |    2 
 drivers/gpu/drm/i915/i915_debugfs_params.c                        |   12 
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c                      |    2 
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c                          |    4 
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c                       |    2 
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c                           |    4 
 drivers/gpu/drm/msm/msm_debugfs.c                                 |    2 
 drivers/gpu/drm/nouveau/nouveau_debugfs.c                         |    2 
 drivers/gpu/drm/omapdrm/dss/dss.c                                 |    2 
 drivers/gpu/host1x/debug.c                                        |    4 
 drivers/gpu/vga/vga_switcheroo.c                                  |    2 
 drivers/hid/hid-picolcd_debugfs.c                                 |    2 
 drivers/hid/hid-wiimote-debug.c                                   |    2 
 drivers/hwmon/dell-smm-hwmon.c                                    |    2 
 drivers/ide/ide-proc.c                                            |    4 
 drivers/infiniband/hw/cxgb4/device.c                              |    4 
 drivers/infiniband/hw/qib/qib_debugfs.c                           |    2 
 drivers/infiniband/ulp/ipoib/ipoib_fs.c                           |    4 
 drivers/input/input.c                                             |    4 
 drivers/macintosh/via-pmu.c                                       |    2 
 drivers/md/bcache/closure.c                                       |    2 
 drivers/md/md.c                                                   |    2 
 drivers/media/cec/core/cec-core.c                                 |    2 
 drivers/media/pci/saa7164/saa7164-core.c                          |    2 
 drivers/memory/emif.c                                             |    4 
 drivers/memory/tegra/tegra124-emc.c                               |    2 
 drivers/memory/tegra/tegra186-emc.c                               |    2 
 drivers/memory/tegra/tegra20-emc.c                                |    2 
 drivers/memory/tegra/tegra30-emc.c                                |    2 
 drivers/mfd/ab3100-core.c                                         |    2 
 drivers/mfd/ab3100-otp.c                                          |    2 
 drivers/mfd/ab8500-debugfs.c                                      |   14 
 drivers/mfd/tps65010.c                                            |    2 
 drivers/misc/habanalabs/debugfs.c                                 |    2 
 drivers/misc/sgi-gru/gruprocfs.c                                  |    6 
 drivers/mmc/core/mmc_test.c                                       |    2 
 drivers/mtd/mtdcore.c                                             |    4 
 drivers/mtd/ubi/debug.c                                           |    2 
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c                |   38 +-
 drivers/net/ethernet/chelsio/cxgb4/l2t.c                          |    2 
 drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c               |    8 
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c          |    6 
 drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c                  |    2 
 drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c           |    2 
 drivers/net/wireless/ath/ath5k/debug.c                            |    2 
 drivers/net/wireless/ath/wil6210/debugfs.c                        |   14 
 drivers/net/wireless/broadcom/brcm80211/brcmsmac/debug.c          |    2 
 drivers/net/wireless/intel/ipw2x00/libipw_module.c                |    2 
 drivers/net/wireless/intel/iwlwifi/fw/debugfs.c                   |    2 
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c                   |    2 
 drivers/net/wireless/intersil/hostap/hostap_download.c            |    2 
 drivers/net/wireless/mediatek/mt76/mt7603/debugfs.c               |    2 
 drivers/net/wireless/mediatek/mt76/mt7615/debugfs.c               |    2 
 drivers/net/wireless/mediatek/mt76/mt76x02_debugfs.c              |    4 
 drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c               |    4 
 drivers/net/wireless/mediatek/mt7601u/debugfs.c                   |    4 
 drivers/net/wireless/realtek/rtlwifi/debug.c                      |    2 
 drivers/net/wireless/realtek/rtw88/debug.c                        |    4 
 drivers/net/wireless/rsi/rsi_91x_debugfs.c                        |    4 
 drivers/net/xen-netback/xenbus.c                                  |    2 
 drivers/nvme/host/fabrics.c                                       |    2 
 drivers/parisc/led.c                                              |    2 
 drivers/pci/controller/pci-tegra.c                                |    2 
 drivers/platform/x86/asus-wmi.c                                   |    2 
 drivers/platform/x86/intel_pmc_core.c                             |    2 
 drivers/platform/x86/intel_telemetry_debugfs.c                    |    4 
 drivers/platform/x86/thinkpad_acpi.c                              |    2 
 drivers/platform/x86/toshiba_acpi.c                               |    8 
 drivers/pnp/pnpbios/proc.c                                        |    2 
 drivers/power/supply/da9030_battery.c                             |    2 
 drivers/pwm/core.c                                                |    2 
 drivers/ras/cec.c                                                 |    2 
 drivers/ras/debugfs.c                                             |    2 
 drivers/s390/block/dasd.c                                         |    2 
 drivers/s390/block/dasd_proc.c                                    |    2 
 drivers/s390/cio/blacklist.c                                      |    2 
 drivers/s390/cio/qdio_debug.c                                     |    2 
 drivers/scsi/hisi_sas/hisi_sas_main.c                             |   32 -
 drivers/scsi/qedf/qedf_dbg.h                                      |    2 
 drivers/scsi/qedi/qedi_dbg.h                                      |    2 
 drivers/scsi/qla2xxx/qla_dfs.c                                    |   12 
 drivers/scsi/scsi_devinfo.c                                       |    2 
 drivers/scsi/scsi_proc.c                                          |    4 
 drivers/scsi/sg.c                                                 |    4 
 drivers/scsi/snic/snic_debugfs.c                                  |    4 
 drivers/sh/intc/virq-debugfs.c                                    |    2 
 drivers/soc/qcom/cmd-db.c                                         |    2 
 drivers/soc/qcom/socinfo.c                                        |    4 
 drivers/soc/ti/knav_dma.c                                         |    2 
 drivers/soc/ti/knav_qmss_queue.c                                  |    2 
 drivers/staging/rtl8192u/ieee80211/ieee80211_module.c             |    2 
 drivers/staging/vc04_services/interface/vchiq_arm/vchiq_debugfs.c |    4 
 drivers/usb/chipidea/debug.c                                      |    4 
 drivers/usb/dwc2/debugfs.c                                        |    2 
 drivers/usb/dwc3/debugfs.c                                        |    8 
 drivers/usb/gadget/function/rndis.c                               |    2 
 drivers/usb/gadget/udc/lpc32xx_udc.c                              |    2 
 drivers/usb/gadget/udc/renesas_usb3.c                             |    2 
 drivers/usb/host/xhci-debugfs.c                                   |    6 
 drivers/usb/mtu3/mtu3_debugfs.c                                   |    8 
 drivers/usb/musb/musb_debugfs.c                                   |    4 
 drivers/video/fbdev/via/viafbdev.c                                |   14 
 drivers/visorbus/visorbus_main.c                                  |    2 
 drivers/xen/xenfs/xensyms.c                                       |    2 
 fs/adfs/file.c                                                    |    1 
 fs/affs/file.c                                                    |    1 
 fs/afs/file.c                                                     |    1 
 fs/autofs/waitq.c                                                 |    2 
 fs/bfs/file.c                                                     |    1 
 fs/block_dev.c                                                    |    1 
 fs/btrfs/file.c                                                   |    1 
 fs/cachefiles/rdwr.c                                              |    2 
 fs/ceph/file.c                                                    |    1 
 fs/cifs/cifs_debug.c                                              |   14 
 fs/cifs/cifsfs.c                                                  |    6 
 fs/cifs/dfs_cache.c                                               |    2 
 fs/coda/file.c                                                    |    1 
 fs/cramfs/inode.c                                                 |    1 
 fs/debugfs/file.c                                                 |    4 
 fs/dlm/debug_fs.c                                                 |    8 
 fs/ecryptfs/file.c                                                |    1 
 fs/exfat/file.c                                                   |    1 
 fs/ext2/file.c                                                    |    1 
 fs/ext4/file.c                                                    |    1 
 fs/f2fs/file.c                                                    |    1 
 fs/fat/file.c                                                     |    1 
 fs/fscache/object-list.c                                          |    2 
 fs/fuse/file.c                                                    |    1 
 fs/gfs2/file.c                                                    |    2 
 fs/gfs2/glock.c                                                   |    6 
 fs/hfs/inode.c                                                    |    1 
 fs/hfsplus/inode.c                                                |    1 
 fs/hostfs/hostfs_kern.c                                           |    1 
 fs/hpfs/file.c                                                    |    1 
 fs/jbd2/journal.c                                                 |    2 
 fs/jffs2/file.c                                                   |    1 
 fs/jfs/file.c                                                     |    1 
 fs/jfs/jfs_debug.c                                                |    2 
 fs/minix/file.c                                                   |    1 
 fs/nfs/file.c                                                     |    1 
 fs/nfs/nfs4file.c                                                 |    1 
 fs/nfsd/nfs4state.c                                               |    4 
 fs/nfsd/nfsctl.c                                                  |   12 
 fs/nfsd/stats.c                                                   |    2 
 fs/nilfs2/file.c                                                  |    1 
 fs/ntfs/file.c                                                    |    1 
 fs/ocfs2/cluster/netdebug.c                                       |    6 
 fs/ocfs2/dlm/dlmdebug.c                                           |    2 
 fs/ocfs2/dlmglue.c                                                |    2 
 fs/ocfs2/file.c                                                   |    2 
 fs/omfs/file.c                                                    |    1 
 fs/openpromfs/inode.c                                             |    2 
 fs/orangefs/orangefs-debugfs.c                                    |    2 
 fs/proc/array.c                                                   |    2 
 fs/proc/base.c                                                    |   24 -
 fs/proc/cpuinfo.c                                                 |    2 
 fs/proc/fd.c                                                      |    2 
 fs/proc/generic.c                                                 |    4 
 fs/proc/inode.c                                                   |  119 ++++---
 fs/proc/proc_net.c                                                |    4 
 fs/proc/proc_sysctl.c                                             |   44 +-
 fs/proc/stat.c                                                    |    2 
 fs/proc/task_mmu.c                                                |    8 
 fs/proc/task_nommu.c                                              |    2 
 fs/proc_namespace.c                                               |    6 
 fs/ramfs/file-mmu.c                                               |    1 
 fs/ramfs/file-nommu.c                                             |    1 
 fs/read_write.c                                                   |  170 ++++++----
 fs/reiserfs/file.c                                                |    1 
 fs/romfs/mmap-nommu.c                                             |    1 
 fs/seq_file.c                                                     |   45 +-
 fs/splice.c                                                       |  123 -------
 fs/sysv/file.c                                                    |    1 
 fs/ubifs/file.c                                                   |    1 
 fs/udf/file.c                                                     |    1 
 fs/ufs/file.c                                                     |    1 
 fs/vboxsf/file.c                                                  |    1 
 fs/xfs/xfs_file.c                                                 |    1 
 fs/zonefs/super.c                                                 |    1 
 include/linux/bpf-cgroup.h                                        |    2 
 include/linux/fs.h                                                |    4 
 include/linux/proc_fs.h                                           |    1 
 include/linux/seq_file.h                                          |    7 
 ipc/util.c                                                        |    2 
 kernel/bpf/cgroup.c                                               |    2 
 kernel/bpf/inode.c                                                |    2 
 kernel/fail_function.c                                            |    2 
 kernel/gcov/fs.c                                                  |    2 
 kernel/irq/debugfs.c                                              |    2 
 kernel/irq/proc.c                                                 |    6 
 kernel/kallsyms.c                                                 |    2 
 kernel/kcsan/debugfs.c                                            |    2 
 kernel/latencytop.c                                               |    2 
 kernel/locking/lockdep_proc.c                                     |    2 
 kernel/module.c                                                   |    2 
 kernel/profile.c                                                  |    2 
 kernel/sched/debug.c                                              |    2 
 kernel/sched/psi.c                                                |    6 
 kernel/time/test_udelay.c                                         |    2 
 kernel/trace/ftrace.c                                             |   16 
 kernel/trace/trace.c                                              |   20 -
 kernel/trace/trace_dynevent.c                                     |    2 
 kernel/trace/trace_events.c                                       |   10 
 kernel/trace/trace_events_hist.c                                  |    4 
 kernel/trace/trace_events_synth.c                                 |    2 
 kernel/trace/trace_events_trigger.c                               |    2 
 kernel/trace/trace_kprobe.c                                       |    4 
 kernel/trace/trace_printk.c                                       |    2 
 kernel/trace/trace_stack.c                                        |    4 
 kernel/trace/trace_stat.c                                         |    2 
 kernel/trace/trace_uprobe.c                                       |    4 
 lib/debugobjects.c                                                |    2 
 lib/dynamic_debug.c                                               |    4 
 lib/error-inject.c                                                |    2 
 lib/kunit/debugfs.c                                               |    2 
 mm/kmemleak.c                                                     |    2 
 mm/shmem.c                                                        |    1 
 mm/slab_common.c                                                  |    2 
 mm/swapfile.c                                                     |    2 
 net/6lowpan/debugfs.c                                             |    2 
 net/atm/mpoa_proc.c                                               |    2 
 net/batman-adv/debugfs.c                                          |    4 
 net/bluetooth/6lowpan.c                                           |    2 
 net/bpfilter/bpfilter_kern.c                                      |    2 
 net/core/pktgen.c                                                 |    6 
 net/hsr/hsr_debugfs.c                                             |    2 
 net/ipv4/netfilter/ipt_CLUSTERIP.c                                |    2 
 net/ipv4/route.c                                                  |    4 
 net/l2tp/l2tp_debugfs.c                                           |    2 
 net/netfilter/xt_recent.c                                         |    2 
 net/sunrpc/cache.c                                                |    4 
 net/sunrpc/debugfs.c                                              |    4 
 net/sunrpc/rpc_pipe.c                                             |    2 
 net/sunrpc/stats.c                                                |    2 
 security/apparmor/apparmorfs.c                                    |   10 
 security/integrity/iint.c                                         |   14 
 security/integrity/ima/ima_fs.c                                   |    6 
 security/selinux/selinuxfs.c                                      |    2 
 security/smack/smackfs.c                                          |   20 -
 sound/core/info.c                                                 |    2 
 309 files changed, 733 insertions(+), 796 deletions(-)


* [PATCH 01/23] cachefiles: switch to kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 02/23] autofs: " Christoph Hellwig
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel, David Howells

__kernel_write doesn't take an sb_writers reference, which we need here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
---
 fs/cachefiles/rdwr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index e7726f5f1241c2..3080cda9e82457 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -937,7 +937,7 @@ int cachefiles_write_page(struct fscache_storage *op, struct page *page)
 	}
 
 	data = kmap(page);
-	ret = __kernel_write(file, data, len, &pos);
+	ret = kernel_write(file, data, len, &pos);
 	kunmap(page);
 	fput(file);
 	if (ret != len)
-- 
2.26.2



* [PATCH 02/23] autofs: switch to kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 01/23] cachefiles: switch to kernel_write Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 03/23] bpfilter: " Christoph Hellwig
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel, Ian Kent

While pipes don't really need sb_writers protection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ian Kent <raven@themaw.net>
---
 fs/autofs/waitq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
index b04c528b19d342..74c886f7c51cbe 100644
--- a/fs/autofs/waitq.c
+++ b/fs/autofs/waitq.c
@@ -53,7 +53,7 @@ static int autofs_write(struct autofs_sb_info *sbi,
 
 	mutex_lock(&sbi->pipe_mutex);
 	while (bytes) {
-		wr = __kernel_write(file, data, bytes, &file->f_pos);
+		wr = kernel_write(file, data, bytes, &file->f_pos);
 		if (wr <= 0)
 			break;
 		data += wr;
-- 
2.26.2



* [PATCH 03/23] bpfilter: switch to kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 01/23] cachefiles: switch to kernel_write Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 02/23] autofs: " Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 04/23] fs: unexport __kernel_write Christoph Hellwig
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

While pipes don't really need sb_writers protection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 net/bpfilter/bpfilter_kern.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index c0f0990f30b604..1905e01c3aa9a7 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -50,7 +50,7 @@ static int __bpfilter_process_sockopt(struct sock *sk, int optname,
 	req.len = optlen;
 	if (!bpfilter_ops.info.pid)
 		goto out;
-	n = __kernel_write(bpfilter_ops.info.pipe_to_umh, &req, sizeof(req),
+	n = kernel_write(bpfilter_ops.info.pipe_to_umh, &req, sizeof(req),
 			   &pos);
 	if (n != sizeof(req)) {
 		pr_err("write fail %zd\n", n);
-- 
2.26.2



* [PATCH 04/23] fs: unexport __kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 03/23] bpfilter: " Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 05/23] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

This is a very special interface that skips sb_writers protection, and is
not used by modules anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index bbfa9b12b15eb7..2c601d853ff3d8 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -522,7 +522,6 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	inc_syscw(current);
 	return ret;
 }
-EXPORT_SYMBOL(__kernel_write);
 
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
-- 
2.26.2



* [PATCH 05/23] fs: check FMODE_WRITE in __kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 04/23] fs: unexport __kernel_write Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 06/23] fs: implement kernel_write using __kernel_write Christoph Hellwig
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Add a WARN_ON_ONCE if the file isn't actually open for write.  This
matches the check done in vfs_write, but actually warns, as a
kernel user calling write on a file not opened for writing is a pretty
obvious programming error.
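
The order of the two checks can be modelled in user space as follows.  This
is a sketch under assumptions, not the kernel implementation: the MODEL_*
flag values are made up for the example, and model_warn_on_once only
imitates the "warn once, still report the condition" behaviour of
WARN_ON_ONCE.

```c
#include <errno.h>
#include <stdio.h>

/* Illustrative flag values for this model only, not the kernel's. */
#define MODEL_FMODE_WRITE	0x1u	/* file was opened for writing */
#define MODEL_FMODE_CAN_WRITE	0x2u	/* file has a usable write method */

static int model_warned;	/* set after the first warning */
static int model_warn_count;	/* how many warnings were printed */

/* Print a warning the first time cond is true; always return cond. */
static int model_warn_on_once(int cond)
{
	if (cond && !model_warned) {
		model_warned = 1;
		model_warn_count++;
		fprintf(stderr, "WARNING: write on file not open for write\n");
	}
	return cond;
}

/* Mirrors the check order added by this patch. */
static int model_write_mode_check(unsigned int f_mode)
{
	if (model_warn_on_once(!(f_mode & MODEL_FMODE_WRITE)))
		return -EBADF;
	if (!(f_mode & MODEL_FMODE_CAN_WRITE))
		return -EINVAL;
	return 0;
}
```

In this model a file lacking the write mode always yields -EBADF, but the
warning is printed only for the first offender.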

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index 2c601d853ff3d8..8f9fc05990ae8b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -505,6 +505,8 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	const char __user *p;
 	ssize_t ret;
 
+	if (WARN_ON_ONCE(!(file->f_mode & FMODE_WRITE)))
+		return -EBADF;
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-- 
2.26.2



* [PATCH 06/23] fs: implement kernel_write using __kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 05/23] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 07/23] fs: remove __vfs_write Christoph Hellwig
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Consolidate the two in-kernel write helpers to make upcoming changes
easier.  The only differences are the missing call to rw_verify_area
in kernel_write, and an access_ok check that doesn't make sense for
kernel buffers to start with.
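
The resulting layering (verify the range, then bracket the inner helper
with the start/end write calls) can be illustrated with a small userspace
model.  All brk_* names, the size_limit field and the -EIO check in the
inner helper are assumptions made up for this sketch; they are not part
of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

struct brk_file {
	int write_depth;	/* > 0 while inside the start/end bracket */
	int brackets_seen;	/* completed start/end pairs */
	long long size_limit;	/* hypothetical upper bound on valid offsets */
};

/* Stand-in for rw_verify_area(): reject ranges outside [0, size_limit]. */
static int brk_verify_area(const struct brk_file *f, long long pos,
			   size_t count)
{
	if (pos < 0 || count > (size_t)f->size_limit ||
	    pos > f->size_limit - (long long)count)
		return -EINVAL;
	return 0;
}

static void brk_start_write(struct brk_file *f)
{
	f->write_depth++;
}

static void brk_end_write(struct brk_file *f)
{
	f->write_depth--;
	f->brackets_seen++;
}

/* Inner helper: the caller is responsible for the start/end bracket. */
static ssize_t brk_inner_write(struct brk_file *f, size_t count,
			       long long *pos)
{
	if (f->write_depth <= 0)
		return -EIO;	/* model-only check: called outside the bracket */
	*pos += (long long)count;
	return (ssize_t)count;
}

/* Outer helper: verify first, then bracket the inner call. */
static ssize_t brk_kernel_write(struct brk_file *f, size_t count,
				long long *pos)
{
	ssize_t ret;

	ret = brk_verify_area(f, *pos, count);
	if (ret)
		return ret;

	brk_start_write(f);
	ret = brk_inner_write(f, count, pos);
	brk_end_write(f);
	return ret;
}
```

A range that fails verification never enters the bracket, which is the
property the early return in the patch provides.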

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8f9fc05990ae8b..5110cd1e6e2771 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -499,6 +499,7 @@ static ssize_t __vfs_write(struct file *file, const char __user *p,
 		return -EINVAL;
 }
 
+/* caller is responsible for file_start_write/file_end_write */
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
@@ -528,16 +529,16 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t res;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_write(file, (__force const char __user *)buf, count, pos);
-	set_fs(old_fs);
+	ret = rw_verify_area(WRITE, file, pos, count);
+	if (ret)
+		return ret;
 
-	return res;
+	file_start_write(file);
+	ret =  __kernel_write(file, buf, count, pos);
+	file_end_write(file);
+	return ret;
 }
 EXPORT_SYMBOL(kernel_write);
 
-- 
2.26.2



* [PATCH 07/23] fs: remove __vfs_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 06/23] fs: implement kernel_write using __kernel_write Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Fold it into the two callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 5110cd1e6e2771..96e8e354f99b45 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -488,17 +488,6 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 	return ret;
 }
 
-static ssize_t __vfs_write(struct file *file, const char __user *p,
-			   size_t count, loff_t *pos)
-{
-	if (file->f_op->write)
-		return file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		return new_sync_write(file, p, count, pos);
-	else
-		return -EINVAL;
-}
-
 /* caller is responsible for file_start_write/file_end_write */
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
@@ -516,7 +505,12 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	ret = __vfs_write(file, p, count, pos);
+	if (file->f_op->write)
+		ret = file->f_op->write(file, p, count, pos);
+	else if (file->f_op->write_iter)
+		ret = new_sync_write(file, p, count, pos);
+	else
+		ret = -EINVAL;
 	set_fs(old_fs);
 	if (ret > 0) {
 		fsnotify_modify(file);
@@ -554,19 +548,23 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
 		return -EFAULT;
 
 	ret = rw_verify_area(WRITE, file, pos, count);
-	if (!ret) {
-		if (count > MAX_RW_COUNT)
-			count =  MAX_RW_COUNT;
-		file_start_write(file);
-		ret = __vfs_write(file, buf, count, pos);
-		if (ret > 0) {
-			fsnotify_modify(file);
-			add_wchar(current, ret);
-		}
-		inc_syscw(current);
-		file_end_write(file);
+	if (ret)
+		return ret;
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
+	file_start_write(file);
+	if (file->f_op->write)
+		ret = file->f_op->write(file, buf, count, pos);
+	else if (file->f_op->write_iter)
+		ret = new_sync_write(file, buf, count, pos);
+	else
+		ret = -EINVAL;
+	if (ret > 0) {
+		fsnotify_modify(file);
+		add_wchar(current, ret);
 	}
-
+	inc_syscw(current);
+	file_end_write(file);
 	return ret;
 }
 
-- 
2.26.2



* [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 07/23] fs: remove __vfs_write Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-29 20:50   ` Al Viro
  2020-07-07 17:47 ` [PATCH 09/23] fs: add a __kernel_read helper Christoph Hellwig
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

If we write to a file that implements ->write_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->write with a changed address
limit if available.
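
As a rough illustration of the dispatch order described above, here is a
userspace model that prefers the iterator method and only falls back to
the plain one.  The wr_* types are simplified, hypothetical stand-ins for
struct kvec, struct kiocb and struct file_operations; nothing here is
kernel API.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Simplified stand-in for struct kvec. */
struct wr_kvec {
	const void *iov_base;
	size_t iov_len;
};

struct wr_file;

struct wr_fops {
	ssize_t (*write)(struct wr_file *f, const char *buf, size_t n,
			 long long *pos);
	ssize_t (*write_iter)(struct wr_file *f, const struct wr_kvec *v,
			      long long *pos);
};

struct wr_file {
	const struct wr_fops *f_op;
	char data[64];
	int used_iter;		/* set when the ->write_iter path ran */
};

/* Shared backend: copy into the fixed-size buffer and advance *pos. */
static ssize_t wr_store(struct wr_file *f, const void *src, size_t n,
			long long *pos)
{
	if (*pos < 0 || (size_t)*pos >= sizeof(f->data))
		return -EINVAL;
	if (n > sizeof(f->data) - (size_t)*pos)
		n = sizeof(f->data) - (size_t)*pos;
	memcpy(f->data + *pos, src, n);
	*pos += (long long)n;
	return (ssize_t)n;
}

static ssize_t wr_plain_write(struct wr_file *f, const char *buf, size_t n,
			      long long *pos)
{
	return wr_store(f, buf, n, pos);
}

static ssize_t wr_iter_write(struct wr_file *f, const struct wr_kvec *v,
			     long long *pos)
{
	f->used_iter = 1;
	return wr_store(f, v->iov_base, v->iov_len, pos);
}

/* Prefer ->write_iter with a kernel vector, fall back to ->write. */
static ssize_t wr_kernel_write(struct wr_file *f, const void *buf,
			       size_t count, long long *pos)
{
	if (f->f_op->write_iter) {
		struct wr_kvec iov = { .iov_base = buf, .iov_len = count };
		long long ki_pos = *pos;
		ssize_t ret = f->f_op->write_iter(f, &iov, &ki_pos);

		/* Publish the new position only on success, as in the patch. */
		if (ret > 0)
			*pos = ki_pos;
		return ret;
	}
	if (f->f_op->write)
		return f->f_op->write(f, buf, count, pos);
	return -EINVAL;
}
```

With both methods set, only the iterator path runs; a file with neither
method gets -EINVAL.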

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 96e8e354f99b45..bd46c959799e97 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -489,10 +489,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 }
 
 /* caller is responsible for file_start_write/file_end_write */
-ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
+ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
+		loff_t *pos)
 {
-	mm_segment_t old_fs;
-	const char __user *p;
 	ssize_t ret;
 
 	if (WARN_ON_ONCE(!(file->f_mode & FMODE_WRITE)))
@@ -500,18 +499,29 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write)
-		ret = file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		ret = new_sync_write(file, p, count, pos);
-	else
+	if (file->f_op->write_iter) {
+		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, WRITE, &iov, 1, count);
+		ret = file->f_op->write_iter(&kiocb, &iter);
+		if (ret > 0)
+			*pos = kiocb.ki_pos;
+	} else if (file->f_op->write) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
+		ret = file->f_op->write(file, (__force const char __user *)buf,
+				count, pos);
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_modify(file);
 		add_wchar(current, ret);
-- 
2.26.2



* [PATCH 09/23] fs: add a __kernel_read helper
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 10/23] integrity/ima: switch to using __kernel_read Christoph Hellwig
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

This is the counterpart to __kernel_write, and skips the rw_verify_area
call compared to kernel_read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    | 23 +++++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index bd46c959799e97..cc8e0b4f3cd697 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -430,6 +430,29 @@ ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
 		return -EINVAL;
 }
 
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
+{
+	mm_segment_t old_fs = get_fs();
+	ssize_t ret;
+
+	if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))
+		return -EINVAL;
+	if (!(file->f_mode & FMODE_CAN_READ))
+		return -EINVAL;
+
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
+	set_fs(KERNEL_DS);
+	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	set_fs(old_fs);
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
+	return ret;
+}
+
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3f881a892ea746..22cbe7b2e91994 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3033,6 +3033,7 @@ extern int kernel_read_file_from_path_initns(const char *, void **, loff_t *, lo
 extern int kernel_read_file_from_fd(int, void **, loff_t *, loff_t,
 				    enum kernel_read_file_id);
 extern ssize_t kernel_read(struct file *, void *, size_t, loff_t *);
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos);
 extern ssize_t kernel_write(struct file *, const void *, size_t, loff_t *);
 extern ssize_t __kernel_write(struct file *, const void *, size_t, loff_t *);
 extern struct file * open_exec(const char *);
-- 
2.26.2



* [PATCH 10/23] integrity/ima: switch to using __kernel_read
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 09/23] fs: add a __kernel_read helper Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 11/23] fs: implement kernel_read " Christoph Hellwig
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

__kernel_read has a bunch of additional sanity checks, and this moves
the set_fs out of non-core code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 security/integrity/iint.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index e12c4900510f60..1d20003243c3fb 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -188,19 +188,7 @@ DEFINE_LSM(integrity) = {
 int integrity_kernel_read(struct file *file, loff_t offset,
 			  void *addr, unsigned long count)
 {
-	mm_segment_t old_fs;
-	char __user *buf = (char __user *)addr;
-	ssize_t ret;
-
-	if (!(file->f_mode & FMODE_READ))
-		return -EBADF;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, buf, count, &offset);
-	set_fs(old_fs);
-
-	return ret;
+	return __kernel_read(file, addr, count, &offset);
 }
 
 /*
-- 
2.26.2



* [PATCH 11/23] fs: implement kernel_read using __kernel_read
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 10/23] integrity/ima: switch to using __kernel_read Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 12/23] fs: remove __vfs_read Christoph Hellwig
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Consolidate the two in-kernel read helpers to make upcoming changes
easier.  The only differences are the missing call to rw_verify_area
in kernel_read, and an access_ok check that doesn't make sense for
kernel buffers to start with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index cc8e0b4f3cd697..a0a0b5d1d9249c 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -455,15 +455,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t result;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	result = vfs_read(file, (void __user *)buf, count, pos);
-	set_fs(old_fs);
-	return result;
+	ret = rw_verify_area(READ, file, pos, count);
+	if (ret)
+		return ret;
+	return __kernel_read(file, buf, count, pos);
 }
 EXPORT_SYMBOL(kernel_read);
 
-- 
2.26.2



* [PATCH 12/23] fs: remove __vfs_read
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 11/23] fs: implement kernel_read " Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 13/23] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Fold it into the two callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    | 43 +++++++++++++++++++++----------------------
 include/linux/fs.h |  1 -
 2 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index a0a0b5d1d9249c..6a2170eaee64f9 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -419,17 +419,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	return ret;
 }
 
-ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
-		   loff_t *pos)
-{
-	if (file->f_op->read)
-		return file->f_op->read(file, buf, count, pos);
-	else if (file->f_op->read_iter)
-		return new_sync_read(file, buf, count, pos);
-	else
-		return -EINVAL;
-}
-
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs = get_fs();
@@ -443,7 +432,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
 	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	if (file->f_op->read)
+		ret = file->f_op->read(file, (void __user *)buf, count, pos);
+	else if (file->f_op->read_iter)
+		ret = new_sync_read(file, (void __user *)buf, count, pos);
+	else
+		ret = -EINVAL;
 	set_fs(old_fs);
 	if (ret > 0) {
 		fsnotify_access(file);
@@ -476,17 +470,22 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
 		return -EFAULT;
 
 	ret = rw_verify_area(READ, file, pos, count);
-	if (!ret) {
-		if (count > MAX_RW_COUNT)
-			count =  MAX_RW_COUNT;
-		ret = __vfs_read(file, buf, count, pos);
-		if (ret > 0) {
-			fsnotify_access(file);
-			add_rchar(current, ret);
-		}
-		inc_syscr(current);
-	}
+	if (ret)
+		return ret;
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
 
+	if (file->f_op->read)
+		ret = file->f_op->read(file, buf, count, pos);
+	else if (file->f_op->read_iter)
+		ret = new_sync_read(file, buf, count, pos);
+	else
+		ret = -EINVAL;
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
 	return ret;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 22cbe7b2e91994..0c0ec76b600b50 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1917,7 +1917,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
 			      struct iovec *fast_pointer,
 			      struct iovec **ret_pointer);
 
-extern ssize_t __vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-- 
2.26.2



* [PATCH 13/23] fs: don't change the address limit for ->read_iter in __kernel_read
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 12/23] fs: remove __vfs_read Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 14/23] seq_file: add seq_read_iter Christoph Hellwig
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

If we read from a file that implements ->read_iter there is no need
to change the address limit if we send a kvec down.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 6a2170eaee64f9..8bec4418543994 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -421,7 +421,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs = get_fs();
 	ssize_t ret;
 
 	if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))
@@ -431,14 +430,25 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	set_fs(KERNEL_DS);
-	if (file->f_op->read)
+	if (file->f_op->read) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
 		ret = file->f_op->read(file, (void __user *)buf, count, pos);
-	else if (file->f_op->read_iter)
-		ret = new_sync_read(file, (void __user *)buf, count, pos);
-	else
+		set_fs(old_fs);
+	} else if (file->f_op->read_iter) {
+		struct kvec iov = { .iov_base = buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, READ, &iov, 1, count);
+		ret = file->f_op->read_iter(&kiocb, &iter);
+		*pos = kiocb.ki_pos;
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_access(file);
 		add_rchar(current, ret);
@@ -520,7 +530,14 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
 
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write_iter) {
+	if (file->f_op->write) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
+		ret = file->f_op->write(file, (__force const char __user *)buf,
+				count, pos);
+		set_fs(old_fs);
+	} else if (file->f_op->write_iter) {
 		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
 		struct kiocb kiocb;
 		struct iov_iter iter;
@@ -531,13 +548,6 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
 		ret = file->f_op->write_iter(&kiocb, &iter);
 		if (ret > 0)
 			*pos = kiocb.ki_pos;
-	} else if (file->f_op->write) {
-		mm_segment_t old_fs = get_fs();
-
-		set_fs(KERNEL_DS);
-		ret = file->f_op->write(file, (__force const char __user *)buf,
-				count, pos);
-		set_fs(old_fs);
 	} else {
 		ret = -EINVAL;
 	}
-- 
2.26.2



* [PATCH 14/23] seq_file: add seq_read_iter
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 13/23] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 16/23] proc: remove a level of indentation in proc_get_inode Christoph Hellwig
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

iov_iter based variant for reading a seq_file.  seq_read is
reimplemented on top of the iter variant.
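
The wrapping pattern (the buffer-based read builds a one-segment iterator
and forwards to the iter variant) can be sketched in userspace as follows.
The seqm_* names are hypothetical, and the backing store is a plain
string instead of a seq_file, so this only models the position handling
and the copy step.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Single-segment destination, loosely modelled on an iov_iter. */
struct seqm_iter {
	char *base;
	size_t count;		/* bytes of room remaining */
};

/* Loosely modelled on struct kiocb: backing data plus file position. */
struct seqm_kiocb {
	const char *src;	/* NUL-terminated backing data */
	long long ki_pos;
};

/* Copy up to n bytes into the iterator and advance it; return bytes copied. */
static size_t seqm_copy_to_iter(const void *from, size_t n,
				struct seqm_iter *it)
{
	if (n > it->count)
		n = it->count;
	memcpy(it->base, from, n);
	it->base += n;
	it->count -= n;
	return n;
}

/* Iterator-based read: all position state lives in iocb->ki_pos. */
static ssize_t seqm_read_iter(struct seqm_kiocb *iocb, struct seqm_iter *it)
{
	size_t len = strlen(iocb->src);
	size_t off, n;

	if (iocb->ki_pos < 0 || (size_t)iocb->ki_pos >= len)
		return 0;
	off = (size_t)iocb->ki_pos;
	n = seqm_copy_to_iter(iocb->src + off, len - off, it);
	iocb->ki_pos += (long long)n;
	return (ssize_t)n;
}

/* Buffer-based read, reimplemented as a thin wrapper around the iter one. */
static ssize_t seqm_read(const char *src, char *buf, size_t size,
			 long long *ppos)
{
	struct seqm_iter it = { .base = buf, .count = size };
	struct seqm_kiocb iocb = { .src = src, .ki_pos = *ppos };
	ssize_t ret;

	ret = seqm_read_iter(&iocb, &it);
	*ppos = iocb.ki_pos;
	return ret;
}
```

The wrapper copies the position in before the call and back out after
it, the same shape as the new seq_read body in the diff below.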

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/seq_file.c            | 45 ++++++++++++++++++++++++++++------------
 include/linux/seq_file.h |  1 +
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index 4e6239f33c066a..4c00cd222adcdc 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/printk.h>
 #include <linux/string_helpers.h>
+#include <linux/uio.h>
 
 #include <linux/uaccess.h>
 #include <asm/page.h>
@@ -146,7 +147,28 @@ static int traverse(struct seq_file *m, loff_t offset)
  */
 ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 {
-	struct seq_file *m = file->private_data;
+	struct iovec iov = { .iov_base = buf, .iov_len = size};
+	struct kiocb kiocb;
+	struct iov_iter iter;
+	ssize_t ret;
+
+	init_sync_kiocb(&kiocb, file);
+	iov_iter_init(&iter, READ, &iov, 1, size);
+
+	kiocb.ki_pos = *ppos;
+	ret = seq_read_iter(&kiocb, &iter);
+	*ppos = kiocb.ki_pos;
+	return ret;
+}
+EXPORT_SYMBOL(seq_read);
+
+/*
+ * Ready-made ->f_op->read_iter()
+ */
+ssize_t seq_read_iter(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct seq_file *m = iocb->ki_filp->private_data;
+	size_t size = iov_iter_count(iter);
 	size_t copied = 0;
 	size_t n;
 	void *p;
@@ -158,14 +180,14 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 	 * if request is to read from zero offset, reset iterator to first
 	 * record as it might have been already advanced by previous requests
 	 */
-	if (*ppos == 0) {
+	if (iocb->ki_pos == 0) {
 		m->index = 0;
 		m->count = 0;
 	}
 
-	/* Don't assume *ppos is where we left it */
-	if (unlikely(*ppos != m->read_pos)) {
-		while ((err = traverse(m, *ppos)) == -EAGAIN)
+	/* Don't assume ki_pos is where we left it */
+	if (unlikely(iocb->ki_pos != m->read_pos)) {
+		while ((err = traverse(m, iocb->ki_pos)) == -EAGAIN)
 			;
 		if (err) {
 			/* With prejudice... */
@@ -174,7 +196,7 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 			m->count = 0;
 			goto Done;
 		} else {
-			m->read_pos = *ppos;
+			m->read_pos = iocb->ki_pos;
 		}
 	}
 
@@ -187,13 +209,11 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 	/* if not empty - flush it first */
 	if (m->count) {
 		n = min(m->count, size);
-		err = copy_to_user(buf, m->buf + m->from, n);
-		if (err)
+		if (copy_to_iter(m->buf + m->from, n, iter) != n)
 			goto Efault;
 		m->count -= n;
 		m->from += n;
 		size -= n;
-		buf += n;
 		copied += n;
 		if (!size)
 			goto Done;
@@ -254,8 +274,7 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 	}
 	m->op->stop(m, p);
 	n = min(m->count, size);
-	err = copy_to_user(buf, m->buf, n);
-	if (err)
+	if (copy_to_iter(m->buf, n, iter) != n)
 		goto Efault;
 	copied += n;
 	m->count -= n;
@@ -264,7 +283,7 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 	if (!copied)
 		copied = err;
 	else {
-		*ppos += copied;
+		iocb->ki_pos += copied;
 		m->read_pos += copied;
 	}
 	mutex_unlock(&m->lock);
@@ -276,7 +295,7 @@ ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
 	err = -EFAULT;
 	goto Done;
 }
-EXPORT_SYMBOL(seq_read);
+EXPORT_SYMBOL(seq_read_iter);
 
 /**
  *	seq_lseek -	->llseek() method for sequential files.
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index 813614d4b71fbc..b83b3ae3c877f3 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -107,6 +107,7 @@ void seq_pad(struct seq_file *m, char c);
 char *mangle_path(char *s, const char *p, const char *esc);
 int seq_open(struct file *, const struct seq_operations *);
 ssize_t seq_read(struct file *, char __user *, size_t, loff_t *);
+ssize_t seq_read_iter(struct kiocb *iocb, struct iov_iter *iter);
 loff_t seq_lseek(struct file *, loff_t, int);
 int seq_release(struct inode *, struct file *);
 int seq_write(struct seq_file *seq, const void *data, size_t len);
-- 
2.26.2



* [PATCH 16/23] proc: remove a level of indentation in proc_get_inode
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 14/23] seq_file: add seq_read_iter Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 17/23] proc: cleanup the compat vs no compat file ops Christoph Hellwig
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Just return early on inode allocation failure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/proc/inode.c | 72 +++++++++++++++++++++++++------------------------
 1 file changed, 37 insertions(+), 35 deletions(-)

diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 28d6105e908e4c..016b1302cbabc0 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -619,42 +619,44 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
 {
 	struct inode *inode = new_inode(sb);
 
-	if (inode) {
-		inode->i_ino = de->low_ino;
-		inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode);
-		PROC_I(inode)->pde = de;
-
-		if (is_empty_pde(de)) {
-			make_empty_dir_inode(inode);
-			return inode;
-		}
-		if (de->mode) {
-			inode->i_mode = de->mode;
-			inode->i_uid = de->uid;
-			inode->i_gid = de->gid;
-		}
-		if (de->size)
-			inode->i_size = de->size;
-		if (de->nlink)
-			set_nlink(inode, de->nlink);
-
-		if (S_ISREG(inode->i_mode)) {
-			inode->i_op = de->proc_iops;
-			inode->i_fop = &proc_reg_file_ops;
+	if (!inode) {
+		pde_put(de);
+		return NULL;
+	}
+
+	inode->i_ino = de->low_ino;
+	inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode);
+	PROC_I(inode)->pde = de;
+	if (is_empty_pde(de)) {
+		make_empty_dir_inode(inode);
+		return inode;
+	}
+
+	if (de->mode) {
+		inode->i_mode = de->mode;
+		inode->i_uid = de->uid;
+		inode->i_gid = de->gid;
+	}
+	if (de->size)
+		inode->i_size = de->size;
+	if (de->nlink)
+		set_nlink(inode, de->nlink);
+
+	if (S_ISREG(inode->i_mode)) {
+		inode->i_op = de->proc_iops;
+		inode->i_fop = &proc_reg_file_ops;
 #ifdef CONFIG_COMPAT
-			if (!de->proc_ops->proc_compat_ioctl) {
-				inode->i_fop = &proc_reg_file_ops_no_compat;
-			}
+		if (!de->proc_ops->proc_compat_ioctl)
+			inode->i_fop = &proc_reg_file_ops_no_compat;
 #endif
-		} else if (S_ISDIR(inode->i_mode)) {
-			inode->i_op = de->proc_iops;
-			inode->i_fop = de->proc_dir_ops;
-		} else if (S_ISLNK(inode->i_mode)) {
-			inode->i_op = de->proc_iops;
-			inode->i_fop = NULL;
-		} else
-			BUG();
-	} else
-	       pde_put(de);
+	} else if (S_ISDIR(inode->i_mode)) {
+		inode->i_op = de->proc_iops;
+		inode->i_fop = de->proc_dir_ops;
+	} else if (S_ISLNK(inode->i_mode)) {
+		inode->i_op = de->proc_iops;
+		inode->i_fop = NULL;
+	} else {
+		BUG();
+	}
 	return inode;
 }
-- 
2.26.2



* [PATCH 17/23] proc: cleanup the compat vs no compat file ops
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (14 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 16/23] proc: remove a level of indentation in proc_get_inode Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 18/23] proc: add a read_iter method to proc proc_ops Christoph Hellwig
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Instead of providing a special no-compat version, provide a special
compat version for operations with ->compat_ioctl.
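
The inverted selection logic can be modelled in a few lines of userspace
C: start from the default table and switch to the compat one only when a
compat handler is present.  The pdem_* types and the dummy handler are
hypothetical and exist only for this sketch.

```c
#include <stddef.h>

/* Stand-in for the per-entry proc_ops; only the field that matters here. */
struct pdem_proc_ops {
	long (*proc_compat_ioctl)(unsigned int cmd, unsigned long arg);
};

/* Stand-in for a file_operations table, reduced to a name and a flag. */
struct pdem_file_ops {
	const char *name;
	int has_compat_ioctl;
};

static const struct pdem_file_ops pdem_reg_ops = {
	.name = "proc_reg_file_ops",
	.has_compat_ioctl = 0,
};

static const struct pdem_file_ops pdem_reg_ops_compat = {
	.name = "proc_reg_file_ops_compat",
	.has_compat_ioctl = 1,
};

/* Default to the plain table; opt in to the compat one when needed. */
static const struct pdem_file_ops *
pdem_pick_fops(const struct pdem_proc_ops *ops)
{
	const struct pdem_file_ops *fop = &pdem_reg_ops;

	if (ops->proc_compat_ioctl)
		fop = &pdem_reg_ops_compat;
	return fop;
}

/* Dummy handler so a test entry can advertise compat support. */
static long pdem_dummy_compat_ioctl(unsigned int cmd, unsigned long arg)
{
	(void)cmd;
	(void)arg;
	return 0;
}
```

In this model the common case (no compat handler) needs no extra branch
taken, and the default table carries no compat entry at all.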

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/proc/inode.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 016b1302cbabc0..93dd2045737504 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -572,9 +572,6 @@ static const struct file_operations proc_reg_file_ops = {
 	.write		= proc_reg_write,
 	.poll		= proc_reg_poll,
 	.unlocked_ioctl	= proc_reg_unlocked_ioctl,
-#ifdef CONFIG_COMPAT
-	.compat_ioctl	= proc_reg_compat_ioctl,
-#endif
 	.mmap		= proc_reg_mmap,
 	.get_unmapped_area = proc_reg_get_unmapped_area,
 	.open		= proc_reg_open,
@@ -582,12 +579,13 @@ static const struct file_operations proc_reg_file_ops = {
 };
 
 #ifdef CONFIG_COMPAT
-static const struct file_operations proc_reg_file_ops_no_compat = {
+static const struct file_operations proc_reg_file_ops_compat = {
 	.llseek		= proc_reg_llseek,
 	.read		= proc_reg_read,
 	.write		= proc_reg_write,
 	.poll		= proc_reg_poll,
 	.unlocked_ioctl	= proc_reg_unlocked_ioctl,
+	.compat_ioctl	= proc_reg_compat_ioctl,
 	.mmap		= proc_reg_mmap,
 	.get_unmapped_area = proc_reg_get_unmapped_area,
 	.open		= proc_reg_open,
@@ -646,8 +644,8 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
 		inode->i_op = de->proc_iops;
 		inode->i_fop = &proc_reg_file_ops;
 #ifdef CONFIG_COMPAT
-		if (!de->proc_ops->proc_compat_ioctl)
-			inode->i_fop = &proc_reg_file_ops_no_compat;
+		if (de->proc_ops->proc_compat_ioctl)
+			inode->i_fop = &proc_reg_file_ops_compat;
 #endif
 	} else if (S_ISDIR(inode->i_mode)) {
 		inode->i_op = de->proc_iops;
-- 
2.26.2



* [PATCH 18/23] proc: add a read_iter method to proc proc_ops
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (15 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 17/23] proc: cleanup the compat vs no compat file ops Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 19/23] proc: switch over direct seq_read method calls to seq_read_iter Christoph Hellwig
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Add a ->proc_read_iter method to struct proc_ops so that proc files can
implement reads through the iov_iter based ->read_iter interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/proc/inode.c         | 53 ++++++++++++++++++++++++++++++++++++++---
 include/linux/proc_fs.h |  1 +
 2 files changed, 51 insertions(+), 3 deletions(-)
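
The guard pattern used by proc_reg_read_iter() in the diff below can be
modelled in user space as follows. This is a simplified, single-threaded
sketch: struct model_pde and the model_* helpers are hypothetical
stand-ins, and the plain integer counter only approximates what
use_pde()/unuse_pde() do in the kernel. It shows that permanent entries
are called directly, while removable entries are pinned around the call
and yield -EIO once removal has started.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model of a proc entry; not the kernel struct. */
struct model_pde {
	int permanent;	/* entry can never be removed */
	int in_use;	/* active callers; negative once removal has begun */
	long (*read_iter)(struct model_pde *pde);
};

/* Take a reference unless removal has begun (non-atomic stand-in). */
static int model_use_pde(struct model_pde *pde)
{
	if (pde->in_use < 0)
		return 0;
	pde->in_use++;
	return 1;
}

/* Drop the reference taken by model_use_pde(). */
static void model_unuse_pde(struct model_pde *pde)
{
	pde->in_use--;
}

/* Same control flow as the wrapper added by the patch. */
static long model_reg_read_iter(struct model_pde *pde)
{
	long ret;

	if (pde->permanent)
		return pde->read_iter(pde);

	if (!model_use_pde(pde))
		return -EIO;
	ret = pde->read_iter(pde);
	model_unuse_pde(pde);
	return ret;
}

/* Dummy read method that reports 42 bytes read. */
static long model_read_42(struct model_pde *pde)
{
	(void)pde;
	return 42;
}

/* Exercises the live, dying and permanent cases. */
static void guard_model_selftest(void)
{
	struct model_pde live = { 0, 0, model_read_42 };
	struct model_pde dying = { 0, -1, model_read_42 };
	struct model_pde perm = { 1, 0, model_read_42 };

	assert(model_reg_read_iter(&live) == 42);
	assert(live.in_use == 0);	/* reference dropped again */
	assert(model_reg_read_iter(&dying) == -EIO);
	assert(model_reg_read_iter(&perm) == 42);
}
```

The model does not cover concurrency; it is only meant to make the
three code paths of the wrapper easy to see.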

diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 93dd2045737504..58c075e2a452d6 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -297,6 +297,21 @@ static loff_t proc_reg_llseek(struct file *file, loff_t offset, int whence)
 	return rv;
 }
 
+static ssize_t proc_reg_read_iter(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct proc_dir_entry *pde = PDE(file_inode(iocb->ki_filp));
+	ssize_t ret;
+
+	if (pde_is_permanent(pde))
+		return pde->proc_ops->proc_read_iter(iocb, iter);
+
+	if (!use_pde(pde))
+		return -EIO;
+	ret = pde->proc_ops->proc_read_iter(iocb, iter);
+	unuse_pde(pde);
+	return ret;
+}
+
 static ssize_t pde_read(struct proc_dir_entry *pde, struct file *file, char __user *buf, size_t count, loff_t *ppos)
 {
 	typeof_member(struct proc_ops, proc_read) read;
@@ -578,6 +593,18 @@ static const struct file_operations proc_reg_file_ops = {
 	.release	= proc_reg_release,
 };
 
+static const struct file_operations proc_iter_file_ops = {
+	.llseek		= proc_reg_llseek,
+	.read_iter	= proc_reg_read_iter,
+	.write		= proc_reg_write,
+	.poll		= proc_reg_poll,
+	.unlocked_ioctl	= proc_reg_unlocked_ioctl,
+	.mmap		= proc_reg_mmap,
+	.get_unmapped_area = proc_reg_get_unmapped_area,
+	.open		= proc_reg_open,
+	.release	= proc_reg_release,
+};
+
 #ifdef CONFIG_COMPAT
 static const struct file_operations proc_reg_file_ops_compat = {
 	.llseek		= proc_reg_llseek,
@@ -591,6 +618,19 @@ static const struct file_operations proc_reg_file_ops_compat = {
 	.open		= proc_reg_open,
 	.release	= proc_reg_release,
 };
+
+static const struct file_operations proc_iter_file_ops_compat = {
+	.llseek		= proc_reg_llseek,
+	.read_iter	= proc_reg_read_iter,
+	.write		= proc_reg_write,
+	.poll		= proc_reg_poll,
+	.unlocked_ioctl	= proc_reg_unlocked_ioctl,
+	.compat_ioctl	= proc_reg_compat_ioctl,
+	.mmap		= proc_reg_mmap,
+	.get_unmapped_area = proc_reg_get_unmapped_area,
+	.open		= proc_reg_open,
+	.release	= proc_reg_release,
+};
 #endif
 
 static void proc_put_link(void *p)
@@ -642,10 +682,17 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
 
 	if (S_ISREG(inode->i_mode)) {
 		inode->i_op = de->proc_iops;
-		inode->i_fop = &proc_reg_file_ops;
+		if (de->proc_ops->proc_read_iter)
+			inode->i_fop = &proc_iter_file_ops;
+		else
+			inode->i_fop = &proc_reg_file_ops;
 #ifdef CONFIG_COMPAT
-		if (de->proc_ops->proc_compat_ioctl)
-			inode->i_fop = &proc_reg_file_ops_compat;
+		if (de->proc_ops->proc_compat_ioctl) {
+			if (de->proc_ops->proc_read_iter)
+				inode->i_fop = &proc_iter_file_ops_compat;
+			else
+				inode->i_fop = &proc_reg_file_ops_compat;
+		}
 #endif
 	} else if (S_ISDIR(inode->i_mode)) {
 		inode->i_op = de->proc_iops;
diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index d1eed1b4365172..97b3f5f06db9d8 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -30,6 +30,7 @@ struct proc_ops {
 	unsigned int proc_flags;
 	int	(*proc_open)(struct inode *, struct file *);
 	ssize_t	(*proc_read)(struct file *, char __user *, size_t, loff_t *);
+	ssize_t (*proc_read_iter)(struct kiocb *, struct iov_iter *);
 	ssize_t	(*proc_write)(struct file *, const char __user *, size_t, loff_t *);
 	loff_t	(*proc_lseek)(struct file *, loff_t, int);
 	int	(*proc_release)(struct inode *, struct file *);
-- 
2.26.2



* [PATCH 19/23] proc: switch over direct seq_read method calls to seq_read_iter
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (16 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 18/23] proc: add a read_iter method to proc proc_ops Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 20/23] sysctl: Convert to iter interfaces Christoph Hellwig
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Switch over all instances where seq_read is used directly as the
->proc_read method, using this sed expression:

sed -i -e 's/\.proc_read\(\s*=\s*\)seq_read/\.proc_read_iter\1seq_read_iter/g'

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/alpha/kernel/srm_env.c                        |  2 +-
 arch/arm/mm/alignment.c                            |  2 +-
 arch/powerpc/kernel/rtas-proc.c                    | 10 +++++-----
 arch/powerpc/mm/numa.c                             |  2 +-
 arch/powerpc/platforms/pseries/lpar.c              |  4 ++--
 arch/powerpc/platforms/pseries/lparcfg.c           |  2 +-
 arch/sh/mm/alignment.c                             |  2 +-
 arch/sparc/kernel/led.c                            |  2 +-
 arch/um/kernel/exitcode.c                          |  2 +-
 arch/um/kernel/process.c                           |  2 +-
 arch/x86/kernel/cpu/mtrr/if.c                      |  2 +-
 arch/x86/platform/uv/tlb_uv.c                      |  2 +-
 drivers/acpi/battery.c                             |  2 +-
 drivers/acpi/proc.c                                |  2 +-
 drivers/hwmon/dell-smm-hwmon.c                     |  2 +-
 drivers/ide/ide-proc.c                             |  2 +-
 drivers/input/input.c                              |  4 ++--
 drivers/macintosh/via-pmu.c                        |  2 +-
 drivers/md/md.c                                    |  2 +-
 drivers/misc/sgi-gru/gruprocfs.c                   |  6 +++---
 drivers/net/wireless/intel/ipw2x00/libipw_module.c |  2 +-
 .../net/wireless/intersil/hostap/hostap_download.c |  2 +-
 drivers/parisc/led.c                               |  2 +-
 drivers/platform/x86/thinkpad_acpi.c               |  2 +-
 drivers/platform/x86/toshiba_acpi.c                |  8 ++++----
 drivers/pnp/pnpbios/proc.c                         |  2 +-
 drivers/s390/block/dasd_proc.c                     |  2 +-
 drivers/s390/cio/blacklist.c                       |  2 +-
 drivers/scsi/scsi_devinfo.c                        |  2 +-
 drivers/scsi/scsi_proc.c                           |  4 ++--
 drivers/scsi/sg.c                                  |  4 ++--
 .../staging/rtl8192u/ieee80211/ieee80211_module.c  |  2 +-
 drivers/usb/gadget/function/rndis.c                |  2 +-
 drivers/video/fbdev/via/viafbdev.c                 | 14 +++++++-------
 fs/cifs/cifs_debug.c                               | 14 +++++++-------
 fs/cifs/dfs_cache.c                                |  2 +-
 fs/fscache/object-list.c                           |  2 +-
 fs/jbd2/journal.c                                  |  2 +-
 fs/jfs/jfs_debug.c                                 |  2 +-
 fs/nfsd/nfsctl.c                                   |  2 +-
 fs/nfsd/stats.c                                    |  2 +-
 fs/proc/cpuinfo.c                                  |  2 +-
 fs/proc/generic.c                                  |  4 ++--
 fs/proc/proc_net.c                                 |  4 ++--
 fs/proc/stat.c                                     |  2 +-
 include/linux/seq_file.h                           |  2 +-
 ipc/util.c                                         |  2 +-
 kernel/irq/proc.c                                  |  6 +++---
 kernel/kallsyms.c                                  |  2 +-
 kernel/latencytop.c                                |  2 +-
 kernel/locking/lockdep_proc.c                      |  2 +-
 kernel/module.c                                    |  2 +-
 kernel/profile.c                                   |  2 +-
 kernel/sched/psi.c                                 |  6 +++---
 lib/dynamic_debug.c                                |  2 +-
 mm/slab_common.c                                   |  2 +-
 mm/swapfile.c                                      |  2 +-
 net/atm/mpoa_proc.c                                |  2 +-
 net/core/pktgen.c                                  |  6 +++---
 net/ipv4/netfilter/ipt_CLUSTERIP.c                 |  2 +-
 net/ipv4/route.c                                   |  4 ++--
 net/netfilter/xt_recent.c                          |  2 +-
 net/sunrpc/cache.c                                 |  2 +-
 net/sunrpc/stats.c                                 |  2 +-
 sound/core/info.c                                  |  2 +-
 65 files changed, 99 insertions(+), 99 deletions(-)
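
To see what the substitution in the commit message does to a single
initializer line, here is a user-space sketch using POSIX <regex.h>. The
function name rewrite_line() is hypothetical, and the POSIX class
[[:space:]] is assumed to be an adequate stand-in for the \s used in the
sed expression. The captured whitespace around '=' is copied through
unchanged, which is why the alignment in the hunks below is preserved.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite every ".proc_read = seq_read" in 'in' to
 * ".proc_read_iter = seq_read_iter", keeping the whitespace around '='.
 * Returns the number of substitutions, or -1 on error or if 'out' is
 * too small.
 */
static int rewrite_line(const char *in, char *out, size_t outsz)
{
	static const char pattern[] =
		"\\.proc_read([[:space:]]*=[[:space:]]*)seq_read";
	regex_t re;
	regmatch_t m[2];
	size_t used = 0;
	int count = 0;
	int w;

	if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	out[0] = '\0';

	while (regexec(&re, in, 2, m, 0) == 0) {
		/* Text before the match, new names, captured "=" group. */
		w = snprintf(out + used, outsz - used,
			     "%.*s.proc_read_iter%.*sseq_read_iter",
			     (int)m[0].rm_so, in,
			     (int)(m[1].rm_eo - m[1].rm_so), in + m[1].rm_so);
		if (w < 0 || (size_t)w >= outsz - used) {
			regfree(&re);
			return -1;
		}
		used += (size_t)w;
		in += m[0].rm_eo;	/* continue after the match */
		count++;
	}

	/* Copy whatever follows the last match. */
	w = snprintf(out + used, outsz - used, "%s", in);
	regfree(&re);
	if (w < 0 || (size_t)w >= outsz - used)
		return -1;
	return count;
}

/* Checks one line that matches and one that must be left alone. */
static void sed_model_selftest(void)
{
	char buf[128];

	assert(rewrite_line("\t.proc_read\t= seq_read,", buf, sizeof(buf)) == 1);
	assert(strcmp(buf, "\t.proc_read_iter\t= seq_read_iter,") == 0);
	assert(rewrite_line("\t.proc_lseek\t= seq_lseek,", buf, sizeof(buf)) == 0);
	assert(strcmp(buf, "\t.proc_lseek\t= seq_lseek,") == 0);
}
```

Because the pattern requires only whitespace and '=' directly after
".proc_read", a line that has already been converted no longer matches,
so running the substitution twice should be harmless.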

diff --git a/arch/alpha/kernel/srm_env.c b/arch/alpha/kernel/srm_env.c
index 528d2be5818298..8ad9c100ef7612 100644
--- a/arch/alpha/kernel/srm_env.c
+++ b/arch/alpha/kernel/srm_env.c
@@ -121,7 +121,7 @@ static ssize_t srm_env_proc_write(struct file *file, const char __user *buffer,
 
 static const struct proc_ops srm_env_proc_ops = {
 	.proc_open	= srm_env_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= srm_env_proc_write,
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 81a627e6e1c599..412cab88402acd 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -164,7 +164,7 @@ static ssize_t alignment_proc_write(struct file *file, const char __user *buffer
 
 static const struct proc_ops alignment_proc_ops = {
 	.proc_open	= alignment_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= alignment_proc_write,
diff --git a/arch/powerpc/kernel/rtas-proc.c b/arch/powerpc/kernel/rtas-proc.c
index 2d33f342a29307..3aace56aacc1df 100644
--- a/arch/powerpc/kernel/rtas-proc.c
+++ b/arch/powerpc/kernel/rtas-proc.c
@@ -161,7 +161,7 @@ static int poweron_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops ppc_rtas_poweron_proc_ops = {
 	.proc_open	= poweron_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= ppc_rtas_poweron_write,
 	.proc_release	= single_release,
@@ -174,7 +174,7 @@ static int progress_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops ppc_rtas_progress_proc_ops = {
 	.proc_open	= progress_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= ppc_rtas_progress_write,
 	.proc_release	= single_release,
@@ -187,7 +187,7 @@ static int clock_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops ppc_rtas_clock_proc_ops = {
 	.proc_open	= clock_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= ppc_rtas_clock_write,
 	.proc_release	= single_release,
@@ -200,7 +200,7 @@ static int tone_freq_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops ppc_rtas_tone_freq_proc_ops = {
 	.proc_open	= tone_freq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= ppc_rtas_tone_freq_write,
 	.proc_release	= single_release,
@@ -213,7 +213,7 @@ static int tone_volume_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops ppc_rtas_tone_volume_proc_ops = {
 	.proc_open	= tone_volume_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= ppc_rtas_tone_volume_write,
 	.proc_release	= single_release,
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 9fcf2d19583004..2f3aef6c0b513b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1672,7 +1672,7 @@ static ssize_t topology_write(struct file *file, const char __user *buf,
 }
 
 static const struct proc_ops topology_proc_ops = {
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= topology_write,
 	.proc_open	= topology_open,
 	.proc_release	= single_release,
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index fd26f3d21d7b4b..2b13a67d60e206 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -584,7 +584,7 @@ static int vcpudispatch_stats_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops vcpudispatch_stats_proc_ops = {
 	.proc_open	= vcpudispatch_stats_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= vcpudispatch_stats_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
@@ -628,7 +628,7 @@ static int vcpudispatch_stats_freq_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops vcpudispatch_stats_freq_proc_ops = {
 	.proc_open	= vcpudispatch_stats_freq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= vcpudispatch_stats_freq_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
index b8d28ab881789d..35eb0e4b8fd31a 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -699,7 +699,7 @@ static int lparcfg_open(struct inode *inode, struct file *file)
 }
 
 static const struct proc_ops lparcfg_proc_ops = {
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= lparcfg_write,
 	.proc_open	= lparcfg_open,
 	.proc_release	= single_release,
diff --git a/arch/sh/mm/alignment.c b/arch/sh/mm/alignment.c
index fb517b82a87b10..66115241ad93a4 100644
--- a/arch/sh/mm/alignment.c
+++ b/arch/sh/mm/alignment.c
@@ -154,7 +154,7 @@ static ssize_t alignment_proc_write(struct file *file,
 
 static const struct proc_ops alignment_proc_ops = {
 	.proc_open	= alignment_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= alignment_proc_write,
diff --git a/arch/sparc/kernel/led.c b/arch/sparc/kernel/led.c
index bd48575172c323..a0b893d216c443 100644
--- a/arch/sparc/kernel/led.c
+++ b/arch/sparc/kernel/led.c
@@ -106,7 +106,7 @@ static ssize_t led_proc_write(struct file *file, const char __user *buffer,
 
 static const struct proc_ops led_proc_ops = {
 	.proc_open	= led_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= led_proc_write,
diff --git a/arch/um/kernel/exitcode.c b/arch/um/kernel/exitcode.c
index 43edc2aa57e4fb..95184d271a47cf 100644
--- a/arch/um/kernel/exitcode.c
+++ b/arch/um/kernel/exitcode.c
@@ -57,7 +57,7 @@ static ssize_t exitcode_proc_write(struct file *file,
 
 static const struct proc_ops exitcode_proc_ops = {
 	.proc_open	= exitcode_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= exitcode_proc_write,
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index e3a2cf92a3738b..f3e4bd48f6d5b6 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -312,7 +312,7 @@ static ssize_t sysemu_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops sysemu_proc_ops = {
 	.proc_open	= sysemu_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= sysemu_proc_write,
diff --git a/arch/x86/kernel/cpu/mtrr/if.c b/arch/x86/kernel/cpu/mtrr/if.c
index a5c506f6da7fa1..f5743b5ecaf232 100644
--- a/arch/x86/kernel/cpu/mtrr/if.c
+++ b/arch/x86/kernel/cpu/mtrr/if.c
@@ -398,7 +398,7 @@ static int mtrr_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops mtrr_proc_ops = {
 	.proc_open		= mtrr_open,
-	.proc_read		= seq_read,
+	.proc_read_iter		= seq_read_iter,
 	.proc_lseek		= seq_lseek,
 	.proc_write		= mtrr_write,
 	.proc_ioctl		= mtrr_ioctl,
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 0ac96ca304c7b5..cd6ee86283e6e1 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -1670,7 +1670,7 @@ static int tunables_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops uv_ptc_proc_ops = {
 	.proc_open	= ptc_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= ptc_proc_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
index 366c389175d844..c8e5972ad6c952 100644
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -1204,7 +1204,7 @@ static int acpi_battery_alarm_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops acpi_battery_alarm_proc_ops = {
 	.proc_open	= acpi_battery_alarm_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= acpi_battery_write_alarm,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
diff --git a/drivers/acpi/proc.c b/drivers/acpi/proc.c
index 7892980b3ce4d3..774da498c77265 100644
--- a/drivers/acpi/proc.c
+++ b/drivers/acpi/proc.c
@@ -136,7 +136,7 @@ acpi_system_wakeup_device_open_fs(struct inode *inode, struct file *file)
 
 static const struct proc_ops acpi_system_wakeup_device_proc_ops = {
 	.proc_open	= acpi_system_wakeup_device_open_fs,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= acpi_system_write_wakeup_device,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index 16be012a95ed84..29ac388429620f 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -597,7 +597,7 @@ static int i8k_open_fs(struct inode *inode, struct file *file)
 
 static const struct proc_ops i8k_proc_ops = {
 	.proc_open	= i8k_open_fs,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_ioctl	= i8k_ioctl,
diff --git a/drivers/ide/ide-proc.c b/drivers/ide/ide-proc.c
index 8ea282a3a19f0d..a0b56737152fb5 100644
--- a/drivers/ide/ide-proc.c
+++ b/drivers/ide/ide-proc.c
@@ -383,7 +383,7 @@ static ssize_t ide_settings_proc_write(struct file *file, const char __user *buf
 
 static const struct proc_ops ide_settings_proc_ops = {
 	.proc_open	= ide_settings_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= ide_settings_proc_write,
diff --git a/drivers/input/input.c b/drivers/input/input.c
index 3cfd2c18eebd9d..c8180d7f92d576 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1220,7 +1220,7 @@ static int input_proc_devices_open(struct inode *inode, struct file *file)
 static const struct proc_ops input_devices_proc_ops = {
 	.proc_open	= input_proc_devices_open,
 	.proc_poll	= input_proc_devices_poll,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
@@ -1282,7 +1282,7 @@ static int input_proc_handlers_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops input_handlers_proc_ops = {
 	.proc_open	= input_proc_handlers_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 73e6ae88fafd4e..9415eddb419402 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -973,7 +973,7 @@ static ssize_t pmu_options_proc_write(struct file *file,
 
 static const struct proc_ops pmu_options_proc_ops = {
 	.proc_open	= pmu_options_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= pmu_options_proc_write,
diff --git a/drivers/md/md.c b/drivers/md/md.c
index f567f536b529bd..0bae6c1523cec6 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8301,7 +8301,7 @@ static __poll_t mdstat_poll(struct file *filp, poll_table *wait)
 
 static const struct proc_ops mdstat_proc_ops = {
 	.proc_open	= md_seq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 	.proc_poll	= mdstat_poll,
diff --git a/drivers/misc/sgi-gru/gruprocfs.c b/drivers/misc/sgi-gru/gruprocfs.c
index 97b8b38ab47dfd..fc9498ec797762 100644
--- a/drivers/misc/sgi-gru/gruprocfs.c
+++ b/drivers/misc/sgi-gru/gruprocfs.c
@@ -257,7 +257,7 @@ static int options_open(struct inode *inode, struct file *file)
 /* *INDENT-OFF* */
 static const struct proc_ops statistics_proc_ops = {
 	.proc_open	= statistics_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= statistics_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
@@ -265,7 +265,7 @@ static const struct proc_ops statistics_proc_ops = {
 
 static const struct proc_ops mcs_statistics_proc_ops = {
 	.proc_open	= mcs_statistics_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= mcs_statistics_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
@@ -273,7 +273,7 @@ static const struct proc_ops mcs_statistics_proc_ops = {
 
 static const struct proc_ops options_proc_ops = {
 	.proc_open	= options_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= options_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
diff --git a/drivers/net/wireless/intel/ipw2x00/libipw_module.c b/drivers/net/wireless/intel/ipw2x00/libipw_module.c
index 43bab92a4148f2..1929db6921d7e0 100644
--- a/drivers/net/wireless/intel/ipw2x00/libipw_module.c
+++ b/drivers/net/wireless/intel/ipw2x00/libipw_module.c
@@ -242,7 +242,7 @@ static ssize_t debug_level_proc_write(struct file *file,
 
 static const struct proc_ops debug_level_proc_ops = {
 	.proc_open	= debug_level_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= debug_level_proc_write,
diff --git a/drivers/net/wireless/intersil/hostap/hostap_download.c b/drivers/net/wireless/intersil/hostap/hostap_download.c
index 7c6a5a6d1d45d8..8980fd57b2eda4 100644
--- a/drivers/net/wireless/intersil/hostap/hostap_download.c
+++ b/drivers/net/wireless/intersil/hostap/hostap_download.c
@@ -234,7 +234,7 @@ static int prism2_download_aux_dump_proc_open(struct inode *inode, struct file *
 
 static const struct proc_ops prism2_download_aux_dump_proc_ops = {
 	.proc_open		= prism2_download_aux_dump_proc_open,
-	.proc_read		= seq_read,
+	.proc_read_iter		= seq_read_iter,
 	.proc_lseek		= seq_lseek,
 	.proc_release		= seq_release_private,
 };
diff --git a/drivers/parisc/led.c b/drivers/parisc/led.c
index 36c6613f7a36b7..d75df3977926b3 100644
--- a/drivers/parisc/led.c
+++ b/drivers/parisc/led.c
@@ -232,7 +232,7 @@ static ssize_t led_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops led_proc_ops = {
 	.proc_open	= led_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= led_proc_write,
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index ff7f0a4f247563..f571d6254e7c34 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -901,7 +901,7 @@ static ssize_t dispatch_proc_write(struct file *file,
 
 static const struct proc_ops dispatch_proc_ops = {
 	.proc_open	= dispatch_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= dispatch_proc_write,
diff --git a/drivers/platform/x86/toshiba_acpi.c b/drivers/platform/x86/toshiba_acpi.c
index 1ddab5a6dead6d..770477bb407d49 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -1428,7 +1428,7 @@ static ssize_t lcd_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops lcd_proc_ops = {
 	.proc_open	= lcd_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= lcd_proc_write,
@@ -1534,7 +1534,7 @@ static ssize_t video_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops video_proc_ops = {
 	.proc_open	= video_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= video_proc_write,
@@ -1611,7 +1611,7 @@ static ssize_t fan_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops fan_proc_ops = {
 	.proc_open	= fan_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= fan_proc_write,
@@ -1655,7 +1655,7 @@ static ssize_t keys_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops keys_proc_ops = {
 	.proc_open	= keys_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= keys_proc_write,
diff --git a/drivers/pnp/pnpbios/proc.c b/drivers/pnp/pnpbios/proc.c
index a806830e3a407f..10d0181c4430ab 100644
--- a/drivers/pnp/pnpbios/proc.c
+++ b/drivers/pnp/pnpbios/proc.c
@@ -212,7 +212,7 @@ static ssize_t pnpbios_proc_write(struct file *file, const char __user *buf,
 
 static const struct proc_ops pnpbios_proc_ops = {
 	.proc_open	= pnpbios_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= pnpbios_proc_write,
diff --git a/drivers/s390/block/dasd_proc.c b/drivers/s390/block/dasd_proc.c
index 62a859ea67f893..278f0dccc85ff1 100644
--- a/drivers/s390/block/dasd_proc.c
+++ b/drivers/s390/block/dasd_proc.c
@@ -322,7 +322,7 @@ static ssize_t dasd_stats_proc_write(struct file *file,
 
 static const struct proc_ops dasd_stats_proc_ops = {
 	.proc_open	= dasd_stats_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= dasd_stats_proc_write,
diff --git a/drivers/s390/cio/blacklist.c b/drivers/s390/cio/blacklist.c
index 4dd2eb63485699..05f58c453b060c 100644
--- a/drivers/s390/cio/blacklist.c
+++ b/drivers/s390/cio/blacklist.c
@@ -401,7 +401,7 @@ cio_ignore_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops cio_ignore_proc_ops = {
 	.proc_open	= cio_ignore_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release_private,
 	.proc_write	= cio_ignore_write,
diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c
index eed31021e7885c..87fb440ddfc5d8 100644
--- a/drivers/scsi/scsi_devinfo.c
+++ b/drivers/scsi/scsi_devinfo.c
@@ -738,7 +738,7 @@ static ssize_t proc_scsi_devinfo_write(struct file *file,
 
 static const struct proc_ops scsi_devinfo_proc_ops = {
 	.proc_open	= proc_scsi_devinfo_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= proc_scsi_devinfo_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/scsi_proc.c
index d6982d3557396b..81601a9e79c4db 100644
--- a/drivers/scsi/scsi_proc.c
+++ b/drivers/scsi/scsi_proc.c
@@ -86,7 +86,7 @@ static int proc_scsi_host_open(struct inode *inode, struct file *file)
 static const struct proc_ops proc_scsi_ops = {
 	.proc_open	= proc_scsi_host_open,
 	.proc_release	= single_release,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= proc_scsi_host_write
 };
@@ -438,7 +438,7 @@ static int proc_scsi_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops scsi_scsi_proc_ops = {
 	.proc_open	= proc_scsi_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= proc_scsi_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 20472aaaf630a4..c5d482190066bc 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -2328,7 +2328,7 @@ static ssize_t sg_proc_write_adio(struct file *filp, const char __user *buffer,
 			          size_t count, loff_t *off);
 static const struct proc_ops adio_proc_ops = {
 	.proc_open	= sg_proc_single_open_adio,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= sg_proc_write_adio,
 	.proc_release	= single_release,
@@ -2339,7 +2339,7 @@ static ssize_t sg_proc_write_dressz(struct file *filp,
 		const char __user *buffer, size_t count, loff_t *off);
 static const struct proc_ops dressz_proc_ops = {
 	.proc_open	= sg_proc_single_open_dressz,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= sg_proc_write_dressz,
 	.proc_release	= single_release,
diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
index a5a1b14f5a40c5..e198779db6fc2e 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
@@ -265,7 +265,7 @@ static int open_debug_level(struct inode *inode, struct file *file)
 
 static const struct proc_ops debug_level_proc_ops = {
 	.proc_open	= open_debug_level,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= write_debug_level,
 	.proc_release	= single_release,
diff --git a/drivers/usb/gadget/function/rndis.c b/drivers/usb/gadget/function/rndis.c
index 64de9f1b874c55..562781b95101d3 100644
--- a/drivers/usb/gadget/function/rndis.c
+++ b/drivers/usb/gadget/function/rndis.c
@@ -1166,7 +1166,7 @@ static int rndis_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops rndis_proc_ops = {
 	.proc_open	= rndis_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= rndis_proc_write,
diff --git a/drivers/video/fbdev/via/viafbdev.c b/drivers/video/fbdev/via/viafbdev.c
index 22deb340a0484f..6cf91191de7f15 100644
--- a/drivers/video/fbdev/via/viafbdev.c
+++ b/drivers/video/fbdev/via/viafbdev.c
@@ -1175,7 +1175,7 @@ static ssize_t viafb_dvp0_proc_write(struct file *file,
 
 static const struct proc_ops viafb_dvp0_proc_ops = {
 	.proc_open	= viafb_dvp0_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_dvp0_proc_write,
@@ -1239,7 +1239,7 @@ static ssize_t viafb_dvp1_proc_write(struct file *file,
 
 static const struct proc_ops viafb_dvp1_proc_ops = {
 	.proc_open	= viafb_dvp1_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_dvp1_proc_write,
@@ -1273,7 +1273,7 @@ static ssize_t viafb_dfph_proc_write(struct file *file,
 
 static const struct proc_ops viafb_dfph_proc_ops = {
 	.proc_open	= viafb_dfph_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_dfph_proc_write,
@@ -1307,7 +1307,7 @@ static ssize_t viafb_dfpl_proc_write(struct file *file,
 
 static const struct proc_ops viafb_dfpl_proc_ops = {
 	.proc_open	= viafb_dfpl_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_dfpl_proc_write,
@@ -1442,7 +1442,7 @@ static ssize_t viafb_vt1636_proc_write(struct file *file,
 
 static const struct proc_ops viafb_vt1636_proc_ops = {
 	.proc_open	= viafb_vt1636_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_vt1636_proc_write,
@@ -1519,7 +1519,7 @@ static ssize_t viafb_iga1_odev_proc_write(struct file *file,
 
 static const struct proc_ops viafb_iga1_odev_proc_ops = {
 	.proc_open	= viafb_iga1_odev_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_iga1_odev_proc_write,
@@ -1558,7 +1558,7 @@ static ssize_t viafb_iga2_odev_proc_write(struct file *file,
 
 static const struct proc_ops viafb_iga2_odev_proc_ops = {
 	.proc_open	= viafb_iga2_odev_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= viafb_iga2_odev_proc_write,
diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c
index 53588d7517b4d0..877763996fa987 100644
--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -619,7 +619,7 @@ static int cifs_stats_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops cifs_stats_proc_ops = {
 	.proc_open	= cifs_stats_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= cifs_stats_proc_write,
@@ -648,7 +648,7 @@ static int name##_open(struct inode *inode, struct file *file) \
 \
 static const struct proc_ops cifs_##name##_proc_fops = { \
 	.proc_open	= name##_open, \
-	.proc_read	= seq_read, \
+	.proc_read_iter	= seq_read_iter, \
 	.proc_lseek	= seq_lseek, \
 	.proc_release	= single_release, \
 	.proc_write	= name##_write, \
@@ -782,7 +782,7 @@ static ssize_t cifsFYI_proc_write(struct file *file, const char __user *buffer,
 
 static const struct proc_ops cifsFYI_proc_ops = {
 	.proc_open	= cifsFYI_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= cifsFYI_proc_write,
@@ -813,7 +813,7 @@ static ssize_t cifs_linux_ext_proc_write(struct file *file,
 
 static const struct proc_ops cifs_linux_ext_proc_ops = {
 	.proc_open	= cifs_linux_ext_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= cifs_linux_ext_proc_write,
@@ -844,7 +844,7 @@ static ssize_t cifs_lookup_cache_proc_write(struct file *file,
 
 static const struct proc_ops cifs_lookup_cache_proc_ops = {
 	.proc_open	= cifs_lookup_cache_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= cifs_lookup_cache_proc_write,
@@ -875,7 +875,7 @@ static ssize_t traceSMB_proc_write(struct file *file, const char __user *buffer,
 
 static const struct proc_ops traceSMB_proc_ops = {
 	.proc_open	= traceSMB_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= traceSMB_proc_write,
@@ -986,7 +986,7 @@ static ssize_t cifs_security_flags_proc_write(struct file *file,
 
 static const struct proc_ops cifs_security_flags_proc_ops = {
 	.proc_open	= cifs_security_flags_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= cifs_security_flags_proc_write,
diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index df81c718d2faec..a4fe155fc92a7a 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -214,7 +214,7 @@ static int dfscache_proc_open(struct inode *inode, struct file *file)
 
 const struct proc_ops dfscache_proc_ops = {
 	.proc_open	= dfscache_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= dfscache_proc_write,
diff --git a/fs/fscache/object-list.c b/fs/fscache/object-list.c
index e106a1a1600d82..fab5a4197f50c3 100644
--- a/fs/fscache/object-list.c
+++ b/fs/fscache/object-list.c
@@ -408,7 +408,7 @@ static int fscache_objlist_release(struct inode *inode, struct file *file)
 
 const struct proc_ops fscache_objlist_proc_ops = {
 	.proc_open	= fscache_objlist_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= fscache_objlist_release,
 };
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index e4944436e733d0..db661b953c4378 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1080,7 +1080,7 @@ static int jbd2_seq_info_release(struct inode *inode, struct file *file)
 
 static const struct proc_ops jbd2_info_proc_ops = {
 	.proc_open	= jbd2_seq_info_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= jbd2_seq_info_release,
 };
diff --git a/fs/jfs/jfs_debug.c b/fs/jfs/jfs_debug.c
index 44b62b3c322e9a..235df6bac1d71a 100644
--- a/fs/jfs/jfs_debug.c
+++ b/fs/jfs/jfs_debug.c
@@ -45,7 +45,7 @@ static ssize_t jfs_loglevel_proc_write(struct file *file,
 
 static const struct proc_ops jfs_loglevel_proc_ops = {
 	.proc_open	= jfs_loglevel_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= jfs_loglevel_proc_write,
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 583ede369fd7cd..c6f2faa759d1fb 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -159,7 +159,7 @@ static int exports_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops exports_proc_ops = {
 	.proc_open	= exports_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c
index b1bc582b0493e4..1076f87715ad88 100644
--- a/fs/nfsd/stats.c
+++ b/fs/nfsd/stats.c
@@ -86,7 +86,7 @@ static int nfsd_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops nfsd_proc_ops = {
 	.proc_open	= nfsd_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 };
diff --git a/fs/proc/cpuinfo.c b/fs/proc/cpuinfo.c
index d0989a443c77df..419760fd77bdd8 100644
--- a/fs/proc/cpuinfo.c
+++ b/fs/proc/cpuinfo.c
@@ -19,7 +19,7 @@ static int cpuinfo_open(struct inode *inode, struct file *file)
 static const struct proc_ops cpuinfo_proc_ops = {
 	.proc_flags	= PROC_ENTRY_PERMANENT,
 	.proc_open	= cpuinfo_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 2f9fa179194d72..4323b28db5643a 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -590,7 +590,7 @@ static int proc_seq_release(struct inode *inode, struct file *file)
 static const struct proc_ops proc_seq_ops = {
 	/* not permanent -- can call into arbitrary seq_operations */
 	.proc_open	= proc_seq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= proc_seq_release,
 };
@@ -621,7 +621,7 @@ static int proc_single_open(struct inode *inode, struct file *file)
 static const struct proc_ops proc_single_ops = {
 	/* not permanent -- can call into arbitrary ->single_show */
 	.proc_open	= proc_single_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 };
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index dba63b2429f05f..8274f4fc4d4338 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -92,7 +92,7 @@ static int seq_release_net(struct inode *ino, struct file *f)
 
 static const struct proc_ops proc_net_seq_ops = {
 	.proc_open	= seq_open_net,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= proc_simple_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release_net,
@@ -204,7 +204,7 @@ static int single_release_net(struct inode *ino, struct file *f)
 
 static const struct proc_ops proc_net_single_ops = {
 	.proc_open	= single_open_net,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= proc_simple_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release_net,
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 46b3293015fe61..4695b6de315129 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -226,7 +226,7 @@ static int stat_open(struct inode *inode, struct file *file)
 static const struct proc_ops stat_proc_ops = {
 	.proc_flags	= PROC_ENTRY_PERMANENT,
 	.proc_open	= stat_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 };
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index 0c9e9c8607e788..04f342179f5572 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -187,7 +187,7 @@ static int __name ## _open(struct inode *inode, struct file *file)	\
 									\
 static const struct proc_ops __name ## _proc_ops = {			\
 	.proc_open	= __name ## _open,				\
-	.proc_read	= seq_read,					\
+	.proc_read_iter	= seq_read_iter,					\
 	.proc_lseek	= seq_lseek,					\
 	.proc_release	= single_release,				\
 }
diff --git a/ipc/util.c b/ipc/util.c
index cfa0045e748d55..189c835108afc8 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -887,7 +887,7 @@ static int sysvipc_proc_release(struct inode *inode, struct file *file)
 static const struct proc_ops sysvipc_proc_ops = {
 	.proc_flags	= PROC_ENTRY_PERMANENT,
 	.proc_open	= sysvipc_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= sysvipc_proc_release,
 };
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 32c071d7bc0338..6c541898f614c4 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -200,7 +200,7 @@ static int irq_affinity_list_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops irq_affinity_proc_ops = {
 	.proc_open	= irq_affinity_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= irq_affinity_proc_write,
@@ -208,7 +208,7 @@ static const struct proc_ops irq_affinity_proc_ops = {
 
 static const struct proc_ops irq_affinity_list_proc_ops = {
 	.proc_open	= irq_affinity_list_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= irq_affinity_list_proc_write,
@@ -270,7 +270,7 @@ static int default_affinity_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops default_affinity_proc_ops = {
 	.proc_open	= default_affinity_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= default_affinity_write,
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 16c8c605f4b0fa..90facda3b723ab 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -699,7 +699,7 @@ const char *kdb_walk_kallsyms(loff_t *pos)
 
 static const struct proc_ops kallsyms_proc_ops = {
 	.proc_open	= kallsyms_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release_private,
 };
diff --git a/kernel/latencytop.c b/kernel/latencytop.c
index 166d7bf49666b0..543c7f552c45ce 100644
--- a/kernel/latencytop.c
+++ b/kernel/latencytop.c
@@ -257,7 +257,7 @@ static int lstats_open(struct inode *inode, struct file *filp)
 
 static const struct proc_ops lstats_proc_ops = {
 	.proc_open	= lstats_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= lstats_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 5525cd3ba0c83c..60c725f53c26d7 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -669,7 +669,7 @@ static int lock_stat_release(struct inode *inode, struct file *file)
 static const struct proc_ops lock_stat_proc_ops = {
 	.proc_open	= lock_stat_open,
 	.proc_write	= lock_stat_write,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= lock_stat_release,
 };
diff --git a/kernel/module.c b/kernel/module.c
index bee1c25ca5c5ec..ed5a99d520b40c 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4388,7 +4388,7 @@ static int modules_open(struct inode *inode, struct file *file)
 static const struct proc_ops modules_proc_ops = {
 	.proc_flags	= PROC_ENTRY_PERMANENT,
 	.proc_open	= modules_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
diff --git a/kernel/profile.c b/kernel/profile.c
index 6f69a4195d5630..101090397235ae 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -444,7 +444,7 @@ static ssize_t prof_cpu_mask_proc_write(struct file *file,
 
 static const struct proc_ops prof_cpu_mask_proc_ops = {
 	.proc_open	= prof_cpu_mask_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 	.proc_write	= prof_cpu_mask_proc_write,
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 8f45cdb6463b88..6795170140a031 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1305,7 +1305,7 @@ static int psi_fop_release(struct inode *inode, struct file *file)
 
 static const struct proc_ops psi_io_proc_ops = {
 	.proc_open	= psi_io_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= psi_io_write,
 	.proc_poll	= psi_fop_poll,
@@ -1314,7 +1314,7 @@ static const struct proc_ops psi_io_proc_ops = {
 
 static const struct proc_ops psi_memory_proc_ops = {
 	.proc_open	= psi_memory_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= psi_memory_write,
 	.proc_poll	= psi_fop_poll,
@@ -1323,7 +1323,7 @@ static const struct proc_ops psi_memory_proc_ops = {
 
 static const struct proc_ops psi_cpu_proc_ops = {
 	.proc_open	= psi_cpu_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= psi_cpu_write,
 	.proc_poll	= psi_fop_poll,
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 76750e73dcaf9d..2317d29e16def9 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -878,7 +878,7 @@ static const struct file_operations ddebug_proc_fops = {
 
 static const struct proc_ops proc_fops = {
 	.proc_open = ddebug_proc_open,
-	.proc_read = seq_read,
+	.proc_read_iter = seq_read_iter,
 	.proc_lseek = seq_lseek,
 	.proc_release = seq_release_private,
 	.proc_write = ddebug_proc_write
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 37d48a56431d04..5cf40d2d721c0b 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1584,7 +1584,7 @@ static int slabinfo_open(struct inode *inode, struct file *file)
 static const struct proc_ops slabinfo_proc_ops = {
 	.proc_flags	= PROC_ENTRY_PERMANENT,
 	.proc_open	= slabinfo_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= slabinfo_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 987276c557d1f1..2e50f52a14c7c2 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2835,7 +2835,7 @@ static int swaps_open(struct inode *inode, struct file *file)
 static const struct proc_ops swaps_proc_ops = {
 	.proc_flags	= PROC_ENTRY_PERMANENT,
 	.proc_open	= swaps_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 	.proc_poll	= swaps_poll,
diff --git a/net/atm/mpoa_proc.c b/net/atm/mpoa_proc.c
index 829db9eba0cb95..fe8f822c7750a6 100644
--- a/net/atm/mpoa_proc.c
+++ b/net/atm/mpoa_proc.c
@@ -55,7 +55,7 @@ static int parse_qos(const char *buff);
 
 static const struct proc_ops mpc_proc_ops = {
 	.proc_open	= proc_mpc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= proc_mpc_write,
 	.proc_release	= seq_release,
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index b53b6d38c4dff8..200b976202d0d1 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -537,7 +537,7 @@ static int pgctrl_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops pktgen_proc_ops = {
 	.proc_open	= pgctrl_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= pgctrl_write,
 	.proc_release	= single_release,
@@ -1709,7 +1709,7 @@ static int pktgen_if_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops pktgen_if_proc_ops = {
 	.proc_open	= pktgen_if_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= pktgen_if_write,
 	.proc_release	= single_release,
@@ -1846,7 +1846,7 @@ static int pktgen_thread_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops pktgen_thread_proc_ops = {
 	.proc_open	= pktgen_thread_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_write	= pktgen_thread_write,
 	.proc_release	= single_release,
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index f8755a4ae9d4bd..67472389c9c395 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -806,7 +806,7 @@ static ssize_t clusterip_proc_write(struct file *file, const char __user *input,
 
 static const struct proc_ops clusterip_proc_ops = {
 	.proc_open	= clusterip_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= clusterip_proc_write,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= clusterip_proc_release,
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 1d7076b78e630b..4d5f26478f1001 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -239,7 +239,7 @@ static int rt_cache_seq_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops rt_cache_proc_ops = {
 	.proc_open	= rt_cache_seq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
@@ -330,7 +330,7 @@ static int rt_cpu_seq_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops rt_cpu_proc_ops = {
 	.proc_open	= rt_cpu_seq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= seq_release,
 };
diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index 19bef176145eb9..f9cc00f7486058 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -618,7 +618,7 @@ recent_mt_proc_write(struct file *file, const char __user *input,
 
 static const struct proc_ops recent_mt_proc_ops = {
 	.proc_open	= recent_seq_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_write	= recent_mt_proc_write,
 	.proc_release	= seq_release_private,
 	.proc_lseek	= seq_lseek,
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index e5c01697c3f1d6..3671c464e0d30f 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1623,7 +1623,7 @@ static int content_release_procfs(struct inode *inode, struct file *filp)
 
 static const struct proc_ops content_proc_ops = {
 	.proc_open	= content_open_procfs,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= content_release_procfs,
 };
diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c
index c964b48eaabae4..95b56f0e5a01e8 100644
--- a/net/sunrpc/stats.c
+++ b/net/sunrpc/stats.c
@@ -71,7 +71,7 @@ static int rpc_proc_open(struct inode *inode, struct file *file)
 
 static const struct proc_ops rpc_proc_ops = {
 	.proc_open	= rpc_proc_open,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 	.proc_lseek	= seq_lseek,
 	.proc_release	= single_release,
 };
diff --git a/sound/core/info.c b/sound/core/info.c
index 8c6bc5241df50c..6e2a35a37b5e6b 100644
--- a/sound/core/info.c
+++ b/sound/core/info.c
@@ -426,7 +426,7 @@ static const struct proc_ops snd_info_text_entry_ops =
 	.proc_release	= snd_info_text_entry_release,
 	.proc_write	= snd_info_text_entry_write,
 	.proc_lseek	= seq_lseek,
-	.proc_read	= seq_read,
+	.proc_read_iter	= seq_read_iter,
 };
 
 static struct snd_info_entry *create_subdir(struct module *mod,
-- 
2.26.2



* [PATCH 20/23] sysctl: Convert to iter interfaces
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (17 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 19/23] proc: switch over direct seq_read method calls to seq_read_iter Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:47 ` [PATCH 21/23] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Using the read_iter/write_iter interfaces allows for in-kernel users
to set sysctls without using set_fs().  Also, the buffer is a string,
so give it the real type of 'char *', not void *.
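
As a rough illustration of the buffer handling described above, here is a
userspace sketch, not the kernel code.  It assumes plain memcpy() in place
of copy_from_iter_full() and copy_to_iter(), and every model_* name
(including the size cap and the sample handler) is hypothetical.  It shows
the single count + 1 allocation used for both directions and the explicit
NUL termination that lets a handler treat the buffer as a string.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cap standing in for the KMALLOC_MAX_SIZE check. */
#define MODEL_MAX_SIZE (1u << 20)

/* Hypothetical stand-in for a sysctl proc_handler taking a char * buffer. */
typedef int (*model_handler_t)(int write, char *kbuf, size_t *count);

/*
 * Userspace model of the buffer handling in proc_sys_call_handler().
 * 'src' plays the role of the iov_iter payload on write, 'dst' the
 * destination of the iov_iter on read.
 */
static ssize_t model_call_handler(model_handler_t handler, int write,
				  const char *src, char *dst, size_t count)
{
	char *kbuf;
	ssize_t ret;
	int err;

	assert(handler);
	if (count >= MODEL_MAX_SIZE)
		return -ENOMEM;

	/* One zeroed allocation for both directions, with room for a NUL. */
	kbuf = calloc(count + 1, 1);
	if (!kbuf)
		return -ENOMEM;

	if (write) {
		memcpy(kbuf, src, count);	/* copy_from_iter_full() in the kernel */
		kbuf[count] = '\0';
	}

	err = handler(write, kbuf, &count);	/* may shrink count */
	if (err) {
		ret = err;
		goto out;
	}

	if (!write)
		memcpy(dst, kbuf, count);	/* copy_to_iter() in the kernel */
	ret = (ssize_t)count;
out:
	free(kbuf);
	return ret;
}

/* Sample handler: exposes one integer, parsed and printed as text. */
static long model_value = 42;

static int model_int_handler(int write, char *kbuf, size_t *count)
{
	int n;

	if (write) {
		/* Relies on the NUL terminator added by the caller. */
		model_value = strtol(kbuf, NULL, 10);
		return 0;
	}
	n = snprintf(kbuf, *count + 1, "%ld\n", model_value);
	if (n < 0)
		return -EINVAL;
	if ((size_t)n < *count)
		*count = (size_t)n;
	return 0;
}
```

Calling model_call_handler(model_int_handler, 1, "1234xyz", NULL, 4) only
consumes the first four bytes, because the terminator is written at
kbuf[count] regardless of what follows in the source.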

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/proc/proc_sysctl.c      | 44 ++++++++++++++++++--------------------
 include/linux/bpf-cgroup.h |  2 +-
 kernel/bpf/cgroup.c        |  2 +-
 3 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 6c1166ccdaea57..9f6b9c3e3fdaf5 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -12,6 +12,7 @@
 #include <linux/cred.h>
 #include <linux/namei.h>
 #include <linux/mm.h>
+#include <linux/uio.h>
 #include <linux/module.h>
 #include <linux/bpf-cgroup.h>
 #include <linux/mount.h>
@@ -540,13 +541,14 @@ static struct dentry *proc_sys_lookup(struct inode *dir, struct dentry *dentry,
 	return err;
 }
 
-static ssize_t proc_sys_call_handler(struct file *filp, void __user *ubuf,
-		size_t count, loff_t *ppos, int write)
+static ssize_t proc_sys_call_handler(struct kiocb *iocb, struct iov_iter *iter,
+		int write)
 {
-	struct inode *inode = file_inode(filp);
+	struct inode *inode = file_inode(iocb->ki_filp);
 	struct ctl_table_header *head = grab_header(inode);
 	struct ctl_table *table = PROC_I(inode)->sysctl_entry;
-	void *kbuf;
+	size_t count = iov_iter_count(iter);
+	char *kbuf;
 	ssize_t error;
 
 	if (IS_ERR(head))
@@ -569,32 +571,30 @@ static ssize_t proc_sys_call_handler(struct file *filp, void __user *ubuf,
 	error = -ENOMEM;
 	if (count >= KMALLOC_MAX_SIZE)
 		goto out;
+	kbuf = kzalloc(count + 1, GFP_KERNEL);
+	if (!kbuf)
+		goto out;
 
 	if (write) {
-		kbuf = memdup_user_nul(ubuf, count);
-		if (IS_ERR(kbuf)) {
-			error = PTR_ERR(kbuf);
-			goto out;
-		}
-	} else {
-		kbuf = kzalloc(count, GFP_KERNEL);
-		if (!kbuf)
+		error = -EFAULT;
+		if (!copy_from_iter_full(kbuf, count, iter))
 			goto out;
+		kbuf[count] = '\0';
 	}
 
 	error = BPF_CGROUP_RUN_PROG_SYSCTL(head, table, write, &kbuf, &count,
-					   ppos);
+					   &iocb->ki_pos);
 	if (error)
 		goto out_free_buf;
 
 	/* careful: calling conventions are nasty here */
-	error = table->proc_handler(table, write, kbuf, &count, ppos);
+	error = table->proc_handler(table, write, kbuf, &count, &iocb->ki_pos);
 	if (error)
 		goto out_free_buf;
 
 	if (!write) {
 		error = -EFAULT;
-		if (copy_to_user(ubuf, kbuf, count))
+		if (copy_to_iter(kbuf, count, iter) < count)
 			goto out_free_buf;
 	}
 
@@ -607,16 +607,14 @@ static ssize_t proc_sys_call_handler(struct file *filp, void __user *ubuf,
 	return error;
 }
 
-static ssize_t proc_sys_read(struct file *filp, char __user *buf,
-				size_t count, loff_t *ppos)
+static ssize_t proc_sys_read(struct kiocb *iocb, struct iov_iter *iter)
 {
-	return proc_sys_call_handler(filp, (void __user *)buf, count, ppos, 0);
+	return proc_sys_call_handler(iocb, iter, 0);
 }
 
-static ssize_t proc_sys_write(struct file *filp, const char __user *buf,
-				size_t count, loff_t *ppos)
+static ssize_t proc_sys_write(struct kiocb *iocb, struct iov_iter *iter)
 {
-	return proc_sys_call_handler(filp, (void __user *)buf, count, ppos, 1);
+	return proc_sys_call_handler(iocb, iter, 1);
 }
 
 static int proc_sys_open(struct inode *inode, struct file *filp)
@@ -853,8 +851,8 @@ static int proc_sys_getattr(const struct path *path, struct kstat *stat,
 static const struct file_operations proc_sys_file_operations = {
 	.open		= proc_sys_open,
 	.poll		= proc_sys_poll,
-	.read		= proc_sys_read,
-	.write		= proc_sys_write,
+	.read_iter	= proc_sys_read,
+	.write_iter	= proc_sys_write,
 	.llseek		= default_llseek,
 };
 
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index c66c545e161a60..f81d3b3752f919 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -132,7 +132,7 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
 
 int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
 				   struct ctl_table *table, int write,
-				   void **buf, size_t *pcount, loff_t *ppos,
+				   char **buf, size_t *pcount, loff_t *ppos,
 				   enum bpf_attach_type type);
 
 int __cgroup_bpf_run_filter_setsockopt(struct sock *sock, int *level,
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index ac53102e244a7a..81dcf15990ebe1 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1202,7 +1202,7 @@ const struct bpf_verifier_ops cg_dev_verifier_ops = {
  */
 int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
 				   struct ctl_table *table, int write,
-				   void **buf, size_t *pcount, loff_t *ppos,
+				   char **buf, size_t *pcount, loff_t *ppos,
 				   enum bpf_attach_type type)
 {
 	struct bpf_sysctl_kern ctx = {
-- 
2.26.2



* [PATCH 21/23] fs: don't allow kernel reads and writes without iter ops
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (18 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 20/23] sysctl: Convert to iter interfaces Christoph Hellwig
@ 2020-07-07 17:47 ` Christoph Hellwig
  2020-07-07 17:48 ` [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter Christoph Hellwig
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:47 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Don't allow calling ->read or ->write with set_fs as a preparation for
killing off set_fs.  While I've not triggered any of these cases in my
setups, as all the usual suspects (file systems, pipes, sockets, block
devices, system character devices) use the iter ops, this is almost
guaranteed to eventually break something, so print a detailed
error message to help debug such cases.  The fix will be to switch the
affected driver to use the iter ops.
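
The resulting dispatch can be modelled in userspace roughly as follows.
This is only a sketch: the rw_model_* types and demo_* methods are
hypothetical simplifications of struct file and struct file_operations,
and the warning goes to stderr instead of pr_warn_ratelimited().  It shows
that only ->read_iter is ever called, and that a warning is printed only
when a legacy ->read exists but cannot be used.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

struct rw_model_file;

/* Hypothetical, simplified stand-in for struct file_operations. */
struct rw_model_fops {
	ssize_t (*read)(struct rw_model_file *f, char *ubuf, size_t len);
	ssize_t (*read_iter)(struct rw_model_file *f, void *kbuf, size_t len);
};

/* Hypothetical, simplified stand-in for struct file. */
struct rw_model_file {
	const char *name;
	const struct rw_model_fops *f_op;
};

/* Counts warnings so callers can observe when one was emitted. */
static int rw_model_warnings;

static void rw_model_warn_unsupported(const struct rw_model_file *f,
				      const char *op)
{
	rw_model_warnings++;
	fprintf(stderr, "kernel %s not supported for file %s\n", op, f->name);
}

/* Model of the __kernel_read() dispatch after this patch. */
static ssize_t rw_model_kernel_read(struct rw_model_file *f, void *buf,
				    size_t count)
{
	assert(f && f->f_op);

	if (f->f_op->read_iter)
		return f->f_op->read_iter(f, buf, count);

	/* A legacy ->read is never called with a kernel pointer any more. */
	if (f->f_op->read)
		rw_model_warn_unsupported(f, "read");
	return -EINVAL;
}

/* Sample methods and operation tables for trying the dispatch out. */
static ssize_t demo_read_iter(struct rw_model_file *f, void *kbuf, size_t len)
{
	size_t n = len < 2 ? len : 2;

	(void)f;
	memcpy(kbuf, "ok", n);
	return (ssize_t)n;
}

static ssize_t demo_read(struct rw_model_file *f, char *ubuf, size_t len)
{
	(void)f;
	(void)ubuf;
	(void)len;
	return 0;
}

static const struct rw_model_fops demo_iter_fops = { .read_iter = demo_read_iter };
static const struct rw_model_fops demo_legacy_fops = { .read = demo_read };
static const struct rw_model_fops demo_empty_fops = { NULL };
```

A file with neither method also gets -EINVAL, but silently, since there
is no driver to convert in that case.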

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8bec4418543994..11c55547cfc9d6 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -419,6 +419,13 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	return ret;
 }
 
+static void warn_unsupported(struct file *file, const char *op)
+{
+	pr_warn_ratelimited(
+		"kernel %s not supported for file %pD4 (pid: %d comm: %.20s)\n",
+		op, file, current->pid, current->comm);
+}
+
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	ssize_t ret;
@@ -430,13 +437,7 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->read) {
-		mm_segment_t old_fs = get_fs();
-
-		set_fs(KERNEL_DS);
-		ret = file->f_op->read(file, (void __user *)buf, count, pos);
-		set_fs(old_fs);
-	} else if (file->f_op->read_iter) {
+	if (file->f_op->read_iter) {
 		struct kvec iov = { .iov_base = buf, .iov_len = count };
 		struct kiocb kiocb;
 		struct iov_iter iter;
@@ -447,6 +448,8 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 		ret = file->f_op->read_iter(&kiocb, &iter);
 		*pos = kiocb.ki_pos;
 	} else {
+		if (file->f_op->read)
+			warn_unsupported(file, "read");
 		ret = -EINVAL;
 	}
 	if (ret > 0) {
@@ -530,14 +533,7 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
 
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write) {
-		mm_segment_t old_fs = get_fs();
-
-		set_fs(KERNEL_DS);
-		ret = file->f_op->write(file, (__force const char __user *)buf,
-				count, pos);
-		set_fs(old_fs);
-	} else if (file->f_op->write_iter) {
+	if (file->f_op->write_iter) {
 		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
 		struct kiocb kiocb;
 		struct iov_iter iter;
@@ -549,6 +545,8 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
 		if (ret > 0)
 			*pos = kiocb.ki_pos;
 	} else {
+		if (file->f_op->write)
+			warn_unsupported(file, "write");
 		ret = -EINVAL;
 	}
 	if (ret > 0) {
-- 
2.26.2



* [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (19 preceding siblings ...)
  2020-07-07 17:47 ` [PATCH 21/23] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
@ 2020-07-07 17:48 ` Christoph Hellwig
  2020-07-30  0:05   ` Al Viro
  2020-07-07 17:48 ` [PATCH 23/23] fs: don't allow splice read/write without explicit ops Christoph Hellwig
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:48 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

If a file implements the ->read_iter method, the iter based splice read
works and is always preferred over the ->read based one.  Use it by
default in do_splice_to and remove all the direct assignments of
generic_file_splice_read to file_operations.
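
As a rough illustration (not part of the patch), the lookup order that
do_splice_to follows after this change can be modelled in plain userspace C.
The struct fops_model type, the enum and pick_splice_read() are hypothetical
names made up for this sketch; only the order of the checks is taken from the
fs/splice.c hunk in this patch.

```c
/*
 * Userspace model of the splice-read dispatch order after this patch.
 * Hypothetical stand-in for the two relevant file_operations slots;
 * a non-zero field means "the method pointer is set".
 */
struct fops_model {
	int has_splice_read;	/* models f_op->splice_read != NULL */
	int has_read_iter;	/* models f_op->read_iter != NULL */
};

/* Which kernel helper the model says would service the splice read. */
enum splice_read_path {
	PATH_EXPLICIT_SPLICE_READ,	/* in->f_op->splice_read() */
	PATH_GENERIC_FILE_SPLICE_READ,	/* generic_file_splice_read() */
	PATH_DEFAULT_FILE_SPLICE_READ,	/* default_file_splice_read() */
};

static enum splice_read_path pick_splice_read(const struct fops_model *f)
{
	/* An explicit ->splice_read always wins. */
	if (f->has_splice_read)
		return PATH_EXPLICIT_SPLICE_READ;
	/* New in this patch: ->read_iter implies the generic iter path. */
	if (f->has_read_iter)
		return PATH_GENERIC_FILE_SPLICE_READ;
	/* Otherwise fall back to the ->read based default. */
	return PATH_DEFAULT_FILE_SPLICE_READ;
}
```

In this model a file that only sets has_read_iter lands on the generic path,
which is why the explicit .splice_read assignments removed in this patch
become redundant.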

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/adfs/file.c          | 1 -
 fs/affs/file.c          | 1 -
 fs/afs/file.c           | 1 -
 fs/bfs/file.c           | 1 -
 fs/block_dev.c          | 1 -
 fs/btrfs/file.c         | 1 -
 fs/ceph/file.c          | 1 -
 fs/cifs/cifsfs.c        | 6 ------
 fs/coda/file.c          | 1 -
 fs/cramfs/inode.c       | 1 -
 fs/ecryptfs/file.c      | 1 -
 fs/exfat/file.c         | 1 -
 fs/ext2/file.c          | 1 -
 fs/ext4/file.c          | 1 -
 fs/f2fs/file.c          | 1 -
 fs/fat/file.c           | 1 -
 fs/fuse/file.c          | 1 -
 fs/gfs2/file.c          | 2 --
 fs/hfs/inode.c          | 1 -
 fs/hfsplus/inode.c      | 1 -
 fs/hostfs/hostfs_kern.c | 1 -
 fs/hpfs/file.c          | 1 -
 fs/jffs2/file.c         | 1 -
 fs/jfs/file.c           | 1 -
 fs/minix/file.c         | 1 -
 fs/nfs/file.c           | 1 -
 fs/nfs/nfs4file.c       | 1 -
 fs/nilfs2/file.c        | 1 -
 fs/ntfs/file.c          | 1 -
 fs/ocfs2/file.c         | 2 --
 fs/omfs/file.c          | 1 -
 fs/ramfs/file-mmu.c     | 1 -
 fs/ramfs/file-nommu.c   | 1 -
 fs/read_write.c         | 1 -
 fs/reiserfs/file.c      | 1 -
 fs/romfs/mmap-nommu.c   | 1 -
 fs/splice.c             | 2 ++
 fs/sysv/file.c          | 1 -
 fs/ubifs/file.c         | 1 -
 fs/udf/file.c           | 1 -
 fs/ufs/file.c           | 1 -
 fs/vboxsf/file.c        | 1 -
 fs/xfs/xfs_file.c       | 1 -
 fs/zonefs/super.c       | 1 -
 mm/shmem.c              | 1 -
 45 files changed, 2 insertions(+), 51 deletions(-)

diff --git a/fs/adfs/file.c b/fs/adfs/file.c
index 754afb14a6ff74..b089b91c1870ae 100644
--- a/fs/adfs/file.c
+++ b/fs/adfs/file.c
@@ -28,7 +28,6 @@ const struct file_operations adfs_file_operations = {
 	.mmap		= generic_file_mmap,
 	.fsync		= generic_file_fsync,
 	.write_iter	= generic_file_write_iter,
-	.splice_read	= generic_file_splice_read,
 };
 
 const struct inode_operations adfs_file_inode_operations = {
diff --git a/fs/affs/file.c b/fs/affs/file.c
index a85817f54483f7..7d51cc2e3dabfa 100644
--- a/fs/affs/file.c
+++ b/fs/affs/file.c
@@ -975,7 +975,6 @@ const struct file_operations affs_file_operations = {
 	.open		= affs_file_open,
 	.release	= affs_file_release,
 	.fsync		= affs_file_fsync,
-	.splice_read	= generic_file_splice_read,
 };
 
 const struct inode_operations affs_file_inode_operations = {
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 6f6ed1605cfe30..2476f10383fbdd 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -32,7 +32,6 @@ const struct file_operations afs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= afs_file_write,
 	.mmap		= afs_file_mmap,
-	.splice_read	= generic_file_splice_read,
 	.fsync		= afs_fsync,
 	.lock		= afs_lock,
 	.flock		= afs_flock,
diff --git a/fs/bfs/file.c b/fs/bfs/file.c
index 0dceefc54b48ab..39088cc7492308 100644
--- a/fs/bfs/file.c
+++ b/fs/bfs/file.c
@@ -27,7 +27,6 @@ const struct file_operations bfs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
 };
 
 static int bfs_move_block(unsigned long from, unsigned long to,
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 0ae656e022fd57..0aa66a6075eb11 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -2160,7 +2160,6 @@ const struct file_operations def_blk_fops = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= compat_blkdev_ioctl,
 #endif
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= blkdev_fallocate,
 };
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 2520605afc256e..322cc65902d107 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -3507,7 +3507,6 @@ static int btrfs_file_open(struct inode *inode, struct file *filp)
 const struct file_operations btrfs_file_operations = {
 	.llseek		= btrfs_file_llseek,
 	.read_iter      = generic_file_read_iter,
-	.splice_read	= generic_file_splice_read,
 	.write_iter	= btrfs_file_write_iter,
 	.mmap		= btrfs_file_mmap,
 	.open		= btrfs_file_open,
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 160644ddaeed70..e28c27751e6b3b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -2507,7 +2507,6 @@ const struct file_operations ceph_file_fops = {
 	.fsync = ceph_fsync,
 	.lock = ceph_lock,
 	.flock = ceph_flock,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl = ceph_ioctl,
 	.compat_ioctl = compat_ptr_ioctl,
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 0fb99d25e8a8a0..74da1dfe08c6fa 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1235,7 +1235,6 @@ const struct file_operations cifs_file_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap  = cifs_file_mmap,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1255,7 +1254,6 @@ const struct file_operations cifs_file_strict_ops = {
 	.fsync = cifs_strict_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_strict_mmap,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1275,7 +1273,6 @@ const struct file_operations cifs_file_direct_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl  = cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
@@ -1293,7 +1290,6 @@ const struct file_operations cifs_file_nobrl_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap  = cifs_file_mmap,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1311,7 +1307,6 @@ const struct file_operations cifs_file_strict_nobrl_ops = {
 	.fsync = cifs_strict_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_strict_mmap,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
@@ -1329,7 +1324,6 @@ const struct file_operations cifs_file_direct_nobrl_ops = {
 	.fsync = cifs_fsync,
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl  = cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
diff --git a/fs/coda/file.c b/fs/coda/file.c
index 128d63df5bfb62..8dd438f2c09fe2 100644
--- a/fs/coda/file.c
+++ b/fs/coda/file.c
@@ -301,5 +301,4 @@ const struct file_operations coda_file_operations = {
 	.open		= coda_open,
 	.release	= coda_release,
 	.fsync		= coda_fsync,
-	.splice_read	= generic_file_splice_read,
 };
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 912308600d393d..0645c1af27c07d 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -485,7 +485,6 @@ static unsigned int cramfs_physmem_mmap_capabilities(struct file *file)
 static const struct file_operations cramfs_physmem_fops = {
 	.llseek			= generic_file_llseek,
 	.read_iter		= generic_file_read_iter,
-	.splice_read		= generic_file_splice_read,
 	.mmap			= cramfs_physmem_mmap,
 #ifndef CONFIG_MMU
 	.get_unmapped_area	= cramfs_physmem_get_unmapped_area,
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index 5fb45d865ce511..03210a02fe6c00 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -420,5 +420,4 @@ const struct file_operations ecryptfs_main_fops = {
 	.release = ecryptfs_release,
 	.fsync = ecryptfs_fsync,
 	.fasync = ecryptfs_fasync,
-	.splice_read = generic_file_splice_read,
 };
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 3b7fea465fd41e..8fe5df8a9ccbca 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -369,7 +369,6 @@ const struct file_operations exfat_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= exfat_file_fsync,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
 
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 60378ddf1424b0..1c0828e0198440 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -191,7 +191,6 @@ const struct file_operations ext2_file_operations = {
 	.release	= ext2_release_file,
 	.fsync		= ext2_fsync,
 	.get_unmapped_area = thp_get_unmapped_area,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
 
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 2a01e31a032c4c..f5dc9a4e0937d1 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -896,7 +896,6 @@ const struct file_operations ext4_file_operations = {
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
 	.get_unmapped_area = thp_get_unmapped_area,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ext4_fallocate,
 };
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 3268f8dd59bbaf..6b34caf13b5668 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4033,6 +4033,5 @@ const struct file_operations f2fs_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= f2fs_compat_ioctl,
 #endif
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 };
diff --git a/fs/fat/file.c b/fs/fat/file.c
index 42134c58c87e19..e7a0342ccfe1f0 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -208,7 +208,6 @@ const struct file_operations fat_file_operations = {
 	.unlocked_ioctl	= fat_generic_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 	.fsync		= fat_file_fsync,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= fat_fallocate,
 };
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index e573b0cd2737dc..a404e147bb2cf7 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3382,7 +3382,6 @@ static const struct file_operations fuse_file_operations = {
 	.fsync		= fuse_fsync,
 	.lock		= fuse_file_lock,
 	.flock		= fuse_file_flock,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.unlocked_ioctl	= fuse_file_ioctl,
 	.compat_ioctl	= fuse_file_compat_ioctl,
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index fe305e4bfd3734..d23babc0c292b8 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1323,7 +1323,6 @@ const struct file_operations gfs2_file_fops = {
 	.fsync		= gfs2_fsync,
 	.lock		= gfs2_lock,
 	.flock		= gfs2_flock,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= gfs2_file_splice_write,
 	.setlease	= simple_nosetlease,
 	.fallocate	= gfs2_fallocate,
@@ -1354,7 +1353,6 @@ const struct file_operations gfs2_file_fops_nolock = {
 	.open		= gfs2_open,
 	.release	= gfs2_release,
 	.fsync		= gfs2_fsync,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= gfs2_file_splice_write,
 	.setlease	= generic_setlease,
 	.fallocate	= gfs2_fallocate,
diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
index 2f224b98ee94a6..6181d3818e17c0 100644
--- a/fs/hfs/inode.c
+++ b/fs/hfs/inode.c
@@ -682,7 +682,6 @@ static const struct file_operations hfs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
 	.fsync		= hfs_file_fsync,
 	.open		= hfs_file_open,
 	.release	= hfs_file_release,
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index e3da9e96b83578..7bd61ba08fbc9e 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -358,7 +358,6 @@ static const struct file_operations hfsplus_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
 	.fsync		= hfsplus_file_fsync,
 	.open		= hfsplus_file_open,
 	.release	= hfsplus_file_release,
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index c070c0d8e3e977..c6453e3768294c 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -379,7 +379,6 @@ static int hostfs_fsync(struct file *file, loff_t start, loff_t end,
 
 static const struct file_operations hostfs_file_fops = {
 	.llseek		= generic_file_llseek,
-	.splice_read	= generic_file_splice_read,
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
index 077c25128eb741..53951f29f25e50 100644
--- a/fs/hpfs/file.c
+++ b/fs/hpfs/file.c
@@ -213,7 +213,6 @@ const struct file_operations hpfs_file_ops =
 	.mmap		= generic_file_mmap,
 	.release	= hpfs_file_release,
 	.fsync		= hpfs_file_fsync,
-	.splice_read	= generic_file_splice_read,
 	.unlocked_ioctl	= hpfs_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 };
diff --git a/fs/jffs2/file.c b/fs/jffs2/file.c
index f8fb89b10227ce..6e986a99669779 100644
--- a/fs/jffs2/file.c
+++ b/fs/jffs2/file.c
@@ -56,7 +56,6 @@ const struct file_operations jffs2_file_operations =
 	.unlocked_ioctl=jffs2_ioctl,
 	.mmap =		generic_file_readonly_mmap,
 	.fsync =	jffs2_fsync,
-	.splice_read =	generic_file_splice_read,
 };
 
 /* jffs2_file_inode_operations */
diff --git a/fs/jfs/file.c b/fs/jfs/file.c
index 930d2701f2062b..fb209673943697 100644
--- a/fs/jfs/file.c
+++ b/fs/jfs/file.c
@@ -141,7 +141,6 @@ const struct file_operations jfs_file_operations = {
 	.read_iter	= generic_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fsync		= jfs_fsync,
 	.release	= jfs_release,
diff --git a/fs/minix/file.c b/fs/minix/file.c
index c50b0a20fcd9c1..e787789b43fa95 100644
--- a/fs/minix/file.c
+++ b/fs/minix/file.c
@@ -19,7 +19,6 @@ const struct file_operations minix_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= generic_file_fsync,
-	.splice_read	= generic_file_splice_read,
 };
 
 static int minix_setattr(struct dentry *dentry, struct iattr *attr)
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index ccd6c1637b270b..8ba06f6c3ec5af 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -851,7 +851,6 @@ const struct file_operations nfs_file_operations = {
 	.fsync		= nfs_file_fsync,
 	.lock		= nfs_lock,
 	.flock		= nfs_flock,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.check_flags	= nfs_check_flags,
 	.setlease	= simple_nosetlease,
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 8e5d6223ddd359..3e3793cb217ec1 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -415,7 +415,6 @@ const struct file_operations nfs4_file_operations = {
 	.fsync		= nfs_file_fsync,
 	.lock		= nfs_lock,
 	.flock		= nfs_flock,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.check_flags	= nfs_check_flags,
 	.setlease	= simple_nosetlease,
diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index 64bc81363c6cc0..cb3269c52dabc7 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -140,7 +140,6 @@ const struct file_operations nilfs_file_operations = {
 	.open		= generic_file_open,
 	/* .release	= nilfs_release_file, */
 	.fsync		= nilfs_sync_file,
-	.splice_read	= generic_file_splice_read,
 };
 
 const struct inode_operations nilfs_file_inode_operations = {
diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c
index f42967b738eb67..8c1759e9185dd7 100644
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -2012,7 +2012,6 @@ const struct file_operations ntfs_file_ops = {
 #endif /* NTFS_RW */
 	.mmap		= generic_file_mmap,
 	.open		= ntfs_file_open,
-	.splice_read	= generic_file_splice_read,
 };
 
 const struct inode_operations ntfs_file_inode_ops = {
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 85979e2214b39d..86069cae29047e 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2671,7 +2671,6 @@ const struct file_operations ocfs2_fops = {
 #endif
 	.lock		= ocfs2_lock,
 	.flock		= ocfs2_flock,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ocfs2_fallocate,
 	.remap_file_range = ocfs2_remap_file_range,
@@ -2717,7 +2716,6 @@ const struct file_operations ocfs2_fops_no_plocks = {
 	.compat_ioctl   = ocfs2_compat_ioctl,
 #endif
 	.flock		= ocfs2_flock,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= ocfs2_fallocate,
 	.remap_file_range = ocfs2_remap_file_range,
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index d7b5f09d298c9d..0dddc6c644c10c 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -340,7 +340,6 @@ const struct file_operations omfs_file_operations = {
 	.write_iter = generic_file_write_iter,
 	.mmap = generic_file_mmap,
 	.fsync = generic_file_fsync,
-	.splice_read = generic_file_splice_read,
 };
 
 static int omfs_setattr(struct dentry *dentry, struct iattr *attr)
diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
index 12af0490322f9d..d1e76267e9c323 100644
--- a/fs/ramfs/file-mmu.c
+++ b/fs/ramfs/file-mmu.c
@@ -43,7 +43,6 @@ const struct file_operations ramfs_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= noop_fsync,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.llseek		= generic_file_llseek,
 	.get_unmapped_area	= ramfs_mmu_get_unmapped_area,
diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index 41469545495608..9336086e60fefd 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -43,7 +43,6 @@ const struct file_operations ramfs_file_operations = {
 	.read_iter		= generic_file_read_iter,
 	.write_iter		= generic_file_write_iter,
 	.fsync			= noop_fsync,
-	.splice_read		= generic_file_splice_read,
 	.splice_write		= iter_file_splice_write,
 	.llseek			= generic_file_llseek,
 };
diff --git a/fs/read_write.c b/fs/read_write.c
index 11c55547cfc9d6..8d8113ae8561e6 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -29,7 +29,6 @@ const struct file_operations generic_ro_fops = {
 	.llseek		= generic_file_llseek,
 	.read_iter	= generic_file_read_iter,
 	.mmap		= generic_file_readonly_mmap,
-	.splice_read	= generic_file_splice_read,
 };
 
 EXPORT_SYMBOL(generic_ro_fops);
diff --git a/fs/reiserfs/file.c b/fs/reiserfs/file.c
index 0b641ae694f123..4f71c3aca2b8c1 100644
--- a/fs/reiserfs/file.c
+++ b/fs/reiserfs/file.c
@@ -247,7 +247,6 @@ const struct file_operations reiserfs_file_operations = {
 	.fsync = reiserfs_sync_file,
 	.read_iter = generic_file_read_iter,
 	.write_iter = generic_file_write_iter,
-	.splice_read = generic_file_splice_read,
 	.splice_write = iter_file_splice_write,
 	.llseek = generic_file_llseek,
 };
diff --git a/fs/romfs/mmap-nommu.c b/fs/romfs/mmap-nommu.c
index 2c4a23113fb5f2..f37e27fa0c1084 100644
--- a/fs/romfs/mmap-nommu.c
+++ b/fs/romfs/mmap-nommu.c
@@ -78,7 +78,6 @@ static unsigned romfs_mmap_capabilities(struct file *file)
 const struct file_operations romfs_ro_fops = {
 	.llseek			= generic_file_llseek,
 	.read_iter		= generic_file_read_iter,
-	.splice_read		= generic_file_splice_read,
 	.mmap			= romfs_mmap,
 	.get_unmapped_area	= romfs_get_unmapped_area,
 	.mmap_capabilities	= romfs_mmap_capabilities,
diff --git a/fs/splice.c b/fs/splice.c
index d7c8a7c4db07ff..52485158023778 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -868,6 +868,8 @@ static long do_splice_to(struct file *in, loff_t *ppos,
 
 	if (in->f_op->splice_read)
 		return in->f_op->splice_read(in, ppos, pipe, len, flags);
+	if (in->f_op->read_iter)
+		return generic_file_splice_read(in, ppos, pipe, len, flags);
 	return default_file_splice_read(in, ppos, pipe, len, flags);
 }
 
diff --git a/fs/sysv/file.c b/fs/sysv/file.c
index 45fc79a18594f1..d023922c0e44c7 100644
--- a/fs/sysv/file.c
+++ b/fs/sysv/file.c
@@ -26,7 +26,6 @@ const struct file_operations sysv_file_operations = {
 	.write_iter	= generic_file_write_iter,
 	.mmap		= generic_file_mmap,
 	.fsync		= generic_file_fsync,
-	.splice_read	= generic_file_splice_read,
 };
 
 static int sysv_setattr(struct dentry *dentry, struct iattr *attr)
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 49fe062ce45ec2..a3af46b3950811 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1668,7 +1668,6 @@ const struct file_operations ubifs_file_operations = {
 	.mmap           = ubifs_file_mmap,
 	.fsync          = ubifs_fsync,
 	.unlocked_ioctl = ubifs_ioctl,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.open		= fscrypt_file_open,
 #ifdef CONFIG_COMPAT
diff --git a/fs/udf/file.c b/fs/udf/file.c
index 628941a6b79afb..6c796ef2bd8331 100644
--- a/fs/udf/file.c
+++ b/fs/udf/file.c
@@ -250,7 +250,6 @@ const struct file_operations udf_file_operations = {
 	.write_iter		= udf_file_write_iter,
 	.release		= udf_release_file,
 	.fsync			= generic_file_fsync,
-	.splice_read		= generic_file_splice_read,
 	.llseek			= generic_file_llseek,
 };
 
diff --git a/fs/ufs/file.c b/fs/ufs/file.c
index 7e087581be7e0c..7a6dbb32d22cb6 100644
--- a/fs/ufs/file.c
+++ b/fs/ufs/file.c
@@ -41,5 +41,4 @@ const struct file_operations ufs_file_operations = {
 	.mmap		= generic_file_mmap,
 	.open           = generic_file_open,
 	.fsync		= generic_file_fsync,
-	.splice_read	= generic_file_splice_read,
 };
diff --git a/fs/vboxsf/file.c b/fs/vboxsf/file.c
index c4ab5996d97a83..30671e1226dbed 100644
--- a/fs/vboxsf/file.c
+++ b/fs/vboxsf/file.c
@@ -200,7 +200,6 @@ const struct file_operations vboxsf_reg_fops = {
 	.open = vboxsf_file_open,
 	.release = vboxsf_file_release,
 	.fsync = noop_fsync,
-	.splice_read = generic_file_splice_read,
 };
 
 const struct inode_operations vboxsf_reg_iops = {
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 00db81eac80d6c..964bc733e765a4 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1297,7 +1297,6 @@ const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
 	.read_iter	= xfs_file_read_iter,
 	.write_iter	= xfs_file_write_iter,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iomap_dio_iopoll,
 	.unlocked_ioctl	= xfs_file_ioctl,
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 07bc42d62673ce..d9f5fbeb55062e 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -869,7 +869,6 @@ static const struct file_operations zonefs_file_operations = {
 	.llseek		= zonefs_file_llseek,
 	.read_iter	= zonefs_file_read_iter,
 	.write_iter	= zonefs_file_write_iter,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iomap_dio_iopoll,
 };
diff --git a/mm/shmem.c b/mm/shmem.c
index a0dbe62f8042e7..f019ff50084403 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3756,7 +3756,6 @@ static const struct file_operations shmem_file_operations = {
 	.read_iter	= shmem_file_read_iter,
 	.write_iter	= generic_file_write_iter,
 	.fsync		= noop_fsync,
-	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= shmem_fallocate,
 #endif
-- 
2.26.2


* [PATCH 23/23] fs: don't allow splice read/write without explicit ops
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (20 preceding siblings ...)
  2020-07-07 17:48 ` [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter Christoph Hellwig
@ 2020-07-07 17:48 ` Christoph Hellwig
  2020-07-07 20:24 ` stop using ->read and ->write for kernel access v3 Linus Torvalds
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-07 17:48 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Don't allow calling ->read or ->write with set_fs as a preparation for
killing off set_fs.  While I've not triggered any of these cases in my
setups, as all the usual suspects (file systems, pipes, sockets, block
devices, system character devices) use the iter ops, this is almost
guaranteed to eventually break something, so print a detailed error
message to help debug such cases.  The fix will be to switch the
affected driver to use the iter ops.
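
As a rough illustration (not part of the patch), the resulting behaviour of
do_splice_to and do_splice_from can be modelled in plain userspace C.  The
struct splice_ops_model type, the USED_* constants and the model_* functions
are hypothetical names made up for this sketch; only the order of the checks
and the -EINVAL result are taken from the fs/splice.c hunks in this patch.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the relevant file_operations slots;
 * a non-zero field means "the method pointer is set".
 */
struct splice_ops_model {
	int has_splice_read;
	int has_splice_write;
	int has_read_iter;
	int has_write_iter;
};

/* Positive return values name the path taken; negative ones are errnos. */
enum {
	USED_EXPLICIT_OP = 1,		/* ->splice_read or ->splice_write */
	USED_GENERIC_SPLICE_READ = 2,	/* generic_file_splice_read() */
};

/* Models do_splice_to(): explicit op, then ->read_iter, else -EINVAL. */
static long model_do_splice_to(const struct splice_ops_model *in)
{
	if (in->has_splice_read)
		return USED_EXPLICIT_OP;
	if (in->has_read_iter)
		return USED_GENERIC_SPLICE_READ;
	return -EINVAL;	/* the kernel also prints a ratelimited warning */
}

/* Models do_splice_from(): only an explicit ->splice_write is accepted. */
static long model_do_splice_from(const struct splice_ops_model *out)
{
	if (out->has_splice_write)
		return USED_EXPLICIT_OP;
	/* has_write_iter is deliberately not consulted, as in the hunk. */
	return -EINVAL;
}
```

Note the asymmetry in this model: a file with ->read_iter still gets a splice
read, but a file with only ->write_iter gets -EINVAL on splice write until it
sets ->splice_write explicitly.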

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    |   2 +-
 fs/splice.c        | 121 ++++-----------------------------------------
 include/linux/fs.h |   2 -
 3 files changed, 10 insertions(+), 115 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8d8113ae8561e6..c33182f97d1ef0 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1077,7 +1077,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
 }
 EXPORT_SYMBOL(vfs_iter_write);
 
-ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
+static ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
 		  unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
diff --git a/fs/splice.c b/fs/splice.c
index 52485158023778..3ceaaf3b8c122c 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -342,89 +342,6 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
 };
 EXPORT_SYMBOL(nosteal_pipe_buf_ops);
 
-static ssize_t kernel_readv(struct file *file, const struct kvec *vec,
-			    unsigned long vlen, loff_t offset)
-{
-	mm_segment_t old_fs;
-	loff_t pos = offset;
-	ssize_t res;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_readv(file, (const struct iovec __user *)vec, vlen, &pos, 0);
-	set_fs(old_fs);
-
-	return res;
-}
-
-static ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
-				 struct pipe_inode_info *pipe, size_t len,
-				 unsigned int flags)
-{
-	struct kvec *vec, __vec[PIPE_DEF_BUFFERS];
-	struct iov_iter to;
-	struct page **pages;
-	unsigned int nr_pages;
-	unsigned int mask;
-	size_t offset, base, copied = 0;
-	ssize_t res;
-	int i;
-
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return -EAGAIN;
-
-	/*
-	 * Try to keep page boundaries matching to source pagecache ones -
-	 * it probably won't be much help, but...
-	 */
-	offset = *ppos & ~PAGE_MASK;
-
-	iov_iter_pipe(&to, READ, pipe, len + offset);
-
-	res = iov_iter_get_pages_alloc(&to, &pages, len + offset, &base);
-	if (res <= 0)
-		return -ENOMEM;
-
-	nr_pages = DIV_ROUND_UP(res + base, PAGE_SIZE);
-
-	vec = __vec;
-	if (nr_pages > PIPE_DEF_BUFFERS) {
-		vec = kmalloc_array(nr_pages, sizeof(struct kvec), GFP_KERNEL);
-		if (unlikely(!vec)) {
-			res = -ENOMEM;
-			goto out;
-		}
-	}
-
-	mask = pipe->ring_size - 1;
-	pipe->bufs[to.head & mask].offset = offset;
-	pipe->bufs[to.head & mask].len -= offset;
-
-	for (i = 0; i < nr_pages; i++) {
-		size_t this_len = min_t(size_t, len, PAGE_SIZE - offset);
-		vec[i].iov_base = page_address(pages[i]) + offset;
-		vec[i].iov_len = this_len;
-		len -= this_len;
-		offset = 0;
-	}
-
-	res = kernel_readv(in, vec, nr_pages, *ppos);
-	if (res > 0) {
-		copied = res;
-		*ppos += res;
-	}
-
-	if (vec != __vec)
-		kfree(vec);
-out:
-	for (i = 0; i < nr_pages; i++)
-		put_page(pages[i]);
-	kvfree(pages);
-	iov_iter_advance(&to, copied);	/* truncates and discards */
-	return res;
-}
-
 /*
  * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos'
  * using sendpage(). Return the number of bytes sent.
@@ -788,33 +705,6 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
 
 EXPORT_SYMBOL(iter_file_splice_write);
 
-static int write_pipe_buf(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
-			  struct splice_desc *sd)
-{
-	int ret;
-	void *data;
-	loff_t tmp = sd->pos;
-
-	data = kmap(buf->page);
-	ret = __kernel_write(sd->u.file, data + buf->offset, sd->len, &tmp);
-	kunmap(buf->page);
-
-	return ret;
-}
-
-static ssize_t default_file_splice_write(struct pipe_inode_info *pipe,
-					 struct file *out, loff_t *ppos,
-					 size_t len, unsigned int flags)
-{
-	ssize_t ret;
-
-	ret = splice_from_pipe(pipe, out, ppos, len, flags, write_pipe_buf);
-	if (ret > 0)
-		*ppos += ret;
-
-	return ret;
-}
-
 /**
  * generic_splice_sendpage - splice data from a pipe to a socket
  * @pipe:	pipe to splice from
@@ -844,7 +734,10 @@ static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
 {
 	if (out->f_op->splice_write)
 		return out->f_op->splice_write(pipe, out, ppos, len, flags);
-	return default_file_splice_write(pipe, out, ppos, len, flags);
+	pr_warn_ratelimited(
+		"splice write not supported for file %pD4 (pid: %d comm: %.20s)\n",
+		out, current->pid, current->comm);
+	return -EINVAL;
 }
 
 /*
@@ -870,7 +763,11 @@ static long do_splice_to(struct file *in, loff_t *ppos,
 		return in->f_op->splice_read(in, ppos, pipe, len, flags);
 	if (in->f_op->read_iter)
 		return generic_file_splice_read(in, ppos, pipe, len, flags);
-	return default_file_splice_read(in, ppos, pipe, len, flags);
+
+	pr_warn_ratelimited(
+		"splice read not supported for file %pD4 (pid: %d comm: %.20s)\n",
+		in, current->pid, current->comm);
+	return -EINVAL;
 }
 
 /**
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0c0ec76b600b50..fac6aead402a98 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1919,8 +1919,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
 
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
-extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-		unsigned long, loff_t *, rwf_t);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
 				   loff_t, size_t, unsigned int);
 extern ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
-- 
2.26.2


* Re: stop using ->read and ->write for kernel access v3
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (21 preceding siblings ...)
  2020-07-07 17:48 ` [PATCH 23/23] fs: don't allow splice read/write without explicit ops Christoph Hellwig
@ 2020-07-07 20:24 ` Linus Torvalds
  2020-07-08  6:07   ` Christoph Hellwig
  2020-07-07 23:03 ` Stephen Rothwell
       [not found] ` <20200707174801.4162712-16-hch@lst.de>
  24 siblings, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2020-07-07 20:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Stephen Rothwell, Luis Chamberlain, Matthew Wilcox,
	Kees Cook, Iurii Zaikin, Linux Kernel Mailing List,
	linux-fsdevel

On Tue, Jul 7, 2020 at 10:48 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Hi Al and Linus (and Stephen, see below),
>
> as part of removing set_fs entirely (for which I have a working
> prototype), we need to stop calling ->read and ->write with kernel
> pointers under set_fs.

I'd be willing to pick up patches 1-6 as trivial and obvious cleanups
right now, if you sent those to me as a pull request. That would at
least focus the remaining series a bit on the actual changes..

           Linus

* Re: stop using ->read and ->write for kernel access v3
  2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
                   ` (22 preceding siblings ...)
  2020-07-07 20:24 ` stop using ->read and ->write for kernel access v3 Linus Torvalds
@ 2020-07-07 23:03 ` Stephen Rothwell
       [not found] ` <20200707174801.4162712-16-hch@lst.de>
  24 siblings, 0 replies; 44+ messages in thread
From: Stephen Rothwell @ 2020-07-07 23:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Linus Torvalds, Luis Chamberlain, Matthew Wilcox,
	Kees Cook, Iurii Zaikin, linux-kernel, linux-fsdevel


Hi Christoph,

On Tue,  7 Jul 2020 19:47:38 +0200 Christoph Hellwig <hch@lst.de> wrote:
>
> A git branch is available here:
> 
>     git://git.infradead.org/users/hch/misc.git set_fs-rw
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/set_fs-rw
> 
> Given that this has been out and cooking for a while, is there a chance to
> add the above as a temp branch to linux-next to get a little more exposure
> until Al reviews it and (hopefully) picks it up?

No worries, I will add it in later today and drop it after Al grabs it.

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgement of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
     * submitted under GPL v2 (or later) and include the Contributor's
        Signed-off-by,
     * posted to the relevant mailing list,
     * reviewed by you (or another maintainer of your subsystem tree),
     * successfully unit tested, and 
     * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

-- 
Cheers,
Stephen Rothwell 
sfr@canb.auug.org.au

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: stop using ->read and ->write for kernel access v3
  2020-07-07 20:24 ` stop using ->read and ->write for kernel access v3 Linus Torvalds
@ 2020-07-08  6:07   ` Christoph Hellwig
  0 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-08  6:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Al Viro, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin,
	Linux Kernel Mailing List, linux-fsdevel

On Tue, Jul 07, 2020 at 01:24:01PM -0700, Linus Torvalds wrote:
> On Tue, Jul 7, 2020 at 10:48 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > Hi Al and Linus (and Stephen, see below),
> >
> > as part of removing set_fs entirely (for which I have a working
> > prototype), we need to stop calling ->read and ->write with kernel
> > pointers under set_fs.
> 
> I'd be willing to pick up patches 1-6 as trivial and obvious cleanups
> right now, if you sent those to me as a pull request. That would at
> least focus the remaining series a bit on the actual changes..

If we do that we should do 1-7 and 9-12 to include the read side as
well.  But yes, maybe that way we'll get started.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
       [not found] ` <20200707174801.4162712-16-hch@lst.de>
@ 2020-07-10 12:55   ` Jon Hunter
  2020-07-10 12:58     ` Christoph Hellwig
  2020-07-11  6:48     ` Christoph Hellwig
  2020-07-17 21:09   ` Thomas Gleixner
  1 sibling, 2 replies; 44+ messages in thread
From: Jon Hunter @ 2020-07-10 12:55 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel, linux-tegra

Hi Christoph,

On 07/07/2020 18:47, Christoph Hellwig wrote:
> Switch over all instances used directly as methods using these sed
> expressions:
> 
> sed -i -e 's/\.read\(\s*=\s*\)seq_read/\.read_iter\1seq_read_iter/g'
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  Documentation/filesystems/seq_file.rst        |  2 +-
>  Documentation/process/clang-format.rst        |  4 +-
>  .../it_IT/process/clang-format.rst            |  4 +-
>  arch/arm/mm/ptdump_debugfs.c                  |  2 +-
>  arch/arm64/kvm/vgic/vgic-debug.c              |  2 +-
>  arch/c6x/platforms/pll.c                      |  2 +-
>  arch/mips/cavium-octeon/oct_ilm.c             |  2 +-
>  arch/mips/kernel/segment.c                    |  2 +-
>  arch/mips/ralink/bootrom.c                    |  2 +-
>  arch/powerpc/kvm/book3s_xive_native.c         |  2 +-
>  arch/powerpc/kvm/timing.c                     |  2 +-
>  arch/powerpc/mm/ptdump/bats.c                 |  2 +-
>  arch/powerpc/mm/ptdump/hashpagetable.c        |  2 +-
>  arch/powerpc/mm/ptdump/ptdump.c               |  2 +-
>  arch/powerpc/mm/ptdump/segment_regs.c         |  2 +-
>  arch/powerpc/platforms/cell/spufs/file.c      |  8 ++--
>  arch/powerpc/platforms/pseries/hvCall_inst.c  |  2 +-
>  arch/s390/kernel/diag.c                       |  2 +-
>  arch/s390/mm/dump_pagetables.c                |  2 +-
>  arch/s390/pci/pci_debug.c                     |  2 +-
>  arch/sh/mm/asids-debugfs.c                    |  2 +-
>  arch/sh/mm/cache-debugfs.c                    |  2 +-
>  arch/sh/mm/pmb.c                              |  2 +-
>  arch/sh/mm/tlb-debugfs.c                      |  2 +-
>  arch/x86/kernel/cpu/mce/severity.c            |  2 +-
>  arch/x86/mm/pat/memtype.c                     |  2 +-
>  arch/x86/mm/pat/set_memory.c                  |  2 +-
>  arch/x86/xen/p2m.c                            |  2 +-
>  block/blk-mq-debugfs.c                        |  2 +-
>  drivers/base/power/wakeup.c                   |  2 +-
>  drivers/block/aoe/aoeblk.c                    |  2 +-
>  drivers/block/drbd/drbd_debugfs.c             | 10 ++---
>  drivers/block/nbd.c                           |  4 +-
>  drivers/block/pktcdvd.c                       |  2 +-
>  drivers/block/rsxx/core.c                     |  4 +-
>  drivers/bus/mvebu-mbus.c                      |  4 +-
>  drivers/char/tpm/eventlog/common.c            |  2 +-
>  .../crypto/allwinner/sun8i-ce/sun8i-ce-core.c |  2 +-
>  .../crypto/allwinner/sun8i-ss/sun8i-ss-core.c |  2 +-
>  drivers/crypto/amlogic/amlogic-gxl-core.c     |  2 +-
>  drivers/crypto/caam/dpseci-debugfs.c          |  2 +-
>  drivers/crypto/cavium/zip/zip_main.c          |  6 +--
>  drivers/crypto/hisilicon/qm.c                 |  2 +-
>  drivers/crypto/qat/qat_common/adf_cfg.c       |  2 +-
>  .../qat/qat_common/adf_transport_debug.c      |  4 +-
>  drivers/firmware/tegra/bpmp-debugfs.c         |  2 +-
>  drivers/gpio/gpiolib.c                        |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c      |  4 +-
>  .../gpu/drm/arm/display/komeda/komeda_dev.c   |  2 +-
>  drivers/gpu/drm/arm/malidp_drv.c              |  2 +-
>  drivers/gpu/drm/armada/armada_debugfs.c       |  2 +-
>  drivers/gpu/drm/drm_debugfs.c                 |  6 +--
>  drivers/gpu/drm/drm_debugfs_crc.c             |  2 +-
>  drivers/gpu/drm/drm_mipi_dbi.c                |  2 +-
>  .../drm/i915/display/intel_display_debugfs.c  | 16 ++++----
>  drivers/gpu/drm/i915/gt/debugfs_gt.h          |  2 +-
>  drivers/gpu/drm/i915/i915_debugfs_params.c    | 12 +++---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c  |  2 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c      |  4 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   |  2 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c       |  4 +-
>  drivers/gpu/drm/msm/msm_debugfs.c             |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_debugfs.c     |  2 +-
>  drivers/gpu/drm/omapdrm/dss/dss.c             |  2 +-
>  drivers/gpu/host1x/debug.c                    |  4 +-
>  drivers/gpu/vga/vga_switcheroo.c              |  2 +-
>  drivers/hid/hid-picolcd_debugfs.c             |  2 +-
>  drivers/hid/hid-wiimote-debug.c               |  2 +-
>  drivers/ide/ide-proc.c                        |  2 +-
>  drivers/infiniband/hw/cxgb4/device.c          |  4 +-
>  drivers/infiniband/hw/qib/qib_debugfs.c       |  2 +-
>  drivers/infiniband/ulp/ipoib/ipoib_fs.c       |  4 +-
>  drivers/md/bcache/closure.c                   |  2 +-
>  drivers/media/cec/core/cec-core.c             |  2 +-
>  drivers/media/pci/saa7164/saa7164-core.c      |  2 +-
>  drivers/memory/emif.c                         |  4 +-
>  drivers/memory/tegra/tegra124-emc.c           |  2 +-
>  drivers/memory/tegra/tegra186-emc.c           |  2 +-
>  drivers/memory/tegra/tegra20-emc.c            |  2 +-
>  drivers/memory/tegra/tegra30-emc.c            |  2 +-
>  drivers/mfd/ab3100-core.c                     |  2 +-
>  drivers/mfd/ab3100-otp.c                      |  2 +-
>  drivers/mfd/ab8500-debugfs.c                  | 14 +++----
>  drivers/mfd/tps65010.c                        |  2 +-
>  drivers/misc/habanalabs/debugfs.c             |  2 +-
>  drivers/mmc/core/mmc_test.c                   |  2 +-
>  drivers/mtd/mtdcore.c                         |  4 +-
>  drivers/mtd/ubi/debug.c                       |  2 +-
>  .../ethernet/chelsio/cxgb4/cxgb4_debugfs.c    | 38 +++++++++----------
>  drivers/net/ethernet/chelsio/cxgb4/l2t.c      |  2 +-
>  .../ethernet/chelsio/cxgb4vf/cxgb4vf_main.c   |  8 ++--
>  .../freescale/dpaa2/dpaa2-eth-debugfs.c       |  6 +--
>  .../net/ethernet/intel/fm10k/fm10k_debugfs.c  |  2 +-
>  .../marvell/octeontx2/af/rvu_debugfs.c        |  2 +-
>  drivers/net/wireless/ath/ath5k/debug.c        |  2 +-
>  drivers/net/wireless/ath/wil6210/debugfs.c    | 14 +++----
>  .../broadcom/brcm80211/brcmsmac/debug.c       |  2 +-
>  .../net/wireless/intel/iwlwifi/fw/debugfs.c   |  2 +-
>  .../net/wireless/intel/iwlwifi/pcie/trans.c   |  2 +-
>  .../wireless/mediatek/mt76/mt7603/debugfs.c   |  2 +-
>  .../wireless/mediatek/mt76/mt7615/debugfs.c   |  2 +-
>  .../wireless/mediatek/mt76/mt76x02_debugfs.c  |  4 +-
>  .../wireless/mediatek/mt76/mt7915/debugfs.c   |  4 +-
>  .../net/wireless/mediatek/mt7601u/debugfs.c   |  4 +-
>  drivers/net/wireless/realtek/rtlwifi/debug.c  |  2 +-
>  drivers/net/wireless/realtek/rtw88/debug.c    |  4 +-
>  drivers/net/wireless/rsi/rsi_91x_debugfs.c    |  4 +-
>  drivers/net/xen-netback/xenbus.c              |  2 +-
>  drivers/nvme/host/fabrics.c                   |  2 +-
>  drivers/pci/controller/pci-tegra.c            |  2 +-
>  drivers/platform/x86/asus-wmi.c               |  2 +-
>  drivers/platform/x86/intel_pmc_core.c         |  2 +-
>  .../platform/x86/intel_telemetry_debugfs.c    |  4 +-
>  drivers/power/supply/da9030_battery.c         |  2 +-
>  drivers/pwm/core.c                            |  2 +-
>  drivers/ras/cec.c                             |  2 +-
>  drivers/ras/debugfs.c                         |  2 +-
>  drivers/s390/block/dasd.c                     |  2 +-
>  drivers/s390/cio/qdio_debug.c                 |  2 +-
>  drivers/scsi/hisi_sas/hisi_sas_main.c         | 32 ++++++++--------
>  drivers/scsi/qedf/qedf_dbg.h                  |  2 +-
>  drivers/scsi/qedi/qedi_dbg.h                  |  2 +-
>  drivers/scsi/qla2xxx/qla_dfs.c                | 12 +++---
>  drivers/scsi/snic/snic_debugfs.c              |  4 +-
>  drivers/sh/intc/virq-debugfs.c                |  2 +-
>  drivers/soc/qcom/cmd-db.c                     |  2 +-
>  drivers/soc/qcom/socinfo.c                    |  4 +-
>  drivers/soc/ti/knav_dma.c                     |  2 +-
>  drivers/soc/ti/knav_qmss_queue.c              |  2 +-
>  .../interface/vchiq_arm/vchiq_debugfs.c       |  4 +-
>  drivers/usb/chipidea/debug.c                  |  4 +-
>  drivers/usb/dwc2/debugfs.c                    |  2 +-
>  drivers/usb/dwc3/debugfs.c                    |  8 ++--
>  drivers/usb/gadget/udc/lpc32xx_udc.c          |  2 +-
>  drivers/usb/gadget/udc/renesas_usb3.c         |  2 +-
>  drivers/usb/host/xhci-debugfs.c               |  6 +--
>  drivers/usb/mtu3/mtu3_debugfs.c               |  8 ++--
>  drivers/usb/musb/musb_debugfs.c               |  4 +-
>  drivers/visorbus/visorbus_main.c              |  2 +-
>  drivers/xen/xenfs/xensyms.c                   |  2 +-
>  fs/debugfs/file.c                             |  4 +-
>  fs/dlm/debug_fs.c                             |  8 ++--
>  fs/gfs2/glock.c                               |  6 +--
>  fs/nfsd/nfs4state.c                           |  4 +-
>  fs/nfsd/nfsctl.c                              | 10 ++---
>  fs/ocfs2/cluster/netdebug.c                   |  6 +--
>  fs/ocfs2/dlm/dlmdebug.c                       |  2 +-
>  fs/ocfs2/dlmglue.c                            |  2 +-
>  fs/openpromfs/inode.c                         |  2 +-
>  fs/orangefs/orangefs-debugfs.c                |  2 +-
>  fs/proc/array.c                               |  2 +-
>  fs/proc/base.c                                | 24 ++++++------
>  fs/proc/fd.c                                  |  2 +-
>  fs/proc/task_mmu.c                            |  8 ++--
>  fs/proc/task_nommu.c                          |  2 +-
>  fs/proc_namespace.c                           |  6 +--
>  include/linux/seq_file.h                      |  4 +-
>  kernel/bpf/inode.c                            |  2 +-
>  kernel/fail_function.c                        |  2 +-
>  kernel/gcov/fs.c                              |  2 +-
>  kernel/irq/debugfs.c                          |  2 +-
>  kernel/kcsan/debugfs.c                        |  2 +-
>  kernel/sched/debug.c                          |  2 +-
>  kernel/time/test_udelay.c                     |  2 +-
>  kernel/trace/ftrace.c                         | 16 ++++----
>  kernel/trace/trace.c                          | 20 +++++-----
>  kernel/trace/trace_dynevent.c                 |  2 +-
>  kernel/trace/trace_events.c                   | 10 ++---
>  kernel/trace/trace_events_hist.c              |  4 +-
>  kernel/trace/trace_events_synth.c             |  2 +-
>  kernel/trace/trace_events_trigger.c           |  2 +-
>  kernel/trace/trace_kprobe.c                   |  4 +-
>  kernel/trace/trace_printk.c                   |  2 +-
>  kernel/trace/trace_stack.c                    |  4 +-
>  kernel/trace/trace_stat.c                     |  2 +-
>  kernel/trace/trace_uprobe.c                   |  4 +-
>  lib/debugobjects.c                            |  2 +-
>  lib/dynamic_debug.c                           |  2 +-
>  lib/error-inject.c                            |  2 +-
>  lib/kunit/debugfs.c                           |  2 +-
>  mm/kmemleak.c                                 |  2 +-
>  net/6lowpan/debugfs.c                         |  2 +-
>  net/batman-adv/debugfs.c                      |  4 +-
>  net/bluetooth/6lowpan.c                       |  2 +-
>  net/hsr/hsr_debugfs.c                         |  2 +-
>  net/l2tp/l2tp_debugfs.c                       |  2 +-
>  net/sunrpc/cache.c                            |  2 +-
>  net/sunrpc/debugfs.c                          |  4 +-
>  net/sunrpc/rpc_pipe.c                         |  2 +-
>  security/apparmor/apparmorfs.c                | 10 ++---
>  security/integrity/ima/ima_fs.c               |  6 +--
>  security/selinux/selinuxfs.c                  |  2 +-
>  security/smack/smackfs.c                      | 20 +++++-----
>  193 files changed, 375 insertions(+), 375 deletions(-)


Following this change, I have noticed that several debugfs entries can
no longer be read on some Tegra platforms. For example ...

$ sudo cat /sys/kernel/debug/usb/xhci/3530000.usb/event-ring/cycle
cat: /sys/kernel/debug/usb/xhci/3530000.usb/event-ring/cycle: Invalid
argument

$ sudo cat /sys/kernel/debug/emc/available_rates


cat: /sys/kernel/debug/emc/available_rates: Invalid argument

$ sudo cat /sys/kernel/debug/bpmp/debug/proc/testint
cat: /sys/kernel/debug/bpmp/debug/proc/testint: Invalid argument

$ sudo cat /sys/kernel/debug/pcie/ports


cat: /sys/kernel/debug/pcie/ports: Invalid argument

I have reverted the above drivers to use seq_read() instead of
seq_read_iter() and they work again. Have you seen any problems with this?

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
  2020-07-10 12:55   ` [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter Jon Hunter
@ 2020-07-10 12:58     ` Christoph Hellwig
  2020-07-11  6:48     ` Christoph Hellwig
  1 sibling, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-10 12:58 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Christoph Hellwig, Al Viro, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel, linux-tegra

On Fri, Jul 10, 2020 at 01:55:29PM +0100, Jon Hunter wrote:
> Following this change, I have noticed that several debugfs entries can
> no longer be read on some Tegra platforms. For example ...
> 
> $ sudo cat /sys/kernel/debug/usb/xhci/3530000.usb/event-ring/cycle
> cat: /sys/kernel/debug/usb/xhci/3530000.usb/event-ring/cycle: Invalid
> argument
> 
> $ sudo cat /sys/kernel/debug/emc/available_rates
> 
> 
> cat: /sys/kernel/debug/emc/available_rates: Invalid argument
> 
> $ sudo cat /sys/kernel/debug/bpmp/debug/proc/testint
> cat: /sys/kernel/debug/bpmp/debug/proc/testint: Invalid argument
> 
> $ sudo cat /sys/kernel/debug/pcie/ports
> 
> 
> cat: /sys/kernel/debug/pcie/ports: Invalid argument
> 
> I have reverted the above drivers to use seq_read() instead of
> seq_read_iter() and they work again. Have you seen any problems with this?

I haven't seen any of that.  But some of these files should also
exist on x86, so let me try to reproduce it.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
  2020-07-10 12:55   ` [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter Jon Hunter
  2020-07-10 12:58     ` Christoph Hellwig
@ 2020-07-11  6:48     ` Christoph Hellwig
  2020-07-11 11:47       ` Jon Hunter
  1 sibling, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-11  6:48 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Christoph Hellwig, Al Viro, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel, linux-tegra

Please try this one:

---
From 5e86146296fbcd7593da1d9d39b9685a5e6b83be Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Sat, 11 Jul 2020 08:46:10 +0200
Subject: debugfs: add a proxy stub for ->read_iter

debugfs registrations typically go through a set of proxy ops to deal
with refcounting, which need to support every method that can be
supported.  Add ->read_iter to the proxy ops to prepare for seq_file to
be switched to ->read_iter.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/debugfs/file.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index 8ba32c2feb1b73..dcd7bdaf67417f 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -231,6 +231,10 @@ FULL_PROXY_FUNC(read, ssize_t, filp,
 			loff_t *ppos),
 		ARGS(filp, buf, size, ppos));
 
+FULL_PROXY_FUNC(read_iter, ssize_t, iocb->ki_filp,
+		PROTO(struct kiocb *iocb, struct iov_iter *iter),
+		ARGS(iocb, iter));
+
 FULL_PROXY_FUNC(write, ssize_t, filp,
 		PROTO(struct file *filp, const char __user *buf, size_t size,
 			loff_t *ppos),
@@ -286,6 +290,8 @@ static void __full_proxy_fops_init(struct file_operations *proxy_fops,
 		proxy_fops->llseek = full_proxy_llseek;
 	if (real_fops->read)
 		proxy_fops->read = full_proxy_read;
+	if (real_fops->read_iter)
+		proxy_fops->read_iter = full_proxy_read_iter;
 	if (real_fops->write)
 		proxy_fops->write = full_proxy_write;
 	if (real_fops->poll)
-- 
2.26.2
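
The failure mode this patch addresses can be modelled in a few lines of
userspace C.  This is only a sketch, not the debugfs code: every model_*
name is hypothetical, the reference counting done by the real proxy is left
out, and it assumes that a read on a file exposing neither ->read nor
->read_iter fails with -EINVAL, which matches the "Invalid argument" in the
report.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_file;

/* Hypothetical stand-in for struct file_operations: read methods only. */
struct model_fops {
	long (*read)(struct model_file *f, char *buf, size_t len);
	long (*read_iter)(struct model_file *f, char *buf, size_t len);
};

struct model_file {
	const struct model_fops *real;	/* ops supplied by the driver */
	struct model_fops proxy;	/* ops the read path actually sees */
};

/* Stand-in for seq_read_iter(): pretends to produce len bytes. */
long model_seq_read_iter(struct model_file *f, char *buf, size_t len)
{
	(void)f;
	(void)buf;
	return (long)len;
}

/* Proxy stubs: forward to the driver's method (refcounting omitted). */
long model_proxy_read(struct model_file *f, char *buf, size_t len)
{
	return f->real->read(f, buf, len);
}

long model_proxy_read_iter(struct model_file *f, char *buf, size_t len)
{
	return f->real->read_iter(f, buf, len);
}

/*
 * Build the proxy table.  forward_read_iter == false models the proxy
 * before the patch, which had no stub for ->read_iter at all.
 */
void model_proxy_init(struct model_file *f, const struct model_fops *real,
		      bool forward_read_iter)
{
	assert(f != NULL && real != NULL);
	f->real = real;
	f->proxy.read = real->read ? model_proxy_read : NULL;
	f->proxy.read_iter = (forward_read_iter && real->read_iter) ?
			     model_proxy_read_iter : NULL;
}

/* Assumed read dispatch: ->read, then ->read_iter, else -EINVAL. */
long model_vfs_read(struct model_file *f, char *buf, size_t len)
{
	if (f->proxy.read)
		return f->proxy.read(f, buf, len);
	if (f->proxy.read_iter)
		return f->proxy.read_iter(f, buf, len);
	return -EINVAL;
}
```

With a driver table that sets only .read_iter, model_vfs_read() returns
-EINVAL when the proxy is built with forward_read_iter == false and the
requested length when it is built with forward_read_iter == true.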


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
  2020-07-11  6:48     ` Christoph Hellwig
@ 2020-07-11 11:47       ` Jon Hunter
  0 siblings, 0 replies; 44+ messages in thread
From: Jon Hunter @ 2020-07-11 11:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Linus Torvalds, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin, linux-kernel,
	linux-fsdevel, linux-tegra


On 11/07/2020 07:48, Christoph Hellwig wrote:
> Please try this one:
> 
> ---
> From 5e86146296fbcd7593da1d9d39b9685a5e6b83be Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Sat, 11 Jul 2020 08:46:10 +0200
> Subject: debugfs: add a proxy stub for ->read_iter
> 
> debugfs registrations typically go through a set of proxy ops to deal
> with refcounting, which need to support every method that can be
> supported.  Add ->read_iter to the proxy ops to prepare for seq_file to
> be switched to ->read_iter.
> 
> Reported-by: Jon Hunter <jonathanh@nvidia.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/debugfs/file.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
> index 8ba32c2feb1b73..dcd7bdaf67417f 100644
> --- a/fs/debugfs/file.c
> +++ b/fs/debugfs/file.c
> @@ -231,6 +231,10 @@ FULL_PROXY_FUNC(read, ssize_t, filp,
>  			loff_t *ppos),
>  		ARGS(filp, buf, size, ppos));
>  
> +FULL_PROXY_FUNC(read_iter, ssize_t, iocb->ki_filp,
> +		PROTO(struct kiocb *iocb, struct iov_iter *iter),
> +		ARGS(iocb, iter));
> +
>  FULL_PROXY_FUNC(write, ssize_t, filp,
>  		PROTO(struct file *filp, const char __user *buf, size_t size,
>  			loff_t *ppos),
> @@ -286,6 +290,8 @@ static void __full_proxy_fops_init(struct file_operations *proxy_fops,
>  		proxy_fops->llseek = full_proxy_llseek;
>  	if (real_fops->read)
>  		proxy_fops->read = full_proxy_read;
> +	if (real_fops->read_iter)
> +		proxy_fops->read_iter = full_proxy_read_iter;
>  	if (real_fops->write)
>  		proxy_fops->write = full_proxy_write;
>  	if (real_fops->poll)
> 


Thanks! Works for me.

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
       [not found] ` <20200707174801.4162712-16-hch@lst.de>
  2020-07-10 12:55   ` [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter Jon Hunter
@ 2020-07-17 21:09   ` Thomas Gleixner
  2020-07-20  9:33     ` Christoph Hellwig
  2020-07-29 20:59     ` Al Viro
  1 sibling, 2 replies; 44+ messages in thread
From: Thomas Gleixner @ 2020-07-17 21:09 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro, Linus Torvalds, Stephen Rothwell
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Christoph Hellwig <hch@lst.de> writes:

> Switch over all instances used directly as methods using these sed
> expressions:
>
> sed -i -e 's/\.read\(\s*=\s*\)seq_read/\.read_iter\1seq_read_iter/g'

This sucks, really. I just got a patch against this converting the
changed version to DEFINE_SHOW_ATTRIBUTE(somefile) and thereby removing
the whole open coded gunk.

If we do a tree wide change like this, then can we pretty please use a
coccinelle script to convert all trivial instances to use
DEFINE_SHOW_ATTRIBUTE so we don't have to touch the same place over and
over.

Out of 375 places changed in your patch, roughly two thirds fall into
the trivial category:

static int debug_stats_open(struct inode *inode, struct file *filp)
{
	return single_open(filp, debug_stats_show, NULL);
}

static const struct file_operations debug_stats_fops = {
	.open		= debug_stats_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= single_release,
};

which can be replaced by:

DEFINE_SHOW_ATTRIBUTE(debug_stats);

removing 12 lines of gunk and leaving one central place to do the iter change.
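
How such a macro collapses the boilerplate can be sketched in userspace C.
This is a model, not the kernel macro: MODEL_DEFINE_SHOW_ATTRIBUTE and all
model_* types and functions are hypothetical stand-ins, and only the general
shape (token pasting that derives name_open() and name_fops from
name_show()) is meant to carry over.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct model_file;
typedef int (*model_show_t)(struct model_file *file, void *data);

struct model_file {
	model_show_t show;	/* callback recorded by model_single_open() */
	void *data;
};

struct model_fops {
	int (*open)(struct model_file *file, void *data);
	long (*read_iter)(struct model_file *file, char *buf, size_t len);
};

/* Stand-in for single_open(): remember the show callback and its data. */
int model_single_open(struct model_file *file, model_show_t show, void *data)
{
	assert(file != NULL && show != NULL);
	file->show = show;
	file->data = data;
	return 0;
}

/* Stand-in for seq_read_iter(); produces no output in this model. */
long model_seq_read_iter(struct model_file *file, char *buf, size_t len)
{
	(void)file;
	(void)buf;
	(void)len;
	return 0;
}

/*
 * Given name, generate name_open() and name_fops from name_show().
 * The read method is chosen here, once, for every user of the macro.
 */
#define MODEL_DEFINE_SHOW_ATTRIBUTE(name) \
int name##_open(struct model_file *file, void *data) \
{ \
	return model_single_open(file, name##_show, data); \
} \
const struct model_fops name##_fops = { \
	.open = name##_open, \
	.read_iter = model_seq_read_iter, \
}

/* The only code a driver would still write by hand. */
int debug_stats_show(struct model_file *file, void *data)
{
	(void)file;
	(void)data;
	return 0;
}

MODEL_DEFINE_SHOW_ATTRIBUTE(debug_stats);
```

Switching every user from one read method to another then means editing the
macro body only.  The cost is that the identifiers debug_stats_open and
debug_stats_fops never appear literally in the source, because they exist
only after the ## pasting.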

I'm pretty sure that quite some of the others which have only an
additional write function can be replaced by a new macro
DEFINE_RW_ATTRIBUTE() or such.

Needs some thought and maybe some cocci help from Julia, but that's way
better than this brute force sed thing which results in malformed crap
like this:

static const struct file_operations debug_stats_fops = {
	.open		= debug_stats_open,
	.read_iter		= seq_read_iter,
	.llseek		= seq_lseek,
	.release	= single_release,
};

and proliferates the copy and paste voodoo programming.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
  2020-07-17 21:09   ` Thomas Gleixner
@ 2020-07-20  9:33     ` Christoph Hellwig
  2020-07-29 20:59     ` Al Viro
  1 sibling, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-20  9:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Christoph Hellwig, Al Viro, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

On Fri, Jul 17, 2020 at 11:09:13PM +0200, Thomas Gleixner wrote:
> Christoph Hellwig <hch@lst.de> writes:
> 
> > Switch over all instances used directly as methods using these sed
> > expressions:
> >
> > sed -i -e 's/\.read\(\s*=\s*\)seq_read/\.read_iter\1seq_read_iter/g'
> 
> This sucks, really. I just got a patch against this converting the
> changed version to DEFINE_SHOW_ATTRIBUTE(somefile) and thereby removing
> the whole open coded gunk.

The changed version of what?

> If we do a tree wide change like this, then can we pretty please use a
> coccinelle script to convert all trivial instances to use
> DEFINE_SHOW_ATTRIBUTE so we don't have to touch the same place over and
> over.

I'm not going to complain about that if someone offers a script
for that.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-07-07 17:47 ` [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
@ 2020-07-29 20:50   ` Al Viro
  2020-07-30  7:02     ` Christoph Hellwig
  0 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2020-07-29 20:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin, linux-kernel,
	linux-fsdevel

On Tue, Jul 07, 2020 at 07:47:46PM +0200, Christoph Hellwig wrote:
> If we write to a file that implements ->write_iter there is no need
> to change the address limit if we send a kvec down.  Implement that
> case, and prefer it over using plain ->write with a changed address
> limit if available.

	You are flipping the priorities of ->write and ->write_iter
for kernel_write().  Now, there are 4 instances of file_operations
where we have both.  null_fops and zero_fops are fine either way -
->write() and ->write_iter() do the same thing there (and arguably
removing ->write might be the right thing; the only reason I hesitate
is that writing to /dev/null *is* critical for many things, including
the proper mail delivery ;-)

However, the other two (infinibarf and pcm) are different; there we
really have different semantics.  I don't believe anything writes into
either under KERNEL_DS, but having kernel_write() and vfs_write() with
subtly different semantics is asking for trouble down the road.

How about we remove ->write in null_fops/zero_fops and fail loudly if
*both* ->write() and ->write_iter() are present (in kernel_write(),
that is)?

There's a similar situation on the read side - there we have /dev/null
with both ->read() and ->read_iter() (and there "remove ->read" is
obviously the right thing to do) *and* we have pcm crap, with different
semantics for ->read() and ->read_iter().
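
The proposed policy for kernel_write() can be written down as a small
userspace model.  This is a sketch under assumptions, not the kernel
function: all model_* names are hypothetical, the -EINVAL return value and
the warning counter are illustrative choices, and the fallback to a lone
->write reflects the state at this patch rather than the end of the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_operations: write methods only. */
struct model_fops {
	long (*write)(const char *buf, size_t len);
	long (*write_iter)(const char *buf, size_t len);
};

/* Counts how often the ambiguous "both methods" case was rejected. */
int model_warned;

/* Stub write methods; each pretends to consume the whole buffer. */
long model_plain_write(const char *buf, size_t len)
{
	(void)buf;
	return (long)len;
}

long model_iter_write(const char *buf, size_t len)
{
	(void)buf;
	return (long)len;
}

/*
 * Refuse files that implement both methods, since the two may have
 * different semantics; otherwise prefer ->write_iter.
 */
long model_kernel_write(const struct model_fops *fops, const char *buf,
			size_t len)
{
	assert(fops != NULL);
	if (fops->write && fops->write_iter) {
		model_warned++;		/* stands in for a kernel warning */
		return -EINVAL;		/* assumed error code */
	}
	if (fops->write_iter)
		return fops->write_iter(buf, len);
	if (fops->write)
		return fops->write(buf, len);
	return -EINVAL;
}
```

In this model a table with both methods set is rejected and counted, while
a table with only ->write_iter behaves as before, so the caller never picks
silently between two implementations.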

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
  2020-07-17 21:09   ` Thomas Gleixner
  2020-07-20  9:33     ` Christoph Hellwig
@ 2020-07-29 20:59     ` Al Viro
  2020-07-30  7:10       ` Thomas Gleixner
  1 sibling, 1 reply; 44+ messages in thread
From: Al Viro @ 2020-07-29 20:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Christoph Hellwig, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

On Fri, Jul 17, 2020 at 11:09:13PM +0200, Thomas Gleixner wrote:
> 
> Needs some thought and maybe some cocci help from Julia, but that's way
> better than this brute force sed thing which results in malformed crap
> like this:
> 
> static const struct file_operations debug_stats_fops = {
> 	.open		= debug_stats_open,
> 	.read_iter		= seq_read_iter,
> 	.llseek		= seq_lseek,
> 	.release	= single_release,
> };
> 
> and proliferates the copy and paste voodoo programming.

Better copy and paste than templates, IMO; at least the former is
greppable; fucking DEFINE_..._ATTRIBUTE is *NOT*, especially due
to the use of ##.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-07 17:48 ` [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter Christoph Hellwig
@ 2020-07-30  0:05   ` Al Viro
  2020-07-30  7:03     ` Christoph Hellwig
  0 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2020-07-30  0:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin, linux-kernel,
	linux-fsdevel

On Tue, Jul 07, 2020 at 07:48:00PM +0200, Christoph Hellwig wrote:
> If a file implements the ->read_iter method, the iter based splice read
> works and is always preferred over the ->read based one.  Use it by
> default in do_splice_to and remove all the direct assignment of
> generic_file_splice_read to file_operations.

The worst problem here is the assumption that all ->read_iter() instances
will take pipe-backed destination; that's _not_ automatically true.
In particular, it's almost certainly false for tap_read_iter() (as
well as tun_chr_read_iter() in IFF_VNET_HDR case).

Other potentially interesting cases: cuse and hugetlbfs.

But in any case, that blind assertion ("iter based splice read works")
really needs to be backed by something.
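
The concern can be illustrated with a userspace model of the new default.
This is a sketch, not the splice code: every model_* name is hypothetical,
the -EFAULT returned by the iovec-only instance is an arbitrary stand-in for
"does not cope with a pipe-backed iterator", and the -EINVAL for files with
neither method is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an iov_iter: only its backing type and length. */
enum model_iter_type { MODEL_ITER_IOVEC, MODEL_ITER_PIPE };

struct model_iter {
	enum model_iter_type type;
	size_t count;
};

struct model_fops {
	long (*read_iter)(struct model_iter *to);
	long (*splice_read)(const struct model_fops *fops, size_t len);
};

/* Generic splice read: calls ->read_iter with a pipe-backed iterator. */
long model_generic_splice_read(const struct model_fops *fops, size_t len)
{
	struct model_iter to = { MODEL_ITER_PIPE, len };

	return fops->read_iter(&to);
}

/* Modelled default: an explicit ->splice_read wins, else use ->read_iter. */
long model_do_splice_to(const struct model_fops *fops, size_t len)
{
	assert(fops != NULL);
	if (fops->splice_read)
		return fops->splice_read(fops, len);
	if (fops->read_iter)
		return model_generic_splice_read(fops, len);
	return -EINVAL;
}

/* An instance that handles any iterator type. */
long model_any_iter_read(struct model_iter *to)
{
	return (long)to->count;
}

/* A hypothetical instance that only copes with user iovecs. */
long model_iovec_only_read(struct model_iter *to)
{
	if (to->type != MODEL_ITER_IOVEC)
		return -EFAULT;
	return (long)to->count;
}
```

A table using model_any_iter_read() splices fine under the default, but one
using model_iovec_only_read() now reaches its ->read_iter with a pipe-backed
iterator without ever having opted in, and fails.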

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-07-29 20:50   ` Al Viro
@ 2020-07-30  7:02     ` Christoph Hellwig
  0 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-30  7:02 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

On Wed, Jul 29, 2020 at 09:50:36PM +0100, Al Viro wrote:
> On Tue, Jul 07, 2020 at 07:47:46PM +0200, Christoph Hellwig wrote:
> > If we write to a file that implements ->write_iter there is no need
> > to change the address limit if we send a kvec down.  Implement that
> > case, and prefer it over using plain ->write with a changed address
> > limit if available.
> 
> 	You are flipping the priorities of ->write and ->write_iter
> for kernel_write().

Note by the end of the series (and what's been in linux-next for a while
now) there is no order, as kernel_write only uses ->write_iter, so a
few patches later this kinda becomes a moot point.

> Now, there are 4 instances of file_operations
> where we have both.  null_fops and zero_fops are fine either way -
> ->write() and ->write_iter() do the same thing there (and arguably
> removing ->write might be the right thing; the only reason I hesitate
> is that writing to /dev/null *is* critical for many things, including
> the proper mail delivery ;-)
> 
> However, the other two (infinibarf and pcm) are different; there we
> really have different semantics.  I don't believe anything writes into
> either under KERNEL_DS, but having kernel_write() and vfs_write() with
> subtly different semantics is asking for trouble down the road.
> 
> How about we remove ->write in null_fops/zero_fops and fail loudly if
> *both* ->write() and ->write_iter() are present (in kernel_write(),
> that is)?

I'm fine with removing plain ->write for /dev/null and /dev/zero, as
that seems the right thing to do.

Failing the kernel ops if both are present sounds fine, but I'm not sure
about the loud part as it could be user triggered through splice.  I'd
go for the same kind of noticeable but not loud warning that we have for
the lack of iter ops in kernel_read/write.
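
A minimal userspace sketch of the policy discussed here, assuming
simplified stand-in types: refuse the kernel-side write when ->write_iter
is missing or when both methods are present, and emit a single diagnostic
rather than a loud splat.  All model_* names are hypothetical and are not
kernel APIs; the real methods take different arguments.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real definitions. */
struct model_file;

struct model_file_operations {
	ssize_t (*write)(struct model_file *, const char *, size_t);
	ssize_t (*write_iter)(struct model_file *, const char *, size_t);
};

struct model_file {
	const struct model_file_operations *f_op;
};

/* Print at most one diagnostic: noticeable, but not a log flood. */
void model_warn_once(const char *why)
{
	static int warned;

	if (!warned) {
		warned = 1;
		fprintf(stderr, "kernel write not supported: %s\n", why);
	}
}

/* Trivial /dev/null-like handlers that just claim success. */
ssize_t model_null_write(struct model_file *f, const char *buf, size_t n)
{
	(void)f;
	(void)buf;
	return (ssize_t)n;
}

ssize_t model_null_write_iter(struct model_file *f, const char *buf, size_t n)
{
	(void)f;
	(void)buf;
	return (ssize_t)n;
}

/* Only an unambiguous ->write_iter is accepted for kernel-side writes. */
ssize_t model_kernel_write(struct model_file *file, const char *buf,
			   size_t count)
{
	const struct model_file_operations *fop = file->f_op;

	if (!fop->write_iter) {
		model_warn_once("no ->write_iter");
		return -EINVAL;
	}
	if (fop->write) {
		model_warn_once("both ->write and ->write_iter");
		return -EINVAL;
	}
	return fop->write_iter(file, buf, count);
}
```

With an instance that sets only .write_iter the call goes through; with
both methods set, or with only .write, it returns -EINVAL and the
diagnostic is printed at most once per process.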

> There's a similar situation on the read side - there we have /dev/null
> with both ->read() and ->read_iter() (and there "remove ->read" is
> obviously the right thing to do) *and* we have pcm crap, with different
> semantics for ->read() and ->read_iter().

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-30  0:05   ` Al Viro
@ 2020-07-30  7:03     ` Christoph Hellwig
  2020-07-30 15:08       ` Al Viro
  0 siblings, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-30  7:03 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

On Thu, Jul 30, 2020 at 01:05:44AM +0100, Al Viro wrote:
> On Tue, Jul 07, 2020 at 07:48:00PM +0200, Christoph Hellwig wrote:
> > If a file implements the ->read_iter method, the iter based splice read
> > works and is always preferred over the ->read based one.  Use it by
> > default in do_splice_to and remove all the direct assignment of
> > generic_file_splice_read to file_operations.
> 
> The worst problem here is the assumption that all ->read_iter() instances
> will take pipe-backed destination; that's _not_ automatically true.
> In particular, it's almost certainly false for tap_read_iter() (as
> well as tun_chr_read_iter() in IFF_VNET_HDR case).
> 
> Other potentially interesting cases: cuse and hugetlbfs.
> 
> But in any case, that blind assertion ("iter based splice read works")
> really needs to be backed by something.

I think we need to fix that in the instances, as we really expect
->splice_read to just work instead of the caller knowing what could
work and what might not.

* Re: [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter
  2020-07-29 20:59     ` Al Viro
@ 2020-07-30  7:10       ` Thomas Gleixner
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas Gleixner @ 2020-07-30  7:10 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

Al Viro <viro@zeniv.linux.org.uk> writes:
> On Fri, Jul 17, 2020 at 11:09:13PM +0200, Thomas Gleixner wrote:
>> 
>> Needs some thought and maybe some cocci help from Julia, but that's way
>> better than this brute force sed thing which results in malformed crap
>> like this:
>> 
>> static const struct file_operations debug_stats_fops = {
>> 	.open		= debug_stats_open,
>> 	.read_iter		= seq_read_iter,
>> 	.llseek		= seq_lseek,
>> 	.release	= single_release,
>> };
>> 
>> and proliferates the copy and paste voodoo programming.
>
> Better copy and paste than templates, IMO; at least the former is
> greppable; fucking DEFINE_..._ATTRIBUTE is *NOT*, especially due
> to the use of ##.

Copy and paste itself is not the issue, but once the copy and paste orgy
starts you end up with more subtle bugs and silly differences than
copies. I spent enough time cleaning such crap up just to figure out
that once you've finished a full tree sweep you can start over.

grep for these things is a nuisance, but it's not rocket science to
figure it out. I'd rather have to figure that out than stare at a
gazillion broken implementations.
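
A small userspace sketch of the trade-off being argued, assuming a
hypothetical template macro written in the style of the kernel's
DEFINE_SHOW_ATTRIBUTE.  MODEL_DEFINE_SHOW_ATTRIBUTE and struct model_fops
are made up for illustration.  The macro stamps out one consistent
implementation, but the names it generates with ## never appear literally
at the definition site, which is the grep problem.

```c
/* Hypothetical stand-in for struct file_operations. */
struct model_fops {
	int (*open)(void);
	const char *name;
};

/*
 * Hypothetical template: pastes the prefix onto "_open" and "_fops".
 * Every user gets the same (hopefully correct) boilerplate.
 */
#define MODEL_DEFINE_SHOW_ATTRIBUTE(name_)				\
int name_ ## _open(void)						\
{									\
	return name_ ## _show();					\
}									\
const struct model_fops name_ ## _fops = {				\
	.open = name_ ## _open,						\
	.name = #name_ "_fops",						\
}

/* The only hand-written part: the show routine. */
int debug_stats_show(void)
{
	return 42;
}

/*
 * Expands to the open routine and the ops table for this prefix.
 * Grepping for the generated table name finds its users, not this line.
 */
MODEL_DEFINE_SHOW_ATTRIBUTE(debug_stats);
```

Calling debug_stats_fops.open() here ends up in debug_stats_show().  The
open-coded alternative is a handful of lines per user, which is greppable
but is also where the per-copy differences creep in.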

Thanks,

        tglx

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-30  7:03     ` Christoph Hellwig
@ 2020-07-30 15:08       ` Al Viro
  2020-07-30 15:20         ` Christoph Hellwig
  0 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2020-07-30 15:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin, linux-kernel,
	linux-fsdevel

On Thu, Jul 30, 2020 at 09:03:29AM +0200, Christoph Hellwig wrote:
> On Thu, Jul 30, 2020 at 01:05:44AM +0100, Al Viro wrote:
> > On Tue, Jul 07, 2020 at 07:48:00PM +0200, Christoph Hellwig wrote:
> > > If a file implements the ->read_iter method, the iter based splice read
> > > works and is always preferred over the ->read based one.  Use it by
> > > default in do_splice_to and remove all the direct assignment of
> > > generic_file_splice_read to file_operations.
> > 
> > The worst problem here is the assumption that all ->read_iter() instances
> > will take pipe-backed destination; that's _not_ automatically true.
> > In particular, it's almost certainly false for tap_read_iter() (as
> > well as tun_chr_read_iter() in IFF_VNET_HDR case).
> > 
> > Other potentially interesting cases: cuse and hugetlbfs.
> > 
> > But in any case, that blind assertion ("iter based splice read works")
> > really needs to be backed by something.
> 
> I think we need to fix that in the instances, as we really expect
> ->splice_read to just work instead of the caller knowing what could
> work and what might not.

Er...  generic_file_splice_read() is a library helper; the decision to use
it is up to the filesystem/driver/protocol in question, and so's making sure
it's not used with ->read_iter() that isn't fit for it.

Note that we *do* have instances where we have different ->splice_read()
(sometimes using generic_file_splice_read(), sometimes not) even though
->read_iter() is there.

Your patch ignores those (thankfully), but commit message is rather
misleading - it strongly implies that generic_file_splice_read() is
*always* the right thing when ->read_iter() is there, not just that
in such cases it makes a better fallback than default_file_splice_read().

And even the latter assumption is not obvious - AFAICS, we do have
counterexamples.

I'm not saying that e.g. tun/tap don't need fixing for other reasons and
it's quite possible that they will become suitable for generic_file_splice_read()
after that's done.  But I'm really unhappy about the implied change of
generic_file_splice_read() role; if nothing else, commit message should
be very clear that if you have ->read_iter() and generic_file_splice_read()
won't do the right thing, you MUST provide ->splice_read() of your own.
Probably worth a Documentation/filesystems/porting entry as well.
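
A minimal sketch of the fallback order as the patch is described in this
thread, assuming simplified flags in place of the real method pointers:
an instance's own ->splice_read wins, otherwise the presence of
->read_iter selects generic_file_splice_read(), otherwise the old default
is used.  The model_* and MODEL_* names are hypothetical; this is not the
code in do_splice_to.

```c
/* Which splice-read implementation the model would pick. */
enum model_splice_choice {
	MODEL_SPLICE_OWN,	/* instance-provided ->splice_read */
	MODEL_SPLICE_GENERIC,	/* generic_file_splice_read() */
	MODEL_SPLICE_DEFAULT,	/* default_file_splice_read() */
};

/* Hypothetical summary of a file_operations instance. */
struct model_splice_fops {
	int has_read_iter;
	int has_splice_read;
};

enum model_splice_choice
model_pick_splice_read(const struct model_splice_fops *fop)
{
	/* An explicit method always takes precedence. */
	if (fop->has_splice_read)
		return MODEL_SPLICE_OWN;
	/*
	 * The contested step: ->read_iter alone implies the generic
	 * helper, whether or not the instance copes with a pipe-backed
	 * iterator.
	 */
	if (fop->has_read_iter)
		return MODEL_SPLICE_GENERIC;
	return MODEL_SPLICE_DEFAULT;
}
```

Under this model, an instance whose ->read_iter cannot take a pipe-backed
destination only stays safe by setting has_splice_read, i.e. by providing
its own method.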

Alternatively, if you really want to change the role of that thing,
we need to go through all instances that are *not* generic_file_splice_read()
and see what's going on in those.  Starting with the sockets.

The list right now is:
fs/fuse/dev.c:2263:     .splice_read    = fuse_dev_splice_read,
fs/overlayfs/file.c:786:        .splice_read    = ovl_splice_read,
net/socket.c:164:       .splice_read =  sock_splice_read,
kernel/relay.c:1331:    .splice_read    = relay_file_splice_read,
kernel/trace/trace.c:7081:      .splice_read    = tracing_splice_read_pipe,
kernel/trace/trace.c:7149:      .splice_read    = tracing_buffers_splice_read,
kernel/trace/trace.c:7712:      .splice_read    = tracing_buffers_splice_read,

The first 3 have ->read_iter(); the rest (kernel/* stuff) doesn't.
Socket case uses generic_file_splice_read() unless the protocol provides
an override; SMC, TCP, TCPv6, AF_UNIX STREAM and KCM SEQPACKET do that.

I hadn't looked into the socket side of things for 5 years or so, so I'd
have to dig the notes out first.  It wasn't pleasant...

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-30 15:08       ` Al Viro
@ 2020-07-30 15:20         ` Christoph Hellwig
  2020-07-30 16:17           ` Al Viro
  0 siblings, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-30 15:20 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

On Thu, Jul 30, 2020 at 04:08:26PM +0100, Al Viro wrote:
> > I think we need to fix that in the instances, as we really expect
> > ->splice_read to just work instead of the caller knowing what could
> > work and what might not.
> 
> Er...  generic_file_splice_read() is a library helper; the decision to use
> it is up to the filesystem/driver/protocol in question, and so's making sure
> it's not used with ->read_iter() that isn't fit for it.

Yes, but..  The problem is that while right now generic_file_splice_read
is the only user of ITER_PIPE there is absolutely no guarantee that
it remains the only user.  Having ->read_iter instances lingering that
can't deal with it is at best a minefield waiting for victims.

Fortunately I think the fix is pretty easy - remove the special pipe
zero copy optimization from copy_page_to_iter, and have the callers
that actually want it, because they have pagecache or similar
refcountable pages, use it explicitly for the ITER_PIPE case.  That gives
us a safe default with an opt-in into the optimized variant.  I'm
currently auditing all the users for how it is used and that looks
pretty promising.

> Note that we *do* have instances where we have different ->splice_read()
> (sometimes using generic_file_splice_read(), sometimes not) even though
> ->read_iter() is there.
> 
> Your patch ignores those (thankfully), but commit message is rather
> misleading - it strongly implies that generic_file_splice_read() is
> *always* the right thing when ->read_iter() is there, not just that
> in such cases it makes a better fallback than default_file_splice_read().

I don't think it always is right.  Not without a major audit and more
work at least.

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-30 15:20         ` Christoph Hellwig
@ 2020-07-30 16:17           ` Al Viro
  2020-07-30 16:22             ` Al Viro
  0 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2020-07-30 16:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin, linux-kernel,
	linux-fsdevel

On Thu, Jul 30, 2020 at 05:20:46PM +0200, Christoph Hellwig wrote:

> Fortunately I think the fix is pretty easy - remove the special pipe
> zero copy optimization from copy_page_to_iter, and have the callers
> that actually want it, because they have pagecache or similar
> refcountable pages, use it explicitly for the ITER_PIPE case.  That gives
> us a safe default with an opt-in into the optimized variant.  I'm
> currently auditing all the users for how it is used and that looks
> pretty promising.

Huh?  What does that have to do with anything?

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-30 16:17           ` Al Viro
@ 2020-07-30 16:22             ` Al Viro
  2020-07-30 16:31               ` Christoph Hellwig
  0 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2020-07-30 16:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Stephen Rothwell, Luis Chamberlain,
	Matthew Wilcox, Kees Cook, Iurii Zaikin, linux-kernel,
	linux-fsdevel

On Thu, Jul 30, 2020 at 05:17:01PM +0100, Al Viro wrote:
> On Thu, Jul 30, 2020 at 05:20:46PM +0200, Christoph Hellwig wrote:
> 
> > Fortunately I think the fix is pretty easy - remove the special pipe
> > zero copy optimization from copy_page_to_iter, and have the callers
> > that actually want it, because they have pagecache or similar
> > refcountable pages, use it explicitly for the ITER_PIPE case.  That gives
> > us a safe default with an opt-in into the optimized variant.  I'm
> > currently auditing all the users for how it is used and that looks
> > pretty promising.
> 
> Huh?  What does that have to do with anything?

FWIW, none of the dubious (and outright broken) cases I've found go anywhere
near that.  And it definitely won't help tun/tap...

* Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter
  2020-07-30 16:22             ` Al Viro
@ 2020-07-30 16:31               ` Christoph Hellwig
  0 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-30 16:31 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Stephen Rothwell,
	Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

On Thu, Jul 30, 2020 at 05:22:19PM +0100, Al Viro wrote:
> FWIW, none of the dubious (and outright broken) cases I've found go anywhere
> near that.  And it definitely won't help tun/tap...

Then I'm missing something obvious - what is the problem with tun/tap?

* [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-07-01 20:09 [RFC] stop using ->read and ->write for kernel access v3 Christoph Hellwig
@ 2020-07-01 20:09 ` Christoph Hellwig
  0 siblings, 0 replies; 44+ messages in thread
From: Christoph Hellwig @ 2020-07-01 20:09 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds
  Cc: Luis Chamberlain, Matthew Wilcox, Kees Cook, Iurii Zaikin,
	linux-kernel, linux-fsdevel

If we write to a file that implements ->write_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->write with a changed address
limit if available.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 96e8e354f99b45..bd46c959799e97 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -489,10 +489,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 }
 
 /* caller is responsible for file_start_write/file_end_write */
-ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
+ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
+		loff_t *pos)
 {
-	mm_segment_t old_fs;
-	const char __user *p;
 	ssize_t ret;
 
 	if (WARN_ON_ONCE(!(file->f_mode & FMODE_WRITE)))
@@ -500,18 +499,29 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write)
-		ret = file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		ret = new_sync_write(file, p, count, pos);
-	else
+	if (file->f_op->write_iter) {
+		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, WRITE, &iov, 1, count);
+		ret = file->f_op->write_iter(&kiocb, &iter);
+		if (ret > 0)
+			*pos = kiocb.ki_pos;
+	} else if (file->f_op->write) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
+		ret = file->f_op->write(file, (__force const char __user *)buf,
+				count, pos);
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_modify(file);
 		add_wchar(current, ret);
-- 
2.26.2


Thread overview: 44+ messages
2020-07-07 17:47 stop using ->read and ->write for kernel access v3 Christoph Hellwig
2020-07-07 17:47 ` [PATCH 01/23] cachefiles: switch to kernel_write Christoph Hellwig
2020-07-07 17:47 ` [PATCH 02/23] autofs: " Christoph Hellwig
2020-07-07 17:47 ` [PATCH 03/23] bpfilter: " Christoph Hellwig
2020-07-07 17:47 ` [PATCH 04/23] fs: unexport __kernel_write Christoph Hellwig
2020-07-07 17:47 ` [PATCH 05/23] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
2020-07-07 17:47 ` [PATCH 06/23] fs: implement kernel_write using __kernel_write Christoph Hellwig
2020-07-07 17:47 ` [PATCH 07/23] fs: remove __vfs_write Christoph Hellwig
2020-07-07 17:47 ` [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
2020-07-29 20:50   ` Al Viro
2020-07-30  7:02     ` Christoph Hellwig
2020-07-07 17:47 ` [PATCH 09/23] fs: add a __kernel_read helper Christoph Hellwig
2020-07-07 17:47 ` [PATCH 10/23] integrity/ima: switch to using __kernel_read Christoph Hellwig
2020-07-07 17:47 ` [PATCH 11/23] fs: implement kernel_read " Christoph Hellwig
2020-07-07 17:47 ` [PATCH 12/23] fs: remove __vfs_read Christoph Hellwig
2020-07-07 17:47 ` [PATCH 13/23] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
2020-07-07 17:47 ` [PATCH 14/23] seq_file: add seq_read_iter Christoph Hellwig
2020-07-07 17:47 ` [PATCH 16/23] proc: remove a level of indentation in proc_get_inode Christoph Hellwig
2020-07-07 17:47 ` [PATCH 17/23] proc: cleanup the compat vs no compat file ops Christoph Hellwig
2020-07-07 17:47 ` [PATCH 18/23] proc: add a read_iter method to proc proc_ops Christoph Hellwig
2020-07-07 17:47 ` [PATCH 19/23] proc: switch over direct seq_read method calls to seq_read_iter Christoph Hellwig
2020-07-07 17:47 ` [PATCH 20/23] sysctl: Convert to iter interfaces Christoph Hellwig
2020-07-07 17:47 ` [PATCH 21/23] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
2020-07-07 17:48 ` [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter Christoph Hellwig
2020-07-30  0:05   ` Al Viro
2020-07-30  7:03     ` Christoph Hellwig
2020-07-30 15:08       ` Al Viro
2020-07-30 15:20         ` Christoph Hellwig
2020-07-30 16:17           ` Al Viro
2020-07-30 16:22             ` Al Viro
2020-07-30 16:31               ` Christoph Hellwig
2020-07-07 17:48 ` [PATCH 23/23] fs: don't allow splice read/write without explicit ops Christoph Hellwig
2020-07-07 20:24 ` stop using ->read and ->write for kernel access v3 Linus Torvalds
2020-07-08  6:07   ` Christoph Hellwig
2020-07-07 23:03 ` Stephen Rothwell
     [not found] ` <20200707174801.4162712-16-hch@lst.de>
2020-07-10 12:55   ` [PATCH 15/23] seq_file: switch over direct seq_read method calls to seq_read_iter Jon Hunter
2020-07-10 12:58     ` Christoph Hellwig
2020-07-11  6:48     ` Christoph Hellwig
2020-07-11 11:47       ` Jon Hunter
2020-07-17 21:09   ` Thomas Gleixner
2020-07-20  9:33     ` Christoph Hellwig
2020-07-29 20:59     ` Al Viro
2020-07-30  7:10       ` Thomas Gleixner
  -- strict thread matches above, loose matches on Subject: below --
2020-07-01 20:09 [RFC] stop using ->read and ->write for kernel access v3 Christoph Hellwig
2020-07-01 20:09 ` [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
