From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754597AbcJSWUe (ORCPT ); Wed, 19 Oct 2016 18:20:34 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35964 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751246AbcJSWUc (ORCPT ); Wed, 19 Oct 2016 18:20:32 -0400 Date: Wed, 19 Oct 2016 16:20:30 -0600 From: Alex Williamson To: Kirti Wankhede Cc: , , , , , , , , Subject: Re: [PATCH v9 01/12] vfio: Mediated device Core driver Message-ID: <20161019162030.6e37a6c1@t450s.home> In-Reply-To: References: <1476739332-4911-1-git-send-email-kwankhede@nvidia.com> <1476739332-4911-2-git-send-email-kwankhede@nvidia.com> <20161018171633.78816a8d@t450s.home> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Wed, 19 Oct 2016 22:20:31 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 20 Oct 2016 00:46:48 +0530 Kirti Wankhede wrote: > On 10/19/2016 4:46 AM, Alex Williamson wrote: > > On Tue, 18 Oct 2016 02:52:01 +0530 > > Kirti Wankhede wrote: > > > ... > >> +static struct mdev_device *__find_mdev_device(struct parent_device *parent, > >> + uuid_le uuid) > >> +{ > >> + struct device *dev; > >> + > >> + dev = device_find_child(parent->dev, &uuid, _find_mdev_device); > >> + if (!dev) > >> + return NULL; > >> + > >> + put_device(dev); > >> + > >> + return to_mdev_device(dev); > >> +} > > > > This function is only used by mdev_device_create() for the purpose of > > checking whether a given uuid for a parent already exists, so the > > returned device is not actually used. However, at the point where > > we're using to_mdev_device() here, we don't actually hold a reference to > > the device, so that function call and any possible use of the returned > > pointer by the callee is invalid. I would either turn this into a > > "get" function where the callee has a device reference and needs to do > > a "put" on it or change this to a "exists" test where true/false is > > returned and the function cannot be later mis-used to do a device > > lookup where the reference isn't actually valid. > > > > I'll change it to return 0 if not found and -EEXIST if found. > > > >> +int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid) > >> +{ > >> + int ret; > >> + struct mdev_device *mdev; > >> + struct parent_device *parent; > >> + struct mdev_type *type = to_mdev_type(kobj); > >> + > >> + parent = mdev_get_parent(type->parent); > >> + if (!parent) > >> + return -EINVAL; > >> + > >> + /* Check for duplicate */ > >> + mdev = __find_mdev_device(parent, uuid); > >> + if (mdev) { > >> + ret = -EEXIST; > >> + goto create_err; > >> + } > > > > We check here whether the {parent,uuid} already exists, but what > > prevents us racing with another create call with the same uuid? ie. > > neither exists at this point. Will device_register() fail if the > > device name already exists? If so, should we just rely on the error > > there and skip this duplicate check? If not, we need a mutex to avoid > > the race. > > > > Yes, device_register() fails if device exists already with below > warning. Is it ok to dump such warning? I think, this should be fine, > right? then we can remove duplicate check. > > If we want to avoid such warning, we should have duplication check. We should avoid such warnings, bugs will get filed otherwise. Thanks for checking. Thanks, Alex > [ 610.847958] ------------[ cut here ]------------ > [ 610.855377] WARNING: CPU: 15 PID: 19839 at fs/sysfs/dir.c:31 > sysfs_warn_dup+0x64/0x80 > [ 610.865798] sysfs: cannot create duplicate filename > '/devices/pci0000:80/0000:80:02.0/0000:83:00.0/0000:84:08.0/0000:85:00.0/83b8f4f2-509f-382f-3c1e-e6bfe0fa1234' > [ 610.885101] Modules linked in:[ 610.888039] nvidia(POE) > vfio_iommu_type1 vfio_mdev mdev vfio nfsv4 dns_resolver nfs fscache > sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp > kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel > ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul ablk_helper > cryptd nfsd auth_rpcgss nfs_acl lockd mei_me grace iTCO_wdt > iTCO_vendor_support mei ipmi_si pcspkr ioatdma i2c_i801 lpc_ich shpchp > i2c_smbus mfd_core ipmi_msghandler acpi_pad uinput sunrpc xfs libcrc32c > sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt > fb_sys_fops ttm drm igb ahci libahci ptp libata pps_core dca > i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last > unloaded: mdev] > [ 610.963835] CPU: 15 PID: 19839 Comm: bash Tainted: P OE > 4.8.0-next-20161013+ #0 > [ 610.973779] Hardware name: Supermicro > SYS-2027GR-5204A-NC024/X9DRG-HF, BIOS 1.0c 02/28/2013 > [ 610.983769] ffffc90009323ae0 ffffffff813568bf ffffc90009323b30 > 0000000000000000 > [ 610.992867] ffffc90009323b20 ffffffff81085511 0000001f00001000 > ffff8808839ef000 > [ 611.001954] ffff88108b30f900 ffff88109ae368e8 ffff88109ae580b0 > ffff881099cc0818 > [ 611.011055] Call Trace: > [ 611.015087] [] dump_stack+0x63/0x84 > [ 611.021784] [] __warn+0xd1/0xf0 > [ 611.028115] [] warn_slowpath_fmt+0x5f/0x80 > [ 611.035379] [] ? kernfs_path_from_node+0x50/0x60 > [ 611.043148] [] sysfs_warn_dup+0x64/0x80 > [ 611.050109] [] sysfs_create_dir_ns+0x7e/0x90 > [ 611.057481] [] kobject_add_internal+0xc1/0x340 > [ 611.065018] [] kobject_add+0x75/0xd0 > [ 611.071635] [] device_add+0x119/0x610 > [ 611.078314] [] device_register+0x1a/0x20 > [ 611.085261] [] mdev_device_create+0xdd/0x200 [mdev] > [ 611.093143] [] create_store+0xa8/0xe0 [mdev] > [ 611.100385] [] mdev_type_attr_store+0x1b/0x30 [mdev] > [ 611.108309] [] sysfs_kf_write+0x3a/0x50 > [ 611.115096] [] kernfs_fop_write+0x10b/0x190 > [ 611.122231] [] __vfs_write+0x37/0x140 > [ 611.128817] [] ? handle_mm_fault+0x724/0xd80 > [ 611.135976] [] vfs_write+0xb2/0x1b0 > [ 611.142354] [] ? syscall_trace_enter+0x1d0/0x2b0 > [ 611.149836] [] SyS_write+0x55/0xc0 > [ 611.156065] [] do_syscall_64+0x67/0x180 > [ 611.162734] [] entry_SYSCALL64_slow_path+0x25/0x25 > [ 611.170345] ---[ end trace b05a73599da2ba3f ]--- > [ 611.175940] ------------[ cut here ]------------ > > > > >> +static ssize_t create_store(struct kobject *kobj, struct device *dev, > >> + const char *buf, size_t count) > >> +{ > >> + char *str; > >> + uuid_le uuid; > >> + int ret; > >> + > >> + if (count < UUID_STRING_LEN) > >> + return -EINVAL; > > > > > > Can't we also test for something unreasonably large? > > > > Ok. I'll add that check. > > > > >> + > >> + str = kstrndup(buf, count, GFP_KERNEL); > >> + if (!str) > >> + return -ENOMEM; > >> + > >> + ret = uuid_le_to_bin(str, &uuid); > > > > nit, we can kfree(str) here regardless of the return. > > > >> + if (!ret) { > >> + > >> + ret = mdev_device_create(kobj, dev, uuid); > >> + if (ret) > >> + pr_err("mdev_create: Failed to create mdev device\n"); > > > > What value does this pr_err add? It doesn't tell us why it failed and > > the user will already know if failed by the return value of their write. > > > > Ok, will remove it. > > >> + else > >> + ret = count; > >> + } > >> + > >> + kfree(str); > >> + return ret; > >> +} > > ... > > >> +static inline struct mdev_driver *to_mdev_driver(struct device_driver *drv) > >> +{ > >> + return drv ? container_of(drv, struct mdev_driver, driver) : NULL; > >> +} > >> + > >> +static inline struct mdev_device *to_mdev_device(struct device *dev) > >> +{ > >> + return dev ? container_of(dev, struct mdev_device, dev) : NULL; > > > > > > Do we really need this NULL dev/drv behavior? I don't see that any of > > the callers can pass NULL into these. The PCI equivalents don't > > support this behavior and it doesn't seem they need to. Thanks, > > > > Ok, I'll update that. > > Kirti