From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752068AbdAaThv (ORCPT ); Tue, 31 Jan 2017 14:37:51 -0500
Received: from mx2.suse.de ([195.135.220.15]:43404 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751580AbdAaThl (ORCPT ); Tue, 31 Jan 2017 14:37:41 -0500
Subject: Re: [PATCH V6 1/5] LIB: Indirect ISA/LPC port IO introduced
To: John Garry, "zhichang.yuan", catalin.marinas@arm.com,
	will.deacon@arm.com, robh+dt@kernel.org, frowand.list@gmail.com,
	bhelgaas@google.com, rafael@kernel.org, mark.rutland@arm.com,
	brian.starkey@arm.com, olof@lixom.net, arnd@arndb.de,
	linux-arm-kernel@lists.infradead.org
References: <1485241525-201782-1-git-send-email-yuanzhichang@hisilicon.com>
	<1485241525-201782-2-git-send-email-yuanzhichang@hisilicon.com>
	<5dc0dc9c-124e-398f-a3b2-40059268e859@suse.de>
	<6d5523be-1841-18b9-9b83-6f7e591ed66b@huawei.com>
Cc: lorenzo.pieralisi@arm.com, benh@kernel.crashing.org,
	linux-kernel@vger.kernel.org, linuxarm@huawei.com,
	devicetree@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-serial@vger.kernel.org, minyard@acm.org, liviu.dudau@arm.com,
	zourongrong@gmail.com, gabriele.paoloni@huawei.com,
	zhichang.yuan02@gmail.com, kantyzc@163.com, xuwei5@hisilicon.com
From: Alexander Graf
Message-ID: <1a1f1e70-b6e7-f3c6-2b86-348b3d28edfe@suse.de>
Date: Tue, 31 Jan 2017 20:37:28 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0)
	Gecko/20100101 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <6d5523be-1841-18b9-9b83-6f7e591ed66b@huawei.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 31/01/2017 14:32, John Garry wrote:
> On 30/01/2017 17:12, Alexander Graf wrote:
>> On 01/24/2017 08:05 AM, zhichang.yuan wrote:
>>> Low-pin-count interface is integrated into some SoCs.
>>> The accesses to those peripherals under LPC make use of I/O ports
>>> rather than the memory mapped I/O.
>>>
>>> To drive these devices, this patch introduces a method named
>>> indirect-IO.
>>> In this method the in/out() accessor in include/asm-generic/io.h will be
>>> redefined. When upper layer drivers call in/out() with those known
>>> legacy port addresses to access the peripherals, the I/O operations
>>> will be routed to the right hooks which are registered specific to the
>>> host device, such as LPC.
>>> Then the hardware relevant manipulations are finished by the
>>> corresponding host.
>>>
>>> According to the comments on V5, this patch adds a common indirect-IO
>>> driver which supports this I/O indirection to the generic directory.
>>>
>>> In the later patches, some host-relevant drivers are implemented to
>>> support the specific I/O hooks and register them.
>>> Based on these, the upper layer drivers which depend on in/out() can
>>> work well without any extra work or any changes.
>>>
>>> Signed-off-by: zhichang.yuan
>>> Signed-off-by: Gabriele Paoloni
>>> Signed-off-by: John Garry
>>
>> I like the extio idea. That allows us to handle all PIO requests on
>> platforms that don't have native PIO support via different routes
>> depending on the region they're in. Unfortunately we now have 2
>> frameworks for handling sparse PIO regions: One in extio, one in PCI.
>>
>> Why don't we just merge the two? Most of the code that has #ifdef
>> PCI_IOBASE throughout the code base sounds like an ideal candidate to
>> get migrated to extio instead. Then we only have a single framework to
>> worry about ...
>
> To be clear, are you suggesting we merge the functionality from
> pci_register_io_range(), pci_pio_to_address(), pci_address_to_pio() into
> extio, so extio manages all PIO?

Yes, I guess so.

> And having a single type of node to
> register PIO ranges, by amalgamating struct extio_node and io_range (as
> Bjorn mentioned)?

I'm not quite sure I follow you here. Basically I think you want a
generic "non-x86 PIO" framework that PCI just plugs into.

I don't think that necessarily means you want to statically allocate
regions of that PIO space to separate (pseudo-)devices. Instead,
everyone shares that space and should be able to fail gracefully if some
space is already occupied.

> It would make sense. We would be somewhat decoupling PIO from PCI.

Yes :).

> I think that other architectures, like PPC, and other code would need to
> be fixed up to handle this.

I think only PPC, Microblaze and ARM are using this. Grep for
PCI_IOBASE. It's not that many.

> We need to consider all the other challenges/obstacles to this.

Well, getting our abstraction levels right to me sounds like it's worth
the obstacles.

>
>>
>>> ---
>>>   include/asm-generic/io.h |  50 ++++++++++++++++
>>>   include/linux/extio.h    |  85 +++++++++++++++++++++++++++
>>>   include/linux/io.h       |   1 +
>>>   lib/Kconfig              |   8 +++
>>>   lib/Makefile             |   2 +
>>>   lib/extio.c              | 147 +++++++++++++++++++++++++++++++++++++++++++++++
>>>   create mode 100644 include/linux/extio.h
>>>   create mode 100644 lib/extio.c
>>>
>
>
>
>>> + * Copyright (C) 2016 Hisilicon Limited, All Rights Reserved.
>>> + * Author: Zhichang Yuan
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program.  If not, see
>>> .
>>> + */
>>> +
>>> +#include
>>> +#include
>>> +
>>> +static LIST_HEAD(extio_dev_list);
>>> +static DEFINE_RWLOCK(extio_list_lock);
>>
>> Why not just make the list an RCU list? Then you don't need read locks.
>> We also wouldn't create potential lock contention between devices that
>> could easily have parallel PIO operations (say a PCI device and an LPC
>> device).
>>
>
> OK
>
>>> +
>>> +void register_extio(struct extio_node *node)
>>> +{
>>> +	write_lock(&extio_list_lock);
>>> +	list_add_tail(&node->list, &extio_dev_list);
>>> +	write_unlock(&extio_list_lock);
>>> +}
>>> +
>>> +static struct extio_node *find_extio_token(unsigned long addr)
>>> +{
>>> +	struct extio_node *extio_entry;
>>> +
>>> +	read_lock(&extio_list_lock);
>>> +	list_for_each_entry(extio_entry, &extio_dev_list, list) {
>>> +		if ((addr < extio_entry->io_start + extio_entry->range_size) &&
>>> +			(addr >= extio_entry->io_start))
>>> +			break;
>>> +	}
>>> +	read_unlock(&extio_list_lock);
>>> +	return (&extio_entry->list == &extio_dev_list) ? NULL :
>>> extio_entry;
>>> +}
>>> +
>>> +struct extio_node *extio_find_node(struct fwnode_handle *node)
>>> +{
>>> +	struct extio_node *entry;
>>> +
>>> +	read_lock(&extio_list_lock);
>>> +	list_for_each_entry(entry, &extio_dev_list, list) {
>>> +		if (entry->fwnode == node)
>>> +			break;
>>> +	}
>>> +	read_unlock(&extio_list_lock);
>>> +
>>> +	return (&entry->list == &extio_dev_list) ?
>>> +		NULL : entry;
>>> +}
>>> +
>>> +unsigned long extio_translate(struct fwnode_handle *node,
>>> +		unsigned long bus_addr)
>>> +{
>>> +	struct extio_node *entry;
>>> +	unsigned long port_id = -1;
>>> +
>>> +	read_lock(&extio_list_lock);
>>> +	list_for_each_entry(entry, &extio_dev_list, list) {
>>> +		if (entry->fwnode == node &&
>>> +			bus_addr >= entry->bus_start &&
>>> +			bus_addr - entry->bus_start < entry->range_size)
>>> +			port_id = entry->io_start + bus_addr -
>>> +					entry->bus_start;
>>> +	}
>>> +	read_unlock(&extio_list_lock);
>>> +
>>> +	return port_id;
>>> +}
>>> +
>>> +#ifdef PCI_IOBASE
>>> +
>>> +#define BUILD_EXTIO(bw, type) \
>>> +type extio_in##bw(unsigned long addr) \
>>> +{ \
>>> +	struct extio_node *extio_entry = find_extio_token(addr); \
>>> + \
>>> +	if (!extio_entry) \
>>> +		return read##bw(PCI_IOBASE + addr); \
>>> +	return extio_entry->ops->pfin ? \
>>> +		extio_entry->ops->pfin(extio_entry->devpara, \
>>> +			addr, sizeof(type)) : -1; \
>>> +} \
>>> + \
>>> +void extio_out##bw(type value, unsigned long addr) \
>>> +{ \
>>> +	struct extio_node *extio_entry = find_extio_token(addr); \
>>> + \
>>> +	if (!extio_entry) \
>>> +		write##bw(value, PCI_IOBASE + addr); \
>>
>> All of the fallback code would also disappear as a nice side effect of
>> making pci pio handling a user of extio :).
>
> Is your idea that PCI IO space will also register accessors, which would
> be the same read{b,w,l}/write{b,w,l}?

Yes. If you need to later on accelerate that bit, you can always do
something like

   if (extio_entry->ops->pfin == pci_extio_in)
     return pci_extio_in(...);

which should get you all the prefetcher and branch prediction benefits
that the current version gives you. But for starters I'd leave that out,
since I doubt it'll have measurable performance impact to go via an
indirect function call.

>
>>
>
> It would be nice to have a quicker way to do the lookup from address to
> node, as we loop all nodes in find_extio_token() every single time.

You can always replace the search with a tree. But to me that's an
implementation detail that's easy enough to replace in a follow-up patch
series.


Alex
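
For illustration only, here is a rough userspace sketch of two points from the thread: registration that fails gracefully when part of the PIO space is already occupied, and a lookup that does not walk every node. It is not kernel code and not part of the patch. To stay short it uses a sorted array with binary search as a stand-in for a tree, and the names pio_range, pio_register(), pio_find() and PIO_MAX_RANGES are hypothetical. Locking (or RCU) and address overflow are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct extio_node; only the range fields matter. */
struct pio_range {
	unsigned long io_start;   /* first port of the range */
	unsigned long range_size; /* number of ports in the range */
};

#define PIO_MAX_RANGES 16 /* arbitrary limit chosen for this sketch */

static struct pio_range pio_table[PIO_MAX_RANGES]; /* kept sorted by io_start */
static size_t pio_count;

/* Index of the first entry whose io_start is greater than addr. */
static size_t pio_upper_bound(unsigned long addr)
{
	size_t lo = 0, hi = pio_count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pio_table[mid].io_start <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* O(log n) lookup: the range containing addr, or NULL if none does. */
static const struct pio_range *pio_find(unsigned long addr)
{
	size_t pos = pio_upper_bound(addr);

	/* Only the last range starting at or below addr can contain it. */
	if (pos > 0 &&
	    addr - pio_table[pos - 1].io_start < pio_table[pos - 1].range_size)
		return &pio_table[pos - 1];
	return NULL;
}

/*
 * Insert a range while keeping the table sorted. Returns 0 on success,
 * or -1 if the range is empty, the table is full, or any of the ports
 * are already occupied.
 */
static int pio_register(unsigned long io_start, unsigned long range_size)
{
	size_t pos;

	if (range_size == 0 || pio_count == PIO_MAX_RANGES)
		return -1;

	pos = pio_upper_bound(io_start);

	/* Does the predecessor reach into the new range? */
	if (pos > 0 &&
	    io_start - pio_table[pos - 1].io_start < pio_table[pos - 1].range_size)
		return -1;
	/* Does the new range reach into the successor? */
	if (pos < pio_count && pio_table[pos].io_start - io_start < range_size)
		return -1;

	memmove(&pio_table[pos + 1], &pio_table[pos],
		(pio_count - pos) * sizeof(pio_table[0]));
	pio_table[pos].io_start = io_start;
	pio_table[pos].range_size = range_size;
	pio_count++;

	/* The table must still be strictly ordered by start address. */
	assert(pos == 0 || pio_table[pos - 1].io_start < io_start);
	return 0;
}
```

A caller would treat a -1 from pio_register() as "that space is taken" and back off, rather than relying on a static split of the port space. In the kernel an rbtree or interval tree would probably be the more natural structure, but that is an assumption on my part and not something the patch does.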