From: Mike Christie <mchristi@redhat.com>
To: Xiubo Li <lixiubo@cmss.chinamobile.com>, nab@linux-iscsi.org
Cc: agrover@redhat.com, iliastsi@arrikto.com, namei.unix@gmail.com,
	sheng@yasker.org, linux-scsi@vger.kernel.org,
	target-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jianfei Hu <hujianfei@cmss.chinamobile.com>
Subject: Re: [PATCH v6 2/2] tcmu: Add global data block pool support
Date: Mon, 1 May 2017 13:40:32 -0500
Message-ID: <59078120.3000200@redhat.com>
In-Reply-To: <25e7247f-116f-aaa5-e895-b503c0a893f9@cmss.chinamobile.com>

On 04/30/2017 06:29 AM, Xiubo Li wrote:
> [...]
>>> +static struct page *tcmu_try_get_block_page(struct tcmu_dev *udev, uint32_t dbi)
>>> +{
>>> +	struct page *page;
>>> +	int ret;
>>> +
>>> +	mutex_lock(&udev->cmdr_lock);
>>> +	page = tcmu_get_block_page(udev, dbi);
>>> +	if (likely(page)) {
>>> +		mutex_unlock(&udev->cmdr_lock);
>>> +		return page;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Normally it shouldn't be here:
>>> +	 * Only when the userspace has touched the blocks which
>>> +	 * are out of the tcmu_cmd's data iov[], and will return
>>> +	 * one zeroed page.
>>
>> Is it a userspace bug when this happens? Do you know when it is
>> occurring?
>
> Since UIO maps the whole ring buffer into user space up front, it is
> legal for userspace to access any block within the limits of the
> mapped ring area.
>
> In practice, though, hitting this path is normally a userspace bug.
> Without this check the kernel would emit many page-fault dump traces.
>
> Maybe outputting a warning message here would be a good idea; it
> would make this easier to debug from userspace.

Yeah.
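Something simple in that fallback branch would be enough. E.g. (untested
sketch only, the exact message text and log level are up to you):

	/*
	 * Userspace touched a block outside of any command's data iov[].
	 * That is almost always a userspace bug, so make it visible (rate
	 * limited so a buggy daemon cannot flood the log) before falling
	 * back to the zeroed page.
	 */
	pr_warn_ratelimited("tcmu: userspace accessed unallocated data block %u\n",
			    dbi);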
>
> [...]
>>> @@ -1388,6 +1509,81 @@ static ssize_t tcmu_cmd_time_out_store(struct config_item *item, const char *pag
>>>  	.tb_dev_attrib_attrs	= NULL,
>>>  };
>>> +static int unmap_thread_fn(void *data)
>>> +{
>>> +	struct tcmu_dev *udev;
>>> +	loff_t off;
>>> +	uint32_t start, end, block;
>>> +	struct page *page;
>>> +	int i;
>>> +
>>> +	while (1) {
>>> +		DEFINE_WAIT(__wait);
>>> +
>>> +		prepare_to_wait(&unmap_wait, &__wait, TASK_INTERRUPTIBLE);
>>> +		schedule();
>>> +		finish_wait(&unmap_wait, &__wait);
>>> +
>>> +		mutex_lock(&root_udev_mutex);
>>> +		list_for_each_entry(udev, &root_udev, node) {
>>> +			mutex_lock(&udev->cmdr_lock);
>>> +
>>> +			/* Try to complete the finished commands first */
>>> +			tcmu_handle_completions(udev);
>>> +
>>> +			/* Skip the udevs waiting the global pool or in idle */
>>> +			if (udev->waiting_global || !udev->dbi_thresh) {
>>> +				mutex_unlock(&udev->cmdr_lock);
>>> +				continue;
>>> +			}
>>> +
>>> +			end = udev->dbi_max + 1;
>>> +			block = find_last_bit(udev->data_bitmap, end);
>>> +			if (block == udev->dbi_max) {
>>> +				/*
>>> +				 * The last bit is dbi_max, so there is
>>> +				 * no need to shrink any blocks.
>>> +				 */
>>> +				mutex_unlock(&udev->cmdr_lock);
>>> +				continue;
>>> +			} else if (block == end) {
>>> +				/* The current udev will goto idle state */
>>> +				udev->dbi_thresh = start = 0;
>>> +				udev->dbi_max = 0;
>>> +			} else {
>>> +				udev->dbi_thresh = start = block + 1;
>>> +				udev->dbi_max = block;
>>> +			}
>>> +
>>> +			/* Here will truncate the data area from off */
>>> +			off = udev->data_off + start * DATA_BLOCK_SIZE;
>>> +			unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
>>> +
>>> +			/* Release the block pages */
>>> +			for (i = start; i < end; i++) {
>>> +				page = radix_tree_delete(&udev->data_blocks, i);
>>> +				if (page) {
>>> +					__free_page(page);
>>> +					atomic_dec(&global_db_count);
>>> +				}
>>> +			}
>>> +			mutex_unlock(&udev->cmdr_lock);
>>> +		}
>>> +
>>> +		/*
>>> +		 * Try to wake up the udevs who are waiting
>>> +		 * for the global data pool.
>>> +		 */
>>> +		list_for_each_entry(udev, &root_udev, node) {
>>> +			if (udev->waiting_global)
>>> +				wake_up(&udev->wait_cmdr);
>>> +		}
>>
>> To avoid starvation, I think you want a second list/fifo that holds
>> the waiters. In tcmu_get_empty_block, if the list is not empty, record
>> how many pages we needed and then add the device to the list and wait
>> in tcmu_queue_cmd_ring.
>>
>> Above, if we freed enough pages for the device at the head, then wake
>> up that device.
>>
>> I think you also need a wake_up call in the completion path in case
>> the initial call could not free enough pages. It could probably check
>> whether the completion was going to free enough pages for a waiter and
>> then call wake.
>>
> Yes, I meant to introduce this later, after this series, so as not to
> make the patches too complex to review.
>
> If you agree, I will do this later, or in the v7 series?

Yeah, I am ok with adding it after the initial patches go in.
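When you do get to it, here is roughly the shape I was thinking of
(untested sketch only -- the field and function names below are made up,
they are not in the posted patches, and locking is left out for brevity):

static LIST_HEAD(tcmu_block_waiters);	/* FIFO of devs short on pages */

struct tcmu_dev {
	/* ... existing fields ... */
	struct list_head waiter_entry;	/* entry on tcmu_block_waiters,
					 * INIT_LIST_HEAD'd at dev setup */
	uint32_t blocks_needed;		/* pages this dev still needs */
};

/*
 * tcmu_get_empty_block(): when the global pool is exhausted, or someone
 * is already queued, record what we still need and get in line, then
 * wait in tcmu_queue_cmd_ring as before.
 */
static void tcmu_queue_block_waiter(struct tcmu_dev *udev, uint32_t needed)
{
	udev->blocks_needed = needed;
	if (list_empty(&udev->waiter_entry))
		list_add_tail(&udev->waiter_entry, &tcmu_block_waiters);
}

/*
 * Called from the unmap thread and from the completion path after pages
 * have been released.  Only the device at the head of the queue is
 * woken, and only once enough pages have been freed for it, so devices
 * that queued up later cannot starve it.
 */
static void tcmu_wake_block_waiter(uint32_t freed)
{
	struct tcmu_dev *udev;

	if (list_empty(&tcmu_block_waiters))
		return;

	udev = list_first_entry(&tcmu_block_waiters, struct tcmu_dev,
				waiter_entry);
	if (freed >= udev->blocks_needed) {
		list_del_init(&udev->waiter_entry);
		wake_up(&udev->wait_cmdr);
	}
}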