From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D4D2C47095 for ; Wed, 7 Oct 2020 17:17:49 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 03336206DD for ; Wed, 7 Oct 2020 17:17:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=kapsi.fi header.i=@kapsi.fi header.b="KhyMgS7j" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 03336206DD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3503C6E917; Wed, 7 Oct 2020 17:17:27 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [IPv6:2001:67c:1be8::25]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2F64A6E917 for ; Wed, 7 Oct 2020 17:17:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=kapsi.fi; s=20161220; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=DoIZXxo+QghZmHm5l7527MUsaMCTshHGo5O3eQeqUTk=; b=KhyMgS7j+2TyTMYl7H4lpwMNSn dOltS7qltYTK0CnRdQ6W3Z2cwKD7r4T0k+4h/DW+AVA093CcQzo47j/q7ehhEzn0KO3SAL+EhnAru QupznL21+/+bqOwNtNqTWUBuoQcXPj0wIG6W0dUdaRKgua+9JMbR3D5jQug3hRbCejDl/fVT2yHmz q7suXQqnW5IupiNrThHM+FoqPQh0XA7HnRICBYultRNqcpOdg49rDMIogCddJxbJ31TAeYSHD7F8b AxuS7I7jdPXtE+jUEApfmxNWV9phwlOeg5mJU4bmQkYcGt9q3FdkMCjsj35qMdOKegNb148ItaYWb l/Cm4xYg==; Received: from dsl-hkibng22-54f986-236.dhcp.inet.fi ([84.249.134.236] helo=toshino.localdomain) by mail.kapsi.fi with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.89) (envelope-from ) id 1kQD40-0006P9-IO; Wed, 07 Oct 2020 20:17:16 +0300 From: Mikko Perttunen To: thierry.reding@gmail.com, jonathanh@nvidia.com, digetx@gmail.com, airlied@linux.ie, daniel@ffwll.ch Subject: [PATCH v3 12/20] gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer Date: Wed, 7 Oct 2020 20:12:30 +0300 Message-Id: <20201007171238.1795964-13-mperttunen@nvidia.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201007171238.1795964-1-mperttunen@nvidia.com> References: <20201007171238.1795964-1-mperttunen@nvidia.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 84.249.134.236 X-SA-Exim-Mail-From: mperttunen@nvidia.com X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-tegra@vger.kernel.org, talho@nvidia.com, bhuntsman@nvidia.com, dri-devel@lists.freedesktop.org, Mikko Perttunen Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add support for inserting syncpoint waits in the CDMA pushbuffer. These waits need to be done in HOST1X class, while gather submitted by the application execute in engine class. Support is added by converting the gather list of job into a command list that can include both gathers and waits. When the job is submitted, these commands are pushed as the appropriate opcodes on the CDMA pushbuffer. Signed-off-by: Mikko Perttunen --- drivers/gpu/host1x/hw/channel_hw.c | 51 +++++++++++++++-------- drivers/gpu/host1x/hw/debug_hw.c | 9 +++- drivers/gpu/host1x/job.c | 67 +++++++++++++++++++++--------- drivers/gpu/host1x/job.h | 14 +++++++ include/linux/host1x.h | 5 ++- 5 files changed, 105 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index bf21512e5078..d88a32f73f5e 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -55,31 +55,46 @@ static void submit_gathers(struct host1x_job *job) #endif unsigned int i; - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; - dma_addr_t addr = g->base + g->offset; - u32 op2, op3; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_cmd *cmd = &job->cmds[i]; - op2 = lower_32_bits(addr); - op3 = upper_32_bits(addr); + if (cmd->is_wait) { + /* TODO use modern wait */ + host1x_cdma_push(cdma, + host1x_opcode_setclass(HOST1X_CLASS_HOST1X, + host1x_uclass_wait_syncpt_r(), 1), + host1x_class_host_wait_syncpt(cmd->wait.id, + cmd->wait.threshold)); + host1x_cdma_push( + cdma, host1x_opcode_setclass(job->class, 0, 0), + HOST1X_OPCODE_NOP); + } else { + struct host1x_job_gather *g = &cmd->gather; - trace_write_gather(cdma, g->bo, g->offset, g->words); + dma_addr_t addr = g->base + g->offset; + u32 op2, op3; - if (op3 != 0) { + op2 = lower_32_bits(addr); + op3 = upper_32_bits(addr); + + trace_write_gather(cdma, g->bo, g->offset, g->words); + + if (op3 != 0) { #if HOST1X_HW >= 6 - u32 op1 = host1x_opcode_gather_wide(g->words); - u32 op4 = HOST1X_OPCODE_NOP; + u32 op1 = host1x_opcode_gather_wide(g->words); + u32 op4 = HOST1X_OPCODE_NOP; - host1x_cdma_push_wide(cdma, op1, op2, op3, op4); + host1x_cdma_push_wide(cdma, op1, op2, op3, op4); #else - dev_err(dev, "invalid gather for push buffer %pad\n", - &addr); - continue; + dev_err(dev, "invalid gather for push buffer %pad\n", + &addr); + continue; #endif - } else { - u32 op1 = host1x_opcode_gather(g->words); + } else { + u32 op1 = host1x_opcode_gather(g->words); - host1x_cdma_push(cdma, op1, op2); + host1x_cdma_push(cdma, op1, op2); + } } } } @@ -126,7 +141,7 @@ static int channel_submit(struct host1x_job *job) struct host1x *host = dev_get_drvdata(ch->dev->parent); trace_host1x_channel_submit(dev_name(ch->dev), - job->num_gathers, job->num_relocs, + job->num_cmds, job->num_relocs, job->syncpt->id, job->syncpt_incrs); /* before error checks, return current max */ diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c index ceb48229d14b..35952fd5597e 100644 --- a/drivers/gpu/host1x/hw/debug_hw.c +++ b/drivers/gpu/host1x/hw/debug_hw.c @@ -208,10 +208,15 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma) job->first_get, job->timeout, job->num_slots, job->num_unpins); - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_gather *g; u32 *mapped; + if (job->cmds[i].is_wait) + continue; + + g = &job->cmds[i].gather; + if (job->gather_copy_mapped) mapped = (u32 *)job->gather_copy_mapped; else diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index acf322beb56c..ac4091596811 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -38,7 +38,7 @@ struct host1x_job *host1x_job_alloc(struct host1x_channel *ch, total = sizeof(struct host1x_job) + (u64)num_relocs * sizeof(struct host1x_reloc) + (u64)num_unpins * sizeof(struct host1x_job_unpin_data) + - (u64)num_cmdbufs * sizeof(struct host1x_job_gather) + + (u64)num_cmdbufs * sizeof(struct host1x_job_cmd) + (u64)num_unpins * sizeof(dma_addr_t) + (u64)num_unpins * sizeof(u32 *); if (total > ULONG_MAX) @@ -57,8 +57,8 @@ struct host1x_job *host1x_job_alloc(struct host1x_channel *ch, mem += num_relocs * sizeof(struct host1x_reloc); job->unpins = num_unpins ? mem : NULL; mem += num_unpins * sizeof(struct host1x_job_unpin_data); - job->gathers = num_cmdbufs ? mem : NULL; - mem += num_cmdbufs * sizeof(struct host1x_job_gather); + job->cmds = num_cmdbufs ? mem : NULL; + mem += num_cmdbufs * sizeof(struct host1x_job_cmd); job->addr_phys = num_unpins ? mem : NULL; job->reloc_addr_phys = job->addr_phys; @@ -101,22 +101,35 @@ EXPORT_SYMBOL(host1x_job_put); void host1x_job_add_gather(struct host1x_job *job, struct host1x_bo *bo, unsigned int words, unsigned int offset) { - struct host1x_job_gather *gather = &job->gathers[job->num_gathers]; + struct host1x_job_gather *gather = &job->cmds[job->num_cmds].gather; gather->words = words; gather->bo = bo; gather->offset = offset; - job->num_gathers++; + job->num_cmds++; } EXPORT_SYMBOL(host1x_job_add_gather); +void host1x_job_add_wait(struct host1x_job *job, u32 id, u32 thresh) +{ + struct host1x_job_cmd *cmd = &job->cmds[job->num_cmds]; + + cmd->is_wait = true; + cmd->wait.id = id; + cmd->wait.threshold = thresh; + + job->num_cmds++; +} +EXPORT_SYMBOL(host1x_job_add_wait); + static unsigned int pin_job(struct host1x *host, struct host1x_job *job) { struct host1x_client *client = job->client; struct device *dev = client->dev; struct host1x_job_gather *g; struct iommu_domain *domain; + struct sg_table *sgt; unsigned int i; int err; @@ -126,7 +139,6 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) for (i = 0; i < job->num_relocs; i++) { struct host1x_reloc *reloc = &job->relocs[i]; dma_addr_t phys_addr, *phys; - struct sg_table *sgt; reloc->target.bo = host1x_bo_get(reloc->target.bo); if (!reloc->target.bo) { @@ -204,17 +216,20 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) return 0; - for (i = 0; i < job->num_gathers; i++) { + for (i = 0; i < job->num_cmds; i++) { size_t gather_size = 0; struct scatterlist *sg; - struct sg_table *sgt; dma_addr_t phys_addr; unsigned long shift; struct iova *alloc; dma_addr_t *phys; unsigned int j; - g = &job->gathers[i]; + if (job->cmds[i].is_wait) + continue; + + g = &job->cmds[i].gather; + g->bo = host1x_bo_get(g->bo); if (!g->bo) { err = -EINVAL; @@ -550,8 +565,13 @@ static inline int copy_gathers(struct device *host, struct host1x_job *job, fw.num_relocs = job->num_relocs; fw.class = job->class; - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_gather *g; + + if (job->cmds[i].is_wait) + continue; + + g = &job->cmds[i].gather; size += g->words * sizeof(u32); } @@ -573,10 +593,14 @@ static inline int copy_gathers(struct device *host, struct host1x_job *job, job->gather_copy_size = size; - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_gather *g; void *gather; + if (job->cmds[i].is_wait) + continue; + g = &job->cmds[i].gather; + /* Copy the gather */ gather = host1x_bo_mmap(g->bo); memcpy(job->gather_copy_mapped + offset, gather + g->offset, @@ -619,8 +643,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) } /* patch gathers */ - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_gather *g; + + if (job->cmds[i].is_wait) + continue; + g = &job->cmds[i].gather; /* process each gather mem only once */ if (g->handled) @@ -630,10 +658,11 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) g->base = job->gather_addr_phys[i]; - for (j = i + 1; j < job->num_gathers; j++) { - if (job->gathers[j].bo == g->bo) { - job->gathers[j].handled = true; - job->gathers[j].base = g->base; + for (j = i + 1; j < job->num_cmds; j++) { + if (!job->cmds[j].is_wait && + job->cmds[j].gather.bo == g->bo) { + job->cmds[j].gather.handled = true; + job->cmds[j].gather.base = g->base; } } diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index 94bc2e4ae241..33adfaede842 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -18,6 +18,20 @@ struct host1x_job_gather { bool handled; }; +struct host1x_job_wait { + u32 id; + u32 threshold; +}; + +struct host1x_job_cmd { + bool is_wait; + + union { + struct host1x_job_gather gather; + struct host1x_job_wait wait; + }; +}; + struct host1x_job_unpin_data { struct host1x_bo *bo; struct sg_table *sgt; diff --git a/include/linux/host1x.h b/include/linux/host1x.h index d7070fd65833..59296d3346fe 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -215,8 +215,8 @@ struct host1x_job { struct host1x_client *client; /* Gathers and their memory */ - struct host1x_job_gather *gathers; - unsigned int num_gathers; + struct host1x_job_cmd *cmds; + unsigned int num_cmds; /* Array of handles to be pinned & unpinned */ struct host1x_reloc *relocs; @@ -275,6 +275,7 @@ struct host1x_job *host1x_job_alloc(struct host1x_channel *ch, u32 num_cmdbufs, u32 num_relocs); void host1x_job_add_gather(struct host1x_job *job, struct host1x_bo *bo, unsigned int words, unsigned int offset); +void host1x_job_add_wait(struct host1x_job *job, u32 id, u32 thresh); struct host1x_job *host1x_job_get(struct host1x_job *job); void host1x_job_put(struct host1x_job *job); int host1x_job_pin(struct host1x_job *job, struct device *dev); -- 2.28.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel