From: Inki Dae
Subject: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework
Date: Tue, 25 Jun 2013 18:09:32 +0900
To: Jerome Glisse
Cc: linux-fbdev, Russell King - ARM Linux, DRI mailing list, Kyungmin Park,
 "myungjoo.ham", YoungJun Cho, linux-media@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
List-Id: dri-devel@lists.freedesktop.org

2013/6/22 Jerome Glisse <j.glisse@gmail.com>:
> On Fri, Jun 21, 2013 at 12:55 PM, Inki Dae <daeinki@gmail.com> wrote:
>> 2013/6/21 Lucas Stach <l.stach@pengutronix.de>:
>>> Hi Inki,
>>>
>>> please refrain from sending HTML mails, it makes proper quoting without
>>> messing up the layout everywhere pretty hard.
>>>
>>
>> Sorry about that. I should have used text mode.
>>
>>> On Friday, 21.06.2013 at 20:01 +0900, Inki Dae wrote:
>>> [...]
>>>
>>>>         Yeah, you'll need some knowledge and understanding about the
>>>>         API you are working with to get things right. But I think
>>>>         it's not an unreasonable thing to expect the programmer
>>>>         working directly with kernel interfaces to read up on how
>>>>         things work.
>>>>
>>>>         Second thing: I'd rather have *one* consistent API for every
>>>>         subsystem, even if they differ from each other, than have to
>>>>         implement this syncpoint thing in every subsystem. Remember:
>>>>         a single execbuf in DRM might reference both GEM objects
>>>>         backed by dma-buf as well as native SHM or CMA backed
>>>>         objects. The dma-buf-mgr proposal already allows you to
>>>>         handle dma-bufs much the same way during validation as
>>>>         native GEM objects.
>>>>
>>>> Actually, at first I implemented a fence helper framework based on
>>>> reservation and dma fence to provide an easy-to-use interface for
>>>> device drivers. However, that was the wrong implementation: I had not
>>>> only customized the dma fence but also not considered the deadlock
>>>> issue.
>>>> After that, I reimplemented it as dmabuf sync to solve the deadlock
>>>> issue, and at that time I realized that we first need to concentrate
>>>> on the most basic things: the fact that CPU and CPU, CPU and DMA, or
>>>> DMA and DMA can access the same buffer; that simple is best; and that
>>>> we need not only kernel-side but also user-side interfaces. After
>>>> that, I collected what is common to all subsystems, and I devised
>>>> this dmabuf sync framework for it. I'm not really a specialist in the
>>>> desktop world, so a question: isn't execbuf used only for the GPU?
>>>> The GPU has dedicated video memory (VRAM), so it needs a migration
>>>> mechanism between system memory and the dedicated video memory, and
>>>> it also has to consider ordering issues while buffers are migrated.
>>>>
>>>
>>> Yeah, execbuf is pretty GPU specific, but I don't see how this matters
>>> for this discussion. Also I don't see a big difference between embedded
>>> and desktop GPUs. Buffer migration is more of a detail here. Both take
>>> command streams that potentially reference other buffers, which might
>>> be native GEM or dma-buf backed objects. Both have to make sure the
>>> buffers are in the right domain (caches cleaned and address mappings
>>> set up) and are available for the desired operation, meaning you have
>>> to sync with other DMA engines and maybe also with the CPU.
>>
>> Yeah, right. Then, in the case of a desktop GPU, doesn't it need to do
>> something additional when buffers are migrated from the system memory
>> domain to the video memory domain, or from video memory back to system
>> memory? I guess the members below do something similar, and all other
>> DMA devices would not need them:
>>         struct fence {
>>                 ...
>>                 unsigned int context, seqno;
>>                 ...
>>         };
>>
>> And,
>>         struct seqno_fence {
>>                 ...
>>                 uint32_t seqno_ofs;
>>                 ...
>>         };
>>
>>>
>>> The only case where sync isn't clearly defined right now by the current
>>> API entrypoints is when you access memory through the dma-buf fallback
>>> mmap support, which might happen with some software processing element
>>> in a video pipeline or something. I agree that we will need a userspace
>>> interface here, but I think this shouldn't be yet another sync object,
>>> but rather more a prepare/fini_cpu_access ioctl on the dma-buf which
>>> hooks into the existing dma-fence and reservation stuff.
>>
>> I don't think we need additional ioctl commands for that. I am thinking
>> of reusing existing resources as much as possible. My idea is also
>> similar to yours in using the reservation stuff, because my approach
>> should also use the dma-buf resource. However, my idea is that a user
>> process that wants buffer synchronization with another one sees a sync
>> object as a file descriptor, just like dma-buf does. The below is a
>> simple outline of the idea:
>>
>> ioctl(dmabuf_fd, DMA_BUF_IOC_OPEN_SYNC, &sync);
>>
>> flock(sync->fd, LOCK_SH); <- LOCK_SH means a shared lock.
>> CPU access for read
>> flock(sync->fd, LOCK_UN);
>>
>> Or
>>
>> flock(sync->fd, LOCK_EX); <- LOCK_EX means an exclusive lock
>> CPU access for write
>> flock(sync->fd, LOCK_UN);
>>
>> close(sync->fd);
>>
>> As you know, that's similar to the dma-buf export feature.
>>
>> In addition, a simpler idea:
>> flock(dmabuf_fd, LOCK_SH/EX);
>> CPU access for read/write
>> flock(dmabuf_fd, LOCK_UN);
>>
>> However, I'm not sure yet that the above examples would work well and
>> be free of problems: actually, I don't fully understand the flock
>> mechanism, so I am still looking into it.
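
Just to make the idea above a bit more concrete, here is a rough
userspace sketch of the sync-fd variant. Please note that
DMA_BUF_IOC_OPEN_SYNC, the argument struct and the whole flow are only
assumptions for illustration; none of this user interface exists yet:

    #include <stdint.h>
    #include <sys/file.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>
    #include <unistd.h>

    /* Hypothetical ioctl and argument, for illustration only. */
    struct dma_buf_sync_arg {
            int32_t fd;     /* sync object exposed as a file descriptor */
    };

    #define DMA_BUF_IOC_OPEN_SYNC _IOWR('b', 0x20, struct dma_buf_sync_arg)

    static int cpu_write_locked(int dmabuf_fd)
    {
            struct dma_buf_sync_arg sync;

            /* Ask the exporter for a sync object tied to this dma-buf. */
            if (ioctl(dmabuf_fd, DMA_BUF_IOC_OPEN_SYNC, &sync) < 0)
                    return -1;

            flock(sync.fd, LOCK_EX);        /* exclusive lock for CPU write */

            /* ... CPU writes to the mmapped buffer here ... */

            flock(sync.fd, LOCK_UN);
            close(sync.fd);

            return 0;
    }
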
>>
>>>
>>>>
>>>>         And to get back to my original point: if you have more than
>>>>         one task operating together on a buffer you absolutely need
>>>>         some kind of real IPC to sync them up and do something
>>>>         useful. Both your syncpoints and the proposed dma-fences only
>>>>         protect the buffer accesses to make sure different tasks
>>>>         don't stomp on each other. There is nothing in there to make
>>>>         sure that the output of your pipeline is valid. You have to
>>>>         take care of that yourself in userspace. I'll reuse your
>>>>         example to make clear what I mean:
>>>>
>>>>         Task A                        Task B
>>>>         ------                        ------
>>>>         dma_buf_sync_lock(buf1)
>>>>         CPU write buf1
>>>>         dma_buf_sync_unlock(buf1)
>>>>           ---------schedule Task A again-------
>>>>         dma_buf_sync_lock(buf1)
>>>>         CPU write buf1
>>>>         dma_buf_sync_unlock(buf1)
>>>>            ---------schedule Task B---------
>>>>                                       qbuf(buf1)
>>>>
>>>>         dma_buf_sync_lock(buf1)
>>>>                                       ....
>>>>
>>>>         This is what can happen if you don't take care of proper
>>>>         syncing. Task A writes something to the buffer in the
>>>>         expectation that Task B will take care of it, but before
>>>>         Task B even gets scheduled Task A overwrites the buffer
>>>>         again. Not what you wanted, is it?
>>>>
>>>> That is exactly the wrong example. I had already mentioned that: "In
>>>> case the data flow goes from A to B, it needs some kind of IPC between
>>>> the two tasks every time." So again, your example would have no
>>>> problem in the case that *two tasks share the same buffer, both tasks
>>>> access the buffer (buf1) for write, and the data of the buffer (buf1)
>>>> does not need to be shared*. They just need to use the buffer as
>>>> *storage*. So all they want in this case is to avoid stomping on the
>>>> buffer.
>>>>
>>> Sorry, but I don't see the point. If no one is interested in the data
>>> of the buffer, why are you sharing it in the first place?
>>>
>>
>> It is just used as storage, i.e., Task A fills the buffer with "AAAAAA"
>> using the CPU, and Task B fills the buffer with "BBBBBB" using DMA. They
>> don't share the data of the buffer, but they share the *memory region*
>> of the buffer. That would be very useful for embedded systems with very
>> little system memory.
>
> Just so I understand: you want to share the backing memory, but you
> don't want to share the content, i.e. you want to do memory management
> in userspace. This sounds wrong on so many levels (not even considering
> the security implications).
>
> If Task A needs memory and can then release it for Task B's usage,

Not true. Task A can never release the memory, because all Task A can do
is unreference the dma-buf object of the sync object. And please note
that the user interfaces haven't been implemented yet; we just have a
plan for them, as I already mentioned.

> that should be the role of kernel memory management, which of course
> needs synchronization between A and B. But in no case should this be
> done using dma-buf. dma-buf is for sharing content between different
> devices, not for sharing resources.
>

Hmm, is that true? Are you sure? Then what do you think about
reservation? As far as I know, reservation is also tied to dma-buf for
the same reason: actually, we use reservation in order to use dma-buf.
As you may know, a reservation object is allocated and initialized when
a buffer object is exported to a dma-buf.

Thanks,
Inki Dae

>
> Also don't overcomplicate the VRAM case; just consider desktop GPUs as
> using system memory directly. They can do it and they do it. Migration
> to VRAM is orthogonal to all this; it's an optimization, so to speak.
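
Coming back to the storage-only case above, below is roughly the
userspace pattern I mean, using the simpler flock-on-the-dmabuf-fd
variant. Again, this is only an illustrative sketch; whether flock() can
really be made to work on a dma-buf fd is exactly what I still have to
check:

    #include <string.h>
    #include <sys/file.h>
    #include <sys/mman.h>

    /*
     * Two independent users treat the same dma-buf purely as scratch
     * storage.  The exclusive lock only keeps them from stomping on each
     * other; no data is handed over between them, so no extra IPC is
     * needed for either task's result to be correct.
     */
    static void task_a_cpu_fill(int dmabuf_fd, size_t size)
    {
            void *vaddr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, dmabuf_fd, 0);

            if (vaddr == MAP_FAILED)
                    return;

            flock(dmabuf_fd, LOCK_EX);      /* exclusive access for the CPU */
            memset(vaddr, 'A', size);       /* Task A's private use of the memory */
            flock(dmabuf_fd, LOCK_UN);

            munmap(vaddr, size);
    }

    /*
     * Task B would take the same exclusive lock around the DMA transfer
     * that fills the buffer with 'B's.
     */
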
>
> Cheers,
> Jerome