From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============3820914216375391804==" MIME-Version: 1.0 From: Walker, Benjamin Subject: Re: [SPDK] spdk_malloc vs. malloc Date: Mon, 07 May 2018 17:18:48 +0000 Message-ID: <1525713527.46960.38.camel@intel.com> In-Reply-To: F776E011-11D4-4AAE-BB1A-B04CD4FE88C5@intel.com List-ID: To: spdk@lists.01.org --===============3820914216375391804== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable SPDK needs four different types of memory allocation operations today: 1) Standard virtual memory 2) Process-shared memory 3) DMA-safe memory ("pinned") 4) DMA-safe and process-shared memory For #1, SPDK is using POSIX to do that allocation and free. That one is a v= ery separate discussion than the other 3, so I'll leave that alone. For numbers 2-4, SPDK could either have 3 separate functions or one function with flags to do the allocations. I don't think the allocation side choice = makes much of a difference. On the free side, however, we need to decide if we're going to do one generic spdk_free(void *buf), an spdk_free with flags, or 3 separate free functions corresponding to 3 separate allocation functions. W= e're currently moving toward spdk_malloc with flags and spdk_free without, but t= hat is up for debate. There are at least 3 known implementations of the SPDK env layer today, so = I'll attempt to outline what each of those needs so we have some concrete exampl= es. The primary implementation of the env layer, which ships with SPDK, is the = one based on DPDK. DPDK internally always allocates all memory as both DMA-safe= and shared, so the distinction there isn't necessary. Essentially, allocation t= ypes 2-4 are all degenerate and map to a single DPDK call. On the free side, DPDK does not need to know what type of allocation it was (again, because they'r= e all degenerate). One known alternative implementation (Oracle) has three separate operations corresponding to memory allocation types 2, 3, and 4. However, my understan= ding is that they also have internal tracking structures such that they can figu= re out which type of allocation an address came from originally, so they are a= ble to implement spdk_free(void *buf) by doing internal look-ups and don't need= the user to pass them flags. Another known alternative implementation (NetApp) has two memory allocation operations internally. One corresponds to both types 1 and 2, and the other corresponds to types 3 and 4. Further, on the free side there are different= free functions for types 1 and 2 compared to types 3 and 4. The current abstract= ion in SPDK has a single call for types 2-4 with no flags, so it's valid critic= ism that this API is difficult for NetApp to implement. We need to figure out a= way to address this. The easiest solution today, in my opinion, is for the NetApp implementation= to always allocate DMA-safe memory in response to calls to spdk_malloc, even if only the shared flag was specified (case #2). Note that SPDK doesn't call spdk_malloc for case #1 already, so that's a separate code path. Then alway= s do the DMA-safe memory free in response to spdk_free. That will definitely mak= e it work correctly today, and then the NetApp implementation doesn't need the f= lags on free either. I'd love to have as many people weigh in here as possible. The more diverse opinions we get, the better the ultimate design will be. We absolutely have= to make sure this memory abstraction is solid and meets everyone's needs, while still being reasonably simple to use. Thanks, Ben On Fri, 2018-05-04 at 17:17 -0700, Harris, James R wrote: > You=E2=80=99re right Ed. We=E2=80=99ll need to hold off on adding usages= of this new API > until we get this sorted out. = > = > What if we just add an spdk_dma_shared_malloc() and got rid of the flags > completely? And then just keep all of the existing malloc/free calls whe= re > DMA/shared memory is not required. This should alleviate the impact to y= our > code base and I *think* would work for Oracle =E2=80=93 but would need Za= hra to > confirm. This adds an explicit assumption of course that we won=E2=80=99= t have a need > for some other kind of malloc =E2=80=98flag=E2=80=99 in the future. > = > Sorry John =E2=80=93 I know you wanted to end this thread but I=E2=80=99v= e decided to continue > it. ( > = > -Jim > = > = > = > On 5/4/18, 3:37 PM, "Rodriguez, Edwin" wrote: > = > The problem I have with the new code is that it makes distinguishing > between dma and non-dma free difficult. In the kernel, dma memory comes = from > a different pool than regular allocations, i.e. contigmalloc/contigfree in > FreeBsd. How does spdk_free distinguish between a block allocated with > SPDK_MALLOC_DMA and a regular block? > = > Ed R > = > On 5/4/18, 6:30 PM, "SPDK on behalf of Meneghini, John" s.01.org on behalf of John.Meneghini(a)netapp.com> wrote: > = > Thanks, Ben, et al, for your reply. > = > > you may want to avoid calling them at run-time because they can > fail, but SPDK generally already does > > that for performance reasons anyway. Can you outline what you m= ean? > = > What is run-time? We are using the SPDK libraries inside of a k= ernel > kmod. So, all time is run-time. > = > I think we can end this thread. I tried to win this argument once > before. I guess calloc() and malloc() are here to stay... so > it doesn't matter what we do with spdk_malloc(). > = > :-) > = > If anyone wants to ask me about my views on this topic, I'll be a= t the > SPDK Developer's Summit in two weeks. > = > See you then. > = > /John = > = > On 5/3/18, 7:24 PM, "Walker, Benjamin" > wrote: > = > On Thu, 2018-05-03 at 20:14 +0000, Meneghini, John wrote: > > The issue is: spdk_malloc and spdk_calloc are logical APIs = to > abstract the > > POSIX malloc and calloc calls. We chose to NOT use those A= PI > names originally > > and went with the spdk_dma_{} api names. My concern is that > taking those API > > names now will eliminate the possibility of any other use i= n the > future. > > = > > I am not asking to "fully abstract POSIX". I am simply say= ing > that the > > malloc(), calloc() and free() POSIX APIs suffer from some > problems. This is > > why DPDK implemented rte_malloc() and rte_calloc(). As we= move > forward to > > build production quality products out of SPDK we will want = to > abstract malloc, > > calloc and free; especially in the libraries. I don't care > about what the > > applications do. I want a better memory management APIs in= the > libraries. > = > DPDK implemented rte_malloc and rte_calloc to allocate DMA-sa= fe > and shared > memory. They still call POSIX malloc and calloc for all other > types of > allocations. This is the same thing that SPDK has done. > = > I've never heard of malloc, calloc, and free having problems = to > the extent that > you'd want to reimplement them in your application some other= way. > Certainly, > you may want to avoid calling them at run-time because they c= an > fail, but SPDK > generally already does that for performance reasons anyway. C= an > you outline what > you mean? > = > > = > > I'd be happy to simply map spdk_malloc/calloc() to > rte_malloc/calloc() and I > > don't see the point introducing an spdk_malloc() that only > extends > > spdk_dma_malloc(). I'd rather absorb an API change to > spdk_dma_malloc(). We > > change many APIs in SPDK from release to release. I don't s= ee > why this API > > can't be changed - with agreement from the SPDK community. > > = > > /* > > * Allocate memory on default heap. > > */ > > void * > > rte_malloc(const char *type, size_t size, unsigned align) > > { > > return rte_malloc_socket(type, size, align, > SOCKET_ID_ANY); > > } > > = > > /* = > > * Allocate zero'd memory on default heap. > > */ > > void * > > rte_calloc(const char *type, size_t num, size_t size, unsig= ned > align) > > { > > return rte_zmalloc(type, num * size, align); > > } > > = > > /John > > = > > On 5/3/18, 3:21 PM, "Walker, Benjamin" m> wrote: > > = > > On Thu, 2018-05-03 at 18:23 +0000, Meneghini, John wrot= e: > > > > If the user passes flags =3D=3D 0 to the new spdk_= malloc() > call, this > > could be > > > implemented by malloc() or equivalent behind the scen= es, > > > = > > > So, does this mean you=E2=80=99re willing to change a= ll calls to > malloc(size) > > with > > > spdk_malloc(size, 0, NULL, SPDK_ENV_SOCKET_ID_ANY, > SPDK_MALLOC_SHARE)? > > > = > > > Is that the plan? > > > = > > = > > Hi John, > > = > > SPDK explicitly depends on POSIX throughout the code ba= se. > We do have an > > environment abstraction layer designed to make porting = SPDK > to various > > environments (i.e. away from DPDK) easier, but this only > abstracts > > operations > > that are not available through standard POSIX calls. We= 're > concerned that > > fully > > abstracting POSIX would introduce a significant barrier= to > entry for new > > contributors to the project, while only enabling one > additional use case > > that > > we're aware of. I understand that this one use case hap= pens > to be yours, > > and so > > this is a challenge for you. > > = > > In your code today, I assume you carry patches to repla= ce > the POSIX calls > > with > > your available alternatives. We've attempted to make > carrying these > > patches > > reasonably easy by moving all of the POSIX includes int= o a > single header > > (include/stdinc.h). Since you're already carrying those > patches, can you > > simply > > choose a different name for your replacement for > malloc/calloc? > > = > > Thanks, > > Ben > > = > > = > = > = > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > = > = > = >=20 --===============3820914216375391804==--