From: Harman Kalra <hkalra@marvell.com>
To: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	Thomas Monjalon <thomas@monjalon.net>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>, Ray Kinsella <mdr@ashroe.eu>
Subject: Re: [dpdk-dev] [EXT] Re: [PATCH v3 2/7] eal/interrupts: implement get set APIs
Date: Thu, 21 Oct 2021 09:16:01 +0000
Message-ID: <BN9PR18MB42048E137A3C13DD6F4FE991C5BF9@BN9PR18MB4204.namprd18.prod.outlook.com>
In-Reply-To: <20211020183051.657b05c1@sovereign>



> -----Original Message-----
> From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> Sent: Wednesday, October 20, 2021 9:01 PM
> To: Harman Kalra <hkalra@marvell.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>; Thomas
> Monjalon <thomas@monjalon.net>; david.marchand@redhat.com;
> dev@dpdk.org; Ray Kinsella <mdr@ashroe.eu>
> Subject: Re: [EXT] Re: [dpdk-dev] [PATCH v3 2/7] eal/interrupts: implement
> get set APIs
> 
> > >
> > > > +	/* Detect if DPDK malloc APIs are ready to be used. */
> > > > +	mem_allocator = rte_malloc_is_ready();
> > > > +	if (mem_allocator)
> > > > +		intr_handle = rte_zmalloc(NULL, sizeof(struct
> > > rte_intr_handle),
> > > > +					  0);
> > > > +	else
> > > > +		intr_handle = calloc(1, sizeof(struct rte_intr_handle));
> > >
> > > This is problematic way to do this.
> > > The reason to use rte_malloc vs malloc should be determined by usage.
> > >
> > > If the pointer will be shared between primary/secondary process then
> > > it has to be in hugepages (ie rte_malloc). If it is not shared then
> > > then use regular malloc.
> > >
> > > But what you have done is created a method which will be a latent
> > > bug for anyone using primary/secondary process.
> > >
> > > Either:
> > >     intr_handle is not allowed to be used in secondary.
> > >       Then always use malloc().
> > > Or.
> > >     intr_handle can be used by both primary and secondary.
> > >     Then always use rte_malloc().
> > >     Any code path that allocates intr_handle before pool is
> > >     ready is broken.
> >
> > Hi Stephan,
> >
> > Till V2, I implemented this API in a way where user of the API can
> > choose If he wants intr handle to be allocated using malloc or
> > rte_malloc by passing a flag arg to the rte_intr_instanc_alloc API.
> > User of the API will best know if the intr handle is to be shared with
> secondary or not.
> >
> > But after some discussions and suggestions from the community we
> > decided to drop that flag argument and auto detect on whether
> > rte_malloc APIs are ready to be used and thereafter make all further
> allocations via rte_malloc.
> > Currently alarm subsystem (or any driver doing allocation in
> > constructor) gets interrupt instance allocated using glibc malloc that
> > too because rte_malloc* is not ready by rte_eal_alarm_init(), while
> > all further consumers gets instance allocated via rte_malloc.
> 
> Just as a comment, bus scanning is the real issue, not the alarms.
> Alarms could be initialized after the memory management (but it's irrelevant
> because their handle is not accessed from the outside).
> However, MM needs to know bus IOVA requirements to initialize, which is
> usually determined by at least bus device requirements.
> 
> >  I think this should not cause any issue in primary/secondary model as
> > all interrupt instance pointer will be shared.
> 
> What do you mean? Aren't we discussing the issue that those allocated early
> are not shared?
> 
> > Infact to avoid any surprises of primary/secondary not working we
> > thought of making all allocations via rte_malloc.
> 
> I don't see why anyone would not make them shared.
> In order to only use rte_malloc(), we need:
> 1. In bus drivers, move handle allocation from scan to probe stage.
> 2. In EAL, move alarm initialization to after the MM.
> It all can be done later with v3 design---but there are out-of-tree drivers.
> We need to force them to make step 1 at some point.
> I see two options:
> a) Right now have an external API that only works with rte_malloc()
>    and internal API with autodetection. Fix DPDK and drop internal API.
> b) Have external API with autodetection. Fix DPDK.
>    At the next ABI breakage drop autodetection and libc-malloc.
> 
> > David, Thomas, Dmitry, please add if I missed anything.
> >
> > Can we please conclude on this series APIs as API freeze deadline (rc1) is
> very near.
> 
> I support v3 design with no options and autodetection, because that's the
> interface we want in the end.
> Implementation can be improved later.

Hi All,

I came across two issues introduced by the auto-detection mechanism.

1. Deadlock in the primary/secondary model. The primary application is
   started and makes many allocations via rte_malloc*.

   Secondary side:
   a. The secondary starts. In its rte_eal_init() it makes some
      allocations via rte_*, and one of them triggers a heap expand
      request because the current memseg is exhausted
      (malloc_heap_alloc_on_heap_id() -> alloc_more_mem_on_socket() ->
      try_expand_heap()).
   b. A heap expand request is sent to the primary. Note that the
      secondary holds the heap spinlock while making the request
      (malloc_heap_alloc_on_heap_id() ->
      rte_spinlock_lock(&(heap->lock));).

   Primary side:
   a. The primary receives the request, installs a new hugepage and sets
      up the heap (handle_alloc_request()).
   b. To inform all secondaries about the new memseg, the primary sends
      a sync notice, for which it sets up an alarm
      (rte_mp_request_async() -> mp_request_async()).
   c. Inside the alarm setup API, an interrupt callback is registered.
   d. Inside rte_intr_callback_register(), a new interrupt instance
      allocation is requested for "src->intr_handle".
   e. Since memory management is detected as up,
      rte_intr_instance_alloc() calls rte_zmalloc(). Further down, in
      malloc_heap_alloc_on_heap_id(), the primary deadlocks while taking
      the spinlock, because it is already held by the secondary.
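
The lock ordering in issue 1 can be sketched as follows. This is a
minimal single-process model, not DPDK code: heap_lock stands in for the
shared heap->lock, the function names are hypothetical, and a try-lock
replaces the real spinning so that the sketch terminates instead of
hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the shared heap->lock; NOT the real rte_spinlock_t. */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;

/* Single attempt to take the lock; the real code would spin instead. */
static bool heap_trylock(void)
{
	return !atomic_flag_test_and_set(&heap_lock);
}

static void heap_unlock(void)
{
	atomic_flag_clear(&heap_lock);
}

/*
 * Primary side (hypothetical name): models the chain
 * alarm setup -> callback register -> interrupt instance alloc ->
 * rte_zmalloc(), which needs the heap lock.
 * Returns false where the real code would spin forever.
 */
static bool primary_handle_expand_request(void)
{
	if (!heap_trylock())
		return false; /* lock is held by the requester: deadlock */
	heap_unlock();
	return true;
}

/*
 * Secondary side (hypothetical name): takes the heap lock, then asks the
 * primary to expand the heap while still holding it.
 */
static bool secondary_alloc_needing_expand(void)
{
	bool ok;

	if (!heap_trylock())
		return false;
	ok = primary_handle_expand_request();
	heap_unlock();
	return ok;
}
```

Calling secondary_alloc_needing_expand() reports failure because the
primary-side path needs a lock that the requester still holds, while
calling primary_handle_expand_request() on its own succeeds.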


2. "eal_flags_file_prefix_autotest" is failing, because the processes
   spawned by this test are expected to clean up their hugepage files
   from the respective directories (e.g. /dev/hugepage).
   a. Inside eal_cleanup, rte_free() -> malloc_heap_free() adds the
      freed element to the free list and checks whether adjacent
      elements can be joined into one bigger free chunk
      (malloc_elem_free()).
   b. If this free chunk is big enough to cover a whole hugepage, the
      respective hugepage can be uninstalled once it is certain that no
      allocation from this hugepage exists (malloc_heap_free() ->
      malloc_heap_free_pages() -> eal_memalloc_free_seg()).

   But because the interrupt instances allocated for PCI intr handles
   (used for VFIO) and other driver-specific interrupt handles are not
   cleaned up in rte_eal_cleanup(), these hugepage files are not removed
   and the test fails.
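
The effect in issue 2 can be illustrated with a toy model. It is a
deliberate simplification and not the DPDK allocator: struct toy_page
and all toy_* names are hypothetical, and a per-page counter of live
allocations replaces the real free-element coalescing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one hugepage-backed segment; not a DPDK structure. */
struct toy_page {
	size_t live_allocs; /* allocations still carved from this page */
	bool mapped;        /* true while the backing file still exists */
};

static void toy_alloc(struct toy_page *p)
{
	p->live_allocs++;
}

/* Free one allocation; unmap the page only once nothing else uses it. */
static void toy_free(struct toy_page *p)
{
	assert(p->live_allocs > 0);
	p->live_allocs--;
	if (p->live_allocs == 0)
		p->mapped = false;
}

/* Returns true if the page is still mapped after cleanup. */
static bool toy_cleanup_leaves_page(bool free_intr_handle)
{
	struct toy_page page = { .live_allocs = 0, .mapped = true };

	toy_alloc(&page); /* some EAL object, released during cleanup */
	toy_alloc(&page); /* interrupt handle taken from the same page */

	toy_free(&page);  /* cleanup releases the EAL object */
	if (free_intr_handle)
		toy_free(&page); /* the release that cleanup skips today */

	return page.mapped;
}
```

With free_intr_handle set to false the page stays mapped, which
corresponds to the leftover hugepage file seen by the test; with true
the page is released.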

There could be more such issues, so I think we should fix DPDK first:
1. Memory management should be made independent and should be the first
   thing to come up in rte_eal_init().
2. rte_eal_cleanup() should be the exact opposite of rte_eal_init():
   just as we have bus_probe, we should have bus_remove to clean up all
   the memory allocations.

Regarding this IRQ series, I would like to fall back to our original
design, i.e. rte_intr_instance_alloc() should take an argument that says
whether its memory should be allocated using glibc malloc or
rte_malloc*. The choice (malloc or rte_malloc) can be based on whether
the interrupt handle is shared in the existing code. E.g.:
a. For alarms, intr_handle was a global entry not confined to any
   structure, so it can be allocated with normal malloc.
b. A PCI device had a static entry for intr_handle inside "struct
   rte_pci_device", and memory for struct rte_pci_device comes from
   normal malloc, so its intr_handle can also be malloc'ed.
c. Some drivers keep intr_handle inside their priv structure, and this
   priv structure is allocated via rte_malloc, so the intr_handle can
   also be rte_malloc'ed.
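
A caller-selected allocator along these lines could look roughly like
the sketch below. It only illustrates the idea and is not the proposed
patch: struct toy_intr_handle, the bool argument and all toy_* names are
hypothetical, and toy_shared_zalloc() merely stands in for rte_zmalloc()
so that the example builds without DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque interrupt handle. */
struct toy_intr_handle {
	int fd;
	bool shared; /* remembers which allocator produced the handle */
};

/* Stand-in for rte_zmalloc(); a real build would use hugepage memory. */
static void *toy_shared_zalloc(size_t size)
{
	return calloc(1, size);
}

/* The caller states whether the handle is shared with secondaries. */
static struct toy_intr_handle *toy_intr_instance_alloc(bool shared)
{
	struct toy_intr_handle *h;

	if (shared)
		h = toy_shared_zalloc(sizeof(*h));
	else
		h = calloc(1, sizeof(*h));
	if (h == NULL)
		return NULL;

	h->shared = shared;
	return h;
}

/* Release with the allocator that matches the recorded flag. */
static void toy_intr_instance_free(struct toy_intr_handle *h)
{
	if (h == NULL)
		return;
	/* Real code would call rte_free() when h->shared is set. */
	free(h);
}
```

Under this scheme the alarm and PCI cases (a, b) would pass false and a
driver keeping the handle in rte_malloc'ed private data (c) would pass
true. Recording the choice in the handle lets the free path pick the
matching release function.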

Later, once DPDK is fixed up, this argument can be removed and all
allocations can go through the rte_malloc family without any auto
detection.


David, Dmitry, Thomas, Stephen, please share your views.

Thanks
Harman

