From: "Shamis, Pavel" <shamisp-1Heg1YXhbW8@public.gmane.org>
To: "Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Jack Morgenstein
	<jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
	"linux-rdma
	(linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org"
	<tziporet-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>,
	"dotanb-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org"
	<dotanb-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>,
	"Jeff Squyres
	(jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org)"
	<jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
	"Shumilin,
	Victor" <victor.shumilin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"Truschin,
	Vladimir"
	<vladimir.truschin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Devendar Bureddy
	<bureddy-wPOY3OvGL++pAIv7I8X2sze48wsgrGvP@public.gmane.org>,
	"mvapich-core-wPOY3OvGL++pAIv7I8X2sze48wsgrGvP@public.gmane.org"
	<mvapich-core-wPOY3OvGL++pAIv7I8X2sze48wsgrGvP@public.gmane.org>
Subject: Re: [RFC] XRC upstream merge reboot
Date: Tue, 26 Jul 2011 16:04:33 -0400
Message-ID: <26AE60A9-D055-4D40-A830-5AADDBA20ED8@ornl.gov>
In-Reply-To: <1828884A29C6694DAF28B7E6B8A82373136F6691-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>

Please see my notes below.

>>> I've tried to come up with a clean way to determine the lifetime of an xrc tgt qp,
>>> and I think the best approach is still:
>>> 
>>> 1. Allow the creating process to destroy it at any time, and
>>> 
>>> 2a. If not explicitly destroyed, the tgt qp is bound to the lifetime of the xrc domain
>>> or
>>> 2b. The creating process specifies during the creation of the tgt qp
>>> whether the qp should be destroyed on exit.
>>> 
>>> The MPIs associate an xrc domain with a job, so this should work.
>>> Everything else significantly complicates the usage model and implementation,
>>> both for verbs and the CM.  An application can maintain a reference count
>>> out of band with a persistent server and use explicit destruction
>>> if they want to share the xrcd across jobs.
>> I assume that you intend the persistent server to replace the reg_xrc_rcv_qp/
>> unreg_xrc_rcv_qp verbs.  Correct?
> 
> I'm suggesting that anyone who wants to share an xrcd across jobs can use out of band communication to maintain their own reference count, rather than pushing that feature into the mainline.  This requires a code change for apps that have coded to OFED and use this feature.


Actually, I think it is really not such a good idea to manage a reference counter via OOB communication.

A few years ago we had a long discussion among the OFED and MPI communities (HP MPI, Intel MPI, Open MPI, MVAPICH) about the XRC interface definition in OFED.
All of us agreed on the interface that we have today, and so far we have not heard any complaints.
I am not saying that it is an ideal interface, but I would like to clarify the motivation behind the idea of XRC and the XRC API that we have today.

The purpose of XRC is to decrease the amount of resources (QPs) required for user-level communication between multicore nodes. The primary customer of this protocol is HPC middleware, and MPI specifically (but not only). The original intent was to allow a single receive QP to be shared between multiple independent processes on the same node.
To manage this single resource between multiple processes, a couple of options were discussed:

1. OOB synchronization on MPI level.
Pros:
- It makes life easier for verbs developer :-)

Cons:
- All MPIs will have to implement the same OOB synchronization mechanism.
Potentially this adds a lot of overhead and synchronization code to each MPI implementation and, to be honest, we already have more than
enough MPI code that tries to work around OpenFabrics API limitations. It will also make MPI-2 dynamic process management much more complicated.

- By definition the XRC QP is owned by a group of processes that share the same XRC domain; consequently the verbs API
should provide a usable interface that allows group management of the XRC QP. The lack of such an API makes
XRC problematic to integrate into HPC communication libraries.

2. Reference counter on verbs level.   

Cons:
- Probably it makes life more complicated for the verbs developer.
(Even so, this is no longer relevant, since the code already exists and no new code development is required.)

Pros:
- This solution does not introduce any additional overhead for the MPI implementation.
We have an elegant increase/decrease call that manages the reference counter and allows efficient XRC QP management without any extra overhead.
It also does not require any special code for MPI-2 dynamic process management.
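
To make the lifetime rule of option 2 concrete, here is a minimal sketch in C of a reference-counted shared receive QP. It is only a toy, single-process model: the names toy_tgt_qp, toy_create, toy_reg and toy_unreg are hypothetical and are not the OFED verbs calls, and the assumption that the creating process holds the first reference is mine, for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a shared XRC receive (tgt) QP.
 * Hypothetical names; NOT the OFED verbs API. */
struct toy_tgt_qp {
    uint32_t qpn;       /* QP number that the processes agree on */
    unsigned refcount;  /* number of processes currently registered */
    bool     alive;     /* false once the last reference is dropped */
};

/* Assumption: the creating process holds the first reference. */
static void toy_create(struct toy_tgt_qp *qp, uint32_t qpn)
{
    assert(qp);
    qp->qpn = qpn;
    qp->refcount = 1;
    qp->alive = true;
}

/* "Increase" call: another process starts using the QP.
 * Returns 0 on success, -1 if the QP was already destroyed. */
static int toy_reg(struct toy_tgt_qp *qp)
{
    if (!qp->alive)
        return -1;
    qp->refcount++;
    return 0;
}

/* "Decrease" call: a process is done with the QP.
 * The QP is destroyed only when the last user drops its reference.
 * Returns 0 on success, -1 if the QP was already destroyed. */
static int toy_unreg(struct toy_tgt_qp *qp)
{
    if (!qp->alive)
        return -1;
    if (--qp->refcount == 0)
        qp->alive = false;
    return 0;
}
```

In this model, after toy_create() and one toy_reg(), the first toy_unreg() (say, the creator exiting) leaves the QP alive for the remaining process, and only the second toy_unreg() destroys it. No process has to know who else is still using the QP, which is the point of keeping the counter below the MPI layer.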


Obviously, we decided to go with option #2. As a result, XRC support was easily adopted by multiple MPI implementations.
And as I mentioned earlier, we haven't heard any complaints.

IMHO, I don't see a good reason to redefine the existing API.
I am afraid that such an API change will encourage MPI developers to abandon XRC support.

My $0.02

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory

