From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Uselton
Date: Sun, 16 May 2010 20:24:55 -0700
Subject: [Lustre-devel] Lustre RPC visualization
In-Reply-To: <4BEFBB07.4030403@tu-dresden.de>
References: <000c01cae6ee$1d4693d0$57d3bb70$%barton@oracle.com>
 <4BD90FB9.5030702@tu-dresden.de>
 <4BD9CF75.8030204@oracle.com>
 <4BDE8C3C.2050505@tu-dresden.de>
 <699F57EF-52E6-41D1-A04B-3C39D469D133@oracle.com>
 <4BDF1199.2030007@tu-dresden.de>
 <4BDF1CC7.5020502@oracle.com>
 <4BDF24BC.9050701@tu-dresden.de>
 <4BDF2999.2000207@oracle.com>
 <4BEFBB07.4030403@tu-dresden.de>
Message-ID:
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: lustre-devel@lists.lustre.org

I think this work is very interesting. Will anyone be at CUG 2010 next week
to discuss?
Cheers,
Andrew

2010/5/16 Michael Kluge

> Hi WangDi,
>
> the first version works. Screenshot is attached. I have implemented a
> couple of counters: RPCs in flight and RPCs completed in total on the
> client; RPCs enqueued, RPCs in processing and RPCs completed in total on
> the server. All these counters can be broken down by the type of RPC (op
> code). The picture does not yet have the lines that show each single RPC, I
> still have to add counters such as "avg. time to complete an RPC over the
> last second", and there are some more TODOs, such as timer synchronization.
> (In the screenshot the first and the last counters show total values while
> the one in the middle shows a rate.)
>
> What I would like to have is a complete set of traces from a small cluster
> (<100 nodes) including the servers. Would that be possible?
>
> Will either of you be in Hamburg from May 31 to June 3 for ISC'2010? I'll
> be there and would like to talk about what would be useful for the next
> steps.
>
> Regards, Michael
>
> On 03.05.2010 21:52, di.wang wrote:
>
>> Michael Kluge wrote:
>>
>>>>> One more question: RPC 1334380768266400 (in the log WangDi sent me)
>>>>> has on the client side only a "Sending RPC" message, thus missing the
>>>>> "Completed RPC". The server has all three (received, start work, done
>>>>> work). Has this RPC vanished on the way back to the client? There is
>>>>> no further indication of what happened. The last timestamp in the
>>>>> client log is:
>>>>> 1272565368.228628
>>>>> and the server says it finished processing the request at:
>>>>> 1272565281.379471
>>>>> So the client log was recorded long enough to contain the
>>>>> "Completed RPC" message for this RPC if it ever arrived ...
>>>>>
>>>> Logically, yes. But in some cases debug log records are dropped for
>>>> various reasons (this actually happens fairly often), so you probably
>>>> need to maintain an average time from the server's "Handled RPC" to the
>>>> client's "Completed RPC" and then estimate the client's "Completed RPC"
>>>> time in this case.
>>>>
>>>
>>> Oh my gosh ;) I don't want to start speculating about the helpfulness
>>> of incomplete debug logs. Anyway, what can get lost? Any kind of
>>> message on the servers and clients? I'd like to know which
>>> cases have to be handled while I try to track individual RPCs on
>>> their way.
>>>
>> Any record can get lost here. Unfortunately, no message indicates that
>> records are missing. :(
>> (Usually I check the timestamps in the log, i.e. look for no records for
>> a "long" time, for example several seconds, but this is not an accurate
>> method.)
>>
>> I guess you can just ignore these incomplete records as a first step?
>> Let's see how the incomplete logs affect the profiling result, and then
>> we can decide how to deal with this.
>>
>> Thanks
>> Wangdi
>>
>>>
>>> Regards, Michael
>>> _______________________________________________
>>> Lustre-devel mailing list
>>> Lustre-devel at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-devel
>>>
>
> --
> Michael Kluge, M.Sc.
>
> Technische Universität Dresden
> Center for Information Services and
> High Performance Computing (ZIH)
> D-01062 Dresden
> Germany
>
> Contact:
> Willersbau, Room WIL A 208
> Phone: (+49) 351 463-34217
> Fax: (+49) 351 463-37773
> e-mail: michael.kluge at tu-dresden.de
> WWW: http://www.tu-dresden.de/zih
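
The two workarounds WangDi describes in the quoted thread (estimating a lost client "Completed RPC" time from the average delay after the server's "Handled RPC", and looking for long silent stretches in a log) could be sketched roughly as below. This is only an illustration, not code from the visualization tool: the function names, the dictionaries keyed by RPC id, and the 2-second gap threshold are all assumptions, and it further assumes client and server timestamps are already on a common clock, which the thread lists as an open TODO.

```python
from statistics import mean


def estimate_missing_completions(server_handled, client_completed):
    """Estimate client "Completed RPC" times that are missing from the log.

    Both arguments are hypothetical dicts mapping an RPC id to a timestamp
    in seconds. Assumes client and server clocks are already synchronized.
    Returns (estimates, average_delay); average_delay is None when no RPC
    has both records.
    """
    # Delay from server "Handled RPC" to client "Completed RPC" for every
    # RPC where both records survived.
    delays = [
        client_completed[rpc_id] - handled
        for rpc_id, handled in server_handled.items()
        if rpc_id in client_completed
    ]
    if not delays:
        return {}, None
    average_delay = mean(delays)
    # For RPCs whose client record was lost, guess the completion time.
    estimates = {
        rpc_id: handled + average_delay
        for rpc_id, handled in server_handled.items()
        if rpc_id not in client_completed
    }
    return estimates, average_delay


def find_gaps(timestamps, threshold=2.0):
    """Return (start, end) pairs where the log is silent for > threshold s.

    A gap only hints that records may have been dropped; as noted in the
    thread, this check is not accurate.
    """
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


if __name__ == "__main__":
    # Made-up timestamps for illustration only.
    handled = {101: 10.0, 102: 20.0, 103: 30.0}
    completed = {101: 10.5, 102: 20.25}  # record for RPC 103 was lost
    print(estimate_missing_completions(handled, completed))
    print(find_gaps([1.0, 1.5, 5.0, 5.2]))
```

Estimated completions are probably best kept separate from measured ones (for example, drawn differently in the trace), so that the effect of incomplete logs on the profiling result stays visible.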