From: Namjae Jeon <linkinjeon@gmail.com>
To: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Cc: bfields@fieldses.org, neilb@suse.de, linux-nfs@vger.kernel.org,
	Amit Sahrawat <a.sahrawat@samsung.com>,
	Nam-Jae Jeon <namjae.jeon@samsung.com>
Subject: Re: sunrpc/cache.c: races while updating cache entries
Date: Mon, 13 May 2013 13:08:45 +0900	[thread overview]
Message-ID: <CAKYAXd_9a=xukJDpV=ug3npyaoa4mrpW8ijf_6DiKPDjiOYe7g@mail.gmail.com> (raw)
In-Reply-To: <CAKYAXd9dWGA1Eaq5mi-eRbY0RRhkmWDR7CeDoeW18dBcKcGv+Q@mail.gmail.com>

Hi.

Sorry for the interruption.
I fixed my issue with the patch "nfsd4: fix hang on fast-booting nfs
servers"; it turned out to be a different issue from the one discussed
in this thread.

Thanks.

2013/5/10, Namjae Jeon <linkinjeon@gmail.com>:
> Hi. Bodo.
>
> We are facing issues with the SUNRPC cache.
> In our setup, we have two targets connected back-to-back.
> NFS server kernel version: 2.6.35
>
> At times, when the client tries to connect to the server, it gets
> stuck for a very long time and keeps retrying the mount.
>
> From the logs, we found that the client was not getting a response to
> its FSINFO request.
>
> Further debugging showed that the request was being dropped at the
> server, so it was never served.
>
> In the code we reached this point:
>
> svcauth_unix_set_client() ->
>         gi = unix_gid_find(cred->cr_uid, rqstp);
>         switch (PTR_ERR(gi)) {
>         case -EAGAIN:
>                 return SVC_DROP;
>
> This path is part of the SUNRPC cache management.
>
> When we remove this unix_gid_find() path from our code, the problem
> disappears.
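For reference, the drop-on-pending behavior described above can be modeled in user space as a small sketch. The helper signatures, the `entry_ready` flag, and the constant values are illustrative stand-ins, not the kernel's actual definitions; the point is only how a cache lookup that returns `ERR_PTR(-EAGAIN)` (entry still being filled by an upcall) maps to `SVC_DROP`, so the request is silently discarded and the client must retransmit:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative return codes (values assumed, not taken from the kernel). */
#define SVC_OK     0
#define SVC_DROP   1
#define SVC_DENIED 2

/* User-space stand-ins for the kernel's ERR_PTR/PTR_ERR/IS_ERR helpers. */
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-4095;
}

struct group_info { int ngroups; };

/* Stand-in for unix_gid_find(): while the cache entry is still being
 * filled (an upcall is pending), it returns -EAGAIN encoded as a pointer. */
static struct group_info *unix_gid_find(int uid, int entry_ready)
{
    static struct group_info gi = { 1 };
    (void)uid;
    if (!entry_ready)
        return ERR_PTR(-EAGAIN);  /* still waiting for the upcall */
    return &gi;
}

/* Sketch of the decision made in svcauth_unix_set_client(). */
static int set_client(int uid, int entry_ready)
{
    struct group_info *gi = unix_gid_find(uid, entry_ready);

    if (IS_ERR(gi)) {
        switch (PTR_ERR(gi)) {
        case -EAGAIN:
            /* Request is dropped, not answered; the client's RPC
             * layer will retransmit later. */
            return SVC_DROP;
        default:
            return SVC_DENIED;
        }
    }
    return SVC_OK;
}
```

With a cache entry that never becomes ready, every retransmission hits the same `SVC_DROP` path, which matches the observed symptom of the FSINFO request never being answered.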
>
> While investigating possible causes matching our scenario, we found
> that you have faced a similar race in the cache.
> Can you please suggest what the problem could be, so that we can
> investigate further?
>
> Or, if you have encountered a similar situation, could you suggest
> patches for 2.6.35 that we could try in our environment?
>
> We would be very grateful.
>
> Thanks
>
>
> 2013/4/20, Bodo Stroesser <bstroesser@ts.fujitsu.com>:
>> On 05 Apr 2013 23:09:00 +0100 J. Bruce Fields <bfields@fieldses.org>
>> wrote:
>>> On Fri, Apr 05, 2013 at 05:33:49PM +0200, Bodo Stroesser wrote:
>>> > On 05 Apr 2013 14:40:00 +0100 J. Bruce Fields <bfields@fieldses.org>
>>> > wrote:
>>> > > On Thu, Apr 04, 2013 at 07:59:35PM +0200, Bodo Stroesser wrote:
>>> > > > There is no reason for apologies. The thread meanwhile seems
>>> > > > to be a bit confusing :-)
>>> > > >
>>> > > > Current state is:
>>> > > >
>>> > > > - Neil Brown has created two series of patches: one for
>>> > > >   SLES11-SP1 and a second one for -SP2.
>>> > > >
>>> > > > - AFAICS, the series for -SP2 will match mainline as well.
>>> > > >
>>> > > > - Today I found and fixed the (hopefully) last problem in the
>>> > > >   -SP1 series. My test using this patchset will run until
>>> > > >   Monday.
>>> > > >
>>> > > > - Provided the test on SP1 succeeds, probably on Tuesday I'll
>>> > > >   start to test the patches for SP2 (and mainline). If it runs
>>> > > >   fine, we'll have a tested patchset no later than Mon 15th.
>>> > >
>>> > > OK, great, as long as it hasn't just been forgotten!
>>> > >
>>> > > I'd also be curious to understand why we aren't getting a lot of
>>> > > complaints about this from elsewhere....  Is there something
>>> > > unique about your setup?  Do the bugs that remain upstream take
>>> > > a long time to reproduce?
>>> > >
>>> > > --b.
>>> > >
>>> >
>>> > It's no secret, what we are doing. So let me try to explain:
>>>
>>> Thanks for the detailed explanation!  I'll look forward to the patches.
>>>
>>> --b.
>>>
>>
>> Let me give an intermediate result:
>>
>> The test of the -SP1 patch series succeeded.
>>
>> We started the test of the -SP2 (and mainline) series on Tue, 9th,
>> but had no success.
>> We did _not_ find a problem with the patches, but under -SP2 our test
>> scenario has less than 40% of the throughput we saw under -SP1. With
>> that low performance, we had a 4-day run without any dropped RPC
>> request. But we don't know the error rate without the patches under
>> these conditions, so we can't give an o.k. for the patches yet.
>>
>> Currently we are trying to find the reason for the different behavior
>> of SP1 and SP2.
>>
>> Bodo
>>
>

Thread overview: 22+ messages
2013-04-19 16:55 sunrpc/cache.c: races while updating cache entries Bodo Stroesser
2013-05-10  7:51 ` Namjae Jeon
2013-05-13  4:08   ` Namjae Jeon [this message]
     [not found] <61eb00$3oamkh@dgate20u.abg.fsc.net>
2013-06-13  1:54 ` NeilBrown
2013-06-13  2:04   ` J. Bruce Fields
  -- strict thread matches above, loose matches on Subject: below --
2013-06-03 14:27 Bodo Stroesser
     [not found] <d6437a$47jkcm@dgate10u.abg.fsc.net>
2013-04-05 21:08 ` J. Bruce Fields
2013-04-05 15:33 Bodo Stroesser
     [not found] <61eb00$3itd78@dgate20u.abg.fsc.net>
2013-04-05 12:40 ` J. Bruce Fields
2013-04-04 17:59 Bodo Stroesser
     [not found] <61eb00$3hon1j@dgate20u.abg.fsc.net>
2013-04-03 18:36 ` J. Bruce Fields
2013-03-21 16:41 Bodo Stroesser
     [not found] <61eb00$3hl8ah@dgate20u.abg.fsc.net>
2013-03-20 23:33 ` NeilBrown
2013-03-20 18:45 Bodo Stroesser
     [not found] <d6437a$45t6bs@dgate10u.abg.fsc.net>
2013-03-20  4:39 ` NeilBrown
2013-03-19 19:58 Bodo Stroesser
     [not found] <d6437a$45efvo@dgate10u.abg.fsc.net>
2013-03-19  3:23 ` NeilBrown
2013-03-15 20:35 Bodo Stroesser
2013-03-14 17:31 Bodo Stroesser
2013-03-13 16:47 Bodo Stroesser
     [not found] <61eb00$3gpm51@dgate20u.abg.fsc.net>
2013-03-13  5:55 ` NeilBrown
2013-03-11 16:13 Bodo Stroesser
