All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Revert "libceph: use memalloc flags for net IO"
@ 2015-04-07 13:40 Ilya Dryomov
  2015-04-07 15:41 ` Mel Gorman
  2015-04-07 15:41 ` Mike Christie
  0 siblings, 2 replies; 3+ messages in thread
From: Ilya Dryomov @ 2015-04-07 13:40 UTC (permalink / raw)
  To: ceph-devel; +Cc: Mike Christie, Mel Gorman, Sage Weil

This reverts commit 89baaa570ab0b476db09408d209578cfed700e9f.

Dirty page throttling should be sufficient for us in the general case
so there is no need to use __GFP_MEMALLOC - it would be needed only in
the swap-over-rbd case, which we currently don't support.  (It would
probably take approximately the commit that is being reverted to add
that support, but we would also need the "swap" option to distinguish
from the general case and make sure swap ceph_client-s aren't shared
with anything else.)  See ceph-devel threads [1] and [2] for the
details of why enabling pfmemalloc reserves for all cases is a bad
thing.

On top of potential system lockups related to drained emergency
reserves, this turned out to cause ceph lockups in case peers are on
the same host and communicating via loopback due to sk_filter()
dropping pfmemalloc skbs on the receiving side because the receiving
loopback socket is not tagged with SOCK_MEMALLOC.

[1] "SOCK_MEMALLOC vs loopback"
    http://www.spinics.net/lists/ceph-devel/msg22998.html
[2] "[PATCH] libceph: don't set memalloc flags in loopback case"
    http://www.spinics.net/lists/ceph-devel/msg23392.html

Conflicts:
	net/ceph/messenger.c [ context: tcp_nodelay option ]

Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Sage Weil <sage@redhat.com>
Cc: stable@vger.kernel.org # 3.18+, needs backporting
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
---
 net/ceph/messenger.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 6b3f54ed65ba..a9f4ae45b7fb 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -484,7 +484,7 @@ static int ceph_tcp_connect(struct ceph_connection *con)
 			       IPPROTO_TCP, &sock);
 	if (ret)
 		return ret;
-	sock->sk->sk_allocation = GFP_NOFS | __GFP_MEMALLOC;
+	sock->sk->sk_allocation = GFP_NOFS;
 
 #ifdef CONFIG_LOCKDEP
 	lockdep_set_class(&sock->sk->sk_lock, &socket_class);
@@ -520,8 +520,6 @@ static int ceph_tcp_connect(struct ceph_connection *con)
 			       ret);
 	}
 
-	sk_set_memalloc(sock->sk);
-
 	con->sock = sock;
 	return 0;
 }
@@ -2808,11 +2806,8 @@ static void con_work(struct work_struct *work)
 {
 	struct ceph_connection *con = container_of(work, struct ceph_connection,
 						   work.work);
-	unsigned long pflags = current->flags;
 	bool fault;
 
-	current->flags |= PF_MEMALLOC;
-
 	mutex_lock(&con->mutex);
 	while (true) {
 		int ret;
@@ -2866,8 +2861,6 @@ static void con_work(struct work_struct *work)
 		con_fault_finish(con);
 
 	con->ops->put(con);
-
-	tsk_restore_flags(current, pflags, PF_MEMALLOC);
 }
 
 /*
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] Revert "libceph: use memalloc flags for net IO"
  2015-04-07 13:40 [PATCH] Revert "libceph: use memalloc flags for net IO" Ilya Dryomov
@ 2015-04-07 15:41 ` Mel Gorman
  2015-04-07 15:41 ` Mike Christie
  1 sibling, 0 replies; 3+ messages in thread
From: Mel Gorman @ 2015-04-07 15:41 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: ceph-devel, Mike Christie, Sage Weil

On Tue, Apr 07, 2015 at 04:40:24PM +0300, Ilya Dryomov wrote:
> This reverts commit 89baaa570ab0b476db09408d209578cfed700e9f.
> 
> Dirty page throttling should be sufficient for us in the general case
> so there is no need to use __GFP_MEMALLOC - it would be needed only in
> the swap-over-rbd case, which we currently don't support.  (It would
> probably take approximately the commit that is being reverted to add
> that support, but we would also need the "swap" option to distinguish
> from the general case and make sure swap ceph_client-s aren't shared
> with anything else.)  See ceph-devel threads [1] and [2] for the
> details of why enabling pfmemalloc reserves for all cases is a bad
> thing.
> 
> On top of potential system lockups related to drained emergency
> reserves, this turned out to cause ceph lockups in case peers are on
> the same host and communicating via loopback due to sk_filter()
> dropping pfmemalloc skbs on the receiving side because the receiving
> loopback socket is not tagged with SOCK_MEMALLOC.
> 
> [1] "SOCK_MEMALLOC vs loopback"
>     http://www.spinics.net/lists/ceph-devel/msg22998.html
> [2] "[PATCH] libceph: don't set memalloc flags in loopback case"
>     http://www.spinics.net/lists/ceph-devel/msg23392.html
> 
> Conflicts:
> 	net/ceph/messenger.c [ context: tcp_nodelay option ]
> 
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Sage Weil <sage@redhat.com>
> Cc: stable@vger.kernel.org # 3.18+, needs backporting
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Revert "libceph: use memalloc flags for net IO"
  2015-04-07 13:40 [PATCH] Revert "libceph: use memalloc flags for net IO" Ilya Dryomov
  2015-04-07 15:41 ` Mel Gorman
@ 2015-04-07 15:41 ` Mike Christie
  1 sibling, 0 replies; 3+ messages in thread
From: Mike Christie @ 2015-04-07 15:41 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Mike Christie, Mel Gorman, Sage Weil

On 04/07/2015 08:40 AM, Ilya Dryomov wrote:
> This reverts commit 89baaa570ab0b476db09408d209578cfed700e9f.
> 
> Dirty page throttling should be sufficient for us in the general case
> so there is no need to use __GFP_MEMALLOC - it would be needed only in
> the swap-over-rbd case, which we currently don't support.  (It would
> probably take approximately the commit that is being reverted to add
> that support, but we would also need the "swap" option to distinguish
> from the general case and make sure swap ceph_client-s aren't shared
> with anything else.)  See ceph-devel threads [1] and [2] for the
> details of why enabling pfmemalloc reserves for all cases is a bad
> thing.
> 
> On top of potential system lockups related to drained emergency
> reserves, this turned out to cause ceph lockups in case peers are on
> the same host and communicating via loopback due to sk_filter()
> dropping pfmemalloc skbs on the receiving side because the receiving
> loopback socket is not tagged with SOCK_MEMALLOC.
> 
> [1] "SOCK_MEMALLOC vs loopback"
>     http://www.spinics.net/lists/ceph-devel/msg22998.html
> [2] "[PATCH] libceph: don't set memalloc flags in loopback case"
>     http://www.spinics.net/lists/ceph-devel/msg23392.html
> 
> Conflicts:
> 	net/ceph/messenger.c [ context: tcp_nodelay option ]
> 
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Sage Weil <sage@redhat.com>
> Cc: stable@vger.kernel.org # 3.18+, needs backporting
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Yeah, I misunderstood the memalloc flag use.

Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-04-07 15:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-07 13:40 [PATCH] Revert "libceph: use memalloc flags for net IO" Ilya Dryomov
2015-04-07 15:41 ` Mel Gorman
2015-04-07 15:41 ` Mike Christie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.