[patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations

All of lore.kernel.org
 help / color / mirror / Atom feed

* [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations
@ 2010-07-21  2:44 David Rientjes
  2010-07-21  2:44   ` David Rientjes
                   ` (5 more replies)
  0 siblings, 6 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:44 UTC (permalink / raw)
  To: Andrew Morton, David S. Miller, Steve Wise, Al Viro,
	Steven Whitehouse, Jan Kara
  Cc: Benjamin Herrenschmidt, Roland Dreier, Jens Axboe, Bob Peterson,
	Andreas Dilger, Jiri Kosina, linux-mm

This patchset removes __GFP_NOFAIL from various allocations when those 
callers have error handling or the subsystem doesn't absolutely require
success.

This is the first phase of two for the total removal of __GFP_NOFAIL:
this patchset is intended to fix obvious users of __GFP_NOFAIL that are
already failable or otherwise unnecessary.  The second phase will replace
__GFP_NOFAIL with a different gfp which will use all of the page
allocator's resources (direct reclaim, compaction, and the oom killer)
to free memory but not infinitely loop in the allocator itself.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 1/6] sparc: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
@ 2010-07-21  2:44   ` David Rientjes
       [not found] ` <alpine.DEB.2.00.1007201936210.8728-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:44 UTC (permalink / raw)
  To: David S. Miller, Andrew Morton
  Cc: Benjamin Herrenschmidt, sparclinux, linux-mm

The kmalloc() in mdesc_kmalloc() is failable, so remove __GFP_NOFAIL from
its mask.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/sparc/kernel/mdesc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/sparc/kernel/mdesc.c b/arch/sparc/kernel/mdesc.c
--- a/arch/sparc/kernel/mdesc.c
+++ b/arch/sparc/kernel/mdesc.c
@@ -134,7 +134,7 @@ static struct mdesc_handle *mdesc_kmalloc(unsigned int mdesc_size)
 		       sizeof(struct mdesc_hdr) +
 		       mdesc_size);
 
-	base = kmalloc(handle_size + 15, GFP_KERNEL | __GFP_NOFAIL);
+	base = kmalloc(handle_size + 15, GFP_KERNEL);
 	if (base) {
 		struct mdesc_handle *hp;
 		unsigned long addr;

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 1/6] sparc: remove dependency on __GFP_NOFAIL
@ 2010-07-21  2:44   ` David Rientjes
  0 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:44 UTC (permalink / raw)
  To: David S. Miller, Andrew Morton
  Cc: Benjamin Herrenschmidt, sparclinux, linux-mm

The kmalloc() in mdesc_kmalloc() is failable, so remove __GFP_NOFAIL from
its mask.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 arch/sparc/kernel/mdesc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/sparc/kernel/mdesc.c b/arch/sparc/kernel/mdesc.c
--- a/arch/sparc/kernel/mdesc.c
+++ b/arch/sparc/kernel/mdesc.c
@@ -134,7 +134,7 @@ static struct mdesc_handle *mdesc_kmalloc(unsigned int mdesc_size)
 		       sizeof(struct mdesc_hdr) +
 		       mdesc_size);
 
-	base = kmalloc(handle_size + 15, GFP_KERNEL | __GFP_NOFAIL);
+	base = kmalloc(handle_size + 15, GFP_KERNEL);
 	if (base) {
 		struct mdesc_handle *hp;
 		unsigned long addr;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
@ 2010-07-21  2:44     ` David Rientjes
       [not found] ` <alpine.DEB.2.00.1007201936210.8728-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:44 UTC (permalink / raw)
  To: Steve Wise, Andrew Morton
  Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg

The alloc_skb() in various allocations are failable, so remove
__GFP_NOFAIL from their masks.

Cc: Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
---
 drivers/infiniband/hw/cxgb4/cq.c  |    4 ++--
 drivers/infiniband/hw/cxgb4/mem.c |    2 +-
 drivers/infiniband/hw/cxgb4/qp.c  |    6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -43,7 +43,7 @@ static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 	int ret;
 
 	wr_len = sizeof *res_wr + sizeof *res;
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
@@ -118,7 +118,7 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 	/* build fw_ri_res_wr */
 	wr_len = sizeof *res_wr + sizeof *res;
 
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb) {
 		ret = -ENOMEM;
 		goto err4;
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -59,7 +59,7 @@ static int write_adapter_mem(struct c4iw_rdev *rdev, u32 addr, u32 len,
 		wr_len = roundup(sizeof *req + sizeof *sc +
 				 roundup(copy_len, T4_ULPTX_MIN_IO), 16);
 
-		skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+		skb = alloc_skb(wr_len, GFP_KERNEL);
 		if (!skb)
 			return -ENOMEM;
 		set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -130,7 +130,7 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq,
 	/* build fw_ri_res_wr */
 	wr_len = sizeof *res_wr + 2 * sizeof *res;
 
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb) {
 		ret = -ENOMEM;
 		goto err7;
@@ -961,7 +961,7 @@ static int rdma_fini(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
 	PDBG("%s qhp %p qid 0x%x tid %u\n", __func__, qhp, qhp->wq.sq.qid,
 	     qhp->ep->hwtid);
 
-	skb = alloc_skb(sizeof *wqe, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_DATA, qhp->ep->txq_idx);
@@ -1035,7 +1035,7 @@ static int rdma_init(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
 	PDBG("%s qhp %p qid 0x%x tid %u\n", __func__, qhp, qhp->wq.sq.qid,
 	     qhp->ep->hwtid);
 
-	skb = alloc_skb(sizeof *wqe, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_DATA, qhp->ep->txq_idx);
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL
@ 2010-07-21  2:44     ` David Rientjes
  0 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:44 UTC (permalink / raw)
  To: Steve Wise, Andrew Morton; +Cc: Roland Dreier, linux-rdma, linux-mm

The alloc_skb() in various allocations are failable, so remove
__GFP_NOFAIL from their masks.

Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 drivers/infiniband/hw/cxgb4/cq.c  |    4 ++--
 drivers/infiniband/hw/cxgb4/mem.c |    2 +-
 drivers/infiniband/hw/cxgb4/qp.c  |    6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -43,7 +43,7 @@ static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 	int ret;
 
 	wr_len = sizeof *res_wr + sizeof *res;
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
@@ -118,7 +118,7 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 	/* build fw_ri_res_wr */
 	wr_len = sizeof *res_wr + sizeof *res;
 
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb) {
 		ret = -ENOMEM;
 		goto err4;
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -59,7 +59,7 @@ static int write_adapter_mem(struct c4iw_rdev *rdev, u32 addr, u32 len,
 		wr_len = roundup(sizeof *req + sizeof *sc +
 				 roundup(copy_len, T4_ULPTX_MIN_IO), 16);
 
-		skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+		skb = alloc_skb(wr_len, GFP_KERNEL);
 		if (!skb)
 			return -ENOMEM;
 		set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -130,7 +130,7 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq,
 	/* build fw_ri_res_wr */
 	wr_len = sizeof *res_wr + 2 * sizeof *res;
 
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb) {
 		ret = -ENOMEM;
 		goto err7;
@@ -961,7 +961,7 @@ static int rdma_fini(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
 	PDBG("%s qhp %p qid 0x%x tid %u\n", __func__, qhp, qhp->wq.sq.qid,
 	     qhp->ep->hwtid);
 
-	skb = alloc_skb(sizeof *wqe, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_DATA, qhp->ep->txq_idx);
@@ -1035,7 +1035,7 @@ static int rdma_init(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
 	PDBG("%s qhp %p qid 0x%x tid %u\n", __func__, qhp, qhp->wq.sq.qid,
 	     qhp->ep->hwtid);
 
-	skb = alloc_skb(sizeof *wqe, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_DATA, qhp->ep->txq_idx);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 3/6] fs: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
@ 2010-07-21  2:45   ` David Rientjes
       [not found] ` <alpine.DEB.2.00.1007201936210.8728-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:45 UTC (permalink / raw)
  To: Al Viro, Andrew Morton; +Cc: Jens Axboe, linux-fsdevel, linux-mm

The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
from its mask.

Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 fs/bio-integrity.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
 
 	/* Allocate kernel buffer for protection data */
 	len = sectors * blk_integrity_tuple_size(bi);
-	buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
+	buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
 	if (unlikely(buf == NULL)) {
 		printk(KERN_ERR "could not allocate integrity buffer\n");
 		return -EIO;

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 3/6] fs: remove dependency on __GFP_NOFAIL
@ 2010-07-21  2:45   ` David Rientjes
  0 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:45 UTC (permalink / raw)
  To: Al Viro, Andrew Morton; +Cc: Jens Axboe, linux-fsdevel, linux-mm

The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
from its mask.

Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 fs/bio-integrity.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
--- a/fs/bio-integrity.c
+++ b/fs/bio-integrity.c
@@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
 
 	/* Allocate kernel buffer for protection data */
 	len = sectors * blk_integrity_tuple_size(bi);
-	buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
+	buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
 	if (unlikely(buf == NULL)) {
 		printk(KERN_ERR "could not allocate integrity buffer\n");
 		return -EIO;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 4/6] gfs2: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
                   ` (2 preceding siblings ...)
  2010-07-21  2:45   ` David Rientjes
@ 2010-07-21  2:45 ` David Rientjes
  2010-07-21  9:24     ` [Cluster-devel] " Steven Whitehouse
  2010-07-21  2:45 ` [patch 6/6] jbd2: " David Rientjes
  2010-07-21 19:26 ` [patch 5/6] jbd: " David Rientjes
  5 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:45 UTC (permalink / raw)
  To: Steven Whitehouse, Andrew Morton; +Cc: Bob Peterson, cluser-devel, linux-mm

The k[mc]allocs in dr_split_leaf() and dir_double_exhash() are failable,
so remove __GFP_NOFAIL from their masks.

Cc: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 fs/gfs2/dir.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -955,7 +955,12 @@ static int dir_split_leaf(struct inode *inode, const struct qstr *name)
 	/* Change the pointers.
 	   Don't bother distinguishing stuffed from non-stuffed.
 	   This code is complicated enough already. */
-	lp = kmalloc(half_len * sizeof(__be64), GFP_NOFS | __GFP_NOFAIL);
+	lp = kmalloc(half_len * sizeof(__be64), GFP_NOFS);
+	if (!lp) {
+		error = -ENOMEM;
+		goto fail_brelse;
+	}
+
 	/*  Change the pointers  */
 	for (x = 0; x < half_len; x++)
 		lp[x] = cpu_to_be64(bn);
@@ -1063,7 +1068,9 @@ static int dir_double_exhash(struct gfs2_inode *dip)
 
 	/*  Allocate both the "from" and "to" buffers in one big chunk  */
 
-	buf = kcalloc(3, sdp->sd_hash_bsize, GFP_NOFS | __GFP_NOFAIL);
+	buf = kcalloc(3, sdp->sd_hash_bsize, GFP_NOFS);
+	if (!buf)
+		return -ENOMEM;
 
 	for (block = dip->i_disksize >> sdp->sd_hash_bsize_shift; block--;) {
 		error = gfs2_dir_read_data(dip, (char *)buf,

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
                   ` (3 preceding siblings ...)
  2010-07-21  2:45 ` [patch 4/6] gfs2: " David Rientjes
@ 2010-07-21  2:45 ` David Rientjes
  2010-07-22 14:14   ` Ted Ts'o
  2010-07-21 19:26 ` [patch 5/6] jbd: " David Rientjes
  5 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-07-21  2:45 UTC (permalink / raw)
  To: Jan Kara, Andrew Morton; +Cc: Andreas Dilger, Jiri Kosina, linux-mm

The kzalloc() in start_this_handle() is failable, so remove __GFP_NOFAIL
from its mask.

Cc: Andreas Dilger <adilger@sun.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 fs/jbd2/transaction.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -102,8 +102,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle)
 
 alloc_transaction:
 	if (!journal->j_running_transaction) {
-		new_transaction = kzalloc(sizeof(*new_transaction),
-						GFP_NOFS|__GFP_NOFAIL);
+		new_transaction = kzalloc(sizeof(*new_transaction), GFP_NOFS):
 		if (!new_transaction) {
 			ret = -ENOMEM;
 			goto out;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44     ` David Rientjes
@ 2010-07-21  3:19         ` Steve Wise
  -1 siblings, 0 replies; 41+ messages in thread
From: Steve Wise @ 2010-07-21  3:19 UTC (permalink / raw)
  To: David Rientjes
  Cc: Steve Wise, Andrew Morton, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg

Acked-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

David Rientjes wrote:
> The alloc_skb() in various allocations are failable, so remove
> __GFP_NOFAIL from their masks.
>
> Cc: Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/infiniband/hw/cxgb4/cq.c  |    4 ++--
>  drivers/infiniband/hw/cxgb4/mem.c |    2 +-
>  drivers/infiniband/hw/cxgb4/qp.c  |    6 +++---
>  3 files changed, 6 insertions(+), 6 deletions(-)
>   

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL
@ 2010-07-21  3:19         ` Steve Wise
  0 siblings, 0 replies; 41+ messages in thread
From: Steve Wise @ 2010-07-21  3:19 UTC (permalink / raw)
  To: David Rientjes
  Cc: Steve Wise, Andrew Morton, Roland Dreier, linux-rdma, linux-mm

Acked-by: Steve Wise <swise@opengridcomputing.com>

David Rientjes wrote:
> The alloc_skb() in various allocations are failable, so remove
> __GFP_NOFAIL from their masks.
>
> Cc: Roland Dreier <rolandd@cisco.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  drivers/infiniband/hw/cxgb4/cq.c  |    4 ++--
>  drivers/infiniband/hw/cxgb4/mem.c |    2 +-
>  drivers/infiniband/hw/cxgb4/qp.c  |    6 +++---
>  3 files changed, 6 insertions(+), 6 deletions(-)
>   

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 1/6] sparc: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44   ` David Rientjes
@ 2010-07-21  3:31     ` David Miller
  -1 siblings, 0 replies; 41+ messages in thread
From: David Miller @ 2010-07-21  3:31 UTC (permalink / raw)
  To: rientjes; +Cc: akpm, benh, sparclinux, linux-mm

From: David Rientjes <rientjes@google.com>
Date: Tue, 20 Jul 2010 19:44:53 -0700 (PDT)

> The kmalloc() in mdesc_kmalloc() is failable, so remove __GFP_NOFAIL from
> its mask.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

The __GFP_NOFAIL is there intentionally.

The code above this, in the cases where the machine description is
dynamically updated by the hypervisor at run time, long after boot,
has no failure handling.

We absolutely must accept the machine descriptor update and fetch it
from the hypervisor into a new buffer.

Please don't remove this.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 1/6] sparc: remove dependency on __GFP_NOFAIL
@ 2010-07-21  3:31     ` David Miller
  0 siblings, 0 replies; 41+ messages in thread
From: David Miller @ 2010-07-21  3:31 UTC (permalink / raw)
  To: rientjes; +Cc: akpm, benh, sparclinux, linux-mm

From: David Rientjes <rientjes@google.com>
Date: Tue, 20 Jul 2010 19:44:53 -0700 (PDT)

> The kmalloc() in mdesc_kmalloc() is failable, so remove __GFP_NOFAIL from
> its mask.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: David Rientjes <rientjes@google.com>

The __GFP_NOFAIL is there intentionally.

The code above this, in the cases where the machine description is
dynamically updated by the hypervisor at run time, long after boot,
has no failure handling.

We absolutely must accept the machine descriptor update and fetch it
from the hypervisor into a new buffer.

Please don't remove this.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 4/6] gfs2: remove dependency on __GFP_NOFAIL
  2010-07-21  2:45 ` [patch 4/6] gfs2: " David Rientjes
@ 2010-07-21  9:24     ` Steven Whitehouse
  0 siblings, 0 replies; 41+ messages in thread
From: Steven Whitehouse @ 2010-07-21  9:24 UTC (permalink / raw)
  To: David Rientjes; +Cc: Andrew Morton, Bob Peterson, cluster-devel, linux-mm

Hi,

Looks good to me, I've added it to the -nmw tree. There are a few more
GFP_NOFAIL instances in the code that we can probably remove in the
future, but these two are pretty easy. Thanks for the patch,

Steve.

On Tue, 2010-07-20 at 19:45 -0700, David Rientjes wrote:
> The k[mc]allocs in dr_split_leaf() and dir_double_exhash() are failable,
> so remove __GFP_NOFAIL from their masks.
> 
> Cc: Bob Peterson <rpeterso@redhat.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  fs/gfs2/dir.c |   11 +++++++++--
>  1 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
> --- a/fs/gfs2/dir.c
> +++ b/fs/gfs2/dir.c
> @@ -955,7 +955,12 @@ static int dir_split_leaf(struct inode *inode, const struct qstr *name)
>  	/* Change the pointers.
>  	   Don't bother distinguishing stuffed from non-stuffed.
>  	   This code is complicated enough already. */
> -	lp = kmalloc(half_len * sizeof(__be64), GFP_NOFS | __GFP_NOFAIL);
> +	lp = kmalloc(half_len * sizeof(__be64), GFP_NOFS);
> +	if (!lp) {
> +		error = -ENOMEM;
> +		goto fail_brelse;
> +	}
> +
>  	/*  Change the pointers  */
>  	for (x = 0; x < half_len; x++)
>  		lp[x] = cpu_to_be64(bn);
> @@ -1063,7 +1068,9 @@ static int dir_double_exhash(struct gfs2_inode *dip)
>  
>  	/*  Allocate both the "from" and "to" buffers in one big chunk  */
>  
> -	buf = kcalloc(3, sdp->sd_hash_bsize, GFP_NOFS | __GFP_NOFAIL);
> +	buf = kcalloc(3, sdp->sd_hash_bsize, GFP_NOFS);
> +	if (!buf)
> +		return -ENOMEM;
>  
>  	for (block = dip->i_disksize >> sdp->sd_hash_bsize_shift; block--;) {
>  		error = gfs2_dir_read_data(dip, (char *)buf,



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Cluster-devel] [patch 4/6] gfs2: remove dependency on __GFP_NOFAIL
@ 2010-07-21  9:24     ` Steven Whitehouse
  0 siblings, 0 replies; 41+ messages in thread
From: Steven Whitehouse @ 2010-07-21  9:24 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Looks good to me, I've added it to the -nmw tree. There are a few more
GFP_NOFAIL instances in the code that we can probably remove in the
future, but these two are pretty easy. Thanks for the patch,

Steve.

On Tue, 2010-07-20 at 19:45 -0700, David Rientjes wrote:
> The k[mc]allocs in dr_split_leaf() and dir_double_exhash() are failable,
> so remove __GFP_NOFAIL from their masks.
> 
> Cc: Bob Peterson <rpeterso@redhat.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  fs/gfs2/dir.c |   11 +++++++++--
>  1 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
> --- a/fs/gfs2/dir.c
> +++ b/fs/gfs2/dir.c
> @@ -955,7 +955,12 @@ static int dir_split_leaf(struct inode *inode, const struct qstr *name)
>  	/* Change the pointers.
>  	   Don't bother distinguishing stuffed from non-stuffed.
>  	   This code is complicated enough already. */
> -	lp = kmalloc(half_len * sizeof(__be64), GFP_NOFS | __GFP_NOFAIL);
> +	lp = kmalloc(half_len * sizeof(__be64), GFP_NOFS);
> +	if (!lp) {
> +		error = -ENOMEM;
> +		goto fail_brelse;
> +	}
> +
>  	/*  Change the pointers  */
>  	for (x = 0; x < half_len; x++)
>  		lp[x] = cpu_to_be64(bn);
> @@ -1063,7 +1068,9 @@ static int dir_double_exhash(struct gfs2_inode *dip)
>  
>  	/*  Allocate both the "from" and "to" buffers in one big chunk  */
>  
> -	buf = kcalloc(3, sdp->sd_hash_bsize, GFP_NOFS | __GFP_NOFAIL);
> +	buf = kcalloc(3, sdp->sd_hash_bsize, GFP_NOFS);
> +	if (!buf)
> +		return -ENOMEM;
>  
>  	for (block = dip->i_disksize >> sdp->sd_hash_bsize_shift; block--;) {
>  		error = gfs2_dir_read_data(dip, (char *)buf,





^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 4/6] gfs2: remove dependency on __GFP_NOFAIL
  2010-07-21  9:24     ` [Cluster-devel] " Steven Whitehouse
  (?)
@ 2010-07-21  9:31     ` David Rientjes
  -1 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  9:31 UTC (permalink / raw)
  To: Steven Whitehouse; +Cc: Andrew Morton, Bob Peterson, cluster-devel, linux-mm

On Wed, 21 Jul 2010, Steven Whitehouse wrote:

> Hi,
> 
> Looks good to me, I've added it to the -nmw tree. There are a few more
> GFP_NOFAIL instances in the code that we can probably remove in the
> future, but these two are pretty easy. Thanks for the patch,
> 

Thanks!  I'm planning on replacing __GFP_NOFAIL with a different flag that 
will use all of the page allocator's capabilities (direct reclaim, 
compaction for order > 0, and oom killer) but not loop forever.  Existing 
__GFP_NOFAIL callers can then do

	do {
		page = alloc_page(GFP_KERNEL | __GFP_KILLABLE);
	} while (!page);

to duplicate the behavior of __GFP_NOFAIL until such time as 
__GFP_KILLABLE can be removed as well.  That's what I was planning for the 
remaining instances in gfs2 during the second phase.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 1/6] sparc: remove dependency on __GFP_NOFAIL
  2010-07-21  3:31     ` David Miller
@ 2010-07-21  9:41       ` David Rientjes
  -1 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  9:41 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, benh, sparclinux, linux-mm

On Tue, 20 Jul 2010, David Miller wrote:

> > The kmalloc() in mdesc_kmalloc() is failable, so remove __GFP_NOFAIL from
> > its mask.
> > 
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Signed-off-by: David Rientjes <rientjes@google.com>
> 
> The __GFP_NOFAIL is there intentionally.
> 
> The code above this, in the cases where the machine description is
> dynamically updated by the hypervisor at run time, long after boot,
> has no failure handling.
> 
> We absolutely must accept the machine descriptor update and fetch it
> from the hypervisor into a new buffer.
> 
> Please don't remove this.
> 

Ok, fair enough.  I was convinced by the error handling in both 
mdesc_update() and mdesc_kmallloc() that this was a failable allocation, 
but I understand how mdesc_update() must succeed given your description.  
We can remove those branches from those two functions, though, since 
__GFP_NOFAIL will always succeed before returning.

I'm planning on replacing __GFP_NOFAIL with a __GFP_KILLABLE flag that 
will use all of the page allocator's capabilities (direct reclaim, memory 
compaction for order > 0, and the oom killer) before failing.  Then, 
existing __GFP_NOFAIL users can use

	do {
		page = alloc_page(GFP_KERNEL | __GFP_KILLABLE);
	} while (!page);

to remove several branches from the page allocator that we'll no longer 
need.  I'll do this in phase two and make sure to convert this instance to 
do that.

Thanks!

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 1/6] sparc: remove dependency on __GFP_NOFAIL
@ 2010-07-21  9:41       ` David Rientjes
  0 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21  9:41 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, benh, sparclinux, linux-mm

On Tue, 20 Jul 2010, David Miller wrote:

> > The kmalloc() in mdesc_kmalloc() is failable, so remove __GFP_NOFAIL from
> > its mask.
> > 
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Signed-off-by: David Rientjes <rientjes@google.com>
> 
> The __GFP_NOFAIL is there intentionally.
> 
> The code above this, in the cases where the machine description is
> dynamically updated by the hypervisor at run time, long after boot,
> has no failure handling.
> 
> We absolutely must accept the machine descriptor update and fetch it
> from the hypervisor into a new buffer.
> 
> Please don't remove this.
> 

Ok, fair enough.  I was convinced by the error handling in both 
mdesc_update() and mdesc_kmallloc() that this was a failable allocation, 
but I understand how mdesc_update() must succeed given your description.  
We can remove those branches from those two functions, though, since 
__GFP_NOFAIL will always succeed before returning.

I'm planning on replacing __GFP_NOFAIL with a __GFP_KILLABLE flag that 
will use all of the page allocator's capabilities (direct reclaim, memory 
compaction for order > 0, and the oom killer) before failing.  Then, 
existing __GFP_NOFAIL users can use

	do {
		page = alloc_page(GFP_KERNEL | __GFP_KILLABLE);
	} while (!page);

to remove several branches from the page allocator that we'll no longer 
need.  I'll do this in phase two and make sure to convert this instance to 
do that.

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL
  2010-07-21  3:19         ` Steve Wise
@ 2010-07-21 17:55             ` Roland Dreier
  -1 siblings, 0 replies; 41+ messages in thread
From: Roland Dreier @ 2010-07-21 17:55 UTC (permalink / raw)
  To: Steve Wise
  Cc: David Rientjes, Steve Wise, Andrew Morton, Roland Dreier,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg

thanks guys, applied
-- 
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL
@ 2010-07-21 17:55             ` Roland Dreier
  0 siblings, 0 replies; 41+ messages in thread
From: Roland Dreier @ 2010-07-21 17:55 UTC (permalink / raw)
  To: Steve Wise
  Cc: David Rientjes, Steve Wise, Andrew Morton, Roland Dreier,
	linux-rdma, linux-mm

thanks guys, applied
-- 
Roland Dreier <rolandd@cisco.com> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [patch 5/6] jbd: remove dependency on __GFP_NOFAIL
  2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
                   ` (4 preceding siblings ...)
  2010-07-21  2:45 ` [patch 6/6] jbd2: " David Rientjes
@ 2010-07-21 19:26 ` David Rientjes
  5 siblings, 0 replies; 41+ messages in thread
From: David Rientjes @ 2010-07-21 19:26 UTC (permalink / raw)
  To: Andrew Morton, Jan Kara; +Cc: Andreas Dilger, Jiri Kosina, linux-mm

The kzalloc() in start_this_handle() is failable, so remove __GFP_NOFAIL
from its mask.

Cc: Andreas Dilger <adilger@sun.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 fs/jbd/transaction.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
--- a/fs/jbd/transaction.c
+++ b/fs/jbd/transaction.c
@@ -99,8 +99,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle)
 
 alloc_transaction:
 	if (!journal->j_running_transaction) {
-		new_transaction = kzalloc(sizeof(*new_transaction),
-						GFP_NOFS|__GFP_NOFAIL);
+		new_transaction = kzalloc(sizeof(*new_transaction), GFP_NOFS);
 		if (!new_transaction) {
 			ret = -ENOMEM;
 			goto out;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-21  2:45 ` [patch 6/6] jbd2: " David Rientjes
@ 2010-07-22 14:14   ` Ted Ts'o
  2010-07-22 18:09     ` David Rientjes
  0 siblings, 1 reply; 41+ messages in thread
From: Ted Ts'o @ 2010-07-22 14:14 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm

On Tue, Jul 20, 2010 at 07:45:10PM -0700, David Rientjes wrote:
> The kzalloc() in start_this_handle() is failable, so remove __GFP_NOFAIL
> from its mask.

Unfortunately, while there is error handling in start_this_handle(),
there isn't in all of the callers of start_this_handle(), which is why
the __GFP_NOFAIL is there.  At the moment, if we get an ENOMEM in the
delayed writeback code paths, for example, it's a disaster; user data
can get lost, as a result.

I could add retry code in the places where we really can't fail, and
reflect the change back up to the userspace code everywhere else, but
that will take a while to do, and even then it means that any VFS
syscall (chmod, unlink, rmdir, mkdir, read, close, etc.) would now
potentially return ENOMEM.  And we know how often application
programmers do error checking....

But until we fix up all of the callers of jbd2_journal_start() and
jbd2_journal_restart(), I'd prefer keep this __GFP_NOFAIL in place,
thanks.

						- Ted

P.S.  There may also be some places where we haven't taken locks yet,
and so it might be safe in a few locations to omit the GFP_NOFS flag.
That would mean adding a new version of jbd2_journal_start() which can
take a gfp_mask, but the net result would be a file system that would
be a little bit more of a "good citizen" as far as allowing memory
reclaim.  I am worried about reflecting ENOMEM errors back up to
userspace, though, since I'm painfully aware of how much crappy
userspace code is out there.  What might be nice is a
"__GFP_RETRY_HARDER" which is weaker form of __GFP_NOFAIL...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-22 14:14   ` Ted Ts'o
@ 2010-07-22 18:09     ` David Rientjes
  2010-07-22 23:09       ` Ted Ts'o
  0 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-07-22 18:09 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm

On Thu, 22 Jul 2010, Ted Ts'o wrote:

> > The kzalloc() in start_this_handle() is failable, so remove __GFP_NOFAIL
> > from its mask.
> 
> Unfortunately, while there is error handling in start_this_handle(),
> there isn't in all of the callers of start_this_handle(), which is why
> the __GFP_NOFAIL is there.  At the moment, if we get an ENOMEM in the
> delayed writeback code paths, for example, it's a disaster; user data
> can get lost, as a result.
> 

I'll change this to

	do {
		new_transaction = kzalloc(sizeof(*new_transaction),
							GFP_NOFS);
	} while (!new_transaction);

in the next phase when I introduce __GFP_KILLABLE (that jbd and jbd2 can't 
use because they are GFP_NOFS).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-22 18:09     ` David Rientjes
@ 2010-07-22 23:09       ` Ted Ts'o
  2010-07-22 23:24         ` David Rientjes
  0 siblings, 1 reply; 41+ messages in thread
From: Ted Ts'o @ 2010-07-22 23:09 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm

On Thu, Jul 22, 2010 at 11:09:53AM -0700, David Rientjes wrote:
> 
> I'll change this to
> 
> 	do {
> 		new_transaction = kzalloc(sizeof(*new_transaction),
> 							GFP_NOFS);
> 	} while (!new_transaction);
> 
> in the next phase when I introduce __GFP_KILLABLE (that jbd and jbd2 can't 
> use because they are GFP_NOFS).

OK, I can carry a patch which does this in the ext4 tree to push to
linus when the merge window opens shortly, since the goal is you want
to get rid of __GFP_NOFAIL altogether, right?

      	       	     	    	  	    	   - Ted

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-22 23:09       ` Ted Ts'o
@ 2010-07-22 23:24         ` David Rientjes
  2010-07-23 14:10             ` Ted Ts'o
  0 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-07-22 23:24 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm

On Thu, 22 Jul 2010, Ted Ts'o wrote:

> > I'll change this to
> > 
> > 	do {
> > 		new_transaction = kzalloc(sizeof(*new_transaction),
> > 							GFP_NOFS);
> > 	} while (!new_transaction);
> > 
> > in the next phase when I introduce __GFP_KILLABLE (that jbd and jbd2 can't 
> > use because they are GFP_NOFS).
> 
> OK, I can carry a patch which does this in the ext4 tree to push to
> linus when the merge window opens shortly, since the goal is you want
> to get rid of __GFP_NOFAIL altogether, right?
> 

Yup, I was trying to do the removal in two phases.

First, remove __GFP_NOFAIL from callers that don't seem to need it.  I 
found that they were actually needed in some cases such as jbd, jbd2, and 
sparc although the reason was specific to those subsystems at a higher 
level and their error handling was actually unused code since __GFP_NOFAIL 
cannot return NULL.

Second, replace __GFP_NOFAIL with __GFP_KILLABLE which converts existing 
users of __GFP_NOFAIL into the do-while loop above and adding 
__GFP_KILLABLE for allocations allowing __GFP_FS which does memory 
compaction for order > 0, direct reclaim, and the oom killer but does not 
retry the allocation.  That would be the responsibility of the caller.  
This ends up removing several branches from the page allocator.

I didn't think about converting the existing GFP_NOFS | __GFP_NOFAIL 
callers into the do-while loop above until you mentioned it, thanks.  I'll 
send patches to do that shortly.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-22 23:24         ` David Rientjes
@ 2010-07-23 14:10             ` Ted Ts'o
  0 siblings, 0 replies; 41+ messages in thread
From: Ted Ts'o @ 2010-07-23 14:10 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm,
	linux-ext4

On Thu, Jul 22, 2010 at 04:24:23PM -0700, David Rientjes wrote:
> 
> I didn't think about converting the existing GFP_NOFS | __GFP_NOFAIL 
> callers into the do-while loop above until you mentioned it, thanks.  I'll 
> send patches to do that shortly.

Here's what I'm planning on queueing for the next merge window, along
with patches to ext4 to use jbd2__journal_start(..., GFP_KERNEL) in
places where we can afford to fail.  After doing some analysis, the
places where we can afford to fail are also the places where we can
use GFP_KERNEL instead of GFP_NOFS, so conveniently, I'm using the
lack of __GFP_FS to indicate that we should do the retry loop in
start_this_handle().  I also added the congestion_wait() call since
there's no point busy-looping the CPU while we're waiting for pages to
get swapped or paged out.

Comments would be appreciated.

    	      	    	    	      	     - Ted

>From 814be805d5e3d12343e590631ff9bc2d65c8f60a Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Fri, 23 Jul 2010 10:06:53 -0400
Subject: [PATCH] jbd2: Remove __GFP_NOFAIL from jbd2 layer

__GFP_NOFAIL is going away, so add our own retry loop.  Also add
jbd2__journal_start() and jbd2__journal_restart() which take a gfp
mask, so that file systems can optionally (re)start transaction
handles using GFP_KERNEL.  If they do this, then they need to be
prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
to reflect that error up to userspace.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/jbd2/journal.c     |   14 +++++++++--
 fs/jbd2/transaction.c |   60 +++++++++++++++++++++++++++++++++---------------
 include/linux/jbd2.h  |    4 ++-
 3 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index f7bf157..eb84fbd 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -48,8 +48,6 @@
 #include <asm/uaccess.h>
 #include <asm/page.h>
 
-EXPORT_SYMBOL(jbd2_journal_start);
-EXPORT_SYMBOL(jbd2_journal_restart);
 EXPORT_SYMBOL(jbd2_journal_extend);
 EXPORT_SYMBOL(jbd2_journal_stop);
 EXPORT_SYMBOL(jbd2_journal_lock_updates);
@@ -311,7 +309,17 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
 	 */
 	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
 
-	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
+retry_alloc:
+	new_bh = alloc_buffer_head(GFP_NOFS);
+	if (!new_bh) {
+		/*
+		 * Failure is not an option, but __GFP_NOFAIL is going
+		 * away; so we retry ourselves here.
+		 */
+		congestion_wait(BLK_RW_ASYNC, HZ/50);
+		goto retry_alloc;
+	}
+
 	/* keep subsequent assertions sane */
 	new_bh->b_state = 0;
 	init_buffer(new_bh, NULL, NULL);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index e214d68..43241c0 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -83,30 +83,39 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
  * transaction's buffer credits.
  */
 
-static int start_this_handle(journal_t *journal, handle_t *handle)
+static int start_this_handle(journal_t *journal, handle_t *handle,
+			     int gfp_mask)
 {
 	transaction_t *transaction;
 	int needed;
 	int nblocks = handle->h_buffer_credits;
 	transaction_t *new_transaction = NULL;
-	int ret = 0;
 	unsigned long ts = jiffies;
 
 	if (nblocks > journal->j_max_transaction_buffers) {
 		printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
 		       current->comm, nblocks,
 		       journal->j_max_transaction_buffers);
-		ret = -ENOSPC;
-		goto out;
+		return -ENOSPC;
 	}
 
 alloc_transaction:
 	if (!journal->j_running_transaction) {
-		new_transaction = kzalloc(sizeof(*new_transaction),
-						GFP_NOFS|__GFP_NOFAIL);
+	retry_alloc:
+		new_transaction = kzalloc(sizeof(*new_transaction), gfp_mask);
 		if (!new_transaction) {
-			ret = -ENOMEM;
-			goto out;
+			/*
+			 * If __GFP_FS is not present, then we may be
+			 * being called from inside the fs writeback
+			 * layer, so we MUST NOT fail.  Since
+			 * __GFP_NOFAIL is going away, we will arrange
+			 * to retry the allocation ourselves.
+			 */
+			if ((gfp & __GFP_FS) == 0) {
+				congestion_wait(BLK_RW_ASYNC, HZ/50);
+				goto retry_alloc;
+			}
+			return -ENOMEM;
 		}
 	}
 
@@ -123,8 +132,8 @@ repeat_locked:
 	if (is_journal_aborted(journal) ||
 	    (journal->j_errno != 0 && !(journal->j_flags & JBD2_ACK_ERR))) {
 		spin_unlock(&journal->j_state_lock);
-		ret = -EROFS;
-		goto out;
+		kfree(new_transaction);
+		return -EROFS;
 	}
 
 	/* Wait on the journal's transaction barrier if necessary */
@@ -240,10 +249,8 @@ repeat_locked:
 	spin_unlock(&journal->j_state_lock);
 
 	lock_map_acquire(&handle->h_lockdep_map);
-out:
-	if (unlikely(new_transaction))		/* It's usually NULL */
-		kfree(new_transaction);
-	return ret;
+	kfree(new_transaction);
+	return 0;
 }
 
 static struct lock_class_key jbd2_handle_key;
@@ -278,7 +285,7 @@ static handle_t *new_handle(int nblocks)
  *
  * Return a pointer to a newly allocated handle, or NULL on failure
  */
-handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
+handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int gfp_mask)
 {
 	handle_t *handle = journal_current_handle();
 	int err;
@@ -298,7 +305,7 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
 
 	current->journal_info = handle;
 
-	err = start_this_handle(journal, handle);
+	err = start_this_handle(journal, handle, GFP_NOFS);
 	if (err < 0) {
 		jbd2_free_handle(handle);
 		current->journal_info = NULL;
@@ -308,6 +315,15 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
 out:
 	return handle;
 }
+EXPORT_SYMBOL(jbd2__journal_start);
+
+
+handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
+{
+	return jbd2__journal_start(journal, nblocks, GFP_NOFS);
+}
+EXPORT_SYMBOL(jbd2_journal_start);
+
 
 /**
  * int jbd2_journal_extend() - extend buffer credits.
@@ -394,8 +410,7 @@ out:
  * transaction capabable of guaranteeing the requested number of
  * credits.
  */
-
-int jbd2_journal_restart(handle_t *handle, int nblocks)
+int jbd2__journal_restart(handle_t *handle, int nblocks, int gfp_mask)
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal = transaction->t_journal;
@@ -428,10 +443,17 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
 
 	lock_map_release(&handle->h_lockdep_map);
 	handle->h_buffer_credits = nblocks;
-	ret = start_this_handle(journal, handle);
+	ret = start_this_handle(journal, handle, GFP_NOFS);
 	return ret;
 }
+EXPORT_SYMBOL(jbd2__journal_restart);
+
 
+int jbd2_journal_restart(handle_t *handle, int nblocks)
+{
+	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
+}
+EXPORT_SYMBOL(jbd2_journal_restart);
 
 /**
  * void jbd2_journal_lock_updates () - establish a transaction barrier.
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index a4d2e9f..5a72bc7 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1081,7 +1081,9 @@ static inline handle_t *journal_current_handle(void)
  */
 
 extern handle_t *jbd2_journal_start(journal_t *, int nblocks);
-extern int	 jbd2_journal_restart (handle_t *, int nblocks);
+extern handle_t *jbd2__journal_start(journal_t *, int nblocks, int gfp_mask);
+extern int	 jbd2_journal_restart(handle_t *, int nblocks);
+extern int	 jbd2__journal_restart(handle_t *, int nblocks, int gfp_mask);
 extern int	 jbd2_journal_extend (handle_t *, int nblocks);
 extern int	 jbd2_journal_get_write_access(handle_t *, struct buffer_head *);
 extern int	 jbd2_journal_get_create_access (handle_t *, struct buffer_head *);
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
@ 2010-07-23 14:10             ` Ted Ts'o
  0 siblings, 0 replies; 41+ messages in thread
From: Ted Ts'o @ 2010-07-23 14:10 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm,
	linux-ext4

On Thu, Jul 22, 2010 at 04:24:23PM -0700, David Rientjes wrote:
> 
> I didn't think about converting the existing GFP_NOFS | __GFP_NOFAIL 
> callers into the do-while loop above until you mentioned it, thanks.  I'll 
> send patches to do that shortly.

Here's what I'm planning on queueing for the next merge window, along
with patches to ext4 to use jbd2__journal_start(..., GFP_KERNEL) in
places where we can afford to fail.  After doing some analysis, the
places where we can afford to fail are also the places where we can
use GFP_KERNEL instead of GFP_NOFS, so conveniently, I'm using the
lack of __GFP_FS to indicate that we should do the retry loop in
start_this_handle().  I also added the congestion_wait() call since
there's no point busy-looping the CPU while we're waiting for pages to
get swapped or paged out.

Comments would be appreciated.

    	      	    	    	      	     - Ted

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-23 14:10             ` Ted Ts'o
@ 2010-07-23 14:57               ` Jan Kara
  -1 siblings, 0 replies; 41+ messages in thread
From: Jan Kara @ 2010-07-23 14:57 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: David Rientjes, Jan Kara, Andrew Morton, Andreas Dilger,
	Jiri Kosina, linux-mm, linux-ext4

On Fri 23-07-10 10:10:54, Ted Ts'o wrote:
> On Thu, Jul 22, 2010 at 04:24:23PM -0700, David Rientjes wrote:
> > 
> > I didn't think about converting the existing GFP_NOFS | __GFP_NOFAIL 
> > callers into the do-while loop above until you mentioned it, thanks.  I'll 
> > send patches to do that shortly.
...
> From 814be805d5e3d12343e590631ff9bc2d65c8f60a Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 23 Jul 2010 10:06:53 -0400
> Subject: [PATCH] jbd2: Remove __GFP_NOFAIL from jbd2 layer
> 
> __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> mask, so that file systems can optionally (re)start transaction
> handles using GFP_KERNEL.  If they do this, then they need to be
> prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> to reflect that error up to userspace.
> 
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  fs/jbd2/journal.c     |   14 +++++++++--
>  fs/jbd2/transaction.c |   60 +++++++++++++++++++++++++++++++++---------------
>  include/linux/jbd2.h  |    4 ++-
>  3 files changed, 55 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index e214d68..43241c0 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -83,30 +83,39 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
>   * transaction's buffer credits.
>   */
>  
> -static int start_this_handle(journal_t *journal, handle_t *handle)
> +static int start_this_handle(journal_t *journal, handle_t *handle,
> +			     int gfp_mask)
>  {
>  	transaction_t *transaction;
>  	int needed;
>  	int nblocks = handle->h_buffer_credits;
>  	transaction_t *new_transaction = NULL;
> -	int ret = 0;
>  	unsigned long ts = jiffies;
>  
>  	if (nblocks > journal->j_max_transaction_buffers) {
>  		printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
>  		       current->comm, nblocks,
>  		       journal->j_max_transaction_buffers);
> -		ret = -ENOSPC;
> -		goto out;
> +		return -ENOSPC;
>  	}
>  
>  alloc_transaction:
>  	if (!journal->j_running_transaction) {
> -		new_transaction = kzalloc(sizeof(*new_transaction),
> -						GFP_NOFS|__GFP_NOFAIL);
> +	retry_alloc:
> +		new_transaction = kzalloc(sizeof(*new_transaction), gfp_mask);
>  		if (!new_transaction) {
> -			ret = -ENOMEM;
> -			goto out;
> +			/*
> +			 * If __GFP_FS is not present, then we may be
> +			 * being called from inside the fs writeback
> +			 * layer, so we MUST NOT fail.  Since
> +			 * __GFP_NOFAIL is going away, we will arrange
> +			 * to retry the allocation ourselves.
> +			 */
> +			if ((gfp & __GFP_FS) == 0) {
> +				congestion_wait(BLK_RW_ASYNC, HZ/50);
> +				goto retry_alloc;
    You could as well go to alloc_transaction label above...

> +			}
> +			return -ENOMEM;
>  		}
>  	}
>  
...
> @@ -278,7 +285,7 @@ static handle_t *new_handle(int nblocks)
>   *
>   * Return a pointer to a newly allocated handle, or NULL on failure
>   */
> -handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int gfp_mask)
>  {
>  	handle_t *handle = journal_current_handle();
>  	int err;
> @@ -298,7 +305,7 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  
>  	current->journal_info = handle;
>  
> -	err = start_this_handle(journal, handle);
> +	err = start_this_handle(journal, handle, GFP_NOFS);
  Here you want to use gfp_mask I guess.

>  	if (err < 0) {
>  		jbd2_free_handle(handle);
>  		current->journal_info = NULL;
> @@ -308,6 +315,15 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  out:
>  	return handle;
>  }
> +EXPORT_SYMBOL(jbd2__journal_start);
> +
> +
> +handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +{
> +	return jbd2__journal_start(journal, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_start);
> +
>  
>  /**
>   * int jbd2_journal_extend() - extend buffer credits.
> @@ -394,8 +410,7 @@ out:
>   * transaction capabable of guaranteeing the requested number of
>   * credits.
>   */
> -
> -int jbd2_journal_restart(handle_t *handle, int nblocks)
> +int jbd2__journal_restart(handle_t *handle, int nblocks, int gfp_mask)
>  {
>  	transaction_t *transaction = handle->h_transaction;
>  	journal_t *journal = transaction->t_journal;
> @@ -428,10 +443,17 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
>  
>  	lock_map_release(&handle->h_lockdep_map);
>  	handle->h_buffer_credits = nblocks;
> -	ret = start_this_handle(journal, handle);
> +	ret = start_this_handle(journal, handle, GFP_NOFS);
  And here you want to use gfp_mask as well.

>  	return ret;
>  }
> +EXPORT_SYMBOL(jbd2__journal_restart);
> +
>  
> +int jbd2_journal_restart(handle_t *handle, int nblocks)
> +{
> +	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_restart);
>  
>  /**
>   * void jbd2_journal_lock_updates () - establish a transaction barrier.


								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
@ 2010-07-23 14:57               ` Jan Kara
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Kara @ 2010-07-23 14:57 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: David Rientjes, Jan Kara, Andrew Morton, Andreas Dilger,
	Jiri Kosina, linux-mm, linux-ext4

On Fri 23-07-10 10:10:54, Ted Ts'o wrote:
> On Thu, Jul 22, 2010 at 04:24:23PM -0700, David Rientjes wrote:
> > 
> > I didn't think about converting the existing GFP_NOFS | __GFP_NOFAIL 
> > callers into the do-while loop above until you mentioned it, thanks.  I'll 
> > send patches to do that shortly.
...
> From 814be805d5e3d12343e590631ff9bc2d65c8f60a Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 23 Jul 2010 10:06:53 -0400
> Subject: [PATCH] jbd2: Remove __GFP_NOFAIL from jbd2 layer
> 
> __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> mask, so that file systems can optionally (re)start transaction
> handles using GFP_KERNEL.  If they do this, then they need to be
> prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> to reflect that error up to userspace.
> 
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  fs/jbd2/journal.c     |   14 +++++++++--
>  fs/jbd2/transaction.c |   60 +++++++++++++++++++++++++++++++++---------------
>  include/linux/jbd2.h  |    4 ++-
>  3 files changed, 55 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index e214d68..43241c0 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -83,30 +83,39 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
>   * transaction's buffer credits.
>   */
>  
> -static int start_this_handle(journal_t *journal, handle_t *handle)
> +static int start_this_handle(journal_t *journal, handle_t *handle,
> +			     int gfp_mask)
>  {
>  	transaction_t *transaction;
>  	int needed;
>  	int nblocks = handle->h_buffer_credits;
>  	transaction_t *new_transaction = NULL;
> -	int ret = 0;
>  	unsigned long ts = jiffies;
>  
>  	if (nblocks > journal->j_max_transaction_buffers) {
>  		printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
>  		       current->comm, nblocks,
>  		       journal->j_max_transaction_buffers);
> -		ret = -ENOSPC;
> -		goto out;
> +		return -ENOSPC;
>  	}
>  
>  alloc_transaction:
>  	if (!journal->j_running_transaction) {
> -		new_transaction = kzalloc(sizeof(*new_transaction),
> -						GFP_NOFS|__GFP_NOFAIL);
> +	retry_alloc:
> +		new_transaction = kzalloc(sizeof(*new_transaction), gfp_mask);
>  		if (!new_transaction) {
> -			ret = -ENOMEM;
> -			goto out;
> +			/*
> +			 * If __GFP_FS is not present, then we may be
> +			 * being called from inside the fs writeback
> +			 * layer, so we MUST NOT fail.  Since
> +			 * __GFP_NOFAIL is going away, we will arrange
> +			 * to retry the allocation ourselves.
> +			 */
> +			if ((gfp & __GFP_FS) == 0) {
> +				congestion_wait(BLK_RW_ASYNC, HZ/50);
> +				goto retry_alloc;
    You could as well go to alloc_transaction label above...

> +			}
> +			return -ENOMEM;
>  		}
>  	}
>  
...
> @@ -278,7 +285,7 @@ static handle_t *new_handle(int nblocks)
>   *
>   * Return a pointer to a newly allocated handle, or NULL on failure
>   */
> -handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int gfp_mask)
>  {
>  	handle_t *handle = journal_current_handle();
>  	int err;
> @@ -298,7 +305,7 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  
>  	current->journal_info = handle;
>  
> -	err = start_this_handle(journal, handle);
> +	err = start_this_handle(journal, handle, GFP_NOFS);
  Here you want to use gfp_mask I guess.

>  	if (err < 0) {
>  		jbd2_free_handle(handle);
>  		current->journal_info = NULL;
> @@ -308,6 +315,15 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  out:
>  	return handle;
>  }
> +EXPORT_SYMBOL(jbd2__journal_start);
> +
> +
> +handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +{
> +	return jbd2__journal_start(journal, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_start);
> +
>  
>  /**
>   * int jbd2_journal_extend() - extend buffer credits.
> @@ -394,8 +410,7 @@ out:
>   * transaction capabable of guaranteeing the requested number of
>   * credits.
>   */
> -
> -int jbd2_journal_restart(handle_t *handle, int nblocks)
> +int jbd2__journal_restart(handle_t *handle, int nblocks, int gfp_mask)
>  {
>  	transaction_t *transaction = handle->h_transaction;
>  	journal_t *journal = transaction->t_journal;
> @@ -428,10 +443,17 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
>  
>  	lock_map_release(&handle->h_lockdep_map);
>  	handle->h_buffer_credits = nblocks;
> -	ret = start_this_handle(journal, handle);
> +	ret = start_this_handle(journal, handle, GFP_NOFS);
  And here you want to use gfp_mask as well.

>  	return ret;
>  }
> +EXPORT_SYMBOL(jbd2__journal_restart);
> +
>  
> +int jbd2_journal_restart(handle_t *handle, int nblocks)
> +{
> +	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_restart);
>  
>  /**
>   * void jbd2_journal_lock_updates () - establish a transaction barrier.


								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-23 14:57               ` Jan Kara
@ 2010-07-23 15:05                 ` Ted Ts'o
  -1 siblings, 0 replies; 41+ messages in thread
From: Ted Ts'o @ 2010-07-23 15:05 UTC (permalink / raw)
  To: Jan Kara
  Cc: David Rientjes, Andrew Morton, Andreas Dilger, Jiri Kosina,
	linux-mm, linux-ext4

Yeah, oops.  Nice catches.  I also hadn't done a test compile, so
there were some missing #include's.

So once more, this time with feeling...

					- Ted

>From d24408e1b50e47b21b7d2ec5857b710e9b752dc9 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Fri, 23 Jul 2010 11:03:45 -0400
Subject: [PATCH] jbd2: Remove __GFP_NOFAIL from jbd2 layer

__GFP_NOFAIL is going away, so add our own retry loop.  Also add
jbd2__journal_start() and jbd2__journal_restart() which take a gfp
mask, so that file systems can optionally (re)start transaction
handles using GFP_KERNEL.  If they do this, then they need to be
prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
to reflect that error up to userspace.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/jbd2/journal.c     |   15 +++++++++--
 fs/jbd2/transaction.c |   61 +++++++++++++++++++++++++++++++++---------------
 include/linux/jbd2.h  |    4 ++-
 3 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index f7bf157..a79d334 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -41,6 +41,7 @@
 #include <linux/hash.h>
 #include <linux/log2.h>
 #include <linux/vmalloc.h>
+#include <linux/backing-dev.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/jbd2.h>
@@ -48,8 +49,6 @@
 #include <asm/uaccess.h>
 #include <asm/page.h>
 
-EXPORT_SYMBOL(jbd2_journal_start);
-EXPORT_SYMBOL(jbd2_journal_restart);
 EXPORT_SYMBOL(jbd2_journal_extend);
 EXPORT_SYMBOL(jbd2_journal_stop);
 EXPORT_SYMBOL(jbd2_journal_lock_updates);
@@ -311,7 +310,17 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
 	 */
 	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
 
-	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
+retry_alloc:
+	new_bh = alloc_buffer_head(GFP_NOFS);
+	if (!new_bh) {
+		/*
+		 * Failure is not an option, but __GFP_NOFAIL is going
+		 * away; so we retry ourselves here.
+		 */
+		congestion_wait(BLK_RW_ASYNC, HZ/50);
+		goto retry_alloc;
+	}
+
 	/* keep subsequent assertions sane */
 	new_bh->b_state = 0;
 	init_buffer(new_bh, NULL, NULL);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index e214d68..001e95f 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -26,6 +26,8 @@
 #include <linux/mm.h>
 #include <linux/highmem.h>
 #include <linux/hrtimer.h>
+#include <linux/backing-dev.h>
+#include <linux/module.h>
 
 static void __jbd2_journal_temp_unlink_buffer(struct journal_head *jh);
 
@@ -83,30 +85,38 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
  * transaction's buffer credits.
  */
 
-static int start_this_handle(journal_t *journal, handle_t *handle)
+static int start_this_handle(journal_t *journal, handle_t *handle,
+			     int gfp_mask)
 {
 	transaction_t *transaction;
 	int needed;
 	int nblocks = handle->h_buffer_credits;
 	transaction_t *new_transaction = NULL;
-	int ret = 0;
 	unsigned long ts = jiffies;
 
 	if (nblocks > journal->j_max_transaction_buffers) {
 		printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
 		       current->comm, nblocks,
 		       journal->j_max_transaction_buffers);
-		ret = -ENOSPC;
-		goto out;
+		return -ENOSPC;
 	}
 
 alloc_transaction:
 	if (!journal->j_running_transaction) {
-		new_transaction = kzalloc(sizeof(*new_transaction),
-						GFP_NOFS|__GFP_NOFAIL);
+		new_transaction = kzalloc(sizeof(*new_transaction), gfp_mask);
 		if (!new_transaction) {
-			ret = -ENOMEM;
-			goto out;
+			/*
+			 * If __GFP_FS is not present, then we may be
+			 * being called from inside the fs writeback
+			 * layer, so we MUST NOT fail.  Since
+			 * __GFP_NOFAIL is going away, we will arrange
+			 * to retry the allocation ourselves.
+			 */
+			if ((gfp_mask & __GFP_FS) == 0) {
+				congestion_wait(BLK_RW_ASYNC, HZ/50);
+				goto alloc_transaction;
+			}
+			return -ENOMEM;
 		}
 	}
 
@@ -123,8 +133,8 @@ repeat_locked:
 	if (is_journal_aborted(journal) ||
 	    (journal->j_errno != 0 && !(journal->j_flags & JBD2_ACK_ERR))) {
 		spin_unlock(&journal->j_state_lock);
-		ret = -EROFS;
-		goto out;
+		kfree(new_transaction);
+		return -EROFS;
 	}
 
 	/* Wait on the journal's transaction barrier if necessary */
@@ -240,10 +250,8 @@ repeat_locked:
 	spin_unlock(&journal->j_state_lock);
 
 	lock_map_acquire(&handle->h_lockdep_map);
-out:
-	if (unlikely(new_transaction))		/* It's usually NULL */
-		kfree(new_transaction);
-	return ret;
+	kfree(new_transaction);
+	return 0;
 }
 
 static struct lock_class_key jbd2_handle_key;
@@ -278,7 +286,7 @@ static handle_t *new_handle(int nblocks)
  *
  * Return a pointer to a newly allocated handle, or NULL on failure
  */
-handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
+handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int gfp_mask)
 {
 	handle_t *handle = journal_current_handle();
 	int err;
@@ -298,7 +306,7 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
 
 	current->journal_info = handle;
 
-	err = start_this_handle(journal, handle);
+	err = start_this_handle(journal, handle, gfp_mask);
 	if (err < 0) {
 		jbd2_free_handle(handle);
 		current->journal_info = NULL;
@@ -308,6 +316,15 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
 out:
 	return handle;
 }
+EXPORT_SYMBOL(jbd2__journal_start);
+
+
+handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
+{
+	return jbd2__journal_start(journal, nblocks, GFP_NOFS);
+}
+EXPORT_SYMBOL(jbd2_journal_start);
+
 
 /**
  * int jbd2_journal_extend() - extend buffer credits.
@@ -394,8 +411,7 @@ out:
  * transaction capabable of guaranteeing the requested number of
  * credits.
  */
-
-int jbd2_journal_restart(handle_t *handle, int nblocks)
+int jbd2__journal_restart(handle_t *handle, int nblocks, int gfp_mask)
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal = transaction->t_journal;
@@ -428,10 +444,17 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
 
 	lock_map_release(&handle->h_lockdep_map);
 	handle->h_buffer_credits = nblocks;
-	ret = start_this_handle(journal, handle);
+	ret = start_this_handle(journal, handle, gfp_mask);
 	return ret;
 }
+EXPORT_SYMBOL(jbd2__journal_restart);
+
 
+int jbd2_journal_restart(handle_t *handle, int nblocks)
+{
+	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
+}
+EXPORT_SYMBOL(jbd2_journal_restart);
 
 /**
  * void jbd2_journal_lock_updates () - establish a transaction barrier.
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index a4d2e9f..5a72bc7 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1081,7 +1081,9 @@ static inline handle_t *journal_current_handle(void)
  */
 
 extern handle_t *jbd2_journal_start(journal_t *, int nblocks);
-extern int	 jbd2_journal_restart (handle_t *, int nblocks);
+extern handle_t *jbd2__journal_start(journal_t *, int nblocks, int gfp_mask);
+extern int	 jbd2_journal_restart(handle_t *, int nblocks);
+extern int	 jbd2__journal_restart(handle_t *, int nblocks, int gfp_mask);
 extern int	 jbd2_journal_extend (handle_t *, int nblocks);
 extern int	 jbd2_journal_get_write_access(handle_t *, struct buffer_head *);
 extern int	 jbd2_journal_get_create_access (handle_t *, struct buffer_head *);
-- 
1.7.0.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
@ 2010-07-23 15:05                 ` Ted Ts'o
  0 siblings, 0 replies; 41+ messages in thread
From: Ted Ts'o @ 2010-07-23 15:05 UTC (permalink / raw)
  To: Jan Kara
  Cc: David Rientjes, Andrew Morton, Andreas Dilger, Jiri Kosina,
	linux-mm, linux-ext4

Yeah, oops.  Nice catches.  I also hadn't done a test compile, so
there were some missing #include's.

So once more, this time with feeling...

					- Ted

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-23 15:05                 ` Ted Ts'o
@ 2010-07-23 15:32                   ` Jan Kara
  -1 siblings, 0 replies; 41+ messages in thread
From: Jan Kara @ 2010-07-23 15:32 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Jan Kara, David Rientjes, Andrew Morton, Andreas Dilger,
	Jiri Kosina, linux-mm, linux-ext4

On Fri 23-07-10 11:05:43, Ted Ts'o wrote:
> Yeah, oops.  Nice catches.  I also hadn't done a test compile, so
> there were some missing #include's.
> 
> So once more, this time with feeling...
> 
> 					- Ted
> 
> From d24408e1b50e47b21b7d2ec5857b710e9b752dc9 Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 23 Jul 2010 11:03:45 -0400
> Subject: [PATCH] jbd2: Remove __GFP_NOFAIL from jbd2 layer
> 
> __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> mask, so that file systems can optionally (re)start transaction
> handles using GFP_KERNEL.  If they do this, then they need to be
> prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> to reflect that error up to userspace.
> 
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  Now the patch looks good.
Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/journal.c     |   15 +++++++++--
>  fs/jbd2/transaction.c |   61 +++++++++++++++++++++++++++++++++---------------
>  include/linux/jbd2.h  |    4 ++-
>  3 files changed, 57 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index f7bf157..a79d334 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -41,6 +41,7 @@
>  #include <linux/hash.h>
>  #include <linux/log2.h>
>  #include <linux/vmalloc.h>
> +#include <linux/backing-dev.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/jbd2.h>
> @@ -48,8 +49,6 @@
>  #include <asm/uaccess.h>
>  #include <asm/page.h>
>  
> -EXPORT_SYMBOL(jbd2_journal_start);
> -EXPORT_SYMBOL(jbd2_journal_restart);
>  EXPORT_SYMBOL(jbd2_journal_extend);
>  EXPORT_SYMBOL(jbd2_journal_stop);
>  EXPORT_SYMBOL(jbd2_journal_lock_updates);
> @@ -311,7 +310,17 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
>  	 */
>  	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
>  
> -	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
> +retry_alloc:
> +	new_bh = alloc_buffer_head(GFP_NOFS);
> +	if (!new_bh) {
> +		/*
> +		 * Failure is not an option, but __GFP_NOFAIL is going
> +		 * away; so we retry ourselves here.
> +		 */
> +		congestion_wait(BLK_RW_ASYNC, HZ/50);
> +		goto retry_alloc;
> +	}
> +
>  	/* keep subsequent assertions sane */
>  	new_bh->b_state = 0;
>  	init_buffer(new_bh, NULL, NULL);
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index e214d68..001e95f 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -26,6 +26,8 @@
>  #include <linux/mm.h>
>  #include <linux/highmem.h>
>  #include <linux/hrtimer.h>
> +#include <linux/backing-dev.h>
> +#include <linux/module.h>
>  
>  static void __jbd2_journal_temp_unlink_buffer(struct journal_head *jh);
>  
> @@ -83,30 +85,38 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
>   * transaction's buffer credits.
>   */
>  
> -static int start_this_handle(journal_t *journal, handle_t *handle)
> +static int start_this_handle(journal_t *journal, handle_t *handle,
> +			     int gfp_mask)
>  {
>  	transaction_t *transaction;
>  	int needed;
>  	int nblocks = handle->h_buffer_credits;
>  	transaction_t *new_transaction = NULL;
> -	int ret = 0;
>  	unsigned long ts = jiffies;
>  
>  	if (nblocks > journal->j_max_transaction_buffers) {
>  		printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
>  		       current->comm, nblocks,
>  		       journal->j_max_transaction_buffers);
> -		ret = -ENOSPC;
> -		goto out;
> +		return -ENOSPC;
>  	}
>  
>  alloc_transaction:
>  	if (!journal->j_running_transaction) {
> -		new_transaction = kzalloc(sizeof(*new_transaction),
> -						GFP_NOFS|__GFP_NOFAIL);
> +		new_transaction = kzalloc(sizeof(*new_transaction), gfp_mask);
>  		if (!new_transaction) {
> -			ret = -ENOMEM;
> -			goto out;
> +			/*
> +			 * If __GFP_FS is not present, then we may be
> +			 * being called from inside the fs writeback
> +			 * layer, so we MUST NOT fail.  Since
> +			 * __GFP_NOFAIL is going away, we will arrange
> +			 * to retry the allocation ourselves.
> +			 */
> +			if ((gfp_mask & __GFP_FS) == 0) {
> +				congestion_wait(BLK_RW_ASYNC, HZ/50);
> +				goto alloc_transaction;
> +			}
> +			return -ENOMEM;
>  		}
>  	}
>  
> @@ -123,8 +133,8 @@ repeat_locked:
>  	if (is_journal_aborted(journal) ||
>  	    (journal->j_errno != 0 && !(journal->j_flags & JBD2_ACK_ERR))) {
>  		spin_unlock(&journal->j_state_lock);
> -		ret = -EROFS;
> -		goto out;
> +		kfree(new_transaction);
> +		return -EROFS;
>  	}
>  
>  	/* Wait on the journal's transaction barrier if necessary */
> @@ -240,10 +250,8 @@ repeat_locked:
>  	spin_unlock(&journal->j_state_lock);
>  
>  	lock_map_acquire(&handle->h_lockdep_map);
> -out:
> -	if (unlikely(new_transaction))		/* It's usually NULL */
> -		kfree(new_transaction);
> -	return ret;
> +	kfree(new_transaction);
> +	return 0;
>  }
>  
>  static struct lock_class_key jbd2_handle_key;
> @@ -278,7 +286,7 @@ static handle_t *new_handle(int nblocks)
>   *
>   * Return a pointer to a newly allocated handle, or NULL on failure
>   */
> -handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int gfp_mask)
>  {
>  	handle_t *handle = journal_current_handle();
>  	int err;
> @@ -298,7 +306,7 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  
>  	current->journal_info = handle;
>  
> -	err = start_this_handle(journal, handle);
> +	err = start_this_handle(journal, handle, gfp_mask);
>  	if (err < 0) {
>  		jbd2_free_handle(handle);
>  		current->journal_info = NULL;
> @@ -308,6 +316,15 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  out:
>  	return handle;
>  }
> +EXPORT_SYMBOL(jbd2__journal_start);
> +
> +
> +handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +{
> +	return jbd2__journal_start(journal, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_start);
> +
>  
>  /**
>   * int jbd2_journal_extend() - extend buffer credits.
> @@ -394,8 +411,7 @@ out:
>   * transaction capabable of guaranteeing the requested number of
>   * credits.
>   */
> -
> -int jbd2_journal_restart(handle_t *handle, int nblocks)
> +int jbd2__journal_restart(handle_t *handle, int nblocks, int gfp_mask)
>  {
>  	transaction_t *transaction = handle->h_transaction;
>  	journal_t *journal = transaction->t_journal;
> @@ -428,10 +444,17 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
>  
>  	lock_map_release(&handle->h_lockdep_map);
>  	handle->h_buffer_credits = nblocks;
> -	ret = start_this_handle(journal, handle);
> +	ret = start_this_handle(journal, handle, gfp_mask);
>  	return ret;
>  }
> +EXPORT_SYMBOL(jbd2__journal_restart);
> +
>  
> +int jbd2_journal_restart(handle_t *handle, int nblocks)
> +{
> +	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_restart);
>  
>  /**
>   * void jbd2_journal_lock_updates () - establish a transaction barrier.
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index a4d2e9f..5a72bc7 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1081,7 +1081,9 @@ static inline handle_t *journal_current_handle(void)
>   */
>  
>  extern handle_t *jbd2_journal_start(journal_t *, int nblocks);
> -extern int	 jbd2_journal_restart (handle_t *, int nblocks);
> +extern handle_t *jbd2__journal_start(journal_t *, int nblocks, int gfp_mask);
> +extern int	 jbd2_journal_restart(handle_t *, int nblocks);
> +extern int	 jbd2__journal_restart(handle_t *, int nblocks, int gfp_mask);
>  extern int	 jbd2_journal_extend (handle_t *, int nblocks);
>  extern int	 jbd2_journal_get_write_access(handle_t *, struct buffer_head *);
>  extern int	 jbd2_journal_get_create_access (handle_t *, struct buffer_head *);
> -- 
> 1.7.0.4
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
@ 2010-07-23 15:32                   ` Jan Kara
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Kara @ 2010-07-23 15:32 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Jan Kara, David Rientjes, Andrew Morton, Andreas Dilger,
	Jiri Kosina, linux-mm, linux-ext4

On Fri 23-07-10 11:05:43, Ted Ts'o wrote:
> Yeah, oops.  Nice catches.  I also hadn't done a test compile, so
> there were some missing #include's.
> 
> So once more, this time with feeling...
> 
> 					- Ted
> 
> From d24408e1b50e47b21b7d2ec5857b710e9b752dc9 Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 23 Jul 2010 11:03:45 -0400
> Subject: [PATCH] jbd2: Remove __GFP_NOFAIL from jbd2 layer
> 
> __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> mask, so that file systems can optionally (re)start transaction
> handles using GFP_KERNEL.  If they do this, then they need to be
> prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> to reflect that error up to userspace.
> 
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  Now the patch looks good.
Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/journal.c     |   15 +++++++++--
>  fs/jbd2/transaction.c |   61 +++++++++++++++++++++++++++++++++---------------
>  include/linux/jbd2.h  |    4 ++-
>  3 files changed, 57 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index f7bf157..a79d334 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -41,6 +41,7 @@
>  #include <linux/hash.h>
>  #include <linux/log2.h>
>  #include <linux/vmalloc.h>
> +#include <linux/backing-dev.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/jbd2.h>
> @@ -48,8 +49,6 @@
>  #include <asm/uaccess.h>
>  #include <asm/page.h>
>  
> -EXPORT_SYMBOL(jbd2_journal_start);
> -EXPORT_SYMBOL(jbd2_journal_restart);
>  EXPORT_SYMBOL(jbd2_journal_extend);
>  EXPORT_SYMBOL(jbd2_journal_stop);
>  EXPORT_SYMBOL(jbd2_journal_lock_updates);
> @@ -311,7 +310,17 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
>  	 */
>  	J_ASSERT_BH(bh_in, buffer_jbddirty(bh_in));
>  
> -	new_bh = alloc_buffer_head(GFP_NOFS|__GFP_NOFAIL);
> +retry_alloc:
> +	new_bh = alloc_buffer_head(GFP_NOFS);
> +	if (!new_bh) {
> +		/*
> +		 * Failure is not an option, but __GFP_NOFAIL is going
> +		 * away; so we retry ourselves here.
> +		 */
> +		congestion_wait(BLK_RW_ASYNC, HZ/50);
> +		goto retry_alloc;
> +	}
> +
>  	/* keep subsequent assertions sane */
>  	new_bh->b_state = 0;
>  	init_buffer(new_bh, NULL, NULL);
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index e214d68..001e95f 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -26,6 +26,8 @@
>  #include <linux/mm.h>
>  #include <linux/highmem.h>
>  #include <linux/hrtimer.h>
> +#include <linux/backing-dev.h>
> +#include <linux/module.h>
>  
>  static void __jbd2_journal_temp_unlink_buffer(struct journal_head *jh);
>  
> @@ -83,30 +85,38 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
>   * transaction's buffer credits.
>   */
>  
> -static int start_this_handle(journal_t *journal, handle_t *handle)
> +static int start_this_handle(journal_t *journal, handle_t *handle,
> +			     int gfp_mask)
>  {
>  	transaction_t *transaction;
>  	int needed;
>  	int nblocks = handle->h_buffer_credits;
>  	transaction_t *new_transaction = NULL;
> -	int ret = 0;
>  	unsigned long ts = jiffies;
>  
>  	if (nblocks > journal->j_max_transaction_buffers) {
>  		printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
>  		       current->comm, nblocks,
>  		       journal->j_max_transaction_buffers);
> -		ret = -ENOSPC;
> -		goto out;
> +		return -ENOSPC;
>  	}
>  
>  alloc_transaction:
>  	if (!journal->j_running_transaction) {
> -		new_transaction = kzalloc(sizeof(*new_transaction),
> -						GFP_NOFS|__GFP_NOFAIL);
> +		new_transaction = kzalloc(sizeof(*new_transaction), gfp_mask);
>  		if (!new_transaction) {
> -			ret = -ENOMEM;
> -			goto out;
> +			/*
> +			 * If __GFP_FS is not present, then we may be
> +			 * being called from inside the fs writeback
> +			 * layer, so we MUST NOT fail.  Since
> +			 * __GFP_NOFAIL is going away, we will arrange
> +			 * to retry the allocation ourselves.
> +			 */
> +			if ((gfp_mask & __GFP_FS) == 0) {
> +				congestion_wait(BLK_RW_ASYNC, HZ/50);
> +				goto alloc_transaction;
> +			}
> +			return -ENOMEM;
>  		}
>  	}
>  
> @@ -123,8 +133,8 @@ repeat_locked:
>  	if (is_journal_aborted(journal) ||
>  	    (journal->j_errno != 0 && !(journal->j_flags & JBD2_ACK_ERR))) {
>  		spin_unlock(&journal->j_state_lock);
> -		ret = -EROFS;
> -		goto out;
> +		kfree(new_transaction);
> +		return -EROFS;
>  	}
>  
>  	/* Wait on the journal's transaction barrier if necessary */
> @@ -240,10 +250,8 @@ repeat_locked:
>  	spin_unlock(&journal->j_state_lock);
>  
>  	lock_map_acquire(&handle->h_lockdep_map);
> -out:
> -	if (unlikely(new_transaction))		/* It's usually NULL */
> -		kfree(new_transaction);
> -	return ret;
> +	kfree(new_transaction);
> +	return 0;
>  }
>  
>  static struct lock_class_key jbd2_handle_key;
> @@ -278,7 +286,7 @@ static handle_t *new_handle(int nblocks)
>   *
>   * Return a pointer to a newly allocated handle, or NULL on failure
>   */
> -handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int gfp_mask)
>  {
>  	handle_t *handle = journal_current_handle();
>  	int err;
> @@ -298,7 +306,7 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  
>  	current->journal_info = handle;
>  
> -	err = start_this_handle(journal, handle);
> +	err = start_this_handle(journal, handle, gfp_mask);
>  	if (err < 0) {
>  		jbd2_free_handle(handle);
>  		current->journal_info = NULL;
> @@ -308,6 +316,15 @@ handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
>  out:
>  	return handle;
>  }
> +EXPORT_SYMBOL(jbd2__journal_start);
> +
> +
> +handle_t *jbd2_journal_start(journal_t *journal, int nblocks)
> +{
> +	return jbd2__journal_start(journal, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_start);
> +
>  
>  /**
>   * int jbd2_journal_extend() - extend buffer credits.
> @@ -394,8 +411,7 @@ out:
>   * transaction capabable of guaranteeing the requested number of
>   * credits.
>   */
> -
> -int jbd2_journal_restart(handle_t *handle, int nblocks)
> +int jbd2__journal_restart(handle_t *handle, int nblocks, int gfp_mask)
>  {
>  	transaction_t *transaction = handle->h_transaction;
>  	journal_t *journal = transaction->t_journal;
> @@ -428,10 +444,17 @@ int jbd2_journal_restart(handle_t *handle, int nblocks)
>  
>  	lock_map_release(&handle->h_lockdep_map);
>  	handle->h_buffer_credits = nblocks;
> -	ret = start_this_handle(journal, handle);
> +	ret = start_this_handle(journal, handle, gfp_mask);
>  	return ret;
>  }
> +EXPORT_SYMBOL(jbd2__journal_restart);
> +
>  
> +int jbd2_journal_restart(handle_t *handle, int nblocks)
> +{
> +	return jbd2__journal_restart(handle, nblocks, GFP_NOFS);
> +}
> +EXPORT_SYMBOL(jbd2_journal_restart);
>  
>  /**
>   * void jbd2_journal_lock_updates () - establish a transaction barrier.
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index a4d2e9f..5a72bc7 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1081,7 +1081,9 @@ static inline handle_t *journal_current_handle(void)
>   */
>  
>  extern handle_t *jbd2_journal_start(journal_t *, int nblocks);
> -extern int	 jbd2_journal_restart (handle_t *, int nblocks);
> +extern handle_t *jbd2__journal_start(journal_t *, int nblocks, int gfp_mask);
> +extern int	 jbd2_journal_restart(handle_t *, int nblocks);
> +extern int	 jbd2__journal_restart(handle_t *, int nblocks, int gfp_mask);
>  extern int	 jbd2_journal_extend (handle_t *, int nblocks);
>  extern int	 jbd2_journal_get_write_access(handle_t *, struct buffer_head *);
>  extern int	 jbd2_journal_get_create_access (handle_t *, struct buffer_head *);
> -- 
> 1.7.0.4
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 3/6] fs: remove dependency on __GFP_NOFAIL
  2010-07-21  2:45   ` David Rientjes
@ 2010-07-23 19:36     ` Andrew Morton
  -1 siblings, 0 replies; 41+ messages in thread
From: Andrew Morton @ 2010-07-23 19:36 UTC (permalink / raw)
  To: David Rientjes; +Cc: Al Viro, Jens Axboe, linux-fsdevel, linux-mm

On Tue, 20 Jul 2010 19:45:00 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:

> The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
> from its mask.
> 
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  fs/bio-integrity.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
> --- a/fs/bio-integrity.c
> +++ b/fs/bio-integrity.c
> @@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
>  
>  	/* Allocate kernel buffer for protection data */
>  	len = sectors * blk_integrity_tuple_size(bi);
> -	buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
> +	buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
>  	if (unlikely(buf == NULL)) {
>  		printk(KERN_ERR "could not allocate integrity buffer\n");
>  		return -EIO;

                        ^^^  what?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 3/6] fs: remove dependency on __GFP_NOFAIL
@ 2010-07-23 19:36     ` Andrew Morton
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Morton @ 2010-07-23 19:36 UTC (permalink / raw)
  To: David Rientjes; +Cc: Al Viro, Jens Axboe, linux-fsdevel, linux-mm

On Tue, 20 Jul 2010 19:45:00 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:

> The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
> from its mask.
> 
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  fs/bio-integrity.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
> --- a/fs/bio-integrity.c
> +++ b/fs/bio-integrity.c
> @@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
>  
>  	/* Allocate kernel buffer for protection data */
>  	len = sectors * blk_integrity_tuple_size(bi);
> -	buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
> +	buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
>  	if (unlikely(buf == NULL)) {
>  		printk(KERN_ERR "could not allocate integrity buffer\n");
>  		return -EIO;

                        ^^^  what?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-23 15:05                 ` Ted Ts'o
  (?)
  (?)
@ 2010-07-23 19:40                 ` David Rientjes
  2010-07-23 19:52                     ` Ted Ts'o
  -1 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-07-23 19:40 UTC (permalink / raw)
  To: Ted Ts'o
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm,
	linux-ext4

On Fri, 23 Jul 2010, Ted Ts'o wrote:

> __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> mask, so that file systems can optionally (re)start transaction
> handles using GFP_KERNEL.  If they do this, then they need to be
> prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> to reflect that error up to userspace.
> 
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Acked-by: David Rientjes <rientjes@google.com>

Will you be pushing the equivalent patch for jbd?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 3/6] fs: remove dependency on __GFP_NOFAIL
  2010-07-23 19:36     ` Andrew Morton
  (?)
@ 2010-07-23 19:51     ` David Rientjes
  2010-07-23 21:12         ` Pekka Enberg
  -1 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-07-23 19:51 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Al Viro, Jens Axboe, linux-fsdevel, linux-mm

On Fri, 23 Jul 2010, Andrew Morton wrote:

> > The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
> > from its mask.
> > 
> > Cc: Jens Axboe <jens.axboe@oracle.com>
> > Signed-off-by: David Rientjes <rientjes@google.com>
> > ---
> >  fs/bio-integrity.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
> > --- a/fs/bio-integrity.c
> > +++ b/fs/bio-integrity.c
> > @@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
> >  
> >  	/* Allocate kernel buffer for protection data */
> >  	len = sectors * blk_integrity_tuple_size(bi);
> > -	buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
> > +	buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
> >  	if (unlikely(buf == NULL)) {
> >  		printk(KERN_ERR "could not allocate integrity buffer\n");
> >  		return -EIO;
> 
>                         ^^^  what?
> 

Right, I'm not sure why that decision was made, but it looks like it can 
be changed over to -ENOMEM without harming anything.  I'm concerned that 
the printk will spam the kernel log endlessly, though, if we're really oom 
and GFP_NOIO has no hope of freeing memory.  This code has never been 
active, so I'd like to wait for some feedback from Al and Jens (now with a 
corrected email address, jens.axboe@oracle.com bounced) to see if we want 
to return -ENOMEM, if the printk is really necessary, and if it would be 
better to just convert this to a loop with a congestion_wait() instead of 
returning from bio_integrity_prep().

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
  2010-07-23 19:40                 ` David Rientjes
@ 2010-07-23 19:52                     ` Ted Ts'o
  0 siblings, 0 replies; 41+ messages in thread
From: Ted Ts'o @ 2010-07-23 19:52 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm,
	linux-ext4

On Fri, Jul 23, 2010 at 12:40:50PM -0700, David Rientjes wrote:
> On Fri, 23 Jul 2010, Ted Ts'o wrote:
> 
> > __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> > jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> > mask, so that file systems can optionally (re)start transaction
> > handles using GFP_KERNEL.  If they do this, then they need to be
> > prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> > to reflect that error up to userspace.
> > 
> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> 
> Acked-by: David Rientjes <rientjes@google.com>
> 
> Will you be pushing the equivalent patch for jbd?

I imagine Jan (as the person who has been primarily handling ext3
patches) is waiting to see how invasive the patches are to ext4 before
deciding whether he's willing backport the equivalent changes to ext3
so that some of the calls that end up calling start_this_handle() end
up passing GFP_KERNEL when it's OK for them to fail --- since ext3 is
in maintenance mode, so we tend not to backport the more disruptive
patches to ext3.  It's really a cost/benefit tradeoff, really.

I am certain that some application programmers will complain when
their programs start breaking because they're not prepared to handle
an ENOMEM failure from certain system calls that have never failed
that way before.  (I should introduce you to some Japanese enterprise
programmers who believe that if a system call ever starts returning an
error code not documented in a Linux manpage, it's a ABI compatibility
bug.  They're on crack, of course, but it's been hard convincing them
that it's their fault for writing brittle/fragile applications.)

So I could imagine that Jan and some of the enterprise distributions
that are shipping ext3 might not want this change, given the "better
the devil you know (random lockups in the case of extreme memory
pressure)" is better than the devil you don't (more system calls might
return ENOMEM, with undetermined application impacts, in the case of
extreme memory pressure).  So in the jbd layer, they might want to
simply unconditionally loop, so it would be bug-for-bug compatible
with existing ext3 behavior.

						- Ted

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 6/6] jbd2: remove dependency on __GFP_NOFAIL
@ 2010-07-23 19:52                     ` Ted Ts'o
  0 siblings, 0 replies; 41+ messages in thread
From: Ted Ts'o @ 2010-07-23 19:52 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Kara, Andrew Morton, Andreas Dilger, Jiri Kosina, linux-mm,
	linux-ext4

On Fri, Jul 23, 2010 at 12:40:50PM -0700, David Rientjes wrote:
> On Fri, 23 Jul 2010, Ted Ts'o wrote:
> 
> > __GFP_NOFAIL is going away, so add our own retry loop.  Also add
> > jbd2__journal_start() and jbd2__journal_restart() which take a gfp
> > mask, so that file systems can optionally (re)start transaction
> > handles using GFP_KERNEL.  If they do this, then they need to be
> > prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
> > to reflect that error up to userspace.
> > 
> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> 
> Acked-by: David Rientjes <rientjes@google.com>
> 
> Will you be pushing the equivalent patch for jbd?

I imagine Jan (as the person who has been primarily handling ext3
patches) is waiting to see how invasive the patches are to ext4 before
deciding whether he's willing backport the equivalent changes to ext3
so that some of the calls that end up calling start_this_handle() end
up passing GFP_KERNEL when it's OK for them to fail --- since ext3 is
in maintenance mode, so we tend not to backport the more disruptive
patches to ext3.  It's really a cost/benefit tradeoff, really.

I am certain that some application programmers will complain when
their programs start breaking because they're not prepared to handle
an ENOMEM failure from certain system calls that have never failed
that way before.  (I should introduce you to some Japanese enterprise
programmers who believe that if a system call ever starts returning an
error code not documented in a Linux manpage, it's a ABI compatibility
bug.  They're on crack, of course, but it's been hard convincing them
that it's their fault for writing brittle/fragile applications.)

So I could imagine that Jan and some of the enterprise distributions
that are shipping ext3 might not want this change, given the "better
the devil you know (random lockups in the case of extreme memory
pressure)" is better than the devil you don't (more system calls might
return ENOMEM, with undetermined application impacts, in the case of
extreme memory pressure).  So in the jbd layer, they might want to
simply unconditionally loop, so it would be bug-for-bug compatible
with existing ext3 behavior.

						- Ted

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 3/6] fs: remove dependency on __GFP_NOFAIL
  2010-07-23 19:51     ` David Rientjes
@ 2010-07-23 21:12         ` Pekka Enberg
  0 siblings, 0 replies; 41+ messages in thread
From: Pekka Enberg @ 2010-07-23 21:12 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Al Viro, Jens Axboe, linux-fsdevel, linux-mm

On Fri, Jul 23, 2010 at 10:51 PM, David Rientjes <rientjes@google.com> wrote:
> On Fri, 23 Jul 2010, Andrew Morton wrote:
>
>> > The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
>> > from its mask.
>> >
>> > Cc: Jens Axboe <jens.axboe@oracle.com>
>> > Signed-off-by: David Rientjes <rientjes@google.com>
>> > ---
>> >  fs/bio-integrity.c |    2 +-
>> >  1 files changed, 1 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
>> > --- a/fs/bio-integrity.c
>> > +++ b/fs/bio-integrity.c
>> > @@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
>> >
>> >     /* Allocate kernel buffer for protection data */
>> >     len = sectors * blk_integrity_tuple_size(bi);
>> > -   buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
>> > +   buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
>> >     if (unlikely(buf == NULL)) {
>> >             printk(KERN_ERR "could not allocate integrity buffer\n");
>> >             return -EIO;
>>
>>                         ^^^  what?
>
> Right, I'm not sure why that decision was made, but it looks like it can
> be changed over to -ENOMEM without harming anything.  I'm concerned that
> the printk will spam the kernel log endlessly, though, if we're really oom
> and GFP_NOIO has no hope of freeing memory.  This code has never been
> active, so I'd like to wait for some feedback from Al and Jens (now with a
> corrected email address, jens.axboe@oracle.com bounced) to see if we want
> to return -ENOMEM, if the printk is really necessary, and if it would be
> better to just convert this to a loop with a congestion_wait() instead of
> returning from bio_integrity_prep().

Btw, you probably want __GFP_NOWARN here if you expect the allocation
to fail under normal conditions.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [patch 3/6] fs: remove dependency on __GFP_NOFAIL
@ 2010-07-23 21:12         ` Pekka Enberg
  0 siblings, 0 replies; 41+ messages in thread
From: Pekka Enberg @ 2010-07-23 21:12 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Al Viro, Jens Axboe, linux-fsdevel, linux-mm

On Fri, Jul 23, 2010 at 10:51 PM, David Rientjes <rientjes@google.com> wrote:
> On Fri, 23 Jul 2010, Andrew Morton wrote:
>
>> > The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL
>> > from its mask.
>> >
>> > Cc: Jens Axboe <jens.axboe@oracle.com>
>> > Signed-off-by: David Rientjes <rientjes@google.com>
>> > ---
>> >  fs/bio-integrity.c |    2 +-
>> >  1 files changed, 1 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/fs/bio-integrity.c b/fs/bio-integrity.c
>> > --- a/fs/bio-integrity.c
>> > +++ b/fs/bio-integrity.c
>> > @@ -413,7 +413,7 @@ int bio_integrity_prep(struct bio *bio)
>> >
>> >     /* Allocate kernel buffer for protection data */
>> >     len = sectors * blk_integrity_tuple_size(bi);
>> > -   buf = kmalloc(len, GFP_NOIO | __GFP_NOFAIL | q->bounce_gfp);
>> > +   buf = kmalloc(len, GFP_NOIO | q->bounce_gfp);
>> >     if (unlikely(buf == NULL)) {
>> >             printk(KERN_ERR "could not allocate integrity buffer\n");
>> >             return -EIO;
>>
>>                         ^^^  what?
>
> Right, I'm not sure why that decision was made, but it looks like it can
> be changed over to -ENOMEM without harming anything.  I'm concerned that
> the printk will spam the kernel log endlessly, though, if we're really oom
> and GFP_NOIO has no hope of freeing memory.  This code has never been
> active, so I'd like to wait for some feedback from Al and Jens (now with a
> corrected email address, jens.axboe@oracle.com bounced) to see if we want
> to return -ENOMEM, if the printk is really necessary, and if it would be
> better to just convert this to a loop with a congestion_wait() instead of
> returning from bio_integrity_prep().

Btw, you probably want __GFP_NOWARN here if you expect the allocation
to fail under normal conditions.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2010-07-23 21:12 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-21  2:44 [patch 0/6] remove dependency on __GFP_NOFAIL for failable allocations David Rientjes
2010-07-21  2:44 ` [patch 1/6] sparc: remove dependency on __GFP_NOFAIL David Rientjes
2010-07-21  2:44   ` David Rientjes
2010-07-21  3:31   ` David Miller
2010-07-21  3:31     ` David Miller
2010-07-21  9:41     ` David Rientjes
2010-07-21  9:41       ` David Rientjes
     [not found] ` <alpine.DEB.2.00.1007201936210.8728-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2010-07-21  2:44   ` [patch 2/6] infiniband: " David Rientjes
2010-07-21  2:44     ` David Rientjes
     [not found]     ` <alpine.DEB.2.00.1007201938570.8728-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2010-07-21  3:19       ` Steve Wise
2010-07-21  3:19         ` Steve Wise
     [not found]         ` <4C466730.1070809-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-07-21 17:55           ` Roland Dreier
2010-07-21 17:55             ` Roland Dreier
2010-07-21  2:45 ` [patch 3/6] fs: " David Rientjes
2010-07-21  2:45   ` David Rientjes
2010-07-23 19:36   ` Andrew Morton
2010-07-23 19:36     ` Andrew Morton
2010-07-23 19:51     ` David Rientjes
2010-07-23 21:12       ` Pekka Enberg
2010-07-23 21:12         ` Pekka Enberg
2010-07-21  2:45 ` [patch 4/6] gfs2: " David Rientjes
2010-07-21  9:24   ` Steven Whitehouse
2010-07-21  9:24     ` [Cluster-devel] " Steven Whitehouse
2010-07-21  9:31     ` David Rientjes
2010-07-21  2:45 ` [patch 6/6] jbd2: " David Rientjes
2010-07-22 14:14   ` Ted Ts'o
2010-07-22 18:09     ` David Rientjes
2010-07-22 23:09       ` Ted Ts'o
2010-07-22 23:24         ` David Rientjes
2010-07-23 14:10           ` Ted Ts'o
2010-07-23 14:10             ` Ted Ts'o
2010-07-23 14:57             ` Jan Kara
2010-07-23 14:57               ` Jan Kara
2010-07-23 15:05               ` Ted Ts'o
2010-07-23 15:05                 ` Ted Ts'o
2010-07-23 15:32                 ` Jan Kara
2010-07-23 15:32                   ` Jan Kara
2010-07-23 19:40                 ` David Rientjes
2010-07-23 19:52                   ` Ted Ts'o
2010-07-23 19:52                     ` Ted Ts'o
2010-07-21 19:26 ` [patch 5/6] jbd: " David Rientjes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.