All of lore.kernel.org
 help / color / mirror / Atom feed
From: kaber@trash.net
To: davem@davemloft.net
Cc: netfilter-devel@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 27/79] IPVS: Handle Scheduling errors.
Date: Wed, 19 Jan 2011 20:14:27 +0100	[thread overview]
Message-ID: <1295464519-21763-28-git-send-email-kaber@trash.net> (raw)
In-Reply-To: <1295464519-21763-1-git-send-email-kaber@trash.net>

From: Hans Schillstrom <hans.schillstrom@ericsson.com>

If ip_vs_conn_fill_param_persist return an error to ip_vs_sched_persist,
this error must propagate as ignored=-1 to ip_vs_schedule().
Errors from ip_vs_conn_new() in ip_vs_sched_persist() and ip_vs_schedule()
should also return *ignored=-1;

This patch just relies on the fact that ignored is 1 before calling
ip_vs_sched_persist().

Sent from Julian:
  "The new case when ip_vs_conn_fill_param_persist fails
   should set *ignored = -1, so that we can use NF_DROP,
   see below. *ignored = -1 should be also used for ip_vs_conn_new
   failure in ip_vs_sched_persist() and ip_vs_schedule().
   The new negative value should be handled in tcp,udp,sctp"

"To summarize:

- *ignored = 1:
      protocol tried to schedule (eg. on SYN), found svc but the
      svc/scheduler decides that this packet should be accepted with
      NF_ACCEPT because it must not be scheduled.

- *ignored = 0:
      scheduler can not find destination, so try bypass or
      return ICMP and then NF_DROP (ip_vs_leave).

- *ignored = -1:
      scheduler tried to schedule but fatal error occurred, eg.
      ip_vs_conn_new failure (ENOMEM) or ip_vs_sip_fill_param
      failure such as missing Call-ID, ENOMEM on skb_linearize
      or pe_data. In this case we should return NF_DROP without
      any attempts to send ICMP with ip_vs_leave."

More or less all ideas and input to this patch is work from
Julian Anastasov

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 net/netfilter/ipvs/ip_vs_core.c       |   56 +++++++++++++++++++++++---------
 net/netfilter/ipvs/ip_vs_proto_sctp.c |   11 +++++--
 net/netfilter/ipvs/ip_vs_proto_tcp.c  |   10 +++++-
 net/netfilter/ipvs/ip_vs_proto_udp.c  |   10 +++++-
 4 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 9acdd79..3445da6 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -177,7 +177,7 @@ ip_vs_set_state(struct ip_vs_conn *cp, int direction,
 	return pp->state_transition(cp, direction, skb, pp);
 }
 
-static inline void
+static inline int
 ip_vs_conn_fill_param_persist(const struct ip_vs_service *svc,
 			      struct sk_buff *skb, int protocol,
 			      const union nf_inet_addr *caddr, __be16 cport,
@@ -187,7 +187,9 @@ ip_vs_conn_fill_param_persist(const struct ip_vs_service *svc,
 	ip_vs_conn_fill_param(svc->af, protocol, caddr, cport, vaddr, vport, p);
 	p->pe = svc->pe;
 	if (p->pe && p->pe->fill_param)
-		p->pe->fill_param(p, skb);
+		return p->pe->fill_param(p, skb);
+
+	return 0;
 }
 
 /*
@@ -200,7 +202,7 @@ ip_vs_conn_fill_param_persist(const struct ip_vs_service *svc,
 static struct ip_vs_conn *
 ip_vs_sched_persist(struct ip_vs_service *svc,
 		    struct sk_buff *skb,
-		    __be16 src_port, __be16 dst_port)
+		    __be16 src_port, __be16 dst_port, int *ignored)
 {
 	struct ip_vs_conn *cp = NULL;
 	struct ip_vs_iphdr iph;
@@ -268,20 +270,27 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
 				vaddr = &fwmark;
 			}
 		}
-		ip_vs_conn_fill_param_persist(svc, skb, protocol, &snet, 0,
-					      vaddr, vport, &param);
+		/* return *ignored = -1 so NF_DROP can be used */
+		if (ip_vs_conn_fill_param_persist(svc, skb, protocol, &snet, 0,
+						  vaddr, vport, &param) < 0) {
+			*ignored = -1;
+			return NULL;
+		}
 	}
 
 	/* Check if a template already exists */
 	ct = ip_vs_ct_in_get(&param);
 	if (!ct || !ip_vs_check_template(ct)) {
-		/* No template found or the dest of the connection
+		/*
+		 * No template found or the dest of the connection
 		 * template is not available.
+		 * return *ignored=0 i.e. ICMP and NF_DROP
 		 */
 		dest = svc->scheduler->schedule(svc, skb);
 		if (!dest) {
 			IP_VS_DBG(1, "p-schedule: no dest found.\n");
 			kfree(param.pe_data);
+			*ignored = 0;
 			return NULL;
 		}
 
@@ -296,6 +305,7 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
 				    IP_VS_CONN_F_TEMPLATE, dest, skb->mark);
 		if (ct == NULL) {
 			kfree(param.pe_data);
+			*ignored = -1;
 			return NULL;
 		}
 
@@ -323,6 +333,7 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
 	cp = ip_vs_conn_new(&param, &dest->addr, dport, flags, dest, skb->mark);
 	if (cp == NULL) {
 		ip_vs_conn_put(ct);
+		*ignored = -1;
 		return NULL;
 	}
 
@@ -342,6 +353,21 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
  *  It selects a server according to the virtual service, and
  *  creates a connection entry.
  *  Protocols supported: TCP, UDP
+ *
+ *  Usage of *ignored
+ *
+ * 1 :   protocol tried to schedule (eg. on SYN), found svc but the
+ *       svc/scheduler decides that this packet should be accepted with
+ *       NF_ACCEPT because it must not be scheduled.
+ *
+ * 0 :   scheduler can not find destination, so try bypass or
+ *       return ICMP and then NF_DROP (ip_vs_leave).
+ *
+ * -1 :  scheduler tried to schedule but fatal error occurred, eg.
+ *       ip_vs_conn_new failure (ENOMEM) or ip_vs_sip_fill_param
+ *       failure such as missing Call-ID, ENOMEM on skb_linearize
+ *       or pe_data. In this case we should return NF_DROP without
+ *       any attempts to send ICMP with ip_vs_leave.
  */
 struct ip_vs_conn *
 ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
@@ -372,11 +398,9 @@ ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
 	}
 
 	/*
-	 * Do not schedule replies from local real server. It is risky
-	 * for fwmark services but mostly for persistent services.
+	 *    Do not schedule replies from local real server.
 	 */
 	if ((!skb->dev || skb->dev->flags & IFF_LOOPBACK) &&
-	    (svc->flags & IP_VS_SVC_F_PERSISTENT || svc->fwmark) &&
 	    (cp = pp->conn_in_get(svc->af, skb, pp, &iph, iph.len, 1))) {
 		IP_VS_DBG_PKT(12, svc->af, pp, skb, 0,
 			      "Not scheduling reply for existing connection");
@@ -387,10 +411,10 @@ ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
 	/*
 	 *    Persistent service
 	 */
-	if (svc->flags & IP_VS_SVC_F_PERSISTENT) {
-		*ignored = 0;
-		return ip_vs_sched_persist(svc, skb, pptr[0], pptr[1]);
-	}
+	if (svc->flags & IP_VS_SVC_F_PERSISTENT)
+		return ip_vs_sched_persist(svc, skb, pptr[0], pptr[1], ignored);
+
+	*ignored = 0;
 
 	/*
 	 *    Non-persistent service
@@ -403,8 +427,6 @@ ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
 		return NULL;
 	}
 
-	*ignored = 0;
-
 	dest = svc->scheduler->schedule(svc, skb);
 	if (dest == NULL) {
 		IP_VS_DBG(1, "Schedule: no dest found.\n");
@@ -425,8 +447,10 @@ ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
 		cp = ip_vs_conn_new(&p, &dest->addr,
 				    dest->port ? dest->port : pptr[1],
 				    flags, dest, skb->mark);
-		if (!cp)
+		if (!cp) {
+			*ignored = -1;
 			return NULL;
+		}
 	}
 
 	IP_VS_DBG_BUF(6, "Schedule fwd:%c c:%s:%u v:%s:%u "
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 1ea96bc..a315159 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -47,13 +47,18 @@ sctp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
 		 * incoming connection, and create a connection entry.
 		 */
 		*cpp = ip_vs_schedule(svc, skb, pp, &ignored);
-		if (!*cpp && !ignored) {
-			*verdict = ip_vs_leave(svc, skb, pp);
+		if (!*cpp && ignored <= 0) {
+			if (!ignored)
+				*verdict = ip_vs_leave(svc, skb, pp);
+			else {
+				ip_vs_service_put(svc);
+				*verdict = NF_DROP;
+			}
 			return 0;
 		}
 		ip_vs_service_put(svc);
 	}
-
+	/* NF_ACCEPT */
 	return 1;
 }
 
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index f6c5200..1cdab12 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -64,12 +64,18 @@ tcp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
 		 * incoming connection, and create a connection entry.
 		 */
 		*cpp = ip_vs_schedule(svc, skb, pp, &ignored);
-		if (!*cpp && !ignored) {
-			*verdict = ip_vs_leave(svc, skb, pp);
+		if (!*cpp && ignored <= 0) {
+			if (!ignored)
+				*verdict = ip_vs_leave(svc, skb, pp);
+			else {
+				ip_vs_service_put(svc);
+				*verdict = NF_DROP;
+			}
 			return 0;
 		}
 		ip_vs_service_put(svc);
 	}
+	/* NF_ACCEPT */
 	return 1;
 }
 
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 9d106a0..cd398de 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -63,12 +63,18 @@ udp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
 		 * incoming connection, and create a connection entry.
 		 */
 		*cpp = ip_vs_schedule(svc, skb, pp, &ignored);
-		if (!*cpp && !ignored) {
-			*verdict = ip_vs_leave(svc, skb, pp);
+		if (!*cpp && ignored <= 0) {
+			if (!ignored)
+				*verdict = ip_vs_leave(svc, skb, pp);
+			else {
+				ip_vs_service_put(svc);
+				*verdict = NF_DROP;
+			}
 			return 0;
 		}
 		ip_vs_service_put(svc);
 	}
+	/* NF_ACCEPT */
 	return 1;
 }
 
-- 
1.7.2.3


  parent reply	other threads:[~2011-01-19 19:15 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-19 19:14 [PATCH 00/79] netfilter: netfilter update kaber
2011-01-19 19:14 ` [PATCH 01/79] netfilter: nf_conntrack: don't always initialize ct->proto kaber
2011-01-19 19:14 ` [PATCH 02/79] netfilter: xt_NFQUEUE: remove modulo operations kaber
2011-01-19 19:14 ` [PATCH 03/79] netfilter: xt_LOG: do print MAC header on FORWARD kaber
2011-01-19 19:14 ` [PATCH 04/79] netfilter: ct_extend: fix the wrong alloc_size kaber
2011-01-19 19:14 ` [PATCH 05/79] netfilter: nf_conntrack: define ct_*_info as needed kaber
2011-01-19 19:14 ` [PATCH 06/79] netfilter: nf_nat: don't use atomic bit operation kaber
2011-01-19 19:14 ` [PATCH 07/79] netfilter: ct_extend: define NF_CT_EXT_* as needed kaber
2011-01-19 19:14 ` [PATCH 08/79] netfilter: nf_nat: define nat_pptp_info " kaber
2011-01-19 19:14 ` [PATCH 09/79] netfilter: xt_CLASSIFY: add ARP support, allow CLASSIFY target on any table kaber
2011-01-19 19:14 ` [PATCH 10/79] netfilter: add __rcu annotations kaber
2011-01-19 19:14 ` [PATCH 11/79] netfilter: nf_ct_frag6_sysctl_table is static kaber
2011-01-19 19:14 ` [PATCH 12/79] netfilter: add __rcu annotations kaber
2011-01-19 19:14 ` [PATCH 13/79] netfilter: nf_nat_amanda: rename a variable kaber
2011-01-19 19:14 ` [PATCH 14/79] netfilter: rcu sparse cleanups kaber
2011-01-19 19:14 ` [PATCH 15/79] IPVS: Add persistence engine to connection entry kaber
2011-01-19 19:14 ` [PATCH 16/79] IPVS: Only match pe_data created by the same pe kaber
2011-01-19 19:14 ` [PATCH 17/79] IPVS: Make the cp argument to ip_vs_sync_conn() static kaber
2011-01-19 19:14 ` [PATCH 18/79] IPVS: Remove useless { } block from ip_vs_process_message() kaber
2011-01-19 19:40   ` Joe Perches
2011-01-25  2:10     ` Simon Horman
2011-01-25  5:16       ` Simon Horman
2011-01-19 19:14 ` [PATCH 19/79] IPVS: buffer argument to ip_vs_process_message() should not be const kaber
2011-01-19 19:14 ` [PATCH 20/79] ipvs: add static and read_mostly attributes kaber
2011-01-19 19:14 ` [PATCH 21/79] ipvs: remove shadow rt variable kaber
2011-01-19 19:14 ` [PATCH 22/79] ipvs: allow transmit of GRO aggregated skbs kaber
2011-01-19 19:14 ` [PATCH 23/79] netfilter: nf_conntrack: one less atomic op in nf_ct_expect_insert() kaber
2011-01-19 19:14 ` [PATCH 24/79] IPVS: Backup, Prepare for transferring firewall marks (fwmark) to the backup daemon kaber
2011-01-19 19:14 ` [PATCH 25/79] IPVS: Split ports[2] into src_port and dst_port kaber
2011-01-19 19:14 ` [PATCH 26/79] IPVS: skb defrag in L7 helpers kaber
2011-01-19 19:14 ` kaber [this message]
2011-01-19 19:14 ` [PATCH 28/79] IPVS: Backup, Adding structs for new sync format kaber
2011-01-19 19:14 ` [PATCH 29/79] IPVS: Backup, Adding Version 1 receive capability kaber
2011-01-19 19:14 ` [PATCH 30/79] IPVS: Backup, Change sending to Version 1 format kaber
2011-01-19 19:14 ` [PATCH 31/79] IPVS: Backup, adding version 0 sending capabilities kaber
2011-01-19 19:14 ` [PATCH 32/79] netfilter: xtables: use guarded types kaber
2011-01-19 19:14 ` [PATCH 33/79] netfilter: fix compilation when conntrack is disabled but tproxy is enabled kaber
2011-01-19 19:14 ` [PATCH 34/79] IPVS: netns, add basic init per netns kaber
2011-01-19 19:14 ` [PATCH 35/79] IPVS: netns to services part 1 kaber
2011-01-19 19:14 ` [PATCH 36/79] IPVS: netns awarness to lblcr sheduler kaber
2011-01-19 19:14 ` [PATCH 37/79] IPVS: netns awarness to lblc sheduler kaber
2011-01-19 19:14 ` [PATCH 38/79] IPVS: netns, prepare protocol kaber
2011-01-19 19:14 ` [PATCH 39/79] IPVS: netns preparation for proto_tcp kaber
2011-01-19 19:14 ` [PATCH 40/79] IPVS: netns preparation for proto_udp kaber
2011-01-19 19:14 ` [PATCH 41/79] IPVS: netns preparation for proto_sctp kaber
2011-01-19 19:14 ` [PATCH 42/79] IPVS: netns preparation for proto_ah_esp kaber
2011-01-19 19:14 ` [PATCH 43/79] IPVS: netns, use ip_vs_proto_data as param kaber
2011-01-19 19:14 ` [PATCH 44/79] IPVS: netns, common protocol changes and use of appcnt kaber
2011-01-19 19:14 ` [PATCH 45/79] IPVS: netns awareness to ip_vs_app kaber
2011-01-19 19:14 ` [PATCH 46/79] IPVS: netns awareness to ip_vs_est kaber
2011-01-19 19:14 ` [PATCH 47/79] IPVS: netns awareness to ip_vs_sync kaber
2011-01-19 19:14 ` [PATCH 48/79] IPVS: netns, ip_vs_stats and its procfs kaber
2011-01-19 19:14 ` [PATCH 49/79] IPVS: netns, connection hash got net as param kaber
2011-01-19 19:14 ` [PATCH 50/79] IPVS: netns, ip_vs_ctl local vars moved to ipvs struct kaber
2011-01-19 19:14 ` [PATCH 51/79] IPVS: netns, defense work timer kaber
2011-01-19 19:14 ` [PATCH 52/79] IPVS: netns, trash handling kaber
2011-01-19 19:14 ` [PATCH 53/79] IPVS: netns, svc counters moved in ip_vs_ctl,c kaber
2011-01-19 19:14 ` [PATCH 54/79] IPVS: netns, misc init_net removal in core kaber
2011-01-19 19:14 ` [PATCH 55/79] IPVS: netns, final patch enabling network name space kaber
2011-01-19 19:14 ` [PATCH 56/79] netfilter: xt_comment: drop unneeded unsigned qualifier kaber
2011-01-19 19:14 ` [PATCH 57/79] netfilter: xt_conntrack: support matching on port ranges kaber
2011-01-19 19:14 ` [PATCH 58/79] netfilter: x_table: speedup compat operations kaber
2011-01-19 19:14 ` [PATCH 59/79] netfilter: ebt_ip6: allow matching on ipv6-icmp types/codes kaber
2011-01-19 19:15 ` [PATCH 60/79] netfilter: fix Kconfig dependencies kaber
2011-01-19 19:15 ` [PATCH 61/79] netfilter: nf_conntrack: use is_vmalloc_addr() kaber
2011-01-19 19:15 ` [PATCH 62/79] netfilter: audit target to record accepted/dropped packets kaber
2011-01-19 19:15 ` [PATCH 63/79] netfilter: create audit records for x_tables replaces kaber
2011-01-19 19:15 ` [PATCH 64/79] netfilter: xtables: add missing aliases for autoloading via iptables kaber
2011-01-19 19:15 ` [PATCH 65/79] audit: export symbol for use with xt_AUDIT kaber
2011-01-19 19:15 ` [PATCH 66/79] netfilter: xt_connlimit: use hotdrop jump mark kaber
2011-01-19 19:15 ` [PATCH 67/79] netfilter: xtables: use __uXX guarded types for userspace exports kaber
2011-01-19 19:15 ` [PATCH 68/79] netfilter: xtables: add missing header files to export list kaber
2011-01-19 19:15 ` [PATCH 69/79] netfilter: nf_nat: fix conversion to non-atomic bit ops kaber
2011-01-19 19:15 ` [PATCH 70/79] netfilter: nf_conntrack: remove an atomic bit operation kaber
2011-01-19 19:15 ` [PATCH 71/79] netfilter: Kconfig: NFQUEUE is useless without NETFILTER_NETLINK_QUEUE kaber
2011-01-19 19:15 ` [PATCH 72/79] netfilter: nfnetlink_queue: return error number to caller kaber
2011-01-19 19:15 ` [PATCH 73/79] netfilter: nfnetlink_queue: do not free skb on error kaber
2011-01-19 19:15 ` [PATCH 74/79] netfilter: reduce NF_VERDICT_MASK to 0xff kaber
2011-01-19 19:15 ` [PATCH 75/79] netfilter: allow NFQUEUE bypass if no listener is available kaber
2011-01-19 19:15 ` [PATCH 76/79] netfilter: ipt_CLUSTERIP: remove "no conntrack!" kaber
2011-01-19 19:15 ` [PATCH 77/79] netfilter: nf_conntrack: nf_conntrack snmp helper kaber
2011-01-19 19:15 ` [PATCH 78/79] netfilter: nf_conntrack_tstamp: add flow-based timestamp extension kaber
2011-01-19 19:15 ` [PATCH 79/79] netfilter: nf_conntrack: fix lifetime display for disabled connections kaber
2011-01-19 21:55 ` [PATCH 00/79] netfilter: netfilter update David Miller
2011-01-20  0:50   ` David Miller
2011-01-20  0:59     ` Jan Engelhardt
2011-01-20  1:13     ` Patrick McHardy
2011-01-20  1:36       ` Jan Engelhardt
2011-01-20  7:49         ` Patrick McHardy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1295464519-21763-28-git-send-email-kaber@trash.net \
    --to=kaber@trash.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.