linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download mbox.gz: |
* Re: [PATCH BUGFIX -rc3] Smack: Don't register smackfs if we're not loaded
  @ 2008-03-05 12:12 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 12:12 UTC (permalink / raw)
  To: James Morris; +Cc: Casey Schaufler, Linus Torvalds, LKML

On Wed, Mar 05, 2008 at 11:58:08AM +1100, James Morris wrote:
> On Tue, 4 Mar 2008, Linus Torvalds wrote:
> 
> > 
> > 
> > On Tue, 4 Mar 2008, Casey Schaufler wrote:
> > > 
> > > One solution would be to tighten the smackfs code so that it
> > > handles the uninitialized LSM case properly.
> > 
> > I really would tend to prefer that.
> 
> Can you simply call init_smk_fs() from smack_init() prior to 
> register_security() ?
> 

After analysing below oops:

[    0.072989] Call Trace:
[    0.072989]  [<c017d4a4>] alloc_vfsmnt+0x24/0xd0
[    0.072989]  [<c0168ee1>] vfs_kern_mount+0x31/0x120
[    0.072989]  [<c0168fe2>] kern_mount_data+0x12/0x20
[    0.072989]  [<c038a73c>] init_smk_fs+0x4c/0x80
[    0.072989]  [<c038a6ce>] smack_init+0xae/0xd0
[    0.072989]  [<c038a192>] security_init+0x52/0x70
[    0.072989]  [<c037aaa5>] start_kernel+0x1e5/0x290
[    0.072989]  [<c037a380>] unknown_bootoption+0x0/0x1f0
[    0.072989]  =======================

I've found that vfs_caches_init() got called after calling 
security_init() in the main kernel init path. 

This leads to kern_mount() Oopsing cause of Null dereferencing 
of mnt_cache  in the alloc_vfsmnt(mnt_cache, GFP_KERNEL) step.

I was thinking about exporting the current chosen_lsm[] string,
but I don't know if this maybe seen by Linus as another global
that arises from bad design. It may not also scale well if the
LSM needs to check if it's enabled in various places (lots of
strcmp()s).

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* Re: [PATCH BUGFIX -rc3] Smack: Don't register smackfs if we're not loaded
    @ 2008-03-05 12:44 99% ` Ahmed S. Darwish
  2008-03-05 12:51 99%   ` Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 12:44 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Linus Torvalds, Stephen Smalley, James Morris, Eric Paris, LKML

On Tue, Mar 04, 2008 at 09:45:04AM -0800, Casey Schaufler wrote:
> ----- Original Message ----
> > From: Linus Torvalds <torvalds@linux-foundation.org>
> > To: Ahmed S. Darwish <darwish.07@gmail.com>
> > Cc: Casey Schaufler <casey@schaufler-ca.com>; LKML <linux-kernel@vger.kernel.org>
> > Sent: Tuesday, March 4, 2008 9:21:19 AM
> > Subject: Re: [PATCH BUGFIX -rc3] Smack: Don't register smackfs if we're not loaded
> > 
> > 
> > 
> > On Tue, 4 Mar 2008, Ahmed S. Darwish wrote:
> > > 
> > > Smackfs initialization without an enabled Smack leads to
> > > an early Oops that renders the system unusable.
> > 
> > I really think this is bogus. Global enables like this are just wrong, and 
> > a sign that something else bad is going on.
> > 
> > What is the oops? Why does it happen?
...
> 
> One solution would be to tighten the smackfs code so that it
> handles the uninitialized LSM case properly.
> 

IMHO no smackfs code should ever execute if smack isn't loaded.

This means catching it from the very fist step where it registers
itself in init_smk_fs instead of doing several if(we're enabled) cases
in the code path.

The solution should be a _general_ solution, _not_ a SMACK one cause 
SELinux sufferes from exactly the same problem.

a.k.a:

LSMs need a scalable way to know if they're enabled that makes 
everyone happy ( especially Linus ;) ).

Regads to all,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* Re: [PATCH BUGFIX -rc3] Smack: Don't register smackfs if we're not loaded
  2008-03-05 12:44 99% ` Ahmed S. Darwish
@ 2008-03-05 12:51 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 12:51 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Linus Torvalds, Stephen Smalley, James Morris, Eric Paris, LKML

On Wed, Mar 05, 2008 at 02:44:45PM +0200, Ahmed S. Darwish wrote:
> On Tue, Mar 04, 2008 at 09:45:04AM -0800, Casey Schaufler wrote:
> > > From: Linus Torvalds <torvalds@linux-foundation.org>
...
> > > 
> > > What is the oops? Why does it happen?
> ...
> > 
> > One solution would be to tighten the smackfs code so that it
> > handles the uninitialized LSM case properly.
> > 
> 
> IMHO no smackfs code should ever execute if smack isn't loaded.
> 
> This means catching it from the very fist step where it registers
> itself in init_smk_fs instead of doing several if(we're enabled) cases
> in the code path.
> 
> The solution should be a _general_ solution, _not_ a SMACK one cause 
> SELinux sufferes from exactly the same problem.
> 

Not to be misunderstood here, I'm used to writing SMACK in capitals.
I ,ofcourse, don't mean shouting (never will).

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* [PATCH -v7 -rc3] Security: Introduce security= boot parameter
  @ 2008-03-05 15:29 67%   ` Ahmed S. Darwish
    2008-03-05 18:46 65%   ` [PATCH -v7b " Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 15:29 UTC (permalink / raw)
  To: Chris Wright, Stephen Smalley, James Morris, Eric Paris,
	Casey Schaufler, Paul Moore, Alexey Dobriyan, Andrew Morton,
	Linus
  Cc: LKML, LSM-ML

Hi!,

[
    Our usual changelog:

    - Fold the SMACK bugfix here cause now it depends on the
      new introduced security_ symbol.
    - Do not register smackfs if Smack was not loaded
    - Do not use a global to check if Smack was enabled, use 
      security_module_enable(ops) instead.
    - Export smack_ops security operations to the rest of
      SMACK code to satisfy above point.
    - Remove James ACK cause patch semantics changed (could you
      please reACK ?).
]

-->

Add the security= boot parameter. This is done to avoid LSM 
registration clashes in case of more than one bult-in module.

User can choose a security module to enable at boot. If no 
security= boot parameter is specified, only the first LSM 
asking for registration will be loaded. An invalid security 
module name will be treated as if no module has been chosen.

LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux 
and SMACK to do so.

Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Reported-by: Alexey Dobriyan <adobriyan@sw.ru>
---

 Documentation/kernel-parameters.txt |    6 ++++
 include/linux/security.h            |   12 +++++++++
 security/dummy.c                    |    2 -
 security/security.c                 |   46 +++++++++++++++++++++++++++++++++++-
 security/selinux/hooks.c            |    7 +++++
 security/smack/smack.h              |    2 +
 security/smack/smack_lsm.c          |    7 ++++-
 security/smack/smackfs.c            |    9 ++++++-

 8 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9a5b665..64efbdc 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -374,6 +374,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			possible to determine what the correct size should be.
 			This option provides an override for these situations.
 
+	security=	[SECURITY] Choose a security module to enable at boot. 
+			If this boot parameter is not specified, only the first 
+			security module asking for security registration will be
+			loaded. An invalid security module name will be treated
+			as if no module has been chosen.
+
 	capability.disable=
 			[SECURITY] Disable capabilities.  This would normally
 			be used only if an alternative security model is to be
diff --git a/include/linux/security.h b/include/linux/security.h
index fe52cde..801c9ad 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -42,6 +42,9 @@
 
 extern unsigned securebits;
 
+/* Maximum number of letters for an LSM name string */
+#define SECURITY_NAME_MAX	10
+
 struct ctl_table;
 
 /*
@@ -117,6 +120,12 @@ struct request_sock;
 /**
  * struct security_operations - main security structure
  *
+ * Security module identifier.
+ *
+ * @name:
+ *	A string that acts as a unique identifeir for the LSM with max number
+ *	of characters = SECURITY_NAME_MAX.
+ *
  * Security hooks for program execution operations.
  *
  * @bprm_alloc_security:
@@ -1207,6 +1216,8 @@ struct request_sock;
  * This is the main security structure.
  */
 struct security_operations {
+	char name[SECURITY_NAME_MAX + 1];
+
 	int (*ptrace) (struct task_struct * parent, struct task_struct * child);
 	int (*capget) (struct task_struct * target,
 		       kernel_cap_t * effective,
@@ -1466,6 +1477,7 @@ struct security_operations {
 
 /* prototypes */
 extern int security_init	(void);
+extern int security_module_enable(struct security_operations *ops);
 extern int register_security	(struct security_operations *ops);
 extern int mod_reg_security	(const char *name, struct security_operations *ops);
 extern struct dentry *securityfs_create_file(const char *name, mode_t mode,
diff --git a/security/dummy.c b/security/dummy.c
index 649326b..96d196f 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -979,7 +979,7 @@ static inline int dummy_key_permission(key_ref_t key_ref,
 }
 #endif /* CONFIG_KEYS */
 
-struct security_operations dummy_security_ops;
+struct security_operations dummy_security_ops = { "dummy" };
 
 #define set_to_dummy_if_null(ops, function)				\
 	do {								\
diff --git a/security/security.c b/security/security.c
index d15e56c..f188672 100644
--- a/security/security.c
+++ b/security/security.c
@@ -17,6 +17,9 @@
 #include <linux/kernel.h>
 #include <linux/security.h>
 
+/* Boot-time LSM user choice */
+static spinlock_t chosen_lsm_lock;
+static char chosen_lsm[SECURITY_NAME_MAX + 1];
 
 /* things that live in dummy.c */
 extern struct security_operations dummy_security_ops;
@@ -62,18 +65,59 @@ int __init security_init(void)
 	}
 
 	security_ops = &dummy_security_ops;
+	spin_lock_init(&chosen_lsm_lock);
 	do_security_initcalls();
 
 	return 0;
 }
 
+/* Save user chosen LSM */
+static int __init choose_lsm(char *str)
+{
+	strncpy(chosen_lsm, str, SECURITY_NAME_MAX);
+	return 1;
+}
+__setup("security=", choose_lsm);
+
+/**
+ * security_module_enable - Load given security module on boot ?
+ * @ops: a pointer to the struct security_operations that is to be checked.
+ *
+ * Each LSM must pass this method before registering its own operations
+ * to avoid security registration races.
+ *
+ * Return true if:
+ *	-The passed LSM is the one chosen by user at boot time,
+ *	-or user didsn't specify a specific LSM and we're the first to ask
+ *	 for registeration permissoin.
+ * Otherwise, return false.
+ */
+int security_module_enable(struct security_operations *ops)
+{
+	int rc = 1;
+
+	spin_lock(&chosen_lsm_lock);
+	if (!*chosen_lsm)
+		strncpy(chosen_lsm, ops->name, SECURITY_NAME_MAX);
+	else if (strncmp(ops->name, chosen_lsm, SECURITY_NAME_MAX))
+		rc = 0;
+	spin_unlock(&chosen_lsm_lock);
+
+	if (rc)
+		printk(KERN_INFO "Security: Loading '%s' security module.\n",
+		       ops->name);
+
+	return rc;
+}
+
 /**
  * register_security - registers a security framework with the kernel
  * @ops: a pointer to the struct security_options that is to be registered
  *
  * This function is to allow a security module to register itself with the
  * kernel security subsystem.  Some rudimentary checking is done on the @ops
- * value passed to this function.
+ * value passed to this function. You'll need to check first if your LSM
+ * is allowed to register its @ops by calling security_module_enable(@ops).
  *
  * If there is already a security module registered with the kernel,
  * an error will be returned.  Otherwise 0 is returned on success.
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 75c2e99..49709a4 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -5219,6 +5219,8 @@ static int selinux_key_permission(key_ref_t key_ref,
 #endif
 
 static struct security_operations selinux_ops = {
+	.name =				"selinux",
+
 	.ptrace =			selinux_ptrace,
 	.capget =			selinux_capget,
 	.capset_check =			selinux_capset_check,
@@ -5405,6 +5407,11 @@ static __init int selinux_init(void)
 {
 	struct task_security_struct *tsec;
 
+	if (!security_module_enable(&selinux_ops)) {
+		selinux_enabled = 0;
+		return 0;
+	}
+
 	if (!selinux_enabled) {
 		printk(KERN_INFO "SELinux:  Disabled at boot.\n");
 		return 0;
diff --git a/security/smack/smack.h b/security/smack/smack.h
index a21a0e9..c444f48 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -15,6 +15,7 @@
 
 #include <linux/capability.h>
 #include <linux/spinlock.h>
+#include <linux/security.h>
 #include <net/netlabel.h>
 
 /*
@@ -195,6 +196,7 @@ extern struct smack_known smack_known_star;
 extern struct smack_known smack_known_unset;
 
 extern struct smk_list_entry *smack_list;
+extern struct security_operations smack_ops;
 
 /*
  * Stricly for CIPSO level manipulation.
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 770eb06..afa7967 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -2433,7 +2433,9 @@ static void smack_release_secctx(char *secdata, u32 seclen)
 {
 }
 
-static struct security_operations smack_ops = {
+struct security_operations smack_ops = {
+	.name =				"smack",
+
 	.ptrace = 			smack_ptrace,
 	.capget = 			cap_capget,
 	.capset_check = 		cap_capset_check,
@@ -2566,6 +2568,9 @@ static struct security_operations smack_ops = {
  */
 static __init int smack_init(void)
 {
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	printk(KERN_INFO "Smack:  Initializing.\n");
 
 	/*
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 358c92c..769da9a 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -986,12 +986,19 @@ static struct vfsmount *smackfs_mount;
  *
  * register the smackfs
  *
- * Returns 0 unless the registration fails.
+ * Do not register smackfs if Smack wasn't enabled
+ * on boot.
+ *
+ * Returns true if we were not chosen on boot or if
+ * we were chosen and filesystem registration succeeded.
  */
 static int __init init_smk_fs(void)
 {
 	int err;
 
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	err = register_filesystem(&smk_fs_type);
 	if (!err) {
 		smackfs_mount = kern_mount(&smk_fs_type);

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 67%]

* Re: [PATCH -v7 -rc3] Security: Introduce security= boot parameter
  @ 2008-03-05 16:55 98%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 16:55 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Chris Wright, Stephen Smalley, James Morris, Eric Paris,
	Paul Moore, Alexey Dobriyan, Andrew Morton, Linus, LKML, LSM-ML

Hi Casey,

On Wed, Mar 05, 2008 at 08:33:34AM -0800, Casey Schaufler wrote:
> 
> --- "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> 
...
> > 
> > Do not let SMACK register smackfs if it was not chosen on
> > boot. Smackfs assumes that smack hooks are registered and
> > the initial task security setup (swapper->security) is done.
> 
> If the problem with initializing smackfs is because the
> locks aren't initialized why not leave the lock initializations
> in smack_init, and have them done before the check to see if the
> smack LSM is going to get used? Really, we're only talking
> about the case where a kernel is configured for testing or
> development purposes, and the lock initialization can't
> be considered a major impact in any case.
> 

Beside the locking initialization issue, there's the current->security
issue. smackfs init code code access current->security in 
smk_unlbl_ambient().

As you know current->security may equal Null (Oops), or point to 
another LSM structure that preceeded us in registration. 

The locking argument can't be applied here since we may override
the other LSM tsk->security pointer this time.

Ofcourse all of the above points can be handleded by various
if(current->security) checks + rechecking the read/write methods
of each smackfs inode, but below only two lines will fix the 
problem from its roots ;):

+	if (!security_module_enable(&smack_ops))
+		return 0;

Is there a problem in the current approach that I'm not aware of ?

You have your veto in this issue at the end ;)

Thank you,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 98%]

* [PATCH -v7b -rc3] Security: Introduce security= boot parameter
    2008-03-05 15:29 67%   ` [PATCH -v7 " Ahmed S. Darwish
@ 2008-03-05 18:46 65%   ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 18:46 UTC (permalink / raw)
  To: Chris Wright, Stephen Smalley, James Morris, Eric Paris,
	Casey Schaufler, Paul Moore, Alexey Dobriyan, Andrew Morton,
	Linus
  Cc: LKML, LSM-ML

Hi!,

[
v7 log:
    - Fold the SMACK bugfix here cause now it depends on the
      new introduced security_ symbol.
    - Do not register smackfs if Smack was not loaded
    - Do not use a global to check if Smack was enabled, use 
      security_module_enable(ops) instead.
    - Export smack_ops security operations to the rest of
      SMACK code to satisfy above point.
    - Remove James ACK cause patch semantics changed (could you
      please reACK ?).      

++:
    - Add in security_module_enable() comments that it may also
      be used to check if our LSM is enabled on boot.
    - Remove the message `Security: %s is loaded, &lsm' from 
      security_module_enable cause of its new usage in above
      point.
    - Smack: Add a comment above init_smk_fs explaining why we
      can't initialize smackfs under the security_init context.
    - mmm, Will we reach v10 ?
]

-->

Add the security= boot parameter. This is done to avoid LSM 
registration clashes in case of more than one bult-in module.

User can choose a security module to enable at boot. If no 
security= boot parameter is specified, only the first LSM 
asking for registration will be loaded. An invalid security 
module name will be treated as if no module has been chosen.

LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux 
and SMACK to do so.

Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Reported-by: Alexey Dobriyan <adobriyan@sw.ru>
--- 

 Documentation/kernel-parameters.txt |    6 ++++
 include/linux/security.h            |   12 +++++++++
 security/dummy.c                    |    2 -
 security/security.c                 |   46 +++++++++++++++++++++++++++++++++++-
 security/selinux/hooks.c            |    7 +++++
 security/smack/smack.h              |    2 +
 security/smack/smack_lsm.c          |    7 ++++-
 security/smack/smackfs.c            |   11 +++++++-

 8 files changed, 89 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9a5b665..64efbdc 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -374,6 +374,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			possible to determine what the correct size should be.
 			This option provides an override for these situations.
 
+	security=	[SECURITY] Choose a security module to enable at boot. 
+			If this boot parameter is not specified, only the first 
+			security module asking for security registration will be
+			loaded. An invalid security module name will be treated
+			as if no module has been chosen.
+
 	capability.disable=
 			[SECURITY] Disable capabilities.  This would normally
 			be used only if an alternative security model is to be
diff --git a/include/linux/security.h b/include/linux/security.h
index fe52cde..801c9ad 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -42,6 +42,9 @@
 
 extern unsigned securebits;
 
+/* Maximum number of letters for an LSM name string */
+#define SECURITY_NAME_MAX	10
+
 struct ctl_table;
 
 /*
@@ -117,6 +120,12 @@ struct request_sock;
 /**
  * struct security_operations - main security structure
  *
+ * Security module identifier.
+ *
+ * @name:
+ *	A string that acts as a unique identifeir for the LSM with max number
+ *	of characters = SECURITY_NAME_MAX.
+ *
  * Security hooks for program execution operations.
  *
  * @bprm_alloc_security:
@@ -1207,6 +1216,8 @@ struct request_sock;
  * This is the main security structure.
  */
 struct security_operations {
+	char name[SECURITY_NAME_MAX + 1];
+
 	int (*ptrace) (struct task_struct * parent, struct task_struct * child);
 	int (*capget) (struct task_struct * target,
 		       kernel_cap_t * effective,
@@ -1466,6 +1477,7 @@ struct security_operations {
 
 /* prototypes */
 extern int security_init	(void);
+extern int security_module_enable(struct security_operations *ops);
 extern int register_security	(struct security_operations *ops);
 extern int mod_reg_security	(const char *name, struct security_operations *ops);
 extern struct dentry *securityfs_create_file(const char *name, mode_t mode,
diff --git a/security/dummy.c b/security/dummy.c
index 649326b..96d196f 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -979,7 +979,7 @@ static inline int dummy_key_permission(key_ref_t key_ref,
 }
 #endif /* CONFIG_KEYS */
 
-struct security_operations dummy_security_ops;
+struct security_operations dummy_security_ops = { "dummy" };
 
 #define set_to_dummy_if_null(ops, function)				\
 	do {								\
diff --git a/security/security.c b/security/security.c
index d15e56c..deb7e00 100644
--- a/security/security.c
+++ b/security/security.c
@@ -17,6 +17,9 @@
 #include <linux/kernel.h>
 #include <linux/security.h>
 
+/* Boot-time LSM user choice */
+static spinlock_t chosen_lsm_lock;
+static char chosen_lsm[SECURITY_NAME_MAX + 1];
 
 /* things that live in dummy.c */
 extern struct security_operations dummy_security_ops;
@@ -62,18 +65,59 @@ int __init security_init(void)
 	}
 
 	security_ops = &dummy_security_ops;
+	spin_lock_init(&chosen_lsm_lock);
 	do_security_initcalls();
 
 	return 0;
 }
 
+/* Save user chosen LSM */
+static int __init choose_lsm(char *str)
+{
+	strncpy(chosen_lsm, str, SECURITY_NAME_MAX);
+	return 1;
+}
+__setup("security=", choose_lsm);
+
+/**
+ * security_module_enable - Load given security module on boot ?
+ * @ops: a pointer to the struct security_operations that is to be checked.
+ *
+ * Each LSM must pass this method before registering its own operations
+ * to avoid security registration races. This method may also be used
+ * to check if your LSM is currently loaded.
+ *
+ * Return true if:
+ *	-The passed LSM is the one chosen by user at boot time,
+ *	-or user didsn't specify a specific LSM and we're the first to ask
+ *	 for registeration permissoin,
+ *	-or the passed LSM is currently loaded.
+ * Otherwise, return false.
+ */
+int security_module_enable(struct security_operations *ops)
+{
+	int rc = 1;
+
+	spin_lock(&chosen_lsm_lock);
+
+	if (!*chosen_lsm)
+		strncpy(chosen_lsm, ops->name, SECURITY_NAME_MAX);
+	else if (strncmp(ops->name, chosen_lsm, SECURITY_NAME_MAX))
+		rc = 0;
+
+	spin_unlock(&chosen_lsm_lock);
+
+	return rc;
+}
+
 /**
  * register_security - registers a security framework with the kernel
  * @ops: a pointer to the struct security_options that is to be registered
  *
  * This function is to allow a security module to register itself with the
  * kernel security subsystem.  Some rudimentary checking is done on the @ops
- * value passed to this function.
+ * value passed to this function. You'll need to check first if your LSM
+ * is allowed to register its @ops by calling security_module_enable(@ops).
  *
  * If there is already a security module registered with the kernel,
  * an error will be returned.  Otherwise 0 is returned on success.
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 75c2e99..49709a4 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -5219,6 +5219,8 @@ static int selinux_key_permission(key_ref_t key_ref,
 #endif
 
 static struct security_operations selinux_ops = {
+	.name =				"selinux",
+
 	.ptrace =			selinux_ptrace,
 	.capget =			selinux_capget,
 	.capset_check =			selinux_capset_check,
@@ -5405,6 +5407,11 @@ static __init int selinux_init(void)
 {
 	struct task_security_struct *tsec;
 
+	if (!security_module_enable(&selinux_ops)) {
+		selinux_enabled = 0;
+		return 0;
+	}
+
 	if (!selinux_enabled) {
 		printk(KERN_INFO "SELinux:  Disabled at boot.\n");
 		return 0;
diff --git a/security/smack/smack.h b/security/smack/smack.h
index a21a0e9..c444f48 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -15,6 +15,7 @@
 
 #include <linux/capability.h>
 #include <linux/spinlock.h>
+#include <linux/security.h>
 #include <net/netlabel.h>
 
 /*
@@ -195,6 +196,7 @@ extern struct smack_known smack_known_star;
 extern struct smack_known smack_known_unset;
 
 extern struct smk_list_entry *smack_list;
+extern struct security_operations smack_ops;
 
 /*
  * Stricly for CIPSO level manipulation.
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 770eb06..afa7967 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -2433,7 +2433,9 @@ static void smack_release_secctx(char *secdata, u32 seclen)
 {
 }
 
-static struct security_operations smack_ops = {
+struct security_operations smack_ops = {
+	.name =				"smack",
+
 	.ptrace = 			smack_ptrace,
 	.capget = 			cap_capget,
 	.capset_check = 		cap_capset_check,
@@ -2566,6 +2568,9 @@ static struct security_operations smack_ops = {
  */
 static __init int smack_init(void)
 {
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	printk(KERN_INFO "Smack:  Initializing.\n");
 
 	/*
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 358c92c..effd3de 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -986,12 +986,21 @@ static struct vfsmount *smackfs_mount;
  *
  * register the smackfs
  *
- * Returns 0 unless the registration fails.
+ * Do not register smackfs if Smack wasn't enabled
+ * on boot. We can not put this method normally under the
+ * smack_init() code path since the security subsystem get
+ * initialized before the vfs caches.
+ *
+ * Returns true if we were not chosen on boot or if
+ * we were chosen and filesystem registration succeeded.
  */
 static int __init init_smk_fs(void)
 {
 	int err;
 
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	err = register_filesystem(&smk_fs_type);
 	if (!err) {
 		smackfs_mount = kern_mount(&smk_fs_type);

--
"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 65%]

* Re: [PATCH -v5 -mm] LSM: Add security= boot parameter
  @ 2008-03-05 22:56 98%                   ` Ahmed S. Darwish
  2008-03-05 23:06 99%                     ` Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 22:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: jmorris, sds, casey, bunk, chrisw, eparis, adobriyan,
	linux-kernel, linux-security-module

On Wed, Mar 05, 2008 at 02:29:48PM -0800, Andrew Morton wrote:
> On Tue, 4 Mar 2008 05:04:07 +0200
> "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> 
...
> > ...
> >  
> > +/* Maximum number of letters for an LSM name string */
> > +#define SECURITY_NAME_MAX	10
> 
> Is this long enough?
> 

I've judged from the four common applicants (selinux, smack,
apparmor, tomoyo) that 10 would be enough. Anyway this will be
easy to fix when something longer appears.

> >  struct ctl_table;
> >  struct audit_krule;
> >  
> >  ...
> >
> > -struct security_operations dummy_security_ops;
> > +struct security_operations dummy_security_ops = { "dummy" };
> 
> Please don't rely upon the layout of data structures in this manner.  Use
> ".name = ".
> 

Will do.

> >  
> >  #define set_to_dummy_if_null(ops, function)				\
> >  	do {								\
> > diff --git a/security/security.c b/security/security.c
> > index 1bf2ee4..def9fc0 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -17,6 +17,9 @@
> >  #include <linux/kernel.h>
> >  #include <linux/security.h>
> >  
> > +/* Boot-time LSM user choice */
> > +static spinlock_t chosen_lsm_lock;
> > +static char chosen_lsm[SECURITY_NAME_MAX + 1];
> >  
> >  /* things that live in dummy.c */
> >  extern struct security_operations dummy_security_ops;
> > @@ -62,18 +65,59 @@ int __init security_init(void)
> >  	}
> >  
> >  	security_ops = &dummy_security_ops;
> > +	spin_lock_init(&chosen_lsm_lock);
> 
> Please remove this and use compile-time initialisation with DEFINE_SPINLOCK.
> 

Ooh I thought the dynamic one was better cause I remember I read it
somewhere on LWN that this is nicer for the RT-patches. I'll modify
it, no problem.

> Do we actually need the lock?  This code is only called at boot-time if I
> understand it correctly?
> 

In the latest version (-v7b, in another thread, CCed), security_module_enable()
is also used to let an LSM know if it's currently loaded or not. This
was done to avoid using a `smack_enabled' global.

I'll resend the v7b for -mm once the LSM devs give their ACKs for the
-rc3 one.

> Can chosen_lsm[] be __initdata?
> 

You're the expert ;), I don't really understand the difference.

...
> >
> > +/**
> > + * security_module_enable - Load given security module on boot ?
> > + * @ops: a pointer to the struct security_operations that is to be checked.
> > + *
> > + * Each LSM must pass this method before registering its own operations
> > + * to avoid security registration races.
> > + *
> > + * Return true if:
> > + *	-The passed LSM is the one chosen by user at boot time,
> > + *	-or user didsn't specify a specific LSM and we're the first to ask
> > + *	 for registeration permissoin.
> > + * Otherwise, return false.
> > + */
> > +int security_module_enable(struct security_operations *ops)
> > +{
> > +	int rc = 1;
> > +
> > +	spin_lock(&chosen_lsm_lock);
> > +	if (!*chosen_lsm)
> > +		strncpy(chosen_lsm, ops->name, SECURITY_NAME_MAX);
> > +	else if (strncmp(ops->name, chosen_lsm, SECURITY_NAME_MAX))
> > +		rc = 0;
> > +	spin_unlock(&chosen_lsm_lock);
> > +
> > +	if (rc)
> > +		printk(KERN_INFO "Security: Loading '%s' security module.\n",
> > +		       ops->name);
> > +
> > +	return rc;
> > +}
> 
> I believe this can be __init.
> 

Will do.

> > +	if (!security_module_enable(&selinux_ops)) {
> > +		selinux_enabled = 0;
> > +		return 0;
> > +	}
> > +
> >
> > ...
> >
> >  static __init int smack_init(void)
> >  {
> > +	if (!security_module_enable(&smack_ops))
> > +		return 0;
> > +
> >  	printk(KERN_INFO "Smack:  Initializing.\n");
> >  
> >  	/*
> 
> hm.  selinux has a global selinux_enabled knob, but smack seems to be able
> to get by without one.  +1 for smack ;)
> 

Thanks to Linus ;). 

I've sent a patch that added a similar global yesterday and it was 
knocked-down by Linus after exactly 2 minutes. 
The situation is handled now without a global in v7/v7b.

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 98%]

* Re: [PATCH -v5 -mm] LSM: Add security= boot parameter
  2008-03-05 22:56 98%                   ` Ahmed S. Darwish
@ 2008-03-05 23:06 99%                     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-05 23:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: jmorris, sds, casey, bunk, chrisw, eparis, adobriyan,
	linux-kernel, linux-security-module

On Thu, Mar 06, 2008 at 12:56:28AM +0200, Ahmed S. Darwish wrote:
> On Wed, Mar 05, 2008 at 02:29:48PM -0800, Andrew Morton wrote:
> > On Tue, 4 Mar 2008 05:04:07 +0200
> > "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> > 
> > Can chosen_lsm[] be __initdata?
> > 
> 
> You're the expert ;), I don't really understand the difference.
> 

After being a good citizen and reading the comments in init.h, I think
yes it should be marked so. even though we use it from init_smk_fs
since init_smk_fs is also marked as __init.

Thank you,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* [PATCH -v8 -rc3] Security: Introduce security= boot parameter
@ 2008-03-06 12:19 67% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-06 12:19 UTC (permalink / raw)
  To: Chris Wright, Stephen Smalley, James Morris, Eric Paris,
	Casey Schaufler, Paul Moore, Alexey Dobriyan, Andrew Morton,
	Linus
  Cc: LKML, LSM-ML

Hi!,

[
    Handle Andrew's concerns:
    - Use __init and __initdata in appropriate places.
    - Do not rely upon dummy_ops layout, use C99 initializations.
    - Use DEFINE_SPINLOCK instead of dynamic initialization.

    Create a new thread cause the original became hard to follow.
]

-->

Add the security= boot parameter. This is done to avoid LSM 
registration clashes in case of more than one bult-in module.

User can choose a security module to enable at boot. If no 
security= boot parameter is specified, only the first LSM 
asking for registration will be loaded. An invalid security 
module name will be treated as if no module has been chosen.

LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux 
and SMACK to do so.

Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
--- 

 Documentation/kernel-parameters.txt |    6 ++++
 include/linux/security.h            |   12 +++++++++
 security/dummy.c                    |    4 ++-
 security/security.c                 |   45 +++++++++++++++++++++++++++++++++++-
 security/selinux/hooks.c            |    7 +++++
 security/smack/smack.h              |    2 +
 security/smack/smack_lsm.c          |    7 ++++-
 security/smack/smackfs.c            |   11 ++++++++

 8 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9a5b665..64efbdc 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -374,6 +374,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			possible to determine what the correct size should be.
 			This option provides an override for these situations.
 
+	security=	[SECURITY] Choose a security module to enable at boot. 
+			If this boot parameter is not specified, only the first 
+			security module asking for security registration will be
+			loaded. An invalid security module name will be treated
+			as if no module has been chosen.
+
 	capability.disable=
 			[SECURITY] Disable capabilities.  This would normally
 			be used only if an alternative security model is to be
diff --git a/include/linux/security.h b/include/linux/security.h
index fe52cde..801c9ad 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -42,6 +42,9 @@
 
 extern unsigned securebits;
 
+/* Maximum number of letters for an LSM name string */
+#define SECURITY_NAME_MAX	10
+
 struct ctl_table;
 
 /*
@@ -117,6 +120,12 @@ struct request_sock;
 /**
  * struct security_operations - main security structure
  *
+ * Security module identifier.
+ *
+ * @name:
+ *	A string that acts as a unique identifeir for the LSM with max number
+ *	of characters = SECURITY_NAME_MAX.
+ *
  * Security hooks for program execution operations.
  *
  * @bprm_alloc_security:
@@ -1207,6 +1216,8 @@ struct request_sock;
  * This is the main security structure.
  */
 struct security_operations {
+	char name[SECURITY_NAME_MAX + 1];
+
 	int (*ptrace) (struct task_struct * parent, struct task_struct * child);
 	int (*capget) (struct task_struct * target,
 		       kernel_cap_t * effective,
@@ -1466,6 +1477,7 @@ struct security_operations {
 
 /* prototypes */
 extern int security_init	(void);
+extern int security_module_enable(struct security_operations *ops);
 extern int register_security	(struct security_operations *ops);
 extern int mod_reg_security	(const char *name, struct security_operations *ops);
 extern struct dentry *securityfs_create_file(const char *name, mode_t mode,
diff --git a/security/dummy.c b/security/dummy.c
index 649326b..b668fda 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -979,7 +979,9 @@ static inline int dummy_key_permission(key_ref_t key_ref,
 }
 #endif /* CONFIG_KEYS */
 
-struct security_operations dummy_security_ops;
+struct security_operations dummy_security_ops = { 
+	.name = "dummy", 
+};
 
 #define set_to_dummy_if_null(ops, function)				\
 	do {								\
diff --git a/security/security.c b/security/security.c
index d15e56c..768f171 100644
--- a/security/security.c
+++ b/security/security.c
@@ -17,6 +17,9 @@
 #include <linux/kernel.h>
 #include <linux/security.h>
 
+/* Boot-time LSM user choice */
+static __initdata DEFINE_SPINLOCK(chosen_lsm_lock);
+static __initdata char chosen_lsm[SECURITY_NAME_MAX + 1];
 
 /* things that live in dummy.c */
 extern struct security_operations dummy_security_ops;
@@ -67,13 +70,53 @@ int __init security_init(void)
 	return 0;
 }
 
+/* Save user chosen LSM */
+static int __init choose_lsm(char *str)
+{
+	strncpy(chosen_lsm, str, SECURITY_NAME_MAX);
+	return 1;
+}
+__setup("security=", choose_lsm);
+
+/**
+ * security_module_enable - Load given security module on boot ?
+ * @ops: a pointer to the struct security_operations that is to be checked.
+ *
+ * Each LSM must pass this method before registering its own operations
+ * to avoid security registration races. This method may also be used
+ * to check if your LSM is currently loaded.
+ *
+ * Return true if:
+ *	-The passed LSM is the one chosen by user at boot time,
+ *	-or user didsn't specify a specific LSM and we're the first to ask
+ *	 for registeration permissoin,
+ *	-or the passed LSM is currently loaded.
+ * Otherwise, return false.
+ */
+int __init security_module_enable(struct security_operations *ops)
+{
+	int rc = 1;
+
+	spin_lock(&chosen_lsm_lock);
+
+	if (!*chosen_lsm)
+		strncpy(chosen_lsm, ops->name, SECURITY_NAME_MAX);
+	else if (strncmp(ops->name, chosen_lsm, SECURITY_NAME_MAX))
+		rc = 0;
+
+	spin_unlock(&chosen_lsm_lock);
+
+	return rc;
+}
+
 /**
  * register_security - registers a security framework with the kernel
  * @ops: a pointer to the struct security_options that is to be registered
  *
  * This function is to allow a security module to register itself with the
  * kernel security subsystem.  Some rudimentary checking is done on the @ops
- * value passed to this function.
+ * value passed to this function. You'll need to check first if your LSM
+ * is allowed to register its @ops by calling security_module_enable(@ops).
  *
  * If there is already a security module registered with the kernel,
  * an error will be returned.  Otherwise 0 is returned on success.
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 75c2e99..49709a4 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -5219,6 +5219,8 @@ static int selinux_key_permission(key_ref_t key_ref,
 #endif
 
 static struct security_operations selinux_ops = {
+	.name =				"selinux",
+
 	.ptrace =			selinux_ptrace,
 	.capget =			selinux_capget,
 	.capset_check =			selinux_capset_check,
@@ -5405,6 +5407,11 @@ static __init int selinux_init(void)
 {
 	struct task_security_struct *tsec;
 
+	if (!security_module_enable(&selinux_ops)) {
+		selinux_enabled = 0;
+		return 0;
+	}
+
 	if (!selinux_enabled) {
 		printk(KERN_INFO "SELinux:  Disabled at boot.\n");
 		return 0;
diff --git a/security/smack/smack.h b/security/smack/smack.h
index a21a0e9..c444f48 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -15,6 +15,7 @@
 
 #include <linux/capability.h>
 #include <linux/spinlock.h>
+#include <linux/security.h>
 #include <net/netlabel.h>
 
 /*
@@ -195,6 +196,7 @@ extern struct smack_known smack_known_star;
 extern struct smack_known smack_known_unset;
 
 extern struct smk_list_entry *smack_list;
+extern struct security_operations smack_ops;
 
 /*
  * Stricly for CIPSO level manipulation.
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 770eb06..afa7967 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -2433,7 +2433,9 @@ static void smack_release_secctx(char *secdata, u32 seclen)
 {
 }
 
-static struct security_operations smack_ops = {
+struct security_operations smack_ops = {
+	.name =				"smack",
+
 	.ptrace = 			smack_ptrace,
 	.capget = 			cap_capget,
 	.capset_check = 		cap_capset_check,
@@ -2566,6 +2568,9 @@ static struct security_operations smack_ops = {
  */
 static __init int smack_init(void)
 {
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	printk(KERN_INFO "Smack:  Initializing.\n");
 
 	/*
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 358c92c..effd3de 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -986,12 +986,21 @@ static struct vfsmount *smackfs_mount;
  *
  * register the smackfs
  *
- * Returns 0 unless the registration fails.
+ * Do not register smackfs if Smack wasn't enabled
+ * on boot. We can not put this method normally under the
+ * smack_init() code path since the security subsystem get
+ * initialized before the vfs caches.
+ *
+ * Returns true if we were not chosen on boot or if
+ * we were chosen and filesystem registration succeeded.
  */
 static int __init init_smk_fs(void)
 {
 	int err;
 
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	err = register_filesystem(&smk_fs_type);
 	if (!err) {
 		smackfs_mount = kern_mount(&smk_fs_type);


-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 67%]

* Re: [PATCH -v8 -rc3] Security: Introduce security= boot parameter
  @ 2008-03-06 14:32 99%   ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-06 14:32 UTC (permalink / raw)
  To: James Morris
  Cc: Chris Wright, Stephen Smalley, Eric Paris, Casey Schaufler,
	Paul Moore, Alexey Dobriyan, Andrew Morton, Linus, LKML, LSM-ML

Hi James,

On Thu, Mar 6, 2008, James Morris <jmorris@namei.org> wrote:
> On Thu, 6 Mar 2008, Ahmed S. Darwish wrote:
>
>  >     Handle Andrew's concerns:
>  >     - Use __init and __initdata in appropriate places.
>  >     - Do not rely upon dummy_ops layout, use C99 initializations.
>  >     - Use DEFINE_SPINLOCK instead of dynamic initialization.
>
>  The spinlock is not needed now, if security_module_enable() can only be
>  called during boot via an initcall.
>

Will do.

Would you mind answering my confusions below so I can do the change
with good understanding ?

I see preempt_disable() before calling security and vfs_caches init,
but what will prevent two processors/cores from executing
security_module_enable() concurrently (thus possibly corrupting
chosen_lsm) ?
security_module_enable() is also now used in __init init_smk_fs().

Or the init path got executed serially ?

Thank you,

-- 
Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com

^ permalink raw reply	[relevance 99%]

* [PATCH -v8b -rc3] Security: Introduce security= boot parameter
  @ 2008-03-06 16:09 68%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-06 16:09 UTC (permalink / raw)
  To: James Morris
  Cc: Chris Wright, Stephen Smalley, Eric Paris, Casey Schaufler,
	Paul Moore, Alexey Dobriyan, Andrew Morton, Linus, LKML, LSM-ML

Hi,

[ 
    Thanks James for all those helpful comments since patch-v1!.

    - Remove spinlocks since we've dictated to use security_module_enable()
      under kernel init path only (__init).
]

-->

Add the security= boot parameter. This is done to avoid LSM 
registration clashes in case of more than one bult-in module.

User can choose a security module to enable at boot. If no 
security= boot parameter is specified, only the first LSM 
asking for registration will be loaded. An invalid security 
module name will be treated as if no module has been chosen.

LSM modules must check now if they are allowed to register
by calling security_module_enable(ops) first. Modify SELinux 
and SMACK to do so.

Do not let SMACK register smackfs if it was not chosen on
boot. Smackfs assumes that smack hooks are registered and
the initial task security setup (swapper->security) is done.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
--- 

 Documentation/kernel-parameters.txt |    6 +++++
 include/linux/security.h            |   12 +++++++++++
 security/dummy.c                    |    4 ++-
 security/security.c                 |   38 +++++++++++++++++++++++++++++++++++-
 security/selinux/hooks.c            |    7 ++++++
 security/smack/smack.h              |    2 +
 security/smack/smack_lsm.c          |    7 +++++-
 security/smack/smackfs.c            |   11 +++++++++-

 8 files changed, 83 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9a5b665..64efbdc 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -374,6 +374,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			possible to determine what the correct size should be.
 			This option provides an override for these situations.
 
+	security=	[SECURITY] Choose a security module to enable at boot. 
+			If this boot parameter is not specified, only the first 
+			security module asking for security registration will be
+			loaded. An invalid security module name will be treated
+			as if no module has been chosen.
+
 	capability.disable=
 			[SECURITY] Disable capabilities.  This would normally
 			be used only if an alternative security model is to be
diff --git a/include/linux/security.h b/include/linux/security.h
index fe52cde..801c9ad 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -42,6 +42,9 @@
 
 extern unsigned securebits;
 
+/* Maximum number of letters for an LSM name string */
+#define SECURITY_NAME_MAX	10
+
 struct ctl_table;
 
 /*
@@ -117,6 +120,12 @@ struct request_sock;
 /**
  * struct security_operations - main security structure
  *
+ * Security module identifier.
+ *
+ * @name:
+ *	A string that acts as a unique identifeir for the LSM with max number
+ *	of characters = SECURITY_NAME_MAX.
+ *
  * Security hooks for program execution operations.
  *
  * @bprm_alloc_security:
@@ -1207,6 +1216,8 @@ struct request_sock;
  * This is the main security structure.
  */
 struct security_operations {
+	char name[SECURITY_NAME_MAX + 1];
+
 	int (*ptrace) (struct task_struct * parent, struct task_struct * child);
 	int (*capget) (struct task_struct * target,
 		       kernel_cap_t * effective,
@@ -1466,6 +1477,7 @@ struct security_operations {
 
 /* prototypes */
 extern int security_init	(void);
+extern int security_module_enable(struct security_operations *ops);
 extern int register_security	(struct security_operations *ops);
 extern int mod_reg_security	(const char *name, struct security_operations *ops);
 extern struct dentry *securityfs_create_file(const char *name, mode_t mode,
diff --git a/security/dummy.c b/security/dummy.c
index 649326b..b668fda 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -979,7 +979,9 @@ static inline int dummy_key_permission(key_ref_t key_ref,
 }
 #endif /* CONFIG_KEYS */
 
-struct security_operations dummy_security_ops;
+struct security_operations dummy_security_ops = { 
+	.name = "dummy", 
+};
 
 #define set_to_dummy_if_null(ops, function)				\
 	do {								\
diff --git a/security/security.c b/security/security.c
index d15e56c..8d67e73 100644
--- a/security/security.c
+++ b/security/security.c
@@ -17,6 +17,8 @@
 #include <linux/kernel.h>
 #include <linux/security.h>
 
+/* Boot-time LSM user choice */
+static __initdata char chosen_lsm[SECURITY_NAME_MAX + 1];
 
 /* things that live in dummy.c */
 extern struct security_operations dummy_security_ops;
@@ -67,13 +69,47 @@ int __init security_init(void)
 	return 0;
 }
 
+/* Save user chosen LSM */
+static int __init choose_lsm(char *str)
+{
+	strncpy(chosen_lsm, str, SECURITY_NAME_MAX);
+	return 1;
+}
+__setup("security=", choose_lsm);
+
+/**
+ * security_module_enable - Load given security module on boot ?
+ * @ops: a pointer to the struct security_operations that is to be checked.
+ *
+ * Each LSM must pass this method before registering its own operations
+ * to avoid security registration races. This method may also be used
+ * to check if your LSM is currently loaded.
+ *
+ * Return true if:
+ *	-The passed LSM is the one chosen by user at boot time,
+ *	-or user didsn't specify a specific LSM and we're the first to ask
+ *	 for registeration permissoin,
+ *	-or the passed LSM is currently loaded.
+ * Otherwise, return false.
+ */
+int __init security_module_enable(struct security_operations *ops)
+{
+	if (!*chosen_lsm)
+		strncpy(chosen_lsm, ops->name, SECURITY_NAME_MAX);
+	else if (strncmp(ops->name, chosen_lsm, SECURITY_NAME_MAX))
+		return 0;
+
+	return 1;
+}
+
 /**
  * register_security - registers a security framework with the kernel
  * @ops: a pointer to the struct security_options that is to be registered
  *
  * This function is to allow a security module to register itself with the
  * kernel security subsystem.  Some rudimentary checking is done on the @ops
- * value passed to this function.
+ * value passed to this function. You'll need to check first if your LSM
+ * is allowed to register its @ops by calling security_module_enable(@ops).
  *
  * If there is already a security module registered with the kernel,
  * an error will be returned.  Otherwise 0 is returned on success.
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 75c2e99..49709a4 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -5219,6 +5219,8 @@ static int selinux_key_permission(key_ref_t key_ref,
 #endif
 
 static struct security_operations selinux_ops = {
+	.name =				"selinux",
+
 	.ptrace =			selinux_ptrace,
 	.capget =			selinux_capget,
 	.capset_check =			selinux_capset_check,
@@ -5405,6 +5407,11 @@ static __init int selinux_init(void)
 {
 	struct task_security_struct *tsec;
 
+	if (!security_module_enable(&selinux_ops)) {
+		selinux_enabled = 0;
+		return 0;
+	}
+
 	if (!selinux_enabled) {
 		printk(KERN_INFO "SELinux:  Disabled at boot.\n");
 		return 0;
diff --git a/security/smack/smack.h b/security/smack/smack.h
index a21a0e9..c444f48 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -15,6 +15,7 @@
 
 #include <linux/capability.h>
 #include <linux/spinlock.h>
+#include <linux/security.h>
 #include <net/netlabel.h>
 
 /*
@@ -195,6 +196,7 @@ extern struct smack_known smack_known_star;
 extern struct smack_known smack_known_unset;
 
 extern struct smk_list_entry *smack_list;
+extern struct security_operations smack_ops;
 
 /*
  * Stricly for CIPSO level manipulation.
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 770eb06..afa7967 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -2433,7 +2433,9 @@ static void smack_release_secctx(char *secdata, u32 seclen)
 {
 }
 
-static struct security_operations smack_ops = {
+struct security_operations smack_ops = {
+	.name =				"smack",
+
 	.ptrace = 			smack_ptrace,
 	.capget = 			cap_capget,
 	.capset_check = 		cap_capset_check,
@@ -2566,6 +2568,9 @@ static struct security_operations smack_ops = {
  */
 static __init int smack_init(void)
 {
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	printk(KERN_INFO "Smack:  Initializing.\n");
 
 	/*
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 358c92c..effd3de 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -986,12 +986,21 @@ static struct vfsmount *smackfs_mount;
  *
  * register the smackfs
  *
- * Returns 0 unless the registration fails.
+ * Do not register smackfs if Smack wasn't enabled
+ * on boot. We can not put this method normally under the
+ * smack_init() code path since the security subsystem get
+ * initialized before the vfs caches.
+ *
+ * Returns true if we were not chosen on boot or if
+ * we were chosen and filesystem registration succeeded.
  */
 static int __init init_smk_fs(void)
 {
 	int err;
 
+	if (!security_module_enable(&smack_ops))
+		return 0;
+
 	err = register_filesystem(&smk_fs_type);
 	if (!err) {
 		smackfs_mount = kern_mount(&smk_fs_type);

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 68%]

* [PATCH BUGFIX -rc4] Smackfs: Do not trust `count' in inodes write()s
@ 2008-03-08 21:38 86% Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-08 21:38 UTC (permalink / raw)
  To: Casey Schaufler, akpm; +Cc: LKML, Linus

Hi all,

Smackfs write() implementation does not put a higher bound on the 
number of bytes to copy from user-space. This may lead to a DOS
attack if a malicious `count' field is given. 

Assure that given `count' is exactly the length needed for a 
/smack/load rule. In case of /smack/cipso where the length is 
relative, assure that `count' does not exceed the size needed for 
a buffer representing maximum possible number of CIPSO 2.2 
categories.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

 smack.h   |    8 --------
 smackfs.c |   31 ++++++++++++++++++++-----------
 2 files changed, 20 insertions(+), 19 deletions(-)

Smack user-space utilities (smackload, smackcipso) still work 
fine after below change.

[
  Who's responsible for forwarding Smack patches to Linus ?
  I've CCed Andrew and Linus just in case.
]

diff --git a/security/smack/smack.h b/security/smack/smack.h
index c444f48..4a4477f 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -27,14 +27,6 @@
 #define SMK_MAXLEN	23
 #define SMK_LABELLEN	(SMK_MAXLEN+1)
 
-/*
- * How many kinds of access are there?
- * Here's your answer.
- */
-#define SMK_ACCESSDASH	'-'
-#define SMK_ACCESSLOW	"rwxa"
-#define SMK_ACCESSKINDS	(sizeof(SMK_ACCESSLOW) - 1)
-
 struct superblock_smack {
 	char		*smk_root;
 	char		*smk_floor;
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index effd3de..3fc3a07 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -81,10 +81,23 @@ static struct semaphore smack_write_sem;
 /*
  * Values for parsing cipso rules
  * SMK_DIGITLEN: Length of a digit field in a rule.
- * SMK_CIPSOMEN: Minimum possible cipso rule length.
+ * SMK_CIPSOMIN: Minimum possible cipso rule length.
+ * SMK_CIPSOMAX: Maximum possible cipso rule length.
  */
 #define SMK_DIGITLEN 4
-#define SMK_CIPSOMIN (SMK_MAXLEN + 2 * SMK_DIGITLEN)
+#define SMK_CIPSOMIN (SMK_LABELLEN + 2 * SMK_DIGITLEN)
+#define SMK_CIPSOMAX (SMK_CIPSOMIN + SMACK_CIPSO_MAXCATNUM * SMK_DIGITLEN)
+
+/*
+ * Values for parsing MAC rules
+ * SMK_ACCESS: Maximum possible combination of access permissions
+ * SMK_ACCESSLEN: Maximum length for a rule access field
+ * SMK_LOADLEN: Smack rule length
+ */
+#define SMK_ACCESS    "rwxa"
+#define SMK_ACCESSLEN (sizeof(SMK_ACCESS) - 1)
+#define SMK_LOADLEN   (SMK_LABELLEN + SMK_LABELLEN + SMK_ACCESSLEN)
+
 
 /*
  * Seq_file read operations for /smack/load
@@ -229,14 +242,10 @@ static void smk_set_access(struct smack_rule *srp)
  * The format is exactly:
  *     char subject[SMK_LABELLEN]
  *     char object[SMK_LABELLEN]
- *     char access[SMK_ACCESSKINDS]
- *
- *     Anything following is commentary and ignored.
+ *     char access[SMK_ACCESSLEN]
  *
- * writes must be SMK_LABELLEN+SMK_LABELLEN+4 bytes.
+ * writes must be SMK_LABELLEN+SMK_LABELLEN+SMK_ACCESSLEN bytes.
  */
-#define MINIMUM_LOAD (SMK_LABELLEN + SMK_LABELLEN + SMK_ACCESSKINDS)
-
 static ssize_t smk_write_load(struct file *file, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
@@ -253,7 +262,7 @@ static ssize_t smk_write_load(struct file *file, const char __user *buf,
 		return -EPERM;
 	if (*ppos != 0)
 		return -EINVAL;
-	if (count < MINIMUM_LOAD)
+	if (count != SMK_LOADLEN)
 		return -EINVAL;
 
 	data = kzalloc(count, GFP_KERNEL);
@@ -513,7 +522,7 @@ static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
 		return -EPERM;
 	if (*ppos != 0)
 		return -EINVAL;
-	if (count <= SMK_CIPSOMIN)
+	if (count < SMK_CIPSOMIN || count > SMK_CIPSOMAX)
 		return -EINVAL;
 
 	data = kzalloc(count + 1, GFP_KERNEL);
@@ -547,7 +556,7 @@ static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
 	if (ret != 1 || catlen > SMACK_CIPSO_MAXCATNUM)
 		goto out;
 
-	if (count <= (SMK_CIPSOMIN + catlen * SMK_DIGITLEN))
+	if (count != (SMK_CIPSOMIN + catlen * SMK_DIGITLEN))
 		goto out;
 
 	memset(mapcatset, 0, sizeof(mapcatset));

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 86%]

* [RFC][PATCH] Smack<->Audit integration
@ 2008-03-10 12:49 61% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-10 12:49 UTC (permalink / raw)
  To: Casey Schaufler, Andrew Morton, James Morris
  Cc: Paul Moore, LKML, LSM-ML, Audit-ML

Hi all,

This is the second step of integrating Audit with Smack. 

The AUDIT_SUBJ_USER and AUDIT_OBJ_USER SELinux flags are recycled
for Smack to avoid `auditd' userspace modifications. Smack only
needs auditing on subject/object bases, so those flags were enough.

Patch below setups the new Audit LSM hooks and add a secid field in 
smack inode and ipc security structures. The secid field was needed
cause it's the main Audit way of distinguishing kernel objects to
be audited from the remaining ones.

Possibly the last steps would be to:
1- report smack access grants and denials through Audit (like AVC)
2- Letting Audit send an OBJ_USER event in case of a process that 
   recieved a signal without affecting SELinux.

Thank you for your reviews.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

 smack.h     |    9 ++
 smack_lsm.c |  211 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----

 2 files changed, 200 insertions(+), 17 deletions(-)

Patch is generated over James security/next branch cause it's the 
only tree that currently has the Audit hooks merged:
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6.git#next

diff --git a/security/smack/smack.h b/security/smack/smack.h
index c444f48..2c8bb4c 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -57,6 +57,15 @@ struct inode_smack {
 	char		*smk_inode;	/* label of the fso */
 	struct mutex	smk_lock;	/* initialization lock */
 	int		smk_flags;	/* smack inode flags */
+	int		secid;		/* security identifier */
+};
+
+/*
+ * IPC smack data
+ */
+struct ipc_smack {
+	char		*smk_ipc;	/* ipc object label */
+	int		secid;		/* security identifier */
 };
 
 #define	SMK_INODE_INSTANT	0x01	/* inode is instantiated */
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index afa7967..5b5e1fd 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -6,6 +6,9 @@
  *  Author:
  *	Casey Schaufler <casey@schaufler-ca.com>
  *
+ *  Audit integration by:
+ *	Ahmed S. Darwish <darwish.07@gmail.com>
+ *
  *  Copyright (C) 2007 Casey Schaufler <casey@schaufler-ca.com>
  *
  *	This program is free software; you can redistribute it and/or modify
@@ -24,6 +27,7 @@
 #include <linux/udp.h>
 #include <linux/mutex.h>
 #include <linux/pipe_fs_i.h>
+#include <linux/audit.h>
 #include <net/netlabel.h>
 #include <net/cipso_ipv4.h>
 
@@ -75,6 +79,7 @@ struct inode_smack *new_inode_smack(char *smack)
 
 	isp->smk_inode = smack;
 	isp->smk_flags = 0;
+	isp->secid = smack_to_secid(smack);
 	mutex_init(&isp->smk_lock);
 
 	return isp;
@@ -638,6 +643,8 @@ static void smack_inode_post_setxattr(struct dentry *dentry, char *name,
 	else
 		isp->smk_inode = smack_known_invalid.smk_known;
 
+	isp->secid = smack_to_secid(isp->smk_inode);
+
 	return;
 }
 
@@ -759,6 +766,17 @@ static int smack_inode_listsecurity(struct inode *inode, char *buffer,
 	return -EINVAL;
 }
 
+/**
+ * smack_inode_getsecid - Extract inode's security id
+ * @inode: inode to extract the info from
+ * @secid: where the result will be saved
+ */
+static void smack_inode_getsecid(const struct inode *inode, u32 *secid)
+{
+	struct inode_smack *isp = inode->i_security;
+	*secid = isp->secid;
+}
+
 /*
  * File Hooks
  */
@@ -1344,6 +1362,7 @@ static int smack_inode_setsecurity(struct inode *inode, const char *name,
 				   const void *value, size_t size, int flags)
 {
 	char *sp;
+	struct smack_known *skp;
 	struct inode_smack *nsp = inode->i_security;
 	struct socket_smack *ssp;
 	struct socket *sock;
@@ -1352,12 +1371,14 @@ static int smack_inode_setsecurity(struct inode *inode, const char *name,
 	if (value == NULL || size > SMK_LABELLEN)
 		return -EACCES;
 
-	sp = smk_import(value, size);
-	if (sp == NULL)
+	skp = smk_import_entry(value, size);
+	if (skp == NULL)
 		return -EINVAL;
 
+	sp = skp->smk_known;
 	if (strcmp(name, XATTR_SMACK_SUFFIX) == 0) {
 		nsp->smk_inode = sp;
+		nsp->secid = skp->smk_secid;
 		return 0;
 	}
 	/*
@@ -1471,9 +1492,16 @@ static char *smack_of_shm(struct shmid_kernel *shp)
  */
 static int smack_shm_alloc_security(struct shmid_kernel *shp)
 {
-	struct kern_ipc_perm *isp = &shp->shm_perm;
+	struct ipc_smack *isp;
+	struct kern_ipc_perm *ipcp = &shp->shm_perm;
 
-	isp->security = current->security;
+	isp = kzalloc(sizeof(struct ipc_smack), GFP_KERNEL);
+	if (isp == NULL)
+		return -ENOMEM;
+
+	isp->smk_ipc = current->security;
+	isp->secid = smack_to_secid(isp->smk_ipc);
+	ipcp->security = isp;
 	return 0;
 }
 
@@ -1485,9 +1513,9 @@ static int smack_shm_alloc_security(struct shmid_kernel *shp)
  */
 static void smack_shm_free_security(struct shmid_kernel *shp)
 {
-	struct kern_ipc_perm *isp = &shp->shm_perm;
+	struct kern_ipc_perm *ipcp = &shp->shm_perm;
 
-	isp->security = NULL;
+	kfree(ipcp->security);
 }
 
 /**
@@ -1579,9 +1607,16 @@ static char *smack_of_sem(struct sem_array *sma)
  */
 static int smack_sem_alloc_security(struct sem_array *sma)
 {
-	struct kern_ipc_perm *isp = &sma->sem_perm;
+	struct ipc_smack *isp;
+	struct kern_ipc_perm *ipcp = &sma->sem_perm;
+
+	isp = kzalloc(sizeof(struct ipc_smack), GFP_KERNEL);
+	if (isp == NULL)
+		return -ENOMEM;
 
-	isp->security = current->security;
+	isp->smk_ipc = current->security;
+	isp->secid = smack_to_secid(isp->smk_ipc);
+	ipcp->security = isp;
 	return 0;
 }
 
@@ -1595,7 +1630,7 @@ static void smack_sem_free_security(struct sem_array *sma)
 {
 	struct kern_ipc_perm *isp = &sma->sem_perm;
 
-	isp->security = NULL;
+	kfree(isp->security);
 }
 
 /**
@@ -1682,9 +1717,16 @@ static int smack_sem_semop(struct sem_array *sma, struct sembuf *sops,
  */
 static int smack_msg_queue_alloc_security(struct msg_queue *msq)
 {
-	struct kern_ipc_perm *kisp = &msq->q_perm;
+	struct ipc_smack *isp;
+	struct kern_ipc_perm *ipcp = &msq->q_perm;
 
-	kisp->security = current->security;
+	isp = kzalloc(sizeof(struct ipc_smack), GFP_KERNEL);
+	if (isp == NULL)
+		return -ENOMEM;
+
+	isp->smk_ipc = current->security;
+	isp->secid = smack_to_secid(isp->smk_ipc);
+	ipcp->security = isp;
 	return 0;
 }
 
@@ -1696,9 +1738,9 @@ static int smack_msg_queue_alloc_security(struct msg_queue *msq)
  */
 static void smack_msg_queue_free_security(struct msg_queue *msq)
 {
-	struct kern_ipc_perm *kisp = &msq->q_perm;
+	struct kern_ipc_perm *ipcp = &msq->q_perm;
 
-	kisp->security = NULL;
+	kfree(ipcp->security);
 }
 
 /**
@@ -1800,18 +1842,30 @@ static int smack_msg_queue_msgrcv(struct msg_queue *msq, struct msg_msg *msg,
 
 /**
  * smack_ipc_permission - Smack access for ipc_permission()
- * @ipp: the object permissions
+ * @ipcp: the object permissions
  * @flag: access requested
  *
  * Returns 0 if current has read and write access, error code otherwise
  */
-static int smack_ipc_permission(struct kern_ipc_perm *ipp, short flag)
+static int smack_ipc_permission(struct kern_ipc_perm *ipcp, short flag)
 {
-	char *isp = ipp->security;
 	int may;
+	struct ipc_smack *isp = ipcp->security;
 
 	may = smack_flags_to_may(flag);
-	return smk_curacc(isp, may);
+	return smk_curacc(isp->smk_ipc, may);
+}
+
+/**
+ * smack_ipc_getsecid - extract Smack security id
+ * @ipcp: the object permissions
+ * @secid: where result will be saved
+ */
+static void smack_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid)
+{
+	struct inode_smack *isp = ipcp->security;
+
+	*secid = isp->secid;
 }
 
 /* module stacking operations */
@@ -1967,6 +2021,7 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode)
 	else
 		isp->smk_inode = final;
 
+	isp->secid = smack_to_secid(isp->smk_inode);
 	isp->smk_flags |= SMK_INODE_INSTANT;
 
 unlockandout:
@@ -2391,6 +2446,116 @@ static int smack_key_permission(key_ref_t key_ref,
 #endif /* CONFIG_KEYS */
 
 /*
+ * Smack Audit hooks
+ *
+ * Audit requires an internal representation of each Smack specific
+ * rule which works as a glue between the audit hooks. Since smack_known
+ * repository entries are added but never deleted, we'll use the
+ * smack_known entry related to the given audit rule as the needed
+ * smack representation.
+ *
+ * This will also save us from searching the smack labels repository
+ * each time the audit_rule_match hook get called.
+ */
+#ifdef CONFIG_AUDIT
+
+/**
+ * smack_audit_rule_init - Initialize a smack audit rule
+ * @field: Message flags given from user-space (audit.h)
+ * @op: Required operation (Only equality testing is allowed)
+ * @rulestr: Given user-space rule
+ * @vrule: Where we'll save our own audit rule representation
+ *
+ * The label to be audited (rulestr) is created if necessay
+ */
+static int smack_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
+{
+	struct smack_known **smk_rule = (struct smack_known **)vrule;
+	*smk_rule = NULL;
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return -EINVAL;
+
+	if (op != AUDIT_EQUAL && op != AUDIT_NOT_EQUAL)
+		return -EINVAL;
+
+	*smk_rule = smk_import_entry(rulestr, 0);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_known - Distinguish smack audit rules from others
+ * @krule: rule of interest, in internal kernel representation format
+ *
+ * This is used to filter rules related to Smack from other saved
+ * Audit rules. If it's proved that this rule belongs to us, the
+ * audit_rule_match hook will be used to check a specific kernel
+ * object against its smack audit rule.
+ */
+static int smack_audit_rule_known(struct audit_krule *krule)
+{
+	struct audit_field *f;
+	int i;
+
+	for (i = 0; i < krule->field_count; i++) {
+		f = &krule->fields[i];
+
+		if (f->type == AUDIT_SUBJ_USER || f->type == AUDIT_OBJ_USER)
+			return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_match - Audit given secid identified object ?
+ * @secid: Security id to test
+ * @field: Message flags given from user-space
+ * @op: Required operation (only equality is allowed)
+ * @vrule: Smack audit rule that will be checked against the secid object
+ * @actx: audit context associated with the check (used for Audit logging)
+ *
+ * This is the core Audit hook. It's used to identify objects like 
+ * syscalls and inodes requested from user-space to be audited from 
+ * remaining kernel objects.
+ */
+static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
+				  struct audit_context *actx)
+{
+	struct smack_known *smk_rule = vrule;
+
+	if (!smk_rule) {
+		audit_log(actx, GFP_KERNEL, AUDIT_SELINUX_ERR,
+			  "Smack: missing rule\n");
+		return -ENOENT;
+	}
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return 0;
+
+	if (op == AUDIT_EQUAL)
+		return (smk_rule->smk_secid == secid);
+	if (op == AUDIT_NOT_EQUAL)
+		return (smk_rule->smk_secid != secid);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_free - free internal audit rule representation
+ * @vrule: rule to be freed.
+ *
+ * No memory was allocated in audit_rule_init.
+ */
+static void smack_audit_rule_free(void *vrule)
+{
+	/* No-op */
+}
+
+#endif /* CONFIG_AUDIT */
+
+/*
  * smack_secid_to_secctx - return the smack label for a secid
  * @secid: incoming integer
  * @secdata: destination
@@ -2476,6 +2641,7 @@ struct security_operations smack_ops = {
 	.inode_getsecurity = 		smack_inode_getsecurity,
 	.inode_setsecurity = 		smack_inode_setsecurity,
 	.inode_listsecurity = 		smack_inode_listsecurity,
+	.inode_getsecid =		smack_inode_getsecid,
 
 	.file_permission = 		smack_file_permission,
 	.file_alloc_security = 		smack_file_alloc_security,
@@ -2506,6 +2672,7 @@ struct security_operations smack_ops = {
 	.task_to_inode = 		smack_task_to_inode,
 
 	.ipc_permission = 		smack_ipc_permission,
+	.ipc_getsecid =			smack_ipc_getsecid,
 
 	.msg_msg_alloc_security = 	smack_msg_msg_alloc_security,
 	.msg_msg_free_security = 	smack_msg_msg_free_security,
@@ -2550,12 +2717,22 @@ struct security_operations smack_ops = {
 	.sk_free_security = 		smack_sk_free_security,
 	.sock_graft = 			smack_sock_graft,
 	.inet_conn_request = 		smack_inet_conn_request,
+
  /* key management security hooks */
 #ifdef CONFIG_KEYS
 	.key_alloc = 			smack_key_alloc,
 	.key_free = 			smack_key_free,
 	.key_permission = 		smack_key_permission,
 #endif /* CONFIG_KEYS */
+
+ /* Audit hooks */
+#ifdef CONFIG_AUDIT
+	.audit_rule_init =		smack_audit_rule_init,
+	.audit_rule_known =		smack_audit_rule_known,
+	.audit_rule_match =		smack_audit_rule_match,
+	.audit_rule_free =		smack_audit_rule_free,
+#endif /* CONFIG_AUDIT */
+
 	.secid_to_secctx = 		smack_secid_to_secctx,
 	.secctx_to_secid = 		smack_secctx_to_secid,
 	.release_secctx = 		smack_release_secctx,

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 61%]

* Re: [RFC][PATCH] Smack<->Audit integration
  @ 2008-03-10 18:26 99%   ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-10 18:26 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Andrew Morton, James Morris, Paul Moore, LKML, LSM-ML, Audit-ML

On Mon, Mar 10, 2008 at 09:07:08AM -0700, Casey Schaufler wrote:
> 
> --- "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> 
...
> > 
> > diff --git a/security/smack/smack.h b/security/smack/smack.h
> > index c444f48..2c8bb4c 100644
> > --- a/security/smack/smack.h
> > +++ b/security/smack/smack.h
> > @@ -57,6 +57,15 @@ struct inode_smack {
> >  	char		*smk_inode;	/* label of the fso */
> >  	struct mutex	smk_lock;	/* initialization lock */
> >  	int		smk_flags;	/* smack inode flags */
> > +	int		secid;		/* security identifier */
> 
> No.
> 
> Secid's are horrid things and every effort should be made to
> expunge them from the known universe. Under no circumstances
> should thier use be expanded. The only reason Smack has them
> at all is because certain interfaces that in my mind should
> have known better use them. If you must deal with secids,
> and for this round of audit I think that's a given, use
> smack_to_secid(sp->smk_inode) where you need to. If there's a
> real performance issue apply intelligence to smack_to_secid
> instead of storing the secid. There ought to be a way to
> use container_of to do smack_to_secid, but I had trouble with
> that and moved along without figuring out what I had done
> wrong.
> 

mm .. I should have remembered the un-official Smack motto:
"Everything is a label, and whenever possible, this label is allocated
 once through system lifetime"

About performance, yes there'll be issues searching labels espicially
in audit_rule_match() which got called at the end of every system call.
I'll try it using container_of (it should work at the end).

...
>
> > @@ -1696,9 +1738,9 @@ static int smack_msg_queue_alloc_security(struct
> > msg_queue *msq)
> >   */
> >  static void smack_msg_queue_free_security(struct msg_queue *msq)
> >  {
> > -	struct kern_ipc_perm *kisp = &msq->q_perm;
> > +	struct kern_ipc_perm *ipcp = &msq->q_perm;
> >  
> > -	kisp->security = NULL;
> > +	kfree(ipcp->security);
> >  }
> 
> Don't you just hate repetative reviewers?
>

Probably hating secids with passion :) ?

Admittedly, after some thinking I felt now that they don't fit with
the Smack model very well.

...
> > +
> > +/**
> > + * smack_audit_rule_match - Audit given secid identified object ?
> > + * @secid: Security id to test
> > + * @field: Message flags given from user-space
> > + * @op: Required operation (only equality is allowed)
> > + * @vrule: Smack audit rule that will be checked against the secid object
> > + * @actx: audit context associated with the check (used for Audit logging)
> > + *
> > + * This is the core Audit hook. It's used to identify objects like 
> > + * syscalls and inodes requested from user-space to be audited from 
> > + * remaining kernel objects.
> > + */
> > +static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
> > +				  struct audit_context *actx)
> > +{
> > +	struct smack_known *smk_rule = vrule;
> 
>         char *smack;
> 

More of "everything is a label".

> > +
> > +	if (!smk_rule) {
> > +		audit_log(actx, GFP_KERNEL, AUDIT_SELINUX_ERR,
> > +			  "Smack: missing rule\n");
> > +		return -ENOENT;
> > +	}
> > +
> > +	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
> > +		return 0;
> > +
> 
>         smack = smack_from_secid(secid);
> 
> > +	if (op == AUDIT_EQUAL)
> > +		return (smk_rule->smk_secid == secid);
> > +	if (op == AUDIT_NOT_EQUAL)
> > +		return (smk_rule->smk_secid != secid);
> 
>  	if (op == AUDIT_EQUAL)
> 		return (smk_rule->smk_smack == smack);
> 	if (op == AUDIT_NOT_EQUAL)
> 		return (smk_rule->smk_smack == smack);
> 

You've meant using the short-circuit:
	 smk_rule->smk_smack == smack || strnmp(smack, ..., ..)
Right ?

...
> > +
> > +/**
> > + * smack_audit_rule_free - free internal audit rule representation
> > + * @vrule: rule to be freed.
> > + *
> > + * No memory was allocated in audit_rule_init.
> > + */
> > +static void smack_audit_rule_free(void *vrule)
> > +{
> > +	/* No-op */
> > +}

This little no-op was the only thing that was agreed upon ;)

...
> 
> Casey Schaufler
> casey@schaufler-ca.com

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* [RFC][PATCH -v2] Smack: Integrate with Audit
  @ 2008-03-12  2:44 78%       ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-03-12  2:44 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Andrew Morton, James Morris, Paul Moore, LKML, LSM-ML, Audit-ML

Hi!,

Setup the new Audit hooks for Smack. The AUDIT_SUBJ_USER and 
AUDIT_OBJ_USER SELinux flags are recycled to avoid `auditd' 
userspace modifications. Smack only needs auditing on 
a subject/object bases, so those flags were enough.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

 smack_lsm.c |  153 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index afa7967..d471839 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -26,6 +26,7 @@
 #include <linux/pipe_fs_i.h>
 #include <net/netlabel.h>
 #include <net/cipso_ipv4.h>
+#include <linux/audit.h>
 
 #include "smack.h"
 
@@ -759,6 +760,17 @@ static int smack_inode_listsecurity(struct inode *inode, char *buffer,
 	return -EINVAL;
 }
 
+/**
+ * smack_inode_getsecid - Extract inode's security id
+ * @inode: inode to extract the info from
+ * @secid: where result will be saved
+ */
+static void smack_inode_getsecid(const struct inode *inode, u32 *secid)
+{
+	struct inode_smack *isp = inode->i_security;
+	*secid = smack_to_secid(isp->smk_inode);
+}
+
 /*
  * File Hooks
  */
@@ -1814,6 +1826,17 @@ static int smack_ipc_permission(struct kern_ipc_perm *ipp, short flag)
 	return smk_curacc(isp, may);
 }
 
+/**
+ * smack_ipc_getsecid - Extract ipc object security id
+ * @ipp: the object permissions
+ * @secid: where result will be saved
+ */
+static void smack_ipc_getsecid(struct kern_ipc_perm *ipp, u32 *secid)
+{
+	char *smack = ipp->security;
+	*secid = smack_to_secid(smack);
+}
+
 /* module stacking operations */
 
 /**
@@ -2391,6 +2414,124 @@ static int smack_key_permission(key_ref_t key_ref,
 #endif /* CONFIG_KEYS */
 
 /*
+ * Smack Audit hooks
+ *
+ * Audit requires a unique representation of each Smack specific
+ * rule. This unique representation is used to distinguish the
+ * object to be audited from remaining kernel objects and also
+ * works as a glue between the audit hooks.
+ *
+ * Since repository entries are added but never deleted, we'll use
+ * the smack_known label address related to the given audit rule as
+ * the needed unique representation. This also better fits the smack
+ * model where nearly everything is a label.
+ */
+#ifdef CONFIG_AUDIT
+
+/**
+ * smack_audit_rule_init - Initialize a smack audit rule
+ * @field: audit rule fields given from user-space (audit.h)
+ * @op: required testing operator (=, !=, >, <, ...)
+ * @rulestr: smack label to be audited
+ * @vrule: pointer to save our own audit rule representation
+ *
+ * Prepare to audit cases where (@field @op @rulestr) is true.
+ * The label to be audited is created if necessay.
+ */
+static int smack_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
+{
+	char **rule = (char **)vrule;
+	*rule = NULL;
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return -EINVAL;
+
+	if (op != AUDIT_EQUAL && op != AUDIT_NOT_EQUAL)
+		return -EINVAL;
+
+	*rule = smk_import(rulestr, 0);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_known - Distinguish Smack audit rules
+ * @krule: rule of interest, in Audit kernel representation format
+ *
+ * This is used to filter Smack rules from remaining Audit ones.
+ * If it's proved that this rule belongs to us, the
+ * audit_rule_match hook will be called to do the final judgement.
+ */
+static int smack_audit_rule_known(struct audit_krule *krule)
+{
+	struct audit_field *f;
+	int i;
+
+	for (i = 0; i < krule->field_count; i++) {
+		f = &krule->fields[i];
+
+		if (f->type == AUDIT_SUBJ_USER || f->type == AUDIT_OBJ_USER)
+			return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_match - Audit given object ?
+ * @secid: security id for identifying the object to test
+ * @field: audit rule flags given from user-space
+ * @op: required testing operator
+ * @vrule: smack internal rule presentation
+ * @actx: audit context associated with the check
+ *
+ * The core Audit hook. It's used to take the decision of
+ * whether to audit or not to audit a given object.
+ */
+static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
+				  struct audit_context *actx)
+{
+	char *smack;
+	char *rule = vrule;
+
+	if (!rule) {
+		audit_log(actx, GFP_KERNEL, AUDIT_SELINUX_ERR,
+			  "Smack: missing rule\n");
+		return -ENOENT;
+	}
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return 0;
+
+	smack = smack_from_secid(secid);
+
+	/*
+	 * No need to do string comparisons since we're sure
+	 * that if a match occurs, both pointers will point
+	 * to the same smack_konwn label.
+	 */
+	if (op == AUDIT_EQUAL)
+		return (rule == smack);
+	if (op == AUDIT_NOT_EQUAL)
+		return (rule != smack);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_free - free smack rule representation
+ * @vrule: rule to be freed.
+ *
+ * No memory was allocated.
+ */
+static void smack_audit_rule_free(void *vrule)
+{
+	/* No-op */
+}
+
+#endif /* CONFIG_AUDIT */
+
+/*
  * smack_secid_to_secctx - return the smack label for a secid
  * @secid: incoming integer
  * @secdata: destination
@@ -2476,6 +2617,7 @@ struct security_operations smack_ops = {
 	.inode_getsecurity = 		smack_inode_getsecurity,
 	.inode_setsecurity = 		smack_inode_setsecurity,
 	.inode_listsecurity = 		smack_inode_listsecurity,
+	.inode_getsecid =		smack_inode_getsecid,
 
 	.file_permission = 		smack_file_permission,
 	.file_alloc_security = 		smack_file_alloc_security,
@@ -2506,6 +2648,7 @@ struct security_operations smack_ops = {
 	.task_to_inode = 		smack_task_to_inode,
 
 	.ipc_permission = 		smack_ipc_permission,
+	.ipc_getsecid =			smack_ipc_getsecid,
 
 	.msg_msg_alloc_security = 	smack_msg_msg_alloc_security,
 	.msg_msg_free_security = 	smack_msg_msg_free_security,
@@ -2550,12 +2693,22 @@ struct security_operations smack_ops = {
 	.sk_free_security = 		smack_sk_free_security,
 	.sock_graft = 			smack_sock_graft,
 	.inet_conn_request = 		smack_inet_conn_request,
+
  /* key management security hooks */
 #ifdef CONFIG_KEYS
 	.key_alloc = 			smack_key_alloc,
 	.key_free = 			smack_key_free,
 	.key_permission = 		smack_key_permission,
 #endif /* CONFIG_KEYS */
+
+ /* Audit hooks */
+#ifdef CONFIG_AUDIT
+	.audit_rule_init =		smack_audit_rule_init,
+	.audit_rule_known =		smack_audit_rule_known,
+	.audit_rule_match =		smack_audit_rule_match,
+	.audit_rule_free =		smack_audit_rule_free,
+#endif /* CONFIG_AUDIT */
+
 	.secid_to_secctx = 		smack_secid_to_secctx,
 	.secctx_to_secid = 		smack_secctx_to_secid,
 	.release_secctx = 		smack_release_secctx,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 78%]

* [PATCH -v2b] Smack: Integrate with Audit
  @ 2008-03-12 12:18 78%           ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-12 12:18 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Andrew Morton, James Morris, Paul Moore, LKML, LSM-ML, Audit-ML

Hi!,

[ Minor styling fixes ]

-->

Setup the new Audit hooks for Smack. SELinux Audit rule fields 
are recycled to avoid `auditd' userspace modifications.
Currently only equality testing is supported on labels acting 
as a subject (AUDIT_SUBJ_USER) or as an object (AUDIT_OBJ_USER).

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index afa7967..f999ab5 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -26,6 +26,7 @@
 #include <linux/pipe_fs_i.h>
 #include <net/netlabel.h>
 #include <net/cipso_ipv4.h>
+#include <linux/audit.h>
 
 #include "smack.h"
 
@@ -759,6 +760,18 @@ static int smack_inode_listsecurity(struct inode *inode, char *buffer,
 	return -EINVAL;
 }
 
+/**
+ * smack_inode_getsecid - Extract inode's security id
+ * @inode: inode to extract the info from
+ * @secid: where result will be saved
+ */
+static void smack_inode_getsecid(const struct inode *inode, u32 *secid)
+{
+	struct inode_smack *isp = inode->i_security;
+
+	*secid = smack_to_secid(isp->smk_inode);
+}
+
 /*
  * File Hooks
  */
@@ -1814,6 +1827,18 @@ static int smack_ipc_permission(struct kern_ipc_perm *ipp, short flag)
 	return smk_curacc(isp, may);
 }
 
+/**
+ * smack_ipc_getsecid - Extract smack security id
+ * @ipcp: the object permissions
+ * @secid: where result will be saved
+ */
+static void smack_ipc_getsecid(struct kern_ipc_perm *ipp, u32 *secid)
+{
+	char *smack = ipp->security;
+
+	*secid = smack_to_secid(smack);
+}
+
 /* module stacking operations */
 
 /**
@@ -2391,6 +2416,124 @@ static int smack_key_permission(key_ref_t key_ref,
 #endif /* CONFIG_KEYS */
 
 /*
+ * Smack Audit hooks
+ *
+ * Audit requires a unique representation of each Smack specific
+ * rule. This unique representation is used to distinguish the
+ * object to be audited from remaining kernel objects and also
+ * works as a glue between the audit hooks.
+ *
+ * Since repository entries are added but never deleted, we'll use
+ * the smack_known label address related to the given audit rule as
+ * the needed unique representation. This also better fits the smack
+ * model where nearly everything is a label.
+ */
+#ifdef CONFIG_AUDIT
+
+/**
+ * smack_audit_rule_init - Initialize a smack audit rule
+ * @field: audit rule fields given from user-space (audit.h)
+ * @op: required testing operator (=, !=, >, <, ...)
+ * @rulestr: smack label to be audited
+ * @vrule: pointer to save our own audit rule representation
+ *
+ * Prepare to audit cases where (@field @op @rulestr) is true.
+ * The label to be audited is created if necessay.
+ */
+static int smack_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
+{
+	char **rule = (char **)vrule;
+	*rule = NULL;
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return -EINVAL;
+
+	if (op != AUDIT_EQUAL && op != AUDIT_NOT_EQUAL)
+		return -EINVAL;
+
+	*rule = smk_import(rulestr, 0);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_known - Distinguish Smack audit rules
+ * @krule: rule of interest, in Audit kernel representation format
+ *
+ * This is used to filter Smack rules from remaining Audit ones.
+ * If it's proved that this rule belongs to us, the
+ * audit_rule_match hook will be called to do the final judgement.
+ */
+static int smack_audit_rule_known(struct audit_krule *krule)
+{
+	struct audit_field *f;
+	int i;
+
+	for (i = 0; i < krule->field_count; i++) {
+		f = &krule->fields[i];
+
+		if (f->type == AUDIT_SUBJ_USER || f->type == AUDIT_OBJ_USER)
+			return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_match - Audit given object ?
+ * @secid: security id for identifying the object to test
+ * @field: audit rule flags given from user-space
+ * @op: required testing operator
+ * @vrule: smack internal rule presentation
+ * @actx: audit context associated with the check
+ *
+ * The core Audit hook. It's used to take the decision of
+ * whether to audit or not to audit a given object.
+ */
+static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
+				  struct audit_context *actx)
+{
+	char *smack;
+	char *rule = vrule;
+
+	if (!rule) {
+		audit_log(actx, GFP_KERNEL, AUDIT_SELINUX_ERR,
+			  "Smack: missing rule\n");
+		return -ENOENT;
+	}
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return 0;
+
+	smack = smack_from_secid(secid);
+
+	/*
+	 * No need to do string comparisons. If a match occurs,
+	 * both pointers will point to the same smack_known
+	 * label.
+	 */
+	if (op == AUDIT_EQUAL)
+		return (rule == smack);
+	if (op == AUDIT_NOT_EQUAL)
+		return (rule != smack);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_free - free smack rule representation
+ * @vrule: rule to be freed.
+ *
+ * No memory was allocated.
+ */
+static void smack_audit_rule_free(void *vrule)
+{
+	/* No-op */
+}
+
+#endif /* CONFIG_AUDIT */
+
+/*
  * smack_secid_to_secctx - return the smack label for a secid
  * @secid: incoming integer
  * @secdata: destination
@@ -2476,6 +2619,7 @@ struct security_operations smack_ops = {
 	.inode_getsecurity = 		smack_inode_getsecurity,
 	.inode_setsecurity = 		smack_inode_setsecurity,
 	.inode_listsecurity = 		smack_inode_listsecurity,
+	.inode_getsecid =		smack_inode_getsecid,
 
 	.file_permission = 		smack_file_permission,
 	.file_alloc_security = 		smack_file_alloc_security,
@@ -2506,6 +2650,7 @@ struct security_operations smack_ops = {
 	.task_to_inode = 		smack_task_to_inode,
 
 	.ipc_permission = 		smack_ipc_permission,
+	.ipc_getsecid =			smack_ipc_getsecid,
 
 	.msg_msg_alloc_security = 	smack_msg_msg_alloc_security,
 	.msg_msg_free_security = 	smack_msg_msg_free_security,
@@ -2550,12 +2695,22 @@ struct security_operations smack_ops = {
 	.sk_free_security = 		smack_sk_free_security,
 	.sock_graft = 			smack_sock_graft,
 	.inet_conn_request = 		smack_inet_conn_request,
+
  /* key management security hooks */
 #ifdef CONFIG_KEYS
 	.key_alloc = 			smack_key_alloc,
 	.key_free = 			smack_key_free,
 	.key_permission = 		smack_key_permission,
 #endif /* CONFIG_KEYS */
+
+ /* Audit hooks */
+#ifdef CONFIG_AUDIT
+	.audit_rule_init =		smack_audit_rule_init,
+	.audit_rule_known =		smack_audit_rule_known,
+	.audit_rule_match =		smack_audit_rule_match,
+	.audit_rule_free =		smack_audit_rule_free,
+#endif /* CONFIG_AUDIT */
+
 	.secid_to_secctx = 		smack_secid_to_secctx,
 	.secctx_to_secid = 		smack_secctx_to_secid,
 	.release_secctx = 		smack_release_secctx,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 78%]

* Re: [RFC][PATCH -v2] Smack: Integrate with Audit
  @ 2008-03-12 16:43 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-12 16:43 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: casey, Andrew Morton, James Morris, Paul Moore, LKML, LSM-ML,
	Audit-ML, Steve Grubb

On Wed, Mar 12, 2008 at 11:48:17AM -0400, Stephen Smalley wrote:
> 
> On Wed, 2008-03-12 at 08:40 -0700, Casey Schaufler wrote:
> > --- Stephen Smalley <sds@tycho.nsa.gov> wrote:
> > 
> > > 
> > > On Wed, 2008-03-12 at 04:44 +0200, Ahmed S. Darwish wrote:
> > > > Hi!,
> > > > 
> > > > Setup the new Audit hooks for Smack. The AUDIT_SUBJ_USER and 
> > > > AUDIT_OBJ_USER SELinux flags are recycled to avoid `auditd' 
> > > > userspace modifications. Smack only needs auditing on 
> > > > a subject/object bases, so those flags were enough.
> > > 
> > > Only question I have is whether audit folks are ok with reuse of the
> > > flags in this manner, and whether the _USER flag is best suited for this
> > > purpose if you are going to reuse an existing flag (since Smack label
> > > seems more like a SELinux type than a SELinux user).
> > 
> > To-mate-o toe-maht-o.
> > 
> > There really doesn't seem to be any real reason to create a new
> > flag just because the granularity is different. The choice between
> > _USER and _TYPE (and _ROLE for that matter) is arbitrary from a
> > functional point of view. I say that since Smack has users, but
> > not types or roles, _USER makes the most sense.
> 
> Perhaps I misunderstand, but Smack labels don't represent users (i.e.
> user identity) in any way, so it seemed like a mismatch to use the _USER
> flag there.  Whereas types in SELinux bear some similarity to Smack
> labels - simple unstructured names whose meaning is only defined by the
> policy rules.
> 

I think Casey meant the common use of Smack where a login program
(openssh, bin/login, ..) sets a label for each user that logs in, thus
letting each label effectively representing a user.

In a sense, smack labels share a bit of _USER and _TYPE.

> Regardless, it seems like the audit maintainers ought to weigh in on the
> matter.
> 

Indeed.

Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* [PATCH BUGFIX -rc5] Smack: Do not dereference NULL ipc object
@ 2008-03-14 23:10 96% Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-14 23:10 UTC (permalink / raw)
  To: Casey Schaufler, akpm; +Cc: LKML

Hi all,

In the SYSV ipc msgctl(),semctl(),shmctl() family, if the user passed
*_INFO as the desired operation, no specific object is meant to be 
controlled and only system-wide information is returned. This leads
to a NULL IPC object in the LSM hooks if the _INFO flag is given.

Avoid dereferencing this NULL pointer in Smack ipc *ctl() methods.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

 smack_lsm.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 0241fd3..38d7075 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -1508,7 +1508,7 @@ static int smack_shm_associate(struct shmid_kernel *shp, int shmflg)
  */
 static int smack_shm_shmctl(struct shmid_kernel *shp, int cmd)
 {
-	char *ssp = smack_of_shm(shp);
+	char *ssp;
 	int may;
 
 	switch (cmd) {
@@ -1532,6 +1532,7 @@ static int smack_shm_shmctl(struct shmid_kernel *shp, int cmd)
 		return -EINVAL;
 	}
 
+	ssp = smack_of_shm(shp);
 	return smk_curacc(ssp, may);
 }
 
@@ -1616,7 +1617,7 @@ static int smack_sem_associate(struct sem_array *sma, int semflg)
  */
 static int smack_sem_semctl(struct sem_array *sma, int cmd)
 {
-	char *ssp = smack_of_sem(sma);
+	char *ssp;
 	int may;
 
 	switch (cmd) {
@@ -1645,6 +1646,7 @@ static int smack_sem_semctl(struct sem_array *sma, int cmd)
 		return -EINVAL;
 	}
 
+	ssp = smack_of_sem(sma);
 	return smk_curacc(ssp, may);
 }
 
@@ -1730,7 +1732,7 @@ static int smack_msg_queue_associate(struct msg_queue *msq, int msqflg)
  */
 static int smack_msg_queue_msgctl(struct msg_queue *msq, int cmd)
 {
-	char *msp = smack_of_msq(msq);
+	char *msp;
 	int may;
 
 	switch (cmd) {
@@ -1752,6 +1754,7 @@ static int smack_msg_queue_msgctl(struct msg_queue *msq, int cmd)
 		return -EINVAL;
 	}
 
+	msp = smack_of_msq(msq);
 	return smk_curacc(msp, may);
 }
 
Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 96%]

* Re: [PATCH] de-semaphorize smackfs
  @ 2008-03-18 12:01 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-18 12:01 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jonathan Corbet, linux-kernel, Casey Schaufler

Hi!,

On Tue, Mar 18, 2008 at 9:35 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Mar 17, 2008 at 05:14:40PM -0600, Jonathan Corbet wrote:
>  > While looking at the generic semaphore patch, I came to wonder what the
>  > remaining semaphore users in the kernel were actually doing.  After a
>  > quick grep for down_interruptible(), smackfs remained at the bottom of
>  > my screen.  It seems like a straightforward mutex case - low-hanging
>  > fruit.  So here's a conversion patch; compile-tested only, but what
>  > could go wrong?
>

I have a feeling that a very nice LWN article is under the way :).

>
>  The lock is kept while returning to userspace and could potentially
>  be release by another process.  This is not allowed for mutexes.
>

Yes, this was an artifact of an old design where different processes
write _sessions_ were not allowed to interleave, thus locking them in
the very beginning (open).

Since now those sessions can be interleaved safely, I'll modify this
issue today to move the locking to each write() call instead.

Thanks Jonathan/Cristoph

-- 
Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com

^ permalink raw reply	[relevance 99%]

* [PATCH BUGFIX -rc6] Smackfs: remove redundant lock, fix open(,O_RDWR)
@ 2008-03-21 15:05 94% Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-03-21 15:05 UTC (permalink / raw)
  To: Jonathan Corbet, Casey Schaufler, Andrew Morton
  Cc: LKML, Christoph Hellwig, Daniel Walker

Hi all,

Older smackfs was parsing MAC rules by characters, thus a need of
locking write sessions on open() was needed. This lock is no longer
useful now since each rule is handled by a single write() call.

This is also a bugfix since seq_open() was not called if an open()
O_RDWR flag was given, leading to a seq_read() without an initialized
seq_file, thus an Oops.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Reported-by: Jonathan Corbet <corbet@lwn.net>
--

 security/smack/smackfs.c |   35 ++---------------------------------
 1 file changed, 2 insertions(+), 33 deletions(-)

diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index afe7c9b..cfae8af 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -74,11 +74,6 @@ struct smk_list_entry *smack_list;
 #define	SEQ_READ_FINISHED	1
 
 /*
- * Disable concurrent writing open() operations
- */
-static struct semaphore smack_write_sem;
-
-/*
  * Values for parsing cipso rules
  * SMK_DIGITLEN: Length of a digit field in a rule.
  * SMK_CIPSOMIN: Minimum possible cipso rule length.
@@ -168,32 +163,7 @@ static struct seq_operations load_seq_ops = {
  */
 static int smk_open_load(struct inode *inode, struct file *file)
 {
-	if ((file->f_flags & O_ACCMODE) == O_RDONLY)
-		return seq_open(file, &load_seq_ops);
-
-	if (down_interruptible(&smack_write_sem))
-		return -ERESTARTSYS;
-
-	return 0;
-}
-
-/**
- * smk_release_load - release() for /smack/load
- * @inode: inode structure representing file
- * @file: "load" file pointer
- *
- * For a reading session, use the seq_file release
- * implementation.
- * Otherwise, we are at the end of a writing session so
- * clean everything up.
- */
-static int smk_release_load(struct inode *inode, struct file *file)
-{
-	if ((file->f_flags & O_ACCMODE) == O_RDONLY)
-		return seq_release(inode, file);
-
-	up(&smack_write_sem);
-	return 0;
+	return seq_open(file, &load_seq_ops);
 }
 
 /**
@@ -341,7 +311,7 @@ static const struct file_operations smk_load_ops = {
 	.read		= seq_read,
 	.llseek         = seq_lseek,
 	.write		= smk_write_load,
-	.release        = smk_release_load,
+	.release        = seq_release,
 };
 
 /**
@@ -1011,7 +981,6 @@ static int __init init_smk_fs(void)
 		}
 	}
 
-	sema_init(&smack_write_sem, 1);
 	smk_cipso_doi();
 	smk_unlbl_ambient(NULL);
 
Regards,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 94%]

* Re: [patch] SLQB v2
  @ 2008-04-15  3:44 94% ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-04-15  3:44 UTC (permalink / raw)
  To: Nick Piggin, Christoph Lameter
  Cc: Linux Memory Management List, Linux Kernel Mailing List

Hi!,

On Thu, Apr 10, 2008 at 09:31:38PM +0200, Nick Piggin wrote:
...
> +
> +/*
> + * We use struct slqb_page fields to manage some slob allocation aspects,
> + * however to avoid the horrible mess in include/linux/mm_types.h, we'll
> + * just define our own struct slqb_page type variant here.
> + */
> +struct slqb_page {
> +	union {
> +		struct {
> +			unsigned long flags;	/* mandatory */
> +			atomic_t _count;	/* mandatory */
> +			unsigned int inuse;	/* Nr of objects */
> +		   	struct kmem_cache_list *list; /* Pointer to list */
> +			void **freelist;	/* freelist req. slab lock */
> +			union {
> +				struct list_head lru; /* misc. list */
> +				struct rcu_head rcu_head; /* for rcu freeing */
> +			};
> +		};
> +		struct page page;
> +	};
> +};

A small question for SLUB devs, would you accept a patch that does
a similar thing by creating 'slub_page' instead of stuffing slub 
elements (freelist, inuse, ..) in 'mm_types::struct page' unions ?

Maybe cause I'm new to MM, but I felt I could understand the code 
much more better the SLQB slqb_page way.

...
> +/*
> + * Kmalloc subsystem.
> + */
> +#if defined(ARCH_KMALLOC_MINALIGN) && ARCH_KMALLOC_MINALIGN > 8
> +#define KMALLOC_MIN_SIZE ARCH_KMALLOC_MINALIGN
> +#else
> +#define KMALLOC_MIN_SIZE 8
> +#endif
> +
> +#define KMALLOC_SHIFT_LOW ilog2(KMALLOC_MIN_SIZE)
> +#define KMALLOC_SHIFT_SLQB_HIGH (PAGE_SHIFT + 5)
> +
> +/*
> + * We keep the general caches in an array of slab caches that are used for
> + * 2^x bytes of allocations.
> + */
> +extern struct kmem_cache kmalloc_caches[KMALLOC_SHIFT_SLQB_HIGH + 1];
> +

So AFAIK in an x86 where PAGE_SHIFT = 12, KMALLOC_SHIFT_SLQB_HIGH+1 will
equal= 18.

> +/*
> + * Sorry that the following has to be that ugly but some versions of GCC
> + * have trouble with constant propagation and loops.
> + */
> +static __always_inline int kmalloc_index(size_t size)
> +{
> +	if (!size)
> +		return 0;
> +
> +	if (size <= KMALLOC_MIN_SIZE)
> +		return KMALLOC_SHIFT_LOW;
> +
> +	if (size > 64 && size <= 96)
> +		return 1;
> +	if (size > 128 && size <= 192)
> +		return 2;
> +	if (size <=          8) return 3;
> +	if (size <=         16) return 4;
> +	if (size <=         32) return 5;
> +	if (size <=         64) return 6;
> +	if (size <=        128) return 7;
> +	if (size <=        256) return 8;
> +	if (size <=        512) return 9;
> +	if (size <=       1024) return 10;
> +	if (size <=   2 * 1024) return 11;
> +/*
> + * The following is only needed to support architectures with a larger page
> + * size than 4k.
> + */
> +	if (size <=   4 * 1024) return 12;
> +	if (size <=   8 * 1024) return 13;
> +	if (size <=  16 * 1024) return 14;
> +	if (size <=  32 * 1024) return 15;
> +	if (size <=  64 * 1024) return 16;
> +	if (size <= 128 * 1024) return 17;
> +	if (size <= 256 * 1024) return 18;
> +	if (size <= 512 * 1024) return 19;

I'm sure there's something utterly wrong in my understanding, but how this
is designed to not overflow kmalloc_caches[18] in an x86-32 machine ?

I can not see this possible-only-in-my-mind overflow happens in SLUB as it 
just delegate the work to the page allocator if size > PAGE_SIZE.

> +	if (size <= 1024 * 1024) return 20;
> +	if (size <=  2 * 1024 * 1024) return 21;
> +	return -1;
> +
> +/*
> + * What we really wanted to do and cannot do because of compiler issues is:
> + *	int i;
> + *	for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++)
> + *		if (size <= (1 << i))
> + *			return i;
> + */
> +}
> +
> +/*
> + * Find the slab cache for a given combination of allocation flags and size.
> + *
> + * This ought to end up with a global pointer to the right cache
> + * in kmalloc_caches.
> + */
> +static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
> +{
> +	int index = kmalloc_index(size);
> +
> +	if (index == 0)
> +		return NULL;
> +
> +	return &kmalloc_caches[index];
> +}
> +
> +#ifdef CONFIG_ZONE_DMA
> +#define SLQB_DMA __GFP_DMA
> +#else
> +/* Disable DMA functionality */
> +#define SLQB_DMA (__force gfp_t)0
> +#endif
> +
> +void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
> +void *__kmalloc(size_t size, gfp_t flags);
> +
> +static __always_inline void *kmalloc(size_t size, gfp_t flags)
> +{
> +	if (__builtin_constant_p(size)) {
> +		if (likely(!(flags & SLQB_DMA))) {
> +			struct kmem_cache *s = kmalloc_slab(size);
> +			if (!s)
> +				return ZERO_SIZE_PTR;
> +			return kmem_cache_alloc(s, flags);
> +		}
> +	}
> +	return __kmalloc(size, flags);
> +}
> +
> +#ifdef CONFIG_NUMA
> +void *__kmalloc_node(size_t size, gfp_t flags, int node);
> +void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
> +
> +static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> +{
> +	if (__builtin_constant_p(size)) {
> +		if (likely(!(flags & SLQB_DMA))) {
> +			struct kmem_cache *s = kmalloc_slab(size);
> +			if (!s)
> +				return ZERO_SIZE_PTR;
> +			return kmem_cache_alloc_node(s, flags, node);
> +		}
> +	}
> +	return __kmalloc_node(size, flags, node);
> +}

Why this compile-time/run-time divide although both kmem_cache_alloc{,_node}
and __kmalloc{,_node} call slab_alloc() at the end ?

> +#endif
> +
> +#endif /* _LINUX_SLQB_DEF_H */
> Index: linux-2.6/init/Kconfig
> ===================================================================
> --- linux-2.6.orig/init/Kconfig
> +++ linux-2.6/init/Kconfig
> @@ -701,6 +701,11 @@ config SLUB_DEBUG
>  	  SLUB sysfs support. /sys/slab will not exist and there will be
>  	  no support for cache validation etc.
>  
> +config SLQB_DEBUG
> +	default y
> +	bool "Enable SLQB debugging support"
> +	depends on SLQB
> +

Maybe if SLQB got merged it can just be easier to have a general SLAB_DEBUG
option that is recognized by the current 4 slab allocators ?

>  choice
>  	prompt "Choose SLAB allocator"
>  	default SLUB
> @@ -724,6 +729,9 @@ config SLUB
>  	   of queues of objects. SLUB can use memory efficiently
>  	   and has enhanced diagnostics.
>  
> +config SLQB
> +	bool "SLQB (Qeued allocator)"
> +
>  config SLOB
>  	depends on EMBEDDED
>  	bool "SLOB (Simple Allocator)"
> @@ -763,7 +771,7 @@ endmenu		# General setup
>  config SLABINFO
>  	bool
>  	depends on PROC_FS
> -	depends on SLAB || SLUB
> +	depends on SLAB || SLUB || SLQB
>  	default y
>  
>  config RT_MUTEXES
> Index: linux-2.6/lib/Kconfig.debug
> ===================================================================
> --- linux-2.6.orig/lib/Kconfig.debug
> +++ linux-2.6/lib/Kconfig.debug
> @@ -221,6 +221,16 @@ config SLUB_STATS
>  	  out which slabs are relevant to a particular load.
>  	  Try running: slabinfo -DA
>  
> +config SLQB_DEBUG_ON
> +	bool "SLQB debugging on by default"
> +	depends on SLQB_DEBUG
> +	default n
> +
> +config SLQB_STATS
> +	default n
> +	bool "Enable SLQB performance statistics"
> +	depends on SLQB
> +
>  config DEBUG_PREEMPT
>  	bool "Debug preemptible kernel"
>  	depends on DEBUG_KERNEL && PREEMPT && (TRACE_IRQFLAGS_SUPPORT || PPC64)
> Index: linux-2.6/mm/slqb.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6/mm/slqb.c
> @@ -0,0 +1,4027 @@
> +/*
> + * SLQB: A slab allocator that focuses on per-CPU scaling, and good performance
> + * with order-0 allocations. Fastpaths emphasis is placed on local allocaiton
> + * and freeing, and remote freeing (freeing on another CPU from that which
> + * allocated).
> + *
> + * Using ideas from mm/slab.c, mm/slob.c, and mm/slub.c,
> + *
> + * And parts of code from mm/slub.c
> + * (C) 2007 SGI, Christoph Lameter <clameter@sgi.com>
> + */
> +
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/bit_spinlock.h>
> +#include <linux/interrupt.h>
> +#include <linux/bitops.h>
> +#include <linux/slab.h>
> +#include <linux/seq_file.h>
> +#include <linux/cpu.h>
> +#include <linux/cpuset.h>
> +#include <linux/mempolicy.h>
> +#include <linux/ctype.h>
> +#include <linux/kallsyms.h>
> +#include <linux/memory.h>
> +
> +/*
> + * Lock order:
> + *   1. kmem_cache_node->list_lock
> + *    2. kmem_cache_remote_free->lock
> + *
> + *   Interrupts are disabled during allocation and deallocation in order to
> + *   make the slab allocator safe to use in the context of an irq. In addition
> + *   interrupts are disabled to ensure that the processor does not change
> + *   while handling per_cpu slabs, due to kernel preemption.
> + *
> + * SLIB assigns one slab for allocation to each processor.
> + * Allocations only occur from these slabs called cpu slabs.
> + *

SLQB is a much more better name than SLIB :).

> + * Slabs with free elements are kept on a partial list and during regular
> + * operations no list for full slabs is used. If an object in a full slab is
> + * freed then the slab will show up again on the partial lists.
> + * We track full slabs for debugging purposes though because otherwise we
> + * cannot scan all objects.
> + *
> + * Slabs are freed when they become empty. Teardown and setup is
> + * minimal so we rely on the page allocators per cpu caches for
> + * fast frees and allocs.
> + */
> +

Ok, I admit I didn't do my homework yet of fully understanding the
diff between SLUB and SLQB except in the kmalloc(> PAGE_SIZE) case.
I hope my understanding will get better soon.

...
> +/*
> + * Slow path. The lockless freelist is empty or we need to perform
> + * debugging duties.
> + *
> + * Interrupts are disabled.
> + *
> + * Processing is still very fast if new objects have been freed to the
> + * regular freelist. In that case we simply take over the regular freelist
> + * as the lockless freelist and zap the regular freelist.
> + *
> + * If that is not working then we fall back to the partial lists. We take the
> + * first element of the freelist as the object to allocate now and move the
> + * rest of the freelist to the lockless freelist.
> + *
> + * And if we were unable to get a new slab from the partial slab lists then
> + * we need to allocate a new slab. This is slowest path since we may sleep.
> + */
> +static __always_inline void *__slab_alloc(struct kmem_cache *s,
> +		gfp_t gfpflags, int node, void *addr)
> +{

__slab_alloc istelf is not the slow-path, but the slow path begins 
from the alloc_new: label, right ? 

If so, then IMHO the comment is a bit misleading since it gives 
the impression that the whole __slab_alloc() is the slow path.

> +	void *object;
> +	struct slqb_page *page;
> +	struct kmem_cache_cpu *c;
> +	struct kmem_cache_list *l;
> +#ifdef CONFIG_NUMA
> +	struct kmem_cache_node *n;
> +
> +	if (unlikely(node != -1) && unlikely(node != numa_node_id())) {
> +		n = s->node[node];
> +		VM_BUG_ON(!n);
> +		l = &n->list;
> +
> +		if (unlikely(!l->nr_partial && !l->nr_free && !l->remote_free_check))
> +			goto alloc_new;
> +
> +		spin_lock(&n->list_lock);
> +remote_list_have_object:
> +		page = __cache_list_get_page(s, l);
> +		if (unlikely(!page)) {
> +			spin_unlock(&n->list_lock);
> +			goto alloc_new;
> +		}
> +		VM_BUG_ON(node != -1 && node != slqb_page_to_nid(page));
> +
> +remote_found:
> +		object = page->freelist;
> +		page->freelist = get_freepointer(s, object);
> +		//prefetch(((void *)page->freelist) + s->offset);
> +		page->inuse++;
> +		VM_BUG_ON((page->inuse == s->objects) != (page->freelist == NULL));
> +		spin_unlock(&n->list_lock);
> +
> +		return object;
> +	}
> +#endif
> +
> +	c = get_cpu_slab(s, smp_processor_id());
> +	VM_BUG_ON(!c);
> +	l = &c->list;
> +	page = __cache_list_get_page(s, l);
> +	if (unlikely(!page))
> +		goto alloc_new;
> +	VM_BUG_ON(node != -1 && node != slqb_page_to_nid(page));
> +
> +local_found:
> +	object = page->freelist;
> +	page->freelist = get_freepointer(s, object);
> +	//prefetch(((void *)page->freelist) + s->offset);
> +	page->inuse++;
> +	VM_BUG_ON((page->inuse == s->objects) != (page->freelist == NULL));
> +
> +	return object;
> +
> +alloc_new:
> +#if 0
> +	/* XXX: load any partial? */
> +#endif
> +
> +	/* Caller handles __GFP_ZERO */
> +	gfpflags &= ~__GFP_ZERO;
> +
> +	if (gfpflags & __GFP_WAIT)
> +		local_irq_enable();
> +	page = new_slab_page(s, gfpflags, node);
> +	if (gfpflags & __GFP_WAIT)
> +		local_irq_disable();
> +	if (unlikely(!page))
> +		return NULL;
> +
> +	if (!NUMA_BUILD || likely(slqb_page_to_nid(page) == numa_node_id())) {
> +		c = get_cpu_slab(s, smp_processor_id());
> +		l = &c->list;
> +		page->list = l;
> +		l->nr_slabs++;
> +		if (page->inuse + 1 < s->objects) {
> +			list_add(&page->lru, &l->partial);
> +			l->nr_partial++;
> +		} else {
> +/*XXX			list_add(&page->lru, &l->full); */
> +		}
> +		goto local_found;
> +	} else {
> +#ifdef CONFIG_NUMA
> +		n = s->node[slqb_page_to_nid(page)];
> +		spin_lock(&n->list_lock);
> +		l = &n->list;
> +
> +		if (l->nr_free || l->nr_partial || l->remote_free_check) {
> +			__free_slab(s, page);
> +			goto remote_list_have_object;
> +		}
> +
> +		l->nr_slabs++;
> +		page->list = l;
> +		if (page->inuse + 1 < s->objects) {
> +			list_add(&page->lru, &l->partial);
> +			l->nr_partial++;
> +		} else {
> +/*XXX			list_add(&page->lru, &l->full); */
> +		}
> +		goto remote_found;
> +#endif
> +	}
> +
...

Thanks for helping me gaining a better understanding of SLQB!

Warm regards

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 94%]

* [PATCH 2.6.26 #repost] Smack: Integrate with Audit
  @ 2008-04-17 16:10 75% ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-04-17 16:10 UTC (permalink / raw)
  To: James Morris, Casey Schaufler
  Cc: Stephen Smalley, Steve Grubb, Al Viro, linux-security-module,
	linux-audit, linux-kernel

Hi James/all,

On Thu, Apr 17, 2008 at 11:05:57AM +0000, James Morris wrote:
> 
> Please review the following security patches for 2.6.26, which have
> been undergoing testing in the "next" tree and affect multiple LSMs.
> 
> 

As a clarification, those new changes was added to the security tree 
to modularly integrate Smack with Audit. The final step is the
reposted below patch which setups the new Audit hooks for Smack.

The main concern against below patch was the reuse of SELinux Audit
fields. For such reuse, Stephen asked for an explicit ACK from the 
Audit devs. I've CCed Steve and Al as a kind request for the ACK.

Patch is re-based and re-tested over James's security/for-linus
branch.

Thanks all.

-->

Setup the new Audit hooks for Smack. SELinux Audit rule fields 
are recycled to avoid `auditd' userspace modifications.
Currently only equality testing is supported on labels acting 
as a subject (AUDIT_SUBJ_USER) or as an object (AUDIT_OBJ_USER).

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 904bdc0..70e4abc 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -26,6 +26,7 @@
 #include <linux/pipe_fs_i.h>
 #include <net/netlabel.h>
 #include <net/cipso_ipv4.h>
+#include <linux/audit.h>
 
 #include "smack.h"
 
@@ -752,6 +753,18 @@ static int smack_inode_listsecurity(struct inode *inode, char *buffer,
 	return -EINVAL;
 }
 
+/**
+ * smack_inode_getsecid - Extract inode's security id
+ * @inode: inode to extract the info from
+ * @secid: where result will be saved
+ */
+static void smack_inode_getsecid(const struct inode *inode, u32 *secid)
+{
+	struct inode_smack *isp = inode->i_security;
+
+	*secid = smack_to_secid(isp->smk_inode);
+}
+
 /*
  * File Hooks
  */
@@ -1805,6 +1818,18 @@ static int smack_ipc_permission(struct kern_ipc_perm *ipp, short flag)
 	return smk_curacc(isp, may);
 }
 
+/**
+ * smack_ipc_getsecid - Extract smack security id
+ * @ipcp: the object permissions
+ * @secid: where result will be saved
+ */
+static void smack_ipc_getsecid(struct kern_ipc_perm *ipp, u32 *secid)
+{
+	char *smack = ipp->security;
+
+	*secid = smack_to_secid(smack);
+}
+
 /* module stacking operations */
 
 /**
@@ -2382,6 +2407,124 @@ static int smack_key_permission(key_ref_t key_ref,
 #endif /* CONFIG_KEYS */
 
 /*
+ * Smack Audit hooks
+ *
+ * Audit requires a unique representation of each Smack specific
+ * rule. This unique representation is used to distinguish the
+ * object to be audited from remaining kernel objects and also
+ * works as a glue between the audit hooks.
+ *
+ * Since repository entries are added but never deleted, we'll use
+ * the smack_known label address related to the given audit rule as
+ * the needed unique representation. This also better fits the smack
+ * model where nearly everything is a label.
+ */
+#ifdef CONFIG_AUDIT
+
+/**
+ * smack_audit_rule_init - Initialize a smack audit rule
+ * @field: audit rule fields given from user-space (audit.h)
+ * @op: required testing operator (=, !=, >, <, ...)
+ * @rulestr: smack label to be audited
+ * @vrule: pointer to save our own audit rule representation
+ *
+ * Prepare to audit cases where (@field @op @rulestr) is true.
+ * The label to be audited is created if necessay.
+ */
+static int smack_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
+{
+	char **rule = (char **)vrule;
+	*rule = NULL;
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return -EINVAL;
+
+	if (op != AUDIT_EQUAL && op != AUDIT_NOT_EQUAL)
+		return -EINVAL;
+
+	*rule = smk_import(rulestr, 0);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_known - Distinguish Smack audit rules
+ * @krule: rule of interest, in Audit kernel representation format
+ *
+ * This is used to filter Smack rules from remaining Audit ones.
+ * If it's proved that this rule belongs to us, the
+ * audit_rule_match hook will be called to do the final judgement.
+ */
+static int smack_audit_rule_known(struct audit_krule *krule)
+{
+	struct audit_field *f;
+	int i;
+
+	for (i = 0; i < krule->field_count; i++) {
+		f = &krule->fields[i];
+
+		if (f->type == AUDIT_SUBJ_USER || f->type == AUDIT_OBJ_USER)
+			return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_match - Audit given object ?
+ * @secid: security id for identifying the object to test
+ * @field: audit rule flags given from user-space
+ * @op: required testing operator
+ * @vrule: smack internal rule presentation
+ * @actx: audit context associated with the check
+ *
+ * The core Audit hook. It's used to take the decision of
+ * whether to audit or not to audit a given object.
+ */
+static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
+				  struct audit_context *actx)
+{
+	char *smack;
+	char *rule = vrule;
+
+	if (!rule) {
+		audit_log(actx, GFP_KERNEL, AUDIT_SELINUX_ERR,
+			  "Smack: missing rule\n");
+		return -ENOENT;
+	}
+
+	if (field != AUDIT_SUBJ_USER && field != AUDIT_OBJ_USER)
+		return 0;
+
+	smack = smack_from_secid(secid);
+
+	/*
+	 * No need to do string comparisons. If a match occurs,
+	 * both pointers will point to the same smack_known
+	 * label.
+	 */
+	if (op == AUDIT_EQUAL)
+		return (rule == smack);
+	if (op == AUDIT_NOT_EQUAL)
+		return (rule != smack);
+
+	return 0;
+}
+
+/**
+ * smack_audit_rule_free - free smack rule representation
+ * @vrule: rule to be freed.
+ *
+ * No memory was allocated.
+ */
+static void smack_audit_rule_free(void *vrule)
+{
+	/* No-op */
+}
+
+#endif /* CONFIG_AUDIT */
+
+/*
  * smack_secid_to_secctx - return the smack label for a secid
  * @secid: incoming integer
  * @secdata: destination
@@ -2467,6 +2610,7 @@ struct security_operations smack_ops = {
 	.inode_getsecurity = 		smack_inode_getsecurity,
 	.inode_setsecurity = 		smack_inode_setsecurity,
 	.inode_listsecurity = 		smack_inode_listsecurity,
+	.inode_getsecid =		smack_inode_getsecid,
 
 	.file_permission = 		smack_file_permission,
 	.file_alloc_security = 		smack_file_alloc_security,
@@ -2497,6 +2641,7 @@ struct security_operations smack_ops = {
 	.task_to_inode = 		smack_task_to_inode,
 
 	.ipc_permission = 		smack_ipc_permission,
+	.ipc_getsecid =			smack_ipc_getsecid,
 
 	.msg_msg_alloc_security = 	smack_msg_msg_alloc_security,
 	.msg_msg_free_security = 	smack_msg_msg_free_security,
@@ -2541,12 +2686,22 @@ struct security_operations smack_ops = {
 	.sk_free_security = 		smack_sk_free_security,
 	.sock_graft = 			smack_sock_graft,
 	.inet_conn_request = 		smack_inet_conn_request,
+
  /* key management security hooks */
 #ifdef CONFIG_KEYS
 	.key_alloc = 			smack_key_alloc,
 	.key_free = 			smack_key_free,
 	.key_permission = 		smack_key_permission,
 #endif /* CONFIG_KEYS */
+
+ /* Audit hooks */
+#ifdef CONFIG_AUDIT
+	.audit_rule_init =		smack_audit_rule_init,
+	.audit_rule_known =		smack_audit_rule_known,
+	.audit_rule_match =		smack_audit_rule_match,
+	.audit_rule_free =		smack_audit_rule_free,
+#endif /* CONFIG_AUDIT */
+
 	.secid_to_secctx = 		smack_secid_to_secctx,
 	.secctx_to_secid = 		smack_secctx_to_secid,
 	.release_secctx = 		smack_release_secctx,

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 75%]

* [PATCH BUGFIX -rc4] Smack: Respect 'unlabeled' netlabel mode
@ 2008-05-30 23:36 99% Ahmed S. Darwish
    2008-05-30 23:57 94% ` [PATCH BUGFIX -v2 " Ahmed S. Darwish
  0 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2008-05-30 23:36 UTC (permalink / raw)
  To: Casey Schaufler, Paul Moore
  Cc: linux-security-module, LKML, netdev, Andrew Morton

Hi all,

In case of Smack 'unlabeled' netlabel option, Smack passes a _zero_
initialized 'secattr' to label a packet/sock. This causes an 
[unfound domain label error]/-ENOENT by netlbl_sock_setattr().
Above Netlabel failure leads to Smack socket hooks failure causing 
an always-on socket() -EPERM error.

Such packets should have a netlabel domain agreed with netlabel to 
represent unlabeled packets. Fortunately Smack net ambient label 
packets are agreed with netlabel to be treated as unlabeled packets. 

Treat all packets coming out from a 'unlabeled' Smack system as
coming from the smack net ambient label.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index b5c8f92..03735f4 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -1292,6 +1292,8 @@ static void smack_to_secattr(char *smack, struct netlbl_lsm_secattr *nlsp)
 		}
 		break;
 	default:
+		nlsp->domain = kstrdup(smack_net_ambient, GFP_ATOMIC);
+		nlsp->flags = NETLBL_SECATTR_DOMAIN;
 		break;
 	}
 }

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 99%]

* [PATCH BUGFIX -v2 -rc4] Smack: Respect 'unlabeled' netlabel mode
  2008-05-30 23:36 99% [PATCH BUGFIX -rc4] Smack: Respect 'unlabeled' netlabel mode Ahmed S. Darwish
  @ 2008-05-30 23:57 94% ` Ahmed S. Darwish
    1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2008-05-30 23:57 UTC (permalink / raw)
  To: Casey Schaufler, Paul Moore
  Cc: linux-security-module, LKML, netdev, Andrew Morton

Hi!,

[ Sorry, Fix: Acquire smack_ambient_lock before reading smack_net_ambient ]

-->

In case of Smack 'unlabeled' netlabel option, Smack passes a _zero_
initialized 'secattr' to label a packet/sock. This causes an 
[unfound domain label error]/-ENOENT by netlbl_sock_setattr().
Above Netlabel failure leads to Smack socket hooks failure causing 
an always-on socket() -EPERM error.

Such packets should have a netlabel domain agreed with netlabel to 
represent unlabeled packets. Fortunately Smack net ambient label 
packets are agreed with netlabel to be treated as unlabeled packets. 

Treat all packets coming out from a 'unlabeled' Smack system as
coming from the smack net ambient label.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/security/smack/smack.h b/security/smack/smack.h
index 4a4477f..81b94ba 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -16,6 +16,7 @@
 #include <linux/capability.h>
 #include <linux/spinlock.h>
 #include <linux/security.h>
+#include <linux/mutex.h>
 #include <net/netlabel.h>
 
 /*
@@ -178,6 +179,7 @@ u32 smack_to_secid(const char *);
 extern int smack_cipso_direct;
 extern int smack_net_nltype;
 extern char *smack_net_ambient;
+extern struct mutex smack_ambient_lock;
 
 extern struct smack_known *smack_known;
 extern struct smack_known smack_known_floor;
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index b5c8f92..0d9c7dc 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -1292,6 +1292,11 @@ static void smack_to_secattr(char *smack, struct netlbl_lsm_secattr *nlsp)
 		}
 		break;
 	default:
+		mutex_lock(&smack_ambient_lock);
+		nlsp->domain = kstrdup(smack_net_ambient, GFP_ATOMIC);
+		mutex_unlock(&smack_ambient_lock);
+
+		nlsp->flags = NETLBL_SECATTR_DOMAIN;
 		break;
 	}
 }
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 271a835..cc357dc 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -46,7 +46,7 @@ enum smk_inos {
  */
 static DEFINE_MUTEX(smack_list_lock);
 static DEFINE_MUTEX(smack_cipso_lock);
-static DEFINE_MUTEX(smack_ambient_lock);
+DEFINE_MUTEX(smack_ambient_lock);
 
 /*
  * This is the "ambient" label for network traffic.

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply related	[relevance 94%]

* Re: [PATCH BUGFIX -rc4] Smack: Respect 'unlabeled' netlabel mode
  @ 2008-05-31  0:58 95%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-05-31  0:58 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, linux-security-module, LKML, netdev, Andrew Morton

Hi Casey,

On Fri, May 30, 2008 at 04:10:37PM -0700, Casey Schaufler wrote:
> 
> 
> To date the behavior of a Smack system running with nltype
> unlabeled has been carefully undefined. 
>

In the early days (before the 'Smack: unlabeled outgoing ambient packets'
patch - 4bc87e62), I used '$ echo unlabeled > /smack/nltype' in my startup 
scripts to avoid sending cipso-affected packets. When I upgraded this 
machine's kernel, I faced the -EPERM problem mentiond above. 

> The way you're defining
> it will result in a system in which only processes running with
> the ambient label will be able to use sockets, unless I'm reading
> the code incorrectly. 

I've tried to see the relation but failed, any help?

I'm noticing the opposite though, without defining nltype=unlabeled, 
we're forcing every smack-labeled process to send cipso-affected 
packets (and usually no machine around understands cipso).

_Assuming_ the concept is accepted, depending on the ambient label
may actually lead to a race condition though:

- A packet is set with the ambient label domain
- Ambient label changes
  - old ambient-label netlabel domain is deleted
  - new ambient-label is set
  - new ambient-label netlabel domain is created
- call netlabel_sock_setattr(), uses the old ambient label, leads
  to the -EPERM problem.
  -- Rare, but can happen

There are two possible solutions in my mind:

- Using a predefined netlabel domain to denote to unlabeled packets.
  Defect: May collide with a user chosen label and used to break security.
  Solution: Use a domain name that can't become a label (Hackery ?)

- I've tried first to use what was done before the 'Smack: unlabeled outgoing 
  ambient packets' patch, which honored nltype=unlabeled, but ignored netlabel
  completely:
  i.e.

  int rc = 0;
  if (secattr.flags != NETLBL_SECATTR_NONE)
       rc = netlbl_sock_setattr(sk, &secattr);
  return rc

  Paul, would this be right from a netlabel perspective ?

-- 
Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 95%]

* Re: [PATCH BUGFIX -v2 -rc4] Smack: Respect 'unlabeled' netlabel mode
  @ 2008-05-31  1:12 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-05-31  1:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: casey, paul.moore, linux-security-module, linux-kernel, netdev,
	Tetsuo Handa

On Fri, May 30, 2008 at 04:25:00PM -0700, Andrew Morton wrote:
> On Sat, 31 May 2008 02:57:51 +0300
> "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> 
> > +		mutex_lock(&smack_ambient_lock);
> > +		nlsp->domain = kstrdup(smack_net_ambient, GFP_ATOMIC);
> > +		mutex_unlock(&smack_ambient_lock);
> 
> no no no no no.  And no.
> 
> GFP_ATOMIC is *unreliable*.  Using it in a "security" feature is a bug
> - if it fails, the feature isn't secure any more.
> 
> Failing to check the kmalloc() return value might be a bug.
> 
> If we _need_ GFP_ATOMIC here then taking a mutex in a cannot-sleep
> context is a bug.
> 
> The patch adds a kmalloc but doesn't add a kfree.  Is it leaky?
> 
> Finally, why is there a need to take a lock around a single store
> instruction?

Possibly the worst three lines written ever. GFP_ATOMIC line
was cut-and-paste forgetting to change it to GFP_KERNEL and the lock
is already useless. 

-- 

"Better to light a candle, than curse the darkness"

Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com


^ permalink raw reply	[relevance 99%]

* Re: SMACK netfilter smacklabel socket match
  @ 2008-10-06 23:05 99%                   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2008-10-06 23:05 UTC (permalink / raw)
  To: Tilman Baumann; +Cc: Casey Schaufler, Linux-Kernel, linux-security-module

Hi Tilman,

On Mon, Oct 6, 2008 at 2:57 PM, Tilman Baumann
<tilman.baumann@collax.com> wrote:
> If I set /smack/nltype to 'unlabeled' I have effectively shut off the
>  network.
...
> This might work well in trusted networks.
> But Internet is untrusted and needs to work too. At least in the most real
> world scenarios. :)
>
> If i set /smack/nltype to 'unlabled' i don't even get SYN packets out.
> (operation not permitted)
>
> That's probably a bug, but I think the "fix" is to disable the ability to
> set the nltype to anything other than CIPSO at least for the time being.

Check this patch:
http://article.gmane.org/gmane.linux.network/95294/match=

As far as I can remember, it does exactly what you're asking for.
There have been some arguments against it, but at least you can get
the idea and _try_ to discuss/enhance it further.

Regards

-- 
Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com

^ permalink raw reply	[relevance 99%]

* x86: Is 'volatile' necessary for readb/writeb and friends?
@ 2009-12-04  9:21 98% Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2009-12-04  9:21 UTC (permalink / raw)
  To: x86; +Cc: Rusty Russell, Ingo Molnar, H. Peter Anvin, linux-kernel

Hi all,

x86 memory-mapped IO register accessors cast the memory mapped address
parameter to a one with the 'volatile' type qualifier. For example, here
is readb() after cpp processing

--> arch/x86/include/asm/io.h:

static inline unsigned char readb(const volatile void __iomem *addr) {
	unsigned char ret;
	asm volatile("movb %1, %0"
		     :"=q" (ret)
		     :"m" (*(volatile unsigned char __force *)addr)
		     :"memory");
        return ret;
}

I wonder if the volatile qualifiers in the parameter above and at the asm
statement operand were strictly necessary, or just added for extra safety.

AFAIK, the asm statement already functions as a compiler barrier, and the
compiler won't 'optimize' the statement away due to the 'asm volatile' part,
so shouldn't things be safe without those volatile qualifiers?

The only red-herring I found in the gcc manual was the fact that the
"volatile asm instruction can be moved relative to other code, including
across jump instructions."

I wonder if this was the reason a volatile-type data dependency was added
to the mov{b,w,l,q} asm statements; not to reorder the asm instruction
around non-memory-accessing instructions (we already have a barrier).

Thank you!

-- 
Darwish

^ permalink raw reply	[relevance 98%]

* Re: Linux 2.6.37-rc7
  @ 2010-12-22 18:18 99% ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2010-12-22 18:18 UTC (permalink / raw)
  To: Jochen Voss; +Cc: Linus Torvalds, Dave Airlie, Chris Wilson, Jesse Barnes, LKML

Hi,

On Tue, 21 Dec 2010 23:30:30 +0000, Jochen Voss wrote:
> On Tue, Dec 21, 2010 at 11:52:17AM -0800, Linus Torvalds wrote:
> > I'm still nervous about some of the regression reports for intel
> > graphics, so please keep testing and reporting. This is the last -rc
> > before xmas (or whatever your holiday may be), so now you all have a
> > few free days when you have nothing better to do than test out an -rc
> > release, right?
>
> For reference, the regression I reported yesterday
> (message id "20101220140615.GA3035@automatix",
> http://www.spinics.net/lists/kernel/msg1126718.html
> on the web) is still present: -rc7 only shows a blank
> screen on my system, -rc7 with 541cc966 reverted works
> without problems.
>

I can confirm the regression: a completely-blank screen right after the
grub prompt.

Printing kernel output using early_printk() all along ('earlyprintk=
serial,keep') shows that the rest of the kernel internally goes on with
its booting process as usual.

Reverting the mentioned 541cc966 commit solves the problem; the bug is
reproducible in both qemu and real hardware.

regards,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 99%]

* [PATCH, Bugfix] RAMOOPS: Don't overflow over non-allocated regions
@ 2010-12-25  9:57 97% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2010-12-25  9:57 UTC (permalink / raw)
  To: Marco Stornelli, LKML
  Cc: Kyungmin Park, David Woodhouse, Greg KH, Andrew Morton

Hi,

Current code mis-calculates the ramoops header size, leading to an
overflow over the next record at best, or over a non-allocated region
at worst. Fix that calculation.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
--

Shouldn't this be added to -stable too?

Happy New Year!

 drivers/char/ramoops.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/char/ramoops.c b/drivers/char/ramoops.c
index 73dcb0e..d3d63be 100644
--- a/drivers/char/ramoops.c
+++ b/drivers/char/ramoops.c
@@ -29,7 +29,6 @@
 #include <linux/ramoops.h>
 
 #define RAMOOPS_KERNMSG_HDR "===="
-#define RAMOOPS_HEADER_SIZE   (5 + sizeof(struct timeval))
 
 #define RECORD_SIZE 4096
 
@@ -65,8 +64,8 @@ static void ramoops_do_dump(struct kmsg_dumper *dumper,
 			struct ramoops_context, dump);
 	unsigned long s1_start, s2_start;
 	unsigned long l1_cpy, l2_cpy;
-	int res;
-	char *buf;
+	int res, hdr_size;
+	char *buf, *buf_orig;
 	struct timeval timestamp;
 
 	/* Only dump oopses if dump_oops is set */
@@ -74,6 +73,8 @@ static void ramoops_do_dump(struct kmsg_dumper *dumper,
 		return;
 
 	buf = (char *)(cxt->virt_addr + (cxt->count * RECORD_SIZE));
+	buf_orig = buf;
+
 	memset(buf, '\0', RECORD_SIZE);
 	res = sprintf(buf, "%s", RAMOOPS_KERNMSG_HDR);
 	buf += res;
@@ -81,8 +82,9 @@ static void ramoops_do_dump(struct kmsg_dumper *dumper,
 	res = sprintf(buf, "%lu.%lu\n", (long)timestamp.tv_sec, (long)timestamp.tv_usec);
 	buf += res;
 
-	l2_cpy = min(l2, (unsigned long)(RECORD_SIZE - RAMOOPS_HEADER_SIZE));
-	l1_cpy = min(l1, (unsigned long)(RECORD_SIZE - RAMOOPS_HEADER_SIZE) - l2_cpy);
+	hdr_size = buf - buf_orig;
+	l2_cpy = min(l2, (unsigned long)(RECORD_SIZE - hdr_size));
+	l1_cpy = min(l1, (unsigned long)(RECORD_SIZE - hdr_size) - l2_cpy);
 
 	s2_start = l2 - l2_cpy;
 	s1_start = l1 - l1_cpy;

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 97%]

* Re: [PATCH, Bugfix] RAMOOPS: Don't overflow over non-allocated regions
  @ 2010-12-27 21:33 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2010-12-27 21:33 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: LKML, Kyungmin Park, David Woodhouse, Greg KH, Andrew Morton,
	Linus Torvalds

On Sat, Dec 25, 2010 at 12:19:35PM +0100, Marco Stornelli wrote:
> Il 25/12/2010 10:57, Ahmed S. Darwish ha scritto:
> > 
> > Current code mis-calculates the ramoops header size, leading to an
> > overflow over the next record at best, or over a non-allocated region
> > at worst. Fix that calculation.
> > 
> > Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
> > --
> 
> Acked-by: Marco Stornelli <marco.stornelli@gmail.com>
>

Someone to push this to Linus before -rc8? Andrew?

--
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 99%]

* Saving early panic() messages, to disk, using the BIOS
@ 2011-01-14 13:39 99% Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-01-14 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin; +Cc: Randy Dunlap, x86, LKML

Hi,

I'm facing some very early panics in latest -rc kernels. Vesafb is not
even working, so only the very bottom of the stack trace is available.

>From an overall-design perspective, would a patch that saves such early
panic messages to the hard-disk using real-mode BIOS calls be acceptable?

That seems to be the only path to _fully_ capture such panic; this is a
regular x86(-64) laptop with no serial ports or floppy disks. I'm in the
lmode->pmode->rmode part now, but I thought it might be wise to quickly
ask about such design mergeability before investing further effort.

(There is an old set of patches in Randy's homepage that does something
similar using floppy disks,[1] so it seems the basic idea is not that
strange. I also do something similar in a hobby kernel of mine.[2])

thanks,

[1] http://www.xenotime.net/linux/kmsgdump/
[1] http://gitorious.org/cute-os/cute/

-- 
Darwish

^ permalink raw reply	[relevance 99%]

* [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
@ 2011-01-25 13:47 85% Ahmed S. Darwish
  2011-01-25 13:51 55% ` [PATCH -next 1/2][RFC] x86: Saveoops: Switch to real-mode and call BIOS Ahmed S. Darwish
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Ahmed S. Darwish @ 2011-01-25 13:47 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86-ML
  Cc: Tony Luck, Dave Jones, Andrew Morton, Randy Dunlap,
	Willy Tarreau, Willy Tarreau, Dirk Hohndel, Dirk.Hohndel, IDE-ML,
	LKML

Hi,

I've faced some very early panics in latest kernel. Being a run of the mill
x86 laptop, the machine is void of debugging aids like serial ports or
network boot.

As a possible solution, below patches prototypes the idea of persistently
storing the kernel log ring to a hard disk partition using the enhanced BIOS
0x13 services.

The used BIOS INT 0x13 functions are the same ones originally used by all
contemporary bootloaders to load the Linux kernel. If the kernel code is
already loaded to RAM and being executed, such parts of the BIOS should be
stable enough.

The basic idea is to switch from 64-bit long mode all the way down to 16-bit
real-mode. Once in real-mode, we reset the disk controller and write the log
buffer to disk using a user-supplied absolute disk block address (LBA).

Doing so, we can capture very early panics (along with earlier log messages)
reliably since the writing mechanism has minimal dependency on any Linux code.

Unfortunately, there are problems on some machines.

In my laptop, when calling the BIOS with the "Reset Disk Controllers" command
or even issuing a direct "Extend Write" without a controller reset, the BIOS
hangs for around __5 minutes__. Afterwards, it returns with a 'Timeout' error
code.

The main problem, it seems, is that the BIOS "Reset controller" command is not
enough to restore disk hardware to a state understandable by the BIOS code.

So:

 - Is it possible to re-initialize the disk hardware to its POST state (thus
   make the BIOS services work reliably) while keeping system RAM unmodified?
 - If not, can we do it manually by reprogramming the controllers?

The first patch (#1) implements the longMode -> realMode switch and invokes
the BIOS. The second reserves needed low-memory areas for such code and
registers a panic logger using the kmsg_dump interface.

Both patches are on '-next' and include XXX marks where further help is also
appreciated. Please remember that these patches, while tested, are now for
prototyping the technical feasibility of the idea.

Diffstat:

 arch/x86/kernel/saveoops-rmode.S |  483 ++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/saveoops.h  |   15 ++
 arch/x86/kernel/saveoops.c       |  219 +++++++++++++++++
 arch/x86/kernel/setup.c          |    9 +
 arch/x86/kernel/Makefile         |    3 +
 lib/Kconfig.debug                |   15 ++
 6 files changed, 744 insertions(+), 0 deletions(-)

Related work and discussions:

 - Tony Luck, persistent store: http://article.gmane.org/gmane.linux.kernel.cross-arch/8495
 - Dirk Hohndel, hpa, Japan Symposium, 2D barcode: http://video.linux.com/video/1661
 - akpm, Dave Jones, oops pauser: http://article.gmane.org/gmane.linux.kernel/369739
 - Willy Tarreau, Randy Dunlap, kmsgdump: http://www.xenotime.net/linux/kmsgdump/

Thanks,

--
Darwish
http://darwish.07.googlepages.com


^ permalink raw reply	[relevance 85%]

* [PATCH -next 1/2][RFC] x86: Saveoops: Switch to real-mode and call BIOS
  2011-01-25 13:47 85% [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Ahmed S. Darwish
@ 2011-01-25 13:51 55% ` Ahmed S. Darwish
  2011-01-25 13:53 63% ` [PATCH -next 2/2][RFC] x86: Saveoops: Reserve low memory and register code Ahmed S. Darwish
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-01-25 13:51 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86-ML
  Cc: Tony Luck, Dave Jones, Andrew Morton, Randy Dunlap,
	Willy Tarreau, Willy Tarreau, Dirk Hohndel, Dirk.Hohndel, IDE-ML,
	LKML


We get called here upon panic()s to save the kernel log buffer.

First, switch from 64-bit long mode to 16-bit real mode. Afterwards, save the
log buffer to disk using extended INT 0x13 BIOS services. The user has given
us an absolute LBA disk address to save the log buffer to.

By x86 design, this code is mandated to run on a single identity-mapped page.

- How to initialize the disk hardware to its POST state (thus making the
  BIOS code work reliably) while keeping system RAM unmodified?

- Is it guaranteed that '0x80' will always be the boot disk drive number?
  If not, we need to be passed the boot drive number from the bootloader.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

 arch/x86/kernel/saveoops-rmode.S |  483 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 483 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/saveoops-rmode.S b/arch/x86/kernel/saveoops-rmode.S
new file mode 100644
index 0000000..6e07112
--- /dev/null
+++ b/arch/x86/kernel/saveoops-rmode.S
@@ -0,0 +1,483 @@
+/* PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE */
+
+/*
+ * Saveoops LongMode -> RealMode switch
+ *
+ * Don't come here with any unfinished business at hand, there's no return.
+ * After writing the log buffer to disk, we just halt.
+ */
+
+#include <linux/linkage.h>
+
+#include <asm/processor-flags.h>
+#include <asm/msr-index.h>
+#include <asm/pgtable_types.h>
+#include <asm/segment.h>
+#include <asm/saveoops.h>
+
+/*
+ * Notes:
+ * - Avoid using relocatable symbols: we run from a different place than
+ *   where we're originally linked to. Use absolute addresses
+ * - Run this from an identity page since we disable paging
+ * - Dynamic values are used for all x86 table bases to let this code run
+ *   from *any* memory region below 1-Mbyte
+ */
+	.code64
+ENTRY(saveoops_start)
+	/*
+	 * Switch to 32bit-compatibility mode using a L=0 code segment
+	 */
+
+	cli
+
+	/* Permanently store passed parameters */
+	movq	%rdi, %rbp
+	movl	%esi, (ringbuf_addr - saveoops_start)(%ebp)
+	movl	%edx, (rstack_base - saveoops_start)(%ebp)
+	movq	%rcx, (disk_sector - saveoops_start)(%ebp)
+	movl	%r8d, (ringbuf_len - saveoops_start)(%ebp)
+
+	/* Dynamically set the 32bit-compat. GDTR base */
+	leaq	(lmode32_gdt - saveoops_start)(%ebp), %rax
+	movq	%rax, (lmode32_gdt + 2 - saveoops_start)(%ebp)
+
+	/* Dynamically set the 32bit farpointer base */
+	leal	(compat32 - saveoops_start)(%ebp), %eax
+	movl	%eax, (lmode32_farpointer - saveoops_start)(%ebp)
+
+	lgdt	(lmode32_gdt - saveoops_start)(%ebp)
+	ljmpl	*(lmode32_farpointer - saveoops_start)(%ebp)	# addr32
+
+	.code32
+compat32:
+	/*
+	 * 32bit-compatibility Long Mode, using a L=0 %cs
+	 */
+
+	movw	$__KERNEL_DS, %ax
+	movw	%ax, %ds
+	movw	%ax, %es
+	movw	%ax, %ss
+
+	/* 'Deactivate' long mode: disable paging */
+	movl	%cr0, %eax
+	andl    $~X86_CR0_PG, %eax
+	movl    %eax, %cr0
+
+	/*
+	 * Prepare identity maps for the first 2Mbytes. PAE is already
+	 * enabled from the original pmode -> lmode transition.
+	 *
+	 * Reuse head.S page tables instead of creating new ones. Such
+	 * early tables are in fact already reused by the newer direct
+	 * mapping tables, but since paging is now disabled (and we're
+	 * not returning back), hopefully nothing will blow up.
+	 */
+
+	/*
+	 * Pick a table for the PAE Page Directory (PD)
+	 */
+
+	.equ	level2_pae_ident_pgt, (level2_ident_pgt - __START_KERNEL_map)
+	.equ	level2_entry_count, 512
+	.equ	level2_entry_len, 8
+
+	xorl	%eax, %eax
+	movl	$level2_pae_ident_pgt, %edi
+	movl    $((level2_entry_count * level2_entry_len) / 4), %ecx
+	rep	stosl
+
+	movl	$(0 + __PAGE_KERNEL_IDENT_LARGE_EXEC), level2_pae_ident_pgt
+
+	/*
+	 * Pick a table for for the PAE Page Directory Pointer (PDP)
+	 */
+
+	.equ	level3_pae_ident_pgt, (level2_spare_pgt - __START_KERNEL_map)
+	.equ	level3_entry_count, 4
+	.equ	level3_entry_len, 8
+
+	xorl	%eax, %eax
+	movl	$level3_pae_ident_pgt, %edi
+	movl    $((level3_entry_count * level3_entry_len) / 4), %ecx
+	rep	stosl
+
+	movl	$(level2_pae_ident_pgt + _PAGE_PRESENT), level3_pae_ident_pgt
+
+	movl	$level3_pae_ident_pgt, %eax
+	movl    %eax, %cr3
+
+	/* 'Disable' long mode: clear the EFER.LME bit */
+	movl	$MSR_EFER, %ecx
+	rdmsr
+	btcl	$_EFER_LME, %eax
+	wrmsr
+
+	/* Finally, move to 32-bit pmode: re-enabling paging */
+	movl	%cr0, %eax
+	orl     $X86_CR0_PG, %eax
+	movl    %eax, %cr0
+	jmp	pmode32			# flush prefetch
+
+pmode32:
+	/*
+	 * 32-bit protected mode, using a 2MB identity page.
+	 */
+
+	/* Paging was only enabled for the lmode->pmode step */
+	movl	%cr0, %eax
+	andl    $~X86_CR0_PG, %eax
+	movl    %eax, %cr0		# paging no more
+
+	xorl	%eax, %eax
+	movl	%eax, %cr3		# flush the TLB
+
+	/* Dynamically set the GDTR base value */
+	leal	(pmode16_gdt - saveoops_start)(%ebp), %eax
+	movl	%eax, (pmode16_gdt + 2 - saveoops_start)(%ebp)	# base[00:32]
+
+	/* Dynamically set %cs and %ds bases */
+	leal	(pmode16 - saveoops_start)(%ebp), %eax
+	movw	%ax, (pmode16_cs + 2 - saveoops_start)(%ebp)	# base[00:15]
+	movw	%ax, (pmode16_ds + 2 - saveoops_start)(%ebp)	# base[00:15]
+	shrl	$16, %eax
+	movb	%al, (pmode16_cs + 4 - saveoops_start)(%ebp)	# base[16:23]
+	movb	%al, (pmode16_ds + 4 - saveoops_start)(%ebp)	# base[16:23]
+
+	/* Load the 16-bit code and data segments */
+	lgdt	(pmode16_gdt - saveoops_start)(%ebp)
+
+	/* Switch to 16-bit pmode: use the setup 16-bit %cs */
+	ljmp	$0x08, $0x0
+
+	/*
+	 * - “Segment base addresses should be 16-byte aligned” --Intel
+	 * - We also use this as the rmode code base; the 16-byte align
+	 *   will make address caclulations much easier.
+	 */
+	.align 16
+	.globl pmode16
+	.code16
+pmode16:
+	/*
+	 * We're now in the 16-bit protected mode. Since PE is still = 1,
+	 * we can change a segment cache by loading a GDT selector value.
+	 */
+
+	movw	$0x10, %ax
+	movw	%ax, %ds
+	movw	%ax, %es
+	movw	%ax, %fs
+	movw	%ax, %gs
+	movw	%ax, %ss
+
+	/*
+	 * NOTE! Due to the new %cs and %ds bases, dereference addresses
+	 * using the from ‘label - pmode16’ from now on.
+	 */
+
+	/* Dynamically build an rmode segment and offset */
+	leal	(pmode16 - saveoops_start)(%ebp), %eax		# absolute value
+	shrl	$4, %eax
+	movw	%ax, rmode_farpointer - pmode16 + 2		# 8086 %cs
+	movw	$(rmode - pmode16), rmode_farpointer - pmode16	# offset
+
+	/* Restore real-mode BIOS interrupt entries */
+	lidt   (rmode_idtr - pmode16)
+
+	/* Switch to canonical real-mode: clear PE */
+	movl	%cr0, %eax
+	andl	$~X86_CR0_PE, %eax
+	movl	%eax, %cr0
+
+	/* Flush prefetch; use the 8086 code segment */
+	ljmp	*(rmode_farpointer - pmode16)
+
+#ifdef	SAVEOOPS_DEBUG
+	/*
+	 * Valid for any real-mode context where a stack exists
+	 */
+#define __print(msg)		;\
+	pushfl			;\
+	pushal			;\
+	pushw	$(1f - pmode16) ;\
+	call	print_string	;\
+	.ascii	"Saveoops: "	;\
+	.ascii	msg		;\
+	.asciz	"      \n\r"	;\
+1:	popal			;\
+	popfl
+#else
+#define __print(msg)		;
+#endif
+
+	.align 16
+rmode:
+	/*
+	 * REAL Mode, at last!
+	 *
+	 * For further details on the BIOS interrupts used, check any
+	 * version of the “Enhanced Disk Drive Specification”.
+	 */
+
+	movw	%cs, %ax
+	movw	%ax, %ds
+	movw	%ax, %es
+	movw	%ax, %fs
+	movw	%ax, %gs
+
+	/* Setup passed stack area */
+	movl	(rstack_base - pmode16), %eax
+	shrl	$4, %eax			# 16byte-aligned
+	movw	%ax, %ss
+	movw	$RMODE_STACK_LEN, %sp
+
+	__print	("Entered real mode")
+
+	/*
+	 * XXXX: We always use the boot disk drive number '0x80'. Can
+	 * this map to a wrong device?
+	 *
+	 * NOTE! Do not trust the BIOS: assume it clobbered all the
+	 * registers (relevant and not) while servicing interrupts.
+	 */
+
+	/*
+	 * Check Extensions Present (0x41) - Does the BIOS provide
+	 * EDD int 0x13 extensions?
+	 *
+	 * input  %bx     - 0x55aa
+	 * input  %dl     - drive number
+	 * output success - carry = 0 && bx = 0xaa55 && cx bit0 = 1
+	 * output failure - carry = 1 || any false condition above
+	 */
+	movb	$0x41, %ah
+	movw	$0x55aa, %bx
+	movb	$0x80, %dl
+	xorw	%cx, %cx
+	pushw	%ds
+	int	$0x13
+	popw	%ds
+	__print	("Queried BIOS for EDD services")
+	jc	no_edd1
+	cmpw	$0xaa55, %bx
+	jne	no_edd2
+	shrw	$1, %cx
+	jnc	no_edd3
+
+	/* Store 16byte-aligned ring buffer address in disk packet */
+	movl	(ringbuf_addr - pmode16), %eax
+	shrl	$4, %eax
+	movw	%ax, (buffer_seg - pmode16)
+	xorw	%ax, %ax
+	movw	%ax, (buffer_offset - pmode16)
+
+	/* Store ringbuf number of 512-byte blocks in disk packet */
+	movl	(ringbuf_len - pmode16), %eax
+	movb	%al, (sectors_cnt - pmode16)
+
+	__print	("Prepared the Disk Address Packet")
+
+	/*
+	 * Reset Hard Disks (0x00)
+	 *
+	 * input  %dl	  - drive number
+	 * output success - carry = 0 && %ah (err code) = 0
+	 * output failure - carry = 1 || %ah = error code
+	 *
+	 * The kernel has just paniced and left the disk controller
+	 * in an unknown state. Reset controllers before write.
+	 */
+	xorw	%ax, %ax
+	movb	$0x80, %dl
+	pushw	%ds
+	int	$0x13
+	popw	%ds
+	__print	("Disk controller reset")
+	jc	init_err1
+	cmpb	$0x0, %ah
+	jne	init_err2
+
+	/*
+	 * Extended Write (0x43) - Transfer data from RAM to disk
+	 *
+	 * input  %al     - 0 (write with verify off)
+	 * input  %dl     - drive number
+	 * input  %ds:si  - pointer to the Disk Address Packet
+	 * output success - carry = 0 && %ah (err code) = 0
+	 * output failure - carry = 1 || %ah = error code
+	 */
+	movb	$0x43, %ah
+	xorb	%al, %al
+	movb	$0x80, %dl
+	movw	$(disk_address_packet - pmode16), %si
+	pushw	%ds
+	int	$0x13
+	popw	%ds
+	__print	("Extended write finished")
+	jc	write_err1
+	cmpb	$0x0, %ah
+	jne	write_err2
+	jmp	success
+
+init_err1:
+	__print ("INT 0x13/0x0 init error 1")
+	jmp	print_errcode
+init_err2:
+	__print ("INT 0x13/0x0 init error 2")
+	jmp	print_errcode
+write_err1:
+	__print	("INT 0x13/0x43 write error 1")
+	jmp	print_errcode
+write_err2:
+	__print	("INT 0x13/0x43 write error 2")
+	jmp	print_errcode
+no_edd1:
+	__print	("Bios does not support EDD service (err=1)")
+	jmp	print_errcode
+no_edd2:
+	__print	("Bios does not support EDD service (err=2)")
+	jmp	print_errcode
+no_edd3:
+	__print	("Bios does not support EDD service (err=3)")
+	jmp	print_errcode
+success:
+	__print	("Sucess!!!")
+	jmp	print_errcode
+
+halt:	hlt
+	jmp	halt
+
+#ifdef	SAVEOOPS_DEBUG
+	/*
+	 * Print Null-terminated string pointed by top of the stack
+	 */
+	.type	print_string, @function
+print_string:
+	popw	%si
+1:	xorb	%bh, %bh
+	movb	$0x0e, %ah
+	lodsb
+	cmpb	$0, %al
+	je	2f
+	int	$0x10
+	jmp	1b
+2:	ret
+
+	/*
+	 * print %dx value in hexadecimal ascii
+	 */
+	.type	print_hex, @function
+print_hex:
+	xorb   %bh, %bh
+	movw   $4, %cx			# 2-bytes = 4 hex digits
+print_digit:
+	rolw   $4, %dx			# highest-order 4 bits in front
+	movw   $0x0e0f, %ax		# bios function 0x0e
+	andb   %dl, %al
+	cmpb   $0x0a, %al		# transform to ASCII
+	jl     digit
+	addb   $0x07, %al
+digit:
+	addb   $0x30, %al
+	int    $0x10
+	loop   print_digit
+	ret
+
+	/*
+	 * Print INT13 err code, number of sectors written
+	 */
+print_errcode:
+	movb	%ah, %dl
+	call	print_hex
+	movw	(sectors_cnt - pmode16), %dx
+	call	print_hex
+	jmp	halt
+#else
+print_errcode:
+	jmp	halt
+#endif
+
+
+/*
+ * Virtual data section; ‘(dyn.)’ = A dynamically-set value
+ */
+
+	.align 16
+lmode32_gdt:
+	.word	lmode32_gdt_end - lmode32_gdt - 1
+	.quad	0x0000000000000000	# base (dyn.)
+	.word	0, 0, 0			# padding
+lmode32_cs:
+	.word	0xffff			# limit
+	.word	0x0000			# base
+	.word	0x9a00			# P=1, C=0, type=0xA (r/x)
+	.word   0x00cf			# L=0 (compat.), D=1 (32-bit), G=1
+lmode32_ds:
+	.word	0xffff			# limit
+	.word	0x0000			# base
+	.word	0x9200			# P=1, type=0x2 (r/w)
+	.word	0x00cf			# G=1, D=1 (32-bit)
+lmode32_gdt_end:
+
+lmode32_farpointer:
+	.long	0x00000000		# offset (dyn.)
+	.word	lmode32_cs -lmode32_gdt # %cs selector
+
+	.align 16
+pmode16_gdt:
+	.word	pmode16_gdt_end - pmode16_gdt - 1
+	.long	0x00000000		# base (dyn.)
+	.word	0x0000			# padding
+pmode16_cs:
+	.word	0xffff			# limit
+	.word	0x0000			# base (dyn.)
+	.word	0x9a00			# P=1, DPL=00, type=0xA (execute/read)
+	.word	0x0000			# G=0 (byte), D=0 (16-bit)
+pmode16_ds:
+	.word	0xffff			# limit
+	.word	0x0000			# base (dyn.)
+	.word	0x9200			# P=1, DPL=00, type=0x2 (read/write)
+	.word	0x0000			# G=0 (byte), D=0 (16-bit)
+pmode16_gdt_end:
+
+rmode_farpointer:
+	.word	0x0000			# offset (dyn.)
+	.word	0x0000			# %cs (dyn.)
+
+rmode_idtr:
+	.equ	RIDT_BASE, 0x0		# PC architecture defined
+	.equ	RIDT_ENTRY_SIZE, 0x4	# 8086 defined
+	.equ	RIDT_ENTRIES, 0x100	# 8086, 286, 386+ defined
+	.word	RIDT_ENTRIES * RIDT_ENTRY_SIZE - 1
+	.long	RIDT_BASE
+
+	/* Values passed by long-mode C code */
+ringbuf_addr:
+	.long	0x00000000		# 16-byte aligned, < 1-MB (dyn.)
+ringbuf_len:
+	.long	0x00000000		# 512-byte aligned (dyn.)
+rstack_base:
+	.long	0x00000000		# 16-byte aligned, < 1-MB (dyn.)
+
+	.align 16
+disk_address_packet:			# for extended INT 0x13 services (dyn.)
+packet_size:
+	.byte	0x10			# in bytes
+reserved0:
+	.byte	0x00			# must be zero
+sectors_cnt:
+	.byte	0x00			# number of blocks to transfer [1 - 127]
+reserved1:
+	.byte	0x00			# must be zero
+buffer_offset:
+	.word	0x0000			# read/write buffer offset
+buffer_seg:
+	.word	0x0000			# read/write buffer segment
+disk_sector:
+	.quad	0x0000000000000000	# logical sector number (LBA)
+
+ENTRY(saveoops_end)
+
+/* PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE */

--
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 55%]

* [PATCH -next 2/2][RFC] x86: Saveoops: Reserve low memory and register code
  2011-01-25 13:47 85% [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Ahmed S. Darwish
  2011-01-25 13:51 55% ` [PATCH -next 1/2][RFC] x86: Saveoops: Switch to real-mode and call BIOS Ahmed S. Darwish
@ 2011-01-25 13:53 63% ` Ahmed S. Darwish
        3 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2011-01-25 13:53 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86-ML
  Cc: Tony Luck, Dave Jones, Andrew Morton, Randy Dunlap,
	Willy Tarreau, Willy Tarreau, Dirk Hohndel, Dirk.Hohndel,
	Simon Kagstrom, IDE-ML, LKML


Using the x86 memblock interface, reserve below 1-Mbyte low memory areas
for the Saveoops LongMode -> RealMode switch code, ring buffer, and stack.
All the low memory areas are dynamically allocated and reserved, giving
memblock enough flexibility to choose the best available areas possible.

To trigger Saveoops on panic(), it's registered using the kmsg_dump hooks.
That interface is quite racy for our goals, but it's quickly used now to
prototype the code (check the XXX mark for details.)

Once Saveoops code is triggered, it identity maps the first 2 MBytes (the
switch code disables paging), copy the log buffer to its reserved 8086-
accessible area, and jumps to the switch code (PATCH #1.)

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

 arch/x86/kernel/saveoops.c      |  219 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/setup.c         |    9 ++
 arch/x86/include/asm/saveoops.h |   15 +++
 arch/x86/kernel/Makefile        |    3 +
 lib/Kconfig.debug               |   15 +++
 5 files changed, 261 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/saveoops.c b/arch/x86/kernel/saveoops.c
new file mode 100644
index 0000000..f48fc0a
--- /dev/null
+++ b/arch/x86/kernel/saveoops.c
@@ -0,0 +1,219 @@
+/* PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE */
+
+/*
+ * SAVEOOPS -- Save kernel log buffer to disk upon panic()
+ *
+ * To safely access disk in situations like very early boot or where the
+ * disk access code itself is buggy, we use BIOS INT13h extended services.
+ * To access such services, switch to 8086 real-mode first.
+ */
+
+#include <linux/kernel.h>
+#include <linux/compiler.h>
+#include <linux/log2.h>
+#include <linux/time.h>
+#include <linux/kmsg_dump.h>
+#include <linux/memblock.h>
+#include <linux/sched.h>
+
+#include <asm/page.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+#include <asm/saveoops.h>
+
+/*
+ * We can only access the first MByte in real mode, thus allocate
+ * low-memory areas for the ring buffer, and rmode code and stack.
+ */
+static phys_addr_t ring_buf;
+static phys_addr_t code_buf;
+static phys_addr_t rmode_stack;
+
+/*
+ * Below 1-Mbyte pointer to lmode->rmode switch code.
+ */
+static void (* __noreturn rmode_switch)(phys_addr_t code_buf,
+					phys_addr_t ring_buf,
+					phys_addr_t rmode_stack,
+					uint64_t disk_lba,
+					uint64_t ring_buf_len);
+
+/*
+ * Absolute LBA address where the log will be saved on disk.
+ */
+static uint64_t disk_lba = CONFIG_SAVEOOPS_DISK_LBA;
+
+/*
+ * Extended BIOS services write to disk in units of 512-byte sectors.
+ * Thus, always align the ring buffer size on a 512-byte boundary.
+ */
+#define RMODE_SEGMENT_LIMIT	0x10000UL
+#define RING_SIZE		(60UL * 1024)
+#define SAVEOOPS_HEADER		"*SAVEOOPS-WRITTEN KERNEL LOG*"
+
+/*
+ * Page tables to identity map the first 2 Mbytes.
+ */
+static __aligned(PAGE_SIZE) pud_t ident_level3[PTRS_PER_PUD];
+static __aligned(PAGE_SIZE) pmd_t ident_level2[PTRS_PER_PMD];
+
+/*
+ * The lmode->rmode switching code needs to run from an identity page
+ * since it disables paging.
+ */
+static void build_identity_mappings(void)
+{
+	pgd_t *pgde;
+	pud_t *pude;
+	pmd_t *pmde;
+
+	pmde = ident_level2;
+	set_pmd(pmde, __pmd(0 + __PAGE_KERNEL_IDENT_LARGE_EXEC));
+
+	pude = ident_level3;
+	set_pud(pude, __pud(__pa(ident_level2) + _KERNPG_TABLE));
+
+	pgde = init_level4_pgt;
+	set_pgd(pgde, __pgd(__pa(ident_level3) + _KERNPG_TABLE));
+
+	__flush_tlb_all();
+}
+
+/*
+ * XXX: Our use of kmsg_dump interface is invalid. We completely halt the
+ *	machine when getting called; this means:
+ *	- other registered loggers won't have a chance to read the ring
+ *	- other CPU cores might also be accessing the disk, racing with
+ *	  BIOS code that will do the same.
+ *
+ *	Such interface is now used to get things going. A new interface
+ *	satisfying our special requirements needs to be created. A
+ *	solution is to do an rmode->lmode switch after writing to disk.
+ */
+static void saveoops_do_dump(struct kmsg_dumper *dumper,
+			     enum kmsg_dump_reason reason,
+			     const char *s1, unsigned long l1,
+			     const char *s2, unsigned long l2)
+{
+	unsigned long l1_cpy, l2_cpy, s1_start, s2_start;
+	struct timeval timestamp;
+	char *buf, *buf_orig;
+	int hdr_size;
+
+	if (reason != KMSG_DUMP_PANIC)
+		return;
+
+	do_gettimeofday(&timestamp);
+
+	buf = __va(ring_buf);
+	buf_orig = buf;
+	memset(buf, '\0', RING_SIZE);
+	buf += sprintf(buf, "%s\n", SAVEOOPS_HEADER);
+	buf += sprintf(buf, "%lu.%lu\n", timestamp.tv_sec, timestamp.tv_usec);
+
+	hdr_size = buf - buf_orig;
+	l2_cpy = min(l2, RING_SIZE - hdr_size);
+	l1_cpy = min(l1, RING_SIZE - hdr_size - l2_cpy);
+
+	s2_start = l2 - l2_cpy;
+	s1_start = l1 - l1_cpy;
+	memcpy(buf, s1 + s1_start, l1_cpy);
+	memcpy(buf + l1_cpy, s2 + s2_start, l2_cpy);
+
+	printk(KERN_EMERG "Saveoops: Saving kernel log to boot disk LBA "
+	       "address %llu\n", disk_lba);
+
+	local_irq_disable();
+	build_identity_mappings();
+	rmode_switch(code_buf, ring_buf, rmode_stack, disk_lba, RING_SIZE >> 9);
+}
+
+static struct kmsg_dumper saveoops_dumper = {
+	.dump = saveoops_do_dump,
+};
+
+/*
+ * Real-mode switch code start and end markers.
+ * @pmode16: 16-bit protected mode entry point; 8086-segments base.
+ */
+extern const char saveoops_start[];
+extern const char saveoops_end[];
+extern const char pmode16[];
+
+/*
+ * Simplify real mode segmented-addressing calculations
+ */
+#define RMODE_DATA_ALIGN	16
+
+void __init saveoops_init(void)
+{
+	unsigned int code_size, code_align;
+	int res;
+
+	if (disk_lba == -1) {
+		printk(KERN_INFO "Saveoops: No disk LBA given; will not save "
+		       "kernel log to disk upon panic.\n");
+		return;
+	}
+
+	BUILD_BUG_ON(!IS_ALIGNED(RING_SIZE, 512));
+	BUILD_BUG_ON(RING_SIZE > RMODE_SEGMENT_LIMIT);
+	BUILD_BUG_ON(RMODE_STACK_LEN > RMODE_SEGMENT_LIMIT);
+	BUG_ON((saveoops_end - pmode16) > RMODE_SEGMENT_LIMIT);
+
+	ring_buf = memblock_find_in_range(0, 1<<20, RING_SIZE, RMODE_DATA_ALIGN);
+	if (ring_buf == MEMBLOCK_ERROR) {
+		printk(KERN_ERR "Saveoops: requesting a low-memory region "
+		       "for ring buffer failed\n");
+		return;
+	}
+	memblock_x86_reserve_range(ring_buf, ring_buf + RING_SIZE,
+				   "SAVEOOPS ringbuf");
+	printk(KERN_INFO "Saveoops: Acquired [0x%llx-0x%llx] for the ring "
+	       "buffer\n", ring_buf, ring_buf + RING_SIZE);
+
+	/* The pmode->rmode switch code “MUST” be in a single page */
+	code_size = saveoops_end - saveoops_start;
+	code_align = roundup_pow_of_two(code_size);
+	code_buf = memblock_find_in_range(0, 1<<20, code_size, code_align);
+	if (code_buf == MEMBLOCK_ERROR) {
+		printk(KERN_ERR "Saveoops: requesting a low-memory region "
+		       "for mode-switching code failed\n");
+		goto fail3;
+	}
+	memblock_x86_reserve_range(code_buf, code_buf + code_size,
+				   "SAVEOOPS codebuf");
+	printk(KERN_INFO "Saveoops: Acquired [0x%llx-0x%llx] for rmode-switch "
+	       "code\n", code_buf, code_buf + code_size);
+
+	rmode_stack = memblock_find_in_range(0, 1<<20, RMODE_STACK_LEN,
+					     RMODE_DATA_ALIGN);
+	if (rmode_stack == MEMBLOCK_ERROR) {
+		printk(KERN_ERR "Saveoops: requesting a low-memory region "
+		       "for real-mode stack failed\n");
+		goto fail2;
+	}
+	memblock_x86_reserve_range(rmode_stack, rmode_stack + RMODE_STACK_LEN,
+				   "SAVEOOPS r-stack");
+	printk(KERN_INFO "Saveoops: Acquired [0x%llx-0x%llx] for rmode stack\n",
+	       rmode_stack, rmode_stack + RMODE_STACK_LEN);
+
+	res = kmsg_dump_register(&saveoops_dumper);
+	if (res) {
+		printk(KERN_ERR "Saveoops: registering kmsg dumper failed");
+		goto fail1;
+	}
+
+	memcpy(__va(code_buf), saveoops_start, code_size);
+	rmode_switch = (void *)code_buf;
+	return;
+
+fail1:
+	memblock_x86_free_range(rmode_stack, rmode_stack + RMODE_STACK_LEN);
+fail2:
+	memblock_x86_free_range(code_buf, code_buf + code_size);
+fail3:
+	memblock_x86_free_range(ring_buf, ring_buf + RING_SIZE);
+}
+
+/* PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE - PROTOTYPE */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index d3cfe26..3686df8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -50,6 +50,9 @@
 #include <asm/pci-direct.h>
 #include <linux/init_ohci1394_dma.h>
 #include <linux/kvm_para.h>
+#ifdef CONFIG_SAVEOOPS
+#include <asm/saveoops.h>
+#endif
 
 #include <linux/errno.h>
 #include <linux/kernel.h>
@@ -925,6 +928,12 @@ void __init setup_arch(char **cmdline_p)
 	memblock.current_limit = get_max_mapped();
 	memblock_x86_fill();
 
+#ifdef CONFIG_SAVEOOPS
+	/* Initialize Saveoops at the earliest point possible: memblock
+	 * find_in_range is used here to reserve low-memory areas */
+	saveoops_init();
+#endif
+
 	/* preallocate 4k for mptable mpc */
 	early_reserve_e820_mpc_new();

diff --git a/arch/x86/include/asm/saveoops.h b/arch/x86/include/asm/saveoops.h
new file mode 100644
index 0000000..d81e840
--- /dev/null
+++ b/arch/x86/include/asm/saveoops.h
@@ -0,0 +1,15 @@
+#ifndef _SAVEOOPS_H
+#define _SAVEOOPS_H
+
+/*
+ * Definitions shared between Saveoops C and assembly code.
+ */
+
+#define RMODE_STACK_LEN		0x1000	/* Arbitrary */
+
+#ifndef __ASSEMBLY__
+
+void __init saveoops_init(void);
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _SAVEOOPS_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 34244b2..9a097f2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -121,4 +121,7 @@ ifeq ($(CONFIG_X86_64),y)
 
 	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
+
+	obj-$(CONFIG_SAVEOOPS)		+= saveoops.o
+	obj-$(CONFIG_SAVEOOPS)		+= saveoops-rmode.o
 endif
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 4a78f8c..b994791 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -231,6 +231,21 @@ config BOOTPARAM_HUNG_TASK_PANIC
 
 	  Say N if unsure.
 
+config SAVEOOPS
+	bool "Save kernel panics to disk using BIOS"
+	depends on X86_64
+	---help---
+	  <TO-BE-ADDED>
+
+config SAVEOOPS_DISK_LBA
+       int "Boot disk LBA offset to save panic to"
+       default -1
+       depends on SAVEOOPS
+       ---help---
+	 Use this boot disk LBA address to save the kernel log.
+	 To find a partition LBA address use: $fdisk -ul
+	 [VERY DANGEROUS] <FURTHER-INFO-TO-BE-ADDED>
+
 config BOOTPARAM_HUNG_TASK_PANIC_VALUE
 	int
 	depends on DETECT_HUNG_TASK

--
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 63%]

* Re: [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
    @ 2011-01-25 15:36 87%   ` Ahmed S. Darwish
    1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2011-01-25 15:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86-ML, Tony Luck,
	Dave Jones, Andrew Morton, Randy Dunlap, Willy Tarreau,
	Willy Tarreau, Dirk Hohndel, Dirk.Hohndel, IDE-ML, LKML,
	Linus Torvalds, Peter Zijlstra, Frédéric Weisbecker,
	Borislav Petkov, Arjan van de Ven

On Tue, Jan 25, 2011 at 03:09:48PM +0100, Ingo Molnar wrote:
> 
> * Ahmed S. Darwish <darwish.07@gmail.com> wrote:
> 
> > Hi,
> > 
> > I've faced some very early panics in latest kernel. Being a run of the mill
> > x86 laptop, the machine is void of debugging aids like serial ports or
> > network boot.
> > 
> > As a possible solution, below patches prototypes the idea of persistently
> > storing the kernel log ring to a hard disk partition using the enhanced BIOS
> > 0x13 services.

[...]

> > 
> > Diffstat:
> > 
> >  arch/x86/kernel/saveoops-rmode.S |  483 ++++++++++++++++++++++++++++++++++++++
> >  arch/x86/include/asm/saveoops.h  |   15 ++
> >  arch/x86/kernel/saveoops.c       |  219 +++++++++++++++++
> >  arch/x86/kernel/setup.c          |    9 +
> >  arch/x86/kernel/Makefile         |    3 +
> >  lib/Kconfig.debug                |   15 ++
> >  6 files changed, 744 insertions(+), 0 deletions(-)
> 
> Ok, i have to admit that while i'm a rabid BIOS-hater i find this debug feature very 
> very interesting, for the plain reason that if it's implemented in a robust and 
> clever way then this has a chance to improve debuggability of pretty much any Linux 
> laptop quite enormously!
> 
> While we generally thoroughly hate BIOSes from beginning to end, one thing can be 
> said, a BIOS bootstraps very early during bootup, and it's relatively simple to 
> trigger as well.
> 

Yes, the BIOS subset used is the same one used by SYSLINUX, grub, etc to load
the kernel. If the kernel is on, these functions are hopefully quite stable.

The complete __roadblock__ I'm currently facing though is restoring the disk
controllers to the state originally setup by the BIOS Power-on self-test (POST).
I hope such re-initialization is even technically feasible.

Without such re-initialization, we'll just be risking the BIOS code exploding.
That was the case in the 5-minute hang described in the cover sheet (PATCH #0).

> Also, since latest kernels do not stomp on BIOS data structures anymore (low RAM), 
> there's some good chance it's still functional at the point of crash - be that an 
> early crash or a later crash.
> 

Yes, I've noticed that thankfully the kernel reserves the BIOS EBDA area. I'm
not sure though if it reserves the real-mode vector table area 0x0 -> 0x400?

> I think the biggest areas of practical concern would be:
> 
>  - Can this mechanism ever, under any circumstance corrupt any real data, destroy 
>    the MBR or do other nasties. Can you think of any additional fail-safe measures 
>    where you could _further robustify the BIOS calls_ to make sure it can never go 
>    to the wrong sector(s)? I really do not want to think of trusting a BIOS to 
>    _write to my disk_.
> 

Admittedly, I was quite worried while testing this on disk. As a possible
safety trigger, I'm thinking of coding a small script under "tools/" that will
write a cookie in the desired area. If the real-mode code did not find such
cookie in the expected place, it abandons the write operation.

>  - Is there some hidden disk area somewhere on PCs, or somewhere on a real partition
>    on typical Linux distributions, which we could use without having to reinstall
>    the box? This would increase utility and availability enormously. I'm thinking of 
>    partition _ends_ which sometimes get rounded in an awkward way and which are 
>    potentially skipped by most Linux filesystems. Even a very small, 512 bytes of 
>    area would be extremely useful for debugging weird suspend hangs ...
> 

I did not have to re-partition the box here. A kindof a hacky solution was
disabling the swap partition and using it for storing the log. That would make
the feature available without re-installing the box, at the cost of temporarily
disabling swap.

Another testing method was booting the kernel from a USB stick and writing the
log there. If there are other free places in the boot disk, it can be used. But
I'm afraid it will make saveoops much more dangerous than what it already is
(by virtue of being much closer to file-system structures and such).

>  - Could we automate the recovery of the dump, and just put it into the regular 
>    kernel log on the next (successful) bootup (on a feature-enabled kernel)? That 
>    would make the log of the 'previous crash' very conveniently available in dmesg 
>    and the syslog. Tools like kerneloops could make use of it immediately.
> 

If I can make the code reliably work, it can be very very useful in a huge
number of situations such as these and more. The catch is the "If" part :)

> All in one, a very intriguing idea IMO, and the hardest bits (lowlevel x86 
> transition) is all implemented already.
> 

There is a place under arch/x86 that already does the lmode->rmode transition?

That was (so far) the hardest part; I quite searched the tree but did not find
any, so I rolled my own lmode->rmode switch code in the file "saveoops-rmode.S"

> 
> 	Ingo

Thanks,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 87%]

* Re: [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
  @ 2011-01-25 17:05 96%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-01-25 17:05 UTC (permalink / raw)
  To: James Bottomley
  Cc: Tejun Heo, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, X86-ML, Tony Luck, Dave Jones, Andrew Morton,
	Randy Dunlap, Willy Tarreau, Willy Tarreau, Dirk Hohndel,
	Dirk.Hohndel, IDE-ML, LKML, Linus Torvalds, Peter Zijlstra,
	Frédéric Weisbecker, Borislav Petkov, Arjan van de Ven,
	Jeff Garzik

Hi!

On Tue, Jan 25, 2011 at 11:02:35AM -0500, James Bottomley wrote:
> On Tue, 2011-01-25 at 17:36 +0200, Ahmed S. Darwish wrote:
> > The complete __roadblock__ I'm currently facing though is restoring the disk
> > controllers to the state originally setup by the BIOS Power-on self-test (POST).
> > I hope such re-initialization is even technically feasible.
> > 
> > Without such re-initialization, we'll just be risking the BIOS code exploding.
> > That was the case in the 5-minute hang described in the cover sheet (PATCH #0).
> 
> So this is the bit that's not really technically feasible.  BIOS tends
> to run storage devices in a very primitive way (so it takes basic
> settings, for example and sets the device up for one particular channel
> of access).  When preparing the device for an operating system, we have
> to blow away all the bios stuff and put it into a more generally
> performant mode (this isn't just the storage per se, it's also the
> interrupt and routing).  Unfortunately, currently, we don't bother to
> save the settings the BIOS was using, so there's no way to reinitialise
> the device back to bios without an effective reboot.
>

My current x86 laptop includes the very common ATA PIIX controller. If I
dumped the PIC, IOAPIC, and disk controller state in a register file in a
safe area, and re-initialized these before giving control to the BIOS, can
this move the solution space from being not really technically feasible to
"quite possible"?

I have to admit that my knowledge of disk controllers is nil. The final
result though can really be worth the effort (as stated by Ingo's mail),
so I'm _willing_ to explore and prototype different paths.

>
> Most BIOS doesn't seem to contain storage re-POST code that's usable
> (it's all embedded in the boot sequence).
>

There's this arcane `Warm boot with partial re-initialization' BIOS feature
using the CMOS register 0xf, but I don't want to to depend on it either.

> James
> 

Thanks,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 96%]

* Re: [PATCH -next 2/2][RFC] x86: Saveoops: Reserve low memory and register code
  @ 2011-01-26  9:04 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-01-26  9:04 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, X86-ML, Tony Luck,
	Dave Jones, Andrew Morton, Randy Dunlap, Willy Tarreau,
	Willy Tarreau, Dirk Hohndel, Dirk.Hohndel, IDE-ML, LKML,
	Linus Torvalds, Peter Zijlstra, Frédéric Weisbecker,
	Borislav Petkov, Arjan van de Ven, Tejun Heo, James Bottomley,
	Mark Lord, Jeff Garzik

On Tue, Jan 25, 2011 at 09:29:58AM -0800, H. Peter Anvin wrote:
> 
> However, I'm quite nervous about this -- this patch has *plenty* of real
> possibility of wrecking data.
> 

Yes, it does.

Keep in mind though that now I'm just prototyping different solutions to
a problem I'm facing. I fully understand that in no way this patch is
going to be merged in its current state.

I'll send another email summarizing the criticism and proposing different
paths in a moment.

thanks,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 99%]

* Re: [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
  @ 2011-01-26 11:44 99%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-01-26 11:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, X86-ML, Tony Luck,
	Dave Jones, Andrew Morton, Randy Dunlap, Willy Tarreau,
	Willy Tarreau, Dirk Hohndel, Dirk.Hohndel, IDE-ML, LKML,
	Linus Torvalds, Peter Zijlstra, Frédéric Weisbecker,
	Borislav Petkov, Arjan van de Ven, Tejun Heo, James Bottomley,
	Mark Lord, Jeff Garzik

On Tue, Jan 25, 2011 at 09:33:18AM -0800, H. Peter Anvin wrote:
> On 01/25/2011 07:08 AM, Tejun Heo wrote:
> 
> >> All in one, a very intriguing idea IMO, and the hardest bits
> >> (lowlevel x86 transition) is all implemented already.
> 
> Lowlevel x86 transition is not at all the hardest part.  It's
> detail-oriented but well defined (and, I might add, incompletely
> implemented -- 64 bits only, not using facilities we already have);
>

Making the transition code applicable to 32-bit CPUs will hopefully be
quite simple: it's mostly a subset of the posted parts.

Since I might use that transition code for a different purpose now, any
facilities not properly used beside the identity mappings?

thanks,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 99%]

* Re: [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
       [not found]               ` <m1sjweyeax.fsf@fess.ebiederm.org>
@ 2011-02-02 11:13 99%             ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-02-02 11:13 UTC (permalink / raw)
  To: Ingo Molnar, Eric W. Biederman
  Cc: H. Peter Anvin, Vivek Goyal, Linus Torvalds, Thomas Gleixner,
	Ingo Molnar, X86-ML, Tony Luck, Dave Jones, Andrew Morton,
	Randy Dunlap, Willy Tarreau, Willy Tarreau, Dirk Hohndel,
	Dirk.Hohndel, IDE-ML, LKML, Peter Zijlstra,
	Frédéric Weisbecker, Borislav Petkov, Arjan van de Ven,
	Tejun Heo, James Bottomley, Mark Lord, Jeff Garzik, Haren Myneni,
	KEXEC-ML, FBDEV-ML

Hi,

Quick note: the Internet has just returned back here after a full
5-day shutdown by the “authorities”. I will hopefully return back
home on Saturday to continue working on this.

thanks,

^ permalink raw reply	[relevance 99%]

* [PATCH -next] Documentation: Improve crashkernel= description
@ 2011-02-06 15:41 99% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2011-02-06 15:41 UTC (permalink / raw)
  To: Randy Dunlap, Vivek Goyal, Haren Myneni
  Cc: Eric Biederman, kexec, LKML, linux-doc

(Also applicable over 2.6.38-rc3)

Had to explore two C code files to make sense of the 'crashkernel='
kernel parameter values.  Improve the situation.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 89835a4..8f26b42 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -545,9 +545,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			Format:
 			<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
 
-	crashkernel=nn[KMG]@ss[KMG]
-			[KNL] Reserve a chunk of physical memory to
-			hold a kernel to switch to with kexec on panic.
+	crashkernel=size[KMG][@offset[KMG]]
+			[KNL] Using kexec, Linux can switch to a 'crash kernel'
+			upon panic. This parameter reserves the physical
+			memory region [offset, offset + size] for that kernel
+			image. If the '@offset' part was ignored, Linux finds
+			a suitable crash image start address automatically.
 
 	crashkernel=range1:size1[,range2:size2,...][@offset]
 			[KNL] Same as above, but depends on the memory

regards,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 99%]

* [PATCH -next] Documentation: Explain the [KMG] parameters suffix
@ 2011-02-06 17:45 99% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2011-02-06 17:45 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: LKML, linux-doc

(Also applicable over 2.6.38-rc3)

The '[KMG]' suffix is commonly described after a number of kernel
parameter values documentation.  Explicitly state its semantics.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 89835a4..3deedce 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -144,6 +144,11 @@ a fixed number of characters. This limit depends on the architecture
 and is between 256 and 4096 characters. It is defined in the file
 ./include/asm/setup.h as COMMAND_LINE_SIZE.
 
+Finally, the [KMG] suffix is commonly described after a number of kernel
+parameter values. These 'K', 'M', and 'G' letters represent the _binary_
+multipliers 'Kilo', 'Mega', and 'Giga', equalling 2^10, 2^20, and 2^30
+bytes respectively. Such letter suffixes can also be entirely omitted.
+
 
 	acpi=		[HW,ACPI,X86]
 			Advanced Configuration and Power Interface

regards,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 99%]

* [PATCH v2 -next] Documentation: Improve crashkernel= description
  @ 2011-02-07 11:30 99%     ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2011-02-07 11:30 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Simon Horman, Vivek Goyal, Haren Myneni, Eric Biederman, kexec,
	LKML, linux-doc


Had to explore two C code files to make sense of the 'crashkernel='
kernel parameter values.  Improve the situation.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 89835a4..5ad9980 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -545,9 +545,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			Format:
 			<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
 
-	crashkernel=nn[KMG]@ss[KMG]
-			[KNL] Reserve a chunk of physical memory to
-			hold a kernel to switch to with kexec on panic.
+	crashkernel=size[KMG][@offset[KMG]]
+			[KNL] Using kexec, Linux can switch to a 'crash kernel'
+			upon panic. This parameter reserves the physical
+			memory region [offset, offset + size] for that kernel
+			image. If '@offset' is omitted, then a suitable offset
+			is selected automatically.
 
 	crashkernel=range1:size1[,range2:size2,...][@offset]
 			[KNL] Same as above, but depends on the memory

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 99%]

* Re: [PATCH] Documentation: log_buf_len accepts [KMG]
  @ 2011-02-07 11:40 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-02-07 11:40 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: LKML, linux-doc, Randy Dunlap

On Sun, Feb 06, 2011 at 08:40:02PM -0800, Randy Dunlap wrote:
> 
> Update the "log_buf_len" description to use [KMG] syntax for the
> buffer size.
> 
> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
> ---
> 

This implies an ACK for the original patch?

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply	[relevance 99%]

* [PATCH v3 -next] Documentation: Improve crashkernel= description
  @ 2011-02-07 16:01 96%         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2011-02-07 16:01 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Randy Dunlap, Simon Horman, Haren Myneni, Eric Biederman, kexec,
	LKML, linux-doc

On Mon, Feb 07, 2011 at 09:25:50AM -0500, Vivek Goyal wrote:
> On Mon, Feb 07, 2011 at 01:30:54PM +0200, Ahmed S. Darwish wrote:
> > 
> > Had to explore two C code files to make sense of the 'crashkernel='
> > kernel parameter values.  Improve the situation.
> > 
> 
> Did you look at Documentation/kdump/kdump.txt before looking into the
> code. I thought kdump.txt explained the meaning of crashkernel= well.
> 
> In case if it was not obivious that for further details look into
> kdump.txt, I will suggest to add a line asking reader to look into
> kdump.txt for more details.
> 

Yes, I jumped to the code first :-) Here's a new patch with the link:

(Also applicable over latest -rc3)

==>

Complete the crashkernel= kernel parameter documentation.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 89835a4..050b0e5 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -545,16 +545,20 @@ and is between 256 and 4096 characters. It is defined in the file
 			Format:
 			<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
 
-	crashkernel=nn[KMG]@ss[KMG]
-			[KNL] Reserve a chunk of physical memory to
-			hold a kernel to switch to with kexec on panic.
+	crashkernel=size[KMG][@offset[KMG]]
+			[KNL] Using kexec, Linux can switch to a 'crash kernel'
+			upon panic. This parameter reserves the physical
+			memory region [offset, offset + size] for that kernel
+			image. If '@offset' is omitted, then a suitable offset
+			is selected automatically. Check
+			Documentation/kdump/kdump.txt for further details.
 
 	crashkernel=range1:size1[,range2:size2,...][@offset]
 			[KNL] Same as above, but depends on the memory
 			in the running system. The syntax of range is
 			start-[end] where start and end are both
 			a memory unit (amount[KMG]). See also
-			Documentation/kdump/kdump.txt for a example.
+			Documentation/kdump/kdump.txt for an example.
 
 	cs89x0_dma=	[HW,NET]
 			Format: <dma>


> 
> Acked-by: Vivek Goyal <vgoyal@redhat.com>
> 

thanks,

-- 
Darwish
http://darwish.07.googlepages.com

^ permalink raw reply related	[relevance 96%]

* [PATCH] can: kvaser_usb: Don't free packets when tight on URBs
@ 2014-12-23 15:46 96% Ahmed S. Darwish
  2014-12-23 15:53 40% ` [PATCH] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Ahmed S. Darwish @ 2014-12-23 15:46 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in high frequency caused seemingly-random panics in the
kernel.

On further inspection, it seems the driver erroneously freed the
to-be-transmitted packet upon getting tight on URBs and returning
NETDEV_TX_BUSY, leading to invalid memory writes and double frees
at a later point in time.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

 (Generated over 3.19.0-rc1)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 541fb7a..34c35d8 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1286,6 +1286,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	bool kfree_skb_on_error = true;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1336,6 +1337,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	if (!context) {
 		netdev_warn(netdev, "cannot find free context\n");
+		kfree_skb_on_error = false;
 		ret =  NETDEV_TX_BUSY;
 		goto releasebuf;
 	}
@@ -1364,8 +1366,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (unlikely(err)) {
 		can_free_echo_skb(netdev, context->echo_index);
 
-		skb = NULL; /* set to NULL to avoid double free in
-			     * dev_kfree_skb(skb) */
+		kfree_skb_on_error = false;
 
 		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
@@ -1389,7 +1390,8 @@ releasebuf:
 nobufmem:
 	usb_free_urb(urb);
 nourbmem:
-	dev_kfree_skb(skb);
+	if (kfree_skb_on_error)
+		dev_kfree_skb(skb);
 	return ret;
 }


^ permalink raw reply related	[relevance 96%]

* [PATCH] can: kvaser_usb: Add support for the Usbcan-II family
  2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
@ 2014-12-23 15:53 40% ` Ahmed S. Darwish
                       ` (6 subsequent siblings)
  7 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2014-12-23 15:53 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
divided into two major families: 'Leaf', and 'UsbcanII'.  From an
Operating System perspective, the firmware of both families behave
in a not too drastically different fashion.

This patch adds support for the UsbcanII family of devices to the
current Kvaser Leaf-only driver.

CAN frames sending, receiving, and error handling paths has been
tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
should also work nicely with other products in the same category.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 630 +++++++++++++++++++++++++++++++--------
 1 file changed, 505 insertions(+), 125 deletions(-)

 (Generated over 3.19.0-rc1 + generic bugfix at
  can-kvaser_usb-Don-t-free-packets-when-tight-on-URBs.patch)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 34c35d8..e7076da 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -6,12 +6,15 @@
  * Parts of this driver are based on the following:
  *  - Kvaser linux leaf driver (version 4.78)
  *  - CAN driver for esd CAN-USB/2
+ *  - Kvaser linux usbcanII driver (version 5.3)
  *
  * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
  * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
  * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
+ * Copyright (C) 2014 Valeo Corporation
  */
 
+#include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/module.h>
 #include <linux/netdevice.h>
@@ -21,6 +24,18 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/*
+ * Kvaser USB CAN dongles are divided into two major families:
+ * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
+ * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
+ */
+enum kvaser_usb_family {
+	KVASER_LEAF,
+	KVASER_USBCAN,
+};
+
 #define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
@@ -29,9 +44,12 @@
 #define USB_RECV_TIMEOUT		1000 /* msecs */
 #define RX_BUFFER_SIZE			3072
 #define CAN_USB_CLOCK			8000000
-#define MAX_NET_DEVICES			3
+#define LEAF_MAX_NET_DEVICES		3
+#define USBCAN_MAX_NET_DEVICES		2
+#define MAX_NET_DEVICES			MAX(LEAF_MAX_NET_DEVICES, \
+					    USBCAN_MAX_NET_DEVICES)
 
-/* Kvaser USB devices */
+/* Leaf USB devices */
 #define KVASER_VENDOR_ID		0x0bfd
 #define USB_LEAF_DEVEL_PRODUCT_ID	10
 #define USB_LEAF_LITE_PRODUCT_ID	11
@@ -55,6 +73,16 @@
 #define USB_CAN_R_PRODUCT_ID		39
 #define USB_LEAF_LITE_V2_PRODUCT_ID	288
 #define USB_MINI_PCIE_HS_PRODUCT_ID	289
+#define LEAF_PRODUCT_ID(id)		(id >= USB_LEAF_DEVEL_PRODUCT_ID && \
+					 id <= USB_MINI_PCIE_HS_PRODUCT_ID)
+
+/* USBCANII devices */
+#define USB_USBCAN_REVB_PRODUCT_ID	2
+#define USB_VCI2_PRODUCT_ID		3
+#define USB_USBCAN2_PRODUCT_ID		4
+#define USB_MEMORATOR_PRODUCT_ID	5
+#define USBCAN_PRODUCT_ID(id)		(id >= USB_USBCAN_REVB_PRODUCT_ID && \
+					 id <= USB_MEMORATOR_PRODUCT_ID)
 
 /* USB devices features */
 #define KVASER_HAS_SILENT_MODE		BIT(0)
@@ -73,7 +101,7 @@
 #define MSG_FLAG_TX_ACK			BIT(6)
 #define MSG_FLAG_TX_REQUEST		BIT(7)
 
-/* Can states */
+/* Can states (M16C CxSTRH register) */
 #define M16C_STATE_BUS_RESET		BIT(0)
 #define M16C_STATE_BUS_ERROR		BIT(4)
 #define M16C_STATE_BUS_PASSIVE		BIT(5)
@@ -98,7 +126,13 @@
 #define CMD_START_CHIP_REPLY		27
 #define CMD_STOP_CHIP			28
 #define CMD_STOP_CHIP_REPLY		29
-#define CMD_GET_CARD_INFO2		32
+#define CMD_READ_CLOCK			30
+#define CMD_READ_CLOCK_REPLY		31
+
+#define LEAF_CMD_GET_CARD_INFO2		32
+#define USBCAN_CMD_RESET_CLOCK		32
+#define USBCAN_CMD_CLOCK_OVERFLOW_EVENT	33
+
 #define CMD_GET_CARD_INFO		34
 #define CMD_GET_CARD_INFO_REPLY		35
 #define CMD_GET_SOFTWARE_INFO		38
@@ -108,8 +142,9 @@
 #define CMD_RESET_ERROR_COUNTER		49
 #define CMD_TX_ACKNOWLEDGE		50
 #define CMD_CAN_ERROR_EVENT		51
-#define CMD_USB_THROTTLE		77
-#define CMD_LOG_MESSAGE			106
+
+#define LEAF_CMD_USB_THROTTLE		77
+#define LEAF_CMD_LOG_MESSAGE		106
 
 /* error factors */
 #define M16C_EF_ACKE			BIT(0)
@@ -121,6 +156,13 @@
 #define M16C_EF_RCVE			BIT(6)
 #define M16C_EF_TRE			BIT(7)
 
+/* Only Leaf-based devices can report M16C error factors,
+ * thus define our own error status flags for USBCAN */
+#define USBCAN_ERROR_STATE_NONE		0
+#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
+#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
+#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
+
 /* bittiming parameters */
 #define KVASER_USB_TSEG1_MIN		1
 #define KVASER_USB_TSEG1_MAX		16
@@ -137,7 +179,7 @@
 #define KVASER_CTRL_MODE_SELFRECEPTION	3
 #define KVASER_CTRL_MODE_OFF		4
 
-/* log message */
+/* Extended CAN identifier flag */
 #define KVASER_EXTENDED_FRAME		BIT(31)
 
 struct kvaser_msg_simple {
@@ -148,30 +190,55 @@ struct kvaser_msg_simple {
 struct kvaser_msg_cardinfo {
 	u8 tid;
 	u8 nchannels;
-	__le32 serial_number;
-	__le32 padding;
+	union {
+		struct {
+			__le32 serial_number;
+			__le32 padding;
+		} __packed leaf0;
+		struct {
+			__le32 serial_number_low;
+			__le32 serial_number_high;
+		} __packed usbcan0;
+	} __packed;
 	__le32 clock_resolution;
 	__le32 mfgdate;
 	u8 ean[8];
 	u8 hw_revision;
-	u8 usb_hs_mode;
-	__le16 padding2;
+	union {
+		struct {
+			u8 usb_hs_mode;
+		} __packed leaf1;
+		struct {
+			u8 padding;
+		} __packed usbcan1;
+	} __packed;
+	__le16 padding;
 } __packed;
 
 struct kvaser_msg_cardinfo2 {
 	u8 tid;
-	u8 channel;
+	u8 reserved;
 	u8 pcb_id[24];
 	__le32 oem_unlock_code;
 } __packed;
 
-struct kvaser_msg_softinfo {
+struct leaf_msg_softinfo {
 	u8 tid;
-	u8 channel;
+	u8 padding0;
 	__le32 sw_options;
 	__le32 fw_version;
 	__le16 max_outstanding_tx;
-	__le16 padding[9];
+	__le16 padding1[9];
+} __packed;
+
+struct usbcan_msg_softinfo {
+	u8 tid;
+	u8 fw_name[5];
+	__le16 max_outstanding_tx;
+	u8 padding[6];
+	__le32 fw_version;
+	__le16 checksum;
+	__le16 sw_options;
 } __packed;
 
 struct kvaser_msg_busparams {
@@ -188,36 +255,86 @@ struct kvaser_msg_tx_can {
 	u8 channel;
 	u8 tid;
 	u8 msg[14];
-	u8 padding;
-	u8 flags;
+	union {
+		struct {
+			u8 padding;
+			u8 flags;
+		} __packed leaf;
+		struct {
+			u8 flags;
+			u8 padding;
+		} __packed usbcan;
+	} __packed;
+} __packed;
+
+struct kvaser_msg_rx_can_header {
+	u8 channel;
+	u8 flag;
 } __packed;
 
-struct kvaser_msg_rx_can {
+struct leaf_msg_rx_can {
 	u8 channel;
 	u8 flag;
+
 	__le16 time[3];
 	u8 msg[14];
 } __packed;
 
-struct kvaser_msg_chip_state_event {
+struct usbcan_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
+	u8 msg[14];
+	__le16 time;
+} __packed;
+
+struct leaf_msg_chip_state_event {
 	u8 tid;
 	u8 channel;
+
 	__le16 time[3];
 	u8 tx_errors_count;
 	u8 rx_errors_count;
+
 	u8 status;
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_tx_acknowledge {
+struct usbcan_msg_chip_state_event {
+	u8 tid;
+	u8 channel;
+
+	u8 tx_errors_count;
+	u8 rx_errors_count;
+	__le16 time;
+
+	u8 status;
+	u8 padding[3];
+} __packed;
+
+struct kvaser_msg_tx_acknowledge_header {
+	u8 channel;
+	u8 tid;
+};
+
+struct leaf_msg_tx_acknowledge {
 	u8 channel;
 	u8 tid;
+
 	__le16 time[3];
 	u8 flags;
 	u8 time_offset;
 } __packed;
 
-struct kvaser_msg_error_event {
+struct usbcan_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
+	__le16 time;
+	__le16 padding;
+} __packed;
+
+struct leaf_msg_error_event {
 	u8 tid;
 	u8 flags;
 	__le16 time[3];
@@ -229,6 +346,18 @@ struct kvaser_msg_error_event {
 	u8 error_factor;
 } __packed;
 
+struct usbcan_msg_error_event {
+	u8 tid;
+	u8 padding;
+	u8 tx_errors_count_ch0;
+	u8 rx_errors_count_ch0;
+	u8 tx_errors_count_ch1;
+	u8 rx_errors_count_ch1;
+	u8 status_ch0;
+	u8 status_ch1;
+	__le16 time;
+} __packed;
+
 struct kvaser_msg_ctrl_mode {
 	u8 tid;
 	u8 channel;
@@ -243,7 +372,7 @@ struct kvaser_msg_flush_queue {
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_log_message {
+struct leaf_msg_log_message {
 	u8 channel;
 	u8 flags;
 	__le16 time[3];
@@ -260,19 +389,49 @@ struct kvaser_msg {
 		struct kvaser_msg_simple simple;
 		struct kvaser_msg_cardinfo cardinfo;
 		struct kvaser_msg_cardinfo2 cardinfo2;
-		struct kvaser_msg_softinfo softinfo;
 		struct kvaser_msg_busparams busparams;
+
+		struct kvaser_msg_rx_can_header rx_can_header;
+		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
+
+		union {
+			struct leaf_msg_softinfo softinfo;
+			struct leaf_msg_rx_can rx_can;
+			struct leaf_msg_chip_state_event chip_state_event;
+			struct leaf_msg_tx_acknowledge tx_acknowledge;
+			struct leaf_msg_error_event error_event;
+			struct leaf_msg_log_message log_message;
+		} __packed leaf;
+
+		union {
+			struct usbcan_msg_softinfo softinfo;
+			struct usbcan_msg_rx_can rx_can;
+			struct usbcan_msg_chip_state_event chip_state_event;
+			struct usbcan_msg_tx_acknowledge tx_acknowledge;
+			struct usbcan_msg_error_event error_event;
+		} __packed usbcan;
+
 		struct kvaser_msg_tx_can tx_can;
-		struct kvaser_msg_rx_can rx_can;
-		struct kvaser_msg_chip_state_event chip_state_event;
-		struct kvaser_msg_tx_acknowledge tx_acknowledge;
-		struct kvaser_msg_error_event error_event;
 		struct kvaser_msg_ctrl_mode ctrl_mode;
 		struct kvaser_msg_flush_queue flush_queue;
-		struct kvaser_msg_log_message log_message;
 	} u;
 } __packed;
 
+/* Leaf/USBCAN-agnostic summary of an error event.
+ * No M16C error factors for USBCAN-based devices. */
+struct kvaser_error_summary {
+	u8 channel, status, txerr, rxerr;
+	union {
+		struct {
+			u8 error_factor;
+		} leaf;
+		struct {
+			u8 other_ch_status;
+			u8 error_state;
+		} usbcan;
+	};
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -288,6 +447,8 @@ struct kvaser_usb {
 
 	u32 fw_version;
 	unsigned int nchannels;
+	enum kvaser_usb_family family;
+	unsigned int max_channels;
 
 	bool rxinitdone;
 	void *rxbuf[MAX_RX_URBS];
@@ -311,6 +472,7 @@ struct kvaser_usb_net_priv {
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
+	/* Leaf family IDs */
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
@@ -360,6 +522,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
 		.driver_info = KVASER_HAS_TXRX_ERRORS },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+
+	/* USBCANII family IDs */
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+
 	{ }
 };
 MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
@@ -463,7 +636,18 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	if (err)
 		return err;
 
-	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
+	switch (dev->family) {
+	case KVASER_LEAF:
+		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		break;
+	case KVASER_USBCAN:
+		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		return -EINVAL;
+	}
 
 	return 0;
 }
@@ -482,7 +666,7 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
 		return err;
 
 	dev->nchannels = msg.u.cardinfo.nchannels;
-	if (dev->nchannels > MAX_NET_DEVICES)
+	if (dev->nchannels > dev->max_channels)
 		return -EINVAL;
 
 	return 0;
@@ -496,8 +680,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	u8 channel = msg->u.tx_acknowledge.channel;
-	u8 tid = msg->u.tx_acknowledge.tid;
+	u8 channel, tid;
+
+	channel = msg->u.tx_acknowledge_header.channel;
+	tid = msg->u.tx_acknowledge_header.tid;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -615,37 +801,83 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
 }
 
-static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
-				const struct kvaser_msg *msg)
+static void kvaser_report_error_event(const struct kvaser_usb *dev,
+				      struct kvaser_error_summary *es);
+
+/*
+ * Report error to userspace iff the controller's errors counter has
+ * increased, or we're the only channel seeing the bus error state.
+ *
+ * As reported by USBCAN sheets, "the CAN controller has difficulties
+ * to tell whether an error frame arrived on channel 1 or on channel 2."
+ * Thus, error counters are compared with their earlier values to
+ * determine which channel was responsible for the error event.
+ */
+static void usbcan_report_error_if_applicable(const struct kvaser_usb *dev,
+					      struct kvaser_error_summary *es)
 {
-	struct can_frame *cf;
-	struct sk_buff *skb;
-	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
+	int old_tx_err_count, old_rx_err_count, channel, report_error;
+
+	channel = es->channel;
+	if (channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", channel);
+		return;
+	}
+
+	priv = dev->nets[channel];
+	old_tx_err_count = priv->bec.txerr;
+	old_rx_err_count = priv->bec.rxerr;
+
+	report_error = 0;
+	if (es->txerr > old_tx_err_count) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
+		report_error = 1;
+	}
+	if (es->rxerr > old_rx_err_count) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
+		report_error = 1;
+	}
+	if ((es->status & M16C_STATE_BUS_ERROR) &&
+	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
+		report_error = 1;
+	}
+
+	if (report_error)
+		kvaser_report_error_event(dev, es);
+}
+
+/*
+ * Extract error summary from a Leaf-based device error message
+ */
+static void leaf_extract_error_from_msg(const struct kvaser_usb *dev,
+					const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { 0, };
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
+		es.channel = msg->u.leaf.error_event.channel;
+		es.status =  msg->u.leaf.error_event.status;
+		es.txerr = msg->u.leaf.error_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
+		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
 		break;
-	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
+	case LEAF_CMD_LOG_MESSAGE:
+		es.channel = msg->u.leaf.log_message.channel;
+		es.status = msg->u.leaf.log_message.data[0];
+		es.txerr = msg->u.leaf.log_message.data[2];
+		es.rxerr = msg->u.leaf.log_message.data[3];
+		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
 		break;
 	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
+		es.channel = msg->u.leaf.chip_state_event.channel;
+		es.status =  msg->u.leaf.chip_state_event.status;
+		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
+		es.leaf.error_factor = 0;
 		break;
 	default:
 		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
@@ -653,16 +885,92 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (channel >= dev->nchannels) {
+	kvaser_report_error_event(dev, &es);
+}
+
+/*
+ * Extract summary from a USBCANII-based device error message.
+ */
+static void usbcan_extract_error_from_msg(const struct kvaser_usb *dev,
+					const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { 0, };
+
+	switch (msg->id) {
+
+	/* Sometimes errors are sent as unsolicited chip state events */
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.usbcan.chip_state_event.channel;
+		es.status =  msg->u.usbcan.chip_state_event.status;
+		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
+		usbcan_report_error_if_applicable(dev, &es);
+		break;
+
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = 0;
+		es.status = msg->u.usbcan.error_event.status_ch0;
+		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
+		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
+		es.usbcan.other_ch_status =
+			msg->u.usbcan.error_event.status_ch1;
+		usbcan_report_error_if_applicable(dev, &es);
+
+		/* For error events, the USBCAN firmware does not support
+		 * more than 2 channels: ch0, and ch1. */
+		if (dev->nchannels > 1) {
+			es.channel = 1;
+			es.status = msg->u.usbcan.error_event.status_ch1;
+			es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch1;
+			es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch1;
+			es.usbcan.other_ch_status =
+				msg->u.usbcan.error_event.status_ch0;
+			usbcan_report_error_if_applicable(dev, &es);
+		}
+		break;
+
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+	}
+}
+
+static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
+				const struct kvaser_msg *msg)
+{
+	switch (dev->family) {
+	case KVASER_LEAF:
+		leaf_extract_error_from_msg(dev, msg);
+		break;
+	case KVASER_USBCAN:
+		usbcan_extract_error_from_msg(dev, msg);
+		break;
+	default:
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
+}
 
-	priv = dev->nets[channel];
+static void kvaser_report_error_event(const struct kvaser_usb *dev,
+				      struct kvaser_error_summary *es)
+{
+	struct can_frame *cf;
+	struct sk_buff *skb;
+	struct net_device_stats *stats;
+	struct kvaser_usb_net_priv *priv;
+	unsigned int new_state;
+
+	if (es->channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", es->channel);
+		return;
+	}
+
+	priv = dev->nets[es->channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
+	if (es->status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
@@ -675,9 +983,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	new_state = priv->can.state;
 
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	if (status & M16C_STATE_BUS_OFF) {
+	if (es->status & M16C_STATE_BUS_OFF) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
 		priv->can.can_stats.bus_off++;
@@ -687,12 +995,12 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		netif_carrier_off(priv->netdev);
 
 		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
 		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es->txerr || es->rxerr)
+				cf->data[1] = (es->txerr > es->rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
@@ -703,13 +1011,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	}
-
-	if (status == M16C_STATE_BUS_ERROR) {
+	} else if (es->status & M16C_STATE_BUS_ERROR) {
 		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es->txerr > es->rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
 
@@ -723,7 +1029,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 	}
 
-	if (!status) {
+	if (!es->status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
 
@@ -739,34 +1045,52 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
-		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
-
-		if (error_factor & M16C_EF_ACKE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
-					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
-			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
-			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
-			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
-			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
-			cf->data[2] |= CAN_ERR_PROT_TX;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			priv->can.can_stats.bus_error++;
+			stats->rx_errors++;
+
+			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+
+			if (es->leaf.error_factor & M16C_EF_ACKE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
+			if (es->leaf.error_factor & M16C_EF_CRCE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
+						CAN_ERR_PROT_LOC_CRC_DEL);
+			if (es->leaf.error_factor & M16C_EF_FORME)
+				cf->data[2] |= CAN_ERR_PROT_FORM;
+			if (es->leaf.error_factor & M16C_EF_STFE)
+				cf->data[2] |= CAN_ERR_PROT_STUFF;
+			if (es->leaf.error_factor & M16C_EF_BITE0)
+				cf->data[2] |= CAN_ERR_PROT_BIT0;
+			if (es->leaf.error_factor & M16C_EF_BITE1)
+				cf->data[2] |= CAN_ERR_PROT_BIT1;
+			if (es->leaf.error_factor & M16C_EF_TRE)
+				cf->data[2] |= CAN_ERR_PROT_TX;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
+			stats->tx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
+			stats->rx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			priv->can.can_stats.bus_error++;
+			cf->can_id |= CAN_ERR_BUSERROR;
+		}
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto err;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
+	cf->data[6] = es->txerr;
+	cf->data[7] = es->rxerr;
 
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
 
 	priv->can.state = new_state;
 
@@ -774,6 +1098,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+
+	return;
+
+err:
+	dev_kfree_skb(skb);
 }
 
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
@@ -783,16 +1112,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	struct sk_buff *skb;
 	struct net_device_stats *stats = &priv->netdev->stats;
 
-	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
+	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
 		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
-			   msg->u.rx_can.flag);
+			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
 		return;
 	}
 
-	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -819,7 +1148,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
-	u8 channel = msg->u.rx_can.channel;
+	u8 channel = msg->u.rx_can_header.channel;
+	const u8 *rx_msg;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -830,19 +1160,32 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
-	    (msg->id == CMD_LOG_MESSAGE)) {
+	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+	    (dev->family == KVASER_LEAF && msg->id == LEAF_CMD_LOG_MESSAGE)) {
 		kvaser_usb_rx_error(dev, msg);
 		return;
-	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
-					 MSG_FLAG_NERR |
-					 MSG_FLAG_OVERRUN)) {
+	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
+						MSG_FLAG_NERR |
+						MSG_FLAG_OVERRUN)) {
 		kvaser_usb_rx_can_err(priv, msg);
 		return;
-	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
+	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
 		netdev_warn(priv->netdev,
 			    "Unhandled frame (flags: 0x%02x)",
-			    msg->u.rx_can.flag);
+			    msg->u.rx_can_header.flag);
+		return;
+	}
+
+	switch (dev->family) {
+	case KVASER_LEAF:
+		rx_msg = msg->u.leaf.rx_can.msg;
+		break;
+	case KVASER_USBCAN:
+		rx_msg = msg->u.usbcan.rx_can.msg;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
 
@@ -852,38 +1195,37 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (msg->id == CMD_LOG_MESSAGE) {
-		cf->can_id = le32_to_cpu(msg->u.log_message.id);
+	if (dev->family == KVASER_LEAF && msg->id == LEAF_CMD_LOG_MESSAGE) {
+		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
 		if (cf->can_id & KVASER_EXTENDED_FRAME)
 			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
 		else
 			cf->can_id &= CAN_SFF_MASK;
 
-		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
+		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
 
-		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.log_message.data,
+			memcpy(cf->data, &msg->u.leaf.log_message.data,
 			       cf->can_dlc);
 	} else {
-		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
-			     (msg->u.rx_can.msg[1] & 0x3f);
+		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
 
 		if (msg->id == CMD_RX_EXT_MESSAGE) {
 			cf->can_id <<= 18;
-			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
-				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
-				      (msg->u.rx_can.msg[4] & 0x3f);
+			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
+				      ((rx_msg[3] & 0xff) << 6) |
+				      (rx_msg[4] & 0x3f);
 			cf->can_id |= CAN_EFF_FLAG;
 		}
 
-		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
+		cf->can_dlc = get_can_dlc(rx_msg[5]);
 
-		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.rx_can.msg[6],
+			memcpy(cf->data, &rx_msg[6],
 			       cf->can_dlc);
 	}
 
@@ -947,7 +1289,12 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 
 	case CMD_RX_STD_MESSAGE:
 	case CMD_RX_EXT_MESSAGE:
-	case CMD_LOG_MESSAGE:
+		kvaser_usb_rx_can_msg(dev, msg);
+		break;
+
+	case LEAF_CMD_LOG_MESSAGE:
+		if (dev->family != KVASER_LEAF)
+			goto warn;
 		kvaser_usb_rx_can_msg(dev, msg);
 		break;
 
@@ -960,8 +1307,14 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 		kvaser_usb_tx_acknowledge(dev, msg);
 		break;
 
+	/* Ignored messages */
+	case USBCAN_CMD_CLOCK_OVERFLOW_EVENT:
+		if (dev->family != KVASER_USBCAN)
+			goto warn;
+		break;
+
 	default:
-		dev_warn(dev->udev->dev.parent,
+warn:		dev_warn(dev->udev->dev.parent,
 			 "Unhandled message (%d)\n", msg->id);
 		break;
 	}
@@ -1181,7 +1534,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 				  dev->rxbuf[i],
 				  dev->rxbuf_dma[i]);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++) {
+	for (i = 0; i < dev->max_channels; i++) {
 		struct kvaser_usb_net_priv *priv = dev->nets[i];
 
 		if (priv)
@@ -1286,6 +1639,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	uint8_t *msg_tx_can_flags;
 	bool kfree_skb_on_error = true;
 
 	if (can_dropped_invalid_skb(netdev, skb))
@@ -1306,9 +1660,23 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	msg = buf;
 	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
-	msg->u.tx_can.flags = 0;
 	msg->u.tx_can.channel = priv->channel;
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
+		break;
+	case KVASER_USBCAN:
+		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto releasebuf;
+	}
+
+	*msg_tx_can_flags = 0;
+
 	if (cf->can_id & CAN_EFF_FLAG) {
 		msg->id = CMD_TX_EXT_MESSAGE;
 		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
@@ -1326,7 +1694,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
 
 	if (cf->can_id & CAN_RTR_FLAG)
-		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
+		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
@@ -1596,6 +1964,18 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 	if (!dev)
 		return -ENOMEM;
 
+	if (LEAF_PRODUCT_ID(id->idProduct)) {
+		dev->family = KVASER_LEAF;
+		dev->max_channels = LEAF_MAX_NET_DEVICES;
+	} else if (USBCAN_PRODUCT_ID(id->idProduct)) {
+		dev->family = KVASER_USBCAN;
+		dev->max_channels = USBCAN_MAX_NET_DEVICES;
+	} else {
+		dev_err(&intf->dev, "Product ID (%d) does not belong to any "
+				    "known Kvaser USB family", id->idProduct);
+		return -ENODEV;
+	}
+
 	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
 	if (err) {
 		dev_err(&intf->dev, "Cannot get usb endpoint(s)");
@@ -1608,7 +1988,7 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++)
+	for (i = 0; i < dev->max_channels; i++)
 		kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, i);
 
 	err = kvaser_usb_get_software_info(dev);

^ permalink raw reply related	[relevance 40%]

* Re: [PATCH] can: kvaser_usb: Add support for the Usbcan-II family
  @ 2014-12-24 15:04 89%     ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2014-12-24 15:04 UTC (permalink / raw)
  To: Olivier Sobrie
  Cc: Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi Olivier,

On Wed, Dec 24, 2014 at 01:36:27PM +0100, Olivier Sobrie wrote:
> Hello Ahmed,
> 
> On Tue, Dec 23, 2014 at 05:53:11PM +0200, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
> > divided into two major families: 'Leaf', and 'UsbcanII'.  From an
> > Operating System perspective, the firmware of both families behave
> > in a not too drastically different fashion.
> > 
> > This patch adds support for the UsbcanII family of devices to the
> > current Kvaser Leaf-only driver.
> > 
> > CAN frames sending, receiving, and error handling paths has been
> > tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
> > should also work nicely with other products in the same category.
> > 
> 
> Good, thank you :-) I'll try to test the patch during the next
> week-end. Small remarks below.
> 

Great! thanks and Merry Christmas :-)

> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > ---
> >  drivers/net/can/usb/kvaser_usb.c | 630 +++++++++++++++++++++++++++++++--------
> >  1 file changed, 505 insertions(+), 125 deletions(-)
> > 
> >  (Generated over 3.19.0-rc1 + generic bugfix at
> >   can-kvaser_usb-Don-t-free-packets-when-tight-on-URBs.patch)
> > 
> > diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> > index 34c35d8..e7076da 100644
> > --- a/drivers/net/can/usb/kvaser_usb.c
> > +++ b/drivers/net/can/usb/kvaser_usb.c
> > @@ -6,12 +6,15 @@
> >   * Parts of this driver are based on the following:
> >   *  - Kvaser linux leaf driver (version 4.78)
> >   *  - CAN driver for esd CAN-USB/2
> > + *  - Kvaser linux usbcanII driver (version 5.3)
> >   *
> >   * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
> >   * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
> >   * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
> > + * Copyright (C) 2014 Valeo Corporation
> >   */
> >  
> > +#include <linux/kernel.h>
> >  #include <linux/completion.h>
> >  #include <linux/module.h>
> >  #include <linux/netdevice.h>
> > @@ -21,6 +24,18 @@
> >  #include <linux/can/dev.h>
> >  #include <linux/can/error.h>
> >  
> > +#define MAX(a, b) ((a) > (b) ? (a) : (b))
> 
> There is a max(a, b) macro in <linux/kernel.h>.
> 

Quite true, but it unfortunately fails when the symbol is
used in array size declaration as in below:

        struct kvaser_usb {
                ...
                struct kvaser_usb_net_priv *nets[MAX_NET_DEVICES];
                ...
        }

        include/linux/kernel.h:713:19: error: braced-group within
	expression allowed only inside a function
        #define max(x, y) ({    \
                   ^

> > +
> > +/*
> > + * Kvaser USB CAN dongles are divided into two major families:
> > + * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
> > + * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
> > + */
> > +enum kvaser_usb_family {
> > +	KVASER_LEAF,
> > +	KVASER_USBCAN,
> > +};
> > +
> >  #define MAX_TX_URBS			16
> >  #define MAX_RX_URBS			4
> >  #define START_TIMEOUT			1000 /* msecs */
> > @@ -29,9 +44,12 @@
> >  #define USB_RECV_TIMEOUT		1000 /* msecs */
> >  #define RX_BUFFER_SIZE			3072
> >  #define CAN_USB_CLOCK			8000000
> > -#define MAX_NET_DEVICES			3
> > +#define LEAF_MAX_NET_DEVICES		3
> > +#define USBCAN_MAX_NET_DEVICES		2
> > +#define MAX_NET_DEVICES			MAX(LEAF_MAX_NET_DEVICES, \
> > +					    USBCAN_MAX_NET_DEVICES)
> >  
> > -/* Kvaser USB devices */
> > +/* Leaf USB devices */
> >  #define KVASER_VENDOR_ID		0x0bfd
> >  #define USB_LEAF_DEVEL_PRODUCT_ID	10
> >  #define USB_LEAF_LITE_PRODUCT_ID	11
> > @@ -55,6 +73,16 @@
> >  #define USB_CAN_R_PRODUCT_ID		39
> >  #define USB_LEAF_LITE_V2_PRODUCT_ID	288
> >  #define USB_MINI_PCIE_HS_PRODUCT_ID	289
> > +#define LEAF_PRODUCT_ID(id)		(id >= USB_LEAF_DEVEL_PRODUCT_ID && \
> > +					 id <= USB_MINI_PCIE_HS_PRODUCT_ID)
> > +
> > +/* USBCANII devices */
> > +#define USB_USBCAN_REVB_PRODUCT_ID	2
> > +#define USB_VCI2_PRODUCT_ID		3
> > +#define USB_USBCAN2_PRODUCT_ID		4
> > +#define USB_MEMORATOR_PRODUCT_ID	5
> > +#define USBCAN_PRODUCT_ID(id)		(id >= USB_USBCAN_REVB_PRODUCT_ID && \
> > +					 id <= USB_MEMORATOR_PRODUCT_ID)
> >  
> >  /* USB devices features */
> >  #define KVASER_HAS_SILENT_MODE		BIT(0)
> > @@ -73,7 +101,7 @@
> >  #define MSG_FLAG_TX_ACK			BIT(6)
> >  #define MSG_FLAG_TX_REQUEST		BIT(7)
> >  
> > -/* Can states */
> > +/* Can states (M16C CxSTRH register) */
> >  #define M16C_STATE_BUS_RESET		BIT(0)
> >  #define M16C_STATE_BUS_ERROR		BIT(4)
> >  #define M16C_STATE_BUS_PASSIVE		BIT(5)
> > @@ -98,7 +126,13 @@
> >  #define CMD_START_CHIP_REPLY		27
> >  #define CMD_STOP_CHIP			28
> >  #define CMD_STOP_CHIP_REPLY		29
> > -#define CMD_GET_CARD_INFO2		32
> > +#define CMD_READ_CLOCK			30
> > +#define CMD_READ_CLOCK_REPLY		31
> > +
> > +#define LEAF_CMD_GET_CARD_INFO2		32
> > +#define USBCAN_CMD_RESET_CLOCK		32
> > +#define USBCAN_CMD_CLOCK_OVERFLOW_EVENT	33
> > +
> >  #define CMD_GET_CARD_INFO		34
> >  #define CMD_GET_CARD_INFO_REPLY		35
> >  #define CMD_GET_SOFTWARE_INFO		38
> > @@ -108,8 +142,9 @@
> >  #define CMD_RESET_ERROR_COUNTER		49
> >  #define CMD_TX_ACKNOWLEDGE		50
> >  #define CMD_CAN_ERROR_EVENT		51
> > -#define CMD_USB_THROTTLE		77
> > -#define CMD_LOG_MESSAGE			106
> > +
> > +#define LEAF_CMD_USB_THROTTLE		77
> > +#define LEAF_CMD_LOG_MESSAGE		106
> >  
> >  /* error factors */
> >  #define M16C_EF_ACKE			BIT(0)
> > @@ -121,6 +156,13 @@
> >  #define M16C_EF_RCVE			BIT(6)
> >  #define M16C_EF_TRE			BIT(7)
> >  
> > +/* Only Leaf-based devices can report M16C error factors,
> > + * thus define our own error status flags for USBCAN */
> > +#define USBCAN_ERROR_STATE_NONE		0
> > +#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
> > +#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
> > +#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
> > +
> >  /* bittiming parameters */
> >  #define KVASER_USB_TSEG1_MIN		1
> >  #define KVASER_USB_TSEG1_MAX		16
> > @@ -137,7 +179,7 @@
> >  #define KVASER_CTRL_MODE_SELFRECEPTION	3
> >  #define KVASER_CTRL_MODE_OFF		4
> >  
> > -/* log message */
> > +/* Extended CAN identifier flag */
> >  #define KVASER_EXTENDED_FRAME		BIT(31)
> >  
> >  struct kvaser_msg_simple {
> > @@ -148,30 +190,55 @@ struct kvaser_msg_simple {
> >  struct kvaser_msg_cardinfo {
> >  	u8 tid;
> >  	u8 nchannels;
> > -	__le32 serial_number;
> > -	__le32 padding;
> > +	union {
> > +		struct {
> > +			__le32 serial_number;
> > +			__le32 padding;
> > +		} __packed leaf0;
> > +		struct {
> > +			__le32 serial_number_low;
> > +			__le32 serial_number_high;
> > +		} __packed usbcan0;
> > +	} __packed;
> >  	__le32 clock_resolution;
> >  	__le32 mfgdate;
> >  	u8 ean[8];
> >  	u8 hw_revision;
> > -	u8 usb_hs_mode;
> > -	__le16 padding2;
> > +	union {
> > +		struct {
> > +			u8 usb_hs_mode;
> > +		} __packed leaf1;
> > +		struct {
> > +			u8 padding;
> > +		} __packed usbcan1;
> > +	} __packed;
> > +	__le16 padding;
> >  } __packed;
> >  
> >  struct kvaser_msg_cardinfo2 {
> >  	u8 tid;
> > -	u8 channel;
> > +	u8 reserved;
> >  	u8 pcb_id[24];
> >  	__le32 oem_unlock_code;
> >  } __packed;
> >  
> > -struct kvaser_msg_softinfo {
> > +struct leaf_msg_softinfo {
> >  	u8 tid;
> > -	u8 channel;
> > +	u8 padding0;
> >  	__le32 sw_options;
> >  	__le32 fw_version;
> >  	__le16 max_outstanding_tx;
> > -	__le16 padding[9];
> > +	__le16 padding1[9];
> > +} __packed;
> > +
> > +struct usbcan_msg_softinfo {
> > +	u8 tid;
> > +	u8 fw_name[5];
> > +	__le16 max_outstanding_tx;
> > +	u8 padding[6];
> > +	__le32 fw_version;
> > +	__le16 checksum;
> > +	__le16 sw_options;
> >  } __packed;
> >  
> >  struct kvaser_msg_busparams {
> > @@ -188,36 +255,86 @@ struct kvaser_msg_tx_can {
> >  	u8 channel;
> >  	u8 tid;
> >  	u8 msg[14];
> > -	u8 padding;
> > -	u8 flags;
> > +	union {
> > +		struct {
> > +			u8 padding;
> > +			u8 flags;
> > +		} __packed leaf;
> > +		struct {
> > +			u8 flags;
> > +			u8 padding;
> > +		} __packed usbcan;
> > +	} __packed;
> > +} __packed;
> > +
> > +struct kvaser_msg_rx_can_header {
> > +	u8 channel;
> > +	u8 flag;
> >  } __packed;
> >  
> > -struct kvaser_msg_rx_can {
> > +struct leaf_msg_rx_can {
> >  	u8 channel;
> >  	u8 flag;
> > +
> >  	__le16 time[3];
> >  	u8 msg[14];
> >  } __packed;
> >  
> > -struct kvaser_msg_chip_state_event {
> > +struct usbcan_msg_rx_can {
> > +	u8 channel;
> > +	u8 flag;
> > +
> > +	u8 msg[14];
> > +	__le16 time;
> > +} __packed;
> > +
> > +struct leaf_msg_chip_state_event {
> >  	u8 tid;
> >  	u8 channel;
> > +
> >  	__le16 time[3];
> >  	u8 tx_errors_count;
> >  	u8 rx_errors_count;
> > +
> >  	u8 status;
> >  	u8 padding[3];
> >  } __packed;
> >  
> > -struct kvaser_msg_tx_acknowledge {
> > +struct usbcan_msg_chip_state_event {
> > +	u8 tid;
> > +	u8 channel;
> > +
> > +	u8 tx_errors_count;
> > +	u8 rx_errors_count;
> > +	__le16 time;
> > +
> > +	u8 status;
> > +	u8 padding[3];
> > +} __packed;
> > +
> > +struct kvaser_msg_tx_acknowledge_header {
> > +	u8 channel;
> > +	u8 tid;
> > +};
> > +
> > +struct leaf_msg_tx_acknowledge {
> >  	u8 channel;
> >  	u8 tid;
> > +
> >  	__le16 time[3];
> >  	u8 flags;
> >  	u8 time_offset;
> >  } __packed;
> >  
> > -struct kvaser_msg_error_event {
> > +struct usbcan_msg_tx_acknowledge {
> > +	u8 channel;
> > +	u8 tid;
> > +
> > +	__le16 time;
> > +	__le16 padding;
> > +} __packed;
> > +
> > +struct leaf_msg_error_event {
> >  	u8 tid;
> >  	u8 flags;
> >  	__le16 time[3];
> > @@ -229,6 +346,18 @@ struct kvaser_msg_error_event {
> >  	u8 error_factor;
> >  } __packed;
> >  
> > +struct usbcan_msg_error_event {
> > +	u8 tid;
> > +	u8 padding;
> > +	u8 tx_errors_count_ch0;
> > +	u8 rx_errors_count_ch0;
> > +	u8 tx_errors_count_ch1;
> > +	u8 rx_errors_count_ch1;
> > +	u8 status_ch0;
> > +	u8 status_ch1;
> > +	__le16 time;
> > +} __packed;
> > +
> >  struct kvaser_msg_ctrl_mode {
> >  	u8 tid;
> >  	u8 channel;
> > @@ -243,7 +372,7 @@ struct kvaser_msg_flush_queue {
> >  	u8 padding[3];
> >  } __packed;
> >  
> > -struct kvaser_msg_log_message {
> > +struct leaf_msg_log_message {
> >  	u8 channel;
> >  	u8 flags;
> >  	__le16 time[3];
> > @@ -260,19 +389,49 @@ struct kvaser_msg {
> >  		struct kvaser_msg_simple simple;
> >  		struct kvaser_msg_cardinfo cardinfo;
> >  		struct kvaser_msg_cardinfo2 cardinfo2;
> > -		struct kvaser_msg_softinfo softinfo;
> >  		struct kvaser_msg_busparams busparams;
> > +
> > +		struct kvaser_msg_rx_can_header rx_can_header;
> > +		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
> > +
> > +		union {
> > +			struct leaf_msg_softinfo softinfo;
> > +			struct leaf_msg_rx_can rx_can;
> > +			struct leaf_msg_chip_state_event chip_state_event;
> > +			struct leaf_msg_tx_acknowledge tx_acknowledge;
> > +			struct leaf_msg_error_event error_event;
> > +			struct leaf_msg_log_message log_message;
> > +		} __packed leaf;
> > +
> > +		union {
> > +			struct usbcan_msg_softinfo softinfo;
> > +			struct usbcan_msg_rx_can rx_can;
> > +			struct usbcan_msg_chip_state_event chip_state_event;
> > +			struct usbcan_msg_tx_acknowledge tx_acknowledge;
> > +			struct usbcan_msg_error_event error_event;
> > +		} __packed usbcan;
> > +
> >  		struct kvaser_msg_tx_can tx_can;
> > -		struct kvaser_msg_rx_can rx_can;
> > -		struct kvaser_msg_chip_state_event chip_state_event;
> > -		struct kvaser_msg_tx_acknowledge tx_acknowledge;
> > -		struct kvaser_msg_error_event error_event;
> >  		struct kvaser_msg_ctrl_mode ctrl_mode;
> >  		struct kvaser_msg_flush_queue flush_queue;
> > -		struct kvaser_msg_log_message log_message;
> >  	} u;
> >  } __packed;
> >  
> > +/* Leaf/USBCAN-agnostic summary of an error event.
> > + * No M16C error factors for USBCAN-based devices. */
> > +struct kvaser_error_summary {
> > +	u8 channel, status, txerr, rxerr;
> > +	union {
> > +		struct {
> > +			u8 error_factor;
> > +		} leaf;
> > +		struct {
> > +			u8 other_ch_status;
> > +			u8 error_state;
> > +		} usbcan;
> > +	};
> > +};
> > +
> >  struct kvaser_usb_tx_urb_context {
> >  	struct kvaser_usb_net_priv *priv;
> >  	u32 echo_index;
> > @@ -288,6 +447,8 @@ struct kvaser_usb {
> >  
> >  	u32 fw_version;
> >  	unsigned int nchannels;
> > +	enum kvaser_usb_family family;
> > +	unsigned int max_channels;
> >  
> >  	bool rxinitdone;
> >  	void *rxbuf[MAX_RX_URBS];
> > @@ -311,6 +472,7 @@ struct kvaser_usb_net_priv {
> >  };
> >  
> >  static const struct usb_device_id kvaser_usb_table[] = {
> > +	/* Leaf family IDs */
> >  	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
> >  	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
> >  	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
> > @@ -360,6 +522,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
> >  		.driver_info = KVASER_HAS_TXRX_ERRORS },
> >  	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
> >  	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
> > +
> > +	/* USBCANII family IDs */
> > +	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
> > +		.driver_info = KVASER_HAS_TXRX_ERRORS },
> > +	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
> > +		.driver_info = KVASER_HAS_TXRX_ERRORS },
> > +	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
> > +		.driver_info = KVASER_HAS_TXRX_ERRORS },
> > +	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
> > +		.driver_info = KVASER_HAS_TXRX_ERRORS },
> > +
> >  	{ }
> >  };
> >  MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
> > @@ -463,7 +636,18 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
> >  	if (err)
> >  		return err;
> >  
> > -	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
> > +		break;
> > +	case KVASER_USBCAN:
> > +		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> > +		return -EINVAL;
> > +	}
> >  
> >  	return 0;
> >  }
> > @@ -482,7 +666,7 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
> >  		return err;
> >  
> >  	dev->nchannels = msg.u.cardinfo.nchannels;
> > -	if (dev->nchannels > MAX_NET_DEVICES)
> > +	if (dev->nchannels > dev->max_channels)
> >  		return -EINVAL;
> 
> IMHO, you can keep MAX_NET_DEVICES here.
> 

The UsbcanII firmware hardcodes a maximum of 2 channels in
its protocol. This is unfortunately due to its inability to
tell whether an error event is from CAN channel 0 or ch 1,
and also due to its error_event format:

struct usbcan_msg_error_event {
	u8 tid;
	u8 padding;
	u8 tx_errors_count_ch0;
	u8 rx_errors_count_ch0;
	u8 tx_errors_count_ch1;
	u8 rx_errors_count_ch1;
	u8 status_ch0;
	u8 status_ch1;
	__le16 time;
} __packed;

But since we have MAX_NET_DEVICES = 3, and given the above,
if the UsbcanII firmware reported to us having more than 2
channels, then it's:

a) most probably a memory corruption bug either in the firmware
   or in the driver
b) an updated device/firmware we cannot support yet, since
   we cannot arbitrate the origin of error events quite correctly
   (especially in the case of CAN_ERR_BUSERROR, where the error
   counters stays the same and we have to resort to other hacks.
   Kindly check usbcan_report_error_if_applicable().)

So allowing more than 2 channels given the current set of
affairs will really induce correctness problems :-(

> >  
> >  	return 0;
> > @@ -496,8 +680,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
> >  	struct kvaser_usb_net_priv *priv;
> >  	struct sk_buff *skb;
> >  	struct can_frame *cf;
> > -	u8 channel = msg->u.tx_acknowledge.channel;
> > -	u8 tid = msg->u.tx_acknowledge.tid;
> > +	u8 channel, tid;
> > +
> > +	channel = msg->u.tx_acknowledge_header.channel;
> > +	tid = msg->u.tx_acknowledge_header.tid;
> >  
> >  	if (channel >= dev->nchannels) {
> >  		dev_err(dev->udev->dev.parent,
> > @@ -615,37 +801,83 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
> >  		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
> >  }
> >  
> > -static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> > -				const struct kvaser_msg *msg)
> > +static void kvaser_report_error_event(const struct kvaser_usb *dev,
> > +				      struct kvaser_error_summary *es);
> > +
> > +/*
> > + * Report error to userspace iff the controller's errors counter has
> > + * increased, or we're the only channel seeing the bus error state.
> > + *
> > + * As reported by USBCAN sheets, "the CAN controller has difficulties
> > + * to tell whether an error frame arrived on channel 1 or on channel 2."
> > + * Thus, error counters are compared with their earlier values to
> > + * determine which channel was responsible for the error event.
> > + */
> > +static void usbcan_report_error_if_applicable(const struct kvaser_usb *dev,
> > +					      struct kvaser_error_summary *es)
> >  {
> > -	struct can_frame *cf;
> > -	struct sk_buff *skb;
> > -	struct net_device_stats *stats;
> >  	struct kvaser_usb_net_priv *priv;
> > -	unsigned int new_state;
> > -	u8 channel, status, txerr, rxerr, error_factor;
> > +	int old_tx_err_count, old_rx_err_count, channel, report_error;
> > +
> > +	channel = es->channel;
> > +	if (channel >= dev->nchannels) {
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid channel number (%d)\n", channel);
> > +		return;
> > +	}
> > +
> > +	priv = dev->nets[channel];
> > +	old_tx_err_count = priv->bec.txerr;
> > +	old_rx_err_count = priv->bec.rxerr;
> > +
> > +	report_error = 0;
> > +	if (es->txerr > old_tx_err_count) {
> > +		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
> > +		report_error = 1;
> > +	}
> > +	if (es->rxerr > old_rx_err_count) {
> > +		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
> > +		report_error = 1;
> > +	}
> > +	if ((es->status & M16C_STATE_BUS_ERROR) &&
> > +	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
> > +		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
> > +		report_error = 1;
> > +	}
> > +
> > +	if (report_error)
> > +		kvaser_report_error_event(dev, es);
> > +}
> > +
> > +/*
> > + * Extract error summary from a Leaf-based device error message
> > + */
> > +static void leaf_extract_error_from_msg(const struct kvaser_usb *dev,
> > +					const struct kvaser_msg *msg)
> > +{
> > +	struct kvaser_error_summary es = { 0, };
> >  
> >  	switch (msg->id) {
> >  	case CMD_CAN_ERROR_EVENT:
> > -		channel = msg->u.error_event.channel;
> > -		status =  msg->u.error_event.status;
> > -		txerr = msg->u.error_event.tx_errors_count;
> > -		rxerr = msg->u.error_event.rx_errors_count;
> > -		error_factor = msg->u.error_event.error_factor;
> > +		es.channel = msg->u.leaf.error_event.channel;
> > +		es.status =  msg->u.leaf.error_event.status;
> > +		es.txerr = msg->u.leaf.error_event.tx_errors_count;
> > +		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
> > +		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
> >  		break;
> > -	case CMD_LOG_MESSAGE:
> > -		channel = msg->u.log_message.channel;
> > -		status = msg->u.log_message.data[0];
> > -		txerr = msg->u.log_message.data[2];
> > -		rxerr = msg->u.log_message.data[3];
> > -		error_factor = msg->u.log_message.data[1];
> > +	case LEAF_CMD_LOG_MESSAGE:
> > +		es.channel = msg->u.leaf.log_message.channel;
> > +		es.status = msg->u.leaf.log_message.data[0];
> > +		es.txerr = msg->u.leaf.log_message.data[2];
> > +		es.rxerr = msg->u.leaf.log_message.data[3];
> > +		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
> >  		break;
> >  	case CMD_CHIP_STATE_EVENT:
> > -		channel = msg->u.chip_state_event.channel;
> > -		status =  msg->u.chip_state_event.status;
> > -		txerr = msg->u.chip_state_event.tx_errors_count;
> > -		rxerr = msg->u.chip_state_event.rx_errors_count;
> > -		error_factor = 0;
> > +		es.channel = msg->u.leaf.chip_state_event.channel;
> > +		es.status =  msg->u.leaf.chip_state_event.status;
> > +		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
> > +		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
> > +		es.leaf.error_factor = 0;
> >  		break;
> >  	default:
> >  		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
> > @@ -653,16 +885,92 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  		return;
> >  	}
> >  
> > -	if (channel >= dev->nchannels) {
> > +	kvaser_report_error_event(dev, &es);
> > +}
> > +
> > +/*
> > + * Extract summary from a USBCANII-based device error message.
> > + */
> > +static void usbcan_extract_error_from_msg(const struct kvaser_usb *dev,
> > +					const struct kvaser_msg *msg)
> > +{
> > +	struct kvaser_error_summary es = { 0, };
> > +
> > +	switch (msg->id) {
> > +
> > +	/* Sometimes errors are sent as unsolicited chip state events */
> > +	case CMD_CHIP_STATE_EVENT:
> > +		es.channel = msg->u.usbcan.chip_state_event.channel;
> > +		es.status =  msg->u.usbcan.chip_state_event.status;
> > +		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
> > +		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
> > +		usbcan_report_error_if_applicable(dev, &es);
> > +		break;
> > +
> > +	case CMD_CAN_ERROR_EVENT:
> > +		es.channel = 0;
> > +		es.status = msg->u.usbcan.error_event.status_ch0;
> > +		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
> > +		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
> > +		es.usbcan.other_ch_status =
> > +			msg->u.usbcan.error_event.status_ch1;
> > +		usbcan_report_error_if_applicable(dev, &es);
> > +
> > +		/* For error events, the USBCAN firmware does not support
> > +		 * more than 2 channels: ch0, and ch1. */
> > +		if (dev->nchannels > 1) {
> > +			es.channel = 1;
> > +			es.status = msg->u.usbcan.error_event.status_ch1;
> > +			es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch1;
> > +			es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch1;
> > +			es.usbcan.other_ch_status =
> > +				msg->u.usbcan.error_event.status_ch0;
> > +			usbcan_report_error_if_applicable(dev, &es);
> > +		}
> > +		break;
> > +
> > +	default:
> > +		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
> > +			msg->id);
> > +	}
> > +}
> > +
> > +static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> > +				const struct kvaser_msg *msg)
> > +{
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		leaf_extract_error_from_msg(dev, msg);
> > +		break;
> > +	case KVASER_USBCAN:
> > +		usbcan_extract_error_from_msg(dev, msg);
> > +		break;
> > +	default:
> >  		dev_err(dev->udev->dev.parent,
> > -			"Invalid channel number (%d)\n", channel);
> > +			"Invalid device family (%d)\n", dev->family);
> >  		return;
> >  	}
> > +}
> >  
> > -	priv = dev->nets[channel];
> > +static void kvaser_report_error_event(const struct kvaser_usb *dev,
> > +				      struct kvaser_error_summary *es)
> > +{
> > +	struct can_frame *cf;
> > +	struct sk_buff *skb;
> > +	struct net_device_stats *stats;
> > +	struct kvaser_usb_net_priv *priv;
> > +	unsigned int new_state;
> > +
> > +	if (es->channel >= dev->nchannels) {
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid channel number (%d)\n", es->channel);
> > +		return;
> > +	}
> > +
> > +	priv = dev->nets[es->channel];
> >  	stats = &priv->netdev->stats;
> >  
> > -	if (status & M16C_STATE_BUS_RESET) {
> > +	if (es->status & M16C_STATE_BUS_RESET) {
> >  		kvaser_usb_unlink_tx_urbs(priv);
> >  		return;
> >  	}
> > @@ -675,9 +983,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  
> >  	new_state = priv->can.state;
> >  
> > -	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
> > +	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
> >  
> > -	if (status & M16C_STATE_BUS_OFF) {
> > +	if (es->status & M16C_STATE_BUS_OFF) {
> >  		cf->can_id |= CAN_ERR_BUSOFF;
> >  
> >  		priv->can.can_stats.bus_off++;
> > @@ -687,12 +995,12 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  		netif_carrier_off(priv->netdev);
> >  
> >  		new_state = CAN_STATE_BUS_OFF;
> > -	} else if (status & M16C_STATE_BUS_PASSIVE) {
> > +	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
> >  		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
> >  			cf->can_id |= CAN_ERR_CRTL;
> >  
> > -			if (txerr || rxerr)
> > -				cf->data[1] = (txerr > rxerr)
> > +			if (es->txerr || es->rxerr)
> > +				cf->data[1] = (es->txerr > es->rxerr)
> >  						? CAN_ERR_CRTL_TX_PASSIVE
> >  						: CAN_ERR_CRTL_RX_PASSIVE;
> >  			else
> > @@ -703,13 +1011,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  		}
> >  
> >  		new_state = CAN_STATE_ERROR_PASSIVE;
> > -	}
> > -
> > -	if (status == M16C_STATE_BUS_ERROR) {
> > +	} else if (es->status & M16C_STATE_BUS_ERROR) {
> >  		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
> > -		    ((txerr >= 96) || (rxerr >= 96))) {
> > +		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
> >  			cf->can_id |= CAN_ERR_CRTL;
> > -			cf->data[1] = (txerr > rxerr)
> > +			cf->data[1] = (es->txerr > es->rxerr)
> >  					? CAN_ERR_CRTL_TX_WARNING
> >  					: CAN_ERR_CRTL_RX_WARNING;
> >  
> > @@ -723,7 +1029,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  		}
> >  	}
> >  
> > -	if (!status) {
> > +	if (!es->status) {
> >  		cf->can_id |= CAN_ERR_PROT;
> >  		cf->data[2] = CAN_ERR_PROT_ACTIVE;
> >  
> > @@ -739,34 +1045,52 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  		priv->can.can_stats.restarts++;
> >  	}
> >  
> > -	if (error_factor) {
> > -		priv->can.can_stats.bus_error++;
> > -		stats->rx_errors++;
> > -
> > -		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
> > -
> > -		if (error_factor & M16C_EF_ACKE)
> > -			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
> > -		if (error_factor & M16C_EF_CRCE)
> > -			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
> > -					CAN_ERR_PROT_LOC_CRC_DEL);
> > -		if (error_factor & M16C_EF_FORME)
> > -			cf->data[2] |= CAN_ERR_PROT_FORM;
> > -		if (error_factor & M16C_EF_STFE)
> > -			cf->data[2] |= CAN_ERR_PROT_STUFF;
> > -		if (error_factor & M16C_EF_BITE0)
> > -			cf->data[2] |= CAN_ERR_PROT_BIT0;
> > -		if (error_factor & M16C_EF_BITE1)
> > -			cf->data[2] |= CAN_ERR_PROT_BIT1;
> > -		if (error_factor & M16C_EF_TRE)
> > -			cf->data[2] |= CAN_ERR_PROT_TX;
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		if (es->leaf.error_factor) {
> > +			priv->can.can_stats.bus_error++;
> > +			stats->rx_errors++;
> > +
> > +			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
> > +
> > +			if (es->leaf.error_factor & M16C_EF_ACKE)
> > +				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
> > +			if (es->leaf.error_factor & M16C_EF_CRCE)
> > +				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
> > +						CAN_ERR_PROT_LOC_CRC_DEL);
> > +			if (es->leaf.error_factor & M16C_EF_FORME)
> > +				cf->data[2] |= CAN_ERR_PROT_FORM;
> > +			if (es->leaf.error_factor & M16C_EF_STFE)
> > +				cf->data[2] |= CAN_ERR_PROT_STUFF;
> > +			if (es->leaf.error_factor & M16C_EF_BITE0)
> > +				cf->data[2] |= CAN_ERR_PROT_BIT0;
> > +			if (es->leaf.error_factor & M16C_EF_BITE1)
> > +				cf->data[2] |= CAN_ERR_PROT_BIT1;
> > +			if (es->leaf.error_factor & M16C_EF_TRE)
> > +				cf->data[2] |= CAN_ERR_PROT_TX;
> > +		}
> > +		break;
> > +	case KVASER_USBCAN:
> > +		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
> > +			stats->tx_errors++;
> > +		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
> > +			stats->rx_errors++;
> > +		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
> > +			priv->can.can_stats.bus_error++;
> > +			cf->can_id |= CAN_ERR_BUSERROR;
> > +		}
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> > +		goto err;
> >  	}
> >  
> > -	cf->data[6] = txerr;
> > -	cf->data[7] = rxerr;
> > +	cf->data[6] = es->txerr;
> > +	cf->data[7] = es->rxerr;
> >  
> > -	priv->bec.txerr = txerr;
> > -	priv->bec.rxerr = rxerr;
> > +	priv->bec.txerr = es->txerr;
> > +	priv->bec.rxerr = es->rxerr;
> >  
> >  	priv->can.state = new_state;
> >  
> > @@ -774,6 +1098,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  
> >  	stats->rx_packets++;
> >  	stats->rx_bytes += cf->can_dlc;
> > +
> > +	return;
> > +
> > +err:
> > +	dev_kfree_skb(skb);
> >  }
> >  
> >  static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
> > @@ -783,16 +1112,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
> >  	struct sk_buff *skb;
> >  	struct net_device_stats *stats = &priv->netdev->stats;
> >  
> > -	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
> > +	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
> >  					 MSG_FLAG_NERR)) {
> >  		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
> > -			   msg->u.rx_can.flag);
> > +			   msg->u.rx_can_header.flag);
> >  
> >  		stats->rx_errors++;
> >  		return;
> >  	}
> >  
> > -	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
> > +	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
> >  		skb = alloc_can_err_skb(priv->netdev, &cf);
> >  		if (!skb) {
> >  			stats->rx_dropped++;
> > @@ -819,7 +1148,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
> >  	struct can_frame *cf;
> >  	struct sk_buff *skb;
> >  	struct net_device_stats *stats;
> > -	u8 channel = msg->u.rx_can.channel;
> > +	u8 channel = msg->u.rx_can_header.channel;
> > +	const u8 *rx_msg;
> >  
> >  	if (channel >= dev->nchannels) {
> >  		dev_err(dev->udev->dev.parent,
> > @@ -830,19 +1160,32 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
> >  	priv = dev->nets[channel];
> >  	stats = &priv->netdev->stats;
> >  
> > -	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
> > -	    (msg->id == CMD_LOG_MESSAGE)) {
> > +	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
> > +	    (dev->family == KVASER_LEAF && msg->id == LEAF_CMD_LOG_MESSAGE)) {
> >  		kvaser_usb_rx_error(dev, msg);
> >  		return;
> > -	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
> > -					 MSG_FLAG_NERR |
> > -					 MSG_FLAG_OVERRUN)) {
> > +	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
> > +						MSG_FLAG_NERR |
> > +						MSG_FLAG_OVERRUN)) {
> >  		kvaser_usb_rx_can_err(priv, msg);
> >  		return;
> > -	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
> > +	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
> >  		netdev_warn(priv->netdev,
> >  			    "Unhandled frame (flags: 0x%02x)",
> > -			    msg->u.rx_can.flag);
> > +			    msg->u.rx_can_header.flag);
> > +		return;
> > +	}
> > +
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		rx_msg = msg->u.leaf.rx_can.msg;
> > +		break;
> > +	case KVASER_USBCAN:
> > +		rx_msg = msg->u.usbcan.rx_can.msg;
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> >  		return;
> >  	}
> >  
> > @@ -852,38 +1195,37 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
> >  		return;
> >  	}
> >  
> > -	if (msg->id == CMD_LOG_MESSAGE) {
> > -		cf->can_id = le32_to_cpu(msg->u.log_message.id);
> > +	if (dev->family == KVASER_LEAF && msg->id == LEAF_CMD_LOG_MESSAGE) {
> > +		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
> >  		if (cf->can_id & KVASER_EXTENDED_FRAME)
> >  			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
> >  		else
> >  			cf->can_id &= CAN_SFF_MASK;
> >  
> > -		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
> > +		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
> >  
> > -		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
> > +		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
> >  			cf->can_id |= CAN_RTR_FLAG;
> >  		else
> > -			memcpy(cf->data, &msg->u.log_message.data,
> > +			memcpy(cf->data, &msg->u.leaf.log_message.data,
> >  			       cf->can_dlc);
> >  	} else {
> > -		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
> > -			     (msg->u.rx_can.msg[1] & 0x3f);
> > +		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
> >  
> >  		if (msg->id == CMD_RX_EXT_MESSAGE) {
> >  			cf->can_id <<= 18;
> > -			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
> > -				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
> > -				      (msg->u.rx_can.msg[4] & 0x3f);
> > +			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
> > +				      ((rx_msg[3] & 0xff) << 6) |
> > +				      (rx_msg[4] & 0x3f);
> >  			cf->can_id |= CAN_EFF_FLAG;
> >  		}
> >  
> > -		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
> > +		cf->can_dlc = get_can_dlc(rx_msg[5]);
> >  
> > -		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
> > +		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
> >  			cf->can_id |= CAN_RTR_FLAG;
> >  		else
> > -			memcpy(cf->data, &msg->u.rx_can.msg[6],
> > +			memcpy(cf->data, &rx_msg[6],
> >  			       cf->can_dlc);
> >  	}
> >  
> > @@ -947,7 +1289,12 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
> >  
> >  	case CMD_RX_STD_MESSAGE:
> >  	case CMD_RX_EXT_MESSAGE:
> > -	case CMD_LOG_MESSAGE:
> > +		kvaser_usb_rx_can_msg(dev, msg);
> > +		break;
> > +
> > +	case LEAF_CMD_LOG_MESSAGE:
> > +		if (dev->family != KVASER_LEAF)
> > +			goto warn;
> >  		kvaser_usb_rx_can_msg(dev, msg);
> >  		break;
> >  
> > @@ -960,8 +1307,14 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
> >  		kvaser_usb_tx_acknowledge(dev, msg);
> >  		break;
> >  
> > +	/* Ignored messages */
> > +	case USBCAN_CMD_CLOCK_OVERFLOW_EVENT:
> > +		if (dev->family != KVASER_USBCAN)
> > +			goto warn;
> > +		break;
> > +
> >  	default:
> > -		dev_warn(dev->udev->dev.parent,
> > +warn:		dev_warn(dev->udev->dev.parent,
> >  			 "Unhandled message (%d)\n", msg->id);
> >  		break;
> >  	}
> > @@ -1181,7 +1534,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
> >  				  dev->rxbuf[i],
> >  				  dev->rxbuf_dma[i]);
> >  
> > -	for (i = 0; i < MAX_NET_DEVICES; i++) {
> > +	for (i = 0; i < dev->max_channels; i++) {
> 
> here too... or replace it by nchannels.
> 

Yes, indeed. nchannels is the correct choice here, especially since
kvaser_usb_init_one() is called "dev->nchannels" times too.

> >  		struct kvaser_usb_net_priv *priv = dev->nets[i];
> >  
> >  		if (priv)
> > @@ -1286,6 +1639,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  	struct kvaser_msg *msg;
> >  	int i, err;
> >  	int ret = NETDEV_TX_OK;
> > +	uint8_t *msg_tx_can_flags;
> >  	bool kfree_skb_on_error = true;
> >  
> >  	if (can_dropped_invalid_skb(netdev, skb))
> > @@ -1306,9 +1660,23 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  
> >  	msg = buf;
> >  	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
> > -	msg->u.tx_can.flags = 0;
> >  	msg->u.tx_can.channel = priv->channel;
> >  
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
> > +		break;
> > +	case KVASER_USBCAN:
> > +		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> > +		goto releasebuf;
> > +	}
> > +
> > +	*msg_tx_can_flags = 0;
> > +
> >  	if (cf->can_id & CAN_EFF_FLAG) {
> >  		msg->id = CMD_TX_EXT_MESSAGE;
> >  		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
> > @@ -1326,7 +1694,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
> >  
> >  	if (cf->can_id & CAN_RTR_FLAG)
> > -		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
> > +		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
> >  
> >  	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
> >  		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
> > @@ -1596,6 +1964,18 @@ static int kvaser_usb_probe(struct usb_interface *intf,
> >  	if (!dev)
> >  		return -ENOMEM;
> >  
> > +	if (LEAF_PRODUCT_ID(id->idProduct)) {
> > +		dev->family = KVASER_LEAF;
> > +		dev->max_channels = LEAF_MAX_NET_DEVICES;
> > +	} else if (USBCAN_PRODUCT_ID(id->idProduct)) {
> > +		dev->family = KVASER_USBCAN;
> > +		dev->max_channels = USBCAN_MAX_NET_DEVICES;
> > +	} else {
> > +		dev_err(&intf->dev, "Product ID (%d) does not belong to any "
> > +				    "known Kvaser USB family", id->idProduct);
> > +		return -ENODEV;
> > +	}
> > +
> 
> Is it really required to keep max_channels in the kvaser_usb structure?
> If I looked correctly, you use this variable as a replacement for
> MAX_NET_DEVICES in the code and MAX_NET_DEVICES is only used in probe
> and disconnect functions. I think it can even be replaced by nchannels
> in the disconnect path. So I also think that it don't need to be in the
> kvaser_usb structure.
> 

hmmm.. given the current state of error arbitration explained
above, where I cannot accept a dev->nchannels > 2, I guess we
have two options:

a) Remove max_channels, and hardcode the channels count
correctness logic as follows:

        dev->nchannels = msg.u.cardinfo.nchannels;
        if ((dev->family == USBCAN && dev->nchannels > USBCAN_MAX_NET_DEVICES)
            || (dev->family == LEAF && dev->nchannels > LEAF_MAX_NET_DEVICES))
                return -EINVAL

b) Leave max_channels in 'struct kvaser_usb' as is.

I personally prefer the solution at 'b)' but I can do it as
in 'a)' if you prefer :-)

> >  	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
> >  	if (err) {
> >  		dev_err(&intf->dev, "Cannot get usb endpoint(s)");
> > @@ -1608,7 +1988,7 @@ static int kvaser_usb_probe(struct usb_interface *intf,
> >  
> >  	usb_set_intfdata(intf, dev);
> >  
> > -	for (i = 0; i < MAX_NET_DEVICES; i++)
> > +	for (i = 0; i < dev->max_channels; i++)
> >  		kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, i);
> 
> Someone reported me that recent leaf firmwares go in trouble when
> you send a command for a channel that does not exist. Instead of
> max_channels, you can use nchannels here and move the reset command
> in the kvaser_usb_init_one() function.
> I've a patch for this but It is not tested yet. I'll send it next week-end after
> I did some tests.
> 

Great. I guess I can submit a 3-patch series now
(kfree_skb fix + the above fix + driver).

> >  
> >  	err = kvaser_usb_get_software_info(dev);
> 
> Thank you,
> 

Thanks a lot for your review.

P.S. the Gmail mailer you've used messed badly with the patch
code identation; I had to manually restore it back to make the
discussion meaningful for others :-)

Regards,
--
Darwish

^ permalink raw reply	[relevance 89%]

* Re: [PATCH] can: kvaser_usb: Don't free packets when tight on URBs
  @ 2014-12-24 15:52 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2014-12-24 15:52 UTC (permalink / raw)
  To: Olivier Sobrie
  Cc: Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi Olivier,

On Wed, Dec 24, 2014 at 01:31:20PM +0100, Olivier Sobrie wrote:
> Hello Ahmed,
> 
> On Tue, Dec 23, 2014 at 05:46:54PM +0200, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > Flooding the Kvaser CAN to USB dongle with multiple reads and
> > writes in high frequency caused seemingly-random panics in the
> > kernel.
> > 
> > On further inspection, it seems the driver erroneously freed the
> > to-be-transmitted packet upon getting tight on URBs and returning
> > NETDEV_TX_BUSY, leading to invalid memory writes and double frees
> > at a later point in time.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> Thank you for the fix.
> 
> > ---
> >  drivers/net/can/usb/kvaser_usb.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> >  (Generated over 3.19.0-rc1)
> > 
> > diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> > index 541fb7a..34c35d8 100644
> > --- a/drivers/net/can/usb/kvaser_usb.c
> > +++ b/drivers/net/can/usb/kvaser_usb.c
> > @@ -1286,6 +1286,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  	struct kvaser_msg *msg;
> >  	int i, err;
> >  	int ret = NETDEV_TX_OK;
> > +	bool kfree_skb_on_error = true;
> >  
> >  	if (can_dropped_invalid_skb(netdev, skb))
> >  		return NETDEV_TX_OK;
> > @@ -1336,6 +1337,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  
> >  	if (!context) {
> >  		netdev_warn(netdev, "cannot find free context\n");
> > +		kfree_skb_on_error = false;
> 
> Instead of using an extra variable, you can also set skb to NULL here.
> Or maybe better, you can move the dev_kfree_skb() in the two previous
> tests (in the check of variables urb and buf).
> 

Nice, I'll move dev_kfree_skb() to the two earlier tests then.

Thanks,

P.S. mailer and patch identation; had to manually fix them
before replying (but thanks for the quick review, ofc ;-))

> Thank you,
> 
> Olivier
> 
> >  		ret =  NETDEV_TX_BUSY;
> >  		goto releasebuf;
> >  	}
> > @@ -1364,8 +1366,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  	if (unlikely(err)) {
> >  		can_free_echo_skb(netdev, context->echo_index);
> >  
> > -		skb = NULL; /* set to NULL to avoid double free in
> > -			     * dev_kfree_skb(skb) */
> > +		kfree_skb_on_error = false;
> >  
> >  		atomic_dec(&priv->active_tx_urbs);
> >  		usb_unanchor_urb(urb);
> > @@ -1389,7 +1390,8 @@ releasebuf:
> >  nobufmem:
> >  	usb_free_urb(urb);
> >  nourbmem:
> > -	dev_kfree_skb(skb);
> > +	if (kfree_skb_on_error)
> > +		dev_kfree_skb(skb);
> >  	return ret;
> >  }
> > 

^ permalink raw reply	[relevance 99%]

* [PATCH v2 1/4] can: kvaser_usb: Don't free packets when tight on URBs
  2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
  2014-12-23 15:53 40% ` [PATCH] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
  @ 2014-12-24 23:56 92% ` Ahmed S. Darwish
  2014-12-24 23:59 86%   ` [PATCH v2 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
                       ` (4 subsequent siblings)
  7 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2014-12-24 23:56 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, Greg KH,
	Linux-stable, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in high frequency caused seemingly-random panics in the
kernel.

On further inspection, it seems the driver erroneously freed the
to-be-transmitted packet upon getting tight on URBs and returning
NETDEV_TX_BUSY, leading to invalid memory writes and double frees
at a later point in time.

Note:

Finding no more URBs/transmit-contexts and returning NETDEV_TX_BUSY
is a driver bug in and out of itself: it means that our start/stop
queue flow control is broken.

This patch only fixes the (buggy) error handling code; the root
cause shall be fixed in a later commit.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

 (Marc, Greg, I believe this should also be added to -stable?)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 541fb7a..6479a2b 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1294,12 +1294,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (!urb) {
 		netdev_err(netdev, "No memory left for URBs\n");
 		stats->tx_dropped++;
-		goto nourbmem;
+		dev_kfree_skb(skb);
+		return NETDEV_TX_OK;
 	}
 
 	buf = kmalloc(sizeof(struct kvaser_msg), GFP_ATOMIC);
 	if (!buf) {
 		stats->tx_dropped++;
+		dev_kfree_skb(skb);
 		goto nobufmem;
 	}
 
@@ -1334,6 +1336,9 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		}
 	}
 
+	/*
+	 * This should never happen; it implies a flow control bug.
+	 */
 	if (!context) {
 		netdev_warn(netdev, "cannot find free context\n");
 		ret =  NETDEV_TX_BUSY;
@@ -1364,9 +1369,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (unlikely(err)) {
 		can_free_echo_skb(netdev, context->echo_index);
 
-		skb = NULL; /* set to NULL to avoid double free in
-			     * dev_kfree_skb(skb) */
-
 		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
@@ -1388,8 +1390,6 @@ releasebuf:
 	kfree(buf);
 nobufmem:
 	usb_free_urb(urb);
-nourbmem:
-	dev_kfree_skb(skb);
 	return ret;
 }
 

^ permalink raw reply related	[relevance 92%]

* [PATCH v2 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close
  2014-12-24 23:56 92% ` [PATCH v2 1/4] " Ahmed S. Darwish
@ 2014-12-24 23:59 86%   ` Ahmed S. Darwish
  2014-12-25  0:02 98%     ` [PATCH v2 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
    1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2014-12-24 23:59 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, Greg KH,
	Linux-stable, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in very high frequency (*), closing the CAN channel while
all the transmissions are on (#), opening the device again (@),
then sending a small number of packets would make the driver
enter an almost infinite loop of:

[....]
[15959.853988] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853990] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853991] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853993] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853994] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853995] kvaser_usb 4-3:1.0 can0: cannot find free context
[....]

_dragging the whole system down_ in the process due to the
excessive logging output.

Initially, this has caused random panics in the kernel due to a
buggy error recovery path.  That got fixed in an earlier commit.(%)
This patch aims at solving the root cause. -->

16 tx URBs and contexts are allocated per CAN channel per USB
device. Such URBs are protected by:

a) A simple atomic counter, up to a value of MAX_TX_URBS (16)
b) A flag in each URB context, stating if it's free
c) The fact that ndo_start_xmit calls are themselves protected
   by the networking layers higher above

After grabbing one of the tx URBs, if the driver noticed that all
of them are now taken, it stops the netif transmission queue.
Such queue is worken up again only if an acknowedgment was received
from the firmware on one of our earlier-sent frames.

Meanwhile, upon channel close (#), the driver sends a CMD_STOP_CHIP
to the firmware, effectively closing all further communication.  In
the high traffic case, the atomic counter remains at MAX_TX_URBS,
and all the URB contexts remain marked as active.  While opening
the channel again (@), it cannot send any further frames since no
more free tx URB contexts are available.

Reset all tx URB contexts upon CAN channel close.

(*) 50 parallel instances of `cangen0 -g 0 -ix`
(#) `ifconfig can0 down`
(@) `ifconfig can0 up`
(%) "can: kvaser_usb: Don't free packets when tight on URBs"

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 3 +++
 1 file changed, 3 insertions(+)

 (Marc, Greg, this also should be added to -stable?)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 6479a2b..598e251 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1246,6 +1246,9 @@ static int kvaser_usb_close(struct net_device *netdev)
 	if (err)
 		netdev_warn(netdev, "Cannot stop device, error %d\n", err);
 
+	/* reset tx contexts */
+	kvaser_usb_unlink_tx_urbs(priv);
+
 	priv->can.state = CAN_STATE_STOPPED;
 	close_candev(priv->netdev);
 

^ permalink raw reply related	[relevance 86%]

* [PATCH v2 4/4] can: kvaser_usb: Add support for the Usbcan-II family
  2014-12-25  0:02 98%     ` [PATCH v2 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
@ 2014-12-25  0:04 39%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2014-12-25  0:04 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Greg KH, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
divided into two major families: 'Leaf', and 'UsbcanII'.  From an
Operating System perspective, the firmware of both families behave
in a not too drastically different fashion.

This patch adds support for the UsbcanII family of devices to the
current Kvaser Leaf-only driver.

CAN frames sending, receiving, and error handling paths has been
tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
should also work nicely with other products in the same category.

List of new devices supported by this driver update:

         - Kvaser USBcan II HS/HS
         - Kvaser USBcan II HS/LS
         - Kvaser USBcan Rugged ("USBcan Rev B")
         - Kvaser Memorator HS/HS
         - Kvaser Memorator HS/LS
         - Scania VCI2 (if you have the Kvaser logo on top)

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/Kconfig      |   8 +-
 drivers/net/can/usb/kvaser_usb.c | 627 +++++++++++++++++++++++++++++++--------
 2 files changed, 510 insertions(+), 125 deletions(-)

** V2 Changelog:
- Update Kconfig entries
- Use actual number of CAN channels (instead of max) where appropriate
- Rebase over a new set of UsbcanII-independent driver fixes

diff --git a/drivers/net/can/usb/Kconfig b/drivers/net/can/usb/Kconfig
index a77db919..f6f5500 100644
--- a/drivers/net/can/usb/Kconfig
+++ b/drivers/net/can/usb/Kconfig
@@ -25,7 +25,7 @@ config CAN_KVASER_USB
 	tristate "Kvaser CAN/USB interface"
 	---help---
 	  This driver adds support for Kvaser CAN/USB devices like Kvaser
-	  Leaf Light.
+	  Leaf Light and Kvaser USBcan II.
 
 	  The driver provides support for the following devices:
 	    - Kvaser Leaf Light
@@ -46,6 +46,12 @@ config CAN_KVASER_USB
 	    - Kvaser USBcan R
 	    - Kvaser Leaf Light v2
 	    - Kvaser Mini PCI Express HS
+	    - Kvaser USBcan II HS/HS
+	    - Kvaser USBcan II HS/LS
+	    - Kvaser USBcan Rugged ("USBcan Rev B")
+	    - Kvaser Memorator HS/HS
+	    - Kvaser Memorator HS/LS
+	    - Scania VCI2 (if you have the Kvaser logo on top)
 
 	  If unsure, say N.
 
diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 2791501..50da317 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -6,10 +6,12 @@
  * Parts of this driver are based on the following:
  *  - Kvaser linux leaf driver (version 4.78)
  *  - CAN driver for esd CAN-USB/2
+ *  - Kvaser linux usbcanII driver (version 5.3)
  *
  * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
  * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
  * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
+ * Copyright (C) 2014 Valeo Corporation
  */
 
 #include <linux/completion.h>
@@ -21,6 +23,18 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
+/*
+ * Kvaser USB CAN dongles are divided into two major families:
+ * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
+ * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
+ */
+enum kvaser_usb_family {
+	KVASER_LEAF,
+	KVASER_USBCAN,
+};
+
+#define MAX(a, b)			((a) > (b) ? (a) : (b))
+
 #define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
@@ -29,9 +43,12 @@
 #define USB_RECV_TIMEOUT		1000 /* msecs */
 #define RX_BUFFER_SIZE			3072
 #define CAN_USB_CLOCK			8000000
-#define MAX_NET_DEVICES			3
+#define LEAF_MAX_NET_DEVICES		3
+#define USBCAN_MAX_NET_DEVICES		2
+#define MAX_NET_DEVICES			MAX(LEAF_MAX_NET_DEVICES, \
+					    USBCAN_MAX_NET_DEVICES)
 
-/* Kvaser USB devices */
+/* Leaf USB devices */
 #define KVASER_VENDOR_ID		0x0bfd
 #define USB_LEAF_DEVEL_PRODUCT_ID	10
 #define USB_LEAF_LITE_PRODUCT_ID	11
@@ -55,6 +72,16 @@
 #define USB_CAN_R_PRODUCT_ID		39
 #define USB_LEAF_LITE_V2_PRODUCT_ID	288
 #define USB_MINI_PCIE_HS_PRODUCT_ID	289
+#define LEAF_PRODUCT_ID(id)		(id >= USB_LEAF_DEVEL_PRODUCT_ID && \
+					 id <= USB_MINI_PCIE_HS_PRODUCT_ID)
+
+/* USBCANII devices */
+#define USB_USBCAN_REVB_PRODUCT_ID	2
+#define USB_VCI2_PRODUCT_ID		3
+#define USB_USBCAN2_PRODUCT_ID		4
+#define USB_MEMORATOR_PRODUCT_ID	5
+#define USBCAN_PRODUCT_ID(id)		(id >= USB_USBCAN_REVB_PRODUCT_ID && \
+					 id <= USB_MEMORATOR_PRODUCT_ID)
 
 /* USB devices features */
 #define KVASER_HAS_SILENT_MODE		BIT(0)
@@ -73,7 +100,7 @@
 #define MSG_FLAG_TX_ACK			BIT(6)
 #define MSG_FLAG_TX_REQUEST		BIT(7)
 
-/* Can states */
+/* Can states (M16C CxSTRH register) */
 #define M16C_STATE_BUS_RESET		BIT(0)
 #define M16C_STATE_BUS_ERROR		BIT(4)
 #define M16C_STATE_BUS_PASSIVE		BIT(5)
@@ -98,7 +125,13 @@
 #define CMD_START_CHIP_REPLY		27
 #define CMD_STOP_CHIP			28
 #define CMD_STOP_CHIP_REPLY		29
-#define CMD_GET_CARD_INFO2		32
+#define CMD_READ_CLOCK			30
+#define CMD_READ_CLOCK_REPLY		31
+
+#define LEAF_CMD_GET_CARD_INFO2		32
+#define USBCAN_CMD_RESET_CLOCK		32
+#define USBCAN_CMD_CLOCK_OVERFLOW_EVENT	33
+
 #define CMD_GET_CARD_INFO		34
 #define CMD_GET_CARD_INFO_REPLY		35
 #define CMD_GET_SOFTWARE_INFO		38
@@ -108,8 +141,9 @@
 #define CMD_RESET_ERROR_COUNTER		49
 #define CMD_TX_ACKNOWLEDGE		50
 #define CMD_CAN_ERROR_EVENT		51
-#define CMD_USB_THROTTLE		77
-#define CMD_LOG_MESSAGE			106
+
+#define LEAF_CMD_USB_THROTTLE		77
+#define LEAF_CMD_LOG_MESSAGE		106
 
 /* error factors */
 #define M16C_EF_ACKE			BIT(0)
@@ -121,6 +155,13 @@
 #define M16C_EF_RCVE			BIT(6)
 #define M16C_EF_TRE			BIT(7)
 
+/* Only Leaf-based devices can report M16C error factors,
+ * thus define our own error status flags for USBCAN */
+#define USBCAN_ERROR_STATE_NONE		0
+#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
+#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
+#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
+
 /* bittiming parameters */
 #define KVASER_USB_TSEG1_MIN		1
 #define KVASER_USB_TSEG1_MAX		16
@@ -137,7 +178,7 @@
 #define KVASER_CTRL_MODE_SELFRECEPTION	3
 #define KVASER_CTRL_MODE_OFF		4
 
-/* log message */
+/* Extended CAN identifier flag */
 #define KVASER_EXTENDED_FRAME		BIT(31)
 
 struct kvaser_msg_simple {
@@ -148,30 +189,55 @@ struct kvaser_msg_simple {
 struct kvaser_msg_cardinfo {
 	u8 tid;
 	u8 nchannels;
-	__le32 serial_number;
-	__le32 padding;
+	union {
+		struct {
+			__le32 serial_number;
+			__le32 padding;
+		} __packed leaf0;
+		struct {
+			__le32 serial_number_low;
+			__le32 serial_number_high;
+		} __packed usbcan0;
+	} __packed;
 	__le32 clock_resolution;
 	__le32 mfgdate;
 	u8 ean[8];
 	u8 hw_revision;
-	u8 usb_hs_mode;
-	__le16 padding2;
+	union {
+		struct {
+			u8 usb_hs_mode;
+		} __packed leaf1;
+		struct {
+			u8 padding;
+		} __packed usbcan1;
+	} __packed;
+	__le16 padding;
 } __packed;
 
 struct kvaser_msg_cardinfo2 {
 	u8 tid;
-	u8 channel;
+	u8 reserved;
 	u8 pcb_id[24];
 	__le32 oem_unlock_code;
 } __packed;
 
-struct kvaser_msg_softinfo {
+struct leaf_msg_softinfo {
 	u8 tid;
-	u8 channel;
+	u8 padding0;
 	__le32 sw_options;
 	__le32 fw_version;
 	__le16 max_outstanding_tx;
-	__le16 padding[9];
+	__le16 padding1[9];
+} __packed;
+
+struct usbcan_msg_softinfo {
+	u8 tid;
+	u8 fw_name[5];
+	__le16 max_outstanding_tx;
+	u8 padding[6];
+	__le32 fw_version;
+	__le16 checksum;
+	__le16 sw_options;
 } __packed;
 
 struct kvaser_msg_busparams {
@@ -188,36 +254,86 @@ struct kvaser_msg_tx_can {
 	u8 channel;
 	u8 tid;
 	u8 msg[14];
-	u8 padding;
-	u8 flags;
+	union {
+		struct {
+			u8 padding;
+			u8 flags;
+		} __packed leaf;
+		struct {
+			u8 flags;
+			u8 padding;
+		} __packed usbcan;
+	} __packed;
+} __packed;
+
+struct kvaser_msg_rx_can_header {
+	u8 channel;
+	u8 flag;
 } __packed;
 
-struct kvaser_msg_rx_can {
+struct leaf_msg_rx_can {
 	u8 channel;
 	u8 flag;
+
 	__le16 time[3];
 	u8 msg[14];
 } __packed;
 
-struct kvaser_msg_chip_state_event {
+struct usbcan_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
+	u8 msg[14];
+	__le16 time;
+} __packed;
+
+struct leaf_msg_chip_state_event {
 	u8 tid;
 	u8 channel;
+
 	__le16 time[3];
 	u8 tx_errors_count;
 	u8 rx_errors_count;
+
 	u8 status;
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_tx_acknowledge {
+struct usbcan_msg_chip_state_event {
+	u8 tid;
+	u8 channel;
+
+	u8 tx_errors_count;
+	u8 rx_errors_count;
+	__le16 time;
+
+	u8 status;
+	u8 padding[3];
+} __packed;
+
+struct kvaser_msg_tx_acknowledge_header {
+	u8 channel;
+	u8 tid;
+};
+
+struct leaf_msg_tx_acknowledge {
 	u8 channel;
 	u8 tid;
+
 	__le16 time[3];
 	u8 flags;
 	u8 time_offset;
 } __packed;
 
-struct kvaser_msg_error_event {
+struct usbcan_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
+	__le16 time;
+	__le16 padding;
+} __packed;
+
+struct leaf_msg_error_event {
 	u8 tid;
 	u8 flags;
 	__le16 time[3];
@@ -229,6 +345,18 @@ struct kvaser_msg_error_event {
 	u8 error_factor;
 } __packed;
 
+struct usbcan_msg_error_event {
+	u8 tid;
+	u8 padding;
+	u8 tx_errors_count_ch0;
+	u8 rx_errors_count_ch0;
+	u8 tx_errors_count_ch1;
+	u8 rx_errors_count_ch1;
+	u8 status_ch0;
+	u8 status_ch1;
+	__le16 time;
+} __packed;
+
 struct kvaser_msg_ctrl_mode {
 	u8 tid;
 	u8 channel;
@@ -243,7 +371,7 @@ struct kvaser_msg_flush_queue {
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_log_message {
+struct leaf_msg_log_message {
 	u8 channel;
 	u8 flags;
 	__le16 time[3];
@@ -260,19 +388,49 @@ struct kvaser_msg {
 		struct kvaser_msg_simple simple;
 		struct kvaser_msg_cardinfo cardinfo;
 		struct kvaser_msg_cardinfo2 cardinfo2;
-		struct kvaser_msg_softinfo softinfo;
 		struct kvaser_msg_busparams busparams;
+
+		struct kvaser_msg_rx_can_header rx_can_header;
+		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
+
+		union {
+			struct leaf_msg_softinfo softinfo;
+			struct leaf_msg_rx_can rx_can;
+			struct leaf_msg_chip_state_event chip_state_event;
+			struct leaf_msg_tx_acknowledge tx_acknowledge;
+			struct leaf_msg_error_event error_event;
+			struct leaf_msg_log_message log_message;
+		} __packed leaf;
+
+		union {
+			struct usbcan_msg_softinfo softinfo;
+			struct usbcan_msg_rx_can rx_can;
+			struct usbcan_msg_chip_state_event chip_state_event;
+			struct usbcan_msg_tx_acknowledge tx_acknowledge;
+			struct usbcan_msg_error_event error_event;
+		} __packed usbcan;
+
 		struct kvaser_msg_tx_can tx_can;
-		struct kvaser_msg_rx_can rx_can;
-		struct kvaser_msg_chip_state_event chip_state_event;
-		struct kvaser_msg_tx_acknowledge tx_acknowledge;
-		struct kvaser_msg_error_event error_event;
 		struct kvaser_msg_ctrl_mode ctrl_mode;
 		struct kvaser_msg_flush_queue flush_queue;
-		struct kvaser_msg_log_message log_message;
 	} u;
 } __packed;
 
+/* Leaf/USBCAN-agnostic summary of an error event.
+ * No M16C error factors for USBCAN-based devices. */
+struct kvaser_error_summary {
+	u8 channel, status, txerr, rxerr;
+	union {
+		struct {
+			u8 error_factor;
+		} leaf;
+		struct {
+			u8 other_ch_status;
+			u8 error_state;
+		} usbcan;
+	};
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -288,6 +446,8 @@ struct kvaser_usb {
 
 	u32 fw_version;
 	unsigned int nchannels;
+	enum kvaser_usb_family family;
+	unsigned int max_channels;
 
 	bool rxinitdone;
 	void *rxbuf[MAX_RX_URBS];
@@ -311,6 +471,7 @@ struct kvaser_usb_net_priv {
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
+	/* Leaf family IDs */
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
@@ -360,6 +521,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
 		.driver_info = KVASER_HAS_TXRX_ERRORS },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+
+	/* USBCANII family IDs */
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+
 	{ }
 };
 MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
@@ -463,7 +635,18 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	if (err)
 		return err;
 
-	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
+	switch (dev->family) {
+	case KVASER_LEAF:
+		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		break;
+	case KVASER_USBCAN:
+		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		return -EINVAL;
+	}
 
 	return 0;
 }
@@ -482,7 +665,7 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
 		return err;
 
 	dev->nchannels = msg.u.cardinfo.nchannels;
-	if (dev->nchannels > MAX_NET_DEVICES)
+	if (dev->nchannels > dev->max_channels)
 		return -EINVAL;
 
 	return 0;
@@ -496,8 +679,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	u8 channel = msg->u.tx_acknowledge.channel;
-	u8 tid = msg->u.tx_acknowledge.tid;
+	u8 channel, tid;
+
+	channel = msg->u.tx_acknowledge_header.channel;
+	tid = msg->u.tx_acknowledge_header.tid;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -615,37 +800,83 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
 }
 
-static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
-				const struct kvaser_msg *msg)
+static void kvaser_report_error_event(const struct kvaser_usb *dev,
+				      struct kvaser_error_summary *es);
+
+/*
+ * Report error to userspace iff the controller's errors counter has
+ * increased, or we're the only channel seeing the bus error state.
+ *
+ * As reported by USBCAN sheets, "the CAN controller has difficulties
+ * to tell whether an error frame arrived on channel 1 or on channel 2."
+ * Thus, error counters are compared with their earlier values to
+ * determine which channel was responsible for the error event.
+ */
+static void usbcan_report_error_if_applicable(const struct kvaser_usb *dev,
+					      struct kvaser_error_summary *es)
 {
-	struct can_frame *cf;
-	struct sk_buff *skb;
-	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
+	int old_tx_err_count, old_rx_err_count, channel, report_error;
+
+	channel = es->channel;
+	if (channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", channel);
+		return;
+	}
+
+	priv = dev->nets[channel];
+	old_tx_err_count = priv->bec.txerr;
+	old_rx_err_count = priv->bec.rxerr;
+
+	report_error = 0;
+	if (es->txerr > old_tx_err_count) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
+		report_error = 1;
+	}
+	if (es->rxerr > old_rx_err_count) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
+		report_error = 1;
+	}
+	if ((es->status & M16C_STATE_BUS_ERROR) &&
+	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
+		report_error = 1;
+	}
+
+	if (report_error)
+		kvaser_report_error_event(dev, es);
+}
+
+/*
+ * Extract error summary from a Leaf-based device error message
+ */
+static void leaf_extract_error_from_msg(const struct kvaser_usb *dev,
+					const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { 0, };
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
+		es.channel = msg->u.leaf.error_event.channel;
+		es.status =  msg->u.leaf.error_event.status;
+		es.txerr = msg->u.leaf.error_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
+		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
 		break;
-	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
+	case LEAF_CMD_LOG_MESSAGE:
+		es.channel = msg->u.leaf.log_message.channel;
+		es.status = msg->u.leaf.log_message.data[0];
+		es.txerr = msg->u.leaf.log_message.data[2];
+		es.rxerr = msg->u.leaf.log_message.data[3];
+		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
 		break;
 	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
+		es.channel = msg->u.leaf.chip_state_event.channel;
+		es.status =  msg->u.leaf.chip_state_event.status;
+		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
+		es.leaf.error_factor = 0;
 		break;
 	default:
 		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
@@ -653,16 +884,92 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (channel >= dev->nchannels) {
+	kvaser_report_error_event(dev, &es);
+}
+
+/*
+ * Extract summary from a USBCANII-based device error message.
+ */
+static void usbcan_extract_error_from_msg(const struct kvaser_usb *dev,
+					const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { 0, };
+
+	switch (msg->id) {
+
+	/* Sometimes errors are sent as unsolicited chip state events */
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.usbcan.chip_state_event.channel;
+		es.status =  msg->u.usbcan.chip_state_event.status;
+		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
+		usbcan_report_error_if_applicable(dev, &es);
+		break;
+
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = 0;
+		es.status = msg->u.usbcan.error_event.status_ch0;
+		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
+		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
+		es.usbcan.other_ch_status =
+			msg->u.usbcan.error_event.status_ch1;
+		usbcan_report_error_if_applicable(dev, &es);
+
+		/* For error events, the USBCAN firmware does not support
+		 * more than 2 channels: ch0, and ch1. */
+		if (dev->nchannels > 1) {
+			es.channel = 1;
+			es.status = msg->u.usbcan.error_event.status_ch1;
+			es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch1;
+			es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch1;
+			es.usbcan.other_ch_status =
+				msg->u.usbcan.error_event.status_ch0;
+			usbcan_report_error_if_applicable(dev, &es);
+		}
+		break;
+
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+	}
+}
+
+static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
+				const struct kvaser_msg *msg)
+{
+	switch (dev->family) {
+	case KVASER_LEAF:
+		leaf_extract_error_from_msg(dev, msg);
+		break;
+	case KVASER_USBCAN:
+		usbcan_extract_error_from_msg(dev, msg);
+		break;
+	default:
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
+}
 
-	priv = dev->nets[channel];
+static void kvaser_report_error_event(const struct kvaser_usb *dev,
+				      struct kvaser_error_summary *es)
+{
+	struct can_frame *cf;
+	struct sk_buff *skb;
+	struct net_device_stats *stats;
+	struct kvaser_usb_net_priv *priv;
+	unsigned int new_state;
+
+	if (es->channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", es->channel);
+		return;
+	}
+
+	priv = dev->nets[es->channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
+	if (es->status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
@@ -675,9 +982,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	new_state = priv->can.state;
 
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	if (status & M16C_STATE_BUS_OFF) {
+	if (es->status & M16C_STATE_BUS_OFF) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
 		priv->can.can_stats.bus_off++;
@@ -687,12 +994,12 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		netif_carrier_off(priv->netdev);
 
 		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
 		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es->txerr || es->rxerr)
+				cf->data[1] = (es->txerr > es->rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
@@ -703,13 +1010,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	}
-
-	if (status == M16C_STATE_BUS_ERROR) {
+	} else if (es->status & M16C_STATE_BUS_ERROR) {
 		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es->txerr > es->rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
 
@@ -723,7 +1028,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 	}
 
-	if (!status) {
+	if (!es->status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
 
@@ -739,34 +1044,52 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
-		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
-
-		if (error_factor & M16C_EF_ACKE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
-					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
-			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
-			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
-			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
-			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
-			cf->data[2] |= CAN_ERR_PROT_TX;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			priv->can.can_stats.bus_error++;
+			stats->rx_errors++;
+
+			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+
+			if (es->leaf.error_factor & M16C_EF_ACKE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
+			if (es->leaf.error_factor & M16C_EF_CRCE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
+						CAN_ERR_PROT_LOC_CRC_DEL);
+			if (es->leaf.error_factor & M16C_EF_FORME)
+				cf->data[2] |= CAN_ERR_PROT_FORM;
+			if (es->leaf.error_factor & M16C_EF_STFE)
+				cf->data[2] |= CAN_ERR_PROT_STUFF;
+			if (es->leaf.error_factor & M16C_EF_BITE0)
+				cf->data[2] |= CAN_ERR_PROT_BIT0;
+			if (es->leaf.error_factor & M16C_EF_BITE1)
+				cf->data[2] |= CAN_ERR_PROT_BIT1;
+			if (es->leaf.error_factor & M16C_EF_TRE)
+				cf->data[2] |= CAN_ERR_PROT_TX;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
+			stats->tx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
+			stats->rx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			priv->can.can_stats.bus_error++;
+			cf->can_id |= CAN_ERR_BUSERROR;
+		}
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto err;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
+	cf->data[6] = es->txerr;
+	cf->data[7] = es->rxerr;
 
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
 
 	priv->can.state = new_state;
 
@@ -774,6 +1097,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+
+	return;
+
+err:
+	dev_kfree_skb(skb);
 }
 
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
@@ -783,16 +1111,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	struct sk_buff *skb;
 	struct net_device_stats *stats = &priv->netdev->stats;
 
-	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
+	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
 		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
-			   msg->u.rx_can.flag);
+			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
 		return;
 	}
 
-	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -819,7 +1147,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
-	u8 channel = msg->u.rx_can.channel;
+	u8 channel = msg->u.rx_can_header.channel;
+	const u8 *rx_msg;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -830,19 +1159,32 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
-	    (msg->id == CMD_LOG_MESSAGE)) {
+	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+	    (dev->family == KVASER_LEAF && msg->id == LEAF_CMD_LOG_MESSAGE)) {
 		kvaser_usb_rx_error(dev, msg);
 		return;
-	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
-					 MSG_FLAG_NERR |
-					 MSG_FLAG_OVERRUN)) {
+	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
+						MSG_FLAG_NERR |
+						MSG_FLAG_OVERRUN)) {
 		kvaser_usb_rx_can_err(priv, msg);
 		return;
-	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
+	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
 		netdev_warn(priv->netdev,
 			    "Unhandled frame (flags: 0x%02x)",
-			    msg->u.rx_can.flag);
+			    msg->u.rx_can_header.flag);
+		return;
+	}
+
+	switch (dev->family) {
+	case KVASER_LEAF:
+		rx_msg = msg->u.leaf.rx_can.msg;
+		break;
+	case KVASER_USBCAN:
+		rx_msg = msg->u.usbcan.rx_can.msg;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
 
@@ -852,38 +1194,37 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (msg->id == CMD_LOG_MESSAGE) {
-		cf->can_id = le32_to_cpu(msg->u.log_message.id);
+	if (dev->family == KVASER_LEAF && msg->id == LEAF_CMD_LOG_MESSAGE) {
+		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
 		if (cf->can_id & KVASER_EXTENDED_FRAME)
 			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
 		else
 			cf->can_id &= CAN_SFF_MASK;
 
-		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
+		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
 
-		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.log_message.data,
+			memcpy(cf->data, &msg->u.leaf.log_message.data,
 			       cf->can_dlc);
 	} else {
-		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
-			     (msg->u.rx_can.msg[1] & 0x3f);
+		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
 
 		if (msg->id == CMD_RX_EXT_MESSAGE) {
 			cf->can_id <<= 18;
-			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
-				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
-				      (msg->u.rx_can.msg[4] & 0x3f);
+			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
+				      ((rx_msg[3] & 0xff) << 6) |
+				      (rx_msg[4] & 0x3f);
 			cf->can_id |= CAN_EFF_FLAG;
 		}
 
-		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
+		cf->can_dlc = get_can_dlc(rx_msg[5]);
 
-		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.rx_can.msg[6],
+			memcpy(cf->data, &rx_msg[6],
 			       cf->can_dlc);
 	}
 
@@ -947,7 +1288,12 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 
 	case CMD_RX_STD_MESSAGE:
 	case CMD_RX_EXT_MESSAGE:
-	case CMD_LOG_MESSAGE:
+		kvaser_usb_rx_can_msg(dev, msg);
+		break;
+
+	case LEAF_CMD_LOG_MESSAGE:
+		if (dev->family != KVASER_LEAF)
+			goto warn;
 		kvaser_usb_rx_can_msg(dev, msg);
 		break;
 
@@ -960,8 +1306,14 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 		kvaser_usb_tx_acknowledge(dev, msg);
 		break;
 
+	/* Ignored messages */
+	case USBCAN_CMD_CLOCK_OVERFLOW_EVENT:
+		if (dev->family != KVASER_USBCAN)
+			goto warn;
+		break;
+
 	default:
-		dev_warn(dev->udev->dev.parent,
+warn:		dev_warn(dev->udev->dev.parent,
 			 "Unhandled message (%d)\n", msg->id);
 		break;
 	}
@@ -1181,7 +1533,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 				  dev->rxbuf[i],
 				  dev->rxbuf_dma[i]);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++) {
+	for (i = 0; i < dev->nchannels; i++) {
 		struct kvaser_usb_net_priv *priv = dev->nets[i];
 
 		if (priv)
@@ -1289,6 +1641,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	uint8_t *msg_tx_can_flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1310,9 +1663,23 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	msg = buf;
 	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
-	msg->u.tx_can.flags = 0;
 	msg->u.tx_can.channel = priv->channel;
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
+		break;
+	case KVASER_USBCAN:
+		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto releasebuf;
+	}
+
+	*msg_tx_can_flags = 0;
+
 	if (cf->can_id & CAN_EFF_FLAG) {
 		msg->id = CMD_TX_EXT_MESSAGE;
 		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
@@ -1330,7 +1697,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
 
 	if (cf->can_id & CAN_RTR_FLAG)
-		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
+		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
@@ -1601,6 +1968,18 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 	if (!dev)
 		return -ENOMEM;
 
+	if (LEAF_PRODUCT_ID(id->idProduct)) {
+		dev->family = KVASER_LEAF;
+		dev->max_channels = LEAF_MAX_NET_DEVICES;
+	} else if (USBCAN_PRODUCT_ID(id->idProduct)) {
+		dev->family = KVASER_USBCAN;
+		dev->max_channels = USBCAN_MAX_NET_DEVICES;
+	} else {
+		dev_err(&intf->dev, "Product ID (%d) does not belong to any "
+				    "known Kvaser USB family", id->idProduct);
+		return -ENODEV;
+	}
+
 	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
 	if (err) {
 		dev_err(&intf->dev, "Cannot get usb endpoint(s)");

^ permalink raw reply related	[relevance 39%]

* [PATCH v2 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels
  2014-12-24 23:59 86%   ` [PATCH v2 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
@ 2014-12-25  0:02 98%     ` Ahmed S. Darwish
  2014-12-25  0:04 39%       ` [PATCH v2 4/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2014-12-25  0:02 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Greg KH, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

"Someone reported me that recent leaf firmwares go in trouble when
you send a command for a channel that does not exist. Instead ...
you can move the reset command to kvaser_usb_init_one() function."

Suggested-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 598e251..2791501 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1505,6 +1505,10 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb_net_priv *priv;
 	int i, err;
 
+	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
+	if (err)
+		return err;
+
 	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
@@ -1609,9 +1613,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++)
-		kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, i);
-
 	err = kvaser_usb_get_software_info(dev);
 	if (err) {
 		dev_err(&intf->dev,

^ permalink raw reply related	[relevance 98%]

* Re: [PATCH v2 1/4] can: kvaser_usb: Don't free packets when tight on URBs
  @ 2014-12-25  9:38 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2014-12-25  9:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, David S. Miller, Paul Gortmaker, Linux-CAN,
	netdev, Linux-stable, LKML

On Wed, Dec 24, 2014 at 06:50:11PM -0800, Greg KH wrote:
> On Thu, Dec 25, 2014 at 01:56:44AM +0200, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
  > > Flooding the Kvaser CAN to USB dongle with multiple reads and
> > writes in high frequency caused seemingly-random panics in the
> > kernel.
> > 
> > On further inspection, it seems the driver erroneously freed the
> > to-be-transmitted packet upon getting tight on URBs and returning
> > NETDEV_TX_BUSY, leading to invalid memory writes and double frees
> > at a later point in time.
> > 
> > Note:
> > 
> > Finding no more URBs/transmit-contexts and returning NETDEV_TX_BUSY
> > is a driver bug in and out of itself: it means that our start/stop
> > queue flow control is broken.
> > 
> > This patch only fixes the (buggy) error handling code; the root
> > cause shall be fixed in a later commit.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > ---
> >  drivers/net/can/usb/kvaser_usb.c | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> >  (Marc, Greg, I believe this should also be added to -stable?)
> 
> 
> <formletter>
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
> for how to do this properly.
> 
> </formletter>

<msg-response>

Note taken. Sorry about that ;-)

</msg-response>

^ permalink raw reply	[relevance 99%]

* Re: [PATCH] can: kvaser_usb: Add support for the Usbcan-II family
  @ 2014-12-30 15:33 99%         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2014-12-30 15:33 UTC (permalink / raw)
  To: Olivier Sobrie
  Cc: Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

On Sun, Dec 28, 2014 at 10:51:34PM +0100, Olivier Sobrie wrote:
[...]
> > > >  
> > > > +	if (LEAF_PRODUCT_ID(id->idProduct)) {
> > > > +		dev->family = KVASER_LEAF;
> > > > +		dev->max_channels = LEAF_MAX_NET_DEVICES;
> > > > +	} else if (USBCAN_PRODUCT_ID(id->idProduct)) {
> > > > +		dev->family = KVASER_USBCAN;
> > > > +		dev->max_channels = USBCAN_MAX_NET_DEVICES;
> > > > +	} else {
> > > > +		dev_err(&intf->dev, "Product ID (%d) does not belong to any "
> > > > +				    "known Kvaser USB family", id->idProduct);
> > > > +		return -ENODEV;
> > > > +	}
> > > > +
> > > 
> > > Is it really required to keep max_channels in the kvaser_usb structure?
> > > If I looked correctly, you use this variable as a replacement for
> > > MAX_NET_DEVICES in the code and MAX_NET_DEVICES is only used in probe
> > > and disconnect functions. I think it can even be replaced by nchannels
> > > in the disconnect path. So I also think that it don't need to be in the
> > > kvaser_usb structure.
> > > 
> > 
> > hmmm.. given the current state of error arbitration explained
> > above, where I cannot accept a dev->nchannels > 2, I guess we
> > have two options:
> > 
> > a) Remove max_channels, and hardcode the channels count
> > correctness logic as follows:
> > 
> >         dev->nchannels = msg.u.cardinfo.nchannels;
> >         if ((dev->family == USBCAN && dev->nchannels > USBCAN_MAX_NET_DEVICES)
> >             || (dev->family == LEAF && dev->nchannels > LEAF_MAX_NET_DEVICES))
> >                 return -EINVAL
> > 
> > b) Leave max_channels in 'struct kvaser_usb' as is.
> > 
> > I personally prefer the solution at 'b)' but I can do it as
> > in 'a)' if you prefer :-)
> 
> Keeping max_channels in the kvaser_usb structure is useless because it
> is only used in one function that is called in the probe function.
> 
> I would prefer to have:
> 	if (dev->nchannels > MAX_NET_DEVICES)
> 		return -EINVAL
> 
> 	if ((dev->family == USBCAN) &&
> 	    (dev->nchannels > MAX_USBCAN_NET_DEVICES))
> 		return -EINVAL
> 
> You can remove LEAF_MAX_NET_DEVICES which is not used, keep
> MAX_NET_DEVICES equals to 3 and remove the MAX() macro.
> The test specific to the USBCAN family can eventually be moved in the
> kvaser_usb_probe() function.
> 

Quite nice, will do it that way in v3.

Regards,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH] can: kvaser_usb: Don't free packets when tight on URBs
  @ 2015-01-03 14:34 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-03 14:34 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, David S. Miller, Paul Gortmaker, Linux-CAN,
	netdev, LKML

On Thu, Jan 01, 2015 at 01:59:13PM -0800, Stephen Hemminger wrote:
> On Tue, 23 Dec 2014 17:46:54 +0200
> "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> 
> >  	int ret = NETDEV_TX_OK;
> > +	bool kfree_skb_on_error = true;
> >  
> >  	if (can_dropped_invalid_skb(netdev, skb))
> >  		return NETDEV_TX_OK;
> > @@ -1336,6 +1337,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  
> >  	if (!context) {
> >  		netdev_warn(netdev, "cannot find free context\n");
> > +		kfree_skb_on_error = false;
> >  		ret =  NETDEV_TX_BUSY;
> 
> You already have a flag value (ret == NETDEV_TX_BUSY), why
> not use that instead of introducing another variable?

Yes, that variable got implicitly removed in v2 patch 1/4.

Thanks,

--
Darwish

^ permalink raw reply	[relevance 99%]

* [PATCH v3 1/4] can: kvaser_usb: Don't free packets when tight on URBs
  2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
                   ` (3 preceding siblings ...)
  @ 2015-01-05 17:49 93% ` Ahmed S. Darwish
  2015-01-05 17:52 87%   ` [PATCH v3 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
  2015-01-11 20:05 99% ` [PATCH v4 00/04] can: Introduce support for Kvaser USBCAN-II devices Ahmed S. Darwish
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-05 17:49 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in high frequency caused seemingly-random panics in the
kernel.

On further inspection, it seems the driver erroneously freed the
to-be-transmitted packet upon getting tight on URBs and returning
NETDEV_TX_BUSY, leading to invalid memory writes and double frees
at a later point in time.

Note:

Finding no more URBs/transmit-contexts and returning NETDEV_TX_BUSY
is a driver bug in and out of itself: it means that our start/stop
queue flow control is broken.

This patch only fixes the (buggy) error handling code; the root
cause shall be fixed in a later commit.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Acked-by: Olivier Sobrie <olivier@sobrie.be>
---
 drivers/net/can/usb/kvaser_usb.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

** V3 Changelog:
- checkpatch.pl suggestions ('net/' commenting style)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 541fb7a..2e7d513 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1294,12 +1294,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (!urb) {
 		netdev_err(netdev, "No memory left for URBs\n");
 		stats->tx_dropped++;
-		goto nourbmem;
+		dev_kfree_skb(skb);
+		return NETDEV_TX_OK;
 	}
 
 	buf = kmalloc(sizeof(struct kvaser_msg), GFP_ATOMIC);
 	if (!buf) {
 		stats->tx_dropped++;
+		dev_kfree_skb(skb);
 		goto nobufmem;
 	}
 
@@ -1334,6 +1336,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		}
 	}
 
+	/* This should never happen; it implies a flow control bug */
 	if (!context) {
 		netdev_warn(netdev, "cannot find free context\n");
 		ret =  NETDEV_TX_BUSY;
@@ -1364,9 +1367,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (unlikely(err)) {
 		can_free_echo_skb(netdev, context->echo_index);
 
-		skb = NULL; /* set to NULL to avoid double free in
-			     * dev_kfree_skb(skb) */
-
 		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
@@ -1388,8 +1388,6 @@ releasebuf:
 	kfree(buf);
 nobufmem:
 	usb_free_urb(urb);
-nourbmem:
-	dev_kfree_skb(skb);
 	return ret;
 }
 


^ permalink raw reply related	[relevance 93%]

* [PATCH v3 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close
  2015-01-05 17:49 93% ` [PATCH v3 1/4] " Ahmed S. Darwish
@ 2015-01-05 17:52 87%   ` Ahmed S. Darwish
  2015-01-05 17:57 98%     ` [PATCH v3 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-05 17:52 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in very high frequency (*), closing the CAN channel while
all the transmissions are on (#), opening the device again (@),
then sending a small number of packets would make the driver
enter an almost infinite loop of:

[....]
[15959.853988] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853990] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853991] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853993] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853994] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853995] kvaser_usb 4-3:1.0 can0: cannot find free context
[....]

_dragging the whole system down_ in the process due to the
excessive logging output.

Initially, this has caused random panics in the kernel due to a
buggy error recovery path.  That got fixed in an earlier commit.(%)
This patch aims at solving the root cause. -->

16 tx URBs and contexts are allocated per CAN channel per USB
device. Such URBs are protected by:

a) A simple atomic counter, up to a value of MAX_TX_URBS (16)
b) A flag in each URB context, stating if it's free
c) The fact that ndo_start_xmit calls are themselves protected
   by the networking layers higher above

After grabbing one of the tx URBs, if the driver noticed that all
of them are now taken, it stops the netif transmission queue.
Such queue is worken up again only if an acknowedgment was received
from the firmware on one of our earlier-sent frames.

Meanwhile, upon channel close (#), the driver sends a CMD_STOP_CHIP
to the firmware, effectively closing all further communication.  In
the high traffic case, the atomic counter remains at MAX_TX_URBS,
and all the URB contexts remain marked as active.  While opening
the channel again (@), it cannot send any further frames since no
more free tx URB contexts are available.

Reset all tx URB contexts upon CAN channel close.

(*) 50 parallel instances of `cangen0 -g 0 -ix`
(#) `ifconfig can0 down`
(@) `ifconfig can0 up`
(%) "can: kvaser_usb: Don't free packets when tight on URBs"

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 2e7d513..9accc82 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1246,6 +1246,9 @@ static int kvaser_usb_close(struct net_device *netdev)
 	if (err)
 		netdev_warn(netdev, "Cannot stop device, error %d\n", err);
 
+	/* reset tx contexts */
+	kvaser_usb_unlink_tx_urbs(priv);
+
 	priv->can.state = CAN_STATE_STOPPED;
 	close_candev(priv->netdev);
 


^ permalink raw reply related	[relevance 87%]

* [PATCH v3 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels
  2015-01-05 17:52 87%   ` [PATCH v3 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
@ 2015-01-05 17:57 98%     ` Ahmed S. Darwish
  2015-01-05 18:31 39%       ` [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-05 17:57 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Recent Leaf firmware versions (>= 3.1.557) do not allow to send
commands for non-existing channels.  If a command is sent for a
non-existing channel, the firmware crashes.

Reported-by: Christopher Storah <Christopher.Storah@invetech.com.au>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
---
 drivers/net/can/usb/kvaser_usb.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

** V3 Changelog:
- Update commit log message per Olivier remarks on v2

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 9accc82..cc7bfc0 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1503,6 +1503,10 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb_net_priv *priv;
 	int i, err;
 
+	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
+	if (err)
+		return err;
+
 	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
@@ -1607,9 +1611,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++)
-		kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, i);
-
 	err = kvaser_usb_get_software_info(dev);
 	if (err) {
 		dev_err(&intf->dev,


^ permalink raw reply related	[relevance 98%]

* [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family
  2015-01-05 17:57 98%     ` [PATCH v3 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
@ 2015-01-05 18:31 39%       ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-05 18:31 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
divided into two major families: 'Leaf', and 'UsbcanII'.  From an
Operating System perspective, the firmware of both families behave
in a not too drastically different fashion.

This patch adds support for the UsbcanII family of devices to the
current Kvaser Leaf-only driver.

CAN frames sending, receiving, and error handling paths has been
tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
should also work nicely with other products in the same category.

List of new devices supported by this driver update:

         - Kvaser USBcan II HS/HS
         - Kvaser USBcan II HS/LS
         - Kvaser USBcan Rugged ("USBcan Rev B")
         - Kvaser Memorator HS/HS
         - Kvaser Memorator HS/LS
         - Scania VCI2 (if you have the Kvaser logo on top)

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/Kconfig      |   8 +-
 drivers/net/can/usb/kvaser_usb.c | 618 +++++++++++++++++++++++++++++++--------
 2 files changed, 503 insertions(+), 123 deletions(-)

** V2 Changelog:
- Update Kconfig entries
- Use actual number of CAN channels (instead of max) where appropriate
- Rebase over a new set of UsbcanII-independent driver fixes

** V3 Changelog:
- Fix padding for the usbcan_msg_tx_acknowledge command
- Remove kvaser_usb->max_channels and the MAX_NET_DEVICES macro
- Rename commands to CMD_LEAF_xxx and CMD_USBCAN_xxx
- Apply checkpatch.pl suggestions ('net/' comments, multi-line strings, etc.)

diff --git a/drivers/net/can/usb/Kconfig b/drivers/net/can/usb/Kconfig
index a77db919..f6f5500 100644
--- a/drivers/net/can/usb/Kconfig
+++ b/drivers/net/can/usb/Kconfig
@@ -25,7 +25,7 @@ config CAN_KVASER_USB
 	tristate "Kvaser CAN/USB interface"
 	---help---
 	  This driver adds support for Kvaser CAN/USB devices like Kvaser
-	  Leaf Light.
+	  Leaf Light and Kvaser USBcan II.
 
 	  The driver provides support for the following devices:
 	    - Kvaser Leaf Light
@@ -46,6 +46,12 @@ config CAN_KVASER_USB
 	    - Kvaser USBcan R
 	    - Kvaser Leaf Light v2
 	    - Kvaser Mini PCI Express HS
+	    - Kvaser USBcan II HS/HS
+	    - Kvaser USBcan II HS/LS
+	    - Kvaser USBcan Rugged ("USBcan Rev B")
+	    - Kvaser Memorator HS/HS
+	    - Kvaser Memorator HS/LS
+	    - Scania VCI2 (if you have the Kvaser logo on top)
 
 	  If unsure, say N.
 
diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index cc7bfc0..43b3928 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -6,10 +6,12 @@
  * Parts of this driver are based on the following:
  *  - Kvaser linux leaf driver (version 4.78)
  *  - CAN driver for esd CAN-USB/2
+ *  - Kvaser linux usbcanII driver (version 5.3)
  *
  * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
  * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
  * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
+ * Copyright (C) 2015 Valeo Corporation
  */
 
 #include <linux/completion.h>
@@ -21,6 +23,15 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
+/* Kvaser USB CAN dongles are divided into two major families:
+ * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
+ * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
+ */
+enum kvaser_usb_family {
+	KVASER_LEAF,
+	KVASER_USBCAN,
+};
+
 #define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
@@ -30,8 +41,9 @@
 #define RX_BUFFER_SIZE			3072
 #define CAN_USB_CLOCK			8000000
 #define MAX_NET_DEVICES			3
+#define MAX_USBCAN_NET_DEVICES		2
 
-/* Kvaser USB devices */
+/* Leaf USB devices */
 #define KVASER_VENDOR_ID		0x0bfd
 #define USB_LEAF_DEVEL_PRODUCT_ID	10
 #define USB_LEAF_LITE_PRODUCT_ID	11
@@ -55,6 +67,16 @@
 #define USB_CAN_R_PRODUCT_ID		39
 #define USB_LEAF_LITE_V2_PRODUCT_ID	288
 #define USB_MINI_PCIE_HS_PRODUCT_ID	289
+#define LEAF_PRODUCT_ID(id)		(id >= USB_LEAF_DEVEL_PRODUCT_ID && \
+					 id <= USB_MINI_PCIE_HS_PRODUCT_ID)
+
+/* USBCANII devices */
+#define USB_USBCAN_REVB_PRODUCT_ID	2
+#define USB_VCI2_PRODUCT_ID		3
+#define USB_USBCAN2_PRODUCT_ID		4
+#define USB_MEMORATOR_PRODUCT_ID	5
+#define USBCAN_PRODUCT_ID(id)		(id >= USB_USBCAN_REVB_PRODUCT_ID && \
+					 id <= USB_MEMORATOR_PRODUCT_ID)
 
 /* USB devices features */
 #define KVASER_HAS_SILENT_MODE		BIT(0)
@@ -73,7 +95,7 @@
 #define MSG_FLAG_TX_ACK			BIT(6)
 #define MSG_FLAG_TX_REQUEST		BIT(7)
 
-/* Can states */
+/* Can states (M16C CxSTRH register) */
 #define M16C_STATE_BUS_RESET		BIT(0)
 #define M16C_STATE_BUS_ERROR		BIT(4)
 #define M16C_STATE_BUS_PASSIVE		BIT(5)
@@ -98,7 +120,13 @@
 #define CMD_START_CHIP_REPLY		27
 #define CMD_STOP_CHIP			28
 #define CMD_STOP_CHIP_REPLY		29
-#define CMD_GET_CARD_INFO2		32
+#define CMD_READ_CLOCK			30
+#define CMD_READ_CLOCK_REPLY		31
+
+#define CMD_LEAF_GET_CARD_INFO2		32
+#define CMD_USBCAN_RESET_CLOCK		32
+#define CMD_USBCAN_CLOCK_OVERFLOW_EVENT	33
+
 #define CMD_GET_CARD_INFO		34
 #define CMD_GET_CARD_INFO_REPLY		35
 #define CMD_GET_SOFTWARE_INFO		38
@@ -108,8 +136,9 @@
 #define CMD_RESET_ERROR_COUNTER		49
 #define CMD_TX_ACKNOWLEDGE		50
 #define CMD_CAN_ERROR_EVENT		51
-#define CMD_USB_THROTTLE		77
-#define CMD_LOG_MESSAGE			106
+
+#define CMD_LEAF_USB_THROTTLE		77
+#define CMD_LEAF_LOG_MESSAGE		106
 
 /* error factors */
 #define M16C_EF_ACKE			BIT(0)
@@ -121,6 +150,14 @@
 #define M16C_EF_RCVE			BIT(6)
 #define M16C_EF_TRE			BIT(7)
 
+/* Only Leaf-based devices can report M16C error factors,
+ * thus define our own error status flags for USBCANII
+ */
+#define USBCAN_ERROR_STATE_NONE		0
+#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
+#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
+#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
+
 /* bittiming parameters */
 #define KVASER_USB_TSEG1_MIN		1
 #define KVASER_USB_TSEG1_MAX		16
@@ -137,7 +174,7 @@
 #define KVASER_CTRL_MODE_SELFRECEPTION	3
 #define KVASER_CTRL_MODE_OFF		4
 
-/* log message */
+/* Extended CAN identifier flag */
 #define KVASER_EXTENDED_FRAME		BIT(31)
 
 struct kvaser_msg_simple {
@@ -148,30 +185,55 @@ struct kvaser_msg_simple {
 struct kvaser_msg_cardinfo {
 	u8 tid;
 	u8 nchannels;
-	__le32 serial_number;
-	__le32 padding;
+	union {
+		struct {
+			__le32 serial_number;
+			__le32 padding;
+		} __packed leaf0;
+		struct {
+			__le32 serial_number_low;
+			__le32 serial_number_high;
+		} __packed usbcan0;
+	} __packed;
 	__le32 clock_resolution;
 	__le32 mfgdate;
 	u8 ean[8];
 	u8 hw_revision;
-	u8 usb_hs_mode;
-	__le16 padding2;
+	union {
+		struct {
+			u8 usb_hs_mode;
+		} __packed leaf1;
+		struct {
+			u8 padding;
+		} __packed usbcan1;
+	} __packed;
+	__le16 padding;
 } __packed;
 
 struct kvaser_msg_cardinfo2 {
 	u8 tid;
-	u8 channel;
+	u8 reserved;
 	u8 pcb_id[24];
 	__le32 oem_unlock_code;
 } __packed;
 
-struct kvaser_msg_softinfo {
+struct leaf_msg_softinfo {
 	u8 tid;
-	u8 channel;
+	u8 padding0;
 	__le32 sw_options;
 	__le32 fw_version;
 	__le16 max_outstanding_tx;
-	__le16 padding[9];
+	__le16 padding1[9];
+} __packed;
+
+struct usbcan_msg_softinfo {
+	u8 tid;
+	u8 fw_name[5];
+	__le16 max_outstanding_tx;
+	u8 padding[6];
+	__le32 fw_version;
+	__le16 checksum;
+	__le16 sw_options;
 } __packed;
 
 struct kvaser_msg_busparams {
@@ -188,36 +250,86 @@ struct kvaser_msg_tx_can {
 	u8 channel;
 	u8 tid;
 	u8 msg[14];
-	u8 padding;
-	u8 flags;
+	union {
+		struct {
+			u8 padding;
+			u8 flags;
+		} __packed leaf;
+		struct {
+			u8 flags;
+			u8 padding;
+		} __packed usbcan;
+	} __packed;
 } __packed;
 
-struct kvaser_msg_rx_can {
+struct kvaser_msg_rx_can_header {
 	u8 channel;
 	u8 flag;
+} __packed;
+
+struct leaf_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
 	__le16 time[3];
 	u8 msg[14];
 } __packed;
 
-struct kvaser_msg_chip_state_event {
+struct usbcan_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
+	u8 msg[14];
+	__le16 time;
+} __packed;
+
+struct leaf_msg_chip_state_event {
 	u8 tid;
 	u8 channel;
+
 	__le16 time[3];
 	u8 tx_errors_count;
 	u8 rx_errors_count;
+
+	u8 status;
+	u8 padding[3];
+} __packed;
+
+struct usbcan_msg_chip_state_event {
+	u8 tid;
+	u8 channel;
+
+	u8 tx_errors_count;
+	u8 rx_errors_count;
+	__le16 time;
+
 	u8 status;
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_tx_acknowledge {
+struct kvaser_msg_tx_acknowledge_header {
+	u8 channel;
+	u8 tid;
+};
+
+struct leaf_msg_tx_acknowledge {
 	u8 channel;
 	u8 tid;
+
 	__le16 time[3];
 	u8 flags;
 	u8 time_offset;
 } __packed;
 
-struct kvaser_msg_error_event {
+struct usbcan_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
+	__le16 time;
+	__le16 padding;
+} __packed;
+
+struct leaf_msg_error_event {
 	u8 tid;
 	u8 flags;
 	__le16 time[3];
@@ -229,6 +341,18 @@ struct kvaser_msg_error_event {
 	u8 error_factor;
 } __packed;
 
+struct usbcan_msg_error_event {
+	u8 tid;
+	u8 padding;
+	u8 tx_errors_count_ch0;
+	u8 rx_errors_count_ch0;
+	u8 tx_errors_count_ch1;
+	u8 rx_errors_count_ch1;
+	u8 status_ch0;
+	u8 status_ch1;
+	__le16 time;
+} __packed;
+
 struct kvaser_msg_ctrl_mode {
 	u8 tid;
 	u8 channel;
@@ -243,7 +367,7 @@ struct kvaser_msg_flush_queue {
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_log_message {
+struct leaf_msg_log_message {
 	u8 channel;
 	u8 flags;
 	__le16 time[3];
@@ -260,19 +384,50 @@ struct kvaser_msg {
 		struct kvaser_msg_simple simple;
 		struct kvaser_msg_cardinfo cardinfo;
 		struct kvaser_msg_cardinfo2 cardinfo2;
-		struct kvaser_msg_softinfo softinfo;
 		struct kvaser_msg_busparams busparams;
+
+		struct kvaser_msg_rx_can_header rx_can_header;
+		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
+
+		union {
+			struct leaf_msg_softinfo softinfo;
+			struct leaf_msg_rx_can rx_can;
+			struct leaf_msg_chip_state_event chip_state_event;
+			struct leaf_msg_tx_acknowledge tx_acknowledge;
+			struct leaf_msg_error_event error_event;
+			struct leaf_msg_log_message log_message;
+		} __packed leaf;
+
+		union {
+			struct usbcan_msg_softinfo softinfo;
+			struct usbcan_msg_rx_can rx_can;
+			struct usbcan_msg_chip_state_event chip_state_event;
+			struct usbcan_msg_tx_acknowledge tx_acknowledge;
+			struct usbcan_msg_error_event error_event;
+		} __packed usbcan;
+
 		struct kvaser_msg_tx_can tx_can;
-		struct kvaser_msg_rx_can rx_can;
-		struct kvaser_msg_chip_state_event chip_state_event;
-		struct kvaser_msg_tx_acknowledge tx_acknowledge;
-		struct kvaser_msg_error_event error_event;
 		struct kvaser_msg_ctrl_mode ctrl_mode;
 		struct kvaser_msg_flush_queue flush_queue;
-		struct kvaser_msg_log_message log_message;
 	} u;
 } __packed;
 
+/* Leaf/USBCAN-agnostic summary of an error event.
+ * No M16C error factors for USBCAN-based devices.
+ */
+struct kvaser_error_summary {
+	u8 channel, status, txerr, rxerr;
+	union {
+		struct {
+			u8 error_factor;
+		} leaf;
+		struct {
+			u8 other_ch_status;
+			u8 error_state;
+		} usbcan;
+	};
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -288,6 +443,7 @@ struct kvaser_usb {
 
 	u32 fw_version;
 	unsigned int nchannels;
+	enum kvaser_usb_family family;
 
 	bool rxinitdone;
 	void *rxbuf[MAX_RX_URBS];
@@ -311,6 +467,7 @@ struct kvaser_usb_net_priv {
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
+	/* Leaf family IDs */
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
@@ -360,6 +517,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
 		.driver_info = KVASER_HAS_TXRX_ERRORS },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+
+	/* USBCANII family IDs */
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+
 	{ }
 };
 MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
@@ -463,7 +631,18 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	if (err)
 		return err;
 
-	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
+	switch (dev->family) {
+	case KVASER_LEAF:
+		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		break;
+	case KVASER_USBCAN:
+		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		return -EINVAL;
+	}
 
 	return 0;
 }
@@ -484,6 +663,9 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
 	dev->nchannels = msg.u.cardinfo.nchannels;
 	if (dev->nchannels > MAX_NET_DEVICES)
 		return -EINVAL;
+	if (dev->family == KVASER_USBCAN &&
+	    dev->nchannels > MAX_USBCAN_NET_DEVICES)
+		return -EINVAL;
 
 	return 0;
 }
@@ -496,8 +678,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	u8 channel = msg->u.tx_acknowledge.channel;
-	u8 tid = msg->u.tx_acknowledge.tid;
+	u8 channel, tid;
+
+	channel = msg->u.tx_acknowledge_header.channel;
+	tid = msg->u.tx_acknowledge_header.tid;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -615,37 +799,80 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
 }
 
-static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
-				const struct kvaser_msg *msg)
+static void kvaser_report_error_event(const struct kvaser_usb *dev,
+				      struct kvaser_error_summary *es);
+
+/* Report error to userspace iff the controller's errors counter has
+ * increased, or we're the only channel seeing the bus error state.
+ *
+ * As reported by USBCAN sheets, "the CAN controller has difficulties
+ * to tell whether an error frame arrived on channel 1 or on channel 2."
+ * Thus, error counters are compared with their earlier values to
+ * determine which channel was responsible for the error event.
+ */
+static void usbcan_report_error_if_applicable(const struct kvaser_usb *dev,
+					      struct kvaser_error_summary *es)
 {
-	struct can_frame *cf;
-	struct sk_buff *skb;
-	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
+	int old_tx_err_count, old_rx_err_count, channel, report_error;
+
+	channel = es->channel;
+	if (channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", channel);
+		return;
+	}
+
+	priv = dev->nets[channel];
+	old_tx_err_count = priv->bec.txerr;
+	old_rx_err_count = priv->bec.rxerr;
+
+	report_error = 0;
+	if (es->txerr > old_tx_err_count) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
+		report_error = 1;
+	}
+	if (es->rxerr > old_rx_err_count) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
+		report_error = 1;
+	}
+	if ((es->status & M16C_STATE_BUS_ERROR) &&
+	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
+		report_error = 1;
+	}
+
+	if (report_error)
+		kvaser_report_error_event(dev, es);
+}
+
+/* Extract error summary from a Leaf-based device error message */
+static void leaf_extract_error_from_msg(const struct kvaser_usb *dev,
+					const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { 0, };
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
+		es.channel = msg->u.leaf.error_event.channel;
+		es.status =  msg->u.leaf.error_event.status;
+		es.txerr = msg->u.leaf.error_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
+		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
 		break;
-	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
+	case CMD_LEAF_LOG_MESSAGE:
+		es.channel = msg->u.leaf.log_message.channel;
+		es.status = msg->u.leaf.log_message.data[0];
+		es.txerr = msg->u.leaf.log_message.data[2];
+		es.rxerr = msg->u.leaf.log_message.data[3];
+		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
 		break;
 	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
+		es.channel = msg->u.leaf.chip_state_event.channel;
+		es.status =  msg->u.leaf.chip_state_event.status;
+		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
+		es.leaf.error_factor = 0;
 		break;
 	default:
 		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
@@ -653,16 +880,92 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (channel >= dev->nchannels) {
+	kvaser_report_error_event(dev, &es);
+}
+
+/* Extract error summary from a USBCANII-based device error message */
+static void usbcan_extract_error_from_msg(const struct kvaser_usb *dev,
+					  const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { 0, };
+
+	switch (msg->id) {
+	/* Sometimes errors are sent as unsolicited chip state events */
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.usbcan.chip_state_event.channel;
+		es.status =  msg->u.usbcan.chip_state_event.status;
+		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
+		usbcan_report_error_if_applicable(dev, &es);
+		break;
+
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = 0;
+		es.status = msg->u.usbcan.error_event.status_ch0;
+		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
+		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
+		es.usbcan.other_ch_status =
+			msg->u.usbcan.error_event.status_ch1;
+		usbcan_report_error_if_applicable(dev, &es);
+
+		/* For error events, the USBCAN firmware does not support
+		 * more than 2 channels: ch0, and ch1.
+		 */
+		if (dev->nchannels > 1) {
+			es.channel = 1;
+			es.status = msg->u.usbcan.error_event.status_ch1;
+			es.txerr =
+				msg->u.usbcan.error_event.tx_errors_count_ch1;
+			es.rxerr =
+				msg->u.usbcan.error_event.rx_errors_count_ch1;
+			es.usbcan.other_ch_status =
+				msg->u.usbcan.error_event.status_ch0;
+			usbcan_report_error_if_applicable(dev, &es);
+		}
+		break;
+
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+	}
+}
+
+static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
+				const struct kvaser_msg *msg)
+{
+	switch (dev->family) {
+	case KVASER_LEAF:
+		leaf_extract_error_from_msg(dev, msg);
+		break;
+	case KVASER_USBCAN:
+		usbcan_extract_error_from_msg(dev, msg);
+		break;
+	default:
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
+}
 
-	priv = dev->nets[channel];
+static void kvaser_report_error_event(const struct kvaser_usb *dev,
+				      struct kvaser_error_summary *es)
+{
+	struct can_frame *cf;
+	struct sk_buff *skb;
+	struct net_device_stats *stats;
+	struct kvaser_usb_net_priv *priv;
+	unsigned int new_state;
+
+	if (es->channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", es->channel);
+		return;
+	}
+
+	priv = dev->nets[es->channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
+	if (es->status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
@@ -675,9 +978,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	new_state = priv->can.state;
 
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	if (status & M16C_STATE_BUS_OFF) {
+	if (es->status & M16C_STATE_BUS_OFF) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
 		priv->can.can_stats.bus_off++;
@@ -687,12 +990,12 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		netif_carrier_off(priv->netdev);
 
 		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
 		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es->txerr || es->rxerr)
+				cf->data[1] = (es->txerr > es->rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
@@ -703,13 +1006,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	}
-
-	if (status == M16C_STATE_BUS_ERROR) {
+	} else if (es->status & M16C_STATE_BUS_ERROR) {
 		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es->txerr > es->rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
 
@@ -723,7 +1024,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 	}
 
-	if (!status) {
+	if (!es->status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
 
@@ -739,34 +1040,52 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
-		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
-
-		if (error_factor & M16C_EF_ACKE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
-					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
-			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
-			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
-			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
-			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
-			cf->data[2] |= CAN_ERR_PROT_TX;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			priv->can.can_stats.bus_error++;
+			stats->rx_errors++;
+
+			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+
+			if (es->leaf.error_factor & M16C_EF_ACKE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
+			if (es->leaf.error_factor & M16C_EF_CRCE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
+						CAN_ERR_PROT_LOC_CRC_DEL);
+			if (es->leaf.error_factor & M16C_EF_FORME)
+				cf->data[2] |= CAN_ERR_PROT_FORM;
+			if (es->leaf.error_factor & M16C_EF_STFE)
+				cf->data[2] |= CAN_ERR_PROT_STUFF;
+			if (es->leaf.error_factor & M16C_EF_BITE0)
+				cf->data[2] |= CAN_ERR_PROT_BIT0;
+			if (es->leaf.error_factor & M16C_EF_BITE1)
+				cf->data[2] |= CAN_ERR_PROT_BIT1;
+			if (es->leaf.error_factor & M16C_EF_TRE)
+				cf->data[2] |= CAN_ERR_PROT_TX;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
+			stats->tx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
+			stats->rx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			priv->can.can_stats.bus_error++;
+			cf->can_id |= CAN_ERR_BUSERROR;
+		}
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto err;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
+	cf->data[6] = es->txerr;
+	cf->data[7] = es->rxerr;
 
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
 
 	priv->can.state = new_state;
 
@@ -774,6 +1093,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+
+	return;
+
+err:
+	dev_kfree_skb(skb);
 }
 
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
@@ -783,16 +1107,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	struct sk_buff *skb;
 	struct net_device_stats *stats = &priv->netdev->stats;
 
-	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
+	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
 		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
-			   msg->u.rx_can.flag);
+			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
 		return;
 	}
 
-	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -819,7 +1143,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
-	u8 channel = msg->u.rx_can.channel;
+	u8 channel = msg->u.rx_can_header.channel;
+	const u8 *rx_msg;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -830,19 +1155,32 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
-	    (msg->id == CMD_LOG_MESSAGE)) {
+	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+	    (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE)) {
 		kvaser_usb_rx_error(dev, msg);
 		return;
-	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
-					 MSG_FLAG_NERR |
-					 MSG_FLAG_OVERRUN)) {
+	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
+						MSG_FLAG_NERR |
+						MSG_FLAG_OVERRUN)) {
 		kvaser_usb_rx_can_err(priv, msg);
 		return;
-	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
+	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
 		netdev_warn(priv->netdev,
 			    "Unhandled frame (flags: 0x%02x)",
-			    msg->u.rx_can.flag);
+			    msg->u.rx_can_header.flag);
+		return;
+	}
+
+	switch (dev->family) {
+	case KVASER_LEAF:
+		rx_msg = msg->u.leaf.rx_can.msg;
+		break;
+	case KVASER_USBCAN:
+		rx_msg = msg->u.usbcan.rx_can.msg;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
 
@@ -852,38 +1190,37 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (msg->id == CMD_LOG_MESSAGE) {
-		cf->can_id = le32_to_cpu(msg->u.log_message.id);
+	if (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE) {
+		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
 		if (cf->can_id & KVASER_EXTENDED_FRAME)
 			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
 		else
 			cf->can_id &= CAN_SFF_MASK;
 
-		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
+		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
 
-		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.log_message.data,
+			memcpy(cf->data, &msg->u.leaf.log_message.data,
 			       cf->can_dlc);
 	} else {
-		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
-			     (msg->u.rx_can.msg[1] & 0x3f);
+		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
 
 		if (msg->id == CMD_RX_EXT_MESSAGE) {
 			cf->can_id <<= 18;
-			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
-				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
-				      (msg->u.rx_can.msg[4] & 0x3f);
+			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
+				      ((rx_msg[3] & 0xff) << 6) |
+				      (rx_msg[4] & 0x3f);
 			cf->can_id |= CAN_EFF_FLAG;
 		}
 
-		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
+		cf->can_dlc = get_can_dlc(rx_msg[5]);
 
-		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.rx_can.msg[6],
+			memcpy(cf->data, &rx_msg[6],
 			       cf->can_dlc);
 	}
 
@@ -947,7 +1284,12 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 
 	case CMD_RX_STD_MESSAGE:
 	case CMD_RX_EXT_MESSAGE:
-	case CMD_LOG_MESSAGE:
+		kvaser_usb_rx_can_msg(dev, msg);
+		break;
+
+	case CMD_LEAF_LOG_MESSAGE:
+		if (dev->family != KVASER_LEAF)
+			goto warn;
 		kvaser_usb_rx_can_msg(dev, msg);
 		break;
 
@@ -960,8 +1302,14 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 		kvaser_usb_tx_acknowledge(dev, msg);
 		break;
 
+	/* Ignored messages */
+	case CMD_USBCAN_CLOCK_OVERFLOW_EVENT:
+		if (dev->family != KVASER_USBCAN)
+			goto warn;
+		break;
+
 	default:
-		dev_warn(dev->udev->dev.parent,
+warn:		dev_warn(dev->udev->dev.parent,
 			 "Unhandled message (%d)\n", msg->id);
 		break;
 	}
@@ -1181,7 +1529,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 				  dev->rxbuf[i],
 				  dev->rxbuf_dma[i]);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++) {
+	for (i = 0; i < dev->nchannels; i++) {
 		struct kvaser_usb_net_priv *priv = dev->nets[i];
 
 		if (priv)
@@ -1289,6 +1637,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	uint8_t *msg_tx_can_flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1310,9 +1659,23 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	msg = buf;
 	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
-	msg->u.tx_can.flags = 0;
 	msg->u.tx_can.channel = priv->channel;
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
+		break;
+	case KVASER_USBCAN:
+		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto releasebuf;
+	}
+
+	*msg_tx_can_flags = 0;
+
 	if (cf->can_id & CAN_EFF_FLAG) {
 		msg->id = CMD_TX_EXT_MESSAGE;
 		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
@@ -1330,7 +1693,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
 
 	if (cf->can_id & CAN_RTR_FLAG)
-		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
+		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
@@ -1599,6 +1962,17 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 	if (!dev)
 		return -ENOMEM;
 
+	if (LEAF_PRODUCT_ID(id->idProduct)) {
+		dev->family = KVASER_LEAF;
+	} else if (USBCAN_PRODUCT_ID(id->idProduct)) {
+		dev->family = KVASER_USBCAN;
+	} else {
+		dev_err(&intf->dev,
+			"Product ID (%d) does not belong to any known Kvaser USB family",
+			id->idProduct);
+		return -ENODEV;
+	}
+
 	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
 	if (err) {
 		dev_err(&intf->dev, "Cannot get usb endpoint(s)");



^ permalink raw reply related	[relevance 39%]

* Re: [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family
  @ 2015-01-08 15:19 79%           ` Ahmed S. Darwish
    2015-01-09  3:06 99%           ` Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-08 15:19 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi Marc,

On Thu, Jan 08, 2015 at 12:53:37PM +0100, Marc Kleine-Budde wrote:
> On 01/05/2015 07:31 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
[...]
> > +/* Kvaser USB CAN dongles are divided into two major families:
> > + * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
> > + * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
> > + */
> > +enum kvaser_usb_family {
> > +	KVASER_LEAF,
> > +	KVASER_USBCAN,
> > +};
> > +
> >  #define MAX_TX_URBS			16
> >  #define MAX_RX_URBS			4
> >  #define START_TIMEOUT			1000 /* msecs */
> > @@ -30,8 +41,9 @@
> >  #define RX_BUFFER_SIZE			3072
> >  #define CAN_USB_CLOCK			8000000
> >  #define MAX_NET_DEVICES			3
> > +#define MAX_USBCAN_NET_DEVICES		2
> >  
> > -/* Kvaser USB devices */
> > +/* Leaf USB devices */
> >  #define KVASER_VENDOR_ID		0x0bfd
> >  #define USB_LEAF_DEVEL_PRODUCT_ID	10
> >  #define USB_LEAF_LITE_PRODUCT_ID	11
> > @@ -55,6 +67,16 @@
> >  #define USB_CAN_R_PRODUCT_ID		39
> >  #define USB_LEAF_LITE_V2_PRODUCT_ID	288
> >  #define USB_MINI_PCIE_HS_PRODUCT_ID	289
> > +#define LEAF_PRODUCT_ID(id)		(id >= USB_LEAF_DEVEL_PRODUCT_ID && \
> > +					 id <= USB_MINI_PCIE_HS_PRODUCT_ID)
> 
> Can you please convert both *_PRODUCT_ID() macros into static inline
> bool functions.
> 

Will do.

[...]
> >  MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
> > @@ -463,7 +631,18 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
> >  	if (err)
> >  		return err;
> >  
> > -	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
> > +		break;
> > +	case KVASER_USBCAN:
> > +		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> > +		return -EINVAL;
> 
> The default case should not happen. I think you can remove it.
> 

It's true, it _should_ never happen. But I only add such checks if
the follow-up code critically depends on a certain `dev->family`
behavior. So it's kind of a defensive check against any possible
bug in driver or memory.

What do you think?

> > +	}
> >  
> >  	return 0;
> >  }
> > @@ -484,6 +663,9 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
> >  	dev->nchannels = msg.u.cardinfo.nchannels;
> >  	if (dev->nchannels > MAX_NET_DEVICES)
> >  		return -EINVAL;
> > +	if (dev->family == KVASER_USBCAN &&
> > +	    dev->nchannels > MAX_USBCAN_NET_DEVICES)
> > +		return -EINVAL;
> 
> Nitpick, as the new "if" also does a test on nchannels, why no extend
> the existing "if" with an "||"?
> 

Will do.

> >  
> >  	return 0;
> >  }
> > @@ -496,8 +678,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
> >  	struct kvaser_usb_net_priv *priv;
> >  	struct sk_buff *skb;
> >  	struct can_frame *cf;
> > -	u8 channel = msg->u.tx_acknowledge.channel;
> > -	u8 tid = msg->u.tx_acknowledge.tid;
> > +	u8 channel, tid;
> > +
> > +	channel = msg->u.tx_acknowledge_header.channel;
> > +	tid = msg->u.tx_acknowledge_header.tid;
> >  
> >  	if (channel >= dev->nchannels) {
> >  		dev_err(dev->udev->dev.parent,
> > @@ -615,37 +799,80 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
> >  		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
> >  }
> >  
> > -static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> > -				const struct kvaser_msg *msg)
> > +static void kvaser_report_error_event(const struct kvaser_usb *dev,
> > +				      struct kvaser_error_summary *es);
> 
> Please rearange your code that forward declarations are not needed (if
> possible - I haven't checked, though).
> 

I originally did it that way, but it completely messed up with the
patch if I do such rearrangement. A huge block of code gets removed
at top of the patch, and it got added again at the end, making the
actual important lines changed _within_ such big block non-apparent.

Maybe I should do the re-arrangement in a follow-up patch?

> > +
> > +/* Report error to userspace iff the controller's errors counter has
> > + * increased, or we're the only channel seeing the bus error state.
> > + *
> > + * As reported by USBCAN sheets, "the CAN controller has difficulties
> > + * to tell whether an error frame arrived on channel 1 or on channel 2."
> > + * Thus, error counters are compared with their earlier values to
> > + * determine which channel was responsible for the error event.
> 
> Your code doesn't match this comment. You compare the error counters
> against the old values to tell if it's a rx or tx error, the channel
> information from the struct kvaser_error_summary is used directly.
> 

Hmmm, good catch, upon a second look, the code looks a bit deceiving.
The comments, meanwhile, are taken verbatim from the Kvaser sheets:

  http://www.kvaser.com/canlib-webhelp/page_hardware_specific_can_controllers.html
  Archived at http://www.webcitation.org/6VQd87jIA

So, what happens is that the firmware does not tell us whether the
received error event is for ch0 or ch1. But it gives us the (possibly
new) error counters for both of them:

struct usbcan_msg_error_event {
        [..]
        u8 tx_errors_count_ch0;
        u8 rx_errors_count_ch0;
        u8 tx_errors_count_ch1;
        u8 rx_errors_count_ch1;
	[..]
} __packed;

We loop over each channel, and report an error to userspace if extra
errors were spotted for such channel.

kvaser_error_summary is not a firmware command or response. Since
the wire format for an error event differs between Leaf and Usbcan,
it's just our way to summarize an error whether it's from any of
them, and this is where the conflict stems from:

- If it's a Leaf-device, `error_summary->channel' is the excat
channel reported by the firmware
- If it's a Usbcan device, `error_summary->channel' is just a mark
to check the error counters for such a channel and report error
to userspace iff they've increased.

Thus the raison d'etre for `error_summary' struct is to share
the error handling code between Leaf and UsbcanII devices since
it's quite similar.

So to clear up this conflict, I suggest the following error_summary
layout:

struct kvaser_error_summary {
       union {
                struct {
                       u8 channel;
                } leaf;
                struct {
                       u8 possible_channel;
                } usbcan;
       };
       /* Rest of layout is left as-is */
}

This way, it's clear that in case of Usbcan, channel is not final
and we need further work to do. What do you think?

(BTW, any better name than "kvaser_error_summary"? It conflicts a
little bit with the namespace used for the packed wire format
structures "kvaser_*")

> > + */
> > +static void usbcan_report_error_if_applicable(const struct kvaser_usb *dev,
> > +					      struct kvaser_error_summary *es)
> 
> Nitpick: can you please add a "kvaser_" prefix to the all usbcan_* and
> leaf_* functions.
> 

Will do.

> >  {
> > -	struct can_frame *cf;
> > -	struct sk_buff *skb;
> > -	struct net_device_stats *stats;
> >  	struct kvaser_usb_net_priv *priv;
> > -	unsigned int new_state;
> > -	u8 channel, status, txerr, rxerr, error_factor;
> > +	int old_tx_err_count, old_rx_err_count, channel, report_error;
> 
> bool report_error;
> 

Will do.

> > +
> > +	channel = es->channel;
> > +	if (channel >= dev->nchannels) {
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid channel number (%d)\n", channel);
> > +		return;
> > +	}
> > +
> > +	priv = dev->nets[channel];
> > +	old_tx_err_count = priv->bec.txerr;
> > +	old_rx_err_count = priv->bec.rxerr;
> 
> Why do you make a copy of priv->bec, AFAICS you can use
> priv->bec.{r,t}xerr directly?
> 

Initially, just for clarity, as I was not used yet to socketCAN
structures ;-) will remove it in the next submission.

> > +
> > +	report_error = 0;
> > +	if (es->txerr > old_tx_err_count) {
> > +		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
> > +		report_error = 1;
> > +	}
> > +	if (es->rxerr > old_rx_err_count) {
> > +		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
> > +		report_error = 1;
> > +	}
> > +	if ((es->status & M16C_STATE_BUS_ERROR) &&
> > +	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
> > +		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
> > +		report_error = 1;
> > +	}
> > +
> > +	if (report_error)
> > +		kvaser_report_error_event(dev, es);
> > +}
> > +
> > +/* Extract error summary from a Leaf-based device error message */
> > +static void leaf_extract_error_from_msg(const struct kvaser_usb *dev,
> > +					const struct kvaser_msg *msg)
> > +{
> > +	struct kvaser_error_summary es = { 0, };
> 
> IIRC "es = { };" should be sufficient.
> 

Correct, will do.

> >  
> >  	switch (msg->id) {
> >  	case CMD_CAN_ERROR_EVENT:
> > -		channel = msg->u.error_event.channel;
> > -		status =  msg->u.error_event.status;
> > -		txerr = msg->u.error_event.tx_errors_count;
> > -		rxerr = msg->u.error_event.rx_errors_count;
> > -		error_factor = msg->u.error_event.error_factor;
> > +		es.channel = msg->u.leaf.error_event.channel;
> > +		es.status =  msg->u.leaf.error_event.status;
> > +		es.txerr = msg->u.leaf.error_event.tx_errors_count;
> > +		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
> > +		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
> >  		break;
> > -	case CMD_LOG_MESSAGE:
> > -		channel = msg->u.log_message.channel;
> > -		status = msg->u.log_message.data[0];
> > -		txerr = msg->u.log_message.data[2];
> > -		rxerr = msg->u.log_message.data[3];
> > -		error_factor = msg->u.log_message.data[1];
> > +	case CMD_LEAF_LOG_MESSAGE:
> > +		es.channel = msg->u.leaf.log_message.channel;
> > +		es.status = msg->u.leaf.log_message.data[0];
> > +		es.txerr = msg->u.leaf.log_message.data[2];
> > +		es.rxerr = msg->u.leaf.log_message.data[3];
> > +		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
> >  		break;
> >  	case CMD_CHIP_STATE_EVENT:
> > -		channel = msg->u.chip_state_event.channel;
> > -		status =  msg->u.chip_state_event.status;
> > -		txerr = msg->u.chip_state_event.tx_errors_count;
> > -		rxerr = msg->u.chip_state_event.rx_errors_count;
> > -		error_factor = 0;
> > +		es.channel = msg->u.leaf.chip_state_event.channel;
> > +		es.status =  msg->u.leaf.chip_state_event.status;
> > +		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
> > +		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
> > +		es.leaf.error_factor = 0;
> >  		break;
> >  	default:
> >  		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
> > @@ -653,16 +880,92 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  		return;
> >  	}
> >  
> > -	if (channel >= dev->nchannels) {
> > +	kvaser_report_error_event(dev, &es);
> > +}
> > +
> > +/* Extract error summary from a USBCANII-based device error message */
> > +static void usbcan_extract_error_from_msg(const struct kvaser_usb *dev,
> > +					  const struct kvaser_msg *msg)
> > +{
> > +	struct kvaser_error_summary es = { 0, };
> 
> same here.
> 

Ditto.

> > +
> > +	switch (msg->id) {
> > +	/* Sometimes errors are sent as unsolicited chip state events */
> > +	case CMD_CHIP_STATE_EVENT:
> > +		es.channel = msg->u.usbcan.chip_state_event.channel;
> > +		es.status =  msg->u.usbcan.chip_state_event.status;
> > +		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
> > +		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
> > +		usbcan_report_error_if_applicable(dev, &es);
> > +		break;
> > +
> > +	case CMD_CAN_ERROR_EVENT:
> > +		es.channel = 0;
> > +		es.status = msg->u.usbcan.error_event.status_ch0;
> > +		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
> > +		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
> > +		es.usbcan.other_ch_status =
> > +			msg->u.usbcan.error_event.status_ch1;
> > +		usbcan_report_error_if_applicable(dev, &es);
> > +
> > +		/* For error events, the USBCAN firmware does not support
> > +		 * more than 2 channels: ch0, and ch1.
> > +		 */
> > +		if (dev->nchannels > 1) {
> > +			es.channel = 1;
> 
> Why is channel == 1 if the device has more than 1 channel?
> 

This is related to the "kvaser_error_summary" discussion above
where "channel" is only a suggestion for checking the error
counters.

If the Usbcan device has only one channel, then there's no need
to check if the "tx_errors_count_ch1", "rx_errors_count_ch1" has
increased or not. Their values are undefined.

> > +			es.status = msg->u.usbcan.error_event.status_ch1;
> > +			es.txerr =
> > +				msg->u.usbcan.error_event.tx_errors_count_ch1;
> > +			es.rxerr =
> > +				msg->u.usbcan.error_event.rx_errors_count_ch1;
> > +			es.usbcan.other_ch_status =
> > +				msg->u.usbcan.error_event.status_ch0;
> > +			usbcan_report_error_if_applicable(dev, &es);
> > +		}
> > +		break;
> > +
> > +	default:
> > +		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
> > +			msg->id);
> > +	}
> > +}
> > +
> > +static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> > +				const struct kvaser_msg *msg)
> > +{
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		leaf_extract_error_from_msg(dev, msg);
> > +		break;
> > +	case KVASER_USBCAN:
> > +		usbcan_extract_error_from_msg(dev, msg);
> > +		break;
> > +	default:
> should not happen.

Yes. As in above, will wait your input in this regarding the checks
being defensive.

[...]
> > +	case KVASER_USBCAN:
> > +		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
> > +			stats->tx_errors++;
> > +		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
> > +			stats->rx_errors++;
> > +		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
> > +			priv->can.can_stats.bus_error++;
> > +			cf->can_id |= CAN_ERR_BUSERROR;
> > +		}
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> > +		goto err;
> 
> should not happen.
> 

Ditto.

> >  	}
> >  
> > -	cf->data[6] = txerr;
> > -	cf->data[7] = rxerr;
> > +	cf->data[6] = es->txerr;
> > +	cf->data[7] = es->rxerr;
> >  
> > -	priv->bec.txerr = txerr;
> > -	priv->bec.rxerr = rxerr;
> > +	priv->bec.txerr = es->txerr;
> > +	priv->bec.rxerr = es->rxerr;
> >  
> >  	priv->can.state = new_state;
> >  
> > @@ -774,6 +1093,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >  
> >  	stats->rx_packets++;
> >  	stats->rx_bytes += cf->can_dlc;
> > +
> > +	return;
> > +
> > +err:
> > +	dev_kfree_skb(skb);
> >  }
> >  
> >  static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
> > @@ -783,16 +1107,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
> >  	struct sk_buff *skb;
> >  	struct net_device_stats *stats = &priv->netdev->stats;
> >  
> > -	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
> > +	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
> >  					 MSG_FLAG_NERR)) {
> >  		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
> > -			   msg->u.rx_can.flag);
> > +			   msg->u.rx_can_header.flag);
> >  
> >  		stats->rx_errors++;
> >  		return;
> >  	}
> >  
> > -	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
> > +	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
> >  		skb = alloc_can_err_skb(priv->netdev, &cf);
> > 		if (!skb) {
> > 			stats->rx_dropped++;
> > 			return;
> > 		}
> 
> Can you prepare a (seperate) patch that does the stats, even in case of OOM here. Same for kvaser_report_error_event()

Sure.

In kvaser_report_error_event() though, isn't it a little bit tricky?
Specially in fragments as in below:

        switch (dev->family) {
        case KVASER_LEAF:
                if (es->leaf.error_factor) {
                        priv->can.can_stats.bus_error++;
                        stats->rx_errors++;

                        cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
                        [...]
                }
                break;
        case KVASER_USBCAN:
                if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
                        stats->tx_errors++;
                if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
                        stats->rx_errors++;
                if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
                        priv->can.can_stats.bus_error++;
                        cf->can_id |= CAN_ERR_BUSERROR;
                }
                break;
        }

IMHO, there will be some duplication of the above fragment. Once
to handle "stats->*", and once to handle packet-related "cf->*" stuff.
Also in the Usbcan case, it will clutter the error_state checks in
different places.

> 
> > 
> > 		cf->can_id |= CAN_ERR_CRTL;
> > 		cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
> > 
> > 		stats->rx_over_errors++;
> > 		stats->rx_errors++;
> > 
> > 		netif_rx(skb);
> > 
> > 		stats->rx_packets++;
> > 		stats->rx_bytes += cf->can_dlc;
> 
> Another patch would be not to touch cf after netif_rx(), please move the stats handling before calling netif_rx(). Same applies to the kvaser_usb_rx_can_msg() function.
> 

Indeed, these can be totally bogus values after netif_rx().
I'll introduce this as a new patch in a new series.

Thanks for your review!

Regards,
Darwish

^ permalink raw reply	[relevance 79%]

* Re: [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family
    2015-01-08 15:19 79%           ` Ahmed S. Darwish
@ 2015-01-09  3:06 99%           ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-09  3:06 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

On Thu, Jan 08, 2015 at 12:53:37PM +0100, Marc Kleine-Budde wrote:
> On 01/05/2015 07:31 PM, Ahmed S. Darwish wrote:
> > 

[...]

> > 
> > 		cf->can_id |= CAN_ERR_CRTL;
> > 		cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
> > 
> > 		stats->rx_over_errors++;
> > 		stats->rx_errors++;
> > 
> > 		netif_rx(skb);
> > 
> > 		stats->rx_packets++;
> > 		stats->rx_bytes += cf->can_dlc;
> 
> Another patch would be not to touch cf after netif_rx(), please move the stats handling before calling netif_rx(). Same applies to the kvaser_usb_rx_can_msg() function.
> 

BTW, is it guaranteed from the SocketCAN stack that netif_rx()
will never return NET_RX_DROPPED? Because if no guarantee
exists, I guess below fragment cannot be completely correct?

        stats->rx_packets++;
        stats->rx_bytes += cf->can_dlc;
        netif_rx(skb);

On the other hand, I don't see evan a single CAN driver checking
netif_rx() return value, so maybe such a check is an overkill...

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: Fix for synchronization issue in IPV6 implementation in smack module(v3.18)
  @ 2015-01-09 13:42 95% ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-09 13:42 UTC (permalink / raw)
  To: Vishal Goel; +Cc: linux-kernel, linux-security-module, casey

Hi Vishal,

On Thu, Jan 08, 2015 at 10:11:52PM +0530, Vishal Goel wrote:
>  [PATCH] This patch fixes the synchronization issue in IPv6
>  implementation. Previously there was no synchronization mechanism used while
>  accessing(adding/reading/deletion) smk_ipv6_port_list. It could be possible
>  that when one thread is reading the list, at the same time another thread is
>  adding/deleting in the list.So it is possible that reader thread will read
>  the inaccurate or incomplete list. So to make sure that reader thread will
>  read the accurate list, rcu mechanism has been used while accessing the
>  list.RCU allows readers to access a data structure even when it is in the
>  process of being updated
> 
> Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
>             Himanshu Shukla <himanshu.sh@samsung.com>

The legality of your patches are blurry. You're sending from
a personal email, while having Signed-off-by signatures by your
employer.

You **really** need to add a "From: XXX@samsung.com" header on
the very first line of your emails if this is a sponsored work.
Kindly check Documentation/SubmittingPatches for further details.

Beside the above:

- Your patches are not applicable to the tree since they're
  white-spaces mangled. You're using Gmail's web interface, which
  is well known at converting tabs to white-spaces. Check
  Documentation/email-clients.txt for further details.

- Please fix you Subject line. Make it something in the form of:
  [PATCH 1/3] smack: Fix xxx

- No need for "[PATCH]" in the commit log body, only in the
  subject line.

- Please make the commit message more comprehensible. Check
  the kernel git log history for good examples. A grammar
  check will also be nice; there are a number of free good
  tools on the web.

- Add "Signed-off-by" headers for each developer. In the patch
  above, you'll need _two_ "Signed-off-by" lines.

- You're sending multiple related patches, but posting each one
  in its own thread. This will make it very very hard for review,
  especially in a very busy list like LKML. Please send related
  patches in an "email thread", with clear sequence numbers.

  (e.g., your follow-up patch titled as "In Ref to previous 3
  patches:Fix for synchronization..." is completely bogus.)

Happy kernel coding :-)

Regards,
Darwish

> ---
>  security/smack/smack_lsm.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index d515ec2..b3427ee 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -52,6 +52,7 @@
>  #define SMK_RECEIVING  1
>  #define SMK_SENDING    2
> 
> +DEFINE_MUTEX(smack_ipv6_lock);
>  LIST_HEAD(smk_ipv6_port_list);
> 
>  #ifdef CONFIG_SECURITY_SMACK_BRINGUP
> @@ -2232,17 +2233,20 @@ static void smk_ipv6_port_label(struct socket
> *sock, struct sockaddr *address)
>              * on the bound socket. Take the changes to the port
>              * as well.
>              */
> -           list_for_each_entry(spp, &smk_ipv6_port_list, list) {
> +           rcu_read_lock();
> +           list_for_each_entry_rcu(spp, &smk_ipv6_port_list, list) {
>                   if (sk != spp->smk_sock)
>                         continue;
>                   spp->smk_in = ssp->smk_in;
>                   spp->smk_out = ssp->smk_out;
> +                 rcu_read_unlock();
>                   return;
>             }
>             /*
>              * A NULL address is only used for updating existing
>              * bound entries. If there isn't one, it's OK.
>              */
> +           rcu_read_unlock();
>             return;
>       }
> 
> @@ -2258,16 +2262,18 @@ static void smk_ipv6_port_label(struct socket
> *sock, struct sockaddr *address)
>        * Look for an existing port list entry.
>        * This is an indication that a port is getting reused.
>        */
> -     list_for_each_entry(spp, &smk_ipv6_port_list, list) {
> +     rcu_read_lock();
> +     list_for_each_entry_rcu(spp, &smk_ipv6_port_list, list) {
>             if (spp->smk_port != port)
>                   continue;
>             spp->smk_port = port;
>             spp->smk_sock = sk;
>             spp->smk_in = ssp->smk_in;
>             spp->smk_out = ssp->smk_out;
> +           rcu_read_unlock();
>             return;
>       }
> -
> +     rcu_read_unlock();
>       /*
>        * A new port entry is required.
>        */
> @@ -2280,7 +2286,9 @@ static void smk_ipv6_port_label(struct socket
> *sock, struct sockaddr *address)
>       spp->smk_in = ssp->smk_in;
>       spp->smk_out = ssp->smk_out;
> 
> -     list_add(&spp->list, &smk_ipv6_port_list);
> +     mutex_lock(&smack_ipv6_lock);
> +     list_add_rcu(&spp->list, &smk_ipv6_port_list);
> +     mutex_unlock(&smack_ipv6_lock);
>       return;
>  }
> 
> @@ -2335,8 +2343,8 @@ static int smk_ipv6_port_check(struct sock *sk,
> struct sockaddr_in6 *address,
>             skp = &smack_known_web;
>             goto auditout;
>       }
> -
> -     list_for_each_entry(spp, &smk_ipv6_port_list, list) {
> +     rcu_read_lock();
> +     list_for_each_entry_rcu(spp, &smk_ipv6_port_list, list) {
>             if (spp->smk_port != port)
>                   continue;
>             object = spp->smk_in;
> @@ -2344,6 +2352,7 @@ static int smk_ipv6_port_check(struct sock *sk,
> struct sockaddr_in6 *address,
>                   ssp->smk_packet = spp->smk_out;
>             break;
>       }
> +     rcu_read_unlock();
> 
>  auditout:
> 
> --
> 1.8.3.2
> --

^ permalink raw reply	[relevance 95%]

* [PATCH v4 00/04] can: Introduce support for Kvaser USBCAN-II devices
  2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
                   ` (4 preceding siblings ...)
  2015-01-05 17:49 93% ` [PATCH v3 1/4] " Ahmed S. Darwish
@ 2015-01-11 20:05 99% ` Ahmed S. Darwish
  2015-01-11 20:11 97%   ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() devices Ahmed S. Darwish
  2015-01-20 21:44 70% ` [PATCH v5 1/5] can: kvaser_usb: Update net interface state before exiting on OOM Ahmed S. Darwish
  2015-01-26  5:17 74% ` [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support Ahmed S. Darwish
  7 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-11 20:05 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi,

Now since earlier v3 submission patches #1-3 got merged, this
is a new patch series expanding on patch v3 #4: support for
the USBCAN-II family.

A new series is introduced due to the extra additions suggested
by code review, which required being added in their own
self-contained patches.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() devices
  2015-01-11 20:05 99% ` [PATCH v4 00/04] can: Introduce support for Kvaser USBCAN-II devices Ahmed S. Darwish
@ 2015-01-11 20:11 97%   ` Ahmed S. Darwish
  2015-01-11 20:15 99%     ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Ahmed S. Darwish
  2015-01-11 20:49 97%     ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() Ahmed S. Darwish
  0 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-11 20:11 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

We should not touch the packet after a netif_rx: it might
get freed behind our back.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index cc7bfc0..c32cd61 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -520,10 +520,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (skb) {
 			cf->can_id |= CAN_ERR_RESTARTED;
-			netif_rx(skb);
 
 			stats->rx_packets++;
 			stats->rx_bytes += cf->can_dlc;
+			netif_rx(skb);
 		} else {
 			netdev_err(priv->netdev,
 				   "No memory left for err_skb\n");
@@ -770,10 +770,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	priv->can.state = new_state;
 
-	netif_rx(skb);
-
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+	netif_rx(skb);
 }
 
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
@@ -805,10 +804,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		stats->rx_over_errors++;
 		stats->rx_errors++;
 
-		netif_rx(skb);
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
+		netif_rx(skb);
 	}
 }
 
@@ -887,10 +885,9 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 			       cf->can_dlc);
 	}
 
-	netif_rx(skb);
-
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+	netif_rx(skb);
 }
 
 static void kvaser_usb_start_chip_reply(const struct kvaser_usb *dev,
-- 
1.9.1


^ permalink raw reply related	[relevance 97%]

* [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM
  2015-01-11 20:11 97%   ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() devices Ahmed S. Darwish
@ 2015-01-11 20:15 99%     ` Ahmed S. Darwish
  2015-01-11 20:36 39%       ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
    2015-01-11 20:49 97%     ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() Ahmed S. Darwish
  1 sibling, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-11 20:15 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Let the error counters be more accurate in case of Out of
Memory conditions.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index c32cd61..0eb870b 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -792,6 +792,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	}
 
 	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+		stats->rx_over_errors++;
+		stats->rx_errors++;
+
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -801,9 +804,6 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		cf->can_id |= CAN_ERR_CRTL;
 		cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
 
-		stats->rx_over_errors++;
-		stats->rx_errors++;
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
 		netif_rx(skb);
-- 
1.9.1


^ permalink raw reply related	[relevance 99%]

* [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family
  2015-01-11 20:15 99%     ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Ahmed S. Darwish
@ 2015-01-11 20:36 39%       ` Ahmed S. Darwish
  2015-01-11 20:45 92%         ` [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
                           ` (3 more replies)
    1 sibling, 4 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-11 20:36 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
divided into two major families: 'Leaf', and 'UsbcanII'.  From an
Operating System perspective, the firmware of both families behave
in a not too drastically different fashion.

This patch adds support for the UsbcanII family of devices to the
current Kvaser Leaf-only driver.

CAN frames sending, receiving, and error handling paths has been
tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
should also work nicely with other products in the same category.

List of new devices supported by this driver update:

         - Kvaser USBcan II HS/HS
         - Kvaser USBcan II HS/LS
         - Kvaser USBcan Rugged ("USBcan Rev B")
         - Kvaser Memorator HS/HS
         - Kvaser Memorator HS/LS
         - Scania VCI2 (if you have the Kvaser logo on top)

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/Kconfig      |   8 +-
 drivers/net/can/usb/kvaser_usb.c | 612 ++++++++++++++++++++++++++++++---------
 2 files changed, 487 insertions(+), 133 deletions(-)

** V4 Changelog:
- Use type-safe C methods instead of cpp macros
- Further clarify the code and comments on error events channel arbitration
- Remove defensive checks against non-existing families
- Re-order methods to remove forward declarations
- Smaller stuff spotted by earlier review (function prefexes, etc.)

** V3 Changelog:
- Fix padding for the usbcan_msg_tx_acknowledge command
- Remove kvaser_usb->max_channels and the MAX_NET_DEVICES macro
- Rename commands to CMD_LEAF_xxx and CMD_USBCAN_xxx
- Apply checkpatch.pl suggestions ('net/' comments, multi-line strings, etc.)

** V2 Changelog:
- Update Kconfig entries
- Use actual number of CAN channels (instead of max) where appropriate
- Rebase over a new set of UsbcanII-independent driver fixes

diff --git a/drivers/net/can/usb/Kconfig b/drivers/net/can/usb/Kconfig
index a77db919..f6f5500 100644
--- a/drivers/net/can/usb/Kconfig
+++ b/drivers/net/can/usb/Kconfig
@@ -25,7 +25,7 @@ config CAN_KVASER_USB
 	tristate "Kvaser CAN/USB interface"
 	---help---
 	  This driver adds support for Kvaser CAN/USB devices like Kvaser
-	  Leaf Light.
+	  Leaf Light and Kvaser USBcan II.
 
 	  The driver provides support for the following devices:
 	    - Kvaser Leaf Light
@@ -46,6 +46,12 @@ config CAN_KVASER_USB
 	    - Kvaser USBcan R
 	    - Kvaser Leaf Light v2
 	    - Kvaser Mini PCI Express HS
+	    - Kvaser USBcan II HS/HS
+	    - Kvaser USBcan II HS/LS
+	    - Kvaser USBcan Rugged ("USBcan Rev B")
+	    - Kvaser Memorator HS/HS
+	    - Kvaser Memorator HS/LS
+	    - Scania VCI2 (if you have the Kvaser logo on top)
 
 	  If unsure, say N.
 
diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 0eb870b..da47d17 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -6,10 +6,12 @@
  * Parts of this driver are based on the following:
  *  - Kvaser linux leaf driver (version 4.78)
  *  - CAN driver for esd CAN-USB/2
+ *  - Kvaser linux usbcanII driver (version 5.3)
  *
  * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
  * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
  * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
+ * Copyright (C) 2015 Valeo Corporation
  */
 
 #include <linux/completion.h>
@@ -21,6 +23,15 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
+/* Kvaser USB CAN dongles are divided into two major families:
+ * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
+ * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
+ */
+enum kvaser_usb_family {
+	KVASER_LEAF,
+	KVASER_USBCAN,
+};
+
 #define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
@@ -30,8 +41,9 @@
 #define RX_BUFFER_SIZE			3072
 #define CAN_USB_CLOCK			8000000
 #define MAX_NET_DEVICES			3
+#define MAX_USBCAN_NET_DEVICES		2
 
-/* Kvaser USB devices */
+/* Leaf USB devices */
 #define KVASER_VENDOR_ID		0x0bfd
 #define USB_LEAF_DEVEL_PRODUCT_ID	10
 #define USB_LEAF_LITE_PRODUCT_ID	11
@@ -56,6 +68,24 @@
 #define USB_LEAF_LITE_V2_PRODUCT_ID	288
 #define USB_MINI_PCIE_HS_PRODUCT_ID	289
 
+static inline bool kvaser_is_leaf(const struct usb_device_id *id)
+{
+	return id->idProduct >= USB_LEAF_DEVEL_PRODUCT_ID &&
+	       id->idProduct <= USB_MINI_PCIE_HS_PRODUCT_ID;
+}
+
+/* USBCANII devices */
+#define USB_USBCAN_REVB_PRODUCT_ID	2
+#define USB_VCI2_PRODUCT_ID		3
+#define USB_USBCAN2_PRODUCT_ID		4
+#define USB_MEMORATOR_PRODUCT_ID	5
+
+static inline bool kvaser_is_usbcan(const struct usb_device_id *id)
+{
+	return id->idProduct >= USB_USBCAN_REVB_PRODUCT_ID &&
+	       id->idProduct <= USB_MEMORATOR_PRODUCT_ID;
+}
+
 /* USB devices features */
 #define KVASER_HAS_SILENT_MODE		BIT(0)
 #define KVASER_HAS_TXRX_ERRORS		BIT(1)
@@ -73,7 +103,7 @@
 #define MSG_FLAG_TX_ACK			BIT(6)
 #define MSG_FLAG_TX_REQUEST		BIT(7)
 
-/* Can states */
+/* Can states (M16C CxSTRH register) */
 #define M16C_STATE_BUS_RESET		BIT(0)
 #define M16C_STATE_BUS_ERROR		BIT(4)
 #define M16C_STATE_BUS_PASSIVE		BIT(5)
@@ -98,7 +128,13 @@
 #define CMD_START_CHIP_REPLY		27
 #define CMD_STOP_CHIP			28
 #define CMD_STOP_CHIP_REPLY		29
-#define CMD_GET_CARD_INFO2		32
+#define CMD_READ_CLOCK			30
+#define CMD_READ_CLOCK_REPLY		31
+
+#define CMD_LEAF_GET_CARD_INFO2		32
+#define CMD_USBCAN_RESET_CLOCK		32
+#define CMD_USBCAN_CLOCK_OVERFLOW_EVENT	33
+
 #define CMD_GET_CARD_INFO		34
 #define CMD_GET_CARD_INFO_REPLY		35
 #define CMD_GET_SOFTWARE_INFO		38
@@ -108,8 +144,9 @@
 #define CMD_RESET_ERROR_COUNTER		49
 #define CMD_TX_ACKNOWLEDGE		50
 #define CMD_CAN_ERROR_EVENT		51
-#define CMD_USB_THROTTLE		77
-#define CMD_LOG_MESSAGE			106
+
+#define CMD_LEAF_USB_THROTTLE		77
+#define CMD_LEAF_LOG_MESSAGE		106
 
 /* error factors */
 #define M16C_EF_ACKE			BIT(0)
@@ -121,6 +158,14 @@
 #define M16C_EF_RCVE			BIT(6)
 #define M16C_EF_TRE			BIT(7)
 
+/* Only Leaf-based devices can report M16C error factors,
+ * thus define our own error status flags for USBCANII
+ */
+#define USBCAN_ERROR_STATE_NONE		0
+#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
+#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
+#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
+
 /* bittiming parameters */
 #define KVASER_USB_TSEG1_MIN		1
 #define KVASER_USB_TSEG1_MAX		16
@@ -137,7 +182,7 @@
 #define KVASER_CTRL_MODE_SELFRECEPTION	3
 #define KVASER_CTRL_MODE_OFF		4
 
-/* log message */
+/* Extended CAN identifier flag */
 #define KVASER_EXTENDED_FRAME		BIT(31)
 
 struct kvaser_msg_simple {
@@ -148,30 +193,55 @@ struct kvaser_msg_simple {
 struct kvaser_msg_cardinfo {
 	u8 tid;
 	u8 nchannels;
-	__le32 serial_number;
-	__le32 padding;
+	union {
+		struct {
+			__le32 serial_number;
+			__le32 padding;
+		} __packed leaf0;
+		struct {
+			__le32 serial_number_low;
+			__le32 serial_number_high;
+		} __packed usbcan0;
+	} __packed;
 	__le32 clock_resolution;
 	__le32 mfgdate;
 	u8 ean[8];
 	u8 hw_revision;
-	u8 usb_hs_mode;
-	__le16 padding2;
+	union {
+		struct {
+			u8 usb_hs_mode;
+		} __packed leaf1;
+		struct {
+			u8 padding;
+		} __packed usbcan1;
+	} __packed;
+	__le16 padding;
 } __packed;
 
 struct kvaser_msg_cardinfo2 {
 	u8 tid;
-	u8 channel;
+	u8 reserved;
 	u8 pcb_id[24];
 	__le32 oem_unlock_code;
 } __packed;
 
-struct kvaser_msg_softinfo {
+struct leaf_msg_softinfo {
 	u8 tid;
-	u8 channel;
+	u8 padding0;
 	__le32 sw_options;
 	__le32 fw_version;
 	__le16 max_outstanding_tx;
-	__le16 padding[9];
+	__le16 padding1[9];
+} __packed;
+
+struct usbcan_msg_softinfo {
+	u8 tid;
+	u8 fw_name[5];
+	__le16 max_outstanding_tx;
+	u8 padding[6];
+	__le32 fw_version;
+	__le16 checksum;
+	__le16 sw_options;
 } __packed;
 
 struct kvaser_msg_busparams {
@@ -188,36 +258,86 @@ struct kvaser_msg_tx_can {
 	u8 channel;
 	u8 tid;
 	u8 msg[14];
-	u8 padding;
-	u8 flags;
+	union {
+		struct {
+			u8 padding;
+			u8 flags;
+		} __packed leaf;
+		struct {
+			u8 flags;
+			u8 padding;
+		} __packed usbcan;
+	} __packed;
+} __packed;
+
+struct kvaser_msg_rx_can_header {
+	u8 channel;
+	u8 flag;
 } __packed;
 
-struct kvaser_msg_rx_can {
+struct leaf_msg_rx_can {
 	u8 channel;
 	u8 flag;
+
 	__le16 time[3];
 	u8 msg[14];
 } __packed;
 
-struct kvaser_msg_chip_state_event {
+struct usbcan_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
+	u8 msg[14];
+	__le16 time;
+} __packed;
+
+struct leaf_msg_chip_state_event {
 	u8 tid;
 	u8 channel;
+
 	__le16 time[3];
 	u8 tx_errors_count;
 	u8 rx_errors_count;
+
 	u8 status;
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_tx_acknowledge {
+struct usbcan_msg_chip_state_event {
+	u8 tid;
+	u8 channel;
+
+	u8 tx_errors_count;
+	u8 rx_errors_count;
+	__le16 time;
+
+	u8 status;
+	u8 padding[3];
+} __packed;
+
+struct kvaser_msg_tx_acknowledge_header {
+	u8 channel;
+	u8 tid;
+};
+
+struct leaf_msg_tx_acknowledge {
 	u8 channel;
 	u8 tid;
+
 	__le16 time[3];
 	u8 flags;
 	u8 time_offset;
 } __packed;
 
-struct kvaser_msg_error_event {
+struct usbcan_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
+	__le16 time;
+	__le16 padding;
+} __packed;
+
+struct leaf_msg_error_event {
 	u8 tid;
 	u8 flags;
 	__le16 time[3];
@@ -229,6 +349,18 @@ struct kvaser_msg_error_event {
 	u8 error_factor;
 } __packed;
 
+struct usbcan_msg_error_event {
+	u8 tid;
+	u8 padding;
+	u8 tx_errors_count_ch0;
+	u8 rx_errors_count_ch0;
+	u8 tx_errors_count_ch1;
+	u8 rx_errors_count_ch1;
+	u8 status_ch0;
+	u8 status_ch1;
+	__le16 time;
+} __packed;
+
 struct kvaser_msg_ctrl_mode {
 	u8 tid;
 	u8 channel;
@@ -243,7 +375,7 @@ struct kvaser_msg_flush_queue {
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_log_message {
+struct leaf_msg_log_message {
 	u8 channel;
 	u8 flags;
 	__le16 time[3];
@@ -260,19 +392,57 @@ struct kvaser_msg {
 		struct kvaser_msg_simple simple;
 		struct kvaser_msg_cardinfo cardinfo;
 		struct kvaser_msg_cardinfo2 cardinfo2;
-		struct kvaser_msg_softinfo softinfo;
 		struct kvaser_msg_busparams busparams;
+
+		struct kvaser_msg_rx_can_header rx_can_header;
+		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
+
+		union {
+			struct leaf_msg_softinfo softinfo;
+			struct leaf_msg_rx_can rx_can;
+			struct leaf_msg_chip_state_event chip_state_event;
+			struct leaf_msg_tx_acknowledge tx_acknowledge;
+			struct leaf_msg_error_event error_event;
+			struct leaf_msg_log_message log_message;
+		} __packed leaf;
+
+		union {
+			struct usbcan_msg_softinfo softinfo;
+			struct usbcan_msg_rx_can rx_can;
+			struct usbcan_msg_chip_state_event chip_state_event;
+			struct usbcan_msg_tx_acknowledge tx_acknowledge;
+			struct usbcan_msg_error_event error_event;
+		} __packed usbcan;
+
 		struct kvaser_msg_tx_can tx_can;
-		struct kvaser_msg_rx_can rx_can;
-		struct kvaser_msg_chip_state_event chip_state_event;
-		struct kvaser_msg_tx_acknowledge tx_acknowledge;
-		struct kvaser_msg_error_event error_event;
 		struct kvaser_msg_ctrl_mode ctrl_mode;
 		struct kvaser_msg_flush_queue flush_queue;
-		struct kvaser_msg_log_message log_message;
 	} u;
 } __packed;
 
+/* Summary of a kvaser error event, for a unified Leaf/Usbcan error
+ * handling. Some discrepancies between the two families exist:
+ *
+ * - USBCAN firmware does not report M16C "error factors"
+ * - USBCAN controllers has difficulties reporting if the raised error
+ *   event is for ch0 or ch1. They leave such arbitration to the OS
+ *   driver by letting it compare error counters with previous values
+ *   and decide the error event's channel. Thus for USBCAN, the channel
+ *   field is only advisory.
+ */
+struct kvaser_error_summary {
+	u8 channel, status, txerr, rxerr;
+	union {
+		struct {
+			u8 error_factor;
+		} leaf;
+		struct {
+			u8 other_ch_status;
+			u8 error_state;
+		} usbcan;
+	};
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -288,6 +458,7 @@ struct kvaser_usb {
 
 	u32 fw_version;
 	unsigned int nchannels;
+	enum kvaser_usb_family family;
 
 	bool rxinitdone;
 	void *rxbuf[MAX_RX_URBS];
@@ -311,6 +482,7 @@ struct kvaser_usb_net_priv {
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
+	/* Leaf family IDs */
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
@@ -360,6 +532,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
 		.driver_info = KVASER_HAS_TXRX_ERRORS },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+
+	/* USBCANII family IDs */
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+
 	{ }
 };
 MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
@@ -463,7 +646,14 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	if (err)
 		return err;
 
-	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
+	switch (dev->family) {
+	case KVASER_LEAF:
+		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		break;
+	case KVASER_USBCAN:
+		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		break;
+	}
 
 	return 0;
 }
@@ -482,7 +672,9 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
 		return err;
 
 	dev->nchannels = msg.u.cardinfo.nchannels;
-	if (dev->nchannels > MAX_NET_DEVICES)
+	if ((dev->nchannels > MAX_NET_DEVICES) ||
+	    (dev->family == KVASER_USBCAN &&
+	     dev->nchannels > MAX_USBCAN_NET_DEVICES))
 		return -EINVAL;
 
 	return 0;
@@ -496,8 +688,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	u8 channel = msg->u.tx_acknowledge.channel;
-	u8 tid = msg->u.tx_acknowledge.tid;
+	u8 channel, tid;
+
+	channel = msg->u.tx_acknowledge_header.channel;
+	tid = msg->u.tx_acknowledge_header.tid;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -616,53 +810,24 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 }
 
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
-				const struct kvaser_msg *msg)
+				struct kvaser_error_summary *es)
 {
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
 	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
-
-	switch (msg->id) {
-	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
-		break;
-	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
-		break;
-	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
-		break;
-	default:
-		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
-			msg->id);
-		return;
-	}
 
-	if (channel >= dev->nchannels) {
+	if (es->channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid channel number (%d)\n", es->channel);
 		return;
 	}
 
-	priv = dev->nets[channel];
+	priv = dev->nets[es->channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
+	if (es->status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
@@ -675,9 +840,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	new_state = priv->can.state;
 
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	if (status & M16C_STATE_BUS_OFF) {
+	if (es->status & M16C_STATE_BUS_OFF) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
 		priv->can.can_stats.bus_off++;
@@ -687,12 +852,12 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		netif_carrier_off(priv->netdev);
 
 		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
 		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es->txerr || es->rxerr)
+				cf->data[1] = (es->txerr > es->rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
@@ -703,13 +868,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	}
-
-	if (status == M16C_STATE_BUS_ERROR) {
+	} else if (es->status & M16C_STATE_BUS_ERROR) {
 		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es->txerr > es->rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
 
@@ -723,7 +886,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 	}
 
-	if (!status) {
+	if (!es->status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
 
@@ -739,34 +902,48 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
-		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
-
-		if (error_factor & M16C_EF_ACKE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
-					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
-			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
-			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
-			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
-			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
-			cf->data[2] |= CAN_ERR_PROT_TX;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			priv->can.can_stats.bus_error++;
+			stats->rx_errors++;
+
+			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+
+			if (es->leaf.error_factor & M16C_EF_ACKE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
+			if (es->leaf.error_factor & M16C_EF_CRCE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
+						CAN_ERR_PROT_LOC_CRC_DEL);
+			if (es->leaf.error_factor & M16C_EF_FORME)
+				cf->data[2] |= CAN_ERR_PROT_FORM;
+			if (es->leaf.error_factor & M16C_EF_STFE)
+				cf->data[2] |= CAN_ERR_PROT_STUFF;
+			if (es->leaf.error_factor & M16C_EF_BITE0)
+				cf->data[2] |= CAN_ERR_PROT_BIT0;
+			if (es->leaf.error_factor & M16C_EF_BITE1)
+				cf->data[2] |= CAN_ERR_PROT_BIT1;
+			if (es->leaf.error_factor & M16C_EF_TRE)
+				cf->data[2] |= CAN_ERR_PROT_TX;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
+			stats->tx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
+			stats->rx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			priv->can.can_stats.bus_error++;
+			cf->can_id |= CAN_ERR_BUSERROR;
+		}
+		break;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
+	cf->data[6] = es->txerr;
+	cf->data[7] = es->rxerr;
 
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
 
 	priv->can.state = new_state;
 
@@ -775,6 +952,124 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	netif_rx(skb);
 }
 
+/* For USBCAN, report error to userspace iff the channels's errors counter
+ * has increased, or we're the only channel seeing a bus error state.
+ */
+static void kvaser_usbcan_conditionally_rx_error(const struct kvaser_usb *dev,
+						 struct kvaser_error_summary *es)
+{
+	struct kvaser_usb_net_priv *priv;
+	int channel;
+	bool report_error;
+
+	channel = es->channel;
+	if (channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", channel);
+		return;
+	}
+
+	priv = dev->nets[channel];
+	report_error = false;
+
+	if (es->txerr > priv->bec.txerr) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
+		report_error = true;
+	}
+	if (es->rxerr > priv->bec.rxerr) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
+		report_error = true;
+	}
+	if ((es->status & M16C_STATE_BUS_ERROR) &&
+	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
+		report_error = true;
+	}
+
+	if (report_error)
+		kvaser_usb_rx_error(dev, es);
+}
+
+static void kvaser_usbcan_rx_error(const struct kvaser_usb *dev,
+				   const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { };
+
+	switch (msg->id) {
+	/* Sometimes errors are sent as unsolicited chip state events */
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.usbcan.chip_state_event.channel;
+		es.status =  msg->u.usbcan.chip_state_event.status;
+		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
+		kvaser_usbcan_conditionally_rx_error(dev, &es);
+		break;
+
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = 0;
+		es.status = msg->u.usbcan.error_event.status_ch0;
+		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
+		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
+		es.usbcan.other_ch_status =
+			msg->u.usbcan.error_event.status_ch1;
+		kvaser_usbcan_conditionally_rx_error(dev, &es);
+
+		/* The USBCAN firmware does not support more than 2 channels.
+		 * Now that ch0 was checked, check if ch1 has any errors.
+		 */
+		if (dev->nchannels == MAX_USBCAN_NET_DEVICES) {
+			es.channel = 1;
+			es.status = msg->u.usbcan.error_event.status_ch1;
+			es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch1;
+			es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch1;
+			es.usbcan.other_ch_status =
+				msg->u.usbcan.error_event.status_ch0;
+			kvaser_usbcan_conditionally_rx_error(dev, &es);
+		}
+		break;
+
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+	}
+}
+
+static void kvaser_leaf_rx_error(const struct kvaser_usb *dev,
+				 const struct kvaser_msg *msg)
+{
+	struct kvaser_error_summary es = { };
+
+	switch (msg->id) {
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = msg->u.leaf.error_event.channel;
+		es.status =  msg->u.leaf.error_event.status;
+		es.txerr = msg->u.leaf.error_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
+		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
+		break;
+	case CMD_LEAF_LOG_MESSAGE:
+		es.channel = msg->u.leaf.log_message.channel;
+		es.status = msg->u.leaf.log_message.data[0];
+		es.txerr = msg->u.leaf.log_message.data[2];
+		es.rxerr = msg->u.leaf.log_message.data[3];
+		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
+		break;
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.leaf.chip_state_event.channel;
+		es.status =  msg->u.leaf.chip_state_event.status;
+		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
+		es.leaf.error_factor = 0;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+		return;
+	}
+
+	kvaser_usb_rx_error(dev, &es);
+}
+
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 				  const struct kvaser_msg *msg)
 {
@@ -782,16 +1077,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	struct sk_buff *skb;
 	struct net_device_stats *stats = &priv->netdev->stats;
 
-	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
+	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
 		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
-			   msg->u.rx_can.flag);
+			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
 		return;
 	}
 
-	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
 		stats->rx_over_errors++;
 		stats->rx_errors++;
 
@@ -817,7 +1112,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
-	u8 channel = msg->u.rx_can.channel;
+	u8 channel = msg->u.rx_can_header.channel;
+	const u8 *rx_msg;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -828,19 +1124,32 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
-	    (msg->id == CMD_LOG_MESSAGE)) {
-		kvaser_usb_rx_error(dev, msg);
+	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+	    (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE)) {
+		kvaser_leaf_rx_error(dev, msg);
 		return;
-	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
-					 MSG_FLAG_NERR |
-					 MSG_FLAG_OVERRUN)) {
+	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
+						MSG_FLAG_NERR |
+						MSG_FLAG_OVERRUN)) {
 		kvaser_usb_rx_can_err(priv, msg);
 		return;
-	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
+	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
 		netdev_warn(priv->netdev,
 			    "Unhandled frame (flags: 0x%02x)",
-			    msg->u.rx_can.flag);
+			    msg->u.rx_can_header.flag);
+		return;
+	}
+
+	switch (dev->family) {
+	case KVASER_LEAF:
+		rx_msg = msg->u.leaf.rx_can.msg;
+		break;
+	case KVASER_USBCAN:
+		rx_msg = msg->u.usbcan.rx_can.msg;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
 		return;
 	}
 
@@ -850,38 +1159,37 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (msg->id == CMD_LOG_MESSAGE) {
-		cf->can_id = le32_to_cpu(msg->u.log_message.id);
+	if (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE) {
+		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
 		if (cf->can_id & KVASER_EXTENDED_FRAME)
 			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
 		else
 			cf->can_id &= CAN_SFF_MASK;
 
-		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
+		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
 
-		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.log_message.data,
+			memcpy(cf->data, &msg->u.leaf.log_message.data,
 			       cf->can_dlc);
 	} else {
-		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
-			     (msg->u.rx_can.msg[1] & 0x3f);
+		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
 
 		if (msg->id == CMD_RX_EXT_MESSAGE) {
 			cf->can_id <<= 18;
-			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
-				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
-				      (msg->u.rx_can.msg[4] & 0x3f);
+			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
+				      ((rx_msg[3] & 0xff) << 6) |
+				      (rx_msg[4] & 0x3f);
 			cf->can_id |= CAN_EFF_FLAG;
 		}
 
-		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
+		cf->can_dlc = get_can_dlc(rx_msg[5]);
 
-		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.rx_can.msg[6],
+			memcpy(cf->data, &rx_msg[6],
 			       cf->can_dlc);
 	}
 
@@ -944,21 +1252,35 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 
 	case CMD_RX_STD_MESSAGE:
 	case CMD_RX_EXT_MESSAGE:
-	case CMD_LOG_MESSAGE:
+		kvaser_usb_rx_can_msg(dev, msg);
+		break;
+
+	case CMD_LEAF_LOG_MESSAGE:
+		if (dev->family != KVASER_LEAF)
+			goto warn;
 		kvaser_usb_rx_can_msg(dev, msg);
 		break;
 
 	case CMD_CHIP_STATE_EVENT:
 	case CMD_CAN_ERROR_EVENT:
-		kvaser_usb_rx_error(dev, msg);
+		if (dev->family == KVASER_LEAF)
+			kvaser_leaf_rx_error(dev, msg);
+		else
+			kvaser_usbcan_rx_error(dev, msg);
 		break;
 
 	case CMD_TX_ACKNOWLEDGE:
 		kvaser_usb_tx_acknowledge(dev, msg);
 		break;
 
+	/* Ignored messages */
+	case CMD_USBCAN_CLOCK_OVERFLOW_EVENT:
+		if (dev->family != KVASER_USBCAN)
+			goto warn;
+		break;
+
 	default:
-		dev_warn(dev->udev->dev.parent,
+warn:		dev_warn(dev->udev->dev.parent,
 			 "Unhandled message (%d)\n", msg->id);
 		break;
 	}
@@ -1178,7 +1500,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 				  dev->rxbuf[i],
 				  dev->rxbuf_dma[i]);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++) {
+	for (i = 0; i < dev->nchannels; i++) {
 		struct kvaser_usb_net_priv *priv = dev->nets[i];
 
 		if (priv)
@@ -1286,6 +1608,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	uint8_t *msg_tx_can_flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1307,9 +1630,23 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	msg = buf;
 	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
-	msg->u.tx_can.flags = 0;
 	msg->u.tx_can.channel = priv->channel;
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
+		break;
+	case KVASER_USBCAN:
+		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent,
+			"Invalid device family (%d)\n", dev->family);
+		goto releasebuf;
+	}
+
+	*msg_tx_can_flags = 0;
+
 	if (cf->can_id & CAN_EFF_FLAG) {
 		msg->id = CMD_TX_EXT_MESSAGE;
 		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
@@ -1327,7 +1664,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
 
 	if (cf->can_id & CAN_RTR_FLAG)
-		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
+		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
@@ -1596,6 +1933,17 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 	if (!dev)
 		return -ENOMEM;
 
+	if (kvaser_is_leaf(id)) {
+		dev->family = KVASER_LEAF;
+	} else if (kvaser_is_usbcan(id)) {
+		dev->family = KVASER_USBCAN;
+	} else {
+		dev_err(&intf->dev,
+			"Product ID (%d) does not belong to any known Kvaser USB family",
+			id->idProduct);
+		return -ENODEV;
+	}
+
 	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
 	if (err) {
 		dev_err(&intf->dev, "Cannot get usb endpoint(s)");
-- 
1.9.1


^ permalink raw reply related	[relevance 39%]

* [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT
  2015-01-11 20:36 39%       ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
@ 2015-01-11 20:45 92%         ` Ahmed S. Darwish
    2015-01-12 11:20 99%         ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-11 20:45 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: Greg Kroah-Hartman, Linux-USB, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

(This is a draft patch, I'm not sure if this fixes the USB
bug or only its psymptom. Feedback from the linux-usb folks
is really appreciated.)

When plugging the Kvaser USB/CAN dongle the first time, everything
works as expected and all of the transfers from and to the USB
device succeeds.

Meanwhile, after unplugging the device and plugging it again, the
first bulk transfer _always_ returns an -ETIMEDOUT. The following
behaviour was observied:

- Setting higher timeout values for the first bulk transfer never
  solved the issue.

- Unloading, then loading, our kvaser_usb module in question
  __always__ solved the issue.

- Checking first bulk transfer status, and retry the transfer
  again in case of an -ETIMEDOUT also __always__ solved the issue.
  This is what the patch below does.

- In the testing done so far, this issue appears only on laptops
  but never on PCs (possibly power related?)

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index da47d17..5925637 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1927,7 +1927,7 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 {
 	struct kvaser_usb *dev;
 	int err = -ENOMEM;
-	int i;
+	int i, retry = 3;
 
 	dev = devm_kzalloc(&intf->dev, sizeof(*dev), GFP_KERNEL);
 	if (!dev)
@@ -1956,7 +1956,16 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	err = kvaser_usb_get_software_info(dev);
+	/* On some x86 laptops, plugging a USBCAN device again after
+	 * an unplug makes the firmware always ignore the very first
+	 * command. For such a case, provide some room for retries
+	 * instead of completly exiting the driver.
+	 */
+	while (retry--) {
+		err = kvaser_usb_get_software_info(dev);
+		if (err != -ETIMEDOUT)
+			break;
+	}
 	if (err) {
 		dev_err(&intf->dev,
 			"Cannot get software infos, error %d\n", err);
-- 
1.9.1


^ permalink raw reply related	[relevance 92%]

* [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx()
  2015-01-11 20:11 97%   ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() devices Ahmed S. Darwish
  2015-01-11 20:15 99%     ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Ahmed S. Darwish
@ 2015-01-11 20:49 97%     ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-11 20:49 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

We should not touch the packet after a netif_rx: it might
get freed behind our back.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

 (Resend, fix the garbled subject line. Sorry)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index cc7bfc0..c32cd61 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -520,10 +520,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (skb) {
 			cf->can_id |= CAN_ERR_RESTARTED;
-			netif_rx(skb);
 
 			stats->rx_packets++;
 			stats->rx_bytes += cf->can_dlc;
+			netif_rx(skb);
 		} else {
 			netdev_err(priv->netdev,
 				   "No memory left for err_skb\n");
@@ -770,10 +770,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	priv->can.state = new_state;
 
-	netif_rx(skb);
-
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+	netif_rx(skb);
 }
 
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
@@ -805,10 +804,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		stats->rx_over_errors++;
 		stats->rx_errors++;
 
-		netif_rx(skb);
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
+		netif_rx(skb);
 	}
 }
 
@@ -887,10 +885,9 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 			       cf->can_dlc);
 	}
 
-	netif_rx(skb);
-
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+	netif_rx(skb);
 }
 
 static void kvaser_usb_start_chip_reply(const struct kvaser_usb *dev,
-- 
1.9.1


^ permalink raw reply related	[relevance 97%]

* Re: [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT
  @ 2015-01-12 10:14 99%             ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 10:14 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Greg Kroah-Hartman, Linux-USB, Linux-CAN, netdev, LKML

On Sun, Jan 11, 2015 at 09:51:10PM +0100, Marc Kleine-Budde wrote:
> On 01/11/2015 09:45 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > (This is a draft patch, I'm not sure if this fixes the USB
> > bug or only its psymptom. Feedback from the linux-usb folks
> > is really appreciated.)
> > 
> > When plugging the Kvaser USB/CAN dongle the first time, everything
> > works as expected and all of the transfers from and to the USB
> > device succeeds.
> > 
> > Meanwhile, after unplugging the device and plugging it again, the
> > first bulk transfer _always_ returns an -ETIMEDOUT. The following
> > behaviour was observied:
> > 
> > - Setting higher timeout values for the first bulk transfer never
> >   solved the issue.
> > 
> > - Unloading, then loading, our kvaser_usb module in question
> >   __always__ solved the issue.
> > 
> > - Checking first bulk transfer status, and retry the transfer
> >   again in case of an -ETIMEDOUT also __always__ solved the issue.
> >   This is what the patch below does.
> > 
> > - In the testing done so far, this issue appears only on laptops
> >   but never on PCs (possibly power related?)
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> Does this patch apply apply between 3 and 4? If not, please re-arrange
> the series. As this is a bug fix, patches 1, 2 and 4 will go via
> net/master, 3 will go via net-next/master.
> 

Since no one complained earlier, I guess this issue only affects
USBCAN devices. That's why I've based it above patch #3: adding
USBCAN hardware support.

Nonetheless, it won't do any harm for the current Leaf-only
driver. So _if_ this is the correct fix, I will update the commit
log, refactor the check into a 'do { } while()' loop, and then
base it above the Leaf-only net/master fixes on patch #1, and #2.

Any feedback on the USB side of things?

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family
  2015-01-11 20:36 39%       ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
  2015-01-11 20:45 92%         ` [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
@ 2015-01-12 11:20 99%         ` Ahmed S. Darwish
      3 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 11:20 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi Marc,

On Sun, Jan 11, 2015 at 03:36:12PM -0500, Ahmed S. Darwish wrote:
> From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
> divided into two major families: 'Leaf', and 'UsbcanII'.  From an
> Operating System perspective, the firmware of both families behave
> in a not too drastically different fashion.
> 
> This patch adds support for the UsbcanII family of devices to the
> current Kvaser Leaf-only driver.
> 
> CAN frames sending, receiving, and error handling paths has been
> tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
> should also work nicely with other products in the same category.
> 

Please delay applying this to as I've just discovered that
removal of two of the device family checks introduced two
un-necessary GCC warnings.

Will send and updated version soon.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family
  @ 2015-01-12 12:07 99%           ` Ahmed S. Darwish
  2015-01-12 12:36 99%             ` Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 12:07 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi Marc,

On Mon, Jan 12, 2015 at 12:43:56PM +0100, Marc Kleine-Budde wrote:
> On 01/11/2015 09:36 PM, Ahmed S. Darwish wrote:
[...]
> > diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> > index 0eb870b..da47d17 100644
> > --- a/drivers/net/can/usb/kvaser_usb.c
> > +++ b/drivers/net/can/usb/kvaser_usb.c
> > @@ -6,10 +6,12 @@
> >   * Parts of this driver are based on the following:
> >   *  - Kvaser linux leaf driver (version 4.78)
> >   *  - CAN driver for esd CAN-USB/2
> > + *  - Kvaser linux usbcanII driver (version 5.3)
> >   *
> >   * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
> >   * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
> >   * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
> > + * Copyright (C) 2015 Valeo Corporation
> >   */
> >  
> >  #include <linux/completion.h>
> > @@ -21,6 +23,15 @@
> >  #include <linux/can/dev.h>
> >  #include <linux/can/error.h>
> >  
> > +/* Kvaser USB CAN dongles are divided into two major families:
> > + * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
> > + * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
> > + */
> > +enum kvaser_usb_family {
> > +	KVASER_LEAF,
> > +	KVASER_USBCAN,
> > +};
> 
> Nitpick: please move after the #defines
> 

Will do.

[...]
> > @@ -616,53 +810,24 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
> >  }
> >  
> >  static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> > -				const struct kvaser_msg *msg)
> > +				struct kvaser_error_summary *es)
> 
> Can you make "struct kvaser_error_summary *es" const?
> 

Sure.

[...]
> > +/* For USBCAN, report error to userspace iff the channels's errors counter
> > + * has increased, or we're the only channel seeing a bus error state.
> > + */
> > +static void kvaser_usbcan_conditionally_rx_error(const struct kvaser_usb *dev,
> > +						 struct kvaser_error_summary *es)
> 
> const struct kvaser_error_summary *es?
>

Ditto.

[...]
> > +
> > +		/* The USBCAN firmware does not support more than 2 channels.
> 
> Does USBCAN support always 2 channels or are there models with 1
> channels, too. I'd rephrase ..."does support up to 2 channels."
> 

Yup, that will be more accurate.

[...]
> > +
> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		rx_msg = msg->u.leaf.rx_can.msg;
> > +		break;
> > +	case KVASER_USBCAN:
> > +		rx_msg = msg->u.usbcan.rx_can.msg;
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> >  		return;
> 
> should not happen.
> 

Yes, but otherwise I get GCC warnings of 'rx_msg' possibly
being unused. I can add __maybe_unused to rx_msg of course,
but such annotation may hide possible errors in the future.

> > +	switch (dev->family) {
> > +	case KVASER_LEAF:
> > +		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
> > +		break;
> > +	case KVASER_USBCAN:
> > +		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
> > +		break;
> > +	default:
> > +		dev_err(dev->udev->dev.parent,
> > +			"Invalid device family (%d)\n", dev->family);
> > +		goto releasebuf;
> should not happen, please remove

Like the 'rx_msg' case above.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family
  @ 2015-01-12 12:26 99%               ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 12:26 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

On Mon, Jan 12, 2015 at 12:51:49PM +0100, Marc Kleine-Budde wrote:
> On 01/08/2015 04:19 PM, Ahmed S. Darwish wrote:
> 
> [...]
> 
> >>>  MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
> >>> @@ -463,7 +631,18 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
> >>>  	if (err)
> >>>  		return err;
> >>>  
> >>> -	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
> >>> +	switch (dev->family) {
> >>> +	case KVASER_LEAF:
> >>> +		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
> >>> +		break;
> >>> +	case KVASER_USBCAN:
> >>> +		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
> >>> +		break;
> >>> +	default:
> >>> +		dev_err(dev->udev->dev.parent,
> >>> +			"Invalid device family (%d)\n", dev->family);
> >>> +		return -EINVAL;
> >>
> >> The default case should not happen. I think you can remove it.
> 
> > It's true, it _should_ never happen. But I only add such checks if
> > the follow-up code critically depends on a certain `dev->family`
> > behavior. So it's kind of a defensive check against any possible
> > bug in driver or memory.
> > 
> > What do you think?
> 
> The kernel is full of callback functions, if you have a bit flip there
> you're in trouble anyways. A bug in the driver (or other parts of the
> kernel) might overwrite the memory of dev->family, but if this happens,
> more things will break.
> 

I see. Thanks for the explanation.

Most of them are now removed in latest submission, except two to
avoid GCC warnings of variables that "may be used uninitialized".

Regards,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family
  2015-01-12 12:07 99%           ` Ahmed S. Darwish
@ 2015-01-12 12:36 99%             ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 12:36 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

On Mon, Jan 12, 2015 at 07:07:41AM -0500, Ahmed S. Darwish wrote:
> Hi Marc,
> 
> On Mon, Jan 12, 2015 at 12:43:56PM +0100, Marc Kleine-Budde wrote:
> > On 01/11/2015 09:36 PM, Ahmed S. Darwish wrote:
> > > +
> > > +	switch (dev->family) {
> > > +	case KVASER_LEAF:
> > > +		rx_msg = msg->u.leaf.rx_can.msg;
> > > +		break;
> > > +	case KVASER_USBCAN:
> > > +		rx_msg = msg->u.usbcan.rx_can.msg;
> > > +		break;
> > > +	default:
> > > +		dev_err(dev->udev->dev.parent,
> > > +			"Invalid device family (%d)\n", dev->family);
> > >  		return;
> > 
> > should not happen.
> > 
> 
> Yes, but otherwise I get GCC warnings of 'rx_msg' possibly
> being unused. I can add __maybe_unused to rx_msg of course,
> but such annotation may hide possible errors in the future.
> 

Ah, what I meant is using uninitialized_var() to suppress the
GCC warning. But, really, using that macro has a bad history
of hiding errors in the future.

Kindly check http://lwn.net/Articles/529954/ for context.

Another solution might be initializing rx_msg to NULL.

> > > +	switch (dev->family) {
> > > +	case KVASER_LEAF:
> > > +		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
> > > +		break;
> > > +	case KVASER_USBCAN:
> > > +		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
> > > +		break;
> > > +	default:
> > > +		dev_err(dev->udev->dev.parent,
> > > +			"Invalid device family (%d)\n", dev->family);
> > > +		goto releasebuf;
> > should not happen, please remove
> 
> Like the 'rx_msg' case above.
> 
> Thanks,
> Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT
  @ 2015-01-12 13:50 99%                 ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 13:50 UTC (permalink / raw)
  To: Olivier Sobrie
  Cc: Marc Kleine-Budde, Oliver Hartkopp, Wolfgang Grandegger,
	Greg Kroah-Hartman, Linux-USB, Linux-CAN, netdev, LKML

On Mon, Jan 12, 2015 at 02:33:30PM +0100, Olivier Sobrie wrote:
> Hello,
> 
> On Mon, Jan 12, 2015 at 05:14:07AM -0500, Ahmed S. Darwish wrote:
> > On Sun, Jan 11, 2015 at 09:51:10PM +0100, Marc Kleine-Budde wrote:
> > > On 01/11/2015 09:45 PM, Ahmed S. Darwish wrote:
> > > > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > > > 
> > > > (This is a draft patch, I'm not sure if this fixes the USB
> > > > bug or only its psymptom. Feedback from the linux-usb folks
> > > > is really appreciated.)
> > > > 
> > > > When plugging the Kvaser USB/CAN dongle the first time, everything
> > > > works as expected and all of the transfers from and to the USB
> > > > device succeeds.
> > > > 
> > > > Meanwhile, after unplugging the device and plugging it again, the
> > > > first bulk transfer _always_ returns an -ETIMEDOUT. The following
> > > > behaviour was observied:
> > > > 
> > > > - Setting higher timeout values for the first bulk transfer never
> > > >   solved the issue.
> > > > 
> > > > - Unloading, then loading, our kvaser_usb module in question
> > > >   __always__ solved the issue.
> > > > 
> > > > - Checking first bulk transfer status, and retry the transfer
> > > >   again in case of an -ETIMEDOUT also __always__ solved the issue.
> > > >   This is what the patch below does.
> > > > 
> > > > - In the testing done so far, this issue appears only on laptops
> > > >   but never on PCs (possibly power related?)
> > > > 
> > > > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > > 
> > > Does this patch apply apply between 3 and 4? If not, please re-arrange
> > > the series. As this is a bug fix, patches 1, 2 and 4 will go via
> > > net/master, 3 will go via net-next/master.
> > > 
> > 
> > Since no one complained earlier, I guess this issue only affects
> > USBCAN devices. That's why I've based it above patch #3: adding
> > USBCAN hardware support.
> > 
> > Nonetheless, it won't do any harm for the current Leaf-only
> > driver. So _if_ this is the correct fix, I will update the commit
> > log, refactor the check into a 'do { } while()' loop, and then
> > base it above the Leaf-only net/master fixes on patch #1, and #2.
> > 
> > Any feedback on the USB side of things?
> 
> Can you take a wireshark capture showing the problem?
> It can maybe help people to figure out what happens.
> 

Yeah, I'm planning on doing something similar.

> What kind of usbcan device do you use?

"Kvaser USBcan II HS/LS"

> Which firmware revision is loaded on the device?
> 

The device reports firmware version 2.9.410.

Interesting. The changelog of their latest firmware states
that it "fixed USB configuration issue during USB attach."
That might be the problem.

I have two devices here. I'll update the firmware only for
one of them and see what happens.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM
  @ 2015-01-12 20:36 67%         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-12 20:36 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

On Mon, Jan 12, 2015 at 12:09:32PM +0100, Marc Kleine-Budde wrote:
> On 01/11/2015 09:15 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > Let the error counters be more accurate in case of Out of
> > Memory conditions.
> 
> Please have a look at kvaser_usb_rx_error(), the whole state handling is
> omitted in case of OOM.
> 

I see. Regarding kvaser_usb_rx_error(), would something like
below patch be acceptable? 

Kindly note that separating recording interface state from
error frame packet building leads to duplication of a good
number of if-conditions. Meanwhile, it truly saves _all_
of the possible state before any ENOMEM -- the correct thing
to do.

Another solution was to allocate the can frame on the stack,
and thus avoiding any code duplication. But this only leads
to calls of "kvaser_usb_simple_msg_async", which can fail
with -ENOMEM by itself, returning to the very same problem
again. 

If the patch is acceptable, I'll rebase my USBCAN-II driver
above it and re-submit the series (minus the merged patch).

Thanks,

-->

[ Patch is build-tested, but not _fully_ run-time tested.

  It's based on linux-can/testing commit d642b49f6d84b94bd
  "can: kvaser_usb: Don't dereference skb after a netif_rx" ]

Subject: [PATCH] can: kvaser_usb: Update net interface state
 before exiting on OOM

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Let the network interface can bus state and error counters be
more accurate in case of Out of Memory conditions.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 182 +++++++++++++++++++++++----------------
 1 file changed, 106 insertions(+), 76 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index c32cd61..2d0ef76 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -273,6 +273,10 @@ struct kvaser_msg {
 	} u;
 } __packed;
 
+struct kvaser_usb_error_summary {
+	u8 channel, status, txerr, rxerr, error_factor;
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -615,6 +619,57 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
 }
 
+static void kvaser_usb_rx_error_update_state(struct kvaser_usb_net_priv *priv,
+					     const struct kvaser_usb_error_summary *es)
+{
+	struct net_device_stats *stats;
+	unsigned int new_state;
+
+	stats = &priv->netdev->stats;
+	new_state = priv->can.state;
+
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
+
+	if (es->status & M16C_STATE_BUS_OFF) {
+		priv->can.can_stats.bus_off++;
+		new_state = CAN_STATE_BUS_OFF;
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
+		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
+			priv->can.can_stats.error_passive++;
+		}
+		new_state = CAN_STATE_ERROR_PASSIVE;
+	}
+
+	if (es->status == M16C_STATE_BUS_ERROR) {
+		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
+			priv->can.can_stats.error_warning++;
+			new_state = CAN_STATE_ERROR_WARNING;
+		} else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
+			new_state = CAN_STATE_ERROR_ACTIVE;
+		}
+	}
+
+	if (!es->status) {
+		new_state = CAN_STATE_ERROR_ACTIVE;
+	}
+
+	if (priv->can.restart_ms &&
+	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
+	    (new_state < CAN_STATE_BUS_OFF)) {
+		priv->can.can_stats.restarts++;
+	}
+
+	if (es->error_factor) {
+		priv->can.can_stats.bus_error++;
+		stats->rx_errors++;
+	}
+
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
+	priv->can.state = new_state;
+}
+
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 				const struct kvaser_msg *msg)
 {
@@ -622,30 +677,30 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
+	struct kvaser_usb_error_summary es = { };
+	unsigned int new_state, old_state;
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
+		es.channel = msg->u.error_event.channel;
+		es.status =  msg->u.error_event.status;
+		es.txerr = msg->u.error_event.tx_errors_count;
+		es.rxerr = msg->u.error_event.rx_errors_count;
+		es.error_factor = msg->u.error_event.error_factor;
 		break;
 	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
+		es.channel = msg->u.log_message.channel;
+		es.status = msg->u.log_message.data[0];
+		es.txerr = msg->u.log_message.data[2];
+		es.rxerr = msg->u.log_message.data[3];
+		es.error_factor = msg->u.log_message.data[1];
 		break;
 	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
+		es.channel = msg->u.chip_state_event.channel;
+		es.status =  msg->u.chip_state_event.status;
+		es.txerr = msg->u.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.chip_state_event.rx_errors_count;
+		es.error_factor = 0;
 		break;
 	default:
 		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
@@ -653,122 +708,97 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (channel >= dev->nchannels) {
+	if (es.channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid channel number (%d)\n", es.channel);
 		return;
 	}
 
-	priv = dev->nets[channel];
+	priv = dev->nets[es.channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
+	if (es.status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
 
+	old_state = priv->can.state;
+	kvaser_usb_rx_error_update_state(priv, &es);
+	new_state = priv->can.state;
+
 	skb = alloc_can_err_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->rx_dropped++;
 		return;
 	}
 
-	new_state = priv->can.state;
-
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
-
-	if (status & M16C_STATE_BUS_OFF) {
-		cf->can_id |= CAN_ERR_BUSOFF;
-
-		priv->can.can_stats.bus_off++;
+	if (es.status & M16C_STATE_BUS_OFF) {
 		if (!priv->can.restart_ms)
 			kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
-
 		netif_carrier_off(priv->netdev);
 
-		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
-		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
+		cf->can_id |= CAN_ERR_BUSOFF;
+	} else if (es.status & M16C_STATE_BUS_PASSIVE) {
+		if (old_state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es.txerr || es.rxerr)
+				cf->data[1] = (es.txerr > es.rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
 				cf->data[1] = CAN_ERR_CRTL_TX_PASSIVE |
 					      CAN_ERR_CRTL_RX_PASSIVE;
-
-			priv->can.can_stats.error_passive++;
 		}
-
-		new_state = CAN_STATE_ERROR_PASSIVE;
 	}
 
-	if (status == M16C_STATE_BUS_ERROR) {
-		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+	if (es.status == M16C_STATE_BUS_ERROR) {
+		if ((old_state < CAN_STATE_ERROR_WARNING) &&
+		    ((es.txerr >= 96) || (es.rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es.txerr > es.rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
-
-			priv->can.can_stats.error_warning++;
-			new_state = CAN_STATE_ERROR_WARNING;
-		} else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
+		} else if (old_state > CAN_STATE_ERROR_ACTIVE) {
 			cf->can_id |= CAN_ERR_PROT;
 			cf->data[2] = CAN_ERR_PROT_ACTIVE;
-
-			new_state = CAN_STATE_ERROR_ACTIVE;
 		}
 	}
 
-	if (!status) {
+	if (!es.status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
-
-		new_state = CAN_STATE_ERROR_ACTIVE;
 	}
 
 	if (priv->can.restart_ms &&
-	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
+	    (old_state >= CAN_STATE_BUS_OFF) &&
 	    (new_state < CAN_STATE_BUS_OFF)) {
 		cf->can_id |= CAN_ERR_RESTARTED;
 		netif_carrier_on(priv->netdev);
-
-		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
+	if (es.error_factor) {
 		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
 
-		if (error_factor & M16C_EF_ACKE)
+		if (es.error_factor & M16C_EF_ACKE)
 			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
+		if (es.error_factor & M16C_EF_CRCE)
 			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
 					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
+		if (es.error_factor & M16C_EF_FORME)
 			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
+		if (es.error_factor & M16C_EF_STFE)
 			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
+		if (es.error_factor & M16C_EF_BITE0)
 			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
+		if (es.error_factor & M16C_EF_BITE1)
 			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
+		if (es.error_factor & M16C_EF_TRE)
 			cf->data[2] |= CAN_ERR_PROT_TX;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
-
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
-
-	priv->can.state = new_state;
+	cf->data[6] = es.txerr;
+	cf->data[7] = es.rxerr;
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
@@ -792,6 +822,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	}
 
 	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+		stats->rx_over_errors++;
+		stats->rx_errors++;
+
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -801,9 +834,6 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		cf->can_id |= CAN_ERR_CRTL;
 		cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
 
-		stats->rx_over_errors++;
-		stats->rx_errors++;
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
 		netif_rx(skb);
-- 
1.9.1


^ permalink raw reply related	[relevance 67%]

* Re: [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family
  @ 2015-01-18 20:12 99%           ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-18 20:12 UTC (permalink / raw)
  To: Olivier Sobrie
  Cc: Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde,
	David S. Miller, Paul Gortmaker, Linux-CAN, netdev, LKML

Hi!

On Mon, Jan 12, 2015 at 02:53:02PM +0100, Olivier Sobrie wrote:
> Hello,
> 
> On Sun, Jan 11, 2015 at 03:36:12PM -0500, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 

...

> > @@ -98,7 +128,13 @@
> >  #define CMD_START_CHIP_REPLY		27
> >  #define CMD_STOP_CHIP			28
> >  #define CMD_STOP_CHIP_REPLY		29
> > -#define CMD_GET_CARD_INFO2		32
> > +#define CMD_READ_CLOCK			30
> > +#define CMD_READ_CLOCK_REPLY		31
> 
> These two defines are not used.
> 

They were added for completeness: the only gap in our continuous
sequence of command IDs from 12 to 39 ;-) No big deal, to be
removed in the next submission.

...

> > +
> > +struct kvaser_msg_tx_acknowledge_header {
> > +	u8 channel;
> > +	u8 tid;
> > +};
> 
> Is this struct really needed? Can't you simply use
> leaf_msg_tx_acknowledge or usbcan_msg_tx_acknowledge
> structures to read the header.
> Same for kvaser_msg_rx_can_header.
> 

They're added to ensure type-safety throughout the code. Basically
they're the common part of a command that has different wire format
between the Leaf and the USBCan, but share a common header.  Such
notation was only added when it was strictly necessary.

For example, there are three functions where 'rx_can_header' is
referenced in the driver, and one function where 'tx_acknowledge_header'
is referenced. Without such header structure, I'll have to sprinkle
3 to 4 extra blocks of:

	switch (dev->family) {
	       case KVASER_LEAF: 
	       case KVASER_USBCAN:
	}

which would be _really_ ugly. The *_header notation ensures that, in
the body of each function, we're accessing the fields in a very safe
manner.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* [PATCH v5 1/5] can: kvaser_usb: Update net interface state before exiting on OOM
  2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
                   ` (5 preceding siblings ...)
  2015-01-11 20:05 99% ` [PATCH v4 00/04] can: Introduce support for Kvaser USBCAN-II devices Ahmed S. Darwish
@ 2015-01-20 21:44 70% ` Ahmed S. Darwish
  2015-01-20 21:45 80%   ` [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
  2015-01-26  5:17 74% ` [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support Ahmed S. Darwish
  7 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-20 21:44 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: Andri Yngvason, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Update all of the can interface's state and error counters before
trying any skb allocation that can actually fail with -ENOMEM.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 182 +++++++++++++++++++++++----------------
 1 file changed, 106 insertions(+), 76 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index c32cd61..971c5f9 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -273,6 +273,10 @@ struct kvaser_msg {
 	} u;
 } __packed;
 
+struct kvaser_usb_error_summary {
+	u8 channel, status, txerr, rxerr, error_factor;
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -615,6 +619,55 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
 }
 
+static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
+						 const struct kvaser_usb_error_summary *es)
+{
+	struct net_device_stats *stats;
+	enum can_state new_state;
+
+	stats = &priv->netdev->stats;
+	new_state = priv->can.state;
+
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
+
+	if (es->status & M16C_STATE_BUS_OFF) {
+		priv->can.can_stats.bus_off++;
+		new_state = CAN_STATE_BUS_OFF;
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
+		if (priv->can.state != CAN_STATE_ERROR_PASSIVE)
+			priv->can.can_stats.error_passive++;
+		new_state = CAN_STATE_ERROR_PASSIVE;
+	}
+
+	if (es->status == M16C_STATE_BUS_ERROR) {
+		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
+			priv->can.can_stats.error_warning++;
+			new_state = CAN_STATE_ERROR_WARNING;
+		} else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
+			new_state = CAN_STATE_ERROR_ACTIVE;
+		}
+	}
+
+	if (!es->status)
+		new_state = CAN_STATE_ERROR_ACTIVE;
+
+	if (priv->can.restart_ms &&
+	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
+	    (new_state < CAN_STATE_BUS_OFF)) {
+		priv->can.can_stats.restarts++;
+	}
+
+	if (es->error_factor) {
+		priv->can.can_stats.bus_error++;
+		stats->rx_errors++;
+	}
+
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
+	priv->can.state = new_state;
+}
+
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 				const struct kvaser_msg *msg)
 {
@@ -622,30 +675,30 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
+	struct kvaser_usb_error_summary es = { };
+	enum can_state old_state;
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
+		es.channel = msg->u.error_event.channel;
+		es.status =  msg->u.error_event.status;
+		es.txerr = msg->u.error_event.tx_errors_count;
+		es.rxerr = msg->u.error_event.rx_errors_count;
+		es.error_factor = msg->u.error_event.error_factor;
 		break;
 	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
+		es.channel = msg->u.log_message.channel;
+		es.status = msg->u.log_message.data[0];
+		es.txerr = msg->u.log_message.data[2];
+		es.rxerr = msg->u.log_message.data[3];
+		es.error_factor = msg->u.log_message.data[1];
 		break;
 	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
+		es.channel = msg->u.chip_state_event.channel;
+		es.status =  msg->u.chip_state_event.status;
+		es.txerr = msg->u.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.chip_state_event.rx_errors_count;
+		es.error_factor = 0;
 		break;
 	default:
 		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
@@ -653,122 +706,99 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (channel >= dev->nchannels) {
+	if (es.channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid channel number (%d)\n", es.channel);
 		return;
 	}
 
-	priv = dev->nets[channel];
+	priv = dev->nets[es.channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
+	if (es.status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
 
+	/* Update all of the can interface's state and error counters before
+	 * trying any skb allocation that can actually fail with -ENOMEM.
+	 */
+	old_state = priv->can.state;
+	kvaser_usb_rx_error_update_can_state(priv, &es);
+
 	skb = alloc_can_err_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->rx_dropped++;
 		return;
 	}
 
-	new_state = priv->can.state;
-
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
-
-	if (status & M16C_STATE_BUS_OFF) {
+	if (es.status & M16C_STATE_BUS_OFF) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
-		priv->can.can_stats.bus_off++;
 		if (!priv->can.restart_ms)
 			kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
-
 		netif_carrier_off(priv->netdev);
-
-		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
-		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
+	} else if (es.status & M16C_STATE_BUS_PASSIVE) {
+		if (old_state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es.txerr || es.rxerr)
+				cf->data[1] = (es.txerr > es.rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
 				cf->data[1] = CAN_ERR_CRTL_TX_PASSIVE |
 					      CAN_ERR_CRTL_RX_PASSIVE;
-
-			priv->can.can_stats.error_passive++;
 		}
-
-		new_state = CAN_STATE_ERROR_PASSIVE;
 	}
 
-	if (status == M16C_STATE_BUS_ERROR) {
-		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+	if (es.status == M16C_STATE_BUS_ERROR) {
+		if ((old_state < CAN_STATE_ERROR_WARNING) &&
+		    ((es.txerr >= 96) || (es.rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es.txerr > es.rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
-
-			priv->can.can_stats.error_warning++;
-			new_state = CAN_STATE_ERROR_WARNING;
-		} else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
+		} else if (old_state > CAN_STATE_ERROR_ACTIVE) {
 			cf->can_id |= CAN_ERR_PROT;
 			cf->data[2] = CAN_ERR_PROT_ACTIVE;
-
-			new_state = CAN_STATE_ERROR_ACTIVE;
 		}
 	}
 
-	if (!status) {
+	if (!es.status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
-
-		new_state = CAN_STATE_ERROR_ACTIVE;
 	}
 
 	if (priv->can.restart_ms &&
-	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
-	    (new_state < CAN_STATE_BUS_OFF)) {
+	    (old_state >= CAN_STATE_BUS_OFF) &&
+	    (priv->can.state < CAN_STATE_BUS_OFF)) {
 		cf->can_id |= CAN_ERR_RESTARTED;
 		netif_carrier_on(priv->netdev);
-
-		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
+	if (es.error_factor) {
 		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
 
-		if (error_factor & M16C_EF_ACKE)
+		if (es.error_factor & M16C_EF_ACKE)
 			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
+		if (es.error_factor & M16C_EF_CRCE)
 			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
 					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
+		if (es.error_factor & M16C_EF_FORME)
 			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
+		if (es.error_factor & M16C_EF_STFE)
 			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
+		if (es.error_factor & M16C_EF_BITE0)
 			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
+		if (es.error_factor & M16C_EF_BITE1)
 			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
+		if (es.error_factor & M16C_EF_TRE)
 			cf->data[2] |= CAN_ERR_PROT_TX;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
-
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
-
-	priv->can.state = new_state;
+	cf->data[6] = es.txerr;
+	cf->data[7] = es.rxerr;
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
@@ -792,6 +822,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	}
 
 	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+		stats->rx_over_errors++;
+		stats->rx_errors++;
+
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -801,9 +834,6 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		cf->can_id |= CAN_ERR_CRTL;
 		cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
 
-		stats->rx_over_errors++;
-		stats->rx_errors++;
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
 		netif_rx(skb);
-- 
1.9.1


^ permalink raw reply related	[relevance 70%]

* [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
  2015-01-20 21:44 70% ` [PATCH v5 1/5] can: kvaser_usb: Update net interface state before exiting on OOM Ahmed S. Darwish
@ 2015-01-20 21:45 80%   ` Ahmed S. Darwish
  2015-01-20 21:47 94%     ` [PATCH v5 3/5] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
                       ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-20 21:45 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: Andri Yngvason, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Replace most of the can interface's state and error counters
handling with the new can-dev can_change_state() mechanism.

Suggested-by: Andri Yngvason <andri.yngvason@marel.com>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 114 +++++++++++++++++++--------------------
 1 file changed, 55 insertions(+), 59 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 971c5f9..0386d3f 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -620,40 +620,43 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 }
 
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
-						 const struct kvaser_usb_error_summary *es)
+						 const struct kvaser_usb_error_summary *es,
+						 struct can_frame *cf)
 {
 	struct net_device_stats *stats;
-	enum can_state new_state;
-
-	stats = &priv->netdev->stats;
-	new_state = priv->can.state;
+	enum can_state cur_state, new_state, tx_state, rx_state;
 
 	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	if (es->status & M16C_STATE_BUS_OFF) {
-		priv->can.can_stats.bus_off++;
+	stats = &priv->netdev->stats;
+	new_state = cur_state = priv->can.state;
+
+	if (es->status & M16C_STATE_BUS_OFF)
 		new_state = CAN_STATE_BUS_OFF;
-	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
-		if (priv->can.state != CAN_STATE_ERROR_PASSIVE)
-			priv->can.can_stats.error_passive++;
+	else if (es->status & M16C_STATE_BUS_PASSIVE)
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	}
 
 	if (es->status == M16C_STATE_BUS_ERROR) {
-		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
-			priv->can.can_stats.error_warning++;
+		if ((cur_state < CAN_STATE_ERROR_WARNING) &&
+		    ((es->txerr >= 96) || (es->rxerr >= 96)))
 			new_state = CAN_STATE_ERROR_WARNING;
-		} else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
+		else if (cur_state > CAN_STATE_ERROR_ACTIVE)
 			new_state = CAN_STATE_ERROR_ACTIVE;
-		}
 	}
 
 	if (!es->status)
 		new_state = CAN_STATE_ERROR_ACTIVE;
 
+	if (new_state != cur_state) {
+		tx_state = (es->txerr >= es->rxerr) ? new_state : 0;
+		rx_state = (es->txerr <= es->rxerr) ? new_state : 0;
+
+		can_change_state(priv->netdev, cf, tx_state, rx_state);
+		new_state = priv->can.state;
+	}
+
 	if (priv->can.restart_ms &&
-	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
+	    (cur_state >= CAN_STATE_BUS_OFF) &&
 	    (new_state < CAN_STATE_BUS_OFF)) {
 		priv->can.can_stats.restarts++;
 	}
@@ -665,18 +668,17 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 
 	priv->bec.txerr = es->txerr;
 	priv->bec.rxerr = es->rxerr;
-	priv->can.state = new_state;
 }
 
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 				const struct kvaser_msg *msg)
 {
-	struct can_frame *cf;
+	struct can_frame *cf, tmp_cf = { .can_id = CAN_ERR_FLAG, .can_dlc = CAN_ERR_DLC };
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
 	struct kvaser_usb_error_summary es = { };
-	enum can_state old_state;
+	enum can_state old_state, new_state;
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
@@ -721,60 +723,54 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	}
 
 	/* Update all of the can interface's state and error counters before
-	 * trying any skb allocation that can actually fail with -ENOMEM.
+	 * trying any memory allocation that can actually fail with -ENOMEM.
+	 *
+	 * We send a temporary stack-allocated error can frame to
+	 * can_change_state() for the very same reason.
+	 *
+	 * TODO: Split can_change_state() responsibility between updating the
+	 * can interface's state and counters, and the setting up of can error
+	 * frame ID and data to userspace. Remove stack allocation afterwards.
 	 */
 	old_state = priv->can.state;
-	kvaser_usb_rx_error_update_can_state(priv, &es);
+	kvaser_usb_rx_error_update_can_state(priv, &es, &tmp_cf);
+	new_state = priv->can.state;
 
 	skb = alloc_can_err_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->rx_dropped++;
 		return;
 	}
+	memcpy(cf, &tmp_cf, sizeof(*cf));
 
-	if (es.status & M16C_STATE_BUS_OFF) {
-		cf->can_id |= CAN_ERR_BUSOFF;
-
-		if (!priv->can.restart_ms)
-			kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
-		netif_carrier_off(priv->netdev);
-	} else if (es.status & M16C_STATE_BUS_PASSIVE) {
-		if (old_state != CAN_STATE_ERROR_PASSIVE) {
-			cf->can_id |= CAN_ERR_CRTL;
-
-			if (es.txerr || es.rxerr)
-				cf->data[1] = (es.txerr > es.rxerr)
-						? CAN_ERR_CRTL_TX_PASSIVE
-						: CAN_ERR_CRTL_RX_PASSIVE;
-			else
-				cf->data[1] = CAN_ERR_CRTL_TX_PASSIVE |
-					      CAN_ERR_CRTL_RX_PASSIVE;
+	if (new_state != old_state) {
+		if (es.status & M16C_STATE_BUS_OFF) {
+			if (!priv->can.restart_ms)
+				kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
+			netif_carrier_off(priv->netdev);
+		}
+
+		if (es.status == M16C_STATE_BUS_ERROR) {
+			if ((old_state >= CAN_STATE_ERROR_WARNING) ||
+			    (es.txerr < 96 && es.rxerr < 96)) {
+				if (old_state > CAN_STATE_ERROR_ACTIVE) {
+					cf->can_id |= CAN_ERR_PROT;
+					cf->data[2] = CAN_ERR_PROT_ACTIVE;
+				}
+			}
 		}
-	}
 
-	if (es.status == M16C_STATE_BUS_ERROR) {
-		if ((old_state < CAN_STATE_ERROR_WARNING) &&
-		    ((es.txerr >= 96) || (es.rxerr >= 96))) {
-			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (es.txerr > es.rxerr)
-					? CAN_ERR_CRTL_TX_WARNING
-					: CAN_ERR_CRTL_RX_WARNING;
-		} else if (old_state > CAN_STATE_ERROR_ACTIVE) {
+		if (!es.status) {
 			cf->can_id |= CAN_ERR_PROT;
 			cf->data[2] = CAN_ERR_PROT_ACTIVE;
 		}
-	}
 
-	if (!es.status) {
-		cf->can_id |= CAN_ERR_PROT;
-		cf->data[2] = CAN_ERR_PROT_ACTIVE;
-	}
-
-	if (priv->can.restart_ms &&
-	    (old_state >= CAN_STATE_BUS_OFF) &&
-	    (priv->can.state < CAN_STATE_BUS_OFF)) {
-		cf->can_id |= CAN_ERR_RESTARTED;
-		netif_carrier_on(priv->netdev);
+		if (priv->can.restart_ms &&
+		    (old_state >= CAN_STATE_BUS_OFF) &&
+		    (new_state < CAN_STATE_BUS_OFF)) {
+			cf->can_id |= CAN_ERR_RESTARTED;
+			netif_carrier_on(priv->netdev);
+		}
 	}
 
 	if (es.error_factor) {
-- 
1.9.1


^ permalink raw reply related	[relevance 80%]

* [PATCH v5 3/5] can: kvaser_usb: Fix state handling upon BUS_ERROR events
  2015-01-20 21:45 80%   ` [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
@ 2015-01-20 21:47 94%     ` Ahmed S. Darwish
  2015-01-20 21:48 97%       ` [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
      2 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-20 21:47 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: Andri Yngvason, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

While being in an ERROR_WARNING state and receiving further
bus error events with error counts in the range of 97-127 inclusive,
the state handling code erroneously reverts back to ERROR_ACTIVE.

As per the CAN standard recommendations, only revert to ERROR_ACTIVE
when the error counters are less than 96.

Moreover, in certain Kvaser models, the BUS_ERROR flag is always set
along with undefined bits in the M16C status register. Thus use
bitwise ops instead of full equality for checking the register
against bus errors.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 0386d3f..640b0eb 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -635,10 +635,12 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 		new_state = CAN_STATE_BUS_OFF;
 	else if (es->status & M16C_STATE_BUS_PASSIVE)
 		new_state = CAN_STATE_ERROR_PASSIVE;
-
-	if (es->status == M16C_STATE_BUS_ERROR) {
-		if ((cur_state < CAN_STATE_ERROR_WARNING) &&
-		    ((es->txerr >= 96) || (es->rxerr >= 96)))
+	else if (es->status & M16C_STATE_BUS_ERROR) {
+		if ((es->txerr >= 256) || (es->rxerr >= 256))
+			new_state = CAN_STATE_BUS_OFF;
+		else if ((es->txerr >= 128) || (es->rxerr >= 128))
+			new_state = CAN_STATE_ERROR_PASSIVE;
+		else if ((es->txerr >= 96) || (es->rxerr >= 96))
 			new_state = CAN_STATE_ERROR_WARNING;
 		else if (cur_state > CAN_STATE_ERROR_ACTIVE)
 			new_state = CAN_STATE_ERROR_ACTIVE;
@@ -748,15 +750,11 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 			if (!priv->can.restart_ms)
 				kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
 			netif_carrier_off(priv->netdev);
-		}
-
-		if (es.status == M16C_STATE_BUS_ERROR) {
-			if ((old_state >= CAN_STATE_ERROR_WARNING) ||
-			    (es.txerr < 96 && es.rxerr < 96)) {
-				if (old_state > CAN_STATE_ERROR_ACTIVE) {
-					cf->can_id |= CAN_ERR_PROT;
-					cf->data[2] = CAN_ERR_PROT_ACTIVE;
-				}
+		} else if (es.status & M16C_STATE_BUS_ERROR) {
+			if ((es.txerr < 96 && es.rxerr < 96) &&
+			    (old_state > CAN_STATE_ERROR_ACTIVE)) {
+				cf->can_id |= CAN_ERR_PROT;
+				cf->data[2] = CAN_ERR_PROT_ACTIVE;
 			}
 		}
 
-- 
1.9.1


^ permalink raw reply related	[relevance 94%]

* [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
  2015-01-20 21:47 94%     ` [PATCH v5 3/5] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
@ 2015-01-20 21:48 97%       ` Ahmed S. Darwish
  2015-01-20 21:50 39%         ` [PATCH v5 5/5] can: kvaser_usb: Add support for the USBcan-II family Ahmed S. Darwish
    0 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-20 21:48 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: Andri Yngvason, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

On some x86 laptops, plugging a Kvaser device again after an
unplug makes the firmware always ignore the very first command.
For such a case, provide some room for retries instead of
completly exiting the driver init code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 640b0eb..068e76c 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1614,7 +1614,7 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 {
 	struct kvaser_usb *dev;
 	int err = -ENOMEM;
-	int i;
+	int i, retry = 3;
 
 	dev = devm_kzalloc(&intf->dev, sizeof(*dev), GFP_KERNEL);
 	if (!dev)
@@ -1632,7 +1632,15 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	err = kvaser_usb_get_software_info(dev);
+	/* On some x86 laptops, plugging a Kvaser device again after
+	 * an unplug makes the firmware always ignore the very first
+	 * command. For such a case, provide some room for retries
+	 * instead of completly exiting the driver.
+	 */
+	do {
+		err = kvaser_usb_get_software_info(dev);
+	} while (--retry && err == -ETIMEDOUT);
+
 	if (err) {
 		dev_err(&intf->dev,
 			"Cannot get software infos, error %d\n", err);
-- 
1.9.1


^ permalink raw reply related	[relevance 97%]

* [PATCH v5 5/5] can: kvaser_usb: Add support for the USBcan-II family
  2015-01-20 21:48 97%       ` [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
@ 2015-01-20 21:50 39%         ` Ahmed S. Darwish
    1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-20 21:50 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger, Marc Kleine-Budde
  Cc: Andri Yngvason, Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
divided into two major families: 'Leaf', and 'USBcanII'.  From an
Operating System perspective, the firmware of both families behave
in a not too drastically different fashion.

This patch adds support for the USBcanII family of devices to the
current Kvaser Leaf-only driver.

CAN frames sending, receiving, and error handling paths has been
tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
should also work nicely with other products in the same category.

List of new devices supported by this driver update:

         - Kvaser USBcan II HS/HS
         - Kvaser USBcan II HS/LS
         - Kvaser USBcan Rugged ("USBcan Rev B")
         - Kvaser Memorator HS/HS
         - Kvaser Memorator HS/LS
         - Scania VCI2 (if you have the Kvaser logo on top)

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/Kconfig      |   8 +-
 drivers/net/can/usb/kvaser_usb.c | 598 ++++++++++++++++++++++++++++++---------
 2 files changed, 478 insertions(+), 128 deletions(-)

** V5 Changelog:
- Rebase on the new CAN error state changes added for the Leaf driver
- Add minor changes (remove unused commands, constify poniters, etc.)

** V4 Changelog:
- Use type-safe C methods instead of cpp macros
- Remove defensive checks against non-existing families
- Re-order methods to remove forward declarations
- Smaller stuff spotted by earlier review (function prefexes, etc.)

** V3 Changelog:
- Fix padding for the usbcan_msg_tx_acknowledge command
- Remove kvaser_usb->max_channels and the MAX_NET_DEVICES macro
- Rename commands to CMD_LEAF_xxx and CMD_USBCAN_xxx
- Apply checkpatch.pl suggestions ('net/' comments, multi-line strings, etc.)

** V2 Changelog:
- Update Kconfig entries
- Use actual number of CAN channels (instead of max) where appropriate
- Rebase over a new set of UsbcanII-independent driver fixes

diff --git a/drivers/net/can/usb/Kconfig b/drivers/net/can/usb/Kconfig
index a77db919..f6f5500 100644
--- a/drivers/net/can/usb/Kconfig
+++ b/drivers/net/can/usb/Kconfig
@@ -25,7 +25,7 @@ config CAN_KVASER_USB
 	tristate "Kvaser CAN/USB interface"
 	---help---
 	  This driver adds support for Kvaser CAN/USB devices like Kvaser
-	  Leaf Light.
+	  Leaf Light and Kvaser USBcan II.
 
 	  The driver provides support for the following devices:
 	    - Kvaser Leaf Light
@@ -46,6 +46,12 @@ config CAN_KVASER_USB
 	    - Kvaser USBcan R
 	    - Kvaser Leaf Light v2
 	    - Kvaser Mini PCI Express HS
+	    - Kvaser USBcan II HS/HS
+	    - Kvaser USBcan II HS/LS
+	    - Kvaser USBcan Rugged ("USBcan Rev B")
+	    - Kvaser Memorator HS/HS
+	    - Kvaser Memorator HS/LS
+	    - Scania VCI2 (if you have the Kvaser logo on top)
 
 	  If unsure, say N.
 
diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 068e76c..3e1eb5d 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -6,10 +6,12 @@
  * Parts of this driver are based on the following:
  *  - Kvaser linux leaf driver (version 4.78)
  *  - CAN driver for esd CAN-USB/2
+ *  - Kvaser linux usbcanII driver (version 5.3)
  *
  * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
  * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
  * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
+ * Copyright (C) 2015 Valeo A.S.
  */
 
 #include <linux/completion.h>
@@ -30,8 +32,9 @@
 #define RX_BUFFER_SIZE			3072
 #define CAN_USB_CLOCK			8000000
 #define MAX_NET_DEVICES			3
+#define MAX_USBCAN_NET_DEVICES		2
 
-/* Kvaser USB devices */
+/* Kvaser Leaf USB devices */
 #define KVASER_VENDOR_ID		0x0bfd
 #define USB_LEAF_DEVEL_PRODUCT_ID	10
 #define USB_LEAF_LITE_PRODUCT_ID	11
@@ -56,6 +59,24 @@
 #define USB_LEAF_LITE_V2_PRODUCT_ID	288
 #define USB_MINI_PCIE_HS_PRODUCT_ID	289
 
+static inline bool kvaser_is_leaf(const struct usb_device_id *id)
+{
+	return id->idProduct >= USB_LEAF_DEVEL_PRODUCT_ID &&
+	       id->idProduct <= USB_MINI_PCIE_HS_PRODUCT_ID;
+}
+
+/* Kvaser USBCan-II devices */
+#define USB_USBCAN_REVB_PRODUCT_ID	2
+#define USB_VCI2_PRODUCT_ID		3
+#define USB_USBCAN2_PRODUCT_ID		4
+#define USB_MEMORATOR_PRODUCT_ID	5
+
+static inline bool kvaser_is_usbcan(const struct usb_device_id *id)
+{
+	return id->idProduct >= USB_USBCAN_REVB_PRODUCT_ID &&
+	       id->idProduct <= USB_MEMORATOR_PRODUCT_ID;
+}
+
 /* USB devices features */
 #define KVASER_HAS_SILENT_MODE		BIT(0)
 #define KVASER_HAS_TXRX_ERRORS		BIT(1)
@@ -73,7 +94,7 @@
 #define MSG_FLAG_TX_ACK			BIT(6)
 #define MSG_FLAG_TX_REQUEST		BIT(7)
 
-/* Can states */
+/* Can states (M16C CxSTRH register) */
 #define M16C_STATE_BUS_RESET		BIT(0)
 #define M16C_STATE_BUS_ERROR		BIT(4)
 #define M16C_STATE_BUS_PASSIVE		BIT(5)
@@ -98,7 +119,11 @@
 #define CMD_START_CHIP_REPLY		27
 #define CMD_STOP_CHIP			28
 #define CMD_STOP_CHIP_REPLY		29
-#define CMD_GET_CARD_INFO2		32
+
+#define CMD_LEAF_GET_CARD_INFO2		32
+#define CMD_USBCAN_RESET_CLOCK		32
+#define CMD_USBCAN_CLOCK_OVERFLOW_EVENT	33
+
 #define CMD_GET_CARD_INFO		34
 #define CMD_GET_CARD_INFO_REPLY		35
 #define CMD_GET_SOFTWARE_INFO		38
@@ -108,8 +133,9 @@
 #define CMD_RESET_ERROR_COUNTER		49
 #define CMD_TX_ACKNOWLEDGE		50
 #define CMD_CAN_ERROR_EVENT		51
-#define CMD_USB_THROTTLE		77
-#define CMD_LOG_MESSAGE			106
+
+#define CMD_LEAF_USB_THROTTLE		77
+#define CMD_LEAF_LOG_MESSAGE		106
 
 /* error factors */
 #define M16C_EF_ACKE			BIT(0)
@@ -121,6 +147,14 @@
 #define M16C_EF_RCVE			BIT(6)
 #define M16C_EF_TRE			BIT(7)
 
+/* Only Leaf-based devices can report M16C error factors,
+ * thus define our own error status flags for USBCANII
+ */
+#define USBCAN_ERROR_STATE_NONE		0
+#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
+#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
+#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
+
 /* bittiming parameters */
 #define KVASER_USB_TSEG1_MIN		1
 #define KVASER_USB_TSEG1_MAX		16
@@ -137,9 +171,18 @@
 #define KVASER_CTRL_MODE_SELFRECEPTION	3
 #define KVASER_CTRL_MODE_OFF		4
 
-/* log message */
+/* Extended CAN identifier flag */
 #define KVASER_EXTENDED_FRAME		BIT(31)
 
+/* Kvaser USB CAN dongles are divided into two major families:
+ * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
+ * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
+ */
+enum kvaser_usb_family {
+	KVASER_LEAF,
+	KVASER_USBCAN,
+};
+
 struct kvaser_msg_simple {
 	u8 tid;
 	u8 channel;
@@ -148,30 +191,55 @@ struct kvaser_msg_simple {
 struct kvaser_msg_cardinfo {
 	u8 tid;
 	u8 nchannels;
-	__le32 serial_number;
-	__le32 padding;
+	union {
+		struct {
+			__le32 serial_number;
+			__le32 padding;
+		} __packed leaf0;
+		struct {
+			__le32 serial_number_low;
+			__le32 serial_number_high;
+		} __packed usbcan0;
+	} __packed;
 	__le32 clock_resolution;
 	__le32 mfgdate;
 	u8 ean[8];
 	u8 hw_revision;
-	u8 usb_hs_mode;
-	__le16 padding2;
+	union {
+		struct {
+			u8 usb_hs_mode;
+		} __packed leaf1;
+		struct {
+			u8 padding;
+		} __packed usbcan1;
+	} __packed;
+	__le16 padding;
 } __packed;
 
 struct kvaser_msg_cardinfo2 {
 	u8 tid;
-	u8 channel;
+	u8 reserved;
 	u8 pcb_id[24];
 	__le32 oem_unlock_code;
 } __packed;
 
-struct kvaser_msg_softinfo {
+struct leaf_msg_softinfo {
 	u8 tid;
-	u8 channel;
+	u8 padding0;
 	__le32 sw_options;
 	__le32 fw_version;
 	__le16 max_outstanding_tx;
-	__le16 padding[9];
+	__le16 padding1[9];
+} __packed;
+
+struct usbcan_msg_softinfo {
+	u8 tid;
+	u8 fw_name[5];
+	__le16 max_outstanding_tx;
+	u8 padding[6];
+	__le32 fw_version;
+	__le16 checksum;
+	__le16 sw_options;
 } __packed;
 
 struct kvaser_msg_busparams {
@@ -188,36 +256,86 @@ struct kvaser_msg_tx_can {
 	u8 channel;
 	u8 tid;
 	u8 msg[14];
-	u8 padding;
-	u8 flags;
+	union {
+		struct {
+			u8 padding;
+			u8 flags;
+		} __packed leaf;
+		struct {
+			u8 flags;
+			u8 padding;
+		} __packed usbcan;
+	} __packed;
+} __packed;
+
+struct kvaser_msg_rx_can_header {
+	u8 channel;
+	u8 flag;
 } __packed;
 
-struct kvaser_msg_rx_can {
+struct leaf_msg_rx_can {
 	u8 channel;
 	u8 flag;
+
 	__le16 time[3];
 	u8 msg[14];
 } __packed;
 
-struct kvaser_msg_chip_state_event {
+struct usbcan_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
+	u8 msg[14];
+	__le16 time;
+} __packed;
+
+struct leaf_msg_chip_state_event {
 	u8 tid;
 	u8 channel;
+
 	__le16 time[3];
 	u8 tx_errors_count;
 	u8 rx_errors_count;
+
+	u8 status;
+	u8 padding[3];
+} __packed;
+
+struct usbcan_msg_chip_state_event {
+	u8 tid;
+	u8 channel;
+
+	u8 tx_errors_count;
+	u8 rx_errors_count;
+	__le16 time;
+
 	u8 status;
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_tx_acknowledge {
+struct kvaser_msg_tx_acknowledge_header {
 	u8 channel;
 	u8 tid;
+} __packed;
+
+struct leaf_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
 	__le16 time[3];
 	u8 flags;
 	u8 time_offset;
 } __packed;
 
-struct kvaser_msg_error_event {
+struct usbcan_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
+	__le16 time;
+	__le16 padding;
+} __packed;
+
+struct leaf_msg_error_event {
 	u8 tid;
 	u8 flags;
 	__le16 time[3];
@@ -229,6 +347,18 @@ struct kvaser_msg_error_event {
 	u8 error_factor;
 } __packed;
 
+struct usbcan_msg_error_event {
+	u8 tid;
+	u8 padding;
+	u8 tx_errors_count_ch0;
+	u8 rx_errors_count_ch0;
+	u8 tx_errors_count_ch1;
+	u8 rx_errors_count_ch1;
+	u8 status_ch0;
+	u8 status_ch1;
+	__le16 time;
+} __packed;
+
 struct kvaser_msg_ctrl_mode {
 	u8 tid;
 	u8 channel;
@@ -243,7 +373,7 @@ struct kvaser_msg_flush_queue {
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_log_message {
+struct leaf_msg_log_message {
 	u8 channel;
 	u8 flags;
 	__le16 time[3];
@@ -260,21 +390,55 @@ struct kvaser_msg {
 		struct kvaser_msg_simple simple;
 		struct kvaser_msg_cardinfo cardinfo;
 		struct kvaser_msg_cardinfo2 cardinfo2;
-		struct kvaser_msg_softinfo softinfo;
 		struct kvaser_msg_busparams busparams;
+
+		struct kvaser_msg_rx_can_header rx_can_header;
+		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
+
+		union {
+			struct leaf_msg_softinfo softinfo;
+			struct leaf_msg_rx_can rx_can;
+			struct leaf_msg_chip_state_event chip_state_event;
+			struct leaf_msg_tx_acknowledge tx_acknowledge;
+			struct leaf_msg_error_event error_event;
+			struct leaf_msg_log_message log_message;
+		} __packed leaf;
+
+		union {
+			struct usbcan_msg_softinfo softinfo;
+			struct usbcan_msg_rx_can rx_can;
+			struct usbcan_msg_chip_state_event chip_state_event;
+			struct usbcan_msg_tx_acknowledge tx_acknowledge;
+			struct usbcan_msg_error_event error_event;
+		} __packed usbcan;
+
 		struct kvaser_msg_tx_can tx_can;
-		struct kvaser_msg_rx_can rx_can;
-		struct kvaser_msg_chip_state_event chip_state_event;
-		struct kvaser_msg_tx_acknowledge tx_acknowledge;
-		struct kvaser_msg_error_event error_event;
 		struct kvaser_msg_ctrl_mode ctrl_mode;
 		struct kvaser_msg_flush_queue flush_queue;
-		struct kvaser_msg_log_message log_message;
 	} u;
 } __packed;
 
+/* Summary of a kvaser error event, for a unified Leaf/Usbcan error
+ * handling. Some discrepancies between the two families exist:
+ *
+ * - USBCAN firmware does not report M16C "error factors"
+ * - USBCAN controllers has difficulties reporting if the raised error
+ *   event is for ch0 or ch1. They leave such arbitration to the OS
+ *   driver by letting it compare error counters with previous values
+ *   and decide the error event's channel. Thus for USBCAN, the channel
+ *   field is only advisory.
+ */
 struct kvaser_usb_error_summary {
-	u8 channel, status, txerr, rxerr, error_factor;
+	u8 channel, status, txerr, rxerr;
+	union {
+		struct {
+			u8 error_factor;
+		} leaf;
+		struct {
+			u8 other_ch_status;
+			u8 error_state;
+		} usbcan;
+	};
 };
 
 struct kvaser_usb_tx_urb_context {
@@ -292,6 +456,7 @@ struct kvaser_usb {
 
 	u32 fw_version;
 	unsigned int nchannels;
+	enum kvaser_usb_family family;
 
 	bool rxinitdone;
 	void *rxbuf[MAX_RX_URBS];
@@ -315,6 +480,7 @@ struct kvaser_usb_net_priv {
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
+	/* Leaf family IDs */
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
@@ -364,6 +530,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
 		.driver_info = KVASER_HAS_TXRX_ERRORS },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+
+	/* USBCANII family IDs */
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+
 	{ }
 };
 MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
@@ -467,7 +644,14 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	if (err)
 		return err;
 
-	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
+	switch (dev->family) {
+	case KVASER_LEAF:
+		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		break;
+	case KVASER_USBCAN:
+		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		break;
+	}
 
 	return 0;
 }
@@ -486,7 +670,9 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
 		return err;
 
 	dev->nchannels = msg.u.cardinfo.nchannels;
-	if (dev->nchannels > MAX_NET_DEVICES)
+	if ((dev->nchannels > MAX_NET_DEVICES) ||
+	    (dev->family == KVASER_USBCAN &&
+	     dev->nchannels > MAX_USBCAN_NET_DEVICES))
 		return -EINVAL;
 
 	return 0;
@@ -500,8 +686,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	u8 channel = msg->u.tx_acknowledge.channel;
-	u8 tid = msg->u.tx_acknowledge.tid;
+	u8 channel, tid;
+
+	channel = msg->u.tx_acknowledge_header.channel;
+	tid = msg->u.tx_acknowledge_header.tid;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -623,12 +811,12 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 						 const struct kvaser_usb_error_summary *es,
 						 struct can_frame *cf)
 {
-	struct net_device_stats *stats;
+	struct kvaser_usb *dev = priv->dev;
+	struct net_device_stats *stats = &priv->netdev->stats;
 	enum can_state cur_state, new_state, tx_state, rx_state;
 
 	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	stats = &priv->netdev->stats;
 	new_state = cur_state = priv->can.state;
 
 	if (es->status & M16C_STATE_BUS_OFF)
@@ -663,9 +851,22 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 		priv->can.can_stats.restarts++;
 	}
 
-	if (es->error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			priv->can.can_stats.bus_error++;
+			stats->rx_errors++;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
+			stats->tx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
+			stats->rx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			priv->can.can_stats.bus_error++;
+		}
+		break;
 	}
 
 	priv->bec.txerr = es->txerr;
@@ -673,53 +874,24 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 }
 
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
-				const struct kvaser_msg *msg)
+				const struct kvaser_usb_error_summary *es)
 {
 	struct can_frame *cf, tmp_cf = { .can_id = CAN_ERR_FLAG, .can_dlc = CAN_ERR_DLC };
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	struct kvaser_usb_error_summary es = { };
 	enum can_state old_state, new_state;
 
-	switch (msg->id) {
-	case CMD_CAN_ERROR_EVENT:
-		es.channel = msg->u.error_event.channel;
-		es.status =  msg->u.error_event.status;
-		es.txerr = msg->u.error_event.tx_errors_count;
-		es.rxerr = msg->u.error_event.rx_errors_count;
-		es.error_factor = msg->u.error_event.error_factor;
-		break;
-	case CMD_LOG_MESSAGE:
-		es.channel = msg->u.log_message.channel;
-		es.status = msg->u.log_message.data[0];
-		es.txerr = msg->u.log_message.data[2];
-		es.rxerr = msg->u.log_message.data[3];
-		es.error_factor = msg->u.log_message.data[1];
-		break;
-	case CMD_CHIP_STATE_EVENT:
-		es.channel = msg->u.chip_state_event.channel;
-		es.status =  msg->u.chip_state_event.status;
-		es.txerr = msg->u.chip_state_event.tx_errors_count;
-		es.rxerr = msg->u.chip_state_event.rx_errors_count;
-		es.error_factor = 0;
-		break;
-	default:
-		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
-			msg->id);
-		return;
-	}
-
-	if (es.channel >= dev->nchannels) {
+	if (es->channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", es.channel);
+			"Invalid channel number (%d)\n", es->channel);
 		return;
 	}
 
-	priv = dev->nets[es.channel];
+	priv = dev->nets[es->channel];
 	stats = &priv->netdev->stats;
 
-	if (es.status & M16C_STATE_BUS_RESET) {
+	if (es->status & M16C_STATE_BUS_RESET) {
 		kvaser_usb_unlink_tx_urbs(priv);
 		return;
 	}
@@ -735,7 +907,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	 * frame ID and data to userspace. Remove stack allocation afterwards.
 	 */
 	old_state = priv->can.state;
-	kvaser_usb_rx_error_update_can_state(priv, &es, &tmp_cf);
+	kvaser_usb_rx_error_update_can_state(priv, es, &tmp_cf);
 	new_state = priv->can.state;
 
 	skb = alloc_can_err_skb(priv->netdev, &cf);
@@ -746,19 +918,19 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	memcpy(cf, &tmp_cf, sizeof(*cf));
 
 	if (new_state != old_state) {
-		if (es.status & M16C_STATE_BUS_OFF) {
+		if (es->status & M16C_STATE_BUS_OFF) {
 			if (!priv->can.restart_ms)
 				kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
 			netif_carrier_off(priv->netdev);
-		} else if (es.status & M16C_STATE_BUS_ERROR) {
-			if ((es.txerr < 96 && es.rxerr < 96) &&
+		} else if (es->status & M16C_STATE_BUS_ERROR) {
+			if ((es->txerr < 96 && es->rxerr < 96) &&
 			    (old_state > CAN_STATE_ERROR_ACTIVE)) {
 				cf->can_id |= CAN_ERR_PROT;
 				cf->data[2] = CAN_ERR_PROT_ACTIVE;
 			}
 		}
 
-		if (!es.status) {
+		if (!es->status) {
 			cf->can_id |= CAN_ERR_PROT;
 			cf->data[2] = CAN_ERR_PROT_ACTIVE;
 		}
@@ -771,34 +943,161 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 	}
 
-	if (es.error_factor) {
-		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
-
-		if (es.error_factor & M16C_EF_ACKE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (es.error_factor & M16C_EF_CRCE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
-					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (es.error_factor & M16C_EF_FORME)
-			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (es.error_factor & M16C_EF_STFE)
-			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (es.error_factor & M16C_EF_BITE0)
-			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (es.error_factor & M16C_EF_BITE1)
-			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (es.error_factor & M16C_EF_TRE)
-			cf->data[2] |= CAN_ERR_PROT_TX;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+
+			if (es->leaf.error_factor & M16C_EF_ACKE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
+			if (es->leaf.error_factor & M16C_EF_CRCE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
+						CAN_ERR_PROT_LOC_CRC_DEL);
+			if (es->leaf.error_factor & M16C_EF_FORME)
+				cf->data[2] |= CAN_ERR_PROT_FORM;
+			if (es->leaf.error_factor & M16C_EF_STFE)
+				cf->data[2] |= CAN_ERR_PROT_STUFF;
+			if (es->leaf.error_factor & M16C_EF_BITE0)
+				cf->data[2] |= CAN_ERR_PROT_BIT0;
+			if (es->leaf.error_factor & M16C_EF_BITE1)
+				cf->data[2] |= CAN_ERR_PROT_BIT1;
+			if (es->leaf.error_factor & M16C_EF_TRE)
+				cf->data[2] |= CAN_ERR_PROT_TX;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			cf->can_id |= CAN_ERR_BUSERROR;
+		}
+		break;
 	}
 
-	cf->data[6] = es.txerr;
-	cf->data[7] = es.rxerr;
+	cf->data[6] = es->txerr;
+	cf->data[7] = es->rxerr;
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
 	netif_rx(skb);
 }
 
+/* For USBCAN, report error to userspace iff the channels's errors counter
+ * has increased, or we're the only channel seeing a bus error state.
+ */
+static void kvaser_usbcan_conditionally_rx_error(const struct kvaser_usb *dev,
+						 struct kvaser_usb_error_summary *es)
+{
+	struct kvaser_usb_net_priv *priv;
+	int channel;
+	bool report_error;
+
+	channel = es->channel;
+	if (channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", channel);
+		return;
+	}
+
+	priv = dev->nets[channel];
+	report_error = false;
+
+	if (es->txerr > priv->bec.txerr) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
+		report_error = true;
+	}
+	if (es->rxerr > priv->bec.rxerr) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
+		report_error = true;
+	}
+	if ((es->status & M16C_STATE_BUS_ERROR) &&
+	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
+		report_error = true;
+	}
+
+	if (report_error)
+		kvaser_usb_rx_error(dev, es);
+}
+
+static void kvaser_usbcan_rx_error(const struct kvaser_usb *dev,
+				   const struct kvaser_msg *msg)
+{
+	struct kvaser_usb_error_summary es = { };
+
+	switch (msg->id) {
+	/* Sometimes errors are sent as unsolicited chip state events */
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.usbcan.chip_state_event.channel;
+		es.status =  msg->u.usbcan.chip_state_event.status;
+		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
+		kvaser_usbcan_conditionally_rx_error(dev, &es);
+		break;
+
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = 0;
+		es.status = msg->u.usbcan.error_event.status_ch0;
+		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
+		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
+		es.usbcan.other_ch_status =
+			msg->u.usbcan.error_event.status_ch1;
+		kvaser_usbcan_conditionally_rx_error(dev, &es);
+
+		/* The USBCAN firmware supports up to 2 channels.
+		 * Now that ch0 was checked, check if ch1 has any errors.
+		 */
+		if (dev->nchannels == MAX_USBCAN_NET_DEVICES) {
+			es.channel = 1;
+			es.status = msg->u.usbcan.error_event.status_ch1;
+			es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch1;
+			es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch1;
+			es.usbcan.other_ch_status =
+				msg->u.usbcan.error_event.status_ch0;
+			kvaser_usbcan_conditionally_rx_error(dev, &es);
+		}
+		break;
+
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+	}
+}
+
+static void kvaser_leaf_rx_error(const struct kvaser_usb *dev,
+				 const struct kvaser_msg *msg)
+{
+	struct kvaser_usb_error_summary es = { };
+
+	switch (msg->id) {
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = msg->u.leaf.error_event.channel;
+		es.status =  msg->u.leaf.error_event.status;
+		es.txerr = msg->u.leaf.error_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
+		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
+		break;
+	case CMD_LEAF_LOG_MESSAGE:
+		es.channel = msg->u.leaf.log_message.channel;
+		es.status = msg->u.leaf.log_message.data[0];
+		es.txerr = msg->u.leaf.log_message.data[2];
+		es.rxerr = msg->u.leaf.log_message.data[3];
+		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
+		break;
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.leaf.chip_state_event.channel;
+		es.status =  msg->u.leaf.chip_state_event.status;
+		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
+		es.leaf.error_factor = 0;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+		return;
+	}
+
+	kvaser_usb_rx_error(dev, &es);
+}
+
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 				  const struct kvaser_msg *msg)
 {
@@ -806,16 +1105,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	struct sk_buff *skb;
 	struct net_device_stats *stats = &priv->netdev->stats;
 
-	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
+	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
 		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
-			   msg->u.rx_can.flag);
+			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
 		return;
 	}
 
-	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
 		stats->rx_over_errors++;
 		stats->rx_errors++;
 
@@ -841,7 +1140,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
-	u8 channel = msg->u.rx_can.channel;
+	u8 channel = msg->u.rx_can_header.channel;
+	const u8 *rx_msg = NULL;	/* GCC */
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -852,60 +1152,68 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
-	    (msg->id == CMD_LOG_MESSAGE)) {
-		kvaser_usb_rx_error(dev, msg);
+	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+	    (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE)) {
+		kvaser_leaf_rx_error(dev, msg);
 		return;
-	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
-					 MSG_FLAG_NERR |
-					 MSG_FLAG_OVERRUN)) {
+	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
+						MSG_FLAG_NERR |
+						MSG_FLAG_OVERRUN)) {
 		kvaser_usb_rx_can_err(priv, msg);
 		return;
-	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
+	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
 		netdev_warn(priv->netdev,
 			    "Unhandled frame (flags: 0x%02x)",
-			    msg->u.rx_can.flag);
+			    msg->u.rx_can_header.flag);
 		return;
 	}
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		rx_msg = msg->u.leaf.rx_can.msg;
+		break;
+	case KVASER_USBCAN:
+		rx_msg = msg->u.usbcan.rx_can.msg;
+		break;
+	}
+
 	skb = alloc_can_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->tx_dropped++;
 		return;
 	}
 
-	if (msg->id == CMD_LOG_MESSAGE) {
-		cf->can_id = le32_to_cpu(msg->u.log_message.id);
+	if (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE) {
+		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
 		if (cf->can_id & KVASER_EXTENDED_FRAME)
 			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
 		else
 			cf->can_id &= CAN_SFF_MASK;
 
-		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
+		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
 
-		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.log_message.data,
+			memcpy(cf->data, &msg->u.leaf.log_message.data,
 			       cf->can_dlc);
 	} else {
-		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
-			     (msg->u.rx_can.msg[1] & 0x3f);
+		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
 
 		if (msg->id == CMD_RX_EXT_MESSAGE) {
 			cf->can_id <<= 18;
-			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
-				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
-				      (msg->u.rx_can.msg[4] & 0x3f);
+			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
+				      ((rx_msg[3] & 0xff) << 6) |
+				      (rx_msg[4] & 0x3f);
 			cf->can_id |= CAN_EFF_FLAG;
 		}
 
-		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
+		cf->can_dlc = get_can_dlc(rx_msg[5]);
 
-		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.rx_can.msg[6],
+			memcpy(cf->data, &rx_msg[6],
 			       cf->can_dlc);
 	}
 
@@ -968,21 +1276,35 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 
 	case CMD_RX_STD_MESSAGE:
 	case CMD_RX_EXT_MESSAGE:
-	case CMD_LOG_MESSAGE:
+		kvaser_usb_rx_can_msg(dev, msg);
+		break;
+
+	case CMD_LEAF_LOG_MESSAGE:
+		if (dev->family != KVASER_LEAF)
+			goto warn;
 		kvaser_usb_rx_can_msg(dev, msg);
 		break;
 
 	case CMD_CHIP_STATE_EVENT:
 	case CMD_CAN_ERROR_EVENT:
-		kvaser_usb_rx_error(dev, msg);
+		if (dev->family == KVASER_LEAF)
+			kvaser_leaf_rx_error(dev, msg);
+		else
+			kvaser_usbcan_rx_error(dev, msg);
 		break;
 
 	case CMD_TX_ACKNOWLEDGE:
 		kvaser_usb_tx_acknowledge(dev, msg);
 		break;
 
+	/* Ignored messages */
+	case CMD_USBCAN_CLOCK_OVERFLOW_EVENT:
+		if (dev->family != KVASER_USBCAN)
+			goto warn;
+		break;
+
 	default:
-		dev_warn(dev->udev->dev.parent,
+warn:		dev_warn(dev->udev->dev.parent,
 			 "Unhandled message (%d)\n", msg->id);
 		break;
 	}
@@ -1202,7 +1524,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 				  dev->rxbuf[i],
 				  dev->rxbuf_dma[i]);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++) {
+	for (i = 0; i < dev->nchannels; i++) {
 		struct kvaser_usb_net_priv *priv = dev->nets[i];
 
 		if (priv)
@@ -1310,6 +1632,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	u8 *msg_tx_can_flags = NULL;		/* GCC */
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1331,9 +1654,19 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	msg = buf;
 	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
-	msg->u.tx_can.flags = 0;
 	msg->u.tx_can.channel = priv->channel;
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
+		break;
+	case KVASER_USBCAN:
+		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
+		break;
+	}
+
+	*msg_tx_can_flags = 0;
+
 	if (cf->can_id & CAN_EFF_FLAG) {
 		msg->id = CMD_TX_EXT_MESSAGE;
 		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
@@ -1351,7 +1684,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
 
 	if (cf->can_id & CAN_RTR_FLAG)
-		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
+		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
@@ -1620,6 +1953,17 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 	if (!dev)
 		return -ENOMEM;
 
+	if (kvaser_is_leaf(id)) {
+		dev->family = KVASER_LEAF;
+	} else if (kvaser_is_usbcan(id)) {
+		dev->family = KVASER_USBCAN;
+	} else {
+		dev_err(&intf->dev,
+			"Product ID (%d) does not belong to any known Kvaser USB family",
+			id->idProduct);
+		return -ENODEV;
+	}
+
 	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
 	if (err) {
 		dev_err(&intf->dev, "Cannot get usb endpoint(s)");
-- 
1.9.1


^ permalink raw reply related	[relevance 39%]

* Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
  @ 2015-01-21 14:43 61%         ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-21 14:43 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Andri Yngvason, Olivier Sobrie, Oliver Hartkopp,
	Marc Kleine-Budde, Linux-CAN, netdev, LKML

Hi!

On Wed, Jan 21, 2015 at 12:53:58PM +0100, Wolfgang Grandegger wrote:
> On Wed, 21 Jan 2015 10:33:19 +0000, Andri Yngvason
> <andri.yngvason@marel.com> wrote:
> > Quoting Ahmed S. Darwish (2015-01-20 21:45:37)
> >> From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> >> 
> >> Replace most of the can interface's state and error counters
> >> handling with the new can-dev can_change_state() mechanism.
> >> 
> >> Suggested-by: Andri Yngvason <andri.yngvason@marel.com>
> >> Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> >> ---
> >>  drivers/net/can/usb/kvaser_usb.c | 114
> >>  +++++++++++++++++++--------------------
> >>  1 file changed, 55 insertions(+), 59 deletions(-)
> >> 
> >> diff --git a/drivers/net/can/usb/kvaser_usb.c
> >> b/drivers/net/can/usb/kvaser_usb.c
> >> index 971c5f9..0386d3f 100644
> >> --- a/drivers/net/can/usb/kvaser_usb.c
> >> +++ b/drivers/net/can/usb/kvaser_usb.c
> 
> ...
> > 
> > Looks good.
> 
> Would be nice to see some "candump" traces as well.

Sure. The USBCan-II device trace below is generated after applying
all patches in the series, especially patch #3, which fixes some
some invalid CAN state transitions logic in the original driver.

##########################################################################

candump on a PC, Kvaser USB device on the receiving end:

 ...
 (000.011392)  can0  71D   [8]  5B 06 00 00 00 00 00 00
 (000.009270)  can0  712   [3]  5C 06 00
 (000.010691)  can0  0F3   [7]  5D 06 00 00 00 00 00
 (000.010443)  can0  63E   [8]  5E 06 00 00 00 00 00 00
 (000.010112)  can0  502   [8]  5F 06 00 00 00 00 00 00
 (000.009944)  can0  39A   [8]  60 06 00 00 00 00 00 00
 (000.010186)  can0  721   [8]  61 06 00 00 00 00 00 00
 (000.009628)  can0  5B7   [6]  62 06 00 00 00 00
 (000.009784)  can0  1D7   [4]  63 06 00 00
 (000.010806)  can0  4FE   [8]  64 06 00 00 00 00 00 00
 (000.008897)  can0  75E   [1]  65
 (000.010257)  can0  1EA   [2]  66 06

<-- Unplug the cable -->
 (000.010640)  can0  20000080   [8]  00 00 00 00 00 00 00 01   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{1}}

<-- Replug the cable, after 12 seconds -->
 (044.345134)  can0  20000080   [8]  00 00 00 00 00 00 00 02   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{2}}

 (000.002730)  can0  75E   [8]  67 06 00 00 00 00 00 00
 (000.002097)  can0  696   [6]  68 06 00 00 00 00
 (000.002328)  can0  45A   [8]  69 06 00 00 00 00 00 00
 (000.002484)  can0  496   [8]  6A 06 00 00 00 00 00 00
 (000.002458)  can0  604   [8]  6B 06 00 00 00 00 00 00
 (000.002252)  can0  27B   [7]  6C 06 00 00 00 00 00
 (000.002420)  can0  48F   [8]  6D 06 00 00 00 00 00 00
 (000.001306)  can0  1B3   [1]  6E
 (000.002518)  can0  145   [8]  6F 06 00 00 00 00 00 00
 (000.002262)  can0  6EA   [7]  70 06 00 00 00 00 00
 (000.001053)  can0  2DC   [1]  71
 (000.001731)  can0  1DD   [4]  72 06 00 00
 (000.002332)  can0  455   [8]  73 06 00 00 00 00 00 00
 ...

If the cable was _swiftly_ plugged and unplugged, no errors appear.
So it seems the errors above are just due to noise.

##########################################################################

Afterwards, candump on a PC, Kvaser USB device on the sending end:

 ...
 (000.008784)  can0  60A   [1]  C1
 (000.011341)  can0  2A8   [8]  C2 0A 00 00 00 00 00 00
 (000.009873)  can0  03D   [7]  C3 0A 00 00 00 00 00
 (000.010394)  can0  55C   [8]  C4 0A 00 00 00 00 00 00
 (000.009979)  can0  45A   [8]  C5 0A 00 00 00 00 00 00
 (000.010125)  can0  6E8   [8]  C6 0A 00 00 00 00 00 00
 (000.010149)  can0  4EE   [8]  C7 0A 00 00 00 00 00 00
 (000.010102)  can0  5D2   [8]  C8 0A 00 00 00 00 00 00
 (000.010000)  can0  61F   [8]  C9 0A 00 00 00 00 00 00
 (000.010271)  can0  5F8   [8]  CA 0A 00 00 00 00 00 00

<-- Unplug the cable -->

 (000.009106)  can0  20000080   [8]  00 00 00 00 00 00 08 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{8}{0}}
 (000.001872)  can0  20000080   [8]  00 00 00 00 00 00 10 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{16}{0}}
 (000.001748)  can0  20000080   [8]  00 00 00 00 00 00 18 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{24}{0}}
 (000.001751)  can0  20000080   [8]  00 00 00 00 00 00 20 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{32}{0}}
 (000.001874)  can0  20000080   [8]  00 00 00 00 00 00 28 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{40}{0}}
 (000.001625)  can0  20000080   [8]  00 00 00 00 00 00 30 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{48}{0}}
 (000.001875)  can0  20000080   [8]  00 00 00 00 00 00 38 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{56}{0}}
 (000.001751)  can0  20000080   [8]  00 00 00 00 00 00 40 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{64}{0}}
 (000.001761)  can0  20000080   [8]  00 00 00 00 00 00 48 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{72}{0}}
 (000.001743)  can0  20000080   [8]  00 00 00 00 00 00 50 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{80}{0}}
 (000.001910)  can0  20000080   [8]  00 00 00 00 00 00 58 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{88}{0}}
 (000.001753)  can0  20000084   [8]  00 08 00 00 00 00 60 00   ERRORFRAME
	controller-problem{tx-error-warning}
	bus-error
	error-counter-tx-rx{{96}{0}}
 (000.001720)  can0  20000080   [8]  00 00 00 00 00 00 68 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{104}{0}}
 (000.001876)  can0  20000080   [8]  00 00 00 00 00 00 70 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{112}{0}}
 (000.001749)  can0  20000080   [8]  00 00 00 00 00 00 78 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{120}{0}}
 (000.001771)  can0  20000084   [8]  00 20 00 00 00 00 80 00   ERRORFRAME
	controller-problem{tx-error-passive}
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.001868)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.001982)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{128}{0}}

(( Then a continous flood, exactly similar to the above packet, appears.
   Unfortunately this flooding is a firmware problem. ))

<-- Replug the cable, after a good amount of time -->

 (000.520485)  can0  33D   [4]  CB 0A 00 00
 (000.002693)  can0  42E   [8]  CC 0A 00 00 00 00 00 00
 (000.001795)  can0  319   [4]  CD 0A 00 00
 (000.002705)  can0  3B1   [8]  CE 0A 00 00 00 00 00 00
 (000.001295)  can0  4CC   [2]  CF 0A
 (000.002205)  can0  42B   [6]  D0 0A 00 00 00 00
 (000.001620)  can0  5A2   [3]  D1 0A 00
 (000.002636)  can0  691   [8]  D2 0A 00 00 00 00 00 00
 (000.002615)  can0  36A   [8]  D3 0A 00 00 00 00 00 00
 (000.001729)  can0  068   [4]  D4 0A 00 00
 (000.001195)  can0  4C8   [1]  D5
 ...

##########################################################################

Bus-off Testing:

candump on a PC, Kvaser device on the sending end. An i.mx6 ARM
board with flexcan is on the receiving end:

 (000.010319)  can0  5CC   [8]  90 02 00 00 00 00 00 00
 (000.008747)  can0  679   [1]  91
 (000.011442)  can0  011   [8]  92 02 00 00 00 00 00 00
 (000.008991)  can0  631   [2]  93 02
 (000.011097)  can0  532   [7]  94 02 00 00 00 00 00
 (000.009781)  can0  0A9   [5]  95 02 00 00 00
 (000.010792)  can0  1DD   [8]  96 02 00 00 00 00 00 00
 (000.010026)  can0  44E   [8]  97 02 00 00 00 00 00 00
 (000.010181)  can0  76A   [8]  98 02 00 00 00 00 00 00
 (000.008867)  can0  1A5   [1]  99
 (000.011322)  can0  2B4   [8]  9A 02 00 00 00 00 00 00

<-- Unplug the can low and high wires from the board -->

 (000.009688)  can0  20000080   [8]  00 00 00 00 00 00 08 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{8}{0}}
 (000.002246)  can0  20000080   [8]  00 00 00 00 00 00 10 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{16}{0}}
 (000.002124)  can0  20000080   [8]  00 00 00 00 00 00 18 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{24}{0}}
 (000.002115)  can0  20000080   [8]  00 00 00 00 00 00 20 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{32}{0}}
 (000.002132)  can0  20000080   [8]  00 00 00 00 00 00 28 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{40}{0}}
 (000.002266)  can0  20000080   [8]  00 00 00 00 00 00 30 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{48}{0}}
 (000.002187)  can0  20000080   [8]  00 00 00 00 00 00 38 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{56}{0}}
 (000.002046)  can0  20000080   [8]  00 00 00 00 00 00 40 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{64}{0}}
 (000.002076)  can0  20000080   [8]  00 00 00 00 00 00 48 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{72}{0}}
 (000.002406)  can0  20000080   [8]  00 00 00 00 00 00 50 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{80}{0}}
 (000.001969)  can0  20000080   [8]  00 00 00 00 00 00 58 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{88}{0}}
 (000.002388)  can0  20000084   [8]  00 08 00 00 00 00 60 00   ERRORFRAME
	controller-problem{tx-error-warning}
	bus-error
	error-counter-tx-rx{{96}{0}}
 (000.002021)  can0  20000080   [8]  00 00 00 00 00 00 68 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{104}{0}}
 (000.002110)  can0  20000080   [8]  00 00 00 00 00 00 70 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{112}{0}}
 (000.002155)  can0  20000080   [8]  00 00 00 00 00 00 78 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{120}{0}}
 (000.002140)  can0  20000084   [8]  00 20 00 00 00 00 80 00   ERRORFRAME
	controller-problem{tx-error-passive}
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.002324)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.002416)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.002237)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{128}{0}}

(( Then a continous flood, exactly similar to the above packet, appears ))

<-- Short-circuit the can high and low wires -->

 (000.002364)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.002108)  can0  20000080   [8]  00 00 00 00 00 00 88 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{136}{0}}
 (000.000494)  can0  20000080   [8]  00 00 00 00 00 00 90 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{144}{0}}
 (000.000523)  can0  20000080   [8]  00 00 00 00 00 00 98 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{152}{0}}
 (000.000661)  can0  20000080   [8]  00 00 00 00 00 00 A0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{160}{0}}
 (000.000464)  can0  20000080   [8]  00 00 00 00 00 00 A8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{168}{0}}
 (000.000534)  can0  20000080   [8]  00 00 00 00 00 00 B0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{176}{0}}
 (000.000499)  can0  20000080   [8]  00 00 00 00 00 00 B8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{184}{0}}
 (000.000626)  can0  20000080   [8]  00 00 00 00 00 00 C0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{192}{0}}
 (000.000373)  can0  20000080   [8]  00 00 00 00 00 00 C8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{200}{0}}
 (000.000627)  can0  20000080   [8]  00 00 00 00 00 00 D0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{208}{0}}
 (000.000507)  can0  20000080   [8]  00 00 00 00 00 00 D8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{216}{0}}
 (000.000501)  can0  20000080   [8]  00 00 00 00 00 00 E0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{224}{0}}
 (000.000459)  can0  20000080   [8]  00 00 00 00 00 00 E8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{232}{0}}
 (000.000606)  can0  20000080   [8]  00 00 00 00 00 00 F0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{240}{0}}
 (000.000454)  can0  20000080   [8]  00 00 00 00 00 00 F8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{248}{0}}
 (000.000664)  can0  200000C0   [8]  00 00 00 00 00 00 00 00   ERRORFRAME
	bus-off
	bus-error

 (( No further bus activity ))


##########################################################################

Regards,
Darwish

^ permalink raw reply	[relevance 61%]

* Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
  @ 2015-01-21 15:36 96%             ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-21 15:36 UTC (permalink / raw)
  To: Andri Yngvason
  Cc: Wolfgang Grandegger, Olivier Sobrie, Oliver Hartkopp,
	Marc Kleine-Budde, Linux-CAN, netdev, LKML

On Wed, Jan 21, 2015 at 03:00:15PM +0000, Andri Yngvason wrote:
> Quoting Ahmed S. Darwish (2015-01-21 14:43:23)
> > Hi!

...

> > <-- Unplug the cable -->
> > 
> >  (000.009106)  can0  20000080   [8]  00 00 00 00 00 00 08 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{8}{0}}
> >  (000.001872)  can0  20000080   [8]  00 00 00 00 00 00 10 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{16}{0}}
> [...]
> >         error-counter-tx-rx{{80}{0}}
> >  (000.001910)  can0  20000080   [8]  00 00 00 00 00 00 58 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{88}{0}}
> >  (000.001753)  can0  20000084   [8]  00 08 00 00 00 00 60 00   ERRORFRAME
> >         controller-problem{tx-error-warning}
> Good.
> >         bus-error
> >         error-counter-tx-rx{{96}{0}}

Nice.

> >  (000.001720)  can0  20000080   [8]  00 00 00 00 00 00 68 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{104}{0}}
> >  (000.001876)  can0  20000080   [8]  00 00 00 00 00 00 70 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{112}{0}}
> >  (000.001749)  can0  20000080   [8]  00 00 00 00 00 00 78 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{120}{0}}
> >  (000.001771)  can0  20000084   [8]  00 20 00 00 00 00 80 00   ERRORFRAME
> >         controller-problem{tx-error-passive}
> Also good.
> >         bus-error
> >         error-counter-tx-rx{{128}{0}}

Also nice :-)

> >  (000.001868)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{128}{0}}
> >  (000.001982)  can0  20000080   [8]  00 00 00 00 00 00 80 00   ERRORFRAME
> >         bus-error
> >         error-counter-tx-rx{{128}{0}}
> > 
> > (( Then a continous flood, exactly similar to the above packet, appears.
> >    Unfortunately this flooding is a firmware problem. ))
> > 
> > <-- Replug the cable, after a good amount of time -->
> >
> Where are the reverse state transitions?
> > 

Hmmm...

[ ... ]
> 
> Reverse state transitions are missing from the logs. See comments above.
> 

When the device is on the _receiving_ end, and I unplug the CAN cable after
introducing some noise to the level of reaching WARNING or PASSIVE, I
receive a BUS_ERROR event with the rxerr count reset back to 0 or 1. In
that case, the driver correctly transitions back the state to ERROR_ACTIVE
and candump produces something similar to:

    (000.000362)  can0  2000008C   [8]  00 40 40 00 00 00 00 01   ERRORFRAME
    controller-problem{}
    protocol-violation{{back-to-error-active}{}}
    bus-error
    error-counter-tx-rx{{0}{1}}

which is, AFAIK, the correct behaviour from the driver side.

Meanwhile, when the device is on the _sending_ end and I re-plug the CAN
cable again. Sometimes I receive events with txerr reset to 0 or 1, and
the driver correctly reverts back to ERROR_ACTIVE in that case. But on
another times like the quoted case above, I don't receive any events
resetting txerr back -- only data packets on the bus.

So, What can the driver do given the above?

Thanks,
Darwish

P.S. just in case, I'll also re-check now if the driver unintentionally
drops any important events resetting the txerr count back after a CAN
cable replug -- preventing the code from returning to ERROR_ACTIVE in
the process.


^ permalink raw reply	[relevance 96%]

* Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
  @ 2015-01-23  6:07 90%                 ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-23  6:07 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Andri Yngvason, Olivier Sobrie, Oliver Hartkopp,
	Marc Kleine-Budde, Linux-CAN, netdev, LKML

On Wed, Jan 21, 2015 at 05:13:45PM +0100, Wolfgang Grandegger wrote:
> On Wed, 21 Jan 2015 10:36:47 -0500, "Ahmed S. Darwish"
> <darwish.07@gmail.com> wrote:
> > On Wed, Jan 21, 2015 at 03:00:15PM +0000, Andri Yngvason wrote:
> >> Quoting Ahmed S. Darwish (2015-01-21 14:43:23)
> >> > Hi!
> > 
> > ...
> > 
> >> > <-- Unplug the cable -->
> >> > 
> >> >  (000.009106)  can0  20000080   [8]  00 00 00 00 00 00 08 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{8}{0}}
> >> >  (000.001872)  can0  20000080   [8]  00 00 00 00 00 00 10 00  
> 
> For a bus-errors I would also expcect some more information in the
> data[2..3] fields. But these are always zero.
> 

M16C error factors made it possible to report things like
CAN_ERR_PROT_FORM/STUFF/BIT0/BIT1/TX in data[2], and
CAN_ERR_PROT_LOC_ACK/CRC_DEL in data[3].

Unfortunately such error factors are only reported in Leaf, but
not in USBCan-II due to the wire format change in the error event:

	struct leaf_msg_error_event {
		u8 tid;
		u8 flags;
		__le16 time[3];
		u8 channel;
		u8 padding;
		u8 tx_errors_count;
		u8 rx_errors_count;
		u8 status;
		u8 error_factor;
	} __packed;

	struct usbcan_msg_error_event {
		u8 tid;
		u8 padding;
		u8 tx_errors_count_ch0;
		u8 rx_errors_count_ch0;
		u8 tx_errors_count_ch1;
		u8 rx_errors_count_ch1;
		u8 status_ch0;
		u8 status_ch1;
		__le16 time;
	} __packed;

I speculate that the wire format was changed due to controller
bugs in the USBCan-II, which was slightly mentioned in their
data sheets here:

	http://www.kvaser.com/canlib-webhelp/page_hardware_specific_can_controllers.html

So it seems there's really no way for filling such bus error
info given the very limited amount of data exported :-(

The issue of incomplete data does not even stop here, kindly
check below notes regarding reverse state transitions:

> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{16}{0}}
> >> [...]
> >> >         error-counter-tx-rx{{80}{0}}
> >> >  (000.001910)  can0  20000080   [8]  00 00 00 00 00 00 58 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{88}{0}}
> >> >  (000.001753)  can0  20000084   [8]  00 08 00 00 00 00 60 00  
> >> >  ERRORFRAME
> >> >         controller-problem{tx-error-warning}
> >> Good.
> >> >         bus-error
> >> >         error-counter-tx-rx{{96}{0}}
> > 
> > Nice.
> > 
> >> >  (000.001720)  can0  20000080   [8]  00 00 00 00 00 00 68 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{104}{0}}
> >> >  (000.001876)  can0  20000080   [8]  00 00 00 00 00 00 70 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{112}{0}}
> >> >  (000.001749)  can0  20000080   [8]  00 00 00 00 00 00 78 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{120}{0}}
> >> >  (000.001771)  can0  20000084   [8]  00 20 00 00 00 00 80 00  
> >> >  ERRORFRAME
> >> >         controller-problem{tx-error-passive}
> >> Also good.
> >> >         bus-error
> >> >         error-counter-tx-rx{{128}{0}}
> > 
> > Also nice :-)
> > 
> >> >  (000.001868)  can0  20000080   [8]  00 00 00 00 00 00 80 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{128}{0}}
> >> >  (000.001982)  can0  20000080   [8]  00 00 00 00 00 00 80 00  
> >> >  ERRORFRAME
> >> >         bus-error
> >> >         error-counter-tx-rx{{128}{0}}
> >> > 
> >> > (( Then a continous flood, exactly similar to the above packet,
> >> > appears.
> >> >    Unfortunately this flooding is a firmware problem. ))
> >> > 
> >> > <-- Replug the cable, after a good amount of time -->
> >> >
> >> Where are the reverse state transitions?
> >> > 
> > 
> > Hmmm...
> > 
> > [ ... ]
> >> 
> >> Reverse state transitions are missing from the logs. See comments
> above.
> >> 
> > 
> > When the device is on the _receiving_ end, and I unplug the CAN cable
> after
> > introducing some noise to the level of reaching WARNING or PASSIVE, I
> > receive a BUS_ERROR event with the rxerr count reset back to 0 or 1. In
> > that case, the driver correctly transitions back the state to
> ERROR_ACTIVE
> > and candump produces something similar to:
> > 
> >     (000.000362)  can0  2000008C   [8]  00 40 40 00 00 00 00 01  
> >     ERRORFRAME
> >     controller-problem{}
> >     protocol-violation{{back-to-error-active}{}}
> >     bus-error
> >     error-counter-tx-rx{{0}{1}}
> > 
> > which is, AFAIK, the correct behaviour from the driver side.
> > 
> > Meanwhile, when the device is on the _sending_ end and I re-plug the CAN
> > cable again. Sometimes I receive events with txerr reset to 0 or 1, and
> > the driver correctly reverts back to ERROR_ACTIVE in that case. But on
> > another times like the quoted case above, I don't receive any events
> > resetting txerr back -- only data packets on the bus.
> 
> Well, the firmware seems to report *only* bus-errors via
> CMD_CAN_ERROR_EVENT messages, also carrying the new state, but no
> CMD_CHIP_STATE_EVENT just for the state changes.
> 

I've dumped _every_ message I receive from the firmware while
disconnecting the CAN bus, waiting a while, and connecting it again.
I really received _nothing_ from the firmware when the CAN bus was
reconnected and the data packets were flowing again. Not even a
single CHIP_STATE_EVENT, even after waiting for a long time.

So it's basically:
...
ERR EVENT, txerr=128, rxerr=0
ERR EVENT, txerr=128, rxerr=0
ERR EVENT, txerr=128, rxerr=0
...

then complete silence, except the data frames. I've even tried with
different versions of the firmware, but the same behaviour persisted.

> > So, What can the driver do given the above?
> 
> Little if the notification does not come.
> 

We can poll the state by sending CMD_GET_CHIP_STATE to the firmware,
and it will hopefully reply with a CHIP_STATE_EVENT response
containing the new txerr and rxerr values that we can use for
reverse state transitions.

But do we _really_ want to go through the path? I feel that it will
open some cans of worms w.r.t. concurrent access to both the netdev
and USB stacks from a single driver.

A possible solution can be setting up a kernel thread that queries
for a CHIP_STATE_EVENT every second?

Your inputs on this is appreciated.

> Wolfgang.
> 

Regards,
Darwish

^ permalink raw reply	[relevance 90%]

* Re: [PATCH 01/13] kdbus: add documentation
  @ 2015-01-23  6:28 99%   ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-23  6:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api,
	linux-kernel, daniel, dh.herrmann, tixxdz,
	Michael Kerrisk (man-pages),
	Linus Torvalds, Daniel Mack

On Fri, Jan 16, 2015 at 11:16:05AM -0800, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel@zonque.org>
> 
> kdbus is a system for low-latency, low-overhead, easy to use
> interprocess communication (IPC).
> 
> The interface to all functions in this driver is implemented via ioctls
> on files exposed through a filesystem called 'kdbusfs'. The default
> mount point of kdbusfs is /sys/fs/kdbus.

Pardon my ignorance, but we've always been told that adding
new ioctl()s to the kernel is a very big no-no.  But given
the seniority of the folks stewarding this kdbus effort,
there must be a good rationale ;-)

So, can the rationale behind introducing new ioctl()s be
further explained? It would be even better if it's included
in the documentation patch itself.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
  @ 2015-01-25  2:43 86%                     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-25  2:43 UTC (permalink / raw)
  To: Andri Yngvason
  Cc: Wolfgang Grandegger, Olivier Sobrie, Oliver Hartkopp,
	Marc Kleine-Budde, Linux-CAN, netdev, LKML

On Fri, Jan 23, 2015 at 10:32:13AM +0000, Andri Yngvason wrote:
> Quoting Ahmed S. Darwish (2015-01-23 06:07:34)
> > On Wed, Jan 21, 2015 at 05:13:45PM +0100, Wolfgang Grandegger wrote:
> > > On Wed, 21 Jan 2015 10:36:47 -0500, "Ahmed S. Darwish"
> > > <darwish.07@gmail.com> wrote:
> > > > On Wed, Jan 21, 2015 at 03:00:15PM +0000, Andri Yngvason wrote:
> > > >> Quoting Ahmed S. Darwish (2015-01-21 14:43:23)
> > > >> > Hi!
> > > > 
> > > > ...
> > > > 
> > > >> > <-- Unplug the cable -->
> > > >> > 
> > > >> >  (000.009106)  can0  20000080   [8]  00 00 00 00 00 00 08 00  
> > > >> >  ERRORFRAME
> > > >> >         bus-error
> > > >> >         error-counter-tx-rx{{8}{0}}
> > > >> >  (000.001872)  can0  20000080   [8]  00 00 00 00 00 00 10 00  
> > > 
> > > For a bus-errors I would also expcect some more information in the
> > > data[2..3] fields. But these are always zero.
> > > 
> > 
> > M16C error factors made it possible to report things like
> > CAN_ERR_PROT_FORM/STUFF/BIT0/BIT1/TX in data[2], and
> > CAN_ERR_PROT_LOC_ACK/CRC_DEL in data[3].
> > 
> > Unfortunately such error factors are only reported in Leaf, but
> > not in USBCan-II due to the wire format change in the error event:
> > 
> >         struct leaf_msg_error_event {
> >                 u8 tid;
> >                 u8 flags;
> >                 __le16 time[3];
> >                 u8 channel;
> >                 u8 padding;
> >                 u8 tx_errors_count;
> >                 u8 rx_errors_count;
> >                 u8 status;
> >                 u8 error_factor;
> >         } __packed;
> > 
> >         struct usbcan_msg_error_event {
> >                 u8 tid;
> >                 u8 padding;
> >                 u8 tx_errors_count_ch0;
> >                 u8 rx_errors_count_ch0;
> >                 u8 tx_errors_count_ch1;
> >                 u8 rx_errors_count_ch1;
> >                 u8 status_ch0;
> >                 u8 status_ch1;
> >                 __le16 time;
> >         } __packed;
> > 
> > I speculate that the wire format was changed due to controller
> > bugs in the USBCan-II, which was slightly mentioned in their
> > data sheets here:
> > 
> >         http://www.kvaser.com/canlib-webhelp/page_hardware_specific_can_controllers.html
> > 
> > So it seems there's really no way for filling such bus error
> > info given the very limited amount of data exported :-(
> >
> We experienced similar problems with FlexCAN.

Hmm, I'll have a look there then...

Although my initial instincts imply that the FlexCAN driver has
access to the raw CAN registers, something I'm unable to do here.
But maybe there's some black magic I'm missing :-)

[...]

> > 
> > I've dumped _every_ message I receive from the firmware while
> > disconnecting the CAN bus, waiting a while, and connecting it again.
> > I really received _nothing_ from the firmware when the CAN bus was
> > reconnected and the data packets were flowing again. Not even a
> > single CHIP_STATE_EVENT, even after waiting for a long time.
> > 
> > So it's basically:
> > ...
> > ERR EVENT, txerr=128, rxerr=0
> > ERR EVENT, txerr=128, rxerr=0
> > ERR EVENT, txerr=128, rxerr=0
> > ...
> > 
> > then complete silence, except the data frames. I've even tried with
> > different versions of the firmware, but the same behaviour persisted.
> > 
> > > > So, What can the driver do given the above?
> > > 
> > > Little if the notification does not come.
> > > 
> > 
> > We can poll the state by sending CMD_GET_CHIP_STATE to the firmware,
> > and it will hopefully reply with a CHIP_STATE_EVENT response
> > containing the new txerr and rxerr values that we can use for
> > reverse state transitions.
> >
> > But do we _really_ want to go through the path? I feel that it will
> > open some cans of worms w.r.t. concurrent access to both the netdev
> > and USB stacks from a single driver.
> >
> Honestly, I don't know.
> >
> > A possible solution can be setting up a kernel thread that queries
> > for a CHIP_STATE_EVENT every second?
> > 
> Have you considered polling in kvaser_usb_tx_acknowledge? You could do something
> like:
> if(unlikely(dev->can.state != CAN_STATE_ERROR_ACTIVE))
> {
>     request_state();
> }
> 

OK, I have four important updates on this issue:

a) My initial testing was done on high-speed channel, at a bitrate
   of 50K. After setting the bus to a more reasonable bitrate 500K
   or 1M, I was _consistently_ able to receive CHIP_STATE_EVENTs
   when plugging the CAN connector again after an unplug.

b) The error counters on this device do not get reset on plugging
   after an unplug. I've setup a kernel thread [2] that queries
   the chip state event every second, and the error counters stays
   the same all the time. [1]

c) There's a single case when the erro counters do indeed get
   reversed, and it happens only when introducing some noise in
   the bus after the re-plug. In that case, the new error events
   get raised with new error counters starting from 0/1 again.

d) I've discovered a bug that forbids the CAN state from
   returning to ERROR_ACTIVE in case of the error counters
   numbers getting decreased. But independent from that bug, the
   verbose debugging messages clearly imply that we only get the
   error counters decreased in the case mentioned at `c)' above.

So from [1] and [2], it's now clear that the device do not reset
these counters back in the re-plug case. I'll give a check to
flexcan as advised, but unfortunately I don't really think there's
much I can do about this.

[1]

[  877.207082] CAN_ERROR_: channel=0, txerr=88, rxerr=0
[  877.207090] CAN_ERROR_: channel=0, txerr=136, rxerr=0
[  877.207094] CAN_ERROR_: channel=0, txerr=144, rxerr=0
[  877.207098] CAN_ERROR_: channel=0, txerr=152, rxerr=0
[  877.207100] CAN_ERROR_: channel=0, txerr=160, rxerr=0
[  877.207102] CAN_ERROR_: channel=0, txerr=168, rxerr=0
[  877.208075] CAN_ERROR_: channel=0, txerr=200, rxerr=0 

(( The above error event, staying the same at txerr=200 keeps
   flooding the bus until the CAN cable is re-plugged ))

[  878.225116] CHIP_STATE: channel=0, txerr=200, rxerr=0
[  878.225143] CHIP_STATE: channel=1, txerr=0, rxerr=0
[  879.265167] CHIP_STATE: channel=0, txerr=200, rxerr=0
[  879.267152] CHIP_STATE: channel=1, txerr=0, rxerr=0
[  879.265167] CHIP_STATE: channel=0, txerr=200, rxerr=0
[  879.267152] CHIP_STATE: channel=1, txerr=0, rxerr=0

(( The same counters get repeated every second ))

[2] State was polled using:

static int kvaser_usb_poll_chip_state(void *vpriv) {
      struct kvaser_usb_net_priv *priv = vpriv;

      while (!kthread_should_stop()) {
              kvaser_usb_simple_msg_async(priv, CMD_GET_CHIP_STATE);
              ssleep(1);
      }

      return 0;
}

> I don't think that anything beyond that would be worth pursuing.
> 

I agree, but given the new input, it seems that our problem
extends to the error counters themselves not getting decreased
on re-plug. So, even polling will not solve the issue: we'll
get the same txerr/rxerr values again and again :-(

> Best regards,
> Andri

Regards,
Darwish


^ permalink raw reply	[relevance 86%]

* Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
  @ 2015-01-25  2:49 99%           ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-25  2:49 UTC (permalink / raw)
  To: Andri Yngvason
  Cc: Marc Kleine-Budde, Olivier Sobrie, Oliver Hartkopp,
	Wolfgang Grandegger, Linux-CAN, netdev, LKML

On Thu, Jan 22, 2015 at 10:14:47AM +0000, Andri Yngvason wrote:
> Quoting Marc Kleine-Budde (2015-01-21 22:59:23)
> > On 01/21/2015 05:20 PM, Andri Yngvason wrote:
> > > Marc, could you merge the "move bus_off++" patch before you merge this so that I
> > > won't have to incorporate this patch-set into it?
> > 
> > ...included in the lastest pull-request to David. Use
> > tags/linux-can-next-for-3.20-20150121 of the can-next repo as you new base.
> > 
> 
> Thanks!
>

I guess I'll re-base my next submission over this tag too.
Nothing in the new 5 patches is substantial enough to be
included in the current kernel release.

Thanks!
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
    @ 2015-01-25  3:21 99%       ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-25  3:21 UTC (permalink / raw)
  To: Andri Yngvason
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Linux-CAN, netdev, LKML

Hi!

On Wed, Jan 21, 2015 at 04:20:25PM +0000, Andri Yngvason wrote:
> Quoting Ahmed S. Darwish (2015-01-20 21:45:37)
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > Replace most of the can interface's state and error counters
> > handling with the new can-dev can_change_state() mechanism.
> > 
> > Suggested-by: Andri Yngvason <andri.yngvason@marel.com>
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > ---
> >  drivers/net/can/usb/kvaser_usb.c | 114 +++++++++++++++++++--------------------
> >  1 file changed, 55 insertions(+), 59 deletions(-)
> > 
> > diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> > index 971c5f9..0386d3f 100644
> > --- a/drivers/net/can/usb/kvaser_usb.c
> > +++ b/drivers/net/can/usb/kvaser_usb.c
> > @@ -620,40 +620,43 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
> >  }
> >  
> >  static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
> > -                                                const struct kvaser_usb_error_summary *es)
> > +                                                const struct kvaser_usb_error_summary *es,
> > +                                                struct can_frame *cf)
> >  {
> >         struct net_device_stats *stats;
> > -       enum can_state new_state;
> > -
> > -       stats = &priv->netdev->stats;
> > -       new_state = priv->can.state;
> > +       enum can_state cur_state, new_state, tx_state, rx_state;
> >  
> >         netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
> >  
> > -       if (es->status & M16C_STATE_BUS_OFF) {
> > -               priv->can.can_stats.bus_off++;
> > +       stats = &priv->netdev->stats;
> > +       new_state = cur_state = priv->can.state;
> > +
> > +       if (es->status & M16C_STATE_BUS_OFF)
> >                 new_state = CAN_STATE_BUS_OFF;
> > -       } else if (es->status & M16C_STATE_BUS_PASSIVE) {
> > -               if (priv->can.state != CAN_STATE_ERROR_PASSIVE)
> > -                       priv->can.can_stats.error_passive++;
> > +       else if (es->status & M16C_STATE_BUS_PASSIVE)
> >                 new_state = CAN_STATE_ERROR_PASSIVE;
> > -       }
> >  
> >         if (es->status == M16C_STATE_BUS_ERROR) {
> > -               if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
> > -                   ((es->txerr >= 96) || (es->rxerr >= 96))) {
> > -                       priv->can.can_stats.error_warning++;
> > +               if ((cur_state < CAN_STATE_ERROR_WARNING) &&
> > +                   ((es->txerr >= 96) || (es->rxerr >= 96)))
> >                         new_state = CAN_STATE_ERROR_WARNING;
> > -               } else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
> > +               else if (cur_state > CAN_STATE_ERROR_ACTIVE)
> >                         new_state = CAN_STATE_ERROR_ACTIVE;
> > -               }
> >         }
> >  
> >         if (!es->status)
> >                 new_state = CAN_STATE_ERROR_ACTIVE;
> >  
> > +       if (new_state != cur_state) {
> > +               tx_state = (es->txerr >= es->rxerr) ? new_state : 0;
> > +               rx_state = (es->txerr <= es->rxerr) ? new_state : 0;
> > +
> > +               can_change_state(priv->netdev, cf, tx_state, rx_state);
>
> This (below) is redundant. It doesn't harm but at this point can_change_state
> has set priv->can.state to new_state.
>
> > +               new_state = priv->can.state;
> > +       }
> > +

Correct; I will remove it.

> >         if (priv->can.restart_ms &&
> > -           (priv->can.state >= CAN_STATE_BUS_OFF) &&
> > +           (cur_state >= CAN_STATE_BUS_OFF) &&
> >             (new_state < CAN_STATE_BUS_OFF)) {
> >                 priv->can.can_stats.restarts++;
> >         }
> > @@ -665,18 +668,17 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
> >  
> >         priv->bec.txerr = es->txerr;
> >         priv->bec.rxerr = es->rxerr;
> > -       priv->can.state = new_state;
> >  }
> >  
> >  static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >                                 const struct kvaser_msg *msg)
> >  {
> > -       struct can_frame *cf;
> > +       struct can_frame *cf, tmp_cf = { .can_id = CAN_ERR_FLAG, .can_dlc = CAN_ERR_DLC };
> >         struct sk_buff *skb;
> >         struct net_device_stats *stats;
> >         struct kvaser_usb_net_priv *priv;
> >         struct kvaser_usb_error_summary es = { };
> > -       enum can_state old_state;
> > +       enum can_state old_state, new_state;
> >  
> >         switch (msg->id) {
> >         case CMD_CAN_ERROR_EVENT:
> > @@ -721,60 +723,54 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
> >         }
> >  
> >         /* Update all of the can interface's state and error counters before
> > -        * trying any skb allocation that can actually fail with -ENOMEM.
> > +        * trying any memory allocation that can actually fail with -ENOMEM.
> > +        *
> > +        * We send a temporary stack-allocated error can frame to
> > +        * can_change_state() for the very same reason.
> > +        *
> > +        * TODO: Split can_change_state() responsibility between updating the
> > +        * can interface's state and counters, and the setting up of can error
> > +        * frame ID and data to userspace. Remove stack allocation afterwards.
> >          */
> >         old_state = priv->can.state;
> > -       kvaser_usb_rx_error_update_can_state(priv, &es);
> > +       kvaser_usb_rx_error_update_can_state(priv, &es, &tmp_cf);
> > +       new_state = priv->can.state;
> >  
> >         skb = alloc_can_err_skb(priv->netdev, &cf);
> >         if (!skb) {
> >                 stats->rx_dropped++;
> >                 return;
> >         }
> > +       memcpy(cf, &tmp_cf, sizeof(*cf));
> >  
> > -       if (es.status & M16C_STATE_BUS_OFF) {
> > -               cf->can_id |= CAN_ERR_BUSOFF;
> > -
> > -               if (!priv->can.restart_ms)
> > -                       kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
> > -               netif_carrier_off(priv->netdev);
> > -       } else if (es.status & M16C_STATE_BUS_PASSIVE) {
> > -               if (old_state != CAN_STATE_ERROR_PASSIVE) {
> > -                       cf->can_id |= CAN_ERR_CRTL;
> > -
> > -                       if (es.txerr || es.rxerr)
> > -                               cf->data[1] = (es.txerr > es.rxerr)
> > -                                               ? CAN_ERR_CRTL_TX_PASSIVE
> > -                                               : CAN_ERR_CRTL_RX_PASSIVE;
> > -                       else
> > -                               cf->data[1] = CAN_ERR_CRTL_TX_PASSIVE |
> > -                                             CAN_ERR_CRTL_RX_PASSIVE;
> > +       if (new_state != old_state) {
> > +               if (es.status & M16C_STATE_BUS_OFF) {
> > +                       if (!priv->can.restart_ms)
> > +                               kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
> > +                       netif_carrier_off(priv->netdev);
> > +               }
> > +
>
> This block [below] is wrong. The usage of PROT_ACTIVE is based on a misunderstanding.
> It's used in some drivers to signify back-to-error-active but its original
> meaning is something completely different, AFAIK.
> This is handled in can_change_state() using a new CTRL message; namely:
> CAN_ERR_CTRL_ACTIVE. The newest version of can-utils is up to date with this.
>
> > +               if (es.status == M16C_STATE_BUS_ERROR) {
> > +                       if ((old_state >= CAN_STATE_ERROR_WARNING) ||
> > +                           (es.txerr < 96 && es.rxerr < 96)) {
> > +                               if (old_state > CAN_STATE_ERROR_ACTIVE) {
> > +                                       cf->can_id |= CAN_ERR_PROT;
> > +                                       cf->data[2] = CAN_ERR_PROT_ACTIVE;
> > +                               }
> > +                       }
> >                 }

So I would just make the new_state equals CAN_STATE_ERROR_ACTIVE,
and then can_change_state() will do the right thing? Excellent!!
This means the entire block above can be removed ;-)

[ On another note, the block's if conditions above are faulty and
  fixed in patch #3. This patch (#2) used can_change_state()
  without changing any of that faulty logic to ease any future
  bisection. ]

> > -       }
> >  
> > -       if (es.status == M16C_STATE_BUS_ERROR) {
> > -               if ((old_state < CAN_STATE_ERROR_WARNING) &&
> > -                   ((es.txerr >= 96) || (es.rxerr >= 96))) {
> > -                       cf->can_id |= CAN_ERR_CRTL;
> > -                       cf->data[1] = (es.txerr > es.rxerr)
> > -                                       ? CAN_ERR_CRTL_TX_WARNING
> > -                                       : CAN_ERR_CRTL_RX_WARNING;
> > -               } else if (old_state > CAN_STATE_ERROR_ACTIVE) {
>
> This is also redundant, and wrong:
>
> > +               if (!es.status) {
> >                         cf->can_id |= CAN_ERR_PROT;
> >                         cf->data[2] = CAN_ERR_PROT_ACTIVE;
> >                 }
> > -       }

As in the above; extra code to be removed, yay! ;-)

> >  
> > -       if (!es.status) {
> > -               cf->can_id |= CAN_ERR_PROT;
> > -               cf->data[2] = CAN_ERR_PROT_ACTIVE;
> > -       }
> > -
> > -       if (priv->can.restart_ms &&
> > -           (old_state >= CAN_STATE_BUS_OFF) &&
> > -           (priv->can.state < CAN_STATE_BUS_OFF)) {
> > -               cf->can_id |= CAN_ERR_RESTARTED;
> > -               netif_carrier_on(priv->netdev);
> > +               if (priv->can.restart_ms &&
> > +                   (old_state >= CAN_STATE_BUS_OFF) &&
> > +                   (new_state < CAN_STATE_BUS_OFF)) {
> > +                       cf->can_id |= CAN_ERR_RESTARTED;
> > +                       netif_carrier_on(priv->netdev);
> > +               }
> >         }
> >  
> >         if (es.error_factor) {
> > -- 
> > 1.9.1
> 
> Looking over the patch again, I've noticed that there
> are a few things that are not quite right.
> 

The state-handlig code has become much simpler since your
reveiw and Kleine-Budde's one. Thanks a lot for all the effort.

Regards,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH 01/13] kdbus: add documentation
  @ 2015-01-25  3:30 99%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-25  3:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api,
	linux-kernel, daniel, dh.herrmann, tixxdz,
	Michael Kerrisk (man-pages),
	Linus Torvalds, Daniel Mack

On Fri, Jan 23, 2015 at 09:19:46PM +0800, Greg Kroah-Hartman wrote:
> On Fri, Jan 23, 2015 at 08:28:20AM +0200, Ahmed S. Darwish wrote:
> > On Fri, Jan 16, 2015 at 11:16:05AM -0800, Greg Kroah-Hartman wrote:
> > > From: Daniel Mack <daniel@zonque.org>
> > > 
> > > kdbus is a system for low-latency, low-overhead, easy to use
> > > interprocess communication (IPC).
> > > 
> > > The interface to all functions in this driver is implemented via ioctls
> > > on files exposed through a filesystem called 'kdbusfs'. The default
> > > mount point of kdbusfs is /sys/fs/kdbus.
> > 
> > Pardon my ignorance, but we've always been told that adding
> > new ioctl()s to the kernel is a very big no-no.  But given
> > the seniority of the folks stewarding this kdbus effort,
> > there must be a good rationale ;-)
> > 
> > So, can the rationale behind introducing new ioctl()s be
> > further explained? It would be even better if it's included
> > in the documentation patch itself.
> 
> The main reason to use an ioctl is that you want to atomically set
> and/or get something "complex" through the user/kernel boundary.  For
> simple device attributes, sysfs works great, for configuring devices,
> configfs works great, but for data streams / structures / etc. an ioctl
> is the correct thing to use.
> 
> Examples of new ioctls being added to the kernel are all over the
> place, look at all of the special-purpose ioctls the filesystems keep
> creating (they aren't adding new syscalls), look at the monstrosity that
> is the DRM layer, look at other complex things like openvswitch, or
> "simpler" device-specific interfaces like the MEI one, or even more
> complex ones like the MMC interface.  These are all valid uses of ioctls
> as they are device/filesystem specific ways to interact with the kernel.
> 
> The thing is, almost no one pays attention to these new ioctls as they
> are domain-specific interfaces, with open userspace programs talking to
> them, and they work well.  ioctl is a powerful and useful interface, and
> if we were to suddenly require no new ioctls, and require everything to
> be a syscall, we would do nothing except make apis more complex (hint,
> you now have to do extra validation on your file descriptor passed to
> you to determine if it really is what you can properly operate your
> ioctl on), and cause no real benefit at all.
> 
> Yes, people abuse ioctls at times, but really, they are needed.
> 
> And remember, I was one of the people who long ago thought we should not
> be adding more ioctls, but I have seen the folly of my ways, and chalk
> it up to youthful ignorance :)
> 

Exactly, and that's why I wondered about the sudden change
of heart ;-)

Thanks for taking the time to write this.

Regards,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
  @ 2015-01-25 11:59 99%           ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-25 11:59 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason, Linux-CAN, netdev, LKML

On Wed, Jan 21, 2015 at 03:24:46PM +0300, Sergei Shtylyov wrote:
> Hello.
> 
> On 1/21/2015 12:48 AM, Ahmed S. Darwish wrote:
> 
> >From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> >On some x86 laptops, plugging a Kvaser device again after an
> >unplug makes the firmware always ignore the very first command.
> >For such a case, provide some room for retries instead of
> >completly exiting the driver init code.
> 
>    Completely.
> 
> >Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> >---
> >  drivers/net/can/usb/kvaser_usb.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> >diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> >index 640b0eb..068e76c 100644
> >--- a/drivers/net/can/usb/kvaser_usb.c
> >+++ b/drivers/net/can/usb/kvaser_usb.c
> [...]
> >@@ -1632,7 +1632,15 @@ static int kvaser_usb_probe(struct usb_interface *intf,
> >
> >  	usb_set_intfdata(intf, dev);
> >
> >-	err = kvaser_usb_get_software_info(dev);
> >+	/* On some x86 laptops, plugging a Kvaser device again after
> >+	 * an unplug makes the firmware always ignore the very first
> >+	 * command. For such a case, provide some room for retries
> >+	 * instead of completly exiting the driver.
> 
>    Completely.
> 

Thanks, both fixed in the next submission :-D

Regards,
Darwish


^ permalink raw reply	[relevance 99%]

* [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support
  2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
                   ` (6 preceding siblings ...)
  2015-01-20 21:44 70% ` [PATCH v5 1/5] can: kvaser_usb: Update net interface state before exiting on OOM Ahmed S. Darwish
@ 2015-01-26  5:17 74% ` Ahmed S. Darwish
  2015-01-26  5:20 88%   ` [PATCH v6 1/7] can: kvaser_usb: Do not sleep in atomic context Ahmed S. Darwish
  7 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:17 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

Hi!

This is an updated patch series for the Kvaser CAN/USB devices:

1- Extra patches are now added to the series. Most importantly
patch #1 which fixes a critical `sleep in atomic context' bug
in the current upstream driver. Patch #2 fixes a corruption in
the kernel logs, also affecting current upstream driver. Patch
#4 was originally USBCan-II only, but since it's a generic fix,
it's now retrofitted to the Leaf-only upstream driver first.

2- The series has been re-organized so that patches #1 -> #4
inclusive can go to linux-can/origin, while the rest can move
to -next.

3- There are some important updates regarding the USBCan-II
error counters, and how really erratic their heaviour is. All
the new details are covered at the bottom of this URL:

	http://article.gmane.org/gmane.linux.can/7481

4- Attached below is the new candump traces. Now
`back-to-error-active' states appear _if_ the hardware was
kind enough and decreased the error counters appropriately.
The earlier code did not recognize the error counters going
down, thus the `back-to-error-active' transitions did not
appear.

###########################################################

* Bus-off scenario (with transitions active -> passive -> back-to-active):

 (000.000000)  can0  20000080   [8]  00 00 00 00 00 00 30 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{48}{0}}
 (000.000011)  can0  20000084   [8]  00 20 00 00 00 00 88 00   ERRORFRAME
	controller-problem{tx-error-passive}
	bus-error
	error-counter-tx-rx{{136}{0}}
 (000.000001)  can0  20000080   [8]  00 00 00 00 00 00 88 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{136}{0}}
 (000.000987)  can0  20000080   [8]  00 00 00 00 00 00 98 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{152}{0}}
 (000.002011)  can0  20000084   [8]  00 40 00 00 00 00 30 00   ERRORFRAME
	controller-problem{back-to-error-active}
	bus-error
	error-counter-tx-rx{{48}{0}}
 (000.000004)  can0  20000084   [8]  00 20 00 00 00 00 78 00   ERRORFRAME
	controller-problem{tx-error-passive}
	bus-error
	error-counter-tx-rx{{120}{0}}
 (000.000002)  can0  20000080   [8]  00 00 00 00 00 00 88 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{136}{0}}
 (000.000005)  can0  20000080   [8]  00 00 00 00 00 00 90 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{144}{0}}
 (000.000002)  can0  20000080   [8]  00 00 00 00 00 00 98 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{152}{0}}
 (000.000001)  can0  20000080   [8]  00 00 00 00 00 00 A0 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{160}{0}}
 (000.000966)  can0  20000080   [8]  00 00 00 00 00 00 A8 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{168}{0}}
 (000.002998)  can0  20000084   [8]  00 40 00 00 00 00 30 00   ERRORFRAME
	controller-problem{back-to-error-active}
	bus-error
	error-counter-tx-rx{{48}{0}}
 (000.000004)  can0  20000084   [8]  00 20 00 00 00 00 80 00   ERRORFRAME
	controller-problem{tx-error-passive}
	bus-error
	error-counter-tx-rx{{128}{0}}
 (000.000002)  can0  20000080   [8]  00 00 00 00 00 00 88 00   ERRORFRAME
	bus-error
	error-counter-tx-rx{{136}{0}}
 (001.031035)  can0  200000C0   [8]  00 00 00 00 00 00 00 29   ERRORFRAME
	bus-off
	bus-error
	error-counter-tx-rx{{0}{41}}

###########################################################

Regular sending, unpluggng CAN connector, then plugging again:

(with transitions active -> warning -> passive)

[ As stated earlier, the counters don't get decreased upon CAN
replug, even if they were constantly polled. ]

 (000.011001)  can0  2A1   [1]  E5
 (000.010001)  can0  50E   [8]  E6 05 00 00 00 00 00 00
 (000.009999)  can0  009   [1]  E7
 (000.011000)  can0  6E2   [8]  E8 05 00 00 00 00 00 00
 (000.009999)  can0  314   [2]  E9 05
 (000.010001)  can0  708   [6]  EA 05 00 00 00 00
 (000.010991)  can0  20000080   [8]  00 00 00 00 00 00 00 09   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{9}}
 (000.000002)  can0  20000080   [8]  00 00 00 00 00 00 00 0A   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{10}}
 (000.000007)  can0  20000080   [8]  00 00 00 00 00 00 00 2E   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{46}}
 (000.000003)  can0  20000080   [8]  00 00 00 00 00 00 00 37   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{55}}
 (000.000002)  can0  20000080   [8]  00 00 00 00 00 00 00 40   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{64}}
 (000.000993)  can0  20000080   [8]  00 00 00 00 00 00 00 49   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{73}}
 (000.000002)  can0  20000080   [8]  00 00 00 00 00 00 00 52   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{82}}
 (000.000001)  can0  20000080   [8]  00 00 00 00 00 00 00 5B   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{91}}
 (000.000028)  can0  20000084   [8]  00 04 00 00 00 00 00 6C   ERRORFRAME
	controller-problem{rx-error-warning}
	bus-error
	error-counter-tx-rx{{0}{108}}
 (000.000955)  can0  20000080   [8]  00 00 00 00 00 00 00 7F   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{127}}
 (000.000008)  can0  20000084   [8]  00 10 00 00 00 00 00 87   ERRORFRAME
	controller-problem{rx-error-passive}
	bus-error
	error-counter-tx-rx{{0}{135}}
 (000.000001)  can0  20000080   [8]  00 00 00 00 00 00 00 87   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{135}}
 (000.000004)  can0  20000080   [8]  00 00 00 00 00 00 00 87   ERRORFRAME
	bus-error
	error-counter-tx-rx{{0}{135}}

((( Then a continous flood, exactly similar to the above packet, appears )))

 (000.500004)  can0  2ED   [4]  EB 05 00 00
 (000.000006)  can0  0DD   [5]  EC 05 00 00 00
 (000.000002)  can0  1D3   [1]  ED
 (000.000988)  can0  20D   [8]  EE 05 00 00 00 00 00 00
 (000.000006)  can0  04B   [8]  EF 05 00 00 00 00 00 00
 (000.000002)  can0  320   [8]  F0 05 00 00 00 00 00 00
 (000.000002)  can0  023   [8]  F1 05 00 00 00 00 00 00
 (000.000989)  can0  21D   [8]  F2 05 00 00 00 00 00 00
 (000.000005)  can0  17D   [8]  F3 05 00 00 00 00 00 00
 (000.000002)  can0  6DC   [8]  F4 05 00 00 00 00 00 00
 (000.000993)  can0  62D   [8]  F5 05 00 00 00 00 00 00
 (000.000006)  can0  18B   [6]  F6 05 00 00 00 00
 (000.000001)  can0  7EB   [8]  F7 05 00 00 00 00 00 00
 (000.000001)  can0  014   [8]  F8 05 00 00 00 00 00 00
 (000.000994)  can0  52F   [8]  F9 05 00 00 00 00 00 00

--
Regards,
Darwish

^ permalink raw reply	[relevance 74%]

* [PATCH v6 1/7] can: kvaser_usb: Do not sleep in atomic context
  2015-01-26  5:17 74% ` [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support Ahmed S. Darwish
@ 2015-01-26  5:20 88%   ` Ahmed S. Darwish
  2015-01-26  5:22 99%     ` [PATCH v6 2/7] can: kvaser_usb: Send correct context to URB completion Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:20 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Upon receiving a hardware event with the BUS_RESET flag set,
the driver kills all of its anchored URBs and resets all of
its transmit URB contexts.

Unfortunately it does so under the context of URB completion
handler `kvaser_usb_read_bulk_callback()', which is often
called in an atomic context.

While the device is flooded with many received error packets,
usb_kill_urb() typically sleeps/reschedules till the transfer
request of each killed URB in question completes, leading to
the sleep in atomic bug. [3]

In v2 submission of the original driver patch [1], it was
stated that the URBs kill and tx contexts reset was needed
since we don't receive any tx acknowledgments later and thus
such resources will be locked down forever. Fortunately this
is no longer needed since an earlier bugfix in this patch
series is now applied: all tx URB contexts are reset upon CAN
channel close. [2]

Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF
event, which is the recommended handling method advised by
the device manufacturer.

[1] http://article.gmane.org/gmane.linux.network/239442
    http://www.webcitation.org/6Vr2yagAQ

[2] can: kvaser_usb: Reset all URB tx contexts upon channel close
    889b77f7fd2bcc922493d73a4c51d8a851505815

[3] Stacktrace:

 <IRQ>  [<ffffffff8158de87>] dump_stack+0x45/0x57
 [<ffffffff8158b60c>] __schedule_bug+0x41/0x4f
 [<ffffffff815904b1>] __schedule+0x5f1/0x700
 [<ffffffff8159360a>] ? _raw_spin_unlock_irqrestore+0xa/0x10
 [<ffffffff81590684>] schedule+0x24/0x70
 [<ffffffff8147d0a5>] usb_kill_urb+0x65/0xa0
 [<ffffffff81077970>] ? prepare_to_wait_event+0x110/0x110
 [<ffffffff8147d7d8>] usb_kill_anchored_urbs+0x48/0x80
 [<ffffffffa01f4028>] kvaser_usb_unlink_tx_urbs+0x18/0x50 [kvaser_usb]
 [<ffffffffa01f45d0>] kvaser_usb_rx_error+0xc0/0x400 [kvaser_usb]
 [<ffffffff8108b14a>] ? vprintk_default+0x1a/0x20
 [<ffffffffa01f5241>] kvaser_usb_read_bulk_callback+0x4c1/0x5f0 [kvaser_usb]
 [<ffffffff8147a73e>] __usb_hcd_giveback_urb+0x5e/0xc0
 [<ffffffff8147a8a1>] usb_hcd_giveback_urb+0x41/0x110
 [<ffffffffa0008748>] finish_urb+0x98/0x180 [ohci_hcd]
 [<ffffffff810cd1a7>] ? acct_account_cputime+0x17/0x20
 [<ffffffff81069f65>] ? local_clock+0x15/0x30
 [<ffffffffa000a36b>] ohci_work+0x1fb/0x5a0 [ohci_hcd]
 [<ffffffff814fbb31>] ? process_backlog+0xb1/0x130
 [<ffffffffa000cd5b>] ohci_irq+0xeb/0x270 [ohci_hcd]
 [<ffffffff81479fc1>] usb_hcd_irq+0x21/0x30
 [<ffffffff8108bfd3>] handle_irq_event_percpu+0x43/0x120
 [<ffffffff8108c0ed>] handle_irq_event+0x3d/0x60
 [<ffffffff8108ec84>] handle_fasteoi_irq+0x74/0x110
 [<ffffffff81004dfd>] handle_irq+0x1d/0x30
 [<ffffffff81004727>] do_IRQ+0x57/0x100
 [<ffffffff8159482a>] common_interrupt+0x6a/0x6a

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c |    7 +------
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index c32cd61..978a25e 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -662,11 +662,6 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if (status & M16C_STATE_BUS_RESET) {
-		kvaser_usb_unlink_tx_urbs(priv);
-		return;
-	}
-
 	skb = alloc_can_err_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->rx_dropped++;
@@ -677,7 +672,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
 
-	if (status & M16C_STATE_BUS_OFF) {
+	if (status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
 		priv->can.can_stats.bus_off++;
-- 
1.7.7.6



^ permalink raw reply related	[relevance 88%]

* [PATCH v6 2/7] can: kvaser_usb: Send correct context to URB completion
  2015-01-26  5:20 88%   ` [PATCH v6 1/7] can: kvaser_usb: Do not sleep in atomic context Ahmed S. Darwish
@ 2015-01-26  5:22 99%     ` Ahmed S. Darwish
  2015-01-26  5:24 98%       ` [PATCH v6 3/7] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:22 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Send expected argument to the URB completion hander: a CAN
netdevice instead of the network interface private context
`kvaser_usb_net_priv'.

This was discovered by having some garbage in the kernel
log in place of the netdevice names: can0 and can1.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 978a25e..f0c6207 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -587,7 +587,7 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 			  usb_sndbulkpipe(dev->udev,
 					  dev->bulk_out->bEndpointAddress),
 			  buf, msg->len,
-			  kvaser_usb_simple_msg_callback, priv);
+			  kvaser_usb_simple_msg_callback, netdev);
 	usb_anchor_urb(urb, &priv->tx_submitted);
 
 	err = usb_submit_urb(urb, GFP_ATOMIC);
-- 
1.7.7.6


^ permalink raw reply related	[relevance 99%]

* [PATCH v6 3/7] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
  2015-01-26  5:22 99%     ` [PATCH v6 2/7] can: kvaser_usb: Send correct context to URB completion Ahmed S. Darwish
@ 2015-01-26  5:24 98%       ` Ahmed S. Darwish
  2015-01-26  5:25 97%         ` [PATCH v6 4/7] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:24 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

On some x86 laptops, plugging a Kvaser device again after an
unplug makes the firmware always ignore the very first command.
For such a case, provide some room for retries instead of
completely exiting the driver init code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index f0c6207..55407b9 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1585,7 +1585,7 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 {
 	struct kvaser_usb *dev;
 	int err = -ENOMEM;
-	int i;
+	int i, retry = 3;
 
 	dev = devm_kzalloc(&intf->dev, sizeof(*dev), GFP_KERNEL);
 	if (!dev)
@@ -1603,7 +1603,15 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	err = kvaser_usb_get_software_info(dev);
+	/* On some x86 laptops, plugging a Kvaser device again after
+	 * an unplug makes the firmware always ignore the very first
+	 * command. For such a case, provide some room for retries
+	 * instead of completely exiting the driver.
+	 */
+	do {
+		err = kvaser_usb_get_software_info(dev);
+	} while (--retry && err == -ETIMEDOUT);
+
 	if (err) {
 		dev_err(&intf->dev,
 			"Cannot get software infos, error %d\n", err);
-- 
1.7.7.6


^ permalink raw reply related	[relevance 98%]

* [PATCH v6 4/7] can: kvaser_usb: Fix state handling upon BUS_ERROR events
  2015-01-26  5:24 98%       ` [PATCH v6 3/7] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
@ 2015-01-26  5:25 97%         ` Ahmed S. Darwish
  2015-01-26  5:27 70%           ` [PATCH v6 5/7] can: kvaser_usb: Update interface state before exiting on OOM Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:25 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

While being in an ERROR_WARNING state, and receiving further
bus error events with error counters still in the ERROR_WARNING
range of 97-127 inclusive, the state handling code erroneously
reverts back to ERROR_ACTIVE.

Per the CAN standard, only revert to ERROR_ACTIVE when the
error counters are less than 96.

Moreover, in certain Kvaser models, the BUS_ERROR flag is
always set along with undefined bits in the M16C status
register. Thus use bitwise operators instead of full equality
for checking that register against bus errors.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 55407b9..7af379c 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -698,9 +698,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	}
-
-	if (status == M16C_STATE_BUS_ERROR) {
+	} else if (status & M16C_STATE_BUS_ERROR) {
 		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
 		    ((txerr >= 96) || (rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
@@ -710,7 +708,8 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 			priv->can.can_stats.error_warning++;
 			new_state = CAN_STATE_ERROR_WARNING;
-		} else if (priv->can.state > CAN_STATE_ERROR_ACTIVE) {
+		} else if ((priv->can.state > CAN_STATE_ERROR_ACTIVE) &&
+			   ((txerr < 96) && (rxerr < 96))) {
 			cf->can_id |= CAN_ERR_PROT;
 			cf->data[2] = CAN_ERR_PROT_ACTIVE;
 
-- 
1.7.7.6


^ permalink raw reply related	[relevance 97%]

* [PATCH v6 5/7] can: kvaser_usb: Update interface state before exiting on OOM
  2015-01-26  5:25 97%         ` [PATCH v6 4/7] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
@ 2015-01-26  5:27 70%           ` Ahmed S. Darwish
  2015-01-26  5:29 80%             ` [PATCH v6 6/7] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:27 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Update all of the can interface's state and error counters before
trying any skb allocation that can actually fail with -ENOMEM.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c |  181 ++++++++++++++++++++++----------------
 1 files changed, 105 insertions(+), 76 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 7af379c..f57ce55 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -273,6 +273,10 @@ struct kvaser_msg {
 	} u;
 } __packed;
 
+struct kvaser_usb_error_summary {
+	u8 channel, status, txerr, rxerr, error_factor;
+};
+
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -615,6 +619,54 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
 }
 
+static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
+						 const struct kvaser_usb_error_summary *es)
+{
+	struct net_device_stats *stats;
+	enum can_state new_state;
+
+	stats = &priv->netdev->stats;
+	new_state = priv->can.state;
+
+	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
+
+	if (es->status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
+		priv->can.can_stats.bus_off++;
+		new_state = CAN_STATE_BUS_OFF;
+	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
+		if (priv->can.state != CAN_STATE_ERROR_PASSIVE)
+			priv->can.can_stats.error_passive++;
+		new_state = CAN_STATE_ERROR_PASSIVE;
+	} else if (es->status & M16C_STATE_BUS_ERROR) {
+		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
+		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
+			priv->can.can_stats.error_warning++;
+			new_state = CAN_STATE_ERROR_WARNING;
+		} else if ((priv->can.state > CAN_STATE_ERROR_ACTIVE) &&
+			   ((es->txerr < 96) && (es->rxerr < 96))) {
+			new_state = CAN_STATE_ERROR_ACTIVE;
+		}
+	}
+
+	if (!es->status)
+		new_state = CAN_STATE_ERROR_ACTIVE;
+
+	if (priv->can.restart_ms &&
+	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
+	    (new_state < CAN_STATE_BUS_OFF)) {
+		priv->can.can_stats.restarts++;
+	}
+
+	if (es->error_factor) {
+		priv->can.can_stats.bus_error++;
+		stats->rx_errors++;
+	}
+
+	priv->bec.txerr = es->txerr;
+	priv->bec.rxerr = es->rxerr;
+	priv->can.state = new_state;
+}
+
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 				const struct kvaser_msg *msg)
 {
@@ -622,30 +674,30 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	unsigned int new_state;
-	u8 channel, status, txerr, rxerr, error_factor;
+	struct kvaser_usb_error_summary es = { };
+	enum can_state old_state;
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
-		channel = msg->u.error_event.channel;
-		status =  msg->u.error_event.status;
-		txerr = msg->u.error_event.tx_errors_count;
-		rxerr = msg->u.error_event.rx_errors_count;
-		error_factor = msg->u.error_event.error_factor;
+		es.channel = msg->u.error_event.channel;
+		es.status =  msg->u.error_event.status;
+		es.txerr = msg->u.error_event.tx_errors_count;
+		es.rxerr = msg->u.error_event.rx_errors_count;
+		es.error_factor = msg->u.error_event.error_factor;
 		break;
 	case CMD_LOG_MESSAGE:
-		channel = msg->u.log_message.channel;
-		status = msg->u.log_message.data[0];
-		txerr = msg->u.log_message.data[2];
-		rxerr = msg->u.log_message.data[3];
-		error_factor = msg->u.log_message.data[1];
+		es.channel = msg->u.log_message.channel;
+		es.status = msg->u.log_message.data[0];
+		es.txerr = msg->u.log_message.data[2];
+		es.rxerr = msg->u.log_message.data[3];
+		es.error_factor = msg->u.log_message.data[1];
 		break;
 	case CMD_CHIP_STATE_EVENT:
-		channel = msg->u.chip_state_event.channel;
-		status =  msg->u.chip_state_event.status;
-		txerr = msg->u.chip_state_event.tx_errors_count;
-		rxerr = msg->u.chip_state_event.rx_errors_count;
-		error_factor = 0;
+		es.channel = msg->u.chip_state_event.channel;
+		es.status =  msg->u.chip_state_event.status;
+		es.txerr = msg->u.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.chip_state_event.rx_errors_count;
+		es.error_factor = 0;
 		break;
 	default:
 		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
@@ -653,116 +705,93 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		return;
 	}
 
-	if (channel >= dev->nchannels) {
+	if (es.channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", channel);
+			"Invalid channel number (%d)\n", es.channel);
 		return;
 	}
 
-	priv = dev->nets[channel];
+	priv = dev->nets[es.channel];
 	stats = &priv->netdev->stats;
 
+	/* Update all of the can interface's state and error counters before
+	 * trying any skb allocation that can actually fail with -ENOMEM.
+	 */
+	old_state = priv->can.state;
+	kvaser_usb_rx_error_update_can_state(priv, &es);
+
 	skb = alloc_can_err_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->rx_dropped++;
 		return;
 	}
 
-	new_state = priv->can.state;
-
-	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", status);
-
-	if (status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
+	if (es.status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
 		cf->can_id |= CAN_ERR_BUSOFF;
 
-		priv->can.can_stats.bus_off++;
 		if (!priv->can.restart_ms)
 			kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
-
 		netif_carrier_off(priv->netdev);
-
-		new_state = CAN_STATE_BUS_OFF;
-	} else if (status & M16C_STATE_BUS_PASSIVE) {
-		if (priv->can.state != CAN_STATE_ERROR_PASSIVE) {
+	} else if (es.status & M16C_STATE_BUS_PASSIVE) {
+		if (old_state != CAN_STATE_ERROR_PASSIVE) {
 			cf->can_id |= CAN_ERR_CRTL;
 
-			if (txerr || rxerr)
-				cf->data[1] = (txerr > rxerr)
+			if (es.txerr || es.rxerr)
+				cf->data[1] = (es.txerr > es.rxerr)
 						? CAN_ERR_CRTL_TX_PASSIVE
 						: CAN_ERR_CRTL_RX_PASSIVE;
 			else
 				cf->data[1] = CAN_ERR_CRTL_TX_PASSIVE |
 					      CAN_ERR_CRTL_RX_PASSIVE;
-
-			priv->can.can_stats.error_passive++;
 		}
-
-		new_state = CAN_STATE_ERROR_PASSIVE;
-	} else if (status & M16C_STATE_BUS_ERROR) {
-		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((txerr >= 96) || (rxerr >= 96))) {
+	} else if (es.status & M16C_STATE_BUS_ERROR) {
+		if ((old_state < CAN_STATE_ERROR_WARNING) &&
+		    ((es.txerr >= 96) || (es.rxerr >= 96))) {
 			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (txerr > rxerr)
+			cf->data[1] = (es.txerr > es.rxerr)
 					? CAN_ERR_CRTL_TX_WARNING
 					: CAN_ERR_CRTL_RX_WARNING;
-
-			priv->can.can_stats.error_warning++;
-			new_state = CAN_STATE_ERROR_WARNING;
-		} else if ((priv->can.state > CAN_STATE_ERROR_ACTIVE) &&
-			   ((txerr < 96) && (rxerr < 96))) {
+		} else if ((old_state > CAN_STATE_ERROR_ACTIVE) &&
+			   ((es.txerr < 96) && (es.rxerr < 96))) {
 			cf->can_id |= CAN_ERR_PROT;
 			cf->data[2] = CAN_ERR_PROT_ACTIVE;
-
-			new_state = CAN_STATE_ERROR_ACTIVE;
 		}
 	}
 
-	if (!status) {
+	if (!es.status) {
 		cf->can_id |= CAN_ERR_PROT;
 		cf->data[2] = CAN_ERR_PROT_ACTIVE;
-
-		new_state = CAN_STATE_ERROR_ACTIVE;
 	}
 
 	if (priv->can.restart_ms &&
-	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
-	    (new_state < CAN_STATE_BUS_OFF)) {
+	    (old_state >= CAN_STATE_BUS_OFF) &&
+	    (priv->can.state < CAN_STATE_BUS_OFF)) {
 		cf->can_id |= CAN_ERR_RESTARTED;
 		netif_carrier_on(priv->netdev);
-
-		priv->can.can_stats.restarts++;
 	}
 
-	if (error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
-
+	if (es.error_factor) {
 		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
 
-		if (error_factor & M16C_EF_ACKE)
+		if (es.error_factor & M16C_EF_ACKE)
 			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (error_factor & M16C_EF_CRCE)
+		if (es.error_factor & M16C_EF_CRCE)
 			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
 					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (error_factor & M16C_EF_FORME)
+		if (es.error_factor & M16C_EF_FORME)
 			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (error_factor & M16C_EF_STFE)
+		if (es.error_factor & M16C_EF_STFE)
 			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (error_factor & M16C_EF_BITE0)
+		if (es.error_factor & M16C_EF_BITE0)
 			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (error_factor & M16C_EF_BITE1)
+		if (es.error_factor & M16C_EF_BITE1)
 			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (error_factor & M16C_EF_TRE)
+		if (es.error_factor & M16C_EF_TRE)
 			cf->data[2] |= CAN_ERR_PROT_TX;
 	}
 
-	cf->data[6] = txerr;
-	cf->data[7] = rxerr;
-
-	priv->bec.txerr = txerr;
-	priv->bec.rxerr = rxerr;
-
-	priv->can.state = new_state;
+	cf->data[6] = es.txerr;
+	cf->data[7] = es.rxerr;
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
@@ -786,6 +815,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	}
 
 	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+		stats->rx_over_errors++;
+		stats->rx_errors++;
+
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (!skb) {
 			stats->rx_dropped++;
@@ -795,9 +827,6 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		cf->can_id |= CAN_ERR_CRTL;
 		cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
 
-		stats->rx_over_errors++;
-		stats->rx_errors++;
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
 		netif_rx(skb);
-- 
1.7.7.6


^ permalink raw reply related	[relevance 70%]

* [PATCH v6 6/7] can: kvaser_usb: Consolidate and unify state change handling
  2015-01-26  5:27 70%           ` [PATCH v6 5/7] can: kvaser_usb: Update interface state before exiting on OOM Ahmed S. Darwish
@ 2015-01-26  5:29 80%             ` Ahmed S. Darwish
  2015-01-26  5:33 39%               ` [PATCH v6 7/7] can: kvaser_usb: Add support for the USBcan-II family Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:29 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Replace most of the can interface's state and error counters
handling with the new can-dev can_change_state() mechanism.

Suggested-by: Andri Yngvason <andri.yngvason@marel.com>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c |  113 ++++++++++++++++---------------------
 1 files changed, 49 insertions(+), 64 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index f57ce55..ddc2954 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -620,39 +620,44 @@ static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 }
 
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
-						 const struct kvaser_usb_error_summary *es)
+						 const struct kvaser_usb_error_summary *es,
+						 struct can_frame *cf)
 {
 	struct net_device_stats *stats;
-	enum can_state new_state;
-
-	stats = &priv->netdev->stats;
-	new_state = priv->can.state;
+	enum can_state cur_state, new_state, tx_state, rx_state;
 
 	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	if (es->status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
-		priv->can.can_stats.bus_off++;
+	stats = &priv->netdev->stats;
+	new_state = cur_state = priv->can.state;
+
+	if (es->status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET))
 		new_state = CAN_STATE_BUS_OFF;
-	} else if (es->status & M16C_STATE_BUS_PASSIVE) {
-		if (priv->can.state != CAN_STATE_ERROR_PASSIVE)
-			priv->can.can_stats.error_passive++;
+	else if (es->status & M16C_STATE_BUS_PASSIVE)
 		new_state = CAN_STATE_ERROR_PASSIVE;
-	} else if (es->status & M16C_STATE_BUS_ERROR) {
-		if ((priv->can.state < CAN_STATE_ERROR_WARNING) &&
-		    ((es->txerr >= 96) || (es->rxerr >= 96))) {
-			priv->can.can_stats.error_warning++;
+	else if (es->status & M16C_STATE_BUS_ERROR) {
+		if ((es->txerr >= 256) || (es->rxerr >= 256))
+			new_state = CAN_STATE_BUS_OFF;
+		else if ((es->txerr >= 128) || (es->rxerr >= 128))
+			new_state = CAN_STATE_ERROR_PASSIVE;
+		else if ((es->txerr >= 96) || (es->rxerr >= 96))
 			new_state = CAN_STATE_ERROR_WARNING;
-		} else if ((priv->can.state > CAN_STATE_ERROR_ACTIVE) &&
-			   ((es->txerr < 96) && (es->rxerr < 96))) {
+		else if (cur_state > CAN_STATE_ERROR_ACTIVE)
 			new_state = CAN_STATE_ERROR_ACTIVE;
-		}
 	}
 
 	if (!es->status)
 		new_state = CAN_STATE_ERROR_ACTIVE;
 
+	if (new_state != cur_state) {
+		tx_state = (es->txerr >= es->rxerr) ? new_state : 0;
+		rx_state = (es->txerr <= es->rxerr) ? new_state : 0;
+
+		can_change_state(priv->netdev, cf, tx_state, rx_state);
+	}
+
 	if (priv->can.restart_ms &&
-	    (priv->can.state >= CAN_STATE_BUS_OFF) &&
+	    (cur_state >= CAN_STATE_BUS_OFF) &&
 	    (new_state < CAN_STATE_BUS_OFF)) {
 		priv->can.can_stats.restarts++;
 	}
@@ -664,18 +669,17 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 
 	priv->bec.txerr = es->txerr;
 	priv->bec.rxerr = es->rxerr;
-	priv->can.state = new_state;
 }
 
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 				const struct kvaser_msg *msg)
 {
-	struct can_frame *cf;
+	struct can_frame *cf, tmp_cf = { .can_id = CAN_ERR_FLAG, .can_dlc = CAN_ERR_DLC };
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
 	struct kvaser_usb_error_summary es = { };
-	enum can_state old_state;
+	enum can_state old_state, new_state;
 
 	switch (msg->id) {
 	case CMD_CAN_ERROR_EVENT:
@@ -715,59 +719,40 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	stats = &priv->netdev->stats;
 
 	/* Update all of the can interface's state and error counters before
-	 * trying any skb allocation that can actually fail with -ENOMEM.
+	 * trying any memory allocation that can actually fail with -ENOMEM.
+	 *
+	 * We send a temporary stack-allocated error can frame to
+	 * can_change_state() for the very same reason.
+	 *
+	 * TODO: Split can_change_state() responsibility between updating the
+	 * can interface's state and counters, and the setting up of can error
+	 * frame ID and data to userspace. Remove stack allocation afterwards.
 	 */
 	old_state = priv->can.state;
-	kvaser_usb_rx_error_update_can_state(priv, &es);
+	kvaser_usb_rx_error_update_can_state(priv, &es, &tmp_cf);
+	new_state = priv->can.state;
 
 	skb = alloc_can_err_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->rx_dropped++;
 		return;
 	}
-
-	if (es.status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
-		cf->can_id |= CAN_ERR_BUSOFF;
-
-		if (!priv->can.restart_ms)
-			kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
-		netif_carrier_off(priv->netdev);
-	} else if (es.status & M16C_STATE_BUS_PASSIVE) {
-		if (old_state != CAN_STATE_ERROR_PASSIVE) {
-			cf->can_id |= CAN_ERR_CRTL;
-
-			if (es.txerr || es.rxerr)
-				cf->data[1] = (es.txerr > es.rxerr)
-						? CAN_ERR_CRTL_TX_PASSIVE
-						: CAN_ERR_CRTL_RX_PASSIVE;
-			else
-				cf->data[1] = CAN_ERR_CRTL_TX_PASSIVE |
-					      CAN_ERR_CRTL_RX_PASSIVE;
-		}
-	} else if (es.status & M16C_STATE_BUS_ERROR) {
-		if ((old_state < CAN_STATE_ERROR_WARNING) &&
-		    ((es.txerr >= 96) || (es.rxerr >= 96))) {
-			cf->can_id |= CAN_ERR_CRTL;
-			cf->data[1] = (es.txerr > es.rxerr)
-					? CAN_ERR_CRTL_TX_WARNING
-					: CAN_ERR_CRTL_RX_WARNING;
-		} else if ((old_state > CAN_STATE_ERROR_ACTIVE) &&
-			   ((es.txerr < 96) && (es.rxerr < 96))) {
-			cf->can_id |= CAN_ERR_PROT;
-			cf->data[2] = CAN_ERR_PROT_ACTIVE;
+	memcpy(cf, &tmp_cf, sizeof(*cf));
+
+	if (new_state != old_state) {
+		if (es.status &
+		    (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
+			if (!priv->can.restart_ms)
+				kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
+			netif_carrier_off(priv->netdev);
 		}
-	}
-
-	if (!es.status) {
-		cf->can_id |= CAN_ERR_PROT;
-		cf->data[2] = CAN_ERR_PROT_ACTIVE;
-	}
 
-	if (priv->can.restart_ms &&
-	    (old_state >= CAN_STATE_BUS_OFF) &&
-	    (priv->can.state < CAN_STATE_BUS_OFF)) {
-		cf->can_id |= CAN_ERR_RESTARTED;
-		netif_carrier_on(priv->netdev);
+		if (priv->can.restart_ms &&
+		    (old_state >= CAN_STATE_BUS_OFF) &&
+		    (new_state < CAN_STATE_BUS_OFF)) {
+			cf->can_id |= CAN_ERR_RESTARTED;
+			netif_carrier_on(priv->netdev);
+		}
 	}
 
 	if (es.error_factor) {
-- 
1.7.7.6


^ permalink raw reply related	[relevance 80%]

* [PATCH v6 7/7] can: kvaser_usb: Add support for the USBcan-II family
  2015-01-26  5:29 80%             ` [PATCH v6 6/7] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
@ 2015-01-26  5:33 39%               ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-01-26  5:33 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

CAN to USB interfaces sold by the Swedish manufacturer Kvaser are
divided into two major families: 'Leaf', and 'USBcanII'.  From an
Operating System perspective, the firmware of both families behave
in a not too drastically different fashion.

This patch adds support for the USBcanII family of devices to the
current Kvaser Leaf-only driver.

CAN frames sending, receiving, and error handling paths has been
tested using the dual-channel "Kvaser USBcan II HS/LS" dongle. It
should also work nicely with other products in the same category.

List of new devices supported by this driver update:

         - Kvaser USBcan II HS/HS
         - Kvaser USBcan II HS/LS
         - Kvaser USBcan Rugged ("USBcan Rev B")
         - Kvaser Memorator HS/HS
         - Kvaser Memorator HS/LS
         - Scania VCI2 (if you have the Kvaser logo on top)

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/Kconfig      |    8 +-
 drivers/net/can/usb/kvaser_usb.c |  590 ++++++++++++++++++++++++++++++--------
 2 files changed, 474 insertions(+), 124 deletions(-)

** V6 Changelog:
- Revert to the error-active state if the error counters were
  decreased by hardware
- Rebase over a new set of upstream Leaf-driver bugfixes

** V5 Changelog:
- Rebase on the new CAN error state changes added for the Leaf driver
- Add minor changes (remove unused commands, constify poniters, etc.)

** V4 Changelog:
- Use type-safe C methods instead of cpp macros
- Remove defensive checks against non-existing families
- Re-order methods to remove forward declarations
- Smaller stuff spotted by earlier review (function prefexes, etc.)

** V3 Changelog:
- Fix padding for the usbcan_msg_tx_acknowledge command
- Remove kvaser_usb->max_channels and the MAX_NET_DEVICES macro
- Rename commands to CMD_LEAF_xxx and CMD_USBCAN_xxx
- Apply checkpatch.pl suggestions ('net/' comments, multi-line strings, etc.)

** V2 Changelog:
- Update Kconfig entries
- Use actual number of CAN channels (instead of max) where appropriate
- Rebase over a new set of UsbcanII-independent driver fixes

diff --git a/drivers/net/can/usb/Kconfig b/drivers/net/can/usb/Kconfig
index a77db919..f6f5500 100644
--- a/drivers/net/can/usb/Kconfig
+++ b/drivers/net/can/usb/Kconfig
@@ -25,7 +25,7 @@ config CAN_KVASER_USB
 	tristate "Kvaser CAN/USB interface"
 	---help---
 	  This driver adds support for Kvaser CAN/USB devices like Kvaser
-	  Leaf Light.
+	  Leaf Light and Kvaser USBcan II.
 
 	  The driver provides support for the following devices:
 	    - Kvaser Leaf Light
@@ -46,6 +46,12 @@ config CAN_KVASER_USB
 	    - Kvaser USBcan R
 	    - Kvaser Leaf Light v2
 	    - Kvaser Mini PCI Express HS
+	    - Kvaser USBcan II HS/HS
+	    - Kvaser USBcan II HS/LS
+	    - Kvaser USBcan Rugged ("USBcan Rev B")
+	    - Kvaser Memorator HS/HS
+	    - Kvaser Memorator HS/LS
+	    - Scania VCI2 (if you have the Kvaser logo on top)
 
 	  If unsure, say N.
 
diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index ddc2954..17d28d7 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -6,10 +6,12 @@
  * Parts of this driver are based on the following:
  *  - Kvaser linux leaf driver (version 4.78)
  *  - CAN driver for esd CAN-USB/2
+ *  - Kvaser linux usbcanII driver (version 5.3)
  *
  * Copyright (C) 2002-2006 KVASER AB, Sweden. All rights reserved.
  * Copyright (C) 2010 Matthias Fuchs <matthias.fuchs@esd.eu>, esd gmbh
  * Copyright (C) 2012 Olivier Sobrie <olivier@sobrie.be>
+ * Copyright (C) 2015 Valeo A.S.
  */
 
 #include <linux/completion.h>
@@ -30,8 +32,9 @@
 #define RX_BUFFER_SIZE			3072
 #define CAN_USB_CLOCK			8000000
 #define MAX_NET_DEVICES			3
+#define MAX_USBCAN_NET_DEVICES		2
 
-/* Kvaser USB devices */
+/* Kvaser Leaf USB devices */
 #define KVASER_VENDOR_ID		0x0bfd
 #define USB_LEAF_DEVEL_PRODUCT_ID	10
 #define USB_LEAF_LITE_PRODUCT_ID	11
@@ -56,6 +59,24 @@
 #define USB_LEAF_LITE_V2_PRODUCT_ID	288
 #define USB_MINI_PCIE_HS_PRODUCT_ID	289
 
+static inline bool kvaser_is_leaf(const struct usb_device_id *id)
+{
+	return id->idProduct >= USB_LEAF_DEVEL_PRODUCT_ID &&
+	       id->idProduct <= USB_MINI_PCIE_HS_PRODUCT_ID;
+}
+
+/* Kvaser USBCan-II devices */
+#define USB_USBCAN_REVB_PRODUCT_ID	2
+#define USB_VCI2_PRODUCT_ID		3
+#define USB_USBCAN2_PRODUCT_ID		4
+#define USB_MEMORATOR_PRODUCT_ID	5
+
+static inline bool kvaser_is_usbcan(const struct usb_device_id *id)
+{
+	return id->idProduct >= USB_USBCAN_REVB_PRODUCT_ID &&
+	       id->idProduct <= USB_MEMORATOR_PRODUCT_ID;
+}
+
 /* USB devices features */
 #define KVASER_HAS_SILENT_MODE		BIT(0)
 #define KVASER_HAS_TXRX_ERRORS		BIT(1)
@@ -73,7 +94,7 @@
 #define MSG_FLAG_TX_ACK			BIT(6)
 #define MSG_FLAG_TX_REQUEST		BIT(7)
 
-/* Can states */
+/* Can states (M16C CxSTRH register) */
 #define M16C_STATE_BUS_RESET		BIT(0)
 #define M16C_STATE_BUS_ERROR		BIT(4)
 #define M16C_STATE_BUS_PASSIVE		BIT(5)
@@ -98,7 +119,11 @@
 #define CMD_START_CHIP_REPLY		27
 #define CMD_STOP_CHIP			28
 #define CMD_STOP_CHIP_REPLY		29
-#define CMD_GET_CARD_INFO2		32
+
+#define CMD_LEAF_GET_CARD_INFO2		32
+#define CMD_USBCAN_RESET_CLOCK		32
+#define CMD_USBCAN_CLOCK_OVERFLOW_EVENT	33
+
 #define CMD_GET_CARD_INFO		34
 #define CMD_GET_CARD_INFO_REPLY		35
 #define CMD_GET_SOFTWARE_INFO		38
@@ -108,8 +133,9 @@
 #define CMD_RESET_ERROR_COUNTER		49
 #define CMD_TX_ACKNOWLEDGE		50
 #define CMD_CAN_ERROR_EVENT		51
-#define CMD_USB_THROTTLE		77
-#define CMD_LOG_MESSAGE			106
+
+#define CMD_LEAF_USB_THROTTLE		77
+#define CMD_LEAF_LOG_MESSAGE		106
 
 /* error factors */
 #define M16C_EF_ACKE			BIT(0)
@@ -121,6 +147,14 @@
 #define M16C_EF_RCVE			BIT(6)
 #define M16C_EF_TRE			BIT(7)
 
+/* Only Leaf-based devices can report M16C error factors,
+ * thus define our own error status flags for USBCANII
+ */
+#define USBCAN_ERROR_STATE_NONE		0
+#define USBCAN_ERROR_STATE_TX_ERROR	BIT(0)
+#define USBCAN_ERROR_STATE_RX_ERROR	BIT(1)
+#define USBCAN_ERROR_STATE_BUSERROR	BIT(2)
+
 /* bittiming parameters */
 #define KVASER_USB_TSEG1_MIN		1
 #define KVASER_USB_TSEG1_MAX		16
@@ -137,9 +171,18 @@
 #define KVASER_CTRL_MODE_SELFRECEPTION	3
 #define KVASER_CTRL_MODE_OFF		4
 
-/* log message */
+/* Extended CAN identifier flag */
 #define KVASER_EXTENDED_FRAME		BIT(31)
 
+/* Kvaser USB CAN dongles are divided into two major families:
+ * - Leaf: Based on Renesas M32C, running firmware labeled as 'filo'
+ * - UsbcanII: Based on Renesas M16C, running firmware labeled as 'helios'
+ */
+enum kvaser_usb_family {
+	KVASER_LEAF,
+	KVASER_USBCAN,
+};
+
 struct kvaser_msg_simple {
 	u8 tid;
 	u8 channel;
@@ -148,30 +191,55 @@ struct kvaser_msg_simple {
 struct kvaser_msg_cardinfo {
 	u8 tid;
 	u8 nchannels;
-	__le32 serial_number;
-	__le32 padding;
+	union {
+		struct {
+			__le32 serial_number;
+			__le32 padding;
+		} __packed leaf0;
+		struct {
+			__le32 serial_number_low;
+			__le32 serial_number_high;
+		} __packed usbcan0;
+	} __packed;
 	__le32 clock_resolution;
 	__le32 mfgdate;
 	u8 ean[8];
 	u8 hw_revision;
-	u8 usb_hs_mode;
-	__le16 padding2;
+	union {
+		struct {
+			u8 usb_hs_mode;
+		} __packed leaf1;
+		struct {
+			u8 padding;
+		} __packed usbcan1;
+	} __packed;
+	__le16 padding;
 } __packed;
 
 struct kvaser_msg_cardinfo2 {
 	u8 tid;
-	u8 channel;
+	u8 reserved;
 	u8 pcb_id[24];
 	__le32 oem_unlock_code;
 } __packed;
 
-struct kvaser_msg_softinfo {
+struct leaf_msg_softinfo {
 	u8 tid;
-	u8 channel;
+	u8 padding0;
 	__le32 sw_options;
 	__le32 fw_version;
 	__le16 max_outstanding_tx;
-	__le16 padding[9];
+	__le16 padding1[9];
+} __packed;
+
+struct usbcan_msg_softinfo {
+	u8 tid;
+	u8 fw_name[5];
+	__le16 max_outstanding_tx;
+	u8 padding[6];
+	__le32 fw_version;
+	__le16 checksum;
+	__le16 sw_options;
 } __packed;
 
 struct kvaser_msg_busparams {
@@ -188,36 +256,86 @@ struct kvaser_msg_tx_can {
 	u8 channel;
 	u8 tid;
 	u8 msg[14];
-	u8 padding;
-	u8 flags;
+	union {
+		struct {
+			u8 padding;
+			u8 flags;
+		} __packed leaf;
+		struct {
+			u8 flags;
+			u8 padding;
+		} __packed usbcan;
+	} __packed;
+} __packed;
+
+struct kvaser_msg_rx_can_header {
+	u8 channel;
+	u8 flag;
 } __packed;
 
-struct kvaser_msg_rx_can {
+struct leaf_msg_rx_can {
 	u8 channel;
 	u8 flag;
+
 	__le16 time[3];
 	u8 msg[14];
 } __packed;
 
-struct kvaser_msg_chip_state_event {
+struct usbcan_msg_rx_can {
+	u8 channel;
+	u8 flag;
+
+	u8 msg[14];
+	__le16 time;
+} __packed;
+
+struct leaf_msg_chip_state_event {
 	u8 tid;
 	u8 channel;
+
 	__le16 time[3];
 	u8 tx_errors_count;
 	u8 rx_errors_count;
+
+	u8 status;
+	u8 padding[3];
+} __packed;
+
+struct usbcan_msg_chip_state_event {
+	u8 tid;
+	u8 channel;
+
+	u8 tx_errors_count;
+	u8 rx_errors_count;
+	__le16 time;
+
 	u8 status;
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_tx_acknowledge {
+struct kvaser_msg_tx_acknowledge_header {
 	u8 channel;
 	u8 tid;
+} __packed;
+
+struct leaf_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
 	__le16 time[3];
 	u8 flags;
 	u8 time_offset;
 } __packed;
 
-struct kvaser_msg_error_event {
+struct usbcan_msg_tx_acknowledge {
+	u8 channel;
+	u8 tid;
+
+	__le16 time;
+	__le16 padding;
+} __packed;
+
+struct leaf_msg_error_event {
 	u8 tid;
 	u8 flags;
 	__le16 time[3];
@@ -229,6 +347,18 @@ struct kvaser_msg_error_event {
 	u8 error_factor;
 } __packed;
 
+struct usbcan_msg_error_event {
+	u8 tid;
+	u8 padding;
+	u8 tx_errors_count_ch0;
+	u8 rx_errors_count_ch0;
+	u8 tx_errors_count_ch1;
+	u8 rx_errors_count_ch1;
+	u8 status_ch0;
+	u8 status_ch1;
+	__le16 time;
+} __packed;
+
 struct kvaser_msg_ctrl_mode {
 	u8 tid;
 	u8 channel;
@@ -243,7 +373,7 @@ struct kvaser_msg_flush_queue {
 	u8 padding[3];
 } __packed;
 
-struct kvaser_msg_log_message {
+struct leaf_msg_log_message {
 	u8 channel;
 	u8 flags;
 	__le16 time[3];
@@ -260,21 +390,55 @@ struct kvaser_msg {
 		struct kvaser_msg_simple simple;
 		struct kvaser_msg_cardinfo cardinfo;
 		struct kvaser_msg_cardinfo2 cardinfo2;
-		struct kvaser_msg_softinfo softinfo;
 		struct kvaser_msg_busparams busparams;
+
+		struct kvaser_msg_rx_can_header rx_can_header;
+		struct kvaser_msg_tx_acknowledge_header tx_acknowledge_header;
+
+		union {
+			struct leaf_msg_softinfo softinfo;
+			struct leaf_msg_rx_can rx_can;
+			struct leaf_msg_chip_state_event chip_state_event;
+			struct leaf_msg_tx_acknowledge tx_acknowledge;
+			struct leaf_msg_error_event error_event;
+			struct leaf_msg_log_message log_message;
+		} __packed leaf;
+
+		union {
+			struct usbcan_msg_softinfo softinfo;
+			struct usbcan_msg_rx_can rx_can;
+			struct usbcan_msg_chip_state_event chip_state_event;
+			struct usbcan_msg_tx_acknowledge tx_acknowledge;
+			struct usbcan_msg_error_event error_event;
+		} __packed usbcan;
+
 		struct kvaser_msg_tx_can tx_can;
-		struct kvaser_msg_rx_can rx_can;
-		struct kvaser_msg_chip_state_event chip_state_event;
-		struct kvaser_msg_tx_acknowledge tx_acknowledge;
-		struct kvaser_msg_error_event error_event;
 		struct kvaser_msg_ctrl_mode ctrl_mode;
 		struct kvaser_msg_flush_queue flush_queue;
-		struct kvaser_msg_log_message log_message;
 	} u;
 } __packed;
 
+/* Summary of a kvaser error event, for a unified Leaf/Usbcan error
+ * handling. Some discrepancies between the two families exist:
+ *
+ * - USBCAN firmware does not report M16C "error factors"
+ * - USBCAN controllers has difficulties reporting if the raised error
+ *   event is for ch0 or ch1. They leave such arbitration to the OS
+ *   driver by letting it compare error counters with previous values
+ *   and decide the error event's channel. Thus for USBCAN, the channel
+ *   field is only advisory.
+ */
 struct kvaser_usb_error_summary {
-	u8 channel, status, txerr, rxerr, error_factor;
+	u8 channel, status, txerr, rxerr;
+	union {
+		struct {
+			u8 error_factor;
+		} leaf;
+		struct {
+			u8 other_ch_status;
+			u8 error_state;
+		} usbcan;
+	};
 };
 
 struct kvaser_usb_tx_urb_context {
@@ -292,6 +456,7 @@ struct kvaser_usb {
 
 	u32 fw_version;
 	unsigned int nchannels;
+	enum kvaser_usb_family family;
 
 	bool rxinitdone;
 	void *rxbuf[MAX_RX_URBS];
@@ -315,6 +480,7 @@ struct kvaser_usb_net_priv {
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
+	/* Leaf family IDs */
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
@@ -364,6 +530,17 @@ static const struct usb_device_id kvaser_usb_table[] = {
 		.driver_info = KVASER_HAS_TXRX_ERRORS },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
 	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+
+	/* USBCANII family IDs */
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+	{ USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+		.driver_info = KVASER_HAS_TXRX_ERRORS },
+
 	{ }
 };
 MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
@@ -467,7 +644,14 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	if (err)
 		return err;
 
-	dev->fw_version = le32_to_cpu(msg.u.softinfo.fw_version);
+	switch (dev->family) {
+	case KVASER_LEAF:
+		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		break;
+	case KVASER_USBCAN:
+		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		break;
+	}
 
 	return 0;
 }
@@ -486,7 +670,9 @@ static int kvaser_usb_get_card_info(struct kvaser_usb *dev)
 		return err;
 
 	dev->nchannels = msg.u.cardinfo.nchannels;
-	if (dev->nchannels > MAX_NET_DEVICES)
+	if ((dev->nchannels > MAX_NET_DEVICES) ||
+	    (dev->family == KVASER_USBCAN &&
+	     dev->nchannels > MAX_USBCAN_NET_DEVICES))
 		return -EINVAL;
 
 	return 0;
@@ -500,8 +686,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	u8 channel = msg->u.tx_acknowledge.channel;
-	u8 tid = msg->u.tx_acknowledge.tid;
+	u8 channel, tid;
+
+	channel = msg->u.tx_acknowledge_header.channel;
+	tid = msg->u.tx_acknowledge_header.tid;
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -623,12 +811,12 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 						 const struct kvaser_usb_error_summary *es,
 						 struct can_frame *cf)
 {
-	struct net_device_stats *stats;
+	struct kvaser_usb *dev = priv->dev;
+	struct net_device_stats *stats = &priv->netdev->stats;
 	enum can_state cur_state, new_state, tx_state, rx_state;
 
 	netdev_dbg(priv->netdev, "Error status: 0x%02x\n", es->status);
 
-	stats = &priv->netdev->stats;
 	new_state = cur_state = priv->can.state;
 
 	if (es->status & (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET))
@@ -662,9 +850,22 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 		priv->can.can_stats.restarts++;
 	}
 
-	if (es->error_factor) {
-		priv->can.can_stats.bus_error++;
-		stats->rx_errors++;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			priv->can.can_stats.bus_error++;
+			stats->rx_errors++;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_TX_ERROR)
+			stats->tx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_RX_ERROR)
+			stats->rx_errors++;
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			priv->can.can_stats.bus_error++;
+		}
+		break;
 	}
 
 	priv->bec.txerr = es->txerr;
@@ -672,50 +873,21 @@ static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *pri
 }
 
 static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
-				const struct kvaser_msg *msg)
+				const struct kvaser_usb_error_summary *es)
 {
 	struct can_frame *cf, tmp_cf = { .can_id = CAN_ERR_FLAG, .can_dlc = CAN_ERR_DLC };
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
 	struct kvaser_usb_net_priv *priv;
-	struct kvaser_usb_error_summary es = { };
 	enum can_state old_state, new_state;
 
-	switch (msg->id) {
-	case CMD_CAN_ERROR_EVENT:
-		es.channel = msg->u.error_event.channel;
-		es.status =  msg->u.error_event.status;
-		es.txerr = msg->u.error_event.tx_errors_count;
-		es.rxerr = msg->u.error_event.rx_errors_count;
-		es.error_factor = msg->u.error_event.error_factor;
-		break;
-	case CMD_LOG_MESSAGE:
-		es.channel = msg->u.log_message.channel;
-		es.status = msg->u.log_message.data[0];
-		es.txerr = msg->u.log_message.data[2];
-		es.rxerr = msg->u.log_message.data[3];
-		es.error_factor = msg->u.log_message.data[1];
-		break;
-	case CMD_CHIP_STATE_EVENT:
-		es.channel = msg->u.chip_state_event.channel;
-		es.status =  msg->u.chip_state_event.status;
-		es.txerr = msg->u.chip_state_event.tx_errors_count;
-		es.rxerr = msg->u.chip_state_event.rx_errors_count;
-		es.error_factor = 0;
-		break;
-	default:
-		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
-			msg->id);
-		return;
-	}
-
-	if (es.channel >= dev->nchannels) {
+	if (es->channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
-			"Invalid channel number (%d)\n", es.channel);
+			"Invalid channel number (%d)\n", es->channel);
 		return;
 	}
 
-	priv = dev->nets[es.channel];
+	priv = dev->nets[es->channel];
 	stats = &priv->netdev->stats;
 
 	/* Update all of the can interface's state and error counters before
@@ -729,7 +901,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	 * frame ID and data to userspace. Remove stack allocation afterwards.
 	 */
 	old_state = priv->can.state;
-	kvaser_usb_rx_error_update_can_state(priv, &es, &tmp_cf);
+	kvaser_usb_rx_error_update_can_state(priv, es, &tmp_cf);
 	new_state = priv->can.state;
 
 	skb = alloc_can_err_skb(priv->netdev, &cf);
@@ -740,7 +912,7 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 	memcpy(cf, &tmp_cf, sizeof(*cf));
 
 	if (new_state != old_state) {
-		if (es.status &
+		if (es->status &
 		    (M16C_STATE_BUS_OFF | M16C_STATE_BUS_RESET)) {
 			if (!priv->can.restart_ms)
 				kvaser_usb_simple_msg_async(priv, CMD_STOP_CHIP);
@@ -755,34 +927,161 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 		}
 	}
 
-	if (es.error_factor) {
-		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
-
-		if (es.error_factor & M16C_EF_ACKE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
-		if (es.error_factor & M16C_EF_CRCE)
-			cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
-					CAN_ERR_PROT_LOC_CRC_DEL);
-		if (es.error_factor & M16C_EF_FORME)
-			cf->data[2] |= CAN_ERR_PROT_FORM;
-		if (es.error_factor & M16C_EF_STFE)
-			cf->data[2] |= CAN_ERR_PROT_STUFF;
-		if (es.error_factor & M16C_EF_BITE0)
-			cf->data[2] |= CAN_ERR_PROT_BIT0;
-		if (es.error_factor & M16C_EF_BITE1)
-			cf->data[2] |= CAN_ERR_PROT_BIT1;
-		if (es.error_factor & M16C_EF_TRE)
-			cf->data[2] |= CAN_ERR_PROT_TX;
+	switch (dev->family) {
+	case KVASER_LEAF:
+		if (es->leaf.error_factor) {
+			cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+
+			if (es->leaf.error_factor & M16C_EF_ACKE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_ACK);
+			if (es->leaf.error_factor & M16C_EF_CRCE)
+				cf->data[3] |= (CAN_ERR_PROT_LOC_CRC_SEQ |
+						CAN_ERR_PROT_LOC_CRC_DEL);
+			if (es->leaf.error_factor & M16C_EF_FORME)
+				cf->data[2] |= CAN_ERR_PROT_FORM;
+			if (es->leaf.error_factor & M16C_EF_STFE)
+				cf->data[2] |= CAN_ERR_PROT_STUFF;
+			if (es->leaf.error_factor & M16C_EF_BITE0)
+				cf->data[2] |= CAN_ERR_PROT_BIT0;
+			if (es->leaf.error_factor & M16C_EF_BITE1)
+				cf->data[2] |= CAN_ERR_PROT_BIT1;
+			if (es->leaf.error_factor & M16C_EF_TRE)
+				cf->data[2] |= CAN_ERR_PROT_TX;
+		}
+		break;
+	case KVASER_USBCAN:
+		if (es->usbcan.error_state & USBCAN_ERROR_STATE_BUSERROR) {
+			cf->can_id |= CAN_ERR_BUSERROR;
+		}
+		break;
 	}
 
-	cf->data[6] = es.txerr;
-	cf->data[7] = es.rxerr;
+	cf->data[6] = es->txerr;
+	cf->data[7] = es->rxerr;
 
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
 	netif_rx(skb);
 }
 
+/* For USBCAN, report error to userspace iff the channels's errors counter
+ * has changed, or we're the only channel seeing a bus error state.
+ */
+static void kvaser_usbcan_conditionally_rx_error(const struct kvaser_usb *dev,
+						 struct kvaser_usb_error_summary *es)
+{
+	struct kvaser_usb_net_priv *priv;
+	int channel;
+	bool report_error;
+
+	channel = es->channel;
+	if (channel >= dev->nchannels) {
+		dev_err(dev->udev->dev.parent,
+			"Invalid channel number (%d)\n", channel);
+		return;
+	}
+
+	priv = dev->nets[channel];
+	report_error = false;
+
+	if (es->txerr != priv->bec.txerr) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_TX_ERROR;
+		report_error = true;
+	}
+	if (es->rxerr != priv->bec.rxerr) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_RX_ERROR;
+		report_error = true;
+	}
+	if ((es->status & M16C_STATE_BUS_ERROR) &&
+	    !(es->usbcan.other_ch_status & M16C_STATE_BUS_ERROR)) {
+		es->usbcan.error_state |= USBCAN_ERROR_STATE_BUSERROR;
+		report_error = true;
+	}
+
+	if (report_error)
+		kvaser_usb_rx_error(dev, es);
+}
+
+static void kvaser_usbcan_rx_error(const struct kvaser_usb *dev,
+				   const struct kvaser_msg *msg)
+{
+	struct kvaser_usb_error_summary es = { };
+
+	switch (msg->id) {
+	/* Sometimes errors are sent as unsolicited chip state events */
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.usbcan.chip_state_event.channel;
+		es.status =  msg->u.usbcan.chip_state_event.status;
+		es.txerr = msg->u.usbcan.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.usbcan.chip_state_event.rx_errors_count;
+		kvaser_usbcan_conditionally_rx_error(dev, &es);
+		break;
+
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = 0;
+		es.status = msg->u.usbcan.error_event.status_ch0;
+		es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch0;
+		es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch0;
+		es.usbcan.other_ch_status =
+			msg->u.usbcan.error_event.status_ch1;
+		kvaser_usbcan_conditionally_rx_error(dev, &es);
+
+		/* The USBCAN firmware supports up to 2 channels.
+		 * Now that ch0 was checked, check if ch1 has any errors.
+		 */
+		if (dev->nchannels == MAX_USBCAN_NET_DEVICES) {
+			es.channel = 1;
+			es.status = msg->u.usbcan.error_event.status_ch1;
+			es.txerr = msg->u.usbcan.error_event.tx_errors_count_ch1;
+			es.rxerr = msg->u.usbcan.error_event.rx_errors_count_ch1;
+			es.usbcan.other_ch_status =
+				msg->u.usbcan.error_event.status_ch0;
+			kvaser_usbcan_conditionally_rx_error(dev, &es);
+		}
+		break;
+
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+	}
+}
+
+static void kvaser_leaf_rx_error(const struct kvaser_usb *dev,
+				 const struct kvaser_msg *msg)
+{
+	struct kvaser_usb_error_summary es = { };
+
+	switch (msg->id) {
+	case CMD_CAN_ERROR_EVENT:
+		es.channel = msg->u.leaf.error_event.channel;
+		es.status =  msg->u.leaf.error_event.status;
+		es.txerr = msg->u.leaf.error_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.error_event.rx_errors_count;
+		es.leaf.error_factor = msg->u.leaf.error_event.error_factor;
+		break;
+	case CMD_LEAF_LOG_MESSAGE:
+		es.channel = msg->u.leaf.log_message.channel;
+		es.status = msg->u.leaf.log_message.data[0];
+		es.txerr = msg->u.leaf.log_message.data[2];
+		es.rxerr = msg->u.leaf.log_message.data[3];
+		es.leaf.error_factor = msg->u.leaf.log_message.data[1];
+		break;
+	case CMD_CHIP_STATE_EVENT:
+		es.channel = msg->u.leaf.chip_state_event.channel;
+		es.status =  msg->u.leaf.chip_state_event.status;
+		es.txerr = msg->u.leaf.chip_state_event.tx_errors_count;
+		es.rxerr = msg->u.leaf.chip_state_event.rx_errors_count;
+		es.leaf.error_factor = 0;
+		break;
+	default:
+		dev_err(dev->udev->dev.parent, "Invalid msg id (%d)\n",
+			msg->id);
+		return;
+	}
+
+	kvaser_usb_rx_error(dev, &es);
+}
+
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 				  const struct kvaser_msg *msg)
 {
@@ -790,16 +1089,16 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 	struct sk_buff *skb;
 	struct net_device_stats *stats = &priv->netdev->stats;
 
-	if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
+	if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
 					 MSG_FLAG_NERR)) {
 		netdev_err(priv->netdev, "Unknow error (flags: 0x%02x)\n",
-			   msg->u.rx_can.flag);
+			   msg->u.rx_can_header.flag);
 
 		stats->rx_errors++;
 		return;
 	}
 
-	if (msg->u.rx_can.flag & MSG_FLAG_OVERRUN) {
+	if (msg->u.rx_can_header.flag & MSG_FLAG_OVERRUN) {
 		stats->rx_over_errors++;
 		stats->rx_errors++;
 
@@ -825,7 +1124,8 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	struct can_frame *cf;
 	struct sk_buff *skb;
 	struct net_device_stats *stats;
-	u8 channel = msg->u.rx_can.channel;
+	u8 channel = msg->u.rx_can_header.channel;
+	const u8 *rx_msg = NULL;	/* GCC */
 
 	if (channel >= dev->nchannels) {
 		dev_err(dev->udev->dev.parent,
@@ -836,60 +1136,68 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 	priv = dev->nets[channel];
 	stats = &priv->netdev->stats;
 
-	if ((msg->u.rx_can.flag & MSG_FLAG_ERROR_FRAME) &&
-	    (msg->id == CMD_LOG_MESSAGE)) {
-		kvaser_usb_rx_error(dev, msg);
+	if ((msg->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+	    (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE)) {
+		kvaser_leaf_rx_error(dev, msg);
 		return;
-	} else if (msg->u.rx_can.flag & (MSG_FLAG_ERROR_FRAME |
-					 MSG_FLAG_NERR |
-					 MSG_FLAG_OVERRUN)) {
+	} else if (msg->u.rx_can_header.flag & (MSG_FLAG_ERROR_FRAME |
+						MSG_FLAG_NERR |
+						MSG_FLAG_OVERRUN)) {
 		kvaser_usb_rx_can_err(priv, msg);
 		return;
-	} else if (msg->u.rx_can.flag & ~MSG_FLAG_REMOTE_FRAME) {
+	} else if (msg->u.rx_can_header.flag & ~MSG_FLAG_REMOTE_FRAME) {
 		netdev_warn(priv->netdev,
 			    "Unhandled frame (flags: 0x%02x)",
-			    msg->u.rx_can.flag);
+			    msg->u.rx_can_header.flag);
 		return;
 	}
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		rx_msg = msg->u.leaf.rx_can.msg;
+		break;
+	case KVASER_USBCAN:
+		rx_msg = msg->u.usbcan.rx_can.msg;
+		break;
+	}
+
 	skb = alloc_can_skb(priv->netdev, &cf);
 	if (!skb) {
 		stats->tx_dropped++;
 		return;
 	}
 
-	if (msg->id == CMD_LOG_MESSAGE) {
-		cf->can_id = le32_to_cpu(msg->u.log_message.id);
+	if (dev->family == KVASER_LEAF && msg->id == CMD_LEAF_LOG_MESSAGE) {
+		cf->can_id = le32_to_cpu(msg->u.leaf.log_message.id);
 		if (cf->can_id & KVASER_EXTENDED_FRAME)
 			cf->can_id &= CAN_EFF_MASK | CAN_EFF_FLAG;
 		else
 			cf->can_id &= CAN_SFF_MASK;
 
-		cf->can_dlc = get_can_dlc(msg->u.log_message.dlc);
+		cf->can_dlc = get_can_dlc(msg->u.leaf.log_message.dlc);
 
-		if (msg->u.log_message.flags & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.leaf.log_message.flags & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.log_message.data,
+			memcpy(cf->data, &msg->u.leaf.log_message.data,
 			       cf->can_dlc);
 	} else {
-		cf->can_id = ((msg->u.rx_can.msg[0] & 0x1f) << 6) |
-			     (msg->u.rx_can.msg[1] & 0x3f);
+		cf->can_id = ((rx_msg[0] & 0x1f) << 6) | (rx_msg[1] & 0x3f);
 
 		if (msg->id == CMD_RX_EXT_MESSAGE) {
 			cf->can_id <<= 18;
-			cf->can_id |= ((msg->u.rx_can.msg[2] & 0x0f) << 14) |
-				      ((msg->u.rx_can.msg[3] & 0xff) << 6) |
-				      (msg->u.rx_can.msg[4] & 0x3f);
+			cf->can_id |= ((rx_msg[2] & 0x0f) << 14) |
+				      ((rx_msg[3] & 0xff) << 6) |
+				      (rx_msg[4] & 0x3f);
 			cf->can_id |= CAN_EFF_FLAG;
 		}
 
-		cf->can_dlc = get_can_dlc(msg->u.rx_can.msg[5]);
+		cf->can_dlc = get_can_dlc(rx_msg[5]);
 
-		if (msg->u.rx_can.flag & MSG_FLAG_REMOTE_FRAME)
+		if (msg->u.rx_can_header.flag & MSG_FLAG_REMOTE_FRAME)
 			cf->can_id |= CAN_RTR_FLAG;
 		else
-			memcpy(cf->data, &msg->u.rx_can.msg[6],
+			memcpy(cf->data, &rx_msg[6],
 			       cf->can_dlc);
 	}
 
@@ -952,21 +1260,35 @@ static void kvaser_usb_handle_message(const struct kvaser_usb *dev,
 
 	case CMD_RX_STD_MESSAGE:
 	case CMD_RX_EXT_MESSAGE:
-	case CMD_LOG_MESSAGE:
+		kvaser_usb_rx_can_msg(dev, msg);
+		break;
+
+	case CMD_LEAF_LOG_MESSAGE:
+		if (dev->family != KVASER_LEAF)
+			goto warn;
 		kvaser_usb_rx_can_msg(dev, msg);
 		break;
 
 	case CMD_CHIP_STATE_EVENT:
 	case CMD_CAN_ERROR_EVENT:
-		kvaser_usb_rx_error(dev, msg);
+		if (dev->family == KVASER_LEAF)
+			kvaser_leaf_rx_error(dev, msg);
+		else
+			kvaser_usbcan_rx_error(dev, msg);
 		break;
 
 	case CMD_TX_ACKNOWLEDGE:
 		kvaser_usb_tx_acknowledge(dev, msg);
 		break;
 
+	/* Ignored messages */
+	case CMD_USBCAN_CLOCK_OVERFLOW_EVENT:
+		if (dev->family != KVASER_USBCAN)
+			goto warn;
+		break;
+
 	default:
-		dev_warn(dev->udev->dev.parent,
+warn:		dev_warn(dev->udev->dev.parent,
 			 "Unhandled message (%d)\n", msg->id);
 		break;
 	}
@@ -1186,7 +1508,7 @@ static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 				  dev->rxbuf[i],
 				  dev->rxbuf_dma[i]);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++) {
+	for (i = 0; i < dev->nchannels; i++) {
 		struct kvaser_usb_net_priv *priv = dev->nets[i];
 
 		if (priv)
@@ -1294,6 +1616,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err;
 	int ret = NETDEV_TX_OK;
+	u8 *msg_tx_can_flags = NULL;		/* GCC */
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1315,9 +1638,19 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	msg = buf;
 	msg->len = MSG_HEADER_LEN + sizeof(struct kvaser_msg_tx_can);
-	msg->u.tx_can.flags = 0;
 	msg->u.tx_can.channel = priv->channel;
 
+	switch (dev->family) {
+	case KVASER_LEAF:
+		msg_tx_can_flags = &msg->u.tx_can.leaf.flags;
+		break;
+	case KVASER_USBCAN:
+		msg_tx_can_flags = &msg->u.tx_can.usbcan.flags;
+		break;
+	}
+
+	*msg_tx_can_flags = 0;
+
 	if (cf->can_id & CAN_EFF_FLAG) {
 		msg->id = CMD_TX_EXT_MESSAGE;
 		msg->u.tx_can.msg[0] = (cf->can_id >> 24) & 0x1f;
@@ -1335,7 +1668,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	memcpy(&msg->u.tx_can.msg[6], cf->data, cf->can_dlc);
 
 	if (cf->can_id & CAN_RTR_FLAG)
-		msg->u.tx_can.flags |= MSG_FLAG_REMOTE_FRAME;
+		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
@@ -1604,6 +1937,17 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 	if (!dev)
 		return -ENOMEM;
 
+	if (kvaser_is_leaf(id)) {
+		dev->family = KVASER_LEAF;
+	} else if (kvaser_is_usbcan(id)) {
+		dev->family = KVASER_USBCAN;
+	} else {
+		dev_err(&intf->dev,
+			"Product ID (%d) does not belong to any known Kvaser USB family",
+			id->idProduct);
+		return -ENODEV;
+	}
+
 	err = kvaser_usb_get_endpoints(intf, &dev->bulk_in, &dev->bulk_out);
 	if (err) {
 		dev_err(&intf->dev, "Cannot get usb endpoint(s)");
-- 
1.7.7.6


^ permalink raw reply related	[relevance 39%]

* [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures
@ 2015-02-26 15:20 95% Ahmed S. Darwish
  2015-02-26 15:22 88% ` [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer Ahmed S. Darwish
                   ` (5 more replies)
  0 siblings, 6 replies; 200+ results
From: Ahmed S. Darwish @ 2015-02-26 15:20 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Upon a URB submission failure, the driver calls usb_free_urb()
but then manually frees the URB buffer by itself.  Meanwhile
usb_free_urb() has alredy freed out that transfer buffer since
we're the only code path holding a reference to this URB.

Remove two of such invalid manual free().

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 2928f70..d986fe8 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -787,7 +787,6 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 		netdev_err(netdev, "Error transmitting URB\n");
 		usb_unanchor_urb(urb);
 		usb_free_urb(urb);
-		kfree(buf);
 		return err;
 	}
 
@@ -1615,8 +1614,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct urb *urb;
 	void *buf;
 	struct kvaser_msg *msg;
-	int i, err;
-	int ret = NETDEV_TX_OK;
+	int i, err, ret = NETDEV_TX_OK;
 	u8 *msg_tx_can_flags = NULL;		/* GCC */
 
 	if (can_dropped_invalid_skb(netdev, skb))
@@ -1634,7 +1632,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (!buf) {
 		stats->tx_dropped++;
 		dev_kfree_skb(skb);
-		goto nobufmem;
+		goto freeurb;
 	}
 
 	msg = buf;
@@ -1681,8 +1679,10 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	/* This should never happen; it implies a flow control bug */
 	if (!context) {
 		netdev_warn(netdev, "cannot find free context\n");
+
+		kfree(buf);
 		ret =  NETDEV_TX_BUSY;
-		goto releasebuf;
+		goto freeurb;
 	}
 
 	context->priv = priv;
@@ -1719,16 +1719,12 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		else
 			netdev_warn(netdev, "Failed tx_urb %d\n", err);
 
-		goto releasebuf;
+		goto freeurb;
 	}
 
-	usb_free_urb(urb);
-
-	return NETDEV_TX_OK;
+	ret = NETDEV_TX_OK;
 
-releasebuf:
-	kfree(buf);
-nobufmem:
+freeurb:
 	usb_free_urb(urb);
 	return ret;
 }
-- 
1.9.1


^ permalink raw reply related	[relevance 95%]

* [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer
  2015-02-26 15:20 95% [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
@ 2015-02-26 15:22 88% ` Ahmed S. Darwish
  2015-02-26 15:24 78%   ` [PATCH 3/5] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-02-26 15:22 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, Linux-USB, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

The Kvaser firmware can only read and write messages that are
not crossing the USB endpoint's wMaxPacketSize boundary. While
receiving commands from the CAN device, if the next command in
the same URB buffer crossed that max packet size boundary, the
firmware puts a zero-length placeholder command in its place
then moves the real command to the next boundary mark.

The driver did not recognize such behavior, leading to missing
a good number of rx events during a heavy rx load session.

Moreover, a tx URB context only gets freed upon receiving its
respective tx ACK event. Over time, the free tx URB contexts
pool gets depleted due to the missing ACK events. Consequently,
the netif transmission queue gets __permanently__ stopped; no
frames could be sent again except after restarting the CAN
newtwork interface.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index d986fe8..a316fa4 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -14,6 +14,7 @@
  * Copyright (C) 2015 Valeo S.A.
  */
 
+#include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/module.h>
 #include <linux/netdevice.h>
@@ -584,8 +585,15 @@ static int kvaser_usb_wait_msg(const struct kvaser_usb *dev, u8 id,
 		while (pos <= actual_len - MSG_HEADER_LEN) {
 			tmp = buf + pos;
 
-			if (!tmp->len)
-				break;
+			/* Handle messages crossing the USB endpoint max packet
+			 * size boundary. Check kvaser_usb_read_bulk_callback()
+			 * for further details.
+			 */
+			if (tmp->len == 0) {
+				pos = round_up(pos,
+					       dev->bulk_in->wMaxPacketSize);
+				continue;
+			}
 
 			if (pos + tmp->len > actual_len) {
 				dev_err(dev->udev->dev.parent,
@@ -1316,8 +1324,19 @@ static void kvaser_usb_read_bulk_callback(struct urb *urb)
 	while (pos <= urb->actual_length - MSG_HEADER_LEN) {
 		msg = urb->transfer_buffer + pos;
 
-		if (!msg->len)
-			break;
+		/* The Kvaser firmware can only read and write messages that
+		 * does not cross the USB's endpoint wMaxPacketSize boundary.
+		 * If a follow-up command crosses such boundary, firmware puts
+		 * a placeholder zero-length command in its place then aligns
+		 * the real command to the next max packet size.
+		 *
+		 * Handle such cases or we're going to miss a significant
+		 * number of events in case of a heavy rx load on the bus.
+		 */
+		if (msg->len == 0) {
+			pos = round_up(pos, dev->bulk_in->wMaxPacketSize);
+			continue;
+		}
 
 		if (pos + msg->len > urb->actual_length) {
 			dev_err(dev->udev->dev.parent, "Format error\n");
@@ -1325,7 +1344,6 @@ static void kvaser_usb_read_bulk_callback(struct urb *urb)
 		}
 
 		kvaser_usb_handle_message(dev, msg);
-
 		pos += msg->len;
 	}
 
-- 
1.9.1


^ permalink raw reply related	[relevance 88%]

* [PATCH 3/5] can: kvaser_usb: Utilize all possible tx URBs
  2015-02-26 15:22 88% ` [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer Ahmed S. Darwish
@ 2015-02-26 15:24 78%   ` Ahmed S. Darwish
  2015-02-26 15:25 99%     ` [PATCH 4/5] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-02-26 15:24 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, Linux-USB, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

The driver currently limits the number of outstanding, not yet
ACKed, transfers to 16 URBs. Meanwhile, the Kvaser firmware
provides its actual max supported number of outstanding
transmissions in its reply to the CMD_GET_SOFTWARE_INFO message.

One example is the UsbCan-II HS/LS device which reports support
of up to 48 tx URBs instead of just 16, increasing the driver
throughput by two-fold and reducing the possibility of -ENOBUFs.

Dynamically set the max tx URBs value according to firmware
replies.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 62 ++++++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 22 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index a316fa4..8f835a1 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -24,7 +24,6 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
-#define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
 #define STOP_TIMEOUT			1000 /* msecs */
@@ -455,8 +454,13 @@ struct kvaser_usb {
 	struct usb_endpoint_descriptor *bulk_in, *bulk_out;
 	struct usb_anchor rx_submitted;
 
+	/* @max_tx_urbs: Firmware-reported maximum number of possible
+	 * outstanding transmissions on this specific Kvaser hardware. The
+	 * value is also used as a sentinel for marking free URB contexts.
+	 */
 	u32 fw_version;
 	unsigned int nchannels;
+	unsigned int max_tx_urbs;
 	enum kvaser_usb_family family;
 
 	bool rxinitdone;
@@ -469,7 +473,7 @@ struct kvaser_usb_net_priv {
 
 	atomic_t active_tx_urbs;
 	struct usb_anchor tx_submitted;
-	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
+	struct kvaser_usb_tx_urb_context *tx_contexts;
 
 	struct completion start_comp, stop_comp;
 
@@ -655,9 +659,13 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	switch (dev->family) {
 	case KVASER_LEAF:
 		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.leaf.softinfo.max_outstanding_tx);
 		break;
 	case KVASER_USBCAN:
 		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.usbcan.softinfo.max_outstanding_tx);
 		break;
 	}
 
@@ -712,7 +720,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats = &priv->netdev->stats;
 
-	context = &priv->tx_contexts[tid % MAX_TX_URBS];
+	context = &priv->tx_contexts[tid % dev->max_tx_urbs];
 
 	/* Sometimes the state change doesn't come after a bus-off event */
 	if (priv->can.restart_ms &&
@@ -739,7 +747,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	stats->tx_bytes += context->dlc;
 	can_get_echo_skb(priv->netdev, context->echo_index);
 
-	context->echo_index = MAX_TX_URBS;
+	context->echo_index = dev->max_tx_urbs;
 	atomic_dec(&priv->active_tx_urbs);
 
 	netif_wake_queue(priv->netdev);
@@ -805,13 +813,14 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 
 static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
 {
+	struct kvaser_usb *dev = priv->dev;
 	int i;
 
 	usb_kill_anchored_urbs(&priv->tx_submitted);
 	atomic_set(&priv->active_tx_urbs, 0);
 
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+	for (i = 0; i < dev->max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = dev->max_tx_urbs;
 }
 
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
@@ -1687,8 +1696,8 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (cf->can_id & CAN_RTR_FLAG)
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
-		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
+	for (i = 0; i < dev->max_tx_urbs; i++) {
+		if (priv->tx_contexts[i].echo_index ==  dev->max_tx_urbs) {
 			context = &priv->tx_contexts[i];
 			break;
 		}
@@ -1720,7 +1729,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 
 	atomic_inc(&priv->active_tx_urbs);
 
-	if (atomic_read(&priv->active_tx_urbs) >= MAX_TX_URBS)
+	if (atomic_read(&priv->active_tx_urbs) >= dev->max_tx_urbs)
 		netif_stop_queue(netdev);
 
 	err = usb_submit_urb(urb, GFP_ATOMIC);
@@ -1860,7 +1869,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	if (err)
 		return err;
 
-	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
+	netdev = alloc_candev(sizeof(*priv), dev->max_tx_urbs);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
 		return -ENOMEM;
@@ -1868,19 +1877,26 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 
 	priv = netdev_priv(netdev);
 
+	priv->tx_contexts = kzalloc(dev->max_tx_urbs *
+				    sizeof(*priv->tx_contexts), GFP_KERNEL);
+	if (!priv->tx_contexts) {
+		free_candev(netdev);
+		return -ENOMEM;
+	}
+
 	init_completion(&priv->start_comp);
 	init_completion(&priv->stop_comp);
 
-	init_usb_anchor(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-
 	priv->dev = dev;
 	priv->netdev = netdev;
 	priv->channel = channel;
 
+	init_usb_anchor(&priv->tx_submitted);
+	atomic_set(&priv->active_tx_urbs, 0);
+
+	for (i = 0; i < dev->max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = dev->max_tx_urbs;
+
 	priv->can.state = CAN_STATE_STOPPED;
 	priv->can.clock.freq = CAN_USB_CLOCK;
 	priv->can.bittiming_const = &kvaser_usb_bittiming_const;
@@ -1909,7 +1925,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 		return err;
 	}
 
-	netdev_dbg(netdev, "device registered\n");
+	netdev_info(netdev, "device registered\n");
 
 	return 0;
 }
@@ -1990,6 +2006,13 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
+	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
+		((dev->fw_version >> 24) & 0xff),
+		((dev->fw_version >> 16) & 0xff),
+		(dev->fw_version & 0xffff));
+
+	dev_dbg(&intf->dev, "Max oustanding tx = %d URBs\n", dev->max_tx_urbs);
+
 	err = kvaser_usb_get_card_info(dev);
 	if (err) {
 		dev_err(&intf->dev,
@@ -1997,11 +2020,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
-	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
-		((dev->fw_version >> 24) & 0xff),
-		((dev->fw_version >> 16) & 0xff),
-		(dev->fw_version & 0xffff));
-
 	for (i = 0; i < dev->nchannels; i++) {
 		err = kvaser_usb_init_one(intf, id, i);
 		if (err) {
-- 
1.9.1


^ permalink raw reply related	[relevance 78%]

* [PATCH 4/5] can: kvaser_usb: Use can-dev unregistration mechanism
  2015-02-26 15:24 78%   ` [PATCH 3/5] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
@ 2015-02-26 15:25 99%     ` Ahmed S. Darwish
  2015-02-26 15:29 79%       ` [PATCH 5/5] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-02-26 15:25 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Use can-dev's unregister_candev() instead of directly calling
networking unregister_netdev(). While both are functionally
equivalent, unregister_candev() might do extra stuff in the
future than just calling networking layer unregistration code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 8f835a1..13bae86 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1844,7 +1844,7 @@ static void kvaser_usb_remove_interfaces(struct kvaser_usb *dev)
 		if (!dev->nets[i])
 			continue;
 
-		unregister_netdev(dev->nets[i]->netdev);
+		unregister_candev(dev->nets[i]->netdev);
 	}
 
 	kvaser_usb_unlink_all_urbs(dev);
-- 
1.9.1


^ permalink raw reply related	[relevance 99%]

* [PATCH 5/5] can: kvaser_usb: Fix tx queue start/stop race conditions
  2015-02-26 15:25 99%     ` [PATCH 4/5] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
@ 2015-02-26 15:29 79%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-02-26 15:29 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, netdev, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

A number of tx queue wake-up events went missing due to the
outlined scenario below. Start state is a pool of 16 tx URBs,
active tx_urbs count = 15, with the netdev tx queue open.

start_xmit()                             tx_acknowledge()
............                             ................
atomic_inc(&tx_urbs);
if (atomic_read(&tx_urbs) >= 16) {
                        URB completion IRQ!
                        -->
                                         atomic_dec(&tx_urbs);
                                         netif_wake_queue();
                                         return;
                        <--
                        end of IRQ!
    netif_stop_queue();
}

At the end, the correct state expected is a 15 tx_urbs count
value with the tx queue state _open_. Due to the race, we get
the same tx_urbs value but with the tx queue state _stopped_.
The wake-up event is completely lost.

Thus avoid hand-rolled concurrency mechanisms and use a proper
lock for contexts protection.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 82 +++++++++++++++++++++++++---------------
 1 file changed, 51 insertions(+), 31 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 13bae86..807ab0c 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -14,6 +14,7 @@
  * Copyright (C) 2015 Valeo S.A.
  */
 
+#include <linux/spinlock.h>
 #include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/module.h>
@@ -471,10 +472,12 @@ struct kvaser_usb {
 struct kvaser_usb_net_priv {
 	struct can_priv can;
 
-	atomic_t active_tx_urbs;
-	struct usb_anchor tx_submitted;
+	spinlock_t tx_contexts_lock;
+	int active_tx_contexts;
 	struct kvaser_usb_tx_urb_context *tx_contexts;
 
+	struct usb_anchor tx_submitted;
+
 	struct completion start_comp, stop_comp;
 
 	struct kvaser_usb *dev;
@@ -702,6 +705,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
+	unsigned long flags;
 	u8 channel, tid;
 
 	channel = msg->u.tx_acknowledge_header.channel;
@@ -745,12 +749,15 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats->tx_packets++;
 	stats->tx_bytes += context->dlc;
-	can_get_echo_skb(priv->netdev, context->echo_index);
 
-	context->echo_index = dev->max_tx_urbs;
-	atomic_dec(&priv->active_tx_urbs);
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
+	can_get_echo_skb(priv->netdev, context->echo_index);
+	context->echo_index = dev->max_tx_urbs;
+	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
+
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 }
 
 static void kvaser_usb_simple_msg_callback(struct urb *urb)
@@ -811,18 +818,6 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 	return 0;
 }
 
-static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
-{
-	struct kvaser_usb *dev = priv->dev;
-	int i;
-
-	usb_kill_anchored_urbs(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < dev->max_tx_urbs; i++)
-		priv->tx_contexts[i].echo_index = dev->max_tx_urbs;
-}
-
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
 						 const struct kvaser_usb_error_summary *es,
 						 struct can_frame *cf)
@@ -1524,6 +1519,26 @@ error:
 	return err;
 }
 
+static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
+{
+	int i, max_tx_urbs;
+
+	max_tx_urbs = priv->dev->max_tx_urbs;
+
+	priv->active_tx_contexts = 0;
+	for (i = 0; i < max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = max_tx_urbs;
+}
+
+/* This method might sleep. Do not call it in the atomic context
+ * of URB completions.
+ */
+static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
+{
+	usb_kill_anchored_urbs(&priv->tx_submitted);
+	kvaser_usb_reset_tx_urb_contexts(priv);
+}
+
 static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 {
 	int i;
@@ -1643,6 +1658,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err, ret = NETDEV_TX_OK;
 	u8 *msg_tx_can_flags = NULL;		/* GCC */
+	unsigned long flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1696,12 +1712,21 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (cf->can_id & CAN_RTR_FLAG)
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 	for (i = 0; i < dev->max_tx_urbs; i++) {
 		if (priv->tx_contexts[i].echo_index ==  dev->max_tx_urbs) {
 			context = &priv->tx_contexts[i];
+
+			context->echo_index = i;
+			can_put_echo_skb(skb, netdev, context->echo_index);
+			++priv->active_tx_contexts;
+			if (priv->active_tx_contexts >= dev->max_tx_urbs)
+				netif_stop_queue(netdev);
+
 			break;
 		}
 	}
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
 	/* This should never happen; it implies a flow control bug */
 	if (!context) {
@@ -1713,7 +1738,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	}
 
 	context->priv = priv;
-	context->echo_index = i;
 	context->dlc = cf->can_dlc;
 
 	msg->u.tx_can.tid = context->echo_index;
@@ -1725,18 +1749,17 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 			  kvaser_usb_write_bulk_callback, context);
 	usb_anchor_urb(urb, &priv->tx_submitted);
 
-	can_put_echo_skb(skb, netdev, context->echo_index);
-
-	atomic_inc(&priv->active_tx_urbs);
-
-	if (atomic_read(&priv->active_tx_urbs) >= dev->max_tx_urbs)
-		netif_stop_queue(netdev);
-
 	err = usb_submit_urb(urb, GFP_ATOMIC);
 	if (unlikely(err)) {
+		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
+
 		can_free_echo_skb(netdev, context->echo_index);
+		context->echo_index = dev->max_tx_urbs;
+		--priv->active_tx_contexts;
+		netif_wake_queue(netdev);
+
+		spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
-		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
 		stats->tx_dropped++;
@@ -1863,7 +1886,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb *dev = usb_get_intfdata(intf);
 	struct net_device *netdev;
 	struct kvaser_usb_net_priv *priv;
-	int i, err;
+	int err;
 
 	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
 	if (err)
@@ -1892,10 +1915,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	priv->channel = channel;
 
 	init_usb_anchor(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < dev->max_tx_urbs; i++)
-		priv->tx_contexts[i].echo_index = dev->max_tx_urbs;
+	kvaser_usb_reset_tx_urb_contexts(priv);
 
 	priv->can.state = CAN_STATE_STOPPED;
 	priv->can.clock.freq = CAN_USB_CLOCK;
-- 
1.9.1


^ permalink raw reply related	[relevance 79%]

* Re: [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures
  @ 2015-03-09 12:32 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-09 12:32 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML

Hi Marc,

(Sorry for the late reply as I was out of town!)

On Wed, Mar 04, 2015 at 10:15:45AM +0100, Marc Kleine-Budde wrote:
> On 02/26/2015 04:20 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > Upon a URB submission failure, the driver calls usb_free_urb()
> > but then manually frees the URB buffer by itself.  Meanwhile
> > usb_free_urb() has alredy freed out that transfer buffer since
> > we're the only code path holding a reference to this URB.
> > 
> > Remove two of such invalid manual free().
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> Applied 1+2 and added stable on Cc. Can you please shuffle the remaining
> patches, so that patch 5 comes first, then 4 and 3 as the last patch. As
> 5 is a bugfix it should go into stable, while 3 isn't.
>
> You can base your series on the can/testing branch.
> 

Did not care much about the bugfixes order this time as the patches
themselves will not apply cleanly (or at all) to -stable due to the
addition of UsbCAN-II code, which all -stable kernels do not have.
Thus I guess I'll need to submit a different patch series for -stable
with patches 1, 2, and 5 -- rebased.

Nonetheless, you're correct that having the bugfixes (1,2,5), then the
optimization (4), then the janitorial fix (3) is the logical order for
history & bisection sake. So.. I'll re-order the patches, individually
test with the new order, and re-submit over can/testing.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  2015-02-26 15:20 95% [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
  2015-02-26 15:22 88% ` [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer Ahmed S. Darwish
  @ 2015-03-11 15:23 78% ` Ahmed S. Darwish
  2015-03-11 15:28 78%   ` [PATCH v2 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
    2015-03-11 17:37 77% ` [PATCH v3 " Ahmed S. Darwish
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 15:23 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

A number of tx queue wake-up events went missing due to the
outlined scenario below. Start state is a pool of 16 tx URBs,
active tx_urbs count = 15, with the netdev tx queue open.

start_xmit()                             tx_acknowledge()
............                             ................
atomic_inc(&tx_urbs);
if (atomic_read(&tx_urbs) >= 16) {
                        URB completion IRQ!
                        -->
                                         atomic_dec(&tx_urbs);
                                         netif_wake_queue();
                                         return;
                        <--
                        end of IRQ!
    netif_stop_queue();
}

At the end, the correct state expected is a 15 tx_urbs count
value with the tx queue state _open_. Due to the race, we get
the same tx_urbs value but with the tx queue state _stopped_.
The wake-up event is completely lost.

Thus avoid hand-rolled concurrency mechanisms and use a proper
lock for contexts protection.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 82 ++++++++++++++++++++++++----------------
 1 file changed, 50 insertions(+), 32 deletions(-)

This new series is just like v1, with the exception of moving
bugfix #5 to the top as suggested earlier.  The patches are
based over linux-can-fixes-for-4.0-20150309. Thanks!

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index a316fa4..0aea8e2 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -14,6 +14,7 @@
  * Copyright (C) 2015 Valeo S.A.
  */
 
+#include <linux/spinlock.h>
 #include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/module.h>
@@ -467,10 +468,11 @@ struct kvaser_usb {
 struct kvaser_usb_net_priv {
 	struct can_priv can;
 
-	atomic_t active_tx_urbs;
-	struct usb_anchor tx_submitted;
+	spinlock_t tx_contexts_lock;
+	int active_tx_contexts;
 	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
 
+	struct usb_anchor tx_submitted;
 	struct completion start_comp, stop_comp;
 
 	struct kvaser_usb *dev;
@@ -694,6 +696,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
+	unsigned long flags;
 	u8 channel, tid;
 
 	channel = msg->u.tx_acknowledge_header.channel;
@@ -737,12 +740,15 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats->tx_packets++;
 	stats->tx_bytes += context->dlc;
-	can_get_echo_skb(priv->netdev, context->echo_index);
 
-	context->echo_index = MAX_TX_URBS;
-	atomic_dec(&priv->active_tx_urbs);
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
+	can_get_echo_skb(priv->netdev, context->echo_index);
+	context->echo_index = MAX_TX_URBS;
+	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
+
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 }
 
 static void kvaser_usb_simple_msg_callback(struct urb *urb)
@@ -803,17 +809,6 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 	return 0;
 }
 
-static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
-{
-	int i;
-
-	usb_kill_anchored_urbs(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-}
-
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
 						 const struct kvaser_usb_error_summary *es,
 						 struct can_frame *cf)
@@ -1515,6 +1510,24 @@ error:
 	return err;
 }
 
+static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
+{
+	int i;
+
+	priv->active_tx_contexts = 0;
+	for (i = 0; i < MAX_TX_URBS; i++)
+		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+}
+
+/* This method might sleep. Do not call it in the atomic context
+ * of URB completions.
+ */
+static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
+{
+	usb_kill_anchored_urbs(&priv->tx_submitted);
+	kvaser_usb_reset_tx_urb_contexts(priv);
+}
+
 static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 {
 	int i;
@@ -1634,6 +1647,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err, ret = NETDEV_TX_OK;
 	u8 *msg_tx_can_flags = NULL;		/* GCC */
+	unsigned long flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1687,12 +1701,21 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (cf->can_id & CAN_RTR_FLAG)
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
 			context = &priv->tx_contexts[i];
+
+			context->echo_index = i;
+			can_put_echo_skb(skb, netdev, context->echo_index);
+			++priv->active_tx_contexts;
+			if (priv->active_tx_contexts >= MAX_TX_URBS)
+				netif_stop_queue(netdev);
+
 			break;
 		}
 	}
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
 	/* This should never happen; it implies a flow control bug */
 	if (!context) {
@@ -1704,7 +1727,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	}
 
 	context->priv = priv;
-	context->echo_index = i;
 	context->dlc = cf->can_dlc;
 
 	msg->u.tx_can.tid = context->echo_index;
@@ -1716,18 +1738,17 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 			  kvaser_usb_write_bulk_callback, context);
 	usb_anchor_urb(urb, &priv->tx_submitted);
 
-	can_put_echo_skb(skb, netdev, context->echo_index);
-
-	atomic_inc(&priv->active_tx_urbs);
-
-	if (atomic_read(&priv->active_tx_urbs) >= MAX_TX_URBS)
-		netif_stop_queue(netdev);
-
 	err = usb_submit_urb(urb, GFP_ATOMIC);
 	if (unlikely(err)) {
+		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
+
 		can_free_echo_skb(netdev, context->echo_index);
+		context->echo_index = MAX_TX_URBS;
+		--priv->active_tx_contexts;
+		netif_wake_queue(netdev);
+
+		spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
-		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
 		stats->tx_dropped++;
@@ -1854,7 +1875,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb *dev = usb_get_intfdata(intf);
 	struct net_device *netdev;
 	struct kvaser_usb_net_priv *priv;
-	int i, err;
+	int err;
 
 	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
 	if (err)
@@ -1868,19 +1889,16 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 
 	priv = netdev_priv(netdev);
 
+	init_usb_anchor(&priv->tx_submitted);
 	init_completion(&priv->start_comp);
 	init_completion(&priv->stop_comp);
 
-	init_usb_anchor(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-
 	priv->dev = dev;
 	priv->netdev = netdev;
 	priv->channel = channel;
 
+	kvaser_usb_reset_tx_urb_contexts(priv);
+
 	priv->can.state = CAN_STATE_STOPPED;
 	priv->can.clock.freq = CAN_USB_CLOCK;
 	priv->can.bittiming_const = &kvaser_usb_bittiming_const;
-- 
1.9.1


^ permalink raw reply related	[relevance 78%]

* [PATCH v2 2/3] can: kvaser_usb: Utilize all possible tx URBs
  2015-03-11 15:23 78% ` [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
@ 2015-03-11 15:28 78%   ` Ahmed S. Darwish
  2015-03-11 15:30 99%     ` [PATCH v2 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
    1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 15:28 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

The driver currently limits the number of outstanding, not yet
ACKed, transfers to 16 URBs. Meanwhile, the Kvaser firmware
provides its actual max supported number of outstanding
transmissions in its reply to the CMD_GET_SOFTWARE_INFO message.

One example is the UsbCan-II HS/LS device which reports support
of up to 48 tx URBs instead of just 16, increasing the driver
throughput by two-fold and reducing the possibility of -ENOBUFs.

Dynamically set the max tx URBs value according to firmware
replies.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 55 +++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 18 deletions(-)

This is a bugfix if the kvaser hardware in question has less
than 16 tx URBs, and a speed optimization if it has more ;-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 0aea8e2..0742d53 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -25,7 +25,6 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
-#define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
 #define STOP_TIMEOUT			1000 /* msecs */
@@ -456,8 +455,13 @@ struct kvaser_usb {
 	struct usb_endpoint_descriptor *bulk_in, *bulk_out;
 	struct usb_anchor rx_submitted;
 
+	/* @max_tx_urbs: Firmware-reported maximum number of possible
+	 * outstanding transmissions on this specific Kvaser hardware. The
+	 * value is also used as a sentinel for marking free URB contexts.
+	 */
 	u32 fw_version;
 	unsigned int nchannels;
+	unsigned int max_tx_urbs;
 	enum kvaser_usb_family family;
 
 	bool rxinitdone;
@@ -470,7 +474,7 @@ struct kvaser_usb_net_priv {
 
 	spinlock_t tx_contexts_lock;
 	int active_tx_contexts;
-	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
+	struct kvaser_usb_tx_urb_context *tx_contexts;
 
 	struct usb_anchor tx_submitted;
 	struct completion start_comp, stop_comp;
@@ -657,9 +661,13 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	switch (dev->family) {
 	case KVASER_LEAF:
 		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.leaf.softinfo.max_outstanding_tx);
 		break;
 	case KVASER_USBCAN:
 		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.usbcan.softinfo.max_outstanding_tx);
 		break;
 	}
 
@@ -715,7 +723,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats = &priv->netdev->stats;
 
-	context = &priv->tx_contexts[tid % MAX_TX_URBS];
+	context = &priv->tx_contexts[tid % dev->max_tx_urbs];
 
 	/* Sometimes the state change doesn't come after a bus-off event */
 	if (priv->can.restart_ms &&
@@ -744,7 +752,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 	can_get_echo_skb(priv->netdev, context->echo_index);
-	context->echo_index = MAX_TX_URBS;
+	context->echo_index = dev->max_tx_urbs;
 	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
 
@@ -1512,11 +1520,13 @@ error:
 
 static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
 {
-	int i;
+	int i, max_tx_urbs;
+
+	max_tx_urbs = priv->dev->max_tx_urbs;
 
 	priv->active_tx_contexts = 0;
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+	for (i = 0; i < max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = max_tx_urbs;
 }
 
 /* This method might sleep. Do not call it in the atomic context
@@ -1702,14 +1712,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
-		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
+	for (i = 0; i < dev->max_tx_urbs; i++) {
+		if (priv->tx_contexts[i].echo_index == dev->max_tx_urbs) {
 			context = &priv->tx_contexts[i];
 
 			context->echo_index = i;
 			can_put_echo_skb(skb, netdev, context->echo_index);
 			++priv->active_tx_contexts;
-			if (priv->active_tx_contexts >= MAX_TX_URBS)
+			if (priv->active_tx_contexts >= dev->max_tx_urbs)
 				netif_stop_queue(netdev);
 
 			break;
@@ -1743,7 +1753,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 		can_free_echo_skb(netdev, context->echo_index);
-		context->echo_index = MAX_TX_URBS;
+		context->echo_index = dev->max_tx_urbs;
 		--priv->active_tx_contexts;
 		netif_wake_queue(netdev);
 
@@ -1881,7 +1891,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	if (err)
 		return err;
 
-	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
+	netdev = alloc_candev(sizeof(*priv), dev->max_tx_urbs);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
 		return -ENOMEM;
@@ -1889,6 +1899,13 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 
 	priv = netdev_priv(netdev);
 
+	priv->tx_contexts = kzalloc(dev->max_tx_urbs *
+				    sizeof(*priv->tx_contexts), GFP_KERNEL);
+	if (!priv->tx_contexts) {
+		free_candev(netdev);
+		return -ENOMEM;
+	}
+
 	init_usb_anchor(&priv->tx_submitted);
 	init_completion(&priv->start_comp);
 	init_completion(&priv->stop_comp);
@@ -1927,7 +1944,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 		return err;
 	}
 
-	netdev_dbg(netdev, "device registered\n");
+	netdev_info(netdev, "device registered\n");
 
 	return 0;
 }
@@ -2008,6 +2025,13 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
+	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
+		((dev->fw_version >> 24) & 0xff),
+		((dev->fw_version >> 16) & 0xff),
+		(dev->fw_version & 0xffff));
+
+	dev_dbg(&intf->dev, "Max oustanding tx = %d URBs\n", dev->max_tx_urbs);
+
 	err = kvaser_usb_get_card_info(dev);
 	if (err) {
 		dev_err(&intf->dev,
@@ -2015,11 +2039,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
-	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
-		((dev->fw_version >> 24) & 0xff),
-		((dev->fw_version >> 16) & 0xff),
-		(dev->fw_version & 0xffff));
-
 	for (i = 0; i < dev->nchannels; i++) {
 		err = kvaser_usb_init_one(intf, id, i);
 		if (err) {
-- 
1.9.1


^ permalink raw reply related	[relevance 78%]

* [PATCH v2 3/3] can: kvaser_usb: Use can-dev unregistration mechanism
  2015-03-11 15:28 78%   ` [PATCH v2 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
@ 2015-03-11 15:30 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 15:30 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Use can-dev's unregister_candev() instead of directly calling
networking unregister_netdev(). While both are functionally
equivalent, unregister_candev() might do extra stuff in the
future than just calling networking layer unregistration code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 0742d53..f9c14e8 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1866,7 +1866,7 @@ static void kvaser_usb_remove_interfaces(struct kvaser_usb *dev)
 		if (!dev->nets[i])
 			continue;
 
-		unregister_netdev(dev->nets[i]->netdev);
+		unregister_candev(dev->nets[i]->netdev);
 	}
 
 	kvaser_usb_unlink_all_urbs(dev);
-- 
1.9.1


^ permalink raw reply related	[relevance 99%]

* Re: [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  @ 2015-03-11 15:57 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 15:57 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML

On Wed, Mar 11, 2015 at 04:36:52PM +0100, Marc Kleine-Budde wrote:
> On 03/11/2015 04:23 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > A number of tx queue wake-up events went missing due to the
> > outlined scenario below. Start state is a pool of 16 tx URBs,
> > active tx_urbs count = 15, with the netdev tx queue open.
> > 
> > start_xmit()                             tx_acknowledge()
> > ............                             ................
> > atomic_inc(&tx_urbs);
> > if (atomic_read(&tx_urbs) >= 16) {
> >                         URB completion IRQ!
> >                         -->
> >                                          atomic_dec(&tx_urbs);
> >                                          netif_wake_queue();
> >                                          return;
> >                         <--
> >                         end of IRQ!
> >     netif_stop_queue();
> > }
> > 
> > At the end, the correct state expected is a 15 tx_urbs count
> > value with the tx queue state _open_. Due to the race, we get
> > the same tx_urbs value but with the tx queue state _stopped_.
> > The wake-up event is completely lost.
> > 
> > Thus avoid hand-rolled concurrency mechanisms and use a proper
> > lock for contexts protection.
> 
> I'm missing a spin_lock_init(), right? Please compile and test your code
> with everything switch on in Kernel hacking -> Lock Debugging.
> 

Ouch... that passed through it seems since __ARCH_SPIN_LOCK_UNLOCKED
is always zero on x86. Recompiling the kernel and re-iterating another
patch series.

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* [PATCH v3 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  2015-02-26 15:20 95% [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
                   ` (2 preceding siblings ...)
  2015-03-11 15:23 78% ` [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
@ 2015-03-11 17:37 77% ` Ahmed S. Darwish
  2015-03-11 17:39 79%   ` [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
    2015-03-14 13:02 74% ` [PATCH v4 " Ahmed S. Darwish
  2015-03-15 15:03 77% ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Ahmed S. Darwish
  5 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 17:37 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

A number of tx queue wake-up events went missing due to the
outlined scenario below. Start state is a pool of 16 tx URBs,
active tx_urbs count = 15, with the netdev tx queue open.

start_xmit()                             tx_acknowledge()
............                             ................
atomic_inc(&tx_urbs);
if (atomic_read(&tx_urbs) >= 16) {
                        URB completion IRQ!
                        -->
                                         atomic_dec(&tx_urbs);
                                         netif_wake_queue();
                                         return;
                        <--
                        end of IRQ!
    netif_stop_queue();
}

At the end, the correct state expected is a 15 tx_urbs count
value with the tx queue state _open_. Due to the race, we get
the same tx_urbs value but with the tx queue state _stopped_.
The wake-up event is completely lost.

Thus avoid hand-rolled concurrency mechanisms and use a proper
lock for contexts protection.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 83 ++++++++++++++++++++++++----------------
 1 file changed, 51 insertions(+), 32 deletions(-)

changelog-v3: Added missing spin_lock_init(). With all kernel
lock debugging options set, I've been running my test suite
for an hour now without apparent problems in dmesg so far.

changelog-v2: Put bugfix patch at the start of the series.

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index a316fa4..e97a08c 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -14,6 +14,7 @@
  * Copyright (C) 2015 Valeo S.A.
  */
 
+#include <linux/spinlock.h>
 #include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/module.h>
@@ -467,10 +468,11 @@ struct kvaser_usb {
 struct kvaser_usb_net_priv {
 	struct can_priv can;
 
-	atomic_t active_tx_urbs;
-	struct usb_anchor tx_submitted;
+	spinlock_t tx_contexts_lock;
+	int active_tx_contexts;
 	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
 
+	struct usb_anchor tx_submitted;
 	struct completion start_comp, stop_comp;
 
 	struct kvaser_usb *dev;
@@ -694,6 +696,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
+	unsigned long flags;
 	u8 channel, tid;
 
 	channel = msg->u.tx_acknowledge_header.channel;
@@ -737,12 +740,15 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats->tx_packets++;
 	stats->tx_bytes += context->dlc;
-	can_get_echo_skb(priv->netdev, context->echo_index);
 
-	context->echo_index = MAX_TX_URBS;
-	atomic_dec(&priv->active_tx_urbs);
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
+	can_get_echo_skb(priv->netdev, context->echo_index);
+	context->echo_index = MAX_TX_URBS;
+	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
+
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 }
 
 static void kvaser_usb_simple_msg_callback(struct urb *urb)
@@ -803,17 +809,6 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 	return 0;
 }
 
-static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
-{
-	int i;
-
-	usb_kill_anchored_urbs(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-}
-
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
 						 const struct kvaser_usb_error_summary *es,
 						 struct can_frame *cf)
@@ -1515,6 +1510,24 @@ error:
 	return err;
 }
 
+static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
+{
+	int i;
+
+	priv->active_tx_contexts = 0;
+	for (i = 0; i < MAX_TX_URBS; i++)
+		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+}
+
+/* This method might sleep. Do not call it in the atomic context
+ * of URB completions.
+ */
+static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
+{
+	usb_kill_anchored_urbs(&priv->tx_submitted);
+	kvaser_usb_reset_tx_urb_contexts(priv);
+}
+
 static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 {
 	int i;
@@ -1634,6 +1647,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err, ret = NETDEV_TX_OK;
 	u8 *msg_tx_can_flags = NULL;		/* GCC */
+	unsigned long flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1687,12 +1701,21 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (cf->can_id & CAN_RTR_FLAG)
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
 			context = &priv->tx_contexts[i];
+
+			context->echo_index = i;
+			can_put_echo_skb(skb, netdev, context->echo_index);
+			++priv->active_tx_contexts;
+			if (priv->active_tx_contexts >= MAX_TX_URBS)
+				netif_stop_queue(netdev);
+
 			break;
 		}
 	}
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
 	/* This should never happen; it implies a flow control bug */
 	if (!context) {
@@ -1704,7 +1727,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	}
 
 	context->priv = priv;
-	context->echo_index = i;
 	context->dlc = cf->can_dlc;
 
 	msg->u.tx_can.tid = context->echo_index;
@@ -1716,18 +1738,17 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 			  kvaser_usb_write_bulk_callback, context);
 	usb_anchor_urb(urb, &priv->tx_submitted);
 
-	can_put_echo_skb(skb, netdev, context->echo_index);
-
-	atomic_inc(&priv->active_tx_urbs);
-
-	if (atomic_read(&priv->active_tx_urbs) >= MAX_TX_URBS)
-		netif_stop_queue(netdev);
-
 	err = usb_submit_urb(urb, GFP_ATOMIC);
 	if (unlikely(err)) {
+		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
+
 		can_free_echo_skb(netdev, context->echo_index);
+		context->echo_index = MAX_TX_URBS;
+		--priv->active_tx_contexts;
+		netif_wake_queue(netdev);
+
+		spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
-		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
 		stats->tx_dropped++;
@@ -1854,7 +1875,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb *dev = usb_get_intfdata(intf);
 	struct net_device *netdev;
 	struct kvaser_usb_net_priv *priv;
-	int i, err;
+	int err;
 
 	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
 	if (err)
@@ -1868,19 +1889,17 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 
 	priv = netdev_priv(netdev);
 
+	init_usb_anchor(&priv->tx_submitted);
 	init_completion(&priv->start_comp);
 	init_completion(&priv->stop_comp);
 
-	init_usb_anchor(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-
 	priv->dev = dev;
 	priv->netdev = netdev;
 	priv->channel = channel;
 
+	spin_lock_init(&priv->tx_contexts_lock);
+	kvaser_usb_reset_tx_urb_contexts(priv);
+
 	priv->can.state = CAN_STATE_STOPPED;
 	priv->can.clock.freq = CAN_USB_CLOCK;
 	priv->can.bittiming_const = &kvaser_usb_bittiming_const;
-- 
1.9.1


^ permalink raw reply related	[relevance 77%]

* [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs
  2015-03-11 17:37 77% ` [PATCH v3 " Ahmed S. Darwish
@ 2015-03-11 17:39 79%   ` Ahmed S. Darwish
  2015-03-11 17:39 99%     ` [PATCH v3 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
      1 sibling, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 17:39 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

The driver currently limits the number of outstanding, not yet
ACKed, transfers to 16 URBs. Meanwhile, the Kvaser firmware
provides its actual max supported number of outstanding
transmissions in its reply to the CMD_GET_SOFTWARE_INFO message.

One example is the UsbCan-II HS/LS device which reports support
of up to 48 tx URBs instead of just 16, increasing the driver
throughput by two-fold and reducing the possibility of -ENOBUFs.

Dynamically set the max tx URBs value according to firmware
replies.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 55 +++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 18 deletions(-)

changelog-v3: No changes

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index e97a08c..30b4d47 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -25,7 +25,6 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
-#define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
 #define STOP_TIMEOUT			1000 /* msecs */
@@ -456,8 +455,13 @@ struct kvaser_usb {
 	struct usb_endpoint_descriptor *bulk_in, *bulk_out;
 	struct usb_anchor rx_submitted;
 
+	/* @max_tx_urbs: Firmware-reported maximum number of possible
+	 * outstanding transmissions on this specific Kvaser hardware. The
+	 * value is also used as a sentinel for marking free URB contexts.
+	 */
 	u32 fw_version;
 	unsigned int nchannels;
+	unsigned int max_tx_urbs;
 	enum kvaser_usb_family family;
 
 	bool rxinitdone;
@@ -470,7 +474,7 @@ struct kvaser_usb_net_priv {
 
 	spinlock_t tx_contexts_lock;
 	int active_tx_contexts;
-	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
+	struct kvaser_usb_tx_urb_context *tx_contexts;
 
 	struct usb_anchor tx_submitted;
 	struct completion start_comp, stop_comp;
@@ -657,9 +661,13 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	switch (dev->family) {
 	case KVASER_LEAF:
 		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.leaf.softinfo.max_outstanding_tx);
 		break;
 	case KVASER_USBCAN:
 		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.usbcan.softinfo.max_outstanding_tx);
 		break;
 	}
 
@@ -715,7 +723,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats = &priv->netdev->stats;
 
-	context = &priv->tx_contexts[tid % MAX_TX_URBS];
+	context = &priv->tx_contexts[tid % dev->max_tx_urbs];
 
 	/* Sometimes the state change doesn't come after a bus-off event */
 	if (priv->can.restart_ms &&
@@ -744,7 +752,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 	can_get_echo_skb(priv->netdev, context->echo_index);
-	context->echo_index = MAX_TX_URBS;
+	context->echo_index = dev->max_tx_urbs;
 	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
 
@@ -1512,11 +1520,13 @@ error:
 
 static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
 {
-	int i;
+	int i, max_tx_urbs;
+
+	max_tx_urbs = priv->dev->max_tx_urbs;
 
 	priv->active_tx_contexts = 0;
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+	for (i = 0; i < max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = max_tx_urbs;
 }
 
 /* This method might sleep. Do not call it in the atomic context
@@ -1702,14 +1712,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
-		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
+	for (i = 0; i < dev->max_tx_urbs; i++) {
+		if (priv->tx_contexts[i].echo_index == dev->max_tx_urbs) {
 			context = &priv->tx_contexts[i];
 
 			context->echo_index = i;
 			can_put_echo_skb(skb, netdev, context->echo_index);
 			++priv->active_tx_contexts;
-			if (priv->active_tx_contexts >= MAX_TX_URBS)
+			if (priv->active_tx_contexts >= dev->max_tx_urbs)
 				netif_stop_queue(netdev);
 
 			break;
@@ -1743,7 +1753,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 		can_free_echo_skb(netdev, context->echo_index);
-		context->echo_index = MAX_TX_URBS;
+		context->echo_index = dev->max_tx_urbs;
 		--priv->active_tx_contexts;
 		netif_wake_queue(netdev);
 
@@ -1881,7 +1891,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	if (err)
 		return err;
 
-	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
+	netdev = alloc_candev(sizeof(*priv), dev->max_tx_urbs);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
 		return -ENOMEM;
@@ -1889,6 +1899,13 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 
 	priv = netdev_priv(netdev);
 
+	priv->tx_contexts = kzalloc(dev->max_tx_urbs *
+				    sizeof(*priv->tx_contexts), GFP_KERNEL);
+	if (!priv->tx_contexts) {
+		free_candev(netdev);
+		return -ENOMEM;
+	}
+
 	init_usb_anchor(&priv->tx_submitted);
 	init_completion(&priv->start_comp);
 	init_completion(&priv->stop_comp);
@@ -1928,7 +1945,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 		return err;
 	}
 
-	netdev_dbg(netdev, "device registered\n");
+	netdev_info(netdev, "device registered\n");
 
 	return 0;
 }
@@ -2009,6 +2026,13 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
+	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
+		((dev->fw_version >> 24) & 0xff),
+		((dev->fw_version >> 16) & 0xff),
+		(dev->fw_version & 0xffff));
+
+	dev_dbg(&intf->dev, "Max oustanding tx = %d URBs\n", dev->max_tx_urbs);
+
 	err = kvaser_usb_get_card_info(dev);
 	if (err) {
 		dev_err(&intf->dev,
@@ -2016,11 +2040,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
-	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
-		((dev->fw_version >> 24) & 0xff),
-		((dev->fw_version >> 16) & 0xff),
-		(dev->fw_version & 0xffff));
-
 	for (i = 0; i < dev->nchannels; i++) {
 		err = kvaser_usb_init_one(intf, id, i);
 		if (err) {
-- 
1.9.1


^ permalink raw reply related	[relevance 79%]

* [PATCH v3 3/3] can: kvaser_usb: Use can-dev unregistration mechanism
  2015-03-11 17:39 79%   ` [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
@ 2015-03-11 17:39 99%     ` Ahmed S. Darwish
    1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-11 17:39 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Use can-dev's unregister_candev() instead of directly calling
networking unregister_netdev(). While both are functionally
equivalent, unregister_candev() might do extra stuff in the
future than just calling networking layer unregistration code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 30b4d47..fafcb89 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1866,7 +1866,7 @@ static void kvaser_usb_remove_interfaces(struct kvaser_usb *dev)
 		if (!dev->nets[i])
 			continue;
 
-		unregister_netdev(dev->nets[i]->netdev);
+		unregister_candev(dev->nets[i]->netdev);
 	}
 
 	kvaser_usb_unlink_all_urbs(dev);
-- 
1.9.1


^ permalink raw reply related	[relevance 99%]

* Re: [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs
  @ 2015-03-12 10:52 99%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-12 10:52 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML

On Wed, Mar 11, 2015 at 10:53:28PM +0100, Marc Kleine-Budde wrote:
> On 03/11/2015 06:39 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > The driver currently limits the number of outstanding, not yet
> > ACKed, transfers to 16 URBs. Meanwhile, the Kvaser firmware
> > provides its actual max supported number of outstanding
> > transmissions in its reply to the CMD_GET_SOFTWARE_INFO message.
> > 
> > One example is the UsbCan-II HS/LS device which reports support
> > of up to 48 tx URBs instead of just 16, increasing the driver
> > throughput by two-fold and reducing the possibility of -ENOBUFs.
> > 
> > Dynamically set the max tx URBs value according to firmware
> > replies.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > ---
> >  drivers/net/can/usb/kvaser_usb.c | 55 +++++++++++++++++++++++++++-------------
> >  1 file changed, 37 insertions(+), 18 deletions(-)
> > 
> > changelog-v3: No changes
> > 
> > diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> > index e97a08c..30b4d47 100644
> > --- a/drivers/net/can/usb/kvaser_usb.c
> > +++ b/drivers/net/can/usb/kvaser_usb.c
> > @@ -25,7 +25,6 @@
> >  #include <linux/can/dev.h>
> >  #include <linux/can/error.h>
> >  
> > -#define MAX_TX_URBS			16
> >  #define MAX_RX_URBS			4
> >  #define START_TIMEOUT			1000 /* msecs */
> >  #define STOP_TIMEOUT			1000 /* msecs */
> > @@ -456,8 +455,13 @@ struct kvaser_usb {
> >  	struct usb_endpoint_descriptor *bulk_in, *bulk_out;
> >  	struct usb_anchor rx_submitted;
> >  
> > +	/* @max_tx_urbs: Firmware-reported maximum number of possible
> > +	 * outstanding transmissions on this specific Kvaser hardware. The
> > +	 * value is also used as a sentinel for marking free URB contexts.
> > +	 */
> >  	u32 fw_version;
> >  	unsigned int nchannels;
> > +	unsigned int max_tx_urbs;
> >  	enum kvaser_usb_family family;
> >  
> >  	bool rxinitdone;
> > @@ -470,7 +474,7 @@ struct kvaser_usb_net_priv {
> >  
> >  	spinlock_t tx_contexts_lock;
> >  	int active_tx_contexts;
> > -	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
> > +	struct kvaser_usb_tx_urb_context *tx_contexts;
> >  
> >  	struct usb_anchor tx_submitted;
> >  	struct completion start_comp, stop_comp;
> > @@ -657,9 +661,13 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
> >  	switch (dev->family) {
> >  	case KVASER_LEAF:
> >  		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
> > +		dev->max_tx_urbs =
> > +			le16_to_cpu(msg.u.leaf.softinfo.max_outstanding_tx);
> >  		break;
> >  	case KVASER_USBCAN:
> >  		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
> > +		dev->max_tx_urbs =
> > +			le16_to_cpu(msg.u.usbcan.softinfo.max_outstanding_tx);
> >  		break;
> >  	}
> >  
> > @@ -715,7 +723,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
> >  
> >  	stats = &priv->netdev->stats;
> >  
> > -	context = &priv->tx_contexts[tid % MAX_TX_URBS];
> > +	context = &priv->tx_contexts[tid % dev->max_tx_urbs];
> >  
> >  	/* Sometimes the state change doesn't come after a bus-off event */
> >  	if (priv->can.restart_ms &&
> > @@ -744,7 +752,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
> >  	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
> >  
> >  	can_get_echo_skb(priv->netdev, context->echo_index);
> > -	context->echo_index = MAX_TX_URBS;
> > +	context->echo_index = dev->max_tx_urbs;
> >  	--priv->active_tx_contexts;
> >  	netif_wake_queue(priv->netdev);
> >  
> > @@ -1512,11 +1520,13 @@ error:
> >  
> >  static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
> >  {
> > -	int i;
> > +	int i, max_tx_urbs;
> > +
> > +	max_tx_urbs = priv->dev->max_tx_urbs;
> >  
> >  	priv->active_tx_contexts = 0;
> > -	for (i = 0; i < MAX_TX_URBS; i++)
> > -		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
> > +	for (i = 0; i < max_tx_urbs; i++)
> > +		priv->tx_contexts[i].echo_index = max_tx_urbs;
> >  }
> >  
> >  /* This method might sleep. Do not call it in the atomic context
> > @@ -1702,14 +1712,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
> >  
> >  	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
> > -	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
> > -		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
> > +	for (i = 0; i < dev->max_tx_urbs; i++) {
> > +		if (priv->tx_contexts[i].echo_index == dev->max_tx_urbs) {
> >  			context = &priv->tx_contexts[i];
> >  
> >  			context->echo_index = i;
> >  			can_put_echo_skb(skb, netdev, context->echo_index);
> >  			++priv->active_tx_contexts;
> > -			if (priv->active_tx_contexts >= MAX_TX_URBS)
> > +			if (priv->active_tx_contexts >= dev->max_tx_urbs)
> >  				netif_stop_queue(netdev);
> >  
> >  			break;
> > @@ -1743,7 +1753,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
> >  		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
> >  
> >  		can_free_echo_skb(netdev, context->echo_index);
> > -		context->echo_index = MAX_TX_URBS;
> > +		context->echo_index = dev->max_tx_urbs;
> >  		--priv->active_tx_contexts;
> >  		netif_wake_queue(netdev);
> >  
> > @@ -1881,7 +1891,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
> >  	if (err)
> >  		return err;
> >  
> > -	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
> > +	netdev = alloc_candev(sizeof(*priv), dev->max_tx_urbs);
> >  	if (!netdev) {
> >  		dev_err(&intf->dev, "Cannot alloc candev\n");
> >  		return -ENOMEM;
> > @@ -1889,6 +1899,13 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
> >  
> >  	priv = netdev_priv(netdev);
> >  
> > +	priv->tx_contexts = kzalloc(dev->max_tx_urbs *
> > +				    sizeof(*priv->tx_contexts), GFP_KERNEL);
> > +	if (!priv->tx_contexts) {
> > +		free_candev(netdev);
> > +		return -ENOMEM;
> > +	}
> 
> I'm missing a free for the priv->tx_contexts. I see two options:
> 

Correct. Should not have missed that.

> 1) use devm_kzalloc(), or
> 2) move struct kvaser_usb_tx_urb_context tx_contexts[]; to the end of
>    struct kvaser_usb_net_priv, see [1] for an example.
> 
>    Without further testing, I think the correct alloc for that case
>    would be:
>        alloc_candev(sizeof(*priv + dev->max_tx_urbs *
>                sizeof(struct kvaser_usb_tx_urb_context))
> 

The first option looks better I guess. I'll have to check though
if the resource handling done by devm_kmalloc() will work even
if the probe() method fails with -ENODEV and the like...

> Marc
> 
> [1] http://stackoverflow.com/questions/2060974/dynamic-array-in-struct-c
> 

Thanks for the link. Didn't know that such a "hack" has gained
official status by C99 :-)

Regards,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v3 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  @ 2015-03-12 19:30 89%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-12 19:30 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML, Linux-netdev

Hi Marc,

On Wed, Mar 11, 2015 at 10:43:07PM +0100, Marc Kleine-Budde wrote:
> On 03/11/2015 06:37 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > A number of tx queue wake-up events went missing due to the
> > outlined scenario below. Start state is a pool of 16 tx URBs,
> > active tx_urbs count = 15, with the netdev tx queue open.
> > 
> > start_xmit()                             tx_acknowledge()
> > ............                             ................
> > atomic_inc(&tx_urbs);
> > if (atomic_read(&tx_urbs) >= 16) {
> >                         URB completion IRQ!
> >                         -->
> >                                          atomic_dec(&tx_urbs);
> >                                          netif_wake_queue();
> >                                          return;
> >                         <--
> >                         end of IRQ!
> >     netif_stop_queue();
> > }
> > 
> > At the end, the correct state expected is a 15 tx_urbs count
> > value with the tx queue state _open_. Due to the race, we get
> > the same tx_urbs value but with the tx queue state _stopped_.
> > The wake-up event is completely lost.
> > 
> > Thus avoid hand-rolled concurrency mechanisms and use a proper
> > lock for contexts protection.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > ---
> >  drivers/net/can/usb/kvaser_usb.c | 83 ++++++++++++++++++++++++----------------
> >  1 file changed, 51 insertions(+), 32 deletions(-)
> > 
> > changelog-v3: Added missing spin_lock_init(). With all kernel
> > lock debugging options set, I've been running my test suite
> > for an hour now without apparent problems in dmesg so far.
> > 
> > changelog-v2: Put bugfix patch at the start of the series.
> > 
> > diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
> > index a316fa4..e97a08c 100644
> > --- a/drivers/net/can/usb/kvaser_usb.c
> > +++ b/drivers/net/can/usb/kvaser_usb.c
> > @@ -14,6 +14,7 @@
> >   * Copyright (C) 2015 Valeo S.A.
> >   */
> >  
> > +#include <linux/spinlock.h>
> >  #include <linux/kernel.h>
> >  #include <linux/completion.h>
> >  #include <linux/module.h>
> > @@ -467,10 +468,11 @@ struct kvaser_usb {
> >  struct kvaser_usb_net_priv {
> >  	struct can_priv can;
> >  
> > -	atomic_t active_tx_urbs;
> > -	struct usb_anchor tx_submitted;
> > +	spinlock_t tx_contexts_lock;
> > +	int active_tx_contexts;
> >  	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
> >  
> > +	struct usb_anchor tx_submitted;
> >  	struct completion start_comp, stop_comp;
> >  
> >  	struct kvaser_usb *dev;
> > @@ -694,6 +696,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
> >  	struct kvaser_usb_net_priv *priv;
> >  	struct sk_buff *skb;
> >  	struct can_frame *cf;
> > +	unsigned long flags;
> >  	u8 channel, tid;
> >  
> >  	channel = msg->u.tx_acknowledge_header.channel;
> > @@ -737,12 +740,15 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
> >  
> >  	stats->tx_packets++;
> >  	stats->tx_bytes += context->dlc;
> > -	can_get_echo_skb(priv->netdev, context->echo_index);
> >  
> > -	context->echo_index = MAX_TX_URBS;
> > -	atomic_dec(&priv->active_tx_urbs);
> > +	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
> >  
> > +	can_get_echo_skb(priv->netdev, context->echo_index);
> > +	context->echo_index = MAX_TX_URBS;
> > +	--priv->active_tx_contexts;
> >  	netif_wake_queue(priv->netdev);
> > +
> > +	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
> 
> I think it should be possible to move tun unlock before waking up the queue?
> 

Hmmm... I've kept thinking about this today. Here's my
understanding of the whole situation:

1) start_xmit() runs in NET_TX_SOFTIRQ softirq context
2) tx_acknowledge() occurs in URB completion softirq context
3) In an SMP machine, softirqs can run parallel to each other
4) start_xmit is protected from itself by the _xmit_lock spin
5) start_xmit() and tx_acknowledge() can, and do, run in parallel
   to each other

>From the above, we can imagine the following:

################################################################
#  Transfer queue length = 16
#  Transfer queue index = 15
#  
#  CPU #1                                    CPU #2
#  start_xmit()                              tx_acknowledge()
#  ------------                              ----------------
#                                            ...
#                                            spin_lock(ctx_lock)
#                                            index--
#                                            spin_unlock(ctx__lock)
#                          <---
#  ...
#  spin_lock(ctx_lock)
#  index++
#  spin_unlock(ctx_lock)
#  return;
#  
#  /* Another invocation of start_xmit */
#  
#  ...
#  spin_lock(ctx_lock)
#  index++
#  /* We've filled the tx queue */
#  if (index == 16)
#      netif_stop_queue()
#  spin_unlock(ctx_lock)
#  return;
#  
#  /* Transfer queue stopped */
#
#                          --->
#                                            /* Queue woken up
#  					     while it's full */
#                                            netif_wake_queue()
#                        
################################################################

I admit that unlike the race in the patch commit message, which
actually appeared in practice in a normal but heavy use case,
the one in the box above is highly theoretical. 

Nonetheless, I was actually able to fully and reliably produce
it by inserting a busy loop as follows:

static void kvaser_usb_tx_acknowledge(...)
{
   ...
   spin_lock_irqsave(&priv->tx_contexts_lock, flags);

   can_get_echo_skb(priv->netdev, context->echo_index);
   context->echo_index = dev->max_tx_urbs;
   --priv->active_tx_contexts;

   spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);

   /* Manual delay; use cpu_relax() not to be optimized out */
   for (n = 0; n < 1000; n++)
       cpu_relax();

   netif_wake_queue(priv->netdev);
   ...
}

Then running a very heavy transmission load:

   $ for i in {1..10}; do (cangen -i -I 111 -g1 -Di -L4 can0 &); done

And as I've expected, dmesg began seeing entries of "cannot find
free context" due to waking up the tx queue while being full:

[  +0.000002] kvaser_usb 1-1.2.1.1:1.0 can0: cannot find free context
[  +0.000003] kvaser_usb 1-1.2.1.1:1.0 can0: cannot find free context
[  +0.000002] kvaser_usb 1-1.2.1.1:1.0 can0: cannot find free context
[  +0.000002] kvaser_usb 1-1.2.1.1:1.0 can0: cannot find free context

Puting the netif_wake_queue() back inside the critical section,
even while keeping the delay loop in place, avoided such a race
condition.

Under normal conditions, such a heavy delay between the critical
section exit and netif_wake_queue() will not usually occur. This
is even more true since a softirq cannot preempt another softirq
running on the same CPU.

Nonetheless, since the race can be manually triggered as seen
above, I'd be safe and wake the queue inside the critical section
rather than sorry...

What do you think?

Regards,
Darwish

^ permalink raw reply	[relevance 89%]

* [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  2015-02-26 15:20 95% [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
                   ` (3 preceding siblings ...)
  2015-03-11 17:37 77% ` [PATCH v3 " Ahmed S. Darwish
@ 2015-03-14 13:02 74% ` Ahmed S. Darwish
  2015-03-14 13:09 78%   ` [PATCH v4 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
    2015-03-15 15:03 77% ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Ahmed S. Darwish
  5 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 13:02 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML, netdev

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

A number of tx queue wake-up events went missing due to the
outlined scenario below. Start state is a pool of 16 tx URBs,
active tx_urbs count = 15, with the netdev tx queue open.

CPU #1 [softirq]                         CPU #2 [softirq]
start_xmit()                             tx_acknowledge()
................                         ................

atomic_inc(&tx_urbs);
if (atomic_read(&tx_urbs) >= 16) {
                        -->
                                         atomic_dec(&tx_urbs);
                                         netif_wake_queue();
                                         return;
                        <--
    netif_stop_queue();
}

At the end, the correct state expected is a 15 tx_urbs count
value with the tx queue state _open_. Due to the race, we get
the same tx_urbs value but with the tx queue state _stopped_.
The wake-up event is completely lost.

Thus avoid hand-rolled concurrency mechanisms and use a proper
lock for contexts and tx queue protection.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 83 ++++++++++++++++++++++++----------------
 1 file changed, 51 insertions(+), 32 deletions(-)

Changelog v4:
-------------

Improve the commit log not to give the impression of a
softirq preempting another softirq, which can never happen. The
race condition occurs by having the softirqs running in parallel.

For why are we waking up the queue inside the newly created
critical section, kindly check the explanation here:

	http://article.gmane.org/gmane.linux.kernel/1907377
	Archived at: http://www.webcitation.org/6X1SNi708

Meanwhile, I've been running the driver for 30 hours now under
very heavy and ordered "cangen -Di" traffic from both ends.
Analyzing the tens of gigabytes candump traces (generated, in
parallel, using in-kernel CAN ID filters to avoid SO_RXQ_OVFL
overflows) shows that all the frames were sent and received in
the expected sequence.

Changelog v3:
-------------

Add missing spin_lock_init(). Run driver tests with locking
and memory management debugging options on.

Changelog v2:
-------------

Put this bugfix patch at the top of the series

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index a316fa4..e97a08c 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -14,6 +14,7 @@
  * Copyright (C) 2015 Valeo S.A.
  */
 
+#include <linux/spinlock.h>
 #include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/module.h>
@@ -467,10 +468,11 @@ struct kvaser_usb {
 struct kvaser_usb_net_priv {
 	struct can_priv can;
 
-	atomic_t active_tx_urbs;
-	struct usb_anchor tx_submitted;
+	spinlock_t tx_contexts_lock;
+	int active_tx_contexts;
 	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
 
+	struct usb_anchor tx_submitted;
 	struct completion start_comp, stop_comp;
 
 	struct kvaser_usb *dev;
@@ -694,6 +696,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	struct kvaser_usb_net_priv *priv;
 	struct sk_buff *skb;
 	struct can_frame *cf;
+	unsigned long flags;
 	u8 channel, tid;
 
 	channel = msg->u.tx_acknowledge_header.channel;
@@ -737,12 +740,15 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats->tx_packets++;
 	stats->tx_bytes += context->dlc;
-	can_get_echo_skb(priv->netdev, context->echo_index);
 
-	context->echo_index = MAX_TX_URBS;
-	atomic_dec(&priv->active_tx_urbs);
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
+	can_get_echo_skb(priv->netdev, context->echo_index);
+	context->echo_index = MAX_TX_URBS;
+	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
+
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 }
 
 static void kvaser_usb_simple_msg_callback(struct urb *urb)
@@ -803,17 +809,6 @@ static int kvaser_usb_simple_msg_async(struct kvaser_usb_net_priv *priv,
 	return 0;
 }
 
-static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
-{
-	int i;
-
-	usb_kill_anchored_urbs(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-}
-
 static void kvaser_usb_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
 						 const struct kvaser_usb_error_summary *es,
 						 struct can_frame *cf)
@@ -1515,6 +1510,24 @@ error:
 	return err;
 }
 
+static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
+{
+	int i;
+
+	priv->active_tx_contexts = 0;
+	for (i = 0; i < MAX_TX_URBS; i++)
+		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+}
+
+/* This method might sleep. Do not call it in the atomic context
+ * of URB completions.
+ */
+static void kvaser_usb_unlink_tx_urbs(struct kvaser_usb_net_priv *priv)
+{
+	usb_kill_anchored_urbs(&priv->tx_submitted);
+	kvaser_usb_reset_tx_urb_contexts(priv);
+}
+
 static void kvaser_usb_unlink_all_urbs(struct kvaser_usb *dev)
 {
 	int i;
@@ -1634,6 +1647,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	struct kvaser_msg *msg;
 	int i, err, ret = NETDEV_TX_OK;
 	u8 *msg_tx_can_flags = NULL;		/* GCC */
+	unsigned long flags;
 
 	if (can_dropped_invalid_skb(netdev, skb))
 		return NETDEV_TX_OK;
@@ -1687,12 +1701,21 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (cf->can_id & CAN_RTR_FLAG)
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
+	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
 		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
 			context = &priv->tx_contexts[i];
+
+			context->echo_index = i;
+			can_put_echo_skb(skb, netdev, context->echo_index);
+			++priv->active_tx_contexts;
+			if (priv->active_tx_contexts >= MAX_TX_URBS)
+				netif_stop_queue(netdev);
+
 			break;
 		}
 	}
+	spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
 	/* This should never happen; it implies a flow control bug */
 	if (!context) {
@@ -1704,7 +1727,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	}
 
 	context->priv = priv;
-	context->echo_index = i;
 	context->dlc = cf->can_dlc;
 
 	msg->u.tx_can.tid = context->echo_index;
@@ -1716,18 +1738,17 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 			  kvaser_usb_write_bulk_callback, context);
 	usb_anchor_urb(urb, &priv->tx_submitted);
 
-	can_put_echo_skb(skb, netdev, context->echo_index);
-
-	atomic_inc(&priv->active_tx_urbs);
-
-	if (atomic_read(&priv->active_tx_urbs) >= MAX_TX_URBS)
-		netif_stop_queue(netdev);
-
 	err = usb_submit_urb(urb, GFP_ATOMIC);
 	if (unlikely(err)) {
+		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
+
 		can_free_echo_skb(netdev, context->echo_index);
+		context->echo_index = MAX_TX_URBS;
+		--priv->active_tx_contexts;
+		netif_wake_queue(netdev);
+
+		spin_unlock_irqrestore(&priv->tx_contexts_lock, flags);
 
-		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
 		stats->tx_dropped++;
@@ -1854,7 +1875,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb *dev = usb_get_intfdata(intf);
 	struct net_device *netdev;
 	struct kvaser_usb_net_priv *priv;
-	int i, err;
+	int err;
 
 	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
 	if (err)
@@ -1868,19 +1889,17 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 
 	priv = netdev_priv(netdev);
 
+	init_usb_anchor(&priv->tx_submitted);
 	init_completion(&priv->start_comp);
 	init_completion(&priv->stop_comp);
 
-	init_usb_anchor(&priv->tx_submitted);
-	atomic_set(&priv->active_tx_urbs, 0);
-
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
-
 	priv->dev = dev;
 	priv->netdev = netdev;
 	priv->channel = channel;
 
+	spin_lock_init(&priv->tx_contexts_lock);
+	kvaser_usb_reset_tx_urb_contexts(priv);
+
 	priv->can.state = CAN_STATE_STOPPED;
 	priv->can.clock.freq = CAN_USB_CLOCK;
 	priv->can.bittiming_const = &kvaser_usb_bittiming_const;
-- 
1.9.1


^ permalink raw reply related	[relevance 74%]

* [PATCH v4 2/3] can: kvaser_usb: Utilize all possible tx URBs
  2015-03-14 13:02 74% ` [PATCH v4 " Ahmed S. Darwish
@ 2015-03-14 13:09 78%   ` Ahmed S. Darwish
  2015-03-14 13:11 99%     ` [PATCH v4 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
    1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 13:09 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML, netdev

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

The driver currently limits the number of outstanding, not yet
ACKed, transfers to 16 URBs. Meanwhile, the Kvaser firmware
provides its actual max supported number of outstanding
transmissions in its reply to the CMD_GET_SOFTWARE_INFO message.

One example is the UsbCan-II HS/LS device which reports support
of up to 48 tx URBs instead of just 16, increasing the driver
throughput by two-fold and reducing the possibility of -ENOBUFs.

Dynamically set the max tx URBs value according to firmware
replies.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 64 ++++++++++++++++++++++++----------------
 1 file changed, 39 insertions(+), 25 deletions(-)

Changelog v4: Allocate the now-dynamically-sized tx_contexts[]
array as a C99 "flexible array member", insuring automatic proper
de-allocation on driver exit.

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index e97a08c..60eadf5 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -25,7 +25,6 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
-#define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
 #define STOP_TIMEOUT			1000 /* msecs */
@@ -443,6 +442,7 @@ struct kvaser_usb_error_summary {
 	};
 };
 
+/* Context for an outstanding, not yet ACKed, transmission */
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -456,8 +456,13 @@ struct kvaser_usb {
 	struct usb_endpoint_descriptor *bulk_in, *bulk_out;
 	struct usb_anchor rx_submitted;
 
+	/* @max_tx_urbs: Firmware-reported maximum number of oustanding,
+	 * not yet ACKed, transmissions on this device. This value is
+	 * also used as a sentinel for marking free tx contexts.
+	 */
 	u32 fw_version;
 	unsigned int nchannels;
+	unsigned int max_tx_urbs;
 	enum kvaser_usb_family family;
 
 	bool rxinitdone;
@@ -467,19 +472,18 @@ struct kvaser_usb {
 
 struct kvaser_usb_net_priv {
 	struct can_priv can;
-
-	spinlock_t tx_contexts_lock;
-	int active_tx_contexts;
-	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
-
-	struct usb_anchor tx_submitted;
-	struct completion start_comp, stop_comp;
+	struct can_berr_counter bec;
 
 	struct kvaser_usb *dev;
 	struct net_device *netdev;
 	int channel;
 
-	struct can_berr_counter bec;
+	struct completion start_comp, stop_comp;
+	struct usb_anchor tx_submitted;
+
+	spinlock_t tx_contexts_lock;
+	int active_tx_contexts;
+	struct kvaser_usb_tx_urb_context tx_contexts[];
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
@@ -657,9 +661,13 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	switch (dev->family) {
 	case KVASER_LEAF:
 		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.leaf.softinfo.max_outstanding_tx);
 		break;
 	case KVASER_USBCAN:
 		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.usbcan.softinfo.max_outstanding_tx);
 		break;
 	}
 
@@ -715,7 +723,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats = &priv->netdev->stats;
 
-	context = &priv->tx_contexts[tid % MAX_TX_URBS];
+	context = &priv->tx_contexts[tid % dev->max_tx_urbs];
 
 	/* Sometimes the state change doesn't come after a bus-off event */
 	if (priv->can.restart_ms &&
@@ -744,7 +752,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 	can_get_echo_skb(priv->netdev, context->echo_index);
-	context->echo_index = MAX_TX_URBS;
+	context->echo_index = dev->max_tx_urbs;
 	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
 
@@ -1512,11 +1520,13 @@ error:
 
 static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
 {
-	int i;
+	int i, max_tx_urbs;
+
+	max_tx_urbs = priv->dev->max_tx_urbs;
 
 	priv->active_tx_contexts = 0;
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+	for (i = 0; i < max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = max_tx_urbs;
 }
 
 /* This method might sleep. Do not call it in the atomic context
@@ -1702,14 +1712,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
-		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
+	for (i = 0; i < dev->max_tx_urbs; i++) {
+		if (priv->tx_contexts[i].echo_index == dev->max_tx_urbs) {
 			context = &priv->tx_contexts[i];
 
 			context->echo_index = i;
 			can_put_echo_skb(skb, netdev, context->echo_index);
 			++priv->active_tx_contexts;
-			if (priv->active_tx_contexts >= MAX_TX_URBS)
+			if (priv->active_tx_contexts >= dev->max_tx_urbs)
 				netif_stop_queue(netdev);
 
 			break;
@@ -1743,7 +1753,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 		can_free_echo_skb(netdev, context->echo_index);
-		context->echo_index = MAX_TX_URBS;
+		context->echo_index = dev->max_tx_urbs;
 		--priv->active_tx_contexts;
 		netif_wake_queue(netdev);
 
@@ -1881,7 +1891,9 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	if (err)
 		return err;
 
-	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
+	netdev = alloc_candev(sizeof(*priv) +
+			      dev->max_tx_urbs * sizeof(*priv->tx_contexts),
+			      dev->max_tx_urbs);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
 		return -ENOMEM;
@@ -1928,7 +1940,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 		return err;
 	}
 
-	netdev_dbg(netdev, "device registered\n");
+	netdev_info(netdev, "device registered\n");
 
 	return 0;
 }
@@ -2009,6 +2021,13 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
+	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
+		((dev->fw_version >> 24) & 0xff),
+		((dev->fw_version >> 16) & 0xff),
+		(dev->fw_version & 0xffff));
+
+	dev_dbg(&intf->dev, "Max oustanding tx = %d URBs\n", dev->max_tx_urbs);
+
 	err = kvaser_usb_get_card_info(dev);
 	if (err) {
 		dev_err(&intf->dev,
@@ -2016,11 +2035,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
-	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
-		((dev->fw_version >> 24) & 0xff),
-		((dev->fw_version >> 16) & 0xff),
-		(dev->fw_version & 0xffff));
-
 	for (i = 0; i < dev->nchannels; i++) {
 		err = kvaser_usb_init_one(intf, id, i);
 		if (err) {
-- 
1.9.1


^ permalink raw reply related	[relevance 78%]

* [PATCH v4 3/3] can: kvaser_usb: Use can-dev unregistration mechanism
  2015-03-14 13:09 78%   ` [PATCH v4 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
@ 2015-03-14 13:11 99%     ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 13:11 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML, netdev

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Use can-dev's unregister_candev() instead of directly calling
networking unregister_netdev(). While both are functionally
equivalent, unregister_candev() might do extra stuff in the
future than just calling networking layer unregistration code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 60eadf5..d44fb1e 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1866,7 +1866,7 @@ static void kvaser_usb_remove_interfaces(struct kvaser_usb *dev)
 		if (!dev->nets[i])
 			continue;
 
-		unregister_netdev(dev->nets[i]->netdev);
+		unregister_candev(dev->nets[i]->netdev);
 	}
 
 	kvaser_usb_unlink_all_urbs(dev);
-- 
1.9.1


^ permalink raw reply related	[relevance 99%]

* Re: [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  @ 2015-03-14 14:38 95%     ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 14:38 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML, netdev

Hi Marc,

On Sat, Mar 14, 2015 at 02:41:18PM +0100, Marc Kleine-Budde wrote:
> On 03/14/2015 02:02 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > A number of tx queue wake-up events went missing due to the
> > outlined scenario below. Start state is a pool of 16 tx URBs,
> > active tx_urbs count = 15, with the netdev tx queue open.
> > 
> > CPU #1 [softirq]                         CPU #2 [softirq]
> > start_xmit()                             tx_acknowledge()
> > ................                         ................
> > 
> > atomic_inc(&tx_urbs);
> > if (atomic_read(&tx_urbs) >= 16) {
> >                         -->
> >                                          atomic_dec(&tx_urbs);
> >                                          netif_wake_queue();
> >                                          return;
> >                         <--
> >     netif_stop_queue();
> > }
> > 
> > At the end, the correct state expected is a 15 tx_urbs count
> > value with the tx queue state _open_. Due to the race, we get
> > the same tx_urbs value but with the tx queue state _stopped_.
> > The wake-up event is completely lost.
> > 
> > Thus avoid hand-rolled concurrency mechanisms and use a proper
> > lock for contexts and tx queue protection.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> Applied to can. This will go into David's net tree and finally into
> net-next. Then I'll apply patches 2+3. Nag me, if I forget about them ;)
> 

Thanks! :-)

So if I've understood correctly, this patch will go to -rc5 and
the rest will go into net-next?

If so, IMHO patch #2 should also go to -rc5. I know it's usually
frowned up on to add further patches at this late -rc stage, but
here's my logic:

The original driver code just _arbitrarily_ assumed a MAX_TX_URB
value of 16 parallel transmissions. This value was chosen, it seems,
because the driver was heavily based on esd_usb2.c and the esd code
just did so :-(

Meanwhile, in the Kvaser hardware at hand, if I've increased the
driver's max parallel transmissions little above the recommended
limit reported by firmware, the firmware breaks up badly, reports a
massive list of internal errors, and the candump traces becomes
sort of an internal mess hardly related to the actual frames sent
and received.

In my case, I was lucky that the brand we own here (*) had a higher
max outstanding transmissions value than 16. But if there is hardware
out there with a max oustanding tx support < 16 (#), such hardware
will break badly under a heavy transmission load :-(

(*) http://www.kvaser.com/products/kvaser-usb-hs-ii-hsls/

(#) There are a huge list of Kvaser products having the same controller
    but with different performance metrics, so this is quite a
    possiblity.

Thanks,
Darwish

^ permalink raw reply	[relevance 95%]

* Re: [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
  @ 2015-03-14 15:19 99%         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 15:19 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML, netdev

On Sat, Mar 14, 2015 at 03:58:39PM +0100, Marc Kleine-Budde wrote:
> On 03/14/2015 03:38 PM, Ahmed S. Darwish wrote:
> >> Applied to can. This will go into David's net tree and finally into
> >> net-next. Then I'll apply patches 2+3. Nag me, if I forget about them ;)
> >>
> > 
> > Thanks! :-)
> > 
> > So if I've understood correctly, this patch will go to -rc5 and
> > the rest will go into net-next?
> > 
> > If so, IMHO patch #2 should also go to -rc5. I know it's usually
> > frowned up on to add further patches at this late -rc stage, but
> > here's my logic:
> > 
> > The original driver code just _arbitrarily_ assumed a MAX_TX_URB
> > value of 16 parallel transmissions. This value was chosen, it seems,
> > because the driver was heavily based on esd_usb2.c and the esd code
> > just did so :-(
> > 
> > Meanwhile, in the Kvaser hardware at hand, if I've increased the
> > driver's max parallel transmissions little above the recommended
> > limit reported by firmware, the firmware breaks up badly, reports a
> > massive list of internal errors, and the candump traces becomes
> > sort of an internal mess hardly related to the actual frames sent
> > and received.
> 
> In this particular hardware, what's the limit as reported by the firmware?
> 

48 max oustanding tx, which is quite big in itself it seems.

Other drivers in the tree range between 10 (Peak) and 20 tx (8dev).

> > In my case, I was lucky that the brand we own here (*) had a higher
> > max outstanding transmissions value than 16. But if there is hardware
> 
> Okay - higher.
> 
> > out there with a max oustanding tx support < 16 (#), such hardware
> > will break badly under a heavy transmission load :-(
> 
> I see.
> 
> If you add this motivation to the patch description and let the subject
> reflect that this is a "fix" or safety measure rather than a possible
> performance improvement, no-one will say anything against this patch :D
> 

True; I admit the "fix" part should've been clearer :-)

Will send a better worded commit message then.

Thanks a lot,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 3/3] can: kvaser_usb: Use can-dev unregistration mechanism
  @ 2015-03-14 15:41 99%         ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 15:41 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML, netdev

On Sat, Mar 14, 2015 at 04:26:56PM +0100, Marc Kleine-Budde wrote:
> On 03/14/2015 02:11 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > Use can-dev's unregister_candev() instead of directly calling
> > networking unregister_netdev(). While both are functionally
> > equivalent, unregister_candev() might do extra stuff in the
> > future than just calling networking layer unregistration code.
> 
> Since 2 goes into can, I've applied this into can-next.
> 

Merci.

Was this a cherry-pick? Because I was going to send a new
series with patch #2 better worded, and with a new patch for
the endiannes issue.

Regards,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v4 3/3] can: kvaser_usb: Use can-dev unregistration mechanism
  @ 2015-03-14 16:06 99%             ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-14 16:06 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML, netdev

On Sat, Mar 14, 2015 at 04:55:11PM +0100, Marc Kleine-Budde wrote:
> On 03/14/2015 04:41 PM, Ahmed S. Darwish wrote:
> > On Sat, Mar 14, 2015 at 04:26:56PM +0100, Marc Kleine-Budde wrote:
> >> On 03/14/2015 02:11 PM, Ahmed S. Darwish wrote:
> >>> From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> >>>
> >>> Use can-dev's unregister_candev() instead of directly calling
> >>> networking unregister_netdev(). While both are functionally
> >>> equivalent, unregister_candev() might do extra stuff in the
> >>> future than just calling networking layer unregistration code.
> >>
> >> Since 2 goes into can, I've applied this into can-next.
> 
> > Was this a cherry-pick? Because I was going to send a new
> > series with patch #2 better worded, and with a new patch for
> > the endiannes issue.
> 
> Yes, no need to resend patch #3, as it's already applied to can-next.
> 
> regards,
> Marc
> 
> 1 = can: kvaser_usb: Fix tx queue start/stop race conditions
> 2 = can: kvaser_usb: Utilize all possible tx URBs
> 3 = can: kvaser_usb: Use can-dev unregistration mechanism
> 4 = the endianess issue
> 
> 1 = is in linux-can and included in linux-can-fixes-for-4.0-20150314
> 2 = will go into linux-can with a better commit message
>     which is currently prepare by you
>     will be in the next pull request for net
> 3 = is in linux-can-next and will be included in the next pull request
>     for net-next
> 4 = is currently prepared by you
>     will be in the next pull request for net
> 
> This means, you'll send me patches 2 and 4 in a new v5 series. (This
> patches will of course have new numbers, 1 and 2.)
> 

A piece of cake :D

^ permalink raw reply	[relevance 99%]

* [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value
  2015-02-26 15:20 95% [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
                   ` (4 preceding siblings ...)
  2015-03-14 13:02 74% ` [PATCH v4 " Ahmed S. Darwish
@ 2015-03-15 15:03 77% ` Ahmed S. Darwish
  2015-03-15 15:10 98%   ` [PATCH v5 2/2] can: kvaser_usb: Fix sparse warning __le16 degrades to integer Ahmed S. Darwish
    5 siblings, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-15 15:03 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

Current driver code arbitrarily assumes a max outstanding tx
value of 16 parallel transmissions. Meanwhile, the device
firmware provides its actual maximum inside its reply to the
CMD_GET_SOFTWARE_INFO message.

Under heavy tx traffic, if the interleaved transmissions count
increases above the limit reported by firmware, the firmware
breaks up badly, reports a massive list of internal errors, and
the candump traces hardly matches the actual frames sent and
received.

On the other hand, in certain models, the firmware can support
up to 48 tx URBs instead of just 16, increasing the driver
throughput by two-fold and reducing the possibility of -ENOBUFs.

Thus dynamically set the driver's max tx URBs value according
to firmware replies.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 64 ++++++++++++++++++++++++----------------
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index e97a08c..60eadf5 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -25,7 +25,6 @@
 #include <linux/can/dev.h>
 #include <linux/can/error.h>
 
-#define MAX_TX_URBS			16
 #define MAX_RX_URBS			4
 #define START_TIMEOUT			1000 /* msecs */
 #define STOP_TIMEOUT			1000 /* msecs */
@@ -443,6 +442,7 @@ struct kvaser_usb_error_summary {
 	};
 };
 
+/* Context for an outstanding, not yet ACKed, transmission */
 struct kvaser_usb_tx_urb_context {
 	struct kvaser_usb_net_priv *priv;
 	u32 echo_index;
@@ -456,8 +456,13 @@ struct kvaser_usb {
 	struct usb_endpoint_descriptor *bulk_in, *bulk_out;
 	struct usb_anchor rx_submitted;
 
+	/* @max_tx_urbs: Firmware-reported maximum number of oustanding,
+	 * not yet ACKed, transmissions on this device. This value is
+	 * also used as a sentinel for marking free tx contexts.
+	 */
 	u32 fw_version;
 	unsigned int nchannels;
+	unsigned int max_tx_urbs;
 	enum kvaser_usb_family family;
 
 	bool rxinitdone;
@@ -467,19 +472,18 @@ struct kvaser_usb {
 
 struct kvaser_usb_net_priv {
 	struct can_priv can;
-
-	spinlock_t tx_contexts_lock;
-	int active_tx_contexts;
-	struct kvaser_usb_tx_urb_context tx_contexts[MAX_TX_URBS];
-
-	struct usb_anchor tx_submitted;
-	struct completion start_comp, stop_comp;
+	struct can_berr_counter bec;
 
 	struct kvaser_usb *dev;
 	struct net_device *netdev;
 	int channel;
 
-	struct can_berr_counter bec;
+	struct completion start_comp, stop_comp;
+	struct usb_anchor tx_submitted;
+
+	spinlock_t tx_contexts_lock;
+	int active_tx_contexts;
+	struct kvaser_usb_tx_urb_context tx_contexts[];
 };
 
 static const struct usb_device_id kvaser_usb_table[] = {
@@ -657,9 +661,13 @@ static int kvaser_usb_get_software_info(struct kvaser_usb *dev)
 	switch (dev->family) {
 	case KVASER_LEAF:
 		dev->fw_version = le32_to_cpu(msg.u.leaf.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.leaf.softinfo.max_outstanding_tx);
 		break;
 	case KVASER_USBCAN:
 		dev->fw_version = le32_to_cpu(msg.u.usbcan.softinfo.fw_version);
+		dev->max_tx_urbs =
+			le16_to_cpu(msg.u.usbcan.softinfo.max_outstanding_tx);
 		break;
 	}
 
@@ -715,7 +723,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 
 	stats = &priv->netdev->stats;
 
-	context = &priv->tx_contexts[tid % MAX_TX_URBS];
+	context = &priv->tx_contexts[tid % dev->max_tx_urbs];
 
 	/* Sometimes the state change doesn't come after a bus-off event */
 	if (priv->can.restart_ms &&
@@ -744,7 +752,7 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 	can_get_echo_skb(priv->netdev, context->echo_index);
-	context->echo_index = MAX_TX_URBS;
+	context->echo_index = dev->max_tx_urbs;
 	--priv->active_tx_contexts;
 	netif_wake_queue(priv->netdev);
 
@@ -1512,11 +1520,13 @@ error:
 
 static void kvaser_usb_reset_tx_urb_contexts(struct kvaser_usb_net_priv *priv)
 {
-	int i;
+	int i, max_tx_urbs;
+
+	max_tx_urbs = priv->dev->max_tx_urbs;
 
 	priv->active_tx_contexts = 0;
-	for (i = 0; i < MAX_TX_URBS; i++)
-		priv->tx_contexts[i].echo_index = MAX_TX_URBS;
+	for (i = 0; i < max_tx_urbs; i++)
+		priv->tx_contexts[i].echo_index = max_tx_urbs;
 }
 
 /* This method might sleep. Do not call it in the atomic context
@@ -1702,14 +1712,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		*msg_tx_can_flags |= MSG_FLAG_REMOTE_FRAME;
 
 	spin_lock_irqsave(&priv->tx_contexts_lock, flags);
-	for (i = 0; i < ARRAY_SIZE(priv->tx_contexts); i++) {
-		if (priv->tx_contexts[i].echo_index == MAX_TX_URBS) {
+	for (i = 0; i < dev->max_tx_urbs; i++) {
+		if (priv->tx_contexts[i].echo_index == dev->max_tx_urbs) {
 			context = &priv->tx_contexts[i];
 
 			context->echo_index = i;
 			can_put_echo_skb(skb, netdev, context->echo_index);
 			++priv->active_tx_contexts;
-			if (priv->active_tx_contexts >= MAX_TX_URBS)
+			if (priv->active_tx_contexts >= dev->max_tx_urbs)
 				netif_stop_queue(netdev);
 
 			break;
@@ -1743,7 +1753,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		spin_lock_irqsave(&priv->tx_contexts_lock, flags);
 
 		can_free_echo_skb(netdev, context->echo_index);
-		context->echo_index = MAX_TX_URBS;
+		context->echo_index = dev->max_tx_urbs;
 		--priv->active_tx_contexts;
 		netif_wake_queue(netdev);
 
@@ -1881,7 +1891,9 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	if (err)
 		return err;
 
-	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
+	netdev = alloc_candev(sizeof(*priv) +
+			      dev->max_tx_urbs * sizeof(*priv->tx_contexts),
+			      dev->max_tx_urbs);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
 		return -ENOMEM;
@@ -1928,7 +1940,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 		return err;
 	}
 
-	netdev_dbg(netdev, "device registered\n");
+	netdev_info(netdev, "device registered\n");
 
 	return 0;
 }
@@ -2009,6 +2021,13 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
+	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
+		((dev->fw_version >> 24) & 0xff),
+		((dev->fw_version >> 16) & 0xff),
+		(dev->fw_version & 0xffff));
+
+	dev_dbg(&intf->dev, "Max oustanding tx = %d URBs\n", dev->max_tx_urbs);
+
 	err = kvaser_usb_get_card_info(dev);
 	if (err) {
 		dev_err(&intf->dev,
@@ -2016,11 +2035,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 		return err;
 	}
 
-	dev_dbg(&intf->dev, "Firmware version: %d.%d.%d\n",
-		((dev->fw_version >> 24) & 0xff),
-		((dev->fw_version >> 16) & 0xff),
-		(dev->fw_version & 0xffff));
-
 	for (i = 0; i < dev->nchannels; i++) {
 		err = kvaser_usb_init_one(intf, id, i);
 		if (err) {
-- 
1.9.1


^ permalink raw reply related	[relevance 77%]

* [PATCH v5 2/2] can: kvaser_usb: Fix sparse warning __le16 degrades to integer
  2015-03-15 15:03 77% ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Ahmed S. Darwish
@ 2015-03-15 15:10 98%   ` Ahmed S. Darwish
    1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-15 15:10 UTC (permalink / raw)
  To: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Marc Kleine-Budde, Andri Yngvason
  Cc: Linux-CAN, LKML

From: Ahmed S. Darwish <ahmed.darwish@valeo.com>

USB endpoint's wMaxPacketSize field is an le16 entity. Use
appropriate le16_to_cpu macros to maintain endian independence.

Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
---
 drivers/net/can/usb/kvaser_usb.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

I wasn't successful in making sparse trigger the warning, even
while using the latest version from the git repos and doing:

make C=2 M=drivers/net/can/usb

So, Marc, I hope this patch fixes the issue reported; it should.

thanks,

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 60eadf5..d924016 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -596,8 +596,8 @@ static int kvaser_usb_wait_msg(const struct kvaser_usb *dev, u8 id,
 			 * for further details.
 			 */
 			if (tmp->len == 0) {
-				pos = round_up(pos,
-					       dev->bulk_in->wMaxPacketSize);
+				pos = round_up(pos, le16_to_cpu(dev->bulk_in->
+								wMaxPacketSize));
 				continue;
 			}
 
@@ -1337,7 +1337,8 @@ static void kvaser_usb_read_bulk_callback(struct urb *urb)
 		 * number of events in case of a heavy rx load on the bus.
 		 */
 		if (msg->len == 0) {
-			pos = round_up(pos, dev->bulk_in->wMaxPacketSize);
+			pos = round_up(pos, le16_to_cpu(dev->bulk_in->
+							wMaxPacketSize));
 			continue;
 		}
 
-- 
1.9.1


^ permalink raw reply related	[relevance 98%]

* Re: [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value
  @ 2015-03-16 12:16 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2015-03-16 12:16 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Olivier Sobrie, Oliver Hartkopp, Wolfgang Grandegger,
	Andri Yngvason, Linux-CAN, LKML

On Sun, Mar 15, 2015 at 07:08:23PM +0100, Marc Kleine-Budde wrote:
> On 03/15/2015 04:03 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > Current driver code arbitrarily assumes a max outstanding tx
> > value of 16 parallel transmissions. Meanwhile, the device
> > firmware provides its actual maximum inside its reply to the
> > CMD_GET_SOFTWARE_INFO message.
> > 
> > Under heavy tx traffic, if the interleaved transmissions count
> > increases above the limit reported by firmware, the firmware
> > breaks up badly, reports a massive list of internal errors, and
> > the candump traces hardly matches the actual frames sent and
> > received.
> > 
> > On the other hand, in certain models, the firmware can support
> > up to 48 tx URBs instead of just 16, increasing the driver
> > throughput by two-fold and reducing the possibility of -ENOBUFs.
> > 
> > Thus dynamically set the driver's max tx URBs value according
> > to firmware replies.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> > @@ -1928,7 +1940,7 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
> >  		return err;
> >  	}
> >  
> > -	netdev_dbg(netdev, "device registered\n");
> > +	netdev_info(netdev, "device registered\n");
> 
> This makes the driver more noisy, I'd like to drop that hunk, okay? No
> need to resend.
> 

Sure, go ahead.

I have my reasons for that hunk above, but we can always discuss
this in another separate patch ;-)

Thanks,
Darwish

^ permalink raw reply	[relevance 99%]

* Re: [GIT PULL] Staging/IIO driver patches for 4.19-rc1
  @ 2018-08-28 10:38 97% ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2018-08-28 10:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Andrew Morton, Stephen Rothwell, Rob Springer,
	John Joseph, Simon Que, Todd Poynor, linux-kernel, devel

[ re-send; forgotten lkml CC added; sorry ]

Hi,

On Sat, 18 Aug 2018 17:57:24 +0200, Greg KH wrote:
[...]
> addition of some new IIO drivers.  Also added was a "gasket" driver from
> Google that needs loads of work and the erofs filesystem.
> 

Why are we adding __a whole new in-kernel framework__ for
developing basic user-space drivers?

We already have a frameowrk for that, and it's UIO. [1] The UIO
code is a very stable and simple subsystem; it's also heavily used
in the embedded industry..

I've looked at the gasket documentation [2], and the first user
of this new in-kernel API [3], and this is almost replicating UIO
it's not funny. [4] True, the gasket APIs adds some extra new
conveniences (PCI BAR re-mapping, MSI, ..), but there's no
technical reason this cannot be added to the UIO code instead.

More-over, the exposed user-space API is just some ioctls. So if
google hase some shipped user-space code that is using this,
hopefully the driver can still be re-implemented through UIO
without changing these bits..

Thanks,

[1] https://www.kernel.org/doc/html/v4.18/driver-api/uio-howto.html
[2] drivers/staging/gasket/gasket_core.h :: struct gasket_driver_desc
[3] drivers/staging/gasket/apex_driver.c
[4] include/linux/uio_driver.h

--
Darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 97%]

* Re: [PATCH 11/13] proc: readdir /proc/*/task
  @ 2018-08-28 12:36 99%   ` Ahmed S. Darwish
  2018-08-28 13:04 99%     ` Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2018-08-28 12:36 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: akpm, linux-kernel

On Tue, Aug 28, 2018 at 02:15:01AM +0300, Alexey Dobriyan wrote:
> ---
>  fs/proc/base.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>

Missing description and S-o-b. Further comments below..

> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 33f444721965..668e465c86b3 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -3549,11 +3549,11 @@ static int proc_task_readdir(struct file *file, struct dir_context *ctx)
>  	for (task = first_tid(proc_pid(inode), tid, ctx->pos - 2, ns);
>  	     task;
>  	     task = next_tid(task), ctx->pos++) {
> -		char name[10 + 1];
> -		unsigned int len;
> +		char name[10], *p = name + sizeof(name);
> +

Multiple issues:

- len should be 11, as was in the original code
  (0xffffffff = 4294967295, 10 letters)

- while we're at it, let's use a constant for the '11' instead of
  mysterious magic numbers

- 'p' is clearly overflowing the stack here

>  		tid = task_pid_nr_ns(task, ns);
> -		len = snprintf(name, sizeof(name), "%u", tid);
> -		if (!proc_fill_cache(file, ctx, name, len,
> +		p = _print_integer_u32(p, tid);
> +		if (!proc_fill_cache(file, ctx, p, name + sizeof(name) - p,

You're replacing snprintf() code __that did proper len checking__
with code that does not. That's not good.

I can't see how the fourth proc_fill_cache() parameter, ``name +
sizeof(name)'' safely ever replace the original 'len' parameter.
It's a pointer value .. (!)

Overall this looks like a broken patch submitted by mistake.

Thanks,

-- 
Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: [PATCH 11/13] proc: readdir /proc/*/task
  2018-08-28 12:36 99%   ` Ahmed S. Darwish
@ 2018-08-28 13:04 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2018-08-28 13:04 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: akpm, linux-kernel

On Tue, Aug 28, 2018 at 12:36:22PM +0000, Ahmed S. Darwish wrote:
> On Tue, Aug 28, 2018 at 02:15:01AM +0300, Alexey Dobriyan wrote:
> > ---
> >  fs/proc/base.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> 
> Missing description and S-o-b. Further comments below..
> 
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 33f444721965..668e465c86b3 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -3549,11 +3549,11 @@ static int proc_task_readdir(struct file *file, struct dir_context *ctx)
> >  	for (task = first_tid(proc_pid(inode), tid, ctx->pos - 2, ns);
> >  	     task;
> >  	     task = next_tid(task), ctx->pos++) {
> > -		char name[10 + 1];
> > -		unsigned int len;
> > +		char name[10], *p = name + sizeof(name);
> > +
> 
> Multiple issues:
> 
> - len should be 11, as was in the original code
>   (0xffffffff = 4294967295, 10 letters)
> 
> - while we're at it, let's use a constant for the '11' instead of
>   mysterious magic numbers
> 
> - 'p' is clearly overflowing the stack here
>

See below:

> >  		tid = task_pid_nr_ns(task, ns);
> > -		len = snprintf(name, sizeof(name), "%u", tid);
> > -		if (!proc_fill_cache(file, ctx, name, len,
> > +		p = _print_integer_u32(p, tid);
> > +		if (!proc_fill_cache(file, ctx, p, name + sizeof(name) - p,
> 
> You're replacing snprintf() code __that did proper len checking__
> with code that does not. That's not good.
> 
> I can't see how the fourth proc_fill_cache() parameter, ``name +
> sizeof(name)'' safely ever replace the original 'len' parameter.
> It's a pointer value .. (!)
>

Ok, there's a "- p" in the end, so the length looks to be Ok.

Nonetheless, the whole patch series is introducing funny code
like:

+/*
+ * Print an integer in decimal.
+ * "p" initially points PAST THE END OF THE BUFFER!
+ *
+ * DO NOT USE THESE FUNCTIONS!
+ *
+ * Do not copy these functions.
+ * Do not document these functions.
+ * Do not move these functions to lib/ or elsewhere.
+ * Do not export these functions to modules.
+ * Do not tell anyone about these functions.
+ */
+noinline
+char *_print_integer_u32(char *p, u32 x)
+{
+       do {
+               *--p = '0' + (x % 10);
+               x /= 10;
+       } while (x != 0);
+       return p;
+}

And thus the code using these functions is throwing invalid
past-the-stack pointers and strings with no NULL terminators
like there's no tomorrow...

IMHO It's an accident waiting to happen to sprinkle pointers
like that everywhere. Are we really in a super hot path to
justify all that?

/me confused

-- 
Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: [GIT PULL] Staging/IIO driver patches for 4.19-rc1
  @ 2018-08-28 14:30 91%     ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2018-08-28 14:30 UTC (permalink / raw)
  To: Greg KH, Simon Que
  Cc: Stephen Rothwell, John Joseph, LKML, Rob Springer, devel,
	Andrew Morton, Todd Poynor, Linus Torvalds

Hi!

On Tue, Aug 28, 2018 at 02:36:07PM +0200, Greg KH wrote:
> On Tue, Aug 28, 2018 at 10:38:17AM +0000, Ahmed S. Darwish wrote:
> > [ re-send; forgotten lkml CC added; sorry ]
> >
> > Hi,
> >
> > On Sat, 18 Aug 2018 17:57:24 +0200, Greg KH wrote:
> > [...]
> > > addition of some new IIO drivers.  Also added was a "gasket" driver from
> > > Google that needs loads of work and the erofs filesystem.
> > >
> >
> > Why are we adding __a whole new in-kernel framework__ for
> > developing basic user-space drivers?
> >
> > We already have a frameowrk for that, and it's UIO. [1] The UIO
> > code is a very stable and simple subsystem; it's also heavily used
> > in the embedded industry..
>
> As Todd said, the code will end up being a simple UIO driver, if even
> that big, in th end.  It is just going to take him a while to constantly
> refactor things until he achieves that goal...
>
> > I've looked at the gasket documentation [2], and the first user
> > of this new in-kernel API [3], and this is almost replicating UIO
> > it's not funny. [4] True, the gasket APIs adds some extra new
> > conveniences (PCI BAR re-mapping, MSI, ..), but there's no
> > technical reason this cannot be added to the UIO code instead.
>
> {shh} That's my plan :)
>

Cool, thanks a lot.

Can we then merge something like below patch?

[ I've searched the gasket included TODO file before posting,
  but did not find any mention of UIO. Below patch will make
  sure this will not be forgotten over time.. ]

==>

Subject: [PATCH] gasket: TODO: re-implement using UIO

The gasket in-kernel APIs, recently introduced under staging,
re-implements what is already long-time provided by the UIO
framework and subsystem.

Before moving it out of staging, make sure we add the new bits
to the UIO subsystem instead, then transform its signle client,
the Apex driver, to become a UIO driver (uio_driver.h)

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---
 drivers/staging/gasket/TODO | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/staging/gasket/TODO b/drivers/staging/gasket/TODO
index 6ff8e01b04cc..5b1865f8af2d 100644
--- a/drivers/staging/gasket/TODO
+++ b/drivers/staging/gasket/TODO
@@ -1,9 +1,22 @@
 This is a list of things that need to be done to get this driver out of the
 staging directory.
+
+- Implement the gasket framework's functionality through UIO instead of
+  introducing a new user-space drivers framework that is quite similar.
+
+  UIO provides the necessary bits to implement user-space drivers. Meanwhile
+  the gasket APIs adds some extra conveniences like PCI BAR mapping, and
+  MSI interrupts. Add these features to the UIO subsystem, then re-implement
+  the Apex driver as a basic UIO driver instead (include/linux/uio_driver.h)
+
 - Document sysfs files with Documentation/ABI/ entries.
+
 - Use misc interface instead of major number for driver version description.
+
 - Add descriptions of module_param's
+
 - apex_get_status() should actually check status.
+
 - "drivers" should never be dealing with "raw" sysfs calls or mess around with
   kobjects at all. The driver core should handle all of this for you
   automaically. There should not be a need for raw attribute macros.

Ciao,

--
Darwish
http://darwish.chasingpointers.com

^ permalink raw reply related	[relevance 91%]

* Re: [PATCH V5 0/4] Fix kvm misconceives NVDIMM pages as reserved mmio
  @ 2018-09-07 17:04 97% ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2018-09-07 17:04 UTC (permalink / raw)
  To: Zhang Yi
  Cc: kvm, linux-kernel, linux-nvdimm, pbonzini, dan.j.williams,
	dave.jiang, yu.c.zhang, pagupta, david, jack, hch, linux-mm,
	rkrcmar, jglisse, yi.z.zhang

Hi!

On Sat, Sep 08, 2018 at 02:03:02AM +0800, Zhang Yi wrote:
[...]
>
> V1:
> https://lkml.org/lkml/2018/7/4/91
>
> V2:
> https://lkml.org/lkml/2018/7/10/135
>
> V3:
> https://lkml.org/lkml/2018/8/9/17
>
> V4:
> https://lkml.org/lkml/2018/8/22/17
>

Can we please avoid referencing "lkml.org"?

It's just an unreliable broken website. [1][2] Much more important
though is that its URLs _hide_ the Message-Id field; running the
threat of losing the e-mail reference forever at some point in the
future.

From Documentation/process/submitting-patches.rst:

    If the patch follows from a mailing list discussion, give a
    URL to the mailing list archive; use the https://lkml.kernel.org/
    redirector with a ``Message-Id``, to ensure that the links
    cannot become stale.

So the V1 link above should've been either:

    https://lore.kernel.org/lkml/cover.1530716899.git.yi.z.zhang@linux.intel.com

or:

    https://lkml.kernel.org/r/cover.1530716899.git.yi.z.zhang@linux.intel.com

and so on..

Thanks,

[1] https://www.theregister.co.uk/2018/01/14/linux_kernel_mailing_list_archives_will_return_soon
[2] The threading interface is also broken and in a lot of cases
    does not show all messages in a thread

--
Darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 97%]

* [PATCH] staging: gasket: TODO: re-implement using UIO
  @ 2018-09-10 15:28 94%         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2018-09-10 15:28 UTC (permalink / raw)
  To: Greg KH; +Cc: Simon Que, Todd Poynor, John Joseph, Rob Springer, LKML, devel

The gasket in-kernel framework, recently introduced under staging,
re-implements what is already long-time provided by the UIO
subsystem, with extra PCI BAR remapping and MSI conveniences.

Before moving it out of staging, make sure we add the new bits to
the UIO framework instead, then transform its signle client, the
Apex driver, to a proper UIO driver (uio_driver.h).

Link: https://lkml.kernel.org/r/20180828103817.GB1397@do-kernel

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---
 drivers/staging/gasket/TODO | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/staging/gasket/TODO b/drivers/staging/gasket/TODO
index 6ff8e01b04cc..5b1865f8af2d 100644
--- a/drivers/staging/gasket/TODO
+++ b/drivers/staging/gasket/TODO
@@ -1,9 +1,22 @@
 This is a list of things that need to be done to get this driver out of the
 staging directory.
+
+- Implement the gasket framework's functionality through UIO instead of
+  introducing a new user-space drivers framework that is quite similar.
+
+  UIO provides the necessary bits to implement user-space drivers. Meanwhile
+  the gasket APIs adds some extra conveniences like PCI BAR mapping, and
+  MSI interrupts. Add these features to the UIO subsystem, then re-implement
+  the Apex driver as a basic UIO driver instead (include/linux/uio_driver.h)
+
 - Document sysfs files with Documentation/ABI/ entries.
+
 - Use misc interface instead of major number for driver version description.
+
 - Add descriptions of module_param's
+
 - apex_get_status() should actually check status.
+
 - "drivers" should never be dealing with "raw" sysfs calls or mess around with
   kobjects at all. The driver core should handle all of this for you
   automaically. There should not be a need for raw attribute macros.


--
Darwi
http://darwish.chasingpointers.com

^ permalink raw reply related	[relevance 94%]

* Re: [PATCH 01/10] procfs: add smack subdir to attrs
  @ 2018-09-11 23:45 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2018-09-11 23:45 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: LSM, James Morris, LKM, SE Linux, John Johansen, Kees Cook,
	Tetsuo Handa, Paul Moore, Stephen Smalley, Schaufler, Casey

On Tue, Sep 11, 2018 at 09:41:32AM -0700, Casey Schaufler wrote:
> Back in 2007 I made what turned out to be a rather serious
> mistake in the implementation of the Smack security module.
> The SELinux module used an interface in /proc to manipulate
> the security context on processes. Rather than use a similar
> interface, I used the same interface. The AppArmor team did
> likewise. Now /proc/.../attr/current will tell you the
> security "context" of the process, but it will be different
> depending on the security module you're using.
>
> This patch provides a subdirectory in /proc/.../attr for
> Smack. Smack user space can use the "current" file in
> this subdirectory and never have to worry about getting
> SELinux attributes by mistake. Programs that use the
> old interface will continue to work (or fail, as the case
> may be) as before.
>

Did downstream distributions already merge the stacking patches on
their own?

Got a little-bit confused after reading the log above; I already see
this in in Ubuntu 18.04.1 LTS, v4.15.0-33-generic:

    $ tree /proc/self/attr/
    /proc/self/attr/
    ├── apparmor
    │   ├── current
    │   ├── exec
    │   └── prev
    ├── current
    ├── display_lsm
    ├── exec
    ├── fscreate
    ├── keycreate
    ├── prev
    ├── selinux
    │   ├── current
    │   ├── exec
    │   ├── fscreate
    │   ├── keycreate
    │   ├── prev
    │   └── sockcreate
    ├── smack
    │   └── current
    └── sockcreate

Thanks,

--
Darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v3 3/3] drivers: soc: xilinx: Add ZynqMP PM driver
  @ 2018-09-12  1:10 95% ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2018-09-12  1:10 UTC (permalink / raw)
  To: Jolly Shah
  Cc: matthias.bgg, andy.gross, shawnguo, geert+renesas,
	bjorn.andersson, sean.wang, m.szyprowski, michal.simek, robh+dt,
	mark.rutland, rajanv, devicetree, linux-arm-kernel, linux-kernel,
	Rajan Vaja, Jolly Shah

Hi!

[ Thanks a lot for upstreaming this.. ]

On Tue, Sep 11, 2018 at 02:34:57PM -0700, Jolly Shah wrote:
> From: Rajan Vaja <rajan.vaja@xilinx.com>
>
> Add ZynqMP PM driver. PM driver provides power management
> support for ZynqMP.
>
> Signed-off-by: Rajan Vaja <rajan.vaja@xilinx.com>
> Signed-off-by: Jolly Shah <jollys@xilinx.com>
> ---

[...]

> +static irqreturn_t zynqmp_pm_isr(int irq, void *data)
> +{
> +	u32 payload[CB_PAYLOAD_SIZE];
> +
> +	zynqmp_pm_get_callback_data(payload);
> +
> +	/* First element is callback API ID, others are callback arguments */
> +	if (payload[0] == PM_INIT_SUSPEND_CB) {
> +		if (work_pending(&zynqmp_pm_init_suspend_work->callback_work))
> +			goto done;
> +
> +		/* Copy callback arguments into work's structure */
> +		memcpy(zynqmp_pm_init_suspend_work->args, &payload[1],
> +		       sizeof(zynqmp_pm_init_suspend_work->args));
> +
> +		queue_work(system_unbound_wq,
> +			   &zynqmp_pm_init_suspend_work->callback_work);

We already have devm_request_threaded_irq() which can does this
automatically for us.

Use that method to register the ISR instead, then if there's more
work to do, just do the memcpy and return IRQ_WAKE_THREAD.

> +	}
> +
> +done:
> +	return IRQ_HANDLED;
> +}
> +
> +/**
> + * zynqmp_pm_init_suspend_work_fn() - Initialize suspend
> + * @work:	Pointer to work_struct
> + *
> + * Bottom-half of PM callback IRQ handler.
> + */
> +static void zynqmp_pm_init_suspend_work_fn(struct work_struct *work)
> +{
> +	struct zynqmp_pm_work_struct *pm_work =
> +		container_of(work, struct zynqmp_pm_work_struct, callback_work);
> +
> +	if (pm_work->args[0] == ZYNQMP_PM_SUSPEND_REASON_SYSTEM_SHUTDOWN) {

we_really_seem_to_love_long_40_col_names_for_some_reason

> +		orderly_poweroff(true);
> +	} else if (pm_work->args[0] ==
> +		   ZYNQMP_PM_SUSPEND_REASON_POWER_UNIT_REQUEST) {

Ditto

[...]

> +/**
> + * zynqmp_pm_sysfs_init() - Initialize PM driver sysfs interface
> + * @dev:	Pointer to device structure
> + *
> + * Return: 0 on success, negative error code otherwise
> + */
> +static int zynqmp_pm_sysfs_init(struct device *dev)
> +{
> +	return sysfs_create_file(&dev->kobj, &dev_attr_suspend_mode.attr);
> +}
> +

Sysfs file is created in platform driver's probe(), but is not
removed anywhere in the code.

What happens if this is built as a module? Am I missing something
obvious?

Moreover, what's the wisdom of creating a one-liner function with
a huge six-line comment that:

    a) _purely_ wraps sysfs_create_file(); no extra logic
    b) is called only once
    c) and not passed as a function pointer anywhere

IMO Such one-liner translators obfuscate the code and review
process with no apparent gain..

> +/**
> + * zynqmp_pm_probe() - Probe existence of the PMU Firmware
> + *		       and initialize debugfs interface
> + *
> + * @pdev:	Pointer to the platform_device structure
> + *
> + * Return: Returns 0 on success, negative error code otherwise
> + */

Again, huge 8-line comment that provide no value.

If anyone wants to know what a platform driver probe() does, he
or she better check the primary references at:

    - Documentation/driver-model/platform.txt
    - include/linux/platform_device.h

and not the comment above..

> +static int zynqmp_pm_probe(struct platform_device *pdev)
> +{
> +	int ret, irq;
> +	u32 pm_api_version;
> +
> +	const struct zynqmp_eemi_ops *eemi_ops = zynqmp_pm_get_eemi_ops();
> +
> +	if (!eemi_ops || !eemi_ops->get_api_version || !eemi_ops->init_finalize)
> +		return -ENXIO;
> +
> +	eemi_ops->init_finalize();
> +	eemi_ops->get_api_version(&pm_api_version);
> +
> +	/* Check PM API version number */
> +	if (pm_api_version < ZYNQMP_PM_VERSION)
> +		return -ENODEV;
> +
> +	irq = platform_get_irq(pdev, 0);
> +	if (irq <= 0)
> +		return -ENXIO;
> +
> +	ret = devm_request_irq(&pdev->dev, irq, zynqmp_pm_isr, IRQF_SHARED,
> +			       dev_name(&pdev->dev), pdev);
> +	if (ret) {
> +		dev_err(&pdev->dev, "request_irq '%d' failed with %d\n",
> +			irq, ret);
> +		return ret;
> +	}
> +
> +	zynqmp_pm_init_suspend_work =
> +		devm_kzalloc(&pdev->dev, sizeof(struct zynqmp_pm_work_struct),
> +			     GFP_KERNEL);
> +	if (!zynqmp_pm_init_suspend_work)
> +		return -ENOMEM;
> +
> +	INIT_WORK(&zynqmp_pm_init_suspend_work->callback_work,
> +		  zynqmp_pm_init_suspend_work_fn);
> +

Let's use devm_request_threaded_irq(). Then we can completely
remove the work_struct, INIT_WORK(), and queuue_work() bits.

> +	ret = zynqmp_pm_sysfs_init(&pdev->dev);
> +	if (ret) {
> +		dev_err(&pdev->dev, "unable to initialize sysfs interface\n");
> +		return ret;
> +	}
> +
> +	return ret;

Just return 0 please. BTW ret was declared without initialization.

> +}
> +
> +static const struct of_device_id pm_of_match[] = {
> +	{ .compatible = "xlnx,zynqmp-power", },
> +	{ /* end of table */ },
> +};
> +MODULE_DEVICE_TABLE(of, pm_of_match);
> +
> +static struct platform_driver zynqmp_pm_platform_driver = {
> +	.probe = zynqmp_pm_probe,
> +	.driver = {
> +		.name = "zynqmp_power",
> +		.of_match_table = pm_of_match,
> +	},
> +};

.remove with a basic sysfs_remove_file() is needed.

Thanks,

--
Darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 95%]

* [PATCH] x86/defconfig: Remove archaic partition tables support
@ 2019-03-06  0:44 99% Ahmed S. Darwish
  2019-04-19 10:36 84% ` [tip:x86/build] " tip-bot for Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-03-06  0:44 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, x86, linux-kernel

On x86 systems, only MSDOS and GPT partition tables are typically
encountered. Remove all the rest.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

Notes:
    CONFIG_EFI_PARTITION is also removed since it defaults to `y'.

 arch/x86/configs/i386_defconfig   | 12 ------------
 arch/x86/configs/x86_64_defconfig | 12 ------------
 2 files changed, 24 deletions(-)

diff --git a/arch/x86/configs/i386_defconfig b/arch/x86/configs/i386_defconfig
index 4bb95d7ad947..838c49a30738 100644
--- a/arch/x86/configs/i386_defconfig
+++ b/arch/x86/configs/i386_defconfig
@@ -25,18 +25,6 @@ CONFIG_JUMP_LABEL=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
-CONFIG_PARTITION_ADVANCED=y
-CONFIG_OSF_PARTITION=y
-CONFIG_AMIGA_PARTITION=y
-CONFIG_MAC_PARTITION=y
-CONFIG_BSD_DISKLABEL=y
-CONFIG_MINIX_SUBPARTITION=y
-CONFIG_SOLARIS_X86_PARTITION=y
-CONFIG_UNIXWARE_DISKLABEL=y
-CONFIG_SGI_PARTITION=y
-CONFIG_SUN_PARTITION=y
-CONFIG_KARMA_PARTITION=y
-CONFIG_EFI_PARTITION=y
 CONFIG_SMP=y
 CONFIG_X86_GENERIC=y
 CONFIG_HPET_TIMER=y
diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig
index 0fed049422a8..05e057efca28 100644
--- a/arch/x86/configs/x86_64_defconfig
+++ b/arch/x86/configs/x86_64_defconfig
@@ -24,18 +24,6 @@ CONFIG_JUMP_LABEL=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
-CONFIG_PARTITION_ADVANCED=y
-CONFIG_OSF_PARTITION=y
-CONFIG_AMIGA_PARTITION=y
-CONFIG_MAC_PARTITION=y
-CONFIG_BSD_DISKLABEL=y
-CONFIG_MINIX_SUBPARTITION=y
-CONFIG_SOLARIS_X86_PARTITION=y
-CONFIG_UNIXWARE_DISKLABEL=y
-CONFIG_SGI_PARTITION=y
-CONFIG_SUN_PARTITION=y
-CONFIG_KARMA_PARTITION=y
-CONFIG_EFI_PARTITION=y
 CONFIG_SMP=y
 CONFIG_CALGARY_IOMMU=y
 CONFIG_NR_CPUS=64
-- 
2.21.0




^ permalink raw reply related	[relevance 99%]

* DRM-based Oops viewer
@ 2019-03-10  1:31 81% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-03-10  1:31 UTC (permalink / raw)
  To: David Airlie, Daniel Vetter, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Alex Deucher, Christian König, David Zhou,
	Ard Biesheuvel, Matt Fleming
  Cc: Linus Torvalds, Greg Kroah-Hartman, John Ogness, dri-devel, linux-kernel

Hello DRM/UEFI maintainers,

Several years ago, I wrote a set of patches to dump the kernel
log to disk upon panic -- through BIOS INT 0x13 services. [1]

The overwhelming response was that it's unsafe to do this in a
generic manner. Linus proposed a video-based viewer instead: [2]

    If you want to do the BIOS services thing, do it for video: copy the
    oops to low RAM, return to real mode, re-run the graphics card POST
    routines to initialize text-mode, and use the BIOS to print out the
    oops.  That is WAY less scary than writing to disk.

Of course it's 2019 now though, and it's quite known that
Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]

Researching whether this can be done from UEFI, it was also clear
that UEFI "Runtime Services" do not provide any re-initialization
routines. [4]

The maximum possible that UEFI can provide is a GOP-provided
framebuffer that's ready to use by the OS -- even after the UEFI
boot phase is marked as done through ExitBootServices(). [5]

Of course, once native drivers like i915 or radeon take over,
such a framebuffer is toast... [6]

Thus a possible remaining option, is to display the oops through
"minimal" DRM drivers provided for each HW variant... Since
these special drivers will run only and fully under a panic()
context though, several constraints exist:

  - The code should be fully synchronous (irqs are disabled)
  - It should not allocate any dynamic memory
  - It should make minimal assumptions about HW state
  - It should not chain into any other kernel subsystem
  - It has ample freedom to use delay-based loops and the
    like, the kernel is already dead.

How feasible is it to have such a special "DRM viewoops"
framework + its minimal drivers in the kernel?

The target is to start from i915, since that's what in my
laptop now, and work from there..

Some final notes:

  - The NT kernel has a similar concept, but for storage instead.
    They're used to dump core under kernel panic() situations,
    and are called "Minoport storage drivers". [7]

  - Since Windows 7+, a very fancy Blue Screen of Death is
    displayed, with Unicode and whatnot, implying GPU drivers
    involvement. [8]

  - Mac OS X also does something similar [9]

  - On Linux laptops, the current situation is _really_ bad.

    In any graphical session, type "echo c > /proc/sysrq-trigger";
    the screen will just completely freeze...

    Desired first goal: just print the panic() log

Thanks a lot,

[1] https://lore.kernel.org/lkml/20110125134748.GA10051@laptop
[2] https://lore.kernel.org/lkml/AANLkTinU0KYiCd4p=z+=ojbkeEoT2G+CAYvdRU02KJEn@mail.gmail.com

[3] https://uefi.org/sites/default/files/resources/Brian_Richardson_Intel_Final.pdf

[4] UEFI v2.7 spec, Chapter 8, "Services — Runtime Services"
[5] UEFI v2.7 spec, Section 12.9, "Graphics Output Protocol"
    "The Graphics Output Protocol supports this capability by
     providing the EFI OS loader access to a hardware frame buffer
     and enough information to allow the OS to draw directly to
     the graphics output device."

[6] linux/drivers/gpu/drm/i915/i915_drv.c::i915_kick_out_firmware_fb()
    linux/drivers/gpu/drm/radeon/radeon_drv.c::radeon_pci_probe()

[7] https://docs.microsoft.com/en-us/windows-hardware/drivers/storage/restrictions-on-miniport-drivers-that-manage-the-boot-drive

[8] https://upload.wikimedia.org/wikipedia/commons/archive/5/56/20181019151937%21Bsodwindows10.png
[9] https://upload.wikimedia.org/wikipedia/commons/4/4a/Mac_OS_X_10.2_Kernel_Panic.jpg

--darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 81%]

* [PATCH] printk: kmsg_dump: Mark registered flag as private
@ 2019-03-10 20:03 99% Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-03-10 20:03 UTC (permalink / raw)
  To: Petr Mladek, Sergey Senozhatsky; +Cc: Steven Rostedt, John Ogness, linux-kernel

The 'registered' flag is internally used by kmsg_dump_register()
and kmsg_dump_unregister() to track multiple registrations of the
same dumper.

It's protected by printk's internal dump_list_lock, and must thus
be accessed only from there. Mark it as private.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---
 include/linux/kmsg_dump.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h
index 2e7a1e032c71..7c08cb58259a 100644
--- a/include/linux/kmsg_dump.h
+++ b/include/linux/kmsg_dump.h
@@ -36,7 +36,7 @@ enum kmsg_dump_reason {
  * @dump:	Call into dumping code which will retrieve the data with
  * 		through the record iterator
  * @max_reason:	filter for highest reason number that should be dumped
- * @registered:	Flag that specifies if this is already registered
+ * @registered:	Flag that specifies if this is already registered (private)
  */
 struct kmsg_dumper {
 	struct list_head list;
--
2.21.0

^ permalink raw reply related	[relevance 99%]

* Re: DRM-based Oops viewer
    @ 2019-03-11 22:12 86%   ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-03-11 22:12 UTC (permalink / raw)
  To: Jani Nikula
  Cc: David Airlie, Daniel Vetter, Joonas Lahtinen, Rodrigo Vivi,
	Alex Deucher, Christian König, David Zhou, Ard Biesheuvel,
	Matt Fleming, Linus Torvalds, Greg Kroah-Hartman, John Ogness,
	dri-devel, linux-kernel

Hello Jani,

On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> > Hello DRM/UEFI maintainers,
> >
> > Several years ago, I wrote a set of patches to dump the kernel
> > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> >
> > The overwhelming response was that it's unsafe to do this in a
> > generic manner. Linus proposed a video-based viewer instead: [2]
> >
> >     If you want to do the BIOS services thing, do it for video: copy the
> >     oops to low RAM, return to real mode, re-run the graphics card POST
> >     routines to initialize text-mode, and use the BIOS to print out the
> >     oops.  That is WAY less scary than writing to disk.
> >
> > Of course it's 2019 now though, and it's quite known that
> > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> >
> > Researching whether this can be done from UEFI, it was also clear
> > that UEFI "Runtime Services" do not provide any re-initialization
> > routines. [4]
> >
> > The maximum possible that UEFI can provide is a GOP-provided
> > framebuffer that's ready to use by the OS -- even after the UEFI
> > boot phase is marked as done through ExitBootServices(). [5]
> >
> > Of course, once native drivers like i915 or radeon take over,
> > such a framebuffer is toast... [6]
> >
> > Thus a possible remaining option, is to display the oops through
> > "minimal" DRM drivers provided for each HW variant... Since
> > these special drivers will run only and fully under a panic()
> > context though, several constraints exist:
> >
> >   - The code should be fully synchronous (irqs are disabled)
> >   - It should not allocate any dynamic memory
> >   - It should make minimal assumptions about HW state
> >   - It should not chain into any other kernel subsystem
> >   - It has ample freedom to use delay-based loops and the
> >     like, the kernel is already dead.
> >
> > How feasible is it to have such a special "DRM viewoops"
> > framework + its minimal drivers in the kernel?
>
> Please first better define what you want to achieve.
>

Oh I thought this was clear..

What I want to achieve is:

  - for normal day-to-day x86 laptops users
  - properly inform the user when a kernel panic happens during
    a running graphical session (e.g. wayland/gnome/kde/...).

The current situation is dismal: the screen _just freezes_, and
users are left wondering what the heck has really happened to
their system (?)

Some out-of-the-box notification mechanism, for everyday distros
like Fedora and Ubuntu that can be enabled by default, would
improve the situation considerably..

Yes, there are many _developer_ features that can mitigate the
issue somewhat, but they're not really useful for everyday "normal"
usage:

  - netconsole is definitely not an option. It implies a lab
    setting where two machines are always connected through a
    network.

  - ramoops is _completely irrelevant_ for normal users. It's
    mostly for embedded systems and the like; requires intimate
    knowledge of the hardware by the user translated into DT
    bindings or special platform_data struct..

  - kexec/kdump partially solves the save-to-disk problem, but
    doesn't solve the user notification part..

    Maybe a new "kexec/kview" solution can be useful, but
    distributions don't enable kexec/k* solutions _by default_.

    Maybe they fear the extra burden of maintaining two kernels
    at the same time? or the requirement of reserving memory
    for the crashkernel beforehand?

    Linux should not just "freeze the screen" upon panic, even
    if a crashkernel is not present .. _some_ sane default
    built-in user notification mechanism should be there.

  - efivars are neat, they partially solve the save-to-disk
    problem, but does not solve the user notification problem.

    Moreover, they always carry the risk of bricking laptops..

> Do you want to store the dmesg or oops (like your original series
> suggests) or do you want to display the oops?

The original save-to-disk series was only shown for context.
This is a pure display solution; no disk is invovled at all.

> Do you want the facility to be functioning at all times, or only
> when specifically requested in advance by the user?

At all times, as a basic "sane default" for laptop-oriented
distributions to enable (think ubuntu, fedora, mint, etc.)

> If you want to display the oops, do you want it to
> also work when the display is disabled at the time of the oops?

If the screen is disabled, then this is definitely out of scope.

This only deals with classical laptop usage scenarios, where we
want to notify the user that something went wrong at the kernel
level.

> the display is at attached to a port on a dock?
>

This is a much bigger scope that's not important at this stage.

If I'm attaching my laptop to a projector and the kernel panics,
notifying the user only in the main laptop screen should be
enough.

> There's at least kdump, ramoops, and netconsole that can be used to
> achieve some of what you want. How do they fall short for you?
>

Hopfully my answers above provided more insight of why these
solutions fall short..

> BR,
> Jani.
>

Thanks!
--darwi

>
> >
> > The target is to start from i915, since that's what in my
> > laptop now, and work from there..
> >
> > Some final notes:
> >
> >   - The NT kernel has a similar concept, but for storage instead.
> >     They're used to dump core under kernel panic() situations,
> >     and are called "Minoport storage drivers". [7]
> >
> >   - Since Windows 7+, a very fancy Blue Screen of Death is
> >     displayed, with Unicode and whatnot, implying GPU drivers
> >     involvement. [8]
> >
> >   - Mac OS X also does something similar [9]
> >
> >   - On Linux laptops, the current situation is _really_ bad.
> >
> >     In any graphical session, type "echo c > /proc/sysrq-trigger";
> >     the screen will just completely freeze...
> >
> >     Desired first goal: just print the panic() log
> >
> > Thanks a lot,
> >
> > [1] https://lore.kernel.org/lkml/20110125134748.GA10051@laptop
> > [2] https://lore.kernel.org/lkml/AANLkTinU0KYiCd4p=z+=ojbkeEoT2G+CAYvdRU02KJEn@mail.gmail.com
> >
> > [3] https://uefi.org/sites/default/files/resources/Brian_Richardson_Intel_Final.pdf
> >
> > [4] UEFI v2.7 spec, Chapter 8, "Services — Runtime Services"
> > [5] UEFI v2.7 spec, Section 12.9, "Graphics Output Protocol"
> >     "The Graphics Output Protocol supports this capability by
> >      providing the EFI OS loader access to a hardware frame buffer
> >      and enough information to allow the OS to draw directly to
> >      the graphics output device."
> >
> > [6] linux/drivers/gpu/drm/i915/i915_drv.c::i915_kick_out_firmware_fb()
> >     linux/drivers/gpu/drm/radeon/radeon_drv.c::radeon_pci_probe()
> >
> > [7] https://docs.microsoft.com/en-us/windows-hardware/drivers/storage/restrictions-on-miniport-drivers-that-manage-the-boot-drive
> >
> > [8] https://upload.wikimedia.org/wikipedia/commons/archive/5/56/20181019151937%21Bsodwindows10.png
> > [9] https://upload.wikimedia.org/wikipedia/commons/4/4a/Mac_OS_X_10.2_Kernel_Panic.jpg
> >
> > --darwi
> > http://darwish.chasingpointers.com
>
> --
> Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[relevance 86%]

* Re: DRM-based Oops viewer
  @ 2019-03-11 23:39 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-03-11 23:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jani Nikula, David Airlie, Joonas Lahtinen, Rodrigo Vivi,
	Alex Deucher, Christian König, David Zhou, Ard Biesheuvel,
	Matt Fleming, Linus Torvalds, Greg Kroah-Hartman, John Ogness,
	dri-devel, linux-kernel

On Mon, Mar 11, 2019 at 02:49:41PM +0100, Daniel Vetter wrote:
> On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> > On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> > > Hello DRM/UEFI maintainers,
> > >
> > > Several years ago, I wrote a set of patches to dump the kernel
> > > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> > >
> > > The overwhelming response was that it's unsafe to do this in a
> > > generic manner. Linus proposed a video-based viewer instead: [2]
> > >
> > >     If you want to do the BIOS services thing, do it for video: copy the
> > >     oops to low RAM, return to real mode, re-run the graphics card POST
> > >     routines to initialize text-mode, and use the BIOS to print out the
> > >     oops.  That is WAY less scary than writing to disk.
> > >
> > > Of course it's 2019 now though, and it's quite known that
> > > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> > >
> > > Researching whether this can be done from UEFI, it was also clear
> > > that UEFI "Runtime Services" do not provide any re-initialization
> > > routines. [4]
> > >
> > > The maximum possible that UEFI can provide is a GOP-provided
> > > framebuffer that's ready to use by the OS -- even after the UEFI
> > > boot phase is marked as done through ExitBootServices(). [5]
> > >
> > > Of course, once native drivers like i915 or radeon take over,
> > > such a framebuffer is toast... [6]
> > >
> > > Thus a possible remaining option, is to display the oops through
> > > "minimal" DRM drivers provided for each HW variant... Since
> > > these special drivers will run only and fully under a panic()
> > > context though, several constraints exist:
> > >
> > >   - The code should be fully synchronous (irqs are disabled)
> > >   - It should not allocate any dynamic memory
> > >   - It should make minimal assumptions about HW state
> > >   - It should not chain into any other kernel subsystem
> > >   - It has ample freedom to use delay-based loops and the
> > >     like, the kernel is already dead.
> > >
> > > How feasible is it to have such a special "DRM viewoops"
> > > framework + its minimal drivers in the kernel?
> >
> > Please first better define what you want to achieve.
> >
> > Do you want to store the dmesg or oops (like your original series
> > suggests) or do you want to display the oops? Do you want the facility
> > to be functioning at all times, or only when specifically requested in
> > advance by the user? If you want to display the oops, do you want it to
> > also work when the display is disabled at the time of the oops? What if
> > the display is at attached to a port on a dock?
> >
> > There's at least kdump, ramoops, and netconsole that can be used to
> > achieve some of what you want. How do they fall short for you?
>
> Assuming the use-case is to get an oops to display on a kms driver, we do
> have a fairly comprehensive plan of what that's should look like:
>
> https://dri.freedesktop.org/docs/drm/gpu/todo.html#make-panic-handling-work
>
> This takes into account all the failed previous attempts at trying to get
> an oops to display. It's conceptually a match with your viewoops framework
> I think.

Thanks a lot Daniel for the reference! Yup, this is a conceptual
match indeed!

It's great to know that at the maintainer level there is some
agreement on, awareness of, and plans for, the general topic...

I'll jump to Noralf Trønnes's just-posted patches then and see how
to move from there :)

all the best,

--darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: [PATCH] printk: kmsg_dump: Mark registered flag as private
  @ 2019-03-12 20:05 99%   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-03-12 20:05 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Petr Mladek, Steven Rostedt, John Ogness, linux-kernel

Hi,

On Mon, Mar 11, 2019 at 09:49:05PM +0900, Sergey Senozhatsky wrote:
> On (03/10/19 21:03), Ahmed S. Darwish wrote:
> > The 'registered' flag is internally used by kmsg_dump_register()
> > and kmsg_dump_unregister() to track multiple registrations of the
> > same dumper.
> >
> > It's protected by printk's internal dump_list_lock, and must thus
> > be accessed only from there. Mark it as private.
> >
> > Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
> > ---
> >  include/linux/kmsg_dump.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h
> > index 2e7a1e032c71..7c08cb58259a 100644
> > --- a/include/linux/kmsg_dump.h
> > +++ b/include/linux/kmsg_dump.h
> > @@ -36,7 +36,7 @@ enum kmsg_dump_reason {
> >   * @dump:	Call into dumping code which will retrieve the data with
> >   * 		through the record iterator
> >   * @max_reason:	filter for highest reason number that should be dumped
> > - * @registered:	Flag that specifies if this is already registered
> > + * @registered:	Flag that specifies if this is already registered (private)
> >   */
> >  struct kmsg_dumper {
> >  	struct list_head list;
>
>
> Do we really do this thing?
>
>
> $ git grep "(private)" include/linux/
> include/linux/kmsg_dump.h: * @list:     Entry in the dumper list (private)
> include/linux/uwb.h: * specific (private) DevAddr (UWB_RSV_TARGET_DEVADDR).
>

Hmmm, while writing a kmsg_dumper for [1], I noticed that struct
kmsg_dumper is:

    /**
     * struct kmsg_dumper - kernel crash message dumper structure
     * @list:        Entry in the dumper list (private)  <== *
     * ...
     * @registered:  Flag that specifies if this is already registered
     */
    struct kmsg_dumper {
    	struct list_head list;
    	...
    	bool registered;
    	/* private state of the kmsg iterator */         <== *
    	...
    };

_All_ private members are annotated (<== *), so this gave the
impression that 'bool registered' was public..

Then I discovered from printk.c code that it's actually private,
and protected by the printk's internal dump_list_lock...

So this trivial patch was submitted for consistency.

thanks,

[1] https://lore.kernel.org/lkml/20190310013142.GA3376@darwi-home-pc

--
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* [tip:x86/build] x86/defconfig: Remove archaic partition tables support
  2019-03-06  0:44 99% [PATCH] x86/defconfig: Remove archaic partition tables support Ahmed S. Darwish
@ 2019-04-19 10:36 84% ` tip-bot for Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: tip-bot for Ahmed S. Darwish @ 2019-04-19 10:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, darwish.07, linux-kernel, hpa, peterz, tglx, torvalds, bp

Commit-ID:  93ddedaa5c9cb828d39a19b5d6e7e1939393085a
Gitweb:     https://git.kernel.org/tip/93ddedaa5c9cb828d39a19b5d6e7e1939393085a
Author:     Ahmed S. Darwish <darwish.07@gmail.com>
AuthorDate: Wed, 6 Mar 2019 01:44:25 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 19 Apr 2019 12:29:48 +0200

x86/defconfig: Remove archaic partition tables support

On x86 systems, only MSDOS and GPT partition tables are typically
encountered. Remove all the rest.

Note, CONFIG_EFI_PARTITION is also removed since it defaults to `y'.

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20190306004425.GA30537@darwi-home-pc
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/configs/i386_defconfig   | 12 ------------
 arch/x86/configs/x86_64_defconfig | 12 ------------
 2 files changed, 24 deletions(-)

diff --git a/arch/x86/configs/i386_defconfig b/arch/x86/configs/i386_defconfig
index 9f908112bbb9..2b2481acc661 100644
--- a/arch/x86/configs/i386_defconfig
+++ b/arch/x86/configs/i386_defconfig
@@ -25,18 +25,6 @@ CONFIG_JUMP_LABEL=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
-CONFIG_PARTITION_ADVANCED=y
-CONFIG_OSF_PARTITION=y
-CONFIG_AMIGA_PARTITION=y
-CONFIG_MAC_PARTITION=y
-CONFIG_BSD_DISKLABEL=y
-CONFIG_MINIX_SUBPARTITION=y
-CONFIG_SOLARIS_X86_PARTITION=y
-CONFIG_UNIXWARE_DISKLABEL=y
-CONFIG_SGI_PARTITION=y
-CONFIG_SUN_PARTITION=y
-CONFIG_KARMA_PARTITION=y
-CONFIG_EFI_PARTITION=y
 CONFIG_SMP=y
 CONFIG_X86_GENERIC=y
 CONFIG_HPET_TIMER=y
diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig
index 1d3badfda09e..e8829abf063a 100644
--- a/arch/x86/configs/x86_64_defconfig
+++ b/arch/x86/configs/x86_64_defconfig
@@ -24,18 +24,6 @@ CONFIG_JUMP_LABEL=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 CONFIG_MODULE_FORCE_UNLOAD=y
-CONFIG_PARTITION_ADVANCED=y
-CONFIG_OSF_PARTITION=y
-CONFIG_AMIGA_PARTITION=y
-CONFIG_MAC_PARTITION=y
-CONFIG_BSD_DISKLABEL=y
-CONFIG_MINIX_SUBPARTITION=y
-CONFIG_SOLARIS_X86_PARTITION=y
-CONFIG_UNIXWARE_DISKLABEL=y
-CONFIG_SGI_PARTITION=y
-CONFIG_SUN_PARTITION=y
-CONFIG_KARMA_PARTITION=y
-CONFIG_EFI_PARTITION=y
 CONFIG_SMP=y
 CONFIG_CALGARY_IOMMU=y
 CONFIG_NR_CPUS=64

^ permalink raw reply related	[relevance 84%]

* Re: Linux 5.3-rc8
  @ 2019-09-10  4:21 96% ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-10  4:21 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger, Linus Torvalds
  Cc: Jan Kara, zhangjs, linux-ext4, linux-kernel

Hi,

On Sun, Sep 08, 2019 at 01:59:27PM -0700, Linus Torvalds wrote:
> So we probably didn't strictly need an rc8 this release, but with LPC
> and the KS conference travel this upcoming week it just makes
> everything easier.
>

The commit b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), [1]
which was merged in v5.3-rc1, *always* leads to a blocked boot on my
system due to low entropy.

The hardware is not a VM: it's a Thinkpad E480 (i5-8250U CPU), with
a standard Arch user-space.

It was discovered through bisecting the problem v5.2 => v5.3-rc1,
since v5.2 never had any similar issues. The issue still persists in
v5.3-rc8: reverting that commit always fixes the problem.

It seems that batching the directory lookup I/O requests (which are
possibly a lot during boot) is minimizing sources of disk-activity-
induced entropy? [2] [3]

Can this even be considered a user-space breakage? I'm honestly not
sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
early-on fixes the problem. I'm not sure about the status of older
CPUs though.

Thanks,

[1]
  commit b03755ad6f33b7b8cd7312a3596a2dbf496de6e7
  Author: zhangjs <zachary@baishancloud.com>
  Date:   Wed Jun 19 23:41:29 2019 -0400

      ext4: make __ext4_get_inode_loc plug

      Add a blk_plug to prevent the inode table readahead from being
      submitted as small I/O requests.

      Signed-off-by: zhangjs <zachary@baishancloud.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>

[2] https://lkml.kernel.org/r/20190619122457.GF27954@quack2.suse.cz

[3] block/blk-core.c :: blk_start_plug()

--
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 96%]

* Re: Linux 5.3-rc8
  @ 2019-09-10 17:33 96%     ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-10 17:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, zhangjs, linux-ext4, lkml

On Tue, Sep 10, 2019 at 12:33:12PM +0100, Linus Torvalds wrote:
> On Tue, Sep 10, 2019 at 5:21 AM Ahmed S. Darwish <darwish.07@gmail.com> wrote:
> >
> > The commit b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), [1]
> > which was merged in v5.3-rc1, *always* leads to a blocked boot on my
> > system due to low entropy.
>
> Exactly what is it that blocks on entropy? Nobody should do that
> during boot, because on some systems entropy is really really low
> (think flash memory with polling IO etc).
>

Ok, I've tracked it down further. It's unfortunately GDM
intentionally blocking on a getrandom(buf, 16, 0).

Booting the system with an straced GDM service
("ExecStart=strace -f /usr/bin/gdm") reveals:

  ...
  [  3.779375] strace[262]: [pid   323] execve("/usr/lib/gnome-session-binary",
                                                 ... /* 28 vars */) = 0
  ...
  [  4.019227] strace[262]: [pid   323] getrandom( <unfinished ...>
  [ 79.601433] kernel: random: crng init done
  [ 79.601443] kernel: random: 3 urandom warning(s) missed due to ratelimiting
  [ 79.601262] strace[262]: [pid   323] <... getrandom resumed>..., 16, 0) = 16
  [ 79.601262] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [ 79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [ 79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16
  [ 79.603041] strace[262]: [pid   323] getrandom(..., 16, 0) = 16

As can be seen in the timestamps, the GDM boot was only continued
by typing randomly on the keyboard..

> That said, I would have expected that any PC gets plenty of entropy.
> Are you sure it's entropy that is blocking, and not perhaps some odd
> "forgot to unplug" situation?
>

Yes, doing any of below steps makes the problem reliably disappear:

  - boot param "random.trust_cpu=on"
  - rngd(8) enabled at boot (entropy source: x86 RDRAND + jitter)
  - pressing random 3 or 4 keyboard keys while GDM boot is stuck

> > Can this even be considered a user-space breakage? I'm honestly not
> > sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
> > early-on fixes the problem. I'm not sure about the status of older
> > CPUs though.
>
> It's definitely breakage, although rather odd. I would have expected
> us to have other sources of entropy than just the disk. Did we stop
> doing low bits of TSC from timer interrupts etc?
>

Exactly.

While gnome-session is obviously at fault here by requiring
*blocking* randomness at the boot path, it's still not requesting
much, just (5 * 16) bytes to be exact.

I guess an x86 laptop should be able to provide that, even without
RDRAND / random.trust_cpu=on (TSC jitter, etc.) ?

thanks,
--darwi

> Ted, either way - ext4 IO patterns or random number entropy - this is
> your code. Comments?
>
>                  Linus

^ permalink raw reply	[relevance 96%]

* Re: Linux 5.3-rc8
    @ 2019-09-11 21:41 99%             ` Ahmed S. Darwish
  2019-09-11 22:37 99%               ` Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-11 21:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Y. Ts'o, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, zhangjs, linux-ext4, lkml

On Wed, Sep 11, 2019 at 05:45:38PM +0100, Linus Torvalds wrote:
> On Wed, Sep 11, 2019 at 5:07 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
> > >
> > > Ted, comments? I'd hate to revert the ext4 thing just because it
> > > happens to expose a bad thing in user space.
> >
> > Unfortuantely, I very much doubt this is going to work.  That's
> > because the add_disk_randomness() path is only used for legacy
> > /dev/random [...]
> >
> > Also, because by default, the vast majority of disks have
> > /sys/block/XXX/queue/add_random set to zero by default.
> 
> Gaah. I was looking at the input randomness, since I thought that was
> where the added randomness that Ahmed got things to work with came
> from.
> 
> And that then made me just look at the legacy disk randomness (for the
> obvious disk IO reasons) and I didn't look further.
>

Yup, I confirm that the quick patch kept the situation as-is. I was
going to debug why, but now we know the answer..

> > So the the way we get entropy these days for initializing the CRNG is
> > via the add_interrupt_randomness() path, where do something really
> > fast, and we assume that we get enough uncertainity from 8 interrupts
> > to give us one bit of entropy (64 interrupts to give us a byte of
> > entropy), and that we need 512 bits of entropy to consider the CRNG
> > fully initialized.  (Yeah, there's a lot of conservatism in those
> > estimates, and so what we could do is decide to say, cut down the
> > number of bits needed to initialize the CRNG to be 256 bits, since
> > that's the size of the CHACHA20 cipher.)
> 
> So that's 4k interrupts if I counted right, and yeah, maybe Ahmed was
> just close enough before, and the merging of the inode table IO then
> took him below that limit.
>
> > Ultimately, though, we need to find *some* way to fix userspace's
> > assumptions that they can always get high quality entropy in early
> > boot, or we need to get over people's distrust of Intel and RDRAND.
>
> Well, even on a PC, sometimes rdrand just isn't there. AMD has screwed
> it up a few times, and older Intel chips just don't have it.
> 
> So I'd be inclined to either lower the limit regardless -

ACK :)

> and perhaps make the "user space asked for randomness much too
> early" be a big *warning* instead of being a basically fatal hung
> machine?

Hmmm, regarding "randomness request much too early", how much is time
really a factor here?

I tested leaving the machine even for 15+ minutes, and it still didn't
continue booting: the boot is practically blocked forever...

Or is the thoery that hopefully once the machine is un-stuck, more
sources of entropy will be available? If that's the case, then
possibly (rate-limited):

  "urandom: process XX asked for YY bytes. CRNG not yet initialized"

> Linus

thanks,

--
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: Linux 5.3-rc8
  2019-09-11 21:41 99%             ` Ahmed S. Darwish
@ 2019-09-11 22:37 99%               ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-11 22:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Y. Ts'o, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, zhangjs, linux-ext4, lkml

On Wed, Sep 11, 2019 at 11:41:44PM +0200, Ahmed S. Darwish wrote:
> On Wed, Sep 11, 2019 at 05:45:38PM +0100, Linus Torvalds wrote:
[...]
> >
> > Well, even on a PC, sometimes rdrand just isn't there. AMD has screwed
> > it up a few times, and older Intel chips just don't have it.
> > 
> > So I'd be inclined to either lower the limit regardless -
> 
> ACK :)
> 
> > and perhaps make the "user space asked for randomness much too
> > early" be a big *warning* instead of being a basically fatal hung
> > machine?
> 
> Hmmm, regarding "randomness request much too early", how much is time
> really a factor here?
> 
> I tested leaving the machine even for 15+ minutes, and it still didn't
> continue booting: the boot is practically blocked forever...
> 
> Or is the thoery that hopefully once the machine is un-stuck, more
> sources of entropy will be available? If that's the case, then
> possibly (rate-limited):
> 
>   "urandom: process XX asked for YY bytes. CRNG not yet initialized"
>
     ^
     getrandom: ....

(since urandom always succeeds, even if CRNG is not inited, and
 it already prints a very similar warning in that case anyway..)

thanks,
--darwi

^ permalink raw reply	[relevance 99%]

* Re: Linux 5.3-rc8
  @ 2019-09-12  3:44 84%                 ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-12  3:44 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Linus Torvalds, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, Alexander E. Patrakov, zhangjs, linux-ext4,
	lkml

Hi Ted,

On Wed, Sep 11, 2019 at 01:36:24PM -0400, Theodore Y. Ts'o wrote:
> On Wed, Sep 11, 2019 at 06:00:19PM +0100, Linus Torvalds wrote:
> >     [    0.231255] random: get_random_bytes called from
> > start_kernel+0x323/0x4f5 with crng_init=0
> >
> > and that's this code:
> >
> >         add_latent_entropy();
> >         add_device_randomness(command_line, strlen(command_line));
> >         boot_init_stack_canary();
> >
> > in particular, it's the boot_init_stack_canary() thing that asks for a
> > random number for the canary.
> >
> > I don't actually see the 'crng init done' until much much later:
> >
> >     [   21.741125] random: crng init done
>
> Yes, that's super early in the boot sequence.  IIRC the stack canary
> gets reinitialized later (or maybe it was only for the other CPU's in
> SMP mode; I don't recall the details of the top of my head).
>
> I think this one always fails, and perhaps we should have a way of
> suppressing it --- but that's correct the in-kernel interface doesn't
> block.
>
> The /dev/urandom device doesn't block either, despite security
> eggheads continually asking me to change it to block ala getrandom(2),
> but I have always pushed because because I *know* changing
> /dev/urandom to block would be asking for userspace regressions.
>
> The compromise we came up with was that since getrandom(2) is a new
> interface, we could make this have the behavior that the security
> heads wanted, which is to make blocking unconditional, since the
> theory was that *this* interface would be sane, and that userspace
> applications which used it too early was buggy, and we could make it
> *their* problem.
>

Hmmmm, IMHO it's almost impossible to define "too early" here... Does
it mean applications in the critical boot path? Does gnome-session =>
libICE => libbsd => getentropy() => getrandom() => generated MIT magic
cookie count as being too early? It's very hazy...

getrandom(2) basically has no guaranteed upper bound for the waiting
time. And in the report I submitted in the parent thread, the upper
bound is really "infinitely locked"...

Here is a trace_printk() log of all the getrandom() calls done from
system boot:

    systemd-random--179   2.510228: getrandom(512 bytes, flags = 1)
    systemd-random--179   2.510239: getrandom(512 bytes, flags = 0)
            polkitd-294   3.903699: getrandom(8 bytes, flags = 1)
            polkitd-294   3.904191: getrandom(8 bytes, flags = 1)

                          ... + 45 similar instances

    gnome-session-b-327   4.400620: getrandom(16 bytes, flags = 0)

                          ... boot blocks here, until
                              pressing some keys

    gnome-session-b-327   49.32140: getrandom(16 bytes, flags = 0)

                          ... + 3 similar instances

        gnome-shell-335   49.553594: getrandom(8 bytes, flags = 1)
        gnome-shell-335   49.553600: getrandom(8 bytes, flags = 1)

                          ... + 10 similar instances

           Xwayland-345   50.129401: getrandom(8 bytes, flags = 1)
           Xwayland-345   50.129491: getrandom(8 bytes, flags = 1)

                          ... + 9 similar instances

        gnome-shell-335   50.487543: getrandom(8 bytes, flags = 1)
        gnome-shell-335   50.487550: getrandom(8 bytes, flags = 1)

                          ... + 79 similar instances

      gsd-xsettings-390   51.431638: getrandom(8 bytes, flags = 1)
      gsd-clipboard-389   51.432693: getrandom(8 bytes, flags = 1)
      gsd-xsettings-390   51.433899: getrandom(8 bytes, flags = 1)
      gsd-smartcard-388   51.433924: getrandom(110 bytes, flags = 0)
      gsd-smartcard-388   51.433936: getrandom(256 bytes, flags = 0)

                          ... + 3 similar instances

And it goes on, including processes like gsd-power-, gsd-xsettings-,
gsd-clipboard-, gsd-print-notif, gsd-clipboard-, gsd-color,
gst-keyboard-, etc.

What's the boundary of "too early" here? It's kinda undefinable..

> People have suggested adding a new getrandom flag, GRND_I_KNOW_THIS_IS_INSECURE,
> or some such, which wouldn't block and would return "best efforts"
> randomness.  I haven't been super enthusiastic about such a flag
> because I *know* it would be insecure.   However, the next time a massive
> security bug shows up on the front pages of the Wall Street Journal,
> or on some web site such as https://factorable.net, it won't be the kernel's fault
> since the flag will be GRND_INSECURE_BROKEN_APPLICATION, or some such.
> It doesn't really solve the problem, though.
>

At least for generating the MIT cookie, it would make some sort of
sense... Really caring about truly random-numbers while using Xorg
is almost like perfecting a hard-metal door for the paper house ;)

(Jokes aside, I understand that this cannot be the solution)

> > But this does show that
> >
> >  (a) we have the same issue in the kernel, and we don't block there
>
> Ultimately, I think the only right answer is to make it the
> bootloader's responsibility to get us some decent entropy at boot
> time.

Just 8 days ago, systemd v243 was released, with systemd-random-seed(8)
now supporting *crediting* the entropy while loading the random seed:

    https://systemd.io/RANDOM_SEEDS

systemd-random-seed do something similar to what OpenBSD does, by
preserving the seed across reboots at /var/lib/systemd/random-seed.

This is not enabled by default though. Will distributions enable it by
default in the future? I have no idea \_(.)_/

> There are patches to allow ARM systems to pass in entropy via
> the device tree.  And in theory (assuming you trust the UEFI BIOS ---
> stop laughing in the back!) we can use that get entropy which will
> solve the problem for UEFI boot systems.

Hmmmm ...

> I've been talking to Ron
> Minnich about trying to get this support into the NERF bootloader, at
> which point new servers from the Open Compute Project will have a
> solution as well.  (We can probably also get solutions for Chrome OS
> devices, since those have TPM-like which are trusted to have a
> comptently engineered hardware RNG --- I'm not sure I would trust all
> TPM devices in commodity hardware, but again, at least we can shift
> blame off of the kernel.  :-P)
>
> Still, these are all point solutions, and don't really solve the
> problem on older systems, or non-x86 systems.
>

For non-x86 _embedded_ systems at least, usually the BSP provider
enables the necessary hwrng driver in question and credit its entropy;
e.g. 62f95ae805fa (hwrng: omap - Set default quality).

> >  (b) initializing the crng really can be a timing problem
> >
> > The interrupt thing is only going to get worse as disks turn into
> > ssd's and some of them end up using polling rather than interrupts..
> > So we're likely to see _fewer_ interrupts in the future, not more.
>
> Yeah, agreed.  Maybe we should have an "insecure_randomness" boot
> option which blindly forces the CRNG to be initialized at boot, so
> that at least people can get to a command line, if insecurely?  I
> don't have any good ideas about how to solve this problem in general.
> :-( :-( :-(
>
> 						- Ted

Yeah, this is a hard engineering problem. You've earlier summarized it
perfectly here:

    https://lore.kernel.org/r/20180514003034.GI14763@thunk.org

I guess, to summarize earlier e-mails, a nice path would be:

    1. Cutting down the number of bits needed to initialize the CRNG
       to 256 bits (CHACHA20 cipher)

    2. Complaining loudly when getrandom() is used while the CRNG is
       not yet initialized.

    3. Hopefully #2 will force distributions to act: either trusting
       RDRANDOM when it's sane, configuring systmed-random-seed(8) to
       credit the entropy by default, etc.

Thanks!

--
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 84%]

* Re: Linux 5.3-rc8
    @ 2019-09-14  9:25 83%                     ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-14  9:25 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Linus Torvalds, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, Alexander E. Patrakov, zhangjs, linux-ext4,
	lkml

On Thu, Sep 12, 2019 at 04:25:30AM -0400, Theodore Y. Ts'o wrote:
> On Thu, Sep 12, 2019 at 05:44:21AM +0200, Ahmed S. Darwish wrote:
[...]
> 
> >     1. Cutting down the number of bits needed to initialize the CRNG
> >        to 256 bits (CHACHA20 cipher)
> 
> Does the attach patch (see below) help?
>
[...]
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 5d5ea4ce1442..b9b3a5a82abf 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -500,7 +500,7 @@ static int crng_init = 0;
>  #define crng_ready() (likely(crng_init > 1))
>  static int crng_init_cnt = 0;
>  static unsigned long crng_global_init_time = 0;
> -#define CRNG_INIT_CNT_THRESH (2*CHACHA_KEY_SIZE)
> +#define CRNG_INIT_CNT_THRESH	CHACHA_KEY_SIZE
>  static void _extract_crng(struct crng_state *crng, __u8 out[CHACHA_BLOCK_SIZE]);
>  static void _crng_backtrack_protect(struct crng_state *crng,
>  				    __u8 tmp[CHACHA_BLOCK_SIZE], int used);

Unfortunately, it only made the early fast init faster, but didn't fix
the normal crng init blockage :-(

Here's a trace log, got by applying the patch at [1]. The boot was
continued only after typing some random keys after ~30s:

#
# entries-in-buffer/entries-written: 22/22   #P:8
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          <idle>-0     [001] dNh.     0.687088: crng_fast_load: crng threshold = 32
          <idle>-0     [001] dNh.     0.687089: crng_fast_load: crng_init_cnt = 0
          <idle>-0     [001] dNh.     0.687090: crng_fast_load: crng_init_cnt, now set to 16
          <idle>-0     [001] dNh.     0.705208: crng_fast_load: crng threshold = 32
          <idle>-0     [001] dNh.     0.705209: crng_fast_load: crng_init_cnt = 16
          <idle>-0     [001] dNh.     0.705209: crng_fast_load: crng_init_cnt, now set to 32
          <idle>-0     [001] dNh.     0.708048: crng_fast_load: random: fast init done
             lvm-165   [001] d...     2.417971: urandom_read: random: crng_init_cnt, now set to 0
 systemd-random--179   [003] ....     2.495669: wait_for_random_bytes.part.0: wait for randomness
     dbus-daemon-274   [006] dN..     3.294331: urandom_read: random: crng_init_cnt, now set to 0
     dbus-daemon-274   [006] dN..     3.316618: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] dN..     3.873918: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] dN..     3.874303: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] dN..     3.874375: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] d...     3.886204: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] d...     3.886217: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] d...     3.888519: urandom_read: random: crng_init_cnt, now set to 0
         polkitd-286   [007] d...     3.888529: urandom_read: random: crng_init_cnt, now set to 0
 gnome-session-b-321   [006] ....     4.292034: wait_for_random_bytes.part.0: wait for randomness
          <idle>-0     [002] dNh.    36.784001: crng_reseed: random: crng init done
 gnome-session-b-321   [006] ....    36.784019: wait_for_random_bytes.part.0: wait done
 systemd-random--179   [003] ....    36.784051: wait_for_random_bytes.part.0: wait done

[1] patch:

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5d5ea4ce1442..4a50ee2c230d 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -500,7 +500,7 @@ static int crng_init = 0;
 #define crng_ready() (likely(crng_init > 1))
 static int crng_init_cnt = 0;
 static unsigned long crng_global_init_time = 0;
-#define CRNG_INIT_CNT_THRESH (2*CHACHA_KEY_SIZE)
+#define CRNG_INIT_CNT_THRESH (CHACHA_KEY_SIZE)
 static void _extract_crng(struct crng_state *crng, __u8 out[CHACHA_BLOCK_SIZE]);
 static void _crng_backtrack_protect(struct crng_state *crng,
 				    __u8 tmp[CHACHA_BLOCK_SIZE], int used);
@@ -931,6 +931,9 @@ static int crng_fast_load(const char *cp, size_t len)
 	unsigned long flags;
 	char *p;
 
+	trace_printk("crng threshold = %d\n", CRNG_INIT_CNT_THRESH);
+	trace_printk("crng_init_cnt = %d\n", crng_init_cnt);
+
 	if (!spin_trylock_irqsave(&primary_crng.lock, flags))
 		return 0;
 	if (crng_init != 0) {
@@ -943,11 +946,15 @@ static int crng_fast_load(const char *cp, size_t len)
 		cp++; crng_init_cnt++; len--;
 	}
 	spin_unlock_irqrestore(&primary_crng.lock, flags);
+
+	trace_printk("crng_init_cnt, now set to %d\n", crng_init_cnt);
+
 	if (crng_init_cnt >= CRNG_INIT_CNT_THRESH) {
 		invalidate_batched_entropy();
 		crng_init = 1;
 		wake_up_interruptible(&crng_init_wait);
 		pr_notice("random: fast init done\n");
+		trace_printk("random: fast init done\n");
 	}
 	return 1;
 }
@@ -1033,6 +1040,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 		process_random_ready_list();
 		wake_up_interruptible(&crng_init_wait);
 		pr_notice("random: crng init done\n");
+		trace_printk("random: crng init done\n");
 		if (unseeded_warning.missed) {
 			pr_notice("random: %d get_random_xx warning(s) missed "
 				  "due to ratelimiting\n",
@@ -1743,9 +1751,16 @@ EXPORT_SYMBOL(get_random_bytes);
  */
 int wait_for_random_bytes(void)
 {
+	int ret;
+
 	if (likely(crng_ready()))
 		return 0;
-	return wait_event_interruptible(crng_init_wait, crng_ready());
+
+	trace_printk("wait for randomness\n");
+	ret = wait_event_interruptible(crng_init_wait, crng_ready());
+	trace_printk("wait done\n");
+
+	return ret;
 }
 EXPORT_SYMBOL(wait_for_random_bytes);
 
@@ -1974,6 +1989,8 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 			       current->comm, nbytes);
 		spin_lock_irqsave(&primary_crng.lock, flags);
 		crng_init_cnt = 0;
+		trace_printk("random: crng_init_cnt, now set to %d\n",
+			     crng_init_cnt);
 		spin_unlock_irqrestore(&primary_crng.lock, flags);
 	}
 	nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3));

thanks,

-- 
darwi
http://darwish.chasingpointers.com

^ permalink raw reply related	[relevance 83%]

* [PATCH RFC] random: getrandom(2): don't block on non-initialized entropy pool
  @ 2019-09-14 12:25 80%                       ` Ahmed S. Darwish
    2019-09-14 15:02 94%                       ` Linux 5.3-rc8 Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-14 12:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Y. Ts'o, Michael Kerrisk, Andreas Dilger, Jan Kara,
	Ray Strode, William Jon McCann, Alexander E. Patrakov, zhangjs,
	linux-ext4, lkml

getrandom() has been created as a new and more secure interface for
pseudorandom data requests.  Unlike /dev/urandom, it unconditionally
blocks until the entropy pool has been properly initialized.

While getrandom() has no guaranteed upper bound for its waiting time,
user-space has been abusing it by issuing the syscall, from shared
libraries no less, during the main system boot sequence.

Thus, on certain setups where there is no hwrng (embedded), or the
hwrng is not trusted by some users (intel RDRAND), or sometimes it's
just broken (amd RDRAND), the system boot can be *reliably* blocked.

The issue is further exaggerated by recent file-system optimizations,
e.g. b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), which
merges directory lookup code inode table IO, and thus minimizes the
number of disk interrupts and entropy during boot. After that commit,
a blocked boot can be reliably reproduced on a Thinkpad E480 laptop
with standard ArchLinux user-space.

Thus, don't trust user-space on calling getrandom() from the right
context. Just never block, and return -EINVAL if entropy is not yet
available.

Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
Link: https://lkml.kernel.org/r/20190911173624.GI2740@mit.edu
Link: https://lkml.kernel.org/r/20180514003034.GI14763@thunk.org

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

Notes:
    This feels very risky at the very end of -rc8, so only sending
    this as an RFC. The system of course reliably boots with this,
    and the log, as expected, powerfully warns all callers:

    $ dmesg | grep random
    [0.236472] random: get_random_bytes called from start_kernel+0x30f/0x4d7 with crng_init=0
    [0.680263] random: fast init done
    [2.500346] random: lvm: uninitialized urandom read (4 bytes read)
    [2.595125] random: systemd-random-: invalid getrandom request (512 bytes): crng not ready
    [2.595126] random: systemd-random-: uninitialized urandom read (512 bytes read)
    [3.427699] random: dbus-daemon: uninitialized urandom read (12 bytes read)
    [3.979425] urandom_read: 1 callbacks suppressed
    [3.979426] random: polkitd: uninitialized urandom read (8 bytes read)
    [3.979726] random: polkitd: uninitialized urandom read (8 bytes read)
    [3.979752] random: polkitd: uninitialized urandom read (8 bytes read)
    [4.473398] random: gnome-session-b: invalid getrandom request (16 bytes): crng not ready
    [4.473404] random: gnome-session-b: invalid getrandom request (16 bytes): crng not ready
    [4.473409] random: gnome-session-b: invalid getrandom request (16 bytes): crng not ready
    [5.265636] random: crng init done
    [5.265649] random: 3 urandom warning(s) missed due to ratelimiting
    [5.265652] random: 1 getrandom warning(s) missed due to ratelimiting

 drivers/char/random.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4a50ee2c230d..309dc5ddf370 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -511,6 +511,8 @@ static struct ratelimit_state unseeded_warning =
 	RATELIMIT_STATE_INIT("warn_unseeded_randomness", HZ, 3);
 static struct ratelimit_state urandom_warning =
 	RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
+static struct ratelimit_state getrandom_warning =
+	RATELIMIT_STATE_INIT("warn_getrandom_notavail", HZ, 3);

 static int ratelimit_disable __read_mostly;

@@ -1053,6 +1055,12 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 				  urandom_warning.missed);
 			urandom_warning.missed = 0;
 		}
+		if (getrandom_warning.missed) {
+			pr_notice("random: %d getrandom warning(s) missed "
+				  "due to ratelimiting\n",
+				  getrandom_warning.missed);
+			getrandom_warning.missed = 0;
+		}
 	}
 }

@@ -1915,6 +1923,7 @@ int __init rand_initialize(void)
 	crng_global_init_time = jiffies;
 	if (ratelimit_disable) {
 		urandom_warning.interval = 0;
+		getrandom_warning.interval = 0;
 		unseeded_warning.interval = 0;
 	}
 	return 0;
@@ -2138,8 +2147,6 @@ const struct file_operations urandom_fops = {
 SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 		unsigned int, flags)
 {
-	int ret;
-
 	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
 		return -EINVAL;

@@ -2152,9 +2159,13 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (!crng_ready()) {
 		if (flags & GRND_NONBLOCK)
 			return -EAGAIN;
-		ret = wait_for_random_bytes();
-		if (unlikely(ret))
-			return ret;
+
+		if (__ratelimit(&getrandom_warning))
+			pr_notice("random: %s: invalid getrandom request "
+				  "(%zd bytes): crng not ready",
+				  current->comm, count);
+
+		return -EINVAL;
 	}
 	return urandom_read(NULL, buf, count, NULL);
 }
--
2.23.0

^ permalink raw reply related	[relevance 80%]

* Re: Linux 5.3-rc8
    2019-09-14 12:25 80%                       ` [PATCH RFC] random: getrandom(2): don't block on non-initialized entropy pool Ahmed S. Darwish
@ 2019-09-14 15:02 94%                       ` Ahmed S. Darwish
    1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-14 15:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Y. Ts'o, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, Alexander E. Patrakov, zhangjs, linux-ext4,
	Lennart Poettering, lkml

On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote:
> On Thu, Sep 12, 2019 at 9:25 AM Theodore Y. Ts'o <tytso@mit.edu> wrote:
> >
> > Hmm, one thought might be GRND_FAILSAFE, which will wait up to two
> > minutes before returning "best efforts" randomness and issuing a huge
> > massive warning if it is triggered?
> 
> Yeah, based on (by now) _years_ of experience with people mis-using
> "get me random numbers", I think the sense of a new flag needs to be
> "yeah, I'm willing to wait for it".
>
> Because most people just don't want to wait for it, and most people
> don't think about it, and we need to make the default be for that
> "don't think about it" crowd, with the people who ask for randomness
> sources for a secure key having to very clearly and very explicitly
> say "Yes, I understand that this can take minutes and can only be done
> long after boot".
> 
> Even then people will screw that up because they copy code, or some
> less than gifted rodent writes a library and decides "my library is so
> important that I need that waiting sooper-sekrit-secure random
> number", and then people use that broken library by mistake without
> realizing that it's not going to be reliable at boot time.
> 
> An alternative might be to make getrandom() just return an error
> instead of waiting. Sure, fill the buffer with "as random as we can"
> stuff, but then return -EINVAL because you called us too early.
>

ACK, that's probably _the_ most sensible approach. Only caveat is
the slight change in user-space API semantics though...

For example, this breaks the just released systemd-random-seed(8)
as it _explicitly_ requests blocking behvior from getrandom() here:

    => src/random-seed/random-seed.c:
    /*
     * Let's make this whole job asynchronous, i.e. let's make
     * ourselves a barrier for proper initialization of the
     * random pool.
     */
     k = getrandom(buf, buf_size, GRND_NONBLOCK);
     if (k < 0 && errno == EAGAIN && synchronous) {
         log_notice("Kernel entropy pool is not initialized yet, "
                    "waiting until it is.");
                    
         k = getrandom(buf, buf_size, 0); /* retry synchronously */
     }
     if (k < 0) {
         log_debug_errno(errno, "Failed to read random data with "
                         "getrandom(), falling back to "
                         "/dev/urandom: %m");
     } else if ((size_t) k < buf_size) {
         log_debug("Short read from getrandom(), falling back to "
	           "/dev/urandom: %m");
     } else {
         getrandom_worked = true;
     }

Nonetheless, a slightly broken systemd-random-seed, that was just
released only 11 days ago (v243), is honestly much better than a
*non-booting system*...

I've sent an RFC patch at [1].

To handle the systemd case, I'll add the discussed "yeah, I'm
willing to wait for it" flag (GRND_BLOCK) in v2.

If this whole approach is going to be merged, and the slight ABI
breakage is to be tolerated (hmmmmm?), I wonder how will systemd
random-seed handle the semantics change though without doing
ugly kernel version checks..

thanks,

[1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc

--
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 94%]

* Re: Linux 5.3-rc8
  @ 2019-09-14 21:11 94%                           ` Ahmed S. Darwish
    1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-14 21:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Y. Ts'o, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, Alexander E. Patrakov, zhangjs, linux-ext4,
	Lennart Poettering, lkml

Hi,

On Sat, Sep 14, 2019 at 09:30:19AM -0700, Linus Torvalds wrote:
> On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish <darwish.07@gmail.com> wrote:
> >
> > On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote:
> > >
> > > An alternative might be to make getrandom() just return an error
> > > instead of waiting. Sure, fill the buffer with "as random as we can"
> > > stuff, but then return -EINVAL because you called us too early.
> >
> > ACK, that's probably _the_ most sensible approach. Only caveat is
> > the slight change in user-space API semantics though...
> >
> > For example, this breaks the just released systemd-random-seed(8)
> > as it _explicitly_ requests blocking behvior from getrandom() here:
> >
> 
> Actually, I would argue that the "don't ever block, instead fill
> buffer and return error instead" fixes this broken case.
> 
> >     => src/random-seed/random-seed.c:
> >     /*
> >      * Let's make this whole job asynchronous, i.e. let's make
> >      * ourselves a barrier for proper initialization of the
> >      * random pool.
> >      */
> >      k = getrandom(buf, buf_size, GRND_NONBLOCK);
> >      if (k < 0 && errno == EAGAIN && synchronous) {
> >          log_notice("Kernel entropy pool is not initialized yet, "
> >                     "waiting until it is.");
> >
> >          k = getrandom(buf, buf_size, 0); /* retry synchronously */
> >      }
> 
> Yeah, the above is yet another example of completely broken garbage.
> 
> You can't just wait and block at boot. That is simply 100%
> unacceptable, and always has been, exactly because that may
> potentially mean waiting forever since you didn't do anything that
> actually is likely to add any entropy.
>

ACK, the systemd commit which introduced that code also does:

   => 26ded5570994 (random-seed: rework systemd-random-seed.service..)
    [...]
    --- a/units/systemd-random-seed.service.in
    +++ b/units/systemd-random-seed.service.in
    @@ -22,4 +22,9 @@ Type=oneshot
    RemainAfterExit=yes
    ExecStart=@rootlibexecdir@/systemd-random-seed load
    ExecStop=@rootlibexecdir@/systemd-random-seed save
   -TimeoutSec=30s
   +
   +# This service waits until the kernel's entropy pool is
   +# initialized, and may be used as ordering barrier for service
   +# that require an initialized entropy pool. Since initialization
   +# can take a while on entropy-starved systems, let's increase the
   +# time-out substantially here.
   +TimeoutSec=10min

This 10min wait thing is really broken... it's basically "forever".

> >      if (k < 0) {
> >          log_debug_errno(errno, "Failed to read random data with "
> >                          "getrandom(), falling back to "
> >                          "/dev/urandom: %m");
> 
> At least it gets a log message.
> 
> So I think the right thing to do is to just make getrandom() return
> -EINVAL, and refuse to block.
> 
> As mentioned, this has already historically been a huge issue on
> embedded devices, and with disks turnign not just to NVMe but to
> actual polling nvdimm/xpoint/flash, the amount of true "entropy"
> randomness we can give at boot is very questionable.
>

ACK.

Moreover, and as a result of all that, distributions are now officially
*duct-taping* the problem:

    https://www.debian.org/releases/buster/amd64/release-notes/ch-information.en.html#entropy-starvation

    5.1.4. Daemons fail to start or system appears to hang during boot
  
    Due to systemd needing entropy during boot and the kernel treating
    such calls as blocking when available entropy is low, the system
    may hang for minutes to hours until the randomness subsystem is
    sufficiently initialized (random: crng init done).

"the system may hang for minuts to hours"...

> We can (and will) continue to do a best-effort thing (including very
> much using rdread and friends), but the whole "wait for entropy"
> simply *must* stop.
> 
> > I've sent an RFC patch at [1].
> >
> > [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc
> 
> Looks reasonable to me. Except I'd just make it simpler and make it a
> big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice().
> Make it clear that it is a *bug* if user space thinks it should wait
> at boot time.
> 
> Also, we might even want to just fill the buffer and return 0 at that
> point, to make sure that even more broken user space doesn't then try
> to sleep manually and turn it into a "I'll wait myself" loop.
>

ACK, I'll send an RFC v2, returning buflen, and so on..

/me will enjoy the popcorn from all the to-be-reported WARN_ON()s
on distribution mailing lists ;-)

>                  Linus

thanks,

-- 
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 94%]

* Re: Linux 5.3-rc8
  @ 2019-09-15  7:27 99%                             ` Ahmed S. Darwish
    1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-15  7:27 UTC (permalink / raw)
  To: Lennart Poettering
  Cc: Linus Torvalds, Theodore Y. Ts'o, Andreas Dilger, Jan Kara,
	Ray Strode, William Jon McCann, Alexander E. Patrakov, zhangjs,
	linux-ext4, lkml

On Sun, Sep 15, 2019 at 08:51:42AM +0200, Lennart Poettering wrote:
> On Sa, 14.09.19 09:30, Linus Torvalds (torvalds@linux-foundation.org) wrote:
[...]
> 
> And please don't break /dev/urandom again. The above code is the ony
> way I see how we can make /dev/urandom-derived swap encryption safe,
> and the only way I can see how we can sanely write a valid random seed
> to disk after boot.
>

Any hope in making systemd-random-seed(8) credit that "random seed
from previous boot" file, through RNDADDENTROPY, *by default*?

Because of course this makes the problem reliably go away on my system
too (as discussed in the original bug report, but you were not CCed).

I know that by v243, just released 12 days ago, this can be optionally
done through SYSTEMD_RANDOM_SEED_CREDIT=1. I wonder though if it can
ever be done by default, just like what the BSDs does... This would
solve a big part of the current problem.

> Lennart

thanks,

-- 
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* [PATCH RFC v3] random: getrandom(2): optionally block when CRNG is uninitialized
  @ 2019-09-15  8:17 58%                             ` Ahmed S. Darwish
      1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-15  8:17 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Linus Torvalds, Alexander E. Patrakov, Michael Kerrisk,
	Lennart Poettering, Willy Tarreau, Andreas Dilger, Jan Kara,
	Ray Strode, William Jon McCann, zhangjs, linux-ext4, lkml

Since Linux v3.17, getrandom() has been created as a new and more
secure interface for pseudorandom data requests. It attempted to solve
three problems as compared to /dev/urandom:

  1. the need to access filesystem paths, which can fail, e.g. under a
     chroot

  2. the need to open a file descriptor, which can fail under file
     descriptor exhaustion attacks

  3. the possibility to get not-so-random data from /dev/urandom, due to
     an incompletely initialized kernel entropy pool

To solve the third problem, getrandom(2) was made to block until a
proper amount of entropy has been accumulated. This basically made the
system call have no guaranteed upper-bound for its waiting time.

As was said in c6e9d6f38894 (random: introduce getrandom(2) system
call): "Any userspace program which uses this new functionality must
take care to assure that if it is used during the boot process, that it
will not cause the init scripts or other portions of the system startup
to hang indefinitely."

Meanwhile, user-facing Linux documentation, e.g. the urandom(4) and
getrandom(2) manpages, didn't add such explicit warnings. It didn't
also help that glibc, since v2.25, implemented an "OpenBSD-like"
getentropy(3) in terms of getrandom(2).  OpenBSD getentropy(2) never
blocked though, while linux-glibc version did, possibly indefinitely.
Since that glibc change, even more applications at the boot-path began
to implicitly reques randomness through getrandom(2); e.g., for an
Xorg/Xwayland MIT cookie.

OpenBSD genentropy(2) never blocked because, as stated in its rnd(4)
manpages, it saves entropy to disk on shutdown and restores it on boot.
Moreover, the NetBSD bootloader, as shown in its boot(8), even have
special commands to load a random seed file and pass it to the kernel.
Meanwhile on a Linux systemd userland, systemd-random-seed(8) preserved
a random seed across reboots at /var/lib/systemd/random-seed, but it
never had the actual code, until very recently at v243, to ask the
kernel to credit such entropy through an RNDADDENTROPY ioctl.

From a mix of the above factors, it began to be common for Embedded
Linux systems to "get stuck at boot" unless a daemon like haveged is
installed, or the BSP provider enabling the necessary hwrng driver in
question and crediting its entropy; e.g. 62f95ae805fa (hwrng: omap - Set
default quality). Over time, the issue began to even creep into
consumer-level x86 laptops: mainstream distributions, like debian
buster, began to recommend installing haveged as a workaround.

Thus, on certain setups where there is no hwrng (embedded systems or VMs
on a host lacking virtio-rng), or the hwrng is not trusted by some users
(intel RDRAND), or sometimes it's just broken (amd RDRAND), the system
boot can be *reliably* blocked.

It can therefore be argued that there is no way to use getrandom() on
Linux correctly, especially from shared libraries: GRND_NONBLOCK has
to be used, and a fallback to some other interface like /dev/urandom
is required, thus making the net result no better than just using
/dev/urandom unconditionally.

The issue is further exaggerated by recent file-system optimizations,
e.g. b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), which merges
directory lookup code inode table IO, and thus minimizes the number of
disk interrupts and entropy during boot. After that commit, a blocked
boot can be reliably reproduced on a Thinkpad E480 laptop with
standard ArchLinux user-space.

Thus, don't trust user-space on calling getrandom(2) from the right
context. Never block, by default, and just return data from the
urandom source if entropy is not yet available. This is an explicit
decision not to let user-space work around this through busy loops on
error-codes.

Note: this lowers the quality of random data returned by getrandom(2)
to the level of randomness returned by /dev/urandom, with all the
original security implications coming out of that, as discussed in
problem "3." at the top of this commit log. If this is not desirable,
offer users a fallback to old behavior, by CONFIG_RANDOM_BLOCK=y, or
random.getrandom_block=true bootparam.

[tytso@mit.edu: make the change to a non-blocking getrandom(2) optional]
Link: https://lkml.kernel.org/r/20190914222432.GC19710@mit.edu
Link: https://lkml.kernel.org/r/20190911173624.GI2740@mit.edu
Link: https://factorable.net ("Widespread Weak Keys in Network Devices")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
Rreported-by: Ahmed S. Darwish <darwish.07@gmail.com>
Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---

Notes:
    changelog-v2:
      - tytso: make blocking optional
    
    changelog-v3:
      - more detailed commit log + historical context (thanks patrakov)
      - remove WARN_ON_ONCE. It's pretty excessive, and the first caller
        is systemd-random-seed(8), which we know it will not change.
        Just print errors in the kernel log.
    
    $dmesg | grep random:
    
      [0.235843] random: get_random_bytes called from start_kernel+0x30f/0x4d7 with crng_init=0
      [0.685682] random: fast init done
      [2.405263] random: lvm: CRNG uninitialized (4 bytes read)
      [2.480686] random: systemd-random-: getrandom (512 bytes): CRNG not yet initialized
      [2.480687] random: systemd-random-: CRNG uninitialized (512 bytes read)
      [3.265201] random: dbus-daemon: CRNG uninitialized (12 bytes read)
      [3.835066] urandom_read: 1 callbacks suppressed
      [3.835068] random: polkitd: CRNG uninitialized (8 bytes read)
      [3.835509] random: polkitd: CRNG uninitialized (8 bytes read)
      [3.835577] random: polkitd: CRNG uninitialized (8 bytes read)
      [4.190653] random: gnome-session-b: getrandom (16 bytes): CRNG not yet initialized
      [4.190658] random: gnome-session-b: getrandom (16 bytes): CRNG not yet initialized
      [4.190662] random: gnome-session-b: getrandom (16 bytes): CRNG not yet initialized
      [4.952299] random: crng init done
      [4.952311] random: 3 urandom warning(s) missed due to ratelimiting
      [4.952314] random: 1 getrandom warning(s) missed due to ratelimiting

 drivers/char/Kconfig  | 33 +++++++++++++++++++++++++++++++--
 drivers/char/random.c | 33 ++++++++++++++++++++++++++++-----
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 3e866885a405..337baeca5ebc 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -557,8 +557,6 @@ config ADI
 	  and SSM (Silicon Secured Memory).  Intended consumers of this
 	  driver include crash and makedumpfile.
 
-endmenu
-
 config RANDOM_TRUST_CPU
 	bool "Trust the CPU manufacturer to initialize Linux's CRNG"
 	depends on X86 || S390 || PPC
@@ -573,3 +571,34 @@ config RANDOM_TRUST_CPU
 	has not installed a hidden back door to compromise the CPU's
 	random number generation facilities. This can also be configured
 	at boot with "random.trust_cpu=on/off".
+
+config RANDOM_BLOCK
+	bool "Block if getrandom is called before CRNG is initialized"
+	help
+	  Say Y here if you want userspace programs which call
+	  getrandom(2) before the Cryptographic Random Number
+	  Generator (CRNG) is initialized to block until
+	  secure random numbers are available.
+
+	  Say N if you believe usability is more important than
+	  security, so if getrandom(2) is called before the CRNG is
+	  initialized, it should not block, but instead return "best
+	  effort" randomness which might not be very secure or random
+	  at all; but at least the system boot will not be delayed by
+	  minutes or hours.
+
+	  This can also be controlled at boot with
+	  "random.getrandom_block=on/off".
+
+	  Ideally, systems would be configured with hardware random
+	  number generators, and/or configured to trust CPU-provided
+	  RNG's.  In addition, userspace should generate cryptographic
+	  keys only as late as possible, when they are needed, instead
+	  of during early boot.  (For non-cryptographic use cases,
+	  such as dictionary seeds or MIT Magic Cookies, other
+	  mechanisms such as /dev/urandom or random(3) may be more
+	  appropropriate.)  This config option controls what the
+	  kernel should do as a fallback when the non-ideal case
+	  presents itself.
+
+endmenu
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4a50ee2c230d..689fdb486785 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -511,6 +511,8 @@ static struct ratelimit_state unseeded_warning =
 	RATELIMIT_STATE_INIT("warn_unseeded_randomness", HZ, 3);
 static struct ratelimit_state urandom_warning =
 	RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
+static struct ratelimit_state getrandom_warning =
+	RATELIMIT_STATE_INIT("warn_getrandom_randomness", HZ, 3);
 
 static int ratelimit_disable __read_mostly;
 
@@ -854,12 +856,19 @@ static void invalidate_batched_entropy(void);
 static void numa_crng_init(void);
 
 static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static bool getrandom_block __ro_after_init = IS_ENABLED(CONFIG_RANDOM_BLOCK);
 static int __init parse_trust_cpu(char *arg)
 {
 	return kstrtobool(arg, &trust_cpu);
 }
 early_param("random.trust_cpu", parse_trust_cpu);
 
+static int __init parse_block(char *arg)
+{
+	return kstrtobool(arg, &getrandom_block);
+}
+early_param("random.getrandom_block", parse_block);
+
 static void crng_initialize(struct crng_state *crng)
 {
 	int		i;
@@ -1053,6 +1062,12 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 				  urandom_warning.missed);
 			urandom_warning.missed = 0;
 		}
+		if (getrandom_warning.missed) {
+			pr_notice("random: %d getrandom warning(s) missed "
+				  "due to ratelimiting\n",
+				  getrandom_warning.missed);
+			getrandom_warning.missed = 0;
+		}
 	}
 }
 
@@ -1915,6 +1930,7 @@ int __init rand_initialize(void)
 	crng_global_init_time = jiffies;
 	if (ratelimit_disable) {
 		urandom_warning.interval = 0;
+		getrandom_warning.interval = 0;
 		unseeded_warning.interval = 0;
 	}
 	return 0;
@@ -1984,8 +2000,8 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 	if (!crng_ready() && maxwarn > 0) {
 		maxwarn--;
 		if (__ratelimit(&urandom_warning))
-			printk(KERN_NOTICE "random: %s: uninitialized "
-			       "urandom read (%zd bytes read)\n",
+			pr_err("random: %s: CRNG uninitialized "
+			       "(%zd bytes read)\n",
 			       current->comm, nbytes);
 		spin_lock_irqsave(&primary_crng.lock, flags);
 		crng_init_cnt = 0;
@@ -2152,9 +2168,16 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (!crng_ready()) {
 		if (flags & GRND_NONBLOCK)
 			return -EAGAIN;
-		ret = wait_for_random_bytes();
-		if (unlikely(ret))
-			return ret;
+
+		if (__ratelimit(&getrandom_warning))
+			pr_err("random: %s: getrandom (%zd bytes): CRNG not "
+			       "yet initialized", current->comm, count);
+
+		if (getrandom_block) {
+			ret = wait_for_random_bytes();
+			if (unlikely(ret))
+				return ret;
+		}
 	}
 	return urandom_read(NULL, buf, count, NULL);
 }
-- 
darwi
http://darwish.chasingpointers.com

^ permalink raw reply related	[relevance 58%]

* Re: [PATCH RFC v3] random: getrandom(2): optionally block when CRNG is uninitialized
  @ 2019-09-15 10:02 98%                                   ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-15 10:02 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Lennart Poettering, Theodore Y. Ts'o, Linus Torvalds,
	Alexander E. Patrakov, Michael Kerrisk, Andreas Dilger, Jan Kara,
	Ray Strode, William Jon McCann, zhangjs, linux-ext4, lkml

On Sun, Sep 15, 2019 at 11:30:57AM +0200, Willy Tarreau wrote:
> On Sun, Sep 15, 2019 at 10:59:07AM +0200, Lennart Poettering wrote:
> > We live in a world where people run HTTPS, SSH, and all that stuff in
> > the initrd already. It's where SSH host keys are generated, and plenty
> > session keys.
> 
> It is exactly the type of crap that create this situation : making
> people developing such scripts believe that any random source was OK
> to generate these, and as such forcing urandom to produce crypto-solid
> randoms!

Willy, let's tone it down please... the thread is already getting a
bit toxic.

> No, distro developers must know that it's not acceptable to
> generate lifetime crypto keys from the early boot when no entropy is
> available. At least with this change they will get an error returned
> from getrandom() and will be able to ask the user to feed entropy, or
> be able to say "it was impossible to generate the SSH key right now,
> the daemon will only be started once it's possible", or "the SSH key
> we produced will not be saved because it's not safe and is only usable
> for this recovery session".
> 
> > If Linux lets all that stuff run with awful entropy then
> > you pretend things where secure while they actually aren't. It's much
> > better to fail loudly in that case, I am sure.
> 
> This is precisely what this change permits : fail instead of block
> by default, and let applications decide based on the use case.
>

Unfortunately, not exactly.

Linus didn't want getrandom to return an error code / "to fail" in
that case, but to silently return CRNG-uninitialized /dev/urandom
data, to avoid user-space even working around the error code through
busy-loops.

I understand the rationale behind that, of course, and this is what
I've done so far in the V3 RFC.

Nonetheless, this _will_, for example, make systemd-random-seed(8)
save week seeds under /var/lib/systemd/random-seed, since the kernel
didn't inform it about such weakness at all..

The situation is so bad now, that it's more of "some user-space are
more equal than others".. Let's just at least admit this while
discussing the RFC patch in question.

thanks,

> > Quite frankly, I don't think this is something to fix in the
> > kernel.
> 
> As long as it offers a single API to return randoms, and that it is
> not possible not to block for low-quality randoms, it needs to be
> at least addressed there. Then userspace can adapt. For now userspace
> does not have this option just due to the kernel's way of exposing
> randoms.
> 
> Willy

^ permalink raw reply	[relevance 98%]

* Re: [PATCH RFC v3] random: getrandom(2): optionally block when CRNG is uninitialized
  @ 2019-09-15 10:55 98%                                       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-15 10:55 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Lennart Poettering, Theodore Y. Ts'o, Linus Torvalds,
	Alexander E. Patrakov, Michael Kerrisk, Andreas Dilger, Jan Kara,
	Ray Strode, William Jon McCann, zhangjs, linux-ext4, lkml

On Sun, Sep 15, 2019 at 12:40:27PM +0200, Willy Tarreau wrote:
> On Sun, Sep 15, 2019 at 12:02:01PM +0200, Ahmed S. Darwish wrote:
> > On Sun, Sep 15, 2019 at 11:30:57AM +0200, Willy Tarreau wrote:
> > > On Sun, Sep 15, 2019 at 10:59:07AM +0200, Lennart Poettering wrote:
[...]
> > > > If Linux lets all that stuff run with awful entropy then
> > > > you pretend things where secure while they actually aren't. It's much
> > > > better to fail loudly in that case, I am sure.
> > > 
> > > This is precisely what this change permits : fail instead of block
> > > by default, and let applications decide based on the use case.
> > >
> > 
> > Unfortunately, not exactly.
> > 
> > Linus didn't want getrandom to return an error code / "to fail" in
> > that case, but to silently return CRNG-uninitialized /dev/urandom
> > data, to avoid user-space even working around the error code through
> > busy-loops.
> 
> But with this EINVAL you have the information that it only filled
> the buffer with whatever it could, right ? At least that was the
> last point I manage to catch in the discussion. Otherwise if it's
> totally silent, I fear that it will reintroduce the problem in a
> different form (i.e. libc will say "our randoms are not reliable
> anymore, let us work around this and produce blocking, solid randoms
> again to help all our users").
>

V1 of the patch I posted did indeed return -EINVAL. Linus then
suggested that this might make still some user-space act smart and
just busy-loop around that, basically blocking the boot again:

    https://lkml.kernel.org/r/CAHk-=wiB0e_uGpidYHf+dV4eeT+XmG-+rQBx=JJ110R48QFFWw@mail.gmail.com
    https://lkml.kernel.org/r/CAHk-=whSbo=dBiqozLoa6TFmMgbeB8d9krXXvXBKtpRWkG0rMQ@mail.gmail.com

So it was then requested to actually return what /dev/urandom would
return, so that user-space has no way whatsoever in knowing if
getrandom has failed. Then, it's the job of system integratos / BSP
builders to fix the inspect the big fat WARN on the kernel and fix
that.

This is the core of Lennart's critqueue of V3 above.

> > I understand the rationale behind that, of course, and this is what
> > I've done so far in the V3 RFC.
> > 
> > Nonetheless, this _will_, for example, make systemd-random-seed(8)
> > save week seeds under /var/lib/systemd/random-seed, since the kernel
> > didn't inform it about such weakness at all..
> 
> Then I am confused because I understood that the goal was to return
> EINVAL or anything equivalent in which case the userspace knows what
> it has to deal with :-/
>

Yeah, the discussion moved a bit beyond that.

thanks,
--darwi

^ permalink raw reply	[relevance 98%]

* Re: Linux 5.3-rc8
  @ 2019-09-16  1:40 99%                               ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-16  1:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennart Poettering, Theodore Y. Ts'o, Andreas Dilger,
	Jan Kara, Ray Strode, William Jon McCann, Alexander E. Patrakov,
	zhangjs, linux-ext4, lkml

On Sun, Sep 15, 2019 at 09:29:55AM -0700, Linus Torvalds wrote:
> On Sat, Sep 14, 2019 at 11:51 PM Lennart Poettering
> <mzxreary@0pointer.de> wrote:
> >
> > Oh man. Just spend 5min to understand the situation, before claiming
> > this was garbage or that was garbage. The code above does not block
> > boot.
> 
> Yes it does. You clearly didn't read the thread.
> 
> > It blocks startup of services that explicit order themselves
> > after the code above. There's only a few services that should do that,
> > and the main system boots up just fine without waiting for this.
> 
> That's a nice theory, but it doesn't actually match reality.
> 
> There are clearly broken setups that use this for things that it
> really shouldn't be used for. Asking for true randomness at boot
> before there is any indication that randomness exists, and then just
> blocking with no further action that could actually _generate_ said
> randomness.
> 
> If your description was true that the system would come up and be
> usable while the blocked thread is waiting for that to happen, things
> would be fine.
>

A small note here, especially after I've just read the commit log of
72dbcf721566 ('Revert ext4: "make __ext4_get_inode_loc plug"'), which
unfairly blames systemd there.

Yes, the systemd-random-seed(8) process blocks, but this is an
isolated process, and it's only there as a synchronization point and
to load/restore random seeds from disk across reboots.

The wisdom of having a sysnchronization service ("before/after urandom
CRNG is inited") can be debated. That service though, and systemd in
general, did _not_ block the overall system boot.

What blocked the system boot was GDM/gnome-session implicitly calling
getrandom() for the Xorg MIT cookie. This was shown in the strace log
below:

   https://lkml.kernel.org/r/20190910173243.GA3992@darwi-home-pc

thanks,

-- 
darwi
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: [PATCH RFC v2] random: optionally block in getrandom(2) when the CRNG is uninitialized
  @ 2019-09-16  2:45 92%                                   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-16  2:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Willy Tarreau, Theodore Y. Ts'o, Alexander E. Patrakov,
	Michael Kerrisk, Andreas Dilger, Jan Kara, Ray Strode,
	William Jon McCann, zhangjs, linux-ext4, lkml,
	Lennart Poettering

On Sun, Sep 15, 2019 at 11:59:41AM -0700, Linus Torvalds wrote:
> On Sun, Sep 15, 2019 at 11:32 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > I think that the exponential decay will either not be used or
> > be totally used, so in practice you'll always end up with 0 or
> > 30s depending on the entropy situation
> 
> According to the systemd random-seed source snippet that Ahmed posted,
> it actually just tries once (well, first once non-blocking, then once
> blocking) and then falls back to reading urandom if it fails.
> 
> So assuming there's just one of those "read much too early" cases, I
> think it actually matters.
>

Just a quick note, the snippest I posted:

    https://lkml.kernel.org/r/20190914150206.GA2270@darwi-home-pc

is not PID 1.

It's just a lowly process called "systemd-random-seed". Its main
reason of existence is to load/restore a random seed file from and to
disk across reboots (just like what sysv scripts did).

The reason I posted it was to show that if we change getrandom() to
silently return weak crypto instead of blocking or an error code,
systemd-random-seed will break: it will save the resulting data to
disk, then even _credit_ it (if asked to) in the next boot cycle
through RNDADDENTROPY.

> But while I tried to test this, on my F30 install, systemd seems to
> always just use urandom().
> 
> I can trigger the urandom read warning easily enough (turn of CPU
> rdrand trusting and increase the entropy requirement by a factor of
> ten, and turn of the ioctl to add entropy from user space), just not
> the getrandom() blocking case at all.
>

Yeah, because the problem was/is not with systemd :)

It is GDM/gnome-session which was blocking the graphical boot process.

Regarding reproducing the issue, through a quick trace_prink, all of
below processes are calling getrandom() on my Arch system at boot:

    https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc

The fatal call was gnome-session's one, because gnome didn't continue
_its own_ boot due to this blockage.

> So presumably that's because I have a systemd that doesn't use
> getrandom() at all, or perhaps uses the 'rdrand' instruction directly.
> Or maybe because Arch has some other oddity that just triggers the
> problem.
>

It seems Arch is good at triggering this. For example, here is a
another Arch user on a Thinkpad (different model than mine), also with
GDM getting blocked on entropy:

    https://bbs.archlinux.org/viewtopic.php?id=248035
    
    "As you can see, the system is literally waiting a half minute for
    something - up until crng init is done"

(The NetworkManager logs are just noise. I also had them, but completely
 disabling NetworkManager didn't do anything .. just made the logs
 cleaner)

thanks,

--
Ahmed Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 92%]

* Re: Linux 5.3-rc8
    @ 2019-09-16 19:53 99%                                               ` Ahmed S. Darwish
  1 sibling, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-16 19:53 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Linus Torvalds, Willy Tarreau, Vito Caputo, Lennart Poettering,
	Andreas Dilger, Jan Kara, Ray Strode, William Jon McCann,
	Alexander E. Patrakov, zhangjs, linux-ext4, lkml

On Mon, Sep 16, 2019 at 01:21:17PM -0400, Theodore Y. Ts'o wrote:
> On Mon, Sep 16, 2019 at 09:17:10AM -0700, Linus Torvalds wrote:
> > So the semantics that getrandom() should have had are:
> > 
> >  getrandom(0) - just give me reasonable random numbers for any of a
> > million non-strict-long-term-security use (ie the old urandom)
> > 
> >     - the nonblocking flag makes no sense here and would be a no-op
> 
> That change is what I consider highly problematic.  There are a *huge*
> number of applications which use cryptography which assumes that
> getrandom(0) means, "I'm guaranteed to get something safe
> cryptographic use".  Changing his now would expose a very large number
> of applications to be insecure.  Part of the problem here is that
> there are many different actors.  There is the application or
> cryptographic library developer, who may want to be sure they have
> cryptographically secure random numbers.  They are the ones who will
> select getrandom(0).
> 
> Then you have the distribution or consumer-grade electronics
> developers who may choose to run them too early in some init script or
> systemd unit files.  And some of these people may do something stupid,
> like run things too early, or omit the a hardware random number
> generator in their design, even though it's for a security critical
> purpose (say, a digital wallet for bitcoin).

Ted, you're really the expert here. My apologies though, every time I
see the words "too early" I get a cramp... Please check my earlier
reply:

    https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc

Specifically the trace_printk log of all the getrandom(2) calls
during an standard Archlinux boot...

where is the "too early" boundary there? It's undefinable.

You either have entropy, or you don't. And if you don't, it will stay
like this forever, because if you had, you wouldn't have blocked in
the first place...

Thanks,

--
Ahmed Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* Re: Linux 5.3-rc8
  @ 2019-09-16 23:29 98%                                                         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-16 23:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Garrett, Theodore Y. Ts'o, Willy Tarreau,
	Vito Caputo, Lennart Poettering, Andreas Dilger, Jan Kara,
	Ray Strode, William Jon McCann, Alexander E. Patrakov, zhangjs,
	linux-ext4, lkml

On Mon, Sep 16, 2019 at 04:18:00PM -0700, Linus Torvalds wrote:
> On Mon, Sep 16, 2019 at 4:11 PM Matthew Garrett <mjg59@srcf.ucam.org> wrote:
> >
> > In one case we have "Systems don't boot, but you can downgrade your
> > kernel" and in the other case we have "Your cryptographic keys are weak
> > and you have no way of knowing unless you read dmesg", and I think
> > causing boot problems is the better outcome here.
> 
> Or: In one case you have a real and present problem. In the other
> case, people are talking hypotheticals.
>

Linus, in all honesty, the other case is _not_ a hypothetical . For
example, here is a fresh comment on LWN from gnupg developers:

    https://lwn.net/Articles/799352

It's about this libgnupg code:

    => https://dev.gnupg.org/source/libgcrypt.git

    => random/rdlinux.c:
    
    /* If we have a modern operating system, we first try to use the new
     * getentropy function.  That call guarantees that the kernel's
     * RNG has been properly seeded before returning any data.  This
     * is different from /dev/urandom which may, due to its
     * non-blocking semantics, return data even if the kernel has
     * not been properly seeded.  And it differs from /dev/random by never
     * blocking once the kernel is seeded.  */
    #if defined(HAVE_GETENTROPY) || defined(__NR_getrandom)
    do {
        ...
        ret = getentropy (buffer, nbytes);
        ...
    } while (ret == -1 && errno == EINTR);

thanks,

-- 
Ahmed Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 98%]

* Re: Linux 5.3-rc8
  @ 2019-09-17 12:30 84%                                                         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-17 12:30 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Martin Steigerwald, Willy Tarreau, Matthew Garrett,
	Linus Torvalds, Vito Caputo, Lennart Poettering, Andreas Dilger,
	Jan Kara, Ray Strode, William Jon McCann, Alexander E. Patrakov,
	zhangjs, linux-ext4, lkml

On Tue, Sep 17, 2019 at 08:11:56AM -0400, Theodore Y. Ts'o wrote:
> On Tue, Sep 17, 2019 at 09:33:40AM +0200, Martin Steigerwald wrote:
> > Willy Tarreau - 17.09.19, 07:24:38 CEST:
> > > On Mon, Sep 16, 2019 at 06:46:07PM -0700, Matthew Garrett wrote:
> > > > >Well, the patch actually made getrandom() return en error too, but
> > > > >you seem more interested in the hypotheticals than in arguing
> > > > >actualities.>
> > > > If you want to be safe, terminate the process.
> > >
> > > This is an interesting approach. At least it will cause bug reports in
> > > application using getrandom() in an unreliable way and they will
> > > check for other options. Because one of the issues with systems that
> > > do not finish to boot is that usually the user doesn't know what
> > > process is hanging.
> >
>
> I would be happy with a change which changes getrandom(0) to send a
> kill -9 to the process if it is called too early, with a new flag,
> getrandom(GRND_BLOCK) which blocks until entropy is available.  That
> leaves it up to the application developer to decide what behavior they
> want.
>

Yup, I'm convinced that's the sanest option too. I'll send a final RFC
patch tonight implementing the following:

config GETRANDOM_CRNG_ENTROPY_MAX_WAIT_MS
	int
	default 3000
	help
	  Default max wait in milliseconds, for the getrandom(2) system
	  call when asking for entropy from the urandom source, until
	  the Cryptographic Random Number Generator (CRNG) gets
	  initialized.  Any process exceeding this duration for entropy
	  wait will get killed by kernel. The maximum wait can be
	  overriden through the "random.getrandom_max_wait_ms" kernel
	  boot parameter. Rationale follows.

	  When the getrandom(2) system call was created, it came with
	  the clear warning: "Any userspace program which uses this new
	  functionality must take care to assure that if it is used
	  during the boot process, that it will not cause the init
	  scripts or other portions of the system startup to hang
	  indefinitely.

	  Unfortunately, due to multiple factors, including not having
	  this warning written in a scary enough language in the
	  manpages, and due to glibc since v2.25 implementing a BSD-like
	  getentropy(3) in terms of getrandom(2), modern user-space is
	  calling getrandom(2) in the boot path everywhere.

	  Embedded Linux systems were first hit by this, and reports of
	  embedded system "getting stuck at boot" began to be
	  common. Over time, the issue began to even creep into consumer
	  level x86 laptops: mainstream distributions, like Debian
	  Buster, began to recommend installing haveged as a workaround,
	  just to let the system boot.

	  Filesystem optimizations in EXT4 and XFS exagerated the
	  problem, due to aggressive batching of IO requests, and thus
	  minimizing sources of entropy at boot. This led to large
	  delays until the kernel's Cryptographic Random Number
	  Generator (CRNG) got initialized, and thus having reports of
	  getrandom(2) inidifinitely stuck at boot.

	  Solve this problem by setting a conservative upper bound for
	  getrandom(2) wait. Kill the process, instead of returning an
	  error code, because otherwise crypto-sensitive applications
	  may revert to less secure mechanisms (e.g. /dev/urandom). We
	  __deeply encourage__ system integrators and distribution
	  builders not to considerably increase this value: during
	  system boot, you either have entropy, or you don't. And if you
	  didn't have entropy, it will stay like this forever, because
	  if you had, you wouldn't have blocked in the first place. It's
	  an atomic "either/or" situation, with no middle ground. Please
	  think twice.

	  Ideally, systems would be configured with hardware random
	  number generators, and/or configured to trust the CPU-provided
	  RNG's (CONFIG_RANDOM_TRUST_CPU) or boot-loader provided ones
	  (CONFIG_RANDOM_TRUST_BOOTLOADER).  In addition, userspace
	  should generate cryptographic keys only as late as possible,
	  when they are needed, instead of during early boot.  (For
	  non-cryptographic use cases, such as dictionary seeds or MIT
	  Magic Cookies, other mechanisms such as /dev/urandom or
	  random(3) may be more appropropriate.)

Sounds good?

thanks,

--
Ahmed Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 84%]

* Re: Linux 5.3-rc8
  @ 2019-09-17 20:52 99%                                                       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-17 20:52 UTC (permalink / raw)
  To: Martin Steigerwald
  Cc: Linus Torvalds, Lennart Poettering, Theodore Y. Ts'o,
	Willy Tarreau, Matthew Garrett, Vito Caputo, Andreas Dilger,
	Jan Kara, Ray Strode, William Jon McCann, Alexander E. Patrakov,
	zhangjs, linux-ext4, lkml

On Tue, Sep 17, 2019 at 10:28:47PM +0200, Martin Steigerwald wrote:
[...]
> 
> I don't have any kernel logs old enough to see whether whether crng init
> times have been different with Systemd due to asking for randomness for
> UUID/hashmaps.
>

Please stop claiming this. It has been pointed out to you, __multiple
times__, that this makes no difference. For example:

    https://lkml.kernel.org/r/20190916024904.GA22035@mit.edu
    
    No. getrandom(2) uses the new CRNG, which is either initialized,
    or it's not ... So to the extent that systemd has made systems
    boot faster, you could call that systemd's "fault".

You've claimed this like 3 times before in this thread already, and
multiple people replied with the same response. If you don't get the
paragraph above, then please don't continue replying further on this
thread.

thanks,

-- 
Ahmed Darwish
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 99%]

* [PATCH RFC v4 0/1] random: WARN on large getrandom() waits and introduce getrandom2()
    @ 2019-09-18 21:15 97%                               ` Ahmed S. Darwish
  2019-09-18 21:17 53%                                 ` [PATCH RFC v4 1/1] " Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-18 21:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, lkml, linux-ext4,
	linux-man

Hi,

This is an RFC, and it obviously needs much more testing beside the
"it boots" smoke test I've just did.

Interestingly though, on my current system, the triggered WARN()
**reliably** makes the system get un-stuck... I know this is a very
crude heuristic, but I would personally prefer it to the other
proposals that were mentioned in this jumbo thread.

If I get an OK from Linus on this, I'll send a polished v5: further
real testing, kernel-parameters.txt docs, a new getrandom_wait(7)
manpage as referenced in the WARN() message, and extensions to the
getrandom(2) manpage for new getrandom2().

The new getrandom2() system call is basically a summary of Linus',
Lennart's, and Willy's proposals. Please see the patch #1 commit log,
and the "Link:" section inside it, for a rationale.

@Lennart, since you obviously represent user-space here, any further
notes on the new system call?

thanks,

Ahmed S. Darwish (1):
  random: WARN on large getrandom() waits and introduce getrandom2()

 drivers/char/Kconfig        | 60 ++++++++++++++++++++++++--
 drivers/char/random.c       | 85 ++++++++++++++++++++++++++++++++-----
 include/uapi/linux/random.h | 20 +++++++--
 3 files changed, 148 insertions(+), 17 deletions(-)

--
http://darwish.chasingpointers.com

^ permalink raw reply	[relevance 97%]

* [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  2019-09-18 21:15 97%                               ` [PATCH RFC v4 0/1] random: WARN on large getrandom() waits and introduce getrandom2() Ahmed S. Darwish
@ 2019-09-18 21:17 53%                                 ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-18 21:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, lkml, linux-ext4,
	linux-man

Since Linux v3.17, getrandom(2) has been created as a new and more
secure interface for pseudorandom data requests.  It attempted to
solve three problems, as compared to /dev/urandom:

  1. the need to access filesystem paths, which can fail, e.g. under a
     chroot

  2. the need to open a file descriptor, which can fail under file
     descriptor exhaustion attacks

  3. the possibility of getting not-so-random data from /dev/urandom,
     due to an incompletely initialized kernel entropy pool

To solve the third point, getrandom(2) was made to block until a
proper amount of entropy has been accumulated to initialize the
CHACHA20 cipher.  This basically made the system call have no
guaranteed upper-bound for its initial waiting time.

Thus when it was introduced at c6e9d6f38894 ("random: introduce
getrandom(2) system call"), it came with a clear warning: "Any
userspace program which uses this new functionality must take care to
assure that if it is used during the boot process, that it will not
cause the init scripts or other portions of the system startup to hang
indefinitely."

Unfortunately, due to multiple factors, including not having this
warning written in a scary-enough language in the manpages, and due to
glibc since v2.25 implementing a BSD-like getentropy(3) in terms of
getrandom(2), modern user-space is calling getrandom(2) in the boot
path everywhere.

Embedded Linux systems were first hit by this, and reports of embedded
systems "getting stuck at boot" began to be common.  Over time, the
issue began to even creep into consumer-level x86 laptops: mainstream
distributions, like Debian Buster, began to recommend installing
haveged as a duct-tape workaround... just to let the system boot. (!)

Moreover, filesystem optimizations in EXT4 and XFS, e.g. b03755ad6f33
("ext4: make __ext4_get_inode_loc plug"), which merged directory
lookup code inode table IO, and very fast systemd boots, further
exaggerated the problem by limiting interrupt-based entropy sources.
This led to large delays until the kernel's cryptographic random
number generator (CRNG) got initialized.

Mitigate the problem, as a first step, in two ways:

  1. Issue a big WARN_ON when any process gets stuck on getrandom(2)
     for more than CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC seconds.

  2. Introduce the new getrandom2(2) system call, with clear semantics
     that can guide user-space in doing the right thing.

On the author's Thinkpad E480 x86 laptop and an ArchLinux user-space,
the ext4 commit earlier mentioned reliably blocked the system on GDM
gnome-session boot. Complain loudly through a WARN_ON if processes
get stuck on getrandom(2). Beside its obvious informational purposes,
the WARN_ON also reliably gets the system unstuck.

Set CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC to a heuristic 30-second
default value. We __deeply encourage__ system integrators and
distribution builders not to increase it much: during system boot, you
either have entropy, or you don't. And if you didn't have entropy, it
will stay like this forever, because if you had, you wouldn't have
blocked in the first place. It's an atomic "either/or" situation, with
no middle ground. Please think twice.

For the new getrandom2(2) system call, it tries to avoid the problems
introduced by its earlier siblings. As Linus mentioned several times
in the bug report thread, Linux should have never provided the
"/dev/random" and "getrandom(GRND_RANDOM)" APIs. These interfaces are
broken by design due to their almost-permanent blockage, leading to
the current misuse of /dev/urandom and getrandom(flags=0) calls. Thus
for getrandom2, introduce the flags:

  1. GRND2_SECURE_UNBOUNDED_INITIAL_WAIT
  2. GRND2_INSECURE

where both extract randomness __exclusively__ from the urandom source.
Due to the clear nature of its new GRND2_* flags, the getrandom2()
system call will never issue any warnings on the kernel log.

OpenBSD, to its credit, got that correctly from the start by making
both of /dev/random and /dev/urandom equivalent.

Rreported-by: Ahmed S. Darwish <darwish.07@gmail.com>
Link: https://lkml.kernel.org/r/20190910042107.GA1517@darwi-home-pc
Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
Link: https://lkml.kernel.org/r/20190914222432.GC19710@mit.edu
Link: https://lkml.kernel.org/r/20180514003034.GI14763@thunk.org
Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
Link: https://lkml.kernel.org/r/20190917052438.GA26923@1wt.eu
Link: https://lkml.kernel.org/r/20190917160844.GC31567@gardel-login
Link: https://lkml.kernel.org/r/CAHk-=wjABG3+daJFr4w3a+OWuraVcZpi=SMUg=pnZ+7+O0E2FA@mail.gmail.com
Link: https://lkml.kernel.org/r/CAHk-=wjQeiYu8Q_wcMgM-nAcW7KsBfG1+90DaTD5WF2cCeGCgA@mail.gmail.com
Link: https://factorable.net ("Widespread Weak Keys in Network Devices")
Link: https://man.openbsd.org/man4/random.4
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---
 drivers/char/Kconfig        | 60 ++++++++++++++++++++++++--
 drivers/char/random.c       | 85 ++++++++++++++++++++++++++++++++-----
 include/uapi/linux/random.h | 20 +++++++--
 3 files changed, 148 insertions(+), 17 deletions(-)

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index df0fc997dc3e..772765c36fc3 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -535,8 +535,6 @@ config ADI
 	  and SSM (Silicon Secured Memory).  Intended consumers of this
 	  driver include crash and makedumpfile.
 
-endmenu
-
 config RANDOM_TRUST_CPU
 	bool "Trust the CPU manufacturer to initialize Linux's CRNG"
 	depends on X86 || S390 || PPC
@@ -559,4 +557,60 @@ config RANDOM_TRUST_BOOTLOADER
 	device randomness. Say Y here to assume the entropy provided by the
 	booloader is trustworthy so it will be added to the kernel's entropy
 	pool. Otherwise, say N here so it will be regarded as device input that
-	only mixes the entropy pool.
\ No newline at end of file
+	only mixes the entropy pool.
+
+config GETRANDOM_WAIT_THRESHOLD_SEC
+	int
+	default 30
+	help
+	  The getrandom(2) system call, when asking for entropy from the
+	  urandom source, blocks until the kernel's Cryptographic Random
+	  Number Generator (CRNG) gets initialized. This configuration
+	  option sets the maximum wait time, in seconds, for a process
+	  to get blocked on such a system call before the kernel issues
+	  a loud warning. Rationale follows:
+
+	  When the getrandom(2) system call was created, it came with
+	  the clear warning: "Any userspace program which uses this new
+	  functionality must take care to assure that if it is used
+	  during the boot process, that it will not cause the init
+	  scripts or other portions of the system startup to hang
+	  indefinitely.
+
+	  Unfortunately, due to multiple factors, including not having
+	  this warning written in a scary-enough language in the
+	  manpages, and due to glibc since v2.25 implementing a BSD-like
+	  getentropy(3) in terms of getrandom(2), modern user-space is
+	  calling getrandom(2) in the boot path everywhere.
+
+	  Embedded Linux systems were first hit by this, and reports of
+	  embedded system "getting stuck at boot" began to be
+	  common. Over time, the issue began to even creep into consumer
+	  level x86 laptops: mainstream distributions, like Debian
+	  Buster, began to recommend installing haveged as a workaround,
+	  just to let the system boot.
+
+	  Filesystem optimizations in EXT4 and XFS exagerated the
+	  problem, due to aggressive batching of IO requests, and thus
+	  minimizing sources of entropy at boot. This led to large
+	  delays until the kernel's CRNG got initialized.
+
+	  System integrators and distribution builderss are not
+	  encouraged to considerably increase this value: during system
+	  boot, you either have entropy, or you don't. And if you didn't
+	  have entropy, it will stay like this forever, because if you
+	  had, you wouldn't have blocked in the first place. It's an
+	  atomic "either/or" situation, with no middle ground. Please
+	  think twice.
+
+	  Ideally, systems would be configured with hardware random
+	  number generators, and/or configured to trust the CPU-provided
+	  RNG's (CONFIG_RANDOM_TRUST_CPU) or boot-loader provided ones
+	  (CONFIG_RANDOM_TRUST_BOOTLOADER).  In addition, userspace
+	  should generate cryptographic keys only as late as possible,
+	  when they are needed, instead of during early boot.  For
+	  non-cryptographic use cases, such as dictionary seeds or MIT
+	  Magic Cookies, the getrandom2(GRND2_INSECURE) system call,
+	  or even random(3), may be more appropropriate.
+
+endmenu
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 566922df4b7b..74057e496303 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -322,6 +322,7 @@
 #include <linux/interrupt.h>
 #include <linux/mm.h>
 #include <linux/nodemask.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
 #include <linux/percpu.h>
@@ -854,12 +855,21 @@ static void invalidate_batched_entropy(void);
 static void numa_crng_init(void);
 
 static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static int getrandom_wait_threshold __ro_after_init =
+				CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC;
+
 static int __init parse_trust_cpu(char *arg)
 {
 	return kstrtobool(arg, &trust_cpu);
 }
 early_param("random.trust_cpu", parse_trust_cpu);
 
+static int __init parse_getrandom_wait_threshold(char *arg)
+{
+	return kstrtoint(arg, 0, &getrandom_wait_threshold);
+}
+early_param("random.getrandom_wait_threshold", parse_getrandom_wait_threshold);
+
 static void crng_initialize(struct crng_state *crng)
 {
 	int		i;
@@ -1960,7 +1970,7 @@ random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 }
 
 static ssize_t
-urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+_urandom_read(char __user *buf, size_t nbytes, bool warn_on_noninited_crng)
 {
 	unsigned long flags;
 	static int maxwarn = 10;
@@ -1968,7 +1978,7 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 
 	if (!crng_ready() && maxwarn > 0) {
 		maxwarn--;
-		if (__ratelimit(&urandom_warning))
+		if (warn_on_noninited_crng && __ratelimit(&urandom_warning))
 			printk(KERN_NOTICE "random: %s: uninitialized "
 			       "urandom read (%zd bytes read)\n",
 			       current->comm, nbytes);
@@ -1982,6 +1992,12 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 	return ret;
 }
 
+static ssize_t
+urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+{
+	return _urandom_read(buf, nbytes, true);
+}
+
 static __poll_t
 random_poll(struct file *file, poll_table * wait)
 {
@@ -2118,11 +2134,41 @@ const struct file_operations urandom_fops = {
 	.llseek = noop_llseek,
 };
 
-SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
-		unsigned int, flags)
+static int getrandom_wait(char __user *buf, size_t count,
+			  bool warn_on_large_wait)
 {
+	unsigned long timeout = MAX_SCHEDULE_TIMEOUT;
 	int ret;
 
+	if (warn_on_large_wait && (getrandom_wait_threshold > 0))
+		timeout = HZ * getrandom_wait_threshold;
+
+	do {
+		ret = wait_event_interruptible_timeout(crng_init_wait,
+						       crng_ready(),
+						       timeout);
+		if (ret < 0)
+			return ret;
+
+		if (ret == 0) {
+			WARN(1, "random: %s[%d]: getrandom(%zu bytes) "
+			     "is blocked for more than %d seconds. Check "
+			     "getrandom_wait(7)\n", current->comm,
+			     task_pid_nr(current), count,
+			     getrandom_wait_threshold);
+
+			/* warn once per caller */
+			timeout = MAX_SCHEDULE_TIMEOUT;
+		}
+
+	} while (ret == 0);
+
+	return _urandom_read(buf, count, true);
+}
+
+SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
+		unsigned int, flags)
+{
 	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
 		return -EINVAL;
 
@@ -2132,14 +2178,31 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (flags & GRND_RANDOM)
 		return _random_read(flags & GRND_NONBLOCK, buf, count);
 
-	if (!crng_ready()) {
-		if (flags & GRND_NONBLOCK)
+	if ((flags & GRND_NONBLOCK) && !crng_ready())
 			return -EAGAIN;
-		ret = wait_for_random_bytes();
-		if (unlikely(ret))
-			return ret;
-	}
-	return urandom_read(NULL, buf, count, NULL);
+
+	return getrandom_wait(buf, count, true);
+}
+
+SYSCALL_DEFINE3(getrandom2, char __user *, buf, size_t, count,
+		unsigned int, flags)
+{
+	if (flags & ~(GRND2_SECURE_UNBOUNDED_INITIAL_WAIT|GRND2_INSECURE))
+		return -EINVAL;
+
+	if (flags & (GRND2_SECURE_UNBOUNDED_INITIAL_WAIT|GRND2_INSECURE))
+		return -EINVAL;
+
+	if (count > INT_MAX)
+		count = INT_MAX;
+
+	if (flags & GRND2_SECURE_UNBOUNDED_INITIAL_WAIT)
+		return getrandom_wait(buf, count, false);
+
+	if (flags & GRND2_INSECURE)
+		return _urandom_read(buf, count, false);
+
+	unreachable();
 }
 
 /********************************************************************
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index 26ee91300e3e..3f09a8f6aff3 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -8,6 +8,7 @@
 #ifndef _UAPI_LINUX_RANDOM_H
 #define _UAPI_LINUX_RANDOM_H
 
+#include <linux/bits.h>
 #include <linux/types.h>
 #include <linux/ioctl.h>
 #include <linux/irqnr.h>
@@ -23,7 +24,7 @@
 /* Get the contents of the entropy pool.  (Superuser only.) */
 #define RNDGETPOOL	_IOR( 'R', 0x02, int [2] )
 
-/* 
+/*
  * Write bytes into the entropy pool and add to the entropy count.
  * (Superuser only.)
  */
@@ -50,7 +51,20 @@ struct rand_pool_info {
  * GRND_NONBLOCK	Don't block and return EAGAIN instead
  * GRND_RANDOM		Use the /dev/random pool instead of /dev/urandom
  */
-#define GRND_NONBLOCK	0x0001
-#define GRND_RANDOM	0x0002
+#define GRND_NONBLOCK				BIT(0)
+#define GRND_RANDOM				BIT(1)
+
+/*
+ * Flags for getrandom2(2)
+ *
+ * GRND2_SECURE		Use urandom pool, block until CRNG is inited
+ * GRND2_INSECURE	Use urandom pool, never block even if CRNG isn't inited
+ *
+ * NOTE: don't mix flag values with GRND, to protect against the
+ * security implications of users passing the invalid flag family
+ * to system calls (GRND_* vs. GRND2_*).
+ */
+#define GRND2_SECURE_UNBOUNDED_INITIAL_WAIT	BIT(7)
+#define GRND2_INSECURE				BIT(8)
 
 #endif /* _UAPI_LINUX_RANDOM_H */
-- 
Ahmed Darwish
http://darwish.chasingpointers.com

^ permalink raw reply related	[relevance 53%]

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  @ 2019-09-20 13:46 89%                                     ` Ahmed S. Darwish
      2019-09-26 20:42 89%                                     ` [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags Ahmed S. Darwish
  1 sibling, 2 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-20 13:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Lennart Poettering, Theodore Y. Ts'o, Eric W. Biederman,
	Alexander E. Patrakov, Michael Kerrisk, Willy Tarreau,
	Matthew Garrett, lkml, linux-ext4, linux-api, linux-man

Hi,

On Wed, Sep 18, 2019 at 04:57:58PM -0700, Linus Torvalds wrote:
> On Wed, Sep 18, 2019 at 2:17 PM Ahmed S. Darwish <darwish.07@gmail.com> wrote:
> >
> > Since Linux v3.17, getrandom(2) has been created as a new and more
> > secure interface for pseudorandom data requests.  It attempted to
> > solve three problems, as compared to /dev/urandom:
  > 
> I don't think your patch is really _wrong_, but I think it's silly to
> introduce a new system call, when we have 30 bits left in the flags of
> the old one, and the old system call checked them.
> 
> So it's much simpler and more straightforward to  just introduce a
> single new bit #2 that says "I actually know what I'm doing, and I'm
> explicitly asking for secure/insecure random data".
> 
> And then say that the existing bit #1 just means "I want to wait for entropy".
> 
> So then you end up with this:
> 
>     /*
>      * Flags for getrandom(2)
>      *
>      * GRND_NONBLOCK    Don't block and return EAGAIN instead
>      * GRND_WAIT_ENTROPY        Explicitly wait for entropy
>      * GRND_EXPLICIT    Make it clear you know what you are doing
>      */
>     #define GRND_NONBLOCK               0x0001
>     #define GRND_WAIT_ENTROPY   0x0002
>     #define GRND_EXPLICIT               0x0004
> 
>     #define GRND_SECURE (GRND_EXPLICIT | GRND_WAIT_ENTROPY)
>     #define GRND_INSECURE       (GRND_EXPLICIT | GRND_NONBLOCK)
> 
>     /* Nobody wants /dev/random behavior, nobody should use it */
>     #define GRND_RANDOM 0x0002
> 
> which is actually fairly easy to understand. So now we have three
> bits, and the values are:
> 
>  000  - ambiguous "secure or just lazy/ignorant"
>  001 - -EAGAIN or secure
>  010 - blocking /dev/random DO NOT USE
>  011 - nonblocking /dev/random DO NOT USE
>  100 - nonsense, returns -EINVAL
>  101 - /dev/urandom without warnings
>  110 - blocking secure
>  111 - -EAGAIN or secure
>

Hmmm, the point of the new syscall was **exactly** to avoid the 2^3
combinations above, and to provide developers only two, sane and easy,
options:

  - GRND2_INSECURE
  - GRND2_SECURE_UNBOUNDED_INITIAL_WAIT

You *must* pick one of these, and that's it. (!)

Then the proposed getrandom_wait(7) manpage, also mentioned in the V4
patch WARN message, would provide a big rationale, and encourage
everyone to use the new getrandom2(2) syscall instead.

But yeah, maybe we should add the extra flags to the old getrandom()
instead, and let glibc implement a getrandom_safe(3) wrapper only
with the sane options available.

Problem is, glibc is still *really* slow in adopting linux syscall
wrappers, so I'm not optimistic about that...

I still see the new system call as the sanest path, even provided
the cost of a new syscall number..

@Linus, @Ted:  Final thoughts?

> and people would be encouraged to use one of these three:
> 
>  - GRND_INSECURE
>  - GRND_SECURE
>  - GRND_SECURE | GRND_NONBLOCK
> 
> all of which actually make sense, and none of which have any
> ambiguity. And while "GRND_INSECURE | GRND_NONBLOCK" works, it's
> exactly the same as just plain GRND_INSECURE - the point is that it
> doesn't block for entropy anyway, so non-blocking makes no different.
>

[...]

> 
> There is *one* other small semantic change: The old code did
> urandom_read() which added warnings, but each warning also _reset_ the
> crng_init_cnt. Until it decided not to warn any more, at which point
> it also stops that resetting of crng_init_cnt.
> 
> And that reset of crng_init_cnt, btw, is some cray cray.
> 
> It's basically a "we used up entropy" thing, which is very
> questionable to begin with as the whole discussion has shown, but
> since it stops doing it after 10 cases, it's not even good security
> assuming the "use up entropy" case makes sense in the first place.
> 
> So I didn't copy that insanity either. And I'm wondering if removing
> it from /dev/urandom might also end up helping Ahmed's case of getting
> entropy earlier, when we don't reset the counter.
>

Yeah, noticed that, but I've learned not to change crypto or
speculative-execution code even if the changes "just look the same" at
first glance ;-)

(out of curiosity, I'll do a quick test with this CRNG entropy reset
part removed. Maybe it was indeed part of the problem..)

> But other than those two details, none of the existing semantics
> changed, we just added the three actually _sane_ cases without any
> ambiguity.
> 
> In particular, this still leaves the semantics of that nasty
> "getrandom(0)" as the same "blocking urandom" that it currently is.
> But now it's a separate case, and we can make that perhaps do the
> timeout, or at least the warning.
>

Yeah, I would propose to keep the V4-submitted "timeout then WARN"
logic. This alone will give user-space / distributions time to adapt.

For example, it was interesting that even the 0day bot had limited
entropy on boot (virtio-rng / TRUST_CPU not enabled):

    https://lkml.kernel.org/r/20190920005120.GP15734@shao2-debian

If user-space didn't get its act together, then the other extreme
measures can be implemented later (the getrandom() length test, using
jitter as a credited kernel entropy source, etc., etc.)

> And the new cases are defined to *not* warn. In particular,
> GRND_INSECURE very much does *not* warn about early urandom access
> when crng isn't ready. Because the whole point of that new mode is
> that the user knows it isn't secure.
>
> So that should make getrandom(GRND_INSECURE) palatable to the systemd
> kind of use that wanted to avoid the pointless kernel warning.
>

Yup, that's what was in the submitted V4 patch too. The caller
explicitly asked for "insecure", so they know what they're doing.

getrandom2(2) never prints any kernel message.

> And we could mark this for stable and try to get it backported so that
> it will have better coverage, and encourage people to use the new sane
> _explicit_ waiting (or not) for entropy.
>

ACK. I'll wait for an answer to the "Final thoughts?" question above,
send a V5 with CC:stable, then disappear from this thread ;-)

Thanks a lot everyone!

--
Ahmed Darwish

^ permalink raw reply	[relevance 89%]

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  @ 2019-09-20 17:56 99%                                         ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-20 17:56 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Linus Torvalds, Lennart Poettering, Theodore Y. Ts'o,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk,
	Matthew Garrett, lkml, linux-ext4, linux-api, linux-man

On Fri, Sep 20, 2019 at 07:26:09PM +0200, Willy Tarreau wrote:
> Hi Ahmed,
> 
> On Fri, Sep 20, 2019 at 03:46:09PM +0200, Ahmed S. Darwish wrote:
> > Problem is, glibc is still *really* slow in adopting linux syscall
> > wrappers, so I'm not optimistic about that...
> >
> > I still see the new system call as the sanest path, even provided
> > the cost of a new syscall number..
> 
> New syscalls are always a pain to deal with in userland, because when
> they are introduced, everyone wants them long before they're available
> in glibc. So userland has to define NR_xxx for each supported arch and
> to perform the call itself.
> 
> With flags adoption is instantaneous. Just #ifndef/#define, check if
> the flag is supported and that's done. The only valid reason for a new
> syscall is when the API changes (e.g. one extra arg, a la accept4()),
> which doesn't seem to be the case here. Otherwise please by all means
> avoid this in general.
> 

I see. Thanks a lot for the explanation above :)

--
Ahmed Darwish

^ permalink raw reply	[relevance 99%]

* [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
    2019-09-20 13:46 89%                                     ` Ahmed S. Darwish
@ 2019-09-26 20:42 89%                                     ` Ahmed S. Darwish
  2019-09-26 20:44 50%                                       ` [PATCH v5 1/1] " Ahmed S. Darwish
  1 sibling, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-26 20:42 UTC (permalink / raw)
  To: Linus Torvalds, Theodore Y. Ts'o
  Cc: Florian Weimer, Willy Tarreau, Matthew Garrett, Andy Lutomirski,
	Lennart Poettering, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, lkml, linux-ext4, linux-api, linux-man

Summary / Changelog-v5:

  - Add the new flags GRND_INSECURE and GRND_SECURE_UNBOUNDED_INITIAL_WAIT
    to getrandom(2), instead of introducing a new getrandom2(2) system
    call, which nobody liked.

  - Fix a bug discovered through testing where "int ret =
    wait_event_interruptible_timeout(waitq, true, MAX_SCHEDULE_TIMEOUT)"
    returns failure (-1) due to implicit LONG_MAX => int truncation

  - WARN if a process is stuck on getrandom(,,flags=0) for more than 30
    seconds ... defconfig and bootparam configurable

  - Add documentation for "random.getrandom_wait_threshold" kernel param

  - Extra comments @ include/uapi/linux/random.h and random.c::getrandom.
    Explicit recommendations to *exclusively* use the new flags.

  - GRND_INSECURE never issue any warning, even if CRNG is not inited.
    Similarly for GRND_SECURE_UNBOUNDED_INITIAL_WAIT, no matter how
    big the unbounded wait is.

In a reply to the V4 patch, Linus posted a related patch [*] with the
following additions:

  - Drop the original random.c behavior of having each /dev/urandom
    "CRNG not inited" warning also _reset_ the crng_init_cnt entropy.

    This is not included in this patch, as IMHO this can be done as a
    separate patch on top.

 - Limit GRND_RANDOM max count/buflen to 32MB instead of 2GB.  This
   is very sane obviously, and can be done in a separate patch on
   top.

   This V5 patch just tries to be as conservative as possible.

 - GRND_WAIT_ENTROPY and GRND_EXCPLICIT: AFAIK these were primarily
   added so that getrandom(,,flags=0) can be changed to return
   weaker non-blocking crypto from non-inited CRG in a possible
   future.

   I hope we don't have to resort to that extreme measure.. Hopefully
   the WARN() on this patch will be enough in nudging distributions to
   enable more hwrng sources (RDRAND, etc.) .. and also for the
   user-space developres badly pointed at (hi GDM and Qt) to fix their
   code.

[*] https://lkml.kernel.org/r/CAHk-=wiCqDiU7SE3FLn2W26MS_voUAuqj5XFa1V_tiGTrrW-zQ@mail.gmail.com

Ahmed S. Darwish (1):
  random: getrandom(2): warn on large CRNG waits, introduce new flags

 .../admin-guide/kernel-parameters.txt         |   7 ++
 drivers/char/Kconfig                          |  60 ++++++++++-
 drivers/char/random.c                         | 102 +++++++++++++++---
 include/uapi/linux/random.h                   |  27 ++++-
 4 files changed, 177 insertions(+), 19 deletions(-)

--
2.23.0

^ permalink raw reply	[relevance 89%]

* [PATCH v5 1/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
  2019-09-26 20:42 89%                                     ` [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags Ahmed S. Darwish
@ 2019-09-26 20:44 50%                                       ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-09-26 20:44 UTC (permalink / raw)
  To: Linus Torvalds, Theodore Y. Ts'o
  Cc: Florian Weimer, Willy Tarreau, Matthew Garrett, Andy Lutomirski,
	Lennart Poettering, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, lkml, linux-ext4, linux-api, linux-man

Since Linux v3.17, getrandom(2) has been created as a new and more
secure interface for pseudorandom data requests.  It attempted to
solve three problems, as compared to /dev/urandom:

  1. the need to access filesystem paths, which can fail, e.g. under a
     chroot

  2. the need to open a file descriptor, which can fail under file
     descriptor exhaustion attacks

  3. the possibility of getting not-so-random data from /dev/urandom,
     due to an incompletely initialized kernel entropy pool

To solve the third point, getrandom(2) was made to block until a
proper amount of entropy has been accumulated to initialize the CRNG
ChaCha20 cipher.  This made the system call have no guaranteed
upper-bound for its initial waiting time.

Thus when it was introduced at c6e9d6f38894 ("random: introduce
getrandom(2) system call"), it came with a clear warning: "Any
userspace program which uses this new functionality must take care to
assure that if it is used during the boot process, that it will not
cause the init scripts or other portions of the system startup to hang
indefinitely."

Unfortunately, due to multiple factors, including not having this
warning written in a scary-enough language in the manpages, and due to
glibc since v2.25 implementing a BSD-like getentropy(3) in terms of
getrandom(2), modern user-space is calling getrandom(2) in the boot
path everywhere (e.g. Qt, GDM, etc.)

Embedded Linux systems were first hit by this, and reports of embedded
systems "getting stuck at boot" began to be common.  Over time, the
issue began to even creep into consumer-level x86 laptops: mainstream
distributions, like Debian Buster, began to recommend installing
haveged as a duct-tape workaround... just to let the system boot.

Moreover, filesystem optimizations in EXT4 and XFS, e.g. b03755ad6f33
("ext4: make __ext4_get_inode_loc plug"), which merged directory
lookup code inode table IO, and very fast systemd boots, further
exaggerated the problem by limiting interrupt-based entropy sources.
This led to large delays until the kernel's cryptographic random
number generator (CRNG) got initialized.

On a Thinkpad E480 x86 laptop and an ArchLinux user-space, the ext4
commit earlier mentioned reliably blocked the system on GDM boot.
Mitigate the problem, as a first step, in two ways:

  1. Issue a big WARN_ON when any process gets stuck on getrandom(2)
     for more than CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC seconds.

  2. Introduce new getrandom(2) flags, with clear semantics that can
     hopefully guide user-space in doing the right thing.

Set CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC to a heuristic 30-second
default value. System integrators and distribution builders are deeply
encouraged not to increase it much: during system boot, you either
have entropy, or you don't. And if you didn't have entropy, it will
stay like this forever, because if you had, you wouldn't have blocked
in the first place. It's an atomic "either/or" situation, with no
middle ground. Please think twice.

For the new getrandom(2) flags, be much more explicit.  As Linus
mentioned several times in the bug report thread, Linux should've
never provided /dev/random and the getrandom(GRND_RANDOM) APIs. These
interfaces are broken by design due to their almost-permanent
blockage, leading to the current misuse of /dev/urandom and
getrandom(flags=0) calls. Thus introduce the flags:

  1. GRND_INSECURE
  2. GRND_SECURE_UNBOUNDED_INITIAL_WAIT

where both extract randomness _exclusively_ from the urandom source.

Due to the explicit semantics of these new flags, GRND_INSECURE will
never issue a kernel warning message even if the CRNG is not yet
inited.  Similarly, GRND_SECURE_UNBOUNDED_INITIAL_WAIT will never
cause any any kernel WARN, no matter how large the unbounded wait is.

Rreported-by: Ahmed S. Darwish <darwish.07@gmail.com>
Link: https://lkml.kernel.org/r/20190910042107.GA1517@darwi-home-pc
Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
Link: https://lkml.kernel.org/r/20190914222432.GC19710@mit.edu
Link: https://lkml.kernel.org/r/20180514003034.GI14763@thunk.org
Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
Link: https://lkml.kernel.org/r/20190917052438.GA26923@1wt.eu
Link: https://lkml.kernel.org/r/20190917160844.GC31567@gardel-login
Link: https://lkml.kernel.org/r/CAHk-=wjABG3+daJFr4w3a+OWuraVcZpi=SMUg=pnZ+7+O0E2FA@mail.gmail.com
Link: https://lkml.kernel.org/r/CAHk-=wjQeiYu8Q_wcMgM-nAcW7KsBfG1+90DaTD5WF2cCeGCgA@mail.gmail.com
Link: https://factorable.net ("Widespread Weak Keys in Network Devices")
Link: https://man.openbsd.org/man4/random.4
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
---
 .../admin-guide/kernel-parameters.txt         |   7 ++
 drivers/char/Kconfig                          |  60 ++++++++++-
 drivers/char/random.c                         | 102 +++++++++++++++---
 include/uapi/linux/random.h                   |  27 ++++-
 4 files changed, 177 insertions(+), 19 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6ef205fd7c97..d82eafc6a62a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3728,6 +3728,13 @@
 			fully seed the kernel's CRNG. Default is controlled
 			by CONFIG_RANDOM_TRUST_CPU.

+	random.getrandom_wait_threshold=
+			Maximum amount, in seconds, for a process to block
+			in a getrandom(,,flags=0) systemcall without a loud
+			warning in the kernel logs. Default is controlled by
+			CONFIG_RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC. Check
+			the config option help text for more information.
+
 	ras=option[,option,...]	[KNL] RAS-specific options

 		cec_disable	[X86]
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index df0fc997dc3e..adc9bc63d27c 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -535,8 +535,6 @@ config ADI
 	  and SSM (Silicon Secured Memory).  Intended consumers of this
 	  driver include crash and makedumpfile.

-endmenu
-
 config RANDOM_TRUST_CPU
 	bool "Trust the CPU manufacturer to initialize Linux's CRNG"
 	depends on X86 || S390 || PPC
@@ -559,4 +557,60 @@ config RANDOM_TRUST_BOOTLOADER
 	device randomness. Say Y here to assume the entropy provided by the
 	booloader is trustworthy so it will be added to the kernel's entropy
 	pool. Otherwise, say N here so it will be regarded as device input that
-	only mixes the entropy pool.
\ No newline at end of file
+	only mixes the entropy pool.
+
+config RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC
+	int
+	default 30
+	help
+	  The getrandom(2) system call, when asking for entropy from the
+	  urandom source, blocks until the kernel's Cryptographic Random
+	  Number Generator (CRNG) gets initialized. This configuration
+	  option sets the maximum wait time, in seconds, for a process
+	  to get blocked on such a system call before the kernel issues
+	  a loud warning. Rationale follows:
+
+	  When the getrandom(2) system call was created, it came with
+	  the clear warning: "Any userspace program which uses this new
+	  functionality must take care to assure that if it is used
+	  during the boot process, that it will not cause the init
+	  scripts or other portions of the system startup to hang
+	  indefinitely.
+
+	  Unfortunately, due to multiple factors, including not having
+	  this warning written in a scary-enough language in the
+	  manpages, and due to glibc since v2.25 implementing a BSD-like
+	  getentropy(3) in terms of getrandom(2), modern user-space is
+	  calling getrandom(2) in the boot path everywhere.
+
+	  Embedded Linux systems were first hit by this, and reports of
+	  embedded system "getting stuck at boot" began to be
+	  common. Over time, the issue began to even creep into consumer
+	  level x86 laptops: mainstream distributions, like Debian
+	  Buster, began to recommend installing haveged as a workaround,
+	  just to let the system boot.
+
+	  Filesystem optimizations in EXT4 and XFS exaggerated the
+	  problem, due to aggressive batching of IO requests, and thus
+	  minimizing sources of entropy at boot. This led to large
+	  delays until the kernel's CRNG got initialized.
+
+	  System integrators and distribution builders are not
+	  encouraged to considerably increase this value: during system
+	  boot, you either have entropy, or you don't. And if you didn't
+	  have entropy, it will stay like this forever, because if you
+	  had, you wouldn't have blocked in the first place. It's an
+	  atomic "either/or" situation, with no middle ground. Please
+	  think twice.
+
+	  Ideally, systems would be configured with hardware random
+	  number generators, and/or configured to trust the CPU-provided
+	  RNG's (CONFIG_RANDOM_TRUST_CPU) or boot-loader provided ones
+	  (CONFIG_RANDOM_TRUST_BOOTLOADER).  In addition, userspace
+	  should generate cryptographic keys only as late as possible,
+	  when they are needed, instead of during early boot.  For
+	  non-cryptographic use cases, such as dictionary seeds or MIT
+	  Magic Cookies, the getrandom2(GRND2_INSECURE) system call,
+	  or even random(3), may be more appropriate.
+
+endmenu
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 566922df4b7b..37c00cff1c08 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -322,6 +322,7 @@
 #include <linux/interrupt.h>
 #include <linux/mm.h>
 #include <linux/nodemask.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
 #include <linux/percpu.h>
@@ -854,12 +855,21 @@ static void invalidate_batched_entropy(void);
 static void numa_crng_init(void);

 static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static int getrandom_wait_threshold __ro_after_init =
+				CONFIG_RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC;
+
 static int __init parse_trust_cpu(char *arg)
 {
 	return kstrtobool(arg, &trust_cpu);
 }
 early_param("random.trust_cpu", parse_trust_cpu);

+static int __init parse_getrandom_wait_threshold(char *arg)
+{
+	return kstrtoint(arg, 0, &getrandom_wait_threshold);
+}
+early_param("random.getrandom_wait_threshold", parse_getrandom_wait_threshold);
+
 static void crng_initialize(struct crng_state *crng)
 {
 	int		i;
@@ -1960,7 +1970,7 @@ random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 }

 static ssize_t
-urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+_urandom_read(char __user *buf, size_t nbytes, bool warn_on_noninited_crng)
 {
 	unsigned long flags;
 	static int maxwarn = 10;
@@ -1968,7 +1978,7 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)

 	if (!crng_ready() && maxwarn > 0) {
 		maxwarn--;
-		if (__ratelimit(&urandom_warning))
+		if (warn_on_noninited_crng && __ratelimit(&urandom_warning))
 			printk(KERN_NOTICE "random: %s: uninitialized "
 			       "urandom read (%zd bytes read)\n",
 			       current->comm, nbytes);
@@ -1982,6 +1992,13 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 	return ret;
 }

+static ssize_t
+urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+{
+	/* warn on non-inited CRNG */
+	return _urandom_read(buf, nbytes, true);
+}
+
 static __poll_t
 random_poll(struct file *file, poll_table * wait)
 {
@@ -2118,13 +2135,55 @@ const struct file_operations urandom_fops = {
 	.llseek = noop_llseek,
 };

+static int geturandom_wait(char __user *buf, size_t count,
+			   bool warn_on_large_wait)
+{
+	long ret, timeout = MAX_SCHEDULE_TIMEOUT;
+
+	if (warn_on_large_wait && (getrandom_wait_threshold > 0))
+		timeout = HZ * getrandom_wait_threshold;
+
+	do {
+		ret = wait_event_interruptible_timeout(crng_init_wait,
+						       crng_ready(),
+						       timeout);
+		if (ret < 0)
+			return ret;
+
+		if (ret == 0) {
+			WARN(1, "random: %s[%d]: getrandom(%zu bytes) "
+			     "is blocked for more than %d seconds. Check "
+			     "getrandom_wait(7)\n", current->comm,
+			     task_pid_nr(current), count,
+			     getrandom_wait_threshold);
+
+			/* warn once per caller */
+			timeout = MAX_SCHEDULE_TIMEOUT;
+		}
+
+	} while (ret == 0);
+
+	return _urandom_read(buf, count, true);
+}
+
 SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 		unsigned int, flags)
 {
-	int ret;
+	unsigned int i, invalid_combs[] = {
+		GRND_INSECURE|GRND_SECURE_UNBOUNDED_INITIAL_WAIT,
+		GRND_INSECURE|GRND_RANDOM,
+	};

-	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
+	if (flags & ~(GRND_NONBLOCK | \
+		      GRND_RANDOM   | \
+		      GRND_INSECURE | \
+		      GRND_SECURE_UNBOUNDED_INITIAL_WAIT)) {
 		return -EINVAL;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(invalid_combs); i++)
+		if ((flags & invalid_combs[i]) == invalid_combs[i])
+			return -EINVAL;

 	if (count > INT_MAX)
 		count = INT_MAX;
@@ -2132,14 +2191,33 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (flags & GRND_RANDOM)
 		return _random_read(flags & GRND_NONBLOCK, buf, count);

-	if (!crng_ready()) {
-		if (flags & GRND_NONBLOCK)
+	/*
+	 * urandom: explicit request *not* to wait for CRNG init, and
+	 * thus no "uninitialized urandom read" warnings.
+	 */
+	if (flags & GRND_INSECURE)
+		return _urandom_read(buf, count, false);
+
+	/* urandom: nonblocking access */
+	if ((flags & GRND_NONBLOCK) && !crng_ready())
 			return -EAGAIN;
-		ret = wait_for_random_bytes();
-		if (unlikely(ret))
-			return ret;
-	}
-	return urandom_read(NULL, buf, count, NULL);
+
+	/*
+	 * urandom: explicit request *to* wait for CRNG init, and thus
+	 * no "getrandom is blocked for more than X seconds" warnings
+	 * on large waits.
+	 */
+	if (flags & GRND_SECURE_UNBOUNDED_INITIAL_WAIT)
+		return geturandom_wait(buf, count, false);
+
+	/*
+	 * urandom: *implicit* request to wait for CRNG init (flags=0)
+	 *
+	 * User-space has been badly abusing this by calling getrandom
+	 * with flags=0 in the boot path, and thus blocking system
+	 * boots forever in absence of entropy. Warn on large waits.
+	 */
+	return geturandom_wait(buf, count, true);
 }

 /********************************************************************
@@ -2458,4 +2536,4 @@ void add_bootloader_randomness(const void *buf, unsigned int size)
 	else
 		add_device_randomness(buf, size);
 }
-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
\ No newline at end of file
+EXPORT_SYMBOL_GPL(add_bootloader_randomness);
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index 26ee91300e3e..5a3df92270a7 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -8,6 +8,7 @@
 #ifndef _UAPI_LINUX_RANDOM_H
 #define _UAPI_LINUX_RANDOM_H

+#include <linux/bits.h>
 #include <linux/types.h>
 #include <linux/ioctl.h>
 #include <linux/irqnr.h>
@@ -23,7 +24,7 @@
 /* Get the contents of the entropy pool.  (Superuser only.) */
 #define RNDGETPOOL	_IOR( 'R', 0x02, int [2] )

-/*
+/*
  * Write bytes into the entropy pool and add to the entropy count.
  * (Superuser only.)
  */
@@ -47,10 +48,28 @@ struct rand_pool_info {
 /*
  * Flags for getrandom(2)
  *
+ * 0			discouraged - don't use (see below)
  * GRND_NONBLOCK	Don't block and return EAGAIN instead
- * GRND_RANDOM		Use the /dev/random pool instead of /dev/urandom
+ * GRND_RANDOM		discouraged - don't use (uses /dev/random pool)
+ * GRND_INSECURE	Use urandom pool, never block even if CRNG isn't inited
+ * GRND_SECURE_UNBOUNDED_INITIAL_WAIT
+ *			Use urandom pool, block until CRNG is inited
+ *
+ * User-space has been badly abusing getrandom(flags=0) by calling
+ * it in the boot path, and thus blocking system boots forever in
+ * the absence of entropy (a blocked system cannot generate more
+ * entropy, by definition).
+ *
+ * Thus if a process blocks on a getrandom(flags=0), waithing for
+ * more than CONFIG_RANDOM_GETRANDOM_WAIT_THRESHOLD_SEC seconds,
+ * the kernel will issue a loud warning.
+ *
+ * In general, don't use flags=0. Always use either GRND_INSECURE
+ * or GRND_SECURE_UNBOUNDED_INITIAL_WAIT instead.
  */
-#define GRND_NONBLOCK	0x0001
-#define GRND_RANDOM	0x0002
+#define GRND_NONBLOCK				BIT(0)
+#define GRND_RANDOM				BIT(1)
+#define GRND_INSECURE				BIT(2)
+#define GRND_SECURE_UNBOUNDED_INITIAL_WAIT	BIT(3)

 #endif /* _UAPI_LINUX_RANDOM_H */
--
2.23.0

^ permalink raw reply related	[relevance 50%]

* Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
  @ 2019-09-26 21:11 99%                                                   ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-26 21:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Florian Weimer, Linus Torvalds, Lennart Poettering,
	Theodore Y. Ts'o, Eric W. Biederman, Alexander E. Patrakov,
	Michael Kerrisk, Willy Tarreau, Matthew Garrett, lkml,
	Ext4 Developers List, Linux API, linux-man

On Mon, Sep 23, 2019 at 11:33:21AM -0700, Andy Lutomirski wrote:
> On Fri, Sep 20, 2019 at 11:07 PM Florian Weimer <fweimer@redhat.com> wrote:
> >
> > * Linus Torvalds:
> >
> > > Violently agreed. And that's kind of what the GRND_EXPLICIT is really
> > > aiming for.
> > >
> > > However, it's worth noting that nobody should ever use GRND_EXPLICIT
> > > directly. That's just the name for the bit. The actual users would use
> > > GRND_INSECURE or GRND_SECURE.
> >
> > Should we switch glibc's getentropy to GRND_EXPLICIT?  Or something
> > else?
> >
> > I don't think we want to print a kernel warning for this function.
> >
> 
> Contemplating this question, I think the answer is that we should just
> not introduce GRND_EXPLICIT or anything like it.  glibc is going to
> have to do *something*, and getentropy() is unlikely to just go away.
> The explicitly documented semantics are that it blocks if the RNG
> isn't seeded.
> 
> Similarly, FreeBSD has getrandom():
> 
> https://www.freebsd.org/cgi/man.cgi?query=getrandom&sektion=2&manpath=freebsd-release-ports
> 
> and if we make getrandom(..., 0) warn, then we have a situation where
> the *correct* (if regrettable) way to use the function on FreeBSD
> causes a warning on Linux.
> 
> Let's just add GRND_INSECURE, make the blocking mode work better, and,
> if we're feeling a bit more adventurous, add GRND_SECURE_BLOCKING as a
> better replacement for 0, ...

This is what's now done in the just-submitted V5, except the "make the
blocking mode work better" part:

    https://lkml.kernel.org/r/20190926204217.GA1366@pc

It's a very conservative patch so far IMHO (minus the loud warning).

Thanks,
--
Ahmed Darwish

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v5 1/1] random: getrandom(2): warn on large CRNG waits, introduce new flags
  @ 2019-09-28  9:30 98%                                           ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-09-28  9:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linus Torvalds, Theodore Y. Ts'o, Florian Weimer,
	Willy Tarreau, Matthew Garrett, Lennart Poettering,
	Eric W. Biederman, Alexander E. Patrakov, Michael Kerrisk, lkml,
	linux-ext4, linux-api, linux-man

On Thu, Sep 26, 2019 at 02:39:44PM -0700, Andy Lutomirski wrote:
> On 9/26/19 1:44 PM, Ahmed S. Darwish wrote:
> > Since Linux v3.17, getrandom(2) has been created as a new and more
> > secure interface for pseudorandom data requests.  It attempted to
> > solve three problems, as compared to /dev/urandom:
> > 
> >    1. the need to access filesystem paths, which can fail, e.g. under a
> >       chroot
> > 
> >    2. the need to open a file descriptor, which can fail under file
> >       descriptor exhaustion attacks
> > 
> >    3. the possibility of getting not-so-random data from /dev/urandom,
> >       due to an incompletely initialized kernel entropy pool
> > 
> > To solve the third point, getrandom(2) was made to block until a
> > proper amount of entropy has been accumulated to initialize the CRNG
> > ChaCha20 cipher.  This made the system call have no guaranteed
> > upper-bound for its initial waiting time.
> > 
> > Thus when it was introduced at c6e9d6f38894 ("random: introduce
> > getrandom(2) system call"), it came with a clear warning: "Any
> > userspace program which uses this new functionality must take care to
> > assure that if it is used during the boot process, that it will not
> > cause the init scripts or other portions of the system startup to hang
> > indefinitely."
> > 
> > Unfortunately, due to multiple factors, including not having this
> > warning written in a scary-enough language in the manpages, and due to
> > glibc since v2.25 implementing a BSD-like getentropy(3) in terms of
> > getrandom(2), modern user-space is calling getrandom(2) in the boot
> > path everywhere (e.g. Qt, GDM, etc.)
> > 
> > Embedded Linux systems were first hit by this, and reports of embedded
> > systems "getting stuck at boot" began to be common.  Over time, the
> > issue began to even creep into consumer-level x86 laptops: mainstream
> > distributions, like Debian Buster, began to recommend installing
> > haveged as a duct-tape workaround... just to let the system boot.
> > 
> > Moreover, filesystem optimizations in EXT4 and XFS, e.g. b03755ad6f33
> > ("ext4: make __ext4_get_inode_loc plug"), which merged directory
> > lookup code inode table IO, and very fast systemd boots, further
> > exaggerated the problem by limiting interrupt-based entropy sources.
> > This led to large delays until the kernel's cryptographic random
> > number generator (CRNG) got initialized.
> > 
> > On a Thinkpad E480 x86 laptop and an ArchLinux user-space, the ext4
> > commit earlier mentioned reliably blocked the system on GDM boot.
> > Mitigate the problem, as a first step, in two ways:
> > 
> >    1. Issue a big WARN_ON when any process gets stuck on getrandom(2)
> >       for more than CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC seconds.
> > 
> >    2. Introduce new getrandom(2) flags, with clear semantics that can
> >       hopefully guide user-space in doing the right thing.
> > 
> > Set CONFIG_GETRANDOM_WAIT_THRESHOLD_SEC to a heuristic 30-second
> > default value. System integrators and distribution builders are deeply
> > encouraged not to increase it much: during system boot, you either
> > have entropy, or you don't. And if you didn't have entropy, it will
> > stay like this forever, because if you had, you wouldn't have blocked
> > in the first place. It's an atomic "either/or" situation, with no
> > middle ground. Please think twice.
> 
> So what do we expect glibc's getentropy() to do?  If it just adds the new
> flag to shut up the warning, we haven't really accomplished much.

Yes, if glibc adds GRND_SECURE_UNBOUNDED_INITIAL_WAIT to gentropy(3),
then this exercise would indeed be invalidated. Hopefully,
coordination with glibc will be done so it won't happen... @Florian?

Afterwards, a sane approach would be for gentropy(3) to be deprecated,
and to add getentropy_secure_unbounded_initial_wait(3) and
getentropy_insecure(3).

Note that this V5 patch does not claim to fully solve the problem, but
it will:

  1. Pinpoint to the processes causing system boots to block
  
  2. Tell people what correct alternative to use when facing problem
     #1 above, through the proposed getrandom_wait(7) manpage. That
     manpage page will fully describe the problem, and advise
     user-space to either use the new getrandom flags, or the new
     glibc gentropy_*() variants.

thanks,

--
Ahmed Darwish

^ permalink raw reply	[relevance 98%]

* Re: x86/random: Speculation to the rescue
  @ 2019-10-01 16:15 96%   ` Ahmed S. Darwish
    0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2019-10-01 16:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, a.darwish, LKML, Theodore Ts'o,
	Nicholas Mc Guire, the arch/x86 maintainers, Andy Lutomirski,
	Kees Cook

Hi,

Sorry for the late reply as I'm also on vacation this week :-)

On Sat, Sep 28, 2019 at 04:53:52PM -0700, Linus Torvalds wrote:
> On Sat, Sep 28, 2019 at 3:24 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > Nicholas presented the idea to (ab)use speculative execution for random
> > number generation years ago at the Real-Time Linux Workshop:
>
> What you describe is just a particularly simple version of the jitter
> entropy. Not very reliable.
>
> But hey, here's a made-up patch. It basically does jitter entropy, but
> it uses a more complex load than the fibonacci LFSR folding: it calls
> "schedule()" in a loop, and it sets up a timer to fire.
>
> And then it mixes in the TSC in that loop.
>
> And to be fairly conservative, it then credits one bit of entropy for
> every timer tick. Not because the timer itself would be all that
> unpredictable, but because the interaction between the timer and the
> loop is going to be pretty damn unpredictable.
>
> Ok, I'm handwaving. But I do claim it really is fairly conservative to
> think that a cycle counter would give one bit of entropy when you time
> over a timer actually happening. The way that loop is written, we do
> guarantee that we'll mix in the TSC value both before and after the
> timer actually happened. We never look at the difference of TSC
> values, because the mixing makes that uninteresting, but the code does
> start out with verifying that "yes, the TSC really is changing rapidly
> enough to be meaningful".
>
> So if we want to do jitter entropy, I'd much rather do something like
> this that actually has a known fairly complex load with timers and
> scheduling.
>
> And even if absolutely no actual other process is running, the timer
> itself is still going to cause perturbations. And the "schedule()"
> call is more complicated than the LFSR is anyway.
>
> It does wait for one second the old way before it starts doing this.
>
> Whatever. I'm entirely convinced this won't make everybody happy
> anyway, but it's _one_ approach to handle the issue.
>
> Ahmed - would you be willing to test this on your problem case (with
> the ext4 optimization re-enabled, of course)?
>

So I pulled the patch and the revert of the ext4 revert as they're all
now merged in master. It of course made the problem go away...

To test the quality of the new jitter code, I added a small patch on
top to disable all other sources of randomness except the new jitter
entropy code, [1] and made quick tests on the quality of getrandom(0).

Using the "ent" tool, [2] also used to test randomness in the Stephen
Müller LRNG paper, on a 500000-byte file, produced the following
results:

    $ ent rand-file

    Entropy = 7.999625 bits per byte.

    Optimum compression would reduce the size of this 500000 byte file
    by 0 percent.

    Chi square distribution for 500000 samples is 259.43, and randomly
    would exceed this value 41.11 percent of the times.

    Arithmetic mean value of data bytes is 127.4085 (127.5 = random).

    Monte Carlo value for Pi is 3.148476594 (error 0.22 percent).

    Serial correlation coefficient is 0.001740 (totally uncorrelated = 0.0).

As can be seen above, everything looks random, and almost all of the
statistical randomness tests matched the same kernel without the
"jitter + schedule()" patch added (after getting it un-stuck).

Thanks!

[1] Nullified add_{device,timer,input,interrupt,disk,.*}_randomness()
[2] http://www.fourmilab.ch/random/

--
Ahmed Darwish

^ permalink raw reply	[relevance 96%]

* Re: x86/random: Speculation to the rescue
  @ 2019-10-01 17:18 96%       ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2019-10-01 17:18 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Thomas Gleixner, a.darwish, LKML,
	Theodore Ts'o, Nicholas Mc Guire, the arch/x86 maintainers,
	Andy Lutomirski

On Tue, Oct 01, 2019 at 09:37:39AM -0700, Kees Cook wrote:
> On Tue, Oct 01, 2019 at 06:15:02PM +0200, Ahmed S. Darwish wrote:
> > On Sat, Sep 28, 2019 at 04:53:52PM -0700, Linus Torvalds wrote:
> > > Ahmed - would you be willing to test this on your problem case (with
> > > the ext4 optimization re-enabled, of course)?
> > >
> > 
> > So I pulled the patch and the revert of the ext4 revert as they're all
> > now merged in master. It of course made the problem go away...
> > 
> > To test the quality of the new jitter code, I added a small patch on
> > top to disable all other sources of randomness except the new jitter
> > entropy code, [1] and made quick tests on the quality of getrandom(0).
> > 
> > Using the "ent" tool, [2] also used to test randomness in the Stephen
> > Müller LRNG paper, on a 500000-byte file, produced the following
> > results:
> > 
> >     $ ent rand-file
> > 
> >     Entropy = 7.999625 bits per byte.
> > 
> >     Optimum compression would reduce the size of this 500000 byte file
> >     by 0 percent.
> > 
> >     Chi square distribution for 500000 samples is 259.43, and randomly
> >     would exceed this value 41.11 percent of the times.
> > 
> >     Arithmetic mean value of data bytes is 127.4085 (127.5 = random).
> > 
> >     Monte Carlo value for Pi is 3.148476594 (error 0.22 percent).
> > 
> >     Serial correlation coefficient is 0.001740 (totally uncorrelated = 0.0).
> > 
> > As can be seen above, everything looks random, and almost all of the
> > statistical randomness tests matched the same kernel without the
> > "jitter + schedule()" patch added (after getting it un-stuck).
> 
> Can you post the patch for [1]?
>

Yup, it was the quick&dirty patch below:

(discussion continues after the patch)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index c2f7de9dc543..26d0d2bb3337 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1177,6 +1177,8 @@ struct timer_rand_state {
  */
 void add_device_randomness(const void *buf, unsigned int size)
 {
+	return;
+
 	unsigned long time = random_get_entropy() ^ jiffies;
 	unsigned long flags;
 
@@ -1205,6 +1207,8 @@ static struct timer_rand_state input_timer_state = INIT_TIMER_RAND_STATE;
  */
 static void add_timer_randomness(struct timer_rand_state *state, unsigned num)
 {
+	return;
+
 	struct entropy_store	*r;
 	struct {
 		long jiffies;
@@ -1255,6 +1259,8 @@ static void add_timer_randomness(struct timer_rand_state *state, unsigned num)
 void add_input_randomness(unsigned int type, unsigned int code,
 				 unsigned int value)
 {
+	return;
+
 	static unsigned char last_value;
 
 	/* ignore autorepeat and the like */
@@ -1308,6 +1314,8 @@ static __u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
 
 void add_interrupt_randomness(int irq, int irq_flags)
 {
+	return;
+
 	struct entropy_store	*r;
 	struct fast_pool	*fast_pool = this_cpu_ptr(&irq_randomness);
 	struct pt_regs		*regs = get_irq_regs();
@@ -1375,6 +1383,8 @@ EXPORT_SYMBOL_GPL(add_interrupt_randomness);
 #ifdef CONFIG_BLOCK
 void add_disk_randomness(struct gendisk *disk)
 {
+	return;
+
 	if (!disk || !disk->random)
 		return;
 	/* first major is 1, so we get >= 0x200 here */
@@ -2489,6 +2499,8 @@ randomize_page(unsigned long start, unsigned long range)
 void add_hwgenerator_randomness(const char *buffer, size_t count,
 				size_t entropy)
 {
+	return;
+
 	struct entropy_store *poolp = &input_pool;
 
 	if (unlikely(crng_init == 0)) {
@@ -2515,9 +2527,11 @@ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
  */
 void add_bootloader_randomness(const void *buf, unsigned int size)
 {
+	return;
+
 	if (IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER))
 		add_hwgenerator_randomness(buf, size, size * 8);
 	else

> Another test we should do is the
> multi-boot test. Testing the stream (with ent, or with my dieharder run)
> is mainly testing the RNG algo. I'd like to see if the first 8 bytes
> out of the kernel RNG change between multiple boots of the same system.
> e.g. read the first 8 bytes, for each of 100000 boots, and feed THAT
> byte "stream" into ent...
>

Oh, indeed, that's an excellent point... I'll prototype this and come
back.

thanks,

--
Ahmed Darwish

^ permalink raw reply related	[relevance 96%]

* [PATCH] sched/clock: expire timer in hardirq context
@ 2020-03-09 18:15 98% Ahmed S. Darwish
  2020-03-19  8:47 83% ` [tip: timers/core] time/sched_clock: Expire " tip-bot2 for Ahmed S. Darwish
  0 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2020-03-09 18:15 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Sebastian Andrzej Siewior, Ahmed S . Darwish, LKML

To minimize latency, PREEMPT_RT kernels expires hrtimes in preemptible
softirq context by default. This can be overriden by marking the timer's
expiry with HRTIMER_MODE_HARD.

sched_clock_timer is missing this annotation: if its callback is
preempted and the duration of the preemption exceeds the wrap around
time of the underlying clocksource, sched clock will get out of sync.

Mark the sched_clock_timer for expiry in hard interrupt context.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 kernel/time/sched_clock.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index e4332e3e2d56..fa3f800d7d76 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -208,7 +208,8 @@ sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
 
 	if (sched_clock_timer.function != NULL) {
 		/* update timeout for clock wrap */
-		hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
+		hrtimer_start(&sched_clock_timer, cd.wrap_kt,
+			      HRTIMER_MODE_REL_HARD);
 	}
 
 	r = rate;
@@ -254,9 +255,9 @@ void __init generic_sched_clock_init(void)
 	 * Start the timer to keep sched_clock() properly updated and
 	 * sets the initial epoch.
 	 */
-	hrtimer_init(&sched_clock_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer_init(&sched_clock_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
 	sched_clock_timer.function = sched_clock_poll;
-	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
+	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL_HARD);
 }
 
 /*
@@ -293,7 +294,7 @@ void sched_clock_resume(void)
 	struct clock_read_data *rd = &cd.read_data[0];
 
 	rd->epoch_cyc = cd.actual_read_sched_clock();
-	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
+	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL_HARD);
 	rd->read_sched_clock = cd.actual_read_sched_clock;
 }
 
-- 
2.20.1


^ permalink raw reply related	[relevance 98%]

* [tip: timers/core] time/sched_clock: Expire timer in hardirq context
  2020-03-09 18:15 98% [PATCH] sched/clock: expire timer in hardirq context Ahmed S. Darwish
@ 2020-03-19  8:47 83% ` tip-bot2 for Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: tip-bot2 for Ahmed S. Darwish @ 2020-03-19  8:47 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Ahmed S. Darwish, Thomas Gleixner, x86, LKML

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     2c8bd58812ee3dbf0d78b566822f7eacd34bdd7b
Gitweb:        https://git.kernel.org/tip/2c8bd58812ee3dbf0d78b566822f7eacd34bdd7b
Author:        Ahmed S. Darwish <a.darwish@linutronix.de>
AuthorDate:    Mon, 09 Mar 2020 18:15:29 
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 19 Mar 2020 09:45:08 +01:00

time/sched_clock: Expire timer in hardirq context

To minimize latency, PREEMPT_RT kernels expires hrtimers in preemptible
softirq context by default. This can be overriden by marking the timer's
expiry with HRTIMER_MODE_HARD.

sched_clock_timer is missing this annotation: if its callback is preempted
and the duration of the preemption exceeds the wrap around time of the
underlying clocksource, sched clock will get out of sync.

Mark the sched_clock_timer for expiry in hard interrupt context.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200309181529.26558-1-a.darwish@linutronix.de

---
 kernel/time/sched_clock.c |  9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index e4332e3..fa3f800 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -208,7 +208,8 @@ sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
 
 	if (sched_clock_timer.function != NULL) {
 		/* update timeout for clock wrap */
-		hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
+		hrtimer_start(&sched_clock_timer, cd.wrap_kt,
+			      HRTIMER_MODE_REL_HARD);
 	}
 
 	r = rate;
@@ -254,9 +255,9 @@ void __init generic_sched_clock_init(void)
 	 * Start the timer to keep sched_clock() properly updated and
 	 * sets the initial epoch.
 	 */
-	hrtimer_init(&sched_clock_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer_init(&sched_clock_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
 	sched_clock_timer.function = sched_clock_poll;
-	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
+	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL_HARD);
 }
 
 /*
@@ -293,7 +294,7 @@ void sched_clock_resume(void)
 	struct clock_read_data *rd = &cd.read_data[0];
 
 	rd->epoch_cyc = cd.actual_read_sched_clock();
-	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
+	hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL_HARD);
 	rd->read_sched_clock = cd.actual_read_sched_clock;
 }
 

^ permalink raw reply related	[relevance 83%]

* [PATCH v1 20/25] raid5: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (18 preceding siblings ...)
  2020-05-19 21:45 89% ` [PATCH v1 19/25] vfs: Use sequence counter with associated spinlock Ahmed S. Darwish
@ 2020-05-19 21:45 95% ` Ahmed S. Darwish
  2020-05-19 21:45 95% ` [PATCH v1 21/25] iocost: " Ahmed S. Darwish
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Song Liu, linux-raid

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 drivers/md/raid5.c | 2 +-
 drivers/md/raid5.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ba00e9877f02..69f31c675b58 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6929,7 +6929,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
 	} else
 		goto abort;
 	spin_lock_init(&conf->device_lock);
-	seqcount_init(&conf->gen_lock);
+	seqcount_spinlock_init(&conf->gen_lock, &conf->device_lock);
 	mutex_init(&conf->cache_size_mutex);
 	init_waitqueue_head(&conf->wait_for_quiescent);
 	init_waitqueue_head(&conf->wait_for_stripe);
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index f90e0704bed9..a2c9e9e9f5ac 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -589,7 +589,7 @@ struct r5conf {
 	int			prev_chunk_sectors;
 	int			prev_algo;
 	short			generation; /* increments with every reshape */
-	seqcount_t		gen_lock;	/* lock against generation changes */
+	seqcount_spinlock_t	gen_lock;	/* lock against generation changes */
 	unsigned long		reshape_checkpoint; /* Time we last updated
 						     * metadata */
 	long long		min_offset_diff; /* minimum difference between
-- 
2.20.1


^ permalink raw reply related	[relevance 95%]

* [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
@ 2020-05-19 21:45 82% ` Ahmed S. Darwish
    2020-05-19 21:45 81% ` [PATCH v1 02/25] mm/swap: Don't abuse the seqcount latching API Ahmed S. Darwish
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, David S. Miller,
	Jakub Kicinski, netdev

Sequence counters write paths are critical sections that must never be
preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed.

Commit 5dbe7c178d3f ("net: fix kernel deadlock with interface rename and
netdev name retrieval.") handled a deadlock, observed with
CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was
infinitely spinning: it got scheduled after the seqcount write side
blocked inside its own critical section.

To fix that deadlock, among other issues, the commit added a
cond_resched() inside the read side section. While this will get the
non-preemptible kernel eventually unstuck, the seqcount reader is fully
exhausting its slice just spinning -- until TIF_NEED_RESCHED is set.

The fix is also still broken: if the seqcount reader belongs to a
real-time scheduling policy, it can spin forever and the kernel will
livelock.

Disabling preemption over the seqcount write side critical section will
not work: inside it are a number of GFP_KERNEL allocations and mutex
locking through the drivers/base/ :: device_rename() call chain.

From all the above, replace the seqcount with a rwsem.

Fixes: 5dbe7c178d3f (net: fix kernel deadlock with interface rename and netdev name retrieval.)
Fixes: 30e6c9fa93cf (net: devnet_rename_seq should be a seqcount)
Fixes: c91f6df2db49 (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name)
Cc: <stable@vger.kernel.org>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 net/core/dev.c | 30 ++++++++++++------------------
 1 file changed, 12 insertions(+), 18 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 522288177bbd..e18a4c23df0e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -79,6 +79,7 @@
 #include <linux/sched.h>
 #include <linux/sched/mm.h>
 #include <linux/mutex.h>
+#include <linux/rwsem.h>
 #include <linux/string.h>
 #include <linux/mm.h>
 #include <linux/socket.h>
@@ -194,7 +195,7 @@ static DEFINE_SPINLOCK(napi_hash_lock);
 static unsigned int napi_gen_id = NR_CPUS;
 static DEFINE_READ_MOSTLY_HASHTABLE(napi_hash, 8);
 
-static seqcount_t devnet_rename_seq;
+static DECLARE_RWSEM(devnet_rename_sem);
 
 static inline void dev_base_seq_inc(struct net *net)
 {
@@ -930,18 +931,13 @@ EXPORT_SYMBOL(dev_get_by_napi_id);
  *	@net: network namespace
  *	@name: a pointer to the buffer where the name will be stored.
  *	@ifindex: the ifindex of the interface to get the name from.
- *
- *	The use of raw_seqcount_begin() and cond_resched() before
- *	retrying is required as we want to give the writers a chance
- *	to complete when CONFIG_PREEMPTION is not set.
  */
 int netdev_get_name(struct net *net, char *name, int ifindex)
 {
 	struct net_device *dev;
-	unsigned int seq;
 
-retry:
-	seq = raw_seqcount_begin(&devnet_rename_seq);
+	down_read(&devnet_rename_sem);
+
 	rcu_read_lock();
 	dev = dev_get_by_index_rcu(net, ifindex);
 	if (!dev) {
@@ -951,10 +947,8 @@ int netdev_get_name(struct net *net, char *name, int ifindex)
 
 	strcpy(name, dev->name);
 	rcu_read_unlock();
-	if (read_seqcount_retry(&devnet_rename_seq, seq)) {
-		cond_resched();
-		goto retry;
-	}
+
+	up_read(&devnet_rename_sem);
 
 	return 0;
 }
@@ -1228,10 +1222,10 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	    likely(!(dev->priv_flags & IFF_LIVE_RENAME_OK)))
 		return -EBUSY;
 
-	write_seqcount_begin(&devnet_rename_seq);
+	down_write(&devnet_rename_sem);
 
 	if (strncmp(newname, dev->name, IFNAMSIZ) == 0) {
-		write_seqcount_end(&devnet_rename_seq);
+		up_write(&devnet_rename_sem);
 		return 0;
 	}
 
@@ -1239,7 +1233,7 @@ int dev_change_name(struct net_device *dev, const char *newname)
 
 	err = dev_get_valid_name(net, dev, newname);
 	if (err < 0) {
-		write_seqcount_end(&devnet_rename_seq);
+		up_write(&devnet_rename_sem);
 		return err;
 	}
 
@@ -1254,11 +1248,11 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	if (ret) {
 		memcpy(dev->name, oldname, IFNAMSIZ);
 		dev->name_assign_type = old_assign_type;
-		write_seqcount_end(&devnet_rename_seq);
+		up_write(&devnet_rename_sem);
 		return ret;
 	}
 
-	write_seqcount_end(&devnet_rename_seq);
+	up_write(&devnet_rename_sem);
 
 	netdev_adjacent_rename_links(dev, oldname);
 
@@ -1279,7 +1273,7 @@ int dev_change_name(struct net_device *dev, const char *newname)
 		/* err >= 0 after dev_alloc_name() or stores the first errno */
 		if (err >= 0) {
 			err = ret;
-			write_seqcount_begin(&devnet_rename_seq);
+			down_write(&devnet_rename_sem);
 			memcpy(dev->name, oldname, IFNAMSIZ);
 			memcpy(oldname, newname, IFNAMSIZ);
 			dev->name_assign_type = old_assign_type;
-- 
2.20.1


^ permalink raw reply related	[relevance 82%]

* [PATCH v1 02/25] mm/swap: Don't abuse the seqcount latching API
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
  2020-05-19 21:45 82% ` [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount Ahmed S. Darwish
@ 2020-05-19 21:45 81% ` Ahmed S. Darwish
  2020-05-19 21:45 92% ` [PATCH v1 03/25] net: phy: fixed_phy: Remove unused seqcount Ahmed S. Darwish
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Andrew Morton,
	Konstantin Khlebnikov, linux-mm

Commit eef1a429f234 ("mm/swap.c: piggyback lru_add_drain_all() calls")
implemented an optimization mechanism to exit the to-be-started LRU
drain operation (name it A) if another drain operation *started and
finished* while (A) was blocked on the LRU draining mutex.

This was done through a seqcount latch, which is an abuse of its
semantics:

  1. Seqcount latching should be used for the purpose of switching
     between two storage places with sequence protection to allow
     interruptible, preemptible writer sections. The optimization
     mechanism has absolutely nothing to do with that.

  2. The used raw_write_seqcount_latch() has two smp write memory
     barriers to always insure one consistent storage place out of the
     two storage places available. This extra smp_wmb() is redundant for
     the optimization use case.

Beside the API abuse, the semantics of a latch sequence counter was
force fitted into the optimization. What was actually meant is to track
generations of LRU draining operations, where "current lru draining
generation = x" implies that all generations 0 < n <= x are already
*scheduled* for draining.

Remove the conceptually-inappropriate seqcount latch usage and manually
implement the optimization using a counter and SMP memory barriers.

Link: https://lkml.kernel.org/r/CALYGNiPSr-cxV9MX9czaVh6Wz_gzSv3H_8KPvgjBTGbJywUJpA@mail.gmail.com
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 mm/swap.c | 57 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 10 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index bf9a79fed62d..d6910eeed43d 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -713,10 +713,20 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
  */
 void lru_add_drain_all(void)
 {
-	static seqcount_t seqcount = SEQCNT_ZERO(seqcount);
-	static DEFINE_MUTEX(lock);
+	/*
+	 * lru_drain_gen - Current generation of pages that could be in vectors
+	 *
+	 * (A) Definition: lru_drain_gen = x implies that all generations
+	 *     0 < n <= x are already scheduled for draining.
+	 *
+	 * This is an optimization for the highly-contended use case where a
+	 * user space workload keeps constantly generating a flow of pages
+	 * for each CPU.
+	 */
+	static unsigned int lru_drain_gen;
 	static struct cpumask has_work;
-	int cpu, seq;
+	static DEFINE_MUTEX(lock);
+	int cpu, this_gen;
 
 	/*
 	 * Make sure nobody triggers this path before mm_percpu_wq is fully
@@ -725,21 +735,48 @@ void lru_add_drain_all(void)
 	if (WARN_ON(!mm_percpu_wq))
 		return;
 
-	seq = raw_read_seqcount_latch(&seqcount);
+	/*
+	 * (B) Cache the LRU draining generation number
+	 *
+	 * smp_rmb() ensures that the counter is loaded before the mutex is
+	 * taken. It pairs with the smp_wmb() inside the mutex critical section
+	 * at (D).
+	 */
+	this_gen = READ_ONCE(lru_drain_gen);
+	smp_rmb();
 
 	mutex_lock(&lock);
 
 	/*
-	 * Piggyback on drain started and finished while we waited for lock:
-	 * all pages pended at the time of our enter were drained from vectors.
+	 * (C) Exit the draining operation if a newer generation, from another
+	 * lru_add_drain_all(), was already scheduled for draining. Check (A).
 	 */
-	if (__read_seqcount_retry(&seqcount, seq))
+	if (unlikely(this_gen != lru_drain_gen))
 		goto done;
 
-	raw_write_seqcount_latch(&seqcount);
+	/*
+	 * (D) Increment generation number
+	 *
+	 * Pairs with READ_ONCE() and smp_rmb() at (B), outside of the critical
+	 * section.
+	 *
+	 * This pairing must be done here, before the for_each_online_cpu loop
+	 * below which drains the page vectors.
+	 *
+	 * Let x, y, and z represent some system CPU numbers, where x < y < z.
+	 * Assume CPU #z is is in the middle of the for_each_online_cpu loop
+	 * below and has already reached CPU #y's per-cpu data. CPU #x comes
+	 * along, adds some pages to its per-cpu vectors, then calls
+	 * lru_add_drain_all().
+	 *
+	 * If the paired smp_wmb() below is done at any later step, e.g. after
+	 * the loop, CPU #x will just exit at (C) and miss flushing out all of
+	 * its added pages.
+	 */
+	WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
+	smp_wmb();
 
 	cpumask_clear(&has_work);
-
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 
@@ -766,7 +803,7 @@ void lru_add_drain_all(void)
 {
 	lru_add_drain();
 }
-#endif
+#endif /* CONFIG_SMP */
 
 /**
  * release_pages - batched put_page()
-- 
2.20.1


^ permalink raw reply related	[relevance 81%]

* [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (2 preceding siblings ...)
  2020-05-19 21:45 92% ` [PATCH v1 03/25] net: phy: fixed_phy: Remove unused seqcount Ahmed S. Darwish
@ 2020-05-19 21:45 97% ` Ahmed S. Darwish
    2020-05-19 21:45 82% ` [PATCH v1 05/25] u64_stats: Document writer non-preemptibility requirement Ahmed S. Darwish
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Jens Axboe, Phillip Susi,
	Vivek Goyal, linux-block

For optimized block readers not holding a mutex, the "number of sectors"
64-bit value is protected from tearing on 32-bit architectures by a
sequence counter.

Disable preemption before entering that sequence counter's write side
critical section. Otherwise, the read side can preempt the write side
section and spin for the entire scheduler tick. If the reader belongs to
a real-time scheduling class, it can spin forever and the kernel will
livelock.

Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 block/blk.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk.h b/block/blk.h
index 0a94ec68af32..151f86932547 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
 static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+	preempt_disable();
 	write_seqcount_begin(&part->nr_sects_seq);
 	part->nr_sects = size;
 	write_seqcount_end(&part->nr_sects_seq);
+	preempt_enable();
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
 	preempt_disable();
 	part->nr_sects = size;
-- 
2.20.1


^ permalink raw reply related	[relevance 97%]

* [PATCH v1 07/25] lockdep: Add preemption disabled assertion API
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (5 preceding siblings ...)
  2020-05-19 21:45 91% ` [PATCH v1 06/25] dma-buf: Remove custom seqcount lockdep class key Ahmed S. Darwish
@ 2020-05-19 21:45 89% ` Ahmed S. Darwish
  2020-05-19 21:45 91% ` [PATCH v1 08/25] seqlock: lockdep assert non-preemptibility on seqcount_t write Ahmed S. Darwish
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish

Asserting that preemption is disabled is a critical sanity check.
Developers are usually reluctant to add such a check in a fastpath, as
reading the preemption count can be costly.

Extend the lockdep API with a preemption disabled assertion. If lockdep
is disabled, or if the underlying architecture does not support kernel
preemption, this assert has no runtime overhead.

Since the lockdep assertion references sched.h task_struct current,
define it at lockdep.c instead of lockdep.h. This unbinds a potential
circular header dependency chain for call-sites, defined inlined, at
other header files already included and needed by sched.h.

Mark the exported assertion symbol with NOKPROBE_SYMBOL. Lockdep
functions can be involved in breakpoint handling and probing on those
functions can cause a breakpoint recursion.

References: f54bb2ec02c8 ("locking/lockdep: Add IRQs disabled/enabled assertion APIs: ...")
References: 2f43c6022d84 ("kprobes: Prohibit probing on lockdep functions")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/lockdep.h  |  9 +++++++++
 kernel/locking/lockdep.c | 15 +++++++++++++++
 lib/Kconfig.debug        |  1 +
 3 files changed, 25 insertions(+)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 206774ac6946..54c929ea5b98 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -702,6 +702,14 @@ do {									\
 			  "Not in hardirq as expected\n");		\
 	} while (0)
 
+/*
+ * Don't define this assertion here to avoid a call-site's header file
+ * dependency on sched.h task_struct current. This is needed by call
+ * sites that are inline defined at header files already included by
+ * sched.h.
+ */
+void lockdep_assert_preemption_disabled(void);
+
 #else
 # define might_lock(lock) do { } while (0)
 # define might_lock_read(lock) do { } while (0)
@@ -709,6 +717,7 @@ do {									\
 # define lockdep_assert_irqs_enabled() do { } while (0)
 # define lockdep_assert_irqs_disabled() do { } while (0)
 # define lockdep_assert_in_irq() do { } while (0)
+# define lockdep_assert_preemption_disabled() do { } while (0)
 #endif
 
 #ifdef CONFIG_PROVE_RAW_LOCK_NESTING
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index ac10db66cc63..4dae65bc65c2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5857,3 +5857,18 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	dump_stack();
 }
 EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
+
+#ifdef CONFIG_PROVE_LOCKING
+
+void lockdep_assert_preemption_disabled(void)
+{
+	WARN_ONCE(IS_ENABLED(CONFIG_PREEMPT_COUNT)	&&
+		  debug_locks				&&
+		  !current->lockdep_recursion		&&
+		  (preempt_count() == 0 && current->hardirqs_enabled),
+		  "preemption not disabled as expected\n");
+}
+EXPORT_SYMBOL_GPL(lockdep_assert_preemption_disabled);
+NOKPROBE_SYMBOL(lockdep_assert_preemption_disabled);
+
+#endif
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 21d9c5f6e7ec..34d9d8896003 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1062,6 +1062,7 @@ config PROVE_LOCKING
 	select DEBUG_RWSEMS
 	select DEBUG_WW_MUTEX_SLOWPATH
 	select DEBUG_LOCK_ALLOC
+	select PREEMPT_COUNT if !ARCH_NO_PREEMPT
 	select TRACE_IRQFLAGS
 	default n
 	help
-- 
2.20.1


^ permalink raw reply related	[relevance 89%]

* [PATCH v1 05/25] u64_stats: Document writer non-preemptibility requirement
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (3 preceding siblings ...)
  2020-05-19 21:45 97% ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
@ 2020-05-19 21:45 82% ` Ahmed S. Darwish
  2020-05-19 21:45 91% ` [PATCH v1 06/25] dma-buf: Remove custom seqcount lockdep class key Ahmed S. Darwish
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, David S. Miller,
	Jakub Kicinski, netdev

The u64_stats mechanism uses sequence counters to protect against 64-bit
values tearing on 32-bit architectures. Updating such statistics is a
sequence counter write side critical section.

Preemption must be disabled before entering this seqcount write critical
section.  Failing to do so, the seqcount read side can preempt the write
side section and spin for the entire scheduler tick.  If that reader
belongs to a real-time scheduling class, it can spin forever and the
kernel will livelock.

Document this statistics update side non-preemptibility requirement.

Reword the u64_stats header file top comment to always mention "Reader"
or "Writer" at the start of each bullet point, making it easier to
follow which side each point is actually for.

Fix the statement "whole thing is a NOOP on 64bit arches or UP kernels".
For 32-bit UP kernels, preemption is always disabled for the statistics
read side section.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/u64_stats_sync.h | 38 ++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
index 9de5c10293f5..30358ce3d8fe 100644
--- a/include/linux/u64_stats_sync.h
+++ b/include/linux/u64_stats_sync.h
@@ -7,29 +7,31 @@
  * we provide a synchronization point, that is a noop on 64bit or UP kernels.
  *
  * Key points :
- * 1) Use a seqcount on SMP 32bits, with low overhead.
- * 2) Whole thing is a noop on 64bit arches or UP kernels.
- * 3) Write side must ensure mutual exclusion or one seqcount update could
+ *
+ * 1) Use a seqcount on 32-bit SMP, only disable preemption for 32-bit UP.
+ *
+ * 2) The whole thing is a no-op on 64-bit architectures.
+ *
+ * 3) Write side must ensure mutual exclusion, or one seqcount update could
  *    be lost, thus blocking readers forever.
- *    If this synchronization point is not a mutex, but a spinlock or
- *    spinlock_bh() or disable_bh() :
- * 3.1) Write side should not sleep.
- * 3.2) Write side should not allow preemption.
- * 3.3) If applicable, interrupts should be disabled.
  *
- * 4) If reader fetches several counters, there is no guarantee the whole values
- *    are consistent (remember point 1) : this is a noop on 64bit arches anyway)
+ * 4) Write side must disable preemption, or a seqcount reader can preempt the
+ *    writer and also spin forever.
  *
- * 5) readers are allowed to sleep or be preempted/interrupted : They perform
- *    pure reads. But if they have to fetch many values, it's better to not allow
- *    preemptions/interruptions to avoid many retries.
+ * 5) Write side must use the _irqsave() variant if other writers, or a reader,
+ *    can be invoked from an IRQ context.
  *
- * 6) If counter might be written by an interrupt, readers should block interrupts.
- *    (On UP, there is no seqcount_t protection, a reader allowing interrupts could
- *     read partial values)
+ * 6) If reader fetches several counters, there is no guarantee the whole values
+ *    are consistent w.r.t. each other (remember point #2: seqcounts are not
+ *    used for 64bit architectures).
  *
- * 7) For irq and softirq uses, readers can use u64_stats_fetch_begin_irq() and
- *    u64_stats_fetch_retry_irq() helpers
+ * 7) Readers are allowed to sleep or be preempted/interrupted: they perform
+ *    pure reads.
+ *
+ * 8) Readers must use both u64_stats_fetch_{begin,retry}_irq() if the stats
+ *    might be updated from a hardirq or softirq context (remember point #1:
+ *    seqcounts are not used for UP kernels). 32-bit UP stat readers could read
+ *    corrupted 64-bit values otherwise.
  *
  * Usage :
  *
-- 
2.20.1


^ permalink raw reply related	[relevance 82%]

* [PATCH v1 03/25] net: phy: fixed_phy: Remove unused seqcount
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
  2020-05-19 21:45 82% ` [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount Ahmed S. Darwish
  2020-05-19 21:45 81% ` [PATCH v1 02/25] mm/swap: Don't abuse the seqcount latching API Ahmed S. Darwish
@ 2020-05-19 21:45 92% ` Ahmed S. Darwish
  2020-05-19 21:45 97% ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Andrew Lunn,
	Florian Fainelli, Heiner Kallweit, Russell King, David S. Miller,
	netdev

Commit bf7afb29d545 ("phy: improve safety of fixed-phy MII register
reading") protected the fixed PHY status with a sequence counter.

Two years later, commit d2b977939b18 ("net: phy: fixed-phy: remove
fixed_phy_update_state()") removed the sequence counter's write side
critical section -- neutralizing its read side retry loop.

Remove the unused seqcount.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/phy/fixed_phy.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index 4a3d34f40cb9..f55365c9d1f7 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -34,7 +34,6 @@ struct fixed_mdio_bus {
 struct fixed_phy {
 	int addr;
 	struct phy_device *phydev;
-	seqcount_t seqcount;
 	struct fixed_phy_status status;
 	bool no_carrier;
 	int (*link_update)(struct net_device *, struct fixed_phy_status *);
@@ -80,19 +79,17 @@ static int fixed_mdio_read(struct mii_bus *bus, int phy_addr, int reg_num)
 	list_for_each_entry(fp, &fmb->phys, node) {
 		if (fp->addr == phy_addr) {
 			struct fixed_phy_status state;
-			int s;
 
-			do {
-				s = read_seqcount_begin(&fp->seqcount);
-				fp->status.link = !fp->no_carrier;
-				/* Issue callback if user registered it. */
-				if (fp->link_update)
-					fp->link_update(fp->phydev->attached_dev,
-							&fp->status);
-				/* Check the GPIO for change in status */
-				fixed_phy_update(fp);
-				state = fp->status;
-			} while (read_seqcount_retry(&fp->seqcount, s));
+			fp->status.link = !fp->no_carrier;
+
+			/* Issue callback if user registered it. */
+			if (fp->link_update)
+				fp->link_update(fp->phydev->attached_dev,
+						&fp->status);
+
+			/* Check the GPIO for change in status */
+			fixed_phy_update(fp);
+			state = fp->status;
 
 			return swphy_read_reg(reg_num, &state);
 		}
@@ -150,8 +147,6 @@ static int fixed_phy_add_gpiod(unsigned int irq, int phy_addr,
 	if (!fp)
 		return -ENOMEM;
 
-	seqcount_init(&fp->seqcount);
-
 	if (irq != PHY_POLL)
 		fmb->mii_bus->irq[phy_addr] = irq;
 
-- 
2.20.1


^ permalink raw reply related	[relevance 92%]

* [PATCH v1 09/25] Documentation: locking: Describe seqlock design and usage
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (7 preceding siblings ...)
  2020-05-19 21:45 91% ` [PATCH v1 08/25] seqlock: lockdep assert non-preemptibility on seqcount_t write Ahmed S. Darwish
@ 2020-05-19 21:45 58% ` Ahmed S. Darwish
  2020-05-19 21:45 82% ` [PATCH v1 10/25] seqlock: Add RST directives to kernel-doc code samples and notes Ahmed S. Darwish
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Jonathan Corbet,
	linux-doc

Proper documentation for the design and usage of sequence counters and
sequential locks does not exist. Complete the seqlock.h documentation as
follows:

  - Divide all documentation on a seqcount_t vs. seqlock_t basis. The
    description for both mechanisms was intermingled, which is incorrect
    since the usage constrains for each type are vastly different.

  - Add an introductory paragraph describing the internal design of, and
    rationale for, sequence counters.

  - Document seqcount_t writer non-preemptibility requirement, which was
    not previously documented anywhere, and provide a clear rationale.

  - Provide template code for seqcount_t and seqlock_t initialization
    and reader/writer critical sections.

  - Recommend using seqlock_t by default. It implicitly handles the
    serialization and non-preemptibility requirements of writers.

At seqlock.h:

  - Remove references to brlocks as they've long been removed from the
    kernel.

  - Remove references to gcc-3.x since the kernel's minimum supported
    gcc version is 4.6.

  - Remove the severely lacking top comment and reference the newly
    introduced Documentation/locking/seqlock.rst file instead.

References: 0f6ed63b1707 ("no need to keep brlock macros anymore...")
References: cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 Documentation/locking/index.rst   |   1 +
 Documentation/locking/seqlock.rst | 181 ++++++++++++++++++++++++++++++
 include/linux/seqlock.h           |  73 +++++-------
 3 files changed, 213 insertions(+), 42 deletions(-)
 create mode 100644 Documentation/locking/seqlock.rst

diff --git a/Documentation/locking/index.rst b/Documentation/locking/index.rst
index 5d6800a723dc..aad15fc81ccd 100644
--- a/Documentation/locking/index.rst
+++ b/Documentation/locking/index.rst
@@ -14,6 +14,7 @@ locking
     mutex-design
     rt-mutex-design
     rt-mutex
+    seqlock
     spinlocks
     ww-mutex-design
 
diff --git a/Documentation/locking/seqlock.rst b/Documentation/locking/seqlock.rst
new file mode 100644
index 000000000000..2242ae00e7bf
--- /dev/null
+++ b/Documentation/locking/seqlock.rst
@@ -0,0 +1,181 @@
+======================================
+Sequence counters and sequential locks
+======================================
+
+Introduction
+============
+
+Sequence counters are a reader-writer consistency mechanism with
+lockless readers (read-only retry loops), and no writer starvation. They
+are used for data that's rarely written to (e.g. system time), where the
+reader wants a consistent set of information and is willing to retry if
+that information changes.
+
+A data set is consistent when the sequence count at the beginning of the
+read side critical section is even and the same sequence count value is
+read again at the end of the critical section. The data in the set must
+be copied out inside the read side critical section. If the sequence
+count has changed between the start and the end of the critical section,
+the reader must retry.
+
+Writers increment the sequence count at the start and the end of their
+critical section. After starting the critical section the sequence count
+is odd and indicates to the readers that an update is in progress. At
+the end of the write side critical section the sequence count becomes
+even again which lets readers make progress.
+
+A sequence counter write side critical section must never be preempted
+or interrupted by read side sections. Otherwise the reader will spin for
+the entire scheduler tick due to the odd sequence count value and the
+interrupted writer. If that reader belongs to a real-time scheduling
+class, it can spin forever and the kernel will livelock.
+
+.. _seqcount_t:
+
+Sequence counters (:c:type:`seqcount_t`)
+========================================
+
+This is the the raw counting mechanism, which does not protect against
+multiple writers.  Write side critical sections must thus be serialized
+by an external lock.
+
+If the write serialization primitive is not implicitly disabling
+preemption, preemption must be explicitly disabled before entering the
+write side section. If the sequence counter read section can be invoked
+from hardirq or softirq contexts, interrupts or bottom halves must be
+respectively disabled before entering the write side section.
+
+If it's desired to automatically handle the sequence counter
+requirements of writer serialization and non-preemptibility, use a
+:ref:`sequential lock <seqlock_t>` instead.
+
+Initialization:
+
+.. code-block:: c
+
+	/* dynamic */
+	seqcount_t foo_seqcount;
+	seqcount_init(&foo_seqcount);
+
+	/* static */
+	static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);
+
+	/* C99 struct init */
+	struct {
+		.seq   = SEQCNT_ZERO(foo.seq),
+	} foo;
+
+Write path:
+
+.. code-block:: c
+
+	/* Serialized context with disabled preemption */
+
+	write_seqcount_begin(&foo_seqcount);
+
+	/* ... [[write-side critical section]] ... */
+
+	write_seqcount_end(&foo_seqcount);
+
+Read path:
+
+.. code-block:: c
+
+	do {
+		seq = read_seqcount_begin(&foo_seqcount);
+
+		/* ... [[read-side critical section]] ... */
+
+	} while (read_seqcount_retry(&foo_seqcount, seq));
+
+.. _seqlock_t:
+
+Sequential locks (:c:type:`seqlock_t`)
+======================================
+
+This contains the :ref:`sequence counting mechanism <seqcount_t>`
+earlier discussed, plus an embedded spinlock for writer serialization
+and non-preemptibility.
+
+If the read side section can be invoked from hardirq or softirq context,
+use the write side function variants which respectively disable
+interrupts or bottom halves.
+
+Initialization:
+
+.. code-block:: c
+
+	/* dynamic */
+	seqlock_t foo_seqlock;
+	seqlock_init(&foo_seqlock);
+
+	/* static */
+	static DEFINE_SEQLOCK(foo_seqlock);
+
+	/* C99 struct init */
+	struct {
+		.seql   = __SEQLOCK_UNLOCKED(foo.seql)
+	} foo;
+
+Write path:
+
+.. code-block:: c
+
+	write_seqlock(&foo_seqlock);
+
+	/* ... [[write-side critical section]] ... */
+
+	write_sequnlock(&foo_seqlock);
+
+Read path, three categories:
+
+1. Normal Sequence readers which never block a writer but they must
+   retry if a writer is in progress by detecting change in the sequence
+   number.  Writers do not wait for a sequence reader.
+
+   .. code-block:: c
+
+	do {
+		seq = read_seqbegin(&foo_seqlock);
+
+		/* ... [[read-side critical section]] ... */
+
+	} while (read_seqretry(&foo_seqlock, seq));
+
+2. Locking readers which will wait if a writer or another locking reader
+   is in progress. A locking reader in progress will also block a writer
+   from entering its critical section. This read lock is
+   exclusive. Unlike rwlock_t, only one locking reader can acquire it.
+
+   .. code-block:: c
+
+	read_seqlock_excl(&foo_seqlock);
+
+	/* ... [[read-side critical section]] ... */
+
+	read_sequnlock_excl(&foo_seqlock);
+
+3. Conditional lockless reader (as in 1), or locking reader (as in 2),
+   according to a passed marker. This is used to avoid lockless readers
+   starvation (too much retry loops) in case of a sharp spike in write
+   activity. First, a lockless read is tried (even marker passed). If
+   that trial fails (odd sequence counter is returned, which is used as
+   the next iteration marker), the lockless read is transformed to a
+   full locking read and no retry loop is necessary.
+
+   .. code-block:: c
+
+	/* marker; even initialization */
+	int seq = 0;
+	do {
+		read_seqbegin_or_lock(&foo_seqlock, &seq);
+
+		/* ... [[read-side critical section]] ... */
+
+	} while (need_seqretry(&foo_seqlock, seq));
+	done_seqretry(&foo_seqlock, seq);
+
+API documentation
+=================
+
+.. kernel-doc:: include/linux/seqlock.h
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index d35be7709403..2a4af746b1da 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -1,36 +1,15 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef __LINUX_SEQLOCK_H
 #define __LINUX_SEQLOCK_H
+
 /*
- * Reader/writer consistent mechanism without starving writers. This type of
- * lock for data where the reader wants a consistent set of information
- * and is willing to retry if the information changes. There are two types
- * of readers:
- * 1. Sequence readers which never block a writer but they may have to retry
- *    if a writer is in progress by detecting change in sequence number.
- *    Writers do not wait for a sequence reader.
- * 2. Locking readers which will wait if a writer or another locking reader
- *    is in progress. A locking reader in progress will also block a writer
- *    from going forward. Unlike the regular rwlock, the read lock here is
- *    exclusive so that only one locking reader can get it.
+ * seqcount_t / seqlock_t - a reader-writer consistency mechanism with
+ * lockless readers (read-only retry loops), and no writer starvation.
  *
- * This is not as cache friendly as brlock. Also, this may not work well
- * for data that contains pointers, because any writer could
- * invalidate a pointer that a reader was following.
+ * See Documentation/locking/seqlock.rst for full description.
  *
- * Expected non-blocking reader usage:
- * 	do {
- *	    seq = read_seqbegin(&foo);
- * 	...
- *      } while (read_seqretry(&foo, seq));
- *
- *
- * On non-SMP the spin locks disappear but the writer still needs
- * to increment the sequence variables because an interrupt routine could
- * change the state of the data.
- *
- * Based on x86_64 vsyscall gettimeofday 
- * by Keith Owens and Andrea Arcangeli
+ * Copyrights:
+ * - Based on x86_64 vsyscall gettimeofday: Keith Owens, Andrea Arcangeli
  */
 
 #include <linux/spinlock.h>
@@ -40,11 +19,23 @@
 #include <asm/processor.h>
 
 /*
- * Version using sequence counter only.
- * This can be used when code has its own mutex protecting the
- * updating starting before the write_seqcountbeqin() and ending
- * after the write_seqcount_end().
+ * Sequence counters (seqcount_t)
+ *
+ * The raw counting mechanism without any writer protection. Write side
+ * critical sections must be serialized and readers on the same CPU
+ * (e.g. through preemption or interrupts) must be excluded.
+ *
+ * If the write serialization mechanism is one of the common kernel
+ * locking primitives, use a sequence counter with associated lock
+ * (seqcount_LOCKTYPE_t) instead.
+ *
+ * If it's desired to automatically handle the sequence counter writer
+ * serialization and non-preemptibility requirements, use a sequential
+ * lock (seqlock_t) instead.
+ *
+ * See Documentation/locking/seqlock.rst
  */
+
 typedef struct seqcount {
 	unsigned sequence;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -221,8 +212,6 @@ static inline int read_seqcount_retry(const seqcount_t *s, unsigned start)
 	return __read_seqcount_retry(s, start);
 }
 
-
-
 static inline void raw_write_seqcount_begin(seqcount_t *s)
 {
 	s->sequence++;
@@ -367,11 +356,6 @@ static inline void raw_write_seqcount_latch(seqcount_t *s)
        smp_wmb();      /* increment "sequence" before following stores */
 }
 
-/*
- * Sequence counter only version assumes that callers are using their
- * own locking and preemption is disabled.
- */
-
 static inline void __write_seqcount_begin_nested(seqcount_t *s, int subclass)
 {
 	raw_write_seqcount_begin(s);
@@ -419,15 +403,20 @@ static inline void write_seqcount_invalidate(seqcount_t *s)
 	s->sequence+=2;
 }
 
+/*
+ * Sequential locks (seqlock_t)
+ *
+ * Sequence counters with an embedded spinlock for writer serialization
+ * and non-preemptibility.
+ *
+ * See Documentation/locking/seqlock.rst
+ */
+
 typedef struct {
 	struct seqcount seqcount;
 	spinlock_t lock;
 } seqlock_t;
 
-/*
- * These macros triggered gcc-3.x compile-time problems.  We think these are
- * OK now.  Be cautious.
- */
 #define __SEQLOCK_UNLOCKED(lockname)			\
 	{						\
 		.seqcount = SEQCNT_ZERO(lockname),	\
-- 
2.20.1


^ permalink raw reply related	[relevance 58%]

* [PATCH v1 10/25] seqlock: Add RST directives to kernel-doc code samples and notes
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (8 preceding siblings ...)
  2020-05-19 21:45 58% ` [PATCH v1 09/25] Documentation: locking: Describe seqlock design and usage Ahmed S. Darwish
@ 2020-05-19 21:45 82% ` Ahmed S. Darwish
    2020-05-19 21:45 42% ` [PATCH v1 11/25] seqlock: Add missing kernel-doc annotations Ahmed S. Darwish
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Jonathan Corbet,
	linux-doc

Mark all C code samples inside seqlock.h kernel-doc text with the RST
'code-block: c' directive. Sphinx won't properly format the example code
and will produce noisy text indentation warnings otherwise.

Mark all kernel-doc "NOTE" sections with the RST 'attention' directive.
Otherwise Sphinx produces "duplicate section name 'NOTE'" warnings.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/seqlock.h | 82 +++++++++++++++++++++++------------------
 1 file changed, 47 insertions(+), 35 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 2a4af746b1da..dfec0c9c19c4 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -232,6 +232,8 @@ static inline void raw_write_seqcount_end(seqcount_t *s)
  * usual consistency guarantee. It is one wmb cheaper, because we can
  * collapse the two back-to-back wmb()s.
  *
+ * .. code-block:: c
+ *
  *      seqcount_t seq;
  *      bool X = true, Y = false;
  *
@@ -292,62 +294,72 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
  *
  * The basic form is a data structure like:
  *
- * struct latch_struct {
- *	seqcount_t		seq;
- *	struct data_struct	data[2];
- * };
+ * .. code-block:: c
+ *
+ *	struct latch_struct {
+ *		seqcount_t		seq;
+ *		struct data_struct	data[2];
+ *	};
  *
  * Where a modification, which is assumed to be externally serialized, does the
  * following:
  *
- * void latch_modify(struct latch_struct *latch, ...)
- * {
- *	smp_wmb();	<- Ensure that the last data[1] update is visible
- *	latch->seq++;
- *	smp_wmb();	<- Ensure that the seqcount update is visible
+ * .. code-block:: c
  *
- *	modify(latch->data[0], ...);
+ *	void latch_modify(struct latch_struct *latch, ...)
+ *	{
+ *		smp_wmb();	// Ensure that the last data[1] update is visible
+ *		latch->seq++;
+ *		smp_wmb();	// Ensure that the seqcount update is visible
  *
- *	smp_wmb();	<- Ensure that the data[0] update is visible
- *	latch->seq++;
- *	smp_wmb();	<- Ensure that the seqcount update is visible
+ *		modify(latch->data[0], ...);
  *
- *	modify(latch->data[1], ...);
- * }
+ *		smp_wmb();	// Ensure that the data[0] update is visible
+ *		latch->seq++;
+ *		smp_wmb();	// Ensure that the seqcount update is visible
+ *
+ *		modify(latch->data[1], ...);
+ *	}
  *
  * The query will have a form like:
  *
- * struct entry *latch_query(struct latch_struct *latch, ...)
- * {
- *	struct entry *entry;
- *	unsigned seq, idx;
+ * .. code-block:: c
  *
- *	do {
- *		seq = raw_read_seqcount_latch(&latch->seq);
+ *	struct entry *latch_query(struct latch_struct *latch, ...)
+ *	{
+ *		struct entry *entry;
+ *		unsigned seq, idx;
  *
- *		idx = seq & 0x01;
- *		entry = data_query(latch->data[idx], ...);
+ *		do {
+ *			seq = raw_read_seqcount_latch(&latch->seq);
  *
- *		smp_rmb();
- *	} while (seq != latch->seq);
+ *			idx = seq & 0x01;
+ *			entry = data_query(latch->data[idx], ...);
  *
- *	return entry;
- * }
+ *			smp_rmb();
+ *		} while (seq != latch->seq);
+ *
+ *		return entry;
+ *	}
  *
  * So during the modification, queries are first redirected to data[1]. Then we
  * modify data[0]. When that is complete, we redirect queries back to data[0]
  * and we can modify data[1].
  *
- * NOTE: The non-requirement for atomic modifications does _NOT_ include
- *       the publishing of new entries in the case where data is a dynamic
- *       data structure.
+ * .. attention::
  *
- *       An iteration might start in data[0] and get suspended long enough
- *       to miss an entire modification sequence, once it resumes it might
- *       observe the new entry.
+ *	The non-requirement for atomic modifications does _NOT_ include
+ *	the publishing of new entries in the case where data is a dynamic
+ *	data structure.
  *
- * NOTE: When data is a dynamic data structure; one should use regular RCU
- *       patterns to manage the lifetimes of the objects within.
+ *	An iteration might start in data[0] and get suspended long enough
+ *	to miss an entire modification sequence, once it resumes it might
+ *	observe the new entry.
+ *
+ * .. attention::
+ *
+ *	When data is a dynamic data structure; one should use regular RCU
+ *	patterns to manage the lifetimes of the objects within.
  */
 static inline void raw_write_seqcount_latch(seqcount_t *s)
 {
-- 
2.20.1


^ permalink raw reply related	[relevance 82%]

* [PATCH v1 12/25] seqlock: Extend seqcount API with associated locks
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (10 preceding siblings ...)
  2020-05-19 21:45 42% ` [PATCH v1 11/25] seqlock: Add missing kernel-doc annotations Ahmed S. Darwish
@ 2020-05-19 21:45 37% ` Ahmed S. Darwish
  2020-05-19 21:45 81% ` [PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex Ahmed S. Darwish
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Jonathan Corbet,
	linux-doc

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the write side critical section.

There is no built-in debugging mechanism to verify that the lock used
for writer serialization is held and preemption is disabled. Some usage
sites like dma-buf have explicit lockdep checks for the writer-side
lock, but this covers only a small portion of the sequence counter usage
in the kernel.

Add new sequence counter types which allows to associate a lock to the
sequence counter at initialization time. The seqcount API functions are
extended to provide appropriate lockdep assertions depending on the
seqcount/lock type.

For sequence counters with associated locks that do not implicitly
disable preemption, preemption protection is enforced in the sequence
counter write side functions. This removes the need to explicitly add
preempt_disable/enable() around the write side critical sections: the
write_begin/end() functions for these new sequence counter types
automatically do this.

Introduce the following seqcount types with associated locks:

     seqcount_spinlock_t
     seqcount_raw_spinlock_t
     seqcount_rwlock_t
     seqcount_mutex_t
     seqcount_ww_mutex_t

Extend the seqcount read and write functions to branch out to the
specific seqcount_LOCKTYPE_t implementation at compile-time. This avoids
kernel API explosion per each new seqcount_LOCKTYPE_t added. Add such
compile-time type detection logic into a new, internal, seqlock header.

Document the proper seqcount_LOCKTYPE_t usage, and rationale, at
Documentation/locking/seqlock.rst.

If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 Documentation/locking/seqlock.rst      |  64 ++++-
 MAINTAINERS                            |   2 +-
 include/linux/seqlock.h                | 355 +++++++++++++++++++++----
 include/linux/seqlock_types_internal.h | 187 +++++++++++++
 4 files changed, 549 insertions(+), 59 deletions(-)
 create mode 100644 include/linux/seqlock_types_internal.h

diff --git a/Documentation/locking/seqlock.rst b/Documentation/locking/seqlock.rst
index 2242ae00e7bf..e6f8e4be7db8 100644
--- a/Documentation/locking/seqlock.rst
+++ b/Documentation/locking/seqlock.rst
@@ -45,9 +45,11 @@ write side section. If the sequence counter read section can be invoked
 from hardirq or softirq contexts, interrupts or bottom halves must be
 respectively disabled before entering the write side section.
 
-If it's desired to automatically handle the sequence counter
-requirements of writer serialization and non-preemptibility, use a
-:ref:`sequential lock <seqlock_t>` instead.
+If the write serialization mechanism is one of the common kernel locking
+primitives, use :ref:`sequence counters with associated locks
+<seqcount_locktype_t>` instead. If it's desired to automatically handle
+the sequence counter writer serialization and non-preemptibility
+requirements, use a :ref:`sequential lock <seqlock_t>`.
 
 Initialization:
 
@@ -67,6 +69,7 @@ Initialization:
 
 Write path:
 
+.. _seqcount_write_ops:
 .. code-block:: c
 
 	/* Serialized context with disabled preemption */
@@ -79,6 +82,7 @@ Write path:
 
 Read path:
 
+.. _seqcount_read_ops:
 .. code-block:: c
 
 	do {
@@ -88,6 +92,60 @@ Read path:
 
 	} while (read_seqcount_retry(&foo_seqcount, seq));
 
+.. _seqcount_locktype_t:
+
+Sequence counters with associated locks (:c:type:`seqcount_LOCKTYPE_t`)
+-----------------------------------------------------------------------
+
+As :ref:`earlier discussed <seqcount_t>`, seqcount write side critical
+sections must be serialized and non-preemptible. This variant of
+sequence counters associate the lock used for writer serialization at
+the seqcount initialization time. This enables lockdep to validate that
+the write side critical section is properly serialized.
+
+This lock association is a NOOP if lockdep is disabled and has neither
+storage nor runtime overhead. If lockdep is enabled, the lock pointer is
+stored in struct seqcount and lockdep's "lock is held" assertions are
+injected at the beginning of the write side critical section to validate
+that it is properly protected.
+
+For lock types which do not implicitly disable preemption, preemption
+protection is enforced in the write side function.
+
+The following seqcounts with associated locks are defined:
+
+  - :c:type:`seqcount_spinlock_t`
+  - :c:type:`seqcount_raw_spinlock_t`
+  - :c:type:`seqcount_rwlock_t`
+  - :c:type:`seqcount_mutex_t`
+  - :c:type:`seqcount_ww_mutex_t`
+
+The plain seqcount read and write APIs branch out to the specific
+seqcount_LOCKTYPE_t implementation at compile-time. This avoids kernel
+API explosion per each new seqcount LOCKTYPE.
+
+Initialization (replace "LOCKTYPE" with one of the supported locks):
+
+.. code-block:: c
+
+	/* dynamic */
+	seqcount_LOCKTYPE_t foo_seqcount;
+	seqcount_LOCKTYPE_init(&foo_seqcount, &lock);
+
+	/* static */
+	static seqcount_LOCKTYPE_t foo_seqcount =
+		SEQCNT_LOCKTYPE_ZERO(foo_seqcount, &lock);
+
+	/* C99 struct init */
+	struct {
+		.seq   = SEQCNT_LOCKTYPE_ZERO(foo.seq, &lock),
+	} foo;
+
+Write path: same as in :ref:`plain seqcount_t <seqcount_write_ops>`,
+while running from a context with the associated LOCKTYPE lock acquired.
+
+Read path: same as in :ref:`plain seqcount_t <seqcount_read_ops>`.
+
 .. _seqlock_t:
 
 Sequential locks (:c:type:`seqlock_t`)
diff --git a/MAINTAINERS b/MAINTAINERS
index 091ec22c1a23..f3ae546009ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9925,7 +9925,7 @@ F:	include/linux/lockdep.h
 F:	include/linux/mutex*.h
 F:	include/linux/rwlock*.h
 F:	include/linux/rwsem*.h
-F:	include/linux/seqlock.h
+F:	include/linux/seqlock*.h
 F:	include/linux/spinlock*.h
 F:	kernel/locking/
 F:	lib/locking*.[ch]
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index dd55555ff607..eca464ecf012 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -90,11 +90,10 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s)
  */
 #define SEQCNT_ZERO(name) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(name) }
 
-
 /**
  * __read_seqcount_begin() - begin a seq-read critical section (without barrier)
- * @s: Pointer to &typedef seqcount_t
- * Returns: count to be passed to read_seqcount_retry()
+ * @s: Pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
+ * Returns: count to be passed to read_seqcount_retry
  *
  * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
  * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
@@ -104,7 +103,9 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s)
  * Use carefully, only in critical code, and comment how the barrier is
  * provided.
  */
-static inline unsigned __read_seqcount_begin(const seqcount_t *s)
+#define __read_seqcount_begin(s)	do___read_seqcount_begin(s)
+
+static inline unsigned __read_seqcount_t_begin(const seqcount_t *s)
 {
 	unsigned ret;
 
@@ -119,14 +120,16 @@ static inline unsigned __read_seqcount_begin(const seqcount_t *s)
 
 /**
  * raw_read_seqcount() - Read the raw seqcount
- * @s: Pointer to &typedef seqcount_t
- * Returns: count to be passed to read_seqcount_retry()
+ * @s: Pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
+ * Returns: count to be passed to read_seqcount_retry
  *
  * raw_read_seqcount opens a read critical section of the given
  * seqcount without any lockdep checking and without checking or
  * masking the LSB. Calling code is responsible for handling that.
  */
-static inline unsigned raw_read_seqcount(const seqcount_t *s)
+#define raw_read_seqcount(s)	do_raw_read_seqcount(s)
+
+static inline unsigned raw_read_seqcount_t(const seqcount_t *s)
 {
 	unsigned ret = READ_ONCE(s->sequence);
 	smp_rmb();
@@ -135,38 +138,42 @@ static inline unsigned raw_read_seqcount(const seqcount_t *s)
 
 /**
  * raw_read_seqcount_begin() - start seq-read critical section w/o lockdep
- * @s: Pointer to &typedef seqcount_t
- * Returns: count to be passed to read_seqcount_retry()
+ * @s: Pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
+ * Returns: count to be passed to read_seqcount_retry
  *
  * raw_read_seqcount_begin opens a read critical section of the given
  * seqcount, but without any lockdep checking. Validity of the critical
  * section is tested by calling read_seqcount_retry().
  */
-static inline unsigned raw_read_seqcount_begin(const seqcount_t *s)
+#define raw_read_seqcount_begin(s)	do_raw_read_seqcount_begin(s)
+
+static inline unsigned raw_read_seqcount_t_begin(const seqcount_t *s)
 {
-	unsigned ret = __read_seqcount_begin(s);
+	unsigned ret = __read_seqcount_t_begin(s);
 	smp_rmb();
 	return ret;
 }
 
 /**
  * read_seqcount_begin() - begin a seq-read critical section
- * @s: Pointer to &typedef seqcount_t
- * Returns: count to be passed to read_seqcount_retry()
+ * @s: pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
+ * Returns: count to be passed to read_seqcount_retry
  *
  * read_seqcount_begin opens a read critical section of the given
  * seqcount_t.  Validity of the critical section is tested by calling
  * read_seqcount_retry().
  */
-static inline unsigned read_seqcount_begin(const seqcount_t *s)
+#define read_seqcount_begin(s)	do_read_seqcount_begin(s)
+
+static inline unsigned read_seqcount_t_begin(const seqcount_t *s)
 {
 	seqcount_lockdep_reader_access(s);
-	return raw_read_seqcount_begin(s);
+	return raw_read_seqcount_t_begin(s);
 }
 
 /**
  * raw_seqcount_begin() - begin a seq-read critical section
- * @s: Pointer to &typedef seqcount_t
+ * @s: pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  * Returns: count to be passed to read_seqcount_retry
  *
  * raw_seqcount_begin opens a read critical section of the given seqcount.
@@ -178,7 +185,9 @@ static inline unsigned read_seqcount_begin(const seqcount_t *s)
  * read_seqcount_retry() instead of stabilizing at the beginning of the
  * critical section.
  */
-static inline unsigned raw_seqcount_begin(const seqcount_t *s)
+#define raw_seqcount_begin(s)	do_raw_seqcount_begin(s)
+
+static inline unsigned raw_seqcount_t_begin(const seqcount_t *s)
 {
 	unsigned ret = READ_ONCE(s->sequence);
 	smp_rmb();
@@ -187,7 +196,7 @@ static inline unsigned raw_seqcount_begin(const seqcount_t *s)
 
 /**
  * __read_seqcount_retry() - end a seq-read critical section (without barrier)
- * @s: Pointer to &typedef seqcount_t
+ * @s: pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  * @start: count, from read_seqcount_begin
  * Returns: 1 if retry is required, else 0
  *
@@ -199,14 +208,16 @@ static inline unsigned raw_seqcount_begin(const seqcount_t *s)
  * Use carefully, only in critical code, and comment how the barrier is
  * provided.
  */
-static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
+#define __read_seqcount_retry(s, start)	do___read_seqcount_retry(s, start)
+
+static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)
 {
 	return unlikely(s->sequence != start);
 }
 
 /**
  * read_seqcount_retry() - end a seq-read critical section
- * @s: Pointer to &typedef seqcount_t
+ * @s: pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  * @start: count, from read_seqcount_begin
  * Returns: 1 if retry is required, else 0
  *
@@ -214,19 +225,25 @@ static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
  * If the critical section was invalid, it must be ignored (and typically
  * retried).
  */
-static inline int read_seqcount_retry(const seqcount_t *s, unsigned start)
+#define read_seqcount_retry(s, start)	do_read_seqcount_retry(s, start)
+
+static inline int read_seqcount_t_retry(const seqcount_t *s, unsigned start)
 {
 	smp_rmb();
-	return __read_seqcount_retry(s, start);
+	return __read_seqcount_t_retry(s, start);
 }
 
-static inline void raw_write_seqcount_begin(seqcount_t *s)
+#define raw_write_seqcount_begin(s)	do_raw_write_seqcount_begin(s)
+
+static inline void raw_write_seqcount_t_begin(seqcount_t *s)
 {
 	s->sequence++;
 	smp_wmb();
 }
 
-static inline void raw_write_seqcount_end(seqcount_t *s)
+#define raw_write_seqcount_end(s)	do_raw_write_seqcount_end(s)
+
+static inline void raw_write_seqcount_t_end(seqcount_t *s)
 {
 	smp_wmb();
 	s->sequence++;
@@ -234,7 +251,7 @@ static inline void raw_write_seqcount_end(seqcount_t *s)
 
 /**
  * raw_write_seqcount_barrier() - do a seq write barrier
- * @s: Pointer to &typedef seqcount_t
+ * @s: Pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  *
  * This can be used to provide an ordering guarantee instead of the
  * usual consistency guarantee. It is one wmb cheaper, because we can
@@ -268,7 +285,9 @@ static inline void raw_write_seqcount_end(seqcount_t *s)
  *              X = false;
  *      }
  */
-static inline void raw_write_seqcount_barrier(seqcount_t *s)
+#define raw_write_seqcount_barrier(s)	do_raw_write_seqcount_barrier(s)
+
+static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
 {
 	s->sequence++;
 	smp_wmb();
@@ -277,7 +296,7 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)
 
 /**
  * raw_read_seqcount_latch() - pick even or odd seqcount latch data copy
- * @s: Pointer to &typedef seqcount_t
+ * @s: pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  *
  * Use seqcount latching to switch between two storage places with
  * sequence protection to allow interruptible, preemptible, writer
@@ -290,7 +309,9 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)
  * which data copy to read. Full counter must then be passed to
  * read_seqcount_retry().
  */
-static inline int raw_read_seqcount_latch(seqcount_t *s)
+#define raw_read_seqcount_latch(s)	do_raw_read_seqcount_latch(s)
+
+static inline int raw_read_seqcount_t_latch(seqcount_t *s)
 {
 	/* Pairs with the first smp_wmb() in raw_write_seqcount_latch() */
 	int seq = READ_ONCE(s->sequence); /* ^^^ */
@@ -299,7 +320,7 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
 
 /**
  * raw_write_seqcount_latch() - redirect readers to even/odd copy
- * @s: Pointer to &typedef seqcount_t
+ * @s: pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  *
  * The latch technique is a multiversion concurrency control method that allows
  * queries during non-atomic modifications. If you can guarantee queries never
@@ -384,34 +405,39 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
  *	When data is a dynamic data structure; one should use regular RCU
  *	patterns to manage the lifetimes of the objects within.
  */
-static inline void raw_write_seqcount_latch(seqcount_t *s)
+#define raw_write_seqcount_latch(s)	do_raw_write_seqcount_latch(s)
+
+static inline void raw_write_seqcount_t_latch(seqcount_t *s)
 {
        smp_wmb();      /* prior stores before incrementing "sequence" */
        s->sequence++;
        smp_wmb();      /* increment "sequence" before following stores */
 }
 
-static inline void __write_seqcount_begin_nested(seqcount_t *s, int subclass)
+#define write_seqcount_begin_nested(s, subclass)		\
+	do_write_seqcount_begin_nested(s, subclass)
+
+static inline void __write_seqcount_t_begin_nested(seqcount_t *s, int subclass)
 {
-	raw_write_seqcount_begin(s);
+	raw_write_seqcount_t_begin(s);
 	seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
 }
 
-static inline void write_seqcount_begin_nested(seqcount_t *s, int subclass)
+static inline void write_seqcount_t_begin_nested(seqcount_t *s, int subclass)
 {
 	lockdep_assert_preemption_disabled();
-	__write_seqcount_begin_nested(s, subclass);
+	__write_seqcount_t_begin_nested(s, subclass);
 }
 
 /*
- * write_seqcount_begin() without lockdep non-preemptibility checks.
+ * write_seqcount_t_begin() without lockdep non-preemptibility check.
  *
  * Use for internal seqlock.h code where it's known that preemption
- * is already disabled. For example, seqlock_t write functions.
+ * is already disabled. For example, seqlock_t write side functions.
  */
-static inline void __write_seqcount_begin(seqcount_t *s)
+static inline void __write_seqcount_t_begin(seqcount_t *s)
 {
-	__write_seqcount_begin_nested(s, 0);
+	__write_seqcount_t_begin_nested(s, 0);
 }
 
 /**
@@ -422,9 +448,11 @@ static inline void __write_seqcount_begin(seqcount_t *s)
  * seqcount. Seqcount write-side critical sections must be externally
  * serialized and non-preemptible.
  */
-static inline void write_seqcount_begin(seqcount_t *s)
+#define write_seqcount_begin(s)		do_write_seqcount_begin(s)
+
+static inline void write_seqcount_t_begin(seqcount_t *s)
 {
-	write_seqcount_begin_nested(s, 0);
+	write_seqcount_t_begin_nested(s, 0);
 }
 
 /**
@@ -434,25 +462,242 @@ static inline void write_seqcount_begin(seqcount_t *s)
  * write_seqcount_end closes a write-side critical section of the given
  * seqcount.
  */
-static inline void write_seqcount_end(seqcount_t *s)
+#define write_seqcount_end(s)		do_write_seqcount_end(s)
+
+static inline void write_seqcount_t_end(seqcount_t *s)
 {
 	seqcount_release(&s->dep_map, _RET_IP_);
-	raw_write_seqcount_end(s);
+	raw_write_seqcount_t_end(s);
 }
 
 /**
  * write_seqcount_invalidate() - invalidate in-progress read-side seq operations
- * @s: Pointer to &typedef seqcount_t
+ * @s: Pointer to &typedef seqcount_t or any of the seqcount_locktype_t variants
  *
  * After write_seqcount_invalidate, no read-side seq operations will complete
  * successfully and see data older than this.
  */
-static inline void write_seqcount_invalidate(seqcount_t *s)
+#define write_seqcount_invalidate(s)	do_write_seqcount_invalidate(s)
+
+static inline void write_seqcount_t_invalidate(seqcount_t *s)
 {
 	smp_wmb();
 	s->sequence+=2;
 }
 
+/*
+ * Sequence counters with associated locks (seqcount_LOCKTYPE_t)
+ *
+ * A sequence counter which associates the lock used for writer
+ * serialization at initialization time. This enables lockdep to validate
+ * that the write side critical section is properly serialized.
+ *
+ * For assicated locks which do not implicitly disable preemption,
+ * preemption protection is enforced in the write side function.
+ *
+ * See Documentation/locking/seqlock.rst
+ */
+
+/**
+ * typedef seqcount_spinlock_t - sequence count with spinlock associated
+ * @seqcount:		The real sequence counter
+ * @lock:		Pointer to the associated spinlock
+ *
+ * A plain sequence counter with external writer synchronization by a
+ * spinlock. The spinlock is associated to the sequence count in the
+ * static initializer or init function. This enables lockdep to validate
+ * that the write side critical section is properly serialized.
+ */
+typedef struct seqcount_spinlock {
+	seqcount_t      seqcount;
+#ifdef CONFIG_LOCKDEP
+	spinlock_t	*lock;
+#endif
+} seqcount_spinlock_t;
+
+#ifdef CONFIG_LOCKDEP
+
+#define SEQCOUNT_LOCKTYPE_ZERO(seq_name, assoc_lock) {		\
+	.seqcount	= SEQCNT_ZERO(seq_name.seqcount),	\
+	.lock		= (assoc_lock),				\
+}
+
+/* Define as macro due to static lockdep key @ seqcount_init() */
+#define seqcount_locktype_init(s, assoc_lock)			\
+do {								\
+	seqcount_init(&(s)->seqcount);				\
+	(s)->lock = (assoc_lock);				\
+} while (0)
+
+#else /* !CONFIG_LOCKDEP */
+
+#define SEQCOUNT_LOCKTYPE_ZERO(seq_name, assoc_lock) {		\
+	.seqcount	= SEQCNT_ZERO(seq_name.seqcount),	\
+}
+
+#define seqcount_locktype_init(s, assoc_lock)			\
+do {								\
+	seqcount_init(&(s)->seqcount);				\
+} while (0)
+
+#endif
+
+/**
+ * SEQCNT_SPINLOCK_ZERO - static initializer for seqcount_spinlock_t
+ * @name:	Name of the &typedef seqcount_spinlock_t instance
+ * @lock:	Pointer to the associated spinlock
+ */
+#define SEQCNT_SPINLOCK_ZERO(name, lock)	\
+	SEQCOUNT_LOCKTYPE_ZERO(name, lock)
+
+/**
+ * seqcount_spinlock_init - runtime initializer for seqcount_spinlock_t
+ * @s:		Pointer to the &typedef seqcount_spinlock_t instance
+ * @lock:	Pointer to the associated spinlock
+ */
+#define seqcount_spinlock_init(s, lock)		\
+	seqcount_locktype_init(s, lock)
+
+/**
+ * typedef seqcount_raw_spinlock_t - sequence count with raw spinlock associated
+ * @seqcount:		The real sequence counter
+ * @lock:		Pointer to the associated raw spinlock
+ *
+ * A plain sequence counter with external writer synchronization by a
+ * raw spinlock. The raw spinlock is associated to the sequence count in
+ * the static initializer or init function. This enables lockdep to
+ * validate that the write side critical section is properly serialized.
+ */
+typedef struct seqcount_raw_spinlock {
+	seqcount_t      seqcount;
+#ifdef CONFIG_LOCKDEP
+	raw_spinlock_t	*lock;
+#endif
+} seqcount_raw_spinlock_t;
+
+/**
+ * SEQCNT_RAW_SPINLOCK_ZERO - static initializer for seqcount_raw_spinlock_t
+ * @name:	Name of the &typedef seqcount_raw_spinlock_t instance
+ * @lock:	Pointer to the associated raw_spinlock
+ */
+#define SEQCNT_RAW_SPINLOCK_ZERO(name, lock)	\
+	SEQCOUNT_LOCKTYPE_ZERO(name, lock)
+
+/**
+ * seqcount_raw_spinlock_init - runtime initializer for seqcount_raw_spinlock_t
+ * @s:		Pointer to the &typedef seqcount_raw_spinlock_t instance
+ * @lock:	Pointer to the associated raw_spinlock
+ */
+#define seqcount_raw_spinlock_init(s, lock)	\
+	seqcount_locktype_init(s, lock)
+
+/**
+ * typedef seqcount_rwlock_t - sequence count with rwlock associated
+ * @seqcount:		The real sequence counter
+ * @lock:		Pointer to the associated rwlock
+ *
+ * A plain sequence counter with external writer synchronization by a
+ * rwlock. The rwlock is associated to the sequence count in the static
+ * initializer or init function. This enables lockdep to validate that
+ * the write side critical section is properly serialized.
+ */
+typedef struct seqcount_rwlock {
+	seqcount_t      seqcount;
+#ifdef CONFIG_LOCKDEP
+	rwlock_t	*lock;
+#endif
+} seqcount_rwlock_t;
+
+/**
+ * SEQCNT_RWLOCK_ZERO - static initializer for seqcount_rwlock_t
+ * @name:	Name of the &typedef seqcount_rwlock_t instance
+ * @lock:	Pointer to the associated rwlock
+ */
+#define SEQCNT_RWLOCK_ZERO(name, lock)		\
+	SEQCOUNT_LOCKTYPE_ZERO(name, lock)
+
+/**
+ * seqcount_rwlock_init - runtime initializer for seqcount_rwlock_t
+ * @s:		Pointer to the &typedef seqcount_rwlock_t instance
+ * @lock:	Pointer to the associated rwlock
+ */
+#define seqcount_rwlock_init(s, lock)		\
+	seqcount_locktype_init(s, lock)
+
+/**
+ * typedef seqcount_mutex_t - sequence count with mutex associated
+ * @seqcount:		The real sequence counter
+ * @lock:		Pointer to the associated mutex
+ *
+ * A plain sequence counter with external writer synchronization by a
+ * mutex. The mutex is associated to the sequence counter in the static
+ * initializer or init function. This enables lockdep to validate that
+ * the write side critical section is properly serialized.
+ *
+ * The write side API functions write_seqcount_begin()/end() automatically
+ * disable and enable preemption when used with seqcount_mutex_t.
+ */
+typedef struct seqcount_mutex {
+	seqcount_t      seqcount;
+#ifdef CONFIG_LOCKDEP
+	struct mutex	*lock;
+#endif
+} seqcount_mutex_t;
+
+/**
+ * SEQCNT_MUTEX_ZERO - static initializer for seqcount_mutex_t
+ * @name:	Name of the &typedef seqcount_mutex_t instance
+ * @lock:	Pointer to the associated mutex
+ */
+#define SEQCNT_MUTEX_ZERO(name, lock)		\
+	SEQCOUNT_LOCKTYPE_ZERO(name, lock)
+
+/**
+ * seqcount_mutex_init - runtime initializer for seqcount_mutex_t
+ * @s:		Pointer to the &typedef seqcount_mutex_t instance
+ * @lock:	Pointer to the associated mutex
+ */
+#define seqcount_mutex_init(s, lock)		\
+	seqcount_locktype_init(s, lock)
+
+/**
+ * typedef seqcount_ww_mutex_t - sequence count with ww_mutex associated
+ * @seqcount:		The real sequence counter
+ * @lock:		Pointer to the associated ww_mutex
+ *
+ * A plain sequence counter with external writer synchronization by a
+ * ww_mutex. The ww_mutex is associated to the sequence counter in the static
+ * initializer or init function. This enables lockdep to validate that
+ * the write side critical section is properly serialized.
+ *
+ * The write side API functions write_seqcount_begin()/end() automatically
+ * disable and enable preemption when used with seqcount_ww_mutex_t.
+ */
+typedef struct seqcount_ww_mutex {
+	seqcount_t      seqcount;
+#ifdef CONFIG_LOCKDEP
+	struct ww_mutex	*lock;
+#endif
+} seqcount_ww_mutex_t;
+
+/**
+ * SEQCNT_WW_MUTEX_ZERO - static initializer for seqcount_ww_mutex_t
+ * @name:	Name of the &typedef seqcount_ww_mutex_t instance
+ * @lock:	Pointer to the associated ww_mutex
+ */
+#define SEQCNT_WW_MUTEX_ZERO(name, lock)	\
+	SEQCOUNT_LOCKTYPE_ZERO(name, lock)
+
+/**
+ * seqcount_ww_mutex_init - runtime initializer for seqcount_ww_mutex_t
+ * @s:		Pointer to the &typedef seqcount_ww_mutex_t instance
+ * @lock:	Pointer to the associated ww_mutex
+ */
+#define seqcount_ww_mutex_init(s, lock)		\
+	seqcount_locktype_init(s, lock)
+
+#include <linux/seqlock_types_internal.h>
+
 /*
  * Sequential locks (seqlock_t)
  *
@@ -475,7 +720,7 @@ typedef struct {
 
 /**
  * seqlock_init() - dynamic initializer for seqlock_t
- * @sl: Pointer to the seqlock_t instance
+ * @sl: Pointer to the &typedef seqlock_t instance
  */
 #define seqlock_init(sl)				\
 	do {						\
@@ -502,7 +747,7 @@ typedef struct {
  */
 static inline unsigned read_seqbegin(const seqlock_t *sl)
 {
-	return read_seqcount_begin(&sl->seqcount);
+	return read_seqcount_t_begin(&sl->seqcount);
 }
 
 /**
@@ -518,7 +763,7 @@ static inline unsigned read_seqbegin(const seqlock_t *sl)
  */
 static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 {
-	return read_seqcount_retry(&sl->seqcount, start);
+	return read_seqcount_t_retry(&sl->seqcount, start);
 }
 
 /**
@@ -539,7 +784,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 static inline void write_seqlock(seqlock_t *sl)
 {
 	spin_lock(&sl->lock);
-	__write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_t_begin(&sl->seqcount);
 }
 
 /**
@@ -551,7 +796,7 @@ static inline void write_seqlock(seqlock_t *sl)
  */
 static inline void write_sequnlock(seqlock_t *sl)
 {
-	write_seqcount_end(&sl->seqcount);
+	write_seqcount_t_end(&sl->seqcount);
 	spin_unlock(&sl->lock);
 }
 
@@ -569,7 +814,7 @@ static inline void write_sequnlock(seqlock_t *sl)
 static inline void write_seqlock_bh(seqlock_t *sl)
 {
 	spin_lock_bh(&sl->lock);
-	__write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_t_begin(&sl->seqcount);
 }
 
 /**
@@ -582,7 +827,7 @@ static inline void write_seqlock_bh(seqlock_t *sl)
  */
 static inline void write_sequnlock_bh(seqlock_t *sl)
 {
-	write_seqcount_end(&sl->seqcount);
+	write_seqcount_t_end(&sl->seqcount);
 	spin_unlock_bh(&sl->lock);
 }
 
@@ -597,7 +842,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
 static inline void write_seqlock_irq(seqlock_t *sl)
 {
 	spin_lock_irq(&sl->lock);
-	__write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_t_begin(&sl->seqcount);
 }
 
 /**
@@ -612,7 +857,7 @@ static inline void write_seqlock_irq(seqlock_t *sl)
  */
 static inline void write_sequnlock_irq(seqlock_t *sl)
 {
-	write_seqcount_end(&sl->seqcount);
+	write_seqcount_t_end(&sl->seqcount);
 	spin_unlock_irq(&sl->lock);
 }
 
@@ -621,7 +866,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 	unsigned long flags;
 
 	spin_lock_irqsave(&sl->lock, flags);
-	__write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_t_begin(&sl->seqcount);
 	return flags;
 }
 
@@ -658,7 +903,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 static inline void
 write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
 {
-	write_seqcount_end(&sl->seqcount);
+	write_seqcount_t_end(&sl->seqcount);
 	spin_unlock_irqrestore(&sl->lock, flags);
 }
 
diff --git a/include/linux/seqlock_types_internal.h b/include/linux/seqlock_types_internal.h
new file mode 100644
index 000000000000..de635f4c7297
--- /dev/null
+++ b/include/linux/seqlock_types_internal.h
@@ -0,0 +1,187 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_SEQLOCK_TYPES_INTERNAL_H
+#define __LINUX_SEQLOCK_TYPES_INTERNAL_H
+
+/*
+ * Sequence counters with associated locks
+ *
+ * Copyright (C) 2020 Linutronix GmbH
+ */
+
+#ifndef __LINUX_SEQLOCK_H
+#error This is an INTERNAL header; it must only be included by seqlock.h
+#endif
+
+#include <linux/mutex.h>
+#include <linux/rwlock.h>
+#include <linux/spinlock.h>
+#include <linux/ww_mutex.h>
+
+/*
+ * @s: pointer to seqcount_t or any of the seqcount_locktype_t variants
+ */
+#define __to_seqcount_t(s)						\
+({									\
+	seqcount_t *seq;						\
+									\
+	if (__same_type(*(s), seqcount_t))				\
+		seq = (seqcount_t *)(s);				\
+	else if (__same_type(*(s), seqcount_spinlock_t))		\
+		seq = &((seqcount_spinlock_t *)(s))->seqcount;		\
+	else if (__same_type(*(s), seqcount_raw_spinlock_t))		\
+		seq = &((seqcount_raw_spinlock_t *)(s))->seqcount;	\
+	else if (__same_type(*(s), seqcount_rwlock_t))			\
+		seq = &((seqcount_rwlock_t *)(s))->seqcount;		\
+	else if (__same_type(*(s), seqcount_mutex_t))			\
+		seq = &((seqcount_mutex_t *)(s))->seqcount;		\
+	else if (__same_type(*(s), seqcount_ww_mutex_t))		\
+		seq = &((seqcount_ww_mutex_t *)(s))->seqcount;		\
+	else								\
+		BUILD_BUG_ON_MSG(1, "Unknown seqcount type");		\
+									\
+	seq;								\
+})
+
+/*
+ *	seqcount_LOCKTYPE_t -- write APIs
+ *
+ * For associated lock types which do not implicitly disable preemption,
+ * enforce preemption protection in the write side functions.
+ *
+ * Never use lockdep for the raw write variants.
+ */
+
+#define __associated_lock_is_preemptible(s)				\
+({									\
+	bool ret;							\
+									\
+	if (__same_type(*(s), seqcount_t) ||				\
+	    __same_type(*(s), seqcount_spinlock_t) ||			\
+	    __same_type(*(s), seqcount_raw_spinlock_t) ||		\
+	    __same_type(*(s), seqcount_rwlock_t)) {			\
+		ret = false;						\
+	} else if (__same_type(*(s), seqcount_mutex_t) ||		\
+		   __same_type(*(s), seqcount_ww_mutex_t)) {		\
+		ret = true;						\
+	} else								\
+		BUILD_BUG_ON_MSG(1, "Unknown seqcount type");		\
+									\
+	ret;								\
+})
+
+#ifdef CONFIG_LOCKDEP
+
+#define __assert_associated_lock_held(s)				\
+do {									\
+	if (__same_type(*(s), seqcount_t))				\
+		break;							\
+									\
+	if (__same_type(*(s), seqcount_spinlock_t))			\
+		lockdep_assert_held(((seqcount_spinlock_t *)(s))->lock);\
+	else if (__same_type(*(s), seqcount_raw_spinlock_t))		\
+		lockdep_assert_held(((seqcount_raw_spinlock_t *)(s))->lock);	\
+	else if (__same_type(*(s), seqcount_rwlock_t))			\
+		lockdep_assert_held_write(((seqcount_rwlock_t *)(s))->lock);	\
+	else if (__same_type(*(s), seqcount_mutex_t))			\
+		lockdep_assert_held(((seqcount_mutex_t *)(s))->lock);	\
+	else if (__same_type(*(s), seqcount_ww_mutex_t))		\
+		lockdep_assert_held(&((seqcount_ww_mutex_t *)(s))->lock->base);	\
+	else								\
+		BUILD_BUG_ON_MSG(1, "Unknown seqcount type");		\
+} while (0)
+
+#else
+
+#define __assert_associated_lock_held(s)				\
+do {									\
+	(void) __to_seqcount_t(s);					\
+} while (0)
+
+#endif /* CONFIG_LOCKDEP */
+
+#define do_raw_write_seqcount_begin(s)					\
+do {									\
+	if (__associated_lock_is_preemptible(s))			\
+		preempt_disable();					\
+									\
+	raw_write_seqcount_t_begin(__to_seqcount_t(s));			\
+} while (0)
+
+#define do_raw_write_seqcount_end(s)					\
+do {									\
+	raw_write_seqcount_t_end(__to_seqcount_t(s));			\
+									\
+	if (__associated_lock_is_preemptible(s))			\
+		preempt_enable();					\
+} while (0)
+
+#define do_write_seqcount_begin_nested(s, subclass)			\
+do {									\
+	__assert_associated_lock_held(s);				\
+									\
+	if (__associated_lock_is_preemptible(s))			\
+		preempt_disable();					\
+									\
+	write_seqcount_t_begin_nested(__to_seqcount_t(s), subclass);	\
+} while (0)
+
+#define do_write_seqcount_begin(s)					\
+do {									\
+	__assert_associated_lock_held(s);				\
+									\
+	if (__associated_lock_is_preemptible(s))			\
+		preempt_disable();					\
+									\
+	write_seqcount_t_begin(__to_seqcount_t(s));			\
+} while (0)
+
+#define do_write_seqcount_end(s)					\
+do {									\
+	write_seqcount_t_end(__to_seqcount_t(s));			\
+									\
+	if (__associated_lock_is_preemptible(s))			\
+		preempt_enable();					\
+} while (0)
+
+#define do_write_seqcount_invalidate(s)					\
+	write_seqcount_t_invalidate(__to_seqcount_t(s))
+
+#define do_raw_write_seqcount_barrier(s)				\
+	raw_write_seqcount_t_barrier(__to_seqcount_t(s))
+
+/*
+ * Latch sequence counters write side critical sections don't need to
+ * run with preemption disabled. Check @raw_write_seqcount_latch().
+ */
+#define do_raw_write_seqcount_latch(s)					\
+	raw_write_seqcount_t_latch(__to_seqcount_t(s))
+
+/*
+ *	seqcount_LOCKTYPE_t -- read APIs
+ */
+
+#define do___read_seqcount_begin(s)					\
+	__read_seqcount_t_begin(__to_seqcount_t(s))
+
+#define do_raw_read_seqcount(s)						\
+	raw_read_seqcount_t(__to_seqcount_t(s))
+
+#define do_raw_seqcount_begin(s)					\
+	raw_seqcount_t_begin(__to_seqcount_t(s))
+
+#define do_raw_read_seqcount_begin(s)					\
+	raw_read_seqcount_t_begin(__to_seqcount_t(s))
+
+#define do_read_seqcount_begin(s)					\
+	read_seqcount_t_begin(__to_seqcount_t(s))
+
+#define do_raw_read_seqcount_latch(s)					\
+	raw_read_seqcount_t_latch(__to_seqcount_t(s))
+
+#define do___read_seqcount_retry(s, start)				\
+	__read_seqcount_t_retry(__to_seqcount_t(s), start)
+
+#define do_read_seqcount_retry(s, start)				\
+	read_seqcount_t_retry(__to_seqcount_t(s), start)
+
+#endif /* __LINUX_SEQLOCK_TYPES_INTERNAL_H */
-- 
2.20.1


^ permalink raw reply related	[relevance 37%]

* [PATCH v1 14/25] sched: tasks: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (12 preceding siblings ...)
  2020-05-19 21:45 81% ` [PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex Ahmed S. Darwish
@ 2020-05-19 21:45 93% ` Ahmed S. Darwish
  2020-05-19 21:45 92% ` [PATCH v1 15/25] netfilter: conntrack: " Ahmed S. Darwish
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/sched.h | 2 +-
 init/init_task.c      | 3 ++-
 kernel/fork.c         | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4418f5cb8324..a9ce6fbeb735 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1046,7 +1046,7 @@ struct task_struct {
 	/* Protected by ->alloc_lock: */
 	nodemask_t			mems_allowed;
 	/* Seqence number to catch updates: */
-	seqcount_t			mems_allowed_seq;
+	seqcount_spinlock_t		mems_allowed_seq;
 	int				cpuset_mem_spread_rotor;
 	int				cpuset_slab_spread_rotor;
 #endif
diff --git a/init/init_task.c b/init/init_task.c
index bd403ed3e418..94bf4aea8293 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -142,7 +142,8 @@ struct task_struct init_task
 	.rcu_tasks_idle_cpu = -1,
 #endif
 #ifdef CONFIG_CPUSETS
-	.mems_allowed_seq = SEQCNT_ZERO(init_task.mems_allowed_seq),
+	.mems_allowed_seq = SEQCNT_SPINLOCK_ZERO(init_task.mems_allowed_seq,
+						 &init_task.alloc_lock),
 #endif
 #ifdef CONFIG_RT_MUTEXES
 	.pi_waiters	= RB_ROOT_CACHED,
diff --git a/kernel/fork.c b/kernel/fork.c
index 8c700f881d92..a0fde1f17e0a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2019,7 +2019,7 @@ static __latent_entropy struct task_struct *copy_process(
 #ifdef CONFIG_CPUSETS
 	p->cpuset_mem_spread_rotor = NUMA_NO_NODE;
 	p->cpuset_slab_spread_rotor = NUMA_NO_NODE;
-	seqcount_init(&p->mems_allowed_seq);
+	seqcount_spinlock_init(&p->mems_allowed_seq, &p->alloc_lock);
 #endif
 #ifdef CONFIG_TRACE_IRQFLAGS
 	p->irq_events = 0;
-- 
2.20.1


^ permalink raw reply related	[relevance 93%]

* [PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (11 preceding siblings ...)
  2020-05-19 21:45 37% ` [PATCH v1 12/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
@ 2020-05-19 21:45 81% ` Ahmed S. Darwish
    2020-05-19 21:45 93% ` [PATCH v1 14/25] sched: tasks: Use sequence counter with associated spinlock Ahmed S. Darwish
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Sumit Semwal,
	Felix Kuehling, Alex Deucher, Christian König,
	David (ChunMing) Zhou, David Airlie, Daniel Vetter, linux-media,
	dri-devel, amd-gfx

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.

The dma-buf reservation subsystem uses plain sequence counters to manage
updates to reservations. Writer serialization is accomplished through a
wound/wait mutex.

Acquiring a wound/wait mutex does not disable preemption, so this needs
to be done manually before and after the write side critical section.

Use the newly-added seqcount_ww_mutex_t instead:

  - It associates the ww_mutex with the sequence count, which enables
    lockdep to validate that the write side critical section is properly
    serialized.

  - It removes the need to explicitly add preempt_disable/enable()
    around the write side critical section because the write_begin/end()
    functions for this new data type automatically do this.

If lockdep is disabled this ww_mutex lock association is compiled out
and has neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 drivers/dma-buf/dma-resv.c                       | 8 +-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 --
 include/linux/dma-resv.h                         | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 590ce7ad60a0..3aba2b2bfc48 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -128,7 +128,7 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
-	seqcount_init(&obj->seq);
+	seqcount_ww_mutex_init(&obj->seq, &obj->lock);
 
 	RCU_INIT_POINTER(obj->fence, NULL);
 	RCU_INIT_POINTER(obj->fence_excl, NULL);
@@ -259,7 +259,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 	fobj = dma_resv_get_list(obj);
 	count = fobj->shared_count;
 
-	preempt_disable();
 	write_seqcount_begin(&obj->seq);
 
 	for (i = 0; i < count; ++i) {
@@ -281,7 +280,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 	smp_store_mb(fobj->shared_count, count);
 
 	write_seqcount_end(&obj->seq);
-	preempt_enable();
 	dma_fence_put(old);
 }
 EXPORT_SYMBOL(dma_resv_add_shared_fence);
@@ -308,14 +306,12 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 	if (fence)
 		dma_fence_get(fence);
 
-	preempt_disable();
 	write_seqcount_begin(&obj->seq);
 	/* write_seqcount_begin provides the necessary memory barrier */
 	RCU_INIT_POINTER(obj->fence_excl, fence);
 	if (old)
 		old->shared_count = 0;
 	write_seqcount_end(&obj->seq);
-	preempt_enable();
 
 	/* inplace update, no shared fences */
 	while (i--)
@@ -393,13 +389,11 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 	src_list = dma_resv_get_list(dst);
 	old = dma_resv_get_excl(dst);
 
-	preempt_disable();
 	write_seqcount_begin(&dst->seq);
 	/* write_seqcount_begin provides the necessary memory barrier */
 	RCU_INIT_POINTER(dst->fence_excl, new);
 	RCU_INIT_POINTER(dst->fence, dst_list);
 	write_seqcount_end(&dst->seq);
-	preempt_enable();
 
 	dma_resv_list_free(src_list);
 	dma_fence_put(old);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..87fd32aae8f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -258,11 +258,9 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 	new->shared_count = k;
 
 	/* Install the new fence list, seqcount provides the barriers */
-	preempt_disable();
 	write_seqcount_begin(&resv->seq);
 	RCU_INIT_POINTER(resv->fence, new);
 	write_seqcount_end(&resv->seq);
-	preempt_enable();
 
 	/* Drop the references to the removed fences or move them to ef_list */
 	for (i = j, k = 0; i < old->shared_count; ++i) {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index a6538ae7d93f..d44a77e8a7e3 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -69,7 +69,7 @@ struct dma_resv_list {
  */
 struct dma_resv {
 	struct ww_mutex lock;
-	seqcount_t seq;
+	seqcount_ww_mutex_t seq;
 
 	struct dma_fence __rcu *fence_excl;
 	struct dma_resv_list __rcu *fence;
-- 
2.20.1


^ permalink raw reply related	[relevance 81%]

* [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
@ 2020-05-19 21:45 72% Ahmed S. Darwish
  2020-05-19 21:45 82% ` [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount Ahmed S. Darwish
                   ` (24 more replies)
  0 siblings, 25 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, David S. Miller,
	Andrew Morton, Jens Axboe, Jonathan Corbet, Alexander Viro,
	David Airlie, Daniel Vetter, netdev, linux-mm, linux-block,
	dri-devel, linux-fsdevel, linux-doc

Hi,

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the write side critical section.

There is no built-in debugging mechanism to verify that the lock used
for writer serialization is held and preemption is disabled. Some usage
sites like dma-buf have explicit lockdep checks for the writer-side
lock, but this covers only a small portion of the sequence counter usage
in the kernel.

Add new sequence counter types which allows to associate a lock to the
sequence counter at initialization time. The seqcount API functions are
extended to provide appropriate lockdep assertions depending on the
seqcount/lock type.

For sequence counters with associated locks that do not implicitly
disable preemption, preemption protection is enforced in the sequence
counter write side functions. This removes the need to explicitly add
preempt_disable/enable() around the write side critical sections: the
write_begin/end() functions for these new sequence counter types
automatically do this.

Extend the lockdep API with a macro asserting that preemption is
disabled.  Use it to verify that preemption is disabled for all sequence
counters write side critical sections.

If lockdep is disabled, these lock associations and non-preemptibility
checks are compiled out and have neither storage size nor runtime
overhead. If lockdep is enabled, a pointer to the lock is stored in the
seqcount and the write side API functions enable lockdep assertions.

The following seqcount types with associated locks are introduced:

     seqcount_spinlock_t
     seqcount_raw_spinlock_t
     seqcount_rwlock_t
     seqcount_mutex_t
     seqcount_ww_mutex_t

This lock association is not only useful for debugging purposes, it also
provides a mechanism for PREEMPT_RT to prevent writer starvation. On RT
kernels spinlocks and rwlocks are substituted with sleeping locks and
the code sections protected by these locks become preemptible, which has
the same problem as write side critical section with preemption enabled
on a non-RT kernel. RT utilizes this association by storing the provided
lock pointer and in case that a reader sees an active writer (seqcount
is odd), it does not spin, but blocks on the associated lock similar to
read_seqbegin_or_lock().

By using the lockdep debugging mechanisms added in this patch series, a
number of erroneous seqcount call-sites were discovered across the
kernel. The fixes are included at the beginning of the series.

Thanks,

8<--------------

Ahmed S. Darwish (25):
  net: core: device_rename: Use rwsem instead of a seqcount
  mm/swap: Don't abuse the seqcount latching API
  net: phy: fixed_phy: Remove unused seqcount
  block: nr_sects_write(): Disable preemption on seqcount write
  u64_stats: Document writer non-preemptibility requirement
  dma-buf: Remove custom seqcount lockdep class key
  lockdep: Add preemption disabled assertion API
  seqlock: lockdep assert non-preemptibility on seqcount_t write
  Documentation: locking: Describe seqlock design and usage
  seqlock: Add RST directives to kernel-doc code samples and notes
  seqlock: Add missing kernel-doc annotations
  seqlock: Extend seqcount API with associated locks
  dma-buf: Use sequence counter with associated wound/wait mutex
  sched: tasks: Use sequence counter with associated spinlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  xfrm: policy: Use sequence counters with associated lock
  timekeeping: Use sequence counter with associated raw spinlock
  vfs: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  hrtimer: Use sequence counter with associated raw spinlock

 Documentation/locking/index.rst               |   1 +
 Documentation/locking/seqlock.rst             | 239 +++++
 MAINTAINERS                                   |   2 +-
 block/blk-iocost.c                            |   5 +-
 block/blk.h                                   |   2 +
 drivers/dma-buf/dma-resv.c                    |  15 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 -
 drivers/md/raid5.c                            |   2 +-
 drivers/md/raid5.h                            |   2 +-
 drivers/net/phy/fixed_phy.c                   |  25 +-
 fs/dcache.c                                   |   2 +-
 fs/fs_struct.c                                |   4 +-
 fs/nfs/nfs4_fs.h                              |   2 +-
 fs/nfs/nfs4state.c                            |   2 +-
 fs/userfaultfd.c                              |   4 +-
 include/linux/dcache.h                        |   2 +-
 include/linux/dma-resv.h                      |   4 +-
 include/linux/fs_struct.h                     |   2 +-
 include/linux/hrtimer.h                       |   2 +-
 include/linux/kvm_irqfd.h                     |   2 +-
 include/linux/lockdep.h                       |   9 +
 include/linux/sched.h                         |   2 +-
 include/linux/seqlock.h                       | 882 +++++++++++++++---
 include/linux/seqlock_types_internal.h        | 187 ++++
 include/linux/u64_stats_sync.h                |  38 +-
 include/net/netfilter/nf_conntrack.h          |   2 +-
 init/init_task.c                              |   3 +-
 kernel/fork.c                                 |   2 +-
 kernel/locking/lockdep.c                      |  15 +
 kernel/time/hrtimer.c                         |  13 +-
 kernel/time/timekeeping.c                     |  19 +-
 lib/Kconfig.debug                             |   1 +
 mm/swap.c                                     |  57 +-
 net/core/dev.c                                |  30 +-
 net/netfilter/nf_conntrack_core.c             |   5 +-
 net/netfilter/nft_set_rbtree.c                |   4 +-
 net/xfrm/xfrm_policy.c                        |  10 +-
 virt/kvm/eventfd.c                            |   2 +-
 38 files changed, 1325 insertions(+), 277 deletions(-)
 create mode 100644 Documentation/locking/seqlock.rst
 create mode 100644 include/linux/seqlock_types_internal.h

base-commit: 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
--
2.20.1

^ permalink raw reply	[relevance 72%]

* [PATCH v1 11/25] seqlock: Add missing kernel-doc annotations
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (9 preceding siblings ...)
  2020-05-19 21:45 82% ` [PATCH v1 10/25] seqlock: Add RST directives to kernel-doc code samples and notes Ahmed S. Darwish
@ 2020-05-19 21:45 42% ` Ahmed S. Darwish
  2020-05-19 21:45 37% ` [PATCH v1 12/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Jonathan Corbet,
	linux-doc

A small number of the the exported seqlock.h functions are kernel-doc
annotated.

Since seqlock.h is now included by the kernel's RST documentation, add
kernel-doc annotations for all of the remaining functions.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/seqlock.h | 414 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 361 insertions(+), 53 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index dfec0c9c19c4..dd55555ff607 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -57,6 +57,10 @@ static inline void __seqcount_init(seqcount_t *s, const char *name,
 # define SEQCOUNT_DEP_MAP_INIT(lockname) \
 		.dep_map = { .name = #lockname } \
 
+/**
+ * seqcount_init() - runtime initializer for seqcount_t
+ * @s: Pointer to the &typedef seqcount_t instance
+ */
 # define seqcount_init(s)				\
 	do {						\
 		static struct lock_class_key __key;	\
@@ -80,13 +84,17 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s)
 # define seqcount_lockdep_reader_access(x)
 #endif
 
-#define SEQCNT_ZERO(lockname) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(lockname)}
+/**
+ * SEQCNT_ZERO() - static initializer for seqcount_t
+ * @name: Name of the &typedef seqcount_t instance
+ */
+#define SEQCNT_ZERO(name) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(name) }
 
 
 /**
- * __read_seqcount_begin - begin a seq-read critical section (without barrier)
- * @s: pointer to seqcount_t
- * Returns: count to be passed to read_seqcount_retry
+ * __read_seqcount_begin() - begin a seq-read critical section (without barrier)
+ * @s: Pointer to &typedef seqcount_t
+ * Returns: count to be passed to read_seqcount_retry()
  *
  * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
  * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
@@ -110,9 +118,9 @@ static inline unsigned __read_seqcount_begin(const seqcount_t *s)
 }
 
 /**
- * raw_read_seqcount - Read the raw seqcount
- * @s: pointer to seqcount_t
- * Returns: count to be passed to read_seqcount_retry
+ * raw_read_seqcount() - Read the raw seqcount
+ * @s: Pointer to &typedef seqcount_t
+ * Returns: count to be passed to read_seqcount_retry()
  *
  * raw_read_seqcount opens a read critical section of the given
  * seqcount without any lockdep checking and without checking or
@@ -126,13 +134,13 @@ static inline unsigned raw_read_seqcount(const seqcount_t *s)
 }
 
 /**
- * raw_read_seqcount_begin - start seq-read critical section w/o lockdep
- * @s: pointer to seqcount_t
- * Returns: count to be passed to read_seqcount_retry
+ * raw_read_seqcount_begin() - start seq-read critical section w/o lockdep
+ * @s: Pointer to &typedef seqcount_t
+ * Returns: count to be passed to read_seqcount_retry()
  *
  * raw_read_seqcount_begin opens a read critical section of the given
  * seqcount, but without any lockdep checking. Validity of the critical
- * section is tested by checking read_seqcount_retry function.
+ * section is tested by calling read_seqcount_retry().
  */
 static inline unsigned raw_read_seqcount_begin(const seqcount_t *s)
 {
@@ -142,13 +150,13 @@ static inline unsigned raw_read_seqcount_begin(const seqcount_t *s)
 }
 
 /**
- * read_seqcount_begin - begin a seq-read critical section
- * @s: pointer to seqcount_t
- * Returns: count to be passed to read_seqcount_retry
+ * read_seqcount_begin() - begin a seq-read critical section
+ * @s: Pointer to &typedef seqcount_t
+ * Returns: count to be passed to read_seqcount_retry()
  *
- * read_seqcount_begin opens a read critical section of the given seqcount.
- * Validity of the critical section is tested by checking read_seqcount_retry
- * function.
+ * read_seqcount_begin opens a read critical section of the given
+ * seqcount_t.  Validity of the critical section is tested by calling
+ * read_seqcount_retry().
  */
 static inline unsigned read_seqcount_begin(const seqcount_t *s)
 {
@@ -157,8 +165,8 @@ static inline unsigned read_seqcount_begin(const seqcount_t *s)
 }
 
 /**
- * raw_seqcount_begin - begin a seq-read critical section
- * @s: pointer to seqcount_t
+ * raw_seqcount_begin() - begin a seq-read critical section
+ * @s: Pointer to &typedef seqcount_t
  * Returns: count to be passed to read_seqcount_retry
  *
  * raw_seqcount_begin opens a read critical section of the given seqcount.
@@ -178,8 +186,8 @@ static inline unsigned raw_seqcount_begin(const seqcount_t *s)
 }
 
 /**
- * __read_seqcount_retry - end a seq-read critical section (without barrier)
- * @s: pointer to seqcount_t
+ * __read_seqcount_retry() - end a seq-read critical section (without barrier)
+ * @s: Pointer to &typedef seqcount_t
  * @start: count, from read_seqcount_begin
  * Returns: 1 if retry is required, else 0
  *
@@ -197,8 +205,8 @@ static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
 }
 
 /**
- * read_seqcount_retry - end a seq-read critical section
- * @s: pointer to seqcount_t
+ * read_seqcount_retry() - end a seq-read critical section
+ * @s: Pointer to &typedef seqcount_t
  * @start: count, from read_seqcount_begin
  * Returns: 1 if retry is required, else 0
  *
@@ -225,8 +233,8 @@ static inline void raw_write_seqcount_end(seqcount_t *s)
 }
 
 /**
- * raw_write_seqcount_barrier - do a seq write barrier
- * @s: pointer to seqcount_t
+ * raw_write_seqcount_barrier() - do a seq write barrier
+ * @s: Pointer to &typedef seqcount_t
  *
  * This can be used to provide an ordering guarantee instead of the
  * usual consistency guarantee. It is one wmb cheaper, because we can
@@ -267,6 +275,21 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)
 	s->sequence++;
 }
 
+/**
+ * raw_read_seqcount_latch() - pick even or odd seqcount latch data copy
+ * @s: Pointer to &typedef seqcount_t
+ *
+ * Use seqcount latching to switch between two storage places with
+ * sequence protection to allow interruptible, preemptible, writer
+ * sections.
+ *
+ * Check raw_write_seqcount_latch() for more details and a full reader
+ * and writer usage example.
+ *
+ * Return: sequence counter. Use the lowest bit as index for picking
+ * which data copy to read. Full counter must then be passed to
+ * read_seqcount_retry().
+ */
 static inline int raw_read_seqcount_latch(seqcount_t *s)
 {
 	/* Pairs with the first smp_wmb() in raw_write_seqcount_latch() */
@@ -275,8 +298,8 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
 }
 
 /**
- * raw_write_seqcount_latch - redirect readers to even/odd copy
- * @s: pointer to seqcount_t
+ * raw_write_seqcount_latch() - redirect readers to even/odd copy
+ * @s: Pointer to &typedef seqcount_t
  *
  * The latch technique is a multiversion concurrency control method that allows
  * queries during non-atomic modifications. If you can guarantee queries never
@@ -336,8 +359,8 @@ static inline int raw_read_seqcount_latch(seqcount_t *s)
  *			idx = seq & 0x01;
  *			entry = data_query(latch->data[idx], ...);
  *
- *			smp_rmb();
- *		} while (seq != latch->seq);
+ *			// read_seqcount_retry() includes necessary smp_rmb()
+ *		} while (read_seqcount_retry(&latch->seq, seq);
  *
  *		return entry;
  *	}
@@ -391,11 +414,26 @@ static inline void __write_seqcount_begin(seqcount_t *s)
 	__write_seqcount_begin_nested(s, 0);
 }
 
+/**
+ * write_seqcount_begin() - start a seqcount write-side critical section
+ * @s: Pointer to &typedef seqcount_t
+ *
+ * write_seqcount_begin opens a write-side critical section of the given
+ * seqcount. Seqcount write-side critical sections must be externally
+ * serialized and non-preemptible.
+ */
 static inline void write_seqcount_begin(seqcount_t *s)
 {
 	write_seqcount_begin_nested(s, 0);
 }
 
+/**
+ * write_seqcount_end() - end a seqcount write-side critical section
+ * @s: Pointer to &typedef seqcount_t
+ *
+ * write_seqcount_end closes a write-side critical section of the given
+ * seqcount.
+ */
 static inline void write_seqcount_end(seqcount_t *s)
 {
 	seqcount_release(&s->dep_map, _RET_IP_);
@@ -403,8 +441,8 @@ static inline void write_seqcount_end(seqcount_t *s)
 }
 
 /**
- * write_seqcount_invalidate - invalidate in-progress read-side seq operations
- * @s: pointer to seqcount_t
+ * write_seqcount_invalidate() - invalidate in-progress read-side seq operations
+ * @s: Pointer to &typedef seqcount_t
  *
  * After write_seqcount_invalidate, no read-side seq operations will complete
  * successfully and see data older than this.
@@ -435,32 +473,68 @@ typedef struct {
 		.lock =	__SPIN_LOCK_UNLOCKED(lockname)	\
 	}
 
-#define seqlock_init(x)					\
+/**
+ * seqlock_init() - dynamic initializer for seqlock_t
+ * @sl: Pointer to the seqlock_t instance
+ */
+#define seqlock_init(sl)				\
 	do {						\
-		seqcount_init(&(x)->seqcount);		\
-		spin_lock_init(&(x)->lock);		\
+		seqcount_init(&(sl)->seqcount);		\
+		spin_lock_init(&(sl)->lock);		\
 	} while (0)
 
-#define DEFINE_SEQLOCK(x) \
-		seqlock_t x = __SEQLOCK_UNLOCKED(x)
+/**
+ * DEFINE_SEQLOCK() - Define a statically-allocated seqlock_t
+ * @sl: Name of the &typedef seqlock_t instance
+ */
+#define DEFINE_SEQLOCK(sl) \
+		seqlock_t sl = __SEQLOCK_UNLOCKED(sl)
 
-/*
- * Read side functions for starting and finalizing a read side section.
+/**
+ * read_seqbegin() - start a seqlock_t read-side critical section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_seqbegin opens a read side critical section of the given
+ * seqlock_t. Validity of the critical section is tested by checking
+ * read_seqretry().
+ *
+ * Return: count to be passed to read_seqretry()
  */
 static inline unsigned read_seqbegin(const seqlock_t *sl)
 {
 	return read_seqcount_begin(&sl->seqcount);
 }
 
+/**
+ * read_seqretry() - end a seqlock_t read side critical section
+ * @sl: Pointer to &typedef seqlock_t
+ * @start: count, from read_seqbegin()
+ *
+ * read_seqretry closes a read side critical section of the given
+ * seqlock_t. If the read side critical section was invalid, it must be
+ * ignored and retried.
+ *
+ * Return: 1 if a retry is required, 0 otherwise
+ */
 static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 {
 	return read_seqcount_retry(&sl->seqcount, start);
 }
 
-/*
- * Lock out other writers and update the count.
- * Acts like a normal spin_lock/unlock.
- * Don't need preempt_disable() because that is in the spin_lock already.
+/**
+ * write_seqlock() - start a seqlock_t write side critical section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * write_seqlock opens a write side critical section of the given
+ * seqlock_t.  It also acquires the spinlock embedded inside the
+ * sequential lock. All seqlock_t write side critical sections are thus
+ * automatically serialized and non-preemptible.
+ *
+ * If the seqlock_t read side section can be invoked from a hardirq or
+ * softirq context, the ``_irqsave`` and ``_bh`` variants of this
+ * function must be respectively used instead.
+ *
+ * The opened write side section must be closed with write_sequnlock().
  */
 static inline void write_seqlock(seqlock_t *sl)
 {
@@ -468,30 +542,74 @@ static inline void write_seqlock(seqlock_t *sl)
 	__write_seqcount_begin(&sl->seqcount);
 }
 
+/**
+ * write_sequnlock() - end a seqlock_t write side critical section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * write_sequnlock closes the (serialized and non-preemptible) write
+ * side critical section of the given seqlock_t.
+ */
 static inline void write_sequnlock(seqlock_t *sl)
 {
 	write_seqcount_end(&sl->seqcount);
 	spin_unlock(&sl->lock);
 }
 
+/**
+ * write_seqlock_bh() - start a softirqs-disabled seqlock_t write  section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * write_seqlock_bh is a write_seqlock() variant that disables softirqs
+ * before opening the serialized seqlock_t write side critical section.
+ * Use it only if the read side section, or other writers, can be
+ * invoked from a softirq context.
+ *
+ * The opened write section must be closed with write_sequnlock_bh().
+ */
 static inline void write_seqlock_bh(seqlock_t *sl)
 {
 	spin_lock_bh(&sl->lock);
 	__write_seqcount_begin(&sl->seqcount);
 }
 
+/**
+ * write_sequnlock_bh() - end a softirqs-disabled seqlock_t write section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * write_sequnlock_bh closes the serialized, softirqs-disabled,
+ * seqlock_t write side critical section. It enables softirqs if they
+ * were already enabled before calling the paired write_seqlock_bh().
+ */
 static inline void write_sequnlock_bh(seqlock_t *sl)
 {
 	write_seqcount_end(&sl->seqcount);
 	spin_unlock_bh(&sl->lock);
 }
 
+/**
+ * write_seqlock_irq() - start a non-interruptible seqlock_t write side section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * write_seqlock_irq is a write_seqlock() variant where hardirqs are
+ * disabled before opening the serialized and non-preemptible seqlock_t
+ * write side critical section.
+ */
 static inline void write_seqlock_irq(seqlock_t *sl)
 {
 	spin_lock_irq(&sl->lock);
 	__write_seqcount_begin(&sl->seqcount);
 }
 
+/**
+ * write_sequnlock_irq() - end a non-interruptible seqlock_t write side section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * write_sequnlock_irq closes the serialized and non-interruptible write
+ * side critical section of the given seqlock_t. It enables local
+ * interrupts afterwards.
+ *
+ * The write critical section must've been opened with write_seqlock_irq().
+ */
 static inline void write_sequnlock_irq(seqlock_t *sl)
 {
 	write_seqcount_end(&sl->seqcount);
@@ -507,9 +625,36 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 	return flags;
 }
 
+/**
+ * write_seqlock_irqsave() - start a non-interruptible seqlock_t write section
+ * @lock:  Pointer to &typedef seqlock_t
+ * @flags: Stack-allocated storage for saving caller's local interrupt
+ *         state, to be passed to write_sequnlock_irqrestore().
+ *
+ * write_seqlock_irqsave is a write_seqlock() variant where the caller's
+ * local interrupts state is saved, then local interrupts are disabled,
+ * before opening the serialized and non-preemptible seqlock_t write
+ * side critical section.
+ *
+ * Use this only if the read side section can be invoked from a hardirq
+ * context.
+ *
+ * The opened write section must be closed with write_sequnlock_irqrestore().
+ */
 #define write_seqlock_irqsave(lock, flags)				\
 	do { flags = __write_seqlock_irqsave(lock); } while (0)
 
+/**
+ * write_sequnlock_irqrestore() - end non-interruptible seqlock_t write section
+ * @sl:    Pointer to &typedef seqlock_t
+ * @flags: Caller's saved interrupt state, from write_seqlock_irqsave()
+ *
+ * write_sequnlock_irq closes the serialized and non-interruptible write
+ * side critical section of the given seqlock_t. It then restores the
+ * caller's local interrupts saved state.
+ *
+ * The write section must've been opened with write_seqlock_irqsave().
+ */
 static inline void
 write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
 {
@@ -517,30 +662,61 @@ write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
 	spin_unlock_irqrestore(&sl->lock, flags);
 }
 
-/*
- * A locking reader exclusively locks out other writers and locking readers,
- * but doesn't update the sequence number. Acts like a normal spin_lock/unlock.
- * Don't need preempt_disable() because that is in the spin_lock already.
+/**
+ * read_seqlock_excl() - begin a seqlock_t locking reader critical section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_seqlock_excl opens a locking reader critical section for the
+ * given seqlock_t. A locking reader exclusively locks out other writers
+ * and other locking readers, but doesn't update the sequence number.
+ *
+ * Locking readers act like a normal spin_lock()/spin_unlock().
+ *
+ * The opened read side section must be closed with read_sequnlock_excl().
  */
 static inline void read_seqlock_excl(seqlock_t *sl)
 {
 	spin_lock(&sl->lock);
 }
 
+/**
+ * read_sequnlock_excl() - end a seqlock_t locking reader critical section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_sequnlock_excl closes a locking reader critical section.  The
+ * read section must've been opened with read_seqlock_excl().
+ */
 static inline void read_sequnlock_excl(seqlock_t *sl)
 {
 	spin_unlock(&sl->lock);
 }
 
 /**
- * read_seqbegin_or_lock - begin a sequence number check or locking block
- * @lock: sequence lock
- * @seq : sequence number to be checked
+ * read_seqbegin_or_lock() - begin a seqlock_t lockless or locking reader
+ * @lock: Pointer to &typedef seqlock_t
+ * @seq : Marker and return parameter. If the passed value is even, the
+ * reader will become a *lockless* seqlock_t sequence counter reader as
+ * in read_seqbegin(). If the passed value is odd, the reader will
+ * become a fully locking reader, as in read_seqlock_excl().  In the
+ * first call to read_seqbegin_or_lock(), the caller **must** initialize
+ * and pass an even value in @seq so a lockless read is optimistically
+ * tried first.
  *
- * First try it once optimistically without taking the lock. If that fails,
- * take the lock. The sequence number is also used as a marker for deciding
- * whether to be a reader (even) or writer (odd).
- * N.B. seq must be initialized to an even number to begin with.
+ * read_seqbegin_or_lock optimistically tries a lockless seqlock_t
+ * sequence counter read first. If an odd counter is found, the lockless
+ * read trial has failed, and the reader transforms to a full seqlock_t
+ * locking reader as in read_seqlock_excl().  This is typically used to
+ * avoid lockless seqlock_t readers starvation (too much retry loops) in
+ * the case of a sharp spike in write activity.
+ *
+ * The opened read section must be closed with done_seqretry().  Check
+ * Documentation/locking/seqlock.rst for template example code.
+ *
+ * Return: The read critical section status is returned through @seq,
+ * which is overloaded as a return parameter. This value must be passed
+ * to need_seqretry() to check the validity of the tried seqlock_t read
+ * section. If the read section must be retried, the returned value must
+ * also be passed to the next iteration of read_seqbegin_or_lock().
  */
 static inline void read_seqbegin_or_lock(seqlock_t *lock, int *seq)
 {
@@ -550,32 +726,98 @@ static inline void read_seqbegin_or_lock(seqlock_t *lock, int *seq)
 		read_seqlock_excl(lock);
 }
 
+/**
+ * need_seqretry() - validate seqlock_t "locking or lockless" reader section
+ * @lock: Pointer to &typedef seqlock_t
+ * @seq: count, from read_seqbegin_or_lock()
+ *
+ * need_seqretry checks if the seqlock_t read-side critical section
+ * started with read_seqbegin_or_lock() is valid. If it was not, the
+ * caller must retry the read-side section.
+ *
+ * Return: 1 if a retry is required, 0 otherwise
+ */
 static inline int need_seqretry(seqlock_t *lock, int seq)
 {
 	return !(seq & 1) && read_seqretry(lock, seq);
 }
 
+/**
+ * done_seqretry() - end seqlock_t "locking or lockless" reader section
+ * @lock: Pointer to &typedef seqlock_t
+ * @seq: count, from read_seqbegin_or_lock()
+ *
+ * done_seqretry finishes the seqlock_t read side critical section
+ * started by read_seqbegin_or_lock(). Before finishing the critical
+ * section, the validity of the read side section must've been already
+ * verified with need_seqretry().
+ */
 static inline void done_seqretry(seqlock_t *lock, int seq)
 {
 	if (seq & 1)
 		read_sequnlock_excl(lock);
 }
 
+/**
+ * read_seqlock_excl_bh() - start a locking reader seqlock_t section
+ *			    with softirqs disabled
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_seqlock_excl_bh is a variant of read_seqlock_excl() that saves
+ * softirqs state, then disables softirqs, before starting the locking
+ * reader read side section. Only use this variant if the seqlock_t
+ * write side section, *or other read sections*, can be invoked from a
+ * softirq context
+ *
+ * The opened section must be closed with read_sequnlock_excl_bh().
+ */
 static inline void read_seqlock_excl_bh(seqlock_t *sl)
 {
 	spin_lock_bh(&sl->lock);
 }
 
+/**
+ * read_sequnlock_excl_bh() - stop a seqlock_t softirq-disabled locking
+ *			      reader section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_sequnlock_excl_bh ends the softirq-disabled seqlock_t locking
+ * reader read side section. It restores the softirqs state saved by
+ * read_seqlock_excl_bh() afterwards.
+ */
 static inline void read_sequnlock_excl_bh(seqlock_t *sl)
 {
 	spin_unlock_bh(&sl->lock);
 }
 
+/**
+ * read_seqlock_excl_irq() - start a non-interruptible seqlock_t locking
+ *			     reader section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_seqlock_excl_irq is a variant of read_seqlock_excl() that
+ * disables interrupts before starting the locking reader read side
+ * section. Only use this variant if the seqlock_t write side section,
+ * *or other read sections*, can be invoked from a hardirq context
+ *
+ * The opened read section must be closed with read_sequnlock_excl_irq().
+ */
 static inline void read_seqlock_excl_irq(seqlock_t *sl)
 {
 	spin_lock_irq(&sl->lock);
 }
 
+/**
+ * read_sequnlock_excl_irq() - end an interrupts-disabled seqlock_t
+ *                             locking reader section
+ * @sl: Pointer to &typedef seqlock_t
+ *
+ * read_sequnlock_excl_irq ends the interrupts-disabled seqlock_t
+ * locking reader read side critical section. It enables local
+ * interrupts afterwards.
+ *
+ * The read section must've been started with read_seqlock_excl_irq().
+ */
 static inline void read_sequnlock_excl_irq(seqlock_t *sl)
 {
 	spin_unlock_irq(&sl->lock);
@@ -589,15 +831,68 @@ static inline unsigned long __read_seqlock_excl_irqsave(seqlock_t *sl)
 	return flags;
 }
 
+/**
+ * read_seqlock_excl_irqsave() - start a non-interruptible seqlock_t
+ *				 locking reader section
+ * @lock: Pointer to &typedef seqlock_t
+ * @flags: Stack-allocated storage for saving caller's local interrupt
+ *         state, to be passed to read_sequnlock_excl_irqrestore().
+ *
+ * read_seqlock_excl_irqsave is a read_seqlock_excl() variant which
+ * saves the caller's local interrupts state, then disables local
+ * interrupts, before opening the seqlock_t locking reader critical
+ * section.
+ *
+ * Use this only if the seqlock_t write side critical section, or other
+ * read side sections, can be invoked from a hardirq context.
+ *
+ * The opened locking reader critical section must be closed with
+ * read_sequnlock_excl_irqrestore().
+ */
 #define read_seqlock_excl_irqsave(lock, flags)				\
 	do { flags = __read_seqlock_excl_irqsave(lock); } while (0)
 
+/**
+ * read_sequnlock_excl_irqrestore() - end non-interruptible seqlock_t
+ *				      locking reader section
+ * @sl: Pointer to &typedef seqlock_t
+ * @flags: Caller's saved interrupt state, from
+ *	   read_seqlock_excl_irqsave()
+ *
+ * read_sequnlock_excl_irqrestore closes the non-interruptible seqlock_t
+ * locking reader section. It then restores the caller's local
+ * interrupts saved state.
+ *
+ * The read section must've been opened with read_seqlock_excl_irqsave().
+ */
 static inline void
 read_sequnlock_excl_irqrestore(seqlock_t *sl, unsigned long flags)
 {
 	spin_unlock_irqrestore(&sl->lock, flags);
 }
 
+/**
+ * read_seqbegin_or_lock_irqsave() - begin a seqlock_t lockless reader, or
+ *                                   a non-interruptible locking reader
+ * @lock: Pointer to &typedef seqlock_t
+ * @seq: Marker and return parameter. Check read_seqbegin_or_lock().
+ *
+ * read_seqbegin_or_lock_irqsave is a variant of read_seqbegin_or_lock()
+ * which saves the local interrupts state, then disables local
+ * interrupts, before opening a seqlock_t *locking reader* critical
+ * section.
+ *
+ * The opened section must be closed with done_seqretry_irqrestore().
+ *
+ * Return:
+ *
+ *   1. The saved local interrupts state in case of a locking reader, to
+ *      be passed to done_seqretry_irqrestore().
+ *
+ *   2. The read critical section status, returned through @seq which is
+ *      overloaded as a return parameter. Check read_seqbegin_or_lock()
+ *      for more info.
+ */
 static inline unsigned long
 read_seqbegin_or_lock_irqsave(seqlock_t *lock, int *seq)
 {
@@ -611,6 +906,19 @@ read_seqbegin_or_lock_irqsave(seqlock_t *lock, int *seq)
 	return flags;
 }
 
+/**
+ * done_seqretry_irqrestore() - end a seqlock_t lockless reader, or a
+ *				non-interruptible locking reader section
+ * @lock:  Pointer to &typedef seqlock_t
+ * @seq:   Count, from read_seqbegin_or_lock_irqsave()
+ * @flags: Caller's saved local interrupt state in case of a locking
+ *	   reader, also from read_seqbegin_or_lock_irqsave()
+ *
+ * done_seqretry_irqrestore is a variant of done_seqretry() which
+ * restores the callers saved local interrupts state in case of a
+ * locking reader. Check done_seqretry() for more information. The read
+ * section must've been opened with read_seqbegin_or_lock_irqsave().
+ */
 static inline void
 done_seqretry_irqrestore(seqlock_t *lock, int seq, unsigned long flags)
 {
-- 
2.20.1


^ permalink raw reply related	[relevance 42%]

* [PATCH v1 15/25] netfilter: conntrack: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (13 preceding siblings ...)
  2020-05-19 21:45 93% ` [PATCH v1 14/25] sched: tasks: Use sequence counter with associated spinlock Ahmed S. Darwish
@ 2020-05-19 21:45 92% ` Ahmed S. Darwish
  2020-05-19 21:45 95% ` [PATCH v1 16/25] netfilter: nft_set_rbtree: Use sequence counter with associated rwlock Ahmed S. Darwish
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Pablo Neira Ayuso,
	Jozsef Kadlecsik, Florian Westphal, David S. Miller,
	Jakub Kicinski, netfilter-devel, coreteam, netdev

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/net/netfilter/nf_conntrack.h | 2 +-
 net/netfilter/nf_conntrack_core.c    | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 9f551f3b69c6..333fd54aec30 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -286,7 +286,7 @@ int nf_conntrack_hash_resize(unsigned int hashsize);
 
 extern struct hlist_nulls_head *nf_conntrack_hash;
 extern unsigned int nf_conntrack_htable_size;
-extern seqcount_t nf_conntrack_generation;
+extern seqcount_spinlock_t nf_conntrack_generation;
 extern unsigned int nf_conntrack_max;
 
 /* must be called with rcu read lock held */
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index c4582eb71766..48a839377da2 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -180,7 +180,7 @@ EXPORT_SYMBOL_GPL(nf_conntrack_htable_size);
 
 unsigned int nf_conntrack_max __read_mostly;
 EXPORT_SYMBOL_GPL(nf_conntrack_max);
-seqcount_t nf_conntrack_generation __read_mostly;
+seqcount_spinlock_t nf_conntrack_generation __read_mostly;
 static unsigned int nf_conntrack_hash_rnd __read_mostly;
 
 static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple,
@@ -2512,7 +2512,8 @@ int nf_conntrack_init_start(void)
 	/* struct nf_ct_ext uses u8 to store offsets/size */
 	BUILD_BUG_ON(total_extension_size() > 255u);
 
-	seqcount_init(&nf_conntrack_generation);
+	seqcount_spinlock_init(&nf_conntrack_generation,
+			       &nf_conntrack_locks_all_lock);
 
 	for (i = 0; i < CONNTRACK_LOCKS; i++)
 		spin_lock_init(&nf_conntrack_locks[i]);
-- 
2.20.1


^ permalink raw reply related	[relevance 92%]

* [PATCH v1 17/25] xfrm: policy: Use sequence counters with associated lock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (15 preceding siblings ...)
  2020-05-19 21:45 95% ` [PATCH v1 16/25] netfilter: nft_set_rbtree: Use sequence counter with associated rwlock Ahmed S. Darwish
@ 2020-05-19 21:45 91% ` Ahmed S. Darwish
  2020-05-19 21:45 84% ` [PATCH v1 18/25] timekeeping: Use sequence counter with associated raw spinlock Ahmed S. Darwish
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Steffen Klassert,
	Herbert Xu, David S. Miller, Jakub Kicinski, netdev

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.

A plain seqcount_t does not contain the information of which lock must
be held when entering a write side critical section.

Use the new seqcount_spinlock_t and seqcount_mutex_t data types instead,
which allow to associate a lock with the sequence counter. This enables
lockdep to verify that the lock used for writer serialization is held
when the write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 net/xfrm/xfrm_policy.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 297b2fdb3c29..aae78a7aecd7 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -122,7 +122,7 @@ struct xfrm_pol_inexact_bin {
 	/* list containing '*:*' policies */
 	struct hlist_head hhead;
 
-	seqcount_t count;
+	seqcount_spinlock_t count;
 	/* tree sorted by daddr/prefix */
 	struct rb_root root_d;
 
@@ -155,7 +155,7 @@ static struct xfrm_policy_afinfo const __rcu *xfrm_policy_afinfo[AF_INET6 + 1]
 						__read_mostly;
 
 static struct kmem_cache *xfrm_dst_cache __ro_after_init;
-static __read_mostly seqcount_t xfrm_policy_hash_generation;
+static __read_mostly seqcount_mutex_t xfrm_policy_hash_generation;
 
 static struct rhashtable xfrm_policy_inexact_table;
 static const struct rhashtable_params xfrm_pol_inexact_params;
@@ -719,7 +719,7 @@ xfrm_policy_inexact_alloc_bin(const struct xfrm_policy *pol, u8 dir)
 	INIT_HLIST_HEAD(&bin->hhead);
 	bin->root_d = RB_ROOT;
 	bin->root_s = RB_ROOT;
-	seqcount_init(&bin->count);
+	seqcount_spinlock_init(&bin->count, &net->xfrm.xfrm_policy_lock);
 
 	prev = rhashtable_lookup_get_insert_key(&xfrm_policy_inexact_table,
 						&bin->k, &bin->head,
@@ -1911,7 +1911,7 @@ static int xfrm_policy_match(const struct xfrm_policy *pol,
 
 static struct xfrm_pol_inexact_node *
 xfrm_policy_lookup_inexact_addr(const struct rb_root *r,
-				seqcount_t *count,
+				seqcount_spinlock_t *count,
 				const xfrm_address_t *addr, u16 family)
 {
 	const struct rb_node *parent;
@@ -4158,7 +4158,7 @@ void __init xfrm_init(void)
 {
 	register_pernet_subsys(&xfrm_net_ops);
 	xfrm_dev_init();
-	seqcount_init(&xfrm_policy_hash_generation);
+	seqcount_mutex_init(&xfrm_policy_hash_generation, &hash_resize_mutex);
 	xfrm_input_init();
 
 #ifdef CONFIG_INET_ESPINTCP
-- 
2.20.1


^ permalink raw reply related	[relevance 91%]

* [PATCH v1 16/25] netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (14 preceding siblings ...)
  2020-05-19 21:45 92% ` [PATCH v1 15/25] netfilter: conntrack: " Ahmed S. Darwish
@ 2020-05-19 21:45 95% ` Ahmed S. Darwish
  2020-05-19 21:45 91% ` [PATCH v1 17/25] xfrm: policy: Use sequence counters with associated lock Ahmed S. Darwish
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Pablo Neira Ayuso,
	Jozsef Kadlecsik, Florian Westphal, David S. Miller,
	Jakub Kicinski, netfilter-devel, coreteam, netdev

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_rwlock_t data type, which allows to associate a
rwlock with the sequence counter. This enables lockdep to verify that
the rwlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 net/netfilter/nft_set_rbtree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 3ffef454d469..f50d986d43c5 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -18,7 +18,7 @@
 struct nft_rbtree {
 	struct rb_root		root;
 	rwlock_t		lock;
-	seqcount_t		count;
+	seqcount_rwlock_t	count;
 	struct delayed_work	gc_work;
 };
 
@@ -505,7 +505,7 @@ static int nft_rbtree_init(const struct nft_set *set,
 	struct nft_rbtree *priv = nft_set_priv(set);
 
 	rwlock_init(&priv->lock);
-	seqcount_init(&priv->count);
+	seqcount_rwlock_init(&priv->count, &priv->lock);
 	priv->root = RB_ROOT;
 
 	INIT_DEFERRABLE_WORK(&priv->gc_work, nft_rbtree_gc);
-- 
2.20.1


^ permalink raw reply related	[relevance 95%]

* [PATCH v1 18/25] timekeeping: Use sequence counter with associated raw spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (16 preceding siblings ...)
  2020-05-19 21:45 91% ` [PATCH v1 17/25] xfrm: policy: Use sequence counters with associated lock Ahmed S. Darwish
@ 2020-05-19 21:45 84% ` Ahmed S. Darwish
  2020-05-19 21:45 89% ` [PATCH v1 19/25] vfs: Use sequence counter with associated spinlock Ahmed S. Darwish
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, John Stultz,
	Stephen Boyd

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_raw_spinlock_t data type, which allows to associate
a raw spinlock with the sequence counter. This enables lockdep to verify
that the raw spinlock used for writer serialization is held when the
write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 kernel/time/timekeeping.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9ebaab13339d..24e91a1e2acd 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -39,18 +39,19 @@ enum timekeeping_adv_mode {
 	TK_ADV_FREQ
 };
 
+static DEFINE_RAW_SPINLOCK(timekeeper_lock);
+
 /*
  * The most important data for readout fits into a single 64 byte
  * cache line.
  */
 static struct {
-	seqcount_t		seq;
+	seqcount_raw_spinlock_t	seq;
 	struct timekeeper	timekeeper;
 } tk_core ____cacheline_aligned = {
-	.seq = SEQCNT_ZERO(tk_core.seq),
+	.seq = SEQCNT_RAW_SPINLOCK_ZERO(tk_core.seq, &timekeeper_lock),
 };
 
-static DEFINE_RAW_SPINLOCK(timekeeper_lock);
 static struct timekeeper shadow_timekeeper;
 
 /**
@@ -63,7 +64,7 @@ static struct timekeeper shadow_timekeeper;
  * See @update_fast_timekeeper() below.
  */
 struct tk_fast {
-	seqcount_t		seq;
+	seqcount_raw_spinlock_t	seq;
 	struct tk_read_base	base[2];
 };
 
@@ -80,11 +81,13 @@ static struct clocksource dummy_clock = {
 };
 
 static struct tk_fast tk_fast_mono ____cacheline_aligned = {
+	.seq     = SEQCNT_RAW_SPINLOCK_ZERO(tk_fast_mono.seq, &timekeeper_lock),
 	.base[0] = { .clock = &dummy_clock, },
 	.base[1] = { .clock = &dummy_clock, },
 };
 
 static struct tk_fast tk_fast_raw  ____cacheline_aligned = {
+	.seq     = SEQCNT_RAW_SPINLOCK_ZERO(tk_fast_raw.seq, &timekeeper_lock),
 	.base[0] = { .clock = &dummy_clock, },
 	.base[1] = { .clock = &dummy_clock, },
 };
@@ -157,7 +160,7 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
  * tk_clock_read - atomic clocksource read() helper
  *
  * This helper is necessary to use in the read paths because, while the
- * seqlock ensures we don't return a bad value while structures are updated,
+ * seqcount ensures we don't return a bad value while structures are updated,
  * it doesn't protect from potential crashes. There is the possibility that
  * the tkr's clocksource may change between the read reference, and the
  * clock reference passed to the read function.  This can cause crashes if
@@ -222,10 +225,10 @@ static inline u64 timekeeping_get_delta(const struct tk_read_base *tkr)
 	unsigned int seq;
 
 	/*
-	 * Since we're called holding a seqlock, the data may shift
+	 * Since we're called holding a seqcount, the data may shift
 	 * under us while we're doing the calculation. This can cause
 	 * false positives, since we'd note a problem but throw the
-	 * results away. So nest another seqlock here to atomically
+	 * results away. So nest another seqcount here to atomically
 	 * grab the points we are checking with.
 	 */
 	do {
@@ -486,7 +489,7 @@ EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
  *
  * To keep it NMI safe since we're accessing from tracing, we're not using a
  * separate timekeeper with updates to monotonic clock and boot offset
- * protected with seqlocks. This has the following minor side effects:
+ * protected with seqcounts. This has the following minor side effects:
  *
  * (1) Its possible that a timestamp be taken after the boot offset is updated
  * but before the timekeeper is updated. If this happens, the new boot offset
-- 
2.20.1


^ permalink raw reply related	[relevance 84%]

* [PATCH v1 23/25] userfaultfd: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (21 preceding siblings ...)
  2020-05-19 21:45 95% ` [PATCH v1 22/25] NFSv4: " Ahmed S. Darwish
@ 2020-05-19 21:45 96% ` Ahmed S. Darwish
  2020-05-19 21:45 95% ` [PATCH v1 24/25] kvm/eventfd: " Ahmed S. Darwish
  2020-05-19 21:45 94% ` [PATCH v1 25/25] hrtimer: Use sequence counter with associated raw spinlock Ahmed S. Darwish
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Alexander Viro,
	linux-fsdevel

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 fs/userfaultfd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index e39fdec8a0b0..dd3aab31c50f 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -61,7 +61,7 @@ struct userfaultfd_ctx {
 	/* waitqueue head for events */
 	wait_queue_head_t event_wqh;
 	/* a refile sequence protected by fault_pending_wqh lock */
-	struct seqcount refile_seq;
+	seqcount_spinlock_t refile_seq;
 	/* pseudo fd refcounting */
 	refcount_t refcount;
 	/* userfaultfd syscall flags */
@@ -1998,7 +1998,7 @@ static void init_once_userfaultfd_ctx(void *mem)
 	init_waitqueue_head(&ctx->fault_wqh);
 	init_waitqueue_head(&ctx->event_wqh);
 	init_waitqueue_head(&ctx->fd_wqh);
-	seqcount_init(&ctx->refile_seq);
+	seqcount_spinlock_init(&ctx->refile_seq, &ctx->fault_pending_wqh.lock);
 }
 
 SYSCALL_DEFINE1(userfaultfd, int, flags)
-- 
2.20.1


^ permalink raw reply related	[relevance 96%]

* [PATCH v1 22/25] NFSv4: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (20 preceding siblings ...)
  2020-05-19 21:45 95% ` [PATCH v1 21/25] iocost: " Ahmed S. Darwish
@ 2020-05-19 21:45 95% ` Ahmed S. Darwish
  2020-05-19 21:45 96% ` [PATCH v1 23/25] userfaultfd: " Ahmed S. Darwish
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Trond Myklebust,
	Anna Schumaker, linux-nfs

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 fs/nfs/nfs4_fs.h   | 2 +-
 fs/nfs/nfs4state.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 2b7f6dcd2eb8..210e590e1f71 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -117,7 +117,7 @@ struct nfs4_state_owner {
 	unsigned long	     so_flags;
 	struct list_head     so_states;
 	struct nfs_seqid_counter so_seqid;
-	seqcount_t	     so_reclaim_seqcount;
+	seqcount_spinlock_t  so_reclaim_seqcount;
 	struct mutex	     so_delegreturn_mutex;
 };
 
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index ac93715c05a4..9b2bad35ad24 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -509,7 +509,7 @@ nfs4_alloc_state_owner(struct nfs_server *server,
 	nfs4_init_seqid_counter(&sp->so_seqid);
 	atomic_set(&sp->so_count, 1);
 	INIT_LIST_HEAD(&sp->so_lru);
-	seqcount_init(&sp->so_reclaim_seqcount);
+	seqcount_spinlock_init(&sp->so_reclaim_seqcount, &sp->so_lock);
 	mutex_init(&sp->so_delegreturn_mutex);
 	return sp;
 }
-- 
2.20.1


^ permalink raw reply related	[relevance 95%]

* [PATCH v1 24/25] kvm/eventfd: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (22 preceding siblings ...)
  2020-05-19 21:45 96% ` [PATCH v1 23/25] userfaultfd: " Ahmed S. Darwish
@ 2020-05-19 21:45 95% ` Ahmed S. Darwish
  2020-05-19 21:45 94% ` [PATCH v1 25/25] hrtimer: Use sequence counter with associated raw spinlock Ahmed S. Darwish
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Paolo Bonzini, kvm

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/kvm_irqfd.h | 2 +-
 virt/kvm/eventfd.c        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
index dc1da020305b..dac047abdba7 100644
--- a/include/linux/kvm_irqfd.h
+++ b/include/linux/kvm_irqfd.h
@@ -42,7 +42,7 @@ struct kvm_kernel_irqfd {
 	wait_queue_entry_t wait;
 	/* Update side is protected by irqfds.lock */
 	struct kvm_kernel_irq_routing_entry irq_entry;
-	seqcount_t irq_entry_sc;
+	seqcount_spinlock_t irq_entry_sc;
 	/* Used for level IRQ fast-path */
 	int gsi;
 	struct work_struct inject;
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 67b6fc153e9c..8694a2920ea9 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -303,7 +303,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 	INIT_LIST_HEAD(&irqfd->list);
 	INIT_WORK(&irqfd->inject, irqfd_inject);
 	INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
-	seqcount_init(&irqfd->irq_entry_sc);
+	seqcount_spinlock_init(&irqfd->irq_entry_sc, &kvm->irqfds.lock);
 
 	f = fdget(args->fd);
 	if (!f.file) {
-- 
2.20.1


^ permalink raw reply related	[relevance 95%]

* [PATCH v1 25/25] hrtimer: Use sequence counter with associated raw spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (23 preceding siblings ...)
  2020-05-19 21:45 95% ` [PATCH v1 24/25] kvm/eventfd: " Ahmed S. Darwish
@ 2020-05-19 21:45 94% ` Ahmed S. Darwish
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_raw_spinlock_t data type, which allows to associate
a raw spinlock with the sequence counter. This enables lockdep to verify
that the raw spinlock used for writer serialization is held when the
write side critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/hrtimer.h |  2 +-
 kernel/time/hrtimer.c   | 13 ++++++++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 15c8ac313678..25993b86ac5c 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -159,7 +159,7 @@ struct hrtimer_clock_base {
 	struct hrtimer_cpu_base	*cpu_base;
 	unsigned int		index;
 	clockid_t		clockid;
-	seqcount_t		seq;
+	seqcount_raw_spinlock_t	seq;
 	struct hrtimer		*running;
 	struct timerqueue_head	active;
 	ktime_t			(*get_time)(void);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index d89da1c7e005..c4038511d5c9 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -135,7 +135,11 @@ static const int hrtimer_clock_to_base_table[MAX_CLOCKS] = {
  * timer->base->cpu_base
  */
 static struct hrtimer_cpu_base migration_cpu_base = {
-	.clock_base = { { .cpu_base = &migration_cpu_base, }, },
+	.clock_base = { {
+		.cpu_base = &migration_cpu_base,
+		.seq      = SEQCNT_RAW_SPINLOCK_ZERO(migration_cpu_base.seq,
+						     &migration_cpu_base.lock),
+	}, },
 };
 
 #define migration_base	migration_cpu_base.clock_base[0]
@@ -1998,8 +2002,11 @@ int hrtimers_prepare_cpu(unsigned int cpu)
 	int i;
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
-		cpu_base->clock_base[i].cpu_base = cpu_base;
-		timerqueue_init_head(&cpu_base->clock_base[i].active);
+		struct hrtimer_clock_base *clock_b = &cpu_base->clock_base[i];
+
+		clock_b->cpu_base = cpu_base;
+		seqcount_raw_spinlock_init(&clock_b->seq, &cpu_base->lock);
+		timerqueue_init_head(&clock_b->active);
 	}
 
 	cpu_base->cpu = cpu;
-- 
2.20.1


^ permalink raw reply related	[relevance 94%]

* [PATCH v1 21/25] iocost: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (19 preceding siblings ...)
  2020-05-19 21:45 95% ` [PATCH v1 20/25] raid5: " Ahmed S. Darwish
@ 2020-05-19 21:45 95% ` Ahmed S. Darwish
  2020-05-19 21:45 95% ` [PATCH v1 22/25] NFSv4: " Ahmed S. Darwish
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Jens Axboe, linux-block

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 block/blk-iocost.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 7c1fe605d0d6..8029a9e8fa55 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -405,7 +405,7 @@ struct ioc {
 	enum ioc_running		running;
 	atomic64_t			vtime_rate;
 
-	seqcount_t			period_seqcount;
+	seqcount_spinlock_t		period_seqcount;
 	u32				period_at;	/* wallclock starttime */
 	u64				period_at_vtime; /* vtime starttime */
 
@@ -872,7 +872,6 @@ static void ioc_now(struct ioc *ioc, struct ioc_now *now)
 
 static void ioc_start_period(struct ioc *ioc, struct ioc_now *now)
 {
-	lockdep_assert_held(&ioc->lock);
 	WARN_ON_ONCE(ioc->running != IOC_RUNNING);
 
 	write_seqcount_begin(&ioc->period_seqcount);
@@ -1958,7 +1957,7 @@ static int blk_iocost_init(struct request_queue *q)
 
 	ioc->running = IOC_IDLE;
 	atomic64_set(&ioc->vtime_rate, VTIME_PER_USEC);
-	seqcount_init(&ioc->period_seqcount);
+	seqcount_spinlock_init(&ioc->period_seqcount, &ioc->lock);
 	ioc->period_at = ktime_to_us(ktime_get());
 	atomic64_set(&ioc->cur_period, 0);
 	atomic_set(&ioc->hweight_gen, 0);
-- 
2.20.1


^ permalink raw reply related	[relevance 95%]

* [PATCH v1 19/25] vfs: Use sequence counter with associated spinlock
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (17 preceding siblings ...)
  2020-05-19 21:45 84% ` [PATCH v1 18/25] timekeeping: Use sequence counter with associated raw spinlock Ahmed S. Darwish
@ 2020-05-19 21:45 89% ` Ahmed S. Darwish
  2020-05-19 21:45 95% ` [PATCH v1 20/25] raid5: " Ahmed S. Darwish
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Alexander Viro,
	linux-fsdevel

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 fs/dcache.c               | 2 +-
 fs/fs_struct.c            | 4 ++--
 include/linux/dcache.h    | 2 +-
 include/linux/fs_struct.h | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index b280e07e162b..e5f365d8fd67 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1727,7 +1727,7 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	dentry->d_lockref.count = 1;
 	dentry->d_flags = 0;
 	spin_lock_init(&dentry->d_lock);
-	seqcount_init(&dentry->d_seq);
+	seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock);
 	dentry->d_inode = NULL;
 	dentry->d_parent = dentry;
 	dentry->d_sb = sb;
diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index ca639ed967b7..04b3f5b9c629 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -117,7 +117,7 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old)
 		fs->users = 1;
 		fs->in_exec = 0;
 		spin_lock_init(&fs->lock);
-		seqcount_init(&fs->seq);
+		seqcount_spinlock_init(&fs->seq, &fs->lock);
 		fs->umask = old->umask;
 
 		spin_lock(&old->lock);
@@ -163,6 +163,6 @@ EXPORT_SYMBOL(current_umask);
 struct fs_struct init_fs = {
 	.users		= 1,
 	.lock		= __SPIN_LOCK_UNLOCKED(init_fs.lock),
-	.seq		= SEQCNT_ZERO(init_fs.seq),
+	.seq		= SEQCNT_SPINLOCK_ZERO(init_fs.seq, &init_fs.lock),
 	.umask		= 0022,
 };
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index c1488cc84fd9..235563da356d 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -89,7 +89,7 @@ extern struct dentry_stat_t dentry_stat;
 struct dentry {
 	/* RCU lookup touched fields */
 	unsigned int d_flags;		/* protected by d_lock */
-	seqcount_t d_seq;		/* per dentry seqlock */
+	seqcount_spinlock_t d_seq;	/* per dentry seqlock */
 	struct hlist_bl_node d_hash;	/* lookup hash list */
 	struct dentry *d_parent;	/* parent directory */
 	struct qstr d_name;
diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index cf1015abfbf2..783b48dedb72 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -9,7 +9,7 @@
 struct fs_struct {
 	int users;
 	spinlock_t lock;
-	seqcount_t seq;
+	seqcount_spinlock_t seq;
 	int umask;
 	int in_exec;
 	struct path root, pwd;
-- 
2.20.1


^ permalink raw reply related	[relevance 89%]

* [PATCH v1 06/25] dma-buf: Remove custom seqcount lockdep class key
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (4 preceding siblings ...)
  2020-05-19 21:45 82% ` [PATCH v1 05/25] u64_stats: Document writer non-preemptibility requirement Ahmed S. Darwish
@ 2020-05-19 21:45 91% ` Ahmed S. Darwish
  2020-05-19 21:45 89% ` [PATCH v1 07/25] lockdep: Add preemption disabled assertion API Ahmed S. Darwish
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish, Sumit Semwal,
	David Airlie, Daniel Vetter, linux-media, dri-devel

Commit 3c3b177a9369 ("reservation: add support for read-only access
using rcu") introduced a sequence counter to manage updates to
reservations. Back then, the reservation object initializer
reservation_object_init() was always inlined.

Having the sequence counter initialization inlined meant that each of
the call sites would have a different lockdep class key, which would've
broken lockdep's deadlock detection. The aforementioned commit thus
introduced, and exported, a custom seqcount lockdep class key and name.

The commit 8735f16803f00 ("dma-buf: cleanup reservation_object_init...")
transformed the reservation object initializer to a normal non-inlined C
function. seqcount_init(), which automatically defines the seqcount
lockdep class key and must be called non-inlined, can now be safely used.

Remove the seqcount custom lockdep class key, name, and export. Use
seqcount_init() inside the dma reservation object initializer.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/dma-buf/dma-resv.c | 9 +--------
 include/linux/dma-resv.h   | 2 --
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..590ce7ad60a0 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -50,12 +50,6 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
-struct lock_class_key reservation_seqcount_class;
-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
 /**
  * dma_resv_list_alloc - allocate fence list
  * @shared_max: number of fences we need space for
@@ -134,9 +128,8 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
+	seqcount_init(&obj->seq);
 
-	__seqcount_init(&obj->seq, reservation_seqcount_string,
-			&reservation_seqcount_class);
 	RCU_INIT_POINTER(obj->fence, NULL);
 	RCU_INIT_POINTER(obj->fence_excl, NULL);
 }
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index ee50d10f052b..a6538ae7d93f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -46,8 +46,6 @@
 #include <linux/rcupdate.h>
 
 extern struct ww_class reservation_ww_class;
-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
 
 /**
  * struct dma_resv_list - a list of shared fences
-- 
2.20.1


^ permalink raw reply related	[relevance 91%]

* [PATCH v1 08/25] seqlock: lockdep assert non-preemptibility on seqcount_t write
  2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
                   ` (6 preceding siblings ...)
  2020-05-19 21:45 89% ` [PATCH v1 07/25] lockdep: Add preemption disabled assertion API Ahmed S. Darwish
@ 2020-05-19 21:45 91% ` Ahmed S. Darwish
  2020-05-19 21:45 58% ` [PATCH v1 09/25] Documentation: locking: Describe seqlock design and usage Ahmed S. Darwish
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, Ahmed S. Darwish

Preemption must be disabled before entering a sequence count write side
critical section.  Failing to do so, the seqcount read side can preempt
the write side section and spin for the entire scheduler tick.  If that
reader belongs to a real-time scheduling class, it can spin forever and
the kernel will livelock.

Assert through lockdep that preemption is disabled for seqcount writers.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
---
 include/linux/seqlock.h | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index 0491d963d47e..d35be7709403 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -369,14 +369,32 @@ static inline void raw_write_seqcount_latch(seqcount_t *s)
 
 /*
  * Sequence counter only version assumes that callers are using their
- * own mutexing.
+ * own locking and preemption is disabled.
  */
-static inline void write_seqcount_begin_nested(seqcount_t *s, int subclass)
+
+static inline void __write_seqcount_begin_nested(seqcount_t *s, int subclass)
 {
 	raw_write_seqcount_begin(s);
 	seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
 }
 
+static inline void write_seqcount_begin_nested(seqcount_t *s, int subclass)
+{
+	lockdep_assert_preemption_disabled();
+	__write_seqcount_begin_nested(s, subclass);
+}
+
+/*
+ * write_seqcount_begin() without lockdep non-preemptibility checks.
+ *
+ * Use for internal seqlock.h code where it's known that preemption
+ * is already disabled. For example, seqlock_t write functions.
+ */
+static inline void __write_seqcount_begin(seqcount_t *s)
+{
+	__write_seqcount_begin_nested(s, 0);
+}
+
 static inline void write_seqcount_begin(seqcount_t *s)
 {
 	write_seqcount_begin_nested(s, 0);
@@ -446,7 +464,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
 static inline void write_seqlock(seqlock_t *sl)
 {
 	spin_lock(&sl->lock);
-	write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_begin(&sl->seqcount);
 }
 
 static inline void write_sequnlock(seqlock_t *sl)
@@ -458,7 +476,7 @@ static inline void write_sequnlock(seqlock_t *sl)
 static inline void write_seqlock_bh(seqlock_t *sl)
 {
 	spin_lock_bh(&sl->lock);
-	write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_begin(&sl->seqcount);
 }
 
 static inline void write_sequnlock_bh(seqlock_t *sl)
@@ -470,7 +488,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
 static inline void write_seqlock_irq(seqlock_t *sl)
 {
 	spin_lock_irq(&sl->lock);
-	write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_begin(&sl->seqcount);
 }
 
 static inline void write_sequnlock_irq(seqlock_t *sl)
@@ -484,7 +502,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
 	unsigned long flags;
 
 	spin_lock_irqsave(&sl->lock, flags);
-	write_seqcount_begin(&sl->seqcount);
+	__write_seqcount_begin(&sl->seqcount);
 	return flags;
 }
 
-- 
2.20.1


^ permalink raw reply related	[relevance 91%]

* Re: [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount
  @ 2020-05-20  6:42 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-20  6:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Thomas Gleixner, Paul E. McKenney, Sebastian A. Siewior,
	Steven Rostedt, LKML, David S. Miller, Jakub Kicinski, netdev

Hello Eric,

On Tue, May 19, 2020 at 07:01:38PM -0700, Eric Dumazet wrote:
>
> On 5/19/20 2:45 PM, Ahmed S. Darwish wrote:
> > Sequence counters write paths are critical sections that must never be
> > preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed.
> >
> > Commit 5dbe7c178d3f ("net: fix kernel deadlock with interface rename and
> > netdev name retrieval.") handled a deadlock, observed with
> > CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was
> > infinitely spinning: it got scheduled after the seqcount write side
> > blocked inside its own critical section.
> >
> > To fix that deadlock, among other issues, the commit added a
> > cond_resched() inside the read side section. While this will get the
> > non-preemptible kernel eventually unstuck, the seqcount reader is fully
> > exhausting its slice just spinning -- until TIF_NEED_RESCHED is set.
> >
> > The fix is also still broken: if the seqcount reader belongs to a
> > real-time scheduling policy, it can spin forever and the kernel will
> > livelock.
> >
> > Disabling preemption over the seqcount write side critical section will
> > not work: inside it are a number of GFP_KERNEL allocations and mutex
> > locking through the drivers/base/ :: device_rename() call chain.
> >
> > From all the above, replace the seqcount with a rwsem.
> >
> > Fixes: 5dbe7c178d3f (net: fix kernel deadlock with interface rename and netdev name retrieval.)
> > Fixes: 30e6c9fa93cf (net: devnet_rename_seq should be a seqcount)
> > Fixes: c91f6df2db49 (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name)
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> > Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  net/core/dev.c | 30 ++++++++++++------------------
> >  1 file changed, 12 insertions(+), 18 deletions(-)
> >
>
> Seems fine to me, assuming rwsem prevent starvation of the writer.
>

Thanks for the review.

AFAIK, due to 5cfd92e12e13 ("locking/rwsem: Adaptive disabling of reader
optimistic spinning"), using a rwsem shouldn't lead to writer starvation
in the contended case.

--
Ahmed S. Darwish
Linutronix GmbH

^ permalink raw reply	[relevance 99%]

* Re: [PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex
  @ 2020-05-21  0:09 96%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-21  0:09 UTC (permalink / raw)
  To: Christian König
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, David (ChunMing) Zhou,
	amd-gfx, Paul E. McKenney, David Airlie, Sebastian A. Siewior,
	LKML, Steven Rostedt, Christian König, dri-devel,
	Daniel Vetter, Alex Deucher, Felix Kuehling, Thomas Gleixner,
	Sumit Semwal, linux-media

On Wed, May 20, 2020, Christian König wrote:
> Am 19.05.20 um 23:45 schrieb Ahmed S. Darwish:
> > A sequence counter write side critical section must be protected by some
> > form of locking to serialize writers. If the serialization primitive is
> > not disabling preemption implicitly, preemption has to be explicitly
> > disabled before entering the sequence counter write side critical
> > section.
> >
> > The dma-buf reservation subsystem uses plain sequence counters to manage
> > updates to reservations. Writer serialization is accomplished through a
> > wound/wait mutex.
> >
> > Acquiring a wound/wait mutex does not disable preemption, so this needs
> > to be done manually before and after the write side critical section.
> >
> > Use the newly-added seqcount_ww_mutex_t instead:
> >
> >    - It associates the ww_mutex with the sequence count, which enables
> >      lockdep to validate that the write side critical section is properly
> >      serialized.
> >
> >    - It removes the need to explicitly add preempt_disable/enable()
> >      around the write side critical section because the write_begin/end()
> >      functions for this new data type automatically do this.
> >
> > If lockdep is disabled this ww_mutex lock association is compiled out
> > and has neither storage size nor runtime overhead.
>
> Mhm, is the dma_resv object the only user of this new seqcount_ww_mutex
> variant ?
>
> If yes we are trying to get rid of this sequence counter for quite some
> time, so I would rather invest the additional time to finish this.
>

In this patch series, each extra "seqcount with associated lock" data
type costs us, exactly:

  - 1 typedef definition, seqcount_ww_mutex_t
  - 1 static initializer, SEQCNT_WW_MUTEX_ZERO()
  - 1 runtime initializer, seqcount_ww_mutex_init()

Definitions for the typedef and the 2 initializers above are
template-code one liners.

The logic which automatically disables preemption upon entering a
seqcount_ww_mutex_t write side critical section is also already shared
with seqcount_mutex_t and any future, preemptible, associated lock.

So, yes, dma-resv is the only user of seqcount_ww_mutex.

But even in that case, given the one liner template code nature of
seqcount_ww_mutex_t logic, it does not make sense to block the dma_resv
and amdgpu change until at some point in the future the sequence counter
is completely removed.

**If and when** the sequence counter gets removed, please just remove
the seqcount_ww_mutex_t data type with it. It will be extremely simple.

> Regards,
> Christian.
>

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH

^ permalink raw reply	[relevance 96%]

* Re: [PATCH v1 10/25] seqlock: Add RST directives to kernel-doc code samples and notes
  @ 2020-05-25  9:36 93%           ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-25  9:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Ingo Molnar, Will Deacon, Paul E. McKenney,
	Sebastian A. Siewior, Steven Rostedt, LKML, Jonathan Corbet,
	linux-doc

Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2020 at 08:26:44PM +0200, Thomas Gleixner wrote:
> > Peter Zijlstra <peterz@infradead.org> writes:
> > > On Fri, May 22, 2020 at 08:02:54PM +0200, Peter Zijlstra wrote:
> > >> On Tue, May 19, 2020 at 11:45:32PM +0200, Ahmed S. Darwish wrote:
> > >> > Mark all C code samples inside seqlock.h kernel-doc text with the RST
> > >> > 'code-block: c' directive. Sphinx won't properly format the example code
> > >> > and will produce noisy text indentation warnings otherwise.
> > >>
> > >> I so bloody hate RST.. and now it's infecting perfectly sane comments
> > >> and turning them into unreadable junk :-(
> > >
> > > The correct fix is, as always, to remove the kernel-doc marker.
> >
> > Get over it already.
>
> I will not let sensible code comments deteriorate to the benefit of some
> external piece of crap.
>
> As a programmer the primary interface to all this is a text editor, not
> a web broswer or a pdf file or whatever other bullshit.
>
> If comments are unreadable in your text editor, they're useless.

Wait.

Most of the patch in question is just substituting the code snippet's
leading white spaces to tabs. For illustration purposes, if we remove
these white space hunks from the diff, it becomes:

  --- a/include/linux/seqlock.h
  +++ b/include/linux/seqlock.h
  @@ -232,6 +232,8 @@ static inline void raw_write_seqcount_end(seqcount_t *s)
  + * .. code-block:: c
  ...
  + * .. code-block:: c
  ...
  - * NOTE: The non-requirement for atomic modifications does _NOT_ include
  - *       the publishing of new entries in the case where data is a dynamic
  - *       data structure.
  + * .. attention::
  + *
  + *     The non-requirement for atomic modifications does _NOT_ include
  + *     the publishing of new entries in the case where data is a dynamic
  + *     data structure.
  ...

Are you trying to tell me that, good heavens, these directives are
really hurting your eyes so much?

Putting kernel-doc aside... That huge raw_write_seqcount_latch() comment
is actually *way more readable from any text editor* after applying this
patch. Go figure.

>>> The correct fix is, as always, to remove the kernel-doc marker.

Sorry, that's not the correct fix.

In the following patches, kernel-doc for the entire seqlock.h API is
added. Singling out raw_write_seqcount_latch() doesn't make any sense.

If you look at the top of this patch series, a lot of seqlock.h
seqcount_t call sites were badly broken. The 0day kernel test bot sent
me even more erroneous call sites due to the added lockdep checks. This
is an extra argument for the added documentation: the existing one is
horrible.

So, please, don't claim that the current situation is fine. It is not.

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH

^ permalink raw reply	[relevance 93%]

* Re: [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write
  @ 2020-05-25  9:56 99%     ` Ahmed S. Darwish
  0 siblings, 0 replies; 200+ results
From: Ahmed S. Darwish @ 2020-05-25  9:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Thomas Gleixner, Paul E. McKenney,
	Sebastian A. Siewior, Steven Rostedt, LKML, Jens Axboe,
	Phillip Susi, Vivek Goyal, linux-block

Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, May 19, 2020 at 11:45:26PM +0200, Ahmed S. Darwish wrote:
> > For optimized block readers not holding a mutex, the "number of sectors"
> > 64-bit value is protected from tearing on 32-bit architectures by a
> > sequence counter.
> >
> > Disable preemption before entering that sequence counter's write side
> > critical section. Otherwise, the read side can preempt the write side
> > section and spin for the entire scheduler tick. If the reader belongs to
> > a real-time scheduling class, it can spin forever and the kernel will
> > livelock.
> >
> > Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> > Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  block/blk.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/block/blk.h b/block/blk.h
> > index 0a94ec68af32..151f86932547 100644
> > --- a/block/blk.h
> > +++ b/block/blk.h
> > @@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
> >  static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
> >  {
> >  #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> > +	preempt_disable();
> >  	write_seqcount_begin(&part->nr_sects_seq);
> >  	part->nr_sects = size;
> >  	write_seqcount_end(&part->nr_sects_seq);
> > +	preempt_enable();
> >  #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
> >  	preempt_disable();
> >  	part->nr_sects = size;
>
> This does look like something that include/linux/u64_stats_sync.h could
> help with.

Correct.

I just felt though that this would be too much for a 'Cc: stable' patch.

In another (in-progress) seqlock.h patch series, all of the seqcount_t
call sites that are used for 64-bit values tearing protection on 32-bit
kernels are transformed to the u64_stats_sync.h API.

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH

^ permalink raw reply	[relevance 99%]

Results 201-400 of ~800   |  | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2008-03-01 23:27     [PATCH -v2 -mm] LSM: Add security= boot parameter Ahmed S. Darwish
2008-03-02  7:49     ` Ahmed S. Darwish
2008-03-02 10:59       ` [PATCH -v3 " Ahmed S. Darwish
2008-03-03  8:29         ` James Morris
2008-03-03 15:35           ` Ahmed S. Darwish
2008-03-03 15:54             ` Stephen Smalley
2008-03-03 21:24               ` [PATCH -v4 " Ahmed S. Darwish
2008-03-03 22:16                 ` James Morris
2008-03-04  3:04                   ` [PATCH -v5 " Ahmed S. Darwish
2008-03-05 22:29                     ` Andrew Morton
2008-03-05 22:56 98%                   ` Ahmed S. Darwish
2008-03-05 23:06 99%                     ` Ahmed S. Darwish
2008-03-04 13:10     [PATCH BUGFIX -rc3] Smack: Don't register smackfs if we're not loaded Ahmed S. Darwish
2008-03-04 13:58     ` [PATCH -rc3] Security: Introduce security= boot parameter Ahmed S. Darwish
2008-03-05 15:29 67%   ` [PATCH -v7 " Ahmed S. Darwish
2008-03-05 16:33         ` Casey Schaufler
2008-03-05 16:55 98%       ` Ahmed S. Darwish
2008-03-05 18:46 65%   ` [PATCH -v7b " Ahmed S. Darwish
2008-03-04 17:45     [PATCH BUGFIX -rc3] Smack: Don't register smackfs if we're not loaded Casey Schaufler
2008-03-04 18:12     ` Linus Torvalds
2008-03-05  0:58       ` James Morris
2008-03-05 12:12 99%     ` Ahmed S. Darwish
2008-03-05 12:44 99% ` Ahmed S. Darwish
2008-03-05 12:51 99%   ` Ahmed S. Darwish
2008-03-06 12:19 67% [PATCH -v8 -rc3] Security: Introduce security= boot parameter Ahmed S. Darwish
2008-03-06 13:31     ` James Morris
2008-03-06 14:32 99%   ` Ahmed S. Darwish
2008-03-06 15:03         ` James Morris
2008-03-06 16:09 68%       ` [PATCH -v8b " Ahmed S. Darwish
2008-03-08 21:38 86% [PATCH BUGFIX -rc4] Smackfs: Do not trust `count' in inodes write()s Ahmed S. Darwish
2008-03-10 12:49 61% [RFC][PATCH] Smack<->Audit integration Ahmed S. Darwish
2008-03-10 16:07     ` Casey Schaufler
2008-03-10 18:26 99%   ` Ahmed S. Darwish
2008-03-10 18:43         ` Casey Schaufler
2008-03-12  2:44 78%       ` [RFC][PATCH -v2] Smack: Integrate with Audit Ahmed S. Darwish
2008-03-12  4:23             ` Casey Schaufler
2008-03-12 12:18 78%           ` [PATCH -v2b] " Ahmed S. Darwish
2008-03-12 15:40     [RFC][PATCH -v2] " Casey Schaufler
2008-03-12 15:48     ` Stephen Smalley
2008-03-12 16:43 99%   ` Ahmed S. Darwish
2008-03-14 23:10 96% [PATCH BUGFIX -rc5] Smack: Do not dereference NULL ipc object Ahmed S. Darwish
2008-03-17 23:14     [PATCH] de-semaphorize smackfs Jonathan Corbet
2008-03-18  7:35     ` Christoph Hellwig
2008-03-18 12:01 99%   ` Ahmed S. Darwish
2008-03-21 15:05 94% [PATCH BUGFIX -rc6] Smackfs: remove redundant lock, fix open(,O_RDWR) Ahmed S. Darwish
2008-04-10 19:31     [patch] SLQB v2 Nick Piggin
2008-04-15  3:44 94% ` Ahmed S. Darwish
2008-04-17 11:05     Security testing tree patch review for 2.6.26 James Morris
2008-04-17 16:10 75% ` [PATCH 2.6.26 #repost] Smack: Integrate with Audit Ahmed S. Darwish
2008-05-30 23:36 99% [PATCH BUGFIX -rc4] Smack: Respect 'unlabeled' netlabel mode Ahmed S. Darwish
2008-05-30 23:10     ` Casey Schaufler
2008-05-31  0:58 95%   ` Ahmed S. Darwish
2008-05-30 23:57 94% ` [PATCH BUGFIX -v2 " Ahmed S. Darwish
2008-05-30 23:25       ` Andrew Morton
2008-05-31  1:12 99%     ` Ahmed S. Darwish
2008-09-25 17:25     SMACK netfilter smacklabel socket match Tilman Baumann
2008-09-26  8:19     ` Tilman Baumann
2008-09-27  5:01       ` Casey Schaufler
2008-09-29 16:21         ` Tilman Baumann
2008-09-30  3:29           ` Casey Schaufler
2008-10-01 11:29             ` Tilman Baumann
2008-10-01 15:21               ` Casey Schaufler
2008-10-01 16:55                 ` Tilman Baumann
2008-10-01 18:22                   ` Casey Schaufler
2008-10-06 12:57                     ` Tilman Baumann
2008-10-06 23:05 99%                   ` Ahmed S. Darwish
2009-12-04  9:21 98% x86: Is 'volatile' necessary for readb/writeb and friends? Ahmed S. Darwish
2010-12-21 23:30     Linux 2.6.37-rc7 Jochen Voss
2010-12-22 18:18 99% ` Ahmed S. Darwish
2010-12-25  9:57 97% [PATCH, Bugfix] RAMOOPS: Don't overflow over non-allocated regions Ahmed S. Darwish
2010-12-25 11:19     ` Marco Stornelli
2010-12-27 21:33 99%   ` Ahmed S. Darwish
2011-01-14 13:39 99% Saving early panic() messages, to disk, using the BIOS Ahmed S. Darwish
2011-01-25 13:47 85% [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Ahmed S. Darwish
2011-01-25 13:51 55% ` [PATCH -next 1/2][RFC] x86: Saveoops: Switch to real-mode and call BIOS Ahmed S. Darwish
2011-01-25 13:53 63% ` [PATCH -next 2/2][RFC] x86: Saveoops: Reserve low memory and register code Ahmed S. Darwish
2011-01-25 17:29       ` H. Peter Anvin
2011-01-26  9:04 99%     ` Ahmed S. Darwish
2011-01-25 14:09     ` [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Ingo Molnar
2011-01-25 15:08       ` Tejun Heo
2011-01-25 17:33         ` H. Peter Anvin
2011-01-26 11:44 99%       ` Ahmed S. Darwish
2011-01-25 15:36 87%   ` Ahmed S. Darwish
2011-01-25 16:02         ` James Bottomley
2011-01-25 17:05 96%       ` Ahmed S. Darwish
2011-01-25 20:25     ` Linus Torvalds
     [not found]       ` <20110126124954.GC24527@laptop>
     [not found]         ` <20110127021338.GA20334@redhat.com>
     [not found]           ` <4D40F81E.1030009@zytor.com>
     [not found]             ` <20110127052639.GA16289@laptop>
     [not found]               ` <m1sjweyeax.fsf@fess.ebiederm.org>
2011-02-02 11:13 99%             ` Ahmed S. Darwish
2011-02-06 15:41 99% [PATCH -next] Documentation: Improve crashkernel= description Ahmed S. Darwish
2011-02-06 21:57     ` Simon Horman
2011-02-07  4:14       ` Randy Dunlap
2011-02-07 11:30 99%     ` [PATCH v2 " Ahmed S. Darwish
2011-02-07 14:25           ` Vivek Goyal
2011-02-07 16:01 96%         ` [PATCH v3 " Ahmed S. Darwish
2011-02-06 17:45 99% [PATCH -next] Documentation: Explain the [KMG] parameters suffix Ahmed S. Darwish
2011-02-07  4:40     ` [PATCH] Documentation: log_buf_len accepts [KMG] Randy Dunlap
2011-02-07 11:40 99%   ` Ahmed S. Darwish
2014-12-23 15:46 96% [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
2014-12-23 15:53 40% ` [PATCH] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2014-12-24 12:36       ` Olivier Sobrie
2014-12-24 15:04 89%     ` Ahmed S. Darwish
2014-12-28 21:51           ` Olivier Sobrie
2014-12-30 15:33 99%         ` Ahmed S. Darwish
2014-12-24 12:31     ` [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Olivier Sobrie
2014-12-24 15:52 99%   ` Ahmed S. Darwish
2014-12-24 23:56 92% ` [PATCH v2 1/4] " Ahmed S. Darwish
2014-12-24 23:59 86%   ` [PATCH v2 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
2014-12-25  0:02 98%     ` [PATCH v2 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
2014-12-25  0:04 39%       ` [PATCH v2 4/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2014-12-25  2:50       ` [PATCH v2 1/4] can: kvaser_usb: Don't free packets when tight on URBs Greg KH
2014-12-25  9:38 99%     ` Ahmed S. Darwish
2015-01-01 21:59     ` [PATCH] " Stephen Hemminger
2015-01-03 14:34 99%   ` Ahmed S. Darwish
2015-01-05 17:49 93% ` [PATCH v3 1/4] " Ahmed S. Darwish
2015-01-05 17:52 87%   ` [PATCH v3 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
2015-01-05 17:57 98%     ` [PATCH v3 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
2015-01-05 18:31 39%       ` [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2015-01-08 11:53             ` Marc Kleine-Budde
2015-01-08 15:19 79%           ` Ahmed S. Darwish
2015-01-12 11:51                 ` Marc Kleine-Budde
2015-01-12 12:26 99%               ` Ahmed S. Darwish
2015-01-09  3:06 99%           ` Ahmed S. Darwish
2015-01-11 20:05 99% ` [PATCH v4 00/04] can: Introduce support for Kvaser USBCAN-II devices Ahmed S. Darwish
2015-01-11 20:11 97%   ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() devices Ahmed S. Darwish
2015-01-11 20:15 99%     ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Ahmed S. Darwish
2015-01-11 20:36 39%       ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2015-01-11 20:45 92%         ` [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
2015-01-11 20:51               ` Marc Kleine-Budde
2015-01-12 10:14 99%             ` Ahmed S. Darwish
2015-01-12 13:33                   ` Olivier Sobrie
2015-01-12 13:50 99%                 ` Ahmed S. Darwish
2015-01-12 11:20 99%         ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2015-01-12 11:43             ` Marc Kleine-Budde
2015-01-12 12:07 99%           ` Ahmed S. Darwish
2015-01-12 12:36 99%             ` Ahmed S. Darwish
2015-01-12 13:53             ` Olivier Sobrie
2015-01-18 20:12 99%           ` Ahmed S. Darwish
2015-01-12 11:09           ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Marc Kleine-Budde
2015-01-12 20:36 67%         ` Ahmed S. Darwish
2015-01-11 20:49 97%     ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() Ahmed S. Darwish
2015-01-20 21:44 70% ` [PATCH v5 1/5] can: kvaser_usb: Update net interface state before exiting on OOM Ahmed S. Darwish
2015-01-20 21:45 80%   ` [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
2015-01-20 21:47 94%     ` [PATCH v5 3/5] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
2015-01-20 21:48 97%       ` [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
2015-01-20 21:50 39%         ` [PATCH v5 5/5] can: kvaser_usb: Add support for the USBcan-II family Ahmed S. Darwish
2015-01-21 12:24             ` [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Sergei Shtylyov
2015-01-25 11:59 99%           ` Ahmed S. Darwish
2015-01-21 10:33         ` [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling Andri Yngvason
2015-01-21 11:53           ` Wolfgang Grandegger
2015-01-21 14:43 61%         ` Ahmed S. Darwish
2015-01-21 15:00               ` Andri Yngvason
2015-01-21 15:36 96%             ` Ahmed S. Darwish
2015-01-21 16:13                   ` Wolfgang Grandegger
2015-01-23  6:07 90%                 ` Ahmed S. Darwish
2015-01-23 10:32                       ` Andri Yngvason
2015-01-25  2:43 86%                     ` Ahmed S. Darwish
2015-01-21 16:20         ` Andri Yngvason
2015-01-21 22:59           ` Marc Kleine-Budde
2015-01-22 10:14             ` Andri Yngvason
2015-01-25  2:49 99%           ` Ahmed S. Darwish
2015-01-25  3:21 99%       ` Ahmed S. Darwish
2015-01-26  5:17 74% ` [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support Ahmed S. Darwish
2015-01-26  5:20 88%   ` [PATCH v6 1/7] can: kvaser_usb: Do not sleep in atomic context Ahmed S. Darwish
2015-01-26  5:22 99%     ` [PATCH v6 2/7] can: kvaser_usb: Send correct context to URB completion Ahmed S. Darwish
2015-01-26  5:24 98%       ` [PATCH v6 3/7] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
2015-01-26  5:25 97%         ` [PATCH v6 4/7] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
2015-01-26  5:27 70%           ` [PATCH v6 5/7] can: kvaser_usb: Update interface state before exiting on OOM Ahmed S. Darwish
2015-01-26  5:29 80%             ` [PATCH v6 6/7] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
2015-01-26  5:33 39%               ` [PATCH v6 7/7] can: kvaser_usb: Add support for the USBcan-II family Ahmed S. Darwish
2015-01-08 16:41     Fix for synchronization issue in IPV6 implementation in smack module(v3.18) Vishal Goel
2015-01-09 13:42 95% ` Ahmed S. Darwish
2015-01-16 19:16     [PATCH v3 00/13] Add kdbus implementation Greg Kroah-Hartman
2015-01-16 19:16     ` [PATCH 01/13] kdbus: add documentation Greg Kroah-Hartman
2015-01-23  6:28 99%   ` Ahmed S. Darwish
2015-01-23 13:19         ` Greg Kroah-Hartman
2015-01-25  3:30 99%       ` Ahmed S. Darwish
2015-02-26 15:20 95% [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
2015-02-26 15:22 88% ` [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer Ahmed S. Darwish
2015-02-26 15:24 78%   ` [PATCH 3/5] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-02-26 15:25 99%     ` [PATCH 4/5] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-02-26 15:29 79%       ` [PATCH 5/5] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
2015-03-04  9:15     ` [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Marc Kleine-Budde
2015-03-09 12:32 99%   ` Ahmed S. Darwish
2015-03-11 15:23 78% ` [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
2015-03-11 15:28 78%   ` [PATCH v2 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-03-11 15:30 99%     ` [PATCH v2 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-03-11 15:36       ` [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Marc Kleine-Budde
2015-03-11 15:57 99%     ` Ahmed S. Darwish
2015-03-11 17:37 77% ` [PATCH v3 " Ahmed S. Darwish
2015-03-11 17:39 79%   ` [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-03-11 17:39 99%     ` [PATCH v3 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-03-11 21:53         ` [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs Marc Kleine-Budde
2015-03-12 10:52 99%       ` Ahmed S. Darwish
2015-03-11 21:43       ` [PATCH v3 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Marc Kleine-Budde
2015-03-12 19:30 89%     ` Ahmed S. Darwish
2015-03-14 13:02 74% ` [PATCH v4 " Ahmed S. Darwish
2015-03-14 13:09 78%   ` [PATCH v4 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-03-14 13:11 99%     ` [PATCH v4 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-03-14 15:26           ` Marc Kleine-Budde
2015-03-14 15:41 99%         ` Ahmed S. Darwish
2015-03-14 15:55               ` Marc Kleine-Budde
2015-03-14 16:06 99%             ` Ahmed S. Darwish
2015-03-14 13:41       ` [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Marc Kleine-Budde
2015-03-14 14:38 95%     ` Ahmed S. Darwish
2015-03-14 14:58           ` Marc Kleine-Budde
2015-03-14 15:19 99%         ` Ahmed S. Darwish
2015-03-15 15:03 77% ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Ahmed S. Darwish
2015-03-15 15:10 98%   ` [PATCH v5 2/2] can: kvaser_usb: Fix sparse warning __le16 degrades to integer Ahmed S. Darwish
2015-03-15 18:08       ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Marc Kleine-Budde
2015-03-16 12:16 99%     ` Ahmed S. Darwish
2018-08-18 15:57     [GIT PULL] Staging/IIO driver patches for 4.19-rc1 Greg KH
2018-08-28 10:38 97% ` Ahmed S. Darwish
2018-08-28 12:36       ` Greg KH
2018-08-28 14:30 91%     ` Ahmed S. Darwish
2018-09-10  8:16           ` Greg KH
2018-09-10 15:28 94%         ` [PATCH] staging: gasket: TODO: re-implement using UIO Ahmed S. Darwish
2018-08-27 23:14     [PATCH 01/13] seq_file: rewrite seq_puts() in terms of seq_write() Alexey Dobriyan
2018-08-27 23:15     ` [PATCH 11/13] proc: readdir /proc/*/task Alexey Dobriyan
2018-08-28 12:36 99%   ` Ahmed S. Darwish
2018-08-28 13:04 99%     ` Ahmed S. Darwish
2018-09-07 18:03     [PATCH V5 0/4] Fix kvm misconceives NVDIMM pages as reserved mmio Zhang Yi
2018-09-07 17:04 97% ` Ahmed S. Darwish
2018-09-11 16:26     [PATCH v2 00/10] LSM: Module stacking in support of S.A.R.A and Landlock Casey Schaufler
2018-09-11 16:41     ` [PATCH 01/10] procfs: add smack subdir to attrs Casey Schaufler
2018-09-11 23:45 99%   ` Ahmed S. Darwish
2018-09-11 21:34     [PATCH v3 3/3] drivers: soc: xilinx: Add ZynqMP PM driver Jolly Shah
2018-09-12  1:10 95% ` Ahmed S. Darwish
2019-03-06  0:44 99% [PATCH] x86/defconfig: Remove archaic partition tables support Ahmed S. Darwish
2019-04-19 10:36 84% ` [tip:x86/build] " tip-bot for Ahmed S. Darwish
2019-03-10  1:31 81% DRM-based Oops viewer Ahmed S. Darwish
2019-03-11  9:04     ` Jani Nikula
2019-03-11 13:49       ` Daniel Vetter
2019-03-11 23:39 99%     ` Ahmed S. Darwish
2019-03-11 22:12 86%   ` Ahmed S. Darwish
2019-03-10 20:03 99% [PATCH] printk: kmsg_dump: Mark registered flag as private Ahmed S. Darwish
2019-03-11 12:49     ` Sergey Senozhatsky
2019-03-12 20:05 99%   ` Ahmed S. Darwish
2019-09-08 20:59     Linux 5.3-rc8 Linus Torvalds
2019-09-10  4:21 96% ` Ahmed S. Darwish
2019-09-10 11:33       ` Linus Torvalds
2019-09-10 17:33 96%     ` Ahmed S. Darwish
2019-09-10 18:21           ` Linus Torvalds
2019-09-11 16:07             ` Theodore Y. Ts'o
2019-09-11 16:45               ` Linus Torvalds
2019-09-11 17:00                 ` Linus Torvalds
2019-09-11 17:36                   ` Theodore Y. Ts'o
2019-09-12  3:44 84%                 ` Ahmed S. Darwish
2019-09-12  8:25                       ` Theodore Y. Ts'o
2019-09-12 11:34                         ` Linus Torvalds
2019-09-14 12:25 80%                       ` [PATCH RFC] random: getrandom(2): don't block on non-initialized entropy pool Ahmed S. Darwish
2019-09-14 14:08                             ` Alexander E. Patrakov
2019-09-15  5:22                               ` [PATCH RFC v2] random: optionally block in getrandom(2) when the CRNG is uninitialized Theodore Y. Ts'o
2019-09-15  8:17 58%                             ` [PATCH RFC v3] random: getrandom(2): optionally block when " Ahmed S. Darwish
2019-09-15  8:59                                   ` Lennart Poettering
2019-09-15  9:30                                     ` Willy Tarreau
2019-09-15 10:02 98%                                   ` Ahmed S. Darwish
2019-09-15 10:40                                         ` Willy Tarreau
2019-09-15 10:55 98%                                       ` Ahmed S. Darwish
2019-09-15 17:32                                 ` [PATCH RFC v2] random: optionally block in getrandom(2) when the " Linus Torvalds
2019-09-15 18:32                                   ` Willy Tarreau
2019-09-15 18:59                                     ` Linus Torvalds
2019-09-16  2:45 92%                                   ` Ahmed S. Darwish
2019-09-18 21:15 97%                               ` [PATCH RFC v4 0/1] random: WARN on large getrandom() waits and introduce getrandom2() Ahmed S. Darwish
2019-09-18 21:17 53%                                 ` [PATCH RFC v4 1/1] " Ahmed S. Darwish
2019-09-18 23:57                                       ` Linus Torvalds
2019-09-20 13:46 89%                                     ` Ahmed S. Darwish
2019-09-20 14:33                                           ` Andy Lutomirski
2019-09-20 16:29                                             ` Linus Torvalds
2019-09-20 17:52                                               ` Andy Lutomirski
2019-09-20 18:09                                                 ` Linus Torvalds
2019-09-21  6:07                                                   ` Florian Weimer
2019-09-23 18:33                                                     ` Andy Lutomirski
2019-09-26 21:11 99%                                                   ` Ahmed S. Darwish
2019-09-20 17:26                                           ` Willy Tarreau
2019-09-20 17:56 99%                                         ` Ahmed S. Darwish
2019-09-26 20:42 89%                                     ` [PATCH v5 0/1] random: getrandom(2): warn on large CRNG waits, introduce new flags Ahmed S. Darwish
2019-09-26 20:44 50%                                       ` [PATCH v5 1/1] " Ahmed S. Darwish
2019-09-26 21:39                                             ` Andy Lutomirski
2019-09-28  9:30 98%                                           ` Ahmed S. Darwish
2019-09-14 15:02 94%                       ` Linux 5.3-rc8 Ahmed S. Darwish
2019-09-14 16:30                             ` Linus Torvalds
2019-09-14 21:11 94%                           ` Ahmed S. Darwish
2019-09-15  6:51                               ` Lennart Poettering
2019-09-15  7:27 99%                             ` Ahmed S. Darwish
2019-09-15 16:29                                 ` Linus Torvalds
2019-09-16  1:40 99%                               ` Ahmed S. Darwish
2019-09-16  1:48                                     ` Vito Caputo
2019-09-16  2:49                                       ` Theodore Y. Ts'o
2019-09-16  4:29                                         ` Willy Tarreau
2019-09-16  5:02                                           ` Linus Torvalds
2019-09-16  6:12                                             ` Willy Tarreau
2019-09-16 16:17                                               ` Linus Torvalds
2019-09-16 17:21                                                 ` Theodore Y. Ts'o
2019-09-16 17:44                                                   ` Linus Torvalds
2019-09-16 23:02                                                     ` Matthew Garrett
2019-09-16 23:05                                                       ` Linus Torvalds
2019-09-16 23:11                                                         ` Matthew Garrett
2019-09-16 23:18                                                           ` Linus Torvalds
2019-09-16 23:29 98%                                                         ` Ahmed S. Darwish
2019-09-17  1:46                                                     ` Matthew Garrett
2019-09-17  5:24                                                       ` Willy Tarreau
2019-09-17  7:33                                                         ` Martin Steigerwald
2019-09-17 12:11                                                           ` Theodore Y. Ts'o
2019-09-17 12:30 84%                                                         ` Ahmed S. Darwish
2019-09-17 17:42                                                     ` Lennart Poettering
2019-09-17 18:01                                                       ` Linus Torvalds
2019-09-17 20:28                                                         ` Martin Steigerwald
2019-09-17 20:52 99%                                                       ` Ahmed S. Darwish
2019-09-16 19:53 99%                                               ` Ahmed S. Darwish
2019-09-14  9:25 83%                     ` Ahmed S. Darwish
2019-09-11 21:41 99%             ` Ahmed S. Darwish
2019-09-11 22:37 99%               ` Ahmed S. Darwish
2019-09-28 22:24     x86/random: Speculation to the rescue Thomas Gleixner
2019-09-28 23:53     ` Linus Torvalds
2019-10-01 16:15 96%   ` Ahmed S. Darwish
2019-10-01 16:37         ` Kees Cook
2019-10-01 17:18 96%       ` Ahmed S. Darwish
2020-03-09 18:15 98% [PATCH] sched/clock: expire timer in hardirq context Ahmed S. Darwish
2020-03-19  8:47 83% ` [tip: timers/core] time/sched_clock: Expire " tip-bot2 for Ahmed S. Darwish
2020-05-19 21:45 72% [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
2020-05-19 21:45 82% ` [PATCH v1 01/25] net: core: device_rename: Use rwsem instead of a seqcount Ahmed S. Darwish
2020-05-20  2:01       ` Eric Dumazet
2020-05-20  6:42 99%     ` Ahmed S. Darwish
2020-05-19 21:45 81% ` [PATCH v1 02/25] mm/swap: Don't abuse the seqcount latching API Ahmed S. Darwish
2020-05-19 21:45 92% ` [PATCH v1 03/25] net: phy: fixed_phy: Remove unused seqcount Ahmed S. Darwish
2020-05-19 21:45 97% ` [PATCH v1 04/25] block: nr_sects_write(): Disable preemption on seqcount write Ahmed S. Darwish
2020-05-22 16:39       ` Peter Zijlstra
2020-05-25  9:56 99%     ` Ahmed S. Darwish
2020-05-19 21:45 82% ` [PATCH v1 05/25] u64_stats: Document writer non-preemptibility requirement Ahmed S. Darwish
2020-05-19 21:45 91% ` [PATCH v1 06/25] dma-buf: Remove custom seqcount lockdep class key Ahmed S. Darwish
2020-05-19 21:45 89% ` [PATCH v1 07/25] lockdep: Add preemption disabled assertion API Ahmed S. Darwish
2020-05-19 21:45 91% ` [PATCH v1 08/25] seqlock: lockdep assert non-preemptibility on seqcount_t write Ahmed S. Darwish
2020-05-19 21:45 58% ` [PATCH v1 09/25] Documentation: locking: Describe seqlock design and usage Ahmed S. Darwish
2020-05-19 21:45 82% ` [PATCH v1 10/25] seqlock: Add RST directives to kernel-doc code samples and notes Ahmed S. Darwish
2020-05-22 18:02       ` Peter Zijlstra
2020-05-22 18:03         ` Peter Zijlstra
2020-05-22 18:26           ` Thomas Gleixner
2020-05-22 18:32             ` Peter Zijlstra
2020-05-25  9:36 93%           ` Ahmed S. Darwish
2020-05-19 21:45 42% ` [PATCH v1 11/25] seqlock: Add missing kernel-doc annotations Ahmed S. Darwish
2020-05-19 21:45 37% ` [PATCH v1 12/25] seqlock: Extend seqcount API with associated locks Ahmed S. Darwish
2020-05-19 21:45 81% ` [PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex Ahmed S. Darwish
2020-05-20 10:48       ` Christian König
2020-05-21  0:09 96%     ` Ahmed S. Darwish
2020-05-19 21:45 93% ` [PATCH v1 14/25] sched: tasks: Use sequence counter with associated spinlock Ahmed S. Darwish
2020-05-19 21:45 92% ` [PATCH v1 15/25] netfilter: conntrack: " Ahmed S. Darwish
2020-05-19 21:45 95% ` [PATCH v1 16/25] netfilter: nft_set_rbtree: Use sequence counter with associated rwlock Ahmed S. Darwish
2020-05-19 21:45 91% ` [PATCH v1 17/25] xfrm: policy: Use sequence counters with associated lock Ahmed S. Darwish
2020-05-19 21:45 84% ` [PATCH v1 18/25] timekeeping: Use sequence counter with associated raw spinlock Ahmed S. Darwish
2020-05-19 21:45 89% ` [PATCH v1 19/25] vfs: Use sequence counter with associated spinlock Ahmed S. Darwish
2020-05-19 21:45 95% ` [PATCH v1 20/25] raid5: " Ahmed S. Darwish
2020-05-19 21:45 95% ` [PATCH v1 21/25] iocost: " Ahmed S. Darwish
2020-05-19 21:45 95% ` [PATCH v1 22/25] NFSv4: " Ahmed S. Darwish
2020-05-19 21:45 96% ` [PATCH v1 23/25] userfaultfd: " Ahmed S. Darwish
2020-05-19 21:45 95% ` [PATCH v1 24/25] kvm/eventfd: " Ahmed S. Darwish
2020-05-19 21:45 94% ` [PATCH v1 25/25] hrtimer: Use sequence counter with associated raw spinlock Ahmed S. Darwish

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).