* [PATCH 00/44 take 2] [UBI] Unsorted Block Images
@ 2007-02-17 16:54 Artem Bityutskiy
  2007-02-17 16:54 ` [PATCH 01/44 take 2] [UBI] Linux build integration Artem Bityutskiy
                   ` (46 more replies)
  0 siblings, 47 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

Hello,

This patch set contains UBI, which stands for Unsorted Block Images. It
is closely related to the Linux memory technology devices (MTD) subsystem,
so the new code lives in drivers/mtd/ubi.

In short, UBI is a kind of LVM layer for flash (MTD) devices. It makes
it possible to dynamically create, delete and re-size volumes. The
analogy is not complete, though: UBI also takes care of wear-leveling and
bad eraseblock handling, so it completely hides 2 aspects of flash chips
which make them very difficult to work with:

1. wear of eraseblocks;
2. bad eraseblocks.

There is some documentation available at:
http://www.linux-mtd.infradead.org/doc/ubi.html
http://www.linux-mtd.infradead.org/faq/ubi.html

The sources are available via the GIT tree:
git://git.infradead.org/ubi-2.6.git

The UBI tree has been in -mm for a few releases now and we would like to
see it in the mainline. We have several groups using the code now and it
has proved to be fairly stable thus far. We've also got some feedback
from people outside the community.

The structure of the UBI code is very simple. UBI as a whole consists of
units. Each unit has one .c file which implements it and one .h file which
defines its interface. So I've split the UBI code so that there is a
distinct patch for each unit's header and for each unit's implementation.
Plus there are several other patches: JFFS2 support, user-space-visible
headers, etc. To see what a unit is responsible for, glance at its header -
there is a short description.

Diffstat of the whole UBI:

 MAINTAINERS                   |    8
 drivers/mtd/Kconfig           |    2
 drivers/mtd/Makefile          |    2
 drivers/mtd/ubi/Kconfig       |   71 +
 drivers/mtd/ubi/Kconfig.debug |  218 +++++
 drivers/mtd/ubi/Makefile      |    7
 drivers/mtd/ubi/account.c     |  286 +++++++
 drivers/mtd/ubi/account.h     |  118 ++
 drivers/mtd/ubi/alloc.c       |  369 +++++++++
 drivers/mtd/ubi/alloc.h       |  235 +++++
 drivers/mtd/ubi/background.c  |  352 ++++++++
 drivers/mtd/ubi/background.h  |  177 ++++
 drivers/mtd/ubi/badeb.c       |  234 +++++
 drivers/mtd/ubi/badeb.h       |  108 ++
 drivers/mtd/ubi/build.c       |  191 ++++
 drivers/mtd/ubi/build.h       |   62 +
 drivers/mtd/ubi/cdev.c        | 1033 +++++++++++++++++++++++++
 drivers/mtd/ubi/cdev.h        |   82 ++
 drivers/mtd/ubi/debug.c       |  634 +++++++++++++++
 drivers/mtd/ubi/debug.h       |  284 +++++++
 drivers/mtd/ubi/eba.c         | 1212 ++++++++++++++++++++++++++++++
 drivers/mtd/ubi/eba.h         |  362 +++++++++
 drivers/mtd/ubi/gluebi.c      |  363 +++++++++
 drivers/mtd/ubi/gluebi.h      |   88 ++
 drivers/mtd/ubi/init.c        |  371 +++++++++
 drivers/mtd/ubi/io.c          | 1297 ++++++++++++++++++++++++++++++++
 drivers/mtd/ubi/io.h          |  419 ++++++++++
 drivers/mtd/ubi/misc.c        |  138 +++
 drivers/mtd/ubi/misc.h        |  146 +++
 drivers/mtd/ubi/scan.c        | 1383 ++++++++++++++++++++++++++++++++++
 drivers/mtd/ubi/scan.h        |  279 ++++++
 drivers/mtd/ubi/sysfs.c       |  614 +++++++++++++++
 drivers/mtd/ubi/sysfs.h       |   82 ++
 drivers/mtd/ubi/ubi.h         |  100 ++
 drivers/mtd/ubi/uif.c         |  786 +++++++++++++++++++
 drivers/mtd/ubi/uif.h         |  182 ++++
 drivers/mtd/ubi/upd.c         |  389 +++++++++
 drivers/mtd/ubi/upd.h         |  136 +++
 drivers/mtd/ubi/volmgmt.c     |  374 +++++++++
 drivers/mtd/ubi/volmgmt.h     |  129 +++
 drivers/mtd/ubi/vtbl.c        | 1369 ++++++++++++++++++++++++++++++++++
 drivers/mtd/ubi/vtbl.h        |  302 +++++++
 drivers/mtd/ubi/wl.c          | 1684 ++++++++++++++++++++++++++++++++++++++++++
 drivers/mtd/ubi/wl.h          |  284 +++++++
 fs/jffs2/fs.c                 |   12
 fs/jffs2/os-linux.h           |    6
 fs/jffs2/wbuf.c               |   24
 include/linux/mtd/ubi.h       |  391 +++++++++
 include/mtd/Kbuild            |    2
 include/mtd/mtd-abi.h         |    1
 include/mtd/ubi-header.h      |  337 ++++++++
 include/mtd/ubi-user.h        |  161 ++++
 52 files changed, 17896 insertions(+)

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* [PATCH 01/44 take 2] [UBI] Linux build integration
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-17 16:54 ` [PATCH 02/44 take 2] [UBI] on-flash data structures header Artem Bityutskiy
                   ` (45 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/Kconfig tmp-to/drivers/mtd/Kconfig
--- tmp-from/drivers/mtd/Kconfig	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/drivers/mtd/Kconfig	2007-02-17 18:07:26.000000000 +0200
@@ -292,5 +292,7 @@ source "drivers/mtd/nand/Kconfig"
 
 source "drivers/mtd/onenand/Kconfig"
 
+source "drivers/mtd/ubi/Kconfig"
+
 endmenu
 
diff -auNrp tmp-from/drivers/mtd/Makefile tmp-to/drivers/mtd/Makefile
--- tmp-from/drivers/mtd/Makefile	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/drivers/mtd/Makefile	2007-02-17 18:07:26.000000000 +0200
@@ -28,3 +28,5 @@ nftl-objs		:= nftlcore.o nftlmount.o
 inftl-objs		:= inftlcore.o inftlmount.o
 
 obj-y		+= chips/ maps/ devices/ nand/ onenand/
+
+obj-$(CONFIG_MTD_UBI)		+= ubi/
diff -auNrp tmp-from/drivers/mtd/ubi/Kconfig tmp-to/drivers/mtd/ubi/Kconfig
--- tmp-from/drivers/mtd/ubi/Kconfig	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/Kconfig	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,71 @@
+# drivers/mtd/ubi/Kconfig
+
+menu "UBI - Unsorted block images"
+	depends on MTD
+
+config MTD_UBI
+	tristate "Enable UBI"
+	depends on MTD
+	select CRC32
+	help
+	  UBI is a software layer above the MTD layer which allows LVM-like
+	  logical volumes on top of MTD devices, hides some complexities of
+	  flash chips like wear and bad blocks and provides some other useful
+	  capabilities. Please consult the MTD web site for more details
+	  (www.linux-mtd.infradead.org).
+
+config MTD_UBI_WL_THRESHOLD
+	int "UBI wear-leveling threshold"
+	default 4096
+	range 2 65536
+	depends on MTD_UBI
+	help
+	  This parameter defines the maximum difference between the highest
+	  erase counter value and the lowest erase counter value of eraseblocks
+	  of UBI devices. When this threshold is exceeded, UBI starts performing
+	  wear leveling by moving data from eraseblocks with a low erase
+	  counter to eraseblocks with a high erase counter. Leave the default
+	  value if unsure.
+
+config MTD_UBI_BEB_RESERVE
+	int "Percentage of reserved eraseblocks for bad eraseblocks handling"
+	default 1
+	range 0 25
+	depends on MTD_UBI
+	help
+	  If the MTD device may have bad eraseblocks (e.g. NAND flash), UBI
+	  reserves some amount of physical eraseblocks to handle new bad
+	  eraseblocks. For example, if a flash physical eraseblock becomes bad,
+	  UBI uses these reserved physical eraseblocks to relocate the bad one.
+	  This option specifies how many physical eraseblocks will be reserved
+	  for bad eraseblock handling (percentage of the total number of good
+	  flash eraseblocks). If the underlying flash cannot have bad
+	  eraseblocks (e.g. NOR flash), this value is ignored and nothing is
+	  reserved. Leave the default value if unsure.
+
+config MTD_UBI_GLUEBI
+	bool "Emulate MTD devices"
+	default n
+	depends on MTD_UBI
+	help
+	   This option enables MTD device emulation on top of UBI volumes: for
+	   each UBI volume an MTD device is created, and all I/O to this MTD
+	   device is redirected to the UBI volume. This is handy to make
+	   MTD-oriented software (like JFFS2) work on top of UBI. Do not enable
+	   this if no legacy software will be used.
+
+config MTD_UBI_USERSPACE_IO
+	bool "UBI user-space write/erase"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  By default, users cannot directly write and erase individual
+	  eraseblocks of dynamic volumes, and have to use the update operation.
+	  This option enables this capability. Leaving this feature disabled is
+	  more robust because it discourages users from using tools like 'dd' to
+	  copy their images to UBI volumes. Do not enable if unsure.
+
+# There are a lot of debugging options, so they are moved to a distinct file
+source "drivers/mtd/ubi/Kconfig.debug"
+
+endmenu
diff -auNrp tmp-from/drivers/mtd/ubi/Kconfig.debug tmp-to/drivers/mtd/ubi/Kconfig.debug
--- tmp-from/drivers/mtd/ubi/Kconfig.debug	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/Kconfig.debug	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,218 @@
+# UBI debugging configuration options, part of drivers/mtd/ubi/Kconfig
+
+comment "UBI debugging options"
+	depends on MTD_UBI
+
+config MTD_UBI_DEBUG
+	bool "UBI debugging"
+	default n
+	depends on MTD_UBI
+	select DEBUG_FS
+	select KALLSYMS_ALL
+	help
+	  This enables UBI debugging support: various assertions in the code,
+	  verbose debugging messages, and the ability to switch on many other
+	  debugging features. UBI exposes its debugging data via the Linux
+	  debugfs virtual file-system under the "ubi" directory. Mount debugfs
+	  to access it: mount -t debugfs none <mount_point>
+
+config MTD_UBI_DEBUG_DISABLE_BGT
+	bool "Do not enable the UBI background thread"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option switches the background thread off by default. The thread
+	  may also be enabled/disabled via UBI sysfs.
+
+config MTD_UBI_DEBUG_EMULATE_BITFLIPS
+	bool "Emulate flash bit-flips"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option emulates bit-flips with probability 1/50, which in turn
+	  causes scrubbing. Useful for debugging and stressing UBI.
+
+config MTD_UBI_DEBUG_EMULATE_WRITE_FAILURES
+	bool "Emulate flash write failures"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option emulates write failures with probability 1/100. Useful for
+	  debugging and testing how UBI handles errors.
+
+config MTD_UBI_DEBUG_EMULATE_ERASE_FAILURES
+	bool "Emulate flash erase failures"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option emulates erase failures with probability 1/100. Useful for
+	  debugging and testing how UBI handles errors.
+
+menu "UBI debugging messages"
+	depends on MTD_UBI_DEBUG
+
+config MTD_UBI_DEBUG_MSG_VB_ERR
+	bool "Verbose errors reporting"
+	depends on MTD_UBI_DEBUG
+	default y
+	help
+	  This option enables verbose reporting of errors that occur in UBI.
+
+config MTD_UBI_DEBUG_MSG_UIF
+	bool "User interface unit messages"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables debugging messages from the UBI user interfaces
+	  unit.
+
+config MTD_UBI_DEBUG_MSG_CDEV
+	bool "Messages from the character device sub-unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables debugging messages from the UBI character device
+	  handling sub-unit of the user interfaces unit.
+
+config MTD_UBI_DEBUG_MSG_GLUEBI
+	bool "Messages from the gluebi sub-unit"
+	depends on MTD_UBI_DEBUG
+	depends on MTD_UBI_GLUEBI
+	default n
+	help
+	  This option enables debugging messages from the gluebi (MTD devices
+	  emulation) sub-unit of the user interfaces unit.
+
+config MTD_UBI_DEBUG_MSG_VMT
+	bool "Volume management unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI volume management
+	  unit.
+
+config MTD_UBI_DEBUG_MSG_UPD
+	bool "Update unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI update unit.
+
+config MTD_UBI_DEBUG_MSG_VTBL
+	bool "Volume table unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI volume table
+	  unit.
+
+config MTD_UBI_DEBUG_MSG_ACC
+	bool "Accounting unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI accounting unit.
+
+config MTD_UBI_DEBUG_MSG_EBA
+	bool "Eraseblock association unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI eraseblock
+	  association unit.
+
+config MTD_UBI_DEBUG_MSG_WL
+	bool "Wear-leveling unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI wear-leveling
+	  unit.
+
+config MTD_UBI_DEBUG_MSG_BGT
+	bool "Background thread unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI background thread
+	  unit.
+
+config MTD_UBI_DEBUG_MSG_IO
+	bool "Input/output unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI input/output unit.
+
+config MTD_UBI_DEBUG_MSG_BLD
+	bool "Build unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI build unit.
+
+config MTD_UBI_DEBUG_MSG_SCAN
+	bool "Scanning unit messages"
+	default n
+	depends on MTD_UBI_DEBUG
+	help
+	  This option enables debugging messages from the UBI scanning unit.
+
+endmenu # UBI debugging messages
+
+menu "UBI paranoid checks"
+	depends on MTD_UBI_DEBUG
+
+config MTD_UBI_DEBUG_PARANOID_VMT
+	bool "Paranoid checks in the volume management unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the volume management unit.
+
+config MTD_UBI_DEBUG_PARANOID_VTBL
+	bool "Paranoid checks in the volume table unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the volume table unit.
+
+config MTD_UBI_DEBUG_PARANOID_EBA
+	bool "Paranoid checks in the eraseblock association unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the eraseblock association
+	  unit.
+
+config MTD_UBI_DEBUG_PARANOID_WL
+	bool "Paranoid checks in the wear-leveling unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the wear-leveling unit.
+
+config MTD_UBI_DEBUG_PARANOID_ALLOC
+	bool "Paranoid checks in the memory allocation unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the memory allocation unit.
+
+config MTD_UBI_DEBUG_PARANOID_IO
+	bool "Paranoid checks in the input/output unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the input/output unit.
+	  Warning, this is rather heavy-weight and will slow UBI down.
+
+config MTD_UBI_DEBUG_PARANOID_SCAN
+	bool "Paranoid checks in the scanning unit"
+	depends on MTD_UBI_DEBUG
+	default n
+	help
+	  This option enables extra self-checks in the scanning unit. Warning,
+	  this is rather heavy-weight and will slow UBI down.
+
+endmenu # UBI paranoid checks
diff -auNrp tmp-from/drivers/mtd/ubi/Makefile tmp-to/drivers/mtd/ubi/Makefile
--- tmp-from/drivers/mtd/ubi/Makefile	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/Makefile	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,7 @@
+obj-$(CONFIG_MTD_UBI) += ubi.o
+
+ubi-y += badeb.o upd.o sysfs.o cdev.o uif.o vtbl.o volmgmt.o eba.o io.o wl.o
+ubi-y += scan.o build.o background.o alloc.o init.o account.o misc.o
+
+ubi-$(CONFIG_MTD_UBI_DEBUG) += debug.o
+ubi-$(CONFIG_MTD_UBI_GLUEBI) += gluebi.o
diff -auNrp tmp-from/include/mtd/Kbuild tmp-to/include/mtd/Kbuild
--- tmp-from/include/mtd/Kbuild	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/include/mtd/Kbuild	2007-02-17 18:07:26.000000000 +0200
@@ -3,3 +3,5 @@ header-y += jffs2-user.h
 header-y += mtd-abi.h
 header-y += mtd-user.h
 header-y += nftl-user.h
+header-y += ubi-header.h
+header-y += ubi-user.h
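
To illustrate the MTD_UBI_WL_THRESHOLD help text above, here is a minimal
user-space sketch of the check it describes. This is not the code from wl.c:
the function name wl_needed() and the WL_THRESHOLD macro are made up for the
example, and the erase counters are assumed to be available as a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CONFIG_MTD_UBI_WL_THRESHOLD (the Kconfig default is 4096). */
#define WL_THRESHOLD 4096

/*
 * Hypothetical helper: return true when the difference between the highest
 * and the lowest erase counter exceeds the threshold, i.e. when wear-leveling
 * would have to start moving data.
 */
static bool wl_needed(const uint32_t *ec, size_t count)
{
	uint32_t min_ec, max_ec;
	size_t i;

	assert(ec != NULL && count > 0);

	min_ec = max_ec = ec[0];
	for (i = 1; i < count; i++) {
		if (ec[i] < min_ec)
			min_ec = ec[i];
		if (ec[i] > max_ec)
			max_ec = ec[i];
	}

	/* "Exceeded" is read here as strictly greater than the threshold. */
	return max_ec - min_ec > WL_THRESHOLD;
}
```

With the default threshold, the counters {10, 20, 4106} would not trigger
wear-leveling (the difference is exactly 4096), while {10, 20, 4107} would.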


* [PATCH 02/44 take 2] [UBI] on-flash data structures header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
  2007-02-17 16:54 ` [PATCH 01/44 take 2] [UBI] Linux build integration Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-17 16:54 ` [PATCH 03/44 take 2] [UBI] user-space API header Artem Bityutskiy
                   ` (44 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/include/mtd/ubi-header.h tmp-to/include/mtd/ubi-header.h
--- tmp-from/include/mtd/ubi-header.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/include/mtd/ubi-header.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,337 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Authors: Artem B. Bityutskiy
+ *          Thomas Gleixner
+ *          Frank Haverkamp
+ *          Oliver Lohmann
+ *          Andreas Arnez
+ */
+
+/*
+ * This file defines the layout of UBI headers and all the other UBI on-flash
+ * data structures.
+ */
+
+#ifndef __UBI_HEADER_H__
+#define __UBI_HEADER_H__
+
+#include <asm/byteorder.h>
+
+/* The version of this UBI implementation */
+#define UBI_VERSION 1
+
+/* The highest erase counter value supported by this implementation of UBI */
+#define UBI_MAX_ERASECOUNTER 0x7FFFFFFF
+
+/* The initial CRC32 value used when calculating CRC checksums */
+#define UBI_CRC32_INIT 0xFFFFFFFFU
+
+/*
+ * Magic numbers of the UBI headers.
+ *
+ * @UBI_EC_HDR_MAGIC: erase counter header magic number (ASCII "UBI#")
+ * @UBI_VID_HDR_MAGIC: volume identifier header magic number (ASCII "UBI!")
+ */
+enum {
+	UBI_EC_HDR_MAGIC  = 0x55424923,
+	UBI_VID_HDR_MAGIC = 0x55424921
+};
+
+/*
+ * Volume type constants used in the volume identifier header.
+ *
+ * @UBI_VID_DYNAMIC: dynamic volume
+ * @UBI_VID_STATIC: static volume
+ */
+enum {
+	UBI_VID_DYNAMIC = 1,
+	UBI_VID_STATIC  = 2
+};
+
+/*
+ * Compatibility constants used by internal volumes.
+ *
+ * @UBI_COMPAT_DELETE: delete this internal volume before anything is written
+ * to the flash
+ * @UBI_COMPAT_RO: attach this device in read-only mode
+ * @UBI_COMPAT_PRESERVE: preserve this internal volume - do not touch its
+ * physical eraseblocks, don't allow the wear-leveling unit to move them
+ * @UBI_COMPAT_REJECT: reject this UBI image
+ */
+enum {
+	UBI_COMPAT_DELETE   = 1,
+	UBI_COMPAT_RO       = 2,
+	UBI_COMPAT_PRESERVE = 4,
+	UBI_COMPAT_REJECT   = 5
+};
+
+/*
+ * ubi16_t/ubi32_t/ubi64_t - 16, 32, and 64-bit integers used in UBI on-flash
+ * data structures.
+ */
+typedef struct {
+	uint16_t int16;
+} __attribute__ ((packed)) ubi16_t;
+
+typedef struct {
+	uint32_t int32;
+} __attribute__ ((packed)) ubi32_t;
+
+typedef struct {
+	uint64_t int64;
+} __attribute__ ((packed)) ubi64_t;
+
+/*
+ * This implementation of UBI uses the big-endian format for on-flash
+ * integers. Below are the corresponding conversion macros.
+ */
+#define cpu_to_ubi16(x) ((ubi16_t){__cpu_to_be16(x)})
+#define ubi16_to_cpu(x) ((uint16_t)__be16_to_cpu((x).int16))
+
+#define cpu_to_ubi32(x) ((ubi32_t){__cpu_to_be32(x)})
+#define ubi32_to_cpu(x) ((uint32_t)__be32_to_cpu((x).int32))
+
+#define cpu_to_ubi64(x) ((ubi64_t){__cpu_to_be64(x)})
+#define ubi64_to_cpu(x) ((uint64_t)__be64_to_cpu((x).int64))
+
+/* Sizes of UBI headers */
+#define UBI_EC_HDR_SIZE  sizeof(struct ubi_ec_hdr)
+#define UBI_VID_HDR_SIZE sizeof(struct ubi_vid_hdr)
+
+/* Sizes of UBI headers without the ending CRC */
+#define UBI_EC_HDR_SIZE_CRC  (UBI_EC_HDR_SIZE  - sizeof(ubi32_t))
+#define UBI_VID_HDR_SIZE_CRC (UBI_VID_HDR_SIZE - sizeof(ubi32_t))
+
+/**
+ * struct ubi_ec_hdr - UBI erase counter header.
+ *
+ * @magic: the erase counter header magic number (%UBI_EC_HDR_MAGIC)
+ * @version: the version of UBI implementation which is supposed to accept this
+ * UBI image (%UBI_VERSION)
+ * @padding1: reserved for future, zeroes
+ * @ec: the erase counter
+ * @vid_hdr_offset: where the VID header starts
+ * @data_offset: where the user data starts
+ * @padding2: reserved for future, zeroes
+ * @hdr_crc: the erase counter header CRC checksum
+ *
+ * The erase counter header takes 64 bytes and has plenty of unused space for
+ * future usage. The unused fields are zeroed. The @version field is used to
+ * indicate the version of UBI implementation which is supposed to be able to
+ * work with this UBI image. If @version is greater than the current UBI
+ * version, the image is rejected. This may be useful in the future if
+ * something is changed radically. This field is duplicated in the volume
+ * identifier header.
+ *
+ * The @vid_hdr_offset and @data_offset fields contain the offsets of the
+ * volume identifier header and user data, relative to the beginning of the
+ * eraseblock. These values have to be the same for all eraseblocks.
+ */
+struct ubi_ec_hdr {
+	ubi32_t magic;
+	uint8_t version;
+	uint8_t padding1[3];
+	ubi64_t ec; /* Warning: the current limit is 31-bit anyway! */
+	ubi32_t vid_hdr_offset;
+	ubi32_t data_offset;
+	uint8_t padding2[36];
+	ubi32_t hdr_crc;
+} __attribute__ ((packed));
+
+/**
+ * struct ubi_vid_hdr - on-flash UBI volume identifier header.
+ *
+ * @magic: volume identifier header magic number (%UBI_VID_HDR_MAGIC)
+ * @version: UBI implementation version which is supposed to accept this UBI
+ * image (%UBI_VERSION)
+ * @vol_type: volume type (%UBI_VID_DYNAMIC or %UBI_VID_STATIC)
+ * @copy_flag: a flag indicating if this physical eraseblock was created by
+ * means of copying an original physical eraseblock to ensure wear-leveling.
+ * @compat: compatibility of this volume (%UBI_COMPAT_DELETE,
+ * %UBI_COMPAT_RO, %UBI_COMPAT_PRESERVE, or %UBI_COMPAT_REJECT)
+ * @vol_id: ID of this volume
+ * @lnum: logical eraseblock number
+ * @leb_ver: eraseblock copy number
+ * @data_size: how many bytes of data this eraseblock contains
+ * @used_ebs: total number of used logical eraseblocks in this volume
+ * @data_pad: how many bytes at the end of this eraseblock are not used
+ * @data_crc: CRC checksum of the data stored in this eraseblock
+ * @padding1: reserved for future, zeroes
+ * @hdr_crc: volume identifier header CRC checksum
+ *
+ * The @leb_ver and the @copy_flag fields are used to distinguish between older
+ * and newer copies of the logical eraseblock, as well as to guarantee
+ * robustness against unclean reboots. As UBI erases logical eraseblocks
+ * asynchronously, in background, it has to distinguish between older and newer
+ * copies of logical eraseblocks. This is done using the @leb_ver field. On the
+ * other hand, when UBI moves data of an eraseblock, its version is also
+ * increased and the @copy_flag is set to 1. Additionally, when moving data of
+ * eraseblocks, UBI calculates data CRC and stores it in the @data_crc field,
+ * even for dynamic volumes.
+ *
+ * Thus, if there are 2 physical eraseblocks belonging to the logical
+ * eraseblock (same volume ID and logical eraseblock number), UBI uses the
+ * following algorithm to pick one of them. It first picks the one with larger
+ * version (say, A). If @copy_flag is not set, then A is picked. If @copy_flag
+ * is set, UBI checks the CRC of data of this physical eraseblock (@data_crc).
+ * This is needed to ensure that the copying was finished. If the CRC is all
+ * right, A is picked. If not, the older physical eraseblock is picked.
+ *
+ * Note, the @leb_ver field may overflow. Thus, if you have 2 versions X and Y,
+ * then X > Y if abs(X-Y) < 0x7FFFFFFF, otherwise X < Y.
+ *
+ * There are 2 sorts of volumes in UBI: user volumes and internal volumes.
+ * Internal volumes are not seen from outside and are used for various internal
+ * UBI purposes. In this implementation there is only one internal volume - the
+ * layout volume. Internal volumes are the main mechanism of UBI extensions.
+ * For example, in future one may introduce a journal internal volume. Internal
+ * volumes have their own reserved range of IDs.
+ *
+ * The @compat field is only used for internal volumes and contains the "degree
+ * of their compatibility". It is always zero for user volumes. This field
+ * provides a mechanism to introduce UBI extensions and to be still compatible
+ * with older UBI binaries. For example, if someone introduced a journal in
+ * future, he would probably use %UBI_COMPAT_DELETE compatibility for the
+ * journal volume.  And in this case, older UBI binaries, which know nothing
+ * about the journal volume, would just delete this volume and work perfectly
+ * fine. This is somewhat similar to what Ext2fs does when it is fed by an
+ * Ext3fs image - it just ignores the Ext3fs journal.
+ *
+ * The @data_crc field contains the CRC checksum of the contents of the logical
+ * eraseblock if this is a static volume. In case of dynamic volumes, it does
+ * not contain the CRC checksum as a rule. The only exception is when the
+ * data of the physical eraseblock was moved by the wear-leveling unit, then
+ * the wear-leveling unit calculates the data CRC and stores it in the
+ * @data_crc field. And of course, the @copy_flag is %1 in this case.
+ *
+ * The @data_size field is used only for static volumes because UBI has to know
+ * how many bytes of data are stored in this eraseblock. For dynamic volumes,
+ * this field usually contains zero. The only exception is when the data of the
+ * physical eraseblock was moved to another physical eraseblock for
+ * wear-leveling reasons. In this case, UBI calculates CRC checksum of the
+ * contents and uses both @data_crc and @data_size fields. In this case, the
+ * @data_size field contains data size.
+ *
+ * The @used_ebs field is used only for static volumes and indicates how many
+ * eraseblocks the data of the volume takes. For dynamic volumes this field is
+ * not used and always contains zero.
+ *
+ * The @data_pad is calculated when volumes are created using the alignment
+ * parameter. So, effectively, the @data_pad field reduces the size of logical
+ * eraseblocks of this volume. This is very handy when one uses block-oriented
+ * software (say, cramfs) on top of the UBI volume.
+ */
+struct ubi_vid_hdr {
+	ubi32_t magic;
+	uint8_t version;
+	uint8_t vol_type;
+	uint8_t copy_flag;
+	uint8_t compat;
+	ubi32_t vol_id;
+	ubi32_t lnum;
+	ubi32_t leb_ver;
+	ubi32_t data_size;
+	ubi32_t used_ebs;
+	ubi32_t data_pad;
+	ubi32_t data_crc;
+	uint8_t padding1[24];
+	ubi32_t hdr_crc;
+} __attribute__ ((packed));
+
+/* Count of internal UBI volumes */
+#define UBI_INT_VOL_COUNT 1
+
+/*
+ * IDs of internal volumes start from this number. There is room reserved for
+ * 4096 internal volumes.
+ */
+#define UBI_INTERNAL_VOL_START (0x7FFFFFFF - 4096)
+
+/*
+ * IDs of internal UBI volumes.
+ *
+ * %UBI_LAYOUT_VOL_ID: layout volume ID
+ */
+enum {
+	UBI_LAYOUT_VOL_ID = UBI_INTERNAL_VOL_START,
+};
+
+#define UBI_LAYOUT_VOLUME_EBS    2
+#define UBI_LAYOUT_VOLUME_NAME   "The layout volume"
+#define UBI_LAYOUT_VOLUME_COMPAT UBI_COMPAT_REJECT
+
+/* The maximum number of volumes per one UBI device */
+#define UBI_MAX_VOLUMES 128
+
+/* The maximum volume name length */
+#define UBI_VOL_NAME_MAX 127
+
+/* Size of the volume table record */
+#define UBI_VTBL_RECORD_SIZE sizeof(struct ubi_vol_tbl_record)
+
+/* Size of the volume table record without the ending CRC */
+#define UBI_VTBL_RECORD_SIZE_CRC (UBI_VTBL_RECORD_SIZE - sizeof(ubi32_t))
+
+/**
+ * struct ubi_vol_tbl_record - a record in the volume table.
+ *
+ * @reserved_pebs: how many physical eraseblocks are reserved for this volume
+ * @alignment: volume alignment
+ * @data_pad: how many bytes are unused at the end of each eraseblock to
+ * satisfy the requested alignment
+ * @vol_type: volume type (%UBI_DYNAMIC_VOLUME or %UBI_STATIC_VOLUME)
+ * @upd_marker: the update marker flag
+ * @name_len: the volume name length
+ * @name: the volume name
+ * @padding2: reserved, zeroes
+ * @crc: a CRC32 checksum of the record
+ *
+ * The layout volume consists of 2 logical eraseblocks, each of which contains
+ * the volume table (i.e., the volume table is duplicated). The volume table is
+ * an array of &struct ubi_vol_tbl_record objects indexed by the volume ID.
+ *
+ * If the size of the logical eraseblock is large enough to fit
+ * %UBI_MAX_VOLUMES, the volume table contains %UBI_MAX_VOLUMES records.
+ * Otherwise, it contains as many records as it can fit (i.e., size of logical
+ * eraseblock divided by sizeof(struct ubi_vol_tbl_record)).
+ *
+ * The @upd_marker flag is used to implement volume update. It is set to %1
+ * before update and set to %0 after the update.
+ *
+ * The @alignment field is specified when the volume is created and cannot be
+ * later changed. It may be useful, for example, when a block-oriented file
+ * system works on top of UBI. The @data_pad field is calculated using the
+ * logical eraseblock size and @alignment. The alignment must be a multiple of
+ * the minimal flash I/O unit. If @alignment is 1, all the available space of
+ * eraseblocks is used.
+ *
+ * Empty records contain all zeroes and the CRC checksum of those zeroes.
+ */
+struct ubi_vol_tbl_record {
+	ubi32_t reserved_pebs;
+	ubi32_t alignment;
+	ubi32_t data_pad;
+	uint8_t vol_type;
+	uint8_t upd_marker;
+	ubi16_t name_len;
+	uint8_t name[UBI_VOL_NAME_MAX+1];
+	uint8_t padding2[24];
+	ubi32_t crc;
+} __attribute__ ((packed));
+
+#endif /* !__UBI_HEADER_H__ */
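
The copy selection rules from the struct ubi_vid_hdr comment can be sketched
roughly as below. This is only an illustration, not the code from scan.c:
struct peb_info, leb_ver_newer() and pick_peb() are hypothetical names, the
data CRC check is assumed to have been done by the caller, and the wrap-around
comparison reflects one reading of the "abs(X-Y) < 0x7FFFFFFF" note (a small
distance means plain numeric order, a large one means the counter has
wrapped). What happens when both versions are equal is not described, so the
sketch simply falls back to the second eraseblock in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the VID header fields used below. */
struct peb_info {
	uint32_t leb_ver;  /* @leb_ver, already converted to CPU byte order */
	bool copy_flag;    /* @copy_flag */
	bool data_crc_ok;  /* did the data match @data_crc? */
};

/*
 * Return true if version x is newer than version y, allowing for overflow
 * of the 32-bit counter.
 */
static bool leb_ver_newer(uint32_t x, uint32_t y)
{
	uint32_t dist = x > y ? x - y : y - x;

	if (dist < 0x7FFFFFFF)
		return x > y;   /* no overflow: plain numeric order */
	return x < y;           /* overflow: the smaller value is newer */
}

/* Pick one of two physical eraseblocks holding the same logical eraseblock. */
static const struct peb_info *pick_peb(const struct peb_info *a,
				       const struct peb_info *b)
{
	const struct peb_info *newer, *older;

	assert(a != NULL && b != NULL);

	newer = leb_ver_newer(a->leb_ver, b->leb_ver) ? a : b;
	older = newer == a ? b : a;

	/* A newer eraseblock that is not a copy is always trusted. */
	if (!newer->copy_flag)
		return newer;

	/* A copy is trusted only if the copying was finished (CRC is fine). */
	return newer->data_crc_ok ? newer : older;
}
```

The point of the CRC check is that an interrupted copy must never win over
the intact original, even though it carries the larger version.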


* [PATCH 03/44 take 2] [UBI] user-space API header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
  2007-02-17 16:54 ` [PATCH 01/44 take 2] [UBI] Linux build integration Artem Bityutskiy
  2007-02-17 16:54 ` [PATCH 02/44 take 2] [UBI] on-flash data structures header Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-17 21:27   ` Arnd Bergmann
  2007-02-17 16:54 ` [PATCH 04/44 take 2] [UBI] kernel-spce " Artem Bityutskiy
                   ` (43 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/include/mtd/ubi-user.h tmp-to/include/mtd/ubi-user.h
--- tmp-from/include/mtd/ubi-user.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/include/mtd/ubi-user.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,161 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#ifndef __UBI_USER_H__
+#define __UBI_USER_H__
+
+/*
+ * UBI volume creation
+ * ~~~~~~~~~~~~~~~~~~~
+ *
+ * UBI volumes are created via the %UBI_IOCMKVOL IOCTL command of the UBI
+ * character device. A &struct ubi_mkvol_req object has to be properly filled
+ * and a pointer to it has to be passed to the IOCTL.
+ *
+ * UBI volume deletion
+ * ~~~~~~~~~~~~~~~~~~~
+ *
+ * To delete a volume, the %UBI_IOCRMVOL IOCTL command of the UBI character
+ * device should be used. A pointer to the 32-bit volume ID has to be passed
+ * to the IOCTL.
+ *
+ * UBI volume re-size
+ * ~~~~~~~~~~~~~~~~~~
+ *
+ * To re-size a volume, the %UBI_IOCRSVOL IOCTL command of the UBI character
+ * device should be used. A &struct ubi_rsvol_req object has to be properly
+ * filled and a pointer to it has to be passed to the IOCTL.
+ *
+ * UBI volume update
+ * ~~~~~~~~~~~~~~~~~
+ *
+ * Volume update should be done via the %UBI_IOCVOLUP IOCTL command of the
+ * corresponding UBI volume character device. A pointer to a 64-bit update
+ * size should be passed to the IOCTL. After that, UBI expects the user to
+ * write this number of bytes to the volume character device. The update is
+ * finished when the claimed number of bytes has been written. So, the volume
+ * update sequence is something like:
+ *
+ * fd = open("/dev/my_volume");
+ * ioctl(fd, UBI_IOCVOLUP, &image_size);
+ * write(fd, buf, image_size);
+ * close(fd);
+ */
+
+/*
+ * When a new volume is created, users may either specify the volume number they
+ * want to create or let UBI automatically assign a volume number using this
+ * constant.
+ */
+#define UBI_VOL_NUM_AUTO (-1)
+
+/* Maximum volume name length */
+#define UBI_MAX_VOLUME_NAME 127
+
+/* IOCTL commands of UBI character devices */
+
+#define UBI_IOC_MAGIC 'o'
+
+/* Create an UBI volume */
+#define UBI_IOCMKVOL _IOW(UBI_IOC_MAGIC, 0, struct ubi_mkvol_req)
+/* Remove an UBI volume */
+#define UBI_IOCRMVOL _IOW(UBI_IOC_MAGIC, 1, int32_t)
+/* Re-size an UBI volume */
+#define UBI_IOCRSVOL _IOW(UBI_IOC_MAGIC, 2, struct ubi_rsvol_req)
+
+/* IOCTL commands of UBI volume character devices */
+
+#define UBI_VOL_IOC_MAGIC 'O'
+
+/* Start UBI volume update */
+#define UBI_IOCVOLUP _IOW(UBI_VOL_IOC_MAGIC, 0, int64_t)
+/* An eraseblock erasure command, used for debugging, disabled by default */
+#define UBI_IOCEBER _IOW(UBI_VOL_IOC_MAGIC, 1, int32_t)
+
+/*
+ * UBI volume type constants.
+ *
+ * @UBI_DYNAMIC_VOLUME: dynamic volume
+ * @UBI_STATIC_VOLUME:  static volume
+ */
+enum {
+	UBI_DYNAMIC_VOLUME = 3,
+	UBI_STATIC_VOLUME = 4
+};
+
+/**
+ * struct ubi_mkvol_req - volume description data structure used in
+ * volume creation requests.
+ *
+ * @vol_id: volume number
+ * @alignment: volume alignment
+ * @bytes: volume size in bytes
+ * @vol_type: volume type (%UBI_DYNAMIC_VOLUME or %UBI_STATIC_VOLUME)
+ * @padding: reserved for future, not used
+ * @name_len: volume name length
+ * @name: volume name
+ *
+ * This structure is used by userspace programs when creating new volumes. The
+ * @used_bytes field is only necessary when creating static volumes.
+ *
+ * The @alignment field specifies the required alignment of the volume logical
+ * eraseblock. This means that the size of logical eraseblocks will be aligned
+ * to this number, i.e.,
+ * 	(UBI device logical eraseblock size) mod (@alignment) = 0.
+ *
+ * To put it differently, the logical eraseblock of this volume may be slightly
+ * shortened in order to make it properly aligned. The alignment has to be a
+ * multiple of the flash minimal input/output unit, or %1 to utilize the entire
+ * available space of logical eraseblocks.
+ *
+ * The @alignment field may be useful, for example, when one wants to maintain
+ * a block device on top of an UBI volume. In this case, it is desirable to fit
+ * an integer number of blocks in logical eraseblocks of this UBI volume. With
+ * alignment it is possible to update this volume using plain UBI volume image
+ * BLOBs, without caring about how to properly write them.
+ */
+struct ubi_mkvol_req {
+	int32_t vol_id;
+	int32_t alignment;
+	int64_t bytes;
+	int8_t vol_type;
+	int8_t padding[9];
+	int16_t name_len;
+	__user const char *name;
+} __attribute__ ((packed));
+
+/**
+ * struct ubi_rsvol_req - a data structure used in volume re-size requests.
+ *
+ * @vol_id: ID of the volume to re-size
+ * @bytes: new size of the volume in bytes
+ *
+ * Re-sizing is possible for both dynamic and static volumes. But while dynamic
+ * volumes may be re-sized arbitrarily, static volumes cannot be made to be
+ * smaller than the number of bytes they bear. To arbitrarily shrink a static
+ * volume, it must be wiped out first (by means of volume update operation with
+ * zero number of bytes).
+ */
+struct ubi_rsvol_req {
+	int64_t bytes;
+	int32_t vol_id;
+} __attribute__ ((packed));
+
+#endif /* __UBI_USER_H__ */
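
For illustration, the volume update sequence described in the ubi-user.h
comment above could look roughly like this from user space. This is only a
sketch under assumptions: it supposes the volume character device can be
opened with O_RDWR, and it copies UBI_VOL_IOC_MAGIC and UBI_IOCVOLUP from the
patch so that it builds without the new header. write_all() and
ubi_update_volume() are hypothetical helper names, not part of UBI.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Copied from the ubi-user.h patch; normally provided by <mtd/ubi-user.h> */
#define UBI_VOL_IOC_MAGIC 'O'
#define UBI_IOCVOLUP _IOW(UBI_VOL_IOC_MAGIC, 0, int64_t)

/*
 * Hypothetical helper: write exactly @len bytes from @buf to @fd, retrying
 * on short writes and EINTR. Returns 0 on success, -1 with errno set on
 * failure.
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {
			/* No progress at all: give up instead of spinning */
			errno = EIO;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: replace the contents of the volume behind @vol_path
 * with the @image_size bytes stored at @image. Returns 0 on success and -1
 * on failure.
 */
static int ubi_update_volume(const char *vol_path, const void *image,
			     int64_t image_size)
{
	int fd, err;

	/* Assumption: the volume node is opened read-write for an update */
	fd = open(vol_path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Announce how many bytes will follow, then write exactly that many */
	err = ioctl(fd, UBI_IOCVOLUP, &image_size);
	if (!err)
		err = write_all(fd, image, (size_t)image_size);

	if (close(fd) < 0)
		err = -1;
	return err ? -1 : 0;
}
```

A caller would invoke something like ubi_update_volume("/dev/my_volume", buf,
image_size). The short-write loop matters because the update only finishes
once the announced number of bytes has been written.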

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 04/44 take 2] [UBI] kernel-spce API header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (2 preceding siblings ...)
  2007-02-17 16:54 ` [PATCH 03/44 take 2] [UBI] user-space API header Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-18  1:32   ` Greg KH
  2007-02-17 16:54 ` [PATCH 05/44 take 2] [UBI] internal common header Artem Bityutskiy
                   ` (42 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/include/linux/mtd/ubi.h tmp-to/include/linux/mtd/ubi.h
--- tmp-from/include/linux/mtd/ubi.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/include/linux/mtd/ubi.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,391 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#ifndef __LINUX_UBI_H__
+#define __LINUX_UBI_H__
+
+#include <linux/types.h>
+#include <asm/ioctl.h>
+#include <mtd/ubi-user.h>
+
+/**
+ * enum ubi_data_type - UBI data type hint constants.
+ *
+ * @UBI_DATA_LONGTERM: long-term data
+ * @UBI_DATA_SHORTTERM: short-term data
+ * @UBI_DATA_UNKNOWN: data persistence is unknown
+ *
+ * These constants are used when data is written to UBI volumes in order to
+ * help the UBI wear-leveling unit to find more appropriate physical
+ * eraseblocks.
+ */
+enum ubi_data_type {
+	UBI_DATA_LONGTERM = 1,
+	UBI_DATA_SHORTTERM,
+	UBI_DATA_UNKNOWN
+};
+
+/**
+ * enum ubi_open_mode - UBI volume open mode constants.
+ *
+ * @UBI_READONLY: read-only mode
+ * @UBI_READWRITE: read-write mode
+ * @UBI_EXCLUSIVE: exclusive mode
+ */
+enum ubi_open_mode {
+	UBI_READONLY = 1,
+	UBI_READWRITE,
+	UBI_EXCLUSIVE
+};
+
+/**
+ * struct ubi_vol_info - UBI volume description data structure.
+ *
+ * @vol_id: volume ID
+ * @ubi_num: UBI device number this volume belongs to
+ * @size: how many physical eraseblocks are reserved for this volume
+ * @used_bytes: how many bytes of data this volume contains
+ * @used_ebs: how many physical eraseblocks of this volume actually contain any
+ * data
+ * @vol_type: volume type (%UBI_DYNAMIC_VOLUME or %UBI_STATIC_VOLUME)
+ * @corrupted: non-zero if the volume is corrupted (static volumes only)
+ * @upd_marker: non-zero if the volume has update marker set
+ * @alignment: volume alignment
+ * @usable_leb_size: how many bytes are available in logical eraseblocks of
+ * this volume
+ * @name_len: volume name length
+ * @name: volume name
+ * @cdev: UBI volume character device major and minor numbers
+ *
+ * The @corrupted flag is only relevant to static volumes and is always zero
+ * for dynamic ones. This is because UBI does not care about dynamic volume
+ * data protection and only cares about protecting static volume data.
+ *
+ * The @upd_marker flag is set if the volume update operation was interrupted.
+ * Before touching the volume data during the update operation, UBI first sets
+ * the update marker flag for this volume. If the volume update operation was
+ * further interrupted, the update marker indicates this. If the update marker
+ * is set, the contents of the volume are certainly damaged and a new volume
+ * update operation has to be started.
+ *
+ * To put it differently, @corrupted and @upd_marker fields have different
+ * semantics:
+ *     o the @corrupted flag means that this static volume is corrupted for some
+ *       reason, but not because of an interrupted volume update
+ *     o the @upd_marker field means that the volume is damaged because of an
+ *       interrupted update operation.
+ *
+ * I.e., the @corrupted flag is never set if the @upd_marker flag is set.
+ *
+ * The @used_bytes and @used_ebs fields are only really needed for static volumes
+ * and contain the number of bytes stored in this static volume and how many
+ * eraseblocks this data occupies. In case of dynamic volumes, the @used_bytes
+ * field is equivalent to @size*@usable_leb_size, and the @used_ebs field is
+ * equivalent to @size.
+ *
+ * In general, logical eraseblock size is a property of the UBI device, not
+ * of the UBI volume. Indeed, the logical eraseblock size depends on the
+ * physical eraseblock size and on how many bytes UBI headers consume. But
+ * because of the volume alignment (@alignment), the usable size of logical
+ * eraseblocks of a volume may be less. The following equation is true:
+ * 	@usable_leb_size = LEB size - (LEB size mod @alignment),
+ * where LEB size is the logical eraseblock size defined by the UBI device.
+ *
+ * The alignment is a multiple of the minimal flash input/output unit size, or
+ * %1 if all the available space is used.
+ *
+ * To put this differently, alignment may be considered as a way to change
+ * volume logical eraseblock sizes.
+ *
+ */
+struct ubi_vol_info {
+	int ubi_num;
+	int vol_id;
+	int size;
+	long long used_bytes;
+	int used_ebs;
+	int vol_type;
+	int corrupted;
+	int upd_marker;
+	int alignment;
+	int usable_leb_size;
+	int name_len;
+	const char *name;
+	dev_t cdev;
+};
+
+/**
+ * struct ubi_dev_info - UBI device description data structure.
+ *
+ * @ubi_num: ubi device number
+ * @leb_size: logical eraseblock size on this UBI device
+ * @min_io_size: minimal I/O unit size
+ * @ro_mode: if this device is in read-only mode
+ * @cdev: UBI character device major and minor numbers
+ *
+ * Note, @leb_size is the logical eraseblock size offered by the UBI device.
+ * Volumes of this UBI device may have smaller logical eraseblock size if their
+ * alignment is not equivalent to %1.
+ */
+struct ubi_dev_info {
+	int ubi_num;
+	int leb_size;
+	int min_io_size;
+	int ro_mode;
+	dev_t cdev;
+};
+
+/**
+ * ubi_get_device_info - get information about an UBI device.
+ *
+ * @ubi_num: UBI device number
+ * @di: the device information is returned here
+ *
+ * This function returns 0 in case of success and %-ENODEV if there is no
+ * such UBI device.
+ */
+int ubi_get_device_info(int ubi_num, struct ubi_dev_info *di);
+
+/* UBI descriptor given to users when they open UBI volumes */
+struct ubi_vol_desc;
+
+/**
+ * ubi_get_volume_info - get information about an UBI volume.
+ *
+ * @udesc: volume descriptor
+ * @vi: the volume information is returned here
+ */
+void ubi_get_volume_info(struct ubi_vol_desc *udesc, struct ubi_vol_info *vi);
+
+/**
+ * ubi_open_volume - open an UBI volume.
+ *
+ * @ubi_num: the UBI device number
+ * @vol_id: ID of the volume to open
+ * @mode: volume open mode
+ *
+ * This function opens a UBI volume. The @mode parameter specifies if the
+ * volume is opened in read-only mode, read-write mode, or exclusive mode. The
+ * exclusive mode means that nobody else will be allowed to open this volume.
+ * Note, UBI allows many volume readers and one writer at a time. And
+ * note, static volumes may only be opened in read-only mode.
+ *
+ * In case of success, this function returns an UBI volume descriptor. In case
+ * of failure, the following error codes may be returned:
+ *
+ * o %-EBUSY if the volume is busy (it is being updated, or it is already
+ *   opened in exclusive or read-write mode by somebody else);
+ * o %-EINVAL if the input arguments are invalid;
+ * o %-ENODEV if this volume does not exist or the UBI device does not exist;
+ * o other negative error codes in case of other errors.
+ *
+ * Additional note: if the volume is being opened for the first time since the
+ * last boot, it is fully checked by this function. It's a feature of current
+ * implementation.
+ */
+struct ubi_vol_desc *ubi_open_volume(int ubi_num, int vol_id,
+				     enum ubi_open_mode mode);
+
+/**
+ * ubi_open_volume_nm - open an UBI volume by volume name.
+ *
+ * @ubi_num: the UBI device number
+ * @name: volume name
+ * @mode: volume open mode
+ *
+ * This function is similar to the 'ubi_open_volume()' function, but opens UBI
+ * volumes by name.
+ */
+struct ubi_vol_desc *ubi_open_volume_nm(int ubi_num, const char *name,
+					enum ubi_open_mode mode);
+
+/**
+ * ubi_close_volume - close an UBI volume.
+ *
+ * @udesc: UBI volume descriptor
+ */
+void ubi_close_volume(struct ubi_vol_desc *udesc);
+
+/**
+ * ubi_eraseblock_read - read data from a logical eraseblock.
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to read from
+ * @buf: a buffer where to store the read data
+ * @offset: the offset within the logical eraseblock from where to read
+ * @len: how many bytes to read
+ * @check: whether UBI has to check the read data's CRC or not.
+ *
+ * This function reads data from offset @offset of the logical eraseblock @lnum
+ * and stores the read data at @buf. When reading from static volumes, @check
+ * may be used to specify whether the read data has to be checked or not. If
+ * checking is requested, the whole logical eraseblock will be read and its CRC
+ * checksum will be checked, so checking may substantially slow down the read
+ * speed. The @check argument is ignored in case of dynamic volumes.
+ *
+ * In case of success, this function returns zero. In case of error, this
+ * function returns a negative error code. A special %-EBADMSG error code is
+ * returned:
+ *
+ * o for both static and dynamic volumes if the MTD driver has detected a data
+ * integrity problem, unrecoverable ECC checksum mismatch in case of NAND;
+ * o for static volumes if the data CRC mismatches.
+ *
+ * If a corrupted static volume is read (i.e., the @corrupted flag is set in
+ * its description object), but the data were read from flash without errors
+ * because this particular eraseblock is not corrupted, this function returns
+ * %-EUCLEAN, not zero. This just indicates that the read static volume is
+ * corrupted. But the read data is actually OK.
+ *
+ * Note, if a volume is damaged because of an interrupted update (the
+ * @upd_marker flag is set) this function just returns immediately with %-EBADF
+ * error code. In other words, volumes like that cannot be read before re-doing
+ * the update operation.
+ */
+int ubi_eraseblock_read(struct ubi_vol_desc *udesc, int lnum, char *buf,
+			int offset, int len, int check);
+
+/**
+ * ubi_read - read data from a logical eraseblock (simplified).
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to read from
+ * @buf: a buffer where to store the read data
+ * @offset: the offset within the logical eraseblock from where to read
+ * @len: how many bytes to read
+ *
+ * This function is the same as the 'ubi_eraseblock_read()' function, but it
+ * does not provide the checking capability.
+ */
+static inline int ubi_read(struct ubi_vol_desc *udesc, int lnum, char *buf,
+			   int offset, int len)
+{
+	return ubi_eraseblock_read(udesc, lnum, buf, offset, len, 0);
+}
+
+/**
+ * ubi_eraseblock_write - write data to a logical eraseblock.
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to write to
+ * @buf: the data to write
+ * @offset: offset within the logical eraseblock where to write
+ * @len: how many bytes from @buf to write
+ * @dtype: expected data type
+ *
+ * This function writes @len bytes of data from buffer @buf to offset @offset
+ * of logical eraseblock @lnum. The @dtype argument describes the expected
+ * lifetime of the data being written.
+ *
+ * Note, this function takes care of write failures. If a write to the physical
+ * eraseblock (the one this logical eraseblock is mapped to) fails, the logical
+ * eraseblock is re-mapped to another physical eraseblock, the data is
+ * recovered, and the write finishes.
+ *
+ * If all the data were successfully written, zero is returned. If an error
+ * occurred, this function returns a negative error code. Note, in case of an
+ * error, it is possible that something was still written to the flash media,
+ * but it may be garbage.
+ */
+int ubi_eraseblock_write(struct ubi_vol_desc *udesc, int lnum, const void *buf,
+			 int offset, int len, enum ubi_data_type dtype);
+
+/**
+ * ubi_write - write data to a logical eraseblock (simplified).
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to write to
+ * @buf: the data to write
+ * @offset: offset within the logical eraseblock where to write
+ * @len: how many bytes from @buf to write
+ *
+ * This function is the same as the 'ubi_eraseblock_write()' function, but it
+ * does not have the data type argument.
+ */
+static inline int ubi_write(struct ubi_vol_desc *udesc, int lnum,
+			    const void *buf, int offset, int len)
+{
+	return ubi_eraseblock_write(udesc, lnum, buf, offset, len,
+				    UBI_DATA_UNKNOWN);
+}
+
+/**
+ * ubi_eraseblock_erase - erase a logical eraseblock.
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to erase
+ *
+ * This function un-maps logical eraseblock @lnum and synchronously erases the
+ * corresponding physical eraseblock. Returns zero in case of success and a
+ * negative error code in case of failure.
+ *
+ * Note, UBI normally erases eraseblocks asynchronously, which is what
+ * 'ubi_eraseblock_unmap()' relies on. This function, in contrast, returns only
+ * after the physical eraseblock has been erased, so it is slower than
+ * 'ubi_eraseblock_unmap()'.
+ */
+int ubi_eraseblock_erase(struct ubi_vol_desc *udesc, int lnum);
+
+/**
+ * ubi_eraseblock_unmap - unmap a logical eraseblock.
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to unmap
+ *
+ * This function un-maps logical eraseblock @lnum and schedules the
+ * corresponding physical eraseblock for erasure, so that it will eventually be
+ * physically erased, probably in background. So this operation is much faster
+ * than the synchronous erase.
+ *
+ * Unlike the synchronous erase, the unmap operation does not guarantee that
+ * the logical eraseblock will contain all 0xFF bytes when UBI is initialized
+ * again. For example, if several logical eraseblocks are unmapped, then an
+ * unclean reboot happens, the logical eraseblocks will not necessarily be
+ * unmapped again. They may actually be mapped to the same physical eraseblocks
+ * again. So, this call has to be used with care.
+ *
+ * The main and obvious use-case of this call is re-writing the contents of a
+ * logical eraseblock. Then it is much more efficient to first unmap it, then
+ * write new data, rather than first erase it, then write new data. Note, once
+ * new data is written to the logical eraseblock, UBI guarantees that the old
+ * contents are gone forever. In other words, if an unclean reboot happens
+ * after the logical eraseblock was unmapped and then written to, it will
+ * contain the last written data.
+ */
+int ubi_eraseblock_unmap(struct ubi_vol_desc *udesc, int lnum);
+
+/**
+ * ubi_eraseblock_is_mapped - check if a logical eraseblock is mapped.
+ *
+ * @udesc: volume descriptor
+ * @lnum: the logical eraseblock number to check
+ *
+ * This function checks if a logical eraseblock is mapped to a physical
+ * eraseblock. Unmapped logical eraseblocks are equivalent to erased logical
+ * eraseblocks and contain only 0xFF bytes. Mapped logical eraseblocks are
+ * those that were explicitly written to. They may also contain only 0xFF
+ * bytes if these were written.
+ *
+ * This function returns %1 if the LEB is mapped, %0 if not, and a negative
+ * error code in case of failure.
+ */
+int ubi_eraseblock_is_mapped(struct ubi_vol_desc *udesc, int lnum);
+
+#endif /* !__LINUX_UBI_H__ */
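
As a worked example of the alignment arithmetic from the struct ubi_vol_info
comment above, the two relations given there (the @usable_leb_size equation
and @used_bytes = @size * @usable_leb_size for dynamic volumes) can be written
out as below. This is a stand-alone sketch, not code from the patch set: the
function names are hypothetical, and the sizes in the note that follows are
made-up numbers chosen only to make the arithmetic visible.

```c
/*
 * Usable LEB size of a volume, following the equation from the
 * struct ubi_vol_info comment:
 *     usable_leb_size = LEB size - (LEB size mod alignment)
 * Returns -1 if either argument is not positive.
 */
static int ubi_usable_leb_size(int leb_size, int alignment)
{
	if (leb_size <= 0 || alignment <= 0)
		return -1;
	return leb_size - leb_size % alignment;
}

/*
 * For dynamic volumes the comment states that @used_bytes is equivalent to
 * @size * @usable_leb_size. The multiplication is done in long long, like
 * the @used_bytes field, so that large volumes do not overflow an int.
 */
static long long ubi_dynamic_used_bytes(int size, int usable_leb_size)
{
	return (long long)size * usable_leb_size;
}
```

With a hypothetical LEB size of 130944 bytes and an alignment of 512, the
usable LEB size is 130944 - (130944 mod 512) = 130944 - 384 = 130560 bytes,
and a dynamic volume of 100 eraseblocks then reports 13056000 used bytes. With
an alignment of 1 nothing is lost and the usable size stays at 130944.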

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (3 preceding siblings ...)
  2007-02-17 16:54 ` [PATCH 04/44 take 2] [UBI] kernel-spce " Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-17 21:05   ` Arnd Bergmann
  2007-02-19 10:54   ` Christoph Hellwig
  2007-02-17 16:54 ` [PATCH 06/44 take 2] [UBI] startup code Artem Bityutskiy
                   ` (41 subsequent siblings)
  46 siblings, 2 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/ubi.h tmp-to/drivers/mtd/ubi/ubi.h
--- tmp-from/drivers/mtd/ubi/ubi.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/ubi.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,100 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#ifndef __UBI_UBI_H__
+#define __UBI_UBI_H__
+
+#include <linux/mtd/ubi.h>
+
+/* Version of this UBI implementation */
+#define UBI_VERSION 1
+
+/* Maximum number of supported UBI devices */
+#define UBI_MAX_INSTANCES 32
+
+/* UBI messages printk level */
+#define UBI_MSG_LEVEL  KERN_INFO
+#define UBI_WARN_LEVEL KERN_WARNING
+#define UBI_ERR_LEVEL  KERN_ERR
+
+/* Prefixes of UBI messages */
+#define UBI_MSG_PREF  "UBI:"
+#define UBI_WARN_PREF "UBI warning:"
+#define UBI_ERR_PREF  "UBI error:"
+
+/* Normal UBI messages */
+#define ubi_msg(fmt, ...)                                                   \
+	printk(UBI_MSG_LEVEL UBI_MSG_PREF " " fmt "\n", ##__VA_ARGS__)
+/* UBI warning messages */
+#define ubi_warn(fmt, ...)                                                  \
+	printk(UBI_WARN_LEVEL UBI_WARN_PREF " %s: " fmt "\n", __FUNCTION__, \
+	       ##__VA_ARGS__)
+/* UBI error messages */
+#define ubi_err(fmt, ...)                                                   \
+	printk(UBI_ERR_LEVEL UBI_ERR_PREF " %s: " fmt "\n", __FUNCTION__,   \
+	       ##__VA_ARGS__)
+
+struct ubi_io_info;
+struct ubi_bgt_info;
+struct ubi_wl_info;
+struct ubi_beb_info;
+struct ubi_vmt_info;
+struct ubi_ivol_info;
+struct ubi_vtbl_info;
+struct ubi_acc_info;
+struct ubi_upd_info;
+struct ubi_eba_info;
+struct ubi_uif_info;
+
+/**
+ * struct ubi_info - UBI device description structure
+ *
+ * @ubi_num: number of the UBI device
+ * @io: input/output unit information
+ * @bgt: background thread unit information
+ * @wl: wear-leveling unit information
+ * @beb: bad eraseblock handling unit information
+ * @vmt: volume management unit information
+ * @ivol: internal volume management unit information
+ * @vtbl: volume table unit information
+ * @acc: accounting unit information
+ * @upd: update unit information
+ * @eba: EBA unit information
+ * @uif: user interface unit information
+ */
+struct ubi_info {
+	int ubi_num;
+	struct ubi_io_info   *io;
+	struct ubi_bgt_info  *bgt;
+	struct ubi_wl_info   *wl;
+	struct ubi_beb_info  *beb;
+	struct ubi_vmt_info  *vmt;
+	struct ubi_ivol_info *ivol;
+	struct ubi_vtbl_info *vtbl;
+	struct ubi_acc_info  *acc;
+	struct ubi_upd_info  *upd;
+	struct ubi_eba_info  *eba;
+	struct ubi_uif_info  *uif;
+};
+
+extern int ubis_num;
+extern struct ubi_info *ubis[UBI_MAX_INSTANCES];
+
+#endif /* !__UBI_UBI_H__ */

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 06/44 take 2] [UBI] startup code
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (4 preceding siblings ...)
  2007-02-17 16:54 ` [PATCH 05/44 take 2] [UBI] internal common header Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-19 10:59   ` Christoph Hellwig
  2007-02-17 16:54 ` [PATCH 07/44 take 2] [UBI] misc unit header Artem Bityutskiy
                   ` (40 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/init.c tmp-to/drivers/mtd/ubi/init.c
--- tmp-from/drivers/mtd/ubi/init.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/init.c	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,371 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy,
+ *         Frank Haverkamp
+ */
+
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/sched.h>
+#include <linux/stringify.h>
+#include <linux/types.h>
+#include <linux/stat.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "uif.h"
+#include "io.h"
+#include "build.h"
+#include "debug.h"
+
+/* Maximum MTD device specification parameter length */
+#define UBI_MTD_PARAM_LEN_MAX 64
+
+/**
+ * struct mtd_dev_param - MTD device parameter description data structure.
+ *
+ * @name: MTD device name or number string
+ * @vid_hdr_offs: VID header offset
+ * @data_offs: data offset
+ */
+struct mtd_dev_param
+{
+	char name[UBI_MTD_PARAM_LEN_MAX];
+	int vid_hdr_offs;
+	int data_offs;
+};
+
+/* Number of elements set in the @mtd_dev_param array. */
+static int mtd_devs;
+
+/* MTD devices specification parameters */
+static struct mtd_dev_param mtd_dev_param[UBI_MAX_INSTANCES];
+
+/* Number of UBI devices in system */
+int ubis_num;
+
+/* All the UBI devices in system */
+struct ubi_info *ubis[UBI_MAX_INSTANCES];
+
+/* UBI headers must take 64 bytes. The below is a hacky way to ensure this */
+static int __ubi_check_ec_hdr_size[(UBI_EC_HDR_SIZE == 64) - 1]
+    __attribute__ ((__unused__));
+static int __ubi_check_vid_hdr_size[(UBI_VID_HDR_SIZE == 64) - 1]
+    __attribute__ ((__unused__));
+
+static int ubi_attach_mtd_dev(const char *mtd_dev, int vid_hdr_offset,
+			      int data_offset);
+static void ubi_destroy_dev(int ubi_num);
+
+static int __init ubi_init(void)
+{
+	int err, i, k;
+
+	if (mtd_devs > UBI_MAX_INSTANCES) {
+		printk("UBI error: too many MTD devices, max. is %d\n",
+		       UBI_MAX_INSTANCES);
+		return -EINVAL;
+	}
+
+	err = ubi_dbg_init();
+	if (err) {
+		printk("UBI error: failed to initialize debugging unit, "
+		       "error %d\n", err);
+		return err;
+	}
+
+	err = ubi_alloc_init();
+	if (err) {
+		dbg_err("failed to initialize memory allocation unit, "
+			"error %d", err);
+		goto out_dbg;
+	}
+
+	/* Initialize the user interface unit */
+	err = ubi_uif_global_init();
+	if (err) {
+		dbg_err("failed to initialize user interfaces unit, error %d",
+			err);
+		goto out_alloc;
+	}
+
+	/* Attach MTD devices */
+	for (i = 0; i < mtd_devs; i++) {
+		struct mtd_dev_param *p = &mtd_dev_param[i];
+
+		cond_resched();
+		err = -EINVAL;
+
+		/* First suppose this is MTD device name */
+		err = ubi_attach_mtd_dev(p->name, p->vid_hdr_offs,
+					 p->data_offs);
+		if (err)
+			goto out_detach;
+	}
+
+	return 0;
+
+out_detach:
+	for (k = 0; k < i; k++)
+		ubi_destroy_dev(k);
+	ubi_uif_global_close();
+out_alloc:
+	ubi_alloc_close();
+out_dbg:
+	ubi_dbg_close();
+	return err;
+}
+module_init(ubi_init);
+
+static void __exit ubi_exit(void)
+{
+	int i;
+
+	for (i = 0; i < ubis_num; i++)
+		ubi_destroy_dev(i);
+	ubi_uif_global_close();
+	ubi_alloc_close();
+	ubi_dbg_close();
+}
+module_exit(ubi_exit);
+
+/**
+ * ubi_attach_mtd_dev - attach an MTD device.
+ *
+ * @mtd_dev: MTD device name or number string to attach
+ * @vid_hdr_offset: volume identifier header offset in physical eraseblocks
+ * @data_offset: data offset in physical eraseblock
+ *
+ * This function attaches an MTD device to UBI. It first treats @mtd_dev as the
+ * MTD device name, and tries to open it by this name. If it is unable to open,
+ * it tries to convert @mtd_dev to an integer and open the MTD device by its
+ * number. Returns zero in case of success and a negative error code in case of
+ * failure.
+ */
+static int ubi_attach_mtd_dev(const char *mtd_dev, int vid_hdr_offset,
+			      int data_offset)
+{
+	struct mtd_info *mtd;
+	int i, err, mtd_num, ubi_num;
+
+	if (!mtd_dev)
+		return -EINVAL;
+
+	if (ubis_num == UBI_MAX_INSTANCES) {
+		ubi_err("too many UBI devices, max. is %d", UBI_MAX_INSTANCES);
+		return -EINVAL;
+	}
+
+	mtd = get_mtd_device_nm(mtd_dev);
+	if (IS_ERR(mtd)) {
+		char *endp;
+
+		if (PTR_ERR(mtd) != -ENODEV)
+			return PTR_ERR(mtd);
+
+		mtd_num = simple_strtoul(mtd_dev, &endp, 0);
+		if (*endp != '\0' || mtd_dev == endp) {
+			ubi_err("incorrect MTD device: \"%s\"", mtd_dev);
+			return -ENODEV;
+		}
+
+		mtd = get_mtd_device(NULL, mtd_num);
+		if (IS_ERR(mtd))
+			return PTR_ERR(mtd);
+	}
+
+	mtd_num = mtd->index;
+	put_mtd_device(mtd);
+
+	/* Check if we already have the same MTD device attached */
+	for (i = 0; i < ubis_num; i++)
+		if (ubis[i]->io->mtd_num == mtd_num) {
+			ubi_err("mtd%d is already attached to ubi%d",
+				mtd_num, i);
+			return -EINVAL;
+		}
+
+	ubi_num = ubis_num++;
+
+	ubis[ubi_num] = ubi_kzalloc(sizeof(struct ubi_info));
+	if (!ubis[ubi_num])
+		return -ENOMEM;
+
+	ubis[ubi_num]->ubi_num = ubi_num;
+
+	err = ubi_bld_attach_mtd_dev(ubis[ubi_num], mtd_num, vid_hdr_offset,
+				     data_offset);
+	if (err)
+		goto out_free;
+
+	if (ubi_num == ubis_num)
+		ubis_num += 1;
+
+	return 0;
+
+out_free:
+	ubi_kfree(ubis[ubi_num]);
+	return -ENODEV;
+}
+
+/**
+ * ubi_destroy_dev - destroy a UBI device.
+ *
+ * @ubi_num: UBI device number to destroy
+ *
+ * In the current UBI implementation UBI devices are static and cannot come and
+ * go dynamically, so this function is only used when UBI is de-initialized.
+ */
+static void ubi_destroy_dev(int ubi_num)
+{
+	ubi_bld_detach_mtd_dev(ubis[ubi_num]);
+	ubi_kfree(ubis[ubi_num]);
+}
+
+static int __init bytes_str_to_int(const char *str);
+
+/**
+ * ubi_mtd_param_parse - parse the "mtd" UBI parameter.
+ *
+ * @val: the parameter value to parse
+ * @kp: not used
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of error.
+ */
+static int __init ubi_mtd_param_parse(const char *val, struct kernel_param *kp)
+{
+	int i, len;
+	struct mtd_dev_param *p;
+	char buf[UBI_MTD_PARAM_LEN_MAX];
+	char *pbuf = &buf[0];
+	char *tokens[3] = {NULL, NULL, NULL};
+
+	if (mtd_devs == UBI_MAX_INSTANCES) {
+		printk("UBI error: too many parameters, max. is %d\n",
+		       UBI_MAX_INSTANCES);
+		return -EINVAL;
+	}
+
+	len = strnlen(val, UBI_MTD_PARAM_LEN_MAX);
+	if (len == UBI_MTD_PARAM_LEN_MAX) {
+		printk("UBI error: parameter \"%s\" is too long, max. is %d\n",
+		       val, UBI_MTD_PARAM_LEN_MAX - 1);
+		return -EINVAL;
+	}
+
+	if (len == 0) {
+		printk("UBI warning: empty \"mtd\" parameter - ignored\n");
+		return 0;
+	}
+
+	strcpy(buf, val);
+
+	/* Get rid of the final newline */
+	if (buf[len - 1] == '\n')
+		buf[len - 1] = 0;
+
+	for (i = 0; i < 3; i++)
+		tokens[i] = strsep(&pbuf, ",");
+
+	if (pbuf) {
+		printk("UBI error: too many arguments at \"%s\"\n", val);
+		return -EINVAL;
+	}
+
+	if (tokens[0][0] == '\0')
+		return -EINVAL;
+
+	p = &mtd_dev_param[mtd_devs];
+	strcpy(&p->name[0], tokens[0]);
+
+	if (tokens[1])
+		p->vid_hdr_offs = bytes_str_to_int(tokens[1]);
+	if (tokens[2])
+		p->data_offs = bytes_str_to_int(tokens[2]);
+
+	if (p->vid_hdr_offs < 0)
+		return p->vid_hdr_offs;
+	if (p->data_offs < 0)
+		return p->data_offs;
+
+	mtd_devs += 1;
+
+	return 0;
+}
+
+/*
+ * bytes_str_to_int - convert a string representing a number of bytes to an
+ * integer.
+ *
+ * @str: the string to convert
+ *
+ * This function returns positive resulting integer in case of success and a
+ * negative error code in case of failure.
+ */
+static int __init bytes_str_to_int(const char *str)
+{
+	char *endp;
+	unsigned long result;
+
+	result = simple_strtoul(str, &endp, 0);
+	if (str == endp) {
+		printk("UBI error: incorrect bytes count: \"%s\"\n", str);
+		return -EINVAL;
+	}
+
+	switch (*endp) {
+	case 'G':
+		result *= 1024;
+	case 'M':
+		result *= 1024;
+	case 'K':
+	case 'k':
+		result *= 1024;
+		if (endp[1] == 'i' && (endp[2] == '\0' ||
+			  endp[2] == 'B'  || endp[2] == 'b'))
+			endp += 2;
+	case '\0':
+		break;
+	default:
+		printk("UBI error: incorrect bytes count: \"%s\"\n", str);
+		return -EINVAL;
+	}
+
+	return result;
+}
+
+module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 000);
+MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: "
+		      "mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]. "
+		      "Multiple \"mtd\" parameters may be specified.\n"
+		      "MTD devices may be specified by their number or name. "
+		      "Optional \"vid_hdr_offs\" and \"data_offs\" parameters "
+		      "specify UBI VID header position and data starting "
+		      "position to be used by UBI.\n"
+		      "Example: mtd=content,1984,2048 mtd=4 - attach MTD device "
+		      "with name content using VID header offset 1984 and data "
+		      "start 2048, and MTD device number 4 using default "
+		      "offsets");
+
+MODULE_VERSION(__stringify(UBI_VERSION));
+MODULE_DESCRIPTION("UBI - Unsorted Block Images");
+MODULE_AUTHOR("Artem B. Bityutskiy");
+MODULE_LICENSE("GPL");
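
For readers who want to try the "mtd=" offset syntax outside the kernel, here
is a minimal userspace sketch of the suffix handling done by
bytes_str_to_int() above. It is not part of the patch. The name parse_bytes()
is hypothetical, strtoul() stands in for simple_strtoul(), and, unlike the
patch, the sketch rejects trailing garbage and values that do not fit in an
int. Both of those extra checks are assumptions about what a stricter parser
might do.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of bytes_str_to_int(); not part of the
 * patch. Returns the byte count, or -EINVAL if the string is malformed or
 * the value does not fit in an int.
 */
static int parse_bytes(const char *str)
{
	char *endp;
	unsigned long result;
	int shift;

	assert(str != NULL);

	errno = 0;
	result = strtoul(str, &endp, 0);
	if (endp == str || errno == ERANGE)
		return -EINVAL;

	switch (*endp) {
	case 'G':
		shift = 30;
		break;
	case 'M':
		shift = 20;
		break;
	case 'K':
	case 'k':
		shift = 10;
		break;
	case '\0':
		/* Plain number without a suffix */
		return result > INT_MAX ? -EINVAL : (int)result;
	default:
		return -EINVAL;
	}

	/* Skip the suffix letter, then an optional "i", "iB" or "ib" tail */
	endp++;
	if (*endp == 'i') {
		endp++;
		if (*endp == 'B' || *endp == 'b')
			endp++;
	}
	if (*endp != '\0')
		return -EINVAL;

	/* Refuse values that would overflow an int after scaling */
	if (result > ((unsigned long)INT_MAX >> shift))
		return -EINVAL;

	return (int)(result << shift);
}
```

With this sketch, "1984" and "0x7c0" both give 1984, and "2KiB" gives 2048.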


* [PATCH 07/44 take 2] [UBI] misc unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (5 preceding siblings ...)
  2007-02-17 16:54 ` [PATCH 06/44 take 2] [UBI] startup code Artem Bityutskiy
@ 2007-02-17 16:54 ` Artem Bityutskiy
  2007-02-17 22:59   ` Theodore Tso
  2007-02-17 16:55 ` [PATCH 08/44 take 2] [UBI] misc unit implementation Artem Bityutskiy
                   ` (39 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/misc.h tmp-to/drivers/mtd/ubi/misc.h
--- tmp-from/drivers/mtd/ubi/misc.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/misc.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,146 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * Here we keep miscellaneous functions which are used all over the UBI code and
+ * do not really belong to any particular unit.
+ */
+
+#ifndef __UBI_MISC_H__
+#define __UBI_MISC_H__
+
+#include <linux/rbtree.h>
+
+struct ubi_info;
+
+#define xquotise(s) #s
+#define quotise(s) xquotise(s)
+
+/**
+ * rb_for_each_entry - walk an RB-tree.
+ *
+ * @rb: a pointer to type 'struct rb_node' to use as a loop counter
+ * @pos: a pointer to RB-tree entry type to use as a loop counter
+ * @root: RB-tree's root
+ * @member: the name of the 'struct rb_node' within the RB-tree entry
+ */
+#define rb_for_each_entry(rb, pos, root, member)                             \
+	for (rb = rb_first(root),                                            \
+	     pos = (rb ? container_of(rb, typeof(*pos), member) : NULL);     \
+	     rb;                                                             \
+	     rb = rb_next(rb), pos = container_of(rb, typeof(*pos), member))
+
+/*
+ * align_up - align an integer to another integer.
+ *
+ * @x: the integer to align
+ * @y: the integer to align to
+ *
+ * This function returns the lowest number which is a multiple of @y and not
+ * less than @x.
+ */
+static inline int align_up(int x, int y)
+{
+	return y*(x/y) + (!!(x % y)) * y;
+}
+
+/*
+ * align_down - align an integer to another integer.
+ *
+ * @x: the integer to align
+ * @y: the integer to align to
+ *
+ * This function returns the highest number which is a multiple of @y and not
+ * greater than @x.
+ */
+static inline int align_down(int x, int y)
+{
+	return y * (x / y);
+}
+
+/**
+ * ubi_buf_all_ff - check if buffer contains only 0xFF bytes.
+ *
+ * @buf: buffer to check
+ * @size: buffer size in bytes
+ *
+ * This function returns non-zero if there are only 0xFF bytes in @buf, and
+ * zero if something else was also found.
+ */
+int ubi_buf_all_ff(const void *buf, int size);
+
+/**
+ * ubi_buf_all_zeroes - check if buffer contains only zeroes.
+ *
+ * @buf: buffer to check
+ * @size: buffer size in bytes
+ *
+ * This function returns non-zero if there are only 0 bytes in @buf, and
+ * zero if something else was also found.
+ */
+int ubi_buf_all_zeroes(const void *buf, int size);
+
+/**
+ * ubi_check_pattern - check if buffer contains only a certain byte pattern.
+ *
+ * @buf: buffer to check
+ * @patt: the pattern to check
+ * @size: buffer size in bytes
+ *
+ * This function returns non-zero if there are only @patt bytes in @buf, and
+ * zero if something else was also found.
+ */
+int ubi_check_pattern(const void *buf, uint8_t patt, int size);
+
+/**
+ * strdup_len - duplicate a string with known length.
+ *
+ * @str: original string
+ * @len: the length of the string
+ */
+char *strdup_len(const char *str, int len);
+
+/**
+ * ubi_calc_data_len - calculate how much real data is stored in a buffer.
+ *
+ * @ubi: the UBI device description object
+ * @buf: a buffer with the contents of the physical eraseblock
+ * @length: the buffer length
+ *
+ * This function calculates how much real data is stored in @buf. Contiguous
+ * 0xFF bytes at the end of the buffer are not counted as data.
+ */
+int ubi_calc_data_len(const struct ubi_info *ubi, const void *buf,
+		      int length);
+
+/**
+ * ubi_check_volume - check the contents of a (static) volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to check
+ *
+ * This function checks if a static volume is corrupted by fully reading it and
+ * checking data CRC. This function returns %0 if the volume is not corrupted,
+ * %1 if it is corrupted and a negative error code in case of failure. Dynamic
+ * volumes are not checked and zero is returned immediately.
+ */
+int ubi_check_volume(const struct ubi_info *ubi, int vol_id);
+
+#endif /* !__UBI_MISC_H__ */
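
A small worked example of what the align_up() and align_down() comments in
this header describe may help reviewers. The sketch below is standalone
userspace C and not part of the patch. The names align_up_sketch() and
align_down_sketch() are hypothetical, and it assumes non-negative @x and
positive @y, which the header does not state explicitly.

```c
#include <assert.h>

/* Highest multiple of y that is not greater than x (assumes x >= 0, y > 0) */
static int align_down_sketch(int x, int y)
{
	assert(x >= 0 && y > 0);
	return x - (x % y);
}

/* Lowest multiple of y that is not less than x (assumes x >= 0, y > 0) */
static int align_up_sketch(int x, int y)
{
	int rem;

	assert(x >= 0 && y > 0);
	rem = x % y;
	/* Already aligned values are returned unchanged */
	return rem ? x + (y - rem) : x;
}
```

For x = 1984 and y = 512, rounding down gives 1536 and rounding up gives 2048.
A value that is already a multiple of y, such as 2048, is returned unchanged
by both helpers.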


* [PATCH 08/44 take 2] [UBI] misc unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (6 preceding siblings ...)
  2007-02-17 16:54 ` [PATCH 07/44 take 2] [UBI] misc unit header Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 16:55 ` [PATCH 09/44 take 2] [UBI] debug unit header Artem Bityutskiy
                   ` (38 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/misc.c tmp-to/drivers/mtd/ubi/misc.c
--- tmp-from/drivers/mtd/ubi/misc.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/misc.c	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,138 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/string.h>
+#include <linux/sched.h>
+#include "ubi.h"
+#include "misc.h"
+#include "vtbl.h"
+#include "debug.h"
+#include "eba.h"
+#include "alloc.h"
+#include "io.h"
+#include "account.h"
+#include "background.h"
+
+int ubi_buf_all_ff(const void *buf, int size)
+{
+	int i;
+
+	for (i = 0; i < size / sizeof(unsigned int); i++)
+		if (((const unsigned int *)buf)[i] != ~0)
+			return 0;
+
+	for (i *= sizeof(unsigned int); i < size; i++)
+		if (((const uint8_t *)buf)[i] != 0xFF)
+			return 0;
+
+	return 1;
+}
+
+int ubi_buf_all_zeroes(const void *buf, int size)
+{
+	int i;
+
+	for (i = 0; i < size / sizeof(unsigned int); i++)
+		if (((const unsigned int *)buf)[i] != 0)
+			return 0;
+
+	for (i *= sizeof(unsigned int); i < size; i++)
+		if (((const uint8_t *)buf)[i] != 0)
+			return 0;
+
+	return 1;
+}
+
+int ubi_check_pattern(const void *buf, uint8_t patt, int size)
+{
+	int i;
+
+	for (i = 0; i < size; i++)
+		if (((const uint8_t *)buf)[i] != patt)
+			return 0;
+	return 1;
+}
+
+char *strdup_len(const char *str, int len)
+{
+	char *dup;
+
+	ubi_assert(strnlen(str, len + 1) == len);
+
+	dup = ubi_kmalloc(len + 1);
+	if (!dup)
+		return NULL;
+
+	memcpy(dup, str, len);
+	dup[len] = '\0';
+
+	return dup;
+}
+
+int ubi_calc_data_len(const struct ubi_info *ubi, const void *buf,
+		      int length)
+{
+	int i;
+
+	ubi_assert(length % ubi->io->min_io_size == 0);
+
+	for (i = length - 1; i >= 0; i--)
+		if (((const uint8_t *)buf)[i] != 0xFF)
+			break;
+
+	/* The resulting length must be aligned to the minimum flash I/O size */
+	length = align_up(i + 1, ubi->io->min_io_size);
+	return length;
+}
+
+int ubi_check_volume(const struct ubi_info *ubi, int vol_id)
+{
+	void *buf;
+	int err = 0, i;
+	const struct ubi_vtbl_vtr *vtr;
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+
+	if (vtr->vol_type != UBI_STATIC_VOLUME)
+		return 0;
+
+	buf = ubi_kmalloc(vtr->usable_leb_size);
+	if (!buf)
+		return -ENOMEM;
+
+	for (i = 0; i < vtr->used_ebs; i++) {
+		int size;
+
+		if (i == vtr->used_ebs - 1)
+			size = vtr->last_eb_bytes;
+		else
+			size = vtr->usable_leb_size;
+
+		err = ubi_eba_read_leb(ubi, vol_id, i, buf, 0, size, 1);
+		if (err) {
+			if (err == -EBADMSG)
+				err = 1;
+			break;
+		}
+	}
+
+	ubi_kfree(buf);
+	return err;
+}
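
To illustrate the trailing-0xFF trimming that ubi_calc_data_len() performs,
here is a standalone userspace sketch. It is not part of the patch. The name
data_len_sketch() is hypothetical, and the minimum I/O size is passed as a
plain parameter instead of being read from ubi->io->min_io_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return how many bytes of buf hold real data, ignoring the run of 0xFF
 * bytes at the end, rounded up to a multiple of min_io_size.
 */
static int data_len_sketch(const uint8_t *buf, int length, int min_io_size)
{
	int i, used, rem;

	assert(buf != NULL && min_io_size > 0);
	assert(length >= 0 && length % min_io_size == 0);

	/* Walk backwards until the first byte that is not 0xFF */
	for (i = length - 1; i >= 0; i--)
		if (buf[i] != 0xFF)
			break;

	/* Bytes up to and including the last non-0xFF byte */
	used = i + 1;

	/* Round up to the minimum I/O size */
	rem = used % min_io_size;
	return rem ? used + (min_io_size - rem) : used;
}
```

A buffer that contains only 0xFF bytes yields 0. A buffer whose last non-0xFF
byte is at offset 4 yields 8 when the minimum I/O size is 4.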


* [PATCH 09/44 take 2] [UBI] debug unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (7 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 08/44 take 2] [UBI] misc unit implementation Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 21:18   ` Arnd Bergmann
  2007-02-17 16:55 ` [PATCH 10/44 take 2] [UBI] debug unit implementation Artem Bityutskiy
                   ` (37 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/debug.h tmp-to/drivers/mtd/ubi/debug.h
--- tmp-from/drivers/mtd/ubi/debug.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/debug.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,284 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/**
+ * UBI debugging unit.
+ *
+ * UBI provides rich debugging capabilities which are implemented in
+ * this unit. The main feature provided by this unit is the debugging log.
+ * The debugging log is accessed via the "<debugfs>/ubi/log" file from
+ * user-space. The debugging prints may also be directed to the console. UBI
+ * distinguishes between debugging messages from different units and may switch
+ * them on and off separately.
+ */
+
+#ifndef __UBI_DEBUG_H__
+#define __UBI_DEBUG_H__
+
+#include <linux/init.h>
+
+#ifdef CONFIG_MTD_UBI_DEBUG
+
+#define UBI_DEBUG 1
+
+#define ubi_assert(expr) BUG_ON(!(expr))
+
+#define ubi_dbg_dump_stack() dump_stack()
+
+/* Debugging messages from different UBI units */
+
+/* A verbose error message */
+#define dbg_err(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_VB_ERR, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* User interface unit */
+#define dbg_uif(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_UIF, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Character device handling sub-unit */
+#define dbg_cdev(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_CDEV, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Gluebi sub-unit */
+#define dbg_gluebi(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_GLUEBI, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Volume management unit */
+#define dbg_vmt(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_VMT, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Update unit */
+#define dbg_upd(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_UPD, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Volume table unit */
+#define dbg_vtbl(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_VTBL, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Accounting unit */
+#define dbg_acc(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_ACC, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Eraseblock association unit */
+#define dbg_eba(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_EBA, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Wear-leveling unit */
+#define dbg_wl(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_WL, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Background thread unit */
+#define dbg_bgt(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_BGT, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Input/output unit */
+#define dbg_io(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_IO, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Build unit */
+#define dbg_bld(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_BLD, __FUNCTION__, fmt, ##__VA_ARGS__)
+/* Scanning unit */
+#define dbg_scan(fmt, ...) \
+	ubi_dbg_print(UBI_DBG_SCAN, __FUNCTION__, fmt, ##__VA_ARGS__)
+
+/**
+ * UBI message types.
+ *
+ * @UBI_DBG_VB_ERR: a verbose error message
+ * @UBI_DBG_UIF: a debugging message from the user interfaces unit
+ * @UBI_DBG_CDEV: a debugging message from the character device handling
+ * sub-unit.
+ * @UBI_DBG_GLUEBI: a debugging message from the gluebi sub-unit.
+ * @UBI_DBG_VMT: a debugging message from the volume management unit
+ * @UBI_DBG_UPD: a debugging message from the update unit
+ * @UBI_DBG_VTBL: a debugging message from the volume table unit
+ * @UBI_DBG_ACC: a debugging message from the accounting unit
+ * @UBI_DBG_EBA: a debugging message from the eraseblock association unit
+ * @UBI_DBG_WL: a debugging message from the wear-leveling unit
+ * @UBI_DBG_BGT: a debugging message from the background thread unit
+ * @UBI_DBG_IO: a debugging message from the input/output unit
+ * @UBI_DBG_BLD: a debugging message from the build unit
+ * @UBI_DBG_SCAN: a debugging message from the scanning unit
+ */
+enum {
+	UBI_DBG_VB_ERR,
+	UBI_DBG_UIF,
+	UBI_DBG_CDEV,
+	UBI_DBG_GLUEBI,
+	UBI_DBG_VMT,
+	UBI_DBG_UPD,
+	UBI_DBG_VTBL,
+	UBI_DBG_ACC,
+	UBI_DBG_EBA,
+	UBI_DBG_WL,
+	UBI_DBG_BGT,
+	UBI_DBG_IO,
+	UBI_DBG_BLD,
+	UBI_DBG_SCAN
+};
+
+/**
+ * ubi_dbg_print - print a message.
+ *
+ * @type: type of the message
+ * @func: printing function name
+ * @fmt: format string
+ *
+ * This function prints a message to the console, the debugging log, or both.
+ * Normal, warning, and error messages always go to both console and debugging
+ * log. Debugging messages always go to the debugging log, and if the
+ * corresponding option is enabled, they also go to the console.
+ */
+void ubi_dbg_print(int type, const char *func, const char *fmt, ...);
+
+struct ubi_info;
+struct ubi_ec_hdr;
+struct ubi_vid_hdr;
+struct ubi_vtbl_vtr;
+struct ubi_vol_tbl_record;
+struct ubi_scan_volume;
+struct ubi_scan_leb;
+struct ubi_mkvol_req;
+
+/**
+ * ubi_dbg_dump_ec_hdr - dump an erase counter header.
+ *
+ * @ec_hdr: the erase counter header to dump
+ */
+void ubi_dbg_dump_ec_hdr(const struct ubi_ec_hdr *ec_hdr);
+
+/**
+ * ubi_dbg_dump_vid_hdr - dump a volume identifier header.
+ *
+ * @vid_hdr: the volume identifier header to dump
+ */
+void ubi_dbg_dump_vid_hdr(const struct ubi_vid_hdr *vid_hdr);
+
+/**
+ * ubi_dbg_dump_vtr - dump a &struct ubi_vtbl_vtr object.
+ *
+ * @vtr: the object to dump
+ */
+void ubi_dbg_dump_vtr(const struct ubi_vtbl_vtr *vtr);
+
+/**
+ * ubi_dbg_dump_vol_tbl_record - dump a &struct ubi_vol_tbl_record object.
+ *
+ * @r: the object to dump
+ */
+void ubi_dbg_dump_vol_tbl_record(const struct ubi_vol_tbl_record *r);
+
+/**
+ * ubi_dbg_dump_sv - dump a &struct ubi_scan_volume object.
+ *
+ * @sv: the object to dump
+ */
+void ubi_dbg_dump_sv(const struct ubi_scan_volume *sv);
+
+
+/**
+ * ubi_dbg_dump_seb - dump a &struct ubi_scan_leb object.
+ *
+ * @seb: the object to dump
+ * @type: object type: 0 - not corrupted, 1 - corrupted
+ */
+void ubi_dbg_dump_seb(const struct ubi_scan_leb *seb, int type);
+
+/**
+ * ubi_dbg_dump_mkvol_req - dump a &struct ubi_mkvol_req object.
+ *
+ * @req: the object to dump
+ * @name: volume name in kernel memory
+ */
+void ubi_dbg_dump_mkvol_req(const struct ubi_mkvol_req *req, const char *name);
+
+/**
+ * ubi_dbg_hexdump - dump a buffer.
+ *
+ * @buf: the buffer to dump
+ * @size: buffer size which must be a multiple of 4 bytes
+ */
+void ubi_dbg_hexdump(const void *buf, int size);
+
+/**
+ * ubi_dbg_is_bitflip - check if it is time to emulate a bit-flip.
+ *
+ * Returns non-zero if a bit-flip should be emulated, otherwise returns zero.
+ */
+int ubi_dbg_is_bitflip(void);
+
+/**
+ * ubi_dbg_is_write_failure - check if it is time to emulate a write failure.
+ *
+ * Returns non-zero if a write failure should be emulated, otherwise returns
+ * zero.
+ */
+int ubi_dbg_is_write_failure(void);
+
+/**
+ * ubi_dbg_is_erase_failure - check if it is time to emulate an erase failure.
+ *
+ * Returns non-zero if an erase failure should be emulated, otherwise returns
+ * zero.
+ */
+int ubi_dbg_is_erase_failure(void);
+
+/**
+ * ubi_dbg_init - initialize the debugging unit.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int __init ubi_dbg_init(void);
+
+/**
+ * ubi_dbg_close - close the debugging unit.
+ */
+void __exit ubi_dbg_close(void);
+
+#else
+
+#define UBI_DEBUG 0
+
+#define ubi_assert(expr)    ({})
+
+#define dbg_err(fmt, ...)    ({})
+#define dbg_uif(fmt, ...)    ({})
+#define dbg_cdev(fmt, ...)   ({})
+#define dbg_gluebi(fmt, ...) ({})
+#define dbg_vmt(fmt, ...)    ({})
+#define dbg_upd(fmt, ...)    ({})
+#define dbg_vtbl(fmt, ...)   ({})
+#define dbg_acc(fmt, ...)    ({})
+#define dbg_eba(fmt, ...)    ({})
+#define dbg_wl(fmt, ...)     ({})
+#define dbg_bgt(fmt, ...)    ({})
+#define dbg_io(fmt, ...)     ({})
+#define dbg_bld(fmt, ...)    ({})
+#define dbg_scan(fmt, ...)   ({})
+
+#define ubi_dbg_print(type, func, fmt, ...) ({})
+#define ubi_dbg_dump_stack()              ({})
+#define ubi_dbg_dump_ec_hdr(ec_hdr)       ({})
+#define ubi_dbg_dump_vid_hdr(vid_hdr)     ({})
+#define ubi_dbg_dump_vtr(vtr)             ({})
+#define ubi_dbg_dump_vol_tbl_record(r)    ({})
+#define ubi_dbg_dump_sv(sv)               ({})
+#define ubi_dbg_dump_seb(seb, type)       ({})
+#define ubi_dbg_dump_mkvol_req(req, name) ({})
+#define ubi_dbg_hexdump(buf, size)        ({})
+#define ubi_dbg_is_bitflip()               0
+#define ubi_dbg_is_write_failure()         0
+#define ubi_dbg_is_erase_failure()         0
+
+#define ubi_dbg_init()  0
+#define ubi_dbg_close()
+
+#endif /* !CONFIG_MTD_UBI_DEBUG */
+#endif /* !__UBI_DEBUG_H__ */


* [PATCH 10/44 take 2] [UBI] debug unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (8 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 09/44 take 2] [UBI] debug unit header Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 21:00   ` Arnd Bergmann
  2007-02-17 16:55 ` [PATCH 11/44 take 2] [UBI] allocation unit header Artem Bityutskiy
                   ` (36 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/debug.c tmp-to/drivers/mtd/ubi/debug.c
--- tmp-from/drivers/mtd/ubi/debug.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/debug.c	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,634 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <stdarg.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/debugfs.h>
+#include <linux/stat.h>
+#include <linux/spinlock.h>
+#include <linux/sched.h>
+#include <linux/random.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "debug.h"
+#include "vtbl.h"
+#include "scan.h"
+
+/* Level of UBI debugging prints */
+#define UBI_DBG_LEVEL KERN_DEBUG
+
+/* Prefixes of debugging messages */
+#define UBI_DBG_VB_ERR_PREF  "[UBI DBG error]"
+#define UBI_DBG_UIF_PREF     "[UBI DBG uif]"
+#define UBI_DBG_CDEV_PREF    "[UBI DBG cdev]"
+#define UBI_DBG_GLUEBI_PREF  "[UBI DBG gluebi]"
+#define UBI_DBG_VMT_PREF     "[UBI DBG vmt]"
+#define UBI_DBG_UPD_PREF     "[UBI DBG upd]"
+#define UBI_DBG_VTBL_PREF    "[UBI DBG vtbl]"
+#define UBI_DBG_ACC_PREF     "[UBI DBG acc]"
+#define UBI_DBG_EBA_PREF     "[UBI DBG eba]"
+#define UBI_DBG_WL_PREF      "[UBI DBG wl]"
+#define UBI_DBG_BGT_PREF     "[UBI DBG bgt]"
+#define UBI_DBG_ALLOC_PREF   "[UBI DBG alloc]"
+#define UBI_DBG_IO_PREF      "[UBI DBG io]"
+#define UBI_DBG_BLD_PREF     "[UBI DBG bld]"
+#define UBI_DBG_SCAN_PREF    "[UBI DBG scan]"
+
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_VB_ERR
+static int vb_err_prints = 1;
+#else
+static int vb_err_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_UIF
+static int uif_prints = 1;
+#else
+static int uif_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_CDEV
+static int cdev_prints = 1;
+#else
+static int cdev_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_GLUEBI
+static int gluebi_prints = 1;
+#else
+static int gluebi_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_VMT
+static int vmt_prints = 1;
+#else
+static int vmt_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_UPD
+static int upd_prints = 1;
+#else
+static int upd_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_VTBL
+static int vtbl_prints = 1;
+#else
+static int vtbl_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_ACC
+static int acc_prints = 1;
+#else
+static int acc_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_EBA
+static int eba_prints = 1;
+#else
+static int eba_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_WL
+static int wl_prints = 1;
+#else
+static int wl_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_BGT
+static int bgt_prints = 1;
+#else
+static int bgt_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_IO
+static int io_prints = 1;
+#else
+static int io_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_BLD
+static int bld_prints = 1;
+#else
+static int bld_prints;
+#endif
+#ifdef CONFIG_MTD_UBI_DEBUG_MSG_SCAN
+static int scan_prints = 1;
+#else
+static int scan_prints;
+#endif
+
+/* If bit-flips should be emulated */
+#ifdef CONFIG_MTD_UBI_DEBUG_EMULATE_BITFLIPS
+static int emulate_bitflips = 1;
+#else
+static int emulate_bitflips;
+#endif
+
+/* If write failures should be emulated */
+#ifdef CONFIG_MTD_UBI_DEBUG_EMULATE_WRITE_FAILURES
+static int emulate_write_failures = 1;
+#else
+static int emulate_write_failures;
+#endif
+
+/* If erase failures should be emulated */
+#ifdef CONFIG_MTD_UBI_DEBUG_EMULATE_ERASE_FAILURES
+static int emulate_erase_failures = 1;
+#else
+static int emulate_erase_failures;
+#endif
+
+/* Direntries of the UBI debugfs files */
+
+/* <debugfs>/ubi */
+static struct dentry *debugfs_root;
+/* <debugfs>/ubi/vb_err_prints */
+static struct dentry *debugfs_vb_err_prints;
+/* <debugfs>/ubi/uif_prints */
+static struct dentry *debugfs_uif_prints;
+/* <debugfs>/ubi/cdev_prints */
+static struct dentry *debugfs_cdev_prints;
+/* <debugfs>/ubi/gluebi_prints */
+static struct dentry *debugfs_gluebi_prints;
+/* <debugfs>/ubi/vmt_prints */
+static struct dentry *debugfs_vmt_prints;
+/* <debugfs>/ubi/upd_prints */
+static struct dentry *debugfs_upd_prints;
+/* <debugfs>/ubi/vtbl_prints */
+static struct dentry *debugfs_vtbl_prints;
+/* <debugfs>/ubi/acc_prints */
+static struct dentry *debugfs_acc_prints;
+/* <debugfs>/ubi/eba_prints */
+static struct dentry *debugfs_eba_prints;
+/* <debugfs>/ubi/wl_prints */
+static struct dentry *debugfs_wl_prints;
+/* <debugfs>/ubi/bgt_prints */
+static struct dentry *debugfs_bgt_prints;
+/* <debugfs>/ubi/io_prints */
+static struct dentry *debugfs_io_prints;
+/* <debugfs>/ubi/bld_prints */
+static struct dentry *debugfs_bld_prints;
+/* <debugfs>/ubi/scan_prints */
+static struct dentry *debugfs_scan_prints;
+
+/* Serializes prints */
+static spinlock_t dbg_prints_lock = SPIN_LOCK_UNLOCKED;
+
+int __init ubi_dbg_init(void)
+{
+	int err = -ENODEV;
+
+	/* Create debugging files and directories */
+
+	debugfs_root = debugfs_create_dir("ubi", NULL);
+	if (!debugfs_root || IS_ERR(debugfs_root))
+		goto out;
+
+	debugfs_vb_err_prints = debugfs_create_bool("vb_err_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &vb_err_prints);
+	if (!debugfs_vb_err_prints || IS_ERR(debugfs_vb_err_prints))
+		goto out_root;
+
+	debugfs_uif_prints = debugfs_create_bool("uif_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &uif_prints);
+	if (!debugfs_uif_prints || IS_ERR(debugfs_uif_prints))
+		goto out_vb_err;
+
+	debugfs_cdev_prints = debugfs_create_bool("cdev_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &cdev_prints);
+	if (!debugfs_cdev_prints || IS_ERR(debugfs_cdev_prints))
+		goto out_uif;
+
+	debugfs_gluebi_prints = debugfs_create_bool("gluebi_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &gluebi_prints);
+	if (!debugfs_gluebi_prints || IS_ERR(debugfs_gluebi_prints))
+		goto out_cdev;
+
+	debugfs_vmt_prints = debugfs_create_bool("vmt_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &vmt_prints);
+	if (!debugfs_vmt_prints || IS_ERR(debugfs_vmt_prints))
+		goto out_gluebi;
+
+	debugfs_upd_prints = debugfs_create_bool("upd_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &upd_prints);
+	if (!debugfs_upd_prints || IS_ERR(debugfs_upd_prints))
+		goto out_vmt;
+
+	debugfs_vtbl_prints = debugfs_create_bool("vtbl_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &vtbl_prints);
+	if (!debugfs_vtbl_prints || IS_ERR(debugfs_vtbl_prints))
+		goto out_upd;
+
+	debugfs_acc_prints = debugfs_create_bool("acc_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &acc_prints);
+	if (!debugfs_acc_prints || IS_ERR(debugfs_acc_prints))
+		goto out_vtbl;
+
+	debugfs_eba_prints = debugfs_create_bool("eba_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &eba_prints);
+	if (!debugfs_eba_prints || IS_ERR(debugfs_eba_prints))
+		goto out_acc;
+
+	debugfs_wl_prints = debugfs_create_bool("wl_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &wl_prints);
+	if (!debugfs_wl_prints || IS_ERR(debugfs_wl_prints))
+		goto out_eba;
+
+	debugfs_bgt_prints = debugfs_create_bool("bgt_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &bgt_prints);
+	if (!debugfs_bgt_prints || IS_ERR(debugfs_bgt_prints))
+		goto out_wl;
+
+	debugfs_io_prints = debugfs_create_bool("io_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &io_prints);
+	if (!debugfs_io_prints || IS_ERR(debugfs_io_prints))
+		goto out_bgt;
+
+	debugfs_bld_prints = debugfs_create_bool("bld_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &bld_prints);
+	if (!debugfs_bld_prints || IS_ERR(debugfs_bld_prints))
+		goto out_io;
+
+	debugfs_scan_prints = debugfs_create_bool("scan_prints",
+		S_IFREG | S_IRUGO | S_IWUGO, debugfs_root, &scan_prints);
+	if (!debugfs_scan_prints || IS_ERR(debugfs_scan_prints))
+		goto out_bld;
+
+	return 0;
+
+out_bld:
+	debugfs_remove(debugfs_bld_prints);
+out_io:
+	debugfs_remove(debugfs_io_prints);
+out_bgt:
+	debugfs_remove(debugfs_bgt_prints);
+out_wl:
+	debugfs_remove(debugfs_wl_prints);
+out_eba:
+	debugfs_remove(debugfs_eba_prints);
+out_acc:
+	debugfs_remove(debugfs_acc_prints);
+out_vtbl:
+	debugfs_remove(debugfs_vtbl_prints);
+out_upd:
+	debugfs_remove(debugfs_upd_prints);
+out_vmt:
+	debugfs_remove(debugfs_vmt_prints);
+out_gluebi:
+	debugfs_remove(debugfs_gluebi_prints);
+out_cdev:
+	debugfs_remove(debugfs_cdev_prints);
+out_uif:
+	debugfs_remove(debugfs_uif_prints);
+out_vb_err:
+	debugfs_remove(debugfs_vb_err_prints);
+out_root:
+	debugfs_remove(debugfs_root);
+out:
+	return err;
+}
+
+void __exit ubi_dbg_close(void)
+{
+	debugfs_remove(debugfs_scan_prints);
+	debugfs_remove(debugfs_bld_prints);
+	debugfs_remove(debugfs_io_prints);
+	debugfs_remove(debugfs_bgt_prints);
+	debugfs_remove(debugfs_wl_prints);
+	debugfs_remove(debugfs_eba_prints);
+	debugfs_remove(debugfs_acc_prints);
+	debugfs_remove(debugfs_vtbl_prints);
+	debugfs_remove(debugfs_upd_prints);
+	debugfs_remove(debugfs_vmt_prints);
+	debugfs_remove(debugfs_gluebi_prints);
+	debugfs_remove(debugfs_cdev_prints);
+	debugfs_remove(debugfs_uif_prints);
+	debugfs_remove(debugfs_vb_err_prints);
+	debugfs_remove(debugfs_root);
+}
+
+static void ubi_dbg_vprint_nolock(int type, const char *func, const char *fmt,
+				  va_list args);
+
+void ubi_dbg_print(int type, const char *func, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	spin_lock(&dbg_prints_lock);
+	ubi_dbg_vprint_nolock(type, func, fmt, args);
+	spin_unlock(&dbg_prints_lock);
+	va_end(args);
+}
+
+static void ubi_dbg_vprint_nolock(int type, const char *func, const char *fmt,
+				  va_list args)
+{
+	const char *prefix;
+
+	switch (type) {
+	case UBI_DBG_VB_ERR:
+		if (!vb_err_prints)
+			return;
+		prefix = UBI_DBG_VB_ERR_PREF;
+		break;
+	case UBI_DBG_UIF:
+		if (!uif_prints)
+			return;
+		prefix = UBI_DBG_UIF_PREF;
+		break;
+	case UBI_DBG_CDEV:
+		if (!cdev_prints)
+			return;
+		prefix = UBI_DBG_CDEV_PREF;
+		break;
+	case UBI_DBG_GLUEBI:
+		if (!gluebi_prints)
+			return;
+		prefix = UBI_DBG_GLUEBI_PREF;
+		break;
+	case UBI_DBG_VMT:
+		if (!vmt_prints)
+			return;
+		prefix = UBI_DBG_VMT_PREF;
+		break;
+	case UBI_DBG_UPD:
+		if (!upd_prints)
+			return;
+		prefix = UBI_DBG_UPD_PREF;
+		break;
+	case UBI_DBG_VTBL:
+		if (!vtbl_prints)
+			return;
+		prefix = UBI_DBG_VTBL_PREF;
+		break;
+	case UBI_DBG_ACC:
+		if (!acc_prints)
+			return;
+		prefix = UBI_DBG_ACC_PREF;
+		break;
+	case UBI_DBG_EBA:
+		if (!eba_prints)
+			return;
+		prefix = UBI_DBG_EBA_PREF;
+		break;
+	case UBI_DBG_WL:
+		if (!wl_prints)
+			return;
+		prefix = UBI_DBG_WL_PREF;
+		break;
+	case UBI_DBG_BGT:
+		if (!bgt_prints)
+			return;
+		prefix = UBI_DBG_BGT_PREF;
+		break;
+	case UBI_DBG_IO:
+		if (!io_prints)
+			return;
+		prefix = UBI_DBG_IO_PREF;
+		break;
+	case UBI_DBG_BLD:
+		if (!bld_prints)
+			return;
+		prefix = UBI_DBG_BLD_PREF;
+		break;
+	case UBI_DBG_SCAN:
+		if (!scan_prints)
+			return;
+		prefix = UBI_DBG_SCAN_PREF;
+		break;
+	default:
+		BUG();
+		return;
+	}
+
+	printk(UBI_DBG_LEVEL "%s (pid:%d) ", prefix, current->pid);
+	if (func)
+		printk("%s: ", func);
+	vprintk(fmt, args);
+	printk("\n");
+}
+
+void ubi_dbg_dump_ec_hdr(const struct ubi_ec_hdr *ec_hdr)
+{
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("erase counter header dump:");
+	ubi_msg("magic          %#08x", ubi32_to_cpu(ec_hdr->magic));
+	ubi_msg("version        %d",    (int)ec_hdr->version);
+	ubi_msg("ec             %llu",  (unsigned long long)ubi64_to_cpu(ec_hdr->ec));
+	ubi_msg("vid_hdr_offset %d",    ubi32_to_cpu(ec_hdr->vid_hdr_offset));
+	ubi_msg("data_offset    %d",    ubi32_to_cpu(ec_hdr->data_offset));
+	ubi_msg("hdr_crc        %#08x", ubi32_to_cpu(ec_hdr->hdr_crc));
+	ubi_msg("erase counter header hexdump:");
+	spin_unlock(&dbg_prints_lock);
+	ubi_dbg_hexdump(ec_hdr, UBI_EC_HDR_SIZE);
+}
+
+void ubi_dbg_dump_vid_hdr(const struct ubi_vid_hdr *vid_hdr)
+{
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("volume identifier header dump:");
+	ubi_msg("magic     %08x", ubi32_to_cpu(vid_hdr->magic));
+	ubi_msg("version   %d",   (int)vid_hdr->version);
+	ubi_msg("vol_type  %d",   (int)vid_hdr->vol_type);
+	ubi_msg("copy_flag %d",   (int)vid_hdr->copy_flag);
+	ubi_msg("compat    %d",   (int)vid_hdr->compat);
+	ubi_msg("vol_id    %d",   ubi32_to_cpu(vid_hdr->vol_id));
+	ubi_msg("lnum      %d",   ubi32_to_cpu(vid_hdr->lnum));
+	ubi_msg("leb_ver   %u",   ubi32_to_cpu(vid_hdr->leb_ver));
+	ubi_msg("data_size %d",   ubi32_to_cpu(vid_hdr->data_size));
+	ubi_msg("used_ebs  %d",   ubi32_to_cpu(vid_hdr->used_ebs));
+	ubi_msg("data_pad  %d",   ubi32_to_cpu(vid_hdr->data_pad));
+	ubi_msg("hdr_crc   %08x", ubi32_to_cpu(vid_hdr->hdr_crc));
+	ubi_msg("volume identifier header hexdump:");
+	spin_unlock(&dbg_prints_lock);
+	ubi_dbg_hexdump(vid_hdr, UBI_VID_HDR_SIZE_CRC);
+}
+
+void ubi_dbg_dump_vtr(const struct ubi_vtbl_vtr *vtr)
+{
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("volume table record dump:");
+	ubi_msg("reserved_pebs   %d", vtr->reserved_pebs);
+	ubi_msg("alignment       %d", vtr->alignment);
+	ubi_msg("data_pad        %d", vtr->data_pad);
+	ubi_msg("vol_type        %d", vtr->vol_type);
+	ubi_msg("name_len        %d", vtr->name_len);
+
+	if (vtr->name == NULL) {
+		ubi_msg("name          NULL");
+		spin_unlock(&dbg_prints_lock);
+		return;
+	}
+
+	if (vtr->name_len <= UBI_VOL_NAME_MAX &&
+	    strnlen(vtr->name, vtr->name_len + 1) == vtr->name_len) {
+		ubi_msg("name          %s", vtr->name);
+	} else {
+		ubi_msg("the 1st 5 characters of the name: %c%c%c%c%c",
+			vtr->name[0], vtr->name[1], vtr->name[2],
+			vtr->name[3], vtr->name[4]);
+	}
+
+	ubi_msg("usable_leb_size %d",   vtr->usable_leb_size);
+	ubi_msg("used_ebs        %d",   vtr->used_ebs);
+	ubi_msg("used_bytes      %lld", (long long)vtr->used_bytes);
+	ubi_msg("last_eb_bytes   %d",   vtr->last_eb_bytes);
+	ubi_msg("corrupted       %d",   vtr->corrupted);
+	ubi_msg("upd_marker      %d",   vtr->upd_marker);
+	spin_unlock(&dbg_prints_lock);
+}
+
+void ubi_dbg_dump_vol_tbl_record(const struct ubi_vol_tbl_record *r)
+{
+	int name_len = ubi16_to_cpu(r->name_len);
+
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("raw volume table record dump:");
+	ubi_msg("reserved_pebs   %d", ubi32_to_cpu(r->reserved_pebs));
+	ubi_msg("alignment       %d", ubi32_to_cpu(r->alignment));
+	ubi_msg("data_pad        %d", ubi32_to_cpu(r->data_pad));
+	ubi_msg("vol_type        %d", (int)r->vol_type);
+	ubi_msg("upd_marker      %d", (int)r->upd_marker);
+	ubi_msg("name_len        %d", name_len);
+
+	if (r->name[0] == '\0') {
+		ubi_msg("name          NULL");
+		spin_unlock(&dbg_prints_lock);
+		return;
+	}
+
+	if (name_len <= UBI_VOL_NAME_MAX &&
+	    strnlen(&r->name[0], name_len + 1) == name_len) {
+		ubi_msg("name          %s", &r->name[0]);
+	} else {
+		ubi_msg("the 1st 5 characters of the name: %c%c%c%c%c",
+			r->name[0], r->name[1], r->name[2], r->name[3],
+			r->name[4]);
+	}
+	spin_unlock(&dbg_prints_lock);
+}
+
+void ubi_dbg_dump_sv(const struct ubi_scan_volume *sv)
+{
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("volume scanning information dump:");
+	ubi_msg("vol_id         %d", sv->vol_id);
+	ubi_msg("highest_lnum   %d", sv->highest_lnum);
+	ubi_msg("leb_count      %d", sv->leb_count);
+	ubi_msg("compat         %d", sv->compat);
+	ubi_msg("vol_type       %d", sv->vol_type);
+	ubi_msg("used_ebs       %d", sv->used_ebs);
+	ubi_msg("last_data_size %d", sv->last_data_size);
+	ubi_msg("data_pad       %d", sv->data_pad);
+	spin_unlock(&dbg_prints_lock);
+}
+
+void ubi_dbg_dump_seb(const struct ubi_scan_leb *seb, int type)
+{
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("eraseblock scanning information dump:");
+	ubi_msg("ec       %d", seb->ec);
+	ubi_msg("pnum     %d", seb->pnum);
+	switch (type) {
+	case 0:
+		ubi_msg("lnum     %d", seb->lnum);
+		ubi_msg("leb_ver  %u", seb->leb_ver);
+		break;
+	case 1:
+		break;
+	}
+	spin_unlock(&dbg_prints_lock);
+}
+
+void ubi_dbg_dump_mkvol_req(const struct ubi_mkvol_req *req, const char *name)
+{
+	spin_lock(&dbg_prints_lock);
+	ubi_msg("volume creation request dump:");
+	ubi_msg("vol_id    %d",   req->vol_id);
+	ubi_msg("alignment %d",   req->alignment);
+	ubi_msg("bytes     %lld", (long long)req->bytes);
+	ubi_msg("vol_type  %d",   req->vol_type);
+	ubi_msg("name_len  %d",   req->name_len);
+
+	if (name == NULL) {
+		ubi_msg("name      NULL");
+		spin_unlock(&dbg_prints_lock);
+		return;
+	}
+
+	if (req->name_len <= UBI_VOL_NAME_MAX &&
+	    strnlen(name, req->name_len + 1) == req->name_len) {
+		ubi_msg("name      %s", name);
+	} else {
+		ubi_msg("the 1st 5 characters of the name: %c%c%c%c%c",
+			name[0], name[1], name[2], name[3], name[4]);
+	}
+	spin_unlock(&dbg_prints_lock);
+}
+
+
+#define BYTES_PER_LINE 32
+void ubi_dbg_hexdump(const void *ptr, int size)
+{
+	int i, k = 0, rows, columns;
+	const uint8_t *p = ptr;
+
+	size = ALIGN(size, 4);
+	rows = (size + BYTES_PER_LINE - 1) / BYTES_PER_LINE;
+	for (i = 0; i < rows; i++) {
+		int j;
+
+		cond_resched();
+
+		columns = min(size - k, BYTES_PER_LINE) / 4;
+		if (columns == 0)
+			break;
+
+		spin_lock(&dbg_prints_lock);
+		printk(UBI_DBG_LEVEL "%5d:  ", i * BYTES_PER_LINE);
+
+		for (j = 0; j < columns; j++) {
+			int n, N;
+
+			N = size - k > 4 ? 4 : size - k;
+			for (n = 0; n < N; n++)
+				printk("%02x", p[k++]);
+			printk(" ");
+		}
+		printk("\n");
+		spin_unlock(&dbg_prints_lock);
+	}
+}
+
+int ubi_dbg_is_bitflip(void)
+{
+	if (emulate_bitflips)
+		return !(random32() % 50);
+	else
+		return 0;
+}
+
+int ubi_dbg_is_write_failure(void)
+{
+	if (emulate_write_failures)
+		return !(random32() % 100);
+	else
+		return 0;
+}
+
+int ubi_dbg_is_erase_failure(void)
+{
+	if (emulate_erase_failures)
+		return !(random32() % 100);
+	else
+		return 0;
+}
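For readers following the hexdump helper above, here is a minimal user-space sketch of the same layout: 32 bytes per row, grouped into 4-byte words, with the row count computed by ceiling division. It is an illustration only, not the code from the patch. The names hexdump_rows(), hexdump_row() and hexdump() are hypothetical, it uses snprintf()/puts() instead of printk(), it takes no lock, and unlike ubi_dbg_hexdump() it does not round the size up to a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BYTES_PER_LINE 32
#define WORD 4

/* Number of output rows needed for 'size' bytes (ceiling division). */
int hexdump_rows(int size)
{
	assert(size >= 0);
	return (size + BYTES_PER_LINE - 1) / BYTES_PER_LINE;
}

/*
 * Format the row that starts at byte offset 'off' into 'out'.
 * Returns the number of bytes of 'p' that the row covers.
 */
int hexdump_row(const uint8_t *p, int size, int off, char *out, size_t outsz)
{
	int n = size - off < BYTES_PER_LINE ? size - off : BYTES_PER_LINE;
	size_t pos = 0;
	int i;

	/* Prefix (8) + 64 hex digits + 8 separators + NUL fits in 96 bytes. */
	assert(outsz >= 96);

	pos += snprintf(out + pos, outsz - pos, "%5d:  ", off);
	for (i = 0; i < n; i++) {
		pos += snprintf(out + pos, outsz - pos, "%02x", p[off + i]);
		/* Put a space after every complete word and after the tail. */
		if (i % WORD == WORD - 1 || i == n - 1)
			pos += snprintf(out + pos, outsz - pos, " ");
	}
	return n;
}

/* Print 'size' bytes starting at 'ptr', one row per line. */
void hexdump(const void *ptr, int size)
{
	char line[128];
	int rows = hexdump_rows(size);
	int i;

	for (i = 0; i < rows; i++) {
		hexdump_row(ptr, size, i * BYTES_PER_LINE, line, sizeof(line));
		puts(line);
	}
}
```

Calling hexdump(buf, len) on any buffer prints one line per 32 bytes, each prefixed with the byte offset of the row, which is the same prefix the patch prints with "%5d:  ".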

* [PATCH 11/44 take 2] [UBI] allocation unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (9 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 10/44 take 2] [UBI] debug unit implementation Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 16:55 ` [PATCH 12/44 take 2] [UBI] allocation unit implementation Artem Bityutskiy
                   ` (35 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/alloc.h tmp-to/drivers/mtd/ubi/alloc.h
--- tmp-from/drivers/mtd/ubi/alloc.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/alloc.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,235 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI memory allocation unit.
+ *
+ * This unit provides memory allocation/deallocation calls and wrappers.
+ */
+
+#ifndef __UBI_ALLOC_H__
+#define __UBI_ALLOC_H__
+
+#include <linux/init.h>
+
+struct ubi_info;
+struct ubi_ec_hdr;
+struct ubi_vid_hdr;
+struct ubi_bgt_work;
+struct ubi_wl_erase_work;
+struct ubi_wl_entry;
+struct ubi_wl_prot_entry;
+struct ubi_eba_ltree_entry;
+struct ubi_scan_leb;
+struct ubi_scan_volume;
+
+/**
+ * ubi_kmalloc - allocate memory.
+ *
+ * @size: how many bytes to allocate
+ *
+ * This function is just a wrapper over the standard Linux 'kmalloc()'
+ * function with %GFP_KERNEL argument.
+ */
+void *ubi_kmalloc(size_t size);
+
+/**
+ * ubi_kzalloc - allocate and zero memory.
+ *
+ * @size: how many bytes to allocate
+ *
+ * This function is just a wrapper over the standard Linux 'kzalloc()'
+ * function with %GFP_KERNEL argument.
+ */
+void *ubi_kzalloc(size_t size);
+
+/**
+ * ubi_kfree - free memory allocated by 'ubi_kmalloc()' or 'ubi_kzalloc()'.
+ *
+ * @obj: a pointer to the object to free
+ *
+ * This is just a wrapper over the standard Linux 'kfree()' function.
+ */
+void ubi_kfree(const void *obj);
+
+/**
+ * ubi_zalloc_ec_hdr - allocate a &struct ubi_ec_hdr object.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns a pointer to the newly allocated and zero-filled erase
+ * counter header object in case of success and %NULL in case of failure.
+ */
+struct ubi_ec_hdr *ubi_zalloc_ec_hdr(const struct ubi_info *ubi);
+
+/**
+ * ubi_free_ec_hdr - free a &struct ubi_ec_hdr object.
+ *
+ * @ubi: the UBI device description object
+ * @ec_hdr: a pointer to the object to free
+ */
+void ubi_free_ec_hdr(const struct ubi_info *ubi, struct ubi_ec_hdr *ec_hdr);
+
+/**
+ * ubi_zalloc_vid_hdr - allocate a &struct ubi_vid_hdr object.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns a pointer to the newly allocated and zero-filled
+ * volume identifier header object in case of success and %NULL in case of
+ * failure.
+ */
+struct ubi_vid_hdr *ubi_zalloc_vid_hdr(const struct ubi_info *ubi);
+
+/**
+ * ubi_free_vid_hdr - free a &struct ubi_vid_hdr object.
+ *
+ * @ubi: the UBI device description object
+ * @vid_hdr: a pointer to the object to free
+ */
+void ubi_free_vid_hdr(const struct ubi_info *ubi, struct ubi_vid_hdr *vid_hdr);
+
+/**
+ * ubi_alloc_bgt_work - allocate a &struct ubi_bgt_work object.
+ *
+ * This function returns a pointer to the newly allocated &struct ubi_bgt_work
+ * object in case of success and %NULL in case of failure. The allocated object
+ * is not zeroed.
+ */
+struct ubi_bgt_work *ubi_alloc_bgt_work(void);
+
+/**
+ * ubi_free_bgt_work - free a &struct ubi_bgt_work object.
+ *
+ * @wrk: a pointer to the object to free
+ */
+void ubi_free_bgt_work(struct ubi_bgt_work *wrk);
+
+/**
+ * ubi_alloc_wl_erase_work - allocate a &struct ubi_wl_erase_work object.
+ *
+ * This function returns a pointer to the newly allocated
+ * &struct ubi_wl_erase_work object in case of success and %NULL in case of
+ * failure. The allocated object is not zeroed.
+ */
+struct ubi_wl_erase_work *ubi_alloc_wl_erase_work(void);
+
+/**
+ * ubi_free_wl_erase_work - free a &struct ubi_wl_erase_work object.
+ *
+ * @wrk: a pointer to the object to free
+ */
+void ubi_free_wl_erase_work(struct ubi_wl_erase_work *wrk);
+
+/**
+ * ubi_alloc_wl_entry - allocate a &struct ubi_wl_entry object.
+ *
+ * This function returns a pointer to the newly allocated &struct ubi_wl_entry
+ * object in case of success and %NULL in case of failure. The allocated object
+ * is not zeroed.
+ */
+struct ubi_wl_entry *ubi_alloc_wl_entry(void);
+
+/**
+ * ubi_free_wl_entry - free a &struct ubi_wl_entry object.
+ *
+ * @wle: a pointer to the object to free
+ */
+void ubi_free_wl_entry(struct ubi_wl_entry *wle);
+
+/**
+ * ubi_alloc_wl_prot_entry - allocate a &struct ubi_wl_prot_entry object.
+ *
+ * This function returns a pointer to the newly allocated
+ * &struct ubi_wl_prot_entry object in case of success and %NULL in case of
+ * failure. The allocated object is not zeroed.
+ */
+struct ubi_wl_prot_entry *ubi_alloc_wl_prot_entry(void);
+
+/**
+ * ubi_free_wl_prot_entry - free a &struct ubi_wl_prot_entry object.
+ *
+ * @pe: a pointer to the object to free
+ */
+void ubi_free_wl_prot_entry(struct ubi_wl_prot_entry *pe);
+
+/**
+ * ubi_alloc_eba_ltree_entry - allocate a &struct ubi_eba_ltree_entry object.
+ *
+ * This function returns a pointer to the newly allocated
+ * &struct ubi_eba_ltree_entry object in case of success and %NULL in case of
+ * failure. The allocated object is not zeroed, but the @users and @mutex
+ * fields are initialized by the slab constructor.
+ */
+struct ubi_eba_ltree_entry *ubi_alloc_eba_ltree_entry(void);
+
+/**
+ * ubi_free_eba_ltree_entry - free a &struct ubi_eba_ltree_entry object.
+ *
+ * @le: a pointer to the object to free
+ */
+void ubi_free_eba_ltree_entry(struct ubi_eba_ltree_entry *le);
+
+/**
+ * ubi_alloc_scan_leb - allocate a &struct ubi_scan_leb object.
+ *
+ * This function returns a pointer to the newly allocated &struct ubi_scan_leb
+ * object in case of success, or %NULL in case of failure. The allocated object
+ * is not zeroed.
+ */
+struct ubi_scan_leb *ubi_alloc_scan_leb(void);
+
+/**
+ * ubi_free_scan_leb - free a &struct ubi_scan_leb object.
+ *
+ * @seb: a pointer to the &struct ubi_scan_leb object to free
+ */
+void ubi_free_scan_leb(struct ubi_scan_leb *seb);
+
+/**
+ * ubi_alloc_scan_volume - allocate a &struct ubi_scan_volume object.
+ *
+ * This function returns a pointer to the newly allocated &struct
+ * ubi_scan_volume object in case of success, or %NULL in case of failure. The
+ * allocated object is not zeroed.
+ */
+struct ubi_scan_volume *ubi_alloc_scan_volume(void);
+
+/**
+ * ubi_free_scan_volume - free a &struct ubi_scan_volume object.
+ *
+ * @sv: a pointer to the &struct ubi_scan_volume object to free
+ */
+void ubi_free_scan_volume(struct ubi_scan_volume *sv);
+
+/**
+ * ubi_alloc_init - initialize the UBI memory allocation unit.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int __init ubi_alloc_init(void);
+
+/**
+ * ubi_alloc_close - close the UBI memory allocation unit.
+ */
+void ubi_alloc_close(void);
+
+#endif /* !__UBI_ALLOC_H__ */
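The comment for ubi_alloc_eba_ltree_entry() says the object is not zeroed but that @users and @mutex are set up by the slab constructor. As a rough user-space sketch of what that contract implies (this is not the kernel slab allocator; struct ltree_entry, struct obj_cache, cache_alloc() and cache_free() are all hypothetical names), a constructor runs once when an object is first created, recycled objects are handed out as they were freed, and any field the constructor does not touch may hold stale data:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ubi_eba_ltree_entry. */
struct ltree_entry {
	struct ltree_entry *next_free;	/* free-list link, used only while cached */
	int users;			/* set up by the constructor */
	int lnum;			/* NOT set up: the caller must assign it */
};

/* A toy object cache: a free list plus a constructor callback. */
struct obj_cache {
	struct ltree_entry *free_list;
	void (*ctor)(struct ltree_entry *le);
	int ctor_calls;			/* bookkeeping for the illustration */
};

/* Constructor: initializes only the fields the cache promises to set up. */
void ltree_ctor(struct ltree_entry *le)
{
	le->users = 0;
}

struct ltree_entry *cache_alloc(struct obj_cache *c)
{
	struct ltree_entry *le = c->free_list;

	if (le) {
		/* Recycled object: the constructor is not run again. */
		c->free_list = le->next_free;
		return le;
	}

	le = malloc(sizeof(*le));
	if (!le)
		return NULL;
	c->ctor(le);
	c->ctor_calls++;
	return le;
}

void cache_free(struct obj_cache *c, struct ltree_entry *le)
{
	if (!le)
		return;
	/* Contract: objects come back in their constructed state. */
	assert(le->users == 0);
	le->next_free = c->free_list;
	c->free_list = le;
}
```

The design consequence for callers is the one the header comment hints at: fields covered by the constructor can be relied on after allocation, every other field has to be assigned explicitly, and an object must be returned to its constructed state before it is freed.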

* [PATCH 12/44 take 2] [UBI] allocation unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (10 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 11/44 take 2] [UBI] allocation unit header Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 20:55   ` Arnd Bergmann
  2007-02-19 11:13   ` Pekka Enberg
  2007-02-17 16:55 ` [PATCH 13/44 take 2] [UBI] I/O unit header Artem Bityutskiy
                   ` (34 subsequent siblings)
  46 siblings, 2 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/alloc.c tmp-to/drivers/mtd/ubi/alloc.c
--- tmp-from/drivers/mtd/ubi/alloc.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/alloc.c	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,369 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/rwsem.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "io.h"
+#include "background.h"
+#include "wl.h"
+#include "debug.h"
+#include "eba.h"
+#include "scan.h"
+
+#define BGT_WORK_SLAB_NAME        "ubi_bgt_work_slab"
+#define WL_ERASE_WORK_SLAB_NAME   "ubi_wl_erase_work_slab"
+#define WL_ENTRY_SLAB_NAME        "ubi_wl_entry_slab"
+#define WL_PROT_ENTRY_SLAB_NAME   "ubi_wl_prot_entry_slab"
+#define EBA_LTREE_ENTRY_SLAB_NAME "ubi_eba_ltree_entry_slab"
+#define SCAN_EB_SLAB_NAME         "ubi_scan_leb"
+#define SCAN_VOLUME_SLAB_NAME     "ubi_scan_volume"
+
+static struct kmem_cache *bgt_work_slab;
+static struct kmem_cache *wl_erase_work_slab;
+static struct kmem_cache *wl_entries_slab;
+static struct kmem_cache *wl_prot_entry_slab;
+static struct kmem_cache *eba_ltree_entry_slab;
+static struct kmem_cache *scan_eb_slab;
+static struct kmem_cache *scan_volume_slab;
+
+void *ubi_kzalloc(size_t size)
+{
+	void *ret;
+
+	ret = kzalloc(size, GFP_KERNEL);
+	if (unlikely(!ret)) {
+		ubi_err("cannot allocate %zu bytes", size);
+		dump_stack();
+		return NULL;
+	}
+
+	return ret;
+}
+
+void *ubi_kmalloc(size_t size)
+{
+	void *ret;
+
+	ret = kmalloc(size, GFP_KERNEL);
+	if (unlikely(!ret)) {
+		ubi_err("cannot allocate %zu bytes", size);
+		dump_stack();
+		return NULL;
+	}
+
+	return ret;
+}
+
+void ubi_kfree(const void *obj)
+{
+	if (unlikely(!obj))
+		return;
+	kfree(obj);
+}
+
+struct ubi_ec_hdr *ubi_zalloc_ec_hdr(const struct ubi_info *ubi)
+{
+	struct ubi_ec_hdr *ec_hdr;
+	const struct ubi_io_info *io = ubi->io;
+
+	ec_hdr = kzalloc(io->ec_hdr_alsize, GFP_KERNEL);
+	if (unlikely(!ec_hdr)) {
+		ubi_err("cannot allocate %d bytes", io->ec_hdr_alsize);
+		dump_stack();
+		return NULL;
+	}
+
+	return ec_hdr;
+}
+
+void ubi_free_ec_hdr(const struct ubi_info *ubi, struct ubi_ec_hdr *ec_hdr)
+{
+	if (unlikely(!ec_hdr))
+		return;
+	kfree(ec_hdr);
+}
+
+struct ubi_vid_hdr *ubi_zalloc_vid_hdr(const struct ubi_info *ubi)
+{
+	char *vid_hdr;
+	const struct ubi_io_info *io = ubi->io;
+
+	vid_hdr = kzalloc(io->vid_hdr_alsize, GFP_KERNEL);
+	if (unlikely(!vid_hdr)) {
+		ubi_err("cannot allocate %d bytes", io->vid_hdr_alsize);
+		dump_stack();
+		return NULL;
+	}
+
+	/*
+	 * If VID headers are stored at non-aligned addresses, we have to shift
+	 * the pointer.
+	 */
+	if (likely(vid_hdr))
+		vid_hdr = vid_hdr + io->vid_hdr_shift;
+
+	return (struct ubi_vid_hdr *)vid_hdr;
+}
+
+void ubi_free_vid_hdr(const struct ubi_info *ubi, struct ubi_vid_hdr *vid_hdr)
+{
+	if (unlikely(!vid_hdr))
+		return;
+	vid_hdr = (struct ubi_vid_hdr *)((char *)vid_hdr - ubi->io->vid_hdr_shift);
+	kfree(vid_hdr);
+}
+
+struct ubi_bgt_work *ubi_alloc_bgt_work(void)
+{
+	struct ubi_bgt_work *wrk;
+
+	wrk = kmem_cache_alloc(bgt_work_slab, GFP_KERNEL);
+	if (unlikely(!wrk)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+	return wrk;
+}
+
+void ubi_free_bgt_work(struct ubi_bgt_work *wrk)
+{
+	if (unlikely(!wrk))
+		return;
+	kmem_cache_free(bgt_work_slab, wrk);
+}
+
+struct ubi_wl_erase_work *ubi_alloc_wl_erase_work(void)
+{
+	struct ubi_wl_erase_work *wrk;
+
+	wrk = kmem_cache_alloc(wl_erase_work_slab, GFP_KERNEL);
+	if (unlikely(!wrk)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+	return wrk;
+}
+
+void ubi_free_wl_erase_work(struct ubi_wl_erase_work *wrk)
+{
+	if (unlikely(!wrk))
+		return;
+	kmem_cache_free(wl_erase_work_slab, wrk);
+}
+
+struct ubi_wl_entry *ubi_alloc_wl_entry(void)
+{
+	struct ubi_wl_entry *wle;
+
+	wle = kmem_cache_alloc(wl_entries_slab, GFP_KERNEL);
+	if (unlikely(!wle)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+	return wle;
+}
+
+void ubi_free_wl_entry(struct ubi_wl_entry *wle)
+{
+	if (unlikely(!wle))
+		return;
+	kmem_cache_free(wl_entries_slab, wle);
+}
+
+struct ubi_wl_prot_entry *ubi_alloc_wl_prot_entry(void)
+{
+	struct ubi_wl_prot_entry *pe;
+
+	pe = kmem_cache_alloc(wl_prot_entry_slab, GFP_KERNEL);
+	if (unlikely(!pe)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+	return pe;
+}
+
+void ubi_free_wl_prot_entry(struct ubi_wl_prot_entry *pe)
+{
+	if (unlikely(!pe))
+		return;
+	kmem_cache_free(wl_prot_entry_slab, pe);
+}
+
+struct ubi_eba_ltree_entry *ubi_alloc_eba_ltree_entry(void)
+{
+	struct ubi_eba_ltree_entry *le;
+
+	le = kmem_cache_alloc(eba_ltree_entry_slab, GFP_KERNEL);
+	if (unlikely(!le)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+	return le;
+}
+
+void ubi_free_eba_ltree_entry(struct ubi_eba_ltree_entry *le)
+{
+	if (unlikely(!le))
+		return;
+	kmem_cache_free(eba_ltree_entry_slab, le);
+}
+
+struct ubi_scan_leb *ubi_alloc_scan_leb(void)
+{
+	struct ubi_scan_leb *seb;
+
+	seb = kmem_cache_alloc(scan_eb_slab, GFP_KERNEL);
+	if (unlikely(!seb)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+
+	return seb;
+}
+
+void ubi_free_scan_leb(struct ubi_scan_leb *seb)
+{
+	if (unlikely(!seb))
+		return;
+	kmem_cache_free(scan_eb_slab, seb);
+}
+
+struct ubi_scan_volume *ubi_alloc_scan_volume(void)
+{
+	struct ubi_scan_volume *sv;
+
+	sv = kmem_cache_alloc(scan_volume_slab, GFP_KERNEL);
+	if (unlikely(!sv)) {
+		ubi_err("failed to allocate memory");
+		dump_stack();
+		return NULL;
+	}
+
+	return sv;
+}
+
+void ubi_free_scan_volume(struct ubi_scan_volume *sv)
+{
+	if (unlikely(!sv))
+		return;
+	kmem_cache_free(scan_volume_slab, sv);
+}
+
+static void ltree_entry_ctor(void *obj, struct kmem_cache *cache,
+			     unsigned long flags);
+
+int __init ubi_alloc_init(void)
+{
+	const char *name;
+	size_t size;
+
+	name = BGT_WORK_SLAB_NAME;
+	size = sizeof(struct ubi_bgt_work);
+	bgt_work_slab = kmem_cache_create(name, size, 0, 0, NULL, NULL);
+	if (!bgt_work_slab)
+		goto out;
+
+	name = WL_ERASE_WORK_SLAB_NAME;
+	size = sizeof(struct ubi_wl_erase_work);
+	wl_erase_work_slab = kmem_cache_create(name, size, 0, 0, NULL, NULL);
+	if (!wl_erase_work_slab)
+		goto out;
+
+	name = WL_ENTRY_SLAB_NAME;
+	size = sizeof(struct ubi_wl_entry);
+	wl_entries_slab = kmem_cache_create(name, size, 0, 0, NULL, NULL);
+	if (!wl_entries_slab)
+		goto out;
+
+	name = WL_PROT_ENTRY_SLAB_NAME;
+	size = sizeof(struct ubi_wl_prot_entry);
+	wl_prot_entry_slab = kmem_cache_create(name, size, 0, 0, NULL, NULL);
+	if (!wl_prot_entry_slab)
+		goto out;
+
+	name = EBA_LTREE_ENTRY_SLAB_NAME;
+	size = sizeof(struct ubi_eba_ltree_entry);
+	eba_ltree_entry_slab = kmem_cache_create(name, size, 0, 0,
+						 &ltree_entry_ctor, NULL);
+	if (!eba_ltree_entry_slab)
+		goto out;
+
+	name = SCAN_EB_SLAB_NAME;
+	size = sizeof(struct ubi_scan_leb);
+	scan_eb_slab = kmem_cache_create(name, size, 0, 0, NULL, NULL);
+	if (!scan_eb_slab)
+		goto out;
+
+	name = SCAN_VOLUME_SLAB_NAME;
+	size = sizeof(struct ubi_scan_volume);
+	scan_volume_slab = kmem_cache_create(name, size, 0, 0, NULL, NULL);
+	if (!scan_volume_slab)
+		goto out;
+
+	return 0;
+
+out:
+	ubi_err("cannot create \"%s\" slab", name);
+	ubi_alloc_close();
+	return -ENOMEM;
+}
+
+void ubi_alloc_close(void)
+{
+	if (scan_volume_slab)
+		kmem_cache_destroy(scan_volume_slab);
+	if (scan_eb_slab)
+		kmem_cache_destroy(scan_eb_slab);
+	if (eba_ltree_entry_slab)
+		kmem_cache_destroy(eba_ltree_entry_slab);
+	if (wl_prot_entry_slab)
+		kmem_cache_destroy(wl_prot_entry_slab);
+	if (wl_entries_slab)
+		kmem_cache_destroy(wl_entries_slab);
+	if (wl_erase_work_slab)
+		kmem_cache_destroy(wl_erase_work_slab);
+	if (bgt_work_slab)
+		kmem_cache_destroy(bgt_work_slab);
+}
+
+/* Lock tree entries slab cache constructor */
+static void ltree_entry_ctor(void *obj, struct kmem_cache *cache,
+			     unsigned long flags)
+{
+	struct ubi_eba_ltree_entry *le = obj;
+
+	if ((flags & (SLAB_CTOR_VERIFY | SLAB_CTOR_CONSTRUCTOR)) !=
+	    SLAB_CTOR_CONSTRUCTOR)
+		return;
+
+	le->users = 0;
+	init_rwsem(&le->mutex);
+}
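The VID header allocation above returns a pointer shifted into the middle of the allocated buffer, and the free routine has to undo that shift before handing the pointer back to the allocator. A minimal user-space sketch of that pairing follows, assuming a hypothetical struct io_geom in place of struct ubi_io_info and calloc()/free() in place of kzalloc()/kfree(); hdr_zalloc() and hdr_free() are illustrative names, not functions from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical geometry; in the patch these fields live in struct ubi_io_info. */
struct io_geom {
	int vid_hdr_alsize;	/* size of the aligned buffer to allocate */
	int vid_hdr_shift;	/* offset of the header inside that buffer */
};

/* Allocate a zeroed aligned buffer and return a pointer to the header in it. */
void *hdr_zalloc(const struct io_geom *io)
{
	char *buf;

	assert(io->vid_hdr_shift >= 0 &&
	       io->vid_hdr_shift < io->vid_hdr_alsize);

	buf = calloc(1, io->vid_hdr_alsize);
	if (!buf)
		return NULL;

	/* Hand out a pointer into the middle of the buffer. */
	return buf + io->vid_hdr_shift;
}

/* Free a header obtained from hdr_zalloc() with the same geometry. */
void hdr_free(const struct io_geom *io, void *hdr)
{
	if (!hdr)
		return;

	/* Undo the shift: free() must see the address calloc() returned. */
	free((char *)hdr - io->vid_hdr_shift);
}
```

The point of the sketch is that the shift must be identical on both sides, which is presumably why both functions in the patch take the UBI description object even though the free routine otherwise has no use for it.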

* [PATCH 13/44 take 2] [UBI] I/O unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (11 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 12/44 take 2] [UBI] allocation unit implementation Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 16:55 ` [PATCH 14/44 take 2] [UBI] I/O unit implementation Artem Bityutskiy
                   ` (33 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/io.h tmp-to/drivers/mtd/ubi/io.h
--- tmp-from/drivers/mtd/ubi/io.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/io.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,419 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ * Copyright (C) Nokia Corporation, 2006,2007
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI input/output unit.
+ *
+ * This unit provides a uniform way to work with all kinds of the underlying
+ * MTD devices. It also implements handy functions for reading and writing UBI
+ * headers.
+ *
+ * We try to keep a paranoid mindset and do not trust what we read from the
+ * flash media, in order to be more secure and robust. For example, we do not
+ * want a vulnerability where an attacker feeds us an inconsistent image and
+ * ends up causing a buffer overflow. This is especially important if UBI is
+ * ever used on removable media. So, UBI validates every single node it reads
+ * from the flash media.
+ */
+
+#ifndef __UBI_IO_H__
+#define __UBI_IO_H__
+
+struct ubi_info;
+struct ubi_ec_hdr;
+struct ubi_vid_hdr;
+struct ubi_io_info;
+struct mtd_info;
+
+/*
+ * Error codes returned by this unit.
+ *
+ * @UBI_IO_PEB_EMPTY: the physical eraseblock is empty, i.e. it contains only
+ * %0xFF bytes
+ * @UBI_IO_PEB_FREE: the physical eraseblock is free, i.e. it contains only a
+ * valid erase counter header, and the rest are %0xFF bytes
+ * @UBI_IO_BAD_EC_HDR: the erase counter header is corrupted (bad magic or CRC)
+ * @UBI_IO_BAD_VID_HDR: the volume identifier header is corrupted (bad magic or
+ * CRC)
+ * @UBI_IO_BITFLIPS: bit-flips were detected and corrected
+ */
+enum {
+	UBI_IO_PEB_EMPTY = 1,
+	UBI_IO_PEB_FREE,
+	UBI_IO_BAD_EC_HDR,
+	UBI_IO_BAD_VID_HDR,
+	UBI_IO_BITFLIPS
+};
+
+/**
+ * ubi_io_read - read data from a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @buf: buffer where to store the read data
+ * @pnum: physical eraseblock number to read from
+ * @offset: offset within the physical eraseblock from where to read
+ * @len: how many bytes to read
+ *
+ * This function reads data from offset @offset of physical eraseblock @pnum
+ * and stores the read data in the @buf buffer. The following return codes are
+ * possible:
+ *
+ * o %0 if all the requested data were successfully read;
+ * o %UBI_IO_BITFLIPS if all the requested data were successfully read, but
+ *   correctable bit-flips were detected; this is harmless but may indicate
+ *   that this eraseblock may become bad soon;
+ * o %-EBADMSG if the MTD subsystem reported data integrity problems, for
+ *   example an ECC error in case of NAND; this most probably means that the
+ *   data is corrupted;
+ * o %-EIO if some I/O error occurred;
+ * o Other negative error codes in case of other errors.
+ */
+int ubi_io_read(const struct ubi_info *ubi, void *buf, int pnum, int offset,
+		int len);
+
+/**
+ * ubi_io_read_data - read logical eraseblock data from a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @buf: buffer where to store the read data
+ * @pnum: physical eraseblock number to read from
+ * @offset: offset within the logical eraseblock from where to read
+ * @len: how many bytes to read
+ *
+ * This function is equivalent to 'ubi_io_read()', but @offset is relative to
+ * the beginning of the logical eraseblock, not to the beginning of the
+ * physical eraseblock.
+ */
+#define ubi_io_read_data(ubi, buf, pnum, offset, len)  ({                    \
+	int __err;                                                           \
+	ubi_assert((offset) >= 0);                                           \
+	__err = ubi_io_read(ubi, buf, pnum, (offset) + (ubi)->io->leb_start, \
+			    len);                                            \
+	__err = __err;                                                       \
+})
+
+/**
+ * ubi_io_write - write data to a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @buf: buffer with the data to write
+ * @pnum: physical eraseblock number to write to
+ * @offset: offset within the physical eraseblock where to write
+ * @len: how many bytes to write
+ *
+ * This function writes @len bytes of data from buffer @buf to offset @offset
+ * of physical eraseblock @pnum. If all the data were successfully written,
+ * zero is returned. If an error occurred, this function returns a negative
+ * error code. If %-EIO is returned, the physical eraseblock most probably went
+ * bad.
+ *
+ * Note, in case of an error, it is possible that something was still written
+ * to the flash media, but it may be garbage.
+ */
+int ubi_io_write(const struct ubi_info *ubi, const void *buf, int pnum,
+		 int offset, int len);
+
+/**
+ * ubi_io_write_data - write logical eraseblock data to a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @buf: buffer with the data to write
+ * @pnum: physical eraseblock number to write to
+ * @offset: offset within the logical eraseblock where to write
+ * @len: how many bytes to write
+ *
+ * This function is equivalent to 'ubi_io_write()', but @offset is relative to
+ * the beginning of the logical eraseblock, not to the beginning of the
+ * physical eraseblock.
+ */
+#define ubi_io_write_data(ubi, buf, pnum, offset, len) ({                     \
+	int __err;                                                            \
+	ubi_assert((offset) >= 0);                                            \
+	__err = ubi_io_write(ubi, buf, pnum, (offset) + (ubi)->io->leb_start, \
+			     len);                                            \
+	__err = __err;                                                        \
+})
+
+/**
+ * ubi_io_sync_erase - synchronously erase a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock number to erase
+ * @torture: if this physical eraseblock has to be tortured
+ *
+ * This function synchronously erases physical eraseblock @pnum. If @torture
+ * flag is not zero, the physical eraseblock is checked by means of writing
+ * different patterns and reading them back. If the torturing is enabled, the
+ * physical eraseblock is erased more than once.
+ *
+ * This function returns the number of erasures made in case of success, %-EIO
+ * if erasure failed or the torturing test failed, and other negative error
+ * codes in case of other errors. Note, %-EIO most probably means that the
+ * physical eraseblock is bad.
+ */
+int ubi_io_sync_erase(const struct ubi_info *ubi, int pnum, int torture);
+
+/**
+ * ubi_io_is_bad - check if a physical eraseblock is bad.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to check
+ *
+ * This function returns a positive number if the physical eraseblock is bad,
+ * zero if not, and a negative error code if an error occurred.
+ */
+int ubi_io_is_bad(const struct ubi_info *ubi, int pnum);
+
+/**
+ * ubi_io_mark_bad - mark a physical eraseblock as bad.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to mark
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_io_mark_bad(const struct ubi_info *ubi, int pnum);
+
+/**
+ * ubi_io_read_ec_hdr - read and check an erase counter header.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock to read from
+ * @ec_hdr: a &struct ubi_ec_hdr object where to store the read erase counter
+ * header
+ * @verbose: be verbose if the header is corrupted or was not found
+ *
+ * This function reads erase counter header from physical eraseblock @pnum and
+ * stores it in @ec_hdr. This function also checks CRC checksum of the read
+ * erase counter header. The following codes may be returned:
+ *
+ * o %0 if the CRC checksum is correct and the header was successfully read;
+ * o %UBI_IO_BITFLIPS if the CRC is correct, but bit-flips were detected
+ *   and corrected by the flash driver; this is harmless but may indicate that
+ *   this eraseblock may become bad soon;
+ * o %UBI_IO_BAD_EC_HDR if the erase counter header is corrupted (a CRC error);
+ * o %UBI_IO_PEB_EMPTY if the physical eraseblock is empty;
+ * o %-EIO if an input/output error occurred and this function failed to read
+ *   the erase counter header; this may indicate serious problems;
+ * o other negative values in case of other errors.
+ */
+int ubi_io_read_ec_hdr(const struct ubi_info *ubi, int pnum,
+		       struct ubi_ec_hdr *ec_hdr, int verbose);
+
+/**
+ * ubi_io_write_ec_hdr - write an erase counter header.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock to write to
+ * @ec_hdr: the erase counter header to write
+ *
+ * This function writes the erase counter header described by @ec_hdr to
+ * physical eraseblock @pnum. This function also fills most fields of @ec_hdr
+ * before writing, so the caller does not have to fill them. Callers must only
+ * fill the @ec_hdr->ec field.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure. If %-EIO is returned, the physical eraseblock most probably
+ * went bad.
+ */
+int ubi_io_write_ec_hdr(const struct ubi_info *ubi, int pnum,
+			struct ubi_ec_hdr *ec_hdr);
+
+/**
+ * ubi_io_read_vid_hdr - read and check a volume identifier header.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock number to read from
+ * @vid_hdr: &struct ubi_vid_hdr object where to store the read volume
+ * identifier header
+ * @verbose: be verbose if the header is corrupted or wasn't found
+ *
+ * This function reads the volume identifier header from physical eraseblock
+ * @pnum and stores it in @vid_hdr. It also checks CRC checksum of the read
+ * volume identifier header. The following codes may be returned:
+ *
+ * o %0 if the CRC checksum is correct and the header was successfully read;
+ * o %UBI_IO_BITFLIPS if the CRC is correct, but bit-flips were detected
+ *   and corrected by the flash driver; this is harmless but may indicate that
+ *   this eraseblock may become bad soon;
+ * o %UBI_IO_BAD_VID_HDR if the volume identifier header is corrupted (a CRC
+ *   error detected);
+ * o %UBI_IO_PEB_FREE if the physical eraseblock is free (i.e., there is no VID
+ *   header there);
+ * o %-EIO if an input/output error occurred and this function failed to read
+ *   the volume identifier header; this indicates some serious problems;
+ * o other negative values in case of other errors.
+ */
+int ubi_io_read_vid_hdr(const struct ubi_info *ubi, int pnum,
+			struct ubi_vid_hdr *vid_hdr, int verbose);
+
+/**
+ * ubi_io_write_vid_hdr - write a volume identifier header.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to write to
+ * @vid_hdr: the contents of the volume identifier header
+ *
+ * This function writes the volume identifier header described by @vid_hdr to
+ * physical eraseblock @pnum. Callers do not have to fill all the fields of the
+ * passed volume identifier header object as this function fills many of them
+ * automatically. Callers have to fill only @vid_hdr->vol_type,
+ * @vid_hdr->leb_ver, @vid_hdr->vol_id, @vid_hdr->lnum, @vid_hdr->compat,
+ * @vid_hdr->data_size, @vid_hdr->used_ebs, and @vid_hdr->data_pad fields.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure. If %-EIO is returned, the physical eraseblock probably went
+ * bad.
+ */
+int ubi_io_write_vid_hdr(const struct ubi_info *ubi, int pnum,
+			 struct ubi_vid_hdr *vid_hdr);
+
+/**
+ * ubi_io_init - initialize the UBI I/O unit for a given UBI device.
+ *
+ * @ubi: the UBI device description object
+ * @mtd_num: the underlying MTD device number
+ * @vid_hdr_offset: volume identifier header offset
+ * @data_offset: logical eraseblock data offset
+ *
+ * If the @vid_hdr_offset and @data_offset parameters are zero, the default
+ * offsets are assumed:
+ *   o the EC header is always at offset zero - this cannot be changed
+ *   o the VID header starts after the EC header at the closest address aligned
+ *   to @io->@hdrs_min_io_size
+ *   o data starts after the VID header at the closest address aligned to
+ *   @io->@min_io_size
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_io_init(struct ubi_info *ubi, int mtd_num, int vid_hdr_offset,
+		int data_offset);
+
+/**
+ * ubi_io_close - close the UBI I/O unit for a given UBI device.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_io_close(const struct ubi_info *ubi);
+
+/**
+ * struct ubi_io_info - UBI I/O unit description data structure.
+ *
+ * @flash_size: the underlying MTD device size (in bytes)
+ * @peb_count: count of physical eraseblocks on the MTD device
+ * @peb_size: physical eraseblock size
+ * @bad_peb_count: count of bad physical eraseblocks
+ * @good_peb_count: count of good physical eraseblocks
+ * @min_io_size: minimal input/output unit size of the underlying MTD device
+ * @hdrs_min_io_size: minimal I/O unit size used for VID and EC headers
+ * @ro_mode: if the UBI device is in read-only mode
+ * @leb_size: logical eraseblock size
+ * @leb_start: starting offset of logical eraseblocks within physical
+ * eraseblocks
+ * @ec_hdr_alsize: size of the EC header aligned to @hdrs_min_io_size
+ * @vid_hdr_alsize: size of the VID header aligned to @hdrs_min_io_size
+ * @vid_hdr_offset: starting offset of the volume identifier header (might be
+ * unaligned)
+ * @vid_hdr_aloffset: starting offset of the VID header aligned to
+ * @hdrs_min_io_size
+ * @vid_hdr_shift: contains @vid_hdr_offset - @vid_hdr_aloffset
+ * @bad_allowed: whether the underlying MTD device admits of bad physical
+ * eraseblocks or not
+ * @mtd_num: the underlying MTD device number
+ * @mtd_name: the underlying MTD device name
+ * @mtd: the underlying MTD device descriptor
+ *
+ * The erase counter header is always stored at offset zero. By default, the
+ * VID header is stored after the EC header at the closest aligned offset
+ * (note, aligned means aligned to the minimum I/O unit size in this context).
+ * Data starts next to the VID header at the closest aligned offset. But for
+ * different reasons (e.g., optimization), UBI may be asked to put the VID
+ * header at another offset, and even at an unaligned offset. Of course, if the
+ * offset of the VID header is unaligned, UBI adds proper padding before it.
+ *
+ * About minimal I/O units. In general, UBI assumes a flash device model where
+ * there is only one minimal I/O unit size. E.g., in case of NOR flash it is 1,
+ * in case of NAND flash it is a NAND page, etc. This is reported by MTD in the
+ * @writesize field. But as an exception, UBI admits of using another
+ * (smaller) minimal I/O unit size for EC and VID headers to make it possible
+ * to do different optimizations.
+ *
+ * For example, this is extremely useful in case of NAND flashes which admit of
+ * several write operations to one NAND page. In this case UBI can fit EC and
+ * VID headers at one NAND page. So, in case of such NAND flashes UBI uses
+ * "sub-page" size as the minimal I/O unit for the headers (the
+ * @hdrs_min_io_size field). But it still reports NAND page size (min_io_size)
+ * as a minimal I/O unit for the UBI users.
+ *
+ * Example: some Samsung NANDs with 2KiB pages allow 4x 512-byte writes, so
+ * although the minimal I/O unit is 2KiB, UBI uses 512 bytes for EC and VID
+ * headers.
+ *
+ * Q: why not just treat a sub-page as the minimal I/O unit of this flash
+ * device, e.g., make @min_io_size = 512 in the example above?
+ *
+ * A: because when writing a sub-page, MTD still writes a full 2KiB page but
+ * the bytes which are not relevant to the sub-page are 0xFF. So, basically,
+ * writing 4x512 sub-pages is 4 times slower than writing one 2KiB NAND page.
+ * Thus, we prefer to use sub-pages only for EC and VID headers.
+ *
+ * As noted above, the VID header may start at a non-aligned offset. For
+ * example, in case of a 2KiB-per-page NAND flash with 512-byte sub-pages,
+ * the VID header may reside at offset 1984, which is the last 64 bytes of
+ * the last sub-page (the EC header is always at offset zero). This causes
+ * some difficulties when reading and writing VID headers.
+ *
+ * Suppose we have a 64-byte buffer and we read the VID header into it. We
+ * change the data and want to write this VID header to some eraseblock. As we
+ * can only write in 512-byte chunks, we have to allocate one more buffer and
+ * copy our VID header to offset 448 of this buffer.
+ *
+ * The I/O unit, together with the memory allocation unit, does the following
+ * trick in order to avoid this extra copy. The memory allocation unit
+ * allocates a @vid_hdr_alsize bytes buffer for the VID header and returns a
+ * pointer to offset @vid_hdr_shift of this buffer. When writing VID headers,
+ * we shift the VID header pointer back and write the whole sub-page.
+ */
+struct ubi_io_info {
+	long long flash_size; /* public  */
+	int peb_count;        /* public  */
+	int peb_size;         /* public  */
+	int bad_peb_count;    /* public  */
+	int good_peb_count;   /* public  */
+	int min_io_size;      /* public  */
+	int hdrs_min_io_size; /* public  */
+	int ro_mode;          /* public  */
+	int leb_size;         /* public  */
+	int leb_start;        /* public  */
+	int ec_hdr_alsize;    /* public  */
+	int vid_hdr_alsize;   /* public  */
+	int vid_hdr_offset;   /* public  */
+	int vid_hdr_aloffset; /* public  */
+	int vid_hdr_shift;    /* public  */
+	int bad_allowed;      /* public  */
+	int mtd_num;          /* public  */
+	const char *mtd_name; /* public  */
+	struct mtd_info *mtd; /* private */
+};
+
+#endif /* !__UBI_IO_H__ */

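For readers following the offset arithmetic described in the struct
ubi_io_info comment above, here is a small standalone user-space sketch that
computes the default layout and the VID header shift. It is not part of the
patch. The struct layout type and the helpers align_up() and compute_layout()
are hypothetical names made up for this illustration, and the 64-byte EC
header size and the 128KiB eraseblock size used in the usage note are
assumptions (the comment above only states 64 bytes for the VID header).

```c
#include <assert.h>

/* Hypothetical mirror of a few struct ubi_io_info fields (illustration only) */
struct layout {
	int ec_hdr_alsize;
	int vid_hdr_alsize;
	int vid_hdr_offset;
	int vid_hdr_aloffset;
	int vid_hdr_shift;
	int leb_start;
	int leb_size;
};

/* Round x up to the next multiple of a */
static int align_up(int x, int a)
{
	assert(a > 0);
	return (x + a - 1) / a * a;
}

/*
 * Compute the layout. A zero vid_hdr_offset requests the default placement
 * right after the aligned EC header, as the ubi_io_init() comment describes.
 * Data is assumed to start at the first min_io_size boundary after the VID
 * header.
 */
static struct layout compute_layout(int peb_size, int min_io_size,
				    int hdrs_min_io_size, int ec_hdr_size,
				    int vid_hdr_size, int vid_hdr_offset)
{
	struct layout l;

	l.ec_hdr_alsize = align_up(ec_hdr_size, hdrs_min_io_size);
	l.vid_hdr_alsize = align_up(vid_hdr_size, hdrs_min_io_size);
	if (vid_hdr_offset == 0)
		vid_hdr_offset = l.ec_hdr_alsize;
	l.vid_hdr_offset = vid_hdr_offset;
	/* Round down to the start of the sub-page containing the header */
	l.vid_hdr_aloffset = vid_hdr_offset / hdrs_min_io_size *
			     hdrs_min_io_size;
	l.vid_hdr_shift = vid_hdr_offset - l.vid_hdr_aloffset;
	l.leb_start = align_up(vid_hdr_offset + vid_hdr_size, min_io_size);
	l.leb_size = peb_size - l.leb_start;
	return l;
}
```

With the numbers from the comment, compute_layout(131072, 2048, 512, 64, 64,
1984) yields vid_hdr_aloffset 1536, vid_hdr_shift 448 and leb_start 2048,
which matches the "offset 448" of the buffer mentioned above. Passing 0 as
the last argument gives the default placement: VID header at offset 512 and
data at offset 2048.
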
* [PATCH 14/44 take 2] [UBI] I/O unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (12 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 13/44 take 2] [UBI] I/O unit header Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 16:55 ` [PATCH 15/44 take 2] [UBI] scanning unit header Artem Bityutskiy
                   ` (32 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

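The header read path in the patch below makes its decision in a fixed order:
magic first, then the all-0xFF test, then the CRC. The following standalone
sketch models that order for the EC header as a pure function. It is only a
reading aid under stated assumptions: ec_hdr_verdict() and its boolean
parameters are hypothetical, the enum simply repeats the values from io.h in
this series, and the paranoid checks and the validate_ec_hdr() step are left
out on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Same values as the enum in io.h from this series */
enum {
	UBI_IO_PEB_EMPTY = 1,
	UBI_IO_PEB_FREE,
	UBI_IO_BAD_EC_HDR,
	UBI_IO_BAD_VID_HDR,
	UBI_IO_BITFLIPS
};

/*
 * Hypothetical model of the decision taken after the raw read.
 * @read_err is 0, UBI_IO_BITFLIPS or -EBADMSG; any other read error is
 * returned to the caller before this point is reached.
 */
static int ec_hdr_verdict(int read_err, bool magic_ok, bool all_ff,
			  bool crc_ok)
{
	assert(read_err == 0 || read_err == UBI_IO_BITFLIPS ||
	       read_err == -EBADMSG);

	if (!magic_ok) {
		/* All-0xFF data means "empty" only without an ECC error */
		if (read_err != -EBADMSG && all_ff)
			return UBI_IO_PEB_EMPTY;
		return UBI_IO_BAD_EC_HDR;
	}

	if (!crc_ok)
		return UBI_IO_BAD_EC_HDR;

	/* The header is intact; a noisy read is reported as a bit-flip */
	return read_err ? UBI_IO_BITFLIPS : 0;
}
```

Note how an -EBADMSG read that still passes the magic and CRC checks is
downgraded to UBI_IO_BITFLIPS, while the same error on an all-0xFF buffer is
reported as a bad header rather than an empty eraseblock.
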
diff -auNrp tmp-from/drivers/mtd/ubi/io.c tmp-to/drivers/mtd/ubi/io.c
--- tmp-from/drivers/mtd/ubi/io.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/io.c	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,1297 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ * Copyright (C) Nokia Corporation, 2006,2007
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/crc32.h>
+#include <linux/wait.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/mtd/mtd.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "io.h"
+#include "misc.h"
+#include "debug.h"
+
+/*
+ * In case of an input/output error, UBI tries to repeat the operation several
+ * times before returning an error. The constant below defines how many times
+ * UBI retries.
+ */
+#define IO_RETRIES 3
+
+/*
+ * "Paranoid" checks of the UBI I/O unit. Note, they substantially slow down
+ * the system.
+ */
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_IO
+
+static int paranoid_check_not_bad(const struct ubi_info *ubi, int pnum);
+static int paranoid_check_peb_ec_hdr(const struct ubi_info *ubi, int pnum);
+static int paranoid_check_ec_hdr(const struct ubi_info *ubi, int pnum,
+				 const struct ubi_ec_hdr *ec_hdr);
+static int paranoid_check_peb_vid_hdr(const struct ubi_info *ubi, int pnum);
+static int paranoid_check_vid_hdr(const struct ubi_info *ubi, int pnum,
+				  const struct ubi_vid_hdr *vid_hdr);
+static int paranoid_check_all_ff(const struct ubi_info *ubi, int pnum,
+				 int offset, int len);
+#else
+#define paranoid_check_not_bad(ubi, pnum) 0
+#define paranoid_check_peb_ec_hdr(ubi, pnum)  0
+#define paranoid_check_ec_hdr(ubi, pnum, ec_hdr)  0
+#define paranoid_check_peb_vid_hdr(ubi, pnum) 0
+#define paranoid_check_vid_hdr(ubi, pnum, vid_hdr) 0
+#define paranoid_check_all_ff(ubi, pnum, offset, len) 0
+#endif /* !CONFIG_MTD_UBI_DEBUG_PARANOID_IO */
+
+/**
+ * mtd_read - read data from flash.
+ *
+ * @io: I/O unit description object
+ * @buf: a buffer where to store the read data
+ * @pnum: physical eraseblock number to read from
+ * @offset: offset to read from
+ * @len: how many bytes to read
+ * @read: how many bytes were actually read is returned here
+ *
+ * This is a simple wrapper over mtd->read().
+ */
+static inline int mtd_read(const struct ubi_io_info *io, void *buf,
+			   int pnum, int offset, int len, size_t *read)
+{
+	loff_t addr = (loff_t)pnum * io->peb_size + offset;
+
+	return io->mtd->read(io->mtd, addr, len, read, buf);
+}
+
+/**
+ * mtd_write - write data to flash.
+ *
+ * @io: I/O unit description object
+ * @buf: the data to write
+ * @pnum: physical eraseblock number to write to
+ * @offset: offset to write to
+ * @len: how many bytes to write
+ * @written: how many bytes were actually written
+ *
+ * This is a simple wrapper over mtd->write().
+ */
+static inline int mtd_write(const struct ubi_io_info *io, const void *buf,
+			    int pnum, int offset, int len, size_t *written)
+{
+	loff_t addr = (loff_t)pnum * io->peb_size + offset;
+
+	return io->mtd->write(io->mtd, addr, len, written, buf);
+}
+
+int ubi_io_read(const struct ubi_info *ubi, void *buf, int pnum, int offset,
+		int len)
+{
+	int err, tries = 0;
+	size_t read;
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_io("read %d bytes from PEB %d:%d", len, pnum, offset);
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+	ubi_assert(offset >= 0 && offset + len <= io->peb_size);
+	ubi_assert(len > 0);
+
+	err = paranoid_check_not_bad(ubi, pnum);
+	if (unlikely(err))
+		return err > 0 ? -EINVAL : err;
+
+retry:
+	err = mtd_read(io, buf, pnum, offset, len, &read);
+	if (unlikely(err)) {
+		if (err == -EUCLEAN) {
+			/*
+			 * -EUCLEAN is reported if there was a bit-flip which
+			 * was corrected, so this is harmless.
+			 */
+			dbg_io("bit-flip occurred");
+			return UBI_IO_BITFLIPS;
+		}
+
+		if (read != len && tries++ < IO_RETRIES) {
+			yield();
+			dbg_io("error %d while reading %d bytes from PEB %d:%d, "
+			       "read only %zd bytes, retry",
+			       err, len, pnum, offset, read);
+			goto retry;
+		}
+
+		ubi_err("error %d while reading %d bytes from PEB %d:%d, "
+			"read %zd bytes", err, len, pnum, offset, read);
+		ubi_dbg_dump_stack();
+	} else {
+		ubi_assert(len == read);
+
+		/*
+		 * The below is just for debugging and is compiled out if
+		 * disabled.
+		 */
+		if (ubi_dbg_is_bitflip()) {
+			dbg_io("emulate bit-flip");
+			err = UBI_IO_BITFLIPS;
+		}
+	}
+
+	return err;
+}
+
+int ubi_io_write(const struct ubi_info *ubi, const void *buf, int pnum,
+		 int offset, int len)
+{
+	int err;
+	size_t written;
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_io("write %d bytes to PEB %d:%d", len, pnum, offset);
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+	ubi_assert(offset >= 0 && offset + len <= io->peb_size);
+	ubi_assert(offset % io->hdrs_min_io_size == 0);
+	ubi_assert(len > 0 && len % io->hdrs_min_io_size == 0);
+
+	if (unlikely(io->ro_mode)) {
+		ubi_err("read-only mode");
+		return -EROFS;
+	}
+
+	/* The below has to be compiled out if paranoid checks are disabled */
+
+	err = paranoid_check_not_bad(ubi, pnum);
+	if (unlikely(err))
+		return err > 0 ? -EINVAL : err;
+
+	/* The area we are writing to has to contain all 0xFF bytes */
+	err = paranoid_check_all_ff(ubi, pnum, offset, len);
+	if (unlikely(err))
+		return err > 0 ? -EINVAL : err;
+
+	if (offset >= io->leb_start) {
+		/*
+		 * We write to the data area of the physical eraseblock. Make
+		 * sure it has valid EC and VID headers.
+		 */
+		err = paranoid_check_peb_ec_hdr(ubi, pnum);
+		if (unlikely(err))
+			return err > 0 ? -EINVAL : err;
+		err = paranoid_check_peb_vid_hdr(ubi, pnum);
+		if (unlikely(err))
+			return err > 0 ? -EINVAL : err;
+	}
+
+	/* The below is just for debugging and is compiled out if disabled */
+	if (ubi_dbg_is_write_failure()) {
+		ubi_err("cannot write %d bytes to PEB %d:%d "
+			"(emulated)", len, pnum, offset);
+		ubi_dbg_dump_stack();
+		return -EIO;
+	}
+
+	err = mtd_write(io, buf, pnum, offset, len, &written);
+	if (unlikely(err)) {
+		ubi_err("error %d while writing %d bytes to PEB %d:%d, written"
+			" %zd bytes", err, len, pnum, offset, written);
+		ubi_dbg_dump_stack();
+	} else
+		ubi_assert(written == len);
+
+	return err;
+}
+
+static void erase_callback(struct erase_info *ei)
+{
+	wake_up_interruptible((wait_queue_head_t *)ei->priv);
+}
+
+static int sync_erase(const struct ubi_info *ubi, int pnum);
+static int ubi_io_torture_peb(const struct ubi_info *ubi, int pnum);
+
+int ubi_io_sync_erase(const struct ubi_info *ubi, int pnum, int torture)
+{
+	int err, ret = 0;
+
+	ubi_assert(pnum >= 0 && pnum < ubi->io->peb_count);
+
+	if (unlikely(ubi->io->ro_mode)) {
+		ubi_err("read-only mode");
+		return -EROFS;
+	}
+
+	if (torture) {
+		ret = ubi_io_torture_peb(ubi, pnum);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	err = sync_erase(ubi, pnum);
+	if (unlikely(err))
+		return err;
+
+	return ret + 1;
+}
+
+int ubi_io_is_bad(const struct ubi_info *ubi, int pnum)
+{
+	const struct ubi_io_info *io = ubi->io;
+	struct mtd_info *mtd = io->mtd;
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+
+	if (io->bad_allowed) {
+		int ret;
+
+		ret = mtd->block_isbad(mtd, (loff_t)pnum * io->peb_size);
+		if (unlikely(ret < 0))
+			ubi_err("error %d while checking if PEB %d is bad",
+				ret, pnum);
+		else if (ret)
+			dbg_io("PEB %d is bad", pnum);
+		return ret;
+	}
+
+	return 0;
+}
+
+int ubi_io_mark_bad(const struct ubi_info *ubi, int pnum)
+{
+	int err;
+	const struct ubi_io_info *io = ubi->io;
+	struct mtd_info *mtd = io->mtd;
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+
+	if (unlikely(io->ro_mode)) {
+		ubi_err("read-only mode");
+		return -EROFS;
+	}
+
+	if (!io->bad_allowed)
+		return 0;
+
+	err = mtd->block_markbad(mtd, (loff_t)pnum * io->peb_size);
+	if (unlikely(err))
+		ubi_err("cannot mark PEB %d bad, error %d", pnum, err);
+	return err;
+}
+
+static int validate_ec_hdr(const struct ubi_info *ubi,
+			   const struct ubi_ec_hdr *ec_hdr);
+
+int ubi_io_read_ec_hdr(const struct ubi_info *ubi, int pnum,
+		       struct ubi_ec_hdr *ec_hdr, int verbose)
+{
+	int err, read_err = 0;
+	uint32_t crc, magic, hdr_crc;
+
+	dbg_io("read EC header from PEB %d", pnum);
+
+	ubi_assert(pnum >= 0 && pnum < ubi->io->peb_count);
+
+	err = ubi_io_read(ubi, ec_hdr, pnum, 0, UBI_EC_HDR_SIZE);
+	if (unlikely(err)) {
+		if (err != UBI_IO_BITFLIPS && err != -EBADMSG)
+			return err;
+
+		/*
+		 * We read all the data, but either a correctable bit-flip
+		 * occurred, or MTD reported some data integrity error, like
+		 * an ECC error in case of NAND. The former is harmless, the
+		 * latter may mean that the read data is corrupted. But we
+		 * have a CRC checksum and we will detect this. If the EC
+		 * header is still OK, we just report this as if there was a
+		 * bit-flip.
+		 */
+		read_err = err;
+	}
+
+	magic = ubi32_to_cpu(ec_hdr->magic);
+	if (unlikely(magic != UBI_EC_HDR_MAGIC)) {
+
+		/*
+		 * The magic field is wrong. Let's check if we have read all
+		 * 0xFF. If yes, this physical eraseblock is assumed to be
+		 * empty.
+		 *
+		 * But if there was a read error, we do not test it for all
+		 * 0xFFs. Even if it does contain all 0xFFs, this error
+		 * indicates that something is still wrong with this physical
+		 * eraseblock anyway.
+		 */
+		if (likely(read_err != -EBADMSG) &&
+		    ubi_buf_all_ff(ec_hdr, UBI_EC_HDR_SIZE)) {
+			/* The physical eraseblock is supposedly empty */
+
+			/*
+			 * The below is just a paranoid check, it has to be
+			 * compiled out if paranoid checks are disabled.
+			 */
+			err = paranoid_check_all_ff(ubi, pnum, 0,
+						    ubi->io->peb_size);
+			if (unlikely(err))
+				return err > 0 ? UBI_IO_BAD_EC_HDR : err;
+
+			if (verbose)
+				ubi_warn("no EC header found at PEB %d, "
+					 "only 0xFF bytes", pnum);
+
+			/*
+			 * We read all 0xFFs, so assume the eraseblock is
+			 * empty.
+			 */
+			return UBI_IO_PEB_EMPTY;
+		}
+
+		/*
+		 * This is not a valid erase counter header, and these are not
+		 * 0xFF bytes. Report that the header is corrupted.
+		 */
+		if (verbose) {
+			ubi_warn("bad magic number at PEB %d: %08x instead of "
+				 "%08x", pnum, magic, UBI_EC_HDR_MAGIC);
+			ubi_dbg_dump_ec_hdr(ec_hdr);
+		}
+		return UBI_IO_BAD_EC_HDR;
+	}
+
+	crc = crc32(UBI_CRC32_INIT, ec_hdr, UBI_EC_HDR_SIZE_CRC);
+	hdr_crc = ubi32_to_cpu(ec_hdr->hdr_crc);
+
+	if (unlikely(hdr_crc != crc)) {
+		if (verbose) {
+			ubi_warn("bad EC header CRC at PEB %d, calculated %#08x,"
+				 " read %#08x", pnum, crc, hdr_crc);
+			ubi_dbg_dump_ec_hdr(ec_hdr);
+		}
+		return UBI_IO_BAD_EC_HDR;
+	}
+
+	/* Validate what was read from the media */
+	err = validate_ec_hdr(ubi, ec_hdr);
+	if (unlikely(err > 0)) {
+		ubi_err("validation failed for PEB %d", pnum);
+		return -EINVAL;
+	}
+
+	return read_err ? UBI_IO_BITFLIPS : 0;
+}
+
+int ubi_io_write_ec_hdr(const struct ubi_info *ubi, int pnum,
+			struct ubi_ec_hdr *ec_hdr)
+{
+	int err;
+	uint32_t crc;
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_io("write EC header to PEB %d", pnum);
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+
+	ec_hdr->magic = cpu_to_ubi32(UBI_EC_HDR_MAGIC);
+	ec_hdr->version = UBI_VERSION;
+	ec_hdr->vid_hdr_offset = cpu_to_ubi32(io->vid_hdr_offset);
+	ec_hdr->data_offset = cpu_to_ubi32(io->leb_start);
+	crc = crc32(UBI_CRC32_INIT, ec_hdr, UBI_EC_HDR_SIZE_CRC);
+	ec_hdr->hdr_crc = cpu_to_ubi32(crc);
+
+	err = paranoid_check_ec_hdr(ubi, pnum, ec_hdr);
+	if (unlikely(err))
+		return -EINVAL;
+
+	err = ubi_io_write(ubi, ec_hdr, pnum, 0, io->ec_hdr_alsize);
+	return err;
+}
+
+static int validate_vid_hdr(const struct ubi_info *ubi,
+			    const struct ubi_vid_hdr *vid_hdr);
+
+int ubi_io_read_vid_hdr(const struct ubi_info *ubi, int pnum,
+			struct ubi_vid_hdr *vid_hdr, int verbose)
+{
+	int err, read_err = 0;
+	uint32_t crc, magic, hdr_crc;
+	const struct ubi_io_info *io = ubi->io;
+	void *p;
+
+	dbg_io("read VID header from PEB %d", pnum);
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+
+	p = (char *)vid_hdr - io->vid_hdr_shift;
+	err = ubi_io_read(ubi, p, pnum, io->vid_hdr_aloffset,
+			  io->vid_hdr_alsize);
+	if (unlikely(err)) {
+		if (err != UBI_IO_BITFLIPS && err != -EBADMSG)
+			return err;
+
+		/*
+		 * We read all the data, but either a correctable bit-flip
+		 * occurred, or MTD reported some data integrity error, like
+		 * an ECC error in case of NAND. The former is harmless, the
+		 * latter may mean the read data is corrupted. But we have a
+		 * CRC checksum and we will detect this. If the VID header is
+		 * still OK, we just report this as if there was a bit-flip.
+		 */
+		read_err = err;
+	}
+
+	magic = ubi32_to_cpu(vid_hdr->magic);
+	if (unlikely(magic != UBI_VID_HDR_MAGIC)) {
+		/*
+		 * If we have read all 0xFF bytes, the VID header probably does
+		 * not exist and the physical eraseblock is assumed to be free.
+		 *
+		 * But if there was a read error, we do not test the data for
+		 * 0xFFs. Even if it does contain all 0xFFs, this error
+		 * indicates that something is still wrong with this physical
+		 * eraseblock.
+		 */
+		if (likely(read_err != -EBADMSG) &&
+		    ubi_buf_all_ff(vid_hdr, UBI_VID_HDR_SIZE)) {
+			/* The physical eraseblock is supposedly free */
+
+			/*
+			 * The below is just a paranoid check, it has to be
+			 * compiled out if paranoid checks are disabled.
+			 */
+			err = paranoid_check_all_ff(ubi, pnum, io->leb_start,
+						    io->leb_size);
+			if (unlikely(err))
+				return err > 0 ? UBI_IO_BAD_VID_HDR : err;
+
+			/*
+			 * We read all 0xFFs, so assume the eraseblock is
+			 * free.
+			 */
+			if (verbose)
+				ubi_warn("no VID header found at PEB %d, "
+					 "only 0xFF bytes", pnum);
+			return UBI_IO_PEB_FREE;
+		}
+
+		/*
+		 * This is not a valid VID header, and these are not 0xFF
+		 * bytes. Report that the header is corrupted.
+		 */
+		if (verbose) {
+			ubi_warn("bad magic number at PEB %d: %08x instead of "
+				 "%08x", pnum, magic, UBI_VID_HDR_MAGIC);
+			ubi_dbg_dump_vid_hdr(vid_hdr);
+		}
+		return UBI_IO_BAD_VID_HDR;
+	}
+
+	crc = crc32(UBI_CRC32_INIT, vid_hdr, UBI_VID_HDR_SIZE_CRC);
+	hdr_crc = ubi32_to_cpu(vid_hdr->hdr_crc);
+
+	if (unlikely(hdr_crc != crc)) {
+		if (verbose) {
+			ubi_warn("bad CRC at PEB %d, calculated %#08x, "
+				 "read %#08x", pnum, crc, hdr_crc);
+			ubi_dbg_dump_vid_hdr(vid_hdr);
+		}
+		return UBI_IO_BAD_VID_HDR;
+	}
+
+	/* Validate the VID header that we have just read */
+	err = validate_vid_hdr(ubi, vid_hdr);
+	if (unlikely(err)) {
+		ubi_err("validation failed for PEB %d", pnum);
+		return -EINVAL;
+	}
+
+	return read_err ? UBI_IO_BITFLIPS : 0;
+}
+
+int ubi_io_write_vid_hdr(const struct ubi_info *ubi, int pnum,
+			 struct ubi_vid_hdr *vid_hdr)
+{
+	int err;
+	uint32_t crc;
+	const struct ubi_io_info *io = ubi->io;
+	void *p;
+
+	dbg_io("write VID header to PEB %d", pnum);
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+
+	err = paranoid_check_peb_ec_hdr(ubi, pnum);
+	if (unlikely(err))
+		return err > 0 ? -EINVAL : err;
+
+	vid_hdr->magic = cpu_to_ubi32(UBI_VID_HDR_MAGIC);
+	vid_hdr->version = UBI_VERSION;
+	crc = crc32(UBI_CRC32_INIT, vid_hdr, UBI_VID_HDR_SIZE_CRC);
+	vid_hdr->hdr_crc = cpu_to_ubi32(crc);
+
+	err = paranoid_check_vid_hdr(ubi, pnum, vid_hdr);
+	if (unlikely(err))
+		return -EINVAL;
+
+	p = (char *)vid_hdr - io->vid_hdr_shift;
+	err = ubi_io_write(ubi, p, pnum, io->vid_hdr_aloffset,
+			   io->vid_hdr_alsize);
+	return err;
+}
+
+int ubi_io_init(struct ubi_info *ubi, int mtd_num, int vid_hdr_offset,
+		int data_offset)
+{
+	int err;
+	struct mtd_info *mtd;
+	struct ubi_io_info *io;
+
+	dbg_io("initialize the UBI I/O unit for MTD device %d, VID hdr offset "
+	       "%d data offset %d", mtd_num, vid_hdr_offset, data_offset);
+
+	io = ubi_kzalloc(sizeof(struct ubi_io_info));
+	if (!io)
+		return -ENOMEM;
+	ubi->io = io;
+
+	mtd = io->mtd = get_mtd_device(NULL, mtd_num);
+	if (IS_ERR(mtd)) {
+		ubi_err("cannot open MTD device %d", mtd_num);
+		err = PTR_ERR(mtd);
+		goto out_io;
+	}
+	io->mtd_num = mtd_num;
+
+	err = -EINVAL;
+	if (mtd->numeraseregions != 0) {
+		/*
+		 * Some flashes have several erase regions. Different regions
+		 * may have different eraseblock size and other
+		 * characteristics. It looks like most multi-region flashes
+		 * have one "main" region and one or more small regions to
+		 * store boot loader code or boot parameters or whatever. I
+		 * guess we should just pick the largest region. But this is
+		 * not implemented.
+		 */
+		ubi_err("multiple regions, not implemented");
+		goto out_mtd;
+	}
+
+	/*
+	 * Note, in this implementation we support MTD devices with at most
+	 * 0x7FFFFFFF physical eraseblocks.
+	 */
+
+	io->mtd_name   = mtd->name;
+	io->peb_size   = mtd->erasesize;
+	io->peb_count  = mtd->size / mtd->erasesize;
+	io->flash_size = mtd->size;
+
+	if (mtd->block_isbad && mtd->block_markbad)
+		io->bad_allowed = 1;
+
+	io->min_io_size = mtd->writesize;
+	io->hdrs_min_io_size = mtd->writesize >> mtd->subpage_sft;
+
+	ubi_assert(io->hdrs_min_io_size > 0);
+	ubi_assert(io->hdrs_min_io_size <= io->min_io_size);
+	ubi_assert(io->min_io_size % io->hdrs_min_io_size == 0);
+
+	/* Calculate default aligned sizes of EC and VID headers */
+	io->ec_hdr_alsize = align_up(UBI_EC_HDR_SIZE, io->hdrs_min_io_size);
+	io->vid_hdr_alsize = align_up(UBI_VID_HDR_SIZE, io->hdrs_min_io_size);
+
+	dbg_io("min_io_size      %d", io->min_io_size);
+	dbg_io("hdrs_min_io_size %d", io->hdrs_min_io_size);
+	dbg_io("ec_hdr_alsize    %d", io->ec_hdr_alsize);
+	dbg_io("vid_hdr_alsize   %d", io->vid_hdr_alsize);
+
+	if (vid_hdr_offset == 0)
+		/* Default offset */
+		io->vid_hdr_offset = io->vid_hdr_aloffset = io->ec_hdr_alsize;
+	else {
+		io->vid_hdr_offset = vid_hdr_offset;
+		io->vid_hdr_aloffset = align_down(vid_hdr_offset,
+						  io->hdrs_min_io_size);
+		io->vid_hdr_shift = vid_hdr_offset - io->vid_hdr_aloffset;
+	}
+
+	/* Similar for the data offset */
+	if (data_offset == 0) {
+		io->leb_start = io->vid_hdr_offset + io->vid_hdr_alsize;
+		io->leb_start = align_up(io->leb_start, io->min_io_size);
+	} else
+		io->leb_start = data_offset;
+
+	dbg_io("vid_hdr_offset   %d", io->vid_hdr_offset);
+	dbg_io("vid_hdr_aloffset %d", io->vid_hdr_aloffset);
+	dbg_io("vid_hdr_shift    %d", io->vid_hdr_shift);
+	dbg_io("leb_start        %d", io->leb_start);
+
+	/* The shift must be aligned to 32-bit boundary */
+	if (io->vid_hdr_shift % 4) {
+		ubi_err("unaligned VID header shift %d",
+			io->vid_hdr_shift);
+		goto out_mtd;
+	}
+
+	/* Check sanity */
+	if (io->vid_hdr_offset < UBI_EC_HDR_SIZE ||
+	    io->leb_start < io->vid_hdr_offset + UBI_VID_HDR_SIZE ||
+	    io->leb_start > io->peb_size - UBI_VID_HDR_SIZE ||
+	    io->leb_start % io->min_io_size) {
+		ubi_err("bad VID header (%d) or data offsets (%d)",
+			io->vid_hdr_offset, io->leb_start);
+		goto out_mtd;
+	}
+
+	/*
+	 * It may happen that EC and VID headers are situated in one minimal
+	 * I/O unit. In this case we can only accept this UBI image in
+	 * read-only mode.
+	 */
+	if (io->vid_hdr_offset + UBI_VID_HDR_SIZE <= io->hdrs_min_io_size) {
+		ubi_warn("EC and VID headers are in the same minimal I/O unit, "
+			 "switch to read-only mode");
+		io->ro_mode = 1;
+	}
+
+	io->leb_size = io->peb_size - io->leb_start;
+
+	if (!(mtd->flags & MTD_WRITEABLE)) {
+		ubi_msg("MTD device %d is write-protected, attach in "
+			"read-only mode", mtd_num);
+		io->ro_mode = 1;
+	}
+
+	dbg_io("leb_size         %d", io->leb_size);
+	dbg_io("ro_mode          %d", io->ro_mode);
+
+	/*
+	 * FIXME: ideally, we have to initialize io->bad_peb_count here. But
+	 * unfortunately, MTD does not provide this information. We would have
+	 * to loop over all physical eraseblocks and invoke mtd->block_isbad(),
+	 * which is not optimal. So we leave io->bad_peb_count uninitialized
+	 * and let the scanning unit initialize it. This is not nice.
+	 */
+
+	dbg_io("the UBI I/O unit is initialized");
+	return 0;
+
+out_mtd:
+	put_mtd_device(mtd);
+out_io:
+	ubi_kfree(io);
+	return err;
+}
+
+void ubi_io_close(const struct ubi_info *ubi)
+{
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_io("close the UBI I/O unit for mtd device %d", io->mtd_num);
+	put_mtd_device(io->mtd);
+	ubi_kfree(io);
+}
+
+/**
+ * sync_erase - synchronously erase a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to erase
+ *
+ * This function synchronously erases physical eraseblock @pnum and returns
+ * zero in case of success and a negative error code in case of failure. If
+ * %-EIO is returned, the physical eraseblock went bad.
+ */
+static int sync_erase(const struct ubi_info *ubi, int pnum)
+{
+	int err, tries = 0;
+	struct erase_info ei;
+	wait_queue_head_t wq;
+	const struct ubi_io_info *io = ubi->io;
+
+	/*
+	 * Note, even though MTD erase interface is asynchronous, all the
+	 * current implementations are synchronous.
+	 */
+
+	dbg_io("erase PEB %d", pnum);
+
+	ubi_assert(pnum >= 0 && pnum < io->peb_count);
+
+	err = paranoid_check_not_bad(ubi, pnum);
+	if (unlikely(err != 0))
+		return err > 0 ? -EINVAL : err;
+
+retry:
+	init_waitqueue_head(&wq);
+	memset(&ei, 0, sizeof(struct erase_info));
+
+	ei.mtd      = io->mtd;
+	ei.addr     = pnum * io->peb_size;
+	ei.len      = io->peb_size;
+	ei.retries  = 2;
+	ei.callback = erase_callback;
+	ei.priv     = (unsigned long)&wq;
+
+	err = io->mtd->erase(io->mtd, &ei);
+	if (unlikely(err)) {
+		if (tries++ < IO_RETRIES) {
+			yield();
+			dbg_io("error %d while erasing PEB %d, retry",
+			       err, pnum);
+			goto retry;
+		}
+		ubi_err("cannot erase PEB %d, error %d", pnum, err);
+		ubi_dbg_dump_stack();
+		return err;
+	}
+
+	err = wait_event_interruptible(wq, ei.state == MTD_ERASE_DONE ||
+					   ei.state == MTD_ERASE_FAILED);
+	if (unlikely(err)) {
+		ubi_err("interrupted PEB %d erasure", pnum);
+		return -EINTR;
+	}
+
+	if (unlikely(ei.state == MTD_ERASE_FAILED)) {
+		if (tries++ < IO_RETRIES) {
+			yield();
+			dbg_io("error while erasing PEB %d, retry", pnum);
+			goto retry;
+		}
+		ubi_err("cannot erase PEB %d", pnum);
+		ubi_dbg_dump_stack();
+		return -EIO;
+	}
+
+	err = paranoid_check_all_ff(ubi, pnum, 0, io->peb_size);
+	if (unlikely(err))
+		return err > 0 ? -EINVAL : err;
+
+	/* The below is just for debugging and is compiled out if disabled */
+	if (ubi_dbg_is_erase_failure() && !err) {
+		ubi_err("cannot erase PEB %d (emulated)", pnum);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/* Patterns to write to a physical eraseblock when torturing it */
+static uint8_t patterns[] = {0xa5, 0x5a, 0x0};
+
+/**
+ * ubi_io_torture_peb - test a supposedly bad physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to test
+ *
+ * This function returns %-EIO if the physical eraseblock did not pass the
+ * test, a positive number of erase operations done if the test was
+ * successfully passed, and other negative error codes in case of other errors.
+ */
+static int ubi_io_torture_peb(const struct ubi_info *ubi, int pnum)
+{
+	void *buf;
+	int err, i, patt_count;
+	const struct ubi_io_info *io = ubi->io;
+
+	buf = ubi_kmalloc(io->peb_size);
+	if (unlikely(!buf))
+		return -ENOMEM;
+
+	patt_count = ARRAY_SIZE(patterns);
+	ubi_assert(patt_count > 0);
+
+	for (i = 0; i < patt_count; i++) {
+		err = sync_erase(ubi, pnum);
+		if (unlikely(err))
+			goto out;
+
+		/* Make sure the PEB contains only 0xFF bytes */
+		err = ubi_io_read(ubi, buf, pnum, 0, io->peb_size);
+		if (unlikely(err))
+			goto out;
+
+		err = ubi_buf_all_ff(buf, io->peb_size);
+		if (unlikely(err == 0)) {
+			ubi_err("erased PEB %d, but a non-0xFF byte found",
+				pnum);
+			err = -EIO;
+			goto out;
+		}
+
+		/* Write a pattern and check it */
+		memset(buf, patterns[i], io->peb_size);
+		err = ubi_io_write(ubi, buf, pnum, 0, io->peb_size);
+		if (unlikely(err))
+			goto out;
+
+		memset(buf, ~patterns[i], io->peb_size);
+		err = ubi_io_read(ubi, buf, pnum, 0, io->peb_size);
+		if (unlikely(err))
+			goto out;
+
+		err = ubi_check_pattern(buf, patterns[i], io->peb_size);
+		if (unlikely(err == 0)) {
+			ubi_err("pattern %x checking failed for PEB %d",
+				patterns[i], pnum);
+			err = -EIO;
+			goto out;
+		}
+	}
+
+	err = patt_count;
+
+out:
+	if (unlikely(err == UBI_IO_BITFLIPS || err == -EBADMSG))
+		/*
+		 * If a bit-flip or data integrity error was detected, the test
+		 * has not been passed.
+		 */
+		err = -EIO;
+	ubi_kfree(buf);
+	return err;
+}
+
+/**
+ * validate_ec_hdr - validate an erase counter header.
+ *
+ * @ubi: the UBI device description object
+ * @ec_hdr: the erase counter header to check
+ *
+ * This function returns zero if the erase counter header is OK, and %1 if
+ * not.
+ */
+static int validate_ec_hdr(const struct ubi_info *ubi,
+			   const struct ubi_ec_hdr *ec_hdr)
+{
+	long long ec;
+	int vid_hdr_offset, leb_start;
+	const struct ubi_io_info *io = ubi->io;
+
+	ec = ubi64_to_cpu(ec_hdr->ec);
+	vid_hdr_offset = ubi32_to_cpu(ec_hdr->vid_hdr_offset);
+	leb_start = ubi32_to_cpu(ec_hdr->data_offset);
+
+	if (unlikely(ec_hdr->version != UBI_VERSION)) {
+		ubi_err("node with incompatible UBI version found: "
+			"this UBI version is %d, image version is %d",
+			UBI_VERSION, (int)ec_hdr->version);
+		goto bad;
+	}
+
+	if (unlikely(vid_hdr_offset != io->vid_hdr_offset)) {
+		ubi_err("bad VID header offset %d, expected %d",
+			vid_hdr_offset, io->vid_hdr_offset);
+		goto bad;
+	}
+
+	if (unlikely(leb_start != io->leb_start)) {
+		ubi_err("bad data offset %d, expected %d",
+			leb_start, io->leb_start);
+		goto bad;
+	}
+
+	if (unlikely(ec < 0 || ec > UBI_MAX_ERASECOUNTER)) {
+		ubi_err("bad erase counter %lld", ec);
+		goto bad;
+	}
+
+	return 0;
+
+bad:
+	ubi_err("bad EC header");
+	ubi_dbg_dump_ec_hdr(ec_hdr);
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+/**
+ * validate_vid_hdr - validate a volume identifier header.
+ *
+ * @ubi: the UBI device description object
+ * @vid_hdr: the volume identifier header to check
+ *
+ * This function checks that data stored in the volume identifier header
+ * @vid_hdr is sane. This function returns zero if the VID header is OK and %1
+ * if not.
+ */
+static int validate_vid_hdr(const struct ubi_info *ubi,
+			    const struct ubi_vid_hdr *vid_hdr)
+{
+	const struct ubi_io_info *io = ubi->io;
+	int vol_type = vid_hdr->vol_type;
+	int copy_flag = vid_hdr->copy_flag;
+	int vol_id = ubi32_to_cpu(vid_hdr->vol_id);
+	int lnum = ubi32_to_cpu(vid_hdr->lnum);
+	int compat = vid_hdr->compat;
+	int data_size = ubi32_to_cpu(vid_hdr->data_size);
+	int used_ebs = ubi32_to_cpu(vid_hdr->used_ebs);
+	int data_pad = ubi32_to_cpu(vid_hdr->data_pad);
+	int data_crc = ubi32_to_cpu(vid_hdr->data_crc);
+	int usable_leb_size = io->leb_size - data_pad;
+
+	if (unlikely(copy_flag != 0 && copy_flag != 1)) {
+		dbg_err("bad copy_flag");
+		goto bad;
+	}
+
+	if (unlikely(vol_id < 0 || lnum < 0 || data_size < 0 || used_ebs < 0 ||
+		     data_pad < 0)) {
+		dbg_err("negative values");
+		goto bad;
+	}
+
+	if (unlikely(vol_id >= UBI_MAX_VOLUMES &&
+		     vol_id < UBI_INTERNAL_VOL_START)) {
+		dbg_err("bad vol_id");
+		goto bad;
+	}
+
+	if (unlikely(vol_id < UBI_INTERNAL_VOL_START && compat != 0)) {
+		dbg_err("bad compat");
+		goto bad;
+	}
+
+	if (unlikely(vol_id >= UBI_INTERNAL_VOL_START &&
+		     compat != UBI_COMPAT_DELETE &&
+		     compat != UBI_COMPAT_RO &&
+		     compat != UBI_COMPAT_PRESERVE &&
+		     compat != UBI_COMPAT_REJECT)) {
+		dbg_err("bad compat");
+		goto bad;
+	}
+
+	if (unlikely(vol_type != UBI_VID_DYNAMIC &&
+		     vol_type != UBI_VID_STATIC)) {
+		dbg_err("bad vol_type");
+		goto bad;
+	}
+
+	if (unlikely(data_pad >= io->leb_size / 2)) {
+		dbg_err("bad data_pad");
+		goto bad;
+	}
+
+	if (vol_type == UBI_VID_STATIC) {
+		/*
+		 * Although from a high-level point of view static volumes may
+		 * contain zero bytes of data, no VID header can contain zero
+		 * in these fields, because empty volumes do not have mapped
+		 * logical eraseblocks.
+		 */
+		if (unlikely(used_ebs == 0)) {
+			dbg_err("zero used_ebs");
+			goto bad;
+		}
+		if (unlikely(data_size == 0)) {
+			dbg_err("zero data_size");
+			goto bad;
+		}
+		if (lnum < used_ebs - 1) {
+			if (unlikely(data_size != usable_leb_size)) {
+				dbg_err("bad data_size");
+				goto bad;
+			}
+		} else if (lnum == used_ebs - 1) {
+			if (unlikely(data_size == 0)) {
+				dbg_err("bad data_size at last LEB");
+				goto bad;
+			}
+		} else {
+			dbg_err("too high lnum");
+			goto bad;
+		}
+	} else {
+		if (copy_flag == 0) {
+			if (unlikely(data_crc != 0)) {
+				dbg_err("non-zero data CRC");
+				goto bad;
+			}
+			if (unlikely(data_size != 0)) {
+				dbg_err("non-zero data_size");
+				goto bad;
+			}
+		} else {
+			if (unlikely(data_size == 0)) {
+				dbg_err("zero data_size of copy");
+				goto bad;
+			}
+		}
+		if (unlikely(used_ebs != 0)) {
+			dbg_err("bad used_ebs");
+			goto bad;
+		}
+	}
+
+	return 0;
+
+bad:
+	ubi_err("bad VID header");
+	ubi_dbg_dump_vid_hdr(vid_hdr);
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_IO
+
+/**
+ * paranoid_check_not_bad - ensure that a physical eraseblock is not bad.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock number to check
+ *
+ * This function returns zero if the physical eraseblock is good, a positive
+ * number if it is bad and a negative error code if an error occurred.
+ */
+static int paranoid_check_not_bad(const struct ubi_info *ubi, int pnum)
+{
+	int err;
+
+	err = ubi_io_is_bad(ubi, pnum);
+	if (likely(!err))
+		return err;
+
+	ubi_err("paranoid check failed for PEB %d", pnum);
+	ubi_dbg_dump_stack();
+	return err;
+}
+
+/**
+ * paranoid_check_ec_hdr - check if an erase counter header is all right.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock number the erase counter header belongs to
+ * @ec_hdr: the erase counter header to check
+ *
+ * This function returns zero if the erase counter header contains valid
+ * values, and %1 if not.
+ */
+static int paranoid_check_ec_hdr(const struct ubi_info *ubi, int pnum,
+				 const struct ubi_ec_hdr *ec_hdr)
+{
+	int err;
+	uint32_t magic;
+
+	magic = ubi32_to_cpu(ec_hdr->magic);
+	if (unlikely(magic != UBI_EC_HDR_MAGIC)) {
+		ubi_err("bad magic %#08x, must be %#08x",
+			magic, UBI_EC_HDR_MAGIC);
+		goto fail;
+	}
+
+	err = validate_ec_hdr(ubi, ec_hdr);
+	if (unlikely(err)) {
+		ubi_err("paranoid check failed for PEB %d", pnum);
+		goto fail;
+	}
+
+	return 0;
+
+fail:
+	ubi_dbg_dump_ec_hdr(ec_hdr);
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+/**
+ * paranoid_check_peb_ec_hdr - check that the erase counter header of a
+ * physical eraseblock is in-place and is all right.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to check
+ *
+ * This function returns zero if the erase counter header is all right, %1 if
+ * not, and a negative error code if an error occurred.
+ */
+static int paranoid_check_peb_ec_hdr(const struct ubi_info *ubi, int pnum)
+{
+	int err;
+	uint32_t crc, hdr_crc;
+	struct ubi_ec_hdr *ec_hdr;
+
+	ec_hdr = ubi_zalloc_ec_hdr(ubi);
+	if (unlikely(!ec_hdr))
+		return -ENOMEM;
+
+	err = ubi_io_read(ubi, ec_hdr, pnum, 0, UBI_EC_HDR_SIZE);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS && err != -EBADMSG)
+		goto exit;
+
+	crc = crc32(UBI_CRC32_INIT, ec_hdr, UBI_EC_HDR_SIZE_CRC);
+	hdr_crc = ubi32_to_cpu(ec_hdr->hdr_crc);
+	if (unlikely(hdr_crc != crc)) {
+		ubi_err("bad CRC, calculated %#08x, read %#08x", crc, hdr_crc);
+		ubi_err("paranoid check failed for PEB %d", pnum);
+		ubi_dbg_dump_ec_hdr(ec_hdr);
+		ubi_dbg_dump_stack();
+		err = 1;
+		goto exit;
+	}
+
+	err = paranoid_check_ec_hdr(ubi, pnum, ec_hdr);
+
+exit:
+	ubi_free_ec_hdr(ubi, ec_hdr);
+	return err;
+}
+
+/**
+ * paranoid_check_vid_hdr - check that a volume identifier header is all right.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock number the volume identifier header belongs to
+ * @vid_hdr: the volume identifier header to check
+ *
+ * This function returns zero if the volume identifier header is all right, and
+ * %1 if not.
+ */
+static int paranoid_check_vid_hdr(const struct ubi_info *ubi, int pnum,
+				  const struct ubi_vid_hdr *vid_hdr)
+{
+	int err;
+	uint32_t magic;
+
+	magic = ubi32_to_cpu(vid_hdr->magic);
+	if (unlikely(magic != UBI_VID_HDR_MAGIC)) {
+		ubi_err("bad VID header magic %#08x at PEB %d, must be %#08x",
+			magic, pnum, UBI_VID_HDR_MAGIC);
+		goto fail;
+	}
+
+	err = validate_vid_hdr(ubi, vid_hdr);
+	if (unlikely(err)) {
+		ubi_err("paranoid check failed for PEB %d", pnum);
+		goto fail;
+	}
+
+	return err;
+
+fail:
+	ubi_err("paranoid check failed for PEB %d", pnum);
+	ubi_dbg_dump_vid_hdr(vid_hdr);
+	ubi_dbg_dump_stack();
+	return 1;
+
+}
+
+/**
+ * paranoid_check_peb_vid_hdr - check that the volume identifier header of a
+ * physical eraseblock is in-place and is all right.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to check
+ *
+ * This function returns zero if the volume identifier header is all right,
+ * %1 if not, and a negative error code if an error occurred.
+ */
+static int paranoid_check_peb_vid_hdr(const struct ubi_info *ubi, int pnum)
+{
+	int err;
+	uint32_t crc, hdr_crc;
+	struct ubi_vid_hdr *vid_hdr;
+	const struct ubi_io_info *io = ubi->io;
+	void *p;
+
+	vid_hdr = ubi_zalloc_vid_hdr(ubi);
+	if (unlikely(!vid_hdr))
+		return -ENOMEM;
+
+	p = (char *)vid_hdr - io->vid_hdr_shift;
+	err = ubi_io_read(ubi, p, pnum, io->vid_hdr_aloffset,
+			  io->vid_hdr_alsize);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS && err != -EBADMSG)
+		goto exit;
+
+	crc = crc32(UBI_CRC32_INIT, vid_hdr, UBI_VID_HDR_SIZE_CRC);
+	hdr_crc = ubi32_to_cpu(vid_hdr->hdr_crc);
+	if (unlikely(hdr_crc != crc)) {
+		ubi_err("bad VID header CRC at PEB %d, calculated %#08x, "
+			"read %#08x", pnum, crc, hdr_crc);
+		ubi_err("paranoid check failed for PEB %d", pnum);
+		ubi_dbg_dump_vid_hdr(vid_hdr);
+		ubi_dbg_dump_stack();
+		err = 1;
+		goto exit;
+	}
+
+	err = paranoid_check_vid_hdr(ubi, pnum, vid_hdr);
+
+exit:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	return err;
+}
+
+/**
+ * paranoid_check_all_ff - check that a region of flash is empty.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to check
+ * @offset: the starting offset within the physical eraseblock to check
+ * @len: the length of the region to check
+ *
+ * This function returns zero if only 0xFF bytes are present at offset
+ * @offset of the physical eraseblock @pnum, %1 if not, and a negative error
+ * code if an error occurred.
+ */
+static int paranoid_check_all_ff(const struct ubi_info *ubi, int pnum,
+				 int offset, int len)
+{
+	size_t read;
+	int err;
+	void *buf;
+	const struct ubi_io_info *io = ubi->io;
+
+	buf = ubi_kzalloc(len);
+	if (unlikely(!buf))
+		return -ENOMEM;
+
+	err = mtd_read(io, buf, pnum, offset, len, &read);
+	if (unlikely(err && err != -EUCLEAN)) {
+		ubi_err("error %d while reading %d bytes from PEB %d:%d, "
+			"read %zu bytes", err, len, pnum, offset, read);
+		goto error;
+	}
+
+	err = ubi_buf_all_ff(buf, len);
+	if (unlikely(err == 0)) {
+		ubi_err("flash region at PEB %d:%d, length %d does not "
+			"contain all 0xFF bytes", pnum, offset, len);
+		goto fail;
+	}
+
+	ubi_kfree(buf);
+	return 0;
+
+fail:
+	ubi_err("paranoid check failed for PEB %d", pnum);
+	dbg_err("hex dump of the %d-%d region", offset, offset + len);
+	ubi_dbg_hexdump(buf, len);
+	err = 1;
+error:
+	ubi_dbg_dump_stack();
+	ubi_kfree(buf);
+	return err;
+}
+
+#endif /* CONFIG_MTD_UBI_DEBUG_PARANOID_IO */
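
For readers following the control flow of ubi_io_read_vid_hdr() above, the
decision it makes can be condensed into a small pure function. This is only a
sketch outside the patch: the enum names and classify_vid_read() are
hypothetical, the boolean inputs stand in for the magic, 0xFF, CRC and
validate_vid_hdr() checks, and the paranoid check and all logging are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for 0, UBI_IO_BITFLIPS and -EBADMSG from the read. */
enum read_status { READ_OK, READ_BITFLIPS, READ_ECC_ERROR };

/*
 * Hypothetical stand-ins for the return values: 0, UBI_IO_BITFLIPS,
 * UBI_IO_PEB_FREE, UBI_IO_BAD_VID_HDR and -EINVAL.
 */
enum vid_result { VID_OK, VID_BITFLIPS, VID_PEB_FREE, VID_BAD_HDR, VID_INVALID };

/* Mirrors the order of checks in ubi_io_read_vid_hdr(). */
static enum vid_result classify_vid_read(enum read_status rs, bool magic_ok,
					 bool all_ff, bool crc_ok,
					 bool fields_valid)
{
	if (!magic_ok) {
		/*
		 * All 0xFF bytes mean "free" only if the read itself did not
		 * report a data integrity error.
		 */
		if (rs != READ_ECC_ERROR && all_ff)
			return VID_PEB_FREE;
		return VID_BAD_HDR;
	}

	if (!crc_ok)
		return VID_BAD_HDR;

	if (!fields_valid)
		return VID_INVALID;

	/* The header is intact; any earlier read error is reported as a bit-flip. */
	return rs == READ_OK ? VID_OK : VID_BITFLIPS;
}
```

Note that an integrity error on a header whose magic and CRC still check out
is downgraded to a bit-flip, while the same error on an all-0xFF read is
treated as a bad header rather than as a free eraseblock.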

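The offset arithmetic in ubi_io_init() is easier to check with concrete
numbers, so here is a standalone sketch of it. It is not part of the patch and
rests on assumptions: both headers are taken to be 64 bytes, align_up() and
align_down() are assumed to round to a multiple of their second argument, and
struct layout, round_up_to(), round_down_to() and compute_layout() are
hypothetical names. The read-only mode decision is not modelled.

```c
#include <assert.h>

/* Assumed header sizes; the real values come from the UBI header definitions. */
#define EC_HDR_SIZE  64
#define VID_HDR_SIZE 64

struct layout {
	int ec_hdr_alsize;
	int vid_hdr_alsize;
	int vid_hdr_offset;
	int vid_hdr_aloffset;
	int vid_hdr_shift;
	int leb_start;
	int leb_size;
};

/* Assumed semantics of align_up()/align_down(): round to a multiple of a. */
static int round_up_to(int x, int a)
{
	return (x + a - 1) / a * a;
}

static int round_down_to(int x, int a)
{
	return x / a * a;
}

/*
 * Repeats the offset calculations and sanity checks of ubi_io_init().
 * Returns 0 if the layout is acceptable and -1 otherwise.
 */
static int compute_layout(struct layout *l, int peb_size, int min_io_size,
			  int hdrs_min_io_size, int vid_hdr_offset,
			  int data_offset)
{
	assert(hdrs_min_io_size > 0 && min_io_size % hdrs_min_io_size == 0);

	l->ec_hdr_alsize = round_up_to(EC_HDR_SIZE, hdrs_min_io_size);
	l->vid_hdr_alsize = round_up_to(VID_HDR_SIZE, hdrs_min_io_size);

	if (vid_hdr_offset == 0) {
		/* Default: the VID header follows the aligned EC header */
		l->vid_hdr_offset = l->vid_hdr_aloffset = l->ec_hdr_alsize;
		l->vid_hdr_shift = 0;
	} else {
		l->vid_hdr_offset = vid_hdr_offset;
		l->vid_hdr_aloffset = round_down_to(vid_hdr_offset,
						    hdrs_min_io_size);
		l->vid_hdr_shift = vid_hdr_offset - l->vid_hdr_aloffset;
	}

	if (data_offset == 0)
		l->leb_start = round_up_to(l->vid_hdr_offset +
					   l->vid_hdr_alsize, min_io_size);
	else
		l->leb_start = data_offset;

	if (l->vid_hdr_shift % 4 ||
	    l->vid_hdr_offset < EC_HDR_SIZE ||
	    l->leb_start < l->vid_hdr_offset + VID_HDR_SIZE ||
	    l->leb_start > peb_size - VID_HDR_SIZE ||
	    l->leb_start % min_io_size)
		return -1;

	l->leb_size = peb_size - l->leb_start;
	return 0;
}
```

Under these assumptions, a 128KiB eraseblock with 2048-byte pages and 512-byte
sub-pages gets the VID header at offset 512 and data at offset 2048, leaving
129024 bytes per logical eraseblock. Without sub-page writes the same flash
gets data at offset 4096 and 126976 bytes per logical eraseblock.
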

* [PATCH 15/44 take 2] [UBI] scanning unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (13 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 14/44 take 2] [UBI] I/O unit implementation Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 23:07   ` Theodore Tso
  2007-02-17 16:55 ` [PATCH 16/44 take 2] [UBI] scanning unit implementation Artem Bityutskiy
                   ` (31 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/scan.h tmp-to/drivers/mtd/ubi/scan.h
--- tmp-from/drivers/mtd/ubi/scan.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/scan.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,279 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI scanning unit.
+ *
+ * This unit is responsible for scanning the flash media, checking UBI
+ * headers and providing complete information about the UBI flash image.
+ */
+
+#ifndef __UBI_SCAN_H__
+#define __UBI_SCAN_H__
+
+#include <linux/list.h>
+#include <linux/rbtree.h>
+
+/* The erase counter of this eraseblock is unknown */
+#define NAND_SCAN_UNKNOWN_EC (-1)
+
+struct ubi_info;
+struct ubi_scan_info;
+struct ubi_scan_volume;
+struct ubi_scan_leb;
+struct ubi_vid_hdr;
+
+/**
+ * ubi_scan_add_peb - add information about a physical eraseblock to the
+ * scanning information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @pnum: the physical eraseblock number
+ * @ec: erase counter
+ * @vid_hdr: the volume identifier header
+ * @bitflips: non-zero if bit-flips were detected while reading this physical
+ * eraseblock
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_scan_add_peb(const struct ubi_info *ubi, struct ubi_scan_info *si,
+		     int pnum, int ec, const struct ubi_vid_hdr *vid_hdr,
+		     int bitflips);
+
+/**
+ * ubi_scan_add_corr_peb - add a physical eraseblock to the list of corrupted
+ * physical eraseblocks.
+ *
+ * @si: a pointer to the scanning information
+ * @pnum: the physical eraseblock number
+ * @ec: erase counter of this physical eraseblock
+ *
+ * If @ec is not known, %NAND_SCAN_UNKNOWN_EC has to be passed and the mean
+ * erase counter will be used. This function returns zero in case of success
+ * and a negative error code in case of failure.
+ */
+int ubi_scan_add_corr_peb(struct ubi_scan_info *si, int pnum, int ec);
+
+/**
+ * ubi_scan_find_sv - find information about a particular volume in the
+ * scanning information.
+ *
+ * @si: a pointer to the scanning information
+ * @vol_id: the requested volume ID
+ *
+ * This function returns a pointer to the volume description or %NULL if there
+ * are no data about this volume in the scanning information.
+ */
+struct ubi_scan_volume *ubi_scan_find_sv(const struct ubi_scan_info *si,
+					 int vol_id);
+
+/**
+ * ubi_scan_find_seb - find information about a particular logical
+ * eraseblock in the volume scanning information.
+ *
+ * @sv: a pointer to the volume scanning information
+ * @lnum: the requested logical eraseblock
+ *
+ * This function returns a pointer to the scanning logical eraseblock or %NULL
+ * if there are no data about it in the scanning volume information.
+ */
+struct ubi_scan_leb *ubi_scan_find_seb(const struct ubi_scan_volume *sv,
+				       int lnum);
+
+/**
+ * ubi_scan_erase_peb - erase a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @pnum: physical eraseblock number to erase
+ * @ec: erase counter value to write (%NAND_SCAN_UNKNOWN_EC if it is unknown)
+ *
+ * This function erases physical eraseblock @pnum and writes the erase
+ * counter header to it. This function should only be used during the UBI
+ * device initialization stages, when the EBA unit has not yet been
+ * initialized. This function returns zero in case of success and a negative
+ * error code in case of failure.
+ */
+int ubi_scan_erase_peb(const struct ubi_info *ubi,
+		       const struct ubi_scan_info *si, int pnum, int ec);
+
+/**
+ * ubi_scan_get_free_peb - get a free physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns a free physical eraseblock. It is supposed to be
+ * called on the UBI initialization stages when the wear-leveling unit is not
+ * initialized yet. This function picks a physical eraseblock from one of the
+ * lists, writes the EC header if it is needed, and removes it from the list.
+ *
+ * This function returns scanning physical eraseblock information in case of
+ * success and an error code in case of failure.
+ */
+struct ubi_scan_leb *ubi_scan_get_free_peb(const struct ubi_info *ubi,
+					   struct ubi_scan_info *si);
+
+/**
+ * ubi_scan_rm_volume - delete scanning information about a volume.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @sv: the volume scanning information to delete
+ */
+void ubi_scan_rm_volume(const struct ubi_info *ubi, struct ubi_scan_info *si,
+			struct ubi_scan_volume *sv);
+
+/**
+ * ubi_scan - scan an MTD device.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function does full scanning of an MTD device and returns complete
+ * information about it. In case of failure, an error code is returned.
+ */
+struct ubi_scan_info *ubi_scan(struct ubi_info *ubi);
+
+/**
+ * ubi_scan_destroy_si - destroy scanning information.
+ *
+ * @si: a pointer to the scanning information
+ */
+void ubi_scan_destroy_si(struct ubi_scan_info *si);
+
+/**
+ * struct ubi_scan_volume - scanning information about a volume.
+ *
+ * @vol_id: volume ID
+ * @highest_lnum: the highest logical eraseblock number found in this volume
+ * @leb_count: the number of found logical eraseblocks belonging to this volume
+ * @vol_type: volume type
+ * @data_pad: how many bytes at the end of logical eraseblocks of this volume
+ * are not used (due to the volume alignment)
+ * @used_ebs: the number of used logical eraseblocks in this volume (only for
+ * static volumes)
+ * @last_data_size: amount of data in the last found logical eraseblock of this
+ * volume (always equivalent to the usable logical eraseblock size for dynamic
+ * volumes)
+ * @data_size: how many bytes of data each logical eraseblock contains (only
+ * for static volumes, invalid for the last logical eraseblock)
+ * @compat: compatibility flags of the volume
+ * @rb: link in the volume RB-tree
+ * @root: the root of RB-tree containing all the found eraseblocks belonging to
+ * this volume (&struct ubi_scan_leb objects)
+ */
+struct ubi_scan_volume {
+	int vol_id;
+	int highest_lnum;
+	int leb_count;
+	int vol_type;
+	int used_ebs;
+	int last_data_size;
+	int data_pad;
+	int compat;
+	struct rb_node rb;
+	struct rb_root root;
+};
+
+/**
+ * struct ubi_scan_leb - scanning information about a physical eraseblock.
+ *
+ * @ec: erase counter (%NAND_SCAN_UNKNOWN_EC if it is unknown)
+ * @pnum: physical eraseblock number
+ * @lnum: logical eraseblock number
+ * @scrub: if this physical eraseblock needs scrubbing
+ * @leb_ver: version of this logical eraseblock
+ * @u.rb: link in the per-volume RB-tree of &struct ubi_scan_leb objects
+ * @u.list: link in one of the eraseblock lists
+ *
+ * One object of this type is allocated for each physical eraseblock during
+ * scanning.
+ */
+struct ubi_scan_leb {
+	int ec;
+	int pnum;
+	int lnum;
+	int scrub;
+	uint32_t leb_ver;
+	union {
+		struct rb_node rb;
+		struct list_head list;
+	} u;
+};
+
+/**
+ * struct ubi_scan_info - UBI scanning information.
+ *
+ * @volumes: root of the volume RB-tree
+ * @corr: list of corrupted eraseblocks
+ * @free: list of free eraseblocks
+ * @erase: list of eraseblocks which have to be erased
+ * @alien: list of physical eraseblocks which should not be used by UBI (e.g.,
+ * those belonging to "preserve"-compatible internal volumes)
+ * @vols_found: total count of volumes found during scanning
+ * @highest_vol_id: highest volume ID found during scanning
+ * @bad_peb_count: count of bad physical eraseblocks found during scanning
+ * @alien_peb_count: count of physical eraseblocks in the @alien list
+ * @is_empty: a flag indicating whether the flash device is empty or not
+ * @min_ec: the lowest found erase counter value
+ * @max_ec: the highest found erase counter value
+ * @mean_ec: mean erase counter value
+ * @ec_sum: a temporary variable used when calculating @mean_ec
+ * @ec_count: a temporary variable used when calculating @mean_ec
+ *
+ * This data structure contains the result of scanning and may be used by other
+ * UBI units to build final UBI data structures, further error-recovery and so
+ * on.
+ *
+ * Information about found volumes is represented by &struct ubi_scan_volume
+ * objects which are kept in the volume RB-tree rooted at the @volumes field.
+ * The RB-tree is indexed by the volume ID.
+ *
+ * Found logical eraseblocks are represented by &struct ubi_scan_leb objects.
+ * These objects are kept in per-volume RB-trees with the root at the
+ * corresponding &struct ubi_scan_volume object. To put it differently, we keep
+ * an RB-tree of per-volume objects and each of these objects is the root of
+ * an RB-tree of per-eraseblock objects.
+ *
+ * Corrupted physical eraseblocks are put to the @corr list, free physical
+ * eraseblocks are put to the @free list and the physical eraseblocks to be
+ * erased are put to the @erase list.
+ */
+struct ubi_scan_info {
+	struct rb_root volumes;  /* public  */
+	struct list_head corr;   /* public  */
+	struct list_head free;   /* public  */
+	struct list_head erase;  /* public  */
+	struct list_head alien;  /* public  */
+	int vols_found;          /* public  */
+	int highest_vol_id;      /* public  */
+	int bad_peb_count;       /* public  */
+	int alien_peb_count;     /* public  */
+	int is_empty;            /* public  */
+	int min_ec;              /* public  */
+	int max_ec;              /* public  */
+	int mean_ec;             /* public  */
+	int ec_sum;              /* private */
+	int ec_count;            /* private */
+};
+
+#endif /* !__UBI_SCAN_H__ */
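
The two-level layout described in the struct ubi_scan_info comment (a tree of
volumes keyed by volume ID, each holding a tree of logical eraseblocks keyed
by LEB number) can be sketched in user space as below. This is only an
illustration under stated assumptions: plain unbalanced binary trees stand in
for the kernel RB-trees, and struct volume, struct leb, get_volume(),
add_leb() and find_pnum() are hypothetical names that do not appear in the
patch. Cleanup is omitted for brevity.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct ubi_scan_leb (one copy of a LEB). */
struct leb {
	int lnum, pnum;
	struct leb *left, *right;
};

/* Hypothetical stand-in for struct ubi_scan_volume. */
struct volume {
	int vol_id;
	struct leb *lebs;		/* root of the per-volume LEB tree */
	struct volume *left, *right;
};

/* Find volume @vol_id in the tree at @*root; create it if @create is set. */
struct volume *get_volume(struct volume **root, int vol_id, int create)
{
	while (*root) {
		if (vol_id == (*root)->vol_id)
			return *root;
		root = vol_id < (*root)->vol_id ? &(*root)->left
						: &(*root)->right;
	}
	if (!create)
		return NULL;
	*root = calloc(1, sizeof(**root));
	if (*root)
		(*root)->vol_id = vol_id;
	return *root;
}

/* Record that LEB @vol_id:@lnum lives in physical eraseblock @pnum. */
int add_leb(struct volume **vols, int vol_id, int lnum, int pnum)
{
	struct volume *v = get_volume(vols, vol_id, 1);
	struct leb **p;

	if (!v)
		return -1;
	for (p = &v->lebs; *p;
	     p = lnum < (*p)->lnum ? &(*p)->left : &(*p)->right)
		if ((*p)->lnum == lnum) {
			/* A real scan would compare LEB versions here. */
			(*p)->pnum = pnum;
			return 0;
		}
	*p = calloc(1, sizeof(**p));
	if (!*p)
		return -1;
	(*p)->lnum = lnum;
	(*p)->pnum = pnum;
	return 0;
}

/* Return the PEB holding @vol_id:@lnum, or -1 if it was not found. */
int find_pnum(struct volume *vols, int vol_id, int lnum)
{
	struct volume *v = get_volume(&vols, vol_id, 0);
	struct leb *l = v ? v->lebs : NULL;

	while (l && l->lnum != lnum)
		l = lnum < l->lnum ? l->left : l->right;
	return l ? l->pnum : -1;
}
```

Note that insertion and lookup descend in the same direction for a given
comparison result; if the two disagree, lookups miss entries that are in the
tree.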

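The @ec field may hold an "unknown" marker, and the scanning unit later
substitutes the mean erase counter for it. A minimal user-space sketch of that
substitution follows. It is not the patch's method: the patch accumulates into
an int (@ec_sum) and commits partial means when the sum would overflow, while
this sketch swaps in a 64-bit accumulator to keep the example short.
UNKNOWN_EC and fill_unknown_ec() are hypothetical names introduced here.

```c
#include <stddef.h>

#define UNKNOWN_EC (-1)	/* hypothetical stand-in for NAND_SCAN_UNKNOWN_EC */

/*
 * Replace every unknown erase counter in @ec[0..n-1] with the rounded mean of
 * the known ones. Returns the mean that was used (0 if nothing was known).
 */
int fill_unknown_ec(int *ec, size_t n)
{
	/* 64-bit sum: ample for 31-bit counters on any realistic device. */
	long long sum = 0;
	long long count = 0;
	int mean = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (ec[i] != UNKNOWN_EC) {
			sum += ec[i];
			count++;
		}

	if (count)
		mean = (int)((sum + count / 2) / count); /* round to nearest */

	for (i = 0; i < n; i++)
		if (ec[i] == UNKNOWN_EC)
			ec[i] = mean;

	return mean;
}
```

Giving an eraseblock with a lost counter the mean value keeps it from looking
either brand new or worn out to the wear-leveling logic.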

* [PATCH 16/44 take 2] [UBI] scanning unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (14 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 15/44 take 2] [UBI] scanning unit header Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-19 11:05   ` Christoph Hellwig
  2007-02-17 16:55 ` [PATCH 17/44 take 2] [UBI] build unit header Artem Bityutskiy
                   ` (30 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

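When two physical eraseblocks claim the same logical eraseblock, the scanning
code in this patch decides which copy is newer from the 32-bit LEB version,
which may wrap around, and reports the outcome as a small bit mask. The sketch
below restates that rule as stand-alone user-space C, assuming (as the code
comments do) that two live versions never differ by 0x7FFFFFFF or more unless
the counter wrapped. second_is_newer() and pack_result() are illustrative
names, not functions from the patch.

```c
#include <stdint.h>

/*
 * Return 1 if version @v2 is newer than @v1 and 0 otherwise, treating the
 * version as a 32-bit counter that may have wrapped around.
 */
int second_is_newer(uint32_t v1, uint32_t v2)
{
	long long diff = (long long)v1 - (long long)v2;

	if (diff < 0)
		diff = -diff;

	if (diff < 0x7FFFFFFF)
		return v2 > v1;	/* no wrap-around between the two */
	return v2 < v1;		/* the numerically smaller one wrapped */
}

/*
 * Pack the comparison outcome into bits: bit 0 set means the second copy is
 * newer, bit 1 means bit-flips were seen in the newer copy, bit 2 means the
 * older copy is corrupted.
 */
int pack_result(int second_newer, int bitflips, int corrupted)
{
	return second_newer | (bitflips << 1) | (corrupted << 2);
}
```

A caller would test the bits individually, for example "if (res & 1)" to
learn that the second copy won and "if (res & 4)" to route the losing copy to
the corrupted list instead of the erase list.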
diff -auNrp tmp-from/drivers/mtd/ubi/scan.c tmp-to/drivers/mtd/ubi/scan.c
--- tmp-from/drivers/mtd/ubi/scan.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/scan.c	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,1383 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/err.h>
+#include <linux/crc32.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "scan.h"
+#include "io.h"
+#include "misc.h"
+#include "vtbl.h"
+#include "debug.h"
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_SCAN
+static int paranoid_check_si(const struct ubi_info *ubi,
+			     struct ubi_scan_info *si);
+#else
+#define paranoid_check_si(ubi, si) 0
+#endif
+
+static int process_eb(const struct ubi_info *ubi,
+		      struct ubi_scan_info *si, int pnum);
+static void commit_to_mean_value(struct ubi_scan_info *si);
+
+/* Temporary variables used during scanning */
+static struct ubi_ec_hdr *ech;
+static struct ubi_vid_hdr *vidh;
+
+struct ubi_scan_info *ubi_scan(struct ubi_info *ubi)
+{
+	int err, pnum;
+	struct rb_node *rb1, *rb2;
+	struct ubi_scan_volume *sv;
+	struct ubi_scan_leb *seb;
+	struct ubi_scan_info *si;
+	struct ubi_io_info *io = ubi->io;
+
+	si = ubi_kzalloc(sizeof(struct ubi_scan_info));
+	if (!si)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&si->corr);
+	INIT_LIST_HEAD(&si->free);
+	INIT_LIST_HEAD(&si->erase);
+	INIT_LIST_HEAD(&si->alien);
+	si->volumes = RB_ROOT;
+	si->is_empty = 1;
+
+	err = -ENOMEM;
+	ech = ubi_zalloc_ec_hdr(ubi);
+	if (!ech)
+		goto out_si;
+
+	vidh = ubi_zalloc_vid_hdr(ubi);
+	if (!vidh)
+		goto out_ech;
+
+	for (pnum = 0; pnum < io->peb_count; pnum++) {
+		cond_resched();
+
+		err = process_eb(ubi, si, pnum);
+		if (unlikely(err < 0))
+			goto out_vidh;
+	}
+
+	dbg_scan("scanning is finished");
+
+	/* Finish mean erase counter calculations */
+	if (si->ec_count)
+		commit_to_mean_value(si);
+
+	/*
+	 * FIXME: this is actually duty of the I/O unit to initialize this, but
+	 * MTD does not provide enough information.
+	 */
+	io->bad_peb_count = si->bad_peb_count;
+	io->good_peb_count = io->peb_count - io->bad_peb_count;
+
+	if (si->is_empty)
+		ubi_msg("empty MTD device detected");
+
+	/*
+	 * In case of unknown erase counter we use the mean erase counter
+	 * value.
+	 */
+	rb_for_each_entry(rb1, sv, &si->volumes, rb) {
+		cond_resched();
+		rb_for_each_entry(rb2, seb, &sv->root, u.rb)
+			if (seb->ec == NAND_SCAN_UNKNOWN_EC)
+				seb->ec = si->mean_ec;
+	}
+
+	cond_resched();
+	list_for_each_entry(seb, &si->free, u.list)
+		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
+			seb->ec = si->mean_ec;
+
+	cond_resched();
+	list_for_each_entry(seb, &si->corr, u.list)
+		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
+			seb->ec = si->mean_ec;
+
+	cond_resched();
+	list_for_each_entry(seb, &si->erase, u.list)
+		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
+			seb->ec = si->mean_ec;
+
+	err = paranoid_check_si(ubi, si);
+	if (err) {
+		if (err > 0)
+			err = -EINVAL;
+		goto out_vidh;
+	}
+
+	ubi_free_vid_hdr(ubi, vidh);
+	ubi_free_ec_hdr(ubi, ech);
+	return si;
+
+out_vidh:
+	ubi_free_vid_hdr(ubi, vidh);
+out_ech:
+	ubi_free_ec_hdr(ubi, ech);
+out_si:
+	ubi_scan_destroy_si(si);
+	return ERR_PTR(err);
+}
+
+static int vid_hdr_sanity_check(const struct ubi_info *ubi,
+				const struct ubi_vid_hdr *vid_hdr,
+				const struct ubi_scan_volume *sv, int pnum);
+
+static int add_to_erase(struct ubi_scan_info *si, int pnum, int ec);
+
+static struct ubi_scan_volume *add_volume(const struct ubi_info *ubi,
+					  struct ubi_scan_info *si, int vol_id,
+					  int pnum,
+					  const struct ubi_vid_hdr *vid_hdr);
+
+static int compare_lebs(const struct ubi_info *ubi,
+			const struct ubi_scan_leb *seb, int pnum,
+			const struct ubi_vid_hdr *vid_hdr);
+
+int ubi_scan_add_peb(const struct ubi_info *ubi, struct ubi_scan_info *si,
+		     int pnum, int ec, const struct ubi_vid_hdr *vid_hdr,
+		     int bitflips)
+{
+	int err, vol_id, lnum;
+	uint32_t leb_ver;
+	struct ubi_scan_volume *sv;
+	struct ubi_scan_leb *seb;
+	struct rb_node **p, *parent = NULL;
+
+	vol_id = ubi32_to_cpu(vid_hdr->vol_id);
+	lnum = ubi32_to_cpu(vid_hdr->lnum);
+	leb_ver = ubi32_to_cpu(vid_hdr->leb_ver);
+
+	dbg_scan("PEB %d, LEB %d:%d, EC %d, LEB ver %u, bitflips %d",
+		 pnum, vol_id, lnum, ec, leb_ver, bitflips);
+
+	sv = add_volume(ubi, si, vol_id, pnum, vid_hdr);
+	if (unlikely(IS_ERR(sv)))
+		return PTR_ERR(sv);
+
+	/*
+	 * Walk the RB-tree of logical eraseblocks of volume @vol_id to see
+	 * whether this is the first instance of this logical eraseblock.
+	 */
+	p = &sv->root.rb_node;
+	while (*p) {
+		int cmp_res;
+
+		parent = *p;
+		seb = rb_entry(parent, struct ubi_scan_leb, u.rb);
+
+		if (lnum != seb->lnum) {
+			if (lnum < seb->lnum)
+				p = &(*p)->rb_left;
+			else
+				p = &(*p)->rb_right;
+			continue;
+		}
+
+		/*
+		 * There is already a physical eraseblock describing the same
+		 * logical eraseblock present.
+		 */
+
+		dbg_scan("this LEB already exists: PEB %d, LEB ver %u, EC %d",
+			 seb->pnum, seb->leb_ver, seb->ec);
+
+		/*
+		 * Make sure that the logical eraseblocks have different
+		 * versions. Otherwise the image is bad.
+		 */
+		if (unlikely(seb->leb_ver == leb_ver)) {
+			ubi_err("two LEBs with same version %u", leb_ver);
+			ubi_dbg_dump_seb(seb, 0);
+			ubi_dbg_dump_vid_hdr(vid_hdr);
+			return -EINVAL;
+		}
+
+		/*
+		 * Now we have to drop the older one and preserve the newer
+		 * one.
+		 */
+		cmp_res = compare_lebs(ubi, seb, pnum, vid_hdr);
+		if (unlikely(cmp_res < 0))
+			return cmp_res;
+
+		if (cmp_res & 1) {
+			/*
+			 * This logical eraseblock is newer than the one
+			 * found earlier.
+			 */
+			err = vid_hdr_sanity_check(ubi, vid_hdr, sv,
+						   pnum);
+			if (unlikely(err))
+				return err;
+
+			if (cmp_res & 4)
+				err = ubi_scan_add_corr_peb(si, seb->pnum, seb->ec);
+			else
+				err = add_to_erase(si, seb->pnum, seb->ec);
+			if (unlikely(err))
+				return err;
+
+			seb->ec = ec;
+			seb->pnum = pnum;
+			seb->leb_ver = leb_ver;
+			seb->scrub = ((cmp_res & 2) || bitflips);
+
+			if (sv->highest_lnum == lnum)
+				sv->last_data_size =
+					ubi32_to_cpu(vid_hdr->data_size);
+
+			return 0;
+		} else {
+			/*
+			 * This logical eraseblock is older than the one found
+			 * previously.
+			 */
+			if (cmp_res & 4)
+				return ubi_scan_add_corr_peb(si, pnum, ec);
+			else
+				return add_to_erase(si, pnum, ec);
+		}
+	}
+
+	/*
+	 * We've met this logical eraseblock for the first time, add it to the
+	 * scanning information.
+	 */
+
+	err = vid_hdr_sanity_check(ubi, vid_hdr, sv, pnum);
+	if (unlikely(err))
+		return err;
+
+	seb = ubi_alloc_scan_leb();
+	if (unlikely(!seb))
+		return -ENOMEM;
+
+	seb->ec = ec;
+	seb->pnum = pnum;
+	seb->lnum = lnum;
+	seb->leb_ver = leb_ver;
+	seb->scrub = bitflips;
+
+	if (sv->highest_lnum <= lnum) {
+		sv->highest_lnum = lnum;
+		sv->last_data_size = ubi32_to_cpu(vid_hdr->data_size);
+	}
+
+	sv->leb_count += 1;
+	rb_link_node(&seb->u.rb, parent, p);
+	rb_insert_color(&seb->u.rb, &sv->root);
+	return 0;
+}
+
+int ubi_scan_add_corr_peb(struct ubi_scan_info *si, int pnum, int ec)
+{
+	struct ubi_scan_leb *seb;
+
+	dbg_scan("PEB %d (EC %d) is corrupted", pnum, ec);
+
+	seb = ubi_alloc_scan_leb();
+	if (unlikely(!seb))
+		return -ENOMEM;
+
+	seb->pnum = pnum;
+	seb->ec = ec;
+	list_add_tail(&seb->u.list, &si->corr);
+	return 0;
+}
+
+struct ubi_scan_volume *ubi_scan_find_sv(const struct ubi_scan_info *si,
+					 int vol_id)
+{
+	struct ubi_scan_volume *sv;
+	struct rb_node *p = si->volumes.rb_node;
+
+	while (p) {
+		sv = rb_entry(p, struct ubi_scan_volume, rb);
+
+		if (vol_id == sv->vol_id)
+			return sv;
+
+		if (vol_id > sv->vol_id)
+			p = p->rb_left;
+		else
+			p = p->rb_right;
+	}
+
+	return NULL;
+}
+
+struct ubi_scan_leb *ubi_scan_find_seb(const struct ubi_scan_volume *sv,
+				       int lnum)
+{
+	struct ubi_scan_leb *seb;
+	struct rb_node *p = sv->root.rb_node;
+
+	while (p) {
+		seb = rb_entry(p, struct ubi_scan_leb, u.rb);
+
+		if (lnum == seb->lnum)
+			return seb;
+
+		if (lnum < seb->lnum)
+			p = p->rb_left;
+		else
+			p = p->rb_right;
+	}
+
+	return NULL;
+}
+
+void ubi_scan_rm_volume(const struct ubi_info *ubi, struct ubi_scan_info *si,
+			struct ubi_scan_volume *sv)
+{
+	struct rb_node *rb;
+	struct ubi_scan_leb *seb;
+
+	dbg_scan("remove scanning information about volume %d", sv->vol_id);
+
+	while ((rb = rb_first(&sv->root))) {
+		cond_resched();
+
+		seb = rb_entry(rb, struct ubi_scan_leb, u.rb);
+
+		/* The physical eraseblock will be erased later */
+		rb_erase(&seb->u.rb, &sv->root);
+		list_add_tail(&seb->u.list, &si->erase);
+	}
+
+	rb_erase(&sv->rb, &si->volumes);
+	ubi_free_scan_volume(sv);
+	si->vols_found -= 1;
+}
+
+int ubi_scan_erase_peb(const struct ubi_info *ubi,
+		       const struct ubi_scan_info *si, int pnum, int ec)
+{
+	int err;
+	struct ubi_ec_hdr *ec_hdr;
+	uint64_t ec1 = ec;
+
+	if (unlikely(ec1 >= UBI_MAX_ERASECOUNTER)) {
+		/*
+		 * Erase counter overflow. Upgrade UBI and use 64-bit
+		 * erase counters internally.
+		 */
+		ubi_err("erase counter overflow at PEB %d, EC %lld",
+			pnum, (long long)ec);
+		return -EINVAL;
+	}
+
+	ec_hdr = ubi_zalloc_ec_hdr(ubi);
+	if (!ec_hdr)
+		return -ENOMEM;
+
+	ec_hdr->ec = cpu_to_ubi64(ec1);
+
+	err = ubi_io_sync_erase(ubi, pnum, 0);
+	if (unlikely(err < 0))
+		goto out_free;
+
+	err = ubi_io_write_ec_hdr(ubi, pnum, ec_hdr);
+
+out_free:
+	ubi_free_ec_hdr(ubi, ec_hdr);
+	return err;
+}
+
+struct ubi_scan_leb *ubi_scan_get_free_peb(const struct ubi_info *ubi,
+					   struct ubi_scan_info *si)
+{
+	int err = 0, i;
+	struct ubi_scan_leb *seb;
+
+	if (!list_empty(&si->free)) {
+		seb = list_entry(si->free.next, struct ubi_scan_leb,
+				 u.list);
+		list_del(&seb->u.list);
+		return seb;
+	}
+
+	if (unlikely(list_empty(&si->erase) && list_empty(&si->corr))) {
+		ubi_err("no vacant eraseblocks found");
+		return ERR_PTR(-ENOSPC);
+	}
+
+	for (i = 0; i < 2; i++) {
+		struct list_head *head;
+		struct ubi_scan_leb *tmp_seb;
+
+		if (i == 0)
+			head = &si->erase;
+		else
+			head = &si->corr;
+
+		/*
+		 * We try to erase the first physical eraseblock from the @head
+		 * list and pick it if we succeed, or try to erase the
+		 * next one if not. And so forth. We don't want to take care
+		 * of bad eraseblocks here - they'll be handled later.
+		 */
+		list_for_each_entry_safe(seb, tmp_seb, head, u.list) {
+			cond_resched();
+
+			if (seb->ec == NAND_SCAN_UNKNOWN_EC)
+				seb->ec = si->mean_ec;
+
+			err = ubi_scan_erase_peb(ubi, si, seb->pnum,
+						 seb->ec + 1);
+			if (unlikely(err))
+				continue;
+
+			seb->ec += 1;
+			list_del(&seb->u.list);
+			dbg_scan("return PEB %d, EC %d", seb->pnum, seb->ec);
+			return seb;
+		}
+	}
+
+	return ERR_PTR(err ? err : -ENOSPC);
+}
+
+static void destroy_sv(struct ubi_scan_volume *sv);
+
+void ubi_scan_destroy_si(struct ubi_scan_info *si)
+{
+	struct ubi_scan_leb *seb, *seb_tmp;
+	struct ubi_scan_volume *sv;
+	struct rb_node *rb;
+
+	list_for_each_entry_safe(seb, seb_tmp, &si->alien, u.list) {
+		list_del(&seb->u.list);
+		ubi_free_scan_leb(seb);
+	}
+	list_for_each_entry_safe(seb, seb_tmp, &si->erase, u.list) {
+		list_del(&seb->u.list);
+		ubi_free_scan_leb(seb);
+	}
+	list_for_each_entry_safe(seb, seb_tmp, &si->corr, u.list) {
+		list_del(&seb->u.list);
+		ubi_free_scan_leb(seb);
+	}
+	list_for_each_entry_safe(seb, seb_tmp, &si->free, u.list) {
+		list_del(&seb->u.list);
+		ubi_free_scan_leb(seb);
+	}
+
+	/* Destroy the volume RB-tree */
+	rb = si->volumes.rb_node;
+	while (rb) {
+		if (rb->rb_left)
+			rb = rb->rb_left;
+		else if (rb->rb_right)
+			rb = rb->rb_right;
+		else {
+			sv = rb_entry(rb, struct ubi_scan_volume, rb);
+
+			rb = rb_parent(rb);
+			if (rb) {
+				if (rb->rb_left == &sv->rb)
+					rb->rb_left = NULL;
+				else
+					rb->rb_right = NULL;
+			}
+
+			destroy_sv(sv);
+		}
+	}
+
+	ubi_kfree(si);
+}
+
+static int add_to_free(struct ubi_scan_info *si, int pnum, int ec);
+static int add_to_alien(struct ubi_scan_info *si, int pnum);
+
+/**
+ * process_eb - read UBI headers, check them and add corresponding data
+ * to the scanning information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @pnum: the physical eraseblock number
+ *
+ * This function returns zero if the physical eraseblock was successfully
+ * handled and a negative error code in case of failure.
+ */
+static int process_eb(const struct ubi_info *ubi,
+		      struct ubi_scan_info *si, int pnum)
+{
+	long long ec;
+	int err, bitflips = 0, vol_id, ec_corr = 0;
+
+	dbg_scan("scan PEB %d", pnum);
+
+	/* Skip bad physical eraseblocks */
+	err = ubi_io_is_bad(ubi, pnum);
+	if (unlikely(err < 0))
+		return err;
+	else if (err) {
+		si->bad_peb_count += 1;
+		return 0;
+	}
+
+	err = ubi_io_read_ec_hdr(ubi, pnum, ech, 0);
+	if (unlikely(err < 0))
+		return err;
+	else if (unlikely(err == UBI_IO_BITFLIPS))
+		bitflips = 1;
+	else if (err == UBI_IO_PEB_EMPTY)
+		return add_to_erase(si, pnum, NAND_SCAN_UNKNOWN_EC);
+	else if (err == UBI_IO_BAD_EC_HDR) {
+		/*
+		 * We have to also look at the VID header, possibly it is not
+		 * corrupted. Set %bitflips flag in order to make this PEB be
+		 * moved and EC be re-created.
+		 */
+		ec_corr = 1;
+		ec = NAND_SCAN_UNKNOWN_EC;
+		bitflips = 1;
+	}
+
+	si->is_empty = 0;
+
+	if (!ec_corr) {
+		/* Make sure UBI version is OK */
+		if (unlikely(ech->version != UBI_VERSION)) {
+			ubi_err("this UBI version is %d, image version is %d",
+				UBI_VERSION, (int)ech->version);
+			return -EINVAL;
+		}
+
+		ec = ubi64_to_cpu(ech->ec);
+		if (unlikely(ec > UBI_MAX_ERASECOUNTER)) {
+			/*
+			 * Erase counter overflow. The EC headers have 64 bits
+			 * reserved, but we anyway make use of only 31 bit
+			 * values, as this seems to be enough for any existing
+			 * flash. Upgrade UBI and use 64-bit erase counters
+			 * internally.
+			 */
+			ubi_err("erase counter overflow, max is %d",
+				UBI_MAX_ERASECOUNTER);
+			ubi_dbg_dump_ec_hdr(ech);
+			return -EINVAL;
+		}
+	}
+
+	/* OK, we're done with the EC header, let's look at the VID header */
+
+	err = ubi_io_read_vid_hdr(ubi, pnum, vidh, 0);
+	if (unlikely(err < 0))
+		return err;
+	else if (unlikely(err == UBI_IO_BITFLIPS))
+		bitflips = 1;
+	else if (unlikely(err == UBI_IO_BAD_VID_HDR ||
+		 (err == UBI_IO_PEB_FREE && ec_corr))) {
+		/* VID header is corrupted */
+		err = ubi_scan_add_corr_peb(si, pnum, ec);
+		if (err)
+			return err;
+		goto adjust_mean_ec;
+	} else if (err == UBI_IO_PEB_FREE) {
+		/* No VID header - the physical eraseblock is free */
+		err = add_to_free(si, pnum, ec);
+		if (unlikely(err))
+			return err;
+		goto adjust_mean_ec;
+	}
+
+	vol_id = ubi32_to_cpu(vidh->vol_id);
+	if (unlikely(!ubi_ivol_is_known(vol_id))) {
+		int lnum = ubi32_to_cpu(vidh->lnum);
+
+		/* Unsupported internal volume */
+		switch (vidh->compat) {
+		case UBI_COMPAT_DELETE:
+			ubi_msg("\"delete\" compatible internal volume %d:%d"
+				" found, remove it", vol_id, lnum);
+			err = ubi_scan_add_corr_peb(si, pnum, ec);
+			if (unlikely(err))
+				return err;
+			break;
+
+		case UBI_COMPAT_RO:
+			ubi_msg("read-only compatible internal volume %d:%d"
+				" found, switch to read-only mode",
+				vol_id, lnum);
+			ubi->io->ro_mode = 1;
+			break;
+
+		case UBI_COMPAT_PRESERVE:
+			ubi_msg("\"preserve\" compatible internal volume %d:%d"
+				" found", vol_id, lnum);
+			err = add_to_alien(si, pnum);
+			if (unlikely(err))
+				return err;
+			si->alien_peb_count += 1;
+			return 0;
+
+		case UBI_COMPAT_REJECT:
+			ubi_err("incompatible internal volume %d:%d found",
+				vol_id, lnum);
+			return -EINVAL;
+		}
+	}
+
+	/* Both UBI headers seem to be fine */
+	err = ubi_scan_add_peb(ubi, si, pnum, ec, vidh, bitflips);
+	if (unlikely(err))
+		return err;
+
+adjust_mean_ec:
+	if (!ec_corr) {
+		if (si->ec_sum + ec < ec) {
+			commit_to_mean_value(si);
+			si->ec_sum = 0;
+			si->ec_count = 0;
+		} else {
+			si->ec_sum += ec;
+			si->ec_count += 1;
+		}
+
+		if (ec > si->max_ec)
+			si->max_ec = ec;
+		if (ec < si->min_ec)
+			si->min_ec = ec;
+	}
+
+	return 0;
+}
+
+/**
+ * add_volume - add a volume to the scanning information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @vol_id: ID of the volume to add
+ * @pnum: physical eraseblock number
+ * @vid_hdr: volume identifier header
+ *
+ * If the volume corresponding to the @vid_hdr logical eraseblock is already
+ * present in the scanning information, this function does nothing and just
+ * returns it. Otherwise it adds the corresponding volume to the scanning
+ * information.
+ *
+ * Returns a pointer to the scanning volume object in case of success and an
+ * ERR_PTR()-encoded negative error code in case of failure.
+ */
+static struct ubi_scan_volume *add_volume(const struct ubi_info *ubi,
+					  struct ubi_scan_info *si, int vol_id,
+					  int pnum,
+					  const struct ubi_vid_hdr *vid_hdr)
+{
+	struct ubi_scan_volume *sv;
+	struct rb_node **p = &si->volumes.rb_node, *parent = NULL;
+
+	ubi_assert(vol_id == ubi32_to_cpu(vid_hdr->vol_id));
+
+	/*
+	 * Walk the volume RB-tree to see whether volume @vol_id is already
+	 * present there.
+	 */
+	while (*p) {
+		parent = *p;
+		sv = rb_entry(parent, struct ubi_scan_volume, rb);
+
+		if (vol_id == sv->vol_id)
+			return sv;
+
+		if (vol_id > sv->vol_id)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	/* The volume is absent - add it */
+
+	sv = ubi_alloc_scan_volume();
+	if (unlikely(!sv))
+		return ERR_PTR(-ENOMEM);
+
+	sv->highest_lnum = sv->leb_count = 0;
+	sv->vol_id = vol_id;
+	sv->root = RB_ROOT;
+	sv->used_ebs = ubi32_to_cpu(vid_hdr->used_ebs);
+	sv->data_pad = ubi32_to_cpu(vid_hdr->data_pad);
+	sv->compat = vid_hdr->compat;
+	sv->vol_type = vid_hdr->vol_type == UBI_VID_DYNAMIC ? UBI_DYNAMIC_VOLUME
+							    : UBI_STATIC_VOLUME;
+	if (vol_id > si->highest_vol_id)
+		si->highest_vol_id = vol_id;
+
+	rb_link_node(&sv->rb, parent, p);
+	rb_insert_color(&sv->rb, &si->volumes);
+	si->vols_found += 1;
+	dbg_scan("added volume %d", vol_id);
+	return sv;
+}
+
+/**
+ * compare_lebs - find out which logical eraseblock is newer.
+ *
+ * @ubi: the UBI device description object
+ * @seb: the first logical eraseblock to compare
+ * @pnum: the physical eraseblock number of the second logical eraseblock to
+ * compare
+ * @vid_hdr: the volume identifier header of the second logical eraseblock
+ *
+ * This function compares versions of 2 copies of a LEB and informs which one
+ * is newer. In case of success this function returns a non-negative value, in
+ * case of error, a negative error code is returned. The success return codes
+ * use the following bits:
+ *     o bit 0 is cleared: the first PEB (described by @seb) is newer than the
+ *       second PEB (described by @pnum and @vid_hdr);
+ *     o bit 0 is set: the second PEB is newer;
+ *     o bit 1 is cleared: no bit-flips were detected in the newer PEB;
+ *     o bit 1 is set: bit-flips were detected in the newer PEB;
+ *     o bit 2 is cleared: the older PEB is not corrupted;
+ *     o bit 2 is set: the older PEB is corrupted.
+ */
+static int compare_lebs(const struct ubi_info *ubi,
+			const struct ubi_scan_leb *seb, int pnum,
+			const struct ubi_vid_hdr *vid_hdr)
+{
+	void *buf;
+	int len, err, second_is_newer, bitflips = 0, corrupted = 0;
+	uint32_t data_crc, crc;
+	long long abs, v1 = seb->leb_ver, v2 = ubi32_to_cpu(vid_hdr->leb_ver);
+	struct ubi_vid_hdr *vidh = NULL;
+
+	/*
+	 * UBI constantly increases the logical eraseblock version number and
+	 * it can overflow. Thus, we have to bear in mind that versions that
+	 * are close to %0xFFFFFFFF are less than versions that are close to
+	 * %0.
+	 *
+	 * The UBI WL unit guarantees that the number of pending tasks is not
+	 * greater than %0x7FFFFFFF. So, if the difference between any two
+	 * versions is greater than or equal to %0x7FFFFFFF, there was an
+	 * overflow and the logical eraseblock with lower version is actually
+	 * newer than the one with higher version.
+	 */
+
+	abs = v1 - v2;
+	if (abs < 0)
+		abs = -abs;
+
+	if (likely(abs < 0x7FFFFFFF))
+		/* Non-overflow situation */
+		second_is_newer = (v2 > v1);
+	else
+		second_is_newer = (v2 < v1);
+
+	/*
+	 * Now we know which copy is newer. If the copy flag of the PEB with
+	 * newer version is not set, then we just return, otherwise we have to
+	 * check data CRC. For the second PEB we already have the VID header,
+	 * for the first one - we'll need to re-read it from flash.
+	 *
+	 * FIXME: this may be optimized so that we wouldn't read twice.
+	 */
+
+	if (second_is_newer) {
+		if (!vid_hdr->copy_flag) {
+			/* It is not a copy, so it is newer */
+			dbg_scan("second PEB %d is newer, copy_flag is unset",
+				 pnum);
+			return 1;
+		}
+	} else {
+		pnum = seb->pnum;
+
+		vidh = ubi_zalloc_vid_hdr(ubi);
+		if (!vidh)
+			return -ENOMEM;
+
+		err = ubi_io_read_vid_hdr(ubi, pnum, vidh, 0);
+		if (unlikely(err)) {
+			if (err == UBI_IO_BITFLIPS)
+				bitflips = 1;
+			else {
+				dbg_err("VID header of PEB %d is bad, but it "
+					"was OK earlier", pnum);
+				if (err > 0)
+					err = -EIO;
+
+				goto out_free_vidh;
+			}
+		}
+
+		if (!vidh->copy_flag) {
+			/* It is not a copy, so it is newer */
+			dbg_scan("first PEB %d is newer, copy_flag is unset",
+				 pnum);
+			err = bitflips << 1;
+			goto out_free_vidh;
+		}
+
+		vid_hdr = vidh;
+	}
+
+	/* Read the data of the copy and check the CRC */
+
+	len = ubi32_to_cpu(vid_hdr->data_size);
+	buf = ubi_kmalloc(len);
+	if (unlikely(!buf)) {
+		err = -ENOMEM;
+		goto out_free_vidh;
+	}
+
+	err = ubi_io_read_data(ubi, buf, pnum, 0, len);
+	if (unlikely(err && err != UBI_IO_BITFLIPS))
+		goto out_free_buf;
+
+	data_crc = ubi32_to_cpu(vid_hdr->data_crc);
+	crc = crc32(UBI_CRC32_INIT, buf, len);
+	if (unlikely(crc != data_crc)) {
+		dbg_scan("PEB %d CRC error: calculated %#08x, must be %#08x",
+			 pnum, crc, data_crc);
+		corrupted = 1;
+		bitflips = 0;
+		second_is_newer = !second_is_newer;
+	} else {
+		dbg_scan("PEB %d CRC is OK", pnum);
+		bitflips = !!err;
+	}
+
+	ubi_kfree(buf);
+	ubi_free_vid_hdr(ubi, vidh);
+
+	if (second_is_newer)
+		dbg_scan("second PEB %d is newer, copy_flag is set", pnum);
+	else
+		dbg_scan("first PEB %d is newer, copy_flag is set", pnum);
+
+	return second_is_newer | (bitflips << 1) | (corrupted << 2);
+
+out_free_buf:
+	ubi_kfree(buf);
+out_free_vidh:
+	ubi_free_vid_hdr(ubi, vidh);
+	return err;
+}
+
+/**
+ * add_to_erase - add a physical eraseblock to the list of physical eraseblocks
+ * which have to be erased.
+ *
+ * @si: a pointer to the scanning information
+ * @pnum: the physical eraseblock number
+ * @ec: erase counter of this physical eraseblock
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int add_to_erase(struct ubi_scan_info *si, int pnum, int ec)
+{
+	struct ubi_scan_leb *seb;
+
+	dbg_scan("PEB %d, EC %d", pnum, ec);
+
+	seb = ubi_alloc_scan_leb();
+	if (unlikely(!seb))
+		return -ENOMEM;
+
+	seb->pnum = pnum;
+	seb->ec = ec;
+	list_add_tail(&seb->u.list, &si->erase);
+	return 0;
+}
+
+/**
+ * add_to_alien - add a physical eraseblock to the @si->alien list.
+ *
+ * @si: a pointer to the scanning information
+ * @pnum: the physical eraseblock number
+ *
+ * This function returns zero in case of success and a negative error
+ * code in case of failure.
+ */
+static int add_to_alien(struct ubi_scan_info *si, int pnum)
+{
+	struct ubi_scan_leb *seb;
+
+	dbg_scan("PEB %d is alien", pnum);
+
+	seb = ubi_alloc_scan_leb();
+	if (unlikely(!seb))
+		return -ENOMEM;
+
+	seb->pnum = pnum;
+	list_add_tail(&seb->u.list, &si->alien);
+	return 0;
+}
+
+/**
+ * add_to_free - add a physical eraseblock to the list of free physical
+ * eraseblocks.
+ *
+ * @si: a pointer to the scanning information
+ * @pnum: the physical eraseblock number
+ * @ec: erase counter of this physical eraseblock
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int add_to_free(struct ubi_scan_info *si, int pnum, int ec)
+{
+	struct ubi_scan_leb *seb;
+
+	dbg_scan("PEB %d, EC %d", pnum, ec);
+	ubi_assert(ec >= 0);
+
+	seb = ubi_alloc_scan_leb();
+	if (unlikely(!seb))
+		return -ENOMEM;
+
+	seb->pnum = pnum;
+	seb->ec = ec;
+	list_add_tail(&seb->u.list, &si->free);
+	return 0;
+}
+
+/**
+ * destroy_sv - free the scanning volume information
+ *
+ * @sv: scanning volume information
+ *
+ * This function destroys the volume RB-tree (@sv->root) and the scanning
+ * volume information.
+ */
+static void destroy_sv(struct ubi_scan_volume *sv)
+{
+	struct ubi_scan_leb *seb;
+	struct rb_node *this = sv->root.rb_node;
+
+	while (this) {
+		if (this->rb_left)
+			this = this->rb_left;
+		else if (this->rb_right)
+			this = this->rb_right;
+		else {
+			seb = rb_entry(this, struct ubi_scan_leb, u.rb);
+			this = rb_parent(this);
+			if (this) {
+				if (this->rb_left == &seb->u.rb)
+					this->rb_left = NULL;
+				else
+					this->rb_right = NULL;
+			}
+
+			ubi_free_scan_leb(seb);
+		}
+	}
+	ubi_free_scan_volume(sv);
+}
+
+/**
+ * commit_to_mean_value - commit intermediate results to the final mean erase
+ * counter value.
+ *
+ * @si: the scanning information
+ *
+ * This function is a helper function which calculates partial mean value and
+ * adds it to the resulting mean value. As we can work only in integer
+ * arithmetic and we want to calculate the mean value of erase counter
+ * accurately, we first sum erase counter values in @si->ec_sum variable and
+ * count these components in @si->ec_count. If this temporary @si->ec_sum is
+ * going to overflow, we calculate the partial mean value
+ * (@si->ec_sum/@si->ec_count) and add it to @si->mean_ec.
+ */
+static void commit_to_mean_value(struct ubi_scan_info *si)
+{
+	int rem;
+
+	rem = si->ec_sum % si->ec_count;
+	si->ec_sum /= si->ec_count;
+	if (rem >= si->ec_count / 2)
+		si->mean_ec += 1;
+	si->mean_ec += si->ec_sum;
+}
+
+/**
+ * vid_hdr_sanity_check - check that a volume identifier header is sane.
+ *
+ * @ubi: the UBI device description object
+ * @vid_hdr: the volume identifier header to check
+ * @sv: information about the volume this logical eraseblock belongs to
+ * @pnum: the physical eraseblock number the VID header came from
+ *
+ * This function checks that data stored in the volume identifier header
+ * @vid_hdr is consistent. This function returns non-zero if an inconsistency
+ * was found and zero if not.
+ *
+ * Note, UBI sanity-checks everything it reads from the flash media.
+ * Most of the checks are done in the I/O unit. Here we check that the
+ * information in this VID header is consistent with the information in other
+ * VID headers of the same volume.
+ */
+static int vid_hdr_sanity_check(const struct ubi_info *ubi,
+				const struct ubi_vid_hdr *vid_hdr,
+				const struct ubi_scan_volume *sv, int pnum)
+{
+	int vol_type = vid_hdr->vol_type;
+	int vol_id = ubi32_to_cpu(vid_hdr->vol_id);
+	int used_ebs = ubi32_to_cpu(vid_hdr->used_ebs);
+	int data_pad = ubi32_to_cpu(vid_hdr->data_pad);
+
+	if (sv->leb_count != 0) {
+		/*
+		 * This is not the first logical eraseblock belonging to this
+		 * volume. Ensure that the data in its VID header is consistent
+		 * with the data in previous logical eraseblocks' headers.
+		 */
+		int sv_vol_type;
+
+		if (unlikely(vol_id != sv->vol_id)) {
+			dbg_err("inconsistent vol_id");
+			goto bad;
+		}
+
+		if (sv->vol_type == UBI_STATIC_VOLUME)
+			sv_vol_type = UBI_VID_STATIC;
+		else
+			sv_vol_type = UBI_VID_DYNAMIC;
+
+		if (unlikely(vol_type != sv_vol_type)) {
+			dbg_err("inconsistent vol_type");
+			goto bad;
+		}
+
+		if (unlikely(used_ebs != sv->used_ebs)) {
+			dbg_err("inconsistent used_ebs");
+			goto bad;
+		}
+
+		if (unlikely(data_pad != sv->data_pad)) {
+			dbg_err("inconsistent data_pad");
+			goto bad;
+		}
+	}
+
+	return 0;
+
+bad:
+	ubi_err("bad VID header at PEB %d", pnum);
+	ubi_dbg_dump_vid_hdr(vid_hdr);
+	ubi_dbg_dump_sv(sv);
+	return -EINVAL;
+}
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_SCAN
+
+/**
+ * paranoid_check_si - check if the scanning information is sane and correct.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero if the scanning information is all right, %1 if
+ * not and a negative error code if an error occurred.
+ */
+static int paranoid_check_si(const struct ubi_info *ubi,
+			     struct ubi_scan_info *si)
+{
+	int pnum, err, vols_found = 0;
+	struct rb_node *rb1, *rb2;
+	struct ubi_scan_volume *sv;
+	struct ubi_scan_leb *seb, *last_seb;
+	const struct ubi_io_info *io = ubi->io;
+	uint8_t *buf;
+
+	/*
+	 * At first, check that scanning information is sane.
+	 */
+	rb_for_each_entry(rb1, sv, &si->volumes, rb) {
+		int leb_count = 0;
+
+		cond_resched();
+
+		vols_found += 1;
+
+		if (unlikely(si->is_empty)) {
+			ubi_err("bad is_empty flag");
+			goto bad_sv;
+		}
+
+		if (unlikely(sv->vol_id < 0 || sv->highest_lnum < 0 ||
+			     sv->leb_count < 0 || sv->vol_type < 0 ||
+			     sv->used_ebs < 0 || sv->data_pad < 0 ||
+			     sv->last_data_size < 0)) {
+			ubi_err("negative values");
+			goto bad_sv;
+		}
+
+		if (unlikely(sv->vol_id >= UBI_MAX_VOLUMES &&
+			     sv->vol_id < UBI_INTERNAL_VOL_START)) {
+			ubi_err("bad vol_id");
+			goto bad_sv;
+		}
+
+		if (unlikely(sv->vol_id > si->highest_vol_id)) {
+			ubi_err("highest_vol_id is %d, but vol_id %d is there",
+				si->highest_vol_id, sv->vol_id);
+			goto out;
+		}
+
+		if (unlikely(sv->vol_type != UBI_DYNAMIC_VOLUME &&
+			     sv->vol_type != UBI_STATIC_VOLUME)) {
+			ubi_err("bad vol_type");
+			goto bad_sv;
+		}
+
+		if (unlikely(sv->data_pad > io->leb_size / 2)) {
+			ubi_err("bad data_pad");
+			goto bad_sv;
+		}
+
+		last_seb = NULL;
+		rb_for_each_entry(rb2, seb, &sv->root, u.rb) {
+			cond_resched();
+
+			last_seb = seb;
+			leb_count += 1;
+
+			if (unlikely(seb->pnum < 0 || seb->ec < 0)) {
+				ubi_err("negative values");
+				goto bad_seb;
+			}
+
+			if (unlikely(seb->ec < si->min_ec)) {
+				ubi_err("bad si->min_ec (%d), %d found",
+					si->min_ec, seb->ec);
+				goto bad_seb;
+			}
+
+			if (unlikely(seb->ec > si->max_ec)) {
+				ubi_err("bad si->max_ec (%d), %d found",
+					si->max_ec, seb->ec);
+				goto bad_seb;
+			}
+
+			if (unlikely(seb->pnum >= io->peb_count)) {
+				ubi_err("too high PEB number %d, total PEBs %d",
+					seb->pnum, io->peb_count);
+				goto bad_seb;
+			}
+
+			if (sv->vol_type == UBI_STATIC_VOLUME) {
+				if (unlikely(seb->lnum >= sv->used_ebs)) {
+					ubi_err("bad lnum or used_ebs");
+					goto bad_seb;
+				}
+			} else {
+				if (unlikely(sv->used_ebs != 0)) {
+					ubi_err("non-zero used_ebs");
+					goto bad_seb;
+				}
+			}
+
+			if (unlikely(seb->lnum > sv->highest_lnum)) {
+				ubi_err("incorrect highest_lnum or lnum");
+				goto bad_seb;
+			}
+		}
+
+		if (unlikely(sv->leb_count != leb_count)) {
+			ubi_err("bad leb_count, %d objects in the tree",
+				leb_count);
+			goto bad_sv;
+		}
+
+		if (!last_seb)
+			continue;
+
+		seb = last_seb;
+
+		if (unlikely(seb->lnum != sv->highest_lnum)) {
+			ubi_err("bad highest_lnum");
+			goto bad_seb;
+		}
+	}
+
+	if (vols_found != si->vols_found) {
+		ubi_err("bad si->vols_found %d, should be %d",
+			si->vols_found, vols_found);
+		goto out;
+	}
+
+	/* Check that scanning information is correct */
+	rb_for_each_entry(rb1, sv, &si->volumes, rb) {
+		last_seb = NULL;
+		rb_for_each_entry(rb2, seb, &sv->root, u.rb) {
+			int vol_type;
+
+			cond_resched();
+
+			last_seb = seb;
+
+			err = ubi_io_read_vid_hdr(ubi, seb->pnum, vidh, 1);
+			if (unlikely(err) && err != UBI_IO_BITFLIPS) {
+				ubi_err("VID header is not OK (%d)", err);
+				if (err > 0)
+					err = -EIO;
+				return err;
+			}
+
+			vol_type = vidh->vol_type == UBI_VID_DYNAMIC ?
+				   UBI_DYNAMIC_VOLUME : UBI_STATIC_VOLUME;
+			if (unlikely(sv->vol_type != vol_type)) {
+				ubi_err("bad vol_type");
+				goto bad_vid_hdr;
+			}
+
+			if (unlikely(seb->leb_ver !=
+				     ubi32_to_cpu(vidh->leb_ver))) {
+				ubi_err("bad leb_ver %u", seb->leb_ver);
+				goto bad_vid_hdr;
+			}
+
+			if (unlikely(sv->vol_id !=
+				     ubi32_to_cpu(vidh->vol_id))) {
+				ubi_err("bad vol_id %d", sv->vol_id);
+				goto bad_vid_hdr;
+			}
+
+			if (unlikely(sv->compat != vidh->compat)) {
+				ubi_err("bad compat %d", vidh->compat);
+				goto bad_vid_hdr;
+			}
+
+			if (unlikely(seb->lnum !=
+				     ubi32_to_cpu(vidh->lnum))) {
+				ubi_err("bad lnum %d", seb->lnum);
+				goto bad_vid_hdr;
+			}
+
+			if (unlikely(sv->used_ebs !=
+				     ubi32_to_cpu(vidh->used_ebs))) {
+				ubi_err("bad used_ebs %d", sv->used_ebs);
+				goto bad_vid_hdr;
+			}
+
+			if (unlikely(sv->data_pad !=
+				     ubi32_to_cpu(vidh->data_pad))) {
+				ubi_err("bad data_pad %d", sv->data_pad);
+				goto bad_vid_hdr;
+			}
+		}
+
+		if (!last_seb)
+			continue;
+
+		if (unlikely(sv->highest_lnum != ubi32_to_cpu(vidh->lnum))) {
+			ubi_err("bad highest_lnum %d", sv->highest_lnum);
+			goto bad_vid_hdr;
+		}
+
+		if (unlikely(sv->last_data_size !=
+			     ubi32_to_cpu(vidh->data_size))) {
+			ubi_err("bad last_data_size %d", sv->last_data_size);
+			goto bad_vid_hdr;
+		}
+	}
+
+	/*
+	 * Make sure that all the physical eraseblocks are in one of the lists
+	 * or trees.
+	 */
+	buf = ubi_kmalloc(io->peb_count);
+	if (!buf)
+		return -ENOMEM;
+
+	memset(buf, 1, io->peb_count);
+	for (pnum = 0; pnum < io->peb_count; pnum++) {
+		err = ubi_io_is_bad(ubi, pnum);
+		if (unlikely(err < 0))
+			return err;
+		else if (err)
+			buf[pnum] = 0;
+	}
+
+	rb_for_each_entry(rb1, sv, &si->volumes, rb)
+		rb_for_each_entry(rb2, seb, &sv->root, u.rb)
+			buf[seb->pnum] = 0;
+
+	cond_resched();
+	list_for_each_entry(seb, &si->free, u.list)
+		buf[seb->pnum] = 0;
+
+	cond_resched();
+	list_for_each_entry(seb, &si->corr, u.list)
+		buf[seb->pnum] = 0;
+
+	cond_resched();
+	list_for_each_entry(seb, &si->erase, u.list)
+		buf[seb->pnum] = 0;
+
+	cond_resched();
+	list_for_each_entry(seb, &si->alien, u.list)
+		buf[seb->pnum] = 0;
+
+	err = 0;
+	for (pnum = 0; pnum < io->peb_count; pnum++)
+		if (unlikely(buf[pnum])) {
+			ubi_err("PEB %d is not referenced", pnum);
+			err = 1;
+		}
+
+	ubi_kfree(buf);
+	if (err)
+		goto out;
+	return 0;
+
+bad_seb:
+	ubi_err("bad scanning information about LEB %d", seb->lnum);
+	ubi_dbg_dump_seb(seb, 0);
+	ubi_dbg_dump_sv(sv);
+	goto out;
+
+bad_sv:
+	ubi_err("bad scanning information about volume %d", sv->vol_id);
+	ubi_dbg_dump_sv(sv);
+	goto out;
+
+bad_vid_hdr:
+	ubi_err("bad scanning information about volume %d", sv->vol_id);
+	ubi_dbg_dump_sv(sv);
+	ubi_dbg_dump_vid_hdr(vidh);
+
+out:
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+#endif /* CONFIG_MTD_UBI_DEBUG_PARANOID_SCAN */
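
A note on the last stage of paranoid_check_si(): it uses a mark-and-clear
pass. Every PEB starts out flagged, bad PEBs and every PEB found in a volume
tree or in one of the lists clear their flag, and whatever is still flagged
at the end is unreferenced. Below is a minimal user-space sketch of that
idea. The function count_unreferenced_pebs() and its plain array arguments
are hypothetical stand-ins for the rb-trees and lists in the patch, so treat
it as an illustration rather than the kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-alone model of the final check in paranoid_check_si().
 * Returns the number of PEBs that are neither bad nor referenced, or -1 if
 * the temporary buffer cannot be allocated.
 */
int count_unreferenced_pebs(int peb_count,
			    const int *bad, int bad_count,
			    const int *referenced, int ref_count)
{
	uint8_t *pending;
	int i, missing = 0;

	assert(peb_count > 0);

	pending = malloc(peb_count);
	if (!pending)
		return -1;

	/* Start by assuming that every PEB is unaccounted for. */
	memset(pending, 1, peb_count);

	/* Bad PEBs are legitimately absent from the lists and trees. */
	for (i = 0; i < bad_count; i++)
		pending[bad[i]] = 0;

	/* Everything found in a volume tree or a list is accounted for. */
	for (i = 0; i < ref_count; i++)
		pending[referenced[i]] = 0;

	/* Whatever is still flagged was never referenced. */
	for (i = 0; i < peb_count; i++)
		if (pending[i])
			missing += 1;

	free(pending);
	return missing;
}
```

With five PEBs, PEB 2 bad and PEBs 0, 1 and 3 referenced, the sketch reports
one unreferenced PEB (PEB 4). The cost is one byte per PEB, which is
acceptable for a debugging check.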


* [PATCH 17/44 take 2] [UBI] build unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (15 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 16/44 take 2] [UBI] scanning unit implementation Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 16:55 ` [PATCH 18/44 take 2] [UBI] build unit implementation Artem Bityutskiy
                   ` (29 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/build.h tmp-to/drivers/mtd/ubi/build.h
--- tmp-from/drivers/mtd/ubi/build.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/build.h	2007-02-17 18:07:26.000000000 +0200
@@ -0,0 +1,62 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI build unit.
+ *
+ * This unit is responsible for attaching MTD devices to UBI devices. At the
+ * moment only one attachment method exists: the full scan. In the future one
+ * may add a superblock-based attachment method to improve UBI scalability.
+ * Even then, the scanning method will remain useful, because if the
+ * superblock is corrupted UBI can still scan the device and re-build all the
+ * core data structures.
+ */
+
+#ifndef __UBI_BUILD_H__
+#define __UBI_BUILD_H__
+
+struct ubi_info;
+
+/**
+ * ubi_bld_attach_mtd_dev - attach an MTD device.
+ *
+ * @ubi: the UBI device description object
+ * @mtd_num: MTD device number
+ * @vid_hdr_offset: volume identifier headers offset
+ * @data_offset: data offset
+ *
+ * This function attaches an MTD device number @mtd_num. If @vid_hdr_offset and
+ * @data_offset are zero, the default layout of UBI headers is assumed. See the
+ * I/O unit.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_bld_attach_mtd_dev(struct ubi_info *ubi, int mtd_num,
+			   int vid_hdr_offset, int data_offset);
+
+/**
+ * ubi_bld_detach_mtd_dev - detach an MTD device.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_bld_detach_mtd_dev(struct ubi_info *ubi);
+
+#endif /* __UBI_BUILD_H__ */


* [PATCH 18/44 take 2] [UBI] build unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (16 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 17/44 take 2] [UBI] build unit header Artem Bityutskiy
@ 2007-02-17 16:55 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 19/44 take 2] [UBI] volume table unit header Artem Bityutskiy
                   ` (28 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/build.c tmp-to/drivers/mtd/ubi/build.c
--- tmp-from/drivers/mtd/ubi/build.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/build.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,191 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "io.h"
+#include "wl.h"
+#include "volmgmt.h"
+#include "account.h"
+#include "background.h"
+#include "vtbl.h"
+#include "eba.h"
+#include "build.h"
+#include "uif.h"
+#include "scan.h"
+#include "badeb.h"
+#include "misc.h"
+#include "debug.h"
+
+#ifdef CONFIG_MTD_UBI_DEBUG_DISABLE_BGT
+#define ENABLE_BGT 0
+#else
+#define ENABLE_BGT 1
+#endif
+
+static int attach_by_scanning(struct ubi_info *ubi);
+
+int ubi_bld_attach_mtd_dev(struct ubi_info *ubi, int mtd_num,
+			   int vid_hdr_offset, int data_offset)
+{
+	int err;
+	const struct ubi_io_info *io;
+	const struct ubi_acc_info *acc;
+
+	dbg_bld("attaching mtd%d to ubi%d", mtd_num, ubi->ubi_num);
+
+	err = ubi_io_init(ubi, mtd_num, vid_hdr_offset, data_offset);
+	if (err) {
+		dbg_err("failed to initialize I/O unit, error %d", err);
+		return err;
+	}
+
+	err = ubi_bgt_init(ubi);
+	if (err) {
+		dbg_err("failed to initialize background thread unit, error %d",
+			err);
+		goto out_io;
+	}
+
+	err = attach_by_scanning(ubi);
+	if (err) {
+		dbg_err("failed to attach MTD device, error %d", err);
+		goto out_bgt;
+	}
+
+	err = ubi_beb_init(ubi);
+	if (err) {
+		dbg_err("failed to initialize bad eraseblock handling unit, "
+			"error %d", err);
+		goto out_detach;
+	}
+
+	err = ubi_uif_init(ubi);
+	if (err) {
+		dbg_err("failed to initialize user interfaces unit for UBI "
+			"device %d, error %d", ubi->ubi_num, err);
+		goto out_beb;
+	}
+
+	io = ubi->io;
+	acc = ubi->acc;
+
+	ubi_msg("attached mtd%d to ubi%d", mtd_num, ubi->ubi_num);
+	ubi_msg("MTD device name:            \"%s\"", io->mtd_name);
+	ubi_msg("MTD device size:            %llu MiB", io->flash_size >> 20);
+	ubi_msg("physical eraseblock size:   %d bytes (%d KiB)",
+		io->peb_size, io->peb_size >> 10);
+	ubi_msg("logical eraseblock size:    %d bytes", io->leb_size);
+	ubi_msg("number of good PEBs:        %d", io->good_peb_count);
+	ubi_msg("number of bad PEBs:         %d", io->bad_peb_count);
+	ubi_msg("smallest flash I/O unit:    %d", io->min_io_size);
+	ubi_msg("VID header offset:          %d (aligned %d)",
+		io->vid_hdr_offset, io->vid_hdr_aloffset);
+	ubi_msg("data offset:                %d", io->leb_start);
+	ubi_msg("max. allowed volumes:       %d", acc->max_volumes);
+	ubi_msg("wear-levelling threshold:   %d", CONFIG_MTD_UBI_WL_THRESHOLD);
+	ubi_msg("number of internal volumes: %d", acc->ivol_count);
+	ubi_msg("number of user volumes:     %d", acc->uvol_count);
+	ubi_msg("available PEBs:             %d", acc->avail_pebs);
+	ubi_msg("total number of reserved PEBs: %d", acc->rsvd_pebs);
+	ubi_msg("number of PEBs reserved for bad PEB handling: %d",
+		ubi->beb->reserved_pebs);
+
+	if (ENABLE_BGT && !io->ro_mode)
+		ubi_bgt_enable(ubi);
+
+	return 0;
+
+out_beb:
+	ubi_beb_close(ubi);
+out_detach:
+	ubi_eba_close(ubi);
+	ubi_wl_close(ubi);
+	ubi_vmt_close(ubi);
+out_bgt:
+	ubi_bgt_kill_thread(ubi);
+	ubi_bgt_close(ubi);
+out_io:
+	ubi_io_close(ubi);
+	return err;
+}
+
+void ubi_bld_detach_mtd_dev(struct ubi_info *ubi)
+{
+	int mtd_num = ubi->io->mtd_num, ubi_num = ubi->ubi_num;
+
+	dbg_bld("detaching mtd%d from ubi%d", mtd_num, ubi_num);
+
+	ubi_bgt_kill_thread(ubi);
+	ubi_uif_close(ubi);
+	ubi_beb_close(ubi);
+	ubi_eba_close(ubi);
+	ubi_wl_close(ubi);
+	ubi_vmt_close(ubi);
+	ubi_bgt_close(ubi);
+	ubi_io_close(ubi);
+	ubi_msg("detached mtd%d from ubi%d", mtd_num, ubi_num);
+}
+
+/**
+ * attach_by_scanning - attach an MTD device using the scanning method.
+ *
+ * @ubi: UBI device descriptor
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int attach_by_scanning(struct ubi_info *ubi)
+{
+	int err;
+	struct ubi_scan_info *si;
+
+	dbg_bld("attach mtd device by scanning");
+
+	si = ubi_scan(ubi);
+	if (IS_ERR(si))
+		return PTR_ERR(si);
+
+	err = ubi_vmt_init_scan(ubi, si);
+	if (err)
+		goto out_si;
+
+	err = ubi_wl_init_scan(ubi, si);
+	if (err)
+		goto out_vmt;
+
+	err = ubi_eba_init_scan(ubi, si);
+	if (err)
+		goto out_wl;
+
+	ubi_msg("mean erase counter:         %d", si->mean_ec);
+	ubi_scan_destroy_si(si);
+	return 0;
+
+out_wl:
+	ubi_wl_close(ubi);
+out_vmt:
+	ubi_vmt_close(ubi);
+out_si:
+	ubi_scan_destroy_si(si);
+	return err;
+}
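
Both ubi_bld_attach_mtd_dev() and attach_by_scanning() use the usual
goto-based unwinding: each label undoes one initialization step, and the
labels appear in reverse order of initialization, so a failure at step N
closes steps N-1 down to 1 and nothing else. The following user-space sketch
shows the pattern in isolation. The unit names, unit_init(), unit_close() and
the fail_at parameter are hypothetical and exist only to make the order
observable; they are not part of the UBI code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOG 16

/* Records the order in which the hypothetical units were closed. */
static const char *close_log[MAX_LOG];
static int close_count;

static void unit_close(const char *name)
{
	assert(close_count < MAX_LOG);
	close_log[close_count++] = name;
}

/* Simulate a failure in the unit named by @fail_at (NULL: never fail). */
static int unit_init(const char *name, const char *fail_at)
{
	return (fail_at && strcmp(name, fail_at) == 0) ? -1 : 0;
}

int attach_sketch(const char *fail_at)
{
	int err;

	close_count = 0;

	err = unit_init("io", fail_at);
	if (err)
		return err;	/* nothing to undo yet */

	err = unit_init("bgt", fail_at);
	if (err)
		goto out_io;

	err = unit_init("scan", fail_at);
	if (err)
		goto out_bgt;

	err = unit_init("beb", fail_at);
	if (err)
		goto out_scan;

	return 0;

	/* Labels fall through, closing units in reverse order. */
out_scan:
	unit_close("scan");
out_bgt:
	unit_close("bgt");
out_io:
	unit_close("io");
	return err;
}

int closed_count(void)
{
	return close_count;
}

const char *closed_name(int i)
{
	return close_log[i];
}
```

For example, attach_sketch("beb") fails in the fourth step and closes "scan",
"bgt" and "io" in that order, while attach_sketch("bgt") closes only "io".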


* [PATCH 19/44 take 2] [UBI] volume table unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (17 preceding siblings ...)
  2007-02-17 16:55 ` [PATCH 18/44 take 2] [UBI] build unit implementation Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 20/44 take 2] [UBI] volume table unit implementation Artem Bityutskiy
                   ` (27 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/vtbl.h tmp-to/drivers/mtd/ubi/vtbl.h
--- tmp-from/drivers/mtd/ubi/vtbl.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/vtbl.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,302 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ * Copyright (C) Nokia Corporation, 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * The volume table unit.
+ *
+ * This unit is responsible for maintaining the volume table. The volume table
+ * is an on-flash table containing volume meta-data like volume name, number of
+ * reserved physical eraseblocks for this volume, volume type, etc. The volume
+ * table is stored in the so-called "layout volume".
+ *
+ * The layout volume is an internal volume where the volume table is stored.
+ * Actually, there are 2 equivalent copies of the volume table in the layout
+ * volume. The layout volume is organized as follows. It consists of two
+ * logical eraseblocks, LEB 0 and LEB 1, each storing a copy of the volume
+ * table, i.e. LEB 0 and LEB 1 duplicate each other. This redundancy
+ * guarantees robustness and tolerance to unclean reboots. The volume table is
+ * a mere array of so-called "volume table records". Each record contains full
+ * information about the volume and is protected by a CRC checksum.
+ *
+ * The volume table is changed as follows. It is first changed in RAM. Then LEB
+ * 0 is erased, and the updated volume table is written back to LEB 0. The same
+ * is done with LEB 1. This scheme guarantees recoverability from unclean
+ * reboots.
+ *
+ * In this UBI implementation the on-flash volume table does not record how
+ * much data a static volume actually stores. This information can be found
+ * while scanning (from the EB headers), so we do not store it in the on-flash
+ * volume table. As long as we have an unscalable UBI implementation which
+ * uses scanning, we can live without it. A scalable implementation would
+ * require it.
+ *
+ * Still, it would be beneficial to store this information in the volume
+ * table. For example, suppose we have a static volume X, and all its physical
+ * eraseblocks have gone bad for some reason. When we attach the corresponding
+ * MTD device, the scanning unit finds no logical eraseblocks belonging to
+ * volume X, although according to the volume table volume X does exist. So we
+ * don't know whether it is just empty or all its physical eraseblocks went
+ * bad, and we cannot alert the user about this corruption.
+ *
+ * Note, although we don't store this information in the on-flash volume table,
+ * we keep it in the in-RAM copy of this table just because it is quite
+ * convenient.
+ *
+ * The volume table also stores the so-called "update marker", which is used
+ * to implement the update operation. Before updating the volume, the update
+ * marker is set; after the update operation is finished, the update marker is
+ * cleared. So if the update operation was interrupted (e.g. by an unclean
+ * reboot), the update marker is still there and we know that the volume's
+ * contents are damaged.
+ *
+ * Note, in this implementation we do not support concurrent updates and only
+ * one volume at a time may be updated.
+ */
+
+#ifndef __UBI_VTBL_H__
+#define __UBI_VTBL_H__
+
+#include <linux/mutex.h>
+#include <mtd/ubi-header.h>
+
+struct ubi_info;
+struct ubi_scan_info;
+struct ubi_vtbl_vtr;
+
+/**
+ * ubi_vtbl_mkvol - create volume table record for a new volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the new volume
+ * @vtr: volume table record of the new volume
+ *
+ * This function adds a volume described in @vtr to the volume table. This
+ * function uses only @vtr->reserved_pebs, @vtr->alignment, @vtr->data_pad,
+ * @vtr->vol_type, and @vtr->name_len fields of the @vtr object. The
+ * @vtr->usable_leb_size is calculated automatically. The data-related fields
+ * are set to zero for static volumes and to the volume size for dynamic ones.
+ *
+ * Returns zero on success and a negative error code on failure.
+ */
+int ubi_vtbl_mkvol(const struct ubi_info *ubi, int vol_id,
+		   const struct ubi_vtbl_vtr *vtr);
+
+/**
+ * ubi_vtbl_rmvol - clear the volume table record of a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to remove
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_vtbl_rmvol(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_vtbl_rsvol - change volume size in the volume table record.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: re-sized volume's ID
+ * @reserved_pebs: new size, i.e. new number of reserved eraseblocks.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_vtbl_rsvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs);
+
+/**
+ * ubi_vtbl_set_upd_marker - set the update marker flag.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume
+ *
+ * This function sets the update marker flag for volume @vol_id. Returns zero
+ * in case of success and a negative error code in case of failure.
+ */
+int ubi_vtbl_set_upd_marker(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_vtbl_clear_upd_marker - clear the update marker flag.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume
+ * @bytes: new data size in bytes
+ *
+ * This function clears the update marker for volume @vol_id, sets new volume
+ * data size and clears the "corrupted" flag (static volumes only). This
+ * function returns zero in case of success and a negative error code in case
+ * of failure.
+ */
+int ubi_vtbl_clear_upd_marker(const struct ubi_info *ubi, int vol_id,
+			      long long bytes);
+
+/**
+ * ubi_vtbl_set_corrupted - mark a volume as 'corrupted'.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to mark
+ *
+ * This function marks volume @vol_id as corrupted. If the volume is not static
+ * it does nothing. Returns zero in case of success and a negative error code
+ * in case of failure.
+ */
+int ubi_vtbl_set_corrupted(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_vtbl_get_vtr - retrieve a volume table record.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the requested volume ID
+ *
+ * This function returns a pointer to the volume record or an error code.
+ * If the volume ID is incorrect, %-EINVAL is returned, if the volume does
+ * not exist, %-ENODEV is returned.
+ *
+ * This function does not access the flash media because it retrieves the
+ * information from the in-RAM volume table copy. So it does not sleep.
+ */
+const struct ubi_vtbl_vtr *ubi_vtbl_get_vtr(const struct ubi_info *ubi,
+					    int vol_id);
+
+/**
+ * ubi_vtbl_get_compat - get compatibility flags of a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: volume ID
+ *
+ * This function returns compatibility flags of volumes. User volumes have no
+ * compatibility flags, so %0 is returned. The @vol_id must be correct.
+ */
+int ubi_vtbl_get_compat(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_is_ivol - check if a volume is an internal volume.
+ *
+ * @vol_id: ID of the volume to test
+ *
+ * Returns %1 if the volume is an internal volume and %0 otherwise.
+ */
+static inline int ubi_is_ivol(int vol_id)
+{
+	return vol_id >= UBI_INTERNAL_VOL_START &&
+	       vol_id < UBI_INTERNAL_VOL_START + UBI_INT_VOL_COUNT;
+}
+
+/**
+ * ubi_ivol_is_known - check if this is a known internal volume.
+ *
+ * @vol_id: ID of the volume to check.
+ *
+ * This function returns non-zero if this is a known and supported internal
+ * volume and zero if not.
+ */
+static inline int ubi_ivol_is_known(int vol_id)
+{
+	return vol_id == UBI_LAYOUT_VOL_ID;
+}
+
+/**
+ * ubi_vtbl_init_scan - initialize the volume table unit using scanning
+ * information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_vtbl_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si);
+
+/**
+ * ubi_vtbl_close - close the volume table unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_vtbl_close(const struct ubi_info *ubi);
+
+/**
+ * struct ubi_vtbl_vtr - in-memory representation of volume table records.
+ *
+ * @reserved_pebs: how many physical eraseblocks are reserved for this volume
+ * @alignment: volume alignment
+ * @data_pad: how many bytes are not used at the end of eraseblocks to
+ * satisfy the requested alignment
+ * @vol_type: volume type (%UBI_DYNAMIC_VOLUME or %UBI_STATIC_VOLUME)
+ * @name_len: volume name length
+ * @name: volume name
+ * @usable_leb_size: logical eraseblock size without padding
+ * @used_ebs: how many logical eraseblocks in this volume contain data
+ * @last_eb_bytes: how many bytes are stored in the last logical eraseblock
+ * @used_bytes: how many bytes of data this volume contains
+ * @corrupted: non-zero if the data is corrupted (static volumes only)
+ * @upd_marker: non-zero if the update marker is set for this volume
+ *
+ * Note, the @usable_leb_size field is not stored on flash, as it is easily
+ * calculated with help of the @data_pad field. But it is just very handy, so
+ * we keep it in the in-RAM volume table record representation.
+ *
+ * The same applies to @used_ebs, @last_eb_bytes, @used_bytes and @corrupted.
+ * We do not store them in the on-flash volume table but keep them in RAM.
+ *
+ * The @corrupted field indicates that the volume's contents are corrupted.
+ * Since UBI protects only the contents of static volumes, this field is only
+ * relevant to static volumes. In case of dynamic volumes it is the user's
+ * responsibility to ensure data integrity.
+ *
+ * The @upd_marker flag indicates that this volume is either being updated at
+ * the moment or is damaged because of an unclean reboot. Note, the @corrupted
+ * flag is always cleared if the @upd_marker flag is set.
+ */
+struct ubi_vtbl_vtr {
+	int reserved_pebs;
+	int alignment;
+	int data_pad;
+	int vol_type;
+	int name_len;
+	int usable_leb_size;
+	const char *name;
+	int used_ebs;
+	int last_eb_bytes;
+	long long used_bytes;
+	int corrupted;
+	int upd_marker;
+};
+
+/**
+ * struct ubi_vtbl_info - volume table unit description data structure.
+ *
+ * @vt_slots: how many volume table records are stored in the volume table
+ * @vt_size: size of the volume table in bytes
+ * @vt: the in-RAM copy of the volume table
+ * @mutex: serializes volume table changes
+ * @empty_rec: volume table record corresponding to an empty volume
+ * @ivol_vtrs: volume table records corresponding to internal volumes
+ */
+struct ubi_vtbl_info {
+	int vt_slots;                                     /* public  */
+	int vt_size;                                      /* private */
+	struct ubi_vtbl_vtr *vt;                          /* private */
+	struct mutex mutex;                               /* private */
+	const struct ubi_vol_tbl_record empty_rec;        /* private */
+	struct ubi_vtbl_vtr ivol_vtrs[UBI_INT_VOL_COUNT]; /* private */
+};
+
+#endif /* __UBI_VTBL_H__ */
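
The header comment describes the update scheme for the two volume table
copies: erase and rewrite LEB 0, then erase and rewrite LEB 1, so that at any
moment at least one copy is intact. The user-space sketch below models that
sequence and a possible recovery rule. Several things here are assumptions
made for illustration only: the 16-byte table, the tbl_* and copy_* names,
the "prefer the first valid copy" recovery policy, and above all the
checksum, which is a trivial additive sum used in place of the CRC so that
the example stays short and an erased (all 0xFF) copy can never look valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TBL_SIZE 16	/* hypothetical volume table size in bytes */

/* One copy of the table as it would sit in one LEB. */
struct tbl_copy {
	uint8_t data[TBL_SIZE];
	uint32_t sum;
};

/* Additive checksum, a stand-in for the CRC used by the real code. */
static uint32_t tbl_checksum(const uint8_t *data)
{
	uint32_t sum = 1;
	size_t i;

	assert(data != NULL);
	for (i = 0; i < TBL_SIZE; i++)
		sum += data[i];
	return sum;
}

/* Erased flash reads back as all ones. */
static void copy_erase(struct tbl_copy *c)
{
	memset(c, 0xFF, sizeof(*c));
}

static void copy_write(struct tbl_copy *c, const uint8_t *data)
{
	memcpy(c->data, data, TBL_SIZE);
	c->sum = tbl_checksum(data);
}

int copy_is_valid(const struct tbl_copy *c)
{
	return c->sum == tbl_checksum(c->data);
}

/*
 * Update both copies in order. Only the first @steps of the four steps
 * (erase 0, write 0, erase 1, write 1) are performed, which models an
 * unclean reboot in the middle of the update.
 */
void tbl_update(struct tbl_copy copies[2], const uint8_t *data, int steps)
{
	int i;

	for (i = 0; i < 2; i++) {
		if (steps-- <= 0)
			return;
		copy_erase(&copies[i]);
		if (steps-- <= 0)
			return;
		copy_write(&copies[i], data);
	}
}

/* Return the first copy that passes the checksum, or NULL if both fail. */
const struct tbl_copy *tbl_recover(const struct tbl_copy copies[2])
{
	int i;

	for (i = 0; i < 2; i++)
		if (copy_is_valid(&copies[i]))
			return &copies[i];
	return NULL;
}
```

If the update is interrupted right after copy 0 has been erased, copy 1 still
holds the old table and tbl_recover() returns it. If it is interrupted after
copy 0 has been rewritten, copy 0 already holds the new table. Either way one
consistent table survives, which is the property the comment relies on.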


* [PATCH 20/44 take 2] [UBI] volume table unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (18 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 19/44 take 2] [UBI] volume table unit header Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 21/44 take 2] [UBI] background thread unit header Artem Bityutskiy
                   ` (26 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/vtbl.c tmp-to/drivers/mtd/ubi/vtbl.c
--- tmp-from/drivers/mtd/ubi/vtbl.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/vtbl.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,1369 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ * Copyright (C) Nokia Corporation, 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/string.h>
+#include <linux/crc32.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <mtd/ubi-header.h>
+#include <asm/div64.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "wl.h"
+#include "io.h"
+#include "vtbl.h"
+#include "eba.h"
+#include "scan.h"
+#include "misc.h"
+#include "debug.h"
+
+static int change_volume(const struct ubi_info *ubi,
+			 int vol_id, const struct ubi_vtbl_vtr *vtr);
+static void fill_data_size_fields(const struct ubi_info *ubi,
+				  struct ubi_vtbl_vtr *vtr);
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_VTBL
+static int paranoid_check_vtr(const struct ubi_info *ubi,
+			      const struct ubi_vtbl_vtr *vtr);
+#else
+#define paranoid_check_vtr(ubi, vtr) 0
+#endif
+
+int ubi_vtbl_mkvol(const struct ubi_info *ubi, int vol_id,
+		   const struct ubi_vtbl_vtr *vtr)
+{
+	int err;
+	struct ubi_vtbl_vtr new_vtr;
+
+	dbg_vtbl("create volume: vol_id %d, reserved_pebs %d, "
+		 "alignment %d, data_pad %d, vol_type %d, "
+		 "name_len %d, name %s", vol_id, vtr->reserved_pebs,
+		 vtr->alignment, vtr->data_pad, vtr->vol_type,
+		 vtr->name_len, vtr->name);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0 && vol_id < ubi->vtbl->vt_slots);
+	ubi_assert(vtr->reserved_pebs > 0);
+	ubi_assert(ubi->vtbl->vt[vol_id].reserved_pebs == 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+
+	memset(&new_vtr, 0, sizeof(struct ubi_vtbl_vtr));
+	new_vtr.reserved_pebs = vtr->reserved_pebs;
+	new_vtr.alignment = vtr->alignment;
+	new_vtr.data_pad = vtr->data_pad;
+	new_vtr.vol_type = vtr->vol_type;
+	new_vtr.name_len = vtr->name_len;
+	new_vtr.name = vtr->name;
+
+	fill_data_size_fields(ubi, &new_vtr);
+
+	err = change_volume(ubi, vol_id, &new_vtr);
+	return err;
+}
+
+int ubi_vtbl_rmvol(const struct ubi_info *ubi, int vol_id)
+{
+	int err;
+	struct ubi_vtbl_vtr empty_vtr;
+
+	dbg_vtbl("remove volume %d", vol_id);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0 && vol_id < ubi->vtbl->vt_slots);
+	ubi_assert(ubi->vtbl->vt[vol_id].reserved_pebs != 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+
+	empty_vtr.reserved_pebs = 0;
+	err = change_volume(ubi, vol_id, &empty_vtr);
+	return err;
+}
+
+int ubi_vtbl_rsvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs)
+{
+	int err;
+	struct ubi_vtbl_vtr vtr;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	dbg_vtbl("re-size volume %d to %d LEBs, old size %d LEBs", vol_id,
+		 reserved_pebs, vtbl->vt[vol_id].reserved_pebs);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0 && vol_id < vtbl->vt_slots);
+	ubi_assert(reserved_pebs > 0);
+	ubi_assert(vtbl->vt[vol_id].reserved_pebs != 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+
+	memcpy(&vtr, &vtbl->vt[vol_id], sizeof(struct ubi_vtbl_vtr));
+
+	vtr.name = strdup_len(vtbl->vt[vol_id].name,
+			      vtbl->vt[vol_id].name_len);
+	if (!vtr.name)
+		return -ENOMEM;
+
+	vtr.reserved_pebs = reserved_pebs;
+	err = change_volume(ubi, vol_id, &vtr);
+	ubi_kfree(vtr.name);
+	return err;
+}
+
+int ubi_vtbl_set_upd_marker(const struct ubi_info *ubi, int vol_id)
+{
+	int err;
+	struct ubi_vtbl_vtr vtr;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	dbg_vtbl("set update marker for volume %d", vol_id);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0 && vol_id < vtbl->vt_slots);
+	ubi_assert(vtbl->vt[vol_id].reserved_pebs != 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+
+	if (vtbl->vt[vol_id].upd_marker) {
+		dbg_vtbl("update marker is already set, do nothing");
+		return 0;
+	}
+
+	memcpy(&vtr, &vtbl->vt[vol_id], sizeof(struct ubi_vtbl_vtr));
+
+	vtr.name = strdup_len(vtbl->vt[vol_id].name,
+			      vtbl->vt[vol_id].name_len);
+	if (!vtr.name)
+		return -ENOMEM;
+	vtr.upd_marker = 1;
+
+	err = change_volume(ubi, vol_id, &vtr);
+	ubi_kfree(vtr.name);
+	return err;
+}
+
+int ubi_vtbl_clear_upd_marker(const struct ubi_info *ubi, int vol_id,
+			      long long bytes)
+{
+	int err;
+	struct ubi_vtbl_vtr vtr;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	dbg_vtbl("clear update marker for volume %d", vol_id);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0 && vol_id < vtbl->vt_slots);
+	ubi_assert(vtbl->vt[vol_id].reserved_pebs != 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+	ubi_assert(bytes >= 0 && bytes <= vtbl->vt[vol_id].usable_leb_size *
+				          vtbl->vt[vol_id].reserved_pebs);
+
+	if (!vtbl->vt[vol_id].upd_marker) {
+		dbg_vtbl("update marker is already cleared, do nothing");
+		return 0;
+	}
+
+	memcpy(&vtr, &vtbl->vt[vol_id], sizeof(struct ubi_vtbl_vtr));
+
+	vtr.name = strdup_len(vtbl->vt[vol_id].name,
+			      vtbl->vt[vol_id].name_len);
+	if (!vtr.name)
+		return -ENOMEM;
+	vtr.upd_marker = 0;
+
+	if (vtbl->vt[vol_id].vol_type == UBI_STATIC_VOLUME) {
+		dbg_vtbl("set data length of static volume %d to %lld",
+			 vol_id, bytes);
+		vtr.used_bytes = bytes;
+		vtr.corrupted = 0;
+		fill_data_size_fields(ubi, &vtr);
+	} else
+		ubi_assert(vtr.corrupted == 0);
+
+	err = paranoid_check_vtr(ubi, &vtbl->vt[vol_id]);
+	if (!err)
+		err = change_volume(ubi, vol_id, &vtr);
+
+	/* The name copy is freed on both the success and the error path */
+	ubi_kfree(vtr.name);
+
+	return err;
+}
+
+int ubi_vtbl_set_corrupted(const struct ubi_info *ubi, int vol_id)
+{
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	struct ubi_vtbl_vtr *vtr = &vtbl->vt[vol_id];
+
+	ubi_assert(vol_id >= 0 && vol_id < vtbl->vt_slots);
+	ubi_assert(ubi->vtbl->vt[vol_id].reserved_pebs != 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+	ubi_assert(ubi->vtbl->vt[vol_id].upd_marker == 0);
+
+	if (vtr->vol_type == UBI_STATIC_VOLUME) {
+		dbg_vtbl("mark static volume %d as corrupted", vol_id);
+		vtr->corrupted = 1;
+	}
+
+	return 0;
+}
+
+static const struct ubi_vtbl_vtr *get_ivol_vtr(const struct ubi_info *ubi,
+					       int vol_id);
+
+const struct ubi_vtbl_vtr *ubi_vtbl_get_vtr(const struct ubi_info *ubi,
+					    int vol_id)
+{
+	int err;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	if (ubi_is_ivol(vol_id))
+		return get_ivol_vtr(ubi, vol_id);
+
+	ubi_assert(vol_id >= 0 && vol_id < vtbl->vt_slots);
+
+	if (vtbl->vt[vol_id].reserved_pebs == 0)
+		return ERR_PTR(-ENODEV);
+
+	err = paranoid_check_vtr(ubi, &vtbl->vt[vol_id]);
+	if (unlikely(err))
+		return ERR_PTR(-EINVAL);
+
+	return &vtbl->vt[vol_id];
+}
+
+int ubi_vtbl_get_compat(const struct ubi_info *ubi, int vol_id)
+{
+	if (!ubi_is_ivol(vol_id))
+		return 0;
+
+	switch (vol_id) {
+	case UBI_LAYOUT_VOL_ID:
+		return UBI_LAYOUT_VOLUME_COMPAT;
+	default:
+		BUG();
+	}
+
+	return -ENODEV;
+}
+
+static int init_ram_vt(const struct ubi_info *ubi,
+		       const struct ubi_scan_info *si,
+		       const struct ubi_vol_tbl_record *vol_tbl);
+
+static struct ubi_vol_tbl_record *create_empty_lvol(const struct ubi_info *ubi,
+						    struct ubi_scan_info *si);
+
+static struct ubi_vol_tbl_record *process_lvol(const struct ubi_info *ubi,
+					       struct ubi_scan_info *si,
+					       struct ubi_scan_volume *sv);
+
+static int check_scanning_info(const struct ubi_info *ubi,
+			       struct ubi_scan_info *si);
+
+static void free_volume_info(const struct ubi_info *ubi);
+
+static void init_ivols(struct ubi_info *ubi);
+
+int ubi_vtbl_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si)
+{
+	int err;
+	uint32_t crc;
+	struct ubi_vol_tbl_record *vol_tbl;
+	struct ubi_vtbl_info *vtbl;
+	struct ubi_scan_volume *sv;
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_vtbl("initialize the volume table unit");
+
+	vtbl = ubi_kzalloc(sizeof(struct ubi_vtbl_info));
+	if (!vtbl)
+		return -ENOMEM;
+	ubi->vtbl = vtbl;
+
+	mutex_init(&vtbl->mutex);
+
+	/* Initialize the empty volume table record pattern */
+	vol_tbl = (struct ubi_vol_tbl_record *)&vtbl->empty_rec;
+	crc = crc32(UBI_CRC32_INIT, vol_tbl, UBI_VTBL_RECORD_SIZE_CRC);
+	vol_tbl->crc = cpu_to_ubi32(crc);
+
+	/*
+	 * The number of supported volumes is limited by the eraseblock size
+	 * and by the UBI_MAX_VOLUMES constant.
+	 */
+	vtbl->vt_slots = io->leb_size / UBI_VTBL_RECORD_SIZE;
+	if (vtbl->vt_slots > UBI_MAX_VOLUMES)
+		vtbl->vt_slots = UBI_MAX_VOLUMES;
+
+	/*
+	 * Calculate the size of the volume table. It must not exceed the
+	 * logical eraseblock size. Here we also ensure that @vtbl->vt_size
+	 * has correct alignment (i.e., it is a multiple of the minimal flash
+	 * I/O unit size).
+	 */
+	vtbl->vt_size = vtbl->vt_slots * UBI_VTBL_RECORD_SIZE;
+	vtbl->vt_size = align_up(vtbl->vt_size, io->min_io_size);
+
+	sv = ubi_scan_find_sv(si, UBI_LAYOUT_VOL_ID);
+	if (!sv) {
+		/*
+		 * No logical eraseblocks belonging to the layout volume were
+		 * found. This could mean that the flash is just empty. In
+		 * this case we "UBI-nize" this flash by means of creating a
+		 * layout volume with an empty volume table.
+		 *
+		 * But if the flash is not empty, this must be serious
+		 * corruption, or we were fed bad or random data. We could try
+		 * to do some recovery, but it is not implemented, and it seems
+		 * better to do this with user-space tools.
+		 */
+		if (si->is_empty) {
+			vol_tbl = create_empty_lvol(ubi, si);
+			if (IS_ERR(vol_tbl)) {
+				err = PTR_ERR(vol_tbl);
+				goto out;
+			}
+		} else {
+			ubi_err("the layout volume was not found");
+			err = -EINVAL;
+			goto out;
+		}
+	} else {
+		/*
+		 * The layout volume was found during scanning, so read and
+		 * check it.
+		 */
+
+		if (sv->leb_count > UBI_LAYOUT_VOLUME_EBS) {
+			/* This must not happen with valid UBI images */
+			dbg_err("too many LEBs (%d) belonging to the "
+				"layout volume found", sv->leb_count);
+			err = -EINVAL;
+			goto out;
+		}
+
+		vol_tbl = process_lvol(ubi, si, sv);
+		if (IS_ERR(vol_tbl)) {
+			err = PTR_ERR(vol_tbl);
+			goto out;
+		}
+	}
+
+	/*
+	 * The layout volume is OK, initialize the corresponding in-RAM data
+	 * structures.
+	 */
+	err = init_ram_vt(ubi, si, vol_tbl);
+	/* The on-flash image is no longer needed, whether or not we failed */
+	ubi_kfree(vol_tbl);
+	if (err)
+		goto out;
+
+	init_ivols(ubi);
+
+	/*
+	 * Make sure that the scanning information is consistent with the
+	 * information stored in the volume table.
+	 */
+	err = check_scanning_info(ubi, si);
+	if (err)
+		goto out_vi;
+
+	dbg_vtbl("the volume table unit is initialized");
+	return 0;
+
+out_vi:
+	free_volume_info(ubi);
+out:
+	ubi_kfree(vtbl);
+	return err;
+}
+
+void ubi_vtbl_close(const struct ubi_info *ubi)
+{
+	dbg_vtbl("close the volume table unit");
+	free_volume_info(ubi);
+	ubi_kfree(ubi->vtbl);
+}
+
+/**
+ * change_volume - change geometry of a user volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to change
+ * @vtr: new volume table record
+ *
+ * This function accepts a new volume table record in @vtr and changes the
+ * volume table correspondingly (both in RAM and on flash). If the
+ * @vtr->reserved_pebs field contains zero, the volume is deleted.
+ *
+ * Returns zero in case of success and a negative error code in case of
+ * failure.
+ */
+static int change_volume(const struct ubi_info *ubi,
+			 int vol_id, const struct ubi_vtbl_vtr *vtr)
+{
+	int i, err;
+	struct ubi_vol_tbl_record *vol_tbl;
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	vol_tbl = ubi_kzalloc(vtbl->vt_size);
+	if (!vol_tbl)
+		return -ENOMEM;
+
+	mutex_lock(&vtbl->mutex);
+
+	/* Generate the on-flash volume table contents */
+	for (i = 0; i < vtbl->vt_slots; i++) {
+		uint32_t crc;
+		const struct ubi_vtbl_vtr *tmp_vtr;
+
+		cond_resched();
+		tmp_vtr = &vtbl->vt[i];
+
+		err = paranoid_check_vtr(ubi, tmp_vtr);
+		if (unlikely(err))
+			goto out_unlock;
+
+		if (unlikely(i == vol_id))
+			tmp_vtr = vtr;
+
+		if (tmp_vtr->reserved_pebs == 0) {
+			/* Volume is empty */
+			memcpy(&vol_tbl[i], &vtbl->empty_rec, UBI_VTBL_RECORD_SIZE);
+			continue;
+		}
+
+		vol_tbl[i].reserved_pebs = cpu_to_ubi32(tmp_vtr->reserved_pebs);
+		vol_tbl[i].alignment = cpu_to_ubi32(tmp_vtr->alignment);
+		vol_tbl[i].data_pad = cpu_to_ubi32(tmp_vtr->data_pad);
+		vol_tbl[i].upd_marker = tmp_vtr->upd_marker;
+		if (tmp_vtr->vol_type == UBI_DYNAMIC_VOLUME)
+			vol_tbl[i].vol_type = UBI_VID_DYNAMIC;
+		else
+			vol_tbl[i].vol_type = UBI_VID_STATIC;
+		vol_tbl[i].name_len = cpu_to_ubi16((uint16_t)tmp_vtr->name_len);
+
+		memcpy(&vol_tbl[i].name, tmp_vtr->name, tmp_vtr->name_len);
+		vol_tbl[i].name[tmp_vtr->name_len] = '\0';
+
+		crc = crc32(UBI_CRC32_INIT, &vol_tbl[i],
+			    UBI_VTBL_RECORD_SIZE_CRC);
+		vol_tbl[i].crc = cpu_to_ubi32(crc);
+	}
+
+	/* Update both volume table copies */
+	for (i = 0; i < UBI_LAYOUT_VOLUME_EBS; i++) {
+		cond_resched();
+
+		err = ubi_eba_erase_leb(ubi, UBI_LAYOUT_VOL_ID, i);
+		if (unlikely(err))
+			goto out_unlock;
+
+		err = ubi_wl_flush(ubi);
+		if (unlikely(err))
+			goto out_unlock;
+
+		err = ubi_eba_write_leb(ubi, UBI_LAYOUT_VOL_ID, i, vol_tbl, 0,
+					vtbl->vt_size, UBI_DATA_LONGTERM);
+		if (unlikely(err))
+			goto out_unlock;
+	}
+
+	/* Change the in-RAM volume table correspondingly */
+	ubi_kfree(vtbl->vt[vol_id].name);
+	if (vtr->reserved_pebs != 0) {
+		memcpy(&vtbl->vt[vol_id], vtr, sizeof(struct ubi_vtbl_vtr));
+		vtbl->vt[vol_id].name = strdup_len(vtr->name, vtr->name_len);
+		if (!vtbl->vt[vol_id].name) {
+			err = -ENOMEM;
+			goto out_unlock;
+		}
+		/* A dynamic volume is always treated as fully used */
+		if (vtr->vol_type == UBI_DYNAMIC_VOLUME)
+			fill_data_size_fields(ubi, &vtbl->vt[vol_id]);
+		if (unlikely(paranoid_check_vtr(ubi, &vtbl->vt[vol_id]))) {
+			err = -EINVAL;
+			goto out_unlock;
+		}
+	} else
+		memset(&vtbl->vt[vol_id], 0, sizeof(struct ubi_vtbl_vtr));
+
+	mutex_unlock(&vtbl->mutex);
+	ubi_kfree(vol_tbl);
+	return 0;
+
+out_unlock:
+	mutex_unlock(&vtbl->mutex);
+
+	/*
+	 * The volume table is probably in an inconsistent state now, so switch
+	 * to read-only mode.
+	 */
+	ubi_eba_ro_mode(ubi);
+	ubi_kfree(vol_tbl);
+	return err;
+}
+
+/**
+ * fill_data_size_fields - fills data size-related fields in a volume table
+ * record.
+ *
+ * @ubi: the UBI device description object
+ * @vtr: a pointer to the volume table record to fill
+ *
+ * This function initializes the @vtr->usable_leb_size, @vtr->used_ebs,
+ * @vtr->last_eb_bytes and (for dynamic volumes) @vtr->used_bytes fields using
+ * the @vtr->vol_type, @vtr->data_pad and @vtr->reserved_pebs fields. For
+ * static volumes, @vtr->used_bytes must be set by the caller.
+ */
+static void fill_data_size_fields(const struct ubi_info *ubi,
+				  struct ubi_vtbl_vtr *vtr)
+{
+	const struct ubi_io_info *io = ubi->io;
+
+	vtr->usable_leb_size = io->leb_size - vtr->data_pad;
+
+	if (vtr->vol_type == UBI_DYNAMIC_VOLUME) {
+		vtr->used_ebs = vtr->reserved_pebs;
+		vtr->last_eb_bytes = vtr->usable_leb_size;
+		vtr->used_bytes = (long long)vtr->used_ebs *
+				  vtr->usable_leb_size;
+	} else {
+		uint64_t tmp = vtr->used_bytes;
+
+		vtr->last_eb_bytes = do_div(tmp, vtr->usable_leb_size);
+		vtr->used_ebs = tmp;
+		if (vtr->last_eb_bytes)
+			vtr->used_ebs += 1;
+		else
+			vtr->last_eb_bytes = vtr->usable_leb_size;
+	}
+}
+
+static int create_vtbl(const struct ubi_info *ubi, struct ubi_scan_info *si,
+		       int copy, void *vol_tbl);
+
+/**
+ * create_empty_lvol - create an empty layout volume.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * If during scanning it was found out that the flash device is empty, this
+ * function is called to create an empty layout volume.
+ *
+ * This function returns the volume table contents in case of success and an
+ * error pointer (see ERR_PTR()) in case of failure.
+ */
+static struct ubi_vol_tbl_record *create_empty_lvol(const struct ubi_info *ubi,
+						    struct ubi_scan_info *si)
+{
+	int i, err;
+	struct ubi_vol_tbl_record *vol_tbl;
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	vol_tbl = ubi_kmalloc(vtbl->vt_size);
+	if (!vol_tbl)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < vtbl->vt_slots; i++)
+		memcpy(&vol_tbl[i], &vtbl->empty_rec, UBI_VTBL_RECORD_SIZE);
+
+	for (i = 0; i < UBI_LAYOUT_VOLUME_EBS; i++) {
+		cond_resched();
+
+		err = create_vtbl(ubi, si, i, vol_tbl);
+		if (unlikely(err))
+			goto out_free;
+	}
+
+	return vol_tbl;
+
+out_free:
+	ubi_kfree(vol_tbl);
+	return ERR_PTR(err);
+}
+
+static int vol_tbl_check(const struct ubi_info *ubi,
+			 const struct ubi_vol_tbl_record *vol_tbl);
+
+/**
+ * process_lvol - process the layout volume.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @sv: scanning information about the layout volume
+ *
+ * This function is responsible for reading the layout volume, ensuring it is
+ * not corrupted, and recovering from corruptions if needed.
+ *
+ * This function returns the volume table in case of success and an error
+ * pointer (see ERR_PTR()) in case of failure.
+ */
+static struct ubi_vol_tbl_record *process_lvol(const struct ubi_info *ubi,
+					       struct ubi_scan_info *si,
+					       struct ubi_scan_volume *sv)
+{
+	int err;
+	struct rb_node *rb;
+	struct ubi_scan_leb *seb;
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	struct ubi_vol_tbl_record *leb[UBI_LAYOUT_VOLUME_EBS] = { NULL, NULL };
+	int leb_corrupted[UBI_LAYOUT_VOLUME_EBS] = {1, 1};
+
+	/*
+	 * UBI goes through the following steps when it updates the layout
+	 * volume:
+	 * a. erase LEB 0;
+	 * b. write new data to LEB 0;
+	 * c. erase LEB 1;
+	 * d. write new data to LEB 1.
+	 * Before being updated, LEBs 0 and 1 contain the same data.
+	 *
+	 * An unclean reboot may destroy the contents of LEB 0, but then LEB 1
+	 * still holds the old table, so it is normal for LEB 0 to be
+	 * corrupted while LEB 1 is OK. Likewise, an unclean reboot may
+	 * corrupt LEB 1, but then LEB 0 already holds the new table. Finally,
+	 * an unclean reboot may leave both LEBs uncorrupted but different;
+	 * in this case, LEB 0 contains the more recent information.
+	 *
+	 * So the plan is to first check LEB 0. Then
+	 * a. if LEB 0 is OK, it contains the most recent data; we compare its
+	 * contents with LEB 1, and if they are different, we copy LEB 0 to
+	 * LEB 1.
+	 * b. if LEB 0 is corrupted, but LEB 1 is OK, we copy LEB 1 to LEB 0.
+	 */
+
+	dbg_vtbl("check the layout volume");
+
+	/* Read both LEB 0 and LEB 1 into RAM */
+	rb_for_each_entry(rb, seb, &sv->root, u.rb) {
+		cond_resched();
+
+		leb[seb->lnum] = ubi_kzalloc(vtbl->vt_size);
+		if (!leb[seb->lnum]) {
+			err = -ENOMEM;
+			goto out_free;
+		}
+
+		err = ubi_io_read_data(ubi, leb[seb->lnum], seb->pnum, 0,
+				      vtbl->vt_size);
+		if (err == UBI_IO_BITFLIPS || err == -EBADMSG)
+			/* Scrub the PEB later */
+			seb->scrub = 1;
+		else if (err)
+			goto out_free;
+	}
+
+	if (leb[0])
+		leb_corrupted[0] = vol_tbl_check(ubi, leb[0]);
+
+	if (leb_corrupted[0] == 0) {
+		/* LEB 0 is OK */
+
+		if (leb[1])
+			leb_corrupted[1] = memcmp(leb[0], leb[1],
+						  vtbl->vt_size);
+		if (leb_corrupted[1]) {
+			ubi_warn("the volume table copy #2 is corrupted");
+			err = create_vtbl(ubi, si, 1, leb[0]);
+			if (err)
+				goto out_free;
+		}
+
+		/* Both LEB 0 and LEB 1 are OK and consistent */
+		ubi_kfree(leb[1]);
+		return leb[0];
+	} else {
+		/* LEB 0 is corrupted or does not exist */
+		if (leb[1])
+			leb_corrupted[1] = vol_tbl_check(ubi, leb[1]);
+		if (leb_corrupted[1]) {
+			/*
+			 * Both LEB 0 and LEB 1 are corrupted. We don't try to
+			 * restore them and let user-space tools do this.
+			 */
+			ubi_err("the layout volume is corrupted");
+			err = -EINVAL;
+			goto out_free;
+		}
+
+		ubi_warn("the volume table copy #1 is corrupted");
+		err = create_vtbl(ubi, si, 0, leb[1]);
+		if (err)
+			goto out_free;
+
+		ubi_kfree(leb[0]);
+		return leb[1];
+	}
+
+out_free:
+	ubi_kfree(leb[0]);
+	ubi_kfree(leb[1]);
+	return ERR_PTR(err);
+}
+
+/**
+ * create_vtbl - create a copy of the volume table.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @copy: the number of the volume table copy
+ * @vol_tbl: the contents of the volume table
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int create_vtbl(const struct ubi_info *ubi, struct ubi_scan_info *si,
+		       int copy, void *vol_tbl)
+{
+	int err, tries = 0, pnum, ec;
+	unsigned int leb_ver;
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	struct ubi_vid_hdr *vid_hdr;
+	struct ubi_scan_volume *sv;
+	struct ubi_scan_leb *new_seb, *old_seb = NULL;
+
+	ubi_msg("create volume table (copy #%d)", copy + 1);
+
+	vid_hdr = ubi_zalloc_vid_hdr(ubi);
+	if (!vid_hdr)
+		return -ENOMEM;
+
+	/*
+	 * First check whether a logical eraseblock that holds this volume
+	 * table copy was found during scanning. If so, it has to be wiped.
+	 */
+	sv = ubi_scan_find_sv(si, UBI_LAYOUT_VOL_ID);
+	if (sv)
+		old_seb = ubi_scan_find_seb(sv, copy);
+
+retry:
+	new_seb = ubi_scan_get_free_peb(ubi, si);
+	if (IS_ERR(new_seb)) {
+		err = PTR_ERR(new_seb);
+		goto out_free;
+	}
+	pnum = new_seb->pnum;
+	ec = new_seb->ec;
+	ubi_free_scan_leb(new_seb);
+
+	vid_hdr->vol_type = UBI_VID_DYNAMIC;
+	vid_hdr->vol_id = cpu_to_ubi32(UBI_LAYOUT_VOL_ID);
+	vid_hdr->compat = UBI_LAYOUT_VOLUME_COMPAT;
+	vid_hdr->data_size = vid_hdr->used_ebs =
+			     vid_hdr->data_pad = cpu_to_ubi32(0);
+	vid_hdr->lnum = cpu_to_ubi32(copy);
+	leb_ver = old_seb ? old_seb->leb_ver + 1 : 0;
+	vid_hdr->leb_ver = cpu_to_ubi32(leb_ver);
+
+	/* The EC header is already there, write the VID header */
+	err = ubi_io_write_vid_hdr(ubi, pnum, vid_hdr);
+	if (err)
+		goto write_error;
+
+	/* Write the layout volume contents */
+	err = ubi_io_write_data(ubi, vol_tbl, pnum, 0, vtbl->vt_size);
+	if (err)
+		goto write_error;
+
+	/*
+	 * And add it to the scanning information. Don't delete the old
+	 * @old_seb as it will be deleted and freed in 'ubi_scan_add_peb()'.
+	 */
+	err = ubi_scan_add_peb(ubi, si, pnum, ec, vid_hdr, 0);
+	if (err)
+		goto out_free;
+
+out_free:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	return err;
+
+write_error:
+	/* Maybe this physical eraseblock went bad, try to pick another one */
+	if (++tries <= 5) {
+		err = ubi_scan_add_corr_peb(si, pnum, ec);
+		if (err)
+			goto out_free;
+		goto retry;
+	}
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	return err;
+}
+
+/**
+ * init_ivols - initialize internal volumes information.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function initializes information about internal UBI volumes. This
+ * information is not stored on flash but instead, is kept only in RAM.
+ */
+static void init_ivols(struct ubi_info *ubi)
+{
+	struct ubi_vtbl_vtr *vtr;
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	const struct ubi_io_info *io = ubi->io;
+
+	/* The layout volume */
+	vtr = &vtbl->ivol_vtrs[0];
+	vtr->reserved_pebs = UBI_LAYOUT_VOLUME_EBS;
+	vtr->alignment = 1;
+	vtr->vol_type = UBI_DYNAMIC_VOLUME;
+	vtr->name_len = sizeof(UBI_LAYOUT_VOLUME_NAME) - 1;
+	vtr->name = UBI_LAYOUT_VOLUME_NAME;
+	vtr->usable_leb_size = io->leb_size;
+	vtr->used_ebs = vtr->reserved_pebs;
+	vtr->last_eb_bytes = vtr->usable_leb_size;
+	vtr->used_bytes = vtr->used_ebs * (io->leb_size - vtr->data_pad);
+}
+
+/**
+ * get_ivol_vtr - get volume table record of an internal volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the requested volume ID
+ *
+ * This function returns a pointer to the volume table record. The @vol_id must
+ * be correct.
+ */
+static const struct ubi_vtbl_vtr *get_ivol_vtr(const struct ubi_info *ubi,
+					       int vol_id)
+{
+	ubi_assert(ubi_is_ivol(vol_id));
+	return &ubi->vtbl->ivol_vtrs[vol_id - UBI_INTERNAL_VOL_START];
+}
+
+/**
+ * init_ram_vt - initialize the in-RAM copy of the volume table.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ * @vol_tbl: the volume table
+ *
+ * This function builds the in-RAM volume table representation. Note, the
+ * in-RAM volume table contains more data than the on-flash volume table. This
+ * function returns zero in case of success and a negative error code in case
+ * of failure.
+ */
+static int init_ram_vt(const struct ubi_info *ubi,
+		       const struct ubi_scan_info *si,
+		       const struct ubi_vol_tbl_record *vol_tbl)
+{
+	int i;
+	struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	vtbl->vt = ubi_kzalloc(vtbl->vt_slots * sizeof(struct ubi_vtbl_vtr));
+	if (!vtbl->vt)
+		return -ENOMEM;
+
+	for (i = 0; i < vtbl->vt_slots; i++) {
+		struct ubi_vtbl_vtr *vtr = &vtbl->vt[i];
+		struct ubi_scan_volume *sv;
+		int name_len;
+		char *name;
+
+		cond_resched();
+
+		vtr->reserved_pebs = ubi32_to_cpu(vol_tbl[i].reserved_pebs);
+
+		/* Skip empty records */
+		if (vtr->reserved_pebs == 0)
+			continue;
+
+		vtr->alignment = ubi32_to_cpu(vol_tbl[i].alignment);
+		vtr->data_pad = ubi32_to_cpu(vol_tbl[i].data_pad);
+		vtr->vol_type = vol_tbl[i].vol_type == UBI_VID_DYNAMIC ?
+		    UBI_DYNAMIC_VOLUME : UBI_STATIC_VOLUME;
+		name_len = ubi16_to_cpu(vol_tbl[i].name_len);
+		vtr->name_len = name_len;
+		vtr->usable_leb_size = ubi->io->leb_size - vtr->data_pad;
+
+		vtr->name = ubi_kmalloc(name_len + 1);
+		if (unlikely(!vtr->name)) {
+			free_volume_info(ubi);
+			return -ENOMEM;
+		}
+
+		name = (char *)vtr->name;
+		memcpy(name, vol_tbl[i].name, name_len + 1);
+		name[name_len] = '\0';
+
+		/* Initialize the RAM-only fields */
+
+		/*
+		 * For a dynamic volume, UBI does not know how much data is
+		 * stored there, so assume the whole volume is used.
+		 */
+		if (vtr->vol_type == UBI_DYNAMIC_VOLUME) {
+			vtr->used_ebs = vtr->reserved_pebs;
+			vtr->last_eb_bytes = vtr->usable_leb_size;
+			vtr->used_bytes = (long long)vtr->used_ebs *
+					  vtr->usable_leb_size;
+			continue;
+		}
+
+		/* Static volumes only */
+		sv = ubi_scan_find_sv(si, i);
+		if (!sv)
+			/*
+			 * No eraseblocks belonging to this volume found. We
+			 * don't actually know whether this static volume is
+			 * completely corrupted or just contains no data. And
+			 * we cannot know this as long as the data table is not
+			 * maintained on flash. So we just assume the volume is
+			 * empty.
+			 */
+			continue;
+
+		if (unlikely(sv->leb_count != sv->used_ebs)) {
+			/*
+			 * We found a static volume which is missing several
+			 * eraseblocks. Treat it as corrupted.
+			 */
+			ubi_warn("static volume %d is missing %d LEBs",
+				 sv->vol_id, sv->used_ebs - sv->leb_count);
+			vtr->corrupted = 1;
+			continue;
+		}
+
+		vtr->used_ebs = sv->used_ebs;
+		vtr->used_bytes = (long long)(vtr->used_ebs - 1) *
+				  vtr->usable_leb_size;
+		vtr->used_bytes += sv->last_data_size;
+		vtr->last_eb_bytes = sv->last_data_size;
+	}
+
+	return 0;
+}
+
+/**
+ * free_volume_info - free the in-RAM copy of the volume table.
+ *
+ * @ubi: the UBI device description object
+ */
+static void free_volume_info(const struct ubi_info *ubi)
+{
+	int i;
+
+	for (i = 0; i < ubi->vtbl->vt_slots; i++)
+		ubi_kfree(ubi->vtbl->vt[i].name);
+
+	ubi_kfree(ubi->vtbl->vt);
+}
+
+/**
+ * vol_tbl_check - check if the volume table is not corrupted and contains sane
+ * data.
+ *
+ * @ubi: the UBI device description object
+ * @vol_tbl: the volume table
+ *
+ * This function returns zero if the volume table is all right and %-EINVAL if
+ * not.
+ */
+static int vol_tbl_check(const struct ubi_info *ubi,
+			 const struct ubi_vol_tbl_record *vol_tbl)
+{
+	int i, reserved_pebs, alignment, data_pad, vol_type, name_len;
+	int upd_marker;
+	const char *name;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	const struct ubi_io_info *io = ubi->io;
+
+	for (i = 0; i < vtbl->vt_slots; i++) {
+		int n;
+		uint32_t crc;
+
+		cond_resched();
+
+		reserved_pebs = ubi32_to_cpu(vol_tbl[i].reserved_pebs);
+		alignment = ubi32_to_cpu(vol_tbl[i].alignment);
+		data_pad = ubi32_to_cpu(vol_tbl[i].data_pad);
+		upd_marker = vol_tbl[i].upd_marker;
+		vol_type = vol_tbl[i].vol_type;
+		name_len = ubi16_to_cpu(vol_tbl[i].name_len);
+		name = &vol_tbl[i].name[0];
+
+		crc = crc32(UBI_CRC32_INIT, &vol_tbl[i],
+			    UBI_VTBL_RECORD_SIZE_CRC);
+
+		if (unlikely(ubi32_to_cpu(vol_tbl[i].crc) != crc)) {
+			ubi_err("wrong CRC at record %d: %#08x, not %#08x",
+				 i, ubi32_to_cpu(vol_tbl[i].crc), crc);
+			return -EINVAL;
+		}
+
+		if (reserved_pebs == 0) {
+			int is_zero;
+
+			is_zero = ubi_buf_all_zeroes(&vol_tbl[i],
+						     UBI_VTBL_RECORD_SIZE_CRC);
+			if (unlikely(is_zero == 0)) {
+				dbg_err("empty record contains non-zero data");
+				goto bad;
+			}
+
+			continue;
+		}
+
+		if (unlikely(reserved_pebs < 0 || alignment < 0 ||
+			     data_pad < 0 || name_len < 0)) {
+			dbg_err("negative values");
+			goto bad;
+		}
+
+		if (unlikely(alignment > io->leb_size)) {
+			dbg_err("too large alignment");
+			goto bad;
+		}
+
+		if (unlikely(alignment == 0)) {
+			dbg_err("zero alignment");
+			goto bad;
+		}
+
+		n = alignment % io->min_io_size;
+		if (alignment != 1 && unlikely(n)) {
+			dbg_err("alignment is not multiple of min I/O unit "
+				 "size");
+			goto bad;
+		}
+
+		n = io->leb_size % alignment;
+		if (unlikely(data_pad != n)) {
+			dbg_err("bad data_pad, has to be %d", n);
+			goto bad;
+		}
+
+		if (unlikely(vol_type != UBI_VID_DYNAMIC &&
+			     vol_type != UBI_VID_STATIC)) {
+			dbg_err("bad vol_type");
+			goto bad;
+		}
+
+		if (unlikely(upd_marker != 0 && upd_marker != 1)) {
+			dbg_err("bad upd_marker");
+			goto bad;
+		}
+
+		if (unlikely(reserved_pebs > io->good_peb_count)) {
+			dbg_err("too large reserved_pebs");
+			goto bad;
+		}
+
+		if (unlikely(name_len > UBI_VOL_NAME_MAX)) {
+			dbg_err("too long volume name, max is %d",
+				UBI_VOL_NAME_MAX);
+			goto bad;
+		}
+
+		if (unlikely(name[0] == '\0')) {
+			dbg_err("empty volume name");
+			goto bad;
+		}
+
+		n = strnlen(name, name_len + 1);
+		if (unlikely(name_len != n)) {
+			dbg_err("bad name_len");
+			goto bad;
+		}
+	}
+
+	return 0;
+
+bad:
+	ubi_err("volume table check failed");
+	dbg_err("volume record %d dump:", i);
+	ubi_dbg_dump_vol_tbl_record(&vol_tbl[i]);
+	return -EINVAL;
+}
+
+static int check_sv(const struct ubi_info *ubi,
+		    const struct ubi_scan_volume *sv,
+		    const struct ubi_vtbl_vtr *vtr);
+
+/**
+ * check_scanning_info - check that scanning information is consistent with the
+ * information from the volume table.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * Even though we protect on-flash data by CRC checksums, we still don't trust
+ * the media. Who knows what users are trying to feed us.
+ *
+ * This function returns zero if the scanning information is sane and %-EINVAL
+ * if it is not.
+ */
+static int check_scanning_info(const struct ubi_info *ubi,
+			       struct ubi_scan_info *si)
+{
+	int err, i;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_scan_volume *sv;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+
+	for (i = 0; i < vtbl->vt_slots; i++) {
+		cond_resched();
+
+		vtr = &vtbl->vt[i];
+		sv = ubi_scan_find_sv(si, i);
+
+		if (vtr->reserved_pebs == 0) {
+			if (likely(!sv))
+				continue;
+
+			/*
+			 * The scanning unit has found a volume which does not
+			 * exist according to the information in the volume
+			 * table. This must have happened due to an unclean
+			 * reboot while the volume was being removed. Discard
+			 * these eraseblocks.
+			 */
+			dbg_vtbl("volume %d removal was interrupted, finish it",
+				 sv->vol_id);
+			ubi_scan_rm_volume(ubi, si, sv);
+			continue;
+		}
+
+		if (!sv)
+			continue;
+
+		err = check_sv(ubi, sv, vtr);
+		if (unlikely(err))
+			goto out;
+	}
+
+	/* Check that scanning information about internal UBI volumes is sane */
+	for (i = 0; i < UBI_INT_VOL_COUNT; i++) {
+		cond_resched();
+
+		vtr = get_ivol_vtr(ubi, i + UBI_INTERNAL_VOL_START);
+		ubi_assert(!IS_ERR(vtr));
+
+		sv = ubi_scan_find_sv(si, i + UBI_INTERNAL_VOL_START);
+
+		/*
+		 * If an internal volume was not found, the corresponding
+		 * UBI unit will handle this.
+		 */
+		if (!sv)
+			continue;
+
+		err = check_sv(ubi, sv, vtr);
+		if (unlikely(err))
+			goto out;
+	}
+
+	return 0;
+
+out:
+	return -EINVAL;
+}
+
+/**
+ * check_sv - check sanity of scanning information about a volume.
+ *
+ * @ubi: the UBI device description object
+ * @sv: volume scanning information
+ * @vtr: corresponding volume table record (supposed to be correct)
+ *
+ * This function returns zero if the volume scanning information is sane, and
+ * %-EINVAL if not.
+ */
+static int check_sv(const struct ubi_info *ubi,
+		    const struct ubi_scan_volume *sv,
+		    const struct ubi_vtbl_vtr *vtr)
+{
+	if (unlikely(sv->highest_lnum >= vtr->reserved_pebs)) {
+		dbg_err("bad highest_lnum");
+		goto bad;
+	}
+
+	if (unlikely(sv->leb_count > vtr->reserved_pebs)) {
+		dbg_err("bad leb_count");
+		goto bad;
+	}
+
+	if (unlikely(sv->vol_type != vtr->vol_type)) {
+		dbg_err("bad vol_type");
+		goto bad;
+	}
+
+	if (unlikely(sv->used_ebs > vtr->reserved_pebs)) {
+		dbg_err("bad used_ebs");
+		goto bad;
+	}
+
+	if (unlikely(sv->data_pad != vtr->data_pad)) {
+		dbg_err("bad data_pad");
+		goto bad;
+	}
+
+	return 0;
+
+bad:
+	ubi_err("scanning information is not consistent with the volume table");
+	ubi_dbg_dump_sv(sv);
+	ubi_dbg_dump_vtr(vtr);
+	return -EINVAL;
+}
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_VTBL
+
+/**
+ * paranoid_check_vtr - check a &struct ubi_vtbl_vtr object.
+ *
+ * @ubi: the UBI device description object
+ * @vtr: the object pointer to check
+ *
+ * This function returns zero if the volume table record is sane, and %1 if
+ * not.
+ */
+static int paranoid_check_vtr(const struct ubi_info *ubi,
+			      const struct ubi_vtbl_vtr *vtr)
+{
+	long long n;
+	const struct ubi_io_info *io = ubi->io;
+
+	if (vtr->reserved_pebs == 0)
+		return 0;
+
+	if (unlikely(vtr->reserved_pebs < 0 || vtr->alignment < 0 ||
+		     vtr->data_pad < 0 || vtr->name_len < 0)) {
+		ubi_err("negative values");
+		goto fail;
+	}
+
+	if (unlikely(vtr->alignment > io->leb_size)) {
+		ubi_err("too large alignment %d", vtr->alignment);
+		goto fail;
+	}
+
+	if (unlikely(vtr->alignment == 0)) {
+		ubi_err("zero alignment");
+		goto fail;
+	}
+
+	n = vtr->alignment % io->min_io_size;
+	if (vtr->alignment != 1 && unlikely(n)) {
+		ubi_err("alignment %d is not multiple of min I/O unit size",
+			vtr->alignment);
+		goto fail;
+	}
+
+	n = io->leb_size % vtr->alignment;
+	if (unlikely(vtr->data_pad != n)) {
+		ubi_err("bad data_pad %d, has to be %lld", vtr->data_pad, n);
+		goto fail;
+	}
+
+	if (unlikely(vtr->vol_type != UBI_DYNAMIC_VOLUME &&
+		     vtr->vol_type != UBI_STATIC_VOLUME)) {
+		ubi_err("bad vol_type %d", vtr->vol_type);
+		goto fail;
+	}
+
+	if (unlikely(vtr->upd_marker != 0 && vtr->upd_marker != 1)) {
+		ubi_err("bad upd_marker %d", vtr->upd_marker);
+		goto fail;
+	}
+
+	if (unlikely(vtr->reserved_pebs > io->good_peb_count)) {
+		ubi_err("too large reserved_pebs %d", vtr->reserved_pebs);
+		goto fail;
+	}
+
+	if (unlikely(vtr->usable_leb_size != io->leb_size - vtr->data_pad)) {
+		ubi_err("bad usable_leb_size %d, has to be %d",
+			vtr->usable_leb_size, io->leb_size - vtr->data_pad);
+		goto fail;
+	}
+
+	if (unlikely(vtr->name_len > UBI_VOL_NAME_MAX)) {
+		ubi_err("too long volume name %d, max is %d",
+			vtr->name_len, UBI_VOL_NAME_MAX);
+		goto fail;
+	}
+
+	if (unlikely(!vtr->name)) {
+		ubi_err("NULL volume name");
+		goto fail;
+	}
+
+	n = strnlen(vtr->name, vtr->name_len + 1);
+	if (unlikely(n != vtr->name_len)) {
+		ubi_err("bad name_len %d", vtr->name_len);
+		goto fail;
+	}
+
+	/* Check RAM-only fields */
+	n = (long long)vtr->used_ebs * vtr->usable_leb_size;
+	if (vtr->vol_type == UBI_DYNAMIC_VOLUME) {
+		if (unlikely(vtr->corrupted != 0)) {
+			ubi_err("bad corrupted");
+			goto fail;
+		}
+
+		if (unlikely(vtr->used_ebs != vtr->reserved_pebs)) {
+			ubi_err("bad used_ebs");
+			goto fail;
+		}
+
+		if (unlikely(vtr->last_eb_bytes != vtr->usable_leb_size)) {
+			ubi_err("bad last_eb_bytes");
+			goto fail;
+		}
+
+		if (unlikely(vtr->used_bytes != n)) {
+			ubi_err("bad used_bytes");
+			goto fail;
+		}
+	} else {
+		if (unlikely(vtr->corrupted != 0 && vtr->corrupted != 1)) {
+			ubi_err("bad corrupted");
+			goto fail;
+		}
+
+		if (unlikely(vtr->used_ebs < 0 ||
+			     vtr->used_ebs > vtr->reserved_pebs)) {
+			ubi_err("bad used_ebs");
+			goto fail;
+		}
+
+		if (unlikely(vtr->last_eb_bytes < 0 ||
+			     vtr->last_eb_bytes > vtr->usable_leb_size)) {
+			ubi_err("bad last_eb_bytes");
+			goto fail;
+		}
+
+		if (unlikely(vtr->used_bytes < 0 || vtr->used_bytes > n ||
+			     vtr->used_bytes < n - vtr->usable_leb_size)) {
+			ubi_err("bad used_bytes");
+			goto fail;
+		}
+	}
+
+	return 0;
+
+fail:
+	ubi_err("paranoid check failed");
+	ubi_dbg_dump_vtr(vtr);
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+#endif /* CONFIG_MTD_UBI_DEBUG_PARANOID_VTBL */
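
The data_pad and usable_leb_size checks in paranoid_check_vtr() follow from
simple arithmetic on the LEB size and the volume alignment. Below is a minimal
userspace sketch of that relation. It is not UBI code: struct demo_vtr, both
demo_* functions and the LEB size used in the note are made up for
illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the few volume table fields used below. */
struct demo_vtr {
	int alignment;
	int data_pad;
	int usable_leb_size;
};

/* Fill in the derived fields the way the paranoid check expects them. */
void demo_fill_vtr(struct demo_vtr *vtr, int leb_size, int alignment)
{
	vtr->alignment = alignment;
	/* Bytes at the end of each LEB which cannot hold aligned data */
	vtr->data_pad = leb_size % alignment;
	vtr->usable_leb_size = leb_size - vtr->data_pad;
}

/* Return 0 if the derived fields are consistent and 1 if not. */
int demo_check_vtr(const struct demo_vtr *vtr, int leb_size)
{
	if (vtr->alignment <= 0 || vtr->alignment > leb_size)
		return 1;
	if (vtr->data_pad != leb_size % vtr->alignment)
		return 1;
	if (vtr->usable_leb_size != leb_size - vtr->data_pad)
		return 1;
	return 0;
}
```

With an assumed LEB size of 126976 bytes, an alignment of 4096 leaves no
padding, while an alignment of 5000 leaves 1976 padding bytes and 125000
usable bytes per LEB.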

* [PATCH 21/44 take 2] [UBI] background thread unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (19 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 20/44 take 2] [UBI] volume table unit implementation Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 22/44 take 2] [UBI] background thread unit implementation Artem Bityutskiy
                   ` (25 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/background.h tmp-to/drivers/mtd/ubi/background.h
--- tmp-from/drivers/mtd/ubi/background.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/background.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,177 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Authors: Thomas Gleixner, Artem B. Bityutskiy
+ */
+
+/*
+ * UBI background thread unit.
+ *
+ * This unit maintains a per-UBI device thread which is supposed to do
+ * different background works. It is mostly used by the WL unit to perform
+ * eraseblock erasure and movement, but may also be used for other works.
+ */
+
+#ifndef __UBI_BACKGROUND_H__
+#define __UBI_BACKGROUND_H__
+
+#include <linux/sched.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/completion.h>
+
+struct ubi_info;
+struct ubi_bgt_work;
+
+/**
+ * ubi_bgt_schedule - schedule a work.
+ *
+ * @ubi: the UBI device description object
+ * @wrk: the work to schedule
+ *
+ * This function enqueues a work defined by @wrk to the tail of the pending
+ * works list. Returns zero in case of success and %-ENODEV if the background
+ * thread was killed.
+ */
+int ubi_bgt_schedule(const struct ubi_info *ubi, struct ubi_bgt_work *wrk);
+
+/**
+ * ubi_bgt_reschedule - re-schedule a work.
+ *
+ * @ubi: the UBI device description object
+ * @wrk: the work to re-schedule.
+ *
+ * This function enqueues a work defined by @wrk to the tail of the pending
+ * works list. Returns zero in case of success and %-ENODEV if the background
+ * thread was killed.
+ */
+int ubi_bgt_reschedule(const struct ubi_info *ubi, struct ubi_bgt_work *wrk);
+
+/**
+ * ubi_bgt_do_work - do one pending work.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_bgt_do_work(const struct ubi_info *ubi);
+
+/**
+ * ubi_bgt_enable - enable the background thread.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function enables the background thread for UBI device defined by @ubi.
+ * Returns zero in case of success and %-ENODEV if the background thread was
+ * killed.
+ */
+int ubi_bgt_enable(const struct ubi_info *ubi);
+
+/**
+ * ubi_bgt_disable - disable the background thread.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_bgt_disable(const struct ubi_info *ubi);
+
+/**
+ * ubi_bgt_kill_thread - kill the background thread.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function kills the background thread for UBI device defined by @ubi.
+ * This function also makes sure all the pending tasks are done.
+ */
+void ubi_bgt_kill_thread(const struct ubi_info *ubi);
+
+/**
+ * ubi_bgt_init - initialize the background thread unit for a UBI device.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_bgt_init(struct ubi_info *ubi);
+
+/**
+ * ubi_bgt_close - close the background thread unit for a UBI device.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_bgt_close(struct ubi_info *ubi);
+
+/**
+ * ubi_bgt_worker_t - background worker function prototype.
+ *
+ * @ubi: the UBI device description object
+ * @wrk: the work object pointer
+ * @cancel: non-zero if the work has to be canceled
+ *
+ * The @cancel argument is not zero when the background thread is being killed
+ * and just wants the worker to free everything associated with this work
+ * (@wrk).
+ */
+typedef int ubi_bgt_worker_t(const struct ubi_info *ubi,
+			     struct ubi_bgt_work *wrk, int cancel);
+
+/**
+ * struct ubi_bgt_work - a background work.
+ *
+ * @list: a link in the list of pending works
+ * @func: the worker function
+ * @priv: private data of the worker function
+ *
+ * To schedule a background work users have to construct a
+ * &struct ubi_bgt_work object, initialize the @func and @priv fields and call
+ * 'ubi_bgt_schedule()'.
+ */
+struct ubi_bgt_work {
+	struct list_head list;
+	ubi_bgt_worker_t *func;
+	void *priv;
+};
+
+/**
+ * struct ubi_bgt_info - UBI background thread unit description data structure.
+ *
+ * @pending_works: the list of pending works
+ * @active_work: the work which is currently running
+ * @pending_works_count: count of pending works
+ * @lock: protects the @pending_works, @active_work, @enabled, and @task fields
+ * @enabled: if the background thread is enabled
+ * @task: a pointer to the &struct task_struct of the background thread
+ * @bgt_name: the background thread name
+ * @thread_start: used to synchronize with starting of the background thread
+ * @thread_stop: used to synchronize with killing of the background thread
+ * @wrk_mutex: serializes execution of background works
+ */
+struct ubi_bgt_info {
+	struct list_head pending_works;   /* private */
+	struct ubi_bgt_work *active_work; /* private */
+	int pending_works_count;          /* public  */
+	spinlock_t lock;                  /* private */
+	int enabled;                      /* public  */
+	struct task_struct *task;         /* private */
+	char *bgt_name;                   /* public  */
+	struct completion thread_start;   /* private */
+	struct completion thread_stop;    /* private */
+	struct mutex wrk_mutex;           /* private */
+};
+
+#endif /* !__UBI_BACKGROUND_H__ */
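
The interface above boils down to a FIFO of work objects, each carrying a
worker function which is called either to do the work or, with a non-zero
cancel argument, only to release it. Below is a minimal single-threaded
userspace sketch of that idea. It is not the UBI implementation: all demo_*
names are hypothetical, a plain singly linked list replaces struct list_head,
and there is no locking and no thread.

```c
#include <assert.h>
#include <stddef.h>

struct demo_work;

/* Worker prototype: @cancel is non-zero when the work must only clean up. */
typedef int demo_worker_t(struct demo_work *wrk, int cancel);

struct demo_work {
	struct demo_work *next;	/* link in the pending list */
	demo_worker_t *func;	/* the worker function */
	void *priv;		/* private data of the worker function */
};

struct demo_queue {
	struct demo_work *head;
	struct demo_work *tail;
	int pending_count;
};

/* Add a work to the tail of the pending list. */
void demo_schedule(struct demo_queue *q, struct demo_work *wrk)
{
	wrk->next = NULL;
	if (q->tail)
		q->tail->next = wrk;
	else
		q->head = wrk;
	q->tail = wrk;
	q->pending_count += 1;
}

/* Run the oldest pending work; return 0 if there is nothing to do. */
int demo_do_work(struct demo_queue *q, int cancel)
{
	struct demo_work *wrk = q->head;

	if (!wrk)
		return 0;
	q->head = wrk->next;
	if (!q->head)
		q->tail = NULL;
	q->pending_count -= 1;
	/* Do not touch @wrk after this call, the worker owns it. */
	return wrk->func(wrk, cancel);
}

/* Example worker: counts real runs and cancellations in @priv. */
int demo_count_worker(struct demo_work *wrk, int cancel)
{
	int *counters = wrk->priv;

	counters[cancel ? 1 : 0] += 1;
	return 0;
}
```

A caller would fill in the func and priv fields of a work object, pass it to
demo_schedule() and later drain the queue with demo_do_work(), using a
non-zero cancel argument when the queue is being torn down.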

* [PATCH 22/44 take 2] [UBI] background thread unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (20 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 21/44 take 2] [UBI] background thread unit header Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-19 11:09   ` Christoph Hellwig
  2007-02-17 16:56 ` [PATCH 23/44 take 2] [UBI] wear-leveling unit header Artem Bityutskiy
                   ` (24 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/background.c tmp-to/drivers/mtd/ubi/background.c
--- tmp-from/drivers/mtd/ubi/background.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/background.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,352 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Authors: Thomas Gleixner, Artem B. Bityutskiy
+ */
+
+#include <linux/string.h>
+#include <linux/sched.h>
+#include <linux/completion.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/freezer.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "debug.h"
+#include "background.h"
+#include "eba.h"
+
+/* Background thread name pattern */
+#define BGT_NAME_PATTERN "ubi_bgt%dd"
+
+/* Highest number of pending works */
+#define BGT_MAX_PENDING_WORKS 0x7FFFFFFF
+
+/*
+ * Maximum number of consecutive background thread failures which is enough to
+ * disable the thread.
+ */
+#define BGT_MAX_FAILURES 32
+
+int ubi_bgt_schedule(const struct ubi_info *ubi, struct ubi_bgt_work *wrk)
+{
+	int err = 0;
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+retry:
+	spin_lock(&bgt->lock);
+	dbg_bgt("%s: schedule work %p (func %p, priv %p)",
+		bgt->bgt_name, wrk, wrk->func, wrk->priv);
+
+	if (unlikely(!bgt->task)) {
+		ubi_err("task \"%s\" was killed", bgt->bgt_name);
+		spin_unlock(&bgt->lock);
+		return -ENODEV;
+	}
+
+	if (unlikely(bgt->pending_works_count == BGT_MAX_PENDING_WORKS)) {
+		/* Too many pending works */
+		spin_unlock(&bgt->lock);
+		dbg_bgt("pending queue is too long, do a work now");
+		err = ubi_bgt_do_work(ubi);
+		if (unlikely(err)) {
+			/* The lock is already released here */
+			return err;
+		}
+
+		cond_resched();
+		goto retry;
+	}
+
+	list_add_tail(&wrk->list, &bgt->pending_works);
+	bgt->pending_works_count += 1;
+
+	if (!bgt->active_work && likely(bgt->enabled))
+		wake_up_process(bgt->task);
+	spin_unlock(&bgt->lock);
+	return 0;
+}
+
+int ubi_bgt_reschedule(const struct ubi_info *ubi, struct ubi_bgt_work *wrk)
+{
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	spin_lock(&bgt->lock);
+	dbg_bgt("%s: re-schedule work %p (func %p, priv %p)",
+		bgt->bgt_name, wrk, wrk->func, wrk->priv);
+
+	if (unlikely(!bgt->task)) {
+		ubi_err("task \"%s\" was killed", bgt->bgt_name);
+		spin_unlock(&bgt->lock);
+		return -ENODEV;
+	}
+
+	list_add_tail(&wrk->list, &bgt->pending_works);
+	bgt->pending_works_count += 1;
+
+	if (!bgt->active_work && likely(bgt->enabled))
+		wake_up_process(bgt->task);
+	spin_unlock(&bgt->lock);
+	return 0;
+}
+
+int ubi_bgt_enable(const struct ubi_info *ubi)
+{
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	spin_lock(&bgt->lock);
+	dbg_bgt("enable \"%s\"", bgt->bgt_name);
+
+	if (!bgt->task) {
+		ubi_err("task \"%s\" was killed", bgt->bgt_name);
+		spin_unlock(&bgt->lock);
+		return -ENODEV;
+	}
+
+	bgt->enabled = 1;
+	wake_up_process(bgt->task);
+	spin_unlock(&bgt->lock);
+	return 0;
+}
+
+void ubi_bgt_disable(const struct ubi_info *ubi)
+{
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	spin_lock(&bgt->lock);
+	dbg_bgt("disable \"%s\"", bgt->bgt_name);
+	bgt->enabled = 0;
+	spin_unlock(&bgt->lock);
+}
+
+void ubi_bgt_kill_thread(const struct ubi_info *ubi)
+{
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	dbg_bgt("kill \"%s\"", bgt->bgt_name);
+	if (bgt->task) {
+		send_sig(SIGKILL, bgt->task, 1);
+		wait_for_completion(&bgt->thread_stop);
+	}
+}
+
+int ubi_bgt_do_work(const struct ubi_info *ubi)
+{
+	int err;
+	struct ubi_bgt_work *wrk;
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	mutex_lock(&bgt->wrk_mutex);
+
+	spin_lock(&bgt->lock);
+
+	if (unlikely(bgt->pending_works_count == 0)) {
+		err = 0;
+		goto out;
+	}
+
+	bgt->active_work = wrk = list_entry(bgt->pending_works.next,
+					    struct ubi_bgt_work, list);
+	list_del(&wrk->list);
+	bgt->pending_works_count -= 1;
+	ubi_assert(bgt->pending_works_count >= 0);
+	spin_unlock(&bgt->lock);
+
+	/*
+	 * Call the worker function. Do not touch the work structure
+	 * after this call as it will have been freed or reused by that
+	 * time by the worker function.
+	 */
+	dbg_bgt("%s: do work %p (func %p, priv %p)",
+		bgt->bgt_name, wrk, wrk->func, wrk->priv);
+
+	err = wrk->func(ubi, wrk, 0);
+	if (unlikely(err))
+		ubi_err("a work failed with error code %d", err);
+
+	spin_lock(&bgt->lock);
+	bgt->active_work = NULL;
+out:
+	spin_unlock(&bgt->lock);
+	mutex_unlock(&bgt->wrk_mutex);
+	return err;
+}
+
+/**
+ * ubi_thread - UBI background thread.
+ *
+ * @u: the UBI device description object pointer
+ */
+static int ubi_thread(void *u)
+{
+	int failures = 0;
+	const struct ubi_info *ubi = u;
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	daemonize(bgt->bgt_name);
+	allow_signal(SIGKILL);
+	allow_signal(SIGSTOP);
+
+	ubi_msg("background thread \"%s\" started, PID %d",
+		bgt->bgt_name, current->pid);
+
+	bgt->task = current;
+	complete(&bgt->thread_start);
+	set_current_state(TASK_INTERRUPTIBLE);
+	schedule();
+
+	for (;;) {
+		cond_resched();
+
+		if (unlikely(!bgt->enabled) ||
+			     list_empty(&bgt->pending_works)) {
+			set_current_state(TASK_INTERRUPTIBLE);
+			schedule();
+		}
+
+		if (try_to_freeze())
+			continue;
+
+		while (signal_pending(current)) {
+			siginfo_t info;
+			unsigned long nr;
+
+			nr = dequeue_signal_lock(current, &current->blocked,
+						 &info);
+			if (nr == SIGKILL)
+				goto out;
+			if (nr == SIGSTOP) {
+				bgt->enabled = !bgt->enabled;
+				ubi_msg("%s the background thread",
+					bgt->enabled ? "enable" : "disable");
+			}
+		}
+
+		spin_lock(&bgt->lock);
+		while (bgt->pending_works_count > 0 && likely(bgt->enabled)) {
+			int err;
+
+			ubi_assert(!list_empty(&bgt->pending_works));
+			spin_unlock(&bgt->lock);
+
+			cond_resched();
+
+			err = ubi_bgt_do_work(ubi);
+			if (unlikely(err)) {
+				ubi_err("%s: work failed with error code %d",
+					bgt->bgt_name, err);
+				if (++failures >= BGT_MAX_FAILURES) {
+					/*
+					 * Too many failures, disable the
+					 * thread and switch to read-only mode.
+					 */
+					ubi_msg("%d consecutive failures, "
+						"disable the background thread",
+						BGT_MAX_FAILURES);
+					ubi_bgt_disable(ubi);
+					ubi_eba_ro_mode(ubi);
+				}
+			} else {
+				failures = 0;
+			}
+
+			spin_lock(&bgt->lock);
+		}
+		spin_unlock(&bgt->lock);
+	}
+
+out:
+	dbg_bgt("killing background thread \"%s\"", bgt->bgt_name);
+
+	/* Cancel all pending works before exiting */
+	spin_lock(&bgt->lock);
+	bgt->task = NULL;
+
+	bgt->enabled = 0;
+	while (!list_empty(&bgt->pending_works)) {
+		struct ubi_bgt_work *wrk;
+
+		wrk = list_entry(bgt->pending_works.next, struct ubi_bgt_work,
+				 list);
+		list_del(&wrk->list);
+		bgt->pending_works_count -= 1;
+		spin_unlock(&bgt->lock);
+		wrk->func(ubi, wrk, 1);
+		spin_lock(&bgt->lock);
+	}
+	spin_unlock(&bgt->lock);
+
+	complete_and_exit(&bgt->thread_stop, 0);
+}
+
+int ubi_bgt_init(struct ubi_info *ubi)
+{
+	int err;
+	pid_t pid;
+	struct ubi_bgt_info *bgt;
+
+	dbg_bgt("initialize the UBI background thread unit");
+
+	bgt = ubi_kzalloc(sizeof(struct ubi_bgt_info));
+	if (!bgt)
+		return -ENOMEM;
+	ubi->bgt = bgt;
+
+	init_completion(&bgt->thread_start);
+	init_completion(&bgt->thread_stop);
+	INIT_LIST_HEAD(&bgt->pending_works);
+	spin_lock_init(&bgt->lock);
+	mutex_init(&bgt->wrk_mutex);
+
+	bgt->bgt_name = ubi_kmalloc(sizeof(BGT_NAME_PATTERN) + 20);
+	if (!bgt->bgt_name) {
+		err = -ENOMEM;
+		goto out_bgt;
+	}
+	sprintf(bgt->bgt_name, BGT_NAME_PATTERN, ubi->ubi_num);
+
+	pid = kernel_thread(ubi_thread, ubi, CLONE_FS | CLONE_FILES);
+	if (pid < 0) {
+		err = pid;
+		ubi_err("cannot spawn \"%s\", error %d", bgt->bgt_name, err);
+		goto out_name;
+	}
+
+	wait_for_completion(&bgt->thread_start);
+	dbg_bgt("the UBI background thread unit is initialized");
+	return 0;
+
+out_name:
+	ubi_kfree(bgt->bgt_name);
+out_bgt:
+	ubi_kfree(bgt);
+	return err;
+}
+
+void ubi_bgt_close(struct ubi_info *ubi)
+{
+	struct ubi_bgt_info *bgt = ubi->bgt;
+
+	dbg_bgt("close the UBI background thread unit");
+
+	ubi_assert(!bgt->enabled);
+	ubi_assert(bgt->pending_works_count == 0);
+	ubi_assert(list_empty(&bgt->pending_works));
+
+	ubi_kfree(bgt->bgt_name);
+	ubi_kfree(bgt);
+}
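
The BGT_MAX_FAILURES comment describes a limit on consecutive failures, which
implies that a successful work resets the count. Below is a minimal userspace
sketch of that accounting, under the assumption that this is the intended
behaviour. The demo_* names and the limit of 3 are hypothetical and are not
taken from the patch.

```c
#include <assert.h>

#define DEMO_MAX_FAILURES 3	/* hypothetical, the patch uses 32 */

struct demo_bgt {
	int enabled;
	int failures;	/* length of the current run of failures */
};

/* Account for one work result; disable the thread after too many failures. */
void demo_account(struct demo_bgt *bgt, int err)
{
	if (!err) {
		/* A success ends the run of consecutive failures */
		bgt->failures = 0;
		return;
	}
	if (++bgt->failures >= DEMO_MAX_FAILURES)
		bgt->enabled = 0;
}
```

Two failures followed by a success leave the thread enabled with a zero
count, and only three failures in a row disable it.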

* [PATCH 23/44 take 2] [UBI] wear-leveling unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (21 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 22/44 take 2] [UBI] background thread unit implementation Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 24/44 take 2] [UBI] wear-leveling unit implementation Artem Bityutskiy
                   ` (23 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/wl.h tmp-to/drivers/mtd/ubi/wl.h
--- tmp-from/drivers/mtd/ubi/wl.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/wl.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,284 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Authors: Artem B. Bityutskiy, Thomas Gleixner
+ */
+
+/*
+ * UBI wear-leveling unit.
+ *
+ * This unit is responsible for wear-leveling. This unit works in terms of
+ * physical eraseblocks and erase counters and knows nothing about logical
+ * eraseblocks, volumes, etc (with one exception). From this unit's perspective
+ * all physical eraseblocks are of two types - used and free. Used physical
+ * eraseblocks are those that were obtained via the 'ubi_wl_get_peb()'
+ * function, and free physical eraseblocks are those that were returned via
+ * the 'ubi_wl_put_peb()' function. These two functions are the main interface
+ * of this unit.
+ *
+ * Physical eraseblocks returned by 'ubi_wl_get_peb()' have only the erase
+ * counter header. The rest of the physical eraseblock contains only 0xFF bytes.
+ *
+ * When physical eraseblocks are returned to the WL unit by means of the
+ * 'ubi_wl_put_peb()' function, they are scheduled for erasure. The erasure is
+ * not done synchronously. Instead, it is done in background in context of the
+ * per-UBI device background thread (see the background thread unit). Actually,
+ * the WL unit strongly depends on the background thread and cannot operate
+ * without it.
+ *
+ * The wear-leveling is ensured by means of moving the contents of used
+ * physical eraseblocks with low erase counter to free physical eraseblocks
+ * with high erase counter. The movement is also done in background.
+ *
+ * When eraseblocks are moved, the WL unit cooperates with the EBA unit to
+ * provide proper eraseblock locking. This means, the WL unit uses the EBA unit
+ * to lock the logical eraseblock corresponding to the physical eraseblock
+ * which is being moved. This is the only place where the WL unit "knows" about
+ * logical eraseblocks and volume identifier headers.
+ *
+ * The 'ubi_wl_get_peb()' function accepts data type hints which help to pick
+ * an "optimal" physical eraseblock. Indeed, for example, when it knows that
+ * the physical eraseblock will be "put" soon, it may pick a free physical
+ * eraseblock with low erase counter, and so forth.
+ *
+ * If the WL unit fails to erase a physical eraseblock, it marks the physical
+ * eraseblock as bad (using the bad eraseblock handling unit).
+ *
+ * This unit is also responsible for scrubbing. If a bit-flip is detected in a
+ * physical eraseblock, it has to be moved. Technically this is the same as
+ * moving it for wear-leveling reasons.
+ *
+ * As mentioned above, for the WL unit all physical eraseblocks are either
+ * "free" or "used". Free eraseblocks are kept in the @wl->free RB-tree, while
+ * used eraseblocks are kept in a set of different RB-trees: @wl->used,
+ * @wl->prot.pnum, @wl->prot.aec, and @wl->scrub.
+ *
+ * Note, in this implementation, we keep a small in-RAM object for each physical
+ * eraseblock. This is surely not a scalable solution. But it appears to be good
+ * enough for moderately large flashes and it is simple. In future, one may
+ * re-work this unit and make it more scalable.
+ */
+
+#ifndef __UBI_WL_H__
+#define __UBI_WL_H__
+
+#include <linux/rbtree.h>
+#include <linux/wait.h>
+#include <linux/sched.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+#include <linux/mtd/ubi.h>
+#include "background.h"
+
+struct ubi_info;
+struct ubi_scan_info;
+
+/**
+ * ubi_wl_get_peb - get a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @dtype: type of data which will be stored in this physical eraseblock
+ *
+ * This function returns a physical eraseblock number in case of success and a
+ * negative error code in case of failure. Might sleep.
+ */
+int ubi_wl_get_peb(const struct ubi_info *ubi, enum ubi_data_type dtype);
+
+/**
+ * ubi_wl_put_peb - return a physical eraseblock to the wear-leveling
+ * unit.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: physical eraseblock to return
+ * @torture: if this physical eraseblock has to be tortured
+ *
+ * If an error occurred during I/O to @pnum, and the caller suspects @pnum to be
+ * bad, it will be tested for badness if @torture flag is not zero. This function
+ * returns zero in case of success and a negative error code in case of
+ * failure. Might sleep.
+ */
+int ubi_wl_put_peb(const struct ubi_info *ubi, int pnum, int torture);
+
+/**
+ * ubi_wl_flush - flush all pending works.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_wl_flush(const struct ubi_info *ubi);
+
+/**
+ * ubi_wl_scrub_peb - schedule a physical eraseblock for scrubbing.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock to schedule
+ *
+ * If a bit-flip in a physical eraseblock is detected, this physical eraseblock
+ * needs scrubbing. This function schedules a physical eraseblock for
+ * scrubbing which is done in background. This function returns zero in case of
+ * success and a negative error code in case of failure.
+ */
+int ubi_wl_scrub_peb(const struct ubi_info *ubi, int pnum);
+
+/**
+ * ubi_wl_init_scan - initialize the wear-leveling unit using scanning
+ * information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success, and a negative error code in
+ * case of failure.
+ */
+int ubi_wl_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si);
+
+/**
+ * ubi_wl_close - close the wear-leveling unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_wl_close(struct ubi_info *ubi);
+
+/**
+ * struct ubi_wl_entry - a wear-leveling entry.
+ *
+ * @rb: link in the corresponding RB-tree
+ * @ec: erase counter
+ * @pnum: physical eraseblock number
+ *
+ * Each physical eraseblock has a corresponding &struct ubi_wl_entry object
+ * which may be kept in different RB-trees.
+ */
+struct ubi_wl_entry {
+	struct rb_node rb;
+	int ec;
+	int pnum;
+};
+
+/**
+ * struct ubi_wl_prot_entry - a protection entry.
+ *
+ * @rb_pnum: link in the @wl->prot.pnum RB-tree
+ * @rb_aec: link in the @wl->prot.aec RB-tree
+ * @abs_ec: the absolute erase counter value when the protection ends
+ * @e: the wear-leveling entry of the physical eraseblock under protection
+ *
+ * When the WL unit returns a physical eraseblock, the physical eraseblock is
+ * protected from being moved for some "time". For this reason, the physical
+ * eraseblock is not directly moved from the @wl->free tree to the @wl->used
+ * tree. There is one more tree in between where this physical eraseblock is
+ * temporarily stored (@wl->prot).
+ *
+ * All this protection stuff is needed because:
+ *  o we don't want to move physical eraseblocks just after we have given them
+ *    to the user; instead, we first want to let users fill them up with data;
+ *
+ *  o there is a chance that the user will put the physical eraseblock very
+ *    soon, so it makes sense not to move it for some time, but wait; this is
+ *    especially important in case of "short term" physical eraseblocks.
+ *
+ * Physical eraseblocks stay protected only for limited time. But the "time" is
+ * measured in erase cycles in this case. This is implemented with help of the
+ * absolute erase counter (@wl->abs_ec). When it reaches certain value, the
+ * physical eraseblocks are moved from the protection trees (@wl->prot.*) to
+ * the @wl->used tree.
+ *
+ * Protected physical eraseblocks are searched by physical eraseblock number
+ * (when they are put) and by the absolute erase counter (to check if it is
+ * time to move them to the @wl->used tree). So there are actually 2 RB-trees
+ * storing the protected physical eraseblocks: @wl->prot.pnum and
+ * @wl->prot.aec. They are referred to as the "protection" trees. The
+ * first one is indexed by the physical eraseblock number. The second one is
+ * indexed by the absolute erase counter. Both trees store
+ * &struct ubi_wl_prot_entry objects.
+ */
+struct ubi_wl_prot_entry {
+	struct rb_node rb_pnum;
+	struct rb_node rb_aec;
+	unsigned long long abs_ec;
+	struct ubi_wl_entry *e;
+};
+
+/**
+ * struct ubi_wl_erase_work - physical eraseblock erasure work description data
+ * structure.
+ *
+ * @wrk: the background thread work descriptor
+ * @e: the physical eraseblock to erase
+ * @torture: if the physical eraseblock has to be tortured
+ *
+ * This data structure is used for erasure background works. The @torture flag
+ * indicates whether the physical eraseblock should be tested. Testing physical
+ * eraseblocks may be needed if an error occurred and they are likely to become
+ * bad.
+ */
+struct ubi_wl_erase_work {
+	struct ubi_bgt_work wrk;
+	struct ubi_wl_entry *e;
+	int torture;
+};
+
+/**
+ * struct ubi_wl_info - the UBI WL unit description data structure.
+ *
+ * @used: RB-tree of used physical eraseblocks
+ * @free: RB-tree of free physical eraseblocks
+ * @scrub: RB-tree of physical eraseblocks which need scrubbing
+ * @prot: embraces protection trees
+ * @prot.pnum: the protection tree indexed by physical eraseblock numbers
+ * @prot.aec: the protection tree indexed by the absolute erase counter
+ * @lock: protects the @used, @free, @prot, @lookuptbl, @abs_ec, @move,
+ * @wl_scheduled, and @erase_pending fields
+ * @wl_scheduled: non-zero if the wear leveling was scheduled
+ * @lookuptbl: a table to quickly find a &struct ubi_wl_entry object for any
+ * physical eraseblock
+ * @erase_pending: how many physical eraseblocks are waiting for erasure
+ * @abs_ec: the absolute erase counter
+ * @move: if a physical eraseblock is being moved, it is referred to here
+ * @max_ec: current highest erase counter value
+ *
+ * Each physical eraseblock has 2 main states: free and used. The former state
+ * corresponds to the @free RB-tree. The latter state is split into several
+ * sub-states:
+ * o the WL movement is allowed (@used RB-tree);
+ * o the WL movement is temporarily prohibited (@prot.pnum and @prot.aec
+ * RB-trees);
+ * o scrubbing is needed (@scrub RB-tree).
+ *
+ * Depending on the sub-state, wear-leveling entries of the used physical
+ * eraseblocks may be kept in one of those trees.
+ */
+struct ubi_wl_info {
+	struct rb_root used;             /* private */
+	struct rb_root free;             /* private */
+	struct rb_root scrub;            /* private */
+	struct {
+		struct rb_root pnum;     /* private */
+		struct rb_root aec;      /* private */
+	} prot;
+	spinlock_t lock;                 /* private */
+	int wl_scheduled;                /* private */
+	struct ubi_wl_entry **lookuptbl; /* private */
+	int erase_pending;               /* private */
+	unsigned long long abs_ec;       /* public  */
+	struct ubi_wl_entry *move;       /* private */
+	int max_ec;                      /* public  */
+};
+
+#endif /* __UBI_WL_H__ */
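
The protection scheme described for &struct ubi_wl_prot_entry measures "time"
in erase cycles: an entry records the absolute erase counter value at which
the protection ends. Below is a minimal userspace sketch of that bookkeeping.
It is not the UBI implementation: the demo_* names are hypothetical, the
RB-trees are left out, and treating the protection as expired once the counter
reaches (rather than exceeds) the recorded value is an assumption.

```c
#include <assert.h>

/* Hypothetical protection entry: @abs_ec is when the protection ends. */
struct demo_prot_entry {
	unsigned long long abs_ec;
	int pnum;
};

/* Protect @pnum for @cycles erase cycles starting from @now. */
void demo_protect(struct demo_prot_entry *pe, int pnum,
		  unsigned long long now, unsigned int cycles)
{
	pe->pnum = pnum;
	pe->abs_ec = now + cycles;
}

/* Non-zero once the absolute erase counter has reached the end value. */
int demo_protection_expired(const struct demo_prot_entry *pe,
			    unsigned long long now)
{
	return now >= pe->abs_ec;
}
```

For example, an eraseblock handed out when the absolute erase counter is 100
and protected for 16 cycles stays protected up to and including counter value
115 and becomes movable at 116.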

* [PATCH 24/44 take 2] [UBI] wear-leveling unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (22 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 23/44 take 2] [UBI] wear-leveling unit header Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 25/44 take 2] [UBI] EBA unit header Artem Bityutskiy
                   ` (22 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

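The selection policy used by this unit can be tried out in user space. The following is a hedged sketch, not code from the patch: a sorted array stands in for an in-order walk of the free RB-tree, the sketch_* names are hypothetical, and the threshold value is only an illustrative placeholder for the Kconfig option. It shows the bounded pick (the most worn free eraseblock whose erase counter is still below the lowest one plus a maximum difference) and the threshold check that decides whether moving data is worthwhile.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; the patch takes the threshold from Kconfig. */
#define SKETCH_WL_THRESHOLD	4096
#define SKETCH_FREE_MAX_DIFF	(2 * SKETCH_WL_THRESHOLD)

/*
 * Return the index of the entry with the greatest erase counter that is
 * still below ec[0] + max_diff. @ec must be non-empty and sorted in
 * ascending order. If no entry qualifies, the first entry is returned.
 */
static size_t sketch_find_entry(const int *ec, size_t n, int max_diff)
{
	size_t lo = 0, hi = n, best = 0;
	int limit;

	assert(n > 0);
	limit = ec[0] + max_diff;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ec[mid] >= limit)
			hi = mid;		/* too worn, look left */
		else {
			best = mid;		/* candidate, look right */
			lo = mid + 1;
		}
	}
	return best;
}

/* Non-zero if the erase counters differ enough to justify moving data. */
static int sketch_wl_needed(int min_used_ec, int picked_free_ec, int threshold)
{
	return picked_free_ec - min_used_ec >= threshold;
}
```

With free erase counters {10, 20, 30, 5000, 9000} and SKETCH_FREE_MAX_DIFF, the pick is the entry with counter 5000, because 9000 lies beyond the bound of 10 + 8192.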
diff -auNrp tmp-from/drivers/mtd/ubi/wl.c tmp-to/drivers/mtd/ubi/wl.c
--- tmp-from/drivers/mtd/ubi/wl.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/wl.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,1684 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Authors: Artem B. Bityutskiy, Thomas Gleixner
+ */
+
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/wait.h>
+#include <linux/crc32.h>
+#include <linux/sched.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "wl.h"
+#include "badeb.h"
+#include "io.h"
+#include "account.h"
+#include "eba.h"
+#include "background.h"
+#include "scan.h"
+#include "misc.h"
+#include "debug.h"
+
+/* Number of physical eraseblocks reserved for wear-leveling purposes */
+#define WL_RESERVED_PEBS 1
+
+/*
+ * For how many erase cycles short term, unknown, and long term physical
+ * eraseblocks are protected.
+ */
+#define ST_PROTECTION 16
+#define U_PROTECTION  10
+#define LT_PROTECTION 4
+
+/*
+ * Maximum difference between two erase counters. If this threshold is
+ * exceeded, the WL unit starts moving data from used physical eraseblocks with
+ * low erase counter to free physical eraseblocks with high erase counter.
+ */
+#define UBI_WL_THRESHOLD CONFIG_MTD_UBI_WL_THRESHOLD
+
+/*
+ * When a physical eraseblock is moved, the WL unit has to pick the target
+ * physical eraseblock to move to. The simplest way would be just to pick the
+ * one with the highest erase counter. But under certain workloads this could
+ * lead to unbounded wear of one or a few physical eraseblocks. Indeed, imagine
+ * a situation when the picked physical eraseblock is constantly erased after
+ * the data is written to it. So, we have a constant which limits the highest
+ * erase counter of the free physical eraseblock to pick. Namely, the WL unit
+ * does not pick eraseblocks with erase counter greater than the lowest erase
+ * counter plus %WL_FREE_MAX_DIFF.
+ */
+#define WL_FREE_MAX_DIFF (2*UBI_WL_THRESHOLD)
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_WL
+static int paranoid_check_ec(const struct ubi_info *ubi, int pnum, int ec);
+static int paranoid_check_in_wl_tree(struct ubi_wl_entry *e, struct rb_root *root);
+#else
+#define paranoid_check_ec(ubi, pnum, ec) 0
+#define paranoid_check_in_wl_tree(e, root)
+#endif
+
+/**
+ * tree_empty - a helper function to check if an RB-tree is empty.
+ *
+ * @root: the root of the tree
+ *
+ * This function returns non-zero if the tree is empty and zero if not.
+ */
+static inline int tree_empty(struct rb_root *root)
+{
+	return root->rb_node == NULL;
+}
+
+static void wl_tree_add(struct ubi_wl_entry *e, struct rb_root *root);
+
+/* Functions to add and delete wear-leveling entries from different trees */
+
+static inline void free_tree_add(struct ubi_wl_info *wl,
+				 struct ubi_wl_entry *e)
+{
+	wl_tree_add(e, &wl->free);
+}
+static inline void used_tree_add(struct ubi_wl_info *wl,
+				 struct ubi_wl_entry *e)
+{
+	wl_tree_add(e, &wl->used);
+}
+static inline void scrub_tree_add(struct ubi_wl_info *wl,
+				  struct ubi_wl_entry *e)
+{
+	wl_tree_add(e, &wl->scrub);
+}
+
+static inline void free_tree_del(struct ubi_wl_info *wl,
+				 struct ubi_wl_entry *e)
+{
+	paranoid_check_in_wl_tree(e, &wl->free);
+	rb_erase(&e->rb, &wl->free);
+}
+static inline void used_tree_del(struct ubi_wl_info *wl,
+				 struct ubi_wl_entry *e)
+{
+	paranoid_check_in_wl_tree(e, &wl->used);
+	rb_erase(&e->rb, &wl->used);
+}
+static inline void scrub_tree_del(struct ubi_wl_info *wl,
+				  struct ubi_wl_entry *e)
+{
+	paranoid_check_in_wl_tree(e, &wl->scrub);
+	rb_erase(&e->rb, &wl->scrub);
+}
+
+static int produce_free(const struct ubi_info *ubi);
+static struct ubi_wl_entry *pick_long_term(struct ubi_wl_info *wl);
+static struct ubi_wl_entry *pick_unknown(struct ubi_wl_info *wl);
+static struct ubi_wl_entry *pick_short_term(struct ubi_wl_info *wl);
+static void prot_tree_add(struct ubi_wl_info *wl, struct ubi_wl_entry *e,
+			  struct ubi_wl_prot_entry *pe, int abs_ec);
+
+int ubi_wl_get_peb(const struct ubi_info *ubi, enum ubi_data_type dtype)
+{
+	int err, protect;
+	struct ubi_wl_entry *e;
+	struct ubi_wl_info *wl = ubi->wl;
+	struct ubi_wl_prot_entry *pe;
+
+	might_sleep();
+
+	/* Input arguments sanity check */
+	ubi_assert(dtype == UBI_DATA_LONGTERM || dtype == UBI_DATA_SHORTTERM ||
+		   dtype == UBI_DATA_UNKNOWN);
+
+	pe = ubi_alloc_wl_prot_entry();
+	if (unlikely(!pe))
+		return -ENOMEM;
+
+retry:
+	spin_lock(&wl->lock);
+	if (unlikely(tree_empty(&wl->free))) {
+		if (unlikely(wl->erase_pending == 0)) {
+			ubi_err("no free eraseblocks");
+			spin_unlock(&wl->lock);
+			ubi_free_wl_prot_entry(pe);
+			return -ENOSPC;
+		}
+		spin_unlock(&wl->lock);
+
+		err = produce_free(ubi);
+		if (unlikely(err < 0)) {
+			ubi_free_wl_prot_entry(pe);
+			return err;
+		}
+		goto retry;
+	}
+
+	switch (dtype) {
+	case UBI_DATA_LONGTERM:
+		e = pick_long_term(wl);
+		protect = LT_PROTECTION;
+		break;
+	case UBI_DATA_UNKNOWN:
+		e = pick_unknown(wl);
+		protect = U_PROTECTION;
+		break;
+	case UBI_DATA_SHORTTERM:
+		e = pick_short_term(wl);
+		protect = ST_PROTECTION;
+		break;
+	default:
+		protect = 0;
+		e = NULL;
+		BUG();
+	}
+
+	/*
+	 * Move the physical eraseblock to the protection trees where it will
+	 * be protected from being moved for some time.
+	 */
+	free_tree_del(wl, e);
+	prot_tree_add(wl, e, pe, protect);
+
+	dbg_wl("PEB %d EC %d, protection %d", e->pnum, e->ec, protect);
+	spin_unlock(&wl->lock);
+
+	return e->pnum;
+}
+
+static int in_wl_tree(struct ubi_wl_entry *e, struct rb_root *root);
+static int schedule_erase(const struct ubi_info *ubi, struct ubi_wl_entry *e,
+			  int torture);
+static void check_protection_over(struct ubi_wl_info *wl);
+static void prot_tree_del(struct ubi_wl_info *wl, int pnum);
+
+int ubi_wl_put_peb(const struct ubi_info *ubi, int pnum, int torture)
+{
+	int err;
+	struct ubi_wl_entry *e;
+	struct ubi_wl_info *wl = ubi->wl;
+
+	dbg_wl("PEB %d", pnum);
+	might_sleep();
+
+	/* Input arguments sanity check */
+	ubi_assert(pnum >= 0);
+	ubi_assert(pnum < ubi->io->peb_count);
+
+	spin_lock(&wl->lock);
+	ubi_assert(wl->erase_pending >= 0);
+	wl->erase_pending += 1;
+
+	e = wl->lookuptbl[pnum];
+	if (unlikely(e == wl->move)) {
+		/*
+		 * User is putting a physical eraseblock which was selected to
+		 * be moved. We cancel the movement by setting @wl->move to
+		 * %NULL. The wear-leveling worker has to notice this and
+		 * cancel.
+		 *
+		 * Note, the physical eraseblock was removed from the @wl->used
+		 * tree by the wear-leveling worker and is not in any tree now.
+		 */
+		dbg_wl("cancel PEB %d movement", pnum);
+		wl->move = NULL;
+	} else {
+		if (in_wl_tree(e, &wl->used))
+			used_tree_del(wl, e);
+		else if (unlikely(in_wl_tree(e, &wl->scrub)))
+			scrub_tree_del(wl, e);
+		else
+			prot_tree_del(wl, e->pnum);
+	}
+	spin_unlock(&wl->lock);
+
+	err = schedule_erase(ubi, e, torture);
+	if (unlikely(err)) {
+		spin_lock(&wl->lock);
+		wl->erase_pending -= 1;
+		used_tree_add(wl, e);
+		spin_unlock(&wl->lock);
+	}
+
+	return err;
+}
+
+static int ensure_wear_leveling(const struct ubi_info *ubi);
+
+int ubi_wl_scrub_peb(const struct ubi_info *ubi, int pnum)
+{
+	struct ubi_wl_entry *e;
+	struct ubi_wl_info *wl = ubi->wl;
+
+	dbg_wl("schedule PEB %d for scrubbing", pnum);
+
+	spin_lock(&wl->lock);
+	e = wl->lookuptbl[pnum];
+	if (e == wl->move || in_wl_tree(e, &wl->scrub)) {
+		spin_unlock(&wl->lock);
+		return 0;
+	}
+
+	if (in_wl_tree(e, &wl->used))
+		used_tree_del(wl, e);
+	else
+		prot_tree_del(wl, pnum);
+
+	scrub_tree_add(wl, e);
+	spin_unlock(&wl->lock);
+
+	/*
+	 * Technically scrubbing is the same as wear-leveling, so it is done
+	 * by the WL worker. Schedule it.
+	 */
+	return ensure_wear_leveling(ubi);
+}
+
+static int erase_worker(const struct ubi_info *ubi, struct ubi_bgt_work *wrk,
+			int cancel);
+
+int ubi_wl_flush(const struct ubi_info *ubi)
+{
+	int err, pending_count;
+
+	pending_count = ubi->bgt->pending_works_count;
+
+	dbg_wl("flush (%d pending works)", pending_count);
+
+	/*
+	 * Erase while the pending works queue is not empty, but not more than
+	 * the number of currently pending works.
+	 */
+	while (pending_count-- > 0) {
+		err = ubi_bgt_do_work(ubi);
+		if (unlikely(err))
+			return err;
+	}
+
+	return 0;
+}
+
+static void tree_destroy(struct rb_root *root);
+
+int ubi_wl_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si)
+{
+	int err;
+	struct rb_node *rb1, *rb2;
+	struct ubi_scan_volume *sv;
+	struct ubi_scan_leb *seb, *tmp;
+	struct ubi_wl_entry *e;
+	struct ubi_wl_info *wl;
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_wl("initialize the UBI wear-leveling unit");
+
+	wl = ubi_kzalloc(sizeof(struct ubi_wl_info));
+	if (!wl)
+		return -ENOMEM;
+	ubi->wl = wl;
+
+	wl->used = wl->free = wl->scrub = RB_ROOT;
+	wl->prot.pnum = wl->prot.aec = RB_ROOT;
+	spin_lock_init(&wl->lock);
+	wl->max_ec = si->max_ec;
+
+	err = -ENOMEM;
+	wl->lookuptbl = ubi_kzalloc(io->peb_count * sizeof(void *));
+	if (!wl->lookuptbl)
+		goto out_free;
+
+	/*
+	 * The way we distinguish between an older LEB and a newer LEB is
+	 * based on the following principles:
+	 * 1 if we have LEB with versions A and B, A < B, then B is newer than
+	 *   A when abs(B - A) < %0x7FFFFFFF
+	 * 2 as the WL unit guarantees that the length of the pending works
+	 *   queue is shorter than %0x7FFFFFFF works, and the works are put at
+	 *   the tail of the queue and taken from its head, the above algorithm
+	 *   works correctly.
+	 *
+	 * Now we've got a list of eraseblocks to erase, and they are now
+	 * out-of-order, which does not satisfy the 2nd item, so we've got to
+	 * erase them now instead of deferring this.
+	 */
+	list_for_each_entry_safe(seb, tmp, &si->erase, u.list) {
+		cond_resched();
+
+		dbg_wl("erase PEB %d", seb->pnum);
+		err = ubi_scan_erase_peb(ubi, si, seb->pnum, seb->ec + 1);
+		if (unlikely(err)) {
+			if (err != -EIO && err != -EROFS)
+				goto out_free;
+			list_del(&seb->u.list);
+			list_add_tail(&seb->u.list, &si->corr);
+		} else {
+			seb->ec += 1;
+			list_del(&seb->u.list);
+			list_add_tail(&seb->u.list, &si->free);
+		}
+	}
+	err = -ENOMEM;
+	list_for_each_entry(seb, &si->free, u.list) {
+		cond_resched();
+
+		e = ubi_alloc_wl_entry();
+		if (unlikely(!e))
+			goto out_free;
+
+		e->pnum = seb->pnum;
+		e->ec = seb->ec;
+		ubi_assert(e->ec >= 0);
+		free_tree_add(wl, e);
+		wl->lookuptbl[e->pnum] = e;
+	}
+
+	list_for_each_entry(seb, &si->corr, u.list) {
+		cond_resched();
+
+		e = ubi_alloc_wl_entry();
+		if (unlikely(!e)) {
+			err = -ENOMEM;
+			goto out_free;
+		}
+
+		e->pnum = seb->pnum;
+		e->ec = seb->ec;
+		wl->lookuptbl[e->pnum] = e;
+		wl->erase_pending += 1;
+		err = schedule_erase(ubi, e, 0);
+		if (unlikely(err)) {
+			ubi_free_wl_entry(e);
+			goto out_free;
+		}
+	}
+	err = -ENOMEM;
+	rb_for_each_entry(rb1, sv, &si->volumes, rb) {
+		rb_for_each_entry(rb2, seb, &sv->root, u.rb) {
+			cond_resched();
+
+			e = ubi_alloc_wl_entry();
+			if (unlikely(!e))
+				goto out_free;
+
+			e->pnum = seb->pnum;
+			e->ec = seb->ec;
+			wl->lookuptbl[e->pnum] = e;
+			if (!seb->scrub) {
+				dbg_wl("add PEB %d EC %d to the used tree",
+				       e->pnum, e->ec);
+				used_tree_add(wl, e);
+			} else {
+				dbg_wl("add PEB %d EC %d to the scrub tree",
+				       e->pnum, e->ec);
+				scrub_tree_add(wl, e);
+			}
+		}
+	}
+
+	err = ubi_acc_reserve(ubi, WL_RESERVED_PEBS);
+	if (err)
+		goto out_free;
+
+	/* Schedule wear-leveling if needed */
+	err = ensure_wear_leveling(ubi);
+	if (err)
+		goto out_free;
+
+	return 0;
+
+out_free:
+	tree_destroy(&wl->used);
+	tree_destroy(&wl->free);
+	tree_destroy(&wl->scrub);
+	ubi_kfree(wl->lookuptbl);
+	ubi_kfree(wl);
+	return err;
+}
+
+static void protection_trees_destroy(struct ubi_wl_info *wl);
+
+void ubi_wl_close(struct ubi_info *ubi)
+{
+	struct ubi_wl_info *wl = ubi->wl;
+
+	dbg_wl("close the UBI wear-leveling unit");
+
+	protection_trees_destroy(wl);
+	tree_destroy(&wl->used);
+	tree_destroy(&wl->free);
+	tree_destroy(&wl->scrub);
+	ubi_kfree(wl->lookuptbl);
+	ubi_kfree(wl);
+}
+
+/**
+ * find_wl_entry - find a wl entry closest to certain erase counter.
+ *
+ * @root: the RB-tree where to look for
+ * @max: highest possible erase counter difference from the lowest one
+ *
+ * This function looks for the wear-leveling entry whose erase counter is the
+ * closest to, but less than, the lowest erase counter in @root plus @max.
+ */
+static struct ubi_wl_entry *find_wl_entry(struct rb_root *root, int max)
+{
+	struct rb_node *p;
+	struct ubi_wl_entry *e;
+
+	e = rb_entry(rb_first(root), struct ubi_wl_entry, rb);
+	max += e->ec;
+
+	p = root->rb_node;
+	while (p) {
+		struct ubi_wl_entry *e1;
+
+		e1 = rb_entry(p, struct ubi_wl_entry, rb);
+		if (e1->ec >= max)
+			p = p->rb_left;
+		else {
+			p = p->rb_right;
+			e = e1;
+		}
+	}
+
+	return e;
+}
+
+/**
+ * pick_long_term - select a "long-term" physical eraseblock.
+ *
+ * @wl: the wear-leveling unit description data structure
+ *
+ * This function returns the requested physical eraseblock. The wl->lock must
+ * be locked. The @wl->free tree must not be empty.
+ */
+static struct ubi_wl_entry *pick_long_term(struct ubi_wl_info *wl)
+{
+	struct ubi_wl_entry *e;
+
+	/*
+	 * For long term data we pick a physical eraseblock with high erase
+	 * counter. But the highest erase counter we can pick is bounded by
+	 * the lowest erase counter plus %WL_FREE_MAX_DIFF.
+	 */
+	e = find_wl_entry(&wl->free, WL_FREE_MAX_DIFF);
+	return e;
+}
+
+/**
+ * pick_unknown - select an "unknown" physical eraseblock.
+ *
+ * @wl: the wear-leveling unit description data structure
+ *
+ * This function returns the requested physical eraseblock. The wl->lock must
+ * be locked. The @wl->free tree must not be empty.
+ */
+static struct ubi_wl_entry *pick_unknown(struct ubi_wl_info *wl)
+{
+	int medium_ec;
+	struct rb_node *p;
+	struct ubi_wl_entry *first, *last, *e;
+
+	/*
+	 * For unknown data we are trying to pick a physical eraseblock with
+	 * medium erase counter. But we by no means can pick a physical
+	 * eraseblock with erase counter greater than or equal to the
+	 * lowest erase counter plus %WL_FREE_MAX_DIFF.
+	 */
+
+	first = rb_entry(rb_first(&wl->free), struct ubi_wl_entry, rb);
+	last = rb_entry(rb_last(&wl->free), struct ubi_wl_entry, rb);
+
+	if (last->ec - first->ec < WL_FREE_MAX_DIFF)
+		return rb_entry(wl->free.rb_node, struct ubi_wl_entry, rb);
+
+	medium_ec = (first->ec + WL_FREE_MAX_DIFF)/2;
+	e = first;
+
+	p = wl->free.rb_node;
+	while (p) {
+		struct ubi_wl_entry *e1;
+
+		e1 = rb_entry(p, struct ubi_wl_entry, rb);
+		if (e1->ec >= medium_ec)
+			p = p->rb_left;
+		else {
+			p = p->rb_right;
+			e = e1;
+		}
+	}
+
+	return e;
+}
+
+/**
+ * pick_short_term - select a "short term" physical eraseblock.
+ *
+ * @wl: the wear-leveling unit description data structure
+ *
+ * This function returns the requested physical eraseblock. The wl->lock must
+ * be locked. The @wl->free tree must not be empty.
+ */
+static struct ubi_wl_entry *pick_short_term(struct ubi_wl_info *wl)
+{
+	struct ubi_wl_entry *e;
+
+	/*
+	 * For short term data we pick a physical eraseblock with the lowest
+	 * erase counter as we expect it will be erased soon.
+	 */
+	e = rb_entry(rb_first(&wl->free), struct ubi_wl_entry, rb);
+	return e;
+}
+
+/**
+ * prot_tree_add - add a physical eraseblock to the protection trees.
+ *
+ * @wl: the wear-leveling unit description data structure
+ * @e: the physical eraseblock to add
+ * @pe: a protection entry object to use
+ * @abs_ec: the absolute erase counter value when this physical eraseblock has
+ * to be removed from the protection trees.
+ *
+ * @wl->lock has to be locked.
+ */
+static void prot_tree_add(struct ubi_wl_info *wl, struct ubi_wl_entry *e,
+			  struct ubi_wl_prot_entry *pe, int abs_ec)
+{
+	struct rb_node **p, *parent = NULL;
+	struct ubi_wl_prot_entry *pe1;
+
+	pe->e = e;
+	pe->abs_ec = wl->abs_ec + abs_ec;
+
+	p = &wl->prot.pnum.rb_node;
+	while (*p) {
+		parent = *p;
+		pe1 = rb_entry(parent, struct ubi_wl_prot_entry, rb_pnum);
+
+		if (e->pnum < pe1->e->pnum)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	rb_link_node(&pe->rb_pnum, parent, p);
+	rb_insert_color(&pe->rb_pnum, &wl->prot.pnum);
+
+	p = &wl->prot.aec.rb_node;
+	parent = NULL;
+	while (*p) {
+		parent = *p;
+		pe1 = rb_entry(parent, struct ubi_wl_prot_entry, rb_aec);
+
+		if (pe->abs_ec < pe1->abs_ec)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	rb_link_node(&pe->rb_aec, parent, p);
+	rb_insert_color(&pe->rb_aec, &wl->prot.aec);
+}
+
+/**
+ * check_protection_over - check if it is time to stop protecting some
+ * physical eraseblocks.
+ *
+ * @wl: the wear-leveling unit description data structure
+ *
+ * This function is called after each erase operation, when the absolute erase
+ * counter is incremented, to check if some physical eraseblocks no longer have
+ * to be protected. These physical eraseblocks are moved from the
+ * protection trees to the used tree.
+ */
+static void check_protection_over(struct ubi_wl_info *wl)
+{
+	struct ubi_wl_prot_entry *pe;
+
+	/*
+	 * There may be several protected physical eraseblocks to remove,
+	 * process them all.
+	 */
+	while (1) {
+		spin_lock(&wl->lock);
+		if (tree_empty(&wl->prot.aec)) {
+			spin_unlock(&wl->lock);
+			break;
+		}
+
+		pe = rb_entry(rb_first(&wl->prot.aec),
+			      struct ubi_wl_prot_entry, rb_aec);
+
+		if (pe->abs_ec > wl->abs_ec) {
+			spin_unlock(&wl->lock);
+			break;
+		}
+
+		dbg_wl("PEB %d protection over, abs_ec %llu, PEB abs_ec %llu",
+		       pe->e->pnum, wl->abs_ec, pe->abs_ec);
+		rb_erase(&pe->rb_aec, &wl->prot.aec);
+		rb_erase(&pe->rb_pnum, &wl->prot.pnum);
+		used_tree_add(wl, pe->e);
+		spin_unlock(&wl->lock);
+
+		ubi_free_wl_prot_entry(pe);
+		cond_resched();
+	}
+}
+
+/**
+ * prot_tree_del - remove a physical eraseblock from the protection trees
+ *
+ * @wl: the wear-leveling unit description data structure
+ * @pnum: the physical eraseblock number to remove
+ */
+static void prot_tree_del(struct ubi_wl_info *wl, int pnum)
+{
+	struct rb_node *p;
+	struct ubi_wl_prot_entry *pe = NULL;
+
+	p = wl->prot.pnum.rb_node;
+	while (p) {
+
+		pe = rb_entry(p, struct ubi_wl_prot_entry, rb_pnum);
+
+		if (pnum == pe->e->pnum)
+			break;
+
+		if (pnum < pe->e->pnum)
+			p = p->rb_left;
+		else
+			p = p->rb_right;
+	}
+
+	ubi_assert(pe->e->pnum == pnum);
+	rb_erase(&pe->rb_aec, &wl->prot.aec);
+	rb_erase(&pe->rb_pnum, &wl->prot.pnum);
+	ubi_free_wl_prot_entry(pe);
+}
+
+static int wear_leveling_worker(const struct ubi_info *ubi,
+				struct ubi_bgt_work *wrk, int cancel);
+
+/**
+ * ensure_wear_leveling - schedule wear-leveling if it is needed.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function checks if it is time to start wear-leveling and schedules it
+ * if yes. This function returns zero in case of success and a negative error
+ * code in case of failure.
+ */
+static int ensure_wear_leveling(const struct ubi_info *ubi)
+{
+	int err = 0;
+	struct ubi_wl_entry *e1;
+	struct ubi_wl_entry *e2;
+	struct ubi_bgt_work *wrk;
+	struct ubi_wl_info *wl = ubi->wl;
+
+	spin_lock(&wl->lock);
+	if (wl->wl_scheduled)
+		/* Wear-leveling is already in the work queue */
+		goto out_unlock;
+
+	/*
+	 * If the wl->scrub tree is not empty, scrubbing is needed, and the
+	 * WL worker has to be scheduled anyway.
+	 */
+	if (tree_empty(&wl->scrub)) {
+		if (tree_empty(&wl->used) || tree_empty(&wl->free))
+			/* No physical eraseblocks - no deal */
+			goto out_unlock;
+
+		/*
+		 * We schedule wear-leveling only if the difference between the
+		 * lowest erase counter of used physical eraseblocks and a high
+		 * erase counter of free physical eraseblocks is at least
+		 * %UBI_WL_THRESHOLD.
+		 */
+		e1 = rb_entry(rb_first(&wl->used), struct ubi_wl_entry, rb);
+		e2 = find_wl_entry(&wl->free, WL_FREE_MAX_DIFF);
+
+		if (!(e2->ec - e1->ec >= UBI_WL_THRESHOLD))
+			goto out_unlock;
+		dbg_wl("schedule wear-leveling");
+	} else
+		dbg_wl("schedule scrubbing");
+
+	wl->wl_scheduled = 1;
+	spin_unlock(&wl->lock);
+
+	wrk = ubi_alloc_bgt_work();
+	if (unlikely(!wrk)) {
+		err = -ENOMEM;
+		goto out_cancel;
+	}
+
+	wrk->func = &wear_leveling_worker;
+	err = ubi_bgt_schedule(ubi, wrk);
+	if (unlikely(err)) {
+		/*
+		 * The background thread was killed, don't clear the
+		 * @wl->wl_scheduled flag to prevent this error from happening
+		 * again and again. And switch to read-only mode.
+		 */
+		ubi_free_bgt_work(wrk);
+		ubi_eba_ro_mode(ubi);
+	}
+	return err;
+
+out_unlock:
+	spin_unlock(&wl->lock);
+	return err;
+
+out_cancel:
+	spin_lock(&wl->lock);
+	wl->wl_scheduled = 0;
+	spin_unlock(&wl->lock);
+	return err;
+}
+
+/**
+ * schedule_erase - schedule an erase work.
+ *
+ * @ubi: the UBI device description object
+ * @e: the WL entry of the physical eraseblock to erase
+ * @torture: if the physical eraseblock has to be tortured
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ *
+ * Note: @wl->erase_pending must be incremented before this function is called.
+ */
+static int schedule_erase(const struct ubi_info *ubi, struct ubi_wl_entry *e,
+			  int torture)
+{
+	int err;
+	struct ubi_wl_erase_work *wl_wrk;
+
+	dbg_wl("schedule erasure of PEB %d, EC %d, torture %d",
+	       e->pnum, e->ec, torture);
+
+	wl_wrk = ubi_alloc_wl_erase_work();
+	if (unlikely(!wl_wrk))
+		return -ENOMEM;
+
+	wl_wrk->wrk.func = &erase_worker;
+	wl_wrk->wrk.priv = wl_wrk;
+	wl_wrk->e = e;
+	wl_wrk->torture = torture;
+
+	err = ubi_bgt_schedule(ubi, &wl_wrk->wrk);
+	if (unlikely(err)) {
+		/*
+		 * The background thread was killed, but we really need it. We
+		 * can only work in read-only mode without it.
+		 */
+		ubi_free_wl_erase_work(wl_wrk);
+		ubi_eba_ro_mode(ubi);
+	}
+	return err;
+}
+
+static int sync_erase(const struct ubi_info *ubi, struct ubi_wl_entry *e,
+		      int torture);
+
+/**
+ * erase_worker - physical eraseblock erase worker function.
+ *
+ * @ubi: the UBI device description object
+ * @wrk: the work object
+ * @cancel: non-zero if the worker has to free memory and exit
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure. This function also takes care of marking the physical
+ * eraseblock bad if it cannot be erased.
+ */
+static int erase_worker(const struct ubi_info *ubi, struct ubi_bgt_work *wrk,
+			int cancel)
+{
+	int err;
+	struct ubi_wl_info *wl = ubi->wl;
+	struct ubi_wl_erase_work *wl_wrk = wrk->priv;
+	struct ubi_wl_entry *e = wl_wrk->e;
+	int pnum = e->pnum;
+
+	if (unlikely(cancel)) {
+		dbg_wl("cancel erasure of PEB %d EC %d", pnum, e->ec);
+		ubi_free_wl_erase_work(wl_wrk);
+		ubi_free_wl_entry(e);
+		return 0;
+	}
+
+	dbg_wl("erase PEB %d EC %d", pnum, e->ec);
+
+	err = sync_erase(ubi, e, wl_wrk->torture);
+	if (likely(!err)) {
+		/* Fine, we've erased it successfully */
+		ubi_free_wl_erase_work(wl_wrk);
+
+		spin_lock(&wl->lock);
+		wl->erase_pending -= 1;
+		ubi_assert(wl->erase_pending >= 0);
+		wl->abs_ec += 1;
+		free_tree_add(wl, e);
+		spin_unlock(&wl->lock);
+
+		/*
+		 * One more erase operation has happened, take care of protected
+		 * physical eraseblocks.
+		 */
+		check_protection_over(wl);
+
+		/* And take care of wear-leveling */
+		err = ensure_wear_leveling(ubi);
+		return err;
+	}
+
+	/*
+	 * Some error occurred during erasure. If this is something like
+	 * %-EINTR, we just re-schedule this physical eraseblock for
+	 * erasure.
+	 */
+
+	if (err == -EINTR || err == -EAGAIN || err == -ENOMEM ||
+	    err == -EBUSY) {
+		ubi_bgt_reschedule(ubi, wrk); /* Must not return error */
+		return err;
+	}
+
+	ubi_free_wl_erase_work(wl_wrk);
+	ubi_free_wl_entry(e);
+
+	spin_lock(&wl->lock);
+	wl->erase_pending -= 1;
+	spin_unlock(&wl->lock);
+
+	/*
+	 * If this is not %-EIO, we have no idea what to do. Scheduling
+	 * this physical eraseblock for erasure again will cause repeated
+	 * errors.
+	 */
+	if (err != -EIO) {
+		ubi_eba_ro_mode(ubi);
+		return err;
+	}
+
+	/* It is %-EIO, the PEB went bad */
+
+	if (!ubi->io->bad_allowed) {
+		ubi_err("flash device may be severely bad");
+		ubi_eba_ro_mode(ubi);
+		err = -EIO;
+	} else {
+		err = ubi_beb_mark_bad(ubi, pnum);
+		if (err)
+			ubi_eba_ro_mode(ubi);
+	}
+	return err;
+}
+
+/**
+ * wear_leveling_worker - wear-leveling worker function.
+ *
+ * @ubi: the UBI device description object
+ * @wrk: the work object
+ * @cancel: non-zero if the worker has to free memory and exit
+ *
+ * This function copies a less worn out physical eraseblock to a more worn out
+ * one. Returns zero in case of success and a negative error code in case of
+ * failure.
+ */
+static int wear_leveling_worker(const struct ubi_info *ubi,
+				struct ubi_bgt_work *wrk, int cancel)
+{
+	int err, vol_id, lnum, scrub = 0, data_size, aldata_size;
+	struct ubi_wl_entry *e1, *e2;
+	struct ubi_vid_hdr *vid_hdr;
+	void *buf;
+	uint32_t crc;
+	struct ubi_wl_info *wl = ubi->wl;
+	struct ubi_wl_prot_entry *pe;
+	const struct ubi_io_info *io = ubi->io;
+
+	ubi_free_bgt_work(wrk);
+
+	if (unlikely(cancel))
+		return 0;
+
+	spin_lock(&wl->lock);
+	if (tree_empty(&wl->free) ||
+	    (tree_empty(&wl->used) && tree_empty(&wl->scrub))) {
+		/*
+		 * No free physical eraseblocks? Well, we cancel wear-leveling
+		 * then. It will be triggered again when a free physical
+		 * eraseblock appears.
+		 *
+		 * No used physical eraseblocks? They must be temporarily
+		 * protected from being moved. They will be moved to the
+		 * @wl->used tree later and the wear-leveling will be
+		 * triggered again.
+		 */
+		dbg_wl("cancel WL, a list is empty: free %d, used %d",
+		       tree_empty(&wl->free), tree_empty(&wl->used));
+		goto out;
+	}
+
+	if (tree_empty(&wl->scrub)) {
+		/*
+		 * Now pick the least worn-out used physical eraseblock and a
+		 * highly worn-out free physical eraseblock. If the erase
+		 * counters differ enough, start wear-leveling.
+		 */
+		e1 = rb_entry(rb_first(&wl->used), struct ubi_wl_entry, rb);
+		e2 = find_wl_entry(&wl->free, WL_FREE_MAX_DIFF);
+
+		if (!(e2->ec - e1->ec >= UBI_WL_THRESHOLD)) {
+			dbg_wl("no WL needed: min used EC %d, max free EC %d",
+			       e1->ec, e2->ec);
+			goto out;
+		}
+		used_tree_del(wl, e1);
+		dbg_wl("move PEB %d EC %d to PEB %d EC %d",
+		       e1->pnum, e1->ec, e2->pnum, e2->ec);
+	} else {
+		scrub = 1;
+		e1 = rb_entry(rb_first(&wl->scrub), struct ubi_wl_entry, rb);
+		e2 = find_wl_entry(&wl->free, WL_FREE_MAX_DIFF);
+		scrub_tree_del(wl, e1);
+		dbg_wl("scrub PEB %d to PEB %d", e1->pnum, e2->pnum);
+	}
+
+	free_tree_del(wl, e2);
+	wl->move = e1;
+	spin_unlock(&wl->lock);
+
+	vid_hdr = ubi_zalloc_vid_hdr(ubi);
+	if (unlikely(!vid_hdr)) {
+		err = -ENOMEM;
+		goto out_err_cancel;
+	}
+
+	/*
+	 * Now we are going to copy physical eraseblock @e1->pnum to @e2->pnum.
+	 * We've selected a victim (@e1) and @wl->move refers to it. But the
+	 * user may call 'ubi_wl_put_peb()' for @e1, and then the movement has
+	 * to be canceled.
+	 *
+	 * We so far do not know which logical eraseblock our physical
+	 * eraseblock (@e1) belongs to. This means we cannot lock it. We have
+	 * to read the volume identifier header first. But if @e1 is put, it is
+	 * scheduled for erasure, and we may have a race with the erasure.
+	 * So, we may easily get an I/O error when we read the VID header.
+	 * This does not necessarily mean something nasty.
+	 */
+
+	err = ubi_io_read_vid_hdr(ubi, e1->pnum, vid_hdr, 0);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS) {
+		/* OK, error. If the movement was canceled, don't panic */
+		spin_lock(&wl->lock);
+		if (!wl->move) {
+			spin_unlock(&wl->lock);
+			/* This physical eraseblock was put meanwhile */
+			goto out_cancel_wl;
+		}
+		spin_unlock(&wl->lock);
+		/* Well, this means there is a problem */
+		dbg_wl("VID hdr read error (%d)", err);
+		goto vid_hdr_read_error;
+	}
+
+	if (vid_hdr->vol_type == UBI_VID_STATIC) {
+		data_size = ubi32_to_cpu(vid_hdr->data_size);
+		aldata_size = align_up(data_size, io->min_io_size);
+	} else
+		data_size = aldata_size =
+			    io->leb_size - ubi32_to_cpu(vid_hdr->data_pad);
+
+	ubi_assert(aldata_size % io->min_io_size == 0);
+
+	buf = ubi_kmalloc(aldata_size);
+	if (unlikely(!buf)) {
+		err = -ENOMEM;
+		goto out_err_cancel_vid_hdr;
+	}
+
+	vol_id = ubi32_to_cpu(vid_hdr->vol_id);
+	lnum = ubi32_to_cpu(vid_hdr->lnum);
+
+	/*
+	 * We do not want anybody to write to this physical eraseblock while
+	 * we are copying it, so we lock it.
+	 */
+	err = ubi_eba_leb_write_lock(ubi, vol_id, lnum);
+	if (unlikely(err))
+		goto out_err_cancel_buf;
+
+	spin_lock(&wl->lock);
+	if (!wl->move)
+		/* This physical eraseblock was put meanwhile, cancel */
+		goto out_cancel_wl_unlock;
+	spin_unlock(&wl->lock);
+
+	/*
+	 * From now on nobody can access this physical eraseblock as we locked
+	 * the corresponding logical eraseblock.
+	 */
+	dbg_wl("read %d bytes of data", aldata_size);
+	err = ubi_io_read_data(ubi, buf, e1->pnum, 0, aldata_size);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS) {
+		ubi_warn("error %d while reading data from PEB %d",
+			 err, e1->pnum);
+		goto data_read_error;
+	}
+
+	/*
+	 * Now we have got to calculate how much data we have to copy. In
+	 * case of a static volume it is fairly easy - the VID header contains
+	 * the data size. In case of a dynamic volume it is more difficult - we
+	 * have to read the contents, cut 0xFF bytes from the end and copy only
+	 * the first part. We must do this to avoid writing 0xFF bytes as it
+	 * may have some side-effects. And not only this. It is important not
+	 * to include those 0xFFs in the CRC because later the user may fill
+	 * them with their own data!
+	 */
+	if (vid_hdr->vol_type == UBI_VID_DYNAMIC)
+		aldata_size = data_size =
+				ubi_calc_data_len(ubi, buf, data_size);
+
+	cond_resched();
+	crc = crc32(UBI_CRC32_INIT, buf, data_size);
+
+	/*
+	 * It may turn out that the whole @e1->pnum physical eraseblock
+	 * contains only 0xFF bytes. Then we have to only write the VID header
+	 * and do not write any data. This also means we should not set
+	 * @vid_hdr->copy_flag, @vid_hdr->data_size, and @vid_hdr->data_crc.
+	 */
+	if (likely(data_size > 0)) {
+		vid_hdr->copy_flag = 1;
+		vid_hdr->data_size = cpu_to_ubi32(data_size);
+		vid_hdr->data_crc = cpu_to_ubi32(crc);
+	}
+	vid_hdr->leb_ver = cpu_to_ubi32(ubi32_to_cpu(vid_hdr->leb_ver) + 1);
+
+	cond_resched();
+	err = ubi_io_write_vid_hdr(ubi, e2->pnum, vid_hdr);
+	if (unlikely(err))
+		goto write_error;
+
+	/* Read the VID header back and check if it was written correctly */
+	cond_resched();
+	err = ubi_io_read_vid_hdr(ubi, e2->pnum, vid_hdr, 1);
+	if (unlikely(err)) {
+		if (err != UBI_IO_BITFLIPS) {
+			ubi_warn("cannot read VID header back from PEB %d", e2->pnum);
+			goto write_error;
+		}
+		goto bitflip;
+	}
+
+	if (likely(data_size > 0)) {
+		void *buf1;
+
+		err = ubi_io_write_data(ubi, buf, e2->pnum, 0, aldata_size);
+		if (unlikely(err))
+			goto write_error;
+
+		/*
+		 * We've written the data and are going to read it back to make
+		 * sure it was written correctly.
+		 */
+		buf1 = ubi_kmalloc(aldata_size);
+		if (unlikely(!buf1)) {
+			err = -ENOMEM;
+			goto write_error;
+		}
+
+		cond_resched();
+		err = ubi_io_read_data(ubi, buf1, e2->pnum, 0, aldata_size);
+		if (unlikely(err)) {
+			ubi_kfree(buf1);
+			if (err != UBI_IO_BITFLIPS) {
+				ubi_warn("cannot read data back from PEB %d",
+					 e2->pnum);
+				goto write_error;
+			}
+			goto bitflip;
+		}
+
+		cond_resched();
+		if (unlikely(memcmp(buf, buf1, aldata_size))) {
+			ubi_warn("read data back from PEB %d - it is different",
+				 e2->pnum);
+			err = -EINVAL;
+			ubi_kfree(buf1);
+			goto write_error;
+		}
+		ubi_kfree(buf1);
+	}
+
+	/*
+	 * Re-map the logical eraseblock to the new physical eraseblock
+	 * (@e2->pnum).
+	 */
+	ubi_eba_leb_remap(ubi, vol_id, lnum, e2->pnum);
+
+	/*
+	 * The physical eraseblock was successfully copied and re-mapped. Add
+	 * the new copy to the @wl->used tree and schedule the old one for
+	 * erasure.
+	 */
+	spin_lock(&wl->lock);
+	wl->erase_pending += 1;
+	used_tree_add(wl, e2);
+	wl->wl_scheduled = 0;
+	ubi_assert(wl->move);
+	wl->move = NULL;
+	spin_unlock(&wl->lock);
+
+	/* Unlock the logical eraseblock */
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+
+	ubi_kfree(buf);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+
+	/*
+	 * Note, we do not check if more wear-leveling is needed here. We
+	 * schedule the physical eraseblock for erasure so we know that the
+	 * erase worker will take care of that.
+	 */
+	err = schedule_erase(ubi, e1, 0);
+	if (unlikely(err)) {
+		/* This may only happen if there is no memory */
+		ubi_free_wl_entry(e1);
+		ubi_eba_ro_mode(ubi);
+	}
+	dbg_wl("done");
+	return err;
+
+out:
+	wl->wl_scheduled = 0;
+	spin_unlock(&wl->lock);
+	return 0;
+
+	/*
+	 * The physical eraseblock we have selected was put. It was scheduled
+	 * for erasure and removed from the @wl->used tree, so we only need to
+	 * return @e2 back to the @wl->free tree.
+	 */
+out_cancel_wl_unlock:
+	spin_unlock(&wl->lock);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	ubi_kfree(buf);
+out_cancel_wl:
+	dbg_wl("PEB %d was put, don't move it, cancel", e1->pnum);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	spin_lock(&wl->lock);
+	ubi_assert(wl->move == NULL);
+	free_tree_add(wl, e2);
+	wl->wl_scheduled = 0;
+	spin_unlock(&wl->lock);
+	return 0;
+
+	/*
+	 * Some non-I/O error occurred. Neither @e1 nor @e2 were changed, just
+	 * get them back to the lists they were taken from.
+	 */
+out_err_cancel_buf:
+	ubi_kfree(buf);
+out_err_cancel_vid_hdr:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+out_err_cancel:
+	spin_lock(&wl->lock);
+	wl->wl_scheduled = 0;
+	if (wl->move) {
+		if (scrub)
+			scrub_tree_add(wl, e1);
+		else
+			used_tree_add(wl, e1);
+		wl->move = NULL;
+	}
+	free_tree_add(wl, e2);
+	spin_unlock(&wl->lock);
+	return err;
+
+	/*
+	 * Failed to read from the @e1->pnum physical eraseblock. Something
+	 * nasty happened. We don't want to move this physical eraseblock.
+	 * We also don't want this physical eraseblock to be repeatedly
+	 * selected for wear-leveling, so protect it.
+	 *
+	 * FIXME: It would be better to notify upper layers about this and let
+	 * them handle this. But this is not implemented.
+	 */
+data_read_error:
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	ubi_kfree(buf);
+vid_hdr_read_error:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	spin_lock(&wl->lock);
+	free_tree_add(wl, e2);
+	spin_unlock(&wl->lock);
+
+	pe = ubi_alloc_wl_prot_entry();
+	if (!pe) {
+		spin_lock(&wl->lock);
+		wl->wl_scheduled = 0;
+		if (wl->move)
+			used_tree_add(wl, e1);
+		wl->move = NULL;
+		spin_unlock(&wl->lock);
+		return -ENOMEM;
+	}
+	spin_lock(&wl->lock);
+	wl->wl_scheduled = 0;
+	if (wl->move) {
+		prot_tree_add(wl, e1, pe, ST_PROTECTION);
+		wl->move = NULL;
+		spin_unlock(&wl->lock);
+	} else {
+		spin_unlock(&wl->lock);
+		ubi_free_wl_prot_entry(pe);
+	}
+	return 0;
+
+	/*
+	 * An error occurred during writing. Something was written to @e2->pnum,
+	 * so we cannot treat it as free any longer. Put @e1 back to the
+	 * @wl->used tree and schedule @e2->pnum for erasure.
+	 *
+	 * Normally, this happens if @e2->pnum went bad - then it will be
+	 * handled in the erase function. But if the flash does not allow bad
+	 * physical eraseblocks, we switch to read-only mode.
+	 */
+write_error:
+	ubi_kfree(buf);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+
+	spin_lock(&wl->lock);
+	wl->wl_scheduled = 0;
+	ubi_assert(wl->move);
+	used_tree_add(wl, e1);
+	wl->move = NULL;
+	spin_unlock(&wl->lock);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+
+	if (ubi->io->bad_allowed) {
+		int err1;
+
+		spin_lock(&wl->lock);
+		wl->erase_pending += 1;
+		spin_unlock(&wl->lock);
+		err1 = schedule_erase(ubi, e2, 1);
+		if (err1) {
+			/* No memory - bad, switch to read-only mode */
+			ubi_free_wl_entry(e2);
+			spin_lock(&wl->lock);
+			wl->erase_pending -= 1;
+			spin_unlock(&wl->lock);
+			ubi_eba_ro_mode(ubi);
+			err = err1;
+		}
+	} else {
+		ubi_err("flash device may be severely bad");
+		ubi_free_wl_entry(e2);
+		ubi_eba_ro_mode(ubi);
+	}
+	return err;
+
+	/*
+	 * We successfully wrote the data to @e2->pnum, but when we read it back
+	 * we detected a bit-flip. So we cancel the operation.
+	 */
+bitflip:
+	dbg_wl("bit-flip at the copied data, cancel");
+	ubi_kfree(buf);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+
+	spin_lock(&wl->lock);
+	wl->wl_scheduled = 0;
+	ubi_assert(wl->move);
+	if (scrub)
+		scrub_tree_add(wl, e1);
+	else
+		used_tree_add(wl, e1);
+	wl->move = NULL;
+	spin_unlock(&wl->lock);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+
+	spin_lock(&wl->lock);
+	wl->erase_pending += 1;
+	spin_unlock(&wl->lock);
+	err = schedule_erase(ubi, e2, 0);
+	if (err) {
+		/* No memory - bad, switch to read-only mode */
+		ubi_free_wl_entry(e2);
+		spin_lock(&wl->lock);
+		wl->erase_pending -= 1;
+		spin_unlock(&wl->lock);
+		ubi_eba_ro_mode(ubi);
+	}
+
+	return err;
+
+}
+
+/**
+ * sync_erase - synchronously erase a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @e: the physical eraseblock to erase
+ * @torture: if the physical eraseblock has to be tortured
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int sync_erase(const struct ubi_info *ubi, struct ubi_wl_entry *e,
+		      int torture)
+{
+	int err;
+	struct ubi_ec_hdr *ec_hdr;
+	struct ubi_wl_info *wl = ubi->wl;
+	uint64_t ec = e->ec;
+
+	dbg_wl("erase PEB %d, old EC %llu", e->pnum, (unsigned long long)ec);
+
+	err = paranoid_check_ec(ubi, e->pnum, e->ec);
+	if (unlikely(err > 0))
+		return -EINVAL;
+
+	ec_hdr = ubi_zalloc_ec_hdr(ubi);
+	if (unlikely(!ec_hdr))
+		return -ENOMEM;
+
+	err = ubi_io_sync_erase(ubi, e->pnum, torture);
+	if (unlikely(err < 0))
+		goto out_free;
+
+	ec += err;
+	if (unlikely(ec > UBI_MAX_ERASECOUNTER)) {
+		/*
+		 * Erase counter overflow. Upgrade UBI and use 64-bit
+		 * erase counters internally.
+		 */
+		ubi_err("erase counter overflow at PEB %d, EC %llu",
+			e->pnum, (unsigned long long)ec);
+		err = -EINVAL;
+		goto out_free;
+	}
+
+	dbg_wl("erased PEB %d, new EC %llu", e->pnum, (unsigned long long)ec);
+
+	ec_hdr->ec = cpu_to_ubi64(ec);
+
+	err = ubi_io_write_ec_hdr(ubi, e->pnum, ec_hdr);
+	if (unlikely(err))
+		goto out_free;
+
+	e->ec = ec;
+	if (e->ec > wl->max_ec)
+		wl->max_ec = e->ec;
+
+out_free:
+	ubi_free_ec_hdr(ubi, ec_hdr);
+	return err;
+}
+
+/**
+ * produce_free - produce a free physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function tries to make a free PEB by means of synchronous execution of
+ * pending work. This may be needed if, for example, the background thread is
+ * disabled. Returns zero in case of success and a negative error code in case
+ * of failure.
+ */
+static int produce_free(const struct ubi_info *ubi)
+{
+	int err;
+	struct ubi_wl_info *wl = ubi->wl;
+
+	spin_lock(&wl->lock);
+	while (tree_empty(&wl->free)) {
+		spin_unlock(&wl->lock);
+
+		dbg_wl("do one work synchronously");
+		err = ubi_bgt_do_work(ubi);
+		if (unlikely(err))
+			return err;
+
+		spin_lock(&wl->lock);
+	}
+	spin_unlock(&wl->lock);
+
+	return 0;
+}
+
+/**
+ * wl_tree_add - add a wear-leveling entry to a WL RB-tree.
+ *
+ * @e: the wear-leveling entry to add
+ * @root: the root of the tree
+ *
+ * Note, we use (erase counter, physical eraseblock number) pairs as keys in
+ * the @wl->used and @wl->free RB-trees.
+ */
+static void wl_tree_add(struct ubi_wl_entry *e, struct rb_root *root)
+{
+	struct rb_node **p, *parent = NULL;
+
+	p = &root->rb_node;
+	while (*p) {
+		struct ubi_wl_entry *e1;
+
+		parent = *p;
+		e1 = rb_entry(parent, struct ubi_wl_entry, rb);
+
+		if (e->ec < e1->ec)
+			p = &(*p)->rb_left;
+		else if (e->ec > e1->ec)
+			p = &(*p)->rb_right;
+		else {
+			ubi_assert(e->pnum != e1->pnum);
+			if (e->pnum < e1->pnum)
+				p = &(*p)->rb_left;
+			else
+				p = &(*p)->rb_right;
+		}
+	}
+
+	rb_link_node(&e->rb, parent, p);
+	rb_insert_color(&e->rb, root);
+}
+
+/**
+ * in_wl_tree - check if a wear-leveling entry is present in a WL RB-tree.
+ *
+ * @e: the wear-leveling entry to check
+ * @root: the root of the tree
+ *
+ * This function returns non-zero if @e is in the @root RB-tree and zero if it
+ * is not.
+ */
+static int in_wl_tree(struct ubi_wl_entry *e, struct rb_root *root)
+{
+	struct rb_node *p;
+
+	p = root->rb_node;
+	while (p) {
+		struct ubi_wl_entry *e1;
+
+		e1 = rb_entry(p, struct ubi_wl_entry, rb);
+
+		if (e->pnum == e1->pnum) {
+			ubi_assert(e == e1);
+			return 1;
+		}
+
+		if (e->ec < e1->ec)
+			p = p->rb_left;
+		else if (e->ec > e1->ec)
+			p = p->rb_right;
+		else {
+			ubi_assert(e->pnum != e1->pnum);
+			if (e->pnum < e1->pnum)
+				p = p->rb_left;
+			else
+				p = p->rb_right;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * tree_destroy - destroy an RB-tree.
+ *
+ * @root: the root of the tree to destroy
+ */
+static void tree_destroy(struct rb_root *root)
+{
+	struct rb_node *rb;
+	struct ubi_wl_entry *e;
+
+	rb = root->rb_node;
+	while (rb) {
+		if (rb->rb_left)
+			rb = rb->rb_left;
+		else if (rb->rb_right)
+			rb = rb->rb_right;
+		else {
+			e = rb_entry(rb, struct ubi_wl_entry, rb);
+
+			rb = rb_parent(rb);
+			if (rb) {
+				if (rb->rb_left == &e->rb)
+					rb->rb_left = NULL;
+				else
+					rb->rb_right = NULL;
+			}
+
+			ubi_free_wl_entry(e);
+		}
+	}
+}
+
+/**
+ * protection_trees_destroy - destroy the protection RB-trees.
+ *
+ * @wl: the wear-leveling unit description data structure
+ */
+static void protection_trees_destroy(struct ubi_wl_info *wl)
+{
+	struct rb_node *rb;
+	struct ubi_wl_prot_entry *pe;
+
+	rb = wl->prot.aec.rb_node;
+	while (rb) {
+		if (rb->rb_left)
+			rb = rb->rb_left;
+		else if (rb->rb_right)
+			rb = rb->rb_right;
+		else {
+			pe = rb_entry(rb, struct ubi_wl_prot_entry, rb_aec);
+
+			rb = rb_parent(rb);
+			if (rb) {
+				if (rb->rb_left == &pe->rb_aec)
+					rb->rb_left = NULL;
+				else
+					rb->rb_right = NULL;
+			}
+
+			ubi_free_wl_entry(pe->e);
+			ubi_free_wl_prot_entry(pe);
+		}
+	}
+}
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_WL
+
+/**
+ * paranoid_check_ec - make sure that the erase counter of a physical eraseblock
+ * is correct.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to check
+ * @ec: the erase counter to check
+ *
+ * This function returns zero if the erase counter of physical eraseblock @pnum
+ * is equivalent to @ec, %1 if not, and a negative error code if an error
+ * occurred.
+ */
+static int paranoid_check_ec(const struct ubi_info *ubi, int pnum, int ec)
+{
+	int err;
+	long long read_ec;
+	struct ubi_ec_hdr *ec_hdr;
+
+	ec_hdr = ubi_zalloc_ec_hdr(ubi);
+	if (unlikely(!ec_hdr))
+		return -ENOMEM;
+
+	err = ubi_io_read_ec_hdr(ubi, pnum, ec_hdr, 0);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS) {
+		/* The header does not have to exist */
+		err = 0;
+		goto out_free;
+	}
+
+	read_ec = ubi64_to_cpu(ec_hdr->ec);
+	if (unlikely(ec != read_ec)) {
+		ubi_err("paranoid check failed for PEB %d", pnum);
+		ubi_err("read EC is %lld, should be %d", read_ec, ec);
+		ubi_dbg_dump_stack();
+		err = 1;
+	} else
+		err = 0;
+
+out_free:
+	ubi_free_ec_hdr(ubi, ec_hdr);
+	return err;
+}
+
+/**
+ * paranoid_check_in_wl_tree - make sure that a wear-leveling entry is present
+ * in a WL RB-tree.
+ *
+ * @e: the wear-leveling entry to check
+ * @root: the root of the tree
+ *
+ * This function returns zero if @e is in the @root RB-tree and %1 if it
+ * is not.
+ */
+static int paranoid_check_in_wl_tree(struct ubi_wl_entry *e, struct rb_root *root)
+{
+	if (likely(in_wl_tree(e, root)))
+		return 0;
+
+	ubi_err("paranoid check failed for PEB %d, EC %d, RB-tree %p",
+		e->pnum, e->ec, root);
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+#endif /* CONFIG_MTD_UBI_DEBUG_PARANOID_WL */

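Editorial sketch, not part of the patch: the comment in the copy path above
says that for dynamic volumes the trailing 0xFF bytes are cut off before the
data is copied and checksummed. A minimal illustration of that trimming step
in plain C might look as follows. The names trim_trailing_ff() and align_up()
are hypothetical, and rounding the length up to the minimal I/O unit is only
an assumption about what ubi_calc_data_len() may do; its real implementation
is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the length of @buf once trailing 0xFF bytes
 * (which look like erased flash) are cut off. Returns 0 if the whole
 * buffer consists of 0xFF bytes.
 */
static size_t trim_trailing_ff(const uint8_t *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == 0xFF)
		len--;
	return len;
}

/*
 * Hypothetical helper: round @len up to a multiple of @min_io. Whether the
 * real code aligns the length this way is an assumption.
 */
static size_t align_up(size_t len, size_t min_io)
{
	assert(min_io > 0);
	return (len + min_io - 1) / min_io * min_io;
}
```

For a buffer holding 01 02 FF 03 followed by four FF bytes,
trim_trailing_ff() yields 4: the embedded 0xFF is kept because only the tail
is cut. A result of 0 corresponds to the case described above where only the
VID header is written and no data is copied.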

* [PATCH 25/44 take 2] [UBI] EBA unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (23 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 24/44 take 2] [UBI] wear-leveling unit implementation Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 26/44 take 2] [UBI] EBA unit implementation Artem Bityutskiy
                   ` (21 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/eba.h tmp-to/drivers/mtd/ubi/eba.h
--- tmp-from/drivers/mtd/ubi/eba.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/eba.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,362 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * The UBI Eraseblock Association (EBA) unit.
+ *
+ * The main goal of this unit is to maintain the Eraseblock Association Table
+ * (EBA table). The EBA table is a data structure which maps (volume ID,
+ * logical eraseblock number) pairs to physical eraseblock numbers.
+ *
+ * Note, it is assumed that all the UBI input/output goes via the EBA unit.
+ * The only exception is initialization time, when different units may
+ * directly do input/output from physical eraseblocks.
+ *
+ * Although in this implementation the EBA table is fully kept and managed in
+ * RAM, which implies poor UBI scalability, it might be (partially) maintained
+ * on flash in future implementations.
+ */
+
+#ifndef __UBI_EBA_H__
+#define __UBI_EBA_H__
+
+#include <linux/spinlock.h>
+#include <linux/rwsem.h>
+#include <linux/rbtree.h>
+#include <linux/mtd/ubi.h>
+
+struct ubi_info;
+struct ubi_scan_info;
+
+/**
+ * ubi_eba_mkvol - create EBA mapping for a new volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the new volume
+ * @leb_count: how many eraseblocks are reserved for this volume
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_eba_mkvol(const struct ubi_info *ubi, int vol_id, int leb_count);
+
+/**
+ * ubi_eba_rmvol - remove EBA mapping for a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to be removed
+ *
+ * This function removes a volume from the EBA table. This function un-maps all
+ * the logical eraseblocks and the corresponding physical eraseblocks will be
+ * scheduled for erasure. This function returns zero in case of success and a
+ * negative error code in case of failure.
+ */
+int ubi_eba_rmvol(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_eba_rsvol - re-size EBA mapping for a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to be re-sized
+ * @reserved_pebs: new count of physical eraseblocks in this volume
+ *
+ * This function changes the EBA table according to the volume re-size
+ * operation. If the volume is actually shrunk, the dropped logical
+ * eraseblocks are unmapped and thus the corresponding physical eraseblocks
+ * are scheduled for erasure. This function returns zero in case of success and
+ * a negative error code in case of failure.
+ */
+int ubi_eba_rsvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs);
+
+/**
+ * ubi_eba_erase_leb - erase a logical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: volume ID
+ * @lnum: the logical eraseblock number to erase
+ *
+ * This function un-maps the logical eraseblock and schedules the physical
+ * eraseblock for erasure.
+ *
+ * Note, the physical eraseblock is just scheduled for erasure, not actually
+ * erased by this function. So there is no guarantee that after an unclean
+ * reboot (or even a regular UBI shut down) this logical eraseblock stays
+ * un-mapped. It may become mapped to the same physical eraseblock again if it
+ * had not been erased by the time the unclean reboot happened.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_eba_erase_leb(const struct ubi_info *ubi, int vol_id, int lnum);
+
+/**
+ * ubi_eba_read_leb - read data from a logical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID from where to read
+ * @lnum: the logical eraseblock number to read from
+ * @buf: the buffer to store the read data
+ * @offset: the offset within the logical eraseblock from where to read
+ * @len: how many bytes to read
+ * @check: data CRC check flag
+ *
+ * If the logical eraseblock @lnum is unmapped, @buf is filled by 0xFF bytes.
+ * The @check flag only makes sense for static volumes and forces eraseblock
+ * data CRC checking.
+ *
+ * In case of success this function returns zero. If the @check flag is set,
+ * @vol_id is a static volume, and the data CRC mismatches - %-EBADMSG is
+ * returned. %-EBADMSG may also be returned for any volume type if an ECC error
+ * was detected by the MTD device driver.
+ *
+ * Other negative error codes may be returned in case of other errors.
+ */
+int ubi_eba_read_leb(const struct ubi_info *ubi, int vol_id, int lnum,
+		     void *buf, int offset, int len, int check);
+
+/**
+ * ubi_eba_write_leb - write data to a logical eraseblock of a dynamic volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID where to write
+ * @lnum: the logical eraseblock number to write
+ * @buf: the data to write
+ * @offset: the offset within the logical eraseblock where to write
+ * @len: how many bytes to write
+ * @dtype: data type
+ *
+ * This function writes data to a logical eraseblock of a dynamic volume.
+ * Returns zero in case of success and a negative error code in case of
+ * failure.
+ *
+ * Note, in case of an error, it is possible that something was still written
+ * to the flash media, but it may be garbage.
+ */
+int ubi_eba_write_leb(const struct ubi_info *ubi, int vol_id, int lnum,
+		      const void *buf, int offset, int len,
+		      enum ubi_data_type dtype);
+
+/**
+ * ubi_eba_write_leb_st - write data to a logical eraseblock of a static volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID where to write
+ * @lnum: the logical eraseblock number to write
+ * @buf: the data to write
+ * @len: how many bytes to write
+ * @dtype: data type
+ * @used_ebs: how many logical eraseblocks will this volume contain (used only
+ * for static volumes)
+ *
+ * This function writes data to a logical eraseblock of a static volume. The
+ * @used_ebs argument should contain the total number of logical eraseblocks
+ * which will contain any data in this static volume.
+ *
+ * When writing to the last logical eraseblock of a static volume, the @len
+ * argument doesn't have to be aligned to the minimal I/O unit size. Instead,
+ * it has to be equivalent to the real data size, although the @buf buffer has
+ * to contain the alignment. In all other cases, @len has to be aligned.
+ *
+ * Note, it is prohibited to write more than once to logical eraseblocks of
+ * static volumes.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_eba_write_leb_st(const struct ubi_info *ubi, int vol_id, int lnum,
+			 const void *buf, int len, enum ubi_data_type dtype,
+			 int used_ebs);
+
+/**
+ * ubi_eba_leb_is_mapped - check if a logical eraseblock is mapped.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number to check
+ *
+ * This function checks if a logical eraseblock is mapped to a physical
+ * eraseblock. Returns %1 if it is mapped, %0 if not, and a negative error
+ * code in case of failure.
+ */
+int ubi_eba_leb_is_mapped(const struct ubi_info *ubi, int vol_id, int lnum);
+
+/**
+ * ubi_eba_leb_read_lock - lock a logical eraseblock for reading.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number to lock
+ *
+ * This function locks a logical eraseblock for reading, which means that all
+ * writers will block, waiting until the logical eraseblock is no longer being
+ * used.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_eba_leb_read_lock(const struct ubi_info *ubi, int vol_id, int lnum);
+
+/**
+ * ubi_eba_leb_write_lock - lock a logical eraseblock for writing.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number to lock
+ *
+ * This function locks a logical eraseblock for writing, which means that all
+ * further readers and writers will block, waiting until the logical
+ * eraseblock is no longer being used.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_eba_leb_write_lock(const struct ubi_info *ubi, int vol_id, int lnum);
+
+/**
+ * ubi_eba_leb_read_unlock - unlock a logical eraseblock locked for reading.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number to unlock
+ */
+void ubi_eba_leb_read_unlock(const struct ubi_info *ubi, int vol_id, int lnum);
+
+/**
+ * ubi_eba_leb_write_unlock - unlock a logical eraseblock locked for writing.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number to unlock
+ */
+void ubi_eba_leb_write_unlock(const struct ubi_info *ubi, int vol_id, int lnum);
+
+/**
+ * ubi_eba_leb_remap - re-map a logical eraseblock to another physical
+ * eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number
+ * @pnum: new physical eraseblock to map to
+ *
+ * The logical eraseblock must be locked before re-mapping.
+ */
+void ubi_eba_leb_remap(const struct ubi_info *ubi, int vol_id, int lnum,
+		       int pnum);
+
+/**
+ * ubi_eba_ro_mode - switch to read-only mode.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_eba_ro_mode(const struct ubi_info *ubi);
+
+/**
+ * ubi_eba_init_scan - initialize the EBA unit using scanning information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_eba_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si);
+
+/**
+ * ubi_eba_close - close the EBA unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_eba_close(const struct ubi_info *ubi);
+
+/**
+ * struct ubi_eba_tbl_rec - a record in the eraseblock association table.
+ *
+ * @pnum: physical eraseblock number
+ * @leb_ver: logical eraseblock version
+ *
+ * This structure represents a record in the eraseblock association table.
+ */
+struct ubi_eba_tbl_rec {
+	int pnum;
+	uint32_t leb_ver;
+};
+
+/**
+ * struct ubi_eba_tbl_volume - a volume in the eraseblock association
+ * table.
+ *
+ * @recs: an array of per-logical eraseblock records (for each logical
+ * eraseblock of this volume)
+ * @leb_count: how many logical eraseblocks this volume has
+ */
+struct ubi_eba_tbl_volume {
+	struct ubi_eba_tbl_rec *recs;
+	int leb_count;
+};
+
+/**
+ * struct ubi_eba_ltree_entry - an entry in the lock tree.
+ *
+ * @rb: link in the RB-tree
+ * @vol_id: volume ID of the locked logical eraseblock
+ * @lnum: the locked logical eraseblock number
+ * @users: how many tasks are using this logical eraseblock or wait for it
+ * @mutex: a read/write mutex to implement read/write access serialization to
+ * the (@vol_id, @lnum) logical eraseblock
+ *
+ * This data structure is used to lock a logical eraseblock - a corresponding
+ * &struct ubi_eba_ltree_entry is created and inserted to the lock tree
+ * (@eba->ltree).
+ */
+struct ubi_eba_ltree_entry {
+	struct rb_node rb;
+	int vol_id;
+	int lnum;
+	int users;
+	struct rw_semaphore mutex;
+};
+
+/**
+ * struct ubi_eba_info - UBI EBA unit description data structure.
+ *
+ * @eba_tbl: the eraseblock association table
+ * @eba_tbl_lock: protects the EBA table
+ * @ltree: the lock tree
+ * @ltree_lock: protects the lock tree
+ * @num_volumes: number of volumes mapped by the EBA table
+ *
+ * The EBA unit implements per-logical eraseblock locking. Before accessing a
+ * logical eraseblock it is locked for reading or writing. The per-logical
+ * eraseblock locking is implemented by means of the lock tree.
+ *
+ * The lock tree is an RB-tree which refers to all the currently locked logical
+ * eraseblocks. The lock tree elements are &struct ubi_eba_ltree_entry objects.
+ * They are indexed by (@vol_id,@lnum) pairs.
+ */
+struct ubi_eba_info {
+	struct ubi_eba_tbl_volume *eba_tbl; /* private */
+	spinlock_t eba_tbl_lock;            /* private */
+	struct rb_root ltree;               /* private */
+	spinlock_t ltree_lock;              /* private */
+	int num_volumes;                    /* private */
+};
+
+#endif /* !__UBI_EBA_H__ */

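Editorial sketch, not part of the patch: the header above describes the EBA
table as a per-volume array of records, each holding the physical eraseblock
number and the version of one logical eraseblock. The following plain C
fragment illustrates that idea under simplifying assumptions. All sketch_*
names and the SKETCH_UNMAPPED sentinel are hypothetical, locking is left out
entirely, and bumping the version on unmap follows the leb_unmap() comment in
the EBA implementation patch rather than anything stated in this header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_UNMAPPED (-1)	/* hypothetical "not mapped" marker */

/* Simplified stand-ins for the table record and per-volume structures. */
struct sketch_rec {
	int pnum;		/* physical eraseblock number or SKETCH_UNMAPPED */
	uint32_t leb_ver;	/* logical eraseblock version */
};

struct sketch_volume {
	struct sketch_rec *recs;
	int leb_count;
};

/* Create a volume in which every logical eraseblock starts out unmapped. */
static int sketch_mkvol(struct sketch_volume *vol, int leb_count)
{
	int i;

	assert(leb_count > 0);
	vol->recs = malloc((size_t)leb_count * sizeof(*vol->recs));
	if (!vol->recs)
		return -1;
	for (i = 0; i < leb_count; i++) {
		vol->recs[i].pnum = SKETCH_UNMAPPED;
		vol->recs[i].leb_ver = 0;
	}
	vol->leb_count = leb_count;
	return 0;
}

/* Look up the physical eraseblock a logical eraseblock is mapped to. */
static int sketch_leb2peb(const struct sketch_volume *vol, int lnum)
{
	assert(lnum >= 0 && lnum < vol->leb_count);
	return vol->recs[lnum].pnum;
}

/* Map a currently unmapped logical eraseblock to @pnum. */
static void sketch_map(struct sketch_volume *vol, int lnum, int pnum)
{
	assert(lnum >= 0 && lnum < vol->leb_count);
	assert(pnum >= 0);
	assert(vol->recs[lnum].pnum == SKETCH_UNMAPPED);
	vol->recs[lnum].pnum = pnum;
}

/* Unmap a mapped logical eraseblock and increase its version. */
static void sketch_unmap(struct sketch_volume *vol, int lnum)
{
	assert(lnum >= 0 && lnum < vol->leb_count);
	assert(vol->recs[lnum].pnum != SKETCH_UNMAPPED);
	vol->recs[lnum].pnum = SKETCH_UNMAPPED;
	vol->recs[lnum].leb_ver += 1;
}

/* Release the records of a volume. */
static void sketch_rmvol(struct sketch_volume *vol)
{
	free(vol->recs);
	vol->recs = NULL;
	vol->leb_count = 0;
}
```

Keeping the table as a flat array makes each lookup a constant-time index
operation, at the cost of RAM proportional to the number of logical
eraseblocks, which is the scalability trade-off the header comment mentions.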

* [PATCH 26/44 take 2] [UBI] EBA unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (24 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 25/44 take 2] [UBI] EBA unit header Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 27/44 take 2] [UBI] bad block handling unit header Artem Bityutskiy
                   ` (20 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/eba.c tmp-to/drivers/mtd/ubi/eba.c
--- tmp-from/drivers/mtd/ubi/eba.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/eba.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,1212 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/rbtree.h>
+#include <linux/sched.h>
+#include <linux/crc32.h>
+#include <linux/spinlock.h>
+#include <linux/rwsem.h>
+#include <linux/err.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "eba.h"
+#include "badeb.h"
+#include "io.h"
+#include "wl.h"
+#include "volmgmt.h"
+#include "vtbl.h"
+#include "account.h"
+#include "background.h"
+#include "scan.h"
+#include "misc.h"
+#include "debug.h"
+
+/*
+ * The highest bit in logical-to-physical eraseblock mappings is used to
+ * indicate that the logical eraseblock is not mapped.
+ */
+#define NOT_MAPPED 0x80000000
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_EBA
+static int paranoid_check_leb(const struct ubi_info *ubi, int pnum, int vol_id,
+			      int lnum, int leb_ver,
+			      const struct ubi_vid_hdr *vid_hdr);
+static int paranoid_check_leb_locked(const struct ubi_info *ubi, int vol_id,
+				     int lnum);
+#else
+#define paranoid_check_leb(ubi, pnum, vol_id, lnum, leb_ver, vid_hdr) 0
+#define paranoid_check_leb_locked(ubi, vol_id, lnum)
+#endif
+
+/**
+ * vol_id2idx - turn a volume ID into the EBA table index.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ */
+static inline int vol_id2idx(const struct ubi_info *ubi, int vol_id)
+{
+	const struct ubi_acc_info *acc = ubi->acc;
+
+	if (vol_id >= UBI_INTERNAL_VOL_START)
+		return vol_id - UBI_INTERNAL_VOL_START + acc->max_volumes;
+	else
+		return vol_id;
+}
+
+/**
+ * idx2vol_id - turn an EBA table index into the volume ID.
+ *
+ * @ubi: the UBI device description object
+ * @idx: the EBA table index
+ */
+static inline int idx2vol_id(const struct ubi_info *ubi, int idx)
+{
+	const struct ubi_acc_info *acc = ubi->acc;
+
+	if (idx >= acc->max_volumes)
+		return idx - acc->max_volumes + UBI_INTERNAL_VOL_START;
+	else
+		return idx;
+}
+
+/**
+ * leb_get_ver - get logical eraseblock version.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number
+ *
+ * The logical eraseblock has to be locked.
+ */
+static inline int leb_get_ver(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	int idx, leb_ver;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(eba->eba_tbl[idx].recs);
+	leb_ver = eba->eba_tbl[idx].recs[lnum].leb_ver;
+	spin_unlock(&eba->eba_tbl_lock);
+	return leb_ver;
+}
+
+/**
+ * leb_map - map a logical eraseblock to a physical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number
+ * @pnum: the physical eraseblock
+ *
+ * The logical eraseblock has to be locked.
+ */
+static inline void leb_map(const struct ubi_info *ubi, int vol_id, int lnum,
+			   int pnum)
+{
+	int idx;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(eba->eba_tbl[idx].recs);
+	ubi_assert(eba->eba_tbl[idx].recs[lnum].pnum < 0);
+	eba->eba_tbl[idx].recs[lnum].pnum = pnum;
+	spin_unlock(&eba->eba_tbl_lock);
+}
+
+/**
+ * leb_unmap - unmap a logical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number to unmap
+ *
+ * This function unmaps a logical eraseblock and increases its version. The
+ * logical eraseblock has to be locked.
+ */
+static inline void leb_unmap(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	int idx;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(eba->eba_tbl[idx].recs);
+	ubi_assert(eba->eba_tbl[idx].recs[lnum].pnum >= 0);
+
+	eba->eba_tbl[idx].recs[lnum].pnum |= NOT_MAPPED;
+	eba->eba_tbl[idx].recs[lnum].leb_ver += 1;
+	spin_unlock(&eba->eba_tbl_lock);
+}
+
+/**
+ * leb2peb - get physical eraseblock number the logical eraseblock is mapped
+ * to.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID
+ * @lnum: the logical eraseblock number
+ *
+ * If the logical eraseblock is mapped, this function returns a non-negative
+ * physical eraseblock number. If it is not mapped, this function returns
+ * a negative number.
+ */
+static inline int leb2peb(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	int idx, pnum;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(eba->eba_tbl[idx].recs);
+	pnum = eba->eba_tbl[idx].recs[lnum].pnum;
+	spin_unlock(&eba->eba_tbl_lock);
+
+	return pnum;
+}
+
+int ubi_eba_mkvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs)
+{
+	int i, idx, sz;
+	struct ubi_eba_tbl_rec *new_ebs;
+	struct ubi_eba_info *eba = ubi->eba;
+	struct ubi_eba_tbl_volume *eba_tbl = eba->eba_tbl;
+
+	dbg_eba("create volume %d, size %d", vol_id, reserved_pebs);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(reserved_pebs > 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+	ubi_assert(vol_id < ubi->acc->max_volumes);
+
+	if (ubi->io->ro_mode) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	sz = reserved_pebs * sizeof(struct ubi_eba_tbl_rec);
+	new_ebs = ubi_kmalloc(sz);
+	if (!new_ebs)
+		return -ENOMEM;
+
+	for (i = 0; i < reserved_pebs; i++) {
+		new_ebs[i].pnum = NOT_MAPPED;
+		new_ebs[i].leb_ver = 0xFFFFFFF0;
+	}
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(!eba_tbl[idx].recs);
+	eba_tbl[idx].recs = new_ebs;
+	eba_tbl[idx].leb_count = reserved_pebs;
+	spin_unlock(&eba->eba_tbl_lock);
+
+	return 0;
+}
+
+int ubi_eba_rmvol(const struct ubi_info *ubi, int vol_id)
+{
+	int err = 0, i, idx, to_put;
+	struct ubi_eba_tbl_rec *rm_ebs;
+	struct ubi_eba_info *eba = ubi->eba;
+	struct ubi_eba_tbl_volume *eba_tbl = eba->eba_tbl;
+
+	dbg_eba("remove volume %d", vol_id);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+	ubi_assert(vol_id < ubi->acc->max_volumes);
+
+	if (ubi->io->ro_mode) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(eba_tbl[idx].recs);
+	rm_ebs = eba_tbl[idx].recs;
+	to_put = eba_tbl[idx].leb_count;
+	eba_tbl[idx].recs = NULL;
+	eba_tbl[idx].leb_count = 0;
+	spin_unlock(&eba->eba_tbl_lock);
+
+	for (i = 0; i < to_put; i++)
+		if (rm_ebs[i].pnum >= 0) {
+			err = ubi_wl_put_peb(ubi, rm_ebs[i].pnum, 0);
+			if (err)
+				break;
+		}
+
+	ubi_kfree(rm_ebs);
+	return err;
+}
+
+int ubi_eba_rsvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs)
+{
+	int err = 0, i, idx, min, to_put, sz;
+	struct ubi_eba_tbl_rec *new_ebs, *old_ebs;
+	struct ubi_eba_info *eba = ubi->eba;
+	struct ubi_eba_tbl_volume *eba_tbl = eba->eba_tbl;
+
+	dbg_eba("re-size volume %d to %d PEBs", vol_id, reserved_pebs);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(!ubi_is_ivol(vol_id));
+	ubi_assert(vol_id < ubi->acc->max_volumes);
+	ubi_assert(reserved_pebs > 0);
+
+	if (ubi->io->ro_mode) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	sz = reserved_pebs * sizeof(struct ubi_eba_tbl_rec);
+	new_ebs = ubi_kmalloc(sz);
+	if (!new_ebs)
+		return -ENOMEM;
+
+	for (i = 0; i < reserved_pebs; i++) {
+		new_ebs[i].pnum = NOT_MAPPED;
+		new_ebs[i].leb_ver = 0;
+	}
+
+	idx = vol_id2idx(ubi, vol_id);
+
+	spin_lock(&eba->eba_tbl_lock);
+	ubi_assert(eba_tbl[idx].recs);
+
+	if (reserved_pebs < eba_tbl[idx].leb_count) {
+		min = reserved_pebs;
+		to_put = eba_tbl[idx].leb_count - reserved_pebs;
+	} else {
+		min = eba_tbl[idx].leb_count;
+		to_put = 0;
+	}
+
+	for (i = 0; i < min; i++) {
+		new_ebs[i].pnum = eba_tbl[idx].recs[i].pnum;
+		new_ebs[i].leb_ver = eba_tbl[idx].recs[i].leb_ver;
+	}
+	old_ebs = eba_tbl[idx].recs;
+	eba_tbl[idx].recs = new_ebs;
+	eba_tbl[idx].leb_count = reserved_pebs;
+	spin_unlock(&eba->eba_tbl_lock);
+
+	for (i = 0; i < to_put; i++)
+		if (old_ebs[min + i].pnum >= 0) {
+			err = ubi_wl_put_peb(ubi, old_ebs[min + i].pnum, 0);
+			if (err)
+				break;
+		}
+
+	ubi_kfree(old_ebs);
+	return err;
+}
+
+int ubi_eba_erase_leb(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	int err, pnum;
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(vol_id < ubi->acc->max_volumes || ubi_is_ivol(vol_id));
+	ubi_assert(lnum >= 0);
+	ubi_assert(ubi->eba->eba_tbl[vol_id2idx(ubi, vol_id)].recs);
+	ubi_assert(lnum < ubi->eba->eba_tbl[vol_id2idx(ubi, vol_id)].leb_count);
+
+	cond_resched();
+
+	if (unlikely(ubi->io->ro_mode)) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	err = ubi_eba_leb_write_lock(ubi, vol_id, lnum);
+	if (unlikely(err))
+		return err;
+
+	pnum = leb2peb(ubi, vol_id, lnum);
+	if (pnum < 0) {
+		/* This logical eraseblock is already unmapped */
+		dbg_eba("erase LEB %d:%d (unmapped)", vol_id, lnum);
+		goto out_unlock;
+	}
+	dbg_eba("erase LEB %d:%d, PEB %d", vol_id, lnum, pnum);
+
+	leb_unmap(ubi, vol_id, lnum);
+
+	err = ubi_wl_put_peb(ubi, pnum, 0);
+
+out_unlock:
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	return err;
+}
+
+int ubi_eba_read_leb(const struct ubi_info *ubi, int vol_id, int lnum,
+		     void *buf, int offset, int len, int check)
+{
+	int err, pnum, scrub = 0;
+	const struct ubi_vtbl_vtr *vtr;
+	uint32_t data_crc;
+	struct ubi_vid_hdr *vid_hdr;
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(vol_id < ubi->acc->max_volumes || ubi_is_ivol(vol_id));
+	ubi_assert(lnum >= 0);
+	ubi_assert(offset >= 0);
+	ubi_assert(len > 0);
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+	ubi_assert(offset + len <= ubi->io->leb_size - vtr->data_pad);
+	ubi_assert(lnum < ubi->eba->eba_tbl[vol_id2idx(ubi, vol_id)].leb_count);
+
+	cond_resched();
+
+	err = ubi_eba_leb_read_lock(ubi, vol_id, lnum);
+	if (unlikely(err))
+		return err;
+
+	pnum = leb2peb(ubi, vol_id, lnum);
+
+	if (pnum < 0) {
+		/*
+		 * The logical eraseblock is not mapped, fill the whole buffer
+		 * with 0xFF bytes. The exception is static volumes, for which
+		 * it is an error to read unmapped logical eraseblocks.
+		 */
+		dbg_eba("read %d bytes from offset %d of LEB %d:%d (unmapped)",
+			len, offset, vol_id, lnum);
+		ubi_eba_leb_read_unlock(ubi, vol_id, lnum);
+		ubi_assert(vtr->vol_type != UBI_STATIC_VOLUME);
+		memset(buf, 0xFF, len);
+		return 0;
+	}
+	dbg_eba("read %d bytes from offset %d of LEB %d:%d, PEB %d",
+		len, offset, vol_id, lnum, pnum);
+
+	if (vtr->vol_type == UBI_DYNAMIC_VOLUME)
+		/* In case of dynamic volumes no checking is needed */
+		check = 0;
+
+	if (check) {
+		vid_hdr = ubi_zalloc_vid_hdr(ubi);
+		if (unlikely(!vid_hdr)) {
+			err = -ENOMEM;
+			goto out_unlock;
+		}
+
+		err = ubi_io_read_vid_hdr(ubi, pnum, vid_hdr, 1);
+		if (unlikely(err) && err != UBI_IO_BITFLIPS) {
+			if (err > 0) {
+				/*
+				 * The header is either absent or corrupted.
+				 * The former case means there is a bug -
+				 * switch to read-only mode just in case.
+				 * The latter case means a real corruption - we
+				 * may try to recover data. FIXME: but this is
+				 * not implemented.
+				 */
+				if (err == UBI_IO_BAD_VID_HDR) {
+					ubi_warn("bad VID header at PEB %d, LEB"
+						 " %d:%d", pnum, vol_id, lnum);
+					err = -EBADMSG;
+				} else
+					ubi_eba_ro_mode(ubi);
+			}
+			goto out_free;
+		} else if (unlikely(err == UBI_IO_BITFLIPS))
+			scrub = 1;
+
+		err = paranoid_check_leb(ubi, pnum, vol_id, lnum,
+					 leb_get_ver(ubi, vol_id, lnum),
+					 vid_hdr);
+		if (unlikely(err)) {
+			if (err > 0)
+				err = -EINVAL;
+			goto out_free;
+		}
+
+		ubi_assert(lnum < ubi32_to_cpu(vid_hdr->used_ebs));
+		ubi_assert(len == ubi32_to_cpu(vid_hdr->data_size));
+
+		data_crc = ubi32_to_cpu(vid_hdr->data_crc);
+		ubi_free_vid_hdr(ubi, vid_hdr);
+	}
+
+	err = ubi_io_read_data(ubi, buf, pnum, offset, len);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS)
+		goto out_unlock;
+	else if (unlikely(err == UBI_IO_BITFLIPS)) {
+		scrub = 1;
+		err = 0;
+	}
+
+	if (check) {
+		uint32_t crc;
+
+		crc = crc32(UBI_CRC32_INIT, buf, len);
+		if (unlikely(crc != data_crc)) {
+			ubi_warn("CRC error: calculated %#08x, must be %#08x",
+				 crc, data_crc);
+			err = -EBADMSG;
+			goto out_unlock;
+		}
+
+		if (err)
+			dbg_eba("error %d while reading, but data CRC is OK, "
+				"ignore the error", err);
+		err = 0;
+		dbg_eba("data is OK, CRC matches");
+	}
+
+	if (unlikely(err))
+		goto out_unlock;
+
+	if (unlikely(scrub))
+		err = ubi_wl_scrub_peb(ubi, pnum);
+
+	ubi_eba_leb_read_unlock(ubi, vol_id, lnum);
+	return err;
+
+out_free:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+out_unlock:
+	ubi_eba_leb_read_unlock(ubi, vol_id, lnum);
+	return err;
+}
+
+int ubi_eba_write_leb(const struct ubi_info *ubi, int vol_id, int lnum,
+		      const void *buf, int offset, int len,
+		      enum ubi_data_type dtype)
+{
+	int err, pnum, tries = 0;
+	uint32_t leb_ver;
+	struct ubi_vid_hdr *vid_hdr;
+	const struct ubi_vtbl_vtr *vtr;
+	const struct ubi_io_info *io = ubi->io;
+
+retry:
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(vol_id < ubi->acc->max_volumes || ubi_is_ivol(vol_id));
+	ubi_assert(lnum >= 0);
+	ubi_assert(offset >= 0);
+	ubi_assert(len >= 0);
+	ubi_assert(dtype == UBI_DATA_LONGTERM || dtype == UBI_DATA_SHORTTERM ||
+		   dtype == UBI_DATA_UNKNOWN);
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+	ubi_assert(offset + len <= io->leb_size - vtr->data_pad);
+	ubi_assert(lnum < ubi->eba->eba_tbl[vol_id2idx(ubi, vol_id)].leb_count);
+	ubi_assert(len % io->min_io_size == 0);
+	ubi_assert(offset % io->min_io_size == 0);
+	ubi_assert(vtr->vol_type == UBI_DYNAMIC_VOLUME);
+
+	cond_resched();
+
+	if (unlikely(ubi->io->ro_mode)) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	err = ubi_eba_leb_write_lock(ubi, vol_id, lnum);
+	if (unlikely(err))
+		return err;
+
+	pnum = leb2peb(ubi, vol_id, lnum);
+	leb_ver = leb_get_ver(ubi, vol_id, lnum);
+	if (pnum >= 0) {
+		dbg_eba("write %d bytes at offset %d of LEB %d:%d, PEB %d",
+			len, offset, vol_id, lnum, pnum);
+
+		if (len != 0) {
+			err = ubi_io_write_data(ubi, buf, pnum, offset, len);
+			if (unlikely(err))
+				goto data_write_error;
+		}
+		ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+		return err;
+	}
+
+	/*
+	 * The logical eraseblock is not mapped. We have to get a free physical
+	 * eraseblock and write the volume identifier header there first.
+	 */
+	vid_hdr = ubi_zalloc_vid_hdr(ubi);
+	if (unlikely(!vid_hdr)) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+
+	vid_hdr->vol_type = UBI_VID_DYNAMIC;
+	vid_hdr->leb_ver = cpu_to_ubi32(leb_ver);
+	vid_hdr->vol_id = cpu_to_ubi32(vol_id);
+	vid_hdr->lnum = cpu_to_ubi32(lnum);
+	vid_hdr->compat = ubi_vtbl_get_compat(ubi, vol_id);
+	vid_hdr->data_pad = cpu_to_ubi32(vtr->data_pad);
+
+	pnum = ubi_wl_get_peb(ubi, dtype);
+	if (unlikely(pnum < 0)) {
+		err = pnum;
+		goto out_vid_hdr;
+	}
+	dbg_eba("write VID hdr and %d bytes at offset %d of LEB %d:%d, PEB %d",
+		len, offset, vol_id, lnum, pnum);
+
+	err = ubi_io_write_vid_hdr(ubi, pnum, vid_hdr);
+	if (unlikely(err))
+		goto hdr_write_error;
+
+	leb_map(ubi, vol_id, lnum, pnum);
+
+	if (len != 0) {
+		err = ubi_io_write_data(ubi, buf, pnum, offset, len);
+		if (unlikely(err))
+			goto data_write_error_free;
+	}
+
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	return 0;
+
+out_vid_hdr:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+out_unlock:
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	return err;
+
+	/* Failed to write the volume identifier header */
+hdr_write_error:
+	ubi_warn("failed to write VID header to PEB %d", pnum);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	if (err != -EIO || !io->bad_allowed)
+		goto no_bad_eraseblocks;
+
+	/*
+	 * Fortunately, we did not write any data there yet, so just put this
+	 * physical eraseblock and request a new one. We assume that if this
+	 * physical eraseblock went bad, the erase code will handle that.
+	 */
+	ubi_msg("try to recover from the error");
+	err = ubi_wl_put_peb(ubi, pnum, 1);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	if (err || ++tries > 5)
+		return err;
+	goto retry;
+
+	/* Failed to write data */
+data_write_error_free:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+data_write_error:
+	ubi_warn("failed to write data to PEB %d", pnum);
+	if (err != -EIO || !io->bad_allowed)
+		goto no_bad_eraseblocks;
+
+	err = ubi_beb_recover_peb(ubi, pnum, vol_id, lnum, buf, offset, len);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	return err;
+
+	/*
+	 * This flash device does not admit of bad eraseblocks or something
+	 * nasty and unexpected happened. Switch to read-only mode just in
+	 * case.
+	 */
+no_bad_eraseblocks:
+	ubi_eba_ro_mode(ubi);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	return err;
+}
+
+int ubi_eba_write_leb_st(const struct ubi_info *ubi, int vol_id, int lnum,
+			 const void *buf, int len, enum ubi_data_type dtype,
+			 int used_ebs)
+{
+	int err, pnum, tries = 0, data_size = len;
+	uint32_t leb_ver, crc;
+	struct ubi_vid_hdr *vid_hdr;
+	const struct ubi_vtbl_vtr *vtr;
+	const struct ubi_io_info *io = ubi->io;
+
+retry:
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(vol_id < ubi->acc->max_volumes || ubi_is_ivol(vol_id));
+	ubi_assert(lnum >= 0);
+	ubi_assert(len > 0);
+	ubi_assert(dtype == UBI_DATA_LONGTERM || dtype == UBI_DATA_SHORTTERM ||
+		   dtype == UBI_DATA_UNKNOWN);
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+	ubi_assert(lnum < ubi->eba->eba_tbl[vol_id2idx(ubi, vol_id)].leb_count);
+	ubi_assert(lnum < used_ebs);
+	ubi_assert(used_ebs >= 0);
+	ubi_assert(vtr->vol_type == UBI_STATIC_VOLUME);
+
+	cond_resched();
+
+	if (lnum == used_ebs - 1) {
+		/*
+		 * If this is the last logical eraseblock of a static
+		 * volume, @len may be unaligned.
+		 */
+		ubi_assert(len <= io->leb_size - vtr->data_pad);
+		len = align_up(data_size, io->min_io_size);
+	} else {
+		ubi_assert(len == io->leb_size - vtr->data_pad);
+		ubi_assert(len % io->min_io_size == 0);
+	}
+
+	if (unlikely(ubi->io->ro_mode)) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	err = ubi_eba_leb_write_lock(ubi, vol_id, lnum);
+	if (unlikely(err))
+		return err;
+
+	ubi_assert(leb2peb(ubi, vol_id, lnum) < 0);
+
+	/*
+	 * Get a free physical eraseblock and write the volume identifier
+	 * header.
+	 */
+	vid_hdr = ubi_zalloc_vid_hdr(ubi);
+	if (unlikely(!vid_hdr)) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+
+	leb_ver = leb_get_ver(ubi, vol_id, lnum);
+	vid_hdr->leb_ver = cpu_to_ubi32(leb_ver);
+	vid_hdr->vol_id = cpu_to_ubi32(vol_id);
+	vid_hdr->lnum = cpu_to_ubi32(lnum);
+	vid_hdr->compat = ubi_vtbl_get_compat(ubi, vol_id);
+	vid_hdr->data_pad = cpu_to_ubi32(vtr->data_pad);
+
+	crc = crc32(UBI_CRC32_INIT, buf, data_size);
+	vid_hdr->vol_type = UBI_VID_STATIC;
+	vid_hdr->data_size = cpu_to_ubi32(data_size);
+	vid_hdr->used_ebs = cpu_to_ubi32(used_ebs);
+	vid_hdr->data_crc = cpu_to_ubi32(crc);
+
+	pnum = ubi_wl_get_peb(ubi, dtype);
+	if (unlikely(pnum < 0)) {
+		err = pnum;
+		goto out_vid_hdr;
+	}
+	dbg_eba("write VID hdr and %d bytes to LEB %d:%d, PEB %d",
+		len, vol_id, lnum, pnum);
+
+	err = ubi_io_write_vid_hdr(ubi, pnum, vid_hdr);
+	if (unlikely(err)) {
+		ubi_warn("failed to write VID header to PEB %d", pnum);
+		goto write_error;
+	}
+
+	leb_map(ubi, vol_id, lnum, pnum);
+
+	err = ubi_io_write_data(ubi, buf, pnum, 0, len);
+	if (unlikely(err)) {
+		ubi_warn("failed to write data to PEB %d", pnum);
+		goto write_error;
+	}
+
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	return 0;
+
+out_vid_hdr:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+out_unlock:
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	return err;
+
+	/* Write failure */
+write_error:
+	/* Free the VID header exactly once */
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	if (err != -EIO || !io->bad_allowed)
+		goto no_bad_eraseblocks;
+
+	/*
+	 * We assume that if this physical eraseblock went bad - the erase code
+	 * will handle that.
+	 */
+	ubi_msg("try to recover from the error");
+	err = ubi_wl_put_peb(ubi, pnum, 1);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	if (err || ++tries > 5)
+		return err;
+	goto retry;
+
+	/*
+	 * This flash device does not admit of bad eraseblocks or something
+	 * nasty and unexpected happened. Switch to read-only mode just in
+	 * case.
+	 */
+no_bad_eraseblocks:
+	ubi_eba_ro_mode(ubi);
+	ubi_eba_leb_write_unlock(ubi, vol_id, lnum);
+	return err;
+}
+
+int ubi_eba_leb_is_mapped(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	dbg_eba("check LEB %d:%d PEBs", vol_id, lnum);
+
+	/* Input arguments sanity check */
+	ubi_assert(vol_id >= 0);
+	ubi_assert(vol_id < ubi->acc->max_volumes);
+	ubi_assert(lnum >= 0);
+	ubi_assert(lnum < ubi->eba->eba_tbl[vol_id2idx(ubi, vol_id)].leb_count);
+
+	return leb2peb(ubi, vol_id, lnum) >= 0;
+}
+
+void ubi_eba_ro_mode(const struct ubi_info *ubi)
+{
+	ubi_bgt_disable(ubi);
+	ubi->io->ro_mode = 1;
+	ubi_warn("switched to read-only mode");
+}
+
+/**
+ * ltree_lookup - look up the lock tree.
+ *
+ * @eba: the EBA unit description data structure
+ * @vol_id: volume ID of the logical eraseblock to look up
+ * @lnum: the logical eraseblock number to look up
+ *
+ * This function returns a pointer to the &struct ubi_eba_ltree_entry object
+ * if the logical eraseblock is locked and %NULL if it is not locked.
+ *
+ * The @eba->ltree_lock has to be locked.
+ *
+ * This is a helper function for the logical eraseblock locking/unlocking
+ * functions.
+ */
+static inline struct ubi_eba_ltree_entry *
+ltree_lookup(struct ubi_eba_info *eba, int vol_id, int lnum)
+{
+	struct rb_node *p;
+
+	p = eba->ltree.rb_node;
+	while (p) {
+		struct ubi_eba_ltree_entry *le;
+
+		le = rb_entry(p, struct ubi_eba_ltree_entry, rb);
+
+		if (vol_id < le->vol_id)
+			p = p->rb_left;
+		else if (vol_id > le->vol_id)
+			p = p->rb_right;
+		else {
+			if (lnum < le->lnum)
+				p = p->rb_left;
+			else if (lnum > le->lnum)
+				p = p->rb_right;
+			else
+				return le;
+		}
+	}
+
+	return NULL;
+}
+
+static struct ubi_eba_ltree_entry *ltree_add_entry(const struct ubi_info *ubi,
+						   int vol_id, int lnum);
+
+int ubi_eba_leb_read_lock(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	struct ubi_eba_ltree_entry *le;
+
+	le = ltree_add_entry(ubi, vol_id, lnum);
+	if (unlikely(IS_ERR(le)))
+		return PTR_ERR(le);
+	down_read(&le->mutex);
+	return 0;
+}
+
+int ubi_eba_leb_write_lock(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	struct ubi_eba_ltree_entry *le;
+
+	le = ltree_add_entry(ubi, vol_id, lnum);
+	if (unlikely(IS_ERR(le)))
+		return PTR_ERR(le);
+	down_write(&le->mutex);
+	return 0;
+}
+
+void ubi_eba_leb_read_unlock(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	int free = 0;
+	struct ubi_eba_ltree_entry *le;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	spin_lock(&eba->ltree_lock);
+	le = ltree_lookup(ubi->eba, vol_id, lnum);
+	le->users -= 1;
+	ubi_assert(le->users >= 0);
+	if (le->users == 0) {
+		rb_erase(&le->rb, &eba->ltree);
+		free = 1;
+	}
+	spin_unlock(&eba->ltree_lock);
+
+	up_read(&le->mutex);
+	if (free)
+		ubi_free_eba_ltree_entry(le);
+}
+
+void ubi_eba_leb_write_unlock(const struct ubi_info *ubi, int vol_id, int lnum)
+{
+	int free;
+	struct ubi_eba_ltree_entry *le;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	spin_lock(&eba->ltree_lock);
+	le = ltree_lookup(ubi->eba, vol_id, lnum);
+	le->users -= 1;
+	ubi_assert(le->users >= 0);
+	if (le->users == 0) {
+		rb_erase(&le->rb, &eba->ltree);
+		free = 1;
+	} else
+		free = 0;
+	spin_unlock(&eba->ltree_lock);
+
+	up_write(&le->mutex);
+	if (free)
+		ubi_free_eba_ltree_entry(le);
+}
+
+void ubi_eba_leb_remap(const struct ubi_info *ubi, int vol_id, int lnum,
+		       int pnum)
+{
+	/* The logical eraseblock is supposed to be locked */
+	paranoid_check_leb_locked(ubi, vol_id, lnum);
+	leb_unmap(ubi, vol_id, lnum);
+	leb_map(ubi, vol_id, lnum, pnum);
+}
+
+static int build_eba_tbl(const struct ubi_info *ubi,
+			 const struct ubi_scan_info *si);
+
+int ubi_eba_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si)
+{
+	int err, sz;
+	struct ubi_eba_info *eba;
+	struct ubi_acc_info *acc = ubi->acc;
+
+	dbg_eba("initialize the EBA unit");
+
+	eba = ubi_kzalloc(sizeof(struct ubi_eba_info));
+	if (!eba)
+		return -ENOMEM;
+	ubi->eba = eba;
+
+	spin_lock_init(&eba->eba_tbl_lock);
+	spin_lock_init(&eba->ltree_lock);
+	eba->ltree = RB_ROOT;
+
+	eba->num_volumes = acc->max_volumes + acc->ivol_count;
+	sz = eba->num_volumes * sizeof(struct ubi_eba_tbl_volume);
+	eba->eba_tbl = ubi_kzalloc(sz);
+	if (!eba->eba_tbl) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = build_eba_tbl(ubi, si);
+	if (err)
+		goto out;
+
+	dbg_eba("the EBA unit is initialized");
+	return 0;
+
+out:
+	ubi_kfree(eba->eba_tbl);
+	ubi_kfree(eba);
+	return err;
+}
+
+void ubi_eba_close(const struct ubi_info *ubi)
+{
+	unsigned int i;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	dbg_eba("close EBA management unit");
+
+
+	for (i = 0; i < eba->num_volumes; i++)
+		ubi_kfree(eba->eba_tbl[i].recs);
+	ubi_kfree(eba->eba_tbl);
+	ubi_kfree(eba);
+}
+
+/**
+ * build_eba_tbl - build the eraseblock association table.
+ *
+ * @ubi: the UBI device description object
+ * @si: scanning info
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int build_eba_tbl(const struct ubi_info *ubi,
+			 const struct ubi_scan_info *si)
+{
+	int i, err, idx;
+	struct ubi_eba_info *eba = ubi->eba;
+	struct ubi_eba_tbl_volume *eba_tbl = eba->eba_tbl;
+
+	for (idx = 0; idx < eba->num_volumes; idx++) {
+		struct rb_node *rb;
+		struct ubi_scan_leb *seb;
+		struct ubi_scan_volume *sv;
+		const struct ubi_vtbl_vtr *vtr;
+		int sz;
+
+		cond_resched();
+
+		vtr = ubi_vtbl_get_vtr(ubi, idx2vol_id(ubi, idx));
+		if (IS_ERR(vtr))
+			continue;
+
+		dbg_eba("found volume %d (idx %d)", idx2vol_id(ubi, idx), idx);
+
+		eba_tbl[idx].leb_count = vtr->reserved_pebs;
+
+		sz = vtr->reserved_pebs * sizeof(struct ubi_eba_tbl_rec);
+		eba_tbl[idx].recs = ubi_kmalloc(sz);
+		if (unlikely(!eba_tbl[idx].recs)) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		for (i = 0; i < vtr->reserved_pebs; i++) {
+			eba->eba_tbl[idx].recs[i].pnum = NOT_MAPPED;
+			eba->eba_tbl[idx].recs[i].leb_ver = 0;
+		}
+
+		sv = ubi_scan_find_sv(si, idx2vol_id(ubi, idx));
+		if (!sv)
+			continue;
+
+		rb_for_each_entry(rb, seb, &sv->root, u.rb) {
+			eba->eba_tbl[idx].recs[seb->lnum].pnum = seb->pnum;
+			eba->eba_tbl[idx].recs[seb->lnum].leb_ver = seb->leb_ver;
+		}
+	}
+
+	return 0;
+
+out:
+	for (i = 0; i < eba->num_volumes; i++)
+		ubi_kfree(eba->eba_tbl[i].recs);
+
+	return err;
+}
+
+/**
+ * ltree_add_entry - add new entry to the lock tree.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: volume ID of the logical eraseblock
+ * @lnum: the logical eraseblock number
+ *
+ * This function adds a new lock tree entry for logical eraseblock
+ * (@vol_id,@lnum). If the corresponding entry is already there, its usage
+ * counter is increased. This function returns a pointer to the lock tree
+ * entry.
+ */
+static struct ubi_eba_ltree_entry *ltree_add_entry(const struct ubi_info *ubi,
+						   int vol_id, int lnum)
+{
+	struct ubi_eba_info *eba = ubi->eba;
+	struct ubi_eba_ltree_entry *le, *le1, *le_free;
+
+	le = ubi_alloc_eba_ltree_entry();
+	if (unlikely(!le))
+		return ERR_PTR(-ENOMEM);
+
+	le->vol_id = vol_id;
+	le->lnum = lnum;
+
+	spin_lock(&eba->ltree_lock);
+	le1 = ltree_lookup(eba, vol_id, lnum);
+
+	if (le1) {
+		/*
+		 * This logical eraseblock is already locked. The newly
+		 * allocated lock entry is not needed.
+		 */
+		le_free = le;
+		le = le1;
+	} else {
+		struct rb_node **p, *parent = NULL;
+
+		/*
+		 * No lock entry, add the newly allocated one to the
+		 * @eba->ltree RB-tree.
+		 */
+		le_free = NULL;
+
+		p = &eba->ltree.rb_node;
+		while (*p) {
+			parent = *p;
+			le1 = rb_entry(parent, struct ubi_eba_ltree_entry, rb);
+
+			if (vol_id < le1->vol_id)
+				p = &(*p)->rb_left;
+			else if (vol_id > le1->vol_id)
+				p = &(*p)->rb_right;
+			else {
+				ubi_assert(lnum != le1->lnum);
+				if (lnum < le1->lnum)
+					p = &(*p)->rb_left;
+				else
+					p = &(*p)->rb_right;
+			}
+		}
+
+		rb_link_node(&le->rb, parent, p);
+		rb_insert_color(&le->rb, &eba->ltree);
+	}
+	le->users += 1;
+	spin_unlock(&eba->ltree_lock);
+
+	if (le_free)
+		ubi_free_eba_ltree_entry(le_free);
+
+	return le;
+}
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_EBA
+
+/**
+ * paranoid_check_leb - check that a logical eraseblock has correct erase
+ * counter and volume identifier headers.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number
+ * @vol_id: the volume ID to check
+ * @lnum: the logical eraseblock number to check
+ * @leb_ver: the logical eraseblock version to check
+ * @vid_hdr: volume identifier header to check
+ *
+ * This function returns zero if the headers are all right, %1 if not, and a
+ * negative error code in case of error.
+ */
+static int paranoid_check_leb(const struct ubi_info *ubi, int pnum, int vol_id,
+			      int lnum, int leb_ver,
+			      const struct ubi_vid_hdr *vid_hdr)
+{
+	int err, hdr_vol_id, hdr_lnum, hdr_leb_ver;
+	struct ubi_ec_hdr *ec_hdr;
+
+	/* Check the EC header */
+	ec_hdr = ubi_zalloc_ec_hdr(ubi);
+	if (unlikely(!ec_hdr))
+		return -ENOMEM;
+
+	err = ubi_io_read_ec_hdr(ubi, pnum, ec_hdr, 1);
+	ubi_free_ec_hdr(ubi, ec_hdr);
+	if (unlikely(err) && err != UBI_IO_BITFLIPS) {
+		if (err < 0)
+			return err;
+		goto fail;
+	}
+
+	hdr_vol_id = ubi32_to_cpu(vid_hdr->vol_id);
+	hdr_lnum = ubi32_to_cpu(vid_hdr->lnum);
+	hdr_leb_ver = ubi32_to_cpu(vid_hdr->leb_ver);
+
+	if (unlikely(vol_id != hdr_vol_id)) {
+		ubi_err("bad vol_id %d, should be %d", hdr_vol_id, vol_id);
+		goto fail;
+	}
+
+	if (unlikely(lnum != hdr_lnum)) {
+		ubi_err("bad lnum %d, should be %d", hdr_lnum, lnum);
+		goto fail;
+	}
+
+	if (unlikely(leb_ver != hdr_leb_ver)) {
+		ubi_err("bad leb_ver %d, should be %d", hdr_leb_ver, leb_ver);
+		goto fail;
+	}
+
+	return 0;
+
+fail:
+	ubi_err("paranoid check failed");
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+/**
+ * paranoid_check_leb_locked - ensure that a logical eraseblock is locked.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID to check
+ * @lnum: the logical eraseblock number to check
+ *
+ * This function returns zero if the logical eraseblock is locked and %1 if
+ * not.
+ */
+static int paranoid_check_leb_locked(const struct ubi_info *ubi, int vol_id,
+				     int lnum)
+{
+	struct ubi_eba_ltree_entry *le;
+	struct ubi_eba_info *eba = ubi->eba;
+
+	spin_lock(&eba->ltree_lock);
+	le = ltree_lookup(ubi->eba, vol_id, lnum);
+	spin_unlock(&eba->ltree_lock);
+	if (likely(le))
+		return 0;
+
+	ubi_err("paranoid check failed");
+	ubi_dbg_dump_stack();
+	return 1;
+}
+
+#endif /* CONFIG_MTD_UBI_DEBUG_PARANOID_EBA */
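
A note for readers following the EBA table indexing in this patch: user volumes
index the table by their own ID, while internal volumes are placed right after
the last user slot. The stand-alone sketch below mirrors vol_id2idx() and
idx2vol_id() from the patch. It is only an illustration: MAX_VOLUMES stands in
for acc->max_volumes, and both constant values are assumptions chosen for the
example, not taken from the UBI headers.

```c
#include <assert.h>

/* Assumed values, for illustration only; the real ones come from UBI. */
#define MAX_VOLUMES            128
#define UBI_INTERNAL_VOL_START (0x7FFFFFFF - 4096)

/* User volumes map to themselves; internal volumes follow them. */
static int vol_id2idx(int vol_id)
{
	if (vol_id >= UBI_INTERNAL_VOL_START)
		return vol_id - UBI_INTERNAL_VOL_START + MAX_VOLUMES;
	return vol_id;
}

/* Inverse mapping: table slots past MAX_VOLUMES are internal volumes. */
static int idx2vol_id(int idx)
{
	if (idx >= MAX_VOLUMES)
		return idx - MAX_VOLUMES + UBI_INTERNAL_VOL_START;
	return idx;
}
```

With these assumed values, the first internal volume lands in slot MAX_VOLUMES,
and converting an ID to an index and back yields the original ID. This is why
the table can be sized as max_volumes plus the internal volume count.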


* [PATCH 27/44 take 2] [UBI] bad block handling unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (25 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 26/44 take 2] [UBI] EBA unit implementation Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 28/44 take 2] [UBI] bad block handling unit implementation Artem Bityutskiy
                   ` (19 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/badeb.h tmp-to/drivers/mtd/ubi/badeb.h
--- tmp-from/drivers/mtd/ubi/badeb.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/badeb.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,108 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * Bad eraseblock handling unit.
+ *
+ * This unit is responsible for marking physical eraseblocks as bad and for
+ * recovering data from supposedly bad physical eraseblocks.
+ */
+
+#ifndef __UBI_BADEB_H__
+#define __UBI_BADEB_H__
+
+#include <linux/spinlock.h>
+
+/**
+ * ubi_beb_mark_bad - mark a physical eraseblock as bad.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock number to mark
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_beb_mark_bad(const struct ubi_info *ubi, int pnum);
+
+/**
+ * ubi_beb_maintain_reserved - maintain a certain level of reserved physical
+ * eraseblocks.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function tries to maintain a fixed number of reserved physical
+ * eraseblocks for bad eraseblock handling. This number is configurable.
+ */
+void ubi_beb_maintain_reserved(const struct ubi_info *ubi);
+
+/**
+ * ubi_beb_recover_peb - recover a physical eraseblock after a write failure.
+ *
+ * @ubi: the UBI device description object
+ * @pnum: the physical eraseblock to recover
+ * @vol_id: volume ID this LEB belongs to
+ * @lnum: logical eraseblock number
+ * @buf: the data which was not written because of a write failure
+ * @offset: offset of the failed write
+ * @len: how many bytes should have been written
+ *
+ * This function has to be called in case of a write failure to move all the
+ * good data from the potentially bad physical eraseblock to a good physical
+ * eraseblock. This function also writes the data which was not written due to
+ * the failure. This function returns the new physical eraseblock number in
+ * case of success, and a negative error code in case of failure.
+ */
+int ubi_beb_recover_peb(const struct ubi_info *ubi, int pnum, int vol_id,
+			int lnum, const void *buf, int offset, int len);
+
+/**
+ * ubi_beb_init - initialize the bad eraseblock handling unit.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success, and a negative error code in
+ * case of failure.
+ */
+int ubi_beb_init(struct ubi_info *ubi);
+
+/**
+ * ubi_beb_close - close the bad eraseblock handling unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_beb_close(struct ubi_info *ubi);
+
+/**
+ * struct ubi_beb_info - UBI bad PEB handling unit description data structure.
+ *
+ * @reserved_pebs: how many physical eraseblocks are reserved for bad PEB
+ * handling
+ * @reserved_max: how many PEBs have to be reserved for bad PEB handling, i.e.,
+ * the normal level of reserved PEBs
+ * @lock: protects @reserved_pebs
+ */
+struct ubi_beb_info {
+	int reserved_pebs; /* public  */
+	int reserved_max;  /* public  */
+	spinlock_t lock;   /* private */
+};
+
+#endif /* !__UBI_BADEB_H__ */
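
The recovery contract documented for ubi_beb_recover_peb() (keep the good
data that precedes the failed write, then append the data that could not be
written) can be modelled in user space. This is only a sketch under stated
assumptions: build_recovery_buf() and its 'old_data' argument are
hypothetical names, and a plain memory array stands in for the flash read
that UBI would do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space model, not UBI code: rebuild the contents of a
 * LEB after a failed write. 'old_data' stands in for the data area of the
 * potentially bad PEB; in UBI this part would be read from flash.
 * Returns a malloc'ed buffer of offset + len bytes, or NULL on failure.
 */
unsigned char *build_recovery_buf(const unsigned char *old_data,
				  const void *buf, int offset, int len)
{
	unsigned char *new_buf;

	assert(offset >= 0 && len > 0);

	new_buf = malloc((size_t)offset + (size_t)len);
	if (!new_buf)
		return NULL;

	/* Everything before the failed write is taken from the old PEB */
	if (offset > 0)
		memcpy(new_buf, old_data, (size_t)offset);

	/* The data that could not be written follows directly */
	memcpy(new_buf + offset, buf, (size_t)len);
	return new_buf;
}
```

The caller owns the returned buffer and has to free() it, which mirrors the
way the kernel code frees its temporary buffer once the data is on the new
PEB.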

* [PATCH 28/44 take 2] [UBI] bad block handling unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (26 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 27/44 take 2] [UBI] bad block handling unit header Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 29/44 take 2] [UBI] update unit header Artem Bityutskiy
                   ` (18 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/badeb.c tmp-to/drivers/mtd/ubi/badeb.c
--- tmp-from/drivers/mtd/ubi/badeb.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/badeb.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,234 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/spinlock.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "badeb.h"
+#include "debug.h"
+#include "io.h"
+#include "eba.h"
+#include "account.h"
+#include "wl.h"
+#include "misc.h"
+
+/* The lowest number of PEBs reserved for bad PEB handling */
+#define MIN_RESEVED_PEBS 1
+
+static void calculate_reserved_max(const struct ubi_info *ubi);
+
+int ubi_beb_mark_bad(const struct ubi_info *ubi, int pnum)
+{
+	int err;
+	struct ubi_beb_info *beb = ubi->beb;
+
+	ubi_assert(ubi->io->bad_allowed);
+
+	ubi_msg("PEB %d went bad, mark it as bad", pnum);
+
+	err = ubi_io_mark_bad(ubi, pnum);
+	if (err)
+		return err;
+
+	spin_lock(&beb->lock);
+	if (beb->reserved_pebs <= 0)
+		ubi_warn("no reserved physical eraseblocks!");
+	ubi->io->bad_peb_count += 1;
+	ubi->io->good_peb_count -= 1;
+	calculate_reserved_max(ubi);
+	beb->reserved_pebs -= 1;
+	if (beb->reserved_pebs < beb->reserved_max && !ubi_acc_reserve(ubi, 1))
+		beb->reserved_pebs += 1;
+	spin_unlock(&beb->lock);
+
+	return err;
+}
+
+void ubi_beb_maintain_reserved(const struct ubi_info *ubi)
+{
+	int err, i, needed;
+	struct ubi_beb_info *beb = ubi->beb;
+
+	if (!ubi->io->bad_allowed)
+		return;
+
+	spin_lock(&beb->lock);
+	needed = beb->reserved_max - beb->reserved_pebs;
+	for (i = needed; i > 0; i--) {
+		err = ubi_acc_reserve(ubi, i);
+		if (!err) {
+			ubi_msg("reserved %d PEBs for bad PEB handling", i);
+			beb->reserved_pebs += i;
+			break;
+		}
+	}
+	spin_unlock(&beb->lock);
+}
+
+int ubi_beb_recover_peb(const struct ubi_info *ubi, int pnum, int vol_id,
+			int lnum, const void *buf, int offset, int len)
+{
+	int err, new_pnum, data_size, tries = 0;
+	struct ubi_vid_hdr *vid_hdr;
+	unsigned char *new_buf;
+
+	ubi_assert(ubi->io->bad_allowed);
+
+retry:
+	new_pnum = ubi_wl_get_peb(ubi, UBI_DATA_UNKNOWN);
+	if (new_pnum < 0)
+		return new_pnum;
+
+	ubi_msg("recover PEB %d, move its data to PEB %d", pnum, new_pnum);
+
+	/* First, recover the VID header */
+
+	vid_hdr = ubi_zalloc_vid_hdr(ubi);
+	if (!vid_hdr) {
+		err = -ENOMEM;
+		goto out_put;
+	}
+
+	err = ubi_io_read_vid_hdr(ubi, pnum, vid_hdr, 1);
+	if (err && err != UBI_IO_BITFLIPS) {
+		if (err > 0)
+			err = -EIO;
+		goto out_vid_hdr;
+	}
+
+	vid_hdr->leb_ver = cpu_to_ubi32(ubi32_to_cpu(vid_hdr->leb_ver) + 1);
+	err = ubi_io_write_vid_hdr(ubi, new_pnum, vid_hdr);
+	if (err)
+		goto vid_write_error;
+
+	/* Now recover the data */
+
+	data_size = offset + len;
+	new_buf = ubi_kmalloc(data_size);
+	if (unlikely(!new_buf)) {
+		err = -ENOMEM;
+		goto out_vid_hdr;
+	}
+	memset(new_buf + offset, 0xFF, len);
+
+	/* Read everything before the area where the write failure happened */
+	if (offset > 0) {
+		err = ubi_io_read_data(ubi, new_buf, pnum, 0, offset);
+		if (err && err != UBI_IO_BITFLIPS)
+			goto out_new_buf;
+	}
+
+	/*
+	 * Now we assume that before the failed write the (offset, offset+len)
+	 * area contained all 0xFF bytes. This is true for NAND. This is not
+	 * always true for NOR, but NOR flash does not have bad PEBs.
+	 */
+	memcpy(new_buf + offset, buf, len);
+
+	err = ubi_io_write_data(ubi, new_buf, new_pnum, 0, data_size);
+	if (err)
+		goto vid_data_write_error;
+
+	ubi_kfree(new_buf);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	ubi_eba_leb_remap(ubi, vol_id, lnum, new_pnum);
+	ubi_wl_put_peb(ubi, pnum, 1);
+	ubi_msg("data was successfully recovered");
+	return 0;
+
+out_new_buf:
+	ubi_kfree(new_buf);
+out_vid_hdr:
+	ubi_free_vid_hdr(ubi, vid_hdr);
+out_put:
+	ubi_wl_put_peb(ubi, new_pnum, 1);
+	return err;
+
+vid_data_write_error:
+	ubi_kfree(new_buf);
+vid_write_error:
+	/*
+	 * Bad luck? This physical eraseblock is bad too? Crud. Let's try to
+	 * get another one.
+	 */
+	ubi_warn("failed to write to PEB %d", new_pnum);
+	ubi_free_vid_hdr(ubi, vid_hdr);
+	ubi_wl_put_peb(ubi, new_pnum, 1);
+	if (++tries > 5)
+		/* We've tried too many times */
+		return err;
+	ubi_msg("try again");
+	goto retry;
+}
+
+int ubi_beb_init(struct ubi_info *ubi)
+{
+	int i;
+	struct ubi_beb_info *beb;
+	const struct ubi_io_info *io = ubi->io;
+
+	beb = ubi_kzalloc(sizeof(struct ubi_beb_info));
+	if (!beb)
+		return -ENOMEM;
+	ubi->beb = beb;
+
+	spin_lock_init(&beb->lock);
+
+	if (!io->bad_allowed)
+		return 0;
+
+	calculate_reserved_max(ubi);
+
+	for (i = beb->reserved_max; i > 0; i--)
+		if (!ubi_acc_reserve(ubi, i)) {
+			beb->reserved_pebs = i;
+			break;
+		}
+
+	if (beb->reserved_pebs < beb->reserved_max)
+		/* Not enough free physical eraseblocks */
+		ubi_warn("cannot reserve enough PEBs");
+
+	return 0;
+}
+
+void ubi_beb_close(struct ubi_info *ubi)
+{
+	ubi_kfree(ubi->beb);
+}
+
+/**
+ * calculate_reserved_max - calculate how many PEBs must be reserved for bad
+ * eraseblock handling.
+ *
+ * @ubi: the UBI device description object
+ */
+static void calculate_reserved_max(const struct ubi_info *ubi)
+{
+	struct ubi_beb_info *beb = ubi->beb;
+
+	/* Reserve some amount of PEBs for bad PEB handling */
+	beb->reserved_max = ubi->io->good_peb_count/100;
+	beb->reserved_max *= CONFIG_MTD_UBI_BEB_RESERVE;
+	if (beb->reserved_max < MIN_RESEVED_PEBS)
+		beb->reserved_max = MIN_RESEVED_PEBS;
+}
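
For readers following the arithmetic: the reserve level above is computed by
dividing the good PEB count by 100 first and multiplying afterwards, and the
init and maintain paths then try to reserve the whole shortfall before
falling back to smaller amounts. A minimal user-space sketch of both steps
follows. The model_* names are hypothetical, and the sketch assumes that
CONFIG_MTD_UBI_BEB_RESERVE is a percentage, as the division by 100 suggests.

```c
#include <assert.h>

/* Assumed stand-in for the minimum defined at the top of badeb.c */
#define MODEL_MIN_RESERVED_PEBS 1

/*
 * Hypothetical model of the reserve-level calculation: divide by 100 first,
 * then multiply by the configured percentage, and never go below the minimum.
 */
int model_reserved_max(int good_peb_count, int reserve_percent)
{
	int reserved_max;

	assert(good_peb_count >= 0 && reserve_percent >= 0);

	reserved_max = good_peb_count / 100;
	reserved_max *= reserve_percent;
	if (reserved_max < MODEL_MIN_RESERVED_PEBS)
		reserved_max = MODEL_MIN_RESERVED_PEBS;
	return reserved_max;
}

/*
 * Hypothetical model of the "reserve as many as possible" loop: try the full
 * shortfall first, then progressively smaller amounts. Returns how many PEBs
 * were taken from *avail_pebs.
 */
int model_top_up(int *avail_pebs, int needed)
{
	int i;

	for (i = needed; i > 0; i--) {
		if (i <= *avail_pebs) {
			*avail_pebs -= i;
			return i;
		}
	}
	return 0;
}
```

Because the division comes first, the remainder is dropped before the
multiplication: in this model 4096 good PEBs at 1 percent give 40, and 250
good PEBs at 2 percent give 4 rather than 5.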

* [PATCH 29/44 take 2] [UBI] update unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (27 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 28/44 take 2] [UBI] bad block handling unit implementation Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:56 ` [PATCH 30/44 take 2] [UBI] update unit implementation Artem Bityutskiy
                   ` (17 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/upd.h tmp-to/drivers/mtd/ubi/upd.h
--- tmp-from/drivers/mtd/ubi/upd.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/upd.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,136 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ * Copyright (C) Nokia Corporation, 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * The update unit.
+ *
+ * This unit implements the volume update operation. In the current
+ * implementation we use an update marker for this. The update marker is
+ * per-volume and is stored in the volume table. The update marker is set
+ * before the update starts, and removed after the update has been finished.
+ * So, if the update was interrupted by an unclean reboot or for some other
+ * reason, the update marker is found and we know that the volume is damaged.
+ *
+ * Note, in general it is possible to implement the update operation as a
+ * transaction with a possibility to roll-back. But this is far more complex.
+ *
+ * This implementation does not support concurrent updates, but it would not
+ * be difficult to add them.
+ */
+
+#ifndef __UBI_UPD_H__
+#define __UBI_UPD_H__
+
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+struct ubi_info;
+struct ubi_scan_info;
+
+/**
+ * ubi_upd_start - start update operation.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: volume ID to start update for
+ * @bytes: how many bytes will be written to the volume
+ *
+ * This function starts a volume update operation. If @bytes is zero, the
+ * volume will just be fully erased. This function returns zero in case of
+ * success and a negative error code in case of error.
+ *
+ * Note, this function does not check if the volume is being used by some other
+ * user. The (upper) calling layer has to do this.
+ */
+int ubi_upd_start(const struct ubi_info *ubi, int vol_id, long long bytes);
+
+/**
+ * ubi_upd_write_data - write more data.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume under update
+ * @buf: the data to write
+ * @count: how many bytes to write
+ *
+ * This function writes more data to the volume which is being updated. It may
+ * be called an arbitrary number of times until all of the update bytes arrive.
+ * This function returns %0 in case of success, %1 if the update was
+ * successfully finished, and a negative error code in case of failure.
+ */
+int ubi_upd_write_data(const struct ubi_info *ubi, int vol_id,
+		       const void __user *buf, int count);
+
+/**
+ * ubi_upd_abort - abort an update operation.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to abort update operation for
+ *
+ * This function aborts an update operation. Returns zero in case of success
+ * and a negative error code in case of failure.
+ *
+ * If an update operation is aborted, the update marker stays on the flash
+ * media. A new successful update operation is required to make this volume
+ * usable.
+ */
+int ubi_upd_abort(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_upd_init_scan - initialize the update volume unit using scanning
+ * information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success, and a negative error code in
+ * case of failure.
+ */
+int ubi_upd_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si);
+
+/**
+ * ubi_upd_close - close the update volume unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_upd_close(const struct ubi_info *ubi);
+
+/**
+ * struct ubi_upd_info - UBI update unit description data structure.
+ *
+ * @updating: if any volume is being updated at the moment
+ * @vol_id: which volume utilizes the update marker at the moment
+ * @upd_ebs: how many eraseblocks are going to be updated
+ * @upd_received: how many bytes were already received by the update unit
+ * @upd_bytes: how many more bytes are expected to be received
+ * @upd_buf: a buffer which is used to collect update data during the update
+ * operation
+ * @mutex: serializes access to the volume update capability
+ */
+struct ubi_upd_info {
+	int updating;                        /* public  */
+	int vol_id;                          /* private */
+	int upd_ebs;                         /* private */
+	long long upd_received;              /* private */
+	long long upd_bytes;                 /* private */
+	void *upd_buf;                       /* private */
+	struct mutex mutex;                  /* private */
+};
+
+#endif /* !__UBI_UPD_H__ */
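
The life cycle of the update marker described in this header (set before the
update, cleared after the last byte, left in place on abort) may be easier
to follow as a small state model. The sketch below is a user-space
approximation, not the UBI implementation: struct upd_model and the model_*
functions are hypothetical, an int field stands in for the on-flash marker,
and -1 replaces the real negative error codes.

```c
#include <assert.h>

/* Hypothetical user-space model of the update marker life cycle */
struct upd_model {
	int updating;            /* an update is in progress */
	int upd_marker;          /* stands in for the on-flash update marker */
	long long upd_bytes;     /* total bytes expected */
	long long upd_received;  /* bytes received so far */
};

/* Start an update: returns 0 on success, -1 if one is already running */
int model_upd_start(struct upd_model *m, long long bytes)
{
	assert(bytes >= 0);
	if (m->updating)
		return -1;
	m->upd_marker = 1;          /* set before any data arrives */
	if (bytes == 0) {
		m->upd_marker = 0;  /* nothing to write: volume is just erased */
		return 0;
	}
	m->updating = 1;
	m->upd_bytes = bytes;
	m->upd_received = 0;
	return 0;
}

/* Feed data: returns 0 if more is expected, 1 when finished, -1 on misuse */
int model_upd_write(struct upd_model *m, long long count)
{
	if (!m->updating)
		return -1;
	if (m->upd_received + count > m->upd_bytes)
		count = m->upd_bytes - m->upd_received;
	m->upd_received += count;
	if (m->upd_received == m->upd_bytes) {
		m->updating = 0;
		m->upd_marker = 0;  /* removed only after the last byte */
		return 1;
	}
	return 0;
}

/* Abort: the marker deliberately stays set, so the volume reads as damaged */
int model_upd_abort(struct upd_model *m)
{
	if (!m->updating)
		return -1;
	m->updating = 0;
	return 0;
}
```

In this model an aborted update leaves upd_marker at 1 until a later update
runs to completion, which matches the note on ubi_upd_abort() that a new
successful update is required to make the volume usable.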

* [PATCH 30/44 take 2] [UBI] update unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (28 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 29/44 take 2] [UBI] update unit header Artem Bityutskiy
@ 2007-02-17 16:56 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 31/44 take 2] [UBI] accounting unit header Artem Bityutskiy
                   ` (16 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/upd.c tmp-to/drivers/mtd/ubi/upd.c
--- tmp-from/drivers/mtd/ubi/upd.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/upd.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,389 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ * Copyright (C) Nokia Corporation, 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ *
+ * Jan 2007: Alexander Schmidt, hacked per-volume update.
+ */
+
+#include <linux/mutex.h>
+#include <linux/err.h>
+#include <linux/types.h>
+#include <linux/sched.h>
+#include <asm/uaccess.h>
+#include <asm/div64.h>
+#include "ubi.h"
+#include "upd.h"
+#include "wl.h"
+#include "vtbl.h"
+#include "io.h"
+#include "eba.h"
+#include "misc.h"
+#include "account.h"
+#include "alloc.h"
+#include "scan.h"
+#include "debug.h"
+
+static int ubi_wipe_out_volume(const struct ubi_info *ubi, int vol_id);
+
+int ubi_upd_start(const struct ubi_info *ubi, int vol_id, long long bytes)
+{
+	int err, rem;
+	uint64_t tmp;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_upd_info *upd = ubi->upd;
+
+	dbg_upd("start update of volume %d, %lld bytes", vol_id, bytes);
+
+	ubi_assert(vol_id >= 0 && vol_id < ubi->acc->max_volumes);
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+	ubi_assert(bytes >= 0 &&
+		   bytes <= vtr->usable_leb_size * vtr->reserved_pebs);
+
+	mutex_lock(&upd->mutex);
+
+	if (upd->updating == 1) {
+		dbg_err("volume %d is being updated", upd->vol_id);
+		mutex_unlock(&upd->mutex);
+		return -EBUSY;
+	}
+
+	upd->updating = 1;
+	upd->vol_id = vol_id;
+
+	/* Set the update marker first */
+	err = ubi_vtbl_set_upd_marker(ubi, vol_id);
+	if (err)
+		goto out_unlock;
+
+	/* Before updating, we wipe out the volume */
+	err = ubi_wipe_out_volume(ubi, vol_id);
+	if (err)
+		goto out_unlock;
+
+	if (bytes == 0) {
+		/* Zero bytes means the volume just has to be erased */
+		err = ubi_vtbl_clear_upd_marker(ubi, vol_id, 0);
+		goto out_unlock;
+	}
+
+	tmp = bytes;
+	rem = do_div(tmp, vtr->usable_leb_size);
+	upd->upd_ebs = tmp + !!rem;
+	upd->upd_bytes = bytes;
+	upd->upd_received = 0;
+
+	mutex_unlock(&upd->mutex);
+	return 0;
+
+out_unlock:
+	upd->updating = 0;
+	mutex_unlock(&upd->mutex);
+	return err;
+}
+
+static int write_leb(const struct ubi_info *ubi, int vol_id, int lnum,
+		     void *buf, int len, int used_ebs);
+
+int ubi_upd_write_data(const struct ubi_info *ubi, int vol_id,
+		       const void __user *buf, int count)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_upd_info *upd = ubi->upd;
+	int lnum, offs, err = 0, len;
+	uint64_t tmp;
+
+	dbg_upd("write %d bytes requested", count);
+
+	ubi_assert(vol_id >= 0 && vol_id < ubi->acc->max_volumes);
+	ubi_assert(count >= 0);
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (unlikely(count == 0))
+		return 0;
+
+	mutex_lock(&upd->mutex);
+
+	if (unlikely(!upd->updating || upd->vol_id != vol_id)) {
+		dbg_err("volume %d update was not started", vol_id);
+		err = -EINVAL;
+		goto out_unlock;
+	}
+
+	ubi_assert(upd->upd_received >= 0 &&
+		   upd->upd_received < upd->upd_bytes);
+	tmp = upd->upd_received;
+	offs = do_div(tmp, vtr->usable_leb_size);
+	lnum = tmp;
+
+	if (upd->upd_received + count > upd->upd_bytes)
+		count = upd->upd_bytes - upd->upd_received;
+
+	/*
+	 * When updating volumes, we accumulate a whole eraseblock and write
+	 * it at once.
+	 */
+
+	if (offs != 0) {
+
+		/*
+		 * This is a write to the middle of the logical eraseblock. We
+		 * copy the data to our update buffer and wait for more data or
+		 * flush it (if the whole eraseblock is written or the update
+		 * is finished).
+		 */
+
+		len = vtr->usable_leb_size - offs;
+		if (len > count)
+			len = count;
+
+		dbg_upd("copy %d more bytes of data", len);
+
+		err = copy_from_user(upd->upd_buf + offs, buf, len);
+		if (err) {
+			dbg_err("memory access error");
+			err = -EFAULT;
+			goto out_unlock;
+		}
+
+		if (offs + len == vtr->usable_leb_size ||
+		    upd->upd_received + len == upd->upd_bytes) {
+			int flush_len = offs + len;
+
+			/*
+			 * OK, we gathered either the whole eraseblock or this
+			 * is the last chunk, it's time to flush our buffer.
+			 */
+
+			ubi_assert(flush_len <= vtr->usable_leb_size);
+
+			err = write_leb(ubi, vol_id, lnum, upd->upd_buf,
+					flush_len, upd->upd_ebs);
+			if (err)
+				goto out_unlock;
+		}
+
+		upd->upd_received += len;
+		count -= len;
+		buf += len;
+		lnum += 1;
+	}
+
+	/*
+	 * If we've got more to write, let's continue. At this point we know we
+	 * are starting from the beginning of an eraseblock.
+	 */
+
+	while (count) {
+		cond_resched();
+
+		if (count > vtr->usable_leb_size)
+			len = vtr->usable_leb_size;
+		else
+			len = count;
+
+		dbg_upd("copy %d bytes of user data", len);
+		err = copy_from_user(upd->upd_buf, buf, len);
+		if (err) {
+			dbg_err("memory access error");
+			err = -EFAULT;
+			goto out_unlock;
+		}
+
+		if (len == vtr->usable_leb_size ||
+		    upd->upd_received + len == upd->upd_bytes) {
+			err = write_leb(ubi, vol_id, lnum, upd->upd_buf, len,
+					upd->upd_ebs);
+			if (unlikely(err))
+				break;
+		}
+
+		upd->upd_received += len;
+		count -= len;
+		lnum += 1;
+		buf += len;
+	}
+
+	ubi_assert(upd->upd_received <= upd->upd_bytes);
+	if (upd->upd_received == upd->upd_bytes) {
+		/* The update is finished, clear the update marker */
+		upd->updating = 0;
+		err = ubi_vtbl_clear_upd_marker(ubi, vol_id, upd->upd_bytes);
+		if (err == 0)
+			err = 1;
+	}
+
+out_unlock:
+	mutex_unlock(&upd->mutex);
+	return err;
+}
+
+int ubi_upd_abort(const struct ubi_info *ubi, int vol_id)
+{
+	int err = 0;
+	struct ubi_upd_info *upd = ubi->upd;
+
+	mutex_lock(&upd->mutex);
+	if (upd->updating && upd->vol_id == vol_id) {
+		dbg_upd("aborting volume %d update - it is damaged from now on",
+			vol_id);
+		upd->updating = 0;
+	} else {
+		dbg_upd("volume %d is not under update", vol_id);
+		err = -EINVAL;
+	}
+	mutex_unlock(&upd->mutex);
+
+	return err;
+}
+
+int ubi_upd_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si)
+{
+	int err;
+	struct ubi_upd_info *upd;
+
+	dbg_upd("initialize the update unit");
+
+	upd = ubi_kzalloc(sizeof(struct ubi_upd_info));
+	if (!upd)
+		return -ENOMEM;
+	ubi->upd = upd;
+
+	upd->upd_buf = ubi_kmalloc(ubi->io->leb_size);
+	if (!upd->upd_buf) {
+		err = -ENOMEM;
+		goto out_free_upd;
+	}
+
+	mutex_init(&upd->mutex);
+
+	dbg_upd("the update unit is initialized");
+
+	return 0;
+
+out_free_upd:
+	ubi_kfree(upd);
+	return err;
+}
+
+void ubi_upd_close(const struct ubi_info *ubi)
+{
+	dbg_upd("close the update unit");
+	ubi_kfree(ubi->upd->upd_buf);
+	ubi_kfree(ubi->upd);
+}
+
+/**
+ * ubi_wipe_out_volume - wipe out a UBI volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to wipe out
+ *
+ * This function erases all the volume's eraseblocks. Returns zero in case of
+ * success, and a negative error code in case of failure.
+ */
+static int ubi_wipe_out_volume(const struct ubi_info *ubi, int vol_id)
+{
+	int i, err;
+	const struct ubi_vtbl_vtr *vtr;
+
+	ubi_assert(vol_id >= 0 && vol_id < ubi->acc->max_volumes);
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	for (i = 0; i < vtr->reserved_pebs; i++) {
+		cond_resched();
+
+		err = ubi_eba_erase_leb(ubi, vol_id, i);
+		if (unlikely(err))
+			return err;
+	}
+
+	err = ubi_wl_flush(ubi);
+	return err;
+}
+
+/**
+ * write_leb - write a portion of update data to a logical eraseblock.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: the volume ID to write to
+ * @lnum: the logical eraseblock number to write
+ * @buf: the data to write
+ * @len: how many bytes to write
+ * @used_ebs: how many logical eraseblocks will this volume contain
+ *
+ * This function writes a portion of update data to the corresponding logical
+ * eraseblock. If a dynamic volume is being updated, this function checks if
+ * the data contains 0xFF bytes at the end. If yes, the 0xFF bytes are not
+ * written. If the whole buffer contains only 0xFF bytes, the LEB is left
+ * unmapped.
+ *
+ * All-0xFF LEBs are skipped in case of dynamic volumes because writing 0xFF
+ * bytes may have side effects and the PEB may not be writable anymore. In
+ * case of static volumes the 0xFF bytes are kept because static volumes are
+ * treated specially in UBI: a per-LEB CRC is calculated and checked, and the
+ * presence of all LEBs is taken care of. So we can neither cut the amount of
+ * data written to the LEBs of a static volume nor skip some of its LEBs
+ * entirely.
+ */
+static int write_leb(const struct ubi_info *ubi, int vol_id, int lnum,
+		     void *buf, int len, int used_ebs)
+{
+	int err;
+	const struct ubi_vtbl_vtr *vtr;
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (vtr->vol_type == UBI_DYNAMIC_VOLUME) {
+		int l;
+
+		ubi_assert(len == vtr->usable_leb_size);
+
+		/* This is dynamic volume - skip the ending 0xFFs */
+		l = ubi_calc_data_len(ubi, buf, len);
+		if (l == 0) {
+			dbg_upd("LEB %d:%d contains only 0xFF bytes - skip",
+				vol_id, lnum);
+			return 0;
+		}
+		if (len != l)
+			dbg_upd("skip last %d bytes of data for LEB %d:%d",
+				len - l, vol_id, lnum);
+
+		err = ubi_eba_write_leb(ubi, vol_id, lnum, buf, 0, l,
+					UBI_DATA_UNKNOWN);
+	} else {
+		/*
+		 * When writing to static volumes, and this is the last logical
+		 * eraseblock, the length (@len) does not have to be aligned to
+		 * the minimal flash I/O unit. The 'ubi_eba_write_leb_st()'
+		 * function needs the exact (unaligned) length to store in the
+		 * VID header. And it will take care of proper alignment by
+		 * padding the buffer. Here we just make sure the padding will
+		 * contain zeros, not random trash.
+		 */
+		memset(buf + len, 0, vtr->usable_leb_size - len);
+		err = ubi_eba_write_leb_st(ubi, vol_id, lnum, buf, len,
+					   UBI_DATA_UNKNOWN, used_ebs);
+	}
+
+	return err;
+}
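
Three small calculations carry most of the logic in this file: rounding the
update size up to whole LEBs, splitting the received byte count into a LEB
number and an offset, and dropping trailing 0xFF bytes for dynamic volumes.
The sketch below shows them with plain C division instead of do_div(). All
model_* names are hypothetical. In particular, model_trim_ff() is only an
assumption about what ubi_calc_data_len() does, since that function is not
part of this patch and may, for example, also round the result to the flash
I/O unit.

```c
#include <assert.h>

/* Number of LEBs needed for 'bytes' of data: division rounded up */
int model_upd_ebs(long long bytes, int leb_size)
{
	assert(bytes >= 0 && leb_size > 0);
	return (int)(bytes / leb_size) + (bytes % leb_size != 0);
}

/* Split a byte position into a LEB number and an offset inside that LEB */
void model_position(long long received, int leb_size, int *lnum, int *offs)
{
	assert(received >= 0 && leb_size > 0);
	*lnum = (int)(received / leb_size);
	*offs = (int)(received % leb_size);
}

/*
 * Hypothetical helper: length of 'buf' once trailing 0xFF bytes are dropped.
 * Returns 0 when the whole buffer is 0xFF, in which case the LEB would be
 * left unmapped.
 */
int model_trim_ff(const unsigned char *buf, int len)
{
	while (len > 0 && buf[len - 1] == 0xFF)
		len--;
	return len;
}
```

A non-zero offset from model_position() corresponds to the "write to the
middle of the logical eraseblock" branch in ubi_upd_write_data(), where data
is appended to the update buffer instead of starting a new LEB.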

* [PATCH 31/44 take 2] [UBI] accounting unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (29 preceding siblings ...)
  2007-02-17 16:56 ` [PATCH 30/44 take 2] [UBI] update unit implementation Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 32/44 take 2] [UBI] accounting unit implementation Artem Bityutskiy
                   ` (15 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/account.h tmp-to/drivers/mtd/ubi/account.h
--- tmp-from/drivers/mtd/ubi/account.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/account.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,118 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI accounting unit.
+ *
+ * This unit is responsible for maintaining the correct physical eraseblock
+ * accounting to prevent overcommitment.
+ */
+
+#ifndef __UBI_ACCOUNT_H__
+#define __UBI_ACCOUNT_H__
+
+#include <linux/spinlock.h>
+
+struct ubi_info;
+struct ubi_scan_info;
+
+/**
+ * ubi_acc_mkvol - account creation of a volume.
+ *
+ * @ubi: the UBI device description object
+ * @reserved_pebs: how many eraseblocks are reserved for the volume
+ *
+ * This function reserves @reserved_pebs physical eraseblocks for the newly
+ * created volume. Returns zero in case of success and %-ENOSPC if there are
+ * not enough physical eraseblocks.
+ */
+int ubi_acc_mkvol(const struct ubi_info *ubi, int reserved_pebs);
+
+/**
+ * ubi_acc_rmvol - account removal of a volume.
+ *
+ * @ubi: the UBI device description object
+ * @reserved_pebs: how many eraseblocks were reserved for the volume
+ *
+ * This function reclaims the physical eraseblocks occupied by a volume. Note,
+ * UBI tries to maintain a constant level of physical eraseblocks reserved
+ * for bad PEB handling. So, if there is a shortage of reserved physical
+ * eraseblocks, this function will reserve them at once.
+ */
+void ubi_acc_rmvol(const struct ubi_info *ubi, int reserved_pebs);
+
+/**
+ * ubi_acc_reserve - reserve a number of physical eraseblocks.
+ *
+ * @ubi: the UBI device description object
+ * @pebs: how many physical eraseblocks to reserve
+ *
+ * This function returns zero in case of success and %-ENOSPC if there are not
+ * enough physical eraseblocks.
+ */
+int ubi_acc_reserve(const struct ubi_info *ubi, int pebs);
+
+/**
+ * ubi_acc_free - free a number of reserved physical eraseblocks.
+ *
+ * @ubi: the UBI device description object
+ * @pebs: how many physical eraseblocks to free
+ */
+void ubi_acc_free(const struct ubi_info *ubi, int pebs);
+
+/**
+ * ubi_acc_init_scan - initialize the accounting unit using scanning
+ * information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_acc_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si);
+
+/**
+ * ubi_acc_close - close the accounting unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_acc_close(const struct ubi_info *ubi);
+
+/**
+ * struct ubi_acc_info - the UBI accounting unit's description data structure.
+ *
+ * @ivol_count: count of internal volumes
+ * @uvol_count: count of user volumes
+ * @rsvd_pebs: count of reserved physical eraseblocks
+ * @avail_pebs: count of available physical eraseblocks
+ * @max_volumes: maximum number of volumes that users may create
+ * @lock: protects the accounting data
+ */
+struct ubi_acc_info {
+	int ivol_count;  /* public  */
+	int uvol_count;  /* public  */
+	int rsvd_pebs;   /* public  */
+	int avail_pebs;  /* public  */
+	int max_volumes; /* public  */
+	spinlock_t lock; /* private */
+};
+
+#endif /* !__UBI_ACCOUNT_H__ */
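
The accounting described in this header boils down to moving PEBs between
two counters so that their sum never changes. A minimal user-space sketch of
reserve and free follows. struct acc_model and the model_* functions are
hypothetical, and the spinlock that protects the real counters is left out
because the model is single-threaded.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the two counters the accounting unit balances */
struct acc_model {
	int rsvd_pebs;   /* PEBs handed out to volumes or other units */
	int avail_pebs;  /* PEBs still free to be reserved */
};

/* Move 'pebs' from available to reserved, or fail without side effects */
int model_acc_reserve(struct acc_model *acc, int pebs)
{
	assert(pebs > 0);
	if (pebs > acc->avail_pebs)
		return -ENOSPC;
	acc->avail_pebs -= pebs;
	acc->rsvd_pebs += pebs;
	return 0;
}

/* Move 'pebs' back from reserved to available */
void model_acc_free(struct acc_model *acc, int pebs)
{
	assert(pebs > 0 && pebs <= acc->rsvd_pebs);
	acc->rsvd_pebs -= pebs;
	acc->avail_pebs += pebs;
}
```

Since a failed reservation changes nothing, a caller can retry with a
smaller amount, which is the pattern the bad eraseblock handling unit uses
when it tops up its reserve.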

* [PATCH 32/44 take 2] [UBI] accounting unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (30 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 31/44 take 2] [UBI] accounting unit header Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 33/44 take 2] [UBI] volume management unit header Artem Bityutskiy
                   ` (14 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/account.c tmp-to/drivers/mtd/ubi/account.c
--- tmp-from/drivers/mtd/ubi/account.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/account.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,286 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/sched.h>
+#include <linux/string.h>
+#include <linux/spinlock.h>
+#include <linux/err.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "vtbl.h"
+#include "account.h"
+#include "scan.h"
+#include "alloc.h"
+#include "badeb.h"
+#include "io.h"
+#include "debug.h"
+
+int ubi_acc_mkvol(const struct ubi_info *ubi, int reserved_pebs)
+{
+	struct ubi_acc_info *acc = ubi->acc;
+
+	dbg_acc("reserve %d PEBs for a new volume (uvol_count %d, "
+		"rsvd_pebs %d, avail_pebs %d)", reserved_pebs,
+		acc->uvol_count, acc->rsvd_pebs, acc->avail_pebs);
+	ubi_assert(reserved_pebs > 0);
+
+	spin_lock(&acc->lock);
+	if (acc->uvol_count + 1 > acc->max_volumes) {
+		dbg_err("no room for the volume");
+		goto out;
+	}
+	if (reserved_pebs > acc->avail_pebs) {
+		dbg_err("not enough PEBs");
+		goto out;
+	}
+	acc->uvol_count += 1;
+	acc->avail_pebs -= reserved_pebs;
+	acc->rsvd_pebs += reserved_pebs;
+	ubi_assert(acc->avail_pebs >= 0);
+	spin_unlock(&acc->lock);
+	return 0;
+
+out:
+	spin_unlock(&acc->lock);
+	return -ENOSPC;
+}
+
+void ubi_acc_rmvol(const struct ubi_info *ubi, int reserved_pebs)
+{
+	struct ubi_acc_info *acc = ubi->acc;
+
+	dbg_acc("remove volume and get back %d PEBs (uvol_count %d, "
+		"rsvd_pebs %d, avail_pebs %d)", reserved_pebs,
+		acc->uvol_count, acc->rsvd_pebs, acc->avail_pebs);
+	ubi_assert(reserved_pebs > 0 && reserved_pebs <= acc->rsvd_pebs);
+
+	spin_lock(&acc->lock);
+	acc->uvol_count -= 1;
+	acc->avail_pebs += reserved_pebs;
+	acc->rsvd_pebs -= reserved_pebs;
+	ubi_assert(acc->uvol_count >= 0);
+	ubi_assert(acc->rsvd_pebs >= 0);
+	spin_unlock(&acc->lock);
+
+	/* Take care of the PEBs reserved for bad PEB handling */
+	ubi_beb_maintain_reserved(ubi);
+}
+
+int ubi_acc_reserve(const struct ubi_info *ubi, int pebs)
+{
+	struct ubi_acc_info *acc = ubi->acc;
+
+	dbg_acc("reserve %d PEBs (rsvd_pebs %d, avail_pebs %d)",
+		pebs, acc->rsvd_pebs, acc->avail_pebs);
+	ubi_assert(pebs > 0);
+
+	spin_lock(&acc->lock);
+	if (unlikely(pebs > acc->avail_pebs)) {
+		dbg_err("not enough PEBs");
+		spin_unlock(&acc->lock);
+		return -ENOSPC;
+	}
+	acc->avail_pebs -= pebs;
+	acc->rsvd_pebs += pebs;
+	spin_unlock(&acc->lock);
+	return 0;
+}
+
+void ubi_acc_free(const struct ubi_info *ubi, int pebs)
+{
+	struct ubi_acc_info *acc = ubi->acc;
+
+	dbg_acc("free %d PEBs (rsvd_pebs %d, avail_pebs %d)",
+		pebs, acc->rsvd_pebs, acc->avail_pebs);
+	spin_lock(&acc->lock);
+	ubi_assert(pebs > 0 && pebs <= acc->rsvd_pebs);
+	acc->rsvd_pebs -= pebs;
+	acc->avail_pebs += pebs;
+	spin_unlock(&acc->lock);
+}
+
+static int acc_info_check(const struct ubi_info *ubi,
+			  const struct ubi_scan_info *si);
+
+int ubi_acc_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si)
+{
+	int err, i;
+	struct ubi_acc_info *acc;
+	const struct ubi_vtbl_vtr *vtr;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	const struct ubi_io_info *io = ubi->io;
+
+	dbg_acc("initialize the accounting unit");
+
+	acc = ubi_kzalloc(sizeof(struct ubi_acc_info));
+	if (!acc)
+		return -ENOMEM;
+	ubi->acc = acc;
+
+	spin_lock_init(&acc->lock);
+	acc->ivol_count = UBI_INT_VOL_COUNT;
+
+	for (i = 0; i < acc->ivol_count; i++) {
+		cond_resched();
+		vtr = ubi_vtbl_get_vtr(ubi, UBI_INTERNAL_VOL_START + i);
+		ubi_assert(!IS_ERR(vtr));
+		acc->rsvd_pebs += vtr->reserved_pebs;
+	}
+
+	/*
+	 * The maximum number of volumes may be less than the number of volume
+	 * table slots if there are too few available eraseblocks on the flash.
+	 */
+	acc->max_volumes = vtbl->vt_slots;
+	i = io->good_peb_count - acc->rsvd_pebs;
+	if (i <= 0) {
+		ubi_err("too small flash, at least %d good physical "
+			"eraseblocks needed", acc->rsvd_pebs + 1);
+		err = -EINVAL;
+		goto out_acc;
+	}
+
+	if (acc->max_volumes > i)
+		acc->max_volumes = i;
+
+	for (i = 0; i < acc->max_volumes; i++) {
+		cond_resched();
+		vtr = ubi_vtbl_get_vtr(ubi, i);
+		if (IS_ERR(vtr))
+			continue;
+		acc->uvol_count += 1;
+		acc->rsvd_pebs += vtr->reserved_pebs;
+	}
+
+	acc->rsvd_pebs += si->alien_peb_count;
+	acc->avail_pebs = io->good_peb_count - acc->rsvd_pebs;
+
+	/* Check accounting information sanity and consistency */
+	err = acc_info_check(ubi, si);
+	if (err)
+		goto out_acc;
+
+	dbg_acc("uvol_count %d, ivol_count %d, avail_pebs %d, rsvd_pebs %d, "
+		"max_volumes %d", acc->uvol_count, acc->ivol_count,
+		acc->avail_pebs, acc->rsvd_pebs, acc->max_volumes);
+	return 0;
+
+out_acc:
+	ubi_kfree(acc);
+	return err;
+}
+
+void ubi_acc_close(const struct ubi_info *ubi)
+{
+	dbg_acc("close the accounting unit");
+	ubi_kfree(ubi->acc);
+}
+
+/**
+ * acc_info_check - check sanity and consistency of accounting information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information, which must be consistent with
+ * the accounting information
+ *
+ * As we try not to trust the data we read from the flash media, we have to
+ * check that the accounting information is sane and consistent, as it is
+ * formed using on-flash information. This function returns zero if all is fine
+ * and a negative error code if some inconsistency was found.
+ */
+static int acc_info_check(const struct ubi_info *ubi,
+			  const struct ubi_scan_info *si)
+{
+	int i;
+	const struct ubi_vtbl_vtr *vtr;
+	const struct ubi_acc_info *acc = ubi->acc;
+	const struct ubi_vtbl_info *vtbl = ubi->vtbl;
+	const struct ubi_io_info *io = ubi->io;
+
+	if (acc->avail_pebs < 0 || acc->rsvd_pebs < 0 || acc->uvol_count < 0 ||
+	    acc->ivol_count < 0) {
+		dbg_err("negative values");
+		goto bad;
+	}
+
+	if (acc->avail_pebs > io->good_peb_count) {
+		dbg_err("bad avail_pebs");
+		goto bad;
+	}
+
+	if (acc->rsvd_pebs > io->good_peb_count) {
+		dbg_err("bad rsvd_pebs");
+		goto bad;
+	}
+
+	if (acc->avail_pebs + acc->rsvd_pebs != io->good_peb_count) {
+		dbg_err("accounting error");
+		goto bad;
+	}
+
+	if (acc->max_volumes > vtbl->vt_slots) {
+		dbg_err("bad max_volumes");
+		goto bad;
+	}
+
+	if (acc->ivol_count + acc->uvol_count > acc->max_volumes) {
+		dbg_err("vol. count (%d + %d) > max_volumes",
+			acc->ivol_count, acc->uvol_count);
+		goto bad;
+	}
+
+	/*
+	 * Ensure that no volumes with IDs at or above acc->max_volumes
+	 * exist.
+	 */
+	for (i = acc->max_volumes; i < vtbl->vt_slots; i++) {
+		cond_resched();
+		vtr = ubi_vtbl_get_vtr(ubi, i);
+		if (unlikely(!IS_ERR(vtr))) {
+			dbg_err("volume %d exists", i);
+			goto bad;
+		}
+	}
+
+	if (si->vols_found > acc->ivol_count + acc->uvol_count) {
+		dbg_err("scanning found volumes %d > %d + %d",
+			si->vols_found, acc->ivol_count, acc->uvol_count);
+		goto bad;
+	}
+
+	if (si->highest_vol_id >= acc->max_volumes &&
+	    si->highest_vol_id < UBI_INTERNAL_VOL_START) {
+		dbg_err("too large volume ID %d found by scanning",
+			si->highest_vol_id);
+		goto bad;
+	}
+
+	return 0;
+
+bad:
+	ubi_err("accounting check failed");
+	dbg_err("uvol_count %d, ivol_count %d, avail_pebs %d, rsvd_pebs %d, "
+		"io->good_peb_count %d, max_volumes %d, vtbl->vt_slots %d",
+		acc->uvol_count, acc->ivol_count, acc->avail_pebs,
+		acc->rsvd_pebs, io->good_peb_count, acc->max_volumes,
+		vtbl->vt_slots);
+	return -EINVAL;
+}
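
The bookkeeping in this patch can be modelled in user space. The following is a minimal sketch, not part of the patch: struct acc_model and the acc_model_*() helpers are hypothetical names, locking is left out, and only the arithmetic of ubi_acc_mkvol(), ubi_acc_rmvol() and the sum check from acc_info_check() is reproduced.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user-space stand-in for struct ubi_acc_info (no locking). */
struct acc_model {
	int good_pebs;   /* total good PEBs, io->good_peb_count in the patch */
	int avail_pebs;  /* PEBs that are not reserved yet */
	int rsvd_pebs;   /* PEBs reserved for volumes */
	int uvol_count;  /* number of user volumes */
	int max_volumes; /* upper bound on the number of user volumes */
};

/* The sum rule checked by acc_info_check(): each good PEB is counted once. */
static int acc_model_ok(const struct acc_model *acc)
{
	return acc->avail_pebs >= 0 && acc->rsvd_pebs >= 0 &&
	       acc->avail_pebs + acc->rsvd_pebs == acc->good_pebs;
}

/* Same arithmetic as ubi_acc_mkvol(): all counters change, or none does. */
static int acc_model_mkvol(struct acc_model *acc, int reserved_pebs)
{
	assert(reserved_pebs > 0);
	if (acc->uvol_count + 1 > acc->max_volumes)
		return -ENOSPC;
	if (reserved_pebs > acc->avail_pebs)
		return -ENOSPC;
	acc->uvol_count += 1;
	acc->avail_pebs -= reserved_pebs;
	acc->rsvd_pebs += reserved_pebs;
	return 0;
}

/* Same arithmetic as ubi_acc_rmvol(): the volume's PEBs become available. */
static void acc_model_rmvol(struct acc_model *acc, int reserved_pebs)
{
	assert(reserved_pebs > 0 && reserved_pebs <= acc->rsvd_pebs);
	acc->uvol_count -= 1;
	acc->avail_pebs += reserved_pebs;
	acc->rsvd_pebs -= reserved_pebs;
}
```

Because every operation moves PEBs between the two counters and never creates or drops any, acc_model_ok() holds after any sequence of successful calls, which is the property the real check relies on.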


* [PATCH 33/44 take 2] [UBI] volume management unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (31 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 32/44 take 2] [UBI] accounting unit implementation Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 34/44 take 2] [UBI] volume management unit implementation Artem Bityutskiy
                   ` (13 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/volmgmt.h tmp-to/drivers/mtd/ubi/volmgmt.h
--- tmp-from/drivers/mtd/ubi/volmgmt.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/volmgmt.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,129 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI volume management unit.
+ *
+ * The unit is responsible for creation, deletion, updating and resizing
+ * of volumes.
+ */
+
+#ifndef __UBI_VOLMGMT_H__
+#define __UBI_VOLMGMT_H__
+
+#include <linux/mutex.h>
+
+struct ubi_info;
+struct ubi_scan_info;
+struct ubi_vol_info;
+struct ubi_vtbl_vtr;
+
+/**
+ * ubi_vmt_get_data_info - get volume data description.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the requested volume
+ *
+ * This function returns a pointer to the corresponding data description
+ * object.
+ */
+const struct ubi_vmt_data_info *
+ubi_vmt_get_data_info(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_vmt_mkvol - create a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID to assign to the new volume
+ * @vtr: volume table record corresponding to the new volume
+ *
+ * If @vol_id is %UBI_VOL_NUM_AUTO, the new volume is automatically given an
+ * unused volume identifier. The @vtr->usable_leb_size field is ignored.
+ *
+ * This function returns the ID of the newly created volume in case of success,
+ * and a negative error code in case of failure.
+ */
+int ubi_vmt_mkvol(const struct ubi_info *ubi, int vol_id,
+		  const struct ubi_vtbl_vtr *vtr);
+
+/**
+ * ubi_vmt_rmvol - remove a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to remove
+ *
+ * This function returns zero in case of success, and a negative error code in
+ * case of failure.
+ */
+int ubi_vmt_rmvol(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_vmt_rsvol - re-size a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to re-size
+ * @reserved_pebs: new volume size
+ *
+ * This function returns zero in case of success, and a negative error code in
+ * case of failure.
+ */
+int ubi_vmt_rsvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs);
+
+/**
+ * ubi_vmt_truncate_volume - make sure the volume contains only 0xFF bytes.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume to truncate
+ *
+ * This function erases all the volume's eraseblocks. Returns zero in case of
+ * success, and a negative error code in case of failure.
+ */
+int ubi_vmt_truncate_volume(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_vmt_init_scan - initialize the volume management unit using scanning
+ * information.
+ *
+ * @ubi: the UBI device description object
+ * @si: a pointer to the scanning information
+ *
+ * This function returns zero in case of success, and a negative error code in
+ * case of failure.
+ */
+int ubi_vmt_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si);
+
+/**
+ * ubi_vmt_close - close the volume management unit.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_vmt_close(const struct ubi_info *ubi);
+
+/**
+ * struct ubi_vmt_info - volume management unit description data
+ * structure.
+ *
+ * @mutex: a mutex to serialize volume changes
+ */
+struct ubi_vmt_info {
+	struct mutex mutex; /* private */
+};
+
+#endif /* !__UBI_VOLMGMT_H__ */


* [PATCH 34/44 take 2] [UBI] volume management unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (32 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 33/44 take 2] [UBI] volume management unit header Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 35/44 take 2] [UBI] user-interfaces unit header Artem Bityutskiy
                   ` (12 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/volmgmt.c tmp-to/drivers/mtd/ubi/volmgmt.c
--- tmp-from/drivers/mtd/ubi/volmgmt.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/volmgmt.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,374 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/string.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/mutex.h>
+#include "ubi.h"
+#include "alloc.h"
+#include "io.h"
+#include "wl.h"
+#include "upd.h"
+#include "volmgmt.h"
+#include "vtbl.h"
+#include "misc.h"
+#include "eba.h"
+#include "account.h"
+#include "scan.h"
+#include "debug.h"
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_VMT
+static int paranoid_check_vtr(const struct ubi_info *ubi,
+			      const struct ubi_vtbl_vtr *vtr);
+#else
+#define paranoid_check_vtr(ubi, vtr) 0
+#endif
+
+static int find_vacant_vol_id(const struct ubi_info *ubi);
+
+int ubi_vmt_mkvol(const struct ubi_info *ubi, int vol_id,
+		  const struct ubi_vtbl_vtr *vtr)
+{
+	int i, err = 0;
+	struct ubi_vmt_info *vmt = ubi->vmt;
+	const struct ubi_vtbl_vtr *vtr_ck;
+
+	dbg_vmt("create volume ID %d, reserved_pebs %d, type %d, name %s",
+		vol_id, vtr->reserved_pebs, vtr->vol_type, vtr->name);
+
+	mutex_lock(&vmt->mutex);
+
+	if (vol_id == UBI_VOL_NUM_AUTO) {
+		vol_id = find_vacant_vol_id(ubi);
+		if (vol_id < 0) {
+			err = vol_id;
+			goto out_unlock;
+		}
+	} else
+		ubi_assert(vol_id >= 0 && vol_id < ubi->acc->max_volumes);
+
+	err = paranoid_check_vtr(ubi, vtr);
+	if (err) {
+		err = -EINVAL;
+		goto out_unlock;
+	}
+
+	/* Make sure that this volume does not exist */
+	err = -EEXIST;
+	vtr_ck = ubi_vtbl_get_vtr(ubi, vol_id);
+	if (!IS_ERR(vtr_ck)) {
+		dbg_err("volume %d already exists", vol_id);
+		goto out_unlock;
+	}
+
+	/* Ensure that this volume has a unique name */
+	for (i = 0; i < ubi->acc->max_volumes; i++) {
+		cond_resched();
+
+		vtr_ck = ubi_vtbl_get_vtr(ubi, i);
+		if (IS_ERR(vtr_ck))
+			continue;
+
+		if (unlikely(vtr->name_len == vtr_ck->name_len &&
+			     !strcmp(vtr->name, vtr_ck->name))) {
+			dbg_err("not unique name \"%s\", volume %d has it",
+				vtr->name, i);
+			goto out_unlock;
+		}
+	}
+
+	err = ubi_acc_mkvol(ubi, vtr->reserved_pebs);
+	if (err)
+		goto out_unlock;
+
+	/*
+	 * Finish all the pending erases because there may be some LEBs
+	 * belonging to the same volume ID, and they must not get mixed up.
+	 */
+	err = ubi_wl_flush(ubi);
+	if (err)
+		goto out_acc;
+
+	err = ubi_eba_mkvol(ubi, vol_id, vtr->reserved_pebs);
+	if (err)
+		goto out_acc;
+
+	err = ubi_vtbl_mkvol(ubi, vol_id, vtr);
+	if (err)
+		goto out_eba;
+
+	mutex_unlock(&vmt->mutex);
+	return vol_id;
+
+out_eba:
+	ubi_eba_rmvol(ubi, vol_id);
+out_acc:
+	ubi_acc_rmvol(ubi, vtr->reserved_pebs);
+out_unlock:
+	mutex_unlock(&vmt->mutex);
+	return err;
+}
+
+int ubi_vmt_rmvol(const struct ubi_info *ubi, int vol_id)
+{
+	int err, reserved_pebs;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vmt_info *vmt = ubi->vmt;
+
+	dbg_vmt("remove volume %d", vol_id);
+	ubi_assert(vol_id >= 0 && vol_id < ubi->acc->max_volumes);
+
+	mutex_lock(&vmt->mutex);
+
+	/* Ensure that this volume exists */
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	if (IS_ERR(vtr)) {
+		err = PTR_ERR(vtr);
+		goto out_unlock;
+	}
+
+	reserved_pebs = vtr->reserved_pebs;
+
+	err = ubi_vtbl_rmvol(ubi, vol_id);
+	if (err)
+		goto out_unlock;
+
+	err = ubi_eba_rmvol(ubi, vol_id);
+	if (err)
+		goto out_unlock;
+
+	ubi_acc_rmvol(ubi, reserved_pebs);
+
+out_unlock:
+	mutex_unlock(&vmt->mutex);
+	return err;
+}
+
+int ubi_vmt_rsvol(const struct ubi_info *ubi, int vol_id, int reserved_pebs)
+{
+	int err, pebs, old_reserved_pebs;
+	struct ubi_vmt_info *vmt = ubi->vmt;
+	const struct ubi_vtbl_vtr *vtr;
+
+	dbg_vmt("re-size volume %d to %d PEBs", vol_id, reserved_pebs);
+	ubi_assert(vol_id >= 0 && vol_id < ubi->acc->max_volumes);
+	ubi_assert(reserved_pebs > 0);
+
+	mutex_lock(&vmt->mutex);
+
+	/* Ensure that this volume exists */
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	if (IS_ERR(vtr)) {
+		err = PTR_ERR(vtr);
+		goto out_unlock;
+	}
+
+	if (vtr->vol_type == UBI_STATIC_VOLUME &&
+	    reserved_pebs < vtr->used_ebs) {
+		dbg_err("too small size %d, static volume %d has %d used LEBs",
+			reserved_pebs, vol_id, vtr->used_ebs);
+		err = -EINVAL;
+		goto out_unlock;
+	}
+
+	/* If the size is the same, we have nothing to do */
+	if (reserved_pebs == vtr->reserved_pebs) {
+		err = 0;
+		goto out_unlock;
+	}
+
+	old_reserved_pebs = vtr->reserved_pebs;
+
+	err = ubi_vtbl_rsvol(ubi, vol_id, reserved_pebs);
+	if (err)
+		goto out_unlock;
+
+	pebs = reserved_pebs - old_reserved_pebs;
+	if (pebs > 0) {
+		err = ubi_acc_reserve(ubi, pebs);
+		if (err)
+			goto out_unlock;
+	} else
+		ubi_acc_free(ubi, -pebs);
+
+	err = ubi_eba_rsvol(ubi, vol_id, reserved_pebs);
+	if (err)
+		goto out_unlock;
+
+out_unlock:
+	mutex_unlock(&vmt->mutex);
+	return err;
+}
+
+int ubi_vmt_init_scan(struct ubi_info *ubi, struct ubi_scan_info *si)
+{
+	int err;
+	struct ubi_vmt_info *vmt;
+
+	dbg_vmt("initialize the volume management unit");
+
+	vmt = ubi_kzalloc(sizeof(struct ubi_vmt_info));
+	if (!vmt)
+		return -ENOMEM;
+	ubi->vmt = vmt;
+
+	mutex_init(&vmt->mutex);
+
+	err = ubi_vtbl_init_scan(ubi, si);
+	if (err)
+		goto out_vmt;
+
+	err = ubi_acc_init_scan(ubi, si);
+	if (err)
+		goto out_vtbl;
+
+	err = ubi_upd_init_scan(ubi, si);
+	if (err)
+		goto out_acc;
+
+	dbg_vmt("the volume management unit is initialized");
+	return 0;
+
+out_acc:
+	ubi_acc_close(ubi);
+out_vtbl:
+	ubi_vtbl_close(ubi);
+out_vmt:
+	ubi_kfree(vmt);
+	return err;
+}
+
+void ubi_vmt_close(const struct ubi_info *ubi)
+{
+	dbg_vmt("close the volume management unit");
+	ubi_upd_close(ubi);
+	ubi_acc_close(ubi);
+	ubi_vtbl_close(ubi);
+	ubi_kfree(ubi->vmt);
+}
+
+/**
+ * find_vacant_vol_id - find an unused volume ID.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns a non-negative volume ID or %-ENOSPC if there are no
+ * free volume IDs.
+ */
+static int find_vacant_vol_id(const struct ubi_info *ubi)
+{
+	int i;
+
+	for (i = 0; i < ubi->acc->max_volumes; i++) {
+		const struct ubi_vtbl_vtr *vtr;
+
+		cond_resched();
+
+		vtr = ubi_vtbl_get_vtr(ubi, i);
+		if (IS_ERR(vtr)) {
+			dbg_vmt("found vacant volume ID %d", i);
+			return i;
+		}
+	}
+
+	dbg_vmt("vacant volume ID not found");
+	return -ENOSPC;
+}
+
+#ifdef CONFIG_MTD_UBI_DEBUG_PARANOID_VMT
+
+/**
+ * paranoid_check_vtr - check sanity of a &struct ubi_vtbl_vtr object.
+ *
+ * @ubi: the UBI device description object
+ * @vtr: an object to check
+ *
+ * This function returns zero if the volume table record is sane, and %1 if
+ * not.
+ */
+static int paranoid_check_vtr(const struct ubi_info *ubi,
+			      const struct ubi_vtbl_vtr *vtr)
+{
+	int n;
+	const struct ubi_io_info *io = ubi->io;
+
+	if (vtr->reserved_pebs == 0)
+		return 0;
+
+	if (unlikely(vtr->reserved_pebs < 0 || vtr->alignment < 0 ||
+		     vtr->data_pad < 0 || vtr->name_len < 0)) {
+		ubi_err("negative values");
+		goto bad;
+	}
+
+	if (unlikely(vtr->alignment > io->leb_size)) {
+		ubi_err("too large alignment");
+		goto bad;
+	}
+
+	if (unlikely(vtr->alignment == 0)) {
+		ubi_err("zero alignment");
+		goto bad;
+	}
+
+	n = vtr->alignment % io->min_io_size;
+	if (vtr->alignment != 1 && unlikely(n)) {
+		ubi_err("alignment is not a multiple of min I/O unit size");
+		goto bad;
+	}
+
+	n = io->leb_size % vtr->alignment;
+	if (unlikely(vtr->data_pad != n)) {
+		ubi_err("bad data_pad, has to be %d", n);
+		goto bad;
+	}
+
+	if (unlikely(vtr->vol_type != UBI_DYNAMIC_VOLUME &&
+		     vtr->vol_type != UBI_STATIC_VOLUME)) {
+		ubi_err("bad vol_type");
+		goto bad;
+	}
+
+	if (unlikely(vtr->name_len > UBI_VOL_NAME_MAX)) {
+		ubi_err("too long volume name, max is %d", UBI_VOL_NAME_MAX);
+		goto bad;
+	}
+
+	if (unlikely(!vtr->name)) {
+		ubi_err("NULL volume name");
+		goto bad;
+	}
+
+	n = strnlen(vtr->name, vtr->name_len + 1);
+	if (unlikely(n != vtr->name_len)) {
+		ubi_err("bad name_len");
+		goto bad;
+	}
+
+	return 0;
+
+bad:
+	ubi_err("volume table record paranoid check failed");
+	ubi_dbg_dump_vtr(vtr);
+	return 1;
+}
+
+#endif /* CONFIG_MTD_UBI_DEBUG_PARANOID_VMT */
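
The alignment and data_pad rules in paranoid_check_vtr() can be worked through with plain numbers. The following is a minimal sketch, not part of the patch: expected_data_pad() is a hypothetical helper, and the geometry used in the note after it is an assumed example, not taken from any particular flash chip.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the data_pad value the paranoid check would
 * accept for the given alignment, or -1 if the alignment itself would be
 * rejected (zero, negative, larger than the LEB, or neither 1 nor a multiple
 * of the minimal I/O unit size).
 */
static int expected_data_pad(int leb_size, int min_io_size, int alignment)
{
	assert(leb_size > 0 && min_io_size > 0);

	if (alignment <= 0 || alignment > leb_size)
		return -1;
	if (alignment != 1 && alignment % min_io_size != 0)
		return -1;

	/* The bytes at the end of each LEB that do not fill a whole unit */
	return leb_size % alignment;
}
```

With an assumed LEB size of 126976 bytes and a minimal I/O unit of 2048 bytes, an alignment of 10240 leaves 126976 - 12 * 10240 = 4096 bytes of padding per LEB, while an alignment of 4096 divides the LEB exactly and needs none.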


* [PATCH 35/44 take 2] [UBI] user-interfaces unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (33 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 34/44 take 2] [UBI] volume management unit implementation Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 36/44 take 2] [UBI] user-interfaces unit implementation Artem Bityutskiy
                   ` (11 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/uif.h tmp-to/drivers/mtd/ubi/uif.h
--- tmp-from/drivers/mtd/ubi/uif.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/uif.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,182 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * UBI user interface unit.
+ *
+ * This unit implements all the UBI user interfaces: kernel interfaces,
+ * character device interfaces, and sysfs interfaces.
+ *
+ * There are two kinds of character devices: UBI character devices and volume
+ * character devices. UBI character devices allow users to manipulate whole
+ * volumes: create, remove, and resize them. Volume character devices provide
+ * volume read and update capabilities.
+ *
+ * Major and minor numbers are assigned dynamically to both UBI and volume
+ * character devices.
+ */
+
+#ifndef __UBI_UIF_H__
+#define __UBI_UIF_H__
+
+#include <linux/init.h>
+#include <linux/spinlock.h>
+#include <linux/fs.h>
+#include <linux/cdev.h>
+#include <linux/list.h>
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include "gluebi.h"
+
+#define UBI_NAME_STR "ubi"
+
+struct ubi_info;
+struct ubi_vol_desc;
+
+/**
+ * ubi_uif_mkvol - create a volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the new volume
+ *
+ * This function creates all the user interface-related data structures of a
+ * new volume. Returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_uif_mkvol(const struct ubi_info *ubi, int vol_id);
+
+/**
+ * ubi_uif_close_and_rmvol - close and remove a volume.
+ *
+ * @desc: volume descriptor
+ *
+ * This function closes a volume and removes all the user interface-related
+ * data structures of this volume. Returns zero in case of success and a
+ * negative error code in case of failure.
+ */
+int ubi_uif_close_and_rmvol(struct ubi_vol_desc *desc);
+
+/**
+ * ubi_uif_init - initialize the UBI user interface unit for an UBI device.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_uif_init(struct ubi_info *ubi);
+
+/**
+ * ubi_uif_close - close the UBI user interface unit for an UBI device.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_uif_close(const struct ubi_info *ubi);
+
+/**
+ * ubi_uif_global_init - initialize the UBI user interface unit.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int __init ubi_uif_global_init(void);
+
+/**
+ * ubi_uif_global_close - close the UBI user interface unit.
+ */
+void ubi_uif_global_close(void);
+
+/**
+ * struct ubi_uif_volume - a per-volume user interface data structure.
+ *
+ * @dev: a class device object to make use of the Linux device model
+ * @cdev: a Linux character device object to create a character device of this
+ * volume
+ * @ubi: a reference to the UBI description object this volume belongs to
+ * @vol_id: volume ID
+ * @list: the link in the list of UIF volume information
+ * @readers: number of users who are using this volume in read-only mode
+ * @writers: number of users who are using this volume in read-write mode
+ * @exclusive: whether somebody is using this volume in exclusive mode
+ * @removed: if the volume was removed from the UBI device
+ * @checked: if this static volume was checked
+ * @vol_lock: protects the @readers, @writers, @exclusive, and @removed fields
+ * @updating: whether the volume is being updated
+ * @gluebi_vol: gluebi-specific information
+ */
+struct ubi_uif_volume {
+	struct class_device dev;
+	struct cdev cdev;
+	const struct ubi_info *ubi;
+	int vol_id;
+	struct list_head list;
+	int readers;
+	int writers;
+	int exclusive;
+	int removed;
+	int checked;
+	spinlock_t vol_lock;
+	int updating;
+	struct ubi_gluebi_volume gluebi_vol;
+};
+
+/**
+ * struct ubi_vol_desc - UBI opened volume descriptor
+ *
+ * @vol: reference to the corresponding volume description object
+ * @mode: volume open mode
+ */
+struct ubi_vol_desc {
+	struct ubi_uif_volume *vol;
+	enum ubi_open_mode mode;
+};
+
+/**
+ * struct ubi_uif_info - UBI user interfaces unit description structure.
+ *
+ * @cdev: a Linux character device object to create a character device of this
+ * UBI device
+ * @dev: the class device structure to use the Linux device model
+ * @ubi: a reference to the UBI description structure of this UBI device
+ * @major: major number of the UBI character device
+ * @ubi_name: name of this UBI device
+ * @volumes: a list of 'struct ubi_uif_volume' objects for all existing volumes
+ * @volumes_list_lock: protects the @volumes list
+ * @vol_check: serializes volume checking
+ *
+ * The @volumes_list_lock mutex protects the list of volumes of this UBI device
+ * from being changed. So it has to be locked when the @volumes list is being
+ * accessed.
+ *
+ * The @vol_check lock is used only when checking the consistency of static
+ * volumes, to prevent simultaneous volume checks.
+ */
+struct ubi_uif_info {
+	struct cdev cdev;
+	struct class_device dev;
+	struct ubi_info *ubi;
+	int major;
+	char *ubi_name;
+	struct list_head volumes;
+	struct mutex volumes_list_lock;
+	struct mutex vol_check;
+};
+
+#endif /* !__UBI_UIF_H__ */
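
The @readers, @writers, @exclusive and @updating fields of struct ubi_uif_volume imply an arbitration rule for opening a volume, which ubi_open_volume() in the implementation patch applies under @vol_lock. The following is a minimal user-space sketch of that rule, not part of the patch: struct vol_state, enum open_mode and vol_try_open() are hypothetical names, and locking is left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for enum ubi_open_mode and struct ubi_uif_volume. */
enum open_mode { MODE_READONLY, MODE_READWRITE, MODE_EXCLUSIVE };

struct vol_state {
	int readers;   /* number of read-only users */
	int writers;   /* number of read-write users, at most one */
	int exclusive; /* non-zero if opened in exclusive mode */
	int updating;  /* non-zero while the volume is being updated */
};

/* Returns 0 and records the new user, or -EBUSY if the mode conflicts. */
static int vol_try_open(struct vol_state *vol, enum open_mode mode)
{
	assert(vol->readers >= 0 && vol->writers >= 0);

	/* A volume that is being updated cannot be opened at all */
	if (vol->updating)
		return -EBUSY;

	switch (mode) {
	case MODE_READONLY:
		/* Readers coexist with readers and with a single writer */
		if (vol->exclusive)
			return -EBUSY;
		vol->readers += 1;
		return 0;
	case MODE_READWRITE:
		/* Only one writer at a time */
		if (vol->exclusive || vol->writers > 0)
			return -EBUSY;
		vol->writers += 1;
		return 0;
	case MODE_EXCLUSIVE:
		/* Exclusive mode requires that nobody else has it open */
		if (vol->exclusive || vol->writers || vol->readers)
			return -EBUSY;
		vol->exclusive = 1;
		return 0;
	}
	return -EINVAL;
}
```

The three counters are enough to decide every case, which is why a single spinlock around them suffices in the real code.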


* [PATCH 36/44 take 2] [UBI] user-interfaces unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (34 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 35/44 take 2] [UBI] user-interfaces unit header Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 37/44 take 2] [UBI] sysfs handling unit header Artem Bityutskiy
                   ` (10 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/uif.c tmp-to/drivers/mtd/ubi/uif.c
--- tmp-from/drivers/mtd/ubi/uif.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/uif.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,786 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/spinlock.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+#include <asm/div64.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "misc.h"
+#include "sysfs.h"
+#include "upd.h"
+#include "wl.h"
+#include "uif.h"
+#include "cdev.h"
+#include "alloc.h"
+#include "debug.h"
+#include "vtbl.h"
+#include "volmgmt.h"
+#include "io.h"
+#include "eba.h"
+#include "account.h"
+
+int ubi_get_device_info(int ubi_num, struct ubi_dev_info *di)
+{
+	const struct ubi_info *ubi;
+	int err = -ENODEV;
+
+	if (!try_module_get(THIS_MODULE))
+		return err;
+
+	if (ubi_num < 0 || ubi_num >= UBI_MAX_INSTANCES) {
+		dbg_err("bad UBI device number %d", ubi_num);
+		goto out_put;
+	}
+
+	if (!ubis[ubi_num]) {
+		dbg_err("UBI device %d does not exist", ubi_num);
+		goto out_put;
+	}
+	ubi = ubis[ubi_num];
+
+	di->ubi_num = ubi->ubi_num;
+	di->leb_size = ubi->io->leb_size;
+	di->min_io_size = ubi->io->min_io_size;
+	di->ro_mode = ubi->io->ro_mode;
+	di->cdev = MKDEV(ubi->uif->major, 0);
+	err = 0;
+
+out_put:
+	module_put(THIS_MODULE);
+	return err;
+}
+EXPORT_SYMBOL_GPL(ubi_get_device_info);
+
+static void fill_ubi_vol_info(const struct ubi_info *ubi, int vol_id,
+			      struct ubi_vol_info *vi);
+
+void ubi_get_volume_info(struct ubi_vol_desc *udesc, struct ubi_vol_info *vi)
+{
+	struct ubi_vol_desc *desc = udesc;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	int vol_id = desc->vol->vol_id;
+
+	fill_ubi_vol_info(ubi, vol_id, vi);
+}
+EXPORT_SYMBOL_GPL(ubi_get_volume_info);
+
+struct ubi_vol_desc *ubi_open_volume(int ubi_num, int vol_id,
+				     enum ubi_open_mode mode)
+{
+	int err, found = 0;
+	struct ubi_vol_desc *desc;
+	const struct ubi_info *ubi;
+	struct ubi_uif_info *uif;
+	struct ubi_uif_volume *vol;
+
+	dbg_uif("open device %d volume %d, mode %d", ubi_num, vol_id, mode);
+
+	err = -ENODEV;
+	if (!try_module_get(THIS_MODULE))
+		return ERR_PTR(err);
+
+	if (ubi_num < 0 || ubi_num >= UBI_MAX_INSTANCES) {
+		dbg_err("bad UBI device number %d", ubi_num);
+		goto out_put;
+	}
+
+	if (!ubis[ubi_num]) {
+		dbg_err("UBI device %d does not exist", ubi_num);
+		goto out_put;
+	}
+
+	ubi = ubis[ubi_num];
+	ubi_assert(ubi->ubi_num == ubi_num);
+	uif = ubi->uif;
+
+	desc = ubi_kzalloc(sizeof(struct ubi_vol_desc));
+	if (!desc) {
+		err = -ENOMEM;
+		goto out_put;
+	}
+
+	err = -EINVAL;
+	if (vol_id < 0 || vol_id >= ubi->acc->max_volumes) {
+		dbg_err("bad vol_id %d", vol_id);
+		goto out_free;
+	}
+
+	if (mode != UBI_READONLY && mode != UBI_READWRITE &&
+	    mode != UBI_EXCLUSIVE) {
+		dbg_err("bad mode %d", mode);
+		goto out_free;
+	}
+
+	mutex_lock(&uif->volumes_list_lock);
+	list_for_each_entry(vol, &uif->volumes, list)
+		if (vol->vol_id == vol_id) {
+			found = 1;
+			break;
+		}
+	if (!found) {
+		dbg_err("volume %d does not exist", vol_id);
+		err = -ENODEV;
+		goto out_unlock;
+	}
+
+	err = -EBUSY;
+	spin_lock(&vol->vol_lock);
+	if (vol->updating) {
+		/* If the volume is being updated, no one can open it */
+		dbg_err("device busy - updating");
+		goto out_unlock_vol;
+	}
+
+	switch (mode) {
+	case UBI_READONLY:
+		if (vol->exclusive) {
+			dbg_err("device is busy - exclusive");
+			goto out_unlock_vol;
+		}
+		vol->readers += 1;
+		break;
+
+	case UBI_READWRITE:
+		if (vol->exclusive || vol->writers > 0) {
+			if (vol->exclusive)
+				dbg_err("device is busy - exclusive");
+			else
+				dbg_err("device is busy - writers");
+			goto out_unlock_vol;
+		}
+		vol->writers += 1;
+		break;
+
+	case UBI_EXCLUSIVE:
+		if (vol->exclusive || vol->writers || vol->readers) {
+			if (vol->exclusive)
+				dbg_err("device is busy - exclusive");
+			else if (vol->writers)
+				dbg_err("device is busy - writers");
+			else
+				dbg_err("device is busy - readers");
+			goto out_unlock_vol;
+		}
+		vol->exclusive = 1;
+		break;
+	}
+	spin_unlock(&vol->vol_lock);
+	mutex_unlock(&uif->volumes_list_lock);
+
+	desc->vol = vol;
+	desc->mode = mode;
+
+	mutex_lock(&uif->vol_check);
+	if (!vol->checked) {
+		/*
+		 * This is the first time this volume is being opened, so we
+		 * have to check it. If the volume is corrupted, we still
+		 * return success.
+		 */
+		err = ubi_check_volume(ubi, vol_id);
+		if (err < 0)
+			goto unlock_close;
+
+		if (err == 1) {
+			ubi_warn("volume %d on UBI device %d is corrupted",
+				 vol_id, ubi->ubi_num);
+			err = ubi_vtbl_set_corrupted(ubi, vol_id);
+			if (err)
+				goto unlock_close;
+		}
+		vol->checked = 1;
+	}
+	mutex_unlock(&uif->vol_check);
+
+	return desc;
+
+out_unlock_vol:
+	spin_unlock(&vol->vol_lock);
+out_unlock:
+	mutex_unlock(&uif->volumes_list_lock);
+out_free:
+	ubi_kfree(desc);
+out_put:
+	module_put(THIS_MODULE);
+	return ERR_PTR(err);
+
+unlock_close:
+	mutex_unlock(&uif->vol_check);
+	ubi_close_volume(desc);
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(ubi_open_volume);
+
+struct ubi_vol_desc *ubi_open_volume_nm(int ubi_num, const char *name,
+					enum ubi_open_mode mode)
+{
+	int err, vol_id = -1, len;
+	struct ubi_vol_desc *ret;
+	const struct ubi_info *ubi;
+	struct ubi_uif_info *uif;
+	struct ubi_uif_volume *vol;
+
+	dbg_uif("open volume by name %s, mode %d", name, mode);
+
+	if (!name) {
+		dbg_err("bad name");
+		return ERR_PTR(-EINVAL);
+	}
+	len = strnlen(name, UBI_VOL_NAME_MAX + 1);
+	if (len > UBI_VOL_NAME_MAX) {
+		dbg_err("bad name");
+		return ERR_PTR(-EINVAL);
+	}
+
+	err = -ENODEV;
+	if (!try_module_get(THIS_MODULE))
+		return ERR_PTR(err);
+
+	if (ubi_num < 0 || ubi_num >= UBI_MAX_INSTANCES) {
+		dbg_err("bad UBI device number %d", ubi_num);
+		goto out_put;
+	}
+
+	if (!ubis[ubi_num]) {
+		dbg_err("UBI device %d does not exist", ubi_num);
+		goto out_put;
+	}
+
+	ubi = ubis[ubi_num];
+	ubi_assert(ubi->ubi_num == ubi_num);
+	uif = ubi->uif;
+
+	/* Walk all volumes of this UBI device */
+	mutex_lock(&uif->volumes_list_lock);
+	list_for_each_entry(vol, &uif->volumes, list) {
+		const struct ubi_vtbl_vtr *vtr;
+
+		spin_lock(&vol->vol_lock);
+		vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+		if (len == vtr->name_len && !strcmp(name, vtr->name)) {
+			vol_id = vol->vol_id;
+			dbg_uif("found volume %d", vol_id);
+			spin_unlock(&vol->vol_lock);
+			break;
+		}
+		spin_unlock(&vol->vol_lock);
+	}
+	mutex_unlock(&uif->volumes_list_lock);
+
+	if (vol_id < 0) {
+		dbg_err("volume %s does not exist", name);
+		goto out_put;
+	}
+
+	ret = ubi_open_volume(ubi_num, vol_id, mode);
+	if (!IS_ERR(ret))
+		return ret;
+
+	err = PTR_ERR(ret);
+
+out_put:
+	module_put(THIS_MODULE);
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(ubi_open_volume_nm);
+
+void ubi_close_volume(struct ubi_vol_desc *udesc)
+{
+	struct ubi_vol_desc *desc = udesc;
+	struct ubi_uif_volume *vol = desc->vol;
+
+	dbg_uif("close volume %d, mode %d", vol->vol_id, desc->mode);
+
+	spin_lock(&vol->vol_lock);
+	switch (desc->mode) {
+	case UBI_READONLY:
+		ubi_assert(vol->readers > 0);
+		vol->readers -= 1;
+		break;
+
+	case UBI_READWRITE:
+		ubi_assert(vol->writers > 0);
+		vol->writers -= 1;
+		break;
+
+	case UBI_EXCLUSIVE:
+		ubi_assert(vol->exclusive > 0);
+		vol->exclusive = 0;
+	}
+	spin_unlock(&vol->vol_lock);
+
+	ubi_kfree(desc);
+	module_put(THIS_MODULE);
+}
+EXPORT_SYMBOL_GPL(ubi_close_volume);
+
+int ubi_eraseblock_read(struct ubi_vol_desc *udesc, int lnum, char *buf,
+			int offset, int len, int check)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = udesc;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	int err, vol_id = desc->vol->vol_id;
+
+	dbg_uif("read %d bytes from offset %d of LEB %d:%d",
+		len, offset, vol_id, lnum);
+
+	if (unlikely(vol_id < 0 || vol_id >= ubi->acc->max_volumes)) {
+		dbg_err("bad vol_id %d", vol_id);
+		return -EINVAL;
+	}
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (unlikely(lnum < 0 || lnum >= vtr->used_ebs)) {
+		dbg_err("bad lnum %d", lnum);
+		return -EINVAL;
+	}
+
+	if (unlikely(offset < 0 || len < 0 ||
+		     offset + len > vtr->usable_leb_size)) {
+		dbg_err("bad offset %d or len %d", offset, len);
+		return -EINVAL;
+	}
+
+	if (unlikely(vtr->vol_type == UBI_STATIC_VOLUME &&
+		     lnum == vtr->used_ebs - 1 &&
+		     offset + len > vtr->last_eb_bytes)) {
+		dbg_err("bad offset %d or len %d for last LEB", offset, len);
+		return -EINVAL;
+	}
+
+	if (unlikely(len == 0))
+		return 0;
+
+	if (unlikely(vtr->upd_marker)) {
+		dbg_err("reading from update-interrupted volume");
+		return -EBADF;
+	}
+
+	err = ubi_eba_read_leb(ubi, vol_id, lnum, buf, offset, len, check);
+	if (unlikely(err))
+		return err;
+	if (unlikely(vtr->corrupted)) {
+		ubi_assert(vtr->vol_type == UBI_STATIC_VOLUME);
+		dbg_err("corrupted volume is read");
+		err = -EUCLEAN;
+	}
+	return err;
+}
+EXPORT_SYMBOL_GPL(ubi_eraseblock_read);
+
+int ubi_eraseblock_write(struct ubi_vol_desc *udesc, int lnum, const void *buf,
+			 int offset, int len, enum ubi_data_type dtype)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = udesc;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	const struct ubi_io_info *io = ubi->io;
+	int vol_id = desc->vol->vol_id;
+
+	dbg_uif("write %d bytes at offset %d of LEB %d:%d",
+		len, offset, vol_id, lnum);
+
+	if (unlikely(vol_id < 0 || vol_id >= ubi->acc->max_volumes)) {
+		dbg_err("bad vol_id %d", vol_id);
+		return -EINVAL;
+	}
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (unlikely(desc->mode == UBI_READONLY)) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	if (unlikely(vtr->vol_type == UBI_STATIC_VOLUME)) {
+		dbg_err("static volume");
+		return -EROFS;
+	}
+
+	if (unlikely(lnum < 0 || lnum >= vtr->reserved_pebs)) {
+		dbg_err("bad lnum %d", lnum);
+		return -EINVAL;
+	}
+
+	if (unlikely(offset < 0 || len < 0 ||
+		     offset + len > vtr->usable_leb_size)) {
+		dbg_err("bad offset %d or len %d", offset, len);
+		return -EINVAL;
+	}
+
+	if (unlikely(offset % io->min_io_size || len % io->min_io_size)) {
+		dbg_err("unaligned offset %d or len %d", offset, len);
+		return -EINVAL;
+	}
+
+	if (unlikely(dtype != UBI_DATA_LONGTERM &&
+		     dtype != UBI_DATA_SHORTTERM &&
+		     dtype != UBI_DATA_UNKNOWN)) {
+		dbg_err("bad dtype %d", dtype);
+		return -EINVAL;
+	}
+
+	if (unlikely(len == 0))
+		return 0;
+
+	if (unlikely(vtr->upd_marker)) {
+		dbg_err("writing update-interrupted volume");
+		return -EBADF;
+	}
+
+	return ubi_eba_write_leb(ubi, vol_id, lnum, buf, offset, len, dtype);
+}
+EXPORT_SYMBOL_GPL(ubi_eraseblock_write);
+
+int ubi_eraseblock_erase(struct ubi_vol_desc *udesc, int lnum)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = udesc;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	int vol_id = desc->vol->vol_id;
+
+	dbg_uif("erase LEB %d:%d", vol_id, lnum);
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (unlikely(desc->mode == UBI_READONLY)) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	if (unlikely(vtr->vol_type == UBI_STATIC_VOLUME)) {
+		dbg_err("static volume");
+		return -EROFS;
+	}
+
+	if (unlikely(lnum < 0 || lnum >= vtr->reserved_pebs)) {
+		dbg_err("bad lnum %d", lnum);
+		return -EINVAL;
+	}
+
+	if (unlikely(vtr->upd_marker)) {
+		dbg_err("erasing update-interrupted volume");
+		return -EBADF;
+	}
+
+	return ubi_eba_erase_leb(ubi, vol_id, lnum);
+}
+EXPORT_SYMBOL_GPL(ubi_eraseblock_erase);
+
+int ubi_eraseblock_unmap(struct ubi_vol_desc *udesc, int lnum)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = udesc;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	int err, vol_id = desc->vol->vol_id;
+
+	dbg_uif("unmap LEB %d:%d", vol_id, lnum);
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (unlikely(desc->mode == UBI_READONLY)) {
+		dbg_err("read-only mode");
+		return -EROFS;
+	}
+
+	if (unlikely(vtr->vol_type == UBI_STATIC_VOLUME)) {
+		dbg_err("static volume");
+		return -EROFS;
+	}
+
+	if (unlikely(lnum < 0 || lnum >= vtr->reserved_pebs)) {
+		dbg_err("bad lnum %d", lnum);
+		return -EINVAL;
+	}
+
+	if (unlikely(vtr->upd_marker)) {
+		dbg_err("unmapping eraseblock of an update-interrupted "
+			"volume");
+		return -EBADF;
+	}
+
+	err = ubi_eba_erase_leb(ubi, vol_id, lnum);
+	if (unlikely(err))
+		return err;
+
+	return ubi_wl_flush(ubi);
+}
+EXPORT_SYMBOL_GPL(ubi_eraseblock_unmap);
+
+int ubi_eraseblock_is_mapped(struct ubi_vol_desc *udesc, int lnum)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = udesc;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	int vol_id = desc->vol->vol_id;
+
+	dbg_uif("check LEB %d:%d", vol_id, lnum);
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	if (unlikely(lnum < 0 || lnum >= vtr->reserved_pebs)) {
+		dbg_err("bad lnum %d", lnum);
+		return -EINVAL;
+	}
+
+	if (unlikely(vtr->upd_marker)) {
+		dbg_err("update-interrupted volume");
+		return -EBADF;
+	}
+
+	return ubi_eba_leb_is_mapped(ubi, vol_id, lnum);
+}
+EXPORT_SYMBOL_GPL(ubi_eraseblock_is_mapped);
+
+int ubi_uif_mkvol(const struct ubi_info *ubi, int vol_id)
+{
+	int err;
+	struct ubi_uif_volume *vol;
+	struct ubi_uif_info *uif = ubi->uif;
+	const struct ubi_vtbl_vtr *vtr;
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	dbg_uif("create volume %d, size %d, type %d",
+		vol_id, vtr->reserved_pebs, vtr->vol_type);
+
+	vol = ubi_kzalloc(sizeof(struct ubi_uif_volume));
+	if (!vol)
+		return -ENOMEM;
+	vol->ubi = ubi;
+	vol->vol_id = vol_id;
+	vol->exclusive = 1;
+	spin_lock_init(&vol->vol_lock);
+
+	mutex_lock(&uif->volumes_list_lock);
+	list_add(&vol->list, &uif->volumes);
+	mutex_unlock(&uif->volumes_list_lock);
+
+	err = ubi_sysfs_vol_init(ubi, vol);
+	if (err)
+		goto out_sysfs;
+
+	err = ubi_cdev_vol_init(ubi, vol);
+	if (err)
+		goto out_sysfs;
+
+	err = ubi_gluebi_vol_init(ubi, vol);
+	if (err)
+		goto out_cdev;
+
+	spin_lock(&vol->vol_lock);
+	vol->exclusive = 0;
+	spin_unlock(&vol->vol_lock);
+
+	return 0;
+
+out_cdev:
+	ubi_cdev_vol_close(vol);
+out_sysfs:
+	mutex_lock(&uif->volumes_list_lock);
+	spin_lock(&vol->vol_lock);
+	vol->removed = 1;
+	spin_unlock(&vol->vol_lock);
+	list_del(&vol->list);
+	mutex_unlock(&uif->volumes_list_lock);
+	ubi_sysfs_vol_close(vol);
+	return err;
+}
+
+int ubi_uif_close_and_rmvol(struct ubi_vol_desc *desc)
+{
+	int err;
+	struct ubi_uif_volume *vol = desc->vol;
+	struct ubi_uif_info *uif = vol->ubi->uif;
+
+	dbg_uif("remove UBI volume %d", vol->vol_id);
+	ubi_assert(desc->mode == UBI_EXCLUSIVE);
+
+	err = ubi_gluebi_vol_close(vol);
+	if (err)
+		/* Somebody still holds the emulated MTD device */
+		return err;
+
+	spin_lock(&vol->vol_lock);
+	vol->removed = 1;
+	spin_unlock(&vol->vol_lock);
+
+	mutex_lock(&uif->volumes_list_lock);
+	list_del(&vol->list);
+	mutex_unlock(&uif->volumes_list_lock);
+
+	ubi_cdev_vol_close(vol);
+	ubi_sysfs_vol_close(vol);
+	ubi_kfree(desc);
+	module_put(THIS_MODULE);
+	return err;
+}
+
+static void delete_volumes(const struct ubi_info *ubi);
+
+int ubi_uif_init(struct ubi_info *ubi)
+{
+	int i, err;
+	struct ubi_uif_info *uif;
+	const struct ubi_acc_info *acc = ubi->acc;
+
+	dbg_uif("initialize the user interface unit");
+
+	uif = ubi_kzalloc(sizeof(struct ubi_uif_info));
+	if (!uif)
+		return -ENOMEM;
+	ubi->uif = uif;
+	uif->ubi = ubi;
+
+	mutex_init(&uif->volumes_list_lock);
+	mutex_init(&uif->vol_check);
+	INIT_LIST_HEAD(&uif->volumes);
+
+	uif->ubi_name = ubi_kmalloc(sizeof(UBI_NAME_STR) + 5);
+	if (!uif->ubi_name) {
+		err = -ENOMEM;
+		goto out_uif;
+	}
+	sprintf(uif->ubi_name, UBI_NAME_STR "%d", ubi->ubi_num);
+
+	err = ubi_cdev_init(ubi);
+	if (err)
+		goto out_name;
+
+	err = ubi_sysfs_init(ubi);
+	if (err)
+		goto out_cdev;
+
+	for (i = 0; i < acc->max_volumes; i++) {
+		const struct ubi_vtbl_vtr *vtr;
+
+		cond_resched();
+
+		vtr = ubi_vtbl_get_vtr(ubi, i);
+		if (IS_ERR(vtr))
+			continue;
+
+		err = ubi_uif_mkvol(ubi, i);
+		if (unlikely(err))
+			goto out_volumes;
+	}
+
+	dbg_uif("the user interface unit is initialized");
+	return 0;
+
+out_volumes:
+	delete_volumes(ubi);
+	ubi_sysfs_close(ubi);
+
+out_cdev:
+	ubi_cdev_close(ubi);
+
+out_name:
+	ubi_kfree(uif->ubi_name);
+
+out_uif:
+	ubi_kfree(uif);
+	return err;
+}
+
+void ubi_uif_close(const struct ubi_info *ubi)
+{
+	struct ubi_uif_info *uif = ubi->uif;
+
+	dbg_uif("close the user interface unit for %s", uif->ubi_name);
+
+	delete_volumes(ubi);
+	ubi_sysfs_close(ubi);
+	ubi_cdev_close(ubi);
+	ubi_kfree(uif->ubi_name);
+	ubi_kfree(uif);
+}
+
+int __init ubi_uif_global_init(void)
+{
+	return ubi_sysfs_global_init();
+}
+
+void ubi_uif_global_close(void)
+{
+	ubi_sysfs_global_close();
+}
+
+/**
+ * delete_volumes - delete all the volume information.
+ *
+ * @ubi: the UBI device description object
+ */
+static void delete_volumes(const struct ubi_info *ubi)
+{
+	struct ubi_uif_volume *vol, *vol_tmp;
+	struct ubi_uif_info *uif = ubi->uif;
+
+	list_for_each_entry_safe(vol, vol_tmp, &uif->volumes, list) {
+		list_del(&vol->list);
+		vol->removed = 1;
+		ubi_gluebi_vol_close(vol);
+		ubi_cdev_vol_close(vol);
+		ubi_sysfs_vol_close(vol);
+	}
+}
+
+/**
+ * fill_ubi_vol_info - fill a "user volume information data structure".
+ *
+ * @ubi: the UBI device description object
+ * @vol_id: ID of the volume
+ * @vi: the volume information to fill
+ *
+ * This function must be invoked only when the @vol_id volume cannot go away.
+ */
+static void fill_ubi_vol_info(const struct ubi_info *ubi, int vol_id,
+			      struct ubi_vol_info *vi)
+{
+	const struct ubi_vtbl_vtr *vtr;
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	vi->vol_id = vol_id;
+	vi->ubi_num = ubi->ubi_num;
+	vi->size = vtr->reserved_pebs;
+	vi->used_bytes = vtr->used_bytes;
+	vi->vol_type = vtr->vol_type;
+	vi->corrupted = vtr->corrupted;
+	vi->upd_marker = vtr->upd_marker;
+	vi->alignment = vtr->alignment;
+	vi->usable_leb_size = vtr->usable_leb_size;
+	vi->name_len = vtr->name_len;
+	vi->name = vtr->name;
+	vi->cdev = MKDEV(ubi->uif->major, vi->vol_id + 1);
+}

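Editorial sketch (not part of the patch): the open-mode accounting that
ubi_open_volume() and ubi_close_volume() perform under vol_lock can be
modelled in plain userspace C. The names below (vol_state, vol_open,
vol_close, MODE_*) are hypothetical stand-ins for illustration only, and the
sketch leaves out locking, the volume lookup and the first-open check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-ins for the patch's types; no locking here. */
enum open_mode { MODE_READONLY = 1, MODE_READWRITE, MODE_EXCLUSIVE };

struct vol_state {
	int readers;	/* number of read-only openers */
	int writers;	/* number of read-write openers (at most one) */
	int exclusive;	/* set while an exclusive opener holds the volume */
	int updating;	/* set while a volume update is in progress */
};

/* Returns 0 and records the opener, or -EBUSY/-EINVAL if not permitted. */
int vol_open(struct vol_state *v, enum open_mode mode)
{
	/* A volume that is being updated cannot be opened at all. */
	if (v->updating)
		return -EBUSY;

	switch (mode) {
	case MODE_READONLY:
		/* Readers coexist with other readers and with one writer. */
		if (v->exclusive)
			return -EBUSY;
		v->readers += 1;
		return 0;
	case MODE_READWRITE:
		/* Only one writer at a time, and never next to exclusive. */
		if (v->exclusive || v->writers > 0)
			return -EBUSY;
		v->writers += 1;
		return 0;
	case MODE_EXCLUSIVE:
		/* Exclusive mode requires that nobody else has it open. */
		if (v->exclusive || v->writers || v->readers)
			return -EBUSY;
		v->exclusive = 1;
		return 0;
	}
	return -EINVAL;
}

/* Undoes the accounting done by a successful vol_open() in the same mode. */
void vol_close(struct vol_state *v, enum open_mode mode)
{
	switch (mode) {
	case MODE_READONLY:
		assert(v->readers > 0);
		v->readers -= 1;
		break;
	case MODE_READWRITE:
		assert(v->writers > 0);
		v->writers -= 1;
		break;
	case MODE_EXCLUSIVE:
		assert(v->exclusive);
		v->exclusive = 0;
		break;
	}
}
```

With a zero-initialised vol_state, a reader and a writer can both open the
volume, a second writer or an exclusive opener gets -EBUSY, and exclusive
mode becomes available again once every other opener has closed.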

* [PATCH 37/44 take 2] [UBI] sysfs handling unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (35 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 36/44 take 2] [UBI] user-interfaces unit implementation Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 38/44 take 2] [UBI] sysfs handling unit implementation Artem Bityutskiy
                   ` (9 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/sysfs.h tmp-to/drivers/mtd/ubi/sysfs.h
--- tmp-from/drivers/mtd/ubi/sysfs.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/sysfs.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,82 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * This is a part of the UBI user interface unit and contains all the
+ * sysfs-related stuff.
+ */
+
+#ifndef __UBI_SYSFS_H__
+#define __UBI_SYSFS_H__
+
+#include <linux/init.h>
+
+struct ubi_info;
+struct ubi_uif_volume;
+
+/**
+ * ubi_sysfs_vol_init - initialize sysfs for an UBI volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol: user interfaces unit volume description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_sysfs_vol_init(const struct ubi_info *ubi, struct ubi_uif_volume *vol);
+
+/**
+ * ubi_sysfs_vol_close - close sysfs for an UBI volume.
+ *
+ * @vol: user interfaces unit volume description object
+ */
+void ubi_sysfs_vol_close(struct ubi_uif_volume *vol);
+
+/**
+ * ubi_sysfs_init - initialize sysfs for an UBI device.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_sysfs_init(const struct ubi_info *ubi);
+
+/**
+ * ubi_sysfs_close - close sysfs for an UBI device.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_sysfs_close(const struct ubi_info *ubi);
+
+/**
+ * ubi_sysfs_global_init - initialize UBI sysfs support.
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int __init ubi_sysfs_global_init(void);
+
+/**
+ * ubi_sysfs_global_close - close UBI sysfs support.
+ */
+void ubi_sysfs_global_close(void);
+
+#endif /* !__UBI_SYSFS_H__ */

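Editorial sketch (not part of the patch): the header above pairs each init
function, which returns zero or a negative error code, with a close function.
One common way to honour that contract is to undo the completed steps in
reverse order when a later step fails. The userspace model below illustrates
the idea; fake_dev, fake_create_file, fake_remove_file, fake_sysfs_init and
fake_sysfs_close are hypothetical names, and no real sysfs call is involved.

```c
#include <assert.h>
#include <errno.h>

#define NATTRS 3

/* Hypothetical stand-in for a sysfs object: tracks which files exist. */
struct fake_dev {
	int registered;
	int file[NATTRS];
	int fail_at;	/* index of the file whose creation fails, or -1 */
};

/* Pretend to create attribute file idx; fails on request with -ENOMEM. */
int fake_create_file(struct fake_dev *d, int idx)
{
	assert(idx >= 0 && idx < NATTRS);
	if (idx == d->fail_at)
		return -ENOMEM;
	d->file[idx] = 1;
	return 0;
}

/* Pretend to remove attribute file idx. */
void fake_remove_file(struct fake_dev *d, int idx)
{
	assert(idx >= 0 && idx < NATTRS);
	d->file[idx] = 0;
}

/* Returns zero on success and a negative error code on failure. */
int fake_sysfs_init(struct fake_dev *d)
{
	int err;

	d->registered = 1;

	err = fake_create_file(d, 0);
	if (err)
		goto out_unregister;
	err = fake_create_file(d, 1);
	if (err)
		goto out_file0;
	err = fake_create_file(d, 2);
	if (err)
		goto out_file1;
	return 0;

	/* Each label undoes one earlier step, then falls through. */
out_file1:
	fake_remove_file(d, 1);
out_file0:
	fake_remove_file(d, 0);
out_unregister:
	d->registered = 0;
	return err;
}

/* Mirror of a successful fake_sysfs_init(): tear down in reverse order. */
void fake_sysfs_close(struct fake_dev *d)
{
	fake_remove_file(d, 2);
	fake_remove_file(d, 1);
	fake_remove_file(d, 0);
	d->registered = 0;
}
```

The point of the design is that a failed init leaves nothing behind, so the
caller needs to call the close function only after a successful init.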

* [PATCH 38/44 take 2] [UBI] sysfs handling unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (36 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 37/44 take 2] [UBI] sysfs handling unit header Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 39/44 take 2] [UBI] character devices handling sub-unit header Artem Bityutskiy
                   ` (8 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/sysfs.c tmp-to/drivers/mtd/ubi/sysfs.c
--- tmp-from/drivers/mtd/ubi/sysfs.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/sysfs.c	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,614 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/init.h>
+#include <linux/kobject.h>
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/stat.h>
+#include <linux/types.h>
+#include <mtd/ubi-header.h>
+#include "ubi.h"
+#include "uif.h"
+#include "upd.h"
+#include "debug.h"
+#include "sysfs.h"
+#include "io.h"
+#include "wl.h"
+#include "badeb.h"
+#include "account.h"
+#include "vtbl.h"
+#include "alloc.h"
+#include "background.h"
+
+static struct class *ubi_class;
+
+static ssize_t ubi_version_show(struct class *class, char *buf);
+
+/* Class attributes corresponding to files in '/<sysfs>/class/ubi/' */
+static struct class_attribute ubi_version =
+	__ATTR(version, S_IRUGO, ubi_version_show, NULL);
+
+int __init ubi_sysfs_global_init(void)
+{
+	int err;
+
+	ubi_class = class_create(THIS_MODULE, UBI_NAME_STR);
+	if (IS_ERR(ubi_class)) {
+		err = PTR_ERR(ubi_class);
+		goto out;
+	}
+
+	err = class_create_file(ubi_class, &ubi_version);
+	if (err)
+		goto out_class;
+
+	return 0;
+
+out_class:
+	class_destroy(ubi_class);
+out:
+	return err;
+}
+
+void ubi_sysfs_global_close(void)
+{
+	class_remove_file(ubi_class, &ubi_version);
+	class_destroy(ubi_class);
+}
+
+static void dev_release(struct class_device *dev);
+static ssize_t dev_eraseblock_size_show(struct class_device *dev, char *buf);
+static ssize_t dev_avail_eraseblocks_show(struct class_device *dev, char *buf);
+static ssize_t dev_total_eraseblocks_show(struct class_device *dev, char *buf);
+static ssize_t dev_volumes_count_show(struct class_device *dev, char *buf);
+static ssize_t dev_max_ec_show(struct class_device *dev, char *buf);
+static ssize_t dev_update_show(struct class_device *dev, char *buf);
+static ssize_t dev_reserved_for_bad_show(struct class_device *dev, char *buf);
+static ssize_t dev_bad_peb_count_show(struct class_device *dev, char *buf);
+static ssize_t dev_max_vol_count_show(struct class_device *dev, char *buf);
+static ssize_t dev_min_io_size_show(struct class_device *dev, char *buf);
+static ssize_t dev_bgt_enabled_show(struct class_device *dev, char *buf);
+static ssize_t dev_bgt_enabled_store(struct class_device *dev, const char *buf,
+				     size_t count);
+
+/*
+ * Class device attributes corresponding to files in '/<sysfs>/class/ubi/ubiX'.
+ */
+static struct class_device_attribute dev_eraseblock_size =
+	__ATTR(eraseblock_size, S_IRUGO, dev_eraseblock_size_show, NULL);
+static struct class_device_attribute dev_avail_eraseblocks =
+	__ATTR(avail_eraseblocks, S_IRUGO, dev_avail_eraseblocks_show, NULL);
+static struct class_device_attribute dev_total_eraseblocks =
+	__ATTR(total_eraseblocks, S_IRUGO, dev_total_eraseblocks_show, NULL);
+static struct class_device_attribute dev_volumes_count =
+	__ATTR(volumes_count, S_IRUGO, dev_volumes_count_show, NULL);
+static struct class_device_attribute dev_max_ec =
+	__ATTR(max_ec, S_IRUGO, dev_max_ec_show, NULL);
+static struct class_device_attribute dev_update =
+	__ATTR(update, S_IRUGO, dev_update_show, NULL);
+static struct class_device_attribute dev_reserved_for_bad =
+	__ATTR(reserved_for_bad, S_IRUGO, dev_reserved_for_bad_show, NULL);
+static struct class_device_attribute dev_bad_peb_count =
+	__ATTR(bad_peb_count, S_IRUGO, dev_bad_peb_count_show, NULL);
+static struct class_device_attribute dev_max_vol_count =
+	__ATTR(max_vol_count, S_IRUGO, dev_max_vol_count_show, NULL);
+static struct class_device_attribute dev_min_io_size =
+	__ATTR(min_io_size, S_IRUGO, dev_min_io_size_show, NULL);
+static struct class_device_attribute dev_bgt_enabled =
+	__ATTR(bgt_enabled, S_IRUGO | S_IWUSR,
+	       dev_bgt_enabled_show, dev_bgt_enabled_store);
+
+int ubi_sysfs_init(const struct ubi_info *ubi)
+{
+	int err;
+	struct ubi_uif_info *uif = ubi->uif;
+
+	uif->dev.release = dev_release;
+	uif->dev.devt = MKDEV(uif->major, 0);
+	uif->dev.class = ubi_class;
+	sprintf(&uif->dev.class_id[0], UBI_NAME_STR"%d", ubi->ubi_num);
+	err = class_device_register(&uif->dev);
+	if (err)
+		goto out;
+
+	err = class_device_create_file(&uif->dev, &dev_eraseblock_size);
+	if (err)
+		goto out_unregister;
+	err = class_device_create_file(&uif->dev, &dev_avail_eraseblocks);
+	if (err)
+		goto out_eraseblock_size;
+	err = class_device_create_file(&uif->dev, &dev_total_eraseblocks);
+	if (err)
+		goto out_avail_eraseblocks;
+	err = class_device_create_file(&uif->dev, &dev_volumes_count);
+	if (err)
+		goto out_total_eraseblocks;
+	err = class_device_create_file(&uif->dev, &dev_max_ec);
+	if (err)
+		goto out_volumes_count;
+	err = class_device_create_file(&uif->dev, &dev_update);
+	if (err)
+		goto out_volumes_max_ec;
+	err = class_device_create_file(&uif->dev, &dev_reserved_for_bad);
+	if (err)
+		goto out_update;
+	err = class_device_create_file(&uif->dev, &dev_bad_peb_count);
+	if (err)
+		goto out_reserved_for_bad;
+	err = class_device_create_file(&uif->dev, &dev_max_vol_count);
+	if (err)
+		goto out_bad_peb_count;
+	err = class_device_create_file(&uif->dev, &dev_min_io_size);
+	if (err)
+		goto out_max_vol_count;
+	err = class_device_create_file(&uif->dev, &dev_bgt_enabled);
+	if (err)
+		goto out_min_io_size;
+
+	return 0;
+
+out_min_io_size:
+	class_device_remove_file(&uif->dev, &dev_min_io_size);
+out_max_vol_count:
+	class_device_remove_file(&uif->dev, &dev_max_vol_count);
+out_bad_peb_count:
+	class_device_remove_file(&uif->dev, &dev_bad_peb_count);
+out_reserved_for_bad:
+	class_device_remove_file(&uif->dev, &dev_reserved_for_bad);
+out_update:
+	class_device_remove_file(&uif->dev, &dev_update);
+out_volumes_max_ec:
+	class_device_remove_file(&uif->dev, &dev_max_ec);
+out_volumes_count:
+	class_device_remove_file(&uif->dev, &dev_volumes_count);
+out_total_eraseblocks:
+	class_device_remove_file(&uif->dev, &dev_total_eraseblocks);
+out_avail_eraseblocks:
+	class_device_remove_file(&uif->dev, &dev_avail_eraseblocks);
+out_eraseblock_size:
+	class_device_remove_file(&uif->dev, &dev_eraseblock_size);
+out_unregister:
+	class_device_unregister(&uif->dev);
+out:
+	ubi_err("failed to initialize sysfs for UBI device %d", ubi->ubi_num);
+	return err;
+}
+
+void ubi_sysfs_close(const struct ubi_info *ubi)
+{
+	struct ubi_uif_info *uif = ubi->uif;
+
+	class_device_remove_file(&uif->dev, &dev_bgt_enabled);
+	class_device_remove_file(&uif->dev, &dev_min_io_size);
+	class_device_remove_file(&uif->dev, &dev_max_vol_count);
+	class_device_remove_file(&uif->dev, &dev_bad_peb_count);
+	class_device_remove_file(&uif->dev, &dev_reserved_for_bad);
+	class_device_remove_file(&uif->dev, &dev_update);
+	class_device_remove_file(&uif->dev, &dev_max_ec);
+	class_device_remove_file(&uif->dev, &dev_volumes_count);
+	class_device_remove_file(&uif->dev, &dev_total_eraseblocks);
+	class_device_remove_file(&uif->dev, &dev_avail_eraseblocks);
+	class_device_remove_file(&uif->dev, &dev_eraseblock_size);
+	class_device_unregister(&uif->dev);
+}
+
+static void vol_release(struct class_device *dev);
+static ssize_t vol_reserved_ebs_show(struct class_device *dev, char *buf);
+static ssize_t vol_type_show(struct class_device *dev, char *buf);
+static ssize_t vol_name_show(struct class_device *dev, char *buf);
+static ssize_t vol_corrupted_show(struct class_device *dev, char *buf);
+static ssize_t vol_alignment_show(struct class_device *dev, char *buf);
+static ssize_t vol_usable_eb_size_show(struct class_device *dev, char *buf);
+static ssize_t vol_data_bytes_show(struct class_device *dev, char *buf);
+static ssize_t vol_upd_marker_show(struct class_device *dev, char *buf);
+
+/*
+ * Class device attributes corresponding to files in
+ * '/<sysfs>/class/ubi/ubiX/Y'.
+ */
+static struct class_device_attribute vol_reserved_ebs =
+	__ATTR(reserved_ebs, S_IRUGO, vol_reserved_ebs_show, NULL);
+static struct class_device_attribute vol_type =
+	__ATTR(type, S_IRUGO, vol_type_show, NULL);
+static struct class_device_attribute vol_name =
+	__ATTR(name, S_IRUGO, vol_name_show, NULL);
+static struct class_device_attribute vol_corrupted =
+	__ATTR(corrupted, S_IRUGO, vol_corrupted_show, NULL);
+static struct class_device_attribute vol_alignment =
+	__ATTR(alignment, S_IRUGO, vol_alignment_show, NULL);
+static struct class_device_attribute vol_usable_eb_size =
+	__ATTR(usable_eb_size, S_IRUGO, vol_usable_eb_size_show, NULL);
+static struct class_device_attribute vol_data_bytes =
+	__ATTR(data_bytes, S_IRUGO, vol_data_bytes_show, NULL);
+static struct class_device_attribute vol_upd_marker =
+	__ATTR(upd_marker, S_IRUGO, vol_upd_marker_show, NULL);
+
+/*
+ * Note, this function does not free allocated resources in case of failure;
+ * the caller does it. Freeing them here would invoke release(), and the
+ * caller would then oops.
+ */
+int ubi_sysfs_vol_init(const struct ubi_info *ubi, struct ubi_uif_volume *vol)
+{
+	int err;
+
+	vol->dev.release = vol_release;
+	vol->dev.parent = &ubi->uif->dev;
+	vol->dev.devt = MKDEV(ubi->uif->major, vol->vol_id + 1);
+	vol->dev.class = ubi_class;
+	sprintf(&vol->dev.class_id[0], "%d", vol->vol_id);
+	err = class_device_register(&vol->dev);
+	if (err)
+		return err;
+
+	err = class_device_create_file(&vol->dev, &vol_reserved_ebs);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_type);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_name);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_corrupted);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_alignment);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_usable_eb_size);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_data_bytes);
+	if (err)
+		return err;
+	err = class_device_create_file(&vol->dev, &vol_upd_marker);
+	if (err)
+		return err;
+	return 0;
+}
+
+void ubi_sysfs_vol_close(struct ubi_uif_volume *vol)
+{
+	class_device_remove_file(&vol->dev, &vol_upd_marker);
+	class_device_remove_file(&vol->dev, &vol_data_bytes);
+	class_device_remove_file(&vol->dev, &vol_usable_eb_size);
+	class_device_remove_file(&vol->dev, &vol_alignment);
+	class_device_remove_file(&vol->dev, &vol_corrupted);
+	class_device_remove_file(&vol->dev, &vol_name);
+	class_device_remove_file(&vol->dev, &vol_type);
+	class_device_remove_file(&vol->dev, &vol_reserved_ebs);
+	class_device_unregister(&vol->dev);
+}
+
+/**
+ * dev2ubi - find UBI device description object by the pointer to the class
+ * device object.
+ *
+ * @dev: class device object pointer
+ *
+ * This function returns a pointer to the UBI device description object.
+ */
+static inline struct ubi_info *dev2ubi(struct class_device *dev)
+{
+	struct ubi_uif_info *uif;
+
+	uif = container_of(dev, struct ubi_uif_info, dev);
+	return uif->ubi;
+}
+
+/* "Show" method for files in '/<sysfs>/class/ubi/' */
+static ssize_t ubi_version_show(struct class *class, char *buf)
+{
+	return sprintf(buf, "%d\n", UBI_VERSION);
+}
+
+/* "Release" method for UBI devices */
+static void dev_release(struct class_device *dev)
+{
+	return;
+}
+
+/* "Show" method for files in '/<sysfs>/class/ubi/ubiX/' */
+static ssize_t dev_eraseblock_size_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->io->leb_size);
+}
+
+static ssize_t dev_avail_eraseblocks_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->acc->avail_pebs);
+}
+
+static ssize_t dev_total_eraseblocks_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->io->good_peb_count);
+}
+
+static ssize_t dev_volumes_count_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->acc->uvol_count);
+}
+
+static ssize_t dev_max_ec_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->wl->max_ec);
+}
+
+static ssize_t dev_update_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+	int vol_id = ubi->upd->vol_id;
+
+	if (vol_id == -1)
+		return 0;
+	return sprintf(buf, "%d\n", vol_id);
+}
+
+static ssize_t dev_reserved_for_bad_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->beb->reserved_pebs);
+}
+
+static ssize_t dev_bad_peb_count_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->io->bad_peb_count);
+}
+
+static ssize_t dev_max_vol_count_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->acc->max_volumes);
+}
+
+static ssize_t dev_min_io_size_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->io->min_io_size);
+}
+
+static ssize_t dev_bgt_enabled_show(struct class_device *dev, char *buf)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	return sprintf(buf, "%d\n", ubi->bgt->enabled);
+}
+
+static ssize_t dev_bgt_enabled_store(struct class_device *dev, const char *buf,
+				     size_t count)
+{
+	const struct ubi_info *ubi = dev2ubi(dev);
+
+	if (count > 2)
+		return -EINVAL;
+
+	if (count == 2 && buf[1] != '\n')
+		return -EINVAL;
+
+	if (buf[0] == '1')
+		ubi_bgt_enable(ubi);
+	else if (buf[0] == '0')
+		ubi_bgt_disable(ubi);
+	else
+		return -EINVAL;
+
+	return count;
+}
+
+/**
+ * dev2vol -- find volume description object by the pointer to the class device
+ * object.
+ *
+ * @dev: class device object pointer
+ *
+ * This function returns a pointer to the UBI volume description object.
+ */
+static inline struct ubi_uif_volume *dev2vol(struct class_device *dev)
+{
+	return container_of(dev, struct ubi_uif_volume, dev);
+}
+
+/* Release method for volume devices */
+static void vol_release(struct class_device *dev)
+{
+	const struct ubi_uif_volume *vol = dev2vol(dev);
+
+	dbg_uif("release volume %d", vol->vol_id);
+	ubi_kfree(vol);
+}
+
+/*
+ * "Show" methods for files in '/<sysfs>/class/ubi/ubiX/Y/'.
+ *
+ * Consider a situation:
+ * A. process 1 opens a sysfs file related to volume Y, say
+ *    /<sysfs>/class/ubi/ubiX/Y/reserved_ebs;
+ * B. process 2 removes volume Y;
+ * C. process 1 starts reading the /<sysfs>/class/ubi/ubiX/Y/reserved_ebs file.
+ *
+ * In this situation the read must fail with an error. Every "show" method
+ * therefore takes 'vol_lock' and returns %-EIO if the 'removed' flag of the
+ * UBI UIF volume information structure is set.
+ */
+
+static ssize_t vol_reserved_ebs_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	ret = sprintf(buf, "%d\n", vtr->reserved_pebs);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_type_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const char *tp;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	tp = vtr->vol_type == UBI_DYNAMIC_VOLUME ? "dynamic" : "static";
+	ret = sprintf(buf, "%s\n", tp);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_name_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	ret = sprintf(buf, "%s\n", vtr->name);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_corrupted_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	ret = sprintf(buf, "%d\n", vtr->corrupted);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_alignment_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	ret = sprintf(buf, "%d\n", vtr->alignment);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_usable_eb_size_show(struct class_device *dev, char *buf)
+{
+	int ret, usable_eb_size;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+	const struct ubi_io_info *io = vol->ubi->io;
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	usable_eb_size = io->leb_size - vtr->data_pad;
+	ret = sprintf(buf, "%d\n", usable_eb_size);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_data_bytes_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	ret = sprintf(buf, "%lld\n", vtr->used_bytes);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
+
+static ssize_t vol_upd_marker_show(struct class_device *dev, char *buf)
+{
+	int ret;
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_uif_volume *vol = dev2vol(dev);
+
+	spin_lock(&vol->vol_lock);
+	if (vol->removed) {
+		spin_unlock(&vol->vol_lock);
+		dbg_uif("volume %d was removed", vol->vol_id);
+		return -EIO;
+	}
+	vtr = ubi_vtbl_get_vtr(vol->ubi, vol->vol_id);
+	ret = sprintf(buf, "%d\n", vtr->upd_marker);
+	spin_unlock(&vol->vol_lock);
+	return ret;
+}
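
The 'removed' flag scheme described in the comment above the volume "show" methods can be sketched in user space as follows. This is only an illustration under stated assumptions, not the UBI code: `struct demo_volume`, `demo_volume_init()`, `demo_lock()`, `demo_unlock()`, `demo_remove()` and `demo_reserved_ebs_show()` are hypothetical names, a C11 `atomic_flag` stands in for the kernel spinlock, and `snprintf()` replaces the sysfs buffer handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ubi_uif_volume. */
struct demo_volume {
	atomic_flag lock;	/* plays the role of 'vol_lock' */
	int removed;		/* set once the volume has been deleted */
	int vol_id;
	int reserved_pebs;
};

static void demo_volume_init(struct demo_volume *vol, int vol_id, int pebs)
{
	atomic_flag_clear(&vol->lock);
	vol->removed = 0;
	vol->vol_id = vol_id;
	vol->reserved_pebs = pebs;
}

/* Minimal spinlock built on atomic_flag (assumption: no kernel spinlock here). */
static void demo_lock(struct demo_volume *vol)
{
	while (atomic_flag_test_and_set_explicit(&vol->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void demo_unlock(struct demo_volume *vol)
{
	atomic_flag_clear_explicit(&vol->lock, memory_order_release);
}

/* Removal path: mark the volume as gone under the same lock. */
static void demo_remove(struct demo_volume *vol)
{
	demo_lock(vol);
	vol->removed = 1;
	demo_unlock(vol);
}

/* Analogue of vol_reserved_ebs_show(): fail with -EIO after removal. */
static int demo_reserved_ebs_show(struct demo_volume *vol, char *buf,
				  size_t size)
{
	int ret;

	assert(vol && buf);
	demo_lock(vol);
	if (vol->removed) {
		demo_unlock(vol);
		return -EIO;
	}
	ret = snprintf(buf, size, "%d\n", vol->reserved_pebs);
	demo_unlock(vol);
	return ret;
}
```

In this sketch a reader that races with removal either finishes before `demo_remove()` takes the lock or sees `removed` set and gets `-EIO`; the flag is only read and written with the lock held.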

* [PATCH 39/44 take 2] [UBI] character devices handling sub-unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (37 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 38/44 take 2] [UBI] sysfs handling unit implementation Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 40/44 take 2] [UBI] character devices handling sub-unit implementation Artem Bityutskiy
                   ` (7 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/cdev.h tmp-to/drivers/mtd/ubi/cdev.h
--- tmp-from/drivers/mtd/ubi/cdev.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/cdev.h	2007-02-17 18:07:27.000000000 +0200
@@ -0,0 +1,82 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * This is a part of the user interface unit and includes all the character
+ * device-related stuff.
+ */
+
+#ifndef __UBI_CDEV_H__
+#define __UBI_CDEV_H__
+
+/* Maximum sequence numbers of UBI and volume character device IOCTLs */
+#define UBI_CDEV_IOC_MAX_SEQ 2
+
+#ifndef CONFIG_MTD_UBI_USERSPACE_IO
+#define VOL_CDEV_IOC_MAX_SEQ 1
+#else
+#define VOL_CDEV_IOC_MAX_SEQ 2
+#endif
+
+struct ubi_info;
+struct inode;
+struct file;
+struct ubi_uif_volume;
+
+/**
+ * ubi_cdev_vol_init - initialize all the character device-related stuff for
+ * an UBI volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol: user interfaces unit volume description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_cdev_vol_init(const struct ubi_info *ubi, struct ubi_uif_volume *vol);
+
+/**
+ * ubi_cdev_vol_close - close all the character device-related stuff for an
+ * UBI volume.
+ *
+ * @vol: user interfaces unit volume description object
+ */
+void ubi_cdev_vol_close(struct ubi_uif_volume *vol);
+
+/**
+ * ubi_cdev_init - initialize all the character device-related stuff for
+ * an UBI device.
+ *
+ * @ubi: the UBI device description object
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+int ubi_cdev_init(struct ubi_info *ubi);
+
+/**
+ * ubi_cdev_close - close all the character device-related stuff for an UBI
+ * device.
+ *
+ * @ubi: the UBI device description object
+ */
+void ubi_cdev_close(const struct ubi_info *ubi);
+
+#endif /* !__UBI_CDEV_H__ */
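
The maximum sequence numbers defined in cdev.h bound the command number that an ioctl handler accepts. A minimal user-space sketch of such a check, using the macros from `<linux/ioctl.h>`, might look like this. Everything prefixed `DEMO_`/`demo_` is hypothetical and is not taken from the UBI headers; the magic value and the limit are chosen for the example only. Because `_IOC_DIR()` yields a bit mask, the direction is tested with a bitwise AND.

```c
#include <assert.h>
#include <errno.h>
#include <linux/ioctl.h>
#include <stdint.h>

/* Hypothetical values, chosen for this sketch only. */
#define DEMO_IOC_MAGIC   'O'
#define DEMO_IOC_MAX_SEQ 2
#define DEMO_IOCVOLUP    _IOW(DEMO_IOC_MAGIC, 0, int64_t)

/*
 * Return 0 if 'cmd' belongs to the demo command set and -ENOTTY otherwise.
 * On success '*to_user' is set to 1 when the command transfers data back
 * to user space (_IOC_READ) and to 0 when it does not.
 */
static int demo_check_cmd(unsigned int cmd, int *to_user)
{
	assert(to_user);

	if (_IOC_TYPE(cmd) != DEMO_IOC_MAGIC ||
	    _IOC_NR(cmd) > DEMO_IOC_MAX_SEQ)
		return -ENOTTY;

	/* _IOC_DIR() is a bit mask, so test it with '&', not '&&' */
	*to_user = (_IOC_DIR(cmd) & _IOC_READ) != 0;
	return 0;
}
```

A command built with `_IOW()` passes the check with `*to_user` cleared, one built with `_IOR()` passes with it set, and a command with a foreign magic or a sequence number above the limit is rejected with `-ENOTTY`.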

* [PATCH 40/44 take 2] [UBI] character devices handling sub-unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (38 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 39/44 take 2] [UBI] character devices handling sub-unit header Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:57 ` [PATCH 41/44 take 2] [UBI] gluebi unit header Artem Bityutskiy
                   ` (6 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/cdev.c tmp-to/drivers/mtd/ubi/cdev.c
--- tmp-from/drivers/mtd/ubi/cdev.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/cdev.c	2007-02-17 18:07:28.000000000 +0200
@@ -0,0 +1,1033 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/stat.h>
+#include <linux/sched.h>
+#include <linux/ioctl.h>
+#include <linux/cdev.h>
+#include <linux/capability.h>
+#include <linux/types.h>
+#include <linux/mtd/ubi.h>
+#include <mtd/ubi-user.h>
+#include <mtd/ubi-header.h>
+#include <asm/uaccess.h>
+#include <asm/div64.h>
+#include "ubi.h"
+#include "uif.h"
+#include "upd.h"
+#include "eba.h"
+#include "cdev.h"
+#include "account.h"
+#include "debug.h"
+#include "alloc.h"
+#include "vtbl.h"
+#include "io.h"
+#include "volmgmt.h"
+#include "misc.h"
+
+static int vol_cdev_open(struct inode *inode, struct file *file);
+static int vol_cdev_release(struct inode *inode, struct file *file);
+static loff_t vol_cdev_llseek(struct file *file, loff_t offset, int origin);
+static ssize_t vol_cdev_read(struct file *file, char __user *buf,
+			     size_t count, loff_t *offp);
+static ssize_t vol_cdev_write(struct file *file, const char __user *buf,
+			      size_t count, loff_t *offp);
+static int vol_cdev_ioctl(struct inode *inode, struct file *file,
+			  unsigned int cmd, unsigned long arg);
+#ifdef CONFIG_MTD_UBI_USERSPACE_IO
+static ssize_t vol_cdev_direct_write(struct file *file, const char __user *buf,
+				     size_t count, loff_t *offp);
+#else
+#define vol_cdev_direct_write(file, buf, count, offp) (-EROFS)
+#endif
+
+/* Volume character device operations */
+static struct file_operations vol_cdev_operations = {
+	.owner   = THIS_MODULE,
+	.open    = vol_cdev_open,
+	.release = vol_cdev_release,
+	.llseek  = vol_cdev_llseek,
+	.read    = vol_cdev_read,
+	.write   = vol_cdev_write,
+	.ioctl   = vol_cdev_ioctl
+};
+
+int ubi_cdev_vol_init(const struct ubi_info *ubi, struct ubi_uif_volume *vol)
+{
+	int err;
+	const struct ubi_uif_info *uif = ubi->uif;
+
+	cdev_init(&vol->cdev, &vol_cdev_operations);
+	vol->cdev.owner = THIS_MODULE;
+	err = cdev_add(&vol->cdev, MKDEV(uif->major, vol->vol_id + 1), 1);
+	if (err) {
+		ubi_err("cannot add char dev for volume %d", vol->vol_id);
+		goto out;
+	}
+
+	return 0;
+out:
+	cdev_del(&vol->cdev);
+	return err;
+}
+
+void ubi_cdev_vol_close(struct ubi_uif_volume *vol)
+{
+	cdev_del(&vol->cdev);
+}
+
+static int ubi_cdev_ioctl(struct inode *inode, struct file *file,
+			  unsigned int cmd, unsigned long arg);
+
+/* UBI character device operations */
+static struct file_operations ubi_cdev_operations = {
+	.owner = THIS_MODULE,
+	.ioctl = ubi_cdev_ioctl,
+	.llseek = no_llseek
+};
+
+int ubi_cdev_init(struct ubi_info *ubi)
+{
+	int err;
+	dev_t dev;
+	struct ubi_uif_info *uif = ubi->uif;
+	const struct ubi_acc_info *acc = ubi->acc;
+
+	/*
+	 * Major numbers for the UBI character devices are allocated
+	 * dynamically. Major numbers of volume character devices are
+	 * equivalent to ones of the corresponding UBI character device. Minor
+	 * numbers of UBI character devices are 0, while minor numbers of
+	 * volume character devices start from 1. Thus, we allocate one major
+	 * number and acc->max_volumes + 1 minor numbers.
+	 */
+	err = alloc_chrdev_region(&dev, 0, acc->max_volumes + 1, uif->ubi_name);
+	if (err) {
+		ubi_err("cannot register UBI char devs");
+		return err;
+	}
+
+	cdev_init(&uif->cdev, &ubi_cdev_operations);
+	uif->major = MAJOR(dev);
+	uif->cdev.owner = THIS_MODULE;
+
+	dev = MKDEV(uif->major, 0);
+	err = cdev_add(&uif->cdev, dev, 1);
+	if (err) {
+		ubi_err("cannot add char dev %s", uif->ubi_name);
+		goto out_unreg;
+	}
+
+	dbg_cdev("%s major:minor is %u:0", uif->ubi_name, uif->major);
+	return 0;
+
+out_unreg:
+	unregister_chrdev_region(MKDEV(uif->major, 0), acc->max_volumes + 1);
+	return err;
+
+}
+
+void ubi_cdev_close(const struct ubi_info *ubi)
+{
+	struct ubi_uif_info *uif = ubi->uif;
+	const struct ubi_acc_info *acc = ubi->acc;
+
+	cdev_del(&uif->cdev);
+	unregister_chrdev_region(MKDEV(uif->major, 0), acc->max_volumes + 1);
+}
+
+static struct ubi_info *major2ubi_info(int major);
+
+/**
+ * vol_cdev_open - open method of volume character devices.
+ *
+ * @inode: inode of the volume character device
+ * @file: &struct file object of the volume character device
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int vol_cdev_open(struct inode *inode, struct file *file)
+{
+	struct ubi_vol_desc *desc;
+	const struct ubi_info *ubi = major2ubi_info(imajor(inode));
+	int vol_id = iminor(inode) - 1;
+	int mode;
+
+	if (file->f_mode & FMODE_WRITE)
+		mode = UBI_READWRITE;
+	else
+		mode = UBI_READONLY;
+
+	dbg_cdev("open volume %d, mode %d", vol_id, mode);
+
+	desc = ubi_open_volume(ubi->ubi_num, vol_id, mode);
+	if (IS_ERR(desc))
+		return PTR_ERR(desc);
+
+	file->private_data = desc;
+	return 0;
+}
+
+/**
+ * vol_cdev_release - release method of volume character devices.
+ *
+ * @inode: inode of the volume character device
+ * @file: &struct file object of the volume character device
+ */
+static int vol_cdev_release(struct inode *inode, struct file *file)
+{
+	int err = 0;
+	struct ubi_vol_desc *desc = file->private_data;
+	struct ubi_uif_volume *vol = desc->vol;
+
+	dbg_cdev("release volume %d, mode %d", vol->vol_id, desc->mode);
+
+	if (vol->updating) {
+		ubi_warn("update of volume %d was not finished", vol->vol_id);
+		err = ubi_upd_abort(vol->ubi, vol->vol_id);
+		vol->updating = 0;
+	}
+
+	ubi_close_volume(desc);
+	return err;
+}
+
+/**
+ * vol_cdev_llseek - llseek method of volume character devices.
+ *
+ * @file: &struct file object of the volume character device
+ * @offset: file offset
+ * @origin: defines the starting point of the @offset
+ *
+ * If an update is in progress, seeking is prohibited. This function returns
+ * the new offset in case of success and a negative error code in case of
+ * failure.
+ */
+static loff_t vol_cdev_llseek(struct file *file, loff_t offset, int origin)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = file->private_data;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	struct ubi_uif_volume *vol = desc->vol;
+	loff_t new_offset;
+
+	dbg_cdev("seek volume %d, offset %lld, origin %d",
+		 vol->vol_id, offset, origin);
+
+	if (vol->updating) {
+		dbg_cdev("updating");
+		return -EBUSY;
+	}
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol->vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	switch (origin) {
+	case 0: /* SEEK_SET */
+		new_offset = offset;
+		break;
+	case 1: /* SEEK_CUR */
+		new_offset = file->f_pos + offset;
+		break;
+	case 2: /* SEEK_END */
+		new_offset = vtr->used_bytes + offset;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (new_offset < 0 || new_offset > vtr->used_bytes) {
+		dbg_err("bad seek (%lld)", new_offset);
+		return -EINVAL;
+	}
+
+	dbg_cdev("set volume %d offset at %lld", vol->vol_id, new_offset);
+	file->f_pos = new_offset;
+	return new_offset;
+}
+
+/**
+ * vol_cdev_read - read method of volume character devices
+ *
+ * @file: &struct file object of volume character device
+ * @buf: user-space buffer where to put read data
+ * @count: how many bytes to read
+ * @offp: the read position
+ *
+ * This function returns the number of bytes read in case of success and a
+ * negative error code in case of failure.
+ */
+static ssize_t vol_cdev_read(struct file *file, char __user *buf,
+			     size_t count, loff_t *offp)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = file->private_data;
+	struct ubi_uif_volume *vol = desc->vol;
+	const struct ubi_info *ubi = vol->ubi;
+	const struct ubi_io_info *io = ubi->io;
+	int err, lnum, off, len, tbuf_size, vol_id = desc->vol->vol_id;
+	size_t count_save = count;
+	void *tbuf;
+	uint64_t tmp;
+
+	dbg_cdev("request: read %zd bytes from offset %lld of volume %u",
+		 count, *offp, vol_id);
+
+	if (unlikely(count == 0))
+		return 0;
+
+	if (vol->updating) {
+		dbg_err("updating");
+		return -EBUSY;
+	}
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+	ubi_assert(!IS_ERR(vtr));
+	ubi_assert(*offp >= 0 && *offp <= vtr->used_bytes);
+
+	if (vtr->upd_marker) {
+		dbg_err("damaged volume, update marker is set");
+		return -EBADF;
+	}
+
+	if (*offp == vtr->used_bytes)
+		return 0;
+
+	if (unlikely(vtr->corrupted))
+		dbg_err("read from corrupted volume %d", vol_id);
+
+	if (*offp + count > vtr->used_bytes)
+		count_save = count = vtr->used_bytes - *offp;
+
+	/*
+	 * To optimize reading, we read in multiples of the minimum
+	 * input/output unit of the flash.
+	 */
+	tbuf_size = (PAGE_SIZE / io->min_io_size) * io->min_io_size;
+	if (tbuf_size == 0)
+		tbuf_size = io->min_io_size;
+	if (tbuf_size > io->leb_size)
+		tbuf_size = io->leb_size;
+
+	tbuf = ubi_kmalloc(tbuf_size);
+	if (!tbuf)
+		return -ENOMEM;
+
+	/*
+	 * We read in portions of the minimal flash input/output unit. If we are
+	 * requested to read from a non-aligned offset, we first read up to the
+	 * nearest boundary, and later only read in units of 'tbuf_size'.
+	 */
+	if (count > tbuf_size) {
+		int rem;
+
+		tmp = *offp;
+		rem = do_div(tmp, io->min_io_size);
+		if (rem == 0)
+			len = tbuf_size;
+		else
+			len = io->min_io_size - rem;
+	} else
+		len = count;
+
+	tmp = *offp;
+	off = do_div(tmp, vtr->usable_leb_size);
+	lnum = tmp;
+
+	do {
+		cond_resched();
+
+		if (off + len >= vtr->usable_leb_size)
+			len = vtr->usable_leb_size - off;
+
+		err = ubi_eba_read_leb(ubi, vol_id, lnum, tbuf, off, len, 0);
+		if (unlikely(err))
+			break;
+
+		off += len;
+		if (off == vtr->usable_leb_size) {
+			lnum += 1;
+			off -= vtr->usable_leb_size;
+		}
+
+		count -= len;
+		*offp += len;
+
+		if (unlikely(copy_to_user(buf, tbuf, len))) {
+			dbg_err("memory access error");
+			err = -EFAULT;
+			break;
+		}
+
+		buf += len;
+		len = count > tbuf_size ? tbuf_size : count;
+	} while (count);
+
+	ubi_kfree(tbuf);
+	return err ? err : count_save - count;
+}
+
+/**
+ * vol_cdev_write - write method of volume character devices
+ *
+ * @file: &struct file object of volume character device
+ * @buf: user-space buffer with the data to write
+ * @count: how many bytes to write
+ * @offp: the write position
+ *
+ * This function returns the number of bytes written in case of success and a
+ * negative error code in case of failure.
+ */
+static ssize_t vol_cdev_write(struct file *file, const char __user *buf,
+			      size_t count, loff_t *offp)
+{
+	int err = 0;
+	struct ubi_vol_desc *desc = file->private_data;
+	struct ubi_uif_volume *vol = desc->vol;
+	const struct ubi_info *ubi = vol->ubi;
+
+	dbg_cdev("requested: write %zd bytes to offset %lld of volume %u",
+		 count, *offp, vol->vol_id);
+
+	if (!vol->updating)
+		return vol_cdev_direct_write(file, buf, count, offp);
+
+	err = ubi_upd_write_data(ubi, vol->vol_id, buf, count);
+	if (err < 0) {
+		dbg_err("cannot write %zd bytes of update data", count);
+		return err;
+	}
+
+	if (err) {
+		/* Update is finished */
+		vol->updating = 0;
+
+		err = ubi_check_volume(ubi, vol->vol_id);
+		if (err < 0)
+			return err;
+
+		if (err == 1) {
+			ubi_warn("volume %d on UBI device %d is corrupted",
+				 vol->vol_id, ubi->ubi_num);
+			err = ubi_vtbl_set_corrupted(ubi, vol->vol_id);
+			if (err)
+				return err;
+		}
+		vol->checked = 1;
+	}
+
+	*offp += count;
+	return count;
+}
+
+/**
+ * vol_cdev_ioctl - ioctl method of volume character devices.
+ *
+ * @inode: inode of the volume character device
+ * @file: &struct file object of the volume character device
+ * @cmd: ioctl command
+ * @arg: ioctl arguments
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int vol_cdev_ioctl(struct inode *inode, struct file *file,
+			  unsigned int cmd, unsigned long arg)
+{
+	int err = 0;
+	struct ubi_vol_desc *desc = file->private_data;
+	const struct ubi_info *ubi = desc->vol->ubi;
+	void __user *argp = (void __user *)arg;
+
+	if (_IOC_NR(cmd) > VOL_CDEV_IOC_MAX_SEQ ||
+	    _IOC_TYPE(cmd) != UBI_VOL_IOC_MAGIC) {
+		dbg_err("bad ioctl command");
+		return -ENOTTY;
+	}
+
+	if (_IOC_DIR(cmd) & _IOC_READ)
+		err = !access_ok(VERIFY_WRITE, argp, _IOC_SIZE(cmd));
+	else if (_IOC_DIR(cmd) & _IOC_WRITE)
+		err = !access_ok(VERIFY_READ, argp, _IOC_SIZE(cmd));
+	if (err) {
+		dbg_err("memory access error");
+		return -EFAULT;
+	}
+
+	switch (cmd) {
+
+	/* Volume update command */
+	case UBI_IOCVOLUP:
+	{
+		int64_t bytes, rsvd_bytes;
+		const struct ubi_vtbl_vtr *vtr;
+
+		if (!capable(CAP_SYS_RESOURCE)) {
+			dbg_err("no rights");
+			err = -EPERM;
+			break;
+		}
+
+		err = copy_from_user(&bytes, argp, sizeof(int64_t));
+		if (err) {
+			dbg_err("memory access error");
+			err = -EFAULT;
+			break;
+		}
+
+		dbg_cdev("update volume %u, %lld bytes",
+			 desc->vol->vol_id, (long long)bytes);
+
+		if (desc->mode == UBI_READONLY) {
+			dbg_err("read-only mode");
+			err = -EROFS;
+			break;
+		}
+
+		vtr = ubi_vtbl_get_vtr(ubi, desc->vol->vol_id);
+		rsvd_bytes = vtr->reserved_pebs *
+				(ubi->io->leb_size - vtr->data_pad);
+		if (bytes < 0 || bytes > rsvd_bytes) {
+			dbg_err("bad data size %lld", (long long)bytes);
+			err = -EINVAL;
+			break;
+		}
+
+		desc->vol->updating = 1;
+		err = ubi_upd_start(ubi, desc->vol->vol_id, bytes);
+		if (bytes == 0)
+			/* It was just volume truncation */
+			desc->vol->updating = 0;
+		file->f_pos = 0;
+		break;
+	}
+
+#ifdef CONFIG_MTD_UBI_USERSPACE_IO
+	/* An eraseblock erasure command */
+	case UBI_IOCEBER:
+	{
+		int32_t lnum;
+		const struct ubi_vtbl_vtr *vtr;
+
+		err = __get_user(lnum, (__user int32_t *)argp);
+		if (err) {
+			dbg_err("memory access error");
+			err = -EFAULT;
+			break;
+		}
+
+		if (desc->mode == UBI_READONLY) {
+			dbg_err("read-only mode");
+			err = -EROFS;
+			break;
+		}
+
+		vtr = ubi_vtbl_get_vtr(ubi, desc->vol->vol_id);
+		ubi_assert(!IS_ERR(vtr));
+		if (lnum < 0 || lnum >= vtr->reserved_pebs) {
+			dbg_err("bad lnum %d", lnum);
+			err = -EINVAL;
+			break;
+		}
+
+		if (vtr->vol_type != UBI_DYNAMIC_VOLUME) {
+			dbg_err("static volume");
+			err = -EROFS;
+			break;
+		}
+
+		dbg_cdev("erase LEB %d:%d", desc->vol->vol_id, lnum);
+		err = ubi_eba_erase_leb(ubi, desc->vol->vol_id, lnum);
+		break;
+	}
+#endif
+
+	default:
+		err = -ENOTTY;
+		break;
+	}
+
+	return err;
+}
+
+static int check_mkvol_req(const struct ubi_info *ubi,
+			   const struct ubi_mkvol_req *req);
+
+static int check_rsvol_req(const struct ubi_info *ubi,
+			   const struct ubi_rsvol_req *req);
+
+/**
+ * ubi_cdev_ioctl - ioctl method of UBI character devices.
+ *
+ * @inode: inode of the UBI character device
+ * @file: &struct file object of the UBI character device
+ * @cmd: ioctl command
+ * @arg: ioctl arguments
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int ubi_cdev_ioctl(struct inode *inode, struct file *file,
+			  unsigned int cmd, unsigned long arg)
+{
+	int err = 0;
+	const struct ubi_info *ubi;
+	void __user *argp = (void __user *)arg;
+
+	if (_IOC_NR(cmd) > UBI_CDEV_IOC_MAX_SEQ ||
+	    _IOC_TYPE(cmd) != UBI_IOC_MAGIC) {
+		dbg_err("bad ioctl command");
+		return -ENOTTY;
+	}
+
+	if (_IOC_DIR(cmd) & _IOC_READ)
+		err = !access_ok(VERIFY_WRITE, argp, _IOC_SIZE(cmd));
+	else if (_IOC_DIR(cmd) & _IOC_WRITE)
+		err = !access_ok(VERIFY_READ, argp, _IOC_SIZE(cmd));
+	if (err)
+		return -EFAULT;
+
+	if (!capable(CAP_SYS_RESOURCE)) {
+		dbg_err("no rights");
+		return -EPERM;
+	}
+
+	ubi = major2ubi_info(imajor(inode));
+	if (IS_ERR(ubi))
+		return PTR_ERR(ubi);
+
+	switch (cmd) {
+
+	/* Create volume command */
+	case UBI_IOCMKVOL:
+	{
+		char *name;
+		struct ubi_vtbl_vtr vtr;
+		struct ubi_mkvol_req req;
+		const struct ubi_io_info *io = ubi->io;
+		int pebs;
+
+		err = __copy_from_user(&req, argp,
+				       sizeof(struct ubi_mkvol_req));
+		if (err) {
+			err = -EFAULT;
+			break;
+		}
+
+		/* Make sure that user passed us sane data */
+		pebs = check_mkvol_req(ubi, &req);
+		if (pebs < 0) {
+			err = pebs;
+			break;
+		}
+
+		name = ubi_kmalloc(req.name_len + 1);
+		if (!name) {
+			err = -ENOMEM;
+			break;
+		}
+
+		err = copy_from_user(name, req.name, req.name_len + 1);
+		if (err) {
+			err = -EFAULT;
+			goto out_free;
+		}
+
+		vtr.reserved_pebs = pebs;
+		vtr.alignment = req.alignment;
+		vtr.vol_type = req.vol_type;
+		vtr.name_len = req.name_len;
+		vtr.name = name;
+		vtr.data_pad = io->leb_size % vtr.alignment;
+
+		dbg_cdev("create volume ID %d, size %d, type %d, name %s",
+		     req.vol_id, vtr.reserved_pebs, vtr.vol_type,
+		     vtr.name);
+
+		err = ubi_vmt_mkvol(ubi, req.vol_id, &vtr);
+		if (err < 0)
+			goto out_free;
+
+		req.vol_id = err;
+		err = ubi_uif_mkvol(ubi, req.vol_id);
+		if (err)
+			goto out_rmvol;
+
+		ubi_kfree(name);
+
+		err = __copy_to_user(argp, &req,
+				       sizeof(struct ubi_mkvol_req));
+		if (err) {
+			err = -EFAULT;
+			break;
+		}
+
+		break;
+
+out_rmvol:
+		ubi_vmt_rmvol(ubi, req.vol_id);
+out_free:
+		ubi_kfree(name);
+		break;
+	}
+
+	/* Remove volume command */
+	case UBI_IOCRMVOL:
+	{
+		int32_t vol_id;
+		struct ubi_vol_desc *desc;
+
+		err = __get_user(vol_id, (__user int32_t *)argp);
+		if (err) {
+			err = -EFAULT;
+			break;
+		}
+
+		dbg_cdev("remove volume %u", vol_id);
+
+		desc = ubi_open_volume(ubi->ubi_num, vol_id, UBI_EXCLUSIVE);
+		if (IS_ERR(desc)) {
+			err = PTR_ERR(desc);
+			break;
+		}
+
+		err = ubi_uif_close_and_rmvol(desc);
+		if (err) {
+			ubi_close_volume(desc);
+			break;
+		}
+
+		err = ubi_vmt_rmvol(ubi, vol_id);
+		break;
+	}
+
+	/* Re-size volume command */
+	case UBI_IOCRSVOL:
+	{
+		int rem, pebs;
+		uint64_t tmp;
+		const struct ubi_vtbl_vtr *vtr;
+		struct ubi_rsvol_req req;
+		struct ubi_vol_desc *desc;
+
+		err = __copy_from_user(&req, argp,
+				       sizeof(struct ubi_rsvol_req));
+		if (err) {
+			err = -EFAULT;
+			break;
+		}
+
+		/* Make sure that user passed us sane data */
+		err = check_rsvol_req(ubi, &req);
+		if (err)
+			break;
+
+		dbg_cdev("re-size volume %d to size %lld bytes",
+			 req.vol_id, (long long)req.bytes);
+
+		desc = ubi_open_volume(ubi->ubi_num, req.vol_id, UBI_EXCLUSIVE);
+		if (IS_ERR(desc)) {
+			err = PTR_ERR(desc);
+			break;
+		}
+
+		vtr = ubi_vtbl_get_vtr(ubi, req.vol_id);
+		ubi_assert(!IS_ERR(vtr));
+
+		tmp = req.bytes;
+		rem = do_div(tmp, vtr->usable_leb_size);
+		pebs = tmp;
+		if (rem)
+			pebs += 1;
+
+		err = ubi_vmt_rsvol(ubi, req.vol_id, pebs);
+		ubi_close_volume(desc);
+		break;
+	}
+
+	default:
+		err = -ENOTTY;
+		break;
+	}
+
+	return err;
+}
+
+/**
+ * major2ubi_info - find the UBI device description by major number of the
+ * corresponding character device.
+ *
+ * @major: major number
+ *
+ * This function returns a pointer to the UBI description object.
+ */
+static struct ubi_info *major2ubi_info(int major)
+{
+	int i;
+
+	for (i = 0; i < ubis_num; i++)
+		if (ubis[i] && ubis[i]->uif->major == major)
+			return ubis[i];
+
+	BUG();
+	return NULL;
+}
+
+/**
+ * check_mkvol_req - check sanity of a volume creation request.
+ *
+ * @ubi: the UBI device description object
+ * @req: the request to check
+ *
+ * This function returns a positive volume size in eraseblocks if the request
+ * is sane, and %-EINVAL if not.
+ */
+static int check_mkvol_req(const struct ubi_info *ubi,
+			   const struct ubi_mkvol_req *req)
+{
+	int n, err, rem, ebs, usable_leb_size;
+	char *name = NULL;
+	const struct ubi_io_info *io = ubi->io;
+	uint64_t bytes;
+
+	name = ubi_kmalloc(req->name_len + 1);
+	if (!name)
+		return -ENOMEM;
+
+	if (req->bytes < 0 || req->alignment < 0 || req->vol_type < 0 ||
+	    req->name_len < 0) {
+		dbg_err("negative values");
+		goto bad;
+	}
+
+	if ((req->vol_id < 0 || req->vol_id >= ubi->acc->max_volumes) &&
+	    req->vol_id != UBI_VOL_NUM_AUTO) {
+		dbg_err("bad vol_id %d", req->vol_id);
+		goto bad;
+	}
+
+	if (req->alignment == 0) {
+		dbg_err("zero alignment");
+		goto bad;
+	}
+
+	if (req->bytes == 0) {
+		dbg_err("zero bytes");
+		goto bad;
+	}
+
+	if (req->vol_type != UBI_DYNAMIC_VOLUME &&
+	    req->vol_type != UBI_STATIC_VOLUME) {
+		dbg_err("bad vol_type");
+		goto bad;
+	}
+
+	if (req->alignment > io->leb_size) {
+		dbg_err("too large alignment");
+		goto bad;
+	}
+
+	n = req->alignment % io->min_io_size;
+	if (req->alignment != 1 && n) {
+		dbg_err("alignment is not multiple of min I/O unit size");
+		goto bad;
+	}
+
+	if (req->name_len > UBI_VOL_NAME_MAX) {
+		dbg_err("too long volume name, max is %d", UBI_VOL_NAME_MAX);
+		goto bad;
+	}
+
+	if (!req->name) {
+		dbg_err("NULL volume name");
+		goto bad;
+	}
+
+	err = copy_from_user(name, req->name, req->name_len + 1);
+	if (err)
+		goto bad;
+
+	n = strnlen(name, req->name_len + 1);
+	if (n != req->name_len) {
+		dbg_err("bad name_len");
+		goto bad;
+	}
+
+	ubi_kfree(name);
+
+	/* Calculate how many eraseblocks are requested */
+	usable_leb_size = io->leb_size - io->leb_size % req->alignment;
+	bytes = req->bytes;
+	rem = do_div(bytes, usable_leb_size);
+	ebs = bytes;
+	if (rem)
+		ebs += 1;
+
+	return ebs;
+
+bad:
+	dbg_err("bad volume creation request");
+	ubi_dbg_dump_mkvol_req(req, name);
+	ubi_kfree(name);
+	return -EINVAL;
+}
+
+/**
+ * check_rsvol_req - check sanity of a volume re-size request.
+ *
+ * @ubi: the UBI device description object
+ * @req: the re-size request to check
+ *
+ * This function returns zero if the request is sane, and %-EINVAL if not.
+ */
+static int check_rsvol_req(const struct ubi_info *ubi,
+			   const struct ubi_rsvol_req *req)
+{
+	if (req->bytes <= 0) {
+		dbg_err("bad bytes %lld", (long long)req->bytes);
+		return -EINVAL;
+	}
+
+	if (req->vol_id < 0 || req->vol_id >= ubi->acc->max_volumes) {
+		dbg_err("bad vol_id %d", req->vol_id);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+#ifdef CONFIG_MTD_UBI_USERSPACE_IO
+
+/**
+ * vol_cdev_direct_write - write to volume character devices directly (nothing
+ * to do with O_DIRECT!).
+ *
+ * @file: &struct file object of volume character device
+ * @buf: user-space buffer with the data to write
+ * @count: how many bytes to write
+ * @offp: the write position
+ *
+ * This function allows writing directly to dynamic UBI volumes, without
+ * issuing the volume update operation. It returns the number of bytes written
+ * in case of success and a negative error code in case of failure.
+ */
+static ssize_t vol_cdev_direct_write(struct file *file, const char __user *buf,
+				     size_t count, loff_t *offp)
+{
+	const struct ubi_vtbl_vtr *vtr;
+	struct ubi_vol_desc *desc = file->private_data;
+	struct ubi_uif_volume *vol = desc->vol;
+	const struct ubi_info *ubi = vol->ubi;
+	const struct ubi_io_info *io = ubi->io;
+	int lnum, off, len, tbuf_size, vol_id = vol->vol_id, err = 0;
+	size_t count_save = count;
+	char *tbuf;
+	uint64_t tmp;
+
+	dbg_cdev("requested: write %zd bytes to offset %lld of volume %u",
+		 count, *offp, desc->vol->vol_id);
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol_id);
+
+	ubi_assert(!IS_ERR(vtr));
+	ubi_assert(!vol->updating);
+	ubi_assert(*offp >= 0 && *offp <= vtr->used_bytes);
+
+	if (vtr->vol_type == UBI_STATIC_VOLUME) {
+		dbg_err("static volume");
+		return -EROFS;
+	}
+
+	tmp = *offp;
+	off = do_div(tmp, vtr->usable_leb_size);
+	lnum = tmp;
+
+	if (off % io->min_io_size) {
+		dbg_err("unaligned position");
+		return -EIO;
+	}
+
+	if (*offp + count > vtr->used_bytes)
+		count_save = count = vtr->used_bytes - *offp;
+
+	/*
+	 * We can only write in multiples of the minimum input/output unit of
+	 * the flash.
+	 */
+	if (count % io->min_io_size) {
+		dbg_err("unaligned write length");
+		return -EINVAL;
+	}
+
+	tbuf_size = (PAGE_SIZE / io->min_io_size) * io->min_io_size;
+	if (tbuf_size == 0)
+		tbuf_size = io->min_io_size;
+	if (tbuf_size > io->leb_size)
+		tbuf_size = io->leb_size;
+
+	tbuf = ubi_kmalloc(tbuf_size);
+	if (!tbuf)
+		return -ENOMEM;
+
+	len = count > tbuf_size ? tbuf_size : count;
+
+	while (count) {
+		cond_resched();
+
+		if (off + len >= vtr->usable_leb_size)
+			len = vtr->usable_leb_size - off;
+
+		dbg_cdev("copy %d bytes of user data", len);
+		err = copy_from_user(tbuf, buf, len);
+		if (err) {
+			dbg_err("memory access error");
+			err = -EFAULT;
+			break;
+		}
+
+		dbg_cdev("write %d bytes to LEB %d:%d, offset %d",
+			 len, vol_id, lnum, off);
+
+		err = ubi_eba_write_leb(ubi, vol_id, lnum, tbuf, off, len,
+					UBI_DATA_UNKNOWN);
+		if (unlikely(err))
+			break;
+
+		off += len;
+		if (off == vtr->usable_leb_size) {
+			lnum += 1;
+			off -= vtr->usable_leb_size;
+		}
+
+		count -= len;
+		*offp += len;
+		buf += len;
+		len = count > tbuf_size ? tbuf_size : count;
+	}
+
+	ubi_kfree(tbuf);
+	return err ? err : count_save - count;
+}
+
+#endif /* CONFIG_MTD_UBI_USERSPACE_IO */
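
Both the re-size ioctl and check_mkvol_req() above turn a byte count into a
whole number of eraseblocks: divide with do_div() and add one if there is a
remainder. The following is only a user-space sketch of that arithmetic.
bytes_to_lebs() and align_leb_size() are hypothetical names that do not exist
in the patch, and plain 64-bit division stands in for do_div().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): number of logical eraseblocks
 * needed to hold 'bytes' bytes when each one offers 'usable_leb_size' bytes.
 */
static int bytes_to_lebs(uint64_t bytes, int usable_leb_size)
{
	uint64_t lebs;

	assert(usable_leb_size > 0);

	lebs = bytes / (uint64_t)usable_leb_size;	/* do_div() quotient */
	if (bytes % (uint64_t)usable_leb_size)		/* do_div() remainder */
		lebs += 1;				/* round up */

	return (int)lebs;
}

/*
 * Hypothetical helper: the part of an eraseblock that remains usable once its
 * size is rounded down to a multiple of the requested alignment.
 */
static int align_leb_size(int leb_size, int alignment)
{
	assert(alignment > 0 && alignment <= leb_size);
	return leb_size - leb_size % alignment;
}
```

With a 126976-byte eraseblock, for instance, a request for 126977 bytes needs
two eraseblocks, while a request for exactly 126976 bytes needs one.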


* [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (39 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 40/44 take 2] [UBI] character devices handling sub-unit implementation Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 21:14   ` Arnd Bergmann
  2007-02-17 16:57 ` [PATCH 42/44 take 2] [UBI] gluebi unit implementation Artem Bityutskiy
                   ` (5 subsequent siblings)
  46 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/drivers/mtd/ubi/gluebi.h tmp-to/drivers/mtd/ubi/gluebi.h
--- tmp-from/drivers/mtd/ubi/gluebi.h	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/gluebi.h	2007-02-17 18:07:28.000000000 +0200
@@ -0,0 +1,88 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy
+ */
+
+/*
+ * This unit is responsible for emulating MTD devices on top of UBI devices.
+ * This sounds strange, but it is in fact quite useful to make legacy software
+ * work on top of UBI. New software should use native UBI API instead.
+ *
+ * Gluebi emulates MTD devices of "MTD_UBIVOLUME" type. Their minimal I/O unit
+ * size (mtd->writesize) is equivalent to the underlying flash minimal I/O
+ * unit. The eraseblock size is equivalent to the logical UBI volume eraseblock
+ * size.
+ */
+
+#ifndef __UBI_GLUEBI_H__
+#define __UBI_GLUEBI_H__
+
+#include <linux/mtd/mtd.h>
+
+#if defined(CONFIG_MTD_UBI_GLUEBI)
+
+struct ubi_uif_volume;
+
+/**
+ * struct ubi_gluebi_volume - emulated MTD device description data structure.
+ *
+ * @desc: UBI volume descriptor
+ * @mtd: MTD device description object
+ */
+struct ubi_gluebi_volume
+{
+	struct ubi_vol_desc *desc;
+	int refcount;
+	struct mtd_info mtd;
+};
+
+/**
+ * ubi_gluebi_vol_init - initialize all the gluebi-related stuff for an UBI
+ * volume.
+ *
+ * @ubi: the UBI device description object
+ * @vol: user interfaces unit volume description object
+ *
+ * This function is called when an UBI volume is created in order to create
+ * the corresponding MTD device. Returns zero in case of success and a
+ * negative error code in case of failure.
+ */
+int ubi_gluebi_vol_init(const struct ubi_info *ubi, struct ubi_uif_volume *vol);
+
+/**
+ * ubi_gluebi_vol_close - close all the gluebi-related stuff for an UBI volume.
+ *
+ * @vol: user interfaces unit volume description object
+ *
+ * This function is called when an UBI volume is removed in order to remove
+ * the corresponding MTD device. Returns zero in case of success and a
+ * negative error code in case of failure.
+ */
+int ubi_gluebi_vol_close(struct ubi_uif_volume *vol);
+
+#else
+
+struct ubi_gluebi_volume
+{
+};
+
+#define ubi_gluebi_vol_init(ubi, vol) ({int __ret; __ret = 0;})
+#define ubi_gluebi_vol_close(vol)     ({int __ret; __ret = 0;})
+
+#endif /* CONFIG_MTD_UBI_GLUEBI */
+#endif /* __UBI_GLUEBI_H__ */
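
The header embeds struct mtd_info by value inside struct ubi_gluebi_volume.
That layout lets the implementation recover the enclosing volume object from a
bare MTD pointer, which patch 42 does in mtd2vol() with two container_of()
steps. Below is a rough user-space sketch of the idea. The structures are
simplified stand-ins with illustrative fields, mtd_to_vol() is a hypothetical
name, and the macro is a portable approximation of the kernel's
container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the kernel structures; fields are illustrative. */
struct mtd_info {
	int index;
};

struct gluebi_volume {
	int refcount;
	struct mtd_info mtd;			/* embedded, not a pointer */
};

struct uif_volume {
	int vol_id;
	struct gluebi_volume gluebi_vol;	/* embedded as well */
};

/* Walk outwards: mtd -> gluebi volume -> user interface volume. */
static struct uif_volume *mtd_to_vol(struct mtd_info *mtd)
{
	struct gluebi_volume *gv;

	assert(mtd != NULL);
	gv = container_of(mtd, struct gluebi_volume, mtd);
	return container_of(gv, struct uif_volume, gluebi_vol);
}
```

The lookup only works because both members are embedded; if either were a
pointer to a separately allocated object, the address arithmetic would not
lead back to the enclosing structure.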


* [PATCH 42/44 take 2] [UBI] gluebi unit implementation
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (40 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 41/44 take 2] [UBI] gluebi unit header Artem Bityutskiy
@ 2007-02-17 16:57 ` Artem Bityutskiy
  2007-02-17 16:58 ` [PATCH 43/44 take 2] [UBI] JFFS2 UBI support Artem Bityutskiy
                   ` (4 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/drivers/mtd/ubi/gluebi.c tmp-to/drivers/mtd/ubi/gluebi.c
--- tmp-from/drivers/mtd/ubi/gluebi.c	1970-01-01 02:00:00.000000000 +0200
+++ tmp-to/drivers/mtd/ubi/gluebi.c	2007-02-17 18:07:28.000000000 +0200
@@ -0,0 +1,363 @@
+/*
+ * Copyright (c) International Business Machines Corp., 2006
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
+ * the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Author: Artem B. Bityutskiy (based on code written by Joern Engel)
+ */
+
+#include <linux/mtd/mtd.h>
+#include <linux/mtd/ubi.h>
+#include <asm/div64.h>
+#include "ubi.h"
+#include "uif.h"
+#include "gluebi.h"
+#include "vtbl.h"
+#include "misc.h"
+#include "debug.h"
+#include "alloc.h"
+#include "io.h"
+#include "eba.h"
+#include "wl.h"
+
+static int gluebi_get_device(struct mtd_info *mtd);
+static void gluebi_put_device(struct mtd_info *mtd);
+static int gluebi_read(struct mtd_info *mtd, loff_t from, size_t len,
+		       size_t *retlen, unsigned char *buf);
+static int gluebi_write(struct mtd_info *mtd, loff_t to, size_t len,
+		       size_t *retlen, const u_char *buf);
+static int gluebi_erase(struct mtd_info *mtd, struct erase_info *instr);
+
+/**
+ * mtd2vol - get the user interface volume description object of an MTD object.
+ *
+ * @mtd: the MTD object
+ *
+ * This function returns the user interface volume description object
+ * corresponding to the @mtd object.
+ */
+static inline struct ubi_uif_volume *mtd2vol(struct mtd_info *mtd)
+{
+	struct ubi_gluebi_volume *gluebi_vol;
+	struct ubi_uif_volume *vol;
+
+	gluebi_vol = container_of(mtd, struct ubi_gluebi_volume, mtd);
+	vol = container_of(gluebi_vol, struct ubi_uif_volume, gluebi_vol);
+	return vol;
+}
+
+int ubi_gluebi_vol_init(const struct ubi_info *ubi, struct ubi_uif_volume *vol)
+{
+	int err;
+	struct mtd_info *mtd = &vol->gluebi_vol.mtd;
+	const struct ubi_vtbl_vtr *vtr;
+	const struct ubi_io_info *io = ubi->io;
+
+	vtr = ubi_vtbl_get_vtr(ubi, vol->vol_id);
+	ubi_assert(!IS_ERR(vtr));
+
+	mtd->name = strdup_len(vtr->name, vtr->name_len);
+	if (!mtd->name)
+		return -ENOMEM;
+
+	mtd->type = MTD_UBIVOLUME;
+	if (!io->ro_mode)
+		mtd->flags = MTD_WRITEABLE;
+	mtd->writesize  = io->min_io_size;
+	mtd->owner      = THIS_MODULE;
+	mtd->size       = vtr->usable_leb_size * vtr->reserved_pebs;
+	mtd->erasesize  = vtr->usable_leb_size;
+	mtd->read       = gluebi_read;
+	mtd->write      = gluebi_write;
+	mtd->erase      = gluebi_erase;
+	mtd->get_device = gluebi_get_device;
+	mtd->put_device = gluebi_put_device;
+
+	if (add_mtd_device(mtd)) {
+		ubi_err("cannot add MTD device");
+
+		/*
+		 * Unfortunately, add_mtd_device() does not return a sane error
+		 * code, so report -ENOMEM.
+		 */
+		err = -ENOMEM;
+		goto out_free;
+	}
+
+	dbg_gluebi("added mtd%d (\"%s\"), size %u, EB size %u",
+		   mtd->index, mtd->name, mtd->size, mtd->erasesize);
+
+	return 0;
+
+out_free:
+	ubi_kfree(mtd->name);
+	return err;
+}
+
+int ubi_gluebi_vol_close(struct ubi_uif_volume *vol)
+{
+	int err;
+	struct mtd_info *mtd = &vol->gluebi_vol.mtd;
+
+	dbg_gluebi("remove mtd%d", mtd->index);
+
+	err = del_mtd_device(mtd);
+	if (err)
+		return err;
+
+	ubi_kfree(mtd->name);
+	return 0;
+}
+
+/**
+ * gluebi_get_device - get MTD device reference.
+ *
+ * @mtd: the MTD device description object
+ *
+ * This function is called every time a reference to the MTD device is
+ * taken. Returns zero in case of success and a negative error code otherwise.
+ */
+static int gluebi_get_device(struct mtd_info *mtd)
+{
+	struct ubi_uif_volume *vol = mtd2vol(mtd);
+	struct ubi_gluebi_volume *gluebi_vol = &vol->gluebi_vol;
+
+	/*
+	 * We do not introduce locks for gluebi reference count because the
+	 * get_device()/put_device() calls are already serialized.
+	 */
+	if (gluebi_vol->refcount > 0) {
+		/*
+		 * The MTD device is already referenced and this is just one
+		 * more reference. MTD allows many users to open the same
+		 * volume simultaneously and does not distinguish between
+		 * readers/writers/exclusive openers as UBI does. So we do not
+		 * open the UBI volume again - just increase the reference
+		 * counter and return.
+		 */
+		gluebi_vol->refcount += 1;
+		return 0;
+	}
+
+	/*
+	 * This is the first reference to this UBI volume via the MTD device
+	 * interface. Open the corresponding volume in read-write mode.
+	 */
+	gluebi_vol->desc = ubi_open_volume(vol->ubi->ubi_num, vol->vol_id,
+					   UBI_READWRITE);
+	if (IS_ERR(gluebi_vol->desc))
+		return PTR_ERR(gluebi_vol->desc);
+	gluebi_vol->refcount += 1;
+	return 0;
+}
+
+/**
+ * gluebi_put_device - put MTD device reference.
+ *
+ * @mtd: the MTD device description object
+ *
+ * This function is called every time the MTD device is being put. It closes
+ * the UBI volume when the last reference is dropped.
+ */
+static void gluebi_put_device(struct mtd_info *mtd)
+{
+	struct ubi_gluebi_volume *gluebi_vol = &mtd2vol(mtd)->gluebi_vol;
+
+	gluebi_vol->refcount -= 1;
+	ubi_assert(gluebi_vol->refcount >= 0);
+	if (gluebi_vol->refcount == 0)
+		ubi_close_volume(gluebi_vol->desc);
+}
+
+/**
+ * gluebi_read - read operation of emulated MTD devices.
+ *
+ * @mtd: the MTD device description object
+ * @from: absolute offset from where to read
+ * @len: how many bytes to read
+ * @retlen: count of read bytes is returned here
+ * @buf: the buffer to store the data to
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int gluebi_read(struct mtd_info *mtd, loff_t from, size_t len,
+		       size_t *retlen, unsigned char *buf)
+{
+	int err = 0, lnum, offs, total_read;
+	struct ubi_uif_volume *vol;
+	const struct ubi_info *ubi;
+
+	dbg_gluebi("read %zd bytes from offset %lld", len, from);
+
+	if (unlikely(len < 0 || from < 0 || from + len > mtd->size))
+		return -EINVAL;
+
+	vol = mtd2vol(mtd);
+	ubi = vol->ubi;
+
+	offs = do_div(from, mtd->erasesize);
+	lnum = from;
+
+	total_read = len;
+	while (total_read) {
+		size_t to_read = mtd->erasesize - offs;
+
+		if (to_read > total_read)
+			to_read = total_read;
+
+		dbg_gluebi("read %zd bytes from LEB %d:%d, offset %d",
+			   to_read, vol->vol_id, lnum, offs);
+
+		err = ubi_eba_read_leb(ubi, vol->vol_id, lnum, buf, offs,
+				       to_read, 0);
+		if (unlikely(err))
+			break;
+
+		lnum += 1;
+		offs = 0;
+		total_read -= to_read;
+		buf += to_read;
+	}
+
+	*retlen = len - total_read;
+	return err;
+}
+
+/**
+ * gluebi_write - write operation of emulated MTD devices.
+ *
+ * @mtd: the MTD device description object
+ * @to: absolute offset where to write
+ * @len: how many bytes to write
+ * @retlen: count of written bytes is returned here
+ * @buf: the buffer with data to write
+ *
+ * This function returns zero in case of success and a negative error code in
+ * case of failure.
+ */
+static int gluebi_write(struct mtd_info *mtd, loff_t to, size_t len,
+		       size_t *retlen, const u_char *buf)
+{
+	int err = 0, lnum, offs, total_written;
+	struct ubi_uif_volume *vol;
+	const struct ubi_info *ubi;
+
+	dbg_gluebi("write %zd bytes to offset %lld", len, to);
+
+	if (unlikely(len < 0 || to < 0 || len + to > mtd->size))
+		return -EINVAL;
+
+	vol = mtd2vol(mtd);
+	ubi = vol->ubi;
+
+	if (unlikely(ubi->io->ro_mode))
+		return -EROFS;
+
+	offs = do_div(to, mtd->erasesize);
+	lnum = to;
+
+	if (unlikely(len % mtd->writesize || offs % mtd->writesize))
+		return -EINVAL;
+
+	total_written = len;
+	while (total_written) {
+		size_t to_write = mtd->erasesize - offs;
+
+		if (to_write > total_written)
+			to_write = total_written;
+
+		dbg_gluebi("write %zd bytes to LEB %d:%d, offset %d",
+			   to_write, vol->vol_id, lnum, offs);
+
+		err = ubi_eba_write_leb(ubi, vol->vol_id, lnum, buf, offs,
+					to_write, UBI_DATA_UNKNOWN);
+		if (unlikely(err))
+			break;
+
+		lnum += 1;
+		offs = 0;
+		total_written -= to_write;
+		buf += to_write;
+	}
+
+	*retlen = len - total_written;
+	return err;
+}
+
+/**
+ * gluebi_erase - erase operation of emulated MTD devices.
+ *
+ * @mtd: the MTD device description object
+ * @instr: the erase operation description
+ *
+ * This function calls the erase callback when it finishes. Returns zero in
+ * case of success and a negative error code in case of failure.
+ */
+static int gluebi_erase(struct mtd_info *mtd, struct erase_info *instr)
+{
+	int err, i, lnum, count;
+	struct ubi_uif_volume *vol;
+	const struct ubi_info *ubi;
+
+	dbg_gluebi("erase %u bytes at offset %u", instr->len, instr->addr);
+
+	if (unlikely(instr->addr < 0 ||
+		     instr->addr > mtd->size - mtd->erasesize))
+		return -EINVAL;
+
+	if (unlikely(instr->len < 0 ||
+		     instr->addr + instr->len > mtd->size))
+		return -EINVAL;
+
+	if (unlikely(instr->addr % mtd->writesize ||
+		     instr->len % mtd->writesize))
+		return -EINVAL;
+
+	lnum = instr->addr / mtd->erasesize;
+	count = instr->len / mtd->erasesize;
+
+	vol = mtd2vol(mtd);
+	ubi = vol->ubi;
+
+	if (unlikely(ubi->io->ro_mode))
+		return -EROFS;
+
+	for (i = 0; i < count; i++) {
+		dbg_gluebi("erase LEB %d", lnum + i);
+
+		err = ubi_eba_erase_leb(ubi, vol->vol_id, lnum + i);
+		if (unlikely(err))
+			goto out_err;
+	}
+
+	/*
+	 * MTD erase operations are synchronous, so we have to make sure the
+	 * physical eraseblock is wiped out.
+	 */
+	err = ubi_wl_flush(ubi);
+	if (unlikely(err))
+		goto out_err;
+
+	instr->state = MTD_ERASE_DONE;
+	mtd_erase_callback(instr);
+
+	return 0;
+
+out_err:
+	instr->state = MTD_ERASE_FAILED;
+	instr->fail_addr = lnum * mtd->erasesize;
+	return err;
+}
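
gluebi_read() and gluebi_write() above share one pattern: an absolute offset is
split into an eraseblock number and an offset inside it, and the request is
then served one eraseblock at a time, with only the first piece starting at a
non-zero offset. The sketch below isolates that splitting in user-space C as
an illustration only. struct leb_chunk and split_into_lebs() are hypothetical
names, and ordinary division replaces do_div().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of a larger request that stays within a single eraseblock. */
struct leb_chunk {
	int lnum;	/* logical eraseblock number */
	int offs;	/* offset inside that eraseblock */
	size_t len;	/* bytes to transfer */
};

/*
 * Split the range [from, from + len) into per-eraseblock chunks. Returns the
 * number of chunks stored, at most 'max_chunks'.
 */
static size_t split_into_lebs(uint64_t from, size_t len, int erasesize,
			      struct leb_chunk *chunks, size_t max_chunks)
{
	size_t n = 0;
	int lnum, offs;

	assert(erasesize > 0);

	lnum = (int)(from / (uint64_t)erasesize);
	offs = (int)(from % (uint64_t)erasesize);

	while (len && n < max_chunks) {
		/* Never cross the end of the current eraseblock */
		size_t chunk = (size_t)(erasesize - offs);

		if (chunk > len)
			chunk = len;

		chunks[n].lnum = lnum;
		chunks[n].offs = offs;
		chunks[n].len = chunk;
		n += 1;

		/* Every following chunk starts at the eraseblock boundary */
		lnum += 1;
		offs = 0;
		len -= chunk;
	}

	return n;
}
```

Reading 3000 bytes from offset 1000 with 2048-byte eraseblocks, for example,
yields two chunks: 1048 bytes from eraseblock 0 at offset 1000, then 1952
bytes from eraseblock 1 at offset 0.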


* [PATCH 43/44 take 2] [UBI] JFFS2 UBI support
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (41 preceding siblings ...)
  2007-02-17 16:57 ` [PATCH 42/44 take 2] [UBI] gluebi unit implementation Artem Bityutskiy
@ 2007-02-17 16:58 ` Artem Bityutskiy
  2007-02-17 16:58 ` [PATCH 44/44 take 2] [UBI] update MAINTAINERS Artem Bityutskiy
                   ` (3 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:58 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

diff -auNrp tmp-from/fs/jffs2/fs.c tmp-to/fs/jffs2/fs.c
--- tmp-from/fs/jffs2/fs.c	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/fs/jffs2/fs.c	2007-02-17 18:07:28.000000000 +0200
@@ -672,6 +672,13 @@ static int jffs2_flash_setup(struct jffs
 			return ret;
 	}
 
+	/* and an UBI volume */
+	if (jffs2_ubivol(c)) {
+		ret = jffs2_ubivol_setup(c);
+		if (ret)
+			return ret;
+	}
+
 	return ret;
 }
 
@@ -690,4 +697,9 @@ void jffs2_flash_cleanup(struct jffs2_sb
 	if (jffs2_nor_wbuf_flash(c)) {
 		jffs2_nor_wbuf_flash_cleanup(c);
 	}
+
+	/* and an UBI volume */
+	if (jffs2_ubivol(c)) {
+		jffs2_ubivol_cleanup(c);
+	}
 }
diff -auNrp tmp-from/fs/jffs2/os-linux.h tmp-to/fs/jffs2/os-linux.h
--- tmp-from/fs/jffs2/os-linux.h	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/fs/jffs2/os-linux.h	2007-02-17 18:07:28.000000000 +0200
@@ -98,6 +98,9 @@ static inline void jffs2_init_inode_info
 #define jffs2_nor_wbuf_flash(c) (0)
 #define jffs2_nor_wbuf_flash_setup(c) (0)
 #define jffs2_nor_wbuf_flash_cleanup(c) do {} while (0)
+#define jffs2_ubivol(c) (0)
+#define jffs2_ubivol_setup(c) (0)
+#define jffs2_ubivol_cleanup(c) do {} while (0)
 
 #else /* NAND and/or ECC'd NOR support present */
 
@@ -133,6 +136,9 @@ void jffs2_nand_flash_cleanup(struct jff
 #define jffs2_dataflash(c) (c->mtd->type == MTD_DATAFLASH)
 int jffs2_dataflash_setup(struct jffs2_sb_info *c);
 void jffs2_dataflash_cleanup(struct jffs2_sb_info *c);
+#define jffs2_ubivol(c) (c->mtd->type == MTD_UBIVOLUME)
+int jffs2_ubivol_setup(struct jffs2_sb_info *c);
+void jffs2_ubivol_cleanup(struct jffs2_sb_info *c);
 
 #define jffs2_nor_wbuf_flash(c) (c->mtd->type == MTD_NORFLASH && ! (c->mtd->flags & MTD_BIT_WRITEABLE))
 int jffs2_nor_wbuf_flash_setup(struct jffs2_sb_info *c);
diff -auNrp tmp-from/fs/jffs2/wbuf.c tmp-to/fs/jffs2/wbuf.c
--- tmp-from/fs/jffs2/wbuf.c	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/fs/jffs2/wbuf.c	2007-02-17 18:07:28.000000000 +0200
@@ -1274,3 +1274,27 @@ int jffs2_nor_wbuf_flash_setup(struct jf
 void jffs2_nor_wbuf_flash_cleanup(struct jffs2_sb_info *c) {
 	kfree(c->wbuf);
 }
+
+int jffs2_ubivol_setup(struct jffs2_sb_info *c) {
+	c->cleanmarker_size = 0;
+
+	if (c->mtd->writesize == 1)
+		/* We do not need write-buffer */
+		return 0;
+
+	init_rwsem(&c->wbuf_sem);
+
+	c->wbuf_pagesize =  c->mtd->writesize;
+	c->wbuf_ofs = 0xFFFFFFFF;
+	c->wbuf = kmalloc(c->wbuf_pagesize, GFP_KERNEL);
+	if (!c->wbuf)
+		return -ENOMEM;
+
+	printk(KERN_INFO "JFFS2 write-buffering enabled buffer (%d) erasesize (%d)\n", c->wbuf_pagesize, c->sector_size);
+
+	return 0;
+}
+
+void jffs2_ubivol_cleanup(struct jffs2_sb_info *c) {
+	kfree(c->wbuf);
+}
diff -auNrp tmp-from/include/mtd/mtd-abi.h tmp-to/include/mtd/mtd-abi.h
--- tmp-from/include/mtd/mtd-abi.h	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/include/mtd/mtd-abi.h	2007-02-17 18:07:28.000000000 +0200
@@ -24,6 +24,7 @@ struct mtd_oob_buf {
 #define MTD_NORFLASH		3
 #define MTD_NANDFLASH		4
 #define MTD_DATAFLASH		6
+#define MTD_UBIVOLUME		7
 
 #define MTD_WRITEABLE		0x400	/* Device is writeable */
 #define MTD_BIT_WRITEABLE	0x800	/* Single bits can be flipped */


* [PATCH 44/44 take 2] [UBI] update MAINTAINERS
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (42 preceding siblings ...)
  2007-02-17 16:58 ` [PATCH 43/44 take 2] [UBI] JFFS2 UBI support Artem Bityutskiy
@ 2007-02-17 16:58 ` Artem Bityutskiy
  2007-02-17 22:49 ` [PATCH 00/44 take 2] [UBI] Unsorted Block Images Theodore Tso
                   ` (2 subsequent siblings)
  46 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-17 16:58 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Christoph Hellwig, Artem Bityutskiy, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

diff -auNrp tmp-from/MAINTAINERS tmp-to/MAINTAINERS
--- tmp-from/MAINTAINERS	2007-02-17 18:07:26.000000000 +0200
+++ tmp-to/MAINTAINERS	2007-02-17 18:07:28.000000000 +0200
@@ -2176,6 +2176,14 @@ L:	linux-mtd@lists.infradead.org
 T:	git git://git.infradead.org/mtd-2.6.git
 S:	Maintained
 
+UNSORTED BLOCK IMAGES (UBI)
+P:	Artem Bityutskiy
+M:	dedekind@infradead.org
+W:	http://www.linux-mtd.infradead.org/
+L:	linux-mtd@lists.infradead.org
+T:	git git://git.infradead.org/ubi-2.6.git
+S:	Maintained
+
 MICROTEK X6 SCANNER
 P:	Oliver Neukum
 M:	oliver@neukum.name


* Re: [PATCH 12/44 take 2] [UBI] allocation unit implementation
  2007-02-17 16:55 ` [PATCH 12/44 take 2] [UBI] allocation unit implementation Artem Bityutskiy
@ 2007-02-17 20:55   ` Arnd Bergmann
  2007-02-19 11:05     ` Artem Bityutskiy
  2007-02-19 11:13   ` Pekka Enberg
  1 sibling, 1 reply; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-17 20:55 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Saturday 17 February 2007 17:55, Artem Bityutskiy wrote:
> diff -auNrp tmp-from/drivers/mtd/ubi/alloc.c tmp-to/drivers/mtd/ubi/alloc.c

> +#include "ubi.h"
> +#include "alloc.h"
> +#include "io.h"
> +#include "background.h"
> +#include "wl.h"
> +#include "debug.h"
> +#include "eba.h"
> +#include "scan.h"

I don't see much point in having one local header for each of these,
you could simply put all of the declarations into one header in the
ubi directory.

> +
> +#define BGT_WORK_SLAB_NAME        "ubi_bgt_work_slab"
> +#define WL_ERASE_WORK_SLAB_NAME   "ubi_wl_erase_work_slab"
> +#define WL_ENTRY_SLAB_NAME        "ubi_wl_entry_slab"
> +#define WL_PROT_ENTRY_SLAB_NAME   "ubi_wl_prow_entry_slab"
> +#define EBA_LTREE_ENTRY_SLAB_NAME "ubi_eba_ltree_entry_slab"
> +#define SCAN_EB_SLAB_NAME         "ubi_scan_leb"
> +#define SCAN_VOLUME_SLAB_NAME     "ubi_scan_volume"

These macros seem rather pointless, each of them is only used
once, and the macro name directly corresponds to the contents.

> +static struct kmem_cache *bgt_work_slab;
> +static struct kmem_cache *wl_erase_work_slab;
> +static struct kmem_cache *wl_entries_slab;
> +static struct kmem_cache *wl_prot_entry_slab;
> +static struct kmem_cache *eba_ltree_entry_slab;
> +static struct kmem_cache *scan_eb_slab;
> +static struct kmem_cache *scan_volume_slab;

Do you really need all these slab caches? If a cache only contains
a small number of objects, e.g. one per volume, then you're much
better off using a regular kmalloc.

> +void *ubi_kzalloc(size_t size)
> +{
> +	void *ret;
> +
> +	ret = kzalloc(size, GFP_KERNEL);
> +	if (unlikely(!ret)) {
> +		ubi_err("cannot allocate %zd bytes", size);
> +		dump_stack();
> +		return NULL;
> +	}
> +
> +	return ret;
> +}
> +
> +void *ubi_kmalloc(size_t size)
> +{
> +	void *ret;
> +
> +	ret = kmalloc(size, GFP_KERNEL);
> +	if (unlikely(!ret)) {
> +		ubi_err("cannot allocate %zd bytes", size);
> +		dump_stack();
> +		return NULL;
> +	}
> +
> +	return ret;
> +}
> +
> +void ubi_kfree(const void *obj)
> +{
> +	if (unlikely(!obj))
> +		return;
> +	kfree(obj);
> +}

These look somewhat too complex. Don't introduce your own generic
infrastructure if you can help it. IIRC, when kmalloc fails, you
already get the full stack trace from the buddy allocator, so
this is just duplication. Better use the regular kzalloc/kfree
calls directly.

> +struct ubi_ec_hdr *ubi_zalloc_ec_hdr(const struct ubi_info *ubi)
> +{
> +	struct ubi_ec_hdr *ec_hdr;
> +	const struct ubi_io_info *io = ubi->io;
> +
> +	ec_hdr = kzalloc(io->ec_hdr_alsize, GFP_KERNEL);
> +	if (unlikely(!ec_hdr)) {
> +		ubi_err("cannot allocate %d bytes", io->ec_hdr_alsize);
> +		dump_stack();
> +		return NULL;
> +	}
> +
> +	return ec_hdr;
> +}
> +
> +void ubi_free_ec_hdr(const struct ubi_info *ubi, struct ubi_ec_hdr *ec_hdr)
> +{
> +	if (unlikely(!ec_hdr))
> +		return;
> +	kfree(ec_hdr);
> +}

same for this and the others. Unless the allocation is done in many
places in the code from a single slab cache, just call kmem_cache_alloc
or kmalloc directly.

	Arnd <><


* Re: [PATCH 10/44 take 2] [UBI] debug unit implementation
  2007-02-17 16:55 ` [PATCH 10/44 take 2] [UBI] debug unit implementation Artem Bityutskiy
@ 2007-02-17 21:00   ` Arnd Bergmann
  2007-02-19 12:29     ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-17 21:00 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Saturday 17 February 2007 17:55, Artem Bityutskiy wrote:
> 
> diff -auNrp tmp-from/drivers/mtd/ubi/debug.c tmp-to/drivers/mtd/ubi/debug.c
> --- tmp-from/drivers/mtd/ubi/debug.c    1970-01-01 02:00:00.000000000 +0200
> +++ tmp-to/drivers/mtd/ubi/debug.c      2007-02-17 18:07:26.000000000 +0200

This whole file looks like it can be removed, as nothing in here
is really relevant for regular operation. I'm sure that much of it
was a good help in developing the code and finding the bugs in here,
but why would you want to merge it into the mainline kernel?

	Arnd <><


* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-17 16:54 ` [PATCH 05/44 take 2] [UBI] internal common header Artem Bityutskiy
@ 2007-02-17 21:05   ` Arnd Bergmann
  2007-02-19 11:16     ` Artem Bityutskiy
  2007-02-19 10:54   ` Christoph Hellwig
  1 sibling, 1 reply; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-17 21:05 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Saturday 17 February 2007 17:54, Artem Bityutskiy wrote:

> +/* Maximum number of supported UBI devices */
> +#define UBI_MAX_INSTANCES 32

Does this need to be limited?

> +/* UBI messages printk level */
> +#define UBI_MSG_LEVEL  KERN_INFO
> +#define UBI_WARN_LEVEL KERN_WARNING
> +#define UBI_ERR_LEVEL  KERN_ERR
> +
> +/* Prefixes of UBI messages */
> +#define UBI_MSG_PREF  "UBI:"
> +#define UBI_WARN_PREF "UBI warning:"
> +#define UBI_ERR_PREF  "UBI error:"
> +
> +/* Normal UBI messages */
> +#define ubi_msg(fmt, ...)                                                   \
> +	printk(UBI_MSG_LEVEL UBI_MSG_PREF " " fmt "\n", ##__VA_ARGS__)
> +/* UBI warning messages */
> +#define ubi_warn(fmt, ...)                                                  \
> +	printk(UBI_WARN_LEVEL UBI_WARN_PREF " %s: " fmt "\n", __FUNCTION__, \
> +	       ##__VA_ARGS__)
> +/* UBI error messages */
> +#define ubi_err(fmt, ...)                                                   \
> +	printk(UBI_ERR_LEVEL UBI_ERR_PREF " %s " fmt "\n", __FUNCTION__,    \
> +	       ##__VA_ARGS__)

You shouldn't need these helpers, just use the regular dev_dbg, dev_info
and related macros.

> +/**
> + * struct ubi_info - UBI device description structure
> + *
> + * @ubi_num: number of the UBI device
> + * @io: input/output unit information
> + * @bgt: background thread unit information
> + * @wl: wear-leveling unit information
> + * @beb: bad eraseblock handling unit information
> + * @vmt: volume management unit information
> + * @ivol: internal volume management unit information
> + * @vtbl: volume table unit information
> + * @acc: accounting unit information
> + * @upd: update unit information
> + * @eba: EBA unit information
> + * @uif: user interface unit information
> + */
> +struct ubi_info {
> +	int ubi_num;
> +	struct ubi_io_info   *io;
> +	struct ubi_bgt_info  *bgt;
> +	struct ubi_wl_info   *wl;
> +	struct ubi_beb_info  *beb;
> +	struct ubi_vmt_info  *vmt;
> +	struct ubi_ivol_info *ivol;
> +	struct ubi_vtbl_info *vtbl;
> +	struct ubi_acc_info  *acc;
> +	struct ubi_upd_info  *upd;
> +	struct ubi_eba_info  *eba;
> +	struct ubi_uif_info  *uif;
> +};

I don't know what went wrong here, but this does not at all
look ok. The members in here probably should all be part
of the ubi_info structure itself.
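
As a rough illustration of the embedding suggested above, here is a minimal
user-space sketch. The names demo_info, demo_io_info and demo_wl_info and
their fields are hypothetical and only stand in for the real UBI structures;
the point is that embedded members need a single allocation and no pointer
chasing.

```c
#include <stdlib.h>

/* Hypothetical, cut-down per-unit state; the real UBI structures differ. */
struct demo_io_info {
	int peb_size;	/* physical eraseblock size */
};

struct demo_wl_info {
	int max_ec;	/* highest erase counter seen */
};

/* Sub-structures are embedded, so one allocation covers the whole device. */
struct demo_info {
	int ubi_num;
	struct demo_io_info io;
	struct demo_wl_info wl;
};

/* Allocate a zeroed device description; the caller releases it with free(). */
struct demo_info *demo_info_alloc(int ubi_num)
{
	struct demo_info *ubi = calloc(1, sizeof(*ubi));

	if (ubi)
		ubi->ubi_num = ubi_num;
	return ubi;
}
```

With this layout a unit reaches its state as ubi->wl.max_ec, and tearing the
device down is a single free().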

	Arnd <><

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-17 16:57 ` [PATCH 41/44 take 2] [UBI] gluebi unit header Artem Bityutskiy
@ 2007-02-17 21:14   ` Arnd Bergmann
  2007-02-18  2:04     ` Josh Boyer
  2007-02-19 13:30     ` Artem Bityutskiy
  0 siblings, 2 replies; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-17 21:14 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Saturday 17 February 2007 17:57, Artem Bityutskiy wrote:
> + * This unit is responsible for emulating MTD devices on top of UBI devices.
> + * This sounds strange, but it is in fact quite useful to make legacy software
> + * work on top of UBI. New software should use native UBI API instead.
> + *
> + * Gluebi emulated MTD devices of "MTD_UBIVOLUME" type. Their minimal I/O unit
> + * size (mtd->writesize) is equivalent to the underlying flash minimal I/O
> + * unit. The eraseblock size is equivalent to the logical UBI volume eraseblock
> + * size.

This approach doesn't seem to make sense at all. If the MTD device interface
is flawed, the right approach should be to fix that instead. After all,
there are not many users of the MTD interface, so you should be able to
adapt them.

In fact, I would expect that there is much more reason to merge the existing
MTD interface with the block interface in the kernel, but you now introduce
a third interface that is unrelated to the first two, and make another
conversion to convert it back?

Let's assume I want to use the wear levelling capabilities of UBI on top
of an SD card, and use the ext3 file system on top of it. I get a stack of

1. MMC
2. block2mtd
3. UBI
4. gluebi
5. mtdblock
6. VFS

when in an ideal world, it should just be

1. MMC
2. UBI
3. VFS

	Arnd <><

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 09/44 take 2] [UBI] debug unit header
  2007-02-17 16:55 ` [PATCH 09/44 take 2] [UBI] debug unit header Artem Bityutskiy
@ 2007-02-17 21:18   ` Arnd Bergmann
  2007-02-19 11:00     ` Christoph Hellwig
  2007-02-19 12:33     ` Artem Bityutskiy
  0 siblings, 2 replies; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-17 21:18 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Saturday 17 February 2007 17:55, Artem Bityutskiy wrote:
> +
> +/**
> + * UBI debugging unit.
> + *
> + * UBI provides rich debugging capabilities which are implemented in
> + * this unit.

Stop right here. You should be doing one thing and do it right.
Since the point of your patches is to do volume management for MTD,
it should do just that.

If you feel that Linux needs rich debugging capabilities, then submit
a patch for that independent of UBI.

	Arnd <><

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 03/44 take 2] [UBI] user-space API header
  2007-02-17 16:54 ` [PATCH 03/44 take 2] [UBI] user-space API header Artem Bityutskiy
@ 2007-02-17 21:27   ` Arnd Bergmann
  2007-02-20 13:07     ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-17 21:27 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Saturday 17 February 2007 17:54, Artem Bityutskiy wrote:
> +struct ubi_mkvol_req {
> +       int32_t vol_id;
> +       int32_t alignment;
> +       int64_t bytes;
> +       int8_t vol_type;
> +       int8_t padding[9];
> +       int16_t name_len;
> +       __user const char *name;
> +} __attribute__ ((packed));

This structure is not suitable for an ioctl call, because it has
incompatible layout between 32 and 64 bit processes. The easiest
fix for this would be to change the 'name' field to an array
instead of a pointer.
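
To make the layout problem concrete, here is a small user-space sketch. The
demo_ names and the 127-character name limit are assumptions made for the
example, not values from the patch, and the kernel-only __user annotation is
left out. The pointer variant changes size with the pointer width of the
calling process, while the array variant is the same for every caller.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_VOL_NAME 127	/* hypothetical limit, not taken from the patch */

/* Pointer variant: its size follows the pointer width of the caller. */
struct demo_mkvol_req_ptr {
	int32_t vol_id;
	int32_t alignment;
	int64_t bytes;
	int8_t vol_type;
	int8_t padding[9];
	int16_t name_len;
	const char *name;
} __attribute__ ((packed));

/* Array variant: the same 156 bytes for 32-bit and 64-bit callers. */
struct demo_mkvol_req {
	int32_t vol_id;
	int32_t alignment;
	int64_t bytes;
	int8_t vol_type;
	int8_t padding[9];
	int16_t name_len;
	char name[DEMO_MAX_VOL_NAME + 1];
} __attribute__ ((packed));

/* Fail the build if the fixed layout ever drifts. */
_Static_assert(sizeof(struct demo_mkvol_req) == 156, "unexpected request size");
_Static_assert(offsetof(struct demo_mkvol_req, name) == 28, "unexpected name offset");
```

The fixed part is 28 bytes in both variants; only the last member differs, so
the pointer variant is 32 bytes for a 32-bit process and 36 bytes for a 64-bit
one.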

	Arnd <><

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (43 preceding siblings ...)
  2007-02-17 16:58 ` [PATCH 44/44 take 2] [UBI] update MAINTAINERS Artem Bityutskiy
@ 2007-02-17 22:49 ` Theodore Tso
  2007-02-19 12:48   ` Artem Bityutskiy
  2007-02-19 10:50 ` Christoph Hellwig
  2007-02-20 14:52 ` John Stoffel
  46 siblings, 1 reply; 129+ messages in thread
From: Theodore Tso @ 2007-02-17 22:49 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sat, Feb 17, 2007 at 06:54:24PM +0200, Artem Bityutskiy wrote:
> The structure of the UBI code is very simple. Whole UBI consists of units.
> Each unit has one .c file which implements it and one .h file which defines
> the interface of this unit. So I've split the UBI code so that there is
> a distinct patch for each unit header and implementations. Plus there are
> several patches like JFFS2 support, user-space-visible headers, etc.
> To realize what is the unit responsible for, glance at its header - there
> is a short description.

Artem,

        This patch introduces the Makefile before any of the source
files, which means it will break "git bisect" operations.  Could you
please refactor your patches so that the tree will build after any
point in your patch series?  

	BTW, I'm not personally fond of having a separate .h file for
every .c file; in my mind a much better division is a public and
private header file.

						- Ted

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 07/44 take 2] [UBI] misc unit header
  2007-02-17 16:54 ` [PATCH 07/44 take 2] [UBI] misc unit header Artem Bityutskiy
@ 2007-02-17 22:59   ` Theodore Tso
  2007-02-19 11:00     ` Christoph Hellwig
  2007-02-19 11:13     ` Artem Bityutskiy
  0 siblings, 2 replies; 129+ messages in thread
From: Theodore Tso @ 2007-02-17 22:59 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, Feb 17, 2007 at 06:54:59PM +0200, Artem Bityutskiy wrote:
> diff -auNrp tmp-from/drivers/mtd/ubi/misc.h tmp-to/drivers/mtd/ubi/misc.h
> --- tmp-from/drivers/mtd/ubi/misc.h	1970-01-01 02:00:00.000000000 +0200
> +++ tmp-to/drivers/mtd/ubi/misc.h	2007-02-17 18:07:26.000000000 +0200
> @@ -0,0 +1,146 @@
> +#define xquotise(s) #s
> +#define quotise(s) xquotise(s)

Nothing in your patch series uses this, and it's identical to
stringify().   Please remove?
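
For readers unfamiliar with the idiom, here is a minimal sketch of the
two-level stringification that both quotise() and the kernel helper
(__stringify() in <linux/stringify.h>) rely on. The demo_ names and
DEMO_VERSION are made up for the example.

```c
/* Two-level expansion: the outer macro expands its argument first. */
#define demo_stringify_1(x) #x
#define demo_stringify(x) demo_stringify_1(x)

#define DEMO_VERSION 1

/* Expands to "v" "1", which the compiler concatenates into "v1". */
const char *demo_version_string(void)
{
	return "v" demo_stringify(DEMO_VERSION);
}

/* One-level expansion stringifies the macro name itself. */
const char *demo_version_name(void)
{
	return demo_stringify_1(DEMO_VERSION);
}
```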

> +/**
> + * rb_for_each_entry - walk an RB-tree.
> + *
> + * @rb: a pointer to type 'struct rb_node' to to use as a loop counter
> + * @pos: a pointer to RB-tree entry type to use as a loop counter
> + * @root: RB-tree's root
> + * @member: the name of the 'struct rb_node' within the RB-tree entry
> + */
> +#define rb_for_each_entry(rb, pos, root, member)...

Shouldn't this be added to include/linux/rbtree.h?

> +/**
> + * strdup_len - duplicate a string with known length.
> + *
> + * @str: original string
> + * @len: the length of the string
> + */
> +char *strdup_len(const char *str, int len);

I'm not sure this should be polluting the kernel symbol namespace,
especially since the implementation calls ubi_assert()....

It's not clear the assertion is all that useful, but if you must have
it, why not do the check as an inline (with the assertion normally
turned off), and then call out to kmemdup()?
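
Something along these lines, sketched in user space. demo_memdup() is a
stand-in for kmemdup(), assert() stands in for ubi_assert(), and the check on
the string length is only a guess at what the original assertion verifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for kmemdup(): copy len bytes into a new buffer. */
static inline void *demo_memdup(const void *src, size_t len)
{
	void *copy = malloc(len);

	if (copy)
		memcpy(copy, src, len);
	return copy;
}

/* Duplicate a string of known length, including its terminating NUL. */
static inline char *demo_strdup_len(const char *str, size_t len)
{
	assert(strlen(str) == len);	/* compiled out when NDEBUG is defined */
	return demo_memdup(str, len + 1);
}
```

Being static inline, the helper adds nothing to the global symbol namespace.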

						- Ted

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 15/44 take 2] [UBI] scanning unit header
  2007-02-17 16:55 ` [PATCH 15/44 take 2] [UBI] scanning unit header Artem Bityutskiy
@ 2007-02-17 23:07   ` Theodore Tso
  2007-02-18  2:17     ` Josh Boyer
  0 siblings, 1 reply; 129+ messages in thread
From: Theodore Tso @ 2007-02-17 23:07 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, Feb 17, 2007 at 06:55:40PM +0200, Artem Bityutskiy wrote:
> +/**
> + * ubi_scan_erase_peb - erase a physical eraseblock.
> + *
> + * @ubi: the UBI device description object
> + * @si: a pointer to the scanning information
> + * @pnum: physical eraseblock number to erase;
> + * @ec: erase counter value to write (%NAND_SCAN_UNKNOWN_EC if it is unknown)
> + *
> + * This function erases physical eraseblock 'pnum', and writes the erase
> + * counter header to it. This function should only be used on UBI device
> + * initialization stages, when the EBA unit had not been yet initialized. This
> + * function returns zero in case of success and a negative error code in case
> + * of failure.
> + */

This is a general comment that applies across your entire patchset.
It would be a lot easier to review the patchset if you put the Docbook
description of the function with the .c file instead of the .h file.
This will also make it much more likely that when you or other people
update the code function, that the documentation will get updated as
well.

I'd recommend doing this along with combining all of your *.h files
into a ubi_private.h and ubi.h file.

Regards,

							- Ted

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 04/44 take 2] [UBI] kernel-spce API header
  2007-02-17 16:54 ` [PATCH 04/44 take 2] [UBI] kernel-spce " Artem Bityutskiy
@ 2007-02-18  1:32   ` Greg KH
  2007-02-18  2:08     ` Josh Boyer
  2007-02-26 12:12     ` Artem Bityutskiy
  0 siblings, 2 replies; 129+ messages in thread
From: Greg KH @ 2007-02-18  1:32 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sat, Feb 17, 2007 at 06:54:44PM +0200, Artem Bityutskiy wrote:
> diff -auNrp tmp-from/include/linux/mtd/ubi.h tmp-to/include/linux/mtd/ubi.h
> --- tmp-from/include/linux/mtd/ubi.h	1970-01-01 02:00:00.000000000 +0200
> +++ tmp-to/include/linux/mtd/ubi.h	2007-02-17 18:07:26.000000000 +0200
> @@ -0,0 +1,391 @@
> +/*
> + * Copyright (c) International Business Machines Corp., 2006
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.

Are you sure this is the proper license for new kernel code coming from
IBM these days?  You might want to go verify that the "or any later
version" is allowed right now...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-17 21:14   ` Arnd Bergmann
@ 2007-02-18  2:04     ` Josh Boyer
  2007-02-18  2:15       ` Arnd Bergmann
  2007-02-19 12:29       ` Christoph Hellwig
  2007-02-19 13:30     ` Artem Bityutskiy
  1 sibling, 2 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-18  2:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sat, Feb 17, 2007 at 10:14:54PM +0100, Arnd Bergmann wrote:
> On Saturday 17 February 2007 17:57, Artem Bityutskiy wrote:
> > + * This unit is responsible for emulating MTD devices on top of UBI devices.
> > + * This sounds strange, but it is in fact quite useful to make legacy software
> > + * work on top of UBI. New software should use native UBI API instead.
> > + *
> > + * Gluebi emulated MTD devices of "MTD_UBIVOLUME" type. Their minimal I/O unit
> > + * size (mtd->writesize) is equivalent to the underlying flash minimal I/O
> > + * unit. The eraseblock size is equivalent to the logical UBI volume eraseblock
> > + * size.
> 
> This approach doesn't seem to make sense at all. If the MTD device interface
> is flawed, the right approach should be to fix that instead. After all,
> there are not many users of the MTD interface, so you should be able to
> adapt them.

No, the MTD interface isn't flawed.  gluebi is present to make things like
JFFS2 work on top of UBI volumes with very little adaptations.  If you go
changing _every_ MTD user to now use either an MTD device or a native UBI
device, then the code for those users just gets bloated.

> In fact, I would expect that there is much more reason to merge the existing
> MTD interface with the block interface in the kernel, but you now introduce
> a third interface that is unrelated to the first two, and make another
> conversion to convert it back?
> 
> Let's assume I want to use the wear levelling capabilities of UBI on top
> of an SD card, and use the ext3 file system on top of it. I get a stack of
> 
> 1. MMC
> 2. block2mtd
> 3. UBI
> 4. gluebi
> 5. mtdblock
> 6. VFS

Assuming your SD card isn't doing wear-leveling itself within the device,
yes that is what you would get.  Or you could do something slightly more sane
and use:

1. MMC
2. block2mtd
3. JFFS2

> when in an ideal world, it should just be
> 
> 1. MMC
> 2. UBI
> 3. VFS

This could perhaps still be done.  UBI has a general concept of an io_unit
so theoretically it could be adapted to work with the block layer in the
kernel.

josh

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 04/44 take 2] [UBI] kernel-spce API header
  2007-02-18  1:32   ` Greg KH
@ 2007-02-18  2:08     ` Josh Boyer
  2007-02-26 12:12     ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-18  2:08 UTC (permalink / raw)
  To: Greg KH
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sat, Feb 17, 2007 at 05:32:19PM -0800, Greg KH wrote:
> On Sat, Feb 17, 2007 at 06:54:44PM +0200, Artem Bityutskiy wrote:
> > diff -auNrp tmp-from/include/linux/mtd/ubi.h tmp-to/include/linux/mtd/ubi.h
> > --- tmp-from/include/linux/mtd/ubi.h	1970-01-01 02:00:00.000000000 +0200
> > +++ tmp-to/include/linux/mtd/ubi.h	2007-02-17 18:07:26.000000000 +0200
> > @@ -0,0 +1,391 @@
> > +/*
> > + * Copyright (c) International Business Machines Corp., 2006
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> 
> Are you sure this is the proper license for new kernel code coming from
> IBM these days?  You might want to go verify that the "or any later
> version" is allowed right now...

The code was actually released to the community a while ago.  But point noted
and we'll look into it.  Thanks for pointing it out Greg.

josh

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18  2:04     ` Josh Boyer
@ 2007-02-18  2:15       ` Arnd Bergmann
  2007-02-18  3:02         ` Josh Boyer
  2007-02-19 12:29       ` Christoph Hellwig
  1 sibling, 1 reply; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-18  2:15 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sunday 18 February 2007 03:04, Josh Boyer wrote:
> No, the MTD interface isn't flawed.  gluebi is present to make things like
> JFFS2 work on top of UBI volumes with very little adaptations.  If you go
> changing _every_ MTD user to now use either an MTD device or a native UBI
> device, then the code for those users just gets bloated.

Right, that was my point. If the MTD API in the kernel is not flawed, why
do we need the 'native' UBI interface? Just merge gluebi into UBI and
get rid of the extra abstraction.

> Assuming your SD card isn't doing wear-leveling itself within the device,
> yes that is what you would get.  

While probably all modern SD cards have some amount of wear leveling
built in, I wouldn't want to rely on that for anything but the simple
large-file-on-fatfs (jpeg or mp3) case. Using UBI on top of the
native wear-leveling sounds like the right solution.

> Or you could do something slightly more sane 
> and use:
> 
> 1. MMC
> 2. block2mtd
> 3. JFFS2

Not on a 4GB SD medium, with the current jffs2 version. The problem
is that jffs2 doesn't scale that well, so you want a different fs.
Since logfs isn't stable yet, you end up with something like ext3,
which in turn means that you need a UBI-like concept to avoid
wearing out the blocks that store your metadata.

	Arnd <><

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 15/44 take 2] [UBI] scanning unit header
  2007-02-17 23:07   ` Theodore Tso
@ 2007-02-18  2:17     ` Josh Boyer
  0 siblings, 0 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-18  2:17 UTC (permalink / raw)
  To: Theodore Tso, Artem Bityutskiy, Linux Kernel Mailing List,
	Christoph Hellwig, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse

On Sat, Feb 17, 2007 at 06:07:46PM -0500, Theodore Tso wrote:
> 
> This is a general comment that applies across your entire patchset.
> It would be a lot easier to review the patchset if you put the Docbook
> description of the function with the .c file instead of the .h file.
> This will also make it much more likely that when you or other people
> update the code function, that the documentation will get updated as
> well.
> 
> I'd recommend doing this along with combining all of your *.h files
> into a ubi_private.h and ubi.h file.

I agree.  Thanks for the suggestions Ted.

josh

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18  2:15       ` Arnd Bergmann
@ 2007-02-18  3:02         ` Josh Boyer
  2007-02-18 22:37           ` Arnd Bergmann
  2007-02-19 14:07           ` Jörn Engel
  0 siblings, 2 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-18  3:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sun, Feb 18, 2007 at 03:15:23AM +0100, Arnd Bergmann wrote:
> On Sunday 18 February 2007 03:04, Josh Boyer wrote:
> > No, the MTD interface isn't flawed.  gluebi is present to make things like
> > JFFS2 work on top of UBI volumes with very little adaptations.  If you go
> > changing _every_ MTD user to now use either an MTD device or a native UBI
> > device, then the code for those users just gets bloated.
> 
> Right, that was my point. If the MTD API in the kernel is not flawed, why
> do we need the 'native' UBI interface? Just merge gluebi into UBI and
> get rid of the extra abstraction.

That suggestion came up several times.  gluebi represents a compromise
between the two groups.  IIRC, the issue was that representing UBI volumes
as MTD devices only makes sense in the dynamic volume case.  Static UBI
volumes require special write/update handling and so there was a need for
a native interface anyway.

> 
> > Assuming your SD card isn't doing wear-leveling itself within the device,
> > yes that is what you would get.  
> 
> While probably all modern SD cards have some amount of wear leveling
> built in, I wouldn't want to rely on that for anything but the simple
> large-file-on-fatfs (jpeg or mp3) case. Using UBI on top of the
> native wear-leveling sounds like the right solution.

Yeah.  Unfortunately, SD/USB/CF cards are all in sort of an awkward spot
when it comes to things like that.  They don't expose the raw flash
underneath, and they don't provide any indication of how robust the
built in wear-leveling is.  Ugh.

> > Or you could do something slightly more sane 
> > and use:
> > 
> > 1. MMC
> > 2. block2mtd
> > 3. JFFS2
> 
> Not on a 4GB SD medium, with the current jffs2 version. The problem
> is that jffs2 doesn't scale that well, so you want a different fs.

Oh, believe me I know. :)

> Since logfs isn't stable yet, you end up with something like ext3,
> which in turn means that you need a UBI-like concept to avoid
> wearing out the blocks that store your metadata.

That just sounds like we need Jörn to get off his butt and finish logfs ;)
Seriously, until something like that is done we'll be stuck with some
not-so-pleasant solutions for these kinds of devices.

josh

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18  3:02         ` Josh Boyer
@ 2007-02-18 22:37           ` Arnd Bergmann
  2007-02-19 13:52             ` Artem Bityutskiy
  2007-02-19 14:01             ` Josh Boyer
  2007-02-19 14:07           ` Jörn Engel
  1 sibling, 2 replies; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-18 22:37 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sunday 18 February 2007 04:02:17 Josh Boyer wrote:
> On Sun, Feb 18, 2007 at 03:15:23AM +0100, Arnd Bergmann wrote:
> > On Sunday 18 February 2007 03:04, Josh Boyer wrote:
> > > No, the MTD interface isn't flawed.  gluebi is present to make things
> > > like JFFS2 work on top of UBI volumes with very little adaptations.  If
> > > you go changing _every_ MTD user to now use either an MTD device or a
> > > native UBI device, then the code for those users just gets bloated.
> >
> > Right, that was my point. If the MTD API in the kernel is not flawed, why
> > do we need the 'native' UBI interface? Just merge gluebi into UBI and
> > get rid of the extra abstraction.
>
> That suggestion came up several times.  gluebi represents a compromise
> between the two groups.  IIRC, the issue was that representing UBI volumes
> as MTD devices only makes sense in the dynamic volume case.  Static UBI
> volumes require special write/update handling and so there was a need for
> a native interface anyway.

Which brings me back to my original point ;-)

I'm sure this has been discussed before, but I'd still like to understand
what is so special with 'static UBI volumes' that they can't be used with
a slightly extended MTD interface.

	Arnd <><

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (44 preceding siblings ...)
  2007-02-17 22:49 ` [PATCH 00/44 take 2] [UBI] Unsorted Block Images Theodore Tso
@ 2007-02-19 10:50 ` Christoph Hellwig
  2007-02-19 17:44   ` Artem Bityutskiy
  2007-02-20 14:52 ` John Stoffel
  46 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 10:50 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sat, Feb 17, 2007 at 06:54:24PM +0200, Artem Bityutskiy wrote:
> The structure of the UBI code is very simple. Whole UBI consists of units.
> Each unit has one .c file which implements it and one .h file which defines
> the interface of this unit. So I've split the UBI code so that there is
> a distinct patch for each unit header and implementations. Plus there are
> several patches like JFFS2 support, user-space-visible headers, etc.
> To realize what is the unit responsible for, glance at its header - there
> is a short description.

I think this is the wrong approach.  For one thing the term "unit" is
rather foreign in Linux, and second, one header file per "unit" is
definitely too much.  Normally you'd have as few as possible and at
most four:

 (1) internal definitions of the driver
 (2) external kernel interface of the driver if necessary
 (3) user interface of the driver if necessary
 (4) on-disk structure if necessary

Also please make sure you submit patches at this driver level.  You really
want one git commit per driver that is totally self-contained.  If necessary
you'd split this into multiple patches if a single one would be bigger than
the message size limit.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-17 16:54 ` [PATCH 05/44 take 2] [UBI] internal common header Artem Bityutskiy
  2007-02-17 21:05   ` Arnd Bergmann
@ 2007-02-19 10:54   ` Christoph Hellwig
  2007-02-19 12:38     ` Josh Boyer
  2007-02-20 13:05     ` Artem Bityutskiy
  1 sibling, 2 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 10:54 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, Feb 17, 2007 at 06:54:49PM +0200, Artem Bityutskiy wrote:
> +#ifndef __UBI_UBI_H__
> +#define __UBI_UBI_H__
> +
> +#include <linux/mtd/ubi.h>
> +
> +/* Version of this UBI implementation */
> +#define UBI_VERSION 1

We shouldn't have versions for in-kernel interfaces.

> +/* UBI messages printk level */
> +#define UBI_MSG_LEVEL  KERN_INFO
> +#define UBI_WARN_LEVEL KERN_WARNING
> +#define UBI_ERR_LEVEL  KERN_ERR

Please use the KERN_ macros directly.

> +/* Normal UBI messages */
> +#define ubi_msg(fmt, ...)                                                   \
> +	printk(UBI_MSG_LEVEL UBI_MSG_PREF " " fmt "\n", ##__VA_ARGS__)
> +/* UBI warning messages */
> +#define ubi_warn(fmt, ...)                                                  \
> +	printk(UBI_WARN_LEVEL UBI_WARN_PREF " %s: " fmt "\n", __FUNCTION__, \
> +	       ##__VA_ARGS__)
> +/* UBI error messages */
> +#define ubi_err(fmt, ...)                                                   \
> +	printk(UBI_ERR_LEVEL UBI_ERR_PREF " %s " fmt "\n", __FUNCTION__,    \
> +	       ##__VA_ARGS__)

you should probably use dev_printk and the wrappers around it in most
places.

> +extern int ubis_num;
> +extern struct ubi_info *ubis[UBI_MAX_INSTANCES];

Having this kind of global information directly exposed is a very
bad idea.  In general you only want to access it through more
specific information and avoid allocating the global array at all.
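
One possible shape for this, as a user-space sketch only: keep the device list
private to one file and hand out entries through a lookup function. All names
here are hypothetical, and the locking a real driver would need is left out.

```c
#include <stddef.h>

struct demo_dev {
	int ubi_num;
	struct demo_dev *next;
};

/* File-local list head: other files cannot touch it directly. */
static struct demo_dev *demo_devs;

/* Add a device to the front of the private list. */
void demo_dev_register(struct demo_dev *dev)
{
	dev->next = demo_devs;
	demo_devs = dev;
}

/* Return the device with the given number, or NULL if there is none. */
struct demo_dev *demo_dev_find(int ubi_num)
{
	struct demo_dev *dev;

	for (dev = demo_devs; dev; dev = dev->next)
		if (dev->ubi_num == ubi_num)
			return dev;
	return NULL;
}
```

A list also removes the need for a fixed upper bound such as
UBI_MAX_INSTANCES.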


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-17 16:54 ` [PATCH 06/44 take 2] [UBI] startup code Artem Bityutskiy
@ 2007-02-19 10:59   ` Christoph Hellwig
  2007-02-20 13:00     ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 10:59 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sat, Feb 17, 2007 at 06:54:54PM +0200, Artem Bityutskiy wrote:
> +/* UBI headers must take 64 bytes. The below is a hacky way to ensure this */
> +static int __ubi_check_ec_hdr_size[(UBI_EC_HDR_SIZE == 64) - 1]
> +    __attribute__ ((__unused__));
> +static int __ubi_check_ec_hdr_size[(UBI_VID_HDR_SIZE == 64) - 1]
> +    __attribute__ ((__unused__));

please use BUILD_BUG_ON instead.
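
For illustration, a user-space imitation of the idea. DEMO_BUILD_BUG_ON is
modelled on the kernel macro but defined locally here, and the header layout
is hypothetical, chosen only so that it adds up to 64 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* A true condition yields a negative array size, which fails to compile. */
#define DEMO_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical 64-byte on-flash header, not the real UBI layout. */
struct demo_ec_hdr {
	uint32_t magic;
	uint8_t version;
	uint8_t padding1[3];
	uint64_t ec;
	uint8_t padding2[44];
	uint32_t hdr_crc;
} __attribute__ ((packed));

/* The check costs nothing at run time; the function just reports the size. */
size_t demo_ec_hdr_size(void)
{
	DEMO_BUILD_BUG_ON(sizeof(struct demo_ec_hdr) != 64);
	return sizeof(struct demo_ec_hdr);
}
```

Unlike the unused-array trick, the check lives inside a function and needs no
dummy variables or __unused__ attributes.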

> +
> +static int ubi_attach_mtd_dev(const char *mtd_dev, int vid_hdr_offset,
> +			      int data_offset);
> +static void ubi_destroy_dev(int ubi_num);

Can you reorder the code to avoid all these forward declarations please?

> +	/* Attach MTD devices */
> +	for (i = 0; i < mtd_devs; i++) {
> +		struct mtd_dev_param *p = &mtd_dev_param[i];
> +
> +		cond_resched();
> +		err = -EINVAL;

This looks very odd.

> +module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 000);
> +MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: "
> +		      "mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]. "
> +		      "Multiple \"mtd\" parameters may be specified.\n"
> +		      "MTD devices may be specified by their number or name. "
> +		      "Optional \"vid_hdr_offs\" and \"data_offs\" parameters "
> +		      "specify UBI VID header position and data starting "
> +		      "position to be used by UBI.\n"
> +		      "Example: mtd=content,1984,2048 mtd=4 - attach MTD device"
> +		      "with name content using VID header offset 1984 and data "
> +		      "start 2048, and MTD device number 4 using default "
> +		      "offsets");

This is a very odd parameter interface.  We really don't want drivers to use
module_param_call directly.  You probably want various module_param_array calls
instead.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 07/44 take 2] [UBI] misc unit header
  2007-02-17 22:59   ` Theodore Tso
@ 2007-02-19 11:00     ` Christoph Hellwig
  2007-02-20 12:56       ` Artem Bityutskiy
  2007-02-19 11:13     ` Artem Bityutskiy
  1 sibling, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 11:00 UTC (permalink / raw)
  To: Theodore Tso, Artem Bityutskiy, Linux Kernel Mailing List,
	Christoph Hellwig, Frank Haverkamp, Josh Boyer, Thomas Gleixner,
	David Woodhouse

On Sat, Feb 17, 2007 at 05:59:08PM -0500, Theodore Tso wrote:
> On Sat, Feb 17, 2007 at 06:54:59PM +0200, Artem Bityutskiy wrote:
> > diff -auNrp tmp-from/drivers/mtd/ubi/misc.h tmp-to/drivers/mtd/ubi/misc.h
> > --- tmp-from/drivers/mtd/ubi/misc.h	1970-01-01 02:00:00.000000000 +0200
> > +++ tmp-to/drivers/mtd/ubi/misc.h	2007-02-17 18:07:26.000000000 +0200
> > @@ -0,0 +1,146 @@
> > +#define xquotise(s) #s
> > +#define quotise(s) xquotise(s)
> 
> Nothing in your patch series uses this, and it's identical to
> stringify().   Please remove?
> 
> > +/**
> > + * rb_for_each_entry - walk an RB-tree.
> > + *
> > + * @rb: a pointer to type 'struct rb_node' to to use as a loop counter
> > + * @pos: a pointer to RB-tree entry type to use as a loop counter
> > + * @root: RB-tree's root
> > + * @member: the name of the 'struct rb_node' within the RB-tree entry
> > + */
> > +#define rb_for_each_entry(rb, pos, root, member)...
> 
> Shouldn't this be added to include/linux/rbtree.h?
> 
> > +/**
> > + * strdup_len - duplicate a string with known length.
> > + *
> > + * @str: original string
> > + * @len: the length of the string
> > + */
> > +char *strdup_len(const char *str, int len);
> 
> I'm not sure this should be polluting the kernel symbol namespace,
> especially since the implementation calls ubi_assert()....
> 
> It's not clear the assertion is all that useful, but if you must have
> it, why not do the check as an inline (with the assertion normally
> turned off), and then call out to kmemdup()?

Agreement to all of the above.  And _please_ move all kdoc comments to
the implementation files.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 09/44 take 2] [UBI] debug unit header
  2007-02-17 21:18   ` Arnd Bergmann
@ 2007-02-19 11:00     ` Christoph Hellwig
  2007-02-19 12:33     ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 11:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, Feb 17, 2007 at 10:18:21PM +0100, Arnd Bergmann wrote:
> On Saturday 17 February 2007 17:55, Artem Bityutskiy wrote:
> > +
> > +/**
> > + * UBI debugging unit.
> > + *
> > + * UBI provides rich debugging capabilities which are implemented in
> > + * this unit.
> 
> Stop right here. You should be doing one thing and do it right.
> Since the point of your patches is to do volume management for MTD,
> it should do just that.
> 
> If you feel that Linux needs rich debugging capabilities, then submit
> a patch for that independent of UBI.
> 

*nod*

* Re: [PATCH 12/44 take 2] [UBI] allocation unit implementation
  2007-02-17 20:55   ` Arnd Bergmann
@ 2007-02-19 11:05     ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 11:05 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sat, 2007-02-17 at 21:55 +0100, Arnd Bergmann wrote:
> On Saturday 17 February 2007 17:55, Artem Bityutskiy wrote:
> > +#include "wl.h"
> > +#include "debug.h"
> > +#include "eba.h"
> > +#include "scan.h"
> 
> I don't see much point in having one local header for each of these,
> you could simply put all of the declarations into one header in the
> ubi directory.

I'll put them all into ubi.h to start with. I am still not convinced
that having one big header is more readable than per-unit headers.

> > +#define WL_PROT_ENTRY_SLAB_NAME   "ubi_wl_prow_entry_slab"
> > +#define EBA_LTREE_ENTRY_SLAB_NAME "ubi_eba_ltree_entry_slab"
> > +#define SCAN_EB_SLAB_NAME         "ubi_scan_leb"
> > +#define SCAN_VOLUME_SLAB_NAME     "ubi_scan_volume"
> 
> These macros seem rather pointless, each of them is only used
> once, and the macro name directly corresponds to the contents.

Done.

> > +static struct kmem_cache *bgt_work_slab;
> > +static struct kmem_cache *wl_erase_work_slab;
> > +static struct kmem_cache *wl_entries_slab;
> > +static struct kmem_cache *wl_prot_entry_slab;
> > +static struct kmem_cache *eba_ltree_entry_slab;
> > +static struct kmem_cache *scan_eb_slab;
> > +static struct kmem_cache *scan_volume_slab;
> 
> Do you really need all these slab caches? If a cache only contains
> a small number of objects, e.g. one per volume, then you're much
> better off using a regular kmalloc.

Good point, done. Left only wl_entries_slab, wl_prot_entry_slab, and
eba_ltree_entry_slab.


> > +void ubi_kfree(const void *obj)
> > +{
> > +	if (unlikely(!obj))
> > +		return;
> > +	kfree(obj);
> > +}
> 
> These look somewhat too complex. Don't introduce your own generic
> infrastructure if you can help it. IIRC, when kmalloc fails, you
> already get the full stack trace from the buddy allocator, so
> this is just duplication. Better use the regular kzalloc/kfree
> calls directly.

Done.

> > +void ubi_free_ec_hdr(const struct ubi_info *ubi, struct ubi_ec_hdr *ec_hdr)
> > +{
> > +	if (unlikely(!ec_hdr))
> > +		return;
> > +	kfree(ec_hdr);
> > +}
> 
> same for this and the others. Unless the allocation is done in many
> places in the code from a single slab cache, just call kmem_cache_alloc
> or kmalloc directly.

I need a wrapper for VID header allocation, so better to leave this for
symmetry.

Thanks, Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 16/44 take 2] [UBI] scanning unit implementation
  2007-02-17 16:55 ` [PATCH 16/44 take 2] [UBI] scanning unit implementation Artem Bityutskiy
@ 2007-02-19 11:05   ` Christoph Hellwig
  2007-02-19 14:11     ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 11:05 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

> +	for (pnum = 0; pnum < io->peb_count; pnum++) {
> +		cond_resched();
> +
> +		err = process_eb(ubi, si, pnum);
> +		if (unlikely(err < 0))
> +			goto out_vidh;
> +	}

> +	rb_for_each_entry(rb1, sv, &si->volumes, rb) {
> +		cond_resched();
> +		rb_for_each_entry(rb2, seb, &sv->root, u.rb)
> +			if (seb->ec == NAND_SCAN_UNKNOWN_EC)
> +				seb->ec = si->mean_ec;
> +	}
> +
> +	cond_resched();
> +	list_for_each_entry(seb, &si->free, u.list)
> +		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
> +			seb->ec = si->mean_ec;
> +
> +	cond_resched();
> +	list_for_each_entry(seb, &si->corr, u.list)
> +		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
> +			seb->ec = si->mean_ec;
> +
> +	cond_resched();
> +	list_for_each_entry(seb, &si->erase, u.list)
> +		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
> +			seb->ec = si->mean_ec;

You really shouldn't need random cond_resched all over the place.

> +static int vid_hdr_sanity_check(const struct ubi_info *ubi,
> +				const struct ubi_vid_hdr *vid_hdr,
> +				const struct ubi_scan_volume *sv, int pnum);
> +
> +static int add_to_erase(struct ubi_scan_info *si, int pnum, int ec);
> +
> +static struct ubi_scan_volume *add_volume(const struct ubi_info *ubi,
> +					  struct ubi_scan_info *si, int vol_id,
> +					  int pnum,
> +					  const struct ubi_vid_hdr *vid_hdr);
> +
> +static int compare_lebs(const struct ubi_info *ubi,
> +			const struct ubi_scan_leb *seb, int pnum,
> +			const struct ubi_vid_hdr *vid_hdr);

forward declarations in the middle of the file are really annoying.  Please try
to reorder the code to not need them at all if possible, and if needed add them
to the top of the file.

> +void ubi_scan_destroy_si(struct ubi_scan_info *si)
> +{
> +	struct ubi_scan_leb *seb, *seb_tmp;
> +	struct ubi_scan_volume *sv;
> +	struct rb_node *rb;
> +
> +	list_for_each_entry_safe(seb, seb_tmp, &si->alien, u.list) {
> +		list_del(&seb->u.list);
> +		ubi_free_scan_leb(seb);
> +	}

no locking needed here?


* Re: [PATCH 22/44 take 2] [UBI] background thread unit implementation
  2007-02-17 16:56 ` [PATCH 22/44 take 2] [UBI] background thread unit implementation Artem Bityutskiy
@ 2007-02-19 11:09   ` Christoph Hellwig
  2007-02-19 13:55     ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 11:09 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

This really should use the linux/kthread.h infrastructure, and most of the
code in here would simply go away.

* Re: [PATCH 07/44 take 2] [UBI] misc unit header
  2007-02-17 22:59   ` Theodore Tso
  2007-02-19 11:00     ` Christoph Hellwig
@ 2007-02-19 11:13     ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 11:13 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, 2007-02-17 at 17:59 -0500, Theodore Tso wrote:
> > +#define xquotise(s) #s
> > +#define quotise(s) xquotise(s)
> 
> Nothing in your patch series uses this, and it's identical to
> stringify().   Please remove?

Done, thanks.
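
For readers wondering why the macro comes in two layers: a single `#s`
stringifies the argument exactly as spelled, while the extra level lets the
preprocessor expand the argument first. The kernel already ships this pair as
__stringify_1()/__stringify() in include/linux/stringify.h. A minimal
userspace sketch follows; the MY_ and DEMO_ names are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* One level: turns the argument's spelling into a string literal. */
#define MY_STRINGIFY_1(s) #s
/* Two levels: the argument is macro-expanded before being stringified. */
#define MY_STRINGIFY(s) MY_STRINGIFY_1(s)

#define DEMO_MAX_INSTANCES 32	/* hypothetical constant for the demo */

/* Returns 1 if both forms behave as described above, 0 otherwise. */
int stringify_demo(void)
{
	const char *raw = MY_STRINGIFY_1(DEMO_MAX_INSTANCES);
	const char *expanded = MY_STRINGIFY(DEMO_MAX_INSTANCES);

	return strcmp(raw, "DEMO_MAX_INSTANCES") == 0 &&
	       strcmp(expanded, "32") == 0;
}
```

Since the two-level form is all that quotise() provided, dropping it in
favour of the existing kernel macro loses nothing.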

> > +/**
> > + * rb_for_each_entry - walk an RB-tree.
> > + *
> > + * @rb: a pointer to type 'struct rb_node' to to use as a loop counter
> > + * @pos: a pointer to RB-tree entry type to use as a loop counter
> > + * @root: RB-tree's root
> > + * @member: the name of the 'struct rb_node' within the RB-tree entry
> > + */
> > +#define rb_for_each_entry(rb, pos, root, member)...
> 
> Shouldn't this be added to include/linux/rbtree.h?

Not sure. For generic code we would probably want several of these,
depending on how the tree is traversed. But I could try to submit it to
generic code if you think it makes sense.

> > +char *strdup_len(const char *str, int len);
> 
> I'm not sure this should be polluting the kernel symbol namespace,
> especially since the implementation calls ubi_assert()....
> 
> It's not clear the assertion is all that useful, but if you must have
> it, why not do the check as an inline (with the assertion normally
> turned off), and then call out to kmemdup()?

Got rid of it and use kmemdup() directly, thanks.
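
As a rough illustration of what that replacement amounts to, here is a
userspace sketch. memdup_sketch() is a hypothetical stand-in for the kernel's
kmemdup() (which additionally takes a gfp_t argument), and it is an assumption
that the original @len did not count the trailing NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace stand-in for kmemdup(): copy exactly @len bytes of @src into
 * freshly allocated memory. Returns NULL if the allocation fails.
 */
void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/*
 * Duplicate a string whose length (without the trailing NUL) is already
 * known. Passing len + 1 copies the terminator as well.
 */
char *strdup_known_len(const char *str, size_t len)
{
	assert(strlen(str) == len);	/* debug-only check, gone with -DNDEBUG */
	return memdup_sketch(str, len + 1);
}
```

The length check is the only thing the dedicated helper added, and as
suggested in the review it can live in a small inline with the assertion
normally compiled out.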

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 12/44 take 2] [UBI] allocation unit implementation
  2007-02-17 16:55 ` [PATCH 12/44 take 2] [UBI] allocation unit implementation Artem Bityutskiy
  2007-02-17 20:55   ` Arnd Bergmann
@ 2007-02-19 11:13   ` Pekka Enberg
  2007-02-20 11:30     ` Artem Bityutskiy
  1 sibling, 1 reply; 129+ messages in thread
From: Pekka Enberg @ 2007-02-19 11:13 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On 2/17/07, Artem Bityutskiy <dedekind@infradead.org> wrote:
> +void *ubi_kzalloc(size_t size)
> +{
> +       void *ret;
> +
> +       ret = kzalloc(size, GFP_KERNEL);
> +       if (unlikely(!ret)) {
> +               ubi_err("cannot allocate %zd bytes", size);
> +               dump_stack();
> +               return NULL;
> +       }
> +
> +       return ret;
> +}

[snip, snip]

NAK. Please drop all of these utterly pointless kmalloc() and
kmem_cache_alloc() wrappers.

                                             Pekka

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-17 21:05   ` Arnd Bergmann
@ 2007-02-19 11:16     ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 11:16 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, 2007-02-17 at 22:05 +0100, Arnd Bergmann wrote:
> > +/* Maximum number of supported UBI devices */
> > +#define UBI_MAX_INSTANCES 32
> 
> Does this need to be limited?

This is how it is implemented at the moment. Note, this limits the number
of UBI devices, not UBI _volumes_.

> > +/* UBI error messages */
> > +#define ubi_err(fmt, ...)                                                   \
> > +	printk(UBI_ERR_LEVEL UBI_ERR_PREF " %s " fmt "\n", __FUNCTION__,    \
> > +	       ##__VA_ARGS__)
> 
> You shouldn't need these helpers, just use the regular dev_dbg, dev_info
> and related macros.

I will look at them and see if they are ok.

> > +struct ubi_info {
> > +	int ubi_num;
> > +	struct ubi_io_info   *io;
> > +	struct ubi_bgt_info  *bgt;
> > +	struct ubi_wl_info   *wl;
> > +	struct ubi_beb_info  *beb;
> > +	struct ubi_vmt_info  *vmt;
> > +	struct ubi_ivol_info *ivol;
> > +	struct ubi_vtbl_info *vtbl;
> > +	struct ubi_acc_info  *acc;
> > +	struct ubi_upd_info  *upd;
> > +	struct ubi_eba_info  *eba;
> > +	struct ubi_uif_info  *uif;
> > +};
> 
> I don't know what went wrong here, but this does not at all
> look ok. The members in here probably should all be part
> of the ubi_info structure itself.

Will be fixed, thanks.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 10/44 take 2] [UBI] debug unit implementation
  2007-02-17 21:00   ` Arnd Bergmann
@ 2007-02-19 12:29     ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 12:29 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sat, 2007-02-17 at 22:00 +0100, Arnd Bergmann wrote:
> This whole file looks like it can be removed, as nothing in here
> is really relevant for regular operation. I'm sure that much of it
> was a good help in developing the code and finding the bugs in here,
> but why would you want to merge it into the mainline kernel?

UBI is a new thing and it will likely be tweaked, improved, etc. When I
get a bug report, I just ask the reporter to enable debugging and send me
the log. It is a great help.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18  2:04     ` Josh Boyer
  2007-02-18  2:15       ` Arnd Bergmann
@ 2007-02-19 12:29       ` Christoph Hellwig
  1 sibling, 0 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-19 12:29 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Arnd Bergmann, Artem Bityutskiy, Linux Kernel Mailing List,
	Christoph Hellwig, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse

On Sat, Feb 17, 2007 at 08:04:30PM -0600, Josh Boyer wrote:
> No, the MTD interface isn't flawed.  gluebi is present to make things like
> JFFS2 work on top of UBI volumes with very little adaptations.  If you go
> changing _every_ MTD user to now use either an MTD device or a native UBI
> device, then the code for those users just gets bloated.

That's not what Arnd means.  We really want only one mtd interface that
works transparently with both physical and stackable devices.  That means
instead of adding a new ubi interface you want to add support for stackable
devices to mtd.


* Re: [PATCH 09/44 take 2] [UBI] debug unit header
  2007-02-17 21:18   ` Arnd Bergmann
  2007-02-19 11:00     ` Christoph Hellwig
@ 2007-02-19 12:33     ` Artem Bityutskiy
  2007-02-19 14:02       ` Josh Boyer
  1 sibling, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 12:33 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, 2007-02-17 at 22:18 +0100, Arnd Bergmann wrote:
> If you feel that Linux needs rich debugging capabilities, then submit
> a patch for that independent of UBI.

I will fix the comment which annoys.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-19 10:54   ` Christoph Hellwig
@ 2007-02-19 12:38     ` Josh Boyer
  2007-02-20 13:05     ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-19 12:38 UTC (permalink / raw)
  To: Christoph Hellwig, Artem Bityutskiy, Linux Kernel Mailing List,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Mon, Feb 19, 2007 at 10:54:45AM +0000, Christoph Hellwig wrote:
> On Sat, Feb 17, 2007 at 06:54:49PM +0200, Artem Bityutskiy wrote:
> > +#ifndef __UBI_UBI_H__
> > +#define __UBI_UBI_H__
> > +
> > +#include <linux/mtd/ubi.h>
> > +
> > +/* Version of this UBI implementation */
> > +#define UBI_VERSION 1
> 
> We shouldn't have versions for inkernel interfaces.

The versioning is part of the UBI block headers that are written to the medium.
It is not for in-kernel purposes, but required so things like bootloaders
can understand which version of UBI is on flash.

josh

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-17 22:49 ` [PATCH 00/44 take 2] [UBI] Unsorted Block Images Theodore Tso
@ 2007-02-19 12:48   ` Artem Bityutskiy
  2007-02-19 14:33     ` Theodore Tso
  0 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 12:48 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

Theodore,

On Sat, 2007-02-17 at 17:49 -0500, Theodore Tso wrote:
>         This patch introduces the Makefile before any of the source
> files, which means it will break "git bisect" operations.  Could you
> please refactor your patches so that the tree will build after any
> point in your patch series? 

I actually did not mean for these patches to be applied to a git tree. We
have the UBI git tree to pull from for that purpose. I basically split the
UBI sources manually to make UBI easier to review. I should have added an
"RFC" tag, apologies.

> 	BTW, I'm not personally found of having a separate .h file for
> every .c file; in my mind a much better division is a public and
> private header file.

This reflects my way of thinking. I see UBI as a set of units with
defined interfaces, so I even physically split the interface descriptions
into separate files. I still think it is easier to grasp the architecture
this way.

So it is not a distinct header for each _.c_ file, it is a distinct header
for each _unit_, which is the basic architectural element of UBI.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-17 21:14   ` Arnd Bergmann
  2007-02-18  2:04     ` Josh Boyer
@ 2007-02-19 13:30     ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 13:30 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, 2007-02-17 at 22:14 +0100, Arnd Bergmann wrote:
> This approach doesn't seem to make sense at all. If the MTD device interface
> is flawed, the right approach should be to fix that instead. After all,
> there are not many users of the MTD interface, so you should be able to
> adapt them.

Why would the MTD interface be flawed? It is a good abstraction for flash
chips.

UBI provides too many new services to fit into the MTD interface. UBI != MTD,
but UBI may behave like MTD, although it is wider.

> In fact, I would expect that there is much more reason to merge the existing
> MTD interface with the block interface in the kernel

I do not think so, but the idea sounds exciting; please talk to dwmw2
about this. But surely you know how different flash devices and HDDs
are: http://www.linux-mtd.infradead.org/faq/general.html

MTD devices are bare flashes.
Block devices are HDDs, MMCs, SDs, USB sticks, etc.

> but you now introduce
> a third interface that is unrelated to the first two

Why not? UBI is something which works on top of MTD, so it does relate
to MTD. But yes, it has nothing to do with block devices, and I do not
know why you talk about them. They are just irrelevant in my opinion;
let's remove them from the discussion.

> , and make another
> conversion to convert it back?

All the conversion things were created as debugging tools. I have
not heard of anybody using them in production. Maybe someone does, but
that would be rare, and they must have _really good reasons_ for it.

> Let's assume I want to use the wear levelling capabilities of UBI on top
> of an SD card, and use the ext3 file system on top of it.

I do not see any point in this. An SD card is a block device. It was
designed to be a block device. Using it for a different purpose does not
look reasonable. Use bare flash instead.

Technically it is possible to add block device back-end support to
UBI, but I do not know of any real use case for this.

>  I get a stack of
> 
> 1. MMC
Block device.
> 2. block2mtd
A debugging tool to develop flash software on host. Not normally used
for other purposes.
> 3. UBI
Close to MTD but also has a lot of new services.
> 4. gluebi
MTD devices emulated by UBI.
> 5. mtdblock
Stupid FTL driver, to emulate block devices on top of MTD. Too
straightforward, may only be used in RO mode. Any use in RW mode is
dangerous, as you lose a whole eraseblock in case of an unclean reboot.
> 6. VFS
Generalized FS view of the kernel.

> when in an ideal world, it should just be
> 
> 1. MMC
Just all block devices.
> 2. UBI
Is MTD here?


-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18 22:37           ` Arnd Bergmann
@ 2007-02-19 13:52             ` Artem Bityutskiy
  2007-02-19 14:01             ` Josh Boyer
  1 sibling, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 13:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Josh Boyer, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sun, 2007-02-18 at 23:37 +0100, Arnd Bergmann wrote:
> Which brings me back to my original point ;-)
> 
> I'm sure this has been discussed before, but I'd still like to understand
> what is so special with 'static UBI volumes' that they can't be used with
> a slightly extended MTD interface.

Let me provide a list of new things.

* Two types of character devices: UBI devices and UBI volumes. MTD is
aware of only one type of device - just the MTD device.
* Two types of volumes (static and dynamic).
* New volume update operation.
* Write hints - you may inform UBI which kind of data you are writing -
long-term data, short-term data, or just unknown. Depending on the hint
it will pick a physical eraseblock with a high, low, or medium erase
counter.
* Asynchronous eraseblock erasure operation.
* Atomic eraseblock change operation.
* When you read a static volume, you may select whether you want UBI to
check the CRC or not (CRCs are per-eraseblock, so often it is not
reasonable to check them on every read operation).
* Resizing of volumes, and all the things related to their dynamic
nature.
* Some other small new interfaces.
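
To make the write-hint item in the list above concrete, here is a small
userspace sketch. Everything in it is hypothetical: the enum names, the
function, and the reading that long-term data goes to the most worn free
eraseblock, short-term data to the least worn one, and unknown data to one
close to the mean. It is not the UBI wear-leveling code, only one plausible
shape of such a policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hint values; the real UBI names may differ. */
enum data_hint { HINT_UNKNOWN, HINT_SHORT_TERM, HINT_LONG_TERM };

/*
 * Pick an index into @ec, the erase counters of the free eraseblocks.
 * Long-term data gets the highest counter, short-term data the lowest,
 * and unknown data the counter closest to the mean.
 */
size_t pick_peb(const unsigned int *ec, size_t count, enum data_hint hint)
{
	size_t i, lo = 0, hi = 0, mid = 0;
	unsigned long long sum = 0;
	unsigned int mean, dist, best_dist;

	assert(ec != NULL && count > 0);

	for (i = 0; i < count; i++) {
		if (ec[i] < ec[lo])
			lo = i;
		if (ec[i] > ec[hi])
			hi = i;
		sum += ec[i];
	}

	if (hint == HINT_LONG_TERM)
		return hi;
	if (hint == HINT_SHORT_TERM)
		return lo;

	/* Unknown lifetime: choose the block nearest to the mean counter. */
	mean = (unsigned int)(sum / count);
	best_dist = ec[0] > mean ? ec[0] - mean : mean - ec[0];
	for (i = 1; i < count; i++) {
		dist = ec[i] > mean ? ec[i] - mean : mean - ec[i];
		if (dist < best_dist) {
			best_dist = dist;
			mid = i;
		}
	}
	return mid;
}
```

The intuition behind this reading is that data expected to sit untouched for
a long time can be parked on worn blocks, which then get a rest, while
frequently rewritten data should consume the freshest blocks.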

The whole idea of the MTD interface is to provide a _uniform_ method to
access _all_ flashes. It does not look reasonable for UBI software to use
the MTD interface, because that software is _designed_ for UBI, not for MTD.

So, having a native interface for UBI-only software looks reasonable. But
we are also able to integrate UBI into MTD for MTD software, which looks
like a good design decision and a good compromise.

P.S: Also, I'd ask you to look at the monster mtd_info data structure
and be scared :-) Imagine we add more there.
-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 22/44 take 2] [UBI] background thread unit implementation
  2007-02-19 11:09   ` Christoph Hellwig
@ 2007-02-19 13:55     ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 13:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 11:09 +0000, Christoph Hellwig wrote:
> This really should use the linux/kthread.h infrastructure, and most of the
> code in here would simply go away.

I will look at the infrastructure, thanks.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18 22:37           ` Arnd Bergmann
  2007-02-19 13:52             ` Artem Bityutskiy
@ 2007-02-19 14:01             ` Josh Boyer
  1 sibling, 0 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-19 14:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Sun, 2007-02-18 at 23:37 +0100, Arnd Bergmann wrote:
> On Sunday 18 February 2007 04:02:17 Josh Boyer wrote:
> > On Sun, Feb 18, 2007 at 03:15:23AM +0100, Arnd Bergmann wrote:
> > > On Sunday 18 February 2007 03:04, Josh Boyer wrote:
> > > > No, the MTD interface isn't flawed.  gluebi is present to make things
> > > > like JFFS2 work on top of UBI volumes with very little adaptations.  If
> > > > you go changing _every_ MTD user to now use either an MTD device or a
> > > > native UBI device, then the code for those users just gets bloated.
> > >
> > > Right, that was my point. If the MTD API in the kernel is not flawed, why
> > > do we need the 'native' UBI interface? Just merge gluebi into UBI and
> > > get rid of the extra abstraction.
> >
> > That suggestion came up several times.  gluebi represents a compromise
> > between the two groups.  IIRC, the issue was that representing UBI volumes
> > as MTD devices only makes sense in the dynamic volume case.  Static UBI
> > volumes require special write/update handling and so there was a need for
> > a native interface anyway.
> 
> Which brings me back to my original point ;-)
> 
> I'm sure this has been discussed before, but I'd still like to understand
> what is so special with 'static UBI volumes' that they can't be used with
> a slightly extended MTD interface.

Since I personally loathe the standard "RTFM" response, I'll try and
explain briefly.

Think of static volumes as containers for chunks of whole binary data.
You don't want partial updates to this data, or it's inconsistent.  So a
static volume must be written wholesale.  As UBI writes out the data,
each block has a field in the VID header with the data size and the CRC
for that block.

Take a vmlinux binary for example.  By putting that in a static UBI
volume, a bootloader can read each block in the volume in order, verify
the block CRC and know that the resulting reconstructed image is valid.
If the CRC doesn't match on a block, the data is invalid and the
bootloader can do whatever it determines to be the correct behavior.
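
The bootloader-side check described above might look roughly like the
following userspace sketch. The structures are hypothetical and are NOT the
on-flash UBI VID header layout; the CRC routine is the common zlib-style
CRC-32, which is an assumption, since this thread does not say which CRC
variant UBI uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), zlib-style. */
uint32_t crc32_sketch(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical per-block header, for illustration only. */
struct blk_hdr {
	uint32_t data_size;	/* bytes of payload stored in this block */
	uint32_t data_crc;	/* CRC-32 of those bytes */
};

struct blk {
	struct blk_hdr hdr;
	const uint8_t *data;
};

/*
 * Walk the blocks in order and verify each one. Returns the number of
 * payload bytes verified, or -1 as soon as one block fails its check.
 */
long verify_static_volume(const struct blk *blks, size_t nblks)
{
	long total = 0;
	size_t i;

	assert(blks != NULL || nblks == 0);

	for (i = 0; i < nblks; i++) {
		const struct blk *b = &blks[i];

		if (crc32_sketch(b->data, b->hdr.data_size) != b->hdr.data_crc)
			return -1;
		total += b->hdr.data_size;
	}
	return total;
}
```

What the caller does on a -1 result (fall back to another image, halt, and so
on) is left open here, just as it is left to the bootloader in the
explanation above.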

One could arguably create a new MTD interface called "update" and
provide only that and the read function for static volumes.  But that is
essentially what has been done within UBI itself.  I won't speak for
Artem, who could explain much more than I as to why it was done the way
it currently is.

josh


* Re: [PATCH 09/44 take 2] [UBI] debug unit header
  2007-02-19 12:33     ` Artem Bityutskiy
@ 2007-02-19 14:02       ` Josh Boyer
  2007-02-19 14:04         ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Josh Boyer @ 2007-02-19 14:02 UTC (permalink / raw)
  To: dedekind
  Cc: Arnd Bergmann, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Mon, 2007-02-19 at 14:33 +0200, Artem Bityutskiy wrote:
> On Sat, 2007-02-17 at 22:18 +0100, Arnd Bergmann wrote:
> > If you feel that Linux needs rich debugging capabilities, then submit
> > a patch for that independent of UBI.
> 
> I will fix the comment which annoys.

It is not the comment that annoys people.  Arnd was saying that the
entire debug unit should not exist.

josh


* Re: [PATCH 09/44 take 2] [UBI] debug unit header
  2007-02-19 14:02       ` Josh Boyer
@ 2007-02-19 14:04         ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 14:04 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Arnd Bergmann, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Mon, 2007-02-19 at 08:02 -0600, Josh Boyer wrote:
> It is not the comment that annoys people.  Arnd was saying that the
> entire debug unit should not exist.

There is a lot of per-subsystem debugging stuff in the kernel. I do not
see any _objective_ reason to prohibit us from having our own debugging
stuff in UBI. It is compiled out when disabled. It helps to quickly fix
bugs. It helps to check whether new patches introduce bugs (assertions,
paranoid checks). These are _objective_ reasons to keep it.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 41/44 take 2] [UBI] gluebi unit header
  2007-02-18  3:02         ` Josh Boyer
  2007-02-18 22:37           ` Arnd Bergmann
@ 2007-02-19 14:07           ` Jörn Engel
  1 sibling, 0 replies; 129+ messages in thread
From: Jörn Engel @ 2007-02-19 14:07 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Arnd Bergmann, Artem Bityutskiy, Linux Kernel Mailing List,
	Christoph Hellwig, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse

On Sat, 17 February 2007 21:02:17 -0600, Josh Boyer wrote:
> 
> That just sounds like we need Jörn to get off his butt and finish logfs ;)

He's working on it this very moment.  Help in debugging would be
appreciated, if progress is too slow for people.  That part seems to
consume a lot of time and scale fairly well.

Jörn

-- 
Fancy algorithms are slow when n is small, and n is usually small.
Fancy algorithms have big constants. Until you know that n is
frequently going to be big, don't get fancy.
-- Rob Pike

* Re: [PATCH 16/44 take 2] [UBI] scanning unit implementation
  2007-02-19 11:05   ` Christoph Hellwig
@ 2007-02-19 14:11     ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 14:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 11:05 +0000, Christoph Hellwig wrote:
> > +	cond_resched();
> > +	list_for_each_entry(seb, &si->erase, u.list)
> > +		if (seb->ec == NAND_SCAN_UNKNOWN_EC)
> > +			seb->ec = si->mean_ec;
> 
> You really shouldn't need random cond_resched all over the place.

Good point. I will review the code and clean up stuff like this.

While we are on the point, could you please formulate your criteria for
when cond_resched() should be used?

> > +static int compare_lebs(const struct ubi_info *ubi,
> > +			const struct ubi_scan_leb *seb, int pnum,
> > +			const struct ubi_vid_hdr *vid_hdr);
> 
> forward declarations in the middle of the file are really annoying.  Please try
> to reorder the code to not need them at all if possible, and if needed add them
> to the top of the file.

I tried to lay out the code like this: top-level functions first,
low-level last. This is a matter of taste, isn't it? I am reluctant to
change this, because it is a lot of useless work (I'll need to change it
everywhere in UBI). But I will do it with great delight if it annoys too
many people (you are the second so far).

> > +	list_for_each_entry_safe(seb, seb_tmp, &si->alien, u.list) {
> > +		list_del(&seb->u.list);
> > +		ubi_free_scan_leb(seb);
> > +	}
> 
> no locking needed here?

No, the initialization stage does not need any locking.


Thanks,
Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 12:48   ` Artem Bityutskiy
@ 2007-02-19 14:33     ` Theodore Tso
  2007-02-19 17:07       ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Theodore Tso @ 2007-02-19 14:33 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, Feb 19, 2007 at 02:48:23PM +0200, Artem Bityutskiy wrote:
> I actually did not mean these patches should be included to a git. We
> have UBI git to pull from for these purposes. I basically manually split
> the UBI sources to make UBI easier to review. I should have added an
> "RFC" tag, apologies.

It made it much, much, MUCH harder to review.  Especially given that
the documentation was separated from the implementation.  As I looked
at the implementation, there was no way to look at what it was
supposed to do without flipping back to a previous e-mail message and
losing my place.  

> This reflects the way of my thinking. I see UBI as a set of units with
> defined interfaces. So I even physically split the interface description
> into files. I still think it is easier to grasp the architecture this
> way.

Speaking as someone who was coming into it cold, it actually made it
far more difficult.  Your units were too small, so that meant the
number of interfaces that were created as a result were huge!  (Around
20 _sets_ of interfaces, all of which had to be comprehended for what
should have been a relatively simple set of functionality!)

And when you create that many interfaces, it adds inertia to changing
the interfaces later on, because it's sometimes not clear how many
users of the interface there really are.  My general rule of thumb is
that if an interface only has one user, then it may be a good idea to
combine it with the user of that interface, and then make the
functions involved be a static, so that it becomes clear the only user
of that function is within that one file.  You can take this too far,
and to extremes it doesn't work all that well, but the UBI layer has
gone waaaaaay off the deep end in terms of functional decomposition.

Adding lots of units so you can have formal functional interfaces is
not always the right answer in terms of making code more readable or
maintainable.  And in my humble opinion, UBI is a picture-perfect
demonstration of this.

Regards,

						- Ted


* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 14:33     ` Theodore Tso
@ 2007-02-19 17:07       ` Artem Bityutskiy
  2007-02-19 23:34         ` Theodore Tso
  2007-02-25  5:51         ` Christoph Hellwig
  0 siblings, 2 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 17:07 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 09:33 -0500, Theodore Tso wrote:
> It made it much, much, MUCH harder to review.  Especially given that
> the documentation was separated from the implementation.  As I looked
> at the implementation, there was no way to look at what it was
> supposed to do without flipping back to a previous e-mail message and
> losing my place.

I will send Build stuff as the last patch next time, thanks, point
taken.

I just used a different concept: one looks at the declaration and the overall
picture becomes clear because _there is_ documentation. One does not
look at the implementation to grasp the surface-level picture.

But your point is fair. I assume _programmers_ look in .c first. Users
may always generate a pdf. I will do what you advise.

Maybe a good compromise would be to have just brief comments in the
headers, and the full specification in the .c files. I will think about this, thanks.

> > This reflects the way of my thinking. I see UBI as a set of units with
> > defined interfaces. So I even physically split the interface description
> > into files. I still think it is easier to grasp the architecture this
> > way.
> 
> Speaking as someone who was coming into it cold, it actually made it
> far more difficult.  Your units were too small, so that meant the
> number of interfaces that were created as a result were huge!  (Around
> 20 _sets_ of interfaces, all of which had to be comprehended for what
> should have been a relatively simple set of functionality!)

Why not? Some stuff can probably be merged. _Specific_ advice is
welcome.

> And when you create that many interfaces, it adds inertia to changing
> the interfaces later on, because it's sometimes not clear how many
> users of the interface there really are.  My general rule of thumb is
> that if an interface only has one user, then it may be a good idea to
> combine it with the user of that interface, and then make the
> functions involved be a static, so that it becomes clear the only user
> of that function is within that one file.  You can take this too far,
> and to extremes it doesn't work all that well, but the UBI layer has
> gone waaaaaay off the deep end in terms of functional decomposition.

Well... I do not want a flame war on this topic. It is about taste,
trade-offs, compromises. It is difficult to provide _objective_ and
decisive arguments here. But I will think about this, point taken, thanks.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 10:50 ` Christoph Hellwig
@ 2007-02-19 17:44   ` Artem Bityutskiy
  2007-02-25  5:55     ` Christoph Hellwig
  0 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-19 17:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 10:50 +0000, Christoph Hellwig wrote:
> I think this is the wrong approach.  For one thing the term unit is
> rather foreign in Linux

I would rather disagree. Subjective. Unit is a generic word, just like
subsystem. "Unit test", for example, is a widespread term that refers to
the internal units of a big system.

> , and second this one header file "unit" is
> definitely too much.

It is .c + .h.

Why definitely, where is the definition? I appreciate your opinion, but
do not want any endless discussion about "how the life has to go",
though.

>   Normally you'd have as little as possible and at
> most four:
> 
>  (1) internal definitions of the driver
>  (2) external kernel interface of the driver if necessary
>  (3) user interface of the driver if necessary
>  (4) ondisk structure if necessary

Where does this rather restrictive model come from? What does it mean in
practice for UBI? Please, be more specific.

> Also please make sure you submit patches at this driver level.  You really
> want one git commit per driver that is totally self-contained.  If necessary
> you'd split this into multiple patches if a single one would be bigger than
> the message size limit.

Please clarify what exactly this means. I do not see how I should
have sent it, sorry. OK, I've separated external headers, JFFS2 support,
build stuff. What next?

And I sent it just for review, as you requested, I assumed that for the
pull we have git, and I specified the URL.

Thanks,
Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 17:07       ` Artem Bityutskiy
@ 2007-02-19 23:34         ` Theodore Tso
  2007-02-20 11:54           ` Artem Bityutskiy
  2007-02-25  5:51         ` Christoph Hellwig
  1 sibling, 1 reply; 129+ messages in thread
From: Theodore Tso @ 2007-02-19 23:34 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, Feb 19, 2007 at 07:07:46PM +0200, Artem Bityutskiy wrote:
> I just used a different concept: one looks at the declaration and the overall
> picture becomes clear because _there is_ documentation. One does not
> look at the implementation to grasp the surface-level picture.
> 
> But your point is fair. I assume _programmers_ look in .c first. Users
> may always generate a pdf. I will do what you advise.

First of all, you are writing kernel code for programmers, not end-users.

Secondly, and this is a big issue, you haven't specified which "units"
are exporting interfaces that will be used outside of UBI, and which
are being used only internally inside UBI.  Interfaces which will be
used outside of UBI have to be carefully controlled.  Interfaces only
used inside UBI are private interfaces which can be easily changed.

Also, if you are going to use this horrible, theoretical, pedantic
Computer Science coding style, at the very least you should have
included a module dependency diagram so it was easy to understand what
the heck was going on.  This would make it easier to make specific
suggestions, because after I looked at it for a while, I decided it
was too hard to figure out the overall architecture, and not worth my
time, and so I gave up on the review.  Not that my opinion is the only
one you need to pay attention to, but if everyone is telling you that
you need to simplify the number of interfaces, you may want to listen
since your code is going to need adequate review if you want to get it
into mainline.

						- Ted


* Re: [PATCH 12/44 take 2] [UBI] allocation unit implementation
  2007-02-19 11:13   ` Pekka Enberg
@ 2007-02-20 11:30     ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 11:30 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 13:13 +0200, Pekka Enberg wrote:
> NAK. Please drop all of these utterly pointless kmalloc() and
> kmem_cache_alloc() wrappers.

Will be fixed.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 23:34         ` Theodore Tso
@ 2007-02-20 11:54           ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 11:54 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 18:34 -0500, Theodore Tso wrote:
> Not that my opinion is the only
> one you need to pay attention to, but if everyone is telling you that
> you need to simplify the number of interfaces, you may want to listen
> since your code is going to need adequate review if you want to get it
> into mainline.

Theodore, thanks for an even more vivid explanation. As I wrote, point
taken. I was probably vague, sorry, but let me say this more bluntly: I
will think about how to change this. At the very least I _will_ get rid of 3-5
units and the situation will be better.

Thank you,
Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 07/44 take 2] [UBI] misc unit header
  2007-02-19 11:00     ` Christoph Hellwig
@ 2007-02-20 12:56       ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 12:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Theodore Tso, Linux Kernel Mailing List, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Mon, 2007-02-19 at 11:00 +0000, Christoph Hellwig wrote:
> > It's not clear the assertion is all that useful, but if you must have
> > it, why not do the check as an inline (with the assertion normally
> > turned off), and then call out to kmemdup()?
> 
> Agreement to all of the above.  And _please_ move all kdoc comments to
> the implementation files.

Will be done.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-19 10:59   ` Christoph Hellwig
@ 2007-02-20 13:00     ` Artem Bityutskiy
  2007-02-23 11:03       ` Artem Bityutskiy
  2007-02-25  5:58       ` Christoph Hellwig
  0 siblings, 2 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 13:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse, Josh Boyer

On Mon, 2007-02-19 at 10:59 +0000, Christoph Hellwig wrote:
> On Sat, Feb 17, 2007 at 06:54:54PM +0200, Artem Bityutskiy wrote:
> > +/* UBI headers must take 64 bytes. The below is a hacky way to ensure this */
> > +static int __ubi_check_ec_hdr_size[(UBI_EC_HDR_SIZE == 64) - 1]
> > +    __attribute__ ((__unused__));
> > +static int __ubi_check_ec_hdr_size[(UBI_VID_HDR_SIZE == 64) - 1]
> > +    __attribute__ ((__unused__));
> 
> please use BUILD_BUG_ON instead.

Will be done, thanks.
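
For illustration, a minimal userspace sketch of such a compile-time size
check. It uses C11 _Static_assert as a stand-in for the kernel's
BUILD_BUG_ON, which is not available outside the kernel tree. The struct
name and every field after vid_hdr_offset are assumptions made up to fill
64 bytes; this is not the actual UBI header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_HDR_SIZE 64

/*
 * Hypothetical 64-byte on-flash header. The first five fields follow the
 * struct quoted in this thread; data_offset, padding2 and hdr_crc are
 * assumptions added only to reach 64 bytes.
 */
struct example_ec_hdr {
	uint32_t magic;
	uint8_t  version;
	uint8_t  padding1[3];
	uint64_t ec;
	uint32_t vid_hdr_offset;
	uint32_t data_offset;
	uint8_t  padding2[36];
	uint32_t hdr_crc;
};

/* The build fails here if the header ever stops being exactly 64 bytes. */
_Static_assert(sizeof(struct example_ec_hdr) == EXAMPLE_HDR_SIZE,
	       "EC header must be exactly 64 bytes");

/* Run-time view of the same fact, handy for a quick sanity check. */
size_t example_ec_hdr_size(void)
{
	return sizeof(struct example_ec_hdr);
}
```

Unlike the negative-array-size trick in the quoted patch, a failed assertion
here names the broken condition in the compiler's error message.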

> > +
> > +static int ubi_attach_mtd_dev(const char *mtd_dev, int vid_hdr_offset,
> > +			      int data_offset);
> > +static void ubi_destroy_dev(int ubi_num);
> 
> Can you reorder the code to avoid all these forward declarations please?

Could you please submit a CodingStyle patch that would contain a
requirement to use the "higher-level functions at the bottom,
lower-layer at top"? Because I just use the opposite.

> > +	/* Attach MTD devices */
> > +	for (i = 0; i < mtd_devs; i++) {
> > +		struct mtd_dev_param *p = &mtd_dev_param[i];
> > +
> > +		cond_resched();
> > +		err = -EINVAL;
> 
> This looks very odd.

What exactly?

> > +module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 000);
> > +MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: "
> > +		      "mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]. "
> > +		      "Multiple \"mtd\" parameters may be specified.\n"
> > +		      "MTD devices may be specified by their number or name. "
> > +		      "Optional \"vid_hdr_offs\" and \"data_offs\" parameters "
> > +		      "specify UBI VID header position and data starting "
> > +		      "position to be used by UBI.\n"
> > +		      "Example: mtd=content,1984,2048 mtd=4 - attach MTD device"
> > +		      "with name content using VID header offset 1984 and data "
> > +		      "start 2048, and MTD device number 4 using default "
> > +		      "offsets");
> 
> This is a very odd parameter interface.  We really don't want drivers to use
> module_param_call directly.  You probably want various module_param_array calls
> instead.

Why not? We tried to avoid this but found out that this is the most
decent interface. Specific advice is welcome.
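
To make the parameter format under discussion concrete, here is a
userspace sketch of parsing a single value of the form
<name|num>[,<vid_hdr_offs>,<data_offs>]. It is not the patch's
ubi_mtd_param_parse(); the struct, the function names and the convention
that 0 means "default offset" are all assumptions for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed "mtd=" value. */
struct example_mtd_param {
	char name[64];     /* MTD device name or number, kept as text */
	int vid_hdr_offs;  /* 0 means "use the default" in this sketch */
	int data_offs;     /* 0 means "use the default" in this sketch */
};

/* Parse one non-negative decimal int that must be followed by `term`. */
static int parse_offset(const char *s, char term, const char **next, int *out)
{
	char *end;
	long v = strtol(s, &end, 10);

	if (end == s || *end != term || v < 0 || v > INT_MAX)
		return -1;
	*out = (int)v;
	*next = end;
	return 0;
}

/* Returns 0 on success, -1 if the value does not match the format. */
int example_parse_mtd_param(const char *val, struct example_mtd_param *p)
{
	const char *comma = strchr(val, ',');
	size_t name_len = comma ? (size_t)(comma - val) : strlen(val);
	const char *cur;

	if (name_len == 0 || name_len >= sizeof(p->name))
		return -1;
	memcpy(p->name, val, name_len);
	p->name[name_len] = '\0';
	p->vid_hdr_offs = 0;
	p->data_offs = 0;

	if (!comma)
		return 0;	/* name or number only: default offsets */

	/* The two offsets are optional, but only as a pair. */
	if (parse_offset(comma + 1, ',', &cur, &p->vid_hdr_offs))
		return -1;
	if (parse_offset(cur + 1, '\0', &cur, &p->data_offs))
		return -1;
	return 0;
}
```

With this sketch, "content,1984,2048" yields the name "content" with
offsets 1984 and 2048, "4" yields the name "4" with default offsets, and
"content,1984" is rejected because only one of the two offsets is given.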

Thank you,
Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-19 10:54   ` Christoph Hellwig
  2007-02-19 12:38     ` Josh Boyer
@ 2007-02-20 13:05     ` Artem Bityutskiy
  2007-02-20 14:55       ` Theodore Tso
  2007-02-25  5:50       ` Christoph Hellwig
  1 sibling, 2 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 13:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

On Mon, 2007-02-19 at 10:54 +0000, Christoph Hellwig wrote:
> On Sat, Feb 17, 2007 at 06:54:49PM +0200, Artem Bityutskiy wrote:
> > +#ifndef __UBI_UBI_H__
> > +#define __UBI_UBI_H__
> > +
> > +#include <linux/mtd/ubi.h>
> > +
> > +/* Version of this UBI implementation */
> > +#define UBI_VERSION 1
> We shouldn't have versions for inkernel interfaces.

What do you mean? It is an internal version just for future use: if we
develop an incompatible UBI2, the old UBI will reject the new images.

> > +/* UBI messages printk level */
> > +#define UBI_MSG_LEVEL  KERN_INFO
> > +#define UBI_WARN_LEVEL KERN_WARNING
> > +#define UBI_ERR_LEVEL  KERN_ERR
> 
> Please use the KERN_ macros directly.

Will be done, thanks.

> > +/* UBI error messages */
> > +#define ubi_err(fmt, ...)                                                   \
> > +	printk(UBI_ERR_LEVEL UBI_ERR_PREF " %s " fmt "\n", __FUNCTION__,    \
> > +	       ##__VA_ARGS__)
> 
> you should probably use dev_printk and the wrappers around it in most
> places.

I will look at it, thanks.

> > +extern int ubis_num;
> > +extern struct ubi_info *ubis[UBI_MAX_INSTANCES];
> 
> Having this kind of global information directly exposed is a very
> bad idea.  In general you only want to access it through more
> specific information and avoid allocating the global array at all.

I do not see what the problem with this is. Please elaborate.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 03/44 take 2] [UBI] user-space API header
  2007-02-17 21:27   ` Arnd Bergmann
@ 2007-02-20 13:07     ` Artem Bityutskiy
  2007-02-20 13:17       ` Arnd Bergmann
  0 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 13:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Sat, 2007-02-17 at 22:27 +0100, Arnd Bergmann wrote:
> On Saturday 17 February 2007 17:54, Artem Bityutskiy wrote:
> > +struct ubi_mkvol_req {
> > +       int32_t vol_id;
> > +       int32_t alignment;
> > +       int64_t bytes;
> > +       int8_t vol_type;
> > +       int8_t padding[9];
> > +       int16_t name_len;
> > +       __user const char *name;
> > +} __attribute__ ((packed));
> 
> This structure is not suitable for an ioctl call, because it has
> incompatible layout between 32 and 64 bit processes. The easiest
> fix for this would be to change the 'name' field to an array
> instead of a pointer.

Will be fixed, thanks. Just out of curiosity, could you please provide an
example of when this may be a problem?

Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 03/44 take 2] [UBI] user-space API header
  2007-02-20 13:07     ` Artem Bityutskiy
@ 2007-02-20 13:17       ` Arnd Bergmann
  0 siblings, 0 replies; 129+ messages in thread
From: Arnd Bergmann @ 2007-02-20 13:17 UTC (permalink / raw)
  To: dedekind
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Tuesday 20 February 2007 14:07, Artem Bityutskiy wrote:
> 
> > This structure is not suitable for an ioctl call, because it has
> > incompatible layout between 32 and 64 bit processes. The easiest
> > fix for this would be to change the 'name' field to an array
> > instead of a pointer.
> 
> Will be fixed, thanks. Just out of curiosity, could you please provide an
> example of when this may be a problem?

On a 64 bit kernel with a 32 bit user app calling this ioctl, the kernel
would read the pointer value from the 8 bytes at the end, which means that
it will read four bytes after the end of the structure and interpret
whatever it finds as a pointer, instead of using only the first four
bytes as the lower half.
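
The mismatch can be sketched in userspace by spelling out the two layouts
with fixed-width integers in place of the pointer. The struct names, the
helper function and the 128-byte name array in the "fixed" variant are
assumptions for illustration, not the layout UBI ended up with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The request as a 32-bit process lays it out: 4-byte pointer. */
struct example_mkvol_req_32 {
	int32_t vol_id;
	int32_t alignment;
	int64_t bytes;
	int8_t vol_type;
	int8_t padding[9];
	int16_t name_len;
	uint32_t name;		/* user pointer, 32 bits wide */
} __attribute__ ((packed));

/* The same request as a 64-bit kernel expects it: 8-byte pointer. */
struct example_mkvol_req_64 {
	int32_t vol_id;
	int32_t alignment;
	int64_t bytes;
	int8_t vol_type;
	int8_t padding[9];
	int16_t name_len;
	uint64_t name;		/* user pointer, 64 bits wide */
} __attribute__ ((packed));

/* Pointer-free variant: identical layout for 32- and 64-bit callers. */
struct example_mkvol_req_fixed {
	int32_t vol_id;
	int32_t alignment;
	int64_t bytes;
	int8_t vol_type;
	int8_t padding[9];
	int16_t name_len;
	char name[128];		/* hypothetical maximum name length */
} __attribute__ ((packed));

/* Bytes the 64-bit side reads beyond what the 32-bit caller provided. */
size_t example_overread_bytes(void)
{
	return sizeof(struct example_mkvol_req_64) -
	       sizeof(struct example_mkvol_req_32);
}
```

Both pointer variants place name at offset 28, but the 64-bit view is four
bytes longer, which is exactly the over-read described above.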

	Arnd <><


* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
                   ` (45 preceding siblings ...)
  2007-02-19 10:50 ` Christoph Hellwig
@ 2007-02-20 14:52 ` John Stoffel
  2007-02-20 17:41   ` Artem Bityutskiy
                     ` (2 more replies)
  46 siblings, 3 replies; 129+ messages in thread
From: John Stoffel @ 2007-02-20 14:52 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer


Artem> This patch-set contains UBI, which stands for Unsorted Block
Artem> Images. This is closely related to the memory technology
Artem> devices Linux subsystem (MTD), so this new piece of software is
Artem> from drivers/mtd/ubi.

Can you define UBI in each and every file you create?  This is a
completely unique acronym and I'm sure a bunch of people will be going
"wtf" when they read this, I know I was.  

Artem> In short, UBI is kind of LVM layer but for flash (MTD)
Artem> devices. It makes it possible to dynamically create, delete and
Artem> re-size volumes. But the analogy is not full. UBI also takes
Artem> care of wear-leveling and bad eraseblocks handling, so UBI
Artem> completely hides 2 aspects of flash chips which make them very
Artem> difficult to work with:

Neat stuff!  Too bad you don't have a more descriptive name for it,
UBI just doesn't seem to have any connection to LVM for Flash plus
wear leveling.  

/grin/

John


* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 13:05     ` Artem Bityutskiy
@ 2007-02-20 14:55       ` Theodore Tso
  2007-02-20 15:15         ` David Woodhouse
                           ` (2 more replies)
  2007-02-25  5:50       ` Christoph Hellwig
  1 sibling, 3 replies; 129+ messages in thread
From: Theodore Tso @ 2007-02-20 14:55 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Tue, Feb 20, 2007 at 03:05:53PM +0200, Artem Bityutskiy wrote:
> On Mon, 2007-02-19 at 10:54 +0000, Christoph Hellwig wrote:
> > On Sat, Feb 17, 2007 at 06:54:49PM +0200, Artem Bityutskiy wrote:
> > > +#ifndef __UBI_UBI_H__
> > > +#define __UBI_UBI_H__
> > > +
> > > +#include <linux/mtd/ubi.h>
> > > +
> > > +/* Version of this UBI implementation */
> > > +#define UBI_VERSION 1
> > We shouldn't have versions for inkernel interfaces.
> 
> What do you mean? It is an internal version just for future use: if we
> develop an incompatible UBI2, the old UBI will reject the new images.

In that case it's not an *implementation* version number, but rather
an on-disk *format* version number.  There's a difference.  It's also
often not used much, since another way of dealing with the problem is
to mark each major on-disk version with a different magic number.
(Many new filesystems these days will also use an 8-byte magic string, such
as "UBI-FS1\n", so that it's a bit easier for users who look at an image
using a hex editor to see what the heck it is, and also since with a longer
magic string there is less likelihood of an accidental collision given
that we don't have a central registry of magic numbers.)

BTW, it's kind of silly to use an enum here:

/*
 * Magic numbers of the UBI headers.
 *
 * @UBI_EC_HDR_MAGIC: erase counter header magic number (ASCII "UBI#")
 * @UBI_VID_HDR_MAGIC: volume identifier header magic number (ASCII "UBI!")
 */
enum {
	UBI_EC_HDR_MAGIC  = 0x55424923,
	UBI_VID_HDR_MAGIC = 0x55424921
};

Why isn't this being done via #define?  It's not like this is any kind
of an enumerated type, especially since it's being installed into a
32bit type, and not even an enum type.

Also while taking another look at your patches, note that your
mix of C99 types and your UBI magic types is a bad idea:

struct ubi_ec_hdr {
	ubi32_t magic;
	uint8_t version;
	uint8_t padding1[3];
	ubi64_t ec; /* Warning: the current limit is 31-bit anyway! */
	ubi32_t vid_hdr_offset;
	...

It appears that the reason why you are doing this is because you think
you need the (packed) attribute.  Not needed; Linux assumes all over
the place 16, 32, and 64 types are packed.  If Linux is ever compiled
on an architecture where this isn't true, the compiler will probably
need to be fixed so these assumptions are true, since all manner of
things will break.

It would be much better to use __be32 and __be64, so you get better
type checking, and you will catch bugs caused by forgetting to use
be32_to_cpu, et al.

BTW, it might also be a good idea to run UBI through sparse to catch bugs.

							- Ted


* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 14:55       ` Theodore Tso
@ 2007-02-20 15:15         ` David Woodhouse
  2007-02-20 15:22           ` Theodore Tso
                             ` (2 more replies)
  2007-02-20 15:21         ` Artem Bityutskiy
  2007-02-20 15:25         ` Artem Bityutskiy
  2 siblings, 3 replies; 129+ messages in thread
From: David Woodhouse @ 2007-02-20 15:15 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Artem Bityutskiy, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Tue, 2007-02-20 at 09:55 -0500, Theodore Tso wrote:
> It appears that the reason why you are doing this is because you think
> you need the (packed) attribute.  Not needed; Linux assumes all over
> the place 16, 32, and 64 types are packed.  If Linux is ever compiled
> on an architecture where this isn't true, the compiler will probably
> need to be fixed so these assumptions are true, since all manner of
> things will break.

No, the packedness is irrelevant -- the reason is just to catch all the
places where you might otherwise forget to use byte-swapping accesses.

> It would be much better to use __be32 and __be64, so you get better
> type checking, and you will catch bugs caused by forgetting to use
> be32_to_cpu, et al.

The technique Artem uses is derived from what I do in JFFS2. It predates
the use of sparse to catch such errors, and works in gcc for _everyone_
without having to do anything special (like run sparse).
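
A minimal sketch of this struct-wrapper technique, written for userspace.
The type and helper names (example_be32_t and friends) are hypothetical
and are not the JFFS2 or UBI definitions; the point is only that a struct
cannot be used in integer arithmetic by accident.

```c
#include <assert.h>
#include <stdint.h>

/* Big-endian 32-bit value; wrapped so it cannot be mistaken for an int. */
typedef struct {
	uint32_t raw;	/* bytes stored most-significant first */
} example_be32_t;

/* Convert a CPU-order value to its big-endian wrapped form. */
static inline example_be32_t cpu_to_example_be32(uint32_t v)
{
	example_be32_t out;
	unsigned char *b = (unsigned char *)&out.raw;

	b[0] = (unsigned char)(v >> 24);
	b[1] = (unsigned char)(v >> 16);
	b[2] = (unsigned char)(v >> 8);
	b[3] = (unsigned char)v;
	return out;
}

/* Convert a wrapped big-endian value back to CPU order. */
static inline uint32_t example_be32_to_cpu(example_be32_t v)
{
	const unsigned char *b = (const unsigned char *)&v.raw;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * With a plain uint32_t field, "hdr.magic == 0x55424923" compiles on any
 * host and silently misbehaves on little-endian ones. With the wrapper,
 * the same expression is a compile error in plain gcc, because a struct
 * cannot be compared with an integer; the conversion helper is required.
 */
```

The cost is that every access goes through a helper, which is the
trade-off against annotation-only types such as __be32.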

-- 
dwmw2



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 14:55       ` Theodore Tso
  2007-02-20 15:15         ` David Woodhouse
@ 2007-02-20 15:21         ` Artem Bityutskiy
  2007-02-25  5:46           ` Christoph Hellwig
  2007-02-20 15:25         ` Artem Bityutskiy
  2 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 15:21 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Tue, 2007-02-20 at 09:55 -0500, Theodore Tso wrote:
> > What do you mean? It is an internal version just for future use: if we
> > develop an incompatible UBI2, the old UBI will reject the new images.
> 
> In that case it's not an *implementation* version number, but rather
> an on-disk *format* version number. 

True, will refine the comment.

>  There's a difference.  It's also
> often not used much, since another way of dealing with the problem is
> to mark each major on-disk version with a different magic number.

The advantage of a version is that UBI can distinguish between garbage and
an incompatible image. If I meet a wrong magic, what is it: rubbish, a
corrupted header? Should I run the recovery procedure? Having a version is
just cleaner.
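
A sketch of the distinction argued for here, assuming a hypothetical
checker function and status enum. The magic value is the one quoted
earlier in the thread; the supported version number and everything else
are assumptions, not UBI's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_EC_HDR_MAGIC   0x55424923u	/* ASCII "UBI#" */
#define EXAMPLE_FORMAT_VERSION 1		/* assumed supported version */

enum example_hdr_status {
	EXAMPLE_HDR_OK,
	EXAMPLE_HDR_BAD_MAGIC,		/* garbage or corruption */
	EXAMPLE_HDR_BAD_VERSION,	/* valid header, unknown format */
};

/*
 * Inputs are assumed to be already converted to CPU byte order.
 * A wrong magic cannot be told apart from corruption, so the caller may
 * want to attempt recovery. A right magic with a wrong version is a
 * well-formed image this code does not understand, so it is refused.
 */
enum example_hdr_status example_check_ec_hdr(uint32_t magic, uint8_t version)
{
	if (magic != EXAMPLE_EC_HDR_MAGIC)
		return EXAMPLE_HDR_BAD_MAGIC;
	if (version != EXAMPLE_FORMAT_VERSION)
		return EXAMPLE_HDR_BAD_VERSION;
	return EXAMPLE_HDR_OK;
}
```

With a per-version magic number instead, the second and third outcomes
collapse into one unless every future magic value is known in advance.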

> Why isn't this being done via #define?  It's not like this is any kind
> of an enumerated type, especially since it's being installed into a
> 32bit type, and not even an enum type.

Will be fixed, thanks.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:15         ` David Woodhouse
@ 2007-02-20 15:22           ` Theodore Tso
  2007-02-20 15:33             ` David Woodhouse
  2007-02-20 15:24           ` Artem Bityutskiy
  2007-02-25  5:43           ` Christoph Hellwig
  2 siblings, 1 reply; 129+ messages in thread
From: Theodore Tso @ 2007-02-20 15:22 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Artem Bityutskiy, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Tue, Feb 20, 2007 at 03:15:55PM +0000, David Woodhouse wrote:
> > It would be much better to use __be32 and __be64, so you get better
> > type checking, and you will catch bugs caused by forgetting to use
> > be32_to_cpu, et al.
> 
> The technique Artem uses is derived from what I do in JFFS2. It predates
> the use of sparse to catch such errors, and works in gcc for _everyone_
> without having to do anything special (like run sparse).

But __be32 will catch the same errors these days because the be/le
types use __bitwise now, right?  So use of the __be32/__be64 types should
be preferred since it also will work with sparse, I would think.

Regards,

						- Ted



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:15         ` David Woodhouse
  2007-02-20 15:22           ` Theodore Tso
@ 2007-02-20 15:24           ` Artem Bityutskiy
  2007-02-25  5:45             ` Christoph Hellwig
  2007-02-25  5:43           ` Christoph Hellwig
  2 siblings, 1 reply; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 15:24 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Theodore Tso, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Tue, 2007-02-20 at 15:15 +0000, David Woodhouse wrote:
> On Tue, 2007-02-20 at 09:55 -0500, Theodore Tso wrote:
> > It appears that the reason why you are doing this is because you think
> > you need the (packed) attribute.  Not needed; Linux assumes all over
> > the place 16, 32, and 64 types are packed.  If Linux is ever compiled
> > on an architecture where this isn't true, the compiler will probably
> > need to be fixed so these assumptions are true, since all manner of
> > things will break.
> 
> No, the packedness is irrelevant -- the reason is just to catch all the
> places where you might otherwise forget to use byte-swapping accesses.

Bear in mind we share this header with user-space, so it is safer to use
packed as well as C99 types.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 14:55       ` Theodore Tso
  2007-02-20 15:15         ` David Woodhouse
  2007-02-20 15:21         ` Artem Bityutskiy
@ 2007-02-20 15:25         ` Artem Bityutskiy
  2 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 15:25 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Tue, 2007-02-20 at 09:55 -0500, Theodore Tso wrote:
> BTW, it might also be a good idea to run UBI through sparse to catch bugs.

Ran it before, will do before the next submission iteration.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:22           ` Theodore Tso
@ 2007-02-20 15:33             ` David Woodhouse
  2007-02-20 16:12               ` Theodore Tso
  2007-02-25 10:42               ` Pavel Machek
  0 siblings, 2 replies; 129+ messages in thread
From: David Woodhouse @ 2007-02-20 15:33 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Artem Bityutskiy, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Tue, 2007-02-20 at 10:22 -0500, Theodore Tso wrote:
> But __be32 will catch the same errors these days because the be/le
> types use __bitwise now, right?  So use of the __be32/__be64 types should
> be preferred since it also will work with sparse, I would think.

Does __bitwise work in gcc? I thought it was only for sparse?

-- 
dwmw2



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:33             ` David Woodhouse
@ 2007-02-20 16:12               ` Theodore Tso
  2007-02-20 16:47                 ` David Woodhouse
  2007-02-25 10:42               ` Pavel Machek
  1 sibling, 1 reply; 129+ messages in thread
From: Theodore Tso @ 2007-02-20 16:12 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Artem Bityutskiy, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Tue, Feb 20, 2007 at 03:33:18PM +0000, David Woodhouse wrote:
> On Tue, 2007-02-20 at 10:22 -0500, Theodore Tso wrote:
> > But __be32 will catch the same errors these days because the be/le
> > types use __bitwise now, right?  So use of the __be32/__be64 types should
> > be preferred since it also will work with sparse, I would think.
> 
> Does __bitwise work in gcc? I thought it was only for sparse?

Sorry, you're right and I'm wrong.  I thought it was a gcc attribute
as well, but it looks like it isn't.  

						- Ted



* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 16:12               ` Theodore Tso
@ 2007-02-20 16:47                 ` David Woodhouse
  0 siblings, 0 replies; 129+ messages in thread
From: David Woodhouse @ 2007-02-20 16:47 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Artem Bityutskiy, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Tue, 2007-02-20 at 11:12 -0500, Theodore Tso wrote:
>  I thought it was a gcc attribute as well, but it looks like it
> isn't.   

Indeed, which is why I've never really been tempted to switch JFFS2 to
__[bl]e32 rather than the structures it currently uses. Sparse is all
very nice, but it's no substitute for just making the real compiler
barf when you forget to byteswap.
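
The structure technique mentioned above can be illustrated roughly as
follows. This is only a userspace sketch under stated assumptions, not
the JFFS2 or UBI code: the names be32_wrapped, cpu_to_wrapped_be32 and
wrapped_be32_to_cpu are hypothetical. The point is that a one-member
struct is a distinct type, so plain gcc rejects a direct assignment to
or from an integer.

```c
#include <stdint.h>

/* Hypothetical wrapper type: holds a 32-bit value in big-endian order. */
typedef struct {
	uint32_t raw;
} be32_wrapped;

/* Store a CPU-order value as big-endian bytes inside the wrapper. */
static inline be32_wrapped cpu_to_wrapped_be32(uint32_t v)
{
	be32_wrapped out;
	uint8_t *p = (uint8_t *)&out.raw;

	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
	return out;
}

/* Read the big-endian bytes back into a CPU-order value. */
static inline uint32_t wrapped_be32_to_cpu(be32_wrapped v)
{
	const uint8_t *p = (const uint8_t *)&v.raw;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Given "be32_wrapped field;", the statement "uint32_t n = field;" does
 * not compile, because a struct cannot be assigned to an integer.  The
 * caller is forced to write "uint32_t n = wrapped_be32_to_cpu(field);".
 */
```

The cost is that every access has to go through a helper, which is the
clumsiness objected to elsewhere in this thread.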

-- 
dwmw2


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-20 14:52 ` John Stoffel
@ 2007-02-20 17:41   ` Artem Bityutskiy
  2007-02-20 17:44   ` Josh Boyer
  2007-02-25  5:48   ` Christoph Hellwig
  2 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-20 17:41 UTC (permalink / raw)
  To: John Stoffel
  Cc: Linux Kernel Mailing List, Christoph Hellwig, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

John,

On Tue, 2007-02-20 at 09:52 -0500, John Stoffel wrote:
> Can you define UBI in each and every file you create?  This is a
> completely unique acronym and I'm sure a bunch of people will be going
> "wtf" when they read this, I know I was.

Do you mean adding something like "This file is a part of the
Unsorted Block Images (UBI) subsystem" to the header of each file?

> Neat stuff!  Too bad you don't have a more descriptive name for it,
> UBI just doesn't seem to have any connection to LVM for Flash plus
> wear leveling.

Well, this name was born inside IBM; the guys in CC may comment on
this. Feel free to suggest a better name, although renaming this is
something I really do not want to do.

And it indeed has no connection to LVM, I just used the analogy to
quickly convey what this is about.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-20 14:52 ` John Stoffel
  2007-02-20 17:41   ` Artem Bityutskiy
@ 2007-02-20 17:44   ` Josh Boyer
  2007-02-25  5:48   ` Christoph Hellwig
  2 siblings, 0 replies; 129+ messages in thread
From: Josh Boyer @ 2007-02-20 17:44 UTC (permalink / raw)
  To: John Stoffel
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse

On Tue, 2007-02-20 at 09:52 -0500, John Stoffel wrote:
> Artem> This patch-set contains UBI, which stands for Unsorted Block
> Artem> Images. This is closely related to the memory technology
> Artem> devices Linux subsystem (MTD), so this new piece of software is
> Artem> from drivers/mtd/ubi.
> 
> Can you define UBI in each and every file you create?  This is a

That is pretty pointless work.  It's expanded in the Kconfig menu, which
people need to select before using it anyway.

> Artem> In short, UBI is kind of LVM layer but for flash (MTD)
> Artem> devices. It makes it possible to dynamically create, delete and
> Artem> re-size volumes. But the analogy is not full. UBI also takes
> Artem> care of wear-leveling and bad eraseblocks handling, so UBI
> Artem> completely hides 2 aspects of flash chips which make them very
> Artem> difficult to work with:
> 
> Neat stuff!  Too bad you don't have a more descriptive name for it,
> UBI just doesn't seem to have any connection to LVM for Flash plus
> wear leveling.  

Given that it's contained under the MTD tree, it should be fairly
obvious it's flash related.  And the Kconfig help text for UBI
specifically mentions its purpose and analogy to LVM.

josh


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-20 13:00     ` Artem Bityutskiy
@ 2007-02-23 11:03       ` Artem Bityutskiy
  2007-02-25  5:58       ` Christoph Hellwig
  1 sibling, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-23 11:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse, Josh Boyer

On Tue, 2007-02-20 at 15:00 +0200, Artem Bityutskiy wrote:
> > This looks very odd.
> 
> What exactly?

I see, will be fixed, thanks.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:15         ` David Woodhouse
  2007-02-20 15:22           ` Theodore Tso
  2007-02-20 15:24           ` Artem Bityutskiy
@ 2007-02-25  5:43           ` Christoph Hellwig
  2007-02-25  6:04             ` David Woodhouse
  2 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:43 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Theodore Tso, Artem Bityutskiy, Christoph Hellwig,
	Linux Kernel Mailing List, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner

On Tue, Feb 20, 2007 at 03:15:55PM +0000, David Woodhouse wrote:
> > It would be much better to use __be32 and __be64, so you get better
> > type checking, and you will catch bugs caused by forgetting to use
> > be32_to_cpu, et. al. 
> 
> The technique Artem uses is derived from what I do in JFFS2. It predates
> the use of sparse to catch such errors, and works in gcc for _everyone_
> without having to do anything special (like run sparse).

And makes the code clumsy and pointlessly different from all other code
we have.  While no one will force you to convert your legacy code at
this point, we really shouldn't add more non-standard code.  If you
really want warnings from gcc directly I doubt __attribute__((bitwise))
would be hard to implement for it.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:24           ` Artem Bityutskiy
@ 2007-02-25  5:45             ` Christoph Hellwig
  2007-02-26 10:28               ` Artem Bityutskiy
  0 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:45 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: David Woodhouse, Theodore Tso, Christoph Hellwig,
	Linux Kernel Mailing List, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner

On Tue, Feb 20, 2007 at 05:24:15PM +0200, Artem Bityutskiy wrote:
> On Tue, 2007-02-20 at 15:15 +0000, David Woodhouse wrote:
> > On Tue, 2007-02-20 at 09:55 -0500, Theodore Tso wrote:
> > > It appears that the reason why you are doing this is because you think
> > > you need the (packed) attribute.  Not needed; Linux assumes all over
> > > the place 16, 32, and 64 types are packed.  If Linux is ever compiled
> > > on an architecture where this isn't true, the compiler will probably
> > > need to be fixed so these assumptions are true, since all manner of
> > > things will break.
> > 
> > No, the packedness is irrelevant -- the reason is just to catch all the
> > places where you might otherwise forget to use byte-swapping accesses.
> 
> Bear in mind we share this header with user-space, so it is safer to use
> packed as well as C99 types.

It's definitely not safe for userspace - packed is not an ISO C thing
and there's no guarantee userspace compilers understand it.   Also you
really don't want to use packed in new code; if you really need oddly
aligned types it's much better to use byte arrays.
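
A minimal sketch of the byte-array approach, assuming a made-up header
layout (struct example_hdr and the get/put helpers are hypothetical, not
the UBI on-flash format). Every field is a uint8_t array, so members
need no alignment padding between them on the usual ABIs, and the byte
order is fixed by the accessors rather than by the compiler.

```c
#include <stdint.h>

/* Hypothetical on-flash header built only from byte arrays. */
struct example_hdr {
	uint8_t magic[4];
	uint8_t version[1];
	uint8_t ec[8];		/* 64-bit counter, stored big-endian */
};

/* Read a big-endian 64-bit value from an arbitrarily aligned buffer. */
static inline uint64_t get_be64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Write a 64-bit value to the buffer in big-endian byte order. */
static inline void put_be64(uint8_t *p, uint64_t v)
{
	int i;

	for (i = 7; i >= 0; i--) {
		p[i] = (uint8_t)v;
		v >>= 8;
	}
}
```

Whether this is nicer than a packed structure is exactly the point under
dispute here; the sketch only shows what the alternative looks like.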

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:21         ` Artem Bityutskiy
@ 2007-02-25  5:46           ` Christoph Hellwig
  0 siblings, 0 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:46 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Theodore Tso, Christoph Hellwig, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner, David Woodhouse

On Tue, Feb 20, 2007 at 05:21:48PM +0200, Artem Bityutskiy wrote:
> > In that case it's not an *implementation* version number, but rather
> > an on-disk *format* version number. 
> 
> True, will refine the comment.
> 
> >  There's a difference.  It's also
> > often not used much, since another way of dealing with the problem is
> > to mark major each on-disk version with a different magic number.
> 
> The advantage of version is that UBI can distinguish between garbage and
> incompatible image. If I meet wrong magic - what is it - rubbish,
> corrupted header? Should I run recovery procedure? Having version is
> just cleaner.

Agreed.  
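
To make the distinction concrete, here is a small sketch of how a
separate version field lets a reader tell garbage apart from an
incompatible image. The constants and the check_hdr() helper are
assumptions made up for illustration; they are not the UBI values.

```c
#include <stdint.h>

/* Hypothetical values, chosen only for this example. */
#define EXAMPLE_MAGIC	0x4558414Du
#define EXAMPLE_VERSION	1

enum hdr_status {
	HDR_OK,			/* our format, and a version we understand */
	HDR_GARBAGE,		/* wrong magic: rubbish or a corrupted header */
	HDR_BAD_VERSION,	/* right magic, but an incompatible format */
};

/* Classify a header by magic first, then by on-disk format version. */
static enum hdr_status check_hdr(uint32_t magic, uint8_t version)
{
	if (magic != EXAMPLE_MAGIC)
		return HDR_GARBAGE;
	if (version != EXAMPLE_VERSION)
		return HDR_BAD_VERSION;
	return HDR_OK;
}
```

With one magic number per format version, the last two cases would be
indistinguishable to code that only knows its own magic.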

> > Why isn't this being done via #define?  It's not like this is any kind
> > of an enumerated type, especially since it's being installed into a
> > 32bit type, and not even an enum type.
> 
> Will be fixed, thanks.

We currently try to use enums for most constants, then again this really
doesn't matter all that much.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-20 14:52 ` John Stoffel
  2007-02-20 17:41   ` Artem Bityutskiy
  2007-02-20 17:44   ` Josh Boyer
@ 2007-02-25  5:48   ` Christoph Hellwig
  2 siblings, 0 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:48 UTC (permalink / raw)
  To: John Stoffel
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse, Josh Boyer

On Tue, Feb 20, 2007 at 09:52:46AM -0500, John Stoffel wrote:
> 
> Artem> This patch-set contains UBI, which stands for Unsorted Block
> Artem> Images. This is closely related to the memory technology
> Artem> devices Linux subsystem (MTD), so this new piece of software is
> Artem> from drivers/mtd/ubi.
> 
> Can you define UBI in each and every file you create?  This is a
> completely unique acronym and I'm sure a bunch of people will be going
> "wtf" when they read this, I know I was.  

We're not in kindergarten.  An explanation in Kconfig or the main
header file should be more than enough.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 13:05     ` Artem Bityutskiy
  2007-02-20 14:55       ` Theodore Tso
@ 2007-02-25  5:50       ` Christoph Hellwig
  2007-02-25 11:55         ` Theodore Tso
  2007-02-26 10:09         ` Artem Bityutskiy
  1 sibling, 2 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:50 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Frank Haverkamp,
	Josh Boyer, Thomas Gleixner, David Woodhouse

On Tue, Feb 20, 2007 at 03:05:53PM +0200, Artem Bityutskiy wrote:
> > Having this kind of global information directly exposed is a very
> > bad idea.  In general you only want to access it through more
> > specific information and avoid allocating the global array at all.
> 
> I do not see what is the problem with this. Please, refine.

Global variables are very bad for code maintainability and understandability.
So we usually try to avoid them if possible, or else make them static to a
single file and provide proper accessors for the allowed actions on them.
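
As a rough sketch of that pattern (all names here are hypothetical, and
locking is deliberately left out, so this is not safe for concurrent
callers as written):

```c
#include <stddef.h>

#define EXAMPLE_MAX_DEVICES 4

struct example_dev {
	int mtd_num;
};

/* File-local state: nothing outside this file can touch it directly. */
static struct example_dev *devices[EXAMPLE_MAX_DEVICES];
static int dev_count;

/* Number of currently attached devices. */
int example_dev_count(void)
{
	return dev_count;
}

/* Attach a device; returns its index, or -1 if the table is full. */
int example_dev_attach(struct example_dev *dev)
{
	if (!dev || dev_count >= EXAMPLE_MAX_DEVICES)
		return -1;
	devices[dev_count] = dev;
	return dev_count++;
}

/* Look up a device by index; returns NULL for an invalid index. */
struct example_dev *example_dev_get(int idx)
{
	if (idx < 0 || idx >= dev_count)
		return NULL;
	return devices[idx];
}
```

The accessors are the only place where the array and the counter are
modified, so adding a lock later would be confined to this one file.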

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 17:07       ` Artem Bityutskiy
  2007-02-19 23:34         ` Theodore Tso
@ 2007-02-25  5:51         ` Christoph Hellwig
  2007-02-26 10:11           ` Artem Bityutskiy
  1 sibling, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:51 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Theodore Tso, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, Feb 19, 2007 at 07:07:46PM +0200, Artem Bityutskiy wrote:
> > And when you create that many interfaces, it adds inertia to changing
> > the interfaces later on, because it's sometimes not clear how many
> > users of the interface there really are.  My general rule of thumb is
> > that if an interface only has one user, then it may be a good idea to
> > combine it with the user of that interface, and then make the
> > functions involved be a static, so that it becomes clear the only user
> > of that functoin is within that one file.  You can take this too far,
> > and to extremes it doesn't work all that well, but the UBI layer has
> > gone waaaaaay off the deep end in terms of functional decomposition.
> 
> Well... I do not want any flame on this topic. It is about taste,
> trade-offs, compromises. It is difficult to provide _objective_ and
> killing arguments here. But I will think on this, point taken, thanks.

Coding style is and always will be about tastes, but it usually helps a lot
if you try to understand and follow the style of the project you want
to contribute to :)

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-19 17:44   ` Artem Bityutskiy
@ 2007-02-25  5:55     ` Christoph Hellwig
  0 siblings, 0 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:55 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Mon, Feb 19, 2007 at 07:44:16PM +0200, Artem Bityutskiy wrote:
> On Mon, 2007-02-19 at 10:50 +0000, Christoph Hellwig wrote:
> > I think this is the wrong approach.  For one thing the unit terms is
> > rather foregin in Linux
> 
> I would rather disagree. Subjective. Unit is a generic word, just like
> subsystem. Unit-tests for example is a widespread word it refer to
> internal units of a big system.

What I meant is that we currently do not use the 'unit' term for code
modules in the linux kernel at all.  

> Please, refine what does this exactly mean. I do not see how I should
> have sent it, sorry. OK, I've separated external headers, JFFS2 support,
> build stuff. What next?

You should not separate more but less.  Generally the review-"unit"
should be a functionally separate kernel module, not microscopic pieces
of it.  Unfortunately, mailing list limitations mean we sometimes
have to split things up a little further, but it's usually really hard
to do useful splits at this boundary.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-20 13:00     ` Artem Bityutskiy
  2007-02-23 11:03       ` Artem Bityutskiy
@ 2007-02-25  5:58       ` Christoph Hellwig
  2007-02-25 22:03         ` Rusty Russell
                           ` (2 more replies)
  1 sibling, 3 replies; 129+ messages in thread
From: Christoph Hellwig @ 2007-02-25  5:58 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Christoph Hellwig, Linux Kernel Mailing List, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer, rusty

On Tue, Feb 20, 2007 at 03:00:56PM +0200, Artem Bityutskiy wrote:
> > > +module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 000);
> > > +MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: "
> > > +		      "mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]. "
> > > +		      "Multiple \"mtd\" parameters may be specified.\n"
> > > +		      "MTD devices may be specified by their number or name. "
> > > +		      "Optional \"vid_hdr_offs\" and \"data_offs\" parameters "
> > > +		      "specify UBI VID header position and data starting "
> > > +		      "position to be used by UBI.\n"
> > > +		      "Example: mtd=content,1984,2048 mtd=4 - attach MTD device"
> > > +		      "with name content using VID header offset 1984 and data "
> > > +		      "start 2048, and MTD device number 4 using default "
> > > +		      "offsets");
> > 
> > This is a very odd paramater interface.  We really don't want drivers to use
> > module_param_call directly.  You probably want various module_param_array calls
> > instead.
> 
> Why not? We tried to avoid this but found out that this is the most
> decent interface. Specific advises are welcome.

Because this type of compound interface is really painful for the user.
The module.param=foo syntax makes sure parameters can be used without
endless documentation for each and every one of them, and makes
sure module writers don't introduce bugs in their own parser reimplementations.

Rusty, was it intentional that drivers can use __module_param_call?
Do you think the ubi use here is okay?

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-25  5:43           ` Christoph Hellwig
@ 2007-02-25  6:04             ` David Woodhouse
  0 siblings, 0 replies; 129+ messages in thread
From: David Woodhouse @ 2007-02-25  6:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Theodore Tso, Artem Bityutskiy, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Sun, 2007-02-25 at 05:43 +0000, Christoph Hellwig wrote:
> > The technique Artem uses is derived from what I do in JFFS2. It predates
> > the use of sparse to catch such errors, and works in gcc for _everyone_
> > without having to do anything special (like run sparse).
> 
> And makes the code clumsy and pointlessly different from all other code
> we have. 

I would suggest that it's not any more clumsy; the implementation is
fairly obvious and the _use_ of it is just the same. But I won't bother,
because it's largely pointless to compare something functional with
something non-functional.

>  If you really want warnings from gcc directly I doubt
>  __attribute__((bitwise)) would be hard to implement for it. 

I have warnings from GCC already, thank you very much. I see no
particular reason to hack on the compiler to give me new ways of
achieving what I've already had for years. But if someone else were to
do so I'd have no particular objection to switching over to using it.
Let's see it merged first though, before you suggest that people should
use it instead of what actually _works_ today.

-- 
dwmw2


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-20 15:33             ` David Woodhouse
  2007-02-20 16:12               ` Theodore Tso
@ 2007-02-25 10:42               ` Pavel Machek
  1 sibling, 0 replies; 129+ messages in thread
From: Pavel Machek @ 2007-02-25 10:42 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Theodore Tso, Artem Bityutskiy, Christoph Hellwig,
	Linux Kernel Mailing List, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner

On Tue 2007-02-20 15:33:18, David Woodhouse wrote:
> On Tue, 2007-02-20 at 10:22 -0500, Theodore Tso wrote:
> > But __be32 will catch the same errors these days because the be/le
> > types use __bitwise now, right?  So use of the __be32/__be64 types should
> > be preferred since it also will work with sparse, I would think.
> 
> Does __bitwise work in gcc? I thought it was only for sparse?

Hmm... if you have a superior endian-debugging infrastructure, perhaps
you should just move the rest of the kernel to it? Having two
endian-debugging infrastructures is ugly...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-25  5:50       ` Christoph Hellwig
@ 2007-02-25 11:55         ` Theodore Tso
  2007-02-26 10:09         ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Theodore Tso @ 2007-02-25 11:55 UTC (permalink / raw)
  To: Christoph Hellwig, Artem Bityutskiy, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner, David Woodhouse

On Sun, Feb 25, 2007 at 05:50:11AM +0000, Christoph Hellwig wrote:
> global variables are very bad for code maintainability and understanadbility.
> So we usually try to avoid them if possible and make them static to a
> single file and provide proper accessors for allowed actions on them.

Another reason why mincing up code files into little-tiny itsy-bitsy
"units" to satisfy some vague abstraction fetish learned in CS101 is a
bad idea.  :-)

Seriously, abstractions have to be at the right layer in order for
them to add value, and too many abstractions (such as the time I
looked at a C++ toolkit with dozens of abstractions layers and
hundreds of itsy-bitsy files all adding a tiny amount of functionality
on top of each other) can be actively harmful.  If you group related
code together in a single C file, then it's possible to use static
per-file variables instead of global ones much more often.

In addition, another problem with global variables is that they
destroy the ability to have multiple instances of the code referencing
them running on different processors at the same time (unless they are
properly locked, adding a contention bottleneck).  So before ever
adding a global variable it is always important to ask how it is
protected from a locking point of view, and whether it could be linked
off some structure instead.

						- Ted

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-25  5:58       ` Christoph Hellwig
@ 2007-02-25 22:03         ` Rusty Russell
  2007-03-05 13:28           ` Frank Haverkamp
  2007-02-26 11:54         ` Artem Bityutskiy
  2007-05-17 14:44         ` Christoph Hellwig
  2 siblings, 1 reply; 129+ messages in thread
From: Rusty Russell @ 2007-02-25 22:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sun, 2007-02-25 at 05:58 +0000, Christoph Hellwig wrote:
> On Tue, Feb 20, 2007 at 03:00:56PM +0200, Artem Bityutskiy wrote:
> > > > +module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 000);
> > > > +MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: "
> > > > +		      "mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]. "
> > > > +		      "Multiple \"mtd\" parameters may be specified.\n"
> > > > +		      "MTD devices may be specified by their number or name. "
> > > > +		      "Optional \"vid_hdr_offs\" and \"data_offs\" parameters "
> > > > +		      "specify UBI VID header position and data starting "
> > > > +		      "position to be used by UBI.\n"
> > > > +		      "Example: mtd=content,1984,2048 mtd=4 - attach MTD device"
> > > > +		      "with name content using VID header offset 1984 and data "
> > > > +		      "start 2048, and MTD device number 4 using default "
> > > > +		      "offsets");
> > > 
> > > This is a very odd paramater interface.  We really don't want drivers to use
> > > module_param_call directly.  You probably want various module_param_array calls
> > > instead.
> > 
> > Why not? We tried to avoid this but found out that this is the most
> > decent interface. Specific advises are welcome.
> 
> because this type of compount interface is really painful for the user.
> the module.param=foo syntax makes sure paramaters can be used without
> endless documentation for each and every single of them, and makes
> sure module writers don't introduce bugs in their own parser reimplementations.
> 
> Rusty, was it intentional that drivers can use __module_param_call?
> Do you think the ubi use here is okay?

The reason drivers can use __module_param_call is that they can
implement their own "types" for module parameters, which will end up
requiring this.

Using it directly is only really for backwards compatibility (which is
important!), but for new parameters, this multi-part self-parsing is
nasty.  The standard (but admittedly suboptimal) way to do this is to
have three parameters: module_param_array(name, ...),
module_param_array(header_offset, ...), and
module_param_array(data_start, ...).

Hope that helps,
Rusty.




^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-25  5:50       ` Christoph Hellwig
  2007-02-25 11:55         ` Theodore Tso
@ 2007-02-26 10:09         ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-26 10:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Josh Boyer,
	Thomas Gleixner, David Woodhouse

On Sun, 2007-02-25 at 05:50 +0000, Christoph Hellwig wrote:
> > I do not see what is the problem with this. Please, refine.
> 
> global variables are very bad for code maintainability and understanadbility.
> So we usually try to avoid them if possible and make them static to a
> single file and provide proper accessors for allowed actions on them.

I will think about this, but at first glance it looks like it will
be too much to provide distinct functions to access the current number
of attached UBI devices. I can make it inline, but then exporting the
variable will be needed anyway.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 00/44 take 2] [UBI] Unsorted Block Images
  2007-02-25  5:51         ` Christoph Hellwig
@ 2007-02-26 10:11           ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-26 10:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Theodore Tso, Linux Kernel Mailing List, Frank Haverkamp,
	Thomas Gleixner, David Woodhouse, Josh Boyer

On Sun, 2007-02-25 at 05:51 +0000, Christoph Hellwig wrote:
> > Well... I do not want any flame on this topic. It is about taste,
> > trade-offs, compromises. It is difficult to provide _objective_ and
> > killing arguments here. But I will think on this, point taken, thanks.
> 
> Codingstyle is and always will be about tastes, but it usually helps a lot
> if you try to understand and follow the style of the project you want
> to contribute to :)

Will be done at the next submit iteration.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 05/44 take 2] [UBI] internal common header
  2007-02-25  5:45             ` Christoph Hellwig
@ 2007-02-26 10:28               ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-26 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: David Woodhouse, Theodore Tso, Linux Kernel Mailing List,
	Frank Haverkamp, Josh Boyer, Thomas Gleixner

On Sun, 2007-02-25 at 05:45 +0000, Christoph Hellwig wrote:
> It's definitively not safe for userspace - packed is not an ISO C thing
> and there's no guarantee userspace compilers understand it.   Also you
> really don't want to use packed in new code, if you really need oddly
> aligned types it's much better to use byte arrays.

Let me express my unsorted points:

1. It is certainly OK in kernel, if it is not OK in user-space, 'make
headers_install' should remove the __attribute__ stuff.

2. I failed to find the __attribute__ stuff in ISO C99, so it is not
standard. _BUT_ there are many non-C99 assumptions in the kernel anyway.
So "non ISO C" is a weak argument.

3. 
[dedekind@mordor ubi-2.6.git]$ make headers_install
 .... [snip]
[dedekind@mordor include]$ grep -r packed * | wc -l
117

This is the _fact_ of life.

4. Nonetheless, I do not mind at all removing the "packed" stuff if you
can guarantee that all these compilers on all platforms will lay the
structures out as packed. Otherwise we should play it safe instead.

Artem.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-25  5:58       ` Christoph Hellwig
  2007-02-25 22:03         ` Rusty Russell
@ 2007-02-26 11:54         ` Artem Bityutskiy
  2007-05-17 14:44         ` Christoph Hellwig
  2 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-26 11:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux Kernel Mailing List, Frank Haverkamp, Thomas Gleixner,
	David Woodhouse, Josh Boyer, rusty

On Sun, 2007-02-25 at 05:58 +0000, Christoph Hellwig wrote:
> because this type of compount interface is really painful for the user.
> the module.param=foo syntax makes sure paramaters can be used without
> endless documentation for each and every single of them, and makes
> sure module writers don't introduce bugs in their own parser reimplementations.

I maintain that without the use of this function the interface is much
worse. I will be grateful if you can suggest something better.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 04/44 take 2] [UBI] kernel-spce API header
  2007-02-18  1:32   ` Greg KH
  2007-02-18  2:08     ` Josh Boyer
@ 2007-02-26 12:12     ` Artem Bityutskiy
  1 sibling, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-02-26 12:12 UTC (permalink / raw)
  To: Greg KH
  Cc: Artem Bityutskiy, Linux Kernel Mailing List, Christoph Hellwig,
	Frank Haverkamp, Thomas Gleixner, David Woodhouse, Josh Boyer

Greg KH wrote:
> Are you sure this is the proper license for new kernel code coming from
> IBM these days?  You might want to go verify that the "or any later
> version" is allowed right now...

Hi Greg,

you advised me to use the "class" sysfs category which I did. But I've 
read at LWN that there is a long-term plan to get rid of this category 
altogether. So may I ask you the same question I asked some time ago.

UBI works on top of MTD devices. An MTD device may perfectly well support 
the device/driver/bus model. But this model does not really fit UBI 
because we have no bus.

Please advise me: should I keep using the "class" category, which seems 
fine to me, or should I use the device/driver/bus stuff? What do I do 
with the bus then? I tried this some time ago and things did not work with 
a NULL bus.

Thanks,
Artem.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-25 22:03         ` Rusty Russell
@ 2007-03-05 13:28           ` Frank Haverkamp
  0 siblings, 0 replies; 129+ messages in thread
From: Frank Haverkamp @ 2007-03-05 13:28 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Christoph Hellwig, Artem Bityutskiy, Linux Kernel Mailing List,
	Thomas Gleixner, David Woodhouse, Josh Boyer, Frank Haverkamp

Hi Rusty,

On Mon, 2007-02-26 at 09:03 +1100, Rusty Russell wrote:
> On Sun, 2007-02-25 at 05:58 +0000, Christoph Hellwig wrote:
> > On Tue, Feb 20, 2007 at 03:00:56PM +0200, Artem Bityutskiy wrote:
> > > > > +module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 000);
> > > > > +MODULE_PARM_DESC(mtd, "MTD devices to attach. Parameter format: "
> > > > > +		      "mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]. "
> > > > > +		      "Multiple \"mtd\" parameters may be specified.\n"
> > > > > +		      "MTD devices may be specified by their number or name. "
> > > > > +		      "Optional \"vid_hdr_offs\" and \"data_offs\" parameters "
> > > > > +		      "specify UBI VID header position and data starting "
> > > > > +		      "position to be used by UBI.\n"
> > > > > +		      "Example: mtd=content,1984,2048 mtd=4 - attach MTD device"
> > > > > +		      "with name content using VID header offset 1984 and data "
> > > > > +		      "start 2048, and MTD device number 4 using default "
> > > > > +		      "offsets");
> > > > 

> The reason drivers can use __module_param_call is that they can
> implement their own "types" for module parameters, which will end up
> requiring this.
> 
> Using it directly is only really for backwards compatibility (which is
> important!), but for new parameters, this multi-part self-parsing is
> nasty.  Standard (but admittedly suboptimal) way to do this is having
> three parameters module_param_array(name, ...),
> module_param_array(header_offset, ...),
> module_param_array(data_start, ...).

we wanted to be able to omit some of the parameters and let UBI use
reasonable defaults if needed.

Example where we wanted to explicitly overwrite the offset parameters:
  mtd=content,1984,2048

Example where we did not want to, but instead let UBI figure out
reasonable parameters:
  mtd=4

With the three arrays we did not see a nice way of achieving this.

  ubimtds=content,4,7
  hdroffs=1984,?,1984
  dataoffs=2048,?,2048

The ? placeholders look odd, don't they?
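
To illustrate the three-array layout above, here is a minimal user-space
sketch in plain C, not kernel code and not part of the UBI patches. The
names split_list, offs_item, zip_params, struct dev, MAX_DEVS and
OFFS_DEFAULT are hypothetical, as is the choice of treating "?" as "use
the default". It shows that the three lists have to be kept in step by
position, and that a length mismatch can only be detected after all three
have been split.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEVS 8
#define OFFS_DEFAULT (-1)	/* hypothetical marker: let the driver choose */

struct dev {
	const char *name;	/* MTD name or number, kept as text */
	int vid_hdr_offs;
	int data_offs;
};

/* Split a comma-separated list in place; return item count, -1 if too many. */
static int split_list(char *list, char *items[], int max)
{
	int n = 0;
	char *p = list;

	for (;;) {
		char *comma;

		if (n == max)
			return -1;
		items[n++] = p;
		comma = strchr(p, ',');
		if (!comma)
			break;
		*comma = '\0';
		p = comma + 1;
	}
	return n;
}

/* "?" stands for "no value given"; anything else is read with atoi(). */
static int offs_item(const char *s)
{
	return strcmp(s, "?") == 0 ? OFFS_DEFAULT : atoi(s);
}

/* Combine three parallel lists by position; return device count or -1. */
static int zip_params(char *names, char *hdroffs, char *dataoffs,
		      struct dev devs[MAX_DEVS])
{
	char *n[MAX_DEVS], *h[MAX_DEVS], *d[MAX_DEVS];
	int nn, nh, nd, i;

	assert(names && hdroffs && dataoffs && devs);
	nn = split_list(names, n, MAX_DEVS);
	nh = split_list(hdroffs, h, MAX_DEVS);
	nd = split_list(dataoffs, d, MAX_DEVS);
	if (nn < 0 || nn != nh || nn != nd)
		return -1;	/* lists out of step: nothing sensible to do */

	for (i = 0; i < nn; i++) {
		devs[i].name = n[i];
		devs[i].vid_hdr_offs = offs_item(h[i]);
		devs[i].data_offs = offs_item(d[i]);
	}
	return nn;
}
```

The input buffers are modified in place and must outlive the devs array,
since each name points into the names buffer.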

We also looked at the slram parameters which are like this:

  slram=name0,addr0,+size0,name1,addr1,+size1

and found that if we wanted to add one more parameter, e.g. width, it
would likely cause an interface change:

  slram=name0,addr0,+size0,width0,name1,addr1,+size1,width1

Finally we found that we liked the phram parameter parsing (which is
essentially the same) and did ours similarly.
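
For reference, the compound format mtd=<name|num>[,<vid_hdr_offs>,<data_offs>]
can be sketched in user-space C as follows. This is only an illustration
under stated assumptions, not the ubi_mtd_param_parse code from the patch:
struct mtd_spec, parse_offs, parse_mtd_spec, MTD_NAME_MAX and OFFS_DEFAULT
are hypothetical names, and the sketch assumes the two offsets are either
both given or both omitted, as the bracket notation suggests.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MTD_NAME_MAX 64
#define OFFS_DEFAULT (-1)	/* hypothetical marker: let UBI pick the offset */

struct mtd_spec {
	char name[MTD_NAME_MAX];	/* MTD device name or number, as text */
	int vid_hdr_offs;
	int data_offs;
};

/* Parse a non-negative decimal integer; return 0 on success, -1 on error. */
static int parse_offs(const char *s, int *out)
{
	char *end;
	long v;

	if (*s == '\0')
		return -1;
	v = strtol(s, &end, 10);
	if (*end != '\0' || v < 0 || v > INT_MAX)
		return -1;
	*out = (int)v;
	return 0;
}

/* Parse one "name[,vid_hdr_offs,data_offs]" value; return 0 or -1. */
static int parse_mtd_spec(const char *arg, struct mtd_spec *spec)
{
	char buf[128];
	char *vid, *data = NULL;

	assert(arg != NULL && spec != NULL);
	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	vid = strchr(buf, ',');
	if (vid) {
		*vid++ = '\0';
		data = strchr(vid, ',');
		if (!data)
			return -1;	/* offsets must come as a pair */
		*data++ = '\0';
		if (strchr(data, ','))
			return -1;	/* too many fields */
	}

	if (buf[0] == '\0' || strlen(buf) >= MTD_NAME_MAX)
		return -1;
	strcpy(spec->name, buf);

	/* Omitted offsets fall back to the default marker. */
	spec->vid_hdr_offs = OFFS_DEFAULT;
	spec->data_offs = OFFS_DEFAULT;
	if (vid && (parse_offs(vid, &spec->vid_hdr_offs) ||
		    parse_offs(data, &spec->data_offs)))
		return -1;
	return 0;
}
```

With this shape, "content,1984,2048" and "4" are both accepted by the same
parser, and a device given without offsets simply keeps the default marker.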

Frank




* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-02-25  5:58       ` Christoph Hellwig
  2007-02-25 22:03         ` Rusty Russell
  2007-02-26 11:54         ` Artem Bityutskiy
@ 2007-05-17 14:44         ` Christoph Hellwig
  2007-05-17 15:06           ` Artem Bityutskiy
  2 siblings, 1 reply; 129+ messages in thread
From: Christoph Hellwig @ 2007-05-17 14:44 UTC (permalink / raw)
  To: Artem Bityutskiy, Linux Kernel Mailing List, akpm

On Sun, Feb 25, 2007 at 05:58:28AM +0000, Christoph Hellwig wrote:
> > Why not? We tried to avoid this but found out that this is the most
> > decent interface. Specific advice is welcome.
> 
> because this type of compound interface is really painful for the user.
> the module.param=foo syntax makes sure parameters can be used without
> endless documentation for each and every one of them, and makes
> sure module writers don't introduce bugs in their own parser reimplementations.
> 
> Rusty, was it intentional that drivers can use __module_param_call?
> Do you think the ubi use here is okay?

So both I and Rusty told you not to do this and you did it anyway.

I'm quite pissed about this ignorance.  Andrew, what do we do about such
a case?  Should we just revert ubi until they fixed their mess up?


* Re: [PATCH 06/44 take 2] [UBI] startup code
  2007-05-17 14:44         ` Christoph Hellwig
@ 2007-05-17 15:06           ` Artem Bityutskiy
  0 siblings, 0 replies; 129+ messages in thread
From: Artem Bityutskiy @ 2007-05-17 15:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Linux Kernel Mailing List, akpm

Christoph,

On Thu, 2007-05-17 at 15:44 +0100, Christoph Hellwig wrote:
> So both I and Rusty told you not to do this and you did it anyway.
> 
> I'm quite pissed about this ignorance.  Andrew, what do we do about such
> a case?  Should we just revert ubi until they fixed their mess up?

Just for reference: http://lkml.org/lkml/2007/2/19/67

Sorry, I thought Rusty meant that we can use it if we invent our own
type, see the first part of his answer. You did not answer Frank's mail
about _why_ we use it. So I thought you agreed with us, I did not ignore
you, no need to be so assertive.

It is not a problem to fix this, but neither Rusty nor you explained
what is wrong with that call. But we _did_ explain why we use it. You
ignored the explanation and the questions.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



Thread overview: 129+ messages
2007-02-17 16:54 [PATCH 00/44 take 2] [UBI] Unsorted Block Images Artem Bityutskiy
2007-02-17 16:54 ` [PATCH 01/44 take 2] [UBI] Linux build integration Artem Bityutskiy
2007-02-17 16:54 ` [PATCH 02/44 take 2] [UBI] on-flash data structures header Artem Bityutskiy
2007-02-17 16:54 ` [PATCH 03/44 take 2] [UBI] user-space API header Artem Bityutskiy
2007-02-17 21:27   ` Arnd Bergmann
2007-02-20 13:07     ` Artem Bityutskiy
2007-02-20 13:17       ` Arnd Bergmann
2007-02-17 16:54 ` [PATCH 04/44 take 2] [UBI] kernel-spce " Artem Bityutskiy
2007-02-18  1:32   ` Greg KH
2007-02-18  2:08     ` Josh Boyer
2007-02-26 12:12     ` Artem Bityutskiy
2007-02-17 16:54 ` [PATCH 05/44 take 2] [UBI] internal common header Artem Bityutskiy
2007-02-17 21:05   ` Arnd Bergmann
2007-02-19 11:16     ` Artem Bityutskiy
2007-02-19 10:54   ` Christoph Hellwig
2007-02-19 12:38     ` Josh Boyer
2007-02-20 13:05     ` Artem Bityutskiy
2007-02-20 14:55       ` Theodore Tso
2007-02-20 15:15         ` David Woodhouse
2007-02-20 15:22           ` Theodore Tso
2007-02-20 15:33             ` David Woodhouse
2007-02-20 16:12               ` Theodore Tso
2007-02-20 16:47                 ` David Woodhouse
2007-02-25 10:42               ` Pavel Machek
2007-02-20 15:24           ` Artem Bityutskiy
2007-02-25  5:45             ` Christoph Hellwig
2007-02-26 10:28               ` Artem Bityutskiy
2007-02-25  5:43           ` Christoph Hellwig
2007-02-25  6:04             ` David Woodhouse
2007-02-20 15:21         ` Artem Bityutskiy
2007-02-25  5:46           ` Christoph Hellwig
2007-02-20 15:25         ` Artem Bityutskiy
2007-02-25  5:50       ` Christoph Hellwig
2007-02-25 11:55         ` Theodore Tso
2007-02-26 10:09         ` Artem Bityutskiy
2007-02-17 16:54 ` [PATCH 06/44 take 2] [UBI] startup code Artem Bityutskiy
2007-02-19 10:59   ` Christoph Hellwig
2007-02-20 13:00     ` Artem Bityutskiy
2007-02-23 11:03       ` Artem Bityutskiy
2007-02-25  5:58       ` Christoph Hellwig
2007-02-25 22:03         ` Rusty Russell
2007-03-05 13:28           ` Frank Haverkamp
2007-02-26 11:54         ` Artem Bityutskiy
2007-05-17 14:44         ` Christoph Hellwig
2007-05-17 15:06           ` Artem Bityutskiy
2007-02-17 16:54 ` [PATCH 07/44 take 2] [UBI] misc unit header Artem Bityutskiy
2007-02-17 22:59   ` Theodore Tso
2007-02-19 11:00     ` Christoph Hellwig
2007-02-20 12:56       ` Artem Bityutskiy
2007-02-19 11:13     ` Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 08/44 take 2] [UBI] misc unit implementation Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 09/44 take 2] [UBI] debug unit header Artem Bityutskiy
2007-02-17 21:18   ` Arnd Bergmann
2007-02-19 11:00     ` Christoph Hellwig
2007-02-19 12:33     ` Artem Bityutskiy
2007-02-19 14:02       ` Josh Boyer
2007-02-19 14:04         ` Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 10/44 take 2] [UBI] debug unit implementation Artem Bityutskiy
2007-02-17 21:00   ` Arnd Bergmann
2007-02-19 12:29     ` Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 11/44 take 2] [UBI] allocation unit header Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 12/44 take 2] [UBI] allocation unit implementation Artem Bityutskiy
2007-02-17 20:55   ` Arnd Bergmann
2007-02-19 11:05     ` Artem Bityutskiy
2007-02-19 11:13   ` Pekka Enberg
2007-02-20 11:30     ` Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 13/44 take 2] [UBI] I/O unit header Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 14/44 take 2] [UBI] I/O unit implementation Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 15/44 take 2] [UBI] scanning unit header Artem Bityutskiy
2007-02-17 23:07   ` Theodore Tso
2007-02-18  2:17     ` Josh Boyer
2007-02-17 16:55 ` [PATCH 16/44 take 2] [UBI] scanning unit implementation Artem Bityutskiy
2007-02-19 11:05   ` Christoph Hellwig
2007-02-19 14:11     ` Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 17/44 take 2] [UBI] build unit header Artem Bityutskiy
2007-02-17 16:55 ` [PATCH 18/44 take 2] [UBI] build unit implementation Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 19/44 take 2] [UBI] volume table unit header Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 20/44 take 2] [UBI] volume table unit implementation Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 21/44 take 2] [UBI] background thread unit header Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 22/44 take 2] [UBI] background thread unit implementation Artem Bityutskiy
2007-02-19 11:09   ` Christoph Hellwig
2007-02-19 13:55     ` Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 23/44 take 2] [UBI] wear-leveling unit header Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 24/44 take 2] [UBI] wear-leveling unit implementation Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 25/44 take 2] [UBI] EBA unit header Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 26/44 take 2] [UBI] EBA unit implementation Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 27/44 take 2] [UBI] bad block handling unit header Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 28/44 take 2] [UBI] bad block handling unit implementation Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 29/44 take 2] [UBI] update unit header Artem Bityutskiy
2007-02-17 16:56 ` [PATCH 30/44 take 2] [UBI] update unit implementation Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 31/44 take 2] [UBI] accounting unit header Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 32/44 take 2] [UBI] accounting unit implementation Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 33/44 take 2] [UBI] volume management unit header Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 34/44 take 2] [UBI] volume management unit implementation Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 35/44 take 2] [UBI] user-interfaces unit header Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 36/44 take 2] [UBI] user-interfaces unit implementation Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 37/44 take 2] [UBI] sysfs handling unit header Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 38/44 take 2] [UBI] sysfs handling unit implementation Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 39/44 take 2] [UBI] character devices handling sub-unit header Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 40/44 take 2] [UBI] character devices handling sub-unit implementation Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 41/44 take 2] [UBI] gluebi unit header Artem Bityutskiy
2007-02-17 21:14   ` Arnd Bergmann
2007-02-18  2:04     ` Josh Boyer
2007-02-18  2:15       ` Arnd Bergmann
2007-02-18  3:02         ` Josh Boyer
2007-02-18 22:37           ` Arnd Bergmann
2007-02-19 13:52             ` Artem Bityutskiy
2007-02-19 14:01             ` Josh Boyer
2007-02-19 14:07           ` Jörn Engel
2007-02-19 12:29       ` Christoph Hellwig
2007-02-19 13:30     ` Artem Bityutskiy
2007-02-17 16:57 ` [PATCH 42/44 take 2] [UBI] gluebi unit implementation Artem Bityutskiy
2007-02-17 16:58 ` [PATCH 43/44 take 2] [UBI] JFFS2 UBI support Artem Bityutskiy
2007-02-17 16:58 ` [PATCH 44/44 take 2] [UBI] update MAINTAINERS Artem Bityutskiy
2007-02-17 22:49 ` [PATCH 00/44 take 2] [UBI] Unsorted Block Images Theodore Tso
2007-02-19 12:48   ` Artem Bityutskiy
2007-02-19 14:33     ` Theodore Tso
2007-02-19 17:07       ` Artem Bityutskiy
2007-02-19 23:34         ` Theodore Tso
2007-02-20 11:54           ` Artem Bityutskiy
2007-02-25  5:51         ` Christoph Hellwig
2007-02-26 10:11           ` Artem Bityutskiy
2007-02-19 10:50 ` Christoph Hellwig
2007-02-19 17:44   ` Artem Bityutskiy
2007-02-25  5:55     ` Christoph Hellwig
2007-02-20 14:52 ` John Stoffel
2007-02-20 17:41   ` Artem Bityutskiy
2007-02-20 17:44   ` Josh Boyer
2007-02-25  5:48   ` Christoph Hellwig
