linux-kernel.vger.kernel.org archive mirror
* [patch] zero-bounce highmem I/O
@ 2001-08-15  7:50 Jens Axboe
  2001-08-15  9:11 ` David S. Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2001-08-15  7:50 UTC (permalink / raw)
  To: Linux Kernel; +Cc: David S. Miller

[-- Attachment #1: Type: text/plain, Size: 1690 bytes --]

Hi,

I updated the patches to 2.4.9-pre4 along with a few other changes:

- Fix bh_kmap_irq nested irq bug (Andrea, me)

- Remove __scsi_end_request changes and add hard_cur_sectors locally.
  Done to minimize changes to core code; the new code was not buggy, but
  it's probably more of a 2.5 thing. (me)

- Remove 'enabling highmem I/O' messages (me)

- Up queue_nr_requests a bit (me)

- Drop page_to_bus and page_to_phys from my tree, identical versions are
  in the mainline now. (me)

- Sync pci-dma changes with my bio patch set -- use of sg_list and
  pci_map_sgl makes support of highmem easy. See ide-dma changes. The
  single page mappings of pci_map_page/unmap_page are good too, of
  course.  SCSI uses the horrible pci_map_sg hack... (me)
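Several of the driver changes in this round (cciss, cpqarray, ide-dma) repeat the same pattern: walk the buffer_head chain and merge bus-contiguous buffers into a single scatter-gather segment via bh_bus(). A rough userspace sketch of that merging logic, with illustrative struct names standing in for the kernel types:

```c
#include <assert.h>

/* Illustrative userspace model of a buffer_head: a bus address and size. */
struct buf {
	unsigned long bus;	/* bus address of the data (bh_bus() in the patch) */
	unsigned int size;	/* transfer size in bytes */
};

struct seg {
	unsigned long bus;	/* start of the merged segment */
	unsigned int len;	/* total length */
};

/*
 * Walk a chain of buffers and coalesce bus-contiguous ones into
 * scatter-gather segments, as the patched drivers do with bh_bus().
 * Returns the number of segments, or -1 if max_segs would be exceeded.
 */
int build_segments(const struct buf *bufs, int nbufs,
		   struct seg *segs, int max_segs)
{
	unsigned long lastdataend = 0;
	int nsegs = 0;
	int i;

	for (i = 0; i < nbufs; i++) {
		if (nsegs && bufs[i].bus == lastdataend) {
			/* contiguous with previous: extend last segment */
			segs[nsegs - 1].len += bufs[i].size;
		} else {
			/* start a new segment */
			if (nsegs == max_segs)
				return -1;
			segs[nsegs].bus = bufs[i].bus;
			segs[nsegs].len = bufs[i].size;
			nsegs++;
		}
		lastdataend = bufs[i].bus + bufs[i].size;
	}
	return nsegs;
}
```

The drivers in the patch do the same with MAXSGENTRIES/SG_MAX/PRD_ENTRIES as the segment limit, except that they BUG() on overflow.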

There are just two patches now:

- block-highmem-all-7 has everything

or, split in two:

- pci-dma-high-1 has the PCI DMA changes to support mapping of highmem pages
- block-highmem-7 has the core and driver updates

The latter two patches together equal the former. Currently the code will
only compile on x86 due to the scatterlist changes; I'd appreciate info
from arch maintainers on whether the pci-dma-high-1 patch does anything
that can't be readily supported on non-x86 platforms. Dave, comments on
that? To me it looks fine, but I could be missing something. And Dave,
should I add the 64-bit stuff I started again? :-)
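For arch maintainers eyeballing the scatterlist change: each entry now carries either a struct page (highmem) or a virtual address (lowmem), never both, and pci_map_sg derives dma_address from whichever one is set. A userspace model of that convention, assuming identity-mapped lowmem and bus address == physical address as on i386:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12

/* Minimal stand-ins for the kernel structures involved. */
struct page { unsigned long pfn; };	/* a mem_map entry, modeled by its frame number */

struct scatterlist {
	struct page *page;	/* set for highmem entries */
	char *address;		/* set for lowmem entries, NULL otherwise */
	unsigned long dma_address;
	unsigned int length;
	unsigned int offset;
};

/* In this model, bus address == physical address, as on i386. */
static unsigned long page_to_bus(const struct page *p)
{
	return p->pfn << PAGE_SHIFT;
}

static unsigned long virt_to_bus(const char *addr, const char *virt_base)
{
	return (unsigned long)(addr - virt_base);	/* identity-mapped lowmem */
}

/*
 * The "temporary 2.4 hack" from the patch: each entry must have exactly
 * one of page/address set; resolve dma_address accordingly.
 * Returns nents on success, -1 on a malformed entry (kernel BUG()s).
 */
int map_sg(struct scatterlist *sg, int nents, char *virt_base)
{
	int i;

	for (i = 0; i < nents; i++) {
		if (!!sg[i].address == !!sg[i].page)
			return -1;	/* both or neither set: malformed */
		if (sg[i].page)
			sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
		else
			sg[i].dma_address = virt_to_bus(sg[i].address, virt_base);
	}
	return nents;
}
```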

Testers are more than welcome. Oracle produced some truly abysmal
profiling numbers for the stock kernel with bouncing; I won't even show
those here for fear of inducing vomit attacks in the readers. The patch
is considered stable.
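The point of the patch, sketched in userspace terms: each queue records the highest page frame it can reach directly (blk_queue_bounce_limit), and only pages above that limit get bounced, so a 4GB-capable controller never pays the copy for highmem pages below 4GB. Names here are illustrative models, not the kernel API:

```c
#include <assert.h>

#define PAGE_SHIFT 12

/*
 * Userspace model of the per-queue bounce decision the patch adds:
 * a queue stores the highest page frame it can do I/O to directly,
 * and any page frame above that limit must be bounced.
 */
struct queue {
	unsigned long bounce_pfn;	/* highest directly-addressable pfn */
};

/* Mirror of blk_queue_bounce_limit(): convert a bus address limit to a pfn. */
void set_bounce_limit(struct queue *q, unsigned long long bus_addr)
{
	q->bounce_pfn = (unsigned long)(bus_addr >> PAGE_SHIFT);
}

/* Does I/O to this page frame need a bounce buffer on this queue? */
int needs_bounce(const struct queue *q, unsigned long pfn)
{
	return pfn > q->bounce_pfn;
}
```

With the stock kernel every highmem page bounces; with a BLK_BOUNCE_4G limit only pages above the 4GB mark do.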

Can't get into kernel.org currently, so I'm attaching the two patches.
They aren't that big anyway...

-- 
Jens Axboe


[-- Attachment #2: pci-dma-high-1 --]
[-- Type: text/plain, Size: 5161 bytes --]

diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/asm-i386/pci.h linux/include/asm-i386/pci.h
--- /opt/kernel/linux-2.4.9-pre4/include/asm-i386/pci.h	Wed Aug 15 09:19:05 2001
+++ linux/include/asm-i386/pci.h	Wed Aug 15 09:36:55 2001
@@ -28,6 +28,7 @@
 
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/highmem.h>
 #include <asm/scatterlist.h>
 #include <linux/string.h>
 #include <asm/io.h>
@@ -84,6 +85,27 @@
 	/* Nothing to do */
 }
 
+/*
+ * pci_{map,unmap}_page map a kernel page to a dma_addr_t. Identical to
+ * pci_map_single, but takes a struct page and offset instead of a virtual address
+ */
+extern inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
+				      size_t size, int offset, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	return (page - mem_map) * PAGE_SIZE + offset;
+}
+
+extern inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+				  size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scather-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -102,8 +124,26 @@
 static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
 			     int nents, int direction)
 {
+	int i;
+
 	if (direction == PCI_DMA_NONE)
 		BUG();
+
+	/*
+	 * temporary 2.4 hack
+	 */
+	for (i = 0; i < nents; i++) {
+		if (sg[i].address && sg[i].page)
+			BUG();
+		else if (!sg[i].address && !sg[i].page)
+			BUG();
+
+		if (sg[i].page)
+			sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
+		else
+			sg[i].dma_address = virt_to_bus(sg[i].address);
+	}
+
 	return nents;
 }
 
@@ -119,6 +159,32 @@
 	/* Nothing to do */
 }
 
+/*
+ * meant to replace the pci_map_sg api, new drivers should use this
+ * interface
+ */
+extern inline int pci_map_sgl(struct pci_dev *hwdev, struct sg_list *sg,
+			      int nents, int direction)
+{
+	int i;
+
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	for (i = 0; i < nents; i++)
+		sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
+
+	return nents;
+}
+
+extern inline void pci_unmap_sgl(struct pci_dev *hwdev, struct sg_list *sg,
+				 int nents, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
  *
@@ -173,10 +239,9 @@
 /* These macros should be used after a pci_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries pci_map_sg
- * returns, or alternatively stop on the first sg_dma_len(sg) which
- * is 0.
+ * returns.
  */
-#define sg_dma_address(sg)	(virt_to_bus((sg)->address))
+#define sg_dma_address(sg)	((sg)->dma_address)
 #define sg_dma_len(sg)		((sg)->length)
 
 /* Return the index of the PCI controller for device. */
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/asm-i386/scatterlist.h linux/include/asm-i386/scatterlist.h
--- /opt/kernel/linux-2.4.9-pre4/include/asm-i386/scatterlist.h	Mon Dec 30 12:01:10 1996
+++ linux/include/asm-i386/scatterlist.h	Wed Aug 15 09:30:21 2001
@@ -1,12 +1,56 @@
 #ifndef _I386_SCATTERLIST_H
 #define _I386_SCATTERLIST_H
 
+/*
+ * temporary measure, include a page and offset.
+ */
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
+    struct page * page; /* Location for highmem page, if any */
+    char *  address;    /* Location data is to be transferred to, NULL for
+			 * highmem page */
     char * alt_address; /* Location of actual if address is a 
 			 * dma indirect buffer.  NULL otherwise */
+    dma_addr_t dma_address;
     unsigned int length;
+    unsigned int offset;/* for highmem, page offset */
 };
+
+/*
+ * new style scatter gather list -- move to this completely?
+ */
+struct sg_list {
+	/*
+	 * input
+	 */
+	struct page *page;	/* page to do I/O to */
+	unsigned int length;	/* length of I/O */
+	unsigned int offset;	/* offset into page */
+
+	/*
+	 * original page, if bounced
+	 */
+	struct page *bounce_page;
+
+	/*
+	 * output
+	 */
+	dma_addr_t dma_address;	/* mapped address */
+};
+
+extern inline void set_bh_sg(struct scatterlist *sg, struct buffer_head *bh)
+{
+	if (PageHighMem(bh->b_page)) {
+		sg->page = bh->b_page;
+		sg->offset = bh_offset(bh);
+		sg->address = NULL;
+	} else {
+		sg->page = NULL;
+		sg->offset = 0;
+		sg->address = bh->b_data;
+	}
+
+	sg->length = bh->b_size;
+}
 
 #define ISA_DMA_THRESHOLD (0x00ffffff)
 
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/linux/pci.h linux/include/linux/pci.h
--- /opt/kernel/linux-2.4.9-pre4/include/linux/pci.h	Fri Jul 20 21:52:38 2001
+++ linux/include/linux/pci.h	Wed Aug 15 09:41:02 2001
@@ -314,6 +314,8 @@
 #define PCI_DMA_FROMDEVICE	2
 #define PCI_DMA_NONE		3
 
+#define PCI_MAX_DMA32		(0xffffffff)
+
 #define DEVICE_COUNT_COMPATIBLE	4
 #define DEVICE_COUNT_IRQ	2
 #define DEVICE_COUNT_DMA	2

[-- Attachment #3: block-highmem-all-7 --]
[-- Type: text/plain, Size: 50629 bytes --]

diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/cciss.c linux/drivers/block/cciss.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/cciss.c	Mon Jul  2 22:56:40 2001
+++ linux/drivers/block/cciss.c	Wed Aug 15 09:14:26 2001
@@ -1129,7 +1129,7 @@
 	{
 		temp64.val32.lower = cmd->SG[i].Addr.lower;
 		temp64.val32.upper = cmd->SG[i].Addr.upper;
-		pci_unmap_single(hba[cmd->ctlr]->pdev,
+		pci_unmap_page(hba[cmd->ctlr]->pdev,
 			temp64.val, cmd->SG[i].Len, 
 			(cmd->Request.Type.Direction == XFER_READ) ? 
 				PCI_DMA_FROMDEVICE : PCI_DMA_TODEVICE);
@@ -1225,7 +1225,7 @@
 static int cpq_back_merge_fn(request_queue_t *q, struct request *rq,
                              struct buffer_head *bh, int max_segments)
 {
-        if (rq->bhtail->b_data + rq->bhtail->b_size == bh->b_data)
+	if (bh_bus(rq->bhtail) + rq->bhtail->b_size == bh_bus(bh))
                 return 1;
         return cpq_new_segment(q, rq, max_segments);
 }
@@ -1233,7 +1233,7 @@
 static int cpq_front_merge_fn(request_queue_t *q, struct request *rq,
                              struct buffer_head *bh, int max_segments)
 {
-        if (bh->b_data + bh->b_size == rq->bh->b_data)
+	if (bh_bus(bh) + bh->b_size == bh_bus(rq->bh))
                 return 1;
         return cpq_new_segment(q, rq, max_segments);
 }
@@ -1243,7 +1243,7 @@
 {
         int total_segments = rq->nr_segments + nxt->nr_segments;
 
-        if (rq->bhtail->b_data + rq->bhtail->b_size == nxt->bh->b_data)
+	if (bh_bus(rq->bhtail) + rq->bhtail->b_size == bh_bus(nxt->bh))
                 total_segments--;
 
         if (total_segments > MAXSGENTRIES)
@@ -1264,7 +1264,7 @@
 	ctlr_info_t *h= q->queuedata; 
 	CommandList_struct *c;
 	int log_unit, start_blk, seg, sect;
-	char *lastdataend;
+	unsigned long lastdataend;
 	struct buffer_head *bh;
 	struct list_head *queue_head = &q->queue_head;
 	struct request *creq;
@@ -1272,10 +1272,15 @@
 	struct my_sg tmp_sg[MAXSGENTRIES];
 	int i;
 
-    // Loop till the queue is empty if or it is plugged
+	if (q->plugged) {
+		start_io(h);
+		return;
+	}
+
+    // Loop till the queue is empty
     while (1)
     {
-	if (q->plugged || list_empty(queue_head)) {
+	if (list_empty(queue_head)) {
                 start_io(h);
                 return;
         }
@@ -1323,12 +1328,12 @@
 		(int) creq->nr_sectors);	
 #endif /* CCISS_DEBUG */
 	seg = 0; 
-	lastdataend = NULL;
+	lastdataend = 0;
 	sect = 0;
 	while(bh)
 	{
 		sect += bh->b_size/512;
-		if (bh->b_data == lastdataend)
+		if (bh_bus(bh) == lastdataend)
 		{  // tack it on to the last segment 
 			tmp_sg[seg-1].len +=bh->b_size;
 			lastdataend += bh->b_size;
@@ -1336,9 +1341,10 @@
 		{
 			if (seg == MAXSGENTRIES)
 				BUG();
+			tmp_sg[seg].page = bh->b_page;
 			tmp_sg[seg].len = bh->b_size;
-			tmp_sg[seg].start_addr = bh->b_data;
-			lastdataend = bh->b_data + bh->b_size;
+			tmp_sg[seg].offset = bh_offset(bh);
+			lastdataend = bh_bus(bh) + bh->b_size;
 			seg++;
 		}
 		bh = bh->b_reqnext;
@@ -1347,9 +1353,8 @@
 	for (i=0; i<seg; i++)
 	{
 		c->SG[i].Len = tmp_sg[i].len;
-		temp64.val = (__u64) pci_map_single( h->pdev,
-			tmp_sg[i].start_addr,
-			tmp_sg[i].len,
+		temp64.val = (__u64) pci_map_page( h->pdev,
+			tmp_sg[i].page, tmp_sg[i].len, tmp_sg[i].offset,
 			(c->Request.Type.Direction == XFER_READ) ? 
                                 PCI_DMA_FROMDEVICE : PCI_DMA_TODEVICE);
 		c->SG[i].Addr.lower = temp64.val32.lower;
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/cciss.h linux/drivers/block/cciss.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/cciss.h	Tue May 22 19:23:16 2001
+++ linux/drivers/block/cciss.h	Wed Aug 15 09:41:25 2001
@@ -16,8 +16,9 @@
 #define MAJOR_NR COMPAQ_CISS_MAJOR 
 
 struct my_sg {
-	int len;
-	char *start_addr;
+	struct page *page;
+	unsigned short len;
+	unsigned short offset;
 };
 
 struct ctlr_info;
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/cpqarray.c linux/drivers/block/cpqarray.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/cpqarray.c	Wed Aug 15 09:19:03 2001
+++ linux/drivers/block/cpqarray.c	Wed Aug 15 09:14:26 2001
@@ -367,7 +367,7 @@
 static int cpq_back_merge_fn(request_queue_t *q, struct request *rq,
 			     struct buffer_head *bh, int max_segments)
 {
-	if (rq->bhtail->b_data + rq->bhtail->b_size == bh->b_data)
+	if (bh_bus(rq->bhtail) + rq->bhtail->b_size == bh_bus(bh))
 		return 1;
 	return cpq_new_segment(q, rq, max_segments);
 }
@@ -375,7 +375,7 @@
 static int cpq_front_merge_fn(request_queue_t *q, struct request *rq,
 			     struct buffer_head *bh, int max_segments)
 {
-	if (bh->b_data + bh->b_size == rq->bh->b_data)
+	if (bh_bus(bh) + bh->b_size == bh_bus(rq->bh))
 		return 1;
 	return cpq_new_segment(q, rq, max_segments);
 }
@@ -385,7 +385,7 @@
 {
 	int total_segments = rq->nr_segments + nxt->nr_segments;
 
-	if (rq->bhtail->b_data + rq->bhtail->b_size == nxt->bh->b_data)
+	if (bh_bus(rq->bhtail) + rq->bhtail->b_size == bh_bus(nxt->bh))
 		total_segments--;
 
 	if (total_segments > SG_MAX)
@@ -532,6 +532,7 @@
 		q = BLK_DEFAULT_QUEUE(MAJOR_NR + i);
 		q->queuedata = hba[i];
 		blk_init_queue(q, do_ida_request);
+		blk_queue_bounce_limit(q, BLK_BOUNCE_4G);
 		blk_queue_headactive(q, 0);
 		blksize_size[MAJOR_NR+i] = ida_blocksizes + (i*256);
 		hardsect_size[MAJOR_NR+i] = ida_hardsizes + (i*256);
@@ -923,17 +924,22 @@
 	ctlr_info_t *h = q->queuedata;
 	cmdlist_t *c;
 	int seg, sect;
-	char *lastdataend;
+	unsigned long lastdataend;
 	struct list_head * queue_head = &q->queue_head;
 	struct buffer_head *bh;
 	struct request *creq;
 	struct my_sg tmp_sg[SG_MAX];
 	int i;
 
-// Loop till the queue is empty if or it is plugged 
+	if (q->plugged) {
+		start_io(h);
+		return;
+	}
+
+// Loop till the queue is empty
    while (1)
 {
-	if (q->plugged || list_empty(queue_head)) {
+	if (list_empty(queue_head)) {
 		start_io(h);
 		return;
 	}
@@ -973,19 +979,20 @@
 	
 	printk("sector=%d, nr_sectors=%d\n", creq->sector, creq->nr_sectors);
 );
-	seg = 0; lastdataend = NULL;
+	seg = lastdataend = 0;
 	sect = 0;
 	while(bh) {
 		sect += bh->b_size/512;
-		if (bh->b_data == lastdataend) {
+		if (bh_bus(bh) == lastdataend) {
 			tmp_sg[seg-1].size += bh->b_size;
 			lastdataend += bh->b_size;
 		} else {
 			if (seg == SG_MAX)
 				BUG();
+			tmp_sg[seg].page = bh->b_page;
 			tmp_sg[seg].size = bh->b_size;
-			tmp_sg[seg].start_addr = bh->b_data;
-			lastdataend = bh->b_data + bh->b_size;
+			tmp_sg[seg].offset = bh_offset(bh);
+			lastdataend = bh_bus(bh) + bh->b_size;
 			seg++;
 		}
 		bh = bh->b_reqnext;
@@ -994,9 +1001,9 @@
 	for( i=0; i < seg; i++)
 	{
 		c->req.sg[i].size = tmp_sg[i].size;
-		c->req.sg[i].addr = (__u32) pci_map_single(
-                		h->pci_dev, tmp_sg[i].start_addr, 
-				tmp_sg[i].size,
+		c->req.sg[i].addr = (__u32) pci_map_page(
+                		h->pci_dev, tmp_sg[i].page, tmp_sg[i].size,
+				tmp_sg[i].offset,
                                 (creq->cmd == READ) ? 
 					PCI_DMA_FROMDEVICE : PCI_DMA_TODEVICE);
 	}
@@ -1103,7 +1110,7 @@
 	/* unmap the DMA mapping for all the scatter gather elements */
         for(i=0; i<cmd->req.hdr.sg_cnt; i++)
         {
-                pci_unmap_single(hba[cmd->ctlr]->pci_dev,
+                pci_unmap_page(hba[cmd->ctlr]->pci_dev,
                         cmd->req.sg[i].addr, cmd->req.sg[i].size,
                         (cmd->req.hdr.cmd == IDA_READ) ? PCI_DMA_FROMDEVICE : PCI_DMA_TODEVICE);
         }
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/cpqarray.h linux/drivers/block/cpqarray.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/cpqarray.h	Tue May 22 19:23:16 2001
+++ linux/drivers/block/cpqarray.h	Wed Aug 15 09:41:38 2001
@@ -57,8 +57,9 @@
 #ifdef __KERNEL__
 
 struct my_sg {
-	int size;
-	char *start_addr;
+	struct page *page;
+	unsigned short size;
+	unsigned short offset;
 };
 
 struct ctlr_info;
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/elevator.c linux/drivers/block/elevator.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/elevator.c	Fri Jul 20 05:59:41 2001
+++ linux/drivers/block/elevator.c	Wed Aug 15 09:14:26 2001
@@ -110,7 +110,6 @@
 			break;
 		} else if (__rq->sector - count == bh->b_rsector) {
 			ret = ELEVATOR_FRONT_MERGE;
-			__rq->elevator_sequence -= count;
 			*req = __rq;
 			break;
 		}
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/ll_rw_blk.c	Wed Aug 15 09:19:03 2001
+++ linux/drivers/block/ll_rw_blk.c	Wed Aug 15 09:14:26 2001
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/smp_lock.h>
 #include <linux/completion.h>
+#include <linux/bootmem.h>
 
 #include <asm/system.h>
 #include <asm/io.h>
@@ -124,6 +125,8 @@
  */
 static int queue_nr_requests, batch_requests;
 
+unsigned long blk_max_low_pfn;
+
 static inline int get_max_sectors(kdev_t dev)
 {
 	if (!max_sectors[MAJOR(dev)])
@@ -131,7 +134,7 @@
 	return max_sectors[MAJOR(dev)][MINOR(dev)];
 }
 
-inline request_queue_t *__blk_get_queue(kdev_t dev)
+inline request_queue_t *blk_get_queue(kdev_t dev)
 {
 	struct blk_dev_struct *bdev = blk_dev + MAJOR(dev);
 
@@ -141,22 +144,6 @@
 		return &blk_dev[MAJOR(dev)].request_queue;
 }
 
-/*
- * NOTE: the device-specific queue() functions
- * have to be atomic!
- */
-request_queue_t *blk_get_queue(kdev_t dev)
-{
-	request_queue_t *ret;
-	unsigned long flags;
-
-	spin_lock_irqsave(&io_request_lock,flags);
-	ret = __blk_get_queue(dev);
-	spin_unlock_irqrestore(&io_request_lock,flags);
-
-	return ret;
-}
-
 static int __blk_cleanup_queue(struct list_head *head)
 {
 	struct request *rq;
@@ -261,6 +248,24 @@
 	q->make_request_fn = mfn;
 }
 
+/**
+ * blk_queue_bounce_limit - set bounce buffer limit for queue
+ * @q:  the request queue for the device
+ * @dma_addr:   bus address limit
+ *
+ * Description:
+ *    Different hardware can have different requirements as to what pages
+ *    it can do I/O directly to. A low level driver can call
+ *    blk_queue_bounce_limit to have lower memory pages allocated as bounce
+ *    buffers for doing I/O to pages residing above @dma_addr. By default
+ *    the block layer sets this to the highest numbered "low" memory page,
+ *    ie one the driver can still call bio_page() and get a valid address on.
+ **/
+void blk_queue_bounce_limit(request_queue_t *q, unsigned long dma_addr)
+{
+	q->bounce_limit = mem_map + (dma_addr >> PAGE_SHIFT);
+}
+
 static inline int ll_new_segment(request_queue_t *q, struct request *req, int max_segments)
 {
 	if (req->nr_segments < max_segments) {
@@ -273,7 +278,7 @@
 static int ll_back_merge_fn(request_queue_t *q, struct request *req, 
 			    struct buffer_head *bh, int max_segments)
 {
-	if (req->bhtail->b_data + req->bhtail->b_size == bh->b_data)
+	if (bh_bus(req->bhtail) + req->bhtail->b_size == bh_bus(bh))
 		return 1;
 	return ll_new_segment(q, req, max_segments);
 }
@@ -281,7 +286,7 @@
 static int ll_front_merge_fn(request_queue_t *q, struct request *req, 
 			     struct buffer_head *bh, int max_segments)
 {
-	if (bh->b_data + bh->b_size == req->bh->b_data)
+	if (bh_bus(bh) + bh->b_size == bh_bus(req->bh))
 		return 1;
 	return ll_new_segment(q, req, max_segments);
 }
@@ -291,7 +296,7 @@
 {
 	int total_segments = req->nr_segments + next->nr_segments;
 
-	if (req->bhtail->b_data + req->bhtail->b_size == next->bh->b_data)
+	if (bh_bus(req->bhtail) + req->bhtail->b_size == bh_bus(next->bh))
 		total_segments--;
     
 	if (total_segments > max_segments)
@@ -430,6 +435,8 @@
 	 */
 	q->plug_device_fn 	= generic_plug_device;
 	q->head_active    	= 1;
+
+	blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
 }
 
 #define blkdev_free_rq(list) list_entry((list)->next, struct request, table);
@@ -696,9 +703,7 @@
 	 * driver. Create a bounce buffer if the buffer data points into
 	 * high memory - keep the original buffer otherwise.
 	 */
-#if CONFIG_HIGHMEM
-	bh = create_bounce(rw, bh);
-#endif
+	bh = blk_queue_bounce(q, rw, bh);
 
 /* look for a free request. */
 	/*
@@ -743,8 +748,13 @@
 			elevator->elevator_merge_cleanup_fn(q, req, count);
 			bh->b_reqnext = req->bh;
 			req->bh = bh;
+			/*
+			 * may not be valid, but queues not having bounce
+			 * enabled for highmem pages must not look at
+			 * ->buffer anyway
+			 */
 			req->buffer = bh->b_data;
-			req->current_nr_sectors = count;
+			req->current_nr_sectors = req->hard_cur_sectors = count;
 			req->sector = req->hard_sector = sector;
 			req->nr_sectors = req->hard_nr_sectors += count;
 			blk_started_io(count);
@@ -794,7 +804,7 @@
 	req->errors = 0;
 	req->hard_sector = req->sector = sector;
 	req->hard_nr_sectors = req->nr_sectors = count;
-	req->current_nr_sectors = count;
+	req->current_nr_sectors = req->hard_cur_sectors = count;
 	req->nr_segments = 1; /* Always 1 for a new request. */
 	req->nr_hw_segments = 1; /* Always 1 for a new request. */
 	req->buffer = bh->b_data;
@@ -1104,6 +1114,7 @@
 			req->nr_sectors = req->hard_nr_sectors;
 
 			req->current_nr_sectors = bh->b_size >> 9;
+			req->hard_cur_sectors = req->current_nr_sectors;
 			if (req->nr_sectors < req->current_nr_sectors) {
 				req->nr_sectors = req->current_nr_sectors;
 				printk("end_request: buffer-list destroyed\n");
@@ -1152,7 +1163,7 @@
 	 */
 	queue_nr_requests = 64;
 	if (total_ram > MB(32))
-		queue_nr_requests = 128;
+		queue_nr_requests = 256;
 
 	/*
 	 * Batch frees according to queue length
@@ -1160,6 +1171,8 @@
 	batch_requests = queue_nr_requests >> 3;
 	printk("block: %d slots per queue, batch=%d\n", queue_nr_requests, batch_requests);
 
+	blk_max_low_pfn = max_low_pfn;
+
 #ifdef CONFIG_AMIGA_Z2RAM
 	z2_init();
 #endif
@@ -1272,10 +1285,11 @@
 EXPORT_SYMBOL(end_that_request_last);
 EXPORT_SYMBOL(blk_init_queue);
 EXPORT_SYMBOL(blk_get_queue);
-EXPORT_SYMBOL(__blk_get_queue);
 EXPORT_SYMBOL(blk_cleanup_queue);
 EXPORT_SYMBOL(blk_queue_headactive);
 EXPORT_SYMBOL(blk_queue_make_request);
 EXPORT_SYMBOL(generic_make_request);
 EXPORT_SYMBOL(blkdev_release_request);
 EXPORT_SYMBOL(generic_unplug_device);
+EXPORT_SYMBOL(blk_queue_bounce_limit);
+EXPORT_SYMBOL(blk_max_low_pfn);
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/block/loop.c linux/drivers/block/loop.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/block/loop.c	Sat Jun 30 01:16:56 2001
+++ linux/drivers/block/loop.c	Wed Aug 15 09:14:26 2001
@@ -453,9 +453,7 @@
 		goto err;
 	}
 
-#if CONFIG_HIGHMEM
-	rbh = create_bounce(rw, rbh);
-#endif
+	rbh = blk_queue_bounce(q, rw, rbh);
 
 	/*
 	 * file backed, queue for loop_thread to handle
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/ide/hpt34x.c linux/drivers/ide/hpt34x.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/ide/hpt34x.c	Sun May 20 02:43:06 2001
+++ linux/drivers/ide/hpt34x.c	Wed Aug 15 09:14:26 2001
@@ -425,6 +425,7 @@
 			hwif->autodma = 0;
 
 		hwif->dmaproc = &hpt34x_dmaproc;
+		hwif->highmem = 1;
 	} else {
 		hwif->drives[0].autotune = 1;
 		hwif->drives[1].autotune = 1;
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/ide/hpt366.c linux/drivers/ide/hpt366.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/ide/hpt366.c	Thu Jun 28 02:10:55 2001
+++ linux/drivers/ide/hpt366.c	Wed Aug 15 09:14:26 2001
@@ -720,6 +720,7 @@
 			hwif->autodma = 1;
 		else
 			hwif->autodma = 0;
+		hwif->highmem = 1;
 	} else {
 		hwif->autodma = 0;
 		hwif->drives[0].autotune = 1;
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/ide/ide-disk.c linux/drivers/ide/ide-disk.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/ide/ide-disk.c	Wed Aug 15 09:19:17 2001
+++ linux/drivers/ide/ide-disk.c	Wed Aug 15 09:14:26 2001
@@ -27,9 +27,10 @@
  * Version 1.09		added increment of rq->sector in ide_multwrite
  *			added UDMA 3/4 reporting
  * Version 1.10		request queue changes, Ultra DMA 100
+ * Version 1.11		Highmem I/O support, Jens Axboe <axboe@suse.de>
  */
 
-#define IDEDISK_VERSION	"1.10"
+#define IDEDISK_VERSION	"1.11"
 
 #undef REALLY_SLOW_IO		/* most systems can safely undef this */
 
@@ -139,7 +140,9 @@
 	byte stat;
 	int i;
 	unsigned int msect, nsect;
+	unsigned long flags;
 	struct request *rq;
+	char *to;
 
 	/* new way for dealing with premature shared PCI interrupts */
 	if (!OK_STAT(stat=GET_STAT(),DATA_READY,BAD_R_STAT)) {
@@ -150,8 +153,8 @@
 		ide_set_handler(drive, &read_intr, WAIT_CMD, NULL);
 		return ide_started;
 	}
+
 	msect = drive->mult_count;
-	
 read_next:
 	rq = HWGROUP(drive)->rq;
 	if (msect) {
@@ -160,14 +163,15 @@
 		msect -= nsect;
 	} else
 		nsect = 1;
-	idedisk_input_data(drive, rq->buffer, nsect * SECTOR_WORDS);
+	to = ide_map_buffer(rq, &flags);
+	idedisk_input_data(drive, to, nsect * SECTOR_WORDS);
 #ifdef DEBUG
 	printk("%s:  read: sectors(%ld-%ld), buffer=0x%08lx, remaining=%ld\n",
 		drive->name, rq->sector, rq->sector+nsect-1,
 		(unsigned long) rq->buffer+(nsect<<9), rq->nr_sectors-nsect);
 #endif
+	ide_unmap_buffer(to, &flags);
 	rq->sector += nsect;
-	rq->buffer += nsect<<9;
 	rq->errors = 0;
 	i = (rq->nr_sectors -= nsect);
 	if (((long)(rq->current_nr_sectors -= nsect)) <= 0)
@@ -201,14 +205,16 @@
 #endif
 		if ((rq->nr_sectors == 1) ^ ((stat & DRQ_STAT) != 0)) {
 			rq->sector++;
-			rq->buffer += 512;
 			rq->errors = 0;
 			i = --rq->nr_sectors;
 			--rq->current_nr_sectors;
 			if (((long)rq->current_nr_sectors) <= 0)
 				ide_end_request(1, hwgroup);
 			if (i > 0) {
-				idedisk_output_data (drive, rq->buffer, SECTOR_WORDS);
+				unsigned long flags;
+				char *to = ide_map_buffer(rq, &flags);
+				idedisk_output_data (drive, to, SECTOR_WORDS);
+				ide_unmap_buffer(to, &flags);
 				ide_set_handler (drive, &write_intr, WAIT_CMD, NULL);
                                 return ide_started;
 			}
@@ -238,14 +244,14 @@
   	do {
   		char *buffer;
   		int nsect = rq->current_nr_sectors;
- 
+		unsigned long flags;
+
 		if (nsect > mcount)
 			nsect = mcount;
 		mcount -= nsect;
-		buffer = rq->buffer;
 
+		buffer = ide_map_buffer(rq, &flags);
 		rq->sector += nsect;
-		rq->buffer += nsect << 9;
 		rq->nr_sectors -= nsect;
 		rq->current_nr_sectors -= nsect;
 
@@ -259,7 +265,7 @@
 			} else {
 				rq->bh = bh;
 				rq->current_nr_sectors = bh->b_size >> 9;
-				rq->buffer             = bh->b_data;
+				rq->hard_cur_sectors = rq->current_nr_sectors;
 			}
 		}
 
@@ -268,6 +274,7 @@
 		 * re-entering us on the last transfer.
 		 */
 		idedisk_output_data(drive, buffer, nsect<<7);
+		ide_unmap_buffer(buffer, &flags);
 	} while (mcount);
 
         return 0;
@@ -452,8 +459,11 @@
 				return ide_stopped;
 			}
 		} else {
+			unsigned long flags;
+			char *buffer = ide_map_buffer(rq, &flags);
 			ide_set_handler (drive, &write_intr, WAIT_CMD, NULL);
-			idedisk_output_data(drive, rq->buffer, SECTOR_WORDS);
+			idedisk_output_data(drive, buffer, SECTOR_WORDS);
+			ide_unmap_buffer(buffer, &flags);
 		}
 		return ide_started;
 	}
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/ide/ide-dma.c linux/drivers/ide/ide-dma.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/ide/ide-dma.c	Wed Aug 15 09:19:17 2001
+++ linux/drivers/ide/ide-dma.c	Wed Aug 15 09:36:47 2001
@@ -230,36 +230,45 @@
 static int ide_build_sglist (ide_hwif_t *hwif, struct request *rq)
 {
 	struct buffer_head *bh;
-	struct scatterlist *sg = hwif->sg_table;
+	struct sg_list *sg = hwif->sg_table;
+	unsigned long lastdataend;
 	int nents = 0;
 
 	if (hwif->sg_dma_active)
 		BUG();
-		
+
 	if (rq->cmd == READ)
 		hwif->sg_dma_direction = PCI_DMA_FROMDEVICE;
 	else
 		hwif->sg_dma_direction = PCI_DMA_TODEVICE;
+
 	bh = rq->bh;
+	lastdataend = 0;
 	do {
-		unsigned char *virt_addr = bh->b_data;
-		unsigned int size = bh->b_size;
-
-		if (nents >= PRD_ENTRIES)
-			return 0;
+		/*
+		 * continue segment from before?
+		 */
+		if (bh_bus(bh) == lastdataend) {
+			sg[nents - 1].length += bh->b_size;
+			lastdataend += bh->b_size;
+		} else {
+			struct sg_list *sge;
+			/*
+			 * start new segment
+			 */
+			if (nents >= PRD_ENTRIES)
+				return 0;
 
-		while ((bh = bh->b_reqnext) != NULL) {
-			if ((virt_addr + size) != (unsigned char *) bh->b_data)
-				break;
-			size += bh->b_size;
+			sge = &sg[nents];
+			sge->page = bh->b_page;
+			sge->length = bh->b_size;
+			sge->offset = bh_offset(bh);
+			lastdataend = bh_bus(bh) + bh->b_size;
+			nents++;
 		}
-		memset(&sg[nents], 0, sizeof(*sg));
-		sg[nents].address = virt_addr;
-		sg[nents].length = size;
-		nents++;
-	} while (bh != NULL);
+	} while ((bh = bh->b_reqnext) != NULL);
 
-	return pci_map_sg(hwif->pci_dev, sg, nents, hwif->sg_dma_direction);
+	return pci_map_sgl(hwif->pci_dev, sg, nents, hwif->sg_dma_direction);
 }
 
 /*
@@ -277,7 +286,7 @@
 #endif
 	unsigned int count = 0;
 	int i;
-	struct scatterlist *sg;
+	struct sg_list *sg;
 
 	HWIF(drive)->sg_nents = i = ide_build_sglist(HWIF(drive), HWGROUP(drive)->rq);
 
@@ -285,7 +294,7 @@
 		return 0;
 
 	sg = HWIF(drive)->sg_table;
-	while (i && sg_dma_len(sg)) {
+	while (i) {
 		u32 cur_addr;
 		u32 cur_len;
 
@@ -299,36 +308,35 @@
 		 */
 
 		while (cur_len) {
-			if (count++ >= PRD_ENTRIES) {
-				printk("%s: DMA table too small\n", drive->name);
-				goto use_pio_instead;
-			} else {
-				u32 xcount, bcount = 0x10000 - (cur_addr & 0xffff);
-
-				if (bcount > cur_len)
-					bcount = cur_len;
-				*table++ = cpu_to_le32(cur_addr);
-				xcount = bcount & 0xffff;
-				if (is_trm290_chipset)
-					xcount = ((xcount >> 2) - 1) << 16;
-				if (xcount == 0x0000) {
-					/* 
-					 * Most chipsets correctly interpret a length of 0x0000 as 64KB,
-					 * but at least one (e.g. CS5530) misinterprets it as zero (!).
-					 * So here we break the 64KB entry into two 32KB entries instead.
-					 */
-					if (count++ >= PRD_ENTRIES) {
-						printk("%s: DMA table too small\n", drive->name);
-						goto use_pio_instead;
-					}
-					*table++ = cpu_to_le32(0x8000);
-					*table++ = cpu_to_le32(cur_addr + 0x8000);
-					xcount = 0x8000;
-				}
-				*table++ = cpu_to_le32(xcount);
-				cur_addr += bcount;
-				cur_len -= bcount;
+			u32 xcount, bcount = 0x10000 - (cur_addr & 0xffff);
+			
+			if (count++ >= PRD_ENTRIES)
+				BUG();
+
+			if (bcount > cur_len)
+				bcount = cur_len;
+			*table++ = cpu_to_le32(cur_addr);
+			xcount = bcount & 0xffff;
+			if (is_trm290_chipset)
+				xcount = ((xcount >> 2) - 1) << 16;
+			if (xcount == 0x0000) {
+				/* 
+				 * Most chipsets correctly interpret a length
+				 * of 0x0000 as 64KB, but at least one
+				 * (e.g. CS5530) misinterprets it as zero (!).
+				 * So here we break the 64KB entry into two
+				 * 32KB entries instead.
+				 */
+				if (count++ >= PRD_ENTRIES)
+					goto use_pio_instead;
+
+				*table++ = cpu_to_le32(0x8000);
+				*table++ = cpu_to_le32(cur_addr + 0x8000);
+				xcount = 0x8000;
 			}
+			*table++ = cpu_to_le32(xcount);
+			cur_addr += bcount;
+			cur_len -= bcount;
 		}
 
 		sg++;
@@ -342,7 +350,7 @@
 	}
 	printk("%s: empty DMA table?\n", drive->name);
 use_pio_instead:
-	pci_unmap_sg(HWIF(drive)->pci_dev,
+	pci_unmap_sgl(HWIF(drive)->pci_dev,
 		     HWIF(drive)->sg_table,
 		     HWIF(drive)->sg_nents,
 		     HWIF(drive)->sg_dma_direction);
@@ -354,10 +362,10 @@
 void ide_destroy_dmatable (ide_drive_t *drive)
 {
 	struct pci_dev *dev = HWIF(drive)->pci_dev;
-	struct scatterlist *sg = HWIF(drive)->sg_table;
+	struct sg_list *sg = HWIF(drive)->sg_table;
 	int nents = HWIF(drive)->sg_nents;
 
-	pci_unmap_sg(dev, sg, nents, HWIF(drive)->sg_dma_direction);
+	pci_unmap_sgl(dev, sg, nents, HWIF(drive)->sg_dma_direction);
 	HWIF(drive)->sg_dma_active = 0;
 }
 
@@ -512,6 +520,20 @@
 }
 #endif /* CONFIG_BLK_DEV_IDEDMA_TIMEOUT */
 
+#ifdef CONFIG_HIGHMEM
+static inline void ide_toggle_bounce(ide_drive_t *drive, int on)
+{
+	unsigned long addr = BLK_BOUNCE_HIGH;
+
+	if (on && drive->media == ide_disk && HWIF(drive)->highmem)
+		addr = BLK_BOUNCE_4G;
+
+	blk_queue_bounce_limit(&drive->queue, addr);
+}
+#else
+#define ide_toggle_bounce(drive, on)
+#endif
+
 /*
  * ide_dmaproc() initiates/aborts DMA read/write operations on a drive.
  *
@@ -534,18 +556,20 @@
 	ide_hwif_t *hwif		= HWIF(drive);
 	unsigned long dma_base		= hwif->dma_base;
 	byte unit			= (drive->select.b.unit & 0x01);
-	unsigned int count, reading	= 0;
+	unsigned int count, reading = 0, set_high = 1;
 	byte dma_stat;
 
 	switch (func) {
 		case ide_dma_off:
 			printk("%s: DMA disabled\n", drive->name);
+			set_high = 0;
 		case ide_dma_off_quietly:
 			outb(inb(dma_base+2) & ~(1<<(5+unit)), dma_base+2);
 		case ide_dma_on:
 			drive->using_dma = (func == ide_dma_on);
 			if (drive->using_dma)
 				outb(inb(dma_base+2)|(1<<(5+unit)), dma_base+2);
+			ide_toggle_bounce(drive, set_high);
 			return 0;
 		case ide_dma_check:
 			return config_drive_for_dma (drive);
@@ -681,7 +705,7 @@
 	if (hwif->dmatable_cpu == NULL)
 		goto dma_alloc_failure;
 
-	hwif->sg_table = kmalloc(sizeof(struct scatterlist) * PRD_ENTRIES,
+	hwif->sg_table = kmalloc(sizeof(struct sg_list) * PRD_ENTRIES,
 				 GFP_KERNEL);
 	if (hwif->sg_table == NULL) {
 		pci_free_consistent(hwif->pci_dev, PRD_ENTRIES * PRD_BYTES,
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/ide/pdc202xx.c linux/drivers/ide/pdc202xx.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/ide/pdc202xx.c	Wed Aug 15 09:19:17 2001
+++ linux/drivers/ide/pdc202xx.c	Wed Aug 15 09:14:26 2001
@@ -891,6 +891,7 @@
 #ifdef CONFIG_BLK_DEV_IDEDMA
 	if (hwif->dma_base) {
 		hwif->dmaproc = &pdc202xx_dmaproc;
+		hwif->highmem = 1;
 		if (!noautodma)
 			hwif->autodma = 1;
 	} else {
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/ide/piix.c linux/drivers/ide/piix.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/ide/piix.c	Wed Aug 15 09:19:17 2001
+++ linux/drivers/ide/piix.c	Wed Aug 15 09:14:26 2001
@@ -521,6 +521,7 @@
 	if (!hwif->dma_base)
 		return;
 
+	hwif->highmem = 1;
 #ifndef CONFIG_BLK_DEV_IDEDMA
 	hwif->autodma = 0;
 #else /* CONFIG_BLK_DEV_IDEDMA */
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/aic7xxx/aic7xxx_linux_host.h linux/drivers/scsi/aic7xxx/aic7xxx_linux_host.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/aic7xxx/aic7xxx_linux_host.h	Sat May  5 00:16:28 2001
+++ linux/drivers/scsi/aic7xxx/aic7xxx_linux_host.h	Wed Aug 15 09:14:26 2001
@@ -81,7 +81,8 @@
 	present: 0,		/* number of 7xxx's present   */\
 	unchecked_isa_dma: 0,	/* no memory DMA restrictions */\
 	use_clustering: ENABLE_CLUSTERING,			\
-	use_new_eh_code: 1					\
+	use_new_eh_code: 1,					\
+	can_dma_32: 1						\
 }
 
 #endif /* _AIC7XXX_LINUX_HOST_H_ */
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/hosts.c linux/drivers/scsi/hosts.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/hosts.c	Thu Jul  5 20:28:17 2001
+++ linux/drivers/scsi/hosts.c	Wed Aug 15 09:14:26 2001
@@ -235,6 +235,7 @@
     retval->cmd_per_lun = tpnt->cmd_per_lun;
     retval->unchecked_isa_dma = tpnt->unchecked_isa_dma;
     retval->use_clustering = tpnt->use_clustering;   
+    retval->can_dma_32 = tpnt->can_dma_32;
 
     retval->select_queue_depths = tpnt->select_queue_depths;
     retval->max_sectors = tpnt->max_sectors;
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/hosts.h linux/drivers/scsi/hosts.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/hosts.h	Fri Jul 20 21:55:46 2001
+++ linux/drivers/scsi/hosts.h	Wed Aug 15 09:41:37 2001
@@ -291,6 +291,8 @@
      */
     unsigned emulated:1;
 
+    unsigned can_dma_32:1;
+
     /*
      * Name of proc directory
      */
@@ -390,6 +392,7 @@
     unsigned in_recovery:1;
     unsigned unchecked_isa_dma:1;
     unsigned use_clustering:1;
+    unsigned can_dma_32:1;
     /*
      * True if this host was loaded as a loadable module
      */
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/qlogicfc.h linux/drivers/scsi/qlogicfc.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/qlogicfc.h	Mon Jun 26 21:02:16 2000
+++ linux/drivers/scsi/qlogicfc.h	Wed Aug 15 09:14:26 2001
@@ -100,7 +100,8 @@
 	cmd_per_lun:		QLOGICFC_CMD_PER_LUN, 			   \
         present:                0,                                         \
         unchecked_isa_dma:      0,                                         \
-        use_clustering:         ENABLE_CLUSTERING 			   \
+        use_clustering:         ENABLE_CLUSTERING,			   \
+	can_dma_32:		1					   \
 }
 
 #endif /* _QLOGICFC_H */
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi.c linux/drivers/scsi/scsi.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi.c	Fri Jul 20 06:07:04 2001
+++ linux/drivers/scsi/scsi.c	Wed Aug 15 09:14:26 2001
@@ -178,10 +178,13 @@
  *              handler in the list - ultimately they call scsi_request_fn
  *              to do the dirty deed.
  */
-void  scsi_initialize_queue(Scsi_Device * SDpnt, struct Scsi_Host * SHpnt) {
-	blk_init_queue(&SDpnt->request_queue, scsi_request_fn);
-        blk_queue_headactive(&SDpnt->request_queue, 0);
-        SDpnt->request_queue.queuedata = (void *) SDpnt;
+void  scsi_initialize_queue(Scsi_Device * SDpnt, struct Scsi_Host * SHpnt)
+{
+	request_queue_t *q = &SDpnt->request_queue;
+
+	blk_init_queue(q, scsi_request_fn);
+	blk_queue_headactive(q, 0);
+	q->queuedata = (void *) SDpnt;
 }
 
 #ifdef MODULE
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi.h linux/drivers/scsi/scsi.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi.h	Fri Jul 20 21:55:46 2001
+++ linux/drivers/scsi/scsi.h	Wed Aug 15 09:41:37 2001
@@ -391,7 +391,7 @@
 #define CONTIGUOUS_BUFFERS(X,Y) \
 	(virt_to_phys((X)->b_data+(X)->b_size-1)+1==virt_to_phys((Y)->b_data))
 #else
-#define CONTIGUOUS_BUFFERS(X,Y) ((X->b_data+X->b_size) == Y->b_data)
+#define CONTIGUOUS_BUFFERS(X,Y) (bh_bus((X)) + (X)->b_size == bh_bus((Y)))
 #endif
 
 
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi_lib.c linux/drivers/scsi/scsi_lib.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi_lib.c	Wed Aug 15 09:19:18 2001
+++ linux/drivers/scsi/scsi_lib.c	Wed Aug 15 09:14:26 2001
@@ -388,6 +388,7 @@
 				req->nr_sectors -= nsect;
 
 				req->current_nr_sectors = bh->b_size >> 9;
+				req->hard_cur_sectors = req->current_nr_sectors;
 				if (req->nr_sectors < req->current_nr_sectors) {
 					req->nr_sectors = req->current_nr_sectors;
 					printk("scsi_end_request: buffer-list destroyed\n");
@@ -410,7 +411,6 @@
 
                 q = &SCpnt->device->request_queue;
 
-		req->buffer = bh->b_data;
 		/*
 		 * Bleah.  Leftovers again.  Stick the leftovers in
 		 * the front of the queue, and goose the queue again.
@@ -489,6 +489,8 @@
  */
 static void scsi_release_buffers(Scsi_Cmnd * SCpnt)
 {
+	struct request *req = &SCpnt->request;
+
 	ASSERT_LOCK(&io_request_lock, 0);
 
 	/*
@@ -507,9 +509,8 @@
 		}
 		scsi_free(SCpnt->request_buffer, SCpnt->sglist_len);
 	} else {
-		if (SCpnt->request_buffer != SCpnt->request.buffer) {
-			scsi_free(SCpnt->request_buffer, SCpnt->request_bufflen);
-		}
+		if (SCpnt->request_buffer != req->buffer)
+			scsi_free(SCpnt->request_buffer,SCpnt->request_bufflen);
 	}
 
 	/*
@@ -545,6 +546,7 @@
 	int result = SCpnt->result;
 	int this_count = SCpnt->bufflen >> 9;
 	request_queue_t *q = &SCpnt->device->request_queue;
+	struct request *req = &SCpnt->request;
 
 	/*
 	 * We must do one of several things here:
@@ -574,7 +576,7 @@
 
 		for (i = 0; i < SCpnt->use_sg; i++) {
 			if (sgpnt[i].alt_address) {
-				if (SCpnt->request.cmd == READ) {
+				if (req->cmd == READ) {
 					memcpy(sgpnt[i].alt_address, 
 					       sgpnt[i].address,
 					       sgpnt[i].length);
@@ -584,10 +586,12 @@
 		}
 		scsi_free(SCpnt->buffer, SCpnt->sglist_len);
 	} else {
-		if (SCpnt->buffer != SCpnt->request.buffer) {
-			if (SCpnt->request.cmd == READ) {
-				memcpy(SCpnt->request.buffer, SCpnt->buffer,
-				       SCpnt->bufflen);
+		if (SCpnt->buffer != req->buffer) {
+			if (req->cmd == READ) {
+				unsigned long flags;
+				char *to = bh_kmap_irq(req->bh, &flags);
+				memcpy(to, SCpnt->buffer, SCpnt->bufflen);
+				bh_kunmap_irq(to, &flags);
 			}
 			scsi_free(SCpnt->buffer, SCpnt->bufflen);
 		}
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi_merge.c linux/drivers/scsi/scsi_merge.c
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/scsi_merge.c	Thu Jul  5 20:28:17 2001
+++ linux/drivers/scsi/scsi_merge.c	Wed Aug 15 09:14:26 2001
@@ -6,6 +6,7 @@
  *                        Based upon conversations with large numbers
  *                        of people at Linux Expo.
  *	Support for dynamic DMA mapping: Jakub Jelinek (jakub@redhat.com).
+ *	Support for highmem I/O: Jens Axboe <axboe@suse.de>
  */
 
 /*
@@ -95,7 +96,7 @@
 		printk("Segment 0x%p, blocks %d, addr 0x%lx\n",
 		       bh,
 		       bh->b_size >> 9,
-		       virt_to_phys(bh->b_data - 1));
+		       bh_bus(bh) - 1);
 	}
 	panic("Ththththaats all folks.  Too dangerous to continue.\n");
 }
@@ -223,8 +224,7 @@
 			 * DMA capable host, make sure that a segment doesn't span
 			 * the DMA threshold boundary.  
 			 */
-			if (dma_host &&
-			    virt_to_phys(bhnext->b_data) - 1 == ISA_DMA_THRESHOLD) {
+			if (dma_host && bh_bus(bhnext) - 1 == ISA_DMA_THRESHOLD) {
 				ret++;
 				reqsize = bhnext->b_size;
 			} else if (CONTIGUOUS_BUFFERS(bh, bhnext)) {
@@ -241,8 +241,7 @@
 				 * kind of screwed and we need to start
 				 * another segment.
 				 */
-				if( dma_host
-				    && virt_to_phys(bh->b_data) - 1 >= ISA_DMA_THRESHOLD
+				if( dma_host && bh_bus(bh) - 1 >= ISA_DMA_THRESHOLD
 				    && reqsize + bhnext->b_size > PAGE_SIZE )
 				{
 					ret++;
@@ -304,7 +303,7 @@
 }
 
 #define MERGEABLE_BUFFERS(X,Y) \
-(((((long)(X)->b_data+(X)->b_size)|((long)(Y)->b_data)) & \
+(((((long)bh_bus((X))+(X)->b_size)|((long)bh_bus((Y)))) & \
   (DMA_CHUNK_SIZE - 1)) == 0)
 
 #ifdef DMA_CHUNK_SIZE
@@ -427,14 +426,11 @@
 		 * DMA capable host, make sure that a segment doesn't span
 		 * the DMA threshold boundary.  
 		 */
-		if (dma_host &&
-		    virt_to_phys(req->bhtail->b_data) - 1 == ISA_DMA_THRESHOLD) {
+		if (dma_host && bh_bus(req->bhtail) - 1 == ISA_DMA_THRESHOLD)
 			goto new_end_segment;
-		}
 		if (CONTIGUOUS_BUFFERS(req->bhtail, bh)) {
 #ifdef DMA_SEGMENT_SIZE_LIMITED
-			if( dma_host
-			    && virt_to_phys(bh->b_data) - 1 >= ISA_DMA_THRESHOLD ) {
+			if (dma_host && bh_bus(bh) - 1 >= ISA_DMA_THRESHOLD) {
 				segment_size = 0;
 				count = __count_segments(req, use_clustering, dma_host, &segment_size);
 				if( segment_size + bh->b_size > PAGE_SIZE ) {
@@ -486,14 +482,12 @@
 		 * DMA capable host, make sure that a segment doesn't span
 		 * the DMA threshold boundary. 
 		 */
-		if (dma_host &&
-		    virt_to_phys(bh->b_data) - 1 == ISA_DMA_THRESHOLD) {
+		if (dma_host && bh_bus(bh) - 1 == ISA_DMA_THRESHOLD) {
 			goto new_start_segment;
 		}
 		if (CONTIGUOUS_BUFFERS(bh, req->bh)) {
 #ifdef DMA_SEGMENT_SIZE_LIMITED
-			if( dma_host
-			    && virt_to_phys(bh->b_data) - 1 >= ISA_DMA_THRESHOLD ) {
+			if (dma_host && bh_bus(bh) - 1 >= ISA_DMA_THRESHOLD) {
 				segment_size = bh->b_size;
 				count = __count_segments(req, use_clustering, dma_host, &segment_size);
 				if( count != req->nr_segments ) {
@@ -652,10 +646,8 @@
 		 * DMA capable host, make sure that a segment doesn't span
 		 * the DMA threshold boundary.  
 		 */
-		if (dma_host &&
-		    virt_to_phys(req->bhtail->b_data) - 1 == ISA_DMA_THRESHOLD) {
+		if (dma_host && bh_bus(req->bhtail) - 1 == ISA_DMA_THRESHOLD)
 			goto dont_combine;
-		}
 #ifdef DMA_SEGMENT_SIZE_LIMITED
 		/*
 		 * We currently can only allocate scatter-gather bounce
@@ -663,7 +655,7 @@
 		 */
 		if (dma_host
 		    && CONTIGUOUS_BUFFERS(req->bhtail, next->bh)
-		    && virt_to_phys(req->bhtail->b_data) - 1 >= ISA_DMA_THRESHOLD )
+		    && bh_bus(req->bhtail) - 1 >= ISA_DMA_THRESHOLD )
 		{
 			int segment_size = 0;
 			int count = 0;
@@ -808,29 +800,6 @@
 	struct scatterlist * sgpnt;
 	int		     this_count;
 
-	/*
-	 * FIXME(eric) - don't inline this - it doesn't depend on the
-	 * integer flags.   Come to think of it, I don't think this is even
-	 * needed any more.  Need to play with it and see if we hit the
-	 * panic.  If not, then don't bother.
-	 */
-	if (!SCpnt->request.bh) {
-		/* 
-		 * Case of page request (i.e. raw device), or unlinked buffer 
-		 * Typically used for swapping, but this isn't how we do
-		 * swapping any more.
-		 */
-		panic("I believe this is dead code.  If we hit this, I was wrong");
-#if 0
-		SCpnt->request_bufflen = SCpnt->request.nr_sectors << 9;
-		SCpnt->request_buffer = SCpnt->request.buffer;
-		SCpnt->use_sg = 0;
-		/*
-		 * FIXME(eric) - need to handle DMA here.
-		 */
-#endif
-		return 1;
-	}
 	req = &SCpnt->request;
 	/*
 	 * First we need to know how many scatter gather segments are needed.
@@ -847,24 +816,16 @@
 	 * buffer.
 	 */
 	if (dma_host && scsi_dma_free_sectors <= 10) {
-		this_count = SCpnt->request.current_nr_sectors;
-		goto single_segment;
-	}
-	/*
-	 * Don't bother with scatter-gather if there is only one segment.
-	 */
-	if (count == 1) {
-		this_count = SCpnt->request.nr_sectors;
+		this_count = req->current_nr_sectors;
 		goto single_segment;
 	}
-	SCpnt->use_sg = count;
 
 	/* 
 	 * Allocate the actual scatter-gather table itself.
 	 * scsi_malloc can only allocate in chunks of 512 bytes 
 	 */
-	SCpnt->sglist_len = (SCpnt->use_sg
-			     * sizeof(struct scatterlist) + 511) & ~511;
+	SCpnt->use_sg = count;
+	SCpnt->sglist_len = (count * sizeof(struct scatterlist) + 511) & ~511;
 
 	sgpnt = (struct scatterlist *) scsi_malloc(SCpnt->sglist_len);
 
@@ -877,7 +838,7 @@
 		 * simply write the first buffer all by itself.
 		 */
 		printk("Warning - running *really* short on DMA buffers\n");
-		this_count = SCpnt->request.current_nr_sectors;
+		this_count = req->current_nr_sectors;
 		goto single_segment;
 	}
 	/* 
@@ -889,11 +850,9 @@
 	SCpnt->request_bufflen = 0;
 	bhprev = NULL;
 
-	for (count = 0, bh = SCpnt->request.bh;
-	     bh; bh = bh->b_reqnext) {
+	for (count = 0, bh = req->bh; bh; bh = bh->b_reqnext) {
 		if (use_clustering && bhprev != NULL) {
-			if (dma_host &&
-			    virt_to_phys(bhprev->b_data) - 1 == ISA_DMA_THRESHOLD) {
+			if (dma_host && bh_bus(bhprev) - 1 == ISA_DMA_THRESHOLD) {
 				/* Nothing - fall through */
 			} else if (CONTIGUOUS_BUFFERS(bhprev, bh)) {
 				/*
@@ -904,7 +863,7 @@
 				 */
 				if( dma_host ) {
 #ifdef DMA_SEGMENT_SIZE_LIMITED
-					if( virt_to_phys(bh->b_data) - 1 < ISA_DMA_THRESHOLD
+					if (bh_bus(bh) - 1 < ISA_DMA_THRESHOLD
 					    || sgpnt[count - 1].length + bh->b_size <= PAGE_SIZE ) {
 						sgpnt[count - 1].length += bh->b_size;
 						bhprev = bh;
@@ -923,12 +882,12 @@
 				}
 			}
 		}
-		count++;
-		sgpnt[count - 1].address = bh->b_data;
-		sgpnt[count - 1].length += bh->b_size;
-		if (!dma_host) {
+
+		set_bh_sg(&sgpnt[count], bh);
+		if (!dma_host)
 			SCpnt->request_bufflen += bh->b_size;
-		}
+
+		count++;
 		bhprev = bh;
 	}
 
@@ -951,6 +910,10 @@
 	for (i = 0; i < count; i++) {
 		sectors = (sgpnt[i].length >> 9);
 		SCpnt->request_bufflen += sgpnt[i].length;
+		/*
+		 * only done for dma_host, in which case .page is not
+		 * set since it's guaranteed to be a low memory page
+		 */
 		if (virt_to_phys(sgpnt[i].address) + sgpnt[i].length - 1 >
 		    ISA_DMA_THRESHOLD) {
 			if( scsi_dma_free_sectors - sectors <= 10  ) {
@@ -986,7 +949,7 @@
 				}
 				break;
 			}
-			if (SCpnt->request.cmd == WRITE) {
+			if (req->cmd == WRITE) {
 				memcpy(sgpnt[i].address, sgpnt[i].alt_address,
 				       sgpnt[i].length);
 			}
@@ -1031,8 +994,7 @@
 	 * single-block requests if we had hundreds of free sectors.
 	 */
 	if( scsi_dma_free_sectors > 30 ) {
-		for (this_count = 0, bh = SCpnt->request.bh;
-		     bh; bh = bh->b_reqnext) {
+		for (this_count = 0, bh = req->bh; bh; bh = bh->b_reqnext) {
 			if( scsi_dma_free_sectors - this_count < 30 
 			    || this_count == sectors )
 			{
@@ -1045,7 +1007,7 @@
 		/*
 		 * Yow!   Take the absolute minimum here.
 		 */
-		this_count = SCpnt->request.current_nr_sectors;
+		this_count = req->current_nr_sectors;
 	}
 
 	/*
@@ -1058,28 +1020,31 @@
 	 * segment.  Possibly the entire request, or possibly a small
 	 * chunk of the entire request.
 	 */
-	bh = SCpnt->request.bh;
-	buff = SCpnt->request.buffer;
+	bh = req->bh;
+	buff = req->buffer = bh->b_data;
 
-	if (dma_host) {
+	if (dma_host || PageHighMem(bh->b_page)) {
 		/*
 		 * Allocate a DMA bounce buffer.  If the allocation fails, fall
 		 * back and allocate a really small one - enough to satisfy
 		 * the first buffer.
 		 */
-		if (virt_to_phys(SCpnt->request.bh->b_data)
-		    + (this_count << 9) - 1 > ISA_DMA_THRESHOLD) {
+		if (bh_bus(bh) + (this_count << 9) - 1 > ISA_DMA_THRESHOLD) {
 			buff = (char *) scsi_malloc(this_count << 9);
 			if (!buff) {
 				printk("Warning - running low on DMA memory\n");
-				this_count = SCpnt->request.current_nr_sectors;
+				this_count = req->current_nr_sectors;
 				buff = (char *) scsi_malloc(this_count << 9);
 				if (!buff) {
 					dma_exhausted(SCpnt, 0);
 				}
 			}
-			if (SCpnt->request.cmd == WRITE)
-				memcpy(buff, (char *) SCpnt->request.buffer, this_count << 9);
+			if (req->cmd == WRITE) {
+				unsigned long flags;
+				char *buf = bh_kmap_irq(bh, &flags);
+				memcpy(buff, buf, this_count << 9);
+				bh_kunmap_irq(buf, &flags);
+			}
 		}
 	}
 	SCpnt->request_bufflen = this_count << 9;
@@ -1127,14 +1092,6 @@
 	q = &SDpnt->request_queue;
 
 	/*
-	 * If the host has already selected a merge manager, then don't
-	 * pick a new one.
-	 */
-#if 0
-	if (q->back_merge_fn && q->front_merge_fn)
-		return;
-#endif
-	/*
 	 * If this host has an unlimited tablesize, then don't bother with a
 	 * merge manager.  The whole point of the operation is to make sure
 	 * that requests don't grow too large, and this host isn't picky.
@@ -1166,4 +1123,14 @@
 		q->merge_requests_fn = scsi_merge_requests_fn_dc;
 		SDpnt->scsi_init_io_fn = scsi_init_io_vdc;
 	}
+
+	/*
+	 * now enable highmem I/O, if appropriate
+	 */
+#ifdef CONFIG_HIGHMEM
+	if (SHpnt->can_dma_32 && (SDpnt->type == TYPE_DISK))
+		blk_queue_bounce_limit(q, BLK_BOUNCE_4G);
+	else
+		blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
+#endif
 }
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/drivers/scsi/sym53c8xx.h linux/drivers/scsi/sym53c8xx.h
--- /opt/kernel/linux-2.4.9-pre4/drivers/scsi/sym53c8xx.h	Fri Jul 20 21:56:08 2001
+++ linux/drivers/scsi/sym53c8xx.h	Wed Aug 15 09:24:32 2001
@@ -96,7 +96,8 @@
 			this_id:        7,			\
 			sg_tablesize:   SCSI_NCR_SG_TABLESIZE,	\
 			cmd_per_lun:    SCSI_NCR_CMD_PER_LUN,	\
-			use_clustering: DISABLE_CLUSTERING} 
+			use_clustering: DISABLE_CLUSTERING,	\
+			can_dma_32:	1}
 
 #else
 
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/fs/buffer.c linux/fs/buffer.c
--- /opt/kernel/linux-2.4.9-pre4/fs/buffer.c	Wed Aug 15 09:19:18 2001
+++ linux/fs/buffer.c	Wed Aug 15 09:14:26 2001
@@ -1327,13 +1327,11 @@
 	bh->b_page = page;
 	if (offset >= PAGE_SIZE)
 		BUG();
-	if (PageHighMem(page))
-		/*
-		 * This catches illegal uses and preserves the offset:
-		 */
-		bh->b_data = (char *)(0 + offset);
-	else
-		bh->b_data = page_address(page) + offset;
+	/*
+	 * ->virtual is NULL on highmem pages, so using page_address
+	 * on them still preserves the offset (and catches illegal uses)
+	 */
+	bh->b_data = page_address(page) + offset;
 }
 
 /*
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/asm-i386/kmap_types.h linux/include/asm-i386/kmap_types.h
--- /opt/kernel/linux-2.4.9-pre4/include/asm-i386/kmap_types.h	Thu Apr 12 21:11:39 2001
+++ linux/include/asm-i386/kmap_types.h	Wed Aug 15 09:14:26 2001
@@ -6,6 +6,7 @@
 	KM_BOUNCE_WRITE,
 	KM_SKB_DATA,
 	KM_SKB_DATA_SOFTIRQ,
+	KM_BH_IRQ,
 	KM_TYPE_NR
 };
 
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/linux/blkdev.h linux/include/linux/blkdev.h
--- /opt/kernel/linux-2.4.9-pre4/include/linux/blkdev.h	Wed Aug 15 09:19:05 2001
+++ linux/include/linux/blkdev.h	Wed Aug 15 09:25:18 2001
@@ -36,7 +36,7 @@
 	unsigned long hard_sector, hard_nr_sectors;
 	unsigned int nr_segments;
 	unsigned int nr_hw_segments;
-	unsigned long current_nr_sectors;
+	unsigned long current_nr_sectors, hard_cur_sectors;
 	void * special;
 	char * buffer;
 	struct completion * waiting;
@@ -110,6 +110,8 @@
 	 */
 	char			head_active;
 
+	struct page		*bounce_limit;
+
 	/*
 	 * Is meant to protect the queue in the future instead of
 	 * io_request_lock
@@ -122,6 +124,27 @@
 	wait_queue_head_t	wait_for_request;
 };
 
+extern unsigned long blk_max_low_pfn;
+
+#define BLK_BOUNCE_HIGH		(blk_max_low_pfn * PAGE_SIZE)
+#define BLK_BOUNCE_4G		PCI_MAX_DMA32
+
+extern void blk_queue_bounce_limit(request_queue_t *, unsigned long);
+
+#ifdef CONFIG_HIGHMEM
+extern struct buffer_head *create_bounce(int, struct buffer_head *);
+extern inline struct buffer_head *blk_queue_bounce(request_queue_t *q, int rw,
+						   struct buffer_head *bh)
+{
+	if (bh->b_page <= q->bounce_limit)
+		return bh;
+
+	return create_bounce(rw, bh);
+}
+#else
+#define blk_queue_bounce(q, rw, bh)	(bh)
+#endif
+
 struct blk_dev_struct {
 	/*
 	 * queue_proc has to be atomic
@@ -149,8 +172,7 @@
 extern void grok_partitions(struct gendisk *dev, int drive, unsigned minors, long size);
 extern void register_disk(struct gendisk *dev, kdev_t first, unsigned minors, struct block_device_operations *ops, long size);
 extern void generic_make_request(int rw, struct buffer_head * bh);
-extern request_queue_t *blk_get_queue(kdev_t dev);
-extern inline request_queue_t *__blk_get_queue(kdev_t dev);
+extern inline request_queue_t *blk_get_queue(kdev_t dev);
 extern void blkdev_release_request(struct request *);
 
 /*
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/linux/fs.h linux/include/linux/fs.h
--- /opt/kernel/linux-2.4.9-pre4/include/linux/fs.h	Wed Aug 15 09:19:19 2001
+++ linux/include/linux/fs.h	Wed Aug 15 09:25:13 2001
@@ -277,6 +277,8 @@
 
 #define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
 
+#define bh_bus(bh)		(page_to_bus((bh)->b_page) + bh_offset((bh)))
+
 extern void set_bh_page(struct buffer_head *bh, struct page *page, unsigned long offset);
 
 #define touch_buffer(bh)	SetPageReferenced(bh->b_page)
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/linux/highmem.h linux/include/linux/highmem.h
--- /opt/kernel/linux-2.4.9-pre4/include/linux/highmem.h	Fri Jul 20 21:52:18 2001
+++ linux/include/linux/highmem.h	Wed Aug 15 09:25:14 2001
@@ -13,8 +13,7 @@
 /* declarations for linux/mm/highmem.c */
 FASTCALL(unsigned int nr_free_highpages(void));
 
-extern struct buffer_head * create_bounce(int rw, struct buffer_head * bh_orig);
-
+extern struct buffer_head *create_bounce(int rw, struct buffer_head * bh_orig);
 
 static inline char *bh_kmap(struct buffer_head *bh)
 {
@@ -26,6 +25,42 @@
 	kunmap(bh->b_page);
 }
 
+/*
+ * remember to add offset! and never ever reenable interrupts between a
+ * bh_kmap_irq and bh_kunmap_irq!!
+ */
+static inline char *bh_kmap_irq(struct buffer_head *bh, unsigned long *flags)
+{
+	unsigned long addr;
+
+	__save_flags(*flags);
+
+	/*
+	 * could be low
+	 */
+	if (!PageHighMem(bh->b_page))
+		return bh->b_data;
+
+	/*
+	 * it's a highmem page
+	 */
+	__cli();
+	addr = (unsigned long) kmap_atomic(bh->b_page, KM_BH_IRQ);
+
+	if (addr & ~PAGE_MASK)
+		BUG();
+
+	return (char *) addr + bh_offset(bh);
+}
+
+static inline void bh_kunmap_irq(char *buffer, unsigned long *flags)
+{
+	unsigned long ptr = (unsigned long) buffer & PAGE_MASK;
+
+	kunmap_atomic((void *) ptr, KM_BH_IRQ);
+	__restore_flags(*flags);
+}
+
 #else /* CONFIG_HIGHMEM */
 
 static inline unsigned int nr_free_highpages(void) { return 0; }
@@ -37,8 +72,10 @@
 #define kmap_atomic(page,idx)		kmap(page)
 #define kunmap_atomic(page,idx)		kunmap(page)
 
-#define bh_kmap(bh)	((bh)->b_data)
-#define bh_kunmap(bh)	do { } while (0)
+#define bh_kmap(bh)			((bh)->b_data)
+#define bh_kunmap(bh)			do { } while (0)
+#define bh_kmap_irq(bh, flags)		((bh)->b_data)
+#define bh_kunmap_irq(bh, flags)	do { } while (0)
 
 #endif /* CONFIG_HIGHMEM */
 
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/include/linux/ide.h linux/include/linux/ide.h
--- /opt/kernel/linux-2.4.9-pre4/include/linux/ide.h	Wed Aug 15 09:19:19 2001
+++ linux/include/linux/ide.h	Wed Aug 15 09:28:25 2001
@@ -485,7 +485,7 @@
 	ide_dmaproc_t	*dmaproc;	/* dma read/write/abort routine */
 	unsigned int	*dmatable_cpu;	/* dma physical region descriptor table (cpu view) */
 	dma_addr_t	dmatable_dma;	/* dma physical region descriptor table (dma view) */
-	struct scatterlist *sg_table;	/* Scatter-gather list used to build the above */
+	struct sg_list	*sg_table;	/* Scatter-gather list used to build the above */
 	int sg_nents;			/* Current number of entries in it */
 	int sg_dma_direction;		/* dma transfer direction */
 	int sg_dma_active;		/* is it in use */
@@ -507,6 +507,7 @@
 	unsigned	reset      : 1;	/* reset after probe */
 	unsigned	autodma    : 1;	/* automatically try to enable DMA at boot */
 	unsigned	udma_four  : 1;	/* 1=ATA-66 capable, 0=default */
+	unsigned	highmem	   : 1; /* can do full 32-bit dma */
 	byte		channel;	/* for dual-port chips: 0=primary, 1=secondary */
 #ifdef CONFIG_BLK_DEV_IDEPCI
 	struct pci_dev	*pci_dev;	/* for pci chipsets */
@@ -812,6 +813,21 @@
 	ide_preempt,	/* insert rq in front of current request */
 	ide_end		/* insert rq at end of list, but don't wait for it */
 } ide_action_t;
+
+/*
+ * temporarily mapping a (possible) highmem bio
+ */
+#define ide_rq_offset(rq) (((rq)->hard_cur_sectors - (rq)->current_nr_sectors) << 9)
+
+extern inline void *ide_map_buffer(struct request *rq, unsigned long *flags)
+{
+	return bh_kmap_irq(rq->bh, flags) + ide_rq_offset(rq);
+}
+
+extern inline void ide_unmap_buffer(char *buffer, unsigned long *flags)
+{
+	bh_kunmap_irq(buffer, flags);
+}
 
 /*
  * This function issues a special IDE device request
diff -ur --exclude-from /home/axboe/exclude /opt/kernel/linux-2.4.9-pre4/kernel/ksyms.c linux/kernel/ksyms.c
--- /opt/kernel/linux-2.4.9-pre4/kernel/ksyms.c	Wed Aug 15 09:19:19 2001
+++ linux/kernel/ksyms.c	Wed Aug 15 09:14:26 2001
@@ -121,6 +121,8 @@
 EXPORT_SYMBOL(kunmap_high);
 EXPORT_SYMBOL(highmem_start_page);
 EXPORT_SYMBOL(create_bounce);
+EXPORT_SYMBOL(kmap_prot);
+EXPORT_SYMBOL(kmap_pte);
 #endif
 
 /* filesystem internal functions */

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-15  7:50 [patch] zero-bounce highmem I/O Jens Axboe
@ 2001-08-15  9:11 ` David S. Miller
  2001-08-15  9:17   ` Jens Axboe
                     ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-15  9:11 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel

   From: Jens Axboe <axboe@suse.de>
   Date: Wed, 15 Aug 2001 09:50:18 +0200
   
   Dave, comments on that?

I think the new-style sg_list is slightly overkill, too much
stuff.  You need much less, in fact, especially on x86.

Take include/linux/skbuff.h:skb_frag_struct, rename it to
sg_list and add a dma_addr_t.  You should need nothing else.
The bounce page, for example, is superfluous.

If you bounce, the bounce page can be determined later via the
dma_addr_t right?

   And Dave, should I add the 64-bit stuff I started again? :-)

Let me draft something up, and meanwhile you can think about
the changes I suggest above.  Ok?

Later,
David S. Miller
davem@redhat.com


* Re: [patch] zero-bounce highmem I/O
  2001-08-15  9:11 ` David S. Miller
@ 2001-08-15  9:17   ` Jens Axboe
  2001-08-15  9:26   ` Jens Axboe
  2001-08-15 10:22   ` David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-15  9:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 09:50:18 +0200
>    
>    Dave, comments on that?
> 
> I think the new-style sg_list is slightly overkill, too much
> stuff.  You need much less, in fact, especially on x86.
> 
> Take include/linux/skbuff.h:skb_frag_struct, rename it to
> sg_list and add a dma_addr_t.  You should need nothing else.
> The bounce page, for example, is superfluous.

Ok agreed, fine with me.

> If you bounce, the bounce page can be determined later via the
> dma_addr_t right?

That's true.

>    And Dave, should I add the 64-bit stuff I started again? :-)
> 
> Let me draft something up, and meanwhile you can think about
> the changes I suggest above.  Ok?

Ok done from my side :)

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-15  9:11 ` David S. Miller
  2001-08-15  9:17   ` Jens Axboe
@ 2001-08-15  9:26   ` Jens Axboe
  2001-08-15 10:22   ` David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-15  9:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 770 bytes --]

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 09:50:18 +0200
>    
>    Dave, comments on that?
> 
> I think the new-style sg_list is slightly overkill, too much
> stuff.  You need much less, in fact, especially on x86.
> 
> Take include/linux/skbuff.h:skb_frag_struct, rename it to
> sg_list and add a dma_addr_t.  You should need nothing else.
> The bounce page, for example, is superfluous.

Ok, here's an updated version. Maybe modulo the struct scatterlist
changes, I'd like to see this included in 2.4.x soonish. Or at least the
interface we agree on -- it'll make my life easier at least. And finally
provide driver authors with something not quite as stupid as struct
scatterlist.

-- 
Jens Axboe


[-- Attachment #2: pci-dma-high-2 --]
[-- Type: text/plain, Size: 4697 bytes --]

--- /opt/kernel/linux-2.4.9-pre4/include/asm-i386/pci.h	Wed Aug 15 09:19:05 2001
+++ linux/include/asm-i386/pci.h	Wed Aug 15 11:19:02 2001
@@ -28,6 +28,7 @@
 
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/highmem.h>
 #include <asm/scatterlist.h>
 #include <linux/string.h>
 #include <asm/io.h>
@@ -84,6 +85,27 @@
 	/* Nothing to do */
 }
 
+/*
+ * pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
+ * to pci_map_single, but takes a struct page instead of a virtual address
+ */
+extern inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
+				      size_t size, int offset, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	return (page - mem_map) * PAGE_SIZE + offset;
+}
+
+extern inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+				  size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scather-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -102,8 +124,26 @@
 static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
 			     int nents, int direction)
 {
+	int i;
+
 	if (direction == PCI_DMA_NONE)
 		BUG();
+
+	/*
+	 * temporary 2.4 hack
+	 */
+	for (i = 0; i < nents; i++ ) {
+		if (sg[i].address && sg[i].page)
+			BUG();
+		else if (!sg[i].address && !sg[i].page)
+			BUG();
+
+		if (sg[i].page)
+			sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
+		else
+			sg[i].dma_address = virt_to_bus(sg[i].address);
+	}
+
 	return nents;
 }
 
@@ -119,6 +159,32 @@
 	/* Nothing to do */
 }
 
+/*
+ * meant to replace the pci_map_sg api, new drivers should use this
+ * interface
+ */
+extern inline int pci_map_sgl(struct pci_dev *hwdev, struct sg_list *sg,
+			      int nents, int direction)
+{
+	int i;
+
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	for (i = 0; i < nents; i++)
+		sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
+
+	return nents;
+}
+
+extern inline void pci_unmap_sgl(struct pci_dev *hwdev, struct sg_list *sg,
+				 int nents, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
  *
@@ -173,10 +239,9 @@
 /* These macros should be used after a pci_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries pci_map_sg
- * returns, or alternatively stop on the first sg_dma_len(sg) which
- * is 0.
+ * returns.
  */
-#define sg_dma_address(sg)	(virt_to_bus((sg)->address))
+#define sg_dma_address(sg)	((sg)->dma_address)
 #define sg_dma_len(sg)		((sg)->length)
 
 /* Return the index of the PCI controller for device. */
--- /opt/kernel/linux-2.4.9-pre4/include/asm-i386/scatterlist.h	Mon Dec 30 12:01:10 1996
+++ linux/include/asm-i386/scatterlist.h	Wed Aug 15 11:18:29 2001
@@ -1,12 +1,52 @@
 #ifndef _I386_SCATTERLIST_H
 #define _I386_SCATTERLIST_H
 
+/*
+ * temporary measure, include a page and offset.
+ */
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
+    struct page * page; /* Location for highmem page, if any */
+    char *  address;    /* Location data is to be transferred to, NULL for
+			 * highmem page */
     char * alt_address; /* Location of actual if address is a 
 			 * dma indirect buffer.  NULL otherwise */
+    dma_addr_t dma_address;
     unsigned int length;
+    unsigned int offset;/* for highmem, page offset */
 };
+
+/*
+ * new style scatter gather list -- move to this completely?
+ */
+struct sg_list {
+	/*
+	 * input
+	 */
+	struct page *page;
+	__u16 length;
+	__u16 offset;
+
+	/*
+	 * output -- mapped address. either directly mapped from ->page
+	 * above, or possibly a bounce address
+	 */
+	dma_addr_t dma_address;
+};
+
+extern inline void set_bh_sg(struct scatterlist *sg, struct buffer_head *bh)
+{
+	if (PageHighMem(bh->b_page)) {
+		sg->page = bh->b_page;
+		sg->offset = bh_offset(bh);
+		sg->address = NULL;
+	} else {
+		sg->page = NULL;
+		sg->offset = 0;
+		sg->address = bh->b_data;
+	}
+
+	sg->length = bh->b_size;
+}
 
 #define ISA_DMA_THRESHOLD (0x00ffffff)
 
--- /opt/kernel/linux-2.4.9-pre4/include/linux/pci.h	Fri Jul 20 21:52:38 2001
+++ linux/include/linux/pci.h	Wed Aug 15 11:19:10 2001
@@ -314,6 +314,8 @@
 #define PCI_DMA_FROMDEVICE	2
 #define PCI_DMA_NONE		3
 
+#define PCI_MAX_DMA32		(0xffffffff)
+
 #define DEVICE_COUNT_COMPATIBLE	4
 #define DEVICE_COUNT_IRQ	2
 #define DEVICE_COUNT_DMA	2

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-15  9:11 ` David S. Miller
  2001-08-15  9:17   ` Jens Axboe
  2001-08-15  9:26   ` Jens Axboe
@ 2001-08-15 10:22   ` David S. Miller
  2001-08-15 11:13     ` Jens Axboe
                       ` (3 more replies)
  2 siblings, 4 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-15 10:22 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Wed, 15 Aug 2001 11:26:21 +0200
   
   Ok, here's an updated version. Maybe modulo the struct scatterlist
   changes, I'd like to see this included in 2.4.x soonish. Or at least the
   interface we agree on -- it'll make my life easier at least. And finally
   provide driver authors with something not quite as stupid as struct
   scatterlist.

Jens, have a look at the patch I have below.  What do you
think about it?  Specifically the set of interfaces.

Andrea, I am very much interested in your input as well.

I would like to kill two birds with one stone here if
we can.   The x86 versions of the asm/pci.h and
asm/scatterlist.h bits are pretty mindless and left as
an exercise to the reader :-)

--- drivers/pci/pci.c.~1~	Mon Aug 13 22:05:39 2001
+++ drivers/pci/pci.c	Wed Aug 15 02:55:37 2001
@@ -832,7 +832,7 @@
 }
 
 int
-pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask)
+pci_set_dma_mask(struct pci_dev *dev, u64 mask)
 {
     if(! pci_dma_supported(dev, mask))
         return -EIO;
@@ -842,6 +842,12 @@
     return 0;
 }
     
+void
+pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off)
+{
+	dev->dma_flags |= on;
+	dev->dma_flags &= ~off;
+}
 
 /*
  * Translate the low bits of the PCI base
--- include/asm-sparc64/pci.h.~1~	Tue Aug 14 21:31:07 2001
+++ include/asm-sparc64/pci.h	Wed Aug 15 03:18:33 2001
@@ -28,6 +28,13 @@
 /* Dynamic DMA mapping stuff.
  */
 
+/* PCI 64-bit addressing works for all slots on all controller
+ * types on sparc64.  However, it requires that the device
+ * can drive enough of the 64 bits.
+ */
+#define pci_dac_cycles_ok(pci_dev) \
+	(((pci_dev)->dma_mask & PCI64_ADDR_BASE) == PCI64_ADDR_BASE)
+
 #include <asm/scatterlist.h>
 
 struct pci_dev;
@@ -64,6 +71,20 @@
  */
 extern void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr, size_t size, int direction);
 
+/* No highmem on sparc64, plus we have an IOMMU, so mapping pages is easy. */
+#define pci_map_page(dev, page, off, size, dir) \
+	pci_map_single(dev, (page_address(page) + (off)), size, dir)
+#define pci_unmap_page(dev,addr,sz,dir) pci_unmap_single(dev,addr,sz,dir)
+
+/* The 64-bit cases might have to do something interesting if
+ * PCI_DMA_FLAG_HUGE_MAPS is set in hwdev->dma_flags.
+ */
+extern dma64_addr_t pci64_map_page(struct pci_dev *hwdev,
+				   struct page *page, unsigned long offset,
+				   size_t size, int direction);
+extern void pci64_unmap_page(struct pci_dev *hwdev, dma64_addr_t dma_addr,
+			     size_t size, int direction);
+
 /* Map a set of buffers described by scatterlist in streaming
 * mode for DMA.  This is the scatter-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -79,13 +100,19 @@
  * Device ownership issues as mentioned above for pci_map_single are
  * the same here.
  */
-extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nents, int direction);
+extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		      int nents, int direction);
+extern int pci64_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			int nents, int direction);
 
 /* Unmap a set of streaming mode DMA translations.
  * Again, cpu read rules concerning calls here are the same as for
  * pci_unmap_single() above.
  */
-extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nhwents, int direction);
+extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			 int nhwents, int direction);
+extern void pci64_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			   int nhwents, int direction);
 
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
@@ -96,7 +123,10 @@
  * next point you give the PCI dma address back to the card, the
  * device again owns the buffer.
  */
-extern void pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle, size_t size, int direction);
+extern void pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+				size_t size, int direction);
+extern void pci64_dma_sync_single(struct pci_dev *hwdev, dma64_addr_t dma_handle,
+				  size_t size, int direction);
 
 /* Make physical memory consistent for a set of streaming
  * mode DMA translations after a transfer.
@@ -105,13 +135,14 @@
  * same rules and usage.
  */
 extern void pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nelems, int direction);
+extern void pci64_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nelems, int direction);
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask);
+extern int pci_dma_supported(struct pci_dev *hwdev, u64 mask);
 
 /* Return the index of the PCI controller for device PDEV. */
 
--- include/asm-sparc64/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ include/asm-sparc64/scatterlist.h	Wed Aug 15 03:10:44 2001
@@ -5,17 +5,26 @@
 #include <asm/page.h>
 
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
-    unsigned int length;
+	char *address;
+	char *alt_address;
 
-    __u32 dvma_address; /* A place to hang host-specific addresses at. */
-    __u32 dvma_length;
+	struct page *page;
+	unsigned int offset;
+	unsigned int length;
+
+	/* A place to hang host-specific addresses at. */
+	union {
+		dma_addr_t   dma32_address;
+		dma64_addr_t dma64_address;
+	} dma_u;
+#define dvma_address		dma_u.dma32_address
+
+	__u32 dvma_length;
 };
 
-#define sg_dma_address(sg) ((sg)->dvma_address)
-#define sg_dma_len(sg)     ((sg)->dvma_length)
+#define sg_dma_address(sg)	((sg)->dma_u.dma32_address)
+#define sg_dma64_address(sg)	((sg)->dma_u.dma64_address)
+#define sg_dma_len(sg)     	((sg)->dvma_length)
 
 #define ISA_DMA_THRESHOLD	(~0UL)
 
--- include/asm-sparc64/types.h.~1~	Tue Nov 28 08:33:08 2000
+++ include/asm-sparc64/types.h	Wed Aug 15 02:13:44 2001
@@ -45,9 +45,10 @@
 
 #define BITS_PER_LONG 64
 
-/* Dma addresses are 32-bits wide for now.  */
+/* Dma addresses come in 32-bit and 64-bit flavours.  */
 
 typedef u32 dma_addr_t;
+typedef u64 dma64_addr_t;
 
 #endif /* __KERNEL__ */
 
--- include/linux/pci.h.~1~	Tue Aug 14 21:31:11 2001
+++ include/linux/pci.h	Wed Aug 15 03:17:30 2001
@@ -314,6 +314,12 @@
 #define PCI_DMA_FROMDEVICE	2
 #define PCI_DMA_NONE		3
 
+/* These are the boolean attributes stored in pci_dev->dma_flags. */
+#define PCI_DMA_FLAG_HUGE_MAPS	0x00000001 /* Device may hold an enormous number
+					    * of mappings at once?
+					    */
+#define PCI_DMA_FLAG_ARCHMASK	0xf0000000 /* Reserved for arch-specific flags */
+
 #define DEVICE_COUNT_COMPATIBLE	4
 #define DEVICE_COUNT_IRQ	2
 #define DEVICE_COUNT_DMA	2
@@ -353,11 +359,12 @@
 
 	struct pci_driver *driver;	/* which driver has allocated this device */
 	void		*driver_data;	/* data private to the driver */
-	dma_addr_t	dma_mask;	/* Mask of the bits of bus address this
+	u64		dma_mask;	/* Mask of the bits of bus address this
 					   device implements.  Normally this is
 					   0xffffffff.  You only need to change
 					   this if your device has broken DMA
 					   or supports 64-bit transfers.  */
+	unsigned int	dma_flags;	/* See PCI_DMA_FLAG_* above */
 
 	u32             current_state;  /* Current operating state. In ACPI-speak,
 					   this is D0-D3, D0 being fully functional,
@@ -559,7 +566,8 @@
 int pci_enable_device(struct pci_dev *dev);
 void pci_disable_device(struct pci_dev *dev);
 void pci_set_master(struct pci_dev *dev);
-int pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask);
+int pci_set_dma_mask(struct pci_dev *dev, u64 mask);
+void pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off);
 int pci_assign_resource(struct pci_dev *dev, int i);
 
 /* Power management related routines */
@@ -641,7 +649,8 @@
 static inline int pci_enable_device(struct pci_dev *dev) { return -EIO; }
 static inline void pci_disable_device(struct pci_dev *dev) { }
 static inline int pci_module_init(struct pci_driver *drv) { return -ENODEV; }
-static inline int pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask) { return -EIO; }
+static inline int pci_set_dma_mask(struct pci_dev *dev, u64 mask) { return -EIO; }
+static inline void pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off) { }
 static inline int pci_assign_resource(struct pci_dev *dev, int i) { return -EBUSY;}
 static inline int pci_register_driver(struct pci_driver *drv) { return 0;}
 static inline void pci_unregister_driver(struct pci_driver *drv) { }


* Re: [patch] zero-bounce highmem I/O
  2001-08-15 10:22   ` David S. Miller
@ 2001-08-15 11:13     ` Jens Axboe
  2001-08-15 11:47     ` David S. Miller
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-15 11:13 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 11:26:21 +0200
>    
>    Ok, here's an updated version. Maybe modulo the struct scatterlist
>    changes, I'd like to see this included in 2.4.x soonish. Or at least the
>    interface we agree on -- it'll make my life easier at least. And finally
>    provide driver authors with something not quite as stupid as struct
>    scatterlist.
> 
> Jens, have a look at the patch I have below.  What do you
> think about it?  Specifically the set of interfaces.

Looks fine to me, exactly the interface I've used/wanted. But you want
to add the extra page/offset to the existing scatterlist, and not scrap
that one completely?

> we can.   The x86 versions of the asm/pci.h and
> asm/scatterlist.h bits are pretty mindless and left as
> an exercise to the reader :-)

:-)

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-15 10:22   ` David S. Miller
  2001-08-15 11:13     ` Jens Axboe
@ 2001-08-15 11:47     ` David S. Miller
  2001-08-15 12:07       ` Jens Axboe
  2001-08-15 12:35       ` David S. Miller
  2001-08-15 19:20     ` Gérard Roudier
  2001-08-16  8:12     ` David S. Miller
  3 siblings, 2 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-15 11:47 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Wed, 15 Aug 2001 13:13:35 +0200

   Looks fine to me, exactly the interface I've used/wanted. But you want
   to add the extra page/offset to the existing scatterlist, and not scrap
   that one completely?

The idea is that address/alt_address disappear sometime in 2.5.
Something like this, right?

BTW, on x86 we can ifdef the dma64_address to u32 or u64 based
upon CONFIG_HIGHMEM if we wish.

Later,
David S. Miller
davem@redhat.com


* Re: [patch] zero-bounce highmem I/O
  2001-08-15 11:47     ` David S. Miller
@ 2001-08-15 12:07       ` Jens Axboe
  2001-08-15 12:35       ` David S. Miller
  1 sibling, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-15 12:07 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 13:13:35 +0200
> 
>    Looks fine to me, exactly the interface I've used/wanted. But you want
>    to add the extra page/offset to the existing scatterlist, and not scrap
>    that one completely?
> 
> The idea is that address/alt_address disappear sometime in 2.5.
> Something like this, right?

Ok so you just want to turn scatterlist into what I call sg_list in 2.5
time, fine with me too. Depends on whether we want to keep the
pci_map_sg and struct scatterlist interface intact, or just break it and
tell driver authors they must fix their stuff regardless of whether they
want to support highmem. As I write this sentence, it's clear to me
which way is the superior :-)

> BTW, on x86 we can ifdef the dma64_address to u32 or u64 based
> upon CONFIG_HIGHMEM if we wish.

Yep. Want me to add in the x86 parts of your patch?

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-15 11:47     ` David S. Miller
  2001-08-15 12:07       ` Jens Axboe
@ 2001-08-15 12:35       ` David S. Miller
  2001-08-15 13:10         ` Jens Axboe
  2001-08-15 14:02         ` [patch] zero-bounce highmem I/O David S. Miller
  1 sibling, 2 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-15 12:35 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Wed, 15 Aug 2001 14:07:40 +0200

   Ok so you just want to turn scatterlist into what I call sg_list in 2.5
   time, fine with me too. Depends on whether we want to keep the
   pci_map_sg and struct scatterlist interface intact, or just break it and
   tell driver authors they must fix their stuff regardless of whether they
   want to support highmem. As I write this sentence, it's clear to me
   which way is the superior :-)
   
pci_map_sg is pci_map_sg, if the internal representation of
scatterlist is changed such that address/alt_address no longer exist,
it will work on pages only.  Right?  The compatibility mode in
2.4.x is the "if (address != NULL) virt_to_bus(address);" stuff.

Understand that the pci64_{map,unmap}_sg is created for a separate
purpose, independent of whether scatterlist has the backwards
compatibility stuff or not.  (There have been threads here about this,
I can describe it quickly for you in quiet if you want to know).

Two more things to consider:

1) There is nobody who cannot be search&replace converted from
   	sg->address = ptr
   into
	sg->page = virt_to_page(ptr)
	sg->offset = ((unsigned long)ptr & ~PAGE_MASK);

   The only truly problematic area is the alt_address thing.
   It would be a nice thing to rip this eyesore out of the scsi
   layer anyways.

2) I want to put scatterlist in to replace skb_frag_struct in skbuff.h
   and then have a zerocopy network driver do something like:

   	header_dma = pci_map_single(pdev, skb->data, skb->len, PCI_DMA_TODEVICE);
	data_nents = pci_map_sg(pdev, skb_shinfo(skb)->frag_list,
				skb_shinfo(skb)->nr_frags,
				PCI_DMA_TODEVICE);

See? :-)

   Yep. Want me to add in the x86 parts of your patch?

Please let me finish up my prototype with sparc64 building and
working, then I'll send you what I have ok?

Later,
David S. Miller
davem@redhat.com


* Re: [patch] zero-bounce highmem I/O
  2001-08-15 12:35       ` David S. Miller
@ 2001-08-15 13:10         ` Jens Axboe
  2001-08-15 14:25           ` David S. Miller
  2001-08-15 14:02         ` [patch] zero-bounce highmem I/O David S. Miller
  1 sibling, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2001-08-15 13:10 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 14:07:40 +0200
> 
>    Ok so you just want to turn scatterlist into what I call sg_list in 2.5
>    time, fine with me too. Depends on whether we want to keep the
>    pci_map_sg and struct scatterlist interface intact, or just break it and
>    tell driver authors they must fix their stuff regardless of whether they
>    want to support highmem. As I write this sentence, it's clear to me
>    which way is the superior :-)
>    
> pci_map_sg is pci_map_sg, if the internal representation of
> scatterlist is changed such that address/alt_address no longer exist,
> it will work on pages only.  Right?  The compatibility mode in

Yes, the pci_* was not my worry. It's the *address case you list below,
but I think you are right that it's not a big issue at that.

> 2.4.x is the "if (address != NULL) virt_to_bus(address);" stuff.

Yes, aka the horrible hack.

> Understand that the pci64_{map,unmap}_sg is created for a separate
> purpose, independent of whether scatterlist has the backwards
> compatibility stuff or not.  (There have been threads here about this,
> I can describe it quickly for you in quiet if you want to know).

I've read these now, been behind on traffic lately.

> Two more things to consider:
> 
> 1) There is nobody who cannot be search&replace converted from
>    	sg->address = ptr
>    into
> 	sg->page = virt_to_page(ptr)
> 	sg->offset = ((unsigned long)ptr & ~PAGE_MASK);
> 
>    The only truly problematic area is the alt_address thing.
>    It would be a nice thing to rip this eyesore out of the scsi
>    layer anyways.

The SCSI issue was exactly what was on my mind, and is indeed the reason
why I didn't go all the way and do a complete conversion there. The
SCSI layer is _not_ very clean in this regard, didn't exactly enjoy this
part of the work...

> 2) I want to put scatterlist in to replace skb_frag_struct in skbuff.h
>    and then have a zerocopy network driver do something like:
> 
>    	header_dma = pci_map_single(pdev, skb->data, skb->len, PCI_DMA_TODEVICE);
> 	data_nents = pci_map_sg(pdev, skb_shinfo(skb)->frag_list,
> 				skb_shinfo(skb)->nr_frags,
> 				PCI_DMA_TODEVICE);
> 
> See? :-)

Yep, I see where you are going :)

>    Yep. Want me to add in the x86 parts of your patch?
> 
> Please let me finish up my prototype with sparc64 building and
> working, then I'll send you what I have ok?

Fine

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-15 12:35       ` David S. Miller
  2001-08-15 13:10         ` Jens Axboe
@ 2001-08-15 14:02         ` David S. Miller
  2001-08-16  5:52           ` Jens Axboe
  1 sibling, 1 reply; 29+ messages in thread
From: David S. Miller @ 2001-08-15 14:02 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Wed, 15 Aug 2001 15:10:52 +0200

   On Wed, Aug 15 2001, David S. Miller wrote:
   >    The only truly problematic area is the alt_address thing.
   >    It would be a nice thing to rip this eyesore out of the scsi
   >    layer anyways.
   
   The SCSI issue was exactly what was on my mind, and is indeed the reason
   why I didn't go all the way and do a complete conversion there. The
   SCSI layer is _not_ very clean in this regard, didn't exactly enjoy this
   part of the work...
   
I just took a quick look at this, and I think I can make this
alt_address thing into a scsi-layer-specific mechanism and
thus be able to safely remove it from struct scatterlist.

Would you like me to whip up such a set of changes?  I'll be
more than happy to work on it.

   >    Yep. Want me to add in the x86 parts of your patch?
   > 
   > Please let me finish up my prototype with sparc64 building and
   > working, then I'll send you what I have ok?
   
   Fine
   
This is forthcoming.

Later,
David S. Miller
davem@redhat.com


* Re: [patch] zero-bounce highmem I/O
  2001-08-15 13:10         ` Jens Axboe
@ 2001-08-15 14:25           ` David S. Miller
  2001-08-16 11:51             ` Jens Axboe
                               ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-15 14:25 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: "David S. Miller" <davem@redhat.com>
   Date: Wed, 15 Aug 2001 07:02:04 -0700 (PDT)
   
      >    Yep. Want me to add in the x86 parts of your patch?
      > 
      > Please let me finish up my prototype with sparc64 building and
      > working, then I'll send you what I have ok?
      
      Fine
      
   This is forthcoming.

As promised.  I actually got bored during the build and tried to
quickly cook up the ix86 bits myself :-)

--- ./arch/alpha/kernel/pci_iommu.c.~1~	Sun Aug 12 23:50:25 2001
+++ ./arch/alpha/kernel/pci_iommu.c	Wed Aug 15 03:04:24 2001
@@ -636,7 +636,7 @@
    supported properly.  */
 
 int
-pci_dma_supported(struct pci_dev *pdev, dma_addr_t mask)
+pci_dma_supported(struct pci_dev *pdev, u64 mask)
 {
 	struct pci_controller *hose;
 	struct pci_iommu_arena *arena;
--- ./arch/sparc64/kernel/pci_iommu.c.~1~	Wed May 23 17:57:03 2001
+++ ./arch/sparc64/kernel/pci_iommu.c	Wed Aug 15 06:40:54 2001
@@ -356,6 +356,20 @@
 	return 0;
 }
 
+dma64_addr_t pci64_map_page(struct pci_dev *pdev,
+			    struct page *page, unsigned long offset,
+			    size_t sz, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS)) {
+		return (dma64_addr_t)
+			pci_map_single(pdev,
+				       page_address(page) + offset,
+				       sz, direction);
+	}
+
+	return PCI64_ADDR_BASE + (__pa(page_address(page)) + offset);
+}
+
 /* Unmap a single streaming mode DMA translation. */
 void pci_unmap_single(struct pci_dev *pdev, dma_addr_t bus_addr, size_t sz, int direction)
 {
@@ -378,7 +392,8 @@
 		((bus_addr - iommu->page_table_map_base) >> PAGE_SHIFT);
 #ifdef DEBUG_PCI_IOMMU
 	if (iopte_val(*base) == IOPTE_INVALID)
-		printk("pci_unmap_single called on non-mapped region %08x,%08x from %016lx\n", bus_addr, sz, __builtin_return_address(0));
+		printk("pci_unmap_single called on non-mapped region %08x,%08x from %016lx\n",
+		       bus_addr, sz, __builtin_return_address(0));
 #endif
 	bus_addr &= PAGE_MASK;
 
@@ -423,18 +438,38 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-static inline void fill_sg(iopte_t *iopte, struct scatterlist *sg, int nused, unsigned long iopte_protection)
+void pci64_unmap_page(struct pci_dev *pdev, dma64_addr_t bus_addr,
+		      size_t sz, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS)) {
+		if ((bus_addr >> 32) != (dma64_addr_t) 0)
+			BUG();
+
+		return pci_unmap_single(pdev, (dma_addr_t) bus_addr,
+					sz, direction);
+	}
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
+#define SG_ENT_PHYS_ADDRESS(SG)	\
+	((SG)->address ? \
+	 __pa((SG)->address) : \
+	 (__pa(page_address((SG)->page)) + (SG)->offset))
+
+static inline void fill_sg(iopte_t *iopte, struct scatterlist *sg,
+			   int nused, unsigned long iopte_protection)
 {
 	struct scatterlist *dma_sg = sg;
 	int i;
 
 	for (i = 0; i < nused; i++) {
 		unsigned long pteval = ~0UL;
-		u32 dma_npages;
+		u64 dma_npages;
 
-		dma_npages = ((dma_sg->dvma_address & (PAGE_SIZE - 1UL)) +
-			      dma_sg->dvma_length +
-			      ((u32)(PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
+		dma_npages = ((dma_sg->dma_address & (PAGE_SIZE - 1UL)) +
+			      dma_sg->dma_length +
+			      ((PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
 		do {
 			unsigned long offset;
 			signed int len;
@@ -447,7 +482,7 @@
 			for (;;) {
 				unsigned long tmp;
 
-				tmp = (unsigned long) __pa(sg->address);
+				tmp = SG_ENT_PHYS_ADDRESS(sg);
 				len = sg->length;
 				if (((tmp ^ pteval) >> PAGE_SHIFT) != 0UL) {
 					pteval = tmp & PAGE_MASK;
@@ -480,9 +515,9 @@
 			 * detect a page crossing event.
 			 */
 			while ((pteval << (64 - PAGE_SHIFT)) != 0UL &&
-			       pteval == __pa(sg->address) &&
+			       (pteval == SG_ENT_PHYS_ADDRESS(sg)) &&
 			       ((pteval ^
-				 (__pa(sg->address) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
+				 (SG_ENT_PHYS_ADDRESS(sg) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
 				pteval += sg->length;
 				sg++;
 			}
@@ -505,14 +540,19 @@
 	struct pci_strbuf *strbuf;
 	unsigned long flags, ctx, npages, iopte_protection;
 	iopte_t *base;
-	u32 dma_base;
+	u64 dma_base;
 	struct scatterlist *sgtmp;
 	int used;
 
 	/* Fast path single entry scatterlists. */
 	if (nelems == 1) {
-		sglist->dvma_address = pci_map_single(pdev, sglist->address, sglist->length, direction);
-		sglist->dvma_length = sglist->length;
+		sglist->dma_address = (dma64_addr_t)
+			pci_map_single(pdev,
+				       (sglist->address ?
+					sglist->address :
+					(page_address(sglist->page) + sglist->offset)),
+				       sglist->length, direction);
+		sglist->dma_length = sglist->length;
 		return 1;
 	}
 
@@ -540,8 +580,8 @@
 	used = nelems;
 
 	sgtmp = sglist;
-	while (used && sgtmp->dvma_length) {
-		sgtmp->dvma_address += dma_base;
+	while (used && sgtmp->dma_length) {
+		sgtmp->dma_address += dma_base;
 		sgtmp++;
 		used--;
 	}
@@ -574,6 +614,23 @@
 	return 0;
 }
 
+int pci64_map_sg(struct pci_dev *pdev, struct scatterlist *sg,
+		 int nelems, int direction)
+{
+	if ((pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS) != 0) {
+		int i;
+
+		for (i = 0; i < nelems; i++) {
+			sg[i].dma_address =
+				PCI64_ADDR_BASE + SG_ENT_PHYS_ADDRESS(&sg[i]);
+			sg[i].dma_length = sg[i].length;
+		}
+		return nelems;
+	}
+
+	return pci_map_sg(pdev, sg, nelems, direction);
+}
+
 /* Unmap a set of streaming mode DMA translations. */
 void pci_unmap_sg(struct pci_dev *pdev, struct scatterlist *sglist, int nelems, int direction)
 {
@@ -582,7 +639,7 @@
 	struct pci_strbuf *strbuf;
 	iopte_t *base;
 	unsigned long flags, ctx, i, npages;
-	u32 bus_addr;
+	u64 bus_addr;
 
 	if (direction == PCI_DMA_NONE)
 		BUG();
@@ -591,20 +648,21 @@
 	iommu = pcp->pbm->iommu;
 	strbuf = &pcp->pbm->stc;
 	
-	bus_addr = sglist->dvma_address & PAGE_MASK;
+	bus_addr = sglist->dma_address & PAGE_MASK;
 
 	for (i = 1; i < nelems; i++)
-		if (sglist[i].dvma_length == 0)
+		if (sglist[i].dma_length == 0)
 			break;
 	i--;
-	npages = (PAGE_ALIGN(sglist[i].dvma_address + sglist[i].dvma_length) - bus_addr) >> PAGE_SHIFT;
+	npages = (PAGE_ALIGN(sglist[i].dma_address + sglist[i].dma_length) - bus_addr) >> PAGE_SHIFT;
 
 	base = iommu->page_table +
 		((bus_addr - iommu->page_table_map_base) >> PAGE_SHIFT);
 
 #ifdef DEBUG_PCI_IOMMU
 	if (iopte_val(*base) == IOPTE_INVALID)
-		printk("pci_unmap_sg called on non-mapped region %08x,%d from %016lx\n", sglist->dvma_address, nelems, __builtin_return_address(0));
+		printk("pci_unmap_sg called on non-mapped region %016lx,%d from %016lx\n",
+		       sglist->dma_address, nelems, __builtin_return_address(0));
 #endif
 
 	spin_lock_irqsave(&iommu->lock, flags);
@@ -616,7 +674,7 @@
 
 	/* Step 1: Kick data out of streaming buffers if necessary. */
 	if (strbuf->strbuf_enabled) {
-		u32 vaddr = bus_addr;
+		u32 vaddr = (u32) bus_addr;
 
 		PCI_STC_FLUSHFLAG_INIT(strbuf);
 		if (strbuf->strbuf_ctxflush &&
@@ -648,6 +706,15 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+void pci64_unmap_sg(struct pci_dev *pdev, struct scatterlist *sglist,
+		    int nelems, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS))
+		return pci_unmap_sg(pdev, sglist, nelems, direction);
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
  */
@@ -709,6 +776,20 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+void pci64_dma_sync_single(struct pci_dev *pdev, dma64_addr_t bus_addr,
+			   size_t sz, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS)) {
+		if ((bus_addr >> 32) != (dma64_addr_t) 0)
+			BUG();
+
+		return pci_dma_sync_single(pdev, (dma_addr_t) bus_addr,
+					   sz, direction);
+	}
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
 /* Make physical memory consistent for a set of streaming
  * mode DMA translations after a transfer.
  */
@@ -735,7 +816,7 @@
 		iopte_t *iopte;
 
 		iopte = iommu->page_table +
-			((sglist[0].dvma_address - iommu->page_table_map_base) >> PAGE_SHIFT);
+			((sglist[0].dma_address - iommu->page_table_map_base) >> PAGE_SHIFT);
 		ctx = (iopte_val(*iopte) & IOPTE_CONTEXT) >> 47UL;
 	}
 
@@ -752,15 +833,15 @@
 		} while (((long)pci_iommu_read(matchreg)) < 0L);
 	} else {
 		unsigned long i, npages;
-		u32 bus_addr;
+		u64 bus_addr;
 
-		bus_addr = sglist[0].dvma_address & PAGE_MASK;
+		bus_addr = sglist[0].dma_address & PAGE_MASK;
 
 		for(i = 1; i < nelems; i++)
-			if (!sglist[i].dvma_length)
+			if (!sglist[i].dma_length)
 				break;
 		i--;
-		npages = (PAGE_ALIGN(sglist[i].dvma_address + sglist[i].dvma_length) - bus_addr) >> PAGE_SHIFT;
+		npages = (PAGE_ALIGN(sglist[i].dma_address + sglist[i].dma_length) - bus_addr) >> PAGE_SHIFT;
 		for (i = 0; i < npages; i++, bus_addr += PAGE_SIZE)
 			pci_iommu_write(strbuf->strbuf_pflush, bus_addr);
 	}
@@ -774,10 +855,19 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-int pci_dma_supported(struct pci_dev *pdev, dma_addr_t device_mask)
+void pci64_dma_sync_sg(struct pci_dev *pdev, struct scatterlist *sglist,
+		       int nelems, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS))
+		return pci_dma_sync_sg(pdev, sglist, nelems, direction);
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
+int pci_dma_supported(struct pci_dev *pdev, u64 device_mask)
 {
 	struct pcidev_cookie *pcp = pdev->sysdata;
-	u32 dma_addr_mask;
+	u64 dma_addr_mask;
 
 	if (pdev == NULL) {
 		dma_addr_mask = 0xffffffff;
--- ./arch/sparc64/kernel/iommu_common.c.~1~	Tue Nov 28 08:33:08 2000
+++ ./arch/sparc64/kernel/iommu_common.c	Wed Aug 15 05:13:34 2001
@@ -12,7 +12,7 @@
  */
 
 #ifdef VERIFY_SG
-int verify_lengths(struct scatterlist *sg, int nents, int npages)
+static int verify_lengths(struct scatterlist *sg, int nents, int npages)
 {
 	int sg_len, dma_len;
 	int i, pgcount;
@@ -22,8 +22,8 @@
 		sg_len += sg[i].length;
 
 	dma_len = 0;
-	for (i = 0; i < nents && sg[i].dvma_length; i++)
-		dma_len += sg[i].dvma_length;
+	for (i = 0; i < nents && sg[i].dma_length; i++)
+		dma_len += sg[i].dma_length;
 
 	if (sg_len != dma_len) {
 		printk("verify_lengths: Error, different, sg[%d] dma[%d]\n",
@@ -32,13 +32,13 @@
 	}
 
 	pgcount = 0;
-	for (i = 0; i < nents && sg[i].dvma_length; i++) {
+	for (i = 0; i < nents && sg[i].dma_length; i++) {
 		unsigned long start, end;
 
-		start = sg[i].dvma_address;
+		start = sg[i].dma_address;
 		start = start & PAGE_MASK;
 
-		end = sg[i].dvma_address + sg[i].dvma_length;
+		end = sg[i].dma_address + sg[i].dma_length;
 		end = (end + (PAGE_SIZE - 1)) & PAGE_MASK;
 
 		pgcount += ((end - start) >> PAGE_SHIFT);
@@ -55,15 +55,16 @@
 	return 0;
 }
 
-int verify_one_map(struct scatterlist *dma_sg, struct scatterlist **__sg, int nents, iopte_t **__iopte)
+static int verify_one_map(struct scatterlist *dma_sg, struct scatterlist **__sg, int nents, iopte_t **__iopte)
 {
 	struct scatterlist *sg = *__sg;
 	iopte_t *iopte = *__iopte;
-	u32 dlen = dma_sg->dvma_length;
-	u32 daddr = dma_sg->dvma_address;
+	u32 dlen = dma_sg->dma_length;
+	u32 daddr;
 	unsigned int sglen;
 	unsigned long sgaddr;
 
+	daddr = dma_sg->dma_address;
 	sglen = sg->length;
 	sgaddr = (unsigned long) sg->address;
 	while (dlen > 0) {
@@ -136,7 +137,7 @@
 	return nents;
 }
 
-int verify_maps(struct scatterlist *sg, int nents, iopte_t *iopte)
+static int verify_maps(struct scatterlist *sg, int nents, iopte_t *iopte)
 {
 	struct scatterlist *dma_sg = sg;
 	struct scatterlist *orig_dma_sg = dma_sg;
@@ -147,7 +148,7 @@
 		if (nents <= 0)
 			break;
 		dma_sg++;
-		if (dma_sg->dvma_length == 0)
+		if (dma_sg->dma_length == 0)
 			break;
 	}
 
@@ -174,14 +175,15 @@
 	    verify_maps(sg, nents, iopte) < 0) {
 		int i;
 
-		printk("verify_sglist: Crap, messed up mappings, dumping, iodma at %08x.\n",
-		       (u32) (sg->dvma_address & PAGE_MASK));
+		printk("verify_sglist: Crap, messed up mappings, dumping, iodma at ");
+		printk("%016lx.\n", sg->dma_address & PAGE_MASK);
+
 		for (i = 0; i < nents; i++) {
 			printk("sg(%d): address(%p) length(%x) "
-			       "dma_address[%08x] dma_length[%08x]\n",
+			       "dma_address[%016lx] dma_length[%08x]\n",
 			       i,
 			       sg[i].address, sg[i].length,
-			       sg[i].dvma_address, sg[i].dvma_length);
+			       sg[i].dma_address, sg[i].dma_length);
 		}
 	}
 
@@ -189,30 +191,23 @@
 }
 #endif
 
-/* Two addresses are "virtually contiguous" if and only if:
- * 1) They are equal, or...
- * 2) They are both on a page boundry
- */
-#define VCONTIG(__X, __Y)	(((__X) == (__Y)) || \
-				 (((__X) | (__Y)) << (64UL - PAGE_SHIFT)) == 0UL)
-
 unsigned long prepare_sg(struct scatterlist *sg, int nents)
 {
 	struct scatterlist *dma_sg = sg;
 	unsigned long prev;
-	u32 dent_addr, dent_len;
+	u64 dent_addr, dent_len;
 
 	prev  = (unsigned long) sg->address;
 	prev += (unsigned long) (dent_len = sg->length);
-	dent_addr = (u32) ((unsigned long)sg->address & (PAGE_SIZE - 1UL));
+	dent_addr = (u64) ((unsigned long)sg->address & (PAGE_SIZE - 1UL));
 	while (--nents) {
 		unsigned long addr;
 
 		sg++;
 		addr = (unsigned long) sg->address;
 		if (! VCONTIG(prev, addr)) {
-			dma_sg->dvma_address = dent_addr;
-			dma_sg->dvma_length = dent_len;
+			dma_sg->dma_address = dent_addr;
+			dma_sg->dma_length = dent_len;
 			dma_sg++;
 
 			dent_addr = ((dent_addr +
@@ -225,8 +220,8 @@
 		dent_len += sg->length;
 		prev = addr + sg->length;
 	}
-	dma_sg->dvma_address = dent_addr;
-	dma_sg->dvma_length = dent_len;
+	dma_sg->dma_address = dent_addr;
+	dma_sg->dma_length = dent_len;
 
 	return ((unsigned long) dent_addr +
 		(unsigned long) dent_len +
--- ./arch/sparc64/kernel/sbus.c.~1~	Wed May 23 17:57:03 2001
+++ ./arch/sparc64/kernel/sbus.c	Wed Aug 15 06:41:43 2001
@@ -376,6 +376,11 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+#define SG_ENT_PHYS_ADDRESS(SG)	\
+	((SG)->address ? \
+	 __pa((SG)->address) : \
+	 (__pa(page_address((SG)->page)) + (SG)->offset))
+
 static inline void fill_sg(iopte_t *iopte, struct scatterlist *sg, int nused, unsigned long iopte_bits)
 {
 	struct scatterlist *dma_sg = sg;
@@ -383,11 +388,11 @@
 
 	for (i = 0; i < nused; i++) {
 		unsigned long pteval = ~0UL;
-		u32 dma_npages;
+		u64 dma_npages;
 
-		dma_npages = ((dma_sg->dvma_address & (PAGE_SIZE - 1UL)) +
-			      dma_sg->dvma_length +
-			      ((u32)(PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
+		dma_npages = ((dma_sg->dma_address & (PAGE_SIZE - 1UL)) +
+			      dma_sg->dma_length +
+			      ((PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
 		do {
 			unsigned long offset;
 			signed int len;
@@ -400,7 +405,7 @@
 			for (;;) {
 				unsigned long tmp;
 
-				tmp = (unsigned long) __pa(sg->address);
+				tmp = (unsigned long) SG_ENT_PHYS_ADDRESS(sg);
 				len = sg->length;
 				if (((tmp ^ pteval) >> PAGE_SHIFT) != 0UL) {
 					pteval = tmp & PAGE_MASK;
@@ -433,9 +438,9 @@
 			 * detect a page crossing event.
 			 */
 			while ((pteval << (64 - PAGE_SHIFT)) != 0UL &&
-			       pteval == __pa(sg->address) &&
+			       (pteval == SG_ENT_PHYS_ADDRESS(sg)) &&
 			       ((pteval ^
-				 (__pa(sg->address) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
+				 (SG_ENT_PHYS_ADDRESS(sg) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
 				pteval += sg->length;
 				sg++;
 			}
@@ -461,8 +466,13 @@
 
 	/* Fast path single entry scatterlists. */
 	if (nents == 1) {
-		sg->dvma_address = sbus_map_single(sdev, sg->address, sg->length, dir);
-		sg->dvma_length = sg->length;
+		sg->dma_address = (dma64_addr_t)
+			sbus_map_single(sdev,
+					(sg->address ?
+					 sg->address :
+					 (page_address(sg->page) + sg->offset)),
+					sg->length, dir);
+		sg->dma_length = sg->length;
 		return 1;
 	}
 
@@ -478,8 +488,8 @@
 	sgtmp = sg;
 	used = nents;
 
-	while (used && sgtmp->dvma_length) {
-		sgtmp->dvma_address += dma_base;
+	while (used && sgtmp->dma_length) {
+		sgtmp->dma_address += dma_base;
 		sgtmp++;
 		used--;
 	}
@@ -507,22 +517,22 @@
 {
 	unsigned long size, flags;
 	struct sbus_iommu *iommu;
-	u32 dvma_base;
+	u64 dvma_base;
 	int i;
 
 	/* Fast path single entry scatterlists. */
 	if (nents == 1) {
-		sbus_unmap_single(sdev, sg->dvma_address, sg->dvma_length, direction);
+		sbus_unmap_single(sdev, sg->dma_address, sg->dma_length, direction);
 		return;
 	}
 
-	dvma_base = sg[0].dvma_address & PAGE_MASK;
+	dvma_base = sg[0].dma_address & PAGE_MASK;
 	for (i = 0; i < nents; i++) {
-		if (sg[i].dvma_length == 0)
+		if (sg[i].dma_length == 0)
 			break;
 	}
 	i--;
-	size = PAGE_ALIGN(sg[i].dvma_address + sg[i].dvma_length) - dvma_base;
+	size = PAGE_ALIGN(sg[i].dma_address + sg[i].dma_length) - dvma_base;
 
 	iommu = sdev->bus->iommu;
 	spin_lock_irqsave(&iommu->lock, flags);
@@ -547,16 +557,16 @@
 {
 	struct sbus_iommu *iommu = sdev->bus->iommu;
 	unsigned long flags, size;
-	u32 base;
+	u64 base;
 	int i;
 
-	base = sg[0].dvma_address & PAGE_MASK;
+	base = sg[0].dma_address & PAGE_MASK;
 	for (i = 0; i < nents; i++) {
-		if (sg[i].dvma_length == 0)
+		if (sg[i].dma_length == 0)
 			break;
 	}
 	i--;
-	size = PAGE_ALIGN(sg[i].dvma_address + sg[i].dvma_length) - base;
+	size = PAGE_ALIGN(sg[i].dma_address + sg[i].dma_length) - base;
 
 	spin_lock_irqsave(&iommu->lock, flags);
 	strbuf_flush(iommu, base, size >> PAGE_SHIFT);
--- ./arch/sparc64/kernel/iommu_common.h.~1~	Mon Aug 13 11:04:26 2001
+++ ./arch/sparc64/kernel/iommu_common.h	Wed Aug 15 07:00:20 2001
@@ -18,10 +18,7 @@
 #undef VERIFY_SG
 
 #ifdef VERIFY_SG
-int verify_lengths(struct scatterlist *sg, int nents, int npages);
-int verify_one_map(struct scatterlist *dma_sg, struct scatterlist **__sg, int nents, iopte_t **__iopte);
-int verify_maps(struct scatterlist *sg, int nents, iopte_t *iopte);
-void verify_sglist(struct scatterlist *sg, int nents, iopte_t *iopte, int npages);
+extern void verify_sglist(struct scatterlist *sg, int nents, iopte_t *iopte, int npages);
 #endif
 
 /* Two addresses are "virtually contiguous" if and only if:
@@ -31,4 +28,4 @@
 #define VCONTIG(__X, __Y)	(((__X) == (__Y)) || \
 				 (((__X) | (__Y)) << (64UL - PAGE_SHIFT)) == 0UL)
 
-unsigned long prepare_sg(struct scatterlist *sg, int nents);
+extern unsigned long prepare_sg(struct scatterlist *sg, int nents);
--- ./arch/sparc64/kernel/sparc64_ksyms.c.~1~	Mon Jun  4 20:39:50 2001
+++ ./arch/sparc64/kernel/sparc64_ksyms.c	Wed Aug 15 06:23:37 2001
@@ -216,10 +216,16 @@
 EXPORT_SYMBOL(pci_free_consistent);
 EXPORT_SYMBOL(pci_map_single);
 EXPORT_SYMBOL(pci_unmap_single);
+EXPORT_SYMBOL(pci64_map_page);
+EXPORT_SYMBOL(pci64_unmap_page);
 EXPORT_SYMBOL(pci_map_sg);
 EXPORT_SYMBOL(pci_unmap_sg);
+EXPORT_SYMBOL(pci64_map_sg);
+EXPORT_SYMBOL(pci64_unmap_sg);
 EXPORT_SYMBOL(pci_dma_sync_single);
+EXPORT_SYMBOL(pci64_dma_sync_single);
 EXPORT_SYMBOL(pci_dma_sync_sg);
+EXPORT_SYMBOL(pci64_dma_sync_sg);
 EXPORT_SYMBOL(pci_dma_supported);
 #endif
 
--- ./arch/parisc/kernel/ccio-dma.c.~1~	Sun Feb 11 23:53:07 2001
+++ ./arch/parisc/kernel/ccio-dma.c	Wed Aug 15 03:07:31 2001
@@ -638,7 +638,7 @@
 }
 
 
-static int ccio_dma_supported( struct pci_dev *dev, dma_addr_t mask)
+static int ccio_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	if (dev == NULL) {
 		printk(MODULE_NAME ": EISA/ISA/et al not supported\n");
--- ./arch/parisc/kernel/pci-dma.c.~1~	Sun Feb 11 23:53:07 2001
+++ ./arch/parisc/kernel/pci-dma.c	Wed Aug 15 03:07:50 2001
@@ -77,7 +77,7 @@
 static inline void dump_resmap(void) {;}
 #endif
 
-static int pa11_dma_supported( struct pci_dev *dev, dma_addr_t mask)
+static int pa11_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	return 1;
 }
--- ./arch/parisc/kernel/ccio-rm-dma.c.~1~	Wed Dec  6 05:30:33 2000
+++ ./arch/parisc/kernel/ccio-rm-dma.c	Wed Aug 15 03:07:42 2001
@@ -93,7 +93,7 @@
 }
 
 
-static int ccio_dma_supported( struct pci_dev *dev, dma_addr_t mask)  
+static int ccio_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	if (dev == NULL) {
 		printk(MODULE_NAME ": EISA/ISA/et al not supported\n");
--- ./arch/parisc/kernel/sba_iommu.c.~1~	Sun Feb 11 23:53:07 2001
+++ ./arch/parisc/kernel/sba_iommu.c	Wed Aug 15 03:07:56 2001
@@ -779,7 +779,7 @@
 }
 
 static int
-sba_dma_supported( struct pci_dev *dev, dma_addr_t mask)
+sba_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	if (dev == NULL) {
 		printk(MODULE_NAME ": EISA/ISA/et al not supported\n");
--- ./drivers/net/acenic.c.~1~	Mon Aug 13 09:55:44 2001
+++ ./drivers/net/acenic.c	Wed Aug 15 02:33:22 2001
@@ -202,6 +202,7 @@
 #define pci_free_consistent(cookie, size, ptr, dma_ptr)	kfree(ptr)
 #define pci_map_single(cookie, address, size, dir)	virt_to_bus(address)
 #define pci_unmap_single(cookie, address, size, dir)
+#define pci_set_dma_mask(dev, mask)		do { } while (0)
 #endif
 
 #if (LINUX_VERSION_CODE < 0x02032b)
@@ -258,11 +259,6 @@
 #define ace_mark_net_bh()			{do{} while(0);}
 #define ace_if_down(dev)			{do{} while(0);}
 #endif
-
-#ifndef pci_set_dma_mask
-#define pci_set_dma_mask(dev, mask)		dev->dma_mask = mask;
-#endif
-
 
 #if (LINUX_VERSION_CODE >= 0x02031b)
 #define NEW_NETINIT
--- ./drivers/pci/pci.c.~1~	Mon Aug 13 22:05:39 2001
+++ ./drivers/pci/pci.c	Wed Aug 15 06:22:34 2001
@@ -832,7 +832,7 @@
 }
 
 int
-pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask)
+pci_set_dma_mask(struct pci_dev *dev, u64 mask)
 {
     if(! pci_dma_supported(dev, mask))
         return -EIO;
@@ -842,6 +842,12 @@
     return 0;
 }
     
+void
+pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off)
+{
+	dev->dma_flags |= on;
+	dev->dma_flags &= ~off;
+}
 
 /*
  * Translate the low bits of the PCI base
@@ -1954,6 +1960,7 @@
 EXPORT_SYMBOL(pci_find_subsys);
 EXPORT_SYMBOL(pci_set_master);
 EXPORT_SYMBOL(pci_set_dma_mask);
+EXPORT_SYMBOL(pci_change_dma_flag);
 EXPORT_SYMBOL(pci_assign_resource);
 EXPORT_SYMBOL(pci_register_driver);
 EXPORT_SYMBOL(pci_unregister_driver);
--- ./drivers/scsi/sym53c8xx.c.~1~	Thu Jul  5 22:11:53 2001
+++ ./drivers/scsi/sym53c8xx.c	Wed Aug 15 02:34:39 2001
@@ -13101,7 +13101,7 @@
 		(int) (PciDeviceFn(pdev) & 7));
 
 #ifdef SCSI_NCR_DYNAMIC_DMA_MAPPING
-	if (pci_set_dma_mask(pdev, (dma_addr_t) (0xffffffffUL))) {
+	if (pci_set_dma_mask(pdev, 0xffffffff)) {
 		printk(KERN_WARNING NAME53C8XX
 		       "32 BIT PCI BUS DMA ADDRESSING NOT SUPPORTED\n");
 		return -1;
--- ./drivers/scsi/sym53c8xx_comm.h.~1~	Tue Aug 14 21:43:17 2001
+++ ./drivers/scsi/sym53c8xx_comm.h	Wed Aug 15 03:05:23 2001
@@ -2186,7 +2186,7 @@
 		(int) (PciDeviceFn(pdev) & 7));
 
 #ifdef SCSI_NCR_DYNAMIC_DMA_MAPPING
-	if (!pci_dma_supported(pdev, (dma_addr_t) (0xffffffffUL))) {
+	if (!pci_dma_supported(pdev, 0xffffffff)) {
 		printk(KERN_WARNING NAME53C8XX
 		       "32 BIT PCI BUS DMA ADDRESSING NOT SUPPORTED\n");
 		return -1;
--- ./drivers/scsi/qlogicfc.c.~1~	Sun Aug 12 23:50:32 2001
+++ ./drivers/scsi/qlogicfc.c	Wed Aug 15 06:53:38 2001
@@ -65,7 +65,7 @@
 
 #if 1
 /* Once pci64_ DMA mapping interface is in, kill this. */
-typedef dma_addr_t dma64_addr_t;
+#define dma64_addr_t dma_addr_t
 #define pci64_alloc_consistent(d,s,p) pci_alloc_consistent((d),(s),(p))
 #define pci64_free_consistent(d,s,c,a) pci_free_consistent((d),(s),(c),(a))
 #define pci64_map_single(d,c,s,dir) pci_map_single((d),(c),(s),(dir))
@@ -80,6 +80,7 @@
 #define pci64_dma_lo32(a) (a)
 #endif	/* BITS_PER_LONG */
 #define pci64_dma_build(hi,lo) (lo)
+#undef sg_dma64_address
 #define sg_dma64_address(s) sg_dma_address(s)
 #define sg_dma64_len(s) sg_dma_len(s)
 #if BITS_PER_LONG > 32
--- ./include/asm-alpha/pci.h.~1~	Wed May 23 17:57:18 2001
+++ ./include/asm-alpha/pci.h	Wed Aug 15 03:05:38 2001
@@ -144,7 +144,7 @@
    only drive the low 24-bits during PCI bus mastering, then
    you would pass 0x00ffffff as the mask to this function.  */
 
-extern int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask);
+extern int pci_dma_supported(struct pci_dev *hwdev, u64 mask);
 
 /* Return the index of the PCI controller for device PDEV. */
 extern int pci_controller_num(struct pci_dev *pdev);
--- ./include/asm-arm/pci.h.~1~	Sun Aug 12 23:50:35 2001
+++ ./include/asm-arm/pci.h	Wed Aug 15 03:05:45 2001
@@ -152,7 +152,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-static inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+static inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-i386/pci.h.~1~	Fri Jul 27 02:21:22 2001
+++ ./include/asm-i386/pci.h	Wed Aug 15 06:47:07 2001
@@ -55,6 +55,9 @@
 extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
 				void *vaddr, dma_addr_t dma_handle);
 
+/* This is always fine. */
+#define pci_dac_cycles_ok(pci_dev)		(1)
+
 /* Map a single buffer of the indicated size for DMA in streaming mode.
  * The 32-bit bus address to use is returned.
  *
@@ -84,6 +87,46 @@
 	/* Nothing to do */
 }
 
+/*
+ * pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
+ * to pci_map_single, but takes a struct page instead of a virtual address
+ */
+static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
+				      unsigned long offset, size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	return (page - mem_map) * PAGE_SIZE + offset;
+}
+
+static inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+				  size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
+/* 64-bit variants */
+static inline dma64_addr_t pci64_map_page(struct pci_dev *hwdev, struct page *page,
+					  unsigned long offset, size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	return (((dma64_addr_t) (page - mem_map)) *
+		((dma64_addr_t) PAGE_SIZE)) + (dma64_addr_t) offset;
+}
+
+static inline void pci64_unmap_page(struct pci_dev *hwdev, dma64_addr_t dma_address,
+				    size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scather-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -102,8 +145,26 @@
 static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
 			     int nents, int direction)
 {
+	int i;
+
 	if (direction == PCI_DMA_NONE)
 		BUG();
+
+	/*
+	 * temporary 2.4 hack
+	 */
+	for (i = 0; i < nents; i++ ) {
+		if (sg[i].address && sg[i].page)
+			BUG();
+		else if (!sg[i].address && !sg[i].page)
+			BUG();
+
+		if (sg[i].address)
+			sg[i].dma_address = virt_to_bus(sg[i].address);
+		else
+			sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
+	}
+
 	return nents;
 }
 
@@ -119,6 +180,9 @@
 	/* Nothing to do */
 }
 
+#define pci64_map_sg	pci_map_sg
+#define pci64_unmap_sg	pci_unmap_sg
+
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
  *
@@ -152,12 +216,15 @@
 	/* Nothing to do */
 }
 
+#define pci64_dma_sync_single	pci_dma_sync_single
+#define pci64_dma_sync_sg	pci_dma_sync_sg
+
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-static inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+static inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
         /*
          * we fall back to GFP_DMA when the mask isn't all 1s,
@@ -173,10 +240,10 @@
 /* These macros should be used after a pci_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries pci_map_sg
- * returns, or alternatively stop on the first sg_dma_len(sg) which
- * is 0.
+ * returns.
  */
-#define sg_dma_address(sg)	(virt_to_bus((sg)->address))
+#define sg_dma_address(sg)	((dma_addr_t) ((sg)->dma_address))
+#define sg_dma64_address(sg)	((sg)->dma_address)
 #define sg_dma_len(sg)		((sg)->length)
 
 /* Return the index of the PCI controller for device. */
--- ./include/asm-i386/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-i386/scatterlist.h	Wed Aug 15 06:33:04 2001
@@ -2,9 +2,14 @@
 #define _I386_SCATTERLIST_H
 
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
+    char *  address;    /* Location data is to be transferred to, NULL for
+			 * highmem page */
     char * alt_address; /* Location of actual if address is a 
 			 * dma indirect buffer.  NULL otherwise */
+    struct page * page; /* Location for highmem page, if any */
+    unsigned int offset;/* for highmem, page offset */
+
+    dma64_addr_t dma_address;
     unsigned int length;
 };
 
--- ./include/asm-i386/types.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-i386/types.h	Wed Aug 15 06:32:48 2001
@@ -27,6 +27,8 @@
  */
 #ifdef __KERNEL__
 
+#include <linux/config.h>
+
 typedef signed char s8;
 typedef unsigned char u8;
 
@@ -44,6 +46,11 @@
 /* Dma addresses are 32-bits wide.  */
 
 typedef u32 dma_addr_t;
+#ifdef CONFIG_HIGHMEM
+typedef u64 dma64_addr_t;
+#else
+typedef u32 dma64_addr_t;
+#endif
 
 #endif /* __KERNEL__ */
 
--- ./include/asm-mips/pci.h.~1~	Tue Jul  3 18:14:13 2001
+++ ./include/asm-mips/pci.h	Wed Aug 15 03:06:03 2001
@@ -206,7 +206,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	/*
 	 * we fall back to GFP_DMA when the mask isn't all 1s,
--- ./include/asm-ppc/pci.h.~1~	Wed May 23 17:57:21 2001
+++ ./include/asm-ppc/pci.h	Wed Aug 15 03:06:10 2001
@@ -108,7 +108,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-sparc/pci.h.~1~	Sat May 12 03:47:41 2001
+++ ./include/asm-sparc/pci.h	Wed Aug 15 03:06:17 2001
@@ -108,7 +108,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-sparc64/types.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-sparc64/types.h	Wed Aug 15 02:13:44 2001
@@ -45,9 +45,10 @@
 
 #define BITS_PER_LONG 64
 
-/* Dma addresses are 32-bits wide for now.  */
+/* Dma addresses come in 32-bit and 64-bit flavours.  */
 
 typedef u32 dma_addr_t;
+typedef u64 dma64_addr_t;
 
 #endif /* __KERNEL__ */
 
--- ./include/asm-sparc64/pci.h.~1~	Tue Aug 14 21:31:07 2001
+++ ./include/asm-sparc64/pci.h	Wed Aug 15 06:43:53 2001
@@ -28,6 +28,15 @@
 /* Dynamic DMA mapping stuff.
  */
 
+/* PCI 64-bit addressing works for all slots on all controller
+ * types on sparc64.  However, it requires that the device
+ * can drive enough of the 64 bits.
+ */
+#define PCI64_ADDR_BASE		0x3fff000000000000
+#define PCI64_REQUIRED_MASK	(~(dma64_addr_t)0)
+#define pci_dac_cycles_ok(pci_dev) \
+	(((pci_dev)->dma_mask & PCI64_REQUIRED_MASK) == PCI64_REQUIRED_MASK)
+
 #include <asm/scatterlist.h>
 
 struct pci_dev;
@@ -64,6 +73,20 @@
  */
 extern void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr, size_t size, int direction);
 
+/* No highmem on sparc64, plus we have an IOMMU, so mapping pages is easy. */
+#define pci_map_page(dev, page, off, size, dir) \
+	pci_map_single(dev, (page_address(page) + (off)), size, dir)
+#define pci_unmap_page(dev,addr,sz,dir) pci_unmap_single(dev,addr,sz,dir)
+
+/* The 64-bit cases might have to do something interesting if
+ * PCI_DMA_FLAG_HUGE_MAPS is set in hwdev->dma_flags.
+ */
+extern dma64_addr_t pci64_map_page(struct pci_dev *hwdev,
+				   struct page *page, unsigned long offset,
+				   size_t size, int direction);
+extern void pci64_unmap_page(struct pci_dev *hwdev, dma64_addr_t dma_addr,
+			     size_t size, int direction);
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scather-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -79,13 +102,19 @@
  * Device ownership issues as mentioned above for pci_map_single are
  * the same here.
  */
-extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nents, int direction);
+extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		      int nents, int direction);
+extern int pci64_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			int nents, int direction);
 
 /* Unmap a set of streaming mode DMA translations.
  * Again, cpu read rules concerning calls here are the same as for
  * pci_unmap_single() above.
  */
-extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nhwents, int direction);
+extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			 int nhwents, int direction);
+extern void pci64_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			   int nhwents, int direction);
 
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
@@ -96,7 +125,10 @@
  * next point you give the PCI dma address back to the card, the
  * device again owns the buffer.
  */
-extern void pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle, size_t size, int direction);
+extern void pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+				size_t size, int direction);
+extern void pci64_dma_sync_single(struct pci_dev *hwdev, dma64_addr_t dma_handle,
+				  size_t size, int direction);
 
 /* Make physical memory consistent for a set of streaming
  * mode DMA translations after a transfer.
@@ -105,13 +137,14 @@
  * same rules and usage.
  */
 extern void pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nelems, int direction);
+extern void pci64_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nelems, int direction);
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask);
+extern int pci_dma_supported(struct pci_dev *hwdev, u64 mask);
 
 /* Return the index of the PCI controller for device PDEV. */
 
--- ./include/asm-sparc64/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-sparc64/scatterlist.h	Wed Aug 15 06:27:36 2001
@@ -5,17 +5,25 @@
 #include <asm/page.h>
 
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
-    unsigned int length;
+	/* These will disappear in 2.5.x */
+	char *address;
+	char *alt_address;
 
-    __u32 dvma_address; /* A place to hang host-specific addresses at. */
-    __u32 dvma_length;
+	/* These two are only valid if ADDRESS member of this
+	 * struct is NULL.
+	 */
+	struct page *page;
+	unsigned int offset;
+
+	unsigned int length;
+
+	dma64_addr_t dma_address;
+	__u32 dma_length;
 };
 
-#define sg_dma_address(sg) ((sg)->dvma_address)
-#define sg_dma_len(sg)     ((sg)->dvma_length)
+#define sg_dma_address(sg)	((dma_addr_t) ((sg)->dma_address))
+#define sg_dma64_address(sg)	((sg)->dma_address)
+#define sg_dma_len(sg)     	((sg)->dma_length)
 
 #define ISA_DMA_THRESHOLD	(~0UL)
 
--- ./include/linux/pci.h.~1~	Tue Aug 14 21:31:11 2001
+++ ./include/linux/pci.h	Wed Aug 15 06:43:53 2001
@@ -314,6 +314,12 @@
 #define PCI_DMA_FROMDEVICE	2
 #define PCI_DMA_NONE		3
 
+/* These are the boolean attributes stored in pci_dev->dma_flags. */
+#define PCI_DMA_FLAG_HUGE_MAPS	0x00000001 /* Device may hold an enormous number
+					    * of mappings at once?
+					    */
+#define PCI_DMA_FLAG_ARCHMASK	0xf0000000 /* Reserved for arch-specific flags */
+
 #define DEVICE_COUNT_COMPATIBLE	4
 #define DEVICE_COUNT_IRQ	2
 #define DEVICE_COUNT_DMA	2
@@ -353,11 +359,12 @@
 
 	struct pci_driver *driver;	/* which driver has allocated this device */
 	void		*driver_data;	/* data private to the driver */
-	dma_addr_t	dma_mask;	/* Mask of the bits of bus address this
+	u64		dma_mask;	/* Mask of the bits of bus address this
 					   device implements.  Normally this is
 					   0xffffffff.  You only need to change
 					   this if your device has broken DMA
 					   or supports 64-bit transfers.  */
+	unsigned int	dma_flags;	/* See PCI_DMA_FLAG_* above */
 
 	u32             current_state;  /* Current operating state. In ACPI-speak,
 					   this is D0-D3, D0 being fully functional,
@@ -559,7 +566,8 @@
 int pci_enable_device(struct pci_dev *dev);
 void pci_disable_device(struct pci_dev *dev);
 void pci_set_master(struct pci_dev *dev);
-int pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask);
+int pci_set_dma_mask(struct pci_dev *dev, u64 mask);
+void pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off);
 int pci_assign_resource(struct pci_dev *dev, int i);
 
 /* Power management related routines */
@@ -641,7 +649,8 @@
 static inline int pci_enable_device(struct pci_dev *dev) { return -EIO; }
 static inline void pci_disable_device(struct pci_dev *dev) { }
 static inline int pci_module_init(struct pci_driver *drv) { return -ENODEV; }
-static inline int pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask) { return -EIO; }
+static inline int pci_set_dma_mask(struct pci_dev *dev, u64 mask) { return -EIO; }
+static inline void pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off) { }
 static inline int pci_assign_resource(struct pci_dev *dev, int i) { return -EBUSY;}
 static inline int pci_register_driver(struct pci_driver *drv) { return 0;}
 static inline void pci_unregister_driver(struct pci_driver *drv) { }
--- ./include/asm-sh/pci.h.~1~	Fri Jun 29 14:26:55 2001
+++ ./include/asm-sh/pci.h	Wed Aug 15 03:06:33 2001
@@ -167,7 +167,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-ia64/pci.h.~1~	Sat May 12 03:47:41 2001
+++ ./include/asm-ia64/pci.h	Wed Aug 15 03:06:42 2001
@@ -52,7 +52,7 @@
  * you would pass 0x00ffffff as the mask to this function.
  */
 static inline int
-pci_dma_supported (struct pci_dev *hwdev, dma_addr_t mask)
+pci_dma_supported (struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-mips64/pci.h.~1~	Thu Jul  5 16:52:48 2001
+++ ./include/asm-mips64/pci.h	Wed Aug 15 03:06:49 2001
@@ -195,7 +195,7 @@
 #endif
 }
 
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	/*
 	 * we fall back to GFP_DMA when the mask isn't all 1s,
--- ./include/asm-parisc/pci.h.~1~	Sat May 12 03:47:41 2001
+++ ./include/asm-parisc/pci.h	Wed Aug 15 03:07:07 2001
@@ -113,7 +113,7 @@
 ** See Documentation/DMA-mapping.txt
 */
 struct pci_dma_ops {
-	int  (*dma_supported)(struct pci_dev *dev, dma_addr_t mask);
+	int  (*dma_supported)(struct pci_dev *dev, u64 mask);
 	void *(*alloc_consistent)(struct pci_dev *dev, size_t size, dma_addr_t *iova);
 	void (*free_consistent)(struct pci_dev *dev, size_t size, void *vaddr, dma_addr_t iova);
 	dma_addr_t (*map_single)(struct pci_dev *dev, void *addr, size_t size, int direction);
--- ./net/ipv6/mcast.c.~1~	Wed Apr 25 13:46:34 2001
+++ ./net/ipv6/mcast.c	Wed Aug 15 00:36:31 2001
@@ -5,7 +5,7 @@
  *	Authors:
  *	Pedro Roque		<roque@di.fc.ul.pt>	
  *
- *	$Id: mcast.c,v 1.37 2001/04/25 20:46:34 davem Exp $
+ *	$Id: mcast.c,v 1.38 2001/08/15 07:36:31 davem Exp $
  *
  *	Based on linux/ipv4/igmp.c and linux/ipv4/ip_sockglue.c 
  *
@@ -90,7 +90,6 @@
 
 	mc_lst->next = NULL;
 	memcpy(&mc_lst->addr, addr, sizeof(struct in6_addr));
-	mc_lst->ifindex = ifindex;
 
 	if (ifindex == 0) {
 		struct rt6_info *rt;
@@ -107,6 +106,8 @@
 		sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
 		return -ENODEV;
 	}
+
+	mc_lst->ifindex = dev->ifindex;
 
 	/*
 	 *	now add/increase the group membership on the device

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-15 10:22   ` David S. Miller
  2001-08-15 11:13     ` Jens Axboe
  2001-08-15 11:47     ` David S. Miller
@ 2001-08-15 19:20     ` Gérard Roudier
  2001-08-16  8:12     ` David S. Miller
  3 siblings, 0 replies; 29+ messages in thread
From: Gérard Roudier @ 2001-08-15 19:20 UTC (permalink / raw)
  To: David S. Miller; +Cc: axboe, linux-kernel, andrea



On Wed, 15 Aug 2001, David S. Miller wrote:

>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 11:26:21 +0200
>
>    Ok, here's an updated version. Maybe modulo the struct scatterlist
>    changes, I'd like to see this included in 2.4.x soonish. Or at least the
>    interface we agree on -- it'll make my life easier at least. And finally
>    provide driver authors with something not quite as stupid as struct
>    scatterlist.
>
> Jens, have a look at the patch I have below.  What do you
> think about it?  Specifically the set of interfaces.
>
> Andrea, I am very much interested in your input as well.

Here is some input you do not seem interested in. Too bad for you! :-)

Some drivers may want to allocate internal DMAable data structures in a
32-bit DMAable address range but also support 64-bit DMA addressing for
user data. This applies to the sym53c8xx driver (which is not quite ready
for 64-bit DMA addressing for now) and to the aic7xxx driver, which seems
to be ready. The reasons are simplicity and maybe feasibility; and given
that driver-internal data structures do not require that much memory, this
should not put any pressure on the 32-bit DMAable range.

The current solution consists of tampering with the dma_mask in the pci_dev
structure prior to allocating DMAable memory. Not really clean...
An interface that would allow passing the mask as an argument would be
cleaner, in my opinion. Btw, the pci_set_* interface does not seem any
cleaner to me than hacking the corresponding field in the pci_dev structure
directly.


> I would like to kill two birds with one stone here if
> we can.   The x86 versions of the asm/pci.h and
> asm/scatterlist.h bits are pretty mindless and left as
> an exercise to the reader :-)

[... b****ed patch removed since reader should already have it :) ...]

Later,
  Gérard.



* Re: [patch] zero-bounce highmem I/O
  2001-08-15 14:02         ` [patch] zero-bounce highmem I/O David S. Miller
@ 2001-08-16  5:52           ` Jens Axboe
  0 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-16  5:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Wed, 15 Aug 2001 15:10:52 +0200
> 
>    On Wed, Aug 15 2001, David S. Miller wrote:
>    >    The only truly problematic area is the alt_address thing.
>    >    It would be a nice thing to rip this eyesore out of the scsi
>    >    layer anyways.
>    
>    The SCSI issue was exactly what was on my mind, and is indeed the reason
>    why I didn't go all the way and did a complete conversion there. The
>    SCSI layer is _not_ very clean in this regard, didn't exactly enjoy this
>    part of the work...
>    
> I just took a quick look at this, and I think I can make this
> alt_address thing into a scsi-layer-specific mechanism and
> thus be able to safely remove it from struct scatterlist.
> 
> Would you like me to whip up such a set of changes?  I'll be
> more than happy to work on it.

Yes please, that'd be great.

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-15 10:22   ` David S. Miller
                       ` (2 preceding siblings ...)
  2001-08-15 19:20     ` Gérard Roudier
@ 2001-08-16  8:12     ` David S. Miller
  3 siblings, 0 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16  8:12 UTC (permalink / raw)
  To: groudier; +Cc: axboe, linux-kernel, andrea

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: Text/Plain; charset=big5, Size: 831 bytes --]

   From: Gérard Roudier <groudier@free.fr>
   Date: Wed, 15 Aug 2001 21:20:17 +0200 (CEST)

   The current solution consists of tampering with the dma_mask in the pci_dev
   structure prior to allocating DMAable memory. Not really clean...
   An interface that would allow passing the mask as an argument would be
   cleaner, in my opinion. Btw, the pci_set_* interface does not seem any
   cleaner to me than hacking the corresponding field in the pci_dev
   structure directly.

pci_alloc_consistent will ONLY give you 32-bit DMA memory.

This will be true both before and after my changes.  Does the
IA64 gross hack behave differently?

Later,
David S. Miller
davem@redhat.com


* Re: [patch] zero-bounce highmem I/O
  2001-08-15 14:25           ` David S. Miller
@ 2001-08-16 11:51             ` Jens Axboe
  2001-08-16 11:56             ` David S. Miller
  2001-08-16 12:28             ` kill alt_address (Re: [patch] zero-bounce highmem I/O) David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-16 11:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Wed, Aug 15 2001, David S. Miller wrote:
>    From: "David S. Miller" <davem@redhat.com>
>    Date: Wed, 15 Aug 2001 07:02:04 -0700 (PDT)
>    
>       >    Yep. Want me to add in the x86 parts of your patch?
>       > 
>       > Please let me finish up my prototype with sparc64 building and
>       > working, then I'll send you what I have ok?
>       
>       Fine
>       
>    This is forthcoming.
> 
> As promised.  I actually got bored during the build and tried to

Looks good! And works here too, even... ->

> quickly cook up the ix86 bits myself :-)

the boring x86 parts :-)

The only difference between your and my tree now is the PCI_MAX_DMA32
flag. Would you consider this? I already use this flag in the block
stuff, I just updated the two references you had. Maybe
PCI_MAX_DMA32_MASK is a better name.

I'll update the block-highmem patch later today -- and with this in
place, there's nothing stopping zero-bounce from going beyond 4GB on the
right hw. Yippi.

--- linux/include/linux/pci.h~	Thu Aug 16 13:40:32 2001
+++ linux/include/linux/pci.h	Thu Aug 16 13:41:11 2001
@@ -314,6 +314,8 @@
 #define PCI_DMA_FROMDEVICE	2
 #define PCI_DMA_NONE		3
 
+#define PCI_MAX_DMA32		0xffffffff
+
 /* These are the boolean attributes stored in pci_dev->dma_flags. */
 #define PCI_DMA_FLAG_HUGE_MAPS	0x00000001 /* Device may hold an enormous number
 					    * of mappings at once?
--- linux/drivers/scsi/sym53c8xx.c~	Thu Aug 16 13:42:27 2001
+++ linux/drivers/scsi/sym53c8xx.c	Thu Aug 16 13:42:50 2001
@@ -13101,7 +13101,7 @@
 		(int) (PciDeviceFn(pdev) & 7));
 
 #ifdef SCSI_NCR_DYNAMIC_DMA_MAPPING
-	if (pci_set_dma_mask(pdev, 0xffffffff)) {
+	if (pci_set_dma_mask(pdev, PCI_MAX_DMA32)) {
 		printk(KERN_WARNING NAME53C8XX
 		       "32 BIT PCI BUS DMA ADDRESSING NOT SUPPORTED\n");
 		return -1;
--- linux/drivers/scsi/sym53c8xx_comm.h~	Thu Aug 16 13:43:09 2001
+++ linux/drivers/scsi/sym53c8xx_comm.h	Thu Aug 16 13:43:32 2001
@@ -2186,7 +2186,7 @@
 		(int) (PciDeviceFn(pdev) & 7));
 
 #ifdef SCSI_NCR_DYNAMIC_DMA_MAPPING
-	if (!pci_dma_supported(pdev, 0xffffffff)) {
+	if (!pci_dma_supported(pdev, PCI_MAX_DMA32)) {
 		printk(KERN_WARNING NAME53C8XX
 		       "32 BIT PCI BUS DMA ADDRESSING NOT SUPPORTED\n");
 		return -1;

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-15 14:25           ` David S. Miller
  2001-08-16 11:51             ` Jens Axboe
@ 2001-08-16 11:56             ` David S. Miller
  2001-08-16 12:03               ` Jens Axboe
                                 ` (3 more replies)
  2001-08-16 12:28             ` kill alt_address (Re: [patch] zero-bounce highmem I/O) David S. Miller
  2 siblings, 4 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16 11:56 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Thu, 16 Aug 2001 13:51:50 +0200

[ Hopefully this mail won't be encoded in Chinese-BIG5 :-)  sorry
  about that ]

   The only difference between your and my tree now is the PCI_MAX_DMA32
   flag. Would you consider this? I already use this flag in the block
   stuff, I just updated the two references you had. Maybe
   PCI_MAX_DMA32_MASK is a better name.
   
I didn't put it into my patch because there is no way you can
use such a value in generic code.

What if my scsi controller's pci DMA mask is 0x7fffffff or something
like this?  You don't know at the generic layer, and you must provide
some way for the block device to indicate stuff like this to you.

That is why PCI_MAX_DMA32, or whatever you would like to name it, does
not make any sense.  It can be a shorthand for drivers themselves, but
that is it and personally I'd rather they just put the bits there
explicitly.

I am just finishing up the "death of alt_address" patch right now.

Later,
David S. Miller
davem@redhat.com


* Re: [patch] zero-bounce highmem I/O
  2001-08-16 11:56             ` David S. Miller
@ 2001-08-16 12:03               ` Jens Axboe
  2001-08-16 12:14               ` Gerd Knorr
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-16 12:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Thu, Aug 16 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Thu, 16 Aug 2001 13:51:50 +0200
> 
> [ Hopefully this mail won't be encoded in Chinese-BIG5 :-)  sorry
>   about that ]

Looks good :)

>    The only difference between your and my tree now is the PCI_MAX_DMA32
>    flag. Would you consider this? I already use this flag in the block
>    stuff, I just updated the two references you had. Maybe
>    PCI_MAX_DMA32_MASK is a better name.
>    
> I didn't put it into my patch because there is no way you can
> use such a value in generic code.
> 
> What if my scsi controller's pci DMA mask is 0x7fffffff or something
> like this?  You don't know at the generic layer, and you must provide
> some way for the block device to indicate stuff like this to you.

Then your SCSI controller will not use PCI_MAX_DMA32 but rather that
particular mask? For block drivers, using 0x7fffffff for
blk_queue_bounce_limit is perfectly fine too.

> That is why PCI_MAX_DMA32, or whatever you would like to name it, does
> not make any sense.  It can be a shorthand for drivers themselves, but
> that is it and personally I'd rather they just put the bits there
> explicitly.

Drivers, right. The block stuff used it in _one_ place -- the
BLK_BOUNCE_4G define, to indicate the need to bounce anything above 4G.
But no problem, I can just define that to 0xffffffff myself.

> I am just finishing up the "death of alt_address" patch right now.

Excellent.

-- 
Jens Axboe



* Re: [patch] zero-bounce highmem I/O
  2001-08-16 11:56             ` David S. Miller
  2001-08-16 12:03               ` Jens Axboe
@ 2001-08-16 12:14               ` Gerd Knorr
  2001-08-16 12:27               ` David S. Miller
  2001-08-16 12:34               ` David S. Miller
  3 siblings, 0 replies; 29+ messages in thread
From: Gerd Knorr @ 2001-08-16 12:14 UTC (permalink / raw)
  To: linux-kernel

>  What if my scsi controller's pci DMA mask is 0x7fffffff or something
>  like this?  You don't know at the generic layer, and you must provide
>  some way for the block device to indicate stuff like this to you.

While we are at it:  Is there some portable way to figure out whether I
can do a PCI DMA transfer to some page?  On ia32 I can simply look at the
physical address: if it is above 4G it doesn't work for 32-bit PCI
devices.  But I think that is not true for architectures which have an
IOMMU ...

  Gerd



* Re: [patch] zero-bounce highmem I/O
  2001-08-16 11:56             ` David S. Miller
  2001-08-16 12:03               ` Jens Axboe
  2001-08-16 12:14               ` Gerd Knorr
@ 2001-08-16 12:27               ` David S. Miller
  2001-08-16 12:48                 ` Jens Axboe
                                   ` (2 more replies)
  2001-08-16 12:34               ` David S. Miller
  3 siblings, 3 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16 12:27 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Thu, 16 Aug 2001 14:03:17 +0200

   > That is why PCI_MAX_DMA32, or whatever you would like to name it, does
   > not make any sense.  It can be a shorthand for drivers themselves, but
   > that is it and personally I'd rather they just put the bits there
   > explicitly.
   
   Drivers, right. The block stuff used it in _one_ place -- the
   BLK_BOUNCE_4G define, to indicate the need to bounce anything above 4G.
   But no problem, I can just define that to 0xffffffff myself.

How can "the block stuff" (i.e. generic code) make legal use of this
value?  Which physical bits a device may address is a device-specific
attribute and has nothing to do with 4GB, highmem, or the PCI standard
specifications. :-)

In fact, this is not only a device-specific attribute; it also depends
on elements of the platform.

This is why we have things like pci_dma_supported() and friends.
Let me give an example: for most PCI controllers on Sparc64, if your
device can address the upper 2GB of 32-bit PCI space, it may DMA
to any physical memory location via the IOMMU these controllers have.

There may easily be HIGHMEM platforms which operate this way.  So the
result is that CONFIG_HIGHMEM does _not_ mean ">=4GB memory must be
bounced".

Really, 0xffffffff is a meaningless value.  You have to test against
device indicated capabilities for bouncing decisions.

You do not even know how "addressable bits" translates into "range of
physical memory that may be DMA'd to/from by the device".  If an IOMMU is
present on the platform, these two things have no relationship
whatsoever.  They happen to have a direct relationship on x86, but
that is as far as it goes.

Enough babbling on my part, I'll have a look at your bounce patch
later today. :-)

Later,
David S. Miller
davem@redhat.com


* kill alt_address (Re: [patch] zero-bounce highmem I/O)
  2001-08-15 14:25           ` David S. Miller
  2001-08-16 11:51             ` Jens Axboe
  2001-08-16 11:56             ` David S. Miller
@ 2001-08-16 12:28             ` David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16 12:28 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Thu, 16 Aug 2001 13:51:50 +0200

   On Wed, Aug 15 2001, David S. Miller wrote:
   > As promised.  I actually got bored during the build and tried to
   
   Looks good! And works here too, even... ->

Ok, here is my current patch; it kills off alt_address as well as
providing the new PCI DMA interfaces.

I have no way to test the alt_address stuff here locally, and I'd
be happy if someone would give it a go.  All I guarantee is that
it builds here on my machines :-)

--- ./arch/alpha/kernel/pci_iommu.c.~1~	Sun Aug 12 23:50:25 2001
+++ ./arch/alpha/kernel/pci_iommu.c	Wed Aug 15 03:04:24 2001
@@ -636,7 +636,7 @@
    supported properly.  */
 
 int
-pci_dma_supported(struct pci_dev *pdev, dma_addr_t mask)
+pci_dma_supported(struct pci_dev *pdev, u64 mask)
 {
 	struct pci_controller *hose;
 	struct pci_iommu_arena *arena;
--- ./arch/sparc64/kernel/pci_iommu.c.~1~	Wed May 23 17:57:03 2001
+++ ./arch/sparc64/kernel/pci_iommu.c	Wed Aug 15 06:40:54 2001
@@ -356,6 +356,20 @@
 	return 0;
 }
 
+dma64_addr_t pci64_map_page(struct pci_dev *pdev,
+			    struct page *page, unsigned long offset,
+			    size_t sz, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS)) {
+		return (dma64_addr_t)
+			pci_map_single(pdev,
+				       page_address(page) + offset,
+				       sz, direction);
+	}
+
+	return PCI64_ADDR_BASE + (__pa(page_address(page)) + offset);
+}
+
 /* Unmap a single streaming mode DMA translation. */
 void pci_unmap_single(struct pci_dev *pdev, dma_addr_t bus_addr, size_t sz, int direction)
 {
@@ -378,7 +392,8 @@
 		((bus_addr - iommu->page_table_map_base) >> PAGE_SHIFT);
 #ifdef DEBUG_PCI_IOMMU
 	if (iopte_val(*base) == IOPTE_INVALID)
-		printk("pci_unmap_single called on non-mapped region %08x,%08x from %016lx\n", bus_addr, sz, __builtin_return_address(0));
+		printk("pci_unmap_single called on non-mapped region %08x,%08x from %016lx\n",
+		       bus_addr, sz, __builtin_return_address(0));
 #endif
 	bus_addr &= PAGE_MASK;
 
@@ -423,18 +438,38 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-static inline void fill_sg(iopte_t *iopte, struct scatterlist *sg, int nused, unsigned long iopte_protection)
+void pci64_unmap_page(struct pci_dev *pdev, dma64_addr_t bus_addr,
+		      size_t sz, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS)) {
+		if ((bus_addr >> 32) != (dma64_addr_t) 0)
+			BUG();
+
+		return pci_unmap_single(pdev, (dma_addr_t) bus_addr,
+					sz, direction);
+	}
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
+#define SG_ENT_PHYS_ADDRESS(SG)	\
+	((SG)->address ? \
+	 __pa((SG)->address) : \
+	 (__pa(page_address((SG)->page)) + (SG)->offset))
+
+static inline void fill_sg(iopte_t *iopte, struct scatterlist *sg,
+			   int nused, unsigned long iopte_protection)
 {
 	struct scatterlist *dma_sg = sg;
 	int i;
 
 	for (i = 0; i < nused; i++) {
 		unsigned long pteval = ~0UL;
-		u32 dma_npages;
+		u64 dma_npages;
 
-		dma_npages = ((dma_sg->dvma_address & (PAGE_SIZE - 1UL)) +
-			      dma_sg->dvma_length +
-			      ((u32)(PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
+		dma_npages = ((dma_sg->dma_address & (PAGE_SIZE - 1UL)) +
+			      dma_sg->dma_length +
+			      ((PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
 		do {
 			unsigned long offset;
 			signed int len;
@@ -447,7 +482,7 @@
 			for (;;) {
 				unsigned long tmp;
 
-				tmp = (unsigned long) __pa(sg->address);
+				tmp = SG_ENT_PHYS_ADDRESS(sg);
 				len = sg->length;
 				if (((tmp ^ pteval) >> PAGE_SHIFT) != 0UL) {
 					pteval = tmp & PAGE_MASK;
@@ -480,9 +515,9 @@
 			 * detect a page crossing event.
 			 */
 			while ((pteval << (64 - PAGE_SHIFT)) != 0UL &&
-			       pteval == __pa(sg->address) &&
+			       (pteval == SG_ENT_PHYS_ADDRESS(sg)) &&
 			       ((pteval ^
-				 (__pa(sg->address) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
+				 (SG_ENT_PHYS_ADDRESS(sg) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
 				pteval += sg->length;
 				sg++;
 			}
@@ -505,14 +540,19 @@
 	struct pci_strbuf *strbuf;
 	unsigned long flags, ctx, npages, iopte_protection;
 	iopte_t *base;
-	u32 dma_base;
+	u64 dma_base;
 	struct scatterlist *sgtmp;
 	int used;
 
 	/* Fast path single entry scatterlists. */
 	if (nelems == 1) {
-		sglist->dvma_address = pci_map_single(pdev, sglist->address, sglist->length, direction);
-		sglist->dvma_length = sglist->length;
+		sglist->dma_address = (dma64_addr_t)
+			pci_map_single(pdev,
+				       (sglist->address ?
+					sglist->address :
+					(page_address(sglist->page) + sglist->offset)),
+				       sglist->length, direction);
+		sglist->dma_length = sglist->length;
 		return 1;
 	}
 
@@ -540,8 +580,8 @@
 	used = nelems;
 
 	sgtmp = sglist;
-	while (used && sgtmp->dvma_length) {
-		sgtmp->dvma_address += dma_base;
+	while (used && sgtmp->dma_length) {
+		sgtmp->dma_address += dma_base;
 		sgtmp++;
 		used--;
 	}
@@ -574,6 +614,23 @@
 	return 0;
 }
 
+int pci64_map_sg(struct pci_dev *pdev, struct scatterlist *sg,
+		 int nelems, int direction)
+{
+	if ((pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS) != 0) {
+		int i;
+
+		for (i = 0; i < nelems; i++) {
+			sg[i].dma_address =
+				PCI64_ADDR_BASE + SG_ENT_PHYS_ADDRESS(&sg[i]);
+			sg[i].dma_length = sg[i].length;
+		}
+		return nelems;
+	}
+
+	return pci_map_sg(pdev, sg, nelems, direction);
+}
+
 /* Unmap a set of streaming mode DMA translations. */
 void pci_unmap_sg(struct pci_dev *pdev, struct scatterlist *sglist, int nelems, int direction)
 {
@@ -582,7 +639,7 @@
 	struct pci_strbuf *strbuf;
 	iopte_t *base;
 	unsigned long flags, ctx, i, npages;
-	u32 bus_addr;
+	u64 bus_addr;
 
 	if (direction == PCI_DMA_NONE)
 		BUG();
@@ -591,20 +648,21 @@
 	iommu = pcp->pbm->iommu;
 	strbuf = &pcp->pbm->stc;
 	
-	bus_addr = sglist->dvma_address & PAGE_MASK;
+	bus_addr = sglist->dma_address & PAGE_MASK;
 
 	for (i = 1; i < nelems; i++)
-		if (sglist[i].dvma_length == 0)
+		if (sglist[i].dma_length == 0)
 			break;
 	i--;
-	npages = (PAGE_ALIGN(sglist[i].dvma_address + sglist[i].dvma_length) - bus_addr) >> PAGE_SHIFT;
+	npages = (PAGE_ALIGN(sglist[i].dma_address + sglist[i].dma_length) - bus_addr) >> PAGE_SHIFT;
 
 	base = iommu->page_table +
 		((bus_addr - iommu->page_table_map_base) >> PAGE_SHIFT);
 
 #ifdef DEBUG_PCI_IOMMU
 	if (iopte_val(*base) == IOPTE_INVALID)
-		printk("pci_unmap_sg called on non-mapped region %08x,%d from %016lx\n", sglist->dvma_address, nelems, __builtin_return_address(0));
+		printk("pci_unmap_sg called on non-mapped region %016lx,%d from %016lx\n",
+		       sglist->dma_address, nelems, __builtin_return_address(0));
 #endif
 
 	spin_lock_irqsave(&iommu->lock, flags);
@@ -616,7 +674,7 @@
 
 	/* Step 1: Kick data out of streaming buffers if necessary. */
 	if (strbuf->strbuf_enabled) {
-		u32 vaddr = bus_addr;
+		u32 vaddr = (u32) bus_addr;
 
 		PCI_STC_FLUSHFLAG_INIT(strbuf);
 		if (strbuf->strbuf_ctxflush &&
@@ -648,6 +706,15 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+void pci64_unmap_sg(struct pci_dev *pdev, struct scatterlist *sglist,
+		    int nelems, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS))
+		return pci_unmap_sg(pdev, sglist, nelems, direction);
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
  */
@@ -709,6 +776,20 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+void pci64_dma_sync_single(struct pci_dev *pdev, dma64_addr_t bus_addr,
+			   size_t sz, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS)) {
+		if ((bus_addr >> 32) != (dma64_addr_t) 0)
+			BUG();
+
+		return pci_dma_sync_single(pdev, (dma_addr_t) bus_addr,
+					   sz, direction);
+	}
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
 /* Make physical memory consistent for a set of streaming
  * mode DMA translations after a transfer.
  */
@@ -735,7 +816,7 @@
 		iopte_t *iopte;
 
 		iopte = iommu->page_table +
-			((sglist[0].dvma_address - iommu->page_table_map_base) >> PAGE_SHIFT);
+			((sglist[0].dma_address - iommu->page_table_map_base) >> PAGE_SHIFT);
 		ctx = (iopte_val(*iopte) & IOPTE_CONTEXT) >> 47UL;
 	}
 
@@ -752,15 +833,15 @@
 		} while (((long)pci_iommu_read(matchreg)) < 0L);
 	} else {
 		unsigned long i, npages;
-		u32 bus_addr;
+		u64 bus_addr;
 
-		bus_addr = sglist[0].dvma_address & PAGE_MASK;
+		bus_addr = sglist[0].dma_address & PAGE_MASK;
 
 		for(i = 1; i < nelems; i++)
-			if (!sglist[i].dvma_length)
+			if (!sglist[i].dma_length)
 				break;
 		i--;
-		npages = (PAGE_ALIGN(sglist[i].dvma_address + sglist[i].dvma_length) - bus_addr) >> PAGE_SHIFT;
+		npages = (PAGE_ALIGN(sglist[i].dma_address + sglist[i].dma_length) - bus_addr) >> PAGE_SHIFT;
 		for (i = 0; i < npages; i++, bus_addr += PAGE_SIZE)
 			pci_iommu_write(strbuf->strbuf_pflush, bus_addr);
 	}
@@ -774,10 +855,19 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-int pci_dma_supported(struct pci_dev *pdev, dma_addr_t device_mask)
+void pci64_dma_sync_sg(struct pci_dev *pdev, struct scatterlist *sglist,
+		       int nelems, int direction)
+{
+	if (!(pdev->dma_flags & PCI_DMA_FLAG_HUGE_MAPS))
+		return pci_dma_sync_sg(pdev, sglist, nelems, direction);
+
+	/* If doing real DAC, there is nothing to do. */
+}
+
+int pci_dma_supported(struct pci_dev *pdev, u64 device_mask)
 {
 	struct pcidev_cookie *pcp = pdev->sysdata;
-	u32 dma_addr_mask;
+	u64 dma_addr_mask;
 
 	if (pdev == NULL) {
 		dma_addr_mask = 0xffffffff;
--- ./arch/sparc64/kernel/iommu_common.c.~1~	Tue Nov 28 08:33:08 2000
+++ ./arch/sparc64/kernel/iommu_common.c	Wed Aug 15 05:13:34 2001
@@ -12,7 +12,7 @@
  */
 
 #ifdef VERIFY_SG
-int verify_lengths(struct scatterlist *sg, int nents, int npages)
+static int verify_lengths(struct scatterlist *sg, int nents, int npages)
 {
 	int sg_len, dma_len;
 	int i, pgcount;
@@ -22,8 +22,8 @@
 		sg_len += sg[i].length;
 
 	dma_len = 0;
-	for (i = 0; i < nents && sg[i].dvma_length; i++)
-		dma_len += sg[i].dvma_length;
+	for (i = 0; i < nents && sg[i].dma_length; i++)
+		dma_len += sg[i].dma_length;
 
 	if (sg_len != dma_len) {
 		printk("verify_lengths: Error, different, sg[%d] dma[%d]\n",
@@ -32,13 +32,13 @@
 	}
 
 	pgcount = 0;
-	for (i = 0; i < nents && sg[i].dvma_length; i++) {
+	for (i = 0; i < nents && sg[i].dma_length; i++) {
 		unsigned long start, end;
 
-		start = sg[i].dvma_address;
+		start = sg[i].dma_address;
 		start = start & PAGE_MASK;
 
-		end = sg[i].dvma_address + sg[i].dvma_length;
+		end = sg[i].dma_address + sg[i].dma_length;
 		end = (end + (PAGE_SIZE - 1)) & PAGE_MASK;
 
 		pgcount += ((end - start) >> PAGE_SHIFT);
@@ -55,15 +55,16 @@
 	return 0;
 }
 
-int verify_one_map(struct scatterlist *dma_sg, struct scatterlist **__sg, int nents, iopte_t **__iopte)
+static int verify_one_map(struct scatterlist *dma_sg, struct scatterlist **__sg, int nents, iopte_t **__iopte)
 {
 	struct scatterlist *sg = *__sg;
 	iopte_t *iopte = *__iopte;
-	u32 dlen = dma_sg->dvma_length;
-	u32 daddr = dma_sg->dvma_address;
+	u32 dlen = dma_sg->dma_length;
+	u32 daddr;
 	unsigned int sglen;
 	unsigned long sgaddr;
 
+	daddr = dma_sg->dma_address;
 	sglen = sg->length;
 	sgaddr = (unsigned long) sg->address;
 	while (dlen > 0) {
@@ -136,7 +137,7 @@
 	return nents;
 }
 
-int verify_maps(struct scatterlist *sg, int nents, iopte_t *iopte)
+static int verify_maps(struct scatterlist *sg, int nents, iopte_t *iopte)
 {
 	struct scatterlist *dma_sg = sg;
 	struct scatterlist *orig_dma_sg = dma_sg;
@@ -147,7 +148,7 @@
 		if (nents <= 0)
 			break;
 		dma_sg++;
-		if (dma_sg->dvma_length == 0)
+		if (dma_sg->dma_length == 0)
 			break;
 	}
 
@@ -174,14 +175,15 @@
 	    verify_maps(sg, nents, iopte) < 0) {
 		int i;
 
-		printk("verify_sglist: Crap, messed up mappings, dumping, iodma at %08x.\n",
-		       (u32) (sg->dvma_address & PAGE_MASK));
+		printk("verify_sglist: Crap, messed up mappings, dumping, iodma at ");
+		printk("%016lx.\n", sg->dma_address & PAGE_MASK);
+
 		for (i = 0; i < nents; i++) {
 			printk("sg(%d): address(%p) length(%x) "
-			       "dma_address[%08x] dma_length[%08x]\n",
+			       "dma_address[%016lx] dma_length[%08x]\n",
 			       i,
 			       sg[i].address, sg[i].length,
-			       sg[i].dvma_address, sg[i].dvma_length);
+			       sg[i].dma_address, sg[i].dma_length);
 		}
 	}
 
@@ -189,30 +191,23 @@
 }
 #endif
 
-/* Two addresses are "virtually contiguous" if and only if:
- * 1) They are equal, or...
- * 2) They are both on a page boundry
- */
-#define VCONTIG(__X, __Y)	(((__X) == (__Y)) || \
-				 (((__X) | (__Y)) << (64UL - PAGE_SHIFT)) == 0UL)
-
 unsigned long prepare_sg(struct scatterlist *sg, int nents)
 {
 	struct scatterlist *dma_sg = sg;
 	unsigned long prev;
-	u32 dent_addr, dent_len;
+	u64 dent_addr, dent_len;
 
 	prev  = (unsigned long) sg->address;
 	prev += (unsigned long) (dent_len = sg->length);
-	dent_addr = (u32) ((unsigned long)sg->address & (PAGE_SIZE - 1UL));
+	dent_addr = (u64) ((unsigned long)sg->address & (PAGE_SIZE - 1UL));
 	while (--nents) {
 		unsigned long addr;
 
 		sg++;
 		addr = (unsigned long) sg->address;
 		if (! VCONTIG(prev, addr)) {
-			dma_sg->dvma_address = dent_addr;
-			dma_sg->dvma_length = dent_len;
+			dma_sg->dma_address = dent_addr;
+			dma_sg->dma_length = dent_len;
 			dma_sg++;
 
 			dent_addr = ((dent_addr +
@@ -225,8 +220,8 @@
 		dent_len += sg->length;
 		prev = addr + sg->length;
 	}
-	dma_sg->dvma_address = dent_addr;
-	dma_sg->dvma_length = dent_len;
+	dma_sg->dma_address = dent_addr;
+	dma_sg->dma_length = dent_len;
 
 	return ((unsigned long) dent_addr +
 		(unsigned long) dent_len +
--- ./arch/sparc64/kernel/sbus.c.~1~	Wed May 23 17:57:03 2001
+++ ./arch/sparc64/kernel/sbus.c	Wed Aug 15 06:41:43 2001
@@ -376,6 +376,11 @@
 	spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+#define SG_ENT_PHYS_ADDRESS(SG)	\
+	((SG)->address ? \
+	 __pa((SG)->address) : \
+	 (__pa(page_address((SG)->page)) + (SG)->offset))
+
 static inline void fill_sg(iopte_t *iopte, struct scatterlist *sg, int nused, unsigned long iopte_bits)
 {
 	struct scatterlist *dma_sg = sg;
@@ -383,11 +388,11 @@
 
 	for (i = 0; i < nused; i++) {
 		unsigned long pteval = ~0UL;
-		u32 dma_npages;
+		u64 dma_npages;
 
-		dma_npages = ((dma_sg->dvma_address & (PAGE_SIZE - 1UL)) +
-			      dma_sg->dvma_length +
-			      ((u32)(PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
+		dma_npages = ((dma_sg->dma_address & (PAGE_SIZE - 1UL)) +
+			      dma_sg->dma_length +
+			      ((PAGE_SIZE - 1UL))) >> PAGE_SHIFT;
 		do {
 			unsigned long offset;
 			signed int len;
@@ -400,7 +405,7 @@
 			for (;;) {
 				unsigned long tmp;
 
-				tmp = (unsigned long) __pa(sg->address);
+				tmp = (unsigned long) SG_ENT_PHYS_ADDRESS(sg);
 				len = sg->length;
 				if (((tmp ^ pteval) >> PAGE_SHIFT) != 0UL) {
 					pteval = tmp & PAGE_MASK;
@@ -433,9 +438,9 @@
 			 * detect a page crossing event.
 			 */
 			while ((pteval << (64 - PAGE_SHIFT)) != 0UL &&
-			       pteval == __pa(sg->address) &&
+			       (pteval == SG_ENT_PHYS_ADDRESS(sg)) &&
 			       ((pteval ^
-				 (__pa(sg->address) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
+				 (SG_ENT_PHYS_ADDRESS(sg) + sg->length - 1UL)) >> PAGE_SHIFT) == 0UL) {
 				pteval += sg->length;
 				sg++;
 			}
@@ -461,8 +466,13 @@
 
 	/* Fast path single entry scatterlists. */
 	if (nents == 1) {
-		sg->dvma_address = sbus_map_single(sdev, sg->address, sg->length, dir);
-		sg->dvma_length = sg->length;
+		sg->dma_address = (dma64_addr_t)
+			sbus_map_single(sdev,
+					(sg->address ?
+					 sg->address :
+					 (page_address(sg->page) + sg->offset)),
+					sg->length, dir);
+		sg->dma_length = sg->length;
 		return 1;
 	}
 
@@ -478,8 +488,8 @@
 	sgtmp = sg;
 	used = nents;
 
-	while (used && sgtmp->dvma_length) {
-		sgtmp->dvma_address += dma_base;
+	while (used && sgtmp->dma_length) {
+		sgtmp->dma_address += dma_base;
 		sgtmp++;
 		used--;
 	}
@@ -507,22 +517,22 @@
 {
 	unsigned long size, flags;
 	struct sbus_iommu *iommu;
-	u32 dvma_base;
+	u64 dvma_base;
 	int i;
 
 	/* Fast path single entry scatterlists. */
 	if (nents == 1) {
-		sbus_unmap_single(sdev, sg->dvma_address, sg->dvma_length, direction);
+		sbus_unmap_single(sdev, sg->dma_address, sg->dma_length, direction);
 		return;
 	}
 
-	dvma_base = sg[0].dvma_address & PAGE_MASK;
+	dvma_base = sg[0].dma_address & PAGE_MASK;
 	for (i = 0; i < nents; i++) {
-		if (sg[i].dvma_length == 0)
+		if (sg[i].dma_length == 0)
 			break;
 	}
 	i--;
-	size = PAGE_ALIGN(sg[i].dvma_address + sg[i].dvma_length) - dvma_base;
+	size = PAGE_ALIGN(sg[i].dma_address + sg[i].dma_length) - dvma_base;
 
 	iommu = sdev->bus->iommu;
 	spin_lock_irqsave(&iommu->lock, flags);
@@ -547,16 +557,16 @@
 {
 	struct sbus_iommu *iommu = sdev->bus->iommu;
 	unsigned long flags, size;
-	u32 base;
+	u64 base;
 	int i;
 
-	base = sg[0].dvma_address & PAGE_MASK;
+	base = sg[0].dma_address & PAGE_MASK;
 	for (i = 0; i < nents; i++) {
-		if (sg[i].dvma_length == 0)
+		if (sg[i].dma_length == 0)
 			break;
 	}
 	i--;
-	size = PAGE_ALIGN(sg[i].dvma_address + sg[i].dvma_length) - base;
+	size = PAGE_ALIGN(sg[i].dma_address + sg[i].dma_length) - base;
 
 	spin_lock_irqsave(&iommu->lock, flags);
 	strbuf_flush(iommu, base, size >> PAGE_SHIFT);
--- ./arch/sparc64/kernel/iommu_common.h.~1~	Mon Aug 13 11:04:26 2001
+++ ./arch/sparc64/kernel/iommu_common.h	Wed Aug 15 07:00:20 2001
@@ -18,10 +18,7 @@
 #undef VERIFY_SG
 
 #ifdef VERIFY_SG
-int verify_lengths(struct scatterlist *sg, int nents, int npages);
-int verify_one_map(struct scatterlist *dma_sg, struct scatterlist **__sg, int nents, iopte_t **__iopte);
-int verify_maps(struct scatterlist *sg, int nents, iopte_t *iopte);
-void verify_sglist(struct scatterlist *sg, int nents, iopte_t *iopte, int npages);
+extern void verify_sglist(struct scatterlist *sg, int nents, iopte_t *iopte, int npages);
 #endif
 
 /* Two addresses are "virtually contiguous" if and only if:
@@ -31,4 +28,4 @@
 #define VCONTIG(__X, __Y)	(((__X) == (__Y)) || \
 				 (((__X) | (__Y)) << (64UL - PAGE_SHIFT)) == 0UL)
 
-unsigned long prepare_sg(struct scatterlist *sg, int nents);
+extern unsigned long prepare_sg(struct scatterlist *sg, int nents);
--- ./arch/sparc64/kernel/sparc64_ksyms.c.~1~	Mon Jun  4 20:39:50 2001
+++ ./arch/sparc64/kernel/sparc64_ksyms.c	Wed Aug 15 06:23:37 2001
@@ -216,10 +216,16 @@
 EXPORT_SYMBOL(pci_free_consistent);
 EXPORT_SYMBOL(pci_map_single);
 EXPORT_SYMBOL(pci_unmap_single);
+EXPORT_SYMBOL(pci64_map_page);
+EXPORT_SYMBOL(pci64_unmap_page);
 EXPORT_SYMBOL(pci_map_sg);
 EXPORT_SYMBOL(pci_unmap_sg);
+EXPORT_SYMBOL(pci64_map_sg);
+EXPORT_SYMBOL(pci64_unmap_sg);
 EXPORT_SYMBOL(pci_dma_sync_single);
+EXPORT_SYMBOL(pci64_dma_sync_single);
 EXPORT_SYMBOL(pci_dma_sync_sg);
+EXPORT_SYMBOL(pci64_dma_sync_sg);
 EXPORT_SYMBOL(pci_dma_supported);
 #endif
 
--- ./arch/ia64/sn/io/pci_dma.c.~1~	Fri Apr 13 22:44:54 2001
+++ ./arch/ia64/sn/io/pci_dma.c	Thu Aug 16 04:50:22 2001
@@ -182,7 +182,7 @@
 }
 
 /*
- * On sn1 we use the alt_address entry of the scatterlist to store
+ * On sn1 we use the orig_address entry of the scatterlist to store
  * the physical address corresponding to the given virtual address
  */
 int
--- ./arch/parisc/kernel/ccio-dma.c.~1~	Sun Feb 11 23:53:07 2001
+++ ./arch/parisc/kernel/ccio-dma.c	Wed Aug 15 03:07:31 2001
@@ -638,7 +638,7 @@
 }
 
 
-static int ccio_dma_supported( struct pci_dev *dev, dma_addr_t mask)
+static int ccio_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	if (dev == NULL) {
 		printk(MODULE_NAME ": EISA/ISA/et al not supported\n");
--- ./arch/parisc/kernel/pci-dma.c.~1~	Sun Feb 11 23:53:07 2001
+++ ./arch/parisc/kernel/pci-dma.c	Wed Aug 15 03:07:50 2001
@@ -77,7 +77,7 @@
 static inline void dump_resmap(void) {;}
 #endif
 
-static int pa11_dma_supported( struct pci_dev *dev, dma_addr_t mask)
+static int pa11_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	return 1;
 }
--- ./arch/parisc/kernel/ccio-rm-dma.c.~1~	Wed Dec  6 05:30:33 2000
+++ ./arch/parisc/kernel/ccio-rm-dma.c	Wed Aug 15 03:07:42 2001
@@ -93,7 +93,7 @@
 }
 
 
-static int ccio_dma_supported( struct pci_dev *dev, dma_addr_t mask)  
+static int ccio_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	if (dev == NULL) {
 		printk(MODULE_NAME ": EISA/ISA/et al not supported\n");
--- ./arch/parisc/kernel/sba_iommu.c.~1~	Sun Feb 11 23:53:07 2001
+++ ./arch/parisc/kernel/sba_iommu.c	Wed Aug 15 03:07:56 2001
@@ -779,7 +779,7 @@
 }
 
 static int
-sba_dma_supported( struct pci_dev *dev, dma_addr_t mask)
+sba_dma_supported( struct pci_dev *dev, u64 mask)
 {
 	if (dev == NULL) {
 		printk(MODULE_NAME ": EISA/ISA/et al not supported\n");
--- ./drivers/net/acenic.c.~1~	Mon Aug 13 09:55:44 2001
+++ ./drivers/net/acenic.c	Wed Aug 15 02:33:22 2001
@@ -202,6 +202,7 @@
 #define pci_free_consistent(cookie, size, ptr, dma_ptr)	kfree(ptr)
 #define pci_map_single(cookie, address, size, dir)	virt_to_bus(address)
 #define pci_unmap_single(cookie, address, size, dir)
+#define pci_set_dma_mask(dev, mask)		do { } while (0)
 #endif
 
 #if (LINUX_VERSION_CODE < 0x02032b)
@@ -258,11 +259,6 @@
 #define ace_mark_net_bh()			{do{} while(0);}
 #define ace_if_down(dev)			{do{} while(0);}
 #endif
-
-#ifndef pci_set_dma_mask
-#define pci_set_dma_mask(dev, mask)		dev->dma_mask = mask;
-#endif
-
 
 #if (LINUX_VERSION_CODE >= 0x02031b)
 #define NEW_NETINIT
--- ./drivers/pci/pci.c.~1~	Mon Aug 13 22:05:39 2001
+++ ./drivers/pci/pci.c	Wed Aug 15 06:22:34 2001
@@ -832,7 +832,7 @@
 }
 
 int
-pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask)
+pci_set_dma_mask(struct pci_dev *dev, u64 mask)
 {
     if(! pci_dma_supported(dev, mask))
         return -EIO;
@@ -842,6 +842,12 @@
     return 0;
 }
     
+void
+pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off)
+{
+	dev->dma_flags |= on;
+	dev->dma_flags &= ~off;
+}
 
 /*
  * Translate the low bits of the PCI base
@@ -1954,6 +1960,7 @@
 EXPORT_SYMBOL(pci_find_subsys);
 EXPORT_SYMBOL(pci_set_master);
 EXPORT_SYMBOL(pci_set_dma_mask);
+EXPORT_SYMBOL(pci_change_dma_flag);
 EXPORT_SYMBOL(pci_assign_resource);
 EXPORT_SYMBOL(pci_register_driver);
 EXPORT_SYMBOL(pci_unregister_driver);
--- ./drivers/scsi/sr.c.~1~	Thu Jul  5 22:11:52 2001
+++ ./drivers/scsi/sr.c	Thu Aug 16 04:59:37 2001
@@ -265,6 +265,7 @@
 	struct scatterlist *sg, *old_sg = NULL;
 	int i, fsize, bsize, sg_ent, sg_count;
 	char *front, *back;
+	void **bbpnt, **old_bbpnt = NULL;
 
 	back = front = NULL;
 	sg_ent = SCpnt->use_sg;
@@ -292,17 +293,25 @@
 	 * extend or allocate new scatter-gather table
 	 */
 	sg_count = SCpnt->use_sg;
-	if (sg_count)
+	if (sg_count) {
 		old_sg = (struct scatterlist *) SCpnt->request_buffer;
-	else {
+		old_bbpnt = SCpnt->bounce_buffers;
+	} else {
 		sg_count = 1;
 		sg_ent++;
 	}
 
-	i = ((sg_ent * sizeof(struct scatterlist)) + 511) & ~511;
+	/* Get space for scatterlist and bounce buffer array. */
+	i  = sg_ent * sizeof(struct scatterlist);
+	i += sg_ent * sizeof(void *);
+	i  = (i + 511) & ~511;
+
 	if ((sg = scsi_malloc(i)) == NULL)
 		goto no_mem;
 
+	bbpnt = (void **)
+		((char *)sg + (sg_ent * sizeof(struct scatterlist)));
+
 	/*
 	 * no more failing memory allocs possible, we can safely assign
 	 * SCpnt values now
@@ -313,13 +322,15 @@
 
 	i = 0;
 	if (fsize) {
-		sg[0].address = sg[0].alt_address = front;
+		sg[0].address = bbpnt[0] = front;
 		sg[0].length = fsize;
 		i++;
 	}
 	if (old_sg) {
 		memcpy(sg + i, old_sg, SCpnt->use_sg * sizeof(struct scatterlist));
-		scsi_free(old_sg, ((SCpnt->use_sg * sizeof(struct scatterlist)) + 511) & ~511);
+		memcpy(bbpnt + i, old_bbpnt, SCpnt->use_sg * sizeof(void *));
+		scsi_free(old_sg, (((SCpnt->use_sg * sizeof(struct scatterlist)) +
+				    (SCpnt->use_sg * sizeof(void *))) + 511) & ~511);
 	} else {
 		sg[i].address = SCpnt->request_buffer;
 		sg[i].length = SCpnt->request_bufflen;
@@ -327,11 +338,12 @@
 
 	SCpnt->request_bufflen += (fsize + bsize);
 	SCpnt->request_buffer = sg;
+	SCpnt->bounce_buffers = bbpnt;
 	SCpnt->use_sg += i;
 
 	if (bsize) {
 		sg[SCpnt->use_sg].address = back;
-		sg[SCpnt->use_sg].alt_address = back;
+		bbpnt[SCpnt->use_sg] = back;
 		sg[SCpnt->use_sg].length = bsize;
 		SCpnt->use_sg++;
 	}
--- ./drivers/scsi/sym53c8xx.c.~1~	Thu Jul  5 22:11:53 2001
+++ ./drivers/scsi/sym53c8xx.c	Wed Aug 15 02:34:39 2001
@@ -13101,7 +13101,7 @@
 		(int) (PciDeviceFn(pdev) & 7));
 
 #ifdef SCSI_NCR_DYNAMIC_DMA_MAPPING
-	if (pci_set_dma_mask(pdev, (dma_addr_t) (0xffffffffUL))) {
+	if (pci_set_dma_mask(pdev, 0xffffffff)) {
 		printk(KERN_WARNING NAME53C8XX
 		       "32 BIT PCI BUS DMA ADDRESSING NOT SUPPORTED\n");
 		return -1;
--- ./drivers/scsi/sym53c8xx_comm.h.~1~	Tue Aug 14 21:43:17 2001
+++ ./drivers/scsi/sym53c8xx_comm.h	Wed Aug 15 03:05:23 2001
@@ -2186,7 +2186,7 @@
 		(int) (PciDeviceFn(pdev) & 7));
 
 #ifdef SCSI_NCR_DYNAMIC_DMA_MAPPING
-	if (!pci_dma_supported(pdev, (dma_addr_t) (0xffffffffUL))) {
+	if (!pci_dma_supported(pdev, 0xffffffff)) {
 		printk(KERN_WARNING NAME53C8XX
 		       "32 BIT PCI BUS DMA ADDRESSING NOT SUPPORTED\n");
 		return -1;
--- ./drivers/scsi/aha1542.c.~1~	Tue May  1 18:50:05 2001
+++ ./drivers/scsi/aha1542.c	Thu Aug 16 04:21:56 2001
@@ -67,12 +67,10 @@
 		       int nseg,
 		       int badseg)
 {
-	printk(KERN_CRIT "sgpnt[%d:%d] addr %p/0x%lx alt %p/0x%lx length %d\n",
+	printk(KERN_CRIT "sgpnt[%d:%d] addr %p/0x%lx length %d\n",
 	       badseg, nseg,
 	       sgpnt[badseg].address,
 	       SCSI_PA(sgpnt[badseg].address),
-	       sgpnt[badseg].alt_address,
-	       sgpnt[badseg].alt_address ? SCSI_PA(sgpnt[badseg].alt_address) : 0,
 	       sgpnt[badseg].length);
 
 	/*
@@ -716,7 +714,7 @@
 				unsigned char *ptr;
 				printk(KERN_CRIT "Bad segment list supplied to aha1542.c (%d, %d)\n", SCpnt->use_sg, i);
 				for (i = 0; i < SCpnt->use_sg; i++) {
-					printk(KERN_CRIT "%d: %x %x %d\n", i, (unsigned int) sgpnt[i].address, (unsigned int) sgpnt[i].alt_address,
+					printk(KERN_CRIT "%d: %p %d\n", i, sgpnt[i].address,
 					       sgpnt[i].length);
 				};
 				printk(KERN_CRIT "cptr %x: ", (unsigned int) cptr);
--- ./drivers/scsi/osst.c.~1~	Mon Jul 23 05:12:27 2001
+++ ./drivers/scsi/osst.c	Thu Aug 16 04:22:30 2001
@@ -4933,7 +4933,6 @@
 			tb->sg[0].address =
 			    (unsigned char *)__get_free_pages(priority, order);
 			if (tb->sg[0].address != NULL) {
-			    tb->sg[0].alt_address = NULL;
 			    tb->sg[0].length = b_size;
 			    break;
 			}
@@ -4969,7 +4968,6 @@
 				tb = NULL;
 				break;
 			    }
-			    tb->sg[segs].alt_address = NULL;
 			    tb->sg[segs].length = b_size;
 			    got += b_size;
 			    segs++;
@@ -5043,7 +5041,6 @@
 			normalize_buffer(STbuffer);
 			return FALSE;
 		}
-		STbuffer->sg[segs].alt_address = NULL;
 		STbuffer->sg[segs].length = b_size;
 		STbuffer->sg_segs += 1;
 		got += b_size;
--- ./drivers/scsi/scsi_debug.c.~1~	Tue Nov 28 08:33:08 2000
+++ ./drivers/scsi/scsi_debug.c	Thu Aug 16 04:23:40 2001
@@ -154,10 +154,7 @@
 	if (SCpnt->use_sg) {
 		sgpnt = (struct scatterlist *) SCpnt->buffer;
 		for (i = 0; i < SCpnt->use_sg; i++) {
-			lpnt = (int *) sgpnt[i].alt_address;
-			printk(":%p %p %d\n", sgpnt[i].alt_address, sgpnt[i].address, sgpnt[i].length);
-			if (lpnt)
-				printk(" (Alt %x) ", lpnt[15]);
+			printk(":%p %d\n", sgpnt[i].address, sgpnt[i].length);
 		};
 	} else {
 		printk("nosg: %p %p %d\n", SCpnt->request.buffer, SCpnt->buffer,
@@ -175,12 +172,6 @@
 	printk("\n");
 	if (flag == 0)
 		return;
-	lpnt = (unsigned int *) sgpnt[0].alt_address;
-	for (i = 0; i < sizeof(Scsi_Cmnd) / 4 + 1; i++) {
-		if ((i & 7) == 0)
-			printk("\n");
-		printk("%x ", *lpnt++);
-	};
 #if 0
 	printk("\n");
 	lpnt = (unsigned int *) sgpnt[0].address;
--- ./drivers/scsi/scsi.h.~1~	Wed Aug 15 06:48:23 2001
+++ ./drivers/scsi/scsi.h	Thu Aug 16 04:41:17 2001
@@ -745,7 +745,8 @@
 	unsigned request_bufflen;	/* Actual request size */
 
 	struct timer_list eh_timeout;	/* Used to time out the command. */
-	void *request_buffer;	/* Actual requested buffer */
+	void *request_buffer;		/* Actual requested buffer */
+	void **bounce_buffers;		/* Array of bounce buffers when using scatter-gather */
 
 	/* These elements define the operation we ultimately want to perform */
 	unsigned char data_cmnd[MAX_COMMAND_SIZE];
--- ./drivers/scsi/scsi_merge.c.~1~	Thu Jul  5 22:11:52 2001
+++ ./drivers/scsi/scsi_merge.c	Thu Aug 16 04:58:34 2001
@@ -120,9 +120,11 @@
 {
 	int jj;
 	struct scatterlist *sgpnt;
+	void **bbpnt;
 	int consumed = 0;
 
 	sgpnt = (struct scatterlist *) SCpnt->request_buffer;
+	bbpnt = SCpnt->bounce_buffers;
 
 	/*
 	 * Now print out a bunch of stats.  First, start with the request
@@ -136,15 +138,13 @@
 	 */
 	for(jj=0; jj < SCpnt->use_sg; jj++)
 	{
-		printk("[%d]\tlen:%d\taddr:%p\talt:%p\n",
+		printk("[%d]\tlen:%d\taddr:%p\tbounce:%p\n",
 		       jj,
 		       sgpnt[jj].length,
 		       sgpnt[jj].address,
-		       sgpnt[jj].alt_address);		       
-		if( sgpnt[jj].alt_address != NULL )
-		{
-			consumed = (sgpnt[jj].length >> 9);
-		}
+		       (bbpnt ? bbpnt[jj] : NULL));
+		if (bbpnt && bbpnt[jj])
+			consumed += sgpnt[jj].length >> 9;
 	}
 	printk("Total %d sectors consumed\n", consumed);
 	panic("DMA pool exhausted");
@@ -807,6 +807,7 @@
 	int		     sectors;
 	struct scatterlist * sgpnt;
 	int		     this_count;
+	void		   ** bbpnt;
 
 	/*
 	 * FIXME(eric) - don't inline this - it doesn't depend on the
@@ -861,10 +862,19 @@
 
 	/* 
 	 * Allocate the actual scatter-gather table itself.
-	 * scsi_malloc can only allocate in chunks of 512 bytes 
 	 */
-	SCpnt->sglist_len = (SCpnt->use_sg
-			     * sizeof(struct scatterlist) + 511) & ~511;
+	SCpnt->sglist_len = (SCpnt->use_sg * sizeof(struct scatterlist));
+
+	/* If we could potentially require ISA bounce buffers, allocate
+	 * space for this array here.
+	 */
+	if (dma_host)
+		SCpnt->sglist_len += (SCpnt->use_sg * sizeof(void *));
+
+	/* scsi_malloc can only allocate in chunks of 512 bytes so
+	 * round it up.
+	 */
+	SCpnt->sglist_len = (SCpnt->sglist_len + 511) & ~511;
 
 	sgpnt = (struct scatterlist *) scsi_malloc(SCpnt->sglist_len);
 
@@ -889,6 +899,14 @@
 	SCpnt->request_bufflen = 0;
 	bhprev = NULL;
 
+	if (dma_host)
+		bbpnt = (void **) ((char *)sgpnt +
+			 (SCpnt->use_sg * sizeof(struct scatterlist)));
+	else
+		bbpnt = NULL;
+
+	SCpnt->bounce_buffers = bbpnt;
+
 	for (count = 0, bh = SCpnt->request.bh;
 	     bh; bh = bh->b_reqnext) {
 		if (use_clustering && bhprev != NULL) {
@@ -956,7 +974,7 @@
 			if( scsi_dma_free_sectors - sectors <= 10  ) {
 				/*
 				 * If this would nearly drain the DMA
-				 * pool, mpty, then let's stop here.
+				 * pool empty, then let's stop here.
 				 * Don't make this request any larger.
 				 * This is kind of a safety valve that
 				 * we use - we could get screwed later
@@ -970,7 +988,7 @@
 				break;
 			}
 
-			sgpnt[i].alt_address = sgpnt[i].address;
+			bbpnt[i] = sgpnt[i].address;
 			sgpnt[i].address =
 			    (char *) scsi_malloc(sgpnt[i].length);
 			/*
@@ -987,7 +1005,7 @@
 				break;
 			}
 			if (SCpnt->request.cmd == WRITE) {
-				memcpy(sgpnt[i].address, sgpnt[i].alt_address,
+				memcpy(sgpnt[i].address, bbpnt[i],
 				       sgpnt[i].length);
 			}
 		}
--- ./drivers/scsi/scsi_lib.c.~1~	Sun Aug 12 23:50:32 2001
+++ ./drivers/scsi/scsi_lib.c	Thu Aug 16 04:41:34 2001
@@ -496,13 +496,16 @@
 	 */
 	if (SCpnt->use_sg) {
 		struct scatterlist *sgpnt;
+		void **bbpnt;
 		int i;
 
 		sgpnt = (struct scatterlist *) SCpnt->request_buffer;
+		bbpnt = SCpnt->bounce_buffers;
 
-		for (i = 0; i < SCpnt->use_sg; i++) {
-			if (sgpnt[i].alt_address) {
-				scsi_free(sgpnt[i].address, sgpnt[i].length);
+		if (bbpnt) {
+			for (i = 0; i < SCpnt->use_sg; i++) {
+				if (bbpnt[i])
+					scsi_free(sgpnt[i].address, sgpnt[i].length);
 			}
 		}
 		scsi_free(SCpnt->request_buffer, SCpnt->sglist_len);
@@ -568,18 +571,22 @@
 	 */
 	if (SCpnt->use_sg) {
 		struct scatterlist *sgpnt;
+		void **bbpnt;
 		int i;
 
 		sgpnt = (struct scatterlist *) SCpnt->buffer;
+		bbpnt = SCpnt->bounce_buffers;
 
-		for (i = 0; i < SCpnt->use_sg; i++) {
-			if (sgpnt[i].alt_address) {
-				if (SCpnt->request.cmd == READ) {
-					memcpy(sgpnt[i].alt_address, 
-					       sgpnt[i].address,
-					       sgpnt[i].length);
+		if (bbpnt) {
+			for (i = 0; i < SCpnt->use_sg; i++) {
+				if (bbpnt[i]) {
+					if (SCpnt->request.cmd == READ) {
+						memcpy(bbpnt[i],
+						       sgpnt[i].address,
+						       sgpnt[i].length);
+					}
+					scsi_free(sgpnt[i].address, sgpnt[i].length);
 				}
-				scsi_free(sgpnt[i].address, sgpnt[i].length);
 			}
 		}
 		scsi_free(SCpnt->buffer, SCpnt->sglist_len);
--- ./drivers/scsi/st.c.~1~	Sun Aug 12 23:50:32 2001
+++ ./drivers/scsi/st.c	Thu Aug 16 04:48:18 2001
@@ -3222,7 +3222,6 @@
 			tb->sg[0].address =
 			    (unsigned char *) __get_free_pages(priority, order);
 			if (tb->sg[0].address != NULL) {
-				tb->sg[0].alt_address = NULL;
 				tb->sg[0].length = b_size;
 				break;
 			}
@@ -3258,7 +3257,6 @@
 					tb = NULL;
 					break;
 				}
-				tb->sg[segs].alt_address = NULL;
 				tb->sg[segs].length = b_size;
 				got += b_size;
 				segs++;
@@ -3332,7 +3330,6 @@
 			normalize_buffer(STbuffer);
 			return FALSE;
 		}
-		STbuffer->sg[segs].alt_address = NULL;
 		STbuffer->sg[segs].length = b_size;
 		STbuffer->sg_segs += 1;
 		got += b_size;
--- ./drivers/scsi/qlogicfc.c.~1~	Sun Aug 12 23:50:32 2001
+++ ./drivers/scsi/qlogicfc.c	Wed Aug 15 06:53:38 2001
@@ -65,7 +65,7 @@
 
 #if 1
 /* Once pci64_ DMA mapping interface is in, kill this. */
-typedef dma_addr_t dma64_addr_t;
+#define dma64_addr_t dma_addr_t
 #define pci64_alloc_consistent(d,s,p) pci_alloc_consistent((d),(s),(p))
 #define pci64_free_consistent(d,s,c,a) pci_free_consistent((d),(s),(c),(a))
 #define pci64_map_single(d,c,s,dir) pci_map_single((d),(c),(s),(dir))
@@ -80,6 +80,7 @@
 #define pci64_dma_lo32(a) (a)
 #endif	/* BITS_PER_LONG */
 #define pci64_dma_build(hi,lo) (lo)
+#undef sg_dma64_address
 #define sg_dma64_address(s) sg_dma_address(s)
 #define sg_dma64_len(s) sg_dma_len(s)
 #if BITS_PER_LONG > 32
--- ./include/asm-alpha/pci.h.~1~	Wed May 23 17:57:18 2001
+++ ./include/asm-alpha/pci.h	Wed Aug 15 03:05:38 2001
@@ -144,7 +144,7 @@
    only drive the low 24-bits during PCI bus mastering, then
    you would pass 0x00ffffff as the mask to this function.  */
 
-extern int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask);
+extern int pci_dma_supported(struct pci_dev *hwdev, u64 mask);
 
 /* Return the index of the PCI controller for device PDEV. */
 extern int pci_controller_num(struct pci_dev *pdev);
--- ./include/asm-alpha/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-alpha/scatterlist.h	Thu Aug 16 04:50:38 2001
@@ -5,8 +5,6 @@
 
 struct scatterlist {
 	char *address;			/* Source/target vaddr.  */
-	char *alt_address;		/* Location of actual if address is a
-					   dma indirect buffer, else NULL.  */
 	dma_addr_t dma_address;
 	unsigned int length;
 	unsigned int dma_length;
--- ./include/asm-arm/pci.h.~1~	Sun Aug 12 23:50:35 2001
+++ ./include/asm-arm/pci.h	Wed Aug 15 03:05:45 2001
@@ -152,7 +152,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-static inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+static inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-arm/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-arm/scatterlist.h	Thu Aug 16 04:50:52 2001
@@ -5,7 +5,6 @@
 
 struct scatterlist {
 	char		*address;	/* virtual address		 */
-	char		*alt_address;	/* indirect dma address, or NULL */
 	dma_addr_t	dma_address;	/* dma address			 */
 	unsigned int	length;		/* length			 */
 };
--- ./include/asm-i386/pci.h.~1~	Fri Jul 27 02:21:22 2001
+++ ./include/asm-i386/pci.h	Wed Aug 15 06:47:07 2001
@@ -55,6 +55,9 @@
 extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
 				void *vaddr, dma_addr_t dma_handle);
 
+/* This is always fine. */
+#define pci_dac_cycles_ok(pci_dev)		(1)
+
 /* Map a single buffer of the indicated size for DMA in streaming mode.
  * The 32-bit bus address to use is returned.
  *
@@ -84,6 +87,46 @@
 	/* Nothing to do */
 }
 
+/*
+ * pci_{map,unmap}_page maps a kernel page to a dma_addr_t. Identical
+ * to pci_map_single, but takes a struct page instead of a virtual address.
+ */
+static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
+				      unsigned long offset, size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	return (page - mem_map) * PAGE_SIZE + offset;
+}
+
+static inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+				  size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
+/* 64-bit variants */
+static inline dma64_addr_t pci64_map_page(struct pci_dev *hwdev, struct page *page,
+					  unsigned long offset, size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+
+	return (((dma64_addr_t) (page - mem_map)) *
+		((dma64_addr_t) PAGE_SIZE)) + (dma64_addr_t) offset;
+}
+
+static inline void pci64_unmap_page(struct pci_dev *hwdev, dma64_addr_t dma_address,
+				    size_t size, int direction)
+{
+	if (direction == PCI_DMA_NONE)
+		BUG();
+	/* Nothing to do */
+}
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scather-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -102,8 +145,26 @@
 static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
 			     int nents, int direction)
 {
+	int i;
+
 	if (direction == PCI_DMA_NONE)
 		BUG();
+
+	/*
+	 * temporary 2.4 hack
+	 */
+	for (i = 0; i < nents; i++ ) {
+		if (sg[i].address && sg[i].page)
+			BUG();
+		else if (!sg[i].address && !sg[i].page)
+			BUG();
+
+		if (sg[i].address)
+			sg[i].dma_address = virt_to_bus(sg[i].address);
+		else
+			sg[i].dma_address = page_to_bus(sg[i].page) + sg[i].offset;
+	}
+
 	return nents;
 }
 
@@ -119,6 +180,9 @@
 	/* Nothing to do */
 }
 
+#define pci64_map_sg	pci_map_sg
+#define pci64_unmap_sg	pci_unmap_sg
+
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
  *
@@ -152,12 +216,15 @@
 	/* Nothing to do */
 }
 
+#define pci64_dma_sync_single	pci_dma_sync_single
+#define pci64_dma_sync_sg	pci_dma_sync_sg
+
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-static inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+static inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
         /*
          * we fall back to GFP_DMA when the mask isn't all 1s,
@@ -173,10 +240,10 @@
 /* These macros should be used after a pci_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries pci_map_sg
- * returns, or alternatively stop on the first sg_dma_len(sg) which
- * is 0.
+ * returns.
  */
-#define sg_dma_address(sg)	(virt_to_bus((sg)->address))
+#define sg_dma_address(sg)	((dma_addr_t) ((sg)->dma_address))
+#define sg_dma64_address(sg)	((sg)->dma_address)
 #define sg_dma_len(sg)		((sg)->length)
 
 /* Return the index of the PCI controller for device. */
--- ./include/asm-i386/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-i386/scatterlist.h	Thu Aug 16 04:51:00 2001
@@ -2,9 +2,12 @@
 #define _I386_SCATTERLIST_H
 
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
+    char *  address;    /* Location data is to be transferred to, NULL for
+			 * highmem page */
+    struct page * page; /* Location for highmem page, if any */
+    unsigned int offset;/* for highmem, page offset */
+
+    dma64_addr_t dma_address;
     unsigned int length;
 };
 
--- ./include/asm-i386/types.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-i386/types.h	Wed Aug 15 06:32:48 2001
@@ -27,6 +27,8 @@
  */
 #ifdef __KERNEL__
 
+#include <linux/config.h>
+
 typedef signed char s8;
 typedef unsigned char u8;
 
@@ -44,6 +46,11 @@
 /* Dma addresses are 32-bits wide.  */
 
 typedef u32 dma_addr_t;
+#ifdef CONFIG_HIGHMEM
+typedef u64 dma64_addr_t;
+#else
+typedef u32 dma64_addr_t;
+#endif
 
 #endif /* __KERNEL__ */
 
--- ./include/asm-m68k/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-m68k/scatterlist.h	Thu Aug 16 04:51:09 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
     unsigned long dvma_address;
 };
--- ./include/asm-mips/pci.h.~1~	Tue Jul  3 18:14:13 2001
+++ ./include/asm-mips/pci.h	Wed Aug 15 03:06:03 2001
@@ -206,7 +206,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	/*
 	 * we fall back to GFP_DMA when the mask isn't all 1s,
--- ./include/asm-mips/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-mips/scatterlist.h	Thu Aug 16 04:51:17 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
     
     __u32 dvma_address;
--- ./include/asm-ppc/pci.h.~1~	Wed May 23 17:57:21 2001
+++ ./include/asm-ppc/pci.h	Wed Aug 15 03:06:10 2001
@@ -108,7 +108,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-ppc/scatterlist.h.~1~	Wed May 23 17:57:21 2001
+++ ./include/asm-ppc/scatterlist.h	Thu Aug 16 04:51:23 2001
@@ -9,8 +9,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
 };
 
--- ./include/asm-sparc/pci.h.~1~	Sat May 12 03:47:41 2001
+++ ./include/asm-sparc/pci.h	Wed Aug 15 03:06:17 2001
@@ -108,7 +108,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-sparc/scatterlist.h.~1~	Thu Dec 14 14:52:04 2000
+++ ./include/asm-sparc/scatterlist.h	Thu Aug 16 04:51:30 2001
@@ -6,8 +6,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
 
     __u32 dvma_address; /* A place to hang host-specific addresses at. */
--- ./include/asm-sparc64/types.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-sparc64/types.h	Wed Aug 15 02:13:44 2001
@@ -45,9 +45,10 @@
 
 #define BITS_PER_LONG 64
 
-/* Dma addresses are 32-bits wide for now.  */
+/* Dma addresses come in 32-bit and 64-bit flavours.  */
 
 typedef u32 dma_addr_t;
+typedef u64 dma64_addr_t;
 
 #endif /* __KERNEL__ */
 
--- ./include/asm-sparc64/pci.h.~1~	Tue Aug 14 21:31:07 2001
+++ ./include/asm-sparc64/pci.h	Wed Aug 15 06:43:53 2001
@@ -28,6 +28,15 @@
 /* Dynamic DMA mapping stuff.
  */
 
+/* PCI 64-bit addressing works for all slots on all controller
+ * types on sparc64.  However, it requires that the device
+ * can drive enough of the 64 bits.
+ */
+#define PCI64_ADDR_BASE		0x3fff000000000000
+#define PCI64_REQUIRED_MASK	(~(dma64_addr_t)0)
+#define pci_dac_cycles_ok(pci_dev) \
+	(((pci_dev)->dma_mask & PCI64_REQUIRED_MASK) == PCI64_REQUIRED_MASK)
+
 #include <asm/scatterlist.h>
 
 struct pci_dev;
@@ -64,6 +73,20 @@
  */
 extern void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr, size_t size, int direction);
 
+/* No highmem on sparc64, plus we have an IOMMU, so mapping pages is easy. */
+#define pci_map_page(dev, page, off, size, dir) \
+	pci_map_single(dev, (page_address(page) + (off)), size, dir)
+#define pci_unmap_page(dev,addr,sz,dir) pci_unmap_single(dev,addr,sz,dir)
+
+/* The 64-bit cases might have to do something interesting if
+ * PCI_DMA_FLAG_HUGE_MAPS is set in hwdev->dma_flags.
+ */
+extern dma64_addr_t pci64_map_page(struct pci_dev *hwdev,
+				   struct page *page, unsigned long offset,
+				   size_t size, int direction);
+extern void pci64_unmap_page(struct pci_dev *hwdev, dma64_addr_t dma_addr,
+			     size_t size, int direction);
+
 /* Map a set of buffers described by scatterlist in streaming
  * mode for DMA.  This is the scather-gather version of the
  * above pci_map_single interface.  Here the scatter gather list
@@ -79,13 +102,19 @@
  * Device ownership issues as mentioned above for pci_map_single are
  * the same here.
  */
-extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nents, int direction);
+extern int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		      int nents, int direction);
+extern int pci64_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			int nents, int direction);
 
 /* Unmap a set of streaming mode DMA translations.
  * Again, cpu read rules concerning calls here are the same as for
  * pci_unmap_single() above.
  */
-extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nhwents, int direction);
+extern void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			 int nhwents, int direction);
+extern void pci64_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+			   int nhwents, int direction);
 
 /* Make physical memory consistent for a single
  * streaming mode DMA translation after a transfer.
@@ -96,7 +125,10 @@
  * next point you give the PCI dma address back to the card, the
  * device again owns the buffer.
  */
-extern void pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle, size_t size, int direction);
+extern void pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+				size_t size, int direction);
+extern void pci64_dma_sync_single(struct pci_dev *hwdev, dma64_addr_t dma_handle,
+				  size_t size, int direction);
 
 /* Make physical memory consistent for a set of streaming
  * mode DMA translations after a transfer.
@@ -105,13 +137,14 @@
  * same rules and usage.
  */
 extern void pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nelems, int direction);
+extern void pci64_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg, int nelems, int direction);
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask);
+extern int pci_dma_supported(struct pci_dev *hwdev, u64 mask);
 
 /* Return the index of the PCI controller for device PDEV. */
 
--- ./include/asm-sparc64/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-sparc64/scatterlist.h	Thu Aug 16 04:51:36 2001
@@ -5,17 +5,24 @@
 #include <asm/page.h>
 
 struct scatterlist {
-    char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
-    unsigned int length;
+	/* This will disappear in 2.5.x */
+	char *address;
 
-    __u32 dvma_address; /* A place to hang host-specific addresses at. */
-    __u32 dvma_length;
+	/* These two are only valid if ADDRESS member of this
+	 * struct is NULL.
+	 */
+	struct page *page;
+	unsigned int offset;
+
+	unsigned int length;
+
+	dma64_addr_t dma_address;
+	__u32 dma_length;
 };
 
-#define sg_dma_address(sg) ((sg)->dvma_address)
-#define sg_dma_len(sg)     ((sg)->dvma_length)
+#define sg_dma_address(sg)	((dma_addr_t) ((sg)->dma_address))
+#define sg_dma64_address(sg)	((sg)->dma_address)
+#define sg_dma_len(sg)     	((sg)->dma_length)
 
 #define ISA_DMA_THRESHOLD	(~0UL)
 
--- ./include/linux/pci.h.~1~	Tue Aug 14 21:31:11 2001
+++ ./include/linux/pci.h	Wed Aug 15 06:43:53 2001
@@ -314,6 +314,12 @@
 #define PCI_DMA_FROMDEVICE	2
 #define PCI_DMA_NONE		3
 
+/* These are the boolean attributes stored in pci_dev->dma_flags. */
+#define PCI_DMA_FLAG_HUGE_MAPS	0x00000001 /* Device may hold an enormous number
+					    * of mappings at once.
+					    */
+#define PCI_DMA_FLAG_ARCHMASK	0xf0000000 /* Reserved for arch-specific flags */
+
 #define DEVICE_COUNT_COMPATIBLE	4
 #define DEVICE_COUNT_IRQ	2
 #define DEVICE_COUNT_DMA	2
@@ -353,11 +359,12 @@
 
 	struct pci_driver *driver;	/* which driver has allocated this device */
 	void		*driver_data;	/* data private to the driver */
-	dma_addr_t	dma_mask;	/* Mask of the bits of bus address this
+	u64		dma_mask;	/* Mask of the bits of bus address this
 					   device implements.  Normally this is
 					   0xffffffff.  You only need to change
 					   this if your device has broken DMA
 					   or supports 64-bit transfers.  */
+	unsigned int	dma_flags;	/* See PCI_DMA_FLAG_* above */
 
 	u32             current_state;  /* Current operating state. In ACPI-speak,
 					   this is D0-D3, D0 being fully functional,
@@ -559,7 +566,8 @@
 int pci_enable_device(struct pci_dev *dev);
 void pci_disable_device(struct pci_dev *dev);
 void pci_set_master(struct pci_dev *dev);
-int pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask);
+int pci_set_dma_mask(struct pci_dev *dev, u64 mask);
+void pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off);
 int pci_assign_resource(struct pci_dev *dev, int i);
 
 /* Power management related routines */
@@ -641,7 +649,8 @@
 static inline int pci_enable_device(struct pci_dev *dev) { return -EIO; }
 static inline void pci_disable_device(struct pci_dev *dev) { }
 static inline int pci_module_init(struct pci_driver *drv) { return -ENODEV; }
-static inline int pci_set_dma_mask(struct pci_dev *dev, dma_addr_t mask) { return -EIO; }
+static inline int pci_set_dma_mask(struct pci_dev *dev, u64 mask) { return -EIO; }
+static inline void pci_change_dma_flag(struct pci_dev *dev, unsigned int on, unsigned int off) { }
 static inline int pci_assign_resource(struct pci_dev *dev, int i) { return -EBUSY;}
 static inline int pci_register_driver(struct pci_driver *drv) { return 0;}
 static inline void pci_unregister_driver(struct pci_driver *drv) { }
--- ./include/asm-sh/pci.h.~1~	Fri Jun 29 14:26:55 2001
+++ ./include/asm-sh/pci.h	Wed Aug 15 03:06:33 2001
@@ -167,7 +167,7 @@
  * only drive the low 24-bits during PCI bus mastering, then
  * you would pass 0x00ffffff as the mask to this function.
  */
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-sh/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-sh/scatterlist.h	Thu Aug 16 04:51:43 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
 };
 
--- ./include/asm-s390/scatterlist.h.~1~	Fri Feb 16 21:04:19 2001
+++ ./include/asm-s390/scatterlist.h	Thu Aug 16 04:51:50 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
 };
 
--- ./include/asm-ia64/pci.h.~1~	Sat May 12 03:47:41 2001
+++ ./include/asm-ia64/pci.h	Wed Aug 15 03:06:42 2001
@@ -52,7 +52,7 @@
  * you would pass 0x00ffffff as the mask to this function.
  */
 static inline int
-pci_dma_supported (struct pci_dev *hwdev, dma_addr_t mask)
+pci_dma_supported (struct pci_dev *hwdev, u64 mask)
 {
 	return 1;
 }
--- ./include/asm-ia64/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-ia64/scatterlist.h	Thu Aug 16 04:52:17 2001
@@ -8,11 +8,6 @@
 
 struct scatterlist {
 	char *address;		/* location data is to be transferred to */
-	/*
-	 * Location of actual buffer if ADDRESS points to a DMA
-	 * indirection buffer, NULL otherwise:
-	 */
-	char *alt_address;
 	char *orig_address;	/* Save away the original buffer address (used by pci-dma.c) */
 	unsigned int length;	/* buffer length */
 };
--- ./include/asm-mips64/pci.h.~1~	Thu Jul  5 16:52:48 2001
+++ ./include/asm-mips64/pci.h	Wed Aug 15 03:06:49 2001
@@ -195,7 +195,7 @@
 #endif
 }
 
-extern inline int pci_dma_supported(struct pci_dev *hwdev, dma_addr_t mask)
+extern inline int pci_dma_supported(struct pci_dev *hwdev, u64 mask)
 {
 	/*
 	 * we fall back to GFP_DMA when the mask isn't all 1s,
--- ./include/asm-mips64/scatterlist.h.~1~	Tue Nov 28 08:33:08 2000
+++ ./include/asm-mips64/scatterlist.h	Thu Aug 16 04:52:23 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
     
     __u32 dvma_address;
--- ./include/asm-parisc/pci.h.~1~	Sat May 12 03:47:41 2001
+++ ./include/asm-parisc/pci.h	Wed Aug 15 03:07:07 2001
@@ -113,7 +113,7 @@
 ** See Documentation/DMA-mapping.txt
 */
 struct pci_dma_ops {
-	int  (*dma_supported)(struct pci_dev *dev, dma_addr_t mask);
+	int  (*dma_supported)(struct pci_dev *dev, u64 mask);
 	void *(*alloc_consistent)(struct pci_dev *dev, size_t size, dma_addr_t *iova);
 	void (*free_consistent)(struct pci_dev *dev, size_t size, void *vaddr, dma_addr_t iova);
 	dma_addr_t (*map_single)(struct pci_dev *dev, void *addr, size_t size, int direction);
--- ./include/asm-parisc/scatterlist.h.~1~	Wed Dec  6 05:30:34 2000
+++ ./include/asm-parisc/scatterlist.h	Thu Aug 16 04:52:28 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
 	char *  address;    /* Location data is to be transferred to */
-	char * alt_address; /* Location of actual if address is a 
-			     * dma indirect buffer.  NULL otherwise */
 	unsigned int length;
 
 	/* an IOVA can be 64-bits on some PA-Risc platforms. */
--- ./include/asm-s390x/scatterlist.h.~1~	Fri Feb 16 21:04:20 2001
+++ ./include/asm-s390x/scatterlist.h	Thu Aug 16 04:52:35 2001
@@ -3,8 +3,6 @@
 
 struct scatterlist {
     char *  address;    /* Location data is to be transferred to */
-    char * alt_address; /* Location of actual if address is a 
-			 * dma indirect buffer.  NULL otherwise */
     unsigned int length;
 };
 
--- ./net/ipv6/mcast.c.~1~	Wed Apr 25 13:46:34 2001
+++ ./net/ipv6/mcast.c	Wed Aug 15 00:36:31 2001
@@ -5,7 +5,7 @@
  *	Authors:
  *	Pedro Roque		<roque@di.fc.ul.pt>	
  *
- *	$Id: mcast.c,v 1.37 2001/04/25 20:46:34 davem Exp $
+ *	$Id: mcast.c,v 1.38 2001/08/15 07:36:31 davem Exp $
  *
  *	Based on linux/ipv4/igmp.c and linux/ipv4/ip_sockglue.c 
  *
@@ -90,7 +90,6 @@
 
 	mc_lst->next = NULL;
 	memcpy(&mc_lst->addr, addr, sizeof(struct in6_addr));
-	mc_lst->ifindex = ifindex;
 
 	if (ifindex == 0) {
 		struct rt6_info *rt;
@@ -107,6 +106,8 @@
 		sock_kfree_s(sk, mc_lst, sizeof(*mc_lst));
 		return -ENODEV;
 	}
+
+	mc_lst->ifindex = dev->ifindex;
 
 	/*
 	 *	now add/increase the group membership on the device

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 11:56             ` David S. Miller
                                 ` (2 preceding siblings ...)
  2001-08-16 12:27               ` David S. Miller
@ 2001-08-16 12:34               ` David S. Miller
  2001-08-16 13:35                 ` Gerd Knorr
  2001-08-16 14:15                 ` David S. Miller
  3 siblings, 2 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16 12:34 UTC (permalink / raw)
  To: kraxel; +Cc: linux-kernel

   From: Gerd Knorr <kraxel@bytesex.org>
   Date: 16 Aug 2001 12:14:40 GMT
   
   While we are at it:  Is there some portable way to figure out whether
   I can do a PCI DMA transfer to some page?  On ia32 I can simply look
   at the physical address, and if it is above 4GB it doesn't work for
   32-bit PCI devices.  But I think that is not true for architectures
   which have an IOMMU ...

Currently this is lacking.  The state of affairs for platforms
I know something about is:

x86: "4GB test"
alpha/sparc64/ppc64: any physical memory whatsoever may be accessed
		     via 32-bit PCI addressing due to IOMMU
ia64: "software IOMMU" scheme causes DMA to >4GB addresses to
      require bounce buffers when using 32-bit addressing.
      The port possesses a broken 64-bit PCI addressing hack in an
      attempt to deal with the limitations of the software IOMMU scheme.

To be honest, you really shouldn't care about this.  If you are
writing a block device, the block/scsi/ide/whatever layer should take
care to only give you memory that can be DMA'd to/from.

Same goes for the networking layer.

In some cases, the distinction being made is "highmem vs. not-highmem"
for something being DMA'able on PCI.  This is one thing the networking
code relies on.  But, while this will always lead to correct behavior,
it is very inefficient.

Jens's and my work aims to directly address these kinds of issues.
Whatever we finally end up with in Jens's code as the "DMA'able test"
will likely propagate into the networking bits.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 12:27               ` David S. Miller
@ 2001-08-16 12:48                 ` Jens Axboe
  2001-08-16 12:56                 ` Jens Axboe
  2001-08-16 13:08                 ` David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-16 12:48 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Thu, Aug 16 2001, David S. Miller wrote:
>    From: Jens Axboe <axboe@suse.de>
>    Date: Thu, 16 Aug 2001 14:03:17 +0200
> 
>    > That is why PCI_MAX_DMA32, or whatever you would like to name it, does
>    > not make any sense.  It can be a shorthand for drivers themselves, but
>    > that is it and personally I'd rather they just put the bits there
>    > explicitly.
>    
>    Drivers, right. The block stuff used it in _one_ place -- the
>    BLK_BOUNCE_4G define, to indicate the need to bounce anything above 4G.
>    But no problem, I can just define that to 0xffffffff myself.
> 
> How can "the block stuff" (ie. generic code) make legal use of this

Block stuff is a crappy name, the block layer :-)

> value?  Which physical bits it may address, this is a device specific
> attribute and has nothing to with with 4GB and highmem and PCI
> standard specifications. :-)

It didn't make use of this value, it merely provided it for _drivers_ to
use. The driver passes in a max DMA address, and the block layer
translates that into a page address. As for 4GB, see below.

> In fact, this is not only a device-specific attribute, it also
> depends on properties of the platform.
> 
> This is why we have things like pci_dma_supported() and friends.
> Let me give an example, for most PCI controllers on Sparc64 if your
> device can address the upper 2GB of 32-bit PCI space, one may DMA
> to any physical memory location via the IOMMU these controllers have.
> 
> There may easily be HIGHMEM platforms which operate this way.  So the
> result is that CONFIG_HIGHMEM does _not_ mean ">=4GB memory must be
> bounced".
> 
> Really, 0xffffffff is a meaningless value.  You have to test against
> device indicated capabilities for bouncing decisions.

Ok, I see where we are not seeing eye to eye. Really, I meant for the
PCI_MAX_DMA32 value to be 'Max address below 4GB' and not 'Max address
we can DMA to with 32-bit PCI'. Does that make sense? Maybe my
explanations weren't quite clear, and of course it didn't really help
that I shoved it in pci.h :-)

> You do not even know how "addressable bits" translates into "range of
> physical memory that may be DMA'd to/from by device".  If an IOMMU is
> present on the platform, these two things have no relationship
> whatsoever.  These two things happen to have a direct relationship
> on x86, but that is as far as it goes.

That's why I need you to sanity check the cross-platform stuff like
that :-). I see what you mean, point taken. Clearly I need to change the
blk_queue_bounce_limit stuff to check with the PCI capabilities.

> Enough babbling on my part, I'll have a look at your bounce patch
> later today. :-)

Thanks!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 12:27               ` David S. Miller
  2001-08-16 12:48                 ` Jens Axboe
@ 2001-08-16 12:56                 ` Jens Axboe
  2001-08-16 13:08                 ` David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2001-08-16 12:56 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, andrea

On Thu, Aug 16 2001, David S. Miller wrote:
> Enough babbling on my part, I'll have a look at your bounce patch
> later today. :-)

Wait for the next version, I'll clean up the PCI DMA bounce value stuff
first and post a new version.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 12:27               ` David S. Miller
  2001-08-16 12:48                 ` Jens Axboe
  2001-08-16 12:56                 ` Jens Axboe
@ 2001-08-16 13:08                 ` David S. Miller
  2 siblings, 0 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16 13:08 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel, andrea

   From: Jens Axboe <axboe@suse.de>
   Date: Thu, 16 Aug 2001 14:56:36 +0200

   On Thu, Aug 16 2001, David S. Miller wrote:
   > Enough babbling on my part, I'll have a look at your bounce patch
   > later today. :-)
   
   Wait for the next version, I'll clean up the PCI DMA bounce value
   stuff first and post a new version.

Ok.  I was thinking about the 4GB issue and we could describe it
simply using one platform macro define that could be boolean tested in
both your new block stuff and a fixed up version of the networking's
current ugly HIGHMEM tests.

/* PCI addresses are equivalent to memory physical addresses.
 * As a consequence, the lower 4GB of main memory may be
 * addressable using PCI single-address cycles.  The rest of
 * memory requires the use of dual-address cycles.
 *
 * If this is false, the kernel assumes that some hardware
 * translation mechanism exists to allow all of physical
 * memory to be accessed using single-address cycles.
 */
#define PCI_DMA_PHYS_IS_BUS	(1)

So you'd get things like:

	if (PCI_DMA_PHYS_IS_BUS) {
		/* We might need to bounce this. */
		if (! dev_dma_in_range(dev, address + len))
			address = make_bounce_buffer(dev, address, len);
	} else {
		/* All physical memory is legal for DMA so there
		 * is nothing to check.
		 */
	}

or whatever.  You get the idea.

This is really interesting because it means things like the following.

A device which is only capable of 32-bit PCI addressing can still just
use the pci_map_{single,sg}() interfaces yet DMA to all of system
memory.  The block and networking layers will never try to bounce stuff.

Basically, this is what happens today with non-CONFIG_HIGHMEM
64-bit platforms, with a particular cost for the cases where
translation is done via bounce buffers (notably ia64).

What do you think?

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 12:34               ` David S. Miller
@ 2001-08-16 13:35                 ` Gerd Knorr
  2001-08-16 14:15                 ` David S. Miller
  1 sibling, 0 replies; 29+ messages in thread
From: Gerd Knorr @ 2001-08-16 13:35 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

> To be honest, you really shouldn't care about this.  If you are
> writing a block device, the block/scsi/ide/whatever layer should take
> care to only give you memory that can be DMA'd to/from.
> 
> Same goes for the networking layer.

bttv is neither blkdev nor networking ...

The current kernel's bttv (0.7.x) uses vmalloc_32() for video buffers
and remaps these pages to userspace then.  My current devel versions
(0.8.x) use a completely different approach:  mmap() simply returns
shared anonymous memory, and bttv uses kiobufs then to lock pages for
I/O.  

This has the advantage that the video buffers don't waste unswappable
kernel memory any more.  It is also easy to write video data to any
userspace address (read() does that, for example, so I don't have to
copy_to_user() the big video frames).

On the other hand I have a new problem now:  I have to deal with highmem
pages (that works with the highmem-I/O patches) and with pages which are
outside the DMA-able memory area (hmm...).

  Gerd

-- 
Damn lot people confuse usability and eye-candy.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 12:34               ` David S. Miller
  2001-08-16 13:35                 ` Gerd Knorr
@ 2001-08-16 14:15                 ` David S. Miller
  1 sibling, 0 replies; 29+ messages in thread
From: David S. Miller @ 2001-08-16 14:15 UTC (permalink / raw)
  To: kraxel; +Cc: linux-kernel

   From: Gerd Knorr <kraxel@bytesex.org>
   Date: Thu, 16 Aug 2001 15:35:41 +0200

   > To be honest, you really shouldn't care about this.  If you are
   > writing a block device, the block/scsi/ide/whatever layer should take
   > care to only give you memory that can be DMA'd to/from.
   > 
   > Same goes for the networking layer.
   
   bttv is neither blkdev nor networking ...
   
It is the video layer, and the video layer should be helping along with
these sorts of issues.

The video layer should provide PCI device helpers which allow the
drivers to just do something like:

struct scatterlist *video_pci_get_user_pages(struct pci_dev *pdev,
					     int npages);

void video_pci_put_user_pages(struct pci_dev *pdev,
			      struct scatterlist *sg,
			      int npages);

The scatter list returned (NULL indicates failure, a la -ENOMEM)
will tell the driver everything it needs to know.  Let us define
it as follows:

SG entry 0
	address == base of vmalloc()'d area

SG entry 0 ... NPAGES
	sg_dma_address(sg) = dma_address for that page
	sg_dma_len(sg) = dma_length, usually PAGE_SIZE

sg_dma_len could be something larger than PAGE_SIZE if the
platform has a way of using virtual mappings in PCI space.
You would simply give the device DMA address/len pairs from
the SG array until you either hit the NPAGES entry or you
see a dma_len of zero.

I realize this does not address the kiovec based scheme you
are experimenting with now, but this does deal with the biggest
problem the video layer has right now wrt. portability.

In fact, this isn't even a video layer issue, and the kernel
ought to provide my suggested interfaces in some generic
place.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
  2001-08-16 14:56 ` Alan Cox
@ 2001-08-17 10:18   ` Gerd Knorr
  0 siblings, 0 replies; 29+ messages in thread
From: Gerd Knorr @ 2001-08-17 10:18 UTC (permalink / raw)
  To: Alan Cox; +Cc: David S. Miller, linux-kernel

On Thu, Aug 16, 2001 at 03:56:21PM +0100, Alan Cox wrote:
> > It is video layer, and the video layer should be helping along with
> > these sorts of issues.
> 
> Linus refused to let me make the vmalloc helpers generic code, that's
> why we have 8 or 9 different copies, some containing old bugs

:-(

> > void video_pci_put_user_pages(struct pci_dev *pdev,
> > 			      struct scatterlist *sg,
> > 			      int npages);
> 
> Why video_pci?  Why is this even video related?  This is a generic issue

I've moved current bttv's mm stuff to another source file and removed
anything bttv-specific.  It compiles, but isn't fully tested yet.  The
first three functions are pretty generic.  The videomm_buf probably
isn't useful for anything but video4linux.  It depends on the highmem-io
patches, as it uses the new scatterlist->page member.

Comments?  Put that into videodev.[ch] ?

  Gerd

------------------------ cut here -----------------------
--- /dev/null	Thu Jan  1 01:00:00 1970
+++ video-mm.h	Fri Aug 17 11:57:02 2001
@@ -0,0 +1,47 @@
+/*
+ * memory management helpers for video4linux drivers (+others?)
+ */
+
+/*
+ * get the page for a kernel virtual address
+ * (i.e. for vmalloc()ed memory)
+ */
+struct page* videomm_vmalloc_to_page(unsigned long virt);
+
+/*
+ * return a scatterlist for vmalloc()'ed memory block.
+ */
+struct scatterlist* videomm_vmalloc_to_sg(unsigned long virt, int nr_pages);
+
+/*
+ * return a scatterlist for a iobuf
+ */
+struct scatterlist* videomm_iobuf_to_sg(struct kiobuf *iobuf);
+
+/* --------------------------------------------------------------------- */
+
+struct videomm_buf {
+	/* for userland buffer */
+	struct kiobuf       *iobuf;
+
+	/* for kernel buffers */
+	void                *vmalloc;
+
+	/* common */
+	struct scatterlist  *sglist;
+	int                 sglen;
+	int                 nr_pages;
+};
+
+int videomm_init_user_buf(struct videomm_buf *vbuf, unsigned long data,
+			  unsigned long size);
+int videomm_init_kernel_buf(struct videomm_buf *vbuf, int nr_pages);
+int videomm_pci_map_buf(struct pci_dev *dev, struct videomm_buf *vbuf);
+int videomm_pci_unmap_buf(struct pci_dev *dev, struct videomm_buf *vbuf);
+int videomm_free_buf(struct videomm_buf *vbuf);
+
+/*
+ * Local variables:
+ * c-basic-offset: 8
+ * End:
+ */
--- /dev/null	Thu Jan  1 01:00:00 1970
+++ video-mm.c	Fri Aug 17 11:57:14 2001
@@ -0,0 +1,183 @@
+#include <linux/pci.h>
+#include <linux/iobuf.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <asm/page.h>
+#include <asm/pgtable.h>
+
+#include "video-mm.h"
+
+struct page*
+videomm_vmalloc_to_page(unsigned long virt)
+{
+	struct page *ret = NULL;
+	pmd_t *pmd;
+	pte_t *pte;
+	pgd_t *pgd;
+	
+	pgd = pgd_offset_k(virt);
+	if (!pgd_none(*pgd)) {
+		pmd = pmd_offset(pgd, virt);
+		if (!pmd_none(*pmd)) {
+			pte = pte_offset(pmd, virt);
+			if (pte_present(*pte)) {
+				ret = pte_page(*pte);
+			}
+		}
+	}
+	return ret;
+}
+
+struct scatterlist*
+videomm_vmalloc_to_sg(unsigned long virt, int nr_pages)
+{
+	struct scatterlist *sglist;
+	struct page *pg;
+	int i;
+
+	sglist = kmalloc(sizeof(struct scatterlist)*nr_pages, GFP_KERNEL);
+	if (NULL == sglist)
+		return NULL;
+	memset(sglist,0,sizeof(struct scatterlist)*nr_pages);
+	for (i = 0; i < nr_pages; i++, virt += PAGE_SIZE) {
+		pg = videomm_vmalloc_to_page(virt);
+		if (NULL == pg)
+			goto err;
+		sglist[i].page   = pg;
+		sglist[i].length = PAGE_SIZE;
+	}
+	return sglist;
+	
+ err:
+	kfree(sglist);
+	return NULL;
+}
+
+#if 1
+/*
+ * this is temporary hack only until there is something generic
+ * for that ...
+ */
+# if defined(__i386__)
+#  define NO_DMA(page) (((page) - mem_map) > (0xffffffff >> PAGE_SHIFT))
+# else
+#  define NO_DMA(page) (0)
+# endif
+#endif
+
+struct scatterlist*
+videomm_iobuf_to_sg(struct kiobuf *iobuf)
+{
+	struct scatterlist *sglist;
+	int i;
+
+	sglist = kmalloc(sizeof(struct scatterlist) * iobuf->nr_pages,
+			 GFP_KERNEL);
+	if (NULL == sglist)
+		return NULL;
+	memset(sglist,0,sizeof(struct scatterlist) * iobuf->nr_pages);
+
+	if (NO_DMA(iobuf->maplist[0]))
+		goto no_dma;
+	sglist[0].page   = iobuf->maplist[0];
+	sglist[0].offset = iobuf->offset;
+	sglist[0].length = PAGE_SIZE - iobuf->offset;
+	for (i = 1; i < iobuf->nr_pages; i++) {
+		if (NO_DMA(iobuf->maplist[i]))
+			goto no_dma;
+		sglist[i].page   = iobuf->maplist[i];
+		sglist[i].length = PAGE_SIZE;
+	}
+	return sglist;
+
+ no_dma:
+	/* FIXME: find a more elegant way than simply fail. */
+	kfree(sglist);
+	return NULL;
+}
+
+/* --------------------------------------------------------------------- */
+
+int videomm_init_user_buf(struct videomm_buf *vbuf, unsigned long data,
+			  unsigned long size)
+{
+	int err;
+
+	if (0 != (err = alloc_kiovec(1,&vbuf->iobuf)))
+		return err;
+	if (0 != (err = map_user_kiobuf(READ, vbuf->iobuf, data, size)))
+		return err;
+	vbuf->nr_pages = vbuf->iobuf->nr_pages;
+	return 0;
+}
+
+int videomm_init_kernel_buf(struct videomm_buf *vbuf, int nr_pages)
+{
+	vbuf->vmalloc = vmalloc_32(nr_pages << PAGE_SHIFT);
+	if (NULL == vbuf->vmalloc)
+		return -ENOMEM;
+	vbuf->nr_pages = nr_pages;
+	return 0;
+}
+
+int videomm_pci_map_buf(struct pci_dev *dev, struct videomm_buf *vbuf)
+{
+	int err;
+
+	if (0 == vbuf->nr_pages)
+		BUG();
+	
+	if (vbuf->iobuf) {
+		if (0 != (err = lock_kiovec(1,&vbuf->iobuf,1)))
+			return err;
+		vbuf->sglist = videomm_iobuf_to_sg(vbuf->iobuf);
+	}
+	if (vbuf->vmalloc) {
+		vbuf->sglist = videomm_vmalloc_to_sg
+			((unsigned long)vbuf->vmalloc,vbuf->nr_pages);
+	}
+	if (NULL == vbuf->sglist)
+		return -ENOMEM;
+	vbuf->sglen = pci_map_sg(dev,vbuf->sglist,vbuf->nr_pages,
+				 PCI_DMA_FROMDEVICE);
+	return 0;
+}
+
+int videomm_pci_unmap_buf(struct pci_dev *dev, struct videomm_buf *vbuf)
+{
+	if (!vbuf->sglen)
+		BUG();
+
+	pci_unmap_sg(dev,vbuf->sglist,vbuf->nr_pages,
+		     PCI_DMA_FROMDEVICE);
+	kfree(vbuf->sglist);
+	vbuf->sglist = NULL;
+	vbuf->sglen = 0;
+	if (vbuf->iobuf)
+		unlock_kiovec(1,&vbuf->iobuf);
+	return 0;
+}
+
+int videomm_free_buf(struct videomm_buf *vbuf)
+{
+	if (vbuf->sglen)
+		BUG();
+
+	if (vbuf->iobuf) {
+		unmap_kiobuf(vbuf->iobuf);
+		free_kiovec(1,&vbuf->iobuf);
+		vbuf->iobuf = NULL;
+	}
+	if (vbuf->vmalloc) {
+		vfree(vbuf->vmalloc);
+		vbuf->vmalloc = NULL;
+	}
+	return 0;
+}
+
+
+/*
+ * Local variables:
+ * c-basic-offset: 8
+ * End:
+ */

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [patch] zero-bounce highmem I/O
       [not found] <no.id>
@ 2001-08-16 14:56 ` Alan Cox
  2001-08-17 10:18   ` Gerd Knorr
  0 siblings, 1 reply; 29+ messages in thread
From: Alan Cox @ 2001-08-16 14:56 UTC (permalink / raw)
  To: David S. Miller; +Cc: kraxel, linux-kernel

> It is video layer, and the video layer should be helping along with
> these sorts of issues.

Linus refused to let me make the vmalloc helpers generic code, that's
why we have 8 or 9 different copies, some containing old bugs

> void video_pci_put_user_pages(struct pci_dev *pdev,
> 			      struct scatterlist *sg,
> 			      int npages);

Why video_pci?  Why is this even video related?  This is a generic issue

> In fact, this isn't even a video layer issue, and the kernel
> ought to provide my suggested interfaces in some generic
> place.

Then we agree on that

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2001-08-17 10:21 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-15  7:50 [patch] zero-bounce highmem I/O Jens Axboe
2001-08-15  9:11 ` David S. Miller
2001-08-15  9:17   ` Jens Axboe
2001-08-15  9:26   ` Jens Axboe
2001-08-15 10:22   ` David S. Miller
2001-08-15 11:13     ` Jens Axboe
2001-08-15 11:47     ` David S. Miller
2001-08-15 12:07       ` Jens Axboe
2001-08-15 12:35       ` David S. Miller
2001-08-15 13:10         ` Jens Axboe
2001-08-15 14:25           ` David S. Miller
2001-08-16 11:51             ` Jens Axboe
2001-08-16 11:56             ` David S. Miller
2001-08-16 12:03               ` Jens Axboe
2001-08-16 12:14               ` Gerd Knorr
2001-08-16 12:27               ` David S. Miller
2001-08-16 12:48                 ` Jens Axboe
2001-08-16 12:56                 ` Jens Axboe
2001-08-16 13:08                 ` David S. Miller
2001-08-16 12:34               ` David S. Miller
2001-08-16 13:35                 ` Gerd Knorr
2001-08-16 14:15                 ` David S. Miller
2001-08-16 12:28             ` kill alt_address (Re: [patch] zero-bounce highmem I/O) David S. Miller
2001-08-15 14:02         ` [patch] zero-bounce highmem I/O David S. Miller
2001-08-16  5:52           ` Jens Axboe
2001-08-15 19:20     ` Gérard Roudier
2001-08-16  8:12     ` David S. Miller
     [not found] <no.id>
2001-08-16 14:56 ` Alan Cox
2001-08-17 10:18   ` Gerd Knorr

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).