diff options
| author | Russell King <rmk@flint.arm.linux.org.uk> | 2003-02-28 00:29:33 +0000 |
|---|---|---|
| committer | Russell King <rmk@flint.arm.linux.org.uk> | 2003-02-28 00:29:33 +0000 |
| commit | d5ebb4987c054df3d7023d7e6f41bc2f04fa4763 (patch) | |
| tree | 84329071564971c87975a284ac660e7592664461 | |
| parent | 76efaa97c388b81bb0efc5a7437971b674a0070c (diff) | |
[ARM PATCH] 1389/1: update iop3xx support for 2.5 (patch 2 of 4)
Patch from Eli Carter
Supercedes patch 1382
This patch copies the relevant documentation for the iop3xx support
from the 2.4.19-rmk4-ds2 kernel.
| -rw-r--r-- | Documentation/arm/XScale/IOP3XX/IQ80310 (renamed from Documentation/arm/XScale/IOP310/IQ80310) | 65 | ||||
| -rw-r--r-- | Documentation/arm/XScale/IOP3XX/IQ80321 | 216 | ||||
| -rw-r--r-- | Documentation/arm/XScale/IOP3XX/aau.txt | 178 | ||||
| -rw-r--r-- | Documentation/arm/XScale/IOP3XX/dma.txt | 214 | ||||
| -rw-r--r-- | Documentation/arm/XScale/IOP3XX/message.txt | 110 | ||||
| -rw-r--r-- | Documentation/arm/XScale/IOP3XX/pmon.txt | 71 | ||||
| -rw-r--r-- | Documentation/arm/XScale/cache-lock.txt | 123 | ||||
| -rw-r--r-- | Documentation/arm/XScale/pmu.txt | 168 | ||||
| -rw-r--r-- | Documentation/arm/XScale/tlb-lock.txt | 64 |
9 files changed, 1153 insertions, 56 deletions
diff --git a/Documentation/arm/XScale/IOP310/IQ80310 b/Documentation/arm/XScale/IOP3XX/IQ80310 index 4284d458e6f9..d1d122747df1 100644 --- a/Documentation/arm/XScale/IOP310/IQ80310 +++ b/Documentation/arm/XScale/IOP3XX/IQ80310 @@ -49,61 +49,14 @@ Preparing the Hardware ----------------------------- This document assumes you're using a Rev D or newer board running -Redboot as the bootloader. - -The as-supplied RedBoot image appears to leave the first page of RAM -in a corrupt state such that certain words in that page are unwritable -and contain random data. The value of the data, and the location within -the first page changes with each boot, but is generally in the range -0xa0000150 to 0xa0000fff. - -You can grab the source from the ECOS CVS or you can get a prebuilt image -from: - - ftp://source.mvista.com/pub/xscale/iop310/IQ80310/redboot.bin - -which is: - - # strings redboot.bin | grep bootstrap - RedBoot(tm) bootstrap and debug environment, version UNKNOWN - built 14:58:21, Aug 15 2001 - -md5sum of this version: - - bcb96edbc6f8e55b16c165930b6e4439 redboot.bin - -You have two options to program it: - -1. Using the FRU program (see the instructions in the user manual). - -2. Using a Linux host, with MTD support built into the host kernel: - - ensure that the RedBoot image is not locked (issue the following - command under the existing RedBoot image): - RedBoot> fis unlock -f 0 -l 0x40000 - - switch S3-1 and S3-2 on. - - reboot the host - - login as root - - identify the 80310 card: - # lspci - ... - 00:0c.1 Memory controller: Intel Corporation 80310 IOP [IO Processor] (rev 01) - - in this example, bus 0, slot 0c, function 1. - - insert the MTD modules, and the PCI map module: - # insmod drivers/mtd/maps/pci.o - - locate the MTD device (using the bus, slot, function) - # cat /proc/mtd - dev: size erasesize name - mtd0: 00800000 00020000 "00:0c.1" - - in this example, it is mtd device 0. Yours will be different. - Check carefully. - - program the flash - # cat redboot.bin > /dev/mtdblock0 - - check the kernel message log for errors (some cat commands don't - error on failure) - # dmesg - - switch S3-1 and S3-2 off - - reboot host - -In any case, make sure you do an 'fis init' command once you boot with the new +Redboot as the bootloader. Note that the version of RedBoot provided +with the boards has a major issue and you need to replace it with the +latest RedBoot. You can grab the source from the ECOS CVS or you can +get a prebuilt image and burn it in using FRU at: + + ftp://source.mvista.com/pub/xscale/iq80310/redboot.bin + +Make sure you do an 'fis init' command once you boot with the new RedBoot image. @@ -235,7 +188,7 @@ JFFS RedBoot partition mapped into the MTD partition scheme. You can grab a pre-built JFFS image to use as a root file system at: - ftp://source.mvista.com/pub/xscale/iop310/IQ80310/jffs.img + ftp://source.mvista.com/pub/xscale/iq80310/jffs.img For detailed info on using MTD and creating a JFFS image go to: diff --git a/Documentation/arm/XScale/IOP3XX/IQ80321 b/Documentation/arm/XScale/IOP3XX/IQ80321 new file mode 100644 index 000000000000..18d22175108b --- /dev/null +++ b/Documentation/arm/XScale/IOP3XX/IQ80321 @@ -0,0 +1,216 @@ + +Board Overview +----------------------------- + +The Worcester IQ80321 board is an evaluation platform for Intel's 80321 Xscale +CPU (sometimes called IOP321 chipset). + +The 80321 contains a single PCI hose (called the ATUs), a PCI-to-PCI bridge, +two DMA channels, I2C, I2O messaging unit, XOR unit for RAID operations, +a bus performance monitoring unit, and a memory controller with ECC features. + +For more information on the board, see http://developer.intel.com/iio + +Port Status +----------------------------- + +Supported: + +- MTD/JFFS/JFFS2 root +- NFS root +- RAMDISK root +- Serial port (ttyS0) +- Cache/TLB locking on 80321 CPU +- Performance monitoring unit on 80321 CPU + +TODO: + +- DMA engines +- I2C +- 80321 Bus Performance Monitor +- Application Accelerator Unit (XOR engine for RAID) +- I2O Messaging Unit +- I2C unit +- SSP + +Building the Kernel +----------------------------- +make iq80321_config +make oldconfig +make dep +make zImage + +This will build an image setup for BOOTP/NFS root support. To change this, +just run make menuconfig and disable nfs root or add a "root=" option. + +Preparing the Hardware +----------------------------- + +Make sure you do an 'fis init' command once you boot with the new +RedBoot image. + +Downloading Linux +----------------------------- + +Assuming you have your development system setup to act as a bootp/dhcp +server and running tftp: + +NOTE: The 80321 board uses a different default memory map than the 80310. + + RedBoot> load -r -b 0x01008000 -m y + +Once the download is completed: + + RedBoot> go 0x01008000 + +There is a version of RedBoot floating around that has DHCP support, but +I've never been able to cleanly transfer a kernel image and have it run. + +Root Devices +----------------------------- + +A kernel is not useful without a root filesystem, and you have several +choices with this board: NFS root, RAMDISK, or JFFS/JFFS2. For development +purposes, it is suggested that you use NFS root for easy access to various +tools. Once you're ready to deploy, probably want to utilize JFFS/JFFS2 on +the flash device. + +MTD on the IQ80321 +----------------------------- + +Linux on the IQ80321 supports RedBoot FIS paritioning if it is enabled. +Out of the box, once you've done 'fis init' on RedBoot, you will get +the following partitioning scheme: + + root@192.168.0.14:~# cat /proc/mtd + dev: size erasesize name + mtd0: 00040000 00020000 "RedBoot" + mtd1: 00040000 00020000 "RedBoot[backup]" + mtd2: 0075f000 00020000 "unallocated space" + mtd3: 00001000 00020000 "RedBoot config" + mtd4: 00020000 00020000 "FIS directory" + +To create an FIS directory, you need to use the fis command in RedBoot. +As an example, you can burn the kernel into the flash once it's downloaded: + + RedBoot> fis create -b 0x01008000 -l 0x8CBAC -r 0x01008000 -f 0x80000 kernel + ... Erase from 0x00080000-0x00120000: ..... + ... Program from 0x01008000-0x01094bac at 0x00080000: ..... + ... Unlock from 0x007e0000-0x00800000: . + ... Erase from 0x007e0000-0x00800000: . + ... Program from 0x01fdf000-0x01fff000 at 0x007e0000: . + ... Lock from 0x007e0000-0x00800000: . + + RedBoot> fis list + Name FLASH addr Mem addr Length Entry point + RedBoot 0x00000000 0x00000000 0x00040000 0x00000000 + RedBoot[backup] 0x00040000 0x00040000 0x00040000 0x00000000 + RedBoot config 0x007DF000 0x007DF000 0x00001000 0x00000000 + FIS directory 0x007E0000 0x007E0000 0x00020000 0x00000000 + kernel 0x00080000 0x01008000 0x000A0000 0x00000000 + +This leads to the following Linux MTD setup: + + mtroot@192.168.0.14:~# cat /proc/mtd + dev: size erasesize name + mtd0: 00040000 00020000 "RedBoot" + mtd1: 00040000 00020000 "RedBoot[backup]" + mtd2: 000a0000 00020000 "kernel" + mtd3: 006bf000 00020000 "unallocated space" + mtd4: 00001000 00020000 "RedBoot config" + mtd5: 00020000 00020000 "FIS directory" + +Note that there is not a 1:1 mapping to the number of RedBoot paritions to +MTD partitions as unused space also gets allocated into MTD partitions. + +As an aside, the -r option when creating the Kernel entry allows you to +simply do an 'fis load kernel' to copy the image from flash into memory. +You can then do an 'fis go 0x01008000' to start Linux. + +If you choose to use static partitioning instead of the RedBoot partioning: + + /dev/mtd0 0x00000000 - 0x0007ffff: Boot Monitor (512k) + /dev/mtd1 0x00080000 - 0x0011ffff: Kernel Image (640K) + /dev/mtd2 0x00120000 - 0x0071ffff: File System (6M) + /dev/mtd3 0x00720000 - 0x00800000: RedBoot Reserved (896K) + +To use a JFFS1/2 root FS, you need to donwload the JFFS image using either +tftp or ymodem, and then copy it to flash: + + RedBoot> load -r -b 0x01000000 /tftpboot/jffs.img + Raw file loaded 0x01000000-0x01600000 + RedBoot> fis create -b 0x01000000 -l 0x600000 -f 0x120000 jffs + ... Erase from 0x00120000-0x00720000: .................................. + ... Program from 0x01000000-0x01600000 at 0x00120000: .................. + ...................... + ... Unlock from 0x007e0000-0x00800000: . + ... Erase from 0x007e0000-0x00800000: . + ... Program from 0x01fdf000-0x01fff000 at 0x007e0000: . + ... Lock from 0x007e0000-0x00800000: . + RedBoot> fis list + Name FLASH addr Mem addr Length Entry point + RedBoot 0x00000000 0x00000000 0x00040000 0x00000000 + RedBoot[backup] 0x00040000 0x00040000 0x00040000 0x00000000 + RedBoot config 0x007DF000 0x007DF000 0x00001000 0x00000000 + FIS directory 0x007E0000 0x007E0000 0x00020000 0x00000000 + kernel 0x00080000 0x01008000 0x000A0000 0x01008000 + jffs 0x00120000 0x00120000 0x00600000 0x00000000 + +This looks like this in Linux: + + root@192.168.0.14:~# cat /proc/mtd + dev: size erasesize name + mtd0: 00040000 00020000 "RedBoot" + mtd1: 00040000 00020000 "RedBoot[backup]" + mtd2: 000a0000 00020000 "kernel" + mtd3: 00600000 00020000 "jffs" + mtd4: 000bf000 00020000 "unallocated space" + mtd5: 00001000 00020000 "RedBoot config" + mtd6: 00020000 00020000 "FIS directory" + +You need to boot the kernel once and watch the boot messages to see how the +JFFS RedBoot partition mapped into the MTD partition scheme. + +You can grab a pre-built JFFS image to use as a root file system at: + + ftp://source.mvista.com/pub/xscale/iq80310/jffs.img + +For detailed info on using MTD and creating a JFFS image go to: + + http://www.linux-mtd.infradead.org. + +For details on using RedBoot's FIS commands, type 'fis help' or consult +your RedBoot manual. + +BUGS and ISSUES +----------------------------- + +* As shipped from Intel, pre-production boards have two issues: + +- The on board ethernet is disabled S8E1-2 is off. You will need to turn it on. + +- The PCIXCAPs are configured for a 100Mhz clock, but the clock selected is + actually only 66Mhz. This causes the wrong PPL multiplier to be used and the + board only runs at 400Mhz instead of 600Mhz. The way to observe this is to + use a independent clock to time a "sleep 10" command from the prompt. If it + takes 15 seconds instead of 10, you are running at 400Mhz. + +- The experimental IOP310 drivers for the AAU, DMA, etc. are not supported yet. + +Contributors +----------------------------- +The port to the IQ80321 was performed by: + +Rory Bolt <rorybolt@pacbell.net> - Initial port, debugging. + +This port was based on the IQ80310 port with the following contributors: + +Nicolas Pitre <nico@cam.org> - Initial port, cleanup, debugging +Matt Porter <mporter@mvista.com> - PCI subsystem development, debugging +Tim Sanders <tsanders@sanders.org> - Initial PCI code +Deepak Saxena <dsaxena@mvista.com> - Cleanup, debug, cache lock, PMU + +The port is currently maintained by Deepak Saxena <dsaxena@mvista.com> + +----------------------------- +Enjoy. diff --git a/Documentation/arm/XScale/IOP3XX/aau.txt b/Documentation/arm/XScale/IOP3XX/aau.txt new file mode 100644 index 000000000000..e3852ccbf663 --- /dev/null +++ b/Documentation/arm/XScale/IOP3XX/aau.txt @@ -0,0 +1,178 @@ +Support functions for the Intel 80310 AAU +=========================================== + +Dave Jiang <dave.jiang@intel.com> +Last updated: 09/18/2001 + +The Intel 80312 companion chip in the 80310 chipset contains an AAU. The +AAU is capable of processing up to 8 data block sources and perform XOR +operations on them. This unit is typically used to accelerated XOR +operations utilized by RAID storage device drivers such as RAID 5. This +API is designed to provide a set of functions to take adventage of the +AAU. The AAU can also be used to transfer data blocks and used as a memory +copier. The AAU transfer the memory faster than the operation performed by +using CPU copy therefore it is recommended to use the AAU for memory copy. + +------------------ +int aau_request(u32 *aau_context, const char *device_id); +This function allows the user the acquire the control of the the AAU. The +function will return a context of AAU to the user and allocate +an interrupt for the AAU. The user must pass the context as a parameter to +various AAU API calls. + +int aau_queue_buffer(u32 aau_context, aau_head_t *listhead); +This function starts the AAU operation. The user must create a SGL +header with a SGL attached. The format is presented below. The SGL is +built from kernel memory. + +/* hardware descriptor */ +typedef struct _aau_desc +{ + u32 NDA; /* next descriptor address [READONLY] */ + u32 SAR[AAU_SAR_GROUP]; /* src addrs */ + u32 DAR; /* destination addr */ + u32 BC; /* byte count */ + u32 DC; /* descriptor control */ + u32 SARE[AAU_SAR_GROUP]; /* extended src addrs */ +} aau_desc_t; + +/* user SGL format */ +typedef struct _aau_sgl +{ + aau_desc_t aau_desc; /* AAU HW Desc */ + u32 status; /* status of SGL [READONLY] */ + struct _aau_sgl *next; /* pointer to next SG [READONLY] */ + void *dest; /* destination addr */ + void *src[AAU_SAR_GROUP]; /* source addr[4] */ + void *ext_src[AAU_SAR_GROUP]; /* ext src addr[4] */ + u32 total_src; /* total number of source */ +} aau_sgl_t; + +/* header for user SGL */ +typedef struct _aau_head +{ + u32 total; /* total descriptors allocated */ + u32 status; /* SGL status */ + aau_sgl_t *list; /* ptr to head of list */ + aau_callback_t callback; /* callback func ptr */ +} aau_head_t; + + +The function will call aau_start() and start the AAU after it queues +the SGL to the processing queue. When the function will either +a. Sleep on the wait queue aau->wait_q if no callback has been provided, or +b. Continue and then call the provided callback function when DMA interrupt + has been triggered. + +int aau_suspend(u32 aau_context); +Stops/Suspends the AAU operation + +int aau_free(u32 aau_context); +Frees the ownership of AAU. Called when no longer need AAU service. + +aau_sgl_t * aau_get_buffer(u32 aau_context, int num_buf); +This function obtains an AAU SGL for the user. User must specify the number +of descriptors to be allocated in the chain that is returned. + +void aau_return_buffer(u32 aau_context, aau_sgl_t *list); +This function returns all SGL back to the API after user is done. + +int aau_memcpy(void *dest, void *src, u32 size); +This function is a short cut for user to do memory copy utilizing the AAU for +better large block memory copy vs using the CPU. This is similar to using +typical memcpy() call. + +* User is responsible for the source address(es) and the destination address. + The source and destination should all be cached memory. + + + +void aau_test() +{ + u32 aau; + char dev_id[] = "AAU"; + int size = 2; + int err = 0; + aau_head_t *head; + aau_sgl_t *list; + u32 i; + u32 result = 0; + void *src, *dest; + + printk("Starting AAU test\n"); + if((err = aau_request(&aau, dev_id))<0) + { + printk("test - AAU request failed: %d\n", err); + return; + } + else + { + printk("test - AAU request successful\n"); + } + + head = kmalloc(sizeof(aau_head_t), GFP_KERNEL); + head->total = size; + head->status = 0; + head->callback = NULL; + + list = aau_get_buffer(aau, size); + if(!list) + { + printk("Can't get buffers\n"); + return; + } + head->list = list; + + src = kmalloc(1024, GFP_KERNEL); + dest = kmalloc(1024, GFP_KERNEL); + + while(list) + { + list->status = 0; + list->aau_desc->SAR[0] = (u32)src; + list->aau_desc->DAR = (u32)dest; + list->aau_desc->BC = 1024; + + /* see iop310-aau.h for more DCR commands */ + list->aau_desc->DC = AAU_DCR_WRITE | AAU_DCR_BLKCTRL_1_DF; + if(!list->next) + { + list->aau_desc->DC = AAU_DCR_IE; + break; + } + list = list->next; + } + + printk("test- Queueing buffer for AAU operation\n"); + err = aau_queue_buffer(aau, head); + if(err >= 0) + { + printk("AAU Queue Buffer is done...\n"); + } + else + { + printk("AAU Queue Buffer failed...: %d\n", err); + } + + + +#if 1 + printk("freeing the AAU\n"); + aau_return_buffer(aau, head->list); + aau_free(aau); + kfree(src); + kfree(dest); + kfree((void *)head); +#endif +} + +All Disclaimers apply. Use this at your own discretion. Neither Intel nor I +will be responsible if anything goes wrong. =) + + +TODO +____ +* Testing +* Do zero-size AAU transfer/channel at init + so all we have to do is chainining + diff --git a/Documentation/arm/XScale/IOP3XX/dma.txt b/Documentation/arm/XScale/IOP3XX/dma.txt new file mode 100644 index 000000000000..f6ef554e491e --- /dev/null +++ b/Documentation/arm/XScale/IOP3XX/dma.txt @@ -0,0 +1,214 @@ +Support functions forthe Intel 80310 DMA channels +================================================== + +Dave Jiang <dave.jiang@intel.com> +Last updated: 09/18/2001 + +The Intel 80310 XScale chipset provides 3 DMA channels via the 80312 I/O +companion chip. Two of them resides on the primary PCI bus and one on the +secondary PCI bus. + +The DMA API provided is not compatible with the generic interface in the +ARM tree unfortunately due to how the 80312 DMACs work. Hopefully some time +in the near future a software interface can be done to bridge the differences. +The DMA API has been modeled after Nicholas Pitre's SA11x0 DMA API therefore +they will look somewhat similar. + + +80310 DMA API +------------- + +int dma_request(dmach_t channel, const char *device_id); + +This function will attempt to allocate the channel depending on what the +user requests: + +IOP310_DMA_P0: PCI Primary 1 +IOP310_DMA_P1: PCI Primary 2 +IOP310_DMA_S0: PCI Secondary 1 +/*EOF*/ + +Once the user allocates the DMA channel it is owned until released. Although +other users can also use the same DMA channel, but no new resources will be +allocated. The function will return the allocated channel number if successful. + +int dma_queue_buffer(dmach_t channel, dma_sghead_t *listhead); + +The user will construct a SGL in the form of below: +/* + * Scattered Gather DMA List for user + */ +typedef struct _dma_desc +{ + u32 NDAR; /* next descriptor adress [READONLY] */ + u32 PDAR; /* PCI address */ + u32 PUADR; /* upper PCI address */ + u32 LADR; /* local address */ + u32 BC; /* byte count */ + u32 DC; /* descriptor control */ +} dma_desc_t; + +typedef struct _dma_sgl +{ + dma_desc_t dma_desc; /* DMA descriptor */ + u32 status; /* descriptor status [READONLY] */ + u32 data; /* user defined data */ + struct _dma_sgl *next; /* next descriptor [READONLY] */ +} dma_sgl_t; + +/* dma sgl head */ +typedef struct _dma_head +{ + u32 total; /* total elements in SGL */ + u32 status; /* status of sgl */ + u32 mode; /* read or write mode */ + dma_sgl_t *list; /* pointer to list */ + dma_callback_t callback; /* callback function */ +} dma_head_t; + + +The user shall allocate user SGL elements by calling the function: +dma_get_buffer(). This function will give the user an SGL element. The user +is responsible for creating the SGL head however. The user is also +responsible for allocating the memory for DMA data. The following code segment +shows how a DMA operation can be performed: + +#include <asm/arch/iop310-dma.h> + +void dma_test(void) +{ + char dev_id[] = "Primary 0"; + dma_head_t *sgl_head = NULL; + dma_sgl_t *sgl = NULL; + int err = 0; + int channel = -1; + u32 *test_ptr = 0; + DECLARE_WAIT_QUEUE_HEAD(wait_q); + + + *(IOP310_ATUCR) = (IOP310_ATUCR_PRIM_OUT_ENAB | + IOP310_ATUCR_DIR_ADDR_ENAB); + + channel = dma_request(IOP310_DMA_P0, dev_id); + + sgl_head = (dma_head_t *)kmalloc(sizeof(dma_head_t), GFP_KERNEL); + sgl_head->callback = NULL; /* no callback created */ + sgl_head->total = 2; /* allocating 2 DMA descriptors */ + sgl_head->mode = (DMA_MOD_WRITE); + sgl_head->status = 0; + + /* now we get the two descriptors */ + sgl = dma_get_buffer(channel, 2); + + /* we set the header to point to the list we allocated */ + sgl_head->list = sgl; + + /* allocate 1k of DMA data */ + sgl->data = (u32)kmalloc(1024, GFP_KERNEL); + + /* Local address is physical */ + sgl->dma_desc.LADR = (u32)virt_to_phys(sgl->data); + + /* write to arbitrary location over the PCI bus */ + sgl->dma_desc.PDAR = 0x00600000; + sgl->dma_desc.PUADR = 0; + sgl->dma_desc.BC = 1024; + + /* set write & invalidate PCI command */ + sgl->dma_desc.DC = DMA_DCR_PCI_MWI; + sgl->status = 0; + + /* set a pattern */ + memset(sgl->data, 0xFF, 1024); + + /* User's responsibility to keep buffers cached coherent */ + cpu_dcache_clean(sgl->data, sgl->data + 1024); + + sgl = sgl->next; + + sgl->data = (u32)kmalloc(1024, GFP_KERNEL); + sgl->dma_desc.LADR = (u32)virt_to_phys(sgl->data); + sgl->dma_desc.PDAR = 0x00610000; + sgl->dma_desc.PUADR = 0; + sgl->dma_desc.BC = 1024; + + /* second descriptor has interrupt flag enabled */ + sgl->dma_desc.DC = (DMA_DCR_PCI_MWI | DMA_DCR_IE); + + /* must set end of chain flag */ + sgl->status = DMA_END_CHAIN; /* DO NOT FORGET THIS!!!! */ + + memset(sgl->data, 0x0f, 1024); + /* User's responsibility to keep buffers cached coherent */ + cpu_dcache_clean(sgl->data, sgl->data + 1024); + + /* queing the buffer, this function will sleep since no callback */ + err = dma_queue_buffer(channel, sgl_head); + + /* now we are woken from DMA complete */ + + /* do data operations here */ + + /* free DMA data if necessary */ + + /* return the descriptors */ + dma_return_buffer(channel, sgl_head->list); + + /* free the DMA */ + dma_free(channel); + + kfree((void *)sgl_head); +} + + +dma_sgl_t * dma_get_buffer(dmach_t channel, int buf_num); + +This call allocates DMA descriptors for the user. + + +void dma_return_buffer(dmach_t channel, dma_sgl_t *list); + +This call returns the allocated descriptors back to the API. + + +int dma_suspend(dmach_t channel); + +This call suspends any DMA transfer on the given channel. + + + +int dma_resume(dmach_t channel); + +This call resumes a DMA transfer which would have been stopped through +dma_suspend(). + + +int dma_flush_all(dmach_t channel); + +This completely flushes all queued buffers and on-going DMA transfers on a +given channel. This is called when DMA channel errors have occured. + + +void dma_free(dmach_t channel); + +This clears all activities on a given DMA channel and releases it for future +requests. + + + +Buffer Allocation +----------------- +It is the user's responsibility to allocate, free, and keep track of the +allocated DMA data memory. Upon calling dma_queue_buffer() the user must +relinquish the control of the buffers to the kernel and not change the +state of the buffers that it has passed to the kernel. The user will regain +the control of the buffers when it has been woken up by the bottom half of +the DMA interrupt handler. The user can allocate cached buffers or non-cached +via pci_alloc_consistent(). It is the user's responsibility to ensure that +the data is cache coherent. + +*Reminder* +The user is responsble to ensure the ATU is setup properly for DMA transfers. + +All Disclaimers apply. Use this at your own discretion. Neither Intel nor I +will be responsible ifanything goes wrong. diff --git a/Documentation/arm/XScale/IOP3XX/message.txt b/Documentation/arm/XScale/IOP3XX/message.txt new file mode 100644 index 000000000000..480d13e7ab09 --- /dev/null +++ b/Documentation/arm/XScale/IOP3XX/message.txt @@ -0,0 +1,110 @@ +Support functions for the Intel 80310 MU +=========================================== + +Dave Jiang <dave.jiang@intel.com> +Last updated: 10/11/2001 + +The messaging unit of the IOP310 contains 4 components and is utilized for +passing messages between the PCI agents on the primary bus and the Intel(R) +80200 CPU. The four components are: +Messaging Component +Doorbell Component +Circular Queues Component +Index Registers Component + +Messaging Component: +Contains 4 32bit registers, 2 in and 2 out. Writing to the registers assert +interrupt on the PCI bus or to the 80200 depend on incoming or outgoing. + +int mu_msg_request(u32 *mu_context); +Request the usage of Messaging Component. mu_context is written back by the +API. The MU context is passed to other Messaging calls as a parameter. + +int mu_msg_set_callback(u32 mu_context, u8 reg, mu_msg_cb_t func); +Setup the callback function for incoming messages. Callback can be setup for +outbound 0, 1, or both outbound registers. + +int mu_msg_post(u32 mu_context, u32 val, u8 reg); +Posting a message in the val parameter. The reg parameter denotes whether +to use register 0, 1. + +int mu_msg_free(u32 mu_context, u8 mode); +Free the usage of messaging component. mode can be specified soft or hard. In +hardmode all resources are unallocated. + +Doorbell Component: +The doorbell registers contains 1 inbound and 1 outbound. Depending on the bits +being set different interrupts are asserted. + +int mu_db_request(u32 *mu_context); +Request the usage of the doorbell register. + +int mu_db_set_callback(u32 mu_context, mu_db_cb_t func); +Setting up the inbound callback. + +void mu_db_ring(u32 mu_context, u32 mask); +Write to the outbound db register with mask. + +int mu_db_free(u32 mu_context); +Free the usage of doorbell component. + +Circular Queues Component: +The circular queue component has 4 circular queues. Inbound post, inbound free, +outbound post, outbound free. These queues are used to pass messages. + +int mu_cq_request(u32 *mu_context, u32 q_size); +Request the usage of the queue. See code comment header for q_size. It tells +the API how big of queues to setup. + +int mu_cq_inbound_init(u32 mu_context, mfa_list_t *list, u32 size, + mu_cq_cb_t func); +Init inbound queues. The user must provide a list of free message frames to +be put in inbound free queue and the callback function to handle the inbound +messages. + +int mu_cq_enable(u32 mu_context); +Enables the circular queues mechanism. Called once all the setup functions +are called. + +u32 mu_cq_get_frame(u32 mu_context); +Obtain the address of an outbound free frame for the user. + +int mu_cq_post_frame(u32 mu_context, u32 mfa); +The user can post the frame once getting the frame and put information in the +frame. + +int mu_cq_free(u32 mu_context); +Free the usage of circular queues mechanism. + +Index Registers Component: +The index register provides the mechanism to receive inbound messages. + +int mu_ir_request(u32 *mu_context); +Request of Index Register component usage. + +int mu_ir_set_callback(u32 mu_context, mu_ir_cb_t callback); +Setting up callback for inbound messages. The callback will receive the +value of the register that IAR offsets to. + +int mu_ir_free(u32 mu_context); +Free the usage of Index Registers component. + +void mu_set_irq_threshold(u32 mu_context, int thresh); +Setup the IRQ threshold before relinquish processing in IRQ space. Default +is set at 10 loops. + + +*NOTE: Example of host driver that utilize the MU can be found in the Linux I2O +driver. Specifically i2o_pci and some functions of i2o_core. The I2O driver +only utilize the circular queues mechanism. The other 3 components are simple +enough that they can be easily setup. The MU API provides no flow control for +the messaging mechanism. Flow control of the messaging needs to be established +by a higher layer of software on the IOP or the host driver. + +All Disclaimers apply. Use this at your own discretion. Neither Intel nor I +will be responsible if anything goes wrong. =) + + +TODO +____ + diff --git a/Documentation/arm/XScale/IOP3XX/pmon.txt b/Documentation/arm/XScale/IOP3XX/pmon.txt new file mode 100644 index 000000000000..7978494a9b74 --- /dev/null +++ b/Documentation/arm/XScale/IOP3XX/pmon.txt @@ -0,0 +1,71 @@ + +Intel's XScale Microarchitecture 80312 companion processor provides a +Performance Monitoring Unit (PMON) that can be utilized to provide +information that can be useful for fine tuning of code. This text +file describes the API that's been developed for use by Linux kernel +programmers. Note that to get the most usage out of the PMON, +I highly reccomend getting the XScale reference manual from Intel[1] +and looking at chapter 12. + +To use the PMON, you must #include <asm-arm/arch-iop310/pmon.h> in your +source file. + +Since there's only one PMON, only one user can currently use the PMON +at a given time. To claim the PMON for usage, call iop310_pmon_claim() which +returns an identifier. When you are done using the PMON, call +iop310_pmon_release() with the id you were given earlier. + +The PMON consists of 14 registers that can be used for performance measurements. +By combining different statistics, you can derive complex performance metrics. + +To start the PMON, just call iop310_pmon_start(mode). Mode tells the PMON what +statistics to capture and can each be one of: + + IOP310_PMU_MODE0 + Performance Monitoring Disabled + + IOP310_PMU_MODE1 + Primary PCI bus and internal agents (bridge, dma Ch0, dam Ch1, patu) + + IOP310_PMU_MODE2 + Secondary PCI bus and internal agents (bridge, dma Ch0, dam Ch1, patu) + + IOP310_PMU_MODE3 + Secondary PCI bus and internal agents (external masters 0..2 and Intel + 80312 I/O companion chip) + + IOP310_PMU_MODE4 + Secondary PCI bus and internal agents (external masters 3..5 and Intel + 80312 I/O companion chip) + + IOP310_PMU_MODE5 + Intel 80312 I/O companion chip internal bus, DMA Channels and Application + Accelerator + + IOP310_PMU_MODE6 + Intel 80312 I/O companion chip internal bus, PATU, SATU and Intel 80200 + processor + + IOP310_PMU_MODE7 + Intel 80312 I/O companion chip internal bus, Primary PCI bus, Secondary + PCI bus and Secondary PCI agents (external masters 0..5 & Intel 80312 I/O + companion chip) + +To get the results back, call iop310_pmon_stop(&results) where results is +defined as follows: + +typedef struct _iop310_pmon_result +{ + u32 timestamp; /* Global Time Stamp Register */ + u32 timestamp_overflow; /* Time Stamp overflow count */ + u32 event_count[14]; /* Programmable Event Counter + Registers 1-14 */ + u32 event_overflow[14]; /* Overflow counter for PECR1-14 */ +} iop310_pmon_res_t; + + +-- +This code is still under development, so please feel free to send patches, +questions, comments, etc to me. + +Deepak Saxena <dsaxena@mvista.com> diff --git a/Documentation/arm/XScale/cache-lock.txt b/Documentation/arm/XScale/cache-lock.txt new file mode 100644 index 000000000000..9728c94f153d --- /dev/null +++ b/Documentation/arm/XScale/cache-lock.txt @@ -0,0 +1,123 @@ + +Intel's XScale Microarchitecture provides support for locking of data +and instructions into the appropriate caches. This file provides +an overview of the API that has been developed to take advantage of this +feature from kernel space. Note that there is NO support for user space +cache locking. + +For example usage of this code, grab: + + ftp://source.mvista.com/pub/xscale/cache-test.c + +If you have any questions, comments, patches, etc, please contact me. + +Deepak Saxena <dsaxena@mvista.com> + +API DESCRIPTION + + +I. Header File + + #include <asm/xscale-lock.h> + +II. Cache Capability Discovery + + SYNOPSIS + + int cache_query(u8 cache_type, + struct cache_capabilities *pcache); + + struct cache_capabilities + { + u32 flags; /* Flags defining capabilities */ + u32 cache_size; /* Cache size in K (1024 bytes) */ + u32 max_lock; /* Maximum lockable region in K */ + } + + /* + * Flags + */ + + /* + * Bit 0: Cache lockability + * Bits 1-31: Reserved for future use + */ + #define CACHE_LOCKABLE 0x00000001 /* Cache can be locked */ + + /* + * Cache Types + */ + #define ICACHE 0x00 + #define DCACHE 0x01 + + DESCRIPTION + + This function fills out the pcache capability identifier for the + requested cache. cache_type is either DCACHE or ICACHE. This + function is not very useful at the moment as all XScale CPU's + have the same size Cache, but is is provided for future XScale + based processors that may have larger cache sizes. + + RETURN VALUE + + This function returns 0 if no error occurs, otherwise it returns + a negative, errno compatible value. + + -EIO Unknown hardware error + +III. Cache Locking + + SYNOPSIS + + int cache_lock(void *addr, u32 len, u8 cache_type, const char *desc); + + DESCRIPTION + + This function locks a physically contigous portion of memory starting + at the virtual address pointed to by addr into the cache referenced + by cache_type. + + The address of the data/instruction that is to be locked must be + aligned on a cache line boundary (L1_CACHE_ALIGNEMENT). + + The desc parameter is an optional (pass NULL if not used) human readable + descriptor of the locked memory region that is used by the cache + management code to build the /proc/cache_locks table. + + Note that this function does not check whether the address is valid + or not before locking it into the cache. That duty is up to the + caller. Also, it does not check for duplicate or overlaping + entries. + + RETURN VALUE + + If the function is successful in locking the entry into cache, a + zero is returned. + + If an error occurs, an appropriate error value is returned. + + -EINVAL The memory address provided was not cache line aligned + -ENOMEM Could not allocate memory to complete operation + -ENOSPC Not enough space left on cache to lock in requested region + -EIO Unknown error + +III. Cache Unlocking + + SYNOPSIS + + int cache_unlock(void *addr) + + DESCRIPTION + + This function unlocks a portion of memory that was previously locked + into either the I or D cache. + + RETURN VALUE + + If the entry is cleanly unlocked from the cache, a 0 is returned. + In the case of an error, an appropriate error is returned. + + -ENOENT No entry with given address associated with this cache + -EIO Unknown error + + diff --git a/Documentation/arm/XScale/pmu.txt b/Documentation/arm/XScale/pmu.txt new file mode 100644 index 000000000000..508575d652bf --- /dev/null +++ b/Documentation/arm/XScale/pmu.txt @@ -0,0 +1,168 @@ + +Intel's XScale Microarchitecture processors provide a Performance +Monitoring Unit (PMU) that can be utilized to provide information +that can be useful for fine tuning of code. This text file describes +the API that's been developed for use by Linux kernel programmers. +When I have some extra time on my hand, I will extend the code to +provide support for user mode performance monitoring (which is +probably much more useful). Note that to get the most usage out +of the PMU, I highly reccomend getting the XScale reference manual +from Intel and looking at chapter 12. + +To use the PMU, you must #include <asm/xscale-pmu.h> in your source file. + +Since there's only one PMU, only one user can currently use the PMU +at a given time. To claim the PMU for usage, call pmu_claim() which +returns an identifier. When you are done using the PMU, call +pmu_release() with the identifier that you were given by pmu_claim. + +In addition, the PMU can only be used on XScale based systems that +provide an external timer. Systems that the PMU is currently supported +on are: + + - Cyclone IQ80310 + +Before delving into how to use the PMU code, let's do a quick overview +of the PMU itself. The PMU consists of three registers that can be +used for performance measurements. The first is the CCNT register with +provides the number of clock cycles elapsed since the PMU was started. +The next two register, PMN0 and PMN1, are eace user programmable to +provide 1 of 20 different performance statistics. By combining different +statistics, you can derive complex performance metrics. + +To start the PMU, just call pmu_start(pm0, pmn1). pmn0 and pmn1 tell +the PMU what statistics to capture and can each be one of: + +EVT_ICACHE_MISS + Instruction fetches requiring access to external memory + +EVT_ICACHE_NO_DELIVER + Instruction cache could not deliver an instruction. Either an + ICACHE miss or an instruction TLB miss. + +EVT_ICACHE_DATA_STALL + Stall in execution due to a data dependency. This counter is + incremented each cycle in which the condition is present. + +EVT_ITLB_MISS + Instruction TLB miss + +EVT_DTLB_MISS + Data TLB miss + +EVT_BRANCH + A branch instruction was executed and it may or may not have + changed program flow + +EVT_BRANCH_MISS + A branch (B or BL instructions only) was mispredicted + +EVT_INSTRUCTION + An instruction was executed + +EVT_DCACHE_FULL_STALL + Stall because data cache buffers are full. Incremented on every + cycle in which condition is present. + +EVT_DCACHE_FULL_STALL_CONTIG + Stall because data cache buffers are full. Incremented on every + cycle in which condition is contigous. + +EVT_DCACHE_ACCESS + Data cache access (data fetch) + +EVT_DCACHE_MISS + Data cache miss + +EVT_DCACHE_WRITE_BACK + Data cache write back. This counter is incremented for every + 1/2 line (four words) that are written back. + +EVT_PC_CHANGED + Software changed the PC. This is incremented only when the + software changes the PC and there is no mode change. For example, + a MOV instruction that targets the PC would increment the counter. + An SWI would not as it triggers a mode change. + +EVT_BCU_REQUEST + The Bus Control Unit(BCU) received a request from the core + +EVT_BCU_FULL + The BCU request queue if full. A high value for this event means + that the BCU is often waiting for to complete on the external bus. + +EVT_BCU_DRAIN + The BCU queues were drained due to either a Drain Write Buffer + command or an I/O transaction for a page that was marked as + uncacheable and unbufferable. + +EVT_BCU_ECC_NO_ELOG + The BCU detected an ECC error on the memory bus but noe ELOG + register was available to to log the errors. + +EVT_BCU_1_BIT_ERR + The BCU detected a 1-bit error while reading from the bus. + +EVT_RMW + An RMW cycle occurred due to narrow write on ECC protected memory. + +To get the results back, call pmu_stop(&results) where results is defined +as a struct pmu_results: + + struct pmu_results + { + u32 ccnt; /* Clock Counter Register */ + u32 ccnt_of; / + u32 pmn0; /* Performance Counter Register 0 */ + u32 pmn0_of; + u32 pmn1; /* Performance Counter Register 1 */ + u32 pmn1_of; + }; + +Pretty simple huh? Following are some examples of how to get some commonly +wanted numbers out of the PMU data. Note that since you will be dividing +things, this isn't super useful from the kernel and you need to printk the +data out to syslog. See [1] for more examples. + +Instruction Cache Efficiency + + pmu_start(EVT_INSTRUCTION, EVT_ICACHE_MISS); + ... + pmu_stop(&results); + + icache_miss_rage = results.pmn1 / results.pmn0; + cycles_per_instruction = results.ccnt / results.pmn0; + +Data Cache Efficiency + + pmu_start(EVT_DCACHE_ACCESS, EVT_DCACHE_MISS); + ... + pmu_stop(&results); + + dcache_miss_rage = results.pmn1 / results.pmn0; + +Instruction Fetch Latency + + pmu_start(EVT_ICACHE_NO_DELIVER, EVT_ICACHE_MISS); + ... + pmu_stop(&results); + + average_stall_waiting_for_instruction_fetch = + results.pmn0 / results.pmn1; + + percent_stall_cycles_due_to_instruction_fetch = + results.pmn0 / results.ccnt; + + +ToDo: + +- Add support for usermode PMU usage. This might require hooking into + the scheduler so that we pause the PMU when the task that requested + statistics is scheduled out. + +-- +This code is still under development, so please feel free to send patches, +questions, comments, etc to me. + +Deepak Saxena <dsaxena@mvista.com> + diff --git a/Documentation/arm/XScale/tlb-lock.txt b/Documentation/arm/XScale/tlb-lock.txt new file mode 100644 index 000000000000..1ba3e11d0b82 --- /dev/null +++ b/Documentation/arm/XScale/tlb-lock.txt @@ -0,0 +1,64 @@ + +Intel's XScale Microarchitecture provides support for locking of TLB +entries in both the instruction and data TLBs. This file provides +an overview of the API that has been developed to take advantage of this +feature from kernel space. Note that there is NO support for user space. + +In general, this feature should be used in conjunction with locking +data or instructions into the appropriate caches. See the file +cache-lock.txt in this directory. + +If you have any questions, comments, patches, etc, please contact me. + +Deepak Saxena <dsaxena@mvista.com> + + +API DESCRIPTION + +I. Header file + + #include <asm/xscale-lock.h> + +II. Locking an entry into the TLB + + SYNOPSIS + + xscale_tlb_lock(u8 tlb_type, u32 addr); + + /* + * TLB types + */ + #define ITLB 0x0 + #define DTLB 0x1 + + DESCRIPTION + + This function locks the virtual to physical mapping for virtual + address addr into the requested TLB. + + RETURN VALUE + + If the entry is properly locked into the TLB, a 0 is returned. + In case of an error, an appropriate error is returned. + + -ENOSPC No more entries left in the TLB + -EIO Unknown error + +III. Unlocking an entry from a TLB + + SYNOPSIS + + xscale_tlb_unlock(u8 tlb_type, u32 addr); + + DESCRIPTION + + This function unlocks the entry for virtual address addr from the + specified cache. + + RETURN VALUE + + If the TLB entry is properly unlocked, a 0 is returned. + In case of an error, an appropriate error is returned. + + -ENOENT No entry for given address in specified TLB + |
