Copyright © 2007 C.P.R. Baaij
May 2007
Revision History

| Revision | Date | Author | Description |
|---|---|---|---|
| 1.0 | May 10th, 2007 | Christiaan Baaij | First release |
| 1.1 | July 5th, 2007 | Christiaan Baaij | First Minimal Implementation |
| 1.2 | August 16th, 2007 | Christiaan Baaij | DMA transfers using circular buffer implemented |
The C5409-based DSP in the DM320 architecture has three types of internal memory: Dual Access RAM (DARAM), Single Access RAM (SARAM) and on-chip ROM. The DARAM and SARAM can be accessed from the ARM. The DSP has 32 KW (kilo words, 1 word = 16 bits) of DARAM that can be mapped into Program and Data Space simultaneously, 16 KW of Page 1 Program SARAM and 16 KW of Data SARAM. The DARAM can be mapped exclusively to Data Space by setting the OVLY bit to 0. On-chip ROM is not visible in the ARM memory space and is therefore not used by the bridge.
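Because the sizes above are given in 16-bit words while ARM-side tools usually count bytes, the conversion is easy to get wrong. The following is a minimal sketch of that arithmetic; the macro and function names are illustrative and do not come from the bridge headers.

```c
#include <stddef.h>

/* One DSP word is 16 bits, i.e. 2 bytes as seen from the ARM. */
#define DSP_WORD_BYTES 2u

/* Region sizes in kilo words (1 KW = 1024 words), as listed above.
 * These names are illustrative, not taken from the bridge headers. */
#define DARAM_KWORDS      32u
#define PROG_SARAM_KWORDS 16u
#define DATA_SARAM_KWORDS 16u

/* Convert a size in kilo words to a size in bytes. */
size_t kwords_to_bytes(size_t kwords)
{
    return kwords * 1024u * DSP_WORD_BYTES;
}
```

With these definitions, `kwords_to_bytes(DARAM_KWORDS)` yields 65536 bytes (64 KB) and each SARAM block yields 32768 bytes (32 KB).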
The ARM downloads a specific boot image for the DSP to execute. There can be many boot images, one for every different task.
The boot sequence is as follows:
The ARM holds the DRST bit of the HPIBCTL register low for at least 2 DSP cycles to make the DSP go into reset. The ARM then brings it out of reset.
The DSP status register PMST is initialized to move the vector table to 7F80h; all interrupts except INT0 are disabled, and the DSP enters IDLE1 mode.
While the DSP is in IDLE1, the ARM loads program code and data values into their respective memories.
When the ARM finishes downloading the DSP code, it wakes up the DSP from IDLE1 mode by asserting INT0.
The DSP then branches to address 7F80h, where the new interrupt vector table is located. The ARM should have loaded this location with at least a branch to the start code.
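The boot sequence above can be read as a small state machine driven by the ARM. The sketch below is only a toy model of that handshake, useful for reasoning about the ordering of the steps; the enum and function names are hypothetical, and no register access is shown.

```c
/* Toy model of the boot handshake; all names are illustrative. */
enum dsp_state { DSP_IN_RESET, DSP_IDLE1, DSP_RUNNING };
enum dsp_event { EV_ASSERT_RESET, EV_RELEASE_RESET, EV_INT0 };

/* In this model the ARM loads code and data while the DSP waits in IDLE1. */
int dsp_can_load(enum dsp_state s)
{
    return s == DSP_IDLE1;
}

/* Return the state that follows s when the ARM triggers event e. */
enum dsp_state dsp_next_state(enum dsp_state s, enum dsp_event e)
{
    switch (e) {
    case EV_ASSERT_RESET:
        return DSP_IN_RESET;                      /* DRST low: reset from any state */
    case EV_RELEASE_RESET:
        return s == DSP_IN_RESET ? DSP_IDLE1 : s; /* DSP initializes, then idles */
    case EV_INT0:
        return s == DSP_IDLE1 ? DSP_RUNNING : s;  /* wake-up, branch to 7F80h */
    }
    return s;
}
```

Feeding the events in the documented order (assert reset, release reset, INT0) moves the model from any state to `DSP_RUNNING`; an INT0 that arrives while the DSP is still in reset leaves the state unchanged.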
External SDRAM can only be accessed through the DMA controllers. Unlike the original C5409 on which it is based, the DM320 DSP cannot access this external memory directly in the way the C5409 and C54x documents describe. The DMA controller of the HPIB bridge can transfer data between internal DSP RAM and external SDRAM. Co-Processor DMA (COP) can transfer data between SDRAM and the internal RAM of the peripherals on the DM320 SoC (such as the image buffers).
The kernel driver would just be responsible for resetting and running the DSP and for giving userland applications access to the DSP's internal memory. This would mean that application developers have to design their own communication interface between ARM and DSP code. An example of such an interface would be a flag in DSP internal memory that the DSP sets when it is done reading a buffer, and that the ARM polls to learn when it can write new data into the buffer again.
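The flag-polling idea can be sketched on the ARM side as follows. This is a minimal sketch under assumptions: the flag location is application-defined, the device node is passed in by the caller (for example the DSPMEMDEVNM name from arm-dsp_bridge.h), and the function name `dsp_wait_flag` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Poll a 16-bit flag in DSP memory until it becomes non-zero.
 * dev_path : memory device node to open
 * flag_off : lseek() offset of the flag (hypothetical, application-defined)
 * tries    : maximum number of polls, 1 ms apart
 * Returns 1 if the flag was set, 0 if it never was, -1 on I/O error. */
int dsp_wait_flag(const char *dev_path, off_t flag_off, int tries)
{
    const struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000L };
    unsigned short flag;
    int result = 0;
    int fd = open(dev_path, O_RDWR);

    if (fd < 0)
        return -1;
    while (tries-- > 0) {
        /* Always transfer one whole 16-bit word, never a single byte. */
        if (lseek(fd, flag_off, SEEK_SET) == (off_t)-1 ||
            read(fd, &flag, sizeof(flag)) != (ssize_t)sizeof(flag)) {
            result = -1;
            break;
        }
        if (flag != 0) {
            result = 1;
            break;
        }
        nanosleep(&delay, NULL);   /* back off between polls */
    }
    close(fd);
    return result;
}
```

The short sleep keeps the ARM from spinning at full speed; a real application would pick the interval and the timeout to match how quickly the DSP consumes a buffer.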
The DSP control device provides a DSP control API for Linux userland applications. Applications can reset the DSP and perform other control operations through this device. For an example of its usage, see the source code of the bridge control utility.
The DSP memory device gives the DSP program loader in Linux userland access to the DSP memory space. The DSP program loader loads the DSP binary image into the DSP internal memories (i.e. DARAM and SARAM) through this device.
CAUTION: To mirror the DSP's own addressing, the internal DSP memory is only 16-bit addressable from the ARM. Reading and writing must therefore be done one word at a time, not one byte at a time (the read and write calls will simply fail otherwise). Below is an example of how to read one 16-bit word at address 0x264 and print it to the console.
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <fcntl.h>
#include "arm-dsp_bridge.h"

int main(void) {
    int fd;
    unsigned short word;    /* holds one 16-bit DSP word */

    if ((fd = open(DSPMEMDEVNM, O_RDWR)) < 0) {
        perror("open memory device");
        return EXIT_FAILURE;
    }

    /* Position the device at address 0x264 */
    if (lseek(fd, 0x264, SEEK_SET) == (off_t)-1) {
        perror("seek memory device");
        close(fd);
        return EXIT_FAILURE;
    }

    /* Transfers must be whole 16-bit words */
    if (read(fd, &word, sizeof(word)) != (ssize_t)sizeof(word)) {
        perror("read memory device");
        close(fd);
        return EXIT_FAILURE;
    }

    printf("value at word address 0x264: 0x%x\n", word);
    close(fd);
    return EXIT_SUCCESS;
}
```

The bridge communication device is the Linux counterpart of the bridge library on the DSP. It supports displaying debug messages sent from the DSP, and reading from and writing to developer-specified buffers on the DSP using DMA transfers. A circular buffer scheme (in SDRAM) on the Linux side provides asynchronous transfers between the ARM and the DSP. Because this device only supports transfers between the two memories, it is non-seekable. At the end of this chapter there is an example of how to use the read() and write() calls.
Checks whether a read and/or write call would block. A read call blocks if the circular read buffer is empty. A write call blocks if the circular write buffer is full.
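Assuming this check is exposed through the standard poll() system call (the text does not name the call, so this is an assumption), a read that gives up after a timeout instead of blocking indefinitely could be sketched as follows. The helper name `read_with_timeout` is hypothetical.

```c
#include <poll.h>
#include <unistd.h>

/* Read up to len bytes from fd, but give up after timeout_ms milliseconds
 * instead of blocking indefinitely.
 * Returns the byte count from read(), -1 on error, or -2 on timeout. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;      /* poll() itself failed */
    if (rc == 0)
        return -2;      /* nothing to read within the timeout */
    return read(fd, buf, len);
}
```

The same pattern with POLLOUT in place of POLLIN would guard a write against a full circular write buffer.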
The write() call writes data to the circular write buffer. A program on the DSP must make a _hpib_read call to transfer this data from the circular write buffer to a buffer on the DSP. The call blocks if the write buffer is full, and stays blocked until the DSP reads the previous data. Because write() uses DMA transfers internally, transfer sizes have to be a multiple of 4 bytes (32 bits) as demanded by the HPIB DMA controller (the write call will fail otherwise).
NB: You will not be notified whether the data reaches the DSP correctly, so you will need to implement this functionality yourself if you need it.
The read() call reads data from the circular read buffer. A program on the DSP must first make a _hpib_write call to transfer data from a buffer on the DSP to this circular read buffer. The call blocks if the circular read buffer is empty, and stays blocked until the DSP places data in the buffer. Because read() uses DMA transfers internally, transfer sizes have to be a multiple of 4 bytes (32 bits) as demanded by the HPIB DMA controller (the read call will fail otherwise).
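Since both calls reject sizes that are not a multiple of 4 bytes, it can help to check or pad a length before issuing the transfer. A minimal sketch; the macro and function names are illustrative and not part of arm-dsp_bridge.h.

```c
#include <stddef.h>

/* HPIB DMA transfers must be a multiple of 4 bytes.
 * Illustrative name, not taken from the bridge header. */
#define HPIB_DMA_ALIGN 4u

/* Round len up to the next multiple of 4 bytes. */
size_t hpib_pad_len(size_t len)
{
    return (len + HPIB_DMA_ALIGN - 1u) & ~(size_t)(HPIB_DMA_ALIGN - 1u);
}

/* Return 1 if len is a multiple of 4 bytes, 0 otherwise. */
int hpib_len_ok(size_t len)
{
    return (len % HPIB_DMA_ALIGN) == 0;
}
```

For example, a 47-character string sent as 16-bit words occupies 94 bytes; `hpib_pad_len(94)` returns 96, so the buffer would need one extra padding word.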
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <fcntl.h>
#include "arm-dsp_bridge.h"

int main(void) {
    int fd;
    int ret = EXIT_FAILURE;
    size_t i;
    unsigned short *writebuf;
    unsigned short *readbuf;
    char str1[] = "Hello World from the DSP: through the DMA void!!";
    char str2[] = "ffffffffffffffffffffffffffffffffffffffffffffffff";
    /*
     * Each char is sent as one 16-bit word. The number of chars is even,
     * so the transfer size is a multiple of 4 bytes.
     */
    size_t wlen = strlen(str1) * sizeof(short);
    size_t rlen = strlen(str2) * sizeof(short);

    writebuf = malloc(wlen);
    readbuf = malloc(rlen);
    if (writebuf == NULL || readbuf == NULL) {
        perror("malloc");
        goto out;
    }
    if ((fd = open(DSPCOMDEVNM, O_RDWR)) < 0) {
        perror("open communications device");
        goto out;
    }
    /* Widen each char to one 16-bit DSP word */
    for (i = 0; i < strlen(str1); ++i) {
        writebuf[i] = (unsigned char)str1[i];
    }
    if (write(fd, writebuf, wlen) != (ssize_t)wlen) {
        perror("write com device");
        goto out_close;
    }
    if (read(fd, readbuf, rlen) != (ssize_t)rlen) {
        perror("read com device");
        goto out_close;
    }
    /* Narrow the received words back to chars */
    for (i = 0; i < strlen(str2); ++i) {
        str2[i] = (char)readbuf[i];
    }
    printf("Read from DSP: %s\n", str2);
    ret = EXIT_SUCCESS;
out_close:
    close(fd);
out:
    free(writebuf);
    free(readbuf);
    return ret;
}
```
The bridge library provides very basic memory, string and debug functions. It also provides the DSP counterpart of the Linux bridge communication device. To use the bridge communication functions, a DSP application has to be based on the accompanying framework. The framework and the library are currently written in assembly, though a future version might be written in C, depending on whether a free compiler for the Texas Instruments C5409 DSP ever becomes available. Also note that there is currently no multi-process support at all, meaning that only one single-threaded program can be loaded and run in DSP memory at a time.
Currently implements the counterparts of the read() and write() calls of the bridge communication device on the ARM (Linux) side. At the end of this section there is an example application using the _hpib_read and _hpib_write functions. These functions use the _hpiDma_transfer function (for internal use only) to transfer the bytes between SDRAM and internal DSP memory.
Argument 1: Buffer location in internal DSP RAM
Argument 2: Count (in bytes) to transfer from circular write buffer to DSP buffer
Returns: Amount (in bytes) actually transferred
NB: Due to the linear nature of the DMA transfer controller, a _hpib_read will only read to the 'end' of the circular buffer even if there are more filled bytes available at the 'beginning' of the circular buffer. This means _hpib_read will have to be called again (once the previous DMA transfer finishes) to read the remaining bytes. This can easily be implemented by filling in the handle_hpiDma_end function.
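The effect of this limitation can be illustrated with a small C model of the bookkeeping. This is a sketch of the idea only, not the library's assembly implementation; the function and parameter names are hypothetical.

```c
#include <stddef.h>

/* Bytes that one linear DMA transfer can take from a circular buffer.
 * size   : total buffer size in bytes
 * tail   : offset of the first filled byte (0 <= tail < size)
 * filled : number of filled bytes (0 <= filled <= size)
 * want   : number of bytes the caller asked for
 * The transfer stops at the physical end of the buffer, even when more
 * filled bytes wrap around to the beginning. */
size_t linear_chunk(size_t size, size_t tail, size_t filled, size_t want)
{
    size_t to_end = size - tail;                 /* bytes before the wrap point */
    size_t n = want < filled ? want : filled;    /* cannot exceed what is there */

    return n < to_end ? n : to_end;
}
```

With a 64-byte buffer holding 32 filled bytes that start at offset 48, a request for 32 bytes yields only 16; a second call starting at offset 0 picks up the remaining 16.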
Argument 1: Buffer location in internal DSP RAM
Argument 2: Count (in bytes) to transfer from DSP buffer to circular read buffer
Returns: Amount (in bytes) actually transferred
NB: Due to the linear nature of the DMA transfer controller, a _hpib_write will only write to the 'end' of the circular buffer even if there are more free bytes available at the 'beginning' of the circular buffer. This means _hpib_write will have to be called again (once the previous DMA transfer finishes) to write the remaining bytes. This can easily be implemented by filling in the handle_hpiDma_end function.
NB: You will not be notified whether the data reaches the ARM correctly, so you will need to implement this functionality yourself if you need it.
Currently implements only the _debug function.
Argument 1: Location of the message to send
Returns: nothing
```asm
;*-----------------------------------------------------------------------------*
;* Main Program                                                                *
;*-----------------------------------------------------------------------------*
        .mmregs
;*--------------------------- YOUR CODE STARTS HERE ---------------------------*
; Any definitions or references you need to add go here
        .include "main.inc"
        .global _inputBuffer
        .bss _inputBuffer,__INPUT_BUFFER_LEN,0,0
        .global _memset
        .global _debug
        .global _hpib_read
        .global _hpib_write
;*--------------------------- YOUR CODE ENDS HERE -----------------------------*
; External Functions
        .global _init_dspCom
        .text
; Internal functions
        .global _main
_main:
;*--------------------------- YOUR CODE STARTS HERE ---------------------------*
; Call any hardware initialization routines here
;*--------------------------- YOUR CODE ENDS HERE -----------------------------*
        CALL #_init_dspCom              ; Initialize DSP Communication struct
;*--------------------------- YOUR CODE STARTS HERE ---------------------------*
; Your main program starts here
; Clear input buffer
        STM #_inputBuffer,AR1           ; Starting at location of _inputBuffer
        NOP
        PSHM AR1                        ; Store it on the stack
        STM #0,AR1                      ; Set value to 0
        NOP
        PSHM AR1                        ; Store it on the stack
        STM #__INPUT_BUFFER_LEN,AR1     ; Set length to __INPUT_BUFFER_LEN
        NOP
        PSHM AR1                        ; Store it on the stack
        CALL #_memset                   ; Call memset function
; Call read function
        STM #_inputBuffer,AR1
        NOP
        PSHM AR1
        STM #__INPUT_BUFFER_LEN,AR1
        NOP
; __INPUT_BUFFER_LEN is in words, so multiply by 2
        LDM AR1,A
        SFTL A,#1,A
        STLM A,AR1
        PSHM AR1
        CALL #_hpib_read
        POPM AR1                        ; Get return value from _hpib_read
        PSHM AR1                        ; And put it back on the stack
; debug("Read finished")
        STM #__SL1,AR1
        NOP
        PSHM AR1
        CALL #_debug
; debug(&_inputBuffer)
        STM #_inputBuffer,AR1
        NOP
        PSHM AR1
        CALL #_debug
; Write back the content of _inputBuffer
        POPM AR2                        ; Get return value from previous _hpib_read
        STM #_inputBuffer,AR1
        NOP
        PSHM AR1
        PSHM AR2
        CALL #_hpib_write
        POPM AR1                        ; Get return value from _hpib_write
; Just loop forever
MAIN_LOOP:
        B MAIN_LOOP
;*--------------------------- YOUR CODE ENDS HERE -----------------------------*
        RET
        .sect ".const"
;*--------------------------- YOUR CODE STARTS HERE ---------------------------*
; Put any strings here
__SL1:  .string "Read finished",0
;*--------------------------- YOUR CODE ENDS HERE -----------------------------*
```

Argument 1: String Source Location
Argument 2: String Destination Location
Returns: nothing
To use the communication features, your DSP application has to be based on the framework supplied with the bridge. It just gives you a basic starting point.
This file gets called when the program is loaded and started. It sets the stack pointer and starts your _main function. If your _main function isn't a loop, it will return to this file and loop forever in the _exit function. If you want to change the stack size, this file is the place to do that; the current default stack size is 1000 words.
This is where your _main function resides; it is called once the stack has been set up. Any further hardware initialization should occur here.
This is the interrupt vector table of the C5409 DSP. Currently only four interrupts are plugged with interrupt handlers: reset, nmi, int0 and hpib_dma. You should leave these interrupts as they are, unless you know what you're doing. These four interrupts are enabled by the _init_dspCom function called in _main.
This file contains the functions _handle_int0 and _handle_nmi. If you want to implement these handlers, uncomment the CONTEXT_SAVE and CONTEXT_RESTORE macros and place the functionality between those two macros.
This interrupt handler gets called whenever a DMA transfer finishes. The handler could, for example, start a new _hpib_read or _hpib_write call once a DMA transfer finishes. Be careful, though: these functions block, and you are working in an interrupt context.
An include file used by the inner workings of the bridge communication functions. It describes certain constants and the communication structure.
Contains the CONTEXT_SAVE and CONTEXT_RESTORE macros (which basically push and pop all the registers).
Linker command specification file for the binutils linker. It defines where the different memory sections will be placed in DSP memory.
The bridge control utility 'bridgectl' is used to load DSP programs into internal DSP memory and to start and stop the DSP. The programs can be supplied either as <program.bin> <interruptvector.bin> or as <program.out>. The .bin files are in a binary format generated by 'c54x-objcopy'; the .out files are in COFF2 format, generated either by the binutils c54x assembler or by Texas Instruments Code Composer Studio.
The following bridge commands are available:
start <COFF2-program.out>
Resets the DSP, loads the specified <COFF2-program.out> (COFF2 format) into internal DSP memory and then releases the reset to start the program
stop
Resets the DSP.
load <program.bin> <interruptvector.bin>
Loads the <program.bin> (binary format) at address 0x80 and the <interruptvector.bin> (binary format) at address 0x7F80
loadcoff <COFF2-program.out>
Loads the specified <COFF2-program.out> (COFF2 format) into internal DSP memory
run (=unreset)
Releases the reset making the program start
reset
Resets the DSP.
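To illustrate what a 'load' style command has to do, the sketch below copies a raw binary image to an offset of the DSP memory device, one 16-bit word at a time. It is not taken from the bridgectl source, which remains the authoritative reference. It assumes that the memory device accepts sequential one-word writes and that the lseek() offset corresponds to the load addresses quoted above; byte order is not handled. The function name `dsp_load_bin` is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy a raw binary image to an offset of the DSP memory device.
 * mem_path : memory device node (e.g. DSPMEMDEVNM from arm-dsp_bridge.h)
 * img_path : raw .bin file
 * addr     : lseek() offset to load at, e.g. 0x80 or 0x7F80
 * Returns the number of 16-bit words written, or -1 on error. */
long dsp_load_bin(const char *mem_path, const char *img_path, off_t addr)
{
    unsigned short word;
    long words = 0;
    ssize_t n;
    int img, mem;

    if ((img = open(img_path, O_RDONLY)) < 0)
        return -1;
    if ((mem = open(mem_path, O_RDWR)) < 0) {
        close(img);
        return -1;
    }
    if (lseek(mem, addr, SEEK_SET) == (off_t)-1)
        words = -1;

    /* Write one whole word at a time; byte-sized accesses are rejected. */
    while (words >= 0 && (n = read(img, &word, sizeof(word))) != 0) {
        if (n != (ssize_t)sizeof(word) ||
            write(mem, &word, sizeof(word)) != (ssize_t)sizeof(word))
            words = -1;     /* odd-sized image or I/O error */
        else
            words++;
    }

    close(img);
    close(mem);
    return words;
}
```

Under these assumptions, loading a program and its vector table would amount to two calls, one with offset 0x80 and one with offset 0x7F80, made while the DSP is held in reset.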
Get the correct toolchain to build kernel modules and userspace programs for an ARM processor
The easiest way is to install the VMWare image made by Crweb; it works for me, so it should work for you.
Get the bridge software
Either get the latest build from svn: svn co https://svn.neurostechnology.com/hackers/darchon/arm-dsp_bridge/
Or download the archive: wget https://svn.neurostechnology.com/hackers/darchon/arm-dsp_bridge.tgz
Configure the build process for the DM320 DSP module (dm320dsp_module). Edit the Makefile for both modules to make KERNELDIR point to the kernel source tree that is used to build the kernel for the OSD
Build the module by running make
Make sure the binutils C54x toolchain is installed. (Search on Google for instructions on how to do this.)
The bridge library can be found in the ./bridgelib directory of the source archive
Copy the directory content and run make in the destination directory to build the library.
For this process I'm assuming you are using the Crweb VMWare image; if not, you're on your own
Copy the 'bridgectl' directory from the archive to ~/Scratchbox-Home/
Type /scratchbox/login to enter the scratchbox building environment
Enter the 'bridgectl' directory you just copied and run make inside of it
I'm assuming you are using the VMWare image and are netbooting your OSD from this image for this step.
Copy the compiled kernel module, the load/unload scripts (dm320_dsp_load, dm320_dsp_unload) and the bridgectl utility to /srv/neuros-osd-rootfs/root/
Login to the OSD through the serial port
Unload ALL Ingenient kernel modules
Run the dm320_dsp_load script to install and load the bridge kernel modules
NB: The scripts need 'awk' to run. If 'awk' is not built into your busybox config, either rebuild your busybox environment or execute the commands in the script manually.
To make your Linux programs use the DSP bridge, include the 'arm-dsp_bridge.h' header that is in the ./include directory of the source archive. To make your DSP programs use the bridge, link against the bridge library and use the framework. Both can be found in the ./bridgelib directory of the source archive. To load compiled DSP binaries on the DSP, use the 'bridgectl' utility.
Some things to keep in mind when using the bridge:
DSP programs are always single-threaded, and only one program can be loaded and run in DSP memory at a time.
NEVER use/overwrite the data memory locations 0x90 - 0x94. These are used internally by the bridge.
The bridge library functions overwrite registers without regard for their content. So if there is valuable data in them, be sure to save them before calling the bridge functions.
Do NOT use HPIB DMA directly, always use the _hpib_read and _hpib_write functions.
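The reserved-locations rule is easy to check mechanically before writing a block of data memory from the ARM. A minimal sketch; the macro and function names are hypothetical and not part of the bridge headers.

```c
/* Data memory words reserved by the bridge (inclusive range). */
#define BRIDGE_RESERVED_FIRST 0x90u
#define BRIDGE_RESERVED_LAST  0x94u

/* Return 1 if the word range [start, start + len_words) touches the
 * reserved locations, 0 otherwise. */
int overlaps_bridge_reserved(unsigned start, unsigned len_words)
{
    if (len_words == 0)
        return 0;
    return start <= BRIDGE_RESERVED_LAST &&
           start + len_words - 1u >= BRIDGE_RESERVED_FIRST;
}
```

For instance, a 16-word buffer at 0x80 ends at 0x8F and is safe, while a 17-word buffer at the same address would reach 0x90 and collide with the bridge.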
When you are developing for the C54x DSP, there are two ways to go:
Texas Instruments Code Composer Studio
This is a really useful piece of software. It compiles C code and has lots of libraries and debugging support. There is too much to explain here; just check out the Texas Instruments website. CCS compiles to COFF2 format by default.
Binutils Assembler
The only open-source tool for the C54x DSP is the binutils assembler. To use binutils for the C54x DSP you will need to compile it specifically for this architecture; search on Google for instructions, it's really easy. The binutils assembler/linker builds to COFF0 by default; use "c54x-objcopy <input> -O coff2-c54x <output>" to convert an assembled binary to COFF2 format.