Cell Broadband Engine processor DMA engines, Part 2: From an SPE point of view

The Cell Broadband Engine™ (Cell BE) architecture provides on-chip DMA capabilities between the PPE and the SPEs. Meet the SPE interface to the DMA capabilities of the processor, from channel allocation to communication.

Vaidyanathan Srinivasan (svaidyan@in.ibm.com), PowerPC Test Tool Designer, IBM India Pvt Ltd.

Vaidyanathan Srinivasan has a Masters degree in Electronics and Communication engineering from Bharathidasan University, India. He has been working in IBM Global Services (Software Labs), India since February 2000. He has developed device drivers, low-level stress tools and diagnostics software for various PowerPC processors and PowerPC-based systems. His areas of interest are processor architecture and system design. You can contact him at svaidyan@in.ibm.com.



Anand K. Santhanam, PowerPC Test Tool Designer, IBM India Pvt Ltd.

Anand K. Santhanam has a Masters Degree in Software Systems from BITS Pilani, India. He has been in IBM Global Services (Software Labs), India, since July 1999. He has worked with ARM-Linux developing device drivers and power management in embedded systems, PCI device drivers, and developing stress tools for various PowerPC processors. His areas of interest include operating systems and processor architecture. You can reach him at asanthan@in.ibm.com.



Madhavan Srinivasan (masriniv@in.ibm.com), PowerPC Test Tool Designer, IBM India Pvt Ltd.

Madhavan Srinivasan has a B.Eng. in Electrical and Electronics from Madras University, India. He has been in IBM Global Services (Software Labs), India, since November 2003. He has worked in developing Linux/AIX diagnostics and verification tools for floating point and system coherency units of various PowerPC server processors. His areas of interest include PowerPC architecture and operating systems. You can reach him at masriniv@in.ibm.com.



02 May 2006

Part 1 in this series described the internals of the Cell Broadband Engine (Cell BE) architecture and the main components that provide the on-chip DMA capabilities between the PPE and the SPEs. While the previous article covered DMA initiated by the PPE's Memory Flow Controller (MFC), this article delves deeper into the other half of the on-chip DMA transactions, covering the Cell BE processor's SPE DMA architecture, channels, and DMAs from the SPE's perspective.

SPE channels

SPE channels are the primary interface between the MFC and the SPEs. Each channel is akin to a one-way communication pipe: it can be configured as either read-only or write-only. Each channel has an associated channel count, which indicates the number of outstanding operations that can be issued on that channel.

The channel count must be initialized by software when a new context is imposed onto that particular SPE. Using the channel count property, channels can be configured as blocking or non-blocking. Reading a read channel with a channel count of 0 causes the SPE to stall until data is available on the channel or an event occurs that makes the channel count go non-zero. Similarly, a write to a write channel with a channel count of 0 causes the SPE to stall.

The SPEs stop executing while they are stalled. This design saves power when the processor must wait for certain external events to occur; traditional architectures would instead have software run a polling loop, which keeps the processor executing instructions without making any forward progress. Good software design should minimize SPE stall cycles and keep the SPE as busy as possible. However, it is still better to stall than to execute redundant instruction sequences.
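The channel-count semantics described above can be modeled in plain, host-runnable C. This is purely an illustrative sketch, not Cell hardware code: the `channel_t` type and the function names are invented for this example, with `rdch` returning `false` at the point where real hardware would stall the SPE.

```c
#include <stdbool.h>

/* Toy model of an SPE read channel: a count plus the value it carries. */
typedef struct {
    int count;      /* channel count: operations that can be issued right now */
    int data;       /* last value made available on the channel */
} channel_t;

/* Equivalent of rchcnt: query the count without risking a stall. */
static int rchcnt(const channel_t *ch) { return ch->count; }

/* Equivalent of rdch on a read channel: returns true if the read can
 * proceed, false where real hardware would stall the SPE. */
static bool rdch(channel_t *ch, int *out)
{
    if (ch->count == 0)
        return false;   /* SPE would stall here until the count goes non-zero */
    ch->count--;
    *out = ch->data;
    return true;
}

/* An external event (for example, mail arriving) makes the count non-zero. */
static void deliver(channel_t *ch, int value)
{
    ch->data = value;
    ch->count++;
}
```

A polling loop would test `rchcnt()` before calling `rdch()`, keeping the SPE busy; letting the read block instead saves the power those polling instructions would burn.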

Each SPE has 128 channels, although not all of them are implemented and used. Please see the Cell Broadband Engine Architecture Specification (listed in Resources) for full details on channel implementation.

The SPE instruction set contains special instructions to use the channels. They are as follows:

  • Read channel (rdch) This instruction reads the selected channel into a general purpose SPE register. This instruction can only be used on a channel configured as a read channel. Using this instruction on a write channel raises an invalid channel instruction interrupt.
    For example: rdch rt, ch moves data from the channel denoted by ch to the register denoted by rt.
  • Write channel (wrch) This instruction transfers the contents of a general purpose SPE register to the selected channel. This instruction can only be used on a channel configured as a write channel. Using this instruction on a read channel raises an invalid channel instruction interrupt.
    For example: wrch ch, ra moves the data from register ra to channel ch.
  • Read channel count (rchcnt) This instruction reads the channel count of the selected channel into an SPE register. This instruction is useful in determining whether the channel is ready for read or write using the rdch or wrch instruction without stalling the SPE. Read channel count could be used in combination with read/write channel instructions to implement software polling loops and avoid stalling the SPE.
    For example: rchcnt rt, ch will return the channel count of channel ch into the register denoted by rt.

C/C++ intrinsics for Cell BE architecture include the following functions to perform read, write, and read channel count on channels:

  • spe_readch() reads data from a channel. Example: x = spe_readch(1) reads data from channel 1 to x.
  • spe_writech() writes data into a channel. Example: spe_writech(1, x) writes data from x to channel 1.
  • spe_readchcnt() reads the channel count. Example: x = spe_readchcnt(1) reads the count of channel 1 to x.

Channel types

Each SPE has 128 channel slots; not all of them are implemented, and the implemented channels are grouped according to the functionality they provide. Important channel functions include event management, signal notification, mailbox management, DMA enqueue, and DMA status checking. This section looks at each of these groups in detail.

SPE mailbox channels

The MFC provides mailbox queues through which the SPEs interact with the PPE or any other external device. These mailboxes are used for sending status, return codes, waiting for status, and so on. The SPE uses mailbox channels to access one end of each queue, while the PPE or other external devices use MMIO registers to access the other end. The three mailbox queues are: the SPE outbound mailbox queue, the SPE outbound mailbox interrupt queue, and the SPE inbound mailbox queue. The SPE uses wrch on the outbound queues to send data and rdch to retrieve data from the inbound queue. The PPE or other external devices use the corresponding MMIO registers to read data from and write data to the queues. The channels used for the mailbox facility are as follows:

  • SPE write outbound mailbox channel
    The SPE writes data to the mailbox queue by using wrch on this channel, which decrements the channel count by 1. This channel is write-blocking: a write to the channel with a channel count of 0 (outbound queue full) causes the SPE to stall. To avoid stalling the SPE, use the rchcnt instruction to determine whether the channel can accept more data.
  • SPE write outbound mailbox interrupt channel
    This is similar to the normal outbound mailbox channel, except that an interrupt is raised whenever the queue goes non-empty (the channel count decreases). This is a class 2 interrupt from the corresponding SPE, which can be routed to the PPE.
  • SPE read inbound mailbox channel
    The SPE uses rdch on this channel to read data sent by the PPE or an external device. The PPE or external device uses the inbound mailbox MMIO register to place the data.
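One way to picture a mailbox queue is as a small bounded FIFO with an SPE end and an MMIO end. The following host-runnable C sketch models that shape; the queue depth, the struct, and every function name are invented for illustration (the real facility is accessed through channels and MMIO registers, not function calls).

```c
#include <stdbool.h>

#define MBOX_DEPTH 4   /* illustrative depth; real mailbox queues are small */

typedef struct {
    unsigned int slots[MBOX_DEPTH];
    int head, tail, entries;
} mailbox_t;

/* SPE side: wrch on an outbound mailbox channel.  Returns false where
 * the real SPE would stall (queue full, channel count 0). */
static bool spe_wrch_mbox(mailbox_t *mb, unsigned int value)
{
    if (mb->entries == MBOX_DEPTH)
        return false;                   /* write-blocking: SPE stalls */
    mb->slots[mb->tail] = value;
    mb->tail = (mb->tail + 1) % MBOX_DEPTH;
    mb->entries++;
    return true;
}

/* PPE side: MMIO read of the other end of the queue. */
static bool ppe_mmio_read_mbox(mailbox_t *mb, unsigned int *value)
{
    if (mb->entries == 0)
        return false;                   /* nothing to read yet */
    *value = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MBOX_DEPTH;
    mb->entries--;
    return true;
}

/* The channel count as rchcnt would report it for a write channel:
 * the number of free slots remaining. */
static int mbox_wrch_count(const mailbox_t *mb)
{
    return MBOX_DEPTH - mb->entries;
}
```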

SPE signal notification channels

External devices or processors use the SPE signal notification facility to send signals to the SPEs. An external device performs an MMIO write to a signal notification register to deliver the signal; the SPE, from its end, reads the signal notification channels to identify it. Reading the channel with no signals pending stalls the SPE; if the SPE is allowed to stall this way, it resumes execution once a signal is presented. Unlike mailboxes, which can exchange arbitrary data, signals are bit masks corresponding to software events. Special MFC sndsig commands let SPEs update each other's signal notification registers.

  • SPE signal notification channel 1 and 2
    A rdch instruction on these channels yields the 32-bit signal word, and the read resets any bit that was set. You can configure the SPE to 'OR' bit fields together between successive signal updates.
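The OR-mode versus overwrite-mode distinction, and the read-and-reset behavior, can be sketched in host-runnable C. This is an invented model, not Cell code; `signal_reg_t` and the function names are assumptions made for the example.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t word;      /* pending signal bits */
    bool or_mode;       /* true: accumulate bits; false: overwrite */
} signal_reg_t;

/* External device / sndsig side: MMIO write into the notification register. */
static void signal_write(signal_reg_t *s, uint32_t bits)
{
    if (s->or_mode)
        s->word |= bits;    /* successive updates are OR-ed together */
    else
        s->word = bits;     /* each update overwrites the last */
}

/* SPE side: rdch on the signal channel returns the word and resets it. */
static uint32_t signal_rdch(signal_reg_t *s)
{
    uint32_t w = s->word;
    s->word = 0;            /* the read clears every bit that was set */
    return w;
}
```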

SPE event channels

The SPE provides event management facilities through event channels that keep track of various hardware events enabled in the SPE write event mask channel. The SPE programs use these channels to find out the status of various events. The SPE program uses the SPE write event mask channel and enables the bit fields corresponding to the expected events. Once the events are enabled in the SPE write event mask, a read using the SPE event status channel indicates the status of each of the events. If none of the events have occurred, then reading the SPE event status channel will cause the SPE to stall. These are SPE hardware events, unlike mailbox and signal events which software generates.

After an event occurs, it is the software's responsibility to acknowledge all pending events in one write to the SPE event acknowledgement channel, and then proceed to handle each event. If an event is not acknowledged, phantom events can result, because a read from the SPE event status channel still shows the event pending. Unlike signal notification, reading the event status channel alone does not acknowledge anything. You can avoid polling for events by configuring the SPE to generate an interrupt when an event or group of events occurs; these interrupts are presented to the SPE and are not routed to the PPE.

The following events can be monitored using the event status channel:

  • SPE decrementer event
    Triggered when the Most Significant Bit (MSB) of the decrementer count transitions from 0 to 1 (or, when the value becomes negative).
  • SPE inbound mailbox available event
    Triggered when the SPE read inbound mailbox channel count becomes non-zero, indicating that mail is available to read.
  • SPE outbound mailbox available event
    Triggered when the SPE write outbound mailbox channel count becomes non-zero, indicating that space is available in the outbound mailbox.
  • SPE signal notification available events
    Triggered when an external device or processor writes into the signal notification channel(s).
  • MFC tag group update event
    Triggered when the MFC tag group status channel is updated, based on the tag status updates written into the tag status update request channel.
  • Privilege attention event
    Triggered by setting the privilege attention bit in the SPE privilege control register. Privileged software could use this feature to implement SPE debuggers.

The following channels are part of the SPE event management facility:

  • SPE read event status channel
    A read of this channel indicates the status of all events that have been enabled in the SPE write event mask channel. A read with a channel count of 1 returns the status of all enabled events and sets the channel count to 0. This provides a wait-on-event facility: when the desired event occurs, the channel count becomes 1 and the stalled SPE resumes execution.
  • SPE write event mask channel
    This channel contains bit fields for all the events that should affect the SPE event mechanism. If a bit corresponding to an event is enabled in this channel, subsequent reads of SPE read event status channel reveal the actual status of that particular event.
  • SPE read event mask channel
    This is a means to read the current SPE event mask value.
  • SPE write event acknowledgement channel
    A write to this channel with a specific bit set clears the corresponding bit in the event status channel, indicating that the event has been serviced by software.
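The mask/status/acknowledge interaction of these four channels, including the phantom-event hazard, can be modeled in host-runnable C. The struct and function names below are invented for illustration; only events enabled in the mask are visible, and a status read without a matching acknowledgement leaves the event pending.

```c
#include <stdint.h>

typedef struct {
    uint32_t pending;   /* raw hardware events that have occurred */
    uint32_t mask;      /* SPE write event mask channel */
} event_unit_t;

/* Hardware side: an event (decrementer, mailbox available, ...) is raised. */
static void event_raise(event_unit_t *e, uint32_t bit) { e->pending |= bit; }

/* SPE read event status channel: only masked events are visible.
 * The read alone does not clear anything; without a matching
 * acknowledgement, the same ("phantom") event shows up again. */
static uint32_t event_rdch_status(const event_unit_t *e)
{
    return e->pending & e->mask;
}

/* SPE write event acknowledgement channel: one write clears all the
 * serviced events at once. */
static void event_wrch_ack(event_unit_t *e, uint32_t bits)
{
    e->pending &= ~bits;
}
```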

Refer to Table B-1 in the Cell BE architecture document for a complete list of implemented channels (see Resources).

SPE-side DMA

The SPE DMAs can be used for transactions between main memory and SPE local store, or between any external device memory (for example, IO device) and SPE local store, or between any two SPE local stores. The SPEs initiate DMA transactions by using special DMA channels. The DMA channels are the primary interface from the SPE side to the MFC DMA engine. The SPE DMA enqueue logic is similar to the PPE DMA enqueue logic, and each of the DMA enqueue parameters has separate channels like source, destination, size, tag, and so on. The SPE can initiate up to 16 DMAs in parallel as the depth of the SPE DMA queue is 16.

Figure 1. SPE Channel interface DMA diagram

DMA commands

SPE DMAs are similar to PPE DMAs in that the DMAs are done from the perspective of the SPE. In other words, a DMA GET will transfer data from the external device to the SPE, and a DMA PUT will transfer data from the SPE to the external device. The SPE DMAs are classified into three types:

  1. Single element DMA
    This is similar to PPE-initiated DMA: the SPE initiates a single-element data transfer between its local store and an external entity. The external entity can be main memory, another SPE's local store, or any other IO device memory.
  2. List DMAs
    This is used to transfer a list of elements from the local store of the SPE to the main memory, or from main memory to the local store. The DMA list command uses a list of effective addresses for transferring data to and from the local store. The effective address region need not be contiguous in the physical address space, whereas the SPE local store region is contiguous. A single DMA list transfer can have up to 2048 elements. Only the SPE can initiate a List DMA transaction.
  3. Atomic DMAs
    Atomic DMAs provide atomic update functionality from the SPE side. They mimic the behavior of the atomic instructions lwarx, stwcx, ldarx, and stdcx used on the PPE side. Atomic DMAs can be performed only on coherent and cacheable pages. Like List DMAs, atomic DMAs can be initiated only from the SPE side.

The following is a list of various DMA commands from the SPE side:

  • GET moves data from external memory to local store.
  • PUT moves data from local store to external memory.
  • GETL moves lists, rather than a single data item, from external memory to local store.
  • PUTL moves lists, rather than a single data item, from local store to external memory.
  • GETLLAR gets a lock line and creates a reservation. This is similar to the lwarx and ldarx operations on the PPE. The size of transfer is one cache line. The command executes immediately and is not queued in the SPE DMA command queue.
  • PUTLLC puts a lock line based on a reservation obtained using GETLLAR. This is similar to stwcx, stdcx operation in PPE. The size of transfer is one cache line. The command executes immediately and is not queued in the SPE DMA command queue.
  • PUTLLUC puts a lock line unconditionally, with or without a reservation for the lock line. The PUTLLUC operation is not dependant on a previous GETLLAR. The command executes immediately and is not queued in the SPE DMA command queue.
  • PUTQLLUC puts a lock line unconditionally, but this command is placed in the SPE DMA command queue with other DMA commands.

The PUTLLUC or PUTQLLUC operations can clear any previous reservation made by GETLLAR.

DMA enqueue channels

The SPE has special DMA channels which are used for enqueuing DMAs. The SPE application has to use wrch and rdch commands to enqueue DMAs. The DMA channels are as follows:

  • Command and class ID channel
    This channel contains the DMA command and the class ID of the DMA to be enqueued. If the SPE DMA command queue is full, a write to this channel stalls the SPE. To avoid this, use rchcnt on this channel to determine the number of free slots in the SPE DMA queue before enqueuing a new DMA.
  • Command tag ID channel
    This channel contains the identifier or tag for the DMA command. Any number of DMA commands can be tagged with the same tag and they are referred to as a tag group. The tag group can be used to query for the completion of the DMA.
  • Transfer size or list size channel
    This channel contains the size of the DMA transfer. The maximum transfer size is 16KB for a normal DMA; for List DMAs, this channel holds the size of the list.
  • Effective address low or list address channel
    This channel contains the lower 32 bits of the effective address in case of normal DMA, or the pointer to the list element in the local store in case of List DMA. If translation is enabled, the effective address needs to be translated to the real address using MFC segment table and page table.
  • Effective address high channel
    This channel contains the upper 32 bits of the effective address; it is concatenated with the lower part to form a 64-bit effective address. This channel can be set to zero, in which case the effective address is only 32 bits.
  • Local address channel
    This channel contains the local storage address of the DMA which can either be the source or target.

The DMA enqueue logic is similar to the PPE DMA enqueue logic as given below. All the channels should be written from the SPE side using successive wrch instructions:

  1. Write to the local storage address channel.
  2. Write the effective address high channel.
  3. Write the effective address low or list address channel.
  4. Write the transfer/list size channel.
  5. Write the command tag ID channel.
  6. Write the command and class ID channel.

A write to the command and class ID channel causes the DMA to be enqueued in the command queue; the preceding writes can be done in any order. The command and class ID channel has a maximum count equal to the number of slots in the DMA queue, and software must initialize the channel count to the number of empty DMA slots before operating on this channel. A wrch to this channel with a count of 0 (no DMA queue slots free) causes the SPE to stall.
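The commit-on-command-write behavior of the enqueue sequence can be sketched in host-runnable C: parameter writes merely latch values, and only the command/class-ID write commits an entry to the 16-deep queue. The types and function names below are invented for this model.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_QUEUE_DEPTH 16

typedef struct {
    uint32_t lsa, eah, eal, size, tag;   /* staged parameter channels */
} dma_params_t;

typedef struct {
    dma_params_t staged;                 /* last value written per channel */
    dma_params_t queue[DMA_QUEUE_DEPTH];
    uint32_t     cmd[DMA_QUEUE_DEPTH];
    int          depth;                  /* commands currently queued */
} mfc_t;

/* Writes to the parameter channels just latch values; order is free. */
static void wrch_lsa (mfc_t *m, uint32_t v) { m->staged.lsa  = v; }
static void wrch_eah (mfc_t *m, uint32_t v) { m->staged.eah  = v; }
static void wrch_eal (mfc_t *m, uint32_t v) { m->staged.eal  = v; }
static void wrch_size(mfc_t *m, uint32_t v) { m->staged.size = v; }
static void wrch_tag (mfc_t *m, uint32_t v) { m->staged.tag  = v; }

/* Only the command/class-ID write enqueues the DMA; with the queue
 * full the real SPE would stall at this point. */
static bool wrch_cmd(mfc_t *m, uint32_t cmd)
{
    if (m->depth == DMA_QUEUE_DEPTH)
        return false;                    /* count 0: SPE stalls */
    m->queue[m->depth] = m->staged;
    m->cmd[m->depth]   = cmd;
    m->depth++;
    return true;
}
```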

DMA status check

The DMA completion status is based on the command tag, a 5-bit identifier programmed into the command tag ID channel as part of the DMA enqueue process. Specific DMA status channels contain the state of the enqueued DMAs based on the tag. The SPE program operates on these status channels and queries a command tag or tag group to verify whether the DMA or group of DMAs has completed. The DMA status channels are as follows:

  • MFC Write Tag group query mask channel
    This channel contains the bit masks of the tag groups that should be included as part of the DMA query. It has 32 bit fields representing 32 different tags or tag groups. This is a non-blocking channel.
  • MFC Read Tag group status channel
    This channel contains the status bits of all those tags or tag groups that have been enabled in the tag group query mask channel. If a bit is set to 1 in this channel, it indicates the DMA completion for that particular tag or tag group. A bit value of 0 indicates that either the DMA is still under progress or the tag is not part of the status query process. This is a read blocking channel, and the software should initialize the channel count to 1.
  • MFC Write Tag status update request channel
    This channel controls the mechanism of status update in the MFC read tag group status channel. This channel indicates that the status in the MFC read tag group can be one of the following:
    1. Updated immediately
    2. Updated when any tag or tag group DMA has been completed
    3. Updated only when the DMA corresponding to all the tags or tag groups enabled in the write tag group query mask channel has been completed

A write to this channel must complete before any attempt is made to read from the tag group status channel; otherwise a deadlock can result. This is a write-blocking channel with a maximum count of 1.

Assuming a DMA with a tag value of 10 has been enqueued, the algorithm for DMA status checking could be as follows:

  1. Write the MFC write tag group query mask channel with a mask that has bit 10 set (1 << 10), indicating that tag 10 is part of the DMA status query.
  2. Write the tag status update request channel with a suitable value that controls when the status update happens in the MFC read tag group status channel.
  3. Use a read channel instruction on the tag group status channel and poll for DMA completion based on the tag. When bit 10 is set in the value read, the DMA is complete.
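The three-step algorithm above can be modeled in host-runnable C. This is an illustrative sketch with invented names; the `any`/`all` helpers correspond to update conditions 2 and 3 of the tag status update request channel.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t complete;   /* one completion bit per tag group */
    uint32_t query_mask; /* MFC write tag group query mask channel */
} tag_unit_t;

static void tag_set_mask(tag_unit_t *t, uint32_t mask) { t->query_mask = mask; }
static void tag_complete(tag_unit_t *t, int tag) { t->complete |= 1u << tag; }

/* Condition 2: status becomes readable when ANY enabled tag group is done. */
static bool tag_update_any(const tag_unit_t *t)
{
    return (t->complete & t->query_mask) != 0;
}

/* Condition 3: status becomes readable only when ALL enabled tag groups
 * are done. */
static bool tag_update_all(const tag_unit_t *t)
{
    return (t->complete & t->query_mask) == t->query_mask;
}

/* MFC read tag group status channel: completion bits of enabled groups. */
static uint32_t tag_rd_status(const tag_unit_t *t)
{
    return t->complete & t->query_mask;
}
```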

Below is an example code snippet of SPE-initiated DMA in real mode using C/C++ intrinsics of the Cell Broadband Engine SDK. The DMA command used is PUT, which moves data from SPU local store to external memory (main memory or IO device memory). The local store source address is set to 0x0, and the destination effective address is set to 0x2000.

Listing 1. Storing data to main memory
   spe_dma.c

   #include <spu_intrinsics.h>
   #include <spu_internals.h>
   #define SPE_ADDR 0x0
   #define EA_ADDR 0x2000
   int spu_dma()
   {
     int status = 0;
     spu_writech(MFC_LSA, SPE_ADDR);  // Program the LSA channel
     spu_writech(MFC_EAH, 0x0);       // Program EAH channel; high address is 0
     spu_writech(MFC_EAL, EA_ADDR);   // Program the EAL. 0x2000.
     spu_writech(MFC_Size, 0x10);     // DMA of 16 bytes
     spu_writech(MFC_TagID, 5);       // DMA tag of 5
     spu_writech(MFC_Cmd, 0x20);      // PUT. Move data from LSA to EA
     // Check for DMA status
     // Clear any pending tag status update
     spu_writech(MFC_WrTagMask, 0);   // zero out the tag mask channel
     while(!spu_readchcnt(MFC_WrTagUpdate));  // read the tag update channel
                                              // count until 1 is returned.
     spu_readch(MFC_RdTagStat);       // Read the status channel and
                                      // discard the value
     // Now program and wait for the DMA status
     spu_writech(MFC_WrTagMask, 1 << 5);  // Enable bit 5 of the tag mask
                                          // (one mask bit per tag group)
     spu_writech(MFC_WrTagUpdate, 0x2);   // Poll till all DMAs with tag of
                                          // 5 are completed
     while(!spu_readchcnt(MFC_WrTagUpdate));  // read the tag update channel
                                              // count until 1 is returned.
     status = spu_readch(MFC_RdTagStat);  // read the status
     if (status & (1 << 5))
       // DMA SUCCESS
       return 0;
     else
       // DMA FAILURE
       return -1;
   }

DMAs with translation

As is the case with PPE DMA, SPE-initiated DMA can happen in translation mode also, and is controlled by the MFC translation bit in the MFC_SR1 register. When DMA happens with translation enabled, the effective address gets translated using the MFC SLB and page table. The local store address is not translated, and the mechanism of translation and handling of translation-related exceptions are identical to PPE-initiated DMAs.

Atomic DMA

Atomic DMAs are similar to the atomic operations in standard PowerPC®. The SPE can reference memory outside of its local store using DMAs, and atomic DMA operations help SPEs synchronize with other processing elements. The sync word, or lock word, is generally a main memory location accessible to the PPE with load/store instructions. The Cell Broadband Engine architecture allows lock implementation across PPE and SPE software.

The PPE executes a lwarx/stwcx instruction sequence to atomically update a lock word in main memory, while the SPEs use two DMA commands, getllar and putllc, to load and update the lock word. If the PPE successfully executes stwcx first, the putllc DMA command fails, and the SPE has to retry the getllar/putllc sequence. The operation is very similar to the PPE lwarx instruction, except that getllar is a complete DMA GET command with an implied size of one cache line.

The lock word is thus transferred to local store with a reservation using getllar, loaded into an SPE register with an SPE load instruction, modified, and stored back to local store. The putllc DMA command then tries to update the lock word in memory if this SPE still holds the reservation; the operation succeeds or fails depending on whether the reservation for that main memory address is still held. SPE software reads the Atomic Command Status Channel (0x1B) to get the completion status of getllar and the success or failure of putllc. If the DMA failed, the SPE has to repeat the loop starting from the getllar DMA command. The DMA tag group completion channels are not used to get status for atomic DMA commands such as getllar and putllc.
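The getllar/putllc retry loop can be sketched as a host-runnable C model of a reservation. This is an invented illustration, not Cell code: `other_store` stands in for a competing store (such as a successful stwcx on the PPE) that kills the reservation.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t lock_word;      /* the sync word in main memory */
    bool     reserved;       /* reservation held by "our" SPE */
} lockline_t;

/* getllar: fetch the lock line and set a reservation. */
static uint32_t getllar(lockline_t *l)
{
    l->reserved = true;
    return l->lock_word;
}

/* A store by another processor (for example, a successful stwcx on the
 * PPE) clears any outstanding reservation. */
static void other_store(lockline_t *l, uint32_t v)
{
    l->lock_word = v;
    l->reserved = false;
}

/* putllc: conditional store; succeeds only if the reservation survived. */
static bool putllc(lockline_t *l, uint32_t v)
{
    if (!l->reserved)
        return false;        /* status channel reports failure: retry loop */
    l->lock_word = v;
    l->reserved = false;
    return true;
}
```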

Figure 2. Atomic DMA diagram

List DMA

The SPE features special DMA commands that take a list of main memory effective addresses, much like the scatter-gather lists used with storage (SCSI/IDE disk) controllers. The list of main memory effective addresses and sizes is maintained in local store.

Any DMA command will move data between local store and main memory. The main memory address is actually an effective address that gets translated by the MFC's MMU to a particular real address in the system. In List DMAs, a list of such effective addresses (EA) and size is generated in the local store and given as input to the DMA command.

The effective addresses of the DMA command follow the list, while the local store address range is contiguous. One contiguous block of local store can be transferred to a scattered list of EAs with a single PUT LIST DMA command; the same holds for the GET LIST command, where data is picked up from different effective addresses and transferred to local store as one contiguous block. The list of EAs saved in local store can be reused by providing it to other DMA commands; in fact, the same list can be used in a GET LIST and a subsequent PUT LIST DMA command. The list is an array of transfer-size and EA pairs, not a linked-list structure. Section 7.4 in the Cell BE architecture document explains the list structure in detail (see Resources).

You can use list-based DMA techniques to collect data from different SPE local stores and place them in one block for further processing. List DMAs make this type of data aggregation faster by reducing the number of DMA enqueue commands and the amount of control code needed to check on the status of operations. More sophisticated features in List DMA allow you to make the DMA operation wait for certain events before proceeding to the next list element. Basically, the List DMA can wait for completion of operation on each SPU before collecting data. List DMA with stall and notify reduces the need for software polling loop and complex control logic needed to coordinate between SPEs.
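The gather half of this mechanism can be modeled in a few lines of host-runnable C: a list of size/address pairs is walked, and each scattered region lands at the next contiguous local-store offset. The element type and the `getl` function name are invented for this sketch; real List DMA works on translated effective addresses, not host pointers.

```c
#include <string.h>
#include <stddef.h>

/* One list element: transfer size plus an effective address, kept in
 * local store as an array of pairs, not a linked list. */
typedef struct {
    size_t size;
    const unsigned char *ea;   /* stand-in for a translated effective address */
} dma_list_elem_t;

/* Model of GETL: gather scattered main-memory regions into one
 * contiguous local-store block.  Returns total bytes moved. */
static size_t getl(unsigned char *local_store,
                   const dma_list_elem_t *list, size_t n)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        memcpy(local_store + off, list[i].ea, list[i].size);
        off += list[i].size;   /* local-store side stays contiguous */
    }
    return off;
}
```

The same list could drive the PUT direction by swapping the source and destination of the copy.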

Figure 3. List DMA diagram

SPE context

SPE context consists of the following elements:

  • Contents of local store
  • Contents of all 128 registers
  • State of the channels
  • State of DMA commands in progress

The Cell Broadband Engine processor allows you to save and restore the context of an SPE at any point of time. Even the DMA operations in progress can be saved and restored or moved to another SPE. These features enable an OS to multitask SPE tasks on a given set of SPEs.

Though context switching an SPE is very time consuming and inefficient, the hardware provides the infrastructure, and the OS should use the feature as needed. Preemptive context switching of SPE is complex especially with DMA state management and is beyond the scope of this paper.

The OS running on the PPE has access to the local store and channel state through MMIO registers, but the register context of the SPE cannot be loaded directly from the PPE. Context-load code must be copied into local store along with the saved (or initial) register context; this code loads each register with its value from local store and then jumps to the SPE program start location.

Following are the essential steps to create a new SPE task:

  1. Set MFC SR1.
  2. Copy code and data into local store.
  3. Copy register context in predefined location in local store.
  4. Copy register context-load code into local store.
  5. Write NPC to point to register context-load code.
  6. Initialize channel count values.
  7. Initialize mail box channels and any other channel data.
  8. Initialize the SPE's MMU, specifically MFC_SDR1 and SLB entries.
  9. Write correct MFC_SR1, to enable or disable MFC MMU and other configurations.
  10. Write RunControl register to start the SPE. The SPE will execute the register context-load code and then jump to the actual SPE program start address.

All data copying to local store can be done using the PPE-side DMA queue for that SPE, which saves many cycles on the PPE core. The OS running on the PPE should be ready to handle any external interrupts coming from the SPE. SPE interrupts can also be masked during context load, with only the required interrupts enabled during SPE execution. Generally, the stop-and-signal interrupt is enabled so that the SPE program can get the PPE's attention once it finishes execution.

To retrieve the context of an SPE, follow a subset of the above steps. The register context of the SPE program is typically needed only for debugging. You can save the complete local store to main memory through either MMIO or DMA. A context-save program can be loaded into local store and executed by setting the NPC and run control registers; it moves the register values to local store, from which they can subsequently be moved out of the SPE.

Figure 4. SPE Context save/restore diagram

Conclusion

The Cell Broadband Engine processor is a unique processor, often described as a heterogeneous system-on-a-chip. All eight SPEs work in concert with the PPE to carry out various tasks, and the on-chip DMA engines provide the means of data movement between SPE and PPE tasks. This series of articles explored the various facets of the on-chip DMA and the power it brings for efficiently moving data in and out of SPEs, thereby forming the backbone of Cell Broadband Engine functionality.

The Cell BE architecture specification and the Cell Broadband Engine SDK C/C++ intrinsics provide comprehensive coverage of Cell Broadband Engine functionality and a suitable framework for developing applications using the IBM Full-System Simulator for the Cell Broadband Engine Architecture.

Resources
