This article provides an example that shows how to create and register a DaCS user error handler.
DaCS provides support for registration of user-created error handlers, which are called under certain error conditions. The error handlers can be called for synchronous or asynchronous errors.
In SDK 3.0, any synchronous error reported to the error handlers can cause the process to abort. This happens when DaCS has detected a fatal error from which it cannot recover. Asynchronous errors include child failures (from the host process) and termination requests from a parent (from the accelerator process). Abnormal child termination can cause the parent to abort after calling all registered error handlers.
A normal child exit with a non-zero status can be reported asynchronously to the error handlers, but the process does not abort. This enables the parent process to determine whether the non-zero exit represents an error condition.
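How a parent decides whether a non-zero exit is really an error is up to the application. The following is a minimal sketch of one possible policy, not part of DaCS: the `child_status` enum, the `benign_exit_codes` table, and `child_exit_is_error()` are all hypothetical names, and in a real handler the two inputs would come from `dacs_error_num()` and `dacs_error_code()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the status a handler learns from
   dacs_error_num(): a normal non-zero exit versus an abort. */
enum child_status { CHILD_FAILED, CHILD_ABORTED };

/* Exit codes that this (hypothetical) application treats as benign,
   for example "no work left" or "input exhausted". */
static const uint32_t benign_exit_codes[] = { 2, 9 };

/* Returns true when the parent should treat the child exit as an error. */
bool child_exit_is_error(enum child_status status, uint32_t code)
{
    size_t i;

    /* An abnormal termination is always treated as an error here. */
    if (status == CHILD_ABORTED)
        return true;

    /* A normal exit is an error unless its code is on the benign list. */
    for (i = 0; i < sizeof benign_exit_codes / sizeof benign_exit_codes[0]; i++) {
        if (code == benign_exit_codes[i])
            return false;
    }
    return true;
}
```

With this table, an exit code of 9 from a child that exited normally is accepted, while an exit code of 1 is flagged as an error.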
When it is called, a user error handler is passed an error object that describes the error. The handler can inspect this object using the services that DaCS provides. The error object contains the DE and PID of the failing process. These can be used to call dacs_de_test() to reap the status of that process, which allows another process to be started on that DE.
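The reaping step described above might be sketched as follows. So that the sketch compiles without the SDK, it defines its own stand-in types and stub functions; these are assumptions, not the library's definitions. In particular, the three-argument shape of `dacs_de_test()` (DE, PID, pointer to exit status) is assumed here and should be checked against the API reference, and `stub_error` and `reap_failed_child()` are hypothetical names.

```c
#include <stdint.h>

/* ---- Stand-ins so the sketch builds without the SDK (assumptions). ----
   When building against DaCS, remove this section and use the SDK header. */
typedef uint32_t de_id_t;
typedef uint64_t dacs_process_id_t;
typedef int DACS_ERR_T;
typedef struct { de_id_t de; dacs_process_id_t pid; } stub_error;
typedef const stub_error *dacs_error_t;

static de_id_t stub_last_de;            /* records what the stub was asked */
static dacs_process_id_t stub_last_pid;

static DACS_ERR_T dacs_error_de(dacs_error_t e, de_id_t *de)
{
    *de = e->de;
    return 0;
}

static DACS_ERR_T dacs_error_pid(dacs_error_t e, dacs_process_id_t *pid)
{
    *pid = e->pid;
    return 0;
}

/* Stub: pretends the child exited with code 9 and returns the value 4,
   mirroring the sample output for a failed process in this article. */
static DACS_ERR_T dacs_de_test(de_id_t de, dacs_process_id_t pid,
                               int32_t *exit_status)
{
    stub_last_de = de;
    stub_last_pid = pid;
    *exit_status = 9;
    return 4;
}
/* ---- End of stand-ins. ---- */

/* Look up the failing DE and PID in the error object, then reap the
   child's status so that the DE can host a new process. */
DACS_ERR_T reap_failed_child(dacs_error_t error, int32_t *exit_status)
{
    de_id_t de = 0;
    dacs_process_id_t pid = 0;
    DACS_ERR_T rc;

    rc = dacs_error_de(error, &de);
    if (rc)
        return rc;              /* could not read the DE from the error */
    rc = dacs_error_pid(error, &pid);
    if (rc)
        return rc;              /* could not read the PID from the error */
    return dacs_de_test(de, pid, exit_status);
}
```

The function passes the result of `dacs_de_test()` back to the caller without interpreting it, because the meaning of each return value is defined by the library and not by this sketch.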
The DaCS library uses the SIGTERM signal to handle asynchronous errors and termination requests. A dedicated error handling thread is created in dacs_runtime_init() for this purpose. Applications using the DaCS library should not create any application threads before calling dacs_runtime_init(). No application thread should unmask this signal.
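A common POSIX pattern that is consistent with these rules is sketched below. It is an illustration only and makes no claim about how DaCS is implemented internally: SIGTERM is blocked before any thread is created, so every later thread inherits the blocked mask, and one dedicated thread collects the signal synchronously with `sigwait()`. The names `error_thread_main()` and `run_sigterm_demo()` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical stand-in for a library's error handling thread: it waits
   synchronously for SIGTERM instead of installing a signal handler. */
static void *error_thread_main(void *arg)
{
    int *received = arg;
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGTERM);
    if (sigwait(&set, received) != 0)
        *received = -1;
    return NULL;
}

/* Returns the signal number collected by the dedicated thread, or -1. */
int run_sigterm_demo(void)
{
    sigset_t set, old;
    pthread_t tid;
    int received = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGTERM);

    /* Block SIGTERM before creating any thread. Threads created from
       now on inherit this mask, so none of them takes the signal. */
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    if (pthread_create(&tid, NULL, error_thread_main, &received) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* Simulate an asynchronous termination request. Because every thread
       has SIGTERM blocked, the signal stays pending until sigwait()
       accepts it in the dedicated thread. */
    kill(getpid(), SIGTERM);

    pthread_join(tid, NULL);
    pthread_sigmask(SIG_SETMASK, &old, NULL);   /* restore the old mask */
    return received;
}
```

If a thread were created before the mask was set, or if a thread unblocked SIGTERM later, the signal could be delivered to that thread with its default action and terminate the process before the dedicated thread ever saw it.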
Begin by creating a user error handler called my_errhandler. You can use the code in Listing 1.
Listing 1. The error handler
/****************************************************************
 Example of a user error handler
 This includes invocations of additional functions of
 the passed "dacs_error_t" error parameter
****************************************************************/
#include <stdio.h>
#include <stdint.h>
#include <dacs.h>

int my_errhandler(dacs_error_t error){
    /* need local variables for passback of values */
    DACS_ERR_T dacs_rc=0;
    DACS_ERR_T dacs_error_rc; /* holds the DACS_ERR_T reported by the error */
    de_id_t de=0;
    dacs_process_id_t pid=0;
    uint32_t code = 0;
    const char * error_string;

    /* Get the DACS_ERR_T in the error to learn what happened */
    printf("\n\n--in my_dacs_errhandler\n");
    dacs_error_rc=dacs_rc=dacs_error_num(error);
    printf(" dacs_error_num indicates DACS_ERR_T=%d %s\n",
           dacs_rc,dacs_strerror(dacs_rc));

    /* Get the exit code in the error to learn what happened */
    dacs_rc=dacs_error_code(error,&code);
    if(dacs_rc){ /* error invoking dacs_error_code */
        printf(" dacs_error_code call had error DACS_ERR_T=%d %s\n",
               dacs_rc,dacs_strerror(dacs_rc));
    }
    else {
        if (DACS_STS_PROC_ABORTED==dacs_error_rc){
            /* for an aborted process, the code is the signal number */
            printf(" dacs_error_code signal signal=%u ",code);
        }
        else if (DACS_STS_PROC_FAILED==dacs_error_rc){
            /* for a failed process, the code is the exit code */
            printf(" dacs_error_code exit code=%u\n",code);
        }
        else { /* reason is something other than aborted or failed */
            printf(" dacs_error_code exit/signal code=%u\n",code);
        }
    }

    /* Get the error string in the error to learn what happened */
    dacs_rc=dacs_error_str(error,&error_string);
    if(dacs_rc){ /* error invoking dacs_error_str */
        printf(" dacs_error_str call had error DACS_ERR_T=%d %s\n",
               dacs_rc,dacs_strerror(dacs_rc));
    }
    else {
        printf(" dacs_error_str=%s\n",error_string);
    }

    /* Which DE had this error? */
    dacs_rc=dacs_error_de(error,&de);
    if(dacs_rc){ /* error invoking dacs_error_de */
        printf(" dacs_error_de call had error DACS_ERR_T=%d %s\n",
               dacs_rc,dacs_strerror(dacs_rc));
    }
    else {
        printf(" dacs_error_de=%08x\n",de);
    }

    /* What was the dacs_process_id_t? */
    dacs_rc=dacs_error_pid(error,&pid);
    if(dacs_rc){ /* error invoking dacs_error_pid */
        printf(" dacs_error_pid call had error "
               "DACS_ERR_T=%d %s\n",dacs_rc,dacs_strerror(dacs_rc));
    }
    else {
        printf(" dacs_error_pid=%ld\n",pid);
    }

    printf("exiting user error handler\n\n");
    return 0; /* in SDK 3.0, the return value is ignored */
}
Next, register the user error handler with the dacs_errhandler_reg API:
dacs_rc= dacs_errhandler_reg((dacs_error_handler_t)&my_errhandler,0);
If the address of my_errhandler is not passed or the cast to dacs_error_handler_t is omitted, the compiler produces warnings.
Reading the error handler output
This is what the output looks like if the accelerator program exits with a return code of 9:
--in my_dacs_errhandler
 dacs_error_num indicates DACS_ERR_T=4 DACS_STS_PROC_FAILED
 dacs_error_code exit code=9
 dacs_error_str=DACS_STS_PROC_FAILED
 dacs_error_de=01020200
 dacs_error_pid=5503
exiting user error handler
This is the example output if the accelerator program aborts:
--in my_dacs_errhandler
 dacs_error_num indicates DACS_ERR_T=5 DACS_STS_PROC_ABORTED
 dacs_error_code signal signal=6  dacs_error_str=DACS_STS_PROC_ABORTED
 dacs_error_de=01020200
 dacs_error_pid=5894
exiting user error handler
This article provided an example that showed how to create and register a DaCS user error handler.
Learn
- Refer to Data Communication and Synchronization Library for Cell Broadband Engine Programmer’s Guide and API Reference for the source material from which this article was extracted.

Kane Scarlett is a technology journalist/analyst with 20 years in the business, working for such publishers as National Geographic, Population Reference Bureau, Miller Freeman, and IDG, and managing, editing, and writing for such august journals as JavaWorld, LinuxWorld, and of course, developerWorks.



