6 replies Latest Post - ‏2008-10-16T16:59:06Z by SystemAdmin
SystemAdmin

Question about mfc barrier

‏2008-09-20T08:05:45Z |
Hi all,

I use the following double-buffering scheme:


unsigned int buffer_idx = 0;
initiate_mfc_get(_buf0[buffer_idx], first_tag_id+buffer_idx);
initiate_mfc_get(_buf1[buffer_idx], first_tag_id+buffer_idx);

while (there_is_more_data)
{
    const unsigned int next_buffer_idx = buffer_idx ^ 1;

    initiate_mfc_get(_buf0[next_buffer_idx], first_tag_id+next_buffer_idx); /* <- (*) */
    initiate_mfc_get(_buf1[next_buffer_idx], first_tag_id+next_buffer_idx);

    mfc_wait(1 << (first_tag_id+buffer_idx));
    process_buffers(_buf0[buffer_idx], _buf1[buffer_idx]);

    initiate_mfc_put(_buf0[buffer_idx], first_tag_id+buffer_idx);
    initiate_mfc_put(_buf1[buffer_idx], first_tag_id+buffer_idx);

    buffer_idx = next_buffer_idx;
}

mfc_wait(1 << (first_tag_id+buffer_idx));
process_buffers(_buf0[buffer_idx], _buf1[buffer_idx]);

initiate_mfc_put(_buf0[buffer_idx], first_tag_id+buffer_idx);
initiate_mfc_put(_buf1[buffer_idx], first_tag_id+buffer_idx);

mfc_wait(1 << (first_tag_id+buffer_idx));


The snippet above issues a DMA put and, right after it, a DMA get to the MFC with the same tag ID and buffer_idx. This did not work as I expected, because the MFC may reorder these commands to improve efficiency. I wanted to make sure that the reads start only after all the writes have completed. A get with barrier at the line marked with the star should achieve this.
But it did not work, and I don't know why. Any ideas? For now both get commands at the star are fenced, and this works, but it also forces the reads to complete in the order they were issued, which I think is an unnecessary restriction. Why did the barrier not work? Or is the scheme wrong? In my real code the initiate_ functions create MFC lists and send list commands, but I think this has nothing to do with the problem.

peppin
Updated on 2008-10-16T16:59:06Z at 2008-10-16T16:59:06Z by SystemAdmin
  • mkistler

    Re: Question about mfc barrier

    ‏2008-09-20T14:43:51Z  in response to SystemAdmin
    peppin,

    Can you be specific ... are you using get with barrier (GETB) or get with fence (GETF)? You mentioned both barrier and fence in your post, but these provide different ordering guarantees.

    Mike Kistler
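The difference between the two forms can be illustrated with a small, simplified model in plain C. This is only a sketch of the ordering rules, not SPU code: the names cmd_t, order_t and may_start are hypothetical and do not come from any SDK, and "done" stands in for command completion. It assumes that a fenced command is ordered after all earlier commands in its tag group, and that a barrier command additionally holds back all later commands in its tag group.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are not part of any SDK. */
typedef enum { ORDER_NONE, ORDER_FENCE, ORDER_BARRIER } order_t;

typedef struct {
    int     tag;    /* tag group id */
    order_t order;  /* plain, fenced (GETF-like) or barrier (GETB-like) form */
    bool    done;   /* has the command completed? */
} cmd_t;

/* May queue[i] start now? Only commands issued earlier (index < i)
 * in the same tag group can hold it back in this model. */
static bool may_start(const cmd_t *queue, int i)
{
    assert(queue != 0 && i >= 0);
    for (int j = 0; j < i; ++j) {
        if (queue[j].tag != queue[i].tag || queue[j].done)
            continue;               /* other tag group, or already finished */
        /* A fenced or barrier command waits for every earlier
         * incomplete command in its tag group. */
        if (queue[i].order != ORDER_NONE)
            return false;
        /* Any command waits for an earlier incomplete barrier
         * command in its tag group. */
        if (queue[j].order == ORDER_BARRIER)
            return false;
    }
    return true;
}
```

In this model, for a queue holding an incomplete PUT, a fenced GET and a plain GET in one tag group, may_start holds back the fenced GET but lets the plain GET go ahead. Switching the first GET to the barrier form holds back both.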
    • SystemAdmin

      ‏2008-09-20T15:52:44Z  in response to mkistler
      Hi,

      I tried get with barrier (GETB) at the star:
      
      initiate_mfc_get_barrier(_buf0[next_buffer_idx], first_tag_id+next_buffer_idx); /* <- (*) */
      

      But it did not work and I don't understand why. That is my question.

      peppin
      • SystemAdmin

        ‏2008-10-16T08:37:26Z  in response to SystemAdmin
        Nobody has any idea of what the problem might be here? :(

        I also tried a global mfc_barrier() before the get commands, but the result is the same, i.e. reads occur before all the writes have been completed. (I have verified that the barrier command appears in the queue using info spu dma.)
        Either it is simply not working the way the documents describe, or, more probably, I still don't understand how it should work.
        • SystemAdmin

          ‏2008-10-16T11:50:08Z  in response to SystemAdmin
          It seems that I have finally solved the problem. It had nothing to do with MFC barriers or fences: the DMA list in the local store was being overwritten while the transfer was still active.
        • mkistler

          ‏2008-10-16T12:25:24Z  in response to SystemAdmin
          I recommend that you post the full source code. That way others can see exactly what commands are being issued in what sequence, and that should help identify the problem.

          Mike Kistler
          • SystemAdmin

            ‏2008-10-16T16:59:06Z  in response to mkistler
            Hi,

            As I have already said, I solved the problem. I am not allowed to post the full source code, but I can give a sketch of what it was.
            Consider the following snippet:

            
            dma_list_data _list[2];
            dma_buffer    _buf[2];

            unsigned int buffer_idx = 0;

            create_mfc_list(_list[buffer_idx]);
            initiate_mfc_get_list(_list[buffer_idx], _buf[buffer_idx], first_tag_id+buffer_idx);

            while (there_is_more_data)
            {
                const unsigned int next_buffer_idx = buffer_idx ^ 1;

                create_mfc_list(_list[next_buffer_idx]); /* (4) */
                initiate_mfc_get(_list[next_buffer_idx], _buf[next_buffer_idx], first_tag_id+next_buffer_idx); /* (1) */

                mfc_wait(1 << (first_tag_id+buffer_idx));
                process_buffer(_buf[buffer_idx]);

                create_mfc_list(_list[buffer_idx]); /* (3) */
                initiate_mfc_put(_list[buffer_idx], _buf[buffer_idx], first_tag_id+buffer_idx); /* (2) */

                buffer_idx = next_buffer_idx;
            }

            mfc_wait(1 << (first_tag_id+buffer_idx));
            process_buffer(_buf[buffer_idx]);

            create_mfc_list(_list[buffer_idx]);
            initiate_mfc_put(_list[buffer_idx], _buf[buffer_idx], first_tag_id+buffer_idx);
            mfc_wait(1 << (first_tag_id+buffer_idx));
            


            As you can see, this double-buffering scheme puts one buffer and immediately gets new data into it while the other buffer is being processed. The GET (line 1) therefore must not start until the preceding PUT (line 2) on the same buffer has completed. You can use either a fence or a barrier on the GET; both should work. But it still won't, because the code has another, trivial error that was hard for me to notice: the DMA list created at line 3 is overwritten at line 4 in the next iteration, while the PUT that uses it may still be in progress! Unfortunately, the output of the program then depends on the speed of process_buffer and on which compiler optimizations are turned on. My solution was to use 3 buffers to store the DMA lists:

            
            dma_list_data _list[3];
            dma_buffer    _buf[2];

            unsigned int buffer_idx = 0;
            unsigned int list_buffer_idx = 0;

            create_mfc_list(_list[list_buffer_idx]);
            initiate_mfc_get_list(_list[list_buffer_idx], _buf[buffer_idx], first_tag_id+buffer_idx);
            list_buffer_idx = (list_buffer_idx + 1) % 3;

            while (there_is_more_data)
            {
                const unsigned int next_buffer_idx = buffer_idx ^ 1;

                create_mfc_list(_list[list_buffer_idx]);
                initiate_mfc_get_with_barrier(_list[list_buffer_idx], _buf[next_buffer_idx], first_tag_id+next_buffer_idx);
                list_buffer_idx = (list_buffer_idx + 1) % 3;

                mfc_wait(1 << (first_tag_id+buffer_idx));
                process_buffer(_buf[buffer_idx]);

                create_mfc_list(_list[list_buffer_idx]);
                initiate_mfc_put(_list[list_buffer_idx], _buf[buffer_idx], first_tag_id+buffer_idx);
                list_buffer_idx = (list_buffer_idx + 1) % 3;

                buffer_idx = next_buffer_idx;
            }

            mfc_wait(1 << (first_tag_id+buffer_idx));
            process_buffer(_buf[buffer_idx]);

            create_mfc_list(_list[list_buffer_idx]);
            initiate_mfc_put(_list[list_buffer_idx], _buf[buffer_idx], first_tag_id+buffer_idx);
            mfc_wait(1 << (first_tag_id+buffer_idx));
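To see how soon a list slot gets rewritten in each scheme, the order of list commands can be replayed in plain C without any DMA. This is only a rough sketch: the helper names slots_per_buffer, slots_rotating and min_reuse_distance are made up for this illustration, and the model only counts how many commands lie between two uses of the same slot. It says nothing about whether the earlier transfer has actually completed.

```c
#include <assert.h>
#include <stddef.h>

/* Two-list scheme: the list slot is the data buffer index, so the
 * commands go GET[0], then GET[next], PUT[cur] per loop iteration.
 * Fills slots[0..n-1] with the slot used by each command. */
static void slots_per_buffer(int *slots, size_t n)
{
    unsigned cur = 0;
    size_t k = 0;
    if (k < n) slots[k++] = (int)cur;          /* initial GET */
    while (k < n) {
        const unsigned next = cur ^ 1u;
        slots[k++] = (int)next;                /* GET into the other buffer */
        if (k < n) slots[k++] = (int)cur;      /* PUT of the processed buffer */
        cur = next;
    }
}

/* Rotating scheme: every list command takes the next of num_slots slots. */
static void slots_rotating(int *slots, size_t n, int num_slots)
{
    assert(num_slots > 0);
    for (size_t k = 0; k < n; ++k)
        slots[k] = (int)(k % (size_t)num_slots);
}

/* Smallest number of commands between two uses of the same slot;
 * returns 0 if no slot is used twice. */
static size_t min_reuse_distance(const int *slots, size_t n)
{
    size_t best = 0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = i + 1; j < n; ++j)
            if (slots[i] == slots[j]) {
                if (best == 0 || j - i < best)
                    best = j - i;
                break;   /* nearest later use of this slot found */
            }
    return best;
}
```

For nine commands the two-list scheme gives a minimum distance of 1, because the GET's list overwrites the list of the PUT issued just before it. Three rotating slots give a distance of 3, so a slot is only rewritten after two other commands have been issued.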
            


            I hope the explanation was easy to understand.
            Regards,