Pageable data structure
To minimize the data working set, try to concentrate the frequently used data and avoid unnecessary references to virtual-storage pages.
Specifically:
- Use the malloc() or calloc() subroutines to request only as much space as you actually need. Never request and then initialize a maximum-sized array when the actual situation uses only a fraction of it. Each time you touch a new page to initialize array elements, you effectively force the VMM to steal a page of real memory from another process. Later, this results in a page fault when the process that owned that page tries to access it again.
- The difference between the malloc() and calloc() subroutines is not just in the interface. Because the calloc() subroutine zeroes the allocated storage, it touches every page that is allocated, whereas the malloc() subroutine touches only the first page. If you use the calloc() subroutine to allocate a large area and then use only a small portion at the beginning, you place an unnecessary load on the system. Not only do the pages have to be initialized; if their real-memory frames are reclaimed, the initialized and never-to-be-used pages must be written out to paging space. This situation wastes both I/O and paging-space slots.
- Linked lists of large structures (such as buffers) can result in similar problems, because following the chain can touch a different page for each structure visited even when only its key and link are read. If your program does a lot of chain-following looking for a particular key, consider maintaining the links and keys separately from the data or using a hash-table approach instead.
- Locality of reference means locality in time, not just in address space. Initialize data structures just before they are used (if at all). In a heavily loaded system, data structures that are resident for a long time between initialization and use risk having their frames stolen. Your program then experiences an unnecessary page fault when it begins to use the data structure.
- Similarly, if a large structure is used early and then left untouched for the remainder of the program, it should be released. It is not sufficient to use the free() subroutine to free the space that was allocated with the malloc() or calloc() subroutines. The free() subroutine releases only the address range that the structure occupied. To release the real memory and paging space, use the disclaim() subroutine to disclaim the space as well. Call disclaim() before you call free().
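
The suggestion to keep links and keys separate from the data can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names payload, index_entry, buffer_index, index_add(), index_find(), and index_free(), and the 8 KB buffer size, are all hypothetical. The keys sit in one small contiguous array that grows only as entries are added, so a search reads a few densely packed pages instead of visiting one large buffer per step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAYLOAD_BYTES 8192  /* hypothetical: each buffer spans two 4 KB pages */

/* The large data area, kept out of the search path. */
typedef struct {
    char bytes[PAYLOAD_BYTES];
} payload;

/* One small index entry: the key plus a link to its payload. */
typedef struct {
    int      key;
    payload *data;
} index_entry;

/* A compact, contiguous index that grows on demand. */
typedef struct {
    index_entry *entries;
    size_t       count;
    size_t       capacity;
} buffer_index;

/* Add a key and a newly allocated payload holding a copy of text.
   Returns 0 on success, -1 if an allocation fails. */
int index_add(buffer_index *idx, int key, const char *text)
{
    assert(idx != NULL && text != NULL);

    if (idx->count == idx->capacity) {
        /* Grow the index gradually instead of reserving a maximum up front. */
        size_t new_cap = idx->capacity ? idx->capacity * 2 : 16;
        index_entry *grown = realloc(idx->entries, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        idx->entries  = grown;
        idx->capacity = new_cap;
    }

    /* malloc(), not calloc(): the unused tail of the buffer is never written. */
    payload *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;

    size_t n = strlen(text);
    if (n >= PAYLOAD_BYTES)
        n = PAYLOAD_BYTES - 1;      /* truncate text that does not fit */
    memcpy(p->bytes, text, n);
    p->bytes[n] = '\0';

    idx->entries[idx->count].key  = key;
    idx->entries[idx->count].data = p;
    idx->count++;
    return 0;
}

/* Scan only the compact index; no payload is read during the search. */
payload *index_find(const buffer_index *idx, int key)
{
    for (size_t i = 0; i < idx->count; i++) {
        if (idx->entries[i].key == key)
            return idx->entries[i].data;
    }
    return NULL;
}

/* Free every payload and the index itself, leaving idx empty. */
void index_free(buffer_index *idx)
{
    for (size_t i = 0; i < idx->count; i++)
        free(idx->entries[i].data);
    free(idx->entries);
    idx->entries  = NULL;
    idx->count    = 0;
    idx->capacity = 0;
}
```

A caller starts from a zero-initialized buffer_index, adds entries with index_add(), and looks them up with index_find(). The linear scan keeps the sketch short; the same split between a small index and large payloads applies equally to a hash table keyed on the same field.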