Anyone who has used or learned C has met malloc. It allocates a contiguous block of memory, which is released with free when it is no longer needed. However, many programmers are unfamiliar with how malloc works underneath, and some even take it for a system call provided by the operating system or a keyword of the language. In reality, malloc is just a function in the C standard library, and a basic implementation is not complicated. Any programmer with a working knowledge of C and operating systems can follow it.
This article explores the inner workings of malloc by implementing a simple version of it. Although this implementation is less efficient than existing ones such as the one in glibc, it is much simpler and easier to understand. More importantly, it follows the same basic principles as real implementations.
The article starts by introducing essential knowledge about the operating system's memory management and related system calls. Then, it gradually builds a simple malloc. For simplicity, the focus is on the x86_64 architecture running Linux.
1. What is malloc
2. Preliminary Knowledge
2.1 Linux Memory Management
2.1.1 Virtual Memory Address and Physical Memory Address
2.1.2 Page and Address Composition
2.1.3 Memory Pages and Disk Pages
2.2 Linux Process Level Memory Management
2.2.1 Memory Arrangement
2.2.2 Heap Memory Model
2.2.3 brk and sbrk
2.2.4 Resource Limit and rlimit
3. Implementing Malloc
3.1 Toy Implementation
3.2 Formal Implementation
3.3 Legacy Issues and Optimization
4. Other References
1. What is malloc
Before implementing malloc, it is necessary to define it properly. According to the C standard, the prototype of malloc is:
void* malloc(size_t size);
The function must allocate a contiguous block of memory in the system, meeting the following requirements:
- The allocated memory must be at least the number of bytes specified by the size parameter.
- The return value is a pointer to the starting address of the allocated memory.
- Memory regions returned by separate calls to malloc must not overlap, unless the earlier region has already been freed.
- malloc should complete the allocation and return as quickly as possible (it should not rely on an NP-hard allocation algorithm).
- The implementation should also provide realloc for resizing a block and free for releasing it.
More details about malloc can be found by typing 'man malloc' on the command line.
2. Preliminary Knowledge
Before implementing malloc, it's essential to understand how Linux manages memory.
2.1 Linux Memory Management
2.1.1 Virtual Memory Address and Physical Memory Address
Modern operating systems typically use virtual memory addressing. Each process sees its own large, private address space, but the amount of memory it can actually use is bounded by the physical memory (and swap space) available. Every access to a virtual address is translated into a physical address by the MMU (Memory Management Unit).
2.1.2 Page and Address Composition
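Memory is managed in pages rather than individual bytes. A typical page size in Linux is 4096 bytes. A virtual address is split into a page number (the high bits) and an offset within the page (the low bits; 12 bits for a 4096-byte page). The MMU looks up the page number in a page table to find the physical page, and the offset is carried over unchanged.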
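The split of an address into page number and offset can be sketched in a few lines of C. This is only an illustration of the arithmetic; the function names page_size, page_number, and page_offset are made up for this example, and the real translation is done by hardware, not by user code.

```c
#include <stdint.h>
#include <unistd.h>

/* Page size reported by the OS (commonly 4096 on x86_64 Linux). */
uint64_t page_size(void) {
    long sz = sysconf(_SC_PAGESIZE);
    return sz > 0 ? (uint64_t)sz : 4096u;  /* fall back to the typical size */
}

/* High part of the address: which page it falls in. */
uint64_t page_number(uint64_t addr, uint64_t psize) {
    return addr / psize;
}

/* Low part of the address: the position inside that page. */
uint64_t page_offset(uint64_t addr, uint64_t psize) {
    return addr % psize;
}
```

With a 4096-byte page, the address 0x12345 falls in page 0x12 at offset 0x345.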
2.1.3 Memory Pages and Disk Pages
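Physical memory can be thought of as a cache for pages stored on disk. When a process touches a page that is not currently resident in physical memory, a page fault occurs, and the kernel loads the corresponding page from disk into memory before the access is retried.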
2.2 Linux Process Level Memory Management
2.2.1 Memory Arrangement
Understanding the relationship between virtual and physical memory helps explain how a process's memory is laid out. On a 64-bit Linux system, the user space of a process is divided, from low to high addresses, into sections such as code, data, BSS, heap, mapping area, and stack.
2.2.2 Heap Memory Model
Malloc primarily allocates memory from the heap. Linux maintains a break pointer that marks the end of the heap: addresses from the start of the heap up to the break are usable, and addresses beyond the break are unmapped. The break can be moved with brk and sbrk.
2.2.3 brk and sbrk
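Both calls adjust the break pointer to increase or decrease the heap size. brk sets the break to a given address, while sbrk moves it by a signed increment and returns the previous break, so sbrk(0) reports the current break. On Linux, brk is the underlying system call and sbrk is a library function built on top of it. On failure, brk returns -1 and sbrk returns (void *)-1. The allocator built in this article relies on these two calls.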
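A minimal sketch of moving the break with sbrk follows. The helper names grow_heap and current_break are hypothetical, chosen for this example; only sbrk itself comes from the system.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in <unistd.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Current position of the break: sbrk(0) moves nothing and reports it. */
void *current_break(void) {
    return sbrk(0);
}

/* Grow the heap by `bytes` and return the start of the new region,
 * or NULL if the kernel refuses to move the break. */
void *grow_heap(intptr_t bytes) {
    void *old_break = sbrk(bytes);   /* sbrk returns the break before the move */
    if (old_break == (void *)-1)
        return NULL;
    return old_break;
}
```

The value returned by grow_heap is the old break, which is exactly where the newly usable bytes begin.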
2.2.4 Resource Limit and rlimit
Each process has limits on the resources it can use. These limits can be retrieved and adjusted using getrlimit and setrlimit system calls.
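As a small illustration, the limit most relevant to a brk-based heap can be read with getrlimit. The choice of RLIMIT_DATA here is an assumption made for this sketch (it covers the data segment, which includes the heap grown by brk), and heap_limits is a made-up helper name.

```c
#include <sys/resource.h>

/* Read the soft and hard limits on the data segment.
 * Returns 0 on success, -1 if getrlimit fails. */
int heap_limits(rlim_t *soft, rlim_t *hard) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_DATA, &lim) != 0)
        return -1;
    *soft = lim.rlim_cur;   /* limit currently enforced */
    *hard = lim.rlim_max;   /* ceiling up to which the soft limit may be raised */
    return 0;
}
```

A value of RLIM_INFINITY in either field means no limit is imposed.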
3. Implementing Malloc
3.1 Toy Implementation
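A simple toy implementation of malloc can be written using sbrk to move the break pointer and return the old break as the start of the new block. However, this implementation records nothing about the blocks it hands out, so it has no way to free or reuse memory, and free and realloc cannot be built on top of it.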
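One possible form of such a toy allocator is sketched below. It is named toy_malloc (a name chosen here) so that it does not clash with the library's malloc, and returning NULL for a zero-size request is a choice made for this sketch.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in <unistd.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Toy allocator: every call simply pushes the break forward.
 * No bookkeeping is kept, so the memory can never be freed or reused. */
void *toy_malloc(size_t size) {
    if (size == 0 || size > (size_t)INTPTR_MAX)
        return NULL;
    void *p = sbrk((intptr_t)size);  /* old break = start of the new block */
    if (p == (void *)-1)
        return NULL;
    return p;
}
```

Two consecutive calls return back-to-back regions, which satisfies the non-overlap requirement but nothing more.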
3.2 Formal Implementation
To create a more robust implementation, we need to use a linked list of blocks, each containing metadata and the actual data. This allows us to track allocated and free blocks efficiently.
3.2.1 Data Structure
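We define a structure (the block header) for each block, including its size, a pointer to the next block, a free flag, padding to keep the header aligned, and a magic pointer that points to the block's own data area. The magic pointer lets free check that an address it receives really was returned by malloc. This structure helps manage memory allocation and deallocation.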
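A sketch of such a header in C is shown below. The field and helper names (struct block, HEADER_SIZE, align8, block_from_ptr) are choices made for this example, and the use of a flexible array member for the data area is an assumption; other layouts work equally well.

```c
#include <stddef.h>

/* Header stored immediately before each block's data area. */
struct block {
    size_t size;          /* usable bytes in data[] */
    struct block *next;   /* next block in address order, or NULL */
    int free;             /* 1 if the block is available for reuse */
    int padding;          /* unused; keeps the header a multiple of 8 bytes */
    void *magic;          /* points at data[]; used to validate addresses */
    char data[];          /* the memory handed to the caller starts here */
};

/* Size of the header, i.e. the offset of the data area. */
#define HEADER_SIZE offsetof(struct block, data)

/* Round a request up to a multiple of 8 so data areas stay aligned. */
size_t align8(size_t n) {
    return (n + 7) & ~(size_t)7;
}

/* Recover the header from a pointer to the data area. */
struct block *block_from_ptr(void *p) {
    return (struct block *)((char *)p - HEADER_SIZE);
}
```

A pointer p handed to free can then be checked with block_from_ptr(p)->magic == p, provided p is first confirmed to lie inside the heap.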
3.2.2 Finding the Right Block
To find a suitable block, we use a first-fit algorithm. This involves scanning the list of blocks until one that meets the size requirement is found.
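First fit can be sketched as a single loop over the list. The struct here is trimmed to the three fields the search needs, and the extra `last` output parameter is an assumption of this sketch: it remembers the final block visited so that a caller can append a new block when the search fails.

```c
#include <stddef.h>

/* Only the fields the search needs; the full header has more. */
struct block {
    size_t size;
    struct block *next;
    int free;
};

/* Return the first free block holding at least `size` bytes, or NULL.
 * *last is set to the last block examined before the result. */
struct block *find_block(struct block *head, struct block **last, size_t size) {
    struct block *b = head;
    while (b != NULL && !(b->free && b->size >= size)) {
        *last = b;
        b = b->next;
    }
    return b;
}
```

First fit stops at the first match rather than looking for the tightest one, which keeps the search cheap at the cost of sometimes picking a block much larger than needed.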
3.2.3 Opening New Blocks
If no suitable block is found, we extend the heap by moving the break pointer forward using sbrk. This creates a new block that can be added to the list.
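Extending the heap might look like the sketch below. The name extend_heap and the header layout are assumptions of this example; it also assumes that `size` is already a multiple of 8 and that the break is 8-byte aligned when the function is called.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in <unistd.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

struct block {
    size_t size;
    struct block *next;
    int free;
    int padding;
    void *magic;
    char data[];
};
#define HEADER_SIZE offsetof(struct block, data)

/* Create a new in-use block of `size` bytes at the end of the heap and
 * link it after `last` (pass NULL for the very first block). */
struct block *extend_heap(struct block *last, size_t size) {
    if (size > (size_t)INTPTR_MAX - HEADER_SIZE)
        return NULL;
    void *mem = sbrk((intptr_t)(HEADER_SIZE + size));  /* header + data */
    if (mem == (void *)-1)
        return NULL;
    struct block *b = mem;
    b->size = size;
    b->next = NULL;
    b->free = 0;
    b->padding = 0;
    b->magic = b->data;
    if (last != NULL)
        last->next = b;
    return b;
}
```

Because the old break is where the new header is written, blocks created this way sit one after another in address order.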
3.2.4 Splitting Blocks
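When a block is larger than needed, we split it into two parts: the first part satisfies the request, and the remainder becomes a new free block, provided it is large enough to hold a header plus a minimum amount of data. This reduces the space wasted inside allocated blocks and improves memory utilization.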
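A sketch of the split step follows. The names split_block and MIN_PAYLOAD, and the 8-byte minimum, are assumptions of this example; `size` is expected to be a multiple of 8 so the new header stays aligned.

```c
#include <stddef.h>

struct block {
    size_t size;
    struct block *next;
    int free;
    int padding;
    void *magic;
    char data[];
};
#define HEADER_SIZE offsetof(struct block, data)
#define MIN_PAYLOAD 8   /* smallest data area worth creating */

/* Shrink `b` to `size` bytes and turn the tail into a new free block.
 * Returns 1 if the block was split, 0 if it was left whole. */
int split_block(struct block *b, size_t size) {
    if (size > b->size || b->size - size < HEADER_SIZE + MIN_PAYLOAD)
        return 0;                       /* tail too small to be useful */
    struct block *rest = (struct block *)(b->data + size);
    rest->size = b->size - size - HEADER_SIZE;
    rest->next = b->next;
    rest->free = 1;
    rest->padding = 0;
    rest->magic = rest->data;
    b->size = size;
    b->next = rest;
    return 1;
}
```

The tail loses HEADER_SIZE bytes to its own header, which is why very small remainders are left attached to the original block.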
3.2.5 Malloc Implementation
Combining all the elements, we implement a basic malloc function that allocates memory, splits blocks when necessary, and tracks allocated and free blocks.
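Put together, the pieces might look like the following self-contained sketch. It is named my_malloc to avoid clashing with the library function, and all helper names, the global `head`, and the handling of zero-size requests are assumptions of this example rather than the only possible design. It is not thread-safe.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() in <unistd.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

struct block {
    size_t size;
    struct block *next;
    int free;
    int padding;
    void *magic;
    char data[];
};
#define HEADER_SIZE offsetof(struct block, data)

static struct block *head = NULL;   /* first block; NULL until the first call */

static size_t align8(size_t n) { return (n + 7) & ~(size_t)7; }

/* First fit: first free block of at least `size` bytes, or NULL. */
static struct block *find_block(struct block **last, size_t size) {
    struct block *b = head;
    while (b != NULL && !(b->free && b->size >= size)) {
        *last = b;
        b = b->next;
    }
    return b;
}

/* Push the break forward and create a new in-use block there. */
static struct block *extend_heap(struct block *last, size_t size) {
    void *mem = sbrk((intptr_t)(HEADER_SIZE + size));
    if (mem == (void *)-1)
        return NULL;
    struct block *b = mem;
    b->size = size;
    b->next = NULL;
    b->free = 0;
    b->padding = 0;
    b->magic = b->data;
    if (last != NULL)
        last->next = b;
    return b;
}

/* Give the unused tail of `b` back to the list as a free block. */
static void split_block(struct block *b, size_t size) {
    if (b->size - size < HEADER_SIZE + 8)
        return;
    struct block *rest = (struct block *)(b->data + size);
    rest->size = b->size - size - HEADER_SIZE;
    rest->next = b->next;
    rest->free = 1;
    rest->padding = 0;
    rest->magic = rest->data;
    b->size = size;
    b->next = rest;
}

void *my_malloc(size_t size) {
    if (size == 0 || size > (size_t)INTPTR_MAX / 2)
        return NULL;
    size = align8(size);
    struct block *last = NULL;
    struct block *b = find_block(&last, size);
    if (b != NULL) {                 /* reuse an existing free block */
        split_block(b, size);
        b->free = 0;
    } else {                         /* nothing fits: grow the heap */
        b = extend_heap(last, size);
        if (b == NULL)
            return NULL;
        if (head == NULL)
            head = b;
    }
    return b->data;
}
```

The caller only ever sees the data pointer; the header sits just before it and is found again later by subtracting HEADER_SIZE.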
3.2.6 Calloc Implementation
Calloc is implemented by calling malloc and then zeroing out the allocated memory. This ensures that the memory is initialized to zero.
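The idea can be sketched independently of the allocator above. To keep the example self-contained, the library's malloc stands in for the article's own malloc, and my_calloc is a name chosen here. The overflow check on count * size is an addition of this sketch that the description above does not mention.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate room for `count` items of `size` bytes each, all zeroed. */
void *my_calloc(size_t count, size_t size) {
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;                 /* count * size would overflow */
    size_t total = count * size;
    void *p = malloc(total);         /* stand-in for the article's malloc */
    if (p != NULL)
        memset(p, 0, total);
    return p;
}
```

Without the overflow check, a huge count could wrap around to a small total and return a block far smaller than the caller expects.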
3.2.7 Free Implementation
Freeing memory involves marking a block as free and merging it with adjacent free blocks if possible. This helps reduce fragmentation and improve memory efficiency.
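A sketch of free with merging is given below. It makes two simplifying assumptions that are specific to this example: the pointer is validated by walking the list rather than by the magic pointer, and merging is done in a single pass over the whole list, which handles both neighbours without needing a pointer to the previous block. The name my_free and the explicit `head` parameter are also hypothetical.

```c
#include <stddef.h>

struct block {
    size_t size;
    struct block *next;
    int free;
    int padding;
    void *magic;
    char data[];
};
#define HEADER_SIZE offsetof(struct block, data)

/* Mark the block owning `p` as free, then merge adjacent free blocks.
 * Returns 0 on success, -1 if `p` is unknown or already free. */
int my_free(struct block *head, void *p) {
    struct block *b = head;
    while (b != NULL && (void *)b->data != p)
        b = b->next;                       /* look the pointer up in the list */
    if (b == NULL || b->free)
        return -1;
    b->free = 1;

    b = head;
    while (b != NULL) {
        struct block *n = b->next;
        if (b->free && n != NULL && n->free &&
            b->data + b->size == (char *)n) {
            b->size += HEADER_SIZE + n->size;  /* absorb header and data of n */
            b->next = n->next;                 /* stay on b: it may merge again */
        } else {
            b = n;
        }
    }
    return 0;
}
```

The list walk costs time proportional to the number of blocks; an O(1) variant would locate the header by pointer arithmetic and keep a back pointer in each header so that the previous block can be reached directly.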
3.2.8 Realloc Implementation
Realloc adjusts the size of an existing allocation. It may involve splitting a block, merging with adjacent blocks, or allocating new memory if necessary.
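The in-place half of realloc can be sketched as follows. The function name resize_in_place is hypothetical, `size` is assumed to be a multiple of 8, and the fallback path (allocate a new block, copy the data, free the old block) is deliberately left to the caller so that the example stays short.

```c
#include <stddef.h>

struct block {
    size_t size;
    struct block *next;
    int free;
    int padding;
    void *magic;
    char data[];
};
#define HEADER_SIZE offsetof(struct block, data)

/* Try to resize `b` to `size` bytes without moving its data.
 * Returns 1 on success, 0 if the caller must allocate elsewhere. */
int resize_in_place(struct block *b, size_t size) {
    if (b->size < size) {
        /* Grow: absorb the next block if it is free and large enough. */
        struct block *n = b->next;
        if (n == NULL || !n->free || b->size + HEADER_SIZE + n->size < size)
            return 0;
        b->size += HEADER_SIZE + n->size;
        b->next = n->next;
    }
    /* Shrink, or trim after a merge: split off a tail worth keeping. */
    if (b->size - size >= HEADER_SIZE + 8) {
        struct block *rest = (struct block *)(b->data + size);
        rest->size = b->size - size - HEADER_SIZE;
        rest->next = b->next;
        rest->free = 1;
        rest->padding = 0;
        rest->magic = rest->data;
        b->size = size;
        b->next = rest;
    }
    return 1;
}
```

When this returns 0, a full realloc would fall back to malloc for a new block, copy the old block's size bytes into it, and free the old block.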
3.3 Legacy Issues and Optimization
The current implementation is simple but lacks several optimizations. Future improvements could include support for 32-bit and 64-bit systems, using mmap for large allocations, maintaining multiple lists based on block sizes, and optimizing the search for free blocks.
4. Other References
This article draws heavily from "A Malloc Tutorial" and other resources like "Computer Systems: A Programmer's Perspective." For deeper insights, readers are encouraged to explore the Linux kernel's memory management and real-world implementations like glibc.