Linux 内核中的链表

简介

通常链表数据结构至少应包含两个域：数据域和指针域，数据域用于存储数据，指针域用于建立与下一个节点的联系。按照指针域的组织以及各个节点之间的联系形式，链表又可以分为单链表、双链表、循环链表等多种类型。在数据结构课本中，链表的经典定义方式通常是这样的（以单链表为例）：

struct list_node {
    struct list_node *next;
    ElemType    data;
  };

这样的数据类型结构清楚，但是因为 next 的类型各不相同，不同的链表类型都必须有不同的操作方式，无法用统一的函数对它们进行处理。所以内核里面使用一个统一的链表将相应的数据结构穿起来。对不同的链表来说，引线都是一样的，差别就在引线上的连接的数据结构不同。如下图所示：

数据结构的定义就是：

struct list_head {struct list_head *next, *prev;};

struct list_node {
    struct list_head list;
    ElemType    data;
};

这样，对链表的处理最终都会转化为对 list_head 的处理，内核提供了一整套统一的宏和函数来进行 (代码位于 include/linux/list.h)。

宏和函数：

（1）初始化宏

#define LIST_HEAD_INIT(name) {&(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

LIST_HEAD 定义了一个 name 并进行初始化， name 的 next 和 prev 都指向它自己。如下图所示：

（2）插入

/* 内部函数，插入任意位置，前提是 prev 和 next 已知 */
static inline void __list_add(struct list_head *new,
                          struct list_head *prev,
                          struct list_head *next)
{
    next->prev = new;
    new->next = next;
    new->prev = prev;
    prev->next = new;
}

下面量个函数为对 __list_add 的包裹，封装出头部、尾部插入

static inline void list_add(struct list_head *new, struct list_head *head)
{__list_add(new, head, head->next);
}

static inline void list_add_tail(struct list_head *new, struct list_head *head)
{__list_add(new, head->prev, head);
}

（3）删除

static inline void __list_del(struct list_head * prev, struct list_head * next)
{
    next->prev = prev;
    prev->next = next;
}

static inline void list_del(struct list_head *entry)
{__list_del(entry->prev, entry->next);
    entry->next = LIST_POISON1;            // 对此位置的引用将报错
    entry->prev = LIST_POISON2;           // 同上
}

3. 数据访问：

以上仅介绍了对链表的操作，也就是能够在一个绳子上移动，而要通过绳子访问链表上的其它数据，就需要一些技巧了。在这里使用了容器的概念，把绳子上穿的数据结构当成容器，连着它 list 成为容器中的一个把手，通过 container_of 这个宏就能根据把手找到对应的容器。

/**
* list_entry - get the struct for this entry
* @ptr:        the &struct list_head pointer.
* @type:       the type of the struct this is embedded in.
* @member:     the name of the list_struct within the struct.
*/
#define list_entry(ptr, type, member) \
    container_of(ptr, type, member)

其中，

#define container_of(ptr, type, member) ({                  \
    const typeof(((type *)0)->member ) *__mptr = (ptr);    \
    (type *)((char *)__mptr - offsetof(type,member) );})

这里首先将 ptr 中的地址转化为 member 类型的地址，然后转化为 char 指针并减去 offset。其中， offset 的意义如下图所示，只需要将 list_head 所在位置减去 offset 的值，就可以得到整个数据结构的起始地址。将这个地址强制转化为 type 类型指针，就得到了指向这个结构体的指针。
内核中 offset 有两种获得方式：

#undef offsetof
#ifdef __compiler_offsetof 
#define offsetof(TYPE,MEMBER) __compiler_offsetof(TYPE,MEMBER)
#else
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#endif
#endif /* __KERNEL__ */

第一个种方式由编译器负责产生 offset 的值。第二种方式通过直接用 0 指针的方式获得。这里的技巧是：
当编译器看到 ptr->member 时，会自动将 ptr 中的地址与 member 的偏移量相加，也就是得到 ptr + offsetof(member)。而当我们将 0 强制转化为 TYPE 指针的时候，得到下式。注意在地址 0 的位置上并没有定义 TYPE 的数据结构，而仅仅是利用编译器的处理方式获得偏移量的位置。

ptr->member 
= ptr + offsetof(member) 
= 0 + offsetof(member)
= offsetof(member)

4. 遍历链表

用这个宏就可以根据链表头地址访问到链表中每一个结构体的位置，对链表中数据结构的操作，就通过 pos 进行。

/**
* list_for_each_entry -       iterate over list of given type
* @pos:        the type * to use as a loop cursor.
* @head:       the head for your list.
* @member:     the name of the list_struct within the struct.
*/
#define list_for_each_entry(pos, head, member)                      \
    for (pos = list_entry((head)->next, typeof(*pos), member);      \
         prefetch(pos->member.next), &pos->member != (head);        \
         pos = list_entry(pos->member.next, typeof(*pos), member))