Heap Pwn Note-1: Heap Structure

There are already many articles about malloc online; these are my notes from reading CTF Wiki and other material.

Setup

Installing the glibc source on Kali

sudo apt install glibc-source

After installation, the source tarball is under /usr/src/glibc/; extract it yourself.

When debugging in gdb, run the

dir PATH-TO-GLIBC_SOURCE

command; once execution enters malloc, the matching source is shown for reference.

When compiling the program with gcc, use the

gcc -ggdb source.c -o source

command, so the program itself also carries debug symbols and can be viewed against its source.

Installing pwndbg, gef, or peda also makes debugging more convenient.

Reading the source online

https://code.woboq.org/userspace/glibc/

https://elixir.bootlin.com/glibc/glibc-2.31/source

In practice the target's libc version may differ from the local one; in pwntools, swap in another libc like this:

io = process(['./bin'],env={"LD_PRELOAD":"./libc.so.6"})

Heap Overview

malloc(n) returns a pointer to a memory block of at least n bytes

  • n=0: returns a minimum-sized block
  • n<0: since the argument is converted to size_t (usually unsigned), the request becomes huge and normally fails for lack of space

free(p) releases the block p points to, which may have come from malloc or realloc

  • p=NULL: nothing happens
  • if p was already freed, the behavior is undefined (double free); set pointers to NULL after freeing them
  • freeing a very large block may return part of the memory to the system

System calls behind memory allocation

  • (s)brk moves the program break, which can be thought of as the top of the heap
  • mmap, munmap

Process memory layout

----+-----------------------------------------------------
1GB | Kernel Space
----+-----------------------------------------------------
    | Random stack offset
    |
    +-----------------------------------------------------
3GB | Stack
    | Grows down
    +-----------------------------------------------------
    |
    +-----------------------------------------------------
    | Memory Mapping Segment
    | File mappings and anonymous mappings
    | Grows down
    +-----------------------------------------------------
    |
    +------------- brk -----------------------------------
    | Heap
    | Grows up
    +------------- start_brk -----------------------------
    | Random brk offset
    +-----------------------------------------------------
    | BSS (Block Started by Symbol) Segment
    | Uninitialized static variables, zero-filled
    +-----------------------------------------------------
    | Data Segment
    | Static variables initialized by the programmer
    +-----------------------------------------------------
    | Text Segment
    | Instructions
----+-----------------------------------------------------

malloc_chunk

Every requested block is represented by a malloc_chunk, regardless of its size and of whether it is allocated or free.

/*
   This struct declaration is misleading (but accurate and necessary).
   It declares a "view" into memory allowing access to necessary
   fields at known offsets from a given base. See explanation below.
*/

struct malloc_chunk {

  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */

  struct malloc_chunk* fd;                /* double links -- used only if free. */
  struct malloc_chunk* bk;

  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize;       /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};

So the memory layout of a malloc_chunk looks like this:

each field is one size_t wide
-----------+<--------------- chunk ---------------------------------------
prev_size  | prev chunk free ? size of prev chunk : data of prev chunk
-----------+--------------------------------------------------------------
size       | size | NON_MAIN_ARENA | IS_MMAPPED | PREV_INUSE
-----------+<--------------- pointer returned by malloc ------------------
fd         |
-----------+
bk         |
-----------+ in use ? user data : fd/bk/fd_nextsize/bk_nextsize
fd_nextsize|
-----------+
bk_nextsize|
-----------+--------------------------------------------------------------
user_data  |
-----------+--------------------------------------------------------------
  • prev_size

    If the physically preceding chunk is free, this records its size (header included); otherwise it holds that chunk's data.

  • size

    Size of this chunk. It must be a multiple of 2*SIZE_SZ, so the low 3 bits do not affect the size and are used as flags:

    • NON_MAIN_ARENA — whether the chunk belongs to a non-main arena: 1 if it does not belong to the main arena, 0 if it does
    • IS_MMAPPED — whether the chunk was allocated by mmap
    • PREV_INUSE — whether the physically preceding chunk is in use. The first chunk always has P set, to prevent an out-of-bounds access. When P is 0, prev_size gives the previous chunk's size and address, which makes merging easy.
  • fd, bk

    While the chunk is allocated, user data starts at fd. When it is free, the chunk is put on the corresponding management list:

    • fd: next (not physically adjacent) free chunk
    • bk: previous (not physically adjacent) free chunk
  • fd_nextsize, bk_nextsize

    Used only for large chunks:

    • fd_nextsize

      points to the next free chunk (in fd direction) with a different size, skipping the bin header

    • bk_nextsize

      points to the previous free chunk (in bk direction) with a different size, skipping the bin header

    Free large chunks are kept sorted by size: traversing via fd visits them in descending size order.

malloc_state

/*
   have_fastchunks indicates that there are probably some fastbin chunks.
   It is set true on entering a chunk into any fastbin, and cleared early in
   malloc_consolidate. The value is approximate since it may be set when there
   are no fastbin chunks, or it may be clear even if there are fastbin chunks
   available. Given it's sole purpose is to reduce number of redundant calls to
   malloc_consolidate, it does not affect correctness. As a result we can safely
   use relaxed atomic accesses.
*/

struct malloc_state
{
  /* Serialize access.  */  /* supports multithreading */
  __libc_lock_define (, mutex);

  /* Flags (formerly in max_fast).  */
  int flags;

  /* Set if the fastbin chunks contain recently inserted free blocks.  */
  /* Note this is a bool but not all targets support atomics on booleans.  */
  int have_fastchunks;

  /* Fastbins */
  mfastbinptr fastbinsY[NFASTBINS];

  /* Base of the topmost chunk -- not otherwise kept in a bin */
  mchunkptr top;

  /* The remainder from the most recent split of a small request */
  mchunkptr last_remainder;

  /* Normal bins packed as described above */
  mchunkptr bins[NBINS * 2 - 2];  /* default NBINS = 128 */

  /* Bitmap of bins */
  unsigned int binmap[BINMAPSIZE];

  /* Linked list */
  struct malloc_state *next;

  /* Linked list for free arenas.  Access to this field is serialized
     by free_list_lock in arena.c.  */
  struct malloc_state *next_free;

  /* Number of threads attached to this arena.  0 if the arena is on
     the free list.  Access to this field is serialized by
     free_list_lock in arena.c.  */
  INTERNAL_SIZE_T attached_threads;

  /* Memory allocated from the system in this arena.  */
  INTERNAL_SIZE_T system_mem;
  INTERNAL_SIZE_T max_system_mem;
};
  • All internal state is held in a malloc_state instance, with two exceptions:
    • if USE_MALLOC_LOCK is defined, the mALLOC_MUTEx declared above is used as well
    • if mmap does not support anonymous mappings, a dummy file descriptor is kept for it

bin

Introduction to the glibc algorithms

The main properties of the algorithms are:

  • For large (>= 512 bytes) requests, it is a pure best-fit allocator,
    with ties normally decided via FIFO (i.e. least recently used).
  • For small (<= 64 bytes by default) requests, it is a caching
    allocator, that maintains pools of quickly recycled chunks.
  • In between, and for combinations of large and small requests, it does
    the best it can trying to meet both goals at once.
  • For very large requests (>= 128KB by default), it relies on system
    memory mapping facilities, if supported.

Some default values

OPTION DEFAULT VALUE

Compilation Environment options:

HAVE_MREMAP                0

Changing default word sizes:

INTERNAL_SIZE_T            size_t

Configuration and functionality options:

USE_PUBLIC_MALLOC_WRAPPERS NOT defined
USE_MALLOC_LOCK            NOT defined
MALLOC_DEBUG               NOT defined
REALLOC_ZERO_BYTES_FREES   1
TRIM_FASTBINS              0

Options for customizing MORECORE:

MORECORE                   sbrk
MORECORE_FAILURE           -1
MORECORE_CONTIGUOUS        1
MORECORE_CANNOT_TRIM       NOT defined
MORECORE_CLEARS            1
MMAP_AS_MORECORE_SIZE      (1024 * 1024)

Tuning options that are also dynamically changeable via mallopt:

DEFAULT_MXFAST             64 (for 32bit), 128 (for 64bit)
DEFAULT_TRIM_THRESHOLD     128 * 1024
DEFAULT_TOP_PAD            0
DEFAULT_MMAP_THRESHOLD     128 * 1024
DEFAULT_MMAP_MAX           65536
/*
Bins
An array of bin headers for free chunks. Each bin is doubly
linked. The bins are approximately proportionally (log) spaced.
There are a lot of these bins (128). This may look excessive, but
works very well in practice. Most bins hold sizes that are
unusual as malloc request sizes, but are more usual for fragments
and consolidated sets of chunks, which is what these bins hold, so
they can be found quickly. All procedures maintain the invariant
that no consolidated chunk physically borders another one, so each
chunk in a list is known to be preceeded and followed by either
inuse chunks or the ends of memory.
Chunks in bins are kept in size order, with ties going to the
approximately least recently used chunk. Ordering isn't needed
for the small bins, which all contain the same-sized chunks, but
facilitates best-fit allocation for larger chunks. These lists
are just sequential. Keeping them in order almost never requires
enough traversal to warrant using fancier ordered data
structures.
Chunks of the same size are linked with the most
recently freed at the front, and allocations are taken from the
back. This results in LRU (FIFO) allocation order, which tends
to give each chunk an equal opportunity to be consolidated with
adjacent freed chunks, resulting in larger free chunks and less
fragmentation.
To simplify use in double-linked lists, each bin header acts
as a malloc_chunk. This avoids special-casing for headers.
But to conserve space and improve locality, we allocate
only the fd/bk pointers of bins, and then use repositioning tricks
to treat these as the fields of a malloc_chunk*.
*/

typedef struct malloc_chunk *mbinptr;  /* a bin is a linked list of chunks */
/* addressing -- note that bin_at(0) does not exist */
#define bin_at(m, i) \
(mbinptr) (((char *) &((m)->bins[((i) - 1) * 2])) \
- offsetof (struct malloc_chunk, fd))
/* analog of ++bin */
#define next_bin(b) ((mbinptr) ((char *) (b) + (sizeof (mchunkptr) << 1)))
/* Reminders about list directionality within bins */
#define first(b) ((b)->fd)
#define last(b) ((b)->bk)
/*
Indexing
Bins for sizes < 512 bytes contain chunks of all the same size, spaced
8 bytes apart. Larger bins are approximately logarithmically spaced:
64 bins of size 8
32 bins of size 64
16 bins of size 512
8 bins of size 4096
4 bins of size 32768
2 bins of size 262144
1 bin of size what's left
There is actually a little bit of slop in the numbers in bin_index
for the sake of speed. This makes no difference elsewhere.
The bins top out around 1MB because we expect to service large
requests via mmap.
Bin 0 does not exist. Bin 1 is the unordered list; if that would be
a valid chunk size the small bins are bumped up one.
*/
#define NBINS 128
#define NSMALLBINS 64
#define SMALLBIN_WIDTH MALLOC_ALIGNMENT
#define SMALLBIN_CORRECTION (MALLOC_ALIGNMENT > 2 * SIZE_SZ)
#define MIN_LARGE_SIZE ((NSMALLBINS - SMALLBIN_CORRECTION) * SMALLBIN_WIDTH)
#define in_smallbin_range(sz) \
((unsigned long) (sz) < (unsigned long) MIN_LARGE_SIZE)
#define smallbin_index(sz) \
((SMALLBIN_WIDTH == 16 ? (((unsigned) (sz)) >> 4) : (((unsigned) (sz)) >> 3))\
+ SMALLBIN_CORRECTION)
#define largebin_index_32(sz) \
(((((unsigned long) (sz)) >> 6) <= 38) ? 56 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ? 91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)
#define largebin_index_32_big(sz) \
(((((unsigned long) (sz)) >> 6) <= 45) ? 49 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ? 91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)
// XXX It remains to be seen whether it is good to keep the widths of
// XXX the buckets the same or whether it should be scaled by a factor
// XXX of two as well.
#define largebin_index_64(sz) \
(((((unsigned long) (sz)) >> 6) <= 48) ? 48 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ? 91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)
#define largebin_index(sz) \
(SIZE_SZ == 8 ? largebin_index_64 (sz) \
: MALLOC_ALIGNMENT == 16 ? largebin_index_32_big (sz) \
: largebin_index_32 (sz))
#define bin_index(sz) \
((in_smallbin_range (sz)) ? smallbin_index (sz) : largebin_index (sz))

unsorted bin

#define unsorted_chunks(M)          (bin_at (M, 1))

Chunks left over after splitting an allocation, chunks released by free, and chunks merged by malloc_consolidate are all placed in the unsorted bin first; before they return to their own bins, they get one more chance to be reused here.

top chunk

#define initial_top(M)              (unsorted_chunks (M))
  • The top chunk sits at the edge of available memory (for memory obtained via (s)brk, its upper edge is the break)
  • It is used only when no other chunk can satisfy a request
  • The first malloc forces the top chunk to be extended
  • When the top chunk grows too large, part of it is returned to the system
  • During initialization the top chunk is treated as an unsorted chunk

fast bin

Singly linked (fd only), LIFO lists kept in fastbinsY, one per size, for small chunks (up to 0x80 bytes per chunk on 64-bit by default). Freed fast chunks keep their PREV_INUSE bit set and are not merged until malloc_consolidate runs.

small bin

bins[2..63]: doubly linked, FIFO lists, one per size, for chunks below MIN_LARGE_SIZE (0x400 on 64-bit).

large bin

bins[64..126]: doubly linked lists, each covering a range of sizes, kept sorted by size and threaded with fd_nextsize/bk_nextsize as a skip list.

unlink_chunk takes a chunk out of a doubly linked bin list; it is called from several places (for example, when merging free chunks):

/* Take a chunk off a bin list.  */
static void
unlink_chunk (mstate av, mchunkptr p)
{
  if (chunksize (p) != prev_size (next_chunk (p)))
    malloc_printerr ("corrupted size vs. prev_size");

  mchunkptr fd = p->fd;
  mchunkptr bk = p->bk;

  if (__builtin_expect (fd->bk != p || bk->fd != p, 0))
    malloc_printerr ("corrupted double-linked list");

  fd->bk = bk;
  bk->fd = fd;
  if (!in_smallbin_range (chunksize_nomask (p)) && p->fd_nextsize != NULL)
    {
      if (p->fd_nextsize->bk_nextsize != p
          || p->bk_nextsize->fd_nextsize != p)
        malloc_printerr ("corrupted double-linked list (not small)");

      if (fd->fd_nextsize == NULL)
        {
          if (p->fd_nextsize == p)
            fd->fd_nextsize = fd->bk_nextsize = fd;
          else
            {
              fd->fd_nextsize = p->fd_nextsize;
              fd->bk_nextsize = p->bk_nextsize;
              p->fd_nextsize->bk_nextsize = fd;
              p->bk_nextsize->fd_nextsize = fd;
            }
        }
      else
        {
          p->fd_nextsize->bk_nextsize = p->bk_nextsize;
          p->bk_nextsize->fd_nextsize = p->fd_nextsize;
        }
    }
}
