A Program That Prints the Physical Address Behind a Virtual Address

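This program prints, from inside a process, the physical address that actually backs a virtual address of that process. It can be used to explore the Linux memory-management subsystem.
Note: for reasons I could not pin down, I found no way to run this program successfully on the lab machine, so it has to be run in a virtual machine.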

File structure

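vm.h, vm.c - implement the function that computes a physical address from a virtual address; please do not modify
cow.c - tests copy-on-write
global.c - tests how the .bss segment is mapped to an anonymous file
mmap.c - tests operations related to the mmap function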

The core code of vm.h and vm.c comes from the internet; it has been lightly adjusted and wrapped here so that it is easier to call.

Usage

Function interface

vm.h and vm.c wrap a single function that prints the physical address corresponding to the virtual address passed in:

void show_pa(void* va, const char* info);

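Parameters

va - the virtual address whose physical address you want to inspect
info - a string printed at the start of the line to make the output easier to read; if it is the empty string, the process ID is printed instead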

Effect

The function prints one line in the following format:

{info}: virtual addr = {va}, physical addr = {pa}

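If info is the empty string, {info} is replaced by pid = {process ID}.
If the translation succeeds, {pa} is the physical address in hexadecimal; otherwise it is an error message (page not present, insufficient privileges, and so on).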
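
The output format described above can be sketched as a small formatting helper. This is only an illustration: `format_pa_line` and its parameters are hypothetical names that do not exist in vm.c, and the real `show_pa` prints directly with `printf` and `%p` rather than building a string.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of vm.c): writes the line described above
 * into out. err is an error message, or "" when the translation succeeded.
 * Returns 0 on success, -1 if the line does not fit into out. */
int format_pa_line(char *out, size_t size, const char *info, int pid,
                   uint64_t va, uint64_t pa, const char *err)
{
    int n, m;

    /* An empty info string is replaced by "pid = {pid}". */
    if (strlen(info) == 0)
        n = snprintf(out, size, "pid = %d: ", pid);
    else
        n = snprintf(out, size, "%s: ", info);
    if (n < 0 || (size_t)n >= size)
        return -1;

    /* {pa} is either the hexadecimal address or the error message. */
    if (strlen(err) != 0)
        m = snprintf(out + n, size - (size_t)n,
                     "virtual addr = 0x%" PRIx64 ", physical addr = %s",
                     va, err);
    else
        m = snprintf(out + n, size - (size_t)n,
                     "virtual addr = 0x%" PRIx64 ", physical addr = %" PRIx64,
                     va, pa);
    if (m < 0 || (size_t)m >= size - (size_t)n)
        return -1;
    return 0;
}
```

Called with an empty `info`, pid 52128, virtual address 0x55d38e576010 and physical address 0x26541010, the helper produces a line of the same shape as the sample output shown later in this post.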

How to call it

Create a new .c file and include the vm.h header; the function can then be called directly:

#include "vm.h"

You can call this function in your own code to observe the physical addresses of particular variables.

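Example: cow.c

A simple example. It forks a child process; at first the same variable has the same physical address in both processes. When the parent writes to the variable, copy-on-write takes place and the physical address of the variable changes in the parent. The code is as follows: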

/* copy on write */
#include "vm.h"
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(){
    static int c = 100;

    int pid = fork();
    if(pid == 0){ /* child */
        show_pa((void*)&c, "");
        sleep(2);
        show_pa((void*)&c, "");
    }
    else{ /* parent */
        sleep(1);
        show_pa((void*)&c, "");
        c = 20;
        printf("parent: modified c.\n");
        show_pa((void*)&c, "");
    }
    wait(NULL);
    return 0;
}
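
The same experiment has a value-level side that needs no physical addresses: after the parent overwrites c, the child should still read the old value, because the two processes no longer share the page. The following is a minimal sketch of that check, not part of the original files. It assumes a pipe for ordering instead of `sleep`, and `child_view_after_parent_write` is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value of c that the child reads after the parent has
 * overwritten its own copy, or -1 on error. */
int child_view_after_parent_write(void)
{
    static int c;
    int go[2];                 /* pipe used to order the two processes */

    c = 100;
    if (pipe(go) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(go[0]);
        close(go[1]);
        return -1;
    }

    if (pid == 0) {            /* child */
        char token;
        close(go[1]);
        /* Block until the parent has finished writing to c. */
        if (read(go[0], &token, 1) != 1)
            _exit(255);
        _exit(c);              /* report the child's view of c */
    }

    /* parent */
    close(go[0]);
    c = 20;                    /* write to the parent's copy only */
    if (write(go[1], "x", 1) != 1) {
        /* The child sees EOF below and exits with 255. */
    }
    close(go[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The value travels back through the exit status, which only carries 8 bits; that is enough for 100 but would not work for arbitrary integers.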

Compiling

Because show_pa is defined in vm.c, the compile command has to include vm.c. Taking the cow.c above as an example:

$ gcc cow.c vm.c -g -Wall -o vm

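This produces an executable named vm.

Running (important!)

If running the program directly produces output like the following (using the executable vm compiled from the cow.c above as the example):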

pid = 52128: virtual addr = 0x55d38e576010, physical addr = 10
pid = 52127: virtual addr = 0x55d38e576010, physical addr = 10
parent: modified c.
pid = 52127: virtual addr = 0x55d38e576010, physical addr = 10
pid = 52128: virtual addr = 0x55d38e576010, physical addr = 10

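then the physical address has at most three hexadecimal digits, that is, it is smaller than 4 KB: only the offset within the page was printed. This is caused by insufficient privileges. I found no solution on the lab machine, but the problem can be solved on your own virtual machine or dual-boot installation.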
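
The arithmetic behind this symptom can be sketched as follows. As far as the kernel's pagemap documentation describes it, since Linux 4.2 the page frame number field of an entry reads as zero when the process lacks CAP_SYS_ADMIN, so the computed address collapses to the page offset. The function and macro names below are hypothetical and only mirror the calculation done in vm.c.

```c
#include <stdint.h>

#define PM_PRESENT  (UINT64_C(1) << 63)        /* bit 63: page is present */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)  /* bits 0-54: page frame number */

/* Byte offset, inside /proc/<pid>/pagemap, of the 64-bit entry for vaddr. */
uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return (vaddr / page_size) * sizeof(uint64_t);
}

/* Decodes one pagemap entry. Stores the physical address in *paddr and
 * returns 0 if the page is present, returns -1 otherwise. */
int pagemap_to_phys(uint64_t entry, uint64_t vaddr, uint64_t page_size,
                    uint64_t *paddr)
{
    if ((entry & PM_PRESENT) == 0)
        return -1;

    uint64_t pfn = entry & PM_PFN_MASK;
    /* With pfn == 0 (hidden by the kernel) only the offset survives. */
    *paddr = pfn * page_size + vaddr % page_size;
    return 0;
}
```

For the address 0x55d38e576010 with 4 KB pages, a present entry whose frame number is hidden yields 0x10, which is exactly the "10" in the output above.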

If the compiled executable is named vm, it can be run with the following commands:

$ sudo setcap cap_sys_admin=eip vm
$ sudo ./vm

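The first command has to be rerun after every compilation; if it has already been run and the program has not been recompiled since, it can be skipped. Note: sudo is required for both commands. Run this way, the program produces output such as the following (it may differ from run to run):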

pid = 52128: virtual addr = 0x55d38e576010, physical addr = 26541010
pid = 52127: virtual addr = 0x55d38e576010, physical addr = 26541010
parent: modified c.
pid = 52127: virtual addr = 0x55d38e576010, physical addr = 52376010
pid = 52128: virtual addr = 0x55d38e576010, physical addr = 26541010

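Observe that copy-on-write took place when the parent process, 52127, modified the variable c: its physical address changed from 26541010 to 52376010, while the child, 52128, kept the original page.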
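
Since the addresses are only meaningful when the process really holds CAP_SYS_ADMIN, it can help to check that before trusting the output. The sketch below parses the `CapEff` line that Linux exposes in /proc/<pid>/status and tests bit 21, the value of CAP_SYS_ADMIN in <linux/capability.h>. The function names are hypothetical, and the parser works on text that the caller has already read from the file.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CAP_SYS_ADMIN_BIT 21   /* numeric value of CAP_SYS_ADMIN */

/* Extracts the effective capability mask from the text of
 * /proc/<pid>/status. Returns 0 on success, -1 if there is no CapEff line. */
int parse_cap_eff(const char *status_text, uint64_t *mask)
{
    const char *line = status_text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, "CapEff:", 7) == 0) {
            /* The mask is printed as hexadecimal without a 0x prefix. */
            *mask = strtoull(line + 7, NULL, 16);
            return 0;
        }
        line = strchr(line, '\n');   /* advance to the next line */
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Returns 1 if the mask contains CAP_SYS_ADMIN, 0 otherwise. */
int has_cap_sys_admin(uint64_t mask)
{
    return (int)((mask >> CAP_SYS_ADMIN_BIT) & 1);
}
```

To use it, read /proc/self/status into a buffer (for example with `fopen` and `fread`), pass the buffer to `parse_cap_eff`, and print a warning when `has_cap_sys_admin` returns 0.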

More examples

global.c

/* anonymous file and CoW */
#include "vm.h"
#include <stdio.h>

#define LEN 1024
#define BUF 1024
int array[LEN][LEN];

int main(){
    char buffer[BUF];
    // printf("array = %p\n", array);
    for(int i = 1; i <= 4; i++){
        sprintf(buffer, "array[%d][0]", i);
        show_pa(&array[i][0], buffer); /* page not present */
    }
    int sum = 0;
    for(int i = 0; i < LEN; i++)
        sum += array[i][0]; /* read but not write */
    printf("sum = %d\n", sum); /* Avoid being optimized out */

    for(int i = 1; i <= 4; i++){
        sprintf(buffer, "array[%d][0]", i);
        show_pa(&array[i][0], buffer); /* page present */
    }

    for(int i = LEN - 4; i < LEN; i++){
        sprintf(buffer, "array[%d][0]", i);
        show_pa(&array[i][0], buffer); /* page present */
    }

    for(int i = 1; i <= 4; i++){
        array[i][0] = (i-1)*(i-1);
        sprintf(buffer, "modified array[%d][0]", i);
        show_pa(&array[i][0], buffer); /* copy on write */
    }

    for(int i = LEN - 4; i < LEN; i++){
        sprintf(buffer, "array[%d][0]", i);
        show_pa(&array[i][0], buffer); /* not changed */
    }
}

The program first defines a fairly large global variable, int array[1024][1024]. Note that array[i][0] and array[i+1][0] are exactly 4 KB apart.
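
The 4 KB spacing follows from the row size: 1024 elements times sizeof(int). A small sketch of that arithmetic, assuming a 4-byte int and 4 KB pages; the helper names are hypothetical, and the addresses used in the usage note are taken from this program's sample output.

```c
#include <stddef.h>
#include <stdint.h>

#define LEN 1024

/* Distance in bytes between array[i][0] and array[i+1][0]
 * for int array[LEN][LEN]. */
size_t row_stride_bytes(void)
{
    return LEN * sizeof(int);
}

/* Index of the page that contains vaddr. */
uint64_t page_index(uint64_t vaddr, uint64_t page_size)
{
    return vaddr / page_size;
}

/* Offset of vaddr inside its page. */
uint64_t offset_in_page(uint64_t vaddr, uint64_t page_size)
{
    return vaddr % page_size;
}
```

With the sample addresses 0x55d619f68040 (array[1][0]) and 0x55d619f69040 (array[2][0]), the two page indices differ by exactly one and both offsets are 0x40, so every row starts at the same position in its own page.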

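The program then goes through the following steps:

  1. Inspect the physical addresses directly (pages not present)
  2. Inspect the physical addresses after one pass of reads
  3. Inspect the physical addresses after writing to some of the elements

See the source code for the details. One possible output: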

array[1][0]: virtual addr = 0x55d619f68040, physical addr = page present is 0
array[2][0]: virtual addr = 0x55d619f69040, physical addr = page present is 0
array[3][0]: virtual addr = 0x55d619f6a040, physical addr = page present is 0
array[4][0]: virtual addr = 0x55d619f6b040, physical addr = page present is 0
sum = 0
array[1][0]: virtual addr = 0x55d619f68040, physical addr = 23f19040
array[2][0]: virtual addr = 0x55d619f69040, physical addr = 23f19040
array[3][0]: virtual addr = 0x55d619f6a040, physical addr = 23f19040
array[4][0]: virtual addr = 0x55d619f6b040, physical addr = 23f19040
array[1020][0]: virtual addr = 0x55d61a363040, physical addr = 23f19040
array[1021][0]: virtual addr = 0x55d61a364040, physical addr = 23f19040
array[1022][0]: virtual addr = 0x55d61a365040, physical addr = 23f19040
array[1023][0]: virtual addr = 0x55d61a366040, physical addr = 23f19040
modified array[1][0]: virtual addr = 0x55d619f68040, physical addr = 6c06e040
modified array[2][0]: virtual addr = 0x55d619f69040, physical addr = 11bb9040
modified array[3][0]: virtual addr = 0x55d619f6a040, physical addr = 4f7af040
modified array[4][0]: virtual addr = 0x55d619f6b040, physical addr = 75b67040
array[1020][0]: virtual addr = 0x55d61a363040, physical addr = 23f19040
array[1021][0]: virtual addr = 0x55d61a364040, physical addr = 23f19040
array[1022][0]: virtual addr = 0x55d61a365040, physical addr = 23f19040
array[1023][0]: virtual addr = 0x55d61a366040, physical addr = 23f19040

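Note that before any read or write the pages are not present (the first four lines, array[1][0] to array[4][0]). If you modify the program, however, you will find that the page of array[0][0] is already present before any read or write. Think about what the reason might be.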

mmap.c

#include "vm.h"
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(){
    char* private1 = (char*)mmap(0, 128, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, 0, 0);
    char* private2 = (char*)mmap(0, 128, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, 0, 0);
    char* shared = (char*)mmap(0, 128, PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, 0, 0);

    show_pa(private1, "private1"); /* not present */
    show_pa(private2, "private2"); /* not present */
    show_pa(shared, "shared"); /* not present */

    printf("char: %d %d %d\n", (int)private1[0], (int)private2[0], (int)shared[0]); /* read */

    show_pa(private1, "private1");
    show_pa(private2, "private2"); /* private1 = private2 */
    show_pa(shared, "shared"); /* shared != private */

    private2[0] = 'a';
    printf("modified private2\n");
    show_pa(private2, "private2"); /* copy on write */

    private1[0] = 'b';
    printf("modified private1\n");
    show_pa(private1, "private1"); /* changed */

    shared[0] = 'A';
    printf("modified shared\n");
    show_pa(shared, "shared"); /* not changed */

    printf("fork!\n");
    int pid = fork();
    if(pid){ /* parent */
        show_pa(private1, "[parent]private1");
        show_pa(shared, "[parent]shared");
        sleep(2);
        /* step 2 */
        show_pa(private1, "[parent]private1"); /* not changed */
        show_pa(shared, "[parent]shared"); /* not changed */
        printf("[parent]private1 = %c, shared = %c\n", private1[0], shared[0]); /* private1 not changed, shared changed */

        private1[0] = 'x';
        printf("[parent]modified private1\n");
        show_pa(private1, "[parent]private1"); /* not changed (refcnt==1) */

    }else{ /* child */
        show_pa(private1, "[child]private1");
        show_pa(shared, "[child]shared"); /* not present */
        volatile char c = shared[0]; /* read */
        show_pa(shared, "[child]shared"); /* same as parent */
        sleep(1);
        /* step 1 */
        private1[0] = 'c';
        printf("[child]modified private1\n");
        shared[0] = c + 1;
        printf("[child]modified shared\n");
        printf("[child]private1 = %c, shared = %c\n", private1[0], shared[0]);
        show_pa(private1, "[child]private1"); /* copy on write */
        show_pa(shared, "[child]shared"); /* not changed */
    }

    wait(0);
    return 0;
}

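This program mmaps two private pages and one shared page, and uses fork to observe how private and shared pages differ under copy-on-write. See the source code for the exact sequence.

Notably, after the fork the child writes to the private page first and triggers copy-on-write; when the parent later writes to the same variable, no copy takes place. This suggests that the kernel keeps a reference count for private pages and skips the copy when the count is 1.

One possible output: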

private1: virtual addr = 0x7fb9ab92c000, physical addr = page present is 0
private2: virtual addr = 0x7fb9ab92b000, physical addr = page present is 0
shared: virtual addr = 0x7fb9ab92a000, physical addr = page present is 0
char: 0 0 0
private1: virtual addr = 0x7fb9ab92c000, physical addr = 23f19000
private2: virtual addr = 0x7fb9ab92b000, physical addr = 23f19000
shared: virtual addr = 0x7fb9ab92a000, physical addr = 11a64000
modified private2
private2: virtual addr = 0x7fb9ab92b000, physical addr = 43336000
modified private1
private1: virtual addr = 0x7fb9ab92c000, physical addr = 71994000
modified shared
shared: virtual addr = 0x7fb9ab92a000, physical addr = 11a64000
fork!
[parent]private1: virtual addr = 0x7fb9ab92c000, physical addr = 71994000
[parent]shared: virtual addr = 0x7fb9ab92a000, physical addr = 11a64000
[child]private1: virtual addr = 0x7fb9ab92c000, physical addr = 71994000
[child]shared: virtual addr = 0x7fb9ab92a000, physical addr = page present is 0
[child]shared: virtual addr = 0x7fb9ab92a000, physical addr = 11a64000
[child]modified private1
[child]modified shared
[child]private1 = c, shared = B
[child]private1: virtual addr = 0x7fb9ab92c000, physical addr = 1e920000
[child]shared: virtual addr = 0x7fb9ab92a000, physical addr = 11a64000
[parent]private1: virtual addr = 0x7fb9ab92c000, physical addr = 71994000
[parent]shared: virtual addr = 0x7fb9ab92a000, physical addr = 11a64000
[parent]private1 = b, shared = B
[parent]modified private1
[parent]private1: virtual addr = 0x7fb9ab92c000, physical addr = 71994000

Appendix

vm.h

/**
 * show physical address of given va
 * params:
 *   va   - virtual address
 *   info - a string that will be printed at the beginning; the current
 *          process id will be printed instead if it is empty
 */
void show_pa(void* va, const char* info);

vm.c

/* Do not modify this file unless you know what you are doing */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include "vm.h"

#define MAXBUF 1024

// Translate the virtual address vaddr; the physical address is returned through paddr.
// Returns "" on success and an error message otherwise.
static const char* mem_addr(unsigned long vaddr, unsigned long *paddr)
{
    int pageSize = getpagesize(); // page size configured on this system

    unsigned long v_pageIndex = vaddr / pageSize; // number of pages between 0x0 and vaddr
    unsigned long v_offset = v_pageIndex * sizeof(uint64_t); // offset of the entry in /proc/self/pagemap
    unsigned long page_offset = vaddr % pageSize; // offset of vaddr within its page
    uint64_t item = 0; // value of the pagemap entry

    int fd = open("/proc/self/pagemap", O_RDONLY); // open /proc/self/pagemap read-only
    if(fd == -1) // open failed
    {
        return "open /proc/self/pagemap error";
    }

    if(lseek(fd, v_offset, SEEK_SET) == -1) // seek to the start of the entry; check for failure
    {
        close(fd);
        return "lseek error";
    }

    if(read(fd, &item, sizeof(uint64_t)) != sizeof(uint64_t)) // read the entry into item; check the number of bytes read
    {
        close(fd);
        return "read item error";
    }
    close(fd); // the entry is now in item, so the descriptor is no longer needed

    if((((uint64_t)1 << 63) & item) == 0) // the present bit (bit 63) is 0
    {
        return "page present is 0";
    }

    uint64_t phy_pageIndex = (((uint64_t)1 << 55) - 1) & item; // physical page number: bits 0-54 of item

    *paddr = (phy_pageIndex * pageSize) + page_offset; // adding the in-page offset gives the physical address
    return "";
}

void show_pa(void* va, const char* str){
    unsigned long pa = -1;
    const char* msg = mem_addr((unsigned long)va, &pa);
    if(strlen(str) == 0)
        printf("pid = %d: virtual addr = %p, ", getpid(), va);
    else
        printf("%s: virtual addr = %p, ", str, va);
    if(strlen(msg))
        printf("physical addr = %s\n", msg);
    else
        printf("physical addr = %lx\n", pa);
}

http://spiritedawaycn.github.io/2020/12/14/physical-address/

Author: SpiritedAwayCN

Published: 2020-12-14

Updated: 2020-12-14