This commit is contained in:
geekard
2012-12-30 14:55:16 +08:00
parent 411a288513
commit f8638a5844
58 changed files with 9509 additions and 81 deletions

View File

@@ -0,0 +1,38 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-12-23T16:02:09+08:00
====== Introduction to PIC ======
Created Sunday 23 December 2012
http://www.gentoo.org/proj/en/hardened/pic-guide.xml
PIC code radically differs from conventional code in the way it calls functions and operates on data variables.
It will access these functions and data __through an indirection table__, the "Global Offset Table" (GOT), by software convention accessible using the reserved name "**_GLOBAL_OFFSET_TABLE_**".
The exact mechanism used for this is hardware architecture dependent, but usually __a special machine register__ is reserved for setting up the location of the GOT when entering a function.
The rationale behind this indirect addressing is to generate code that can be __accessed independently__ of the actual load address. For example, the object code of a shared library is loaded into memory only once, yet it can be mapped into many processes.
In a true PIC library **without** relocations in the __text segment__, only the symbols exported in the __"Global Offset Table"__ need updating at run-time depending on the current load address of the various shared libraries in the address space of the running process. In other words, when a PIC shared library is dynamically linked (mapped) into a process's address space, its text section needs no relocation (no modification); only the symbols in the GOT are relocated, and the GOT lives in the data segment.
Likewise, procedure calls to globally defined functions are redirected through the __"Procedure Linkage Table" (PLT)__ residing in the data segment of the core image. Again, this is done to avoid run-time modifications to the text segment.
Note that on ELF systems the PLT itself sits in the text segment, read-only and executable; it uses the function entries stored in the GOT.
The __link editor__ allocates the Global Offset Table and Procedure Linkage Table when combining PIC object files into an image suitable for mapping into the process address space. It also collects all symbols that may be needed by the run-time link-editor and stores these along with the image's text and data bits. Another reserved symbol, **_DYNAMIC**, is used to indicate the presence of the run-time linker structures. Whenever _DYNAMIC is relocated to 0, there is no need to invoke the run-time link-editor. If this symbol is non-zero, it points at a data structure from which the location of the necessary relocation and symbol information can be derived. This is most notably used by the start-up modules, **crt0, crt1S** and more recently **Scrt1**. The _DYNAMIC structure is conventionally located at the start of the data segment of the image to which it pertains.
On most architectures, when you compile source code to object code, you __need to specify__ whether the object code should be position independent or not. There are occasional architectures which don't make the distinction, usually because all object code is position independent by virtue of the __Application Binary Interface (ABI),__ or less often because the load address of the object is fixed at compile time, which implies that shared libraries are not supported by such a platform. If an object is compiled as position independent code (PIC), then the operating system can load the object __at any address__ in preparation for execution. This involves a time overhead, in replacing direct address references with relative addresses at compile time, and a space overhead, in maintaining information to help the runtime loader fill in the unresolved addresses at runtime.
Consequently, PIC objects are usually slightly larger and slower at runtime than the equivalent non-PIC object. The advantage of sharing library code on disk and in memory outweighs these problems as soon as the PIC object code in shared libraries is reused.
PIC compilation is exactly what is required for objects which will become __part of__ a shared library. Consequently, __libtool__ builds PIC objects for use in shared libraries and non-PIC objects for use in static libraries. Whenever libtool instructs the compiler to generate a PIC object, it also defines the preprocessor symbol, `PIC', so that assembly code can be aware of whether it will reside in a PIC object or not.
Typically, as libtool is compiling sources, it will generate a `.lo' object, as PIC, and a `.o' object, as non-PIC, and then it will use the appropriate one of the pair when linking executables and libraries of various sorts. On architectures where there is no distinction, the `.lo' file is just a soft link to the `.o' file.
In practice, you can link PIC objects into a static archive for a small overhead in execution and load speed, and often you can similarly link non-PIC objects into shared archives.
When you use position-independent code, relocatable references are generated as an indirection that uses data in the shared object's data segment. The text segment code remains read-only, and all relocation updates are applied to corresponding entries within the data segment.
If a shared object is built from code that is not position-independent, the text segment will usually require a large number of relocations to be performed at runtime. Although the runtime linker is equipped to handle this, the system overhead this creates can cause serious performance degradation.
You can identify a shared object that requires relocations against its text segment by running a tool such as 'readelf -d foo' and inspecting the output for any TEXTREL entry. The value of the TEXTREL entry is irrelevant. Its presence in a shared object indicates that text relocations exist.

View File

@@ -0,0 +1,213 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-12-21T20:33:28+08:00
====== ELF Relocation ======
Created Friday 21 December 2012
Relocation is the process of __connecting symbolic references with symbolic definitions.__ For example, when a program calls a function, the associated call instruction must transfer control to the **proper destination address.** In other words, relocatable files must carry information that describes how to modify their section contents.
Relocation table entry structure:
{{./0.gif}}
* **r_offset:** Holds the location at which the relocation applies. For a relocatable file, the value is the byte offset from the beginning of the section to the storage unit affected by the relocation. For an executable file or a shared object file, the value is the virtual address of the storage unit affected by the relocation.
* **r_info:** Holds both the __symbol table index__ with respect to which the relocation must be made, and __the type of relocation__. For example, a call instruction's relocation entry would hold the symbol index of the function. Relocation types are processor-specific. The following code shows how to manipulate the values.
#define ELF32_R_SYM(info) ((info)>>8)
#define ELF32_R_TYPE(info) ((info)&0xff)
#define ELF32_R_INFO(s,t) (((s)<<8) + ((t)&0xff))
symbol: bits 31-8
type: bits 7-0
* **r_addend:** Holds a constant addend used to compute the value to be stored into the relocatable field. On i386, which uses Elf32_Rel entries, there is no r_addend member; the addend is read from the location being relocated.
===== Relocation Types:(SYSTEM V Architecture) =====
The __link editor__ merges one or more relocatable object files to form the output. It first decides how to combine and locate the input files, then updates the symbol values, and finally performs the relocation. Relocations applied to executable or shared object files are similar.
In other words, the link editor (ld) first merges the relocatable object files, then resolves their symbol references and writes each symbol's final address into the symbol table, and finally performs the relocations.
The relocation types specify which bits to change and how to calculate their values. The table below applies to x86, not x86_64.
{{./1.gif}}
**R_386_32:** Symbol's value + addend. In the following figure, there is a relocation at offset **0x7 bytes** into the **.text** section. The linker replaces the address of b with S+A, where S is symbol b's new address after relocation and A is the addend, which is zero here.
**R_386_32 is the absolute-addressing relocation: the resolved absolute address of the symbol is stored at the given offset in the associated section.**
{{./2.gif}}
**R_386_PC32:** Symbol's **value + Addend - Place**. Because it is a __relative near CALL__, the operand is the offset from the “next instruction” (EIP) to the called procedure. **VALUE + EIP = Symbol.value and EIP = Place + 4, so VALUE = Symbol.value - 4 - Place**. S is Symbol.value, -4 is the addend, and P is the new virtual address of the relocated field, computed from r_offset and other factors.
R_386_PC32 is the relative-addressing relocation. S is the resolved address of the symbol from the symbol table and Place is the address of the operand being patched in the calling instruction, so the stored relative offset is **VALUE = Symbol.value - 4 - Place**.
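**R_386_GLOB_DAT:** This type is used to __set a global offset table entry__ to the address of the specified symbol. It is used for global or external variables in PIC code: the resolved address of the global or external symbol is written into the corresponding __GOT entry__.
**R_386_JMP_SLOT:** The link editor creates this relocation type for dynamic linking. Its offset specifies __the GOT entry that belongs to a PLT entry__. The dynamic linker uses it to implement lazy binding: the resolved entry address of the external function is written into the function's PLT-related GOT entry.
R_386_GLOB_DAT and R_386_JMP_SLOT appear only in executable files or shared libraries.
__Both relocation types above are carried out by the dynamic linker after it resolves the symbol, regardless of whether the code references that symbol (the code refers to external variables and functions only indirectly, through the GOT and PLT). They fill in symbol values in the GOT and never touch the code segment.__
**R_386_GOTOFF:** The relocation type used when referencing static variables and read-only data defined in the same file. File-scope static variables and function-local static variables are defined in the .data section; references to them do not go through a GOT entry but use the offset of the symbol from the start of the GOT (likewise, string literals cannot be modified and are usually kept in the .rodata section, and references to them do not go through GOT entries either). The relocation value is therefore S+A-GOT. For example: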
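//ebx already holds the address of the GOT
movl __globalVarStatic@GOTOFF__(%ebx), %eax __//globalVarStatic@GOTOFF evaluates to S+A-GOT; adding the GOT address in ebx yields exactly the address of globalVarStatic, whose value is loaded into eax.__
movl %eax, 4(%esp)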
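**R_386_GOT32:** The relocation type generated when the code references an external variable. The link editor stores G+A-P at the relocated location, so the address the CPU actually forms is R_386_GOT32+P-A = G.
**R_386_PLT32:** The relocation type generated when the code calls an external function. The link editor stores L+A-P at the relocated location, so the address the CPU actually forms is R_386_PLT32+P-A = L.
G (and, by analogy, L) above denotes __the offset of the symbol's GOT entry from the start of the GOT. Note that GOT32 and PLT32 are normally used together with GOTPC, which stores the GOT address into the referencing location in the code segment. G + GOTPC = the actual address of the symbol's GOT entry.__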
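When the code references an external variable, the compiler typically emits the following (on x86; x86_64 has the **rip register** available for addressing directly):
call __i686.get_pc_thunk.cx
addl $_GLOBAL_OFFSET_TABLE_, %ecx //the value of _GLOBAL_OFFSET_TABLE_ is the __offset from the current instruction to the start of the GOT__; its relocation type is R_386_GOTPC. After this instruction ecx holds the __absolute address of the GOT__.
movl var@GOT(%ecx), %eax //var@GOT is the __offset of var's entry within the GOT__, so var@GOT(%ecx) reads var's GOT entry and loads the __actual address of var into eax__. The value of var@GOT is produced by an R_386_GOT32 relocation.
movl (%eax), %eax //load the value stored at var's address into eax.
__i686.get_pc_thunk.cx: //the purpose of this function is to obtain the value of EIP.
mov (%esp),%ecx //esp points at the return address, i.e. __the address of the instruction executed after the function returns, namely the addl that follows the call__.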
ret
**R_386_GOTPC:** This type resembles R_386_PC32, except that it uses __the address of the GOT__ in its calculation. The symbol referenced in this relocation is normally **_GLOBAL_OFFSET_TABLE_ (see the code sample above)**, which additionally instructs the linker to build the GOT. It is normally used in PIC relocatable files. See “ELF PIC Dessection“.
Sample:
d.c
int var = 10;
void fun (void){
var++;
int a = var;
}
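#gcc -S -o __d.s__ -fPIC d.c //generate assembly, which contains the __compiler-emitted information that tells the assembler which relocation entries to create__. The disassembly shown by objdump -d d.o no longer carries this relocation information.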
#gcc -c -o d.o -fPIC d.c
In d.s, produced by compiling d.c with the PIC option, there are instructions that load the GOT address, as shown in the following figure.
{{./3.gif}}
There will be an R_386_GOTPC relocation entry in d.o that __updates the value of “$_GLOBAL_OFFSET_TABLE_”__ to the offset from “addl” to the GOT (the difference between the address of the addl instruction and the start of the GOT is the value of the _GLOBAL_OFFSET_TABLE_ symbol; __how it is calculated is determined by the relocation type__); see the following figure (the objdump -d disassembly no longer shows the original relocation information). The relocation entry is at offset 0xd bytes into the .text section, where $_GLOBAL_OFFSET_TABLE_ resides. The field's initial value is 0x2; it is the addend A that accounts for the address of addl. During relocation, the linker first calculates the relocation entry's P (place) from r_offset. P-2 is then the address of addl. Why -2? Because the opcode of addl is 2 bytes long. So $_GLOBAL_OFFSET_TABLE_ = GOT-P+A.
{{./4.gif}}
**R_386_COPY:** The link editor creates this relocation type for dynamic linking. Its offset member refers to a location in a writable segment. The symbol table index specifies a symbol that should exist __both__ in the current object file and in a shared object. During execution, the dynamic linker __copies the data__ associated with the shared object's symbol to the location specified by the offset.
Sample:
[root@www save]# cat 386copy.c
#include <stdio.h>
extern int a;
int main(void) {
printf("%d\n", a);
}
[root@www save]# cat b.c
int a = 10;
#gcc -fPIC -shared -o b.so b.c
#gcc -o 386copy 386copy.c ./b.so
{{./5.gif}}
Fig2 shows the value of variable a being copied from the shared object into the executable's .bss section.
===== Notation: =====
S: The value of the symbol whose index resides in the relocation entry's r_info.
A: The addend used to calculate the value of the relocatable field.
P: The place (section offset or address) of the storage unit __being relocated__ (computed using r_offset). __That is, the location that the calculated value replaces.__
G: The __offset__ into the global offset table at which the address of the relocation entry's symbol will reside during execution.
GOT: The address of the global offset table.
L: The place (section offset or address) of the PLT entry for a symbol.
B: The __base address__ at which a shared object file has been loaded into memory during execution.
===== Relocation section: =====
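A relocation section (**this is a property of the section, not of individual relocation entries**) references two other sections: __a symbol table and a section to modify__. The section header's sh_info and sh_link members specify these relationships: sh_link holds the section header index of the associated symbol table, and sh_info holds the section header index of the section to which the relocations apply.
Sample code:
#include <stdio.h>
char a = 'a';
int b = 10;
extern char c;
extern void fun();
void pp (void) { }
int main(void) {
printf("%d\n", __b__);
int bb = __b__;
char cc = __c__; //b and c both use absolute addressing
**pp**(); //relative addressing
fun();
}
# gcc -c -o test1 test.c //produces a relocatable object file that __uses neither the GOT nor the PLT__, so none of the relocation types tied to them appear. Normally only __R_386_PC32 and R_386_32__ are used.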
[geekard@geekard rel]$ readelf -r rel.o
Relocation section '.rel.text' at offset 0x478 contains 7 entries:
Offset Info Type Sym.Value Sym. Name
0000000f 00000a01 __R_386_32__ 00000004 b
0000001a 00000501 R_386_32 00000000 .rodata
0000001f 00000d02 R_386_PC32 00000000 printf
00000024 00000a01 R_386_32 00000004 b
0000002f 00000e01 R_386_32 00000000 c
00000038 00000b02 R_386_PC32 00000000 pp
0000003d 00000f02 __R_386_PC32__ 00000000 fun
Relocation section '.rel.eh_frame' at offset 0x4b0 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000040 00000202 R_386_PC32 00000000 .text
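Without PIC, references to global variables in the object file are relocated by substituting the actual address directly (**R_386_32**), and calls to __both internal and external functions__ are relative jumps (**R_386_PC32**).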
[geekard@geekard rel]$ readelf -s rel.o
Symbol table '.symtab' contains 16 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS rel.c
2: 00000000 0 SECTION LOCAL DEFAULT 1
3: 00000000 0 SECTION LOCAL DEFAULT 3
4: 00000000 0 SECTION LOCAL DEFAULT 4
5: 00000000 0 SECTION LOCAL DEFAULT 5
6: 00000000 0 SECTION LOCAL DEFAULT 7
7: 00000000 0 SECTION LOCAL DEFAULT 8
8: 00000000 0 SECTION LOCAL DEFAULT 6
**9: 00000000 1 OBJECT GLOBAL DEFAULT 3 a**
** 10: 00000004 4 OBJECT GLOBAL DEFAULT 3 b**
** 11: 00000000 5 FUNC GLOBAL DEFAULT 1 pp**
** 12: 00000005 62 FUNC GLOBAL DEFAULT 1 main**
** 13: 00000000 0 NOTYPE GLOBAL DEFAULT UND printf**
** 14: 00000000 0 NOTYPE GLOBAL DEFAULT UND c**
** 15: 00000000 0 NOTYPE GLOBAL DEFAULT UND fun**
[geekard@geekard rel]$
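[geekard@geekard rel]$ gcc -c __-fPIC__ -o rel.o rel.c #still a relocatable object file, but symbol references now use __position-independent__ techniques, so references to global variables and external functions go through the GOT and PLT.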
[geekard@geekard rel]$ readelf -r rel.o
Relocation section '.rel.text' at offset 0x594 contains 9 entries:
Offset Info Type Sym.Value Sym. Name
00000010 00000f02 R_386_PC32 00000000 __x86.get_pc_thunk.bx
00000016 0000100a __R_386_GOTPC__ 00000000 _GLOBAL_OFFSET_TABLE_
0000001c 00000c03 __R_386_GOT32__ 00000004 b
00000028 00000509 R_386_GOTOFF 00000000 .rodata
00000030 00001104 R_386_PLT32 00000000 printf
00000036 00000c03 R_386_GOT32 00000004 b
00000042 00001203 R_386_GOT32 00000000 c
0000004e 00000d04 __R_386_PLT32__ 00000000 pp
00000053 00001304 R_386_PLT32 00000000 fun
Relocation section '.rel.eh_frame' at offset 0x5dc contains 3 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000040 00000202 R_386_PC32 00000000 .text
00000064 00000602 R_386_PC32 00000000 .text.__x86.get_pc_thu
[geekard@geekard rel]$
With PIC, these symbol relocations all go through the __GOT and PLT.__
Here are the details of how a REL section is associated with the symbol table and with the section to be relocated.
1. Show the ELF sections.
{{./7.gif}}
In fig1, the section .rel.text is of type REL; the sections it is associated with are the first and the 9th sections, .text and .symtab.
2. Show the relocation section entries:
{{./8.gif}}
In fig2, we can see there are two relocation entries for symbol b, because b is referenced twice and the linker has to relocate it twice.
3. What is the raw data of a relocation table entry?
{{./9.gif}}
Fig3 shows the content of the first entry of the relocation table. r_offset is 0x10, which means the relocated field is at offset 0x10 of test1. The symbol table index is 0x09. We can see through Fig4 that the 9th entry of the symbol table is b.
4. b is at a 4-byte offset from the start of the data section and its size is 4 bytes. The linker then calculates the address of b and patches that address into the .text section through the relocation entry.
{{./10.gif}}
So, we get a simple flow of how the linker does the relocation. __First, get all relocation entries, then get all symbols associated with the relocation entries, then calculate the addresses and modify the units in the section associated with the relocation entries.__ The real relocation is more complex, but the main flow is like this.
Sunday, September 12th, 2010 at 16:29

Binary file not shown.


View File

@@ -0,0 +1,15 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-12-23T13:11:25+08:00
====== CFI for gas ======
Created Sunday 23 December 2012
Modern ABIs don't require frame pointers to be used in functions.
However, missing FPs bring difficulties when doing a backtrace.
One solution is to provide Dwarf-2 CFI (call frame information) data
for each such function. This can be easily done for example by GCC in
its output, but isn't that easy to write by hand for pure assembler functions.
With the help of these .cfi_* directives one can add appropriate unwind info
into his asm source without too much trouble.

View File

@@ -0,0 +1,198 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-12-22T10:50:43+08:00
====== sample2--relocatable type ======
Created Saturday 22 December 2012
[geekard@geekard rel]$ cat -n rel.c **#test file**
1 #include <stdio.h>
2 int globalVar = 1;
3 int globalVarUninit;
4 static int globalVarStatic = 3;
5 extern externVar;
6
7 extern void externFun(void);
8 void Fun(void) {}
9
10 int main(void) {
11 int autoVar = globalVar;
12 static int staticVar = 2;
13 globalVarUninit = externVar;
14 printf("%d\n",globalVarStatic);
15 externFun();
16 Fun();
17 }
[geekard@geekard rel]$ **gcc -c rel.c #compile into a relocatable object file**
[geekard@geekard rel]$ readelf **-r** rel.o **#list the relocation entries**
Relocation section '.rel.text' at offset 0x4e0 contains 8 entries:
Offset Info Type Sym.Value Sym. Name
0000000f 00000b01 R_386_32 00000000 globalVar __#relocates the variable reference on line 11 of the file__
00000018 00000f01 R_386_32 00000000 externVar **#13**
0000001d 00000c01 R_386_32 00000004 globalVarUninit **#13**
00000022 00000301 R_386_32 00000000 .data
0000002d 00000601 R_386_32 00000000 .rodata
00000032 00001002 R_386_PC32 00000000 printf **#14**
00000037 00001102 R_386_PC32 00000000 externFun **#15**
0000003c 00000d02 R_386_PC32 00000000 Fun **#16**
Relocation section '.rel.eh_frame' at offset 0x520 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000040 00000202 R_386_PC32 00000000 .text
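Because PIC was not specified at compile time, the relocation entries use neither the GOT nor the PLT. Global variables use R_386_32 absolute-address relocations, and functions use R_386_PC32 relative-addressing relocations.
[geekard@geekard rel]$ **objdump -t rel.o #show the symbol table**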
rel.o: file format elf32-i386
SYMBOL TABLE:
00000000 l df *ABS* 00000000 rel.c
00000000 l d .text 00000000 .text
00000000 l d .data 00000000 .data
00000000 l d .bss 00000000 .bss
00000004 l O __.data__ 00000004 globalVarStatic
00000000 l d .rodata 00000000 .rodata
00000008 l O __.data__ 00000004 staticVar.1828
00000000 l d .note.GNU-stack 00000000 .note.GNU-stack
00000000 l d .eh_frame 00000000 .eh_frame
00000000 l d .comment 00000000 .comment
00000000 g O __.data__ 00000004 globalVar
00000004 O __*COM*__ 00000004 globalVarUninit
00000000 g F .text 00000005 Fun
00000005 g F .text 0000003d main
00000000 *UND* 00000000 externVar
00000000 *UND* 00000000 printf
00000000 *UND* 00000000 externFun
[geekard@geekard rel]$
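File-scope static variables, initialized global variables, and function-local static variables all live in the .data section. An uninitialized global variable, however, is placed in the COMMON section (named after Fortran 77's "common blocks"), and objdump shows no binding flag for it. Uninitialized file-scope and local-scope static variables are kept in the bss ("Block Started by Symbol") segment. To have globalVarUninit stored in the .bss section, compile with the -fno-common option, which is the recommended practice.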
geekard@ubuntu:~/Code$ cat bar.c
double globalVar;
int main() {}
geekard@ubuntu:~/Code$ cat bar.c
double globalVar;
int main() {}
geekard@ubuntu:~/Code$
geekard@ubuntu:~/Code$ gcc foo.c bar.c
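When the two files above are compiled and linked, the toolchain reports no duplicate-symbol error; with the -fno-common option enabled, however, an error is reported.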
geekard@ubuntu:~/Code$ gcc foo.c bar.c **-fno-common**
/tmp/cceNAIis.o:(.bss+0x0): multiple definition of `globalVar'
/tmp/ccWmFhZG.o:(.bss+0x0): first defined here
/usr/bin/ld: Warning: size of symbol `globalVar' changed from 4 in /tmp/ccWmFhZG.o to 8 in /tmp/cceNAIis.o
collect2: ld returned 1 exit status
geekard@ubuntu:~/Code$
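This is because globalVar in both foo.o and bar.o is then placed in the .bss section with global binding, so the two definitions conflict. Note that while the symbol sits in the COMMON section, objdump prints no binding flag for it, and the linker merges such tentative definitions instead of reporting a conflict.
geekard@ubuntu:~/Code$ objdump -t bar.o |grep globalVar //without -fno-common: no binding flag is shown
0000000000000008 __O *COM*__ 0000000000000008 globalVar
geekard@ubuntu:~/Code$ objdump -t bar.o |grep globalVar //with -fno-common enabled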
0000000000000000 __g__ O __.bss__ 0000000000000008 globalVar
[geekard@geekard rel]$ __gcc -S rel.c #compile__
[geekard@geekard rel]$ cat rel.s #view the generated assembly; it contains the directives that tell the assembler which sections and relocations to create.
.file "rel.c"
.globl globalVar **#symbol is globally visible**
.data **#start of the data section**
.align 4
.type globalVar, @object **#symbol type**
.size globalVar, 4 **#size of the symbol's object**
globalVar:
.long 1 **#value of the symbol**
__.comm__ globalVarUninit,4,4 **#COMMON section**
.align 4
.type globalVarStatic, @object
.size globalVarStatic, 4
globalVarStatic:
.long 3
.text **#start of the code section**
.globl Fun
.type Fun, @function
Fun:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
popl %ebp
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size Fun, .-Fun #size of the function object
.section .rodata **#start of the rodata section**
.LC0:
.string "%d\n"
.text
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $32, %esp
movl __globalVar,__ %eax __#global variables are referenced with absolute addressing, without the GOT; when assembling, as emits an R_386_32 relocation entry__
movl %eax, 28(%esp)
movl __externVar__, %eax
movl %eax, __globalVarUninit__
movl __globalVarStatic__, %eax
movl %eax, 4(%esp)
movl $.LC0, (%esp)
call __printf #external or global functions are called with relative addressing, without the PLT; when assembling, as emits an R_386_PC32 relocation entry__
call __externFun__
call __Fun __
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE1:
.size main, .-main
.data
.align 4
.type __staticVar.1828__, @object
.size staticVar.1828, 4
staticVar.1828:
.long 2
.ident "GCC: (GNU) 4.7.2"
.section .note.GNU-stack,"",@progbits
[geekard@geekard rel]$
[geekard@geekard rel]$ objdump -d rel.o
rel.o: file format elf32-i386
Disassembly of section .text:
00000000 <Fun>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 5d pop %ebp
4: c3 ret
00000005 <main>:
5: 55 push %ebp
6: 89 e5 mov %esp,%ebp
8: 83 e4 f0 and $0xfffffff0,%esp
b: 83 ec 20 sub $0x20,%esp
e: a1 00 00 00 00 mov 0x0,%eax
13: 89 44 24 1c mov %eax,0x1c(%esp)
17: a1 00 00 00 00 mov 0x0,%eax
1c: a3 00 00 00 00 mov %eax,0x0
21: a1 04 00 00 00 mov 0x4,%eax
26: 89 44 24 04 mov %eax,0x4(%esp)
2a: c7 04 24 00 00 00 00 movl $0x0,(%esp)
31: e8 fc ff ff ff call 32 <main+0x2d>
36: e8 fc ff ff ff call 37 <main+0x32>
3b: e8 fc ff ff ff call 3c <main+0x37>
40: c9 leave
41: c3 ret
[geekard@geekard rel]$

View File

@@ -0,0 +1,236 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-12-22T22:55:46+08:00
====== sample3--PIC relocatable type ======
Created Saturday 22 December 2012
[geekard@geekard rel]$ cat rel.c
#include <stdio.h>
int globalVar = 1;
int globalVarUninit;
static int globalVarStatic = 3;
extern externVar;
extern void externFun(int);
void Fun(void) {}
int main(void) {
int autoVar = globalVar;
static int staticVar = 2;
globalVarUninit = externVar;
printf("%d\n",globalVarStatic);
externFun(staticVar);
Fun();
}
[geekard@geekard rel]$ gcc -c __-fPIC__ rel.c
[geekard@geekard rel]$ readelf -r rel.o
Relocation section '**.rel.text**' at offset 0x600 contains 10 entries: **//relocations for symbol references in the text section**
Offset Info Type Sym.Value Sym. Name
00000010 00001102 R_386_PC32 00000000 __x86.get_pc_thunk.bx //relative-addressing relocation; the PLT is not used because this function is defined within the file.
00000016 0000120a R_386___GOTPC__ 00000000 _GLOBAL_OFFSET_TABLE_ //relocates the value in the code with the offset from the IP to the start of the GOT.
0000001c 00000d03 R_386___GOT32__ 00000000 globalVar //relocates the value in the code with the offset of the variable's entry within the GOT.
00000028 00001303 R_386_GOT32 00000000 externVar
00000030 00000e03 R_386_GOT32 00000004 globalVarUninit
00000038 00000309 R_386___GOTOFF__ 00000000 .data //relocates the value in the code segment with the __offset of the static symbol's address from the GOT__.
00000042 00000609 R_386_GOTOFF 00000000 .rodata //the .rodata section holds **string literals**.
0000004a 00001404 R_386___PLT32__ 00000000 printf //relocates the value in the code with the offset of printf's entry in the GOT/PLT.
00000050 00000309 R_386_GOTOFF 00000000 .data
0000004f 00001504 R_386_PLT32 00000000 externFun
00000054 00000f04 R_386_PLT32 00000000 Fun
Relocation section '.rel.eh_frame' at offset 0x650 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000040 00000202 R_386_PC32 00000000 .text
00000064 00000802 R_386_PC32 00000000 .text.__x86.get_pc_thu
[geekard@geekard rel]$ objdump -t rel.o
rel.o: file format elf32-i386
SYMBOL TABLE:
00000000 l df *ABS* 00000000 rel.c
00000000 l d .text 00000000 .text //.text section definition
00000000 g F __.text__ 00000005 Fun //first function in the .text section (offset 0), defined in this file
00000005 g F .text 00000058 main //second function in the .text section
00000000 g F .text.x86.get_pc_thunk.bx 00000000 .hidden x86.get_pc_thunk.bx
00000000 l d .text.x86.get_pc_thunk.bx 00000000 .text.x86.get_pc_thunk.bx
00000000 l d .data 00000000 .data //.data section definition
00000000 g O .data 00000004 globalVar
00000004 l O .data 00000004 globalVarStatic
00000008 l O .data 00000004 __staticVar.1828__
00000000 l d .rodata 00000000 .rodata
00000000 l d .bss 00000000 .bss
00000004 O __*COM*__ 00000004 globalVarUninit
00000000 l d .note.GNU-stack 00000000 .note.GNU-stack
00000000 l d .eh_frame 00000000 .eh_frame
00000000 l d .comment 00000000 .comment
00000000 l d .group 00000000 .group
00000000 __*UND*__ 00000000 _GLOBAL_OFFSET_TABLE_
00000000 *UND* 00000000 externVar
00000000 *UND* 00000000 printf
00000000 *UND* 00000000 externFun
[geekard@geekard rel]$ gcc -S __-fPIC__ rel.c
[geekard@geekard rel]$ cat rel.s
.file "rel.c"
.globl globalVar
.data
.align 4
.type globalVar, @object
.size globalVar, 4
globalVar:
.long 1
.comm globalVarUninit,4,4
.align 4
.type globalVarStatic, @object
.size globalVarStatic, 4
globalVarStatic:
.long 3
.text
.globl Fun
.type Fun, @function
Fun:
.LFB0: **//.LFB is a label used by DWARF and is paired with .LFE.**
.cfi_startproc //[[../CFI_for_gas.txt|cfi(call frame information)]]
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
popl %ebp
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size Fun, .-Fun
.section __.rodata__
__.LC0: //this label is not declared with .globl, so it is not in the symbol table and is valid only within this file.__
.string "%d\n"
.text
.globl main
.type main, @function
main:
.LFB1: //.LFB0 was used above, so this one is .LFB1
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
pushl %ebx
andl $-16, %esp
subl $32, %esp
.cfi_offset 3, -12
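call ____x86.get_pc_thunk.bx __**//obtain the value of the IP and save it in ebx**
addl $___GLOBAL_OFFSET_TABLE___, %ebx **//the GOTPC relocation yields the offset from the current IP to the GOT, so ebx ends up holding the address of the GOT**
movl __globalVar@GOT__(%ebx), %eax **//the GOT32 relocation yields the value of var@GOT, i.e. the offset of var's GOT entry from the start of the GOT.**
movl (%eax), %eax **//eax holds the actual address of var, so this indirect load fetches its value.**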
movl %eax, 28(%esp)
movl externVar@GOT(%ebx), %eax
movl (%eax), %edx
movl globalVarUninit@GOT(%ebx), %eax
movl %edx, (%eax)
movl __globalVarStatic@GOTOFF__(%ebx), %eax __//GOTOFF-type relocation__
movl %eax, 4(%esp)
leal __.LC0@GOTOFF__(%ebx), %eax
movl %eax, (%esp)
call __printf@PLT__
movl __staticVar.1828@GOTOFF__(%ebx), %eax
movl %eax, (%esp)
call externFun@PLT
call Fun@PLT
movl -4(%ebp), %ebx
leave
.cfi_restore 5
.cfi_restore 3
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE1:
.size main, .-main
.data
.align 4
.type staticVar.1828, @object
.size staticVar.1828, 4
staticVar.1828:
.long 2
.section .text.x86.get_pc_thunk.bx,"axG",@progbits,x86.get_pc_thunk.bx,comdat
.globl __x86.get_pc_thunk.bx
.hidden __x86.get_pc_thunk.bx
.type __x86.get_pc_thunk.bx, @function
__x86.get_pc_thunk.bx:
.LFB2:
.cfi_startproc
movl (%esp), %ebx
ret
.cfi_endproc
.LFE2:
.ident "GCC: (GNU) 4.7.2"
.section .note.GNU-stack,"",@progbits
[geekard@geekard rel]$
[geekard@geekard rel]$ objdump -d rel.o
rel.o: file format elf32-i386
Disassembly of section .text:
00000000 <Fun>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 5d pop %ebp
4: c3 ret
00000005 <main>:
5: 55 push %ebp
6: 89 e5 mov %esp,%ebp
8: 53 push %ebx
9: 83 e4 f0 and $0xfffffff0,%esp
c: 83 ec 20 sub $0x20,%esp
f: e8 __fc ff ff ff__ call 10 <main+0xb>
14: 81 c3 __02 00 00 00__ add $0x2,%ebx //the objdump disassembly __no longer shows the original relocation information__, so it must be read together with the relocation entries.
1a: 8b 83 __00 00 00 00__ mov 0x0(%ebx),%eax
20: 8b 00 mov (%eax),%eax
22: 89 44 24 1c mov %eax,0x1c(%esp)
26: 8b 83 __00 00 00 00__ mov 0x0(%ebx),%eax
2c: 8b 10 mov (%eax),%edx
2e: 8b 83 __00 00 00 00__ mov 0x0(%ebx),%eax
34: 89 10 mov %edx,(%eax)
36: 8b 83 __04 00 00 00__ mov 0x4(%ebx),%eax
3c: 89 44 24 04 mov %eax,0x4(%esp)
40: 8d 83 __00 00 00 00__ lea 0x0(%ebx),%eax
46: 89 04 24 mov %eax,(%esp)
49: e8 __fc ff ff ff__ call 4a <main+0x45>
4e: 8b 83 __08 00 00 00__ mov 0x8(%ebx),%eax
54: 89 04 24 mov %eax,(%esp)
57: e8 __fc ff ff ff__ call 58 <main+0x53>
5c: e8 __fc ff ff ff__ call 5d <main+0x58>
61: 8b 5d fc mov -0x4(%ebp),%ebx
64: c9 leave
65: c3 ret
#The positions highlighted above must be relocated by the linker.
Disassembly of section .text.__x86.get_pc_thunk.bx:
00000000 <__x86.get_pc_thunk.bx>:
0: 8b 1c 24 mov (%esp),%ebx
3: c3 ret
[geekard@geekard rel]$

View File

@@ -0,0 +1,125 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2012-12-16T17:34:49+08:00
====== ld-linux debug information ======
Created Sunday 16 December 2012
To enable debug output from the dynamic linker, set the environment variable LD_DEBUG=all.
**[geekard@geekard hello]$ cat hello.c**
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int glb_init = 1;
int glb_uninit;
int main(void)
{
char *str = "Just a test string!";
printf("The test string is:\"%s\"\n", str);
printf("glb_init:%d, glb_uninit:%d\n", glb_init, glb_uninit);
pause(); **//pause the process so that its memory mappings can be inspected.**
exit(0);
}
[geekard@geekard hello]$ __strace -e trace=mmap2,mprotect,munmap,open,close -ELD_DEBUG=all ./hello &>log__
^Z
[1]+ Stopped strace -e trace=mmap2,mprotect,munmap,open,close -ELD_DEBUG=all ./hello &>log
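#The log file above contains both the strace output and the ld-linux.so DEBUG output for hello.
**#4727 is the PID of hello; the following command extracts the ld-linux.so DEBUG output from log**
**[geekard@geekard hello]$ cat log|grep 4727>log.ld **
#**the following command extracts hello's system call trace from log**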
**[geekard@geekard hello]$ cat log |sed '/4727/d' >log.strace**
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77b8000
**[geekard@geekard hello]$ readelf -l /lib/libc.so.6**
Elf file type is DYN (Shared object file)
Entry point __0x19760__
There are __10__ program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x00000034 0x00000034 0x00140 0x00140 R E 0x4
INTERP 0x16b7e8 0x0016b7e8 0x0016b7e8 0x00017 0x00017 R 0x1
[Requesting program interpreter: [[/usr/lib/ld-linux.so.2]]]
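#The first LOAD is R E; its size is 1718236 B (0x1a37dc), and with 4 KB alignment it occupies 1720320 B (the 1680K r-x mapping in the pmap output). The first mmap2 call on libc in the log passes a larger length, 1743556 (0x1a9ac4): the span from the start of the first LOAD to the end of the second (0x1a41dc + 0x58e8), which reserves the address range for the whole library in one call.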
**LOAD** __0x000000__ 0x00000000 0x00000000 __0x1a37dc__ 0x1a37dc R E 0x1000
**LOAD** 0x1a41dc 0x001a41dc 0x001a41dc 0x02ce0 __0x058e8__ RW 0x1000
DYNAMIC __0x1a5d9c__ 0x001a5d9c 0x001a5d9c 0x000f8 0x000f8 RW 0x4
NOTE 0x000174 0x00000174 0x00000174 0x00044 0x00044 R 0x4
TLS 0x1a41dc 0x001a41dc 0x001a41dc 0x00008 0x00040 R 0x4
GNU_EH_FRAME 0x16b800 0x0016b800 0x0016b800 0x07454 0x07454 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
**GNU_RELRO** 0x1a41dc 0x001a41dc 0x001a41dc 0x01e24 0x01e24 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .rel.plt .plt .text __libc_freeres_fn __libc_thread_freeres_fn .rodata .interp .eh_frame_hdr .eh_frame .gcc_except_table .hash
03 .tdata .init_array __libc_subfreeres __libc_atexit __libc_thread_subfreeres .data.rel.ro .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.gnu.build-id .note.ABI-tag
06 .tdata .tbss
07 .eh_frame_hdr
08
09 .tdata .init_array __libc_subfreeres __libc_atexit __libc_thread_subfreeres .data.rel.ro .dynamic .got
As shown, the __virtual addresses in libc.so.6 start at 0__.
**[geekard@geekard hello]$ pmap $(pgrep hello) |nl #show the address mappings of the hello process**
1 4727: ./hello
2 08048000 4K r-x-- /home/geekard/Code/hello/hello
3 08049000 4K rw--- /home/geekard/Code/hello/hello
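4 __b75eb000__ 4K rw--- [ anon ] //guard region for libc (allocated by the mmap2 call that returns 0xb75eb000 in the log)
//0xb75ec000 is the __randomized base address (see the log file below)__ at which ld-linux.so mapped libc into the address space of the hello process.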
5 __b75ec000__ 1680K r-x-- /usr/lib/libc-2.16.so
6 __b7790000__ 8K r---- /usr/lib/libc-2.16.so
7 b7792000 4K rw--- /usr/lib/libc-2.16.so
8 __b7793000__ 12K rw--- [ anon ]
9 b77b8000 **8K** rw--- [ anon ] //contains the anonymous block allocated by the first mmap2() call.
10 b77ba000 4K r-x-- [ anon ] //guard region for ld (an r-x anonymous page at this position is most likely the kernel's vDSO)
11 b77bb000 128K r-x-- /usr/lib/ld-2.16.so
12 b77db000 4K r---- /usr/lib/ld-2.16.so
13 b77dc000 4K rw--- /usr/lib/ld-2.16.so
14 __bff8b000__ 132K rw--- [ stack ]
15 total 1996K
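**[geekard@geekard hello]$ cat log //show the system calls printed by strace and the DEBUG output printed by ld-linux.so.**
//Anonymous mapping: the first argument to mmap2 is NULL, so the kernel chooses a (randomized) address, here **0xb77b9000.**
//The anonymous mapping covers **0xb77b9000 to 0xb77ba000, which falls within line 9 of the pmap output.**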
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|**MAP_ANONYMOUS**, -1, 0) = **0xb77b9000**
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
//File mapping: the kernel chooses a (randomized) start address, here **0xb7796000**
mmap2(**NULL**, 139868, PROT_READ, MAP_PRIVATE, 3, 0) = **0xb7796000**
4727:
4727: file=libc.so.6 [0]; needed by ./hello [0]
4727: find library=libc.so.6 [0]; searching
4727: search cache=/etc/ld.so.cache
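//ld.so.cache is closed here. Closing the descriptor does **not remove the mapped region**; it stays mapped until the munmap(0xb7796000, 139868) call later in the log.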
close(3) = 0
4727: trying file=/usr/lib/libc.so.6
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
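//File mapping: the kernel chooses a (randomized) start address, here __0xb75ec000. This maps the first LOAD segment__; the length 1743556 (0x1a9ac4) spans both LOAD segments, so the call also reserves the address range for the second one.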
mmap2(**NULL**, 1743556, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = __0xb75ec000__
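//This maps the second LOAD segment: the address is base + 0x1a4000, and the offset argument 0x1a4 is counted in 4096-byte pages. The anonymous MAP_FIXED mapping after it supplies the zero-filled remainder of the segment (.bss).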
mmap2(**0xb7790000,** 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, **0x1a4**) = 0xb7790000
mmap2(**0xb7793000**, 10948, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|**MAP_ANONYMOUS**, -1, 0) = 0xb7793000
close(3) = 0
4727:
4727: file=libc.so.6 [0]; generating link map
4727: dynamic: __0xb7791d9c__ base: __0xb75ec000__ size: 0x001a9ac4
4727: entry: 0xb7605760 phdr: 0xb75ec034 phnum: 10
4727:
4727: checking for version `GLIBC_2.0' in file /usr/lib/libc.so.6 [0] required by file ./hello [0]
4727: checking for version `GLIBC_2.3' in file /lib/ld-linux.so.2 [0] required by file /usr/lib/libc.so.6 [0]
4727: checking for version `GLIBC_PRIVATE' in file /lib/ld-linux.so.2 [0] required by file /usr/lib/libc.so.6 [0]
4727: checking for version `GLIBC_2.1' in file /lib/ld-linux.so.2 [0] required by file /usr/lib/libc.so.6 [0]
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = **0xb75eb000**
mprotect(0xb7790000, 8192, PROT_READ) = 0
mprotect(0xb77db000, 4096, PROT_READ) = 0
munmap(0xb7796000, 139868) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = **0xb77b8000**