# The gzip, zipfile, and tarfile modules: working with compressed files

In [1]:

```py
import os, shutil, glob
import zlib, gzip, bz2, zipfile, tarfile
```

## zlib module

`zlib` provides functions for compressing and decompressing strings:

In [2]:

```py
original = "this is a test string"
compressed = zlib.compress(original)
print compressed
print zlib.decompress(compressed)
```

```py
x�+��,V�D������⒢̼tS���
this is a test string
```

It also provides two checksum functions, Adler-32 and CRC-32:

In [3]:

```py
# Mask to 32 bits so the result is an unsigned value
print zlib.adler32(original) & 0xffffffff
```

```py
1407780813
```

In [4]:

```py
print zlib.crc32(original) & 0xffffffff
```

```py
4236695221
```

## gzip module

The `gzip` module produces files in the `.gz` format; the compression itself is provided by the `zlib` module.

We can read and write `.gz` files with `gzip.open`:

In [5]:

```py
content = "Lots of content here"
with gzip.open('file.txt.gz', 'wb') as f:
    f.write(content)
```

Reading:

In [6]:

```py
with gzip.open('file.txt.gz', 'rb') as f:
    file_content = f.read()

print file_content
```

```py
Lots of content here
```

Decompress the contents into a regular file:

In [7]:

```py
# Stream the decompressed data from the .gz file into file.txt
with gzip.open('file.txt.gz', 'rb') as f_in, open('file.txt', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)
```

The directory should now contain `file.txt`, with the following contents:

In [8]:

```py
with open("file.txt") as f:
    print f.read()
```

```py
Lots of content here
```

In [9]:

```py
# Remove the compressed file; file.txt is still needed below
os.remove("file.txt.gz")
```

## bz2 module

The `bz2` module provides another compression method:

In [10]:

```py
original = "this is a test string"
compressed = bz2.compress(original)
print compressed
print bz2.decompress(compressed)
```

```py
BZh91AY&SY*��v ��@"�� 10"zi�����FLT`�軒)„�P�˰
this is a test string
```

## zipfile module

Create a few copies of `file.txt`:

In [11]:

```py
for i in range(10):
    shutil.copy("file.txt", "file.txt." + str(i))
```

Pack all of these copies into a single `.zip` file:

In [12]:

```py
# Note: ZipFile stores files uncompressed unless compression=zipfile.ZIP_DEFLATED is passed
f = zipfile.ZipFile('files.zip', 'w')

for name in glob.glob("*.txt.[0-9]"):
    f.write(name)
    os.remove(name)

f.close()
```

Open the `.zip` file for reading and use the `namelist` method to list the names of the files it contains:

In [13]:

```py
f = zipfile.ZipFile('files.zip', 'r')
print f.namelist()
```

```py
['file.txt.9', 'file.txt.6', 'file.txt.2', 'file.txt.1', 'file.txt.5', 'file.txt.4', 'file.txt.3', 'file.txt.7', 'file.txt.8', 'file.txt.0']
```

Use `f.read(name)` to read the contents of the member named `name`:

In [14]:

```py
for name in f.namelist():
    print name, "content:", f.read(name)

f.close()
```

```py
file.txt.9 content: Lots of content here
file.txt.6 content: Lots of content here
file.txt.2 content: Lots of content here
file.txt.1 content: Lots of content here
file.txt.5 content: Lots of content here
file.txt.4 content: Lots of content here
file.txt.3 content: Lots of content here
file.txt.7 content: Lots of content here
file.txt.8 content: Lots of content here
file.txt.0 content: Lots of content here
```

Use `extract(name)` to extract a single file, or `extractall()` to extract all of them.

## tarfile module

The `tarfile` module supports reading and writing `.tar` archives.

For example, `file.txt` can be written to an archive like this:

In [15]:

```py
f = tarfile.open("file.txt.tar", "w")
f.add("file.txt")
f.close()
```

Clean up the generated files:

In [16]:

```py
os.remove("file.txt")
os.remove("file.txt.tar")
os.remove("files.zip")
```
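
The `tarfile` example above only writes an archive. As a complement, here is a minimal sketch of a full round trip: writing a file into a gzip-compressed tar archive and reading it back without extracting to disk. The helper name `tar_round_trip`, the temporary directory, and the `file.txt.tar.gz` name are assumptions made for illustration and are not part of the original notebook. The sketch uses bytes literals and `print()` with a single argument so that it should run on Python 3 as well as Python 2.7.

```python
import os
import shutil
import tarfile
import tempfile


def tar_round_trip(text=b"Lots of content here"):
    """Write one file into a .tar.gz archive, then read it back (hypothetical helper)."""
    workdir = tempfile.mkdtemp()  # scratch directory, removed at the end
    try:
        src = os.path.join(workdir, "file.txt")
        archive = os.path.join(workdir, "file.txt.tar.gz")
        with open(src, "wb") as f:
            f.write(text)

        # "w:gz" writes a gzip-compressed tar; plain "w" leaves it uncompressed.
        with tarfile.open(archive, "w:gz") as tar:
            # arcname sets the name stored in the archive instead of the full path.
            tar.add(src, arcname="file.txt")

        # "r:*" asks tarfile to detect the compression format when reading.
        with tarfile.open(archive, "r:*") as tar:
            names = tar.getnames()
            member = tar.extractfile("file.txt")  # file-like object, nothing hits disk
            data = member.read()
            member.close()
        return names, data
    finally:
        shutil.rmtree(workdir)


names, data = tar_round_trip()
print(names)
print(data)
```

To unpack members to disk instead, `tar.extract(name)` and `tar.extractall()` mirror the `zipfile` methods of the same names; as with any archive, it is safest to extract only archives from a trusted source.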