
# gzip, zipfile, tarfile modules: working with compressed files

In [1]:

```py
import os, shutil, glob
import zlib, gzip, bz2, zipfile, tarfile

```


## zlib module

`zlib` provides functions for compressing and decompressing strings:

In [2]:

```py
orginal = "this is a test string"

compressed = zlib.compress(orginal)

print compressed
print zlib.decompress(compressed)

```

```py
x�+��,V�D������⒢̼tS���
this is a test string

```
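As a minimal sketch of how the compression level affects the result, the following compares two levels on repetitive input. It assumes Python 3, where `zlib.compress` takes `bytes` rather than `str` (the cells in this document use Python 2 syntax); the sample data is made up for illustration.

```python
import zlib

# Hypothetical sample data: repetitive input compresses well.
data = b"this is a test string " * 100

fast = zlib.compress(data, 1)  # level 1 favors speed
best = zlib.compress(data, 9)  # level 9 favors a smaller result

# Both results decompress back to the original bytes.
assert zlib.decompress(fast) == data
assert zlib.decompress(best) == data

# Original size, then the two compressed sizes.
print(len(data), len(fast), len(best))
```

On highly repetitive input like this, both compressed sizes should be far smaller than the original; the gap between levels depends on the data.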

It also provides two checksum functions, `adler32` and `crc32`:

In [3]:

```py
print zlib.adler32(orginal) & 0xffffffff

```

```py
1407780813

```

In [4]:

```py
print zlib.crc32(orginal) & 0xffffffff

```

```py
4236695221

```
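Both checksum functions accept an optional running value, which makes it possible to checksum data piece by piece instead of loading it all at once. A small sketch for `crc32`, assuming Python 3 (where the input must be `bytes` and the result is already unsigned); the chunk size of 8 is arbitrary:

```python
import zlib

data = b"this is a test string"

# Feed the data in small chunks, passing the running value back in each time.
running = 0
for start in range(0, len(data), 8):
    running = zlib.crc32(data[start:start + 8], running)

# The incremental result matches the one-shot checksum.
assert running == zlib.crc32(data)
print(running)
```

The same pattern works for `zlib.adler32`, except that its default starting value is 1 rather than 0.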

## gzip module

The `gzip` module reads and writes `.gz` files; the compression itself is provided by the `zlib` module.

We can read and write `.gz` files with the `gzip.open` function. Writing:

In [5]:

```py
content = "Lots of content here"
with gzip.open('file.txt.gz', 'wb') as f:
    f.write(content)

```

Reading:

In [6]:

```py
with gzip.open('file.txt.gz', 'rb') as f:
    file_content = f.read()

print file_content

```

```py
Lots of content here

```
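The cells above use binary mode (`'wb'` / `'rb'`). As a hedged sketch, assuming Python 3, `gzip.open` can also be used in text mode with an explicit encoding. The temporary directory and the file name here are illustrative only, chosen so that nothing is left on disk:

```python
import gzip
import os
import tempfile

# Work in a throwaway directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt.gz")  # hypothetical file name

    # "wt" / "rt" open the compressed file in text mode.
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write("Lots of content here")

    with gzip.open(path, "rt", encoding="utf-8") as f:
        text = f.read()

print(text)
```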

Decompress the file's contents to disk:

In [7]:

```py
with gzip.open('file.txt.gz', 'rb') as f_in, open('file.txt', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)

```

The directory should now contain `file.txt`, with the following contents:

In [8]:

```py
with open("file.txt") as f:
    print f.read()

```

```py
Lots of content here

```

In [9]:

```py
os.remove("file.txt.gz")

```

## bz2 module

The `bz2` module provides another compression method (bzip2):

In [10]:

```py
orginal = "this is a test string"

compressed = bz2.compress(orginal)

print compressed
print bz2.decompress(compressed)

```

```py
BZh91AY&SY*��v	��@"�� 10"zi�����FLT`�軒)„�P�˰
this is a test string

```
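Like `gzip`, the `bz2` module can also work with files directly. A minimal sketch, assuming Python 3 (where `bz2.open` is available); the temporary directory and the `.bz2` file name are illustrative:

```python
import bz2
import os
import tempfile

payload = b"this is a test string"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt.bz2")  # hypothetical file name

    # Write the bytes through the bzip2 compressor.
    with bz2.open(path, "wb") as f:
        f.write(payload)

    # Read them back; decompression is transparent.
    with bz2.open(path, "rb") as f:
        restored = f.read()

assert restored == payload
```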

## zipfile module

Make a few copies of `file.txt`:

In [11]:

```py
for i in range(10):
    shutil.copy("file.txt", "file.txt." + str(i))

```

Pack all of these copies into a single `.zip` file:

In [12]:

```py
f = zipfile.ZipFile('files.zip','w')

for name in glob.glob("*.txt.[0-9]"):
    f.write(name)
    os.remove(name)

f.close()

```
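Note that `ZipFile` stores members uncompressed by default (`zipfile.ZIP_STORED`). The following sketch, assuming Python 3, compares that default with `zipfile.ZIP_DEFLATED` using in-memory archives; the helper `zipped_size` and the sample payload are hypothetical names introduced only for this illustration:

```python
import io
import zipfile

# Hypothetical sample payload: repetitive, so it compresses well.
payload = b"Lots of content here " * 200


def zipped_size(compression):
    """Return the size of an in-memory archive holding the payload."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        zf.writestr("file.txt", payload)
    return len(buf.getvalue())


stored = zipped_size(zipfile.ZIP_STORED)      # default: no compression
deflated = zipped_size(zipfile.ZIP_DEFLATED)  # zlib-based compression

print(len(payload), stored, deflated)
```

With `ZIP_STORED` the archive is slightly larger than the payload because of the zip headers; with `ZIP_DEFLATED` it should be much smaller for input like this.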

Open the `.zip` file for reading and use the `namelist` method to list the names of the files it contains:

In [13]:

```py
f = zipfile.ZipFile('files.zip','r')
print f.namelist()

```

```py
['file.txt.9', 'file.txt.6', 'file.txt.2', 'file.txt.1', 'file.txt.5', 'file.txt.4', 'file.txt.3', 'file.txt.7', 'file.txt.8', 'file.txt.0']

```

Use `f.read(name)` to read the contents of the member `name`:

In [14]:

```py
for name in f.namelist():
    print name, "content:", f.read(name)

f.close()

```

```py
file.txt.9 content: Lots of content here
file.txt.6 content: Lots of content here
file.txt.2 content: Lots of content here
file.txt.1 content: Lots of content here
file.txt.5 content: Lots of content here
file.txt.4 content: Lots of content here
file.txt.3 content: Lots of content here
file.txt.7 content: Lots of content here
file.txt.8 content: Lots of content here
file.txt.0 content: Lots of content here

```

Use `extract(name)` to extract a single file, or `extractall()` to extract all of them.
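A short sketch of both calls, assuming Python 3. It builds a small stand-in archive in a temporary directory instead of reusing `files.zip`, and the output directory names `one` and `all` are made up for the example:

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "files.zip")

    # Build a small stand-in archive with three members.
    with zipfile.ZipFile(archive, "w") as zf:
        for i in range(3):
            zf.writestr("file.txt." + str(i), "Lots of content here")

    out_one = os.path.join(tmp, "one")  # hypothetical output directories
    out_all = os.path.join(tmp, "all")
    with zipfile.ZipFile(archive, "r") as zf:
        zf.extract("file.txt.0", path=out_one)  # a single member
        zf.extractall(path=out_all)             # every member

    extracted_one = sorted(os.listdir(out_one))
    extracted_all = sorted(os.listdir(out_all))

print(extracted_one)
print(extracted_all)
```

Both methods take an optional `path` argument; without it, files are extracted into the current working directory.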

## tarfile module

The `tarfile` module supports reading and writing `.tar` files.

For example, `file.txt` can be added to an archive like this:

In [15]:

```py
f = tarfile.open("file.txt.tar", "w")
f.add("file.txt")
f.close()

```
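Reading an archive back works much the same way. The sketch below, assuming Python 3, also uses the `"w:gz"` / `"r:gz"` modes to combine tar with gzip compression; it runs in a temporary directory, and the `.tar.gz` file name is illustrative:

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "file.txt")
    with open(src, "w") as f:
        f.write("Lots of content here")

    archive = os.path.join(tmp, "file.txt.tar.gz")  # hypothetical file name

    # "w:gz" writes a gzip-compressed tar archive.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="file.txt")

    # "r:gz" reads it back; getnames() lists the member names.
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        member_text = tar.extractfile("file.txt").read().decode("utf-8")

print(names, member_text)
```

Passing `arcname` keeps the temporary directory path out of the archive, so the member is stored simply as `file.txt`.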

Clean up the generated files:

In [16]:

```py
os.remove("file.txt")
os.remove("file.txt.tar")
os.remove("files.zip")

```