mirror of
https://github.com/apachecn/ailearning.git
synced 2026-02-10 05:45:40 +08:00
241 lines
3.4 KiB
Markdown
# gzip, zipfile, and tarfile modules: working with compressed files

In [1]:

```py
import os, shutil, glob
import zlib, gzip, bz2, zipfile, tarfile

```

## zlib module

`zlib` provides functions for compressing and decompressing strings:

In [2]:

```py
orginal = "this is a test string"

# compress the string, then restore it
compressed = zlib.compress(orginal)

print compressed
print zlib.decompress(compressed)

```

```py
(compressed bytes: binary data, not printable as text)
this is a test string

```

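The cell above uses Python 2 syntax, where `str` is already a byte string. As a minimal sketch, assuming Python 3, the same round trip needs an explicit encode and decode step, and the optional compression level (1 to 9) trades speed for size. The repeated text and the variable names `fast`, `small`, and `restored` are illustrative only.

```python
import zlib

# Python 3: zlib works on bytes, so encode the text before compressing.
# The string is repeated so that there is redundancy to compress away.
text = "this is a test string" * 50
data = text.encode("utf-8")

fast = zlib.compress(data, 1)   # level 1: fastest, usually the largest output
small = zlib.compress(data, 9)  # level 9: slowest, usually the smallest output

# Decompress and decode back to text
restored = zlib.decompress(small).decode("utf-8")

print(len(data), len(fast), len(small))
print(restored == text)
```

Whatever level is chosen, `zlib.decompress` restores the original bytes; the level only affects how the data is compressed.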
It also provides two checksum functions, `adler32` and `crc32`:

In [3]:

```py
print zlib.adler32(orginal) & 0xffffffff

```

```py
1407780813

```

In [4]:

```py
print zlib.crc32(orginal) & 0xffffffff

```

```py
4236695221

```

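In Python 2 these functions can return signed integers, which is why the cells above mask the result with `& 0xffffffff`. The following sketch, assuming Python 3 (where the result is always unsigned and the mask is a harmless no-op), shows two common uses: computing a running checksum over chunks and detecting a corrupted copy. The chunk boundaries and the `corrupted` value are made up for illustration.

```python
import zlib

data = b"this is a test string"

# One-shot checksums; the mask keeps the value identical across Python versions
adler = zlib.adler32(data) & 0xffffffff
crc = zlib.crc32(data) & 0xffffffff

# Running checksum: feed the data in chunks, passing the previous
# value as the second argument each time
running = 0
for chunk in (b"this is ", b"a test ", b"string"):
    running = zlib.crc32(chunk, running)

# A copy with one changed byte produces a different CRC
corrupted = b"this is a tent string"

print(running == crc)
print(zlib.crc32(corrupted) == crc)
```

The running form is useful when the data is too large to hold in memory at once, for example when reading a file block by block.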
## gzip module

The `gzip` module can produce `.gz` files; the compression itself is provided by the `zlib` module.

We can read and write `.gz` files with the `gzip.open` function:

In [5]:

```py
content = "Lots of content here"
# 'wb' opens the compressed file for binary writing
with gzip.open('file.txt.gz', 'wb') as f:
    f.write(content)

```

Reading:

In [6]:

```py
with gzip.open('file.txt.gz', 'rb') as f:
    file_content = f.read()

print file_content

```

```py
Lots of content here

```

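The cells above are Python 2, where writing a `str` to a binary file works directly. As a hedged sketch for Python 3, `gzip.open` also accepts the text modes `'wt'` and `'rt'` with an `encoding`, so the file can be written and read as text; binary mode returns `bytes` instead. The temporary directory is an assumption made here to keep the example self-contained.

```python
import gzip
import os
import tempfile

content = "Lots of content here"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "file.txt.gz")  # scratch path for this sketch

    # Text mode: gzip encodes the text before compressing it
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(content)

    # Text mode read: decompress, then decode back to str
    with gzip.open(path, "rt", encoding="utf-8") as f:
        file_content = f.read()

    # Binary mode read: the same data, but as bytes
    with gzip.open(path, "rb") as f:
        raw = f.read()

print(file_content)
```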
Decompress the contents of the compressed file into a regular file:

In [7]:

```py
# copy the decompressed stream into a plain file
with gzip.open('file.txt.gz', 'rb') as f_in, open('file.txt', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)

```

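The same `shutil.copyfileobj` pattern works in the opposite direction. This sketch, assuming Python 3 and a temporary directory with made-up file contents, compresses an existing plain file into a `.gz` file and compares the sizes on disk.

```python
import gzip
import os
import shutil
import tempfile

payload = b"Lots of content here\n" * 1000  # repetitive sample data

with tempfile.TemporaryDirectory() as tmpdir:
    plain = os.path.join(tmpdir, "file.txt")
    packed = os.path.join(tmpdir, "file.txt.gz")

    with open(plain, "wb") as f:
        f.write(payload)

    # Stream the plain file into the compressed one chunk by chunk,
    # so the whole file never has to be held in memory
    with open(plain, "rb") as f_in, gzip.open(packed, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

    plain_size = os.path.getsize(plain)
    packed_size = os.path.getsize(packed)

    # Read the compressed file back to confirm nothing was lost
    with gzip.open(packed, "rb") as f:
        round_trip = f.read()

print(plain_size, packed_size)
```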
The directory should now contain a `file.txt` file with the following contents:

In [8]:

```py
with open("file.txt") as f:
    print f.read()

```

```py
Lots of content here

```

In [9]:

```py
os.remove("file.txt.gz")

```

## bz2 module

The `bz2` module provides another compression method:

In [10]:

```py
orginal = "this is a test string"

compressed = bz2.compress(orginal)

print compressed
print bz2.decompress(compressed)

```

```py
BZh91AY&SY (remaining bytes are binary data, not printable as text)
this is a test string

```

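As a rough sketch, assuming Python 3, the snippet below compresses the same bytes with `bz2` and `zlib` so the sizes can be compared, checks the `BZh` prefix visible in the output above, and uses `bz2.open` to read and write a `.bz2` file in the same way as `gzip.open`. The sample data and temporary path are illustrative assumptions; which codec produces the smaller result depends on the input.

```python
import bz2
import os
import tempfile
import zlib

data = b"this is a test string\n" * 200

bz2_blob = bz2.compress(data)
zlib_blob = zlib.compress(data)
print(len(data), len(bz2_blob), len(zlib_blob))

# Every bz2 stream starts with the signature bytes "BZh"
bz2_magic = bz2_blob[:3]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "file.txt.bz2")  # scratch path for this sketch

    # bz2.open mirrors gzip.open for file-based access
    with bz2.open(path, "wb") as f:
        f.write(data)
    with bz2.open(path, "rb") as f:
        restored = f.read()
```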
## zipfile module

Create some copies of `file.txt`:

In [11]:

```py
for i in range(10):
    shutil.copy("file.txt", "file.txt." + str(i))

```

Add all of these copies to a single `.zip` archive:

In [12]:

```py
f = zipfile.ZipFile('files.zip','w')

for name in glob.glob("*.txt.[0-9]"):
    # add the copy to the archive, then delete it from disk
    f.write(name)
    os.remove(name)

f.close()

```

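Note that `ZipFile` stores members without compressing them unless a compression method is requested. A minimal sketch, assuming Python 3, that writes the archive with `zipfile.ZIP_DEFLATED` and uses the archive as a context manager so it is closed automatically; the temporary directory and the three sample files are assumptions for illustration.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create a few sample files to archive
    names = []
    for i in range(3):
        name = "file.txt." + str(i)
        with open(os.path.join(tmpdir, name), "w") as f:
            f.write("Lots of content here")
        names.append(name)

    archive = os.path.join(tmpdir, "files.zip")

    # ZIP_DEFLATED enables compression; the default is ZIP_STORED
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # arcname keeps the temporary directory out of the stored path
            zf.write(os.path.join(tmpdir, name), arcname=name)

    is_valid = zipfile.is_zipfile(archive)
    with zipfile.ZipFile(archive, "r") as zf:
        stored_names = zf.namelist()

print(stored_names)
```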
Open the `.zip` file for reading and use the `namelist` method to list the names of the files it contains:

In [13]:

```py
f = zipfile.ZipFile('files.zip','r')
print f.namelist()

```

```py
['file.txt.9', 'file.txt.6', 'file.txt.2', 'file.txt.1', 'file.txt.5', 'file.txt.4', 'file.txt.3', 'file.txt.7', 'file.txt.8', 'file.txt.0']

```

Use `f.read(name)` to read the contents of the archive member `name`:

In [14]:

```py
for name in f.namelist():
    print name, "content:", f.read(name)

f.close()

```

```py
file.txt.9 content: Lots of content here
file.txt.6 content: Lots of content here
file.txt.2 content: Lots of content here
file.txt.1 content: Lots of content here
file.txt.5 content: Lots of content here
file.txt.4 content: Lots of content here
file.txt.3 content: Lots of content here
file.txt.7 content: Lots of content here
file.txt.8 content: Lots of content here
file.txt.0 content: Lots of content here

```

Use `extract(name)` to extract a single file, or `extractall()` to extract all of them.

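A short sketch of both calls, assuming Python 3. It builds a small archive in a temporary directory with `writestr`, then extracts one member and all members into separate target directories through the optional `path` argument. The directory names `one` and `all` are hypothetical.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmpdir:
    archive = os.path.join(tmpdir, "files.zip")

    # Build a small archive directly from strings
    with zipfile.ZipFile(archive, "w") as zf:
        for i in range(3):
            zf.writestr("file.txt." + str(i), "Lots of content here")

    one_dir = os.path.join(tmpdir, "one")  # target for a single member
    all_dir = os.path.join(tmpdir, "all")  # target for every member

    with zipfile.ZipFile(archive, "r") as zf:
        # extract returns the path of the file it created
        extracted_path = zf.extract("file.txt.0", path=one_dir)
        zf.extractall(path=all_dir)

    one_files = sorted(os.listdir(one_dir))
    all_files = sorted(os.listdir(all_dir))
    with open(extracted_path) as f:
        extracted_text = f.read()

print(one_files, all_files)
```

Without `path`, both methods extract into the current working directory.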
## tarfile module

The `tarfile` module supports reading and writing `.tar` files.

For example, `file.txt` can be written to an archive like this:

In [15]:

```py
f = tarfile.open("file.txt.tar", "w")
f.add("file.txt")
f.close()

```

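The cell above only writes an uncompressed archive. As a hedged sketch, assuming Python 3, the mode `"w:gz"` writes a gzip-compressed tar instead, and the archive can be read back with `getnames` and `extractfile`. The temporary directory and the `.tar.gz` file name are assumptions chosen for illustration.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    source = os.path.join(tmpdir, "file.txt")
    with open(source, "w") as f:
        f.write("Lots of content here")

    archive = os.path.join(tmpdir, "file.txt.tar.gz")

    # "w:gz" writes a gzip-compressed tar; plain "w" leaves it uncompressed
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores the member under a short, relative name
        tar.add(source, arcname="file.txt")

    # "r:*" lets tarfile detect the compression automatically
    with tarfile.open(archive, "r:*") as tar:
        member_names = tar.getnames()
        # extractfile returns a file-like object for a regular member
        member_text = tar.extractfile("file.txt").read().decode("utf-8")

print(member_names, member_text)
```

Reading a member through `extractfile` avoids writing anything to disk, which is convenient when only the contents are needed.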
Clean up the generated files:

In [16]:

```py
os.remove("file.txt")
os.remove("file.txt.tar")
os.remove("files.zip")

```