Files
ailearning/docs/da/034.md
2020-10-19 21:08:55 +08:00

308 lines
3.8 KiB
Markdown
Raw Permalink Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# 数组排序
In [1]:
```py
%pylab
```
```py
Using matplotlib backend: Qt4Agg
Populating the interactive namespace from numpy and matplotlib
```
## sort 函数
先看这个例子:
In [2]:
```py
names = array(['bob', 'sue', 'jan', 'ad'])
weights = array([20.8, 93.2, 53.4, 61.8])
sort(weights)
```
Out[2]:
```py
array([ 20.8, 53.4, 61.8, 93.2])
```
`sort` 返回的结果是从小到大排列的。
## argsort 函数
`argsort` 返回从小到大的排列在数组中的索引位置:
In [3]:
```py
ordered_indices = argsort(weights)
ordered_indices
```
Out[3]:
```py
array([0, 2, 3, 1], dtype=int64)
```
可以用它来进行索引:
In [4]:
```py
weights[ordered_indices]
```
Out[4]:
```py
array([ 20.8, 53.4, 61.8, 93.2])
```
In [5]:
```py
names[ordered_indices]
```
Out[5]:
```py
array(['bob', 'jan', 'ad', 'sue'],
dtype='|S3')
```
使用函数并不会改变原来数组的值:
In [6]:
```py
weights
```
Out[6]:
```py
array([ 20.8, 93.2, 53.4, 61.8])
```
## sort 和 argsort 方法
数组也支持方法操作:
In [7]:
```py
data = array([20.8, 93.2, 53.4, 61.8])
data.argsort()
```
Out[7]:
```py
array([0, 2, 3, 1], dtype=int64)
```
`argsort` 方法与 `argsort` 函数的使用没什么区别,也不会改变数组的值。
In [8]:
```py
data
```
Out[8]:
```py
array([ 20.8, 93.2, 53.4, 61.8])
```
但是 `sort`方法会改变数组的值:
In [9]:
```py
data.sort()
```
In [10]:
```py
data
```
Out[10]:
```py
array([ 20.8, 53.4, 61.8, 93.2])
```
## 二维数组排序
对于多维数组sort方法默认沿着最后一维开始排序
In [11]:
```py
a = array([
[.2, .1, .5],
[.4, .8, .3],
[.9, .6, .7]
])
a
```
Out[11]:
```py
array([[ 0.2, 0.1, 0.5],
[ 0.4, 0.8, 0.3],
[ 0.9, 0.6, 0.7]])
```
对于二维数组,默认相当于对每一行进行排序:
In [12]:
```py
sort(a)
```
Out[12]:
```py
array([[ 0.1, 0.2, 0.5],
[ 0.3, 0.4, 0.8],
[ 0.6, 0.7, 0.9]])
```
改变轴,对每一列进行排序:
In [13]:
```py
sort(a, axis = 0)
```
Out[13]:
```py
array([[ 0.2, 0.1, 0.3],
[ 0.4, 0.6, 0.5],
[ 0.9, 0.8, 0.7]])
```
## searchsorted 函数
```py
searchsorted(sorted_array, values)
```
`searchsorted` 接受两个参数,其中,第一个必需是已排序的数组。
In [14]:
```py
sorted_array = linspace(0,1,5)
values = array([.1,.8,.3,.12,.5,.25])
```
In [15]:
```py
searchsorted(sorted_array, values)
```
Out[15]:
```py
array([1, 4, 2, 1, 2, 1], dtype=int64)
```
排序数组:
| 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| 0.0 | 0.25 | 0.5 | 0.75 | 1.0 |
数值:
| 值 | 0.1 | 0.8 | 0.3 | 0.12 | 0.5 | 0.25 |
| --- | --- | --- | --- | --- | --- | --- |
| 插入位置 | 1 | 4 | 2 | 1 | 2 | 1 |
`searchsorted` 返回的值相当于保持第一个数组的排序性质不变,将第二个数组中的值插入第一个数组中的位置:
例如 `0.1` 在 [0.0, 0.25) 之间,所以插入时应当放在第一个数组的索引 `1` 处,故第一个返回值为 `1`
In [16]:
```py
from numpy.random import rand
data = rand(100)
data.sort()
```
不加括号,默认是元组:
In [17]:
```py
bounds = .4, .6
bounds
```
Out[17]:
```py
(0.4, 0.6)
```
返回这两个值对应的插入位置:
In [18]:
```py
low_idx, high_idx = searchsorted(data, bounds)
```
利用插入位置,将数组中所有在这两个值之间的值提取出来:
In [19]:
```py
data[low_idx:high_idx]
```
Out[19]:
```py
array([ 0.41122674, 0.4395727 , 0.45609773, 0.45707137, 0.45772076,
0.46029997, 0.46757401, 0.47525517, 0.4969198 , 0.53068779,
0.55764166, 0.56288568, 0.56506548, 0.57003042, 0.58035233,
0.59279233, 0.59548555])
```