[3] Data Analysis - 1 - Data Processing - numpy - 3 - Basic Array Operations

August 21, 2018 py_module

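Creating
Inspecting
Indexing
Combining
Splitting
Copying
Iterating
Converting and serializing
How to reverse the rows and the whole array?
Reshaping and flattening multidimensional arrays
How to represent missing values and infinity?
How to compute mean, min and max on an ndarray
How to get the unique items and their counts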

1. Creating arrays

1.1 Creating a one-dimensional array (numpy array) from a list or tuple

>>> from numpy import *
>>> a = array( [2,3,4] )
>>> a
array([2, 3, 4])

>>> a.dtype
dtype('int32')

>>> b = array([1.2, 3.5, 5.1])
>>> b.dtype
dtype('float64')

# Create an 1d array from a list
import numpy as np
list1 = [0,1,2,3,4]
arr1d = np.array(list1)

# Print the array and its type
print(type(arr1d))
arr1d

#> <class 'numpy.ndarray'>
#> array([0, 1, 2, 3, 4])

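Note:

The main difference between an array and a list is that arrays are designed for vectorized operations, while Python lists are not. When you apply a function to an array, it is applied to every item in the array rather than to the array object as a whole.
Another characteristic is that once a numpy array has been created, its size cannot be increased. To grow it you have to create a new array. For a list, growing in place is perfectly natural.

A concrete example:

list1 + 2  # error: a list cannot be added to an int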

# Add 2 to each element of arr1d
arr1d + 2
#> array([2, 3, 4, 5, 6])
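To make the two notes above concrete, here is a minimal sketch. The names list_plus_2, arr_plus_2 and arr_longer are illustrative and not part of the original post. It contrasts the explicit loop a list needs with the element-wise arithmetic of an array, and shows that np.append returns a new array instead of growing the existing one.

```python
import numpy as np

list1 = [0, 1, 2, 3, 4]
arr1d = np.array(list1)

# A list needs an explicit loop (here a comprehension) to add 2 to each item
list_plus_2 = [x + 2 for x in list1]

# An array applies the operation to every element at once
arr_plus_2 = arr1d + 2

# np.append does not grow arr1d in place; it builds and returns a new array
arr_longer = np.append(arr1d, 5)

print(list_plus_2)                   # [2, 3, 4, 5, 6]
print(arr_plus_2)                    # [2 3 4 5 6]
print(arr1d.size, arr_longer.size)   # 5 6
```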

1.2 Creating a two-dimensional array from a list of lists or tuples

# Create a 2d array from a list of lists
list2 = [[0,1,2], [3,4,5], [6,7,8]]
arr2d = np.array(list2)
arr2d

#> array([[0, 1, 2],
#>        [3, 4, 5],
#>        [6, 7, 8]])

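You can also specify the data type with the dtype argument. Some of the most commonly used numpy dtypes are 'float', 'int', 'bool', 'str' and 'object'.

To control memory allocation more precisely, you can use types such as numpy.int32, numpy.int16 and numpy.float64. In the example below, the dtype of the created array is float64.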

# Create a float 2d array
arr2d_f = np.array(list2, dtype='float')
arr2d_f

#> array([[ 0.,  1.,  2.],
#>        [ 3.,  4.,  5.],
#>        [ 6.,  7.,  8.]])

The decimal point after each number indicates the float data type.
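As a rough illustration of how the dtype choice affects memory, the sketch below stores the same nine values with three dtypes and prints the bytes per element (itemsize) and the total size of the data buffer (nbytes). The variable names are assumptions made for this example.

```python
import numpy as np

list2 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# The same nine values stored with three different dtypes
arr_i16 = np.array(list2, dtype=np.int16)
arr_i32 = np.array(list2, dtype=np.int32)
arr_f64 = np.array(list2, dtype=np.float64)

for arr in (arr_i16, arr_i32, arr_f64):
    # itemsize is bytes per element, nbytes is the total for the data buffer
    print(arr.dtype, arr.itemsize, arr.nbytes)

# int16 2 18
# int32 4 36
# float64 8 72
```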

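1.3 Converting data types with astype

Converting scalar values

np.float64(42) # to float64
#42.0

np.int8(42.0)  # to int8
#42

np.bool_(42)   # to bool; the old np.bool alias was deprecated in NumPy 1.20
#True

np.bool_(42.0)  # to bool
#True

np.float64(True)  # to float; the old np.float alias was deprecated in NumPy 1.20 and later removed
#1.0

You can also convert an array to another data type with the astype method.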

# Convert to 'int' datatype
arr2d_f.astype('int')

#> array([[0, 1, 2],
#>        [3, 4, 5],
#>        [6, 7, 8]])

# Convert to int then to str datatype
arr2d_f.astype('int').astype('str')

#> array([['0', '1', '2'],
#>        ['3', '4', '5'],
#>        ['6', '7', '8']],
#>       dtype='U21')

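Unlike a list, a numpy array requires all of its items to have the same data type. This is another important difference.

However, if you are not sure what data type the array will hold, or you want to store characters and numbers in the same array, you can set dtype to 'object'.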

# Create a boolean array
arr2d_b = np.array([1, 0, 10], dtype='bool')
arr2d_b

#> array([ True, False,  True], dtype=bool)


# Create an object array to hold numbers as well as strings
arr1d_obj = np.array([1, 'a'], dtype='object')
arr1d_obj

#> array([1, 'a'], dtype=object)
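The following sketch shows what "one dtype per array" means in practice when the input is mixed: without dtype='object', numpy picks a single common type for all items. The variable names are made up for this illustration.

```python
import numpy as np

# Mixing ints and floats: everything is upcast to float
mixed_numbers = np.array([1, 2.5])

# Mixing a number and a string without dtype='object': everything becomes a string
mixed_strings = np.array([1, 'a'])

# With dtype='object' each item keeps its own Python type
mixed_objects = np.array([1, 'a'], dtype='object')

print(mixed_numbers.dtype)        # float64
print(mixed_strings.dtype.kind)   # U (a unicode string dtype)
print(mixed_objects[0] + 1)       # 2, because the first item is still an int
```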

1.4 Converting an ndarray to a list

# Convert an array back to a list
arr1d_obj.tolist()

#> [1, 'a']

1.5 Creating ndarrays with NumPy functions such as arange, ones, zeros and linspace

Using numpy.arange

>>> print(np.arange(15))
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
>>> print(type(np.arange(15)))
<class 'numpy.ndarray'>

>>> print(np.arange(15).reshape(3,5))
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
>>> print(type(np.arange(15).reshape(3,5)))
<class 'numpy.ndarray'>

>>> arange( 10, 30, 5 )
array([10, 15, 20, 25])
>>> arange( 0, 2, 0.3 ) # it accepts float arguments
array([ 0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

np.zeros, np.ones and np.empty: arrays filled with zeros, with ones, or with arbitrary (uninitialized) values

>>> np.zeros( (3,4) )
array([[0.,  0.,  0.,  0.],
       [0.,  0.,  0.,  0.],
       [0.,  0.,  0.,  0.]])
>>> np.ones( (2,3,4), dtype=np.int16 )                # dtype can also be specified
array([[[ 1, 1, 1, 1],
        [ 1, 1, 1, 1],
        [ 1, 1, 1, 1]],
       [[ 1, 1, 1, 1],
        [ 1, 1, 1, 1],
        [ 1, 1, 1, 1]]], dtype=int16)
>>> np.empty( (2,3) )                                 # contents are whatever happened to be in memory
array([[  3.73603959e-262,   6.02658058e-154,   6.55490914e-260],
       [  5.30498948e-313,   3.14673309e-307,   1.00000000e+000]])

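Create an array of ones with the same shape as array a

np.ones_like(a)

Create an array of zeros with the same shape as array a

np.zeros_like(a)

Create an array with the same shape as array a in which every element equals val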

np.full_like(a,val)
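A short sketch of the three *_like functions above, assuming a small integer array a chosen for illustration. One detail worth knowing is that the result inherits the dtype of a unless dtype is passed, so a fractional fill value is cast to an integer.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)        # integer array of shape (2, 3)

ones = np.ones_like(a)                # same shape and dtype as a, filled with 1
zeros = np.zeros_like(a)              # same shape and dtype as a, filled with 0
sevens = np.full_like(a, 7)           # same shape and dtype as a, filled with 7

# The fill value is cast to a's integer dtype, so 0.1 becomes 0
cast_fill = np.full_like(a, 0.1)

# Passing dtype explicitly keeps the fractional value
float_fill = np.full_like(a, 0.1, dtype=float)

print(sevens)
print(cast_fill.dtype, float_fill.dtype)
```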

linspace fills in equally spaced values between a start and an end value to form an array

np.linspace()

>>> linspace( 0, 2, 9 ) # 9 numbers from 0 to 2
array([ 0. , 0.25, 0.5 , 0.75, 1. , 1.25, 1.5 , 1.75, 2. ])

>>> x = linspace( 0, 2*pi, 100 ) # useful to evaluate function at lots of points
>>> f = sin(x)

# Start at 1 and end at 50
np.linspace(start=1, stop=50, num=10, dtype=int)

#> array([ 1,  6, 11, 17, 22, 28, 33, 39, 44, 50])

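Note that because dtype was explicitly forced to int, the values are truncated and are therefore not equally spaced.

logspace

Similar to np.linspace, there is also np.logspace, which rises on a logarithmic scale. In np.logspace the sequence actually starts at base^start and ends at base^stop, where base defaults to 10.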

# Limit the number of digits after the decimal to 2
np.set_printoptions(precision=2)  

# Start at 10^1 and end at 10^50
np.logspace(start=1, stop=50, num=10, base=10) 

#> array([  1.00e+01,   2.78e+06,   7.74e+11,   2.15e+17,   5.99e+22,
#>          1.67e+28,   4.64e+33,   1.29e+39,   3.59e+44,   1.00e+50])
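The relationship between the two functions can be checked directly. This sketch, using assumed variable names, raises 10 to the exponents produced by np.linspace and compares the result with np.logspace.

```python
import numpy as np

# Equally spaced exponents from 1 to 50
exponents = np.linspace(start=1, stop=50, num=10)

# Raising the base to those exponents by hand
via_power = 10.0 ** exponents

# np.logspace does the same thing in one call
via_logspace = np.logspace(start=1, stop=50, num=10, base=10)

print(np.allclose(via_power, via_logspace))   # True
```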

Creating repeating sequences with tile and repeat

np.tile repeats a whole list or array n times, whereas np.repeat repeats each item n times.

a = [1,2,3] 

# Repeat whole of 'a' two times
print('Tile:   ', np.tile(a, 2))

# Repeat each element of 'a' two times
print('Repeat: ', np.repeat(a, 2))

#> Tile:    [1 2 3 1 2 3]
#> Repeat:  [1 1 2 2 3 3]

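1.6 The main differences between an array and a list

Arrays support vectorized operations, while lists do not.
Once an array is created, you cannot change its size. You have to create a new array or overwrite the existing one.
Every array has one and only one dtype. All items in it should be of that type.
An equivalent numpy array occupies much less space than a Python list of lists.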
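The last point can be estimated roughly on CPython. The sketch below is only an approximation under stated assumptions: sys.getsizeof reports the list object and each int object separately, and the exact numbers depend on the platform and Python version.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values)

# Rough CPython estimate: the list object plus every int object it points to
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# The array stores raw fixed-size numbers in one contiguous buffer
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```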

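2. Inspecting the size and shape of a numpy array

Consider the array arr2d. Since it was created from a list of lists, it has 2 dimensions and can be displayed as rows and columns, like a matrix.

Had it been created from a list of lists of lists, it would have 3 dimensions, like a cube.

Suppose you are handed a numpy array that you did not create yourself. What would you want to explore in order to get to know it?

Whether it is a 1D, a 2D or a higher-dimensional array (ndim)
How many items are present in each dimension (shape)
What its data type is (dtype)
The total number of items in it (size)
A sample of the first few items in the array (through indexing)

Example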

# Create a 2d array with 3 rows and 4 columns
list2 = [[1, 2, 3, 4],[3, 4, 5, 6], [5, 6, 7, 8]]
arr2 = np.array(list2, dtype='float')
arr2

#> array([[ 1.,  2.,  3.,  4.],
#>        [ 3.,  4.,  5.,  6.],
#>        [ 5.,  6.,  7.,  8.]])
# shape
print('Shape: ', arr2.shape)

# dtype
print('Datatype: ', arr2.dtype)

# size
print('Size: ', arr2.size)

# ndim
print('Num Dimensions: ', arr2.ndim)

#> Shape:  (3, 4)
#> Datatype:  float64
#> Size:  12
#> Num Dimensions:  2
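The checklist above can be wrapped in a small helper. describe is a hypothetical function written for this sketch, not a numpy API. The second part illustrates the earlier remark that a list of lists of lists yields a 3-dimensional array.

```python
import numpy as np

def describe(arr, n=3):
    """Collect the basic facts about an array (hypothetical helper)."""
    return {
        'ndim': arr.ndim,
        'shape': arr.shape,
        'dtype': str(arr.dtype),
        'size': arr.size,
        'first_items': arr.ravel()[:n].tolist(),   # sample obtained by indexing
    }

arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]], dtype='float')
info = describe(arr2)
print(info)

# A list of lists of lists produces three dimensions
cube = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(cube.ndim, cube.shape)   # 3 (2, 2, 2)
```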

3. Indexing and slicing

3.1 One-dimensional arrays

>>> a = arange(10)**3
>>> a
array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])

>>> a[2]
8

>>> a[2:5]
array([ 8, 27, 64])

>>> a[:6:2] = -1000    
# equivalent to a[0:6:2] = -1000; from start to position 6, exclusive, set every 2nd element to -1000

>>> a
array([-1000,     1, -1000,    27, -1000,   125,   216,   343,   512,   729])

>>> a[ : :-1]                                 # reversed a
array([  729,   512,   343,   216,   125, -1000,    27, -1000,     1, -1000])

>>> for i in a:
...         print('%.1f' % (i**(1/3.)), end=' ')   # negative values give nan (with a RuntimeWarning)
...
nan 1.0 nan 3.0 nan 5.0 6.0 7.0 8.0 9.0

3.2 Two-dimensional arrays

>>> def f(x,y):
...         return 10*x+y
...

>>> b = fromfunction(f,(5,4),dtype=int)
>>> b
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

>>> b[2,3]
23

>>> b[0:5, 1]                       # each row in the second column of b
array([ 1, 11, 21, 31, 41])

>>> b[ : ,1]                        # equivalent to the previous example
array([ 1, 11, 21, 31, 41])

>>> b[1:3, : ]                      # each column in the second and third row of b
array([[10, 11, 12, 13],
       [20, 21, 22, 23]])


>>> b[-1]                        # the last row. Equivalent to b[-1,:]
array([40, 41, 42, 43])

3.3 Using the ellipsis (...)

>>> c = array( [ [[  0,  1,  2],      # a 3D array (two stacked 2D arrays)
...               [ 10, 12, 13]],
...
...              [[100,101,102],
...               [110,112,113]] ] )

>>> c.shape
(2, 2, 3)

>>> c[1,...]                                   # same as c[1,:,:] or c[1]
array([[100, 101, 102],
       [110, 112, 113]])

>>> c[...,2]                                   # same as c[:,:,2]
array([[  2,  13],
       [102, 113]])
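The equivalences stated in the comments above can be verified with np.array_equal. This is a minimal sketch on a small 3-D array chosen for illustration. The ellipsis stands for as many full slices (:) as are needed to complete the index.

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)   # a small 3-D array

# A leading index followed by ... selects along the first axis
first_axis = np.array_equal(c[1, ...], c[1, :, :])

# ... followed by an index selects along the last axis
last_axis = np.array_equal(c[..., 2], c[:, :, 2])

print(first_axis, last_axis)   # True True
print(c[..., 2].shape)         # (2, 3)
```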

Printing row by row (each row is a 1-D array)

>>> for row in b:
...         print(row)
...
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]

Printing every element

>>> for element in b.flat:
...         print(element, end=' ')
...
0 1 2 3 10 11 12 13 20 21 22 23 30 31 32 33 40 41 42 43

3.4 More examples

arr2

#> array([[ 1.,  2.,  3.,  4.],
#>          [ 3.,  4.,  5.,  6.],
#>          [ 5.,  6.,  7.,  8.]])

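You can extract specific portions of an array using indexing that starts at 0, much as you would with a Python list.

Unlike lists, however, numpy arrays can optionally accept as many parameters in the square brackets as there are dimensions.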

# Extract the first 2 rows and columns
arr2[:2, :2]
list2[:2, :2]  # error

#> array([[ 1.,  2.],
#>        [ 3.,  4.]])

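In addition, numpy arrays support boolean indexing.

A boolean index array has the same shape as the array to be filtered and contains only True and False values. The values at the True positions are kept in the output.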

# Get the boolean output by applying the condition to each element.
b = arr2 > 4
b

#> array([[False, False, False, False],
#>        [False, False,  True,  True],
#>        [ True,  True,  True,  True]], dtype=bool)
arr2[b]

#> array([ 5.,  6.,  5.,  6.,  7.,  8.])
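Boolean masks can also be combined and used for assignment. The sketch below assumes the same values as arr2 and uses illustrative names (mask, selected, capped). Note that conditions are combined with & and |, and each condition needs its own parentheses.

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]], dtype='float')

# Combine two conditions element-wise with & (and); | would mean or
mask = (arr2 > 2) & (arr2 < 6)
selected = arr2[mask]
print(selected)            # [3. 4. 3. 4. 5. 5.]

# A mask can also be the target of an assignment: cap every value above 6
capped = arr2.copy()       # work on a copy so arr2 stays untouched
capped[capped > 6] = 6
print(capped.max())        # 6.0
```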

4. Combining arrays (functions)

# Create two test arrays

a = np.arange(9).reshape(3,3)    
''''' 
 array([[0, 1, 2],   
           [3, 4, 5],   
           [6, 7, 8]])   
'''  
b = 2 * a  
''''' 
array([[ 0, 2, 4],   
       [ 6, 8, 10],   
       [12, 14, 16]])   
 
'''

# Horizontal stacking

np.hstack((a, b))   
''''' 
array([[ 0, 1, 2, 0, 2, 4],   
       [ 3, 4, 5, 6, 8, 10],   
       [ 6, 7, 8, 12, 14, 16]])  
'''  
  
# Or use concatenate and specify the axis (axis=1 horizontal, axis=0 vertical)
con = np.concatenate((a, b), axis=1)
''''' 
array([[ 0, 1, 2, 0, 2, 4],   
       [ 3, 4, 5, 6, 8, 10],   
       [ 6, 7, 8, 12, 14, 16]])
'''

# Vertical stacking

np.vstack((a, b))  
''''' 
array([[ 0, 1, 2],   
       [ 3, 4, 5],   
       [ 6, 7, 8],    
       [ 0, 2, 4],   
       [ 6, 8, 10],   
       [12, 14, 16]])   
 
'''  
# Or use concatenate
con = np.concatenate((a, b), axis=0)

# Depth stacking with dstack: combines the arrays along the third axis (depth), producing a new 3-D array

np.dstack((a, b))    
''''' 
array([[[ 0, 0],   
        [ 1, 2],   
        [ 2, 4]],   
   
       [[ 3, 6],   
        [ 4, 8],   
        [ 5, 10]],   
   
       [[ 6, 12],   
        [ 7, 14],   
        [ 8, 16]]])   
'''

# Row stacking: combines several 1-D arrays, each becoming one row of the new array. For 2-D arrays it works just like vertical stacking.

one = np.arange(2)  
''''' 
array([0, 1]) 
'''  
  
two = one + 2   
''''' 
array([2, 3]) 
'''  
  
np.row_stack((one, two))  
''''' 
 
array([[0, 1],   
       [2, 3]])  
'''

# Column stacking (for 2-D arrays it works just like horizontal stacking)

np.column_stack((one, two))
''''' 
array([[0, 2],   
       [1, 3]])   
'''
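A quick way to keep the stacking functions apart is to compare the shapes they produce. The sketch below reuses the test arrays from this section. It uses np.vstack for the row-wise case because, as far as I know, np.row_stack is a deprecated alias of np.vstack in recent NumPy releases.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = 2 * a

print(np.hstack((a, b)).shape)   # (3, 6): columns are appended
print(np.vstack((a, b)).shape)   # (6, 3): rows are appended
print(np.dstack((a, b)).shape)   # (3, 3, 2): a new third axis holds the pair

# hstack and vstack are concatenate along axis 1 and axis 0 respectively
same_h = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))
same_v = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))

# For 1-D inputs: vstack makes each array a row, column_stack makes each a column
one = np.arange(2)
two = one + 2
rows = np.vstack((one, two))          # [[0, 1], [2, 3]]
cols = np.column_stack((one, two))    # [[0, 2], [1, 3]]
print(same_h, same_v)
```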

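5. Splitting arrays

NumPy provides hsplit, vsplit, dsplit and split for splitting arrays. An array can be split into sub-arrays of equal size, or at specified positions of the original array.

# Horizontal splitting

a = np.arange(9).reshape(3,3)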
''''' 
array([[0, 1, 2],   
       [3, 4, 5],   
       [6, 7, 8]])  
'''  
  
np.hsplit(a, 3)   
''''' 
[array([[0],   
       [3],   
       [6]]),   
 array([[1],   
       [4],   
       [7]]),   
 array([[2],   
       [5],   
       [8]])]  
 
'''  
# Method 2: use split with axis=1
np.split(a, 3, axis=1)

# Vertical splitting: splits the array along the vertical axis (row by row)

np.vsplit(a, 3)    
''''' 
[array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])]  
'''  
  
# Method 2
# Use split with axis=0
np.split(a, 3, axis=0)

# Depth-wise splitting: dsplit splits along the depth (third) axis

c = np.arange(27).reshape(3, 3, 3)
  
''''' 
array([[[ 0,  1,  2],   
        [ 3,  4,  5],   
        [ 6,  7,  8]],   
   
       [[ 9, 10, 11],   
        [12, 13, 14],   
        [15, 16, 17]],   
   
       [[18, 19, 20],   
        [21, 22, 23],   
        [24, 25, 26]]])   
 
'''  
  
np.dsplit(c, 3)   
''''' 
[array([[[ 0],   
        [ 3],   
        [ 6]],   
   
       [[ 9],   
        [12],   
        [15]],   
   
       [[18],   
        [21],   
        [24]]]),   
 array([[[ 1],   
        [ 4],   
        [ 7]],   
   
       [[10],   
        [13],   
        [16]],   
   
       [[19],   
        [22],   
        [25]]]),   
 array([[[ 2],   
        [ 5],   
        [ 8]],   
   
       [[11],   
        [14],   
        [17]],   
   
       [[20],   
        [23],   
        [26]]])]   
'''
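The examples in this section all split into equal parts. Splitting at specified positions, which the introduction mentions, works by passing a list of indices instead of a count. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Split the columns at index 1: columns [0:1] and columns [1:3]
left, right = np.hsplit(a, [1])
print(left.shape, right.shape)     # (3, 1) (3, 2)

# Split a 1-D array at positions 3 and 7: [0:3], [3:7] and [7:]
parts = np.split(np.arange(10), [3, 7])
print([p.tolist() for p in parts])
# [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]
```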

6. Copies and views of arrays

Checking whether two arrays share memory is also a direct way to tell whether the data was copied or is a view.

# Method 1:
a = np.arange(50)
b = a.reshape((5, 10))
print(b.base is a)

# Method 2:
print(np.may_share_memory(a, b))

# Method 3:
print(b.flags['OWNDATA'])  # False: b is a view
e = np.ravel(b[:, 2])
print(e.flags['OWNDATA'])  # True: e is a new numpy object that owns its data

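Copies and views

1.) No copy at all: simple assignment copies neither the array object nor its data.

a = np.arange(12)
b = a      # no new object is created
b is a     # a and b are two names for the same array object
#  True

b.shape = 3,4    # also changes the shape of a
print(a.shape)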
''''' 
(3, 4) 
'''

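2.) Using view: the view method creates a new array object that points to the same data.

In fact, no data type is fixed; it all depends on how a region of memory is interpreted. numpy.ndarray.view offers different ways of slicing up a memory region to convert between data types without making an extra copy of the data, which saves memory.

c = a.view()
c is a
# False

c.base is a      # c is a view of the data owned by a
# True

c.shape = 2,6    # the shape of a does not change
print(a.shape)  # (3, 4)

c[0,4] = 1234        # the data of a changes
print(a)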
''''' 
array([[   0,    1,    2,    3],   
       [1234,    5,    6,    7],   
       [   8,    9,   10,   11]]) 
'''

3.) Slicing an array returns a view of it

s = a[ : , 1:3]     # the elements at positions 1 and 2 of every row
s[:] = 10           # s[:] is a view of s. Note the difference between s = 10 and s[:] = 10
print(a)
''''' 
array([[   0,   10,   10,    3],   
       [1234,   10,   10,    7],   
       [   8,   10,   10,   11]])   
'''

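4.) Deep copy: this method makes a complete copy of the array and its data.

d = a.copy()       # creates a new array object with new data
d is a
#False

d.base is a        # d and a no longer have anything to do with each other
#False

d[0,0] = 9999
print(a)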
''''' 
array([[   0,   10,   10,    3],   
       [1234,   10,   10,    7],   
       [   8,   10,   10,   11]])  
'''
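The four cases can be summarized with np.shares_memory, which reports whether two arrays overlap in memory. The sketch below uses assumed names (view, copied, s) and also spells out the difference between s = 10 and s[:] = 10 mentioned above.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

view = a[:, 1:3]            # slicing returns a view
copied = a[:, 1:3].copy()   # copy() returns independent data

print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False

# s[:] = 10 writes through the view into a
s = a[:, 1:3]
s[:] = 10
print(a[0].tolist())        # [0, 10, 10, 3]

# s = 10 only rebinds the name s to the int 10; a is not touched
s = 10
print(a[0].tolist())        # [0, 10, 10, 3]

# Writing into the copy never affects a
copied[:] = -1
print(a[0].tolist())        # [0, 10, 10, 3]
```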

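5.) Application in image processing

When the three channels of an input image need the same processing, using cv2.split and cv2.merge wastes resources, because the processing is identical whichever channel the data belongs to. Instead, we can use a view to turn the image into a one-dimensional array and process that, which costs no extra memory or time.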

def createFlatView(array):    
    """Return a 1D view of an array of any dimensionality."""    
    flatView = array.view()    
    flatView.shape = array.size    
    return flatView
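The same idea can be sketched without cv2. Here image is a hypothetical array standing in for a loaded 4x5 three-channel picture, and reshape(-1) is used in place of the helper above, since for a contiguous array it also returns a view rather than a copy.

```python
import numpy as np

# Hypothetical 4x5 image with 3 channels, all pixels black
image = np.zeros((4, 5, 3), dtype=np.uint8)

# For a contiguous array, reshape(-1) gives a 1-D view of the same memory
flat = image.reshape(-1)
print(np.shares_memory(image, flat))   # True

# Invert every channel of every pixel in one pass through the flat view
flat[:] = 255 - flat

# The change is visible in the original 3-D image
print(image.min(), image.max())        # 255 255
```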

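How to create a new array from an existing array

If you just assign a portion of an array to another array, the new array actually refers to the parent array in memory. This means that any change you make to the new array is also reflected in the parent array. To avoid disturbing the parent array, make a copy of it with copy(). All numpy arrays come with the copy() method.

# Assign portion of arr2 to arr2a. Doesn't really create a new array.
# Note: the -1. entries in the outputs below come from the nan/inf replacement of
# section 11, which appears to have been applied to arr2 before this example was run.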
arr2a = arr2[:2,:2]  
arr2a[:1, :1] = 100  # 100 will reflect in arr2
arr2

#> array([[ 100.,    2.,    3.,    4.],
#>        [   3.,   -1.,   -1.,    6.],
#>        [   5.,    6.,    7.,    8.]])

# Copy portion of arr2 to arr2b
arr2b = arr2[:2, :2].copy()
arr2b[:1, :1] = 101  # 101 will not reflect in arr2
arr2

#> array([[ 100.,    2.,    3.,    4.],
#>        [   3.,   -1.,   -1.,    6.],
#>        [   5.,    6.,    7.,    8.]])

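7. Iterating over arrays

a = np.arange(9).reshape(3,3)
for row in a:
  print(row)
''''' 
[0 1 2] 
[3 4 5] 
[6 7 8] 
 
'''  
  
# To process every element of the array, use the flat attribute, which is an iterator over the array's elements:
for element in a.flat:
  print(element, end=' ')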
''''' 
0 1 2 3 4 5 6 7 8 
'''

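8. Serializing and deserializing arrays

Serialization is the process of converting an object's state into a form that can be persisted or transmitted. Its counterpart is deserialization, which converts a stream back into an object. Together, the two processes make data easy to store and transmit.

Python provides pickle (and cPickle in Python 2) for object serialization and deserialization.

Here we use the functions provided by numpy.

# Predefine the field names and types of the data
table = np.loadtxt('example.txt', dtype={'names': ('ID', 'Result', 'Type'),
                                         'formats': ('S4', 'f4', 'i2')})
np.savetxt('somenewfile.txt', table, fmt='%s')  # serialize to text
# Load and save binary files
data = np.empty((1000, 1000))
np.save('test.npy', data)
np.savez('test.npz', data)  # bundles arrays into an uncompressed .npz archive; np.savez_compressed compresses
newdata = np.load('test.npy')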
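Because the snippet above depends on an example.txt file that is not shown, here is a self-contained round trip as a sketch. It writes into a temporary directory (an assumption made so the example leaves no files behind) and checks that each format returns the data that was saved.

```python
import os
import tempfile
import numpy as np

data = np.arange(6, dtype=float).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, 'test.npy')
    txt_path = os.path.join(tmp, 'test.txt')
    npz_path = os.path.join(tmp, 'test.npz')

    # Binary .npy: keeps dtype and shape exactly
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

    # Plain text: human readable, values are written with the given format
    np.savetxt(txt_path, data, fmt='%.4f')
    from_txt = np.loadtxt(txt_path)

    # Compressed archive: arrays are stored under the keyword names given here
    np.savez_compressed(npz_path, data=data)
    with np.load(npz_path) as archive:
        from_npz = archive['data']

print(np.array_equal(data, from_npy),
      np.array_equal(data, from_txt),
      np.array_equal(data, from_npz))   # True True True
```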

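9. How to reverse the rows and the whole array?

Reversing an array works just as it does for lists, but to reverse it completely you need to do it along all axes (dimensions).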

# Reverse only the row positions
arr2[::-1, ]

#> array([[ 5.,  6.,  7.,  8.],
#>        [ 3.,  4.,  5.,  6.],
#>        [ 1.,  2.,  3.,  4.]])


# Reverse the row and column positions
arr2[::-1, ::-1]

#> array([[ 8.,  7.,  6.,  5.],
#>        [ 6.,  5.,  4.,  3.],
#>        [ 4.,  3.,  2.,  1.]])
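As an alternative spelling of the same operations, np.flip reverses along a given axis, or along all axes when no axis is passed. The sketch below, with illustrative names, checks that it matches the slicing shown above.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Reverse only the row order
rows_reversed = np.flip(arr, axis=0)

# Reverse along every axis
fully_reversed = np.flip(arr)

print(np.array_equal(rows_reversed, arr[::-1, ]))        # True
print(np.array_equal(fully_reversed, arr[::-1, ::-1]))   # True
```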

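10. Reshaping and flattening multidimensional arrays

Reshaping changes the arrangement of the items so that the shape of the array changes while the total number of elements stays the same.
Flattening converts a multidimensional array into a flat 1-D array, and not into any other shape.

First, let's reshape the arr2 array from 3x4 to 4x3.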

# Reshape a 3x4 array to 4x3 array
arr2.reshape(4, 3)

#> array([[ 100.,    2.,    3.],
#>        [   4.,    3.,   -1.],
#>        [  -1.,    6.,    5.],
#>        [   6.,    7.,    8.]])

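Difference between flatten() and ravel()

There are two popular ways to flatten an array: flatten() and ravel().

The difference between ravel and flatten is that the new array created with ravel is actually a reference to the parent array (a view, whenever one can be made). Any change to the new array therefore also affects the parent. In return it is memory efficient, since it does not create a copy.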

# Flatten it to a 1d array
arr2.flatten()

#> array([ 100.,    2.,    3.,    4.,    3.,   -1.,   -1.,    6.,    5., 6.,    7.,    8.])
# Changing the flattened array does not change parent

b1 = arr2.flatten()  
b1[0] = 100  # changing b1 does not affect arr2
arr2

#> array([[ 100.,    2.,    3.,    4.],
#>        [   3.,   -1.,   -1.,    6.],
#>        [   5.,    6.,    7.,    8.]])

# Changing the raveled array changes the parent also.
b2 = arr2.ravel()  
b2[0] = 101  # changing b2 changes arr2 also
arr2

#> array([[ 101.,    2.,    3.,    4.],
#>        [   3.,   -1.,   -1.,    6.],
#>        [   5.,    6.,    7.,    8.]])
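One caveat worth checking: ravel returns a view only when the memory layout allows it. In this sketch (variable names are illustrative) the transposed array is not contiguous in row-major order, so ravel has to copy, just as flatten always does.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Contiguous array: ravel returns a view onto the same memory
r_view = arr.ravel()
print(np.shares_memory(arr, r_view))   # True

# Transposed array: the elements are not laid out in row-major order,
# so ravel must build a new array
r_copy = arr.T.ravel()
print(np.shares_memory(arr, r_copy))   # False

# flatten always returns a copy
f_copy = arr.flatten()
print(np.shares_memory(arr, f_copy))   # False
```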

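Dimension transformations of an ndarray

.reshape(shape)      returns the reshaped array; does not modify the original array
.resize(shape)       modifies the original array in place
.swapaxes(ax1,ax2)   swaps two of the array's n dimensions
.flatten()           reduces the array to one dimension and returns the flattened 1-D array; the original array is unchanged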
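A short sketch of the four methods side by side, using small arrays chosen for illustration. The array d is created separately because resize works in place and is meant to be used on an array that no other array refers to.

```python
import numpy as np

a = np.arange(6)

# reshape returns a new array object; a keeps its original shape
b = a.reshape(2, 3)
print(a.shape, b.shape)          # (6,) (2, 3)

# resize changes the array itself and returns None
d = np.arange(6)
result = d.resize((3, 2))
print(d.shape, result)           # (3, 2) None

# swapaxes exchanges two axes: shape (2, 3, 4) becomes (4, 3, 2)
t = np.arange(24).reshape(2, 3, 4)
print(t.swapaxes(0, 2).shape)    # (4, 3, 2)

# flatten returns a 1-D copy; b is unchanged
flat = b.flatten()
print(flat.shape, b.shape)       # (6,) (2, 3)
```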

11. How to represent missing values and infinity?

Missing values can be represented with the np.nan object, while np.inf represents infinity. Let's put some of them into arr2.

# Insert a nan and an inf
arr2[1,1] = np.nan  # not a number
arr2[1,2] = np.inf  # infinite
arr2

#> array([[  1.,   2.,   3.,   4.],
#>        [  3.,  nan,  inf,   6.],
#>        [  5.,   6.,   7.,   8.]])


# Replace nan and inf with -1. Don't use arr2 == np.nan

missing_bool = np.isnan(arr2) | np.isinf(arr2)
arr2[missing_bool] = -1  
arr2

#> array([[ 1.,  2.,  3.,  4.],
#>        [ 3., -1., -1.,  6.],
#>        [ 5.,  6.,  7.,  8.]])
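The warning against arr2 == np.nan deserves a short demonstration: nan never compares equal to anything, including itself, so the comparison finds nothing. This sketch uses an illustrative 1-D array and also shows np.isfinite, which covers nan and both infinities in one test.

```python
import numpy as np

arr = np.array([1.0, np.nan, np.inf, -np.inf, 4.0])

# Comparing with == never matches nan
print((arr == np.nan).any())    # False

# np.isnan finds it
print(np.isnan(arr).tolist())   # [False, True, False, False, False]

# np.isfinite is False for nan, inf and -inf alike
finite = np.isfinite(arr)
print(finite.tolist())          # [True, False, False, False, True]

# Replace every non-finite entry with -1 without modifying arr
cleaned = np.where(finite, arr, -1)
print(cleaned.tolist())         # [1.0, -1.0, -1.0, -1.0, 4.0]
```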

12. How to compute mean, min and max on an ndarray

An ndarray has methods of its own for computing these over the whole array.

# mean, max and min
print("Mean value is: ", arr2.mean())
print("Max value is: ", arr2.max())
print("Min value is: ", arr2.min())

#> Mean value is:  3.58333333333
#> Max value is:  8.0
#> Min value is:  -1.0

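arr2.max(0)  # maximum along axis=0 of an n-dimensional array; arr2.min(0) gives the minimum

However, to compute the minimum row-wise or column-wise, you can also use the np.amin version.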

# Row wise and column wise min
print("Column wise minimum: ", np.amin(arr2, axis=0))
print("Row wise minimum: ", np.amin(arr2, axis=1))

#> Column wise minimum:  [ 1. -1. -1.  4.]
#> Row wise minimum:  [ 1. -1.  5.]

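Computing the row-wise minimum is fine. But what if you want to apply some other computation or function row by row? That can be done with np.apply_along_axis, which you will see in an upcoming topic.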

# Cumulative Sum
np.cumsum(arr2)

#> array([  1.,   3.,   6.,  10.,  13.,  12.,  11.,  17.,  22.,  28.,  35., 43.])
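As a preview of applying a custom function per row or per column, here is a minimal sketch with np.apply_along_axis. The function value_range is a hypothetical example, and the array repeats the values arr2 has at this point in the section.

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

def value_range(v):
    """Spread between the largest and smallest value of a 1-D slice."""
    return v.max() - v.min()

# axis=1: the function receives one row at a time
row_ranges = np.apply_along_axis(value_range, 1, arr2)
print(row_ranges)   # [3. 7. 3.]

# axis=0: the function receives one column at a time
col_ranges = np.apply_along_axis(value_range, 0, arr2)
print(col_ranges)   # [4. 7. 8. 4.]
```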

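13. How to get the unique items and their counts

The np.unique method returns the unique items. If you also want the number of times each item is repeated, set the return_counts parameter to True.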

# Create random integers of size 10 between [0,10)
np.random.seed(100)
arr_rand = np.random.randint(0, 10, size=10)
print(arr_rand)

#> [8 8 3 7 7 0 4 2 5 2]
# Get the unique items and their counts
uniqs, counts = np.unique(arr_rand, return_counts=True)
print("Unique items : ", uniqs)
print("Counts       : ", counts)

#> Unique items :  [0 2 3 4 5 7 8]
#> Counts       :  [1 2 1 1 1 2 2]
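The two returned arrays line up position by position, which makes them easy to combine. The sketch below hard-codes the values printed above (so it does not depend on the random seed) and uses illustrative names to build a value-to-count mapping and pick the most frequent item.

```python
import numpy as np

arr_rand = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])
uniqs, counts = np.unique(arr_rand, return_counts=True)

# Pair each unique value with its count
count_by_value = dict(zip(uniqs.tolist(), counts.tolist()))
print(count_by_value)
# {0: 1, 2: 2, 3: 1, 4: 1, 5: 1, 7: 2, 8: 2}

# argmax returns the first position of the largest count;
# with ties (2, 7 and 8 all appear twice) the smallest value wins
most_common = uniqs[counts.argmax()]
print(most_common)   # 2
```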

References

https://blog.csdn.net/mokeding/article/details/17476979
Course slides by Song Tian, Beijing Institute of Technology
https://www.machinelearningplus.com/python/numpy-tutorial-part1-array-python-examples/
