【3.3.5】Pandas: DataFrame.dropna
Remove missing values (NaN, None, NaT) from a DataFrame.
DataFrame.dropna(self, axis=0, how='any', thresh=None, subset=None, inplace=False)
1. Parameters
Name | Description | Type / Default Value | Required / Optional |
---|---|---|---|
axis | Determines whether rows or columns that contain missing values are removed. 0 or 'index': drop rows that contain missing values. 1 or 'columns': drop columns that contain missing values. | {0 or 'index', 1 or 'columns'}; default 0 | Optional |
how | Determines whether a row or column is removed when it has at least one NA or only when it is all NA. 'any': drop the row or column if any NA value is present. 'all': drop the row or column only if all of its values are NA. | {'any', 'all'}; default 'any' | Optional |
thresh | Minimum number of non-NA values a row or column must have to be kept. In recent pandas versions it cannot be combined with how. | int; default None | Optional |
subset | Labels along the other axis to consider, e.g. when dropping rows this is the list of columns to check for missing values. | array-like; default None | Optional |
inplace | If True, perform the operation in place and return None. | bool; default False | Optional |
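The difference between how='any' and how='all' can be checked against a boolean mask. The snippet below is a minimal sketch on a small made-up frame (the columns a and b are illustrative, not from the examples that follow); it assumes the default axis=0.

```python
import numpy as np
import pandas as pd

# Illustrative data: row 0 is complete, row 1 is partly missing,
# row 2 is entirely missing.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 5.0, np.nan],
})

# how='any': keep only the rows in which every value is present.
any_result = df.dropna(how="any")
any_mask = df[df.notna().all(axis=1)]

# how='all': keep the rows that have at least one value present.
all_result = df.dropna(how="all")
all_mask = df[df.notna().any(axis=1)]

print(any_result.index.tolist())
print(all_result.index.tolist())
print(any_result.equals(any_mask), all_result.equals(all_mask))
```

Seen this way, 'any' is the strict setting and 'all' is the lenient one.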
2. Examples
import numpy as np
import pandas as pd
df = pd.DataFrame({"name": ['Superman', 'Batman', 'Spiderman'],
                   "toy": [np.nan, 'Batmobile', 'Spiderman toy'],
                   "born": [pd.NaT, pd.Timestamp("1956-06-26"),
                            pd.NaT]})
df
Output:
        name            toy       born
0   Superman            NaN        NaT
1     Batman      Batmobile 1956-06-26
2  Spiderman  Spiderman toy        NaT
2.1 Drop the rows where at least one element is missing:
df.dropna()
     name        toy       born
1  Batman  Batmobile 1956-06-26
2.2 Drop the columns where at least one element is missing:
df.dropna(axis='columns')
        name
0   Superman
1     Batman
2  Spiderman
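As a quick check, axis='columns' and axis=1 are two spellings of the same thing, and the result matches selecting the columns that contain no missing values by hand. This is only a sketch that rebuilds the example DataFrame so it can run on its own; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Superman", "Batman", "Spiderman"],
    "toy": [np.nan, "Batmobile", "Spiderman toy"],
    "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT],
})

by_label = df.dropna(axis="columns")
by_number = df.dropna(axis=1)

# Manual equivalent: keep the columns whose values are all present.
by_mask = df.loc[:, df.notna().all()]

print(by_label.columns.tolist())
print(by_label.equals(by_number), by_label.equals(by_mask))
```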
2.3 Drop the rows where all elements are missing (no row here is entirely missing, so nothing is dropped):
df.dropna(how='all')
        name            toy       born
0   Superman            NaN        NaT
1     Batman      Batmobile 1956-06-26
2  Spiderman  Spiderman toy        NaT
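Since the example DataFrame has no row that is completely empty, how='all' leaves it unchanged. To see it actually drop something, here is a small sketch with a hypothetical variant of the data (df_extra, an assumption for illustration) whose last row is entirely missing.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the example data: the last row has no values at all.
df_extra = pd.DataFrame({
    "name": ["Superman", "Batman", np.nan],
    "toy": [np.nan, "Batmobile", np.nan],
    "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT],
})

# Only the fully empty row (index 2) is removed; the partly
# filled Superman row (index 0) is kept.
result = df_extra.dropna(how="all")
print(result.index.tolist())
```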
2.4 Keep only the rows with at least 2 non-NA values:
df.dropna(thresh=2)
        name            toy       born
1     Batman      Batmobile 1956-06-26
2  Spiderman  Spiderman toy        NaT
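One way to read thresh is as a filter on the per-row count of non-NA values. The sketch below rebuilds the example DataFrame and compares dropna(thresh=2) with an explicit count-based filter; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Superman", "Batman", "Spiderman"],
    "toy": [np.nan, "Batmobile", "Spiderman toy"],
    "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT],
})

# Number of non-NA values in each row.
non_na_per_row = df.count(axis=1)

by_thresh = df.dropna(thresh=2)
by_count = df[non_na_per_row >= 2]

print(non_na_per_row.tolist())
print(by_thresh.equals(by_count))
```

Superman has only one non-NA value (the name), so it is the only row removed.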
2.5 Define in which columns to look for missing values:
df.dropna(subset=['name', 'born'])
     name        toy       born
1  Batman  Batmobile 1956-06-26
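To make the effect of subset more visible, the sketch below checks one column at a time on the same example data. Missing values outside the listed columns are ignored when deciding which rows to drop.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Superman", "Batman", "Spiderman"],
    "toy": [np.nan, "Batmobile", "Spiderman toy"],
    "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT],
})

# Only 'toy' is inspected: Spiderman stays even though 'born' is NaT.
by_toy = df.dropna(subset=["toy"])

# Only 'name' is inspected: it has no missing values, so every row stays.
by_name = df.dropna(subset=["name"])

print(by_toy.index.tolist())
print(by_name.index.tolist())
```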
2.6 Keep the DataFrame with valid entries in the same variable:
df.dropna(inplace=True)
df
     name        toy       born
1  Batman  Batmobile 1956-06-26
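For comparison, here is a short sketch of the two ways to end up with a cleaned DataFrame: the default call returns a new object and leaves the original alone, while inplace=True modifies the original. The helper make_df is hypothetical and only exists to rebuild the example data twice.

```python
import numpy as np
import pandas as pd

def make_df():
    # Hypothetical helper that rebuilds the example DataFrame.
    return pd.DataFrame({
        "name": ["Superman", "Batman", "Spiderman"],
        "toy": [np.nan, "Batmobile", "Spiderman toy"],
        "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT],
    })

# Default: a new DataFrame is returned, the original keeps all 3 rows.
original = make_df()
cleaned = original.dropna()
print(len(original), len(cleaned))

# inplace=True: the same variable now holds only the complete rows.
modified = make_df()
modified.dropna(inplace=True)
print(len(modified))
```

Assigning the result of the default call (cleaned = original.dropna()) is often easier to follow than mutating in place, since the original data stays available.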