ai-day13

Ngchiwa Ng
2 min read · Mar 29, 2020

Get familiar with common operations in the Python package pandas, such as sorting, merging, grouping, and indexing.

ref: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
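
A quick sketch of the four operations named above, on two tiny made-up tables. The frame names (`people`, `children`) and their columns are hypothetical, only here so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical toy tables; names and values are made up for illustration.
people = pd.DataFrame({
    "id": [3, 1, 2],
    "income": [300.0, 100.0, 200.0],
})
children = pd.DataFrame({
    "id": [1, 2, 3],
    "cnt_children": [0, 2, 1],
})

# Sorting: order rows by a column, largest income first.
by_income = people.sort_values("income", ascending=False)

# Merging: left-join the two tables on their shared key.
merged = people.merge(children, on="id", how="left")

# Indexing: label-based (.loc) with a boolean mask, position-based (.iloc).
with_kids = merged.loc[merged["cnt_children"] > 0, ["id", "income"]]
first_row = merged.iloc[0]

# Grouping: aggregate one column per group.
mean_income = merged.groupby("cnt_children")["income"].mean()
```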

Homework:

  • Binning (cut: 0, 1–2, 3–5, >5)

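import numpy as np
import pandas as pd

# Bin edges for the groups 0, 1–2, 3–5, >5 (bins are right-closed)
cut_rule = [-np.inf, 0, 2, 5, np.inf]
app_train['CNT_CHILDREN_GROUP'] = pd.cut(app_train['CNT_CHILDREN'].values, cut_rule, include_lowest=True)
app_train['CNT_CHILDREN_GROUP'].value_counts()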

result:

(-inf, 0.0]    215371
(0.0, 2.0]      87868
(2.0, 5.0]       4230
(5.0, inf]         42
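
To check which bin each value lands in, here is a small sketch of the same cut on a made-up series (`counts` is hypothetical, not the real `CNT_CHILDREN` column). The `labels` argument is optional, and the label names are only a suggestion.

```python
import numpy as np
import pandas as pd

# Hypothetical child counts, chosen to hit every bin edge.
counts = pd.Series([0, 1, 2, 3, 5, 6])

cut_rule = [-np.inf, 0, 2, 5, np.inf]

# Bins are right-closed by default: 0 falls in (-inf, 0], 2 in (0, 2], 5 in (2, 5].
groups = pd.cut(counts, cut_rule)

# Optional: readable labels, one per bin (names picked here for illustration).
labeled = pd.cut(counts, cut_rule, labels=["0", "1-2", "3-5", ">5"])
print(pd.DataFrame({"count": counts, "bin": groups, "label": labeled}))
```

Since the lowest edge is `-inf`, `include_lowest=True` should not change where any finite value ends up.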
  • group by

print(app_train[['CNT_CHILDREN_GROUP', 'AMT_INCOME_TOTAL']])
grp = ['CNT_CHILDREN_GROUP']
grouped_df = app_train.groupby(grp)['AMT_INCOME_TOTAL']
grouped_df.mean()
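
The same group-by can be tried without the real dataset. A minimal sketch, assuming a made-up frame `toy` that only has the two columns used here; `agg(["mean", "count"])` is an extra on top of the plain `mean()` above, to show how many rows back each average.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for app_train with only the columns needed here.
toy = pd.DataFrame({
    "CNT_CHILDREN": [0, 0, 1, 2, 4, 7],
    "AMT_INCOME_TOTAL": [100.0, 200.0, 150.0, 250.0, 300.0, 50.0],
})
toy["CNT_CHILDREN_GROUP"] = pd.cut(toy["CNT_CHILDREN"], [-np.inf, 0, 2, 5, np.inf])

# observed=True keeps only the bins that actually occur in the data.
income_stats = (
    toy.groupby("CNT_CHILDREN_GROUP", observed=True)["AMT_INCOME_TOTAL"]
    .agg(["mean", "count"])
)
print(income_stats)
```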
  • boxplot
    - y: AMT_INCOME_TOTAL
    - x: ["CNT_CHILDREN_GROUP", "TARGET"]
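
One possible way to draw this boxplot is `DataFrame.boxplot` with `column` for y and `by` for x. This is a sketch on synthetic data, not the real `app_train`: the frame `toy`, the random incomes, the figure size, and the output filename are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for app_train: every (children, target) combination, 10 times.
rows = [(c, t) for c in [0, 1, 2, 3, 5, 6] for t in [0, 1]] * 10
toy = pd.DataFrame(rows, columns=["CNT_CHILDREN", "TARGET"])
rng = np.random.default_rng(0)
toy["AMT_INCOME_TOTAL"] = rng.lognormal(mean=12, sigma=0.5, size=len(toy))
toy["CNT_CHILDREN_GROUP"] = pd.cut(toy["CNT_CHILDREN"], [-np.inf, 0, 2, 5, np.inf])

# y: AMT_INCOME_TOTAL, x: one box per (CNT_CHILDREN_GROUP, TARGET) pair.
toy.boxplot(
    column="AMT_INCOME_TOTAL",
    by=["CNT_CHILDREN_GROUP", "TARGET"],
    showfliers=False,  # hide outliers so the boxes stay readable
    figsize=(12, 6),
)
fig = plt.gcf()
fig.suptitle("")  # drop the automatic "Boxplot grouped by ..." title
plt.ylabel("AMT_INCOME_TOTAL")
fig.savefig("income_by_children_and_target.png")
```

Hiding the fliers is a judgment call: income data tends to have a few extreme values that would otherwise squash the boxes.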
