The course includes a 7-hour recording (200 slides + multiple B2B cases) in China. This introductory course is designed for B2B marketers who are new to China’s digital marketing system.

Key Content

* Digital landscape and major players in China
* Website marketing with Baidu (and its ecosystem), including Search Engine Optimization (SEO) and Search Engine Marketing (SEM) tips and tricks.
* Native ads with WeChat (ad channels and formats, integrating online and offline marketing in China)
* Social media marketing (overview of Chinese social media channels, content marketing with WeChat/Weibo/Douyin, and advanced techniques)
* Multiple case studies in different B2B industries

I was recently working on a new project with many Excel files, which include field names and sample values for each database table. I thought it would be very useful to have a quick Python script to merge them all into one big file.

To showcase the code, I created three files: customer.xlsx, event.xlsx, and referrals.xlsx.

And the code:

import os
import pandas as pd
cwd = os.path.abspath('')  # leave blank if your code and files are in the same folder

writer = pd.ExcelWriter('data_model.xlsx')  # create a new file to be filled in later; you can name it whatever you want

for file in…
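The excerpt cuts off at the loop, so the original loop body is not shown here. As a minimal sketch of how such a merge could look, assuming the first sheet of each workbook is copied into its own sheet named after the file: the helper names `find_excel_files`, `sheet_name_for`, and `merge_workbooks` are hypothetical and not from the post, and writing `.xlsx` files with pandas assumes an Excel engine such as openpyxl is installed.

```python
import os
import pandas as pd


def find_excel_files(folder, exclude=()):
    """Return sorted .xlsx paths in folder, skipping names listed in exclude."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".xlsx")
        and not name.startswith("~$")  # skip Excel lock files
        and name not in exclude
    )


def sheet_name_for(path):
    """Use the file name without extension (Excel caps sheet names at 31 chars)."""
    return os.path.splitext(os.path.basename(path))[0][:31]


def merge_workbooks(folder, output_name="data_model.xlsx"):
    """Copy the first sheet of every workbook in folder into one output workbook."""
    files = find_excel_files(folder, exclude=(output_name,))
    if not files:
        return []  # nothing to merge, so no output file is created
    with pd.ExcelWriter(os.path.join(folder, output_name)) as writer:
        for path in files:
            df = pd.read_excel(path)  # reads the first sheet by default
            df.to_excel(writer, sheet_name=sheet_name_for(path), index=False)
    return [sheet_name_for(path) for path in files]
```

Calling `merge_workbooks(os.path.abspath(''))` in a folder holding customer.xlsx, event.xlsx, and referrals.xlsx would write data_model.xlsx with one sheet per input file. Excluding the output name keeps a rerun from merging the previous result into itself.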

O’Reilly Learning, formerly known as Safari Books Online, is an online platform with a vast array of technical content: over 40,000 books, video courses, live training, and other valuable materials. I’ve been using the service since my first job, where my company purchased company-wide access. Last year I started my video series for it, making recommendations on new books.

The platform offers a 7-day free trial, but honestly the $49/month (or $499/year) plan is a bit too pricey for an individual learner.

There is indeed another way: joining the ACM (Association for Computing Machinery). If you work on the data analytics/mining…




The 6th episode of my book club continues with data visualization and Tableau.



  • Customer churn model
  • Marketing forecasting model
  • Credit risk model
  • Market basket analysis, and so on



The interesting part here is the numerical to binominal step in the middle of the process, which is needed because the raw data has no 1/0 churn variable.


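As mentioned earlier, linear regression cannot handle categorical variables directly. In practice we need to preprocess them, and the most popular approach today is called one hot encoding. One hot encoding is the process of converting categorical variables into a form that machine learning algorithms can readily use.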


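I placed the one hot encoding step right after the data input. In its configuration you can see that the system has already kept only the categorical variables for you to choose from. To avoid making the model too complex, I selected only two variables: gender and marital status.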
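For readers who prefer code, here is a rough Python equivalent of one hot encoding those two variables. It is only a sketch: it uses pandas `get_dummies` rather than the visual tool from the post, and the sample rows and column names (`age`, `gender`, `marital_status`) are made up for illustration.

```python
import pandas as pd

# Made-up sample rows; the real dataset from the post is not reproduced here.
customers = pd.DataFrame({
    "age": [34, 52, 41],
    "gender": ["F", "M", "F"],
    "marital_status": ["single", "married", "married"],
})

# Encode only the two selected categorical columns; numeric columns pass through.
encoded = pd.get_dummies(
    customers,
    columns=["gender", "marital_status"],
    dtype=int,  # emit 1/0 instead of True/False
)

print(encoded.columns.tolist())
```

Each category becomes its own 1/0 column, such as `gender_F` and `gender_M`. For a linear regression, passing `drop_first=True` drops one level per variable, which avoids perfectly collinear columns.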

Peng's Draft

10+ year analytics professional || Host of Peng’s Book Club@Youtube || Advocate data for better (personal and social) decisions
