Xiaochang Miao
Data Scientist
San Francisco Bay Area, US
Profile:
Summary
PhD in physics with 4+ years of industry experience in data analysis and data science.
Knowledge & Skills:
- Expertise in Statistical Inference, Hypothesis Testing, A/B Testing, Probability Theory
- Working experience with Machine Learning Algorithms: Regression, Classification, Tree-based Methods, Ensemble Learning
- Hands-on experience with Big data tools: Hadoop, Spark, MapReduce, Hive, Pig Query, HP Vertica
- Computational programming: Python, R, Scala, SQL, MATLAB, C
- Business intelligence tools: Tableau, Spotfire, Metric Insights, Caravel
Experience
Apr 2016 - Present
Data Scientist
Shopkick Inc / Redwood City, CA
- Design and develop an A/B testing framework in Python to standardize and automate experiment analysis and reporting.
- Build data pipelines that extract analytical information from client logs and integrate with ETL, using both Spark (with Scala) and Python.
- Develop and maintain production Python code to monitor and control fraudulent activity.
- Build a churn model with machine learning algorithms (Random Forest and Logistic Regression) to identify the key factors in users’ first-week engagement that affect long-term retention, and provide actionable recommendations for product development and targeted marketing strategies.
- Run numerous A/B tests to optimize user acquisition strategies, viral channels, first-week activation, and user retention, and to evaluate product performance, including grocery shopping and marketing campaigns.
- Conduct in-depth data analysis of users’ shopping profiles and provide product insights to drive more revenue-generating engagement.
- Build dashboards in Tableau, Caravel, and Metric Insights to monitor fraud activity and report business metrics to C-level management. Write complex queries in Vertica SQL and Spark SQL to address ad hoc business requests.
Sep 2013 - Mar 2016
Sr. Device Engineer
Sandisk / Milpitas, CA
- Design new memory operation algorithms to improve product reliability and performance, resulting in 5 US patent applications (4 approved, 1 under review).
- Build analytical and statistical models (mainly regression models) to discover chip failure patterns.
- Develop object-oriented Python scripts for data acquisition, cleaning, and visualization.
- Perform A/B testing on semiconductor fabrication recipes to optimize chip processing.
- Supervise test engineers running qualification tests, and analyze gigabytes of data to quantify results against customer specs for various product lines.
Aug 2009 - Aug 2013
Graduate Research Assistant
University of Florida / Gainesville, FL
Work on modeling of graphene-based electronic devices and studies of their physical properties. Published 5 papers in top peer-reviewed scientific journals, with more than 800 total citations as listed on Google Scholar.
Education
2009 - 2013
University of Florida
PhD in Physics
2004 - 2008