Siyu Zhou
Postdoctoral Fellow
Department of Biostatistics and Bioinformatics
Rollins School of Public Health
356 Grace Crum Rollins Building
1518 Clifton Rd NE
Atlanta, GA 30033
I am a Postdoctoral Fellow in the Department of Biostatistics and
Bioinformatics at Emory University, under the supervision of Dr. Limin Peng.
Previously, I obtained my Ph.D. in Statistics from the University of
Pittsburgh under the supervision of Dr. Lucas Mentch. I earned my M.Phil. in
Mathematics from The Hong Kong University of Science and Technology, advised
by Dr. Man-Yu Wong, and my B.S. in Mathematics from the same institution.
At its core, my work investigates the statistical and mathematical properties
of machine learning algorithms and leverages those insights to improve model
performance and to develop accurate, robust methodologies for the diverse
types of data emerging from new technologies and scientific needs. I also
seek to develop feature importance measures to aid interpretability.
o Siyu Zhou and Limin Peng (2024+), Approximate Global Censored Quantile Random Forests
In preparation
o Siyu Zhou and Limin Peng (2024+), Global Quantile Learning with Censored Data Based on Random Forests
Submitted
o Meredith Wallace, Lucas Mentch, Bradley Wheeler, Amanda Tapia, Marc Richards, Siyu Zhou, Lixia Yi, Susan Redline, Daniel Buysse (2023), Use and misuse of random forest variable importance metrics in medicine: demonstrations through incident stroke prediction
BMC Medical Research Methodology 23 (1), 144
o Siyu Zhou and Lucas Mentch (2023), Trees, Forests, Chickens, and Eggs: When and Why to Prune Trees in a Random Forest
Statistical Analysis and Data Mining: The ASA Data Science Journal 16 (1), 45-64
o Lucas Mentch and Siyu Zhou (2022), Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance
Journal of Machine Learning Research 23 (224), 1-32
o Giles Hooker, Lucas Mentch and Siyu Zhou (2021), Unrestricted permutation forces extrapolation: variable importance requires at least one more model, or there is no free variable importance
Statistics and Computing 31, 82
o Lucas Mentch and Siyu Zhou (2020), Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success
Journal of Machine Learning Research 21 (171), 1-36
o Why Random Forests Work and Why That's a Problem
Conference on Statistical Learning and Data Science
Newport Beach, CA, Nov 5 - 8, 2024
o Trees, Forests, Chickens, and Eggs: When and Why to Prune Trees in a Random Forest
Joint Statistical Meetings
Portland, OR, Aug 3 - 8, 2024
o Regularization on Ensembles of Tree and Variable Importance
ICSA Applied Statistics Symposium
University of Florida, Gainesville, FL, Jun 19 - 22, 2022
o Random Forests: Why They Work and Why That's a Problem
Joint Statistical Meetings
Virtual, Aug 3 - 8, 2021
o Augmented Bagging as an Alternative to Random Forests and Implications on Variable Importance
Symposium on Data Science and Statistics
Virtual, Jun 2 - 4, 2021
o Augmented Bagging as an Alternative to Random Forests
Joint Statistical Meetings
Virtual, Aug 2 - 6, 2020
o Explaining the Practical Success of Random Forests
Symposium on Data Science and Statistics
Virtual, Jun 3 - 5
o University of Pittsburgh
o Stat 800 - Statistics in the Modern World
Instructor (2021 SUM)
o Stat 1000 - Applied Statistical Methods
Instructor (2020 SUM), Teaching Assistant (2018 SP, 2019 SUM, 2019 FA, 2021 FA)
o Stat 1100 - Statistics and Probability for Business Management
Teaching Assistant (2018 FA, 2020 SP)
o Stat 1221 - Applied Regression
Teaching Assistant (2019 SP)
o Stat 1361 - Statistical Learning and Data Science
Teaching Assistant (2022 SP)
o Stat 2270 - Data Mining
Teaching Assistant (2019 FA, 2021 FA)
o Stat 2360 - Statistical Learning and Data Science
Teaching Assistant (2022 SP)
o The Hong Kong University of Science and Technology
o Math 3423 - Statistical Inference
Teaching Assistant (2015 FA, 2016 FA)
o Math 3424 - Regression Analysis
Teaching Assistant (2015 SP, 2016 SP)