Distributed Machine Learning
Heng Huang (Department of Electrical and Computer Engineering & Department of Biomedical Informatics, University of Pittsburgh)
Abstract
Machine learning is gaining fresh momentum and has helped us enhance not only many industrial and professional processes but also our everyday living. The recent success of machine learning relies heavily on the surge of big data, big models, and big computing. However, inefficient algorithms restrict the application of machine learning to big data mining tasks. In terms of big data, serious concerns such as communication overhead and data privacy must be rigorously addressed when we train models on large amounts of data located on multiple devices. In terms of big models, how to train a model that is too big for a single device is still an underexplored research area. To address these challenging problems, we focused on designing new large-scale machine learning models, developing efficient optimization and training methods for big data mining, and pursuing new discoveries in both theory and applications.
For the challenges raised by big data, we proposed several new asynchronous distributed stochastic gradient descent or coordinate descent methods for efficiently solving convex and non-convex problems. We also designed new large-batch training methods for deep learning models that significantly reduce computation time while achieving better generalization performance. For the challenges raised by big models, we scaled up deep learning models by parallelizing the layer-wise computations with a theoretical guarantee. This is the first algorithm to break the lock of the backpropagation mechanism, so that large-scale deep learning models can be dramatically accelerated.
About the Speaker
Dr. Heng Huang is the John A. Jurenko Endowed Professor in Electrical and Computer Engineering at the University of Pittsburgh, and also a Professor in Biomedical Informatics at the University of Pittsburgh Medical Center. Dr. Huang received his PhD degree in Computer Science from Dartmouth College. His research areas include machine learning, big data mining, and biomedical data science. Dr. Huang has published more than 220 papers in top-tier conferences and many papers in premium journals, such as ICML, NeurIPS, KDD, RECOMB, ISMB, ICCV, CVPR, IJCAI, AAAI, Nature Machine Intelligence, Nucleic Acids Research, Bioinformatics, Medical Image Analysis, Neurobiology of Aging, IEEE TMI, and TKDE. According to csrankings.org, over the last ten years Dr. Huang ranks 3rd among researchers who have published the most top computer science conference papers. As PI, Dr. Huang is currently leading NIH R01, U01, and multiple NSF-funded projects on machine learning, neuroimaging, precision medicine, electronic medical record data analysis and privacy preservation, smart healthcare, and cyber-physical systems. Over the past 13 years, Dr. Huang has received more than $34,000,000 in research funding. He is a Fellow of AIMBE and serves as the Program Chair of the ACM SIGKDD Conference 2020.
Future Speakers include:
September 11, 2020, 10:30 - 11:30 am ET
Ying Nian Wu (Department of Statistics, University of California, Los Angeles)
"A Representational Model of Grid Cells Based on Matrix Lie Algebras"
September 18, 2020, 10:30 - 11:30 am ET
Ruslan Salakhutdinov (Machine Learning Department, School of Computer Science, Carnegie Mellon University)
"Integrating Domain-Knowledge into Deep Learning"
This year we are able to take advantage of the online platform to broadcast these talks live via Zoom meetings.
Connection details and other schedule information are available on the webinar series website.
Event Type
- NISS Sponsored