I will discuss recent work demonstrating precise empirical scaling laws for neural networks.  Then I will explain how this line of thinking led to the GPT-3 language model.  If there's time, I may discuss more theoretical work attempting to explain the origin of the scaling laws.

Jared Kaplan received his bachelor’s degree in physics and mathematics from Stanford University and his Ph.D. in physics from Harvard University, and worked as a postdoctoral fellow at SLAC and Stanford. Since 2012, he has been a professor of physics at Johns Hopkins University. His research has spanned a range of fields, including cosmology, particle physics, dark matter, scattering theory, the AdS/CFT correspondence, and quantum gravity. His work has been supported by a Sloan Foundation Fellowship, an NSF CAREER grant, and the Simons Foundation, as a principal investigator of the Collaboration on the Nonperturbative Bootstrap. Most recently, he has been working on basic research in machine learning, in collaboration with researchers at OpenAI and Google.

Thursday, September 17th at 4:25 via Zoom

If you would like to attend from outside Lehigh, please contact Prof. J. Toulouse (jt02@lehigh.edu) for the passcode to the meeting.

