Torsten Hoefler Explained

Torsten Hoefler
Field: High-Performance Computing, Computer Science
Work Institution: ETH Zurich, Swiss National Supercomputing Centre, Microsoft, Cray, University of Illinois at Urbana-Champaign, Indiana University
Alma Mater: Indiana University, TU Chemnitz
Doctoral Advisor: Andrew Lumsdaine
Awards: ACM Fellow[1], IEEE CS Sidney Fernbach Award[2], IEEE Fellow[3] [4], Latsis Prize of ETH Zürich[5]

Torsten Hoefler is a Professor of Computer Science at ETH Zurich[6] and the Chief Architect for Machine Learning at the Swiss National Supercomputing Centre.[7] Previously, he led the Advanced Application and User Support team at the Blue Waters Directorate of the National Center for Supercomputing Applications, and held an adjunct professor position in the Computer Science Department at the University of Illinois at Urbana-Champaign.[8] His expertise lies in large-scale parallel computing and high-performance computing systems. He focuses on applications in large-scale artificial intelligence and the climate sciences.

Hoefler is an IEEE Fellow,[9] an ACM Fellow,[10] and a member of the European Academy of Sciences Academia Europaea.[11] His Erdős number is two.[12]

He has been invited to present keynote lectures at major international conferences such as ACM's Federated Computing Research Conference,[13] IEEE Cluster,[14] HPC Asia, Supercomputing Asia,[15] and the International Symposium on Distributed Computing.[16]

Career

Hoefler received his Diplom in Computer Science from TU Chemnitz, where he received the best student award in 2005.[17] He worked on high-performance computing systems from the very beginning of his career. He continued his studies at Indiana University, the home of Open MPI, under the guidance of Prof. Andrew Lumsdaine. He received his PhD in Computer Science from Indiana University in 2008 and was subsequently honored with the university's Young Alumni Award[18] as well as its Distinguished Alumni Award.[19]

He continued his work on the Message Passing Interface standard as a key member of the MPI Forum[20] responsible for the chapters on Collective Communication and Process Topologies as well as co-authoring the chapter on One-Sided Communications.[21]

In 2010, he joined the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign (UIUC). As lead for application performance analysis and support, he supported the design and deployment of the Blue Waters supercomputer.[22] He also held a position as adjunct professor in UIUC's Computer Science department. He accepted a position as assistant professor at ETH Zurich in 2011,[23] where he received tenure in 2017[24] and has been a full professor since 2020.[25]

Hoefler has held visiting researcher positions at the French Alternative Energies and Atomic Energy Commission in France, CINECA in Italy, and Argonne National Laboratory, Sandia National Laboratories, and Microsoft in the United States. As a consultant, he supported Cray Inc. in the area of high-performance networking and Microsoft Corporation in the areas of quantum computing and large-scale artificial intelligence systems. He spent his sabbatical in 2019 at Microsoft, helping to establish various AI supercomputing efforts including the Maia 100 system.[26] [27] [28]

Hoefler has been a member of the ACM SIGHPC executive committee since its founding in 2011.[29]

He was elected IEEE Fellow for “contributions to large-scale parallel processing systems and supercomputers”,[9] ACM Fellow for “foundational contributions to High-Performance Computing and the application of HPC techniques to machine learning”,[10] and he received the IEEE Sidney Fernbach Award in 2022 for “application-aware design of HPC algorithms, systems and architectures, and transformative impact on scientific computing and industry”.

Hoefler received the inaugural Jack Dongarra Early Career Award at the ISC High Performance conference in 2023.[30] [31] [32] He was appointed a senior fellow of the Abu Dhabi Investment Authority Labs in 2023.[33] [34]

Research impact

Hoefler is known for his contributions to the Message Passing Interface (MPI) standard. He served as author for the chapters “Collective Communication” and “Process Topologies” in MPI-2.2 https://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf and the chapters “Collective Communication”, “One-Sided Communications”, and “Process Topologies” in MPI-3 https://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf. For the MPI-3 standardization, he chaired the Collective Communications and Topology working groups.[35]

He developed principles for the implementation of nonblocking collective operations and remote memory access that are widely used in MPI implementations such as Open MPI, MPICH, and their derivatives.[36] Nonblocking collective operations such as allreduce, allgather, or broadcast form the basis of modern AI training systems.[37]
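The idea behind a nonblocking collective is that initiation and completion are separate steps, so independent computation can proceed in between. The following is a minimal single-process sketch of that pattern only; it is not MPI and not Hoefler's implementation. The names `allreduce_sum`, `iallreduce_sum`, and `overlapped_step` are hypothetical, and a thread pool stands in for the communication layer. In MPI-3, the analogous calls would be `MPI_Iallreduce` followed by `MPI_Wait`.

```python
from concurrent.futures import ThreadPoolExecutor


def allreduce_sum(contributions):
    """Blocking reference: element-wise sum over all simulated ranks."""
    return [sum(column) for column in zip(*contributions)]


def iallreduce_sum(executor, contributions):
    """Hypothetical nonblocking variant: start the reduction, return a handle."""
    return executor.submit(allreduce_sum, contributions)


def overlapped_step(contributions, local_work):
    """Start the reduction, do unrelated work, then wait for the result."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        request = iallreduce_sum(executor, contributions)  # initiation
        independent = [x * x for x in local_work]          # overlapped computation
        reduced = request.result()                         # completion (like a wait)
    return reduced, independent


# Three simulated ranks, each contributing a two-element vector.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
reduced, independent = overlapped_step(grads, [1, 2, 3])
print(reduced, independent)
```

The point of the split is that the caller decides how much work to place between initiation and completion; a blocking call offers no such window.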

After co-authoring a pioneering paper on parallel deep learning[38] and during his sabbatical at Microsoft, he coined the term “3D parallelism” for the scheme in modern artificial intelligence training that organizes data parallelism, pipeline parallelism, and operator parallelism into one consistent view.[39]
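One way to picture the three dimensions is as a grid of workers, where each worker has a data, a pipeline, and an operator coordinate. The sketch below is illustrative arithmetic only: the function names and the rank ordering (operator index varying fastest) are assumptions made for this example and are not taken from the cited work.

```python
def rank_to_coords(rank, data, pipeline, operator):
    """Map a flat worker rank to (data, pipeline, operator) coordinates."""
    world = data * pipeline * operator
    if not 0 <= rank < world:
        raise ValueError(f"rank {rank} outside world of size {world}")
    d, rest = divmod(rank, pipeline * operator)
    p, o = divmod(rest, operator)
    return d, p, o


def coords_to_rank(d, p, o, pipeline, operator):
    """Inverse mapping under the same assumed ordering."""
    return (d * pipeline + p) * operator + o


def data_parallel_group(rank, data, pipeline, operator):
    """Ranks that hold the same model shard and differ only in the data index."""
    _, p, o = rank_to_coords(rank, data, pipeline, operator)
    return [coords_to_rank(d, p, o, pipeline, operator) for d in range(data)]


# Eight workers arranged as 2 data replicas x 2 pipeline stages x 2 operator shards.
coords = rank_to_coords(5, data=2, pipeline=2, operator=2)
group = data_parallel_group(5, data=2, pipeline=2, operator=2)
print(coords, group)
```

Under this layout, workers in the same data-parallel group would be the ones that combine gradients with each other, while the pipeline and operator coordinates identify which part of the model a worker holds.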

In his work on high-speed interconnects, he co-developed several award-winning network topologies[40] [41] and contributed routing algorithms that are used in the OpenSM routing manager on InfiniBand computer clusters.[42]

On the application side, Hoefler focuses on improving the performance of climate simulations as a digital twin[43] [44] [45] and machine learning for climate simulations.[46] He has been a convener of the Berlin Summit in Earth Virtualization Engines[47] to develop strategies to enable global access to high-resolution climate simulations.[48] [49]

Scientific reproducibility

Hoefler has been vocal about improving the reproducibility of performance measurements in high-performance computing[50] and, later, in machine learning. The latter work appeared in IEEE Computer as a cover feature on research reproducibility.[51] As Technical Papers chair of the ACM/IEEE Supercomputing Conference (SC18), he introduced a new revision-based review process to improve the quality of the conference's publications.[52] His group received the SIGHPC Certificate of Appreciation for reproducible methods at the ACM student cluster competition of the ACM/IEEE Supercomputing Conference (SC22).[53] His paper on HammingMesh received the ACM/IEEE Supercomputing Conference (SC22) Best Reproducibility Advancement Award.[54] He also presented the opening keynote at the first ACM Conference on Reproducibility and Replicability.[55]

Awards and honors

Hoefler and his team received six best (student) paper awards at the ACM/IEEE Supercomputing Conference between 2010 and 2023,[56] [57] [58] [59] the top conference in High-Performance Computing. Additional important awards are listed below.

2023

2022

2021

2020

2019

2015

2014

2013

2012

Notes and References

  1. Web site: Global Computing Association Names 57 Fellows for Outstanding Contributions That Propel Technology Today . 2023-02-17 . www.acm.org . en.
  2. Web site: Torsten Hoefler Receives IEEE CS Sidney Fernbach Award 2022 . 2023-02-17 . en-US.
  3. Web site: 2022 Newly Elevated Fellows . Institute of Electrical and Electronics Engineers (IEEE) . https://web.archive.org/web/20211124083848/https://www.ieee.org/content/dam/ieee-org/ieee/web/org/about/fellows/2022-ieee-fellows-class.pdf. dead. 24 November 2021.
  4. Web site: Adrian Perrig and Torsten Hoefler named IEEE Fellows . 2023-02-17 . inf.ethz.ch . en.
  5. Web site: Turning life into a profession . 2023-02-17 . ethz.ch . en.
  6. Web site: Prof. Dr. Torsten Hoefler . . Departement Informatik . Zürich, Schweiz. inf.ethz.ch. 22 June 2023.
  7. Web site: ETH Professor Torsten Hoefler Joins CSCS as Chief Architect for Machine Learning.
  8. Web site: Torsten Hoefler's CV .
  9. Web site: 2022 Newly Elevated Fellows. Institute of Electrical and Electronics Engineers (IEEE) . https://web.archive.org/web/20211124083848/https://www.ieee.org/content/dam/ieee-org/ieee/web/org/about/fellows/2022-ieee-fellows-class.pdf. dead. 24 November 2021.
  10. Web site: Global Computing Association Names 57 Fellows for Outstanding Contributions That Propel Technology Today. www.acm.org. 22 June 2023.
  11. Web site: Academy of Europe: Hoefler Torsten. www.ae-info.org. 22 June 2023.
  12. Web site: Erdos2, Version 2020, August 7, 2020. sites.google.com. 11 February 2024.
  13. Web site: Plenary Speakers. fcrc.acm.org. 22 June 2023.
  14. Web site: IEEE Cluster 2016. clustercomp.org. 22 June 2023.
  15. Web site: Keynote Speakers. 11 February 2024.
  16. Web site: Keynote Talks | International Symposium on DIStributed Computing (DISC) 2020. 22 June 2023.
  17. Web site: Leistung, die sich auszahlt. www.tu-chemnitz.de. 22 June 2023.
  18. Web site: Torsten Hoefler: University Honors and Awards: Indiana University. University Honors & Awards. 22 June 2023.
  19. Web site: Luddy School honors 2023 alumni award winners. IU News Archive. 6 Nov 2023.
  20. Web site: MPI Forum. www.mpi-forum.org. 22 June 2023.
  21. Web site: MPI 3.1 Specification.
  22. Web site: Blue Waters staff, partners bring home awards from SC10 . 2023-02-17 . NCSA . en-US.
  23. Web site: 16 new professors at the ETH Zurich. www.ethlife.ethz.ch. 22 June 2023.
  24. Web site: 18 professors appointed at ETH Zurich and EPFL . 2023-02-17 . www.admin.ch.
  25. Web site: 11 new professors appointed at ETH Zurich and EPFL. www.admin.ch. 22 June 2023.
  26. US . 11076210. 2021-07-27. Distributed processing architecture. Microsoft Technology Licensing LLC. Hoefler. Torsten . Heddes. Mattheus C.. Belk. Jonathan R..
  27. US . 11886938. 2021-03-11. Message communication between integrated computing devices. Microsoft Technology Licensing LLC. Goel. Deepak . Heddes. Mattheus C.. Hoefler. Torsten. Xu. Xialing.
  28. Web site: With a systems approach to chips, Microsoft aims to tailor everything ‘from silicon to service’ to meet AI demand. 11 February 2024.
  29. Web site: Meeting Your Needs - Executive Committee. www.sighpc.org. 22 June 2023.
  30. Web site: Torsten Hoefler Earns First Jack Dongarra Early Career Award. HPCwire. 22 June 2023.
  31. Web site: Torsten Hoefler Earns First Jack Dongarra Early Career Award - Welcome to ISC High Performance 2023. www.isc-hpc.com. 22 June 2023.
  32. Web site: Torsten Hoefler Named First Winner of Jack Dongarra Early Career Award. 17 April 2023. 22 June 2023.
  33. Web site: ADIA Lab Appoints Senior Fellows. 15 February 2024.
  34. Web site: Fellows. 15 February 2024.
  35. Web site: MPI 3.0 Collective Communications and Topology Working Group. 8 November 2023.
  36. Web site: Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI. 8 November 2023.
  37. Web site: Improving NCCL performance for cloud ML applications. 8 November 2023.
  38. Web site: Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis. 10.1145/3320060 . 220247313 . 8 November 2023.
  39. Web site: HammingMesh: a network topology for large-scale deep learning. 8 November 2023.
  40. Book: https://ieeexplore.ieee.org/document/7013016. Slim Fly: A Cost Effective Low-Diameter Network Topology. 10.1109/SC.2014.34 . 1912.08968 . 2149630 . 8 November 2023 . SC14: International Conference for High Performance Computing, Networking, Storage and Analysis . 2014 . Besta . Maciej . Hoefler . Torsten . 348–359 . 978-1-4799-5500-8 .
  41. Book: https://ieeexplore.ieee.org/document/5577310. The PERCS High-Performance Interconnect. 10.1109/HOTI.2010.16 . 16627945 . 8 November 2023 . 2010 18th IEEE Symposium on High Performance Interconnects . 2010 . Arimilli . Baba . Arimilli . Ravi . Chung . Vicente . Clark . Scott . Denzel . Wolfgang . Drerup . Ben . Hoefler . Torsten . Joyner . Jody . Lewis . Jerry . Li . Jian . Ni . Nan . Rajamony . Ram . 75–82 . 978-1-4244-8547-5 .
  42. Book: https://ieeexplore.ieee.org/document/5238677. Optimized Routing for Large-Scale InfiniBand Networks. 10.1109/HOTI.2009.9 . 12742852 . 8 November 2023 . 2009 17th IEEE Symposium on High Performance Interconnects . 2009 . Hoefler . Torsten . Schneider . Timo . Lumsdaine . Andrew . 103–111 .
  43. Web site: Convection-resolving climate modeling on future supercomputing platforms (crCLIM). 8 November 2023.
  44. Web site: Scientists begin building highly accurate digital twin of our planet. 8 November 2023.
  45. Web site: The digital revolution of Earth-system science. 8 November 2023.
  46. Web site: Deep learning and a changing economy in weather and climate prediction. 8 November 2023.
  47. Web site: Participants. 8 November 2023.
  48. Web site: Earth Virtualization Engines (EVE). 8 November 2023.
  49. Web site: Earth Virtualization Engines: A Technical Perspective. 8 November 2023.
  50. Web site: Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. 10.1145/2807591.2807644 . 165618 . 8 November 2023.
  51. Benchmarking Data Science: 12 Ways to Lie With Statistics and Performance on Parallel Computers. 2022 . 10.1109/MC.2022.3152681 . 251294669 . 8 November 2023 . Hoefler . Torsten . Computer . 55 . 8 . 49–56 .
  52. Web site: SC18 Papers Submissions Open Today with New Review Process. 8 November 2023.
  53. Web site: Congratulations to All of This Year's SC and Society Awardees. 8 November 2023.
  54. Web site: Luddy alum receives prestigious award for contributions to high performance computing. 8 November 2023.
  55. Web site: Featured Keynote Speakers. 8 November 2023.
  56. Web site: Blue Waters staff, partners bring home awards from SC10. 11 February 2024.
  57. Web site: SC13 Concludes with Awards for Outstanding Achievements in HPC. 11 February 2024.
  58. Web site: Supercomputing 2014 Recognizes Outstanding Achievements in HPC. 11 February 2024.
  59. Web site: Congratulations to the SC and Society Awardees for SC19 in Denver. 11 February 2024.
  60. Web site: Luddy alum receives prestigious award for contributions to high performance computing . 2023-02-17 .
  61. Web site: HPCwire Unveils Honorees for Its 2021 People to Watch Feature . 2023-02-17 . HPCwire . en-US.
  62. Web site: ERC Consolidator Grants 2020 .
  63. Web site: Olga Sorkine-Hornung and Torsten Hoefler receive 2 ERC Consolidator Grants . 2023-02-17 . inf.ethz.ch . en.
  64. Web site: Torsten Hoefler receives BenchCouncil Rising Star Award . 2023-02-17 . inf.ethz.ch . en.
  65. Web site: ACM Names Recipients of 2019 Gordon Bell Prize . 2023-02-17 . www.acm.org . en.
  66. Web site: Middle Award . 2023-02-17 . www.ieee-tcsc.org.
  67. Web site: ERC Starting Grants 2015 .
  68. Web site: Torsten Hoefler: University Honors and Awards: Indiana University . 2023-02-17 . University Honors & Awards . en-US.
  69. Web site: Torsten Hoefler IEEE Computer Society . 2023-02-17 . en-US.
  70. Web site: 2013 Faculty Award recipients. 22 June 2023.
  71. Web site: Black . Doug . 2012-02-21 . Torsten Hoefler Wins 2012 SIAG/Supercomputing Junior Scientist Prize . 2023-02-17 . High-Performance Computing News Analysis insideHPC . en-US.