Go fastR: High Performance Computing with R
When: Wednesday, 6th of December 2023
Where: Basel Biometric Society Next Gen Training Day, Roche, Basel
How to register: The course has been held; the materials are now available.
Description
Have you ever struggled with slow R code? Have you ever wondered how to make it go fastR and fully leverage the power of high-performance computing (HPC) clusters? If so, this training is for you! You will learn how to profile, optimize, and parallelize your code on your local machine, as well as how to parallelize it on HPC clusters. We will provide example code for activities commonly performed in trial design, such as simulation studies, bootstrapping, and cross-validation. To get the most out of this training, however, we strongly encourage you to bring your own (non-confidential) code to optimize on a Posit-provided HPC cluster in the cloud during the second half of the training. You will also get to pick our brains on how best to optimize your code.
Details
The first half of the training is classroom-based with interactive exercises hosted on Posit Workbench connected to a cloud-based HPC cluster that participants will get access to. We cover the following topics:
Identifying bottlenecks in your R code, debugging, and optimizing
- Debugging R code & checking correctness
- Profiling R code to identify bottlenecks
- Optimizing bottlenecks: packages, vectorization, memory allocation
R parallelization on HPC clusters
- Amdahl’s law and the limits of achievable speedup
- Parallelizing work onto compute clusters via clusterMQ and batchtools
- Consistently loading code, packages, `.libPaths()` and `options()` on R workers
- Uncorrelated random number generation for parallel R code
- Debugging R code in batchtools and clusterMQ jobs
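As a brief illustration of the Amdahl's law topic listed above, the bound can be written as follows. The symbols are our own notation for this sketch and are not taken from the course materials: $p$ is the fraction of the runtime that can be parallelized, $n$ is the number of workers, and $S(n)$ is the resulting speedup.

$$
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}
$$

For example, if 90% of a simulation study's runtime parallelizes ($p = 0.9$), then $n = 10$ workers give $S(10) = 1 / (0.1 + 0.09) \approx 5.26$, and no number of workers can push the speedup beyond $1 / 0.1 = 10$. This is why profiling and optimizing the serial bottlenecks comes before parallelization.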
For the second half, we highly encourage you to bring your own code and attempt to optimize and parallelize it with the help of the course instructors. You can also try some of the provided examples and explore optimization and parallelization on the provided HPC cluster with those. We look forward to your participation!
News
- [6/December/2023] Basel Biometric Society Next Gen Training Day Course
- [3/November/2023] Website and registration are live!
Facilitators
Michael Mayer is a dedicated scientist with a strong background in IT working at Posit. He specializes in bridging the realms of science and technology, including scientific and technical computing, with a particular focus on their application within the pharmaceutical industry. He has deep expertise in High-Performance Computing (HPC) clusters (both infrastructure and software ecosystems), cloud platforms such as AWS and Azure, scientific software development, DevOps, CI/CD, Big Data, Machine Learning, and the automation of IT processes. He is also well-versed in performance tuning, including parallelization and vectorization techniques, as well as scaling-up strategies.
Lukas A. Widmer is an Associate Director Statistical Consultant in the Advanced Methodology and Data Science group at Novartis. He provides advice and is engaged in developing and implementing innovative statistical methods for clinical projects, develops trainings for statisticians and data scientists, and collaborates with external research groups. Lukas has diverse research interests, including safety modelling in dose escalation and Bayesian statistics, re-use of data through analysis results data modelling, genetic engineering, high performance computing, and statistical software engineering. Before joining Novartis, he studied Computer Science as well as Computational Biology and Bioinformatics at ETH Zürich in Switzerland and UC Santa Barbara in the US. He also holds a Doctor of Science degree from the Department of Biosystems Science and Engineering of ETH Zürich.