Module Number: ML-4331 | Module Title: The Science of Machine Learning Benchmarks | Lecture Type(s): Lecture, Tutorial |
---|---|---|
ECTS | 6 | |
Work load - Contact time - Self study | Workload: 180 h; Class time: 60 h / 4 SWS; Self study: 120 h |
|
Duration | 1 Semester | |
Frequency | Irregular | |
Language of instruction | English | |
Type of Exam | Written exam |
|
Content | Benchmarks have played a central role in the progress of machine learning research since the 1980s. Although researchers have done much with them, we still know little about how and why benchmarks work. This class covers the emerging science of benchmarks. The first part lays the theoretical and empirical foundations that we build on throughout the class. The second part covers lessons about reliability and validity drawn from influential benchmarks, such as ImageNet. The final part turns to benchmarking and evaluation in the era of large language models. Students who would like to attend this course should meet the following requirements: |
|
Objectives | Working from first principles, the aim is to better understand why and when benchmarks work, how they fail, and how to best evaluate machine learning models. At the end of the class, students have a good understanding of machine learning benchmarks and the surrounding evaluation ecosystem. They can follow best practices in the evaluation of machine learning. They are able to identify and avoid pitfalls. |
|
Allocation of credits / grading | Type of Class; Status; SWS; Credits; Type of Exam; Exam duration; Evaluation; Calculation of Module (%) |
|
Prerequisite for participation | There are no specific prerequisites. | |
Lecturer / Other | Hardt, MPI | |
Literature | - |
|
Last offered | unknown | |
Planned for | Winter semester 2024 | |
Assigned Study Areas | INFO-INFO, MEDI-APPL, MEDI-INFO, ML-CS, ML-DIV |