Quick Notes - Day 14
All previous posts in this series here
theme: AI Benchmarking
MLPerf Training Benchmark
Authors: Mattson et al.
Published at: Proceedings of the 3rd MLSys Conference, 2020 url
This paper describes the design of and rationale for the MLPerf Training benchmark, a comprehensive benchmark suite for evaluating both commercial and research DL systems. It briefly compares MLPerf with other such benchmarking projects, and then discusses the challenges of benchmarking DL systems compared to other areas (such as, say, microprocessors or databases). The MLPerf suite covers vision, language, recommendation, and reinforcement learning tasks, each with an established dataset and a model commonly used on that dataset. The paper then describes the training process (which is meant to be replicable and reproducible) and discusses several aspects of its benchmarking methodology in light of the results obtained.
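The core metric, as I understand it, is time-to-train: how long a system takes to get a model on a given task to a fixed quality target. Here is a minimal sketch of that idea; the function and names are hypothetical illustrations, not the actual MLPerf harness, and real submissions involve much more (multiple runs, reference implementations, rules about what may be tuned).

```python
import time

def time_to_target(train_one_epoch, evaluate, target_quality, max_epochs=100):
    """Train until the evaluation metric reaches the target, then report
    wall-clock time and epochs used (hypothetical helper, not MLPerf code)."""
    start = time.time()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        quality = evaluate()  # e.g. top-1 accuracy, BLEU, mAP, depending on the task
        if quality >= target_quality:
            return time.time() - start, epoch
    raise RuntimeError("target quality not reached within max_epochs")

# Toy usage: a fake "model" whose quality improves a little each epoch.
if __name__ == "__main__":
    state = {"quality": 0.0}
    elapsed, epochs = time_to_target(
        train_one_epoch=lambda: state.update(quality=state["quality"] + 0.1),
        evaluate=lambda: state["quality"],
        target_quality=0.75,
    )
    print(f"reached target in {epochs} epochs, {elapsed:.4f} s")
```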
I found this idea very interesting, and it is amazing that researchers from 17 different organizations (academia and industry) got together to produce this! It also seems to be a continuing effort rather than a one-off thing (https://mlperf.org/).
AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite
Authors: Gao et al.
Published at: arXiv (may not be peer-reviewed yet) url
This paper also describes an AI benchmarking suite, but one that is more generic and more extensible than MLPerf (at least according to the authors). Building on their framework, they propose guidelines for constructing end-to-end benchmarks by studying various domains and application scenarios. They also compare their benchmarking methodology with others, including MLPerf.
This paper appears to have a larger goal than MLPerf: it definitely covers a more diverse set of applications and explores a broader range of questions. It also seems to have done a more extensive comparison with other benchmarking efforts. I thought MLPerf's authors coming from 17 organizations was amazing; this one has people from 20 organizations!