24-28 August 2020
US/Pacific timezone

Lightning Talk: Accelerating machine learning workloads using new GCC built ins

26 Aug 2020, 08:00
GNU Tools track/Virtual-Room (LPC 2020)

Rajalakshmi S


Basic Linear Algebra Subprograms (BLAS) are used throughout machine learning and deep learning applications today. OpenBLAS is an optimized open source BLAS library, widely used in AI workloads, that implements algebraic operations tuned for specific processor types.
This talk covers recent optimizations in the OpenBLAS library for the POWER10 processor. As part of this work, the assembly code for the matrix multiplication kernels in OpenBLAS is converted to C code that uses new compiler builtins. A sample matrix multiplication optimization for POWER hardware in OpenBLAS will be used to explain how the builtins are used and to show their impact on application performance.
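
To give a flavor of what such a C kernel can look like, here is a minimal sketch of a 4x4 single-precision tile update. It is not the OpenBLAS kernel: the function name `sgemm_tile_4x4` and the packed layout of `a` and `b` are assumptions made for illustration. When GCC targets POWER10 with MMA enabled (`__MMA__` defined), the sketch uses the `__builtin_mma_xxsetaccz`, `__builtin_mma_xvf32gerpp` and `__builtin_mma_disassemble_acc` builtins; the mapping of accumulator rows to rows of C is assumed from the documented outer-product semantics and has not been verified on hardware here. On other targets a plain C loop computes the same result.

```c
#include <stddef.h>

#if defined(__MMA__)
#include <altivec.h>
typedef vector unsigned char vec_t;   /* operand type expected by the MMA builtins */
#endif

/*
 * Hypothetical helper: C[4][4] += A * B for a 4 x k by k x 4 product.
 * Assumed packing: a[4*p .. 4*p+3] is column p of A,
 *                  b[4*p .. 4*p+3] is row p of B.
 */
static void sgemm_tile_4x4(int k, const float *a, const float *b, float c[4][4])
{
#if defined(__MMA__)
    __vector_quad acc;                 /* 512-bit accumulator holding a 4x4 float tile */
    vector float rows[4];

    __builtin_mma_xxsetaccz(&acc);     /* zero the accumulator */
    for (int p = 0; p < k; p++) {
        vec_t va = (vec_t)vec_xl(0, a + 4 * p);
        vec_t vb = (vec_t)vec_xl(0, b + 4 * p);
        /* rank-1 update: acc[i][j] += a[i] * b[j] */
        __builtin_mma_xvf32gerpp(&acc, va, vb);
    }
    __builtin_mma_disassemble_acc(rows, &acc);   /* copy the tile out to four vectors */
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            c[i][j] += rows[i][j];     /* assumes rows[i] holds row i of the tile */
#else
    /* Portable fallback: the same sequence of rank-1 updates in scalar C. */
    for (int p = 0; p < k; p++)
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                c[i][j] += a[4 * p + i] * b[4 * p + j];
#endif
}
```

Compared with hand-written assembly, a kernel in this style leaves register allocation and instruction scheduling to the compiler, which is the kind of trade-off the talk's sample optimization illustrates.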

