The GPU/media/AI buffer management and interop microconference focuses on Linux kernel support for new graphics hardware that is coming out in the near future. Most vendors are also moving to firmware control of job scheduling, additionally complicating the DRM subsystem's model of open user space for all drivers and API. This has been a lively topic with neural-network accelerators in particular, which were accepted into an alternate subsystem to avoid the open-user space requirement, something which was later regretted.
As all of these changes impact both media and neural-network accelerators, this Linux Plumbers Conference microconference allows us to open the discussion past the graphics community and into the wider kernel community. Much of the graphics-specific integration will be discussed at XDC the prior week, but particularly with cgroup integration of memory and job scheduling being a topic, plus the already-complicated integration into the memory-management subsystem, input from core kernel developers would be much appreciated.
Quick 5 minutes introduction:
Rules of engagement
Notes taking strategy
Where to chat/interact
In order to meet our fixed frame deadlines (e.g. vertical refresh) whilst still having low power usage, we need to keep our power management policies balanced between performance bursts and deeper sleeps. Between dma-fence being used to declare synchronisation dependencies between multiple requests, and additional hints (e.g. input events suggesting that GPU activity will happen 'soon') we can...
- handling revoke and multi-user access for dmabuf/RDMA/ML
Currently there is no notion of cgroup accounting for GPU memory and execution. Discuss how we could integrate this with GEM/TTM memory management, including how to correctly account for allocations which are transferred between processes (e.g. Android gralloc-as-a-service), and integrating scheduler/runtime constraints with hardware-based scheduling on newer hardware designs.
Both future hardware and also user-visible APIs, are demanding that we discard our previous fence-based synchronisation model and allow arbitrary synchronisation primitives similar to Windows/DirectX 'timeline semaphores'. Outline the problems in trying to integrate this with our previous predictable fence-based model with dma_fence and dma_resv and discuss some potential paths and solutions.
Supporting predictable presentation timing for graphics and media usecases requires a great deal of plumbing through the stack, right up to userspace. Whilst some higher-level APIs have been discussed, there are a number of open questions including how to handle VRR, and how to support this with mailbox-type systems like KMS and Wayland. Outline the current state and wants from all the...