Spaarsh Gsoc Report
---
layout: post
title: "GSoC 2025 Report: Enhancing Debian packages with ROCm GPU acceleration"
date: 2025-09-01
categories: gsoc debian ROCm debian-packaging
author: Spaarsh Thakkar
---
GitLab Salsa: @Spaarsh
GitHub: Spaarsh
Introduction
I am Spaarsh Thakkar, a final-year Computer Science Engineering undergrad from India. My interests lie in research and systems. My recent work has been in and around Graphics Processing Units and I also hold a keen interest in Computer Networks. At the time of writing, I have been an open-source contributor for almost a year.
Proposal Description (as shown on the GSoC Project Profile [1])
Due to Debian’s open-source nature, no Debian package in main can have a proprietary GPU package listed as a dependency. While AI and HPC workloads increasingly rely on GPU acceleration, many Debian packages still focus solely on CUDA, which is proprietary.
With the advent of ROCm, an open-source GPU computing platform, we can now integrate full-fledged AMD GPU support into Debian packages. This will improve the experience of developers working in AI/ML and HPC while positioning Debian as a strong OS choice for GPU-driven workloads. The proposal aims to help solve the aforementioned problem by packaging several ROCm packages for Debian and adding ROCm support to some existing Debian packages.
The deliverables are as follows:
- New Debian packages with GPU support
- Enhanced GPU support within existing Debian packages
- More autopkgtests running on the Debian ROCm CI
Key Objectives
Enable ROCm in:
- dbcsr
- gloo
- cp2k
Publish the following packages to the Debian apt archive:
- hipblas-common
- hipBLASlt
Work Report
1. Publishing hipblas-common to apt
This objective was successfully completed, resulting in hipblas-common being published in the apt repository [2].
The process involved the following steps:
- Filing an Intent To Package (ITP) bug [3]
- Pulling the upstream source code repository from GitHub
- Adding the debian/ packaging files
- Testing the package locally
- Creating the corresponding project under rocm-team [4]
- Applying the necessary changes
- Building the package
- Testing it using sbuild
- Signing the package files
- Uploading the package to the mentors.debian.net archive (it is now in the official archive) [5]
- Addressing review feedback and making changes
- Requesting sponsorship [6]
- Securing sponsorship, which led to the package being accepted into the experimental suite of the Debian archive
Over the course of GSoC, the package has also been promoted to the unstable suite [2].
2. DBCSR ROCm and Multi-Arch Support
During my GSoC project, I worked on extending the DBCSR (Distributed Block Compressed Sparse Row) [7] package to improve its ROCm/HIP support and to handle multi-architecture GPU kernels in a way that is practical for both upstream maintainers and Debian package developers.
The code changes can be found in my dbcsr fork [8].
ROCm/HIP Enablement
- Enabled the ROCm backend in DBCSR through HIP-based builds, allowing GPU acceleration beyond CUDA.
- Investigated and resolved build issues specific to HIP kernels within DBCSR.
Multi-Architecture GPU Kernel Handling
(The following content was also presented in greater detail at DebConf’25. The presentation video can be found at [9] and the slides at [10].)
- DBCSR contains GPU kernels that are heavily optimized for specific architectures. By default, these are built for a single target architecture, which poses challenges for packaging, where binaries need to support multiple possible GPU targets.
- Explored different strategies for solving the multi-arch GPU kernel distribution problem, including:
  - Option 1: Fat binaries (embedding multiple GPU architectures into a single binary, with runtime dispatch). This is ideal for end-users but requires deeper changes upstream and is not straightforward with HIP/ROCm.
  - Option 2: Arch-specific libraries (e.g., libdbcsr.gfxXXX.a), where the alternatives system or explicit user selection would determine which one is used. This solves the problem but pushes complexity downstream into packaging and user configuration.
  - Option 3: Prefixed functions inside a single file, where kernels are compiled separately per architecture, functions are renamed with an arch prefix, and runtime logic in DBCSR decides which kernel to invoke. This shifts complexity upstream but could give a clean downstream experience.
- I critically analyzed these options in the context of Debian packaging and upstream maintainability. Arch-specific .a files compound dependency complexity, since each package that links against them then needs a matching per-architecture build of its own. The prefixed-function approach seemed like a plausible way forward, though it requires upstream buy-in.
- After consulting with my mentor, these concerns were raised in the dbcsr repository as a discussion [11].
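To make Option 3 more concrete, here is a minimal sketch of what arch-prefixed kernels with runtime selection could look like. It is not DBCSR code: the function names, the architecture list and the lookup table are all hypothetical, and the "kernels" are plain CPU stubs so that the example compiles without a GPU toolchain. In a real build, each prefixed function would come from an object compiled for one GPU target, and the architecture string would be queried from the HIP runtime (for example from the device properties) rather than passed in by hand.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical per-architecture entry points. Each stands in for a kernel
// launcher compiled for a single GPU target and renamed with an arch prefix.
int gfx90a_block_multiply(int a, int b) { return a * b; }
int gfx1030_block_multiply(int a, int b) { return a * b; }

using KernelFn = int (*)(int, int);

// Runtime logic: map the detected architecture name to the matching variant.
// The architectures listed here are only examples, not a supported set.
KernelFn select_block_multiply(const std::string& arch) {
    static const std::unordered_map<std::string, KernelFn> table = {
        {"gfx90a", &gfx90a_block_multiply},
        {"gfx1030", &gfx1030_block_multiply},
    };
    const auto it = table.find(arch);
    if (it == table.end()) {
        throw std::runtime_error("no kernel built for architecture " + arch);
    }
    return it->second;
}
```

A caller would resolve the function once, for instance with select_block_multiply("gfx90a"), and then invoke the returned pointer. The point of the design is that one library ships every variant, so downstream packages link against a single artifact and the choice of kernel moves into upstream's runtime logic.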
Summary
My work involved:
- Enabling HIP/ROCm support in DBCSR.
- Prototyping strategies for handling GPU multi-arch builds.
- Evaluating the trade-offs between upstream maintainability and downstream packaging complexity.
3. gloo, hipification and source code issues
Another targeted package was gloo [12], a collective communications library that implements several communication algorithms used in machine learning.
The code changes can be found in my gloo fork [13] (some changes have not been committed at the time of writing).
HIP/ROCm Enablement
- Fixing old ROCm CMake functions: The upstream Gloo codebase still used old ROCm CMake functions that began with the hip_ prefix (for example, hip_add_executable). These functions have since been deprecated or removed. I updated the build system to use the modern ROCm CMake equivalents so that the package builds properly in a current ROCm environment.
- Debian packaging changes: I modified debian/control to add a new package, libgloo-rocm, in addition to the existing packages. This allows proper separation and handling of ROCm-enabled builds in Debian.
- First successful library build: After these changes, I was able to build the library. However, producing the shared library failed with undefined symbol errors at link time.
Source Code Issue
On investigating the undefined symbol errors, I identified that they came from a lack of explicit template instantiation for some Gloo classes. C++ template code is only compiled for the types it is actually used with, or explicitly instantiated for, in a given translation unit, so the missing instantiations resulted in missing symbols in the shared library.
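As an illustration of the mechanism (not of Gloo's actual classes), the sketch below shows an explicit instantiation definition. DeviceBuffer is a hypothetical class invented for this example. In a real library, the member definition would typically live in a source file rather than a header, so without the last two lines no symbol for bytes() would be emitted for any type, and a program linking against the library would hit undefined symbols.

```cpp
#include <cstddef>

// Hypothetical stand-in for a templated library class; not Gloo's API.
template <typename T>
class DeviceBuffer {
public:
    explicit DeviceBuffer(std::size_t count) : count_(count) {}
    std::size_t bytes() const;  // declared here, defined out of line below

private:
    std::size_t count_;
};

// Out-of-line member definition. In a library this would usually sit in a
// source file that users of the header never see.
template <typename T>
std::size_t DeviceBuffer<T>::bytes() const {
    return count_ * sizeof(T);
}

// Explicit instantiation definitions: force the compiler to emit the code
// (and hence the symbols) for these element types in this translation unit.
template class DeviceBuffer<float>;
template class DeviceBuffer<double>;
```

With these instantiations in place, DeviceBuffer<float> and DeviceBuffer<double> are usable from other translation units that only see the class declaration, while any other element type would still fail to link.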
To solve this, I explored the source code and noticed that the HIP backend code was not written natively: it was generated from the CUDA backend using a custom hipification script maintained in the repository.
- I experimented with modifying the hipification process itself, trying out hipify-perl [14] instead of the repository’s custom Python script.
- I also tried tweaking the source code in places where template instantiations were missing, so that the ROCm build would correctly export the needed symbols.
Summary
The issue is still unresolved. The core problem lies in how the source code is structured: the HIP backend is almost entirely auto-generated from CUDA code, and the process does not handle template instantiations correctly. Because of this, the Debian package for Gloo with ROCm support is not yet ready for release, and further source-level fixes are required to make the ROCm build reliable.
4. cp2k
CP2K [15] is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems.
HIP/ROCm Enablement
cp2k depends on dbcsr, so HIP/ROCm enablement in this package first requires the work on the dbcsr [16] package to go through.
Even though dbcsr isn’t ready yet, it was worthwhile to plan how cp2k should be built with HIP/ROCm once dbcsr is in place. In doing so, it became clear that the architecture-wise libraries provided by the dbcsr package would result in a complicated build process for cp2k.
No changes have been made to this package yet, and more concrete steps will be taken once the dbcsr package work is completed.
Summary
The multi-arch build process for cp2k may be complicated by the one-static-library-per-architecture method used in its dependency, dbcsr.
Auxiliary Work & Activities
While working on the aforementioned GSoC goals, I also took on a few other activities.
- libamdhip64-dev bug report [17]
  While trying to enable HIP/ROCm in dbcsr, CMakeDetermineHIPCompiler.cmake was unable to find the HIP runtime CMake package. After going through similar issues faced earlier by other developers, I decided to file a bug report against the libamdhip64-dev package.
  After discussing it with Cory (my mentor) and trying the changes he suggested on the bug, the issue was resolved.
  It turns out I was using the wrong compiler! The gcc compiler was supposed to be used, and I was using hipcc. The bug was closed since it was not due to an issue with the package.
  Cory suggested that I add this information to the ROCm wiki page. That is yet to be done, and I hope to get to it soon.
- DebConf25 Talk
  After facing the multi-arch build dilemma with dbcsr (and also getting to know about the issues faced by fellow package developers), I came to realise that this was more than a packaging, build or programming issue. GPU packaging was facing a policy issue.
  Hence, I decided to cover this problem in greater detail in my DebConf25 virtual presentation under the Outreach Session.
  Shoutout to Cory for his support and Lucas Kanashiro for encouraging me to present my work!
- Bi-Weekly AMD ROCm Meetings
  Shortly after the coding period started, Cory started the Bi-Weekly AMD ROCm Meetings [18]. Being a part of the meetings (I participated in all but one!), seeing the work the other folks are doing and being able to discuss my own problems was a delight.
- (Upcoming) IndiaFOSS 2025 Talk
  After understanding the nuances and beauty of the Debian packaging ecosystem over these months, I decided to spread the word about Debian packaging and packaging software in general. My talk [19] on this was accepted at the upcoming IndiaFOSS 2025 [20] conference!
  I hope this brings more people towards the packaging ecosystem and the Debian developer ecosystem.
Conclusion
My GSoC time was fantastic! I plan to complete the work that I have started during my GSoC and beyond. Working with Cory [21] and Utkarsh [22] (a fellow GSoC’25 contributor under Cory) has been a very positive experience.
HIP/ROCm GPU-packaging is in a nascent stage. It is an exciting time to be in this space right now. The problems are new and never encountered before (CPU packaging isn’t architecture specific!). The problems we shall face in the coming time, and our solutions to them, will set a precedent for the future.
References
1 : https://summerofcode.withgoogle.com/programs/2025/projects/9s4jUjV0
2 : https://tracker.debian.org/pkg/hipblas-common
3 : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1105114
4 : https://salsa.debian.org/rocm-team
5 : https://packages.debian.org/source/sid/hipblas-common
6 : https://lists.debian.org/debian-ai/2025/05/msg00088.html
7 : https://www.cp2k.org/dbcsr
8 : https://salsa.debian.org/Spaarsh/dbcsr/
9 : https://drive.google.com/file/d/14WQuTMcI-L0lbi3zkUc9pT6RGwwVY0j1/view?usp=sharing
10 : https://docs.google.com/presentation/d/1p-nkHPgg5C5jKGy7ySZ8rts5G2vNFQpQJQ8UySOWgVE
11 : https://github.com/cp2k/dbcsr/discussions/933
12 : https://github.com/pytorch/gloo
13 : https://salsa.debian.org/Spaarsh/gloo
14 : https://tracker.debian.org/pkg/hipify
15 : https://www.cp2k.org/
16 : https://tracker.debian.org/pkg/dbcsr
17 : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1108159
18 : https://lists.debian.org/debian-ai/2025/05/msg00113.html
19 : https://fossunited.org/c/indiafoss/2025/cfp/dpq0b26ece
20 : https://fossunited.org/indiafoss/2025
21 : https://salsa.debian.org/cgmb
22 : https://salsa.debian.org/utk4r-sh