Dear ABINIT developers,
I’m interested in adding support for Intel GPUs through OpenMP offload, and I’d like some feedback/directions on what you believe is the best way to start. I’ve been skim-reading the code, and I have some doubts that I’d appreciate your help in understanding/confirming:
1) As of now, GPU support relies on OpenMP offload but also requires some low-level programming through CUDA and HIP (in addition to the corresponding math libraries, e.g. cuBLAS). Looking into shared/common/src/17_gpu_toolbox, I see that dev_spec_[cuda|hip] provide some support functions from the lower-level libraries. My questions about this are:
1.1) If the plan is to go with OpenMP, would it make sense to create some sort of dev_spec_omp_offload that unifies these, or at least the parts that OpenMP offload supports?
1.2) There are some FFT- and linear-algebra-related files there, which I believe I could forward to oneMKL; or do you think otherwise?
1.2.1) Not sure if this is fully related, but does this mean that some linear algebra or FFTs are currently not deployed on GPUs? I wonder if one could use MKL (or whichever math library) for all of these computations and let the library decide whether to use the GPU underneath.
1.3) The timing_[cuda|hip] files seem to use GPU events to track some device time, but they print data to the screen (see calc_cuda_time_, for instance). Are these actually used? Does ABINIT have some internal profiler?
1.4) How important is get_gpu_max_mem()? I’m not sure we have this feature right now, except possibly through some of OpenMP’s interop extensions.
2) Also, about the CUDA-specific folders: there is src/46_manage_cuda with CUDA code in it. Do you think it needs to be ported for a first round of enabling?
3) I’ve read in some source comments (sorry, I don’t recall which exact file) that the plan is to reach a point where the application uses managed (shared) memory; is that right?
4) Intertwined with 1): I’m currently using the autoconf mechanism to configure the package, and it seems to require and test against CUDA or HIP libraries. On Intel, I believe most of these math libraries reside in oneMKL, so there may be no need for further checks on GPU libs/capabilities. If so, could I “skip” some of the configuration in the autoconf scripts?
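What I have in mind is roughly the following configure.ac sketch (the option name and variable are invented for illustration; I don’t know how this fits your existing build machinery):

```m4
# Hypothetical: let the user pick a GPU backend; the OpenMP-offload/oneMKL
# path would bypass the CUDA/HIP probes entirely.
AC_ARG_ENABLE([gpu-flavor],
  [AS_HELP_STRING([--enable-gpu-flavor=@<:@cuda|hip|omp-offload@:>@],
     [select the GPU backend (hypothetical option)])],
  [gpu_flavor=$enableval], [gpu_flavor=cuda])

AS_IF([test "x$gpu_flavor" = "xomp-offload"],
  [AC_MSG_NOTICE([OpenMP offload + oneMKL: skipping CUDA/HIP checks])
   # FFT/BLAS would come from oneMKL, so no cuFFT/cuBLAS/hipBLAS probes.
  ],
  [# existing CUDA/HIP detection would run here
  ])
```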
5) Also related to configuration: do you suggest using the autoconf configure or CMake? My understanding is that CMake support for ABINIT is still in the works, right?
6) Is there a test/validation suite that I can use for checking my changes?
Since there are many topics here, and I may be missing many others, would you be open to having a call to discuss them?
Thank you so much in advance.