Multibinit spin dynamics does not run in parallel on more than 1 CPU core

Hi, I have two problems, probably related to each other, with Multibinit spin dynamics. Multibinit executes without issues when using only one core, i.e. with the command:
mpirun -n 1 multibinit --F03 <mb.files |& tee mb.out

However, when I try to run on 2 or more cores (mpirun -n 2 ...), it exits at the first appearance of the line "Thermalization run:". It does not report any error.

I have another issue which is probably related to this one. When I run two independent Multibinit calculations in different folders, each on one core, the Linux top command shows about 100% CPU utilization for each multibinit process, i.e. one core per process. However, after starting any additional multibinit calculation, the CPU utilization of each multibinit process decreases proportionally. For instance, when I run 8 independent calculations, each multibinit process utilizes only about 25% of CPU time.

My computer has a Ryzen 5950X CPU with 16 cores (32 threads). I compiled Abinit ver. 9.10.3 using GNU Fortran and Open MPI. The MPI installation seems to be correct, since parallel Siesta calculations run with 100% utilization on every requested core. I configured Abinit for compilation with:

FCFLAGS="-O2 -march=native -mtune=native -ffree-line-length-none -fallow-argument-mismatch -I/home/popsi/.local/include" CFLAGS="-O2 -march=native -mtune=native -I/home/popsi/.local/include" ../configure --with-hdf5=/home/popsi/.local --with-netcdf=/home/popsi/.local --with-netcdf-fortran=/home/popsi/.local --with-linalg-flavor="mkl" --enable-parallel-io --with-mpi --prefix=/home/popsi/.local

Before running multibinit I issue:

export LD_LIBRARY_PATH=/home/popsi/.local/lib:/opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:$LD_LIBRARY_PATH
export MKLROOT=/opt/intel/compilers_and_libraries_2017.4.196/linux/mkl

Example of mb.in file which I use is:

prt_model = 0

#--------------------------------------------------------------
#Monte Carlo / molecular dynamics
#--------------------------------------------------------------
dynamics =  0    ! disable molecular dynamics

ncell =  10  10 10   ! size of supercell. 
#-------------------------------------------------------------
#Spin dynamics
#------------------------------------------------------------
spin_dynamics=1               ! 1: HeunP  2: Depondt-Mertens  3: Monte Carlo
spin_mag_field= 0.0 0.0 0.0   ! external magnetic field
spin_ntime_pre =10000          ! warming up steps. 
spin_ntime =100000             ! number of steps. 
spin_nctime=1000            ! number of time steps between two nc file writes
spin_dt=2e-16 s               ! time step. 
spin_init_state = 1           ! random initial spin

spin_temperature=0.0

spin_var_temperature=1        ! switch on variable temperature calculation
spin_temperature_start=10      ! starting point of temperature
spin_temperature_end=100      ! ending point of temperature. 
spin_temperature_nstep=20     ! number of temperature steps.

spin_sia_add = 0              ! add a single-ion anisotropy (SIA) term?
#spin_sia_k1amp = 1e-6         ! amplitude of SIA (in Ha); how large should it be?
#spin_sia_k1dir = 0.0 0.0 1.0  ! direction of SIA

spin_calc_thermo_obs = 1      ! calculate thermodynamics related observables

Do you have any suggestion as to what I did wrong, i.e. how to make Multibinit spin dynamics run in parallel?

Hi,
Thanks for reporting this.
Could you try these commands and send the output?
ldd $(which multibinit)
and
mpirun --version
I will try on my computer and tell you the result.
If possible, could you also send the input files necessary for the run, in case it is related to the specific calculation.
Best regards,
HeXu

Hi, thank you for the reply!
Here is the output of ldd $(which multibinit):

linux-vdso.so.1 (0x00007ffd50fe6000)
	libmkl_scalapack_lp64.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_scalapack_lp64.so (0x00007f00da800000)
	libmkl_gf_lp64.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so (0x00007f00d9c00000)
	libmkl_sequential.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_sequential.so (0x00007f00d8e00000)
	libmkl_core.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_core.so (0x00007f00d7200000)
	libmkl_blacs_intelmpi_lp64.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_blacs_intelmpi_lp64.so (0x00007f00d6e00000)
	libgfortran.so.5 => /usr/lib/libgfortran.so.5 (0x00007f00d6b13000)
	libm.so.6 => /usr/lib/libm.so.6 (0x00007f00da713000)
	libmvec.so.1 => /usr/lib/libmvec.so.1 (0x00007f00d7107000)
	libnetcdff.so.6 => /home/popsi/.local/lib/libnetcdff.so.6 (0x00007f00da64a000)
	libnetcdf.so.19 => /home/popsi/.local/lib/libnetcdf.so.19 (0x00007f00d689e000)
	libhdf5_hl.so.310 => /home/popsi/.local/lib/libhdf5_hl.so.310 (0x00007f00db12b000)
	libhdf5.so.310 => /home/popsi/.local/lib/libhdf5.so.310 (0x00007f00d6422000)
	libz.so.1 => /home/popsi/.local/lib/libz.so.1 (0x00007f00db10a000)
	libxml2.so.2 => /home/popsi/.local/lib/libxml2.so.2 (0x00007f00d6258000)
	liblzma.so.5 => /usr/lib/liblzma.so.5 (0x00007f00d9bcd000)
	libmpi_usempif08.so.40 => /usr/lib/libmpi_usempif08.so.40 (0x00007f00d8dba000)
	libmpi_usempi_ignore_tkr.so.40 => /usr/lib/libmpi_usempi_ignore_tkr.so.40 (0x00007f00db0fa000)
	libmpi_mpifh.so.40 => /usr/lib/libmpi_mpifh.so.40 (0x00007f00d8d57000)
	libmpi.so.40 => /usr/lib/libmpi.so.40 (0x00007f00d612d000)
	libquadmath.so.0 => /usr/lib/libquadmath.so.0 (0x00007f00d70be000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f00d9ba8000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f00d5ee0000)
	libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f00db0e2000)
	libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f00da645000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f00e18b8000)
	libzip.so.5 => /usr/lib/libzip.so.5 (0x00007f00d8d37000)
	libsz.so.2 => /usr/lib/libsz.so.2 (0x00007f00da639000)
	libbz2.so.1.0 => /usr/lib/libbz2.so.1.0 (0x00007f00d70ab000)
	libzstd.so.1 => /usr/lib/libzstd.so.1 (0x00007f00d5e0d000)
	libblosc.so.1 => /usr/lib/libblosc.so.1 (0x00007f00d7098000)
	libcurl.so.4 => /usr/lib/libcurl.so.4 (0x00007f00d5d55000)
	libopen-pal.so.40 => /usr/lib/libopen-pal.so.40 (0x00007f00d5ca9000)
	libopen-rte.so.40 => /usr/lib/libopen-rte.so.40 (0x00007f00d5bf0000)
	libhwloc.so.15 => /usr/lib/libhwloc.so.15 (0x00007f00d5b93000)
	libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x00007f00d5696000)
	liblz4.so.1 => /usr/lib/liblz4.so.1 (0x00007f00d7076000)
	libsnappy.so.1 => /usr/lib/libsnappy.so.1 (0x00007f00d706a000)
	libnghttp2.so.14 => /usr/lib/libnghttp2.so.14 (0x00007f00d703f000)
	libidn2.so.0 => /usr/lib/libidn2.so.0 (0x00007f00d5674000)
	libssh2.so.1 => /usr/lib/libssh2.so.1 (0x00007f00d562b000)
	libpsl.so.5 => /usr/lib/libpsl.so.5 (0x00007f00d5617000)
	libssl.so.3 => /usr/lib/libssl.so.3 (0x00007f00d5577000)
	libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x00007f00d5523000)
	libbrotlidec.so.1 => /usr/lib/libbrotlidec.so.1 (0x00007f00d5515000)
	libevent_core-2.1.so.7 => /usr/lib/libevent_core-2.1.so.7 (0x00007f00d54e3000)
	libevent_pthreads-2.1.so.7 => /usr/lib/libevent_pthreads-2.1.so.7 (0x00007f00da632000)
	libudev.so.1 => /usr/lib/libudev.so.1 (0x00007f00d54ac000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f00d5231000)
	libunistring.so.5 => /usr/lib/libunistring.so.5 (0x00007f00d5077000)
	libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00007f00d4f9f000)
	libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00007f00d4f71000)
	libcom_err.so.2 => /usr/lib/libcom_err.so.2 (0x00007f00d9ba2000)
	libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00007f00d4f63000)
	libkeyutils.so.1 => /usr/lib/libkeyutils.so.1 (0x00007f00d4f5c000)
	libresolv.so.2 => /usr/lib/libresolv.so.2 (0x00007f00d4f4b000)
	libbrotlicommon.so.1 => /usr/lib/libbrotlicommon.so.1 (0x00007f00d4f28000)

mpirun --version gives: mpirun (Open MPI) 4.1.5.

How can I send input files? I tarred/gzipped exchange.xml and mb.{in,files}, but the website accepts only files with these extensions: jpg, jpeg, png, gif, heic, heif, webp, abi, abo, log, pdf, nc, ac, ac9, txt, in, fc.

Kind regards,
Igor

I’ve upgraded you to “basic user” status: you can send files now :wink:

best

jmb

Thank you! I still cannot upload a .tar.gz file, since the same error message about allowed file extensions shows up. The exchange.xml file is 30 MB, but compressed together with mb.in and mb.files it is only 2.5 MB. So I changed the extension from .tar.gz to .tar.log (.log is allowed). You may rename it back and decompress.
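After downloading, something like this should do it:

mv inputs.tar.log inputs.tar.gz
tar -xzf inputs.tar.gz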

Cheers,
Igor
inputs.tar.log (2.5 MB)

added tar.gz and tgz :slight_smile:

Fine, whatever works :slight_smile:

In fact, I have the same issue with a much simpler system, hcp Co, which comes in the examples folder of TB2J. I attach this one too.

inputs-simple.tar.log (26.0 KB)

Thanks for the test cases!
The first thing I notice is that ABINIT/MULTIBINIT is linked against the Intel MPI version of BLACS from MKL, while the MPI actually used is Open MPI; see this line:
libmkl_blacs_intelmpi_lp64.so => /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_blacs_intelmpi_lp64.so (0x00007f00d6e00000)
This should be unrelated to the problem, though, as MULTIBINIT does not make use of BLACS.
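If you want to fix the linking anyway (just a side note, as it should not affect the spin dynamics): MKL also ships an Open MPI variant of BLACS, so the link line would use libmkl_blacs_openmpi_lp64 instead of libmkl_blacs_intelmpi_lp64. A quick way to check which MPI-related libraries are actually linked is:

ldd $(which multibinit) | grep -Ei 'blacs|mpi'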
I will debug with the files you sent and reply again.
Best regards,
HeXu

I directly changed the Makefile to link against mkl_blacs_openmpi_lp64 instead of mkl_blacs_intelmpi_lp64 and rebuilt the executable. Indeed, it did not solve the issue. Execution of multibinit gets stuck at the line "Initial spins set to random values." while multiple multibinit processes keep running at 100% CPU utilization (when run e.g. with mpirun -n 4 multibinit ...). So part of the code burns 100% CPU time while actually doing nothing, or at least no further output is generated. And again, when running more than 2 independent serial multibinit calculations in different folders, each with mpirun -n 1 multibinit, the CPU usage per process decreases linearly, as if they were somehow aware of each other.

m_spin_mover.F90.txt (38.3 KB)
m_spin_primitive_potential.F90.txt (34.6 KB)

Hi,
I have found the bug that makes the calculation hang when ncpu > 1. Could you try replacing the .F90 files in src/78_effpot with the two files attached? (The .txt extension should be removed from the filenames.)
I am not sure whether the CPU usage problem with multiple serial calculations is related or not. I have tried different machines and compilers and did not reproduce the problem.
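One thing that might be worth checking (just a guess, I have not verified this on your machine): by default, Open MPI binds processes to cores, so several independent mpirun -n 1 jobs launched separately may all end up pinned to the same core, which would explain why they appear "aware of each other". You could try disabling the binding, e.g.:

mpirun --bind-to none -n 1 multibinit --F03 <mb.files |& tee mb.out

and see whether the CPU utilization per process stays near 100%.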
Could you try the fix? Thanks a lot!
Best regards,
HeXu

It works, HeXu!! Thanks for your support and effort!!!

I marked your last answer as “solution”.

I have two other questions, but they are not related to the current thread, so I will open two new threads.