Crash while performing positron calculation with larger vacancy

iavas · August 17, 2023, 6:58pm

Hello all,

I am performing calculations of the positron annihilation momentum Doppler spectrum, using ABINIT 9.6.2, following this tutorial.

I have run the calculation with the same parameters (PAW basis, lattice size, ecut, k-points, e-p xc functional, etc.) for silicon vacancy clusters of different sizes. The mono- and di-vacancy cases both complete normally, but for a cluster of three nearest-neighbor silicon vacancies the calculation always exits with error code 14 after completing the lifetime calculation and before starting the Doppler calculation.

Below are the input file I am using and the associated output and log files:
sic332-vsisisi-cluster.abi (10.4 KB)
sic332-vsisisi-cluster.abo (89.7 KB)
sic332-vsisisi-cluster.log (1.3 MB)
sic332-vsisisi-cluster.err.log (14.4 KB)
From the err file, this seems to be a parallelization problem, but with my limited knowledge of MPI I could not pinpoint anything related to this error. I’ve tried a number of different KGB parallelization settings, and they all exit at the same place (between the lifetime and Doppler calculations) with the same error code (14). Meanwhile, the systems with one and two silicon vacancies run smoothly, even though they contain more electrons.

A serial run might solve this problem, and I’m trying it now, but it will definitely be very slow. I would like to find a way to run the calculation in parallel if possible.

Thanks!

trungnvm · August 18, 2023, 8:46am

Hi iavas,

In my experience, this error comes from a memory issue. Please take a snapshot of top when the Doppler part starts; that should show how much memory you need.
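If watching top by hand is awkward, something like the rough sketch below could log the memory of the running processes instead. It assumes a Linux-style `ps` that accepts `-eo pid,rss,comm` and reports RSS in KiB, and that the executable shows up as `abinit`; the function names are made up for illustration and are not part of ABINIT.

```python
import subprocess
import time


def parse_rss_kib(ps_output, name="abinit"):
    """Return {pid: rss_kib} for processes whose command contains `name`.

    Expects the text produced by `ps -eo pid,rss,comm` (assumed format).
    """
    usage = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header line and anything that is not "pid rss command".
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        pid, rss, comm = parts
        if name in comm:
            usage[int(pid)] = int(rss)
    return usage


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / 1024**2


def snapshot(name="abinit"):
    """Take one snapshot of resident memory per matching process."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_rss_kib(out, name)


def monitor(interval_s=5.0, samples=12, name="abinit"):
    """Print total and per-process peak resident memory a few times."""
    for _ in range(samples):
        usage = snapshot(name)
        total = kib_to_gib(sum(usage.values()))
        peak = kib_to_gib(max(usage.values(), default=0))
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp}  procs={len(usage)}  total={total:.2f} GiB  "
              f"max/proc={peak:.2f} GiB")
        time.sleep(interval_s)
```

Calling `monitor()` on the compute node while the job moves from the lifetime part to the Doppler part would show whether the memory per process jumps there. On a cluster this only sees the processes on the node where it runs.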

Trung

iavas · August 19, 2023, 9:14am

Hi trungnvm,

I think that’s the problem. I tried another run with far fewer CPUs, and the Doppler calculation is proceeding normally now. Thanks for the info!

What puzzles me now is that this case takes up almost twice as much memory per process as the divacancy case. It has one fewer silicon atom, so it should contain fewer electrons. Why is the wavefunction data so much larger instead?