A search is presented for a heavy scalar (H) or pseudo-scalar (A) predicted by the two-Higgs-doublet models, where the H/A is produced in association with a top-quark pair $$(t\bar{t}H/A),$$ ( t t ¯ H / A ) , and with the H/A decaying into a $$t\bar{t}$$ t t ¯ pair. The full LHC Run 2 proton–proton collision data collected by the ATLAS experiment is used, corresponding to an integrated luminosity of $$139~\text {fb}^{-1}.$$ 139 fb - 1 . Events are selected requiring exactly one or two opposite-charge electrons or muons. Data-driven corrections are applied to improve the modelling of the $$t\bar{t}$$ t t ¯ +jets background in the regime with high jet and b-jet multiplicities. These include a novel multi-dimensional kinematic reweighting based on a neural network trained using data and simulations. An H/A-mass parameterised graph neural network is trained to optimise the signal-to-background discrimination. In combination with the previous search performed by the ATLAS Collaboration in the multilepton final state, the observed upper limits on the $$t\bar{t}H/A \rightarrow t\bar{t}t\bar{t}$$ t t ¯ H / A → t t ¯ t t ¯ production cross-section at 95% confidence level range between 14 fb and 5.0 fb for an H/A with mass between 400 $$\text {GeV}$$ GeV and 1000 $$\text {GeV}$$ GeV , respectively. Assuming that both the H and A contribute to the $$t\bar{t}t\bar{t}$$ t t ¯ t t ¯ cross-section, $$\tan \beta $$ tan β values below 1.7 or 0.7 are excluded for a mass of 400 $$\text {GeV}$$ GeV or 1000 $$\text {GeV}$$ GeV , respectively. The results are also used to constrain a model predicting the pair production of a colour-octet scalar, with the scalar decaying into a $$t\bar{t}$$ t t ¯ pair.