We rigorously derive a time-dependent analytical model showing that thermally induced modal instability in high-power rare-earth-doped fiber amplifiers is fundamentally two-wave mixing between the fundamental and higher-order modes, mediated by a thermally induced grating that is imprinted by the beating between these modes. We show that the previously postulated movement of this grating to phase-match the coupling between the modes occurs naturally because of the finite thermal response time of the fiber. The theory is consistent with experimental observations: it accurately predicts the onset-like threshold and the temporal instabilities in the kilohertz frequency range.