It most likely has to do with the different video frame rates in use: 24 fps, 25 fps and 23.976 fps.
The audio is not locked to the individual frames of the movie, so when a movie is recorded at 24 (or 25) fps but played back at 23.976 fps, the picture and the sound drift apart. That is the most obvious case. Rounding errors, noisy clock sources and so on can also make the audio fall progressively further out of sync.
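To put a number on that drift, here is a rough sketch in Python. It assumes the audio keeps playing at its original speed while the frames are shown at a different rate than they were recorded at; the function name `av_drift_seconds` and the constant `NTSC_FILM` are made up for this illustration and do not come from any player or library.

```python
from fractions import Fraction

# NTSC "23.976" is exactly 24000/1001 frames per second.
NTSC_FILM = Fraction(24000, 1001)


def av_drift_seconds(recorded_fps, playback_fps, audio_seconds):
    """Seconds by which the picture trails the sound after `audio_seconds`.

    Assumes the audio plays at its original speed while the frames are
    shown at `playback_fps` instead of `recorded_fps`. A positive result
    means the video is late; a negative one means it is early.
    """
    recorded = Fraction(recorded_fps)
    playback = Fraction(playback_fps)
    # Frames that belong to this stretch of audio.
    frames = recorded * audio_seconds
    # Time the player actually needs to show those frames.
    video_seconds = frames / playback
    return video_seconds - Fraction(audio_seconds)


if __name__ == "__main__":
    one_hour = 3600
    print(float(av_drift_seconds(24, NTSC_FILM, one_hour)))  # 3.6
    print(float(av_drift_seconds(24, 25, one_hour)))         # -144.0
```

So a 24 fps recording played at 23.976 fps ends up about 3.6 seconds out of sync after an hour, which is small at first but easy to notice later in the movie. The 24 to 25 fps case is much larger (the video runs 144 seconds ahead per hour) unless the audio is sped up to match.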
"The 24p frame rate is also a noninterlaced format, and is now widely adopted by those planning on transferring a video signal to film. But film- and video-makers turn to 24p for the "cine"-look even if their productions are not going to be transferred to film, simply because of the "look" of the frame rate. When transferred to NTSC television, the rate is effectively slowed to 23.976 fps, and when transferred to PAL or SECAM it is sped up to 25 fps. 35 mm movie cameras use a standard exposure rate of 24 frames per second, though many cameras offer rates of 23.976 fps for NTSC television and 25 fps for PAL/SECAM. The 24 fps rate became the de facto standard for sound motion pictures in the mid-1920s.[1]"