Although it probably seems like very little stretching, 2 seconds is a fair amount, so the software you're using basically has to make up additional sample points along the new waveform where none existed before. Also, while the software's algorithms are crunching the numbers to resample, little clicks, pops, and other artifacts can surface, and that could be what's causing the audio to clip. Even if you can't hear them or see them at first glance, if you zoom into the waveform all the way to sample level you may just find a nasty little peak. I know I've found similar issues after using the 'time compression/expansion' tool in Pro Tools.
(This is my understanding, anyway. I hope I'm not wrong.)
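If it helps to see the "made-up sample points" idea in action, here is a rough Python sketch. To be clear, this is not what Pro Tools does internally (I don't know its algorithm); it is just a plain windowed-sinc resampler I wrote for illustration, and the names `interpolate`, `stretch`, and `half_width` are my own. The test signal is a sine at a quarter of the sample rate, sampled 45 degrees off its peaks, so every stored sample sits at exactly ±1.0 even though the underlying waveform swings to about ±1.41. Once the stretch computes new points between the old ones, some of them land near those hidden peaks and come out above full scale.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def interpolate(samples, t, half_width=32):
    """Estimate the waveform at fractional position t (Hann-windowed sinc)."""
    center = math.floor(t)
    total = 0.0
    for n in range(center - half_width + 1, center + half_width + 1):
        if 0 <= n < len(samples):
            d = t - n
            # Hann window tapers the sinc kernel to zero at +/- half_width
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            total += samples[n] * sinc(d) * window
    return total


def stretch(samples, ratio):
    """Lengthen the clip by `ratio` by computing new points between old ones.

    Note: this is plain resampling, so it also lowers the pitch. Real
    time-stretch tools keep the pitch, but they still have to synthesize
    samples that were not in the original recording.
    """
    new_len = int(len(samples) * ratio)
    return [interpolate(samples, i / ratio) for i in range(new_len)]


# Sine at 1/4 of the sample rate, sampled 45 degrees away from its peaks:
# every stored sample is exactly +/-1.0 (full scale, no clipping visible).
original = [1.0, 1.0, -1.0, -1.0] * 64

stretched = stretch(original, 1.05)  # a "small" 5% stretch

print("original peak: ", max(abs(s) for s in original))
print("stretched peak:", max(abs(s) for s in stretched))  # comes out above 1.0
```

So a file that looked fine before the stretch can end up with peaks over full scale afterwards, which would show up as clipping unless you pull the gain down a little first.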