If your mics are 50' from the stacks, 4ms doesn't sound like enough.
The sound travels from the stacks to the mics at the speed of sound (duh!), which is about 1 millisecond per foot, so I would think it's more like 50 milliseconds. That's just a rough guess, but I think it's a lot more than 4. Technically, the speed of sound varies with air temperature, but unless you've got some radical weather changes going on, I'm not convinced it changes much over a 2-hour period, so I generally align the beginning and don't worry about drift.
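For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes the common linear approximation for the speed of sound in dry air (331.3 + 0.606 * T m/s, with T in degrees Celsius); the function names and the 44.1 kHz default are just my own picks for illustration. For 50 feet at 20 C it comes out to roughly 44 ms, and a 10 C temperature swing moves that by less than a millisecond.

```python
FEET_TO_METERS = 0.3048


def speed_of_sound_m_per_s(temp_c):
    """Approximate speed of sound in dry air (linear approximation, an assumption)."""
    return 331.3 + 0.606 * temp_c


def delay_ms(distance_ft, temp_c=20.0):
    """Time in milliseconds for sound to travel distance_ft at temp_c."""
    distance_m = distance_ft * FEET_TO_METERS
    return distance_m / speed_of_sound_m_per_s(temp_c) * 1000.0


def delay_samples(distance_ft, sample_rate=44100, temp_c=20.0):
    """Same delay expressed as a whole number of samples."""
    return round(delay_ms(distance_ft, temp_c) / 1000.0 * sample_rate)


if __name__ == "__main__":
    # Mics 50 feet from the stacks, room temperature.
    print("delay (ms):", delay_ms(50))
    print("delay (samples @ 44.1 kHz):", delay_samples(50))
    # How much the delay moves if the air cools from 20 C to 10 C.
    print("change for a 10 C swing (ms):", delay_ms(50, 10.0) - delay_ms(50, 20.0))
```

Treat the result as a starting offset only. The measured distance is usually a guess, so you still have to fine-tune by eye and ear.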
When you have a big error (100ms or more) it will sound "echoey". When you are down to smaller amounts, it will sound OK, but not perfectly crisp. When you are spot on, it will be nice and crisp. I listened to your T05 flac, and IMO, it falls into that category of "Sounds OK, but not perfectly crisp", so I think you're not quite perfect yet. It should sound as good as the SBD you are starting with, only better.
My technique is to try to find some spot where it goes from quiet to "loud" and line up the beginning of that jump. I find this more definitive than peaks. Then I look for other similar spots, and make sure they look lined up too. Then I pan the AUD to the left ear and the SBD to the right ear, listen to the drums, and make sure they sound in sync. If for some reason I can't get it just right, I would rather have the SBD a couple of milliseconds ahead of the AUD than the other way around.
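If you would rather get a number than eyeball the waveform, the "quiet to loud" idea can be sketched roughly like this in Python. This is only my illustration, not something any editor does for you: the function names, the half-of-peak threshold, and the made-up sample data are all assumptions. It expects two short clips (plain lists of sample values) trimmed around the same jump, and it reports how many samples the AUD lags the SBD.

```python
def first_onset(samples, fraction=0.5):
    """Index of the first sample whose level reaches fraction * peak, or None if silent."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return None
    threshold = fraction * peak
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    return None


def aud_lag_samples(sbd, aud, fraction=0.5):
    """How many samples later the jump shows up in the AUD than in the SBD."""
    sbd_onset = first_onset(sbd, fraction)
    aud_onset = first_onset(aud, fraction)
    if sbd_onset is None or aud_onset is None:
        raise ValueError("one of the clips is silent")
    return aud_onset - sbd_onset


def samples_to_ms(n, sample_rate=44100):
    """Convert a sample count to milliseconds."""
    return n * 1000.0 / sample_rate


if __name__ == "__main__":
    # Made-up clips: low-level noise, then a loud hit. The AUD hit arrives later and quieter.
    sbd_clip = [0.01] * 100 + [1.0] * 50
    aud_clip = [0.01] * 144 + [0.8] * 50
    lag = aud_lag_samples(sbd_clip, aud_clip)
    print("AUD lags SBD by", lag, "samples, or", samples_to_ms(lag), "ms")
```

Because it grabs the first point that crosses half the clip's peak, it only works on a clip with one clean jump in it. I would still check a few other spots and do the left ear / right ear listen afterwards.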
That's our opinion, we welcome yours...