If the clips are from music, you don't need to normalize.
If the clips are just from the balloons and fireworks, I'd compress those sections - say 10:1 with the threshold at X dB, where X is the level of the loudest music during the same passage. I know, I know... you'll be missing the dynamic range of those fireworks, and they can be pretty cool.
Still, I'd go for the music.
After compressing, I'd add gain across the board to bring your peak levels back up to 0 dB.
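If it helps to see the numbers, here's a rough sketch of the compress-then-gain idea in Python. It is not how CE does it internally: it's a static, per-sample hard-knee compressor with no attack or release, and the function names, the -12 dB music level, and the sample values are all made up for illustration.

```python
import math

def db_to_linear(db):
    # Convert a level in dB (relative to full scale) to linear amplitude.
    return 10.0 ** (db / 20.0)

def linear_to_db(amplitude):
    # The small floor avoids log10(0) on silent samples.
    return 20.0 * math.log10(max(abs(amplitude), 1e-12))

def compress(samples, threshold_db, ratio=10.0):
    # Hard knee: anything above the threshold is scaled so that each
    # `ratio` dB of input overshoot yields only 1 dB of output overshoot.
    out = []
    for s in samples:
        level_db = linear_to_db(s)
        if level_db > threshold_db:
            target_db = threshold_db + (level_db - threshold_db) / ratio
            s = math.copysign(db_to_linear(target_db), s)
        out.append(s)
    return out

def makeup_to_zero(samples):
    # Apply one gain "across the board" so the loudest peak sits at 0 dB.
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)
    gain = 1.0 / peak
    return [s * gain for s in samples]

# Hypothetical passage: music peaking at -12 dB, a firework hitting 0 dB.
music_peak = db_to_linear(-12.0)
clip = [music_peak, -music_peak, 1.0, -0.5]

squashed = compress(clip, threshold_db=-12.0, ratio=10.0)
final = makeup_to_zero(squashed)
```

With these made-up numbers, the firework that was 12 dB above the music ends up only 1.2 dB above it after compression, and the makeup gain then lifts the whole passage so that peak lands at 0 dB.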
Others may have different ideas about how to address your question. Sorry, I can't help on the normalization bit, as I don't use that function in CE to raise my levels.