Converting from 24 bit to 16 bit is truncation: the least significant 8 bits of each 24 bit sample are lopped off and thrown away. It is always advantageous to dither before the truncation happens, and most editors and stand-alone programs do both automatically in a single step, meaning you don't actually have to tell them to dither first and then truncate. As best practice, dithering/truncation should be the last editing step, other than tracking. If you are planning on doing further editing, you are technically better off using simple triangular dither (you'll probably see that option in whatever you are using) rather than a fancy "noise-shaped psychoacoustic" dither. The noise-shaped dithers can lower the perceived noise floor by a few additional dB and might be technically better if no additional processing will be done in the future, but honestly I doubt you'd hear any difference on almost any material at playback levels anywhere near safe for human listening. I've only heard a difference between types of dither when I really crank up the quietest sections of my most dynamic, pristine recordings made in very quiet environments, and that simply doesn't correspond to real-world listening. Doing that convinced me that simple triangular dither is usually the best choice for me unless I'm doing a final CD edit of really pristine stuff for a band to release or something, and even then I only use the fancy dither because I can, not really because I think it matters.
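For anyone curious what "dither, then lop off 8 bits" looks like in practice, here is a minimal Python sketch of triangular dither followed by truncation. It is only an illustration under my own assumptions, not what any particular editor does internally: the function name dither_and_truncate is made up, samples are assumed to be signed 24 bit integers, and I add half a step before chopping so the truncation behaves like rounding (plain truncation would work too, it just leaves a tiny DC offset).

```python
import random

def dither_and_truncate(sample_24, rng=random):
    """Convert one signed 24 bit integer sample to signed 16 bit (hypothetical helper)."""
    # Triangular dither: the sum of two uniform random values has a
    # triangular distribution. Here it spans -255..+255 in 24 bit units,
    # i.e. roughly +/- one 16 bit step (about the bottom 9 bits).
    noise = rng.randrange(256) + rng.randrange(256) - 255
    # Assumption: add half a 16 bit step (128) so the chop below rounds
    # instead of always rounding down.
    dithered = sample_24 + noise + 128
    # Truncation: drop the lowest 8 bits with an arithmetic shift.
    sample_16 = dithered >> 8
    # Clamp, because the added noise can push full-scale samples out of range.
    return max(-32768, min(32767, sample_16))
```

A quick way to see why the dither helps: feed it a constant value of 128, which sits exactly halfway between two 16 bit steps. Without dither every output would land on the same step and the "half" would be lost; with dither the outputs flip between 0 and 1 and should average out close to 0.5, so the low-level information survives as noise instead of being erased.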
Resampling recalculates the audio at a new sample rate, for example changing from 48 kHz to 44.1 kHz. That is a far more processor-intensive task than changing the bit depth. The math isn't really much of a problem these days, but high quality routines used to be pretty demanding on more limited computer resources. A good quality resampling routine is probably more important than a fancy dither routine, partly because of that difference in computational difficulty, which leaves more room for a cheap routine to cut corners (dithering/truncating is computationally easy: just add some engineered noise to the bottom 9 bits and then chop off the lower 8 bits), and partly because low quality resampling artifacts show up at all volume levels and are "unmusical" information, whereas dither noise affects only the very quietest part of the recording, down at the noise floor of the wave file, and simply sounds like very quiet tape hiss, usually completely buried under the far louder acoustic noise floor of any live recording. I try to use the best quality resampling algorithm my software offers and don't worry much about fancy dither.
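To give a feel for why resampling is so much more work than dithering, here is a rough Python sketch of a windowed-sinc resampler. Treat it as a teaching toy under my own assumptions, not the algorithm your editor uses: the function names, the Hann window, and the half_width of 32 input samples are all choices I made for illustration, and real high quality resamplers use much longer, more carefully designed filters.

```python
import math
from fractions import Fraction

def sinc(t):
    # Normalized sinc: sin(pi*t) / (pi*t), with the limit value 1 at t = 0.
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def resample(samples, rate_in, rate_out, half_width=32):
    """Resample a list of floats from rate_in to rate_out (illustrative only)."""
    ratio = Fraction(rate_out, rate_in)    # 44100/48000 reduces to 147/160
    # When going down in rate, low-pass below the new, lower Nyquist frequency.
    cutoff = min(1.0, float(ratio))
    n_out = len(samples) * ratio.numerator // ratio.denominator
    out = []
    for n in range(n_out):
        pos = n * rate_in / rate_out       # output instant, in input-sample units
        first = max(0, math.ceil(pos - half_width))
        last = min(len(samples) - 1, math.floor(pos + half_width))
        acc = 0.0
        # Every output sample is a weighted sum of up to 65 nearby input samples.
        for k in range(first, last + 1):
            t = pos - k
            window = 0.5 + 0.5 * math.cos(math.pi * t / half_width)  # Hann window
            acc += samples[k] * cutoff * sinc(cutoff * t) * window
        out.append(acc)
    return out
```

Compare that inner loop to the dither case: each output sample here costs dozens of multiplies and a fresh set of filter weights, versus one addition and one shift for dither/truncate. Shortening the filter or using crude interpolation is the kind of shortcut that produces the artifacts described above.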
Best practice order:
Do any processing such as fades, EQ, compression and what have you at or above the native bit depth and sample rate of the recording. Most editors do that for you automatically. When that’s all done, and if you need to convert, resample first and then dither/truncate, as the final two steps. An exception is tracking, which can be done before or after resampling and dithering, depending on what is more convenient for you. That’s because tracking is simple file splitting and doesn’t change any of the audio data between the split points.