For anyone reading this nowadays:
The difference in execution time comes from the fact that async reads report larger execution times, but they do not stall the GPU pipeline! The read result might only come back after a few more frames have been rendered. Basically, you are measuring not just the pixel read itself (which is quick) but also every other command the GPU executes in the meantime.
A sync read, by contrast, will stop GPU execution until the pixels have been read.
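
To make the difference concrete, here is a toy timeline model in Python. It is only a sketch, not real GPU code: the function names, the 16 ms frame cost, the 1 ms read cost and the two-frame latency are all made-up assumptions, chosen just to show why the async read looks slower when timed in isolation while the overall run is faster.

```python
# Toy model only: all numbers are hypothetical milliseconds, not measurements.
FRAME_COST = 16.0  # assumed time the GPU needs to render one frame
READ_COST = 1.0    # assumed time the pixel copy itself takes


def sync_read(frames):
    """Blocking read: rendering waits for the copy on every frame."""
    measured_read_time = READ_COST                  # what a timer around the read sees
    total_time = frames * (FRAME_COST + READ_COST)  # every frame pays for the stall
    return measured_read_time, total_time


def async_read(frames, latency_frames=2):
    """Non-blocking read: the result arrives latency_frames later."""
    # The timer also counts the frames rendered while the request was pending.
    measured_read_time = latency_frames * FRAME_COST + READ_COST
    # The copy overlaps with rendering, so it adds no per-frame stall
    # (the short tail after the last frame is ignored here).
    total_time = frames * FRAME_COST
    return measured_read_time, total_time


if __name__ == "__main__":
    print("sync  (read time, total time):", sync_read(100))
    print("async (read time, total time):", async_read(100))
```

In this model the async read reports a much larger time per read, yet the total time for the same number of frames is smaller, because nothing ever waits on the copy.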