My understanding of the Alpha 21064 chip (from several presentations) is that it uses a single inverter for clock distribution. This inverter is built from one large PMOS and one large NMOS device, which together have a total gate width of over 14 inches.
Obviously, since the chip is nowhere near 14 inches long, the devices are folded quite a number of times, i.e. laid out as many shorter gate fingers wired in parallel. The AWESIM timing simulator from CMU was used to model this circuit.
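
To put a rough number on the folding, here is a back-of-the-envelope sketch. The only figure taken from above is the 14 inch total gate width; the per-finger widths are purely hypothetical values picked for illustration, not numbers from the presentations, and finger_count is just a helper name made up for this example.

```python
import math

INCH_TO_UM = 25_400  # 1 inch = 25.4 mm = 25,400 micrometres (exact)

def finger_count(total_width_in, finger_width_um):
    """Parallel gate fingers needed to reach a given total gate width."""
    total_um = total_width_in * INCH_TO_UM
    return math.ceil(total_um / finger_width_um)

total_in = 14.0  # total gate width quoted above, in inches

# Hypothetical finger widths in micrometres (assumed, not from the source).
for finger_um in (50, 100, 200):
    print(f"{finger_um:>4} um fingers -> {finger_count(total_in, finger_um)} folds")
```

Whatever finger width the layout actually used, the count comes out in the thousands for any width of this order, which is the point of the "folded quite a number of times" remark.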