Soon after the advent of ticking clocks, scientists observed that the time told by them (and now, much more accurate clocks), and the time told by the Earth’s position were rarely exactly the same. It turns out that being on a revolving imperfect sphere floating in space, being reshaped by earthquakes and volcanic eruptions, and being dragged around by gravitational forces makes your rotation somewhat irregular. Who knew?
That’s the official Google blog.
Now that’s all well and good but eventually we started building systems that need to all talk to each other and agree on the time. If our arbitrary system of time keeping doesn’t/can’t exactly match the (changing) benchmark of the Earth’s position in space, what are we to do? In other words:
Very large-scale distributed systems, like ours, demand that time be well-synchronized and expect that time always moves forwards. Computers traditionally accommodate leap seconds by setting their clock backwards by one second at the very end of the day. But this “repeated” second can be a problem. For example, what happens to write operations that happen during that second? Does email that comes in during that second get stored correctly?
Well, Google does something called a Time Smear:
The solution we came up with came to be known as the “leap smear.” We modified our internal NTP servers to gradually add a couple of milliseconds to every update, varying over a time window before the moment when the leap second actually happens. This meant that when it became time to add an extra second at midnight, our clocks had already taken this into account, by skewing the time over the course of the day. All of our servers were then able to continue as normal with the new year, blissfully unaware that a leap second had just occurred.
Cool! More discussion on the topic from HN here.