Last week, I briefly explained how the Meltdown and Spectre bugs are exploited. This week, I’ll talk about the patches and some of the impact the fixes have had on operating systems.
Patches have been rolled out for Meltdown, the easier of the two issues to exploit. The fix is a kernel upgrade that introduces Kernel Page Table Isolation (KPTI).1 KPTI keeps the kernel’s memory in a separate set of page tables from user programs, so that secure data cannot be speculatively pulled into the internal cache, where it can be reached. This means that every time an application asks the kernel to do something on its behalf, the processor has to switch between the two sets of page tables, which adds overhead. There were initial warnings that this could slow down older processors, but the rollout has been more impactful and disastrous than that.2
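If you want to see whether a Linux machine reports itself as patched, here is a minimal Python sketch. It assumes a newer kernel that exposes status files under `/sys/devices/system/cpu/vulnerabilities`; older kernels do not have that directory, in which case the script just reports that nothing was found. The helper names `classify` and `read_status` are my own, not part of any library.

```python
from pathlib import Path

# Assumption: newer Linux kernels report per-vulnerability status here.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def classify(status: str) -> str:
    """Map a kernel status line to a coarse label."""
    text = status.strip().lower()
    if text.startswith("not affected"):
        return "not affected"
    if text.startswith("mitigation"):
        return "mitigated"
    if text.startswith("vulnerable"):
        return "vulnerable"
    return "unknown"


def read_status(name: str, base: Path = VULN_DIR) -> str:
    """Return the raw status line for one vulnerability, or '' if unavailable."""
    try:
        return (base / name).read_text().strip()
    except OSError:
        return ""


if __name__ == "__main__":
    for name in ("meltdown", "spectre_v1", "spectre_v2"):
        raw = read_status(name)
        print(f"{name}: {classify(raw)} ({raw or 'no report from this kernel'})")
```

On a patched kernel the Meltdown entry typically reads something like `Mitigation: PTI`, which this sketch labels as mitigated.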
It appears to affect the processing power of EVERY computer, and not just the older ones.
“The #Meltdown patch (presumably) being applied to the underlying AWS EC2 hypervisor on some of our production Kafka brokers [d2.xlarge]. Ranges from 5-20% relative CPU increase. Ooof.” pic.twitter.com/fXM0OzfdKx (Ian Chan, @chanian, January 6, 2018)
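Note that the tweet says relative increase. A quick sketch of what that means in practice, using a made-up baseline of 40% CPU (the baseline is my assumption, not a figure from the tweet):

```python
def cpu_after_patch(baseline_pct: float, relative_increase: float) -> float:
    """Return CPU utilisation in percent after a relative increase such as 0.20."""
    return baseline_pct * (1.0 + relative_increase)


baseline = 40.0  # hypothetical pre-patch utilisation, in percent
low = cpu_after_patch(baseline, 0.05)
high = cpu_after_patch(baseline, 0.20)
print(f"{baseline:.0f}% -> {low:.0f}% to {high:.0f}%")  # 40% -> 42% to 48%
```

So a broker that sat at 40% would land somewhere between 42% and 48%, not at 60%. That is still capacity you paid for and no longer get.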
Additionally, some of the patches caused random reboot issues on computers. As fun as it is when your workplace forces a patch onto your computer and it reboots, losing all your work, you really don’t want that to happen in production.
The patches have had so many problems that Linus Torvalds, the creator of Linux and Git, has been quoted as saying, “[t]he patches are COMPLETE AND UTTER GARBAGE. … They do things that do not make sense.” (emphasis his) Windows machines have also experienced similar slowdowns from the patches.
It seems these slowdowns can be somewhat controlled by leveraging the PCID (process-context identifier) functionality on newer chips.3 Basically, PCID saves the processor from having to flush its entire translation lookaside buffer (TLB), the cache of recent address lookups, whenever it switches between contexts (or between secure and non-secure memory locations). It tags cached page lookups with an ID for the user mode or kernel mode context they belong to, and limits lookups to the current context. So instead of flushing the old context and loading the full new one every time it switches from user to kernel mode and vice versa, it can simply look at the entries for that one context.
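To make the cost of flushing concrete, here is a toy Python model of a lookup buffer that bounces between user and kernel mode. It is only a sketch of the idea, not how the hardware is implemented: the `TLB` class, the context names, and the four pages touched per mode are all assumptions for illustration.

```python
class TLB:
    """Toy translation cache; entries are keyed by (context_id, virtual_page)."""

    def __init__(self, use_pcid: bool):
        self.use_pcid = use_pcid
        self.entries = {}
        self.current = None
        self.misses = 0

    def switch_to(self, context_id: str) -> None:
        # Without PCID, every context switch discards all cached lookups.
        if not self.use_pcid and context_id != self.current:
            self.entries.clear()
        self.current = context_id

    def access(self, virtual_page: int) -> None:
        key = (self.current, virtual_page)
        if key not in self.entries:
            self.misses += 1          # a slow page-table walk would happen here
            self.entries[key] = True  # remember the lookup for next time


def count_misses(use_pcid: bool, round_trips: int = 100) -> int:
    """Count slow lookups over repeated user -> kernel -> user transitions."""
    tlb = TLB(use_pcid)
    for _ in range(round_trips):
        for context in ("user", "kernel"):
            tlb.switch_to(context)
            for page in range(4):
                tlb.access(page)
    return tlb.misses


print(count_misses(use_pcid=False), count_misses(use_pcid=True))  # 800 8
```

In this model the untagged buffer pays for every lookup again after each switch, while the tagged one only pays the first time each context touches a page.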
Hooray, there is some good news out of this! Wait, turns out maybe not. It looks like a lot of computers are not actually using PCID, because it wasn’t deemed necessary before Meltdown, so many guest OS versions didn’t enable it. So while the feature exists, it’s not really leveraged.
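On Linux you can check whether the (possibly virtual) CPU your OS sees advertises the feature by looking at the flags line in `/proc/cpuinfo`. A small sketch, assuming an x86 Linux system where the flags are named `pcid` and `invpcid`; the function names are mine:

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def pcid_support(cpuinfo_text: str) -> dict:
    """Report whether the pcid and invpcid flags are present."""
    flags = cpu_flags(cpuinfo_text)
    return {"pcid": "pcid" in flags, "invpcid": "invpcid" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print(pcid_support(handle.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

If you run this inside a guest VM and `pcid` comes back `False`, the flag is not being passed through to your guest, whatever the physical chip underneath supports.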
An additional problem for Meltdown and Spectre: more chips were impacted than originally stated. Other companies, like AMD, which initially said they were not impacted, turned out to be affected after all. The problem is ubiquitous, and reaches into pretty much every computer processor everywhere, whether it runs Android, iOS, Windows, or Linux.
Spectre is more difficult to mitigate. Most of the academic papers on the vulnerability do not even address a full mitigation approach, so plans are still evolving.4 The explanations are highly technical, so if you are interested in specifics, I refer you to the references at the bottom of the page. BIOS updates only partially protect against Spectre.
So those are all the technical problems and solutions currently available for “solving” Spectre and Meltdown. But there are other impacts as well.
Perhaps the most interesting article I’ve read about Meltdown and Spectre is one on the potential legal repercussions.7 I am not sure if the argument is a good one, but the author suggests that due to these patches, the processors are no longer performing as advertised. The processor no longer holds the same value, and thus may be open to a few different types of claims: tort, contract, trespass to chattel. Over the next few months, we could see some settlements as a result of this performance degradation, although I find it unlikely. Perhaps if just one set of chips were impacted, but the hit to performance has been across the board.
The problem is not that 3.4GHz processor chips are working like 2.4GHz, but that we had an imperfect understanding of what a SECURE 3.4GHz processor looked like. And so on down the line. The problem is too systemic.
So update your computer if you haven’t (provided you have a 64-bit chip. And if you don’t have a 64-bit chip, I’m surprised you read a tech blog). And keep your ear turned towards the news, because I’m sure there will be more kernel updates to address these problems in the upcoming days.
Resources