Their SLAP demo provides a great example of how defence-in-depth can make or break the viability of an exploit. That terrifying Safari demo is possible because Safari fails to isolate new windows into individual processes when calling `window.open` in JavaScript.
All the other side channel magic presented here doesn't matter if the data you want to read is in a separate process, with sufficient address-space separation from the "hostile" process.
That's not a failure of Safari, it's required by window.open API semantics, in particular by the default Cross-Origin-Opener-Policy of "unsafe-none" [1].
By setting a different policy, sites can protect themselves against this.
I guess technically browsers could open new windows in a new browsing context group regardless of this setting and relay the allowed types of messages via IPC (if any), but that would be a major performance hit, and I don't think any other browsers do it differently.
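For concreteness, here's a minimal sketch of what "setting a different policy" can look like on the server side, assuming a bare Node HTTP server (the header name and value are standard; the handler and port are just illustrative):

```typescript
// Minimal sketch (Node's built-in http module; handler contents are illustrative).
import { createServer } from "node:http";

createServer((req, res) => {
  // "same-origin" severs window.opener for cross-origin popups/openers, so the
  // browser is free to put them in a separate browsing context group (and process).
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Content-Type", "text/html; charset=utf-8");
  res.end("<!doctype html><p>served with COOP: same-origin</p>");
}).listen(8080);
```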
Can't edit my original post anymore: Firefox and Chrome do seem to isolate even same-browsing-context-group and bridge the required APIs via IPC, so hopefully Safari will catch up at some point.
Basically, there are three scenarios:
- Completely unrelated tabs (e.g. those you open manually, those opened via command-click, tabs opened via '<a target="_blank" ...>' or 'rel="noopener"' references etc.) – these are relatively easily isolated if the browser supports it at all. All major (desktop!) browsers now largely do this, including Safari.
- "Same browsing context group" (but different origin) sites. These can communicate via various APIs, and historically that was achieved by just letting them run in the same rendering process. But in the face of attacks such as this one, this can be insecure. Firefox and Chrome provide sandboxing via separate processes; Safari does not.
- Same origin sites (without any stricter policy). These can fully access each other's DOM (if they have an opener/opened relationship), so there's not really any point in having them live in different renderers except possibly for fault isolation (e.g. one of them crashing not taking the other down). As far as I know, all browsers render these in the same process.
Sites can opt out of the second and third category into the first via various HTTP headers and HTML link attributes. If we were to design the web from scratch, arguably the default for window.open should be the first behavior, with an opt in to the second, but that's backwards compatibility for you.
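For illustration, the opt-outs into the first category look roughly like this from the page's own side (browser TypeScript; the URL is just a placeholder):

```typescript
// "noopener" in the feature string forces the first category: the new window
// gets no reference back to us, window.open returns null, and the browser is
// free to host it in a completely separate process.
const popup = window.open("https://example.com", "_blank", "noopener");
console.log(popup); // null

// The markup equivalent for links:
const link = document.createElement("a");
link.href = "https://example.com";
link.target = "_blank";
link.rel = "noopener"; // "noreferrer" implies it as well
link.textContent = "open without an opener";
document.body.append(link);
```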
I worked on a browser team when Spectre/Meltdown came out, and I can tell you that a big reason why Firefox and Chrome do such severe process isolation is exactly because these speculative attacks are almost impossible to entirely prevent. There were a number of other mitigations including hardening code emitted from C++ compilers and JS JITs, as well as attempts to limit high precision timers, but the browser vendors largely agreed that the only strong defense was complete process isolation.
I'm not surprised to see this come back to bite them when, after something like 7 years, Apple still hasn't adopted the only strong defense.
To add to this and to quote a friend who has more NDAs in regards to microarchitecture than I can count and thus shall remain nameless: "You can have a fast CPU or a secure CPU: Pick one". Pretty much everything a modern CPU does for speed has side effects that a sufficiently motivated attacker can (most likely) find a way to use. While many are core specific (register rename, execution port usage for example), many are not (speculative execution, speculative loads). Side channels are a persnickety thing, and nearly impossible to fully account for.
Can you make a "secure" CPU? In theory yes, but it won't be as fast or as power efficient as it could in theory be, because the features that enable speed and efficiency are all possible side channels. This is why, in theory, the TPM in your machine exists for those sorts of tasks (allegedly; TPMs have their own side channels).
The harder question is "what is enough?", i.e. at what level does it not matter that much anymore? The answer, based on the post above, comes down to quite a lot of risk analysis and design considerations. Those design decisions were the best balance of security and speed given the information available at the time.
Sure, can you build that theoretically perfect secure CPU? Yes. But if it's so slow that you can't do anything that actually needs security on it, do you care?
Your friend is genuine in their interpretation, but there is definitely more to the discussion than the zero sum game they allude to. One can have both performance and security, but sometimes it boils down to clever and nuanced design, and careful analysis as you point out.
This is also a fundamental property - if you can save time in some code/execution paths, but not in others (which is a very desirable attribute in most algorithms!), and that algorithm is doing something where knowing if it was able to go faster or slower has security implications (most any crypto algorithm, unless very carefully designed), then this is just the way it is - and has to be.
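To make the timing point concrete, the textbook example is secret comparison: an early-exit loop leaks how long the matching prefix is, while a branch-free version's timing doesn't depend on the secret (a simplified sketch, not a substitute for a vetted crypto library):

```typescript
// Leaky: returns as soon as a character differs, so the runtime reveals how
// long the matching prefix is - a classic timing side channel.
function leakyEquals(secret: string, guess: string): boolean {
  if (secret.length !== guess.length) return false;
  for (let i = 0; i < secret.length; i++) {
    if (secret[i] !== guess[i]) return false; // early exit = timing signal
  }
  return true;
}

// Branch-free accumulation: always walks the whole string, so the time taken
// no longer depends on where (or whether) a mismatch occurs.
function constantTimeEquals(secret: string, guess: string): boolean {
  if (secret.length !== guess.length) return false;
  let diff = 0;
  for (let i = 0; i < secret.length; i++) {
    diff |= secret.charCodeAt(i) ^ guess.charCodeAt(i);
  }
  return diff === 0;
}
```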
The way this has been trending is that in modern systems, we try to move as much of the ‘critical’ security information processing to known-slower-but-secure processing units.
But, for servers, in virtualized environments, or when someone hasn’t done the work to make that doable - we have these attacks.
Security isn’t a one-bit thing where you’re either perfectly secure or not. If someone breaks into your house through a window and steals your stuff, that does not make it a lie to claim that locking your front door is more secure.
In any event, Apple’s claim isn’t entirely true. It’s also not entirely false.
Browsers absolutely require JIT to be remotely performant. Giving third parties JIT on iOS would decrease security. And also we know Apple’s fetish for tight platform control, so it’s not like they’re working hard to find a way to do secure JIT for 3P.
But a security flaw in Safari’s process isolation has exactly zero bearing on the claim that giving third party apps JIT has security implications. That’s a very strange claim to make.
Security doesn’t lend itself to these dramatic pronouncements. There’s always multiple “except if” layers.
> Giving third parties JIT on iOS would decrease security.
Well, at least in this case it would have greatly increased security (since it would have allowed the availability of actual, native Chrome and Firefox ports).
And otherwise: Does Apple really have zero trust in their OS in satisfying the basic functionality of isolating processes against each other? This has been a feature of OSes since the first moon landing.
If JIT is such a problem then Apple shouldn't use it themselves. Sure, they let you disable it but it's still enabled by default while everyone pushes the narrative that Apple is all about security.
The alternative browsers have the required site isolation but aren't allowed. There's no fix for Safari and you must use it. I think it's very clearly decreasing the users' security.
Alternative browsers would introduce other security concerns, including JIT. It’s debatable whether that would be a net security gain or loss, but it’s silly to just pretend it’s not a thing.
Security as the product of multiple risks.
Discovering a new risk does not mean all of the other ones evaporate and all decision making should be made solely with this one factor in mind.
I think a detached and distanced perspective must come to the conclusion that vendor lock-in isn't healthy. For security, performance or flexibility it tends to fall short sooner or later.
One could also talk about the relevance of a speculative attack that hasn't been abused in years. There can be multiple reasons for that, but we shouldn't just ignore Apple's main design motivation here. That would be frivolous and would preclude serious security discussion.
"Decreasing the security" is not binary thinking. It's just a fact today. Also, ability to run software doesn't make you less secure. I never saw any real proof of that. It's the opposite: Competition between different browsers forces them to increase the security, and it doesn't work for Safari on iOS.
Cross-Origin-Opener-Policy seems like a case of bad defaults where a less secure option has been selected so that we don't break some poorly maintained websites. Better to get the actual users of `window.open` to fix their code than to make every website insecure out of the box.
I can't imagine there are many sites passing significant amounts of data through this; the small number of site owners for whom IPC poses too high a penalty could opt their sites into the "same process" behavior if really needed.
Forcing every website to adapt to a browser update is completely infeasible.
> I can't imagine there are many sites passing significant amounts of data through this
This is actually a quite common mechanism for popup-based authentication (which is much more secure than iframe-based one, as users can verify where they're potentially entering their credentials).
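Roughly, that flow looks like this (a sketch with hypothetical origins; real implementations add state/nonce checks and more):

```typescript
// Sketch of the usual popup flow (origins here are hypothetical).
// The relying party opens the identity provider in a popup window:
const popup = window.open("https://idp.example/login", "login", "width=480,height=640");

// ...then listens for the result, checking both origin and source:
window.addEventListener("message", (event: MessageEvent) => {
  if (event.origin !== "https://idp.example") return; // ignore other senders
  if (event.source !== popup) return;
  console.log("auth result:", event.data);
});

// On the IdP side, after a successful login, the popup posts back to its opener:
//   window.opener?.postMessage({ token: "..." }, "https://app.example");
// That opener/opened channel is exactly what keeps the two pages in the same
// browsing context group unless one of them opts out via COOP or noopener.
```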
We had the tech in the 80's for the browser to facilitate popup authentication with process isolation. It's this niche and esoteric tech called IPC[1], so niche that one really can't blame Apple for not hearing about it.
It truly boggles the mind as to how all the other browsers pull it off.
To be fair, there wasn't that much sensitive web content around in the 80s to leak (primarily due to the web not yet existing, nor browsers), so it's only fair that browsers didn't consider using IPC for site isolation back then.
The point of my rather facetious comment is that IPC is a well-known thing (I struggle to even call it "tech") that has been around for 30-40 years. I don't understand why Apple needs people to make excuses for them, but this excuse would render Apple vastly more incompetent than neglecting to separate browser tabs in 2025.
Browsers are incredibly complex, and moving them to an IPC model is not easy. Essentially, you need to ensure "same-process-like", performant JavaScript interoperability in some cases, often (but not always) due to backwards compatibility.
Firefox has shared a lot about their efforts in moving there. If you're curious, there are a lot of blog posts and internal design docs under their project name "Project Fission".
But yeah, the fact that both Chrome and Firefox have managed to do so does leave Apple looking slightly bad here.
How often do tabs really need to communicate, and when they do, does it really need to be as fast as possible? I would say slower but secure would be a better design philosophy, especially as tab interaction is generally rare and low-bandwidth.
It's not used only for authentication, and figuring out what a website is trying to do heuristically doesn't sound easy either (although I believe Chrome on Android does just that, and enforces a site-locked process when they deem it important for security reasons).
But how much data are those popup based auth sending through? At the absolute most a few MB in a couple calls. Even if it's dramatically slower over IPC it's not going to cause issues.
Similar problem with third-party-cookies. They would make some auth cases easier and safer, but we shouldn't generally allow them because they are abused for tracking.
Individuals could choose a "secure" browser or browser mode that provides increased protection from such attacks or a "compatible" one that is less likely to break old websites.
> Individuals could choose a "secure" browser or browser mode that provides increased protection from such attacks or a "compatible" one that is less likely to break old websites.
And then we get thousands of posts whining about Safari being broken because it is "not like Chrome" and developers moaning that their unsafe pet API is not supported. Web developers are never going to play ball.
idunno, as a professional web dev since 1998, I don't understand why Google, Apple and Mozilla are trying so hard to make the web browser like a complete OS (I technically understand why, I just think it's ridiculous). The amount of obscure APIs being added just boosts the surface area for vulnerabilities and makes low-resource web browsing nearly impossible. You either get "a web browser that works" or "a web browser that can load almost nothing", and basically nothing in between. I had to stop using Firefox on my old ThinkPad because after opening a few windows, it churns the CPU so hard it's not usable for a solid minute+. Let that finish up, and I have to wait again if I dare to open another page. i5-3220m, 8gb of RAM (where the OS uses like 200mb of RAM)... there's no excuse for this to not be able to browse the web.
Some fun examples of "your browser is an OS on top of your OS":
The Window Management API allows you to get detailed information on the displays connected to your device and more easily place windows on specific screens, paving the way towards more effective multi-screen applications. https://developer.mozilla.org/en-US/docs/Web/API/Window_Mana...
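For a taste of what that looks like in practice (a sketch; Chromium-only at the time of writing, gated behind the "window-management" permission, and getScreenDetails() isn't in TypeScript's DOM typings yet, hence the cast):

```typescript
// Sketch: enumerate attached displays and open a window on a non-primary one.
async function openOnSecondaryScreen(url: string): Promise<void> {
  const details = await (window as any).getScreenDetails(); // prompts the user
  const target = details.screens.find((s: any) => !s.isPrimary) ?? details.screens[0];
  window.open(
    url,
    "_blank",
    `left=${target.availLeft},top=${target.availTop},width=800,height=600`
  );
}
```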
Every web developer is fine with 10% of the feature set, it’s just a different 10% for each dev. I am regularly annoyed by the inconsistent browser support for Web MIDI, something 99% of web devs probably don’t care about at all.
It would be easier to sandbox if there were fewer features of course, but in practice we rarely see exploits even in complicated low-level APIs like webgpu (which must be a nightmare to make secure given how many buggy drivers and different devices it has to support). So it seems like in practice we are able to sandbox these APIs securely, and having them provides an incredible benefit to users who are able to securely and easily run applications (how else do you recommend people do this on desktop platforms?).
I can't think of an analogy that doesn't come off crass.
I posit that the likelihood of the morass that is WebGPU not having exploitable bugs approaches 0 approximately 25 seconds after the first public release of the code, if not months prior.
It's only when one of two things occurs that publishing happens, basically: intelligent frustration, and "for the lols".
Someone hits a bug and gets pissed that the authors of the libraries blame everyone but themselves. While working around the bug, they discover it's not just a bug. They warn the devs. Sometimes responsible disclosure means a quiet fix and then disclosure, but usually it means "here's the exploit, they don't care I guess".
If there's not enough curious people poking things, exploitable stuff could remain hidden too long.
> idunno, as a professional web dev since 1998, I don't understand why Google, Apple and Mozilla are trying so hard to make the web browser like a complete OS (I technically understand why, I just think it's ridiculous)
I am not a web developer but I completely agree with you. To me, adding more complex points of failure to humongous piles of code that we absolutely need to run in modern life is not a great risk assessment. It’s like we never learnt from the security issues with the JVM.
Dunno, that API has only been available since Firefox 126 and I've been watching videos without having my screen go to sleep (or screensaver coming on) for like.. years and years (far before Firefox 126)
Indeed. I work on the Firefox media stack, and we have been grabbing wake locks when video playback is happening for a long time. Occasionally, e.g. on some Linux desktop variant, this has malfunctioned, and we're alerted in no time and fix it.
The Wake Lock API is for other use cases, such as recipe websites or other documents for which you don't want the screen to go away or dim, the kind where you need to look at the screen for long periods of time without touching it or interacting with the mouse and keyboard.
Prior to this API being introduced, websites used to play an inaudible/almost invisible looping media file to keep the screen awake. This has power usage implications: a small single-digit number of watts (1 to 3.5 depending on OS, hardware, mobile or not) is required to keep audio running (because of high-priority threads, frequent wakeups, and simply because the audio hardware itself needs power).
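For comparison, the API replaces all of that with a couple of calls (a sketch; the lock is also released automatically when the tab is hidden):

```typescript
// Keep the screen awake while the page needs to stay visible (e.g. a recipe),
// instead of looping an inaudible media file.
let wakeLock: WakeLockSentinel | null = null;

async function keepScreenAwake(): Promise<void> {
  try {
    wakeLock = await navigator.wakeLock.request("screen");
    wakeLock.addEventListener("release", () => console.log("wake lock released"));
  } catch (err) {
    console.log("wake lock denied or unavailable:", err);
  }
}

async function allowScreenSleep(): Promise<void> {
  await wakeLock?.release();
  wakeLock = null;
}
```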
Bless you. I was driven a little mad once trying to figure out why certain websites would steal audio focus away from music playing on my phone, it must have been some clumsy implementation of this.
> Now, video is done completely differently than it was.
Which is a thing that happened long before Firefox 126, too. (Browsers have simply requested the screen wakelock themselves when a video was playing. So this API is mainly for use cases that aren't playing a video.)
The first two, screen wake lock and web serial, have good use cases imo. I wouldn’t be surprised if some in-use assistive technology uses serial communication - think screen readers or custom input devices. Keeping the screen from locking is also useful from a pure accessibility standpoint for users who move slower or need more time to read things.
I fully agree. As someone also doing web development over a similar timeframe, for plenty of stuff we would be better off with native apps talking over Internet protocols; there's no need to transform a platform for interactive documents into an OS.
> The amount of obscure APIs being added just boosts the surface area for vulnerabilities
It’s often the ancient APIs from around 1995-2001 that are the most vulnerable ones, with information leaking across origins (like today's) needing hacky fixes to stay secure and compatible.
window.open(), target=_blank, cross site request forgery, etc.
IE6 from 2001 had a ton of these modern security issues, and Netscape before it probably had them too.
At that time there were tons of buffer overflow security holes, so no one cared about side-channel attacks.
Well, for better or worse, the web is an application platform these days.
I consider it pretty great, since the alternative is installing native apps for things I'm using exactly once or very rarely.
There's a case to be made though that maybe these things should only be available to PWAs, which is what Apple is already doing for some functionality on iOS, including push notifications.
What? I thought Apple was trying to quietly kill pwas, probably bc they don't go thru their app store. And also bc if you make a good enough sandbox then you don't need to pay for "all the wooork we put in"
They certainly don’t have a lot of love for them as a first-class app development environment, but they are also their fig leaf of “open access” to the platform.
Regardless of that, I do like the idea of PWAs getting access to a higher tier of web APIs, especially those increasing the attack surface, as “installing an app” makes even non-sophisticated users do more of a double take than “visiting a website” in terms of access to their system.
That’s not a real choice though. All it takes is one website that is essential to me not supporting the secure mode and I’m forced to opt-out. The upstream website is making the choice for me.
I think they've gotten away with it because it's a pretty obscure setting and they say a bunch of things like "most users should not enable this, and if you do enable it you should expect things to break".
I mean, if all major browsers do it roughly once then users will complain to the few broken websites. They won’t even think to blame the browser if every other site works fine and the broken site is broken on all browsers.
Good luck trying to get Google or Microsoft to throw their paying enterprise users under the bus in the interest of slightly safer sandboxing defaults.
After building enterprise APIs for a few years, you’d be amazed at how hard it is to get companies to make even minor changes; backwards compatibility is key. Often it’s because they _can’t_ make the change themselves since they outsourced the code to a consulting agency. So they’d have to sign a new contract and get an agency to make the change.
They just won’t, and you’ll have a browser that people stop using.
It is probably outside the scope of what one company can do (although Apple is quite large…). But we need to fix our understanding of backwards compatibility. If a computer system provides the ability to keep doing something, but the way it provides that capability requires it to be insecure, then the system should not really be thought of as “backward compatible.” Because reasonably prudent people can’t actually keep doing the thing they were doing before.
Of course, modern computers on the modern web don't really provide the ability to do much at all in a reasonably prudent fashion, so it is all a bit moot I guess.
Announce what to whom? To the hundreds of millions of users out there that don't even know what a browser is, let alone why it's now talking to them about something called a "site isolation framework"?
I would guess you would use a deprecation message in the console? Like they have done over cookie changes, etc. A normal user would obviously not check the console, but the devs or admins of the site sure might.
That's assuming there's still a dev around that has knowledge of, or even access to, the source code of a given webapp depending on the legacy functionality.
Sure. I just got a vibe from this thread that breaking security changes in the browser is a totally unknown phenomenon, but we had changes to behavior from other origin headers, demanding ssl and changes to cookies. Somehow we survived. ;)
A lot of people did complain very loudly about enforcing SSL, and it took decades to get here. Same for cookies.
So yes, breaking changes for privacy/security reasons do happen, but they're very painful, and if there's a more secure alternative (in this case, still isolating communicating processes and providing communication via IPC, and providing an opt-out way of the legacy behavior), that's often the easier path.
Safari definitely does use site isolation (if you check "Activity Monitor", you'll find Safari processes named after the sites they're displaying) in almost all cases.
window.open, in some configurations, is an exception, because the opening and opened sites have an explicit communication facility available to them, unless at least one of the two explicitly opts out of it. As far as I'm aware, Safari also correctly implements that opt-out.
The only thing that Chrome and Firefox seem to be doing on top of that, as far as I understand, is to actually enforce process-per-site even for sites from the same "browsing context group" (i.e. all that can hold programmatic references to each other, e.g. via window.opener), which requires using IPC.
> Safari definitely does use site isolation (if you check "Activity Monitor", you'll find Safari processes named after the sites they're displaying) in almost all cases.
From the FAQ:
"For leaking secrets, both SLAP and FLOP are confined to the address space they are trained in. As pointed out by iLeakage, Safari lacks Site Isolation, a measure used to enforce that two different webpages not from the same domain can never be handled by the same process. Thus, in Safari it is possible for an adversary's webpage to be handled by the same process (and thus address space) with an arbitrary webpage, increasing the attack surface including LAP- and LVP-based exploits.
On the other hand, although Chrome is equipped with Site Isolation, we demonstrate that it is not a perfect mitigation. We show the real-world existence of corner cases, where two subdomains of the same site can be merged into one process, again leading to LAP- and LVP-based attacks."
Yes, in some special cases, which both embedded/opened and embedding/opening websites can avoid by setting the appropriate HTTP headers/HTML attributes.
Of course it would be better if Safari would do the same thing as Chrome and Firefox and just provide a separate process for all contexts, including those that can communicate per specifications. But there's something sites can do today to avoid this potential information leak.
> websites can avoid by setting the appropriate HTTP headers/HTML attributes.
Individual sites plugging browser + CPU security holes seems like a violation of separation of concerns. Yes, I hope every bank out there puts this workaround into their site ASAP, but that's hardly a solution for the flaw itself.
The permanent solution to the flaw is either a hardware/OS-side fix (i.e. disabling this particular kind of speculation via a chicken bit, if there is one), or Safari implementing site isolation in the same way Chrome and Firefox are already doing.
But as the former might well be impossible (at least without ruining performance or requiring a hardware swap), and the latter might take a while, websites should still take the precautions they can. It's a good idea for other reasons anyway: Why keep around an inter-context messaging mechanism you possibly don't even need?
No. For fanboys, everything Apple does is the best thing anyone could ever do. So if Apple needs more time, then it is impossible to be faster. And the last line of defense is whataboutism. I wish people would not be so predictable and boring.
Process-per-site isolation doesn't necessarily have to use (much) more memory.
If you pre-initialize the renderer and JavaScript engine and then fork that pre-warmed instance for each site, every page of memory not written to remains shared in physical memory.
Properly accounting for that in task managers is hard, though; on many OSes, Chrome's memory usage looks much scarier than it is in reality.
Sorry, I was imprecise in my original post: It's definitely possible to isolate even sites in the same browsing context group, but it requires more work that Safari apparently just hasn't got around to yet.
Would that performance hit be really that significant? I can't imagine there are more than a couple of calls total, and that's all dwarfed by any web access. Or do I misunderstand what's required?
So should Protonmail (and any other site with similarly sensitive data) be setting that header, then? It’s probably hard to change the default. I bet some use cases (SSO popups?) depend on it.
It's not unreasonable to set a different header value for the login page only, where it should be safe because no external user data is being rendered.
That is a little different, though: those attributes are for if you're example.com linking to protonmail, the header is for if you're protonmail deciding on security policies for interactions with example.com.
> Considerations for Safari. We emphasize the importance of site isolation [55], a mechanism preventing webpages of different domains from sharing rendering processes. Site isolation is already present in Chrome and Firefox [42, 55], preventing sensitive information from other webpages from being allocated in the attacker’s address space. While its implementation is an ongoing effort by Apple [3, 4], site isolation is not currently on production releases of Safari. On the contrary, we also reflect on [Webkit's memory allocator] libpas’s heap layout from Section 6.3, allowing sites to not only share processes, but also heaps. Partitioning JavaScript heaps by at least several memory pages per-webpage would prevent JavaScript strings from the target webpage from being allocated within the 255-byte reach of the LAP.
No, only some of the side channel magic doesn't matter if you live in a different virtual memory space. Other past attacks didn't use virtual memory pointers and used physical memory pointers or didn't use any pointers at all - one could read data from another process, the kernel, another VM, the SGX enclaves or even proprietary CPU manufacturer code that runs on the CPU, like the CPU signing keys used for remote attestation.
The writing was on the wall for in-process sandboxing with Spectre, but that seems to have faded a bit. This just reinforces it. Things like "safe in-process sandboxing with WASM" are just a fantasy; it can't be implemented.
"trivial" how do you figure? Remember these exploits bypass your own code's conditionals over a shockingly far duration. Unless you just mean for incredibly restrictive usages such as eBPF?
possible absent any performance concerns at all, yeah sure
> Unless you just mean for incredibly restrictive usages such as eBPF?
I was actually thinking something more like a bytecode interpreter that runs one operation and then sleeps until the next full wall clock second, but yes, that's my point: If you don't care about performance, you can make process isolation safe very easily.
I think at the point where you're suggesting 1hz bytecode interpreters the onus is kind of on you to be clear you're not talking about plausible points in the design space.
1 Hz is probably a bit too slow for practical applications, but my point is that somewhere between that, and simulating a parallel universe at each data-dependent branch, is probably a reasonably-safe spot, or more likely a spectrum that application developers get to pick their tradeoffs from.
What I know as a developer is web security is really hard. Last week there was a DOM clobbering gadget deep in my TypeScript world and I really didn't have the energy to understand who wants to clobber my DOM and why they need a gadget. I want to build stuff and what worries me is this stuff is just simply not foreseeable.
DOM security is a completely different beast from process isolation with WASM (in a web context or otherwise). The attack surface is vastly greater, due to the much larger API complexity.
To be fair, this is (relatively, compared to the age of the web) new behavior though.
Even Chrome, which pioneered the "process per tab" model, didn't isolate same-browsing-context-group sites for a long time, and still shares processes on Android as a resource optimization: https://chromium.googlesource.com/chromium/src/+/main/docs/p...
Firefox only released their "Project Fission" in 2021, i.e. three years after Spectre.
This is a flawed comparison in many ways. As you might not understand, IE was problematic because of its massive install base and everyone only, and only, writing their websites for Chrome... oh wait, typo'd there, meant IE.
Cool detail, in the section where they reverse-engineer the presence of an LVP on the M3:
Remarkably, while all other load widths activate the LVP on any constant value fitting that width, we observe that activation on 8-byte wide loads occurs only when the load value is zero. We conjecture that this may be a countermeasure for memory safety such that the LVP will not learn values of pointers. That is, with the M3 being a 64-bit CPU, pointers are 8 bytes wide. Furthermore, on 64-bit macOS executables, any virtual address below 0x100,000,000 is invalid.
3 hex digits = 12 bits = 4096 entries, the size of each address translation table on ARM. So it does make some (twisted) sort of sense. Assuming you're using 4k page size
> In order to make cache hits distinguishable from misses in Safari, we reference the NOT gate-based cache amplification primitive from [29, Section 5], adjusting the speculation parameters for the M2 CPU. We run the amplifier 500 times when the target address is cached and 500 more times when it is evicted, in native and WebAssembly implementations. Table 3 summarizes the timing distributions, with units in ms. We observe that they are clearly separable even in a web environment, allowing us to distinguish cache hits from misses with WebKit’s 1 ms timer.
So I guess all the hubbub around disabling fine-resolution timers and SharedArrayBuffer was for naught.
It doesn't hurt that setbacks for web app development coincidentally send developers into the open arms of Google and Apple's stores that collect a 30% cut of all revenue, so there was a good incentive to do it even if it didn't protect anyone.
> It doesn't hurt that setbacks for web app development coincidentally send developers into the open arms of Google and Apple's stores that collect a 30% cut of all revenue, so there was a good incentive to do it even if it didn't protect anyone.
That seems like a bit of a reach. It's an obscure feature that is rarely useful, and when it is, all you have to do is send the right HTTP header (if using Chrome) and you get it back.
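For reference, "the right HTTP header" is actually a pair, COOP plus COEP, and pages can feature-detect the result before relying on it (a sketch):

```typescript
// SharedArrayBuffer is only exposed once the page is cross-origin isolated,
// which requires the server to send:
//   Cross-Origin-Opener-Policy: same-origin
//   Cross-Origin-Embedder-Policy: require-corp
// Feature-detect before relying on it:
if (crossOriginIsolated && typeof SharedArrayBuffer !== "undefined") {
  const shared = new SharedArrayBuffer(1024);
  const view = new Int32Array(shared);
  Atomics.store(view, 0, 42);         // memory that can be shared with Workers
  console.log(Atomics.load(view, 0)); // 42
} else {
  console.log("not cross-origin isolated; fall back to postMessage copies");
}
```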
Multithreading may be an obscure feature to you but runtime developers get requests for it all the time. SAB becoming widely available was definitely delayed.
I would still maintain that needing multithreading on a website is relatively rare, and specifically needing SharedArrayBuffer instead of just multiple processes (e.g. web workers) is even more rare.
Did use cases exist? Sure. But not sufficiently to move the needle on app store usage.
One of the most significant use cases that gets impacted is game dev.
Web games benefit significantly from SAB (see the Godot engine for example) and mobile games make up a pretty sizable chunk of app store usage, particularly in app purchases.
> Did use cases exist? Sure. But not sufficiently to move the needle on app store usage.
We can't say that for sure. There is no shortage of examples where Apple neglects a feature that might provide parity with their own services to avoid competition. Safari, being a mandatory feature of iOS, is reasonably implicated as part of the conspiracy to prevent users from buying and selling software without the assent of a corporate benefactor.
> On the other hand, although Chrome is equipped with Site Isolation, we demonstrate that it is not a perfect mitigation. We show the real-world existence of corner cases, where two subdomains of the same site can be merged into one process, again leading to LAP- and LVP-based attacks.
Did anyone spot where this is mentioned?
Edit: it doesn’t seem like they have a general attack. Rather, it’s that some sites are not in the public suffix list.
Edit 2: It’s also interesting that they found that iPhone 13 and iPhone 13 mini (which have the same processor and came out at the same time) differ in LAP in that they observed only the latter as having it. Very curious…
Funny that I am seeing this now, because last Fall I had Daniel Genkin as my Intro to Cyber Security Professor (co-author of this result). Interesting class, but I remember him mentioning that they were working on a speculative attack for Apple CPUs after seeing the results of spectre and meltdown on Intel CPUs. I remember how he seemed almost paranoid about security, and I suppose I see why now (security is almost never guaranteed).
Apple released minor-version updates to both macOS and iOS the past few days, both containing several security fixes. Has anyone been able to confirm if they address these exploits?
Apple acknowledged the shared proof-of-concept and stated it plans to address the issues. However, at the time of writing, the flaws remain unmitigated.
"We want to thank the researchers for their collaboration as this proof of concept advances our understanding of these types of threats," Apple told BleepingComputer.
"Based on our analysis, we do not believe this issue poses an immediate risk to our users."
It's crazy that they were informed about this months ago and still have not fixed it yet. They're going to have to now that it's public but why would that pressure even be needed. I naively assumed if Apple still gets one thing right it's security updates. This is disappointing and concerning.
Aside: I feel like RUB has become kind of a global center for this kind of high-end offensive security work. Was I just not paying enough attention 10 years ago or is this a new-ish thing?
It depends on your threat model. If you don't run any untrusted code on your hardware (including Javascript), you can safely disable the mitigations. If you do run untrusted code, keep them enabled.
In the context of a regular end-user desktop machine, this seems overly paranoid to me. The odds of encountering a real, JS-based spectre attack in the wild are basically zero (has anyone ever seen a browser-based Spectre attack outside of a research context? even once?), and the odds of it then being able to retrieve actual sensitive data are also basically zero. That's two astonishingly tiny numbers multiplied together. The threat just isn't there.
For regular end-user desktop machines, the mitigations only decrease performance for no real benefit. Spectre is a highly targeted attack, it's not something you can just point at any random machine to retrieve all their bank passwords or whatever.
What is the threat model if I run lots of untrusted JavaScript, but I only have a small amount of memory in other processes worth reading and I would notice sustained high CPU usage?
Is there an example in the wild of a spectre exploit stealing my gmail cookie and doing something with it? (Would be difficult since it's tied to other fingerprints like my IP)
Or stealing credit card numbers when they're in memory after I place an online order?
You're not getting a boost, you're avoiding a penalty. In some (but not all) cases you can avoid the penalty and the exploits by disabling SMT. Remember, SMT isn't twice as many cores, just twice as many half-cores. You'll be fine.
Disabling SMT alone isn’t enough to mitigate CPU vulnerabilities. For full protection against issues like L1TF or MDS, you must both enable the relevant mitigations and disable SMT. Mitigations defend against attacks where an attacker executes on the same core after the victim, while disabling SMT protects against scenarios where the attacker runs concurrently with the victim.
It's a common misunderstanding that the CPU suddenly has twice as large performance envelope when SMT is enabled. Only specialized software/scenarios will tangibly benefit from the parasitic gains of SMT-induced extra parallelization, e.g. video encoders like x264 or CPU-bound raytracers to name a few examples. These gains typically amount to about 15-20% at the very extreme end. In some cases you'll see a performance drop due to the inherent contention of two "cores" sharing one actual core. If you're stuck with a dual-core CPU for your desktop setup you should absolutely enable SMT to make your general experience feel a bit more responsive.
> It's a common misunderstanding that the CPU suddenly has twice as large performance envelope when SMT is enabled.
Perhaps, but I am not under this misunderstanding and never expressed it.
> Only specialized software/scenarios will tangibly benefit from the parasitic gains of SMT-induced extra parallelization
In my experience it also speeds up C++/Rust compilation, which is the main thing I care about. I can't find any benchmarks now but I have definitely seen a benefit in the past.
> video encoders like x264 or CPU-bound raytracers to name a few examples. These gains typically amount to about 15-20% at the very extreme end.
Normally those types of compute-heavy, data-streamlined processes don’t see much benefit from SMT. After all, SMT only provides a performance benefit by allowing the CPU to pull from two distinct chains of instructions, and fill the pipeline gaps from one thread with instructions from the other thread. It’s effectively instruction-by-instruction scheduling of two different threads.
But if you’re running an optimised and efficient process that doesn’t have significant unpredictable branching, or significant unpredictable memory operations. Then SMT offers you very little because the instruction pipeline for each thread is almost fully packed, offering few opportunities to schedule instructions from a different thread.
I agree that compression is all about increasing entropy per bit, which makes the output of a good compressor highly unpredictable.
But that doesn’t mean the process of compression involves significant amounts of unpredictable branching operations. If for no other reason than it would be extremely slow and inefficient, because many branching operations means you’re either processing input pixel-by-pixel, or your SIMD pipeline is full of dead zones that you can’t actually re-schedule, because it would desync your processing waves.
Video compression is mostly very clever signal processing built on top of primitives like convolutions. You’re taking large blocks of data, and performing uniform mathematical operations over all the data to perform what is effectively statistical analysis of that data. That analysis can then be used to drive a predictor, then you “just” need to XOR the predictor output with the actual data, and record the result (using some kind of variable length encoding scheme that lets you remove most of the unneeded bytes).
But just like computing the median of a large dataset can be done with no branches, regardless of how random or the large the input is. Video compression can also largely be done the same way, and indeed has to be done that way to be performant. There’s no other way to cram up to 4k * 3bytes per frame (~11MB) through a commercial CPU to perform compression at a reasonable speed. You must build your compressor on top of SIMD primitives, which inherently makes branching extremely expensive (many orders of magnitude more expensive than branching SISD operations).
> You’re taking large blocks of data, and performing uniform mathematical operations over all the data to perform what is effectively statistical analysis of that data.
It doesn't behave this way. If you're thinking of the DCT it uses, that's mostly 4x4, which is not very large. As for motion analysis, there are so many possible candidates (since it's on quarter-pixels) that it can't try all of them and very quickly starts trying to filter them out.
> it uses that's mostly 4x4 which is not very large
That's 16x32 which is AVX512. What other size would you suggest using and (more importantly) what commercially available CPU architecture are you running it on?
It usually speeds up basically everything parallelizable that looks kind of like a parser, lexer, tokenizer, .... Unless somebody goes out of their way to design a format with fewer data dependencies, those workloads are crippled on modern CPUs. That includes (de)compression routines, compilers, protobuf parsing, ....
The only real constraint is that you can actually leverage multiple threads. For protos as an example, that requires a modified version of the format with checkpoints or similar (which nobody does) or having many to work on concurrently (very common in webservers or whatever).
In practice probably not, as long as general population keeps it enabled. I mean, looking at effort required, it's not worth spending time exploiting spectre these days, because virtually everyone is protected. If you're not likely to be directly targeted, "herd immunity" will work.
If just visiting a webpage with some JS will let them do ACE on even 0.1% of visitors, hackers are probably still motivated enough to try it. But I vaguely remember these kinds of vulns can be patched in-browser for a perf hit instead of taking the hit system-wide, which sounds like an ok compromise.
>hackers are probably still motivated enough to try it.
The amount of exploit crafting needed to actually do something meaningful with a hack is pretty much not worth it for any financial reason. The only time this happens now is when state-funded actors or prominent groups with lots of manpower really want to take down an individual person.
Depends how automated it can be. I know some non-spectre 0-days were used broadly, either via viruses or port-scanning. Is it possible to craft some JS that'll use a spectre-like vuln to reliably grab something important like Chrome passwords or credit cards? Idk, it's hard to prove otherwise, and hackers have more time to think about this than I do.
> Is it possible to craft some JS that'll use a spectre-like vuln to reliably grab something important like Chrome passwords or credit cards?
Probably, but there’s a huge luck element involved, at least with spectre. It’s difficult to guide the speculative execution to read exactly what you want it to read, assuming you even know where it is. As a result you need to spend quite a bit of time on a single target before you’re likely to actually get the data you want. Even then, there’s likely a significant element of human analysis to assemble something useful from all the noise.
So yes, it’s almost certainly possible. But exploits don’t exist in a vacuum. If you’re expending that much effort to get credit card numbers, then quite frankly you’re a fool, because good old phishing attacks and other social engineering attacks are easier, more reliable, and above all, cheaper.
At the end of the day, crime is a business like any other, profitability and margins are king. You don’t waste time perfecting attacks that have significantly smaller margins than your existing attacks. The only exception to that is nation states, because they aren’t motivated by directly extracting cash from victims, and ultimately that’s what makes nation state actors so dangerous.
Linux on its own isn't even 0.1% visitors normally. We're talking multiple orders of magnitude less for disabled mitigations. And on top of all that, it's possible that exploiting on that machine is going to be harder due to custom software with uncommon memory layout - i.e. it's probably not a stock Ubuntu. And finally, for accessing data outside of the page, you really want to have some specific target, so they'd have to guess that too.
If only Linux is affected then sure. Was talking about spectre in general. Maybe only Linux users are turning the spectre mitigation flags off, but there are plenty of outdated Windows systems too.
You shouldn't disable Spectre mitigations, but Retbleed and Downfall (Intel) are much more of a "lab" exploit, and the fall-out for Retbleed is much more severe on cloud boxes than your personal PC. Easy 20-40% performance uplift on AMD Zen1-Zen2 and Intel 6th-11th gen.
It's a tragedy that so many websites insist on having the ability to run random downloaded code on our systems to do basic things like displaying simple text and images. Things browsers are capable of with nothing but HTML. Google refuses to even show search results, a bunch of literal hyperlinks, without javascript being enabled.
The real tragedy is that our processors try to win a bit more speed by sacrificing simplicity and therefore increasing the chances of such exploits. The other tragedy is that our operating systems are obsolete and have worthless security. Back in the day when UNIX was relevant, a hundred people could use it at the same time on a mainframe with no fear of it breaking; now one person cannot safely use a single computer.
> The real tragedy is that our processors try to win a bit more speed by sacrificing simplicity and therefore increasing the chances of such exploits.
This gets repeated on every thread about speculative execution exploits, but I think people who say this are underestimating how huge the difference would be. I suspect processors without speculative execution would be many times slower, not "a bit".
Unix systems protected against a physical person trying to read or modify another persons file. They did not stop programs run by the user reading the users own data unexpectedly which is now considered unacceptable but was previously the norm.
my specific use case where I see significant performance improvement is image segmentation pipelines (which involve opencv-style image processing and AI inference). YMMV depending on your CPU I suppose.
It depends on the CPU, I think. The most dramatic improvement I've seen is 20-30%+ improvements in Python run times for numpy/pytorch-heavy workloads on c2-standard-16 VMs in GCP with Spectre mitigations disabled.
Keep in mind that getting Meltdown to work might be very difficult depending on your setup. I wouldn't have been able to, at least when starting out, since my teacher didn't provide us with targetable hardware.
A Spectre variant (particularly the RSB-based ones) is nice to start out with imo.
Yea fair, this is obviously a high-level overview. I think I found with Meltdown that I needed the assembly code. I was also able to reproduce it with actual C code, if I recall correctly, but that was way more finicky.
Hm... as I read it this is much worse. Spectre/Meltdown were data isolation vulnerabilities. You could exploit side channel (mostly timing) information to intuit state about memory across a protection boundary. Basically you can prime the CPU state to allow you to tell which of multiple code paths the kernel/hypervisor/whatever took, and then go from there to reading arbitrary data. Which is bad, obviously.
Here, they claim to have a remote exploit vulnerability. It's not that Apple is leaking data, it's that the CPUs have a bug where they appear to be actually executing code based on incorrectly-loaded ("predicted") data.
Though the details are, as is usually the case, thin. I await further analysis.
A browser-based attack, in theory, could have happened with Spectre/Meltdown as well. I seem to recall a PoC for Spectre in the browser, actually. I believe it's also a reason that microsecond precision in the browser was made a bit more opaque since that era.
It's not remote code execution, it's the same flavor of "out of bounds read through speculation" as previous vulnerabilities. It's terrifying because they have a working proof of concept from untrusted JS in Safari, but there have been speculative execution attacks on browser JS engines before now also.
The language seems to argue otherwise: SLAP "allows the adversary to jump the LAP to the target webpage's string and trick the CPU into operating on it" and FLOP "allows us to run a function with the wrong arguments". That's absolutely not mere data exfiltration.
Now, maybe this is because of a trampoline based on pre-existing Safari bugs and not the CPU misfeature itself. Again, the details are slim.
But "the same flavor of vulerability" seems to be a mischaracterization.
My read: the attack gets 600 cycles of CPU time to execute its code (JITted Javascript, in web context) on the speculated data, and to use some side channel to communicate results back out of the speculated parallel-world.
Some of the earlier speculation attacks didn't get to do arbitrary compute on the speculated data, they could only for example influence whether something was loaded into cache or not.
In many CPU ISAs, load value predictors are unlikely to be useful, because they cannot guess the value that will be loaded with an acceptable probability.
The ARM ISA and also other ISAs with fixed-length instruction encoding are an exception. Because they have a fixed instruction length, typically of 32 bits, most constants cannot be embedded in the instruction encoding.
As a workaround, when programming for such ISAs, the constants are stored in constant pools that are close to the code for the function that will use them, and the load instructions load the constants using program-counter-relative addressing.
Frequently such constants must be reloaded from the constant pool, which allows the load value predictor to predict the value based on previous loads from the same relative address.
In contrast with the Apple ARM CPUs, for x86-64 CPUs it is very unlikely that a load value predictor can be worthwhile, because the constants are immediate values that are loaded directly into registers or are directly used as operands. There is no need for constants stored outside the function code, which may be reloaded multiple times, enabling prediction.
All fast CPUs can forward the stored data from the store buffer to subsequent loads from the same address, instead of waiting for the store to be completed in the external memory. This is not load value prediction.
> for x86-64 CPUs it is very unlikely that a load value predictor can be worthwhile
I think you're making a good point about immediate encodings probably making ARM code more amenable to LVP, but I'm not sure I totally buy this statement.
If you take some random x86 program, chances are there are still many loads that are very very predictable. There's a very recent ISCA'24 paper[^1] about this (which also happens to be half-attributed to authors from Intel PARL!):
> [...] we first study the static load instructions that repeatedly fetch the same value from the same load address across the entire workload trace. We call such a load global-stable.
> [..] We make two key observations. First, 34.2% of all dynamic loads are global-stable. Second, the fraction of global-stable loads are much higher in Client, Enterprise, and Server workloads as compared to SPEC CPU 2017 workloads.
Unfortunately what you say is true for many legacy programs, but it is a consequence of the programs not being well structured by the programmer, or not being well optimized by the compiler, or due to a defect of the ISA, other than the lack of big immediate constants.
Some of the global-stable values are reloaded because the ISA does not provide enough explicitly-addressable registers, despite the fact that a modern CPU core may have 10 times to 20 times more available registers, which could be used to store the global-stable values.
This is one of the reasons why Intel wants to double the number of general-purpose directly addressable registers from 16 to 32 in the future Diamond Rapids CPU (the APX ISA extension).
In other cases the code is not well structured and it tests repeatedly some configuration options, which could be avoided by a proper partitioning of the code paths, where slow tests would be avoided and the execution time would be reduced, even at the price of a slight code size expansion (similarly to the effect of function inlining or loop unrolling).
Sometimes the use of such global-stable values could have been avoided even by moving at compile time the evaluation of some expressions, possibly combined with dynamic loading of some executable objects that had been compiled for different configurations.
So I have seen many cases of such global-stable values being used, even for CPU ISAs that do not force their use, but almost none of them were justified. Improving such programs at programming time or at compile time would have resulted in greater performance improvements, which would have been obtained with less energy consumption, than implementing a load-value predictor in the CPU.
I think you're under-estimating the amount of pointer chasing that lots of types of code has to do. B-Tree traversal for filesystems, mark loops for garbage collection, and sparse graph traversal are all places where you're doing a lot of pointer chasing.
I do wonder if there are other common code patterns that a practical LVP could exploit. One that comes to mind immediately are effectively constants at one remove: Think processing a large array of structs with long runs of identical values for some little-used parameter field. Or large bitmasks that are nearly all 0xFF or 0x00.
Probably not, but I don't think anyone has talked about it explicitly.
Otherwise, there are known examples of related-but-less-aggressive optimizations for resolving loads early. I'm pretty sure both AMD[^1] and Intel[^2] have had predictive store-to-load forwarding.
edit: Just noticed the FLOP paper also has a nice footnote about distinguishing LVP from forwarding during testing (ie. you want to drain your store queue)!
From doing some work on GC a couple of years ago: at that time Apple was the only one with it. The performance is awesome; it makes graph traversal ~2x faster.
Seems like speculative execution is just fundamentally insecure. With SPECTRE/MELTDOWN mitigations, doesn't CPU performance drop below the same CPU performance with no branch prediction at all? Should we move back to CISC? Or maybe VLIW?
I don't think so; speculative execution is the cornerstone of modern CPU performance. Even 15-year-old 32-bit ARM CPUs do it. The only phone/PC-grade processors without it are the first generation of Intel Atom, and I recall that early Atom processors sacrificed a ton of performance to keep power consumption low. I doubt this will change since mitigations are "good enough" to patch over major issues.
Maybe the boomers were right and we made computers way too complex? This might be a bit of hyperbole, but it seems like there will always be a security hole (even if mostly hard to exploit). But I also guess we can't get much faster without it either. So maybe we should reduce complexity, at least for safety-critical systems.
Yes and it's very slow as a result. In-order cores without speculative execution can't be fast. Not unless you have no memory and only operate out of something equivalent to L1 cache.
Memory is slow. Insanely slow (compared to the CPU). You can process stupid fast if your entire working set can fit in a 2KB L1 cache, but the second you touch memory you're hosed. You can't hide memory latency without out-of-order execution and/or SMT. You fundamentally need to be parallel to hide latency. CPUs do it with out-of-order and speculative execution. GPUs do it by being stupidly parallel and running something like 32-64 way SMT (huge simplification). Many high-performance CPUs do all of these things.
Instruction level parallelism is simply not optional with the DRAM latency we have.
Cortex-A53 may be slow, but it's fast enough for very many tasks. Once you design your data structures to fit the L1/L2 caches, it actually is pretty damn fast. The best part of cache-aware data structure design is that it also makes code run faster on out-of-order CPUs. The A53 is of course slow if you use modern layer-upon-layer-ware as your architecture.
But I was really just trying to point out that in-order CPUs are still around; they did not disappear with the in-order Atom.
>speculative execution is just fundamentally insecure
I don't think it's inevitable; it might be caused by greed. You could have small separate sections of cache (or an additional tag) dedicated to per-thread speculation, but no designer is willing to sacrifice real estate on something that will be thrown away instantly.
I'd really be curious to recreate this or to know if any of their results included using private browsing mode?
I only bring it up because one of the reasons I use Safari with private browsing as a default is because, if I were to log in to a site like Facebook in one tab, open a new private tab in the same window and try going to Facebook, it would not recognize that I had already logged in from the other tab. Neither Chrome nor Firefox does that.
CPU vendors always say this when an exploit is published before they mitigate.
Sometimes they mean "no we don't think it's exploitable", sometimes the charitable reading is "we don't think anyone is exploiting this and we think developing an exploit will take quite some time".
Unfortunately they never reveal exactly what they mean. This is very annoying, because when it's the former case, they're often right! Security researchers publish bullshit sometimes. But the vendors basically leave you to figure it out for yourself.
And from the paper, it seems they played it interestingly in the researchers' direction as well:
"1.2. Responsible Disclosure
We disclosed our results to Apple on May 24, 2024.
Apple’s Product Security Team have acknowledged our
report and proof-of-concept code, requesting an extended
embargo beyond the 90-day window. At the time of writing,
Apple did not share any schedule regarding mitigation plans
concerning the results presented in this paper.
"
Serious answer, don't use Safari. Use a browser that properly separates webpages into isolated processes so that this kind of cross-site read is not possible.
There are no other browsers on iPhone; every iPhone browser is a reskin of Safari. They're in theory supposed to allow other browsers in the EU, but AFAIK it has not happened yet.
Which, alongside telemetry, is the reason I tend to favor websites over apps.
Having said that, there are apps that are considered mainstream and not malicious by the general population but can become a convenient backdoor for, say, a state actor.
It will work unless someone forgets to add a public suffix into the public suffix list (as described in the FLOP paper). Both of these attacks target virtual memory pointers.
> While FLOP has an actionable mitigation, implementing it requires patches from software vendors and cannot be done by users. Apple has communicated to us that they plan to address these issues in an upcoming security update, hence it is important to enable automatic updates and ensure that your devices are running the latest operating system and applications.
I have a penchant for disabling JS by default on untrusted sites. It's basically someone else's program that we run on our machine, and apparently we can't yet sandbox it properly.
Yes. They even used Asahi Linux to develop their techniques, as it gave better access to CPU controls. The exploit was tested on Safari (i.e. not a Linux fault but a CPU one).
Bizarre that the M1 is immune to both; I'm more secure by not upgrading. (Sure, there are still a few issues, but they are mostly minor by comparison, or newer chips are also affected.)
Newer CPUs use more and more "hacks" - out of order execution, caching, speculative execution, branch prediction, etc - to gain performance improvements. The further back you go, the less vulnerable CPUs generally are to these (but possibly more vulnerable to other kinds of attacks).
For sure. But the further you go along, the _more_ of these tricks it uses and relies upon to improve performance. New vulnerabilities that are discovered are likely to take advantage of some feature on the spectrum of these “hacks”, from new to old. Because older “hacks” are well-known and studied, newly discovered vulnerabilities are more likely to target features in the newer end of that spectrum (AKA newer CPUs).
I can’t imagine taking a cpu design class in this more cynical era. So much of the speed these days seems to come down to distilling smoke and mirrors into physical form.
Indeed, my Apple Watch (series 3) has always been immune to all spectre type attacks, the CPU is too simple. It doesn’t do speculative execution at all.
That's vulnerable to DMP-based side channel attacks though (like GoFetch [1]), which you can only protect against in software [2] on the M3 and beyond.
The marketing culture for announcing hardware exploits is so strange to me. The norm seems to be getting a custom domain, logos, demos, an FAQ... why do all this instead of just reporting the exploit and releasing a paper?
Only academics read exploit papers. I don't see anything wrong with releasing the information in a more digestible way if it is something that affects the general populace. I only knew about Heartbleed because of the website. https://heartbleed.com/
Heartbleed et al. demonstrated conclusively that recognition matters; I don't begrudge researchers any technique that increases the relative visibility of their work.
It’s happening in other parts of the research world too: a couple of colleagues of mine were talking recently about a paper we found at a conference last year that had a web page to go along with it, with a domain and fancy graphics and such, for a boring programming languages paper. We concluded this is the modern way to try to jack your citations by getting noticed for everything but the technical content of the work, which is a bit off-putting.
Getting funding and good job offers is mostly about marketing. Even worse, lots of people controlling the purse strings aren't domain experts. In a way, it's no different from getting published in specific high-profile publications or attending specific universities.
It's a recent trend basically since Heartbleed had a cool name and lots of press. Why would you not want your exploit to be well known and to get lots of credit for it? If anything it's surprising it didn't happen earlier.
The custom domains can be a little silly, but for all the rest, why not? Logos (and the associated fancy name) are a lot more memorable than CVE-2025-XXXX. Demos are and were always appreciated. FAQs are a lot more digestible for the average reader than a paper.
I know it's kind of goofy, but I don't really see the downside to it.
Blame society. Businesses won't value security unless the fear of getting attacked is sufficiently strong and the losses significant. Otherwise why invest in it at all?
Definitely not just hardware exploits though. Look at heartbleed for example. It's been going on a long time. Hardware exploits are just so much more widely applicable hence the interest to researchers.
It also feels like people who are highly determined to build high-quality, secure software are not valued that much.
It is difficult to prove their effort. One security-related bug erases everything, even if it happened only once in 10 years in a 1-million-line code base.
Apple promotes privacy, sure. I'm not sure whether they promote security. Of course they are not against security, but I don't remember it being a significant theme in their marketing.
Interesting that the researchers have gone public before a mitigation is in place from Apple. Seems in pretty stark contrast to the industry-wide coordination that went into patching and mitigating spectre.
> We disclosed our results to Apple on May 24, 2024. Apple’s Product Security Team have acknowledged our report and proof-of-concept code, requesting an extended embargo beyond the 90-day window. At the time of writing, Apple did not share any schedule regarding mitigation plans concerning the results presented in this paper.
The vulnerability is over half a year old, and we're over a quarter past the embargo window.
I'm not saying it's bad (I'm pretty close to a disclosure absolutist), though I don't really know what the norms are for hardware attacks --- both papers are well past the normal coordinated disclosure window for software.
As someone who has reported several (software) security vulnerabilities to Apple, I couldn’t care less. Some of the things I reported are trivial to fix, and dealing with their team has been a frustrating experience. They don’t deserve extensions, they need to get their shit together and stop releasing new crap every year. Clearly they can’t do it properly.
It could be unfixable without a significant performance penalty, but at minimum they could make Safari do proper process isolation like every other browser does.
Right, but that was before they knew about this exploit. My point was that even if they decided they needed to urgently switch to a multiprocesses architecture because it's the only way to mitigate this exploit, they might not be done yet.
This class of attacks is not new. Spectre demonstrated this possibility in 2018, and Apple was previously targeted by speculation attacks, e.g. https://gofetch.fail/ or https://ileakage.com/.
I think also this isn't fundamentally different to Spectre. Spectre introduced a whole new class of vulnerabilities (hence the name) and this is one of them. An impressive one, but still, it definitely doesn't deserve the coordination & secrecy that Spectre had.
Most browsers have already switched to process-per-site because of Spectre.
They claim to, but they drag their feet, demand terms most researchers find so unacceptable as to be a bit immoral (the point of "responsible disclosure" isn't, in fact, to hold secrets from the public arbitrarily long), and often end up paying only a fraction of what was expected, if anything.
Your links are all from 2021. I remember there was a lot of criticism at the time, and so they updated their bug bounty program with quite a number of changes:
The vibe I got talking to people like Mark Dowd about this is that they're running something closer to an exploit bounty program, and it's pretty focused on patterns of vulnerabilities common to some pretty specific threat actors.
Their SLAP demo provides a great example of how defence-in-depth can make/break the viability of an exploit. That terrifying Safari demo is possible because Safari fails to isolate new windows in individual processes when calling `window.open` in js.
All the other side channel magic presented here doesn't matter if the data you want to read is in a seperate process with sufficient separation from the "hostile" process in the address space.
That's not a failure of Safari, it's required by window.open API semantics, in particular by the default Cross-Origin-Opener-Policy of "unsafe-none" [1].
By setting a different policy, sites can protect themselves against this.
I guess technically browsers could open new windows in a new browsing context group regardless of this setting and relay the allowed types of messages via IPC (if any), but that would be a major performance hit, and I don't think any other browsers do it differently.
[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cr...
Can't edit my original post anymore: Firefox and Chrome do seem to isolate even same-browsing-context-group and bridge the required APIs via IPC, so hopefully Safari will catch up at some point.
Basically, there are three scenarios:
- Completely unrelated tabs (e.g. those you open manually, those opened via command-click, tabs opened via '<a target="_blank' ...>" or 'rel="noopener"' references etc.) – these are relatively easily isolated if the browser supports it at all. All major (desktop!) browsers now largely do this, including Safari.
- "Same browsing context group" (but different origin) sites. These can communicate via various APIs, and historically that was achieved by just letting them run in the same rendering process. But in the face of attacks such as this one, this can be insecure. Firefox and Chrome provide sandboxing via separate processes; Safari does not.
- Same origin sites (without any stricter policy). These can fully access each other's DOM (if they have an opener/opened relationship), so there's not really any point in having them live in different renderers except possibly for fault isolation (e.g. one of them crashing not taking the other down). As far as I know, all browsers render these in the same process.
Sites can opt out of the second and third category into the first via various HTTP headers and HTML link attributes. If we were to design the web from scratch, arguably the default for window.open should be the first behavior, with an opt in to the second, but that's backwards compatibility for you.
I worked on a browser team when Spectre/Meltdown came out, and I can tell you that a big reason why Firefox and Chrome do such severe process isolation is exactly because these speculative attacks are almost impossible to entirely prevent. There were a number of other mitigations including hardening code emitted from C++ compilers and JS JITs, as well as attempts to limit high precision timers, but the browser vendors largely agreed that the only strong defense was complete process isolation.
I'm not surprised to see this come back to bite them if after like 7 years Apple still hasn't adopted the only strong defense.
To add to this and to quote a friend who has more NDAs in regards to microarchitecture than I can count and thus shall remain nameless: "You can have a fast CPU or a secure CPU: Pick one". Pretty much everything a modern CPU does has side effects that are something that any sufficiently motivated attacker can find a way to use (most likely). While many are core specific (register rename, execution port usage for example), many are not (speculative execution, speculative loads). Side channels are a persnickety thing, and nearly impossible to fully account for.
Can you make a "Secure" CPU? In theory yes, but it won't be fast or as power efficient as it could in theory be. Because the things that allow those things are all possible side channels. This is why in theory the TPM in your machine is for those sorts of things (allegedly, they have their own side channels).
The harder question is "what is enough?" e.g. at what level does it not matter that much anymore? The answer based on the post above this is based on quite a lot of risk analysis and design considerations. These design decisions were the best balance of security and speed given the available information at the time.
Sure, can you build that theoretically perfect secure CPU? Yes. But, if you can't do anything that actually needs security on it because it's so slow; do you care?
Your friend is genuine in their interpretation, but there is definitely more to the discussion than the zero sum game they allude to. One can have both performance and security, but sometimes it boils down to clever and nuanced design, and careful analysis as you point out.
This is also a fundamental property: if you can save time in some code/execution paths but not in others (which is a very desirable attribute in most algorithms!), and that algorithm is doing something where knowing whether it was able to go faster or slower has security implications (most any crypto algorithm, unless very carefully designed), then this is just the way it is, and has to be.
The way this has been trending is that in modern systems, we try to move as much of the ‘critical’ security information processing to known-slower-but-secure processing units.
But, for servers, in virtualized environments, or when someone hasn’t done the work to make that doable - we have these attacks.
So, ‘specialization’ essentially.
> I'm not surprised to see this come back to bite them if after like 7 years Apple still hasn't adopted the only strong defense.
So Apple's argument that iOS can't have alternative browsers for security reasons is a lie.
Strange claim.
Security isn’t a one-bit thing where you’re either perfectly secure or not. If someone breaks into your house through a window and steals your stuff, that does not make it a lie to claim that locking your front door is more secure.
In any event, Apple’s claim isn’t entirely true. It’s also not entirely false.
Browsers absolutely require JIT to be remotely performant. Giving third parties JIT on iOS would decrease security. And also we know Apple’s fetish for tight platform control, so it’s not like they’re working hard to find a way to do secure JIT for 3P.
But a security flaw in Safari’s process isolation has exactly zero bearing on the claim that giving third party apps JIT has security implications. That’s a very strange claim to make.
Security doesn’t lend itself to these dramatic pronouncements. There’s always multiple “except if” layers.
> Giving third parties JIT on iOS would decrease security.
Well, at least in this case it would have greatly increased security (since it would have allowed the availability of actual, native Chrome and Firefox ports).
And otherwise: Does Apple really have zero trust in their OS in satisfying the basic functionality of isolating processes against each other? This has been a feature of OSes since the first moon landing.
If JIT is such a problem then Apple shouldn't use it themselves. Sure, they let you disable it but it's still enabled by default while everyone pushes the narrative that Apple is all about security.
The alternative browsers have the required site isolation but aren't allowed. There's no fix for Safari and you must use it. I think it's very clearly decreasing the users' security.
Binary thinking is unhealthy.
Alternative browsers would introduce other security concerns, including JIT. It’s debatable whether that would be a net security gain or loss, but it’s silly to just pretend it’s not a thing.
Security as the product of multiple risks.
Discovering a new risk does not mean all of the other ones evaporate and all decision making should be made solely with this one factor in mind.
Can you provide any arguments that JIT would in fact decrease security other than "Apple says so"?
Every major mobile and desktop OS other than iOS has supported it for over a decade. Apple is just using this as a fig leaf.
I think a detached and distanced perspective must come to the conclusion that vendor lock-in isn't healthy. For security, performance or flexibility it tends to fall short sooner or later.
One could also talk about the relevance of a speculative attack that hasn't been abused for years. There can be multiple reasons for that, but we shouldn't just ignore Apple's main design motivation here. That would be frivolous and would preclude serious security discussion.
"Decreasing the security" is not binary thinking. It's just a fact today. Also, ability to run software doesn't make you less secure. I never saw any real proof of that. It's the opposite: Competition between different browsers forces them to increase the security, and it doesn't work for Safari on iOS.
Are you really surprised? Eventually the Apple distortion field starts to wane around the edges, but by then people have moved on to the new shiny.
Cross-Origin-Opener-Policy seems like a case of bad defaults where a less secure option has been selected so that we don't break some poorly maintained websites. Better to get the actual users of `window.open` to fix their code than to make every website insecure out of the box.
I can't imagine there are many sites passing significant amounts of data through this; the small number of sites where IPC poses too high a penalty could opt into the "same process" behavior if really needed.
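For site owners who want to opt out of the shared browsing context group today, it's a single response header. A minimal sketch using Node's built-in http module (the port, body, and the COEP companion header are just illustrative choices):

    const http = require('http');

    http.createServer((req, res) => {
      // Put cross-origin openers/openees into a separate browsing context group.
      res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
      // Optional companion header; together with COOP it also enables
      // crossOriginIsolated features such as SharedArrayBuffer.
      res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
      res.setHeader('Content-Type', 'text/html');
      res.end('<!doctype html><p>COOP-isolated page</p>');
    }).listen(8080);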
Forcing every website to adapt to a browser update is completely infeasible.
> I can't imagine there are many sites passing significant amounts of data through this
This is actually a quite common mechanism for popup-based authentication (which is much more secure than an iframe-based one, as users can verify where they're potentially entering their credentials).
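Roughly how those popup flows tend to work; a simplified sketch where the origins and message shape are placeholders, and real implementations add state/nonce checks:

    // On the relying site: open the identity provider in a popup and wait
    // for it to post the result back.
    const idpOrigin = 'https://login.example'; // placeholder origin
    const popup = window.open(idpOrigin + '/authorize', 'login', 'width=500,height=600');

    window.addEventListener('message', (event) => {
      if (event.origin !== idpOrigin) return; // only trust messages from the IdP
      const { token } = event.data;           // placeholder message shape
      console.log('received token', token);
      popup.close();
    });

    // On the identity provider's page, after the user signs in, it would run:
    //   window.opener.postMessage({ token }, 'https://relying.example');

This is exactly the opener/openee channel that COOP "unsafe-none" keeps alive, and what Chrome and Firefox bridge over IPC when the two sides live in different processes.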
We had the tech in the 80's for the browser to facilitate popup authentication with process isolation. It's this niche and esoteric tech called IPC[1], so niche that one really can't blame Apple for not hearing about it.
It truly boggles the mind as to how all the other browsers pull it off.
[1]: https://en.wikipedia.org/wiki/Inter-process_communication
To be fair, there wasn't that much sensitive web content around in the 80s to leak (primarily due to the web not yet existing, nor browsers), so it's only fair that browsers didn't consider using IPC for site isolation back then.
The point of my rather facetious comment is that IPC is a well-known thing (I struggle to even call it "tech") that has been around for 30-40 years. I don't understand why Apple needs people to make excuses for them, but this excuse would render Apple vastly more incompetent than neglecting to separate browser tabs in 2025.
Browsers are incredibly complex, and moving them to an IPC model is not easy. Essentially, you need to ensure "same process like", performant JavaScript interoperability in some cases, often (but not always) due to backwards compatibility.
Firefox has shared a lot about their efforts in moving there. If you're curious, there are a lot of blog posts and internal design docs under their project name "Project Fission".
But yeah, the fact that both Chrome and Firefox have managed to do so does leave Apple looking slightly bad here.
How often do tabs really need to communicate, and when they do, does it really need to be as fast as possible? I would say slower but secure would be a better design philosophy, especially as tab interaction is generally rare and low-bandwidth.
We already have this with (iirc) postMessage API.
That API is exactly one of the reasons Safari still runs some distinct origin sites in the same process together.
Performantly implementing that API across processes is possible, but not quite trivial.
popup-based authentication does not actually need high performance.
It's not used only for authentication, and figuring out what a website is trying to do heuristically doesn't sound easy either (although I believe Chrome on Android does just that, and enforces a site-locked process when they deem it important for security reasons).
To be fair, there was no web in the 80s.
That, and "cyber security" wasn't really a formalized field. It arguably still isn't, depending on how the question's framed.
But how much data are those popup-based auth flows sending through? At the absolute most a few MB in a couple of calls. Even if it's dramatically slower over IPC, it's not going to cause issues.
Expecting websites to defend themselves against CPU side channel attacks is also absurd!
Similar problem with third-party-cookies. They would make some auth cases easier and safer, but we shouldn't generally allow them because they are abused for tracking.
Here I would agree with you though.
Why not a choice?
Individuals could choose a "secure" browser or browser mode that provides increased protection from such attacks or a "compatible" one that is less likely to break old websites.
> Individuals could choose a "secure" browser or browser mode that provides increased protection from such attacks or a "compatible" one that is less likely to break old websites.
And then we get thousands of posts whining about Safari being broken because it is "not like Chrome" and developers moaning that their unsafe pet API is not supported. Web developers are never going to play ball.
idunno, as a professional web dev since 1998, I don't understand why Google, Apple and Mozilla are trying so hard to make the web browser like a complete OS (I technically understand why, I just think it's ridiculous). The amount of obscure APIs being added just boosts the surface area for vulnerabilities and makes low-resource web browsing nearly impossible. You either get "a web browser that works" or "a web browser that can load almost nothing", and basically nothing in between. I had to stop using Firefox on my old ThinkPad because after opening a few windows, it churns the CPU so hard it's not usable for a solid minute+. Let that finish up, and I have to wait again if I dare to open another page. i5-3220m, 8gb of RAM (where the OS uses like 200mb of RAM)... there's no excuse for this to not be able to browse the web.
Some fun examples of "your browser is an OS on top of your OS":
The Screen Wake Lock API provides a way to prevent devices from dimming or locking the screen when an application needs to keep running. https://developer.mozilla.org/en-US/docs/Web/API/Screen_Wake...
The Web Serial API provides a way for websites to read from and write to serial devices. https://developer.mozilla.org/en-US/docs/Web/API/Web_Serial_...
The Window Management API allows you to get detailed information on the displays connected to your device and more easily place windows on specific screens, paving the way towards more effective multi-screen applications. https://developer.mozilla.org/en-US/docs/Web/API/Window_Mana...
The Compute Pressure API is a JavaScript API that enables you to observe the pressure of system resources such as the CPU. https://developer.mozilla.org/en-US/docs/Web/API/Compute_Pre...
(my thesis is, not all web developers want this stuff, and usually when I talk to people in the industry they agree, it's excessive)
Every web developer is fine with 10% of the feature set, it’s just a different 10% for each dev. I am regularly annoyed by the inconsistent browser support for Web MIDI, something 99% of web devs probably don’t care about at all.
It would be easier to sandbox if there were fewer features of course, but in practice we rarely see exploits even in complicated low-level APIs like webgpu (which must be a nightmare to make secure given how many buggy drivers and different devices it has to support). So it seems like in practice we are able to sandbox these APIs securely, and having them provides an incredible benefit to users who are able to securely and easily run applications (how else do you recommend people do this on desktop platforms?).
> we rarely see exploits
I can't think of an analogy that doesn't come off crass.
I posit that the likelihood of the morass that is WebGPU not having exploitable bugs approached zero approximately 25 seconds after the first public release of the code, if not months prior.
It's only when one of two things occurs that publishing happens, basically: intelligent frustration, and "for the lols".
Someone hits a bug and gets pissed that the library's authors blame everyone but themselves. While working around the bug, they discover it's not just a bug. They warn devs. Sometimes responsible disclosure means a quiet fix and then disclosure, but usually it means "here's the exploit, they don't care I guess".
If there's not enough curious people poking things, exploitable stuff could remain hidden too long.
> idunno, as a professional web dev since 1998, I don't understand why Google, Apple and Mozilla are trying so hard to make the web browser like a complete OS (I technically understand why, I just think it's ridiculous)
I am not a web developer but I completely agree with you. To me, adding more complex points of failure in humongous piles of code that we absolutely need to run in a modern life is not a great risk assessment. It’s like we never learnt from the security issues with the JVM.
> The Screen Wake Lock API provides a way to prevent devices from dimming or locking the screen when an application needs to keep running. https://developer.mozilla.org/en-US/docs/Web/API/Screen_Wake...
There's an obscure use case for this called "Watching Video"
Dunno, that API has only been available since Firefox 126 and I've been watching videos without having my screen go to sleep (or screensaver coming on) for like.. years and years (far before Firefox 126)
Indeed. I work on the Firefox media stack and we have been grabbing wake locks when video playback is happening for a long time. Occasionally e.g. on some linux desktop variant this has malfunctioned and we're alerted in no time and fix it.
The Wake Lock API is for other use cases, such as recipe websites or other documents for which you don't want the screen to go away or dim: the kind where you need to look at the screen for long periods of time without touching it or interacting with the mouse and keyboard.
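The API itself is tiny. A sketch of the "keep the recipe on screen" case (re-acquiring the lock when the page becomes visible again is left out):

    let wakeLock = null;

    async function keepScreenOn() {
      if (!('wakeLock' in navigator)) return; // feature-detect
      try {
        wakeLock = await navigator.wakeLock.request('screen');
        wakeLock.addEventListener('release', () => console.log('wake lock released'));
      } catch (err) {
        // e.g. denied because the page is hidden or the battery is low
        console.warn('wake lock failed:', err.name, err.message);
      }
    }

    async function letScreenSleep() {
      if (wakeLock) { await wakeLock.release(); wakeLock = null; }
    }

    keepScreenOn();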
Prior to this API being introduced, websites used to play an inaudible/almost invisible looping media file to keep the screen awake. This has power usage implications: a small single-digit number of watts (1 to 3.5, depending on OS, hardware, mobile or not) is required just to keep audio running (because of high-priority threads, frequent wakeups, and simply because the audio hardware itself needs power).
One of those libraries source, for illustration: https://github.com/richtr/NoSleep.js/blob/master/src/index.j...
Bless you. I was driven a little mad once trying to figure out why certain websites would steal audio focus away from music playing on my phone, it must have been some clumsy implementation of this.
That's because there used to be this thing called flash/silverlight that could do this instead. Now, video is done completely differently than it was.
> Now, video is done completely differently than it was.
Which is a thing that happened long before Firefox 126, too. (Browsers have simply requested the screen wakelock themselves when a video was playing. So this API is mainly for use cases that aren't playing a video.)
Recipe sites will keep themselves awake also, which is nice.
The first two, screen wake lock and web serial, have good use cases imo. I wouldn't be surprised if some in-use assistive technology uses serial communication - think screen readers or custom input devices. Keeping the screen from locking is also useful from a pure accessibility standpoint for users who move slower or need more time to read things.
I fully agree. As someone also doing web development in a similar timeframe: for plenty of stuff we would be better off with native apps talking over Internet protocols; there's no need to transform a platform for interactive documents into an OS.
> The amount of obscure APIs being added just boosts the surface area for vulnerabilities
It’s often the ancient APIs from around 1995-2001 that are the most vulnerable ones, with information leaking across origins (like today's) needing hacky fixes to stay secure and compatible.
window.open(), target=_blank, cross site request forgery, etc.
IE6 from 2001 had a ton of these modern security issues, and Netscape before it probably had them too.
At that time there were tons of buffer overflow security holes, so no one cared about side-channel attacks.
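A concrete example of that old class of problem is reverse tabnabbing via window.opener: before browsers started defaulting target=_blank links to noopener, any opened page could navigate the tab that opened it. A sketch with placeholder URLs:

    // In the opener tab: open a link without severing the opener relationship.
    window.open('https://attacker.example/page', '_blank'); // no 'noopener'

    // In the opened (attacker-controlled) tab: redirect the original tab
    // to a look-alike login page while the user isn't watching.
    if (window.opener) {
      window.opener.location = 'https://lookalike-login.example';
    }

    // Mitigations: rel="noopener" on links, window.open(url, '_blank', 'noopener'),
    // or a Cross-Origin-Opener-Policy that severs the opener reference.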
Well, for better or worse, the web is an application platform these days.
I consider it pretty great, since the alternative is installing native apps for things I'm using exactly once or very rarely.
There's a case to be made though that maybe these things should only be available to PWAs, which is what Apple is already doing for some functionality on iOS, including push notifications.
What? I thought Apple was trying to quietly kill PWAs, probably because they don't go through their App Store. And also because if you make a good enough sandbox then you don't need to pay for "all the wooork we put in".
They certainly don’t have a lot of love for them as a first-class app development environment, but they are also their fig leaf of “open access” to the platform.
Regardless of that, I do like the idea of PWAs getting access to a higher tier of web APIs, especially those increasing the attack surface, as “installing an app” makes even non-sophisticated users do more of a double take than “visiting a website” in terms of access to their system.
iOS just really improved support for PWAs sometime in the latter part of last year. It's much better.
I agree with you, particularly when 90% of sites people visit are to read, watch, or engage with content.
That’s not a real choice though. All it takes is one website that is essential to me not supporting the secure mode and I’m forced to opt-out. The upstream website is making the choice for me.
Users don't want to make these kinds of choices, and generally there's no good way to educate them enough to figure out what they actually want.
People are downvoting you, but Apple does actually offer this: https://support.apple.com/en-us/105120
I think they've gotten away with it because it's a pretty obscure setting and they say a bunch of things like "most users should not enable this, and if you do enable it you should expect things to break".
It's super feasible if you own the API default.
It's definitely a quick way to get all your users to switch to a different browser or figure out how to disable updates forever.
One spot where safari is in an advantageous position to force a new default, as long as they roll it out on iOS first.
I mean, if all major browsers do it roughly once then users will complain to the few broken websites. They won’t even think to blame the browser if every other site works fine and the broken site is broken on all browsers.
Good luck trying to get Google or Microsoft to throw their paying enterprise users under the bus in the interest of slightly safer sandboxing defaults.
I wasn’t suggesting it would happen, only that the “users would stop updating their browsers” scenario seemed unlikely.
You just announce you are making a change and then turn it on later.
After building enterprise APIs for a few years, you’d be amazed at how hard it is to get companies to make even minor changes; backwards compatibility is key. Often it’s because they _can’t_ make the change themselves since they outsourced the code to a consulting agency. So they’d have to sign a new contract and get an agency to make the change.
They just won’t, and you’ll have a browser that people stop using.
It is probably outside the scope of what one company can do (although Apple is quite large…). But we need to fix our understanding of backwards compatibility. If a computer system provides the ability to keep doing something, but the way it provides that capability requires it to be insecure, then the system should not really be thought of as “backward compatible.” Because reasonably prudent people can’t actually keep doing the thing they were doing before.
Of course, modern computers on the modern web don't really provide the ability to do much at all in a reasonably prudent fashion, so it is all a bit moot I guess.
Announce what to whom? To the hundreds of millions of users out there that don't even know what a browser is, let alone why it's now talking to them about something called a "site isolation framework"?
I would guess you would use a deprecation message in the console? Like they have done over cookie changes, etc. A normal user would obviously not check the console, but the devs or admins of the site sure might.
That's assuming there's still a dev around that has knowledge of, or even access to, the source code of a given webapp depending on the legacy functionality.
Sure. I just got a vibe from this thread that breaking security changes in the browser are a totally unknown phenomenon, but we have had behavior changes from cross-origin headers, demands for SSL, and changes to cookies. Somehow we survived. ;)
A lot of people did complain very loudly about enforcing SSL, and it took decades to get here. Same for cookies.
So yes, breaking changes for privacy/security reasons do happen, but they're very painful, and if there's a more secure alternative (in this case, still isolating communicating processes and providing communication via IPC, and providing an opt-out way of the legacy behavior), that's often the easier path.
Other browsers do site isolation, why can’t Safari? (:
Safari definitely does use site isolation (if you check "Activity Monitor", you'll find Safari processes named after the sites they're displaying) in almost all cases.
window.open, in some constellations, is an exception, because the opening and opened sites have an explicit communication facility available to them, unless at least one of the two explicitly opts out of it. As far as I'm aware, Safari also correctly implements that opt-out.
The only thing that Chrome and Firefox seem to be doing on top of that, as far as I understand, is to actually enforce process-per-site even for sites from the same "browsing context group" (i.e. all that can hold programmatic references to each other, e.g. via window.opener), which requires using IPC.
> Safari definitely does use site isolation (if you check "Activity Monitor", you'll find Safari processes named after the sites they're displaying) in almost all cases.
From the FAQ:
"For leaking secrets, both SLAP and FLOP are confined to the address space they are trained in. As pointed out by iLeakage, Safari lacks Site Isolation, a measure used to enforce that two different webpages not from the same domain can never be handled by the same process. Thus, in Safari it is possible for an adversary's webpage to be handled by the same process (and thus address space) with an arbitrary webpage, increasing the attack surface including LAP- and LVP-based exploits.
On the other hand, although Chrome is equipped with Site Isolation, we demonstrate that it is not a perfect mitigation. We show the real-world existence of corner cases, where two subdomains of the same site can be merged into one process, again leading to LAP- and LVP-based attacks."
Yes, in some special cases, which both embedded/opened and embedding/opening websites can avoid by setting the appropriate HTTP headers/HTML attributes.
Of course it would be better if Safari would do the same thing as Chrome and Firefox and just provide a separate process for all contexts, including those that can communicate per specifications. But there's something sites can do today to avoid this potential information leak.
> websites can avoid by setting the appropriate HTTP headers/HTML attributes.
Individual sites plugging browser + CPU security holes seems like a violation of separation of concerns. Yes, I hope every bank out there puts this workaround into their site ASAP, but that's hardly a solution for the flaw itself.
The permanent solution to the flaw is either a hardware/OS-side fix (i.e. disabling this particular kind of speculation via a chicken bit, if there is one), or Safari implementing site isolation in the same way Chrome and Firefox are already doing.
But as the former might well be impossible (at least without ruining performance or requiring a hardware swap), and the latter might take a while, websites should still take the precautions they can. It's a good idea for other reasons anyway: Why keep around an inter-context messaging mechanism you possibly don't even need?
> But as the former might well be impossible (at least without ruining performance or requiring a hardware swap), and the latter might take a while,
According to the site they informed Apple in May 2024. Should that not have been enough time?
It took Chrome and Firefox years to achieve complete tab separation, so yes, that timeframe does seem too tight, unfortunately.
No. For fanboys, everything Apple does is the best thing anyone could ever do. So if Apple needs more time, then it is impossible to be faster. And the last line of defense is whataboutism. I wish people would not be so predictable and boring.
Right, in what sane world would the website determine operating system process semantics? What next, syscalls?
Have you noticed how often people complain Chrome uses too much memory?
Process-per-site isolation doesn't necessarily have to use (much) more memory.
If you pre-initialize the renderer and JavaScript engine and then fork that pre-warmed instance for each site, every page of memory not written to remains shared in physical memory.
Properly accounting for that in task managers is hard, though; on many OSes, Chrome's memory usage looks much scarier than it is in reality.
You can't fork a GUI process on Apple OSes; most of the system can't handle it.
It'd also weaken any security protection based on randomness (eg ASLR slide, pointer encryption keys).
Huh, TIL Chrome on macOS might actually not be using the “zygote process” paradigm!
https://source.chromium.org/chromium/chromium/src/+/main:con...
Not possible on Windows, which is where most Chrome users are.
Because long-inactive tabs should go to sleep.
If Chrome itself is not aggressive enough, try the "Auto Tab Discard" extension.
There is now a Chrome option, "Memory Saver", which dictates how aggressively it puts tabs to sleep.
Safari, on the other hand, doesn't even have tab sleep for whatever reason.
It is not required by window.open semantics; you can absolutely implement site isolation even in the presence of COOP unsafe-none.
Sorry, I was imprecise in my original post: It's definitely possible to isolate even sites in the same browsing context group, but it requires more work that Safari apparently just hasn't got around to yet.
Would that performance hit be really that significant? I can't imagine there are more than a couple of calls total, and that's all dwarfed by any web access. Or do I misunderstand what's required?
So should Protonmail (and any other site with similarly sensitive data) be setting that header, then? It’s probably hard to change the default. I bet some use cases (SSO popups?) depend on it.
It's not unreasonable to set a different header value for the login page only, where it should be safe because no external user data is being rendered.
Sites can also opt into the same behavior by setting the rel="noopener" or alternatively target="_blank" attributes on outgoing links (i.e. <a> tags).
And yes, something like a webmail site should definitely be setting the header, or at least these attributes for outbound content links.
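For the outbound content links, the fix can be as small as a rewrite pass over rendered (untrusted) HTML; a sketch, where the '#message-body' selector is just a stand-in:

    // Ensure every off-origin link in rendered content opens without an
    // opener reference, so the target lands in a separate browsing context group.
    for (const a of document.querySelectorAll('#message-body a[href]')) {
      const url = new URL(a.getAttribute('href'), location.href);
      if (url.origin !== location.origin) {
        a.target = '_blank';
        a.rel = 'noopener noreferrer';
      }
    }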
That is a little different, though: those attributes are for if you're example.com linking to protonmail, the header is for if you're protonmail deciding on security policies for interactions with example.com.
The header works in both directions; cf. the table on https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cr....
Would it help to use separate processes and share the data on demand only, via IPC with a robust speculation barrier?
Yes, and that’s what Firefox and Chrome are already doing.
So zealous to defend Apple that you didn’t check Firefox and Chrome before posting?
This, 100%. From the SLAP paper linked in the OP https://predictors.fail/files/SLAP.pdf :
> Considerations for Safari. We emphasize the importance of site isolation [55], a mechanism preventing webpages of different domains from sharing rendering processes. Site isolation is already present in Chrome and Firefox [42, 55], preventing sensitive information from other webpages from being allocated in the attacker’s address space. While its implementation is an ongoing effort by Apple [3, 4], site isolation is not currently on production releases of Safari. On the contrary, we also reflect on [Webkit's memory allocator] libpas’s heap layout from Section 6.3, allowing sites to not only share processes, but also heaps. Partitioning JavaScript heaps by at least several memory pages per-webpage would prevent JavaScript strings from the target webpage from being allocated within the 255-byte reach of the LAP.
No, only some of the side channel magic doesn't matter if you live in a different virtual memory space. Other past attacks didn't use virtual memory pointers and used physical memory pointers or didn't use any pointers at all - one could read data from another process, the kernel, another VM, the SGX enclaves or even proprietary CPU manufacturer code that runs on the CPU, like the CPU signing keys used for remote attestation.
The writing was on the wall for in-process sandboxing with Spectre, but that seems to have faded a bit. This just reinforces it. Things like "safe in-process sandboxing with WASM" are just a fantasy; it can't be implemented.
Safe in-process sandboxing is obviously possible and even trivial. It does get harder if you care about performance, though.
If the costs are high enough you’re basically reimplementing multi-process isolation from first principles.
"trivial" how do you figure? Remember these exploits bypass your own code's conditionals over a shockingly far duration. Unless you just mean for incredibly restrictive usages such as eBPF?
possible absent any performance concerns at all, yeah sure
> Unless you just mean for incredibly restrictive usages such as eBPF?
I was actually thinking something more like a bytecode interpreter that runs one operation and then sleeps until the next full wall clock second, but yes, that's my point: If you don't care about performance, you can make process isolation safe very easily.
I think at the point where you're suggesting 1hz bytecode interpreters the onus is kind of on you to be clear you're not talking about plausible points in the design space.
1 Hz is probably a bit too slow for practical applications, but my point is that somewhere between that, and simulating a parallel universe at each data-dependent branch, is probably a reasonably-safe spot, or more likely a spectrum that application developers get to pick their tradeoffs from.
I agree that 1hz is probably too slow for practical applications.
An identically useful comment would've been to place the bounds at 0 and infinity.
A zero Hz machine is arguably not Turing complete, though.
They didn't mention the bounds were inclusive either.
That is actually still not secure. The cache will happily retain the trace of the access forever.
Yeah, but unless a cache miss has a performance penalty of roughly one second, you should be fine.
What I know as a developer is web security is really hard. Last week there was a DOM clobbering gadget deep in my TypeScript world and I really didn't have the energy to understand who wants to clobber my DOM and why they need a gadget. I want to build stuff and what worries me is this stuff is just simply not foreseeable.
DOM security is a completely different beast from process isolation with WASM (in a web context or otherwise). The attack surface is vastly greater, due to the much larger API complexity.
Do other browsers have process isolation for new tabs?
Not necessarily for tabs on the same web site, but for different sites, yes. Hence "site isolation".
To be fair, this is (relatively, compared to the age of the web) new behavior though.
Even Chrome, which pioneered the "tab per process" model, didn't isolate same browsing group context sites for a long time, and still shares processes on Android as a resource optimization: https://chromium.googlesource.com/chromium/src/+/main/docs/p...
Firefox only released their "Project Fission" in 2021, i.e. three years after Spectre.
In Safari settings under Advanced, it’s possible to enable ”verify window.open user gesture”. Does that help at all?
AFAIK this only means the attacker has to dupe you into doing a UI event like scrolling or clicking or touching something on page. Very easy to do.
Safari is the Internet Explorer of the '20s.
This is a flawed comparison in many ways. As you might not understand, IE was problematic because of its massive install base and everyone writing their websites only and exclusively for Chrome... oh wait, typo'd there, meant IE.
Cool detail, in the section where they reverse-engineer the presence of an LVP on the M3:
Grouping a hex address by threes is crazy
3 hex digits = 12 bits = 4096 entries, the size of each address translation table on ARM. So it does make some (twisted) sort of sense. Assuming you're using 4k page size
Apple devices use 16kB pages.
macOS/iOS don’t
wait, so does this mean that if an exploit tries to use a 32 bit address it's immediately shut down?
There are usually no valid 32 bit addresses, i.e. the first 4GB are not mapped.
That might be their point. As the OP quoted
> any virtual address below 0x100,000,000 is invalid.
That kinda suggests that all 32-bit addresses are inherently invalid on 64-bit macOS.
Not inherently, it's just a linker default. You can run 32-bit processes through WINE.
Only in Rosetta. On arm64 these binaries will not be allowed to load.
Hmm, one part I found interesting:
> In order to make cache hits distinguishable from misses in Safari, we reference the NOT gate-based cache amplification primitive from [29, Section 5], adjusting the speculation parameters for the M2 CPU. We run the amplifier 500 times when the target address is cached and 500 more times when it is evicted, in native and WebAssembly implementations. Table 3 summarizes the timing distributions, with units in ms. We observe that they are clearly separable even in a web environment, allowing us to distinguish cache hits from misses with WebKit's 1 ms timer.
So I guess all the hubbub around disabling fine-resolution timers and SharedArrayBuffer was for naught.
It delayed viable attacks by a few years, maybe?
It doesn't hurt that setbacks for web app development coincidentally send developers into the open arms of Google and Apple's stores that collect a 30% cut of all revenue, so there was a good incentive to do it even if it didn't protect anyone.
> It doesn't hurt that setbacks for web app development coincidentally send developers into the open arms of Google and Apple's stores that collect a 30% cut of all revenue, so there was a good incentive to do it even if it didn't protect anyone.
That seems like a bit of a reach. It's an obscure feature that is rarely useful, and when it is, all you have to do is send the right HTTP headers (if using Chrome) and you get it back.
Multithreading may be an obscure feature to you but runtime developers get requests for it all the time. SAB becoming widely available was definitely delayed.
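For reference, what SAB buys you over plain workers is shared memory plus Atomics instead of copying or transferring data via postMessage. A minimal sketch (it only runs on pages that are crossOriginIsolated, i.e. served with the COOP/COEP headers mentioned above):

    // main.js: share one counter between the page and a worker.
    const sab = new SharedArrayBuffer(4);
    const counter = new Int32Array(sab);

    const worker = new Worker('worker.js');
    worker.postMessage(sab); // the buffer is shared, not copied

    setTimeout(() => {
      // Both sides see the same memory; Atomics gives well-defined access.
      console.log('count seen by main thread:', Atomics.load(counter, 0));
    }, 1000);

    // worker.js would contain:
    //   onmessage = ({ data }) => {
    //     const counter = new Int32Array(data);
    //     for (let i = 0; i < 1_000_000; i++) Atomics.add(counter, 0, 1);
    //   };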
I would still maintain that needing multithreading on a website is relatively rare, and specifically needing SharedArrayBuffer instead of just multiple processes (e.g. web workers) is even rarer.
Did use cases exist? Sure. But not sufficiently to move the needle on app store usage.
One of the most significant use cases that gets impacted is game dev.
Web games benefit significantly from SAB (see the Godot engine for example) and mobile games make up a pretty sizable chunk of app store usage, particularly in app purchases.
> Did use cases exist? Sure. But not sufficiently to move the needle on app store usage.
We can't say that for sure. There is no shortage of examples where Apple neglects a feature that might provide parity with their own services to avoid competition. Safari, being a mandatory feature of iOS, is reasonably implicated as part of the conspiracy to prevent users from buying and selling software without the assent of a corporate benefactor.
> On the other hand, although Chrome is equipped with Site Isolation, we demonstrate that it is not a perfect mitigation. We show the real-world existence of corner cases, where two subdomains of the same site can be merged into one process, again leading to LAP- and LVP-based attacks.
Did anyone spot where this is mentioned?
Edit: it doesn’t seem like they have a general attack. Rather, it’s that some sites are not in the public suffix list.
Edit 2: It’s also interesting that they found that iPhone 13 and iPhone 13 mini (which have the same processor and came out at the same time) differ in LAP in that they observed only the latter as having it. Very curious…
Right, and “where two subdomains of the same site can be merged into one process” is normal, right, given Site Isolation ≠ Origin Isolation?
A PSL flaw is important, but also a low-cost fix.
Thanks for pointing this out.
Funny that I am seeing this now, because last Fall I had Daniel Genkin as my Intro to Cyber Security Professor (co-author of this result). Interesting class, but I remember him mentioning that they were working on a speculative attack for Apple CPUs after seeing the results of spectre and meltdown on Intel CPUs. I remember how he seemed almost paranoid about security, and I suppose I see why now (security is almost never guaranteed).
Especially now that I have just bought an M4 mac
I am curious whether the problem impacts the M4, given it came out after this was disclosed to Apple.
That and it moved to Arm’s 9.2 instructions.
Keep in mind that it takes at least 3 months to produce an M4, and the design was finalized long before that. So most likely yes.
Yes.
Apple released minor-version updates to both macOS and iOS the past few days, both containing several security fixes. Has anyone been able to confirm if they address these exploits?
They haven’t yet. From https://www.bleepingcomputer.com/news/security/new-apple-cpu...:
Apple acknowledged the shared proof-of-concept and stated it plans to address the issues. However, at the time of writing, the flaws remain unmitigated.
"We want to thank the researchers for their collaboration as this proof of concept advances our understanding of these types of threats," Apple told BleepingComputer.
"Based on our analysis, we do not believe this issue poses an immediate risk to our users."
It's crazy that they were informed about this months ago and still have not fixed it. They're going to have to now that it's public, but why would that pressure even be needed? I naively assumed that if Apple still gets one thing right, it's security updates. This is disappointing and concerning.
Have you considered that it might be difficult?
Aside: I feel like RUB has become kind of a global center for this kind of high-end offensive security work. Was I just not paying enough attention 10 years ago or is this a new-ish thing?
I don't know if it's RUB or Yuval; he was credited on Spectre and Meltdown as well (if I recall correctly), but he was at Data61 or the University of Adelaide at the time.
It goes way back; check the work of the likes of Thorsten Holz or Christof Paar. TU Graz is another one.
What is RUB?
Ruhr University Bochum, the third author's University
Ruhr-University Bochum, in Germany
https://www.ruhr-uni-bochum.de/de
They've also consistently put out some of the best fuzzing research
Their offensive crypto work is also on point.
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can.
Mark Twain
CPUs on the other hand do nothing but speculate.
For any yung'uns seeing this for the first time, the spectre and meltdown attacks (and accompanying papers) are worth reading.
https://spectreattack.com/
Is it bad that I disable spectre mitigations on all my PCs to get a free double-digit-% performance boost?
It depends on your threat model. If you don't run any untrusted code on your hardware (including Javascript), you can safely disable the mitigations. If you do run untrusted code, keep them enabled.
In the context of a regular end-user desktop machine, this seems overly paranoid to me. The odds of encountering a real, JS-based spectre attack in the wild are basically zero (has anyone ever seen a browser-based Spectre attack outside of a research context? even once?), and the odds of it then being able to retrieve actual sensitive data are also basically zero. That's two astonishingly tiny numbers multiplied together. The threat just isn't there.
For regular end-user desktop machines, the mitigations only decrease performance for no real benefit. Spectre is a highly targeted attack, it's not something you can just point at any random machine to retrieve all their bank passwords or whatever.
Spectre is mitigated by your browser already.
What is the threat model if I run lots of untrusted JavaScript, but I only have a small amount of memory in other processes worth reading and I would notice sustained high CPU usage?
Is there an example in the wild of a spectre exploit stealing my gmail cookie and doing something with it? (Would be difficult since it's tied to other fingerprints like my IP)
Or stealing credit card numbers when they're in memory after I place an online order?
If you’re not sure, just keep the mitigations on.
You're not getting a boost, you're avoiding a penalty. In some (but not all) cases you can avoid the penalty and the exploits by disabling SMT. Remember, SMT isn't twice as many cores, just twice as many half-cores. You'll be fine.
Disabling SMT alone isn’t enough to mitigate CPU vulnerabilities. For full protection against issues like L1TF or MDS, you must both enable the relevant mitigations and disable SMT. Mitigations defend against attacks where an attacker executes on the same core after the victim, while disabling SMT protects against scenarios where the attacker runs concurrently with the victim.
In my experience SMT is still faster for most workloads even with the mitigations.
It's a common misunderstanding that the CPU suddenly has twice as large a performance envelope when SMT is enabled. Only specialized software/scenarios will tangibly benefit from the parasitic gains of SMT-induced extra parallelization, e.g. video encoders like x264 or CPU-bound raytracers to name a few examples. These gains typically amount to about 15-20% at the very extreme end. In some cases you'll see a performance drop due to the inherent contention of two "cores" sharing one actual core. If you're stuck with a dual-core CPU for your desktop setup, you should absolutely enable SMT to make your general experience feel a bit more responsive.
> It's a common misunderstanding that the CPU suddenly has twice as large a performance envelope when SMT is enabled.
Perhaps, but I am not under this misunderstanding and never expressed it.
> Only specialized software/scenarios will tangibly benefit from the parasitic gains of SMT-induced extra parallelization
In my experience it also speeds up C++/Rust compilation, which is the main thing I care about. I can't find any benchmarks now but I have definitely seen a benefit in the past.
Are you sure about your statement?
> video encoders like x264 or CPU-bound raytracers to name a few examples. These gains typically amount to about 15-20% at the very extreme end.
Normally those types of compute-heavy, data-streamlined processes don't see much benefit from SMT. After all, SMT only provides a performance benefit by allowing the CPU to pull from two distinct chains of instructions, filling the pipeline gaps from one thread with instructions from the other. It's effectively instruction-by-instruction scheduling of two different threads.
But if you're running an optimised and efficient process that doesn't have significant unpredictable branching or significant unpredictable memory operations, then SMT offers you very little, because the instruction pipeline for each thread is already almost fully packed, offering few opportunities to schedule instructions from a different thread.
Compression is inherently unpredictable (if you can predict it, it's not compressed enough), which is vaguely speaking how it can help x264.
I agree that compression is all about increasing entropy per bit, which makes the output of a good compressor highly unpredictable.
But that doesn’t mean the process of compression involves significant amounts of unpredictable branching. If for no other reason than that it would be extremely slow and inefficient: many branching operations mean you’re either processing input pixel-by-pixel, or your SIMD pipeline is full of dead zones that you can’t actually re-schedule, because doing so would desync your processing waves.
Video compression is mostly very clever signal processing built on top of primitives like convolutions. You’re taking large blocks of data, and performing uniform mathematical operations over all the data to perform what is effectively statistical analysis of that data. That analysis can then be used to drive a predictor, then you “just” need to XOR the predictor output with the actual data, and record the result (using some kind of variable length encoding scheme that lets you remove most of the unneeded bytes).
But just like computing the median of a large dataset can be done with no branches, regardless of how random or how large the input is, video compression can also largely be done the same way, and indeed has to be done that way to be performant. There’s no other way to cram up to 4K * 3 bytes per frame (~11MB) through a commercial CPU to perform compression at a reasonable speed. You must build your compressor on top of SIMD primitives, which inherently makes branching extremely expensive (many orders of magnitude more expensive than branching SISD operations).
> You’re taking large blocks of data, and performing uniform mathematical operations over all the data to perform what is effectively statistical analysis of that data.
It doesn't behave this way. If you're thinking of the DCT, the one it uses is mostly 4x4, which is not very large. As for motion analysis, there are so many possible candidates (since it's on quarter-pixels) that it can't try all of them and very quickly starts filtering them out.
> the one it uses is mostly 4x4, which is not very large
That's 16 values × 32 bits, which is AVX-512. What other size would you suggest using, and (more importantly) what commercially available CPU architecture are you running it on?
> Are you sure about your statement
Yes. From actual experience.
It usually speeds up basically everything parallelizable that looks kind of like a parser, lexer, tokenizer, .... Unless somebody goes out of their way to design a format with fewer data dependencies, those workloads are crippled on modern CPUs. That includes (de)compression routines, compilers, protobuf parsing, ....
The only real constraint is that you can actually leverage multiple threads. For protos as an example, that requires a modified version of the format with checkpoints or similar (which nobody does) or having many to work on concurrently (very common in webservers or whatever).
In practice, probably not, as long as the general population keeps them enabled. I mean, looking at the effort required, it's not worth spending time exploiting Spectre these days, because virtually everyone is protected. If you're not likely to be directly targeted, "herd immunity" will work.
If just visiting a webpage with some JS will let them do ACE on even 0.1% of visitors, hackers are probably still motivated enough to try it. But I vaguely remember these kinds of vulns can be patched in-browser for a perf hit instead of taking the hit system-wide, which sounds like an ok compromise.
Edit: Arbitrary memory access, not ACE
>hackers are probably still motivated enough to try it.
The amount of actual exploit crafting needed to do something meaningful with a hack like this is pretty much not worth it for any financial reason. The only time this happens now is when state-funded actors or prominent groups with lots of manpower really want to take down an individual.
Depends how automated it can be. I know some non-spectre 0-days were used broadly, either via viruses or port-scanning. Is it possible to craft some JS that'll use a spectre-like vuln to reliably grab something important like Chrome passwords or credit cards? Idk, it's hard to prove otherwise, and hackers have more time to think about this than I do.
> Is it possible to craft some JS that'll use a spectre-like vuln to reliably grab something important like Chrome passwords or credit cards?
Probably, but there’s a huge luck element involved, at least with spectre. It’s difficult to guide the speculative execution to read exactly what you want it to read, assuming you even know where it is. As a result you need to spend quite a bit of time on a single target before you’re likely to actually get the data you want. Even then, there’s likely a significant element of human analysis to assemble something useful from all the noise.
So yes, it’s almost certainly possible. But exploits don’t exist in a vacuum. If you’re expending that much effort to get credit card numbers, then quite frankly you’re a fool, because good old phishing attacks and other social engineering attacks are easier, more reliable, and above all, cheaper.
At the end of the day, crime is a business like any other, profitability and margins are king. You don’t waste time perfecting attacks that have significantly smaller margins than your existing attacks. The only exception to that is nation states, because they aren’t motivated by directly extracting cash from victims, and ultimately that’s what makes nation state actors so dangerous.
> If just visiting a webpage with some JS will let them do ACE on even 0.1% of visitors
Spectre is not an arbitrary code execution kind of attack.
Oops, I meant arbitrary memory access.
Linux on its own isn't even 0.1% of visitors normally. We're talking multiple orders of magnitude less for disabled mitigations. And on top of all that, it's possible that exploiting that machine will be harder due to custom software with an uncommon memory layout - i.e. it's probably not a stock Ubuntu. And finally, for accessing data outside of the page, you really want to have some specific target, so they'd have to guess that too.
If only Linux is affected, then sure. I was talking about Spectre in general. Maybe only Linux users are turning the Spectre mitigation flags off, but there are plenty of outdated Windows systems too.
You shouldn't disable Spectre mitigations, but Retbleed and Downfall (Intel) are much more "lab" exploits, and the fallout for Retbleed is much more severe on cloud boxes than on your personal PC. Easy 20-40% performance uplift on AMD Zen1-Zen2 and Intel 6th-11th gen.
Only if you don't care if baddies see you go fast
If your machine is air gapped and/or not running random downloaded code, I think it is a possible reasonable option.
It's a tragedy that so many websites insist on having the ability to run random downloaded code on our systems to do basic things like displaying simple text and images. Things browsers are capable of with nothing but HTML. Google refuses to even show search results, a bunch of literal hyperlinks, without javascript being enabled.
The real tragedy is that our processors try to win a bit more speed by sacrificing simplicity and therefore increasing the chances of such exploits. The other tragedy is that our operating systems are obsolete and have worthless security. Back in the day when UNIX was relevant, a hundred people could use it at the same time on a mainframe with no fear of it breaking; now one person cannot safely use a single computer.
> The real tragedy is that our processors try to win a bit more speed by sacrificing simplicity and therefore increasing the chances of such exploits.
This gets repeated on every thread about speculative execution exploits, but I think people who say this are underestimating how huge the difference would be. I suspect processors without speculative execution would be many times slower, not "a bit".
Unix systems protected against a physical person trying to read or modify another person's files. They did not stop programs run by the user from unexpectedly reading the user's own data, which is now considered unacceptable but was previously the norm.
I got bad news: those systems were NOT secure
> Google refuses to even show search results, a bunch of literal hyperlinks, without javascript being enabled.
DuckDuckGo works fine with no JS.
It mostly works. I get a lot of "No results." and !images doesn't work
What are you doing where you see anything remotely close to double-digit-% gains from disabling spectre mitigations?
If mitigations include disabling SMT and the workload is compiling code, then the difference is easily in double digits.
What OS ships the mitigation of disabling SMT by default? Surely they just meant things like the retpoline mitigations in syscalls?
my specific use case where I see significant performance improvement is image segmentation pipelines (which involve opencv-style image processing and AI inference). YMMV depending on your CPU I suppose.
Video editing maybe? Which is not going to involve running untrusted code.
It's not going to hammer on syscalls, either, so it won't have any spectre-related regressions.
Where and how do you disable these mitigations?
Not sure about Windows, but on Linux I used: https://unix.stackexchange.com/a/554922
> double-digit-%
In the early days there was a ~10% hit, but that's changed a lot since then.
It depends on the CPU, I think. The most dramatic improvement I've seen is 20-30%+ in Python run times for numpy/pytorch-heavy workloads on c2-standard-16 VMs in GCP with Spectre mitigations disabled.
I’d highly recommend reading Flush+Reload first, since the cache side channel is key to any of these microarchitectural attacks.
As someone who followed a course on all of this, this is indeed how we started out.
1. Read Flush + Reload
2. Then reproduce it in C (a rough sketch of that step follows below)
3. Then read Meltdown
4. Reproduce it in C
5. Read Spectre
6. Reproduce it in C
After that we had to implement a VUSEC paper. I chose GLitch [1].
[1] https://www.vusec.net/projects/glitch/
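For step 2, a minimal Flush+Reload sketch in C for x86 (my own illustration, assuming clflush and rdtscp are available and that a hit/miss threshold has been calibrated; on Apple silicon you would need a different timer and eviction strategy):

    #include <stdint.h>
    #include <x86intrin.h>   /* _mm_clflush, __rdtscp */

    /* Time a single access; a fast reload means the line was already cached,
       i.e. the victim touched it between our flush and our reload. */
    static uint64_t reload_time(const uint8_t *addr) {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*(volatile const uint8_t *)addr;   /* the timed access */
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }

    /* One Flush+Reload round against a cache line shared with the victim,
       e.g. a page of a shared library mapped into both processes. */
    int victim_touched_line(const uint8_t *shared_line, uint64_t threshold) {
        _mm_clflush(shared_line);                      /* 1. flush            */
        /* 2. ...window in which the victim may run... */
        return reload_time(shared_line) < threshold;   /* 3. reload and time  */
    }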
Keep in mind that getting Meltdown to work might be very difficult depending on your setup. I wouldn't have been able to, at least when starting out; my teacher didn't provide us with targetable hardware.
Spectre attacks (particularly RSB-based ones) are nice to start out with, imo.
Yea, fair, this is obviously a high-level overview. I think I found with Meltdown that I needed the assembly code. I was also able to reproduce it with actual C code, if I recall correctly, but that was way more finicky.
Hm... as I read it this is much worse. Spectre/Meltdown were data isolation vulnerabilities. You could exploit side channel (mostly timing) information to intuit state about memory across a protection boundary. Basically you can prime the CPU state to allow you to tell which of multiple code paths the kernel/hypervisor/whatever took, and then go from there to reading arbitrary data. Which is bad, obviously.
Here, they claim to have a remote exploit vulnerability. It's not that Apple is leaking data, it's that the CPUs have a bug where they appear to be actually executing code based on incorrectly-loaded ("predicted") data.
Though the details are, as is usually the case, thin. I await further analysis.
A browser-based attack, in theory, could have happened with Spectre/Meltdown as well. I seem to recall a PoC for Spectre in the browser, actually. I believe it's also a reason that microsecond precision in the browser was made a bit more opaque since that era.
GLitch was a Rowhammer browser based attack [1]. It's not Spectre/Meltdown but still, for a while people thought it couldn't be done.
[1] https://www.vusec.net/projects/glitch/
They’re speculatively executing code. It’s not traditional code execution. (You can, of course, read the papers for full details.)
It's not remote code execution; it's the same flavor of "out of bounds read through speculation" as previous vulnerabilities. It's terrifying because they have a working proof of concept from untrusted JS in Safari, but there have been speculative execution attacks in browser JS engines before now as well.
The language seems to argue otherwise: SLAP "allows the adversary to jump the LAP to the target webpage's string and trick the CPU into operating on it" and FLOP "allows us to run a function with the wrong arguments". That's absolutely not mere data exfiltration.
Now, maybe this is because of a trampoline based on pre-existing Safari bugs and not the CPU misfeature itself. Again, the details are slim.
But "the same flavor of vulerability" seems to be a mischaracterization.
My read: the attack gets 600 cycles of CPU time to execute its code (JITted Javascript, in web context) on the speculated data, and to use some side channel to communicate results back out of the speculated parallel-world.
Some of the earlier speculation attacks didn't get to do arbitrary compute on the speculated data, they could only for example influence whether something was loaded into cache or not.
This makes an attack easier to write.
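For comparison, the classic Spectre v1 shape in C (just an illustration of the general pattern, not the SLAP/FLOP gadget; names are made up): the mispredicted bounds check opens a speculative window in which the secret byte is read and encoded into cache state, and Flush+Reload-style timing recovers it afterwards.

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    size_t  array1_size = 16;
    uint8_t probe_lines[256 * 4096];   /* one cache line per possible byte value */

    /* If the bounds check is mispredicted for an out-of-bounds x, both loads
       still execute speculatively: the secret byte is read, and its value is
       "transmitted" by pulling one line of probe_lines into the cache. The
       attacker afterwards times loads of probe_lines[v * 4096] for every v
       (Flush+Reload again); the fast one reveals the byte. */
    void gadget(size_t x) {
        if (x < array1_size) {
            uint8_t secret = array1[x];
            (void)*(volatile uint8_t *)&probe_lines[(size_t)secret * 4096];
        }
    }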
Only speculatively. The end result is the same as not executing that code, other than observable side effects.
No, they’re correct.
This introduced me to the idea of load value predictors. Is Apple the only chip designer using these in a commercially released microarchitecture?
In many CPU ISAs, load value predictors are unlikely to be useful, because they cannot guess the value that will be loaded with an acceptable probability.
The ARM ISA and also other ISAs with fixed-length instruction encoding are an exception. Because they have a fixed instruction length, typically of 32 bits, most constants cannot be embedded in the instruction encoding.
As a workaround, when programming for such ISAs, the constants are stored in constant pools that are close to the code for the function that will use them, and the load instructions load the constants using program-counter-relative addressing.
Frequently such constants must be reloaded from the constant pool, which allows the load value predictor to predict the value based on previous loads from the same relative address (a small example is sketched below).
In contrast with the Apple ARM CPUs, for x86-64 CPUs it is very unlikely that a load value predictor can be worthwhile, because the constants are immediate values that are loaded directly into registers or are directly used as operands. There is no need for constants stored outside the function code, which may be reloaded multiple times, enabling prediction.
All fast CPUs can forward the stored data from the store buffer to subsequent loads from the same address, instead of waiting for the store to be completed in the external memory. This is not load value prediction.
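To make the constant-pool point concrete, a small hedged example (exact code generation depends on the compiler and target):

    #include <stdint.h>

    /* A 64-bit constant with no compact encoding in one 32-bit instruction
       word. x86-64 encodes it directly as an immediate (a single movabs), so
       there is nothing for a load value predictor to learn. A fixed-width ISA
       either synthesizes it from several move/shift instructions or, classically
       on 32-bit ARM, loads it PC-relative from a literal pool next to the code;
       that repeated, always-identical load is exactly what a value predictor
       can learn. */
    uint64_t magic(void) {
        return 0x0123456789ABCDEFULL;
    }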
> for x86-64 CPUs it is very unlikely that a load value predictor can be worthwhile
I think you're making a good point about immediate encodings probably making ARM code more amenable to LVP, but I'm not sure I totally buy this statement.
If you take some random x86 program, chances are there are still many loads that are very very predictable. There's a very recent ISCA'24 paper[^1] about this (which also happens to be half-attributed to authors from Intel PARL!):
> [...] we first study the static load instructions that repeatedly fetch the same value from the same load address across the entire workload trace. We call such a load global-stable.
> [..] We make two key observations. First, 34.2% of all dynamic loads are global-stable. Second, the fraction of global-stable loads are much higher in Client, Enterprise, and Server workloads as compared to SPEC CPU 2017 workloads.
[^1]: https://arxiv.org/pdf/2406.18786
Unfortunately what you say is true for many legacy programs, but it is a consequence of the programs not being well structured by the programmer, or not being well optimized by the compiler, or of a defect of the ISA other than the lack of big immediate constants.
Some of the global-stable values are reloaded because the ISA does not provide enough explicitly-addressable registers, despite the fact that a modern CPU core may have 10 times to 20 times more available registers, which could be used to store the global-stable values.
This is one of the reasons why Intel wants to double the number of general-purpose directly addressable registers from 16 to 32 in the future Diamond Rapids CPU (the APX ISA extension).
In other cases the code is not well structured and it tests repeatedly some configuration options, which could be avoided by a proper partitioning of the code paths, where slow tests would be avoided and the execution time would be reduced, even at the price of a slight code size expansion (similarly to the effect of function inlining or loop unrolling).
Sometimes the use of such global-stable values could have been avoided even by moving at compile time the evaluation of some expressions, possibly combined with dynamic loading of some executable objects that had been compiled for different configurations.
So I have seen many cases of such global-stable values being used, even for CPU ISAs that do not force their use, but almost none of them were justified. Improving such programs at programming time or at compile time would have resulted in greater performance improvements, which would have been obtained with less energy consumption, than implementing a load-value predictor in the CPU.
I think you're under-estimating the amount of pointer chasing that lots of types of code has to do. B-Tree traversal for filesystems, mark loops for garbage collection, and sparse graph traversal are all places where you're doing a lot of pointer chasing.
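A sketch of the kind of load chain meant here (plain C, nothing Apple-specific; names are made up):

    #include <stddef.h>

    struct node {
        struct node *next;
        int          marked;
    };

    /* GC-style mark loop / list walk: the address of each load is the value
       returned by the previous one, so every hop normally waits out a full
       cache miss. A load value predictor that guesses p->next correctly lets
       the next iteration start speculatively instead of stalling. */
    void mark_all(struct node *p) {
        while (p != NULL) {
            p->marked = 1;
            p = p->next;   /* serialized, latency-bound load chain */
        }
    }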
Thank you, fantastic answer.
I do wonder if there are other common code patterns that a practical LVP could exploit. One that comes to mind immediately is values that are effectively constant at one remove: think processing a large array of structs with long runs of identical values for some little-used parameter field, or large bitmasks that are nearly all 0xFF or 0x00.
Probably not, but I don't think anyone has talked about it explicitly.
Otherwise, there are known examples of related-but-less-aggressive optimizations for resolving loads early. I'm pretty sure both AMD[^1] and Intel[^2] have had predictive store-to-load forwarding.
edit: Just noticed the FLOP paper also has a nice footnote about distinguishing LVP from forwarding during testing (i.e. you want to drain your store queue)!
[^1]: https://www.amd.com/content/dam/amd/en/documents/processor-t...
[^2]: https://www.intel.com/content/www/us/en/developer/articles/t...
> I'm pretty sure both AMD[^1] and Intel[^2] have had predictive store-to-load forwarding.
IIRC this was how Spectre Variant 4 worked.
From doing some work on GC a couple of years ago: at that time, Apple was the only one with it. The performance is awesome; it makes graph traversal ~2x faster.
Seems like speculative execution is just fundamentally insecure. With SPECTRE/MELTDOWN mitigations, doesn't CPU performance drop below the same CPU performance with no branch prediction at all? Should we move back to CISC? Or maybe VLIW?
I don't think so; speculative execution is the cornerstone of modern CPU performance. Even 15-year-old 32-bit ARM CPUs do it. The only phone/PC-grade processors without it are the first generation of Intel Atom, and I recall that early Atom processors sacrificed a ton of performance to keep power consumption low. I doubt this will change since mitigations are "good enough" to patch over major issues.
Maybe the boomers were right and we made computers way too complex? This might be a bit of hyperbole, but it seems like there will always be a security hole (even if mostly hard to exploit). But I also guess we can't get much faster without it. So maybe we should reduce complexity, at least for safety-critical systems.
Now wait until the zoomers come along and take the lead on these products. They grew up with iPads and no file system. It’s going to be chaos!
Boomers grew up without a filesystem too and things seem to have worked out fine.
There is the extremely popular Cortex-A53, which is an in-order core.
Yes and it's very slow as a result. In-order cores without speculative execution can't be fast. Not unless you have no memory and only operate out of something equivalent to L1 cache.
Memory is slow. Insanely slow (compared to the CPU). You can process stupid fast if your entire working set can fit in a 2KB L1 cache, but the second you touch memory you're hosed. You can't hide memory latency without out-of-order execution and/or SMT. You fundamentally need to be parallel to hide latency. CPUs do it with out-of-order and speculative execution. GPUs do it by being stupidly parallel and running something like 32-64 way SMT (huge simplification). Many high-performance CPUs do all of these things.
Instruction level parallelism is simply not optional with the DRAM latency we have.
The Cortex-A53 may be slow, but it's fast enough for very many tasks. Once you design your data structures to fit the L1/L2 caches, it actually is pretty damn fast. The best part of cache-aware data structure design is that it also makes code run faster on out-of-order CPUs. The A53 is of course slow if you use modern layer-upon-layer-ware as your architecture.
But I was really just trying to point out that in-order CPUs are still around; they did not disappear with the in-order Atom.
> SPECTRE/MELTDOWN mitigations, doesn't CPU performance drop below the same CPU performance with no branch prediction at all?
No. Processors with no branch prediction would be many times slower (up to dozens of times slower on workloads that don’t fit in cache)
>speculative execution is just fundamentally insecure
I don't think it's inevitable; it might be caused by greed. You could have small separate sections of cache (or an additional tag) dedicated to per-thread speculation, but no designer is willing to sacrifice real estate for something that will be thrown away instantly.
> This research was supported by the Air Force Office of Scientific Research (AFOSR)
I wonder if this is the kind of grant that is no longer being funded (or at least "paused")
I'd really be curious to recreate this, or to know whether any of their results involved private browsing mode.
I only bring it up because one of the reasons I use Safari with private browsing as a default is that, if I log in to a site like Facebook in one tab, then open a new private tab in the same window and go to Facebook, it will not recognize that I'm already logged in from the other tab. Neither Chrome nor Firefox does that.
Is the statement from Apple just PR or is this not a usable exploit?
"Based on our analysis, we do not believe this issue poses an immediate risk to our users."
https://www.bleepingcomputer.com/news/security/new-apple-cpu...
CPU vendors always say this when an exploit is published before they mitigate.
Sometimes they mean "no we don't think it's exploitable", sometimes the charitable reading is "we don't think anyone is exploiting this and we think developing an exploit will take quite some time".
Unfortunately they never reveal exactly what they mean. This is very annoying, because when it's the former case, they're often right! Security researchers publish bullshit sometimes. But the vendors basically leave you to figure it out for yourself.
And from the paper, it seems like they played it interestingly in the researchers' direction as well:
"1.2. Responsible Disclosure
We disclosed our results to Apple on May 24, 2024. Apple’s Product Security Team have acknowledged our report and proof-of-concept code, requesting an extended embargo beyond the 90-day window. At the time of writing, Apple did not share any schedule regarding mitigation plans concerning the results presented in this paper. "
They carefully added “immediate”.
Given that the researchers published working exploits that you can modify for your own use, it’s PR.
>statement from Apple just PR
Remember the iPhone 6 battery and butterfly keyboard gates; both were a "small number of users" according to Apple.
Apple PR waving it off like this is unlike them.
OK, fun. What can we do to mitigate this until it gets patched?
Serious answer, don't use Safari. Use a browser that properly separates webpages into isolated processes so that this kind of cross-site read is not possible.
There are no other browsers on iPhone. Every iPhone browser is a reskin of Safari. They’re in theory supposed to allow other browsers in the EU, but AFAIK it has not happened yet.
Then don't use an iPhone until it is patched.
What about turn JS off on your favourite iOS browser?
No need to turn JS off. Turn on Lockdown Mode, which disables the JavaScript JIT and WASM; that might be enough.
It’s not.
That wouldn't prevent possible malware apps using WKWebView from getting out of the jail they are running in, right?
Yes, I agree.
However I also expect that Swift-compiled apps can do this without a web browser component.
It’s a different threat model though, having installed a malicious app vs browsing a malicious site.
Which, alongside telemetry, is the reason I tend to favor using websites over apps.
Having said that, there are apps that are considered mainstream and not malicious by the general population but can become a convenient backdoor for, say, a state actor.
So could this hypothetically open a mail client on your iPhone and read your emails?
No, it doesn’t do cross-address space attacks.
God I hate Apple sometimes
[flagged]
Was this comment so important you had to make a new account for it?
Will that work? Isn't memory treated in a unified way between processes, at some point?
Processors are not supposed to speculate across ASIDs
It will work unless someone forgets to add a public suffix into the public suffix list (as described in the FLOP paper). Both of these attacks target virtual memory pointers.
From the FAQ:
> While FLOP has an actionable mitigation, implementing it requires patches from software vendors and cannot be done by users. Apple has communicated to us that they plan to address these issues in an upcoming security update, hence it is important to enable automatic updates and ensure that your devices are running the latest operating system and applications.
I wonder if Lockdown Mode would help?
IIRC, it disables jit and webassembly, so i think yes
Related is Casey Muratori's explanation of the GoFetch speculative attack on Apple's M-series CPUs:
https://www.youtube.com/watch?v=uZEBkOrfUzM
I have a penchant for disabling JS by default on untrusted sites. It's basically someone else's program that we run on our machine, and apparently we can't do sandboxes properly yet.
If I understand correctly, this also affects Asahi Linux, right?
Yes. They even developed their techniques on Asahi Linux, as it gave better access to CPU controls. The exploit was tested on Safari (i.e. it's not a Linux fault but a CPU one).
It’s a processor attack so yes
Can browser side channel attacks be made less effective by running another compute/branch heavy process?
Mine shitcoins in a web worker for extra protection?
Sure. Why not? It'd tick a couple of the boxes to possibly lower the exploit bitrate...
"Side channel attacks are a class of exploit that infers secrets by measuring manifestations such as timing, sound, and power consumption"
I bet you could construct a hard proof that any kind of speculation is insecure in the sense that it cannot be proven secure.
If that's not true, then someone's going to figure out exactly how to set bounds on safe speculation and that will become part of future CPU designs.
So this attack is still possible even with lockdown mode on?
Yes.
why?
It’s a processor bug
Great post, so interesting!
Bizarre that the M1 is immune to both; I'm more secure by not upgrading. (Sure, there are still a few others, but they are mostly minor by comparison, or newer chips are also affected.)
Newer CPUs use more and more "hacks" - out of order execution, caching, speculative execution, branch prediction, etc - to gain performance improvements. The further back you go, the less vulnerable CPUs generally are to these (but possibly more vulnerable to other kinds of attacks).
But M1 is squarely a modern CPU. It uses all the techniques you mention (as does every high-performance CPU since the Pentium Pro era).
For sure. But the further you go along, the _more_ of these tricks it uses and relies upon to improve performance. New vulnerabilities that are discovered are likely to take advantage of some feature on the spectrum of these “hacks”, from new to old. Because older “hacks” are well-known and studied, newly discovered vulnerabilities are more likely to target features in the newer end of that spectrum (AKA newer CPUs).
Does it have a Load Value Predictor?
No.
I don't know.
I can’t imagine taking a CPU design class in this more cynical era. So much of the speed these days seems to come down to distilling smoke and mirrors into physical form.
On the other hand, research into finding these vulnerabilities seems to be booming. Though presumably, so is the grey-hat market.
Indeed, my Apple Watch (series 3) has always been immune to all spectre type attacks, the CPU is too simple. It doesn’t do speculative execution at all.
The M1 doesn't have load address or value predictors. It's less sophisticated, and so has a smaller microarchitectural attack surface.
It has a power button, which effectively reduces the microarchitectural attack surface to zero.
That's vulnerable to DMP-based side channel attacks though (like GoFetch [1]), which you can only protect against in software [2] on the M3 and beyond.
[1] https://gofetch.fail/
[2] https://developer.apple.com/documentation/xcode/writing-arm6...
This isn't just limited to Safari. I wonder if Apple will fix this more thoroughly than Spectre and Meltdown.
Does it affect Gecko-based browsers? Firefox?
Could this attack be used to jailbreak the latest iPhone?
No.
The marketing culture for announcing hardware exploits is so strange to me. The norm seems to be getting a custom domain, logos, demos, an FAQ... why do all this instead of just reporting the exploit and releasing a paper?
Only academics read exploit papers. I don't see anything wrong with releasing the information in a more digestible way if it is something that affects the general populace. I only knew about Heartbleed because of the website. https://heartbleed.com/
Heartbleed et al. demonstrated conclusively that recognition matters; I don't begrudge researchers any technique that increases the relative visibility of their work.
It’s happening in other parts of the research world too: a couple of colleagues of mine were recently talking about a paper we found at a conference last year that had a web page to go along with it, with a domain and fancy graphics and such. For a boring programming languages paper. We concluded this is the modern way to try to jack up your citations by getting noticed for everything but the technical content of the work, which is a bit off-putting.
Getting funding and good job offers is mostly about marketing. Even worse, lots of people controlling the purse strings aren't domain experts. In a way, it's no different from getting published in specific high-profile publications or attending specific universities.
It's a recent trend basically since Heartbleed had a cool name and lots of press. Why would you not want your exploit to be well known and to get lots of credit for it? If anything it's surprising it didn't happen earlier.
The custom domains can be a little silly, but for all the rest, why not? Logos (and the associated fancy name) are a lot more memorable than CVE-2025-XXXX. Demos are and were always appreciated. FAQs are a lot more digestible for the average reader than a paper.
I know it's kind of goofy, but I don't really see the downside to it.
Blame society. Businesses won't value security unless the fear of getting attacked is sufficiently strong and the losses significant. Otherwise why invest in it at all?
Definitely not just hardware exploits though. Look at heartbleed for example. It's been going on a long time. Hardware exploits are just so much more widely applicable hence the interest to researchers.
It also feels like people who are highly determined to build high-quality, secure software are not valued that much.
It is difficult to prove their effort. One security-related bug wipes out everything, even if it happened only once in 10 years in a million-line code base.
In this case it is a very generic domain name. Maybe a more specific one would be okay, but this one is not.
... and another one...
Good reminder that Marketing is detached from reality.
> As pointed out by iLeakage, Safari lacks Site Isolation
Well, I'm shocked. For a company that promotes security and privacy so much, Apple not having put site isolation into Safari seems amateurish.
Apple promotes privacy, sure. I'm not sure whether they promote security. Of course they are not against security, but I don't remember it being a significant theme in their marketing.
I believe it's implied when they say 'what happens on your iPhone stays on your iPhone'.
You can have security without privacy, but you can't have privacy without security; when they promote privacy, they also implicitly claim to be secure.
Not having site isolation goes against both of these principles.
I consider the world lucky that the Apple apologists haven't emboldened Apple to prevent its customers from using other browsers on their Macs... yet.
A great hack
Tidal
Interesting that the researchers have gone public before a mitigation is in place from Apple. Seems in pretty stark contrast to the industry-wide coordination that went into patching and mitigating spectre.
> We disclosed our results to Apple on May 24, 2024. Apple’s Product Security Team have acknowledged our report and proof-of-concept code, requesting an extended embargo beyond the 90-day window. At the time of writing, Apple did not share any schedule regarding mitigation plans concerning the results presented in this paper.
The vulnerability is over half a year old, and we're more than a quarter past the embargo window.
The LVP vulnerability was reported in September, for what it's worth.
That's still more or less 120 days ago, well over the typical 90-day window.
I'm not saying it's bad (I'm pretty close to a disclosure absolutist), though I don't really know what the norms are for hardware attacks --- both papers are well past the normal coordinated disclosure window for software.
The vendor requested embargo timelines can get pretty crazy.
Intel requested an embargo for 21 months for SRBDS/Crosstalk.
For Downfall, a more recent one, Intel requested a 12 month embargo to release a microcode update.
For all I know, that might be a reasonable ask for these kinds of vulnerabilities --- not that researchers have to honor them.
I wonder if Apple are being slow due to the complexity of the potential fix, or because they're dragging their feet.
As someone who has reported several (software) security vulnerabilities to Apple, I couldn’t care less. Some of the things I reported are trivial to fix, and dealing with their team has been a frustrating experience. They don’t deserve extensions, they need to get their shit together and stop releasing new crap every year. Clearly they can’t do it properly.
Because the fix will most likely kill performance, like the Spectre/Meltdown ones did, and then their pretty graphs won't look so impressive any more.
Or it's a hardware issue and they don't have any way to do a microcode fix for this
It could be unfixable without a significant performance penalty, but at minimum they could make Safari do proper process isolation like every other browser does.
I could easily imagine such a refactor of Safari taking more than 90 days even if Apple made it the highest possible priority.
The bug is open since 2018, so clearly they don’t actually care about that.
https://bugs.webkit.org/show_bug.cgi?id=184466
Right, but that was before they knew about this exploit. My point was that even if they decided they needed to urgently switch to a multiprocess architecture because it's the only way to mitigate this exploit, they might not be done yet.
This class of attacks is not new. Spectre demonstrated this possibility in 2018, and Apple was previously targeted by speculation attacks, e.g. https://gofetch.fail/ or https://ileakage.com/.
> We disclosed SLAP to Apple on May 24, 2024, and FLOP on September 3, 2024
I think also this isn't fundamentally different to Spectre. Spectre introduced a whole new class of vulnerabilities (hence the name) and this is one of them. An impressive one, but still, it definitely doesn't deserve the coordination & secrecy that Spectre had.
Most browsers have already switched to process-per-site because of Spectre.
Does Apple pay bug bounties?
They claim to, but they drag their feet, demand terms most researchers find so unacceptable as to be a bit immoral (the point of "responsible disclosure" isn't, in fact, to hold secrets from the public arbitrarily long), and often end up paying only a fraction of what was expected, if anything.
https://pxlnv.com/linklog/apple-bug-bounty-troubles/
https://www.marketplace.org/shows/marketplace-tech/looking-f...
https://mjtsai.com/blog/2021/07/13/more-trouble-with-the-app...
Your links are all from 2021. I remember there was a lot of criticism at the time, and so they updated their bug bounty program with quite a number of changes:
https://security.apple.com/blog/apple-security-bounty-upgrad...
Would be interesting to see if it made a difference.
The vibe I got talking to people like Mark Dowd about this is that they're running something closer to an exploit bounty program, and it's pretty focused on patterns of vulnerabilities common to some pretty specific threat actors.
Yes, they do.