CVE-2024-35877

In the Linux kernel, the following vulnerability has been resolved:

x86/mm/pat: fix VM_PAT handling in COW mappings

PAT handling won't do the right thing in COW mappings: the first PTE (or, in fact, all PTEs) can be replaced during write faults to point at anon folios. Reliably recovering the correct PFN and cachemode using follow_phys() from PTEs will not work in COW mappings.

Using follow_phys(), we might just get the address+protection of the anon folio (which is very wrong), or fail on swap/nonswap entries, failing follow_phys() and triggering a WARN_ON_ONCE() in untrack_pfn() and track_pfn_copy(), not properly calling free_pfn_range().

In free_pfn_range(), we either wouldn't call memtype_free() or would call it with the wrong range, possibly leaking memory.

To fix that, let's update follow_phys() to refuse returning anon folios, and fallback to using the stored PFN inside vma->vm_pgoff for COW mappings if we run into that.

We will now properly handle untrack_pfn() with COW mappings, where we don't need the cachemode. We'll have to fail fork()->track_pfn_copy() if the first page was replaced by an anon folio, though: we'd have to store the cachemode in the VMA to make this work, likely growing the VMA size.

For now, lets keep it simple and let track_pfn_copy() just fail in that case: it would have failed in the past with swap/nonswap entries already, and it would have done the wrong thing with anon folios.

Simple reproducer to trigger the WARN_ON_ONCE() in untrack_pfn():

<--- C reproducer ---> #include <stdio.h> #include <sys/mman.h> #include <unistd.h> #include <liburing.h>

int main(void) { struct io_uring_params p = {}; int ring_fd; size_t size; char *map;

     ring_fd = io_uring_setup(1, &p);
     if (ring_fd < 0) {
             perror("io_uring_setup");
             return 1;
     }
     size = p.sq_off.array + p.sq_entries * sizeof(unsigned);

     /* Map the submission...

CVE-2024-35877

Details

References

Ecosystems

Timeline