.. SPDX-License-Identifier: GPL-2.0

================
Page Table Check
================

Introduction
============

Page table check hardens the kernel by preventing some types of memory
corruption.

Page table check performs extra verification at the time new pages become
accessible from userspace, that is, when their page table entries (PTEs, PMDs,
etc.) are added to the page table.

For most detected corruptions, the kernel is crashed. Page table check has a
small performance and memory overhead, so it is disabled by default, but it can
be enabled on systems where the extra hardening outweighs the performance cost.
Also, because page table check is synchronous, it can help with debugging
double-map memory corruption issues: the kernel crashes at the time the wrong
mapping occurs rather than later, which is often the case with memory
corruption bugs.

It can also check page table entries for illegal flag combinations and dump a
warning when one is detected. Currently, userfaultfd is the only user of this
feature, to sanity check the wr-protect bit against any writable flags. An
illegal flag combination does not corrupt data immediately, but it makes
read-only data writable, which leads to corruption when the page content is
later modified.

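
As a rough illustration of the flag check described above, the following
self-contained C sketch reports an entry that is both userfaultfd
write-protected and writable. The bit positions and the ``EX_``/``ex_`` names
are hypothetical; they do not correspond to any architecture's PTE layout or
to the kernel's own helpers.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical flag bits, chosen only for this illustration. */
    #define EX_PTE_WRITE    (UINT64_C(1) << 1)
    #define EX_PTE_UFFD_WP  (UINT64_C(1) << 2)

    /*
     * Return true when the entry carries an illegal combination:
     * write-protected by userfaultfd, yet still marked writable.
     */
    static bool ex_pte_flags_illegal(uint64_t pte)
    {
        return (pte & EX_PTE_UFFD_WP) && (pte & EX_PTE_WRITE);
    }

In this model, an entry with only one of the two bits set is accepted, and only
the combination of both is reported.
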

Double mapping detection logic
==============================

+-------------------+-------------------+-------------------+------------------+
| Current Mapping   | New Mapping       | Permissions       | Rule             |
+===================+===================+===================+==================+
| Anonymous         | Anonymous         | Read              | Allow            |
+-------------------+-------------------+-------------------+------------------+
| Anonymous         | Anonymous         | Read / Write      | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Anonymous         | Named             | Any               | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Named             | Anonymous         | Any               | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Named             | Named             | Any               | Allow            |
+-------------------+-------------------+-------------------+------------------+

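
The rules in the table can be modeled as a small decision function. This is
only a sketch of the table itself, not the kernel's implementation;
``enum ex_map_type`` and ``ex_double_map_allowed()`` are hypothetical names
introduced for this example.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical mapping types, mirroring the table's two categories. */
    enum ex_map_type { EX_MAP_ANON, EX_MAP_NAMED };

    /*
     * Return true when the table allows adding a new mapping of type
     * "incoming" to a page that currently has a mapping of type "current".
     * "writable" is true for Read / Write permissions, false for Read.
     */
    static bool ex_double_map_allowed(enum ex_map_type current,
                                      enum ex_map_type incoming,
                                      bool writable)
    {
        if (current != incoming)
            return false;   /* anonymous mixed with named: prohibit */
        if (current == EX_MAP_NAMED)
            return true;    /* named and named: allow, any permissions */
        return !writable;   /* anonymous and anonymous: read-only only */
    }

For example, ``ex_double_map_allowed(EX_MAP_ANON, EX_MAP_ANON, true)`` returns
false, matching the second row of the table.
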

Enabling Page Table Check
=========================

To enable page table check:

- Build the kernel with PAGE_TABLE_CHECK=y. Note that it can only be enabled on
  platforms where ARCH_SUPPORTS_PAGE_TABLE_CHECK is available.

- Boot with the 'page_table_check=on' kernel parameter.

Optionally, build the kernel with PAGE_TABLE_CHECK_ENFORCED to enable page
table check without the extra kernel parameter.


Implementation notes
====================

We specifically decided not to use VMA information in order to avoid relying on
MM state (except for limited "struct page" info). The page table check is a
state machine, separate from Linux-MM, that verifies that user-accessible pages
are not falsely shared.

PAGE_TABLE_CHECK depends on EXCLUSIVE_SYSTEM_RAM. The reason is that without
EXCLUSIVE_SYSTEM_RAM, users are allowed to map arbitrary physical memory
regions into userspace via /dev/mem. At the same time, pages may change their
properties (e.g., from anonymous pages to named pages) while they are still
mapped in userspace, leading to "corruption" detected by the page table check.

Even with EXCLUSIVE_SYSTEM_RAM, I/O pages may still be mapped via /dev/mem.
However, these pages are always considered named pages, so they won't break the
logic used in the page table check.