Ramoops oops/panic logger
=========================

Sergiu Iordache <sergiu@chromium.org>

Updated: 10 Feb 2021

Introduction
------------

Ramoops is an oops/panic logger that writes its logs to RAM before the system
crashes. It works by logging oopses and panics in a circular buffer. Ramoops
needs a system with persistent RAM so that the content of that area can
survive after a restart.

Ramoops concepts
----------------

Ramoops uses a predefined memory area to store the dump. The memory area is
configured using the following variables:

  * ``mem_address`` for the start
  * ``mem_size`` for the size. The memory size will be rounded down to a
    power of two.
  * ``mem_type`` to specify the memory type (default is pgprot_writecombine).
  * ``mem_name`` to specify a memory region defined by the ``reserve_mem``
    command line parameter.

Typically the default value of ``mem_type=0`` should be used, as that sets the
pstore mapping to pgprot_writecombine. Setting ``mem_type=1`` attempts to use
``pgprot_noncached``, which only works on some platforms. This is because pstore
depends on atomic operations. At least on ARM, pgprot_noncached causes the
memory to be mapped strongly ordered, and atomic operations on strongly ordered
memory are implementation defined, and won't work on many ARMs such as omaps.
Setting ``mem_type=2`` attempts to treat the memory region as normal memory,
which enables full cache on it. This can improve the performance.

The memory area is divided into ``record_size`` chunks (also rounded down to
a power of two) and each kmsg dump writes a ``record_size`` chunk of
information.
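
The rounding and chunking described above can be sketched in plain userspace
C. This is only an illustration, not the driver's code: the helper names
``round_down_pow2()`` and ``dump_record_count()`` are hypothetical, and the
sketch assumes the whole area is used for dump records, ignoring any space
set aside for other purposes (such as a console log).

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical helper: largest power of two that is <= n (0 stays 0). */
    static size_t round_down_pow2(size_t n)
    {
        size_t p = 1;

        if (n == 0)
            return 0;
        while (p <= n / 2)
            p *= 2;
        return p;
    }

    /*
     * Hypothetical helper: how many record_size chunks fit into the area
     * after both values have been rounded down to a power of two.
     * Assumes the whole area holds dump records.
     */
    static size_t dump_record_count(size_t mem_size, size_t record_size)
    {
        size_t area = round_down_pow2(mem_size);
        size_t rec = round_down_pow2(record_size);

        if (rec == 0 || rec > area)
            return 0;
        return area / rec;
    }

Under these assumptions, a 1 MiB area (``0x100000``) with 16 KiB records
(``0x4000``) holds 64 records, and a requested ``record_size`` of ``0x5000``
is treated as ``0x4000``.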

Which kinds of kmsg dumps are stored is controlled via
the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only Panics,
``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
(KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the
``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
otherwise KMSG_DUMP_MAX.
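
The filtering rule above can be illustrated with a small userspace sketch.
The enum below is a local copy of only the three values named in the text,
and ``example_effective_max()`` and ``example_should_store()`` are
hypothetical helpers, not kernel functions. The numeric value of
KMSG_DUMP_MAX is not given here, so the caller passes it in as ``dump_max``.

.. code-block:: c

    #include <stdbool.h>

    /* Local copies of the values named in the text; illustration only. */
    enum example_dump_reason {
        EXAMPLE_DUMP_UNDEF = 0,
        EXAMPLE_DUMP_PANIC = 1,
        EXAMPLE_DUMP_OOPS  = 2,
    };

    /*
     * Hypothetical helper: resolve max_reason == 0 (UNDEF) according to
     * the printk.always_kmsg_dump setting, as described in the text.
     * dump_max stands in for KMSG_DUMP_MAX and is supplied by the caller.
     */
    static int example_effective_max(int max_reason, bool always_kmsg_dump,
                                     int dump_max)
    {
        if (max_reason != EXAMPLE_DUMP_UNDEF)
            return max_reason;
        return always_kmsg_dump ? dump_max : EXAMPLE_DUMP_OOPS;
    }

    /* Hypothetical helper: keep a dump if its reason is within the limit. */
    static bool example_should_store(int reason, int effective_max)
    {
        return reason <= effective_max;
    }

With ``max_reason=1`` only panics pass the check, while ``max_reason=2``
admits both panics and oopses.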

The module uses a counter to record multiple dumps but the counter gets reset
on restart (i.e. new dumps after the restart will overwrite old ones).

Ramoops also supports software ECC protection of persistent memory regions.
This might be useful when a hardware reset was used to bring the machine back
to life (i.e. a watchdog triggered). In such cases, RAM may be somewhat
corrupt, but usually it is restorable.

Setting the parameters
----------------------

The ramoops parameters can be set in several different ways:

 A. Use the module parameters (which have the names of the variables described
    above). For quick debugging, you can also reserve parts of memory during
    boot and then use the reserved memory for ramoops. For example, assuming a
    machine with > 128 MB of memory, the following kernel command line will
    tell the kernel to use only the first 128 MB of memory, and place an
    ECC-protected ramoops region at the 128 MB boundary::

        mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1

 B. Use Device Tree bindings, as described in
    ``Documentation/devicetree/bindings/reserved-memory/ramoops.yaml``.
    For example::

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;

                ramoops@8f000000 {
                        compatible = "ramoops";
                        reg = <0 0x8f000000 0 0x100000>;
                        record-size = <0x4000>;
                        console-size = <0x4000>;
                };
        };

 C. Use a platform device and set the platform data. The parameters can then
    be set through that platform data. An example of doing that is:

    .. code-block:: c

        #include <linux/platform_device.h>
        #include <linux/pstore_ram.h>
        [...]

        static struct ramoops_platform_data ramoops_data = {
                .mem_size       = <...>,
                .mem_address    = <...>,
                .mem_type       = <...>,
                .record_size    = <...>,
                .max_reason     = <...>,
                .ecc            = <...>,
        };

        static struct platform_device ramoops_dev = {
                .name = "ramoops",
                .dev = {
                        .platform_data = &ramoops_data,
                },
        };

        [... inside a function ...]
        int ret;

        ret = platform_device_register(&ramoops_dev);
        if (ret) {
                printk(KERN_ERR "unable to register platform device\n");
                return ret;
        }

 D. Use a region of memory reserved via the ``reserve_mem`` command line
    parameter. The address and size will be defined by the ``reserve_mem``
    parameter. Note that ``reserve_mem`` may not always allocate memory
    in the same location, and cannot be relied upon. Testing will need
    to be done, and it may not work on every machine, nor every kernel.
    Consider this a "best effort" approach. The ``reserve_mem`` option
    takes a size, alignment and name as arguments. The name is used
    to map the memory to a label that can be retrieved by ramoops.
    For example::

        reserve_mem=2M:4096:oops ramoops.mem_name=oops

You can specify either RAM memory or peripheral devices' memory. However, when
specifying RAM, be sure to reserve the memory by issuing memblock_reserve()
very early in the architecture code, e.g.::

        #include <linux/memblock.h>

        memblock_reserve(ramoops_data.mem_address, ramoops_data.mem_size);

Dump format
-----------

The data dump begins with a header, currently defined as ``====`` followed by a
timestamp and a new line. The dump then continues with the actual data.
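
A reader of such a record might locate the start of the data as in the
following userspace sketch. The helper ``example_skip_dump_header()`` is
hypothetical. It relies only on what is stated above (a ``====`` marker, then
text up to a newline) and does not try to interpret the timestamp, whose
exact layout is not described here.

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /*
     * Hypothetical helper: return a pointer to the data that follows the
     * header line, or NULL if buf does not start with "====" or if no
     * newline terminates the header. The timestamp text is not parsed.
     */
    static const char *example_skip_dump_header(const char *buf, size_t len)
    {
        static const char marker[] = "====";
        const size_t marker_len = sizeof(marker) - 1;
        const char *nl;

        if (len < marker_len || memcmp(buf, marker, marker_len) != 0)
            return NULL;

        nl = memchr(buf + marker_len, '\n', len - marker_len);
        if (!nl)
            return NULL;

        return nl + 1;
    }

The returned pointer stays inside ``buf``; the caller is expected to derive
the remaining length from it.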

Reading the data
----------------

The dump data can be read from the pstore filesystem. The format for these
files is ``dmesg-ramoops-N``, where N is the record number in memory. To delete
a stored record from RAM, simply unlink the respective pstore file.
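
A tool that walks the pstore directory could recognise these files by name,
as in the following userspace sketch. The helper ``example_record_number()``
is hypothetical and accepts only the exact ``dmesg-ramoops-N`` form given
above; names with anything after the number are rejected.

.. code-block:: c

    #include <stdio.h>

    /*
     * Hypothetical helper: extract N from a name of the form
     * "dmesg-ramoops-N". Returns -1 if the name does not match.
     */
    static int example_record_number(const char *name)
    {
        int n;
        char extra;

        /* The trailing %c only matches if characters follow the number. */
        if (sscanf(name, "dmesg-ramoops-%d%c", &n, &extra) != 1 || n < 0)
            return -1;
        return n;
    }

A file recognised this way can then be removed with ``unlink()`` to delete
the stored record, as described above.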

Persistent function tracing
---------------------------

Persistent function tracing might be useful for debugging software or hardware
related hangs. The function call chain log is stored in a ``ftrace-ramoops``
file. Here is an example of usage::

 # mount -t debugfs debugfs /sys/kernel/debug/
 # echo 1 > /sys/kernel/debug/pstore/record_ftrace
 # reboot -f
 [...]
 # mount -t pstore pstore /mnt/
 # tail /mnt/ftrace-ramoops
 0 ffffffff8101ea64 ffffffff8101bcda native_apic_mem_read <- disconnect_bsp_APIC+0x6a/0xc0
 0 ffffffff8101ea44 ffffffff8101bcf6 native_apic_mem_write <- disconnect_bsp_APIC+0x86/0xc0
 0 ffffffff81020084 ffffffff8101a4b5 hpet_disable <- native_machine_shutdown+0x75/0x90
 0 ffffffff81005f94 ffffffff8101a4bb iommu_shutdown_noop <- native_machine_shutdown+0x7b/0x90
 0 ffffffff8101a6a1 ffffffff8101a437 native_machine_emergency_restart <- native_machine_restart+0x37/0x40
 0 ffffffff811f9876 ffffffff8101a73a acpi_reboot <- native_machine_emergency_restart+0xaa/0x1e0
 0 ffffffff8101a514 ffffffff8101a772 mach_reboot_fixups <- native_machine_emergency_restart+0xe2/0x1e0
 0 ffffffff811d9c54 ffffffff8101a7a0 __const_udelay <- native_machine_emergency_restart+0x110/0x1e0
 0 ffffffff811d9c34 ffffffff811d9c80 __delay <- __const_udelay+0x30/0x40
 0 ffffffff811d9d14 ffffffff811d9c3f delay_tsc <- __delay+0xf/0x20