.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
.. See the bottom of this file for additional redistribution information.

Reporting issues
++++++++++++++++


The short guide (aka TL;DR)
===========================

Are you facing a regression with vanilla kernels from the same stable or
longterm series? One still supported? Then search the `LKML
<https://lore.kernel.org/lkml/>`_ and the `Linux stable mailing list
<https://lore.kernel.org/stable/>`_ archives for matching reports to join. If
you don't find any, install `the latest release from that series
<https://kernel.org/>`_. If it still shows the issue, report it to the stable
mailing list (stable@vger.kernel.org) and CC the regressions list
(regressions@lists.linux.dev); ideally also CC the maintainer and the mailing
list for the subsystem in question.

In all other cases try your best to guess which kernel part might be causing
the issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its
developers expect to be told about problems, which most of the time will be by
email with a mailing list in CC. Check the destination's archives for matching
reports; search the `LKML <https://lore.kernel.org/lkml/>`_ and the web, too.
If you don't find any to join, install `the latest mainline kernel
<https://kernel.org/>`_. If the issue is present there, send a report.

The issue was fixed there, but you would like to see it resolved in a still
supported stable or longterm series as well? Then install its latest release.
If it shows the problem, search for the change that fixed it in mainline and
check if backporting is in the works or was discarded; if it's neither, ask
those who handled the change for it.

**General remarks**: When installing and testing a kernel as outlined above,
ensure it's vanilla (IOW: not patched and not using add-on modules). Also make
sure it's built and running in a healthy environment and not already tainted
before the issue occurs.

If you are facing multiple issues with the Linux kernel at once, report each
separately. While writing your report, include all information relevant to the
issue, like the kernel and the distro used. In case of a regression, CC the
regressions mailing list (regressions@lists.linux.dev) on your report. Also try
to pinpoint the culprit with a bisection; if you succeed, include its
commit-id and CC everyone in the Signed-off-by chain.

Once the report is out, answer any questions that come up and help where you
can. That includes keeping the ball rolling by occasionally retesting with newer
releases and sending a status update afterwards.

Step-by-step guide how to report issues to the kernel maintainers
=================================================================

The above TL;DR outlines roughly how to report issues to the Linux kernel
developers. It might be all that's needed for people already familiar with
reporting issues to Free/Libre & Open Source Software (FLOSS) projects. For
everyone else there is this section. It is more detailed and uses a
step-by-step approach. It still tries to be brief for readability and leaves
out a lot of details; those are described below the step-by-step guide in a
reference section, which explains each of the steps in more detail.

Note: this section covers a few more aspects than the TL;DR and does things in
a slightly different order. That's in your interest, to make sure you notice
early if an issue that looks like a Linux kernel problem is actually caused by
something else. These steps thus help to ensure the time you invest in this
process won't feel wasted in the end:

 * Are you facing an issue with a Linux kernel a hardware or software vendor
   provided? Then in almost all cases you are better off if you stop reading
   this document and report the issue to your vendor instead, unless you are
   willing to install the latest Linux version yourself. Be aware the latter
   will often be needed anyway to hunt down and fix issues.

 * Perform a rough search for existing reports with your favorite internet
   search engine; additionally, check the archives of the `Linux Kernel Mailing
   List (LKML) <https://lore.kernel.org/lkml/>`_. If you find matching reports,
   join the discussion instead of sending a new one.

 * See if the issue you are dealing with qualifies as a regression, security
   issue, or a really severe problem: those are 'issues of high priority' that
   need special handling in some steps that are about to follow.

 * Make sure it's not the kernel's surroundings that are causing the issue
   you face.

 * Create a fresh backup and put system repair and restore tools at hand.

 * Ensure your system does not enhance its kernels by building additional
   kernel modules on-the-fly, which solutions like DKMS might be doing locally
   without your knowledge.

 * Check if your kernel was 'tainted' when the issue occurred, as the event
   that made the kernel set this flag might be causing the issue you face.

 * Write down coarsely how to reproduce the issue. If you deal with multiple
   issues at once, create separate notes for each of them and make sure they
   work independently on a freshly booted system. That's needed, as each issue
   needs to get reported to the kernel developers separately, unless they are
   strongly entangled.

 * If you are facing a regression within a stable or longterm version line
   (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
   'Reporting regressions within a stable and longterm kernel line'.

 * Locate the driver or kernel subsystem that seems to be causing the issue.
   Find out how and where its developers expect reports. Note: most of the
   time this won't be bugzilla.kernel.org, as issues typically need to be sent
   by mail to a maintainer and a public mailing list.

 * Search the archives of the bug tracker or mailing list in question
   thoroughly for reports that might match your issue. If you find anything,
   join the discussion instead of sending a new report.

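One of the preparation steps above asks whether the kernel was 'tainted' when
the issue occurred. As a rough illustration only, the following Python sketch
decodes the number the kernel exposes in ``/proc/sys/kernel/tainted`` into the
flag letters described in Documentation/admin-guide/tainted-kernels.rst. The
function names ``decode_taint`` and ``read_taint`` are made up for this
example and are not part of any kernel tooling.

```python
from pathlib import Path

# Bit position -> (flag letter, meaning), following the table in
# Documentation/admin-guide/tainted-kernels.rst.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "kernel running on an out of specification system"),
    3: ("R", "module was force unloaded"),
    4: ("M", "processor reported a Machine Check Exception"),
    5: ("B", "bad page referenced or unexpected page flags"),
    6: ("U", "taint requested by userspace application"),
    7: ("D", "kernel died recently (Oops or BUG)"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued warning"),
    10: ("C", "staging driver was loaded"),
    11: ("I", "workaround for bug in platform firmware applied"),
    12: ("O", "externally-built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel has been live patched"),
    16: ("X", "auxiliary taint, used by distros"),
    17: ("T", "kernel built with the struct randomization plugin"),
}


def decode_taint(value):
    """Return 'LETTER: meaning' strings for every taint bit set in value."""
    return [
        f"{letter}: {meaning}"
        for bit, (letter, meaning) in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value; 0 means the kernel is not tainted."""
    return int(Path(path).read_text().strip())
```

On a running Linux system, ``decode_taint(read_taint())`` returning an empty
list means the kernel is not tainted at that moment; any entries name the
events worth looking into before reporting.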
After these preparations you'll now enter the main part:

 * Unless you are already running the latest 'mainline' Linux kernel, better
   go and install it for the reporting process. Testing and reporting with
   the latest 'stable' Linux can be an acceptable alternative in some
   situations; during the merge window that actually might be even the best
   approach, but in that development phase it can be an even better idea to
   suspend your efforts for a few days anyway. Whatever version you choose,
   ideally use a 'vanilla' build. Ignoring this advice will dramatically
   increase the risk your report will be rejected or ignored.

 * Ensure the kernel you just installed does not 'taint' itself when
   running.

 * Reproduce the issue with the kernel you just installed. If it doesn't show
   up there, scroll down to the instructions for issues only happening with
   stable and longterm kernels.

 * Optimize your notes: try to find and write the most straightforward way to
   reproduce your issue. Make sure the end result has all the important
   details, and at the same time is easy to read and understand for others
   that hear about it for the first time. And if you learned something in this
   process, consider searching again for existing reports about the issue.

 * If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
   decoding the kernel log to find the line of code that triggered the error.

 * If your problem is a regression, try to narrow down when the issue was
   introduced as much as possible.

 * Start to compile the report by writing a detailed description about the
   issue. Always mention a few things: the latest kernel version you installed
   for reproducing, the Linux Distribution used, and your notes on how to
   reproduce the issue. Ideally, make the kernel's build configuration
   (.config) and the output from ``dmesg`` available somewhere on the net and
   link to it. Include or upload all other information that might be relevant,
   like the output/screenshot of an Oops or the output from ``lspci``. Once
   you have written this main part, insert a normal length paragraph on top of
   it outlining the issue and the impact quickly. On top of this add one
   sentence that briefly describes the problem and gets people to read on. Now
   give the thing a descriptive title or subject that yet again is shorter.
   Then you're ready to send or file the report like the MAINTAINERS file told
   you, unless you are dealing with one of those 'issues of high priority':
   they need special care which is explained in 'Special handling for high
   priority issues' below.

 * Wait for reactions and keep the thing rolling until you can accept the
   outcome in one way or the other. Thus react publicly and in a timely manner
   to any inquiries. Test proposed fixes. Do proactive testing: retest with at
   least every first release candidate (RC) of a new mainline version and
   report your results. Send friendly reminders if things stall. And try to
   help yourself, if you don't get any help or if it's unsatisfying.

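The step about compiling the report describes a layering: a short subject, one
sentence to draw readers in, a normal length paragraph, and only then the
detailed description with version, distribution, and reproduction notes. The
Python sketch below merely illustrates that ordering. The function
``compose_report`` and all of its parameter names are hypothetical, and the
field labels it prints are not a format the kernel developers require.

```python
def compose_report(title, teaser, summary, details, kernel_version, distro,
                   repro_steps, links=()):
    """Assemble a plain-text report with the shortest parts first."""
    lines = [
        f"Subject: {title}",      # descriptive, shorter than the teaser
        "",
        teaser,                   # one sentence that gets people to read on
        "",
        summary,                  # normal length paragraph: issue and impact
        "",
        details,                  # the detailed description written first
        "",
        f"Kernel version: {kernel_version}",
        f"Distribution: {distro}",
        "",
        "Steps to reproduce:",
    ]
    lines += [f"  {number}. {step}"
              for number, step in enumerate(repro_steps, 1)]
    if links:
        # Places where .config, dmesg output and similar data were uploaded.
        lines += ["", "Further data (.config, dmesg, lspci output):"]
        lines += [f"  {link}" for link in links]
    return "\n".join(lines) + "\n"
```

Writing the parts in the order the step suggests (details first, subject last)
and letting a helper like this arrange them keeps the final text readable from
top to bottom.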
Reporting regressions within a stable and longterm kernel line
--------------------------------------------------------------

This subsection is for you, if you followed the above process and got sent here
at the point about regressions within a stable or longterm kernel version line.
You face one of those if something breaks when updating from 5.10.4 to 5.10.5
(a switch from 5.9.15 to 5.10.5 does not qualify). The developers want to fix
such regressions as quickly as possible, hence there is a streamlined process
to report them:

 * Check if the kernel developers still maintain the Linux kernel version
   line you care about: go to the `front page of kernel.org
   <https://kernel.org/>`_ and make sure it mentions
   the latest release of the particular version line without an '[EOL]' tag.

 * Check the archives of the `Linux stable mailing list
   <https://lore.kernel.org/stable/>`_ for existing reports.

 * Install the latest release from the particular version line as a vanilla
   kernel. Ensure this kernel is not tainted and still shows the problem, as
   the issue might have already been fixed there. If you first noticed the
   problem with a vendor kernel, check that a vanilla build of the last
   version known to work performs fine as well.

 * Send a short problem report to the Linux stable mailing list
   (stable@vger.kernel.org) and CC the Linux regressions mailing list
   (regressions@lists.linux.dev); if you suspect the cause in a particular
   subsystem, CC its maintainer and its mailing list. Roughly describe the
   issue and ideally explain how to reproduce it. Mention the first version
   that shows the problem and the last version that's working fine. Then
   wait for further instructions.

The reference section below explains each of these steps in more detail.

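The rule stated at the start of this subsection (5.10.4 to 5.10.5 qualifies, a
switch from 5.9.15 to 5.10.5 does not) comes down to comparing the first two
components of the version number. A minimal Python sketch, assuming plain
release versions of the form ``X.Y`` or ``X.Y.Z`` without ``-rc`` or vendor
suffixes; both function names are made up for this illustration:

```python
def parse_version(version):
    """Split 'X.Y' or 'X.Y.Z' into a tuple of three integers."""
    parts = [int(part) for part in version.split(".")]
    if len(parts) == 2:
        parts.append(0)  # 'X.Y' is the first release of the X.Y line
    if len(parts) != 3:
        raise ValueError(f"unexpected version format: {version!r}")
    return tuple(parts)


def is_stable_line_regression(last_good, first_bad):
    """True if both versions belong to the same stable or longterm line
    and the broken one is the newer of the two."""
    good = parse_version(last_good)
    bad = parse_version(first_bad)
    return good[:2] == bad[:2] and good[2] < bad[2]
```

For example, ``is_stable_line_regression("5.10.4", "5.10.5")`` is true, while
the pair ``("5.9.15", "5.10.5")`` crosses version lines and therefore is
handled by the regular reporting process instead.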
Reporting issues only occurring in older kernel version lines
-------------------------------------------------------------

This subsection is for you, if you tried the latest mainline kernel as outlined
above, but failed to reproduce your issue there; at the same time you want to
see the issue fixed in a still supported stable or longterm series or vendor
kernels regularly rebased on those. If that is the case, follow these steps:

 * Prepare yourself for the possibility that going through the next few steps
   might not get the issue solved in older releases: the fix might be too big
   or risky to get backported there.

 * Perform the first three steps in the section "Reporting regressions
   within a stable and longterm kernel line" above.

 * Search the Linux kernel version control system for the change that fixed
   the issue in mainline, as its commit message might tell you if the fix is
   scheduled for backporting already. If you don't find anything that way,
   search the appropriate mailing lists for posts that discuss such an issue
   or peer-review possible fixes; then check the discussions to see whether
   the fix was deemed unsuitable for backporting. If backporting was not
   considered at all, join the newest discussion, asking if it's in the cards.

 * One of the former steps should lead to a solution. If that doesn't work
   out, ask the maintainers for the subsystem that seems to be causing the
   issue for advice; CC the mailing list for the particular subsystem as well
   as the stable mailing list.

The reference section below explains each of these steps in more detail.

Reference section: Reporting issues to the kernel maintainers
=============================================================

The detailed guides above outline all the major steps in brief fashion, which
should be enough for most people. But sometimes there are situations where even
experienced users might wonder how to actually do one of those steps. That's
what this section is for, as it will provide a lot more details on each of the
above steps. Consider this as reference documentation: it's possible to read it
from top to bottom. But it's mainly meant to be skimmed and to serve as a place
to look up details on how to actually perform those steps.

A few words of general advice before digging into the details:

 * The Linux kernel developers are well aware this process is complicated and
   demands more than other FLOSS projects. We'd love to make it simpler. But
   that would require work in various places as well as some infrastructure,
   which would need constant maintenance; nobody has stepped up to do that
   work, so that's just how things are for now.

 * A warranty or support contract with some vendor doesn't entitle you to
   request fixes from developers in the upstream Linux kernel community: such
   contracts are completely outside the scope of the Linux kernel, its
   development community, and this document. That's why you can't demand
   anything such a contract guarantees in this context, not even if the
   developer handling the issue works for the vendor in question. If you want
   to claim your rights, use the vendor's support channel instead. When doing
   so, you might want to mention you'd like to see the issue fixed in the
   upstream Linux kernel; motivate them by saying it's the only way to ensure
   the fix in the end will get incorporated in all Linux distributions.

 * If you never reported an issue to a FLOSS project before, you should
   consider reading `How to Report Bugs Effectively
   <https://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_, `How To Ask
   Questions The Smart Way
   <http://www.catb.org/esr/faqs/smart-questions.html>`_, and `How to ask good
   questions <https://jvns.ca/blog/good-questions/>`_.

With that off the table, find below the details on how to properly report
issues to the Linux kernel developers.

Make sure you're using the upstream Linux kernel
------------------------------------------------

    *Are you facing an issue with a Linux kernel a hardware or software vendor
    provided? Then in almost all cases you are better off if you stop reading
    this document and report the issue to your vendor instead, unless you are
    willing to install the latest Linux version yourself. Be aware the latter
    will often be needed anyway to hunt down and fix issues.*

Like most programmers, Linux kernel developers don't like to spend time dealing
with reports for issues that don't even happen with their current code. It's
just a waste of everybody's time, especially yours. Unfortunately such
situations easily happen when it comes to the kernel and often lead to
frustration on both sides. That's because almost all Linux-based kernels
pre-installed on devices (Computers, Laptops, Smartphones, Routers, …) and most
shipped by Linux distributors are quite distant from the official Linux kernel
as distributed by kernel.org: these kernels from these vendors are often
ancient from the point of view of Linux development or heavily modified, often
both.

Most of these vendor kernels are quite unsuitable for reporting issues to the
Linux kernel developers: an issue you face with one of them might have been
fixed by the Linux kernel developers months or years ago already; additionally,
the modifications and enhancements by the vendor might be causing the issue you
face, even if they look small or totally unrelated. That's why you should report
issues with these kernels to the vendor. Its developers should look into the
report and, in case it turns out to be an upstream issue, fix it directly
upstream or forward the report there. In practice that often does not work out
or might not be what you want. You thus might want to consider circumventing
the vendor by installing the very latest Linux kernel core yourself. If that's
an option for you, move ahead in this process, as a later step in this guide
will explain how to do that once it rules out other potential causes for your
issue.

Note, the previous paragraph starts with the word 'most', as sometimes
developers in fact are willing to handle reports about issues occurring with
vendor kernels. Whether they do in the end depends highly on the developers and
the issue in question. Your chances are quite good if the distributor applied
only small modifications to a kernel based on a recent Linux version; that for
example often holds true for the mainline kernels shipped by Debian GNU/Linux
Sid or Fedora Rawhide. Some developers will also accept reports about issues
with kernels from distributions shipping the latest stable kernel, as long as
it's only slightly modified; that for example is often the case for Arch Linux,
regular Fedora releases, and openSUSE Tumbleweed. But keep in mind, you are
better off using a mainline Linux and avoiding a stable kernel for this
process, as outlined in the section 'Install a fresh kernel for testing' in more
detail.

Obviously you are free to ignore all this advice and report problems with an old
or heavily modified vendor kernel to the upstream Linux developers. But note,
those often get rejected or ignored, so consider yourself warned. But it's still
better than not reporting the issue at all: sometimes such reports directly or
indirectly will help to get the issue fixed over time.

Search for existing reports, first run
--------------------------------------

    *Perform a rough search for existing reports with your favorite internet
    search engine; additionally, check the archives of the Linux Kernel Mailing
    List (LKML). If you find matching reports, join the discussion instead of
    sending a new one.*

Reporting an issue that someone else already brought forward is often a waste of
time for everyone involved, especially you as the reporter. So it's in your own
interest to thoroughly check if somebody reported the issue already. At this
step of the process it's okay to just perform a rough search: a later step will
tell you to perform a more detailed search once you know where your issue needs
to be reported to. Nevertheless, do not hurry with this step of the reporting
process: it can save you time and trouble.

Simply search the internet with your favorite search engine first. Afterwards,
search the `Linux Kernel Mailing List (LKML) archives
<https://lore.kernel.org/lkml/>`_.

If you get flooded with results, consider telling your search engine to limit
the search timeframe to the past month or year. And wherever you search, make
sure to use good search terms; vary them a few times, too. While doing so try to
look at the issue from the perspective of someone else: that will help you to
come up with other words to use as search terms. Also make sure not to use too
many search terms at once. Remember to search with and without information like
the name of the kernel driver or the name of the affected hardware component.
But its exact brand name (say 'ASUS Red Devil Radeon RX 5700 XT Gaming OC')
often is not very helpful, as it is too specific. Instead try search terms like
the model line (Radeon 5700 or Radeon 5000) and the code name of the main chip
('Navi' or 'Navi10') with and without its manufacturer ('AMD').

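The advice on varying search terms can be illustrated with a small Python
sketch that combines a symptom with model lines and chip code names, each with
and without the manufacturer. The function names and the symptom text in the
usage note are invented for this example; the ``q`` query parameter in the
generated link is an assumption about how the lore.kernel.org archive search
can be addressed by URL.

```python
from urllib.parse import quote_plus


def search_variants(symptom, model_lines, code_names, manufacturer):
    """Pair the symptom with each hardware term, with and without the
    manufacturer, while keeping every query short."""
    variants = []
    for hardware in [*model_lines, *code_names]:
        variants.append(f"{symptom} {hardware}")
        variants.append(f"{symptom} {manufacturer} {hardware}")
    return variants


def lore_search_url(query, base="https://lore.kernel.org/lkml/"):
    """Build an archive search link (assumes the archive accepts '?q=')."""
    return f"{base}?q={quote_plus(query)}"
```

Calling ``search_variants("black screen after resume", ["Radeon 5700"],
["Navi10"], "AMD")`` yields four short queries; each of them can be pasted
into a search engine or turned into a link with ``lore_search_url``.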
In case you find an existing report about your issue, join the discussion, as
you might be able to provide valuable additional information. That can be
important even when a fix is prepared or in its final stages already, as
developers might look for people that can provide additional information or
test a proposed fix. Jump to the section 'Duties after the report went out' for
details on how to get properly involved.

Note, searching `bugzilla.kernel.org <https://bugzilla.kernel.org/>`_ might also
be a good idea, as that might provide valuable insights or turn up matching
reports. If you find the latter, just keep in mind: most subsystems expect
reports in different places, as described below in the section "Check where you
need to report your issue". The developers that should take care of the issue
thus might not even be aware of the bugzilla ticket. Hence, check in the ticket
whether the issue already got reported as outlined in this document and, if
not, consider doing so.

Issue of high priority?
-----------------------

    *See if the issue you are dealing with qualifies as a regression, security
    issue, or a really severe problem: those are 'issues of high priority' that
    need special handling in some steps that are about to follow.*

Linus Torvalds and the leading Linux kernel developers want to see some issues
fixed as soon as possible, hence there are 'issues of high priority' that get
handled slightly differently in the reporting process. Three types of cases
qualify: regressions, security issues, and really severe problems.

You deal with a regression if some application or practical use case running
fine with one Linux kernel works worse or not at all with a newer version
compiled using a similar configuration. The document
Documentation/admin-guide/reporting-regressions.rst explains this in more
detail. It also provides a good deal of other information about regressions you
might want to be aware of; it for example explains how to add your issue to the
list of tracked regressions, to ensure it won't fall through the cracks.

What qualifies as a security issue is left to your judgment. Consider reading
Documentation/process/security-bugs.rst before proceeding, as it
provides additional details on how to best handle security issues.

An issue is a 'really severe problem' when something totally unacceptably bad
happens. That's for example the case when a Linux kernel corrupts the data it's
handling or damages hardware it's running on. You're also dealing with a severe
issue when the kernel suddenly stops working with an error message ('kernel
panic') or without any farewell note at all. Note: do not confuse a 'panic' (a
fatal error where the kernel stops itself) with an 'Oops' (a recoverable error),
as the kernel remains running after the latter.

Ensure a healthy environment
----------------------------

    *Make sure it's not the kernel's surroundings that are causing the issue
    you face.*

Problems that look a lot like a kernel issue are sometimes caused by the build
or runtime environment. It's hard to rule out that problem completely, but you
should minimize it:

 * Use proven tools when building your kernel, as bugs in the compiler or the
   binutils can cause the resulting kernel to misbehave.

 * Ensure your computer components run within their design specifications;
   that's especially important for the main processor, the main memory, and the
   motherboard. Therefore, stop undervolting or overclocking when facing a
   potential kernel issue.

 * Try to make sure it's not faulty hardware that is causing your issue. Bad
   main memory for example can result in a multitude of issues that will
   manifest themselves in problems looking like kernel issues.

 * If you're dealing with a filesystem issue, you might want to check the file
   system in question with ``fsck``, as it might be damaged in a way that leads
   to unexpected kernel behavior.

 * When dealing with a regression, make sure it's not something else that
   changed in parallel to updating the kernel. The problem for example might be
   caused by other software that was updated at the same time. It can also
   happen that a hardware component coincidentally just broke when you rebooted
   into a new kernel for the first time. Updating the system's BIOS or changing
   something in the BIOS Setup can also lead to problems that look a lot
   like a kernel regression.

  356. Prepare for emergencies
  357. -----------------------
  358. *Create a fresh backup and put system repair and restore tools at hand.*
Reminder, you are dealing with computers, which sometimes do unexpected things,
especially if you fiddle with crucial parts like the kernel of their operating
system. That's what you are about to do in this process. Thus, make sure to
  362. create a fresh backup; also ensure you have all tools at hand to repair or
  363. reinstall the operating system as well as everything you need to restore the
  364. backup.
  365. Make sure your kernel doesn't get enhanced
  366. ------------------------------------------
  367. *Ensure your system does not enhance its kernels by building additional
  368. kernel modules on-the-fly, which solutions like DKMS might be doing locally
  369. without your knowledge.*
  370. The risk your issue report gets ignored or rejected dramatically increases if
  371. your kernel gets enhanced in any way. That's why you should remove or disable
  372. mechanisms like akmods and DKMS: those build add-on kernel modules
  373. automatically, for example when you install a new Linux kernel or boot it for
  374. the first time. Also remove any modules they might have installed. Then reboot
  375. before proceeding.
Note, you might not be aware that your system is using one of these solutions:
they often get set up silently when you install Nvidia's proprietary graphics
driver, VirtualBox, or other software that requires some support from a
module that is not part of the Linux kernel. That's why you might need to uninstall the
packages with such software to get rid of any 3rd party kernel module.
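One way to spot add-on modules that are currently loaded is to look at the
per-module taint flags the kernel exposes in sysfs. The following is a minimal
sketch, not an official tool: the function name ``list_tainting_modules`` is
made up for this example, and it assumes your kernel provides
``/sys/module/<name>/taint`` files that are empty for modules which do not
taint the kernel.

```shell
# List loaded modules that carry a taint flag, one "name flags" pair per line.
# The optional argument is the sysfs module directory (default: /sys/module);
# it exists mainly so the function can be tried against a test directory.
list_tainting_modules() {
    sysdir=${1:-/sys/module}
    for taint_file in "$sysdir"/*/taint; do
        # Skip entries without a readable taint file (and an unexpanded glob)
        [ -r "$taint_file" ] || continue
        flags=$(cat "$taint_file")
        if [ -n "$flags" ]; then
            module=$(basename "$(dirname "$taint_file")")
            echo "$module $flags"
        fi
    done
}
```

Calling ``list_tainting_modules`` without arguments inspects the running
system; no output suggests that none of the loaded modules tainted the kernel.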
  381. Check 'taint' flag
  382. ------------------
  383. *Check if your kernel was 'tainted' when the issue occurred, as the event
  384. that made the kernel set this flag might be causing the issue you face.*
  385. The kernel marks itself with a 'taint' flag when something happens that might
  386. lead to follow-up errors that look totally unrelated. The issue you face might
  387. be such an error if your kernel is tainted. That's why it's in your interest to
  388. rule this out early before investing more time into this process. This is the
  389. only reason why this step is here, as this process later will tell you to
  390. install the latest mainline kernel; you will need to check the taint flag again
  391. then, as that's when it matters because it's the kernel the report will focus
  392. on.
On a running system it is easy to check if the kernel tainted itself: if ``cat
/proc/sys/kernel/tainted`` returns '0' then the kernel is not tainted and
everything is fine. Checking that file is impossible in some situations; that's
  396. why the kernel also mentions the taint status when it reports an internal
  397. problem (a 'kernel bug'), a recoverable error (a 'kernel Oops') or a
  398. non-recoverable error before halting operation (a 'kernel panic'). Look near
  399. the top of the error messages printed when one of these occurs and search for a
  400. line starting with 'CPU:'. It should end with 'Not tainted' if the kernel was
  401. not tainted when it noticed the problem; it was tainted if you see 'Tainted:'
  402. followed by a few spaces and some letters.
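The value in ``/proc/sys/kernel/tainted`` is a bit field, so a non-zero number
can be split into the individual bits that are set. The sketch below shows one
way to do that in plain shell; the function names ``taint_status`` and
``taint_bits`` are made up for this example, and the meaning of each bit number
still has to be looked up in Documentation/admin-guide/tainted-kernels.rst.

```shell
# Say whether a taint value (as read from /proc/sys/kernel/tainted) is clean.
taint_status() {
    if [ "$1" -eq 0 ]; then
        echo "not tainted"
    else
        echo "tainted (value $1)"
    fi
}

# Print the numbers of all bits set in a taint value, separated by spaces.
taint_bits() {
    value=$1
    bit=0
    out=""
    while [ "$value" -gt 0 ]; do
        if [ $((value & 1)) -eq 1 ]; then
            out="${out:+$out }$bit"
        fi
        value=$((value >> 1))
        bit=$((bit + 1))
    done
    echo "$out"
}
```

On a live system this could be used as ``taint_status "$(cat
/proc/sys/kernel/tainted)"``; a value of 4097 for example has bits 0 and 12
set.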
  403. If your kernel is tainted, study Documentation/admin-guide/tainted-kernels.rst
to find out why. Try to eliminate the reason. Often it's caused by one of these
three things:
  406. 1. A recoverable error (a 'kernel Oops') occurred and the kernel tainted
  407. itself, as the kernel knows it might misbehave in strange ways after that
  408. point. In that case check your kernel or system log and look for a section
  409. that starts with this::
  410. Oops: 0000 [#1] SMP
  411. That's the first Oops since boot-up, as the '#1' between the brackets shows.
  412. Every Oops and any other problem that happens after that point might be a
  413. follow-up problem to that first Oops, even if both look totally unrelated.
  414. Rule this out by getting rid of the cause for the first Oops and reproducing
  415. the issue afterwards. Sometimes simply restarting will be enough, sometimes
  416. a change to the configuration followed by a reboot can eliminate the Oops.
  417. But don't invest too much time into this at this point of the process, as
  418. the cause for the Oops might already be fixed in the newer Linux kernel
  419. version you are going to install later in this process.
2. Your system uses software that installs its own kernel modules, for
   example Nvidia's proprietary graphics driver or VirtualBox. The kernel
   taints itself when it loads such a module from external sources (even if
   it is Open Source): such modules sometimes cause errors in unrelated kernel
   areas and thus might be causing the issue you face. You therefore have to
   prevent those modules from loading when you want to report an issue to the
   Linux kernel developers. Most of the time the easiest way to do that is:
   temporarily uninstall such software including any modules it might have
   installed. Afterwards reboot.
  429. 3. The kernel also taints itself when it's loading a module that resides in
  430. the staging tree of the Linux kernel source. That's a special area for
  431. code (mostly drivers) that does not yet fulfill the normal Linux kernel
  432. quality standards. When you report an issue with such a module it's
  433. obviously okay if the kernel is tainted; just make sure the module in
  434. question is the only reason for the taint. If the issue happens in an
  435. unrelated area reboot and temporarily block the module from being loaded
  436. by specifying ``foo.blacklist=1`` as kernel parameter (replace 'foo' with
  437. the name of the module in question).
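To check the first of the three cases above, the kernel log can be searched for
the first Oops since boot. This is only a sketch: the function name
``first_oops`` is made up for this example, the pattern assumes the Oops header
carries the ``[#1]`` counter as in the line shown above, and ``grep -m`` is
assumed to be available (it is in GNU and BSD grep).

```shell
# Print the header line of the first Oops found in a kernel log on stdin.
# Prints nothing and returns non-zero if the log contains no such line.
first_oops() {
    grep -m 1 -E 'Oops:.*\[#1\]'
}

# Possible invocations on a live system:
#   dmesg | first_oops
#   journalctl -k | first_oops
```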
  438. Document how to reproduce issue
  439. -------------------------------
  440. *Write down coarsely how to reproduce the issue. If you deal with multiple
  441. issues at once, create separate notes for each of them and make sure they
  442. work independently on a freshly booted system. That's needed, as each issue
  443. needs to get reported to the kernel developers separately, unless they are
  444. strongly entangled.*
  445. If you deal with multiple issues at once, you'll have to report each of them
  446. separately, as they might be handled by different developers. Describing
  447. various issues in one report also makes it quite difficult for others to tear
  448. it apart. Hence, only combine issues in one report if they are very strongly
  449. entangled.
  450. Additionally, during the reporting process you will have to test if the issue
  451. happens with other kernel versions. Therefore, it will make your work easier if
  452. you know exactly how to reproduce an issue quickly on a freshly booted system.
  453. Note: it's often fruitless to report issues that only happened once, as they
  454. might be caused by a bit flip due to cosmic radiation. That's why you should
  455. try to rule that out by reproducing the issue before going further. Feel free
  456. to ignore this advice if you are experienced enough to tell a one-time error
  457. due to faulty hardware apart from a kernel issue that rarely happens and thus
  458. is hard to reproduce.
  459. Regression in stable or longterm kernel?
  460. ----------------------------------------
  461. *If you are facing a regression within a stable or longterm version line
  462. (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
  463. 'Dealing with regressions within a stable and longterm kernel line'.*
Regressions within a stable or longterm kernel version line are something the
Linux developers want to fix badly, as such issues are even more unwanted than
regressions in the main development branch, because they can quickly affect a lot of
people. The developers thus want to learn about such issues as quickly as
possible, hence there is a streamlined process to report them. Note,
regressions that show up when switching to a newer kernel version line (say something broke when switching
from 5.9.15 to 5.10.5) do not qualify.
  471. Check where you need to report your issue
  472. -----------------------------------------
  473. *Locate the driver or kernel subsystem that seems to be causing the issue.
  474. Find out how and where its developers expect reports. Note: most of the
  475. time this won't be bugzilla.kernel.org, as issues typically need to be sent
  476. by mail to a maintainer and a public mailing list.*
  477. It's crucial to send your report to the right people, as the Linux kernel is a
  478. big project and most of its developers are only familiar with a small subset of
it. Quite a few programmers only care for just one driver, for
example one for a WiFi chip; such a developer likely will only have little or no
knowledge about the internals of remote or unrelated "subsystems", like the TCP
stack, the PCIe/PCI subsystem, memory management or file systems.
  483. Problem is: the Linux kernel lacks a central bug tracker where you can simply
  484. file your issue and make it reach the developers that need to know about it.
  485. That's why you have to find the right place and way to report issues yourself.
  486. You can do that with the help of a script (see below), but it mainly targets
  487. kernel developers and experts. For everybody else the MAINTAINERS file is the
  488. better place.
  489. How to read the MAINTAINERS file
  490. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To illustrate how to use the :ref:`MAINTAINERS <maintainers>` file, let's assume
the WiFi in your laptop suddenly misbehaves after updating the kernel. In that
  493. case it's likely an issue in the WiFi driver. Obviously it could also be some
  494. code it builds upon, but unless you suspect something like that stick to the
  495. driver. If it's really something else, the driver's developers will get the
  496. right people involved.
  497. Sadly, there is no way to check which code is driving a particular hardware
  498. component that is both universal and easy.
  499. In case of a problem with the WiFi driver you for example might want to look at
  500. the output of ``lspci -k``, as it lists devices on the PCI/PCIe bus and the
  501. kernel module driving it::
  502. [user@something ~]$ lspci -k
  503. [...]
  504. 3a:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
  505. Subsystem: Bigfoot Networks, Inc. Device 1535
  506. Kernel driver in use: ath10k_pci
  507. Kernel modules: ath10k_pci
  508. [...]
  509. But this approach won't work if your WiFi chip is connected over USB or some
  510. other internal bus. In those cases you might want to check your WiFi manager or
  511. the output of ``ip link``. Look for the name of the problematic network
  512. interface, which might be something like 'wlp58s0'. This name can be used like
  513. this to find the module driving it::
  514. [user@something ~]$ realpath --relative-to=/sys/module/ /sys/class/net/wlp58s0/device/driver/module
  515. ath10k_pci
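The ``realpath`` trick above can be wrapped in a small helper that also reports
when an interface has no module link, which for example can happen with
virtual interfaces. This is a sketch under assumptions: the function name
``net_driver_module`` and its optional second argument (an alternative sysfs
root, handy for trying it out) are made up for this example.

```shell
# Print the kernel module driving a network interface.
# $1: interface name, $2: sysfs root (optional, defaults to /sys)
net_driver_module() {
    iface=$1
    link="${2:-/sys}/class/net/$iface/device/driver/module"
    if [ -e "$link" ]; then
        # The link points to the module's directory below <sysfs>/module/
        basename "$(realpath "$link")"
    else
        echo "no module link found for $iface" >&2
        return 1
    fi
}
```

With the interface from the example, ``net_driver_module wlp58s0`` would print
the same module name as the ``realpath`` call shown above.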
  516. In case tricks like these don't bring you any further, try to search the
  517. internet on how to narrow down the driver or subsystem in question. And if you
  518. are unsure which it is: just try your best guess, somebody will help you if you
  519. guessed poorly.
  520. Once you know the driver or subsystem, you want to search for it in the
  521. MAINTAINERS file. In the case of 'ath10k_pci' you won't find anything, as the
name is too specific. Sometimes you will need to search on the net for help;
but before doing so, try a somewhat shortened or modified name when searching the
MAINTAINERS file, as then you might find something like this::
  525. QUALCOMM ATHEROS ATH10K WIRELESS DRIVER
  526. Mail: A. Some Human <shuman@example.com>
  527. Mailing list: ath10k@lists.infradead.org
  528. Status: Supported
  529. Web-page: https://wireless.wiki.kernel.org/en/users/Drivers/ath10k
  530. SCM: git git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
  531. Files: drivers/net/wireless/ath/ath10k/
Note: the line descriptions will be abbreviated if you read the plain
MAINTAINERS file found in the root of the Linux source tree. 'Mail:' for
example will be 'M:', 'Mailing list:' will be 'L:', and 'Status:' will be 'S:'.
A section near the top of the file explains these and other abbreviations.
  536. First look at the line 'Status'. Ideally it should be 'Supported' or
  537. 'Maintained'. If it states 'Obsolete' then you are using some outdated approach
  538. that was replaced by a newer solution you need to switch to. Sometimes the code
  539. only has someone who provides 'Odd Fixes' when feeling motivated. And with
  540. 'Orphan' you are totally out of luck, as nobody takes care of the code anymore.
That only leaves these options: resign yourself to living with the issue, fix it
yourself, or find a programmer somewhere willing to fix it.
  543. After checking the status, look for a line starting with 'bugs:': it will tell
  544. you where to find a subsystem specific bug tracker to file your issue. The
  545. example above does not have such a line. That is the case for most sections, as
  546. Linux kernel development is completely driven by mail. Very few subsystems use
  547. a bug tracker, and only some of those rely on bugzilla.kernel.org.
  548. In this and many other cases you thus have to look for lines starting with
  549. 'Mail:' instead. Those mention the name and the email addresses for the
  550. maintainers of the particular code. Also look for a line starting with 'Mailing
  551. list:', which tells you the public mailing list where the code is developed.
  552. Your report later needs to go by mail to those addresses. Additionally, for all
  553. issue reports sent by email, make sure to add the Linux Kernel Mailing List
  554. (LKML) <linux-kernel@vger.kernel.org> to CC. Don't omit either of the mailing
  555. lists when sending your issue report by mail later! Maintainers are busy people
  556. and might leave some work for other developers on the subsystem specific list;
  557. and LKML is important to have one place where all issue reports can be found.
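If you have the plain MAINTAINERS file at hand, the lookup described above can
be sketched with ``awk``. This assumes the entries in the file are separated by
blank lines, as they are in the plain-text file in the source tree; the
function name ``maintainers_section`` is made up for this example. Be aware
that the explanatory text at the top of the file is searched as well, so a very
generic term may match more than you want.

```shell
# Print every blank-line separated entry of a MAINTAINERS file that contains
# the search term (case-insensitive).
# $1: search term, $2: path to the MAINTAINERS file
maintainers_section() {
    awk -v term="$1" '
        BEGIN { RS = ""; term = tolower(term) }   # RS="" reads paragraph-wise
        index(tolower($0), term) { print; print "" }
    ' "$2"
}

# Possible invocation from the root of the Linux sources; the grep limits the
# output to the abbreviated mail, mailing list, status and bugs lines:
#   maintainers_section ath10k MAINTAINERS | grep -E '^(M|L|S|B):'
```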
  558. Finding the maintainers with the help of a script
  559. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  560. For people that have the Linux sources at hand there is a second option to find
  561. the proper place to report: the script 'scripts/get_maintainer.pl' which tries
  562. to find all people to contact. It queries the MAINTAINERS file and needs to be
called with a path to the source code in question. For drivers compiled as
module it often can be found with a command like this::
  565. $ modinfo ath10k_pci | grep filename | sed 's!/lib/modules/.*/kernel/!!; s!filename:!!; s!\.ko\(\|\.xz\)!!'
drivers/net/wireless/ath/ath10k/ath10k_pci
  567. Pass parts of this to the script::
  568. $ ./scripts/get_maintainer.pl -f drivers/net/wireless/ath/ath10k*
  569. Some Human <shuman@example.com> (supporter:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER)
  570. Another S. Human <asomehuman@example.com> (maintainer:NETWORKING DRIVERS)
  571. ath10k@lists.infradead.org (open list:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER)
  572. linux-wireless@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS))
  573. netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
  574. linux-kernel@vger.kernel.org (open list)
Don't send your report to all of them. Send it to the maintainers, which the
  576. script calls "supporter:"; additionally CC the most specific mailing list for
  577. the code as well as the Linux Kernel Mailing List (LKML). In this case you thus
  578. would need to send the report to 'Some Human <shuman@example.com>' with
  579. 'ath10k@lists.infradead.org' and 'linux-kernel@vger.kernel.org' in CC.
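The selection just described can be sketched as a filter over the script's
output. Treat this as an illustration only: the function name
``recipients_for_section`` is made up, you have to pass the title of the most
specific section yourself, and the pattern matching assumes the output format
shown in the example above.

```shell
# Read get_maintainer.pl output on stdin and print a To: line for each
# maintainer or supporter of the given section, a Cc: line for that
# section's mailing list, and a Cc: line for LKML.
# $1: section title exactly as it appears in the parentheses
recipients_for_section() {
    awk -v section="$1" '
        { tagged = index($0, ":" section ")") }
        tagged && /\((maintainer|supporter):/ { sub(/ \(.*/, ""); print "To: " $0; next }
        tagged && /list:/                     { sub(/ \(.*/, ""); print "Cc: " $0; next }
        /^linux-kernel@vger\.kernel\.org/     { sub(/ \(.*/, ""); print "Cc: " $0 }
    '
}

# Possible invocation from the root of the Linux sources:
#   ./scripts/get_maintainer.pl -f drivers/net/wireless/ath/ath10k* | \
#       recipients_for_section 'QUALCOMM ATHEROS ATH10K WIRELESS DRIVER'
```

Always double-check the result by hand, as the filter cannot judge which
section is the most specific one for your issue.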
  580. Note: in case you cloned the Linux sources with git you might want to call
  581. ``get_maintainer.pl`` a second time with ``--git``. The script then will look
  582. at the commit history to find which people recently worked on the code in
  583. question, as they might be able to help. But use these results with care, as it
  584. can easily send you in a wrong direction. That for example happens quickly in
  585. areas rarely changed (like old or unmaintained drivers): sometimes such code is
  586. modified during tree-wide cleanups by developers that do not care about the
  587. particular driver at all.
  588. Search for existing reports, second run
  589. ---------------------------------------
  590. *Search the archives of the bug tracker or mailing list in question
  591. thoroughly for reports that might match your issue. If you find anything,
  592. join the discussion instead of sending a new report.*
  593. As mentioned earlier already: reporting an issue that someone else already
  594. brought forward is often a waste of time for everyone involved, especially you
as the reporter. That's why you should search for existing reports again, now
that you know where they need to be reported to. If it's a mailing list, you will
often find its archives on `lore.kernel.org <https://lore.kernel.org/>`_.
But some lists are hosted in different places. That for example is the case for
  599. the ath10k WiFi driver used as example in the previous step. But you'll often
  600. find the archives for these lists easily on the net. Searching for 'archive
  601. ath10k@lists.infradead.org' for example will lead you to the `Info page for the
  602. ath10k mailing list <https://lists.infradead.org/mailman/listinfo/ath10k>`_,
  603. which at the top links to its
  604. `list archives <https://lists.infradead.org/pipermail/ath10k/>`_. Sadly this and
  605. quite a few other lists miss a way to search the archives. In those cases use a
  606. regular internet search engine and add something like
  607. 'site:lists.infradead.org/pipermail/ath10k/' to your search terms, which limits
  608. the results to the archives at that URL.
  609. It's also wise to check the internet, LKML and maybe bugzilla.kernel.org again
  610. at this point. If your report needs to be filed in a bug tracker, you may want
  611. to check the mailing list archives for the subsystem as well, as someone might
  612. have reported it only there.
  613. For details how to search and what to do if you find matching reports see
  614. "Search for existing reports, first run" above.
  615. Do not hurry with this step of the reporting process: spending 30 to 60 minutes
  616. or even more time can save you and others quite a lot of time and trouble.
  617. Install a fresh kernel for testing
  618. ----------------------------------
  619. *Unless you are already running the latest 'mainline' Linux kernel, better
  620. go and install it for the reporting process. Testing and reporting with
  621. the latest 'stable' Linux can be an acceptable alternative in some
  622. situations; during the merge window that actually might be even the best
  623. approach, but in that development phase it can be an even better idea to
  624. suspend your efforts for a few days anyway. Whatever version you choose,
ideally use a 'vanilla' build. Ignoring this advice will dramatically
increase the risk your report will be rejected or ignored.*
  627. As mentioned in the detailed explanation for the first step already: Like most
  628. programmers, Linux kernel developers don't like to spend time dealing with
reports for issues that don't even happen with the current code. It's just a
waste of everybody's time, especially yours. That's why it's in everybody's
  631. interest that you confirm the issue still exists with the latest upstream code
  632. before reporting it. You are free to ignore this advice, but as outlined
  633. earlier: doing so dramatically increases the risk that your issue report might
  634. get rejected or simply ignored.
  635. In the scope of the kernel "latest upstream" normally means:
  636. * Install a mainline kernel; the latest stable kernel can be an option, but
  637. most of the time is better avoided. Longterm kernels (sometimes called 'LTS
  638. kernels') are unsuitable at this point of the process. The next subsection
  639. explains all of this in more detail.
* The subsection after that describes ways to obtain and install such a kernel.
  It also outlines that using a pre-compiled kernel is fine, but it should ideally be
  vanilla, which means: it was built using Linux sources taken straight `from
  kernel.org <https://kernel.org/>`_ and not modified or enhanced in any way.
  644. Choosing the right version for testing
  645. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  646. Head over to `kernel.org <https://kernel.org/>`_ to find out which version you
  647. want to use for testing. Ignore the big yellow button that says 'Latest release'
  648. and look a little lower at the table. At its top you'll see a line starting with
  649. mainline, which most of the time will point to a pre-release with a version
  650. number like '5.8-rc2'. If that's the case, you'll want to use this mainline
kernel for testing, as that's where all fixes have to be applied first. Do not let
that 'rc' scare you, these 'development kernels' are pretty reliable, and you
made a backup, as you were instructed above, didn't you?
  654. In about two out of every nine to ten weeks, mainline might point you to a
  655. proper release with a version number like '5.7'. If that happens, consider
  656. suspending the reporting process until the first pre-release of the next
  657. version (5.8-rc1) shows up on kernel.org. That's because the Linux development
  658. cycle then is in its two-week long 'merge window'. The bulk of the changes and
  659. all intrusive ones get merged for the next release during this time. It's a bit
  660. more risky to use mainline during this period. Kernel developers are also often
  661. quite busy then and might have no spare time to deal with issue reports. It's
  662. also quite possible that one of the many changes applied during the merge
  663. window fixes the issue you face; that's why you soon would have to retest with
  664. a newer kernel version anyway, as outlined below in the section 'Duties after
  665. the report went out'.
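The distinction made in the last two paragraphs can be expressed as a simple
check on the version string shown in the mainline row on kernel.org. This is
only a sketch of that rule of thumb; the function name ``classify_mainline`` is
made up for this example, and you have to copy the version string from the
website yourself.

```shell
# Classify the version string from the mainline row on kernel.org.
# $1: version string, for example '5.8-rc2' or '5.7'
classify_mainline() {
    case $1 in
        *-rc[0-9]*)
            echo "pre-release: use this mainline kernel for testing" ;;
        *)
            echo "proper release: merge window is likely open, consider waiting for the next -rc1" ;;
    esac
}
```

``classify_mainline 5.8-rc2`` reports a pre-release, while ``classify_mainline
5.7`` points to the merge window.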
That's why it might make sense to wait till the merge window is over. But don't
do that if you're dealing with something that shouldn't wait. In that case
consider obtaining the latest mainline kernel via git (see below) or use the
latest stable version offered on kernel.org. Using that is also acceptable in
case mainline for some reason does currently not work for you. And in general:
using it for reproducing the issue is also better than not reporting the issue
at all.
  673. Better avoid using the latest stable kernel outside merge windows, as all fixes
  674. must be applied to mainline first. That's why checking the latest mainline
  675. kernel is so important: any issue you want to see fixed in older version lines
  676. needs to be fixed in mainline first before it can get backported, which can
  677. take a few days or weeks. Another reason: the fix you hope for might be too
  678. hard or risky for backporting; reporting the issue again hence is unlikely to
  679. change anything.
  680. These aspects are also why longterm kernels (sometimes called "LTS kernels")
are unsuitable for this part of the reporting process: they are too distant from
  682. the current code. Hence go and test mainline first and follow the process
  683. further: if the issue doesn't occur with mainline it will guide you how to get
  684. it fixed in older version lines, if that's in the cards for the fix in question.
  685. How to obtain a fresh Linux kernel
  686. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Using a pre-compiled kernel**: This is often the quickest, easiest, and safest
way for testing, especially if you are unfamiliar with the Linux kernel. The
problem: most of those shipped by distributors or add-on repositories are built
  690. from modified Linux sources. They are thus not vanilla and therefore often
  691. unsuitable for testing and issue reporting: the changes might cause the issue
  692. you face or influence it somehow.
  693. But you are in luck if you are using a popular Linux distribution: for quite a
  694. few of them you'll find repositories on the net that contain packages with the
  695. latest mainline or stable Linux built as vanilla kernel. It's totally okay to
  696. use these, just make sure from the repository's description they are vanilla or
  697. at least close to it. Additionally ensure the packages contain the latest
  698. versions as offered on kernel.org. The packages are likely unsuitable if they
  699. are older than a week, as new mainline and stable kernels typically get released
  700. at least once a week.
  701. Please note that you might need to build your own kernel manually later: that's
  702. sometimes needed for debugging or testing fixes, as described later in this
  703. document. Also be aware that pre-compiled kernels might lack debug symbols that
  704. are needed to decode messages the kernel prints when a panic, Oops, warning, or
  705. BUG occurs; if you plan to decode those, you might be better off compiling a
  706. kernel yourself (see the end of this subsection and the section titled 'Decode
  707. failure messages' for details).
  708. **Using git**: Developers and experienced Linux users familiar with git are
  709. often best served by obtaining the latest Linux kernel sources straight from the
  710. `official development repository on kernel.org
  711. <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_.
  712. Those are likely a bit ahead of the latest mainline pre-release. Don't worry
  713. about it: they are as reliable as a proper pre-release, unless the kernel's
  714. development cycle is currently in the middle of a merge window. But even then
  715. they are quite reliable.
  716. **Conventional**: People unfamiliar with git are often best served by
  717. downloading the sources as tarball from `kernel.org <https://kernel.org/>`_.
  718. How to actually build a kernel is not described here, as many websites explain
  719. the necessary steps already. If you are new to it, consider following one of
  720. those how-to's that suggest to use ``make localmodconfig``, as that tries to
  721. pick up the configuration of your current kernel and then tries to adjust it
  722. somewhat for your system. That does not make the resulting kernel any better,
  723. but quicker to compile.
  724. Note: If you are dealing with a panic, Oops, warning, or BUG from the kernel,
  725. please try to enable CONFIG_KALLSYMS when configuring your kernel.
  726. Additionally, enable CONFIG_DEBUG_KERNEL and CONFIG_DEBUG_INFO, too; the
  727. latter is the relevant one of those two, but can only be reached if you enable
  728. the former. Be aware CONFIG_DEBUG_INFO increases the storage space required to
  729. build a kernel by quite a bit. But that's worth it, as these options will allow
  730. you later to pinpoint the exact line of code that triggers your issue. The
  731. section 'Decode failure messages' below explains this in more detail.
  732. But keep in mind: Always keep a record of the issue encountered in case it is
  733. hard to reproduce. Sending an undecoded report is better than not reporting
  734. the issue at all.
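Before starting a long build it can be worth checking that the options named in
the note above actually ended up in the configuration. The following sketch
does that for a ``.config`` file; the function name ``check_debug_config`` is
made up for this example, and it assumes enabled options appear as
``CONFIG_FOO=y`` lines in that file.

```shell
# Report for each debug-related option whether it is enabled in a kernel
# configuration file. Returns non-zero if at least one option is missing.
# $1: path to the configuration file, typically '.config'
check_debug_config() {
    config=$1
    missing=0
    for opt in CONFIG_KALLSYMS CONFIG_DEBUG_KERNEL CONFIG_DEBUG_INFO; do
        if grep -q "^$opt=y" "$config"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as ``check_debug_config .config`` from the directory in which you
configured the kernel.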
  735. Check 'taint' flag
  736. ------------------
  737. *Ensure the kernel you just installed does not 'taint' itself when
  738. running.*
  739. As outlined above in more detail already: the kernel sets a 'taint' flag when
  740. something happens that can lead to follow-up errors that look totally
unrelated. That's why you need to check if the kernel you just installed does
not set this flag. And if it does, in almost all cases you need to
eliminate the reason for it before reporting issues that occur with it. See
the section above for details how to do that.
  745. Reproduce issue with the fresh kernel
  746. -------------------------------------
  747. *Reproduce the issue with the kernel you just installed. If it doesn't show
  748. up there, scroll down to the instructions for issues only happening with
  749. stable and longterm kernels.*
  750. Check if the issue occurs with the fresh Linux kernel version you just
  751. installed. If it was fixed there already, consider sticking with this version
  752. line and abandoning your plan to report the issue. But keep in mind that other
users might still be plagued by it, as long as it's not fixed in the stable
and longterm versions from kernel.org (and thus vendor kernels derived from
  755. those). If you prefer to use one of those or just want to help their users,
  756. head over to the section "Details about reporting issues only occurring in
  757. older kernel version lines" below.
  758. Optimize description to reproduce issue
  759. ---------------------------------------
  760. *Optimize your notes: try to find and write the most straightforward way to
  761. reproduce your issue. Make sure the end result has all the important
  762. details, and at the same time is easy to read and understand for others
  763. that hear about it for the first time. And if you learned something in this
  764. process, consider searching again for existing reports about the issue.*
An unnecessarily complex description will make it hard for others to understand your
report. Thus try to find a reproducer that's straightforward to describe and
thus easy to understand in written form. Include all important details, but at
the same time try to keep it as short as possible.

In this and the previous steps you likely have learned a thing or two about the
issue you face. Use this knowledge and search again for existing reports
you can join instead.
  772. Decode failure messages
  773. -----------------------
  774. *If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
  775. decoding the kernel log to find the line of code that triggered the error.*
  776. When the kernel detects an internal problem, it will log some information about
  777. the executed code. This makes it possible to pinpoint the exact line in the
  778. source code that triggered the issue and shows how it was called. But that only
works if you enabled CONFIG_DEBUG_INFO and CONFIG_KALLSYMS when configuring
your kernel. If you did so, consider decoding the information from the
kernel's log. That will make it a lot easier to understand what led to the
'panic', 'Oops', 'warning', or 'BUG', which increases the chances that someone
can provide a fix.
Decoding can be done with a script you find in the Linux source tree. If you
are running a kernel you compiled yourself earlier, call it like this::

  [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux

If you are running a packaged vanilla kernel, you will likely have to install
the corresponding packages with debug symbols. Then call the script (which you
might need to get from the Linux sources if your distro does not package it)
like this::

  [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh \
    /usr/lib/debug/lib/modules/5.10.10-4.1.x86_64/vmlinux /usr/src/kernels/5.10.10-4.1.x86_64/

The script will work on log lines like the following, which show the address of
the code the kernel was executing when the error occurred::

  [ 68.387301] RIP: 0010:test_module_init+0x5/0xffa [test_module]

Once decoded, these lines will look like this::

  [ 68.387301] RIP: 0010:test_module_init (/home/username/linux-5.10.5/test-module/test-module.c:16) test_module

In this case the executed code was built from the file
'~/linux-5.10.5/test-module/test-module.c' and the error was triggered by the
instructions found in line '16'.
  801. The script will similarly decode the addresses mentioned in the section
  802. starting with 'Call trace', which show the path to the function where the
  803. problem occurred. Additionally, the script will show the assembler output for
  804. the code section the kernel was executing.
  805. Note, if you can't get this to work, simply skip this step and mention the
  806. reason for it in the report. If you're lucky, it might not be needed. And if it
  807. is, someone might help you to get things going. Also be aware this is just one
  808. of several ways to decode kernel stack traces. Sometimes different steps will
be required to retrieve the relevant details. Don't worry about that: if that's
needed in your case, developers will tell you what to do.
  811. Special care for regressions
  812. ----------------------------
  813. *If your problem is a regression, try to narrow down when the issue was
  814. introduced as much as possible.*
Linux lead developer Linus Torvalds insists that the Linux kernel never
worsens; that's why he deems regressions unacceptable and wants to see them
fixed quickly. That's why changes that introduced a regression are often
promptly reverted if the issue they cause can't get solved quickly any other
way. Reporting a regression is thus a bit like playing a kind of trump card to
get something quickly fixed. But for that to happen the change that's causing
the regression needs to be known. Normally it's up to the reporter to track
down the culprit, as maintainers often won't have the time or setup at hand to
reproduce it themselves.
  824. To find the change there is a process called 'bisection' which the document
  825. Documentation/admin-guide/bug-bisect.rst describes in detail. That process
  826. will often require you to build about ten to twenty kernel images, trying to
  827. reproduce the issue with each of them before building the next. Yes, that takes
  828. some time, but don't worry, it works a lot quicker than most people assume.
  829. Thanks to a 'binary search' this will lead you to the one commit in the source
  830. code management system that's causing the regression. Once you find it, search
  831. the net for the subject of the change, its commit id and the shortened commit id
  832. (the first 12 characters of the commit id). This will lead you to existing
  833. reports about it, if there are any.
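The 'ten to twenty kernel images' follow from the binary search: every round
halves the range of suspect commits, so a range of N commits needs roughly
log2(N) test builds. The sketch below illustrates that arithmetic and the
shortening of a commit id; both function names are made up for this example,
and the commit id in the usage comment is a placeholder, not a real commit.

```shell
# Rough number of kernels a bisection needs to test for a range of N commits:
# each round halves the remaining range, so this is about log2(N), rounded up.
bisect_steps() {
    commits=$1
    steps=0
    while [ "$commits" -gt 1 ]; do
        commits=$(( (commits + 1) / 2 ))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Shortened commit id as described above: the first 12 characters.
short_commit_id() {
    printf '%.12s\n' "$1"
}

# Example usage (placeholder commit id):
#   bisect_steps 16000
#   short_commit_id 0123456789abcdef0123456789abcdef01234567
```

Inside a git checkout, something like ``git show -s --abbrev=12 --format='%h
%s' <commit>`` should print the shortened id together with the subject of the
change, which are the search terms suggested above.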
Note, a bisection needs a bit of know-how, which not everyone has, and quite a
bit of effort, which not everyone is willing to invest. Nevertheless,
performing a bisection yourself is highly recommended. If you really can't or
don't want to go down that route, at least find out which mainline kernel
introduced the regression. If something for example breaks when switching from
5.5.15 to 5.8.4, then try at least all the mainline releases in that area (5.6,
5.7 and 5.8) to check when it first showed up. Unless you're trying to find a
regression in a stable or longterm kernel, avoid testing versions whose number
has three sections (5.6.12, 5.7.8), as that makes the outcome hard to
interpret, which might render your testing useless. Once you have found the
major version which introduced the regression, feel free to move on in the
reporting process. But keep in mind: it depends on the issue at hand if the
developers will be able to help without knowing the culprit. Sometimes they
might recognize from the report what went wrong and can fix it; other times
they will be unable to help unless you perform a bisection.
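The 'two sections versus three sections' rule can be expressed as a small
filter. This is only a sketch with a made-up function name; it deliberately
also rejects pre-releases such as 5.8-rc1, which is a simplification of this
example and not something the text above demands.

```shell
# Hypothetical helper: succeed for version numbers with exactly two numeric
# sections (mainline releases such as 5.7); fail for versions with three
# sections (stable or longterm releases such as 5.7.8).
is_mainline_release() {
    case $1 in
        *.*.*) return 1 ;;           # three or more sections
        *[!0-9.]*) return 1 ;;       # anything else, including 5.8-rc1
        [0-9]*.[0-9]*) return 0 ;;   # two numeric sections
        *) return 1 ;;
    esac
}

# Example usage: print only the versions worth testing.
#   for v in 5.6 5.6.12 5.7 5.7.8 5.8; do
#       is_mainline_release "$v" && echo "$v"
#   done
```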
  849. When dealing with regressions make sure the issue you face is really caused by
  850. the kernel and not by something else, as outlined above already.
In the whole process keep in mind: an issue only qualifies as a regression if
the older and the newer kernel got built with a similar configuration. This can
be achieved by using ``make olddefconfig``, as explained in more detail by
Documentation/admin-guide/reporting-regressions.rst; that document also
provides a good deal of other information about regressions you might want to be
aware of.
  857. Write and send the report
  858. -------------------------
  859. *Start to compile the report by writing a detailed description about the
  860. issue. Always mention a few things: the latest kernel version you installed
  861. for reproducing, the Linux Distribution used, and your notes on how to
  862. reproduce the issue. Ideally, make the kernel's build configuration
  863. (.config) and the output from ``dmesg`` available somewhere on the net and
  864. link to it. Include or upload all other information that might be relevant,
  865. like the output/screenshot of an Oops or the output from ``lspci``. Once
  866. you wrote this main part, insert a normal length paragraph on top of it
  867. outlining the issue and the impact quickly. On top of this add one sentence
  868. that briefly describes the problem and gets people to read on. Now give the
  869. thing a descriptive title or subject that yet again is shorter. Then you're
  870. ready to send or file the report like the MAINTAINERS file told you, unless
  871. you are dealing with one of those 'issues of high priority': they need
  872. special care which is explained in 'Special handling for high priority
  873. issues' below.*
  874. Now that you have prepared everything it's time to write your report. How to do
  875. that is partly explained by the three documents linked to in the preface above.
  876. That's why this text will only mention a few of the essentials as well as
  877. things specific to the Linux kernel.
There is one thing that fits both categories: the most crucial parts of your
report are the title/subject, the first sentence, and the first paragraph.
Developers often get quite a lot of mail. They thus often just take a few
seconds to skim a mail before deciding to move on or look closer. Thus: the
better the top section of your report, the higher the chances that someone
will look into it and help you. And that is why you should ignore them for now
and write the detailed report first. ;-)
  885. Things each report should mention
  886. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  887. Describe in detail how your issue happens with the fresh vanilla kernel you
  888. installed. Try to include the step-by-step instructions you wrote and optimized
  889. earlier that outline how you and ideally others can reproduce the issue; in
  890. those rare cases where that's impossible try to describe what you did to
  891. trigger it.
Also include all the relevant information others might need to understand the
issue and its environment. What's actually needed depends a lot on the issue,
but there are some things you should always include:

 * the output from ``cat /proc/version``, which contains the Linux kernel
   version number and the compiler it was built with.

 * the Linux distribution the machine is running (``hostnamectl | grep
   "Operating System"``)

 * the architecture of the CPU and the operating system (``uname -mi``)

 * if you are dealing with a regression and performed a bisection, mention the
   subject and the commit-id of the change that is causing it.
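The first three items can be gathered in one go. The following is a minimal
sketch with a made-up function name; the fallbacks for machines without
``hostnamectl`` or without support for ``uname -i`` are additions of this
example.

```shell
# Hypothetical helper: print the details every report should contain.
collect_basics() {
    echo "== Kernel version and compiler =="
    cat /proc/version 2>/dev/null || echo "(could not read /proc/version)"

    echo "== Distribution =="
    if command -v hostnamectl >/dev/null 2>&1; then
        hostnamectl 2>/dev/null | grep "Operating System" \
            || echo "(hostnamectl printed no 'Operating System' line)"
    else
        echo "(hostnamectl not available)"
    fi

    echo "== Architecture =="
    # Not every uname implementation knows -i; fall back to -m alone.
    uname -mi 2>/dev/null || uname -m
}

# Example usage:
#   collect_basics > basics.txt
```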
  902. In a lot of cases it's also wise to make two more things available to those
  903. that read your report:

 * the configuration used for building your Linux kernel (the '.config' file)

 * the kernel's messages that you get from ``dmesg`` written to a file. Make
   sure that it starts with a line like 'Linux version 5.8-1
   (foobar@example.com) (gcc (GCC) 10.2.1, GNU ld version 2.34) #1 SMP Mon Aug
   3 14:54:37 UTC 2020'. If it's missing, then important messages from the first
   boot phase already got discarded. In this case instead consider using
   ``journalctl -b 0 -k``; alternatively you can also reboot, reproduce the
   issue and call ``dmesg`` right afterwards.

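Checking for that first line is easy to automate. The sketch below uses a
made-up function name and looks at the first five lines instead of only the
first one; that tolerance is an assumption of this example, as some platforms
might print another message before the version line.

```shell
# Hypothetical helper: succeed if a saved kernel log still contains the
# 'Linux version' line from the start of the boot.
log_has_boot_start() {
    head -n 5 "$1" | grep -q 'Linux version '
}

# Example usage: fall back to the journal if dmesg already lost the start.
#   dmesg > dmesg.txt
#   log_has_boot_start dmesg.txt || journalctl -b 0 -k > dmesg.txt
```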
These two files are big; that's why it's a bad idea to put them directly into
your report. If you are filing the issue in a bug tracker then attach them to
the ticket. If you report the issue by mail do not attach them, as that makes
the mail too large; instead do one of these things:

 * Upload the files somewhere public (your website, a public file paste
   service, a ticket created just for this purpose on `bugzilla.kernel.org
   <https://bugzilla.kernel.org/>`_, ...) and include a link to them in your
   report. Ideally use something where the files stay available for years, as
   they could be useful to someone many years from now; this for example can
   happen if five or ten years from now a developer works on some code that was
   changed just to fix your issue.

 * Put the files aside and mention you will send them later in individual
   replies to your own mail. Just remember to actually do that once the report
   went out. ;-)

  926. Things that might be wise to provide
  927. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Depending on the issue you might need to add more background data. Here are a
few suggestions on what is often good to provide:

 * If you are dealing with a 'warning', an 'Oops' or a 'panic' from the kernel,
   include it. If you can't copy'n'paste it, try to capture a netconsole trace
   or at least take a picture of the screen.

 * If the issue might be related to your computer hardware, mention what kind
   of system you use. If you for example have problems with your graphics card,
   mention its manufacturer, the card's model, and what chip it uses. If it's a
   laptop mention its name, but try to make sure it's meaningful. 'Dell XPS 13'
   for example is not, because it might be the one from 2012; that one looks
   not that different from the one sold today, but apart from that the two have
   nothing in common. Hence, in such cases add the exact model number, which
   for example is '9380' or '7390' for XPS 13 models introduced during 2019.
   Names like 'Lenovo Thinkpad T590' are also somewhat ambiguous: there are
   variants of this laptop with and without a dedicated graphics chip, so try
   to find the exact model name or specify the main components.

 * Mention the relevant software in use. If you have problems with loading
   modules, you want to mention the versions of kmod, systemd, and udev in use.
   If one of the DRM drivers misbehaves, you want to state the versions of
   libdrm and Mesa; also specify your Wayland compositor or the X-Server and
   its driver. If you have a filesystem issue, mention the version of
   corresponding filesystem utilities (e2fsprogs, btrfs-progs, xfsprogs, ...).

 * Gather additional information from the kernel that might be of interest. The
   output from ``lspci -nn`` will for example help others to identify what
   hardware you use. If you have a problem with hardware you even might want to
   make the output from ``sudo lspci -vvv`` available, as that provides
   insights into how the components were configured. For some issues it might
   be good to include the contents of files like ``/proc/cpuinfo``,
   ``/proc/ioports``, ``/proc/iomem``, ``/proc/modules``, or
   ``/proc/scsi/scsi``. Some subsystems also offer tools to collect relevant
   information. One such tool is ``alsa-info.sh`` `which the audio/sound
   subsystem developers provide <https://www.alsa-project.org/wiki/AlsaInfo>`_.

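The files named in the last item can be copied aside with a few lines of shell.
This is a sketch with a made-up function name and output naming scheme; the
optional second argument exists only so the function can be tried against a
directory tree other than the real ``/proc``.

```shell
# Hypothetical helper: copy the /proc files named above into a directory,
# skipping those that do not exist on this machine.
# Usage: collect_proc_files DESTDIR [ROOT]   (ROOT defaults to /)
collect_proc_files() {
    dest=$1
    root=${2:-/}
    mkdir -p "$dest" || return 1
    for name in cpuinfo ioports iomem modules scsi/scsi; do
        src="${root%/}/proc/$name"
        if [ -r "$src" ]; then
            # Flatten 'scsi/scsi' to 'scsi_scsi' so all copies share one directory.
            cat "$src" > "$dest/proc_$(printf '%s' "$name" | tr '/' '_').txt"
        fi
    done
}

# Example usage:
#   collect_proc_files ./report-data
#   lspci -nn > ./report-data/lspci-nn.txt
```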
Those examples should give you some ideas of what data might be wise to
attach, but you have to think for yourself about what will be helpful for
others to know. Don't worry too much about forgetting something, as developers
will ask for additional details they need. But making everything important
available from the start increases the chance someone will take a closer look.
  965. The important part: the head of your report
  966. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now that you have the detailed part of the report prepared let's get to the
most important section: the first few sentences. Thus go to the top, add
something like 'The detailed description:' before the part you just wrote and
insert two newlines at the top. Now write one normal length paragraph that
describes the issue roughly. Leave out all boring details and focus on the
crucial parts readers need to know to understand what this is all about; if you
think this bug affects a lot of users, mention this to get people interested.

Once you have done that insert two more lines at the top and write a one
sentence summary that explains quickly what the report is about. After that you
have to get even more abstract and write an even shorter subject/title for the
report.

Now that you have written this part take some time to optimize it, as it is the
most important part of your report: a lot of people will only read this before
they decide if reading the rest is time well spent.
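The resulting layout can be pictured as a skeleton. The sketch below merely
prints one; the function name and the placeholder texts in angle brackets are
made up for illustration. Note that the order on the page is the reverse of the
order in which the parts get written.

```shell
# Hypothetical helper: print an empty report skeleton in the order a reader
# sees it, which is the reverse of the order in which it gets written.
report_skeleton() {
    cat <<'EOF'
Subject: <even shorter title, written last>

<one-sentence summary of what the report is about>

<one normal-length paragraph describing the issue roughly>

The detailed description:

<the detailed part you wrote first>
EOF
}

# Example usage:
#   report_skeleton > report.txt
```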
  980. Now send or file the report like the :ref:`MAINTAINERS <maintainers>` file told
  981. you, unless it's one of those 'issues of high priority' outlined earlier: in
  982. that case please read the next subsection first before sending the report on
  983. its way.
  984. Special handling for high priority issues
  985. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  986. Reports for high priority issues need special handling.
**Severe issues**: make sure the subject or ticket title as well as the first
paragraph make the severity obvious.
  989. **Regressions**: make the report's subject start with '[REGRESSION]'.
  990. In case you performed a successful bisection, use the title of the change that
  991. introduced the regression as the second part of your subject. Make the report
  992. also mention the commit id of the culprit. In case of an unsuccessful bisection,
  993. make your report mention the latest tested version that's working fine (say 5.7)
  994. and the oldest where the issue occurs (say 5.8-rc1).
  995. When sending the report by mail, CC the Linux regressions mailing list
  996. (regressions@lists.linux.dev). In case the report needs to be filed to some web
  997. tracker, proceed to do so. Once filed, forward the report by mail to the
  998. regressions list; CC the maintainer and the mailing list for the subsystem in
  999. question. Make sure to inline the forwarded report, hence do not attach it.
  1000. Also add a short note at the top where you mention the URL to the ticket.
  1001. When mailing or forwarding the report, in case of a successful bisection add the
  1002. author of the culprit to the recipients; also CC everyone in the signed-off-by
  1003. chain, which you find at the end of its commit message.
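Both the subject and the list of people to CC can be derived from the culprit's
commit. The following is a minimal sketch with made-up function names; it
assumes the commit message is passed on standard input and that every
signed-off-by line carries the mail address in angle brackets.

```shell
# Hypothetical helper: build the subject for a regression report from the
# title of the culprit commit.
regression_subject() {
    printf '[REGRESSION] %s\n' "$1"
}

# Hypothetical helper: read a commit message on standard input and print the
# mail addresses from its Signed-off-by lines, one per line.
signed_off_addresses() {
    sed -n 's/^[[:space:]]*Signed-off-by:.*<\([^>]*\)>.*$/\1/p'
}

# Example usage inside a git checkout (<commit> is a placeholder):
#   git show -s --format=%B <commit> | signed_off_addresses
```

The author of the change is usually part of that chain already, but it doesn't
hurt to double check.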
**Security issues**: for these issues you will have to evaluate if a
short-term risk to other users would arise if details were publicly disclosed.
If that's not the case simply proceed with reporting the issue as described.
For issues that bear such a risk you will need to adjust the reporting process
slightly:

 * If the MAINTAINERS file instructed you to report the issue by mail, do not
   CC any public mailing lists.

 * If you were supposed to file the issue in a bug tracker make sure to mark
   the ticket as 'private' or 'security issue'. If the bug tracker does not
   offer a way to keep reports private, forget about it and send your report as
   a private mail to the maintainers instead.

  1015. In both cases make sure to also mail your report to the addresses the
  1016. MAINTAINERS file lists in the section 'security contact'. Ideally directly CC
  1017. them when sending the report by mail. If you filed it in a bug tracker, forward
  1018. the report's text to these addresses; but on top of it put a small note where
  1019. you mention that you filed it with a link to the ticket.
  1020. See Documentation/process/security-bugs.rst for more information.
  1021. Duties after the report went out
  1022. --------------------------------
  1023. *Wait for reactions and keep the thing rolling until you can accept the
  1024. outcome in one way or the other. Thus react publicly and in a timely manner
  1025. to any inquiries. Test proposed fixes. Do proactive testing: retest with at
  1026. least every first release candidate (RC) of a new mainline version and
  1027. report your results. Send friendly reminders if things stall. And try to
  1028. help yourself, if you don't get any help or if it's unsatisfying.*
  1029. If your report was good and you are really lucky then one of the developers
  1030. might immediately spot what's causing the issue; they then might write a patch
  1031. to fix it, test it, and send it straight for integration in mainline while
  1032. tagging it for later backport to stable and longterm kernels that need it. Then
  1033. all you need to do is reply with a 'Thank you very much' and switch to a version
  1034. with the fix once it gets released.
But this ideal scenario rarely happens. That's why the job is only starting
once you got the report out. What you'll have to do depends on the situation,
but often it will be the things listed below. But before digging into the
details, here are a few important things you need to keep in mind for this part
of the process.
  1040. General advice for further interactions
  1041. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Always reply in public**: When you filed the issue in a bug tracker, always
reply there and do not contact any of the developers privately about it. For
mailed reports always use the 'Reply-all' function when replying to any mails
you receive. That includes mails with any additional data you might want to add
to your report: go to your mail application's 'Sent' folder and use 'reply-all'
on your mail with the report. This approach will make sure the public mailing
list(s) and everyone else that gets involved over time stays in the loop; it
also keeps the mail thread intact, which among other things is really important
for mailing lists to group all related mails together.
  1051. There are just two situations where a comment in a bug tracker or a 'Reply-all'
  1052. is unsuitable:

 * Someone tells you to send something privately.

 * You were told to send something, but noticed it contains sensitive
   information that needs to be kept private. In that case it's okay to send it
   in private to the developer that asked for it. But note in the ticket or a
   mail that you did that, so everyone else knows you honored the request.

**Do research before asking for clarifications or help**: In this part of the
process someone might tell you to do something that requires a skill you might
not have mastered yet. For example, you might be asked to use some test tools
you have never heard of; or you might be asked to apply a patch to the
Linux kernel sources to test if it helps. In some cases it will be fine sending
a reply asking for instructions how to do that. But before going that route try
to find the answer on your own by searching the internet; alternatively
consider asking in other places for advice. For example ask a friend or post
about it to a chatroom or forum you normally hang out in.
  1067. **Be patient**: If you are really lucky you might get a reply to your report
  1068. within a few hours. But most of the time it will take longer, as maintainers
  1069. are scattered around the globe and thus might be in a different time zone – one
  1070. where they already enjoy their night away from keyboard.
  1071. In general, kernel developers will take one to five business days to respond to
  1072. reports. Sometimes it will take longer, as they might be busy with the merge
  1073. windows, other work, visiting developer conferences, or simply enjoying a long
  1074. summer holiday.
  1075. The 'issues of high priority' (see above for an explanation) are an exception
  1076. here: maintainers should address them as soon as possible; that's why you
  1077. should wait a week at maximum (or just two days if it's something urgent)
  1078. before sending a friendly reminder.
Sometimes the maintainer might not be responding in a timely manner; other
times there might be disagreements, for example about whether an issue
qualifies as a regression. In such cases raise your concerns on the mailing list and
  1082. ask others for public or private replies how to move on. If that fails, it
  1083. might be appropriate to get a higher authority involved. In case of a WiFi
  1084. driver that would be the wireless maintainers; if there are no higher level
  1085. maintainers or all else fails, it might be one of those rare situations where
  1086. it's okay to get Linus Torvalds involved.
  1087. **Proactive testing**: Every time the first pre-release (the 'rc1') of a new
  1088. mainline kernel version gets released, go and check if the issue is fixed there
  1089. or if anything of importance changed. Mention the outcome in the ticket or in a
  1090. mail you sent as reply to your report (make sure it has all those in the CC
  1091. that up to that point participated in the discussion). This will show your
  1092. commitment and that you are willing to help. It also tells developers if the
  1093. issue persists and makes sure they do not forget about it. A few other
  1094. occasional retests (for example with rc3, rc5 and the final) are also a good
  1095. idea, but only report your results if something relevant changed or if you are
  1096. writing something anyway.
  1097. With all these general things off the table let's get into the details of how
  1098. to help to get issues resolved once they were reported.
  1099. Inquires and testing request
  1100. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1101. Here are your duties in case you got replies to your report:
**Check who you deal with**: Most of the time it will be the maintainer or a
developer of the particular code area that will respond to your report. But as
issues are normally reported in public it could be anyone that's replying,
including people that want to help, but in the end might guide you totally off
track with their questions or requests. That rarely happens, but it's one of
many reasons why it's wise to quickly run an internet search to see who you're
interacting with. By doing this you also become aware whether your report was
heard by the right people, as a reminder to the maintainer (see below) might be
in order later if discussion fades out without leading to a satisfying solution
for the issue.
  1112. **Inquiries for data**: Often you will be asked to test something or provide
  1113. additional details. Try to provide the requested information soon, as you have
  1114. the attention of someone that might help and risk losing it the longer you
  1115. wait; that outcome is even likely if you do not provide the information within
  1116. a few business days.
**Requests for testing**: When you are asked to test a diagnostic patch or a
possible fix, try to test it in a timely manner, too. But do it properly and
make sure to not rush it: mixing things up can happen easily and can lead to a
lot of confusion for everyone involved. A common mistake for example is thinking
a proposed patch with a fix was applied when in fact it wasn't. Things like that
happen even to experienced testers occasionally, but they most of the time will
notice when the kernel with the fix behaves just as one without it.
  1124. What to do when nothing of substance happens
  1125. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1126. Some reports will not get any reaction from the responsible Linux kernel
  1127. developers; or a discussion around the issue evolved, but faded out with
  1128. nothing of substance coming out of it.
In these cases wait two (better: three) weeks before sending a friendly
reminder: maybe the maintainer was just away from keyboard for a while when
your report arrived or had something more important to take care of. When
writing the reminder, kindly ask if anything else from your side is needed to
get the ball rolling somehow. If the report got out by mail, do that in the
first lines of a mail that is a reply to your initial mail (see above) which
includes a full quote of the original report below: that's one of those few
situations where such a 'TOFU' (Text Over, Fullquote Under) is the right
approach, as then all the recipients will have the details at hand immediately
in the proper order.
  1139. After the reminder wait three more weeks for replies. If you still don't get a
  1140. proper reaction, you first should reconsider your approach. Did you maybe try
  1141. to reach out to the wrong people? Was the report maybe offensive or so
  1142. confusing that people decided to completely stay away from it? The best way to
  1143. rule out such factors: show the report to one or two people familiar with FLOSS
  1144. issue reporting and ask for their opinion. Also ask them for their advice how
  1145. to move forward. That might mean: prepare a better report and make those people
  1146. review it before you send it out. Such an approach is totally fine; just
  1147. mention that this is the second and improved report on the issue and include a
  1148. link to the first report.
  1149. If the report was proper you can send a second reminder; in it ask for advice
  1150. why the report did not get any replies. A good moment for this second reminder
  1151. mail is shortly after the first pre-release (the 'rc1') of a new Linux kernel
  1152. version got published, as you should retest and provide a status update at that
  1153. point anyway (see above).
  1154. If the second reminder again results in no reaction within a week, try to
  1155. contact a higher-level maintainer asking for advice: even busy maintainers by
  1156. then should at least have sent some kind of acknowledgment.
Remember to prepare yourself for a disappointment: maintainers ideally should
react somehow to every issue report, but they are only obliged to fix those
'issues of high priority' outlined earlier. So don't be too devastated if you
get a reply along the lines of 'thanks for the report, I have more important
issues to deal with currently and won't have time to look into this for the
foreseeable future'.
It's also possible that after some discussion in the bug tracker or on a list
nothing happens anymore and reminders don't help to motivate anyone to work out
a fix. Such situations can be devastating, but are in the cards when it
comes to Linux kernel development. This and several other reasons for not
getting help are explained in 'Why some issues won't get any reaction or remain
unfixed after being reported' near the end of this document.
Don't get devastated if you don't find any help or if the issue in the end does
not get solved: the Linux kernel is FLOSS and thus you can still help yourself.
You for example could try to find others that are affected and team up with
them to get the issue resolved. Such a team could prepare a fresh report
together that mentions how many you are and why this is something that in your
opinion should get fixed. Maybe together you can also narrow down the root cause
or the change that introduced a regression, which often makes developing a fix
easier. And with a bit of luck there might be someone in the team that knows a
bit about programming and might be able to write a fix.
  1178. Reference for "Reporting regressions within a stable and longterm kernel line"
  1179. ------------------------------------------------------------------------------
  1180. This subsection provides details for the steps you need to perform if you face
  1181. a regression within a stable and longterm kernel line.
  1182. Make sure the particular version line still gets support
  1183. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1184. *Check if the kernel developers still maintain the Linux kernel version
  1185. line you care about: go to the front page of kernel.org and make sure it
  1186. mentions the latest release of the particular version line without an
  1187. '[EOL]' tag.*
  1188. Most kernel version lines only get supported for about three months, as
  1189. maintaining them longer is quite a lot of work. Hence, only one per year is
  1190. chosen and gets supported for at least two years (often six). That's why you
  1191. need to check if the kernel developers still support the version line you care
  1192. for.
Note, if kernel.org lists two stable version lines on the front page, you
should consider switching to the newer one and forget about the older one:
support for it is likely to be abandoned soon. Then it will get an "end-of-life"
(EOL) stamp. Version lines that reached that point still get mentioned on the
kernel.org front page for a week or two, but are unsuitable for testing and
reporting.
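For those who prefer a terminal over the front page, the check can be sketched
as a filter over a plain-text release listing. Everything about the input
format is an assumption of this example: one release per line, with the marker
'EOL' on the same line for abandoned version lines. The ``finger_banner`` file
on kernel.org is believed to look roughly like that, but consult the front page
if in doubt. The function name is made up.

```shell
# Hypothetical helper: read a release listing on standard input and report
# whether the given version line (for example 5.10) is listed, and whether
# its entry carries an EOL marker.
version_line_status() {
    # Escape the dots so that '5.10' does not match something like '5x10'.
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Look for a release of that line, such as 5.10.19 for line 5.10.
    entry=$(grep -E "(^|[^0-9.])${pattern}\.[0-9]+([^0-9.]|\$)" | head -n 1)
    if [ -z "$entry" ]; then
        echo "unlisted"
    elif printf '%s\n' "$entry" | grep -q 'EOL'; then
        echo "eol"
    else
        echo "supported"
    fi
}

# Example usage (assumes curl is installed and the listing has the format
# described above):
#   curl -s https://www.kernel.org/finger_banner | version_line_status 5.10
```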
  1199. Search stable mailing list
  1200. ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1201. *Check the archives of the Linux stable mailing list for existing reports.*
Maybe the issue you face is already known and was fixed or is about to be.
Hence, `search the archives of the Linux stable mailing list
<https://lore.kernel.org/stable/>`_ for reports about an issue like yours. If
you find any matches, consider joining the discussion, unless the fix is
already finished and scheduled to get applied soon.
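The archive can also be searched straight from a URL by appending a ``q``
parameter. The sketch below builds such a URL; the function name is made up,
and the encoding is deliberately simplistic, so search terms containing
anything beyond letters, digits and spaces would need proper URL encoding.

```shell
# Hypothetical helper: print a search URL for the stable list archive.
lore_stable_search_url() {
    # Only spaces are encoded here; other special characters are passed
    # through unchanged.
    query=$(printf '%s' "$*" | tr ' ' '+')
    printf 'https://lore.kernel.org/stable/?q=%s\n' "$query"
}

# Example usage (the search terms are placeholders):
#   lore_stable_search_url nvme timeout resume
```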
  1207. Reproduce issue with the newest release
  1208. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1209. *Install the latest release from the particular version line as a vanilla
  1210. kernel. Ensure this kernel is not tainted and still shows the problem, as
  1211. the issue might have already been fixed there. If you first noticed the
  1212. problem with a vendor kernel, check a vanilla build of the last version
  1213. known to work performs fine as well.*
Before investing any more time in this process you want to check if the issue
was already fixed in the latest release of the version line you're interested
in. This kernel needs to be vanilla and shouldn't be tainted before the issue
happens, as already outlined in detail above in the section "Install a fresh
kernel for testing".
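
The taint check can be automated. The sketch below relies on the fact that the
kernel exposes its taint state as a number in ``/proc/sys/kernel/tainted``,
where ``0`` means the kernel is not tainted. The function name is made up for
this example; run the check right after booting, before the issue occurs.

```python
from pathlib import Path

# The running kernel exposes its taint state as a decimal number here.
TAINT_FILE = "/proc/sys/kernel/tainted"


def kernel_is_tainted(taint_file: str = TAINT_FILE) -> bool:
    """Return True if the file holds a non-zero taint value."""
    return int(Path(taint_file).read_text().strip()) != 0
```

The path is a parameter only to make the function easy to try on a machine that
is not running the kernel in question.
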
Did you first notice the regression with a vendor kernel? Then changes the
vendor applied might be interfering. You need to rule that out by performing
a recheck. Say something broke when you updated from 5.10.4-vendor.42 to
5.10.5-vendor.43. Then, after testing the latest 5.10 release as outlined in
the previous paragraph, check if a vanilla build of Linux 5.10.4 works fine as
well. If things are broken there, the issue does not qualify as an upstream
regression and you need to switch back to the main step-by-step guide to report
the issue.
  1227. Report the regression
  1228. ~~~~~~~~~~~~~~~~~~~~~
  1229. *Send a short problem report to the Linux stable mailing list
  1230. (stable@vger.kernel.org) and CC the Linux regressions mailing list
  1231. (regressions@lists.linux.dev); if you suspect the cause in a particular
  1232. subsystem, CC its maintainer and its mailing list. Roughly describe the
  1233. issue and ideally explain how to reproduce it. Mention the first version
  1234. that shows the problem and the last version that's working fine. Then
  1235. wait for further instructions.*
When reporting a regression that happens within a stable or longterm kernel
line (say when updating from 5.10.4 to 5.10.5), a brief report is enough at
first, as it gets the issue reported quickly. Hence a rough description sent to
the stable and regressions mailing lists is all it takes; but in case you
suspect the cause in a particular subsystem, CC its maintainers and its mailing
list as well, because that will speed things up.
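
To illustrate who receives what, here is a minimal sketch that assembles such a
report with Python's standard ``email`` package. The two list addresses are the
ones named above; the function name, its parameters and the layout of the mail
body are assumptions made up for this example, not a required format.

```python
from email.message import EmailMessage
from typing import Sequence

STABLE_LIST = "stable@vger.kernel.org"
REGRESSIONS_LIST = "regressions@lists.linux.dev"


def build_regression_report(
    summary: str,
    last_good: str,
    first_bad: str,
    description: str,
    extra_cc: Sequence[str] = (),
) -> EmailMessage:
    """Assemble a brief regression report; sending it is left to you."""
    message = EmailMessage()
    message["To"] = STABLE_LIST
    # extra_cc: maintainers and mailing list of a suspected subsystem, if any.
    message["Cc"] = ", ".join([REGRESSIONS_LIST, *extra_cc])
    message["Subject"] = summary
    message.set_content(
        f"{description}\n\n"
        f"Last version working fine: {last_good}\n"
        f"First version showing the problem: {first_bad}\n"
    )
    return message
```

Printing the returned object shows the headers and the body, which you can then
copy into your mail client.
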
And note, it helps developers a great deal if you can specify the exact version
that introduced the problem. Hence, if possible within a reasonable time frame,
try to find that version using vanilla kernels. Let's assume something broke
when your distributor released an update from Linux kernel 5.10.5 to 5.10.8.
Then as instructed above go and check the latest kernel from that version line,
say 5.10.9. If it shows the problem, try a vanilla 5.10.5 to ensure that no
patches the distributor applied interfere. If the issue doesn't manifest itself
there, try 5.10.7 and then (depending on the outcome) 5.10.8 or 5.10.6 to find
the first version where things broke. Mention it in the report and state that
5.10.9 is still broken.
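
The order in which to try the releases follows a plain binary search. The
sketch below only illustrates that order: ``first_bad_release`` is a made-up
helper, and the ``is_bad`` callback stands for you installing the given vanilla
release and checking whether the issue shows up.

```python
from typing import Callable, List, Sequence, Tuple


def first_bad_release(
    releases: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, List[str]]:
    """Find the first broken release by halving the candidate range.

    releases must be ordered from old to new, start with a release known to
    work and end with a release known to be broken. Returns the first broken
    release and the releases that had to be tested to find it.
    """
    if len(releases) < 2:
        raise ValueError("need at least one good and one bad release")
    good, bad = 0, len(releases) - 1
    tested = []
    while bad - good > 1:
        middle = (good + bad) // 2
        tested.append(releases[middle])
        if is_bad(releases[middle]):
            bad = middle
        else:
            good = middle
    return releases[bad], tested
```

For the releases 5.10.5 (works) to 5.10.9 (broken) this asks for 5.10.7 first
and then for either 5.10.8 or 5.10.6, just as described above.
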
What the previous paragraph outlines is basically a rough manual 'bisection'.
Once your report is out you might get asked to do a proper one, as it allows
pinpointing the exact change that causes the issue (which then can easily get
reverted to fix the issue quickly). Hence consider doing a proper bisection
right away if time permits. See the section 'Special care for regressions' and
the document Documentation/admin-guide/bug-bisect.rst for details on how to
perform one. In case of a successful bisection add the author of the culprit to
the recipients; also CC everyone in the signed-off-by chain, which you find at
the end of its commit message.
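
Collecting those recipients by hand is easy, but if you want to script it, a
sketch like the following might help. It assumes you already have the author
and the commit message of the culprit as strings; the function name is made up
for this example.

```python
import re
from typing import List

# Matches trailer lines like "Signed-off-by: Jane Doe <jane@example.org>",
# optionally indented the way "git log" prints commit messages.
SIGNED_OFF_BY = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(?P<who>.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def culprit_recipients(author: str, commit_message: str) -> List[str]:
    """Return the author plus everyone in the signed-off-by chain, once each."""
    recipients = [author]
    for match in SIGNED_OFF_BY.finditer(commit_message):
        who = match.group("who")
        if who not in recipients:
            recipients.append(who)
    return recipients
```

The returned entries are what you would add to the recipients of your report.
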
  1261. Reference for "Reporting issues only occurring in older kernel version lines"
  1262. -----------------------------------------------------------------------------
  1263. This section provides details for the steps you need to take if you could not
  1264. reproduce your issue with a mainline kernel, but want to see it fixed in older
  1265. version lines (aka stable and longterm kernels).
  1266. Some fixes are too complex
  1267. ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1268. *Prepare yourself for the possibility that going through the next few steps
  1269. might not get the issue solved in older releases: the fix might be too big
  1270. or risky to get backported there.*
Even small and seemingly obvious code-changes sometimes introduce new and
totally unexpected problems. The maintainers of the stable and longterm kernels
are very aware of that and thus only apply changes to these kernels that are
within the rules outlined in Documentation/process/stable-kernel-rules.rst.

Complex or risky changes for example do not qualify and thus only get applied
to mainline. Other fixes are easy to get backported to the newest stable and
longterm kernels, but too risky to integrate into older ones. So be aware the
fix you are hoping for might be one of those that won't be backported to the
version line you care about. In that case you'll have no other choice than to
live with the issue or switch to a newer Linux version, unless you want to
patch the fix into your kernels yourself.
  1282. Common preparations
  1283. ~~~~~~~~~~~~~~~~~~~
  1284. *Perform the first three steps in the section "Reporting issues only
  1285. occurring in older kernel version lines" above.*
You need to carry out a few steps already described in another section of this
guide. Those steps will let you:

 * Check if the kernel developers still maintain the Linux kernel version line
   you care about.

 * Search the Linux stable mailing list for existing reports.

 * Check with the latest release.
  1292. Check code history and search for existing discussions
  1293. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1294. *Search the Linux kernel version control system for the change that fixed
  1295. the issue in mainline, as its commit message might tell you if the fix is
  1296. scheduled for backporting already. If you don't find anything that way,
  1297. search the appropriate mailing lists for posts that discuss such an issue
  1298. or peer-review possible fixes; then check the discussions if the fix was
  1299. deemed unsuitable for backporting. If backporting was not considered at
  1300. all, join the newest discussion, asking if it's in the cards.*
In a lot of cases the issue you deal with will have happened with mainline, but
got fixed there. The commit that fixed it would need to get backported as well
to get the issue solved. That's why you want to search for it or any
discussions about it.
 * First try to find the fix in the Git repository that holds the Linux kernel
   sources. You can do this with the web interfaces `on kernel.org
   <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_
   or its mirror `on GitHub <https://github.com/torvalds/linux>`_; if you have
   a local clone you alternatively can search on the command line with ``git
   log --grep=<pattern>``.

   If you find the fix, check if the commit message near the end contains a
   'stable tag' that looks like this:

          Cc: <stable@vger.kernel.org> # 5.4+

   If that's the case the developer marked the fix safe for backporting to
   version line 5.4 and later. Most of the time it gets applied there within
   two weeks, but sometimes it takes a bit longer.

 * If the commit doesn't tell you anything or if you can't find the fix, look
   again for discussions about the issue. Search the net with your favorite
   internet search engine as well as the archives for the `Linux kernel
   developers mailing list <https://lore.kernel.org/lkml/>`_. Also read the
   section `Locate kernel area that causes the issue` above and follow the
   instructions to find the subsystem in question: its bug tracker or mailing
   list archive might have the answer you are looking for.

 * If you see a proposed fix, search for it in the version control system as
   outlined above, as the commit might tell you if a backport can be expected.

   * Check the discussions for any indicators the fix might be too risky to get
     backported to the version line you care about. If that's the case you have
     to live with the issue or switch to the kernel version line where the fix
     got applied.

   * If the fix doesn't contain a stable tag and backporting was not discussed,
     join the discussion: mention the version where you face the issue and that
     you would like to see it fixed, if suitable.
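
Looking for the stable tag can be automated once you have the commit message as
text, for example from ``git show -s --format=%B <commit>``. The following is a
minimal sketch; the function name and the exact pattern are assumptions made for
this example, so tags written in an unusual way might slip through.

```python
import re
from typing import List

# Matches lines like "Cc: <stable@vger.kernel.org> # 5.4+"; the part after
# the "#" is optional and captured as a note.
STABLE_TAG = re.compile(
    r"^[ \t]*cc:[ \t]*<?stable@(?:vger\.)?kernel\.org>?[ \t]*"
    r"(?:#[ \t]*(?P<note>.*))?$",
    re.IGNORECASE | re.MULTILINE,
)


def stable_tags(commit_message: str) -> List[str]:
    """Return the note of every stable tag found ('' if a tag has none)."""
    return [
        (match.group("note") or "").strip()
        for match in STABLE_TAG.finditer(commit_message)
    ]
```

An empty list means the commit carries no stable tag; an entry like ``5.4+``
tells you from which version line on the fix was marked for backporting.
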
  1333. Ask for advice
  1334. ~~~~~~~~~~~~~~
  1335. *One of the former steps should lead to a solution. If that doesn't work
  1336. out, ask the maintainers for the subsystem that seems to be causing the
  1337. issue for advice; CC the mailing list for the particular subsystem as well
  1338. as the stable mailing list.*
If the previous three steps didn't get you closer to a solution there is only
one option left: ask for advice. Do that in a mail you send to the maintainers
for the subsystem where the issue seems to have its roots; CC the mailing list
for the subsystem as well as the stable mailing list (stable@vger.kernel.org).
  1343. Why some issues won't get any reaction or remain unfixed after being reported
  1344. =============================================================================
When reporting a problem to the Linux developers, be aware only 'issues of high
priority' (regressions, security issues, severe problems) are definitely going
to get resolved. The maintainers or, if all else fails, Linus Torvalds himself
will make sure of that. They and the other kernel developers will fix a lot of
other issues as well. But be aware that sometimes they can't or won't help; and
sometimes there isn't even anyone to send a report to.
  1351. This is best explained with kernel developers that contribute to the Linux
  1352. kernel in their spare time. Quite a few of the drivers in the kernel were
  1353. written by such programmers, often because they simply wanted to make their
  1354. hardware usable on their favorite operating system.
These programmers most of the time will happily fix problems other people
report. But nobody can force them to do so, as they are contributing
voluntarily.

Then there are situations where such developers really want to fix an issue,
but can't: sometimes they lack hardware programming documentation to do so.
This often happens when the publicly available docs are superficial or the
driver was written with the help of reverse engineering.
Sooner or later spare time developers will also stop caring for the driver.
Maybe their test hardware broke, got replaced by something more fancy, or is so
old that it's something you don't find much outside of computer museums
anymore. Sometimes developers stop caring for their code and Linux altogether,
as something different in their life became way more important. In some cases
nobody is willing to take over the job as maintainer, and nobody can be forced
to, as contributing to the Linux kernel is done on a voluntary basis. Abandoned
drivers nevertheless remain in the kernel: they are still useful for people and
removing them would be a regression.
The situation is not that different with developers that are paid for their
work on the Linux kernel. Those contribute most changes these days. But their
employers sooner or later also stop caring for their code or make their
programmers focus on other things. Hardware vendors for example earn their
money mainly by selling new hardware; quite a few of them hence are not
investing much time and energy in maintaining a Linux kernel driver for
something they stopped selling years ago. Enterprise Linux distributors often
care for a longer time period, but in new versions often leave support for old
and rare hardware aside to limit the scope. Often spare time contributors take
over once a company orphans some code, but as mentioned above: sooner or later
they will leave the code behind, too.
  1381. Priorities are another reason why some issues are not fixed, as maintainers
  1382. quite often are forced to set those, as time to work on Linux is limited.
  1383. That's true for spare time or the time employers grant their developers to
  1384. spend on maintenance work on the upstream kernel. Sometimes maintainers also
  1385. get overwhelmed with reports, even if a driver is working nearly perfectly. To
  1386. not get completely stuck, the programmer thus might have no other choice than
  1387. to prioritize issue reports and reject some of them.
But don't worry too much about all of this: a lot of drivers have active
maintainers who are quite interested in fixing as many issues as possible.
  1390. Closing words
  1391. =============
  1392. Compared with other Free/Libre & Open Source Software it's hard to report
  1393. issues to the Linux kernel developers: the length and complexity of this
  1394. document and the implications between the lines illustrate that. But that's how
  1395. it is for now. The main author of this text hopes documenting the state of the
  1396. art will lay some groundwork to improve the situation over time.
  1397. ..
  1398. end-of-content
  1399. ..
  1400. This document is maintained by Thorsten Leemhuis <linux@leemhuis.info>. If
  1401. you spot a typo or small mistake, feel free to let him know directly and
  1402. he'll fix it. You are free to do the same in a mostly informal way if you
  1403. want to contribute changes to the text, but for copyright reasons please CC
  1404. linux-doc@vger.kernel.org and "sign-off" your contribution as
  1405. Documentation/process/submitting-patches.rst outlines in the section "Sign
  1406. your work - the Developer's Certificate of Origin".
  1407. ..
  1408. This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top
  1409. of the file. If you want to distribute this text under CC-BY-4.0 only,
  1410. please use "The Linux kernel developers" for author attribution and link
  1411. this as source:
  1412. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/reporting-issues.rst
  1413. ..
  1414. Note: Only the content of this RST file as found in the Linux kernel sources
  1415. is available under CC-BY-4.0, as versions of this text that were processed
  1416. (for example by the kernel's build system) might contain content taken from
  1417. files which use a more restrictive license.