- .. SPDX-License-Identifier: GPL-2.0
- .. include:: <isonum.txt>
- =========================
- System Suspend Code Flows
- =========================
- :Copyright: |copy| 2020 Intel Corporation
- :Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- At least one global system-wide transition needs to be carried out for the
- system to get from the working state into one of the supported
- :doc:`sleep states <sleep-states>`. Hibernation requires more than one
- transition to occur for this purpose, but the other sleep states, commonly
- referred to as *system-wide suspend* (or simply *system suspend*) states, need
- only one.
- For those sleep states, the transition from the working state of the system into
- the target sleep state is referred to as *system suspend* too (in the majority
- of cases, whether this means a transition or a sleep state of the system should
- be clear from the context) and the transition back from the sleep state into the
- working state is referred to as *system resume*.
- The kernel code flows associated with the suspend and resume transitions for
- different sleep states of the system are quite similar, but there are some
- significant differences between the :ref:`suspend-to-idle <s2idle>` code flows
- and the code flows related to the :ref:`suspend-to-RAM <s2ram>` and
- :ref:`standby <standby>` sleep states.
- The :ref:`suspend-to-RAM <s2ram>` and :ref:`standby <standby>` sleep states
- cannot be implemented without platform support and the difference between them
- boils down to the platform-specific actions carried out by the suspend and
- resume hooks that need to be provided by the platform driver to make them
- available. Apart from that, the suspend and resume code flows for these sleep
- states are mostly identical, so the two are referred to collectively as
- *platform-dependent suspend* states in what follows.
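To see which of these variants a given system offers, the kernel reports them in ``/sys/power/mem_sleep`` using the labels ``s2idle``, ``shallow`` and ``deep`` (for suspend-to-idle, standby and suspend-to-RAM, respectively), with the currently selected one in square brackets. The sketch below parses text in that format and tells which code flow applies. The function names are hypothetical, and the code takes the text as an argument instead of reading sysfs, so it can be tried on any machine.

```python
# Labels used by /sys/power/mem_sleep and the sleep states they stand for.
MEM_SLEEP_LABELS = {
    "s2idle": "suspend-to-idle",
    "shallow": "standby",
    "deep": "suspend-to-RAM",
}

# The states handled by the platform-dependent suspend and resume flows.
PLATFORM_DEPENDENT = {"standby", "suspend-to-RAM"}


def parse_mem_sleep(text):
    """Return (available_states, selected_state) for mem_sleep-style text."""
    available, selected = [], None
    for token in text.split():
        is_selected = token.startswith("[") and token.endswith("]")
        label = token.strip("[]")
        state = MEM_SLEEP_LABELS.get(label, label)
        available.append(state)
        if is_selected:
            selected = state
    return available, selected


def uses_platform_dependent_flow(state):
    """True if the state follows the platform-dependent code flows."""
    return state in PLATFORM_DEPENDENT
```

For example, the text ``s2idle [deep]`` describes a system that supports suspend-to-idle and suspend-to-RAM and currently selects the latter, so the platform-dependent flows would be used.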
- .. _s2idle_suspend:
- Suspend-to-idle Suspend Code Flow
- =================================
- The following steps are taken in order to transition the system from the working
- state to the :ref:`suspend-to-idle <s2idle>` sleep state:
- 1. Invoking system-wide suspend notifiers.
- Kernel subsystems can register callbacks to be invoked when the suspend
- transition is about to occur and when the resume transition has finished.
- That allows them to prepare for the change of the system state and to clean
- up after getting back to the working state.
- 2. Freezing tasks.
- Tasks are frozen primarily in order to avoid unchecked hardware accesses
- from user space through MMIO regions or I/O registers exposed directly to
- it and to prevent user space from entering the kernel while the next step
- of the transition is in progress (which could be problematic for
- various reasons).
- All user space tasks are intercepted as though they were sent a signal and
- put into uninterruptible sleep until the end of the subsequent system resume
- transition.
- The kernel threads that choose to be frozen during system suspend for
- specific reasons are frozen subsequently, but they are not intercepted.
- Instead, they are expected to periodically check whether or not they need
- to be frozen and to put themselves into uninterruptible sleep if so. [Note,
- however, that kernel threads can use locking and other concurrency controls
- available in kernel space to synchronize themselves with system suspend and
- resume, which can be much more precise than the freezing, so the latter is
- not a recommended option for kernel threads.]
- 3. Suspending devices and reconfiguring IRQs.
- Devices are suspended in four phases called *prepare*, *suspend*,
- *late suspend* and *noirq suspend* (see :ref:`driverapi_pm_devices` for more
- information on what exactly happens in each phase).
- Every device is visited in each phase, but typically it is not physically
- accessed in more than two of them.
- The runtime PM API is disabled for every device during the *late* suspend
- phase and high-level ("action") interrupt handlers are prevented from being
- invoked before the *noirq* suspend phase.
- Interrupts are still handled after that, but they are only acknowledged to
- interrupt controllers without performing any device-specific actions that
- would be triggered in the working state of the system (those actions are
- deferred till the subsequent system resume transition as described
- `below <s2idle_resume_>`_).
- IRQs associated with system wakeup devices are "armed" so that the resume
- transition of the system is started when one of them signals an event.
- 4. Freezing the scheduler tick and suspending timekeeping.
- When all devices have been suspended, CPUs enter the idle loop and are put
- into the deepest available idle state. While doing that, each of them
- "freezes" its own scheduler tick so that the timer events associated with
- the tick do not occur until the CPU is woken up by another interrupt source.
- The last CPU to enter the idle state also stops the timekeeping which
- (among other things) prevents high resolution timers from triggering going
- forward until the first CPU that is woken up restarts the timekeeping.
- That allows the CPUs to stay in the deep idle state relatively long in one
- go.
- From this point on, the CPUs can only be woken up by non-timer hardware
- interrupts. If that happens, they go back to the idle state unless the
- interrupt that woke up one of them comes from an IRQ that has been armed for
- system wakeup, in which case the system resume transition is started.
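The interplay between per-CPU tick freezing, shared timekeeping and armed wakeup IRQs in step 4 can be illustrated with a small userspace model. This is only a sketch of the behavior described above, not kernel code: the class, its methods and the IRQ numbers are all hypothetical.

```python
class S2IdleModel:
    """Toy model of step 4: per-CPU tick freezing and shared timekeeping."""

    def __init__(self, num_cpus, wakeup_irqs):
        self.num_cpus = num_cpus
        self.wakeup_irqs = set(wakeup_irqs)  # IRQs armed for system wakeup
        self.frozen = set()                  # CPUs whose tick is frozen
        self.timekeeping_running = True
        self.resume_started = False

    def enter_idle(self, cpu):
        """A CPU enters the idle loop and freezes its own scheduler tick."""
        self.frozen.add(cpu)
        if len(self.frozen) == self.num_cpus:
            # The last CPU to go idle also stops the timekeeping.
            self.timekeeping_running = False

    def interrupt(self, cpu, irq):
        """A non-timer hardware interrupt wakes up one CPU."""
        # The first CPU to wake up restarts the timekeeping.
        self.timekeeping_running = True
        self.frozen.discard(cpu)
        if irq in self.wakeup_irqs:
            self.resume_started = True
        else:
            # Not a wakeup IRQ: the CPU goes back to the idle state.
            self.enter_idle(cpu)
        return self.resume_started
```

In this model, an interrupt from an IRQ that is not armed for wakeup leaves the system exactly as it was before the interrupt, while one from an armed IRQ leaves the woken CPU out of idle with the timekeeping running, which is the starting point of the resume transition.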
- .. _s2idle_resume:
- Suspend-to-idle Resume Code Flow
- ================================
- The following steps are taken in order to transition the system from the
- :ref:`suspend-to-idle <s2idle>` sleep state into the working state:
- 1. Resuming timekeeping and unfreezing the scheduler tick.
- When one of the CPUs is woken up (by a non-timer hardware interrupt), it
- leaves the idle state entered in the last step of the preceding suspend
- transition, restarts the timekeeping (unless it has been restarted already
- by another CPU that woke up earlier) and the scheduler tick on that CPU is
- unfrozen.
- If the interrupt that has woken up the CPU was armed for system wakeup,
- the system resume transition begins.
- 2. Resuming devices and restoring the working-state configuration of IRQs.
- Devices are resumed in four phases called *noirq resume*, *early resume*,
- *resume* and *complete* (see :ref:`driverapi_pm_devices` for more
- information on what exactly happens in each phase).
- Every device is visited in each phase, but typically it is not physically
- accessed in more than two of them.
- The working-state configuration of IRQs is restored after the *noirq* resume
- phase and the runtime PM API is re-enabled for every device whose driver
- supports it during the *early* resume phase.
- 3. Thawing tasks.
- Tasks frozen in step 2 of the preceding `suspend <s2idle_suspend_>`_
- transition are "thawed", which means that they are woken up from the
- uninterruptible sleep that they went into at that time and user space tasks
- are allowed to exit the kernel.
- 4. Invoking system-wide resume notifiers.
- This is analogous to step 1 of the `suspend <s2idle_suspend_>`_ transition
- and the same set of callbacks is invoked at this point, but a different
- "notification type" parameter value is passed to them.
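The notifier mechanism used in step 1 of the suspend transition and in step 4 above can be sketched as a list of callbacks that all receive the same event value. In the kernel, the event values passed around system suspend are named ``PM_SUSPEND_PREPARE`` and ``PM_POST_SUSPEND``; everything else below (the ``NotifierChain`` class and the cache-owning subscriber) is a hypothetical userspace illustration, not the kernel implementation.

```python
PM_SUSPEND_PREPARE = "PM_SUSPEND_PREPARE"  # passed before the suspend transition
PM_POST_SUSPEND = "PM_POST_SUSPEND"        # passed after the resume transition


class NotifierChain:
    """Hypothetical stand-in for the list of registered PM notifiers."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def call(self, event):
        # Every registered callback sees every event; only the value differs.
        for callback in self._callbacks:
            callback(event)


def make_cache_owner(log):
    """Example subscriber: prepare before suspend, clean up after resume."""
    def on_pm_event(event):
        if event == PM_SUSPEND_PREPARE:
            log.append("flush cache")
        elif event == PM_POST_SUSPEND:
            log.append("rebuild cache")
    return on_pm_event
```

A subsystem registers one callback and branches on the event value, so the same function serves both the suspend-side and the resume-side notification.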
- Platform-dependent Suspend Code Flow
- ====================================
- The following steps are taken in order to transition the system from the working
- state to a platform-dependent suspend state:
- 1. Invoking system-wide suspend notifiers.
- This step is the same as step 1 of the suspend-to-idle suspend transition
- described `above <s2idle_suspend_>`_.
- 2. Freezing tasks.
- This step is the same as step 2 of the suspend-to-idle suspend transition
- described `above <s2idle_suspend_>`_.
- 3. Suspending devices and reconfiguring IRQs.
- This step is analogous to step 3 of the suspend-to-idle suspend transition
- described `above <s2idle_suspend_>`_, but the arming of IRQs for system
- wakeup generally does not have any effect on the platform.
- There are platforms that can go into a very deep low-power state internally
- when all CPUs in them are in sufficiently deep idle states and all I/O
- devices have been put into low-power states. On those platforms,
- suspend-to-idle can reduce system power very effectively.
- On the other platforms, however, low-level components (like interrupt
- controllers) need to be turned off in a platform-specific way (implemented
- in the hooks provided by the platform driver) to achieve comparable power
- reduction.
- That usually prevents in-band hardware interrupts from waking up the system,
- so system wakeup must be arranged in a special platform-dependent way. In that
- case, the configuration of system wakeup sources usually starts when system
- wakeup devices are suspended and is finalized by the platform suspend hooks
- later on.
- 4. Disabling non-boot CPUs.
- On some platforms the suspend hooks mentioned above must run in a one-CPU
- configuration of the system (in particular, the hardware cannot be accessed
- by any code running in parallel with the platform suspend hooks that may,
- and often do, trap into the platform firmware in order to finalize the
- suspend transition).
- For this reason, the CPU offline/online (CPU hotplug) framework is used
- to take all of the CPUs in the system, except for one (the boot CPU),
- offline (typically, the CPUs that have been taken offline go into deep idle
- states).
- This means that all tasks are migrated away from those CPUs and all IRQs are
- rerouted to the only CPU that remains online.
- 5. Suspending core system components.
- This prepares the core system components for (possibly) losing power going
- forward and suspends the timekeeping.
- 6. Platform-specific power removal.
- This is expected to remove power from all of the system components except
- for the memory controller and RAM (in order to preserve the contents of the
- latter) and some devices designated for system wakeup.
- In many cases control is passed to the platform firmware which is expected
- to finalize the suspend transition as needed.
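The effect of step 4 on tasks and IRQs can be shown with a small model. The sketch below is a userspace illustration under simplifying assumptions: the function name mirrors the step title, the two dictionaries are hypothetical data structures, and the boot CPU is assumed to be CPU 0.

```python
def disable_nonboot_cpus(tasks_by_cpu, irq_affinity, boot_cpu=0):
    """Toy model of step 4: leave only the boot CPU online.

    tasks_by_cpu maps a CPU number to the list of tasks placed on it and
    irq_affinity maps an IRQ number to the CPU that handles it.  Both are
    hypothetical data structures used only for this illustration.
    """
    online = {boot_cpu: list(tasks_by_cpu.get(boot_cpu, []))}
    for cpu in sorted(tasks_by_cpu):
        if cpu != boot_cpu:
            # Tasks are migrated away from the CPU being taken offline.
            online[boot_cpu].extend(tasks_by_cpu[cpu])
    # All IRQs are rerouted to the only CPU that remains online.
    rerouted = {irq: boot_cpu for irq in irq_affinity}
    return online, rerouted
```

After the call, the returned mappings mention only the boot CPU, which is the one-CPU configuration that the platform suspend hooks may require.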
- Platform-dependent Resume Code Flow
- ===================================
- The following steps are taken in order to transition the system from a
- platform-dependent suspend state into the working state:
- 1. Platform-specific system wakeup.
- The platform is woken up by a signal from one of the designated system
- wakeup devices (which need not be an in-band hardware interrupt) and
- control is passed back to the kernel (the working configuration of the
- platform may need to be restored by the platform firmware before the
- kernel gets control again).
- 2. Resuming core system components.
- The suspend-time configuration of the core system components is restored and
- the timekeeping is resumed.
- 3. Re-enabling non-boot CPUs.
- The CPUs disabled in step 4 of the preceding suspend transition are taken
- back online and their suspend-time configuration is restored.
- 4. Resuming devices and restoring the working-state configuration of IRQs.
- This step is the same as step 2 of the suspend-to-idle resume transition
- described `above <s2idle_resume_>`_.
- 5. Thawing tasks.
- This step is the same as step 3 of the suspend-to-idle resume transition
- described `above <s2idle_resume_>`_.
- 6. Invoking system-wide resume notifiers.
- This step is the same as step 4 of the suspend-to-idle resume transition
- described `above <s2idle_resume_>`_.
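Taken together, the platform-dependent flows are mirror images of each other: each resume step undoes the suspend step in the opposite position, and the same holds for the device suspend and resume phases. The sketch below records the step and phase names from this document as plain lists and pairs them up. The short step labels and the helper name are chosen for illustration only.

```python
# Steps of the platform-dependent flows, in the order given in this document.
SUSPEND_STEPS = [
    "invoke suspend notifiers",
    "freeze tasks",
    "suspend devices",
    "disable non-boot CPUs",
    "suspend core system components",
    "platform-specific power removal",
]

RESUME_STEPS = [
    "platform-specific system wakeup",
    "resume core system components",
    "re-enable non-boot CPUs",
    "resume devices",
    "thaw tasks",
    "invoke resume notifiers",
]

# Device phases used within the "suspend devices" and "resume devices" steps.
SUSPEND_PHASES = ["prepare", "suspend", "late suspend", "noirq suspend"]
RESUME_PHASES = ["noirq resume", "early resume", "resume", "complete"]


def undo_order(suspend_items, resume_items):
    """Pair each suspend item with the resume item that undoes it."""
    return list(zip(suspend_items, reversed(resume_items)))
```

Calling ``undo_order(SUSPEND_STEPS, RESUME_STEPS)`` pairs the first suspend step with the last resume step and so on, which makes the symmetry of the two flows explicit.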