| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851 |
- .. SPDX-License-Identifier: GPL-2.0
- =========================================
- IAA Compression Accelerator Crypto Driver
- =========================================
- Tom Zanussi <tom.zanussi@linux.intel.com>
- The IAA crypto driver supports compression/decompression compatible
- with the DEFLATE compression standard described in RFC 1951, which is
- the compression/decompression algorithm exported by this module.
- The IAA hardware spec can be found here:
- https://cdrdv2.intel.com/v1/dl/getContent/721858
- The iaa_crypto driver is designed to work as a layer underneath
- higher-level compression devices such as zswap.
- Users can select IAA compress/decompress acceleration by specifying
- one of the supported IAA compression algorithms in whatever facility
- allows compression algorithms to be selected.
- For example, a zswap device can select the IAA 'fixed' mode
- represented by selecting the 'deflate-iaa' crypto compression
- algorithm::
- # echo deflate-iaa > /sys/module/zswap/parameters/compressor
- This will tell zswap to use the IAA 'fixed' compression mode for all
- compresses and decompresses.
- Currently, there is only one compression modes available, 'fixed'
- mode.
- The 'fixed' compression mode implements the compression scheme
- specified by RFC 1951 and is given the crypto algorithm name
- 'deflate-iaa'. (Because the IAA hardware has a 4k history-window
- limitation, only buffers <= 4k, or that have been compressed using a
- <= 4k history window, are technically compliant with the deflate spec,
- which allows for a window of up to 32k. Because of this limitation,
- the IAA fixed mode deflate algorithm is given its own algorithm name
- rather than simply 'deflate').
- Config options and other setup
- ==============================
- The IAA crypto driver is available via menuconfig using the following
- path::
- Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression Accelerator
- In the configuration file the option called CONFIG_CRYPTO_DEV_IAA_CRYPTO.
- The IAA crypto driver also supports statistics, which are available
- via menuconfig using the following path::
- Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression -> Enable Intel(R) IAA Compression Accelerator Statistics
- In the configuration file the option called CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS.
- The following config options should also be enabled::
- CONFIG_IRQ_REMAP=y
- CONFIG_INTEL_IOMMU=y
- CONFIG_INTEL_IOMMU_SVM=y
- CONFIG_PCI_ATS=y
- CONFIG_PCI_PRI=y
- CONFIG_PCI_PASID=y
- CONFIG_INTEL_IDXD=m
- CONFIG_INTEL_IDXD_SVM=y
- IAA is one of the first Intel accelerator IPs that can work in
- conjunction with the Intel IOMMU. There are multiple modes that exist
- for testing. Based on IOMMU configuration, there are 3 modes::
- - Scalable
- - Legacy
- - No IOMMU
- Scalable mode
- -------------
- Scalable mode supports Shared Virtual Memory (SVM or SVA). It is
- entered when using the kernel boot commandline::
- intel_iommu=on,sm_on
- with VT-d turned on in BIOS.
- With scalable mode, both shared and dedicated workqueues are available
- for use.
- For scalable mode, the following BIOS settings should be enabled::
- Socket Configuration > IIO Configuration > Intel VT for Directed I/O (VT-d) > Intel VT for Directed I/O
- Socket Configuration > IIO Configuration > PCIe ENQCMD > ENQCMDS
- Legacy mode
- -----------
- Legacy mode is entered when using the kernel boot commandline::
- intel_iommu=off
- or VT-d is not turned on in BIOS.
- If you have booted into Linux and not sure if VT-d is on, do a "dmesg
- | grep -i dmar". If you don't see a number of DMAR devices enumerated,
- most likely VT-d is not on.
- With legacy mode, only dedicated workqueues are available for use.
- No IOMMU mode
- -------------
- No IOMMU mode is entered when using the kernel boot commandline::
- iommu=off.
- With no IOMMU mode, only dedicated workqueues are available for use.
- Usage
- =====
- accel-config
- ------------
- When loaded, the iaa_crypto driver automatically creates a default
- configuration and enables it, and assigns default driver attributes.
- If a different configuration or set of driver attributes is required,
- the user must first disable the IAA devices and workqueues, reset the
- configuration, and then re-register the deflate-iaa algorithm with the
- crypto subsystem by removing and reinserting the iaa_crypto module.
- The :ref:`iaa_disable_script` in the 'Use Cases'
- section below can be used to disable the default configuration.
- See :ref:`iaa_default_config` below for details of the default
- configuration.
- More likely than not, however, and because of the complexity and
- configurability of the accelerator devices, the user will want to
- configure the device and manually enable the desired devices and
- workqueues.
- The userspace tool to help doing that is called accel-config. Using
- accel-config to configure device or loading a previously saved config
- is highly recommended. The device can be controlled via sysfs
- directly but comes with the warning that you should do this ONLY if
- you know exactly what you are doing. The following sections will not
- cover the sysfs interface but assumes you will be using accel-config.
- The :ref:`iaa_sysfs_config` section in the appendix below can be
- consulted for the sysfs interface details if interested.
- The accel-config tool along with instructions for building it can be
- found here:
- https://github.com/intel/idxd-config/#readme
- Typical usage
- -------------
- In order for the iaa_crypto module to actually do any
- compression/decompression work on behalf of a facility, one or more
- IAA workqueues need to be bound to the iaa_crypto driver.
- For instance, here's an example of configuring an IAA workqueue and
- binding it to the iaa_crypto driver (note that device names are
- specified as 'iax' rather than 'iaa' - this is because upstream still
- has the old 'iax' device naming in place) ::
- # configure wq1.0
- accel-config config-wq --group-id=0 --mode=dedicated --type=kernel --priority=10 --name="iaa_crypto" --driver-name="crypto" iax1/wq1.0
- accel-config config-engine iax1/engine1.0 --group-id=0
- # enable IAA device iax1
- accel-config enable-device iax1
- # enable wq1.0 on IAX device iax1
- accel-config enable-wq iax1/wq1.0
- Whenever a new workqueue is bound to or unbound from the iaa_crypto
- driver, the available workqueues are 'rebalanced' such that work
- submitted from a particular CPU is given to the most appropriate
- workqueue available. Current best practice is to configure and bind
- at least one workqueue for each IAA device, but as long as there is at
- least one workqueue configured and bound to any IAA device in the
- system, the iaa_crypto driver will work, albeit most likely not as
- efficiently.
- The IAA crypto algorigthms is operational and compression and
- decompression operations are fully enabled following the successful
- binding of the first IAA workqueue to the iaa_crypto driver.
- Similarly, the IAA crypto algorithm is not operational and compression
- and decompression operations are disabled following the unbinding of
- the last IAA worqueue to the iaa_crypto driver.
- As a result, the IAA crypto algorithms and thus the IAA hardware are
- only available when one or more workques are bound to the iaa_crypto
- driver.
- When there are no IAA workqueues bound to the driver, the IAA crypto
- algorithms can be unregistered by removing the module.
- Driver attributes
- -----------------
- There are a couple user-configurable driver attributes that can be
- used to configure various modes of operation. They're listed below,
- along with their default values. To set any of these attributes, echo
- the appropriate values to the attribute file located under
- /sys/bus/dsa/drivers/crypto/
- The attribute settings at the time the IAA algorithms are registered
- are captured in each algorithm's crypto_ctx and used for all compresses
- and decompresses when using that algorithm.
- The available attributes are:
- - verify_compress
- Toggle compression verification. If set, each compress will be
- internally decompressed and the contents verified, returning error
- codes if unsuccessful. This can be toggled with 0/1::
- echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress
- The default setting is '1' - verify all compresses.
- - sync_mode
- Select mode to be used to wait for completion of each compresses
- and decompress operation.
- The crypto async interface support implemented by iaa_crypto
- provides an implementation that satisfies the interface but does
- so in a synchronous manner - it fills and submits the IDXD
- descriptor and then loops around waiting for it to complete before
- returning. This isn't a problem at the moment, since all existing
- callers (e.g. zswap) wrap any asynchronous callees in a
- synchronous wrapper anyway.
- The iaa_crypto driver does however provide true asynchronous
- support for callers that can make use of it. In this mode, it
- fills and submits the IDXD descriptor, then returns immediately
- with -EINPROGRESS. The caller can then either poll for completion
- itself, which requires specific code in the caller which currently
- nothing in the upstream kernel implements, or go to sleep and wait
- for an interrupt signaling completion. This latter mode is
- supported by current users in the kernel such as zswap via
- synchronous wrappers. Although it is supported this mode is
- significantly slower than the synchronous mode that does the
- polling in the iaa_crypto driver previously mentioned.
- This mode can be enabled by writing 'async_irq' to the sync_mode
- iaa_crypto driver attribute::
- echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode
- Async mode without interrupts (caller must poll) can be enabled by
- writing 'async' to it (please see Caveat)::
- echo async > /sys/bus/dsa/drivers/crypto/sync_mode
- The mode that does the polling in the iaa_crypto driver can be
- enabled by writing 'sync' to it::
- echo sync > /sys/bus/dsa/drivers/crypto/sync_mode
- The default mode is 'sync'.
- Caveat: since the only mechanism that iaa_crypto currently implements
- for async polling without interrupts is via the 'sync' mode as
- described earlier, writing 'async' to
- '/sys/bus/dsa/drivers/crypto/sync_mode' will internally enable the
- 'sync' mode. This is to ensure correct iaa_crypto behavior until true
- async polling without interrupts is enabled in iaa_crypto.
- .. _iaa_default_config:
- IAA Default Configuration
- -------------------------
- When the iaa_crypto driver is loaded, each IAA device has a single
- work queue configured for it, with the following attributes::
- mode "dedicated"
- threshold 0
- size Total WQ Size from WQCAP
- priority 10
- type IDXD_WQT_KERNEL
- group 0
- name "iaa_crypto"
- driver_name "crypto"
- The devices and workqueues are also enabled and therefore the driver
- is ready to be used without any additional configuration.
- The default driver attributes in effect when the driver is loaded are::
- sync_mode "sync"
- verify_compress 1
- In order to change either the device/work queue or driver attributes,
- the enabled devices and workqueues must first be disabled. In order
- to have the new configuration applied to the deflate-iaa crypto
- algorithm, it needs to be re-registered by removing and reinserting
- the iaa_crypto module. The :ref:`iaa_disable_script` in the 'Use
- Cases' section below can be used to disable the default configuration.
- Statistics
- ==========
- If the optional debugfs statistics support is enabled, the IAA crypto
- driver will generate statistics which can be accessed in debugfs at::
- # ls -al /sys/kernel/debug/iaa-crypto/
- total 0
- drwxr-xr-x 2 root root 0 Mar 3 07:55 .
- drwx------ 53 root root 0 Mar 3 07:55 ..
- -rw-r--r-- 1 root root 0 Mar 3 07:55 global_stats
- -rw-r--r-- 1 root root 0 Mar 3 07:55 stats_reset
- -rw-r--r-- 1 root root 0 Mar 3 07:55 wq_stats
- The global_stats file shows a set of global statistics collected since
- the driver has been loaded or reset::
- # cat global_stats
- global stats:
- total_comp_calls: 4300
- total_decomp_calls: 4164
- total_sw_decomp_calls: 0
- total_comp_bytes_out: 5993989
- total_decomp_bytes_in: 5993989
- total_completion_einval_errors: 0
- total_completion_timeout_errors: 0
- total_completion_comp_buf_overflow_errors: 136
- The wq_stats file shows per-wq stats, a set for each iaa device and wq
- in addition to some global stats::
- # cat wq_stats
- iaa device:
- id: 1
- n_wqs: 1
- comp_calls: 0
- comp_bytes: 0
- decomp_calls: 0
- decomp_bytes: 0
- wqs:
- name: iaa_crypto
- comp_calls: 0
- comp_bytes: 0
- decomp_calls: 0
- decomp_bytes: 0
- iaa device:
- id: 3
- n_wqs: 1
- comp_calls: 0
- comp_bytes: 0
- decomp_calls: 0
- decomp_bytes: 0
- wqs:
- name: iaa_crypto
- comp_calls: 0
- comp_bytes: 0
- decomp_calls: 0
- decomp_bytes: 0
- iaa device:
- id: 5
- n_wqs: 1
- comp_calls: 1360
- comp_bytes: 1999776
- decomp_calls: 0
- decomp_bytes: 0
- wqs:
- name: iaa_crypto
- comp_calls: 1360
- comp_bytes: 1999776
- decomp_calls: 0
- decomp_bytes: 0
- iaa device:
- id: 7
- n_wqs: 1
- comp_calls: 2940
- comp_bytes: 3994213
- decomp_calls: 4164
- decomp_bytes: 5993989
- wqs:
- name: iaa_crypto
- comp_calls: 2940
- comp_bytes: 3994213
- decomp_calls: 4164
- decomp_bytes: 5993989
- ...
- Writing to 'stats_reset' resets all the stats, including the
- per-device and per-wq stats::
- # echo 1 > stats_reset
- # cat wq_stats
- global stats:
- total_comp_calls: 0
- total_decomp_calls: 0
- total_comp_bytes_out: 0
- total_decomp_bytes_in: 0
- total_completion_einval_errors: 0
- total_completion_timeout_errors: 0
- total_completion_comp_buf_overflow_errors: 0
- ...
- Use cases
- =========
- Simple zswap test
- -----------------
- For this example, the kernel should be configured according to the
- dedicated mode options described above, and zswap should be enabled as
- well::
- CONFIG_ZSWAP=y
- This is a simple test that uses iaa_compress as the compressor for a
- swap (zswap) device. It sets up the zswap device and then uses the
- memory_memadvise program listed below to forcibly swap out and in a
- specified number of pages, demonstrating both compress and decompress.
- The zswap test expects the work queues for each IAA device on the
- system to be configured properly as a kernel workqueue with a
- workqueue driver_name of "crypto".
- The first step is to make sure the iaa_crypto module is loaded::
- modprobe iaa_crypto
- If the IAA devices and workqueues haven't previously been disabled and
- reconfigured, then the default configuration should be in place and no
- further IAA configuration is necessary. See :ref:`iaa_default_config`
- below for details of the default configuration.
- If the default configuration is in place, you should see the iaa
- devices and wq0s enabled::
- # cat /sys/bus/dsa/devices/iax1/state
- enabled
- # cat /sys/bus/dsa/devices/iax1/wq1.0/state
- enabled
- To demonstrate that the following steps work as expected, these
- commands can be used to enable debug output::
- # echo -n 'module iaa_crypto +p' > /sys/kernel/debug/dynamic_debug/control
- # echo -n 'module idxd +p' > /sys/kernel/debug/dynamic_debug/control
- Use the following commands to enable zswap::
- # echo 0 > /sys/module/zswap/parameters/enabled
- # echo 50 > /sys/module/zswap/parameters/max_pool_percent
- # echo deflate-iaa > /sys/module/zswap/parameters/compressor
- # echo zsmalloc > /sys/module/zswap/parameters/zpool
- # echo 1 > /sys/module/zswap/parameters/enabled
- # echo 100 > /proc/sys/vm/swappiness
- # echo never > /sys/kernel/mm/transparent_hugepage/enabled
- # echo 1 > /proc/sys/vm/overcommit_memory
- Now you can now run the zswap workload you want to measure. For
- example, using the memory_memadvise code below, the following command
- will swap in and out 100 pages::
- ./memory_madvise 100
- Allocating 100 pages to swap in/out
- Swapping out 100 pages
- Swapping in 100 pages
- Swapped out and in 100 pages
- You should see something like the following in the dmesg output::
- [ 404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
- [ 404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
- [ 404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
- [ 404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
- ...
- Now that basic functionality has been demonstrated, the defaults can
- be erased and replaced with a different configuration. To do that,
- first disable zswap::
- # echo lzo > /sys/module/zswap/parameters/compressor
- # swapoff -a
- # echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
- # echo 0 > /sys/module/zswap/parameters/max_pool_percent
- # echo 0 > /sys/module/zswap/parameters/enabled
- # echo 0 > /sys/module/zswap/parameters/enabled
- Then run the :ref:`iaa_disable_script` in the 'Use Cases' section
- below to disable the default configuration.
- Finally turn swap back on::
- # swapon -a
- Following all that the IAA device(s) can now be re-configured and
- enabled as desired for further testing. Below is one example.
- The zswap test expects the work queues for each IAA device on the
- system to be configured properly as a kernel workqueue with a
- workqueue driver_name of "crypto".
- The below script automatically does that::
- #!/bin/bash
- echo "IAA devices:"
- lspci -d:0cfe
- echo "# IAA devices:"
- lspci -d:0cfe | wc -l
- #
- # count iaa instances
- #
- iaa_dev_id="0cfe"
- num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
- echo "Found ${num_iaa} IAA instances"
- #
- # disable iaa wqs and devices
- #
- echo "Disable IAA"
- for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
- echo disable wq iax${i}/wq${i}.0
- accel-config disable-wq iax${i}/wq${i}.0
- echo disable iaa iax${i}
- accel-config disable-device iax${i}
- done
- echo "End Disable IAA"
- echo "Reload iaa_crypto module"
- rmmod iaa_crypto
- modprobe iaa_crypto
- echo "End Reload iaa_crypto module"
- #
- # configure iaa wqs and devices
- #
- echo "Configure IAA"
- for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
- accel-config config-wq --group-id=0 --mode=dedicated --wq-size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver-name="crypto" iax${i}/wq${i}.0
- accel-config config-engine iax${i}/engine${i}.0 --group-id=0
- done
- echo "End Configure IAA"
- #
- # enable iaa wqs and devices
- #
- echo "Enable IAA"
- for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
- echo enable iaa iax${i}
- accel-config enable-device iax${i}
- echo enable wq iax${i}/wq${i}.0
- accel-config enable-wq iax${i}/wq${i}.0
- done
- echo "End Enable IAA"
- When the workqueues are bound to the iaa_crypto driver, you should
- see something similar to the following in dmesg output if you've
- enabled debug output (echo -n 'module iaa_crypto +p' >
- /sys/kernel/debug/dynamic_debug/control)::
- [ 60.752344] idxd 0000:f6:02.0: add_iaa_wq: added wq 000000004068d14d to iaa 00000000c9585ba2, n_wq 1
- [ 60.752346] iaa_crypto: rebalance_wq_table: nr_nodes=2, nr_cpus 160, nr_iaa 8, cpus_per_iaa 20
- [ 60.752347] iaa_crypto: rebalance_wq_table: iaa=0
- [ 60.752349] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
- [ 60.752350] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
- [ 60.752352] iaa_crypto: rebalance_wq_table: assigned wq for cpu=0, node=0 = wq 00000000c8bb4452
- [ 60.752354] iaa_crypto: rebalance_wq_table: iaa=0
- [ 60.752355] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
- [ 60.752356] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
- [ 60.752358] iaa_crypto: rebalance_wq_table: assigned wq for cpu=1, node=0 = wq 00000000c8bb4452
- [ 60.752359] iaa_crypto: rebalance_wq_table: iaa=0
- [ 60.752360] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
- [ 60.752361] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
- [ 60.752362] iaa_crypto: rebalance_wq_table: assigned wq for cpu=2, node=0 = wq 00000000c8bb4452
- [ 60.752364] iaa_crypto: rebalance_wq_table: iaa=0
- .
- .
- .
- Once the workqueues and devices have been enabled, the IAA crypto
- algorithms are enabled and available. When the IAA crypto algorithms
- have been successfully enabled, you should see the following dmesg
- output::
- [ 64.893759] iaa_crypto: iaa_crypto_enable: iaa_crypto now ENABLED
- Now run the following zswap-specific setup commands to have zswap use
- the 'fixed' compression mode::
- echo 0 > /sys/module/zswap/parameters/enabled
- echo 50 > /sys/module/zswap/parameters/max_pool_percent
- echo deflate-iaa > /sys/module/zswap/parameters/compressor
- echo zsmalloc > /sys/module/zswap/parameters/zpool
- echo 1 > /sys/module/zswap/parameters/enabled
- echo 100 > /proc/sys/vm/swappiness
- echo never > /sys/kernel/mm/transparent_hugepage/enabled
- echo 1 > /proc/sys/vm/overcommit_memory
- Finally, you can now run the zswap workload you want to measure. For
- example, using the code below, the following command will swap in and
- out 100 pages::
- ./memory_madvise 100
- Allocating 100 pages to swap in/out
- Swapping out 100 pages
- Swapping in 100 pages
- Swapped out and in 100 pages
- You should see something like the following in the dmesg output if
- you've enabled debug output (echo -n 'module iaa_crypto +p' >
- /sys/kernel/debug/dynamic_debug/control)::
- [ 404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
- [ 404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
- [ 404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
- [ 404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
- [ 409.203227] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
- [ 409.203235] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21ee3dc000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
- [ 409.203239] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21ee3dc000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
- [ 409.203254] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
- [ 409.203256] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21f1551000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
- [ 409.203257] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21f1551000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
- In order to unregister the IAA crypto algorithms, and register new
- ones using different parameters, any users of the current algorithm
- should be stopped and the IAA workqueues and devices disabled.
- In the case of zswap, remove the IAA crypto algorithm as the
- compressor and turn off swap (to remove all references to
- iaa_crypto)::
- echo lzo > /sys/module/zswap/parameters/compressor
- swapoff -a
- echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
- echo 0 > /sys/module/zswap/parameters/max_pool_percent
- echo 0 > /sys/module/zswap/parameters/enabled
- Once zswap is disabled and no longer using iaa_crypto, the IAA wqs and
- devices can be disabled.
- .. _iaa_disable_script:
- IAA disable script
- ------------------
- The below script automatically does that::
- #!/bin/bash
- echo "IAA devices:"
- lspci -d:0cfe
- echo "# IAA devices:"
- lspci -d:0cfe | wc -l
- #
- # count iaa instances
- #
- iaa_dev_id="0cfe"
- num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
- echo "Found ${num_iaa} IAA instances"
- #
- # disable iaa wqs and devices
- #
- echo "Disable IAA"
- for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
- echo disable wq iax${i}/wq${i}.0
- accel-config disable-wq iax${i}/wq${i}.0
- echo disable iaa iax${i}
- accel-config disable-device iax${i}
- done
- echo "End Disable IAA"
- Finally, at this point the iaa_crypto module can be removed, which
- will unregister the current IAA crypto algorithms::
- rmmod iaa_crypto
- memory_madvise.c (gcc -o memory_memadvise memory_madvise.c)::
- #include <stdio.h>
- #include <stdlib.h>
- #include <string.h>
- #include <unistd.h>
- #include <sys/mman.h>
- #include <linux/mman.h>
- #ifndef MADV_PAGEOUT
- #define MADV_PAGEOUT 21 /* force pages out immediately */
- #endif
- #define PG_SZ 4096
- int main(int argc, char **argv)
- {
- int i, nr_pages = 1;
- int64_t *dump_ptr;
- char *addr, *a;
- int loop = 1;
- if (argc > 1)
- nr_pages = atoi(argv[1]);
- printf("Allocating %d pages to swap in/out\n", nr_pages);
- /* allocate pages */
- addr = mmap(NULL, nr_pages * PG_SZ, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
- *addr = 1;
- /* initialize data in page to all '*' chars */
- memset(addr, '*', nr_pages * PG_SZ);
- printf("Swapping out %d pages\n", nr_pages);
- /* Tell kernel to swap it out */
- madvise(addr, nr_pages * PG_SZ, MADV_PAGEOUT);
- while (loop > 0) {
- /* Wait for swap out to finish */
- sleep(5);
- a = addr;
- printf("Swapping in %d pages\n", nr_pages);
- /* Access the page ... this will swap it back in again */
- for (i = 0; i < nr_pages; i++) {
- if (a[0] != '*') {
- printf("Bad data from decompress!!!!!\n");
- dump_ptr = (int64_t *)a;
- for (int j = 0; j < 100; j++) {
- printf(" page %d data: %#llx\n", i, *dump_ptr);
- dump_ptr++;
- }
- }
- a += PG_SZ;
- }
- loop --;
- }
- printf("Swapped out and in %d pages\n", nr_pages);
- Appendix
- ========
- .. _iaa_sysfs_config:
- IAA sysfs config interface
- --------------------------
- Below is a description of the IAA sysfs interface, which as mentioned
- in the main document, should only be used if you know exactly what you
- are doing. Even then, there's no compelling reason to use it directly
- since accel-config can do everything the sysfs interface can and in
- fact accel-config is based on it under the covers.
- The 'IAA config path' is /sys/bus/dsa/devices and contains
- subdirectories representing each IAA device, workqueue, engine, and
- group. Note that in the sysfs interface, the IAA devices are actually
- named using iax e.g. iax1, iax3, etc. (Note that IAA devices are the
- odd-numbered devices; the even-numbered devices are DSA devices and
- can be ignored for IAA).
- The 'IAA device bind path' is /sys/bus/dsa/drivers/idxd/bind and is
- the file that is written to enable an IAA device.
- The 'IAA workqueue bind path' is /sys/bus/dsa/drivers/crypto/bind and
- is the file that is written to enable an IAA workqueue.
- Similarly /sys/bus/dsa/drivers/idxd/unbind and
- /sys/bus/dsa/drivers/crypto/unbind are used to disable IAA devices and
- workqueues.
- The basic sequence of commands needed to set up the IAA devices and
- workqueues is:
- For each device::
- 1) Disable any workqueues enabled on the device. For example to
- disable workques 0 and 1 on IAA device 3::
- # echo wq3.0 > /sys/bus/dsa/drivers/crypto/unbind
- # echo wq3.1 > /sys/bus/dsa/drivers/crypto/unbind
- 2) Disable the device. For example to disable IAA device 3::
- # echo iax3 > /sys/bus/dsa/drivers/idxd/unbind
- 3) configure the desired workqueues. For example, to configure
- workqueue 3 on IAA device 3::
- # echo dedicated > /sys/bus/dsa/devices/iax3/wq3.3/mode
- # echo 128 > /sys/bus/dsa/devices/iax3/wq3.3/size
- # echo 0 > /sys/bus/dsa/devices/iax3/wq3.3/group_id
- # echo 10 > /sys/bus/dsa/devices/iax3/wq3.3/priority
- # echo "kernel" > /sys/bus/dsa/devices/iax3/wq3.3/type
- # echo "iaa_crypto" > /sys/bus/dsa/devices/iax3/wq3.3/name
- # echo "crypto" > /sys/bus/dsa/devices/iax3/wq3.3/driver_name
- 4) Enable the device. For example to enable IAA device 3::
- # echo iax3 > /sys/bus/dsa/drivers/idxd/bind
- 5) Enable the desired workqueues on the device. For example to
- enable workques 0 and 1 on IAA device 3::
- # echo wq3.0 > /sys/bus/dsa/drivers/crypto/bind
- # echo wq3.1 > /sys/bus/dsa/drivers/crypto/bind
|