  1. ======================================
  2. Coresight - HW Assisted Tracing on ARM
  3. ======================================
  4. :Author: Mathieu Poirier <mathieu.poirier@linaro.org>
  5. :Date: September 11th, 2014
  6. Introduction
  7. ------------
CoreSight is an umbrella of technologies allowing for the debugging of ARM-based
SoCs. It includes solutions for JTAG and HW assisted tracing. This
document is concerned with the latter.
  11. HW assisted tracing is becoming increasingly useful when dealing with systems
  12. that have many SoCs and other components like GPU and DMA engines. ARM has
  13. developed a HW assisted tracing solution by means of different components, each
  14. being added to a design at synthesis time to cater to specific tracing needs.
Components are generally categorised as sources, links and sinks and are
(usually) discovered using the AMBA bus.
"Sources" generate a compressed stream representing the processor instruction
path based on tracing scenarios as configured by users. From there the stream
flows through the CoreSight system (via the ATB bus) using links that connect
the emanating source to one or more sinks. Sinks serve as endpoints to the CoreSight
implementation, either storing the compressed stream in a memory buffer or
creating an interface to the outside world where data can be transferred to a
host without fear of filling up the onboard CoreSight memory buffer.
A typical CoreSight system would look like this::
  25. *****************************************************************
  26. **************************** AMBA AXI ****************************===||
  27. ***************************************************************** ||
  28. ^ ^ | ||
  29. | | * **
  30. 0000000 ::::: 0000000 ::::: ::::: @@@@@@@ ||||||||||||
  31. 0 CPU 0<-->: C : 0 CPU 0<-->: C : : C : @ STM @ || System ||
  32. |->0000000 : T : |->0000000 : T : : T :<--->@@@@@ || Memory ||
  33. | #######<-->: I : | #######<-->: I : : I : @@@<-| ||||||||||||
  34. | # ETM # ::::: | # PTM # ::::: ::::: @ |
  35. | ##### ^ ^ | ##### ^ ! ^ ! . | |||||||||
  36. | |->### | ! | |->### | ! | ! . | || DAP ||
  37. | | # | ! | | # | ! | ! . | |||||||||
  38. | | . | ! | | . | ! | ! . | | |
  39. | | . | ! | | . | ! | ! . | | *
  40. | | . | ! | | . | ! | ! . | | SWD/
  41. | | . | ! | | . | ! | ! . | | JTAG
  42. *****************************************************************<-|
  43. *************************** AMBA Debug APB ************************
  44. *****************************************************************
  45. | . ! . ! ! . |
  46. | . * . * * . |
  47. *****************************************************************
  48. ******************** Cross Trigger Matrix (CTM) *******************
  49. *****************************************************************
  50. | . ^ . . |
  51. | * ! * * |
  52. *****************************************************************
  53. ****************** AMBA Advanced Trace Bus (ATB) ******************
  54. *****************************************************************
  55. | ! =============== |
  56. | * ===== F =====<---------|
  57. | ::::::::: ==== U ====
  58. |-->:: CTI ::<!! === N ===
  59. | ::::::::: ! == N ==
  60. | ^ * == E ==
  61. | ! &&&&&&&&& IIIIIII == L ==
  62. |------>&& ETB &&<......II I =======
  63. | ! &&&&&&&&& II I .
  64. | ! I I .
  65. | ! I REP I<..........
  66. | ! I I
  67. | !!>&&&&&&&&& II I *Source: ARM ltd.
  68. |------>& TPIU &<......II I DAP = Debug Access Port
  69. &&&&&&&&& IIIIIII ETM = Embedded Trace Macrocell
  70. ; PTM = Program Trace Macrocell
  71. ; CTI = Cross Trigger Interface
  72. * ETB = Embedded Trace Buffer
  73. To trace port TPIU= Trace Port Interface Unit
  74. SWD = Serial Wire Debug
While on-target configuration of the components is done via the APB bus,
all trace data are carried out-of-band on the ATB bus. The CTM provides
a way to aggregate and distribute signals between CoreSight components.

The CoreSight framework provides a central point to represent, configure and
manage CoreSight devices on a platform. This first implementation centers on
the basic tracing functionality, enabling components such as ETM/PTM, funnel,
replicator, TMC, TPIU and ETB. Future work will enable more
intricate IP blocks such as STM and CTI.
  83. Acronyms and Classification
  84. ---------------------------
  85. Acronyms:
  86. PTM:
  87. Program Trace Macrocell
  88. ETM:
  89. Embedded Trace Macrocell
  90. STM:
    System Trace Macrocell
  92. ETB:
  93. Embedded Trace Buffer
  94. ITM:
  95. Instrumentation Trace Macrocell
  96. TPIU:
  97. Trace Port Interface Unit
  98. TMC-ETR:
  99. Trace Memory Controller, configured as Embedded Trace Router
  100. TMC-ETF:
  101. Trace Memory Controller, configured as Embedded Trace FIFO
  102. CTI:
  103. Cross Trigger Interface
  104. Classification:
  105. Source:
    ETMv3.x, ETMv4, PTMv1.0, PTMv1.1, STM, STM500, ITM
  107. Link:
  108. Funnel, replicator (intelligent or not), TMC-ETR
  109. Sinks:
    ETBv1.0, ETBv1.1, TPIU, TMC-ETF
  111. Misc:
  112. CTI
  113. Device Tree Bindings
  114. --------------------
  115. See ``Documentation/devicetree/bindings/arm/arm,coresight-*.yaml`` for details.
  116. As of this writing drivers for ITM, STMs and CTIs are not provided but are
  117. expected to be added as the solution matures.
  118. Framework and implementation
  119. ----------------------------
The CoreSight framework provides a central point to represent, configure and
manage CoreSight devices on a platform. Any CoreSight compliant device can
register with the framework as long as it uses the right APIs:
  123. .. c:function:: struct coresight_device *coresight_register(struct coresight_desc *desc);
  124. .. c:function:: void coresight_unregister(struct coresight_device *csdev);
The registration function takes a ``struct coresight_desc *desc`` and
registers the device with the core framework. The unregister function takes
a reference to a ``struct coresight_device *csdev`` obtained at registration time.

If everything goes well during the registration process the new devices will
show up under /sys/bus/coresight/devices, as shown here for a TC2 platform::
  130. root:~# ls /sys/bus/coresight/devices/
  131. replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
  132. 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm
  133. root:~#
The registration function takes a ``struct coresight_desc``, which looks like this::
  135. struct coresight_desc {
  136. enum coresight_dev_type type;
  137. struct coresight_dev_subtype subtype;
  138. const struct coresight_ops *ops;
  139. struct coresight_platform_data *pdata;
  140. struct device *dev;
  141. const struct attribute_group **groups;
  142. };
The "coresight_dev_type" identifies what the device is, i.e., source, link or
sink, while the "coresight_dev_subtype" characterises that type further.

The ``struct coresight_ops`` is mandatory and tells the framework how to
perform base operations related to the components, each component having
a different set of requirements. For that purpose ``struct coresight_ops_sink``,
``struct coresight_ops_link`` and ``struct coresight_ops_source`` have been
provided.
  150. The next field ``struct coresight_platform_data *pdata`` is acquired by calling
  151. ``of_get_coresight_platform_data()``, as part of the driver's _probe routine and
  152. ``struct device *dev`` gets the device reference embedded in the ``amba_device``::
  153. static int etm_probe(struct amba_device *adev, const struct amba_id *id)
  154. {
  155. ...
  156. ...
  157. drvdata->dev = &adev->dev;
  158. ...
  159. }
Each class of device (source, link, or sink) has generic operations
that can be performed on it (see ``struct coresight_ops``). The ``**groups``
field is a list of sysfs entries pertaining to operations
specific to that component only. "Implementation defined" customisations are
expected to be accessed and controlled using those entries.
  165. Device Naming scheme
  166. --------------------
The devices that appear on the "coresight" bus were named the same as their
parent devices, i.e., the real devices that appear on the AMBA bus or the platform bus.
Thus the names were based on the Linux Open Firmware layer naming convention,
which uses the base physical address of the device followed by the device
type, e.g.::
  172. root:~# ls /sys/bus/coresight/devices/
  173. 20010000.etf 20040000.funnel 20100000.stm 22040000.etm
  174. 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu
  175. 20070000.etr 20120000.replicator 220c0000.funnel
  176. 23040000.etm 23140000.etm 23340000.etm
  177. However, with the introduction of ACPI support, the names of the real
  178. devices are a bit cryptic and non-obvious. Thus, a new naming scheme was
  179. introduced to use more generic names based on the type of the device. The
  180. following rules apply::
  181. 1) Devices that are bound to CPUs, are named based on the CPU logical
  182. number.
  183. e.g, ETM bound to CPU0 is named "etm0"
  184. 2) All other devices follow a pattern, "<device_type_prefix>N", where :
  185. <device_type_prefix> - A prefix specific to the type of the device
  186. N - a sequential number assigned based on the order
  187. of probing.
  188. e.g, tmc_etf0, tmc_etr0, funnel0, funnel1
  189. Thus, with the new scheme the devices could appear as ::
  190. root:~# ls /sys/bus/coresight/devices/
  191. etm0 etm1 etm2 etm3 etm4 etm5 funnel0
  192. funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
Some of the examples below refer to the old naming scheme and some
to the newer scheme, to confirm that what you see on your
system is not unexpected. Always use the names as they appear on
the system under the specified locations.
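
The two naming schemes described above can be told apart mechanically. The
following Python sketch is only an illustration: the function
``parse_device_name()`` and both regular expressions are assumptions made for
this example and are not part of the kernel or of any CoreSight tool.

.. code-block:: python

    import re

    # Old scheme: "<base address in hex>.<device type>", e.g. "20010000.etf".
    OLD_NAME = re.compile(r"^(?P<addr>[0-9a-f]+)\.(?P<type>[a-z_]+)$")
    # New scheme: "<device_type_prefix>N", e.g. "etm0" or "tmc_etr0".
    NEW_NAME = re.compile(r"^(?P<prefix>[a-z_]+?)(?P<index>\d+)$")


    def parse_device_name(name):
        """Return (scheme, device type or prefix, number) for a device name."""
        match = OLD_NAME.match(name)
        if match:
            # For the old scheme the number is the base physical address.
            return ("old", match.group("type"), int(match.group("addr"), 16))
        match = NEW_NAME.match(name)
        if match:
            # For the new scheme the number is the CPU or probe-order index.
            return ("new", match.group("prefix"), int(match.group("index")))
        return ("unknown", name, None)

With this sketch, ``parse_device_name("tmc_etr0")`` is classified as a new-style
name with prefix ``tmc_etr``, while ``20010000.etf`` is classified as an
old-style name of type ``etf``.
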
  197. Topology Representation
  198. -----------------------
  199. Each CoreSight component has a ``connections`` directory which will contain
  200. links to other CoreSight components. This allows the user to explore the trace
  201. topology and for larger systems, determine the most appropriate sink for a
  202. given source. The connection information can also be used to establish
  203. which CTI devices are connected to a given component. This directory contains a
  204. ``nr_links`` attribute detailing the number of links in the directory.
  205. For an ETM source, in this case ``etm0`` on a Juno platform, a typical
  206. arrangement will be::
  linaro-developer:~# ls -l /sys/bus/coresight/devices/etm0/connections
  208. <file details> cti_cpu0 -> ../../../23020000.cti/cti_cpu0
  209. <file details> nr_links
  210. <file details> out:0 -> ../../../230c0000.funnel/funnel2
  211. Following the out port to ``funnel2``::
  212. linaro-developer:~# ls -l /sys/bus/coresight/devices/funnel2/connections
  213. <file details> in:0 -> ../../../23040000.etm/etm0
  214. <file details> in:1 -> ../../../23140000.etm/etm3
  215. <file details> in:2 -> ../../../23240000.etm/etm4
  216. <file details> in:3 -> ../../../23340000.etm/etm5
  217. <file details> nr_links
  218. <file details> out:0 -> ../../../20040000.funnel/funnel0
  219. And again to ``funnel0``::
  220. linaro-developer:~# ls -l /sys/bus/coresight/devices/funnel0/connections
  221. <file details> in:0 -> ../../../220c0000.funnel/funnel1
  222. <file details> in:1 -> ../../../230c0000.funnel/funnel2
  223. <file details> nr_links
  224. <file details> out:0 -> ../../../20010000.etf/tmc_etf0
  225. Finding the first sink ``tmc_etf0``. This can be used to collect data
  226. as a sink, or as a link to propagate further along the chain::
  227. linaro-developer:~# ls -l /sys/bus/coresight/devices/tmc_etf0/connections
  228. <file details> cti_sys0 -> ../../../20020000.cti/cti_sys0
  229. <file details> in:0 -> ../../../20040000.funnel/funnel0
  230. <file details> nr_links
  231. <file details> out:0 -> ../../../20150000.funnel/funnel4
  232. via ``funnel4``::
  233. linaro-developer:~# ls -l /sys/bus/coresight/devices/funnel4/connections
  234. <file details> in:0 -> ../../../20010000.etf/tmc_etf0
  235. <file details> in:1 -> ../../../20140000.etf/tmc_etf1
  236. <file details> nr_links
  237. <file details> out:0 -> ../../../20120000.replicator/replicator0
  238. and a ``replicator0``::
  239. linaro-developer:~# ls -l /sys/bus/coresight/devices/replicator0/connections
  240. <file details> in:0 -> ../../../20150000.funnel/funnel4
  241. <file details> nr_links
  242. <file details> out:0 -> ../../../20030000.tpiu/tpiu0
  243. <file details> out:1 -> ../../../20070000.etr/tmc_etr0
  244. Arriving at the final sink in the chain, ``tmc_etr0``::
  245. linaro-developer:~# ls -l /sys/bus/coresight/devices/tmc_etr0/connections
  246. <file details> cti_sys0 -> ../../../20020000.cti/cti_sys0
  247. <file details> in:0 -> ../../../20120000.replicator/replicator0
  248. <file details> nr_links
  249. As described below, when using sysfs it is sufficient to enable a sink and
  250. a source for successful trace. The framework will correctly enable all
  251. intermediate links as required.
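
The manual walk through the ``connections`` directories shown above can also be
scripted. The sketch below is a minimal illustration in Python; the helper
names ``next_devices()`` and ``trace_paths()`` are made up for this example,
and it assumes that every ``out:*`` entry is a symlink to the directory of the
connected device, as in the listings above.

.. code-block:: python

    import os


    def next_devices(devices_dir, name):
        """Return names of devices reached through the out:* links of name."""
        conn_dir = os.path.join(devices_dir, name, "connections")
        outputs = []
        for entry in sorted(os.listdir(conn_dir)):
            if entry.startswith("out:"):
                # Each link points at the directory of the connected device.
                target = os.path.realpath(os.path.join(conn_dir, entry))
                outputs.append(os.path.basename(target))
        return outputs


    def trace_paths(devices_dir, source):
        """Yield every path from source to a device without output links."""
        stack = [[source]]
        while stack:
            path = stack.pop()
            outputs = next_devices(devices_dir, path[-1])
            if not outputs:
                yield path
            # Push in reverse so that out:0 is explored before out:1.
            for name in reversed(outputs):
                stack.append(path + [name])

Calling ``list(trace_paths("/sys/bus/coresight/devices", "etm0"))`` on a
platform like the one above would be expected to list one path per final
endpoint. A device such as ``tmc_etf0`` that can act as a sink but also has an
output port appears in the middle of a path rather than at its end.
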
  252. Note: ``cti_sys0`` appears in two of the connections lists above.
  253. CTIs can connect to multiple devices and are arranged in a star topology
  254. via the CTM. See (Documentation/trace/coresight/coresight-ect.rst)
  255. [#fourth]_ for further details.
  256. Looking at this device we see 4 connections::
  257. linaro-developer:~# ls -l /sys/bus/coresight/devices/cti_sys0/connections
  258. <file details> nr_links
  259. <file details> stm0 -> ../../../20100000.stm/stm0
  260. <file details> tmc_etf0 -> ../../../20010000.etf/tmc_etf0
  261. <file details> tmc_etr0 -> ../../../20070000.etr/tmc_etr0
  262. <file details> tpiu0 -> ../../../20030000.tpiu/tpiu0
  263. How to use the tracer modules
  264. -----------------------------
There are two ways to use the CoreSight framework:

1. using the perf command line tools.
2. interacting directly with the CoreSight devices using the sysFS interface.

Preference is given to the former as using the sysFS interface
requires a deep understanding of the CoreSight HW. The following sections
provide details on using both methods.
  271. Using the sysFS interface
  272. ~~~~~~~~~~~~~~~~~~~~~~~~~
Before trace collection can start, a CoreSight sink needs to be identified.
There is no limit on the number of sinks (nor sources) that can be enabled at
any given moment. As a generic operation, all devices pertaining to the sink
class will have an ``enable_sink`` entry in sysfs::
  277. root:/sys/bus/coresight/devices# ls
  278. replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
  279. 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm
  280. root:/sys/bus/coresight/devices# ls 20010000.etb
  281. enable_sink status trigger_cntr
  282. root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink
  283. root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink
  284. 1
  285. root:/sys/bus/coresight/devices#
  286. At boot time the current etm3x driver will configure the first address
  287. comparator with "_stext" and "_etext", essentially tracing any instruction
  288. that falls within that range. As such "enabling" a source will immediately
  289. trigger a trace capture::
  290. root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source
  291. root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source
  292. 1
  293. root:/sys/bus/coresight/devices# cat 20010000.etb/status
  294. Depth: 0x2000
  295. Status: 0x1
  296. RAM read ptr: 0x0
  297. RAM wrt ptr: 0x19d3 <----- The write pointer is moving
  298. Trigger cnt: 0x0
  299. Control: 0x1
  300. Flush status: 0x0
  301. Flush ctrl: 0x2001
  302. root:/sys/bus/coresight/devices#
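
To check from a script that the write pointer is moving, the ``status`` output
shown above can be parsed into numbers. This is a minimal Python sketch; the
function name ``parse_status()`` is hypothetical, and it assumes the
``name: 0x...`` line layout seen in the listing.

.. code-block:: python

    def parse_status(text):
        """Map each "name: 0x..." line of a status dump to an integer."""
        fields = {}
        for line in text.splitlines():
            name, sep, value = line.partition(":")
            if not sep or not value.strip():
                continue
            # Keep only the first token so trailing annotations are ignored.
            fields[name.strip()] = int(value.split()[0], 16)
        return fields

Reading the file twice and comparing the ``RAM wrt ptr`` values of the two
results is one way to confirm that trace data is being captured.
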
  303. Trace collection is stopped the same way::
  304. root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source
  305. root:/sys/bus/coresight/devices#
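
The sequence above (enable a sink, enable a source, later disable the source)
can be wrapped in a few helpers. The Python sketch below only mirrors the
``echo`` commands shown in this section; the function names and the
``devices_dir`` parameter are assumptions made for illustration.

.. code-block:: python

    import os

    DEVICES = "/sys/bus/coresight/devices"


    def write_flag(devices_dir, device, attribute, enabled):
        """Write "1" or "0" to a sysfs attribute of a CoreSight device."""
        path = os.path.join(devices_dir, device, attribute)
        with open(path, "w") as handle:
            handle.write("1" if enabled else "0")


    def start_trace(sink, source, devices_dir=DEVICES):
        # Enable the sink first so the source has somewhere to send data.
        write_flag(devices_dir, sink, "enable_sink", True)
        write_flag(devices_dir, source, "enable_source", True)


    def stop_trace(source, devices_dir=DEVICES):
        # Mirrors "echo 0 > <source>/enable_source" from the text above.
        write_flag(devices_dir, source, "enable_source", False)

For the TC2 example this would be ``start_trace("20010000.etb",
"2201c000.ptm")`` followed later by ``stop_trace("2201c000.ptm")``, run with
sufficient privileges to write to sysfs.
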
  306. The content of the ETB buffer can be harvested directly from /dev::
  307. root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \
  308. of=~/cstrace.bin
  309. 64+0 records in
  310. 64+0 records out
  311. 32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s
  312. root:/sys/bus/coresight/devices#
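
The size reported by ``dd`` can be cross-checked against the ``Depth`` value
from the earlier ``status`` output. The short Python calculation below assumes
that the depth is expressed in 32-bit words and that ``dd`` used its default
512-byte block size; neither assumption is stated in the listings themselves.

.. code-block:: python

    WORD_BYTES = 4        # assumption: the ETB depth counts 32-bit words
    DD_BLOCK_BYTES = 512  # assumption: dd ran with its default block size

    depth_words = 0x2000  # "Depth" from the status output
    records = 64          # "64+0 records" from the dd output

    buffer_bytes = depth_words * WORD_BYTES
    copied_bytes = records * DD_BLOCK_BYTES
    print(buffer_bytes, copied_bytes)

Under these assumptions both values come to 32768 bytes, which matches the
byte count printed by ``dd``.
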
  313. The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32.
  314. Following is a DS-5 output of an experimental loop that increments a variable up
  315. to a certain value. The example is simple and yet provides a glimpse of the
  316. wealth of possibilities that coresight provides.
  317. ::
  318. Info Tracing enabled
  319. Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr}
  320. Instruction 0 0x8026B540 E24DD00C false SUB sp,sp,#0xc
  321. Instruction 0 0x8026B544 E3A03000 false MOV r3,#0
  322. Instruction 0 0x8026B548 E58D3004 false STR r3,[sp,#4]
  323. Instruction 0 0x8026B54C E59D3004 false LDR r3,[sp,#4]
  324. Instruction 0 0x8026B550 E3530004 false CMP r3,#4
  325. Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
  326. Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
  327. Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
  328. Timestamp Timestamp: 17106715833
  329. Instruction 319 0x8026B54C E59D3004 false LDR r3,[sp,#4]
  330. Instruction 0 0x8026B550 E3530004 false CMP r3,#4
  331. Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
  332. Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
  333. Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
  334. Instruction 9 0x8026B54C E59D3004 false LDR r3,[sp,#4]
  335. Instruction 0 0x8026B550 E3530004 false CMP r3,#4
  336. Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
  337. Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
  338. Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
  339. Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4]
  340. Instruction 0 0x8026B550 E3530004 false CMP r3,#4
  341. Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
  342. Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
  343. Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
  344. Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4]
  345. Instruction 0 0x8026B550 E3530004 false CMP r3,#4
  346. Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
  347. Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
  348. Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
  349. Instruction 10 0x8026B54C E59D3004 false LDR r3,[sp,#4]
  350. Instruction 0 0x8026B550 E3530004 false CMP r3,#4
  351. Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1
  352. Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4]
  353. Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c
  354. Instruction 6 0x8026B560 EE1D3F30 false MRC p15,#0x0,r3,c13,c0,#1
  355. Instruction 0 0x8026B564 E1A0100D false MOV r1,sp
  356. Instruction 0 0x8026B568 E3C12D7F false BIC r2,r1,#0x1fc0
  357. Instruction 0 0x8026B56C E3C2203F false BIC r2,r2,#0x3f
  358. Instruction 0 0x8026B570 E59D1004 false LDR r1,[sp,#4]
  359. Instruction 0 0x8026B574 E59F0010 false LDR r0,[pc,#16] ; [0x8026B58C] = 0x80550368
  360. Instruction 0 0x8026B578 E592200C false LDR r2,[r2,#0xc]
  361. Instruction 0 0x8026B57C E59221D0 false LDR r2,[r2,#0x1d0]
  362. Instruction 0 0x8026B580 EB07A4CF true BL {pc}+0x1e9344 ; 0x804548c4
  363. Info Tracing enabled
  364. Instruction 13570831 0x8026B584 E28DD00C false ADD sp,sp,#0xc
  365. Instruction 0 0x8026B588 E8BD8000 true LDM sp!,{pc}
  366. Timestamp Timestamp: 17107041535
  367. Using perf framework
  368. ~~~~~~~~~~~~~~~~~~~~
CoreSight tracers are represented using the Perf framework's Performance
Monitoring Unit (PMU) abstraction. As such the perf framework takes charge of
controlling when tracing gets enabled based on when the process of interest is
scheduled. When configured in a system, CoreSight PMUs will be listed when
queried by the perf command line tool::
  374. linaro@linaro-nano:~$ ./perf list pmu
  375. List of pre-defined events (to be used in -e):
  376. cs_etm// [Kernel PMU event]
  377. linaro@linaro-nano:~$
Regardless of the number of tracers available in a system (usually equal to the
number of processor cores), the "cs_etm" PMU will be listed only once.

A CoreSight PMU works the same way as any other PMU, i.e. the name of the PMU is
listed along with configuration options within forward slashes '/'. Since a
CoreSight system will typically have more than one sink, the name of the sink to
work with needs to be specified as an event option.
  384. On newer kernels the available sinks are listed in sysFS under
  385. ($SYSFS)/bus/event_source/devices/cs_etm/sinks/::
  386. root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
  387. tmc_etf0 tmc_etr0 tpiu0
  388. On older kernels, this may need to be found from the list of coresight devices,
  389. available under ($SYSFS)/bus/coresight/devices/::
  390. root:~# ls /sys/bus/coresight/devices/
  391. etm0 etm1 etm2 etm3 etm4 etm5 funnel0
  392. funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
  393. root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program
As mentioned above in section "Device Naming scheme", the names of the devices could
look different from what is used in the example above. One must use the device names
as they appear under sysFS.
  397. The syntax within the forward slashes '/' is important. The '@' character
  398. tells the parser that a sink is about to be specified and that this is the sink
  399. to use for the trace session.
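
The event syntax can be assembled programmatically when driving perf from a
script. The following Python sketch is illustrative only: ``cs_etm_event()`` is
a hypothetical helper, and combining the ``@sink`` term with other options in
one comma-separated list is an assumption based on the usual perf event term
syntax.

.. code-block:: python

    def cs_etm_event(sink=None, options=(), modifier="u"):
        """Build a cs_etm event string such as "cs_etm/@tmc_etr0/u"."""
        terms = list(options)
        if sink:
            # The '@' prefix marks the term as the sink to use.
            terms.insert(0, "@" + sink)
        return "cs_etm/{}/{}".format(",".join(terms), modifier)


    def perf_record_args(program, sink):
        """Return the argument list for the perf record example above."""
        return ["perf", "record", "-e", cs_etm_event(sink), "--per-thread", program]

The resulting list can be passed to ``subprocess.run()``. The sink name must be
one that actually exists on the target, as explained above.
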
More information on the above and other examples of how to use CoreSight with
the perf tools can be found in the "HOWTO.md" file of the OpenCSD GitHub
repository [#third]_.
  403. Advanced perf framework usage
  404. -----------------------------
  405. AutoFDO analysis using the perf tools
  406. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
perf can be used to record and analyze traces of programs.
  408. Execution can be recorded using 'perf record' with the cs_etm event,
  409. specifying the name of the sink to record to, e.g::
  410. perf record -e cs_etm/@tmc_etr0/u --per-thread
  411. The 'perf report' and 'perf script' commands can be used to analyze execution,
  412. synthesizing instruction and branch events from the instruction trace.
  413. 'perf inject' can be used to replace the trace data with the synthesized events.
  414. The --itrace option controls the type and frequency of synthesized events
  415. (see perf documentation).
  416. Note that only 64-bit programs are currently supported - further work is
  417. required to support instruction decode of 32-bit Arm programs.
  418. Tracing PID
  419. ~~~~~~~~~~~
The kernel can be built to write the PID value into the PE ContextID registers.
For a kernel running at EL1, the PID is stored in CONTEXTIDR_EL1. A PE may
implement the Arm Virtualization Host Extensions (VHE), with which the kernel can
run at EL2 as a virtualisation host; in this case, the PID value is stored in
CONTEXTIDR_EL2.
  425. perf provides PMU formats that program the ETM to insert these values into the
  426. trace data; the PMU formats are defined as below:
  427. "contextid1": Available on both EL1 kernel and EL2 kernel. When the
  428. kernel is running at EL1, "contextid1" enables the PID
  429. tracing; when the kernel is running at EL2, this enables
  430. tracing the PID of guest applications.
  431. "contextid2": Only usable when the kernel is running at EL2. When
  432. selected, enables PID tracing on EL2 kernel.
  433. "contextid": Will be an alias for the option that enables PID
  434. tracing. I.e,
  435. contextid == contextid1, on EL1 kernel.
  436. contextid == contextid2, on EL2 kernel.
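
The alias rule above can be written down as a small lookup. This Python sketch
only restates the rules from the list; ``resolve_contextid()`` is a
hypothetical name and does not correspond to a function in perf or the kernel.

.. code-block:: python

    def resolve_contextid(option, kernel_el):
        """Return the concrete format that a contextid option selects."""
        if kernel_el not in (1, 2):
            raise ValueError("kernel_el must be 1 or 2")
        if option == "contextid":
            # The alias follows the exception level the kernel runs at.
            return "contextid1" if kernel_el == 1 else "contextid2"
        if option == "contextid1":
            return option
        if option == "contextid2":
            if kernel_el != 2:
                raise ValueError("contextid2 is only usable with an EL2 kernel")
            return option
        raise ValueError("unknown option: " + option)

For example, ``resolve_contextid("contextid", 2)`` yields ``contextid2``,
whereas asking for ``contextid2`` with an EL1 kernel is rejected.
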
perf will always enable PID tracing at the relevant EL; this is accomplished by
automatically enabling the "contextid" config. For EL2 it is possible to make
specific adjustments using the configs "contextid1" and "contextid2". E.g. if a user
wants to trace PIDs for both host and guest, the two configs "contextid1" and
"contextid2" can be set at the same time::
  442. perf record -e cs_etm/contextid1,contextid2/u -- vm
  443. Generating coverage files for Feedback Directed Optimization: AutoFDO
  444. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  445. 'perf inject' accepts the --itrace option in which case tracing data is
  446. removed and replaced with the synthesized events. e.g.
  447. ::
  448. perf inject --itrace --strip -i perf.data -o perf.data.new
  449. Below is an example of using ARM ETM for autoFDO. It requires autofdo
  450. (https://github.com/google/autofdo) and gcc version 5. The bubble
  451. sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial).
  452. ::
  453. $ gcc-5 -O3 sort.c -o sort
  454. $ taskset -c 2 ./sort
  455. Bubble sorting array of 30000 elements
  456. 5910 ms
  457. $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
  458. Bubble sorting array of 30000 elements
  459. 12543 ms
  460. [ perf record: Woken up 35 times to write data ]
  461. [ perf record: Captured and wrote 69.640 MB perf.data ]
  462. $ perf inject -i perf.data -o inj.data --itrace=il64 --strip
  463. $ create_gcov --binary=./sort --profile=inj.data --gcov=sort.gcov -gcov_version=1
  464. $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
  465. $ taskset -c 2 ./sort_autofdo
  466. Bubble sorting array of 30000 elements
  467. 5806 ms
  468. Config option formats
  469. ~~~~~~~~~~~~~~~~~~~~~
  470. The following strings can be provided between // on the perf command line to enable various options.
  471. They are also listed in the folder /sys/bus/event_source/devices/cs_etm/format/
  472. .. list-table::
  473. :header-rows: 1
  474. * - Option
  475. - Description
  476. * - branch_broadcast
  477. - Session local version of the system wide setting:
  478. :ref:`ETM_MODE_BB <coresight-branch-broadcast>`
  479. * - contextid
  480. - See `Tracing PID`_
  481. * - contextid1
  482. - See `Tracing PID`_
  483. * - contextid2
  484. - See `Tracing PID`_
  485. * - configid
  486. - Selection for a custom configuration. This is an implementation detail and not used directly,
  487. see :ref:`trace/coresight/coresight-config:Using Configurations in perf`
  488. * - preset
  489. - Override for parameters in a custom configuration, see
  490. :ref:`trace/coresight/coresight-config:Using Configurations in perf`
  491. * - sinkid
  492. - Hashed version of the string to select a sink, automatically set when using the @ notation.
  493. This is an internal implementation detail and is not used directly, see `Using perf
  494. framework`_.
  495. * - cycacc
  496. - Session local version of the system wide setting: :ref:`ETMv4_MODE_CYCACC
  497. <coresight-cycle-accurate>`
  498. * - retstack
  499. - Session local version of the system wide setting: :ref:`ETM_MODE_RETURNSTACK
  500. <coresight-return-stack>`
  501. * - timestamp
  502. - Session local version of the system wide setting: :ref:`ETMv4_MODE_TIMESTAMP
  503. <coresight-timestamp>`
  504. * - cc_threshold
     - Cycle count threshold value. If nothing is provided here or the provided value is 0, then the
       default value, i.e. 0x100, will be used. If the provided value is less than the minimum cycle
       threshold value, as indicated via TRCIDR3.CCITMIN, then the minimum value will be used instead.
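
The ``cc_threshold`` rule in the table can be summarised in a few lines of
Python. This is a sketch of the described behaviour, not the driver code;
``effective_cc_threshold()`` is a hypothetical name, and applying the
TRCIDR3.CCITMIN floor to the default value as well is an assumption.

.. code-block:: python

    DEFAULT_CC_THRESHOLD = 0x100


    def effective_cc_threshold(requested, ccitmin):
        """Return the threshold used for a requested value and CCITMIN."""
        # A missing or zero request falls back to the default value.
        value = requested if requested else DEFAULT_CC_THRESHOLD
        # Values below the hardware minimum are raised to that minimum.
        return max(value, ccitmin)

With a minimum of 4 cycles, a requested value of 2 is raised to 4, while a
request of 0 selects the default of 0x100.
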
  508. How to use the STM module
  509. -------------------------
  510. Using the System Trace Macrocell module is the same as the tracers - the only
  511. difference is that clients are driving the trace capture rather
  512. than the program flow through the code.
  513. As with any other CoreSight component, specifics about the STM tracer can be
  514. found in sysfs with more information on each entry being found in [#first]_::
  515. root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0
  516. enable_source hwevent_select port_enable subsystem uevent
  517. hwevent_enable mgmt port_select traceid
  518. root@genericarmv8:~#
  519. Like any other source a sink needs to be identified and the STM enabled before
  520. being used::
  521. root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
  522. root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source
  523. From there user space applications can request and use channels using the devfs
  524. interface provided for that purpose by the generic STM API::
  525. root@genericarmv8:~# ls -l /dev/stm0
  526. crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0
  527. root@genericarmv8:~#
  528. Details on how to use the generic STM API can be found here:
  529. - Documentation/trace/stm.rst [#second]_.
  530. The CTI & CTM Modules
  531. ---------------------
  532. The CTI (Cross Trigger Interface) provides a set of trigger signals between
  533. individual CTIs and components, and can propagate these between all CTIs via
  534. channels on the CTM (Cross Trigger Matrix).
  535. A separate documentation file is provided to explain the use of these devices.
  536. (Documentation/trace/coresight/coresight-ect.rst) [#fourth]_.
  537. CoreSight System Configuration
  538. ------------------------------
  539. CoreSight components can be complex devices with many programming options.
  540. Furthermore, components can be programmed to interact with each other across the
  541. complete system.
  542. A CoreSight System Configuration manager is provided to allow these complex programming
  543. configurations to be selected and used easily from perf and sysfs.
  544. See the separate document for further information.
  545. (Documentation/trace/coresight/coresight-config.rst) [#fifth]_.
  546. .. [#first] Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
  547. .. [#second] Documentation/trace/stm.rst
  548. .. [#third] https://github.com/Linaro/perf-opencsd
  549. .. [#fourth] Documentation/trace/coresight/coresight-ect.rst
  550. .. [#fifth] Documentation/trace/coresight/coresight-config.rst