  1. Stream Parser (strparser)
  2. Introduction
  3. ============
  4. The stream parser (strparser) is a utility that parses messages of an
  5. application layer protocol running over a data stream. The stream
  6. parser works in conjunction with an upper layer in the kernel to provide
  7. kernel support for application layer messages. For instance, Kernel
  8. Connection Multiplexor (KCM) uses the Stream Parser to parse messages
  9. using a BPF program.
  10. The strparser works in one of two modes: receive callback or general
  11. mode.
  12. In receive callback mode, the strparser is called from the data_ready
  13. callback of a TCP socket. Messages are parsed and delivered as they are
  14. received on the socket.
  15. In general mode, a sequence of skbs is fed to strparser from an
  16. outside source. Messages are parsed and delivered as the sequence is
  17. processed. This mode allows strparser to be applied to arbitrary
  18. streams of data.
  19. Interface
  20. =========
  21. The API includes a context structure, a set of callbacks, utility
  22. functions, and a data_ready function for receive callback mode. The
  23. callbacks include a parse_msg function that is called to perform
  24. parsing (e.g. BPF parsing in case of KCM), and a rcv_msg function
  25. that is called when a full message has been completed.
  26. Functions
  27. =========
  28. int strp_init(struct strparser *strp, struct sock *sk,
  29. const struct strp_callbacks *cb)
  30. Called to initialize a stream parser. strp is a struct of type
  31. strparser that is allocated by the upper layer. sk is the TCP
  32. socket associated with the stream parser for use with receive
  33. callback mode; in general mode this is set to NULL. The callbacks
  34. in cb are called by the stream parser (they are listed below).
  35. void strp_pause(struct strparser *strp)
  36. Temporarily pause a stream parser. Message parsing is suspended
  37. and no new messages are delivered to the upper layer.
  38. void strp_unpause(struct strparser *strp)
  39. Unpause a paused stream parser.
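
The effect of pausing on message delivery can be illustrated with a small
userspace model. This is only a sketch: struct model_parser and the
model_* functions are hypothetical stand-ins, not the kernel's struct
strparser or its API. The pause_after argument imitates an upper layer
that pauses the parser from its receive callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's struct strparser. */
struct model_parser {
	bool paused;		/* set by pause, cleared by unpause */
	size_t delivered;	/* total messages handed to the upper layer */
};

static void model_pause(struct model_parser *p)
{
	p->paused = true;
}

static void model_unpause(struct model_parser *p)
{
	p->paused = false;
}

/*
 * Deliver up to n queued messages, stopping as soon as the parser is
 * paused. pause_after simulates an upper layer that pauses once it has
 * accepted that many messages in total (0 means it never pauses).
 * Returns the number of messages delivered by this call.
 */
static size_t model_deliver(struct model_parser *p, size_t n,
			    size_t pause_after)
{
	size_t done = 0;

	while (done < n && !p->paused) {
		p->delivered++;
		done++;
		if (pause_after && p->delivered == pause_after)
			model_pause(p);
	}
	return done;
}
```

In this model nothing is delivered while paused is set; delivery resumes
only after model_unpause is called and the delivery loop is run again.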
  40. void strp_stop(struct strparser *strp);
  41. strp_stop is called to completely stop stream parser operations.
  42. This is called internally when the stream parser encounters an
  43. error, and it is called from the upper layer to stop parsing
  44. operations.
  45. void strp_done(struct strparser *strp);
  46. strp_done is called to release any resources held by the stream
  47. parser instance. This must be called after the stream parser
  48. has been stopped.
  49. int strp_process(struct strparser *strp, struct sk_buff *orig_skb,
  50. unsigned int orig_offset, size_t orig_len,
  51. size_t max_msg_size, long timeo)
  52. strp_process is called in general mode for a stream parser to
  53. parse an sk_buff. The number of bytes processed or a negative
  54. error number is returned. Note that strp_process does not
  55. consume the sk_buff. max_msg_size is the maximum message size the
  56. stream parser will parse. timeo is the timeout for completing a message.
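
The general mode flow (chunks fed from an outside source, messages
delivered as they complete, bytes processed returned to the caller) can
be sketched in userspace as follows. Everything here is an assumption
made for illustration: struct model_strp, model_process and the framing
(byte 0 of each message holds the total message length) are hypothetical
and are not the kernel implementation. As with strp_process, the caller
keeps ownership of the input buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reassembly state; a one-byte length caps messages at 255. */
struct model_strp {
	unsigned char buf[256];	/* bytes of the message being assembled */
	size_t have;		/* bytes accumulated so far */
	size_t msgs;		/* complete messages delivered */
};

/*
 * Feed one chunk. Returns the number of bytes processed or a negative
 * errno. The input is copied, so the caller still owns data afterwards.
 */
static int model_process(struct model_strp *s, const unsigned char *data,
			 size_t len, size_t max_msg_size)
{
	size_t used = 0;

	while (used < len) {
		size_t full_len, want;

		if (s->have == 0)
			s->buf[s->have++] = data[used++]; /* length byte */

		full_len = s->buf[0];
		if (full_len == 0)
			return -EPROTO;		/* malformed length */
		if (full_len > max_msg_size)
			return -EMSGSIZE;	/* over the caller's limit */

		/* Copy as much of the message as this chunk provides. */
		want = full_len - s->have;
		if (want > len - used)
			want = len - used;
		memcpy(s->buf + s->have, data + used, want);
		s->have += want;
		used += want;

		if (s->have == full_len) {	/* message complete */
			s->msgs++;
			s->have = 0;
		}
	}
	return (int)used;
}
```

A message that straddles two chunks is completed by the second call; the
partial state is carried in the model structure between calls.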
  57. void strp_data_ready(struct strparser *strp);
  58. The upper layer calls strp_data_ready when data is ready on
  59. the lower socket for strparser to process. This should be called
  60. from a data_ready callback that is set on the socket. Note that
  61. the maximum message size is the limit of the receive socket
  62. buffer and the message timeout is the receive timeout for the socket.
  63. void strp_check_rcv(struct strparser *strp);
  64. strp_check_rcv is called to check for new messages on the socket.
  65. This is normally called at initialization of a stream parser
  66. instance or after strp_unpause.
  67. Callbacks
  68. =========
  69. There are six callbacks:
  70. int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);
  71. parse_msg is called to determine the length of the next message
  72. in the stream. The upper layer must implement this function. It
  73. should parse the sk_buff as containing the headers for the
  74. next application layer message in the stream.
  75. The skb->cb in the input skb is a struct strp_msg. Only
  76. the offset field is relevant in parse_msg and gives the offset
  77. where the message starts in the skb.
  78. The return values of this function are:
  79. >0 : indicates length of successfully parsed message
  80. 0 : indicates more data must be received to parse the message
  81. -ESTRPIPE : current message should not be processed by the
  82. kernel, return control of the socket to userspace which
  83. can proceed to read the messages itself
  84. other < 0 : Error in parsing, give control back to userspace
  85. assuming that synchronization is lost and the stream
  86. is unrecoverable (application expected to close TCP socket)
  87. In the case that an error is returned (return value is less than
  88. zero) and the parser is in receive callback mode, then it will set
  89. the error on the TCP socket and wake it up. If parse_msg returned
  90. -ESTRPIPE and the stream parser had previously read some bytes for
  91. the current message, then the error set on the attached socket is
  92. ENODATA since the stream is unrecoverable in that case.
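
The return value convention above can be illustrated with a userspace
sketch of a header parser. The framing is invented for the example (a
4-byte header whose last two bytes carry the big-endian total message
length, and a type byte of 0xff meaning "leave this to userspace"), and
model_parse_msg takes a plain buffer rather than an sk_buff. It is a
hypothetical model, not a callback that can be passed to strp_init.

```c
#include <errno.h>
#include <stddef.h>

#ifndef ESTRPIPE
#define ESTRPIPE 86	/* Linux value, defined only for non-Linux builds */
#endif

#define MODEL_HDR_LEN 4u

/*
 * Hypothetical framing: byte 0 = type, byte 1 = reserved,
 * bytes 2-3 = big-endian total message length (header included).
 * offset plays the role of the offset field of struct strp_msg.
 */
static int model_parse_msg(const unsigned char *buf, size_t len,
			   size_t offset)
{
	const unsigned char *hdr;
	unsigned int full_len;

	if (offset > len || len - offset < MODEL_HDR_LEN)
		return 0;		/* header incomplete, need more data */

	hdr = buf + offset;
	if (hdr[0] == 0xff)
		return -ESTRPIPE;	/* hand the message to userspace */

	full_len = ((unsigned int)hdr[2] << 8) | hdr[3];
	if (full_len < MODEL_HDR_LEN)
		return -EPROTO;		/* bad header, stream unrecoverable */

	return (int)full_len;		/* length of the next message */
}
```

Note that a positive return only reports the length taken from the
header; the rest of the message may not have arrived yet.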
  93. void (*lock)(struct strparser *strp)
  94. The lock callback is called to lock the strp structure when
  95. the strparser is performing an asynchronous operation (such as
  96. processing a timeout). In receive callback mode the default
  97. function calls lock_sock for the associated socket. In general
  98. mode the callback must be set appropriately.
  99. void (*unlock)(struct strparser *strp)
  100. The unlock callback is called to release the lock obtained
  101. by the lock callback. In receive callback mode the default
  102. function calls release_sock for the associated socket. In general
  103. mode the callback must be set appropriately.
  104. void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);
  105. rcv_msg is called when a full message has been received and
  106. is queued. The callee must consume the sk_buff; it can
  107. call strp_pause to prevent any further messages from being
  108. received in rcv_msg (see strp_pause above). This callback
  109. must be set.
  110. The skb->cb in the input skb is a struct strp_msg. This
  111. struct contains two fields: offset and full_len. Offset is
  112. where the message starts in the skb, and full_len is
  113. the length of the message. skb->len - offset may be greater
  114. than full_len since strparser does not trim the skb.
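
Because the skb is not trimmed, a receiver has to use offset and full_len
to locate its message. The following userspace sketch shows the
arithmetic on a flat buffer; struct model_strp_msg and model_msg_bounds
are hypothetical names that only mirror the two fields described above.

```c
#include <stddef.h>

/* Model of the two fields described above (the real one is in skb->cb). */
struct model_strp_msg {
	int full_len;	/* length of the message */
	int offset;	/* where the message starts in the buffer */
};

/*
 * Store the start of the message in *start and return the number of
 * bytes that follow the message in the buffer, that is
 * len - offset - full_len. The caller must ensure offset + full_len
 * does not exceed len.
 */
static size_t model_msg_bounds(const unsigned char *data, size_t len,
			       const struct model_strp_msg *m,
			       const unsigned char **start)
{
	*start = data + m->offset;
	return len - (size_t)m->offset - (size_t)m->full_len;
}
```

For a 10-byte buffer with offset 2 and full_len 5, the message occupies
bytes 2 through 6 and three trailing bytes belong to whatever follows.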
  115. int (*read_sock_done)(struct strparser *strp, int err);
  116. read_sock_done is called when the stream parser is done reading
  117. the TCP socket in receive callback mode. The stream parser may
  118. read multiple messages in a loop and this function allows cleanup
  119. to occur when exiting the loop. If the callback is not set (NULL
  120. in strp_init) a default function is used.
  121. void (*abort_parser)(struct strparser *strp, int err);
  122. This function is called when the stream parser encounters an error
  123. in parsing. The default function stops the stream parser and
  124. sets the error in the socket if the parser is in receive callback
  125. mode. The default function can be changed by setting the callback
  126. to non-NULL in strp_init.
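
The rules above on which callbacks are mandatory and which fall back to
defaults can be summarized in a userspace sketch. The structure and
function names are hypothetical, the callbacks take a void pointer
instead of a struct strparser, the placeholder defaults do nothing
useful, and the -EINVAL return for a missing mandatory callback is an
assumption of this model rather than something stated in this document.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct strp_callbacks. */
struct model_callbacks {
	int  (*parse_msg)(void *ctx);
	void (*rcv_msg)(void *ctx);
	void (*lock)(void *ctx);
	void (*unlock)(void *ctx);
	int  (*read_sock_done)(void *ctx, int err);
	void (*abort_parser)(void *ctx, int err);
};

/* Placeholder defaults; the real defaults act on the attached socket. */
static void default_lock(void *ctx) { (void)ctx; }
static void default_unlock(void *ctx) { (void)ctx; }
static int default_read_sock_done(void *ctx, int err) { (void)ctx; return err; }
static void default_abort_parser(void *ctx, int err) { (void)ctx; (void)err; }

/*
 * Copy cb into dst, rejecting missing mandatory callbacks and filling
 * in defaults for the optional ones. has_sock is true in receive
 * callback mode and false in general mode.
 */
static int model_init_callbacks(struct model_callbacks *dst,
				const struct model_callbacks *cb,
				bool has_sock)
{
	if (!cb || !cb->parse_msg || !cb->rcv_msg)
		return -EINVAL;	/* assumed error code */
	if (!has_sock && (!cb->lock || !cb->unlock))
		return -EINVAL;	/* general mode must supply locking */

	*dst = *cb;
	if (!dst->lock)
		dst->lock = default_lock;
	if (!dst->unlock)
		dst->unlock = default_unlock;
	if (!dst->read_sock_done)
		dst->read_sock_done = default_read_sock_done;
	if (!dst->abort_parser)
		dst->abort_parser = default_abort_parser;
	return 0;
}

/* Minimal upper-layer callbacks used to exercise the sketch. */
static int demo_parse_msg(void *ctx) { (void)ctx; return 0; }
static void demo_rcv_msg(void *ctx) { (void)ctx; }
```

With only demo_parse_msg and demo_rcv_msg set, this model accepts the
table in receive callback mode and rejects it in general mode until lock
and unlock are supplied.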
  127. Statistics
  128. ==========
  129. Various counters are kept for each stream parser instance. These are in
  130. the strp_stats structure. strp_aggr_stats is a convenience structure for
  131. accumulating statistics for multiple stream parser instances.
  132. save_strp_stats and aggregate_strp_stats are helper functions to save
  133. and aggregate statistics.
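
Aggregation across instances amounts to summing per-instance counters.
The sketch below shows the idea in userspace; struct model_stats, its
three counters and model_aggregate_stats are hypothetical and do not
reproduce the field set of strp_stats or strp_aggr_stats.

```c
/* Hypothetical counters; the real structures define their own fields. */
struct model_stats {
	unsigned long long msgs;	/* messages delivered */
	unsigned long long bytes;	/* bytes delivered */
	unsigned int msg_timeouts;	/* assemblies that timed out */
};

/* Add one instance's counters into an aggregate. */
static void model_aggregate_stats(const struct model_stats *src,
				  struct model_stats *agg)
{
	agg->msgs += src->msgs;
	agg->bytes += src->bytes;
	agg->msg_timeouts += src->msg_timeouts;
}
```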
  134. Message assembly limits
  135. =======================
  136. The stream parser provides mechanisms to limit the resources consumed by
  137. message assembly.
  138. A timer is set when assembly starts for a new message. In receive
  139. callback mode the message timeout is taken from sk_rcvtimeo of the
  140. associated TCP socket. In general mode, the timeout is passed as an
  141. argument in strp_process. If the timer fires before assembly completes,
  142. the stream parser is aborted and, in receive callback mode, the
  143. ETIMEDOUT error is set on the TCP socket.
  144. In receive callback mode, message length is limited to the receive
  145. buffer size of the associated TCP socket. If the length returned by
  146. parse_msg is greater than the socket buffer size then the stream parser
  147. is aborted with the EMSGSIZE error set on the TCP socket. Note that this
  148. limits the total size of receive skbuffs for a socket with a stream
  149. parser to 2*sk_rcvbuf of the TCP socket.
  150. In general mode the message length limit is passed in as an argument
  151. to strp_process.
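
The two limits can be pictured as a pair of checks applied while a
message is being assembled. The userspace sketch below is a model only:
struct model_assembly and model_check_limits are hypothetical, time is
counted in arbitrary ticks instead of using a kernel timer, and treating
a value of 0 as "no limit" is an assumption of the model.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-message assembly state. */
struct model_assembly {
	long started;		/* tick at which assembly began */
	long timeo;		/* ticks allowed to finish (0 = no timeout) */
	size_t max_msg_size;	/* largest accepted message (0 = no limit) */
};

/*
 * parsed_len is the message length reported by the parse step and now
 * is the current tick. Returns 0 if assembly may continue, or the
 * negative errno with which the parser would be aborted.
 */
static int model_check_limits(const struct model_assembly *a,
			      size_t parsed_len, long now)
{
	if (a->max_msg_size && parsed_len > a->max_msg_size)
		return -EMSGSIZE;
	if (a->timeo && now - a->started >= a->timeo)
		return -ETIMEDOUT;
	return 0;
}
```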
  152. Author
  153. ======
  154. Tom Herbert (tom@quantonium.net)