- .. SPDX-License-Identifier: GPL-2.0
- ==============================
- Network Filesystem Caching API
- ==============================
- Fscache provides an API by which a network filesystem can make use of local
- caching facilities. The API is arranged around a number of principles:
- (1) A cache is logically organised into volumes and data storage objects
- within those volumes.
- (2) Volumes and data storage objects are represented by various types of
- cookie.
- (3) Cookies have keys that distinguish them from their peers.
- (4) Cookies have coherency data that allows a cache to determine if the
- cached data is still valid.
- (5) I/O is done asynchronously where possible.
- This API is used by::
- #include <linux/fscache.h>.
- .. This document contains the following sections:
- (1) Overview
- (2) Volume registration
- (3) Data file registration
- (4) Declaring a cookie to be in use
- (5) Resizing a data file (truncation)
- (6) Data I/O API
- (7) Data file coherency
- (8) Data file invalidation
- (9) Write back resource management
- (10) Caching of local modifications
- (11) Page release and invalidation
- Overview
- ========
- The fscache hierarchy is organised on two levels from a network filesystem's
- point of view. The upper level represents "volumes" and the lower level
- represents "data storage objects". These are represented by two types of
- cookie, hereafter referred to as "volume cookies" and "cookies".
- A network filesystem acquires a volume cookie for a volume using a volume key,
- which represents all the information that defines that volume (e.g. cell name
- or server address, volume ID or share name). This must be rendered as a
- printable string that can be used as a directory name (i.e. it contains no '/'
- characters and doesn't begin with a '.'). The maximum name length is one less
- than the maximum size of a filename component (allowing the cache backend one
- char for its own purposes).
- A filesystem would typically have a volume cookie for each superblock.
- The filesystem then acquires a cookie for each file within that volume using an
- object key. Object keys are binary blobs and only need to be unique within
- their parent volume. The cache backend is responsible for rendering the binary
- blob into something it can use and may employ hash tables, trees or whatever to
- improve its ability to find an object. This is transparent to the network
- filesystem.
- A filesystem would typically have a cookie for each inode, and would acquire it
- in iget and relinquish it when evicting the inode.
- Once it has a cookie, the filesystem needs to mark the cookie as being in use.
- This causes fscache to send the cache backend off to look up/create resources
- for the cookie in the background, to check its coherency and, if necessary, to
- mark the object as being under modification.
- A filesystem would typically "use" the cookie in its file open routine and
- unuse it in file release. It must also use the cookie around calls that
- truncate the cookie locally. Finally, it needs to use the cookie when the
- pagecache becomes dirty and unuse it when writeback is complete. This last
- case is slightly tricky, and provision is made for it.
- When performing a read, write or resize on a cookie, the filesystem must first
- begin an operation. This copies the resources into a holding struct and puts
- extra pins into the cache to stop cache withdrawal from tearing down the
- structures being used. The actual operation can then be issued and conflicting
- invalidations can be detected upon completion.
- The filesystem is expected to use netfslib to access the cache, but that's not
- actually required and it can use the fscache I/O API directly.
- Volume Registration
- ===================
- The first step for a network filesystem is to acquire a volume cookie for the
- volume it wants to access::
- struct fscache_volume *
- fscache_acquire_volume(const char *volume_key,
- const char *cache_name,
- const void *coherency_data,
- size_t coherency_len);
- This function creates a volume cookie with the specified volume key as its name
- and notes the coherency data.
- The volume key must be a printable string with no '/' characters in it. It
- should begin with the name of the filesystem and should be no longer than 254
- characters. It should uniquely represent the volume and will be matched with
- what's stored in the cache.
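- As an illustration of the key rules above, a filesystem could sanity-check a
- key before passing it to ``fscache_acquire_volume()``. The following is a
- minimal userspace sketch, not kernel code: the helper name, the macro and the
- example key layout are hypothetical, and the leading-'.' check is taken from
- the Overview section:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Limit taken from the text above; the macro name is hypothetical. */
#define VOLUME_KEY_MAX 254

/*
 * Hypothetical helper: return true if key is a printable string with no
 * '/' characters, does not begin with '.', begins with the filesystem
 * name and is no longer than VOLUME_KEY_MAX characters.
 */
bool volume_key_is_valid(const char *key, const char *fs_name)
{
	size_t len, fs_len, i;

	assert(key != NULL && fs_name != NULL);

	len = strlen(key);
	fs_len = strlen(fs_name);
	if (len == 0 || len > VOLUME_KEY_MAX)
		return false;
	if (key[0] == '.')
		return false;
	if (fs_len == 0 || strncmp(key, fs_name, fs_len) != 0)
		return false;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)key[i];

		if (c == '/' || !isprint(c))
			return false;
	}
	return true;
}
```

- With this sketch, a key such as "afs,example.org,1234" checked against the
- filesystem name "afs" passes, whereas a key containing a '/' is rejected.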
- The caller may also specify the name of the cache to use. If specified,
- fscache will look up or create a cache cookie of that name and will use a cache
- of that name if it is online or comes online. If no cache name is specified,
- it will use the first cache that comes to hand and set the name to that.
- The specified coherency data is stored in the cookie and will be matched
- against coherency data stored on disk. The data pointer may be NULL if no data
- is provided. If the coherency data doesn't match, the entire cache volume will
- be invalidated.
- This function can return errors such as EBUSY if the volume key is already in
- use by an acquired volume or ENOMEM if an allocation failure occurred. It may
- also return a NULL volume cookie if fscache is not enabled. It is safe to
- pass a NULL cookie to any function that takes a volume cookie. This will
- cause that function to do nothing.
- When the network filesystem has finished with a volume, it should relinquish it
- by calling::
- void fscache_relinquish_volume(struct fscache_volume *volume,
- const void *coherency_data,
- bool invalidate);
- This will cause the volume to be committed or removed, and if sealed the
- coherency data will be set to the value supplied. The amount of coherency data
- must match the length specified when the volume was acquired. Note that all
- data cookies obtained in this volume must be relinquished before the volume is
- relinquished.
- Data File Registration
- ======================
- Once it has a volume cookie, a network filesystem can use it to acquire a
- cookie for data storage::
- struct fscache_cookie *
- fscache_acquire_cookie(struct fscache_volume *volume,
- u8 advice,
- const void *index_key,
- size_t index_key_len,
- const void *aux_data,
- size_t aux_data_len,
- loff_t object_size);
- This creates the cookie in the volume using the specified index key. The index
- key is a binary blob of the given length and must be unique for the volume.
- This is saved into the cookie. There are no restrictions on the content, but
- its length shouldn't exceed about three quarters of the maximum filename length
- to allow for encoding.
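- To make the index key guidance concrete, here is a userspace sketch of one way
- a filesystem might build a binary key and check its length. Everything in it
- is an assumption for illustration: the 255-byte filename limit, the helper
- names, and the choice of a 64-bit file ID plus a 32-bit generation number
- packed in big-endian order so that the blob does not depend on host
- endianness:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed maximum filename length; the real limit depends on the backend. */
#define EXAMPLE_NAME_MAX 255
/* "About three quarters" of that, per the guidance above: 191 bytes. */
#define EXAMPLE_KEY_MAX (EXAMPLE_NAME_MAX * 3 / 4)

/* Hypothetical check that a key length leaves room for encoding. */
bool index_key_len_ok(size_t len)
{
	return len > 0 && len <= EXAMPLE_KEY_MAX;
}

/*
 * Hypothetical packer: write file_id and generation into buf as a 12-byte
 * big-endian blob.  Returns the number of bytes written, or 0 if buf is
 * too small.
 */
size_t pack_index_key(uint8_t *buf, size_t buf_len,
		      uint64_t file_id, uint32_t generation)
{
	size_t i, n = 0;

	assert(buf != NULL);
	if (buf_len < 12)
		return 0;

	for (i = 0; i < 8; i++)
		buf[n++] = (uint8_t)(file_id >> (56 - 8 * i));
	for (i = 0; i < 4; i++)
		buf[n++] = (uint8_t)(generation >> (24 - 8 * i));
	return n;
}
```

- The resulting buffer and length would then be passed as index_key and
- index_key_len; the key only has to be unique within its parent volume.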
- The caller should also pass in a piece of coherency data in aux_data. A buffer
- of size aux_data_len will be allocated and the coherency data copied in. It is
- assumed that the size is invariant over time. The coherency data is used to
- check the validity of data in the cache. Functions are provided by which the
- coherency data can be updated.
- The file size of the object being cached should also be provided. This may be
- used to trim the data and will be stored with the coherency data.
- This function never returns an error, though it may return a NULL cookie on
- allocation failure or if fscache is not enabled. It is safe to pass in a NULL
- volume cookie and pass the NULL cookie returned to any function that takes it.
- This will cause that function to do nothing.
- When the network filesystem has finished with a cookie, it should relinquish it
- by calling::
- void fscache_relinquish_cookie(struct fscache_cookie *cookie,
- bool retire);
- This will cause fscache to either commit the storage backing the cookie or
- delete it.
- Marking A Cookie In-Use
- =======================
- Once a cookie has been acquired by a network filesystem, the filesystem should
- tell fscache when it intends to use the cookie (typically done on file open)
- and should say when it has finished with it (typically on file close)::
- void fscache_use_cookie(struct fscache_cookie *cookie,
- bool will_modify);
- void fscache_unuse_cookie(struct fscache_cookie *cookie,
- const void *aux_data,
- const loff_t *object_size);
- The *use* function tells fscache that it will use the cookie and, additionally,
- indicates whether the user intends to modify the contents locally. If not yet
- done, this will trigger the cache backend to go and gather the resources it
- needs to access/store data in the cache. This is done in the background, and
- so may not be complete by the time the function returns.
- The *unuse* function indicates that a filesystem has finished using a cookie.
- It optionally updates the stored coherency data and object size and then
- decreases the in-use counter. When the last user unuses the cookie, it is
- scheduled for garbage collection. If not reused within a short time, the
- resources will be released to reduce system resource consumption.
- A cookie must be marked in-use before it can be accessed for read, write or
- resize - and an in-use mark must be kept whilst there is dirty data in the
- pagecache in order to avoid an oops due to trying to open a file during process
- exit.
- Note that in-use marks are cumulative. For each time a cookie is marked
- in-use, it must be unused.
- Resizing A Data File (Truncation)
- =================================
- If a network filesystem file is resized locally by truncation, the following
- should be called to notify the cache::
- void fscache_resize_cookie(struct fscache_cookie *cookie,
- loff_t new_size);
- The caller must have first marked the cookie in-use. The cookie and the new
- size are passed in and the cache is synchronously resized. This is expected to
- be called from the ``->setattr()`` inode operation under the inode lock.
- Data I/O API
- ============
- To do data I/O operations directly through a cookie, the following functions
- are available::
- int fscache_begin_read_operation(struct netfs_cache_resources *cres,
- struct fscache_cookie *cookie);
- int fscache_read(struct netfs_cache_resources *cres,
- loff_t start_pos,
- struct iov_iter *iter,
- enum netfs_read_from_hole read_hole,
- netfs_io_terminated_t term_func,
- void *term_func_priv);
- int fscache_write(struct netfs_cache_resources *cres,
- loff_t start_pos,
- struct iov_iter *iter,
- netfs_io_terminated_t term_func,
- void *term_func_priv);
- The *begin* function sets up an operation, attaching the resources required to
- the cache resources block from the cookie. Assuming it doesn't return an error
- (for instance, it will return -ENOBUFS if given a NULL cookie, but otherwise do
- nothing), then one of the other two functions can be issued.
- The *read* and *write* functions initiate a direct-IO operation. Both take the
- previously set up cache resources block, an indication of the start file
- position, and an I/O iterator that describes the buffer and indicates the
- amount of data.
- The read function also takes a parameter to indicate how it should handle a
- partially populated region (a hole) in the disk content. This may be to ignore
- it, skip over an initial hole and place zeros in the buffer or give an error.
- The read and write functions can be given an optional termination function that
- will be run on completion::
- typedef
- void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
- bool was_async);
- If a termination function is given, the operation will be run asynchronously
- and the termination function will be called upon completion. If not given, the
- operation will be run synchronously. Note that in the asynchronous case, it is
- possible for the operation to complete before the function returns.
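- The shape of a termination function can be shown with a userspace sketch. The
- typedef is copied from above; everything else is hypothetical, including the
- ``mock_cache_read()`` stand-in, which simply copies from a memory buffer and
- whose return values are this mock's own convention rather than those of
- ``fscache_read()``:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

typedef void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
				      bool was_async);

/* Hypothetical per-request state handed to the handler through priv. */
struct read_result {
	bool done;
	ssize_t transferred;
	int error;		/* positive errno value, or 0 */
};

/* Example handler: split the combined argument into count and error. */
void example_read_done(void *priv, ssize_t transferred_or_error,
		       bool was_async)
{
	struct read_result *res = priv;

	(void)was_async;
	res->done = true;
	if (transferred_or_error < 0) {
		res->error = (int)-transferred_or_error;
		res->transferred = 0;
	} else {
		res->error = 0;
		res->transferred = transferred_or_error;
	}
}

/*
 * Mock read.  With a term_func the result is delivered through it (here
 * before the function returns, which the text above says is possible) and
 * 0 is returned; without one the result is returned directly.
 */
int mock_cache_read(const char *src, size_t src_len, char *dst, size_t dst_len,
		    netfs_io_terminated_t term_func, void *priv)
{
	ssize_t ret;

	assert(dst != NULL);
	if (!src) {
		ret = -ENOBUFS;
	} else {
		size_t n = src_len < dst_len ? src_len : dst_len;

		memcpy(dst, src, n);
		ret = (ssize_t)n;
	}

	if (term_func) {
		term_func(priv, ret, false);
		return 0;
	}
	return (int)ret;
}
```

- The handler receives either a byte count or a negative error code in the same
- argument, so it must test the sign before using the value.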
- Both the read and write functions end the operation when they complete,
- detaching any pinned resources.
- The read operation will fail with ESTALE if invalidation occurred whilst the
- operation was ongoing.
- Data File Coherency
- ===================
- To request an update of the coherency data and file size on a cookie, the
- following should be called::
- void fscache_update_cookie(struct fscache_cookie *cookie,
- const void *aux_data,
- const loff_t *object_size);
- This will update the cookie's coherency data and/or file size.
- Data File Invalidation
- ======================
- Sometimes it will be necessary to invalidate an object that contains data.
- Typically this will be necessary when the server informs the network filesystem
- of a remote third-party change - at which point the filesystem has to throw
- away the state and cached data that it had for a file and reload from the
- server.
- To indicate that a cache object should be invalidated, the following should be
- called::
- void fscache_invalidate(struct fscache_cookie *cookie,
- const void *aux_data,
- loff_t size,
- unsigned int flags);
- This increases the invalidation counter in the cookie to cause outstanding
- reads to fail with -ESTALE, sets the coherency data and file size from the
- information supplied, blocks new I/O on the cookie and dispatches the cache to
- go and get rid of the old data.
- Invalidation runs asynchronously in a worker thread so that it doesn't block
- too much.
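- The way an invalidation counter lets a read detect a conflicting invalidation
- can be modelled in userspace. This is a sketch of the idea only, assuming the
- operation snapshots the counter when it begins and compares it on completion;
- the structures and ``mock_*`` names are hypothetical and do not reflect the
- actual fscache data structures:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the cookie's invalidation state. */
struct mock_inval_cookie {
	unsigned int inval_counter;
};

/* Hypothetical stand-in for the state captured when an operation begins. */
struct mock_op {
	const struct mock_inval_cookie *cookie;
	unsigned int inval_at_begin;
};

/* Begin an operation: remember the counter value seen at this point. */
void mock_begin_operation(struct mock_op *op,
			  const struct mock_inval_cookie *cookie)
{
	assert(op != NULL && cookie != NULL);
	op->cookie = cookie;
	op->inval_at_begin = cookie->inval_counter;
}

/* Invalidate: bump the counter so that outstanding reads notice. */
void mock_invalidate(struct mock_inval_cookie *cookie)
{
	assert(cookie != NULL);
	cookie->inval_counter++;
}

/* Complete a read: fail with -ESTALE if an invalidation intervened. */
int mock_end_read(const struct mock_op *op)
{
	assert(op != NULL && op->cookie != NULL);
	if (op->inval_at_begin != op->cookie->inval_counter)
		return -ESTALE;
	return 0;
}
```

- A read that begins after the invalidation sees the new counter value and so
- completes normally in this model.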
- Write-Back Resource Management
- ==============================
- To write data to the cache from network filesystem writeback, the cache
- resources required need to be pinned at the point the modification is made (for
- instance when the page is marked dirty) as it's not possible to open a file in
- a thread that's exiting.
- The following facilities are provided to manage this:
- * An inode flag, ``I_PINNING_FSCACHE_WB``, is provided to indicate that an
- in-use mark is held on the cookie for this inode. It can only be changed if
- the inode lock is held.
- * A flag, ``unpinned_fscache_wb``, is placed in the ``writeback_control``
- struct that gets set if ``__writeback_single_inode()`` clears
- ``I_PINNING_FSCACHE_WB`` because all the dirty pages were cleared.
- To support this, the following functions are provided::
- bool fscache_dirty_folio(struct address_space *mapping,
- struct folio *folio,
- struct fscache_cookie *cookie);
- void fscache_unpin_writeback(struct writeback_control *wbc,
- struct fscache_cookie *cookie);
- void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
- struct inode *inode,
- const void *aux);
- The *dirty_folio* function (``fscache_dirty_folio()``) is intended to be
- called from the filesystem's ``dirty_folio`` address space operation. If
- ``I_PINNING_FSCACHE_WB`` is not set, it sets that flag and increments the use
- count on the cookie (the caller must already have called
- ``fscache_use_cookie()``).
- The *unpin* function is intended to be called from the filesystem's
- ``write_inode`` superblock operation. It cleans up after writing by unusing
- the cookie if unpinned_fscache_wb is set in the writeback_control struct.
- The *clear* function is intended to be called from the netfs's ``evict_inode``
- superblock operation. It must be called *after*
- ``truncate_inode_pages_final()``, but *before* ``clear_inode()``. This cleans
- up any hanging ``I_PINNING_FSCACHE_WB``. It also allows the coherency data to
- be updated.
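- The interplay between the pinning flag and the cookie's use count can be
- followed in a small userspace model. It is a sketch under simplifying
- assumptions, not the kernel code: the ``mock_*`` names are hypothetical, the
- use count is kept in the mock inode for brevity, and locking is omitted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the inode plus its cookie's use count. */
struct mock_inode {
	bool pinning_fscache_wb;	/* models I_PINNING_FSCACHE_WB */
	unsigned int cookie_use_count;	/* in-use marks on the cookie */
};

/* Hypothetical stand-in for struct writeback_control. */
struct mock_wbc {
	bool unpinned_fscache_wb;
};

/* dirty_folio path: the first dirtying takes one extra in-use mark. */
void mock_dirty_folio(struct mock_inode *inode)
{
	assert(inode != NULL);
	assert(inode->cookie_use_count > 0);	/* caller already uses cookie */
	if (!inode->pinning_fscache_wb) {
		inode->pinning_fscache_wb = true;
		inode->cookie_use_count++;
	}
}

/* Models __writeback_single_inode() finding no dirty pages left. */
void mock_writeback_all_clean(struct mock_inode *inode, struct mock_wbc *wbc)
{
	assert(inode != NULL && wbc != NULL);
	if (inode->pinning_fscache_wb) {
		inode->pinning_fscache_wb = false;
		wbc->unpinned_fscache_wb = true;
	}
}

/* write_inode path: drop the extra mark if writeback unpinned it. */
void mock_unpin_writeback(struct mock_inode *inode, const struct mock_wbc *wbc)
{
	assert(inode != NULL && wbc != NULL);
	if (wbc->unpinned_fscache_wb)
		inode->cookie_use_count--;
}

/* evict_inode path: clean up a pin that is still hanging. */
void mock_clear_inode_writeback(struct mock_inode *inode)
{
	assert(inode != NULL);
	if (inode->pinning_fscache_wb) {
		inode->pinning_fscache_wb = false;
		inode->cookie_use_count--;
	}
}
```

- However many folios are dirtied, only one extra in-use mark is taken, and it
- is dropped exactly once, either after writeback or at eviction.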
- Caching of Local Modifications
- ==============================
- If a network filesystem has locally modified data that it wants to write to the
- cache, it needs to mark the pages to indicate that a write is in progress, and
- if the mark is already present, it needs to wait for it to be removed first
- (presumably due to an already in-progress operation). This prevents multiple
- competing DIO writes to the same storage in the cache.
- Firstly, the netfs should determine if caching is available by doing something
- like::
- bool caching = fscache_cookie_enabled(cookie);
- If caching is to be attempted, pages should be waited for and then marked using
- the following functions provided by the netfs helper library::
- void set_page_fscache(struct page *page);
- void wait_on_page_fscache(struct page *page);
- int wait_on_page_fscache_killable(struct page *page);
- Once all the pages in the span are marked, the netfs can ask fscache to
- schedule a write of that region::
- void fscache_write_to_cache(struct fscache_cookie *cookie,
- struct address_space *mapping,
- loff_t start, size_t len, loff_t i_size,
- netfs_io_terminated_t term_func,
- void *term_func_priv,
- bool caching);
- And if an error occurs before that point is reached, the marks can be removed
- by calling::
- void fscache_clear_page_bits(struct address_space *mapping,
- loff_t start, size_t len,
- bool caching);
- In these functions, a pointer to the mapping to which the source pages are
- attached is passed in and start and len indicate the size of the region that's
- going to be written (it doesn't have to align to page boundaries necessarily,
- but it does have to align to DIO boundaries on the backing filesystem). The
- caching parameter indicates whether caching should be attempted; if it is
- false, the functions do nothing.
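- The alignment requirement on start and len can be illustrated with a userspace
- sketch. The helper names are hypothetical and the DIO block size is supplied
- by the caller as an assumption (4096 in the usage note below). Whether
- rounding a region outwards is acceptable depends on the filesystem, since the
- extra bytes must also be valid data:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: are both ends of the region on block boundaries? */
bool region_is_dio_aligned(uint64_t start, uint64_t len, uint64_t dio_block)
{
	assert(dio_block > 0);
	return start % dio_block == 0 && len % dio_block == 0;
}

/*
 * Hypothetical helper: expand [start, start + len) outwards so that it
 * begins and ends on a multiple of dio_block.
 */
void region_round_out(uint64_t *start, uint64_t *len, uint64_t dio_block)
{
	uint64_t end;

	assert(start != 0 && len != 0 && dio_block > 0);
	end = *start + *len;

	*start -= *start % dio_block;				/* round down */
	end += (dio_block - end % dio_block) % dio_block;	/* round up */
	*len = end - *start;
}
```

- With a 4096-byte block, a region starting at 100 with length 5000 is rounded
- out to start 0 and length 8192, while an already aligned region is left
- unchanged.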
- The write function takes some additional parameters: cookie is the cache
- object to be written to, i_size is the size of the netfs file and term_func is
- an optional completion function, to which term_func_priv will be passed along
- with the error or the amount written.
- Note that the write function will always run asynchronously and will unmark all
- the pages upon completion before calling term_func.
- Page Release and Invalidation
- =============================
- Fscache keeps track of whether we have any data in the cache yet for a cache
- object we've just created. It knows it doesn't have to do any reading until it
- has done a write and then the page it wrote from has been released by the VM,
- after which it *has* to look in the cache.
- To inform fscache that a page might now be in the cache, the following function
- should be called from the ``release_folio`` address space op::
- void fscache_note_page_release(struct fscache_cookie *cookie);
- if the page has been released (i.e. release_folio returned true).
- Page release and page invalidation should also wait for any mark left on the
- page to say that a DIO write is underway from that page::
- void wait_on_page_fscache(struct page *page);
- int wait_on_page_fscache_killable(struct page *page);
- API Function Reference
- ======================
- .. kernel-doc:: include/linux/fscache.h