VNODE(9) NetBSD Kernel Developer's Manual VNODE(9)
NAME
vnode, vref, vrele, vrele_async, vput, vhold, holdrele, vcache_get,
vcache_new, vcache_rekey_enter, vcache_rekey_exit, vrecycle, vgone,
vgonel, vdead_check, vflush, vaccess, bdevvp, cdevvp, vfinddev, vdevgone,
vwakeup, vflushbuf, vinvalbuf, vtruncbuf, vprint -- kernel representation
of a file or directory
SYNOPSIS
#include <sys/param.h>
#include <sys/vnode.h>
void
vref(struct vnode *vp);
void
vrele(struct vnode *vp);
void
vrele_async(struct vnode *vp);
void
vput(struct vnode *vp);
void
vhold(struct vnode *vp);
void
holdrele(struct vnode *vp);
int
vcache_get(struct mount *mp, const void *key, size_t key_len,
struct vnode **vpp);
int
vcache_new(struct mount *mp, struct vnode *dvp, struct vattr *vap,
kauth_cred_t cred, void *extra, struct vnode **vpp);
int
vcache_rekey_enter(struct mount *mp, struct vnode *vp,
const void *old_key, size_t old_key_len, const void *new_key,
size_t new_key_len);
void
vcache_rekey_exit(struct mount *mp, struct vnode *vp,
const void *old_key, size_t old_key_len, const void *new_key,
size_t new_key_len);
int
vrecycle(struct vnode *vp);
void
vgone(struct vnode *vp);
void
vgonel(struct vnode *vp, struct lwp *l);
int
vdead_check(struct vnode *vp, int flags);
int
vflush(struct mount *mp, struct vnode *skipvp, int flags);
int
vaccess(enum vtype type, mode_t file_mode, uid_t uid, gid_t gid,
mode_t acc_mode, kauth_cred_t cred);
int
bdevvp(dev_t dev, struct vnode **vpp);
int
cdevvp(dev_t dev, struct vnode **vpp);
int
vfinddev(dev_t dev, enum vtype type, struct vnode **vpp);
void
vdevgone(int maj, int minl, int minh, enum vtype type);
void
vwakeup(struct buf *bp);
int
vflushbuf(struct vnode *vp, int sync);
int
vinvalbuf(struct vnode *vp, int flags, kauth_cred_t cred, struct lwp *l,
int slpflag, int slptimeo);
int
vtruncbuf(struct vnode *vp, daddr_t lbn, int slpflag, int slptimeo);
void
vprint(const char *label, struct vnode *vp);
DESCRIPTION
A vnode represents an on-disk file in use by the system. Each vfs(9)
file system provides a set of vnodeops(9) operations on vnodes, invoked
by file-system-independent system calls and supported by file-system-
independent library routines.
Each mounted file system provides a vnode for the root of the file sys-
tem, via VFS_ROOT(9). Other vnodes are obtained by VOP_LOOKUP(9). Users
of vnodes usually invoke these indirectly via namei(9) to obtain vnodes
from paths.
Each file system usually maintains a cache mapping recently used inode
numbers, or the equivalent, to vnodes, and a cache mapping recently used
file names to vnodes. If memory is scarce, the system may decide to
reclaim an unused cached vnode, calling VOP_RECLAIM(9) to remove it from
the caches and to free file-system-specific memory associated with it. A
file system may also choose to immediately reclaim a cached vnode once it
is unused, in VOP_INACTIVE(9), if the vnode has been deleted on disk.
When a file system retrieves a vnode from a cache, the vnode may not have
any users, and another thread in the system may be simultaneously decid-
ing to reclaim it. Thus, to retrieve a vnode from a cache, one must use
vcache_get(), not vref(), to acquire the first reference.
The vnode has the following structure:
struct vnode {
	struct uvm_object v_uobj;	/* the VM object */
	kcondvar_t v_cv;		/* synchronization */
	voff_t v_size;			/* size of file */
	voff_t v_writesize;		/* new size after write */
	int v_iflag;			/* VI_* flags */
	int v_vflag;			/* VV_* flags */
	int v_uflag;			/* VU_* flags */
	int v_numoutput;		/* # of pending writes */
	int v_writecount;		/* ref count of writers */
	int v_holdcnt;			/* page & buffer refs */
	struct mount *v_mount;		/* ptr to vfs we are in */
	int (**v_op)(void *);		/* vnode operations vector */
	struct buflists v_cleanblkhd;	/* clean blocklist head */
	struct buflists v_dirtyblkhd;	/* dirty blocklist head */
	union {
		struct mount *vu_mountedhere;	/* ptr to vfs (VDIR) */
		struct socket *vu_socket;	/* unix ipc (VSOCK) */
		struct specnode *vu_specnode;	/* device (VCHR, VBLK) */
		struct fifoinfo *vu_fifoinfo;	/* fifo (VFIFO) */
		struct uvm_ractx *vu_ractx;	/* read-ahead ctx (VREG) */
	} v_un;
	enum vtype v_type;		/* vnode type */
	enum vtagtype v_tag;		/* type of underlying data */
	void *v_data;			/* private data for fs */
	struct klist v_klist;		/* notes attached to vnode */
};
Most members of the vnode structure should be treated as opaque and only
manipulated using the proper functions. There are some rather common
exceptions detailed throughout this page.
Files and file systems are inextricably linked with the virtual memory
system and v_uobj contains the data maintained by the virtual memory sys-
tem. For compatibility with code written before the integration of
uvm(9) into NetBSD, C-preprocessor directives are used to alias the mem-
bers of v_uobj.
Vnode flags are recorded by v_iflag, v_vflag and v_uflag. Valid flags
are:
VV_ROOT This vnode is the root of its file system.
VV_SYSTEM This vnode is being used by the kernel; only used to
skip quota files in vflush().
VV_ISTTY This vnode represents a tty; used when reading dead
vnodes.
VV_MAPPED This vnode might have user mappings.
VV_MPSAFE This file system is MP safe.
VV_LOCKSWORK This vnode's file system supports locking.
VI_TEXT This vnode is a pure text prototype.
VI_EXECMAP This vnode has executable mappings.
VI_WRMAP This vnode might have PROT_WRITE user mappings.
VI_WRMAPDIRTY This vnode might have dirty pages due to VWRITEMAP.
VI_XLOCK This vnode is currently locked to change underlying
type.
VI_ONWORKLST This vnode is on syncer work-list.
VI_MARKER A dummy marker vnode.
VI_CLEAN This vnode has been reclaimed and is no longer
attached to a file system.
VU_DIROP This vnode is involved in a directory operation.
This flag is used exclusively by LFS.
The VI_XLOCK flag is used to prevent multiple processes from entering the
vnode reclamation code. It is also used as a flag to indicate that
reclamation is in progress. Before v_iflag can be modified, the
v_interlock mutex must be acquired. See lock(9) for details on the ker-
nel locking API.
Each vnode has three reference counts: v_usecount, v_writecount and
v_holdcnt. The first is the number of active references within the ker-
nel to the vnode. This count is maintained by vref(), vrele(),
vrele_async(), and vput(). The second is the number of active references
within the kernel to the vnode performing write access to the file. It
is maintained by the open(2) and close(2) system calls. The third is the
number of references within the kernel requiring the vnode to remain
active and not be recycled. This count is maintained by vhold() and
holdrele(). When both the v_usecount and v_holdcnt reach zero, the vnode
is cached. The transition from the cache is handled by a kernel thread
and vrecycle(). Access to v_usecount, v_writecount and v_holdcnt is also
protected by the v_interlock mutex.
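The interplay of v_usecount and v_holdcnt described above can be sketched in user space. This is a toy model only: struct toy_vnode, the toy_*() helpers and the three state names are hypothetical, and the v_interlock locking the kernel requires is deliberately omitted.

```c
#include <assert.h>

/* Toy stand-in for struct vnode; only the two counts discussed above. */
struct toy_vnode {
	int v_usecount;		/* active references */
	int v_holdcnt;		/* page and buffer references */
};

/* Hypothetical state names for the three situations in the text. */
enum toy_state { TOY_ACTIVE, TOY_HELD, TOY_CACHED };

/* Classify a vnode from its two counts. */
static enum toy_state
toy_state(const struct toy_vnode *vp)
{
	if (vp->v_usecount > 0)
		return TOY_ACTIVE;
	if (vp->v_holdcnt > 0)
		return TOY_HELD;	/* would sit on the hold list */
	return TOY_CACHED;
}

/* Stand-in for taking the first reference on a cached vnode. */
static void
toy_first_ref(struct toy_vnode *vp)
{
	assert(vp->v_usecount == 0);
	vp->v_usecount = 1;
}

/* Model of vref(): only valid when a reference is already held. */
static void
toy_vref(struct toy_vnode *vp)
{
	assert(vp->v_usecount > 0);
	vp->v_usecount++;
}

/* Model of vrele(): drop one active reference. */
static void
toy_vrele(struct toy_vnode *vp)
{
	assert(vp->v_usecount > 0);
	vp->v_usecount--;
}

/* Model of vhold(): add a hold reference. */
static void
toy_vhold(struct toy_vnode *vp)
{
	vp->v_holdcnt++;
}

/* Model of holdrele(): drop a hold reference. */
static void
toy_holdrele(struct toy_vnode *vp)
{
	assert(vp->v_holdcnt > 0);
	vp->v_holdcnt--;
}
```

In this model a vnode only returns to the cached state once both counts have dropped to zero, whatever order the references are released in.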
The number of pending synchronous and asynchronous writes on the vnode
is recorded in v_numoutput. It is used by fsync(2) to wait for all
writes to complete before returning to the user. Its value must only be
modified at splbio (see spl(9)). It does not track the number of dirty
buffers attached to the vnode.
The link to the file system which owns the vnode is recorded by v_mount.
See vfsops(9) for further information on file system mount status.
The v_op pointer points to its vnode operations vector. This vector
describes what operations can be done to the file associated with the
vnode. The system maintains one vnode operations vector for each file
system type configured into the kernel. The vnode operations vector con-
tains a pointer to a function for each operation supported by the file
system. See vnodeops(9) for a description of vnode operations.
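The idea of an operations vector can be illustrated with a small user-space sketch. Everything here is hypothetical (the slot numbers, struct toy_open_args, the toyfs_*() functions and toy_vcall()); only the shape of the v_op member, a vector of pointers to functions taking a single void * argument, is taken from the structure shown earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation slots; a real kernel assigns these itself. */
enum { TOY_OP_OPEN, TOY_OP_CLOSE, TOY_OP_COUNT };

typedef int (*toy_vop_t)(void *);

struct toy_vnode {
	toy_vop_t *v_op;	/* same shape as int (**v_op)(void *) */
	int opens;		/* toy per-vnode state */
};

/* Each operation receives its arguments bundled in one structure. */
struct toy_open_args {
	struct toy_vnode *a_vp;
};

static int
toyfs_open(void *v)
{
	struct toy_open_args *ap = v;

	ap->a_vp->opens++;
	return 0;
}

static int
toyfs_close(void *v)
{
	struct toy_open_args *ap = v;

	ap->a_vp->opens--;
	return 0;
}

/* One vector per (toy) file system type, shared by all of its vnodes. */
static toy_vop_t toyfs_vnodeops[TOY_OP_COUNT] = {
	[TOY_OP_OPEN] = toyfs_open,
	[TOY_OP_CLOSE] = toyfs_close,
};

/* Dispatch an operation through the vnode's vector. */
static int
toy_vcall(struct toy_vnode *vp, int op, void *args)
{
	return (*vp->v_op[op])(args);
}
```

File-system-independent code calls toy_vcall() without knowing which file system implements the vnode; the vector selects the implementation.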
When a user wants a new vnode for another file or wants a valid vnode
which is cached, vcache_get() or vcache_new() is invoked to allocate a
vnode and initialize it for the new file.
The type of object the vnode represents is recorded by v_type. It is
used by generic code to perform checks to ensure operations are performed
on valid file system objects. Valid types are:
VNON The vnode has no type.
VREG The vnode represents a regular file.
VDIR The vnode represents a directory.
VBLK The vnode represents a block special device.
VCHR The vnode represents a character special device.
VLNK The vnode represents a symbolic link.
VSOCK The vnode represents a socket.
VFIFO The vnode represents a pipe.
VBAD The vnode represents a bad file (not currently used).
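As an illustration of how generic code might use v_type, the following user-space sketch classifies the types listed above. The enum is a local copy whose numeric values are illustrative, and toy_is_device() and toy_require_dir() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <errno.h>

/* Type names as listed above; the numeric values here are illustrative. */
enum toy_vtype { VNON, VREG, VDIR, VBLK, VCHR, VLNK, VSOCK, VFIFO, VBAD };

/* Return nonzero if the type refers to a device special file. */
static int
toy_is_device(enum toy_vtype type)
{
	return type == VBLK || type == VCHR;
}

/* Hypothetical guard for an operation that only makes sense on directories. */
static int
toy_require_dir(enum toy_vtype type)
{
	switch (type) {
	case VDIR:
		return 0;
	case VNON:
	case VBAD:
		return EINVAL;		/* not a usable object at all */
	default:
		return ENOTDIR;
	}
}
```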
Vnode tag types are used by external programs only (e.g., pstat(8)), and
should never be inspected by the kernel. Their use is deprecated since new
v_tag values cannot be defined for loadable file systems. The v_tag member
is read-only. Valid tag types are:
VT_NON non file system
VT_UFS universal file system
VT_NFS network file system
VT_MFS memory file system
VT_MSDOSFS FAT file system
VT_LFS log-structured file system
VT_LOFS loopback file system
VT_FDESC file descriptor file system
VT_NULL null file system layer
VT_UMAP uid/gid remapping file system layer
VT_KERNFS kernel interface file system
VT_PROCFS process interface file system
VT_AFS AFS file system
VT_ISOFS ISO 9660 file system(s)
VT_UNION union file system
VT_ADOSFS Amiga file system
VT_EXT2FS Linux's ext2 file system
VT_CODA Coda file system
VT_FILECORE filecore file system
VT_NTFS Microsoft NT's file system
VT_VFS virtual file system
VT_OVERLAY overlay file system
VT_SMBFS SMB file system
VT_PTYFS pseudo-terminal device file system
VT_TMPFS efficient memory file system
VT_UDF universal disk format file system
VT_SYSVBFS systemV boot file system
The vnode lock is acquired by calling vn_lock(9) and released by calling
VOP_UNLOCK(9). The reason for this asymmetry is that vn_lock(9) is a
wrapper for VOP_LOCK(9) with extra checks, while the unlocking step usu-
ally does not need additional checks and thus has no wrapper.
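The asymmetry between a checked lock wrapper and a bare unlock operation can be modelled in user space as follows. This is a sketch under stated assumptions: struct toy_vnode and all toy_*() functions are hypothetical, and the particular extra check shown (refusing a dead vnode with ENOENT) is chosen for illustration rather than taken from the vn_lock(9) implementation.

```c
#include <assert.h>
#include <errno.h>

struct toy_vnode {
	int locked;	/* 1 while the toy lock is held */
	int dead;	/* 1 once the vnode has been reclaimed */
};

/* Stand-in for the per-file-system lock operation. */
static int
toy_vop_lock(struct toy_vnode *vp)
{
	if (vp->locked)
		return EBUSY;	/* the toy lock never sleeps */
	vp->locked = 1;
	return 0;
}

/* Stand-in for the unlock operation; called directly, no wrapper. */
static void
toy_vop_unlock(struct toy_vnode *vp)
{
	vp->locked = 0;
}

/* Wrapper: take the lock, then apply an extra (assumed) check. */
static int
toy_vn_lock(struct toy_vnode *vp)
{
	int error;

	error = toy_vop_lock(vp);
	if (error != 0)
		return error;
	if (vp->dead) {
		/* Do not hand out a lock on a vnode that is gone. */
		toy_vop_unlock(vp);
		return ENOENT;
	}
	return 0;
}
```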
The vnode locking operation is complicated because it is used for many
purposes. Sometimes it is used to bundle a series of vnode operations
(see vnodeops(9)) into an atomic group. Many file systems rely on it to
prevent race conditions in updating file system type specific data struc-
tures rather than using their own private locks. The vnode lock can
operate as a multiple-reader (shared-access) lock or a single-writer
(exclusive-access) lock; however, many current file system implementations
were written assuming only single-writer locking. Multiple-reader locking
functions equivalently only in the presence of big-lock SMP locking
or a uni-processor machine. The lock may be held while sleeping. While
the vnode lock is acquired, the holder is guaranteed that the vnode will
not be reclaimed or invalidated. Most file system functions require that
you hold the vnode lock on entry. See lock(9) for details on the kernel
locking API.
Each file system underlying a vnode allocates its own private area and
hangs it from v_data.
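A minimal user-space sketch of the v_data idiom follows. The names struct toyfs_node, VTOTOY(), toyfs_attach() and toyfs_reclaim() are hypothetical; the sketch only shows a file system attaching its private per-file structure to the vnode and freeing it again when the vnode is reclaimed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct vnode; only the member discussed above. */
struct toy_vnode {
	void *v_data;		/* private data for fs */
};

/* Hypothetical private per-file structure of a toy file system. */
struct toyfs_node {
	unsigned long ino;		/* file identity within the fs */
	struct toy_vnode *vnode;	/* back pointer to the vnode */
};

/* Convert a vnode to the file system's private node. */
#define VTOTOY(vp)	((struct toyfs_node *)(vp)->v_data)

/* Allocate the private area and hang it from v_data. */
static int
toyfs_attach(struct toy_vnode *vp, unsigned long ino)
{
	struct toyfs_node *np;

	np = malloc(sizeof(*np));
	if (np == NULL)
		return -1;
	np->ino = ino;
	np->vnode = vp;
	vp->v_data = np;
	return 0;
}

/* Free the private area, as a reclaim operation would. */
static void
toyfs_reclaim(struct toy_vnode *vp)
{
	free(vp->v_data);
	vp->v_data = NULL;
}
```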
Most functions discussed in this page that operate on vnodes cannot be
called from interrupt context. The members v_numoutput, v_holdcnt,
v_dirtyblkhd, and v_cleanblkhd are modified in interrupt context and must
be protected by splbio(9) unless it is certain that there is no chance an
interrupt handler will modify them. The vnode lock must not be acquired
within interrupt context.
FUNCTIONS
vref(vp)
Increment v_usecount of the vnode vp. Any kernel thread which
uses a vnode (e.g., during the operation of some algorithm or to
store in a data structure) should call vref().
vrele(vp)
Decrement v_usecount of unlocked vnode vp. Any code in the sys-
tem which is using a vnode should call vrele() when it is fin-
ished with the vnode. If v_usecount of the vnode reaches zero
and v_holdcnt is greater than zero, the vnode is placed on the
holdlist. If both v_usecount and v_holdcnt are zero, the vnode
is cached.
vrele_async(vp)
Asynchronously release the vnode vp in a different context than
the caller, sometime after the call.
vput(vp)
Legacy convenience routine for unlocking and releasing vp.
Equivalent to:
VOP_UNLOCK(vp);
vrele(vp);
New code should prefer using VOP_UNLOCK(9) and vrele() directly.
vhold(vp)
Mark the vnode vp as active by incrementing vp->v_holdcnt. Once
held, the vnode will not be recycled until it is released with
holdrele().
holdrele(vp)
Mark the vnode vp as inactive by decrementing vp->v_holdcnt.
vcache_get(mp, key, key_len, vpp)
Retrieve the vnode identified by key, allocating a new vnode if it
is not already cached. The vnode is returned referenced in the
address specified by vpp.
The argument mp is the mount point for the file system to look up
the file in.
The arguments key and key_len uniquely identify the file in the
file system.
If a vnode is successfully retrieved, zero is returned; otherwise
an appropriate error code is returned.
vcache_new(mp, dvp, vap, cred, extra, vpp)
Allocate a new vnode with a new file. The new vnode is returned
referenced in the address specified by vpp.
The argument mp is the mount point for the file system to create
the file in.
The argument dvp points to the directory to create the file in.
The argument vap points to the attributes for the file to cre-
ate.
The argument cred holds the credentials for the file to create.
The argument extra allows the caller to pass more information
about the file to create.
If a vnode is successfully created zero is returned, otherwise
an appropriate error code is returned.
vcache_rekey_enter(mp, vp, old_key, old_key_len, new_key, new_key_len)
Prepare to change the key of a cached vnode.
The argument mp is the mount point for the file system the vnode
vp resides in.
The arguments old_key and old_key_len identify the cached vnode.
The arguments new_key and new_key_len will identify the vnode
after rename.
If the new key already exists EEXIST is returned, otherwise zero
is returned.
vcache_rekey_exit(mp, vp, old_key, old_key_len, new_key, new_key_len)
Finish rename after calling vcache_rekey_enter().
vrecycle(vp)
Recycle the referenced vnode vp if this is the last reference.
vrecycle() is a null operation if the reference count is greater
than one.
vgone(vp)
Eliminate all activity associated with the unlocked vnode vp in
preparation for recycling. This operation is restricted to sus-
pended file systems. See vfs_suspend(9).
vgonel(vp, l)
Eliminate all activity associated with the locked vnode vp in
preparation for recycling.
vdead_check(vp, flags)
Check the vnode vp for being or becoming dead. Returns ENOENT
for a dead vnode and zero otherwise. If flags is VDEAD_NOWAIT
it will return EBUSY if the vnode is becoming dead and the func-
tion will not sleep.
Whenever this function returns a non-zero value all future calls
for this vp will also return a non-zero value.
vflush(mp, skipvp, flags)
Remove any vnodes in the vnode table belonging to mount point
mp. If skipvp is not NULL it is exempt from being flushed. The
argument flags is a set of flags modifying the operation of
vflush(). If FORCECLOSE is not specified, there should not be
any active vnodes and the error EBUSY is returned if any are
found (this is a user error, not a system error). If FORCECLOSE
is specified, active vnodes that are found are detached. If
WRITECLOSE is set, only flush out regular file vnodes open for
writing. SKIPSYSTEM causes any vnodes marked VV_SYSTEM to be
skipped.
vaccess(type, file_mode, uid, gid, acc_mode, cred)
Do access checking by comparing the file's permissions to the
caller's desired access type acc_mode and credentials cred.
bdevvp(dev, vpp)
Create a vnode for a block device. bdevvp() is used for root
file systems, swap areas and for memory file system special
devices.
cdevvp(dev, vpp)
Create a vnode for a character device. cdevvp() is used for the
console and kernfs special devices.
vfinddev(dev, type, vpp)
Look up a vnode by device number. The vnode is referenced and
returned in the address specified by vpp.
vdevgone(maj, minl, minh, type)
Reclaim all vnodes that correspond to the specified minor number
range minl to minh (endpoints inclusive) of the specified major
maj.
vwakeup(bp)
Update outstanding I/O count vp->v_numoutput for the vnode
bp->b_vp and do a wakeup if requested and vp->vflag has VBWAIT
set.
vflushbuf(vp, sync)
Flush all dirty buffers to disk for the file with the locked
vnode vp. The argument sync specifies whether the I/O should be
synchronous; if so, vflushbuf() will sleep until vp->v_numoutput is
zero and vp->v_dirtyblkhd is empty.
vinvalbuf(vp, flags, cred, l, slpflag, slptimeo)
Flush out and invalidate all buffers associated with locked
vnode vp. The arguments l and cred specify the calling LWP
and its credentials. The ltsleep(9) flag and timeout are specified
by the arguments slpflag and slptimeo respectively. If the
operation is successful, zero is returned; otherwise an appropriate
error code is returned.
vtruncbuf(vp, lbn, slpflag, slptimeo)
Destroy any in-core buffers past the file truncation length for
the locked vnode vp. The truncation length is specified by lbn.
vtruncbuf() will sleep while the I/O is performed. The
ltsleep(9) flag and timeout are specified by the arguments
slpflag and slptimeo respectively. If the operation is success-
ful zero is returned, otherwise an appropriate error code is
returned.
vprint(label, vp)
This function is used by the kernel to dump vnode information
during a panic. It is only used if the kernel option DIAGNOSTIC
is compiled into the kernel. The argument label is a string to
prefix the information dump of vnode vp.
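The kind of comparison that vaccess() performs between a file's permission bits and the caller's identity can be sketched in user space. This is a simplified model, not the kernel routine: toy_access() and the TOY_* request bits are hypothetical, and superuser overrides, supplementary groups and kauth(9) credential handling are all left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical request bits, laid out like one rwx permission triplet. */
#define TOY_EXEC	01u
#define TOY_WRITE	02u
#define TOY_READ	04u

/*
 * Pick the owner, group or other triplet of file_mode according to the
 * caller's identity and check that every requested bit is granted.
 * Returns 0 on success and EACCES otherwise.
 */
static int
toy_access(unsigned file_mode, unsigned file_uid, unsigned file_gid,
    unsigned want, unsigned cred_uid, unsigned cred_gid)
{
	unsigned granted;

	if (cred_uid == file_uid)
		granted = (file_mode >> 6) & 07u;	/* owner bits */
	else if (cred_gid == file_gid)
		granted = (file_mode >> 3) & 07u;	/* group bits */
	else
		granted = file_mode & 07u;		/* other bits */

	return (granted & want) == want ? 0 : EACCES;
}
```

Note that exactly one triplet is consulted: in this model an owner whose own bits deny access is refused even if the group or other bits would allow it.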
CODE REFERENCES
The vnode framework is implemented within the file sys/kern/vfs_subr.c.
SEE ALSO
intro(9), lock(9), namecache(9), namei(9), uvm(9), vattr(9), vfs(9),
vfsops(9), vnodeops(9), vnsubr(9)
BUGS
The locking protocol is inconsistent. Many vnode operations are passed
locked vnodes on entry but release the lock before they exit. The lock-
ing protocol is used in some places to attempt to make a series of opera-
tions atomic (e.g., access check then operation). This does not work for
non-local file systems that do not support locking (e.g., NFS). The
vnode interface would benefit from a simpler locking protocol.
NetBSD 9.3 January 1, 2019 NetBSD 9.3