BUFFERIO(9) NetBSD Kernel Developer's Manual BUFFERIO(9)
NAME
BUFFERIO, biodone, biowait, getiobuf, putiobuf, nestiobuf_setup,
nestiobuf_done -- block I/O buffer transfers
SYNOPSIS
#include <sys/buf.h>
void
biodone(buf_t *bp);
int
biowait(buf_t *bp);
buf_t *
getiobuf(struct vnode *vp, bool waitok);
void
putiobuf(buf_t *bp);
void
nestiobuf_setup(buf_t *mbp, buf_t *bp, int offset, size_t size);
void
nestiobuf_done(buf_t *mbp, int donebytes, int error);
DESCRIPTION
The BUFFERIO subsystem manages block I/O buffer transfers. Each transfer
is described by a struct buf, which is shared among three kinds of users:
direct users of BUFFERIO, users of buffercache(9), and block device
drivers, which execute the transfers to physical disks.
BLOCK DEVICE USERS
Users of BUFFERIO wishing to submit a buffer for block I/O transfer must
obtain a struct buf, e.g. via getiobuf(), fill its parameters, and submit
it to a block device with bdev_strategy(9), usually via VOP_STRATEGY(9).
The parameters to an I/O transfer described by bp are specified by the
following struct buf fields:
bp->b_flags
Flags specifying the type of transfer.
B_READ Transfer is read from device. If not set, transfer
is write to device.
B_ASYNC
Asynchronous I/O. Caller must not provide
bp->b_iodone and must not call biowait(bp).
For legibility, callers should indicate writes by passing the
pseudo-flag B_WRITE, which is zero.
bp->b_data
Pointer to kernel virtual address of source/target for
transfer.
bp->b_bcount
Nonnegative number of bytes requested for transfer.
bp->b_blkno
Block number at which to do transfer.
bp->b_iodone
I/O completion callback. If this is set, B_ASYNC must not be set
in bp->b_flags.
Additionally, if the I/O transfer is a write associated with a vnode(9)
vp, then before the user submits it to a block device, the user must
increment vp->v_numoutput. The user must not acquire vp's vnode lock
between incrementing vp->v_numoutput and submitting bp to a block device
-- doing so will likely cause deadlock with the syncer.
Block I/O transfer completion may be notified by the bp->b_iodone
callback, by signalling biowait() waiters, or not at all in the B_ASYNC
case.
- If the user sets the bp->b_iodone callback to a non-NULL function
pointer, it will be called in soft interrupt context when the I/O
transfer is complete. The user must not call biowait(bp) in this
case.
- If B_ASYNC is set, then the I/O transfer is asynchronous and the user
will not be notified when it is completed. The user must not call
biowait(bp) in this case.
- Otherwise, if bp->b_iodone is NULL and B_ASYNC is not set, the user
may wait for the I/O transfer to complete with biowait(bp).
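The three cases above can be sketched as a small userland model. This is
only an illustration: struct toy_buf, TOY_B_ASYNC, toy_notify_mode() and
toy_may_biowait() are hypothetical names invented here, the flag value is
a placeholder, and none of it is taken from <sys/buf.h> or vfs_bio.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration; NOT the kernel definitions. */
#define TOY_B_ASYNC 0x1u	/* placeholder value, an assumption */

struct toy_buf {
	unsigned b_flags;
	void (*b_iodone)(struct toy_buf *);
};

enum toy_notify {
	TOY_NOTIFY_CALLBACK,	/* b_iodone is called; biowait() forbidden */
	TOY_NOTIFY_NONE,	/* B_ASYNC: no notification; biowait() forbidden */
	TOY_NOTIFY_BIOWAIT	/* caller may block in biowait() */
};

/* A do-nothing completion callback, used only to populate b_iodone. */
static void
toy_noop_iodone(struct toy_buf *bp)
{
	(void)bp;
}

/* Classify how completion of bp would be reported to its submitter. */
static enum toy_notify
toy_notify_mode(const struct toy_buf *bp)
{
	const int async = (bp->b_flags & TOY_B_ASYNC) != 0;

	/* A callback and B_ASYNC must not be combined. */
	assert(!(async && bp->b_iodone != NULL));

	if (bp->b_iodone != NULL)
		return TOY_NOTIFY_CALLBACK;
	if (async)
		return TOY_NOTIFY_NONE;
	return TOY_NOTIFY_BIOWAIT;
}

/* Nonzero only in the one case where waiting is permitted. */
static int
toy_may_biowait(const struct toy_buf *bp)
{
	return toy_notify_mode(bp) == TOY_NOTIFY_BIOWAIT;
}
```

A caller could consult toy_may_biowait() before deciding to wait; in the
kernel the same decision is made by the caller when it fills in b_flags
and b_iodone.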
Once an I/O transfer has completed, its struct buf may be reused, but the
user must first clear the BO_DONE flag of bp->b_oflags before reusing it.
NESTED I/O TRANSFERS
Sometimes an I/O transfer from a single buffer in memory cannot go to a
single location on a block device: it must be split up into smaller
transfers for each segment of the memory buffer.
After initializing the b_flags, b_data, and b_bcount parameters of an I/O
transfer for the buffer, called the master buffer, the user can issue
smaller transfers for segments of the buffer using nestiobuf_setup().
When nested I/O transfers complete, in any order, they debit from the
amount of work left to be done in the master buffer. If any segments of
the buffer were skipped, the user can report this with nestiobuf_done()
to debit the skipped part of the work.
The master buffer's I/O transfer is complete once all nested buffers'
I/O transfers have completed and, if any segments were skipped,
nestiobuf_done() has been called to account for them.
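The debit accounting described above can be modelled in userland as
follows. This is a sketch under stated assumptions, not the code in
sys/kern/vfs_bio.c: struct toy_master, toy_master_init() and toy_debit()
are hypothetical names, locking is omitted, and keeping the most recent
nonzero error is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of a master buffer; NOT struct buf. */
struct toy_master {
	size_t b_bcount;	/* total bytes requested */
	size_t b_resid;		/* bytes of work not yet debited */
	int b_error;		/* last nonzero error seen (assumption) */
	int done;		/* set once all work has been debited */
};

static void
toy_master_init(struct toy_master *m, size_t len)
{
	m->b_bcount = len;
	m->b_resid = len;
	m->b_error = 0;
	m->done = 0;
}

/*
 * Debit donebytes from the master.  Called once per completed nested
 * transfer and once for any skipped part, in any order.
 */
static void
toy_debit(struct toy_master *m, size_t donebytes, int error)
{
	if (donebytes == 0)
		return;
	assert(!m->done);
	assert(donebytes <= m->b_resid);

	m->b_resid -= donebytes;
	if (error != 0)
		m->b_error = error;
	if (m->b_resid == 0)
		m->done = 1;	/* the master transfer completes here */
}
```

With three segments of 4096 bytes, the master in this model completes only
after all three have been debited, whether by a nested completion or by an
explicit debit for a skipped segment.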
For writes associated with a vnode vp, nestiobuf_setup() accounts for
vp->v_numoutput, so the caller is not allowed to acquire vp's vnode lock
after calling nestiobuf_setup() and before submitting the nested I/O
transfer to a block device. However, the caller is responsible for
accounting the master buffer in vp->v_numoutput. This must be done very
carefully because after incrementing vp->v_numoutput, the caller is not
allowed to acquire vp's vnode lock before either calling nestiobuf_done()
or submitting the last nested I/O transfer to a block device.
For example:
        struct buf *mbp, *bp;
        struct vnode *devvp = NULL;     /* used after the loop, too */
        size_t skipped = 0;
        unsigned i;
        int error = 0;

        mbp = getiobuf(vp, true);
        mbp->b_data = data;
        mbp->b_resid = mbp->b_bcount = datalen;
        mbp->b_flags = B_WRITE;

        KASSERT(0 < nsegs);
        KASSERT(datalen == nsegs*segsz);
        for (i = 0; i < nsegs; i++) {
                daddr_t blkno;

                vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
                error = VOP_BMAP(vp, i*segsz, &devvp, &blkno, NULL);
                VOP_UNLOCK(vp);
                if (error == 0 && blkno == -1)
                        error = EIO;
                if (error) {
                        /* Give up early, don't try to handle holes. */
                        skipped += datalen - i*segsz;
                        break;
                }

                bp = getiobuf(vp, true);
                /* Master buffer first, then the nested buffer. */
                nestiobuf_setup(mbp, bp, i*segsz, segsz);
                bp->b_blkno = blkno;
                if (i == nsegs - 1)     /* Last segment. */
                        break;
                VOP_STRATEGY(devvp, bp);
        }

        /*
         * Account v_numoutput for master write.
         * (Must not vn_lock before last VOP_STRATEGY!)
         */
        mutex_enter(vp->v_interlock);
        vp->v_numoutput++;
        mutex_exit(vp->v_interlock);

        if (skipped)
                nestiobuf_done(mbp, skipped, error);
        else
                VOP_STRATEGY(devvp, bp);
BLOCK DEVICE DRIVERS
Block device drivers implement a `strategy' method, in the d_strategy
member of struct bdevsw (driver(9)), to queue a buffer for disk I/O. The
inputs to the strategy method are:
bp->b_flags
Flags specifying the type of transfer.
B_READ Transfer is read from device. If not set, transfer
is write to device.
bp->b_data
Pointer to kernel virtual address of source/target for
transfer.
bp->b_bcount
Nonnegative number of bytes requested for transfer.
bp->b_blkno
Block number at which to do transfer, relative to partition
start.
If the strategy method uses bufq(9), it must additionally initialize the
following fields before queueing bp with bufq_put(9):
bp->b_rawblkno
Block number relative to volume start.
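The relationship between the partition-relative bp->b_blkno and the
volume-relative bp->b_rawblkno can be sketched in userland as below. The
names struct toy_partition and toy_rawblkno() are hypothetical, both block
numbers are assumed to be in the same units, and the bounds check is an
assumption added for the illustration; this is not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy partition description; NOT a disklabel structure. */
struct toy_partition {
	int64_t p_offset;	/* first block of partition within volume */
	int64_t p_size;		/* number of blocks in the partition */
};

/*
 * Translate a partition-relative block number into a volume-relative
 * one.  Returns 0 and stores the result in *rawp on success, or -1 if
 * the nblks-block request does not fit inside the partition.
 */
static int
toy_rawblkno(const struct toy_partition *p, int64_t blkno, int64_t nblks,
    int64_t *rawp)
{
	if (blkno < 0 || nblks < 0 || blkno > p->p_size - nblks)
		return -1;
	*rawp = p->p_offset + blkno;
	return 0;
}
```

For a partition starting at block 2048 of its volume, block 10 of the
partition maps to block 2058 of the volume in this model.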
When the I/O transfer is complete, whether it succeeded or failed, the
strategy method must:
- Set bp->b_error to zero on success, or to an errno(2) error code on
failure.
- Set bp->b_resid to the number of bytes remaining to transfer, whether
on success or on failure. If no bytes were transferred, this must be
set to bp->b_bcount.
- Call biodone(bp).
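The completion steps above can be sketched as a single helper. This is a
userland model for illustration only: struct toy_buf, toy_biodone() and
toy_finish() are hypothetical stand-ins, and a real driver would operate
on struct buf and call biodone(9) itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the fields a driver sets on completion. */
struct toy_buf {
	size_t b_bcount;	/* bytes requested */
	size_t b_resid;		/* bytes remaining to transfer */
	int b_error;		/* 0 on success, else an errno value */
	int done;		/* set by toy_biodone() */
};

/* Stand-in for biodone(): just records that completion was signalled. */
static void
toy_biodone(struct toy_buf *bp)
{
	bp->done = 1;
}

/*
 * Finish a transfer in which `transferred' bytes were moved.
 * Sets b_error and b_resid first, then signals completion.
 */
static void
toy_finish(struct toy_buf *bp, size_t transferred, int error)
{
	assert(transferred <= bp->b_bcount);

	bp->b_error = error;
	bp->b_resid = bp->b_bcount - transferred;
	toy_biodone(bp);
}
```

The order matters in this sketch for the same reason it does in a driver:
once completion has been signalled, the submitter may inspect b_error and
b_resid immediately.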
FUNCTIONS
biodone(bp)
Notify that the I/O transfer described by bp has completed.
To be called by a block device driver. Caller must first set
bp->b_error to zero or an error code and bp->b_resid to the number of
bytes remaining to transfer.
biowait(bp)
Wait for the synchronous I/O transfer described by bp to complete.
Returns the value of bp->b_error.
To be called by a user requesting the I/O transfer.
May not be called if bp has a callback or is asynchronous -- that
is, if bp->b_iodone is set, or if B_ASYNC is set in bp->b_flags.
getiobuf(vp, waitok)
Allocate a struct buf for an I/O transfer. If vp is non-NULL, the
transfer is associated with it. If waitok is false, returns NULL
if none can be allocated immediately.
The resulting struct buf pointer must eventually be passed to
putiobuf() to release it. Do not use brelse(9).
The buffer may not be used for an asynchronous I/O transfer,
because there is no way to know when the transfer has completed and
the buffer may safely be passed to putiobuf(). Asynchronous I/O
transfers are allowed only for buffers in the buffercache(9).
May sleep if waitok is true.
putiobuf(bp)
Free bp, which must have been allocated by getiobuf(). Either bp
must never have been submitted to a block device, or the I/O
transfer must have completed.
CODE REFERENCES
The BUFFERIO subsystem is implemented in sys/kern/vfs_bio.c.
SEE ALSO
buffercache(9), bufq(9)
BUGS
The BUFFERIO abstraction provides no way to cancel an I/O transfer once
it has been submitted to a block device.
The BUFFERIO abstraction provides no way to do I/O transfers with
non-kernel pages, e.g. directly to buffers in userland without copying
into the kernel first.
The struct buf type is all mixed up with the buffercache(9).
The BUFFERIO abstraction is a totally idiotic API design.
The v_numoutput accounting required of BUFFERIO callers is asinine.
NetBSD 10.99 September 12, 2019 NetBSD 10.99