Required reading: Chapters 15, 16, and 17
- Programs perform input/output through devices that interact with the real world. An electronic device controller typically controls the device. The controller is connected through a bus to the rest of the computer (processor and memory). By programming the controller, a program on the processor can control the device. Typically the software to control the device is divided into two pieces: the device-independent software and the device-dependent software, the driver.
- Interface issues:
- UNIX devices in general fall in two categories: block devices (e.g., disk) and character devices (e.g., terminal, printer). Block devices break bytes up into blocks. Character devices represent a continuous stream of bytes.
- Names for devices allow users to interact with devices (e.g., mount /dev/rk0 /mnt). For example, /dev/rk0 specifies the i-node for a special file. The i-node contains a major device number and a minor device number. The major device number indexes into bdevsw (block device) or cdevsw (character device). The minor number indexes into an array within that device driver (if there are multiple devices of the same type).
- The driver can access the device either through special memory locations (memory-mapped I/O) or with reserved instructions. Most common today is memory-mapped I/O.
- The device works in parallel with the main processor. Therefore, the processor and controller need a way to coordinate. The two main styles for coordination are polling and interrupts.
- Many computers support DMA (direct-memory access) for larger data transfers between device and memory. This allows a device to transfer data from device to memory (and vice versa) without the CPU's involvement. DMA reduces the load on the CPU and makes better use of the bus.
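The major/minor lookup described above can be sketched as a table of function pointers. This is only an illustration of the idea, not the V6 source: the names `blk_switch`, `blk_table`, `dev_strategy`, and the two stub drivers are hypothetical, and the table only loosely follows the shape of bdevsw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request: which unit (minor number) and which block. */
struct request {
    int minor;
    int blkno;
};

/* One entry per driver, loosely modeled on a row of bdevsw. */
struct blk_switch {
    int (*strategy)(struct request *);
};

/* Stub drivers: each returns a tag so the caller can see which one ran. */
static int rk_strategy(struct request *r) { return 100 + r->minor; }
static int tm_strategy(struct request *r) { return 200 + r->minor; }

static struct blk_switch blk_table[] = {
    { rk_strategy },   /* major 0 (assumed numbering) */
    { tm_strategy },   /* major 1 (assumed numbering) */
};

#define NMAJOR ((int)(sizeof blk_table / sizeof blk_table[0]))

/* The major number selects the driver; the minor number is passed
 * through so the driver can pick one of its own units. */
int dev_strategy(int major, int minor, int blkno)
{
    struct request r = { minor, blkno };

    if (major < 0 || major >= NMAJOR)
        return -1;                      /* no such driver */
    assert(blk_table[major].strategy != NULL);
    return blk_table[major].strategy(&r);
}
```

With this table, `dev_strategy(0, 2, 5)` reaches the first stub with unit 2, while an out-of-range major number is rejected before any driver code runs.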
- Writing software for device drivers is challenging:
- devices are designed by hardware designers
- concurrency within driver (e.g., interrupt handler), typically addressed by disabling and enabling interrupts carefully.
- concurrency between driver and device, typically addressed by ensuring that each register has only one writer.
- potential for deadlock
- address translation
- Buffer Cache. Disk I/O is slow compared to CPU. The fastest I/O is the one you don't do; thus, cache data blocks in main memory. Buffer cache design issues:
- Division of memory between programs and buffer cache.
- Replacement policy. LRU, MRU, etc.
- Write behind or not? Benefit: delays writes, which may provide write absorption (multiple writes to the same block result in one I/O operation). Write behind, unfortunately, opens a window in which a system failure loses data that has not yet reached the disk. Write behind also introduces additional concurrency (e.g., a read request may have to wait until a dirty buffer has been written out).
- Asynchronous operations. Can caller of operations on the buffer cache proceed while the buffer cache is performing the requested operation?
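Write absorption and LRU replacement can be seen in a toy model. The sketch below is an assumption-laden illustration, not V6 code: the names `cbuf`, `cache_write`, `cache_flush`, and `disk_writes` are made up, the cache holds only two blocks, and "the disk" is just a counter.

```c
#include <assert.h>

#define NBUF 2                 /* tiny on purpose, so eviction is easy to see */

struct cbuf {
    int blkno;                 /* block held, or -1 if empty */
    int dirty;                 /* modified but not yet written to disk */
    unsigned long last_use;    /* logical clock of the last access */
};

static struct cbuf cache[NBUF];
static unsigned long now;
int disk_writes;               /* number of simulated disk writes */

void cache_init(void)
{
    for (int i = 0; i < NBUF; i++) {
        cache[i].blkno = -1;
        cache[i].dirty = 0;
        cache[i].last_use = 0;
    }
    now = 0;
    disk_writes = 0;
}

/* Write a dirty buffer back to the simulated disk. */
static void writeback(struct cbuf *b)
{
    if (b->dirty) {
        disk_writes++;
        b->dirty = 0;
    }
}

/* Delayed write: mark the block dirty in the cache and touch the disk
 * only when a dirty victim must be evicted to make room. */
void cache_write(int blkno)
{
    struct cbuf *victim = &cache[0];

    assert(blkno >= 0);
    for (int i = 0; i < NBUF; i++) {
        if (cache[i].blkno == blkno) {      /* hit: the write is absorbed */
            cache[i].dirty = 1;
            cache[i].last_use = ++now;
            return;
        }
        if (cache[i].last_use < victim->last_use)
            victim = &cache[i];             /* remember the LRU entry */
    }
    writeback(victim);                      /* miss: evict the LRU entry */
    victim->blkno = blkno;
    victim->dirty = 1;
    victim->last_use = ++now;
}

/* Force all dirty buffers out, as a periodic sync would. */
void cache_flush(void)
{
    for (int i = 0; i < NBUF; i++)
        writeback(&cache[i]);
}
```

Three calls to `cache_write(7)` followed by `cache_flush()` cost one simulated disk write instead of three. The price is that all three updates are lost if the system fails before the flush.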
V6 code examples
- Device driver: RK. Salient properties: 1,228,800 words, 10 msec track-to-track seek, 85 msec worst case. How does this compare with today's disks? Which improved more, capacity or seek time? (Answer: capacity!)
Driver fits in 3 pages of code.
- Does the PDP-11 use memory mapped I/O? (Yes, device is accessed through the address RKADDR).
- How is the device controlled? Through the registers ds (drive status), er (error), cs (control status), wc (word count), ba (bus address, i.e., the memory address for the transfer), and da (disk address).
- Concurrency control between processor and device controller: if bit 7 is set in cs, the driver is allowed to manipulate the registers; otherwise, the controller has control. The processor indicates that the controller should take control through bit 0 (GO). The driver caches this information in d_active.
- Concurrency control within driver: spl5().
- Polling versus interrupts: bit 6 says whether to issue an interrupt or not on completion of a command.
- Does device use physical addresses or virtual addresses? (Answer: physical addresses.)
- How are block numbers translated to disk location? (Answer: rkaddr).
- Buffer cache. v6 keeps a cache of NBUF buffers, each of which contains one block. The buffer flags include: B_DONE (I/O operation has completed), B_BUSY (in use by a device), B_WANTED (another thread wants this buffer), B_ASYNC (buffer is used in an asynchronous operation), B_DELWRI (don't write buffer until it leaves the free list), and B_ERROR (I/O operation aborted). Are these flags mutually exclusive? (No, multiple can be set at the same time, but many combinations are not possible.)
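The RK register conventions in the bullets above (bit 0 GO, bit 6 interrupt on completion, bit 7 ready) and the block-number translation can be sketched in modern C. This is a simplified model, not the driver itself: the function names are hypothetical, the 12-sectors-per-track geometry and the disk-address layout (drive in the top three bits, track in the middle, sector in the low four bits) are assumptions to check against the source listing, and the caller is assumed to pass already-positioned function bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in the control status register, as described above. */
#define CS_GO       (1u << 0)  /* processor hands control to the controller */
#define CS_IENABLE  (1u << 6)  /* interrupt on completion instead of polling */
#define CS_READY    (1u << 7)  /* controller idle: driver may use registers */

#define SECTORS_PER_TRACK 12   /* assumption: RK05 geometry */

/* The driver may manipulate the registers only while READY is set. */
int rk_driver_may_write(uint16_t cs)
{
    return (cs & CS_READY) != 0;
}

/* Build a command word: the function bits plus GO, and optionally a
 * request for an interrupt when the command completes. */
uint16_t rk_command(uint16_t func_bits, int want_interrupt)
{
    unsigned cs = func_bits | CS_GO;

    if (want_interrupt)
        cs |= CS_IENABLE;
    return (uint16_t)cs;
}

/* Translate (drive, block number) into a disk address word, assuming
 * one block per sector and the layout named in the lead-in. */
uint16_t rk_disk_addr(unsigned drive, unsigned blkno)
{
    assert(drive < 8);                          /* three drive-select bits */
    unsigned track  = blkno / SECTORS_PER_TRACK;
    unsigned sector = blkno % SECTORS_PER_TRACK;
    return (uint16_t)((drive << 13) | (track << 4) | sector);
}
```

Under these assumptions, block 12 is sector 0 of track 1, so consecutive block numbers fill one track before moving to the next.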
Detailed buffer cache notes:
- If only one application is running, can it use all available memory for buffer cache? (No: buffer cache has a fixed size.)
- What is the cache replacement policy? (Answer: LRU. brelse puts buffers at the end of the freelist. getblk grabs them from the beginning of the freelist).
- How do disk blocks stay cached if the buffer is placed on the freelist immediately after it has been used? (When a buffer is put on the freelist, it also stays on the device list; thus getblk will find it there and avoid doing a disk access.)
- Does the buffer cache support write behind? (Yes: writes are optionally delayed, bdwrite versus bwrite.) How long are they delayed? (Until the buffer is needed for another block while it is still dirty (4961), or until a user utility forces them out, which apparently runs once every 30 seconds.)
- Buffer cache also supports asynchronous operations. The caller can continue while the I/O operation is in progress (e.g., see 4820). iodone() releases the buffer when the I/O operation is complete.
- Various forms of concurrency control: (1) no buffers available at all; (2) desired buffer in cache, but in use for an I/O operation (B_BUSY, B_WANTED); (3) dirty buffer needs to be used for something else (B_DELWRI); (4) asynchronous operation in action (B_ASYNC); and (5) operation completed (B_DONE).
- Can bio overwrite a part of a block? If so, must bio read the block in first? (No. A write always overwrites a block completely.)
- Can a thread ever be blocked on a disk write it issued earlier (i.e., is it possible for a thread to issue a second write to the same block that is used by the first write)? (Yes, it can get blocked if it attempts to write a block for which it just issued an asynchronous write.)
- Can you delete 1 line of code from bio.c and cause a deadlock?
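The LRU answer and the "stays on the device list" answer above can be illustrated together. The sketch below is a simplified model under stated assumptions, not bio.c: `get_block`, `release_block`, and `cache_setup` are hypothetical names, a linear scan of a three-entry array stands in for the device list, and the places where the kernel would sleep simply return NULL.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 3

struct buf {
    int dev, blkno;                 /* identity; -1 means never used */
    struct buf *av_forw, *av_back;  /* freelist links; NULL while locked */
};

static struct buf bufs[NBUF];
static struct buf freelist;         /* sentinel of the circular freelist */

static void free_append(struct buf *b)   /* tail = most recently released */
{
    b->av_back = freelist.av_back;
    b->av_forw = &freelist;
    freelist.av_back->av_forw = b;
    freelist.av_back = b;
}

static void free_remove(struct buf *b)
{
    b->av_back->av_forw = b->av_forw;
    b->av_forw->av_back = b->av_back;
    b->av_forw = b->av_back = NULL;
}

void cache_setup(void)
{
    freelist.av_forw = freelist.av_back = &freelist;
    for (int i = 0; i < NBUF; i++) {
        bufs[i].dev = bufs[i].blkno = -1;
        free_append(&bufs[i]);
    }
}

/* Return the buffer for (dev, blkno), locked; *hit reports whether it
 * was already cached. Returns NULL where the kernel would sleep. */
struct buf *get_block(int dev, int blkno, int *hit)
{
    for (int i = 0; i < NBUF; i++) {      /* stands in for the device list */
        struct buf *b = &bufs[i];
        if (b->dev == dev && b->blkno == blkno) {
            if (b->av_forw == NULL)
                return NULL;              /* locked by someone else */
            free_remove(b);               /* cached: no disk access needed */
            *hit = 1;
            return b;
        }
    }
    if (freelist.av_forw == &freelist)
        return NULL;                      /* no free buffer at all */
    struct buf *b = freelist.av_forw;     /* head = least recently released */
    free_remove(b);
    b->dev = dev;
    b->blkno = blkno;
    *hit = 0;
    return b;
}

/* Releasing a buffer only unlocks it; its identity stays findable. */
void release_block(struct buf *b)
{
    free_append(b);
}
```

Taking victims from the head and releasing to the tail gives LRU order. Because a released buffer keeps its (dev, blkno) identity, a later request for the same block is a hit until the buffer is recycled.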
B_READ - marks i/o read request
B_DONE - i/o has finished
B_ERROR - i/o caused error
B_BUSY - locked (not available)
B_WANTED - issue wakeup when !BUSY or DONE changes
B_ASYNC - don't wait for i/o completion
B_DELWRI - write before reusing block
b_forw b_back - device list
av_forw av_back - avail list
b_dev - device owning this block
b_wcount - i/o params
b_error - i/o error reason (not used)
b_resid - how much i/o happened
binit - put all blocks on free list
set up circular linked list on block device tabs
bread - read a block from disk into a buffer
how do errors get propagated?
- any read-only operations can just check for error on exit
- iget checks B_ERROR.
return locked block
breada - read the block, and also start i/o on the next block (read-ahead)
return locked block, do not lock read-ahead block
(why can't someone get the read-ahead block before it's done?
the block is locked, and unlocked by iodone)
bwrite - write the block
handed a locked block, eventually unlocks it
bdwrite - release with delayed write
mark the block to be written later
unlock it but leave it in memory
why bawrite for tmtab?
bawrite - start async write of block
how do we know it's okay to call the strategy?
(strategy queues the blocks)
brelse - put the buffer back on the free list
put b on freelist
wakeup(freelist) if needed
getblk - find a block for dev, blkno already in cache
return locked block
why is it safe to call notavail? what if it's already notavail
and the pointer is being used for something else?
iodone - finish with block after i/o finishes
b->flags |= B_DONE
if async write, release the block (brelse); otherwise wake up the waiter
iowait - wait for i/o to finish
waiting for B_DONE
geterror - set EIO with B_ERROR
bflush - flush all the write-behind blocks
bawrite on every block
doesn't actually wait for blocks to finish writing
devstart - generic DEC controller device i/o start
physio - start raw i/o
rkstrategy - add block to queue, kick disk driver (rkstart)
rkintr - note block finished
locking protocol -
  lock:
    while(b->flags & B_BUSY) {
      b->flags =| B_WANTED;
      sleep(b)
    }
    b->flags =| B_BUSY
  unlock:
    if(b->flags & B_WANTED)
      wakeup(b)
    b->flags =& ~B_BUSY
routines return locked blocks
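The locking protocol above uses the early C operators `=|` and `=&`, which modern C spells `|=` and `&=`. A minimal single-threaded sketch in modern C follows. It is an illustration under assumptions, not the kernel code: `buf_trylock` and `buf_unlock` are hypothetical names, the flag values are assumed, the point where the kernel would sleep is modeled by returning 0, and wakeup is modeled by a counter.

```c
#include <assert.h>

#define B_BUSY   010    /* assumed value: buffer is locked */
#define B_WANTED 0100   /* assumed value: someone is waiting for it */

struct buf {
    int b_flags;
    int b_blkno;
};

int wakeups;            /* counts the calls the kernel would make to wakeup() */

/* Try to lock b. Returns 1 on success. Returns 0 where the kernel
 * would sleep, after recording that a waiter exists. */
int buf_trylock(struct buf *b)
{
    if (b->b_flags & B_BUSY) {
        b->b_flags |= B_WANTED;   /* ask the holder to wake us */
        return 0;                 /* kernel: sleep(b), then retry */
    }
    b->b_flags |= B_BUSY;
    return 1;
}

/* Unlock b, waking any waiter that set B_WANTED in the meantime. */
void buf_unlock(struct buf *b)
{
    assert(b->b_flags & B_BUSY);  /* only a locked buffer can be released */
    if (b->b_flags & B_WANTED)
        wakeups++;                /* kernel: wakeup(b) */
    b->b_flags &= ~(B_BUSY | B_WANTED);
}
```

In the kernel the test of B_BUSY and the setting of B_WANTED must not be interrupted by the handler that clears them, which is why the driver-side code raises the processor priority around such sequences. This sketch has no interrupts and therefore omits that step.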