i-nodes
A Unix file is described by an information block called an i-node. There is an i-node on disc for every file on the disc, and there is also a copy in kernel memory for every open file. All the information about a file, other than its name, is stored in the i-node. This information includes
The layout of the components of a System V disc i-node is shown in the following diagram. It occupies sixteen 32-bit words (64 bytes).
There are 13 physical block addresses in an i-node, each 3 bytes long. The first ten block addresses refer directly to data blocks, the next refers to a first level index block (which holds the addresses of further data blocks), the next refers to a second level index block (which holds the addresses of further first level index blocks) and the last refers to a third level index block (which holds the addresses of further second level index blocks). All physical addresses associated with a file are implicitly assumed to reside on the same disc; there is no facility whereby a file could span more than one disc. There is no requirement that the physical blocks of a file should be contiguous (i.e. adjacent), and with multiple files being handled on a disc it is unlikely that contiguity would offer any advantages for performance. There is also, more surprisingly, no requirement that all logical blocks should map to physical blocks; it is quite permissible for files to have "holes", and this is quite likely to happen with large sparsely populated direct access files.
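The way a logical block number is resolved through the 13 addresses can be sketched as follows. This is only an illustration of the scheme described above, not the kernel's actual code. The names `locate_block`, `NDIRECT` and `addrs_per_index` are hypothetical, and the default of 170 addresses per index block assumes 512-byte blocks and 3-byte addresses.

```python
NDIRECT = 10  # direct block addresses held in the i-node (from the text)


def locate_block(logical_block, addrs_per_index=170):
    """Return (level, indices) for a logical block number.

    level is 0 for a direct block, 1, 2 or 3 for a block reached
    through the first, second or third level index block.
    indices lists the slot followed at each step, starting with the
    slot in the i-node's own 13-entry address array.
    """
    if logical_block < 0:
        raise ValueError("logical block number must be non-negative")
    if logical_block < NDIRECT:
        return 0, [logical_block]

    n = logical_block - NDIRECT
    span = addrs_per_index  # data blocks reachable at the current level
    for level in (1, 2, 3):
        if n < span:
            # Slots 10, 11 and 12 of the i-node hold the three index blocks.
            indices = [NDIRECT + level - 1]
            # Peel off one index per level, outermost index block first.
            for depth in range(level - 1, -1, -1):
                unit = addrs_per_index ** depth
                indices.append(n // unit)
                n %= unit
            return level, indices
        n -= span
        span *= addrs_per_index
    raise ValueError("logical block beyond the third level index range")
```

For example, `locate_block(9)` is the last direct block, `locate_block(10)` is the first block reached through the first level index block, and `locate_block(180)` is the first block that needs the second level index block. A slot that holds no address at any step would correspond to a "hole" in the file.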
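Assuming 512-byte blocks and 3 bytes per address, 2^24 blocks can be addressed, which is equivalent to a disc capacity of about 8 GByte. An index block of 512 bytes is capable of holding 170 3-byte addresses. The size of the largest file can be calculated thus.
Direct blocks: 10 blocks
First level index: 170 blocks
Second level index: 170 x 170 = 28900 blocks
Third level index: 170 x 170 x 170 = 4913000 blocks
Total: 4942080 blocks of 512 bytes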
The total addressable space comes to 2530344960 bytes (approximately 2.5 Gbytes).
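The arithmetic can be checked with a short sketch. The function names and parameters here are illustrative only; the defaults are the figures used in the text (512-byte blocks, 3-byte addresses, ten direct addresses).

```python
def addrs_per_index_block(block_size=512, addr_size=3):
    """Whole addresses that fit in one index block."""
    return block_size // addr_size


def max_file_bytes(block_size=512, addr_size=3, ndirect=10):
    """Largest file reachable through direct and three index levels."""
    per_index = addrs_per_index_block(block_size, addr_size)
    blocks = ndirect + per_index + per_index ** 2 + per_index ** 3
    return blocks * block_size


def disc_capacity_bytes(block_size=512, addr_size=3):
    """Space addressable by a single block address of addr_size bytes."""
    return (2 ** (8 * addr_size)) * block_size


if __name__ == "__main__":
    print(addrs_per_index_block())  # addresses per index block
    print(max_file_bytes())         # largest file in bytes
    print(disc_capacity_bytes())    # addressable disc space in bytes
```

With the defaults, the largest file is smaller than the addressable disc space, so the index structure rather than the address width sets the file size limit.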
BSD and other more recent versions of Unix use a larger disc i-node format that consists of thirty-two 4-byte words (128 bytes). The block addresses now occupy 4 bytes rather than 3, and various other fields are larger, as will be seen from the diagram below.
The extensions include space for 32-bit user and group ids and an i-node generation number that is incremented when a free i-node is used for a different file. The generation number is used by the network file system for file handle calculation. It should also be noted that the time fields have expanded to 64 bits so that the year 2038 problem (when the Unix standard time format wraps) can be avoided.
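The moment at which a 32-bit time field runs out can be worked out directly. This sketch assumes the traditional representation, a signed 32-bit count of seconds since 00:00:00 UTC on 1 January 1970; the names `EPOCH` and `last_32bit_time` are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Start of Unix time.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_32bit_time():
    """Latest moment a signed 32-bit seconds counter can represent."""
    return EPOCH + timedelta(seconds=2 ** 31 - 1)


if __name__ == "__main__":
    print(last_32bit_time().isoformat())  # 2038-01-19T03:14:07+00:00
```

One second after this moment a signed 32-bit counter overflows, which is why the larger time fields matter.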
Once a file has been opened the in-memory (the phrase "in-core" is traditionally used) version of the i-node contains significant extra information. This extra information includes