The wave file format is a widely supported format for storing digital audio. A wave file uses the Resource Interchange File Format (RIFF) file structure and hence data is organized in chunks as described below. Each chunk contains information about its type and size and can easily be skipped by software that does not understand the specific chunk type.
A wave file is organized as follows.
Byte sequence description | Length in bytes | Starts at byte | Value |
chunk ID | 4 | 0x00 | The ASCII character string "RIFF" |
size | 4 | 0x04 | The size of the wave file in bytes, less 8 (the 4-byte "chunk ID" and the 4-byte "size" fields are not counted) |
RIFF type ID | 4 | 0x08 | The ASCII character string "WAVE" |
wave chunks | various | 0x0C | Various chunks in the wave file as described below |
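The layout in the table above can be checked with a few lines of Python. This is only a minimal sketch: the function names `build_riff_header` and `parse_riff_header` are made up for illustration, and the input is assumed to be the raw bytes of a file already read into memory.

```python
import struct

def build_riff_header(payload_len: int) -> bytes:
    """Build the 12-byte header for a wave file whose chunks total payload_len bytes."""
    # size = file size - 8 = (12 + payload_len) - 8 = 4 + payload_len
    return struct.pack("<4sI4s", b"RIFF", 4 + payload_len, b"WAVE")

def parse_riff_header(data: bytes) -> int:
    """Validate the first 12 bytes and return the value of the "size" field."""
    if len(data) < 12:
        raise ValueError("too short to hold a RIFF header")
    # "<" selects little-endian with no padding: 4 bytes, 4-byte integer, 4 bytes
    chunk_id, size, riff_type = struct.unpack("<4sI4s", data[:12])
    if chunk_id != b"RIFF":
        raise ValueError("chunk ID is not 'RIFF'")
    if riff_type != b"WAVE":
        raise ValueError("RIFF type ID is not 'WAVE'")
    return size
```

For a well-formed file, the returned size plus 8 should equal the total length of the file.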
Endianism
All information is stored with the least significant byte first (little-endian). For example, if the 4-byte value for the average bytes per second in the format chunk is 88,200 = 0x00015888, this information will be stored with the following byte sequence.
0x88 0x58 0x01 0x00
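The same byte sequence can be reproduced with Python's `struct` module, where the `<` prefix requests little-endian order. The variable names below are illustrative only.

```python
import struct

avg_bytes_per_sec = 88200  # 0x00015888

# "<I" = little-endian, unsigned 32-bit integer
encoded = struct.pack("<I", avg_bytes_per_sec)
print(" ".join(f"0x{b:02X}" for b in encoded))

# Reading the value back uses the same format string
decoded = struct.unpack("<I", encoded)[0]
```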
Word alignment
Every chunk in a wave file must be word aligned (i.e., it must start at an even byte offset). If the data of a chunk has an odd number of bytes, it is padded with one zero byte, although this pad byte is not counted in the size of the chunk.
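A short sketch of how a writer might apply this padding rule follows. It assumes the usual RIFF chunk header of a 4-byte ID followed by a 4-byte little-endian size, which this section does not spell out, and `encode_chunk` is a hypothetical helper name.

```python
import struct

def encode_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Serialize one chunk: 4-byte ID, 4-byte size, payload, optional pad byte."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    # The size field counts the payload only, never the pad byte
    header = struct.pack("<4sI", chunk_id, len(payload))
    pad = b"\x00" if len(payload) % 2 else b""
    return header + payload + pad
```

A 3-byte payload therefore occupies 12 bytes in the file (8 header + 3 data + 1 pad), while its size field still reads 3.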
Wave chunks
A wave file includes at least the following chunks.
Format chunk
Data chunk
Other chunks that may exist in a wave file include the following.
Silent chunk
Wave list chunk
Fact chunk
Cue chunk
Playlist chunk
List chunk
Sample chunk
Instrument chunk
Other RIFF chunks
Since the RIFF format is used for other types of files, such as AVI files, a RIFF file can contain types of chunks that are not relevant to the wave file format. For example, the junk and pad chunks hold filler data whose content is meaningless; they are used, for instance, to align the following chunks on a 2K boundary. A software application does not have to recognize or use all chunk types and can skip any chunk it does not understand.
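Skipping unrecognized chunks is straightforward because every chunk carries its own size. The sketch below is one possible reader, not a reference implementation. It assumes the standard RIFF chunk header (4-byte ID, then 4-byte little-endian size) and the conventional IDs "fmt " and "data" for the format and data chunks. The names `KNOWN_CHUNKS` and `read_known_chunks` are hypothetical.

```python
import struct

# Chunk IDs this hypothetical reader understands (assumed conventional IDs)
KNOWN_CHUNKS = {b"fmt ", b"data"}

def read_known_chunks(data: bytes, known=KNOWN_CHUNKS) -> dict:
    """Return {chunk_id: payload} for known chunks, skipping all others."""
    chunks = {}
    offset = 12  # skip "RIFF", size, and "WAVE"
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        payload_start = offset + 8
        if chunk_id in known:
            chunks[chunk_id] = data[payload_start:payload_start + size]
        # Advance past the payload and the pad byte that follows an odd size
        offset = payload_start + size + (size & 1)
    return chunks
```

A junk chunk placed before the data chunk is stepped over without the reader needing to know anything about its contents.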
Comments
Not really little-endian
Uh, the format where the least significant byte comes first is called BIG endian.
Because the end of it has the biggest value. And the little byte is the beginning.
Like Intel x86 is a big-endian type of deal, which is easy to double-check.
Opposite
I know this is weird, but little end (little bytes) first is "little-endian" and big end (big bytes) first is "big-endian". I don't know who created this naming convention, but it is what it is...
Endianness
The real problem with endianness is that we simply cannot agree on the order of the ends. When writing numbers in decimal, like one-hundred-twenty-three (123), we use big-endian. It's called big-endian because the first thing that you read is of the biggest significance. You could argue that the digit three (3) is at "the end", but the digit one (1) is also at "the end", it just happens to be the other one. You could say that the naming convention of endianness suffers from endianness.