The B Programming Language 

 

Overview

B is a computer language designed in 1970 and directly descended from BCPL. B is well suited to recursive, non-numeric, machine-independent applications, such as system and language work.

B is a simple, typeless procedural language. A typeless language can be thought of as having a single data type, the `word,' or `cell,' a fixed-length bit pattern. This means that the compiler does not keep track of whether variables refer to integers, characters, octal numbers, and so on. You can subtract the letter ‘a’ from 2.0 without getting an error. B can be thought of as C without types. Typelessness gives the programmer a lot of freedom, but the programmer must make sure that the operations they ask B to perform make sense.
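The single-word model can be imitated in C, which compiles today where B does not. The sketch below is only an approximation: the `word` typedef and all function names are illustrative assumptions, and C still has types, so everything is funneled through one integer type to mimic B's behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for B's single data type: one machine word. */
typedef intptr_t word;

/* The same word can be read as a plain number... */
word add_words(word a, word b) { return a + b; }

/* ...as a character code (the result assumes an ASCII-like character set)... */
word letter_offset(word ch) { return ch - 'a'; }

/* ...or as the address of another word. Nothing but the programmer's
   discipline decides which reading is meaningful. */
word load(word address) {
    assert(address != 0);
    return *(word *)address;
}

/* Stores a value in a cell, keeps the cell's address in a word,
   and reads the value back through that word. */
word store_then_load(word value) {
    word cell = value;
    word address = (word)&cell;
    return load(address);
}
```

Calling `add_words('a', 1)` treats a character as a number without complaint, which is the freedom, and the risk, that the paragraph above describes.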

In B, an identifier is formed from the characters a-z, A-Z, 0-9, the underscore (_), and the dot (.). The first character cannot be a digit. Identifiers can be long, but only the first eight characters are significant, so the compiler treats function1 and function3 as the same name. When a program is compiled, B ignores case distinctions in external names. Except in its earliest versions, B supports separate compilation and provides a means for including text from named files. Storage limitations on the B compiler demanded a one-pass technique in which output was generated as soon as possible, and the syntactic redesign that made this possible was carried forward into C.
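The eight-character rule can be illustrated with a short C sketch. The function name `same_identifier` and the constant `SIGNIFICANT_CHARS` are hypothetical; the sketch only models the comparison the text describes and is not taken from any B compiler.

```c
#include <assert.h>
#include <string.h>

/* Assumption taken from the text: only this many leading characters count. */
#define SIGNIFICANT_CHARS 8

/* Returns 1 if a compiler honoring the eight-character limit would
   treat the two names as the same identifier, and 0 otherwise. */
int same_identifier(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strncmp(a, b, SIGNIFICANT_CHARS) == 0;
}
```

Under this model `same_identifier("function1", "function3")` returns 1, because the two names differ only in their ninth character, while `same_identifier("count", "counter")` returns 0.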

B allows the programmer to define octal, decimal, floating point, ASCII character, and BCD character and string constants. All but the last of these are stored internally in a single machine word. A machine word is 36 bits. External variables are the only form of “global” variables in B.

A program written in B can contain three kinds of components:

 
Manifest constant definitions;
External variable definitions; and
Function body definitions.

After B was working, Ken Thompson rewrote it in itself (a bootstrapping step). During development, he continually struggled against memory limitations: each language addition inflated the compiler so it could barely fit, but each rewrite taking advantage of the feature reduced its size. For example, B introduced generalized assignment operators, using x=+y to add y to x. The notation came from Algol 68. (In B and early C, the operator was spelled =+ instead of +=; this mistake, repaired in 1976, was induced by a seductively easy way of handling the first form in B's lexical analyzer.)
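One way to see why the =+ spelling was troublesome is to look at how the same characters read under each convention. The sketch below is modern C with hypothetical function names; the "old reading" is written out explicitly with today's operator, since no current compiler accepts the original spelling with its original meaning.

```c
/* Modern spelling of the generalized assignment: add y to x. */
int add_assign(int x, int y) {
    x += y;            /* B and early C spelled this x =+ y */
    return x;
}

/* In today's C, "x=-1" parses as x = (-1). */
int modern_reading(int x) {
    x=-1;
    return x;
}

/* Under the old spelling the same characters could be taken as
   "x =- 1", that is, subtract 1 from x. Written unambiguously: */
int old_reading(int x) {
    x -= 1;
    return x;
}
```

Starting from 10, `modern_reading` yields -1 while `old_reading` yields 9, so the meaning of `x=-1` depended on spacing the programmer might not have thought about.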

Thompson went a step further by inventing the ++ and -- operators, which increment or decrement; their prefix or postfix position determines whether the alteration occurs before or after noting the value of the operand. They were not in the earliest versions of B, but appeared along the way. The PDP-7's auto-increment memory cells were not used directly in the implementation of the operators; a stronger motivation for the innovation was probably his observation that the translation of ++x was smaller than that of x=x+1.
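The difference between the prefix and postfix forms is easiest to see side by side. This is a minimal C sketch with hypothetical helper names; C inherited these operators, so the behavior shown is C's, assumed here to match the description above.

```c
/* Prefix: the operand is incremented first, then its value is noted.
   The updated operand is reported through *after. */
int note_prefix(int x, int *after) {
    int noted = ++x;
    *after = x;
    return noted;
}

/* Postfix: the value is noted first, then the operand is incremented. */
int note_postfix(int x, int *after) {
    int noted = x++;
    *after = x;
    return noted;
}
```

Starting from 5, both helpers leave the operand at 6, but `note_prefix` returns 6 and `note_postfix` returns 5.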

The B compiler on the PDP-7 did not generate machine instructions, but instead `threaded code' [Bell 72], an interpretive scheme in which the compiler's output consists of a sequence of addresses of code fragments that perform the elementary operations. On the PDP-7 Unix system, only a few things were written in B except B itself, because the machine was too small and too slow to do more than experiment; rewriting the operating system and the utilities wholly into B was too expensive a step to seem feasible. At some point Thompson relieved the address-space crunch by offering a `virtual B' compiler that allowed the interpreted program to occupy more than 8K bytes by paging the code and data within the interpreter, but it was too slow to be practical for the common utilities. Still, some utilities written in B appeared, including an early version of the variable-precision calculator dc familiar to Unix users.
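The idea of threaded code, compiler output that is a sequence of addresses of code fragments, can be sketched in C with function pointers. Everything here is an illustrative assumption (the `vm` structure, the `cell` union, the four fragments, and the stack size); it is not a reconstruction of the PDP-7 B implementation. It also uses a central dispatch loop for simplicity, whereas a real threaded-code system may have each fragment transfer control to the next one directly.

```c
#include <assert.h>

#define STACK_WORDS 16

typedef struct vm vm;
typedef void (*fragment)(vm *);

/* One slot of "compiler output": usually the address of a fragment,
   sometimes an inline operand for the fragment just before it. */
typedef union cell {
    fragment code;
    long value;
} cell;

/* Hypothetical interpreter state. */
struct vm {
    long stack[STACK_WORDS];
    int depth;
    const cell *pc;      /* next slot of the threaded program */
    int running;
};

static void push(vm *m, long v) {
    assert(m->depth < STACK_WORDS);
    m->stack[m->depth++] = v;
}

static long pop(vm *m) {
    assert(m->depth > 0);
    return m->stack[--m->depth];
}

/* The code fragments that perform the elementary operations. */
static void op_literal(vm *m)  { push(m, (m->pc++)->value); }
static void op_add(vm *m)      { long b = pop(m); long a = pop(m); push(m, a + b); }
static void op_multiply(vm *m) { long b = pop(m); long a = pop(m); push(m, a * b); }
static void op_halt(vm *m)     { m->running = 0; }

/* The interpreter: fetch the next address and call the fragment there. */
long run(const cell *program) {
    vm m = { {0}, 0, program, 1 };
    while (m.running) {
        fragment f = (m.pc++)->code;
        f(&m);
    }
    return pop(&m);
}

/* What a compiler of this kind might emit for (2 + 3) * 4. */
long example(void) {
    const cell program[] = {
        { op_literal }, { .value = 2 },
        { op_literal }, { .value = 3 },
        { op_add },
        { op_literal }, { .value = 4 },
        { op_multiply },
        { op_halt },
    };
    return run(program);
}
```

The "program" contains no machine instructions for the expression itself, only addresses and operands, which keeps the compiler small at the cost of an indirect call for every elementary operation.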

By 1971, people wanted to create interesting software more easily, and writing in assembler was dreary enough that B, despite its performance problems, had been supplemented by a small library of useful service routines and was being used for more and more new programs.

In conclusion, B led directly to the development of C. The roles of the direct contributors to today's C language can be summarized compactly. Ken Thompson created the B language in 1969-70, deriving it directly from Martin Richards's BCPL. Dennis Ritchie turned B into C during 1971-73, keeping most of B's syntax while adding types and many other changes, and wrote the first compiler.