Traditional Rank-Size Relationships
We have seen that the rank-size distribution can be described by
the mathematical formula with which most geographers are familiar
r_i (p_i)^q = K
where q and K are constants, r_i is the rank of the ith city, and p_i is the
population of that city (Berry and Garrison 1958). The simple or
restrictive rank-size relationship, as we saw, assumes the value
of q to be 1 and the value of K to be the population of the largest city
(P), and is frequently expressed in terms of the value of p_i (Carroll
1982):
p_i = P / r_i.
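The two forms of the relationship can be illustrated with a short Python sketch. It simply solves r_i (p_i)^q = K for the population; the function name and the largest-city population of 1,000,000 are assumptions made for illustration and do not come from any data set discussed here.

```python
def rank_size_population(rank, K, q=1.0):
    """Population predicted for a city of the given rank.

    Solves rank * p**q = K for p. With q = 1 and K equal to the
    population of the largest city, this is the simple rank-size rule.
    """
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return (K / rank) ** (1.0 / q)


# Hypothetical largest-city population, chosen only for illustration.
largest_city = 1_000_000

for rank in (1, 2, 10, 100):
    predicted = rank_size_population(rank, K=largest_city)
    print(f"rank {rank:>3}: predicted population {predicted:,.0f}")
```

With q = 1 the city of rank 2 is predicted to have half the population of the largest city, the city of rank 10 one tenth, and so on. Values of q other than 1 steepen or flatten that decline.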
The debate about the nature of the urban rank size relationship
continues for a variety of reasons. First, an epistemological concern: a
mathematical description of a phenomenon, regardless of the precision
with which the formula describes the data, is just a description, not an
explanation. Several researchers thus take care to point out that the
labeling of the rank-size relationship as a law or a rule is incorrect
(Chorley and Haggett 1967; Stewart 1958, 222, 244). Szymanski and Agnew
(1981) disparage the rank size relationship as one of three examples of
poor scientific study in geography. They argue that a statistical
relationship, found only occasionally, and without theoretical
foundation, has been elevated to the status of law. Secondly, there are
concerns about the nature of the "straight line" itself. These concerns
arise from several considerations. First is a concern for the margin of
error that a graphical description permits. Log-log graph paper affords
wide latitude in plotting data. Even when the plotted data exhibit a
"good fit", it is striking how far off the mark are predictions of the
ranks of specific urban areas. Anyone who has had the experience of
plotting a rank size curve knows how slight a difference even several
thousands of population can make in the placement of a dot on the graph
in the high and middle ranges of many sets of urban data. Rapoport
(1978, 847) reminds us that some monotonically decreasing curve will
describe just about any group of objects arranged according to size.
Secondly, the existence of a straight line on a graph also depends, to an
extent, on the portion of the urban distribution selected for analysis.
The lowest levels of the urban hierarchy frequently deviate markedly from
the projected straight line (and the mathematical formula) and thus are
frequently omitted from analysis because attention is given to the
largest urban areas. The largest cities, as well, usually deviate
significantly from the values predicted by the formula or the straight
line on a graph. This is the phenomenon known as the primate city.
Attempts to accommodate the lower portion of the curve often displace the
upper portion and vice versa. Sahal (1981, 294) has discussed this as a
problem of results being sensitive to the origin of the independent
variable. Applied to the simple rank-size rule this means, for example,
that instead of attempting to predict the size of the 100th largest city
by dividing the population of the largest city by 100, we could, with
equal logic, but with different results, predict the size of the largest
city by multiplying the population of the city of rank 100 by 100. This
follows, of course, from equation (7) above in which the product of these
two factors is a constant. Rosing (1966) relates how Zipf accomplished a
similar end by determining the population of the largest city (New York),
not by census data, but by the computation of the y intercept of a
regression line through the ranking of the 100 largest cities on double
log paper.
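The sensitivity to the choice of origin can be made concrete with a small numerical sketch in Python. The populations below are invented for illustration (a primate city of 2,000,000 followed by cities that fall exactly on p = 500,000 / rank), and fit_loglog is a hypothetical helper. It is an ordinary least-squares fit on log-transformed values, offered as an analogue of the regression approach Rosing describes, not a reconstruction of Zipf's actual computation.

```python
import math


def fit_loglog(populations):
    """Least-squares fit of log10(population) = a + b * log10(rank).

    Returns (a, b). Because log10(1) = 0, 10**a is the population the
    fitted line assigns to the city of rank 1.
    """
    pops = sorted(populations, reverse=True)
    xs = [math.log10(rank) for rank in range(1, len(pops) + 1)]
    ys = [math.log10(p) for p in pops]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Invented data: a primate city, then 99 cities on p = 500,000 / rank.
pops = [2_000_000.0] + [500_000 / rank for rank in range(2, 101)]

# Simple rule anchored at the top: predict rank 100 from the largest city.
from_top = pops[0] / 100
# Simple rule anchored at the bottom: predict the largest city from rank 100.
from_bottom = pops[99] * 100
# Regression anchor: the largest city implied by the fitted intercept.
a, b = fit_loglog(pops)
fitted_largest = 10 ** a

print(f"actual rank 100: {pops[99]:,.0f}   predicted from top: {from_top:,.0f}")
print(f"actual largest:  {pops[0]:,.0f}   predicted from bottom: {from_bottom:,.0f}")
print(f"largest city implied by regression intercept: {fitted_largest:,.0f}")
```

In this invented case the three anchors give three different answers for the same distribution. The rule anchored on the primate city overstates the 100th city fourfold, the rule anchored on the 100th city understates the largest city by the same factor, and the regression intercept falls between the two.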
There is evidence that a full urban distribution may be "S" shaped,
reflecting a growth or logistic curve (Stewart 1958, 245), or "J"
shaped, and that the linear orientation of the data exists only in
portions of the distribution. Parr and Jones (1983, 284-85), for example,
describe the rank-size distribution as a lognormal distribution if
truncated at a sufficiently high level. Carroll (1979) offers the
greatest clarity on these issues. He points out that the rank size
distribution is more properly classified as a relation between two
variables than as a probability distribution. The rank-size distribution
can be derived from three members of the class of skew probability
distributions. All of these probability distributions (the lognormal,
the Pareto, and the Yule) are J shaped and highly skewed in the upper
tail. Each of the distributions is unique, although the upper tails are
quite similar; therefore, if the examination of a set of urban data
excludes the smaller urban centers, any or all of the three distributions
might apply. Carroll states:
We have seen that the law of proportionate effect results in the
lognormal distribution. This law with a lower threshold results in the
Pareto. And, this law with a lower threshold at which new units enter
in a constant rate gives the Yule distribution.
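The first of these statements can be explored with a brief simulation in Python. The sketch assumes that the law of proportionate effect means each city grows in every period by a random percentage that does not depend on its current size. The number of cities, the number of periods, the range of growth rates, and the function names are all illustrative assumptions. Only the lognormal case is simulated; the threshold variants that lead to the Pareto and Yule distributions are not attempted.

```python
import math
import random
import statistics


def simulate_proportionate_growth(n_cities=2000, n_steps=100,
                                  max_rate=0.15, seed=0):
    """Grow identical cities by size-independent random percentages.

    Every city starts at 1000 and, in each period, is multiplied by
    1 + u, where u is drawn uniformly from [-max_rate, max_rate].
    """
    rng = random.Random(seed)
    sizes = [1000.0] * n_cities
    for _ in range(n_steps):
        sizes = [s * (1.0 + rng.uniform(-max_rate, max_rate)) for s in sizes]
    return sizes


def skewness(values):
    """Population skewness: the mean cubed standardized deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


sizes = simulate_proportionate_growth()
log_sizes = [math.log(s) for s in sizes]

print("skewness of city sizes:    ", round(skewness(sizes), 2))
print("skewness of log city sizes:", round(skewness(log_sizes), 2))
```

The simulated sizes should be strongly skewed to the right while their logarithms are roughly symmetric, which is the signature of an approximately lognormal distribution. The logarithm of each city's size is a sum of many independent increments, so it tends toward a normal distribution.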
Nader (1984) suggests that a non-logarithmic approach to rank-size data
yields results superior to those of a logarithmic model. Perhaps such
nonlinear distributions are polymodal and reflect mixtures of two or more
rank size distributions, as has been reported for some distributions in
geology (Krumbein and Graybill 1965, 108, 126). Another perspective on this
issue is that of Sahal (1981), who states quite simply that it is a
general characteristic of simple laws that they do not hold over the
entire range of variables. Clearly the rank-size relationship is still
open to interpretation, and it is the intent of this research paper
to suggest yet another method which can generate a distribution similar
to the rank-size rule.