Tim O'Connor wrote:

Larry, I suspect Tim would have recommended computing the standard deviation of this data set, in order to see how closely these observed differences from the mean values match the expected standard deviation. One does not "expect" all values in a sample data set from a large population to match the mean values, but one does expect that the mean of the deviations is predictable. If it is, then you can make a case that your data set is a good representation of a hypothetical sample. If not, then the data set may be skewed, or your expected sample may be incorrect (i.e. Tim's theory of distribution may be incorrect).

I believe that the random variables in this situation are the observed boxcar counts for each railroad. The actual boxcar counts for US railroads are not random variables. A single set of data, like the numbers I posted for the Charles collection, has only one value for each random variable, so a standard deviation cannot be computed.

On the other hand, if all of the trains in all of the conductors' books owned by list members were treated as separate observations, then standard deviations could be computed for the number of boxcars observed for each railroad. The data in the conductors' books could also be aggregated for each railroad.
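
To make the distinction concrete, here is a rough Python sketch. The counts and the three railroad names are made-up placeholders, not data from any conductor's book, and the function names are only for illustration. Each train is one observation; with a single observation the sample standard deviation is undefined, while several trains give one standard deviation per railroad.

```python
import statistics

# Hypothetical boxcar counts per train, keyed by railroad.
# These numbers are invented for illustration only.
trains = [
    {"PRR": 4, "NYC": 3, "B&O": 1},
    {"PRR": 6, "NYC": 2, "B&O": 2},
    {"PRR": 5, "NYC": 4, "B&O": 0},
]

def per_railroad_stats(observations):
    """Return {railroad: (mean, sample standard deviation or None)}."""
    railroads = sorted({rr for obs in observations for rr in obs})
    stats = {}
    for rr in railroads:
        # A railroad missing from a train counts as zero boxcars.
        counts = [obs.get(rr, 0) for obs in observations]
        mean = statistics.mean(counts)
        # The sample standard deviation needs at least two observations.
        sd = statistics.stdev(counts) if len(counts) >= 2 else None
        stats[rr] = (mean, sd)
    return stats

def aggregate(observations):
    """Sum the boxcar counts for each railroad across all trains."""
    totals = {}
    for obs in observations:
        for rr, n in obs.items():
            totals[rr] = totals.get(rr, 0) + n
    return totals

print(per_railroad_stats(trains[:1]))  # one train: every deviation is None
print(per_railroad_stats(trains))      # several trains: deviations are defined
print(aggregate(trains))               # totals per railroad
```

The aggregated totals correspond to a single data set like the one I posted, which is why they carry no standard deviation of their own.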

Larry Kline

Pittsburgh, PA