freight car distribution - rejecting the equal distribution hypothesis.


Malcolm Laughlin <mlaughlinnyc@...>
 

It has been stated that the burden of proof is on me that box cars
were not distributed on all railroads in proportion to ownership.
The burden of proof notion is not applicable here because it is not
possible to "prove" either side of the argument, which could be done
only if car accounting records from the period were available.
Absent any method of proof, we have to fall back on logic based on
known practices and traffic flows at the time.

Here are some variables that work against proportional distribution
across all railroads.

- Car service rules. While we know that car service rules were not
always observed, we also know that many railroads did observe them
most of the time. This greatly reduces the likelihood of an FEC car
in Oregon or a WP car in Maine.

- Length of haul. I believe that the average length of haul of a
carload was in the area of 300 to 500 miles. This was for all car
types and probably unknowable by specific type. Given our known
bias, of undeterminable magnitude, toward loading in home road cars,
the probability of seeing any mark on any railroad is biased downward
by its distance from the owner.

- Seasonal tides – the grain harvest for example. Nationwide there
was a fixed fleet of cars usable for grain loading. The big Midwest
harvest began in Oklahoma and nearby areas and over the course of a
few months moved north to the Dakotas. The car fleet moved north
also, quite inconsistent with the proportional distribution
hypothesis. The AAR-issued movement orders to get cars from the east
to grain loading areas were another source of bias.

- Surpluses and shortages. They varied widely by season, region and
type of load. The largest 40 ft. narrow door box car shortage crises
were in the best grain crop years. When the shortage was over, many
cars went home to rest for a few months. A peak grain harvest in a
time of business recession would really skew the distribution.

- Suitability for loading. Some cars were more suited for paper
loading, others for grain, BCK cars for flour, etc. Locations of
large users of newsprint strongly biased the destination areas of
cars with marks of the paper loading railroads. Railroads serving
lumber producers had a much higher proportion of cars dimensionally
suitable for lumber. The multitude of such situations caused many
pockets of ownership concentration.

- Cars out of service. On the NYC we had thousands of XM box cars
out of service awaiting rebuilding or rarely used because of
obsolescence. The proportion of such cars varied widely among
railroads. As a result the listings in the ORER were only
approximately representative of the cars actually hauling freight.

I could probably think of other factors given some time, but I think
this is enough to say that the equal distribution hypothesis requires
a huge leap of faith. Remember that it is a theory developed in a
quest to answer a question that may not be answerable and uses data
that is only indirectly related to the end result and purportedly
validated by a very small sample. When statistical results vary
widely from what our knowledge of the real world leads us to expect,
we should first question the validity of the statistical method.

BTW, that storage to shortage phenomenon is a reason that average
loads per car per year gives us no useful information about
turnaround time of cars actually in use. And it didn't mean
long tracks of stored cars. That surplus was distributed a day or
two at a time over thousands of yards and stations - cars awaiting
distribution for a day or two more than they would have in a time of
shortage.


Stokes John
 

I agree with Malcolm. The very notion that box cars were distributed on all railroads across the country in proportion to ownership is patently absurd on its face. As has been noted before, and mentioned here in some respects, there were just too many variables and oddities and contingencies, etc. for such a hypothesis to work in reality. All one has to do is look at some photos of rail yards and freight trains from a good sample of the US and one would instantly recognize the random, and sometimes predictable, patterns of box car distribution across the country. The reason people get so excited about seeing a BAR red, white and blue box car in Dallas, Texas is because it is a rarity, just as a WP car in Boston would be a wonder to behold. The Granger railroads kept a high percentage of their fleets on their home irons, and especially the CB&Q. I lived in Lubbock, Texas for four years while in school many years ago, when the single sheathed box cars were predominant on the Q, and about all you would see on the High Plains of the Panhandle at that time were these great SS box cars. Out here in the PNW during the BN days one could see long freight trains with almost nothing but BN cars, along with some still unrepainted NP and GN cars, with a smattering of other roads, even some strays from the Southeast at times.


Proportional distribution across all railroads on some mathematical formula and theory is just an exercise in lahlah land. Fun, interesting, and helpful in a tiny degree as previously noted, but not applicable or transferable to model railroad situations (aside from the virtual railroads, where anything is possible).
John Stokes
Bellevue, WA



To: STMFC@yahoogroups.com
From: mlaughlinnyc@yahoo.com
Date: Mon, 18 Aug 2008 16:06:46 +0000
Subject: [STMFC] Re: freight car distribution - rejecting the equal distribution hypothesis.






Anthony Thompson <thompson@...>
 

John Stokes wrote:
I agree with Malcolm. The very notion that box cars were distributed on all railroads across the country in proportion to ownership is patently absurd on its face.
Gosh, John, this sure saves you from doing any data analysis or for that matter, even looking at any data. To call Tim Gilbert's work "lahlah land" is insulting and ignorant. But hey, opinion trumps facts, right?

Tony Thompson Editor, Signature Press, Berkeley, CA
2906 Forest Ave., Berkeley, CA 94705 www.signaturepress.com
(510) 540-6538; fax, (510) 540-1937; e-mail, thompson@signaturepress.com
Publishers of books on railroad history


Anthony Thompson <thompson@...>
 

Malcolm Laughlin wrote:
It has been stated that the burden of proof is on me that box cars were not distributed on all railroads in proportion to ownership. The burden of proof notion is not applicable here because it is not possible to "prove" either side of the argument . . .
Yeah, it's hard to disprove all of Tim's analysis, so let's say neither side can be proved.

Here are some variables that work against proportional distribution across all railroads . . . I could probably think of other factors given some time, but I think this is enough to say that the equal distribution hypothesis requires a huge leap of faith.
This is willfully ignoring how Tim went about what he did. He took the data and analyzed it to see what happened. He often said he was pretty surprised by the results. But that's not a "leap of faith," it's believing the analysis. The fact that there are lots of variabilities or factors which COULD work against the hypothesis doesn't mean that, on balance, they DO contradict the hypothesis.
I guess the objectors, having no data of their own or any way to disprove Tim's analysis, are naturally falling back on objecting to the entire idea. If Tim were with us, I'm sure he'd be smiling at these mental gymnastics.



Stokes John
 

Tony,

Finally got some response other than the repetition of the studies. No, not meant to be insulting at all, but I think correct for the notion that this "data" applies absolutely. I don't think Tim is saying that the equal distribution accurately depicts the status on every railroad in the country on every day of every month of every year. I do not recall his making the claim that was the case, or that every freight train or group of freight trains on any given railroad would always have the predicted percentages. What I think flies in the face of reality is the notion that this formula works every time and that it directly and specifically applies to modeling situations on most people's layouts.


I believe what some people are saying is that the idea that there was this perfect mathematical distribution of box cars to every railroad in the land in proportion to ownership is hard to see in light of the almost infinite variables that would affect such distribution, and photographic and personal observation evidence to the contrary. Who made this happen, or how did it happen? Some unseen guiding hand? What did Mark Twain say about statistics?


Actually, this all has little direct effect on most modelers, since none of us, except the virtual guys and really large clubs, even begin to approach the traffic potential to allow any one to follow these theories.


And I have looked at the facts, and they are more than statistics based on small samples, and they say that this mathematical precision did not occur in real life. But I would like to hear why that is not correct. Tell those who disbelieve, in a good concise paragraph again, the meat of the theory and the facts that back it up. I am willing to try to learn.


John Stokes
Bellevue, Wa



To: STMFC@yahoogroups.com
From: thompson@signaturepress.com
Date: Mon, 18 Aug 2008 10:40:08 -0700
Subject: Re: [STMFC] Re: freight car distribution - rejecting the equal distribution hypothesis.






Anthony Thompson <thompson@...>
 

John Stokes wrote:
Finally got some response other than the repetition of the studies. No, not meant to be insulting at all, but I think correct for the notion that this "data" applies absolutely. I don't think Tim is saying that the equal distribution accurately depicts the status on every railroad in the country on every day of every month of every year . . .
No, Tim did not, and I think would be horrified if anyone so applies it. He was quite knowledgeable about prototype car handling and knew about all the variables people think they are the first to point out.

And I have looked at the facts, and they are more than statistics based on small samples, and they say that this mathematical precision did not occur in real life. But I would like to hear why that is not correct. Tell those who disbelieve, in a good concise paragraph again, the meat of the theory and the facts that back it up. I am willing to try to learn.
I personally think Tim's data say very clearly that the appearance of free-running cars like box cars DID follow, statistically, not absolutely with mathematical precision (Tim never said anything like that, so let's drop it now), the proportions in the national car fleet. That means that in MOST cases, the ancient hobby rules of thumb, that interchange partners dominate other roads and that the farther away the railroad, the less likely are its cars, are wrong.
OF COURSE there are other variables. OF COURSE this doesn't work in every location or on every train. The point is that it's the underlying reality. Anyone who doesn't have better data than Tim's will just have to get used to it.



Stokes John
 

Tony,

I can accept that, on a theoretical basis, which is different than what some of the comments have conveyed, and time to move on. The logic of it still just escapes me, but I can't completely put my finger on it, or why I think the data, while perhaps valid for what it is, has something missing, but sometimes one has to accept what seems improbable and deal with it until better data is unearthed, if ever.


Thanks for the more rational explanation.

John S.



To: STMFC@yahoogroups.com
From: thompson@signaturepress.com
Date: Mon, 18 Aug 2008 11:18:26 -0700
Subject: Re: [STMFC] Re: freight car distribution - rejecting the equal distribution hypothesis.






major_denis_bloodnok <smokeandsteam@...>
 

This is willfully ignoring how Tim went about what he did. He
took the data and analyzed it to see what happened. He often said he
was pretty surprised by the results. But that's not a "leap of
faith," it's believing the analysis.
Thank you Tony

I don't think that he ever set out to prove anything one way or
another but rather to find out if there was anything useful to be
extracted from the data he had.

Tim's work produced results that do seem counter intuitive if we stick
to some long-held conventions of freight car distribution, but ones
that haven't yet been countered with similar studies showing different
results.

Certainly there are some oddities in some wheel reports, but the
overall picture based on his work is clear. Now, if we had other
equally rigorous studies showing different results such as a closer
correspondence with theories about regional or connecting roads we
would need to look at these and see if we can find out why.

However there are other ideas worth thinking about if we are
considering model railroads rather than real ones.

For example, a collection of freight cars weighted with cars from a
specific region can help provide a sense of time and place as much as
the locomotives and scenery can - it may not be perfectly accurate if
you adhere to the school that says "always model the average and never
the unusual", but it's perfectly believable if carried out in
moderation.

I'm not advocating a complete "build-what-you-like" philosophy, simply
suggesting that Tim's work should be treated as more of a field guide
than a stone tablet. The up and down rhythm of varying rooflines and
car types is much more effective in catching the look and feel of a
real steam freight train than an over-disciplined adherence to a
mystical formula of freight car distribution.


Aidrian Bridgeman-Sutton


Cyril Durrenberger
 

All,

One of the main problems with all of this data analysis is that you are trying to take a very small sample size (really insignificant) and extrapolate to the whole fleet. This also assumes that any sample you analyze is representative of the whole, which is not very likely.

As has been pointed out such an approach is also limited by the year (the rosters will change with time), season and location.

Also, as pointed out in this post, most modelers do not have a large fleet of cars (how long would it take to build 500 detailed resin cars?).

So this is what I think would be a reasonable approach for the analysis: (remark I am a research scientist at The University of Texas at Austin)

1. Use data that is likely to reflect trains that would be hauling a representative sample. This would have to be through freights, not local freights.

2. Then compare train data set to roster data set for that year in a very easy to follow format (I suggest an excel spread sheet). This would be done for each train data set. Post the results in a data base for the group.

3. The groupings of small railroads would be necessary to simplify the process.

4. Then ask the question - does the train data compared to the roster data support that the cars are distributed according to the size of the national roster information.

5. It would be important to do this for box cars, gondolas and flat cars in separate groupings. One could do this for hopper cars and stock cars, but they are likely to be impacted more by location, and this would become evident in such an analysis.

If Tim has already done this, then present the data in such a format. What I have seen does not do this. Please do not tell us that is what it shows; show us the data, then each person can determine if this is applicable to their situation and decide how to use the information.

6. Also include the time of year and the location for the train data set as there may be seasonal and regional impacts that one may wish to consider.

7. Then the user can apply their knowledge of differences in traffic patterns, etc for their own railroad on a case by case basis. It is not likely that the same approach will work for all model railroads.

8. Forget the statistical analysis - the train data sets are far too small to reasonably apply it. If you want to do a statistical analysis, you would need to apply some sort of bootstrap process to resample the data.
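
As an illustration of steps 2 to 4 and the bootstrap idea in step 8, here is a minimal sketch in Python. It is not Tim's method or anyone's published analysis. The roster shares and observed car marks are made-up placeholder numbers, and the function names are hypothetical. It computes the car counts that proportional distribution would predict, then resamples the observed cars to get a rough range for each road's observed share.

```python
import random

def expected_counts(roster_shares, n_cars):
    """Cars per owner expected if cars appear in proportion to roster share."""
    return {road: share * n_cars for road, share in roster_shares.items()}

def bootstrap_share_interval(observed_marks, road, n_resamples=2000, seed=0):
    """Percentile bootstrap range (2.5% to 97.5%) for one road's observed share."""
    rng = random.Random(seed)
    n = len(observed_marks)
    shares = []
    for _ in range(n_resamples):
        # Resample the same number of cars, with replacement.
        resample = [rng.choice(observed_marks) for _ in range(n)]
        shares.append(resample.count(road) / n)
    shares.sort()
    low = shares[int(0.025 * n_resamples)]
    high = shares[int(0.975 * n_resamples) - 1]
    return low, high

# Hypothetical placeholder data, NOT taken from any wheel report or the ORER.
roster_shares = {"PRR": 0.10, "NYC": 0.10, "SP": 0.04, "OTHER": 0.76}
observed_marks = ["PRR"] * 9 + ["NYC"] * 8 + ["SP"] * 10 + ["OTHER"] * 73

n_cars = len(observed_marks)
expected = expected_counts(roster_shares, n_cars)
for road in roster_shares:
    observed = observed_marks.count(road)
    low, high = bootstrap_share_interval(observed_marks, road)
    print(f"{road:6s} observed {observed:3d}  expected {expected[road]:5.1f}  "
          f"observed share range {low:.3f}-{high:.3f}  "
          f"roster share {roster_shares[road]:.3f}")
```

One caveat on the design: resampling individual cars treats each car as independent, which is doubtful for cars that moved together in blocks. Resampling whole trains instead of single cars would probably be closer to what the data can support.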

Once the data is presented anyone can then take that and apply it as they see fit for their model railroad.

I will tell you how I have gone about it for my model railroad of the Houston East & West Texas circa 1910.

1. Use the data from the railroad commission reports on commodities shipped on the railroad to determine the split of cars by type - box car, refrigerator car, flat car, gondola car, stock car, etc. Other data in the reports tells the train lengths and split between loads and empties.

2. I would obtain the ICC Blue Book data for 1910 to determine the ratio of home road cars to other cars. In my case for the HE&WT these data would likely be applicable only to box cars, knowing from other information that most flat car and gondola loads were generated locally with home road (SP lines) cars. Other more robust data sources discussed below are used for tank cars, stock cars and refrigerator cars.

3. If we can show that in 1910 the non home box cars are distributed by national roster data, use this to determine the number of box cars for each railroad.

4. Realize that in 1910 (it is not necessarily the same for later years), most tank cars and refrigerator cars used on the HE&WT were owned by private owners, so that would mean another type of analysis. Use the railroad commission reports of mileage for private owner cars (since there was significant variation from year to year, average over a span of 3 years - 1909, 1910 and 1911) to determine companies likely to show up on the HE&WT. This number of cars for each company on the model railroad would then be coupled with the industries served on the model railroad, the availability of information on the company's cars (not as much of a problem for the 1950's), availability of suitable decals to letter the cars and the availability of kits (or the willingness to scratch build or kit bash the cars needed) to determine the tank car and refrigerator car fleets for the model railroad. The data for the private owner cars during this time period was nowhere near the split based on number of cars nationally, with some large fleets of cars never showing up at all. A case in point would be the Union Tank Line - very small percentage, but there was a large representation from Merchants and Planters Oil Company and Higgins Oil & Fuel Company, both with rather small fleets of cars. The same sort of thing happened with refrigerator cars. PFE was not well represented as they had recently started operations. Armour owned companies had the largest representation. Santa Fe cars did not have much mileage on the HE&WT. Another problem is that refrigerator cars from the Houston Packing Company had a good deal of mileage, but I cannot locate any information on what they looked like or how they were lettered. The same thing happens with many of the tank cars, many of which were not listed in the ORER.


5. Flat cars and gondolas would be mainly SP lines cars.

6. Stock cars would be a mix of SP lines cars, private owner cars (much of the live stock was shipped on private owner cars in 1910) and other cars. However, there was not much live stock shipped on the HE&WT.

I have done all of the above with the following exceptions: I have not had the ICC Blue Book data (I just found out about it), nor have I done the percentage of box cars based on the national fleet. I intend to do that if I can locate the Blue Book data for the HE&WT for 1910 and then I will compare the results of this approach to the other one I used earlier. That being - 50% of the cars being local (SP lines in this case), 25% from lines that connected with the HE&WT and the other 25% from the national fleet based roughly on the percentage of their car fleet. The split of model SP Lines box, flat and gondola cars was based on the percentage of each type of car on the roster at that time and the road names. As mentioned earlier, this presents some problems as many of these cars would have to be scratch built.
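
The 50/25/25 rule of thumb described above can be sketched as a small calculation. This is only an illustration in Python. The fleet size of 40 cars, the road names and their weights are hypothetical placeholders, not 1910 roster data, and the function names are made up for the example.

```python
def allocate_box_cars(fleet_size, home=0.50, connecting=0.25):
    """Split a model box car fleet into home, connecting and national groups."""
    n_home = round(fleet_size * home)
    n_connecting = round(fleet_size * connecting)
    # The national group takes the remainder so the total stays exact.
    n_national = fleet_size - n_home - n_connecting
    return n_home, n_connecting, n_national

def split_by_share(n_cars, shares):
    """Divide n_cars among roads in proportion to shares (largest remainder)."""
    total = sum(shares.values())
    quotas = {road: n_cars * s / total for road, s in shares.items()}
    counts = {road: int(q) for road, q in quotas.items()}
    leftover = n_cars - sum(counts.values())
    # Hand the leftover cars to the roads with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda r: quotas[r] - counts[r], reverse=True)
    for road in by_remainder[:leftover]:
        counts[road] += 1
    return counts

# Hypothetical example: a 40-car box car fleet.
n_home, n_connecting, n_national = allocate_box_cars(40)
print(n_home, n_connecting, n_national)

# Hypothetical relative fleet sizes for the national group (placeholders).
national_weights = {"PRR": 5, "NYC": 4, "ATSF": 3}
print(split_by_share(n_national, national_weights))
```

With these placeholder numbers the fleet splits into 20 home road, 10 connecting and 10 national cars, and the 10 national cars divide 4, 3 and 3 among the three example roads. The largest-remainder step is just one reasonable way to turn fractional shares into whole cars.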

Cyril Durrenberger




Mike Brock <brockm@...>
 

We are getting to the point...not unlike discussions about color...where we
are not seeing anything new. So...we are getting to the point where the
thread will need to be terminated. Obviously some members are convinced
regarding the Nelson/Gilbert theory's validity and others are not. Before the thread is terminated...until new data becomes
available, I would appreciate seeing some clarification on a few points.

Tony Thompson writes:

"I personally think Tim's data say very clearly that the
appearance of free-running cars like box cars DID follow,
statistically, not absolutely with mathematical precision (Tim never
said anything like that, so let's drop it now), the proportions in the
national car fleet."

What does that mean? In the 1947 data covering Laramie to Green River, the
theory predicts [ I guess that's a good word ] 28 SP box cars. The actual number was 34...a 20% error. In the 1949 data, the theory predicts 52 SP box cars. The actual number was 136...or an error of 161%. Now...when I have mentioned this before, the answer was...nooo problem. This is statistics. OK...fine. No argument. Suppose that damned UP train with the 36 SP box cars was in Fraley's sample. Now the error would be 230%. What if 5 more such trains showed up? 576%. What if it were 1000%? Or 10000%? When does it become a problem...or are the violating SP numbers just thrown away? If the reply to this is that an error of 161% is OK, why bother with the individual national %? Just take the acceptable SP number of 136 = .01 (Y) (1325 box cars), Y = 10.2% and use it for all RR's? After all, SP's national % of 3.6% is fairly representative of all RR's except for PRR and NYC. Just add 5% more for them. I guarantee that the "error" between the actual national % for CGA, Rutland or FEC won't produce a worse error than using the actual SP national % does with the Fraley 1949 data. And, it will be a lot easier to do...don't have to look up anything. Of course, the same thing can be achieved by just acquiring the same number of cars for every RR except PRR and NYC. Get two of each of those. Then do the same thing 3 more times until you have 4 cars of every RR except PRR and NYC which you will have 8 of. I guarantee that will get you in the envelope of statistical success just as much as taking the national % for each RR.
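
For anyone who wants to check the arithmetic in the paragraph above, here is a short sketch in Python. The counts are the ones quoted in the post. Two things are assumptions: that the prediction stays at 52 cars when extra trains are added to the sample, and that "5 more such trains" means five more trains with 36 SP cars each. The function name is made up for the example.

```python
def percent_error(predicted, actual):
    """Error of the actual count relative to the predicted count, in percent."""
    return 100.0 * (actual - predicted) / predicted

# Figures quoted in the post.
print(round(percent_error(28, 34), 1))    # 1947: 28 predicted, 34 actual
print(round(percent_error(52, 136), 1))   # 1949: 52 predicted, 136 actual

# Assumption: the extra UP train adds 36 SP cars to the actual count while
# the prediction stays at 52 (the post does not say how it would change).
print(round(percent_error(52, 136 + 36), 1))

# Assumption: "5 more such trains" means five more trains of 36 SP cars each.
print(round(percent_error(52, 136 + 36 + 5 * 36), 1))

# Share implied by 136 SP cars out of the 1325 box cars cited in the post.
implied_share = 100.0 * 136 / 1325
print(round(implied_share, 2))
```

Under those assumptions the results come out at about 21%, 162%, 231% and 577%, with an implied share of about 10.3%. That agrees with the figures in the post (20%, 161%, 230%, 576% and 10.2%) apart from rounding.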

"The point is that it's the
underlying reality. Anyone who doesn't have better data than Tim's will
just have to get used to it."

As far as I know, Tim's data was the 1947 Fraley and the Southern RR data on a train in Asheville. I gave Tim a copy of my Fraley [ 1949 ]. Did he use any other data of actual car reports?

Mike Brock


Mike Brock <brockm@...>
 

Cyril Durrenberger says:

"One of the main problems with all of this data analysis is that you are trying to take a very small sample size (really insignificant) and extrapolate to the whole fleet. This also assumes that any sample you analyze is representative of the whole, which is not very likely."

Amen.

"1. Use data that is likely to reflect trains that would be hauling a representative sample. This would have to be through freights, not local freights.

2. Then compare the train data set to the roster data set for that year in a very easy to follow format (I suggest an Excel spreadsheet). This would be done for each train data set. Post the results in a database for the group.

3. The groupings of small railroads would be necessary to simplify the process.

4. Then ask the question - does the train data, compared to the roster data, support the idea that the cars are distributed according to each road's share of the national roster?"
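Steps 2 and 4 above could be sketched roughly as follows. This uses Python in place of the suggested spreadsheet, and the road names and counts are placeholders, not real roster or train data. The function name `compare_to_roster` is hypothetical.

```python
def compare_to_roster(train_counts, roster_counts):
    """For each road, compare cars seen in the train data with the number
    expected if cars appeared in proportion to roster size.

    Returns a list of (road, observed, expected, observed - expected).
    """
    total_seen = sum(train_counts.values())
    total_roster = sum(roster_counts.values())
    rows = []
    for road in sorted(roster_counts):
        share = roster_counts[road] / total_roster
        expected = share * total_seen
        observed = train_counts.get(road, 0)
        rows.append((road, observed, round(expected, 1),
                     round(observed - expected, 1)))
    return rows

# Placeholder numbers only. "OTHER_SMALL" stands in for the grouping of
# small railroads suggested in step 3.
roster = {"ROAD_A": 5000, "ROAD_B": 3000, "OTHER_SMALL": 2000}
train = {"ROAD_A": 40, "ROAD_B": 45, "OTHER_SMALL": 15}

rows = compare_to_roster(train, roster)
for road, observed, expected, diff in rows:
    print(f"{road:12s} observed={observed:4d} expected={expected:6.1f} diff={diff:+6.1f}")
```

Running the same comparison on each train data set separately, as step 2 suggests, would let each reader see the raw differences before anyone draws a conclusion from them.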

You are saying to compare the real data to the national %? Good grief. Use real data? <G>. BTW, here are Tim's comments regarding such:

"In 1947, the ownership of foreign boxcars aggregated into eight ICC
Geographic Regions correlated pretty well with the percentage those
regions owned of the National Boxcar Fleet. In 1949, that correlation
was blown to hell."

I have no analysis on individual RR's other than SP from my 1949 Fraley. I seem to recall, however, being bitterly disappointed that NP box cars were much less represented than I had hoped...refuting my Fifth Rule of Frt Cars. Sort of like being hoisted on my own Fraley.

"If Tim has already done this, then present the data in such a format. What I have seen does not do this. Please do not tell us that is what it shows; show us the data, then each person can determine if this is applicable to their situation and decide how to use the information."

As far as I know, Tim only grouped RR's by regions except for those closely associated to UP...SP, Milw, CB&Q, and C&NW as is shown in his message of Feb 3, 2006, which I republished here last week.

Mike Brock


Stokes John
 

Well, I got thoroughly chastised and sneered at by the resident statistical intellectuals because I questioned the state of undress, then along come some other people asking cogent questions about all this, and Mike asks logical and pertinent questions again. It still seems to me that this is all hogwash about the national statistics as they may apply to and be useful for predicting how many SP box cars will be seen on the Inside Gateway on April 23, 1956 (stayed within the magic time frame, Tim). Random doesn't mean predictable, except that you can predict that it will be random. Dictionary definition of random is "lacking aim or method; purposeless; haphazard." In statistics it means "of statistical sample selection in which all possible samples have equal probability of selection." Maybe we are applying random walk here, Tim, or random variables, where the variable's values are determined independently according to a probability distribution? Predictability means capable of being predicted, which means to say in advance what one believes will happen. Yes, you can predict that the percentages of box cars in a given freight train will be random, but you say they will be in a set percentage that does not vary. Round and round we go.
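One way to put a number on "random but within a set percentage" is the spread a simple binomial model would allow. The sketch below is only that: it assumes every box car in the sample is an independent draw with a fixed chance of being an SP car, which is the very assumption being argued over in this thread (cars that travel in blocks, like a train with 36 SP cars, break it). The share is the one implied by the figures quoted earlier in the thread (52 predicted out of 1,325 box cars); the function name is made up.

```python
import math

def binomial_spread(n_cars, share):
    """Mean and standard deviation of the count of one road's cars among
    n_cars, assuming independent draws with a fixed probability `share`."""
    mean = n_cars * share
    sd = math.sqrt(n_cars * share * (1.0 - share))
    return mean, sd

n_cars = 1325            # box cars in the 1949 sample, as quoted in the thread
share = 52 / n_cars      # share implied by the predicted 52 SP cars
observed = 136           # SP box cars actually counted in 1949

mean, sd = binomial_spread(n_cars, share)
z = (observed - mean) / sd   # distance from the prediction, in standard deviations

print(f"expected {mean:.0f} cars, give or take about {sd:.1f}")
print(f"observed {observed} is {z:.1f} standard deviations above that")
```

Under those assumptions the model allows a spread of roughly 7 cars around the predicted 52, so a count of 136 sits far outside it. Whether that points to a non-proportional distribution or to a sample that is not made of independent draws is exactly what the two sides disagree about.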

Quantum physicists know about this. A random event cannot be predicted or duplicated, it's a Surprise! Almost everything is predictable, but many outcomes are very difficult to predict because the variables that drive the outcome are either unknown or difficult to measure. That is precisely what we are dealing with here. While we can get the stats on the nationwide freight car fleet, and somehow come to the conclusion that this percentage holds true as the box cars travel around the nation, each following as if by magic its random predictable pattern and percentage, the fact is that there are a whole host of variables that drive the outcome and we either don't know them all or we don't have enough information to do anything with them.

This is like playing a video game: it exercises the mind and keeps one's juices flowing, but in the end it is virtually meaningless and not necessarily a good way to spend one's time, especially when one realizes that there are so many models to build and run and so little time to do it in.

Bye bye,

John Stokes
Bellevue, Wa





To: STMFC@yahoogroups.com
From: brockm@brevard.net
Date: Mon, 18 Aug 2008 17:20:08 -0400
Subject: Re: [STMFC] Re: freight car distribution - rejecting the equal distribution hypothesis.


