Rich, once a publication has been converted to a PDF (typically
by OCR software) then indexing can be done automatically by dozens
of available tools including Google Desktop -- you can point your
personal Google Desktop at a PDF collection in an offline archive
for example, and it will read and index the documents for you.
Of course, the worldwide Google search engine will do the same,
but the data you want will be buried much deeper... The only
current limitation of these indexers is poor recognition of
image files (JPEG, etc.), but in another 10-15 years I think that
automatic recognition and indexing of railroad images (as well
as all other images) will be very common.
There's an interesting article on the "Fourth Paradigm" in today's
New York Times, about how science and analysis are being transformed by
data mining software tools. Someone is bound to develop "brilliant"
software that can read a magazine and recognize ads and distinguish
between different types of content and produce specialized indexes.
At 12/18/2009 08:38 PM Friday, you wrote:
Such an index would be a really nice addition to our knowledge base,
but to do it would be very time consuming.
Back in the 1980s I did an index like this in Lotus 1-2-3, but it only
dealt with every PRR entry for Railway Age, Railroad Gazette, and Railway
Age Gazette (as it was called for a few years after they merged). It took
up over 80 pages. I've never digitized it, but still today refer to it
often. There was a limited publication of it (i.e., about six copies). The
PRRT&HS archives have one, as does the U of M transportation library. A similar
index of every freight car article would be wonderful, but I can't imagine
anyone having the time to do it all. It took me over two years collecting
the PRR data and organizing it. Maybe today in the age of fast computers
someone will figure out a way? I recall it took my desktop all afternoon to
sort the pages out back then. I'd like to eventually update the index to
include advertisements that were PRR oriented.
Lacking access to a PDF copy machine I can't offer the index online,
but if anyone wants to pay the cost of a hard copy I would gladly go to the
copy shop and produce one.
A sidelight of this was that Railroad Gazette was the more advanced
magazine of the day (early 20th century), but Railway Age had begun in
Chicago a number of years earlier and had the "bragging rights" as the oldest
such magazine in publication. Once merged, the name became Railroad Age
Gazette, but morphed back to Railway Age, so that it could claim such an early
date as its first issue. A third magazine, Railway Review, also existed,
and merged into Railway Age around 1926. This all began in 1908 with the
initial merger of Railway Age and Railroad Gazette to become Railroad Age
Gazette, which soon became Railway Age Gazette, and then finally Railway
Age in 1917. There was no name change when Railway Review merged into the