by ASQ Member Jay Armstrong
We don’t usually confuse statisticians with spies, and yet statistics played a significant role in the defeat of Germany in the Second World War. By the 1930s, Germany’s pre-war industrial prowess and engineering capabilities were both mature and formidable. This was particularly true for her various weapon systems. With D-Day looming, of greatest concern were the newest Mark V “Panther” series tanks. Unfortunately, these were superior to the Allies’ tanks across all performance dimensions: armor, cannon size, speed, and mobility. Beyond this painful reality was the critical need to understand the actual number of tanks Germany was manufacturing in order to develop effective tactics and countermeasures. This is where statistics, or more precisely the point estimation formula, came in. The Germans, notorious as they are for order and precision, thoughtfully applied serial numbers to their equipment, which exactly matched the production date for that item. This compulsion would provide the Allies with a solution to their problem.
The idea that serial numbers could provide insights into material production output first arose in 1943 and was immediately used by the Economic Warfare Division to gauge German tire yields. More creative statisticians soon realized that such thinking could be leveraged against all military output and began asking for the serial numbers from captured and destroyed tanks. The front-line soldiers, although annoyed at what seemed to be a trifle, complied.
Let’s review the statistical underpinnings of this thinking with an example. Suppose that I have the serial numbers for 20 destroyed tanks in Table 1.
What is a reasonable estimate of the total number of tanks that have been produced to date?
Or, statistically speaking, I want to perform an “estimation of the maximum of a discrete uniform distribution using sampling without replacement.”
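Our useful formula, the “minimum-variance unbiased estimator” (the maximum likelihood estimate would simply be the highest serial number observed, m), is:
$\hat{N} = m + \frac{m}{n} - 1$
Where:
$\hat{N}$ = the estimated total number of tanks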
m = the highest obtained serial number
n = the sample size (my destroyed tanks)
I note from my sample that “669” is the highest serial number in the series, and thus becomes “m.” My sample size is “20” and is therefore “n.” Substituting, I determine that $\hat{N}$ = 669 + 669/20 - 1 = 701.45, or approximately 700 tanks.
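The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name estimate_total is made up for illustration, and it assumes serial numbers start at 1 and run consecutively, which is the assumption the formula itself rests on.

```python
def estimate_total(m: int, n: int) -> float:
    """Estimate total production from the highest serial number seen.

    m: highest serial number in the sample
    n: number of serial numbers in the sample
    Assumes serials start at 1 and are consecutive (an assumption of
    this sketch, not something stated in the article's tables).
    """
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    if m < n:
        raise ValueError("m cannot be smaller than n for distinct serials starting at 1")
    # m plus the average gap between observed serials: m + m/n - 1
    return m + m / n - 1


# Values from the worked example: m = 669, n = 20
estimate = estimate_total(669, 20)
print(f"{estimate:.2f}")  # prints 701.45
```

The m/n - 1 term is the average gap one would expect between sampled serial numbers, so the estimate pushes the observed maximum up by roughly one such gap.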
After the war, the statistical calculations proved to be far more accurate and reliable than the traditional methods of SWAG and intelligence gathering at German factories (Table 2). The actual monthly production figure, 245 tanks/month, closely matched the statistical prediction of 246 tanks/month. Depressingly, initial Allied intelligence had suggested that an astounding 1,400 tanks were being produced monthly. In fact, the Germans had cleverly contributed to this numerical confusion by painting and repainting higher numbers on their tank turrets to confuse spies.
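To get a feel for why such a simple formula can land so close to the truth, one can run a small simulation on synthetic data. The sketch below is purely illustrative and uses no wartime figures: the true total of 700, the sample size of 20, the number of trials, and the function names are all assumptions chosen to mirror the worked example. It repeatedly draws serial numbers without replacement and compares the average of the raw sample maximum with the average of the m + m/n - 1 estimate.

```python
import random
import statistics


def estimate_total(m: int, n: int) -> float:
    """Point estimate of total production: m + m/n - 1."""
    return m + m / n - 1


def simulate(true_total: int = 700, sample_size: int = 20,
             trials: int = 10_000, seed: int = 0) -> tuple[float, float]:
    """Return (mean of sample maxima, mean of estimates) over many trials.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    serials = range(1, true_total + 1)
    maxima = []
    estimates = []
    for _ in range(trials):
        # "Capture" sample_size tanks at random, without replacement
        sample = rng.sample(serials, sample_size)
        m = max(sample)
        maxima.append(m)
        estimates.append(estimate_total(m, sample_size))
    return statistics.mean(maxima), statistics.mean(estimates)


mean_max, mean_estimate = simulate()
print(f"average highest serial seen: {mean_max:.1f}")
print(f"average estimate:            {mean_estimate:.1f}")
```

On average the highest serial number alone falls short of the true total, while the corrected estimate centers on it, although any single sample of 20 can still miss by a few dozen tanks.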