- Date: 01 Feb 93 22:51:51 CST
- From: Jim Thomas <tk0jut2@mvs.cso.niu.edu>
- Subject: File 3--How does the SPA Calculate Piracy?
-
- The Software Publishers Association (SPA) estimates that software
- piracy declined between 1989 and 1991. But, says the SPA, piracy still
- cost the industry over $1.2 billion in lost revenues in 1991. Critics
- argue that the piracy rate and its costs are grossly over-estimated.
- The SPA believes that its estimates, while perhaps imperfect,
- nonetheless are quite conservative and, if anything, significantly
- underestimate the extent of software piracy. Who's right? How does
- the SPA arrive at its estimates? The information below comes from SPA
- documents and from David Tremblay, SPA's Research Director.
-
- Identifying and counting behaviors that are normally hidden presents
- several methodological problems. Calculating the extent of piracy is
- no exception. First, there is no victim in the traditional sense.
- There are no snatched purses, dead bodies, empty bank accounts,
- trashed computers, or other directly obvious signs of predation.
- Therefore, we rarely have direct knowledge of an alleged "offense."
- Second, the concepts used to define or measure an "offense" can pose
- particular problems, because definitions are subject to imprecision.
- Third, "victims" of piracy are often unaware that they are victims
- until informed by someone who measures victimization, such as the SPA.
-
- The "DARK FIGURE OF CRIME" is the knowledge-gap between crimes KNOWN
- to have occurred and crimes that ACTUALLY occurred. No existing
- methodology can precisely measure this dark figure, and even the most
- sophisticated provide only approximations. It's therefore not
- surprising that the SPA's attempts to measure the "dark figure of
- piracy" face methodological problems.
-
- The Methodology
-
- Four sets of facts and an assumption underlie the SPA's methodology.
- One set of facts is hardware sales from Dataquest, a marketing
- research company in San Jose, Calif. The calculations begin by
- determining the number of Intel- and Macintosh-based PCs sold during a
- given year.
-
- The second set of data derives from an SPA reporting program in which
- about 150 of the generally larger companies report their unit sales
- and revenue to the SPA. The business applications sales are taken
- from the report and used to estimate the total unit sales of
- software in the U.S. in a given year. Operating systems are excluded.
- The data do not constitute a random sample, but are based on voluntary
- self-reporting of the participating companies. This method is common
- in survey research and, if used with caution, the lack of randomness
- or representativeness of the population surveyed need not be a
- problem.
-
- The third set of facts is the average number of applications that
- users are estimated to have on their personal computers. This body of
- data comes from member research that is sent back to the SPA. The
- members obtain this information from several sources, including
- surveys of their own customer base and from returned registration
- cards. The SPA estimates that the typical DOS (or Intel-based) PC user
- has three applications, and the typical Macintosh user has five. One
- reason Mac users may have more applications than Intel-based users is
- ease of use: cross-learning between different Mac programs flattens
- the learning curve, and the programs integrate better with one
- another.
-
- The fourth datum is the average price for a software program in a
- given year. However, in calculating the total dollar volume of
- revenues lost to piracy, David Tremblay indicates that "street value"
- prices are factored in, rather than assuming that each program would
- sell for market list price.
-
- Finally, the methodology is based on the ASSUMPTION that all of the
- units of software that are purchased in a calendar year are purchased
- by or for use on PCs that are new that year. It assumes no application
- sales to computers purchased in previous years.
-
- These data are then plugged into a formula (figures are illustrative):
-
- 1. The PC hardware sales (in number of units) are multiplied by the
- number of applications used. If there are 1 million Intel-based units
- sold, and each has 3 commercial software applications (excluding the
- operating system itself), we get a figure of 3 million.
-
- 2. The number of applications purchased during that year is
- subtracted from the expected number of applications in use. If 2.4
- million applications are sold, the difference is 600,000. This is
- assumed to be the number of applications pirated.
-
- 3. The number of applications pirated is then multiplied by the
- average cost of a software package, which has declined from $189 in
- 1989 to $152 in 1991.
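- The three steps above can be sketched as a short calculation, using
- the article's illustrative figures. This is a minimal sketch; the
- function name and code are mine, not the SPA's.

```python
# A sketch of the SPA's three-step formula as described above, using the
# article's illustrative figures. The function name is an illustration,
# not the SPA's actual procedure.

def spa_piracy_estimate(pcs_sold, apps_per_pc, apps_sold, avg_price):
    """Return (applications pirated, estimated lost revenue)."""
    # Step 1: expected applications = new PCs sold x average apps per PC
    apps_expected = pcs_sold * apps_per_pc
    # Step 2: any shortfall between expectation and sales is "piracy"
    apps_pirated = apps_expected - apps_sold
    # Step 3: value the shortfall at the average package price
    return apps_pirated, apps_pirated * avg_price

pirated, lost = spa_piracy_estimate(
    pcs_sold=1_000_000,   # Intel-based PCs sold (article's example)
    apps_per_pc=3,        # SPA estimate for a typical DOS PC
    apps_sold=2_400_000,  # business applications sold that year
    avg_price=152,        # average package price, 1991
)
print(pirated, lost)  # 600000 units, $91,200,000 in "lost" revenue
```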
-
- David Tremblay candidly recognizes the methodological problems,
- although he feels that, on balance, the problems understate rather
- than overstate the level of piracy. He recognizes several market
- problems that could affect the estimates (the skewing directions are
- my own):
-
- 1) Since 1989, the average price per software application has
- decreased. This skews DOWNWARD the proportion of dollar losses from
- year to year.
-
- 2) Hardware sales have been revised downward by Dataquest, which
- reduces the base number of PCs on which piracy estimates are based.
- This skews the piracy estimate UPWARD.
-
- 3) Contrary to the assumption of "no application sales to installed
- base," there is evidence that an increasing percentage of software is
- being sold for use on existing PCs. This skews the piracy estimate
- UPWARD.
-
- There are additional problems. Among them:
-
- 1) The total software sales include sales of upgrades. This would
- seem to under-estimate the extent of illicit software, because it
- over-estimates the base-figure of software sold. For example, if 100
- PCs are sold in a given year, and if each PC has an average of three
- applications, we would expect 300 applications to be sold. If,
- however, we find that only 270 applications are sold, the "piracy
- score" would be 300-270= 30; 30/300 = .1, or ten percent. If upgrades
- are included, and if 20 percent of sales are upgrades, that means
- 300-216 = 84; 84/300 = .28, or a 28 percent piracy rate. Including
- upgrades skews the piracy estimate DOWNWARD but the costs of piracy
- UPWARD.
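- The upgrade arithmetic above can be sketched as follows. This is a
- minimal illustration using the example's hypothetical figures; the
- upgrade_share parameter is my own device.

```python
# A sketch of the upgrade arithmetic above. Treating upgrades as ordinary
# unit sales pads the sales base, so the computed piracy rate drops;
# stripping them out raises it.

def piracy_rate(apps_expected, apps_sold, upgrade_share=0.0):
    """Piracy rate when a share of reported sales are really upgrades."""
    new_licenses = apps_sold * (1 - upgrade_share)
    return (apps_expected - new_licenses) / apps_expected

print(piracy_rate(300, 270))        # .1, or ten percent
print(piracy_rate(300, 270, 0.20))  # .28, or a 28 percent rate
```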
-
- This, however, is misleading, because the base number of applications
- is taken for *all* PCs, not just the PCs purchased in the first year.
- There is no evidence to suggest that the number of applications on a
- PC declines over time. The evidence, as the SPA acknowledges, is the
- opposite. Hence, the base figure of installed applications (an average
- of three per PC) does not accurately predict the number of software
- sales in a single year; using it dramatically inflates the expected
- volume of software sales. Consider
- this example: Person A purchases a computer and three software
- programs in 1989. Person A purchases two more programs in 1990, and
- one in 1991. Person B purchases a computer in 1991 and three
- applications in 1991. Assuming that they are the only ones who
- purchased software or hardware in 1991, the average number of
- installed applications on a PC is 4.5. The number of software sales in
- 1991 is 4. Awkward fractions aside, the piracy score is .5 (half a
- program, or a 12.5 percent piracy rate). In reality, all applications
- can be matched to sales, but the method's assumptions inflate the
- score. It's currently difficult to assess how severely inclusion of
- installed applications on previously purchased computers exaggerates
- the piracy figure. But, if the SPA's current piracy estimate of 20
- percent is correct, even a small influence would produce a dramatic
- inflation of the estimate. The SPA's method of including all
- installed applications in its base data, while restricting comparison
- to only applications purchased in the most recent year, is to my mind
- a fatal flaw.
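- The Person A / Person B example above can be worked through
- numerically; every figure below comes from the text.

```python
# A worked version of the Person A / Person B example above. All of A's
# six programs and all of B's three were purchased legitimately, yet the
# method manufactures half a "pirated" program.

a_installed = 3 + 2 + 1  # Person A: bought 3 in 1989, 2 in 1990, 1 in 1991
b_installed = 3          # Person B: bought 3 with a new PC in 1991

avg_installed = (a_installed + b_installed) / 2  # 4.5 apps per PC

# SPA-style expectation for 1991: new PCs sold that year (only B's)
# times the installed-base average, compared against 1991 sales only.
new_pcs_1991 = 1
sales_1991 = 1 + 3       # A bought one program in 1991, B bought three

expected_1991 = new_pcs_1991 * avg_installed  # 4.5
pirated = expected_1991 - sales_1991          # 0.5 "pirated" programs
print(avg_installed, pirated)  # 4.5 and 0.5, though nothing was pirated
```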
-
- In short, the applications on a PC include not only applications
- purchased the first year, but also include all those collected in
- subsequent years. Further, even if upgrades are included (which would
- push the piracy score DOWNWARD), the street price of an upgrade is
- generally a fraction of the cost of a first purchase, and failing to
- take this into account skews the revenue-loss estimate UPWARD.
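- As an illustration of the pricing point: only the $152 average package
- price comes from the article; the upgrade price and upgrade share
- below are hypothetical.

```python
# An illustration of the street-price point. Valuing every "missing" unit
# at the full average package price overstates losses whenever some of
# those units would have been cheap upgrades. Only the $152 figure comes
# from the article; the rest is hypothetical.

pirated_units = 84        # hypothetical count of "missing" applications
full_price = 152          # average new-package price (1991 figure)
upgrade_price = 50        # hypothetical street price of an upgrade
upgrade_share = 0.20      # hypothetical share that would have been upgrades

# Valuing every missing unit at the full package price:
naive_loss = pirated_units * full_price
# Valuing the upgrade share at upgrade prices instead:
blended_loss = pirated_units * ((1 - upgrade_share) * full_price
                                + upgrade_share * upgrade_price)
print(naive_loss, blended_loss)  # the naive figure is the larger one
```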
-
- 2) A second problem involves the reliability (consistency) and validity
- (accuracy) of reporting methods of company-generated data, especially
- registration card data. It cannot be assumed that the methodological
- procedures of different reporting companies are either consistent
- among themselves (which means they may not be reporting the same
- things) or that their procedures are uniformly accurate. Differing
- definitions of concepts, variations in means of tracking and recording
- data, or differences in representativeness are but a few of the problems
- affecting reliability and validity. This could skew estimates EITHER
- upward or downward.
-
- 3) The value of lost revenue also is dramatically inflated by other
- questionable assumptions. For two reasons, it cannot be assumed that
- every unpurchased program represents a lost sale. First, there is no
- evidence to support, and much evidence to challenge, the assumption
- that, if I did not possess a copy of dBase or Aldus PageMaker
- "borrowed" from my employer, I would purchase it. The ethics of
- such borrowing aside, such an act simply does not represent nearly
- $1,000 of lost revenue. Second, as an actual example, I (and many
- others at my university) have dBase and WordPerfect (and many other
- programs) licitly installed on a home or office PC. These two programs
- alone have a street value of about $700. I would include them as
- "installed" programs in a survey. However, I did not purchase either
- program. Hence, they would not show up in sales statistics, and would
- therefore be attributed to "piracy." But, I did not obtain them
- illicitly. They were obtained under a site license and are installed
- licitly. Consider another example. When I purchased a PC in 1988, it
- came (legitimately) loaded with two programs. I bought two more. Now,
- I have four legitimate programs loaded, but only two would show up in
- normal sales figures. It would seem, from the statistics, that I had
- two "pirated" programs--two purchased, two unpurchased, even though
- there were none. BOTH the piracy score and the lost revenue estimate
- are skewed UPWARD.
-
- Although the subject of a separate article, the SPA's method also
- fails to consider the possibility that casual copying and sharing may
- enhance rather than reduce sales by creating a "software culture" and
- increasing the visibility and end-user facility with the products. If
- sales are increased, it would skew the lost revenues UPWARD. Whatever
- the result, this is an assumption that cannot be discarded without
- strong empirical evidence.
-
- These are just a few of the problems that inflate the overall picture
- of piracy and why I cannot accept the figure given by the SPA as
- accurate. And, if the piracy rate for 1991 is only about 20 percent
- (and in decline), it would appear that--even if the problem is only
- mildly inflated--the losses are far, far less (and the problem
- therefore not as severe) as anti-piracy advocates claim. Yet, despite
- dramatic evidence of decline on a variety of key indicators, SPA
- rhetoric, its advocacy for broader and more punitive legislation, and
- its lucrative, aggressive litigation campaigns continue to escalate.
-
- A caveat: David Tremblay, the SPA's Research Director, makes no claims
- about total accuracy. He is also aware of and quick to point out some
- of the methodological problems. He would not agree with my view of at
- least some of the problems, and perhaps has rebuttals for others. In
- my own discussions with him, he was careful not to speak beyond the
- data, and--like any good methodologist--approached the task of
- calculating piracy as a puzzle. His own attitude, if I understood him
- correctly, was that he's more than willing to modify the method with a
- better procedure if one can be pointed out. Perhaps I misunderstood
- him, but I was continually left with the impression that his goal was
- not to "prove" a preferred outcome, but to refine the data and method
- to provide as accurate an estimate as possible, whatever answer it might
- provide. In short, he has no preconceived ideological ax to grind in
- coming up with his figures.
-
- It should be noted that if a different methodology were used, it is
- quite possible that both the extent of piracy and the lost revenue
- costs *could* be much higher than the SPA's estimates. However, at
- stake is *this* methodology. Contrary to SPA claims, *this*
- methodology appears to INFLATE the frequency and costs.
-
- This, however, does not alter the fact that SPA press releases and
- other material appear to manipulate the data to promote a distorted
- image of piracy. We can agree that there are those who unethically
- (and illegally) profit from piracy, and we can agree that if one uses
- a commercial software program regularly, payment should be made. This
- does not mean that we must also accept the dramatic image of rampant
- piracy and multi-billion dollar revenue loss by casual "chippers."
- Software piracy is, according to SPA data, in dramatic decline.
- Evidence suggests that this decline is the result of education and
- awareness, rather than coercive litigation. At stake is not whether
- we accept ripoff, but rather what we do about it. The statistical
- method and its results do not seem sufficient to warrant increased
- demands for tougher piracy laws or for expanding the law enforcement
- attention to address what seems to be a declining problem.
-
- If I am correct in judging that the SPA's estimate of piracy is
- significantly inflated, then it seems that it is engaging in
- hyperbole to justify its highly publicized litigation campaign. Some
- might find this a good thing. My own concern, however, is that the
- litigation campaign is a revenue-generating enterprise that--to use
- the SPA's own promotional literature--resembles a law unto itself,
- more akin to a bounty hunter than a public-interest group. The SPA
- appears to have an image problem, and the root of the image problem
- lies in what some critics see as speaking beyond the data in describing
- piracy and in using the law to fill its coffers. It is unfortunate
- that the many valuable things the SPA does are overshadowed by its
- self-laudatory high-profile image as a private law enforcement agency.
-
- The methodology underlies an ideological opposition not just to
- intellectual property, but to human interaction and social norms. In
- promoting a zero-tolerance attitude toward a strict definition of
- "piracy" and rigid adherence to the limitations of shrinkwrap
- licenses, the SPA would isolate the casual swapper and criminalize
- non-predators along with major predators. As Richard Stallman, a
- promoter of free software, argues in the first issue of _Wired_
- magazine (p. 34), violation of shrinkwrap is called piracy, but he
- views sharing as being a "good neighbor":
-
- I don't think that people should ever make promises not to
- share with their neighbor.
-
- It's that gray area between being a good neighbor and crossing over
- into unacceptable behavior that, to my mind, poses the dilemma over
- which there is room for considerable honest intellectual disagreement.
-
- Downloaded From P-80 International Information Systems 304-744-2253
-