Measuring the Link Between Academic Science and Industrial Innovation:
The Case of California's Research Universities*

Lee Branstetter
Department of Economics, University of California, Davis, CA 95616, and NBER

June 30, 2000

Preliminary and Incomplete. Do Not Quote or Cite.

* Acknowledgements: I wish to thank Ernst Berndt, Iain Cockburn, Robert Feenstra, Adam Jaffe, Josh Lerner, and David Mowery for useful comments and suggestions. I also wish to thank a number of academic scientists and industrial R&D managers for providing me with their insights into the process by which knowledge flows from academia to industry. I am indebted to Colin Cameron for detailed guidance concerning the econometric models used in this paper and to Hiau-Looi Kee and Kaoru Nabeshima for excellent research assistance. I would like to thank Tony Breitzman and Francis Narin of CHI Research, Adam Jaffe, and Marie and Jerry Thursby for their help in obtaining the data used in this study. This project was funded by grants from the University of California Industry-University Cooperative Research Program, the NBER Project on Industrial Technology and Productivity, the Japan Foundation Center for Global Partnership, and the Institute for Governmental Affairs at UC-Davis.

I. Introduction

The impact of academic science on industrial innovation has received a great deal of attention. A full review of even the recent literature is beyond the scope of this draft, and I will mention only a few studies from the streams of research on which the current paper directly builds.

One such stream of research has used case studies or surveys to assess both the magnitude of this impact and the channels through which it flows. Mansfield (1990) directly interviewed industrial research directors to obtain their assessments of the impact of academic research on industrial R&D, finding that this impact is substantial across a wide range of industries. Cohen et al. (1994) have continued in this tradition, surveying a large cross-section of firms on the impact of academic science on their own research productivity and the means by which these knowledge flows are mediated. Other qualitative studies of this phenomenon include Faulkner and Senker (1995).

A second stream of research has attempted to quantitatively assess the real effects of academic research. Jaffe (1989) examined the impact of university R&D spending on the patenting of "technologically proximate" industrial firms. Adams (1990) studied the impact of basic research by relating lagged measures of scientific output (as measured by counts of papers) to movements in productivity measures. Jaffe et al. (1993, 1996, 1998) have studied "knowledge spillovers" from academic science to industrial R&D by examining, among other things, trends in patenting by universities and the citations made to these university patents by other entities, including R&D-performing industrial firms.

A third stream of research has undertaken quantitative analysis of university-industry research collaboration. Zucker et al. (1998) and Cockburn and Henderson (1998, 2000) have used measures of direct collaboration (i.e., co-authored papers) between academic scientists and industrial R&D labs, finding that measures of firm research performance are correlated with measures of "connectedness" to academic science. A number of papers (Zucker et al., 1998; Audretsch and Stephan, 1996) have studied "start-up" activity linked to academic research or academic researchers.
Finally, several recent studies have examined university licensing of university-generated inventions (Barnes, Mowery, and Ziedonis, 1998; Mowery et al., 1998; Thursby and Thursby, 2000).

This paper uses patent citations to academic papers to measure "knowledge spillovers" between academic science and industrial R&D. It is not the first paper to use such data: Francis Narin and his collaborators pioneered the use of these data in large-scale statistical analysis (see Narin et al., 1997), and, in fact, the patent citation data used in this paper were originally generated by Narin's firm, CHI Research. However, this paper takes a very different approach to the data than has Narin's work. I focus solely on patent citations to academic papers authored by scientists affiliated with the campuses and laboratories of the University of California (UC) system and Stanford University.[1] In contrast, Narin's work has looked at a much broader sample of citations to multiple universities, public R&D labs, and academic publications generated by private firms.

The limited scope of my analysis allows me to subject individual citations, and the patents in which they appear, to a high level of scrutiny. Using data on the residence of the inventors named in the patent, I am able to examine issues of geographic localization of knowledge spillovers. By matching citing patents to a control group of non-citing patents, I am able to study aspects of individual patents which are correlated with citations, and the distribution of citations over time and "technology space" as well as geographic space. Finally, I am able to identify "highly cited" academic scientists and "intensively citing" industrial firms, who are then interviewed to facilitate a richer understanding of the kinds of interactions between academia and industry that generate the observed citations. This complementary fieldwork is directly inspired by Jaffe, Fogarty, and Banks (1999).[2]

[1] Data on citations to the University of Southern California have been acquired but not yet analyzed. In the future, I hope to acquire and utilize data on the California Institute of Technology as well. The inclusion of data from these institutions would cover almost all patent citations made to academic papers over my sample time period. I acknowledge that the scope of the present study does not quite live up to what is implied by its title!

[2] Future drafts will include a section which discusses the results of these field interviews.

II. Citations of Academic Papers as Indicators of Knowledge Spillovers

Since the contribution of this paper lies, to a great extent, in the data being used, it is worthwhile to point out at the outset both the advantages and disadvantages of the data. The primary advantage is rather dramatically illustrated in Figure 1. This graph illustrates trends over the 1988-1997 period in several alternative indices of university research output and knowledge spillovers for the University of California's nine campuses and affiliated laboratories: university patents by issue year (patents), invention disclosures by year of disclosure filing (disclosures), new licenses of university technology by date of contract (licenses), the number of citations to previous university patents by issue year of the citing patent (citations to UC patents), and the number of citations to UC-generated academic papers by issue year of the citing patent (citations to UC papers). The last index towers over everything else, and it is growing (almost exponentially) over time, whereas the other indices are comparatively stagnant.
The clear implication is that there are far more data points to work with using this measure than any of the examined alternatives. Figure 2 gives a similar graph for Stanford, driving home the point.

Despite the passage of the Bayh-Dole Act, and despite the best efforts of eager university technology transfer officers to encourage this sort of activity, California's research universities still produce a relatively small number of patents.[3] While the numbers have increased over time, the results of Henderson et al. (1998) on a national sample of university patents suggest that more marginal inventions are being patented, such that the average quality of university patents is actually declining. Likewise, few patented inventions are licensed, and only a handful of these ever generate substantial revenues. Thursby and Thursby (2000) suggest that, nationwide, university licensing efforts may also be running into diminishing returns.

[3] The number of patents is small relative to what a "patent-intensive" firm might be expected to produce with the same level of R&D spending. It is not small relative to the levels of patenting in other research university systems. It is worth pointing out that UC had instituted procedures for the disclosure and licensing of university inventions which preceded the Bayh-Dole Act by several decades, so the impact of the Act is, perhaps, less evident here. See Mowery et al. (1998) for more detail and an institutional history of technology licensing at UC and Stanford.

However, a focus on patents or licensing may miss important channels by which academic institutions provide useful technological information to industrial innovators. Not all academic disciplines produce intellectual outputs that lend themselves to patent protection. It is also probably true that few academics possess the traits and skills that would enable them to bring an innovation from the initial "concept" stage all the way to successful licensing contract negotiations with a potential manufacturer. In this context, it is perhaps not surprising that attempts to expand university patenting and licensing from their initial low levels already bear evidence of diminishing returns.

On the other hand, the academic promotion system creates strong incentives for academic scientists to publish all results of scientific merit. Thus, UC and Stanford generate thousands of papers annually. If we wish to measure the impact of university research, then perhaps we should take as our starting point the broadest measure of university research output: academic publication. To the extent that citations to these publications reflect knowledge spillovers, Figures 1 and 2 would seem to imply that the spillovers are growing in importance in a way that is consistent with the widely held conviction that industrial research is increasingly building upon academic science.

I will argue throughout this paper that the available evidence suggests that these citations do reflect knowledge spillovers. This view is strongly supported by my initial field interviews of highly cited scholars and intensively citing firms. However, the reader may find large-sample survey evidence more persuasive. Such evidence is presented in Cohen et al. (1994), and Table 1 summarizes a larger table in their paper which conveys the same message.
When asked by what means they obtain useful research results from academia which can serve as inputs to their own R&D process, industrial R&D directors across a wide range of industries consistently listed the academic literature as one of the most important channels. Averaging across industries, academic publications are the single most important source of such information. Note that publications play a particularly important role in the pharmaceutical industry. Faulkner and Senker (1995) present similar evidence from in-depth interviews of executives in the biotechnology industry, summarized in Table 2. Research by Cockburn and Henderson (1998) and Zucker et al. (1998, 1999) has stressed the role of direct university-industry research collaboration in promoting information flows in the pharmaceutical and biotechnology industries. While nothing in this paper calls into question the view that direct collaboration is important, the cited survey evidence and my own fieldwork suggest that direct collaboration is not the only significant channel of such information flows in these sectors.

That being said, these citations do possess some disadvantages, some of which they share with the use of patent citations to other patents. As Jaffe et al. (1998) have stressed, patent citations can appear for reasons that have little or nothing to do with knowledge spillovers. This general truth also applies to citations to the scientific literature, though with perhaps less force. The legal obligation to cite relevant "prior art" generates citations where there were no "spillovers," and citations can be added ex post by parties other than the inventor. Tackling this problem head-on, Jaffe et al. (1998) present the results of a field study suggesting that, despite the presence of substantial "noise" in patent citation data, there is enough "signal" in them to support inference of knowledge spillovers, particularly when such inference is based on large numbers of patents from multiple organizations. My own fieldwork suggests that a similar result obtains with citations to the scientific literature, but caution in making that inference is clearly warranted.

In my view, the other primary disadvantage (again, a disadvantage shared with citations to previous patents) is that the appearance of citations does not directly lead to an accounting, in dollar terms, of the economic benefits created by university research. University licensing revenues may be relatively small numbers, but at least they are dollar values. No research results presented in this paper will get us anywhere close to such a dollar-value accounting, but I do sketch out in section V how we might use citations as an intermediate step toward such an accounting.

The final disadvantage is specific to this study: I am, after all, only looking at citations to the publications of a single public university system and a single private university. I would like to make the argument that these are uniquely important, productive institutions, but I do not make any claim that the results presented in this paper are generalizable to all leading research universities, much less to noncorporate scientific institutions in general.

The rest of the paper proceeds as follows. First, I describe the citations data along several dimensions in section III. I also briefly discuss the fieldwork component of this study. Next, I conduct some preliminary econometric analysis.
This requires me to merge my citations data with other data on the geographic location and technology class of the citing patent, and to combine this merged data set with a matched set of "nonciting" control patents. I describe the data creation process, the statistical models used to analyze the resulting data, and my preliminary results in section IV. In section V, I go on to describe other ways in which these citations data might be used. Section VI presents some (very) preliminary conclusions.

III. Describing the Data

Again, the reader is referred to Figures 1 and 2, which give the time series trends for several alternative indices of "knowledge flows" from UC and Stanford. Figures 3 and 4 aggregate citations over time, but break them out across seven major fields of science for UC and Stanford, respectively. Biotechnology-related fields of science (biomedical research, clinical medicine) constitute a very large portion of the total. They also account for much of the recent increase in citations to academic research, as is illustrated by Figure 5 for UC. The figure for Stanford is similar, unless one breaks off Stanford Medical School, in which case "engineering" and "physics" related technologies are the main drivers of changes over time, as is illustrated by Figure 6. Figure 7 breaks down citations according to the campuses and UC-affiliated laboratories with which the cited scholar was affiliated at the time of paper publication.

Clearly, medical schools are important drivers of overall citations. However, "biotech-related" fields are important outside of medical schools in the UC system. Even at UC-Berkeley, which possesses one of the nation's strongest engineering faculties, "engineering" and "physics" related citations are less numerous than "biotech-related" citations. That being said, one sees increases over time in citations to fields such as "engineering" and "physics," particularly at Berkeley and Stanford. Many of these patents and the cited papers are connected to developments in electrical engineering and computer science.

The relative dominance of biotech citations mirrors trends in the aggregate citations data noted by Narin et al. (1997), so it is not an artifact of my sample. Interestingly, it also mirrors trends in the distribution of university patents and licenses across fields. Though the numbers involved are much smaller, the studies of UC patenting and licensing conducted by Mowery et al. and Mowery and Ziedonis also demonstrate the dominant role of biotech-related invention. Biotech also plays a strong role in the patenting and licensing of Stanford and Columbia.

Mowery et al. suggest several reasons for this. First, they point to the large share of federal (and state) research funding focused on the life sciences, a trend which stretches back to the 1970s. In part, the output measures represent the return to decades of sustained public R&D investment in biomedical research at leading universities. These authors also suggest that industrial research in biotechnology and pharmaceuticals is now "closer" to academic research. In other words, the process of product invention and development now builds much more closely and directly on academic bioscience than it used to, a development discussed at length in the work of Zucker et al., Cockburn and Henderson, and Gambardella (1995). Evidently, these same effects are reflected in my citations data.

It is also of interest to look at the (unconditional) distribution of citations in space and time.
Space constraints prevent more than a cursory glance at this, but the reader is referred to Figure 8, which provides a three-dimensional representation of the incidence of citation in geographic space for the entire United States. The height of the cones represents the number of patents citing UC-Berkeley-generated papers within a particular U.S. county, as identified by the address of the first inventor listed on the patent application. Note the concentration of citations in California and in the Northeastern research/industrial corridor: a "bicoastal" pattern that Mowery and Ziedonis also find in citations to UC-generated patents. Figure 9 presents a similar distribution for California counties only.

Of course, the underlying geographic distribution of research activity is also quite skewed, and any formal investigation of the geographic localization of knowledge spillovers would have to control for this. Following Jaffe et al. (1993), I conducted a formal test of the geographic localization of knowledge spillovers by matching each citing patent with a nonciting "control" patent issued on roughly the same date in the same patent class as the citing patent. Let p_c be the probability that a citation comes from the same county as that in which the cited university campus is located. Let p_0 be the corresponding probability for a randomly drawn control patent. I test for localization using the following test statistic:

    t = \frac{\hat{p}_c - \hat{p}_0}{\sqrt{\left[\hat{p}_c(1-\hat{p}_c) + \hat{p}_0(1-\hat{p}_0)\right]/n}}

where \hat{p}_c and \hat{p}_0 are the sample proportion estimates of p_c and p_0. The null hypothesis that p_c = p_0 is easily rejected at conventional levels.
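To make the mechanics of this test concrete, the sketch below computes the statistic from two matched vectors of same-county indicators. The counts are simulated placeholders rather than the actual sample; the function is simply the formula above.

```python
import numpy as np

# Two-proportion test of geographic localization, following Jaffe et al. (1993).
# same_cnty_cite / same_cnty_ctrl: indicator for whether the (citing / matched
# control) patent's first inventor resides in the county of the cited campus.
def localization_t(same_cnty_cite, same_cnty_ctrl):
    n = len(same_cnty_cite)                # one matched control per citing patent
    pc = np.mean(same_cnty_cite)           # estimate of p_c
    p0 = np.mean(same_cnty_ctrl)           # estimate of p_0
    se = np.sqrt((pc * (1 - pc) + p0 * (1 - p0)) / n)
    return (pc - p0) / se

rng = np.random.default_rng(42)
cite = rng.uniform(size=5000) < 0.12       # hypothetical: 12% of citers co-located
ctrl = rng.uniform(size=5000) < 0.04       # hypothetical: 4% of controls co-located
print(localization_t(cite, ctrl))          # far beyond conventional critical values
```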
The (unconditional) distribution of time lags between the publication date of the paper and the application (filing) date of the patent is given in Figure 10.[4] The shape of this distribution resembles the double-exponential curves estimated by Jaffe and Trajtenberg. The modal citation lag is quite short, but the distribution is heavily skewed to the right, with nontrivial numbers of citations being made to fairly old papers. In a very small number of cases, the lag is negative, suggesting that academic references were added to a patent application after the initial filing.

[4] This is for citations to UC papers only.

Fieldwork

Since the "raw" citations data make references to specific scholars, it is relatively easy to identify on each UC campus "highly cited" scholars: that is, scholars whose work is frequently cited in patents, including patents assigned to firms. I have interviewed a number of these scholars about their cited research, showing them in the interview a comprehensive list of the patent citations to their work and asking them about the possible technological linkages between these patented inventions and their papers, as well as any relationship or connection they might have to the citing organization. While some citations arise in the context of a formal relationship between the cited scholar and the citing organization, most do not. Interviewed scholars are often surprised to find their work cited in the patents of particular industrial firms. However, upon a close reading of the patent abstract, they are often able to identify a plausible technological linkage between the patented invention and the cited research.

There is considerable variance among highly cited scholars in terms of the extent to which they have attempted to profit financially from their own research. Some act as consultants to firms, some deliberately seek corporate funding for their labs, and a number of scholars have obtained one or more patents protecting an invention. However, many highly cited scholars have done none of these things. The common denominator among all cited scholars and cited papers is scientific quality. The scholars are frequently the leading intellectual lights in their departments, and the cited papers often represent either their most important scientific contributions or a methodological advance with widespread application.

I have also contacted industrial R&D managers of "frequently citing" firms, in order to examine the "knowledge flow" process from the perspective of the firm. The managers I have interviewed generally accept the view that patent citations reflect knowledge spillovers, although one corporate patent attorney emphasized that some references to the scientific literature are added ex post or for defensive reasons. R&D managers emphasized the role of the scientific literature as a vehicle for knowledge flow, but they also stressed the importance of cultivating long-term relationships with key academic experts, which was often reflected in the citation patterns. Interestingly, co-authorship did not receive much emphasis in my discussions with R&D managers. The universal view among interviewees on the corporate side was that the rise in the incidence of citation of academic research represents a real "convergence" of research in industry and academia rather than merely a change in citation practices or the computerization of scientific literature indices.

IV. Econometric Analysis

Data Construction and Basic Approach

The parallels between this research project and the analysis of patent citations pursued by Jaffe, Trajtenberg, and Henderson suggest the use of similar methodology. While I would like to employ a citations function approach, data availability precludes it. I do not have good information on the number of potentially citable papers being generated by UC and Stanford by scientific field.[5] I only observe papers that are cited at least once. In principle, measures of academic publications could be created from publicly available sources, but obtaining the data and matching them to the correct campus is likely to be a long, expensive undertaking.

Instead, I take the following approach. After matching my initial data on citations to additional data on the citing patents, I construct a random sample of nonciting patents drawn from the same set of issue years, 1988-1997.[6] The presence of this "control group" of nonciting patents enables me to conduct statistical analysis of the likelihood of a given patent making citations to UC (or Stanford) academic research as a function of the characteristics of the citing patent (including, by extension, the characteristics of its named first inventor) and the cited paper (including, by extension, the campus affiliation of the authors).

[5] I thank Jim Adams for providing me with his data on paper counts, which include information on some of the campuses I study. Unfortunately, there is only a limited overlap between the time series dimension of his data and the years of my sample of patent citations.

[6] This component of my research relied heavily on data from the REI data base at Case Western Reserve University. I am grateful to Adam Jaffe and Michael Fogarty for access to these data.
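To illustrate the control-matching step, here is a minimal pandas sketch that draws, for each citing patent, one nonciting control from the same patent class and issue period. The records and column names are invented for illustration; the actual construction relied on the REI data base.

```python
import pandas as pd

# Hypothetical universe of patents: issue quarter, primary class, citation flag.
patents = pd.DataFrame({
    "patent_id": range(10),
    "quarter":   ["1992Q1"] * 5 + ["1992Q2"] * 5,
    "pclass":    ["435", "435", "514", "514", "435",
                  "435", "514", "435", "514", "514"],
    "cites_uc":  [1, 0, 1, 0, 0, 1, 0, 0, 1, 0],
})

def match_control(row, pool):
    # candidates: non-citing patents in the same class and issue quarter
    cands = pool[(pool["cites_uc"] == 0) &
                 (pool["pclass"] == row["pclass"]) &
                 (pool["quarter"] == row["quarter"])]
    return cands.sample(1, random_state=0)["patent_id"].iloc[0] if len(cands) else None

citing = patents[patents["cites_uc"] == 1]
controls = citing.apply(match_control, axis=1, pool=patents)
print(pd.DataFrame({"citing": citing["patent_id"].values, "control": controls.values}))
```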
Taking this step involves collapsing my sample of nearly 40,000 citations down to the much smaller number of unique citing patents, many of which make more than one reference to academic publications from UC and/or Stanford. At the moment, econometric work has focused on UC-citing patents. Results for Stanford will be integrated at a later date.

The integer nature of the number of citations to academic publications made per patent, which will be the dependent variable, calls for the use of count data models. Regression analysis based on the standard Poisson and negative binomial models has become increasingly familiar, so no derivation will be given for these models, and they will be estimated as a "benchmark."[7] However, key features of the data will require me to modify the likelihood functions of the benchmark models. Conceptually, one can think of the likelihood of observing a citation to academic research as being a function of an unobserved latent variable: "proximity to academic science."[8] Conditional on a patent being sufficiently "close" to academic research for a citation to take place, one may observe the patent making anywhere from 1 to as many as 38 citations to academic papers. The actual number of citations made will be a function of attributes of the citing patent (such as geographic and temporal distance from the relevant research) and of the cited research and the campus where it was conducted. This seems to call for a specification analogous to the "Tobit" model, but one set up to handle the "integer" or "count" nature of the dependent variable when citations are actually observed.

[7] The classic reference is Hausman, Hall, and Griliches (1984).

[8] I am grateful to Adam Jaffe for discussions on these issues.

A statistical model which comes close to meeting these requirements is the so-called hurdle Poisson model and its generalization, the hurdle negative binomial model. An alternative formulation with some similarities to the hurdle Poisson is the "zero-inflated Poisson" model, which I utilize in the current draft. Alternatively, I can conduct econometric analysis using only those patents which cite a UC academic paper at least once. This implies a Poisson (or negative binomial) distribution which is truncated from below (i.e., at zero). The basic features of these models are described in the next section.
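Before turning to those derivations, the benchmark models themselves are entirely standard. A minimal sketch, using simulated data as a stand-in for the patent sample and the statsmodels implementations of the Poisson and negative binomial estimators:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated stand-in for the patent-level data: one row per citing patent.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "dist":   rng.uniform(0, 2, n),        # distance to the cited campus
    "dstate": rng.integers(0, 2, n),       # same-state dummy
    "dcnty":  rng.integers(0, 2, n),       # same-county dummy
})
mu = np.exp(0.2 - 0.1 * df["dist"] + 0.4 * df["dstate"] + 0.3 * df["dcnty"])
df["ncites"] = rng.poisson(mu * rng.gamma(2.0, 0.5, n))   # overdispersed counts

X = sm.add_constant(df[["dist", "dstate", "dcnty"]])
poisson_fit = sm.Poisson(df["ncites"], X).fit(disp=0)
negbin_fit = sm.NegativeBinomial(df["ncites"], X).fit(disp=0)  # also estimates alpha
print(poisson_fit.params)
print(negbin_fit.params)
```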
Sketch Derivation of the Estimation Techniques

A complete derivation of these models is given in Cameron and Trivedi (1998). Here I present only the essential features, beginning with truncated Poisson and negative binomial distributions and the implications of truncation for empirical analysis. This brief description draws heavily on Cameron and Trivedi (1998) and uses their notation. Let

    H(y_i, \theta) \equiv \Pr[Y_i \le y_i]                                            (1)

denote the CDF of the discrete random variable Y_i with PDF h(y_i, \theta), where \theta is a parameter vector. In my application, of course, y_i will be the number of citations to academic research made by patent i. If realizations of y less than a positive integer r (in our case, 1) are omitted, the ensuing distribution is given by

    f(y_i, \theta \mid y_i \ge r) = \frac{h(y_i, \theta)}{1 - H(r-1, \theta)}          (2)

One special case would be the left-truncated negative binomial, for which

    h(y_i, \theta) = \frac{\Gamma(y_i + \alpha^{-1})}{\Gamma(\alpha^{-1})\,\Gamma(y_i + 1)}\,(\alpha\mu_i)^{y_i}\,(1 + \alpha\mu_i)^{-(y_i + \alpha^{-1})}    (3)

where \theta = (\mu_i, \alpha). The truncated mean and variance are given by

    \theta_i = \mu_i + \mu_i \delta_i
    \sigma_i^2 = \theta_i + \alpha\theta_i^2 - \delta_i(\theta_i - r)                  (4)

where

    \delta_i = \mu_i\left[1 + \alpha(r-1)\right]\delta(r-1, \mu_i, \alpha), \qquad \delta(r-1, \mu_i, \alpha) = \frac{h(r-1, \mu_i)}{1 - H(r-1, \mu_i)}

In a similar way, the truncated mean and variance of the Poisson distribution can be derived as a limiting case of the above, where \alpha \to 0. Skipping from the general to the specific, the mean and variance of the Poisson distribution truncated at zero are

    E[y_i \mid y_i > 0] = \frac{\mu_i}{1 - e^{-\mu_i}}                                 (5)

and

    V[y_i \mid y_i > 0] = E[y_i \mid y_i > 0]\left(1 - \Pr[y_i = 0]\,E[y_i \mid y_i > 0]\right) = \frac{\mu_i}{1 - e^{-\mu_i}}\left[1 - \frac{\mu_i e^{-\mu_i}}{1 - e^{-\mu_i}}\right]    (6)

A more general negative binomial distribution truncated at zero has the following first two moments:

    E[y_i \mid y_i > 0] = \frac{\mu_i}{1 - (1 + \alpha\mu_i)^{-1/\alpha}}              (7)

and

    V[y_i \mid y_i > 0] = \frac{\mu_i}{1 - (1 + \alpha\mu_i)^{-1/\alpha}}\left[1 + \alpha\mu_i - \frac{\mu_i(1 + \alpha\mu_i)^{-1/\alpha}}{1 - (1 + \alpha\mu_i)^{-1/\alpha}}\right]    (8)

Note that the truncated Poisson, unlike the standard Poisson model, does not have equal first and second moments. As pointed out by Cameron and Trivedi (1998), the first conditional truncated moment depends on the correct probability of a zero value, so if the parent distribution is incorrectly specified, this moment will also be misspecified, resulting in inconsistent parameter estimates.

The left-truncated Poisson model can be estimated by maximum likelihood. Based on n independent observations, the log-likelihood is

    L(\beta) = \sum_{i=1}^{n}\left\{ y_i \ln\mu_i - \mu_i - \ln\!\left[1 - \exp(-\mu_i)\sum_{j=0}^{r-1}\frac{\mu_i^j}{j!}\right] - \ln(y_i!)\right\}    (9)

and the MLE of \beta is the solution of

    \sum_{i=1}^{n}\left(y_i - \mu_i - \delta_i\right)\frac{1}{\mu_i}\frac{\partial\mu_i}{\partial\beta} = 0    (10)

where

    \delta_i = \frac{\mu_i\,h(r-1, \mu_i)}{1 - H(r-1, \mu_i)}                          (11)
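Equation (9) with r = 1, the zero-truncated case relevant here, is straightforward to maximize directly. The following sketch codes the truncated log-likelihood and recovers the parameters of a simulated data generating process even though every zero count is discarded; the data are artificial, not the citation sample.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

# Zero-truncated Poisson negative log-likelihood, eq. (9) with r = 1:
#   sum_i [ y_i ln(mu_i) - mu_i - ln(1 - exp(-mu_i)) - ln(y_i!) ],  mu_i = exp(x_i'b)
def ztp_negloglik(beta, y, X):
    mu = np.exp(X @ beta)
    ll = y * np.log(mu) - mu - np.log(1.0 - np.exp(-mu)) - gammaln(y + 1.0)
    return -ll.sum()

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one covariate
beta_true = np.array([0.3, 0.5])
y = rng.poisson(np.exp(X @ beta_true))
keep = y > 0                                           # truncation: only citers observed
res = minimize(ztp_negloglik, np.zeros(2), args=(y[keep], X[keep]), method="BFGS")
print(res.x)   # close to beta_true despite the discarded zeros
```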
While the preceding models are appropriate for exploring patents which cite an academic paper at least once, much could be learned from a model which can accommodate a sample containing patents which never cite such papers (the "control" group) as well as patents which cite papers once or multiple times. Two such models exist in the received econometrics literature: the "hurdle" Poisson model and the "zero-inflated" Poisson model, both of which have a more general negative binomial version.

In essence, a hurdle model of either variety is a finite mixture generated by combining the zeros generated by one density with the zeros and positives generated by a second, zero-truncated density. The moments are determined by the probability of crossing the zero "threshold" and by the moments of the second density. In mathematical notation,

    E[y \mid x] = \Pr[y > 0 \mid x]\,E[y \mid y > 0, x]                                (12)

For a concrete example, consider the negative binomial hurdle model, which I will estimate in subsequent drafts of the paper. Let \mu_{1i} = \exp(x_i'\beta_1) be the negative binomial mean parameter for the case of zero counts. Similarly, let \mu_{2i} = \exp(x_i'\beta_2) for the positive set J = {1, 2, ...}. Further define the indicator function 1[y_i \in J] = 1 if y_i \in J and 1[y_i \in J] = 0 if y_i = 0. From the negative binomial with a quadratic variance function, the following probabilities can be obtained:

    \Pr[y_i = 0 \mid x_i] = (1 + \alpha_1\mu_{1i})^{-1/\alpha_1}                       (13)

    \Pr[y_i \in J \mid x_i] \equiv \sum_{y_i \in J} h(y_i \mid x_i) = 1 - (1 + \alpha_1\mu_{1i})^{-1/\alpha_1}    (14)

    \Pr[y_i \mid x_i, y_i > 0] = \left[1 - (1 + \alpha_2\mu_{2i})^{-1/\alpha_2}\right]^{-1}\frac{\Gamma(y_i + \alpha_2^{-1})}{\Gamma(\alpha_2^{-1})\,\Gamma(y_i + 1)}\,(\alpha_2\mu_{2i})^{y_i}\,(1 + \alpha_2\mu_{2i})^{-(y_i + \alpha_2^{-1})}    (15)

Equation (13) gives the probability of zero counts, equation (14) gives the probability of "crossing the threshold," and equation (15) is the truncated-at-zero distribution. The log-likelihood function splits into two components, such that

    L_1(\beta_1, \alpha_1) = \sum_{i=1}^{n}\left(1 - 1[y_i \in J]\right)\ln\Pr[y_i = 0 \mid x_i] + \sum_{i=1}^{n} 1[y_i \in J]\ln\left(1 - \Pr[y_i = 0 \mid x_i]\right)    (16)

and

    L_2(\beta_2, \alpha_2) = \sum_{i=1}^{n} 1[y_i \in J]\ln\Pr[y_i \mid x_i, y_i > 0]    (17)

so that

    L(\beta_1, \beta_2, \alpha_1, \alpha_2) = L_1(\beta_1, \alpha_1) + L_2(\beta_2, \alpha_2)    (18)

Note that this model contains a critical assumption: the component of the likelihood function which determines whether or not a nonzero realization of the dependent variable occurs is separable from, and estimated independently of, the component which determines the count of the dependent variable, conditional on that count being greater than zero. This conveys a practical advantage, as it makes estimation easier. However, in the context of my study, that assumption is potentially problematic. Intuitively, "proximity to academic science" could influence not only the likelihood of a citation occurring but also the number of citations actually made.
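Because of the separability in equation (18), the two parts of a hurdle model can be estimated independently and their log-likelihoods summed. The sketch below illustrates this two-part logic with a logit standing in for the zero process of equations (13)-(14) and a zero-truncated Poisson in place of the truncated negative binomial of equation (15); the data and functional forms are simulated simplifications, not the specification estimated in the paper.

```python
import numpy as np
import statsmodels.api as sm
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(3)
n = 3000
x = rng.normal(size=n)
X = sm.add_constant(x)

# Part 1: does the patent cite at all? (crossing the hurdle)
p_pos = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))
cross = rng.uniform(size=n) < p_pos

# Part 2: positive counts from a truncated-at-zero Poisson
mu = np.exp(0.4 + 0.3 * x)
y = np.zeros(n, dtype=int)
for i in np.where(cross)[0]:          # rejection-sample a draw >= 1
    draw = 0
    while draw == 0:
        draw = rng.poisson(mu[i])
    y[i] = draw

part1 = sm.Logit(cross.astype(float), X).fit(disp=0)

def ztp_negloglik(beta, y, X):        # zero-truncated Poisson, as in eq. (9)
    m = np.exp(X @ beta)
    return -(y * np.log(m) - m - np.log(1 - np.exp(-m)) - gammaln(y + 1)).sum()

pos = y > 0
part2 = minimize(ztp_negloglik, np.zeros(2), args=(y[pos], X[pos]), method="BFGS")
# By eq. (18), the full hurdle log-likelihood is just part1.llf + (-part2.fun).
print(part1.params, part2.x)
```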
Lambert (1992), among others, has introduced an alternative to the hurdle approach. Consider the following:

    \Pr[y_i = 0] = \varphi_i + (1 - \varphi_i)e^{-\mu_i}                               (19)

    \Pr[y_i = r] = (1 - \varphi_i)\frac{e^{-\mu_i}\mu_i^r}{r!}, \quad r = 1, 2, \ldots    (20)

where \varphi_i is the probability of an "excess" zero. Lambert defines \varphi_i = \varphi(z_i, \gamma) and proposes parameterizing \varphi_i as a logistic function of an observable vector of covariates z, thus ensuring that \varphi_i lies in the unit interval; that is,

    y_i = 0 with probability \varphi_i
    y_i \sim \text{Poisson}(\mu_i) with probability (1 - \varphi_i)                    (21)
    \varphi_i = \frac{\exp(z_i'\gamma)}{1 + \exp(z_i'\gamma)}

I will follow Lambert in using the logistic functional form for \varphi_i. Let 1(y_i = 0) denote an indicator function that takes the value 1 if y_i = 0 and zero otherwise. The joint log-likelihood, after omitting constants, is given by

    L(\gamma, \beta) = \sum_{i=1}^{n} 1(y_i = 0)\ln\left[\exp(z_i'\gamma) + \exp(-\exp(x_i'\beta))\right] + \sum_{i=1}^{n}\left(1 - 1(y_i = 0)\right)\left(y_i x_i'\beta - \exp(x_i'\beta)\right) - \sum_{i=1}^{n}\ln\left(1 + \exp(z_i'\gamma)\right)    (22)

Here, too, I am assuming functional independence of the \varphi_i and \mu_i components of the joint likelihood function. To the extent that this assumption is questionable, the results will need to be interpreted with appropriate caution. Going further, it is clear that behind the data generating processes producing the patents and citations in my sample are inventors choosing where, in the technology space, to conduct research. Some inventors may deliberately choose to work in regions of the technology space where a rich foundation of prior academic research makes commercial R&D more productive. This suggests problems of endogeneity that neither the "hurdle" models nor the ZIP model would be able to handle. For these reasons, the econometric results contained herein are not presented as the last word, nor are the estimated coefficients given strong structural interpretations. At this stage, I am using these regressions to describe multivariate correlations in the data; no more than that.
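For reference, the zero-inflated Poisson model of equations (19)-(22) is available off the shelf. A sketch using the statsmodels implementation (assuming a version that provides ZeroInflatedPoisson), again on simulated data, with a single hypothetical patent-class dummy driving the inflation equation:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

# Simulated stand-in for the pooled sample of citing plus control patents.
rng = np.random.default_rng(2)
n = 3000
dist = rng.uniform(0, 2, n)
dstate = rng.integers(0, 2, n)
dcat1 = rng.integers(0, 2, n)                          # hypothetical class dummy
X = sm.add_constant(np.column_stack([dist, dstate]))   # count (mu) equation
Z = sm.add_constant(dcat1.astype(float))               # inflation (phi) equation

phi = 1.0 / (1.0 + np.exp(-(0.5 - 2.0 * dcat1)))       # logistic phi_i, eq. (21)
mu = np.exp(0.3 - 0.2 * dist + 0.5 * dstate)
y = np.where(rng.uniform(size=n) < phi, 0, rng.poisson(mu))  # inflated zeros

zip_fit = ZeroInflatedPoisson(y, X, exog_infl=Z, inflation="logit").fit(disp=0)
print(zip_fit.params)   # inflation coefficients first, then count coefficients
```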
Specifications Used

Recall that, in the preceding Poisson-based equations, \mu_i = \exp(x_i'\beta) defines the "exogenous" variables used and the regression parameters estimated; estimation of negative binomial models involves the estimation of the additional parameter \alpha. My "baseline" specification is

    \mu_i = \exp\left(\beta_1 + \beta_2\,dist_i + \beta_3\,dstate_i + \beta_4\,dcnty_i + \sum_o \beta_o O_{oi} + \sum_c \beta_c C_{ci} + \sum_f \beta_f F_{fi} + \sum_L \beta_L L_{Li}\right)    (23)

where dist is a measure of linear distance between a cited campus (e.g., UC-Berkeley) and the "location of invention" of the citing patent, which is presumed to be the county containing the address of the first inventor listed on the patent document. When a given patent document cites more than one California campus, the distance measure is an average of the distances to each cited campus, weighted by the number of citations made to each campus. In specifications in which matched control patents are used, distance is measured between the location of the control patent and the campus cited by the citing patent to which the control patent is matched.

Since the impact of geographic distance on knowledge spillovers is unlikely to be linear, I also include two dummy variables denoting significant geographic boundaries. Dstate is a dummy variable equal to 1 if the citing patent (or matched control) and the cited campus are located in the same state. Dcnty is a similarly constructed dummy variable equal to 1 if the citing patent and cited campus are located in the same county. When a given patent cites more than one campus, these dummy variables are set equal to 1 if any of the cited campuses is located in the same state or county. The coefficients on these three variables should provide some sense of the extent of geographic localization of knowledge spillovers, controlling for other attributes of the citing patents and cited campuses.

The O's are a set of dummy variables corresponding to the organizational form of the assignee of the citing patent, based on an assignee classification system developed by Meg Fernando, former administrator of the REI patent data base. These include government, universities, non-profit non-university research labs, and private firms, and they are incorporated into the specification on the grounds that different classes of assignees may have a differential propensity to cite academic papers. It is particularly useful to control for the higher propensity of university patents to cite the academic papers of the faculty inventor.

The C's are dummy variables for the different cited campuses, with medical schools distinguished from the main campuses. A patent citing more than one UC campus will have more than one of these dummy variables equal to 1. Likewise, the F's are dummy variables for the scientific field of the cited paper, where I borrow a categorization of scientific disciplines developed by CHI Research, Inc.: biology, chemistry, biomedical research, clinical medicine, earth and space sciences, engineering and technology, physics, mathematics, and psychology. This allows me to control for field effects in estimating the differential "citedness" of different campuses, and also to control for campus effects in estimating the differential "citedness" of different groups of academic disciplines. When more than one paper is cited, all relevant dummy variables are set equal to 1.

Finally, I want to get some sense of how the temporal distance between the cited paper and the citing patent affects the probability of citation. In practice, I do this by defining a set of lag dummy variables (the L's), each set equal to 1 when the citing patent cites a paper whose publication year preceded the patent application year by that lag amount. Where more than one paper is cited, more than one lag dummy variable will be set equal to 1.
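A hypothetical sketch of how the right-hand-side variables of equation (23) might be assembled from inventor and campus locations follows; the coordinates, county names, and column names are invented, and great-circle distance stands in for whatever linear distance measure underlies the actual data construction.

```python
import numpy as np
import pandas as pd

# Hypothetical citing-patent records: first-inventor and cited-campus locations
# plus the publication-to-application lag in years.
df = pd.DataFrame({
    "inv_lat": [37.9, 40.7], "inv_lon": [-122.3, -74.0],
    "campus_lat": [37.87, 37.87], "campus_lon": [-122.26, -122.26],
    "inv_state": ["CA", "NY"], "campus_state": ["CA", "CA"],
    "inv_cnty": ["Alameda", "New York"], "campus_cnty": ["Alameda", "Alameda"],
    "lag": [2, 15],
})

def haversine_km(lat1, lon1, lat2, lon2):     # great-circle distance
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

df["dist"] = haversine_km(df["inv_lat"], df["inv_lon"],
                          df["campus_lat"], df["campus_lon"])
df["dstate"] = (df["inv_state"] == df["campus_state"]).astype(int)
df["dcnty"] = (df["inv_cnty"] == df["campus_cnty"]).astype(int)
# one dummy per lag value present in the data (Dlag0-Dlag21 in the full sample)
lag_dummies = pd.get_dummies(df["lag"].clip(0, 21), prefix="Dlag")
print(df[["dist", "dstate", "dcnty"]].join(lag_dummies))
```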
Inference using these coefficients will need to keep in mind that the lag dummy variables are potentially picking up "cohort effects" of the citing patents and of the cited papers, as well as the impact of time lags per se. In later drafts, I hope to use a more sophisticated specification to obtain a cleaner estimate of the effect of time lags.

The ZIP models require that I specify a set of variables determining the probability of "noncitation"; that is, I need to define \varphi_i = \exp(z_i'\gamma)/(1 + \exp(z_i'\gamma)). A simple approach is to suppose that the propensity to cite academic science depends on where a patent lies in the "patent space." Some industrial technologies are quite proximate to academic scientific research; others are less so. If I make the rather heroic assumption that the location of a given patent in the patent space is exogenous, or at least think of it as predetermined, then I can use information based on the patent class assigned to the patent to define \varphi_i. As an initial step, I create a set of dummy variables (dcat1-dcat3) set equal to 1 if a patent is assigned to one of the classes which frequently cite biomedical research, clinical medicine, or electrical engineering, respectively. Thus

    z_i'\gamma = \gamma_0 + \gamma_1\,dcat1_i + \gamma_2\,dcat2_i + \gamma_3\,dcat3_i    (24)

Description of Initial Results

Initial regression results are reported in Tables 3 and 4. As a benchmark, Table 3 presents estimates from Poisson and negative binomial models on the truncated sample (that is, the sample of citing patents). The first three columns of regression results present coefficients, standard errors, and z-statistics, respectively, from a Poisson regression which includes only measures of geographic and temporal distance between the citing patent and the cited paper. The next three columns present the same information from a negative binomial regression which also includes campus, field, and organizational form effects. Due to space constraints, only the distance, time, and field coefficients are shown.

Several aspects of these results merit comment. Distance seems to matter, a result very much in line with previous work by Jaffe et al. and by Mowery et al. Being in the same state has a statistically significant impact on the probability of citation, as does being in the same county in the negative binomial results. On the other hand, the measures of linear distance do not have a significant impact on expected citations in either specification.

The coefficients on the lag dummy variables display a rather curious pattern. The "lag effects" peak at short lags between publication of the paper and application of the citing patent, then peak again at much longer lag lengths. One interpretation is that patented innovations benefit both from (1) recent publications which embody the latest results and (2) older publications which contain truly central, paradigm-shattering results. Another interpretation is that the estimates of the impact of lag length are confounded with cohort effects of the cited papers and citing patents. In future work, I hope to explore alternative specifications in order to unpack and separately measure these effects.

The field effects are strong and significant in the negative binomial estimates, and their pattern is what one might expect given the distribution of citations across fields. Biomedical research and clinical medicine have large, highly significant coefficients. Engineering/technology and physics have smaller, but still significantly positive, coefficients. Once field effects are controlled for, the differential citedness of the different UC campuses is much less pronounced. The "campus effect" coefficients are not shown for reasons of space, nor are the organizational coefficients. It is clear from the latter, though, that universities are more likely than other kinds of institutions to cite academic papers in their patents. In future drafts, I plan to present estimates from modified Poisson and negative binomial models which explicitly deal with the truncation in this subsample. At the time of this writing, these results are not yet available.

In Table 4, I present results on the full sample of citing patents plus the matched nonciting control patents. As a benchmark, I start with results from a zero-inflated Poisson model, with the \varphi_i function defined as in equation (24). These results are presented in the first three columns of Table 4, using the same format as in Table 3. Interestingly, the inclusion of nonciting patents seems to increase the measured impact of proximity on the level of citations. The dstate dummy variable's coefficient doubles in this specification. Including nonciting controls also seems to shift the "peak" of the estimated lag effects to longer lag lengths. The field effects are also much larger in this expanded sample, though care needs to be taken in interpreting these coefficients, as the field dummy variables of all control patents are set equal to zero.

The ZIP results include estimated coefficients on the variables used in the \varphi_i function. A Wald test resoundingly rejects the restriction that these parameters are equal to zero, and a "straight" Poisson model is therefore rejected in favor of the ZIP model. Similar results obtain with the ZINB model. Not surprisingly, the crude measures of "location in patent space" turn out to be good predictors of the probability of citation, but given the way these data were generated, we can hardly view that as confirmation of any explicit hypothesis. In addition to the coefficients reported, the Poisson specification also includes the "organizational" dummy variables as regressors. Briefly summarized, the coefficients on these variables indicate that individual assignees and corporate assignees are systematically less likely than other organizations to cite academic papers in their patents. (That being said, corporate assignees do account for the majority of citing patents.)

The final three columns of the table present results from a zero-inflated negative binomial model. In this specification, I use only data on corporate citing and control patents, and I do not attempt to separately estimate campus or field effects. This changes the sample, but it also gives a sense of the extent to which distance and lag effects vary for corporate citers of academic papers. By and large, the results are qualitatively similar to the other results. Note that, if anything, the impact of geographic proximity is even stronger here than elsewhere: being in the same county has an additional positive impact of even greater magnitude than being in the same state. Note also that the lag effects have a pattern similar to those generated in previous specifications. The "lag effects" from several specifications are graphed in Figure 11.
V. Next Steps

I have demonstrated that industrial patent citations to academic papers are numerous and increasing, I have suggested that these citations are indicative of knowledge spillovers, and I have attempted to trace out the paths of these spillovers across time, space, and technology class, controlling for various attributes of the citing patent and the cited academic research. What I have not done is give the reader any sense of the economic value, if any, created by these spillovers. Taking that next step would require an examination of what the citing organizations (especially citing firms) do with the knowledge they extract from academic science.

Cockburn and Henderson (1998) and Zucker et al. (1998) have undertaken studies which examine how firm innovative performance is affected by measures of "connectedness" to academic science. Following the lead of these earlier papers, it should be possible to compare the innovative output of frequently citing firms to that of less frequently citing or non-citing firms. Ceteris paribus, does the incorporation of academic science into the industrial innovation process lead to better outcomes? Do firms which utilize UC or Stanford academic science generate more and better patents, obtain higher levels of revenue and profit, and generate more value for shareholders over sustained periods of time than firms which do not? A quantitative, econometric investigation of this question holds out the possibility of being able both to demonstrate and to quantify the positive impact of public science. While it may never be possible to establish plausible "conversion factors" for turning paper or citation counts into dollar values, one might at least obtain some systematic sense of the difference university-industry research spillovers make through this sort of exercise.

Such an analysis would require a different approach to the data. One such approach would be to make the assignee, rather than the patent, the unit of analysis. In principle, I could obtain all the patents taken out by the assignees identified in our data base. For firms, assignee names could be linked to CUSIP or other firm identifier codes, and patenting data could be linked with R&D input, sales, and stock market data. Not only could such data be used to explore the impact of citation on innovative output, but they could also be used to study patterns of citation by particular assignees over time. How do firms learn where the good academic science is? Do they zero in on particular favored sources of academic science over time?

Armed with that knowledge, one can then return to the stream of research pursued by Adams and Griliches: the measurement of the output of science. A somewhat pessimistic observation of Adams and Griliches (1996) was that science itself, as measured by quality-adjusted paper counts, seems to show some evidence of diminishing returns. However, if more recent cohorts of papers are increasingly cited by industrial inventions, and if that higher incidence of citation is leading to more or better products and services, then a broader measure of scientific output may actually yield evidence of constant or even increasing returns. The marriage of measures of academic resource input (federal, state, and local R&D dollars) with patent-citation-augmented measures of academic scientific output may better enable us not only to track but also to optimize the transformation of research inputs into economic outputs.
VI. Conclusions

At this early stage in the research process, of course, any "conclusions" must remain quite tentative. Certainly, I hope to produce a second draft in short order which contains estimates based on all the models sketched out in section IV. I also hope to incorporate results which use data on citations to Stanford University and, in the longer run, the California Institute of Technology and the University of Southern California.

That being said, I think several points can be made even at this early stage. First, relative to other indicators of knowledge flow from academia to the private sector, citations to academic papers are relatively numerous, rich, and available across campuses and scientific disciplines. Quite simply, there is a great deal of information to be mined from this source, and the existing literature (much of it generated by Francis Narin) has only begun this process.

Second, many of the patterns that others have discovered in citations to university patents in general (Jaffe et al.) and citations to Stanford/University of California patents in particular are also reflected in citations to academic papers. That is, citations are concentrated in "biotech-related" fields, medical schools play a large role in generating cited research, and there is evidence of geographic localization of knowledge spillovers in the data.

Third, these data suggest that the temporal link between academic science and patented innovation is short. The modal lag in the raw data is only 2 years, and the pattern of "lag effects" in the econometric evidence also suggests that relatively recent science is a driving force behind patenting. While the marginal university patent may be less "idea-rich" than it used to be, there is no evidence that the marginal paper, particularly in the highly cited disciplines, is any less "idea-rich" than it used to be. In fact, since citations are increasing much faster than papers, the crude numbers and estimates presented herein would seem to suggest that the marginal "quality" (or at least "marginal relevance") of papers in at least some disciplines is increasing.

Finally, the preceding observations would seem to indicate that the research agenda sketched out in this paper holds considerable promise. Given the availability (at a price) of similar data for other major university systems, and the clear interest of university administrators in documenting and improving the rate at which they deliver useful technological information to the private sector, it is my hope that this paper will stimulate similar research by other scholars in other states. The creation of a "master" data set containing such data for the top 30 or so university systems would likely prove to be an extremely important and useful research tool. While such a data set may be beyond the reach of any individual scholar, it should be very much within the reach of the community of scholars involved in the NBER productivity program and similar groups.

Bibliography

Adams, J., 1990, "Fundamental Stocks of Knowledge and Productivity Growth," Journal of Political Economy 98: 673-702.

Adams, J. and Z. Griliches, 1996, "Research Productivity in a System of Universities," NBER working paper no. 5833.

Audretsch, D. and P. Stephan, 1996, "Company-Scientist Locational Links: The Case of Biotechnology," American Economic Review 86 (3).

Barnes, M., D. Mowery, and A. Ziedonis, 1998, "The Geographic Reach of Market and Nonmarket Channels of Technology Transfer: Comparing Citations and Licenses of University Patents," working paper.
Branstetter, L., 1999, "Is FDI a Channel of R&D Spillovers? Evidence from Japan's FDI in the U.S.," working paper, University of California, Davis.

Cameron, A. C. and P. Trivedi, 1998, The Regression Analysis of Count Data, Econometric Society Monograph No. 30, Cambridge: Cambridge University Press.

Cockburn, I. and R. Henderson, 1998, "The Organization of Research in Drug Discovery," Journal of Industrial Economics 46 (2).

Cohen, W., R. Florida, L. Randazzese, and J. Walsh, 1998, "Industry and the Academy: Uneasy Partners in the Cause of Technological Advance," in R. Noll, ed., Challenges to the Research University, Washington, D.C.: Brookings Institution.

Evenson, R. and Y. Kislev, 1976, "A Stochastic Model of Applied Research," Journal of Political Economy 84 (2): 265-282.

Faulkner, W. and J. Senker, 1995, Knowledge Frontiers: Public Sector Research and Industrial Innovation in Biotechnology, Engineering Ceramics, and Parallel Computing, Oxford: Clarendon Press.

Gambardella, A., 1995, Science and Innovation: The U.S. Pharmaceutical Industry during the 1980s, Cambridge: Cambridge University Press.

Henderson, R., A. B. Jaffe, and M. Trajtenberg, 1998, "Universities as a Source of Commercial Technology: A Detailed Analysis of University Patenting, 1965-1988," Review of Economics and Statistics 80: 119-127.

Jaffe, A., 1989, "The Real Effects of Academic Research," American Economic Review 79 (5): 957-970.

Jaffe, A., M. Trajtenberg, and R. Henderson, 1993, "Geographic Localization of Knowledge Spillovers as Evidenced by Patent Citations," Quarterly Journal of Economics 108 (3).

Jaffe, A. and M. Trajtenberg, 1996, "Flows of Knowledge from Universities and Federal Labs: Modeling the Flow of Patent Citations over Time and across Institutional and Geographic Boundaries," NBER working paper no. 5712.

Jaffe, A., M. Fogarty, and B. Banks, 1998, "Evidence from Patents and Patent Citations on the Impact of NASA and Other Federal Labs on Commercial Innovation," Journal of Industrial Economics 46 (2).

Jensen, R. and M. Thursby, 1999, "Proofs and Prototypes for Sale: The Licensing of University Inventions," American Economic Review.

Kortum, S. and J. Lerner, 1997, "Stronger Protection or Technological Revolution: Which is Behind the Recent Surge in Patenting?" working paper.

Lambert, D., 1992, "Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing," Technometrics 34: 1-14.

Mansfield, E., 1995, "Academic Research Underlying Industrial Innovations: Sources, Characteristics, and Financing," Review of Economics and Statistics 77: 55-65.

Mowery, D., R. Nelson, B. Sampat, and A. Ziedonis, 1998, "The Effects of the Bayh-Dole Act on U.S. University Research and Technology Transfer: An Analysis of Data from Columbia University, the University of California, and Stanford University," working paper.

Narin, F., K. Hamilton, and D. Olivastro, 1997, "The Increasing Linkage Between U.S. Technology and Public Science," Research Policy 26: 317-330.

Office of Technology Transfer, University of California, 1997, Annual Report: University of California Technology Transfer Program, Oakland, CA: University of California.

Rosenbloom, R. and W. Spencer, 1996, Engines of Innovation: U.S. Industrial Research at the End of an Era, Boston: Harvard Business School Press.
Stephan, P., 1996, "The Economics of Science," Journal of Economic Literature 34: 1199-1235.

Thursby, J. and M. Thursby, 2000, "Who is Selling the Ivory Tower? Sources of Growth in University Licensing," NBER Working Paper No. 7718.

Zucker, L., M. Darby, and M. Brewer, 1998, "Intellectual Capital and the Birth of U.S. Biotechnology Enterprises," American Economic Review 88: 290-306.

Figure 1: Citations to UC papers vs. other indicators, 1988-1997. [Line graph; series: citations to UC papers, citations to UC patents, licenses, invention disclosures, patents.]

Figure 2: Citations of Stanford papers vs. other indicators, 1988-1997. [Line graph; series: citations of Stanford papers, citations of Stanford patents, disclosures, licenses, Stanford patents.]

Table 1: Importance to Industrial R&D of Information Sources on University Research (entries are percentages of respondents)

SIC | Industry | Patents | Publications | Conferences | Informal channels | Hires | Licenses | JVs | Contract research | Consulting | Personal exchange
2320 | Petroleum | 0 | 46.67 | 53.33 | 33.33 | 13.3 | 13.33 | 13 | 26.67 | 46.67 | 0
2400 | Chemicals | 25 | 34.37 | 28.12 | 18.75 | 18.8 | 7.81 | 16 | 20.63 | 26.56 | 9.37
2423 | Drugs | 56.86 | 72.55 | 60.78 | 60.78 | 31.4 | 35.29 | 41 | 54.9 | 54.9 | 7.84
2922 | Machine Tools | 10 | 40 | 40 | 40 | 20 | 0 | 10 | 20 | 40 | 0
3010 | Computers | 8.33 | 41.67 | 41.67 | 33.33 | 33.3 | 4.17 | 8.3 | 8.33 | 29.17 | 4.17
3100 | Electrical Equipment | 9.09 | 31.82 | 22.73 | 22.73 | 0 | 0 | 9.1 | 13.64 | 9.09 | 0
3210 | Electronic Components | 20 | 36 | 28 | 36 | 32 | 12 | 12 | 8 | 33.33 | 4
3211 | Semiconductors | 22.22 | 61.11 | 55.56 | 64.71 | 27.8 | 16.67 | 28 | 16.67 | 33.33 | 5.56
3220 | Communications Equip. | 5.88 | 50 | 32.35 | 32.35 | 29.4 | 8.82 | 8.8 | 17.65 | 29.41 | 20.59
3311 | Medical Equip. | 27.54 | 37.68 | 34.78 | 46.38 | 18.8 | 18.84 | 23 | 23.19 | 44.93 | 5.8
3312 | Precision Instruments | 25 | 50 | 44.44 | 44.44 | 11.1 | 13.89 | 19 | 8.33 | 36.11 | 5.56
3410 | Car/Truck | 33.33 | 33.33 | 11.11 | 33.33 | 11.1 | 11.11 | 22 | 33.33 | 22.22 | 11.11
3430 | Autoparts | 9.37 | 43.75 | 31.25 | 25 | 18.8 | 9.37 | 22 | 18.75 | 21.87 | 9.37
3530 | Aerospace | 14.58 | 58.33 | 50 | 54.17 | 18.8 | 6.25 | 40 | 35.42 | 39.58 | 4.17
 | All Manufacturing | 17.61 | 40.91 | 34.42 | 35.28 | 19.9 | 9.73 | 18 | 21.26 | 32.15 | 5.84

Source: Cohen, Florida, Randazzese, and Walsh, 1998, pp. 180-181.
Table 2: Impact of Public Sector Research in Biotechnology (percentages)

Activity | Overall | Literature | Contact | Recruitment
Future Innovations | 9.1 | 4.5 | 2.3 | 2.3
Search | 45.5 | 25 | 16 | 4.5
RD&D | 22.7 | 13.6 | 9.1 |
Instrumentalities | 22.7 | 9.1 | 9.1 | 4.5
Overall | 52.2 | 36.4 | 11.4 |

Source: Faulkner and Senker, 1995.

Figure 3: Citations by scientific field, UC. [Chart; fields: biology, biomedical research, chemistry, clinical medicine, earth and space, engineering and technology, mathematics, physics, psychology.]

Figure 4: Stanford citations by field, including Stanford Medical School. [Chart; fields: biology, biomedical research, chemistry, clinical medicine, earth and space, engineering and technology, mathematics, physics.]

Figure 5: Biotech citations drive overall trends for UC, 1988-1997. [Stacked chart; series: biomedical research, clinical medicine, other.]

Figure 6: Stanford citations by field, 1988-1997 (excluding Medical School). [Stacked chart; series: biomedical research, engineering/technology, physics, other.]

Figure 7: Total citations by UC campus/institution. [Bar chart; institutions include UC-San Francisco, UC-Berkeley, UCLA Medical School, UCSD, UCSD Medical School, UC-Davis, UCLA, UC Extension Service, UCSB, UCI Medical School, UCD Medical School, UCI, Los Alamos, Riverside, and Santa Cruz.]

Figure 8: Citations to UC-Berkeley papers, United States. [Three-dimensional map of citation counts by county.]

Figure 9: Citations to UC-Berkeley papers, California. [Three-dimensional map of citation counts by county.]

Figure 10: Lags between paper publication and patent application. [Histogram of citation counts by lag in years, -3 to 21.]

TABLE 3: Econometric Analysis of Citing Patents Only

Variable | Poisson Coeff. | Std. Err. | z-stat | Neg. Bin. Coeff. | Std. Err. | z-stat
dist | 0.052545 | 0.037272 | 1.41 | -0.00203 | 0.038137 | -0.053
dstate | 0.021823 | 0.002642 | 8.26 | 0.010019 | 0.002966 | 3.378
dcnty | 0.000947 | 0.003952 | 0.24 | 0.016167 | 0.004436 | 3.644
Dlag0 | 0.369132 | 0.029263 | 12.615 | 0.318366 | 0.029552 | 10.773
Dlag1 | 0.406983 | 0.021491 | 18.937 | 0.328933 | 0.022207 | 14.812
Dlag2 | 0.390081 | 0.020043 | 19.462 | 0.328275 | 0.020387 | 16.102
Dlag3 | 0.389842 | 0.019945 | 19.546 | 0.331016 | 0.020281 | 16.321
Dlag4 | 0.405211 | 0.020102 | 20.158 | 0.336577 | 0.020523 | 16.4
Dlag5 | 0.361611 | 0.02054 | 17.605 | 0.293899 | 0.020918 | 14.05
Dlag6 | 0.380266 | 0.021281 | 17.869 | 0.319121 | 0.021641 | 14.746
Dlag7 | 0.300711 | 0.02279 | 13.195 | 0.2408 | 0.023024 | 10.459
Dlag8 | 0.290672 | 0.025003 | 11.626 | 0.233868 | 0.025031 | 9.343
Dlag9 | 0.267624 | 0.0268 | 9.986 | 0.22207 | 0.026839 | 8.274
Dlag10 | 0.292398 | 0.02828 | 10.339 | 0.236958 | 0.028416 | 8.339
Dlag11 | 0.253135 | 0.031311 | 8.084 | 0.197521 | 0.032013 | 6.17
Dlag12 | 0.293841 | 0.032585 | 9.018 | 0.267821 | 0.033074 | 8.098
Dlag13 | 0.272878 | 0.035835 | 7.615 | 0.225961 | 0.036363 | 6.214
Dlag14 | 0.303293 | 0.040831 | 7.428 | 0.199679 | 0.042206 | 4.731
Dlag15 | 0.370119 | 0.042143 | 8.782 | 0.310247 | 0.043229 | 7.177
Dlag16 | 0.225049 | 0.048345 | 4.655 | 0.161185 | 0.049453 | 3.259
Dlag17 | 0.363246 | 0.051596 | 7.04 | 0.316077 | 0.051734 | 6.11
Dlag18 | 0.508345 | 0.063849 | 7.962 | 0.383934 | 0.065444 | 5.867
Dlag19 | 0.262838 | 0.076289 | 3.445 | 0.227468 | 0.077309 | 2.942
Dlag20 | 0.112348 | 0.091689 | 1.225 | 0.04834 | 0.095874 | 0.504
Dlag21 | 0.299037 | 0.132362 | 2.259 | 0.246146 | 0.134674 | 1.828
_cons | -0.09989 | 0.019069 | -5.238 | -0.43165 | 0.043481 | -9.928
Biology | | | | 0.142797 | 0.040006 | 3.569
Biomed | | | | 0.243673 | 0.024646 | 9.887
Chemistry | | | | 0.186879 | 0.030365 | 6.154
Clin. Medicine | | | | 0.180109 | 0.023833 | 7.557
Earth/space | | | | 0.105041 | 0.136151 | 0.771
Engineering/tech | | | | 0.094863 | 0.037678 | 2.518
Mathematics | | | | 0.192904 | 0.379149 | 0.509
Physics | | | | 0.102372 | 0.03729 | 2.745

Note: field effects are included in the negative binomial specification only.

TABLE 4: Econometric Analysis of Citing Patents Plus Nonciting Controls

Variable | ZIP Coeff. | Std. Err. | z-stat | ZINB Coeff. | Std. Err. | z-stat
dist | 0.028518 | 0.03825 | 0.746 | -0.02492 | 0.053027 | -0.47
dstate | 0.041533 | 0.002708 | 15.337 | 0.030714 | 0.003321 | 9.249
dcnty | 0.005281 | 0.004156 | 1.271 | 0.039739 | 0.007057 | 5.631
Dlag0 | 0.183462 | 0.029213 | 6.28 | 0.415766 | 0.046894 | 8.866
Dlag1 | 0.144973 | 0.021573 | 6.72 | 0.438943 | 0.033029 | 13.29
Dlag2 | 0.182471 | 0.020064 | 9.094 | 0.412661 | 0.031047 | 13.292
Dlag3 | 0.176568 | 0.02002 | 8.82 | 0.487401 | 0.029599 | 16.467
Dlag4 | 0.197665 | 0.020032 | 9.868 | 0.459188 | 0.029682 | 15.47
Dlag5 | 0.229656 | 0.020326 | 11.299 | 0.430753 | 0.029691 | 14.508
Dlag6 | 0.247381 | 0.021513 | 11.499 | 0.443892 | 0.031693 | 14.006
Dlag7 | 0.133829 | 0.022685 | 5.9 | 0.314559 | 0.03458 | 9.097
Dlag8 | 0.168771 | 0.024382 | 6.922 | 0.334996 | 0.036051 | 9.292
Dlag9 | 0.118561 | 0.025947 | 4.569 | 0.279705 | 0.039065 | 7.16
Dlag10 | 0.121217 | 0.028229 | 4.294 | 0.281775 | 0.040235 | 7.003
Dlag11 | 0.105573 | 0.031759 | 3.324 | 0.310739 | 0.044123 | 7.043
Dlag12 | 0.043467 | 0.032944 | 1.319 | 0.294047 | 0.048249 | 6.094
Dlag13 | 0.166856 | 0.035612 | 4.685 | 0.235102 | 0.049194 | 4.779
Dlag14 | 0.106759 | 0.041106 | 2.597 | 0.365968 | 0.056556 | 6.471
Dlag15 | 0.207689 | 0.042265 | 4.914 | 0.387112 | 0.058335 | 6.636
Dlag16 | 0.01901 | 0.049637 | 0.383 | 0.357288 | 0.068931 | 5.183
Dlag17 | 0.230787 | 0.051975 | 4.44 | 0.343885 | 0.071315 | 4.822
Dlag18 | 0.181156 | 0.066885 | 2.708 | 0.476819 | 0.086675 | 5.501
Dlag19 | 0.118101 | 0.0778 | 1.518 | 0.222155 | 0.105675 | 2.102
Dlag20 | 0.217564 | 0.089988 | 2.418 | 0.132281 | 0.124251 | 1.065
Dlag21 | 0.363832 | 0.133659 | 2.722 | 0.560115 | 0.220336 | 2.542
Biology | 0.684281 | 0.037331 | 18.33 | | |
Biomed | 1.413191 | 0.019157 | 73.769 | | |
Chemistry | 0.948693 | 0.026069 | 36.392 | | |
Clin. Medicine | 1.039707 | 0.019039 | 54.609 | | |
Earth/space | 1.38828 | 0.134705 | 10.306 | | |
Engineering/tech | 1.245991 | 0.032275 | 38.605 | | |
Mathematics | 1.274324 | 0.379088 | 3.362 | | |
Physics | 1.099169 | 0.031695 | 34.679 | | |
apyear | | | | 0.027679 | 0.004683 | 5.911
_cons | -1.49231 | 0.041779 | -35.719 | -55.5148 | 9.324683 | -5.954

Inflation equation:
categ1 | -19.819 | 536938.2 | 0 | -5.70235 | 0.323109 | -17.648
categ2 | -4.18121 | 963.7202 | -0.004 | -1.50184 | 0.163473 | -9.187
categ3 | -3.48186 | 498.832 | -0.007 | -1.36812 | 0.107109 | -12.773
_cons | -14.9426 | 97.82946 | -0.153 | 0.853846 | 0.033085 | 25.808

Note: ZIP = zero-inflated Poisson (full sample); ZINB = zero-inflated negative binomial (corporate patents only). Field effects appear in the ZIP specification; apyear (application year) appears in the ZINB specification.

Figure 11: Coefficients on lag terms, lags 0-20. [Line graph; series: negative binomial (firm only), Poisson (citing only), negative binomial (citing only).]