Title:
CDO TUMOR SUPPRESSOR GENE AND PROTEIN
Document Type and Number:
WIPO Patent Application WO/1998/004697
Kind Code:
A1
Abstract:
The present invention relates to a Class II tumor suppressor gene designated 'cdo' (Cam-related gene/down-regulated by oncogenes), which is down-regulated at the mRNA level in transformed cells, and is expressed at extremely low levels in adult tissues.

Inventors:
KRAUSS ROBERT S (US)
GAO MIN (US)
KANG JONG-SUN (US)
FEINLEIB JESSICA (US)
Application Number:
PCT/US1997/014418
Publication Date:
February 05, 1998
Filing Date:
July 11, 1997
Assignee:
SINAI SCHOOL MEDICINE (US)
KRAUSS ROBERT S (US)
GAO MIN (US)
KANG JONG SUN (US)
FEINLEIB JESSICA (US)
International Classes:
C07K14/705; C12N15/12; A61K38/00; (IPC1-7): C12N15/12; A61K48/00; C07K14/705; C12N15/85; C12Q1/68; G01N33/53
Other References:
SCIENCE, 25 August 1995, Volume 269, MARSHALL E., "Gene Therapy's Growing Pains", pages 1050-1055.
ONCOGENE, 1995, Volume 10, KLINGELHUTZ A.J. et al., "The DCC Gene Suppresses the Malignant Phenotype of Transformed Human Epithelial Cells", pages 1581-1586.
CANCER RESEARCH, 01 January 1995, Volume 55, HSIEH J.-T. et al., "Tumor Suppressive Role in Androgen-Regulated Epithelial Cell Adhesion Molecule (C-CAM) in Prostate Carcinoma Cell Revealed by Sense and Antisense Approaches", pages 190-197.
SCIENCE, 25 April 1997, Volume 276, ROUSH W., "Putative Cancer Gene Shows Up in Development Instead", pages 534-535.
Attorney, Agent or Firm:
Kole, Lisa B. (Graves Donohue & Raymond, 44th floor, 30 Rockefeller Plaza, New York NY, US)
Claims:
CLAIMS
1. A purified and isolated nucleic acid molecule having a nucleic acid sequence as set forth in Figure 1 for form α of rat cdo (SEQ ID NO: 1).
2. A purified and isolated nucleic acid molecule which is at least 90 percent homologous to the nucleic acid molecule of claim 1.
3. A purified and isolated nucleic acid molecule at least 30 nucleotides in length which hybridizes to the nucleic acid molecule of claim 1 under stringent conditions.
4. A protein encoded by the nucleic acid molecule of claim 1.
5. A protein encoded by the nucleic acid molecule of claim 2.
6. A protein encoded by the nucleic acid molecule of claim 3.
7. A purified and isolated nucleic acid molecule having a nucleic acid sequence as set forth in Figure 1 for form β of rat cdo (SEQ ID NO: 2).
8. A purified and isolated nucleic acid molecule which is at least 90 percent homologous to the nucleic acid molecule of claim 7.
9. A purified and isolated nucleic acid molecule at least 30 nucleotides in length which hybridizes to the nucleic acid molecule of claim 7 under stringent conditions.
10. A protein encoded by the nucleic acid molecule of claim 7.
11. A protein encoded by the nucleic acid molecule of claim 8.
12. A protein encoded by the nucleic acid molecule of claim 9.
13. A purified and isolated nucleic acid molecule having a nucleic acid sequence as set forth in Figure 11 for human cdo (SEQ ID NO : 3).
14. A purified and isolated nucleic acid molecule which is at least 90 percent homologous to the nucleic acid molecule of claim 13.
15. A purified and isolated nucleic acid molecule at least 30 nucleotides in length which hybridizes to the nucleic acid molecule of claim 13 under stringent conditions.
16. A protein encoded by the nucleic acid molecule of claim 13.
17. A protein encoded by the nucleic acid molecule of claim 14.
18. A protein encoded by the nucleic acid molecule of claim 15.
19. A cdo-encoding nucleic acid molecule, as comprised in pSKcdoa, deposited with the American Type Culture Collection and assigned accession number 97667.
20. A protein encoded by the nucleic acid molecule of claim 19.
21. A purified and isolated protein having an amino acid sequence as set forth for rat cdo form α in Figure 1.
22. A purified and isolated nucleic acid molecule encoding the protein of claim 21.
23. A purified and isolated protein having an amino acid sequence as set forth for rat cdo form β in Figure 1.
24. A purified and isolated nucleic acid molecule encoding the protein of claim 23.
25. A purified and isolated protein having an amino acid sequence as set forth for human cdo in Figure 11.
26. A purified and isolated nucleic acid molecule encoding the protein of claim 25.
27. A method for inhibiting proliferation of a cell, comprising introducing, into the cell, a nucleic acid molecule encoding a cdo protein operably linked to elements necessary for its expression, such that the cdo protein is expressed in the cell.
28. A method for inhibiting transformation of a cell, comprising introducing, into the cell, a nucleic acid molecule encoding a cdo protein operably linked to elements necessary for its expression, such that the cdo protein is expressed in the cell.
29. A method of diagnosing a disorder of cell proliferation in a subject, comprising measuring the amount of cdo expression in a test sample of cells collected from the subject, and comparing the level of cdo expression in the test sample to a control sample of cells from a normal subject, wherein a decreased amount of cdo expression in the test sample is indicative of a disorder of increased cell proliferation in the subject.
30. A method of diagnosing a malignant disorder in a subject, comprising measuring the amount of cdo expression in a test sample of cells collected from the subject, and comparing the level of cdo expression in the test sample to a control sample of cells from a normal subject, wherein a decreased amount of cdo expression in the test sample is indicative of a malignant disorder in the subject.
Description:
cdo Tumor Suppressor Gene And Protein

This invention was made with government support under Grant No. CA59474 from the NIH, such that the United States Government may have certain rights herein.

1. Introduction

The present invention relates to a newly discovered tumor suppressor gene, termed "cdo" (for Cam-related gene down-regulated by oncogenes) and, in particular, to cdo nucleic acids and proteins. The invention is also directed to methods and compositions for detecting cdo nucleic acids and proteins in vertebrate samples and to methods of treating malignancies and other disorders of cell proliferation.

2. Background of the Invention

Carcinogenesis involves multiple, independent somatic mutations in proto-oncogenes and tumor suppressor genes. The number of proto-oncogenes isolated and characterized now totals more than 70 (Bishop, 1991, Cell 64: 235-248). In contrast, only about 10 candidate tumor suppressor genes have been identified (Knudson, 1993, Proc. Natl. Acad. Sci. U.S.A. 90: 1091-1092).

A tumor suppressor gene, in essence, is a "recessive oncogene", a gene whose role in the development of neoplasms becomes apparent only after inactivating mutations have occurred in both alleles of the gene. Tumor suppressor genes are generally divided into two classes. Those of the Class I type are genes inactivated by mutation or deletion in tumor cells. Tumor suppressor genes of the Class II type are genes which are unaltered by mutation in tumor cells, but rather are transcriptionally down-regulated by mutations in Class I genes or proto-oncogenes.

Several Class II tumor suppressor genes have recently been identified. These include, for example, genes encoding: (1) maspin (a putative protease inhibitor); (2) tropomyosin I; (3) α-actinin; (4) vinculin; and (5) N03/DAN (Zou, et al., 1994, Science 263: 526-529; Prasad, et al., 1993, Proc. Natl. Acad. Sci. U.S.A. 90: 7039-7043; Gluck, et al., 1993, Proc. Natl. Acad. Sci. U.S.A. 90: 383-387; Fernandez, et al., 1992, J. Cell. Biol., 119: 427-438; Ozaki and Sakiyama, 1994, Cancer Res., 54: 646-648).

Many of the properties exhibited by tumors and transformed cells in vitro suggest that defects in cell adhesion molecules ("CAMs") play a significant role in oncogenesis. Surface molecules involved in cell adhesion are divided into four major groups: (1) immunoglobulin superfamily members ("IgSF"), (2) cadherins, (3) integrins, and (4) selectins (Hynes and Lander, 1992, Cell 68: 303-322). Recent data strongly indicate a role in tumor suppression for the first three of these groups of CAMs (Hedrick, et al., 1993, Trends Cell Biol., 3: 36-39).

Surface molecules which are members of the IgSF mediate Ca2+-independent, homo- and heterophilic cell-cell adhesion. Studies on rat cerebellar cell lines transformed with a temperature-sensitive mutant of Rous Sarcoma virus first indicated that loss of function of IgSF members may contribute to oncogenesis (Greenberg, et al., 1984, Proc. Natl. Acad. Sci. U.S.A. 81: 969-973). At non-permissive temperatures, these cells showed a neuronal morphology, and aggregated by a mechanism mediated by the neural cell adhesion molecule (N-CAM). When the cells were shifted to a temperature permissive for transformation, N-CAM expression was down-regulated, the cells became morphologically transformed, and lost most of their ability to aggregate.

More recently, down-regulation of a gene designated "DCC" has been associated with colorectal cancers (Fearon et al., 1990, Science 247: 49-56; Hedrick et al., 1994, Genes Develop. 8: 1174-1183; Reale et al., 1994, Cancer Res. 54: 4493-4501; Narayanan et al., 1992, Oncogene 7: 553-561; Tanaka et al., 1991, Nature 349: 340-342; and Pierceall et al., 1994, J. Cell. Biol. 124: 1017-1027). DCC, located on human chromosome 18, is a member of the IgSF, and has been found to be the target of somatic mutations in several colorectal cancers. Additionally, DCC mRNA and protein expression have been observed to be decreased or absent in a majority of colorectal cancer specimens studied, relative to normal colonic mucosa. Subsequent studies indicated that DCC expression may also be involved in additional types of cancers, including malignancies of the breast and prostate.

Furthermore, it was found that expression of DCC anti-sense RNA in Rat 1 fibroblasts induced transformation of these cells, as measured by growth in soft agar and tumorigenicity in nude mice. Unlike N-CAM, DCC is expressed at extremely low levels in adult tissues. This observation, as well as recent experimental data, suggests that DCC may function not simply to aggregate cells physically, but to act as a transducer of signals that regulate cell growth and, particularly, differentiation.

3. Summary of the Invention

The present invention relates to a tumor suppressor gene, termed "cdo", and its encoded protein. The invention is based, at least in part, on the discovery of the cdo gene, which is related to, but distinct from, the DCC gene disclosed above, and on the discovery that expression of cdo is decreased in transformed cells.

Accordingly, the present invention also provides for methods of using cdo nucleic acids and proteins in the diagnosis and treatment of malignant diseases as well as proliferative disorders.

4. Description of the Figures

FIGURE 1. Nucleic acid and amino acid sequences of rat cdo (complete cdo-encoding sequence designated as "form α", SEQ ID NO: 1). In an alternately spliced form ("form β", SEQ ID NO: 2) of rat cdo, nucleic acids 2694-3044 (indicated by brackets) are deleted.

FIGURE 2. Diagram showing the relative positions of rat cdo-encoding nucleic acid and human clones.

FIGURE 3. Comparison of human (SEQ ID NO : 3) and rat (SEQ ID NO : 1) nucleic acid sequences.

FIGURE 4A-B. Northern blot analysis showing total RNA from parental rat 6 cells transformed by the H-ras, neu, v-src, v-raf, protein kinase C ε, v-fos or c-myc oncogenes hybridized to (A) cdo probe or (B) GAPDH probe.

FIGURE 5. Northern blot analysis of RNA prepared from confluent, serum-starved rat 6 cells stimulated with serum, and (A) hybridized with cdo probe or (B) stained with ethidium bromide.

FIGURE 6. Expression of cdo mRNA in various tissues, as detected by PCR-based "exon connection" assays.

FIGURE 7. Autoradiogram of SDS-PAGE showing products of in vitro transcription/translation of rat α and β cdo-encoding cDNAs. The reporter gene luciferase was used as a positive control.

FIGURE 8. Coomassie stained gel showing bacterial synthesis of fusion protein comprising glutathione S transferase and the intracellular domain of cdo.

FIGURE 9A-E. (A) Western blot demonstrating that antisera-recognized cdo protein is present in 293 cells transfected with cdo-containing expression vector, but not with control vector. (B) Western blot demonstrating cdo protein expression in parental rat 6 cells and ras-transformed C1-T24 cells. (C) Western blot showing time course of cdo expression in serum-stimulated rat 6 cells. (D) Western blot showing expression of cdo protein in adherent cultures (lane P) and non-adherent cultures (lane M). (E) Northern blot showing cdo RNA expression in adherent cultures (lane P) and non-adherent cultures (lane M).

FIGURE 10A-B. Depicts an autoradiogram showing the results of PCR-based exon-connection assay for cdo expression in transformed breast cells.

FIGURE 11. Nucleic acid and amino acid sequence of human cdo (SEQ ID NO : 3).

5. Detailed Description of the Invention

The present invention relates to nucleic acid molecules encoding cdo, cdo proteins, peptide fragments and derivatives, and antibodies directed toward cdo. In addition, the invention relates to pharmacological compositions and diagnostic and therapeutic uses of cdo nucleic acids and proteins.

5.1. cdo-encoding Nucleic Acids

The present invention relates to purified and isolated nucleic acid molecules encoding cdo. It is based, at least in part, on the cloning and characterization of rat and human cdo-encoding nucleic acids. It is also based on the discovery that alternative splicing gives rise to multiple forms of cdo-encoding nucleic acids.

In one nonlimiting embodiment, the present invention provides for a purified and isolated nucleic acid encoding rat cdo. For example, and not by way of limitation, the invention provides for a nucleic acid molecule having a sequence as set forth in FIGURE 1, which sets forth two alternatively spliced forms of rat cdo.

The complete cdo-encoding sequence set forth in FIGURE 1 is designated "form α" (SEQ ID NO: 1), and the form bearing a deletion in nucleic acids 2694-3044 is designated "form β" (SEQ ID NO: 2), with these same terms being applied to the corresponding cdo proteins. The present invention also provides for nucleic acid molecules which are at least 90 percent (and preferably at least 95 percent) homologous to forms α and β of the cdo-encoding sequence set forth in FIGURE 1 (SEQ ID NO: 1 and SEQ ID NO: 2, respectively), wherein the percent homology is defined as the percentage of identical nucleic acids occurring in molecules which have been aligned in a manner which pairs residues, for comparison, with the greatest degree of similarity (e.g., MacVector, Version 4.1, "Sequence Analysis Software for the MacIntosh", International Biotechnologies, Inc., a subsidiary of Eastman Kodak Co., New Haven, Connecticut). In still further nonlimiting embodiments, the present invention provides for a nucleic acid molecule, at least 30 and preferably at least 50 nucleotides in length, which hybridizes with a nucleic acid molecule having a sequence as set forth for form α or form β in FIGURE 1 (SEQ ID NO: 1 or SEQ ID NO: 2, respectively) under stringent conditions, wherein stringent conditions are defined as hybridization in 50 percent formamide at 42°C, followed by washing in 0.1x SSC and 0.1 percent sodium dodecyl sulfate at 68°C.
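By way of illustration only, the percent homology defined above may be computed from two sequences that have already been aligned. The following Python sketch is not the MacVector procedure; it assumes that the alignment was produced elsewhere, that gaps are written as "-", and that the denominator is the full alignment length (a convention the text does not specify). The function names and the default 90 percent threshold parameter are hypothetical.

```python
def percent_homology(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned positions holding identical nucleotides.

    Assumption: both inputs are rows of one pairwise alignment, so they
    have equal length, and gap columns count toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    # Count columns where both rows hold the same non-gap residue.
    identical = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float = 90.0) -> bool:
    """True when the pair is at least `minimum` percent homologous."""
    return percent_homology(aligned_a, aligned_b) >= minimum
```

For instance, two aligned 10-nucleotide rows that differ at one position score 90 percent under this convention. A different gap-handling convention would change the result for gapped alignments.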

The present invention also provides for purified and isolated nucleic acid molecules which encode a protein having an amino acid sequence as set forth for form α or β in FIGURE 1 and to (i) nucleic acid molecules at least 90 percent (and preferably at least 95 percent) homologous and (ii) nucleic acid molecules (at least 30 or at least 50 nucleotides in length) which hybridize under stringent conditions, thereto.
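The relationship between the two rat forms set forth in FIGURE 1 (the second lacking nucleic acids 2694-3044 of the first) may be sketched as a simple deletion. The Python example below is illustrative only; it assumes that the stated coordinates are 1-based and inclusive, and the function name is hypothetical. Under that assumption the deleted span is 351 nucleotides, a multiple of three, so the deletion taken alone would not shift the reading frame.

```python
DELETION_START = 2694  # first deleted nucleotide, as stated for FIGURE 1
DELETION_END = 3044    # last deleted nucleotide, as stated for FIGURE 1


def delete_region(sequence: str, start: int, end: int) -> str:
    """Remove positions start..end (1-based, inclusive) from sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[: start - 1] + sequence[end:]


# Length of the deleted span under the 1-based, inclusive assumption.
deleted_length = DELETION_END - DELETION_START + 1
removes_whole_codons = deleted_length % 3 == 0
```

Given the full-length sequence as a string, `delete_region(full_length, DELETION_START, DELETION_END)` would yield the shorter form under these assumptions.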

In additional embodiments, the present invention provides for human cdo-encoding nucleic acids, as comprised in human cDNA clones pHC12, pkSHA3-4, and pTA7. A nucleic acid sequence of human cdo (FIGURE 11; SEQ ID NO: 3), as obtained by sequencing the cloned DNA (for much, but not all, of the cloned DNA, both strands were sequenced), corresponding to rat cdo, is set forth in FIGURE 3. The present invention also provides for nucleic acid molecules that are at least 90 percent (and preferably at least 95 percent) homologous to the human cdo-encoding nucleic acids as contained in the deposited clones or having the sequence set forth in FIGURE 11 (SEQ ID NO: 3), or that are at least 30 or at least 50 nucleotides in length and hybridize under stringent conditions (as set forth above) thereto.

The present invention yet further provides for the cloning of a cdo cDNA or genomic sequence using nucleic acid sequences disclosed herein. For example, and not by way of limitation, such a cdo-encoding nucleic acid may be cloned by a combination of procedures, comprising the derivation of an oligonucleotide probe based on the sequence information provided herein, construction of a cDNA or genomic library, and selection, isolation and cloning of the cdo-encoding nucleic acid. A preferred procedure utilizes the polymerase chain reaction (PCR; Saiki et al., 1985, Science 230: 1350-1354) to expand the number of cdo sequences for cloning.

Those skilled in the art would be enabled by the sequence data provided herein to isolate cdo mRNA, cDNA or genomic DNA from a vertebrate species (as discussed in Section 6, below, a "Zoo" Southern blot showed the presence of cdo genes in all vertebrate species tested). For example, one of ordinary skill may generate a synthetic DNA probe consisting of a 10-30 nucleotide segment of a cdo nucleic acid molecule, as set forth above, and use the probe to screen at high stringency a cDNA library from an appropriate cell line presumed to carry cdo-encoding sequences. Alternatively, one may design two appropriate PCR primers, based upon the disclosed nucleic acid sequences, and generate a cdo cDNA either from the same library, or directly from the mRNA of that cell line. Both of these procedures are standard in the art (Benton and Davis, 1977, Science 196: 180-182; Maniatis et al., 1978, Cell 15: 687-701).

Multiple copies of a cdo-encoding nucleic acid may be readily produced by inserting the nucleic acid into an appropriate cloning vector and introducing that vector into a suitable host cell, such as a bacterial cell.

5.2. Expression of cdo

The cdo-encoding nucleic acid molecules set forth above may be expressed in a suitable host cell, for example, a bacterial, yeast, fungal, plant, insect, or vertebrate host cell. A cdo-encoding nucleic acid molecule may be inserted into a suitable expression vector, including a plasmid, cosmid, phage, or virus vector.

The vector may further comprise control elements which aid in the transcription, translation, and/or processing of cdo, as well as one or more selection markers. For example, useful control elements include one or more of the following: a promoter/enhancer element, polyadenylation signal, transcriptional terminator, translational initiation site and terminator, ribosome binding site, nuclear localization signal, and secretory signal sequence. The vector may then be introduced, using standard techniques, into a suitable host cell for expression.

As a specific, nonlimiting example, a cdo nucleic acid molecule may be incorporated into a pMV12 retroviral vector, which contains a hygromycin resistance gene as a selection marker. The pMV12/cdo vector may then be packaged to form retrovirus suitable for transduction and then may be transduced into an appropriate cell line. For example, a pMV12-cdo plasmid vector may be transiently transfected into the ecotropic packaging mutant helper cell line, BOSC, to generate replication-defective viral particles; then, transduction of pMV12/cdo retrovirus into ras-transformed rat 6 or 3T3 cells (for example, C1/T24 cells), is preferred. Hygromycin-resistant transfectants may then be selected. Ectopic expression of cdo in the transfectants may be confirmed by immunoprecipitation and/or Western blot analyses of total cellular protein with antibodies specific for extracellular and intracellular regions of cdo.

As another specific, nonlimiting example, a mammalian cell line having inducible expression of cdo may be prepared utilizing an expression system based on the tetracycline-resistance (tet) operon of E. coli (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. U.S.A. 89: 5547-5551). This system employs a tetracycline-controlled, hybrid trans-activator (tTA) that consists of the tet-repressor and the transcriptional trans-activating domain of herpes simplex virus protein 16 (VP16). The tTA can bind to tet operator sequences placed in front of a minimal mammalian promoter and thereby repress transcription in the presence of tetracycline. Removal of tetracycline from the culture medium causes rapid induction of cDNAs placed downstream of the tet operator sequences (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. U.S.A. 89: 5547-5551; Schmid, 1995, Trends Cell Biol. 5: 266-267). It may be desirable to generate cell lines that stably express tTA (e.g., by transient transfection with a reporter linked to a tet operator), and then to introduce the cdo gene-bearing construct comprised in the tet operator plasmid in a second transfection step.

As another nonlimiting example, nucleic acid encoding cdo protein may be incorporated into an expression vector so as to produce a fusion protein. For example, and not by way of limitation, cDNA encoding cdo protein may be fused, in frame, into the vector pGEX-5X-1, which enables the IPTG-inducible production of a cdo/glutathione S-transferase fusion protein. Escherichia coli strain XL-1 Blue may then be transformed with this vector, and cultured in the presence of IPTG, to produce the fusion protein, which may then be captured on glutathione beads.

In yet another specific, non-limiting example, a cdo nucleic acid may be incorporated into the pGEX-KG vector (Guan and Dixon, 1991, Anal. Biochem. 192: 262-267). This vector directs IPTG-inducible expression of glutathione S-transferase (GST) fusion proteins that contain an "improved" thrombin cleavage region between the GST and fusion partner proteins. It also provides a one-step purification procedure of cleaved recombinant protein. Lysates from IPTG-treated bacteria that harbor the pGEX-KG/cdo vector may be incubated with glutathione agarose beads, and the beads extensively washed. The GST/cdo-bound beads may then be transferred to thrombin cleavage buffer and incubated with thrombin, followed by collection of the supernatant containing the cleaved, purified cdo portion of the fusion protein. The homogeneity of the recombinant cdo protein may be confirmed by SDS-PAGE and used for generation of antibodies. If necessary, a final acrylamide gel purification step may be added.

5.3. cdo Proteins

The present invention relates to purified and isolated cdo proteins encoded by the nucleic acid molecules described in section 5.1, supra. Such proteins may be produced by techniques set forth in section 5.2.

In one nonlimiting embodiment, the present invention provides for a purified and isolated rat cdo protein. For example, and not by way of limitation, the invention provides for a protein having an amino acid sequence as set forth in FIGURE 1, which sets forth the amino acid sequences of the two alternatively spliced forms of rat cdo.

The present invention also relates to proteins having amino acid sequences which are functionally equivalent to the amino acid sequences set forth in FIGURES 1 and 11. For example, one or more of the amino acid residues within these sequences may be substituted with another amino acid residue of a similar polarity which acts as a functional equivalent, resulting in a silent alteration.

Substitutes for an amino acid within the sequence may be selected from other members of the class to which the amino acid belongs. For example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine. The polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine. The positively charged (basic) amino acids include arginine, lysine and histidine. The negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
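For illustration, and not by way of limitation, the four classes listed above may be encoded as a lookup table to test whether a proposed substitution stays within a class. The Python sketch below uses standard one-letter amino acid codes; the function names are hypothetical, and class membership alone is assumed here as the criterion for a "silent alteration", which does not establish functional equivalence in any particular protein.

```python
# Classes exactly as listed in the text, in one-letter codes.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala Leu Ile Val Pro Phe Trp Met
    "polar_neutral": set("GSTCYNQ"),  # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),              # Arg Lys His
    "acidic": set("DE"),              # Asp Glu
}


def class_of(residue: str) -> str:
    """Return the name of the class containing the given residue."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original: str, substitute: str) -> bool:
    """True when both residues belong to the same class."""
    return class_of(original) == class_of(substitute)


def non_conservative_positions(reference: str, variant: str) -> list[int]:
    """1-based positions where the variant leaves the reference's class."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        i
        for i, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v and not is_conservative(r, v)
    ]
```

Under this table, replacing leucine with isoleucine counts as a within-class substitution, whereas replacing lysine with aspartic acid does not.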

Also within the scope of the invention are cdo proteins that have been modified by glycosylation, proteolytic cleavage, or incorporation into a larger molecule.

In additional embodiments, the present invention provides for human cdo proteins, as encoded by (i) nucleic acids comprised in human cDNA clones pHC12, pkSHA3-4, and pTA7; (ii) nucleic acid sequence SEQ ID NO : 3 (FIGURE 11); and (iii) nucleic acid molecules which are at least 90 percent homologous (preferably at least 95 percent homologous), or which hybridize under stringent conditions, to cdo-encoding nucleic acids comprised in human cDNA clones pHC12, pkSHA3-4, and pTA7, or to nucleic acid sequence SEQ ID NO : 3 (FIGURE 11).

The present invention also provides for protein fragments of cdo, for example fragments comprising the intracellular portion, and for fusion proteins comprising this intracellular portion of cdo. For example, but not by way of limitation, the intracellular portion of cdo may have a sequence (i) as set forth for form α of rat cdo in FIGURE 1 (SEQ ID NO: 1), from amino acid encoded by nucleic acid residue 4455-5222 to the end of the protein coding sequence in FIGURE 1; (ii) as encoded by nucleic acid residues 2893-3675 in FIGURE 11 (SEQ ID NO: 3); or (iii) at least 90 percent, and preferably at least 95 percent, homologous to the sequences of (i) or (ii). The preparation of a fusion protein comprising the intracellular portion of cdo, and the cleavage of a fragment comprising the intracellular portion of cdo therefrom, is illustrated in Example 7, below.

5.4. Anti-cdo Antibodies

According to the invention, a cdo protein, as set forth above, or an immunogenic fragment thereof, may be used as an immunogen to generate anti-cdo antibodies.

To improve the likelihood of producing an anti-cdo immune response, the amino acid sequence of cdo may be analyzed in order to identify portions of the cdo molecule which may be associated with greater immunogenicity. For example, the amino acid sequence may be subjected to computer analysis to identify surface epitopes, according to the method of Hopp and Woods, 1981, Proc. Natl. Acad. Sci. U.S.A. 78: 3824-3828. Such epitopes may then be isolated and incorporated into a suitable carrier molecule.
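A computer analysis of the kind referred to above may be sketched as a sliding-window average of per-residue hydrophilicity values, the highest-scoring window being a candidate surface region. The Python example below is a minimal sketch, not the published implementation: the scale values are given as they are commonly tabulated for the Hopp and Woods method and should be verified against the cited paper, the six-residue window is an assumption, and the function names are hypothetical.

```python
# Hydrophilicity values as commonly tabulated for the Hopp-Woods scale
# (assumption: verify against the cited publication before relying on them).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(protein: str, window: int = 6) -> list[tuple[int, float]]:
    """Return (1-based window start, mean hydrophilicity) for each window."""
    protein = protein.upper()
    return [
        (i + 1, sum(HOPP_WOODS[aa] for aa in protein[i : i + window]) / window)
        for i in range(len(protein) - window + 1)
    ]


def most_hydrophilic_window(protein: str, window: int = 6) -> tuple[int, float]:
    """Return the start and score of the highest-scoring window."""
    profile = hydrophilicity_profile(protein, window)
    if not profile:
        raise ValueError("protein is shorter than the window")
    return max(profile, key=lambda item: item[1])
```

The window start returned by `most_hydrophilic_window` only nominates a region for further evaluation; it does not demonstrate immunogenicity.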

For preparation of monoclonal antibodies toward cdo, any technique which provides for the production of antibody molecules by a continuous cell line or by an organism may be used. For example, and not by way of limitation, the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256: 495-497), or the trioma technique (Kozbor et al., 1983, Immunology Today 4: 72), or other techniques used for monoclonal antibody production, including methods for producing chimeric, humanized, or primatized antibodies, may be employed.

Alternatively, polyclonal antibodies directed toward cdo may be prepared by methods known in the art. Various adjuvants may be used to increase the immunological response, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, and keyhole limpet hemocyanin.

The present invention further provides for nucleic acids encoding immunoglobulin molecules directed toward cdo, including nucleic acids encoding single chain antibodies as well as conventional antibody molecules.

Antibody molecules may be purified by known techniques, such as immunoabsorption or immunoaffinity chromatography, chromatographic methods such as HPLC, or combinations thereof.

The present invention also provides for antibody fragments directed toward cdo, including, but not limited to, F(ab')2 and Fab fragments.

5.5. Screening for cdo Expression in Cells

According to the invention, a nucleic acid probe comprising a portion of a cdo-encoding nucleic acid, as described in Section 5.1, may be used as a probe to detect cdo mRNA in a cell sample prepared from a subject who is suspected to suffer from a malignancy or defect in cell proliferation. Preferably, a probe used in such methods hybridizes specifically and selectively to nucleic acid encoding cdo; more preferably, the probe hybridizes to a nucleic acid having a sequence as set forth in FIGURE 1 (SEQ ID NOS: 1 or 2) or FIGURE 11 (SEQ ID NO: 3) under stringent conditions, but does not hybridize, under stringent conditions, to non-cdo-encoding nucleic acid.

For detecting cdo mRNA, the probe may be a single-stranded DNA or RNA molecule which is complementary to a cdo-encoding nucleic acid molecule.

The probe may be produced by chemical synthetic or recombinant DNA methods, and may be labeled such that it is directly or, alternatively, indirectly, detectable.

The probe may preferably be at least 10 nucleotides long, and more preferably, may be 20-50 nucleotides long. The assay may be carried out by standard methods such as in situ hybridization, or Northern analysis, using hybridization conditions appropriate for the degree of homology estimated between the probe and its target nucleic acid. Such techniques may be practiced, in a manner known to the skilled artisan, to allow quantitative or semi-quantitative measurement of cdo levels. In view of the low abundance of cdo mRNA in many cell types, it may be desirable to measure cdo mRNA levels by amplification methods, such as the PCR exon-connection as described in Fearon et al., 1990, Science 247: 49-56.
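As an illustrative sketch of the probe requirements above, a single-stranded DNA probe complementary to a target may be derived as the reverse complement of a segment of the coding-strand sequence. The Python example below assumes a DNA-alphabet input (A, C, G, T); the function names and the default length of 20 nucleotides are hypothetical, and no specificity screening against non-cdo sequences is attempted.

```python
MIN_PROBE_LENGTH = 10  # "at least 10 nucleotides long", per the text

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def candidate_probes(target: str, length: int = 20) -> list[str]:
    """List every probe of the given length complementary to the target.

    Each probe is the reverse complement of one window of the target,
    so it would base-pair with that window of the sense strand.
    """
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe shorter than the stated minimum length")
    return [
        reverse_complement(target[i : i + length])
        for i in range(len(target) - length + 1)
    ]
```

In practice the candidates would still need to be evaluated under the hybridization conditions chosen for the assay.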

Alternatively, cdo expression may be measured by assessing the amount of cdo protein produced using, for example, standard Western blot, ELISA, or RIA assays. For example, immunohistochemistry may be performed, as described in Hedrick et al., 1994, Genes Develop. 8: 1174-1183. A cell sample may be fixed on a glass microscope slide, depleted of endogenous peroxidase activity by incubation in 0.3% H2O2 in methanol, and blocked with non-immune goat serum. The sample may then be incubated with anti-cdo antibody, which is, for example but not by way of limitation, directed against the intracellular domain of cdo. In one specific, nonlimiting embodiment, detection of this antibody may be with a secondary, biotinylated, goat anti-rabbit antibody, followed by Vectastain Elite (Vector Laboratories) and diaminobenzidine staining. For this purpose, anti-cdo antibodies are desirably affinity purified by passage over Sepharose linked to cdo protein or a relevant fragment thereof. Staining with pre-immune rabbit serum in place of anti-cdo antibody may be used as a control.

In the foregoing methods, detecting, in a cell sample, cdo mRNA or protein levels which are substantially decreased or absent, compared to a level of cdo mRNA or protein measured in a control cell sample, indicates that the sample cells may be malignant or may exhibit a defect in proliferation.
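The comparison described above may be sketched as a ratio of the test measurement to the control measurement. The Python example below is illustrative only: the text does not define "substantially decreased", so the cutoff ratio of 0.5 is a hypothetical placeholder, as are the function names, and the inputs are assumed to be non-negative expression measurements in the same units (for example, normalized band intensities).

```python
def relative_expression(test_level: float, control_level: float) -> float:
    """Ratio of cdo expression in the test sample to that in the control."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    if test_level < 0:
        raise ValueError("test level must not be negative")
    return test_level / control_level


def suggests_decreased_cdo(
    test_level: float,
    control_level: float,
    max_ratio: float = 0.5,  # hypothetical cutoff, not taken from the text
) -> bool:
    """True when test expression is at or below the cutoff ratio."""
    return relative_expression(test_level, control_level) <= max_ratio
```

An absent signal (a test level of zero) satisfies any positive cutoff, which matches the "decreased or absent" wording; the choice of cutoff itself would have to be established experimentally.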

In this regard, the present invention provides for a method of diagnosing a disorder of cell proliferation in a subject, comprising measuring the amount of cdo expression in a test sample of cells collected from the subject, and comparing the level of cdo expression in the test sample to a control sample of cells from a normal subject, wherein a decreased amount of cdo expression in the test sample is indicative of a disorder associated with increased cell proliferation in the subject.

As shown in Example Section 9, below, decreased or absent cdo expression correlated positively with the transformed phenotype in breast cells. An increased amount of cdo expression may conversely indicate a disorder associated with decreased cell proliferation in the subject.
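The comparison of a test sample against a control sample described above can be sketched as a simple ratio test. This is only an illustration: the function name `classify_cdo_expression` and the two-fold cut-off `fold_threshold` are hypothetical and are not taken from the specification, which does not state a numerical threshold.

```python
def classify_cdo_expression(test_level, control_level, fold_threshold=2.0):
    """Classify cdo expression in a test sample relative to a control sample.

    test_level and control_level are expression measurements in the same
    arbitrary units (for example, normalized band intensities).
    fold_threshold is a hypothetical cut-off chosen for illustration only.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    if test_level < 0:
        raise ValueError("test_level must not be negative")
    ratio = test_level / control_level
    if ratio <= 1.0 / fold_threshold:
        # Decreased or absent expression: may indicate increased proliferation.
        return "decreased"
    if ratio >= fold_threshold:
        # Increased expression: may indicate decreased proliferation.
        return "increased"
    return "unchanged"


if __name__ == "__main__":
    print(classify_cdo_expression(0.0, 10.0))
    print(classify_cdo_expression(10.0, 10.0))
    print(classify_cdo_expression(30.0, 10.0))
```

In practice the cut-off would have to be established empirically for the assay and tissue in question.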

In another embodiment, the present invention provides for a method of diagnosing a malignant disorder in a subject, comprising measuring the amount of cdo expression in a test sample of cells collected from the subject, and comparing the level of cdo expression in the test sample to a control sample of cells from a normal subject, wherein a decreased amount of cdo expression in the test sample is indicative of a malignant disorder in the subject.

Similarly, the detection, by such methods, of a cdo mRNA or protein of abnormal size may indicate the presence of a malignancy or proliferative disorder.

5.6. Therapeutic Uses of cdo

Cdo expression may have an inhibitory effect on cell proliferation and on the expression of the transformed phenotype. Accordingly, cdo protein or cdo-encoding nucleic acid, or a portion thereof, may be introduced into a cell in order to inhibit the proliferation or transformation of the cell, or to reverse the transformed phenotype in a malignant cell.

Such methods may be useful in cell culturing techniques, where it may be desirable to retard the proliferation of cells (for example, but not by way of limitation, in a feeder culture). Similarly, where expression of cdo may be controlled (for example, by an inducible promoter), cells may be synchronized.

Such methods may be applied (in vitro or in vivo) to a cell of a subject in need of such treatment. A subject may be a human or non-human subject. A subject suffering from a malignant disorder or a proliferative disorder may be considered to be in need of such treatment.

To introduce cdo-encoding nucleic acid into a cell, a suitable expression vector, as set forth above, may be used. Where cdo nucleic acid is used for gene therapy, any vector system known in the art of gene therapy may be used.

Where a cdo protein, or a portion thereof, is to be introduced into a cell, the protein may be incorporated into a vesicle, or may be fused to a second protein or other molecule, to promote uptake.

6. Example: Cloning and Characterization of Rat cdo

A cDNA library was constructed from a transformation-resistant mutant rat 6 embryo fibroblast cell line that fails to form colonies in soft agar when infected with v-H-ras expressing retrovirus. The library was screened by differential hybridization with 32P-labelled cDNA prepared from both mutant and control cell mRNA. From this screening procedure, an approximately 3.5 kb cDNA clone, designated cdo, was isolated. The cDNA hybridized to an approximately 8.5 kb mRNA species and showed a mild (about 2-fold) up-regulation in mutant cells.

The 3.5 kb cDNA clone represented the 3'-most portion of the full-length cdo mRNA. To obtain clones spanning the entire 8.5 kb mRNA, a random-primed library was prepared from confluent, serum-starved rat 6 cells, and cDNAs covering the full sequence were isolated in three rounds of walking. The nucleic acid sequence of the full-length cdo cDNA was then determined and is disclosed in FIGURE 1 (form α; SEQ ID NO: 1).

Analysis of the amino acid sequence of the open reading frame (SEQ ID NO: 1) revealed cdo to be a new member of the IgSF, containing a putative signal sequence, 5 Ig-like C2 repeats, 3 fibronectin type III-like (FN III) repeats, a 25 residue transmembrane-spanning region, and an intracellular domain of 256 amino acids. By comparison, the candidate tumor suppressor gene, DCC, contains a signal sequence, 4 Ig-like C2 repeats, 6 FN III repeats, a transmembrane region and an intracellular domain of 325 amino acids. Protein sequence similarity among different members of the IgSF is usually only about 10-25%. It is significant, therefore, that some of the Ig and FN III repeats encoded by cdo are >60 percent similar to such repeats found in DCC. In fact, in database searches on the predicted cdo polypeptide sequence, DCC consistently came up as the most closely related known protein. Other genes that show a relatively high degree of similarity include the bona fide adhesion molecules N-CAM and F11/F3/contactin. Interestingly, however, the intracellular domain of cdo shows no homology to any gene currently in the major gene databases.

It was subsequently discovered that a variant form of cdo-encoding mRNA was also expressed in rat cells. The nucleic acid and amino acid sequence of this variant form, in which nucleotides 2694-3044 have been deleted, is also depicted in FIGURE 1 (form β; SEQ ID NO: 2).
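The relationship between the two rat forms can be checked arithmetically. The sketch below removes a 1-based, inclusive nucleotide range from a sequence and confirms that deleting nucleotides 2694-3044 from a 6039 base pair sequence (the length given for SEQ ID NO: 1 in the Sequence Listing) yields 5688 base pairs (the length given for SEQ ID NO: 2). The helper `delete_range` and the placeholder sequence of `N` characters are illustrative assumptions; the actual sequences are not reproduced here.

```python
def delete_range(seq, start, end):
    """Return seq with 1-based, inclusive positions start..end removed."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[:start - 1] + seq[end:]


# Values taken from the text and the Sequence Listing.
FULL_LENGTH = 6039              # length of SEQ ID NO: 1
DEL_START, DEL_END = 2694, 3044

deleted = DEL_END - DEL_START + 1          # number of nucleotides removed
codons_removed, remainder = divmod(deleted, 3)

# Placeholder standing in for SEQ ID NO: 1 (not the real sequence).
placeholder_full = "N" * FULL_LENGTH
placeholder_variant = delete_range(placeholder_full, DEL_START, DEL_END)

if __name__ == "__main__":
    print(deleted, codons_removed, remainder)
    print(len(placeholder_variant))
```

Because the deleted stretch is 351 nucleotides, a multiple of three, a deletion of this length lying within the open reading frame would remove 117 codons without shifting the downstream reading frame.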

FIGURE 4A-B depicts a Northern blot analysis of cdo mRNA expression (FIGURE 4A) in parental rat 6 cells transformed by the H-ras, neu, v-src, v-raf, protein kinase C ε, v-fos or c-myc oncogenes. Twenty micrograms of total cellular RNA from designated cell lines was fractionated on agarose/formaldehyde gels, blotted to a nylon membrane and hybridized with a 32P-labelled rat cdo cDNA probe. The lane designations in FIGURE 4A refer to the oncogene expressed in the given rat 6 cell derivative. The lower panel (FIGURE 4B) shows the signal generated when the same filter was hybridized to a rat GAPDH probe, as a control for the integrity of the RNA in each lane. Expression of cdo mRNA was significantly down-regulated by transformation. In contrast, overexpression of protein kinase C β1, which is a very weak transforming gene, had little effect on cdo expression.

Thus, down-regulation of cdo mRNA expression correlated with establishment of the transformed phenotype by several different oncogenes. This evidence suggests that cdo is a candidate Class II tumor suppressor gene, in that it is down-regulated in transformed cells.

Expression of cdo appears to be associated with the cell cycle. FIGURE 5A shows the time course of cdo expression in serum-stimulated rat 6 cells. Confluent cultures of wild-type rat 6 cells were rendered quiescent by incubation in medium containing 0.1% serum for 48 hours and then stimulated by refeeding with fresh serum-containing medium. RNA was isolated at various time points thereafter, and cdo expression was analyzed by Northern blotting techniques as described for FIGURE 4A-B. The numbers above the lanes indicate the time, in hours, after refeeding of the cultures. An ethidium bromide stained gel, shown in FIGURE 5B, serves as a loading control. A decrease in steady state cdo mRNA levels was apparent within 2 hours. Four hours after serum stimulation, cdo expression was nearly extinguished. Expression of cdo returned in 8-12 hours, and in 24-48 hours had been restored to the level of the original cultures prior to stimulation. Thus, steady state levels of cdo mRNA were transiently down-regulated as quiescent, G0-arrested cells reentered the cell cycle. Finally, it should be noted that cdo mRNA is ordinarily present in low amounts in rat 6 cells (the transformable parent cell line of the mutant cell line in which cdo was first discovered), representing no more than 0.003% of total cellular mRNA, even when cells are confluent and starved of serum.

As shown in FIGURE 6, expression of cdo in adult rat tissues was analyzed by the PCR-based "exon-connection" strategy used to detect DCC expression. This technique is described in Fearon et al., 1990, Science 247: 49-56.

In using this technique, a portion of the cdo gene was isolated from a rat genomic library and a partial exon/intron structure deduced over a ~15 kb region. Random primers and reverse transcriptase were used to generate cDNA from total RNA derived from adult rat tissues. PCR primers directed against sequences from separate exons were then used to amplify a cDNA product that linked these exons. RNA from brain, liver, kidney, heart, large and small intestines, spleen, thymus, lung, stomach, breast and skeletal muscle was analyzed in this manner. A PCR product of the predicted size (1883 bp) was detected in most tissues, but no such product was found in liver, kidney and skeletal muscle. This expression pattern is consistent with that of the related gene, DCC. Thus, like DCC, but in contrast to bona fide adhesion molecules such as N-CAM and F11, cdo is expressed at extremely low levels. It is possible, therefore, that the role of cdo, like that of DCC, is not merely to bind cells together physically, but to act as a transducer of signals involved in cell growth and differentiation.
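The logic of the exon-connection strategy can be illustrated with a short calculation: when the two primers lie in separate exons, the product amplified from spliced cDNA contains only exon sequence, whereas the same primer pair spans the intervening introns on genomic DNA. In the sketch below, the function `amplicon_sizes` and all coordinates are hypothetical and chosen only for illustration; they do not describe the actual cdo exon/intron structure or the primers that gave the 1883 bp product.

```python
def amplicon_sizes(exons, fwd_start, rev_end):
    """Return (cdna_size, genomic_size) for a primer pair.

    exons     : list of (start, end) genomic coordinates, 1-based inclusive
    fwd_start : genomic position of the first base of the amplicon
    rev_end   : genomic position of the last base of the amplicon
    Both primer positions are expected to fall inside exons.
    """
    def in_exon(pos):
        return any(start <= pos <= end for start, end in exons)

    if fwd_start > rev_end:
        raise ValueError("fwd_start must not exceed rev_end")
    if not (in_exon(fwd_start) and in_exon(rev_end)):
        raise ValueError("primer positions must lie within exons")

    genomic_size = rev_end - fwd_start + 1
    cdna_size = 0
    for start, end in exons:
        # Count only the exon bases that lie between the two primers.
        lo, hi = max(start, fwd_start), min(end, rev_end)
        if lo <= hi:
            cdna_size += hi - lo + 1
    return cdna_size, genomic_size


if __name__ == "__main__":
    # Hypothetical three-exon region with two introns.
    example_exons = [(1, 200), (1201, 1400), (5401, 5600)]
    print(amplicon_sizes(example_exons, fwd_start=101, rev_end=5500))
```

A product of the size predicted from the spliced exons therefore indicates amplification from processed mRNA rather than from contaminating genomic DNA.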

To evaluate the conservation of cdo in various organisms, a Southern "zoo" blot containing DNA from various species was prepared and probed with the rat cdo cDNA. This test revealed that mouse and human DNA displayed strong cross-species hybridization to the rat cdo cDNA, while chicken, frog (Xenopus laevis) and zebra fish DNA displayed a clearly detectable, but weaker, signal. These data indicate that cdo has been conserved throughout vertebrate evolution.

7. Example: Expression of cdo

Both α and β forms of rat cdo-encoding nucleic acids were expressed in an in vitro transcription/translation reaction. The two cDNAs encoding the α and β forms were subcloned into the Bluescript vector, pSK (Stratagene) and added to a Promega TNT kit reaction. A product of the predicted molecular weight (approximately 136,000 and 124,000 for α and β forms, respectively) was produced, as shown in FIGURE 7, in the lanes designated cdo α and cdo β. Three higher molecular weight forms were also detected, which probably represent glycosylation products.

FIGURE 8 shows the synthesis in bacteria of a recombinant fusion protein made between glutathione S-transferase and the intracellular portion of rat cdo (encoded by nucleic acid residues 4666-5267). Nucleic acid encoding the intracellular portion of cdo was incorporated, in frame, into the vector pGEX-5X1 (Pharmacia). This vector enables IPTG-inducible production of a glutathione S-transferase (GST) fusion protein. E. coli strain XL-1 Blue was transformed with the recombinant vector, and then a large culture was grown and induced with IPTG. The cells were then harvested and the GST-cdo fusion protein was purified by capture on glutathione beads, followed by SDS-gel electrophoresis and electroelution from a slice of the gel. The homogeneity of the recombinant cdo protein was confirmed by SDS-PAGE (see FIGURE 8).

The resulting purified protein was used to produce polyclonal antisera in rats by standard techniques (by Zymed Labs, Inc.). The polyclonal antisera were then used to demonstrate that the antisera-recognized cdo protein exhibited expression patterns which paralleled cdo RNA expression, as discussed above (FIGURE 9A-E). FIGURE 9A shows that transfection of 293 cells with an expression vector pBabePuro containing rat cdo-encoding cDNA resulted in production of antisera-recognized cdo protein. FIGURE 9B shows that antisera-recognized cdo protein is present in parental rat 6 cells but not in ras-transformed rat 6 cells (line C1-T24).

FIGURE 9C depicts a Western blot which demonstrates that serum stimulation of serum-starved rat 6 cells leads to transient down-regulation of antisera-recognized cdo protein. FIGURES 9D-E demonstrate that cdo RNA (FIGURE 9E) and antisera-recognized cdo protein (FIGURE 9D) are associated with cell substratum adhesion; lane P in these figures represents adherent cultures; lane M represents non-adherent cultures grown in methylcellulose.

8A. Example: Human cdo Clones

Human cloned cDNA corresponding to more than 98 percent of the rat open reading frame has been isolated. The human cloned cDNA is contained in clones pHC12, pkSHA3-4, and pTA7, prepared from two lambda phage libraries, one derived from fetal lung (a gift from Stuart Aaronson) and the second derived from human fetal brain (Clontech). FIGURE 2 provides a map of clones pHC12, pkSHA3-4 and pTA7 relative to rat cdo. FIGURE 3 provides a comparison between the obtained human sequence and rat cdo. The sequence data indicate that splice variant forms of cdo, in addition to forms α and β, are likely to exist. For example, human nucleotides 3304-3318 have no counterpart in the rat sequence; this likely represents an alternatively spliced exon encoding five amino acids.

8B. Example: Cloning and Characterization of Human cdo

Analysis of the amino acid sequence of the open reading frame of human cdo (SEQ ID NO: 3) shows that it contains 5 Ig repeats, 3 FN III repeats, a 25 residue transmembrane-spanning region, and an intracellular domain of 261 amino acids. Therefore, human and rat cdo have virtually identical domain structures in their extracellular regions, and their intracellular regions are highly related, but neither resembles other known proteins. Human and rat cdo are ~81% identical and ~96% similar at the amino acid level, and the relative positions of amino acids that define the Ig and FN III-like domain consensus are fully conserved.
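A percent identity figure of the kind quoted above is computed from a pairwise alignment. The following is a minimal sketch, assuming two already-aligned sequences of equal length with `-` marking gaps; the function `percent_identity` and the short example strings are hypothetical and are not the cdo sequences. The specification does not state which alignment program or which definition of identity was used, and percent similarity would additionally require a residue substitution scheme that is not shown here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, for illustration only.
    print(round(percent_identity("ACDEFGHIKL", "ACDEYGHI-L"), 1))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give somewhat different values for the same alignment.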

9. Example: Expression of cdo in Transformed Breast Cells

Expression of cdo was studied in various normal human mammary epithelial and mammary cancer cell lines, as well as in a culture derived from reduction mammoplasty. Expression was assessed by the PCR-based exon-connection assay, as described above. FIGURE 10A-B depicts an autoradiogram showing the results of these experiments. Briefly, equal amounts of RNA from each cell type were reverse-transcribed and gave similar yields of first-strand synthesis. Equal amounts of cDNA were then amplified by PCR with cdo-specific primers. The products were fractionated on an agarose gel, blotted to a nylon filter and hybridized to a 32P-labelled human cdo cDNA probe (FIGURE 10A). A PCR reaction with primers specific for β-actin was also performed on each sample as a control, and the ethidium bromide-stained gel demonstrating production of the appropriately-sized product is shown in FIGURE 10B. The lane designations represent the different normal and tumor cell lines. "RM" refers to an epithelial cell culture derived from a reduction mammoplasty. As shown in FIGURE 10A, four out of seven transformed cell lines showed a substantial decrease in cdo levels.

10. Deposit of Microorganisms

The following plasmids were deposited with the American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, MD 20852, on July 25, 1996:

rat cdo in plasmid pSKcdoa, accession no. 97667.
human cdo in the following plasmids:
pHC12, accession no. 97668.
pkSHA3-4, accession no. 97670.
pTA7, accession no. 97669.

Various publications are cited herein, the contents of which are hereby incorporated in their entireties.

11. Sequence Listing

(1) GENERAL INFORMATION
(i) APPLICANT: KRAUSS, ROBERT; GAO, MIN; KANG, JONG-SUN; FEINLEIB, JESSICA
(ii) TITLE OF INVENTION: CDO TUMOR SUPPRESSOR AND PROTEIN
(iii) NUMBER OF SEQUENCES: 3
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Brumbaugh, Graves, Donohue & Raymond
(B) STREET: 30 Rockefeller Plaza
(C) CITY: New York
(D) STATE: NY
(E) COUNTRY: U.S.A.
(F) ZIP: 10112
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 3.5 inch Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ Version 1.5
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 11-JUL-1997
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clark, Richard S
(B) REGISTRATION NUMBER: 26,154
(C) REFERENCE/DOCKET NUMBER: A30375-PCT
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 212-408-2558
(B) TELEFAX: 212-765-2519
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO : 1 : (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 6039 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTISENSE: NO (v) FRAGMENT TYPE: (vi) ORIGINAL SOURCE: (A) ORGANISM: Rat (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : GAGATGGCTC AGGGGTTAAG ACTACTTGTT GCTCTTGCAG AGGACGCAGG TTTAGTTCCC 60 TGCACCCACA TGGTAGTTCA CAGCCATCTG TAATTCCATT TCCAGAGGAA CCAAGTCCTC 120 TTCCACCCTC CAAGGGCACT GCATACACAT GGTCTACATA CATATATACA GGCAAAACAC 180 ATAAACAATT TTTAAAGGAA TAAACAATCT TATAAAGGTA GAGAACACAG CACACACAGA 240 GGTGGTGACT CACATCAGTG AGCCTCAGGC TCTCGGCAGC AGTAAATCTG TGTGAAGATG 300 GGCACCTACC TGGACAGAAG GCGAGGAAGA TGCCCAAAAC ACATCTTACT AGCAAGTAGG 360 CAGGAAAAGG GCAAGTATTT CAGCAGACAC AAGACAAAAT CAAAGTATTT ATTGTGTTTC 420 AGGTGAGAGG AGGAAAGAAA GACTTCAAGG AAATGCAGGA CACTGGACAC AGAAACATCA 480 GGGCTTGACG GATGATCAGA TAAGGACCAG GTCAGGGTGC AAGGGGTTTT TGACCTGTTG 540 GACAACAACA GAAGCCTAGC TCCGATCCAA TGCAATACAG CCTTTCTCAT CAATCCATTT 600 CTTAAACTAC AAGTATGTGA GAGGGAGCAC GCATATGAGT ATAGGTGTTT GCACAGGCCA 660 GATGCATCAG ATGCCCTGGA GTCACAAGTG ATTATGGGAA CTTAACTCAG GTCCCTGTAA 720 GAGCAGTACA GGCTCTTAAC CACCAAGGCA ACTCAGAAGC CCCCAATACA ATGTTGTCCT 780 GGCAAGGAGG AATCTCAGGG AATCACTGGA CATGCTAGTT TAGTGACTCG GTGGTGTCTG 840 GTCCTGAGAG AAAGGCATGC AATCACCTAC ACCTTTCAGT GATGTGGCAT GGCTCTGGAA 900 AGGGGGCTGC TAGCCGCAAA TCATCCTAAC TTGCCTGAGA AAATACATCC AATCCCAGAT 960 CCTGAGGATC TGAGGATACT CCTTAAGAGT CCAGGCTATG CCAAGGGCTC TCTATCTGGA 1020 AATTCCGGGT TCTTAGTTAC AGGTTCCACA AAGCCAGGTG CACAACTCCA GGCTCCGCCT 1080 CCACCAGCCC AGAGTCTGCC TGCTGTGCAT CCATGGCGGT ACATCACTAG ATAGCAACAC 1140 CGCCATCTCA GTACATCTGC ACAGATGTGG TGCCGGCTCC CGCTCACCTC ACTTGGCCCG 1200 CCCAGTGGAA CTAGGAACAA GGACATGAAC CTGGCACAGC CTACGTCAGA AGAGCCATGA 1260 CTCTGGCCGA TTGTGCTTGA TAGAGCCATT CACAGAACTA TATTGTGAAC TCACGTAGAA 1320 TGGTTCATTT CCAAAGGCCA TTGTTTAAAA CTCCTTAAGC GGGGTTGGGG ATTTAGCTCA 1380 GCGGTAGAGC GCTTACCTAG GAAGCGCCGC CGGGGCTTAG 
CCAGCCCTGC GGAGAGTAAG 1440 CTGAGGCCTG AGCTTTCTCA GAAAGTCTTC CCAGCCTACT TCTGTAATTG GGCGATATGC 1500 ATCCAGACCT CGGACCCTTA TGGAAACTGC TGTATGTTCT TGTGATTCTG TGTTCTTCTG 1560 TGAGCTCAGA CTTGGCAACT TATTTTATTT CTGAGCCACT CTCTGCTGTC CAGAAGCTTG 1620 GCAGACCCGT GGTCCTACAT TGTTCTGCTA AACCTGTTAC TGCCCGAATC TCATGGTTGC 1680 ATAATGGAAA ACGATTGGAC AGAAACACAG AACAGATAAA GATCCACCGG GGGACTTTGA 1740 CCATTCTGTC TCTCAACCCT TCCCTTTCTG GTTGCTACCA GTGTGTGGCC AACAACAGTG 1800 TTGGGGCTGT TGTCAGTGGC CCTGCCACAG TGTCCGCTGA CGCCCTGGCT GATTTCGATT 1860 CATCAACAAT GCATGTTATT ACTGCAGAAG AGAAAAACAC GGGTTTCATT GGCTGCAGGG 1920 TACCTGAGAG TAACCCCAAA GCTGAGGTGC GCTACAAGAT CCGGGGAAAG TGGCTGATGT 1980 ATTCCACAGG GAACTACATA ATCCTTCCCT CAGGAAATCT TCAGATTTTG AATGTATCCT 2040 CGAAGGATAA GGGATCGTAC AAGTGTGCTG CCTATAATCC TGTCACCAGT GAACTGAAAG 2100

TTGAACCCGC TGGCCGGAAG CTCCTTGTGA GTCGTCCTTC CTCGGATGGT TTTCACATTC 2160 TTCACCCTGC TCTTTCTCAG GCATTAGCTG TCCTTCCGCA CAGCCCTGTT ACCTTGGAGT 2220 GTGTAGTGAG TGGGGTCCCC GCCTCACAAG TGTATTGGCT GAAGGACGGG CAGGATTGCC 2280 TGTCAGGAAG CAACTGGAGA AGGCTGTACT CTCACCTGGC CACAGCTAGC ATCGACCCAG 2340 CGGATTCCGG GAACTATTCC TGTGTGGTGG GCAACAACAG TTCCGGAGAT GTTAAACACG 2400 TCACTTACAC AGTCAACGTA CTGGAGCACG CTTCAATTTC TAAAGGGCTG CACGATCAGA 2460 AGGTGTCCCT GGGGGCCACC GTACGTTTTA CCTGCGAAGT TCACGGGAAC CCAGCCCCCA 2520 ACCGCACCTG GTTTCATAAC GCACAGCCCA TCCGCCCCTC CTCACGGCAT CTGACGGAAG 2580 GAAGTGTTCT GAAGATCACC GGGGTCATCA TGGAGGATTC TGGGTTGTAT CAGTGCATGG 2640 CAGACAATGG GATTGGATTT ATGCAATCTA CTGGAAGACT TCAAATTGAA CAAGACAGTG 2700 GACAGAGGCC TGTCATAGTC ACCGCCCCAG CAAACGTAGA GGTGACGGAC GGAGACTTCG 2760 TGACTTTGTC TTGCAATGCC ACAGGAGAGC CTGTCCCGGT CATTCATTGG TACGGCCGCC 2820 ATGGATTGAT AACCAGCCAT CCATCTCAGG TCCTTAGGTC CAAATCTCGA AAGTCCCACC 2880 TCTTCCGACC TGGGGACCTG GACCCGGAGC CTGTCTACCT CATCATGTCC CAAGCGGGCT 2940 CGAGCTCTCT GTCCATCCAG GCAGTGACTC GGGAGCATGC TGGGAAATAC ACTTGCGAAG 3000 CTGTGAACAA ACATGGCAGC ACACAGTCAG AAGCGTTCCT CACAGTCGTT CCTTTTGAAA 3060 CAAACACAAA GGCAGAGCCA GTCACACCCT CCGAAGCTTC TCAGAACGAT GAACGAGACC 3120 CACGAGACGG TTCAGAGTCC GGCTTGCTGA ACTTGTTTCC AGTGAAGGTG CATTCCGGTG 3180 GAGTGGAATT GCCAGCAGAG AAAAATGCCT CTGTCCCCGA TGCTCCTAAC ATACTGAGCC 3240 CCCCACAGAC CCACATGCCA GACACATACA CCCTGGTGTG GAGGACGGGG AGGGATGGCG 3300 GAATGCCCAT CAACGCCTAT TTCGTGAAGT ACCGAAAGCT GGACGACGGC AGTGGTGCGG 3360 TAGGCAGCTG GCACACGGTT CGCGTCCCAG GGAGTGAGAG CGAGCTGCAT CTAACCGAAC 3420 TGGAGCCTTC AAGCCTTTAT GAAGTTTTGA TGGTGGCCAG AAGTGCAGTC GGCGAAGGAC 3480 AGCCTGCCAT GCTTACCTTC CGGACCAGCA AAGAAAAGAT GGCATCATCA AAAAACACCC 3540 AGGCGTCCTT TCCACCTGTG GGCATCCCTA AGCGGCCTGT AACTTCGGAG GCTTCCAACA 3600 GCAATTTTGG AGTTGTGCTT ACGGATTCCT CTAGGCATAG TGGAGTCCCA GAGGCACCAG 3660 ATCGACCTAC TATCTCGATG GCATCGGAGA CCTCAGTCTA TGTCACCTGG ATTCCCCGTG 3720 CAAATGGCGG CTCTCCCATC ACTGCCTTCA AGGTGGAATA TAAGCGGATG AAAAGTAGTG 3780 ACTGGCTGGT 
GGCTGCTGAA GACATCCCTC CTTCCAAACT CTCTGTGGAA GTCCGGAGTT 3840 TAGAGCCAGG TTCGATATAC AAATTTAGGG TCATTGTTAT CAACCATTAC GGTGAGAGTT 3900 TTCGGAGCTC GGCGTCCCGT CCCTACCAGG TGGCTGGTTT CCCAAATCGC TTTTCCAATC 3960 GCCCCATAAC TGGACCTCAC ATCGCATACA CAGAGGCTGT CAGCGATACT CAGATCATGC 4020 TAAAATGGAC GTATATTCCA TCAAGTAACA ATAACACTCC CATTCAAGGA TTCTATATCT 4080 ATTACCGGCC AACAGACAGT GACAATGACA GTGATTACAA GAGGGATGTT GTGGAAGGTT 4140 CAAAGCAGTG GCACACCATT GGTCACCTGC AGCCAGAGAC CTCCTATGAC ATTAAGATGC 4200 AGTGCTTTAA TGAAGGAGGA GAGAGCGAGT TCAGCAACGT GATGATCTGC GAGACTAAAG 4260 TGAAACGAGT TCCCGGAGCA TCGGAGTATC CCATGAAAGA GTTGAGCACT CCTCCCAGTT 4320 CTTCAGGGAA CGGAGGGAAC GTGGGGCCTG CAACCAGCCC TGCCAGGAGC AGCGACATGC 4380 TGTACCTCAT CGTCGGCTGT GTGCTTGGGG TTATGGTCCT CATTCTTCTG GTCTTCATTG 4440 CACTGTGTCT GTGGAAGAGT CGCCAACAGA GTGCCATACA GAAATATGAT CCTCCAGGAT 4500 ATCTCTACCA GGGGTCAGAG ATTAATGGGC AGATGGTAGA GTATACCACT CTCTCAGGAA 4560 CAGCCCGGAT CAATGGGAGT GTTCACGGAG GCTTCCTCAG CAAAGGCAGT CTCAGCAATG 4620 GCTGCTCTCA CCTCCACCAC AAAGGCCCCA ACGGAGTCAA TGGGATCCTG AATGGAACCA 4680 TAAATGGGGG GCTTTATTCT GCACACACCA GCTCCCTAAC CAGGACGTGT GTGGAGTTCG 4740 AACATCCTCA CCATCTAGTG AACGGTGGAG CAGTGTACAC GGCTGTCCCT CAGATGGACC 4800 CACTGGAATG CATTAATTGT CGGAATTGCC GGAACAACAA TAGGTGTTTC ACCAAAACCA 4860 ACAGTCCCCT TCCTGTGGTC CCAGTGGTAG CCTCTTATCC TCAGGATGGA CTGGAAATGA 4920 AGCCCCTCGG TGTCATGAAG TTCCCAGTGT GTCCAGTTTC CACAGTTCCT GATGGTGGCC 4980 AGATACCTGA GGAGTGCCTC AAGGACAGCG TGGCACCAGC ACCTACCCAG CGTACATGCC 5040 GCCAGGACAA CACAAGCGAC ATCAATTCTG ATTCCACAGA AGACACAGCA GAGTTCAACA 5100 GAGGAGACAG CAGCGGTCAT TCAGAAGCAG AGGACAAAGT TTTCAGTTGG AGTCCTCTTA 5160 TTTTATCACC TGTCTTGGAG GCTGCAGTGA GAAGACAGCG TGGTCTCCTC CTGGCCCCCC 5220 TCTAGACGGG CTGTCAGTGG TCCTTCAGCA AGCCCAAGAG ACCTGAGAGG AATGTGAGCA 5280 GGCCTCCCAC TGCAAGCCAG TAACTGCACC ACACAGGCCT GGGGACAAAC TGTGTGAAGG 5340 ACATTAATTC AAATCAGAGA AAACCATTAT TTATTTTTTG GTAGTAGTAA TGTCATGAAT 5400 GTATCTTAAA ATGTGCGCCC TTTTATATTA TTTATGCCTT ATGTTTTCCC TTCCCCATTT 5460

CTTCCTCCCC CTATTTTTTT TTAATGCAGA GTTTTTTTAA TCGTCTGGAG AGCAGGGGAT 5520 CATTCTGTGT CTTCTGGGGC CTTCAGTTGG CAGGCTTCCA TCTTCTGGCC TGTTCAGTTG 5580 TGGGGAGAGG ATTGCCCACG CCATTTCACA TTTGTCACCA GTCGTTCCTA GATGGAAAAA 5640 CTGTTACCTC TCCCATGTCG TCAGACTTTT GGGAACATTT AAAAATCAAC CATAGTTGTG 5700 ATCATATGTT TATAATAGCC AACCCAGCTG ACACACTTTT GAGTACCTTC CAGAAAAATA 5760 CTAATACTGA CTTATTTTCT CTCTGTGCCT GGGTACAAGT AGCGATCAAT CTTCTAGGTA 5820 TCAAATTAGC CAGTGTTGTT ATGAACACAG AAACATGTAA AGTCTGTTTG GATTATTGTA 5880 TTATATAGAA AGGGAACCAG ATTCAGAAGG AAAAGTAATG CCTTCATGGT TCCACTGCTC 5940 CTATTATCTT AATTTACAAA ATGGCCACCA CGAGTTCTTT GCACTACTTG CAGAGGTATA 6000 TAATAAATAC AAAAGTAAGG CCTTTAAACT GATAGTTTG 6039 (2) INFORMATION FOR SEQ ID NO : 2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 5688 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTISENSE: NO (v) FRAGMENT TYPE: (vi) ORIGINAL SOURCE: (A) ORGANISM: Rat (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2: GAGATGGCTC AGGGGTTAAG ACTACTTGTT GCTCTTGCAG AGGACGCAGG TTTAGTTCCC 60 TGCACCCACA TGGTAGTTCA CAGCCATCTG TAATTCCATT TCCAGAGGAA CCAAGTCCTC 120 TTCCACCCTC CAAGGGCACT GCATACACAT GGTCTACATA CATATATACA GGCAAAACAC 180 ATAAACAATT TTTAAAGGAA TAAACAATCT TATAAAGGTA GAGAACACAG CACACACAGA 240 GGTGGTGACT CACATCAGTG AGCCTCAGGC TCTCGGCAGC AGTAAATCTG TGTGAAGATG 300 GGCACCTACC TGGACAGAAG GCGAGGAAGA TGCCCAAAAC ACATCTTACT AGCAAGTAGG 360 CAGGAAAAGG GCAAGTATTT CAGCAGACAC AAGACAAAAT CAAAGTATTT ATTGTGTTTC 420 AGGTGAGAGG AGGAAAGAAA GACTTCAAGG AAATGCAGGA CACTGGACAC AGAAACATCA 480 GGGCTTGACG GATGATCAGA TAAGGACCAG GTCAGGGTGC AAGGGGTTTT TGACCTGTTG 540 GACAACAACA GAAGCCTAGC TCCGATCCAA TGCAATACAG CCTTTCTCAT CAATCCATTT 600 CTTAAACTAC AAGTATGTGA GAGGGAGCAC GCATATGAGT ATAGGTGTTT GCACAGGCCA 660 GATGCATCAG ATGCCCTGGA GTCACAAGTG ATTATGGGAA CTTAACTCAG GTCCCTGTAA 720 GAGCAGTACA GGCTCTTAAC CACCAAGGCA ACTCAGAAGC CCCCAATACA ATGTTGTCCT 780 GGCAAGGAGG AATCTCAGGG AATCACTGGA CATGCTAGTT TAGTGACTCG GTGGTGTCTG 840 
GTCCTGAGAG AAAGGCATGC AATCACCTAC ACCTTTCAGT GATGTGGCAT GGCTCTGGAA 900 AGGGGGCTGC TAGCCGCAAA TCATCCTAAC TTGCCTGAGA AAATACATCC AATCCCAGAT 960 CCTGAGGATC TGAGGATACT CCTTAAGAGT CCAGGCTATG CCAAGGGCTC TCTATCTGGA 1020 AATTCCGGGT TCTTAGTTAC AGGTTCCACA AAGCCAGGTG CACAACTCCA GGCTCCGCCT 1080 CCACCAGCCC AGAGTCTGCC TGCTGTGCAT CCATGGCGGT ACATCACTAG ATAGCAACAC 1140 CGCCATCTCA GTACATCTGC ACAGATGTGG TGCCGGCTCC CGCTCACCTC ACTTGGCCCG 1200 CCCAGTGGAA CTAGGAACAA GGACATGAAC CTGGCACAGC CTACGTCAGA AGAGCCATGA 1260 CTCTGGCCGA TTGTGCTTGA TAGAGCCATT CACAGAACTA TATTGTGAAC TCACGTAGAA 1320 TGGTTCATTT CCAAAGGCCA TTGTTTAAAA CTCCTTAAGC GGGGTTGGGG ATTTAGCTCA 1380 GCGGTAGAGC GCTTACCTAG GAAGCGCCGC CGGGGCTTAG CCAGCCCTGC GGAGAGTAAG 1440 CTGAGGCCTG AGCTTTCTCA GAAAGTCTTC CCAGCCTACT TCTGTAATTG GGCGATATGC 1500

ATCCAGACCT CGGACCCTTA TGGAAACTGC TGTATGTTCT TGTGATTCTG TGTTCTTCTG 1560 TGAGCTCAGA CTTGGCAACT TATTTTATTT CTGAGCCACT CTCTGCTGTC CAGAAGCTTG 1620 GCAGACCCGT GGTCCTACAT TGTTCTGCTA AACCTGTTAC TGCCCGAATC TCATGGTTGC 1680 ATAATGGAAA ACGATTGGAC AGAAACACAG AACAGATAAA GATCCACCGG GGGACTTTGA 1740 CCATTCTGTC TCTCAACCCT TCCCTTTCTG GTTGCTACCA GTGTGTGGCC AACAACAGTG 1800 TTGGGGCTGT TGTCAGTGGC CCTGCCACAG TGTCCGCTGA CGCCCTGGCT GATTTCGATT 1860 CATCAACAAT GCATGTTATT ACTGCAGAAG AGAAAAACAC GGGTTTCATT GGCTGCAGGG 1920 TACCTGAGAG TAACCCCAAA GCTGAGGTGC GCTACAAGAT CCGGGGAAAG TGGCTGATGT 1980 ATTCCACAGG GAACTACATA ATCCTTCCCT CAGGAAATCT TCAGATTTTG AATGTATCCT 2040 CGAAGGATAA GGGATCGTAC AAGTGTGCTG CCTATAATCC TGTCACCAGT GAACTGAAAG 2100 TTGAACCCGC TGGCCGGAAG CTCCTTGTGA GTCGTCCTTC CTCGGATGGT TTTCACATTC 2160 TTCACCCTGC TCTTTCTCAG GCATTAGCTG TCCTTCCGCA CAGCCCTGTT ACCTTGGAGT 2220 GTGTAGTGAG TGGGGTCCCC GCCTCACAAG TGTATTGGCT GAAGGACGGG CAGGATTGCC 2280 TGTCAGGAAG CAACTGGAGA AGGCTGTACT CTCACCTGGC CACAGCTAGC ATCGACCCAG 2340 CGGATTCCGG GAACTATTCC TGTGTGGTGG GCAACAACAG TTCCGGAGAT GTTAAACACG 2400 TCACTTACAC AGTCAACGTA CTGGAGCACG CTTCAATTTC TAAAGGGCTG CACGATCAGA 2460 AGGTGTCCCT GGGGGCCACC GTACGTTTTA CCTGCGAAGT TCACGGGAAC CCAGCCCCCA 2520 ACCGCACCTG GTTTCATAAC GCACAGCCCA TCCGCCCCTC CTCACGGCAT CTGACGGAAG 2580 GAAGTGTTCT GAAGATCACC GGGGTCATCA TGGAGGATTC TGGGTTGTAT CAGTGCATGG 2640 CAGACAATGG GATTGGATTT ATGCAATCTA CTGGAAGACT TCAAATTGAA CAAGTCGTTC 2700 CTTTTGAAAC AAACACAAAG GCAGAGCCAG TCACACCCTC CGAAGCTTCT CAGAACGATG 2760 AACGAGACCC ACGAGACGGT TCAGAGTCCG GCTTGCTGAA CTTGTTTCCA GTGAAGGTGC 2820 ATTCCGGTGG AGTGGAATTG CCAGCAGAGA AAAATGCCTC TGTCCCCGAT GCTCCTAACA 2880 TACTGAGCCC CCCACAGACC CACATGCCAG ACACATACAC CCTGGTGTGG AGGACGGGGA 2940 GGGATGGCGG AATGCCCATC AACGCCTATT TCGTGAAGTA CCGAAAGCTG GACGACGGCA 3000 GTGGTGCGGT AGGCAGCTGG CACACGGTTC GCGTCCCAGG GAGTGAGAGC GAGCTGCATC 3060 TAACCGAACT GGAGCCTTCA AGCCTTTATG AAGTTTTGAT GGTGGCCAGA AGTGCAGTCG 3120 GCGAAGGACA GCCTGCCATG CTTACCTTCC GGACCAGCAA AGAAAAGATG GCATCATCAA 3180 AAAACACCCA 
GGCGTCCTTT CCACCTGTGG GCATCCCTAA GCGGCCTGTA ACTTCGGAGG 3240 CTTCCAACAG CAATTTTGGA GTTGTGCTTA CGGATTCCTC TAGGCATAGT GGAGTCCCAG 3300 AGGCACCAGA TCGACCTACT ATCTCGATGG CATCGGAGAC CTCAGTCTAT GTCACCTGGA 3360 TTCCCCGTGC AAATGGCGGC TCTCCCATCA CTGCCTTCAA GGTGGAATAT AAGCGGATGA 3420 AAAGTAGTGA CTGGCTGGTG GCTGCTGAAG ACATCCCTCC TTCCAAACTC TCTGTGGAAG 3480 TCCGGAGTTT AGAGCCAGGT TCGATATACA AATTTAGGGT CATTGTTATC AACCATTACG 3540 GTGAGAGTTT TCGGAGCTCG GCGTCCCGTC CCTACCAGGT GGCTGGTTTC CCAAATCGCT 3600 TTTCCAATCG CCCCATAACT GGACCTCACA TCGCATACAC AGAGGCTGTC AGCGATACTC 3660 AGATCATGCT AAAATGGACG TATATTCCAT CAAGTAACAA TAACACTCCC ATTCAAGGAT 3720 TCTATATCTA TTACCGGCCA ACAGACAGTG ACAATGACAG TGATTACAAG AGGGATGTTG 3780 TGGAAGGTTC AAAGCAGTGG CACACCATTG GTCACCTGCA GCCAGAGACC TCCTATGACA 3840 TTAAGATGCA GTGCTTTAAT GAAGGAGGAG AGAGCGAGTT CAGCAACGTG ATGATCTGCG 3900 AGACTAAAGT GAAACGAGTT CCCGGAGCAT CGGAGTATCC CATGAAAGAG TTGAGCACTC 3960 CTCCCAGTTC TTCAGGGAAC GGAGGGAACG TGGGGCCTGC AACCAGCCCT GCCAGGAGCA 4020 GCGACATGCT GTACCTCATC GTCGGCTGTG TGCTTGGGGT TATGGTCCTC ATTCTTCTGG 4080 TCTTCATTGC ACTGTGTCTG TGGAAGAGTC GCCAACAGAG TGCCATACAG AAATATGATC 4140 CTCCAGGATA TCTCTACCAG GGGTCAGAGA TTAATGGGCA GATGGTAGAG TATACCACTC 4200 TCTCAGGAAC AGCCCGGATC AATGGGAGTG TTCACGGAGG CTTCCTCAGC AAAGGCAGTC 4260 TCAGCAATGG CTGCTCTCAC CTCCACCACA AAGGCCCCAA CGGAGTCAAT GGGATCCTGA 4320 ATGGAACCAT AAATGGGGGG CTTTATTCTG CACACACCAG CTCCCTAACC AGGACGTGTG 4380 TGGAGTTCGA ACATCCTCAC CATCTAGTGA ACGGTGGAGC AGTGTACACG GCTGTCCCTC 4440 AGATGGACCC ACTGGAATGC ATTAATTGTC GGAATTGCCG GAACAACAAT AGGTGTTTCA 4500 CCAAAACCAA CAGTCCCCTT CCTGTGGTCC CAGTGGTAGC CTCTTATCCT CAGGATGGAC 4560 TGGAAATGAA GCCCCTCGGT GTCATGAAGT TCCCAGTGTG TCCAGTTTCC ACAGTTCCTG 4620 ATGGTGGCCA GATACCTGAG GAGTGCCTCA AGGACAGCGT GGCACCAGCA CCTACCCAGC 4680 GTACATGCCG CCAGGACAAC ACAAGCGACA TCAATTCTGA TTCCACAGAA GACACAGCAG 4740 AGTTCAACAG AGGAGACAGC AGCGGTCATT CAGAAGCAGA GGACAAAGTT TTCAGTTGGA 4800 GTCCTCTTAT TTTATCACCT GTCTTGGAGG CTGCAGTGAG AAGACAGCGT GGTCTCCTCC 4860

TGGCCCCCCT CTAGACGGGC TGTCAGTGGT CCTTCAGCAA GCCCAAGAGA CCTGAGAGGA 4920 ATGTGAGCAG GCCTCCCACT GCAAGCCAGT AACTGCACCA CACAGGCCTG GGGACAAACT 4980 GTGTGAAGGA CATTAATTCA AATCAGAGAA AACCATTATT TATTTTTTGG TAGTAGTAAT 5040 GTCATGAATG TATCTTAAAA TGTGCGCCCT TTTATATTAT TTATGCCTTA TGTTTTCCCT 5100 TCCCCATTTC TTCCTCCCCC TATTTTTTTT TAATGCAGAG TTTTTTTAAT CGTCTGGAGA 5160 GCAGGGGATC ATTCTGTGTC TTCTGGGGCC TTCAGTTGGC AGGCTTCCAT CTTCTGGCCT 5220 GTTCAGTTGT GGGGAGAGGA TTGCCCACGC CATTTCACAT TTGTCACCAG TCGTTCCTAG 5280 ATGGAAAAAC TGTTACCTCT CCCATGTCGT CAGACTTTTG GGAACATTTA AAAATCAACC 5340 ATAGTTGTGA TCATATGTTT ATAATAGCCA ACCCAGCTGA CACACTTTTG AGTACCTTCC 5400 AGAAAAATAC TAATACTGAC TTATTTTCTC TCTGTGCCTG GGTACAAGTA GCGATCAATC 5460 TTCTAGGTAT CAAATTAGCC AGTGTTGTTA TGAACACAGA AACATGTAAA GTCTGTTTGG 5520 ATTATTGTAT TATATAGAAA GGGAACCAGA TTCAGAAGGA AAAGTAATGC CTTCATGGTT 5580 CCACTGCTCC TATTATCTTA ATTTACAAAA TGGCCACCAC GAGTTCTTTG CACTACTTGC 5640 AGAGGTATAT AATAAATACA AAAGTAAGGC CTTTAAACTG ATAGTTTG 5688 (2) INFORMATION FOR SEQ ID NO : 3: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 3976 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTISENSE: NO (v) FRAGMENT TYPE: (vi) ORIGINAL SOURCE: (A) ORGANISM: Human (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3: GACTTGGCAC CTTATTTTAC TTCTGAGCCG CTCTCTGCTG TCCAGAAACT TGGTGGACCT 60 GTAGTACTGC ATTGTTCTGC TCAACCTGTG ACCACTCGTA TCTCATGGCT GCATAACGGA 120 AAAACATTGG ATGGAAACCT GGAACATATT AAGATTCATC AGGGGACTCT GACAATTCTT 180 TCTCTCAACT CCTCTCTTTT GGGTTACTAC CAGTGCCTTG CCAACAATAG CATCGGTGCC 240 ATTGTGAGTG GCCCTGCGAC AGTATCTGTG GCAGTTCTTG GTGATTTTG GTTCATCCACA 300 AAGCATGTTA TTACAGCAGA AGAAAAAAGT GCTGGTTTCA TTGGCTGCA GGGTACCGGAG 360 AGTAACCCCA AAGCTGAGGT GCGCTATAAA ATCCGGGGAA AATGGCTGG AACATTCCACA 420 GAGAATTACT TAATCCTTCC ATCAGGAAAT CTTCAGATTT TGAATGTAT CCTTAGAGGAC 480 AAGGGATCAT ACAAATGTGC AGCTTATAAT CCTGTCACAC ATCAATTAA AAGTTGAACCT 540 ATTGGCCGAA AGCTCCTTGT GAGTCGTCCT TCTTCAGATG ATGTTCACA 
TTCTTCACCCC 600 ACCCATTCAC AGGCATTAGC TGTTCTTTCT CGTAGCCCTG TAACCTTGG AGTGTGTGGTG 660 AGTGGGGTCC CGGCTCCTCA AGTGTATTGG CTAAAGGACG GGCAGGACA TTGCACCAGGA 720 AGCAACTGGA GAAGGTTGTA TTCTCATCTT GCCACTGATA GCGTTGACC CGGCGGACTCC 780 GGAAACTATT CCTGCATGGC GGGAAACAAG TCTGGAGATG TAGAATATG TGACTTACATG 840 GTTAATGTAC TTGAACATGC TTCCATTTCT AAAGGACTAC AGGATCAGA TAGTGTCTCTG 900 GGTGCCACAG TACACTTTAC CTGCGACGTT CATGGGAACC CAGCCCCCA ACTGTACCTGG 960 TTTCACAATG CACAGCCTAT TCATCCTTCT GCACGACATC TAACTGCAG GAAACGGACTG 1020 AAAATCAGTG GGGTTACTGT GGAAGATGTT GGGATGTATC AGTGTGTAG CAGATAATGGG 1080 ATTGGATTTA TGCACTCTAC TGGAAGACTT GAAATTGAAA ATGACGGTG GATTCAAGCCA 1140 GTTATAATTA CGGCACCAGT AAGTGCAAAG GTTGCAGACG GAGACTTTG TTACTCTGTCC 1200 TGCAATGCCA GTGGGCTGCC GGTTCCGGTC ATTCGTTGGT ATGACAGCC ATGGATTGATA 1260

ACCAGCCATC CATCTCAAGT CCTGAGATCG AAATCCCGAA AATCACAGT TATCAAGACCT 1320 GAGGGCTTGA ACCTGGAGCC TGTGTACTTC GTCCTGTCCC AAGCTGGTG CAAGCTCTCTC 1380 CATATTCAGG CTGTGACTCA GGAACATGCG GGGAAATACA TCTGCGAAG CTGCAAATGAA 1440 CATGGTACCA CACAGGCAGA AGCATCTCTC ATGGTTGTTC CTTTTGAAA CAAATACAAAA 1500 GCAGAGACAG TCACACTTCC TGATGCTGCT CAGAATGATG ACAGAAGTA AGAGAGATGGT 1560 TCAGAAACTG GGTTACTGAG CTCATTTCCG GTGAAGGTCC ATCCCAGTG CAGTGGAATCA 1620 GCACCAGAGA AAAACGCCAG CGGCATCTCT GTTCCTGATG CCCCCATCA TACTGAGCCCC 1680 CCACAGACCC ACACACCAGA CACGTACAAC CTGGTGTGGA GGGCAGGCA AGGATGGTGGG 1740 CTGCCCATCA ATGCTTACTT TGTGAAGTAT CGAAAGCTGG ATGATGGGG TTGGCATGCTG 1800 GGAAGCTGGC ACACGGTTCG AGTCCCAGGA AGTGAAAATG AGCTCCATT TAGCTGAGCTG 1860 GAGCCATCTA GTCTTTATGA AGTCTTGATG GTAGCAAGAA GCGCAGCAG GTGAAGGCCAA 1920 CCTGCCATGA TTACCTTCCG AACCAGCAAA GAAAAAACAG CGTCATCAA AAAACACCCAG 1980 GCATCCTCTC CACCCGTGGG CATCCCTAAG TATCCCGTTG TTTCAGAGG CTGCAAACAAC 2040 AATTTTGGAG TGGTACTTAC AGATTCCTCT AGGCACAGTG GAGTTCCAG AGGCACCAGAT 2100 CGGCCTACCA TCTCCACTGC ATCAGAGACA TCAGTCTATG TCACTTGGA TTCCTCGGGCA 2160 AACGGGGGTT CTCCAATCAC TGCCTTCAAA GTCGAATATA AACGGATGA GGACCAGCAAT 2220 TGGCTGGTGG CAGCTGAAGA CATCCCTCCT TCCAAACTTT CAGTGGAAG TTCGTAGTTTA 2280 GAACCAGGTT CAACATACAA ATTTAGGGTC ATTGCCATCA ACCATTATG GTGAGAGTTTT 2340 CGGAGTTCAG CATCTCGTCC TTATCAAGTG GTTGGGTTCC CCAATCGCT TTTCCAGCCGT 2400 CCAATAACTG GACCTCACAT TGCATACACA GAGGCTGTCA GCGATACTC AGATCATGCTA 2460 AAGTGGACGT ACATTCCATC AAGTAACAAT AACACTCCCA TTCAAGGAT TTTATATCTAT 2520 TACCGACCAA CAGATAGTGA CAATGACAGT GATTACAAGA GGGATGTTG TAGAAGGTTCA 2580 AAGCAGTGGC ACATGATTGG CCACCTGCAG CCAGAAACCT CCTATGACA TTAAAATGCAA 2640 TGCTTCAATG AAGGAGGAGA AAGTGAATTT AGCAATGTGA TGATCTGCG AGACTAAAGTG 2700 AAACGTGTTC CTGGAGCTTC TGAATATCCT GTCAAAGACT TGAGTACCC CTCCAAATTCT 2760 TTGGGAAGTG GAGGAAATGT GGGGCCTGCA ACCAGCCCTG CCAGAAGCA GTGACATGTTA 2820 TATCTGATCG TTGGCTGTGT GCTGGGCGTC ATGGTCCTCA TTCTGATGG TTTTCATTGCA 2880 ATGTGCCTGT GGAAGAATCG CCAGCAGAAT ACCATACAAA AATATGACC CACCAGGATAT 2940 CTCTACCAAG 
GATCAGATAT GAACGGGCAG ATGGTGGACT ACACCACTC TCTCAGGAGCA 3000 AGTCAGATAA ATGGAAATGT TCACGGAGGC TTCCTAACCA ATGGCGGTC TCAGCAGTGGC 3060 TATTCCCACC TTCACCATAA GGTCCCCAAT GCAGTCAATG GAATTGTGA ATGGGAGCCTA 3120 AATGGAGGGC TTTACTCCGG GCACAGCAAC TCTCTAACCA GGACACACG TGGATTTTGAA 3180 CATCCTCATC ATCTAGTGAA TGGTGGTGGA ATGTACACGG CCGTGCCTC AGATTGACCCT 3240 CTGGAGTGTG TTAACTGCCG AAATTGTCGA AACAACAATA GGTGTTTCA CCAAAACCAAC 3300 AGCACTTTCA GCAGCAGCCC TCCTCCTGTG GTCCCTGTGG TAGCACCTT ATCCTCAGGAT 3360 GGTTTGGAAA TGAAGCCCCT CAGTCACGTG AAGGTGCCTG TATGCCTGA CTTCCGCAGTC 3420 CCTGATTGTG GCCAGTTGCC GGAGGAGAGC GTCAAGGACA ATGTGGAAC CAGTCCCTACT 3480 CAGCGTACCT GCTGTCAGGA CATTGTAAAT GACGTCAGCT CTGATGGCT CAGAAGATCCA 3540 GCAGAGTTCA GCAGAGGAGA CAGCTGTGCC CATTCAGAAA CAGAGATCA ACATTGTAAGT 3600 TGGAATGCTC TTATTTTGCC ACCTGTCCCG CAGGCTGTGC TGAGAAGAC AATGTGGTCTC 3660 CACCTGGCAT TCCTTTAGAC AGCCCGACAG AGGTCCTTCA GCAGCCCCG GGAAACCTGAG 3720 ACATGCAACA ACCAGTCATG TTCCAACTTC AAGCCGGTAA CACACAACA GGCTGGGAGCG 3780 AACTGTGTGA AGGACCTTAA TTCAAATCAG AGAAAATCAT TATTTATTT TTTTGTAGTAG 3840 TAATGTCATA TGAATGTATC TTAAAACGTG TGCCCTTTTA TATTATTTA TGCCTTAAATG 3900 TTTTCTTCCC CATTCCTTCC TCCCCCTCGG TAGGAAACAA CCTTGTTTT GCATAGTATTC 3960 AGTCACCTGG AGGGCA 3976