(82)
DEPARTMENT OF COMMERCE
Patent and Trademark Office
37 CFR Part 1
[Docket No: 960828235-8109-02]
RIN: 0651-AA88

Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Disclosures

AGENCY: Patent and Trademark Office, Commerce.

ACTION: Final Rule

SUMMARY: The Patent and Trademark Office (PTO) is amending the rules for submitting nucleotide or amino acid sequences in computer readable form (CRF) for patent applications. These amendments simplify the requirements of the rules, rearrange portions of the rules for better understanding and establish consistent rules to permit a single internationally acceptable computer readable form. Sequence Listings will be presented in an international, language neutral format using numeric identifiers rather than the current subject headings. The Paper Sequence Listing will preferably be a separately numbered section of the patent application. Sequences which contain fewer than four specifically identified nucleotides or amino acids will no longer be required to be submitted in computer readable form.

DATES: EFFECTIVE DATE: July 1, 1998. The incorporation by reference of certain publications listed in the regulations is approved by the Director of the Federal Register as of July 1, 1998.

APPLICABILITY DATE: Sections 1.821 through 1.825 as amended apply to applications filed on or after July 1, 1998, except for: (1) applications that claim the benefit of a prior application under 35 U.S.C. 120 filed before July 1, 1998, and which do not add subject matter involving a sequence listing subject to 1.821 through 1.825; and (2) reissue applications in which the application for the patent sought to be reissued was filed before July 1, 1998. Sections 1.821 through 1.825 apply during a reexamination proceeding if the application for the patent sought to be reexamined was filed on or after July 1, 1998.

FOR FURTHER INFORMATION CONTACT: Esther M. Kepplinger, by telephone at (703) 308-1495; by mail addressed to: Box Comments - Patents, Assistant Commissioner for Patents, Washington, DC 20231 marked to her attention; by facsimile to (703) 305-3935; or by electronic mail at esther.kepplinger@uspto.gov.

SUPPLEMENTAL INFORMATION: Sections 1.821 through 1.825 of title 37 provide a standardized format for the description of nucleotide and amino acid sequence data in patent applications and require the submission of such sequences in computer readable form (CRF). Sections 1.821 through 1.825 provide the following benefits to the PTO: (1) improved search capabilities; (2) improved interference detection; (3) more efficient examination; (4) cost savings for the input of the sequence data; (5) more efficient and accurate printing of sequences in patents; (6) exchange of the sequence data with other patent offices electronically; and (7) improved public access to the sequences electronically.

REASONS FOR THE CHANGES

In response to the needs of our customers, the procedural requirements found in former 1.821 through 1.825 have been reduced. Sections 1.821 through 1.825 are being amended to be consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (signed in 1998 and effective July 1, 1998). ST.25 replaces WIPO Standards ST.23 and ST.24 which deal with paper and electronic submissions of sequence listings.

A Meeting of International Authorities (MIA) under the Patent Cooperation Treaty (PCT) was held in November of 1994 to discuss simplification of sequence listing submission requirements. Under the previous PCT Regulations, each International Searching Authority, each International Preliminary Examining Authority and each designated/elected office was free to set the requirements for submission of sequence listings in paper and electronic form. This imposed a burden on applicants by requiring them to prepare sequence listings in many different formats.
In addition, sequence listings were required to be translated for consideration in the national stage at considerable cost to applicants and at the risk that the information could be inaccurately translated.

After the November 1994 MIA, the PTO, the European Patent Office (EPO) and the Japanese Patent Office (JPO) worked together with WIPO to create a new international standard which forms the basis of WIPO Standard ST.25 (1998). Sections 1.821 through 1.825 of 37 CFR, as amended herein, are consistent with WIPO Standard ST.25 (1998) and the PCT sequence listing requirements. Sequence listings prepared in accordance with 1.821 through 1.825 as amended generally will be acceptable in all countries which adhere to WIPO Standard ST.25 (1998). In addition, a sequence listing prepared in accordance with 1.821 through 1.825 as amended will be acceptable for the national stage in all PCT member countries which require the submission of a sequence listing. As a result of this rule change, applicants will experience a reduction in cost since only one sequence listing in paper and electronic form will need to be prepared and translations of this listing will not be needed.

All necessary changes to the text of 1.821 through 1.825 to reflect the new WIPO Standard ST.25 (1998) have been made. Each change is described below.
OVERVIEW OF THE CHANGES

The changes in this Final Rule include: (1) use of numeric identifiers to replace the language subject headings within the submission; (2) elimination of unnecessary and confusing data elements; (3) movement of the paper Sequence Listing to the end of the application, preferably with separately numbered pages; (4) elimination of the requirement to provide a submission for sequences with fewer than four specifically defined nucleotides or amino acids; (5) use of lower-case one-letter codes for nucleotide bases; (6) rearrangement of portions of the rules to improve their context; (7) clarification and simplification of the rules to aid in understanding; and (8) minor changes to accomplish harmonization with WIPO Standard ST.25 (1998) as well as the EPO and the JPO standards.

Amended 1.821 through 1.825 are not mandatory for: (1) applications that claim the benefit of a prior application under 35 U.S.C. 120 filed before July 1, 1998, and which do not add subject matter involving a sequence listing subject to 1.821 through 1.825; (2) reissue applications in which the application for the patent sought to be reissued was filed before July 1, 1998; and (3) reexamination proceedings if the application for the patent sought to be reexamined was filed before July 1, 1998. The PTO will accept and encourages the submission of sequence listings in compliance with amended 1.821 through 1.825 for any application or reexamination proceeding. All sequence listings (including the entire computer readable form) must be submitted in compliance with either 1.821 through 1.825 as amended in this Final Rule or (when permitted) former 1.821 through 1.825.

If the CRF for a new application would be identical to a compliant CRF already on file in the PTO, the applicant may make reference to the other application and the CRF in lieu of filing a duplicate CRF in the new application by following the procedures set forth in 1.821(e).
If exceptional circumstances do arise and certain applicants experience specific hardships in attempting to comply with amended 1.821 through 1.825, the PTO will consider a petition under 1.183 to waive certain requirements of 1.821 through 1.825.

A Notice of Proposed Rulemaking entitled "Changes Implementing Nucleotide and/or Amino Acid Sequence Listings" (Notice of Proposed Rulemaking) was published in the Federal Register at 61 FR 51855 (October 4, 1996), and in the Official Gazette of the Patent and Trademark Office, at 1191 Off. Gaz. Pat. Office 168 (October 29, 1996). Sections 1.821 through 1.825 as adopted contain several changes from these sections as proposed. This Final Rule provides a discussion of the content of the specific rules being amended, a description of the changes in the text of the proposed rules, and an explanation of the reasons supporting the changes. In addition, comments received in response to the Notice of Proposed Rulemaking are analyzed.

Discussion of Specific Rules and Changes from the Proposed Rules:

Title 37 of the Code of Federal Regulations, Part 1, is amended as follows.

Section 1.77

The proposed change to 37 CFR 1.77 was previously adopted. See Miscellaneous Changes to Patent Practice; Final Rule, 61 FR 42790 (August 19, 1996), 1190 Off. Gaz. Pat. Office 67 (September 17, 1996).

Section 1.821

Section 1.821 incorporates by reference the World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard ST.25 (1998), including Tables 1 through 6 of Appendix 2, in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies may be obtained from the World Intellectual Property Organization; 34 chemin des Colombettes; 1211 Geneva 20 Switzerland. Copies may be inspected at the Patent Search Room; Crystal Plaza 3, Lobby Level; 2021 South Clark Place; Arlington, VA 22202. Copies may also be inspected at the Office of the Federal Register, 800 North Capitol Street, NW, Suite 700, Washington, DC 20408.
These Tables are reproduced below.

WIPO Standard ST.25 (1998), Appendix 2, Table 1, provides that the bases of a nucleotide sequence should be represented using the following one-letter code for nucleotide sequence characters:

Table 1: one letter codes for nucleotide sequences

Symbol  Meaning         Origin of designation
a       a               adenine
g       g               guanine
c       c               cytosine
t       t               thymine
u       u               uracil
r       g or a          purine
y       t/u or c        pyrimidine
m       a or c          amino
k       g or t/u        keto
s       g or c          strong interactions 3 H-bonds
w       a or t/u        weak interactions 2 H-bonds
b       g or c or t/u   not a
d       a or g or t/u   not c
h       a or c or t/u   not g
v       a or g or c     not t, not u
n       (a or g or c or t/u) or (unknown or other)  any

WIPO Standard ST.25 (1998), Appendix 2, Table 2, provides that modified bases may be represented as the corresponding unmodified bases in the sequence itself, if the modified base is one of those listed below and the modification is further described in the Feature section of the Sequence Listing. The codes from the list below may be used in the description (i.e., the specification and drawings, or in the Sequence Listing) but these codes may not be used in the sequence itself.
Table 2: modified bases

Symbol    Meaning
ac4c      4-acetyl cytidine
chm5u     5-(carboxyhydroxylmethyl)uridine
cm        2-O-methylcytidine
cmnm5s2u  5-carboxymethylaminomethyl-2-thiouridine
cmnm5u    5-carboxymethylaminomethyluridine
d         dihydrouridine
fm        2-O-methylpseudouridine
gal q     beta, D-galactosylqueuosine
gm        2-O-methylguanosine
I         inosine
i6a       N6-isopentenyladenosine
m1a       1-methyladenosine
m1f       1-methylpseudouridine
m1g       1-methylguanosine
m1i       1-methylinosine
m22g      2,2-dimethylguanosine
m2a       2-methyladenosine
m2g       2-methylguanosine
m3c       3-methylcytidine
m5c       5-methylcytidine
m6a       N6-methyladenosine
m7g       7-methylguanosine
mam5u     5-methylaminomethyluridine
mam5s2u   5-methoxyaminomethyl-2-thiouridine
man q     beta, D-mannosylqueuosine
mcm5s2u   5-methoxycarbonylmethyl-2-thiouridine
mcm5u     5-methoxycarbonylmethyluridine
mo5u      5-methoxyuridine
ms2i6a    2-methylthio-N6-isopentenyladenosine
ms2t6a    N-((9-beta-D-ribofuranosyl-2-methylthiopurine-6-yl)carbamoyl)threonine
mt6a      N-((9-beta-D-ribofuranosylpurine-6-yl)N-methylcarbamoyl)threonine
mv        uridine-5-oxyacetic acid-methylester
o5u       uridine-5-oxyacetic acid
osyw      wybutoxosine
p         pseudouridine
q         queuosine
s2c       2-thiocytidine
s2t       5-methyl-2-thiouridine
s2u       2-thiouridine
s4u       4-thiouridine
t         5-methyluridine
t6a       N-((9-beta-D-ribofuranosylpurine-6-yl)-carbamoyl)threonine
tm        2-O-methyl-5-methyluridine
um        2-O-methyluridine
yw        wybutosine
x         3-(3-amino-3-carboxy-propyl)uridine, (acp3)u

WIPO Standard ST.25 (1998), Appendix 2, Table 3, provides that the amino acids should be represented using the following three-letter code with the first letter as a capital.
Table 3: amino acid three-letter codes

Symbol  Meaning
Ala     Alanine
Cys     Cysteine
Asp     Aspartic Acid
Glu     Glutamic Acid
Phe     Phenylalanine
Gly     Glycine
His     Histidine
Ile     Isoleucine
Lys     Lysine
Leu     Leucine
Met     Methionine
Asn     Asparagine
Pro     Proline
Gln     Glutamine
Arg     Arginine
Ser     Serine
Thr     Threonine
Val     Valine
Trp     Tryptophan
Tyr     Tyrosine
Asx     Asp or Asn
Glx     Glu or Gln
Xaa     unknown or other

WIPO Standard ST.25 (1998), Appendix 2, Table 4, provides that modified and unusual amino acids may be represented as the corresponding unmodified amino acids in the sequence itself if the modified or unusual amino acid is one of those listed below and the modification is further described in the Feature section of the Sequence Listing. The codes from the list below may be used in the description (i.e., the specification and drawings, or in the Sequence Listing) but these codes may not be used in the sequence itself.

Table 4: modified and unusual amino acid codes

Symbol  Meaning
Aad     2-Aminoadipic acid
bAad    3-aminoadipic acid
bAla    beta-Alanine, beta-Aminopropionic acid
Abu     2-Aminobutyric acid
4Abu    4-Aminobutyric acid, piperidinic acid
Acp     6-Aminocaproic acid
Ahe     2-Aminoheptanoic acid
Aib     2-Aminoisobutyric acid
bAib    3-Aminoisobutyric acid
Apm     2-Aminopimelic acid
Dbu     2,4-Diaminobutyric acid
Des     Desmosine
Dpm     2,2-Diaminopimelic acid
Dpr     2,3-Diaminopropionic acid
EtGly   N-Ethylglycine
EtAsn   N-Ethylasparagine
Hyl     Hydroxylysine
aHyl    allo-Hydroxylysine
3Hyp    3-Hydroxyproline
4Hyp    4-Hydroxyproline
Ide     Isodesmosine
aIle    allo-Isoleucine
MeGly   N-Methylglycine, sarcosine
MeIle   N-Methylisoleucine
MeLys   6-N-Methyllysine
MeVal   N-Methylvaline
Nva     Norvaline
Nle     Norleucine
Orn     Ornithine

WIPO Standard ST.25 (1998), Appendix 2, Table 5 provides for feature keys related to DNA sequences.
Table 5: Feature keys related to nucleotide sequences

Key              Description
allele           a related individual or strain contains stable, alternative forms of the same gene which differs from the presented sequence at this location (and perhaps others)
attenuator       1) region of DNA at which regulation of termination of transcription occurs, which controls the expression of some bacterial operons; 2) sequence segment located between the promoter and the first structural gene that causes partial termination of transcription
C_region         constant region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains. Includes one or more exons depending on the particular chain
CAAT_signal      CAAT box; part of a conserved sequence located about 75 bp upstream of the start point of eukaryotic transcription which may be involved in RNA polymerase binding; consensus=GG(C or T)CAATCT
CDS              coding sequence; sequence of nucleotides that corresponds with the sequence of amino acids in a protein (location includes stop codon). Feature includes amino acid conceptual translation
conflict         independent determinations of the “same” sequence differ at this site or region
D-loop           displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein
D-segment        diversity segment of immunoglobulin heavy chain, and T-cell receptor beta chain
enhancer         a cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter
exon             region of genome that codes for portion of spliced mRNA; may contain 5'UTR, all CDSs, and 3'UTR
GC_signal        GC box; a conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG
gene             region of biological interest identified as a gene and for which a name has been assigned
iDNA             Intervening DNA; DNA which is eliminated through any of several kinds of recombination
intron           a segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it
J_segment        joining segment of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma-chains
LTR              long terminal repeat, a sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses
mat_peptide      mature peptide or protein coding sequence; coding sequence for the mature or final peptide or protein product following post-translational modification.
                 The location does not include the stop codon (unlike the corresponding CDS)
misc_binding     site in nucleic acid which covalently or non-covalently binds another moiety that cannot be described by any other Binding key (primer_bind or protein_bind)
misc_difference  feature sequence is different from that presented in the entry and cannot be described by any other Difference key (conflict, unsure, old_sequence, mutation, variation, allele, or modified_base)
misc_feature     region of biological interest which cannot be described by any other feature key; a new or rare feature
misc_recomb      site of any generalized, site-specific or replicative recombination event where there is a breakage and reunion of duplex DNA that cannot be described by other recombination keys (iDNA and virion) or qualifiers of source key (/insertion_seq, /transposon, /proviral)
misc_RNA         any transcript or RNA product that cannot be defined by other RNA keys (prim_transcript, precursor_RNA, mRNA, 5'clip, 3'clip, 5'UTR, 3'UTR, exon, CDS, sig_peptide, transit_peptide, mat_peptide, intron, polyA_site, rRNA, tRNA, scRNA, and snRNA)
misc_signal      any region containing a signal controlling or altering gene function or expression that cannot be described by other Signal keys (promoter, CAAT_signal, TATA_signal, -35_signal, -10_signal, GC_signal, RBS, polyA_signal, enhancer, attenuator, terminator, and rep_origin)
misc_structure   any secondary or tertiary structure or conformation that cannot be described by other Structure keys (stem_loop and D-loop)
modified_base    the indicated nucleotide is a modified nucleotide and should be substituted for by the indicated molecule (given in the mod_base qualifier value)
mRNA             messenger RNA; includes 5'untranslated region (5'UTR), coding sequences (CDS, exon) and 3'untranslated region (3'UTR)
mutation         a related strain has an abrupt, inheritable change in the sequence at this location
N_region         Extra nucleotides inserted between rearranged immunoglobulin segments
old_sequence     the presented sequence revises a previous version of the sequence at this location
polyA_signal     recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA
polyA_site       site on an RNA transcript to which will be added adenine residues by post-transcriptional polyadenylation
precursor_RNA    any RNA species that is not yet the mature RNA product; may include 5'clipped region (5'clip), 5'untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3'untranslated region (3'UTR), and 3'clipped region (3'clip)
prim_transcript  primary (initial, unprocessed) transcript; includes 5'clipped region (5'clip), 5'untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3'untranslated region (3'UTR), and 3'clipped region (3'clip)
primer_bind      Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription. Includes site(s) for synthetic, e.g., PCR primer elements
promoter         region on a DNA molecule involved in RNA polymerase binding to initiate transcription
protein_bind     non-covalent protein binding site on nucleic acid
RBS              ribosome binding site
repeat_region    region of genome containing repeating units
repeat_unit      single repeat element
rep_origin       origin of replication; starting site for duplication of nucleic acid to give two identical copies
rRNA             mature ribosomal RNA; the RNA component of the ribonucleoprotein particle (ribosome) which assembles amino acids into proteins
S_region         Switch region of immunoglobulin heavy chains.
                 Involved in the rearrangement of heavy chain DNA leading to the expression of a different immunoglobulin class from the same B-cell
satellite        many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA
scRNA            small cytoplasmic RNA; any one of several small cytoplasmic RNA molecules present in the cytoplasm and (sometimes) nucleus of a eukaryote
sig_peptide      signal peptide coding sequence; coding sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane; leader sequence
snRNA            small nuclear RNA; any one of many small RNA species confined to the nucleus; several of the snRNAs are involved in splicing or other RNA processing reactions
source           identifies the biological source of the specified span of the sequence. This key is mandatory. Every entry will have, as a minimum, a single source key spanning the entire sequence. More than one source key per sequence is permissible
stem_loop        hairpin; a double-helical region formed by base-pairing between adjacent (inverted) complementary sequences in a single strand of RNA or DNA
STS              Sequence Tagged Site. Short, single-copy DNA sequence that characterizes a mapping landmark on the genome and can be detected by PCR.
                 A region of the genome can be mapped by determining the order of a series of STSs
TATA_signal      TATA box; Goldberg-Hogness box; a conserved AT-rich septamer found about 25 bp before the start point of each eukaryotic RNA polymerase II transcript unit which may be involved in positioning the enzyme for correct initiation; consensus=TATA(A or T)A(A or T)
terminator       sequence of DNA located either at the end of the transcript or adjacent to a promoter region that causes RNA polymerase to terminate transcription; may also be site of binding of repressor protein
transit_peptide  transit peptide coding sequence; coding sequence for an N-terminal domain of a nuclear-encoded organellar protein; this domain is involved in post-translational import of the protein into the organelle
tRNA             mature transfer RNA, a small RNA molecule (75-85 bases long) that mediates the translation of a nucleic acid sequence into an amino acid sequence
unsure           author is unsure of exact sequence in this region
V_region         Variable region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains. Codes for the variable amino terminal portion. Can be made up from V_segments, D_segments, N_regions, and J_segments
V_segment        variable segment of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains. Codes for most of the variable region (V_region) and the last few amino acids of the leader peptide
variation        a related strain contains stable mutations from the same gene (e.g., RFLPs, polymorphisms, etc.) which differ from the presented sequence at this location (and possibly others)
3'clip           3'-most region of a precursor transcript that is clipped off during processing
3'UTR            region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein
5'clip           5'-most region of a precursor transcript that is clipped off during processing
5'UTR            region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein
-10_signal       pribnow box; a conserved region about 10 bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TATAAT
-35_signal       a conserved hexamer about 35 bp upstream of the start point of bacterial transcription units; consensus=TTGACA [ ] or TGTTGACA [ ]

WIPO Standard ST.25 (1998), Appendix 2, Table 6 provides for feature keys related to protein sequences.

Table 6: Feature keys related to Protein sequences

Key                   Description
CONFLICT              Different papers report differing sequences
VARIANT               Authors report that sequence variants exist
VARSPLIC              Description of sequence variants produced by alternative splicing
MUTAGEN               Site which has been experimentally altered
MOD_RES               Post-translational modification of a residue
  ACETYLATION         N-terminal or other
  AMIDATION           Generally at the C-terminal of a mature active peptide
  BLOCKED             Undetermined N- or C-terminal blocking group
  FORMYLATION         Of the N-terminal methionine
  GAMMA-CARBOXYGLUTAMIC ACID HYDROXYLATION  Of asparagine, aspartic acid, proline or lysine
  METHYLATION         Generally of lysine or arginine
  PHOSPHORYLATION     Of serine, threonine, tyrosine, aspartic acid or histidine
  PYRROLIDONE CARBOXYLIC ACID  N-terminal glutamate which has formed an internal cyclic lactam
  SULFATATION         Generally of tyrosine
LIPID                 Covalent binding of a lipidic moiety
  MYRISTATE           Myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue
  PALMITATE           Palmitate group attached through a thioether bond to a cysteine residue or through an ester bond to a serine or threonine residue
  FARNESYL            Farnesyl group attached through a thioether bond to a cysteine residue
  GERANYL-GERANYL     Geranyl-geranyl group attached through a thioether bond to a cysteine residue
  GPI-ANCHOR          Glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein
  N-ACYL DIGLYCERIDE  N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages
DISULFID              Disulfide bond. The "FROM" and "TO" endpoints represent the two residues which are linked by an intra-chain disulfide bond. If the "FROM" and "TO" endpoints are identical, the disulfide bond is an interchain one and the description field indicates the nature of the cross-link
THIOLEST              Thiolester bond. The "FROM" and "TO" endpoints represent the two residues which are linked by the thiolester bond
THIOETH               Thioether bond. The "FROM" and "TO" endpoints represent the two residues which are linked by the thioether bond
CARBOHYD              Glycosylation site. The nature of the carbohydrate (if known) is given in the description field
METAL                 Binding site for a metal ion. The description field indicates the nature of the metal
BINDING               Binding site for any chemical group (co-enzyme, prosthetic group, etc.). The chemical nature of the group is given in the description field
SIGNAL                Extent of a signal sequence (prepeptide)
TRANSIT               Extent of a transit peptide (mitochondrial, chloroplastic, or for a microbody)
PROPEP                Extent of a propeptide
CHAIN                 Extent of a polypeptide chain in the mature protein
PEPTIDE               Extent of a released active peptide
DOMAIN                Extent of a domain of interest on the sequence.
                      The nature of that domain is given in the description field
CA_BIND               Extent of a calcium-binding region
DNA_BIND              Extent of a DNA-binding region
NP_BIND               Extent of a nucleotide phosphate binding region. The nature of the nucleotide phosphate is indicated in the description field
TRANSMEM              Extent of a transmembrane region
ZN_FING               Extent of a zinc finger region
SIMILAR               Extent of a similarity with another protein sequence. Precise information, relative to that sequence is given in the description field
REPEAT                Extent of an internal sequence repetition
HELIX                 Secondary structure - Helices, e.g., Alpha-helix, 3(10) helix, or Pi-helix
STRAND                Secondary structure - Beta-strand, e.g., Hydrogen bonded beta-strand, or Residue in an isolated beta-bridge
TURN                  Secondary structure - Turns, e.g., H-bonded turn (3-turn, 4-turn, or 5-turn)
ACT_SITE              Amino acid(s) involved in the activity of an enzyme
SITE                  Any other interesting site on the sequence
INIT_MET              The sequence is known to start with an initiator methionine
NON_TER               The residue at an extremity of the sequence is not the terminal residue. If applied to position 1, this signifies that the first position is not the N-terminus of the complete molecule. If applied to the last position, it signifies that this position is not the C-terminus of the complete molecule. There is no description field for this key
NON_CONS              Non consecutive residues. Indicates that two residues in a sequence are not consecutive and that there are a number of unsequenced residues between them
UNSURE                Uncertainties in the sequence.
                      Used to describe region(s) of a sequence for which the authors are unsure about the sequence assignment

In paragraph (a) of 1.821, the reference to "Standard ST.23: Recommendation for the presentation of Nucleotide and Amino Acid Sequence Listings in Patent Applications and in Published Patent Documents, paragraphs 8 through 12, April 1994" has been replaced by "Standard ST.25: Standard for the Presentation of Nucleotide and Amino Acid Sequence Listings in Patent Applications (1998), including Tables 1 through 6 in Appendix 2." These changes reflect the correct information with regard to the incorporated WIPO standard and the lists of symbols for nucleotide and amino acid sequence characters.

Further in paragraph (a) of 1.821, "(Hereinafter "WIPO Standard ST.23 (April, 1994)")" has been changed to "(Hereinafter "WIPO Standard ST.25 (1998)")." This change is necessary to indicate the correct abbreviation for new standard ST.25.

Further in paragraph (a) of 1.821, both occurrences of "Copies of ST.23" have been changed to "Copies of WIPO Standard ST.25 (1998)." This change is necessary to reflect the new standard number.

In paragraph (a)(1) of 1.821, "ST.23 (April 1994), paragraph 8" has been changed to "ST.25 (1998), Appendix 2, Table 1." This change reflects the correct information with regard to the incorporated WIPO standard and the list of symbols to be used for nucleotide sequence characters.

Further in paragraph (a)(1) of 1.821, "ST.23 (April 1994), paragraph 9" has been changed to "ST.25 (1998), Appendix 2, Table 2." This change reflects the correct information with regard to the incorporated WIPO standard and the list of modified bases which can be presented as unmodified nucleotide sequence characters.

In paragraph (a)(2) of 1.821, all three occurrences of "ST.23 (April 1994), paragraph 11" have been changed to "ST.25 (1998), Appendix 2, Table 3."
This change reflects the correct information with regard to the incorporated WIPO standard and the list of symbols to be used for amino acid sequence characters.

Further in paragraph (a)(2) of 1.821, "ST.23 (April 1994), paragraph 12" has been changed to "ST.25 (1998), Appendix 2, Table 4." This change reflects the correct information with regard to the incorporated WIPO standard and the list of modified or unusual amino acids which can be presented as unmodified amino acid sequence characters.

In paragraph (c) of 1.821, each of the three occurrences of the words "integer identifier" or "integer identifiers" has been changed to "sequence identifier" or "sequence identifiers" as appropriate. WIPO Standard ST.25 (1998) uses the term "sequence identifier" rather than "integer identifier." Thus, this change is necessary to achieve harmonization with the international standard.

In the last sentence of paragraph (c) of 1.821, the phrase "The sequence omitted shall appear following the integer identifier" of the proposed rule has been replaced by "the code "000" shall be used in place of the sequence." The response for the numeric identifier <160> shall include the total number of SEQ ID NOs, whether followed by a sequence or by the code "000". The code <000> should be put into <400>. This change permits flexibility in the preparation and amendment of Sequence Listings. It also makes the rule language-neutral and is consistent with WIPO Standard ST.25 (1998).

In paragraph (d) of 1.821, the words "integer identifier" have been changed to "sequence identifier." WIPO Standard ST.25 (1998) uses the term "sequence identifier" rather than "integer identifier." Thus, this change is necessary to achieve harmonization with the international standard.

In paragraphs (f), (g) and (h) of 1.821, the sentence "Such a statement must be a verified statement if made by a person not registered to practice before the Office" has been deleted.
The separate verification requirements in 1.821 have been eliminated in view of the recent amendment to 1.4(d) and 10.18. See Changes to Patent Practice and Procedure; Final Rule, 62 FR. 53131 (October 10, 1997), 1203 Off. Gaz. Pat. Office 63 (October 21, 1997). Paragraph (g) of 1.821 has also been amended to provide that the Office will provide a "period of time" (rather than one month) within which the applicant must comply with the requirements of 1.821(b) through (f) in order to avoid abandonment. Further in paragraph (f) of 1.821, the following has been added at the end of the first sentence, ", e.g., the information recorded in computer readable form is identical to the written sequence listing." WIPO Standard ST.25 (1998), paragraph 39, requires the language which has been added as an acceptable example for phrasing the required statement that the computer readable form and the written sequence listing are the same. Section 1.822 In paragraph (b) of 1.822, both references to WIPO Standard ST.23 (April 1994), paragraphs 8 and 11, as proposed have been changed to "WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3." These changes reflect the correct information with regard to the incorporated WIPO standard and the lists of symbols for nucleotide and amino acid sequence characters. Further in paragraph (b) of 1.822, "WIPO Standard ST.23 (April 1994), paragraphs 9 and 12" as proposed has been changed to "WIPO Standard ST.25 (1998), Appendix 2, Tables 2 and 4." This change reflects the correct information with regard to the incorporated WIPO standard and the lists of modified bases and modified or unusual amino acids which can be depicted in the Sequence Listing via the symbols for a corresponding unmodified base or amino acid. Further in paragraph (b) of 1.822, the symbol designating an unknown nucleotide base or a nucleotide base other than those listed in the WIPO standard was proposed as an upper case letter "N." 
This symbol has been changed to a lower case letter "n." This change is consistent with the use of lower case letters for the symbols representing the nucleotide bases. Further in paragraph (b) of 1.822, the language has been clarified to specifically state that each "n" or "Xaa" represents only a single residue. Thus, for example, a single "Xaa" may not be used to designate a string of four amino acids, each of which is unknown. This represents a codification of existing practice. Further in paragraph (b) of 1.822, the information required in the Feature section to explain the use of "n" or "Xaa" in a given sequence is referred to "as appropriate." Additional instruction is added at the end of paragraph (b) of 1.822 following "the Feature section" indicating ", preferably by including one or more feature keys listed in WIPO Standard ST.25 (1998), Appendix 2, Tables 5 and 6." This change specifies the preference for using the feature keys listed in the WIPO standard in order to aid applicants in filing a CRF which will comply with WIPO Standard ST.25 (1998). These feature keys are controlled vocabulary and are considered language neutral. Their use is required in a PCT patent application or a patent application in a foreign country which has adopted WIPO Standard ST.25 (1998). In paragraph (c)(1) of 1.822, "WIPO Standard ST.23 (April 1994), paragraph 8" as proposed has been changed to "WIPO Standard ST.25 (1998), Appendix 2, Table 1." This change reflects the correct information with regard to the incorporated WIPO standard and the list of symbols to be used for nucleotide sequence characters. In paragraph (d)(1) of 1.822, "WIPO Standard ST.23 (April 1994), paragraph 11" as proposed has been changed to "WIPO Standard ST.25 (1998), Appendix 2, Table 3." This change reflects the correct information with regard to the incorporated WIPO standard and the list of symbols to be used for amino acid sequence characters.
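The single-residue rule for "n" and "Xaa" discussed above can be illustrated with a minimal Python sketch. The function names and the space used to separate "Xaa" symbols are assumptions made for illustration only; they are not taken from 1.822 or from WIPO Standard ST.25 (1998).

```python
def unknown_residues(count, seq_type="PRT"):
    """Return one unknown-residue symbol per residue (hypothetical helper).

    Each "Xaa" or "n" stands for exactly one residue, so a run of four
    unknown amino acids needs four "Xaa" symbols, not one.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if seq_type == "PRT":
        # The space separator is an assumption chosen for readability.
        return " ".join(["Xaa"] * count)
    # Nucleotide sequences use the lower case letter "n".
    return "n" * count


def normalize_unknown_bases(sequence):
    """Replace the upper case "N" of the proposed rule with lower case "n"."""
    return sequence.replace("N", "n")
```

For example, unknown_residues(4) yields four separate "Xaa" symbols, which reflects the statement that a single "Xaa" may not designate a string of four unknown amino acids.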
In paragraph (d)(4) of 1.822, the section notes that enumeration requirements are applicable to amino acid sequences that are circular in configuration. The following language has been added to the end of the paragraph ", with the exception that the designation of the first amino acid of the sequence may be made at the option of the applicant." This change is necessary to provide consistency with its counterpart of circular nucleotide sequences as provided in paragraph (c)(7) of 1.822. This change is also consistent with WIPO Standard ST.25 (1998), paragraph 21. In paragraph (e) of 1.822, the words "integer identifiers" have been changed to "sequence identifiers." WIPO Standard ST.25 (1998) uses the term "sequence identifier" rather than "integer identifier." Thus, this change is necessary to achieve harmonization with the international standard. Section 1.823 In paragraph (a) of 1.823, the entire second sentence which read "On a separate page of the application specification, immediately prior to the claims, there shall be a reference to the presence of the `Sequence Listing' in a `Sequence Listing Annex.'" has been eliminated. The designation of the Sequence Listing as an annex to the specification was initially proposed in an early version of the international standard. This terminology is not used in WIPO Standard ST.25 (1998), however, and so it has also been eliminated from paragraph (a) of 1.823, as proposed. Simplification results as well by the elimination of the requirement that the Sequence Listing must be designated as an annex to the specification. In paragraph (a) of 1.823, the third sentence has been modified by deleting the words "shall appear in the `Sequence Listing Annex,' which is." As explained above, the current version of the international standard does not require designating the Sequence Listing as an annex to the specification. 
In paragraph (a) of 1.823, the words "preferably should be" have been added to the third sentence, before "numbered independently of the numbering of the remainder of the application" to describe the independent page numbering of the Sequence Listing in paper copy form. The term "preferably" was added for purposes of harmonization with WIPO Standard ST.25 (1998). In paragraph (a) of 1.823, the last clause of the third sentence "and shall be placed in the application file" has been deleted as unnecessary and potentially confusing now that the reference to a "Sequence Listing Annex" has been removed from this paragraph. In paragraph (a) of 1.823, the fourth sentence has been eliminated in its entirety. As explained above, the current version of the international standard does not require designating the Sequence Listing as an annex to the specification. In paragraph (a) of 1.823, in both occurrences in the fifth sentence and in the single occurrence in the sixth sentence, the word "shall" has been changed to "should." These changes are necessary for purposes of achieving consistency with WIPO Standard ST.25 (1998). In paragraph (b) of 1.823, the first sentence has been modified by the deletion of the words "in addition to and immediately preceding." This change is consistent with WIPO Standard ST.25 (1998). In paragraph (b) of 1.823, the fifth sentence has been deleted, eliminating the prohibition of any item of information occupying more than one line. This change is consistent with WIPO Standard ST.25 (1998). In paragraph (b) of 1.823, the last sentence has been deleted to eliminate the "rep" designation for data elements of the "Sequence Listing." Certain data elements may still be repeated within the listing but this change was made for harmonization of the table with WIPO Standard ST.25 (1998). In paragraph (b) of 1.823, the eighth sentence has been modified to reflect the new numeric numbering scheme, for harmonization with WIPO Standard ST.25 (1998). 
Specifically, "<100> through <193>" of the proposed rule has been changed to "<110> through <170>." The table in paragraph (b) of 1.823 has been changed to reflect the revised numbering scheme and data elements used in WIPO Standard ST.25 (1998). The specific changes are as follows: Numeric identifier "<100>, General Information," has been deleted from the proposed rules, as it is not present in WIPO Standard ST.25 (1998). Numeric identifier "<110>, Applicant," in the proposed rule, has been changed to indicate that "preferably" a maximum of ten names may be indicated. This change allows for more than ten names in the Applicant field for those instances in which such would be appropriate. This change is consistent with WIPO Standard ST.25 (1998). Numeric identifier "<120>, Title of Invention," in the proposed rule, has been changed to eliminate the limitation that the title be a maximum of four lines. This change allows applicants more flexibility with respect to the title. This change is consistent with WIPO Standard ST.25 (1998). Numeric identifier "<130>, Number of Sequences," in the proposed rule, has been changed to reflect "<130>, File Reference," as stated in WIPO Standard ST.25 (1998). This numeric identifier was indicated as "<183>, File Reference/Docket Number," in the rule as proposed. As proposed, this was an optional numeric identifier. The numeric identifier remains optional once the application has been assigned an application number, e.g., a serial number. This numeric identifier is now MANDATORY when an application number has not yet been assigned to the application, such as on the day the application is initially filed. This change will assist in the matching of sequence information submissions with an application in the event that either the paper copy or the computer readable form were to become separated from the remainder of the application. This change is consistent with WIPO Standard ST.25 (1998).
The Number of Sequences field identified as "<130>" in the proposed rule is now numbered "<160>" in 1.823 as adopted and redefined as "Number of SEQ ID NOs." The information associated with numeric identifiers "<140>" through "<153>," "Correspondence Address" through "Operating System" of the proposed rule, has been eliminated to reduce the burden on the applicant and to harmonize with WIPO Standard ST.25 (1998). Some of these numeric identifiers have been used in the new numbering scheme and have been associated with different information as indicated herein and in the Table of 1.823. One remaining numeric identifier within the Computer Readable Form section, "<154>, Software," of the proposed rule, will remain, with the exception that it has been reassigned the numeric identifier of "<170>" to reflect the numbering scheme presented in WIPO Standard ST.25 (1998). The main headings "<160>, Current Application Data" and "<170>, Prior Application Data," of the proposed rules, have been eliminated to harmonize with WIPO Standard ST.25 (1998) and reduce the number of fields in the Sequence Listing. The information that was to appear under these main headings remains in the rules but has been reassigned numeric identifiers <140> through <151>. The specific changes are as follows: "<160>" has been redefined as "Number of SEQ ID NOs"; "<161>, Application Number," of the proposed rule is now numbered as "<140>," and is defined as "Current Application Number"; "<162>, Filing Date," of the proposed rule is now numbered "<141>," and is defined as "Current Filing Date"; "<170>" has been redefined as "Software"; "<171>, Application Number," of the proposed rule is now numbered as "<150>," and is defined as "Prior Application Number"; "<172>, Filing Date," of the proposed rule is now numbered as "<151>," and is defined as "Prior Application Filing Date."
The numeric identifiers now numbered "<150>, Prior Application Number," and "<151>, Prior Application Filing Date," are now mandatory only in those instances in which a claim for priority with respect to those prior applications is being made under either 35 U.S.C. 119 or 120. This change will provide information in this regard when it is most useful and was necessary to harmonize these rules with WIPO Standard ST.25 (1998). Throughout the Sequence Listing, application numbers must be set forth as a combination of the two digit country code, as set forth in WIPO Standard ST.3, as well as an application number in accordance with WIPO Standard ST.13 or for an international application, the numbering system as set out in Section 307(a) of the Administrative Instructions under the PCT. Numeric identifiers "<180>, Attorney/Agent Information," through "<182>, Registration Number," of the proposed rule, have been eliminated to harmonize with WIPO Standard ST.25 (1998) and reduce the number of fields in the Sequence Listing. Numeric identifier "<183>, File Reference/Docket Number" of the proposed rule has been reassigned as numeric identifier "<130>," and redefined as "File Reference" in an effort to harmonize with WIPO Standard ST.25 (1998). The Telecommunication Information section, "<190>" through "<193>" of the proposed rules, has been eliminated in order to reduce the number of fields in the Sequence Listing and harmonize with WIPO Standard ST.25 (1998). Numeric identifier "<200>, Information for SEQ ID NO:#:", has been reassigned the numeric identifier "<210>, SEQ ID NO: #:" This numeric identifier indicates the integer, referred to in these final rules as the sequence identifier for both the sequence information and the actual sequence which follows the information.
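The renumbering of the general-information identifiers described above can be summarized as a lookup table. The following Python sketch is illustrative only: the names RENUMBERED, is_eliminated and translate are hypothetical, the eliminated ranges are treated as contiguous integer ranges (an assumption), and only the identifiers discussed up to this point are covered.

```python
# Proposed-rule identifier -> (final-rule identifier, final-rule name).
RENUMBERED = {
    "130": ("160", "Number of SEQ ID NOs"),
    "154": ("170", "Software"),
    "161": ("140", "Current Application Number"),
    "162": ("141", "Current Filing Date"),
    "171": ("150", "Prior Application Number"),
    "172": ("151", "Prior Application Filing Date"),
    "183": ("130", "File Reference"),
    "200": ("210", "SEQ ID NO"),
}


def is_eliminated(proposed_id):
    """True if the proposed-rule identifier was dropped from the final rule."""
    n = int(proposed_id)
    return (
        n == 100                # General Information
        or 140 <= n <= 153      # Correspondence Address .. Operating System
        or n in (160, 170)      # main headings for application data
        or 180 <= n <= 182      # Attorney/Agent Information .. Registration Number
        or 190 <= n <= 193      # Telecommunication Information
    )


def translate(proposed_id):
    """Map a proposed-rule identifier to its final-rule counterpart.

    Returns None for eliminated identifiers and (proposed_id, None) for
    identifiers that keep their number or are outside this sketch.
    """
    if proposed_id in RENUMBERED:
        return RENUMBERED[proposed_id]
    if is_eliminated(proposed_id):
        return None
    return (proposed_id, None)
```

The renumbered entries are checked before the eliminated ranges because some numbers, such as <130>, were vacated by one field and then reused for another.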
Numeric identifier "<210>, Sequence Characteristics," of the proposed rule has been eliminated in order to reduce the number of required elements in the Sequence Listing and harmonize with WIPO Standard ST.25 (1998). The valid responses for the mandatory numeric identifier "<212>, Type," have been changed from "N" and "A", as stated in the proposed rule, to "DNA," "RNA," and "PRT" (protein) in order to harmonize with WIPO Standard ST.25 (1998). A compound that is a mixture of DNA and RNA should be represented by "DNA." This change is consistent with WIPO Standard ST.25 (1998). Numeric identifier "<213>, Organism," has been added to the Sequence Listing of these final rules in an effort to harmonize with WIPO Standard ST.25 (1998). A response for the Organism identifier is MANDATORY. The valid responses are the scientific name, i.e., "Genus species", "Artificial Sequence", or "Unknown." Numeric identifier "<214>, Topology," of the proposed rule, has been eliminated to harmonize with WIPO Standard ST.25 (1998), and to reduce the burden on the applicant. Numeric identifier "<290>, Feature," has become numeric identifier "<220>, Feature." This numeric identifier has become MANDATORY for those sequences in which numeric identifier "<213>, Organism," is completed with either "Artificial Sequence" or "Unknown." This numeric identifier is also required if the compound sequence is a mixture of DNA and RNA. Numeric identifier "<220>, Feature" is a header only. No data are added immediately following this numeric identifier. These changes are required to achieve harmonization with WIPO Standard ST.25 (1998). Numeric identifier "<291>, Name/Key," has become numeric identifier "<221>, Name/Key." As proposed, the information provided was restricted to a maximum of four lines. The four line restriction has been removed to reduce the limitations on this field.
The comment section of this numeric identifier has been changed in that it now indicates that the selection of a feature name or feature key is preferably made from those listed in Tables 5 and 6 of WIPO Standard ST.25 (1998). These tables are reproduced above and this preference for the listed feature names and keys is consistent with the requirement of WIPO Standard ST.25 (1998). Numeric identifier "<292>, Location," has become "<222>, Location," so as to be consistent with the numeric identifiers contained in WIPO Standard ST.25 (1998). Numeric identifier "<294>, Other Information," has become numeric identifier "<223>, Other Information," so as to be consistent with the numeric identifiers contained in WIPO Standard ST.25 (1998). This numeric identifier has become MANDATORY for those sequences in which numeric identifier "<213>, Organism," is completed with either "Artificial Sequence" or "Unknown". Numeric identifier "<223>, Other Information," should contain source information in those instances when the organism is unknown or is an artificial sequence. For example, the source may be unknown because the material was isolated from a mixed bacterial culture rather than a pure culture. In such a case, numeric identifier "<223>, Other Information," should be completed by explaining the mixed culture source of the sequenced material. If a sequence is completely synthesized this should be indicated in numeric identifier "<223>, Other Information," while numeric identifier "<213>, Organism," would indicate "Artificial Sequence." This change has been made to accomplish harmonization between these rules and WIPO Standard ST.25 (1998) which contains the same mandatory requirement in this regard.
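The conditional requirements described above for numeric identifiers <212>, <213>, <220> and <223> can be expressed as a small consistency check. This Python sketch is a hypothetical illustration, not PTO or PatentIn software: the function name check_entry and the dictionary representation of a sequence entry are assumptions, and the DNA/RNA mixture case is not modeled because it cannot be detected from these fields alone.

```python
VALID_TYPES = {"DNA", "RNA", "PRT"}
NEEDS_SOURCE = {"Artificial Sequence", "Unknown"}


def check_entry(entry):
    """Return a list of problems found in one sequence entry.

    `entry` maps numeric identifier strings (e.g. "212") to their
    responses; this representation is assumed for illustration.
    """
    problems = []
    if entry.get("212") not in VALID_TYPES:
        problems.append("<212> must be DNA, RNA, or PRT")
    organism = entry.get("213", "").strip()
    if not organism:
        problems.append("<213> is mandatory")
    if entry.get("220", "").strip():
        problems.append("<220> is a header only and takes no data")
    if organism in NEEDS_SOURCE:
        if "220" not in entry:
            problems.append("<220> is required for this <213> response")
        if not entry.get("223", "").strip():
            problems.append("<223> is required for this <213> response")
    return problems
```

An entry with <213> completed as "Unknown" but without <220> and <223> would be flagged, whereas an entry naming an organism by its scientific name needs neither field under the rules as described.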
Numeric identifiers "<308>" through "<310>," referring to the "Patent Document Number," "Filing Date" and "Publication Date," of the proposed rule, have been moved to numeric identifiers "<310>" to "<312>," respectively, of this Final Rule in order to harmonize with the numeric numbering scheme of WIPO Standard ST.25 (1998). Citations in the Sequence Listing must comply with WIPO Standard ST.6 for publication numbers and WIPO Standard ST.16 for document codes. New numeric identifiers "<308>, Database Accession Number," and "<309>, Database Entry Date," have been added to the final rules to harmonize with WIPO Standard ST.25 (1998). These fields were added to the publication information section of WIPO Standard ST.25 (1998) to give an applicant more opportunity to further identify a published citation. Numeric identifier "<400>, Sequence Description: SEQ ID NO: #:" has been changed to "Sequence" for clarity. Also for clarity, the explanation in the table has been changed to "SEQ ID NO shall follow the numeric identifier and should appear on the line preceding the sequence." The format of the date fields has been changed throughout the Sequence Listing to accommodate for international conventions. All date fields referenced in the Sequence Listing shall conform to WIPO Standard ST.2. Because compliance with 1.821 through 1.825 as amended should produce Sequence Listings that are acceptable to all receiving offices, a standardized date field convention was required. Section 1.824 In paragraph (a)(6) of 1.824, ", the date on which the data were recorded on the computer readable form" was added after "title of the invention" to harmonize with WIPO Standard ST.25 (1998) requirements. While this requirement of 1.824 was proposed to be eliminated, that proposal is not adopted for purposes of harmonization with WIPO Standard ST.25 (1998). Also in paragraph (a)(6) of 1.824, " name and type of computer and" was deleted to reduce the requirements.
Section 1.825 In paragraphs (a), (b), and (d) of 1.825, the sentence "Such a statement must be a verified statement if made by a person not registered to practice before the Office" has been deleted. The separate verification requirements in 1.825 have been eliminated in view of the recent amendment to 1.4(d) and 10.18. See Changes to Patent Practice and Procedure; Final Rule, 62 FR. 53131 (October 10, 1997), 1203 Off. Gaz. Pat. Office 63 (October 21, 1997). Response to and Analysis of Comments Six written comments were received in response to the Notice of Proposed Rulemaking. Several of these comments address the three specific queries set forth in the Notice of Proposed Rulemaking. The first query posed in the Notice of Proposed Rulemaking was: (1) Should the PTO accept voluntary submissions of computer readable forms and Sequence Listings where a D-amino acid is contained in the sequence? If such voluntary submissions are accepted, should there be a restriction on the choice of identifying a D-amino acid by an Xaa or by its L-amino acid counterpart abbreviation? Comment: One comment indicated that not only should the PTO accept voluntary submissions under these rules where a D-amino acid is contained in the sequence, the Office should make such submissions mandatory and designated by an Xaa. One comment indicated that sequences containing D-amino acids should not be in the PTO databases. Response: Upon careful consideration, the PTO has decided to accept voluntary submissions of protein sequences containing D-amino acids. The PTO strongly encourages anyone making such voluntary submissions to identify a D-amino acid with an Xaa, describing the D-amino acid in the Features section of the Sequence Listing. This section is indicated by numeric identifiers <220> through <223> in 37 CFR 1.823. Procedural concerns compel this acceptance of voluntary submissions. Computer readable forms are processed prior to examination.
It is cumbersome to establish a viable procedure to redact any voluntary submissions out of the PTO database. The use of Xaa to indicate a D-amino acid, should such sequence information be submitted in accordance with these rules, is encouraged so as to alert anyone reviewing the sequence that a particular amino acid is other than a naturally occurring L-amino acid and to more accurately depict the extent of similarities between such a sequence and the L-amino acid containing sequences present in a database being searched for examination or other purposes. Because the sequence databases do not currently include D-amino acids in sequences and thus are not searchable for such sequences, the submission of those sequences containing D-amino acids will not be made mandatory. The second query posed in the proposed rules was: (2) Should the provisions of 37 CFR 1.821(c) be altered to exclude some prior art sequences from inclusion in the Sequence Listing even though they are presented in a patent application disclosure as sequences? Should the reference to an accession number of an admitted prior art sequence in a publicly available, electronic, sequence database suffice and exclude that sequence from the requirements of the sequence rules? Comment: Four comments indicated that known "prior art" sequences should not be required in the Sequence Listing. A referral to a publicly available, electronic, sequence database for access to such "prior art" sequences would be an acceptable alternative to two of those commenting on this aspect; the other two did not address this point. The reasons given for excluding such sequences are the expense and time required by applicants and their representatives in the inclusion of "prior art" sequences that are considered to be "non-inventive". Reducing the bulk of the paper copy of the Sequence Listing was also mentioned. 
Response: The requirement to submit all disclosed sequences in the format required by 1.821 through 1.825 is maintained. This point was discussed with officials from the JPO and EPO. The offices have considered the stated concerns with regard to costs to applicants. Sections 1.821 through 1.825 do not require any information to be disclosed in the form of a sequence, but rather require a particular format whenever information is presented in the form of a sequence. Those applicants for whom compliance with the rules remains a significant hardship may petition under 1.183 for a waiver of the applicable requirement of 1.821 through 1.825. The technical and legal concerns mentioned in the Notice of Proposed Rulemaking still exist concerning the use of an alternative reference to a publicly available, electronic, sequence database. These concerns are: (1) What constitutes a publicly available, electronic, sequence database? (2) Would the USPTO and the other patent offices which have similar rules be required to produce a list of internationally accepted databases? (3) What would be the criteria for such acceptance? (4) An additional issue would exist involving electronic records maintenance: is there any assurance that once information is contained in a database that it will be retained and available indefinitely without alteration? Changes to the information in nucleic acid sequence databases resulting from the discovery of sequencing errors are well-known. (5) Does the mere existence of the sequence information in such a record constitute reasonable means of retrieval? In other words, would one need some text basis or other identifier to retrieve the information? Additional reasons for the inclusion of these prior art sequences remain relevant.
These reasons are: (1) the assessment of whether a particular sequence falls within the requirements of the current rules is simple; (2) the general public is assured that all patents which contain any sequence information contain all of the sequence information in the Sequence Listing and all sequences are available in a computer accessible form; and (3) as a publication, the contextual association of new and old information is potentially unique to the patent and very valuable to anyone assessing the state of the art at the time of a patented invention, and thus are desirable to be present in electronic form in association with that patent. The third query posed in the proposed rules was: (3) Should Sequence Listings filed in an international application filed under the PCT be published only electronically and made available for retrieval electronically by an accession number from several sequence repositories? Comment: Two comments were received in response to this query, one in favor and one opposed to limiting the publication of the Sequence Listing to an electronic form for published PCT applications in the international phase. Response: At this time paper copies of the Sequence Listings filed as part of the description will continue to be published in applications filed under PCT. The PTO together with the EPO, JPO and WIPO will continue to discuss the possibility of electronic publication. However, any implementation of such electronic publication in lieu of publication in paper form will not be undertaken until further study has been completed. Comment: One comment suggested that informative English words be placed next to the numerical headings in the Sequence Listing as printed in a U.S. patent. Response: The PTO will provide English words corresponding to the numeric identifiers in the printed U.S. patents. Comment: One comment suggested addition of a descriptive comment line to the Sequence Listing. 
Response: The "Other Information" line in the Features section, which is numeric identifier <223> in 1.823, provides for a description of a sequence. While completion of this section is only mandatory when the sequence contains "n", "Xaa", a modified or unusual L-amino acid or a modified base, it is frequently completed in other circumstances. Comment: One comment requested we harmonize 1.821 through 1.825 with PCT, EPO and other authorities such that the differences in the requirements for Sequence Listing submissions are minimal. Response: This change to 1.821 through 1.825 is the result of such an effort to harmonize the PTO, PCT, EPO and JPO Sequence Listing requirements to the extent possible. The requirements of newly developed WIPO ST.25 are substantially identical to the requirements of amended 1.821 through 1.825. PatentIn Version 2.0 software, now available, is drafted to meet all of the requirements of WIPO Standard ST.25 (1998). The requirements of 1.821 through 1.825, however, are less stringent than the requirements of WIPO Standard ST.25 (1998). Thus, applicants who wish to file in countries which adhere to WIPO Standard ST.25 (1998) should consider the following when not using PatentIn Version 2.0: 1. The WIPO Standard ST.25 (1998) does not permit submissions using a Macintosh computer. 2. The WIPO Standard ST.25 (1998) does not accept the range of media permitted by amended 1.821 through 1.825. 3. The answers in field <221> and <222> must use selections from Tables 5 and 6 of WIPO Standard ST.25 (1998) to comply with that standard. The terms from these Tables are considered language neutral vocabulary. 4. Any free text in numeric identifier <223> of a Sequence Listing will not be translated and thus must also appear in the specification of applications filed under WIPO Standard ST.25 (1998) for compliance. 5. A CRF filed after the filing of an application under the PCT does not form part of the disclosure and will not be published in the pamphlet. 
6. Paragraph 39 of WIPO Standard ST.25 (1998) requires the specific wording "the information recorded on the form is identical to the written sequence listing." 7. WIPO Standard ST.25 (1998), paragraph 24, requires spaces between specified numeric identifiers in the Sequence Listing. Comment: One comment requested a WINDOWS based version of PatentIn. Response: A WINDOWS based version of PatentIn, PatentIn 2.0, has been developed through a Trilaterally-sponsored joint initiative and is being made available. Comment: One comment expressed concern over application of the doctrine of equivalents by the courts to sequence-based claim language. Response: Sections 1.821 through 1.825 do not establish a disclosure requirement, nor do they alter the requirements of 35 U.S.C. 112. They merely require a particular format whenever information is presented in the form of a sequence. The use of sequence identification numbers (SEQ ID NO: #) only provides a shorthand way for applicants to refer to sequence information. These identification numbers do not in any way restrict the manner in which an invention can be claimed. Similarly, the use of this format does not impact the potential interpretations and legal determinations that could be made with respect to claims containing information in the form of a nucleotide or amino acid sequence. Comment: One comment requested the flexibility to use single-letter amino acid codes. Response: Sections 1.821 through 1.825 as amended do not constrain an applicant from using single letter codes in the disclosure. The requirements of the sequence searching and the sequence storage mechanisms include only the three-letter codes, thus the need for the constraint on the Sequence Listing information. There is no such restriction on the sequence format in the body of the disclosure or in the figures imposed by 1.821 through 1.825, or any of the rules of practice; only the format for the Sequence Listing is specified by 1.821 through 1.825.
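Because single-letter codes may appear in the disclosure while the Sequence Listing uses three-letter codes, an applicant may need to convert between the two. The Python sketch below shows one way this could be done for the twenty common L-amino acids plus "X" for an unknown residue. The function name, the space separator and the handling of unrecognized letters are assumptions; any real submission should be checked against WIPO Standard ST.25 (1998), Appendix 2, Table 3.

```python
# Conventional one-letter to three-letter amino acid abbreviations.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Xaa",  # unknown or other residue
}


def to_three_letter(one_letter_sequence):
    """Convert a one-letter amino acid string to three-letter codes.

    Raises ValueError for letters outside the table rather than guessing.
    """
    codes = []
    for position, letter in enumerate(one_letter_sequence.upper(), start=1):
        if letter not in ONE_TO_THREE:
            raise ValueError(f"unrecognized code {letter!r} at position {position}")
        codes.append(ONE_TO_THREE[letter])
    # The space separator is an assumption chosen for readability.
    return " ".join(codes)
```

Rejecting unrecognized letters, instead of silently mapping them to "Xaa", keeps the sketch from hiding a typing error behind an unknown-residue symbol.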
Review Under the Paperwork Reduction Act of 1995. Notwithstanding any other provision of law, no person is required to respond to nor shall a person be subject to a penalty for failure to comply with a collection of information subject to the requirements of the Paperwork Reduction Act (PRA) unless that collection of information displays a currently valid OMB control number. This rule contains collections of information requirements subject to the PRA. The principal impact of this Final Rule is: (1) elimination of certain requirements of 1.821 through 1.825; and (2) revision of 1.821 through 1.825 for consistency with WIPO Standard ST.25 (1998), which will permit Sequence Listings to be presented in an international, language neutral format. The public reporting burden for these collections of information has been approved by the Office of Management and Budget (OMB) under OMB control number 0651-0024. The public reporting burden for this collection of information is estimated to average 80 minutes per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the information. Send comments regarding this burden estimate or any other aspect of the data requirements, including suggestions for reducing this burden, to Esther M. Kepplinger at the address specified above or to the Office of Information and Regulatory Affairs of OMB, New Executive Office Bldg., 725 17th St. NW, rm. 10235, Washington, DC 20230, Attn: Desk Officer for the Patent and Trademark Office. Other Considerations. This Final Rule is in conformity with the requirements of the Regulatory Flexibility Act (5 U.S.C. 601 et seq.), Executive Order 12612 (October 26, 1987), and the Paperwork Reduction Act of 1995 (44 U.S.C. 3501 et seq.). It has been determined that this rulemaking is not significant for the purposes of Executive Order 12866 (September 30, 1993).
The Assistant General Counsel for Legislation and Regulation of the Department of Commerce has certified to the Chief Counsel for Advocacy, Small Business Administration that this Final Rule would not have a significant impact on a substantial number of small entities (Regulatory Flexibility Act, 5 U.S.C. 605(b)). The principal impact of this Final Rule is: (1) elimination of certain requirements of 1.821 through 1.825; and (2) revision of 1.821 through 1.825 for consistency with WIPO Standard ST.25 (1998), which will permit Sequence Listings to be presented in an international, language neutral format. The Office has determined that this Final Rule has no Federalism implications affecting the relationship between the National Government and the States as outlined in Executive Order 12612. List of Subjects 37 CFR Part 1 Administrative practice and procedure, Courts, Freedom of Information, Inventions and patents, Incorporation by reference, Reporting and record-keeping requirements, Small businesses. For the reasons set forth in the preamble and under the authority granted to the Commissioner of Patents and Trademarks by 35 U.S.C. 6, Title 37 of the Code of Federal Regulations, part 1, is amended as follows: PART 1 - RULES OF PRACTICE IN PATENT CASES 1. The authority citation for 37 CFR part 1 continues to read as follows: Authority: 35 U.S.C. 6, unless otherwise noted. 2. Section 1.821 is revised to read as follows: 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and/or amino acid sequences as used in 1.821 through 1.825 are interpreted to mean an unbranched sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides. Branched sequences are specifically excluded from this definition. Sequences with fewer than four specifically defined nucleotides or amino acids are specifically excluded from this section.
"Specifically defined" means those amino acids other than "Xaa" and those nucleotide bases other than "n" defined in accordance with the World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard ST.25: Standard for the Presentation of Nucleotide and Amino Acid Sequence Listings in Patent Applications (1998), including Tables 1 through 6 in Appendix 2, herein incorporated by reference. (Hereinafter "WIPO Standard ST.25 (1998)"). This incorporation by reference was approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO Standard ST.25 (1998) may be obtained from the World Intellectual Property Organization; 34 chemin des Colombettes; 1211 Geneva 20 Switzerland. Copies of ST.25 may be inspected at the Patent Search Room; Crystal Plaza 3, Lobby Level; 2021 South Clark Place; Arlington, VA 22202. Copies may also be inspected at the Office of the Federal Register, 800 North Capitol Street, NW, Suite 700, Washington, DC. Nucleotides and amino acids are further defined as follows: (1) Nucleotides: Nucleotides are intended to embrace only those nucleotides that can be represented using the symbols set forth in WIPO Standard ST.25 (1998), Appendix 2, Table 1. Modifications, e.g., methylated bases, may be described as set forth in WIPO Standard ST.25 (1998), Appendix 2, Table 2, but shall not be shown explicitly in the nucleotide sequence. (2) Amino acids: Amino acids are those L-amino acids commonly found in naturally occurring proteins and are listed in WIPO Standard ST.25 (1998), Appendix 2, Table 3. Those amino acid sequences containing D-amino acids are not intended to be embraced by this definition.
Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated using the symbols shown in WIPO Standard ST.25 (1998), Appendix 2, Table 3 with the modified positions; e.g., hydroxylations or glycosylations, being described as set forth in WIPO Standard ST.25 (1998), Appendix 2, Table 4, but these modifications shall not be shown explicitly in the amino acid sequence. Any peptide or protein that can be expressed as a sequence using the symbols in WIPO Standard ST.25 (1998), Appendix 2, Table 3 in conjunction with a description in the Feature section to describe, for example, modified linkages, cross links and end caps, non-peptidyl bonds, etc., is embraced by this definition. (b) Patent applications which contain disclosures of nucleotide and/or amino acid sequences, in accordance with the definition in paragraph (a) of this section, shall, with regard to the manner in which the nucleotide and/or amino acid sequences are presented and described, conform exclusively to the requirements of 1.821 through 1.825. (c) Patent applications which contain disclosures of nucleotide and/or amino acid sequences must contain, as a separate part of the disclosure, a paper copy disclosing the nucleotide and/or amino acid sequences and associated information using the symbols and format in accordance with the requirements of 1.822 and 1.823. This paper copy is hereinafter referred to as the "Sequence Listing." Each sequence disclosed must appear separately in the "Sequence Listing." Each sequence set forth in the "Sequence Listing" shall be assigned a separate sequence identifier. The sequence identifiers shall begin with 1 and increase sequentially by integers. If no sequence is present for a sequence identifier, the code "000" shall be used in place of the sequence.
The response for the numeric identifier <60> shall include the total number of SEQ ID NOs, whether followed by a sequence or by the code "000." (d) Where the description or claims of a patent application discuss a sequence that is set forth in the "Sequence Listing" in accordance with paragraph (c) of this section, reference must be made to the sequence by use of the sequence identifier, preceded by "SEQ ID NO:" in the text of the description or claims, even if the sequence is also embedded in the text of the description or claims of the patent application. (e) A copy of the "Sequence Listing" referred to in paragraph (c) of this section must also be submitted in computer readable form in accordance with the requirements of 1.824. The computer readable form is a copy of the "Sequence Listing" and will not necessarily be retained as a part of the patent application file. If the computer readable form of a new application is to be identical with the computer readable form of another application of the applicant on file in the Patent and Trademark Office, reference may be made to the other application and computer readable form in lieu of filing a duplicate computer readable form in the new application if the computer readable form in the other application was compliant with all of the requirements of these rules. The new application shall be accompanied by a letter making such reference to the other application and computer readable form, both of which shall be completely identified. In the new application, applicant must also request the use of the compliant computer readable "Sequence Listing" that is already on file for the other application and must state that the paper copy of the "Sequence Listing" in the new application is identical to the computer readable copy filed for the other application.
(f) In addition to the paper copy required by paragraph (c) of this section and the computer readable form required by paragraph (e) of this section, a statement that the content of the paper and computer readable copies are the same must be submitted with the computer readable form, e.g., a statement that "the information recorded in computer readable form is identical to the written sequence listing." (g) If any of the requirements of paragraphs (b) through (f) of this section are not satisfied at the time of filing under 35 U.S.C. 111(a) or at the time of entering the national stage under 35 U.S.C. 371, applicant will be notified and given a period of time within which to comply with such requirements in order to prevent abandonment of the application. Any submission in reply to a requirement under this paragraph must be accompanied by a statement that the submission includes no new matter. (h) If any of the requirements of paragraphs (b) through (f) of this section are not satisfied at the time of filing an international application under the Patent Cooperation Treaty (PCT), which application is to be searched by the United States International Searching Authority or examined by the United States International Preliminary Examining Authority, applicant will be sent a notice necessitating compliance with the requirements within a prescribed time period. Any submission in reply to a requirement under this paragraph must be accompanied by a statement that the submission does not include matter which goes beyond the disclosure in the international application as filed. 
If applicant fails to timely provide the required computer readable form, the United States International Searching Authority shall search only to the extent that a meaningful search can be performed without the computer readable form and the United States International Preliminary Examining Authority shall examine only to the extent that a meaningful examination can be performed without the computer readable form. 3. Section 1.822 is revised to read as follows: 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall conform to the requirements of paragraphs (b) through (e) of this section. (b) The code for representing the nucleotide and/or amino acid sequence characters shall conform to the code set forth in the tables in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3. This incorporation by reference was approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of ST.25 may be obtained from the World Intellectual Property Organization; 34 chemin des Colombettes; 1211 Geneva 20 Switzerland. Copies of ST.25 may be inspected at the Patent Search Room; Crystal Plaza 3, Lobby Level; 2021 South Clark Place; Arlington, VA 22202. Copies may also be inspected at the Office of the Federal Register, 800 North Capitol Street, NW, Suite 700, Washington, DC. No code other than that specified in these sections shall be used in nucleotide and amino acid sequences. A modified base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those listed in WIPO Standard ST.25 (1998), Appendix 2, Tables 2 and 4, and the modification is also set forth in the Feature section.
Otherwise, each occurrence of a base or amino acid not appearing in WIPO Standard ST.25 (1998), Appendix 2, Tables 1 and 3, shall be listed in a given sequence as "n" or "Xaa," respectively, with further information, as appropriate, given in the Feature section, preferably by including one or more feature keys listed in WIPO Standard ST.25 (1998), Appendix 2, Tables 5 and 6. (c) Format representation of nucleotides: (1) A nucleotide sequence shall be listed using the lower-case letter for representing the one-letter code for the nucleotide bases set forth in WIPO Standard ST.25 (1998), Appendix 2, Table 1. (2) The bases in a nucleotide sequence (including introns) shall be listed in groups of 10 bases except in the coding parts of the sequence. Leftover bases, fewer than 10 in number, at the end of noncoding parts of a sequence shall be grouped together and separated from adjacent groups of 10 or 3 bases by a space. (3) The bases in the coding parts of a nucleotide sequence shall be listed as triplets (codons). The amino acids corresponding to the codons in the coding parts of a nucleotide sequence shall be typed immediately below the corresponding codons. Where a codon spans an intron, the amino acid symbol shall be typed below the portion of the codon containing two nucleotides. (4) A nucleotide sequence shall be listed with a maximum of 16 codons or 60 bases per line, with a space provided between each codon or group of 10 bases. (5) A nucleotide sequence shall be presented, only by a single strand, in the 5' to 3' direction, from left to right. (6) The enumeration of nucleotide bases shall start at the first base of the sequence with number 1. The enumeration shall be continuous through the whole sequence in the direction 5' to 3'. The enumeration shall be marked in the right margin, next to the line containing the one-letter codes for the bases, and giving the number of the last base of that line.
(7) For those nucleotide sequences that are circular in configuration, the enumeration method set forth in paragraph (c)(6) of this section remains applicable with the exception that the designation of the first base of the nucleotide sequence may be made at the option of the applicant. (d) Representation of amino acids: (1) The amino acids in a protein or peptide sequence shall be listed using the three-letter abbreviation with the first letter as an upper case character, as in WIPO Standard ST.25 (1998), Appendix 2, Table 3. (2) A protein or peptide sequence shall be listed with a maximum of 16 amino acids per line, with a space provided between each amino acid. (3) An amino acid sequence shall be presented in the amino to carboxy direction, from left to right, and the amino and carboxy groups shall not be presented in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first mature protein, with the number 1. When presented, the amino acids preceding the mature protein, e.g., pre-sequences, pro-sequences, pre-pro-sequences and signal sequences, shall have negative numbers, counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids shall start at the first amino acid at the amino terminal as number 1. It shall be marked below the sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth in this section remains applicable for amino acid sequences that are circular in configuration, with the exception that the designation of the first amino acid of the sequence may be made at the option of the applicant. (5) An amino acid sequence that contains internal terminator symbols (e.g., "Ter", "*", or ".", etc.) may not be represented as a single amino acid sequence, but shall be presented as separate amino acid sequences. 
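The layout rules in paragraphs (c) and (d) of this section lend themselves to a mechanical check. The following is a minimal Python sketch, not part of the rule, and the function names are hypothetical. It assumes a noncoding nucleotide sequence (coding regions listed as codons are not handled) and an amino acid sequence numbered from the amino terminal as number 1 (negative numbering of pre-sequences is not handled). Marking position 1 in addition to every fifth residue follows the sample listing in Appendix A and is an assumption.

```python
def format_noncoding_nucleotides(seq: str) -> list[str]:
    """Lower case, groups of 10, at most 60 bases per line, with the
    number of the last base of each line in the right margin."""
    seq = seq.lower()
    lines = []
    for start in range(0, len(seq), 60):
        chunk = seq[start:start + 60]
        groups = [chunk[i:i + 10] for i in range(0, len(chunk), 10)]
        last_base = start + len(chunk)
        # 65 columns = 60 bases plus 5 separating spaces on a full line.
        lines.append(f"{' '.join(groups):<65} {last_base:>6}")
    return lines


def format_amino_acids(residues: list[str]) -> list[str]:
    """Three-letter residues, 16 per line, with position 1 and every
    fifth position marked on the line below the residues."""
    lines = []
    for start in range(0, len(residues), 16):
        row = residues[start:start + 16]
        lines.append(" ".join(row))
        marks = []
        for offset in range(len(row)):
            number = start + offset + 1
            show = number == 1 or number % 5 == 0
            # Each mark is 3 wide so it lines up under its residue.
            marks.append(f"{number:<3}" if show else "   ")
        mark_line = " ".join(marks).rstrip()
        if mark_line:
            lines.append(mark_line)
    return lines
```

Calling `format_amino_acids` on the 18 residues of SEQ ID NO: 2 in the sample listing yields one full line of 16 residues, its numbering line, and a final line holding the last two residues.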
(e) A sequence with a gap or gaps shall be presented as a plurality of separate sequences, with separate sequence identifiers, with the number of separate sequences being equal in number to the number of continuous strings of sequence data. A sequence that is made up of one or more noncontiguous segments of a larger sequence or segments from different sequences shall be presented as a separate sequence. 4. Section 1.823 is revised to read as follows: 1.823 Requirements for nucleotide and/or amino acid sequences as part of the application papers. (a) The "Sequence Listing" required by 1.821(c), setting forth the nucleotide and/or amino acid sequences and associated information in accordance with paragraph (b) of this section, must begin on a new page and must be titled "Sequence Listing". The "Sequence Listing" preferably should be numbered independently of the numbering of the remainder of the application. Each page of the "Sequence Listing" should contain no more than 66 lines and each line should contain no more than 72 characters. A fixed-width font should be used exclusively throughout the "Sequence Listing." (b) The "Sequence Listing" shall, except as otherwise indicated, include the actual nucleotide and/or amino acid sequence, the numeric identifiers and their accompanying information as shown in the following table. The numeric identifier shall be used only in the "Sequence Listing." The order and presentation of the items of information in the "Sequence Listing" shall conform to the arrangement given below. Each item of information shall begin on a new line and shall begin with the numeric identifier enclosed in angle brackets as shown. The submission of those items of information designated with an "M" is mandatory. The submission of those items of information designated with an "O" is optional. Numeric identifiers <110> through <170> shall only be set forth at the beginning of the "Sequence Listing." 
The following table illustrates the numeric identifiers.

Numeric Identifier | Definition | Comments and Format | Mandatory (M) or Optional (O)
<110> | Applicant | Preferably max. of 10 names; one name per line; preferable format: Surname, Other Names and/or Initials | M
<120> | Title of Invention | | M
<130> | File Reference | Personal file reference | M when filed prior to assignment of appl. number
<140> | Current Application Number | Specify as: US 07/999,999 or PCT/US96/99999 | M, if available
<141> | Current Filing Date | Specify as: yyyy-mm-dd | M, if available
<150> | Prior Application Number | Specify as: US 07/999,999 or PCT/US96/99999 | M, if applicable include priority documents under 35 USC 119 and 120
<151> | Prior Application Filing Date | Specify as: yyyy-mm-dd | M, if applicable
<160> | Number of SEQ ID NOs | Count includes total number of SEQ ID NOs | M
<170> | Software | Name of software used to create the Sequence Listing | O
<210> | SEQ ID NO:#: | Response shall be an integer representing the SEQ ID NO shown | M
<211> | Length | Respond with an integer expressing the number of bases or amino acid residues | M
<212> | Type | Whether presented sequence molecule is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be "DNA." In addition, the combined DNA/RNA molecule shall be further described in the <220> to <223> feature section. | M
<213> | Organism | Scientific name, i.e. Genus/species, Unknown or Artificial Sequence. In addition, the "Unknown" or "Artificial Sequence" organisms shall be further described in the <220> to <223> feature section. | M
<220> | Feature | Leave blank after <220>. <221-223> provide for a description of points of biological significance in the sequence. | M, under the following conditions: if "n," "Xaa," or a modified or unusual L-amino acid or modified base was used in a sequence; if ORGANISM is "Artificial Sequence" or "Unknown"; if molecule is combined DNA/RNA.
<221> | Name/Key | Provide appropriate identifier for feature, preferably from WIPO Standard ST.25 (1998), Appendix 2, Tables 5 and 6 | M, under the following conditions: if "n," "Xaa," or a modified or unusual L-amino acid or modified base was used in a sequence
<222> | Location | Specify location within sequence; where appropriate state number of first and last bases/amino acids in feature | M, under the following conditions: if "n," "Xaa," or a modified or unusual L-amino acid or modified base was used in a sequence
<223> | Other Information | Other relevant information; four lines maximum | M, under the following conditions: if "n," "Xaa," or a modified or unusual L-amino acid or modified base was used in a sequence; if ORGANISM is "Artificial Sequence" or "Unknown"; if molecule is combined DNA/RNA.
<300> | Publication Information | Leave blank after <300> | O
<301> | Authors | Preferably max of ten named authors of publication; specify one name per line; preferable format: Surname, Other Names and/or Initials | O
<302> | Title | | O
<303> | Journal | | O
<304> | Volume | | O
<305> | Issue | | O
<306> | Pages | | O
<307> | Date | Journal date on which data published; specify as yyyy-mm-dd, MMM-yyyy or Season-yyyy | O
<308> | Database Accession Number | Accession number assigned by database including database name | O
<309> | Database Entry Date | Date of entry in database; specify as yyyy-mm-dd or MMM-yyyy | O
<310> | Patent Document Number | Document number; for patent-type citations only. Specify as, for example, US 07/999,999 | O
<311> | Patent Filing Date | Document filing date, for patent-type citations only; specify as yyyy-mm-dd | O
<312> | Publication Date | Document publication date, for patent-type citations only; specify as yyyy-mm-dd | O
<313> | Relevant Residues | FROM (position) TO (position) | O
<400> | Sequence | SEQ ID NO should follow the numeric identifier and should appear on the line preceding the actual sequence | M

5.
Section 1.824 is revised to read as follows: 1.824 Form and format for nucleotide and/or amino acid sequence submissions in computer readable form. (a) The computer readable form required by 1.821(e) shall meet the following specifications: (1) The computer readable form shall contain a single "Sequence Listing" as either a diskette, series of diskettes, or other permissible media outlined in paragraph (c) of this section. (2) The "Sequence Listing" in paragraph (a)(1) of this section shall be submitted in American Standard Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer readable form may be created by any means, such as word processors, nucleotide/amino acid sequence editors or other custom computer programs; however, it shall conform to all specifications detailed in this section. (4) File compression is acceptable when using diskette media, so long as the compressed file is in a self-extracting format that will decompress on one of the systems described in paragraph (b) of this section. (5) Page numbering shall not appear within the computer readable form version of the "Sequence Listing" file. (6) All computer readable forms shall have a label permanently affixed thereto on which has been hand-printed or typed: the name of the applicant, the title of the invention, the date on which the data were recorded on the computer readable form, the operating system used, a reference number, and an application serial number and filing date, if known.
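A file intended as the computer readable form can be screened against the ASCII requirement of paragraph (a)(2) of this section, together with the line terminator and pagination requirements stated in paragraph (b) of this section. The following is a minimal Python sketch, not part of the rule. The function name is hypothetical, and treating the ASCII form feed character as the "hard page break" code is an assumption.

```python
def check_crf_text(data: bytes) -> list[str]:
    """Return a list of problems found in candidate CRF file contents.
    An empty list means none of the screened problems were found."""
    problems = []
    # ASCII text only: every byte must be in the 7-bit range.
    if any(byte > 0x7F for byte in data):
        problems.append("non-ASCII byte found")
    # Assumption: a form feed (0x0C) is what a "hard page break" looks like.
    if b"\x0c" in data:
        problems.append("form feed (hard page break) found")
    # After removing every CR+LF pair, no lone CR or LF should remain.
    remainder = data.replace(b"\r\n", b"")
    if b"\r" in remainder or b"\n" in remainder:
        problems.append("line terminator other than CR+LF found")
    return problems
```

The sketch screens file contents only. It does not address the media, labeling, or page numbering requirements.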
(b) Computer readable form submissions must meet these format requirements: (1) Computer: IBM PC/XT/AT, or compatibles, or Apple Macintosh; (2) Operating System: MS-DOS, Unix or Macintosh; (3) Line Terminator: ASCII Carriage Return plus ASCII Line Feed; (4) Pagination: Continuous file (no "hard page break" codes permitted). (c) Computer readable form files submitted may be in any of the following media: (1) Diskette: 3.50 inch, 1.44 Mb storage; 3.50 inch, 720 Kb storage; 5.25 inch, 1.2 Mb storage; 5.25 inch, 360 Kb storage. (2) Magnetic tape: 0.5 inch, up to 24000 feet; Density: 1600 or 6250 bits per inch, 9 track; Format: Unix tar command; specify blocking factor (not "block size"); Line Terminator: ASCII Carriage Return plus ASCII Line Feed. (3) 8mm Data Cartridge: Format: Unix tar command; specify blocking factor (not "block size"); Line Terminator: ASCII Carriage Return plus ASCII Line Feed. (4) CD-ROM: Format: ISO 9660 or High Sierra Format. (5) Magneto Optical Disk: Size/Storage Specifications: 5.25 inch, 640 Mb. (d) Computer readable forms that are submitted to the Office will not be returned to the applicant. 6. Section 1.825 is revised to read as follows: 1.825 Amendments to or replacement of sequence listing and computer readable copy thereof. (a) Any amendment to the paper copy of the "Sequence Listing" (1.821(c)) must be made by the submission of substitute sheets. Amendments must be accompanied by a statement that indicates support for the amendment in the application, as filed, and a statement that the substitute sheets include no new matter. (b) Any amendment to the paper copy of the "Sequence Listing," in accordance with paragraph (a) of this section, must be accompanied by a substitute copy of the computer readable form (1.821(e)) including all previously submitted data with the amendment incorporated therein, accompanied by a statement that the copy in computer readable form is the same as the substitute copy of the "Sequence Listing."
(c) Any appropriate amendments to the "Sequence Listing" in a patent; e.g., by reason of reissue or certificate of correction, must comply with the requirements of paragraphs (a) and (b) of this section. (d) If, upon receipt, the computer readable form is found to be damaged or unreadable, applicant must provide, within such time as set by the Commissioner, a substitute copy of the data in computer readable form accompanied by a statement that the substitute data is identical to that originally filed. 7. Appendix A to Subpart G to Part 1 is revised to read as follows:

Appendix A To Subpart G to Part 1 - Sample Sequence Listing

<110> Smith, John
      Smith, Jane
<120> Example of a Sequence Listing
<130> 01-00001
<140> US 08/999,999
<141> 1998-02-28
<150> EP 91000000
<151> 1997-12-31
<160> 2
<170> PatentIn ver. 2.0
<210> 1
<211> 403
<212> DNA
<213> Paramecium aurelia
<220>
<221> CDS
<222> 341..394
<300>
<301> Doe, Richard
<302> Isolation and Characterization of a Gene Encoding a Protease from Paramecium sp.
<303> Journal of Fictional Genes
<304> 1
<305> 4
<306> 1 - 7
<307> 1988-06-20
<400> 1
ctactctact ctactctcat ctactatctt ctttggatct ctgagtctgc ctgagtggta    60
ctcttgagtc ctggagatct ctcctctcac atgtgatcgt cgagactgac cgatagatcg   120
ctgactgact ctgagatagt cgagcccgta cgagacccgt cgagggtgac agagagtggg   180
cgcgtgcgcg cagagcgccg cgccggtgcg cgcgcgagtg cgcggtgggc cgcgcgaggg   240
ctttcgcggc agcggcggcg ctttccggcg cgcgcccgtc cgcccctaga cctgagaggt   300
cttctcttcc ctcctcttca ctagagaggt ctatatatac atg gtt tca atg ttc     355
                                            Met Val Ser Met Phe
                                            1               5
agc ttg tct ttc aaa tgg cct gga ttt tgt ttg ttt gtt tgtttg          403
Ser Leu Ser Phe Lys Trp Pro Gly Phe Cys Leu Phe Val
                10                  15
<210> 2
<211> 18
<212> PRT
<213> Paramecium aurelia
<400> 2
Met Val Ser Met Phe Ser Leu Ser Phe Lys Trp Pro Gly Phe Cys Leu
1               5                   10                  15
Phe Val

May 22, 1998
BRUCE A. LEHMAN
Assistant Secretary of Commerce and Commissioner of Patents and Trademarks

[1211 OG 82]
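The numeric-identifier layout used in the sample listing of Appendix A can be read mechanically, which also allows the <160> count to be compared with the number of <210> entries. The following is a minimal Python sketch, not part of the rule. The function names and the abbreviated `sample` text are hypothetical, and the sketch assumes each item begins at the start of a line with its identifier in angle brackets.

```python
import re

# A line that opens an item: a three-digit identifier in angle brackets,
# followed by the response (which may be empty, as for <220>).
IDENTIFIER_LINE = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_listing(text):
    """Split listing text into (identifier, response lines) pairs, in order."""
    items = []
    for line in text.splitlines():
        match = IDENTIFIER_LINE.match(line)
        if match:
            response = match.group(2).strip()
            items.append((match.group(1), [response] if response else []))
        elif line.strip() and items:
            # Continuation line, e.g. a second applicant name under <110>.
            items[-1][1].append(line.strip())
    return items


def sequence_count_matches(items):
    """True if the <160> response equals the number of <210> entries."""
    declared = [int(lines[0]) for ident, lines in items
                if ident == "160" and lines]
    found = sum(1 for ident, _ in items if ident == "210")
    return len(declared) == 1 and declared[0] == found


# Abbreviated, hypothetical excerpt in the style of the sample listing.
sample = """<110> Smith, John
      Smith, Jane
<160> 2
<210> 1
<211> 403
<220>
<210> 2
<211> 18
"""
items = parse_listing(sample)
print(items[0])
print(sequence_count_matches(items))
```

Because sequence data after <400> does not begin with an identifier, the sketch attaches those lines to the <400> item as continuation lines.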