2422 Nucleotide and/or Amino Acid Sequence Disclosures in Patent Applications Subject to WIPO ST.25 [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
37 CFR 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications.
- (a) Nucleotide and/or amino acid sequences, as used in §§ 1.821 through 1.825, are interpreted to mean an unbranched sequence of 4 or more amino acids or an unbranched sequence of 10 or more nucleotides. Branched sequences are specifically excluded from this definition. Sequences with fewer than four specifically defined nucleotides or amino acids are specifically excluded from this section. “Specifically defined” means those amino acids other than “Xaa” and those nucleotide bases other than “n,” defined in accordance with Appendices A through F to this subpart. Nucleotides and amino acids are further defined as follows:
- (1) Nucleotides: Nucleotides are intended to embrace only those nucleotides that can be represented using the symbols set forth in Appendix A to this subpart. Modifications (e.g., methylated bases) may be described as set forth in Appendix B to this subpart but shall not be shown explicitly in the nucleotide sequence.
- (2) Amino acids: Amino acids are those L-amino acids commonly found in naturally occurring proteins and are listed in appendix C to this subpart. Those amino acid sequences containing D-amino acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated using the symbols shown in appendix C to this subpart, with the modified positions (e.g., hydroxylations or glycosylations) being described as set forth in appendix D to this subpart, but these modifications shall not be shown explicitly in the amino acid sequence. Any peptide or protein that can be expressed as a sequence using the symbols in appendix C to this subpart, in conjunction with a description in the Feature section, to describe, for example, modified linkages, cross links and end caps, non-peptidyl bonds, etc., is embraced by this definition.
- Note 1 to paragraph (a): Appendices A through F to this subpart contain Tables 1–6 of the World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard ST.25: Standard for the Presentation of Nucleotide and Amino Acid Sequence Listings in Patent Applications (2009).
- (b) Patent applications which contain disclosures of nucleotide and/or amino acid sequences, in accordance with the definition in paragraph (a) of this section, shall, with regard to the manner in which the nucleotide and/or amino acid sequences are presented and described, conform exclusively to the requirements of §§ 1.821 through 1.825.
- (c) Patent applications that contain disclosures of nucleotide and/or amino acid sequences, as defined in paragraph (a) of this section, must contain a “Sequence Listing,” which is a separate part of the specification containing each of those nucleotide and/or amino acid sequences and associated information using the symbols and format in accordance with the requirements of §§ 1.822 and 1.823. The “Sequence Listing” must be submitted as follows, except for a national stage entry under § 1.495(b)(1), where the “Sequence Listing” has been previously communicated by the International Bureau or originally filed in the United States Patent and Trademark Office and complies with Patent Cooperation Treaty (PCT) Rule 5.2:
- (1) As an ASCII plain text file, in compliance with § 1.824, submitted via the USPTO patent electronic filing system or on a read-only optical disc under § 1.52(e), accompanied by an incorporation by reference statement of the ASCII plain text file, in a separate paragraph of the specification, in accordance with § 1.77(b)(5);
- (2) As a PDF file via the USPTO patent electronic filing system; or
- (3) On physical sheets of paper.
- (d) Where the description or claims of a patent application discuss a sequence that is set forth in the “Sequence Listing,” in accordance with paragraph (c) of this section, reference must be made to the sequence by use of the sequence identifier (§ 1.823(a)(5)), preceded by “SEQ ID NO:” or the like, in the text of the description or claims, even if the sequence is also embedded in the text of the description or claims of the patent application. Where a sequence is presented in a drawing, reference must be made to the sequence by use of the sequence identifier (§ 1.823(a)(5)), either in the drawing or in the Brief Description of the Drawings, where the correlation between multiple sequences in the drawing and their sequence identifiers (§ 1.823(a)(5)) in the Brief Description is clear.
- (e)
- (1) If the “Sequence Listing” under paragraph (c) of this section is submitted in an application filed under 35 U.S.C. 111(a) as a PDF file (§ 1.821(c)(2)) via the USPTO patent electronic filing system or on physical sheets of paper (§ 1.821(c)(3)), then the following must be submitted:
- (i) A CRF of the “Sequence Listing,” in accordance with the requirements of § 1.824; and
- (ii) A statement that the sequence information contained in the CRF submitted under paragraph (e)(1)(i) of this section is identical to the sequence information contained in the “Sequence Listing” under paragraph (c) of this section.
- (2) If the “Sequence Listing” under paragraph (c) of this section in an application submitted under 35 U.S.C. 371 is a PDF file (paragraph (c)(2) of this section) or on physical sheets of paper (paragraph (c)(3) of this section), and not also as an ASCII plain text file, in compliance with § 1.824 (paragraph (c)(1) of this section), then the following must be submitted:
- (i) A CRF of the “Sequence Listing,” in accordance with the requirements of § 1.824; and
- (ii) A statement that the sequence information contained in the CRF submitted under paragraph (e)(2)(i) of this section is identical to the sequence information contained in the “Sequence Listing” under paragraph (c)(2) or (3) of this section.
- (3) If a “Sequence Listing” in ASCII plain text format, in compliance with § 1.824, has not been submitted for an international application under the PCT, and that application contains disclosures of nucleotide and/or amino acid sequences, as defined in paragraph (a) of this section, and is to be searched by the United States International Searching Authority or examined by the United States International Preliminary Examining Authority, then the following must be submitted:
- (i) A CRF of the “Sequence Listing,” in accordance with the requirements of § 1.824;
- (ii) The late furnishing fee for providing a “Sequence Listing” in response to an invitation, as set forth in § 1.445(a)(5); and
- (iii) A statement that the sequence information contained in the CRF, submitted under paragraph (e)(3)(i) of this section, does not go beyond the disclosure in the international application as filed, or a statement that the information recorded in the ASCII plain text file, submitted under paragraph (e)(3)(i) of this section, is identical to the sequence listing contained in the international application as filed, as applicable.
- (4) The CRF may not be retained as a part of the patent application file.
- (f) [reserved]
- (g) If any of the requirements of paragraphs (b) through (e) of this section are not satisfied at the time of filing under 35 U.S.C. 111(a) or at the time of entering the national stage under 35 U.S.C. 371, the applicant will be notified and given a period of time within which to comply with such requirements in order to prevent abandonment of the application. Any amendment to add or replace a “Sequence Listing” and CRF copy thereof in reply to a requirement under this paragraph must be submitted in accordance with the requirements of § 1.825.
- (h) If any of the requirements of paragraph (e)(3) of this section are not satisfied at the time of filing an international application under the PCT, and the application is to be searched by the United States International Searching Authority or examined by the United States International Preliminary Examining Authority, the applicant may be sent a notice necessitating compliance with the requirements within a prescribed time period. Where a “Sequence Listing” under PCT Rule 13ter is provided in reply to a requirement under this paragraph, it must be accompanied by a statement that the information recorded in the ASCII plain text file under paragraph (e)(3)(i) of this section is identical to the sequence listing contained in the international application as filed, or does not go beyond the disclosure in the international application as filed, as applicable. It must also be accompanied by the late furnishing fee, as set forth in § 1.445(a)(5). If the applicant fails to timely provide the required CRF, the United States International Searching Authority shall search only to the extent that a meaningful search can be performed without the CRF, and the United States International Preliminary Examining Authority shall examine only to the extent that a meaningful examination can be performed without the CRF.
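The length and "specifically defined" thresholds in 37 CFR 1.821(a) can be illustrated with a minimal Python sketch. The function name `meets_sequence_definition` and its input format (a list of symbols plus a `kind` string) are assumptions made for illustration only and are not part of the rule. The sketch does not model the exclusion of branched sequences or of sequences containing D-amino acids.

```python
def meets_sequence_definition(residues, kind):
    """Apply the numeric thresholds of 37 CFR 1.821(a) to an unbranched sequence.

    residues: list of symbols, e.g. ["a", "c", "g"] or ["Ala", "Xaa", "Gly"].
    kind: "nucleotide" or "amino_acid" (hypothetical labels for this sketch).
    """
    if kind == "nucleotide":
        min_length, placeholder = 10, "n"    # 10 or more nucleotides
    elif kind == "amino_acid":
        min_length, placeholder = 4, "xaa"   # 4 or more amino acids
    else:
        raise ValueError("kind must be 'nucleotide' or 'amino_acid'")

    # "Specifically defined" means any symbol other than "n" or "Xaa".
    specifically_defined = sum(1 for r in residues if r.lower() != placeholder)

    # Sequences with fewer than four specifically defined residues are excluded.
    return len(residues) >= min_length and specifically_defined >= 4
```

For example, under these assumptions a ten-nucleotide sequence in which seven positions are "n" would not meet the definition, because only three residues are specifically defined.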
37 CFR 1.821 and 37 CFR 1.822 reference Appendices A-F, which contain Tables 1–6 of the World Intellectual Property Organization (WIPO) Handbook on Industrial Property Information and Documentation, Standard ST.25: Standard for Nucleotide and Amino Acid Sequence Listings in Patent Applications (2009) (hereinafter WIPO Standard ST.25 (2009)). Appendices A-F are reproduced below. The current version of WIPO Standard ST.25 is available online at www.wipo.int/export/sites/www/standards/en/pdf/03-25-01.pdf.
Appendix A provides that the bases of a nucleotide sequence should be represented using the following one-letter symbol for nucleotide sequence characters:
Symbol | Meaning | Origin of designation |
---|---|---|
a | a | adenine. |
g | g | guanine. |
c | c | cytosine. |
t | t | thymine. |
u | u | uracil. |
r | g or a | purine. |
y | t/u or c | pyrimidine. |
m | a or c | amino. |
k | g or t/u | keto. |
s | g or c | strong interactions 3H-bonds. |
w | a or t/u | weak interactions 2H-bonds. |
b | g or c or t/u | not a. |
d | a or g or t/u | not c. |
h | a or c or t/u | not g. |
v | a or g or c | not t, not u. |
n | a or g or c or t/u, unknown, or other | any. |
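The Appendix A table above can be read as a lookup from each symbol to the bases it may stand for. The following Python sketch encodes that reading; the names `AMBIGUITY`, `symbol_matches`, and `pattern_matches` are hypothetical, and treating "t/u" as either "t" or "u" is a simplifying assumption. The "unknown, or other" meaning of "n" is not modeled.

```python
# Appendix A symbols mapped to the bases they can represent.
# Assumption: "t/u" in the table is modeled as matching both "t" and "u".
AMBIGUITY = {
    "a": "a", "g": "g", "c": "c", "t": "t", "u": "u",
    "r": "ga",    # purine
    "y": "tuc",   # pyrimidine
    "m": "ac",    # amino
    "k": "gtu",   # keto
    "s": "gc",    # strong interactions
    "w": "atu",   # weak interactions
    "b": "gctu",  # not a
    "d": "agtu",  # not c
    "h": "actu",  # not g
    "v": "agc",   # not t, not u
    "n": "agctu", # any
}


def symbol_matches(symbol, base):
    """Return True if the Appendix A symbol can stand for the given base."""
    return base.lower() in AMBIGUITY[symbol.lower()]


def pattern_matches(pattern, sequence):
    """Return True if every symbol in pattern can stand for the aligned base."""
    return len(pattern) == len(sequence) and all(
        symbol_matches(p, b) for p, b in zip(pattern, sequence)
    )
```

Under these assumptions, `pattern_matches("ry", "gc")` is true because "r" covers "g" and "y" covers "c", while `symbol_matches("v", "t")` is false.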
Appendix B provides that modified bases may be represented as the corresponding unmodified bases in the sequence itself, if the modification is further described in numeric identifier <223> of the Feature section of the “Sequence Listing”. The symbols from the list below may be used in the description (i.e., the specification and drawing, or in the Feature section of the “Sequence Listing”) but these symbols may not be used in the sequence itself. Modifications not listed in Appendix B may also be represented as the corresponding unmodified base in the sequence itself, and the modification should be described using its full chemical name in the Feature section of the “Sequence Listing”.
Symbol | Meaning |
---|---|
ac4c | 4-acetylcytidine. |
chm5u | 5-(carboxyhydroxymethyl)uridine. |
cm | 2'-O-methylcytidine. |
cmnm5s2u | 5-carboxymethylaminomethyl-2-thiouridine. |
cmnm5u | 5-carboxymethylaminomethyluridine. |
d | dihydrouridine. |
fm | 2'-O-methylpseudouridine. |
gal q | beta, D-galactosylqueuosine. |
gm | 2'-O-methylguanosine. |
i | inosine. |
i6a | N6-isopentenyladenosine. |
m1a | 1-methyladenosine. |
m1f | 1-methylpseudouridine. |
m1g | 1-methylguanosine. |
m1i | 1-methylinosine. |
m22g | 2,2-dimethylguanosine. |
m2a | 2-methyladenosine. |
m2g | 2-methylguanosine. |
m3c | 3-methylcytidine. |
m5c | 5-methylcytidine. |
m6a | N6-methyladenosine. |
m7g | 7-methylguanosine. |
mam5u | 5-methylaminomethyluridine. |
mam5s2u | 5-methoxyaminomethyl-2-thiouridine. |
man q | beta, D-mannosylqueuosine. |
mcm5s2u | 5-methoxycarbonylmethyl-2-thiouridine. |
mcm5u | 5-methoxycarbonylmethyluridine. |
mo5u | 5-methoxyuridine. |
ms2i6a | 2-methylthio-N6-isopentenyladenosine. |
ms2t6a | N-((9-beta-D-ribofuranosyl-2-methylthiopurine-6-yl)carbamoyl)threonine. |
mt6a | N-((9-beta-D-ribofuranosylpurine-6-yl)N-methylcarbamoyl)threonine. |
mv | uridine-5-oxyacetic acid-methylester. |
o5u | uridine-5-oxyacetic acid. |
osyw | wybutoxosine. |
p | pseudouridine. |
q | queuosine. |
s2c | 2-thiocytidine. |
s2t | 5-methyl-2-thiouridine. |
s2u | 2-thiouridine. |
s4u | 4-thiouridine. |
t | 5-methyluridine. |
t6a | N-((9-beta-D-ribofuranosylpurine-6-yl)-carbamoyl)threonine. |
tm | 2'-O-methyl-5-methyluridine. |
um | 2'-O-methyluridine. |
yw | wybutosine. |
x | 3-(3-amino-3-carboxy-propyl)uridine, (acp3)u. |
Appendix C provides that the amino acids should be represented using the following three-letter symbols with the first letter as a capital.
Symbol | Meaning |
---|---|
Ala | Alanine. |
Cys | Cysteine. |
Asp | Aspartic Acid. |
Glu | Glutamic Acid. |
Phe | Phenylalanine. |
Gly | Glycine. |
His | Histidine. |
Ile | Isoleucine. |
Lys | Lysine. |
Leu | Leucine. |
Met | Methionine. |
Asn | Asparagine. |
Pro | Proline. |
Gln | Glutamine. |
Arg | Arginine. |
Ser | Serine. |
Thr | Threonine. |
Val | Valine. |
Trp | Tryptophan. |
Tyr | Tyrosine. |
Asx | Asp or Asn. |
Glx | Glu or Gln. |
Xaa | unknown or other. |
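A simple check that an amino acid sequence uses only the Appendix C symbols, with the capitalization shown in the table, might look like the following Python sketch. The names `APPENDIX_C` and `invalid_symbols`, and the assumption that symbols are separated by whitespace, are illustrative choices and are not drawn from the rules.

```python
# The 23 three-letter symbols listed in Appendix C (first letter capitalized).
APPENDIX_C = frozenset(
    "Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn "
    "Pro Gln Arg Ser Thr Val Trp Tyr Asx Glx Xaa".split()
)


def invalid_symbols(sequence):
    """Return (position, token) pairs for tokens that are not Appendix C symbols.

    Positions are 1-based. Assumes symbols are separated by whitespace.
    """
    return [
        (position, token)
        for position, token in enumerate(sequence.split(), start=1)
        if token not in APPENDIX_C
    ]
```

With this sketch, a token such as "ala" is reported because the comparison is case-sensitive, which mirrors the requirement that the first letter be a capital.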
Appendix D provides that modified and unusual amino acids may be represented as the corresponding unmodified amino acids in the sequence itself if the modification is further described in numeric identifier <223> of the Feature section of the “Sequence Listing”. The symbols from the list below may be used in the description (i.e., the specification and drawings, or in the Feature section of the “Sequence Listing”) but these symbols may not be used in the sequence itself. Modifications not listed in Appendix D may also be represented as the corresponding unmodified amino acid in the sequence itself, and the modification should be described using its full chemical name in the Feature section of the “Sequence Listing”.
Symbol | Meaning |
---|---|
Aad | 2-Aminoadipic acid. |
bAad | 3-Aminoadipic acid. |
bAla | beta-Alanine, beta-Aminopropionic acid. |
Abu | 2-Aminobutyric acid. |
4Abu | 4-Aminobutyric acid, piperidinic acid. |
Acp | 6-Aminocaproic acid. |
Ahe | 2-Aminoheptanoic acid. |
Aib | 2-Aminoisobutyric acid. |
bAib | 3-Aminoisobutyric acid. |
Apm | 2-Aminopimelic acid. |
Dbu | 2,4 Diaminobutyric acid. |
Des | Desmosine. |
Dpm | 2,2'-Diaminopimelic acid. |
Dpr | 2,3-Diaminopropionic acid. |
EtGly | N-Ethylglycine. |
EtAsn | N-Ethylasparagine. |
Hyl | Hydroxylysine. |
aHyl | allo-Hydroxylysine. |
3Hyp | 3-Hydroxyproline. |
4Hyp | 4-Hydroxyproline. |
Ide | Isodesmosine. |
aIle | allo-Isoleucine. |
MeGly | N-Methylglycine, sarcosine. |
MeIle | N-Methylisoleucine. |
MeLys | 6-N-Methyllysine. |
MeVal | N-Methylvaline. |
Nva | Norvaline. |
Nle | Norleucine. |
Orn | Ornithine. |
Appendix E provides for feature keys related to nucleotide sequences.
Key | Description |
---|---|
allele | a related individual or strain contains stable, alternative forms of the same gene, which differs from the presented sequence at this location (and perhaps others). |
attenuator | (1) region of DNA at which regulation of termination of transcription occurs, which controls the expression of some bacterial operons; (2) sequence segment located between the promoter and the first structural gene that causes partial termination of transcription. |
C_region | constant region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; includes one or more exons depending on the particular chain. |
CAAT_signal | CAAT box; part of a conserved sequence located about 75 bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG (C or T) CAATCT. |
CDS | coding sequence; sequence of nucleotides that corresponds with the sequence of amino acids in a protein (location includes stop codon); feature includes amino acid conceptual translation. |
conflict | independent determinations of the "same" sequence differ at this site or region. |
D-loop | displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein. |
D-segment | diversity segment of immunoglobulin heavy chain, and T-cell receptor beta chain. |
enhancer | a cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter. |
exon | region of genome that codes for portion of spliced mRNA; may contain 5'UTR, all CDSs, and 3'UTR. |
GC_signal | GC box; a conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG. |
gene | region of biological interest identified as a gene and for which a name has been assigned. |
iDNA | intervening DNA; DNA which is eliminated through any of several kinds of recombination. |
intron | a segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it. |
J_segment | joining segment of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains. |
LTR | long terminal repeat, a sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses. |
mat_peptide | mature peptide or protein coding sequence; coding sequence for the mature or final peptide or protein product following post-translational modification; the location does not include the stop codon (unlike the corresponding CDS). |
misc_binding | site in nucleic acid which covalently or non-covalently binds another moiety that cannot be described by any other Binding key (primer_bind or protein_bind). |
misc_difference | feature sequence is different from that presented in the entry and cannot be described by any other Difference key (conflict, unsure, old_sequence, mutation, variation, allele, or modified_base). |
misc_feature | region of biological interest which cannot be described by any other feature key; a new or rare feature. |
misc_recomb | site of any generalized, site-specific or replicative recombination event where there is a breakage and reunion of duplex DNA that cannot be described by other recombination keys (iDNA and virion) or qualifiers of source key (/insertion_seq, /transposon, /proviral). |
misc_RNA | any transcript or RNA product that cannot be defined by other RNA keys (prim_transcript, precursor_RNA, mRNA, 5'clip, 3'clip, 5'UTR, 3'UTR, exon, CDS, sig_peptide, transit_peptide, mat_peptide, intron, polyA_site, rRNA, tRNA, scRNA, and snRNA). |
misc_signal | any region containing a signal controlling or altering gene function or expression that cannot be described by other Signal keys (promoter, CAAT_signal, TATA_signal, –35_signal, –10_signal, GC_signal, RBS, polyA_signal, enhancer, attenuator, terminator, and rep_origin). |
misc_structure | any secondary or tertiary structure or conformation that cannot be described by other Structure keys (stem_loop and D-loop). |
modified_base | the indicated nucleotide is a modified nucleotide and should be substituted for by the indicated molecule (given in the mod_base qualifier value). |
mRNA | messenger RNA; includes 5' untranslated region (5'UTR), coding sequences (CDS, exon) and 3' untranslated region (3'UTR). |
mutation | a related strain has an abrupt, inheritable change in the sequence at this location. |
N_region | extra nucleotides inserted between rearranged immunoglobulin segments. |
old_sequence | the presented sequence revises a previous version of the sequence at this location. |
polyA_signal | recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA. |
polyA_site | site on an RNA transcript to which will be added adenine residues by post-transcriptional polyadenylation. |
precursor_RNA | any RNA species that is not yet the mature RNA product; may include 5' clipped region (5'clip), 5' untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3'UTR), and 3' clipped region (3'clip). |
prim_transcript | primary (initial, unprocessed) transcript; includes 5' clipped region (5'clip), 5' untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3'UTR), and 3' clipped region (3'clip). |
primer_bind | non-covalent primer binding site for initiation of replication, transcription, or reverse transcription; includes site(s) for synthetic, for example, PCR primer elements. |
promoter | region on a DNA molecule involved in RNA polymerase binding to initiate transcription. |
protein_bind | non-covalent protein binding site on nucleic acid. |
RBS | ribosome binding site. |
repeat_region | region of genome containing repeating units. |
repeat_unit | single repeat element. |
rep_origin | origin of replication; starting site for duplication of nucleic acid to give two identical copies. |
rRNA | mature ribosomal RNA; the RNA component of the ribonucleoprotein particle (ribosome) which assembles amino acids into proteins. |
S_region | switch region of immunoglobulin heavy chains; involved in the rearrangement of heavy chain DNA leading to the expression of a different immunoglobulin class from the same B-cell. |
satellite | many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA. |
scRNA | small cytoplasmic RNA; any one of several small cytoplasmic RNA molecules present in the cytoplasm and (sometimes) nucleus of a eukaryote. |
sig_peptide | signal peptide coding sequence; coding sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane; leader sequence. |
snRNA | small nuclear RNA; any one of many small RNA species confined to the nucleus; several of the snRNAs are involved in splicing or other RNA processing reactions. |
source | identifies the biological source of the specified span of the sequence; this key is mandatory; every entry will have, as a minimum, a single source key spanning the entire sequence; more than one source key per sequence is permissible. |
stem_loop | hairpin; a double-helical region formed by base-pairing between adjacent (inverted) complementary sequences in a single strand of RNA or DNA. |
STS | Sequence Tagged Site; short, single-copy DNA sequence that characterizes a mapping landmark on the genome and can be detected by PCR; a region of the genome can be mapped by determining the order of a series of STSs. |
TATA_signal | TATA box; Goldberg-Hogness box; a conserved AT-rich septamer found about 25 bp before the start point of each eukaryotic RNA polymerase II transcript unit which may be involved in positioning the enzyme for correct initiation; consensus=TATA(A or T)A(A or T). |
terminator | sequence of DNA located either at the end of the transcript or adjacent to a promoter region that causes RNA polymerase to terminate transcription; may also be site of binding of repressor protein. |
transit_peptide | transit peptide coding sequence; coding sequence for an N-terminal domain of a nuclear-encoded organellar protein; this domain is involved in post-translational import of the protein into the organelle. |
tRNA | mature transfer RNA, a small RNA molecule (75-85 bases long) that mediates the translation of a nucleic acid sequence into an amino acid sequence. |
unsure | author is unsure of exact sequence in this region. |
V_region | variable region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; codes for the variable amino terminal portion; can be made up from V_segments, D_segments, N_regions, and J_segments. |
V_segment | variable segment of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; codes for most of the variable region (V_region) and the last few amino acids of the leader peptide. |
variation | a related strain contains stable mutations from the same gene (for example, RFLPs, polymorphisms, etc.) which differ from the presented sequence at this location (and possibly others). |
3'clip | 3'-most region of a precursor transcript that is clipped off during processing. |
3'UTR | region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein. |
5'clip | 5'-most region of a precursor transcript that is clipped off during processing. |
5'UTR | region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein. |
–10_signal | pribnow box; a conserved region about 10 bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. |
–35_signal | a conserved hexamer about 35 bp upstream of the start point of bacterial transcription units; consensus=TTGACa [ ] or TGTTGACA [ ]. |
Appendix F provides for feature keys related to protein sequences.
Key | Description |
---|---|
CONFLICT | different papers report differing sequences. |
VARIANT | authors report that sequence variants exist. |
VARSPLIC | description of sequence variants produced by alternative splicing. |
MUTAGEN | site which has been experimentally altered. |
MOD_RES | post-translational modification of a residue. |
ACETYLATION | N-terminal or other. |
AMIDATION | generally at the C-terminal of a mature active peptide. |
BLOCKED | undetermined N- or C-terminal blocking group. |
FORMYLATION | of the N-terminal methionine. |
GAMMA-CARBOXYGLUTAMIC ACID HYDROXYLATION | of asparagine, aspartic acid, proline, or lysine. |
METHYLATION | generally of lysine or arginine. |
PHOSPHORYLATION | of serine, threonine, tyrosine, aspartic acid or histidine. |
PYRROLIDONE CARBOXYLIC ACID | N-terminal glutamate which has formed an internal cyclic lactam. |
SULFATATION | generally of tyrosine. |
LIPID | covalent binding of a lipidic moiety. |
MYRISTATE | myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue. |
PALMITATE | palmitate group attached through a thioether bond to a cysteine residue or through an ester bond to a serine or threonine residue. |
FARNESYL | farnesyl group attached through a thioether bond to a cysteine residue. |
GERANYL-GERANYL | geranyl-geranyl group attached through a thioether bond to a cysteine residue. |
GPI-ANCHOR | glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein. |
N-ACYL DIGLYCERIDE | N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages. |
DISULFID | disulfide bond; the 'FROM' and 'TO' endpoints represent the two residues which are linked by an intra-chain disulfide bond; if the 'FROM' and 'TO' endpoints are identical, the disulfide bond is an interchain one and the description field indicates the nature of the cross-link. |
THIOLEST | thiolester bond; the 'FROM' and 'TO' endpoints represent the two residues which are linked by the thiolester bond. |
THIOETH | thioether bond; the 'FROM' and 'TO' endpoints represent the two residues which are linked by the thioether bond. |
CARBOHYD | glycosylation site; the nature of the carbohydrate (if known) is given in the description field. |
METAL | binding site for a metal ion; the description field indicates the nature of the metal. |
BINDING | binding site for any chemical group (co-enzyme, prosthetic group, etc.); the chemical nature of the group is given in the description field. |
SIGNAL | extent of a signal sequence (prepeptide). |
TRANSIT | extent of a transit peptide (mitochondrial, chloroplastic, or for a microbody). |
PROPEP | extent of a propeptide. |
CHAIN | extent of a polypeptide chain in the mature protein. |
PEPTIDE | extent of a released active peptide. |
DOMAIN | extent of a domain of interest on the sequence; the nature of that domain is given in the description field. |
CA_BIND | extent of a calcium-binding region. |
DNA_BIND | extent of a DNA-binding region. |
NP_BIND | extent of a nucleotide phosphate binding region; the nature of the nucleotide phosphate is indicated in the description field. |
TRANSMEM | extent of a transmembrane region. |
ZN_FING | extent of a zinc finger region. |
SIMILAR | extent of a similarity with another protein sequence; precise information, relative to that sequence, is given in the description field. |
REPEAT | extent of an internal sequence repetition. |
HELIX | secondary structure: Helices, for example, Alpha-helix, 3(10) helix, or Pi-helix. |
STRAND | secondary structure: Beta-strand, for example, Hydrogen bonded beta-strand, or Residue in an isolated beta-bridge. |
TURN | secondary structure: Turns, for example, H-bonded turn (3-turn, 4-turn, or 5-turn). |
ACT_SITE | amino acid(s) involved in the activity of an enzyme. |
SITE | any other interesting site on the sequence. |
INIT_MET | the sequence is known to start with an initiator methionine. |
NON_TER | the residue at an extremity of the sequence is not the terminal residue; if applied to position 1, this signifies that the first position is not the N-terminus of the complete molecule; if applied to the last position, it signifies that this position is not the C-terminus of the complete molecule; there is no description field for this key. |
NON_CONS | non consecutive residues; indicates that two residues in a sequence are not consecutive and that there are a number of unsequenced residues between them. |
UNSURE | uncertainties in the sequence; used to describe region(s) of a sequence for which the authors are unsure about the sequence assignment. |
The requirements of 37 CFR 1.821 through 37 CFR 1.825 are the result of an effort to harmonize the USPTO requirements with international sequence listing requirements to the extent possible. The requirements of 37 CFR 1.821 through 37 CFR 1.825 substantially correspond to the requirements of WIPO Standard ST.25 (2009). However, the requirements of 37 CFR 1.821 through 37 CFR 1.825 are less stringent than the requirements of WIPO Standard ST.25 (2009). Thus, applicants who have filed or wish to file international applications or applications in countries that adhere to WIPO Standard ST.25 (2009) should be aware of the following requirements:
- (A) The data in numeric identifier <221> must use selections from Tables 5 and 6 of WIPO Standard ST.25 (2009) to comply with that standard. The terms from these Tables are considered language neutral vocabulary;
- (B) WIPO Standard ST.25 (2009), paragraph 24, requires a blank line between numeric identifiers in the sequence listing when the digit in the first or second position of the numeric identifier changes;
- (C) Where the sequence listing forming part of the description of the international application contains free text, e.g., free text in numeric identifier <223>, any such free text shall be repeated in the main part of the description in the language thereof (PCT Rule 5.2(b)). It is recommended that the free text in the language of the main part of the description be put in a specific section of the description called “Sequence Listing Free Text”;
- (D) A sequence listing filed after the international filing date is generally not considered to be part of the disclosure and usually will not be published as part of the international application publication (see PCT Article 34 and PCT Rules 26 and 91 for exceptions); and
- (E) Paragraph 4(v) of WIPO Standard ST.25 (2009) requires an accompanying statement with the specific wording “the information recorded in electronic form furnished under PCT Rule 13ter is identical to the sequence listing as contained in the international application”.
With further regard to requirements (A) and (B), it is noted that PatentIn Version 3.5.1 software (see MPEP § 2430) generates sequence listings that meet all of the requirements of WIPO Standard ST.25 (2009). Applicants should similarly be aware that filing requirements for sequence listings may differ between a national US application, a foreign application and an international application during the international phase. For example, where an international application is filed in paper, the sequence listing part of the international application must similarly be provided in paper. In addition, a copy of the sequence listing in ASCII plain text, to be used for the purpose of the international search (PCT Rule 13ter), must be filed on read-only optical disc or via the USPTO electronic filing system. Furthermore, in contrast to US national applications, a sequence listing filed with RO/US in ASCII plain text that is 300 MB or more in size is not subject to a size fee during the international phase of an international application.
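The blank-line rule in item (B) can be illustrated with a short Python sketch. This is a simplified reading of the rule as summarized above, not a reproduction of PatentIn or of the full layout rules of WIPO Standard ST.25. The function name `insert_blank_lines` is hypothetical, and the sketch assumes that each numeric identifier starts its own line and that the input contains no blank lines yet.

```python
import re

# Matches a three-digit numeric identifier such as <110> or <223>
# at the start of a line.
_IDENTIFIER = re.compile(r"^<(\d{3})>")


def insert_blank_lines(lines):
    """Return a copy of `lines` with a blank line inserted wherever the
    first or second digit of the numeric identifier changes.

    Hypothetical helper: assumes one identifier per line and no
    pre-existing blank lines in the input.
    """
    result = []
    previous_prefix = None  # first two digits of the last identifier seen
    for line in lines:
        match = _IDENTIFIER.match(line)
        if match:
            prefix = match.group(1)[:2]
            if previous_prefix is not None and prefix != previous_prefix:
                result.append("")
            previous_prefix = prefix
        result.append(line)
    return result
```

Under this reading, a blank line would be placed between <110> and <120> (the second digit changes) but not between <140> and <141> (only the third digit changes).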
2422.01 Nucleotide and/or Amino Acids Disclosures Requiring a “Sequence Listing” [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
I. LENGTH THRESHOLDS
37 CFR 1.821(a) presents a definition for “nucleotide and/or amino acid sequences.” This definition sets forth limits, in terms of numbers of amino acids and/or numbers of nucleotides, at or above which compliance with the sequence rules is required. Nucleotide and/or amino acid sequences as used in 37 CFR 1.821 through 37 CFR 1.825 are interpreted to mean an unbranched sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides. Branched sequences are specifically excluded from this definition. Sequences with fewer than ten specifically defined nucleotides or four specifically defined amino acids are specifically excluded from 37 CFR 1.821. “Specifically defined” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in Appendices A-F to 37 CFR part 1, Subpart G (see MPEP § 2422(I)).
The limit of four or more amino acids was established for consistency with limits in place for industry database collections whereas the limit of ten or more nucleotides, while lower than certain industry database limits, was established to encompass those nucleotide sequences to which the smallest probe will bind in a stable manner.
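As a minimal sketch of the length thresholds described above, the following Python function counts the specifically defined residues in a sequence and compares the count with the applicable limit. The function name `meets_length_threshold` and the `kind` labels are hypothetical. The sketch covers only the counting rule; it does not test for branching, D-amino acids, or any other exclusion.

```python
def meets_length_threshold(residues, kind):
    """Return True if `residues` contains at least the minimum number of
    specifically defined residues for its kind.

    residues: list of residue symbols, e.g. ["a", "c", "g"] or
              ["Ala", "Xaa", "Gly"].
    kind:     "nucleotide" or "amino_acid" (hypothetical labels).
    """
    if kind == "nucleotide":
        minimum, placeholder = 10, "n"
    elif kind == "amino_acid":
        minimum, placeholder = 4, "xaa"
    else:
        raise ValueError("kind must be 'nucleotide' or 'amino_acid'")

    # "Specifically defined" residues are those other than "n" / "Xaa".
    defined = sum(1 for r in residues if r.lower() != placeholder)
    return defined >= minimum
```

For example, a ten-base sequence in which six positions are "n" has only four specifically defined nucleotides and would fall below the threshold under this reading.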
II. REPRESENTATION OF NUCLEIC ACIDS AND AMINO ACIDS
37 CFR 1.821(a)(1) and 37 CFR 1.821(a)(2) present further definitions for those nucleotide and amino acid sequences that are intended to be embraced by the sequence rules. Situations in which the applicability of the rules is in issue will be resolved on a case-by-case basis.
Nucleotide sequences are further limited to those that can be represented by the symbols set forth in 37 CFR 1.822(b) and Appendices A and B to 37 CFR part 1, Subpart G (see MPEP § 2422(I)). The presence of other than typical 5' to 3' phosphodiester linkages in a nucleotide sequence does not render the rules inapplicable. For example, the Office does not want to exclude linkages of the type commonly found in naturally occurring nucleotides, e.g., eukaryotic end capped sequences.
Amino acid sequences are further limited to those in 37 CFR 1.822(b) and Appendices C and D to 37 CFR part 1, Subpart G (see MPEP § 2422(I)) and those L-amino acids that are commonly found in naturally occurring proteins. The presence of one or more D-amino acids in a sequence will exclude that sequence from the scope of the rules. Voluntary compliance is, however, encouraged in these situations; the symbol “Xaa” can be used to represent D-amino acids. The sequence rules embrace “[a]ny peptide or protein that can be expressed as a sequence using the symbols in Appendix C to 37 CFR part 1, Subpart G (see MPEP § 2422(I)) in conjunction with a description in the Feature section to describe, for example, modified linkages, cross links and end caps, non-peptidyl bonds, etc.” 37 CFR 1.821(a)(2).
With regard to amino acid sequences, the use of the terms “peptide or protein” implies that the amino acids in a given sequence are linked by at least three consecutive peptide bonds. Accordingly, an amino acid sequence is not excluded from the scope of the rules merely due to the presence of a single non-peptidyl bond. If an amino acid sequence can be represented by a string of amino acid abbreviations, modifications in the sequence, if any, set forth in the Features section, the sequence comes within the scope of the rules. However, the rules are not intended to encompass the subject matter that is generally referred to as synthetic resins.
III. SEQUENCES DISCLOSED IN APPLICATION TEXT
The requirement for compliance in 37 CFR 1.821(c) is directed to “disclosures of nucleotide and/or amino acid sequences.” (Emphasis added.) All sequences, whether claimed or not, that meet the length thresholds in 37 CFR 1.821(a) are subject to the “Sequence Listing” rules. The goal of the Office is to build a comprehensive database that can be used for, inter alia, assessing the prior art. It is therefore essential that all sequences, whether only disclosed or also claimed, be included in the database. In those instances in which prior art sequences are only referred to in a given application by name and a publication or accession reference, they need not be included as part of the “Sequence Listing”, unless the referred-to sequence is “essential material” per MPEP § 608.01(p). However, if the applicant presents the sequence as a string of particular nucleotide bases or amino acids, whether by way of symbols, words or chemical structure, it is necessary to include the sequence in the “Sequence Listing” regardless of whether the applicant considers the sequence to be prior art, so long as the sequence meets the criteria of 37 CFR 1.821(a). In general, any sequence that is disclosed and/or claimed as a sequence, i.e., as a string of particular nucleotide bases or amino acids, and that otherwise meets the criteria of 37 CFR 1.821(a), must be set forth in the “Sequence Listing”.
IV. VARIANTS OF A PRESENTED SEQUENCE
It is generally acceptable to present a single, primary sequence in the specification and “Sequence Listing” by enumeration of its residues in accordance with the sequence rules (“primary sequence”) and to discuss and/or claim variants of that primary sequence without presenting each variant as a separate sequence in the “Sequence Listing”. Where the variant sequence meets the length thresholds of 37 CFR 1.821(a) and is disclosed by enumeration of its residues anywhere in an application, it must be presented in a “Sequence Listing” in a manner that complies with the requirements of the sequence rules. However, the primary sequence should be annotated in the “Sequence Listing” to reflect such variants. By way of example only, the following types of sequence disclosures would be treated as noted herein by the Office. With respect to a primary sequence and “conservatively modified variants thereof,” the sequences may be described as SEQ ID NO:X (the primary sequence) and “conservatively modified variants thereof,” if desired. With respect to a sequence that “may be deleted at the C-terminus by 1, 2, 3, 4, or 5 residues,” all of the implied variations do not need to be included in the “Sequence Listing”. In this latter example, only the sequence without deletions needs to be included in the “Sequence Listing”, though applicant is encouraged to annotate the sequence to indicate that deletions have been made at the C-terminus by 1, 2, 3, 4, or 5 residues.
The Office's database will only contain the unmodified sequence. It is strongly recommended that any sequences appearing in the claims, or sequences that are considered essential to understanding the invention, be included in the “Sequence Listing” as a separate sequence.
V. SEQUENCE IDENTIFIER
37 CFR 1.821(d) and 37 CFR 1.823(a)(5) require that each disclosed nucleic acid and/or amino acid sequence in the application appear separately in the “Sequence Listing”, with each sequence further being assigned a sequence identifier, referred to as “SEQ ID NO.” or the like. The use of “SEQ ID NO:” is preferred, but including “or the like” is intended to ensure that a formalities notice is not sent when an application uses, for example, “SEQ NO.” or “Seq. Id. No.” or any similar identification for an amino acid or nucleotide sequence in the specification or claims where it is clear that a sequence from the “Sequence Listing” is shown in the description or claims. The sequence identifiers must begin with 1 and increase sequentially by integers. The requirement for sequence identifiers, at a minimum, requires that each sequence be assigned a different number for purposes of identification. However, where practical and for ease of reference, sequences should be presented in the “Sequence Listing” in numerical order and in the order in which they are discussed in the application.
37 CFR 1.821(d) further requires that where the description or claims of a patent application discuss a sequence that is set forth in the “Sequence Listing”, a reference to the sequence identifier of that sequence is required at all occurrences, even if in the text of the description or claims where the sequence is set forth by enumeration of its residues. This requirement is also intended to permit references elsewhere in the application (e.g., specification, claims, or drawings) to sequences set forth in the “Sequence Listing” by the use of assigned sequence identifiers without repeating the sequence. Sequence identifiers can also be used to discuss and/or claim parts or fragments of a properly presented sequence. For example, language such as “residues 14 to 243 of SEQ ID NO:23” is permissible and the fragment need not be separately presented in the “Sequence Listing”. Where a sequence that meets the length thresholds of 37 CFR 1.821(a) is disclosed by enumeration of its residues anywhere in an application, it must be presented in a “Sequence Listing” in a manner that complies with the requirements of the sequence rules.
The rules do not alter, in any way, the requirements of 35 U.S.C. 112. The implementation of the rules has had no effect on disclosure and/or claiming requirements. The rules, in general, or the use of sequence identifiers throughout the specification and claims, specifically, should not raise any issues under 35 U.S.C. 112(a) or 35 U.S.C. 112(b). The use of sequence identifiers (SEQ ID NO:X or the like) only provides a shorthand way for applicants to discuss and claim their inventions. These identifiers do not in any way restrict the manner in which an invention can be claimed.
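The numbering rule for sequence identifiers (begin with 1, increase sequentially by integers) and the cross-referencing of identifiers in the text can be sketched in Python as follows. Both function names are hypothetical. The pattern recognizes only the preferred form “SEQ ID NO:” and would miss the “or the like” variants such as “SEQ NO.” or “Seq. Id. No.”, so it is an illustration rather than a formalities check.

```python
import re

# Recognizes only the preferred form, e.g. "SEQ ID NO:23" or "SEQ ID NO: 23".
_SEQ_REFERENCE = re.compile(r"SEQ ID NO:\s*(\d+)")


def identifiers_are_sequential(identifiers):
    """Return True if the identifiers are exactly 1, 2, 3, ... in order."""
    return list(identifiers) == list(range(1, len(identifiers) + 1))


def unresolved_references(text, identifiers):
    """Return the sorted identifier numbers cited in `text` that do not
    appear in `identifiers` (the numbers used in the listing)."""
    known = set(identifiers)
    cited = {int(number) for number in _SEQ_REFERENCE.findall(text)}
    return sorted(cited - known)
```

A fragment reference such as “residues 14 to 243 of SEQ ID NO:23” is treated here as a reference to identifier 23; the residue range itself is not examined.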
2422.02 The Requirement for Exclusive Conformance; Sequences Presented in Drawing Figures [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
For all applications that disclose nucleic acid and/or amino acid sequences that fall within the definition set forth in 37 CFR 1.821(a), 37 CFR 1.821(b) requires exclusive conformance to the requirements of 37 CFR 1.821 through 37 CFR 1.825 with regard to the manner in which the disclosed nucleotide and/or amino acid sequences are presented and described. This requirement is necessary to minimize any confusion that could result if more than one format for representing sequence data was employed in a given application.
Pursuant to 37 CFR 1.83(a), sequences that are included in sequence listings should not be duplicated in the drawings. However, many significant sequence characteristics may only be demonstrated by a figure. This is especially true in view of the fact that the representation of double stranded nucleotides is not permitted in the “Sequence Listing” and many significant nucleotide features, such as “sticky ends” and the like, may only be shown effectively by reference to a drawing figure. Further, the similarity or homology between/among sequences may only be depicted in an effective manner in a drawing figure. Similarly, drawing figures are recommended for use with amino acid sequences to depict structural features of the corresponding protein, such as epitopes and interaction domains. The situations discussed herein are given by way of example only and there may be many other reasons for including a sequence in a drawing. However, when a sequence is presented in a drawing, the sequence must still be included in the “Sequence Listing” if the sequence falls within the definition set forth in 37 CFR 1.821(a), and a sequence identifier (“SEQ ID NO:X” or the like) must be used, either in the drawing itself or in the Brief Description of the Drawings.
2422.03 Sequence Listing Submission [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
37 CFR 1.821(c) requires that applications containing disclosures of nucleotide and/or amino acid sequences that fall within the definitions of 37 CFR 1.821(a) contain, as a separate part of the specification, a disclosure of each of the nucleotide and/or amino acid sequences, and associated information, using the format and symbols that are set forth in 37 CFR 1.822 and 37 CFR 1.823. This separate part of the specification is referred to as the “Sequence Listing”. Except for submission of a national phase application under 35 U.S.C. 371 of an international application (PCT) that is compliant with PCT Rule 5.2(a), the “Sequence Listing” required pursuant to 37 CFR 1.821(c) must be submitted (1) as an ASCII plain text file via the USPTO patent electronic filing system or on a read-only optical disc under 37 CFR 1.52(e), accompanied by an incorporation by reference statement of the ASCII plain text file, in a separate paragraph of the specification, in accordance with 37 CFR 1.77(b)(5) (see 37 CFR 1.821(c)(1)); (2) as a PDF image file submitted via the USPTO patent electronic filing system (see 37 CFR 1.821(c)(2)); or (3) on physical sheets of paper (see 37 CFR 1.821(c)(3)). If the “Sequence Listing” required by 37 CFR 1.821(c) is submitted on physical sheets of paper or as a PDF image file, the “Sequence Listing” is a separate part of the specification which must begin on a new page within the specification. Also, in an application filed under 35 U.S.C. 111(a) in which sequence information is submitted in an ASCII plain text file in compliance with 37 CFR 1.824 and as a PDF image file or on physical sheets of paper, the PDF image file or the physical sheets of paper will comply with the listing requirement under 37 CFR 1.821(c) and the ASCII plain text file will comply with the CRF requirement under 37 CFR 1.821(e)(1)(i).
When the “Sequence Listing” is submitted as a PDF image file under 37 CFR 1.821(c)(2) via USPTO patent electronic filing system or on physical sheets of paper under 37 CFR 1.821(c)(3), 37 CFR 1.821(e)(1) requires that the copy of the 1.821(c) “Sequence Listing” must also be submitted in a separate computer readable form (CRF) in accordance with the requirements of 37 CFR 1.824. Similarly, in the case of a national stage application submitted under 35 U.S.C. 371, when the “Sequence Listing” is submitted as a PDF image file via the USPTO patent electronic filing system under 37 CFR 1.821(c)(2) or on physical sheets of paper under 37 CFR 1.821(c)(3), 37 CFR 1.821(e)(2) requires that the copy of the “Sequence Listing” referred to in 37 CFR 1.821(c) must also be submitted in a separate CRF in accordance with the requirements of 37 CFR 1.824. 37 CFR 1.821(e)(3) requires that, in text format in an international application under the PCT that is to be searched by the United States International Searching Authority or examined by the United States International Preliminary Examining Authority, a CRF in accordance with the requirements of 37 CFR 1.824 must be submitted if a “Sequence Listing” in ASCII plain text format in compliance with 37 CFR 1.824 has not been submitted and the application contains disclosures of nucleotide and/or amino acid sequences, as defined in 37 CFR 1.821(a).
At entry into the national stage under 35 U.S.C. 371, an international application compliant with PCT Rule 5.2 that contains a “Sequence Listing” in ASCII plain text format as part of the description satisfies the requirements of 37 CFR 1.821(c) and 37 CFR 1.821(e). If the international application was previously communicated by the International Bureau under PCT Article 20 and/or was originally filed in the United States Patent and Trademark Office, then no further submission of a “Sequence Listing” or incorporation by reference into the specification is required. Alternatively, if applicant must provide a copy of the international application as required by 37 CFR 1.495(b)(1), the copy of the international application must include the “Sequence Listing” part of the application, and no incorporation by reference into the specification is required. See also 37 CFR 1.823(b)(2) and 1.825(c).
Whether submitted via the USPTO patent electronic filing system or on read-only optical disc(s), the ASCII plain text file must contain a copy of a single “Sequence Listing” in a single file. One hundred (100) megabytes is the size limit for “Sequence Listing” and CRF text files submitted via the USPTO patent electronic filing system, and “Sequence Listing” and CRF text files cannot be compressed when submitted via the USPTO patent electronic filing system. If a user wishes to submit an electronic copy of a “Sequence Listing” or CRF text file that exceeds 100 megabytes, the “Sequence Listing” or CRF must be filed on read-only optical disc(s).
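The 100 megabyte limit described above amounts to a simple routing decision, sketched below in Python. The function name `permitted_routes` and the route labels are hypothetical. The sketch takes the file size in megabytes as given, because the text above does not say how a megabyte is to be counted in bytes.

```python
ELECTRONIC_LIMIT_MB = 100  # size limit for electronic filing stated above


def permitted_routes(size_mb):
    """Return the submission routes available for an uncompressed
    ASCII plain text file of the given size in megabytes.

    Hypothetical helper: files that exceed the limit may be filed only
    on read-only optical disc(s); smaller files may use either route.
    """
    if size_mb < 0:
        raise ValueError("size_mb must not be negative")
    if size_mb > ELECTRONIC_LIMIT_MB:
        return ["read-only optical disc"]
    return ["electronic filing system", "read-only optical disc"]
```

The sketch reflects only the size rule; it does not model the prohibition on compressed files for electronic submissions.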
I. ASCII PLAIN TEXT FILE SUBMITTED VIA USPTO PATENT ELECTRONIC FILING SYSTEM
The Office strongly suggests filing the “Sequence Listing” required by 37 CFR 1.821(c) as an ASCII plain text file via the USPTO patent electronic filing system. See 37 CFR 1.821(c)(1). If sequence information is submitted in an application filed under 35 U.S.C. 111(a) as an ASCII plain text file or as a national stage application submitted under 35 U.S.C. 371 as an ASCII plain text file in compliance with 37 CFR 1.824 via the USPTO patent electronic filing system and applicant has not filed a “Sequence Listing” as a PDF image file or on physical sheets of paper, the ASCII plain text file will serve as both the “Sequence Listing” under 37 CFR 1.821(c) and the CRF of the “Sequence Listing” under 37 CFR 1.821(e). See 37 CFR 1.821(e)(1) and (e)(2). Note that for applications filed under 35 U.S.C. 111(a), the specification must contain a statement in a separate paragraph (see 37 CFR 1.77(b)(5)) that incorporates by reference the material in the “Sequence Listing” ASCII plain text file identifying the name of the ASCII plain text file, the date of creation, and the size of the ASCII plain text file in bytes (see 37 CFR 1.823(b)(1)). However, an incorporation by reference statement is not required in an international application and is not required in an application filed under 35 U.S.C. 371 where the “Sequence Listing” has been previously communicated to the International Bureau or originally filed in the USPTO and complies with Patent Cooperation Treaty Rule 5.2. See 37 CFR 1.821(c), 1.823(b)(2), and 1.825(c). See also MPEP § 2422.03(a) for additional information pertaining to USPTO patent electronic filing system submissions of a “Sequence Listing”.
II. ASCII PLAIN TEXT FILE ON READ-ONLY OPTICAL DISC
If the “Sequence Listing” as required by 37 CFR 1.821(c) is submitted on read-only optical disc(s) in accordance with 37 CFR 1.52(e), the specification must contain an incorporation by reference of the material on the read-only optical disc in a separate paragraph (see 37 CFR 1.77(b)(5)) identifying the name of the file, the date of creation, and the size of the file in bytes (37 CFR 1.823(b)(1)). However, an incorporation by reference statement is not required in an international application and is not required in an application filed under 35 U.S.C. 371 where the “Sequence Listing” has been previously communicated to the International Bureau or originally filed in the USPTO and complies with Patent Cooperation Treaty Rule 5.2. See 37 CFR 1.821(c), 1.823(b)(2), and 1.825(c). It is noted that a “Sequence Listing” may be compressed using WinZip®, 7-Zip, or Unix®/Linux® Zip (37 CFR 1.824(b)(2)(ii)). If a compressed ASCII plain text file does not fit on a single read-only optical disc due to storage limitations of the read-only optical disc, the compressed ASCII plain text file may be split into multiple file parts and placed on multiple read-only optical discs which are labeled in compliance with 37 CFR 1.52(e)(5)(vi) and 37 CFR 1.824(b)(2)(iv).
The read-only optical disc used to submit the “Sequence Listing” may also contain “Large Tables” if the table has more than 50 pages of text. See 37 CFR 1.52(e)(1)(iii) and 37 CFR 1.58(c), (f) and (i). The read-only optical disc and duplicate copy must be labeled “Copy 1” and “Copy 2,” respectively, and a statement that the copies are identical must be included. If the two read-only optical discs are not identical, the Office will use the disc labeled “Copy 1” for further processing (37 CFR 1.58(i)). See also MPEP § 608.05.
If the “Sequence Listing” is submitted in an application filed under 35 U.S.C. 111(a) as an ASCII plain text file in compliance with 37 CFR 1.824 on read-only optical disc(s) in accordance with 37 CFR 1.52(e) and applicant has not filed a “Sequence Listing” as a PDF image file or on physical sheets of paper, the ASCII plain text file will serve as both the “Sequence Listing” under 37 CFR 1.821(c) and the CRF of the “Sequence Listing” under 37 CFR 1.821(e). See 37 CFR 1.821(e)(1). Similarly, if the “Sequence Listing” is filed in a national stage application submitted under 35 U.S.C. 371 as an ASCII plain text file in compliance with 37 CFR 1.824 on read-only optical disc(s) in accordance with 37 CFR 1.52(e) and applicant has not filed a “Sequence Listing” as a PDF image file or on physical sheets of paper, the ASCII plain text file will serve as both the “Sequence Listing” under 37 CFR 1.821(c) and the CRF of the “Sequence Listing” under 37 CFR 1.821(e).
A. ASCII Plain Text Files Up to 300 MB
When the “Sequence Listing” or CRF is submitted via read-only optical disc(s), the text file may either be compressed or not compressed. If the ASCII plain text file is not compressed, the ASCII plain text file must be contained on a single read-only optical disc. However, if the file does not fit on a single read-only optical disc even after compression, a compressed ASCII plain text file may be split into multiple file parts, in accordance with the target read-only optical disc size, and labeled in compliance with 37 CFR 1.52(e)(5)(vi). See 37 CFR 1.824(b).
B. ASCII Plain Text Files 300 MB or Over
Any “Sequence Listing” ASCII plain text file of 300 MB or more is subject to a fee under 37 CFR 1.21(o) to manage handling of the oversized submission (37 CFR 1.52(f)(3)). Pricing for this fee is divided into two tiers with Tier 1 for file sizes 300 MB to 800 MB and Tier 2 for file sizes greater than 800 MB. The level of effort associated with the handling of mega-“Sequence Listings” is significant, because the Office’s systems require extra storage and special handling for files beyond 300 MB. The fee should encourage applicants to draft their specifications such that sequence data that is not essential material is not required to be included in a “Sequence Listing”. A reduced number of mega-“Sequence Listings” will benefit the Office and the public by reducing the strain on Office resources, thus facilitating the effective administration of the patent system.
The fee under 37 CFR 1.21(o) is due upon the first submission of a “Sequence Listing” that exceeds 800 MB, or the first submission of a “Sequence Listing” of at least 300 MB, whichever applicable fee is higher. As an example, if an application was filed prior to January 16, 2018 (with or without a text file “Sequence Listing”), and thereafter a mega-“Sequence Listing” that is between 300 and 800 MB is filed, the fee under 37 CFR 1.21(o)(1) is due. If an applicant thereafter files a corrected “Sequence Listing” that is also between 300 and 800 MB, no additional fee is due. If a further corrected “Sequence Listing” is filed and the file size exceeds 800 MB, then the total fee owed under 37 CFR 1.21(o) is the fee set forth in 37 CFR 1.21(o)(2). The fee, which is the difference between the current fee and the prior paid fee, is due upon submission of the mega-“Sequence Listing”. Subsequent deletion or reduction in size of a “Sequence Listing” does not change the requirement to pay the mega-“Sequence Listing” submission fee.
The fee under 37 CFR 1.21(o) does not apply to international applications in the international stage, but does apply to the submission of mega-“Sequence Listings” received in national stage applications under 35 U.S.C. 371, including mega-“Sequence Listings” received by the Office pursuant to PCT Article 20. See MPEP § 2422.03(a), subsection IV, for additional information.
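The two-tier fee logic described in this subsection can be sketched in Python as follows. The function names are hypothetical, and no fee amounts are modeled because none are stated here. The sketch tracks only which tier applies and whether a further payment would be triggered, under the assumption that the highest tier already paid is recorded by the caller.

```python
def fee_tier(size_mb):
    """Return 0 (no fee), 1 (300 MB to 800 MB) or 2 (greater than 800 MB)."""
    if size_mb > 800:
        return 2
    if size_mb >= 300:
        return 1
    return 0


def additional_fee_due(size_mb, highest_tier_paid=0, international_stage=False):
    """Return True if submitting a listing of `size_mb` would trigger a
    (further) fee, given the highest tier already paid.

    A later, smaller listing never lowers `highest_tier_paid`, which
    mirrors the statement that deletion or size reduction does not
    change the fee requirement. The fee does not apply during the
    international stage of an international application.
    """
    if international_stage:
        return False
    return fee_tier(size_mb) > highest_tier_paid
```

Following the example in the text, a first listing of 500 MB triggers the Tier 1 fee, a corrected listing of 600 MB triggers nothing further, and a later listing of 900 MB triggers the difference between the Tier 2 and Tier 1 fees.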
2422.03(a) “Sequence Listing” Submitted as ASCII Plain Text Files [R-01.2024]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
The Legal Framework for Patent Electronic System (www.uspto.gov/PatentLegalFramework) and MPEP § 502.05 provide detailed information pertaining to filing applications and other documents via the USPTO patent electronic filing system. The information below is specific to “Sequence Listing” submissions via the USPTO patent electronic filing system.
Pursuant to the Legal Framework for Patent Electronic System, applicants may submit a “Sequence Listing” under 37 CFR 1.821 as an ASCII plain text file via the USPTO patent electronic filing system or on read-only optical disc(s), provided the specification contains a statement in a separate paragraph (preferably on the first page) that incorporates by reference the material in the ASCII plain text file identifying the name of the ASCII plain text file, the date of creation, and the size of the ASCII plain text file in bytes. See 37 CFR 1.77(b)(5) and 1.823(b)(1). An exception is that an incorporation by reference statement is not required in an international application and is not required in an application filed under 35 U.S.C. 371 where the “Sequence Listing” has been previously communicated to the International Bureau or originally filed in the USPTO and complies with Patent Cooperation Treaty Rule 5.2. See 37 CFR 1.821(c), 1.823(b)(2), and 1.825(c). The requirements of 37 CFR 1.52(e) for documents submitted on read-only optical disc(s) are not applicable to a “Sequence Listing” submitted as ASCII plain text files via the USPTO patent electronic filing system. However, each text file must be an ASCII plain text file and have a file name with a “.txt” extension. See 37 CFR 1.824.
I. ASCII PLAIN TEXT FILES SERVE AS BOTH THE “SEQUENCE LISTING” AND THE CRF
It is recommended that a “Sequence Listing” be submitted as an ASCII plain text file via the USPTO patent electronic filing system rather than as a PDF image file. See subsection IV, below, for information regarding filing an international application (PCT) with a sequence listing ASCII plain text file via the USPTO patent electronic filing system.
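The three items that the incorporation by reference statement must identify (file name, date of creation, size in bytes) can be gathered with a short Python sketch. The function name `describe_listing_file` and the returned dictionary keys are hypothetical. The date of creation is supplied by the caller, because file system timestamps do not reliably record it. The sketch does not draft the statement itself, whose wording is left to the applicant, and it assumes that a case-insensitive match on the “.txt” extension is acceptable.

```python
import datetime
from pathlib import Path


def describe_listing_file(path, created):
    """Return the file name, creation date and size in bytes of an
    ASCII plain text "Sequence Listing" file.

    path:    location of the file on disk.
    created: datetime.date supplied by the caller.
    Raises ValueError if the extension is not .txt or the content
    contains non-ASCII bytes.
    """
    file_path = Path(path)
    if file_path.suffix.lower() != ".txt":
        raise ValueError("file name must have a .txt extension")

    data = file_path.read_bytes()
    if any(byte > 127 for byte in data):
        raise ValueError("file contains non-ASCII bytes")

    return {
        "name": file_path.name,
        "created": created.isoformat(),
        "size_bytes": len(data),
    }
```

The ASCII test here checks only that every byte is in the 7-bit range; it does not verify any other formatting requirement of 37 CFR 1.824.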
If the “Sequence Listing” is submitted in an application filed under 35 U.S.C. 111(a) as an ASCII plain text file in compliance with 37 CFR 1.824 via the USPTO patent electronic filing system and applicant has not filed a “Sequence Listing” as a PDF image file or on physical sheets of paper, the ASCII plain text file will serve as both the “Sequence Listing” under 37 CFR 1.821(c) and the CRF of the “Sequence Listing” under 37 CFR 1.821(e). See 37 CFR 1.821(e)(1). Similarly, if the “Sequence Listing” is filed in a national stage application submitted under 35 U.S.C. 371 as an ASCII plain text file in compliance with 37 CFR 1.824 via the USPTO patent electronic filing system, the ASCII plain text file will serve as both the “Sequence Listing” under 37 CFR 1.821(c) and the CRF of the “Sequence Listing” under 37 CFR 1.821(e). Thus, the following are not required and should not be submitted: (1) a second copy of the “Sequence Listing” as a PDF image file or on physical sheets of paper; and (2) a statement under 37 CFR 1.821(e)(1)(ii) or (2)(ii) (indicating that the sequence information contained in the “Sequence Listing” under 37 CFR 1.821(c) and CRF copy of the “Sequence Listing” under 37 CFR 1.821(e)(1)(ii) or (2)(ii) are identical). Also, the practice of CRF transfers has been eliminated. Checker software that may be used to check a “Sequence Listing” for compliance with the requirements of 37 CFR 1.824 is available on the USPTO website at www.uspto.gov/Checker4. The User Notes on the Checker website should be consulted for an explanation of the scope of errors and content that is able to be verified by the Checker software.
If a user adds a “Sequence Listing” (under PCT Rule 13ter) as an ASCII plain text file via the USPTO patent electronic filing system in response to a requirement under 37 CFR 1.821(h), the “Sequence Listing” ASCII plain text file must be accompanied by a statement that the ASCII plain text file does not go beyond the disclosure in the international application as filed and the late furnishing fee as set forth in 37 CFR 1.445(a)(5). However, if a user submits an amendment to add or replace a “Sequence Listing” (under 37 CFR 1.821(c)) as an ASCII plain text file via the USPTO patent electronic filing system in response to a requirement under 37 CFR 1.821(g), the submission must comply with 37 CFR 1.825. See MPEP § 2426.
In applications filed under 35 U.S.C. 111(a) and 371, submission of the “Sequence Listing” as a PDF image file or on physical sheets of paper is not recommended. Where the “Sequence Listing” is submitted as a PDF image or on physical sheets of paper, applicant must provide the CRF required by 37 CFR 1.821(e)(1)(i). Note that the “Sequence Listing” in the PDF image file or on physical sheets of paper will not be excluded when determining the application size fee. The USPTO prefers the submission of sequence information in an ASCII plain text file via the USPTO patent electronic filing system because as stated above, if in an application filed under 35 U.S.C. 111(a) or 35 U.S.C. 371 applicant has not filed a second copy of the “Sequence Listing” as a PDF image file or on physical sheets of paper (see 37 CFR 1.821(e)(1)), the ASCII plain text file will serve as both the “Sequence Listing” required by 37 CFR 1.821(c) and the CRF required by 37 CFR 1.821(e).
II. APPLICATION SIZE FEE
Any “Sequence Listing” or CRF of a “Sequence Listing” submitted as an ASCII plain text file via the USPTO patent electronic filing system that is in compliance with 37 CFR 1.821(c) or (e) will be excluded when determining the application size fee required by 37 CFR 1.16(s) or 1.492(j) as per 37 CFR 1.52(f)(2). A “Sequence Listing” submitted as a PDF image file via the USPTO patent electronic filing system or on read-only optical disc will not be excluded when determining the application size fee.
See subsection IV, below, for additional information regarding application size fees in an international application (PCT).
III. SIZE RESTRICTIONS FOR ASCII PLAIN TEXT FILES
One hundred (100) megabytes is the size limit for “Sequence Listing” ASCII plain text files submitted via the USPTO patent electronic filing system, and compression is not allowed for a “Sequence Listing” submitted via the USPTO patent electronic filing system. If a user wishes to submit a “Sequence Listing” ASCII plain text file that exceeds 100 megabytes, it is recommended that the user file the application without the “Sequence Listing” using the USPTO patent electronic filing system to obtain the application number and confirmation number, and then file the “Sequence Listing” on read-only optical disc(s) in accordance with 37 CFR 1.52(e) and 1.824 on the same day by using Priority Mail Express® from the USPS in accordance with 37 CFR 1.10, or hand delivery, in order to secure the same filing date for all parts of the application. Note: a submission of a “Sequence Listing” in electronic form of 300 MB or more in size is subject to an oversized submission fee set forth in 37 CFR 1.21(o). See 37 CFR 1.52(f)(3). Alternatively, a user may submit the application on physical sheets of paper and include the “Sequence Listing” ASCII plain text file on read-only optical disc(s) in accordance with 37 CFR 1.52(e). “Sequence Listing” ASCII plain text files may not be divided into multiple files so as to not exceed the 100 MB size limit for filing via the USPTO patent electronic filing system, and any “Sequence Listing” greater than 100 MB must be submitted on read-only optical disc(s). If the “Sequence Listing” is filed on a read-only optical disc, the “Sequence Listing” must be a single file, and any file that is not compressed must be contained on a single read-only optical disc.
However, a compressed file that does not fit on a single read-only optical disc may be split into multiple file parts, in accordance with the target read-only optical disc size. Each read-only optical disc must have a label permanently affixed thereto on which the following information has been hand-printed or typed: (i) First-named inventor (if known); (ii) Title of the invention; (iii) Attorney docket or file reference number (if applicable); (iv) Application number and filing date (if known); (v) Date on which the data were recorded on the read-only optical disc; and (vi) Disc order (e.g., “1 of X”), if multiple read-only optical discs are submitted. See 37 CFR 1.52(e)(5) and 1.824(b).
See subsection IV.B, below, for information regarding submission of a sequence listing text file that exceeds 100 megabytes in an international application (PCT) filed via the USPTO patent electronic filing system.
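The size thresholds in this subsection for a “Sequence Listing” in an application filed under 35 U.S.C. 111(a) or 371 can be summarized as a simple decision rule. The sketch below is illustrative only; the function name and return keys are hypothetical, and it assumes decimal megabytes (1 MB = 1,000,000 bytes), a unit the text above does not define.

```python
MB = 1_000_000                   # assumption: decimal megabytes; the unit is not defined above
EFS_LIMIT_BYTES = 100 * MB       # limit for filing via the electronic filing system
OVERSIZED_BYTES = 300 * MB       # threshold for the 37 CFR 1.21(o) oversized submission fee


def sequence_listing_route(size_bytes: int) -> dict:
    """Summarize the filing route for a single, uncompressed ASCII text file.

    Hypothetical helper. A "Sequence Listing" may not be divided into
    several files to get under the electronic filing limit, so the only
    alternative modeled here is read-only optical disc(s).
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    electronic = size_bytes <= EFS_LIMIT_BYTES
    return {
        "route": "electronic" if electronic else "optical_disc",
        "oversized_fee": size_bytes >= OVERSIZED_BYTES,
    }


print(sequence_listing_route(42 * MB))
print(sequence_listing_route(350 * MB))
```

The rule deliberately has no "split into parts" branch, since that option is available only for “Large Tables” and a “Computer Program Listing Appendix”.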
The current size limit on the USPTO patent electronic filing system for ASCII plain text file submissions of “Large Tables” and a “Computer Program Listing Appendix” is 25 megabytes per submission, and compression is not allowed for “Large Tables” and a “Computer Program Listing Appendix” submitted as ASCII plain text files via the USPTO patent electronic filing system. Files above the 25 MB limit for “Large Tables” and a “Computer Program Listing Appendix” may be either (1) broken up into multiple files that are no larger than 25 MB each and those smaller files may be submitted via the USPTO patent electronic filing system or (2) submitted on read-only optical disc(s) (see 37 CFR 1.52(e)). If the user chooses to break up a “Large Table” or “Computer Program Listing Appendix” file so that it may be submitted via the USPTO patent electronic filing system, the file names must indicate their order (e.g., “1 of X”, “2 of X”). If a user wishes to file an application with a “Large Table” or “Computer Program Listing Appendix” ASCII plain text file that is larger than 25 megabytes, it is recommended that the user file the application without the “Large Table” or “Computer Program Listing Appendix” using the USPTO patent electronic filing system to obtain the application number and confirmation number, and then file the “Large Table” or “Computer Program Listing Appendix” on read-only optical disc(s) in accordance with 37 CFR 1.52(e) and 1.58(c) or 1.96(c) on the same day by using Priority Mail Express® from the USPS in accordance with 37 CFR 1.10, or hand delivery, in order to secure the same filing date for all parts of the application. Alternatively, a user may submit the application on physical sheets of paper and include the “Large Table” or “Computer Program Listing Appendix” ASCII plain text file on read-only optical disc(s) in accordance with 37 CFR 1.52(e) in order to secure the same filing date for all parts of the application. 
See 37 CFR 1.58(f) and 37 CFR 1.96(c)(4).
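For “Large Tables” and a “Computer Program Listing Appendix” (but not for a “Sequence Listing”), a file over the 25 MB limit may be broken into ordered parts. The following minimal sketch shows one way the parts and their order-indicating names could be produced. The function name and the naming pattern are hypothetical, the 25 MB default assumes decimal megabytes, and the split is made at raw byte boundaries, so a real submission would likely need to split on line boundaries instead.

```python
import math
from typing import List, Tuple

LIMIT_BYTES = 25 * 1_000_000  # assumption: decimal megabytes


def split_for_upload(data: bytes, base_name: str,
                     limit: int = LIMIT_BYTES) -> List[Tuple[str, bytes]]:
    """Split data into parts of at most `limit` bytes with ordered file names.

    Hypothetical helper; the "<base>_<n>_of_<X>.txt" pattern is only one way
    to make the file names indicate their order.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    total = max(1, math.ceil(len(data) / limit))
    parts = []
    for i in range(total):
        chunk = data[i * limit:(i + 1) * limit]
        parts.append((f"{base_name}_{i + 1}_of_{total}.txt", chunk))
    return parts


# Tiny illustrative input with an artificially small limit.
for name, chunk in split_for_upload(b"x" * 10, "table", limit=4):
    print(name, len(chunk))
```

Concatenating the chunks in name order reproduces the original bytes, which gives a quick check that nothing was lost in the split.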
IV. FILING A SEQUENCE LISTING IN INTERNATIONAL APPLICATIONS (PCT) VIA THE USPTO PATENT ELECTRONIC FILING SYSTEM
Under PCT Rule 5.2(a), “where the international application contains disclosure of one or more nucleotide and/or amino acid sequences, the description shall contain a sequence listing complying with the standard provided for in the Administrative Instructions and presented as a separate part of the description in accordance with that standard”. The standard is set forth in the PCT Administrative Instructions Annex C, entitled Standard for the Presentation of Nucleotide and Amino Acid Sequence Listings in International Patent Applications Under the PCT. When filing an international application (PCT) using the USPTO patent electronic filing system, it is highly recommended to submit a sequence listing as a single ASCII plain text file with a “.txt” extension on the international filing date. Note that 100 megabytes is the size limit for submitting a sequence listing ASCII plain text file via the USPTO patent electronic filing system. See subsection IV.B, below. Although it is not recommended, applicant can also submit the sequence listing in a PCT application as a PDF image file as part of the application on the international filing date.
If the sequence listing is submitted as an ASCII plain text file when filing a new international application (PCT), applicant need not and should not submit any additional copies, including PDF image files. The single ASCII plain text file is preferred because the ASCII plain text file will serve both as the sequence listing part of the description required under PCT Rule 5.2 and the electronic form under PCT Rule 13ter.1(a) in the absence of a PDF sequence listing file. The check list (Box No. IX) of the PCT Request (form PCT/RO/101) provided via the USPTO patent electronic filing system together with the international application (PCT) must indicate that the sequence listing submitted as an ASCII plain text file forms part of the international application. Furthermore, the statement as set forth in paragraph 4(v) of the AI Annex C (Administrative Instructions under the PCT, Annex C), that “the information recorded in electronic form furnished under PCT Rule 13ter is identical to the sequence listing as contained in the international application,” is not necessary.
Submission of the sequence listing part of the description as a PDF image file in a new international application (PCT) is not recommended because, where the application does not contain a sequence listing as an ASCII plain text file, the International Searching Authority or the International Preliminary Examining Authority may invite applicant to furnish a copy of the sequence listing in ASCII plain text format for the purposes of international search and/or preliminary examination. Any sequence listing submitted in response to this invitation will not form part of the application.
When a sequence listing is filed via the USPTO patent electronic filing system in a new PCT international application as both a PDF image file and an ASCII plain text file, but the Request form Box No. IX does not indicate which one forms part of the international application, the PDF image copy of the sequence listing will be considered to form part of the application and the ASCII plain text file will be considered to be an accompanying item for search purposes under PCT Rule 13ter.1(a) only.
The international filing fee for an international application (PCT) that includes a sequence listing, filed via the USPTO patent electronic filing system, is calculated based on the type of sequence listing file that is part of the description of the international application (PCT). A sequence listing filed as an ASCII plain text file will not be included in the sheet count of the international application (PCT). A sequence listing filed as a PDF image file will be included in the sheet count of the international application (PCT), when it is part of the description. When a new PCT international application is filed via the USPTO patent electronic filing system and contains the sequence listing part of the description as a PDF image file and a sequence listing ASCII plain text to be used only for search purposes under PCT Rule 13ter.1(a), the sheets of the PDF image file will count towards excess sheet fees, if any.
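The rules above on which sequence listing file forms part of an international application (PCT), and whether its sheets are counted, can be sketched as two small functions. This is an illustration only: the function names and the "txt"/"pdf" labels are hypothetical, and the sketch assumes that when Box No. IX identifies the ASCII plain text file as part of the application, an accompanying PDF is not part of the description and so is not counted.

```python
from typing import Optional


def listing_in_description(has_txt: bool, has_pdf: bool,
                           box_ix: Optional[str] = None) -> Optional[str]:
    """Return "txt" or "pdf" for the file treated as part of the description.

    Hypothetical helper. `box_ix` is the form indicated in Box No. IX of the
    Request ("txt", "pdf", or None when nothing is indicated).
    """
    if not (has_txt or has_pdf):
        return None
    if has_txt and has_pdf:
        # With no indication, the PDF forms part of the application and the
        # text file is an accompanying item for search purposes only.
        return box_ix if box_ix in ("txt", "pdf") else "pdf"
    return "txt" if has_txt else "pdf"


def counted_sheets(part: Optional[str], pdf_sheets: int) -> int:
    """PDF sheets count only when the PDF is part of the description."""
    return pdf_sheets if part == "pdf" else 0


part = listing_in_description(has_txt=True, has_pdf=True)
print(part, counted_sheets(part, 120))
```

Filing only the ASCII plain text file avoids the ambiguity entirely, which is consistent with the recommendation above.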
B. File Size and Quantity Limits
One hundred (100) megabytes is the size limit for sequence listing ASCII plain text files submitted via the USPTO patent electronic filing system. See 37 CFR 1.824(b)(1). Sequence listing ASCII plain text files must not be partitioned into multiple files for filing via the USPTO patent electronic filing system. The sequence listing must be in a single ASCII plain text file.
Applicant may use the USPTO patent electronic filing system to file part of the international application (PCT) and to obtain the international application (PCT) number and the confirmation number, and then file the remainder of the international application (PCT) on the same day as one or more follow-on submissions using the USPTO patent electronic filing system, in order to secure the same filing date for all parts of the international application (PCT). However, applicant is not permitted to file part of the international application (PCT) electronically via the USPTO patent electronic filing system, and then file the remainder of the international application (PCT) on paper to secure a filing date of all parts of the international application (PCT).
In the situation where applicant needs to file a sequence listing that is over one hundred (100) megabytes, applicant may use the USPTO patent electronic filing system to file the international application (PCT) without the sequence listing to obtain the international application (PCT) number and the confirmation number, and then file the sequence listing on read-only optical discs on the same day by using Priority Mail Express® from the USPS in accordance with 37 CFR 1.10, or by hand delivery, in order to secure the same filing date for all parts of the international application (PCT). However, the read-only optical discs must not contain PDF image files and must fully comply with the guidelines for filing a sequence listing on electronic media. The check list of the PCT Request provided via the USPTO patent electronic filing system together with the international application (PCT) must indicate that the sequence listing will be filed separately on physical data carrier(s), on the same day and in the form of an Annex C/ST.25 text file. The sequence listing must be a single file, but the file may be split for submission on multiple physical media using software designed to divide a file into multiple files for subsequent concatenation. If the user breaks up a sequence listing for submission on multiple read-only optical discs, the read-only optical discs must be labeled to indicate their order (e.g., “1 of X”, “2 of X”).
Submissions of very lengthy sequence listings (300 MB or over, also known as mega-sequence listings) in international applications are not subject to the mega-sequence listing submission fees set forth in 37 CFR 1.21(o). However, for mega-sequence listing submissions on or after January 16, 2018, the fee under 37 CFR 1.21(o) does apply to the submission of mega-sequence listings received in national stage applications under 35 U.S.C. 371, including mega-sequence listings received by the Office pursuant to PCT Article 20. Similarly, if an international application is filed at RO/US with a mega-sequence listing, and thereafter a bypass continuing application is filed under 35 U.S.C. 111(a), the fee under 37 CFR 1.21(o) will be due in the continuing application for mega-sequence listing submissions on or after January 16, 2018. See 37 CFR 1.52(f)(3).
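The applicability of the 37 CFR 1.21(o) fee described above depends on the size of the listing, the type of application, and the submission date. The sketch below encodes that rule. The function name and application-type labels are hypothetical, decimal megabytes are assumed, and the January 16, 2018 date is applied to every application filed under 35 U.S.C. 111(a), although the text above states it only for bypass continuing applications.

```python
from datetime import date

FEE_START = date(2018, 1, 16)        # date stated above for 371 and bypass applications
MEGA_BYTES = 300 * 1_000_000         # assumption: decimal megabytes


def mega_fee_applies(app_type: str, size_bytes: int, submitted: date) -> bool:
    """Return True if the 37 CFR 1.21(o) fee would apply under the rule above.

    Hypothetical helper. `app_type` is one of:
      "pct_international" - international application in the international stage
      "371"               - national stage application under 35 U.S.C. 371
      "111a"              - application under 35 U.S.C. 111(a)
    """
    if app_type not in ("pct_international", "371", "111a"):
        raise ValueError(f"unknown application type: {app_type}")
    if size_bytes < MEGA_BYTES:
        return False                 # not a mega-sequence listing
    if app_type == "pct_international":
        return False                 # international applications are not subject to the fee
    return submitted >= FEE_START


print(mega_fee_applies("pct_international", 400_000_000, date(2020, 5, 1)))
print(mega_fee_applies("371", 400_000_000, date(2020, 5, 1)))
```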
2422.04 The Requirement for a Computer Readable Copy of the “Sequence Listing” [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
37 CFR 1.821(e) requires the submission of a copy of the “Sequence Listing” in computer readable form (CRF) in an application filed under 35 U.S.C. 111(a) or in a national stage application submitted under 35 U.S.C. 371. A separate computer readable form must be submitted via the USPTO patent electronic filing system or on read-only optical disc(s), as permitted by 37 CFR 1.824(b), when the “Sequence Listing” required by 37 CFR 1.821(c) is submitted as a PDF image file or on physical sheets of paper in a U.S. application filed under 35 U.S.C. 111(a) (see 37 CFR 1.821(e)(1)) or when the “Sequence Listing” required by 37 CFR 1.821(c) is submitted as a PDF image file or on physical sheets of paper and not also submitted as an ASCII plain text file in a national stage application (see 37 CFR 1.821(e)(2)). However, the Office prefers submission of sequence information as an ASCII plain text file via the USPTO patent electronic filing system or on read-only optical disc(s) without a copy of the “Sequence Listing” as a PDF image file or on physical sheets of paper in all applications because such an ASCII submission will serve as both the “Sequence Listing” required by 37 CFR 1.821(c) and the CRF of the “Sequence Listing” required by 37 CFR 1.821(e) and the “Sequence Listing” submitted as an ASCII plain text file will not be included in the application size fee determination under 37 CFR 1.52(f)(1) or (2). See MPEP § 2422.03(a)(I).
The information on the computer readable form will be entered into the Office’s database for searching and printing nucleotide and amino acid sequences. This electronic database will also enable the Office to provide published sequence data, in electronic form, to the National Center for Biotechnology Information (NCBI) for publication in GenBank, and enable NCBI to exchange data with the DNA Data Bank of Japan (DDBJ) and the European Bioinformatics Institute (EBI). It should be noted that the Office’s database complies with the confidentiality requirement imposed by 35 U.S.C. 122. Unpublished pending application sequences are maintained in the database separately from published or patented sequences. That is, the Office will not exchange or make public any information on any sequence until the patent application containing that information is published or matures into a patent, or as otherwise allowed by 35 U.S.C. 122.
The Office may permit correction of the “Sequence Listing” submitted pursuant to 37 CFR 1.821(c), whether on physical sheets of paper or as a PDF image file, at the least, during the pendency of a given application by reference to the computer readable form thereof submitted pursuant to 37 CFR 1.821(e) if both the “Sequence Listing” and computer readable form were submitted at the time of filing of the application and the totality of the circumstances otherwise substantiate the proposed correction. A mere discrepancy between the “Sequence Listing” and the computer readable form may not, in and of itself, be sufficient to justify a proposed correction. In this regard, the Office will assume that the computer readable form has been incorporated by reference into the application when the “Sequence Listing” and computer readable form were submitted at the time of filing of the application. The Office will attempt to accommodate or address all correction issues, but it must be kept in mind that the real burden rests with the applicant to ensure that any discrepancies between the “Sequence Listing” and the CRF copy are eliminated or minimized. Applicants should be aware that there will be instances where the applicant may have to suffer the consequences of any discrepancies between the two. It is noted that in an application filed under 35 U.S.C. 111(a) in which applicant has not filed a second copy of the “Sequence Listing” as a PDF image file or on physical sheets of paper (see 37 CFR 1.821(e)(1)), an ASCII plain text file will serve as both the “Sequence Listing” required by 37 CFR 1.821(c) and the CRF required by 37 CFR 1.821(e), eliminating any chance for discrepancies. Filing the “Sequence Listing” as an ASCII plain text file submitted via the USPTO patent electronic filing system that complies with both 37 CFR 1.821(c) and (e) is the Office’s preferred method of receiving a “Sequence Listing”.
The Office does not desire to be bound by a requirement to permanently preserve computer readable forms submitted on read-only optical disc(s) for support, priority or correction purposes. Thus, once use of the CRF by the Office for processing has ended, i.e., once the Office has entered the data contained on the computer readable form into the appropriate database, the Office does not intend to further preserve the CRF submitted by the applicant, and applicant should not expect to have the read-only optical disc(s) returned. See 37 CFR 1.52(e)(6).
2422.05 [Reserved]
2422.06 Requirement for Statement Regarding Information Contained in the “Sequence Listing” and Separate Computer Readable Form [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
When a separate computer readable form (CRF) of a “Sequence Listing” is submitted in an application filed under 35 U.S.C. 111(a) because the “Sequence Listing” is filed as a PDF image file (37 CFR 1.821(c)(2)) or on physical sheets of paper (37 CFR 1.821(c)(3)) (see 37 CFR 1.821(e)(1)(i)) or filed in a national stage application submitted under 35 U.S.C. 371 because the “Sequence Listing” is filed as a PDF image file (37 CFR 1.821(c)(2)) or on physical sheets of paper (37 CFR 1.821(c)(3)) and not also as an ASCII plain text file in compliance with 37 CFR 1.824 (see 37 CFR 1.821(e)(2)(i)), 37 CFR 1.821(e)(1)(ii) and (2)(ii) require a statement that the information contained in the “Sequence Listing” and the separate CRF are identical. When a CRF is submitted in the international stage of an international application under the PCT in response to a notice requesting an ASCII plain text formatted sequence listing by the United States International Searching Authority or by the United States International Preliminary Examining Authority because a sequence listing in ASCII plain text format in compliance with 37 CFR 1.824 has not been filed (see 37 CFR 1.821(e)(3)(i)), 37 CFR 1.821(e)(3)(iii) requires a statement that the information contained in the CRF does not go beyond the disclosure in the international application as filed or a statement that the information recorded in the ASCII plain text file of the CRF is identical to the sequence listing contained in the international application as filed, as applicable. Such a statement may be made by a registered practitioner, the applicant, an inventor, or the person who actually compares the sequence data on behalf of the aforementioned. See MPEP § 2428 for further information and Sample Statements.
Note that, in an application filed under 35 U.S.C. 111(a), if a “Sequence Listing” is filed as an ASCII plain text file via the USPTO patent electronic filing system or on a read-only optical disc under 37 CFR 1.52(e), and applicant has not filed a “Sequence Listing” as a PDF image file or on physical sheets of paper, the ASCII plain text file will serve as both the “Sequence Listing” required by 37 CFR 1.821(c) and the computer readable form (CRF) required by 37 CFR 1.821(e) (37 CFR 1.821(e)(1)). Also, in a national stage application submitted under 35 U.S.C. 371, if a “Sequence Listing” is filed as an ASCII plain text file via the USPTO patent electronic filing system or on a read-only optical disc under 37 CFR 1.52(e), the ASCII plain text file will serve as both the “Sequence Listing” required by 37 CFR 1.821(c) and the computer readable form (CRF) required by 37 CFR 1.821(e). See MPEP § 2422.03(a), subsection I, for additional information. See also MPEP § 2422.03(a) subsection IV, for additional information regarding international stage applications. Thus, for applications filed under 35 U.S.C. 111(a) and 35 U.S.C. 371, the following are not required and should not be submitted: (1) a second copy of the “Sequence Listing” as a PDF image file or on physical sheets of paper; and (2) a statement under 37 CFR 1.821(e)(1)(ii) or (2)(ii) (indicating that the sequence information contained in the “Sequence Listing” under 37 CFR 1.821(c) and CRF copy of the “Sequence Listing” under 37 CFR 1.821(e)(1)(ii) or (2)(ii) are identical).
2422.07 Requirements for Compliance and Consequences of Non-Compliance [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
37 CFR 1.821(g) requires compliance with the requirements of 37 CFR 1.821(b) through (e), as discussed above, if they are not satisfied at the time of filing under 35 U.S.C. 111(a) or at the time of entering the national stage of an international application under 35 U.S.C. 371, within the period of time set in a notice requiring compliance. When applicant files an amendment to comply with the requirements of 37 CFR 1.821(g) and that amendment adds or replaces a “Sequence Listing” and CRF copy thereof, the amendment must be submitted in accordance with the requirements of 37 CFR 1.825. Failure to provide a proper reply in compliance with 37 CFR 1.825 will result in the abandonment of the application. See MPEP § 2426. Extensions of time in which to reply to a requirement under this paragraph are available pursuant to 37 CFR 1.136. Note, however, that patent applications filed under 35 U.S.C. 111 on or after December 18, 2013, and international patent applications in which the national stage commenced under 35 U.S.C. 371 on or after December 18, 2013, may be subject to reductions in patent term adjustment pursuant to 37 CFR 1.704(c)(13) if they are not in condition for examination within eight months from the filing date or date of commencement, respectively. “In condition for examination” includes compliance with 37 CFR 1.821 through 1.825 (see 37 CFR 1.704(f)).
Provisional applications filed under 35 U.S.C. 111(b) need not comply with 37 CFR 1.821 through 1.825; however, applicants are encouraged to file a “Sequence Listing” as defined in 37 CFR 1.821(c) for ease of identification of the sequence information contained in the provisional application.
If any of the requirements of 37 CFR 1.821(e)(3) are not satisfied at the time of filing an international application under the Patent Cooperation Treaty (PCT), and the application is to be searched by the United States International Searching Authority or examined by the United States International Preliminary Examining Authority, the applicant may be sent a notice necessitating compliance with the requirements within a prescribed time period. Where a sequence listing under PCT Rule 13ter is provided in reply to a requirement under 37 CFR 1.821(h), the sequence listing must be accompanied by a statement that the information recorded in the ASCII plain text file under 37 CFR 1.821(e)(3)(i) is identical to the sequence listing contained in the international application as filed, or does not go beyond the disclosure in the international application as filed, as applicable. Such a statement may be made by a registered practitioner, the applicant, an inventor, or the person who actually compares the sequence data on behalf of the aforementioned. Also, the ASCII plain text file under 37 CFR 1.821(e)(3)(i) must be accompanied by the late furnishing fee, as set forth in 37 CFR 1.445(a)(5). International applications that fail to comply with any of the requirements of 37 CFR 1.821(e)(3) will be searched and/or examined only to the extent possible without the benefit of the information in computer readable form. See PCT Administrative Instructions Section 513(c).
The requirement to submit a statement that a submission in reply to the requirement under 37 CFR 1.821(h) does not go beyond the disclosure in the application as filed or that the information recorded in the ASCII plain text file under 37 CFR 1.821(e)(3)(i) is identical to the sequence listing contained in the international application as filed is not the first instance in which the applicant has been required to ensure that there is not new matter upon amendment. The requirement is analogous to that found in 37 CFR 1.125 regarding substitute specifications. When a substitute specification is required because the number or nature of amendments would make it difficult to examine the application, the applicant must include a statement that the substitute specification includes no new matter. The necessity of requiring sequence information as an ASCII plain text file is similar to the necessity of requiring a substitute specification and, likewise, the burden is on the applicant to ensure that no new matter is added. Applicants have a duty to comply with the statutory prohibition (35 U.S.C. 132 and 35 U.S.C. 251) against the introduction of new matter.
The correction of errors in sequencing or any other errors that are made in describing an invention are subject to the statutory prohibition (35 U.S.C. 132 and 35 U.S.C. 251) against the introduction of new matter.
2422.08 Presumptions Regarding Compliance [R-07.2022]
[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]
Neither the presence nor absence of information which is not required under the sequence rules will create a presumption that such information is necessary to satisfy any of the requirements of 35 U.S.C. 112. Further, the grant of a patent on an application that is subject to 37 CFR 1.821 through 37 CFR 1.825 constitutes a presumption that the granted patent complies with the requirements of these rules.