| Class 704 | DATA PROCESSING: SPEECH SIGNAL PROCESSING, LINGUISTICS, LANGUAGE TRANSLATION, AND AUDIO COMPRESSION/DECOMPRESSION |
| 1 | LINGUISTICS |
| 2 | Translation machine |
| 3 | Having particular Input/Output device |
| 4 | Based on phrase, clause, or idiom |
| 5 | For partial translation |
| 6 | Punctuation |
| 7 | Storage or retrieval of data |
| 8 | Multilingual or national language support |
| 9 | Natural language |
| 10 | Dictionary building, modification, or prioritization |
| 200 | SPEECH SIGNAL PROCESSING |
| 200.1 | Psychoacoustic |
| 201 | For storage or transmission |
| 202 | Neural network |
| 203 | Transformation |
| 205 | Frequency |
| 211 | Time |
| 212 | Pulse code modulation (PCM) |
| 213 | Zero crossing |
| 214 | Voiced or unvoiced |
| 215 | Silence decision |
| 216 | Correlation function |
| 219 | Linear prediction |
| 220 | Analysis by synthesis |
| 221 | Pattern matching vocoders |
| 224 | Normalizing |
| 225 | Gain control |
| 226 | Noise |
| 229 | Adaptive bit allocation |
| 230 | Quantization |
| 231 | Recognition |
| 232 | Neural network |
| 233 | Detect speech in noise |
| 234 | Normalizing |
| 235 | Speech to image |
| 236 | Specialized equations or comparisons |
| 237 | Correlation |
| 238 | Distance |
| 239 | Similarity |
| 240 | Probability |
| 241 | Dynamic time warping |
| 242 | Viterbi trellis |
| 243 | Creating patterns for matching |
| 246 | Voice recognition |
| 251 | Word recognition |
| 252 | Preliminary matching |
| 253 | Endpoint detection |
| 254 | Subportions |
| 255 | Specialized models |
| 256 | Markov |
| 256.1 | Hidden Markov Model (HMM) (EPO) |
| 256.2 | Training of HMM (EPO) |
| 256.3 | With insufficient amount of training data, e.g., state sharing, tying, deleted interpolation (EPO) |
| 256.4 | Duration modeling in HMM, e.g., semi HMM, segmental models, transition probabilities (EPO) |
| 256.5 | Hidden Markov (HM) network (EPO) |
| 256.6 | State emission probability (EPO) |
| 257 | Natural language |
| 258 | Synthesis |
| 259 | Neural network |
| 260 | Image to speech |
| 261 | Vocal tract model |
| 262 | Linear prediction |
| 263 | Correlation |
| 264 | Excitation |
| 265 | Interpolation |
| 266 | Specialized model |
| 267 | Time element |
| 268 | Frequency element |
| 269 | Transformation |
| 270 | Application |
| 500 | AUDIO SIGNAL BANDWIDTH COMPRESSION OR EXPANSION |
| 503 | AUDIO SIGNAL TIME COMPRESSION OR EXPANSION (E.G., RUN LENGTH CODING) |
| E-SUBCLASSES | ||
| The following subclasses beginning with the letter E are E-subclasses. Each E-subclass corresponds in scope to a classification in a foreign classification system, for example, the European Classification system (ECLA). The foreign classification equivalent to an E-subclass is identified in the subclass definition. In addition to US documents classified in E-subclasses by US examiners, documents are regularly classified in E-subclasses according to the classification practices of any foreign Offices identified in parentheses at the end of the title. For example, "(EPO)" at the end of a title indicates that both European and US patent documents, as classified by the EPO, are regularly added to the subclass. E-subclasses may contain subject matter outside the scope of this class. Consult their definitions, or the documents themselves, to clarify or interpret titles. |
| E17.001 | SPEAKER IDENTIFICATION OR VERIFICATION (EPO) |
| E17.002 | Recognition of special voice characteristics, e.g., for use in a lie detector; recognition of animal voices, etc. (EPO) |
| E17.003 | Systems using speaker recognizers (EPO) |
| E17.004 | Details (EPO) |
| E17.006 | Training, model building, enrollment (EPO) |
| E17.007 | Decision making techniques, pattern matching strategies (EPO) |
| E17.008 | Use of particular distance or distortion metric between probe pattern and reference templates (EPO) |
| E17.009 | Multimodal systems, i.e., based on the integration of multiple recognition engines or experts fusion (EPO) |
| E17.01 | Score normalization (EPO) |
| E17.011 | Use of phonemic categorization or speech recognition prior to speaker recognition or verification (EPO) |
| E17.012 | Hidden Markov Models (HMMs) (EPO) |
| E17.013 | Artificial neural networks, connectionist approaches (EPO) |
| E17.014 | Pattern transformations and operations aimed at increasing system robustness, e.g., against channel noise, different working conditions, etc. (EPO) |
| E17.015 | Interactive procedures, man-machine interface (EPO) |
| E15.001 | SPEECH RECOGNITION (EPO) |
| E15.002 | Assessment or evaluation of speech recognition systems (EPO) |
| E15.003 | Language recognition (EPO) |
| E15.004 | Feature extraction for speech recognition; selection of recognition unit (EPO) |
| E15.005 | Segmentation or word limit detection (EPO) |
| E15.007 | Creation of reference templates; training of speech recognition systems, e.g., adaption to the characteristics of the speaker's voice, etc. (EPO) |
| E15.014 | Speech classification or search (EPO) |
| E15.015 | Using distance or distortion measures between unknown speech and reference templates (EPO) |
| E15.016 | Using dynamic programming techniques, e.g., Dynamic Time Warping (DTW), etc. (EPO) |
| E15.017 | Using artificial neural networks (EPO) |
| E15.018 | Using natural language modeling (EPO) |
| E15.019 | Using context dependencies, e.g., language models, etc. (EPO) |
| E15.02 | Phonemic context, e.g., pronunciation rules, phonotactical constraints, phoneme n-grams, etc. (EPO) |
| E15.021 | Grammatical context, e.g., disambiguation of the recognition hypotheses based on word sequence rules, etc. (EPO) |
| E15.022 | Formal grammars, e.g., finite state automata, context free grammars, word networks, etc. (EPO) |
| E15.023 | Probabilistic grammars, e.g., word n-grams, etc. (EPO) |
| E15.024 | Semantic context, e.g., disambiguation of the recognition hypotheses based on word meaning, etc. (EPO) |
| E15.025 | Using prosody or stress (EPO) |
| E15.026 | Parsing for meaning understanding (EPO) |
| E15.027 | Using statistical models, e.g., Hidden Markov Models (HMMs), etc. (EPO) |
| E15.028 | Hidden Markov Models (HMMs) (EPO) |
| E15.029 | Training of Hidden Markov Models (HMMs) (EPO) |
| E15.03 | With insufficient amount of training data, e.g., state sharing, tying, deleted interpolation, etc. (EPO) |
| E15.031 | Duration modeling in Hidden Markov Models (HMMs), e.g., semi-HMM, segmental models, transition probabilities, etc. (EPO) |
| E15.032 | Hidden Markov Models (HMMs) network (EPO) |
| E15.033 | State emission probabilities (EPO) |
| E15.037 | Non-hidden Markov Model (EPO) |
| E15.038 | Recognition networks (EPO) |
| E15.039 | Speech recognition techniques for robustness in adverse environments, e.g., in noise, of stress induced speech, etc. (EPO) |
| E15.04 | Procedures used during a speech recognition process, e.g., man-machine dialogue, etc. (EPO) |
| E15.041 | Speech recognition using nonacoustical features, e.g., position of the lips, etc. (EPO) |
| E15.043 | Speech to text systems (EPO) |
| E15.044 | Speech recognition depending on application context, e.g., in a computer, etc. (EPO) |
| E15.045 | Systems using speech recognizers (EPO) |
| E15.046 | Constructional details of speech recognition systems (EPO) |
| E15.047 | Distributed recognition, e.g., in client-server systems for mobile phones or network applications, etc. (EPO) |
| E15.048 | Memory allocation or algorithm optimization to reduce hardware requirements (EPO) |
| E15.049 | Multiple recognizers used in sequence or in parallel; corresponding voting or score combination systems (EPO) |
| E15.05 | Recognizers for parallel processing (EPO) |
| E19.001 | SPEECH OR AUDIO SIGNAL ANALYSIS-SYNTHESIS TECHNIQUES FOR REDUNDANCY REDUCTION, E.G., IN VOCODERS, ETC.; CODING OR DECODING OF SPEECH OR AUDIO SIGNALS; COMPRESSION OR EXPANSION OF SPEECH OR AUDIO SIGNALS, E.G., SOURCE-FILTER MODELS, PSYCHOACOUSTIC ANALYSIS, ETC. (EPO) |
| E19.002 | Perceptual measures for quality assessment (EPO) |
| E19.003 | Correction of errors induced by the transmission channel, if related to the coding (EPO) |
| E19.004 | Lossless audio signal coding; perfect reconstruction of coded audio signal by transmission of coding error (EPO) |
| E19.005 | Multichannel audio signal coding and decoding, i.e., using interchannel correlation to reduce redundancies, e.g., joint-stereo, intensity-coding, matrixing, etc. (EPO) |
| E19.006 | Comfort noise, silence coding (EPO) |
| E19.007 | Speech coding using phonetic or linguistical decoding of the source; reconstruction using text-to-speech synthesis (EPO) |
| E19.008 | Systems using vocoders (EPO) |
| E19.009 | Audio watermarking, i.e., embedding inaudible data in the audio signal (EPO) |
| E19.01 | Using spectral analysis, e.g., transform vocoders, subband vocoders, perceptual audio coders, psychoacoustically based lossy encoding, etc., e.g., MPEG audio, Dolby AC-3, etc. (EPO) |
| E19.011 | Blocking, i.e., grouping of samples in time, choice of analysis window, overlap factor (EPO) |
| E19.013 | Noise substitution, i.e., substituting nontonal spectral components by noisy source (EPO) |
| E19.014 | Spectral prediction for pre-echo prevention; temporal noise shaping (TNS), e.g., in MPEG2 or MPEG4, etc. (EPO) |
| E19.015 | Quantization or dequantization of spectral components (EPO) |
| E19.018 | Using subband decomposition (EPO) |
| E19.02 | Using orthogonal transformation (EPO) |
| E19.022 | Dynamic bit allocation (EPO) |
| E19.023 | Using predictive techniques; codecs based on source-filter modelization (EPO) |
| E19.024 | Determination or coding of the spectral characteristics, e.g., of the short-term prediction coefficients, etc. (EPO) |
| E19.026 | Determination or coding of the excitation function; determination or coding of the long-term prediction characteristics (EPO) |
| E19.027 | Determination or coding of an excitation gain (EPO) |
| E19.028 | Using mixed excitation model, e.g., MELP, MBE, Split band LPC, HVXC, etc. (EPO) |
| E19.029 | Long-term prediction, i.e., removing periodical redundancies, e.g., adaptive codebook, pitch predictor, etc. (EPO) |
| E19.03 | Using sinusoidal excitation model (EPO) |
| E19.031 | Using prototype waveform decomposition or waveform interpolative coders (PWI) (EPO) |
| E19.032 | Determination or coding of a multipulse excitation (EPO) |
| E19.035 | Determination or coding of a code excitation; code excited linear prediction (CELP) vocoders (EPO) |
| E19.039 | Details of speech and audio coders (EPO) |
| E19.04 | Vocoder architecture (EPO) |
| E19.045 | Pre- or post-filtering (EPO) |
| E19.046 | Pre-filtering, e.g., high frequency emphasis prior to encoding, etc. (EPO) |
| E19.047 | Post-filtering, e.g., pitch enhancement, formant emphasis for decoder, etc. (EPO) |
| E19.048 | Audio streaming, i.e., formatting and decoding of an encoded audio signal (EPO) |
| E19.049 | Transcoding, i.e., converting between two coded representations avoiding cascaded coding-decoding (EPO) |
| E21.001 | MODIFICATION OF AT LEAST ONE CHARACTERISTIC OF SPEECH WAVES (EPO) |
| E21.002 | Speech enhancement, e.g., noise reduction, echo cancellation, etc. (EPO) |
| E21.003 | Applications (EPO) |
| E21.004 | Speech corrupted by noise (EPO) |
| E21.007 | Speech corrupted by echo-reverberation (EPO) |
| E21.008 | Speech corrupted by stress-Lombard effect (EPO) |
| E21.009 | Enhancement of intelligibility of clean or coded speech (EPO) |
| E21.01 | Enhancement of diverse speech (EPO) |
| E21.011 | Bandwidth extension taking place at the receiving side, e.g., generation of low- or high-frequency components, regeneration of spectral holes, etc. (EPO) |
| E21.012 | Separate reconstruction of interference and of speech signal (EPO) |
| E21.014 | Active noise canceling (EPO) |
| E21.015 | Public address system (EPO) |
| E21.016 | Suppression or repetition of time signal segments (EPO) |
| E21.017 | Time compression or expansion (EPO) |
| E21.019 | Transformation of speech into a nonaudible representation, e.g., speech visualization, speech processing for tactile aids, etc. (EPO) |
| E11.001 | MISCELLANEOUS ANALYSIS OR DETECTION OF SPEECH CHARACTERISTICS (EPO) |
| E11.002 | General speech analysis without concrete application (EPO) |
| E11.003 | Detection of presence or absence of speech signals (EPO) |
| E11.006 | Pitch determination of speech signals (EPO) |
| E11.007 | Voiced-unvoiced decision (EPO) |
| E13.001 | SPEECH SYNTHESIS; TEXT TO SPEECH SYSTEMS (EPO) |
| E13.002 | Methods for producing synthetic speech; speech synthesizers (EPO) |
| E13.003 | Concept-to-speech synthesizers; generation of natural phrases not from text but from machine-based concepts (EPO) |
| E13.004 | Sound editing, manipulating voice of the synthesizer (EPO) |
| E13.005 | Details of speech synthesis systems, e.g., synthesizer architecture, memory management, etc. (EPO) |
| E13.006 | Architecture of speech synthesizers (EPO) |
| E13.007 | Excitation (EPO) |
| E13.008 | Systems using speech synthesizers (EPO) |
| E13.009 | Elementary speech units used in speech synthesizers; concatenation rules (EPO) |
| E13.011 | Text analysis, generation of parameters for speech synthesis out of text, e.g., grapheme to phoneme translation, prosody generation, stress, or intonation determination, etc. (EPO) |
| FOREIGN ART COLLECTIONS | ||
| FOR000 | CLASS-RELATED FOREIGN DOCUMENTS |