rsc-464 ETC-unknow, rsc-464 Datasheet - Page 4

Manufacturer Part Number: rsc-464

Description: Speech Recognition Processor

Manufacturer: ETC-unknow

Datasheet: 1.RSC-464.pdf (41 pages)


RSC-464

Speech Technologies

Speech Recognition

The RSC-464 is designed to operate in tandem with the FluentChip™ technology library, including speaker

independent (SI), speaker dependent (SD), and speaker verification (SV) speech recognition. Combinations of

these technologies may be used to create applications that are rich in features. These are described below:

Speech and Music Synthesis

The RSC-464 provides high-quality speech compression using Sensory SX™ technology. One may select various

data rates from approximately 2.4 to 10.8Kbps to manage speech quality versus allotted memory. The highest data

rates use 16 kHz sample rates to provide high-quality reproduction of high-pitched voices. Speech and sound

effects may also be compressed using 8-bit PCM (64Kbps) or 4-bit ADPCM (32Kbps) technologies.
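
The trade-off between data rate and memory described above comes down to simple arithmetic. The sketch below is illustrative only: the function name, the 128 KB budget, and the convention that 1 Kbps means 1000 bits per second are assumptions, not values taken from the datasheet.

```python
def seconds_of_audio(memory_bytes: int, rate_kbps: float) -> float:
    """Approximate playback duration that fits in memory_bytes at rate_kbps.

    Assumes 1 Kbps = 1000 bits per second and ignores any index or
    header overhead, neither of which is specified in the source.
    """
    if rate_kbps <= 0:
        raise ValueError("rate_kbps must be positive")
    return memory_bytes * 8 / (rate_kbps * 1000)


# Data rates quoted in the text, in Kbps.
RATES_KBPS = {
    "SX lowest": 2.4,
    "SX highest": 10.8,
    "4-bit ADPCM": 32.0,
    "8-bit PCM": 64.0,
}

if __name__ == "__main__":
    memory = 128 * 1024  # hypothetical storage budget in bytes
    for name, rate in RATES_KBPS.items():
        print(f"{name:12s} {seconds_of_audio(memory, rate):8.1f} s")
```

Under these assumptions, the same memory holds roughly 27 times more speech at the lowest SX rate than as 8-bit PCM (64 / 2.4).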

The RSC-464 also provides eight-voice wavetable music synthesis, which allows multiple, simultaneous

instruments for harmonizing. The RSC-464 uses a MIDI-like system to generate music. One or more of the eight

voices may be speech playback instead of music. One or more of the eight voices may be a drum track comprising

multiple drums. In effect, drum tracks allow the number of simultaneous instruments to exceed 8.

Speech and Music data may be stored in on-chip ROM. Speech data may alternatively be stored in off-chip serial

data ROM or serial data Flash for extended durations.

Easy-to-use tools allow developers to record and compress their own voice talent at the push of a

button, or to create their own MIDI scores and instruments.

Record and Playback

The RSC-464 can perform speech record and playback (sometimes called “voice memo”) using either 8 bits

(64Kbps) or 4 bits (32Kbps) per sample, depending on the quantity and quality of playback desired. The record and

playback technology also optionally performs silence removal to reduce memory requirements.

External serial Flash or SRAM is required to store the compressed speech.
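
The datasheet does not say how silence removal works. As a rough illustration of the idea, the sketch below drops fixed-length frames whose peak amplitude stays below a threshold. The function name, frame length, and threshold are hypothetical, and this is not the RSC-464's actual algorithm.

```python
def remove_silence(samples, frame_len=80, threshold=4):
    """Drop frames whose peak amplitude stays below threshold.

    samples: sequence of signed integer samples centered at 0
             (for example 8-bit PCM after removing the offset).
    frame_len and threshold are illustrative values, not datasheet values.
    Returns (kept_samples, dropped_frame_count).
    """
    kept = []
    dropped = 0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if max(abs(s) for s in frame) < threshold:
            dropped += 1          # frame is treated as silence
        else:
            kept.extend(frame)    # frame carries speech, keep it
    return kept, dropped
```

A practical recorder would probably also store the length of each removed gap so that playback can restore natural pauses; that bookkeeping is omitted here.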

Speaker Independent recognition requires no user training. The RSC-464 can recognize up to 15 commands in

an active set (number of sets is limited only by internal ROM size). Text-to-SI (T2SI), based on a hybrid of

Hidden Markov Modeling and Neural Net technologies, allows creation of accurate SI recognition sets in

seconds. SI requires on-chip ROM.

Speaker Dependent recognition allows the user to create names for products or customize recognition sets. SD

is implemented with DTW (dynamic time warping) pattern matching technology. SD requires programmable

memory to store the personalized speech templates (trained patterns); this memory may be on-chip SRAM or off-chip

serial EEPROM, Flash memory, or SRAM. Up to 50 templates can be recognized in an active set (the number of

unique sets is limited only by programmable memory capacity). The RSC-464 can store 1 SD template in on-chip SRAM.

Speaker Verification enables the RSC-464 to authenticate when a previously trained password is spoken by the

target user. SV is also implemented with DTW technology. One SV template can be stored in on-chip SRAM, or

more with external programmable memory of the kinds listed for SD above.
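
Both SD and SV are described above as DTW pattern matching. The following is a minimal textbook DTW sketch, not Sensory's implementation. The feature vectors, the Euclidean local distance, the length normalization, and the `threshold` parameter are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two non-empty sequences of feature vectors."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, templates):
    """SD-style matching: label of the stored template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


def verify(utterance, template, threshold):
    """SV-style check: accept if the length-normalized DTW cost is small enough.

    threshold is a hypothetical tuning value, not a datasheet parameter.
    """
    score = dtw_distance(utterance, template) / (len(utterance) + len(template))
    return score <= threshold
```

In this sketch the difference between the two modes is only the decision rule: SD picks the best of several templates, while SV compares one template against a fixed acceptance threshold.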

Word Spotting enables the RSC-464 to spot a specific word surrounded by other speech within a phrase. This

can be quite effective when the user's response may vary (e.g., spotting “telephone” in the phrases “ummm

telephone”, or “telephone call”). This option is available for SI and SD.

Continuous Listening allows the chip to continuously listen for a specific word. This may be used as a trigger

word to request a device to listen for a command. This option is available for SI and SD.
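
The trigger-word flow described above can be pictured as a two-state loop: wait for the trigger, then accept one command. The sketch below assumes a recognizer that already yields one recognized word or phrase at a time. The function name and the example words are hypothetical and do not come from the FluentChip library.

```python
def run_listener(recognized_words, trigger, commands):
    """Yield each command that immediately follows the trigger word.

    recognized_words: iterable of strings produced by some recognizer
                      (assumed here, not part of the datasheet).
    trigger:          the word that arms command recognition.
    commands:         the active set of accepted commands.
    """
    armed = False
    for word in recognized_words:
        if not armed:
            # Continuous listening: ignore everything except the trigger.
            armed = (word == trigger)
        else:
            # One chance to give a command, then return to listening.
            if word in commands:
                yield word
            armed = False


if __name__ == "__main__":
    stream = ["hello", "computer", "lights on", "lights on", "computer", "radio"]
    for command in run_listener(stream, "computer", {"lights on", "radio"}):
        print(command)
```

Disarming after a single word keeps stray speech from being taken as a command; a real product might instead use a timeout, which the source does not specify.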

P/N 80-0282-A

Preliminary Data Sheet
