circuitcellar.com

February 1998, Issue 91

Low-Cost Voice Recognition


FUTURE TINY ENHANCEMENTS

Naturally, there are ways to improve the system. I was surprised by the HC05’s speed. I also wound up with at least 200 bytes of leftover ROM for more code. Tiny Voice’s code is modular, and updates can be easily added.

I can increase the EEPROM capacity to 1 or even 2 KB. The extra space would provide more template storage or allow for more frame features to better resolve differences in speech patterns.

I’d also like to add some fuzzy logic to the pattern-matching algorithm to improve recognition accuracy and the rejection criteria.

Adding a serial port instead of push buttons and LEDs could reduce cost and add more functionality. Threshold values could be changed, templates uploaded and downloaded, and so on.

I want an MCU-controlled gain adjustment on the input for different microphone levels and background noise.
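
To make the idea concrete, here is a minimal C sketch of one way such a gain adjustment might work. It is not part of Tiny Voice. The function name `adjust_gain`, the eight gain steps, and the two peak thresholds are all assumptions chosen for illustration, and the sketch assumes the MCU can somehow obtain an 8-bit peak level for the last utterance.

```c
#include <stdint.h>

/* All constants below are hypothetical, chosen only for illustration. */
#define GAIN_MIN   0u    /* lowest selectable gain step  */
#define GAIN_MAX   7u    /* highest selectable gain step */
#define PEAK_LOW   64u   /* below this peak, the input is too quiet  */
#define PEAK_HIGH  230u  /* above this peak, the input risks clipping */

/*
 * Return the gain step to use for the next utterance, given the
 * current step and the peak level (0..255) seen in the last one.
 * The step moves by at most one per call so the gain cannot jump
 * wildly on a single loud or quiet word.
 */
uint8_t adjust_gain(uint8_t gain, uint8_t peak)
{
    if (peak > PEAK_HIGH && gain > GAIN_MIN)
        return (uint8_t)(gain - 1u);   /* too loud: back off one step */
    if (peak < PEAK_LOW && gain < GAIN_MAX)
        return (uint8_t)(gain + 1u);   /* too quiet: boost one step */
    return gain;                       /* in range, or already at a limit */
}
```

The gap between the two thresholds acts as a dead band, so a level that sits comfortably in range leaves the gain alone.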

Another improvement would be to add a dynamic time warp (DTW) algorithm to the pattern-matching routine. DTW takes into account slight variations in how each word is pronounced, in particular variations in the lengths of phonemes.

But with only 200 bytes of code space left over, adding a DTW would be challenging. A first-order approximation may be achievable, however.
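
For readers who haven't met DTW, the following C sketch shows the textbook form of the algorithm. It is not Tiny Voice code and it is not sized for the HC05's RAM or ROM budget. The names `dtw_distance`, `frame_cost`, `MAX_FRAMES`, and `DTW_INF` are made up for this example, and it assumes each frame is reduced to a single 8-bit feature, which is a simplification.

```c
#include <stdint.h>

#define MAX_FRAMES 32            /* hypothetical limit on frames per word */
#define DTW_INF    0xFFFFFFFFu   /* marks an unreachable cell or bad input */

/* Local cost between two frames: absolute difference of the feature. */
static uint32_t frame_cost(uint8_t a, uint8_t b)
{
    return (uint32_t)(a > b ? a - b : b - a);
}

/*
 * Textbook DTW distance between an utterance x (n frames) and a
 * template y (m frames). Only two rows of the cost table are kept.
 * Returns DTW_INF if either length is outside 1..MAX_FRAMES.
 */
uint32_t dtw_distance(const uint8_t *x, int n, const uint8_t *y, int m)
{
    uint32_t prev[MAX_FRAMES + 1];
    uint32_t curr[MAX_FRAMES + 1];
    int i, j;

    if (n < 1 || m < 1 || n > MAX_FRAMES || m > MAX_FRAMES)
        return DTW_INF;

    /* Row 0: only the empty-to-empty alignment costs nothing. */
    prev[0] = 0;
    for (j = 1; j <= m; j++)
        prev[j] = DTW_INF;

    for (i = 1; i <= n; i++) {
        curr[0] = DTW_INF;
        for (j = 1; j <= m; j++) {
            /* Cheapest of the three allowed predecessor cells. */
            uint32_t best = prev[j];        /* template frame held    */
            if (prev[j - 1] < best)
                best = prev[j - 1];         /* both sequences advance */
            if (curr[j - 1] < best)
                best = curr[j - 1];         /* utterance frame held   */
            curr[j] = best + frame_cost(x[i - 1], y[j - 1]);
        }
        for (j = 0; j <= m; j++)
            prev[j] = curr[j];
    }
    return prev[m];
}
```

With this sketch, the sequence {10, 20, 30} matched against {10, 20, 20, 30} gives a distance of 0, because the repeated middle frame is absorbed by the warp. That is the behavior you want when a speaker stretches one phoneme.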

I’d rather use C than assembly language. When I started this project, I knew squeezing this functionality into 1200 bytes would be tough. So, a high-level language was out of the question.

Since then, I’ve had the opportunity to try out a C compiler from Byte Craft. The good news is, it generates small enough code. The bad news: I wish I’d used it earlier.

And, as a final wish, I would like to use a different processor. Of all these improvements, this one is probably the best. You can now get equivalent MCUs with built-in ADCs, which would allow more elaborate signal processing and better noise rejection.

One of the best candidates for a low-cost system is the Sharp SM8500 8-bit MCU. It has almost everything you need for an embedded voice-command system, including a 10-bit ADC (8 channels) and an 8-bit DAC, which is useful for voice feedback and verification.

The SM8500 features SIO and UART ports to communicate with other system devices, 2 KB of internal RAM, as well as internal ROM and the ability to access external ROM or RAM. It also offers 80+ I/O pins for keypad and display interfacing, hardware multiply and divide, and a 250-ns instruction cycle time. And, it costs under $3.

If you’re willing to spend a bit more, a new level of performance is within reach. New 32-bit RISC MCUs are becoming available in the sub-$15 or even sub-$10 range.

For example, the Sharp ARM710M RISC processor, running at a conservative 16 MHz, performs a complete FFT-Mel-Cepstrum analysis using only 50% of the processor’s resources.

With the ability of RISC processors to address large amounts of memory, you have the ingredients to put together a dictation system like the one I’m using now. And, it can run off a couple of penlight batteries!