
SABATO MARCO SINISCALCHI, FULVIO GENNARO, SALVATORE ANDOLINA, SALVATORE VITABILE, A. GENTILE, et al. (2006). Embedded Knowledge-based Speech Detectors for Real-Time Recognition Tasks. In Proceedings of the 2006 International Conference Workshops on Parallel Processing (pp. 353-360). IEEE [10.1109/ICPPW.2006.35].

Embedded Knowledge-based Speech Detectors for Real-Time Recognition Tasks

SINISCALCHI, Sabato Marco; GENNARO, Francesca; ANDOLINA, Salvatore; VITABILE, Salvatore; GENTILE, Antonio; SORBELLO, Filippo
2006-01-01

Abstract

Speech recognition has become common in many application domains, from dictation systems for professional practices to vocal user interfaces for people with disabilities or hands-free system control. However, so far the performance of automatic speech recognition (ASR) systems is comparable to human speech recognition (HSR) only under very strict working conditions, and is in general much lower. Incorporating acoustic-phonetic knowledge into ASR design has proven to be a viable approach to raising ASR accuracy. Manner of articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as detectors for manner of articulation attributes, starting from representations of speech signal frames. In this paper, the full system implementation is described. The system has a first stage for MFCC extraction, followed by a second stage implementing a sinusoidal-based multi-layer perceptron for speech event classification. Implementation details on a Celoxica RC203 board are given.
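The two-stage pipeline outlined in the abstract (MFCC front end feeding a multi-layer perceptron with sinusoidal hidden units that labels each frame with a manner-of-articulation attribute) can be sketched as follows. This is a minimal illustrative sketch, not the paper's FPGA implementation: the MFCC parameters (sample rate, filter count, coefficient count) and the random network weights are placeholder assumptions, standing in for the trained detector described in the paper.

```python
import numpy as np

# The six manner-of-articulation attributes named in the abstract.
MANNER_CLASSES = ["vowel", "stop", "fricative", "approximant", "nasal", "silence"]

def mfcc(frame, sample_rate=8000, n_filters=24, n_coeffs=13):
    """Stage 1 (toy version): windowed power spectrum -> mel filterbank
    -> log energies -> DCT, yielding one MFCC vector per frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2

    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    # Triangular filters spaced evenly on the mel scale.
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor(len(frame) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, len(spectrum)))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        if c > l:
            fbank[i - 1, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fbank[i - 1, c:r] = (r - np.arange(c, r)) / (r - c)

    log_energies = np.log(fbank @ spectrum + 1e-10)
    # DCT-II decorrelates the log filterbank energies into cepstral coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), (2 * n + 1) / (2 * n_filters)))
    return dct @ log_energies

def classify(frame, w1, b1, w2, b2):
    """Stage 2: one-hidden-layer MLP with sin() activations over the MFCCs."""
    x = mfcc(frame)
    hidden = np.sin(w1 @ x + b1)   # sinusoidal hidden units
    logits = w2 @ hidden + b2
    return MANNER_CLASSES[int(np.argmax(logits))]

# Demo with a random 256-sample "frame" and random (untrained) weights.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
w1, b1 = rng.standard_normal((32, 13)), rng.standard_normal(32)
w2, b2 = rng.standard_normal((6, 32)), rng.standard_normal(6)
print(classify(frame, w1, b1, w2, b2))
```

With trained weights, running `classify` over consecutive overlapping frames would produce the frame-level manner-attribute stream that the hardware pipeline computes in real time.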
Academic field: ING-INF/05 - Information Processing Systems
2006
The 2006 International Conference Workshops on Parallel Processing
Columbus, Ohio
14-18 August 2006
2006
8
In print
Workshop associated with the International Conference on Parallel Processing (ICPP 2006)
Proceedings
SABATO MARCO SINISCALCHI; FULVIO GENNARO; SALVATORE ANDOLINA; SALVATORE VITABILE; ANTONIO GENTILE; FILIPPO SORBELLO
Files in this record:
Embedded_Knowledge-Based_Speech_Detectors_for_Real_Gentile.pdf - open access - 410.64 kB - Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/15503
Citations
  • PMC: not available
  • Scopus: 4
  • Web of Science: 2