Cottone, P. Structural Knowledge Extraction and Representation in Sensory Data.
Structural Knowledge Extraction and Representation in Sensory Data
COTTONE, Pietro
Abstract
Over the last decades, the availability of increasingly cheap technology for pervasive monitoring has boosted the creation of systems able to automatically comprehend the events occurring in a monitored area, in order to plan actions that bring the environment closer to the user's preferences. These systems must inevitably process a great amount of raw data, namely sensor measurements, and need to summarize it into a high-level representation to accomplish their tasks. An implicit requirement is the ability to learn from experience, so as to capture the hidden structure of the data in terms of the relations between its key components. The availability of large collections of data, however, has increased the awareness that "measuring" does not seamlessly translate into "understanding", and that more data does not entail more knowledge. The scientific literature documents a massive use of Statistical Machine Learning in almost all data analysis and data mining applications, aiming at minimizing the need for a priori knowledge. A remarkable drawback of such algorithms, however, is their failure to provide insight into the most significant features of the data, as they typically just provide optimal parameter settings for a "black box". In this thesis, it is claimed that structure is the key to handling the complexity of acquiring knowledge from unstructured data in real-life scenarios. A shift in perspective makes it possible to tackle the so-far unaddressed goal of representing knowledge by means of the structure inferred from the collected samples; more specifically, the suggestion is to cast this process within the framework of formal languages and automata, borrowing concepts and methods from Algorithmic Learning Theory. In this context, knowledge extraction may be turned into structural pattern identification, letting syntactic models emerge from the data itself.
In order to prove the soundness of this proposal, three different case studies will be presented, exploiting statistical learning, syntactic methods, and formal languages, respectively. The third approach will be particularly useful to highlight the advantage of building intrinsically recursive models, which yield multi-scale, and thus more natural, representations; as a result, the computational burden associated with huge volumes of data will be lessened. Moreover, the task of designing reliable and efficient automatic systems for knowledge extraction can be alleviated by using such human-understandable models.