During the past eight years I have co-supervised a number of M.Sc. students, mainly from the
						Politecnico di Milano.
						
I currently co-supervise M.Sc. students who carry out their thesis work as interns at BdSound.
						Conducting a thesis at BdSound involves finding a trade-off between computational complexity,
						real-time feasibility, and performance.
						
For any questions, do not hesitate to contact me.
					
				
				Past theses
				
				
					
Methods for providing input gain robustness to DNN-based real-time speech processing systems
					Yilmaz Ugur Ozcan, July 2024 
					Short abstract 
					Input gain variations can significantly impact the performance of DNN-based real-time speech processing systems. 
					This thesis explores three methods to enhance robustness against these variations: Gain-Augmented Training, Differential Features, and Smoothed Frame Normalization. 
Experimental results show that these approaches improve the consistency and reliability of DNN outputs under varying input gain conditions.
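					As a purely illustrative aside (not code from the thesis), a smoothed frame normalization of the kind mentioned above can be sketched as follows: each frame is divided by a recursively smoothed RMS estimate, so a static input gain scales both numerator and denominator and largely cancels before the features reach the DNN. The smoothing constant, epsilon and feature shape are assumptions made for the example.

```python
import numpy as np

def smoothed_frame_normalization(frames, alpha=0.9, eps=1e-8):
    """Scale each frame by a recursively smoothed RMS estimate.

    frames: array-like of shape (num_frames, frame_len).
    alpha:  smoothing constant of the running level tracker (example value).
    A constant input gain scales both the frame and the level estimate,
    so it largely cancels in the ratio fed to the DNN front-end.
    """
    frames = np.asarray(frames, dtype=np.float64)
    out = np.empty_like(frames)
    level = None
    for i, frame in enumerate(frames):
        rms = np.sqrt(np.mean(frame ** 2) + eps)
        level = rms if level is None else alpha * level + (1.0 - alpha) * rms
        out[i] = frame / (level + eps)
    return out
```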
						
					Read the full abstract
					
				
				
					
					A Lightweight Speaker Verification System for Real-Time Applications
					Eray Ozgunay, July 2024
					Short abstract 
					This work tackles key challenges in Speaker Verification (SV) by introducing a novel, lightweight SV system designed for real-time applications in noisy and reverberant environments. 
					The system leverages advanced convolutional techniques within a Deep Neural Network (DNN) and real-time pooling layers to enhance responsiveness and stability across various acoustic conditions. 
					While it may not achieve the highest performance levels, it excels in real-time processing, making it ideal for dynamic environments where speed and computational efficiency are crucial.
						
					
				
				
					
						A cascade approach for speech enhancement based on deep learning
						Filippo Gualtieri, April 2023
						Short abstract
We propose a cascaded network with a lightweight phase-unaware stage and an optional, more
						computationally demanding phase-aware stage to perform single-channel Speech Enhancement based on Deep Learning
						(DL). Our solution performs as well as baselines that are more complex in terms of parameters and
						Floating Point Operations (FLOPs), according to both objective quality
						metrics and subjective evaluations.
						
						
					
				
				
					
						Real-time multimicrophone speaker separation for the automotive scenario, using a
							lightweight convolutional neural network
						Federico Maver, December 2022
						Short abstract
						We address the multichannel speaker separation problem, and we propose two causal and
						lightweight Deep Neural Network (DNN) models that can
adapt to a wide range of microphone positions and distances. The work focuses on the
						automotive scenario.
						
						
					
				
				
					
Real-time speech dereverberation using a small-footprint convolutional neural
							network
						Federico Di Marzo, April 2022
						Short abstract
We propose an innovative technique based on the use of a Convolutional Neural Network (CNN),
						designed to offer a small footprint and optimized computational performance, for systems that
						work in real time with minimal latency.
						
						
					
				
				
					
						Speaker recognition with small-footprint CNN
						Francesco Salani, December 2021
						Short abstract
						A speaker recognition system is a technology that aims to recognize
						a person's identity based on their voice.
						In this thesis, we propose a low-latency speaker recognition system based on
						Deep Neural Networks.
						
						
					
				
				
					
						A deep real-time talk state detector for acoustic echo cancellation
						Daniele Foscarin, September 2021
						Short abstract
We propose a novel approach that uses a talk state detector (TSD) to enhance the performance of a linear
						acoustic echo canceller.
						It consists of a fully convolutional neural network classifier that performs causal processing
						to meet real-time requirements with fewer than 8,000 trainable parameters.
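						For readers curious about the scale involved, here is a minimal PyTorch sketch of a causal, fully convolutional frame classifier of comparable size; it is not the thesis model, and the feature dimension, channel widths and talk-state set are assumptions chosen only to keep the parameter count below the 8,000 budget.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCausalTSD(nn.Module):
    """Illustrative causal, fully convolutional talk-state classifier.

    Assumes 16 feature bands per frame and 4 talk states
    (silence, near-end, far-end, double-talk); both are example choices,
    not the configuration used in the thesis.
    """

    def __init__(self, n_feats=16, n_states=4):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv1d(n_feats, 16, kernel_size=3),
            nn.Conv1d(16, 24, kernel_size=3),
            nn.Conv1d(24, 32, kernel_size=3),
        ])
        self.head = nn.Conv1d(32, n_states, kernel_size=1)

    def forward(self, x):  # x: (batch, n_feats, time)
        for conv in self.convs:
            # left-pad only, so frame t never sees frames after t (causal processing)
            x = F.relu(conv(F.pad(x, (conv.kernel_size[0] - 1, 0))))
        return self.head(x)  # per-frame talk-state logits

model = TinyCausalTSD()
print(sum(p.numel() for p in model.parameters()))  # about 4.4k parameters, well below 8,000
```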
						
						
					
				
				
					
						A real-time solution for speech enhancement using dilated convolutional neural
							networks
						Fabio Segato, July 2021
						Short abstract 
						In this work, we propose a speech enhancement solution based on Deep Neural Networks that
meets the strict
						requirements imposed by embedded devices in terms of memory footprint and processing power.
						The proposed approach operates in real-time, extracting perceptually-relevant features in
						an efficient fashion.
						
						Read the full abstract
					
				
				
					
						A hybrid approach for computationally-efficient beamforming using sparse linear microphone
							arrays
						Davide Balsarri, December 2020
Short abstract We propose a hybrid beamforming solution that combines
						two methods: one that is efficient for signals with a high input SNR and one for signals with a low input SNR.
						Results show that our SCM-based hybrid
						solution outperforms most SCM-based methods and exhibits a lower computational complexity.
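						The abstract does not spell out the two component beamformers, so the following toy sketch only illustrates the general idea of SNR-driven switching, using delay-and-sum and an MVDR weight computed from a noise spatial covariance matrix (SCM) as stand-ins; the threshold and the single-bin narrowband formulation are assumptions of the example.

```python
import numpy as np

def hybrid_beamform(X, steering, noise_scm, snr_db, snr_threshold_db=10.0):
    """Per-frame switch between two beamformers (toy illustration).

    X:          (mics, frames) STFT snapshots of one frequency bin.
    steering:   (mics,) steering vector towards the target.
    noise_scm:  (mics, mics) noise spatial covariance matrix estimate.
    snr_db:     (frames,) per-frame input SNR estimate.
    Above the threshold, a cheap delay-and-sum weight is used; below it,
    an MVDR weight computed from the noise SCM. Threshold is an example value.
    """
    mics, frames = X.shape
    w_das = steering / mics                          # delay-and-sum (cheap)
    inv_scm = np.linalg.pinv(noise_scm)
    w_mvdr = inv_scm @ steering / (steering.conj() @ inv_scm @ steering)
    out = np.empty(frames, dtype=complex)
    for t in range(frames):
        w = w_das if snr_db[t] >= snr_threshold_db else w_mvdr
        out[t] = w.conj() @ X[:, t]
    return out
```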
						
						Read the full abstract
					
				
				
					
						Voice activity detection using small-footprint deep learning
						Luca Menescardi, December 2019
						Short abstract Techniques employed to detect the presence or absence of human
						voice in an audio signal are called Voice Activity Detection
						(VAD) algorithms. Our approach optimizes both the feature extraction and the classification
						performed by the deep neural network. The goal is to comply with requirements imposed by
						embedded systems.
						
						Read the full abstract
					
				
				
					
						Automatic playlist generation using recurrent neural network
						Rosilde Tatiana Irene, July 2018
						Short abstract In this study we propose an automatic playlist generation
						approach which analyzes hand-crafted playlists, understands their
						structure and generates new playlists accordingly. We have adopted a deep learning architecture,
						in particular a Recurrent Neural
						Network, which is specialized in sequence modeling. 
						Read the thesis
					
				
				
					
Beat tracking using recurrent neural network: a transfer learning approach
						Davide Fiocchi, April 2018
						Short abstract In this work, we propose an approach to apply transfer learning
						for beat tracking.
						We use a deep RNN as the starting network trained on popular music, and we transfer it to track
						beats of folk music.
						Moreover, we test if the resultant models are able to deal with highly variable music, such as
						Greek folk music.
						Read the thesis
					
				
				
					
						Learning a personalized similarity metric for musical content
Luca Carloni, April 2018
						Short abstract We present a hybrid model for personalized
						similarity modeling that relies on both content-based and user-related similarity information.
We exploit a non-metric scaling technique to first build a
						low-dimensional space (or embedding) that satisfies the similarity information provided by the
						user, and a regression technique to learn a mapping between
						content-based information and embedding-related information. 
						Read the thesis
					
				
				
					
						A personalized metric for music similarity using Siamese deep neural networks
						Federico Sala, April 2018
						Short abstract In this thesis we propose
						an approach to model a personalized music similarity metric based on a Deep Neural Network.
We use a first stage for learning a generic music similarity metric relying on a large amount of
						data,
						and a second stage for customizing it using personalized annotations collected through a survey.
						
						Read the thesis
					
				
				
					
Analysis of musical structure: an approach based on deep learning
						Davide Andreoletti, July 2015
Short abstract We propose a Music Structural
						Analysis algorithm where we use a Deep Belief Network to extract a sequence of descriptors that
						is subsequently
						given as input to several Music Structural Analysis algorithms presented in the literature. 
						Read the thesis
					
				
				
					
						A music search engine based on a contextual related semantic model
						Alessandro Gallo, April 2014
Short abstract In this work we propose an approach for high-level
						music description and music retrieval, which we named the Contextual-related Semantic Model. Our method
						defines different semantic contexts and dimensional semantic relations between music descriptors
						belonging to the same context. 
						Read the thesis
					
				
			
			
			
				
				
				
				
				 
					 
						Audio speech source separation and enhancement in an automotive scenario using different microphone configurations
						Federico Maver, Daniele Foscarin, Davide Balsarri, Luca Menescardi, Michele Buccoli, Simone Pecorino, Antonio Grosso
						2024 AES 5th International Conference on Automotive Audio
					
 
				
				  
						An empirical evaluation of in-car acoustic measurements for the sports car scenario
						David Badiane, Filippo Gualtieri, Alessandro Proverbio, Michele Buccoli, Simone Pecorino, Antonio Grosso, Michele Ebri, Alfonso Oliva, Luca Battisti, Marco Olivieri
						2024 AES 5th International Conference on Automotive Audio
						In collaboration with Teoresi and Ferrari
					
 
				
				
				
				
					
						Real-Time Multichannel Speech Separation and Enhancement Using a Beamspace-Domain-Based Lightweight CNN | Link
						Marco Olivieri, Luca Comanducci, Mirco Pezzoli, Davide Balsarri, Luca Menescardi, Michele Buccoli, Simone Pecorino, Antonio Grosso, Fabio Antonacci, Augusto Sarti
						IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
					
				
				
					
						Towards a general framework for the annotation of dance motion sequences | Link
						Katerina El Raheb, Michele Buccoli, Massimiliano Zanoni, Akrivi Katifori, Aristotelis
						Kasomoulis, Augusto Sarti, Yannis Ioannidis
						Multimedia Tools and Applications, 2022
					
				
				
					
						Deep music on air | Link
						Massimiliano Zanoni, Michele Buccoli, Guglielmo Cassinelli, Giorgio Rinolfi
						Proceedings of the 9th International Conference on Digital and Interactive Arts, 2019
					
				
				
					
						A Presence- and Performance-Driven Framework to Investigate Interactive Networked Music
							Learning Scenarios | Link
						Stefano Delle Monache, Luca Comanducci, Michele Buccoli, Massimiliano Zanoni, Augusto Sarti,
						Enrico Pietrocola, Filippo Berbenni, and Giovanni Cospito
						Wireless Communications and Mobile Computing, 2019
					
				
				
					
						Automatic playlist generation using Convolutional Neural Networks and Recurrent Neural
							Networks | Link
						Rosilde Tatiana Irene, Clara Borrelli, Massimiliano Zanoni, Michele Buccoli, Augusto Sarti
						Proceedings of the 27th European Signal Processing Conference (EUSIPCO), 2019
					
				
				
					
						Virtual Reality and Choreographic Practice: The Potential for New Creative Methods | Link
						Rosa E. Cisneros, Karen Wood, Sarah Whatley, Michele Buccoli, Massimiliano Zanoni, Augusto
						Sarti
						Body, Space & Technology 18 (1), 2019 
					
				
				
					
						Three-dimensional mapping of high-level music features for music browsing | Link
						Stefano Cherubin, Clara Borrelli, Massimiliano Zanoni, Michele Buccoli, Augusto Sarti, Stefano
						Tubaro
						Proceedings of the International Workshop on Multilayer Music Representation and Processing
						(MMRP), Milan, Italy, 2019
					
				
				
					
						Investigating Networked Music Performances in Pedagogical Scenarios for the InterMUSIC
							Project | Link
						Luca Comanducci, Michele Buccoli, Massimiliano Zanoni, Augusto Sarti, Stefano Delle Monache,
						Giovanni Cospito, Enrico Pietrocola, Filippo Berbenni
						Proceedings of the 23rd Conference of Open Innovations Association (FRUCT), Bologna, Italy,
						2018
					
				
				
					
						Time is not on my side: network latency, presence and performance in remote music
interaction | Link
						Stefano Delle Monache, Michele Buccoli, Luca Comanducci, Augusto Sarti, Giovanni Cospito, Enrico
						Pietrocola, Filippo Berbenni
						Proceedings of the XXIII Colloquio di Informatica Musicale (CIM), Udine, Italy, 2018
					
				
				
					
						WhoLoDancE: Whole-body Interaction Learning for Dance Education
						Anna Rizzo, Katerina El Raheb, Sarah Whatley, Rosa Maria Cisneros, Massimiliano Zanoni, Antonio
						Camurri, Vladimir Viro, Jean-Marc Matos, Stefano Piana, Michele Buccoli, Amalia Markatzi, Pablo
						Palacio, Oshri Zohar Even, Augusto Sarti, Yannis Ioannidis, Edwin-Morley Fletcher
						EUROMED International Conference on Digital Heritage, 2018
					
				
				
					
						Beat tracking using recurrent neural network: a transfer learning approach
						Davide Fiocchi, Michele Buccoli, Massimiliano Zanoni, Fabio Antonacci, Augusto Sarti
Proceedings of the 26th European Signal Processing Conference (EUSIPCO), 2018
					
				
				
					
						Using multi-dimensional correlation for matching and alignment of MoCap and video
							signals
						Michele Buccoli, Bruno Di Giorgi, Massimiliano Zanoni, Fabio Antonacci, Augusto Sarti
						Proceedings of the IEEE 19th International Workshop on Multimedia Signal Processing (MMSP),
						Luton, United Kingdom, 2017
						The paper won the Top-10% award
					
				
				
					
						Unsupervised feature learning for Music Structural Analysis
						
						Michele Buccoli, Davide Andreoletti, Massimiliano Zanoni, Augusto Sarti, Stefano Tubaro
Proceedings of the 24th European Signal Processing Conference (EUSIPCO), Budapest, Hungary, 2016
					
				
				
					
						A higher-dimensional expansion of affective norms for English terms for music tagging
						
						Michele Buccoli, Massimiliano Zanoni, György Fazekas, Augusto Sarti, Mark Sandler and Stefano
						Tubaro
Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR),
						New York City, USA, 2016
					
				
				
					
						A Dimensional Contextual Semantic Model For Music Description And Retrieval
						
						Michele Buccoli, Alessandro Gallo, Massimiliano Zanoni, Augusto Sarti, Stefano Tubaro
						DMRN+10: Digital Music Research Network One-day Workshop 2015, London, UK, 2015
					
				
				
					
						Feature-Based Analysis of the Effects of Packet Delay on Networked Musical Interactions
						
						Cristina Emma Margherita Rottondi, Michele Buccoli, Massimiliano Zanoni, Dario Garao, Giacomo
						Verticale, Augusto Sarti
Journal of the Audio Engineering Society 63 (11), 864-875, 2015
					
				
				
					
						An Unsupervised Approach To The Semantic Description Of The Sound Quality Of
							Violins
						Michele Buccoli, Massimiliano Zanoni, Francesco Setragno, Augusto Sarti, Fabio Antonacci
						European Signal Processing Conference (EUSIPCO), Nice, France, 2015
					
				
				
					
						A Dimensional Contextual Semantic Model For Music Description And Retrieval
Michele Buccoli, Alessandro Gallo, Massimiliano Zanoni, Augusto Sarti, Stefano Tubaro
						IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane,
						Australia, 2015
					
				
				
					
						Unsupervised Feature Learning For Bootleg Detection Using Deep Learning
							Architectures
						Michele Buccoli, Paolo Bestagini, Massimiliano Zanoni, Augusto Sarti, Stefano Tubaro
						IEEE International Workshop on Information Forensics and Security (WIFS), Atlanta, USA, 2014
					
				
				
					
A Music Search Engine Based On Semantic Text-Based Query
						Michele Buccoli, Massimiliano Zanoni, Augusto Sarti, Stefano Tubaro
Proceedings of the IEEE 15th International Workshop on Multimedia Signal Processing (MMSP),
						Pula (Sardinia), Italy, 2013
					
				
			
			
				
				
I am a hackathon enthusiast. Participating in hackathons is the best way to
						meet new people, learn new packages, test my coding-under-stress skills and
						quickly build new projects. 
					This section helps me keep track of the hackathons I have participated in so far. The teammates for
						each project are usually indicated in the source code. 
				
				
					Winner of the hackathon
					
					remote, July 8-10 2022
						Pacifier
						👶Pacifier💤 is a tool for converting the melody of any song into a lullaby to put your baby to
						sleep.
						Source code
					
				
				
					
					remote, September 11-12 2020
						Unnamed project
						This was supposed to generate music from landscape pictures, but it failed miserably.
						Source code
					
				
				
					Winner of challenge "daily routine"
					
					remote, April 3-6 2020
						Corunner Virus
🏃Corunner Virus🦠 is a tool for running from home with an AI-based augmented-reality
						treadmill.
						Learn more (in Italian)
						Demo
					
				
				
					
					Abbey Road Studios, London, November 10-11, 2018
						Abbey Blues
Abbey Blues is a tool / live performance for guitar and lyrics. The tool recognizes the sentiment
						of the lyrics being sung and triggers different background music and guitar
						effects.
						Source code
					
				
				
					Best hack on music creation or performance
					
					Vienna, September 29-30, 2017
						Samosa
Samosa is a tool / live performance for face and guitar. A camera recognizes the emotion
						expressed by the face of the performer, triggering different background music and effects for the
						guitar performer.
						Source code
					
				
				
					
					Spotify HQ, New York, August 6th 2016
						Unnamed project
						This was supposed to recognize people rapping and scoring the quality of their rhymes, but it
						failed miserably.
						Source code
					
				
			
			
			
				
				
Teaching assistant for the Creative Computing and Programming course
					2019 - present 
						M.Sc. in Computer Engineering - Politecnico di Milano 
						The course is taught in English.
						You can find the material of the course by connecting to the beep portal and logging in with
						your PoliMI credentials.
					
				 
				
					Teaching assistant for the Information Retrieval and Data Mining course
					A.Y 2015/2016; 2016/2017; 2017/2018; 2018/2019
						M.Sc. in Computer Engineering - Politecnico di Milano 
						The course was taught in English.
						You can find the material of the course by connecting to the beep portal and logging in with
						your PoliMI credentials.
					
				 
				
					Organizer and Teacher of the workshop Creative Computing for Artistic Performances
					A.Y 2017/2018 
						B.Sc. and M.Sc. in Engineering - Politecnico di Milano 
					
				 
				
Exercises on Multimedia Signal Processing, 1st module
					A.Y 2014/2015; 2016/2017; 2017/2018; 2018/2019 
						M.Sc. in Computer Engineering - Politecnico di Milano 
						The course was taught in English.
					
				 
				
					Matlab Tutoring
					A.Y 2014/2015 
M.Sc. in Computer Engineering - Como Campus, Politecnico di Milano 
						The course was taught in English.