RICERCANDO

Partner:

University of Ljubljana

RICERCANDO is an advanced toolbox for mining mobile broadband measurement data collected via the MONROE platform. It supports integrative exploration, visualization and interpretation of the collected data and metadata. Its main use is to help identify and interpret anomalies and problems within the data (e.g. clusters of measurements reporting particularly poor performance). In particular, the aim is to (1) ease problem discovery and troubleshooting in the MONROE monitoring system, (2) avoid wrongly attributing to the monitored mobile broadband network(s) data anomalies that are in fact caused by glitches within the monitoring system, and (3) speed up the correct identification of the source of a problem (root-cause analysis).

The RICERCANDO toolbox is designed with modularity and flexibility in mind. In addition, it supports advanced visualisation and efficient iterative analysis. Tools developed within the project include:

  • Data transformation and preprocessing tools
  • Large-scale interactive data visualisation tools supporting both georeferenced and temporal data analysis and visualisation
  • Anomaly detection tools supporting automatic identification and visualisation of performance anomalies
  • In-depth data mining and root-cause analysis tools that extend Orange, an already well-established and feature-rich data mining suite

Implementation-wise, these are spread over Bash scripts, Jupyter Notebooks written in Python, and custom Orange data mining software widgets.

We demonstrate the usability of RICERCANDO by analysing a six-month-long trace of MONROE metadata encompassing a total of a few billion records (approximately 80 GB of data). With the help of RICERCANDO tools we detect and isolate performance anomalies, conduct a statistical analysis to identify potential factors causing the anomalies, and, through further analysis based on iterative experiments and data mining in Orange, pinpoint the most likely reasons for the observed anomalies.
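To give a rough idea of what flagging performance anomalies in measurement data can look like, here is a minimal Python sketch. It is not taken from the RICERCANDO code base and does not claim to reproduce its detection method: it assumes a simple median/MAD (modified z-score) outlier test, and the function name `flag_anomalies`, the threshold of 3.5, and the round-trip-time values are all hypothetical and chosen purely for illustration.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    Illustrative only: a generic median/MAD outlier test, not the
    anomaly detection method implemented in RICERCANDO.
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that is robust to outliers.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the data, so nothing can be singled out as anomalous.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Made-up round-trip times in milliseconds (not real MONROE measurements).
rtt_ms = [42.0, 45.1, 43.7, 44.2, 41.9, 310.5, 43.3, 44.8, 42.6, 298.0]
anomalous = flag_anomalies(rtt_ms)
print(anomalous)
```

In practice, the flagged records would then be examined together with their metadata to judge whether the cause lies in the network or in the monitoring system itself.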

The project was conducted by an interdisciplinary team of computer networking and data mining experts from the Faculty of Computer and Information Science, University of Ljubljana, led by Prof. Fabio Ricciato, Prof. Blaž Zupan, and Dr. Veljko Pejović. RICERCANDO tools are released as open source software and are available at: https://github.com/ivek1312/ricercando

Software release is available here:
https://github.com/ivek1312/ricercando