In this research area we deal with: *Algorithms and methodologies for algorithm analysis. Methodologies for database exploration and educational data mining. Languages for coordinating dynamic interactions between distributed components. Formal methods for the specification and verification of concurrent, distributed and mobile systems. Models and methods for the analysis of security protocols.*

Among the specific topics, we are dealing particularly with:

- Algorithms for the analysis of real-world graphs.

In a society in which we are almost always 'connected', the ease of acquiring information has become a source of huge amounts of data to process and analyse. Graphs are a data structure that allows us to model many of the relationships found in this data, abstracting millions or billions of connections away from their particular specific aspects and representing them effectively: the complexity of each entity involved is enclosed in a vertex of the graph, and the complex mechanisms of interaction between entities are simply described by an edge. A social network, for example, can be seen as a graph in which the nodes are users and the edges correspond to their friendships. Similarly, the Bitcoin economy (or any other economic system) corresponds to a graph in which the nodes are users and the edges represent payments between them. In this context, discovering the most important and influential users, what their future friendships will be, which communities they belong to, and the structure of the society they live in can all be translated into problems on graphs.

Research in this area focuses on the development of models and algorithms to solve these kinds of problems efficiently on large-scale real-world graphs. The work develops along two orthogonal directions: the first is the translation of a real-world problem into a problem on a graph; the second is the development of efficient algorithms, ideally with (near-)linear time complexity, to solve it. In this sense, theory and practice progress together: practical problems motivate theoretical contributions, which lead in turn to practical and efficient solutions of the original problems.
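As a minimal illustration of this modelling step, the sketch below (with entirely hypothetical users) represents a friendship network as an adjacency list and uses degree centrality as a first, crude proxy for influence:

```python
from collections import defaultdict

# Toy social network: each pair is an undirected friendship edge.
# All names are invented illustration data.
friendships = [
    ("ann", "bob"), ("ann", "carol"), ("ann", "dave"),
    ("bob", "carol"), ("dave", "eve"),
]

# Build an adjacency-list representation of the graph.
adj = defaultdict(set)
for u, v in friendships:
    adj[u].add(v)
    adj[v].add(u)

# A first, crude notion of "influence": degree centrality,
# i.e. the number of direct connections of each user.
degree = {u: len(neigh) for u, neigh in adj.items()}
most_influential = max(degree, key=degree.get)

print(most_influential, degree[most_influential])  # ann 3
```

Real influence measures (PageRank, betweenness, closeness) refine this idea, but the modelling pattern is the same: entities become vertices, interactions become edges.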

People: Ferreira Da Silva, A. Marino

- Analysis of Algorithms

Information and telecommunications technologies require numerous algorithms to transform, store and transmit large amounts of data. The efficiency of these algorithms depends both on technological aspects and on their time and space complexity. Algorithm analysis is the branch of computer science that studies the efficiency of algorithms according to certain characteristic parameters. The aim is to provide precise information on the efficiency of any type of algorithm by studying its behaviour in the worst, best and average case. Using the results obtained, it is then possible to choose the best among several algorithms that perform the same task.
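A small concrete instance of best- versus worst-case analysis: the sketch below counts the key comparisons performed by insertion sort, whose behaviour ranges from n−1 comparisons on already-sorted input to n(n−1)/2 on reverse-sorted input.

```python
def insertion_sort_comparisons(a):
    """Sort a copy of `a`, returning the number of key comparisons made."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of key against a[j]
            if a[j] > key:
                a[j + 1] = a[j]       # shift larger element to the right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 10
best = insertion_sort_comparisons(range(n))          # sorted: n - 1 = 9
worst = insertion_sort_comparisons(range(n, 0, -1))  # reversed: n(n-1)/2 = 45
print(best, worst)  # 9 45
```

Average-case analysis then asks what the expected count is over a random permutation, which is where the probabilistic and combinatorial tools described below come in.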

The general effort to accurately predict the performance of algorithms has come to involve research in analytic combinatorics, analysis of discrete random structures, asymptotic analysis, exact and limiting distributions, and other fields of investigation in computer science, probability theory, and enumerative combinatorics, using an approach based on generating functions and the method of coefficients, the symbolic method, and combinatorial sums. Generating functions have been used in combinatorics for 300 years and in recent decades have been one of the most widely used tools in the analysis of algorithms. They serve both as a combinatorial tool to facilitate counting and as an analytical tool to develop precise estimates for quantities of interest. The method of coefficients complements generating functions, while the symbolic method allows the development of relationships between combinatorial structures and their associated generating functions. Combinatorial identities have attracted attention since ancient times, and some of this literature has a strong connection with the analysis of algorithms.
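A simple sketch of the method of coefficients: given a rational generating function, its power-series coefficients can be extracted by series division, recovering the counting sequence it encodes.

```python
def series_coefficients(num, den, n):
    """First n power-series coefficients of num(x)/den(x),
    where num and den are coefficient lists (constant term first)."""
    assert den[0] != 0
    c = []
    for k in range(n):
        s = num[k] if k < len(num) else 0
        # Subtract the contributions of previously computed coefficients.
        for j in range(1, min(k, len(den) - 1) + 1):
            s -= den[j] * c[k - j]
        c.append(s / den[0])
    return c

# F(x) = x / (1 - x - x^2) is the generating function of the
# Fibonacci numbers; extracting coefficients recovers the sequence.
print(series_coefficients([0, 1], [1, -1, -1], 10))
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The same routine works for any rational generating function, e.g. `series_coefficients([1], [1, -1], n)` yields the all-ones sequence generated by 1/(1−x).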

People: Merlini, Verri

- Design, formalisation and implementation of programming languages

Research in this area focuses on the design of new programming languages and the extension of existing ones. The languages in question can be either general-purpose languages (GPLs) or domain-specific languages (DSLs). We focus particularly on statically typed Java-like languages, i.e. mainly object-oriented ones. We aim to formalise such languages, proving various correctness properties. We also address their implementation, integrating the languages into an IDE. In the case of language extensions, we focus on extensions of mainstream languages such as Java, aiming to improve their functionality from the developer's point of view, in order to achieve greater code reuse and greater flexibility in writing programs, while maintaining the correctness properties.

People: Bettini

- Teaching computer science in schools

There is a cultural mission to be undertaken here. The problem is not that of the specific contents to be taught in one type of school rather than another. The matter is much more general, as it concerns the revaluation of computer science as a branch of science, as autonomous and fundamental as physics and chemistry, for example. The mission, however, goes far beyond the possible “claim” to an adequate number of hours for the study of “computer science” in as many schools as possible, because it concerns a profound overhaul of the way in which scientific subjects are taught in general. In short, it is a question of turning the current teaching of certain scientific subjects, almost always treated as if they were independent of each other, into a genuine education in scientific thinking, which is still largely absent even in the most scientifically oriented schools.

One of the consequences of the distortion produced by this system is, as an exemplary case, the phantom discipline of computer science, which, in the general debate, seems to exhaust its own reason for existence in being smeared over everything of a technological nature. Consequently, anyone dealing with “media”, with building websites, with various technological manipulations, anyone who knows how to use at least two or three “personal productivity” programmes, becomes, to some extent, a computer scientist. Where computer science is present in schools, apart from a few special cases, it only offers the opportunity to learn a few applications for personal productivity or, at best, some “coding” work, which is often sterile, as it is culturally disconnected from the rest. It is therefore necessary to undertake research into the teaching of computer science, which must, however, be seen as part of a broader action to rethink the teaching of scientific thought.
We have called it a cultural mission because the consequences of the absence of scientific thinking are ubiquitous, vast and devastating. This does not mean opposing sciences to humanities. On the contrary, the correct reassessment of scientific thought and its role in society is a humanistic mission.

People: Verri

- Educational data mining

Numerous sectors, from economic and commercial activities to public administration, are involved in the growth of data in information systems. For this reason, it is important to develop new methodologies and technologies to manage and analyse all the information that can be derived from such large data sources. As far as the field of education is concerned, educational data mining (EDM) is a recent research area that uses *machine learning* and *data mining* algorithms to explore and analyse both the large archives of data normally found in school and university databases for administrative purposes and the large amounts of information on the interaction between teaching and learning generated in e-learning or web-based educational contexts. EDM tries to use all this information to better understand students' learning processes and can be used by university or school management to improve the whole educational process. The literature on the use of *data mining* in educational systems looks mainly at techniques such as clustering, classification, association rule mining and sequential pattern analysis. Our studies cover these techniques and their application to data from various cohorts of students in the Computer Science and Science degree courses, who share the same self-assessment test required of students before enrolling at the University of Florence.
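As a toy illustration of one of these techniques, association rule mining, the sketch below (on entirely invented student records) computes the support and confidence of a hypothetical rule linking self-assessment test results to exam results:

```python
# Hypothetical records: for each student, the set of observed events
# (self-assessment test outcome and first-year exam outcome).
students = [
    {"test_passed", "exam_passed"},
    {"test_passed", "exam_passed"},
    {"test_passed"},
    {"exam_passed"},
    set(),
    {"test_passed", "exam_passed"},
]

def support(itemset, data):
    """Fraction of records containing every item in `itemset`."""
    return sum(itemset <= record for record in data) / len(data)

def confidence(antecedent, consequent, data):
    """Estimated P(consequent | antecedent) over the records."""
    return support(antecedent | consequent, data) / support(antecedent, data)

# Rule: students who pass the self-assessment test also pass the exam.
conf = confidence({"test_passed"}, {"exam_passed"}, students)
print(round(conf, 2))  # 3 of the 4 test-passers passed the exam: 0.75
```

Algorithms such as Apriori automate the search for all rules whose support and confidence exceed given thresholds.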

People: Bertaccini, Merlini, Verri

- Machine Learning for Structured Data

In many application domains, information is naturally represented as structured data. Structured data is distinguished from tabular (or "flat") data by the presence of a structure that indicates how the various components of the information are related to each other. For example, in chemistry a molecule is naturally represented by a graph whose structure tells us how the atoms interact with each other. Similarly, a natural language sentence or program code is intrinsically compositional and can be represented as a tree. Research in this area focuses on the development of new Machine Learning techniques (both neural and probabilistic) and on their application in specific domains.
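One standard (and not specific to this group) way to turn graph-structured data into discrete features for a learning algorithm is Weisfeiler-Lehman-style relabelling, sketched below on two invented toy graphs:

```python
def wl_labels(adj, rounds=2):
    """One Weisfeiler-Lehman-style colour refinement.
    adj: dict mapping node -> list of neighbours (undirected graph).
    Returns the sorted multiset of final node labels."""
    labels = {v: len(adj[v]) for v in adj}  # start from node degrees
    for _ in range(rounds):
        # Signature of a node: its label plus the sorted neighbour labels.
        signatures = {
            v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
            for v in adj
        }
        # Compress each distinct signature to a small integer label.
        table = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        labels = {v: table[signatures[v]] for v in adj}
    return sorted(labels.values())

# A path on three atoms and the same path with renamed nodes yield
# identical label multisets: the relabelling is isomorphism-invariant.
path_a = {"C1": ["O"], "O": ["C1", "C2"], "C2": ["O"]}
path_b = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}
print(wl_labels(path_a) == wl_labels(path_b))  # True
```

The resulting label multisets can feed a graph kernel or serve as baseline features alongside neural approaches such as graph neural networks.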

People: Bertaccini, Castellana

- Formal methods for the specification and verification of concurrent, distributed and mobile systems

Research in this area focuses on understanding the foundational and application problems raised by modern concurrent, distributed and mobile systems, such as service-oriented architectures, autonomic computing systems and cyber-physical systems, operating in not fully determined and changing environments. The aim is to identify models, languages and formal tools capable of capturing different and specific aspects of such complex systems, enabling the study of problems concerning, for example, interaction, connectivity, adaptivity and security. Process calculi, operational, denotational and observational semantics, type systems, and modal and temporal logics are some of the formal methods and analysis techniques used in such studies.

People: Pugliese, Tiezzi

The following lines of research are active:

- Resource access control policies: Development of a methodology centred on the Formal Access Control Policy Language (FACPL), founded on a solid formal basis and supported by user-friendly software tools, for the specification, analysis and enforcement of attribute-based access control policies.
- Domain-specific languages: Design of languages developed specifically to express distinctive features of certain domains of distributed systems, such as network-aware programming, service-oriented computing (SOC), autonomic computing and collective adaptive systems programming.
- Service contracts: Extensions of contract automata, automata-based formalisms developed to formalise contracts for service-oriented applications, with mechanisms actually used by real-world SOC technologies, and development of algorithms to synthesise orchestrations and choreographies of service contracts formalised via contract automata.

- Formal models and methods for Security and Safety

The overall aim is to develop methodologies, based on a rigorous mathematical framework, to analyse computational systems from the point of view of Security and Safety.

With regard to Security, models and methods are developed that provide formal guarantees on the following aspects:

- Quantitative Information Flow (QIF), where the aim is to prove that a given system does not release more than a fixed amount of information about a certain secret data item (e.g. a password, or a user ID in an anonymisation system such as Tor). Information-theoretic techniques are widely used to analyse this type of problem.
- Privacy in the context of anonymisation and publication of datasets. Here the general aim is to ensure that a malicious user cannot use (anonymised) publicly disclosed data to learn sensitive information about specific individuals. This generally requires establishing an appropriate trade-off between data utility and privacy. Bayesian inference methods are an important tool in this field.
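One classical way to make the QIF idea concrete is min-entropy leakage, which compares an adversary's chance of guessing the secret before and after observing the system's output. The sketch below computes it for an invented two-value secret and a noisy channel:

```python
import math

def min_entropy_leakage(prior, channel):
    """Min-entropy leakage (in bits) of a channel C[x][y] = P(y|x)
    under the given prior distribution on the secret x."""
    n_outputs = len(channel[0])
    # Prior vulnerability: best one-shot guess before any observation.
    v_prior = max(prior)
    # Posterior vulnerability: expected best guess after observing y.
    v_post = sum(
        max(prior[x] * channel[x][y] for x in range(len(prior)))
        for y in range(n_outputs)
    )
    return math.log2(v_post / v_prior)

# Toy channel: the secret is one of two values; the output reveals it
# correctly with probability 0.75 (all numbers are illustrative).
prior = [0.5, 0.5]
channel = [[0.75, 0.25],
           [0.25, 0.75]]
print(min_entropy_leakage(prior, channel))  # log2(0.75/0.5) ≈ 0.585 bits
```

A noiseless channel over the same two-value secret would leak exactly 1 bit, while a channel whose rows are identical would leak 0 bits.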

The resulting models and methods are also tested on real-world systems and datasets. As regards Safety, continuous systems and, more generally, hybrid systems are examined. The dynamics of such systems are described, for the continuous part, by systems of non-linear differential equations, which rarely admit a closed-form solution. A central role in this field is played by the search for “invariants”, which guarantee that the trajectories of the system in the state space never escape from a “safety zone” specified by the user (e.g., in the case of an aeroplane, that speed and inclination always ensure protection against stalling). Coalgebraic (automata theory, bisimulation) and algebraic (polynomials, ideals) theoretical tools are used to define algorithms for the automatic generation of such invariants. These algorithms are also tested on specifications of realistic systems.
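A textbook example of such an invariant: for the vector field x′ = −y, y′ = x (a harmonic oscillator), the polynomial V(x, y) = x² + y² has Lie derivative 2x(−y) + 2y(x) = 0 along the field, so its level sets, circles in the state space, are invariant. The sketch below checks this condition numerically at random sample points:

```python
import random

# Vector field of the harmonic oscillator: x' = -y, y' = x.
def f(x, y):
    return (-y, x)

# Candidate invariant V(x, y) = x^2 + y^2 and its gradient.
def grad_V(x, y):
    return (2 * x, 2 * y)

def lie_derivative(x, y):
    """Derivative of V along trajectories: grad V . f.
    If it is identically zero, V is constant on every trajectory,
    so each level set {V = c} is an invariant of the system."""
    gx, gy = grad_V(x, y)
    fx, fy = f(x, y)
    return gx * fx + gy * fy   # = 2x(-y) + 2y(x) = 0

random.seed(0)
samples = [(random.uniform(-10, 10), random.uniform(-10, 10))
           for _ in range(1000)]
print(all(abs(lie_derivative(x, y)) < 1e-9 for x, y in samples))  # True
```

Invariant-generation algorithms automate the search for such polynomials V, e.g. by working in the ideal generated by candidate templates, instead of requiring the user to guess them.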

People: Boreale

**People**: Bertaccini, Bettini, Boreale, Castellana, Ferreira Da Silva, A. Marino, Merlini, Pugliese, Tiezzi, Verri

Funded research projects in the subject area of 'Computer science: algorithms, languages, formal methods and security', with PIs from the Department

- Assessment cybersecurity readiness, Programme POR FESR 2014-2020, Regione Toscana, (2019-2021, PI: R. Pugliese): The aim of the project is to empirically assess the level of maturity of SMEs within the context of cyber security and to identify concrete measures that can enable them to improve their awareness with respect to cyber risks and threats, and to seize opportunities to raise their cyber security capabilities.

Last update

16.04.2024