Methodologies for Big Data Analytics

A programme of methodological research on the quality of data, analysis and modelling techniques for Big Data, led by Professor Maria Fasli.

About this research stream

This research underpins the work of the Centre, with a focus on techniques and methods for the quality, pre-processing and analysis of Big Data. It also addresses the modelling and prediction of complex and adaptive socio-economic systems. This research focuses on:

Data quality grading and assurance

This research will develop new methodologies, and adapt existing ones, for merging data from multiple sources. It will also develop robust techniques for data quality grading and assurance, providing automated quality-assessment and cleaning procedures for use by researchers.

Identifying "unusual" data segments

Methods will be developed to automatically identify "unusual" data segments through an ICMetrics-based technique. Such methods will be able to alert researchers to specific data segments that require further analysis, and to identify potential issues with unauthorised data manipulation and integrity breaches.

Text data mining

Textual data is rich in information but lacks structure, so specialist techniques are needed to mine and link it properly, to reason with it and to draw useful correlations. A set of techniques will be developed for extracting entities, the relations between them, opinions and other elements, to support semantic indexing, visualisation and anonymisation.

Tracking interactions among users

Data generated by the online interactions of users contains a wealth of information. This research will investigate automatic methods for tracking interactions that can be used, for example, to identify service pathways in local government or business data, helping organisations to improve service delivery to citizens and customers. Methods will also be developed to identify the context of an interaction and the needs of the individual user, so that tailor-made services can be provided.

Machine learning and transactional data

This research will investigate machine learning and other methods for identifying stylised facts; seasonal, spatial or other relations; and patterns of behaviour at the level of the individual, group or region in transactional data from businesses, local government or other organisations. Such methods can provide essential decision support information to organisations planning services, based on predicted trends, spikes or troughs in demand.

Developing methods to evaluate, target and monitor the provision of care

Models and statistical methods for the analysis of local government health and social care data will be developed, alongside new data mining and machine learning algorithms to identify intervention subgroups and new joint modelling methods to improve existing predictive models, with a view to evaluating, targeting and monitoring the provision of care.

Meta-analysis and evidence synthesis methods

Data vary in content and granularity. Some will be available at the individual or firm level but often, due to business or privacy preservation considerations, the data will be aggregated to higher levels, such as postcode, ward or institutional level, or aggregated by individual characteristics (e.g. age group). The focus of this project will be on developing meta-analysis and evidence synthesis methods that enable users to undertake a unified analysis of the types of data available through the Centre. We shall also develop new methods for indirect comparisons (network meta-analysis) of social interventions.
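To illustrate what combining aggregated results involves, the sketch below pools estimates reported at an aggregated level (for example, one effect estimate per ward) using standard inverse-variance, fixed-effect weighting. This is a textbook technique shown for orientation only, not the methods this project will develop; the function name and the example figures are hypothetical.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of aggregated estimates.

    Each estimate is weighted by 1 / SE^2, so more precise sources
    contribute more to the pooled value.
    Returns (pooled_estimate, pooled_standard_error).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need equally sized, non-empty inputs")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    return pooled, math.sqrt(1.0 / total_weight)


# Hypothetical ward-level effect estimates and their standard errors.
pooled, se = pool_fixed_effect([0.0, 1.0], [1.0, 0.5])
print(f"pooled estimate = {pooled:.3f}, standard error = {se:.3f}")
```

A fixed-effect model assumes all sources estimate the same underlying effect; where sources differ systematically, a random-effects model would usually be the more appropriate starting point.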

Agent-based modelling and social simulation

Datasets encompass the results of interactions and transactions within complex socio-economic systems. Although the techniques and methods developed under the first theme will enable researchers to analyse and mine these datasets, there is a need to understand, at a much deeper level, the data, behaviours and processes that produced them. Alongside analytical models, we will be deploying agent-based modelling and social simulation (ABSS) as an alternative method for exploring complex Big Data. ABSS enables one to alter the rules, interactions and behaviour of the individual components within a system and observe the subsequent impact on both individual behaviour and the emergent behaviour of the system. This facilitates alternative and exhaustive scenario testing. ABSS can serve as a decision support tool for policy makers, helping them to identify issues and factors so that they can better design and implement policies based on the features of their target population. Firms can also use such tools to better understand customer behaviour and market trends.
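The idea of altering an individual-level rule and observing the emergent outcome can be shown with a deliberately small sketch. It assumes a toy model that is not taken from the Centre's work: agents sit on a ring, each has two neighbours, and an agent adopts a service once at least `threshold` of its neighbours have adopted. All names and parameters are hypothetical.

```python
def simulate_adoption(n_agents, seeds, threshold, steps):
    """Toy agent-based model of service adoption on a ring.

    An agent that has not yet adopted does so when at least `threshold`
    of its two neighbours have adopted. Updates are synchronous: every
    agent decides using the state from the previous step.
    Returns the number of adopters after each step, starting at step 0.
    """
    adopted = [i in seeds for i in range(n_agents)]
    history = [sum(adopted)]
    for _ in range(steps):
        next_state = list(adopted)
        for i in range(n_agents):
            if adopted[i]:
                continue
            neighbours = adopted[(i - 1) % n_agents] + adopted[(i + 1) % n_agents]
            if neighbours >= threshold:
                next_state[i] = True
        adopted = next_state
        history.append(sum(adopted))
    return history


# Scenario testing: same population, two different individual rules.
print(simulate_adoption(10, {0}, threshold=1, steps=6))  # [1, 3, 5, 7, 9, 10, 10]
print(simulate_adoption(10, {0}, threshold=2, steps=6))  # [1, 1, 1, 1, 1, 1, 1]
```

Changing one rule (how many neighbours an agent needs to see) moves the system from full adoption to none, which is the kind of emergent difference that scenario testing with ABSS is meant to expose.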

Early warning systems for social care

Similar to early warning systems for natural disasters and medical emergencies, such a system for social care would draw attention to a crisis at various levels: locality, institution or individual. This would require a data-sharing platform that can pull together information held by separate agencies and create a real-time risk score based on the aggregated values of identified predictors.
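As a rough illustration of a risk score built from aggregated predictor values, the sketch below computes a weighted score per record, averages it at a chosen level (here, locality) and flags levels above a threshold. The predictor names, weights, threshold and records are all hypothetical placeholders; a real system would derive them from data and validate them.

```python
from collections import defaultdict

# Hypothetical predictors and weights, for illustration only.
WEIGHTS = {"missed_appointments": 0.5, "emergency_admissions": 1.0}


def risk_score(record, weights=WEIGHTS):
    """Weighted sum of the predictor values present in one record."""
    return sum(w * record.get(name, 0) for name, w in weights.items())


def scores_by_level(records, level_key, weights=WEIGHTS):
    """Mean risk score for each value of `level_key` (e.g. locality)."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[level_key]].append(risk_score(record, weights))
    return {level: sum(vals) / len(vals) for level, vals in grouped.items()}


def alerts(scores, threshold):
    """Levels whose mean score meets or exceeds the threshold."""
    return sorted(level for level, score in scores.items() if score >= threshold)


# Hypothetical records pooled from separate agencies.
records = [
    {"locality": "A", "missed_appointments": 2, "emergency_admissions": 1},
    {"locality": "A", "missed_appointments": 0, "emergency_admissions": 0},
    {"locality": "B", "missed_appointments": 4, "emergency_admissions": 2},
]
scores = scores_by_level(records, "locality")
print(scores, alerts(scores, threshold=2.0))
```

The same functions work at institution or individual level by changing `level_key`, which mirrors the multi-level alerts described above.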

Research team

Professor Maria Fasli

Research Lead

Maria’s research interests lie in agents and multi-agent systems and their theoretical foundations and practical applications, machine learning, data exploration, analysing and modelling complex data, and Big Data.

Dr Beatriz de la Iglesia


Beatriz's current research interests include data mining, in particular the extraction of partial classification rules (nuggets) using meta-heuristic algorithms.

Dr Udo Kruschwitz


Udo's current research interests include natural language processing (NLP) and information retrieval (IR), and the implementation of such techniques in real applications.

Professor Elena Kulinskaya


Elena's current research interests include foundations of statistics, statistical methods for discrete and skewed data, asymptotic methods, applied statistics, and meta-analysis and research synthesis.

Professor Berthold Lausen


Berthold's current research interests include biostatistics, classification, clinical research, computational statistics, epidemiology and systems biology.

Professor Klaus McDonald-Maier


Klaus's current research interests include embedded systems and System-on-Chip (SoC) design, development support, and technology to increase performance and reliability.

Dr Abdellah Salhi


Abdellah's current research interests include optimisation (mathematical programming and heuristics), numerical analysis, data mining and bioinformatics.

Dr Abhijit Sengupta


Abhijit's current research interests include innovation and technology management, firm strategy, the role of ecosystems and institutions in innovation and firm strategy, networks and complex systems, and analytics and behaviour.