This is the list of projects with their descriptions. The descriptions may change slightly before the start of the school.

The Late-Life Dementia Risk Index using Large Electronic Medical Records Database

The increase in life expectancy has made diseases that cause a decline in mental ability more common than ever. Collectively, these diseases are called dementia, with Alzheimer’s disease accounting for 60-80% of all cases. Early diagnosis is of paramount importance for obtaining the maximum benefit from available treatments. Current state-of-the-art algorithms apply risk models to collected data in an attempt to accurately predict the risk of having the disease within a certain period of time. The key difficulties in applying such techniques are noise in the measured data and the wide variation in patients’ responses to such diseases. Our intention is to develop and validate a late-life dementia risk index using a large electronic medical records database. We will apply Multi-Task Learning algorithms, because the nature of the disease makes the method applicable. This technique has shown improvements over single-task techniques in the various fields where it has been applied. More accurate prediction of dementia would be an invaluable tool for doctors to improve the quality of life of a large part of the elderly population. Through this project, the students will become familiar with multiple aspects of this exciting interdisciplinary field and gain highly valuable hands-on experience with real-world problems.

Supervisors: George Giannoulis, Ioannis Kakadiaris, Dan Price, Rajender R. Aparasu, Georgios Paliouras

Characterizing the Big Data distributed sources of the LOD cloud

In recent years, the trend to open up data and provide it freely on the Internet has intensified, both in volume and in the quality and value of the data made available. This has created an unprecedented opportunity for the linked data community to combine, cross-reference, and analyse large volumes of high-quality data and to build innovative applications. This project will prototype a proof-of-concept system, based on Hadoop, Mahout, and similar cutting-edge frameworks, that efficiently extracts statistics about the performance and contents of distributed Big Data sources. Such statistics can be used to optimize the querying of the large-scale distributed sources on the Linked Open Data (LOD) cloud.
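As a rough illustration of the kind of content statistics meant here, the following minimal sketch counts how often each predicate occurs in each source and uses the counts to rank candidate sources for a query. It is an assumption-laden toy: it runs in plain Python rather than on Hadoop or Mahout, it assumes triples arrive as (source, subject, predicate, object) tuples, and the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def predicate_statistics(quads):
    """Count how often each predicate occurs in each source.

    `quads` is an iterable of (source, subject, predicate, object) tuples
    (an assumed input format). Returns {source: Counter({predicate: count})}.
    """
    stats = defaultdict(Counter)
    for source, _subject, predicate, _obj in quads:
        stats[source][predicate] += 1
    return dict(stats)


def sources_for_predicate(stats, predicate):
    """Return (source, count) pairs for sources containing the predicate,
    ordered by descending count, then by source name."""
    hits = [
        (source, counts[predicate])
        for source, counts in stats.items()
        if counts[predicate] > 0
    ]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data standing in for two distributed sources.
quads = [
    ("sourceA", "ex:alice", "foaf:knows", "ex:bob"),
    ("sourceA", "ex:alice", "foaf:name", "Alice"),
    ("sourceB", "ex:bob", "foaf:knows", "ex:carol"),
    ("sourceB", "ex:carol", "foaf:knows", "ex:alice"),
]

stats = predicate_statistics(quads)
print(sources_for_predicate(stats, "foaf:knows"))
```

A query planner could use such counts to skip sources that contain no matching triples and to contact the most promising sources first.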

Supervisors: A. Charalambidis, S. Konstantopoulos, G. Kalampokinis

Pay-as-you-go modularization of large ontologies

The proliferation of large ontologies makes it necessary to partition them into coherent modules. However, the effectiveness of a modularization depends on the target purpose. For example, the modularization criteria when the target is efficient reasoning may differ from those when the target is ontology alignment. The aim of this project is to comparatively investigate different algorithms in different contexts (i.e. target purposes), specify the desiderata for each context, and build a computationally efficient pay-as-you-go algorithm for the modularization of ontologies.

Supervisors: G. Vouros, A. Koukourikos

Dense structures in dynamic social networks and influence detection

Online social networks (SNs) can be characterised by the existence of sparse and dense regions, as formed by the patterns of links between the nodes. Dense regions are important because they denote high connectivity and cohesion, and can be employed in influence detection and in the study of information diffusion. Moreover, many SNs can only be accurately described as heterogeneous, in the sense that they comprise interrelated entities (i.e. nodes) of two, three, or more types, such as users, tags, and URLs. Finally, in certain social networks information is obtained as a continuous stream of posts made by users. The aim of this work is to adapt an incremental algorithm for dense structure discovery to the case of a heterogeneous, streaming social network (such as Twitter), and then to use that information to detect influential users.
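To make the notion of a dense region concrete, the sketch below computes k-core numbers by repeatedly peeling the node of lowest remaining degree. This is only one common definition of density and is an assumption of the example, not necessarily the algorithm the project will adapt; the sketch is also static and homogeneous (a single node type, no streaming), and the function names and the small graph are hypothetical.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored.
    A node has core number k if it belongs to a maximal subgraph in
    which every node has at least k neighbours, but to no such
    subgraph for k + 1.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def dense_region(edges, k):
    """Return the set of nodes whose core number is at least k."""
    return {node for node, c in core_numbers(edges).items() if c >= k}


# Hypothetical graph: a 4-clique (a, b, c, d) plus a loosely attached node e.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("a", "e"),
]
print(sorted(dense_region(edges, 3)))
```

This simple version rescans the remaining nodes at every step, so it is quadratic in the number of nodes; an incremental variant would instead update core numbers as edges arrive in the stream.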

Supervisor: D. Vogiatzis

REVEALing Entity Relations in Twitter

Relation extraction is the task of identifying the relations that hold between interesting entities in text data. Being a challenging subtask of information extraction, it extracts the knowledge required to move from named entity recognition to data interpretation and understanding. The goal of this project is to discover text-based relations between named entities in Twitter data. We will focus on web-scale information extraction, where the relations cannot be specified in advance and speed is important.

Supervisors: A. Krithara, G. Paliouras


This project aims to give a positive answer to the ultimate question regarding all high-tech products: it’s all good, but does it make coffee? The project participants will develop and integrate software controlling a robot, software for Android mobile devices, and the electronic and electrical hardware needed to control a coffee maker’s valve. In the final demonstration, your fellow students will be able to use their mobile devices to order coffee and place their empty mug on the robot; the robot will then position itself under the valve, remotely open it, fill the mug, and return the full mug to the person who ordered the coffee.

Supervisors: A. Lydakis, A. S. Dogruoz, T. Giannakopoulos, M. Dagioglou, S. Konstantopoulos

Content-based recommendation and browsing in multimedia collections

Recommendation systems (RSs) are an application of pattern analysis techniques to the task of generating personalized recommendations for several types of information items (e.g. web pages, movies, music, etc.). In general, movie RSs use (a) collaborative knowledge extracted from what users with similar preferences and tastes rated in the past and (b) analysis of the features shared between similarly rated items. However, in most cases, these features are restricted to high-level metadata such as authors, tags, directors, genres, etc. The purpose of this project is to analyse the low-level multimedia content of a movie in order to extract patterns, correlations, and trends that can be used to enhance RSs, as well as to develop innovative, colourful, and intuitive visualizations of movies for browsing the collection.

Supervisors: T. Giannakopoulos, A. S. Dogruoz, S. Konstantopoulos

Analyzing Energy Usage in the Smart Grid and Incentivizing Renewable Energy Usage

Putting smart meters in every household generates a large volume of data that must be processed; each meter generates a stream of data on the energy usage of its household. However, this also gives us a number of opportunities. First, by analyzing this data we can look for patterns of energy usage, thus learning what the main energy loads of a household are and when they are normally used, as well as the user’s preferences regarding these loads and their comfort with regard to the temperature in the house. In this way, we can use machine learning and data analytics technologies both to learn these patterns and to discover ways to reduce energy usage. Second, in order to increase the penetration of renewable energy sources in the grid, it is necessary to incentivize prosumers in the distribution grid to be flexible about when they consume energy. Using gamification, we will provide social incentives to consumers by comparing their performance with their friends’, e.g. announcing who uses more green energy. We will also develop intelligent agents that analyze their users’ preferences and advise them (or even take action, if explicitly allowed) in order to help them win this social game.

Supervisors: I. A. Vetsikas, C. Akasiadis

Argument Extraction on large corpora from social media

Argumentation is a branch of philosophy that studies the act or process of forming reasons and of drawing conclusions in the context of a discussion, dialogue, or conversation. Being an important element of human communication, it is used very frequently in texts as a means of conveying meaning to the reader. As a result, argumentation has attracted significant research focus from many disciplines, ranging from philosophy to artificial intelligence. This project will explore ways of identifying arguments in textual data (texts from social media) and associating these arguments with elements of a predefined semantic model. The project will concentrate on three languages (English, German, Greek) and will try to exploit information that can be acquired in an unsupervised way from large corpora in these three languages.

Supervisors: G. Petasis, V. Karkaletsis

Enhanced Assisted Training for Teleoperation Applications

The project consists of developing a complete virtual system in which an operator can teleoperate a virtual robot by means of two joysticks or a whole-body interface. The robot could be a simple 2D, 4-DoF planar manipulator (one DoF is for opening and closing the gripper), but could also be more complex. The operator can be trained in three different ways: without feedback, with visual cues, and with haptic cues. The goal of the training task will be appropriately chosen. Through specially designed experimental sessions, operator performance will be modeled in order to evaluate and compare the effectiveness of the three training strategies.

Supervisors: N. Mavridis, P. Gallina

Compositionality and the Symbol Grounding Problem

Compositionality is widely accepted as a fundamental principle in linguistics and is also acknowledged as a key cognitive capacity. However, despite the prime importance of compositionality for explaining the nature of meaning and concepts in cognition, and despite the need for computational models of the composition of grounded meaning that could be applied in embodied intelligent agents such as robots, there is little existing research. Most importantly, applicable computational models of semantic composition of grounded meaning do not yet exist. Thus, we aim to create empirically derivable computational models of semantic composition that can be applied to cognitive robots capable of creating and processing grounded perceptual-semantic associations and, most importantly, their compositions, ideally taking into account syntactic and pragmatic as well as semantic considerations.

Supervisor: N. Mavridis

Smart Buildings and the Human-Machine Cloud

Towards the goal of the human-robot cloud, i.e. distributed, on-demand, reconfigurable human-machine cognitive systems that are made up of sensing, processing, and actuation components and potentially cover large areas, this project considers a pilot smart-building example. In this example, a set of cameras, laser range finders, and other sensors, together with a number of processing and actuation elements, including face detection, expression recognition, and laser-range-finder people trackers, are transformed into a prototypical reconfigurable, distributed, extended cognitive system.

Supervisor: N. Mavridis

Fun, Dynamic, Multimodal Robot Learning with I Spy and 20 Questions

Can we minimize the typical tedium of training robots by naturally integrating robot learning into conversational interactions with humans? Specifically, can robots engage humans in interactive games such as I Spy and 20 Questions, which can be naturally multimodal, in a way that assists the robot in speech recognition, language learning, and object recognition? The project will focus on creating a game combining I Spy and 20 Questions that, when played jointly by a human and a robot, enables the robot to improve both its language and its vision performance.

Supervisors: R. Nielsen, M. Dagioglou, V. Karkaletsis