Compliant and Accountable Systems Group

Department of Computer Science & Technology
University of Cambridge

Student Projects

A selection of possible student projects proposed by our team (CompAcctSys) is listed below, in no particular order. Note that these are only indicative of the sorts of projects we could supervise. If you are interested in these, or in other projects in areas including ‘algorithmic accountability’ and automated decision-making (particularly around machine learning), security/privacy/data protection, social aspects of data/tech, distributed/mobile systems/IoT/AR/VR, digital art & interactive environments, etc., the best way forward is to have a chat and discuss. Please contact jatinder.singh [@] cl.cam.ac.uk in the first instance.

Shielding from online targeting
Detecting abuse in online games
Blocking in the Internet of Things
Supporting Seamlessness: A reconfigurable, command-and-control framework for the Internet of Things
Enabling physical world gaming
Making systems reviewable
Implementing “decision provenance” for accountable ML
XR auditability
Logging vs privacy: A trade-off
The sustainability of XR auditing
Disclosure dashboards for data protection access requests
Efficient web service monitoring
Risks of vulnerability detection using financial data
Public attitudes on sharing sensitive data (ML fairness)
ML explainability: A systematic comparison

Shielding from online targeting

Targeted advertising and recommender systems are common on online services, but raise concerns about privacy, data protection, and discrimination. While some platforms provide mechanisms that let users set their preferences, these are typically quite limited. Improving user control over these systems, to help people better protect their privacy, could go some way towards addressing these issues. This project involves exploring tooling that allows users to exclude certain factors relating to themselves (demographic attributes, interests, preferences, behaviours, etc.) from consideration by the systems that disseminate targeted advertising or recommended content.
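
By way of illustration, one starting point might be a user-side exclusion filter applied to the targeting criteria attached to content. The Python sketch below is a toy example: the item structure and the "targeting_criteria" field are assumptions for illustration, not any platform's actual API.

    # Minimal sketch: filter targeted items against a user-defined exclusion list.
    # The item structure and field names ("targeting_criteria") are hypothetical.

    def filter_items(items, excluded_attributes):
        """Drop any item whose targeting criteria mention an excluded attribute."""
        allowed = []
        for item in items:
            criteria = set(item.get("targeting_criteria", []))
            if criteria & excluded_attributes:
                continue  # item was targeted using something the user opted out of
            allowed.append(item)
        return allowed

    user_exclusions = {"age_bracket", "inferred_income", "browsing_history"}
    feed = [
        {"id": 1, "targeting_criteria": ["interest:cycling"]},
        {"id": 2, "targeting_criteria": ["inferred_income", "location"]},
    ]
    print(filter_items(feed, user_exclusions))  # only item 1 survives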

Detecting abuse in online games

Massively multiplayer online role-playing games (MMORPGs) are online computer games that enable individuals of varying ages, genders, ethnicities and cultures to interact with one another in a digital environment. While players usually have fun working together (solving quests, building, trading, exploring, etc.), many experience some form of harassment. This is particularly relevant in highly immersive environments, such as those offered by Virtual Reality. Game developers have recognised this issue, and many games now provide an option to report in-game anti-social behaviour. However, reports of harassment must be reviewed manually by a human. This not only takes time (and so exposes a player to further abuse), but also costs money, whether through staff/moderator costs or through lost players. For this project, you will identify technical indicators of harassment within a gaming environment.
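
A first pass at such indicators might combine simple signals over chat logs, for example the rate at which one player directs messages containing flagged terms at another. The Python sketch below is purely illustrative; the message format and the term list are assumptions, and real indicators would need far richer features and evaluation.

    # Toy sketch: count how often each sender directs flagged terms at each target.
    # The message format and the flagged-term list are illustrative assumptions.
    from collections import Counter

    FLAGGED_TERMS = {"noob", "trash", "quit"}  # placeholder vocabulary

    def harassment_indicators(messages):
        """Rank (sender, target) pairs by how many flagged messages were sent."""
        counts = Counter()
        for msg in messages:
            words = set(msg["text"].lower().split())
            if words & FLAGGED_TERMS:
                counts[(msg["sender"], msg["target"])] += 1
        return counts.most_common()

    chat = [
        {"sender": "p1", "target": "p2", "text": "nice trade"},
        {"sender": "p3", "target": "p2", "text": "quit the game noob"},
        {"sender": "p3", "target": "p2", "text": "trash player"},
    ]
    print(harassment_indicators(chat))  # [(('p3', 'p2'), 2)]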

Blocking in the Internet of Things

It has been shown that many mobile apps and IoT devices communicate with a range of servers, some of which appear to go beyond what is needed for the device or app to function. This raises privacy considerations, given that the sensors and actuators deployed in our homes and streets, carried in our pockets, etc., can generate a great deal of sensitive information. This project involves undertaking a technical survey to investigate the extent to which seemingly spurious traffic can be blocked while still enabling appropriate functioning of the device/application.
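
The survey could be structured as a simple block-and-test loop: block one observed destination at a time and record whether the device or app still passes a functional check. The Python sketch below assumes hypothetical helpers (block_domain, unblock_domain, device_still_works) that would wrap, say, a firewall or Pi-hole rule and a scripted device test.

    # Sketch of the survey loop: block one observed destination at a time and
    # record whether the device/app still passes a functional check.
    # block_domain(), unblock_domain() and device_still_works() are hypothetical
    # helpers supplied by the experimenter.

    def survey_blockability(observed_domains, block_domain, unblock_domain,
                            device_still_works):
        results = {}
        for domain in observed_domains:
            block_domain(domain)
            try:
                # True suggests the traffic to this domain is non-essential
                results[domain] = device_still_works()
            finally:
                unblock_domain(domain)
        return results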

Supporting Seamlessness: A reconfigurable, command-and-control framework for the Internet of Things

Visions for future computing environments are of a hyperconnected world, where a multitude of sensors, actuators, and other components provide functionality seamlessly. Such an environment requires a flexible management infrastructure ‘under the hood’. This project involves designing and engineering a management infrastructure to enable such functionality, aimed at supporting an IoT environment. It has a strong practical/engineering component, with the aim of an open source release.
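
As a very rough sketch of what the core of such an infrastructure might look like, the Python below has devices register the commands they support with a central registry, which can then dispatch (and re-route) commands at runtime. A real deployment would sit on top of a messaging layer (e.g. MQTT); that layer and the names used here are assumptions.

    # Minimal sketch of a reconfigurable command-and-control core: devices
    # register the commands they support, and a controller dispatches them at
    # runtime. The messaging/transport layer is deliberately omitted.

    class DeviceRegistry:
        def __init__(self):
            self._handlers = {}  # (device_id, command) -> callable

        def register(self, device_id, command, handler):
            self._handlers[(device_id, command)] = handler

        def dispatch(self, device_id, command, **params):
            handler = self._handlers.get((device_id, command))
            if handler is None:
                raise LookupError(f"{device_id} does not support {command}")
            return handler(**params)

    registry = DeviceRegistry()
    registry.register("lamp-1", "set_brightness", lambda level: f"lamp-1 -> {level}%")
    print(registry.dispatch("lamp-1", "set_brightness", level=40))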

Enabling physical world gaming

Pokémon Go brought the concept of augmented-reality-driven gaming to the masses. However, many games in this space tend to be location and image based, i.e. responding to a person’s (device’s) location and viewpoint. As the Internet of Things evolves, and connected ‘things’ gain actuation capabilities, there is real potential for gaming to bring about physical-world effects. This project is to explore the potential in this space, whereby games not only entail a physical element, but also interact with devices and technology embedded within the environment.

Making systems reviewable

Algorithmic accountability is a topic of increasing importance for legal compliance, for identifying problems with machine learning systems, and for improving societal acceptance of new technologies. In practice, automated decision-making consists of both human (organisational) and technical processes and systems. ‘Reviewability’ in the context of automated decision-making means accountability through record-keeping and logging of the technical and human processes that produce a system and its decisions. This project will explore what kind of automated logging could work at different stages of those processes, and at different levels of abstraction, to support reviewability in different contexts.
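
One concrete form such record-keeping might take is a structured, timestamped record emitted at each stage of the pipeline (data collection, training, deployment, individual decisions). The Python sketch below shows one possible record shape; the fields are illustrative assumptions rather than a fixed schema.

    # One possible shape for a stage-level review record; the fields are
    # illustrative assumptions, not a fixed schema.
    import datetime
    import json

    def review_record(stage, actor, inputs, outputs, notes=""):
        """Emit a structured, timestamped record for one pipeline stage."""
        return json.dumps({
            "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
            "stage": stage,      # e.g. "data-collection", "training", "decision"
            "actor": actor,      # human role or system component responsible
            "inputs": inputs,    # references to the data/artefacts used
            "outputs": outputs,  # references to what was produced
            "notes": notes,
        })

    print(review_record("training", "ml-team", ["dataset-v3"], ["model-v7"],
                        "retrained after drift alert"))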

Implementing “decision provenance” for accountable ML

Many of the concerns regarding automated decision-making (ML systems) relate to the processes and contexts in which models are built, deployed and operated. Towards this, decision provenance is a concept that concerns tracking the flow of data through systems in order to support accountability. This project involves taking the concept forward through practical implementations, involving various data sources, devices and ML models, in order to demonstrate its potential.
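
Decision provenance lends itself to a graph of entities (data, models, decisions) linked by the activities that produced or used them, loosely in the spirit of W3C PROV. The Python sketch below is a simplified, in-memory version with made-up node names, just to show how a decision's lineage might be walked back.

    # Simplified, in-memory provenance graph: entities are linked to the
    # activities that produced them, so a decision's lineage can be traced.

    class ProvenanceGraph:
        def __init__(self):
            self.derived_from = {}  # entity -> list of (activity, source entity)

        def record(self, output_entity, activity, input_entities):
            self.derived_from.setdefault(output_entity, []).extend(
                (activity, src) for src in input_entities)

        def lineage(self, entity, depth=0):
            """Print the chain of activities and inputs behind an entity."""
            for activity, src in self.derived_from.get(entity, []):
                print("  " * depth + f"{entity} <- {activity} <- {src}")
                self.lineage(src, depth + 1)

    g = ProvenanceGraph()
    g.record("model-v1", "training-run-42", ["dataset-a", "dataset-b"])
    g.record("decision-123", "inference", ["model-v1", "applicant-record-9"])
    g.lineage("decision-123")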

XR auditability

Augmented, virtual, and mixed reality (XR) systems are becoming more widespread and are being deployed in increasingly high-risk scenarios (construction sites, operating theatres, etc.), but things will inevitably go wrong, be it through software bugs, hardware faults, or human error. The complexity of these systems means that it may be difficult to understand what happened and to take steps to avoid recurrences. This project explores the auditability of XR: building data capture into a popular game engine that allows developers to (i) select the types of data to capture (game engine state, user input, raw sensor data, engine logs, etc.) and (ii) recreate the steps leading up to an incident so as to diagnose the sources of issues once they occur.
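
Within a game engine, this might take the form of a per-frame recorder that developers configure with the data sources they care about, plus a routine that replays the captured frames. The engine-agnostic Python sketch below illustrates the capture side; the source names and snapshot values are stand-ins, and a real version would be an engine plugin.

    # Engine-agnostic sketch of a configurable per-frame recorder. The capture
    # sources (callables returning serialisable snapshots) are stand-ins for
    # real engine queries.
    import json
    import time

    class AuditRecorder:
        def __init__(self, sources):
            self.sources = sources  # name -> zero-arg callable returning a snapshot
            self.frames = []

        def capture_frame(self):
            self.frames.append({
                "t": time.time(),
                "data": {name: fn() for name, fn in self.sources.items()},
            })

        def dump(self, path):
            with open(path, "w") as f:
                json.dump(self.frames, f)

    recorder = AuditRecorder({
        "head_pose": lambda: [0.0, 1.6, 0.0],     # stand-in for a real pose query
        "user_input": lambda: {"trigger": False},
    })
    recorder.capture_frame()
    recorder.dump("session_audit.json")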

Logging vs privacy: A trade-off

Online platforms, including social networks and cloud services, allow ordinary users (non-experts) to bring about social harms: distributing fake news and harassing people using fake videos and voices, to name a few. One way forward for detecting such misuse is to keep and analyse audit records of how users use these platforms, and this approach may even be mandated by law in some countries (e.g. Brazil’s fake news bill). However, keeping audit records about users’ identities and activities itself raises critical privacy concerns. There is therefore a challenging dilemma: on the one hand, online platforms need to audit user activity to tackle malicious usage; on the other, auditing users’ activities raises privacy and security concerns. This project will explore ways forward towards balancing these competing concerns.
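
One direction worth exploring is audit records that remain useful for later investigation while minimising directly identifying detail, for instance pseudonymising user identifiers with a keyed hash and logging coarse action categories rather than content. The Python sketch below shows the idea; the field choices are illustrative, not a recommendation.

    # Sketch of a privacy-leaning audit record: the user identifier is
    # pseudonymised with a keyed hash (HMAC) and only a coarse action category
    # is kept, rather than raw content.
    import datetime
    import hashlib
    import hmac
    import json

    AUDIT_KEY = b"example-key-store-and-rotate-separately"  # placeholder secret

    def audit_event(user_id, action_category):
        pseudonym = hmac.new(AUDIT_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]
        return json.dumps({
            "when": datetime.datetime.utcnow().isoformat() + "Z",
            "who": pseudonym,         # re-identifiable only with the key
            "what": action_category,  # e.g. "post_shared", not the post itself
        })

    print(audit_event("user-42", "post_shared"))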

The sustainability of XR auditing

Capturing data about how a system works, how it is operated, and in which contexts is crucial to facilitating investigations and reviews when they are invariably needed. For emerging technologies such as Virtual, Augmented and Mixed Reality (XR) systems, auditability is even more important, especially given that such technologies blend the physical and digital. As XR adoption by the public (as well as by higher-consequence sectors such as surgery, the military and manufacturing) continues, so too will various risks to safety, privacy and security.

The auditability of XR systems is therefore an important challenge. However, capturing audit data requires additional resources (storage, processing, energy, etc.) and will therefore incur financial costs for those who operate or otherwise maintain XR systems, as well as environmental costs that may affect policy and regulation. Such costs may outweigh the risk of fines or penalties incurred through failures or accidents, especially in large-scale deployments of complex XR systems. In this project you will build an XR prototype application with mechanisms to capture various data relevant to audit. You will then identify the sustainability considerations of auditing and perform experiments to measure the sustainability costs of auditability in XR.
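
The measurement side could begin simply, by running the same workload with and without audit capture and comparing elapsed time and the volume of data written. The Python sketch below illustrates that comparison; the workload and the recorded state are stand-ins for a real XR application.

    # Rough sketch of an overhead measurement: run the same workload with and
    # without audit capture and compare elapsed time and bytes written.
    import json
    import time

    def run_workload(frames, recorder=None):
        start = time.perf_counter()
        for i in range(frames):
            state = {"frame": i, "pose": [i * 0.01, 1.6, 0.0]}  # pretend engine state
            if recorder is not None:
                recorder.append(state)
        return time.perf_counter() - start

    baseline = run_workload(10_000)
    audit_log = []
    audited = run_workload(10_000, recorder=audit_log)
    log_bytes = len(json.dumps(audit_log).encode())

    print(f"runtime overhead: {audited - baseline:.4f}s, audit data: {log_bytes} bytes")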

Disclosure dashboards for data protection access requests

Data protection laws such as the GDPR often provide users with rights over accessing their data; however, such regulations typically say very little about how this data should be provided to the user. This means that similar services may supply that data in very different ways (database dumps, CSV files, images), and these may not always be particularly useful for users. This project will explore the creation of an interactive dashboard which can import data archives in a variety of formats, parse that data, and then present it to the user in different ways. By doing so, the dashboard aims to assist a broad range of stakeholders in interpreting and making sense of the data that comes from a wide variety of services, promoting more general approaches for making use of data access rights.
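
At its core, the import step is a format dispatcher that normalises whatever a service provides into a common structure the dashboard can render. The Python sketch below handles only CSV and JSON as an illustration; real archives (ZIPs of mixed media, proprietary layouts) would need per-service adapters.

    # Minimal sketch of the import step: dispatch on file extension and
    # normalise CSV/JSON exports into a common list-of-dicts structure.
    import csv
    import json
    import pathlib

    def import_archive_file(path):
        path = pathlib.Path(path)
        if path.suffix == ".csv":
            with open(path, newline="") as f:
                return list(csv.DictReader(f))
        if path.suffix == ".json":
            with open(path) as f:
                data = json.load(f)
                return data if isinstance(data, list) else [data]
        raise ValueError(f"No importer for {path.suffix} yet")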

Efficient web service monitoring

This project is in the context of monitoring web services for various purposes (e.g. uncovering potential misuse). Such services take input data, perform some processing, and return output/results; for instance, a facial emotion recognition service takes an input image and returns detected emotions. Now consider a set of servers running one or more such services (perhaps distributed) which need to be monitored. The monitoring involves recording inputs, processing by-products (state) and outputs, to be processed later for various purposes. This project will explore how this can be done efficiently.
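
A natural starting point is a wrapper around the service call that records a compact digest of inputs and outputs rather than the raw payloads, keeping records small while still supporting later checks. The Python sketch below shows such a wrapper; the service itself is a stand-in, and sampling, batching or shipping records off-host are the kinds of efficiency levers the project would explore.

    # Sketch of a lightweight monitoring wrapper: it records hashes and sizes
    # of the input and output rather than the raw payloads.
    import functools
    import hashlib
    import time

    MONITOR_LOG = []

    def monitored(service_fn):
        @functools.wraps(service_fn)
        def wrapper(payload: bytes):
            result = service_fn(payload)
            MONITOR_LOG.append({
                "service": service_fn.__name__,
                "t": time.time(),
                "input_sha256": hashlib.sha256(payload).hexdigest(),
                "input_bytes": len(payload),
                "output_sha256": hashlib.sha256(result).hexdigest(),
            })
            return result
        return wrapper

    @monitored
    def emotion_service(image_bytes: bytes) -> bytes:
        return b'{"emotion": "neutral"}'  # stand-in for the real model

    emotion_service(b"...image data...")
    print(MONITOR_LOG[-1])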

Risks of vulnerability detection using financial data

The UK Finance Regulator points to the use of ML for vulnerability detection as offering customer benefit, but there are many potential risks. There is no clear definition of vulnerability, and it may manifest in different people in different ways. This project would explore the risk-benefit trade-offs in the use of machine learning (including big-data analysis of transactional data, voice analysis of customer phone calls, etc.) for detecting vulnerability drivers. Examples of risks include higher false positive rates for women vs. men in voice analytics, low accuracy for those with strong accents, misuse of vulnerability data (e.g. for credit risk, violating the GDPR’s purpose limitation), security and fraud risks, and privacy risks (the Pew Research Center found that, in some contexts, most Americans do not want to be surveilled even when it is for their own benefit).
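
One of the risks named above, differing false positive rates across groups, is straightforward to quantify once predictions and ground-truth labels are available. The Python sketch below shows the calculation on made-up data; it is purely illustrative of the metric, not of any real system's behaviour.

    # Sketch: compare false positive rates across groups. The data is made up
    # purely to show the calculation.

    def false_positive_rate(y_true, y_pred):
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        negatives = sum(1 for t in y_true if t == 0)
        return fp / negatives if negatives else 0.0

    def fpr_by_group(records):
        groups = {}
        for r in records:
            labels, preds = groups.setdefault(r["group"], ([], []))
            labels.append(r["label"])
            preds.append(r["pred"])
        return {g: false_positive_rate(t, p) for g, (t, p) in groups.items()}

    sample = [
        {"group": "women", "label": 0, "pred": 1},
        {"group": "women", "label": 0, "pred": 0},
        {"group": "men",   "label": 0, "pred": 0},
        {"group": "men",   "label": 0, "pred": 0},
    ]
    print(fpr_by_group(sample))  # {'women': 0.5, 'men': 0.0}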

Public attitudes on sharing sensitive data (ML fairness)

A recent CDEI study showed that UK customers are more likely to financially punish banks when told they are using advanced algorithms that may discriminate based on gender, race, and social media usage. However, there is a clear tension between the need to track protected features in order to test for discrimination and the need to protect customers’ privacy and sensitive information. This project would conduct surveys and interviews to understand the circumstances, contexts, or communication strategies under which people are more comfortable sharing their demographic characteristics. For example, are they more likely to share their race when told it will be used by regulators to investigate whether there are discriminatory practices? What are the trade-offs between the potential benefit of being able to audit a model for bias and the potential privacy and security costs? And what are the implications for model building (ML)?

ML explainability: A systematic comparison

There is much discussion about making ML models more explainable. While there is some intuition that certain model-building methods are more explainable than others (e.g. neural networks vs. decision trees), there is little in the way of rigorous comparison between approaches. This project is to perform such a comparison: defining metrics and criteria, and systematically testing a range of approaches, to provide data indicating the differences and the relevant factors/considerations.
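
As an example of the kind of metric the comparison could include, the Python sketch below (using scikit-learn) measures how many features a fitted model needs to account for 90% of its feature importance, with fewer features arguably being easier to explain. This particular metric is an assumption made for illustration, not an established standard, and the project would need to define and justify its own set.

    # Sketch of one candidate metric: the number of features needed to account
    # for 90% of a model's feature importance. Illustrative only.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)

    def features_for_90pct_importance(model):
        model.fit(X, y)
        importances = np.sort(model.feature_importances_)[::-1]
        return int(np.searchsorted(np.cumsum(importances), 0.9) + 1)

    for model in (DecisionTreeClassifier(max_depth=4, random_state=0),
                  RandomForestClassifier(n_estimators=100, random_state=0)):
        print(type(model).__name__, features_for_90pct_importance(model))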
