Peer-reviewed Papers & Theses

This section contains my peer-reviewed academic publications, including conference papers, journal articles, and theses.

While my primary field of academic research since 2021 has been data protection and formal methods, I have also published peer-reviewed data science and political science papers. The latter are identified with green badges in the list below.

Papers & Theses (12)

Security Thesis Formal methods

Rigorous & Automated Privacy by Design

François Hublet
2025

@phdthesis{hublet2025rigorous,
  title = {Rigorous \& Automated Privacy by Design},
  author = {François Hublet},
  year = {2025}
}

Authors' version

I will defend my doctoral thesis on December 12. The current version of the manuscript is available to interested parties prior to the defense.

Journal Political Science

A Tale of Three Cleavages

François Hublet & Mattéo Lanoë
Revue française de science politique (RFSP), 2025

The 2022 French presidential election unfolded during a phase of political realignment marked by the emergence of a relatively balanced tripolar competition among left-wing, center-right, and far-right parties. To analyze the underlying social and electoral cleavages, we conducted a large-scale, precinct-level quantitative study of socio-demographic variables. While our findings broadly support neo-cleavage theories, they also underscore the enduring explanatory power of socio-economic predictors in shaping electoral behavior. We identify three main cleavage dimensions (socio-economic, educational, and migratory) that produce a system of cross-cutting cleavages among the four main electoral choices: Mélenchon, Macron, Le Pen, and abstention.
@article{hublet2025a,
  title = {A Tale of Three Cleavages},
  author = {François Hublet and Mattéo Lanoë},
  year = {2025},
  journal = {Revue française de science politique},
  volume = {75},
  number = {2},
  publisher = {Presses de Sciences Po},
  doi = {10.3917/rfsp.752.0215}
}

DOI: 10.3917/rfsp.752.0215

Conference Formal methods

Instrumenting Runtime Enforcement

François Hublet, David Basin, Linda Hu, Srđan Krstić, & Lennard Reese
International Conference on Runtime Verification (RV), 2025

Runtime enforcement ensures that a running system complies with a property by observing and modifying the system’s actions. In practice, the property is often defined in terms of high-level, abstract events, while the system’s behavior consists of low-level, concrete actions. The relationship between actions and events is established in the instrumentation process, where developers must ensure that (i) system actions report the right events, and (ii) the necessary modifications to the system’s behavior are correctly enforced. However, the abstraction gap between a high-level property and low-level actions makes this process error-prone.
In this paper, we refine an existing formal model of runtime enforcement, which leaves instrumentation implicit, into a more precise model that explicitly accounts for instrumentation. We propose a correctness criterion for instrumentation and present a novel library, called InstrLib, that instruments Python applications for runtime enforcement.
@inproceedings{hublet2025instrumenting,
  title = {Instrumenting Runtime Enforcement},
  author = {François Hublet and David Basin and Linda Hu and Srđan Krstić and Lennard Reese},
  year = {2025},
  booktitle = {International Conference on Runtime Verification},
  series = {LNCS},
  volume = {16087},
  publisher = {Springer},
  doi = {10.1007/978-3-032-05435-7_10}
}

DOI: 10.1007/978-3-032-05435-7_10

Conference Formal methods

Scaling Up Proactive Enforcement

François Hublet, Leonardo Lima, David Basin, Srđan Krstić, & Dmitriy Traytel
International Conference on Computer Aided Verification (CAV), 2025

Runtime enforcers receive events from a system and output commands ensuring the system’s policy compliance. Proactive enforcers extend traditional (reactive) enforcers by emitting commands at any time, rather than only as a response to system actions. However, proactive enforcers have so far lacked support for many useful policy features. This, along with the existing tools’ poor performance, hinders their adoption. We present a performance-optimized, proactive enforcement algorithm for a rich policy language: metric first-order temporal logic with function applications, aggregations, and let bindings. We have implemented this algorithm in EnfGuard, the first proactive enforcer tool that supports the above constructs. We evaluated our tool using a novel set of six benchmarks containing both real-world and synthetic policies and logs, demonstrating that it enforces realistic policies out of the box and achieves the necessary performance to be used in real-time systems.
@inproceedings{hublet2025scaling,
  title = {Scaling Up Proactive Enforcement},
  author = {François Hublet and Leonardo Lima and David Basin and Srđan Krstić and Dmitriy Traytel},
  year = {2025},
  booktitle = {International Conference on Computer Aided Verification},
  doi = {10.1007/978-3-031-98682-6_19}
}

DOI: 10.1007/978-3-031-98682-6_19

Journal Data Science

Where the Borders Lie: Mapping Cross-Border Communities in 10 Western European Countries

Aurore Sallard & François Hublet
Transportation Research Record (TRR), 2024

With the deepening of European integration, Western Europe has witnessed the emergence of highly interconnected cross-border living areas. So far, these areas have received rather limited attention from both quantitative research and public policy. The COVID-19 pandemic dramatically exposed the limitations of the status quo: with travel restrictions imposed at administrative borders and limited cross-border crisis management, the daily life of people in border regions was affected in a disproportionate way. In an effort to better understand the geography of cross-border communities, this paper presents the first large-scale quantitative analysis of cross-border communities in Western Europe. We apply the Louvain community detection algorithm to a transnational, fine-grained dataset gathering commuter flows across 10 Western European countries. This allows us to produce the first comprehensive transnational mapping of communities in these countries and identify five main cross-border living areas. Based on these findings, we put forward policy recommendations aimed at improving the design of mobility censuses and developing new institutional frameworks in cross-border regions.
@article{sallard2024where,
  title = {Where the Borders Lie: Mapping Cross-Border Communities in 10 Western European Countries},
  author = {Aurore Sallard and François Hublet},
  year = {2024},
  journal = {Transportation Research Record},
  volume = {2679},
  number = {1},
  publisher = {Sage},
  doi = {10.1177/03611981241254389}
}

DOI: 10.1177/03611981241254389

Conference Formal methods

Proactive Real-Time First-Order Enforcement

François Hublet, Leonardo Lima, David Basin, Srđan Krstić, & Dmitriy Traytel
International Conference on Computer Aided Verification (CAV), 2024

Modern software systems must comply with increasingly complex regulations in domains ranging from industrial automation to data protection. Runtime enforcement addresses this challenge by empowering systems to not only observe, but also actively control the behavior of target systems by modifying their actions to ensure policy compliance. We propose a novel approach to the proactive real-time enforcement of policies expressed in metric first-order temporal logic (MFOTL). We introduce a new system model, define an expressive MFOTL fragment that is enforceable in that model, and develop a sound enforcement algorithm for this fragment. We implement this algorithm in a new tool called WhyEnf and carry out a case study on enforcing GDPR-related policies. Our tool can enforce all policies from the study in real-time with modest overhead. Our work thus provides the first tool-supported approach that can proactively enforce expressive first-order policies in real time.
@inproceedings{hublet2024proactive,
  title = {Proactive Real-Time First-Order Enforcement},
  author = {François Hublet and Leonardo Lima and David Basin and Srđan Krstić and Dmitriy Traytel},
  year = {2024},
  booktitle = {International Conference on Computer Aided Verification},
  doi = {10.1007/978-3-031-65630-9_8}
}

DOI: 10.1007/978-3-031-65630-9_8 Authors' version Extended version Slides

Workshop Preprint

Towards an Enforceable GDPR Specification

François Hublet, Alexander Kvamme, & Srđan Krstić
Mapping and Governing the Online World Workshop (MGOW), 2024

While Privacy by Design (PbD) is prescribed by modern privacy regulations such as the EU's GDPR, achieving PbD in real software systems is a notoriously difficult task. One emerging technique to realize PbD is runtime enforcement (RE), in which an enforcer, loaded with a specification of a system's privacy requirements, observes the actions performed by the system and instructs it to perform actions that will ensure compliance with these requirements at all times. To be able to use RE techniques for PbD, privacy regulations first need to be translated into an enforceable specification. In this paper, we report on our ongoing work in formalizing the GDPR. We first present a set of requirements and an iterative methodology for creating enforceable formal specifications of legal provisions. Then, we report on a preliminary case study in which we used our methodology to derive an enforceable specification of part of the GDPR. Our case study suggests that our methodology can be effectively used to develop accurate enforceable specifications.
@inproceedings{hublet2024towards,
  title = {Towards an Enforceable GDPR Specification},
  author = {François Hublet and Alexander Kvamme and Srđan Krstić},
  year = {2024},
  booktitle = {Mapping and Governing the Online World Workshop},
  doi = {10.48550/arXiv.2402.17350}
}

DOI: 10.48550/arXiv.2402.17350

Conference Security Journal

User-controlled Privacy: Taint, Track, and Control

François Hublet, David Basin, & Srđan Krstić
Proceedings on Privacy Enhancing Technologies (PoPETS), 2024

In this paper, we develop the first language-based, Privacy by Design approach that provides support for a rich class of privacy policies. These policies may be user-defined rather than programmer-defined, and they combine fine-grained information flow policies (considering individual application inputs and outputs) with temporal constraints. Our approach, called Taint, Track, and Control (TTC), combines dynamic information-flow control with runtime enforcement to enforce these policies in the presence of malicious users and developers. We provide semantics and correctness proofs for TTC, formalized using the Isabelle/HOL proof assistant. We also implement our approach in a proof-of-concept web development framework and port three baseline applications from previous work into this framework for evaluation. Overall, our approach enforces expressive user-defined privacy policies with practical runtime performance.
@inproceedings{hublet2024user-controlled,
  title = {User-controlled Privacy: Taint, Track, and Control},
  author = {François Hublet and David Basin and Srđan Krstić},
  year = {2024},
  booktitle = {Proceedings on Privacy Enhancing Technologies},
  doi = {10.56553/popets-2024-0034}
}

DOI: 10.56553/popets-2024-0034 Authors' version Slides Artifact

Conference Security

Enforcing the GDPR

François Hublet, David Basin, & Srđan Krstić
European Symposium on Research in Computer Security (ESORICS), 2023

Violations of data protection laws such as the General Data Protection Regulation (GDPR) are ubiquitous. Currently, building IT support to implement such laws is difficult, and the alternatives, such as manual controls augmented by auditing, are limited and scale poorly. This calls for developing automated enforcement techniques that can rely on a formalization of the law.

In this paper, we present the first enforceable specification of a comprehensive set of GDPR provisions, and describe an architecture that automatically enforces this specification in web applications. We evaluate our architecture by implementing three case studies and show that our approach incurs only modest development and runtime overhead, while covering the most relevant privacy-related aspects of GDPR that can be enforced at runtime.

@inproceedings{hublet2023enforcing,
  title = {Enforcing the GDPR},
  author = {François Hublet and David Basin and Srđan Krstić},
  year = {2023},
  booktitle = {European Symposium on Research in Computer Security},
  doi = {10.1007/978-3-031-51476-0_20}
}

DOI: 10.1007/978-3-031-51476-0_20 Authors' version Slides Artifact

Computational linguistics Journal

IDL-PMCFG, a Grammar Formalism for Describing Free Word Order Languages

François Hublet
Journal of Logic, Language and Information (JLLI), 2022

We introduce Interleave-Disjunction-Lock parallel multiple context-free grammars (IDL-PMCFG), a novel grammar formalism designed to describe the syntax of free word order languages that allow for extensive interleaving of grammatical constituents. Though interleaved constituents, and especially the so-called hyperbaton, are common in several ancient (Classical Latin and Greek, Sanskrit...) and modern (Hungarian, Finnish...) languages, these syntactic structures are often difficult to express in existing formalisms. The IDL-PMCFG formalism combines Seki et al.’s parallel multiple context-free grammars (PMCFG) with Nederhof and Satta’s IDL expressions. We define the semantics of IDL-PMCFGs and study their expressivity, proving that IDL-PMCFG extends both PMCFG and IDL-CFG (context-free grammars equipped with IDL expressions) and that IDL-PMCFG parsing is NP-hard. We then introduce COMPĀ, a programming language extending Ranta’s Grammatical Framework (GF) and built as a high-level front-end formalism to IDL-PMCFG for practical grammar development. We present a parsing algorithm for IDL-PMCFG inspired by earlier Earley-style PMCFG parsing algorithms and Nederhof and Satta’s IDL graphs and give a worst-case estimate of its complexity as a function of several metrics on IDL expressions, the size of the input and a new notion of the G-density of a language.
@article{hublet2022idl-pmcfg,
  title = {IDL-PMCFG, a Grammar Formalism for Describing Free Word Order Languages},
  author = {François Hublet},
  year = {2022},
  journal = {Journal of Logic, Language and Information},
  doi = {10.1007/s10849-022-09363-0}
}

DOI: 10.1007/s10849-022-09363-0

Conference Security

Real-time Policy Enforcement with Metric First-Order Temporal Logic

François Hublet, David Basin, & Srđan Krstić
European Symposium on Research in Computer Security (ESORICS), 2022

Correctness and regulatory compliance of today’s software systems are crucial for our safety and security. This can be achieved with policy enforcement: the process of correcting system behavior to satisfy a given policy. The enforcer’s capabilities determine which policies are enforceable.

We study the enforceability of policies specified in metric first-order temporal logic (MFOTL) with enforcers that can cause and suppress different system actions in real time. We show that a formula from an expressive safety fragment of MFOTL is enforceable if and only if it is equivalent to a formula in a simpler, syntactically defined MFOTL fragment. We propose an enforcement algorithm for all monitorable formulae (i.e., formulae whose violations can be detected by manipulating finite sets of satisfying valuations) from the latter fragment, and show that our EnfPoly enforcer tool outperforms state-of-the-art enforcers.

@inproceedings{hublet2022real-time,
  title = {Real-time Policy Enforcement with Metric First-Order Temporal Logic},
  author = {François Hublet and David Basin and Srđan Krstić},
  year = {2022},
  booktitle = {European Symposium on Research in Computer Security},
  doi = {10.1007/978-3-031-17146-8_11}
}

DOI: 10.1007/978-3-031-17146-8_11 Extended version Slides

Security Thesis

The Databank Model

François Hublet
2021, Master's thesis, ETH Zürich

In this thesis, we design and implement the 'Databank Model', a new privacy-preserving web architecture for database-backed applications. The Databank Model aims at making the web more user-centric and safe by separating data storage from data processing functions. In this model, data storage and data policy enforcement are delegated to a trusted third party called the Databank, which serves as a proxy between users and applications. Application developers deploy the parts of their code that interact with user data directly to the Databank. This allows them to provide their service without retrieving user data. The Databank monitors code executed against its database and prevents violations of its users’ policies. The overall infrastructure provides strong formal guarantees to users that their policies will be correctly enforced.

Through a novel combination of ideas from both information-flow monitoring and runtime verification, we design a realistic Python-like programming language called Dmol, tailored for the development of database-backed web applications. The Dmol' language features both static and dynamic information-flow propagation and uses an external monitoring backend to detect violations of users' policies, specified in a fragment of Metric First-Order Temporal Logic (MFOTL), at runtime. Noninterference properties are proved for this language and user policies are shown to be correctly enforced in the resulting execution model. We implement a prototype of the Databank infrastructure in Python and OCaml with Dmol' as a Databank-side programming language and assess the practicality of our approach in a case study.

@mastersthesis{hublet2021the,
  title = {The Databank Model},
  author = {François Hublet},
  year = {2021},
  school = {ETH Zürich},
  doi = {10.3929/ethz-b-000477329}
}

DOI: 10.3929/ethz-b-000477329

Graded 6.0 (best mark). ETH medal 2022 for outstanding Master's thesis.