Lucas Magnana

Deep learning research engineer at Inria

I completed my PhD at INSA Lyon in February 2024. Since then, I have been working as a research engineer at Inria, focusing on large language models. My team investigates the security of neural networks, in particular the identification of model vulnerabilities, the design and evaluation of attacks, and the analysis of sensitive data leakage risks.

Skills & Tools

I started coding in middle school and have kept refining my skills throughout my academic and professional journey. I can now develop in Python, Java, and C++. I chose to specialize in artificial intelligence during the last year of my master's program, and I have been exploring and applying the many dimensions of this exciting research domain ever since.

  • NLP / LLMs 90%
  • Deep Learning 90%
  • PyTorch 85%
  • Python 95%
  • Data Visualization 75%
  • Machine Learning 65%

Career keys

Research Engineer

Inria

June 2024 - Now

Keywords: NLP, Large Language Models, Privacy, Anonymization, Model Auditing, Machine Unlearning, Membership Inference Attack, Memorization, Extraction, Identifying Words

PhD in Computer Science

INSA Lyon

October 2020 - February 2024

Keywords: Smart Cities, Human mobility, Decision making, Clustering, Learning, Recurrent Neural Networks, Deep Reinforcement Learning, Traffic light, Cyclists, Waiting time, Vehicle counts

Master of Computer Science

University of Lyon

2018 - 2020

Keywords: Artificial Intelligence, Real time, Bio-Inspired, Intelligent Tutoring System, Combinatorial Optimization

IUT and Bachelor of Computer Science

University of Lyon

2015 - 2018

Keywords: Web Programming, Java, C++, Network, System, SQL, Linux

Detailed Resume

Research Engineer

Involvement in the PANAME and NoLeFa projects

SEPTEMBER 2025 - TODAY

Inria (CITI Laboratory, PRIVATICS team), Villeurbanne

  • Tools used: Python, PyTorch, HuggingFace, Transformers
  • In parallel with my research activities, I contribute to the French PANAME project.
  • Established through a collaboration between Inria, CNIL, PEReN, and ANSSI, PANAME (Privacy Auditing of AI Models) aims to develop a standardized library for auditing model privacy. The objective is to enable organizations that train models on sensitive data to easily run attacks and evaluate the associated privacy risks.
  • Together with another research engineer, I am responsible for developing the LLM module of the library. This includes designing model wrappers, implementing privacy attacks, and producing example scripts as well as unit and functional tests.

  • I am also involved in Work Package 2 (WP2) of the European project NoLeFa. WP2 focuses on developing an AI testing suite to support compliance with the AI Act, including assessments of robustness, fairness, and training data quality.
  • The project is currently at an early stage, and I am working on an initial use case involving a re-identification attack on an LLM trained on medical data.

Research Engineer

JUNE 2024 - TODAY

Inria (CITI Laboratory, PRIVATICS team), Villeurbanne

  • Tools used: Python, PyTorch, HuggingFace, Transformers, Opacus, Machine Unlearning
  • My work investigates sensitive data leakage and adversarial vulnerabilities in LLMs, with a particular focus on medical use cases.
  • We first developed a Privacy-Preserving Language Modeling (PPLM) training strategy designed to prevent models from learning both direct and indirect identifiers. For masked language models (MLMs), we prohibit the masking of identifiers during finetuning. For causal language models (CLMs), identifiers are replaced with padding tokens that are excluded from loss computation.
  • Compared to both protected and unprotected baselines (models trained on anonymized data and models trained using differential privacy), our approach achieves the best utility-privacy trade-off.
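The CLM variant of this strategy can be sketched with PyTorch's standard ignore-index convention: positions labeled `-100` are skipped by the cross-entropy loss, so the model never learns to predict identifier tokens. This is an illustration under that assumption, not the exact project code (`mask_identifier_labels` and the toy token ids are hypothetical).

```python
import torch

IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def mask_identifier_labels(input_ids, identifier_mask):
    """Build causal-LM labels where identifier tokens do not contribute to the loss.

    input_ids: (seq_len,) token ids
    identifier_mask: (seq_len,) bool, True where the token belongs to an identifier
    """
    labels = input_ids.clone()
    labels[identifier_mask] = IGNORE_INDEX
    return labels

# toy example: positions 1 and 2 hold an identifier (e.g. a patient name)
input_ids = torch.tensor([101, 4025, 7632, 2003, 102])
identifier_mask = torch.tensor([False, True, True, False, False])
labels = mask_identifier_labels(input_ids, identifier_mask)
# a loss such as torch.nn.CrossEntropyLoss(ignore_index=-100) now skips those positions
```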

  • Building on this, we explored several unlearning techniques, integrating PPLM into the reconstruction phase to improve the safety of models originally finetuned without privacy safeguards. Remarkably, retraining for only 10-20% of the original finetuning duration was sufficient to reach a utility-privacy trade-off comparable to fully protected finetuning.

  • I also contributed to a study examining the relationship between privacy risks and model language.
  • I was responsible for implementing all experimental risk assessments, including data extraction attacks, counterfactual memorization evaluations, and membership inference attacks.
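The simplest of these assessments can be illustrated with a loss-thresholding membership inference baseline: samples the model saw during finetuning tend to have lower loss, so a sample whose loss falls below a threshold is predicted to be a member. This is a minimal sketch of that baseline idea only, not the project's actual attack; the losses and threshold are made up for illustration.

```python
def loss_threshold_mia(per_sample_losses, threshold):
    """Loss-thresholding membership inference: a sample with loss below
    the threshold is predicted to be a training-set member."""
    return [loss < threshold for loss in per_sample_losses]

# toy illustration: members (seen during finetuning) tend to have lower loss
member_losses = [0.8, 1.1, 0.9]
nonmember_losses = [2.4, 3.0, 2.7]
preds = loss_threshold_mia(member_losses + nonmember_losses, threshold=1.5)
```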

Teaching

SEPTEMBER 2025 - TODAY

INSA Lyon

  • I currently teach the basics of algorithmics to first-year students at INSA Lyon.

PhD Student

PhD Project

2022 - 2023

Inria (CITI Laboratory, Agora team), Villeurbanne

  • Tools used: Python, PyTorch, SUMO, Deep Reinforcement Learning (3DQN, PPO)
  • I proposed a new way of making a road segment attractive to cyclists by exploiting the dynamic properties of traffic lights.
  • Using SUMO (Simulation of Urban MObility) and real vehicle counter data, I simulated a traffic light with green phases added specifically for cyclists. This traffic light allows cyclists to cross the intersection safely by separating their flow from motorized traffic.
  • Added naively, these cyclist green phases drastically increase the waiting time of all vehicles at the intersection.
  • I trained deep reinforcement learning agents to control this traffic light while minimizing vehicle waiting time. The agent trained with a Dueling Double Deep Q-Network (3DQN) outperforms the other traffic light control methods tested.
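The dueling part of 3DQN can be sketched as a Q-network with separate value and advantage heads, combined as Q(s,a) = V(s) + A(s,a) - mean_a A(s,a); the "double" part concerns how targets are computed during training and is not shown here. A minimal PyTorch sketch, with illustrative dimensions (the class name and sizes are my own, not the project's code):

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Dueling architecture: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""

    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # action advantages A(s,a)

    def forward(self, state):
        h = self.backbone(state)
        v = self.value(h)
        a = self.advantage(h)
        # subtracting the mean advantage keeps V and A identifiable
        return v + a - a.mean(dim=-1, keepdim=True)

# e.g. an 8-dimensional intersection state and 4 possible phase actions
q_net = DuelingQNetwork(state_dim=8, n_actions=4)
q_values = q_net(torch.zeros(1, 8))
```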

Teaching

SEPTEMBER 2021 - JANUARY 2022

INSA Lyon

  • For one semester, I taught the basics of algorithmics to first-year students at INSA Lyon.

PhD Project

2020 - 2022

Inria (CITI Laboratory, Agora team), Villeurbanne

  • Tools used: Python, PyTorch, Sklearn, Jupyter Notebook, Recurrent Neural Networks (LSTM), Clustering algorithms (DBSCAN, K-means)
  • I developed a method to create implicit route choice models for cyclists from GPS tracks. Given an origin and a destination, the models can generate a route that approximates cyclists’ behavior.
  • The first part of this work was to find datasets of GPS tracks and to analyse them both quantitatively and qualitatively.
  • A clustering algorithm is then applied to the GPS tracks in order to find cyclists' preferred road segments.
  • An LSTM network is trained to identify the relevant preferred road segments along a shortest path between any origin/destination pair.
  • The relevant preferred road segments are then used to weight a road graph which is used to generate routes approximating cyclists’ behavior.
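The final weighting step above can be sketched as follows: discounting the cost of preferred segments pulls shortest-path routing towards them. This is a toy illustration with a made-up graph and discount factor (`0.5`), not the thesis code, using a plain Dijkstra over an adjacency dict.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra on an adjacency dict {node: {neighbor: weight}}."""
    queue = [(0.0, source, [source])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nb, w in graph.get(node, {}).items():
            if nb not in seen:
                heapq.heappush(queue, (cost + w, nb, path + [nb]))
    return None

# toy road graph with edge lengths in meters
lengths = {"A": {"B": 100, "C": 150}, "B": {"A": 100, "C": 100}, "C": {"A": 150, "B": 100}}
preferred = {("A", "B"), ("B", "C")}  # segments flagged as cyclist-preferred

# discount preferred segments so routing is drawn to them
graph = {
    u: {v: w * (0.5 if (u, v) in preferred or (v, u) in preferred else 1.0)
        for v, w in nbrs.items()}
    for u, nbrs in lengths.items()
}
route = shortest_path(graph, "A", "C")  # takes the preferred detour via B
```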

My code

DRL algorithms

Motivated by a strong interest in Deep Reinforcement Learning, I implemented a selection of well-known DRL algorithms on my own time to better understand their theoretical foundations and practical behavior.

AI4BioMed

I participated in the 2025 AI4BioMed Spring School, where I implemented a protein contact prediction model that infers contact matrices directly from protein sequences. The approach relies on ESM to generate contextualized amino acid embeddings used as input features.

PrAACT method for Pictalk

I implemented an LLM-based pictogram prediction system to assist friends in building an application aimed at improving communication accessibility for non-verbal users.

Papers code

Code for the papers:

  • "Anonymization by design of language modeling"
  • "Leverage Unlearning to Sanitize LLMs"
  • "The Model's Language Matters: A Comparative Privacy Analysis of LLMs"

Paper code

Code for the paper "GPS-based bicycle route choice model using clustering methods and a LSTM network".

Paper code

Code for the paper "A DRL solution to help reduce the cost in waiting time of securing a traffic light for cyclists".

Publications