Clinical AI & biomedical signal validation

Clinically validated machine learning (ML) on biomedical signals: peripheral arterial clot trials, lung biopsy tissue classification, and multimodal psychiatric biomarkers from clinical trial cohorts.

Peripheral arterial clot characterisation

SEPARATE and E-SEPARATE evaluated impedance-based clot characterisation with the Clotild® smart guidewire (Sensome). The bioimpedance measurement targets clot tissue rich in red blood cells.

Clinical abstract

In vivo identification of clot rich in red blood cells in peripheral arterial disease (SEPARATE study)

In vivo clinical validation (n = 17); 100% primary endpoint success for lesion impedance data. Clot identification in peripheral arterial disease (tissue rich in red blood cells).

Also presented at Paris Vascular Insights (PVI) 2024, Paris

JET OPEN the world 2025, Osaka

2025

Clinical · in vivo

First author

Clinical abstract

Ex-vivo thrombus analysis in peripheral arterial disease versus histology (E-SEPARATE study)

Ex vivo machine learning (ML) versus histology gold standard (n = 15); coefficient of determination R² = 0.79 in peripheral arterial disease.
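
For readers unfamiliar with the metric, the coefficient of determination can be sketched as below. This is a minimal illustration only, not the study's analysis code: the function name `r_squared`, the variable names, and the numbers are hypothetical, and the assumption that the compared quantity is a per-sample tissue fraction is ours.

```python
def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_ref = sum(reference) / len(reference)
    # Residual sum of squares: model error against the reference values.
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only, not study data.
histology_fraction = [0.10, 0.35, 0.50, 0.80]
model_fraction = [0.15, 0.30, 0.55, 0.75]
score = r_squared(histology_fraction, model_fraction)
print(f"R^2 = {score:.2f}")
```

A value of 1 means the predictions reproduce the reference exactly; a value near 0 means they explain no more variance than the reference mean would.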

Also presented at Paris Vascular Insights (PVI) 2024, Paris

JET OPEN the world 2025, Osaka

2025

Preclinical · ex vivo

First author

Lung tissue classification during bronchoscopic biopsy

INSPECT used bioimpedance on a bronchoscopy stylet for tool-in-lesion confirmation during biopsy of central and peripheral lung lesions (Sensome).

Clinical abstract

In situ lung tissue characterisation during bronchoscopic biopsy (INSPECT study)

Machine learning (ML) tissue classification during bronchoscopic lung biopsy; 26 patients across Australia and France.

American Thoracic Society (ATS) 2026 International Conference

2026

Clinical · first in human

Conference abstract

Third author · ML analysis lead

Psychiatric treatment-response prediction (multimodal biomarkers)

Multimodal model predicting treatment response across four psychiatric conditions in pooled clinical trial cohorts. Christos led biosignal and imaging feature extraction (EEG, ECG, galvanic skin response).

Conference poster

Transprognostic treatment-response prediction across depression, ADHD, OCD, and PTSD

Major depressive disorder, ADHD, OCD, and PTSD; external validation on unseen clinical cohorts (TRIPOD Type 4); ranked first in the TDBRAIN international competition.

6th Neuropsychiatric Drug Development Summit, Boston

2022

Clinical · external validation (unseen cohort)

Third author · biosignal and imaging feature extraction lead

Cognitive neuroscience & language processing

MEG and EEG during sentence reading, contrasted with a recurrent language model (*Cortex*), and grammatical agreement in humans versus language models (EMNLP 2023).

Human language processing and computational language models (*Cortex*)

PhD work at NeuroSpin / Sorbonne University: how humans and language models process sentence structure. MEG/EEG during reading versus a two-layer long short-term memory (LSTM) language model.

Peer-reviewed journal

Disentangling Hierarchical and Sequential Computations during Sentence Processing

n = 22; combined MEG and EEG during sentence reading. Only hierarchical structure was decodable from brain signals; transition and congruity stayed at chance. In a two-layer LSTM run on the same sentences, all three effects were decodable.

Decoding plots comparing human MEG data and an LSTM language model for grammatical number and animacy during sentence reading
Human MEG/EEG (left): structural effect only; transition and congruity at chance. LSTM (right): structural, transition, and congruity effects decodable. Zacharopoulos, Dehaene, Lakretz, Cortex 2026.

Cortex (Elsevier)

2026

First author · corresponding author

Psycholinguistics and computational modelling

Human grammatical agreement versus language models. First-author EMNLP 2023 main-track paper with Meta AI and NeuroSpin co-authors.

Conference paper

Assessing the influence of attractor-verb distance on grammatical agreement in humans and language models

Rapid serial visual presentation agreement task (n = 34): humans and language models err more with proximal attractors; linear response-time effect of distance; GPT-Neo-1.3B and grammar-corrected T5 compared to humans.

Error rate and response time for humans, GPT-3, and T5 across baseline, distal, and proximal attractor conditions
Human and language-model error rate and response time by attractor distance and grammaticality (from paper; fig. 2).

EMNLP 2023 (Empirical Methods in Natural Language Processing), main track

2023

First author

Language-model evaluation & representation

Computational studies of large language model (LLM) behaviour and internal representations: personality-trait expression and semantic-violation detection in causal LMs.

LLM evaluation and representation analysis

Personality-trait probing and layer-wise semantic-violation decoding in causal language models.

Conference paper

Decoding Emergent Big Five Traits in Large Language Models: Temperature-Dependent Expression and Architectural Clustering

Six LLMs, BFI-2, temperature 0–2: four traits differ across models; Neuroticism and Extraversion track temperature (R² = 0.35 / 0.25).

Effects of sampling temperature on Big Five personality trait scores across six large language models
Temperature effects on trait expression (from paper). Neuroticism and Extraversion are most sensitive to sampling temperature.

IJCNLP 2025 (International Joint Conference on Natural Language Processing)

2025

First author · corresponding author

Conference paper

In Machina N400: Pinpointing Where a Causal Language Model Detects Semantic Violations

Phi-2, 1520 sentence pairs: per-layer AUC shows semantic violations decoded in layers 18–30 (cluster p < 0.001); early layers at chance; participation ratio expansion then collapse.
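
As a rough illustration of the per-layer metric, ROC-AUC can be computed as the probability that a randomly chosen violation sentence receives a higher decoder score than a randomly chosen control sentence. The sketch below assumes decoder scores are already available for each layer; the function `roc_auc`, the layer indices, and the scores are hypothetical and are not taken from the paper.

```python
def roc_auc(scores_pos, scores_neg):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair, so identical scores give 0.5 (chance).
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical decoder scores per layer: (violation scores, control scores).
scores_by_layer = {
    2: ([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),   # uninformative layer
    20: ([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]),  # fully separable layer
}
auc_by_layer = {layer: roc_auc(pos, neg) for layer, (pos, neg) in scores_by_layer.items()}
print(auc_by_layer)
```

The pairwise form is quadratic in the number of sentences, which is acceptable for a sketch; a rank-based formulation would be preferable at scale.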

Layer-wise ROC-AUC for decoding plausible versus implausible sentence endings in Phi-2
Mean ROC-AUC by layer; grey band marks layers 18–30 above chance after cluster permutation (p < 0.001).
Participation ratio across Phi-2 layers for violation versus control sentences
Participation ratio by layer: early expansion for violations, mid-stack convergence, later compression (from paper).
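
The participation ratio is commonly defined from the eigenvalues λ of the activation covariance matrix as (Σλ)² / Σλ², giving an effective number of dimensions. The sketch below uses that common definition with NumPy; whether the paper uses exactly this estimator is an assumption, and the function name and toy arrays are hypothetical.

```python
import numpy as np


def participation_ratio(activations):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squared eigenvalues.

    `activations` has shape (n_samples, n_features), e.g. one row per sentence.
    """
    x = np.asarray(activations, dtype=float)
    x = x - x.mean(axis=0, keepdims=True)          # centre each feature
    cov = x.T @ x / (x.shape[0] - 1)               # sample covariance matrix
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # drop tiny negatives
    return float(eigvals.sum() ** 2 / np.sum(eigvals ** 2))


# Toy data: variance spread evenly over two axes versus collapsed onto one.
spread = [[1, 0], [-1, 0], [0, 1], [0, -1]]
collapsed = [[1, 1], [-1, -1], [2, 2], [-2, -2]]
pr_spread = participation_ratio(spread)
pr_collapsed = participation_ratio(collapsed)
print(pr_spread, pr_collapsed)
```

Higher values indicate variance distributed over more directions (expansion); values near 1 indicate that a single direction dominates (compression).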

Springer CCIS / AICS 2025

2025

First author · corresponding author

Earlier work

Earlier contributions before the clinical-AI and NeuroSpin research lines.

Valence and arousal ratings for Hellenic words

Cross-sectional psychometrics (Aristotle University of Thessaloniki): valence and arousal norms across the adult lifespan.

Conference abstract

Valence and arousal ratings for Hellenic words by young, middle-aged, and older adults

Cross-sectional study (n = 84): older adults rated Hellenic words more positively and with higher arousal than younger groups; age-by-valence interactions across pleasant, neutral, and unpleasant word sets.

SAN2016 Meeting, Corfu · Frontiers in Human Neuroscience (conference abstract)

2016

Third author