Publications

2026

Hand gesture realisation of contrastive focus in real-time whisper-to-speech synthesis: Investigating the transfer from implicit to explicit control of intonation

Delphine Charuau, Nathalie Henrich Bernardoni, Silvain Gerber, Olivier Perrotin

Speech Communication , vol. 177 , pp. 103344

From Hype to Insight: Rethinking Large Language Model Integration in Visual Speech Recognition

Rishabh Jain, Naomi Harte

ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 18717–18721

The Role of Prosodic and Lexical Cues in Turn-Taking with Self-Supervised Speech Representations

Sam O'Connor Russell, Delphine Charuau, Naomi Harte

ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 16307–16311

2025

Impact of a Sharpness Based Loss Function for Removing Out-of-Focus Blur

Uditangshu Aurangabadkar, Darren Ramsook, Anil Kokaram

2025 33rd European Signal Processing Conference (EUSIPCO) , pp. 601–605

Hot topics in speech synthesis evaluation

Gérard Bailly, Elisabeth André, Erica Cooper, Esther Klabbers, Benjamin Cowan et al.

13th edition of the Speech Synthesis Workshop , pp. 1–7

Multi Task Denoiser Training for Solving Linear Inverse Problems

Clément Bled, François Pitié

Proceedings of the 22nd ACM SIGGRAPH European Conference on Visual Media Production , pp. 1–9

Multimodal Dynamics of Hand Gestures and Pauses in Multiparty Interactions

Delphine Charuau, Naomi Harte

Interspeech 2025 , pp. 3030–3034

Efficient motion-based metrics for video frame interpolation

Conall Daly, Darren Ramsook, Anil Kokaram

Applications of Digital Image Processing XLVIII , pp. 46

An Efficient Quality Metric for Video Frame Interpolation Based on Motion-Field Divergence

Conall Daly, Darren Ramsook, Anil Kokaram

2025 17th International Conference on Quality of Multimedia Experience (QoMEX) , pp. 1–7

Enabling the replicability of speech synthesis perceptual evaluations

Sébastien Le Maguer, Gwénolé Lecorvé, Damien Lolive, Naomi Harte, Juraj Šimko

Interspeech 2025 , pp. 2545–2549

Uncovering the Visual Contribution in Audio-Visual Speech Recognition

Zhaofeng Lin, Naomi Harte

ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 1–5

Interpreting the Role of Visemes in Audio-Visual Speech Recognition

Aristeidis Papadopoulos, Naomi Harte

2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , pp. 1–8

Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction

Sam O'Connor Russell, Naomi Harte

Findings of the Association for Computational Linguistics: ACL 2025 , pp. 209–221

Visual Cues Support Robust Turn-taking Prediction in Noise

Sam O'Connor Russell, Naomi Harte

Interspeech 2025 , pp. 1073–1077

An Empirical Study of Reducing AV1 Decoder Complexity and Energy Consumption via Encoder Parameter Tuning

Vibhoothi Vibhoothi, Julien Zouein, Shanker Shreejith, Jean-Baptiste Kempf, Anil Kokaram

2025 Picture Coding Symposium (PCS) , pp. 1–5

LiteVPNet: A Lightweight Network for Video Encoding Control in Quality-Critical Applications

Vibhoothi Vibhoothi, François Pitié, Anil Kokaram

2025 Picture Coding Symposium (PCS) , pp. 1–5

AV1 Motion Vector Fidelity and Application for Efficient Optical Flow

Julien Zouein, Vibhoothi Vibhoothi, Anil Kokaram

2025 Picture Coding Symposium (PCS) , pp. 1–5

2024

A Dictionary Based Approach for Removing Out-of-Focus Blur

Uditangshu Aurangabadkar, Anil Kokaram

2024 IEEE International Conference on Image Processing (ICIP) , pp. 1494–1499

A Sharpness Based Loss Function for Removing Out-of-Focus Blur

Uditangshu Aurangabadkar, Darren Ramsook, Anil Kokaram

2024 IEEE 26th International Workshop on Multimedia Signal Processing (MMSP) , pp. 1–6

Lightweight Video Denoising Using a Classic Bayesian Backbone

Clément Bled, François Pitié

2024 IEEE International Conference on Multimedia and Expo (ICME) , pp. 1–6

Training speech-breathing coordination in computer-assisted reading

Delphine Charuau, Andrea Briglia, Erika Godde, Gérard Bailly

Interspeech 2024 , pp. 5128–5132

Joint Speech-Text Embeddings for Multitask Speech Processing

Michael Gian Gonzales, Peter Corcoran, Naomi Harte, Michael Schukat

IEEE Access , vol. 12 , pp. 145955–145967

Demystifying the use of Compression in Virtual Production

Anil Kokaram, Vibhoothi Vibhoothi, Julien Zouein, François Pitié, Christopher Nash, James Bentley et al.

SMPTE Media Technology Summit 2024

The limits of the Mean Opinion Score for speech synthesis evaluation

Sébastien Le Maguer, Simon King, Naomi Harte

Computer Speech & Language , vol. 84 , pp. 101577

A Neural Enhancement Post-Processor with a Dynamic AV1 Encoder Configuration Strategy for CLIC 2024

Darren Ramsook, Anil Kokaram

2024 Data Compression Conference (DCC) , pp. 372–381

Comparative Analysis of Subjective Evaluations for Traditional and Neural-Based Video Enhancement Techniques

Darren Ramsook, Vibhoothi, Anil Kokaram, Angeliki Katsenou, David Bull

2024 16th International Conference on Quality of Multimedia Experience (QoMEX) , pp. 242–245

What automatic speech recognition can and cannot do for conversational speech transcription

Sam O'Connor Russell, Iona Gessinger, Anna Krason, Gabriella Vigliocco, Naomi Harte

Research Methods in Applied Linguistics , vol. 3 , no. 3 , pp. 100163

Predicting total time to compress a video corpus using online inference systems

Xin Shu, Vibhoothi Vibhoothi, Anil Kokaram

2024 IEEE International Conference on Visual Communications and Image Processing (VCIP) , pp. 1–5

Using Single-Pass Look-Ahead in Modern Codecs for Optimized Transcoding Deployment

Vibhoothi Vibhoothi, Julien Zouein, François Pitié, Anil Kokaram

SMPTE Motion Imaging Journal , vol. 133 , no. 6

Unravelling the Power of Single-Pass Look-Ahead in Modern Codecs for Optimized Transcoding Deployment

Vibhoothi Vibhoothi, Julien Zouein, François Pitié, Anil Kokaram

NAB Broadcast Engineering and Information Technology (BEIT) Conference

2023

Learnable Frontends That Do Not Learn: Quantifying Sensitivity To Filterbank Initialisation

Mark Anderson, Tomi Kinnunen, Naomi Harte

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 1–5

Pushing the Limits of the Wiener Filter in Image Denoising

Clément Bled, François Pitié

2023 IEEE International Conference on Image Processing (ICIP) , pp. 2590–2594

Fine Grained Spoken Document Summarization Through Text Segmentation

Samantha Kotey, Rozenn Dahyot, Naomi Harte

2022 IEEE Spoken Language Technology Workshop (SLT) , pp. 647–654

Learnt Deep Hyperparameter Selection in Adversarial Training for Compressed Video Enhancement with a Perceptual Critic

Darren Ramsook, Anil Kokaram

2023 IEEE International Conference on Image Processing (ICIP) , pp. 2420–2424

The disparity between optimal and practical Lagrangian multiplier estimation in video encoders

Daniel Joseph Ringis, Vibhoothi Vibhoothi, François Pitié, Anil Kokaram

Frontiers in Signal Processing , vol. 3

Comparison of HDR quality metrics in Per-Clip Lagrangian multiplier optimisation with AV1

Vibhoothi Vibhoothi, François Pitié, Angeliki Katsenou, Yeping Su, Balu Adsumilli, Anil Kokaram

2023 IEEE International Conference on Multimedia and Expo (ICME) , pp. 1655–1660

Filling the gaps in video transcoder deployment in the cloud

Vibhoothi, Daniel Joseph Ringis, Xin Shu, François Pitié, Zsolt Lorincz et al.

NAB Broadcast Engineering and Information Technology (BEIT) Conference

Recommendations for Verifying HDR Subjective Testing Workflows

Vibhoothi Vibhoothi, Angeliki Katsenou, John Squires, François Pitié, Anil Kokaram

2023 15th International Conference on Quality of Multimedia Experience (QoMEX) , pp. 197–200

Subjective Assessment of the Impact of a Content Adaptive Optimiser for Compressing 4K HDR Content With AV1

Vibhoothi Vibhoothi, Angeliki Katsenou, François Pitié, Katarina Domijan, Anil Kokaram

2023 IEEE International Conference on Image Processing (ICIP) , pp. 2610–2614

2022

An empirical approach for estimating the effect of a transcoding aware preprocessor

Varoun Hanooman, Yeping Su, Neil Birkbeck, Balu Adsumilli, Anil Kokaram

Applications of Digital Image Processing XLV , vol. 12226 , pp. 1222613

Learnable Acoustic Frontends in Bird Activity Detection

Mark Anderson, Naomi Harte

2022 International Workshop on Acoustic Signal Enhancement (IWAENC) , pp. 1–5

Assessing Advances in Real Noise Image Denoisers

Clément Bled, François Pitié

Proceedings of the 19th ACM SIGGRAPH European Conference on Visual Media Production

An Empirical Approach for Optimising the Impact of a Preprocessor in a Transcoding Pipeline

Varoun Hanooman, Anil C. Kokaram, Yeping Su, Neil Birkbeck, Balu Adsumilli

2022 IEEE International Conference on Image Processing (ICIP) , pp. 2201–2205

Robo-Identity: Exploring Artificial Identity and Emotion via Speech Interactions

Guy Laban, Sébastien Le Maguer, Minha Lee, Dimosthenis Kontogiorgos, Samantha Reig et al.

Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction , pp. 1265–1268

Back to the Future: Extending the Blizzard Challenge 2013

Sébastien Le Maguer, Simon King, Naomi Harte

Proceedings of Interspeech , pp. 2378–2382

Production characteristics of obstruents in WaveNet and older TTS systems

Ayushi Pandey, Sébastien Le Maguer, Julie Carson-Berndsen, Naomi Harte

Proceedings of Interspeech , pp. 2373–2377

A Deep Learning post-processor with a perceptual loss function for video compression artifact removal

D. Ramsook, A. Kokaram, N. Birkbeck, Y. Su, B. Adsumilli

2022 Picture Coding Symposium (PCS) , pp. 85–89

Perceptually motivated deep neural network for video compression artifact removal

Darren Ramsook, Anil Kokaram, Neil Birkbeck, Yeping Su, Balu Adsumilli

Applications of Digital Image Processing XLV , vol. 12226 , pp. 122260H

Automating sports broadcasting using ultra-high definition cameras, neural networks, and classical denoising

Sophia Rosney, Ciarán Donegan, Meegan Gower, Wissam Jassim, Hugh Denman, Donal Scannell et al.

Applications of Digital Image Processing XLV , vol. 12226 , pp. 122260Y

Direct optimisation of λ for HDR content adaptive transcoding in AV1

Vibhoothi, François Pitié, Angeliki Katsenou, Daniel Joseph Ringis, Yeping Su et al.

Applications of Digital Image Processing XLV , vol. 12226 , pp. 36–45

2021

Bioacoustic Event Detection with prototypical networks and data augmentation

Mark Anderson, Naomi Harte

arXiv

Low Resource Species Agnostic Bird Activity Detection

Mark Anderson, John Kennedy, Naomi Harte

2021 IEEE Workshop on Signal Processing Systems (SiPS) , pp. 34–39

An articulatory study of differences and similarities between stuttered disfluencies and non-pathological disfluencies

Ivana Didirková, Sébastien Le Maguer, Fabrice Hirsch

Clinical Linguistics & Phonetics , vol. 35 , no. 3 , pp. 201–221

Phonetic accommodation in interaction with a virtual language learning tutor: A Wizard-of-Oz study

Iona Gessinger, Bernd Möbius, Sébastien Le Maguer, Eran Raveh, Ingmar Steiner

Journal of Phonetics , vol. 86 , pp. 101029

Synthesizing a Human-like Voice is the Easy Way

Sébastien Le Maguer, Benjamin R. Cowan

CUI 2021 - 3rd Conference on Conversational User Interfaces

Will synthetic speech provide a suitable voice for robots?

Sébastien Le Maguer

Robo-Identity: Artificial identity and multi-embodiment

Mind your p's and k's – Comparing obstruents across TTS voices of the Blizzard Challenge 2013

Ayushi Pandey, Sébastien Le Maguer, Julie Carson-Berndsen, Naomi Harte

Speech Synthesis Workshop (SSW) , pp. 166–171

CNN-Based Video Codec Classifier For Multimedia Forensics

Rodrigo Pessoa, Anil Kokaram, François Pitié, Mark Sugrue

2021 IEEE International Conference on Image Processing (ICIP) , pp. 3033–3037

A differentiable estimator of VMAF for Video

Darren Ramsook, Anil Kokaram, Noel O'Connor, Neil Birkbeck, Yeping Su, Balu Adsumilli

2021 Picture Coding Symposium (PCS) , pp. 1–5

A differentiable VMAF proxy as a loss function for video noise reduction

Darren Ramsook, Anil Kokaram, Noel O'Connor, Neil Birkbeck, Yeping Su, Balu Adsumilli

Applications of Digital Image Processing XLIV , vol. 11842 , pp. 118420X

Near optimal per-clip Lagrangian multiplier prediction in HEVC

Daniel J Ringis, François Pitié, Anil Kokaram

2021 Picture Coding Symposium (PCS) , pp. 1–5

Per-clip and per-bitrate adaptation of the Lagrangian multiplier in video coding

Daniel J Ringis, François Pitié, Anil Kokaram

Applications of Digital Image Processing XLIV , vol. 11842 , pp. 118420O

Liaison and Pronunciation Learning in End-to-End Text-to-Speech in French

Jason Taylor, Sébastien Le Maguer, Korin Richmond

Speech Synthesis Workshop (SSW) , pp. 195–199

2020

FlexEval, création de sites web légers pour des campagnes de tests perceptifs multimédias

Cédric Fayet, Alexis Blond, Grégoire Coulombel, Claude Simon, Damien Lolive et al.

Proceedings of JEP, TALN and RECITAL , pp. 22–25

A Bayesian View of Frame Interpolation and a Comparison with Existing Motion Picture Effects Tools

Anil Kokaram, Davinder Singh, Simon Robinson

2020 IEEE International Conference on Image Processing (ICIP) , pp. 553–557

Can Auditory Nerve models tell us what's different about WaveNet vocoded speech?

Sébastien Le Maguer, Naomi Harte

Conference of the International Speech Communication Association (Interspeech)

Investigation of Auditory Nerve Model Based Analysis for Vocoded Speech Synthesis

Sébastien Le Maguer, Naomi Harte

International Conference on Quality of Multimedia Experience (QoMEX) , pp. 1–6

Per-clip adaptive Lagrangian multiplier optimisation with low-resolution proxies

Daniel J Ringis, François Pitié, Anil Kokaram

Applications of Digital Image Processing XLIII , vol. 11510 , pp. 115100E

Per Clip Lagrangian Multiplier Optimisation for HEVC

Daniel J Ringis, François Pitié, Anil Kokaram

Electronic Imaging , vol. 2020 , no. 10 , pp. 136-1

Introducing Prosodic Speaker Identity for a Better Expressive Speech Synthesis Control

Aghilas Sini, Sébastien Le Maguer, Damien Lolive, Elisabeth Delais-Roussarie

Speech Prosody , pp. 935–939

Should robots have accents?

Ilaria Torre, Sébastien Le Maguer

International Conference on Robot & Human Interactive Communication (RO-MAN) , pp. 208–214

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch et al.

Computer Speech and Language , vol. 64 , pp. 101–114

2019

A Low-Complexity Mosaicing Algorithm for Stock Assessment of Seabed-Burrowing Species

David Corrigan, Ken Sooknanan, Jennifer Doyle, Colm Lordan, Anil Kokaram

IEEE Journal of Oceanic Engineering , vol. 44 , no. 2 , pp. 386–400

An Advert Creation System for Next-Gen Publicity

Atul Nautiyal, Killian McCabe, Murhaf Hossari, Soumyabrata Dev, Matthew Nicholson, Clare Conran et al.

Machine Learning and Knowledge Discovery in Databases , pp. 663–667

Shot boundary detection based on orthogonal polynomial

Sadiq H. Abdulhussain, Abd Rahman Ramli, Basheera M. Mahmmod, M. Iqbal Saripan, Syed Abdul Rahman Al-Haddad et al.

Multimedia Tools and Applications , vol. 78 , no. 14 , pp. 20361–20382

Articulatory behaviour during disfluencies in stuttered speech

Ivana Didirková, Sébastien Le Maguer, Fabrice Hirsch, Dodji Gbedahou

Proceedings of the International Congress of Phonetic Science (ICPhS)

Solar Flare Forecasting from Magnetic Feature Properties Generated by the Solar Monitor Active Region Tracker

Katarina Domijan, D. Shaun Bloomfield, François Pitié

Solar Physics , vol. 294 , no. 1

NSQM: A non-intrusive assessment of speech quality using normalized energies of the neurogram

Wissam A. Jassim, Muhammad S. A. Zilany

Computer Speech & Language , vol. 58 , pp. 260–279

Speech Enhancement Algorithm Based on Super-Gaussian Modeling and Orthogonal Polynomials

Basheera M. Mahmmod, Abd Rahman Ramli, Thar Baker, Feras Al-Obeidat, Sadiq H. Abdulhussain et al.

IEEE Access , vol. 7 , pp. 103485–103504

Speech Synthesis Evaluation - State-of-the-Art Assessment and Suggestion for a Novel Research Program

Petra Wagner, Jonas Beskow, Simon Betz, Jens Edlund, Joakim Gustafson et al.

10th ISCA Speech Synthesis Workshop

2018

A New Hybrid form of Krawtchouk and Tchebichef Polynomials: Design and Application

Sadiq H. Abdulhussain, Abd Rahman Ramli, Basheera M. Mahmmod, M. Iqbal Saripan, Syed Abdul Rahman Al-Haddad et al.

Journal of Mathematical Imaging and Vision , vol. 61 , no. 4 , pp. 555–570

Methods and Challenges in Shot Boundary Detection: A Review

Sadiq H. Abdulhussain, Abd Ramli, M. Saripan, Basheera Mahmmod, Syed Abdul Rahman Al-Haddad, Wissam Jassim

Entropy , vol. 20 , no. 4 , pp. 214

Radon transform of auditory neurograms: a robust feature set for phoneme classification

Md. Shariful Alam, Wissam A. Jassim, Muhammad S. A. Zilany

IET Signal Processing , vol. 12 , no. 3 , pp. 260–268

Neural net architectures for image demosaicing

Rhys Buggy, Marco Forte, François Pitié

Applications of Digital Image Processing XLI

Estimation of the Asymmetry Parameter of the Glottal Flow Waveform Using the Electroglottographic Signal

João Cabral

Interspeech 2018

Perception and prediction of speaker appeal - A single speaker study

Ailbhe Cullen, Andrew Hines, Naomi Harte

Computer Speech & Language , vol. 52 , pp. 23–40

The Impact of Reduced Video Quality on Visual Speech Recognition

Laura Dungan, Ali Karaali, Naomi Harte

2018 25th IEEE International Conference on Image Processing (ICIP)

ADNet: A Deep Network for Detecting Adverts

Murhaf Hossari, Soumyabrata Dev, Matthew Nicholson, Killian McCabe, Atul Nautiyal et al.

Voice Activity Detection Using Neurograms

Wissam A. Jassim, Naomi Harte

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Edge-Based Defocus Blur Estimation With Adaptive Scale Selection

Ali Karaali, Claudio Rosito Jung

IEEE Transactions on Image Processing , vol. 27 , no. 3 , pp. 1126–1137

Temporal Consistency for Still Image Based Defocus Blur Estimation Methods

Ali Karaali, Claudio Rosito Jung, François Pitié

2018 25th IEEE International Conference on Image Processing (ICIP)

Using video quality metrics for something other than compression

Anil Kokaram, Chao Chen, Yilin Wang, Jessie Lin, Balu Adsumilli et al.

Applications of Digital Image Processing XLI

Signal compression and enhancement using a new orthogonal-polynomial-based discrete transform

Basheera M. Mahmmod, Abd Rahman bin Ramli, Sadiq H. Abdulhussain, Syed Abdul Rahman Al-Haddad, Wissam A. Jassim

IET Signal Processing , vol. 12 , no. 1 , pp. 129–142

Acoustic distinctions between speech and singing: Is singing acoustically more stable than speech?

Beatriz Medeiros, João Cabral

9th International Conference on Speech Prosody 2018

Measuring vocal difference in bird population pairs

Colm O'Reilly, Kangkuso Analuddin, David J. Kelly, Naomi Harte

The Journal of the Acoustical Society of America , vol. 143 , no. 3 , pp. 1658–1671

Using modern motion estimation algorithms in existing video codecs

Daniel Joseph Ringis, Davinder Singh, François Pitié, Anil Kokaram

Applications of Digital Image Processing XLI

Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs

Matthew Roddy, Gabriel Skantze, Naomi Harte

Interspeech 2018

Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition

George Sterpu, Christian Saam, Naomi Harte

Proceedings of the 20th ACM International Conference on Multimodal Interaction

Can DNNs Learn to Lipread Full Sentences?

George Sterpu, Christian Saam, Naomi Harte

2018 25th IEEE International Conference on Image Processing (ICIP)

Trust in artificial voices: A "congruency effect" of first impressions and behavioural experience

Ilaria Torre, Jeremy Goslin, Laurence White, Debora Zanatto

Proceedings of the Technology, Mind, and Society

2017

Image edge detection operators based on orthogonal polynomials

Sadiq H. Abdulhussain, Abd. Rahman Ramli, Basheera M. Mahmmod, Syed Abdul Rahman Al-Haddad, Wissam A. Jassim

International Journal of Image and Data Fusion , pp. 1–16

On Computational Aspects of Tchebichef Polynomials for Higher Polynomial Order

Sadiq H. Abdulhussain, Abd Rahman Ramli, Syed Abdul Rahman Al-Haddad, Basheera M. Mahmmod, Wissam A. Jassim

IEEE Access , vol. 5 , pp. 2470–2478

Phoneme Classification Using the Auditory Neurogram

Md. Shariful Alam, Muhammad S. A. Zilany, Wissam A. Jassim, Mohd Yazed Ahmad

IEEE Access , vol. 5 , pp. 633–642

The Influence of Synthetic Voice on the Evaluation of a Virtual Character

João Cabral, Benjamin R. Cowan, Katja Zibrek, Rachel McDonnell

Interspeech 2017

Thin slicing to predict viewer impressions of TED Talks

Ailbhe Cullen, Naomi Harte

The 14th International Conference on Auditory-Visual Speech Processing

A no-reference video quality predictor for compression and scaling artifacts

Deepti Ghadiyaram, Chao Chen, Sasi Inguva, Anil Kokaram

2017 IEEE International Conference on Image Processing (ICIP)

A software radio LTE network testbed for video quality of experience experimentation

Ismael Gomez, Paul Sutton, Avishek Nag, Ahmed Selim, Linda Doyle et al.

2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX)

A No-Reference Video Quality Predictor For Compressed Videos

Sasi Inguva, Chao Chen, Anil Kokaram

Speech emotion classification using combined neurogram and INTERSPEECH 2010 paralinguistic challenge features

Wissam A. Jassim, Raveendran Paramesran, Naomi Harte

IET Signal Processing , vol. 11 , no. 5 , pp. 587–595

Low-Distortion MMSE Speech Enhancement Estimator Based on Laplacian Prior

Basheera M. Mahmmod, Abd Rahman Ramli, Sadiq H. Abdulhussain, Syed Abdul Rahman Al-Haddad, Wissam A. Jassim

IEEE Access , vol. 5 , pp. 9866–9881

Automatic frequency feature extraction for bird species delimitation

Colm O'Reilly, Münevver Köküer, Peter Jančovič, Regan Drennan, Naomi Harte

2017 25th European Signal Processing Conference (EUSIPCO)

Pitch tracking of bird vocalizations and an automated process using YIN-bird

Colm O'Reilly, Naomi Harte

Cogent Biology , vol. 3 , no. 1 , pp. 1322025

Detecting conversational gaze aversion using unsupervised learning

Matthew Roddy, Naomi Harte

2017 25th European Signal Processing Conference (EUSIPCO)

Towards predicting dialog acts from previous speakers non-verbal cues

Matthew Roddy, Naomi Harte

MMSYM

Objective Assessment of Perceptual Audio Quality Using ViSQOLAudio

Colm Sloan, Naomi Harte, Damien Kelly, Anil Kokaram, Andrew Hines

IEEE Transactions on Broadcasting , vol. 63 , no. 4 , pp. 693–705

Towards Lipreading Sentences with Active Appearance Models

George Sterpu, Naomi Harte

The 14th International Conference on Auditory-Visual Speech Processing

A longitudinal database of Irish political speech with annotations of speaker ability

Ailbhe Cullen, Naomi Harte

Language Resources and Evaluation , vol. 52 , no. 2 , pp. 401–432

2016

Anatomy from the outside in: a new on-line surface anatomy guide

Journal of Anatomy , vol. 228 , no. 1 , pp. 24–25

The ADAPT entry to the Blizzard Challenge 2016

João Cabral, Christian Saam, Eva Vanmassenhove, S. Bradley, Fasih Haider

Proceedings of the Blizzard Challenge 2016 Workshop

A Perceptual Quality Metric for Videos Distorted by Spatially Correlated Noise

Chao Chen, Mohammad Izadi, Anil Kokaram

Proceedings of the 24th ACM international conference on Multimedia

A Subjective Study for the Design of Multi-resolution ABR Video Streams with the VP9 Codec

Chao Chen, Sasi Inguva, Andrew Rankin, Anil Kokaram

Electronic Imaging , vol. 2016 , no. 2 , pp. 1–5

Optimizing Transcoder Quality Targets Using a Neural Network with an Embedded Bitrate Model

Michele Covell, Martín Arjovsky, Yao-Chung Lin, Anil Kokaram

Electronic Imaging , vol. 2016 , no. 2 , pp. 1–7

A robust automatic birdsong phrase classification: A template-based approach

Kantapon Kaewtip, Abeer Alwan, Colm O'Reilly, Charles E. Taylor

The Journal of the Acoustical Society of America , vol. 140 , no. 5 , pp. 3691–3701

YIN-Bird: Improved Pitch Tracking for Bird Vocalisations

Colm O'Reilly, Nicola M. Marples, David J. Kelly, Naomi Harte

Interspeech 2016

An alternative matting Laplacian

François Pitié

2016 IEEE International Conference on Image Processing (ICIP)

Rank Reduced Alternative Matting Laplacian

François Pitié

Proceedings of the 13th European Conference on Visual Media Production (CVMP 2016)

Geometry-driven quantization for omnidirectional image coding

Francesca De Simone, Pascal Frossard, Paul Wilkins, Neil Birkbeck, Anil Kokaram

2016 Picture Coding Symposium (PCS)

Bitrate classification of twice-encoded audio using objective quality features

Colm Sloan, Naomi Harte, Damien Kelly, Anil Kokaram, Andrew Hines

2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX)

Prediction of Emotions from Text using Sentiment Analysis for Expressive Speech Synthesis

Eva Vanmassenhove, João Cabral, Fasih Haider

9th ISCA Speech Synthesis Workshop

A cloud-based large-scale distributed video analysis system

Yongzhe Wang, Wei-Ta Chen, Huahui Wu, Anil Kokaram, Jaron Schaeffer

2016 IEEE International Conference on Image Processing (ICIP)

A perceptual visibility metric for banding artifacts

Yilin Wang, Sang-Uok Kum, Chao Chen, Anil Kokaram

2016 IEEE International Conference on Image Processing (ICIP)

Double-Tip Artifact Removal From Atomic Force Microscopy Images

Yun-Feng Wang, Jason I. Kilpatrick, Suzanne Jarvis, Frank Boland, Anil Kokaram et al.

IEEE Transactions on Image Processing , vol. 25 , no. 6 , pp. 2774–2788

2015

TCD-TIMIT: An Audio-Visual Corpus of Continuous Speech

Naomi Harte, Eoin Gillen

IEEE Transactions on Multimedia , vol. 17 , no. 5 , pp. 603–615

TCD-VoIP, a research database of degraded speech for assessing quality in VoIP applications

Naomi Harte, Eoin Gillen, Andrew Hines

2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX)

ViSQOL: an objective speech quality model

Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte

Journal on Audio, Speech, and Music Processing , vol. 2015 , no. 1

ViSQOLAudio: An objective audio quality metric for low bitrate codecs

Andrew Hines, Eoin Gillen, Damien Kelly, Jan Skoglund, Anil Kokaram et al.

The Journal of the Acoustical Society of America , vol. 137 , no. 6 , pp. EL449–EL455

Forensic comparison of ageing voices from automatic and auditory perspectives

Finnian Kelly, Naomi Harte

International Journal of Speech, Language and the Law , vol. 22 , no. 2 , pp. 167–202

Special Issue in Honour of William J. (Bill) Fitzgerald

Ercan E. Kuruoglu, Joan Lasenby, A. Taylan Cemgil, Anil Kokaram, Robin D. Morris

Digital Signal Processing , vol. 47 , pp. 1–2

Multipass encoding for reducing pulsing artifacts in cloud based video transcoding

Yao-Chung Lin, Hugh Denman, Anil Kokaram

2015 IEEE International Conference on Image Processing (ICIP)

Quantifying Difference in Vocalizations of Bird Populations

Colm O'Reilly, Nicola M. Marples, David J. Kelly, Naomi Harte

INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, September 6–10, Dresden, Germany , pp. 3417–3421

An Analysis of the Impact of Playout Delay Adjustments introduced by VoIP Jitter Buffers on Listening Speech Quality

Peter Pocta, Hugh Melvin, Andrew Hines

Acta Acustica united with Acustica , vol. 101 , no. 3 , pp. 616–631

Segmentation and Inpainting for Stereoscopic Videos

Félix Raimbault

Enhancement, Summarization and Analysis of Underwater Videos of Nephrops Habitats

Ken Sooknanan

2014

Advanced video debanding

Gary Baugh, Anil Kokaram, François Pitié

Proceedings of the 11th European Conference on Visual Media Production

A Video Database for the Development of Stereo-3D Post-Production Algorithms

David Corrigan, François Pitié, Marcin Gorzel, Gavin Kearney, Valerie Morris et al.

JVRB - Journal of Virtual Reality and Broadcasting 2014

Building a Database of Political Speech

Ailbhe Cullen, Andrew Hines, Naomi Harte

Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge - AVEC '14

Investigation of Ambisonic Rendering of Elevated Sound Sources

Marcin Gorzel, Gavin Kearney, Frank Boland

Audio Engineering Society 55th International Conference: Spatial Audio

Perceived Audio Quality for Streaming Stereo Music

Andrew Hines, Eoin Gillen, Damien Kelly, Jan Skoglund, Anil Kokaram et al.

Proceedings of the 22nd ACM international conference on Multimedia

Robustness And Prediction Accuracy Of Machine Learning For Objective Visual Quality Assessment

Andrew Hines, Paul Kendrick, Adriaan Barri, Manish Narwaria, J. A. Redi

EUSIPCO

Automatic Recognition of Ageing Speakers

Finnian Kelly

Detecting Arrivals in Room Impulse Responses With Dynamic Time Warping

Ian J. Kelly, Frank Boland

IEEE/ACM Transactions on Audio, Speech, and Language Processing , vol. 22 , no. 7 , pp. 1139–1147

Phase and Randomness in Acoustic Responses

Ian J. Kelly, B. O'Toole, F.M. Boland, Marcin Gorzel

25th IET Irish Signals & Systems Conference 2014 and 2014 China-Ireland International Conference on Information and Communities Technologies (ISSC 2014/CIICT 2014)

Randomness and the reverberation time, RTinf, of acoustic responses

Ian J. Kelly, Frank Boland

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Prediction Quality Assessment

Matjaž Kukar

Conformal Prediction for Reliable Machine Learning , pp. 145–166

Effect of long-term ageing on i-vector speaker verification

David van Leeuwen, Finnian Kelly, Rahim Saeidi, Naomi Harte

Interspeech 2014

Virtual 5.1 Surround Sound Localization using Head-Tracking Devices

B.C. O'Toole, L. O'Sullivan, Ian J. Kelly, Frank Boland, Marcin Gorzel et al.

25th IET Irish Signals & Systems Conference 2014 and 2014 China-Ireland International Conference on Information and Communities Technologies (ISSC 2014/CIICT 2014)

Assessment of Audio/Video synchronisation in streaming media

François Pitié, Damien Kelly, Thierry Foucu, Naomi Harte, Anil Kokaram

2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX)

Towards Automated Classification of Seabed Substrates in Underwater Video

Matthew Pugh, Bernard Tiddeman, Hannah Dee, Philip Hughes

2014 ICPR Workshop on Computer Vision for Analysis of Underwater Imagery

Bleed-Through Document Image Restoration

Róisín Rowley-Brooke

Mosaics for Nephrops detection in underwater survey videos

Ken Sooknanan, Jennifer Doyle, Colm Lordan, James Wilson, Anil Kokaram et al.

2014 Oceans - St. John's

Classification of Seabed Type from Underwater Video

Steven Tyner, James Wilson, David Corrigan

Irish Machine Vision and Image Processing Conference (IMVIP)

Automated registration of low and high resolution atomic force microscopy images using scale invariant features

Yun-Feng Wang, Jason I. Kilpatrick, Suzanne Jarvis, Frank Boland, Anil Kokaram et al.

2014 IEEE International Conference on Image Processing (ICIP)

2013

Depth perception of audio sources in stereo 3D environments

David Corrigan, Marcin Gorzel, John Squires, Frank Boland

Stereoscopic Displays and Applications XXIV

Creaky Voice and the Classification of Affect

Ailbhe Cullen, John Kane, Thomas Drugman, Naomi Harte

Workshop on Affective Social Speech Signals (WASSS)

Late Integration of Features for Acoustic Emotion Recognition

Ailbhe Cullen, Naomi Harte

European Signal Processing Conference (EUSIPCO)

Blotch and scratch removal in archived film using a semi-transparent corruption model and a ground-truth generation technique

Mohamed A. Elgharib, François Pitié, Anil Kokaram

Journal on Image and Video Processing , vol. 2013 , no. 1

User-assisted reflection detection and feature point tracking

Mohamed A. Elgharib, François Pitié, Anil Kokaram, Venkatesh Saligrama

Proceedings of the 10th European Conference on Visual Media Production - CVMP '13

Identifying new bird species from differences in birdsong

Naomi Harte, Sadhbh Murphy, David J. Kelly, Nicola M. Marples

Interspeech , pp. 2900–2904

Detailed comparative analysis of PESQ and VISQOL behaviour in the context of playout delay adjustments introduced by VOIP jitter buffer algorithms

Andrew Hines, Peter Pocta, Hugh Melvin

2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX)

Monitoring the Effects of Temporal Clipping on VoIP Speech Quality

Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte

Interspeech 2013

Robustness of speech quality metrics to background noise and network degradations: Comparing ViSQOL, PESQ and POLQA

Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte

2013 IEEE International Conference on Acoustics, Speech and Signal Processing

Auditory detectability of vocal ageing and its effect on forensic automatic speaker recognition

Finnian Kelly, Naomi Harte

Interspeech 2013

Eigenageing Compensation for Speaker Verification

Finnian Kelly, Niko Brummer, Naomi Harte

Interspeech 2013

Exploiting randomness in acoustic impulse responses to achieve headphone compensation through deconvolution

Ian J. Kelly, Frank Boland

The Journal of the Acoustical Society of America , vol. 133 , no. 5 , pp. 2778–2787

Speaker verification in score-ageing-quality classification space

Finnian Kelly, Andrzej Drygajlo, Naomi Harte

Computer Speech & Language , vol. 27 , no. 5 , pp. 1068–1084

The impact of ageing on speech-based biometric systems

Finnian Kelly, Naomi Harte

Age Factors in Biometric Processing

Shape Models for Image Segmentation in Microscopy

Kangyu Pan

Adaptive video stabilisation with dominant motion layer estimation for home video and TV broadcast

Félix Raimbault, Yalcin Incesu

2013 IEEE International Conference on Image Processing

User-assisted sparse stereo-video segmentation

Félix Raimbault, François Pitié, Anil Kokaram

Proceedings of the 10th European Conference on Visual Media Production - CVMP '13

A Non-parametric Framework for Document Bleed-through Removal

Róisín Rowley-Brooke, François Pitié, Anil Kokaram

2013 IEEE Conference on Computer Vision and Pattern Recognition

Degraded manuscript restoration: A case study

Róisín Rowley-Brooke, François Pitié, Anil Kokaram

Annual Conference of the Society for Musicology in Ireland (SMI)

Nonrigid recto-verso registration using page outline structure and content preserving warps

Róisín Rowley-Brooke, François Pitié, Anil Kokaram

Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing - HIP '13

Residual Life Prediction of Rotating Machines Using Acoustic Noise Signals

Patricia Scanlon, Darren F. Kavanagh, Frank Boland

IEEE Transactions on Instrumentation and Measurement , vol. 62 , no. 1 , pp. 95–108

Mosaics For Burrow Detection in Underwater Surveillance Video

Ken Sooknanan, Jennifer Doyle, James Wilson, Naomi Harte, Anil Kokaram et al.

Oceans 2013

2012

Phoneme-to-Viseme Mapping for Visual Speech Recognition

Luca Cappelletta, Naomi Harte

International Conference on Pattern Recognition Applications and Methods (ICPRAM) , vol. 2 , pp. 322–329

Algorithms for the Digital Restoration of Torn Films

David Corrigan, Anil Kokaram, Naomi Harte

IEEE Transactions on Image Processing , vol. 21 , no. 2 , pp. 573–587

Lower and upper bounds for approximation of the Kullback-Leibler divergence between Gaussian Mixture Models

J.-L. Durrieu, J.-Ph. Thiran, Finnian Kelly

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Distance Perception in Virtual Audio-Visual Environments

Marcin Gorzel, David Corrigan, Gavin Kearney, John Squires, Frank Boland

25th AES UK Conference: Spatial Audio in Today's 3D World

Improved Speech Intelligibility with a Chimaera Hearing Aid Algorithm

Andrew Hines, Naomi Harte

Interspeech 2012

Predicting Speech Intelligibility

Andrew Hines

Speech intelligibility prediction using a Neurogram Similarity Index Measure

Andrew Hines, Naomi Harte

Speech Communication , vol. 54 , no. 2 , pp. 306–320

ViSQOL: The Virtual Speech Quality Objective Listener

Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte

International Workshop on Acoustic Signal Enhancement (IWAENC)

Distance Perception in Interactive Virtual Acoustic Environments using First and Higher Order Ambisonic Sound Fields

Gavin Kearney, Marcin Gorzel, Henry Rice, Frank Boland

Acta Acustica united with Acustica , vol. 98 , no. 1 , pp. 61–71

On loudspeaker rendering of auditory distance in higher order Ambisonics

Gavin Kearney, Marcin Gorzel, Frank Boland

Acoustics 2012

On Phase and Randomness in Head Related Impulse Responses

Ian J. Kelly, Frank Boland

9th IMA International Conference on Mathematics in Signal Processing

Speaker verification with long-term ageing data

Finnian Kelly, Andrzej Drygajlo, Naomi Harte

2012 5th IAPR International Conference on Biometrics (ICB)

HRIR Order Reduction Using Approximate Factorization

Claire Masterson, Gavin Kearney, Marcin Gorzel, Frank Boland

IEEE Transactions on Audio, Speech, and Language Processing , vol. 20 , no. 6 , pp. 1808–1817

A wavelet-based Bayesian framework for 3D object segmentation in microscopy

Kangyu Pan, David Corrigan, Jens Hillebrand, Mani Ramaswami, Anil Kokaram

Three-Dimensional and Multidimensional Microscopy: Image Acquisition and Processing XIX

Stereo video completion for rig and artefact removal

Félix Raimbault, François Pitié, Anil Kokaram

2012 13th International Workshop on Image Analysis for Multimedia Interactive Services

Stereo-video inpainting

Félix Raimbault

Journal of Electronic Imaging , vol. 21 , no. 1 , pp. 011005

A Ground Truth Bleed-Through Document Image Database

Róisín Rowley-Brooke, François Pitié, Anil Kokaram

Theory and Practice of Digital Libraries , pp. 185–196

Bleed-through removal in degraded documents

Róisín Rowley-Brooke, Anil Kokaram

Document Recognition and Retrieval XIX

Improving underwater visibility using vignetting correction

Ken Sooknanan, Anil Kokaram, David Corrigan, Gary Baugh, James Wilson et al.

Visual Information Processing and Communication III

Indexing and selection of well-lit details in underwater video mosaics using vignetting estimation

Ken Sooknanan, Anil Kokaram, David Corrigan, Gary Baugh, Naomi Harte et al.

2012 Oceans - Yeosu

Restoration of high-resolution AFM images captured with broken probes

Y. F. Wang, David Corrigan, C. Forman, Suzanne Jarvis, Anil Kokaram

Three-Dimensional and Multidimensional Microscopy: Image Acquisition and Processing XIX

2011

Motion Estimation for Regions of Reflections through Layer Separation

Mohamed Abdelaziz Ahmed, François Pitié, Anil Kokaram

2011 Conference for Visual Media Production

Reflection detection in image sequences

Mohamed Abdelaziz Ahmed, François Pitié, Anil Kokaram

CVPR 2011

An Extended Multiresolution Approach to Mouth Specific AAM Fitting for Speech Recognition

Craig Berry, Anil Kokaram, Naomi Harte

European Signal Processing Conference (EUSIPCO)

Viseme Definitions Comparison for Visual-Only Speech Recognition

Luca Cappelletta, Naomi Harte

European Signal Processing Conference (EUSIPCO)

Restoration of Image Burnout in 3D-Stereoscopic Media Using Inter-View Gradient Interpolation

David Corrigan, François Pitié, Anil Kokaram

European Signal Processing Conference (EUSIPCO)

Restoring Image Burnout in 3D-Stereoscopic Media using Temporally Consistent Disparity Maps

David Corrigan, François Pitié, Anil Kokaram

Irish Signals and Systems Conference

Handling Transparency in Digital Video

Mohamed A. Elgharib

On the Perception of Dynamic Sound Sources in Ambisonic Binaural Renderings

Marcin Gorzel, Gavin Kearney, Henry Rice, Frank Boland

AES 41st International Conference

Comparing hearing aid algorithm performance using Simulated Performance Intensity Functions

Andrew Hines, Naomi Harte

Speech perception and auditory disorders, International Symposium on Audiological and Auditory Research (ISAAR)

Reproduction of the performance/intensity function using image processing and a computational model (A)

Andrew Hines, Naomi Harte

International Journal of Audiology , vol. 50 , no. 10 , pp. 723

Simulated performance intensity functions

Andrew Hines, Naomi Harte

2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Real-time walkthrough auralisation of the acoustics of Christ Church cathedral Dublin

Gavin Kearney, Marcin Gorzel, Frank Boland, F. Smyth, D. Lennon et al.

Proceedings of the Institute of Acoustics , vol. 33 , pp. 244–258

Effects of Long-Term Ageing on Speaker Verification

Finnian Kelly, Naomi Harte

Lecture Notes in Computer Science , pp. 113–124

Cellsnake: A new active contour technique for cell/fibre segmentation

Kangyu Pan, Anil Kokaram, Kerry Gilmore, Michael J. Higgins, Robert Kapsa, Gordon G. Wallace

2011 18th IEEE International Conference on Image Processing

Stereo video inpainting

Félix Raimbault, Anil Kokaram

Stereoscopic Displays and Applications XXII

Bleed-Through Removal in Degraded Manuscripts

Róisín Rowley-Brooke, Anil Kokaram

Irish Signals and Systems Conference

Degraded Document Bleed-Through Removal

Róisín Rowley-Brooke, Anil Kokaram

2011 Irish Machine Vision and Image Processing Conference

2010

Semi-automatic motion based segmentation using long term motion trajectories

Gary Baugh, Anil Kokaram

2010 IEEE International Conference on Image Processing

Nostril detection for robust mouth tracking

Luca Cappelletta, Naomi Harte

IET Irish Signals and Systems Conference (ISSC 2010)

A Video Database for the Development of Stereo-3D Post-Production Algorithms

David Corrigan, François Pitié, Valerie Morris, Andrew Rankin, M. Linnane et al.

2010 Conference on Visual Media Production

Evaluating Sensorineural Hearing Loss With An Auditory Nerve Model Using A Mean Structural Similarity Measure

Andrew Hines, Naomi Harte

European Signal Processing Conference (EUSIPCO '10)

Speech intelligibility from image processing

Andrew Hines, Naomi Harte

Speech Communication , vol. 52 , no. 9 , pp. 736–752

Depth perception in interactive virtual acoustic environments using higher order ambisonic soundfields

Gavin Kearney, Marcin Gorzel, H. Rice, Frank Boland

2nd International Ambisonics and Spherical Acoustics Symposium

A Comparison of Auditory Features for Robust Speech Recognition

Finnian Kelly, Naomi Harte

European Signal Processing Conference (EUSIPCO '10)

Auditory Features Revisited for Robust Speech Recognition

Finnian Kelly, Naomi Harte

2010 20th International Conference on Pattern Recognition

Training GMMs for speaker verification

Finnian Kelly, Naomi Harte

IET Irish Signals and Systems Conference (ISSC 2010)

HRIR Factorisation: A Regularised Approach

C. Masterson, Gavin Kearney, Frank Boland

EUSIPCO 2010 , vol. 2 , pp. 751–755

Optimised virtual loudspeaker reproduction

C. Masterson, Gavin Kearney, Marcin Gorzel, H. Rice, Frank Boland

IET Irish Signals and Systems Conference (ISSC 2010)

Content-Based Media Processing

Deirdre O'Regan

Gaussian mixture models for spots in microscopy using a new split/merge EM algorithm

Kangyu Pan, Anil Kokaram, Jens Hillebrand, Mani Ramaswami

2010 IEEE International Conference on Image Processing

Gaussian Mixtures for Intensity Modeling of Spots in Microscopy

Kangyu Pan, Jens Hillebrand, Mani Ramaswami, Anil Kokaram

International Symposium on Biomedical Imaging (ISBI'10) , pp. 121–124

Matting with a depth map

François Pitié, Anil Kokaram

2010 IEEE International Conference on Image Processing

2008

François Pitié, Anil Kokaram, Rozenn Dahyot

CRC Press , pp. 295–321