Building a Read Aloud eBook in Python

2:00pm - 2:25pm on Saturday, October 6 in PennTop North

Max Schwartz

Audience Level:


What does it take to merge an audiobook and an eBook into a single experience, switching from reading to listening and back? We will show how we use Python to combine NLP and speech processing technologies to generate a Read Aloud eBook and showcase an application to help developing readers.


This is a project by Binod Gyawali, Beata Beigman Klebanov, and Anastassia Loukina.

Books are increasingly consumed through listening as well as reading. But what does it take to create an eBook that lets the reader switch between the two modalities? In this talk we describe how Python can be used to apply NLP and speech processing technologies to combine an existing eBook in EPUB format and an audiobook into a single Read Aloud book. The system we developed uses Python libraries to read the contents of the EPUB file, NLP methods to process that content, and open-source speech processing tools (Kaldi-based forced alignment) to align the audio files with the eBook text. It then creates a Read Aloud book from the alignment information, the EPUB content, and the audio files. We generate the final Read Aloud book with the ebooklib Python library, which we updated to add Read Aloud EPUB generation functionality.
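To make the last step more concrete, here is a minimal sketch of how alignment output could be turned into an EPUB 3 Media Overlay (SMIL) document, the file that tells a reading system which audio clip belongs to which text fragment. It is not the project's code: it uses only the standard library rather than ebooklib, and the function names (`format_clock`, `build_smil`), the fragment ids, the file names, and the timings are all hypothetical. It assumes the forced aligner has already produced `(fragment_id, start_seconds, end_seconds)` tuples.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def format_clock(seconds):
    """Format a time in seconds as a SMIL clock value, e.g. 0:00:02.500."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_smil(alignment, text_href, audio_href):
    """Build a Media Overlay document from (fragment_id, start, end) tuples.

    Each tuple becomes one <par> that pairs a text fragment with an audio clip.
    """
    # Serialize SMIL elements without a prefix (default namespace).
    ET.register_namespace("", SMIL_NS)
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for index, (fragment_id, start, end) in enumerate(alignment, start=1):
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par{index}"})
        ET.SubElement(
            par, f"{{{SMIL_NS}}}text", {"src": f"{text_href}#{fragment_id}"}
        )
        ET.SubElement(
            par,
            f"{{{SMIL_NS}}}audio",
            {
                "src": audio_href,
                "clipBegin": format_clock(start),
                "clipEnd": format_clock(end),
            },
        )
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical alignment for two sentences of one chapter.
    sample = [("s1", 0.0, 2.5), ("s2", 2.5, 5.125)]
    print(build_smil(sample, "chapter1.xhtml", "audio/chapter1.mp3"))
```

In a complete Read Aloud EPUB, the package document would also need to link each content document to its overlay and declare the audio duration. Those steps are omitted here.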

We’ll conclude with a demonstration of a Read Aloud eBook and showcase an educational application that uses such a book. In this application, a student alternates between listening to the audiobook and reading aloud. During listening, the text of the book is highlighted in sync with the audio playback to help students follow the narration and maintain focus.
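The principle behind synchronized highlighting can be sketched in a few lines: given the alignment and the current playback position, find the fragment being narrated. In an actual Read Aloud eBook this is typically handled by the reading system, so the snippet below is only an illustration. The function name `active_fragment` and the sample data are hypothetical, and the alignment is assumed to be sorted by start time.

```python
from bisect import bisect_right


def active_fragment(alignment, playback_time):
    """Return the id of the fragment narrated at playback_time, or None.

    alignment is a list of (fragment_id, start, end) tuples sorted by start.
    """
    starts = [start for _, start, _ in alignment]
    # Index of the last fragment whose start is <= playback_time.
    index = bisect_right(starts, playback_time) - 1
    if index < 0:
        return None  # playback has not reached the first fragment yet
    fragment_id, _, end = alignment[index]
    # Inside a gap between fragments (e.g. a pause), nothing is highlighted.
    return fragment_id if playback_time < end else None


if __name__ == "__main__":
    # Hypothetical timings with a pause between the second and third sentence.
    sample = [("s1", 0.0, 2.5), ("s2", 2.5, 5.0), ("s3", 6.0, 8.0)]
    for t in (1.0, 2.5, 5.5, 7.0):
        print(t, active_fragment(sample, t))
```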

This is an informational talk with no coding involved. We will discuss the system that generates the Read Aloud eBook, demo the book, and discuss the challenges we faced in the process. The target audience is beginner level. We do not require any prior understanding of eBook structure or forced alignment, though familiarity with these would be an advantage.
