Generating an audiobook from an EPUB e-book

Wednesday, September 20, 2023

Converting EPUB to text with epub-to-text

epub-to-text splits the output by chapter automatically.

clone epub-to-text

git clone https://github.com/Projet-TAMIS/epub-to-text.git

install dependencies

Node.js must be installed first.

Then fetch the dependencies:

npm install

create `main.js`

run with node

node main.js

Converting EPUB to text with epub2txt2

The chapter splitting in epub-to-text is sometimes (most of the time 😅) inaccurate.

clone epub2txt2

git clone https://github.com/kevinboone/epub2txt2.git

make && make install

gcc and make must be installed first.

Then build and install:

make && make install

convert

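The -a (--ascii) option of epub2txt2 is supposed to drop phonetic annotations, which is handy in many languages, for example the kana readings (furigana) in Japanese. If these annotations are left in, they hurt the later TTS step, because the same word gets read twice. (In my tests it did not work 🙅)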
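Since the -a option did not help in practice, one alternative is to strip the annotations from the chapter XHTML before extracting text. The following is a minimal sketch, assuming the readings are marked up with the standard HTML ruby elements (`<ruby>`, `<rb>`, `<rt>`, `<rp>`). The function name `strip_ruby` is hypothetical and is not part of epub2txt2.

```python
import re

# Matches <rt>...</rt> (the reading) and <rp>...</rp> (fallback parentheses).
_RT_RP = re.compile(r"<(rt|rp)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
# Matches the remaining <ruby>, </ruby>, <rb>, </rb> wrapper tags.
_RUBY_TAGS = re.compile(r"</?(?:ruby|rb)\b[^>]*>", re.IGNORECASE)


def strip_ruby(xhtml: str) -> str:
    """Remove ruby readings from markup, keeping only the base text."""
    without_readings = _RT_RP.sub("", xhtml)
    return _RUBY_TAGS.sub("", without_readings)
```

A regular expression is enough for this narrow case because ruby elements do not nest their own `<rt>` tags in typical e-books; for irregular markup a real HTML parser would be the safer choice.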

epub2txt -a xxx.epub > xxx.txt

split txt file

Split the generated txt file.

On Linux:

split -l 888 output.txt -d -a 2 output__

-l: number of lines per output file
-d: use numeric suffixes for the output file names
-a: number of digits in the suffix
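Where the split command is not available (for example on Windows), the same chunking can be approximated in Python. This is a rough sketch, not a replacement for split: the function name `split_by_lines` and its defaults are assumptions chosen to mirror the options used here, and unlike split it writes the chunks next to the source file rather than into the current directory.

```python
from pathlib import Path
from typing import List


def split_by_lines(src: Path, lines_per_file: int = 888,
                   prefix: str = "output__", digits: int = 2) -> List[Path]:
    """Split src into numbered chunks of at most lines_per_file lines."""
    with src.open(encoding="utf-8") as handle:
        lines = handle.readlines()
    outputs = []
    for index, start in enumerate(range(0, len(lines), lines_per_file)):
        # Zero-padded numeric suffix, like split -d -a 2: output__00, output__01, ...
        target = src.with_name(f"{prefix}{index:0{digits}d}")
        chunk = lines[start:start + lines_per_file]
        target.write_text("".join(chunk), encoding="utf-8")
        outputs.append(target)
    return outputs
```

Calling `split_by_lines(Path("output.txt"))` would then produce `output__00`, `output__01`, and so on.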

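Converting EPUB to text with epub2splittxt

This repository is fairly old: it uses Python 2 and does not support EPUB 3. (EPUB 3 does not require toc.ncx as the table of contents.)

It also does not support choosing an output directory.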
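Because toc.ncx is optional in EPUB 3, a more robust way to find the chapters is to follow the spine of the package document, which both EPUB 2 and EPUB 3 require. The sketch below uses only the standard library and shows the lookup chain container.xml, then OPF manifest, then spine. The function name `spine_documents` is hypothetical, and the code assumes a well-formed, unencrypted EPUB whose hrefs are not percent-encoded.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from typing import List

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def spine_documents(epub_path: str) -> List[str]:
    """Return the content documents of an EPUB in reading order."""
    with zipfile.ZipFile(epub_path) as archive:
        # META-INF/container.xml points at the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        opf_path = rootfile.attrib["full-path"]
        package = ET.fromstring(archive.read(opf_path))

    # The manifest maps item ids to hrefs, relative to the OPF file.
    hrefs = {
        item.attrib["id"]: item.attrib["href"]
        for item in package.findall("opf:manifest/opf:item", NS)
    }
    base = posixpath.dirname(opf_path)
    # The spine lists item ids in reading order.
    return [
        posixpath.normpath(posixpath.join(base, hrefs[ref.attrib["idref"]]))
        for ref in package.findall("opf:spine/opf:itemref", NS)
    ]
```

The returned paths are archive member names, so each one can be passed straight to `ZipFile.read` to load the chapter markup.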

clone epub2splittxt

git clone https://github.com/gtas5/epub2splittxt.git

install pip2 for python2

curl https://bootstrap.pypa.io/pip/2.7/get-pip.py --output get-pip.py

python2 get-pip.py

convert

python2 epub2splittxt.py a.epub

text to speech

edge-tts is built on the Microsoft Edge TTS service.

install edge-tts

pip3 install edge-tts

list available voices (optional)

edge-tts --list-voices

text to speech

edge-tts --voice ja-JP-NanamiNeural --text "$(cat xxx/output/chapter1.txt)" --write-media xxx/output/chapter1.mp3 --write-subtitles xxx/output/chapter1.vtt
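Running that command by hand for every chapter gets tedious, so the loop can be scripted. The sketch below wraps the same edge-tts command line shown above, using only the flags that appear there. The function names `edge_tts_command` and `convert_directory` are hypothetical, and the script assumes the edge-tts executable is on the PATH and that each chapter is a UTF-8 .txt file.

```python
import subprocess
from pathlib import Path
from typing import List


def edge_tts_command(txt_path: Path, voice: str = "ja-JP-NanamiNeural") -> List[str]:
    """Build the edge-tts invocation for one chapter text file."""
    text = txt_path.read_text(encoding="utf-8")
    return [
        "edge-tts",
        "--voice", voice,
        "--text", text,
        # Audio and subtitles are written next to the source text file.
        "--write-media", str(txt_path.with_suffix(".mp3")),
        "--write-subtitles", str(txt_path.with_suffix(".vtt")),
    ]


def convert_directory(directory: Path, voice: str = "ja-JP-NanamiNeural") -> None:
    """Run edge-tts once per .txt file, in sorted filename order."""
    for txt_path in sorted(directory.glob("*.txt")):
        # check=True stops the batch on the first failed chapter.
        subprocess.run(edge_tts_command(txt_path, voice), check=True)
```

Passing the arguments as a list avoids shell quoting problems with chapter text that contains quotes or dollar signs. Very long chapters may still exceed the operating system's argument length limit, in which case splitting the text first helps.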

Upgraded version

The existing tools are too much hassle to use, so I wrote an upgraded version.

clone Attt/epub2audiobook

git clone https://github.com/Attt/epub2audiobook

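features:

Rewritten in Python 3
Compatible with EPUB 3
Plain-text extraction
Phonetic annotations removed
Automatic language selection
Apple TTS support
Selectable input and output directories
Output audio carries ID3 tags
Cover extraction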
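To illustrate what automatic language selection can look like, here is a small sketch that maps a book's language tag (as found in the EPUB metadata) to an edge-tts voice. This is not taken from epub2audiobook: the function name `pick_voice`, the mapping table, and the English fallback are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical mapping from primary language subtag to an edge-tts voice name.
DEFAULT_VOICES = {
    "ja": "ja-JP-NanamiNeural",
    "en": "en-US-AriaNeural",
    "zh": "zh-CN-XiaoxiaoNeural",
}


def pick_voice(language_tag: Optional[str], fallback: str = "en-US-AriaNeural") -> str:
    """Choose a voice from a language tag such as 'ja', 'ja-JP' or 'zh_CN'."""
    if not language_tag:
        return fallback
    # Normalise 'zh_CN' to 'zh-CN' and keep only the primary subtag.
    primary = language_tag.strip().replace("_", "-").split("-")[0].lower()
    return DEFAULT_VOICES.get(primary, fallback)
```

The names actually available can be checked with `edge-tts --list-voices` before extending the table.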

conclusion

Honestly, Microsoft's TTS is seriously impressive. 🥸
