In this post, I will describe how to convert dictionaries
in ABBYY Lingvo’s format (.dsl) into
mobi dictionaries that can work with Kindle.
I will assume that you already have the appropriate .dsl files.
The first step is to make sure that
.dsl files use UTF-8 encoding.
You can check a file’s encoding from the command line.
If the reported encoding is anything other than
UTF-8 Unicode (with BOM),
you have to convert the files to UTF-8
first. You can use
iconv for this purpose.
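The same conversion can also be sketched in Python. This is a minimal sketch, not the method used in this post: the function name `to_utf8_with_bom` is made up for illustration, and it assumes that the input either carries a UTF-8 or UTF-16 byte order mark, or is already plain UTF-8 without one.

```python
import codecs


def to_utf8_with_bom(data: bytes) -> bytes:
    """Re-encode raw .dsl bytes as UTF-8 with a BOM, guessing the input encoding from its BOM."""
    if data.startswith(codecs.BOM_UTF8):
        # Already UTF-8 with a BOM: nothing to do.
        return data
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        text = data.decode("utf-16")
    else:
        # Assumption: no BOM means the bytes are already plain UTF-8.
        text = data.decode("utf-8")
    # "utf-8-sig" writes the UTF-8 BOM in front of the encoded text.
    return text.encode("utf-8-sig")
```

To convert a file, read it in binary mode, pass the bytes through the function, and write the result back in binary mode. A file in any other legacy encoding would fail to decode here and needs its encoding named explicitly.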
We need to make sure that the
.dsl files do not
contain metadata info (lines starting with
# at the
beginning of the file).
If you see lines starting with
# at the top of a file, please remove them.
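Removing those lines by hand is easy for one file, but for several files a small helper may be more convenient. The following is a minimal Python sketch; the function name `strip_metadata_header` is hypothetical, and it assumes the text has already been decoded (without a BOM) and that only the leading run of `#` lines counts as metadata.

```python
def strip_metadata_header(text: str) -> str:
    """Drop the leading run of lines that start with '#'; keep everything after it."""
    lines = text.splitlines(keepends=True)
    first_body = 0
    # Advance past header lines only; '#' lines later in the file are left alone.
    while first_body < len(lines) and lines[first_body].startswith("#"):
        first_body += 1
    return "".join(lines[first_body:])
```

Lines that contain `#` further down (for example inside an entry body) are not touched, because the loop stops at the first line that does not start with `#`.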
Next we need to grab the dsl2mobi script.
You need to have
ruby installed on your machine for the script
to work. I don’t like running someone else’s code
directly on my machine, so I ran the script inside a virtual machine
(which, for security reasons, I also recommend you do).
Now we can execute the script:
In many languages, the same word can occur in
different forms. For example,
in English the word “write” can occur in the forms wrote, written, and writes.
We want our dictionary to recognize all these variations,
and for this reason we need so-called wordforms.
Fortunately for us,
dsl2mobi comes with built-in
wordforms files for several languages.
If you want to create a dictionary from, e.g., Russian to
Polish, you need to use Russian wordforms (as in our example).
If you want to create a dictionary from English to Russian,
you would need to use English wordforms, etc.
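To make the idea of wordforms concrete, here is a small Python sketch of what they achieve: every inflected form points back to the headword under which the entry is stored. The data structure and the name `build_lookup` are purely illustrative and do not reflect the actual format of the dsl2mobi wordforms files.

```python
def build_lookup(wordforms: dict) -> dict:
    """Map every inflected form (and the headword itself) to its headword."""
    lookup = {}
    for headword, forms in wordforms.items():
        lookup[headword] = headword
        for form in forms:
            lookup[form] = headword
    return lookup


# Illustrative data only, taken from the English example above.
english_forms = {"write": ["wrote", "written", "writes"]}
lookup = build_lookup(english_forms)
```

With such a table, selecting “wrote” in a book leads to the entry for “write”, which is why the wordforms must match the source language of the dictionary.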
dsl2mobi should create at least two files in the
output-dir, one with an
.html extension (containing the
actual content) and one with an
.opf extension (containing metadata).
Next we need to
grab KindleGen from Amazon
to actually generate the .mobi file.
The -c2 option tells KindleGen to compress the dictionary.
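If you prefer to drive KindleGen from a script, the call can be wrapped in a few lines of Python. This is only a sketch: it assumes the `kindlegen` executable is on your PATH, and the helper names as well as the file name `dict.opf` in the usage note are hypothetical.

```python
import subprocess


def kindlegen_command(opf_path: str, executable: str = "kindlegen") -> list:
    """Build the KindleGen command line with -c2 compression enabled."""
    return [executable, opf_path, "-c2"]


def run_kindlegen(opf_path: str) -> int:
    """Run KindleGen and return its exit code without raising on a non-zero code."""
    return subprocess.run(kindlegen_command(opf_path)).returncode
```

Calling `run_kindlegen("dict.opf")` would then start the conversion and hand back the exit code once KindleGen finishes.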
Unfortunately, in my case
kindlegen did not want to accept the
.opf file generated by dsl2mobi.
To make it work, I needed to edit my
.opf file to:
Also make sure that the
tags have proper values, otherwise your dictionary may not work.
I also had to change the beginning of the
dict.html file to:
After these changes I was able to generate a
.mobi file that
worked perfectly with my Kindle.
If your dictionary is really huge (the
.html file is bigger than 20MB),
KindleGen may either take a lot of time (a few hours) or
it may not finish at all.
In this case I advise you to split the single
.html file
into three or four smaller files (each should be less than 20MB),
and then to add them as “chapters” to the
.opf file.
You can use
tail for the splitting.
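As an alternative to splitting by hand, the split can be sketched in Python so that new parts only begin at entry boundaries. This is a minimal sketch under stated assumptions: the function name `split_entries` is hypothetical, the input is only the entries portion of the file (without the surrounding head and closing tags), and every entry starts on its own line with an `<a` tag.

```python
def split_entries(body: str, max_bytes: int, entry_prefix: str = "<a") -> list:
    """Split dictionary HTML into parts of at most max_bytes without cutting an entry."""
    # Group lines into entries: a line starting with entry_prefix opens a new entry.
    entries = []
    for line in body.splitlines(keepends=True):
        if line.startswith(entry_prefix) or not entries:
            entries.append(line)
        else:
            entries[-1] += line

    # Pack whole entries into parts, starting a new part when the limit would be exceeded.
    parts, current, size = [], [], 0
    for entry in entries:
        entry_size = len(entry.encode("utf-8"))
        if current and size + entry_size > max_bytes:
            parts.append("".join(current))
            current, size = [], 0
        current.append(entry)
        size += entry_size
    if current:
        parts.append("".join(current))
    return parts
```

For the 20MB limit mentioned above you would call something like `split_entries(body, 20 * 1024 * 1024)`. A part can only exceed the limit if a single entry is larger than the limit on its own. Each part still needs its own head section and closing tags before it is handed to KindleGen.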
Then you have to use
vim or another editor to make sure that
all files have proper
<head> sections and
are properly closed.
You will also have to make sure that dictionary
entries are not split across the files.
They are quite easy to recognize, as they
usually start with an
<a> tag.
KindleGen needed around one hour to convert an 80MB dictionary split into four parts, so be prepared to wait a bit.