Converting Greek texts of different encodings to UTF-8


There are at least two "ways" (called "encodings") to write Greek with a computer: the older one, called "iso-8859-7", and the newer one, called "UTF-8". To use your TeX files with modern engines such as xelatex or lualatex, your Greek must be in UTF-8, so old files will not work unless they are converted.
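To make the difference between the two encodings concrete, here is a small Python sketch (Python is an assumption of this example, not something the conversion requires). It encodes the same Greek word both ways and shows that the old bytes are not valid UTF-8, which is why a UTF-8 engine cannot read an unconverted file. The helper name is_valid_utf8 is made up for illustration.

```python
# The word "alpha" in Greek, written with escapes so this file is encoding-safe.
word = "\u03ac\u03bb\u03c6\u03b1"

legacy = word.encode("iso8859_7")  # old encoding: one byte per Greek letter
modern = word.encode("utf-8")      # UTF-8: two bytes per Greek letter

print(len(legacy), legacy.hex(" "))
print(len(modern), modern.hex(" "))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(legacy))  # the old bytes are rejected by a UTF-8 decoder
print(is_valid_utf8(modern))
```

Note that a file containing only plain ASCII (no Greek at all) is valid in both encodings, so a check like this only tells you something when the file actually contains Greek text.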

The case of Monotonic Greek


If you are on Linux and use monotonic Greek, the conversion is easy. Execute this command:

iconv -f iso8859-7 -t UTF-8 myoldfile.tex > mynewfile.tex

(I did not forget a dash in the above: iso8859-7 is correct.)

However, if you are on MS-Windows, or if you use polytonic Greek, the conversion is not so easy. The following method works for monotonic Greek on all platforms:

Open your old file in TeXworks, which is installed with TeX Live. For installation, check the TeX Live documentation or, for a guide in Greek, check the links on this page.

At this point you will not see proper Greek in TeXworks. However, the bottom line of the TeXworks window, next to its frame, has a box saying "UTF-8". Click on it and change the encoding to "ISO-8859-7".

You will still not see proper Greek, because the file has to be reloaded. Click again on the same box (which will now say ISO-8859-7) and this time choose "Reload using selected encoding". Say Yes to the warning window that follows.

Now you should see Greek in TeXworks. The file itself, though, is still in ISO-8859-7. Click again on the same box and change ISO-8859-7 back to UTF-8. This time, instead of reloading the file as before, save it! Your file will now be in UTF-8 and can be used with xelatex or lualatex. Do not forget to remove lines such as \usepackage[iso88597]{inputenc} and the babel commands, and to add lines such as

\usepackage[default]{fontsetup}
\usepackage{xgreek} % if the main language of the document is Greek.
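For readers who have Python available, the same monotonic re-encoding can also be sketched in a few lines, as a cross-platform alternative to iconv and to the editor steps above. This is only a minimal sketch: the function name convert_to_utf8 is invented here, and it assumes that the input file really is in ISO-8859-7. It does not touch the preamble, so the \usepackage lines still have to be edited by hand.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="iso8859_7"):
    """Re-encode the file src as UTF-8 and write the result to dst.

    Assumption: src is encoded in source_encoding (ISO-8859-7 by default).
    Bytes are read and written directly so that line endings are preserved.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

A call such as convert_to_utf8("myoldfile.tex", "mynewfile.tex") mirrors the iconv command shown earlier. Writing to a new file rather than overwriting the old one keeps the original as a backup.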



The case of Polytonic Greek


Now let's look at the polytonic case. The problem here is that you may have converted ά to proper UTF-8, but your document still uses conventions for the polytonic marks, such as >ά| for alpha with psili, oxia and ypogegrammeni. These extra characters are not part of any special encoding; they are TeX conventions. How do we proceed?

If you are on Linux and have many files to convert, then the command line is your friend. Check this page and get the script that will go directly from iso8859-7 to UTF-8. You SHOULD NOT use the TeXworks method described above. The script will do everything for you.

If you are on MS-Windows, or you have just a couple of files, convert them to UTF-8 using TeXworks as described above, and then use this page to do the final conversions for the breathing marks, ypogegrammeni, bareia, and perispomeni.
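To illustrate what these final conversions involve, here is a rough Python sketch that turns a sequence such as >ά| into the single Unicode character for alpha with psili, oxia and ypogegrammeni. It is not the script or the page mentioned above, and it makes several assumptions: that > and < stand for psili and dasia, that ', ` and ~ stand for oxia, bareia and perispomeni, and that | after a letter stands for ypogegrammeni. The names convert_polytonic and compose are invented for this example. It will also misfire on text where these characters mean something else (for instance > in math mode), so treat it as a demonstration of the idea only.

```python
import re
import unicodedata

# Assumed TeX conventions mapped to Unicode combining marks.
BREATHINGS = {">": "\u0313", "<": "\u0314"}              # psili, dasia
ACCENTS = {"'": "\u0301", "`": "\u0300", "~": "\u0342"}  # oxia, bareia, perispomeni
YPOGEGRAMMENI = "\u0345"

# Optional breathing, optional accent, one Greek letter, optional "|".
PATTERN = re.compile(r"([<>]?)(['`~]?)([\u0386\u0388-\u03ce])(\|?)")


def compose(match):
    breathing, accent, letter, iota = match.groups()
    if not (breathing or accent or iota):
        return match.group(0)  # plain letter: leave it untouched
    # Split a letter such as alpha-with-tonos into base letter + combining acute.
    decomposed = unicodedata.normalize("NFD", letter)
    marks = BREATHINGS.get(breathing, "") + ACCENTS.get(accent, "") + decomposed[1:]
    if iota:
        marks += YPOGEGRAMMENI
    # Recompose into one precomposed polytonic character where Unicode has one.
    return unicodedata.normalize("NFC", decomposed[0] + marks)


def convert_polytonic(text):
    """Replace the assumed TeX mark conventions with precomposed characters."""
    return PATTERN.sub(compose, text)


# ">" + alpha with tonos + "|" becomes one character (U+1F84).
print(convert_polytonic(">\u03ac|"))
```

The breathing mark is placed before the accent on purpose: Unicode normalization only recombines the marks into one character when they appear in that order.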


(C) Ch. Kornaros, A. Tsolomitis. March 2022.