Handling Non-ASCII Characters in Python Source

The Python interpreter may raise a SyntaxError when it encounters non-ASCII characters in a source file without an explicit encoding declaration. This issue commonly occurs when using comments written in languages like Korean (for example, # 메인 함수).

Consider the following module:

def main():
    print ("Main Function")

# Main function
# @see https://madplay.github.io/post/python-main-function
if __name__ == "__main__":
	main()

Executing this script produces the following error:

$ python taeng.py 
  File "taeng.py", line 4
SyntaxError: Non-ASCII character '\xeb' in file taeng.py on line 4,
but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

The resolution involves declaring the file’s character encoding. Adding the following comment at the beginning of the file instructs the interpreter to parse it as UTF-8.

This declaration must appear on either the first or second line of the source file to be effective.

# -*- coding: utf-8 -*-

def main():
    print ("Main Function")

# Main function
if __name__ == "__main__":
	main()