Modified "find_orphans.py" script to handle unicode
Posted: Mon Aug 15, 2016 5:38 pm
Not sure if this is the correct place to post this, but I recently ran into a Unicode issue while running the find_orphans.py script: https://www.mythtv.org/wiki/Find_orphans.py
I find this to be a handy tool now that I am building up a collection of recordings and want better organization within Kodi (i.e., using the Kodi library to sort by season & episode), which increases the chance of recordings getting out of sync with the database (especially when deleting from Kodi). I was testing and shuffling things around with mythicalLibrarian and ended up with a couple of abandoned files. One of them had a non-ASCII character in the subtitle field of the recorded table, which produced the following error:
Traceback (most recent call last):
  File "./find_orphans.py", line 230, in <module>
    main()
  File "./find_orphans.py", line 169, in main
    printrecs("Recordings with missing files", recs)
  File "./find_orphans.py", line 43, in printrecs
    rec.pprint()
  File "./find_orphans.py", line 38, in pprint
    print u' {0:<70}{1:>28}'.format(name,self.basename)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 73: ordinal not in range(128)
After realizing that one of the recordings had an offending character in the subtitle, I searched for how to add Unicode support to a Python script. FYI, I am not well versed in Python (I am more of a shell scripter), but I found it pretty easy to correct. Figured I'd post this in case anyone comes across a similar error while using the tool. The lines I added at the very beginning of the script are marked with a "+":
#!/usr/bin/env python2
+ # -*- coding: utf-8 -*-
from MythTV import MythDB, MythBE, Recorded, MythError
from socket import timeout
import os
+ import codecs
import sys
+ # PYTHONIOENCODING="UTF-8"
+ UTF8Writer = codecs.getwriter('utf8')
+ sys.stdout = UTF8Writer(sys.stdout)
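For anyone curious why the wrapper helps, here is a minimal sketch of what I believe is going on. As far as I understand it, Python 2 falls back to the ascii codec for print when stdout is piped or the locale does not advertise UTF-8, and the codecs writer encodes the text to UTF-8 before it reaches the stream. The sample title and file name below are made up for illustration, and the writer wraps an in-memory buffer instead of sys.stdout so the result can be inspected.

```python
# -*- coding: utf-8 -*-
import codecs
import io

# Hypothetical sample line (not from the real script's data);
# u'\xf1' is the character reported in the traceback.
line = u'  {0:<20}{1:>12}'.format(u'Ma\xf1ana', u'1234.mpg')

# Roughly what an ASCII-only stdout does implicitly: encoding fails
# on the non-ASCII character.
try:
    line.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Same kind of wrapper as in the modified script, but around an
# in-memory byte buffer rather than sys.stdout.
buf = io.BytesIO()
writer = codecs.getwriter('utf8')(buf)
writer.write(line)          # text is encoded to UTF-8 on the way out
encoded = buf.getvalue()    # raw bytes that would have reached stdout
```

An alternative that needs no change to the script should be to set the PYTHONIOENCODING environment variable to UTF-8 in the shell before running it, though I have only tested the codecs approach.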