Really bad closed caption on some recordings.

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Really bad closed caption on some recordings.

Post by mathog » Sat Feb 27, 2021 1:03 am

My wife likes to watch Spanish-language telenovelas. There is a lot of slang in these, so she turned on closed captioning so she could see a word and then look it up. When playing "Buscando a Frida", a Telemundo show, the resulting text is virtually unusable: it contained maybe 20% of the text, and often just parts of a word. Then we watched one of these shows live on the TV and turned on closed captioning there, and it was much better, with nearly 100% coverage of the spoken text. Spanish is not a very concise language, and we think the few pieces still missing in that case may just be the captioner dropping things on purpose so that the CC can keep up with the scene.

Today we watched a recording of "The Mallorca Files", which is in English but is a British, or at least European, production. With closed captioning turned on, many letters were missing there too. It was still possible to read it, mostly, but "Get" might show up as "et". We have not yet had a chance to watch one of these live on the TV with CC to see if that works better.

Closed captioning in US shows, like "Grey's Anatomy" or "Magnum PI" (the current one), always works fine in MythTV recordings.

This is MythTV 30.0 on Ubuntu 18.04.4 LTS.

I have a sneaking suspicion that the ones which are broken may be using UTF-8 and the ones which work are just plain ASCII. In any case, has anybody else seen this issue or know of a way to make it work better?
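For reference on the encoding guess: as far as I know, EIA-608 captions are neither UTF-8 nor plain ASCII. Each byte carries a 7-bit code plus a parity bit, and the basic character set replaces a handful of ASCII positions with accented letters. The sketch below is only an illustration of that idea. The function name is made up, the override table is my reading of the published 608 basic character table, and it only handles runs of basic characters (real streams also contain two-byte control and special-character codes that this would misread).

```python
# Positions where the CEA-608 basic character set differs from ASCII
# (assumption: taken from the commonly published 608 basic character table).
CEA608_OVERRIDES = {
    0x2A: "á",
    0x5C: "é",
    0x5E: "í",
    0x5F: "ó",
    0x60: "ú",
    0x7B: "ç",
    0x7C: "÷",
    0x7D: "Ñ",
    0x7E: "ñ",
    0x7F: "█",
}


def decode_608_basic(data: bytes) -> str:
    """Decode basic single-byte 608 characters (hypothetical helper)."""
    out = []
    for b in data:
        code = b & 0x7F  # drop the parity bit (bit 7)
        if code < 0x20:
            continue  # padding and control-code lead bytes are skipped in this sketch
        out.append(CEA608_OVERRIDES.get(code, chr(code)))
    return "".join(out)


print(decode_608_basic(bytes([0x6D, 0x61, 0x7E, 0x61, 0x6E, 0x61])))
```

A decoder that treated these bytes as plain ASCII would show "~" where the broadcaster meant "ñ", which is a different symptom from whole letters going missing.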

Thanks.

gedakc
Junior
Posts: 50
Joined: Fri Jul 18, 2014 1:28 am
Canada

Re: Really bad closed caption on some recordings.

Post by gedakc » Sat Feb 27, 2021 4:19 pm

If there was poor signal strength during the recording, I've found that it adversely affects Closed Captions (CC).

Pre-recorded shows seem to have the best CC, with live feeds being hit and miss (probably because it is difficult to type in CC in real time).

I've used both MythTV and Kodi with the MythTV PVR Client add-on to view recordings. MythTV is vastly superior in its ability to parse and display CC.

With Kodi one can find and link an independent subtitle/CC file (.srt extension) to a recording in an effort to have a better CC experience. The downside is that the CC can more easily get out of sync due to commercials, so I then have to spend more time advancing or delaying the CC so that it more closely matches the spoken words.

bill6502
Developer
Posts: 1870
Joined: Fri Feb 07, 2014 5:28 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by bill6502 » Sat Feb 27, 2021 7:45 pm

Not having the issue with master (v32-Pre). It may have been fixed in v31 as well. I can't test it.
I tested on Telemundo and Univision (I know that's spelled wrong) and while I can't
read Spanish, there were no odd looking words.

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by mathog » Sat Feb 27, 2021 11:44 pm

Let's see, it looks like it is possible to update v30 to v31 on 18.04.4 LTS using a different ppa. Otherwise the whole machine could be updated to 20.04 LTS, which also brings in v31, but that would be much more work (and would probably break LIRC, again.)

bill6502
Developer
Posts: 1870
Joined: Fri Feb 07, 2014 5:28 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by bill6502 » Sat Feb 27, 2021 11:51 pm

I have v31 running on 18.04, but build from source.

It could be a difficult test just to see if captions are better
for you.

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by mathog » Sun Feb 28, 2021 7:15 am

bill6502 wrote:
Sat Feb 27, 2021 11:51 pm
I have v31 running on 18.04, but build from source.

It could be a difficult test just to see if captions are better
for you.

You lost me in that last sentence. What would be difficult, building from source or determining if the captions are bad?

kmdewaal
Developer
Posts: 288
Joined: Wed Dec 07, 2016 8:01 pm
Netherlands

Re: Really bad closed caption on some recordings.

Post by kmdewaal » Sun Feb 28, 2021 8:27 am

Closed captioning is part of the recording. If you can post a significant part of a recording that shows the problem, e.g. 5 minutes or so, then I (or Bill or anybody else) can check if the playback is correct with a v32 system.
N.B. You cannot post a big file to this forum so it must be done in another way, e.g. with a gmail link, wetransfer or so.
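One way to cut a small sample from the start of a recording is to copy whole transport stream packets. This is a minimal sketch, assuming the recording is a plain MPEG-TS file with 188-byte packets that starts on a packet boundary. The function name and paths are made up for illustration, and it cuts by size rather than by playing time.

```python
TS_PACKET_SIZE = 188  # MPEG transport stream packets are 188 bytes long
SYNC_BYTE = 0x47      # every packet starts with this sync byte


def copy_ts_prefix(src_path: str, dst_path: str, max_bytes: int) -> int:
    """Copy whole packets from the start of src_path into dst_path.

    Stops before exceeding max_bytes, at end of file, or at the first
    packet that does not begin with the sync byte.
    Returns the number of packets written.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while (written + 1) * TS_PACKET_SIZE <= max_bytes:
            packet = src.read(TS_PACKET_SIZE)
            if len(packet) < TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
                break
            dst.write(packet)
            written += 1
    return written
```

For example, `copy_ts_prefix("recording.ts", "sample.ts", 200 * 1024 * 1024)` would write at most about 200 MB. How many minutes that covers depends on the bitrate of the channel.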

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by mathog » Mon Mar 01, 2021 5:48 am

We will try to make some short examples tomorrow. Thanks.

bill6502
Developer
Posts: 1870
Joined: Fri Feb 07, 2014 5:28 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by bill6502 » Mon Mar 01, 2021 11:14 pm

@ mathog, I was just pointing out that upgrading to v31 isn't guaranteed to
fix the issue.

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by mathog » Wed Mar 03, 2021 1:23 am

Here:

https://drive.google.com/drive/folders/ ... sp=sharing

there are two files "frida.ts" and "loli.ts". Both have badly messed up closed captioning when played in mythtv 30.0. They both have correct, or at least largely correct closed captioning when that is turned on in vlc 3.0.8 on Ubuntu 18.04.4 LTS.

Note, these samples will be removed in a week or so, since I think this sort of snippet used for debugging is fair use, but I really don't want to have to argue that point in court against Telemundo's lawyers!

Some glitch points (besides the general absence of captioning, which is everywhere)

Frida:
1:39 "s" above line "Mentiras, es to oportunidad"
1:45 blue rectangle above "para siempre"
2:26 single "A" (with accent over it)

Loli
2:42 empty blue rectangle above "Esta pasando"

I will have to wait for another episode of "All creatures great and small" to sample that. It isn't a general UK production thing since "Sense and Sensibility" had no CC issues.

Thanks

kmdewaal
Developer
Posts: 288
Joined: Wed Dec 07, 2016 8:01 pm
Netherlands

Re: Really bad closed caption on some recordings.

Post by kmdewaal » Wed Mar 03, 2021 9:55 am

Playing this back with the latest master gives interesting results.
The menu "Playback menu" / "Subtitles" shows three choices: "Enable Subtitles" / "Select ATSC CC" / "Select VBI CC".
Selecting "Enable Subtitles" does select the first one which is the "ATSC CC".
If I do "Select ATSC CC" / "CC1 English" then I have the same issues that you describe.
But if I do "Select VBI CC" / "CC1 English" then I get correct subtitling in Spanish.
I think that this just might work for you also.

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by mathog » Wed Mar 03, 2021 8:32 pm

I will try it tonight. Odd that it says "CC1 English" even when the subtitles are in another language. One would guess that somewhere in the CC data there is a field which actually says which language is present, because I think some content has CC available in more than one language. "CC1 English" is maybe "CC1 Primary"?

What do "ATSC CC" and "VBI CC" mean in this context? CEA-708 vs. EIA-608? As if an end user would/should have any idea what those are!

One other glitch I have seen recently on closed captioning for English language programs - a "?" at the end of each of the first N-1 lines on an N line subtitle. Most recently we saw this with the Fox network show "The Resident". We don't have any of those saved at the moment to post an example. I'm guessing though that it probably indicates a line wrap character which is being looked up for a glyph, and since there is no glyph for it, it is filled with "?".

Thanks.

kmdewaal
Developer
Posts: 288
Joined: Wed Dec 07, 2016 8:01 pm
Netherlands

Re: Really bad closed caption on some recordings.

Post by kmdewaal » Thu Mar 04, 2021 5:54 pm

Of course, even when it is possible for you to get correct Spanish subtitling that does not make it good.

On screen, loli.ts says that CC1 is Espanol and that CC3 is English. This corresponds with VLC, which shows four subtitle options; 1 is indeed Spanish and 3 is indeed English.
If I understand Wikipedia correctly, an ATSC broadcast should carry CC608 inside the CC708 data, and there should also be native CC708.
The CC608 is indeed the equivalent of VBI CC (vertical blanking interval, the period in which the CC data was transmitted in the analog TV days).
This means that the menu entries shown by MythTV are also not correct.
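To make the "608 carried in 708" point concrete: as I understand the ATSC caption transport, the caption bytes travel in the video user data as 3-byte constructs, where the first byte holds a valid flag and a 2-bit type and the other two bytes are the payload. Types 0 and 1 are the legacy 608 byte pairs for field 1 (CC1/CC2) and field 2 (CC3/CC4); types 2 and 3 carry native 708 data. The sketch below only classifies triplets that have already been extracted. The function name is made up, and pulling the triplets out of the video stream is not shown.

```python
from typing import List, Tuple

# Meaning of the 2-bit cc_type field (assumption: per the ATSC/CEA-708
# caption transport as commonly documented).
CC_TYPE_NAMES = {
    0: "608 field 1 (CC1/CC2)",
    1: "608 field 2 (CC3/CC4)",
    2: "708 DTVCC packet data",
    3: "708 DTVCC packet start",
}


def classify_cc_triplets(cc_data: bytes) -> List[Tuple[bool, str, bytes]]:
    """Split a run of 3-byte caption constructs into (valid, kind, payload)."""
    result = []
    usable = len(cc_data) - len(cc_data) % 3  # ignore a trailing partial triplet
    for i in range(0, usable, 3):
        header = cc_data[i]
        cc_valid = bool(header & 0x04)  # bit 2 of the header byte
        cc_type = header & 0x03         # bits 1..0 of the header byte
        payload = cc_data[i + 1:i + 3]
        result.append((cc_valid, CC_TYPE_NAMES[cc_type], payload))
    return result


sample = bytes([0xFC, 0x94, 0x20, 0xFD, 0x80, 0x80, 0xFA, 0x00, 0x00])
for valid, kind, payload in classify_cc_triplets(sample):
    print(valid, kind, payload.hex())
```

If a recording only ever contains valid triplets of types 0 and 1, there would be no native 708 captions in it at all, only the embedded 608 ones.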

It is true that MythTV is not (yet) perfect but also the transmissions by broadcasters are not always perfect.
When mythfrontend is started with "-v vbi" then lots of parity errors show up when playing the frida.ts file.
However, vlc shows both Spanish and English subtitles correctly, so there is room for improvement.
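About those parity errors: each 608 byte is a 7-bit code with an odd-parity bit on top, so a valid byte always has an odd number of 1 bits. Here is a minimal sketch of that check. The helper names are made up, and this is not the MythTV code. A decoder that drops every byte failing the check would produce exactly the "et" instead of "Get" kind of output, but whether that is what happens here is an assumption on my part.

```python
from typing import Optional, Tuple


def has_odd_parity(byte: int) -> bool:
    """True if the 8-bit value contains an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def check_608_pair(b1: int, b2: int) -> Tuple[Optional[int], Optional[int]]:
    """Return the two 7-bit codes; a byte that fails odd parity becomes None."""
    return tuple((b & 0x7F) if has_odd_parity(b) else None for b in (b1, b2))


# "G" and "e" with correct parity bits, then "G" with its parity bit lost.
print(check_608_pair(0xC7, 0xE5))
print(check_608_pair(0x47, 0xE5))
```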

Because you are having the problems with mythtv v30, this is not a recent regression. That is unfortunate, because a recent regression would be easy to find.

mathog
Senior
Posts: 107
Joined: Thu Dec 15, 2016 5:22 pm
United States of America

Re: Really bad closed caption on some recordings.

Post by mathog » Thu Mar 04, 2021 8:58 pm

vlc analysis of the loli.ts example under Tools -> Media Information -> Codec tab shows 5 streams, with the last 4 being CC info, all as:

Codec: EIA-608 subtitles (c608)
Description: Closed captions # [# is one of 1, 2, 3, 4]
Type: Subtitle

If there is CC708 in there, vlc isn't seeing it either.

Do you want me to file a bug report (and where)? Or since you are a developer would you prefer to do it?

kmdewaal
Developer
Posts: 288
Joined: Wed Dec 07, 2016 8:01 pm
Netherlands

Re: Really bad closed caption on some recordings.

Post by kmdewaal » Fri Mar 05, 2021 5:36 pm

I intend to create a Github ticket (fyi: on https://github.com/MythTV/mythtv/issues) after first doing some more investigation.
Thanks for the report and for the streams; that really helps a lot.
I hope that you can live with the VBI/CC608 subtitles for now.
