The function accepts the following parameters, shown with their default values:

    video_path: str, file_path='subtitle.srt', lang='eng', time_start='0:00', time_end='',
    conf_threshold=65, sim_threshold=90, use_fullframe=False

lang: The language of the subtitles. You can extract subtitles in almost any language. Language codes (e.g. 'eng' for English) and all script names in this repository (e.g. 'HanS' for simplified Chinese) are supported. Note that you can use more than one language, e.g. lang='hin+eng' for Hindi and English together. Language files will be automatically downloaded to your ~/tessdata. You can read more about Tesseract language data files on their wiki page.

conf_threshold: Confidence threshold for word predictions. Words with lower confidence than this value will be discarded. The default value 65 is fine for most cases. Make it closer to 0 if you get too few words in each line, or make it closer to 100 if there are too many excess words in each line.

sim_threshold: Similarity threshold for subtitle lines. Subtitle lines with larger Levenshtein ratios than this threshold will be merged together. The default value 90 is fine for most cases. Make it closer to 0 if you get too many duplicated subtitle lines, or make it closer to 100 if you get too few subtitle lines.

time_start and time_end: Extract subtitles from only a clip of the video.
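To make the two thresholds more concrete, here is a minimal, self-contained sketch of how a confidence filter and a similarity-based merge could behave. It is not the library's actual code. The helper names (levenshtein_distance, similarity, filter_words, merge_similar) are hypothetical, and the 0-100 similarity score uses one common normalization of edit distance, which may differ from the exact Levenshtein ratio the library computes.

```python
from typing import List, Tuple


def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity on a 0-100 scale (assumed normalization, see lead-in)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein_distance(a, b) / longest)


def filter_words(words: List[Tuple[str, float]],
                 conf_threshold: float = 65) -> str:
    """Discard words whose confidence is lower than conf_threshold."""
    return ' '.join(word for word, conf in words if conf >= conf_threshold)


def merge_similar(lines: List[str], sim_threshold: float = 90) -> List[str]:
    """Merge consecutive lines whose similarity is larger than sim_threshold."""
    merged: List[str] = []
    for line in lines:
        if merged and similarity(merged[-1], line) > sim_threshold:
            continue  # close enough to the previous line: treat as a duplicate
        merged.append(line)
    return merged


if __name__ == '__main__':
    # Hypothetical OCR output: (word, confidence) pairs for one frame.
    print(filter_words([('Hello', 96.0), ('wor1d', 40.0), ('there', 65.0)]))

    # Hypothetical per-frame subtitle lines with a small OCR variation.
    frames = ['Hello there, friend', 'Hello there, friend',
              'Hello there. friend', 'Goodbye']
    print(merge_similar(frames, sim_threshold=90))
    print(merge_similar(frames, sim_threshold=99))
```

In this sketch, raising sim_threshold toward 100 keeps near-identical lines separate, which gives more (and more duplicated) subtitle lines. Lowering it merges more aggressively, which matches the tuning advice above.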