Python Help

In an effort to learn Python and make something useful for myself, I'm trying to create a project that semi-automates checking for adverts in TV programs I record on Tvheadend (yes, I know Tvh has a comskip plugin, but where's the fun in that!).


I've got the basic program working so it checks the recording folder, looks to see if the recording has already been processed (gets added to a text file when it has been), then runs comskip if it's a new program.


I'm having problems with two functions I'm trying to create:

Checking to see if the recordings that have already been processed and added to the text file still exist on disk. If not, I want to remove the line from the text file.
Checking to see if a recording is currently active, so I can ignore it until it's finished.


The relevant code bits I've got so far (which don't work):

Python:
import os
import fnmatch
import subprocess
from time import sleep

RECORDINGS = "PATH/TO/RECORDINGS"
VIDEOS = "OUTPUT FOLDER"
COMSKIP = "comskip"
HD_PROGRAMS = " HD "
ARG1 = "--ts"
ARG2 = "--quiet"
ARG3 = "--vdpau"
ARG4 = "--ini=comskip.ini"
ARG5 = "--output=OUTPUT FOLDER"
PROCESSED_FILES = "processed.txt"


def check_processed():
    """Checks to see if we have already processed the recording and added it to processed.txt"""
    try:
        with open(PROCESSED_FILES, "r") as processed_files:
            file_check = processed_files.readlines()
    except FileNotFoundError:
        with open(PROCESSED_FILES, "w+") as processed_files:
            file_check = processed_files.readlines()
    else:
        for entry in file_check:
            if str(file) in entry:
                return True  # The string is found
        return False  # The string does not exist in the file


def processed_exist_check():
    file = [f for f in os.listdir(RECORDINGS) if fnmatch.fnmatch(f, '*.ts')]
    for f in file:
        with open(PROCESSED_FILES, "r") as pf:
            lines = pf.readlines()
            for line in lines:
                if line not in f:
                    print(f"Can't find {line}")
                else:
                    print(f"Found {line}")

# This function keeps failing saying it can't find the files.


def delete_extra_files():
    """Deletes all the extra files comskip creates when scanning for adverts"""
    filename = os.path.splitext(file)[0]
    os.remove(f"{VIDEOS}{filename}.txt")
    os.remove(f"{VIDEOS}{filename}.edl")
    os.remove(f"{VIDEOS}{filename}.log")


def file_size_check():
    for video in os.listdir(RECORDINGS):
        if fnmatch.fnmatch(video, '*.ts'):
            size_of_file = [
                (video, os.stat(os.path.join(RECORDINGS, video)).st_size)
            ]
            # This just converts the file into MB, and is unnecessary for this project, but nice to know
            # print(size_of_file)
            # for f, s in size_of_file:
            #     print("{} : {}MB".format(f, round(s / (1024 * 1024), 3)))
            return size_of_file[0][1]  # This doesn't work properly, with or without the [0][1]
        else:
            pass


def is_recording():
    # Doesn't work
    print("Checking file size")
    first_check = file_size_check()
    print(first_check)
    sleep(5)
    print("Rechecking file size to ensure it's not still recording")
    second_check = file_size_check()
    print(second_check)
    if second_check > first_check:
        return True
    else:
        return False

# This function only ever seems to return the file size for the same file over and over so never skips files that are actively recording


# Get the video filenames and see if we've already processed them

processed_exist_check()

for file in os.listdir(RECORDINGS):
    if fnmatch.fnmatch(file, '*.ts'):
        if check_processed():
            pass
        elif is_recording():
            pass
        else:
            print(f"Processing {file}")
            result = subprocess.run([COMSKIP, ARG1, ARG2, ARG3, ARG4, ARG5, f"{RECORDINGS}{file}"])
            with open(PROCESSED_FILES, "a") as completed:
                completed.write(f"{file}\n")
                delete_extra_files()



Any ideas? I've spent the last day searching online and trying different approaches, but it's clear I'm doing something wrong, as neither of these functions works properly.
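
One likely reason is_recording() always reports the same size: file_size_check() loops over the whole folder but hits return on the first .ts file it finds, so both checks measure that one file regardless of which recording the main loop is looking at. A minimal sketch of the idea, assuming the filename is passed in from the main loop (the directory and delay parameters are my additions, not part of the original code):

```python
import os
from time import sleep


def file_size_check(directory, video):
    """Return the size in bytes of one named recording."""
    return os.stat(os.path.join(directory, video)).st_size


def is_recording(directory, video, delay=5):
    """Return True if the file grew while we waited, so it is probably still being written."""
    first_check = file_size_check(directory, video)
    sleep(delay)
    second_check = file_size_check(directory, video)
    return second_check > first_check
```

In the main loop this would be called as is_recording(RECORDINGS, file). Growth over a few seconds is only a heuristic, so a stalled recording could still slip through.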
 
Thanks for that. The file size check is now working with:

Python:
def file_size_check(video):
    return os.stat(os.path.join(RECORDINGS, video)).st_size

I still can't get processed_exist_check() to identify files that are no longer on disk and remove them from the text file, as it always says it can't find the file.
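
A probable cause, assuming processed.txt holds one filename per line: readlines() keeps the trailing "\n" on every line, and `line not in f` is a substring test against a single filename, so "show.ts\n" is never found inside "show.ts". A rough sketch of an exact-match comparison instead (find_missing and its parameters are hypothetical names for illustration):

```python
import fnmatch
import os


def find_missing(processed_path, recordings_dir):
    """Return processed entries whose .ts file is no longer in recordings_dir."""
    # Collect the recordings currently on disk for exact-match lookups.
    on_disk = {f for f in os.listdir(recordings_dir) if fnmatch.fnmatch(f, "*.ts")}
    with open(processed_path, "r") as pf:
        # strip() drops the trailing newline; blank lines are skipped.
        entries = [line.strip() for line in pf if line.strip()]
    return [name for name in entries if name not in on_disk]
```

Comparing the whole stripped name against a set of names avoids both the newline problem and accidental partial matches.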
 
I've been playing around and have got a little further:

Python:
def processed_exist_check():
    file = [f for f in os.listdir(RECORDINGS) if fnmatch.fnmatch(f, '*.ts')]
    on_disk = ""
    with open(PROCESSED_FILES, "r") as pf:
        lines = pf.readlines()
        on_disk = file
        print(on_disk)
        for line in lines:
            no_newline = line.strip("\n")
            if no_newline in on_disk:
                print(f"FOUND {no_newline}")
            else:
                print(f"{no_newline} not found")

This seems to report the correct results, but when I try to remove the line from the text file with:

Python:
def processed_exist_check():
    file = [f for f in os.listdir(RECORDINGS) if fnmatch.fnmatch(f, '*.ts')]
    on_disk = ""
    with open(PROCESSED_FILES, "r") as pf:
        lines = pf.readlines()
        on_disk = file
        print(on_disk)
        for line in lines:
            no_newline = line.strip("\n")
            if no_newline in on_disk:
                continue
            else:
                with open(PROCESSED_FILES, "w") as pf:
                    pf.write(line)

It ends up removing everything except the last entry. I think I need another for loop in the else: section, but my attempts there don't seem to be working either, as I end up with just the entire contents rewritten multiple times.
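
What seems to be happening: opening a file in "w" mode truncates it to empty, so every pass through the else: branch wipes whatever the previous pass wrote, and only the last write survives. The branch also writes the lines that are not on disk rather than the ones worth keeping. A small sketch of deciding first and writing once (keep_existing and rewrite_processed are illustrative names, and on_disk is assumed to be a list or set of filenames):

```python
def keep_existing(lines, on_disk):
    """Return only the lines whose filename (minus the newline) is in on_disk."""
    return [line for line in lines if line.strip("\n") in on_disk]


def rewrite_processed(processed_path, on_disk):
    """Rewrite the processed list in a single write, keeping entries still on disk."""
    with open(processed_path, "r") as pf:
        lines = pf.readlines()
    kept = keep_existing(lines, on_disk)
    # Open in "w" mode exactly once: each open truncates the file to empty.
    with open(processed_path, "w") as pf:
        pf.writelines(kept)
    return kept
```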
 
More minor edits:

Python:
def processed_exist_check():
    file = [f for f in os.listdir(RECORDINGS) if fnmatch.fnmatch(f, '*.ts')]
    on_disk = ""
    with open(PROCESSED_FILES, "r") as pf:
        lines = pf.readlines()
        on_disk = file
        # print(on_disk)
        for line in lines:
            if line.strip("\n") in on_disk:
                continue
            else:
                with open(PROCESSED_FILES, "w") as pf:
                    if line != on_disk:
                        print(f"deleting {line}")
                        pf.write(line)

The printout suggests it's deleting the correct lines, but again all it does is delete everything except the final line in the text file.
 
Thanks again. I'm really struggling with this bit. I've tried making two methods to compare, but I'm clearly not getting the syntax right when trying to call it later to get it to actually work:

Python:
def on_disk():
    for vid in os.listdir(RECORDINGS):
        if fnmatch.fnmatch(vid, "*.ts"):
            return vid


def in_file():
    with open(PROCESSED_FILES, "r") as pf:
        lines = pf.readlines()
        for line in lines:
            return line
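
The catch with both of these, as far as I can tell, is that return inside a for loop exits the function on the first iteration, so each one only ever hands back a single item. A sketch that returns the full collections instead (the function names and parameters here are hypothetical, chosen to avoid clashing with the ones above):

```python
import fnmatch
import os


def recordings_on_disk(recordings_dir):
    """Return every .ts filename in the folder, not just the first."""
    return [vid for vid in os.listdir(recordings_dir) if fnmatch.fnmatch(vid, "*.ts")]


def entries_in_file(processed_path):
    """Return every non-empty line of the processed list, without newlines."""
    with open(processed_path, "r") as pf:
        return [line.strip("\n") for line in pf if line.strip()]
```

The entries that have gone missing would then be set(entries_in_file(...)) - set(recordings_on_disk(...)).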

I also started looking at trying to enumerate the lines, and can correctly print out a list of the lines where the file still exists on disk, but once again can't get it to actually delete the lines that no longer exist:

Python:
def processed_exist_check():
    vid = [f for f in os.listdir(RECORDINGS) if fnmatch.fnmatch(f, '*.ts')]
    on_disk = vid
    with open(PROCESSED_FILES, "r") as pf:
        lines = pf.readlines()
        for number, line in enumerate(lines):
            proc_line = line.strip("\n")
            if proc_line in on_disk:
                print(number, proc_line)

I'm sure it shouldn't be this difficult but it's currently baffling me, though that's not necessarily saying much :)
 
I still couldn't get the code to work yesterday when I tried but, with some of your modifications, along with a spark of inspiration in the early hours of this morning, I think I may now have fixed it:

Python:
def processed_exist_check():
    """Check to see if the processed files still exist on the disk; delete them from the list if not"""
    with open(PROCESSED_FILES, "r") as pf:
        lines = pf.readlines()
        with open(PROCESSED_FILES, "w") as removing:
            for number, line in enumerate(lines):
                proc_line = line.strip("\n")
                if on_disk(proc_line):
                    temp_num = number
                    if number in [temp_num]:
                        removing.write(line)

I've tested this a few times, and it does now appear to work, rewriting the text file with only the files that have been processed and still exist on disk.
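
For anyone following along, the function above can probably be trimmed: temp_num is always equal to number, so the `number in [temp_num]` test is always true and enumerate isn't needed. A simplified sketch that should behave the same, assuming on_disk() just checks whether the named file exists (the one-argument on_disk isn't shown above, so the helper here is a hypothetical stand-in, and the path parameters are added for illustration):

```python
import os


def on_disk(name, recordings_dir):
    """Hypothetical helper: True if the named recording still exists."""
    return os.path.isfile(os.path.join(recordings_dir, name))


def processed_exist_check(processed_path, recordings_dir):
    """Rewrite the processed list, keeping only entries that still exist on disk."""
    # Read everything first and let the read handle close before writing.
    with open(processed_path, "r") as pf:
        lines = pf.readlines()
    with open(processed_path, "w") as pf:
        for line in lines:
            if on_disk(line.strip("\n"), recordings_dir):
                pf.write(line)
```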

Thanks for all the help.

Now to add a few more bells & whistles. :)
 