watchdog (Python): watch only one file format and ignore everything else in "PatternMatchingEventHandler"

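I am running the code from this article, with a few changes so that it only watches for the creation/addition of files in one format, .csv, in a specified directory.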

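The problem now is:

Whenever a newly added file is not in .csv format, my program breaks (it stops watching but keeps running). To work around this, here is what I did with the ignore_patterns argument (but the program still stops watching after a file of another format is added):
PatternMatchingEventHandler(patterns="*.csv",ignore_patterns=["*~"],ignore_directories=True,case_sensitive=True)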

The full code is:

import time
import csv
from datetime import datetime
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
from os import path
from pandas import read_csv
# class that takes care of everything
class file_validator(PatternMatchingEventHandler):
    def __init__(self,source_path):
        # setting parameters for 'PatternMatchingEventHandler'
        super(file_validator,self).__init__(patterns="*.csv",case_sensitive=True)
        self.source_path = source_path
        self.print_info = None

    def on_created(self,event):
        # this is the new file that was created
        new_file = event.src_path
        # details of each new .csv file
        # demographic details
        file_name = path.basename(new_file)
        file_size = f"{path.getsize(new_file) / 1000} KiB"
        file_creation = f"{datetime.fromtimestamp(path.getmtime(new_file)).strftime('%Y-%m-%d %H:%M:%S')}"
        new_data = read_csv(new_file)
        # more details
        number_columns = new_data.shape[1]
        data_types_data = [
            ('float' if i == 'float64' else ('int' if i == 'int64' else ('character' if i == 'object' else i))) for i in
            [x.name for x in list(new_data.dtypes)]]
        null_count_data = list(dict(new_data.isna().sum()).values())
        print(f"{file_name},{file_size},{file_creation},{number_columns}")
        # trying to access this info,but of no help
        self.print_info = f"{file_name},{number_columns}"

    def return_logs(self):
        return self.print_info

# main function    
if __name__ == "__main__":
    some_path = "C:\\Users\\neevaN_Reddy\\Documents\\learning dash\\"
    my_validator = file_validator(source_path=some_path)
    my_observer = Observer()
    my_observer.schedule(my_validator,some_path,recursive=True)
    my_observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        my_observer.stop()
        my_observer.join()
    # # this doesn't print anything
    print(my_validator.return_logs)

Edit 1 (after Quentin Pradet's comment): Following the suggestion in the comments, I changed my arguments to:

super(file_validator,self).__init__(patterns="*.csv",
                                    # ignore_patterns=["*~"],
                                    case_sensitive=True)

When I copy in a file of another format (I tried a .ipynb file), I see this error (after which the program also stops watching for .csv files):

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\watchdog\observers\api.py", line 199, in run
    self.dispatch_events(self.event_queue, self.timeout)
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\watchdog\observers\api.py", line 368, in dispatch_events
    handler.dispatch(event)
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\watchdog\events.py", line 454, in dispatch
    _method_map[event_type](event)
  File "C:/Users/neevaN_Reddy/Documents/Work/Project-Aretaeus/diabetes_risk project/file validation using a class.py", line 26, in on_created
    new_data = read_csv(new_file)
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\parsers.py", line 685, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\parsers.py", line 463, in _read
    data = parser.read(nrows)
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\parsers.py", line 1154, in read
    ret = self._engine.read(nrows)
  File "C:\Users\neevaN_Reddy\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\parsers.py", line 2059, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 881, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 896, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 950, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 937, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2132, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2

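Clearly pandas raised an error, which means my on_created function is also being triggered for files that are not .csv. I think this means something has to be done with the ignore_patterns argument so that on_created is not triggered when files of other formats are added.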
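The traceback shows the ParserError leaving on_created and travelling up through dispatch_events in the observer thread (Thread-1), which would explain why watching stops while the main loop keeps running. Independently of the pattern problem, a defensive guard inside the callback can keep one bad file from ending the thread. The following is only a minimal sketch using the standard library: handle_new_file and its parse argument are hypothetical names, and in the code above parse would be read_csv.

```python
def handle_new_file(file_path, parse):
    """Hypothetical helper: parse only .csv files and never let an error escape."""
    # Skip anything that is not a .csv file (case-sensitive, as in the question).
    if not file_path.endswith(".csv"):
        return None
    try:
        # `parse` stands in for pandas.read_csv in the question's code.
        return parse(file_path)
    except Exception as exc:
        # Report the problem instead of letting it propagate to the observer thread.
        print(f"could not parse {file_path}: {exc}")
        return None
```

Calling something like this from on_created, for example handle_new_file(event.src_path, read_csv), would let the handler report a malformed file and carry on with later events.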

luofeihuai7713 answered: watchdog (Python): watch only one file format and ignore everything else in "PatternMatchingEventHandler"

Could you try passing patterns as a list rather than a string, e.g. patterns=["*.csv"]?
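One plausible reason the list form matters: if the handler loops over patterns and tests each element as a glob, a bare string is iterated character by character, so "*.csv" contributes a lone "*" that matches every path. The sketch below illustrates that effect with the standard library's fnmatch only. matches_any is a hypothetical stand-in, not watchdog's actual matching code, and the assumption that watchdog iterates patterns this way is not confirmed by the post.

```python
from fnmatch import fnmatchcase

def matches_any(file_name, patterns):
    """Hypothetical matcher: True if any element of `patterns` matches as a glob."""
    # Iterating a str yields single characters; iterating a list yields whole patterns.
    return any(fnmatchcase(file_name, pattern) for pattern in patterns)

as_string = "*.csv"    # iterated as '*', '.', 'c', 's', 'v'
as_list = ["*.csv"]    # iterated as the single pattern '*.csv'

print(matches_any("notes.ipynb", as_string))  # True: the lone '*' matches any name
print(matches_any("notes.ipynb", as_list))    # False
print(matches_any("data.csv", as_list))       # True
```

In the question's code, the corresponding change would be to pass patterns=["*.csv"] in the super().__init__ call.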
