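Python - Overriding the initialization of class variables

I have a superclass and a subclass that need to handle their initialization differently, depending on a regular expression. See the working example below.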

import os
import re


class Sample:
    RE = r'(?P<id>\d+)'
    STRICT_MATCHING = False

    def __init__(self,f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])

        re_ = re.compile(self.RE)
        match = re_.fullmatch if self.STRICT_MATCHING else re_.match
        self.__dict__.update(match(self.basename).groupdict())


class DetailedSample(Sample):
    RE = r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)'
    STRICT_MATCHING = True


s1 = Sample("/asdf/2.jpg")
print(s1.id)
s2 = DetailedSample("/asdfadsf/2_l_2.jpg")
print(s2.id,s2.dir,s2.n)

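This code works, but it has two drawbacks:

  • The regular expression is recompiled every time a new Sample is initialized.
  • The match function cannot be called from other class methods in Sample (for example, I may want to check whether a file has a valid name, with respect to RE, before initializing a Sample from it).

Simply put, I would like something like this: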

class Sample:
    RE = r'(?P<id>\d+)'
    STRICT_MATCHING = False
    re_ = re.compile(RE)  #
    match = re_.fullmatch if STRICT_MATCHING else re_.match  #

    def __init__(self,f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])

        self.__dict__.update(self.match(self.basename).groupdict())

    @classmethod
    def valid(cls,f):
        basename,ext = os.path.splitext(os.path.basename(f))
        return cls.match(basename) and ext.lower() in ('.jpg','.jpeg','.png')


class DetailedSample(Sample):
    RE = r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)'
    STRICT_MATCHING = True

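However, this obviously does not work in subclasses, because the two lines marked with # are not executed again after RE and STRICT_MATCHING are redefined in the subclass.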
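
To make the problem concrete, here is a minimal sketch of the behaviour (the names Base, Child and compiled are hypothetical and only serve the illustration): a class body is executed once, so an attribute computed from RE there is not recomputed when a subclass overrides RE.

```python
import re


class Base:
    RE = r'\d+'
    # Evaluated exactly once, while the body of Base is being executed.
    compiled = re.compile(RE)


class Child(Base):
    # Only the string is overridden; the body of Base is not executed again.
    RE = r'[a-z]+'


print(Child.RE)                # the overridden string
print(Child.compiled.pattern)  # still the pattern that was compiled in Base
```

Child.compiled is simply inherited from Base, which is why the subclass keeps matching against the superclass pattern.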

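Is there a way to:

  • keep the functionality of the first approach (i.e. regex-based initialization);
  • compile the regular expression and define the match method only once per subclass;
  • allow the match method to be called from class methods;
  • require only the regex string and the STRICT_MATCHING parameter to be redefined in subclasses?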
q4623051 answered: Python - Overriding the initialization of class variables

You can use __init_subclass__ to make sure each subclass does the appropriate work. It would be defined in a private base class that your public base class inherits from.

import os
import re


class _BaseSample:
    RE = r'(?P<id>\d+)'
    STRICT_MATCHING = False

    def __init_subclass__(cls,**kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once for every subclass, right after its class body has been executed.
        cls._re = re.compile(cls.RE)
        cls.match = cls._re.fullmatch if cls.STRICT_MATCHING else cls._re.match


class Sample(_BaseSample):
    def __init__(self,f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])
        self.__dict__.update(self.match(self.basename).groupdict())


class DetailedSample(Sample):
    RE = r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)'
    STRICT_MATCHING = True


s1 = Sample("/asdf/2.jpg")
print(s1.id)
s2 = DetailedSample("/asdfadsf/2_l_2.jpg")
print(s2.id,s2.dir,s2.n)

Unless you need direct access to the compiled regular expression later, _re could be a local variable of _BaseSample.__init_subclass__ rather than a class attribute of each class.

Note that __init_subclass__ can also accept additional keyword arguments, which are supplied as keyword arguments to the class statement itself. I don't think there is any particular benefit to doing so; it only changes the interface you provide for setting RE and STRICT_MATCHING. See Customizing Class Creation for details.
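
For illustration, here is a hedged sketch of that keyword-argument variant. The keyword names pattern and strict are assumptions chosen for this example, not part of the question or of any library; the compiled pattern is also kept in a local variable, as suggested above.

```python
import os
import re


class _BaseSample:
    RE = r'(?P<id>\d+)'
    STRICT_MATCHING = False

    def __init_subclass__(cls, pattern=None, strict=None, **kwargs):
        super().__init_subclass__(**kwargs)
        # "pattern" and "strict" are hypothetical keyword names; when one is
        # omitted, the inherited class attribute is used instead.
        if pattern is not None:
            cls.RE = pattern
        if strict is not None:
            cls.STRICT_MATCHING = strict
        compiled = re.compile(cls.RE)  # local variable, not stored on the class
        cls.match = compiled.fullmatch if cls.STRICT_MATCHING else compiled.match


class Sample(_BaseSample):
    def __init__(self, f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])
        self.__dict__.update(self.match(self.basename).groupdict())


class DetailedSample(Sample,
                     pattern=r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)',
                     strict=True):
    pass


s2 = DetailedSample("/asdfadsf/2_l_2.jpg")
print(s2.id, s2.dir, s2.n)
```

Whether the settings are passed as keywords or as class attributes is mostly a matter of taste; the compilation still happens once per subclass either way.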

,

You can do this by decorating the class.

This decorator checks the STRICT_MATCHING attribute and sets the match attribute accordingly.

import os
import re


def set_match(cls):
    match = cls.RE.fullmatch if cls.STRICT_MATCHING else cls.RE.match
    setattr(cls,'match',match)
    return cls


@set_match
class Sample:
    RE = re.compile(r'(?P<id>\d+)')
    STRICT_MATCHING = False

    def __init__(self,f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])
        self.__dict__.update(self.match(self.basename).groupdict())


@set_match
class DetailedSample(Sample):
    RE = re.compile(r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)')
    STRICT_MATCHING = True
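
Because match ends up as a class attribute, it can also be called from a class method, which was one of the requirements in the question. The following is a minimal sketch along those lines; the valid method is adapted from the question, and the EXTENSIONS attribute is a hypothetical name introduced here.

```python
import os
import re


def set_match(cls):
    # Pick the bound regex method once, when the class is created.
    cls.match = cls.RE.fullmatch if cls.STRICT_MATCHING else cls.RE.match
    return cls


@set_match
class Sample:
    RE = re.compile(r'(?P<id>\d+)')
    STRICT_MATCHING = False
    EXTENSIONS = ('.jpg', '.jpeg', '.png')  # hypothetical attribute name

    @classmethod
    def valid(cls, f):
        basename, ext = os.path.splitext(os.path.basename(f))
        return cls.match(basename) is not None and ext.lower() in cls.EXTENSIONS


@set_match
class DetailedSample(Sample):
    RE = re.compile(r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)')
    STRICT_MATCHING = True


print(Sample.valid("/asdf/2.jpg"))              # True
print(DetailedSample.valid("/asdf/2.jpg"))      # False
print(DetailedSample.valid("/asdf/2_l_2.png"))  # True
```

One caveat of this approach: a subclass that is not decorated silently inherits the match attribute of its parent, so the decorator has to be applied to every subclass that changes RE or STRICT_MATCHING.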

The same effect can be achieved with a metaclass:

class MetaMatchSetter(type):

    def __new__(cls,clsname,bases,clsdict):
        rgx = clsdict['RE']
        match = rgx.fullmatch if clsdict['STRICT_MATCHING'] else rgx.match
        clsdict['match'] = match
        return super().__new__(cls,clsname,bases,clsdict)


class Sample(metaclass=MetaMatchSetter):
    ...

class DetailedSample(Sample):
    ...
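
For completeness, here is a runnable sketch of the metaclass idea with the class bodies filled in. It is a variation, not the exact code above: it does the work in the metaclass __init__ and reads RE and STRICT_MATCHING from the finished class, on the assumption that a subclass may redefine only one of the two (a lookup in clsdict would raise KeyError in that case). LooseDetailedSample is a hypothetical class added to show this.

```python
import os
import re


class MetaMatchSetter(type):
    def __init__(cls, clsname, bases, clsdict):
        super().__init__(clsname, bases, clsdict)
        # Attribute lookup on cls follows the MRO, so inherited values are found too.
        rgx = cls.RE
        cls.match = rgx.fullmatch if cls.STRICT_MATCHING else rgx.match


class Sample(metaclass=MetaMatchSetter):
    RE = re.compile(r'(?P<id>\d+)')
    STRICT_MATCHING = False

    def __init__(self, f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])
        self.__dict__.update(self.match(self.basename).groupdict())


class DetailedSample(Sample):
    RE = re.compile(r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)')
    STRICT_MATCHING = True


class LooseDetailedSample(DetailedSample):
    # Only one setting is redefined; RE is inherited from DetailedSample.
    STRICT_MATCHING = False


s = LooseDetailedSample("/asdfadsf/2_l_2_extra.jpg")
print(s.id, s.dir, s.n)
```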

However, in my opinion, using a class decorator (or __init_subclass__, as described in chepner's answer) is more readable and easier to understand.

,

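You can cache (memoize) the compiled regular expressions as described on wiki.python.org. Note that RE and STRICT_MATCHING have to be looked up as class attributes, not instance attributes: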

import os
import re
import functools

def memoize(obj):
    cache = obj.cache = {}

    @functools.wraps(obj)
    def memoizer(*args,**kwargs):
        if args not in cache:
            cache[args] = obj(*args,**kwargs)
        return cache[args]
    return memoizer


@memoize
def myRegExpCompiler(*args):
    print("compiling")
    return re.compile(*args)


class Sample:
    RE = r'(?P<id>\d+)'
    STRICT_MATCHING = False

    def __init__(self,f):
        self.file = f
        self.basename = os.path.basename(os.path.splitext(self.file)[0])

        re_ = myRegExpCompiler(self.__class__.RE) # use cls method!
        match = re_.fullmatch if self.__class__.STRICT_MATCHING else re_.match # use cls method!
        self.__dict__.update(match(self.basename).groupdict())


class DetailedSample(Sample):
    RE = r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)'
    STRICT_MATCHING = True


s1 = Sample("/asdf/2.jpg")
print(s1.id)
s2 = DetailedSample("/asdfadsf/2_l_2.jpg")
print(s2.id,s2.dir,s2.n)
s3 = DetailedSample("/asdfadsf/2_l_2.jpg")
print(s3.id,s3.dir,s3.n)

Output:

compiling
2
compiling
2 l 2
2 l 2

...as you can see, the expressions are compiled only twice, once for each distinct pattern.
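
The same caching can probably be had without a hand-written memoize decorator by using functools.lru_cache from the standard library. This is a sketch of a swapped-in technique, not the code from this answer, and compile_cached is a hypothetical helper name.

```python
import functools
import re


@functools.lru_cache(maxsize=None)
def compile_cached(pattern, flags=0):
    # The body runs only on a cache miss, i.e. once per distinct call signature.
    return re.compile(pattern, flags)


first = compile_cached(r'(?P<id>\d+)')
second = compile_cached(r'(?P<id>\d+)')
other = compile_cached(r'(?P<id>\d+)_(?P<dir>[lr])_(?P<n>\d+)')

print(first is second)              # True: the cached object is returned
print(compile_cached.cache_info())  # hit and miss counters of the cache
```

Inside Sample.__init__, the call myRegExpCompiler(self.__class__.RE) could then be replaced by compile_cached(self.RE), under the assumption that RE stays a plain string.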

Link to this article: https://www.f2er.com/3050400.html
