Regular expression for Amazon S3 object names

2024-04-27 • Q&A

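From the AWS docs at https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html we know which characters are allowed in object names. I want to build a regular expression that specifies either a single object or a group of objects, like these: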

/abc/obj*
/abc/*
/*
/abc/obj1.txt

The regular expression I came up with is:

"((/[a-zA-Z0-9]+)*((/[a-zA-Z0-9\\.]*(\\*)?)?))"

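Apart from the other symbols that still need to be added inside the square brackets, does this regex look fine, or does it need further enhancement or simplification?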

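First, your regex does not work correctly. For example, for /abc/obj.txt it does not match the .txt part. Second, in the subexpression [a-zA-Z0-9\\.] you do not need the backslashes: inside a character class, . is interpreted as a literal period without them. Third, you should put ^ at the start of the regex and $ at the end, to make sure the whole input is what you expect and contains nothing extra. Fourth, you did not say which language you are using.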
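The first three points can be checked with a minimal sketch in Python. It assumes the doubled backslashes in the quoted pattern are string-literal escaping, so the regex engine actually sees \. and \*; the names ORIGINAL, prefix and whole are made up for this illustration.

```python
import re

# The asker's pattern with the string-literal escaping removed
# (assumption: "\\." and "\\*" in the quoted string stand for \. and \*).
ORIGINAL = r"((/[a-zA-Z0-9]+)*((/[a-zA-Z0-9\.]*(\*)?)?))"

key = "/abc/obj.txt"

# Unanchored: the match succeeds but stops before ".txt".
prefix = re.match(ORIGINAL, key).group(0)
print(prefix)                                  # /abc/obj

# Forcing the whole string to match (much like adding ^ and $)
# makes the engine backtrack until ".txt" is consumed as well.
whole = re.fullmatch(ORIGINAL, key)
print(whole is not None)                       # True

# Inside a character class "." is already literal, no backslash needed.
print(re.fullmatch(r"[.]", "x") is None)       # True
print(re.fullmatch(r"[.]", ".") is not None)   # True
```

So the partial match comes from the missing anchors rather than from the character class itself.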

I am using Python here:

import re

tests = [
    '/abc/obj*','/abc/*','/*','/abc/obj1.txt'
]

# the regex: ^/([a-zA-Z0-9]+/)*(\*|([a-zA-Z0-9]+(\*|(\.[a-zA-Z0-9]+)?)))$

for test in tests:
    m = re.match(r"""
        ^                   # the start of the string
        /                   # a leading /
        ([a-zA-Z0-9]+/)*    # 0 or more: abc/
        (\*                 # first choice: *
        |                   # or
        ([a-zA-Z0-9]+       # second choice: abc followed by either:
            (\*|(\.[a-zA-Z0-9]+)?)))    # * or .def or nothing
        $                   # the end of the string
        """,test,flags=re.X)
    print(test,f'match = {m is not None}')

Prints:

/abc/obj* match = True
/abc/* match = True
/* match = True
/abc/obj1.txt match = True
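It can also help to see what this pattern rejects. The following is a small sketch using the same regex written on one line; the name FIRST and the sample inputs are chosen here for illustration and are not from the question.

```python
import re

# Same pattern as above, without the verbose layout.
FIRST = re.compile(r"^/([a-zA-Z0-9]+/)*(\*|([a-zA-Z0-9]+(\*|(\.[a-zA-Z0-9]+)?)))$")

# Inputs picked to probe the edges of the pattern (illustrative only):
rejected = [
    'abc/obj',          # no leading /
    '/abc/ob*j',        # * is only allowed as the last character
    '/abc/obj.tar.gz',  # at most one ".ext" suffix
    '/',                # a final * or name is required
]

results = {s: FIRST.match(s) is not None for s in rejected}
for s, ok in results.items():
    print(s, f'match = {ok}')
```

Every line prints match = False, which shows that the pattern treats * strictly as a trailing wildcard.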

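However, when I read the object key specification at https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html, it seems that your test cases are not valid examples, because none of the examples shown there has a leading / character. It also appears that the * character should be treated like any other character and may occur multiple times at any position. This actually makes the regex simpler: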

^[a-zA-Z0-9!_.*'()-]+(/[a-zA-Z0-9!_.*'()-]+)*$

The new code:

import re

tests = [
    'abc','-/abc/(def)/!x*yz.def.hij'
]

# the regex: ^[a-zA-Z0-9!_.*'()-]+(/[a-zA-Z0-9!_.*'()-]+)*$

for test in tests:
    m = re.match(r"""
        ^                       # the start of the string
        [a-zA-Z0-9!_.*'()-]+    # 1 or more: !abc*(def)
        (
            /
            [a-zA-Z0-9!_.*'()-]+
        )*                      # 0 or more of /!abc*(def)
        $                       # the end of the string
        """,test,flags=re.X)
    print(test,f'match = {m is not None}')

Prints:

abc match = True
-/abc/(def)/!x*yz.def.hij match = True
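As a rough check of what the simplified pattern excludes, here is a sketch that wraps it in a helper. The function name is_valid_key and the sample keys are hypothetical; this only reflects the character class used in this answer, not a complete statement of what S3 accepts.

```python
import re

# The simplified pattern from above, compiled once.
KEY_RE = re.compile(r"^[a-zA-Z0-9!_.*'()-]+(/[a-zA-Z0-9!_.*'()-]+)*$")

def is_valid_key(key):
    """Hypothetical helper: True if key matches the simplified pattern."""
    return KEY_RE.match(key) is not None

# Illustrative inputs the pattern is expected to reject:
samples = [
    '/abc/obj1.txt',  # leading /
    'abc//def',       # empty path segment
    'abc/',           # trailing /
    'abc def',        # space is not in the character class
]

for key in samples:
    print(key, f'match = {is_valid_key(key)}')
```

Each of these prints match = False. Note that in Python $ also matches just before a trailing newline, so re.fullmatch may be the safer choice if keys could end with one.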

yang130sam answered: Regular expression for Amazon S3 object names
