两种看似相似的解决方案在运行时复杂度上的差异

我正在尝试解决一项任务,并编写了一个解决方案,该解决方案与我在网上寻找更有效的解决方案时所找到的解决方案非常接近。

这是赋值语句

  

此问题的目标是实现2-SUM的变体   算法将在本周的讲座中介绍。

     

该文件包含100万个正负整数   (可能会重复一些!)。这是您的整数数组,   文件的第i行指定数组的第i个条目。

     

您的任务是计算区间中的目标值t的数量   [-10000,10000](含),使得在其中存在不同的数字x,y   满足x + y = t的输入文件   变量all_ints是包含所有整数的列表

hashtable = set(all_ints)
min_t = -1024
max_t = -min_t + 1

solutions = 0
k = 0

from time import time

# Solution 1 that I wrote
t0 = time()
for n in range(min_t,max_t):
    solutions+=1 if any([True if 2*i!=n and n-i in hashtable else False for i in hashtable]) else 0


# Solution 2 that I found online
t1 = time()
solutions2 = sum(1 for n in range(min_t,max_t) if any(n - x in hashtable and 2 * x != n for x in hashtable))
t2 = time()

print(t1-t0) #857.0168697834015
print(t2-t1) #591.8803908824921

在基本检查中,这两种解决方案看起来非常相似。然而,它们的运行时间却大不相同,尽管它们都是线性扩展的,但是当我减小min_t的值时,它们的偏移会更远。

两种看似相似的解决方案在运行时复杂度上的差异

导致这种情况的两种实现之间的根本区别是什么

asdqq 回答:两种看似相似的解决方案在运行时复杂度上的差异

额外的列表理解创建可能会减慢算法的速度,从而增加容器的开销。

第二种解决方案使用生成器,该生成器在计算时立即产生值,而无需列表。

,

如上所述,如果...否则显式为False,则您将执行True。这是附加步骤,如果值较大,则可能会显示一些开销。最重要的是,您首先要构建整个列表,然后检查其中是否有True。第二种解决方案使用的是惰性的生成器,一旦遇到True值,它将立即停止执行,从而节省大量时间。

下面的示例是您的基本代码,该代码使用10K整数(而不是一百万)进行测试。请注意,这种计时方式不是很可靠(使用timeit是更好的选择),但出于说明目的,我想它给出了一个足够好的主意。

import random
from time import time

all_ints = [random.randint(0,int(1E04)) for i in range(int(1E04))]
hashtable = set(all_ints)
min_t = -1024
max_t = -min_t + 1

solutions = 0
k = 0

# Solution 1 that I wrote
t0 = time()
for n in range(min_t,max_t):
    solutions += 1 if any(2*i!=n and n-i in hashtable for i in hashtable) else 0


# Solution 2 that I found online
t1 = time()
print('solution_1',t1-t0)

solutions2 = sum(1 for n in range(min_t,max_t) if any(n - x in hashtable and 2 * x != n for x in hashtable))
t2 = time()

assert solutions == solutions2

print('solution_2',t2-t1)
  • solution_1 11.994043588638306
  • solution_2 2.5971169471740723

现在,让我们删除列表理解,而使用生成器:

solutions += 1 if any(True if 2*i!=n and n-i in hashtable else False for i in hashtable) else 0
  • solution_1 7.071138620376587
  • solution_2 4.714937686920166

另一项改进:您不需要“如果...则为真”,您只需传递条件本身即可。

solutions += 1 if any(2*i!=n and n-i in hashtable for i in hashtable) else 0
  • solution_1 6.017507076263428
  • solution_2 2.9826767444610596

最后:只要您有一个包含多个条件的条件,请仔细考虑条件的顺序。从最快到最慢命令它们。在您的情况下,执行2*i!=n可能比设置查找(速度非常快)要慢(尤其是对于较大的值),因此建议您交换顺序。

solutions += 1 if any(n-i in hashtable and 2*i!=n for i in hashtable) else 0
  • solution_1 3.184004545211792
  • solution_2 2.4962565898895264

现在唯一的区别是您有一个显式循环,(更重要的是)您进行了迭代加法,而不是对整个列表求和。


这是一个更好的基准(但结论是相同的):

import timeit

setup = '''
import random
all_ints = [random.randint(0,int(1E03)) for i in range(int(1E03))]
hashtable = set(all_ints)
min_t = -1024
max_t = -min_t + 1
'''


# Solution 1 (original)
def sum_one(min_t,max_t,hashtable):
    solutions = 0
    for n in range(min_t,max_t):
        solutions += 1 if any([True if 2 * i != n and n - i in hashtable else False for i in hashtable]) else 0

    return solutions

# Solution 1 (generator instead of list comprehension)
def sum_one_gen(min_t,hashtable):
    """
    - generator instead of list comprehension
    """
    solutions = 0
    for n in range(min_t,max_t):
        solutions += 1 if any(True if 2 * i != n and n - i in hashtable else False for i in hashtable) else 0

    return solutions


def sum_one_gen_implicit_conditional(min_t,hashtable):
    """
    - generator instead of list comprehension
    - remove True if ... else False in favour of implicit conditional
    """
    solutions = 0
    for n in range(min_t,max_t):
        solutions += 1 if any(2 * i != n and n - i in hashtable for i in hashtable) else 0

    return solutions


def sum_one_gen_swap_conditions(min_t,hashtable):
    """
    - generator instead of list comprehension
    - remove True if ... else False in favour of implicit conditional
    - swap conditions
    """
    solutions = 0
    for n in range(min_t,max_t):
        solutions += 1 if any(n - i in hashtable and 2 * i != n for i in hashtable) else 0

    return solutions

def sum_two(min_t,hashtable):
    return sum(1 for n in range(min_t,max_t) if any(n - x in hashtable and 2 * x != n for x in hashtable))


sum_one_s = timeit.timeit(stmt='sum_one(min_t,hashtable)',setup=setup + 'from __main__ import sum_one',number=1000)
print('sum_one_s',sum_one_s)

sum_one_gen_s = timeit.timeit(stmt='sum_one_gen(min_t,setup=setup + 'from __main__ import sum_one_gen',number=1000)
print('sum_one_gen_s',sum_one_gen_s)

sum_one_gen_implicit_conditional_s = timeit.timeit(stmt='sum_one_gen_implicit_conditional(min_t,setup=setup + 'from __main__ import sum_one_gen_implicit_conditional',number=1000)
print('sum_one_gen_implicit_conditional_s',sum_one_gen_implicit_conditional_s)

sum_one_gen_swap_conditions_s = timeit.timeit(stmt='sum_one_gen_swap_conditions(min_t,setup=setup + 'from __main__ import sum_one_gen_swap_conditions',number=1000)
print('sum_one_gen_swap_conditions_s',sum_one_gen_swap_conditions_s)

sum_two_s = timeit.timeit(stmt='sum_two(min_t,setup=setup + 'from __main__ import sum_two',number=1000)
print('sum_two_s',sum_two_s)

结果:

  • sum_one_s 144.5597518660361
  • sum_one_gen_s 84.23750544409268
  • sum_one_gen_implicit_conditional_s 78.40655627311207
  • sum_one_gen_swap_conditions_s 57.4693741860101
  • sum_two_s 52.64653824502602
本文链接:https://www.f2er.com/3146116.html

大家都在问