在下面的代码中,每10个整数的许多向量构造有60%的几率,或者现有的向量被删除,有40%的几率.因此,会有很多调用new / malloc和delete.
@H_502_2@由于所有这些向量都是vector< int>类型,自定义分配器可以帮助减少对new和delete的调用,从而提高性能吗?这个想法是删除的矢量的空间可以由新构造的空间重用.这样的分配器怎么样?
注意:这个问题是关于分配器,它减少了对new和delete的调用.
- #include <iostream>
- #include <vector>
- #include <random>
- using namespace std;
- int main()
- {
- // Random generator and distribution
- mt19937 gen(123456);
- uniform_real_distribution<> dis01(0.,1.);
- // Make or delete 10E6 vectors.
- vector< vector<int> > v; //the inner vectors will make many calls to new and delete
- v.reserve(10E5); //assume some size.
- for(int i=0; i<10E6; ++i)
- {
- if(dis01(gen)<0.6) // if true: make new sub-vector
- {
- v.emplace_back(); //new sub-vector
- v.back().reserve(10);
- for(int k=0; k<10; ++k)
- v.back().emplace_back(k); //fill it with some numbers
- }
- else // else,delete the last entry if there is one.
- if(!v.empty())
- v.pop_back();
- }
- cout<<"v.size()= "<<v.size();
- return 0;
- }
解决方法
这适用于C 11.较旧的标准需要额外的东西
@H_502_2@在分配器[1]中实现.
这是概念验证代码.它运行并解决了这个例子@H_502_2@问题但受到一些限制.它仍然@H_502_2@演示了如何使用自定义分配器来改进@H_502_2@在很多std :: vectors的场景中的性能@H_502_2@创造和破坏.
PoolAlloc.hh:
- template<typename T>
- struct MemChunk
- {
- std::size_t buf_size=0;
- T* buf=nullptr;
- T* top=nullptr;
- std::size_t used=0;
- };
- template<typename T>
- class PoolAllocator
- {
- public:
- using value_type=T;
- PoolAllocator();
- explicit PoolAllocator(std::size_t);
- PoolAllocator(PoolAllocator const&) noexcept;
- template<typename U>
- PoolAllocator(PoolAllocator<U> const&) noexcept;
- PoolAllocator(PoolAllocator&&) noexcept;
- PoolAllocator& operator=(PoolAllocator const&)=delete;
- PoolAllocator& operator=(PoolAllocator&&)=delete;
- ~PoolAllocator();
- template <typename U>
- struct rebind
- {
- using other=PoolAllocator<U>;
- };
- T* allocate(std::size_t);
- void deallocate(T*,std::size_t) noexcept;
- template<typename U1,typename U2>
- friend bool operator==(PoolAllocator<U1> const&,PoolAllocator<U2> const&) noexcept;
- private:
- std::vector<MemChunk<T>>* memory_=nullptr;
- int* ref_count_=nullptr;
- std::size_t default_buf_size_=0;
- };
- template<typename T>
- PoolAllocator<T>::PoolAllocator():
- PoolAllocator{100000} {}
- template<typename T>
- PoolAllocator<T>::PoolAllocator(std::size_t buf_size):
- memory_{new std::vector<MemChunk<T>>},ref_count_{new int(0)},default_buf_size_{buf_size}
- {
- memory_->emplace_back();
- memory_->back().buf_size=buf_size;
- memory_->back().buf=new T[buf_size];
- memory_->back().top=memory_->back().buf;
- ++(*ref_count_);
- }
- template<typename T>
- PoolAllocator<T>::PoolAllocator(PoolAllocator const& src) noexcept:
- memory_{src.memory_},ref_count_{src.ref_count_},default_buf_size_{src.default_buf_size_}
- {
- ++(*ref_count_);
- }
- template<typename T>
- PoolAllocator<T>::PoolAllocator(PoolAllocator&& src) noexcept:
- memory_{src.memory_},default_buf_size_{src.default_buf_size_}
- {
- src.memory_=nullptr;
- src.ref_count_=nullptr;
- }
- template<typename T>
- template<typename U>
- PoolAllocator<T>::PoolAllocator(PoolAllocator<U> const& src) noexcept:
- memory_{src.memory_},default_buf_size_{src.default_buf_size_}
- {
- ++(*ref_count_);
- }
- template<typename T>
- PoolAllocator<T>::~PoolAllocator()
- {
- if (ref_count_!=nullptr)
- {
- --(*ref_count_);
- if (*ref_count_==0)
- {
- if (memory_!=nullptr)
- {
- for (auto& it : *memory_)
- {
- delete[] it.buf;
- }
- delete memory_;
- }
- delete ref_count_;
- }
- }
- }
- template<typename T>
- T*
- PoolAllocator<T>::allocate(std::size_t n)
- {
- MemChunk<T>* mem_chunk=&memory_->back();
- if ((mem_chunk->used+n)>mem_chunk->buf_size)
- {
- default_buf_size_*=2;
- memory_->emplace_back();
- mem_chunk=&memory_->back();
- std::size_t buf_size=default_buf_size_;
- if (n>default_buf_size_)
- {
- buf_size=n;
- }
- mem_chunk->buf_size=buf_size;
- mem_chunk->buf=new T[mem_chunk->buf_size];
- mem_chunk->top=mem_chunk->buf;
- }
- T* r=mem_chunk->top;
- mem_chunk->top+=n;
- mem_chunk->used+=n;
- return r;
- }
- template<typename T>
- void
- PoolAllocator<T>::deallocate(T* addr,std::size_t n) noexcept
- {
- MemChunk<T>* mem_chunk=&memory_->back();
- if (mem_chunk->used>n and (mem_chunk->top-n)==addr)
- {
- mem_chunk->used-=n;
- mem_chunk->top-=n;
- }
- }
- template<typename U1,typename U2>
- bool operator==(PoolAllocator<U1> const& lhs,PoolAllocator<U2> const& rhs) noexcept
- {
- return (std::is_same<U1,U2>::value and lhs.memory_==rhs.memory_);
- }
使用以下方式修改的示例:
- #include <iostream>
- #include <vector>
- #include <random>
- #include "PoolAlloc.hh"
- using namespace std;
- int main()
- {
- // Random generator and distribution
- mt19937 gen(123456);
- uniform_real_distribution<> dis01(0.,1.);
- PoolAllocator<int> palloc{1000000};
- // Make or delete 10E6 vectors.
- vector< vector<int,PoolAllocator<int>> > v; //the inner vectors will make many calls to new and delete
- v.reserve(10E5); //assume some size.
- for(int i=0; i<10E6; ++i)
- {
- if(dis01(gen)<0.6) // if true: make new sub-vector
- {
- v.emplace_back(palloc); //new sub-vector
- v.back().reserve(10);
- for(int k=0; k<10; ++k)
- v.back().emplace_back(k); //fill it with some numbers
- }
- else // else,delete the last entry if there is one.
- if(!v.empty())
- v.pop_back();
- }
- cout<<"v.size()= "<<v.size();
- return 0;
- }
对malloc的调用次数从~6e6下降到21@H_502_2@指令数从3.7e9下降到2.5e9(使用-O3,@H_502_2@用valgrind测量–tool = callgrind).
有一些实施细节会影响到@H_502_2@在不同的使用情况下的表现.
目前使用多个缓冲区.如果一个满了,另一个满了@H_502_2@被建造.这种方式永远不必重新分配@H_502_2@操作会让你进入一个受伤的世界(见@H_502_2@评论).
最大的问题是,如何处理解除分配的内存.@H_502_2@目前使用的是一种简单的方法,只能进行解除分配@H_502_2@内存可用于稍后在它结束时分配@H_502_2@缓冲.对于你的例子就足够了,就像你一样@H_502_2@在缓冲区的末尾释放内存.
对于更复杂的场景,您需要更复杂的场景@H_502_2@机制.存储地址需要一些数据结构@H_502_2@和可用内存块的大小.多个概念是可能的@H_502_2@在这里,他们的表现会因具体情况而有所不同@H_502_2@它们被用于.我怀疑有一个很好的一刀切@H_502_2@解决方案在这