c++ - Can multithreading speed up memory allocation?

Question

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I'm working with an 8-core processor and am using Boost threads to run a large program. Logically, the program can be split into groups, where each group is run by a thread. Inside each group, some classes invoke the 'new' operator a total of 10000 times. Rational Quantify shows that the 'new' memory allocation takes up the largest share of processing time when the program runs, and is slowing down the entire program.

One way to speed up the system might be to use threads inside each 'group', so that the 10000 memory allocations can happen in parallel.

I'm unclear on how the memory allocation would be managed here. Will the OS scheduler really be able to allocate memory in parallel?

1 Answer

深蓝 · Answer 1 · 2021-10-17T02:50:01+0000

Standard CRT

With older versions of Visual Studio the default CRT allocator was blocking. This is no longer true, at least for Visual Studio 2010 and newer, where the CRT calls the corresponding OS heap functions directly. The Windows heap manager was blocking until Windows XP. In XP the optional Low Fragmentation Heap (LFH) is not blocking while the default heap is, and newer OSes (Vista/Win7) use the LFH by default. The performance of recent (Windows 7) allocators is very good, comparable to the scalable replacements listed below (you might still prefer those if you target older platforms or need some other feature they provide). Several "scalable allocators" exist, with different licenses and different drawbacks. I think on Linux the default runtime library already uses a scalable allocator (some variant of PTMalloc).
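Whether the default allocator on a given platform actually scales is easy to check before reaching for a replacement. The following is only a rough sketch, assuming C++14 and std::thread in place of the Boost threads mentioned in the question; the function names are made up for illustration. Each worker performs its own batch of small allocations, so comparing the elapsed time for 1 thread against 8 threads with the same per-thread count gives a first impression of contention.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Each thread performs `allocs_per_thread` small allocations with new
// (via std::make_unique) and frees them again before it exits.
// Returns the total number of allocations completed across all threads.
std::size_t allocate_in_parallel(unsigned num_threads, std::size_t allocs_per_thread) {
    assert(num_threads > 0);
    std::atomic<std::size_t> completed{0};
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&completed, allocs_per_thread] {
            std::vector<std::unique_ptr<int>> live;
            live.reserve(allocs_per_thread);
            for (std::size_t i = 0; i < allocs_per_thread; ++i) {
                live.push_back(std::make_unique<int>(static_cast<int>(i)));
            }
            completed.fetch_add(live.size(), std::memory_order_relaxed);
            // `live` is destroyed here, releasing every block on this thread.
        });
    }
    for (auto& w : workers) w.join();
    return completed.load();
}

// Wall-clock time in milliseconds for one run of the function above.
double time_parallel_allocations(unsigned num_threads, std::size_t allocs_per_thread) {
    const auto start = std::chrono::steady_clock::now();
    allocate_in_parallel(num_threads, allocs_per_thread);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

If time_parallel_allocations(8, 10000) takes roughly as long as time_parallel_allocations(1, 10000), the allocator is probably not serializing the threads. If it takes several times longer, a lock in the allocator is a likely suspect. With only 10000 allocations per thread the numbers will be noisy, so repeat the measurement a few times.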

Scalable replacements

I know about:

HOARD (GNU + commercial licenses)
MicroQuill SmartHeap for SMP (commercial license)
Google Perf Tools TCMalloc (BSD license)
NedMalloc (BSD license)
JemAlloc (BSD license)
PTMalloc (GNU, no Windows port yet?)
Intel Threading Building Blocks (GNU, commercial)

You might want to check "Scalable memory allocator experiences" for my experience with trying to use some of them in a Windows project.
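Many of these libraries can be plugged in without touching the call sites, because C++ lets a program replace the global operator new and operator delete. The following is a minimal sketch of that hook, not the integration procedure of any particular library: my_scalable_malloc and my_scalable_free are hypothetical stand-ins that simply forward to std::malloc and std::free, and the counter exists only to show that the replacement is really being called. Check the chosen library's documentation for its real entry points and recommended setup.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-ins for a replacement allocator's entry points.
// In a real project these would call into the chosen library.
void* my_scalable_malloc(std::size_t size) { return std::malloc(size); }
void my_scalable_free(void* p) { std::free(p); }

// Counts calls to the replaced operator new (for demonstration only).
std::atomic<std::size_t> g_new_calls{0};

// Replacing the global operator new routes every ordinary `new` in the
// program through the allocator above. The default operator new[]
// forwards to this function, so array allocations are covered as well.
// Simplification: a complete version would call the installed
// std::new_handler in a loop before giving up.
void* operator new(std::size_t size) {
    g_new_calls.fetch_add(1, std::memory_order_relaxed);
    if (size == 0) size = 1;  // operator new must return a unique pointer
    if (void* p = my_scalable_malloc(size)) return p;
    throw std::bad_alloc{};
}

void operator delete(void* p) noexcept { my_scalable_free(p); }

// Sized deallocation (C++14); the size hint is ignored here.
void operator delete(void* p, std::size_t) noexcept { my_scalable_free(p); }
```

The replacements must be defined exactly once in the program and must not be declared inline. The over-aligned overloads introduced in C++17 are left at their defaults in this sketch.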

In practice, most of them work by keeping a per-thread cache and per-thread pre-allocated regions for allocations. Small allocations therefore usually happen entirely within the context of a single thread, and OS services are called only infrequently.
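To make the per-thread cache idea concrete, here is a deliberately simplified sketch of a fixed-size block pool with one free list per thread. It is not how any of the libraries listed above is actually implemented. ThreadLocalPool, kBlocksPerChunk and refill_count are names invented for this example, and the pool never returns memory to the system.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Fixed-size block pool with a separate free list for every thread.
// The fast path touches only thread-local state, so it needs no lock.
template <std::size_t BlockSize>
class ThreadLocalPool {
    static_assert(BlockSize >= sizeof(void*), "a block must hold a free-list link");
    static_assert(BlockSize % alignof(void*) == 0, "blocks must stay pointer-aligned");

public:
    static void* allocate() {
        Cache& c = cache();
        if (c.head == nullptr) refill(c);  // slow path, taken rarely
        Node* n = c.head;
        c.head = n->next;
        return n;
    }

    // A block freed on a different thread than the one that allocated it
    // simply joins the freeing thread's list.
    static void deallocate(void* p) noexcept {
        assert(p != nullptr);
        Cache& c = cache();
        c.head = ::new (p) Node{c.head};
    }

    // Number of times the calling thread had to fall back to the
    // shared allocator.
    static std::size_t refill_count() { return cache().refills; }

private:
    struct Node { Node* next; };
    struct Cache { Node* head = nullptr; std::size_t refills = 0; };
    static constexpr std::size_t kBlocksPerChunk = 256;

    static Cache& cache() {
        thread_local Cache c;  // one instance per thread
        return c;
    }

    // The only trip to the shared (possibly locking) allocator: grab one
    // large chunk and carve it into blocks. Chunks are never freed in
    // this sketch.
    static void refill(Cache& c) {
        char* chunk = static_cast<char*>(::operator new(BlockSize * kBlocksPerChunk));
        for (std::size_t i = 0; i < kBlocksPerChunk; ++i) {
            c.head = ::new (chunk + i * BlockSize) Node{c.head};
        }
        ++c.refills;
    }
};
```

With ThreadLocalPool<64>, for example, a thread that makes 10000 allocations reaches the shared allocator at most 40 times (10000 / 256, rounded up), and fewer if it frees blocks along the way. Real scalable allocators add size classes, return memory to the system and rebalance blocks between threads, which this sketch leaves out.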
