Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have decided to learn how multi-threading is done in Python, and I did a comparison to see what kind of performance gain I would get on a dual-core CPU. I found that my simple multi-threaded code actually runs slower than the sequential equivalent, and I cant figure out why.

The test I contrived was to generate a large list of random numbers and then print the maximum

from random import random
import threading

def ox():
    print max([random() for x in xrange(20000000)])

ox() takes about 6 seconds to complete on my Intel Core 2 Duo, while ox();ox() takes about 12 seconds.

I then tried calling ox() from two threads to see how fast that would complete.

def go():
    r = threading.Thread(target=ox)
    r.start()
    ox()

go() takes about 18 seconds to complete, with the two results printing within 1 second of eachother. Why should this be slower?

I suspect ox() is being parallelized automatically, because I if look at the Windows task manager performance tab, and call ox() in my python console, both processors jump to about 75% utilization until it completes. Does Python automatically parallelize things like max() when it can?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
649 views
Welcome To Ask or Share your Answers For Others

1 Answer

  1. Python has the GIL. Python bytecode will only be executed by a single processor at a time. Only certain C modules (which don't manage Python state) will be able to run concurrently.
  2. The Python GIL has a huge overhead in locking the state between threads. There are fixes for this in newer versions or in development branches - which at the very least should make multi-threaded CPU bound code as fast as single threaded code.

You need to use a multi-process framework to parallelize with Python. Luckily, the multiprocessing module which ships with Python makes that fairly easy.

Very few languages can auto-parallelize expressions. If that is the functionality you want, I suggest Haskell (Data Parallel Haskell)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...