python多线程使用总结

最近为了提高接口性能，想把串行调用rpc接口的逻辑改成多线程，以达到并行调用的效果。

具体场景：假设接口A逻辑里会串行调用接口B和C，同时B和C没有数据依赖，现在要把调用B和C的地方改为两个线程分别调用B和C，并分别获取B和C的返回结果。

python多线程简介

python的多线程库有thread和threading两个，threading是高级模块，封装了thread，所以一般都是使用threading。

需要说明的是，python解释器由于设计时有GIL全局锁，导致了多线程无法利用多核，要想利用多核，可以使用多进程。

GIL全局锁（Global Interpreter Lock），任何python线程执行前，必须先获得GIL锁，然后，每执行100条字节码，解释器就自动释放GIL锁，让别的线程有机会执行。这个GIL全局锁实际上把所有线程的执行代码都给上了锁，所以，多线程在Python中只能交替执行，即使100个线程跑在100核CPU上，也只能用到1个核。

多线程使用方式

代码逻辑如下：

class ABC(object):
	def __init__():
		pass
	def A(self):
		...
		resB = self.B(b1, b2)
		resC = self.C(c1, c2)
		return resB + resC
		
	def B(self, b1, b2)：
		res = b1 + b2
		return res
		
	def C(self, c1, c2):
		res = c1 * c2
		return res
	

修改为多线程：

import threading
class ABC(objsect):
	...
	def A(self):
		...
		# multi thread
	    threads = []
	    # target 对应要调用的接口名，args对应接口入参
	    t1 = threading.Thread(target=self.B, args=(b1, b2))	  
	    threads.append(t1)
	    t2 = threading.Thread(target=C, args=(c1, c2))
	    threads.append(t2)
	    t1.start() # start()表示启动线程
	    t2.start()
	    t1.join() # join()表示等待至线程终止
	    t2.join()
	    ...
	
	def B(self, b1, b2)：
		res = b1 + b2
		return res
		
	def C(self, c1, c2):
		res = c1 * c2
		return res

解释说明：

上面的逻辑中，先依次调用 t1.start() 和 t2.start()表示先分别启动线程t1和t2，然后t1.join()表示检查线程t1有没有执行完，如果没有，则阻塞，直到t1执行完，然后再检查t2。

这种方式有一个问题，接口A里面需要拿到B和C的返回值，上面这种方式，只是调用了B和C，但是没有拿到他们的返回值。

多线程传递数据

我们用Queue来实现在线程间传递数据的功能

在python中，队列是线程最常用的交换数据的方式

import threading
import Queue 
queue = Queue.Queue() # 创建一个队列
class ABC(objsect):
	...
	def A(self):
		...
		# multi thread
	    threads = []
	    # target 对应要调用的接口名，args对应接口入参
	    t1 = threading.Thread(target=self.B, args=(b1, b2))	  
	    threads.append(t1)
	    t2 = threading.Thread(target=C, args=(c1, c2))
	    threads.append(t2)
	    t1.start() # start()表示启动线程
	    t2.start()
	    t1.join() # join()表示等待至线程终止
	    t2.join()
		# 从queue中获取返回值
		res_data = []
	    while not queue.empty():
	        res_data.append(queue.get()) 
		for info in res_data:
	        if info.has_key('B'):
	            resB = info['B']
	        if info.has_key('C'):
	            resC = info['C']
	     return resB + resC
	       
	def B(self, b1, b2)：
		res = b1 + b2
		'''
		将返回数据插入队列中, 
		为了区分B和C返回的数据，可以返回dict类型，
		key为接口名，value为实际的返回结果
		'''	 
		queue.put({'B':res}) 
		return res
		
	def C(self, c1, c2):
		res = c1 * c2
		queue.put({'C':res})
		return res

通过这种方式，就可以在A中拿到B和C的返回数据了。

另外，还有一种获取线程返回值的方式，可以参考

参考

python多线程

如何获取线程的返回值

python多线程使用总结

python多线程简介

多线程使用方式

多线程传递数据

参考

Comments