[Concurrency] Python Beginner's Multi-threading Example

  • 2018-09-12


The main program is main.py, and the crawler class is defined in Crawlers.py:
main.py:

from Crawlers import GloriaCrawler

import threading
import time

objG = GloriaCrawler()
# Intended to create a child thread and run it immediately
t = threading.Thread(target = objG.Go())

# The main thread continues with its own work
for i in range(10):
  print("Main thread:", i)
  time.sleep(1)

print('program end')


Crawlers.py:

class GloriaCrawler():
    def __init__(self):
        print('init done.')
    def Go(self):
                
        # Import the requests module
        import requests
        # Download an ordinary web page with a GET request
        resp = requests.get('https://www.gloriatour.com.tw/EW/GO/GroupList.asp')

        if resp.status_code == requests.codes.ok:
            print("OK")
        else:
            print("There is a problem!")
            exit()
        html = resp.text
        #https://regex101.com/r/XHuNJH/1
        pattern = r'<div class=\"product_name\">.*?<span class=\"product_num\">[\d\w]+</span>\s*\r\n\s*(?P<TourName>.*?)\s*\r\n\s*<div class=\"product_tag\">.*?<div class=\"product_days\">(?P<Days>\d+)天.*?<div class=\"product_date normal\">(?P<Date>\d{4}/\d{2}/\d{2}).*?售價\$<strong>(?P<Money>[0-9,]+)</strong>.*?機位<\/span><span class=\"number\">(?P<Total>\d+)</span>.*?可售<\/span><span class=\"number\">(?P<Available>\d+)</span><\/div>'

        import re
        # DOTALL: the equivalent of RegexOptions.Singleline in C#
        pattern = re.compile(pattern, re.DOTALL)
        for m in pattern.finditer(html):
            print(m.group('TourName'))  
            print(m.group('Days'))  
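
The pattern above relies on named groups and re.DOTALL. As a minimal sketch of the same idea, the snippet below runs a much shorter pattern against a tiny, made-up HTML string. SAMPLE_HTML, the tour names, and the extract_tours helper are illustrative assumptions and do not reflect the real page's markup.

```python
import re

# Hypothetical, heavily simplified markup (not the real page).
SAMPLE_HTML = """<div class="product_name">
  <span class="product_num">A123</span>
  Kyoto Highlights
  <div class="product_tag">hot</div>
  <div class="product_days">5天</div>
</div>
<div class="product_name">
  <span class="product_num">B456</span>
  Hokkaido Snow Tour
  <div class="product_tag">new</div>
  <div class="product_days">7天</div>
</div>"""

# With re.DOTALL, "." also matches newlines, so ".*?" can skip across lines.
PATTERN = re.compile(
    r'<div class="product_name">.*?<span class="product_num">\w+</span>\s*'
    r'(?P<TourName>.*?)\s*<div class="product_tag">.*?'
    r'<div class="product_days">(?P<Days>\d+)天',
    re.DOTALL,
)


def extract_tours(html):
    """Return (tour name, days) pairs for every match in html."""
    return [(m.group("TourName"), int(m.group("Days")))
            for m in PATTERN.finditer(html)]


tours = extract_tours(SAMPLE_HTML)
for name, days in tours:
    print(name, days)
```

Compiling the same pattern string without re.DOTALL finds nothing in this sample, because the first ".*?" cannot cross the line break after the opening div.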


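P.S. (2018-09-12): Testing on my own machine showed that the code above is not actually multi-threaded; everything still runs sequentially. The cause is target = objG.Go(): the parentheses call Go() on the main thread while the Thread object is being built, its return value (None) becomes the target, and t.start() is never called.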
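
For comparison, here is a minimal sketch of how a threading.Thread is usually started: the callable is passed without parentheses, and start() is called explicitly. The slow_task function and the "worker-1" name are hypothetical stand-ins for objG.Go, using a sleep instead of a download so the sketch runs without network access.

```python
import threading
import time

results = []


def slow_task(label, delay=0.2):
    # Hypothetical stand-in for objG.Go: sleep instead of downloading a page.
    time.sleep(delay)
    results.append((label, threading.current_thread().name))


# Pass the function itself (no parentheses); its arguments go in args=.
t = threading.Thread(target=slow_task, args=("child",), name="worker-1")
t.start()  # the child thread only begins running here

# The main thread carries on with its own work in the meantime.
for i in range(3):
    print("Main thread:", i)
    time.sleep(0.05)

t.join()  # wait for the child thread before finishing
print(results)
print("program end")
```

With a bound method the same pattern would read target=objG.Go followed by t.start().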

The new version below uses ThreadPoolExecutor, and testing confirms that it really does run on multiple threads:

import requests
from concurrent.futures import ThreadPoolExecutor

class TestClass():
    def __init__(self, start, end):
        self.start = start
        self.end = end
    def customPrint(self):
        for i in range(self.start,self.end):
            print(str(i))

def worker1():
    test1 = TestClass(1,100)
    test1.customPrint()

def worker2():
    test2 = TestClass(101,200)
    test2.customPrint()

with ThreadPoolExecutor() as executor:
    executor.submit(worker1)
    executor.submit(worker2)

Output:
The numbers from the two workers come out completely interleaved, so this really is multi-threaded.
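
Interleaved printing is one sign of concurrency. Another way to check, sketched below, is to keep the Future objects returned by submit(), read back each worker's return value, and compare the elapsed time with the sum of the individual delays. The slow_square function and its delay are hypothetical and chosen only for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(n, delay=0.3):
    # Hypothetical worker: sleeps to imitate waiting on I/O, then returns
    # the square of n together with the id of the thread that ran it.
    time.sleep(delay)
    return n * n, threading.get_ident()


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() returns a Future; keeping it lets us read the result later
    # and makes any exception raised in the worker visible via result().
    futures = [executor.submit(slow_square, n) for n in (3, 4)]
    outcomes = [f.result() for f in futures]
elapsed = time.perf_counter() - start

squares = [value for value, _ in outcomes]
thread_ids = {ident for _, ident in outcomes}
print(squares)
print(len(thread_ids), "threads, elapsed", round(elapsed, 2), "seconds")
```

If the two calls ran one after another, the elapsed time would be at least twice the delay. Running on two threads, it should stay close to a single delay.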




References:
Send Simultaneous Requests python (all at once)
https://stackoverflow.com/questions/40391898/send-simultaneous-requests-python-all-at-once
Using class attribute in concurrent.futures threads
https://stackoverflow.com/questions/46993312/using-class-attribute-in-concurrent-futures-threads
Python 多執行緒 threading 模組平行化程式設計教學 (Python multithreading: a tutorial on parallel programming with the threading module)
https://blog.gtwang.org/programming/python-threading-multithreaded-programming-tutorial/