[Robot] Crawling HTTPS sites with WebClient and HttpClient

  • 2017-12-22

Using the WebClient class to crawl an HTTPS site:

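static void Main(string[] args)
{
	// WARNING: returning true accepts every certificate, including invalid ones.
	// Use this only against hosts you trust, for example a test server.
	ServicePointManager.ServerCertificateValidationCallback =
						delegate { return true; };
	// Most HTTPS sites now reject TLS 1.0, so request TLS 1.2 (.NET 4.5 or later).
	ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

	// WebClient is IDisposable, so wrap it in a using block.
	using (WebClient wc = new WebClient())
	{
		string s = wc.DownloadString("https://yourwebsite/");
	}
}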
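
The callback above accepts every certificate, which also hides genuine man-in-the-middle problems. As a rough sketch of a narrower alternative, the callback can accept a certificate only when normal validation succeeded or when its thumbprint matches one you already trust. The class name CertificateCheck, the helper IsAcceptable, and the all-zero TrustedThumbprint are hypothetical placeholders, not part of the original post.

```csharp
using System;
using System.Net;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

static class CertificateCheck
{
    // Hypothetical placeholder: replace with the SHA-1 thumbprint of the certificate you trust.
    const string TrustedThumbprint = "0000000000000000000000000000000000000000";

    // Pure decision function, kept separate so it can be checked without a network call.
    public static bool IsAcceptable(SslPolicyErrors errors, string thumbprint)
    {
        if (errors == SslPolicyErrors.None)
            return true; // the normal chain validation already succeeded

        // Otherwise accept only the one certificate that was pinned explicitly.
        return string.Equals(thumbprint, TrustedThumbprint, StringComparison.OrdinalIgnoreCase);
    }

    // Registers the callback process-wide, like the assignment in the example above.
    public static void Install()
    {
        ServicePointManager.ServerCertificateValidationCallback =
            (sender, certificate, chain, errors) =>
                IsAcceptable(errors, certificate == null ? null : certificate.GetCertHashString());
    }

    static void Main()
    {
        Install();
        // A certificate that passed normal validation is accepted.
        Console.WriteLine(IsAcceptable(SslPolicyErrors.None, null));
        // A failed validation with an unknown thumbprint is rejected.
        Console.WriteLine(IsAcceptable(SslPolicyErrors.RemoteCertificateNameMismatch, "ABCDEF"));
    }
}
```

Call CertificateCheck.Install() once before the first request; the rest of the WebClient code stays the same.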


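Crawling an HTTPS page with HttpClient:
Add a reference to C:\Program Files (x86)\Microsoft ASP.NET\ASP.NET MVC 4\Assemblies\System.Net.Http.dll
and add using System.Net.Http; (the example also needs using System.Net; and using System.Net.Http.Headers;)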

static void Main(string[] args)
{
	// Configure the connection settings before the first request is sent.
	ServicePointManager.Expect100Continue = true;
	// 3072 is the numeric value of SecurityProtocolType.Tls12.
	ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072;
	ServicePointManager.DefaultConnectionLimit = 9999;

	string result = "";

	using (HttpClient httpClient = new HttpClient())
	{
		httpClient.BaseAddress = new Uri("https://gg.bet");

		// Change application/json to whatever media type the site returns.
		httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

		// .Result blocks until the request completes.
		HttpResponseMessage httpResponseMessage = httpClient.GetAsync("/en/betting/match/0:7504").Result;

		// Keep the body only when the server returned a 2xx status code.
		if (httpResponseMessage.IsSuccessStatusCode)
		{
			result = httpResponseMessage.Content.ReadAsStringAsync().Result;
		}
	}
	Console.WriteLine(result);
	Console.WriteLine("press any key to continue");
	Console.ReadKey();
}
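
The example above blocks on .Result, which is fine in a console Main but can deadlock in UI or ASP.NET code. A minimal sketch of the same request written with async/await might look like the following. The class name AsyncFetch and the helpers ReadBodyOrEmptyAsync and FetchAsync are hypothetical names introduced for illustration, and SecurityProtocolType.Tls12 assumes .NET 4.5 or later.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

static class AsyncFetch
{
    // Returns the body for a 2xx response and an empty string otherwise.
    public static async Task<string> ReadBodyOrEmptyAsync(HttpResponseMessage response)
    {
        if (!response.IsSuccessStatusCode || response.Content == null)
            return "";
        return await response.Content.ReadAsStringAsync();
    }

    // Sends a GET request relative to client.BaseAddress and reads the body.
    public static async Task<string> FetchAsync(HttpClient client, string relativeUrl)
    {
        using (HttpResponseMessage response = await client.GetAsync(relativeUrl))
        {
            return await ReadBodyOrEmptyAsync(response);
        }
    }

    static void Main()
    {
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        using (HttpClient client = new HttpClient())
        {
            client.BaseAddress = new Uri("https://gg.bet");
            client.DefaultRequestHeaders.Accept.Add(
                new MediaTypeWithQualityHeaderValue("application/json"));

            // A console Main has no synchronization context, so blocking here is safe;
            // in other application types, await FetchAsync instead.
            string body = FetchAsync(client, "/en/betting/match/0:7504").GetAwaiter().GetResult();
            Console.WriteLine(body);
        }
    }
}
```

Splitting the status check into ReadBodyOrEmptyAsync keeps the success/failure rule in one place and lets it be exercised without touching the network.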



References:
http://blog.darkthread.net/blogs/darkthreadtw/archive/2010/05/06/webclient-ssl-dismatch.aspx
https://www.blogger.com/comment.g?blogID=4221096241867026900&postID=7691647913213189973&page=1&token=1478767251776