{"id":5788,"date":"2019-08-19T09:00:20","date_gmt":"2019-08-19T00:00:20","guid":{"rendered":"https:\/\/cg-method.com\/?p=5788"},"modified":"2022-12-05T10:52:46","modified_gmt":"2022-12-05T01:52:46","slug":"chrome-selenium-scraping","status":"publish","type":"post","link":"https:\/\/cg-method.com\/chrome-selenium-scraping\/","title":{"rendered":"Selenium\u2502How to Log In Automatically and Scrape"},"content":{"rendered":"\n

For a while I had wanted to try Selenium, a browser automation tool, and I finally gave it a go.<\/p>\n\n\n\n

Because a script can drive complex browser operations, it should greatly widen the range of tasks that can be automated.<\/p>\n\n\n\n

Setting up the Selenium environment<\/h2>\n\n\n\n

Installing Python<\/h3>\n\n\n\n

Download the installer and install Python 3.7.4 (Windows x86-64 web-based installer) on Windows.<\/p>\n\n\n\n

https:\/\/www.python.org\/downloads\/windows\/<\/a><\/p>\n\n\n\n

If you check Add Python 3.7 to PATH<\/strong> in the installer, the following paths are added to the PATH environment variable.<\/p>\n\n\n\n

C:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python37\\\nC:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python37\\Scripts\\\n<\/code><\/pre>\n\n\n\n
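To confirm that the two folders above really ended up on PATH, a quick check from Python can help. This is a minimal sketch using only the standard library; the helper name missing_commands is made up for illustration, and it assumes the commands are called python and pip on your machine.

```python
import shutil


def missing_commands(commands=('python', 'pip')):
    # shutil.which returns None when a command cannot be found on PATH
    return [name for name in commands if shutil.which(name) is None]


# An empty list means every command was found on PATH
print(missing_commands())
```

If either name is printed, the PATH checkbox was probably not ticked during installation.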

Installing Selenium<\/h3>\n\n\n\n

Open Command Prompt and enter the following command.<\/p>\n\n\n\n

pip install selenium\n<\/code><\/pre>\n\n\n\n

Downloading ChromeDriver<\/h3>\n\n\n\n

First, open Chrome and check its version from the three-dot menu > Help > About Google Chrome.<\/p>\n\n\n\n

Next, download the driver that matches your Chrome version<\/strong> from the link below (ChromeDriver 76.0.3809.68 in this example).
http:\/\/chromedriver.chromium.org\/downloads<\/a><\/p>\n\n\n\n
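The matching rule above can be checked in a few lines. The sketch below assumes that comparing the leading (major) number of the two version strings is enough, which may not hold for every release. The helper names and the Chrome version 76.0.3809.100 are hypothetical; 76.0.3809.68 is the driver version mentioned above.

```python
def major_version(version):
    # Take the part before the first dot, e.g. 76 from 76.0.3809.68
    return int(version.split('.')[0])


def versions_match(chrome_version, driver_version):
    # Assumption: the browser and the driver are compatible when
    # their major version numbers are equal
    return major_version(chrome_version) == major_version(driver_version)


print(versions_match('76.0.3809.100', '76.0.3809.68'))  # True
print(versions_match('77.0.3865.40', '76.0.3809.68'))   # False
```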

Running Selenium<\/h2>\n\n\n\n

Launching Chrome and opening Google<\/h3>\n\n\n\n

Save the code below as test.py and run it from Command Prompt.<\/p>\n\n\n\n

python C:\\Users\\username\\Desktop\\test.py\n<\/code><\/pre>\n\n\n\n

Note: double-clicking the .py file works too.<\/p>\n\n\n\n

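from selenium import webdriver\nfrom selenium.webdriver.chrome.service import Service\n\n# Start Chrome through the downloaded ChromeDriver, then open Google\ndriver = webdriver.Chrome(service=Service(\"C:\/chromedriver_win32\/chromedriver.exe\"))\ndriver.get(\"https:\/\/google.co.jp\")\n<\/code><\/pre>\n\n\n\n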

Note: on Windows, separate the path with forward slashes (\/) rather than backslashes (\\).
Note: save the file with UTF-8 encoding.<\/p>\n\n\n\n

Run the test; if Chrome launches, everything is working.
Note: Chrome will not launch unless the browser and driver versions match.<\/p>\n\n\n\n
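One way to avoid typing path separators by hand is to let pathlib build the driver path and render it with forward slashes. This is a small sketch, assuming the driver was unpacked to C:/chromedriver_win32 as in the code above; driver_path is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def driver_path(folder, filename='chromedriver.exe'):
    # Join the parts as a Windows path, then render it with forward slashes
    return (PureWindowsPath(folder) / filename).as_posix()


print(driver_path('C:/chromedriver_win32'))
# C:/chromedriver_win32/chromedriver.exe
```

The returned string can be passed wherever the script expects the location of chromedriver.exe.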

Searching Google for \"CG\u30e1\u30bd\u30c3\u30c9\"<\/h3>\n\n\n\n
from selenium import webdriver\nfrom selenium.webdriver.chrome.service import Service\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\n\ndriver = webdriver.Chrome(service=Service(\"C:\/chromedriver_win32\/chromedriver.exe\"))\ndriver.get(\"https:\/\/www.google.co.jp\/\")\n# Google's search box is the input named q\nsearch = driver.find_element(By.NAME, \"q\")\nsearch.send_keys(\"CG\u30e1\u30bd\u30c3\u30c9\")\nsearch.send_keys(Keys.RETURN)\n<\/code><\/pre>\n\n\n\n

Searching this blog for \"\u604b\u58f0\"<\/h3>\n\n\n\n
from selenium import webdriver\nfrom selenium.webdriver.chrome.service import Service\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.common.keys import Keys\n\ndriver = webdriver.Chrome(service=Service(\"C:\/chromedriver_win32\/chromedriver.exe\"))\ndriver.get(\"https:\/\/cg-method.com\")\n\n# Several elements match this XPath, so take the second one (index 1)\nxpath = '\/\/*[@id=\"s\"]'\nsearch = driver.find_elements(By.XPATH, xpath)[1]\nsearch.send_keys(\"\u604b\u58f0\")\nsearch.send_keys(Keys.RETURN)\n<\/code><\/pre>\n\n\n\n

Note: when the page has more than one search box, write driver.find_elements(By.XPATH, xpath)[1]<\/strong> to pick one by index.<\/p>\n\n\n\n
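Indexing the result list directly raises a bare IndexError when the page has fewer matches than expected. The sketch below wraps the lookup with a clearer error message. The helper names pick and find_nth are hypothetical; find_nth only assumes that the driver object has Selenium's find_elements method and passes the locator strategy as the plain string xpath.

```python
def pick(matches, index):
    # Return the match at the given position, or explain what went wrong
    if not 0 <= index < len(matches):
        raise LookupError(
            'wanted match %d but only %d found' % (index, len(matches))
        )
    return matches[index]


def find_nth(driver, xpath, index):
    # find_elements returns a list of all matches (empty when none)
    return pick(driver.find_elements('xpath', xpath), index)


# Usage with the script above (needs a running browser):
# search = find_nth(driver, xpath, 1)
```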

Logging in to Twitter Analytics and getting this month's tweet count<\/h3>\n\n\n\n

I picked Twitter for this test because it is the service I use most, but scraping it is prohibited, so I do not recommend using this in practice!<\/p>\n\n\n\n

import time\nfrom selenium import webdriver\nfrom selenium.webdriver.chrome.service import Service\nfrom selenium.webdriver.common.by import By\n\naccount = 'your_account'\npassword = 'your_password'\n\ndef twitter():\n    driver = webdriver.Chrome(service=Service(\"C:\/chromedriver_win32\/chromedriver.exe\"))\n    driver.get('https:\/\/analytics.twitter.com\/user\/cg_method\/home')\n    time.sleep(3)\n\n    # Fill in the login form\n    element_account = driver.find_element(By.CLASS_NAME, \"js-username-field\")\n    element_account.send_keys(account)\n    time.sleep(3)\n\n    element_pass = driver.find_element(By.CLASS_NAME, \"js-password-field\")\n    element_pass.send_keys(password)\n    time.sleep(3)\n\n    # Scroll to the bottom so the login button is in view, then click it\n    element_login = driver.find_element(By.XPATH, '\/\/*[@id=\"page-container\"]\/div\/div[1]\/form\/div[2]\/button')\n    driver.execute_script(\"window.scrollTo(0, document.body.scrollHeight);\")\n    element_login.click()\n    time.sleep(3)\n\n    # Read the tweet count from the analytics home page\n    selector = \"body > div.container > div > div.home-content > div > div.home-columns > div.home-column-secondary > div:nth-child(2) > div > div > div:nth-child(1) > div > div\"\n    tweetsNum = driver.find_element(By.CSS_SELECTOR, selector)\n    print(\"Tweets this month:\", tweetsNum.text)\n\n# Call the function only after it and the credentials are defined\ntwitter()\n<\/code><\/pre>\n\n\n\n
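Keeping the account name and password in the script makes it easy to leak them when the file is shared. One possible alternative is to read them from environment variables. This is a sketch only: the variable names TWITTER_ACCOUNT and TWITTER_PASSWORD and the helper load_credentials are assumptions, not part of the original script.

```python
import os


def load_credentials(environ=os.environ):
    # Hypothetical variable names; set them in the shell before running
    account = environ.get('TWITTER_ACCOUNT')
    password = environ.get('TWITTER_PASSWORD')
    if not account or not password:
        raise RuntimeError('set TWITTER_ACCOUNT and TWITTER_PASSWORD first')
    return account, password


# Usage in the script above:
# account, password = load_credentials()
```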

How to find the element you want to retrieve<\/h3>\n\n\n\n
    \n
  1. Select the element, right-click it, and choose Inspect.\n\n<\/li>\n\n\n\n
  2. \n

    Then right-click again and simply copy the CSS selector or the XPath.<\/p>\n<\/li>\n<\/ol>\n\n\n\n

    [Addendum] Example script for accessing a page frame by frame<\/h2>\n\n\n\n

    Pages split into frames gave me some trouble, so here is a note.<\/p>\n\n\n\n

    Switch to the top frame, go back up one level, then switch to the bottom frame.<\/p>\n\n\n\n
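The order of switches described above might be sketched as follows. The function read_two_frames and its parameters are hypothetical. It relies only on the switch_to.frame and switch_to.parent_frame calls that a Selenium driver provides, and it assumes both frames sit directly under the top-level page.

```python
def read_two_frames(driver, top_frame, bottom_frame, read):
    # 1. Enter the top frame and read from it
    driver.switch_to.frame(top_frame)
    top_result = read(driver)

    # 2. Go back up one level before entering a sibling frame
    driver.switch_to.parent_frame()

    # 3. Enter the bottom frame and read from it
    driver.switch_to.frame(bottom_frame)
    bottom_result = read(driver)

    # Leave the driver at the parent level again
    driver.switch_to.parent_frame()
    return top_result, bottom_result


# Usage (needs a running browser; the frame names are placeholders):
# titles = read_two_frames(driver, 'top', 'bottom', lambda d: d.title)
```

Skipping step 2 is the usual pitfall: while the driver is still inside the top frame, it cannot see the bottom frame.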