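Taiwan orders: cash on delivery, ATM, convenience-store, and credit card/PayPal payment accepted; delivery in 4-7 working days; free shipping on orders of NT$999 or more. Prices are in New Taiwan dollars.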
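  • Web Scraping with Python (Python網絡數據采集), 2nd Edition (reprint edition), by Ryan Mitchell (US)
    Category: Computers/Networking -> Computers/Networking
    [List price]
    651-944
    [Sale price]
    407-590
    [Author] Ryan 
    [Publisher] Southeast University Press (東南大學出版社)
    [ISBN] 9787564179779
    [Discounts] Single order of NT$999 or more: free shipping plus gift
    Single order of NT$2,000 or more: 5% off, free shipping, plus gift
    Single order of NT$3,000 or more: 8% off, free shipping, plus gift
    Single order of NT$4,000 or more: 12% off, free shipping, plus gift
    [Current gifts] (1) quality non-woven reusable bag, well made; (2) brand-name signing pen; (3) brand-name pocket tissues
    Editions: genuine new print copy; e-book (PDF file)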
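    *. E-books are priced at 69% of the site price. For example, if the 了得網 price is NT$100, the PDF e-book costs NT$69.
    *. E-book purchases cannot be paid by cash on delivery; choose ATM, convenience-store, or PayPal payment at checkout. The file is sent to you by email within 1-24 hours of payment.
    *. If you are not satisfied with the e-book you receive, contact us for a refund.
    Product Details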



    ISBN: 9787564179779
    Title: Web Scraping with Python (Python網絡數據采集)
    Author: Ryan

    Code: 89
    Format: 16mo
    Boxed set: No

    Publisher: Southeast University Press (東南大學出版社)

        
        

    Web Scraping with Python (Python網絡數據采集), 2nd Edition (reprint edition)

    Author: Ryan Mitchell (US)
    List price: 89
    Publisher: Southeast University Press (東南大學出版社)
    Publication date: November 1, 2018
    Pages: 288
    Binding: Paperback
    ISBN: 9787564179779
    Table of Contents
    Preface
    Part Ⅰ.Building Scrapers
    1.Your First Web Scraper
    Connecting
    An Introduction to BeautifulSoup
    Installing BeautifulSoup
    Running BeautifulSoup
    Connecting Reliably and Handling Exceptions
    2.Advanced HTML Parsing
    You Don't Always Need a Hammer
    Another Serving of BeautifulSoup
    find() and find_all() with BeautifulSoup
    Other BeautifulSoup Objects
    Navigating Trees
    Regular Expressions
    Regular Expressions and BeautifulSoup
    Accessing Attributes
    Lambda Expressions
    3.Writing Web Crawlers
    Traversing a Single Domain
    Crawling an Entire Site
    Collecting Data Across an Entire Site
    Crawling Across the Internet
    4.Web Crawling Models
    Planning and Defining Objects
    Dealing with Different Website Layouts
    Structuring Crawlers
    Crawling Sites Through Search
    Crawling Sites Through Links
    Crawling Multiple Page Types
    Thinking About Web Crawler Models
    5.Scrapy
    Installing Scrapy
    Initializing a New Spider
    Writing a Simple Scraper
    Spidering with Rules
    Creating Items
    Outputting Items
    The Item Pipeline
    Logging with Scrapy
    More Resources
    6.Storing Data
    Media Files
    Storing Data to CSV
    MySQL
    Installing MySQL
    Some Basic Commands
    Integrating with Python
    Database Techniques and Good Practice
    "Six Degrees" in MySQL
    Email
    Part Ⅱ.Advanced Scraping
    7.Reading Documents
    Document Encoding
    Text
    Text Encoding and the Global Internet
    CSV
    Reading CSV Files
    PDF
    Microsoft Word and .docx
    8.Cleaning Your Dirty Data
    Cleaning in Code
    Data Normalization
    Cleaning After the Fact
    OpenRefine
    9.Reading and Writing Natural Languages
    Summarizing Data
    Markov Models
    Six Degrees of Wikipedia: Conclusion
    Natural Language Toolkit
    Installation and Setup
    Statistical Analysis with NLTK
    Lexicographical Analysis with NLTK
    Additional Resources
    10.Crawling Through Forms and Logins
    Python Requests Library
    Submitting a Basic Form
    Radio Buttons, Checkboxes, and Other Inputs
    Submitting Files and Images
    Handling Logins and Cookies
    HTTP Basic Access Authentication
    Other Form Problems
    11.Scraping JavaScript
    A Brief Introduction to JavaScript
    Common JavaScript Libraries
    Ajax and Dynamic HTML
    Executing JavaScript in Python with Selenium
    Additional Selenium Webdrivers
    Handling Redirects
    A Final Note on JavaScript
    12.Crawling Through APIs
    A Brief Introduction to APIs
    HTTP Methods and APIs
    More About API Responses
    Parsing JSON
    Undocumented APIs
    Finding Undocumented APIs
    Documenting Undocumented APIs
    Finding and Documenting APIs Automatically
    Combining APIs with Other Data Sources
    More About APIs
    13.Image Processing and Text Recognition
    Overview of Libraries
    Pillow
    Tesseract
    NumPy
    Processing Well-Formatted Text
    Adjusting Images Automatically
    Scraping Text from Images on Websites
    Reading CAPTCHAs and Training Tesseract
    Training Tesseract
    Retrieving CAPTCHAs and Submitting Solutions
    14.Avoiding Scraping Traps
    A Note on Ethics
    Looking Like a Human
    Adjust Your Headers
    Handling Cookies with JavaScript
    Timing Is Everything
    Common Form Security Features
    Hidden Input Field Values
    Avoiding Honeypots
    The Human Checklist
    15.Testing Your Website with Scrapers
    An Introduction to Testing
    What Are Unit Tests?
    Python unittest
    Testing Wikipedia
    Testing with Selenium
    Interacting with the Site
    unittest or Selenium?
    16.Web Crawling in Parallel
    Processes versus Threads
    Multithreaded Crawling
    Race Conditions and Queues
    The threading Module
    Multiprocess Crawling
    Multiprocess Crawling
    Communicating Between Processes
    Multiprocess Crawling--Another Approach
    17.Scraping Remotely
    Why Use Remote Servers?
    Avoiding IP Address Blocking
    Portability and Extensibility
    Tor
    PySocks
    Remote Hosting
    Running from a Website-Hosting Account
    Running from the Cloud
    Additional Resources
    18.The Legalities and Ethics of Web Scraping
    Trademarks, Copyrights, Patents, Oh My!
    Copyright Law
    Trespass to Chattels
    The Computer Fraud and Abuse Act
    robots.txt and Terms of Service
    Three Web Scrapers
    eBay versus Bidder's Edge and Trespass to Chattels
    United States v. Auernheimer and The Computer Fraud and Abuse Act
    Field v. Google: Copyright and robots.txt
    Moving Forward
    Index
    Synopsis

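    If programming is magic, then web scraping is surely a form of wizardry. By writing a simple automated program, you can query web servers, request data, and parse it to extract the information you need. The expanded edition of this practical book not only introduces web scraping but also serves as a comprehensive guide to scraping almost every type of data from the modern web. The first part of Web Scraping with Python (2nd edition, reprint, English-language edition) focuses on the mechanics of web scraping: using Python to request information from a web server, performing basic handling of the server's response, and interacting with sites in an automated fashion. The second part explores a variety of more specific tools and applications to fit any web scraping scenario you are likely to encounter.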
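
The request-then-parse workflow the synopsis describes can be illustrated with a minimal sketch. It is not taken from the book: the table of contents indicates the book works with BeautifulSoup, whereas this sketch swaps in Python's standard-library `html.parser` so it runs without extra installs. The names `TitleAndLinkParser`, `parse_page`, `fetch`, and the sample HTML are all hypothetical, and the UTF-8 decoding in `fetch` is an assumption about the target page.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleAndLinkParser(HTMLParser):
    """Collects the page title and every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title> is kept.
        if self._in_title:
            self.title += data


def parse_page(html):
    """Parse an HTML string and return (title, list of link targets)."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


def fetch(url, timeout=10):
    """Request a page and decode it as UTF-8 (an assumed encoding)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical sample page standing in for a real server response.
SAMPLE_HTML = """<html><head><title>Example Catalog</title></head>
<body><a href="/books/1">Book one</a> <a href="/books/2">Book two</a></body></html>"""

title, links = parse_page(SAMPLE_HTML)
print(title)
print(links)
```

To run the same parsing against a live page, pass the result of `fetch(url)` to `parse_page`, using a URL you are permitted to scrape.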

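    About the Author

    Ryan Mitchell (US)

    Ryan Mitchell is a senior software engineer at HedgeServ in Boston, where she develops the company's APIs and data analysis tools. She is a graduate of Olin College of Engineering and holds a master's degree in software engineering and a certificate in data science from Harvard University Extension School. Before joining HedgeServ, she worked at Abine, developing web scrapers and automation tools in Python. She regularly consults on web scraping projects in the retail, finance, and pharmaceutical industries, and has served as a curriculum consultant and adjunct faculty member at Northeastern University and Olin College of Engineering.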

    Copyright © 2024, Digital 了得網 Co., Ltd.