Erle Robotics Python Networking Gitbook Free
Introduction
1. Introduction to Client/Server Networking
- 1.1. Virtualenv
- 1.2. Installing virtualenv in Erle
- 1.3. Create a virtual environment to test packages
2. Introduction to socket
- 2.1. What is socket?
- 2.2. Creating a Socket
- 2.3. Using sockets
- 2.4. Disconnecting
- 2.5. Non-blocking sockets
3. UDP and TCP
- 3.1. Addresses and port numbers
- 3.2. UDP
- 3.3. TCP
4. Socket names and DNS
- 4.1. Socket names
- 4.2. Five socket coordinates
- 4.3. IPv6
- 4.4. The getaddrinfo() function
- 4.5. A Sketch of How DNS Works
- 4.6. Using DNS
5. Network Data and Network Errors
- 5.1. Text and Encodings
- 5.2. Network Byte Order
- 5.3. Framing and Quoting
- 5.4. Pickles and Self-Delimiting Formats
- 5.5. XML, JSON, Etc.
- 5.6. Compression
- 5.7. Network Exceptions
- 5.8. Handling Exceptions
6. TLS and SSL
- 6.1. Cleartext on the Network
- 6.2. TLS Encrypts Your Conversations
- 6.3. Supporting TLS in Python
- 6.4. The Standard SSL Module
7. Server Architecture
- 7.1. Daemons and Logging
- 7.2. Introductory example
- 7.3. Elementary client
- 7.4. Event-Driven Servers
- 7.5. The Semantics of Non-blocking
- 7.6. Twisted Python
- 7.7. Threading and Multi-processing
- 7.8. Threading and Multi-processing Frameworks
8. Caches, Message Queues, and Map-Reduce
- 8.1. Using Memcached
- 8.2. Memcached and Sharding
- 8.3. Message Queues
- 8.4. Using Message Queues from Python
- 8.5. Map-Reduce
9. HTTP
- 9.1. URL Anatomy
- 9.2. Relative URLs
- 9.3. Instrumenting urllib2
- 9.4. The GET Method and The Host Header
- 9.5. Payloads and Persistent Connections
- 9.6. POST And Forms
- 9.7. REST And More HTTP Methods
- 9.8. Identifying User Agents and Web Servers
- 9.9. Content Type Negotiation
- 9.10. Compression
- 9.11. HTTP Caching
- 9.12. The HEAD Method
- 9.13. HTTPS Encryption
- 9.14. HTTP Authentication
- 9.15. Cookies
- 9.16. HTTP Session Hijacking
- 9.17. Cross-Site Scripting Attacks
10. Screen Scraping
- 10.1. Fetching Web Pages
- 10.2. Downloading Pages Through Form Submission
- 10.3. The Structure of Web Pages
- 10.4. Three Axes
- 10.5. Diving into an HTML Document
- 10.6. Selectors
11. Web Applications
- 11.1. Web Servers and Python
- 11.2. Choosing a Web Server
- 11.3. WSGI
- 11.4. WSGI Middleware
- 11.5. Python Web Frameworks
- 11.6. URL Dispatch Techniques
- 11.7. Templates
- 11.8. Pure-Python Web Servers
- 11.9. Common Gateway Interface (CGI)
- 11.10. mod_python
12. E-mail Composition and Decoding
- 12.1. E-mail Messages
- 12.2. Composing Traditional Messages
- 12.3. Parsing Traditional Messages
- 12.4. Parsing Dates
- 12.5. Understanding MIME
- 12.6. Composing MIME Attachments
- 12.7. MIME Alternative Parts
- 12.8. Composing Non-English Headers
- 12.9. Composing Nested Multiparts
- 12.10. Parsing MIME Messages
- 12.11. Decoding Headers
13. Simple Mail Transport Protocol (SMTP)
- 13.1. E-mail Clients, Webmail Services
- 13.2. How SMTP Is Used
- 13.3. Sending E-Mail
- 13.4. Introducing the SMTP Library
- 13.5. Error Handling and Conversation Debugging
- 13.6. Getting Information from EHLO
- 13.7. Using Secure Sockets Layer and Transport Layer Security
- 13.8. Authenticated SMTP
14. Post Office Protocol (POP)
- 14.1. Connecting and Authenticating
- 14.2. Obtaining Mailbox Information
- 14.3. Downloading and Deleting Messages
15. Internet Message Access Protocol (IMAP)
- 15.1. Understanding IMAP in Python
- 15.2. IMAPClient
- 15.3. Message Numbers vs. UIDs
- 15.4. Summary Information
- 15.5. Downloading an Entire Mailbox
- 15.6. Downloading Messages Individually
- 15.7. Flagging and Deleting Messages
- 15.8. Searching and Manipulating Messages
16. Telnet and SSH
- 16.1. Command-Line Automation
- 16.2. Command-Line Expansion and Quoting
- 16.3. Unix Has No Special Characters
- 16.4. Quoting Characters for Protection
- 16.5. Things Are Different in a Terminal
- 16.6. Terminals Do Buffering
- 16.7. Telnet
- 16.8. SSH: The Secure Shell
- 16.9. SSH Host Keys
- 16.10. SSH Authentication
- 16.11. Shell Sessions and Individual Commands
- 16.12. SFTP: File Transfer Over SSH
17. File Transfer Protocol (FTP)
- 17.1. What to Use Instead of FTP
- 17.2. Communication Channels
- 17.3. Using FTP in Python
- 17.4. ASCII and Binary Files
- 17.5. Advanced Binary Downloading
- 17.6. Uploading Data
- 17.7. Advanced Binary Uploading
- 17.8. Handling Errors
- 17.9. Detecting Directories and Recursive Download
- 17.10. Creating Directories, Deleting Things
18. Remote Procedure Call (RPC)
- 18.1. Features of RPC
- 18.2. XML-RPC
- 18.3. JSON-RPC
- 18.4. Self-documenting Data
- 18.5. Talking About Objects: Pyro and RPyC
- 18.6. An RPyC Example
- 18.7. RPC, Web Frameworks, Message Queues


Selectors

A selector is a pattern crafted to match the document elements on which your program wants to operate. Several styles of selector are in common use:

People who are deeply XML-centric prefer XPath expressions, which are a companion technology to XML itself and let you match elements based on their ancestors, their own identity, and textual matches against their attributes and text content.

If you are a web developer, then you probably think of CSS selectors as the most natural choice for examining HTML.

Both lxml and BeautifulSoup, as we have seen, provide a smattering of their own methods for finding document elements.
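To make the idea of a selector concrete, here is a minimal sketch that matches elements by attribute and by the text of a child element. It assumes the limited XPath subset built into the standard library's xml.etree.ElementTree, chosen only so the example runs without extra packages; lxml accepts full XPath as well as CSS selectors. The fragment and its values are invented for illustration, and the code is written to run under both Python 2 and Python 3.

```python
import xml.etree.ElementTree as ET

# A small, well-formed fragment invented for illustration; real pages
# are rarely this tidy and usually need a forgiving HTML parser.
FRAGMENT = """
<table>
  <tr><td class="big">Fair<br/>55 F</td></tr>
  <tr><td><b>Humidity</b></td><td>40 %</td></tr>
  <tr><td><b>Wind Speed</b></td><td>Calm</td></tr>
</table>
""".strip()

root = ET.fromstring(FRAGMENT)

# Match on an attribute: every <td> whose class attribute is "big".
big_cells = root.findall(".//td[@class='big']")
condition = big_cells[0].text

# Match on text content: the <td> whose <b> child reads "Humidity",
# then step up to the enclosing <tr> with "..".
rows = root.findall(".//td[b='Humidity']/..")
humidity = rows[0].findall('td')[1].text

print('Condition: ' + condition)
print('Humidity: ' + humidity)
```

The same two ideas, matching on an attribute and matching on nearby text, are what the CSS selector and the XPath expression in weather.py rely on.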

Here are standards and descriptions for each of the selector styles just described:

And, finally, here are links to documentation that looks at selector methods peculiar to lxml and BeautifulSoup:

Here, finally, is a completed weather scraper, in the file weather.py:


import sys, urllib, urllib2
import lxml.etree
from lxml.cssselect import CSSSelector
from BeautifulSoup import BeautifulSoup

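# Require at least one argument naming the location to look up.
if len(sys.argv) < 2:
    print >>sys.stderr, 'usage: weather.py CITY, STATE'
    sys.exit(2)

# Passing a data argument makes urlopen() submit the form as a POST.
data = urllib.urlencode({'inputstring': ' '.join(sys.argv[1:])})
info = urllib2.urlopen('http://forecast.weather.gov/zipcity.php', data)
content = info.read()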

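# Solution #1 using CSSSelector
parser = lxml.etree.HTMLParser(encoding='utf-8')
tree = lxml.etree.fromstring(content, parser)
big = CSSSelector('td.big')(tree)[0]  # first <td class="big"> element
if big.find('font') is not None:
    big = big.find('font')  # the text may be wrapped in a <font> element
print 'Condition:', big.text.strip()
# The temperature is the text that follows the second <br> element.
print 'Temperature:', big.findall('br')[1].tail
# Find the cell labeled "Humidity", then step up to its table row.
tr = tree.xpath('.//td[b="Humidity"]')[0].getparent()
print 'Humidity:', tr.findall('td')[1].text
print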

# Solution #2 using BeautifulSoup
soup = BeautifulSoup(content)
big = soup.find('td', 'big')  # first <td> whose class is "big"
if big.font is not None:
    big = big.font
print 'Condition:', big.contents[0].string.strip()
temp = big.contents[3].string or big.contents[4].string  # can be either
print 'Temperature:', temp.replace('&deg;', ' ')
# Climb from the "Humidity" text node to <b>, then <td>, then <tr>.
tr = soup.find('b', text='Humidity').parent.parent.parent
print 'Humidity:', tr('td')[1].string
print

Note that running this script requires the third-party lxml and BeautifulSoup modules to be installed.
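The expression big.findall('br')[1].tail in the first solution can look odd at first. In the ElementTree-style API that lxml implements, text that follows an element's closing tag is stored on that element's tail attribute rather than on its parent. The following is a minimal sketch of that behavior, assuming the standard library's xml.etree.ElementTree and an invented, well-formed stand-in for the forecast cell; the real page's markup may differ.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for the forecast cell; the values are illustrative.
cell = ET.fromstring('<td class="big">Fair<br/><br/>55 F</td>')

condition = cell.text            # text before the first child element
breaks = cell.findall('br')      # the two <br/> children, in order
temperature = breaks[1].tail     # text that follows the second <br/>

print('Condition: ' + condition)
print('Temperature: ' + temperature)
```

Because no text sits between the two <br/> elements, the first one has no tail at all, which is why the scraper indexes the second.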