Erle Robotics Python Networking Gitbook Free

URL Anatomy

Uniform Resource Locators (URLs) are strings that tell your web browser how to fetch resources from the World Wide Web. They are a subclass of the full set of possible Uniform Resource Identifiers (URIs); specifically, they are URIs constructed so that they give instructions for fetching a document, instead of serving only as an identifier.

To understand how they work, consider a very simple URL such as http://python.org as an example. If submitted to a web browser, this URL is interpreted as an order to resolve the host name python.org to an IP address, make a TCP connection to that IP address at the standard HTTP port 80, and then ask for the root document / that lives at that site.

Now consider a more complicated URL. Imagine that we wanted the logo for Nord/LB, a large German bank. The resulting URL might look something like this: http://example.com:8080/Nord%2FLB/logo?shape=square&dpi=96

Here, the URL specifies more information than our previous example did:

- The protocol will, again, be HTTP.
- The hostname example.com will be resolved to an IP address.
- This time, port 8080 will be used instead of 80.
- Once a connection is complete, the remote server will be asked for the resource named: /Nord%2FLB/logo?shape=square&dpi=96
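
The steps a browser takes for both example URLs can be sketched in a few lines. This is only an illustration written for Python 3, where the URL routines live in `urllib.parse`; the helper name `connection_target` and the `DEFAULT_PORTS` table are hypothetical and not part of any library. No network connection is made.

```python
from urllib.parse import urlsplit

# Standard ports for the two schemes this sketch knows about (an assumption).
DEFAULT_PORTS = {"http": 80, "https": 443}

def connection_target(url):
    """Return (host, port, resource) as a client would derive them from a URL."""
    parts = urlsplit(url)
    host = parts.hostname
    # Use the explicit port if one was given, else the scheme's standard port.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # An empty path means the root document "/".
    resource = parts.path or "/"
    if parts.query:
        resource += "?" + parts.query
    return host, port, resource

print(connection_target("http://python.org"))
print(connection_target("http://example.com:8080/Nord%2FLB/logo?shape=square&dpi=96"))
```

The host would next be resolved to an IP address and a TCP connection opened to the returned port; those steps are left out here.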

Web servers, in practice, have absolute freedom to interpret URLs as they please; however, the intention of the standard is that this URL be parsed into two question-mark-delimited pieces. The first is a path consisting of two elements:

- A Nord/LB path element.
- A logo path element.

The string following the ? is interpreted as a query containing two terms:

- A shape parameter whose value is square.
- A dpi parameter whose value is 96.
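
The intended interpretation described above can be sketched as follows, assuming Python 3 and its `urllib.parse` module. The function name `split_resource` is hypothetical; a real server is free to interpret the resource string differently.

```python
from urllib.parse import parse_qsl, unquote

def split_resource(resource):
    """Split a request resource into decoded path elements and query terms."""
    # The first "?" separates the path from the query.
    path, _, query = resource.partition("?")
    # Split on literal slashes, then percent-decode each element.
    elements = [unquote(piece) for piece in path.split("/") if piece]
    # parse_qsl() returns the query terms as (name, value) pairs.
    terms = parse_qsl(query)
    return elements, terms

elements, terms = split_resource("/Nord%2FLB/logo?shape=square&dpi=96")
print(elements)
print(terms)
```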

Any character other than the alphanumerics and a few punctuation marks (specifically the set $-_.+!*'(),) must be percent-encoded, as must the special delimiter characters themselves (like the slashes) when they appear as data rather than as delimiters. To percent-encode a character, follow a percent sign % with the two-digit hexadecimal code for the character.
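
As a small illustration of the encoding rule, the sketch below uses `quote()` and `unquote()` from Python 3's `urllib.parse`. Passing `safe=""` is a choice made here so that the slash is encoded too; by default `quote()` leaves slashes alone. Note that `quote()` is more conservative than the rule above and also encodes some of the listed punctuation marks.

```python
from urllib.parse import quote, unquote

# The hexadecimal code of "/" is 2F, so its percent-encoded form is %2F.
slash_by_hand = "%{:02X}".format(ord("/"))

# With safe="" the slash is treated as data and encoded.
encoded = quote("Nord/LB", safe="")

print(slash_by_hand)
print(encoded)
print(quote("San Francisco", safe=""))
print(unquote(encoded))
```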

You should note that the following URL paths are not equivalent:

- Nord%2FLB%2Flogo: a single path component, named Nord/LB/logo.
- Nord%2FLB/logo: two path components, Nord/LB and logo.
- Nord/LB/logo: three separate path components, Nord, LB, and logo.
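
The difference between the three paths only survives if the path is split on literal slashes before it is percent-decoded. A minimal sketch, assuming Python 3's `urllib.parse` and a hypothetical helper named `path_components`:

```python
from urllib.parse import unquote

def path_components(path):
    """Split on literal slashes first, then decode each component."""
    return [unquote(piece) for piece in path.split("/")]

for path in ("Nord%2FLB%2Flogo", "Nord%2FLB/logo", "Nord/LB/logo"):
    print(path, "->", path_components(path))

# Decoding the whole path first would erase the distinction:
# all three strings decode to the same text.
decoded_first = {unquote(p) for p in
                 ("Nord%2FLB%2Flogo", "Nord%2FLB/logo", "Nord/LB/logo")}
print(decoded_first)
```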

The most important Python routines for working with URLs live, appropriately enough, in their own module, urlparse. This module defines a standard interface to break URL strings up into components (addressing scheme, network location, path, etc.), to combine the components back into a URL string, and to convert a “relative URL” to an absolute URL given a “base URL.”

```python
>>> from urlparse import urlparse, urldefrag, parse_qs, parse_qsl
```

With these routines, you can take large and complex URLs like the example given earlier and turn them into their component parts, with RFC-compliant parsing already implemented for you:

```python
>>> p = urlparse('http://example.com:8080/Nord%2FLB/logo?shape=square&dpi=96')
>>> p
ParseResult(scheme='http', netloc='example.com:8080', path='/Nord%2FLB/logo',
            params='', query='shape=square&dpi=96', fragment='')
```

The query string that is offered by the ParseResult can then be submitted to one of the parsing routines if you want to interpret it as a series of key-value pairs, which is a standard way for web forms to submit them:

```python
>>> parse_qs(p.query)
{'shape': ['square'], 'dpi': ['96']}
```

Note that each value in this dictionary is a list, rather than simply a string. This is to support the fact that a given parameter might be specified several times in a single URL; in such cases, the values are simply appended to the list:

```python
>>> parse_qs('mode=topographic&pin=Boston&pin=San%20Francisco')
{'mode': ['topographic'], 'pin': ['Boston', 'San Francisco']}
```

This, you will note, preserves the order in which values arrive; of course, this does not preserve the order of the parameters themselves because dictionary keys do not remember any particular order. If the order is important to you, then use the parse_qsl() function instead (the l must stand for “list”):

```python
>>> parse_qsl('mode=topographic&pin=Boston&pin=San%20Francisco')
[('mode', 'topographic'), ('pin', 'Boston'), ('pin', 'San Francisco')]
```
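
The sessions above are written for Python 2. For readers on Python 3, the same functions are available from `urllib.parse`; the following sketch repeats the parsing steps there and also reads the `hostname` and `port` attributes of the result.

```python
from urllib.parse import urlparse, parse_qs, parse_qsl

p = urlparse("http://example.com:8080/Nord%2FLB/logo?shape=square&dpi=96")

# The parse result exposes the same named fields as in Python 2.
print(p.scheme, p.netloc, p.path, p.query)

# hostname and port split the network location further.
print(p.hostname, p.port)

# Query terms as a dictionary of lists, or as an ordered list of pairs.
print(parse_qs(p.query))
pairs = parse_qsl("mode=topographic&pin=Boston&pin=San%20Francisco")
print(pairs)
```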

Finally, note that an “anchor” appended to a URL after a # character is not relevant to the HTTP protocol. This is because any anchor is stripped off and is not turned into part of the HTTP request. Instead, the anchor tells a web client to jump to some particular section of a document after the HTTP transaction is complete and the document has been downloaded. To remove the anchor, use urldefrag():

```python
>>> u = 'http://docs.python.org/library/urlparse.html#urlparse.urldefrag'
>>> urldefrag(u)
('http://docs.python.org/library/urlparse.html', 'urlparse.urldefrag')
```

You can turn a ParseResult back into a URL by calling its geturl() method. When combined with the urlencode() function, which knows how to build query strings, this can be used to construct new URLs:

```python
>>> import urllib, urlparse
>>> query = urllib.urlencode({'company': 'Nord/LB', 'report': 'sales'})
>>> p = urlparse.ParseResult(
... 'https', 'example.com', 'data', None, query, None)
>>> p.geturl()
'https://example.com/data?report=sales&company=Nord%2FLB'
```
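
A comparable sketch for Python 3, where `urlencode()` and `ParseResult` both live in `urllib.parse`. Two choices here are assumptions of this example rather than requirements: the query is given as a list of pairs so that the parameter order is explicit, and empty strings are used for the unused components.

```python
from urllib.parse import ParseResult, urlencode

# urlencode() percent-encodes the slash in "Nord/LB" for us.
query = urlencode([("company", "Nord/LB"), ("report", "sales")])

# Fields: scheme, netloc, path, params, query, fragment.
p = ParseResult("https", "example.com", "/data", "", query, "")
url = p.geturl()
print(url)
```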

Finally, here is what an HTTP request built from a URL looks like when it is sent over the socket:

```
GET /rfc/rfc2616.txt HTTP/1.1
Accept-Encoding: identity
Host: www.ietf.org
Connection: close
User-Agent: Python-urllib/2.7
```
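
To show how the URL pieces map onto such a request, here is a minimal sketch in Python 3. The function name `build_get_request` is hypothetical, the URL is inferred from the Host header and path shown above, and only the request line plus two of the headers are produced. Nothing is sent over the network.

```python
from urllib.parse import urlsplit

def build_get_request(url):
    """Build the bytes of a bare-bones HTTP/1.1 GET request for a URL."""
    parts = urlsplit(url)
    # The request line carries only the path and query, never the host.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        "GET {} HTTP/1.1".format(target),
        # The host name travels separately, in the Host header.
        "Host: {}".format(parts.netloc),
        "Connection: close",
        "",  # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("http://www.ietf.org/rfc/rfc2616.txt")
print(request.decode("ascii"))
```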

The HTTP response that comes back over the socket also starts with a set of headers, but then also includes a body that contains the requested document itself:

```
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Fri, 11 Jul 2014 07:02:55 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: close
Set-Cookie: __cfduid=d5be98ff9fbae526f308d478da5bb413e1405062173934; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.ietf.org; HttpOnly
Last-Modified: Fri, 11 Jun 1999 18:46:53 GMT
Vary: Accept-Encoding
CF-RAY: 1483235b13c51043-CDG
<addinfourl at 4341048456 whose fp = <socket._fileobject object at 0x102a13750>>
```
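
The split between the headers and the body can be illustrated with a short Python 3 sketch. The function name `parse_response_head` and the sample bytes are made up for this example; the sample deliberately omits `Transfer-Encoding: chunked`, since a chunked body would need further decoding that is not shown here.

```python
def parse_response_head(raw):
    """Split raw response bytes into status line, headers, and body."""
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, numeric status code, reason phrase.
    version, status, reason = lines[0].split(" ", 2)
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return version, int(status), reason, headers, body

# Hypothetical sample response, shortened from the headers shown above.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/plain\r\n"
          b"Connection: close\r\n"
          b"\r\n"
          b"document body")

version, status, reason, headers, body = parse_response_head(sample)
print(version, status, reason)
print(headers)
print(body)
```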