Erle Robotics Python Networking Gitbook Free
Introduction
1. Introduction to Client/Server Networking
- 1.1. Virtualenv
- 1.2. Installing virtualenv in Erle
- 1.3. Create a virtual environment to test packages
2. Introduction to socket
- 2.1. What is socket?
- 2.2. Creating a Socket
- 2.3. Using sockets
- 2.4. Disconnecting
- 2.5. Non - blocking sockets
3. UDP and TCP
- 3.1. Addresses and port numbers
- 3.2. UDP
- 3.3. TCP
4. Socket names and DNS
- 4.1. Socket names
- 4.2. Five socket cordinates
- 4.3. IPv6
- 4.4. The getaddrinfo() function
- 4.5. A Sketch of How DNS Works
- 4.6. Using DNS
5. Network Data and Network Errors
- 5.1. Text and Encodings
- 5.2. Network Byte Order
- 5.3. Framing and Quoting
- 5.4. Pickles and Self-Delimiting Formats
- 5.5. XML, JSON, Etc.
- 5.6. Compression
- 5.7. Network Exceptions
- 5.8. Handling Exceptions
6. TLS and SSL
- 6.1. Cleartext on the Network
- 6.2. TLS Encrypts Your Conversations
- 6.3. Supporting TLS in Python
- 6.4. The Standard SSL Module
7. Server Architecture
- 7.1. Daemons and Logging
- 7.2. Introductory example
- 7.3. Elementary client
- 7.4. Event-Driven Servers
- 7.5. The Semantics of Non-blocking
- 7.6. Twisted Python
- 7.7. Threading and Multi-processing
- 7.8. Threading and Multi-processing Frameworks
8. Caches, Message Queues, and Map-Reduce
- 8.1. Using Memcached
- 8.2. Memcached and Sharding
- 8.3. Message Queues
- 8.4. Using Message Queues from Python
- 8.5. Map-Reduce
9. HTTP
- 9.1. URL Anatomy
- 9.2. Relative URLs
- 9.3. Instrumenting urllib2
- 9.4. The GET Method and The Host Header
- 9.5. Payloads and Persistent Connections
- 9.6. POST And Forms
- 9.7. REST And More HTTP Methods
- 9.8. Identifying User Agents and Web Servers
- 9.9. Content Type Negotiation
- 9.10. Compression
- 9.11. HTTP Caching
- 9.12. The HEAD Method
- 9.13. HTTPS Encryption
- 9.14. HTTP Authentication
- 9.15. Cookies
- 9.16. HTTP Session Hijacking
- 9.17. Cross-Site Scripting Attacks
10. Screen Scraping
- 10.1. Fetching Web Pages
- 10.2. Downloading Pages Through Form Submission
- 10.3. The Structure of Web Pages
- 10.4. Three Axes
- 10.5. Diving into an HTML Document
- 10.6. Selectors
11. Web Applications
- 11.1. Web Servers and Python
- 11.2. Choosing a Web Server
- 11.3. WSGI
- 11.4. WSGI Middleware
- 11.5. Python Web Frameworks
- 11.6. URL Dispatch Techniques
- 11.7. Templates
- 11.8. Pure-Python Web Servers
- 11.9. Common Gateway Interface (CGI)
- 11.10. mod_python
12. E-mail Composition and Decoding
- 12.1. E-mail Messages
- 12.2. Composing Traditional Messages
- 12.3. Parsing Traditional Messages
- 12.4. Parsing Dates
- 12.5. Understanding MIME
- 12.6. Composing MIME Attachments
- 12.7. MIME Alternative Parts
- 12.8. Composing Non-English Headers
- 12.9. Composing Nested Multiparts
- 12.10. Parsing MIME Messages
- 12.11. Decoding Headers
13. Simple Mail Transport Protocol (SMTP)
- 13.1. E-mail Clients, Webmail Services
- 13.2. How SMTP Is Used
- 13.3. Sending E-Mail
- 13.4. Introducing the SMTP Library
- 13.5. Error Handling and Conversation Debugging
- 13.6. Getting Information from EHLO
- 13.7. Using Secure Sockets Layer and Transport Layer Security
- 13.8. Authenticated SMTP
14. Post Office Protocol (POP)
- 14.1. Connecting and Authenticating
- 14.2. Obtaining Mailbox Information
- 14.3. Downloading and Deleting Messages
15. Internet Message Access Protocol (IMAP)
- 15.1. Understanding IMAP in Python
- 15.2. IMAPClient
- 15.3. Message Numbers vs. UIDs
- 15.4. Summary Information
- 15.5. Downloading an Entire Mailbox
- 15.6. Downloading Messages Individually
- 15.7. Flagging and Deleting Messages
- 15.8. Searching and Manipulating Messages
16. Telnet and SSH
- 16.1. Command-Line Automation
- 16.2. Command-Line Expansion and Quoting
- 16.3. Unix Has No Special Characters
- 16.4. Quoting Characters for Protection
- 16.5. Things Are Different in a Terminal
- 16.6. Terminals Do Buffering
- 16.7. Telnet
- 16.8. SSH: The Secure Shell
- 16.9. SSH Host Keys
- 16.10. SSH Authentication
- 16.11. Shell Sessions and Individual Commands
- 16.12. SFTP: File Transfer Over SSH
17. File Transfer Protocol (FTP)
- 17.1. What to Use Instead of FTP
- 17.2. Communication Channels
- 17.3. Using FTP in Python
- 17.4. ASCII and Binary Files
- 17.5. Advanced Binary Downloading
- 17.6. Uploading Data
- 17.7. Advanced Binary Uploading
- 17.8. Handling Errors
- 17.9. Detecting Directories and Recursive Download
- 17.10. Creating Directories, Deleting Things
18. Remote Procedure Call (RPC)
- 18.1. Features of RPC
- 18.2. XML-RPC
- 18.3. JSON-RPC
- 18.4. Self-documenting Data
- 18.5. Talking About Objects: Pyro and RPyC
- 18.6. An RPyC Example
- 18.7. RPC, Web Frameworks, Message Queues

Erle Robotics Python Networking Gitbook Free

The Structure of Web Pages

The Hypertext Markup Language (HTML) is one of many markup dialects built atop the Standard Generalized Markup Language (SGML), which bequeathed to the world the idea of using thousands of angle brackets to mark up plain text. Inserting bold and italics into a format like HTML is as simple as typing eight angle brackets:

The very strange book Tristram Shandy. The very strange book Tristram Shandy.

In the terminology of SGML, the strings <b> and  are each tags—they are, in fact, an opening and a closing tag—and together they create an element that contains the text very inside it. Elements can contain text as well as other elements, and can define a series of key/value attribute pairs that give more information about the element:

I am reading Hamlet.
I am reading Hamlet.

The problem with SGML languages in this regard—and HTML is one particular example—is that they expect parsers to know the rules about which elements can be nested inside which other elements, and this leads to constructions like this unordered list<ul>, inside which are several list items <li>:

<ul><li>First<li>Second<li>Third<li>Fourth</ul>
- First
- Second
- Third
- Fourth

Since HTML in fact says that

elements cannot nest, an HTML parser will understand the foregoing snippet to be equivalent to this more explicit XML string:

<ul><li>First</li><li>Second</li><li>Third</li><li>Fourth</li></ul>
- First
- Second
- Third
- Fourth

And beyond this implicit understanding of HTML that a parser must possess are the twin problems that, first, various browsers over the years have varied wildly in how well they can reconstruct the document structure when given very concise or even deeply broken HTML; and, second, most web page authors judge the quality of their HTML by whether their browser of choice renders it correctly. This has resulted not only in a World Wide Web that is full of sites with invalid and broken HTML markup, but also in the fact that the permissiveness built into browsers has encouraged different flavors of broken HTML among their different user groups.

For more documentation about these topic visit: