Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2011-10-13T22:14:43+08:00

====== Python Web Server Gateway Interface v1.0 ======
Created Thursday 13 October 2011
http://www.python.org/dev/peps/pep-0333/

PEP: 333
Title: Python Web Server Gateway Interface v1.0
Version: 763b6e5c6cf1
Last-Modified: 2011-03-04 04:58:22 +0000 (Fri, 04 Mar 2011)
Author: Phillip J. Eby <pje at telecommunity.com>
Discussions-To: Python Web-SIG <web-sig at python.org>
Status: Final
Type: Informational
Content-Type: text/x-rst
Created: 07-Dec-2003
Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004, 27-Sep-2010
Superseded-By: 3333

Contents

Preface
Abstract
Rationale and Goals
Specification Overview
The Application/Framework Side
The Server/Gateway Side
Middleware: Components that Play Both Sides
Specification Details
environ Variables
Input and Error Streams
The start_response() Callable
Handling the Content-Length Header
Buffering and Streaming
Middleware Handling of Block Boundaries
The write() Callable
Unicode Issues
Error Handling
HTTP 1.1 Expect/Continue
Other HTTP Features
Thread Support
Implementation/Application Notes
Server Extension APIs
Application Configuration
URL Reconstruction
Supporting Older (<2.2) Versions of Python
Optional Platform-Specific File Handling
Questions and Answers
Proposed/Under Discussion
Acknowledgements
References
Copyright

====== Preface ======
Note: For an updated version of this spec that supports Python 3.x and includes community errata, addenda, and clarifications, please see PEP 3333 instead.

===== Abstract =====

This document specifies a proposed **standard interface between web servers and Python web applications or frameworks, to promote web application portability across a variety of web servers.**

===== Rationale and Goals =====

Python currently boasts a wide variety of **web application frameworks**, such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to name just a few [1]. This wide variety of choices can be a problem for new Python users, because generally speaking, **their choice of web framework will limit their choice of usable web servers**, and vice versa.
By contrast, although Java has just as many web application frameworks available, Java's "servlet" API makes it possible for applications written with any Java web application framework to run in any web server that supports the servlet API.
The availability and widespread use of such an API in web servers for Python -- whether those servers are written in Python (e.g. Medusa), embed Python (e.g. mod_python), or invoke Python via a gateway protocol (e.g. CGI, FastCGI, etc.) -- would **separate choice of framework from choice of web server**, freeing users to choose a pairing that suits them, while freeing framework and server developers to focus on their preferred area of specialization.

__This PEP, therefore, proposes a simple and universal interface between web servers and web applications or frameworks: the Python Web Server Gateway Interface (WSGI).__
But the mere existence of a WSGI spec does nothing to address the existing state of servers and frameworks for Python web applications. Server and framework authors and maintainers must actually **implement WSGI** for there to be any effect.

However, since no existing servers or frameworks support WSGI, there is little immediate reward for an author who implements WSGI support. Thus, WSGI must be easy to implement, so that an author's initial investment in the interface can be reasonably low.
Thus, __simplicity of implementation__ on both the server and framework sides of the interface is absolutely critical to the utility of the WSGI interface, and is therefore the principal criterion for any design decisions.

Note, however, that simplicity of implementation for a framework author is not the same thing as ease of use for a web application author. WSGI presents an absolutely "no frills" (frills: ruffles, lace, ornamentation) interface to the framework author, because bells and whistles like response objects and cookie handling would just get in the way of existing frameworks' handling of these issues. Again, __the goal of WSGI is to facilitate easy interconnection of existing servers and applications or frameworks__, not to create a new web framework.

Note also that this goal precludes WSGI from requiring anything that is not already available in deployed versions of Python. Therefore, new standard library modules are not proposed or required by this specification, and nothing in WSGI requires a Python version greater than 2.2.2. (It would be a good idea, however, for future versions of Python to include support for this interface in web servers provided by the standard library.)
In addition to ease of implementation for existing and future frameworks and servers, it should also be easy to create request preprocessors, response postprocessors, and other __WSGI-based "middleware" components__ that look like an application to their containing server, while acting as a server for their contained applications.
If middleware can be both simple and robust, and WSGI is widely available in servers and frameworks, it allows for the possibility of an entirely new kind of Python web application framework: one consisting of loosely-coupled WSGI middleware components. Indeed, existing framework authors may even choose to refactor their frameworks' existing services to be provided in this way, becoming more like libraries used with WSGI, and less like monolithic frameworks. This would then allow application developers to choose "best-of-breed" components for specific functionality, rather than having to commit to all the pros and cons of a single framework.
Of course, as of this writing, that day is doubtless quite far off. In the meantime, it is a sufficient short-term goal for WSGI to enable the use of any framework with any server.
Finally, it should be mentioned that the current version of WSGI does not prescribe any particular mechanism for "deploying" an application for use with a web server or server gateway. At the present time, this is necessarily implementation-defined by the server or gateway. After a sufficient number of servers and frameworks have implemented WSGI to provide field experience with varying deployment requirements, it may make sense to create another PEP, describing a deployment standard for WSGI servers and application frameworks.

====== Specification Overview ======

The WSGI interface has two sides: the "server" or "gateway" side, and the "application" or "framework" side. The server side invokes a callable object that is provided by the application side. The specifics of how that object is provided are up to the server or gateway. It is assumed that some servers or gateways will require an application's deployer to write a short script to create an instance of the server or gateway, and supply it with the application object. Other servers and gateways may use configuration files or other mechanisms to specify where an application object should be imported from, or otherwise obtained.
In addition to "pure" servers/gateways and applications/frameworks, it is also possible to create "middleware" components that implement both sides of this specification. Such components act as an application to their containing server, and as a server to a contained application, and can be used to provide extended APIs, content transformation, navigation, and other useful functions.
Throughout this specification, we will use the term "a callable" to mean "a function, method, class, or an instance with a __call__ method". It is up to the server, gateway, or application implementing the callable to choose the appropriate implementation technique for their needs. Conversely, a server, gateway, or application that is invoking a callable must not have any dependency on what kind of callable was provided to it. Callables are only to be called, not introspected upon.

===== The Application/Framework Side =====

The application object is simply a callable object that accepts two arguments. The term "object" should not be misconstrued as requiring an actual object instance: a function, method, class, or instance with a __call__ method are all acceptable for use as an application object. Application objects must be able to be invoked more than once, as virtually all servers/gateways (other than CGI) will make such repeated requests.
(Note: although we refer to it as an "application" object, this should not be construed to mean that application developers will use WSGI as a web programming API! It is assumed that application developers will continue to use existing, high-level framework services to develop their applications. WSGI is a tool for framework and server developers, and is not intended to directly support application developers.)
Here are two example application objects; one is a function, and the other is a class:
def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']


class AppClass:
    """Produce the same output, but using a class

    (Note: 'AppClass' is the "application" here, so calling it
    returns an instance of 'AppClass', which is then the iterable
    return value of the "application callable" as required by
    the spec.

    If we wanted to use *instances* of 'AppClass' as application
    objects instead, we would have to implement a '__call__'
    method, which would be invoked to execute the application,
    and we would need to create an instance for use by the
    server or gateway.
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        self.start(status, response_headers)
        yield "Hello world!\n"

===== The Server/Gateway Side =====

The server or gateway invokes the application callable once for each request it receives from an HTTP client, that is directed at the application. To illustrate, here is a simple CGI gateway, implemented as a function taking an application object. Note that this simple example has limited error handling, because by default an uncaught exception will be dumped to sys.stderr and logged by the web server.
import os, sys

def run_with_cgi(application):

    environ = dict(os.environ.items())
    environ['wsgi.input'] = sys.stdin
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.version'] = (1, 0)
    environ['wsgi.multithread'] = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once'] = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    def write(data):
        if not headers_set:
            raise AssertionError("write() before start_response()")

        elif not headers_sent:
            # Before the first output, send the stored headers
            status, response_headers = headers_sent[:] = headers_set
            sys.stdout.write('Status: %s\r\n' % status)
            for header in response_headers:
                sys.stdout.write('%s: %s\r\n' % header)
            sys.stdout.write('\r\n')

        sys.stdout.write(data)
        sys.stdout.flush()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[0], exc_info[1], exc_info[2]
            finally:
                exc_info = None     # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]
        return write

    result = application(environ, start_response)
    try:
        for data in result:
            if data:    # don't send headers until body appears
                write(data)
        if not headers_sent:
            write('')   # send headers now if body was empty
    finally:
        if hasattr(result, 'close'):
            result.close()
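The gateway above is written for Python 2 and writes to the process's standard output. For experimenting without a web server, the same control flow can be sketched in Python 3 with an in-memory buffer. This is an illustration under stated assumptions, not part of the specification: `run_in_memory` and `hello_app` are hypothetical names, the default environ values are arbitrary, and bodies are byte strings as in PEP 3333.

```python
import io
import sys


def run_in_memory(application, environ=None):
    """Call a WSGI application once and return the raw response bytes."""
    environ = dict(environ or {})
    # Arbitrary defaults for the required CGI variables (assumed values).
    environ.setdefault('REQUEST_METHOD', 'GET')
    environ.setdefault('SCRIPT_NAME', '')
    environ.setdefault('PATH_INFO', '/')
    environ.setdefault('SERVER_NAME', 'localhost')
    environ.setdefault('SERVER_PORT', '80')
    environ.setdefault('SERVER_PROTOCOL', 'HTTP/1.0')
    environ['wsgi.input'] = io.BytesIO(b'')
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.version'] = (1, 0)
    environ['wsgi.multithread'] = False
    environ['wsgi.multiprocess'] = False
    environ['wsgi.run_once'] = True
    environ['wsgi.url_scheme'] = 'http'

    out = io.BytesIO()
    headers_set = []
    headers_sent = []

    def write(data):
        if not headers_set:
            raise AssertionError("write() before start_response()")
        if not headers_sent:
            # Before the first output, send the stored headers.
            status, response_headers = headers_sent[:] = headers_set
            out.write(('Status: %s\r\n' % status).encode('iso-8859-1'))
            for name, value in response_headers:
                out.write(('%s: %s\r\n' % (name, value)).encode('iso-8859-1'))
            out.write(b'\r\n')
        out.write(data)

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise the original exception if headers were sent.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")
        headers_set[:] = [status, response_headers]
        return write

    result = application(environ, start_response)
    try:
        for data in result:
            if data:  # don't send headers until body appears
                write(data)
        if not headers_sent:
            write(b'')  # send headers now if body was empty
    finally:
        if hasattr(result, 'close'):
            result.close()
    return out.getvalue()


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello world!\n']


response = run_in_memory(hello_app)
```

Returning the bytes instead of printing them keeps the sketch easy to inspect in a test; a real gateway would write to its client connection instead.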

===== Middleware: Components that Play Both Sides =====

Note that a single object may play the role of a server with respect to some application(s), while also acting as an application with respect to some server(s). Such "middleware" components can perform such functions as:
* Routing a request to different application objects based on the target URL, after rewriting the environ accordingly.
* Allowing multiple applications or frameworks to run side-by-side in the same process.
* Load balancing and remote processing, by forwarding requests and responses over a network.
* Performing content postprocessing, such as applying XSL stylesheets.

The presence of middleware in general is transparent to both the "server/gateway" and the "application/framework" sides of the interface, and should require no special support. A user who desires to incorporate middleware into an application simply provides the middleware component to the server, as if it were an application, and configures the middleware component to invoke the application, as if the middleware component were a server. Of course, the "application" that the middleware wraps may in fact be another middleware component wrapping another application, and so on, creating what is referred to as a "middleware stack".
For the most part, middleware must conform to the restrictions and requirements of both the server and application sides of WSGI. In some cases, however, requirements for middleware are more stringent than for a "pure" server or application, and these points will be noted in the specification.
Here is a (tongue-in-cheek) example of a middleware component that converts text/plain responses to pig latin, using Joe Strout's piglatin.py. (Note: a "real" middleware component would probably use a more robust way of checking the content type, and should also check for a content encoding. Also, this simple example ignores the possibility that a word might be split across a block boundary.)
from piglatin import piglatin

class LatinIter:

    """Transform iterated output to piglatin, if it's okay to do so

    Note that the "okayness" can change until the application yields
    its first non-empty string, so 'transform_ok' has to be a mutable
    truth value.
    """

    def __init__(self, result, transform_ok):
        if hasattr(result, 'close'):
            self.close = result.close
        self._next = iter(result).next
        self.transform_ok = transform_ok

    def __iter__(self):
        return self

    def next(self):
        if self.transform_ok:
            return piglatin(self._next())
        else:
            return self._next()

class Latinator:

    # by default, don't transform output
    transform = False

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):

        transform_ok = []

        def start_latin(status, response_headers, exc_info=None):

            # Reset ok flag, in case this is a repeat call
            del transform_ok[:]

            for name, value in response_headers:
                if name.lower() == 'content-type' and value == 'text/plain':
                    transform_ok.append(True)
                    # Strip content-length if present, else it'll be wrong
                    response_headers = [(name, value)
                        for name, value in response_headers
                            if name.lower() != 'content-length'
                    ]
                    break

            write = start_response(status, response_headers, exc_info)

            if transform_ok:
                def write_latin(data):
                    write(piglatin(data))
                return write_latin
            else:
                return write

        return LatinIter(self.application(environ, start_latin), transform_ok)


# Run foo_app under a Latinator's control, using the example CGI gateway
from foo_app import foo_app
run_with_cgi(Latinator(foo_app))
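The pig latin example depends on an external module and on Python 2 iterators. A smaller sketch of the same "both sides" pattern, assuming Python 3, is a middleware that only appends a response header. `AddHeader`, `plain_app`, and the `X-Example` header are hypothetical names chosen for illustration.

```python
class AddHeader:
    """Middleware that appends one response header (hypothetical example)."""

    def __init__(self, application, name, value):
        self.application = application
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        # Towards the wrapped application, this object plays the server:
        # it supplies its own start_response callable.
        def start_with_header(status, response_headers, exc_info=None):
            # Copy rather than mutate the application's own list.
            headers = list(response_headers) + [(self.name, self.value)]
            return start_response(status, headers, exc_info)

        # Towards the real server, this object plays the application.
        # The wrapped iterable is returned untouched, so its close()
        # method and block boundaries pass straight through.
        return self.application(environ, start_with_header)


def plain_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello world!\n']


wrapped = AddHeader(plain_app, 'X-Example', 'demo')
```

Since `wrapped` is itself an application object, it can be wrapped again, which is how a middleware stack is built.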

====== Specification Details ======

The application object must accept two positional arguments. For the sake of illustration, we have named them environ and start_response, but they are not required to have these names. A server or gateway must invoke the application object using positional (not keyword) arguments. (E.g. by calling result = application(environ, start_response) as shown above.)
The environ parameter is a dictionary object, containing CGI-style environment variables. This object must be a builtin Python dictionary (not a subclass, UserDict or other dictionary emulation), and the application is allowed to modify the dictionary in any way it desires. The dictionary must also include certain WSGI-required variables (described in a later section), and may also include server-specific extension variables, named according to a convention that will be described below.
The start_response parameter is a callable accepting two required positional arguments, and one optional argument. For the sake of illustration, we have named these arguments status, response_headers, and exc_info, but they are not required to have these names, and the application must invoke the start_response callable using positional arguments (e.g. start_response(status, response_headers)).
The status parameter is a status string of the form "999 Message here", and response_headers is a list of (header_name, header_value) tuples describing the HTTP response header. The optional exc_info parameter is described below in the sections on The start_response() Callable and Error Handling. It is used only when the application has trapped an error and is attempting to display an error message to the browser.
The start_response callable must return a write(body_data) callable that takes one positional parameter: a string to be written as part of the HTTP response body. (Note: the write() callable is provided only to support certain existing frameworks' imperative output APIs; it should not be used by new applications or frameworks if it can be avoided. See the Buffering and Streaming section for more details.)
When called by the server, the application object must return an iterable yielding zero or more strings. This can be accomplished in a variety of ways, such as by returning a list of strings, or by the application being a generator function that yields strings, or by the application being a class whose instances are iterable. Regardless of how it is accomplished, the application object must always return an iterable yielding zero or more strings.
The server or gateway must transmit the yielded strings to the client in an unbuffered fashion, completing the transmission of each string before requesting another one. (In other words, applications should perform their own buffering. See the Buffering and Streaming section below for more on how application output must be handled.)
The server or gateway should treat the yielded strings as binary byte sequences: in particular, it should ensure that line endings are not altered. The application is responsible for ensuring that the string(s) to be written are in a format suitable for the client. (The server or gateway may apply HTTP transfer encodings, or perform other transformations for the purpose of implementing HTTP features such as byte-range transmission. See Other HTTP Features, below, for more details.)
If a call to len(iterable) succeeds, the server must be able to rely on the result being accurate. That is, if the iterable returned by the application provides a working __len__() method, it must return an accurate result. (See the Handling the Content-Length Header section for information on how this would normally be used.)
If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request, whether the request was completed normally, or terminated early due to an error. (This is to support resource release by the application. This protocol is intended to complement PEP 325's generator support, and other common iterables with close() methods.)
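The close() rule can be illustrated with a small sketch, assuming Python 3. `FileBody` and `consume` are hypothetical names: the first is an iterable an application might return, the second is the loop a server or gateway might run over it.

```python
import io


class FileBody:
    """Iterable response body that releases its file in close() (hypothetical)."""

    def __init__(self, fileobj, block_size=8192):
        self.fileobj = fileobj
        self.block_size = block_size

    def __iter__(self):
        while True:
            block = self.fileobj.read(self.block_size)
            if not block:
                return
            yield block

    def close(self):
        self.fileobj.close()


def consume(result, send):
    """Server-side loop: close() is called even if sending fails."""
    try:
        for data in result:
            if data:
                send(data)
    finally:
        if hasattr(result, 'close'):
            result.close()


# Usage: the file is closed once the request completes.
source = io.BytesIO(b'abcdef')
sent = []
consume(FileBody(source, block_size=4), sent.append)
```

The `try`/`finally` is what makes the guarantee hold when the client disconnects or the application raises part-way through the body.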
(Note: the application must invoke the start_response() callable before the iterable yields its first body string, so that the server can send the headers before any body content. However, this invocation may be performed by the iterable's first iteration, so servers must not assume that start_response() has been called before they begin iterating over the iterable.)
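The reason servers must not assume an early start_response() call is easiest to see with a generator application, whose body does not run until iteration starts. The following Python 3 sketch records the order of events; `generator_app`, `recording_start_response`, and `events` are hypothetical names.

```python
def generator_app(environ, start_response):
    """Generator application: nothing here runs until the server iterates."""
    start_response('200 OK', [('Content-type', 'text/plain')])
    yield b'Hello '
    yield b'world!\n'


events = []


def recording_start_response(status, response_headers, exc_info=None):
    events.append('start_response')
    return lambda data: None  # stand-in for the write() callable


# Calling the application only creates the generator object.
result = generator_app({}, recording_start_response)
events.append('application returned')

# start_response() is invoked during the first iteration.
body = b''.join(result)
```

After this runs, `events` shows that the application returned before start_response() was called.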
Finally, servers and gateways must not directly use any other attributes of the iterable returned by the application, unless it is an instance of a type specific to that server or gateway, such as a "file wrapper" returned by wsgi.file_wrapper (see Optional Platform-Specific File Handling). In the general case, only attributes specified here, or accessed via e.g. the PEP 234 iteration APIs are acceptable.

===== environ Variables =====

The environ dictionary is required to contain these CGI environment variables, as defined by the Common Gateway Interface specification [2]. The following variables must be present, unless their value would be an empty string, in which case they may be omitted, except as otherwise noted below.
REQUEST_METHOD
The HTTP request method, such as "GET" or "POST". This cannot ever be an empty string, and so is always required.

SCRIPT_NAME
The initial portion of the request URL's "path" that corresponds to the application object, so that the application knows its virtual "location". This may be an empty string, if the application corresponds to the "root" of the server.

PATH_INFO
The remainder of the request URL's "path", designating the virtual "location" of the request's target within the application. This may be an empty string, if the request URL targets the application root and does not have a trailing slash.

QUERY_STRING
The portion of the request URL that follows the "?", if any. May be empty or absent.

CONTENT_TYPE
The contents of any Content-Type fields in the HTTP request. May be empty or absent.

CONTENT_LENGTH
The contents of any Content-Length fields in the HTTP request. May be empty or absent.

SERVER_NAME, SERVER_PORT
When combined with SCRIPT_NAME and PATH_INFO, these variables can be used to complete the URL. Note, however, that HTTP_HOST, if present, should be used in preference to SERVER_NAME for reconstructing the request URL. See the URL Reconstruction section below for more detail. SERVER_NAME and SERVER_PORT can never be empty strings, and so are always required.

SERVER_PROTOCOL
The version of the protocol the client used to send the request. Typically this will be something like "HTTP/1.0" or "HTTP/1.1" and may be used by the application to determine how to treat any HTTP request headers. (This variable should probably be called REQUEST_PROTOCOL, since it denotes the protocol used in the request, and is not necessarily the protocol that will be used in the server's response. However, for compatibility with CGI we have to keep the existing name.)

HTTP_ Variables
Variables corresponding to the client-supplied HTTP request headers (i.e., variables whose names begin with "HTTP_"). The presence or absence of these variables should correspond with the presence or absence of the appropriate HTTP header in the request.

A server or gateway should attempt to provide as many other CGI variables as are applicable. In addition, if SSL is in use, the server or gateway should also provide as many of the Apache SSL environment variables [5] as are applicable, such as HTTPS=on and SSL_PROTOCOL. Note, however, that an application that uses any CGI variables other than the ones listed above is necessarily non-portable to web servers that do not support the relevant extensions. (For example, web servers that do not publish files will not be able to provide a meaningful DOCUMENT_ROOT or PATH_TRANSLATED.)

A WSGI-compliant server or gateway should document what variables it provides, along with their definitions as appropriate. Applications should check for the presence of any variables they require, and have a fallback plan in the event such a variable is absent.
Note: missing variables (such as REMOTE_USER when no authentication has occurred) should be left out of the environ dictionary. Also note that CGI-defined variables must be strings, if they are present at all. It is a violation of this specification for a CGI variable's value to be of any type other than str.
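The rules for CGI variables can be turned into a rough self-check. The sketch below assumes Python 3, and `check_cgi_environ` and `REQUIRED_CGI_KEYS` are hypothetical names. Treating every key without a dot as a CGI-style variable is a heuristic of this sketch, not a rule from the specification.

```python
# Variables the text says can never be empty, and so are always required.
REQUIRED_CGI_KEYS = ('REQUEST_METHOD', 'SERVER_NAME', 'SERVER_PORT')


def check_cgi_environ(environ):
    """Return a list of problems found in the CGI part of environ."""
    problems = []
    if type(environ) is not dict:
        # Subclasses and dictionary emulations are not allowed.
        problems.append('environ must be a builtin dict')
        return problems
    for key in REQUIRED_CGI_KEYS:
        if not environ.get(key):
            problems.append('%s is required and cannot be empty' % key)
    for key, value in environ.items():
        # Heuristic: wsgi.* and server extension keys contain a dot
        # and may hold values of any type, so they are skipped here.
        if '.' not in key and not isinstance(value, str):
            problems.append('%s must be a str, got %s'
                            % (key, type(value).__name__))
    return problems
```

For example, passing `SERVER_PORT` as the integer `80` instead of the string `'80'` is reported as a problem, while a missing `REMOTE_USER` is not.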
In addition to the CGI-defined variables, the environ dictionary may also contain arbitrary operating-system "environment variables", and must contain the following WSGI-defined variables:

wsgi.version
The tuple (1, 0), representing WSGI version 1.0.

wsgi.url_scheme
A string representing the "scheme" portion of the URL at which the application is being invoked. Normally, this will have the value "http" or "https", as appropriate.

wsgi.input
An input stream (file-like object) from which the HTTP request body can be read. (The server or gateway may perform reads on-demand as requested by the application, or it may pre-read the client's request body and buffer it in-memory or on disk, or use any other technique for providing such an input stream, according to its preference.)

wsgi.errors
An output stream (file-like object) to which error output can be written, for the purpose of recording program or other errors in a standardized and possibly centralized location. This should be a "text mode" stream; i.e., applications should use "\n" as a line ending, and assume that it will be converted to the correct line ending by the server/gateway.
For many servers, wsgi.errors will be the server's main error log. Alternatively, this may be sys.stderr, or a log file of some sort. The server's documentation should include an explanation of how to configure this or where to find the recorded output. A server or gateway may supply different error streams to different applications, if this is desired.

wsgi.multithread
This value should evaluate true if the application object may be simultaneously invoked by another thread in the same process, and should evaluate false otherwise.

wsgi.multiprocess
This value should evaluate true if an equivalent application object may be simultaneously invoked by another process, and should evaluate false otherwise.

wsgi.run_once
This value should evaluate true if the server or gateway expects (but does not guarantee!) that the application will only be invoked this one time during the life of its containing process. Normally, this will only be true for a gateway based on CGI (or something similar).

Finally, the environ dictionary may also contain server-defined variables. These variables should be named using only lower-case letters, numbers, dots, and underscores, and should be prefixed with a name that is unique to the defining server or gateway. For example, mod_python might define variables with names like mod_python.some_variable.
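The naming convention for server-defined variables can be checked with a short regular expression. This is a sketch under assumptions: `is_extension_key` is a hypothetical helper, and requiring a dot between the server prefix and the variable name follows the `mod_python.some_variable` example rather than an explicit rule in the text.

```python
import re

# Lower-case letters, digits, dots and underscores only, with a
# server-specific prefix before the first dot (assumed separator).
_EXTENSION_KEY = re.compile(r'[a-z0-9_]+\.[a-z0-9_.]+')


def is_extension_key(key):
    """True if key looks like a server-defined variable name."""
    if key.startswith('wsgi.'):
        return False  # reserved for the WSGI-defined variables
    return _EXTENSION_KEY.fullmatch(key) is not None
```

A server author could use a check like this in a test suite to keep extension keys from colliding with CGI variables, which are upper-case.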

==== Input and Error Streams ====

The input and error streams provided by the server must support the following methods:

Method           Stream  Notes
read(size)       input   1
readline()       input   1, 2
readlines(hint)  input   1, 3
__iter__()       input
flush()          errors  4
write(str)       errors
writelines(seq)  errors

The semantics of each method are as documented in the Python Library Reference, except for these notes as listed in the table above:
1. The server is not required to read past the client's specified Content-Length, and is allowed to simulate an end-of-file condition if the application attempts to read past that point. The application should not attempt to read more data than is specified by the CONTENT_LENGTH variable.
2. The optional "size" argument to readline() is not supported, as it may be complex for server authors to implement, and is not often used in practice.
3. Note that the hint argument to readlines() is optional for both caller and implementer. The application is free not to supply it, and the server or gateway is free to ignore it.
4. Since the errors stream may not be rewound, servers and gateways are free to forward write operations immediately, without buffering. In this case, the flush() method may be a no-op. Portable applications, however, cannot assume that output is unbuffered or that flush() is a no-op. They must call flush() if they need to ensure that output has in fact been written. (For example, to minimize intermingling of data from multiple processes writing to the same error log.)

The methods listed in the table above must be supported by all servers conforming to this specification. Applications conforming to this specification must not use any other methods or attributes of the input or errors objects. In particular, applications must not attempt to close these streams, even if they possess close() methods.
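One way a server might satisfy note 1 is to wrap the raw connection stream so that reads stop at the declared Content-Length. The following Python 3 sketch uses the hypothetical name `LimitedInput` and assumes the underlying stream's own `readline()` accepts a size limit, as `io.BytesIO` and buffered binary streams do.

```python
import io


class LimitedInput:
    """Wrap a raw request stream and report EOF after content_length bytes."""

    def __init__(self, raw, content_length):
        self.raw = raw
        self.remaining = content_length

    def read(self, size=-1):
        # Never read past the declared request body.
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        data = self.raw.read(size)
        self.remaining -= len(data)
        return data

    def readline(self):
        # No "size" argument is offered to the application (note 2).
        line = self.raw.readline(self.remaining)
        self.remaining -= len(line)
        return line

    def readlines(self, hint=None):
        # The hint is optional and this sketch ignores it (note 3).
        return list(self)

    def __iter__(self):
        while True:
            line = self.readline()
            if not line:
                return
            yield line


# Usage: an 8-byte body followed by bytes belonging to a later request.
raw = io.BytesIO(b'a=1\nb=2\nNEXT REQUEST')
stream = LimitedInput(raw, 8)
first_line = stream.readline()
rest = stream.read()
```

Simulating end-of-file this way protects a persistent connection: the application cannot consume bytes that belong to the next request.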

===== The start_response() Callable =====

The second parameter passed to the application object is a callable of the form start_response(status, response_headers, exc_info=None). (As with all WSGI callables, the arguments must be supplied positionally, not by keyword.) The start_response callable is used to begin the HTTP response, and it must return a write(body_data) callable (see the Buffering and Streaming section, below).
The status argument is an HTTP "status" string like "200 OK" or "404 Not Found". That is, it is a string consisting of a Status-Code and a Reason-Phrase, in that order and separated by a single space, with no surrounding whitespace or other characters. (See RFC 2616, Section 6.1.1 for more information.) The string must not contain control characters, and must not be terminated with a carriage return, linefeed, or combination thereof.
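The constraints on the status string are simple enough to check mechanically. A minimal sketch, assuming Python 3; `is_valid_status` is a hypothetical helper, and requiring exactly three digits for the Status-Code follows RFC 2616 rather than wording in this section.

```python
import re

# Three digits, exactly one space, then a Reason-Phrase that starts with
# a visible character and contains no control characters.
_STATUS = re.compile(r'[0-9]{3} [^\x00-\x20\x7f][^\x00-\x1f\x7f]*')


def is_valid_status(status):
    """True if status has the form "200 OK" with no stray whitespace."""
    return (isinstance(status, str)
            and _STATUS.fullmatch(status) is not None
            and not status.endswith(' '))
```

Using `fullmatch` matters here: an ordinary `$` anchor would accept a status that ends in a newline, which the text forbids.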
The response_headers argument is a list of (header_name, header_value) tuples. It must be a Python list; i.e. type(response_headers) is ListType, and the server may change its contents in any way it desires. Each header_name must be a valid HTTP header field-name (as defined by RFC 2616, Section 4.2), without a trailing colon or other punctuation.
|
||||
|
||||
Each header_value must not include any control characters, including carriage returns or linefeeds, either embedded or at the end. (These requirements are to minimize the complexity of any parsing that must be performed by servers, gateways, and intermediate response processors that need to inspect or modify response headers.)
|
||||
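The constraints on status and response_headers described above can be checked mechanically. The following is a minimal sketch and not part of the specification: the helper names check_status and check_headers are hypothetical, and the code assumes Python 3 syntax with native str values.

```python
import re

# Hypothetical helpers; the spec only states the rules, not an API for them.
# Status: three digits, one space, then a reason phrase without control chars.
_STATUS_RE = re.compile(r'^\d{3} [^\x00-\x1f\x7f ][^\x00-\x1f\x7f]*\Z')
# Header field-name: an RFC 2616 "token" (no separators, no colon).
_TOKEN_RE = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+\Z")
_CONTROL_RE = re.compile(r'[\x00-\x1f\x7f]')


def check_status(status):
    """Return True if status looks like '200 OK' with no stray whitespace."""
    return (isinstance(status, str)
            and _STATUS_RE.match(status) is not None
            and status == status.strip())


def check_headers(response_headers):
    """Return True if response_headers is a real list of valid 2-tuples."""
    if type(response_headers) is not list:
        return False
    for item in response_headers:
        if not (isinstance(item, tuple) and len(item) == 2):
            return False
        name, value = item
        if not (isinstance(name, str) and isinstance(value, str)):
            return False
        if not _TOKEN_RE.match(name) or _CONTROL_RE.search(value):
            return False
    return True
```

A server could run checks like these inside its start_response implementation and raise an error when either returns False.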
|
||||
In general, the server or gateway is responsible for ensuring that correct headers are sent to the client: if the application omits a header required by HTTP (or other relevant specifications that are in effect), the server or gateway must add it. For example, the HTTP Date: and Server: headers would normally be supplied by the server or gateway.
|
||||
|
||||
(A reminder for server/gateway authors: HTTP header names are case-insensitive, so be sure to take that into consideration when examining application-supplied headers!)
|
||||
|
||||
Applications and middleware are forbidden from using HTTP/1.1 "hop-by-hop" features or headers, any equivalent features in HTTP/1.0, or any headers that would affect the persistence of the client's connection to the web server. These features are the exclusive province of the actual web server, and a server or gateway should consider it a fatal error for an application to attempt sending them, and raise an error if they are supplied to start_response(). (For more specifics on "hop-by-hop" features and headers, please see the Other HTTP Features section below.)
|
||||
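One way a server might enforce this rule is sketched below. It relies on is_hop_by_hop from the wsgiref.util module in Python's standard library; the wrapper function reject_hop_by_hop is a hypothetical name chosen for this illustration.

```python
from wsgiref.util import is_hop_by_hop


def reject_hop_by_hop(response_headers):
    """Raise if the application supplied a hop-by-hop header.

    Hypothetical helper: a server could call this from start_response().
    is_hop_by_hop() compares names case-insensitively.
    """
    for name, _value in response_headers:
        if is_hop_by_hop(name):
            raise AssertionError("hop-by-hop header not allowed: %r" % name)
```

Calling reject_hop_by_hop([("Connection", "close")]) raises, while ordinary end-to-end headers such as Content-Type pass through untouched.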
|
||||
The start_response callable must not actually transmit the response headers. Instead, it must store them for the server or gateway to transmit only after the first iteration of the application return value that yields a non-empty string, or upon the application's first invocation of the write() callable. In other words, response headers must not be sent until there is actual body data available, or until the application's returned iterable is exhausted. (The only possible exception to this rule is if the response headers explicitly include a Content-Length of zero.)
|
||||
|
||||
This delaying of response header transmission is to ensure that buffered and asynchronous applications can replace their originally intended output with error output, up until the last possible moment. For example, the application may need to change the response status from "200 OK" to "500 Internal Error", if an error occurs while the body is being generated within an application buffer.
|
||||
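The deferred transmission of headers can be illustrated with a toy server-side object. This is only a sketch under stated assumptions: the class name DeferredHeaders and its sent list (which stands in for the client connection) are hypothetical, exc_info handling is left out, and the code uses Python 3 syntax with native str bodies.

```python
class DeferredHeaders(object):
    """Toy response state; `sent` stands in for the client connection."""

    def __init__(self):
        self.pending = None        # (status, headers) stored by start_response
        self.headers_sent = False
        self.sent = []             # everything "transmitted", in order

    def start_response(self, status, response_headers, exc_info=None):
        # Store only; nothing is transmitted here (exc_info rules omitted).
        self.pending = (status, response_headers)
        return self.write

    def write(self, data):
        if not self.headers_sent:  # first body data: headers go out now
            self.sent.append(self.pending)
            self.headers_sent = True
        self.sent.append(data)

    def run(self, application, environ):
        result = application(environ, self.start_response)
        try:
            for data in result:
                if data:           # empty strings do not trigger the headers
                    self.write(data)
            if not self.headers_sent:
                self.write('')     # iterable exhausted: send headers anyway
        finally:
            if hasattr(result, 'close'):
                result.close()
        return self.sent
```

Until the first non-empty string arrives, pending can still be replaced, which is what lets an application swap a "200 OK" for an error status.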
|
||||
The exc_info argument, if supplied, must be a Python sys.exc_info() tuple. This argument should be supplied by the application only if start_response is being called by an error handler. If exc_info is supplied, and no HTTP headers have been output yet, start_response should replace the currently-stored HTTP response headers with the newly-supplied ones, thus allowing the application to "change its mind" about the output when an error has occurred.
|
||||
|
||||
However, if exc_info is provided, and the HTTP headers have already been sent, start_response must raise an error, and should raise the exc_info tuple. That is:
|
||||
|
||||
    raise exc_info[0], exc_info[1], exc_info[2]
|
||||
|
||||
This will re-raise the exception trapped by the application, and in principle should abort the application. (It is not safe for the application to attempt error output to the browser once the HTTP headers have already been sent.) The application must not trap any exceptions raised by start_response, if it called start_response with exc_info. Instead, it should allow such exceptions to propagate back to the server or gateway. See Error Handling below, for more details.
|
||||
|
||||
The application may call start_response more than once, if and only if the exc_info argument is provided. More precisely, it is a fatal error to call start_response without the exc_info argument if start_response has already been called within the current invocation of the application. (See the example CGI gateway above for an illustration of the correct logic.)
|
||||
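The rules for repeated calls and for exc_info can be summarized as a small state machine. The sketch below is an illustration only: ResponseState is a hypothetical name, the write method merely records that headers have gone out, and the re-raise uses Python 3 syntax rather than the three-argument raise statement of Python 2.

```python
import sys


class ResponseState(object):
    """Hypothetical holder for one request's start_response state."""

    def __init__(self):
        self.headers_set = None     # (status, headers) once start_response ran
        self.headers_sent = False   # set when the server transmits headers

    def start_response(self, status, response_headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the app error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None     # avoid a traceback reference cycle
        elif self.headers_set is not None:
            # Second call without exc_info is a fatal application error.
            raise AssertionError("Headers already set!")
        self.headers_set = (status, response_headers)
        return self.write

    def write(self, data):
        self.headers_sent = True    # a real server would transmit here
```

Before any output is sent, a call with exc_info simply replaces the stored status and headers; afterwards, the original exception propagates back to the server.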
|
||||
Note: servers, gateways, or middleware implementing start_response should ensure that no reference is held to the exc_info parameter beyond the duration of the function's execution, to avoid creating a circular reference through the traceback and frames involved. The simplest way to do this is something like:
|
||||
|
||||
    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                # do stuff w/exc_info here
            finally:
                exc_info = None    # Avoid circular ref.
|
||||
|
||||
The example CGI gateway provides another illustration of this technique.
|
||||
Handling the Content-Length Header
|
||||
|
||||
If the application does not supply a Content-Length header, a server or gateway may choose one of several approaches to handling it. The simplest of these is to close the client connection when the response is completed.
|
||||
|
||||
Under some circumstances, however, the server or gateway may be able to either generate a Content-Length header, or at least avoid the need to close the client connection. If the application does not call the write() callable, and returns an iterable whose len() is 1, then the server can automatically determine Content-Length by taking the length of the first string yielded by the iterable.
|
||||
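A server might apply this shortcut roughly as follows. This is a simplified sketch: add_content_length and its write_called flag are hypothetical names, and only lists and tuples are treated as having a usable len(), which is narrower than the rule stated above.

```python
def add_content_length(response_headers, result, write_called=False):
    """Append Content-Length if the body is a single known string.

    Hypothetical helper. Returns True when the header was added.
    Assumes one character per byte, as required of WSGI strings.
    """
    has_length = any(name.lower() == 'content-length'
                     for name, _value in response_headers)
    if has_length or write_called:
        return False
    if isinstance(result, (list, tuple)) and len(result) == 1:
        response_headers.append(('Content-Length', str(len(result[0]))))
        return True
    return False
```

The header list is modified in place, which is permitted because the server may change the contents of response_headers in any way it desires.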
|
||||
And, if the server and client both support HTTP/1.1 "chunked encoding" [3], then the server may use chunked encoding to send a chunk for each write() call or string yielded by the iterable, thus generating a Content-Length header for each chunk. This allows the server to keep the client connection alive, if it wishes to do so. Note that the server must comply fully with RFC 2616 when doing this, or else fall back to one of the other strategies for dealing with the absence of Content-Length.
|
||||
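For illustration, the framing a server applies when it chooses chunked encoding looks roughly like the sketch below, in which each chunk is prefixed with its own length in hexadecimal. The function name chunked is hypothetical, native str values are used for brevity, and a real server must still satisfy the rest of RFC 2616 (for example, sending the Transfer-Encoding header itself).

```python
def chunked(blocks):
    """Yield each non-empty block framed as an HTTP/1.1 chunk."""
    for block in blocks:
        if block:   # a zero-length chunk would terminate the body early
            # chunk-size in hex, CRLF, chunk data, CRLF
            yield '%x\r\n%s\r\n' % (len(block), block)
    yield '0\r\n\r\n'   # last-chunk followed by an empty trailer
```

Only the server or gateway may do this; the application simply yields its blocks unencoded.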
|
||||
(Note: applications and middleware must not apply any kind of Transfer-Encoding to their output, such as chunking or gzipping; as "hop-by-hop" operations, these encodings are the province of the actual web server/gateway. See Other HTTP Features below, for more details.)
|
||||
Buffering and Streaming
|
||||
|
||||
Generally speaking, applications will achieve the best throughput by buffering their (modestly-sized) output and sending it all at once. This is a common approach in existing frameworks such as Zope: the output is buffered in a StringIO or similar object, then transmitted all at once, along with the response headers.
|
||||
|
||||
The corresponding approach in WSGI is for the application to simply return a single-element iterable (such as a list) containing the response body as a single string. This is the recommended approach for the vast majority of application functions, that render HTML pages whose text easily fits in memory.
|
||||
|
||||
For large files, however, or for specialized uses of HTTP streaming (such as multipart "server push"), an application may need to provide output in smaller blocks (e.g. to avoid loading a large file into memory). It's also sometimes the case that part of a response may be time-consuming to produce, but it would be useful to send ahead the portion of the response that precedes it.
|
||||
|
||||
In these cases, applications will usually return an iterator (often a generator-iterator) that produces the output in a block-by-block fashion. These blocks may be broken to coincide with multipart boundaries (for "server push"), or just before time-consuming tasks (such as reading another block of an on-disk file).
|
||||
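A generator-based application of this kind might look like the following sketch. The names make_file_app, read_blocks, and open_file are hypothetical, and the content type is an arbitrary choice for the example.

```python
def read_blocks(filelike, block_size):
    """Generator yielding the file's contents one block at a time."""
    try:
        while True:
            block = filelike.read(block_size)
            if not block:
                break
            yield block
    finally:
        filelike.close()   # runs on exhaustion or when the server calls close()


def make_file_app(open_file, block_size=8192):
    """Build an application that streams whatever open_file() returns."""
    def application(environ, start_response):
        start_response('200 OK',
                       [('Content-Type', 'application/octet-stream')])
        return read_blocks(open_file(), block_size)
    return application
```

Only one block is held in memory at a time, and the server transmits each block before asking the generator for the next one.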
|
||||
WSGI servers, gateways, and middleware must not delay the transmission of any block; they must either fully transmit the block to the client, or guarantee that they will continue transmission even while the application is producing its next block. A server/gateway or middleware may provide this guarantee in one of three ways:
|
||||
|
||||
1. Send the entire block to the operating system (and request that any O/S buffers be flushed) before returning control to the application, OR
2. Use a different thread to ensure that the block continues to be transmitted while the application produces the next block.
3. (Middleware only) Send the entire block to its parent gateway/server.
|
||||
|
||||
By providing this guarantee, WSGI allows applications to ensure that transmission will not become stalled at an arbitrary point in their output data. This is critical for proper functioning of e.g. multipart "server push" streaming, where data between multipart boundaries should be transmitted in full to the client.
|
||||
Middleware Handling of Block Boundaries
|
||||
|
||||
In order to better support asynchronous applications and servers, middleware components must not block iteration waiting for multiple values from an application iterable. If the middleware needs to accumulate more data from the application before it can produce any output, it must yield an empty string.
|
||||
|
||||
To put this requirement another way, a middleware component must yield at least one value each time its underlying application yields a value. If the middleware cannot yield any other value, it must yield an empty string.
|
||||
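As a sketch of this rule, the hypothetical middleware below needs a complete line before it can emit anything, so it yields an empty string whenever the application's block does not finish a line. LineUpper is an invented example; it ignores data sent through write() and assumes ASCII output so that upper-casing does not change the body length.

```python
class LineUpper(object):
    """Hypothetical middleware: upper-cases output one full line at a time."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        # Return an iterable as soon as the application has returned one.
        return self._transform(self.application(environ, start_response))

    def _transform(self, result):
        pending = ''
        try:
            for block in result:
                pending += block
                head, newline, pending = pending.rpartition('\n')
                # Exactly one value per application value, possibly ''.
                yield (head + newline).upper()
            if pending:
                yield pending.upper()   # flush the final partial line
        finally:
            if hasattr(result, 'close'):
                result.close()
```

Because the middleware never loops over several application values before yielding, an asynchronous server regains control after every block.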
|
||||
This requirement ensures that asynchronous applications and servers can conspire to reduce the number of threads that are required to run a given number of application instances simultaneously.
|
||||
|
||||
Note also that this requirement means that middleware must return an iterable as soon as its underlying application returns an iterable. It is also forbidden for middleware to use the write() callable to transmit data that is yielded by an underlying application. Middleware may only use their parent server's write() callable to transmit data that the underlying application sent using a middleware-provided write() callable.
|
||||
The write() Callable
|
||||
|
||||
Some existing application framework APIs support unbuffered output in a different manner than WSGI. Specifically, they provide a "write" function or method of some kind to write an unbuffered block of data, or else they provide a buffered "write" function and a "flush" mechanism to flush the buffer.
|
||||
|
||||
Unfortunately, such APIs cannot be implemented in terms of WSGI's "iterable" application return value, unless threads or other special mechanisms are used.
|
||||
|
||||
Therefore, to allow these frameworks to continue using an imperative API, WSGI includes a special write() callable, returned by the start_response callable.
|
||||
|
||||
New WSGI applications and frameworks should not use the write() callable if it is possible to avoid doing so. The write() callable is strictly a hack to support imperative streaming APIs. In general, applications should produce their output via their returned iterable, as this makes it possible for web servers to interleave other tasks in the same Python thread, potentially providing better throughput for the server as a whole.
|
||||
|
||||
The write() callable is returned by the start_response() callable, and it accepts a single parameter: a string to be written as part of the HTTP response body, that is treated exactly as though it had been yielded by the output iterable. In other words, before write() returns, it must guarantee that the passed-in string was either completely sent to the client, or that it is buffered for transmission while the application proceeds onward.
|
||||
|
||||
An application must return an iterable object, even if it uses write() to produce all or part of its response body. The returned iterable may be empty (i.e. yield no non-empty strings), but if it does yield non-empty strings, that output must be treated normally by the server or gateway (i.e., it must be sent or queued immediately). Applications must not invoke write() from within their return iterable, and therefore any strings yielded by the iterable are transmitted after all strings passed to write() have been sent to the client.
|
||||
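The following sketch shows an application that produces its whole body through write() and still returns an iterable. Both legacy_app and the toy run driver are hypothetical; run only collects the data in order and does not model headers.

```python
def legacy_app(environ, start_response):
    """Application built on an imperative, write-style API."""
    write = start_response('200 OK', [('Content-Type', 'text/plain')])
    write('first block\n')    # treated as if yielded by the iterable
    write('second block\n')
    return []                 # an iterable is still required, even if empty


def run(application):
    """Toy driver: collect everything the application outputs, in order."""
    sent = []

    def start_response(status, response_headers, exc_info=None):
        return sent.append    # stands in for the server's write()

    for data in application({}, start_response):
        if data:
            sent.append(data)
    return sent
```

Here run(legacy_app) returns the two blocks in the order they were written; anything yielded by the returned iterable would follow them.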
Unicode Issues
|
||||
|
||||
HTTP does not directly support Unicode, and neither does this interface. All encoding/decoding must be handled by the application; all strings passed to or from the server must be standard Python byte strings, not Unicode objects. The result of using a Unicode object where a string object is required, is undefined.
|
||||
|
||||
Note also that strings passed to start_response() as a status or as response headers must follow RFC 2616 with respect to encoding. That is, they must either be ISO-8859-1 characters, or use RFC 2047 MIME encoding.
|
||||
|
||||
On Python platforms where the str or StringType type is in fact Unicode-based (e.g. Jython, IronPython, Python 3000, etc.), all "strings" referred to in this specification must contain only code points representable in ISO-8859-1 encoding (\u0000 through \u00FF, inclusive). It is a fatal error for an application to supply strings containing any other Unicode character or code point. Similarly, servers and gateways must not supply strings to an application containing any other Unicode characters.
|
||||
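On a platform with Unicode-based str, the code point restriction can be checked as in this minimal sketch. The function name is_wsgi_string is hypothetical, and the example assumes Python 3.

```python
def is_wsgi_string(value):
    """True if value is a native str limited to ISO-8859-1 code points."""
    return isinstance(value, str) and all(ord(ch) <= 0xFF for ch in value)
```

For example, 'caf\u00e9' passes because U+00E9 is representable in ISO-8859-1, while '\u20ac' (the euro sign) does not.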
|
||||
Again, all strings referred to in this specification must be of type str or StringType, and must not be of type unicode or UnicodeType. And, even if a given platform allows for more than 8 bits per character in str/StringType objects, only the lower 8 bits may be used, for any value referred to in this specification as a "string".
|
||||
Error Handling
|
||||
|
||||
In general, applications should try to trap their own, internal errors, and display a helpful message in the browser. (It is up to the application to decide what "helpful" means in this context.)
|
||||
|
||||
However, to display such a message, the application must not have actually sent any data to the browser yet, or else it risks corrupting the response. WSGI therefore provides a mechanism to either allow the application to send its error message, or be automatically aborted: the exc_info argument to start_response. Here is an example of its use:
|
||||
|
||||
    try:
        # regular application code here
        status = "200 Froody"
        response_headers = [("content-type", "text/plain")]
        start_response(status, response_headers)
        return ["normal body goes here"]
    except:
        # XXX should trap runtime issues like MemoryError, KeyboardInterrupt
        #     in a separate handler before this bare 'except:'...
        status = "500 Oops"
        response_headers = [("content-type", "text/plain")]
        start_response(status, response_headers, sys.exc_info())
        return ["error body goes here"]
|
||||
|
||||
If no output has been written when an exception occurs, the call to start_response will return normally, and the application will return an error body to be sent to the browser. However, if any output has already been sent to the browser, start_response will reraise the provided exception. This exception should not be trapped by the application, and so the application will abort. The server or gateway can then trap this (fatal) exception and abort the response.
|
||||
|
||||
Servers should trap and log any exception that aborts an application or the iteration of its return value. If a partial response has already been written to the browser when an application error occurs, the server or gateway may attempt to add an error message to the output, if the already-sent headers indicate a text/* content type that the server knows how to modify cleanly.
|
||||
|
||||
Some middleware may wish to provide additional exception handling services, or intercept and replace application error messages. In such cases, middleware may choose to not re-raise the exc_info supplied to start_response, but instead raise a middleware-specific exception, or simply return without an exception after storing the supplied arguments. This will then cause the application to return its error body iterable (or invoke write()), allowing the middleware to capture and modify the error output. These techniques will work as long as application authors:
|
||||
|
||||
* Always provide exc_info when beginning an error response
* Never trap errors raised by start_response when exc_info is being provided
|
||||
|
||||
HTTP 1.1 Expect/Continue
|
||||
|
||||
Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's "expect/continue" mechanism. This may be done in any of several ways:
|
||||
|
||||
1. Respond to requests containing an Expect: 100-continue header with an immediate "100 Continue" response, and proceed normally.
2. Proceed with the request normally, but provide the application with a wsgi.input stream that will send the "100 Continue" response if/when the application first attempts to read from the input stream. The read request must then remain blocked until the client responds.
3. Wait until the client decides that the server does not support expect/continue, and sends the request body on its own. (This is suboptimal, and is not recommended.)
|
||||
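The second approach above can be sketched as a thin wrapper around the request body stream. ContinueOnRead and its send_continue callback are hypothetical names; a real server would pass a callback that writes the "100 Continue" status line to the client connection.

```python
class ContinueOnRead(object):
    """Wrap a body stream; send "100 Continue" before the first read."""

    def __init__(self, stream, send_continue):
        self._stream = stream
        self._send_continue = send_continue   # hypothetical server callback
        self._continued = False

    def _ensure_continue(self):
        if not self._continued:
            self._continued = True
            self._send_continue()

    # The underlying read then blocks until the client sends the body.
    def read(self, *args):
        self._ensure_continue()
        return self._stream.read(*args)

    def readline(self, *args):
        self._ensure_continue()
        return self._stream.readline(*args)

    def readlines(self, *args):
        self._ensure_continue()
        return self._stream.readlines(*args)

    def __iter__(self):
        self._ensure_continue()
        return iter(self._stream)
```

If the application never reads from wsgi.input, the interim response is never sent, and the client is not prompted to transmit a body that nobody will consume.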
|
||||
Note that these behavior restrictions do not apply for HTTP 1.0 requests, or for requests that are not directed to an application object. For more information on HTTP 1.1 Expect/Continue, see RFC 2616, sections 8.2.3 and 10.1.1.
|
||||
Other HTTP Features
|
||||
|
||||
In general, servers and gateways should "play dumb" and allow the application complete control over its output. They should only make changes that do not alter the effective semantics of the application's response. It is always possible for the application developer to add middleware components to supply additional features, so server/gateway developers should be conservative in their implementation. In a sense, a server should consider itself to be like an HTTP "gateway server", with the application being an HTTP "origin server". (See RFC 2616, section 1.3, for the definition of these terms.)
|
||||
|
||||
However, because WSGI servers and applications do not communicate via HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to WSGI internal communications. WSGI applications must not generate any "hop-by-hop" headers [4], attempt to use HTTP features that would require them to generate such headers, or rely on the content of any incoming "hop-by-hop" headers in the environ dictionary. WSGI servers must handle any supported inbound "hop-by-hop" headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable.
|
||||
|
||||
Applying these principles to a variety of HTTP features, it should be clear that a server may handle cache validation via the If-None-Match and If-Modified-Since request headers and the Last-Modified and ETag response headers. However, it is not required to do this, and the application should perform its own cache validation if it wants to support that feature, since the server/gateway is not required to do such validation.
|
||||
|
||||
Similarly, a server may re-encode or transport-encode an application's response, but the application should use a suitable content encoding on its own, and must not apply a transport encoding. A server may transmit byte ranges of the application's response if requested by the client, and the application doesn't natively support byte ranges. Again, however, the application should perform this function on its own if desired.
|
||||
|
||||
Note that these restrictions on applications do not necessarily mean that every application must reimplement every HTTP feature; many HTTP features can be partially or fully implemented by middleware components, thus freeing both server and application authors from implementing the same features over and over again.
|
||||
Thread Support
|
||||
|
||||
Thread support, or lack thereof, is also server-dependent. Servers that can run multiple requests in parallel, should also provide the option of running an application in a single-threaded fashion, so that applications or frameworks that are not thread-safe may still be used with that server.
|
||||
Implementation/Application Notes
|
||||
Server Extension APIs
|
||||
|
||||
Some server authors may wish to expose more advanced APIs, that application or framework authors can use for specialized purposes. For example, a gateway based on mod_python might wish to expose part of the Apache API as a WSGI extension.
|
||||
|
||||
In the simplest case, this requires nothing more than defining an environ variable, such as mod_python.some_api. But, in many cases, the possible presence of middleware can make this difficult. For example, an API that offers access to the same HTTP headers that are found in environ variables, might return different data if environ has been modified by middleware.
|
||||
|
||||
In general, any extension API that duplicates, supplants, or bypasses some portion of WSGI functionality runs the risk of being incompatible with middleware components. Server/gateway developers should not assume that nobody will use middleware, because some framework developers specifically intend to organize or reorganize their frameworks to function almost entirely as middleware of various kinds.
|
||||
|
||||
So, to provide maximum compatibility, servers and gateways that provide extension APIs that replace some WSGI functionality, must design those APIs so that they are invoked using the portion of the API that they replace. For example, an extension API to access HTTP request headers must require the application to pass in its current environ, so that the server/gateway may verify that HTTP headers accessible via the API have not been altered by middleware. If the extension API cannot guarantee that it will always agree with environ about the contents of HTTP headers, it must refuse service to the application, e.g. by raising an error, returning None instead of a header collection, or whatever is appropriate to the API.
|
||||
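A header-access extension that follows this rule might be sketched as below. HeaderAPI and get_headers are hypothetical names, and the comparison is simplified: it looks only at HTTP_* keys and ignores details such as repeated headers or the unprefixed CONTENT_TYPE and CONTENT_LENGTH variables.

```python
class HeaderAPI(object):
    """Hypothetical server extension exposing the raw request headers."""

    def __init__(self, raw_headers):
        # raw_headers: (name, value) pairs as received by the server
        self._raw = list(raw_headers)
        self._expected = dict(
            ('HTTP_' + name.upper().replace('-', '_'), value)
            for name, value in self._raw)

    def get_headers(self, environ):
        """Return the raw headers, or None if environ no longer agrees."""
        current = dict((key, value) for key, value in environ.items()
                       if key.startswith('HTTP_'))
        if current != self._expected:
            return None       # refuse service: middleware altered environ
        return list(self._raw)
```

Because the application must hand in its current environ, any middleware rewrite of the headers is detected instead of being silently bypassed.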
|
||||
Similarly, if an extension API provides an alternate means of writing response data or headers, it should require the start_response callable to be passed in, before the application can obtain the extended service. If the object passed in is not the same one that the server/gateway originally supplied to the application, it cannot guarantee correct operation and must refuse to provide the extended service to the application.
|
||||
|
||||
These guidelines also apply to middleware that adds information such as parsed cookies, form variables, sessions, and the like to environ. Specifically, such middleware should provide these features as functions which operate on environ, rather than simply stuffing values into environ. This helps ensure that information is calculated from environ after any middleware has done any URL rewrites or other environ modifications.
|
||||
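As an illustration of "functions which operate on environ", the sketch below parses cookies on demand from whatever HTTP_COOKIE value the caller's environ currently holds. The function name get_cookies and the cache key example.cookies are hypothetical; http.cookies is the Python 3 standard library module.

```python
from http.cookies import SimpleCookie


def get_cookies(environ):
    """Parse cookies from the *current* environ, caching per header value."""
    header = environ.get('HTTP_COOKIE', '')
    cached = environ.get('example.cookies')   # hypothetical cache key
    if cached is not None and cached[0] == header:
        return cached[1]
    cookies = dict((name, morsel.value)
                   for name, morsel in SimpleCookie(header).items())
    # Remember which header text the parsed result belongs to.
    environ['example.cookies'] = (header, cookies)
    return cookies
```

Since the cached result is keyed on the header text, a later middleware layer that rewrites HTTP_COOKIE causes a fresh parse rather than a stale answer.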
|
||||
It is very important that these "safe extension" rules be followed by both server/gateway and middleware developers, in order to avoid a future in which middleware developers are forced to delete any and all extension APIs from environ to ensure that their mediation isn't being bypassed by applications using those extensions!
|
||||
Application Configuration
|
||||
|
||||
This specification does not define how a server selects or obtains an application to invoke. These and other configuration options are highly server-specific matters. It is expected that server/gateway authors will document how to configure the server to execute a particular application object, and with what options (such as threading options).
|
||||
|
||||
Framework authors, on the other hand, should document how to create an application object that wraps their framework's functionality. The user, who has chosen both the server and the application framework, must connect the two together. However, since both the framework and the server now have a common interface, this should be merely a mechanical matter, rather than a significant engineering effort for each new server/framework pair.
|
||||
|
||||
Finally, some applications, frameworks, and middleware may wish to use the environ dictionary to receive simple string configuration options. Servers and gateways should support this by allowing an application's deployer to specify name-value pairs to be placed in environ. In the simplest case, this support can consist merely of copying all operating system-supplied environment variables from os.environ into the environ dictionary, since the deployer in principle can configure these externally to the server, or in the CGI case they may be able to be set via the server's configuration files.
|
||||
|
||||
Applications should try to keep such required variables to a minimum, since not all servers will support easy configuration of them. Of course, even in the worst case, persons deploying an application can create a script to supply the necessary configuration values:
|
||||
|
||||
    from the_app import application

    def new_app(environ, start_response):
        environ['the_app.configval1'] = 'something'
        return application(environ, start_response)
|
||||
|
||||
But, most existing applications and frameworks will probably only need a single configuration value from environ, to indicate the location of their application or framework-specific configuration file(s). (Of course, applications should cache such configuration, to avoid having to re-read it upon each invocation.)
|
||||
URL Reconstruction
|
||||
|
||||
If an application wishes to reconstruct a request's complete URL, it may do so using the following algorithm, contributed by Ian Bicking:
|
||||
|
||||
    from urllib import quote
    url = environ['wsgi.url_scheme']+'://'

    if environ.get('HTTP_HOST'):
        url += environ['HTTP_HOST']
    else:
        url += environ['SERVER_NAME']

        if environ['wsgi.url_scheme'] == 'https':
            if environ['SERVER_PORT'] != '443':
                url += ':' + environ['SERVER_PORT']
        else:
            if environ['SERVER_PORT'] != '80':
                url += ':' + environ['SERVER_PORT']

    url += quote(environ.get('SCRIPT_NAME', ''))
    url += quote(environ.get('PATH_INFO', ''))
    if environ.get('QUERY_STRING'):
        url += '?' + environ['QUERY_STRING']
|
||||
|
||||
Note that such a reconstructed URL may not be precisely the same URI as requested by the client. Server rewrite rules, for example, may have modified the client's originally requested URL to place it in a canonical form.
|
||||
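For a worked example, Python's standard library offers wsgiref.util.request_uri, which implements a version of this reconstruction. The environ values below are made up for illustration, and the example assumes Python 3.

```python
from wsgiref.util import request_uri

# Hypothetical request: no Host header, non-default port, space in the path.
environ = {
    'wsgi.url_scheme': 'http',
    'SERVER_NAME': 'example.com',
    'SERVER_PORT': '8080',
    'SCRIPT_NAME': '/app',
    'PATH_INFO': '/a b',
    'QUERY_STRING': 'x=1',
}

url = request_uri(environ)
# url is 'http://example.com:8080/app/a%20b?x=1'
```

When HTTP_HOST is present it is used verbatim and the SERVER_NAME and SERVER_PORT logic is skipped, since the Host header already carries any non-default port.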
Supporting Older (<2.2) Versions of Python
|
||||
|
||||
Some servers, gateways, or applications may wish to support older (<2.2) versions of Python. This is especially important if Jython is a target platform, since as of this writing a production-ready version of Jython 2.2 is not yet available.
|
||||
|
||||
For servers and gateways, this is relatively straightforward: servers and gateways targeting pre-2.2 versions of Python must simply restrict themselves to using only a standard "for" loop to iterate over any iterable returned by an application. This is the only way to ensure source-level compatibility with both the pre-2.2 iterator protocol (discussed further below) and "today's" iterator protocol (see PEP 234).
|
||||
|
||||
(Note that this technique necessarily applies only to servers, gateways, or middleware that are written in Python. Discussion of how to use iterator protocol(s) correctly from other languages is outside the scope of this PEP.)
|
||||
|
||||
For applications, supporting pre-2.2 versions of Python is slightly more complex:
|
||||
|
||||
* You may not return a file object and expect it to work as an iterable, since before Python 2.2, files were not iterable. (In general, you shouldn't do this anyway, because it will perform quite poorly most of the time!) Use wsgi.file_wrapper or an application-specific file wrapper class. (See Optional Platform-Specific File Handling for more on wsgi.file_wrapper, and an example class you can use to wrap a file as an iterable.)
* If you return a custom iterable, it must implement the pre-2.2 iterator protocol. That is, provide a __getitem__ method that accepts an integer key, and raises IndexError when exhausted. (Note that built-in sequence types are also acceptable, since they also implement this protocol.)
|
||||
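A custom iterable using the older protocol can be as small as the following sketch. The class name Blocks is hypothetical; the same class also works in a plain "for" loop on newer Pythons, which fall back to __getitem__ when no __iter__ method is defined.

```python
class Blocks:
    """Iterable via the old sequence protocol: __getitem__ plus IndexError."""

    def __init__(self, blocks):
        self._blocks = blocks

    def __getitem__(self, key):
        # The for loop passes 0, 1, 2, ... until IndexError is raised.
        if key < len(self._blocks):
            return self._blocks[key]
        raise IndexError(key)
```

A server that restricts itself to a standard "for" loop can consume this object without knowing which protocol it implements.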
|
||||
Finally, middleware that wishes to support pre-2.2 versions of Python, and iterates over application return values or itself returns an iterable (or both), must follow the appropriate recommendations above.
|
||||
|
||||
(Note: It should go without saying that to support pre-2.2 versions of Python, any server, gateway, application, or middleware must also use only language features available in the target version, use 1 and 0 instead of True and False, etc.)
|
||||
Optional Platform-Specific File Handling
|
||||
|
||||
Some operating environments provide special high-performance file-transmission facilities, such as the Unix sendfile() call. Servers and gateways may expose this functionality via an optional wsgi.file_wrapper key in the environ. An application may use this "file wrapper" to convert a file or file-like object into an iterable that it then returns, e.g.:
|
||||
|
||||
    if 'wsgi.file_wrapper' in environ:
        return environ['wsgi.file_wrapper'](filelike, block_size)
    else:
        return iter(lambda: filelike.read(block_size), '')
|
||||
|
||||
If the server or gateway supplies wsgi.file_wrapper, it must be a callable that accepts one required positional parameter, and one optional positional parameter. The first parameter is the file-like object to be sent, and the second parameter is an optional block size "suggestion" (which the server/gateway need not use). The callable must return an iterable object, and must not perform any data transmission until and unless the server/gateway actually receives the iterable as a return value from the application. (To do otherwise would prevent middleware from being able to interpret or override the response data.)
|
||||
|
||||
To be considered "file-like", the object supplied by the application must have a read() method that takes an optional size argument. It may have a close() method, and if so, the iterable returned by wsgi.file_wrapper must have a close() method that invokes the original file-like object's close() method. If the "file-like" object has any other methods or attributes with names matching those of Python built-in file objects (e.g. fileno()), the wsgi.file_wrapper may assume that these methods or attributes have the same semantics as those of a built-in file object.
|
||||
|
||||
The actual implementation of any platform-specific file handling must occur after the application returns, and the server or gateway checks to see if a wrapper object was returned. (Again, because of the presence of middleware, error handlers, and the like, it is not guaranteed that any wrapper created will actually be used.)
|
||||
|
||||
Apart from the handling of close(), the semantics of returning a file wrapper from the application should be the same as if the application had returned iter(filelike.read, ''). In other words, transmission should begin at the current position within the "file" at the time that transmission begins, and continue until the end is reached.
|
||||
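The "current position" semantics can be seen in a short example. It uses io.StringIO as a stand-in for a real file and a block size of 4, both chosen only for illustration.

```python
import io

filelike = io.StringIO('abcdefghij')
filelike.seek(3)    # the application has already moved past 'abc'

# Two-argument iter(): call read(4) repeatedly until it returns ''.
blocks = list(iter(lambda: filelike.read(4), ''))
# blocks is ['defg', 'hij']: transmission starts at the current position
```

Data before the current position is never sent, so an application that wants the whole file must leave the position at the start (or seek back) before returning the wrapper.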
|
||||
Of course, platform-specific file transmission APIs don't usually accept arbitrary "file-like" objects. Therefore, a wsgi.file_wrapper has to introspect the supplied object for things such as a fileno() (Unix-like OSes) or a java.nio.FileChannel (under Jython) in order to determine if the file-like object is suitable for use with the platform-specific API it supports.
|
||||
|
||||
Note that even if the object is not suitable for the platform API, the wsgi.file_wrapper must still return an iterable that wraps read() and close(), so that applications using file wrappers are portable across platforms. Here's a simple platform-agnostic file wrapper class, suitable for old (pre 2.2) and new Pythons alike:
|
||||
|
||||
    class FileWrapper:

        def __init__(self, filelike, blksize=8192):
            self.filelike = filelike
            self.blksize = blksize
            if hasattr(filelike, 'close'):
                self.close = filelike.close

        def __getitem__(self, key):
            data = self.filelike.read(self.blksize)
            if data:
                return data
            raise IndexError
|
||||
|
||||
and here is a snippet from a server/gateway that uses it to provide access to a platform-specific API:

environ['wsgi.file_wrapper'] = FileWrapper
result = application(environ, start_response)

try:
    if isinstance(result, FileWrapper):
        # check if result.filelike is usable w/platform-specific
        # API, and if so, use that API to transmit the result.
        # If not, fall through to normal iterable handling
        # loop below.

    for data in result:
        # etc.

finally:
    if hasattr(result, 'close'):
        result.close()

Questions and Answers
Why must environ be a dictionary? What's wrong with using a subclass?
The rationale for requiring a dictionary is to maximize portability between servers. The alternative would be to define some subset of a dictionary's methods as being the standard and portable interface. In practice, however, most servers will probably find a dictionary adequate to their needs, and thus framework authors will come to expect the full set of dictionary features to be available, since they will be there more often than not. But, if some server chooses not to use a dictionary, then there will be interoperability problems despite that server's "conformance" to spec. Therefore, making a dictionary mandatory simplifies the specification and guarantees interoperability.
Note that this does not prevent server or framework developers from offering specialized services as custom variables inside the environ dictionary. This is the recommended approach for offering any such value-added services.
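As a rough sketch of such a custom variable, the application below reads a server-specific key from environ and falls back to a default when it is absent, so it stays usable on other servers. The key name example_server.request_id and the helper call_app are hypothetical, and the example follows Python 3 conventions (bytes bodies) rather than the Python 2 style used elsewhere in this document.

```python
def application(environ, start_response):
    # 'example_server.request_id' is a hypothetical server extension key
    request_id = environ.get('example_server.request_id', 'unknown')
    body = ('request id: %s' % request_id).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def call_app(environ):
    """Tiny test harness: call the app and collect status and body."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status

    body = b''.join(application(environ, start_response))
    return captured['status'], body


with_ext = call_app({'example_server.request_id': 'abc123'})
without_ext = call_app({})
```

Reading the key with environ.get() rather than indexing is a design choice: it keeps the extension optional instead of tying the application to one server.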
Why can you call write() and yield strings/return an iterable? Shouldn't we pick just one way?
If we supported only the iteration approach, then current frameworks that assume the availability of "push" suffer. But, if we only support pushing via write(), then server performance suffers for transmission of e.g. large files (if a worker thread can't begin work on a new request until all of the output has been sent). Thus, this compromise allows an application framework to support both approaches, as appropriate, but with only a little more burden to the server implementor than a push-only approach would require.
What's the close() for?
When writes are done during the execution of an application object, the application can ensure that resources are released using a try/finally block. But, if the application returns an iterable, any resources used will not be released until the iterable is garbage collected. The close() idiom allows an application to release critical resources at the end of a request, and it's forward-compatible with the support for try/finally in generators that's proposed by PEP 325.
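The close() idiom might look like this in practice. It is a minimal sketch: the class ResourceBody, the function serve and the log list are illustrative names, and serve only imitates the try/finally handling a server or gateway is expected to perform.

```python
class ResourceBody:
    """Iterable response body that releases a resource in close()."""

    def __init__(self, chunks, log):
        self._chunks = chunks
        self._log = log
        self._log.append('opened')   # stands in for acquiring a resource

    def __iter__(self):
        return iter(self._chunks)

    def close(self):
        self._log.append('closed')   # stands in for releasing it


def serve(result):
    """Consume an application's return value the way a server would."""
    sent = []
    try:
        for data in result:
            sent.append(data)
    finally:
        # close() is called whether or not iteration finished normally
        if hasattr(result, 'close'):
            result.close()
    return sent


log = []
sent = serve(ResourceBody([b'Hello ', b'World!'], log))
```

Because the server calls close() in a finally clause, the resource is released at the end of the request instead of whenever the iterable happens to be garbage collected.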
Why is this interface so low-level? I want feature X! (e.g. cookies, sessions, persistence, ...)
This isn't Yet Another Python Web Framework. It's just a way for frameworks to talk to web servers, and vice versa. If you want these features, you need to pick a web framework that provides the features you want. And if that framework lets you create a WSGI application, you should be able to run it in most WSGI-supporting servers. Also, some WSGI servers may offer additional services via objects provided in their environ dictionary; see the applicable server documentation for details. (Of course, applications that use such extensions will not be portable to other WSGI-based servers.)
Why use CGI variables instead of good old HTTP headers? And why mix them in with WSGI-defined variables?
Many existing web frameworks are built heavily upon the CGI spec, and existing web servers know how to generate CGI variables. In contrast, alternative ways of representing inbound HTTP information are fragmented and lack market share. Thus, using the CGI "standard" seems like a good way to leverage existing implementations. As for mixing them with WSGI variables, separating them would just require two dictionary arguments to be passed around, while providing no real benefits.
What about the status string? Can't we just use the number, passing in 200 instead of "200 OK"?
Doing this would complicate the server or gateway, by requiring them to have a table of numeric statuses and corresponding messages. By contrast, it is easy for an application or framework author to type the extra text to go with the specific response code they are using, and existing frameworks often already have a table containing the needed messages. So, on balance it seems better to make the application/framework responsible, rather than the server or gateway.
Why is wsgi.run_once not guaranteed to run the app only once?
Because it's merely a suggestion to the application that it should "rig for infrequent running". This is intended for application frameworks that have multiple modes of operation for caching, sessions, and so forth. In a "multiple run" mode, such frameworks may preload caches, and may not write e.g. logs or session data to disk after each request. In "single run" mode, such frameworks avoid preloading and flush all necessary writes after each request.
However, in order to test an application or framework to verify correct operation in the latter mode, it may be necessary (or at least expedient) to invoke it more than once. Therefore, an application should not assume that it will definitely not be run again, just because it is called with wsgi.run_once set to True.
Feature X (dictionaries, callables, etc.) are ugly for use in application code; why don't we use objects instead?
All of these implementation choices of WSGI are specifically intended to decouple features from one another; recombining these features into encapsulated objects makes it somewhat harder to write servers or gateways, and an order of magnitude harder to write middleware that replaces or modifies only small portions of the overall functionality.
In essence, middleware wants to have a "Chain of Responsibility" pattern, whereby it can act as a "handler" for some functions, while allowing others to remain unchanged. This is difficult to do with ordinary Python objects, if the interface is to remain extensible. For example, one must use __getattr__ or __getattribute__ overrides, to ensure that extensions (such as attributes defined by future WSGI versions) are passed through.
This type of code is notoriously difficult to get 100% correct, and few people will want to write it themselves. They will therefore copy other people's implementations, but fail to update them when the person they copied from corrects yet another corner case.
Further, this necessary boilerplate would be pure excise, a developer tax paid by middleware developers to support a slightly prettier API for application framework developers. But, application framework developers will typically only be updating one framework to support WSGI, and in a very limited part of their framework as a whole. It will likely be their first (and maybe their only) WSGI implementation, and thus they will likely implement with this specification ready to hand. Thus, the effort of making the API "prettier" with object attributes and suchlike would likely be wasted for this audience.
We encourage those who want a prettier (or otherwise improved) WSGI interface for use in direct web application programming (as opposed to web framework development) to develop APIs or frameworks that wrap WSGI for convenient use by application developers. In this way, WSGI can remain conveniently low-level for server and middleware authors, while not being "ugly" for application developers.
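To make the decoupling argument concrete, here is a small sketch of middleware that changes one piece of the interface and lets everything else pass through untouched, including keys it has never heard of. The names force_https, show_environ and the key wsgi.future_extension are invented for this illustration, and the code uses Python 3 bytes bodies.

```python
def force_https(app):
    """Middleware that overrides one environ key and passes the rest through."""
    def middleware(environ, start_response):
        environ = dict(environ)               # copy; unknown keys come along as-is
        environ['wsgi.url_scheme'] = 'https'
        return app(environ, start_response)
    return middleware


def show_environ(environ, start_response):
    """Application that echoes its environ, one key per line."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('%s=%s\n' % (key, environ[key])).encode('utf-8')
            for key in sorted(environ)]


wrapped = force_https(show_environ)
environ = {'wsgi.url_scheme': 'http', 'wsgi.future_extension': 'kept'}
body = b''.join(wrapped(environ, lambda status, headers, exc_info=None: None))
```

With a plain dictionary and a plain callable, the pass-through behavior needs no __getattr__ or __getattribute__ overrides at all.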
Proposed/Under Discussion
These items are currently being discussed on the Web-SIG and elsewhere, or are on the PEP author's "to-do" list:
Should wsgi.input be an iterator instead of a file? This would help for asynchronous applications and chunked-encoding input streams.
Optional extensions are being discussed for pausing iteration of an application's output until input is available or until a callback occurs.
Add a section about synchronous vs. asynchronous apps and servers, the relevant threading models, and issues/design goals in these areas.
Acknowledgements
Thanks go to the many folks on the Web-SIG mailing list whose thoughtful feedback made this revised draft possible. Especially:
Gregory "Grisha" Trubetskoy, author of mod_python, who beat up on the first draft as not offering any advantages over "plain old CGI", thus encouraging me to look for a better approach.
Ian Bicking, who helped nag me into properly specifying the multithreading and multiprocess options, as well as badgering me to provide a mechanism for servers to supply custom extension data to an application.
Tony Lownds, who came up with the concept of a start_response function that took the status and headers, returning a write function. His input also guided the design of the exception handling facilities, especially in the area of allowing for middleware that overrides application error messages.
Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython (well before the spec was finalized) helped to shape the "supporting older versions of Python" section, as well as the optional wsgi.file_wrapper facility.
Mark Nottingham, who reviewed the spec extensively for issues with HTTP RFC compliance, especially with regard to HTTP/1.1 features that I didn't even know existed until he pointed them out.
References
[1] The Python Wiki "Web Programming" topic (http://www.python.org/cgi-bin/moinmoin/WebProgramming)
[2] The Common Gateway Interface Specification, v 1.1, 3rd Draft (http://ken.coar.org/cgi/draft-coar-cgi-v11-03.txt)
[3] "Chunked Transfer Coding" -- HTTP/1.1, section 3.6.1 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1)
[4] "End-to-end and Hop-by-hop Headers" -- HTTP/1.1, Section 13.5.1 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.1)
[5] mod_ssl Reference, "Environment Variables" (http://www.modssl.org/docs/2.8/ssl_reference.html#ToC25)
Copyright
This document has been placed in the public domain.
195
Zim/Programme/python/WSGI/wsgi初探.txt
Normal file
@@ -0,0 +1,195 @@
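Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2011-10-13T22:34:33+08:00

====== A First Look at WSGI ======
Created Thursday 13 October 2011
http://linluxiang.iteye.com/blog/799163

===== Preface =====

This article does not walk through the WSGI protocol in detail, nor does it give a complete implementation of it; the descriptions are also mixed with my own views on WSGI. For the official definition of WSGI, see http://www.python.org/dev/peps/pep-3333/.

===== What Is WSGI? =====

Officially, WSGI is the Python Web Server Gateway Interface. As the name suggests, it is a gateway, and the job of a gateway is to translate between protocols.

In other words, WSGI is like a bridge: one end connects to the web server, the other to the user's application. The bridge is fairly weak on its own, though, and sometimes needs other bridges to help it get the job done.

The terms used in this article are defined as follows.
wsgi app, also called the application: a WSGI application.
wsgi container, also called the container: this part is often called the handler, but I find that "handler" is easily confused with the app, so I call it the container.
wsgi_middleware, also called middleware: a special kind of program whose job is to meddle between the container and the application.

A picture is worth a thousand words, so here is a WSGI architecture diagram as I understand it.
{{~/sync/notes/zim/python/WSGI/wsgi初探/1.jpg}}

As the diagram shows, the relationships among the server, the container, and the application are rather tangled. The rest of this article untangles them.

===== WSGI Applications =====

A WSGI application is simply a **callable object**. As the simplest possible example, suppose we have the following application:

def application(environ, start_response):
    status = '200 OK'
    output = 'World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(12))]
    write = start_response(status, response_headers)
    write('Hello ')
    return [output]

This WSGI application is simple to the point of being crude, yet it is a fully functional WSGI application. It does raise a lot of questions, though: what is environ? What is start_response? Why can content be returned both through write and through return?

For these questions, it helps to guess at what each piece does. Thinking of CGI, environ is probably a set of environment variables that **describe the HTTP request (sent by the client)**, such as the method. start_response probably accepts the **HTTP response headers (the headers the application returns to the client)** and returns a write function, which can send the __HTTP response body__ to the client. The return statement naturally returns the HTTP response body as well. But what is the difference between write and the function's return value? Could it be that the caller simply invokes write on the application's return value by default? And why is the application's return value a __list__? That implies there must be a __process that iterates over the application's result and outputs it__. Does that mean iterators or generators are implicitly supported?

Wait, the application's result? __Since an application is a function, some object must call it, and it is easy to guess that this object passes environ and start_response to the application and sends the application's return value to the client. So what is this object? It is, of course, the WSGI container.__
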
===== WSGI Containers =====

First, where the term "WSGI container" comes from: it is a concept I made up myself, borrowed from the Java Servlet container. I see similarities between the two, so I took the name.

__The role of a WSGI container is to build an environment in which a WSGI application can run successfully.__ Running successfully means passing in the correct arguments, handling the returned result correctly, and delivering that result to the client.

So a WSGI container's workflow is roughly: use whatever communication mechanism the web server prescribes to obtain the correct request information from the web server, __package it__, pass it to the WSGI application for execution, and return the response correctly.

Generally, a WSGI container has to __build on an existing web server technology__, such as CGI, FastCGI, or an embedded mode.

Below, a minimal WSGI container is written using CGI. The official documentation does not say exactly how a WSGI container should be implemented; it only describes the constraints that must be met. See the specification in PEP 3333 for the details.

#!/usr/bin/python
# encoding: utf8

import sys
import os

# Make the environ argument
environ = {}
environ['REQUEST_METHOD'] = os.environ['REQUEST_METHOD']
environ['SCRIPT_NAME'] = os.environ['SCRIPT_NAME']
# These CGI variables may be missing from the process environment
environ['PATH_INFO'] = os.environ.get('PATH_INFO', '')
environ['QUERY_STRING'] = os.environ.get('QUERY_STRING', '')
environ['CONTENT_TYPE'] = os.environ.get('CONTENT_TYPE', '')
environ['CONTENT_LENGTH'] = os.environ.get('CONTENT_LENGTH', '')
environ['SERVER_NAME'] = os.environ['SERVER_NAME']
environ['SERVER_PORT'] = os.environ['SERVER_PORT']
environ['SERVER_PROTOCOL'] = os.environ['SERVER_PROTOCOL']
environ['wsgi.version'] = (1, 0)
environ['wsgi.url_scheme'] = 'http'
environ['wsgi.input'] = sys.stdin
environ['wsgi.errors'] = sys.stderr
environ['wsgi.multithread'] = False
environ['wsgi.multiprocess'] = True
environ['wsgi.run_once'] = True

# Make the start_response argument
# Note: per the WSGI spec, the response headers must not be sent
# until there is body content to send.
sent_header = False
res_status = None
res_headers = None

def write(body):
    global sent_header
    if not sent_header:
        # CGI expects the status as a "Status:" header line
        sys.stdout.write('Status: ' + res_status + '\r\n')
        for k, v in res_headers:
            sys.stdout.write(k + ': ' + v + '\r\n')
        sys.stdout.write('\r\n')
        sent_header = True
    sys.stdout.write(body)

def start_response(status, response_headers):
    global res_status
    global res_headers
    res_status = status
    res_headers = response_headers
    return write

# here is the application
def application(environ, start_response):
    status = '200 OK'
    output = 'World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(12))]
    write = start_response(status, response_headers)
    write('Hello ')
    return [output]

# here run the application
result = application(environ, start_response)
for value in result:
    write(value)

See? Implementing a WSGI container is not that hard.

However, designing the container made one major problem in the design of WSGI applications apparent to me: why provide two ways to return data? There is only one write function, yet it can be called from inside the application and is also called by the container to transmit the application's return value. If I were designing it, I would drop start_response entirely and use just application(environ) as the interface. It takes a single argument, and its return value is __status, response_headers, and a list of strings__. The actual transmission mechanism is completely hidden, and the user only needs to read data from environ and process it.
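The interface proposed above can be sketched as a thin adapter on top of WSGI. This is only an illustration of the idea, not part of any standard: the names simple_app, as_wsgi and fake_start_response are made up, and the body is bytes, following Python 3 conventions.

```python
def simple_app(environ):
    """Application in the proposed style: no start_response, no write()."""
    body = 'Hello World!'
    headers = [('Content-Type', 'text/plain'),
               ('Content-Length', str(len(body)))]
    return '200 OK', headers, [body.encode('utf-8')]


def as_wsgi(app):
    """Adapt a (status, headers, body) application to the WSGI convention."""
    def wsgi_app(environ, start_response):
        status, headers, body = app(environ)
        start_response(status, headers)   # the returned write() is never used
        return body
    return wsgi_app


captured = []

def fake_start_response(status, headers, exc_info=None):
    captured.append((status, headers))

result = as_wsgi(simple_app)({'REQUEST_METHOD': 'GET'}, fake_start_response)
```

The adapter shows that the single-return style is a strict subset of WSGI: it gives up write() and keeps only the returned iterable.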

Happily, a quick search suggests that the application design in the web3 standard is similar to my idea. I hope the web3 protocol is widely adopted soon.

====== Middleware ======

Middleware is a special kind of program that can meddle between the container and the application. Anyone familiar with Python decorators will notice that this is no different from a decorator.

Below is a simple routing middleware.

class Router(object):
    def __init__(self):
        self.path_info = {}

    def route(self, environ, start_response):
        application = self.path_info[environ['PATH_INFO']]
        return application(environ, start_response)

    def __call__(self, path):
        def wrapper(application):
            self.path_info[path] = application
            # return the application so the decorated name is not replaced by None
            return application
        return wrapper

This is a very simple routing middleware. Change the application part of the WSGI container code above to the following:

router = Router()

# here is the application
@router('/hello')
def hello(environ, start_response):
    status = '200 OK'
    output = 'Hello'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    write = start_response(status, response_headers)
    return [output]

@router('/world')
def world(environ, start_response):
    status = '200 OK'
    output = 'World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    write = start_response(status, response_headers)
    return [output]

# here run the application
result = router.route(environ, start_response)
for value in result:
    write(value)

This way, **the container automatically finds and runs the app that matches the requested path**.
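The router above raises KeyError for a path that was never registered. One possible variation, sketched here with a 404 fallback, is shown below; the attribute name routes, the helper request and the Python 3 bytes bodies are choices made for this example rather than part of the original post.

```python
class Router(object):
    def __init__(self):
        self.routes = {}

    def __call__(self, path):
        def register(application):
            self.routes[path] = application
            return application          # keep the decorated function usable
        return register

    def route(self, environ, start_response):
        application = self.routes.get(environ.get('PATH_INFO', ''))
        if application is None:
            body = b'Not Found'
            start_response('404 Not Found',
                           [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(body)))])
            return [body]
        return application(environ, start_response)


router = Router()

@router('/hello')
def hello(environ, start_response):
    body = b'Hello'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def request(path):
    """Call the router directly and return (status, body) for inspection."""
    statuses = []
    result = router.route({'PATH_INFO': path},
                          lambda status, headers, exc_info=None: statuses.append(status))
    return statuses[0], b''.join(result)
```

Calling request('/hello') exercises the registered handler, while any other path goes through the fallback branch.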
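
====== Further Thoughts ======

The more I write, the more this looks like a framework. Building a framework in Python really is simple.

Seen from another angle: if each application is treated as a unit of computation, and middleware is used to coordinate I/O and computing resources, then WSGI could be used to assemble a distributed system.

That is all.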
BIN
Zim/Programme/python/WSGI/wsgi初探/1.jpg
Normal file
Binary file not shown.
411
Zim/Programme/python/WSGI/主题:在Python3.0中处理web请求-封装wsgi.txt
Normal file
@@ -0,0 +1,411 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2011-10-13T22:13:34+08:00

====== Topic: Handling Web Requests in Python 3.0 - Wrapping WSGI ======
Created Thursday 13 October 2011
http://www.iteye.com/topic/396244

# -*- coding: utf-8 -*-

import socketserver, re, cgi, io, urllib.parse
from wsgiref.simple_server import WSGIServer


class AppException(Exception):
    pass


class Request(object):
    """Holds the client's request information."""

    def __init__(self, env):
        self.env = env
        self.winput = env["wsgi.input"]
        self.method = env["REQUEST_METHOD"]  # request method (GET or POST)
        self.__attrs = {}
        self.attributes = {}
        self.encoding = "UTF-8"

    def __getattr__(self, attr):
        # Only called when normal lookup fails, so params are parsed on first access
        if attr == "params" and "params" not in self.__attrs:
            fp = None
            if self.method == "POST":
                content = self.winput.read(int(self.env.get("CONTENT_LENGTH", "0")))
                #fp = io.StringIO(content.decode(self.encoding))
                fp = io.StringIO(urllib.parse.unquote(content.decode("ISO-8859-1"), encoding=self.encoding))

            self.fs = cgi.FieldStorage(fp=fp, environ=self.env, keep_blank_values=1)  # build the FieldStorage
            self.params = {}
            for key in self.fs.keys():
                self.params[key] = self.fs[key].value
            self.__attrs["params"] = self.params
        return self.__attrs[attr]


class Response(object):
    """Sends the response to the client."""

    def __init__(self, start_response, write=None):
        self.encoding = "UTF-8"
        self.start_response = start_response
        self._write = write

    def write(self, string):
        """Write data to the response stream.
        @param string: the string to write to the stream
        """
        if self._write is None:
            self._write = self.start_response("200 OK", [("Content-type", "text/html;charset=" + self.encoding)])
        # Python 3.0-era workaround: the encoded bytes are passed through as an
        # ISO-8859-1 str. Under PEP 3333 (Python 3.2+), write() expects bytes instead.
        self._write(string.encode(self.encoding).decode("ISO-8859-1"))

    def redirect(self, url):
        """Redirect the client to url."""
        if self._write is not None:
            raise AppException("The response stream has already been written to; cannot redirect.")
        self.start_response("302 Found", [("Location", url)])


# ThreadingMixIn must come first so that its process_request() overrides the server's
class ThreadingWSGIServer(socketserver.ThreadingMixIn, WSGIServer):
    """A WSGI server class that handles requests in multiple threads."""
    pass


class WSGIApplication(object):
    """The WSGI server program."""

    def __init__(self, urls=None):
        self.urls = urls  # URL mapping

    def getHandlerByUrl(self, url):
        """Return the handler mapped to url, or None if no handler matches."""
        url = url.replace("//", "/")  # tolerate accidental double slashes in the URL

        urlArr = url.split('/')
        for setUrl in self.urls.keys():
            setUrlArr = setUrl.split("/")
            #print(setUrl.replace("*",r'\w*'))
            if len(setUrlArr) == len(urlArr):
                for i in range(len(urlArr)):
                    if (i == len(urlArr) - 1 and
                            (setUrlArr[i] == '*' or setUrlArr[i] == urlArr[i] or
                             ('*' in setUrlArr[i] and
                              re.search(setUrlArr[i].replace("*", r'\w*'), urlArr[i])))):
                        return self.urls[setUrl]
                    if setUrlArr[i] == '*' or setUrlArr[i] == ' ':
                        continue
                    if setUrlArr[i] != urlArr[i]:
                        break

    def make_app(self):
        """Build the WSGI application callable."""
        def wsgi_app(env, start_response):
            #print(";\n".join([k+"="+str(v) for k, v in env.items()]))
            url = env["PATH_INFO"]  # path of the current request
            handlerCls = self.getHandlerByUrl(url)
            if handlerCls is None:
                # no handler is mapped to this URL
                start_response("404 Not Found", [("Content-type", "text/html;charset=utf-8")])
                return ["Error URL"]
            if not hasattr(handlerCls, "doGET") and not hasattr(handlerCls, "doPOST"):
                # the mapped class is not a valid handler
                start_response("500 Internal Server Error", [("Content-type", "text/html;charset=utf-8")])
                return ["Error Mapping"]
            request = Request(env)
            response = Response(start_response)
            try:
                handler = handlerCls(request, response)
            except TypeError:
                handler = handlerCls()
            methodName = "do" + request.method
            returnValue = None
            try:
                returnValue = getattr(handler, methodName)(request, response)
            except TypeError:
                returnValue = getattr(handler, methodName)()
            if returnValue is None:
                returnValue = []
            return returnValue
        return wsgi_app

    def make_server(self, serverIp='', port=8080, test=False):
        """Create a default server.
        @param test: whether to handle just a single request as a test
        """
        from wsgiref.simple_server import make_server  # load the module
        httpd = make_server(serverIp, port, self.make_app(), server_class=ThreadingWSGIServer)
        if test:  # test mode
            httpd.handle_request()  # handle a single request
        else:
            httpd.serve_forever()  # keep handling requests
        return True


def main():
    app = WSGIApplication(urls={"/a/*": TestHandler, "/a/b/*.do": TestHandler})
    app.make_server(test=True)


class TestHandler(object):
    def __init__(self):
        pass

    def doGET(self, request=None, response=None):
        request.encoding = 'UTF-8'
        response.write("Hello")

    def doPOST(self, request=None, response=None):
        #request.encoding='UTF-8'
        #response.write(request.params["name"])
        response.redirect("/a/x")


if __name__ == "__main__":
    main()
    #input()

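Continued from the previous post, Handling Web Requests in Python 3.0 - Wrapping WSGI, Continued:

This version adds a cookie wrapper, session support, and access to the request and response from thread-local scope. Sessions cannot be persisted yet.
http://www.iteye.com/topic/397437
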
# -*- coding: utf-8 -*-

import socketserver, re, cgi, io, urllib.parse
from wsgiref.simple_server import WSGIServer
import threading, time
import guid  # not a standard-library module; the original post relies on its generate() function
from http.cookies import SimpleCookie

ctx = context = threading.local()


class AppException(Exception):
    pass


class SessionPool(object):
    """Where sessions are stored."""

    sessionIdKey = "psessionid"

    def __init__(self, session_store_time=30):
        """Initialize the session pool.
        @param session_store_time: session lifetime, in minutes
        """
        self.session_store_time = session_store_time
        self.sessions = {}

    def getSession(self, key):
        """Get a session from the pool."""
        if key in self.sessions:
            session = self.sessions[key]
            if session.isTimeOut():
                self.removeSession(session.sessionId)
            else:
                return session
        return None

    def createSession(self):
        """Create a new session."""
        sessionId = self.newSessionId()
        session = Session(sessionId, self.session_store_time)
        self.sessions[sessionId] = session
        return session

    def removeSession(self, key):
        """Delete a session."""
        #self.sessions.remove(key)
        if key in self.sessions:
            del self.sessions[key]

    def newSessionId(self, ip=None):
        """Create a new session id."""
        return guid.generate(ip)

    def getSessionByCookie(self, cookie, response=None, create=True):
        """Find the session identified by the cookie."""
        sessionId = cookie.get(SessionPool.sessionIdKey, None)
        if sessionId is not None:
            sessionId = sessionId.value
            session = self.getSession(sessionId)
            if session is not None:
                session.lastAccessTime = time.time()
                return session
        if create:
            session = self.createSession()
            response.putCookie(SessionPool.sessionIdKey, session.sessionId)
            return session
        return None

    def saveSessions(self):
        pass


class Session(dict):
    """A client session."""

    def __init__(self, sid, store_time):
        self.sessionId = sid
        self.lastAccessTime = self.createTime = time.time()
        self.maxInactiveInterval = store_time  # session lifetime, in minutes

    def isTimeOut(self):
        """Return True if the session has timed out."""
        return time.time() - self.lastAccessTime > self.maxInactiveInterval * 60


class Request(object):
    """Holds the client's request information."""

    def __init__(self, env, sessions):
        self.env = env
        self.winput = env["wsgi.input"]
        self.method = env["REQUEST_METHOD"]  # request method (GET or POST)
        self.__attrs = {}
        self.attributes = {}
        self.encoding = "UTF-8"
        self.cookies = SimpleCookie(env.get("HTTP_COOKIE", ""))
        self.response = ctx.response
        self.sessionPool = sessions

    def __getattr__(self, attr):
        if attr == "params" and "params" not in self.__attrs:  # parse the client's request parameters
            fp = None
            if self.method == "POST":  # POST bodies are read from wsgi.input; otherwise handled as GET
                content = self.winput.read(int(self.env.get("CONTENT_LENGTH", "0")))
                #fp = io.StringIO(content.decode(self.encoding))
                fp = io.StringIO(urllib.parse.unquote(content.decode("ISO-8859-1"), encoding=self.encoding))

            self.fs = cgi.FieldStorage(fp=fp, environ=self.env, keep_blank_values=1)  # build the FieldStorage
            self.params = {}
            for key in self.fs.keys():
                self.params[key] = self.fs[key].value
            self.__attrs["params"] = self.params
        if attr == "session" and "session" not in self.__attrs:  # create the session on first access
            self.session = self.sessionPool.getSessionByCookie(self.cookies, self.response)
            return self.session
        return self.__attrs[attr]


class Response(object):
    """Sends the response to the client."""

    def __init__(self, start_response, write=None):
        self.encoding = "UTF-8"
        self.start_response = start_response
        self._write = write
        self.cookies = None
        self.headers = {}

    def write(self, string):
        """Write data to the response stream.
        @param string: the string to write to the stream
        """
        if self._write is None:
            headers = [("Content-type", "text/html;charset=" + self.encoding)]
            if self.cookies is not None:
                headers.append(('Set-Cookie', self.cookies.output(header="")))
            for k, v in self.headers.items():
                headers.append((k, v))
            self._write = self.start_response("200 OK", headers)
        self._write(string.encode(self.encoding).decode("ISO-8859-1"))

    def redirect(self, url):
        """Redirect the client to url."""
        if self._write is not None:
            raise AppException("The response stream has already been written to; cannot redirect.")
        self.start_response("302 Found", [("Location", url)])

    def putCookie(self, key, value, expires=1000000, path='/'):
        """Add a cookie."""
        if self.cookies is None:
            self.cookies = SimpleCookie()
        self.cookies[key] = urllib.parse.quote(value)
        self.cookies[key]["expires"] = expires
        self.cookies[key]['path'] = path

    def addHeaders(self, key, value):
        self.headers[key] = value


# WSGIServer must come last; otherwise requests are not handled concurrently
class ThreadingWSGIServer(socketserver.ThreadingMixIn, WSGIServer):
    """A WSGI server class that handles requests in multiple threads."""
    pass


class WSGIApplication(object):
    """The WSGI server program."""

    def __init__(self, urls=None):
        self.urls = urls  # URL mapping
        self.sessions = SessionPool(1)

    def getHandlerByUrl(self, url):
        """Return the handler mapped to url, or None if no handler matches."""
        url = url.replace("//", "/")  # tolerate accidental double slashes in the URL

        urlArr = url.split('/')
        for setUrl in self.urls.keys():
            setUrlArr = setUrl.split("/")
            #print(setUrl.replace("*",r'\w*'))
            if len(setUrlArr) == len(urlArr):
                for i in range(len(urlArr)):
                    if (i == len(urlArr) - 1 and
                            (setUrlArr[i] == '*' or setUrlArr[i] == urlArr[i] or
                             ('*' in setUrlArr[i] and
                              re.search(setUrlArr[i].replace("*", r'\w*'), urlArr[i])))):
                        return self.urls[setUrl]
                    if setUrlArr[i] == '*' or setUrlArr[i] == ' ':
                        continue
                    if setUrlArr[i] != urlArr[i]:
                        break

    def make_app(self):
        """Build the WSGI application callable."""
        def wsgi_app(env, start_response):
            print("start request....")
            #print(";\n".join([k+"="+str(v) for k, v in env.items()]))
            url = env["PATH_INFO"]  # path of the current request
            handlerCls = self.getHandlerByUrl(url)
            if handlerCls is None:
                # no handler is mapped to this URL
                start_response("404 Not Found", [("Content-type", "text/html;charset=utf-8")])
                return ["Error URL"]
            if not hasattr(handlerCls, "doGET") and not hasattr(handlerCls, "doPOST"):
                # the mapped class is not a valid handler
                start_response("500 Internal Server Error", [("Content-type", "text/html;charset=utf-8")])
                return ["Error Mapping"]
            response = Response(start_response)
            ctx.response = response
            request = Request(env, self.sessions)
            ctx.request = request  # keep request and response in thread-local scope for easy access
            try:
                handler = handlerCls(request, response)
            except TypeError:
                handler = handlerCls()
            methodName = "do" + request.method
            returnValue = None
            try:
                returnValue = getattr(handler, methodName)(request, response)
            except TypeError:
                returnValue = getattr(handler, methodName)()
            if returnValue is None:
                returnValue = []
            print("end request....")
            return returnValue
        return wsgi_app

    def make_server(self, serverIp='', port=8080, test=False):
        """Create a default server.
        @param test: whether to handle just a single request as a test
        """
        from wsgiref.simple_server import make_server  # load the module
        httpd = make_server(serverIp, port, self.make_app(), server_class=ThreadingWSGIServer)
        if test:  # test mode
            httpd.handle_request()  # handle a single request
        else:
            httpd.serve_forever()  # keep handling requests
        return True


def main():
    app = WSGIApplication(urls={"/a/*": TestHandler, "/a/b/*.do": TestHandler})
    app.make_server(test=False, port=9000)


class TestHandler(object):
    def __init__(self):
        pass

    def doGET(self):
        ctx.request.encoding = 'UTF-8'
        session = ctx.request.session
        if "x" in ctx.request.params:
            session["x"] = ctx.request.params["x"]
        #time.sleep(3)
        # fall back to an empty string when "x" has never been stored in the session
        ctx.response.write("Hello " + session.get("x", ""))

    def doPOST(self):
        #request.encoding='UTF-8'
        #response.write(request.params["name"])
        ctx.response.redirect("/a/x")


if __name__ == "__main__":
    main()
    #input()

23
Zim/Programme/python/WSGI/关于_WSGI.txt
Normal file
@@ -0,0 +1,23 @@
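Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2011-10-13T18:28:24+08:00

====== About WSGI ======
Created Thursday 13 October 2011
http://eishn.blog.163.com/blog/static/652318201011082044410/

To learn WSGI, the main thing is to read PEP 333. Reading the two example code listings in it is really enough to understand it. I read the example code and the requirements for the environment variables and then wrote the (eurasia) WSGI server; it is that simple.

One common source of confusion is mixing up (1) a **WSGI server** and (2) a WSGI-based framework. WSGI is in fact split into two sides, the server and the framework (that is, the application), plus middleware of course. Strictly speaking, WSGI is only a protocol that standardizes the interface connecting server and framework.

(1) A WSGI server exposes the web server's functionality through the **WSGI interface**. For example, mod_wsgi is such a server: it offers Apache's functionality in the form of a WSGI interface.

(2) A WSGI framework is what we usually mean by a framework such as Django. Note, however, that purely WSGI frameworks are rare: WSGI-based frameworks usually bundle their own WSGI server. Django and CherryPy, for example, each ship a WSGI server, mainly for testing, while a production WSGI server is used for deployment. Other WSGI frameworks, such as pylons and bfg, do not implement a WSGI server themselves and use paste as their WSGI server.

Paste is a popular WSGI server that comes with a lot of middleware. **flup** is another library that provides middleware.

Once the WSGI server and the application are clear, middleware follows naturally. Besides uses such as **session and cache**, I recently came across a bfg middleware dedicated to re-skinning a website. There are many more conceivable uses for middleware.
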
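To make the middleware idea concrete, here is a minimal sketch in Python 3. It is not taken from Paste, flup or bfg; the class name HeaderMiddleware and the X-Skin header are made up for illustration. The middleware looks like an application to the server and like a server to the application it wraps:

```python
class HeaderMiddleware(object):
    """Hypothetical middleware: appends one extra header to every response."""

    def __init__(self, app, name, value):
        self.app = app
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Seen from the wrapped app, this function is the server side.
            headers = list(headers) + [(self.name, self.value)]
            return start_response(status, headers, exc_info)

        # Seen from the real server, this object is the application side.
        return self.app(environ, patched_start_response)


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']


wrapped = HeaderMiddleware(hello_app, 'X-Skin', 'blue')

captured = {}


def fake_start_response(status, headers, exc_info=None):
    # Stand-in for what a real server would pass in.
    captured['status'] = status
    captured['headers'] = headers


body = b''.join(wrapped({}, fake_start_response))
print(captured['status'], captured['headers'], body)
```

Because the wrapper is itself a valid application, several such layers can be stacked without either the server or the innermost application noticing.
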
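One more addition: how a framework such as Django runs on Apache via __fastcgi (CGI is also a standard protocol, different from WSGI, so a conversion is needed)__. This requires tools such as flup.fcgi or fastcgi.py (eurasia also has its own fastcgi.py implementation), which convert the fastcgi protocol into the WSGI interface (turning fastcgi into a WSGI server) for the framework to plug into. The overall architecture looks like this: django -> fcgi2 wsgiserver -> mod_fcgi -> apache.

Although I am not a fan of WSGI, its importance for Python web development is undeniable. If you want to design your own web framework but would rather not deal with the socket layer and HTTP message parsing, you can start from WSGI. There is a shared view in the Python community that knocking together a web framework is as natural as taking a sip of water. Perhaps every Python enthusiast goes through a phase of tinkering with frameworks.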
41
Zim/Programme/python/WSGI/在apache下配置mod_wsgi.txt
Normal file
@@ -0,0 +1,41 @@
Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2011-10-13T17:18:36+08:00

====== Configuring mod_wsgi under Apache ======
Created Thursday 13 October 2011
http://healich.iteye.com/blog/727620

Configuring mod_wsgi under Apache

Apache http Server: http://httpd.apache.org/
modwsgi: http://code.google.com/p/modwsgi/, http://code.google.com/p/modwsgi/wiki/InstallationInstructions
WSGI: http://www.python.org/dev/peps/pep-0333/

After installing Apache, you also need to download mod_wsgi. mod_wsgi is the extension that lets Apache support the Python WSGI protocol. The current version at the time of writing is 3.3, and Windows binaries are available for different Python versions.

First, make the Apache httpd server load the wsgi_module extension. Put the downloaded mod_wsgi.so into the modules directory under the Apache server installation directory, and add the following line to httpd.conf:

LoadModule wsgi_module modules/mod_wsgi.so

Use the **WSGIScriptAlias** directive to specify the startup script of the WSGI application. Add the following line to httpd.conf; the default DocumentRoot is used here:

WSGIScriptAlias /test "/path/to/docRoot/test.wsgi"

The test program is then served under the **/test path**; the WSGI script file is **test.wsgi**:

def application(environ, start_response):
    status = '200 OK'
    # the body must be bytes on Python 3; a b'' literal also works on Python 2.6+
    output = b'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

After restarting the Apache server, the test program can be accessed at http://localhost/test. If "Hello World!" is displayed, mod_wsgi has been installed successfully.

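Before restarting Apache, the script can also be exercised without any web server. The sketch below assumes Python 3 and uses only the standard wsgiref package: wsgiref.validate.validator wraps the application in a PEP 333 conformance checker and wsgiref.util.setup_testing_defaults builds a throwaway environ. The smoke_test helper is a made-up name for illustration and is not part of mod_wsgi.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator


def application(environ, start_response):
    # Same shape as test.wsgi, with the body as bytes (required on Python 3).
    status = '200 OK'
    output = b'Hello World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]


def smoke_test(app):
    """Hypothetical helper: run app once under wsgiref's conformance checker."""
    environ = {'QUERY_STRING': ''}
    setup_testing_defaults(environ)  # fills in the required CGI and wsgi.* keys
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen['status'] = status
        return lambda data: None  # stand-in for the legacy write() callable

    result = validator(app)(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        result.close()  # the validator expects the iterable to be closed
    return seen['status'], body


print(smoke_test(application))
```

If the application breaks the protocol, for instance by yielding str instead of bytes, the validator is meant to raise an AssertionError rather than let the mistake surface later inside Apache.
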
100
Zim/Programme/python/WSGI/捉摸Python的WSGI.txt
Normal file
@@ -0,0 +1,100 @@
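Content-Type: text/x-zim-wiki
Wiki-Format: zim 0.4
Creation-Date: 2011-10-13T21:50:17+08:00

====== Figuring Out Python's WSGI ======
Created Thursday 13 October 2011
http://www.iteye.com/topic/734050

Over the past month, the thing I have dealt with most is Python's WSGI. WSGI is neither a framework nor a module; it is just a **specification** that defines some **interfaces**, yet it affects every aspect of Python web development. WSGI has been defined like this: WSGI is the Web Server Gateway Interface. It is a specification for **web servers and application servers to communicate with web applications** (though it can also be used for more than that). This article is not a detailed introduction to WSGI; I just want to talk about what I learned while studying it.

As that definition says, the protocol defines a set of interfaces that **standardize (or unify) communication between the server side and the application side**. What does this set of interfaces look like? It is very simple, especially for the application side.
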
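The application side only needs to implement a Python object that **accepts** two arguments, has a __call__ method, and returns an iterable yielding zero or more strings. (I stress "Python object" only to distinguish it from Java objects: in Python a function, a type and so on are all objects; Python really is "everything is an object", see the book 《Python源码分析》 for details.) As every programmer knows, parameter names can be chosen freely, and this is no exception, but by convention the first parameter is named "environ" and the second "start_response". As for what the object actually does, **the application is free to decide**.
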
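As a rough illustration of "any callable object will do", the sketch below shows three shapes an application can take: a function, an instance with __call__, and a generator function. It assumes Python 3, where the body items must be bytes rather than str; the names function_app, InstanceApp, generator_app and collect are invented for this example.

```python
def function_app(environ, start_response):
    """A plain function is already a valid WSGI application."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'from a function']


class InstanceApp(object):
    """Instances of this class are callable through __call__."""

    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'from an instance']


def generator_app(environ, start_response):
    """A generator works too: the result only has to be iterable."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'from '
    yield b'a generator'


def collect(app):
    # Hypothetical helper: call the app roughly the way a server would
    # and join the body chunks it returns.
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    body = b''.join(app({}, start_response))
    return statuses[0], body


for app in (function_app, InstanceApp(), generator_app):
    print(collect(app))
```
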
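What the server side has to do is not complicated either: for every incoming **request**, it calls once the object that the application side has "**registered**" (the object the protocol requires the application side to implement), and then returns the corresponding **response message**. That completes one round of communication between server and application, and with it the handling of one user request. Of course, since **the protocol requires the server to pass two arguments when calling**, it also specifies some details of these two arguments. The first argument is a dictionary holding all the information obtained from the user request and from the server's environment variables; the protocol defines which values must be present and under which variable names. The second argument is a **callback function**; it hands back to the application side a write object that can be used to produce the **response body**, and this object also has a __call__ method. (PEP 333 keeps write() mainly for existing applications; new code is expected to return the body as an iterable.)
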
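The server's half of the contract can be sketched in a few lines. This is not how wsgiref or any real server is written; run_once and legacy_app are hypothetical names, and the environ is filled in with wsgiref.util.setup_testing_defaults instead of being parsed from a real request. The point is only to show the two arguments and the write callable that start_response returns:

```python
import io
from wsgiref.util import setup_testing_defaults


def run_once(app, path='/'):
    """Hypothetical mini-gateway: call app once, return (status, headers, body)."""
    environ = {'PATH_INFO': path, 'REQUEST_METHOD': 'GET'}
    setup_testing_defaults(environ)  # adds SERVER_NAME, wsgi.input, ...
    out = io.BytesIO()
    state = {}

    def write(data):
        # Legacy channel: the app may push body bytes through this callable.
        out.write(data)

    def start_response(status, response_headers, exc_info=None):
        state['status'] = status
        state['headers'] = response_headers
        return write

    result = app(environ, start_response)
    try:
        for chunk in result:  # normal channel: iterate over the returned body
            out.write(chunk)
    finally:
        if hasattr(result, 'close'):
            result.close()  # PEP 333 asks the server to call close() if present
    return state['status'], state['headers'], out.getvalue()


def legacy_app(environ, start_response):
    write = start_response('200 OK', [('Content-Type', 'text/plain')])
    write(b'written; ')
    return [b'returned for ' + environ['PATH_INFO'].encode('latin-1')]


print(run_once(legacy_app, '/demo'))
```
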
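The protocol also mentions that **middleware** can be designed to sit between the server side and the application side and implement common functionality such as **session and routing**.

How is the protocol applied in practice? The **wsgiref** module bundled with Python **(a server implementation supporting the WSGI protocol)** has a simple example:
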
from wsgiref.simple_server import make_server

def hello_world_app(environ, start_response):
    status = '200 OK'  # HTTP Status
    headers = [('Content-type', 'text/plain')]  # HTTP Headers
    start_response(status, headers)

    # The returned object is going to be printed
    # (bytes are required on Python 3; b"" also works on Python 2.6+)
    return [b"Hello World"]

httpd = make_server('', 8000, hello_world_app)  # create the server and register the app with it
print("Serving on port 8000...")

# Serve until process is killed
httpd.serve_forever()

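This example mostly shows how the application side is developed: it simply **implements a function that satisfies the specification** as the protocol requires, so that when a browser sends a request to port 8000 on this machine, it gets back the text "Hello World". Simple as it is, the example shows very clearly how the interface between the application side and the server side is used.

You might think: right now, requests to any address on this port are all handled by this "hello_world_app" function, so you could implement something that parses the PATH information of the request and forwards **different addresses** to different functions or classes. You might find the environ and start_response arguments unintuitive, so you could wrap them into request and response objects of your own, the way Java servlets do. You might feel that some **common functionality** could be factored out and done **outside the specific application logic**. Well, then you are already thinking about how to build middleware or a web framework! All of this has in fact been done before, for example by Routes, WebOb and Beaker. Of course you are free to build your own wheel; sometimes you only really understand the existing mature tools after doing it yourself once, and most importantly, none of this is hard to do in the Python world!
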
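The first of those ideas, dispatching on the request path, fits in a dozen lines. The following is only a sketch in Python 3 under simple assumptions (exact-match paths, no patterns); PathDispatcher and the three small apps are invented for the example and have nothing to do with how Routes works internally.

```python
def index(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'index page']


def hello(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello page']


def not_found(environ, start_response):
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']


class PathDispatcher(object):
    """Hypothetical router: maps exact PATH_INFO values to WSGI apps."""

    def __init__(self, routes, default=not_found):
        self.routes = dict(routes)
        self.default = default

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '/')
        app = self.routes.get(path, self.default)
        return app(environ, start_response)


router = PathDispatcher({'/': index, '/hello': hello})


def call(path):
    # Drive the router by hand with a minimal environ.
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    body = b''.join(router({'PATH_INFO': path}, start_response))
    return statuses[0], body


print(call('/hello'))
print(call('/missing'))
```

Since the router is itself a WSGI application, it can be handed to make_server() in place of hello_world_app without any other change.
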
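I don't know whether you are like me, but when writing an application I always wonder a little how the server side works. The rough flow is probably something everyone can imagine: the server opens a socket and waits for client connections; when a request arrives, the server reads the incoming data, does some preliminary wrapping according to the HTTP protocol, then calls the previously registered application and passes the request data in; once the response has been produced, it sends the data back out through the socket, and that's it. Fortunately Python code is concise, and the simple server in the bundled wsgiref is simple too, so let us explore the implementation in more detail!
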
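First, the class inheritance. The actual class of this simple server is WSGIServer, which inherits from HTTPServer; HTTPServer inherits from TCPServer, and TCPServer from BaseServer. Working directly with the server classes are the RequestHandler classes, which from the top down are WSGIRequestHandler -> BaseHTTPRequestHandler -> StreamRequestHandler -> BaseRequestHandler. Not very complicated compared with Java, is it? How do they work? Let me explain a little.
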
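The two chains can be checked directly from the interpreter. This small sketch assumes Python 3 (where the base classes live in socketserver and http.server rather than SocketServer and BaseHTTPServer); lineage is just a helper name made up here.

```python
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer


def lineage(cls):
    # Walk the method resolution order, leaving out the implicit `object`.
    return [klass.__name__ for klass in cls.__mro__ if klass is not object]


print(' -> '.join(lineage(WSGIServer)))
print(' -> '.join(lineage(WSGIRequestHandler)))
```
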
Let us start with BaseServer, the root base class of the servers. It has a comment that describes very clearly what the methods it defines are for:

Methods for the caller:

- __init__(server_address, RequestHandlerClass)
- serve_forever()
- handle_request()  # if you do not use serve_forever()
- fileno() -> int   # for select()

Methods that may be overridden:

- server_bind()
- server_activate()
- get_request() -> request, client_address
- verify_request(request, client_address)
- server_close()
- process_request(request, client_address)
- close_request(request)
- handle_error()

Methods for derived classes:

- finish_request(request, client_address)

As you can see, a server class really has just these few methods.

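Of the four methods that can be called from outside, the constructor is obviously used to create an instance; the fourth is probably related to building asynchronous servers, so I skip it here. The code shows that the remaining two serve the same purpose, handling incoming requests; the difference is that serve_forever() keeps looping for the lifetime of the server process, while handle_request() handles one request and returns (in the version of the source quoted here, serve_forever() simply calls handle_request() in a loop; later Python versions use a polling loop instead). handle_request() spells out the whole flow from accepting a request to returning, and the code is simple:
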
def handle_request(self):
    """Handle one request, possibly blocking."""
    try:
        request, client_address = self.get_request()
    except socket.error:
        return
    if self.verify_request(request, client_address):
        try:
            self.process_request(request, client_address)
        except:
            self.handle_error(request, client_address)
            self.close_request(request)

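The "handle one request and return" behaviour is easy to observe from the outside. The sketch below assumes Python 3 and a usable loopback interface; QuietHandler and hello_app are names made up for the example. The server is bound to port 0 so the operating system picks a free port, handle_request() runs in a helper thread, and a single request is sent with http.client:

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep the example's output free of access-log lines


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']


# Port 0 asks the operating system for any free port.
httpd = make_server('127.0.0.1', 0, hello_app, handler_class=QuietHandler)
port = httpd.server_address[1]

# handle_request() blocks until one request has been served, then returns.
worker = threading.Thread(target=httpd.handle_request)
worker.daemon = True
worker.start()

conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
conn.request('GET', '/')
body = conn.getresponse().read()
conn.close()

worker.join(5)
served_once = not worker.is_alive()  # the thread ended after a single request
httpd.server_close()
print(body, served_once)
```

With serve_forever() in place of handle_request(), the worker thread would keep running until the server is shut down.
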
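BaseServer defines these internally called methods, but their bodies are mostly empty, left for **the concrete Server classes to implement**. The BaseServer code also shows what the RequestHandler class is for: it does the actual parsing of the request's content. It is invoked by finish_request(), and finish_request() in turn is evidently meant to be called from process_request().

TCPServer inherits from BaseServer and makes concrete the socket connection setup we guessed at.

In the same source file as the two classes above there are two more important classes: ThreadingMixIn and ForkingMixIn. Each overrides process_request() and calls finish_request() in a newly created thread or process respectively. From a practical point of view, this also explains why finish_request() is wrapped in process_request() rather than called directly in the second try block of handle_request().
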
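Combining the mix-in with WSGIServer takes one class statement. The sketch below assumes Python 3, where the module is called socketserver; the class name ThreadingWSGIServer is chosen here for illustration and hello_app is a made-up sample application.

```python
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIServer


class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    """WSGIServer variant that handles each request in its own thread."""
    daemon_threads = True  # worker threads should not keep the process alive


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']


# The mix-in is listed first, so its process_request() is the one that is used.
uses_mixin = ThreadingWSGIServer.process_request is ThreadingMixIn.process_request
print(uses_mixin)

# Typical use (left commented out because serve_forever() blocks):
# from wsgiref.simple_server import make_server
# httpd = make_server('', 8000, hello_app, server_class=ThreadingWSGIServer)
# httpd.serve_forever()
```

The order of the base classes matters: with WSGIServer listed first, the plain process_request() inherited from BaseServer would be found before the threaded one.
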
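HTTPServer actually does very little: it just records the name of the socket server.

Next comes WSGIServer. It does two new things: it sets some basic __environment variable values__ and accepts the __registration of the application__. From this server's code you can see that this is where the interface implemented by the application side gets registered with the server side, and only one application can be registered! So to serve several applications you have to forward calls by means of routing. Also, this WSGIServer is neither multi-threaded nor multi-process.
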
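Both points, the single application slot and the pre-filled environment, can be seen through WSGIServer's set_app(), get_app() and base_environ. The following is a small Python 3 sketch; first_app and second_app are invented names, and the server is bound to port 0 on the loopback interface so it does not collide with anything else.

```python
from wsgiref.simple_server import make_server


def first_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'first']


def second_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'second']


# make_server() creates the server and registers the app via set_app().
httpd = make_server('127.0.0.1', 0, first_app)
registered_before = httpd.get_app()

# Registering again replaces the previous app; there is only one slot.
httpd.set_app(second_app)
registered_after = httpd.get_app()

# base_environ holds the values WSGIServer prepared when it bound the socket.
gateway = httpd.base_environ['GATEWAY_INTERFACE']
server_port = httpd.base_environ['SERVER_PORT']
actual_port = httpd.server_address[1]
httpd.server_close()

print(registered_before is first_app, registered_after is second_app)
print(gateway, server_port == str(actual_port))
```
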
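I won't analyze the RequestHandler classes that do the actual wrapping of the request content; if you are interested, read the source yourselves, it is simple too! In my next blog post I plan to share what I learned about how the pylons framework runs.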