[ { "url": "http://docs.python.org/library/struct.html", "title": "struct", "html": "
This module performs conversions between Python values and C structs represented\nas Python strings. This can be used in handling binary data stored in files or\nfrom network connections, among other sources. It uses\nFormat Strings as compact descriptions of the layout of the C\nstructs and the intended conversion to/from Python values.
\nNote
\nBy default, the result of packing a given C struct includes pad bytes in\norder to maintain proper alignment for the C types involved; similarly,\nalignment is taken into account when unpacking. This behavior is chosen so\nthat the bytes of a packed struct correspond exactly to the layout in memory\nof the corresponding C struct. To handle platform-independent data formats\nor omit implicit pad bytes, use standard size and alignment instead of\nnative size and alignment: see Byte Order, Size, and Alignment for details.
\nThe module defines the following exception and functions:
struct.pack_into(fmt, buffer, offset, v1, v2, ...)

Pack the values v1, v2, ... according to the given format, write the packed bytes into the writable buffer starting at offset. Note that offset is a required argument.
\n\nNew in version 2.5.
struct.unpack_from(fmt, buffer, offset=0)

Unpack the buffer according to the given format. The result is a tuple even if it contains exactly one item. The buffer must contain at least the amount of data required by the format (len(buffer[offset:]) must be at least calcsize(fmt)).
\n\nNew in version 2.5.
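Taken together, these two functions let you pack into and unpack from a pre-allocated buffer without creating intermediate strings. A minimal sketch (using an explicit '>' standard format so the byte layout is platform-independent; bytes literals are shown for portability):

```python
import struct

# Pack two big-endian shorts into a pre-allocated writable buffer.
buf = bytearray(8)
struct.pack_into('>hh', buf, 0, 1, 2)   # the offset 0 is required, not implied

# Unpack a single short starting at offset 2.
values = struct.unpack_from('>h', buf, 2)
print(values)  # (2,)
```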
\nFormat strings are the mechanism used to specify the expected layout when\npacking and unpacking data. They are built up from Format Characters,\nwhich specify the type of data being packed/unpacked. In addition, there are\nspecial characters for controlling the Byte Order, Size, and Alignment.
\nBy default, C types are represented in the machine’s native format and byte\norder, and properly aligned by skipping pad bytes if necessary (according to the\nrules used by the C compiler).
\nAlternatively, the first character of the format string can be used to indicate\nthe byte order, size and alignment of the packed data, according to the\nfollowing table:
Character | Byte order | Size | Alignment
---|---|---|---
@ | native | native | native
= | native | standard | none
< | little-endian | standard | none
> | big-endian | standard | none
! | network (= big-endian) | standard | none
If the first character is not one of these, '@' is assumed.
\nNative byte order is big-endian or little-endian, depending on the host\nsystem. For example, Intel x86 and AMD64 (x86-64) are little-endian;\nMotorola 68000 and PowerPC G5 are big-endian; ARM and Intel Itanium feature\nswitchable endianness (bi-endian). Use sys.byteorder to check the\nendianness of your system.
\nNative size and alignment are determined using the C compiler’s\nsizeof expression. This is always combined with native byte order.
\nStandard size depends only on the format character; see the table in\nthe Format Characters section.
\nNote the difference between '@' and '=': both use native byte order, but\nthe size and alignment of the latter is standardized.
\nThe form '!' is available for those poor souls who claim they can’t remember\nwhether network byte order is big-endian or little-endian.
\nThere is no way to indicate non-native byte order (force byte-swapping); use the\nappropriate choice of '<' or '>'.
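For example, the order prefixes behave as follows (a small sketch; the two-byte short makes the byte order directly visible):

```python
import struct

# The same value packed under each explicit byte-order prefix.
little = struct.pack('<h', 1)   # least significant byte first
big = struct.pack('>h', 1)      # most significant byte first
network = struct.pack('!h', 1)  # identical to '>': network order is big-endian
print(little, big, network)
```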
\nFormat characters have the following meaning; the conversion between C and\nPython values should be obvious given their types. The ‘Standard size’ column\nrefers to the size of the packed value in bytes when using standard size; that\nis, when the format string starts with one of '<', '>', '!' or\n'='. When using native size, the size of the packed value is\nplatform-dependent.
Format | C Type | Python type | Standard size | Notes
---|---|---|---|---
x | pad byte | no value | |
c | char | string of length 1 | 1 |
b | signed char | integer | 1 | (3)
B | unsigned char | integer | 1 | (3)
? | _Bool | bool | 1 | (1)
h | short | integer | 2 | (3)
H | unsigned short | integer | 2 | (3)
i | int | integer | 4 | (3)
I | unsigned int | integer | 4 | (3)
l | long | integer | 4 | (3)
L | unsigned long | integer | 4 | (3)
q | long long | integer | 8 | (2), (3)
Q | unsigned long long | integer | 8 | (2), (3)
f | float | float | 4 | (4)
d | double | float | 8 | (4)
s | char[] | string | |
p | char[] | string | |
P | void * | integer | | (5), (3)
Notes:
\nThe '?' conversion code corresponds to the _Bool type defined by\nC99. If this type is not available, it is simulated using a char. In\nstandard mode, it is always represented by one byte.
\n\nNew in version 2.6.
\nThe 'q' and 'Q' conversion codes are available in native mode only if\nthe platform C compiler supports C long long, or, on Windows,\n__int64. They are always available in standard modes.
\n\nNew in version 2.2.
\nWhen attempting to pack a non-integer using any of the integer conversion\ncodes, if the non-integer has a __index__() method then that method is\ncalled to convert the argument to an integer before packing. If no\n__index__() method exists, or the call to __index__() raises\nTypeError, then the __int__() method is tried. However, the use\nof __int__() is deprecated, and will raise DeprecationWarning.
\n\nChanged in version 2.7: Use of the __index__() method for non-integers is new in 2.7.
\n\nChanged in version 2.7: Prior to version 2.7, not all integer conversion codes would use the\n__int__() method to convert, and DeprecationWarning was\nraised only for float arguments.
\nFor the 'f' and 'd' conversion codes, the packed representation uses\nthe IEEE 754 binary32 (for 'f') or binary64 (for 'd') format,\nregardless of the floating-point format used by the platform.
\nThe 'P' format character is only available for the native byte ordering\n(selected as the default or with the '@' byte order character). The byte\norder character '=' chooses to use little- or big-endian ordering based\non the host system. The struct module does not interpret this as native\nordering, so the 'P' format is not available.
\nA format character may be preceded by an integral repeat count. For example,\nthe format string '4h' means exactly the same as 'hhhh'.
\nWhitespace characters between formats are ignored; a count and its format must\nnot contain whitespace though.
\nFor the 's' format character, the count is interpreted as the size of the\nstring, not a repeat count like for the other format characters; for example,\n'10s' means a single 10-byte string, while '10c' means 10 characters.\nIf a count is not given, it defaults to 1. For packing, the string is\ntruncated or padded with null bytes as appropriate to make it fit. For\nunpacking, the resulting string always has exactly the specified number of\nbytes. As a special case, '0s' means a single, empty string (while\n'0c' means 0 characters).
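A short sketch of how the count interacts with 's' (bytes literals shown for portability; under Python 2 a plain str behaves the same):

```python
import struct

padded = struct.pack('10s', b'abc')       # padded with NULs to 10 bytes
truncated = struct.pack('3s', b'abcdef')  # truncated to 3 bytes
fields = struct.unpack('5s', b'hello')    # one 5-byte string, as a 1-tuple
print(padded, truncated, fields)
```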
\nThe 'p' format character encodes a “Pascal string”, meaning a short\nvariable-length string stored in a fixed number of bytes, given by the count.\nThe first byte stored is the length of the string, or 255, whichever is smaller.\nThe bytes of the string follow. If the string passed in to pack() is too\nlong (longer than the count minus 1), only the leading count-1 bytes of the\nstring are stored. If the string is shorter than count-1, it is padded with\nnull bytes so that exactly count bytes in all are used. Note that for\nunpack(), the 'p' format character consumes count bytes, but that the\nstring returned can never contain more than 255 characters.
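The length byte and the truncation rule can be seen directly (a sketch with count 5, so at most count-1 = 4 data bytes are stored):

```python
import struct

short_p = struct.pack('5p', b'ab')       # length byte 2, then data, NUL-padded
long_p = struct.pack('5p', b'abcdefg')   # only the leading 4 bytes are stored
back = struct.unpack('5p', short_p)      # the length byte itself is consumed
print(short_p, long_p, back)
```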
\nFor the 'P' format character, the return value is a Python integer or long\ninteger, depending on the size needed to hold a pointer when it has been cast to\nan integer type. A NULL pointer will always be returned as the Python integer\n0. When packing pointer-sized values, Python integer or long integer objects\nmay be used. For example, the Alpha and Merced processors use 64-bit pointer\nvalues, meaning a Python long integer will be used to hold the pointer; other\nplatforms use 32-bit pointers and will use a Python integer.
\nFor the '?' format character, the return value is either True or\nFalse. When packing, the truth value of the argument object is used.\nEither 0 or 1 in the native or standard bool representation will be packed, and\nany non-zero value will be True when unpacking.
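A sketch of both directions (standard mode, so the packed size is exactly one byte):

```python
import struct

true_byte = struct.pack('>?', 5)      # any truthy object packs as 0x01
false_byte = struct.pack('>?', [])    # any falsy object packs as 0x00
flag = struct.unpack('>?', b'\x07')   # any non-zero byte unpacks as True
print(true_byte, false_byte, flag)
```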
\nNote
\nAll examples assume a native byte order, size, and alignment with a\nbig-endian machine.
\nA basic example of packing/unpacking three integers:
>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
>>> calcsize('hhl')
8
Unpacked fields can be named by assigning them to variables or by wrapping\nthe result in a named tuple:
>>> record = 'raymond   \x32\x12\x08\x01\x08'
>>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)

>>> from collections import namedtuple
>>> Student = namedtuple('Student', 'name serialnum school gradelevel')
>>> Student._make(unpack('<10sHHb', record))
Student(name='raymond   ', serialnum=4658, school=264, gradelevel=8)
The ordering of format characters may have an impact on size since the padding\nneeded to satisfy alignment requirements is different:
>>> pack('ci', '*', 0x12131415)
'*\x00\x00\x00\x12\x13\x14\x15'
>>> pack('ic', 0x12131415, '*')
'\x12\x13\x14\x15*'
>>> calcsize('ci')
8
>>> calcsize('ic')
5
The following format 'llh0l' specifies two pad bytes at the end, assuming\nlongs are aligned on 4-byte boundaries:
>>> pack('llh0l', 1, 2, 3)
'\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00'
This only works when native size and alignment are in effect; standard size and\nalignment does not enforce any alignment.
\n\nThe struct module also defines the following type:
class struct.Struct(format)

Return a new Struct object which writes and reads binary data according to the format string format. Creating a Struct object once and calling its methods is more efficient than calling the struct functions with the same format since the format string only needs to be compiled once.
\n\nNew in version 2.5.
\nCompiled Struct objects support the following methods and attributes:
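Typical usage might look like the following sketch (the '>hhl' format here is just an illustrative choice):

```python
import struct

record = struct.Struct('>hhl')   # compiled once, reusable for every record
packed = record.pack(1, 2, 3)
print(record.size)               # 8: standard sizes 2 + 2 + 4
print(record.unpack(packed))     # (1, 2, 3)
```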
\nExceptions should be class objects. The exceptions are defined in the module\nexceptions. This module never needs to be imported explicitly: the\nexceptions are provided in the built-in namespace as well as the\nexceptions module.
\nFor class exceptions, in a try statement with an except\nclause that mentions a particular class, that clause also handles any exception\nclasses derived from that class (but not exception classes from which it is\nderived). Two exception classes that are not related via subclassing are never\nequivalent, even if they have the same name.
\nThe built-in exceptions listed below can be generated by the interpreter or\nbuilt-in functions. Except where mentioned, they have an “associated value”\nindicating the detailed cause of the error. This may be a string or a tuple\ncontaining several items of information (e.g., an error code and a string\nexplaining the code). The associated value is the second argument to the\nraise statement. If the exception class is derived from the standard\nroot class BaseException, the associated value is present as the\nexception instance’s args attribute.
\nUser code can raise built-in exceptions. This can be used to test an exception\nhandler or to report an error condition “just like” the situation in which the\ninterpreter raises the same exception; but beware that there is nothing to\nprevent user code from raising an inappropriate error.
\nThe built-in exception classes can be sub-classed to define new exceptions;\nprogrammers are encouraged to at least derive new exceptions from the\nException class and not BaseException. More information on\ndefining exceptions is available in the Python Tutorial under\nUser-defined Exceptions.
\nThe following exceptions are only used as base classes for other exceptions.
\nThe base class for all built-in exceptions. It is not meant to be directly\ninherited by user-defined classes (for that, use Exception). If\nstr() or unicode() is called on an instance of this class, the\nrepresentation of the argument(s) to the instance are returned, or the empty\nstring when there were no arguments.
\n\nNew in version 2.5.
\n\n\nAll built-in, non-system-exiting exceptions are derived from this class. All\nuser-defined exceptions should also be derived from this class.
\n\nChanged in version 2.5: Changed to inherit from BaseException.
\nThe base class for exceptions that can occur outside the Python system:\nIOError, OSError. When exceptions of this type are created with a\n2-tuple, the first item is available on the instance’s errno attribute\n(it is assumed to be an error number), and the second item is available on the\nstrerror attribute (it is usually the associated error message). The\ntuple itself is also available on the args attribute.
\n\nNew in version 1.5.2.
\nWhen an EnvironmentError exception is instantiated with a 3-tuple, the\nfirst two items are available as above, while the third item is available on the\nfilename attribute. However, for backwards compatibility, the\nargs attribute contains only a 2-tuple of the first two constructor\narguments.
\nThe filename attribute is None when this exception is created with\nother than 3 arguments. The errno and strerror attributes are\nalso None when the instance was created with other than 2 or 3 arguments.\nIn this last case, args contains the verbatim constructor arguments as a\ntuple.
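The attribute behaviour described above can be checked by constructing the exception directly (errno.ENOENT and the file name are illustrative choices; in Python 3, EnvironmentError survives as an alias of OSError):

```python
import errno

e = EnvironmentError(errno.ENOENT, 'No such file or directory', 'missing.txt')
print(e.errno)     # the first item of the 3-tuple
print(e.strerror)  # the second item
print(e.filename)  # the third item
print(e.args)      # only the first two items, for backwards compatibility
```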
\nThe following exceptions are the exceptions that are actually raised.
Raised when a generator's close() method is called. It directly inherits from BaseException instead of StandardError since it is technically not an error.
\n\nNew in version 2.5.
\n\nChanged in version 2.6: Changed to inherit from BaseException.
\nRaised when an I/O operation (such as a print statement, the built-in\nopen() function or a method of a file object) fails for an I/O-related\nreason, e.g., “file not found” or “disk full”.
\nThis class is derived from EnvironmentError. See the discussion above\nfor more information on exception instance attributes.
\n\nChanged in version 2.6: Changed socket.error to use this as a base class.
\nRaised when the user hits the interrupt key (normally Control-C or\nDelete). During execution, a check for interrupts is made regularly.\nInterrupts typed when a built-in function input() or raw_input() is\nwaiting for input also raise this exception. The exception inherits from\nBaseException so as to not be accidentally caught by code that catches\nException and thus prevent the interpreter from exiting.
\n\nChanged in version 2.5: Changed to inherit from BaseException.
This exception is derived from RuntimeError. In user-defined base classes, abstract methods should raise this exception when they require derived classes to override the method.
\n\nNew in version 1.5.2.
\nThis exception is derived from EnvironmentError. It is raised when a\nfunction returns a system-related error (not for illegal argument types or\nother incidental errors). The errno attribute is a numeric error\ncode from errno, and the strerror attribute is the\ncorresponding string, as would be printed by the C function perror().\nSee the module errno, which contains names for the error codes defined\nby the underlying operating system.
\nFor exceptions that involve a file system path (such as chdir() or\nunlink()), the exception instance will contain a third attribute,\nfilename, which is the file name passed to the function.
\n\nNew in version 1.5.2.
\nThis exception is raised when a weak reference proxy, created by the\nweakref.proxy() function, is used to access an attribute of the referent\nafter it has been garbage collected. For more information on weak references,\nsee the weakref module.
\n\nNew in version 2.2: Previously known as the weakref.ReferenceError exception.
Raised by an iterator's next() method to signal that there are no further values. This is derived from Exception rather than StandardError, since this is not considered an error in its normal application.
\n\nNew in version 2.2.
\nRaised when the parser encounters a syntax error. This may occur in an\nimport statement, in an exec statement, in a call to the\nbuilt-in function eval() or input(), or when reading the initial\nscript or standard input (also interactively).
\nInstances of this class have attributes filename, lineno,\noffset and text for easier access to the details. str()\nof the exception instance returns only the message.
\nRaised when the interpreter finds an internal error, but the situation does not\nlook so serious to cause it to abandon all hope. The associated value is a\nstring indicating what went wrong (in low-level terms).
\nYou should report this to the author or maintainer of your Python interpreter.\nBe sure to report the version of the Python interpreter (sys.version; it is\nalso printed at the start of an interactive Python session), the exact error\nmessage (the exception’s associated value) and if possible the source of the\nprogram that triggered the error.
\nThis exception is raised by the sys.exit() function. When it is not\nhandled, the Python interpreter exits; no stack traceback is printed. If the\nassociated value is a plain integer, it specifies the system exit status (passed\nto C’s exit() function); if it is None, the exit status is zero; if\nit has another type (such as a string), the object’s value is printed and the\nexit status is one.
\nInstances have an attribute code which is set to the proposed exit\nstatus or error message (defaulting to None). Also, this exception derives\ndirectly from BaseException and not StandardError, since it is not\ntechnically an error.
\nA call to sys.exit() is translated into an exception so that clean-up\nhandlers (finally clauses of try statements) can be\nexecuted, and so that a debugger can execute a script without running the risk\nof losing control. The os._exit() function can be used if it is\nabsolutely positively necessary to exit immediately (for example, in the child\nprocess after a call to fork()).
\nThe exception inherits from BaseException instead of StandardError\nor Exception so that it is not accidentally caught by code that catches\nException. This allows the exception to properly propagate up and cause\nthe interpreter to exit.
\n\nChanged in version 2.5: Changed to inherit from BaseException.
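The behaviour above can be sketched in a few lines:

```python
import sys

code = None
try:
    sys.exit(3)                  # raises SystemExit rather than exiting here
except SystemExit as exc:
    code = exc.code              # the proposed exit status
print(code)                                   # 3
print(issubclass(SystemExit, Exception))      # False
print(issubclass(SystemExit, BaseException))  # True
```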
\nRaised when a reference is made to a local variable in a function or method, but\nno value has been bound to that variable. This is a subclass of\nNameError.
\n\nNew in version 2.0.
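A minimal trigger: assigning to a name anywhere in a function makes it local for the whole body, so reading it before the assignment fails.

```python
def bump():
    counter += 1    # 'counter' is local here because of this assignment
    return counter

caught = None
try:
    bump()
except UnboundLocalError as exc:
    caught = exc
print(type(caught).__name__)                     # UnboundLocalError
print(issubclass(UnboundLocalError, NameError))  # True
```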
\nRaised when a Unicode-related encoding or decoding error occurs. It is a\nsubclass of ValueError.
\n\nNew in version 2.0.
\nRaised when a Unicode-related error occurs during encoding. It is a subclass of\nUnicodeError.
\n\nNew in version 2.3.
\nRaised when a Unicode-related error occurs during decoding. It is a subclass of\nUnicodeError.
\n\nNew in version 2.3.
\nRaised when a Unicode-related error occurs during translating. It is a subclass\nof UnicodeError.
\n\nNew in version 2.3.
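A decoding failure illustrates the subclass relationships and the diagnostic attributes these exceptions carry:

```python
err = None
try:
    b'\xff\xfe'.decode('utf-8')   # 0xff is not a valid UTF-8 start byte
except UnicodeDecodeError as exc:
    err = exc
print(err.encoding, err.start)                       # 'utf-8' 0
print(issubclass(UnicodeDecodeError, UnicodeError))  # True
print(issubclass(UnicodeError, ValueError))          # True
```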
\nRaised when a Windows-specific error occurs or when the error number does not\ncorrespond to an errno value. The winerror and\nstrerror values are created from the return values of the\nGetLastError() and FormatMessage() functions from the Windows\nPlatform API. The errno value maps the winerror value to\ncorresponding errno.h values. This is a subclass of OSError.
\n\nNew in version 2.0.
\n\nChanged in version 2.5: Previous versions put the GetLastError() codes into errno.
\nThe following exceptions are used as warning categories; see the warnings\nmodule for more information.
\nBase class for warnings about probable mistakes in module imports.
\n\nNew in version 2.5.
\nBase class for warnings related to Unicode.
\n\nNew in version 2.5.
\nThe class hierarchy for built-in exceptions is:
BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StandardError
      |    +-- BufferError
      |    +-- ArithmeticError
      |    |    +-- FloatingPointError
      |    |    +-- OverflowError
      |    |    +-- ZeroDivisionError
      |    +-- AssertionError
      |    +-- AttributeError
      |    +-- EnvironmentError
      |    |    +-- IOError
      |    |    +-- OSError
      |    |         +-- WindowsError (Windows)
      |    |         +-- VMSError (VMS)
      |    +-- EOFError
      |    +-- ImportError
      |    +-- LookupError
      |    |    +-- IndexError
      |    |    +-- KeyError
      |    +-- MemoryError
      |    +-- NameError
      |    |    +-- UnboundLocalError
      |    +-- ReferenceError
      |    +-- RuntimeError
      |    |    +-- NotImplementedError
      |    +-- SyntaxError
      |    |    +-- IndentationError
      |    |         +-- TabError
      |    +-- SystemError
      |    +-- TypeError
      |    +-- ValueError
      |         +-- UnicodeError
      |              +-- UnicodeDecodeError
      |              +-- UnicodeEncodeError
      |              +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning
\nSource code: Lib/string.py
The string module contains a number of useful constants and classes, as well as some deprecated legacy functions that are also available as methods on strings. In addition, Python's built-in string classes support the sequence type methods described in the Sequence Types — str, unicode, list, tuple, bytearray, buffer, xrange section, and also the string-specific methods described in the String Methods section. To output formatted strings use template strings or the % operator described in the String Formatting Operations section.
The constants defined in this module are:
\n\nNew in version 2.6.
\nThe built-in str and unicode classes provide the ability\nto do complex variable substitutions and value formatting via the\nstr.format() method described in PEP 3101. The Formatter\nclass in the string module allows you to create and customize your own\nstring formatting behaviors using the same implementation as the built-in\nformat() method.
\nThe Formatter class has the following public methods:
\nIn addition, the Formatter defines a number of methods that are\nintended to be replaced by subclasses:
\nLoop over the format_string and return an iterable of tuples\n(literal_text, field_name, format_spec, conversion). This is used\nby vformat() to break the string into either literal text, or\nreplacement fields.
\nThe values in the tuple conceptually represent a span of literal text\nfollowed by a single replacement field. If there is no literal text\n(which can happen if two replacement fields occur consecutively), then\nliteral_text will be a zero-length string. If there is no replacement\nfield, then the values of field_name, format_spec and conversion\nwill be None.
\nRetrieve a given field value. The key argument will be either an\ninteger or a string. If it is an integer, it represents the index of the\npositional argument in args; if it is a string, then it represents a\nnamed argument in kwargs.
\nThe args parameter is set to the list of positional arguments to\nvformat(), and the kwargs parameter is set to the dictionary of\nkeyword arguments.
For compound field names, these functions are only called for the first component of the field name; subsequent components are handled through normal attribute and indexing operations.
\nSo for example, the field expression ‘0.name’ would cause\nget_value() to be called with a key argument of 0. The name\nattribute will be looked up after get_value() returns by calling the\nbuilt-in getattr() function.
\nIf the index or keyword refers to an item that does not exist, then an\nIndexError or KeyError should be raised.
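For example, a hypothetical subclass could override get_value() to substitute a placeholder for missing keyword arguments instead of raising KeyError (DefaultFormatter and the '<missing>' marker are illustrative names, not part of the module):

```python
import string

class DefaultFormatter(string.Formatter):
    def get_value(self, key, args, kwargs):
        if isinstance(key, int):             # positional: index into args
            return args[key]
        return kwargs.get(key, '<missing>')  # named: fall back to a marker

fmt = DefaultFormatter()
result = fmt.format('{0} and {name}', 'spam')
print(result)  # 'spam and <missing>'
```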
\nThe str.format() method and the Formatter class share the same\nsyntax for format strings (although in the case of Formatter,\nsubclasses can define their own format string syntax).
\nFormat strings contain “replacement fields” surrounded by curly braces {}.\nAnything that is not contained in braces is considered literal text, which is\ncopied unchanged to the output. If you need to include a brace character in the\nliteral text, it can be escaped by doubling: {{ and }}.
\nThe grammar for a replacement field is as follows:
replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
arg_name          ::=  [identifier | integer]
attribute_name    ::=  identifier
element_index     ::=  integer | index_string
index_string      ::=  <any source character except "]"> +
conversion        ::=  "r" | "s"
format_spec       ::=  <described in the next section>
In less formal terms, the replacement field can start with a field_name that specifies\nthe object whose value is to be formatted and inserted\ninto the output instead of the replacement field.\nThe field_name is optionally followed by a conversion field, which is\npreceded by an exclamation point '!', and a format_spec, which is preceded\nby a colon ':'. These specify a non-default format for the replacement value.
\nSee also the Format Specification Mini-Language section.
\nThe field_name itself begins with an arg_name that is either a number or a\nkeyword. If it’s a number, it refers to a positional argument, and if it’s a keyword,\nit refers to a named keyword argument. If the numerical arg_names in a format string\nare 0, 1, 2, ... in sequence, they can all be omitted (not just some)\nand the numbers 0, 1, 2, ... will be automatically inserted in that order.\nBecause arg_name is not quote-delimited, it is not possible to specify arbitrary\ndictionary keys (e.g., the strings '10' or ':-]') within a format string.\nThe arg_name can be followed by any number of index or\nattribute expressions. An expression of the form '.name' selects the named\nattribute using getattr(), while an expression of the form '[index]'\ndoes an index lookup using __getitem__().
\n\nChanged in version 2.7: The positional argument specifiers can be omitted, so '{} {}' is\nequivalent to '{0} {1}'.
\nSome simple format string examples:
\n"First, thou shalt count to {0}" # References first positional argument\n"Bring me a {}" # Implicitly references the first positional argument\n"From {} to {}" # Same as "From {0} to {1}"\n"My quest is {name}" # References keyword argument 'name'\n"Weight in tons {0.weight}" # 'weight' attribute of first positional arg\n"Units destroyed: {players[0]}" # First element of keyword argument 'players'.\n
The conversion field causes a type coercion before formatting. Normally, the\njob of formatting a value is done by the __format__() method of the value\nitself. However, in some cases it is desirable to force a type to be formatted\nas a string, overriding its own definition of formatting. By converting the\nvalue to a string before calling __format__(), the normal formatting logic\nis bypassed.
\nTwo conversion flags are currently supported: '!s' which calls str()\non the value, and '!r' which calls repr().
\nSome examples:
\n"Harold's a clever {0!s}" # Calls str() on the argument first\n"Bring out the holy {name!r}" # Calls repr() on the argument first\n
The format_spec field contains a specification of how the value should be\npresented, including such details as field width, alignment, padding, decimal\nprecision and so on. Each value type can define its own “formatting\nmini-language” or interpretation of the format_spec.
\nMost built-in types support a common formatting mini-language, which is\ndescribed in the next section.
\nA format_spec field can also include nested replacement fields within it.\nThese nested replacement fields can contain only a field name; conversion flags\nand format specifications are not allowed. The replacement fields within the\nformat_spec are substituted before the format_spec string is interpreted.\nThis allows the formatting of a value to be dynamically specified.
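For example, the field width and precision below are supplied at call time; the inner fields are substituted first, yielding the spec '>10.2f':

```python
# {width} and {prec} are nested replacement fields inside the format spec.
text = '{0:>{width}.{prec}f}'.format(3.14159, width=10, prec=2)
print(repr(text))  # '      3.14'  (right-aligned in 10 characters)
```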
\nSee the Format examples section for some examples.
\n“Format specifications” are used within replacement fields contained within a\nformat string to define how individual values are presented (see\nFormat String Syntax). They can also be passed directly to the built-in\nformat() function. Each formattable type may define how the format\nspecification is to be interpreted.
\nMost built-in types implement the following options for format specifications,\nalthough some of the formatting options are only supported by the numeric types.
\nA general convention is that an empty format string ("") produces\nthe same result as if you had called str() on the value. A\nnon-empty format string typically modifies the result.
\nThe general form of a standard format specifier is:
\n\nformat_spec ::= [[fill]align][sign][#][0][width][,][.precision][type]\nfill ::= <a character other than '}'>\nalign ::= "<" | ">" | "=" | "^"\nsign ::= "+" | "-" | " "\nwidth ::= integer\nprecision ::= integer\ntype ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"\n\n
The fill character can be any character other than '{' or '}'. The presence of a fill character is signaled by the character following it, which must be one of the alignment options. If the second character of format_spec is not a valid alignment option, then it is assumed that both the fill character and the alignment option are absent.
\nThe meaning of the various alignment options is as follows:
Option | Meaning
---|---
'<' | Forces the field to be left-aligned within the available space (this is the default for most objects).
'>' | Forces the field to be right-aligned within the available space (this is the default for numbers).
'=' | Forces the padding to be placed after the sign (if any) but before the digits. This is used for printing fields in the form '+000000120'. This alignment option is only valid for numeric types.
'^' | Forces the field to be centered within the available space.
Note that unless a minimum field width is defined, the field width will always\nbe the same size as the data to fill it, so that the alignment option has no\nmeaning in this case.
\nThe sign option is only valid for number types, and can be one of the\nfollowing:
Option | Meaning
---|---
'+' | Indicates that a sign should be used for both positive as well as negative numbers.
'-' | Indicates that a sign should be used only for negative numbers (this is the default behavior).
space | Indicates that a leading space should be used on positive numbers, and a minus sign on negative numbers.
The '#' option is only valid for integers, and only for binary, octal, or\nhexadecimal output. If present, it specifies that the output will be prefixed\nby '0b', '0o', or '0x', respectively.
\nThe ',' option signals the use of a comma for a thousands separator.\nFor a locale aware separator, use the 'n' integer presentation type\ninstead.
\n\nChanged in version 2.7: Added the ',' option (see also PEP 378).
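Both options in action:

```python
as_bin = '{0:#b}'.format(5)         # '0b101'
as_hex = '{0:#x}'.format(255)       # '0xff'
grouped = '{0:,d}'.format(1234567)  # '1,234,567'
print(as_bin, as_hex, grouped)
```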
\nwidth is a decimal integer defining the minimum field width. If not\nspecified, then the field width will be determined by the content.
\nIf the width field is preceded by a zero ('0') character, this enables\nzero-padding. This is equivalent to an alignment type of '=' and a fill\ncharacter of '0'.
\nThe precision is a decimal number indicating how many digits should be\ndisplayed after the decimal point for a floating point value formatted with\n'f' and 'F', or before and after the decimal point for a floating point\nvalue formatted with 'g' or 'G'. For non-number types the field\nindicates the maximum field size - in other words, how many characters will be\nused from the field content. The precision is not allowed for integer values.
\nFinally, the type determines how the data should be presented.
\nThe available string presentation types are:
Type    Meaning
----    -------
's'     String format. This is the default type for strings and may be
        omitted.
None    The same as 's'.
The available integer presentation types are:
Type    Meaning
----    -------
'b'     Binary format. Outputs the number in base 2.
'c'     Character. Converts the integer to the corresponding unicode
        character before printing.
'd'     Decimal integer. Outputs the number in base 10.
'o'     Octal format. Outputs the number in base 8.
'x'     Hex format. Outputs the number in base 16, using lowercase
        letters for the digits above 9.
'X'     Hex format. Outputs the number in base 16, using uppercase
        letters for the digits above 9.
'n'     Number. The same as 'd', except that it uses the current locale
        setting to insert the appropriate number separator characters.
None    The same as 'd'.
In addition to the above presentation types, integers can be formatted\nwith the floating point presentation types listed below (except\n'n' and None). When doing so, float() is used to convert the\ninteger to a floating point number before formatting.
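As described, the integer is passed through float() first, so the float defaults (such as a precision of 6) apply:

```python
# An integer with a float presentation type is converted via float().
assert '{:f}'.format(42) == '42.000000'
assert '{:e}'.format(42) == '4.200000e+01'
```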
\nThe available presentation types for floating point and decimal values are:
Type    Meaning
----    -------
'e'     Exponent notation. Prints the number in scientific notation
        using the letter 'e' to indicate the exponent.
'E'     Exponent notation. Same as 'e' except it uses an uppercase 'E'
        as the separator character.
'f'     Fixed point. Displays the number as a fixed-point number.
'F'     Fixed point. Same as 'f'.
'g'     General format. For a given precision p >= 1, this rounds the
        number to p significant digits and then formats the result in
        either fixed-point format or in scientific notation, depending
        on its magnitude.

        The precise rules are as follows: suppose that the result
        formatted with presentation type 'e' and precision p-1 would
        have exponent exp. Then if -4 <= exp < p, the number is
        formatted with presentation type 'f' and precision p-1-exp.
        Otherwise, the number is formatted with presentation type 'e'
        and precision p-1. In both cases insignificant trailing zeros
        are removed from the significand, and the decimal point is also
        removed if there are no remaining digits following it.

        Positive and negative infinity, positive and negative zero, and
        nans are formatted as inf, -inf, 0, -0 and nan respectively,
        regardless of the precision.

        A precision of 0 is treated as equivalent to a precision of 1.
'G'     General format. Same as 'g' except switches to 'E' if the
        number gets too large. The representations of infinity and NaN
        are uppercased, too.
'n'     Number. The same as 'g', except that it uses the current locale
        setting to insert the appropriate number separator characters.
'%'     Percentage. Multiplies the number by 100 and displays in fixed
        ('f') format, followed by a percent sign.
None    The same as 'g'.
This section contains examples of the new format syntax and comparison with the old %-formatting.

In most of the cases the syntax is similar to the old %-formatting, with the addition of the {} and with : used instead of %. For example, '%03.2f' can be translated to '{:03.2f}'.

The new format syntax also supports new and different options, shown in the following examples.
\nAccessing arguments by position:
\n>>> '{0}, {1}, {2}'.format('a', 'b', 'c')\n'a, b, c'\n>>> '{}, {}, {}'.format('a', 'b', 'c') # 2.7+ only\n'a, b, c'\n>>> '{2}, {1}, {0}'.format('a', 'b', 'c')\n'c, b, a'\n>>> '{2}, {1}, {0}'.format(*'abc') # unpacking argument sequence\n'c, b, a'\n>>> '{0}{1}{0}'.format('abra', 'cad') # arguments' indices can be repeated\n'abracadabra'\n
Accessing arguments by name:
\n>>> 'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')\n'Coordinates: 37.24N, -115.81W'\n>>> coord = {'latitude': '37.24N', 'longitude': '-115.81W'}\n>>> 'Coordinates: {latitude}, {longitude}'.format(**coord)\n'Coordinates: 37.24N, -115.81W'\n
Accessing arguments’ attributes:
\n>>> c = 3-5j\n>>> ('The complex number {0} is formed from the real part {0.real} '\n... 'and the imaginary part {0.imag}.').format(c)\n'The complex number (3-5j) is formed from the real part 3.0 and the imaginary part -5.0.'\n>>> class Point(object):\n... def __init__(self, x, y):\n... self.x, self.y = x, y\n... def __str__(self):\n... return 'Point({self.x}, {self.y})'.format(self=self)\n...\n>>> str(Point(4, 2))\n'Point(4, 2)'\n
Accessing arguments’ items:
\n>>> coord = (3, 5)\n>>> 'X: {0[0]}; Y: {0[1]}'.format(coord)\n'X: 3; Y: 5'\n
Replacing %s and %r:
\n>>> "repr() shows quotes: {!r}; str() doesn't: {!s}".format('test1', 'test2')\n"repr() shows quotes: 'test1'; str() doesn't: test2"\n
Aligning the text and specifying a width:
\n>>> '{:<30}'.format('left aligned')\n'left aligned '\n>>> '{:>30}'.format('right aligned')\n' right aligned'\n>>> '{:^30}'.format('centered')\n' centered '\n>>> '{:*^30}'.format('centered') # use '*' as a fill char\n'***********centered***********'\n
Replacing %+f, %-f, and % f and specifying a sign:
>>> '{:+f}; {:+f}'.format(3.14, -3.14) # show it always\n'+3.140000; -3.140000'\n>>> '{: f}; {: f}'.format(3.14, -3.14) # show a space for positive numbers\n' 3.140000; -3.140000'\n>>> '{:-f}; {:-f}'.format(3.14, -3.14) # show only the minus -- same as '{:f}; {:f}'\n'3.140000; -3.140000'\n
Replacing %x and %o and converting the value to different bases:
\n>>> # format also supports binary numbers\n>>> "int: {0:d}; hex: {0:x}; oct: {0:o}; bin: {0:b}".format(42)\n'int: 42; hex: 2a; oct: 52; bin: 101010'\n>>> # with 0x, 0o, or 0b as prefix:\n>>> "int: {0:d}; hex: {0:#x}; oct: {0:#o}; bin: {0:#b}".format(42)\n'int: 42; hex: 0x2a; oct: 0o52; bin: 0b101010'\n
Using the comma as a thousands separator:
\n>>> '{:,}'.format(1234567890)\n'1,234,567,890'\n
Expressing a percentage:
\n>>> points = 19.5\n>>> total = 22\n>>> 'Correct answers: {:.2%}'.format(points/total)\n'Correct answers: 88.64%'\n
Using type-specific formatting:
\n>>> import datetime\n>>> d = datetime.datetime(2010, 7, 4, 12, 15, 58)\n>>> '{:%Y-%m-%d %H:%M:%S}'.format(d)\n'2010-07-04 12:15:58'\n
Nesting arguments and more complex examples:
\n>>> for align, text in zip('<^>', ['left', 'center', 'right']):\n... '{0:{fill}{align}16}'.format(text, fill=align, align=align)\n...\n'left<<<<<<<<<<<<'\n'^^^^^center^^^^^'\n'>>>>>>>>>>>right'\n>>>\n>>> octets = [192, 168, 0, 1]\n>>> '{:02X}{:02X}{:02X}{:02X}'.format(*octets)\n'C0A80001'\n>>> int(_, 16)\n3232235521\n>>>\n>>> width = 5\n>>> for num in range(5,12):\n... for base in 'dXob':\n... print '{0:{width}{base}}'.format(num, base=base, width=width),\n... print\n...\n 5 5 5 101\n 6 6 6 110\n 7 7 7 111\n 8 8 10 1000\n 9 9 11 1001\n 10 A 12 1010\n 11 B 13 1011\n
\nNew in version 2.4.
Templates provide simpler string substitutions as described in PEP 292. Instead of the normal %-based substitutions, Templates support $-based substitutions, using the following rules:

- $$ is an escape; it is replaced with a single $.
- $identifier names a substitution placeholder matching a mapping key of "identifier". By default, "identifier" must spell a Python identifier. The first non-identifier character after the $ character terminates this placeholder specification.
- ${identifier} is equivalent to $identifier. It is required when valid identifier characters follow the placeholder but are not part of the placeholder, such as "${noun}ification".
Any other appearance of $ in the string will result in a ValueError\nbeing raised.
\nThe string module provides a Template class that implements\nthese rules. The methods of Template are:
\nThe constructor takes a single argument which is the template string.
\nLike substitute(), except that if placeholders are missing from\nmapping and kws, instead of raising a KeyError exception, the\noriginal placeholder will appear in the resulting string intact. Also,\nunlike with substitute(), any other appearances of the $ will\nsimply return $ instead of raising ValueError.
While other exceptions may still occur, this method is called “safe” because it always tries to return a usable string instead of raising an exception. In another sense, safe_substitute() may be anything other than safe, since it will silently ignore malformed templates containing dangling delimiters, unmatched braces, or placeholders that are not valid Python identifiers.
Template instances also provide one public data attribute:

- template — This is the object passed to the constructor's template argument. In general, you shouldn't change it, but read-only access is not enforced.
\nHere is an example of how to use a Template:
\n>>> from string import Template\n>>> s = Template('$who likes $what')\n>>> s.substitute(who='tim', what='kung pao')\n'tim likes kung pao'\n>>> d = dict(who='tim')\n>>> Template('Give $who $100').substitute(d)\nTraceback (most recent call last):\n[...]\nValueError: Invalid placeholder in string: line 1, col 10\n>>> Template('$who likes $what').substitute(d)\nTraceback (most recent call last):\n[...]\nKeyError: 'what'\n>>> Template('$who likes $what').safe_substitute(d)\n'tim likes $what'
Advanced usage: you can derive subclasses of Template to customize the placeholder syntax, delimiter character, or the entire regular expression used to parse template strings. To do this, you can override these class attributes:

- delimiter — The literal string describing a placeholder introducing delimiter. The default value is $.
- idpattern — The regular expression describing the pattern for non-braced placeholders. The default value is the regular expression [_a-z][_a-z0-9]*.
Alternatively, you can provide the entire regular expression pattern by overriding the class attribute pattern. If you do this, the value must be a regular expression object with four named capturing groups. The capturing groups correspond to the rules given above, along with the invalid placeholder rule:

- escaped — This group matches the escape sequence, e.g. $$, in the default pattern.
- named — This group matches the unbraced placeholder name; it should not include the delimiter in the capturing group.
- braced — This group matches the brace-enclosed placeholder name; it should not include either the delimiter or braces in the capturing group.
- invalid — This group matches any other delimiter pattern (usually a single delimiter), and it should appear last in the regular expression.
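A minimal subclassing sketch (the PercentTemplate name is hypothetical), overriding only the delimiter attribute:

```python
from string import Template

class PercentTemplate(Template):
    # Hypothetical subclass: use '%' as the placeholder delimiter
    # instead of the default '$'.
    delimiter = '%'

# Placeholders now start with '%', and '%%' escapes a literal '%'.
assert PercentTemplate('%who likes %what').substitute(
    who='tim', what='kung pao') == 'tim likes kung pao'
assert PercentTemplate('100%%').substitute() == '100%'
```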
\nThe following functions are available to operate on string and Unicode objects.\nThey are not available as string methods.
\nReturn a translation table suitable for passing to translate(), that will\nmap each character in from into the character at the same position in to;\nfrom and to must have the same length.
\nNote
\nDon’t use strings derived from lowercase and uppercase as\narguments; in some locales, these don’t have the same length. For case\nconversions, always use str.lower() and str.upper().
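For illustration, here is the same idea using str.maketrans, the Python 3 spelling of this function (in Python 2, the module-level string.maketrans returns a 256-character mapping string instead of a dict, but it is used with translate() the same way):

```python
# Build a table mapping each character of the first string to the
# character at the same position in the second, then apply it.
table = str.maketrans('abc', 'xyz')
assert 'cab'.translate(table) == 'zxy'
```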
The following functions are also defined as methods of string and Unicode objects; see section String Methods for more information on those. You should consider these functions as deprecated, although they will not be removed until Python 3.0. The functions defined in this module are:
\n\nDeprecated since version 2.0: Use the float() built-in function.
Convert a string to a floating point number. The string must have the standard syntax for a floating point literal in Python, optionally preceded by a sign (+ or -). Note that this behaves identically to the built-in function float() when passed a string.
\nNote
\nWhen passing in a string, values for NaN and Infinity may be returned, depending\non the underlying C library. The specific set of strings accepted which cause\nthese values to be returned depends entirely on the C library and is known to\nvary.
\n\nDeprecated since version 2.0: Use the int() built-in function.
\nConvert string s to an integer in the given base. The string must consist\nof one or more digits, optionally preceded by a sign (+ or -). The\nbase defaults to 10. If it is 0, a default base is chosen depending on the\nleading characters of the string (after stripping the sign): 0x or 0X\nmeans 16, 0 means 8, anything else means 10. If base is 16, a leading\n0x or 0X is always accepted, though not required. This behaves\nidentically to the built-in function int() when passed a string. (Also\nnote: for a more flexible interpretation of numeric literals, use the built-in\nfunction eval().)
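The base rules carry over to the built-in int() replacement, with one caveat: in Python 3, a base-0 octal literal must be written with an 0o prefix rather than a bare leading 0:

```python
# Explicit base, with and without the optional 0x prefix.
assert int('ff', 16) == 255
assert int('0x1f', 16) == 31

# Base 0: the prefix selects the base.
assert int('0x10', 0) == 16
assert int('10', 0) == 10
```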
\n\nDeprecated since version 2.0: Use the long() built-in function.
Convert string s to a long integer in the given base. The string must consist of one or more digits, optionally preceded by a sign (+ or -). The base argument has the same meaning as for atoi(). A trailing l or L is not allowed, except if the base is 0. Note that when invoked without base or with base set to 10, this behaves identically to the built-in function long() when passed a string.
\nReturn a list of the words of the string s. If the optional second argument\nsep is absent or None, the words are separated by arbitrary strings of\nwhitespace characters (space, tab, newline, return, formfeed). If the second\nargument sep is present and not None, it specifies a string to be used as\nthe word separator. The returned list will then have one more item than the\nnumber of non-overlapping occurrences of the separator in the string. The\noptional third argument maxsplit defaults to 0. If it is nonzero, at most\nmaxsplit number of splits occur, and the remainder of the string is returned\nas the final element of the list (thus, the list will have at most\nmaxsplit+1 elements).
\nThe behavior of split on an empty string depends on the value of sep. If sep\nis not specified, or specified as None, the result will be an empty list.\nIf sep is specified as any string, the result will be a list containing one\nelement which is an empty string.
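The same behavior, illustrated with the equivalent str.split method:

```python
# Default: runs of whitespace separate words.
assert 'one  two three'.split() == ['one', 'two', 'three']

# An explicit separator keeps empty fields.
assert 'a,b,,c'.split(',') == ['a', 'b', '', 'c']

# maxsplit limits the number of splits.
assert 'a b c d'.split(None, 2) == ['a', 'b', 'c d']

# Empty-string corner cases described above.
assert ''.split() == []
assert ''.split(',') == ['']
```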
\nReturn a list of the words of the string s, scanning s from the end. To all\nintents and purposes, the resulting list of words is the same as returned by\nsplit(), except when the optional third argument maxsplit is explicitly\nspecified and nonzero. When maxsplit is nonzero, at most maxsplit number of\nsplits – the rightmost ones – occur, and the remainder of the string is\nreturned as the first element of the list (thus, the list will have at most\nmaxsplit+1 elements).
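The difference only shows up with a nonzero maxsplit, where the rightmost splits happen first:

```python
assert 'a,b,c,d'.split(',', 1) == ['a', 'b,c,d']
assert 'a,b,c,d'.rsplit(',', 1) == ['a,b,c', 'd']
```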
\n\nNew in version 2.4.
\nReturn a copy of the string with leading characters removed. If chars is\nomitted or None, whitespace characters are removed. If given and not\nNone, chars must be a string; the characters in the string will be\nstripped from the beginning of the string this method is called on.
\n\nChanged in version 2.2.3: The chars parameter was added. The chars parameter cannot be passed in\nearlier 2.2 versions.
\nReturn a copy of the string with trailing characters removed. If chars is\nomitted or None, whitespace characters are removed. If given and not\nNone, chars must be a string; the characters in the string will be\nstripped from the end of the string this method is called on.
\n\nChanged in version 2.2.3: The chars parameter was added. The chars parameter cannot be passed in\nearlier 2.2 versions.
\nReturn a copy of the string with leading and trailing characters removed. If\nchars is omitted or None, whitespace characters are removed. If given and\nnot None, chars must be a string; the characters in the string will be\nstripped from the both ends of the string this method is called on.
\n\nChanged in version 2.2.3: The chars parameter was added. The chars parameter cannot be passed in\nearlier 2.2 versions.
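The three stripping variants side by side, via the equivalent string methods:

```python
assert '   spacious   '.lstrip() == 'spacious   '
assert '   spacious   '.rstrip() == '   spacious'
assert '   spacious   '.strip() == 'spacious'

# With a chars argument, any of those characters are removed from the
# end(s), in any order.
assert 'www.example.com'.strip('cmowz.') == 'example'
```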
\nThis module provides regular expression matching operations similar to\nthose found in Perl. Both patterns and strings to be searched can be\nUnicode strings as well as 8-bit strings.
\nRegular expressions use the backslash character ('\\') to indicate\nspecial forms or to allow special characters to be used without invoking\ntheir special meaning. This collides with Python’s usage of the same\ncharacter for the same purpose in string literals; for example, to match\na literal backslash, one might have to write '\\\\\\\\' as the pattern\nstring, because the regular expression must be \\\\, and each\nbackslash must be expressed as \\\\ inside a regular Python string\nliteral.
\nThe solution is to use Python’s raw string notation for regular expression\npatterns; backslashes are not handled in any special way in a string literal\nprefixed with 'r'. So r"\\n" is a two-character string containing\n'\\' and 'n', while "\\n" is a one-character string containing a\nnewline. Usually patterns will be expressed in Python code using this raw\nstring notation.
\nIt is important to note that most regular expression operations are available as\nmodule-level functions and RegexObject methods. The functions are\nshortcuts that don’t require you to compile a regex object first, but miss some\nfine-tuning parameters.
\nSee also
\nA regular expression (or RE) specifies a set of strings that matches it; the\nfunctions in this module let you check if a particular string matches a given\nregular expression (or if a given regular expression matches a particular\nstring, which comes down to the same thing).
Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string p matches A and another string q matches B, the string pq will match AB. This holds unless A or B contain low precedence operations, boundary conditions between A and B, or numbered group references. Thus, complex expressions can easily be constructed from simpler primitive expressions like the ones described here. For details of the theory and implementation of regular expressions, consult the Friedl book referenced above, or almost any textbook about compiler construction.
\nA brief explanation of the format of regular expressions follows. For further\ninformation and a gentler presentation, consult the Regular Expression HOWTO.
\nRegular expressions can contain both special and ordinary characters. Most\nordinary characters, like 'A', 'a', or '0', are the simplest regular\nexpressions; they simply match themselves. You can concatenate ordinary\ncharacters, so last matches the string 'last'. (In the rest of this\nsection, we’ll write RE’s in this special style, usually without quotes, and\nstrings to be matched 'in single quotes'.)
\nSome characters, like '|' or '(', are special. Special\ncharacters either stand for classes of ordinary characters, or affect\nhow the regular expressions around them are interpreted. Regular\nexpression pattern strings may not contain null bytes, but can specify\nthe null byte using the \\number notation, e.g., '\\x00'.
\nThe special characters are:
\nEither escapes special characters (permitting you to match characters like\n'*', '?', and so forth), or signals a special sequence; special\nsequences are discussed below.
\nIf you’re not using a raw string to express the pattern, remember that Python\nalso uses the backslash as an escape sequence in string literals; if the escape\nsequence isn’t recognized by Python’s parser, the backslash and subsequent\ncharacter are included in the resulting string. However, if Python would\nrecognize the resulting sequence, the backslash should be repeated twice. This\nis complicated and hard to understand, so it’s highly recommended that you use\nraw strings for all but the simplest expressions.
\nUsed to indicate a set of characters. In a set:
\n(One or more letters from the set 'i', 'L', 'm', 's',\n'u', 'x'.) The group matches the empty string; the letters\nset the corresponding flags: re.I (ignore case),\nre.L (locale dependent), re.M (multi-line),\nre.S (dot matches all), re.U (Unicode dependent),\nand re.X (verbose), for the entire regular expression. (The\nflags are described in Module Contents.) This\nis useful if you wish to include the flags as part of the regular\nexpression, instead of passing a flag argument to the\nre.compile() function.
\nNote that the (?x) flag changes how the expression is parsed. It should be\nused first in the expression string, or after one or more whitespace characters.\nIf there are non-whitespace characters before the flag, the results are\nundefined.
\nSimilar to regular parentheses, but the substring matched by the group is\naccessible within the rest of the regular expression via the symbolic group\nname name. Group names must be valid Python identifiers, and each group\nname must be defined only once within a regular expression. A symbolic group\nis also a numbered group, just as if the group were not named. So the group\nnamed id in the example below can also be referenced as the numbered group\n1.
\nFor example, if the pattern is (?P<id>[a-zA-Z_]\\w*), the group can be\nreferenced by its name in arguments to methods of match objects, such as\nm.group('id') or m.end('id'), and also by name in the regular\nexpression itself (using (?P=id)) and replacement text given to\n.sub() (using \\g<id>).
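A short demonstration of the (?P=name) back-reference form:

```python
import re

# (?P=word) matches whatever the group named 'word' matched earlier,
# so this pattern finds a repeated word.
m = re.search(r'(?P<word>\w+) (?P=word)', 'say hello hello world')
assert m.group('word') == 'hello'
assert m.group(0) == 'hello hello'
```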
\nMatches if the current position in the string is preceded by a match for ...\nthat ends at the current position. This is called a positive lookbehind\nassertion. (?<=abc)def will find a match in abcdef, since the\nlookbehind will back up 3 characters and check if the contained pattern matches.\nThe contained pattern must only match strings of some fixed length, meaning that\nabc or a|b are allowed, but a* and a{3,4} are not. Note that\npatterns which start with positive lookbehind assertions will never match at the\nbeginning of the string being searched; you will most likely want to use the\nsearch() function rather than the match() function:
\n>>> import re\n>>> m = re.search('(?<=abc)def', 'abcdef')\n>>> m.group(0)\n'def'\n
This example looks for a word following a hyphen:
\n>>> m = re.search('(?<=-)\\w+', 'spam-egg')\n>>> m.group(0)\n'egg'\n
Will try to match with yes-pattern if the group with given id or name\nexists, and with no-pattern if it doesn’t. no-pattern is optional and\ncan be omitted. For example, (<)?(\\w+@\\w+(?:\\.\\w+)+)(?(1)>) is a poor email\nmatching pattern, which will match with '<user@host.com>' as well as\n'user@host.com', but not with '<user@host.com'.
\n\nNew in version 2.4.
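The email pattern above, exercised against the three cases mentioned:

```python
import re

# (?(1)>) matches '>' only if group 1 (the '<') participated.
pat = r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)'
assert re.match(pat, '<user@host.com>') is not None
assert re.match(pat, 'user@host.com') is not None
assert re.match(pat, '<user@host.com') is None  # unbalanced '<'
```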
\nThe special sequences consist of '\\' and a character from the list below.\nIf the ordinary character is not on the list, then the resulting RE will match\nthe second character. For example, \\$ matches the character '$'.
\nMost of the standard escapes supported by Python string literals are also\naccepted by the regular expression parser:
\n\\a \\b \\f \\n\n\\r \\t \\v \\x\n\\\\
\nOctal escapes are included in a limited form: If the first digit is a 0, or if\nthere are three octal digits, it is considered an octal escape. Otherwise, it is\na group reference. As for string literals, octal escapes are always at most\nthree digits in length.
\nPython offers two different primitive operations based on regular expressions:\nmatch checks for a match only at the beginning of the string, while\nsearch checks for a match anywhere in the string (this is what Perl does\nby default).
\nNote that match may differ from search even when using a regular expression\nbeginning with '^': '^' matches only at the start of the string, or in\nMULTILINE mode also immediately following a newline. The “match”\noperation succeeds only if the pattern matches at the start of the string\nregardless of mode, or at the starting position given by the optional pos\nargument regardless of whether a newline precedes it.
\n>>> re.match("c", "abcdef") # No match\n>>> re.search("c", "abcdef") # Match\n<_sre.SRE_Match object at ...>\n
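The MULTILINE nuance described above can be made concrete: search honors '^' after a newline, while match still only succeeds at the starting position regardless of mode:

```python
import re

assert re.search('^b', 'a\nb') is None
assert re.search('^b', 'a\nb', re.MULTILINE) is not None
# match never scans forward, even in MULTILINE mode.
assert re.match('^b', 'a\nb', re.MULTILINE) is None
```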
The module defines several functions, constants, and an exception. Some of the\nfunctions are simplified versions of the full featured methods for compiled\nregular expressions. Most non-trivial applications always use the compiled\nform.
\nCompile a regular expression pattern into a regular expression object, which\ncan be used for matching using its match() and search() methods,\ndescribed below.
\nThe expression’s behaviour can be modified by specifying a flags value.\nValues can be any of the following variables, combined using bitwise OR (the\n| operator).
\nThe sequence
\nprog = re.compile(pattern)\nresult = prog.match(string)\n
is equivalent to
\nresult = re.match(pattern, string)\n
but using re.compile() and saving the resulting regular expression\nobject for reuse is more efficient when the expression will be used several\ntimes in a single program.
\nNote
\nThe compiled versions of the most recent patterns passed to\nre.match(), re.search() or re.compile() are cached, so\nprograms that use only a few regular expressions at a time needn’t worry\nabout compiling regular expressions.
\nMake \\w, \\W, \\b, \\B, \\d, \\D, \\s and \\S dependent\non the Unicode character properties database.
\n\nNew in version 2.0.
This flag allows you to write regular expressions that look nicer. Whitespace within the pattern is ignored, except when in a character class or preceded by an unescaped backslash, and, when a line contains a '#' that is neither in a character class nor preceded by an unescaped backslash, all characters from the leftmost such '#' through the end of the line are ignored.
\nThat means that the two following regular expression objects that match a\ndecimal number are functionally equal:
\na = re.compile(r"""\\d + # the integral part\n \\. # the decimal point\n \\d * # some fractional digits""", re.X)\nb = re.compile(r"\\d+\\.\\d*")\n
If zero or more characters at the beginning of string match the regular\nexpression pattern, return a corresponding MatchObject instance.\nReturn None if the string does not match the pattern; note that this is\ndifferent from a zero-length match.
\nNote
\nIf you want to locate a match anywhere in string, use search()\ninstead.
\nSplit string by the occurrences of pattern. If capturing parentheses are\nused in pattern, then the text of all groups in the pattern are also returned\nas part of the resulting list. If maxsplit is nonzero, at most maxsplit\nsplits occur, and the remainder of the string is returned as the final element\nof the list. (Incompatibility note: in the original Python 1.5 release,\nmaxsplit was ignored. This has been fixed in later releases.)
\n>>> re.split('\\W+', 'Words, words, words.')\n['Words', 'words', 'words', '']\n>>> re.split('(\\W+)', 'Words, words, words.')\n['Words', ', ', 'words', ', ', 'words', '.', '']\n>>> re.split('\\W+', 'Words, words, words.', 1)\n['Words', 'words, words.']\n>>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)\n['0', '3', '9']\n
If there are capturing groups in the separator and it matches at the start of\nthe string, the result will start with an empty string. The same holds for\nthe end of the string:
\n>>> re.split('(\\W+)', '...words, words...')\n['', '...', 'words', ', ', 'words', '...', '']\n
That way, separator components are always found at the same relative\nindices within the result list (e.g., if there’s one capturing group\nin the separator, the 0th, the 2nd and so forth).
\nNote that split will never split a string on an empty pattern match.\nFor example:
\n>>> re.split('x*', 'foo')\n['foo']\n>>> re.split("(?m)^$", "foo\\n\\nbar\\n")\n['foo\\n\\nbar\\n']\n
\nChanged in version 2.7: Added the optional flags argument.
\nReturn all non-overlapping matches of pattern in string, as a list of\nstrings. The string is scanned left-to-right, and matches are returned in\nthe order found. If one or more groups are present in the pattern, return a\nlist of groups; this will be a list of tuples if the pattern has more than\none group. Empty matches are included in the result unless they touch the\nbeginning of another match.
\n\nNew in version 1.5.2.
\n\nChanged in version 2.4: Added the optional flags argument.
\nReturn an iterator yielding MatchObject instances over all\nnon-overlapping matches for the RE pattern in string. The string is\nscanned left-to-right, and matches are returned in the order found. Empty\nmatches are included in the result unless they touch the beginning of another\nmatch.
\n\nNew in version 2.2.
\n\nChanged in version 2.4: Added the optional flags argument.
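How findall's return shape depends on the number of groups, and how finditer differs:

```python
import re

# No groups: a list of matched strings.
assert re.findall(r'\d+', 'a1b22c333') == ['1', '22', '333']

# More than one group: a list of tuples.
assert re.findall(r'(\w)(\d)', 'a1 b2') == [('a', '1'), ('b', '2')]

# finditer yields match objects instead of strings.
assert [m.group(0) for m in re.finditer(r'\d+', 'a1b22')] == ['1', '22']
```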
\nReturn the string obtained by replacing the leftmost non-overlapping occurrences\nof pattern in string by the replacement repl. If the pattern isn’t found,\nstring is returned unchanged. repl can be a string or a function; if it is\na string, any backslash escapes in it are processed. That is, \\n is\nconverted to a single newline character, \\r is converted to a carriage return, and\nso forth. Unknown escapes such as \\j are left alone. Backreferences, such\nas \\6, are replaced with the substring matched by group 6 in the pattern.\nFor example:
\n>>> re.sub(r'def\\s+([a-zA-Z_][a-zA-Z_0-9]*)\\s*\\(\\s*\\):',\n... r'static PyObject*\\npy_\\1(void)\\n{',\n... 'def myfunc():')\n'static PyObject*\\npy_myfunc(void)\\n{'\n
If repl is a function, it is called for every non-overlapping occurrence of\npattern. The function takes a single match object argument, and returns the\nreplacement string. For example:
\n>>> def dashrepl(matchobj):\n... if matchobj.group(0) == '-': return ' '\n... else: return '-'\n>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')\n'pro--gram files'\n>>> re.sub(r'\\sAND\\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)\n'Baked Beans & Spam'\n
The pattern may be a string or an RE object.
\nThe optional argument count is the maximum number of pattern occurrences to be\nreplaced; count must be a non-negative integer. If omitted or zero, all\noccurrences will be replaced. Empty matches for the pattern are replaced only\nwhen not adjacent to a previous match, so sub('x*', '-', 'abc') returns\n'-a-b-c-'.
\nIn addition to character escapes and backreferences as described above,\n\\g<name> will use the substring matched by the group named name, as\ndefined by the (?P<name>...) syntax. \\g<number> uses the corresponding\ngroup number; \\g<2> is therefore equivalent to \\2, but isn’t ambiguous\nin a replacement such as \\g<2>0. \\20 would be interpreted as a\nreference to group 20, not a reference to group 2 followed by the literal\ncharacter '0'. The backreference \\g<0> substitutes in the entire\nsubstring matched by the RE.
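The \g<name> and \g<number> replacement forms in action:

```python
import re

# Swap two named groups in the replacement text.
assert re.sub(r'(?P<last>\w+), (?P<first>\w+)',
              r'\g<first> \g<last>',
              'Reynolds, Malcolm') == 'Malcolm Reynolds'

# \g<2>0 is unambiguous where \20 would mean "group 20".
assert re.sub(r'(\d)(\d)', r'\g<2>0\g<1>', '12') == '201'
```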
\n\nChanged in version 2.7: Added the optional flags argument.
\nPerform the same operation as sub(), but return a tuple (new_string,\nnumber_of_subs_made).
\n\nChanged in version 2.7: Added the optional flags argument.
\nThe RegexObject class supports the following methods and attributes:
\nScan through string looking for a location where this regular expression\nproduces a match, and return a corresponding MatchObject instance.\nReturn None if no position in the string matches the pattern; note that this\nis different from finding a zero-length match at some point in the string.
\nThe optional second parameter pos gives an index in the string where the\nsearch is to start; it defaults to 0. This is not completely equivalent to\nslicing the string; the '^' pattern character matches at the real beginning\nof the string and at positions just after a newline, but not necessarily at the\nindex where the search is to start.
\nThe optional parameter endpos limits how far the string will be searched; it\nwill be as if the string is endpos characters long, so only the characters\nfrom pos to endpos - 1 will be searched for a match. If endpos is less\nthan pos, no match will be found, otherwise, if rx is a compiled regular\nexpression object, rx.search(string, 0, 50) is equivalent to\nrx.search(string[:50], 0).
\n>>> pattern = re.compile("d")\n>>> pattern.search("dog") # Match at index 0\n<_sre.SRE_Match object at ...>\n>>> pattern.search("dog", 1) # No match; search doesn't include the "d"\n
If zero or more characters at the beginning of string match this regular\nexpression, return a corresponding MatchObject instance. Return\nNone if the string does not match the pattern; note that this is different\nfrom a zero-length match.
\nThe optional pos and endpos parameters have the same meaning as for the\nsearch() method.
\nNote
\nIf you want to locate a match anywhere in string, use\nsearch() instead.
\n>>> pattern = re.compile("o")\n>>> pattern.match("dog") # No match as "o" is not at the start of "dog".\n>>> pattern.match("dog", 1) # Match as "o" is the 2nd character of "dog".\n<_sre.SRE_Match object at ...>\n
Match Objects always have a boolean value of True, so that you can test\nwhether e.g. match() resulted in a match with a simple if statement. They\nsupport the following methods and attributes:
\nReturns one or more subgroups of the match. If there is a single argument, the\nresult is a single string; if there are multiple arguments, the result is a\ntuple with one item per argument. Without arguments, group1 defaults to zero\n(the whole match is returned). If a groupN argument is zero, the corresponding\nreturn value is the entire matching string; if it is in the inclusive range\n[1..99], it is the string matching the corresponding parenthesized group. If a\ngroup number is negative or larger than the number of groups defined in the\npattern, an IndexError exception is raised. If a group is contained in a\npart of the pattern that did not match, the corresponding result is None.\nIf a group is contained in a part of the pattern that matched multiple times,\nthe last match is returned.
\n>>> m = re.match(r"(\\w+) (\\w+)", "Isaac Newton, physicist")\n>>> m.group(0) # The entire match\n'Isaac Newton'\n>>> m.group(1) # The first parenthesized subgroup.\n'Isaac'\n>>> m.group(2) # The second parenthesized subgroup.\n'Newton'\n>>> m.group(1, 2) # Multiple arguments give us a tuple.\n('Isaac', 'Newton')\n
If the regular expression uses the (?P<name>...) syntax, the groupN\narguments may also be strings identifying groups by their group name. If a\nstring argument is not used as a group name in the pattern, an IndexError\nexception is raised.
\nA moderately complicated example:
\n>>> m = re.match(r"(?P<first_name>\\w+) (?P<last_name>\\w+)", "Malcolm Reynolds")\n>>> m.group('first_name')\n'Malcolm'\n>>> m.group('last_name')\n'Reynolds'\n
Named groups can also be referred to by their index:
\n>>> m.group(1)\n'Malcolm'\n>>> m.group(2)\n'Reynolds'\n
If a group matches multiple times, only the last match is accessible:
\n>>> m = re.match(r"(..)+", "a1b2c3") # Matches 3 times.\n>>> m.group(1) # Returns only the last match.\n'c3'\n
Return a tuple containing all the subgroups of the match, from 1 up to however\nmany groups are in the pattern. The default argument is used for groups that\ndid not participate in the match; it defaults to None. (Incompatibility\nnote: in the original Python 1.5 release, if the tuple was one element long, a\nstring would be returned instead. In later versions (from 1.5.1 on), a\nsingleton tuple is returned in such cases.)
\nFor example:
\n>>> m = re.match(r"(\\d+)\\.(\\d+)", "24.1632")\n>>> m.groups()\n('24', '1632')\n
If we make the decimal place and everything after it optional, not all groups\nmight participate in the match. These groups will default to None unless\nthe default argument is given:
\n>>> m = re.match(r"(\\d+)\\.?(\\d+)?", "24")\n>>> m.groups() # Second group defaults to None.\n('24', None)\n>>> m.groups('0') # Now, the second group defaults to '0'.\n('24', '0')\n
Return a dictionary containing all the named subgroups of the match, keyed by\nthe subgroup name. The default argument is used for groups that did not\nparticipate in the match; it defaults to None. For example:
\n>>> m = re.match(r"(?P<first_name>\\w+) (?P<last_name>\\w+)", "Malcolm Reynolds")\n>>> m.groupdict()\n{'first_name': 'Malcolm', 'last_name': 'Reynolds'}\n
Return the indices of the start and end of the substring matched by group;\ngroup defaults to zero (meaning the whole matched substring). Return -1 if\ngroup exists but did not contribute to the match. For a match object m, and\na group g that did contribute to the match, the substring matched by group g\n(equivalent to m.group(g)) is
\nm.string[m.start(g):m.end(g)]\n
Note that m.start(group) will equal m.end(group) if group matched a\nnull string. For example, after m = re.search('b(c?)', 'cba'),\nm.start(0) is 1, m.end(0) is 2, m.start(1) and m.end(1) are both\n2, and m.start(2) raises an IndexError exception.
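The note above can be verified step by step:

```python
import re

m = re.search('b(c?)', 'cba')
assert m.start(0) == 1 and m.end(0) == 2    # 'b' occupies index 1
assert m.start(1) == 2 and m.end(1) == 2    # group 1 matched a null string
try:
    m.start(2)      # the pattern defines only one group
    raised = False
except IndexError:
    raised = True
assert raised
```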
\nAn example that will remove remove_this from email addresses:
\n>>> email = "tony@tiremove_thisger.net"\n>>> m = re.search("remove_this", email)\n>>> email[:m.start()] + email[m.end():]\n'tony@tiger.net'\n
In this example, we’ll use the following helper function to display match\nobjects a little more gracefully:
def displaymatch(match):
    if match is None:
        return None
    return '<Match: %r, groups=%r>' % (match.group(), match.groups())
Suppose you are writing a poker program where a player’s hand is represented as\na 5-character string with each character representing a card, “a” for ace, “k”\nfor king, “q” for queen, “j” for jack, “t” for 10, and “2” through “9”\nrepresenting the card with that value.
\nTo see if a given string is a valid hand, one could do the following:
\n>>> valid = re.compile(r"^[a2-9tjqk]{5}$")\n>>> displaymatch(valid.match("akt5q")) # Valid.\n"<Match: 'akt5q', groups=()>"\n>>> displaymatch(valid.match("akt5e")) # Invalid.\n>>> displaymatch(valid.match("akt")) # Invalid.\n>>> displaymatch(valid.match("727ak")) # Valid.\n"<Match: '727ak', groups=()>"\n
That last hand, "727ak", contained a pair, or two of the same valued cards.\nTo match this with a regular expression, one could use backreferences as such:
\n>>> pair = re.compile(r".*(.).*\\1")\n>>> displaymatch(pair.match("717ak")) # Pair of 7s.\n"<Match: '717', groups=('7',)>"\n>>> displaymatch(pair.match("718ak")) # No pairs.\n>>> displaymatch(pair.match("354aa")) # Pair of aces.\n"<Match: '354aa', groups=('a',)>"\n
To find out what card the pair consists of, one could use the\ngroup() method of MatchObject in the following\nmanner:
\n>>> pair.match("717ak").group(1)\n'7'\n\n# Error because re.match() returns None, which doesn't have a group() method:\n>>> pair.match("718ak").group(1)\nTraceback (most recent call last):\n File "<pyshell#23>", line 1, in <module>\n re.match(r".*(.).*\\1", "718ak").group(1)\nAttributeError: 'NoneType' object has no attribute 'group'\n\n>>> pair.match("354aa").group(1)\n'a'\n
Python does not currently have an equivalent to scanf(). Regular\nexpressions are generally more powerful, though also more verbose, than\nscanf() format strings. The table below offers some more-or-less\nequivalent mappings between scanf() format tokens and regular\nexpressions.
scanf() Token | Regular Expression
---|---
%c | .
%5c | .{5}
%d | [-+]?\d+
%e, %E, %f, %g | [-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?
%i | [-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)
%o | 0[0-7]*
%s | \S+
%u | \d+
%x, %X | 0[xX][\dA-Fa-f]+
To extract the filename and numbers from a string like
\n/usr/sbin/sendmail - 0 errors, 4 warnings
\nyou would use a scanf() format like
\n%s - %d errors, %d warnings
\nThe equivalent regular expression would be
\n(\\S+) - (\\d+) errors, (\\d+) warnings
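As a sketch, the regular expression above extracts the fields directly; note that the groups come back as strings and need explicit conversion:

```python
import re

line = "/usr/sbin/sendmail - 0 errors, 4 warnings"
m = re.match(r"(\S+) - (\d+) errors, (\d+) warnings", line)
filename, errors, warnings = m.group(1), int(m.group(2)), int(m.group(3))
assert (filename, errors, warnings) == ("/usr/sbin/sendmail", 0, 4)
```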
\nIf you create regular expressions that require the engine to perform a lot of\nrecursion, you may encounter a RuntimeError exception with the message\nmaximum recursion limit exceeded. For example,
\n>>> s = 'Begin ' + 1000*'a very long string ' + 'end'\n>>> re.match('Begin (\\w| )*? end', s).end()\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\n File "/usr/local/lib/python2.5/re.py", line 132, in match\n return _compile(pattern, flags).match(string)\nRuntimeError: maximum recursion limit exceeded\n
You can often restructure your regular expression to avoid recursion.
\nStarting with Python 2.3, simple uses of the *? pattern are special-cased to\navoid recursion. Thus, the above regular expression can avoid recursion by\nbeing recast as Begin [a-zA-Z0-9_ ]*?end. As a further benefit, such\nregular expressions will run faster than their recursive equivalents.
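The recast pattern completes without difficulty on the same input that previously exhausted the recursion limit:

```python
import re

s = 'Begin ' + 1000 * 'a very long string ' + 'end'

# The character class replaces the recursion-prone (\w| )*? alternation.
m = re.match('Begin [a-zA-Z0-9_ ]*?end', s)
assert m is not None and m.end() == len(s)
```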
\nIn a nutshell, match() only attempts to match a pattern at the beginning\nof a string where search() will match a pattern anywhere in a string.\nFor example:
\n>>> re.match("o", "dog") # No match as "o" is not the first letter of "dog".\n>>> re.search("o", "dog") # Match as search() looks everywhere in the string.\n<_sre.SRE_Match object at ...>\n
Note
\nThe following applies only to regular expression objects like those created\nwith re.compile("pattern"), not the primitives re.match(pattern,\nstring) or re.search(pattern, string).
\nmatch() has an optional second parameter that gives an index in the string\nwhere the search is to start:
\n>>> pattern = re.compile("o")\n>>> pattern.match("dog") # No match as "o" is not at the start of "dog."\n\n# Equivalent to the above expression as 0 is the default starting index:\n>>> pattern.match("dog", 0)\n\n# Match as "o" is the 2nd character of "dog" (index 0 is the first):\n>>> pattern.match("dog", 1)\n<_sre.SRE_Match object at ...>\n>>> pattern.match("dog", 2) # No match as "o" is not the 3rd character of "dog."\n
split() splits a string into a list delimited by the passed pattern. The\nmethod is invaluable for converting textual data into data structures that can be\neasily read and modified by Python as demonstrated in the following example that\ncreates a phonebook.
\nFirst, here is the input. Normally it may come from a file, here we are using\ntriple-quoted string syntax:
\n>>> input = """Ross McFluff: 834.345.1254 155 Elm Street\n...\n... Ronald Heathmore: 892.345.3428 436 Finley Avenue\n... Frank Burger: 925.541.7625 662 South Dogwood Way\n...\n...\n... Heather Albrecht: 548.326.4584 919 Park Place"""\n
The entries are separated by one or more newlines. Now we convert the string\ninto a list with each nonempty line having its own entry:
\n>>> entries = re.split("\\n+", input)\n>>> entries\n['Ross McFluff: 834.345.1254 155 Elm Street',\n'Ronald Heathmore: 892.345.3428 436 Finley Avenue',\n'Frank Burger: 925.541.7625 662 South Dogwood Way',\n'Heather Albrecht: 548.326.4584 919 Park Place']\n
Finally, split each entry into a list with first name, last name, telephone\nnumber, and address. We use the maxsplit parameter of split()\nbecause the address has spaces, our splitting pattern, in it:
\n>>> [re.split(":? ", entry, 3) for entry in entries]\n[['Ross', 'McFluff', '834.345.1254', '155 Elm Street'],\n['Ronald', 'Heathmore', '892.345.3428', '436 Finley Avenue'],\n['Frank', 'Burger', '925.541.7625', '662 South Dogwood Way'],\n['Heather', 'Albrecht', '548.326.4584', '919 Park Place']]\n
The :? pattern matches the colon after the last name, so that it does not\noccur in the result list. With a maxsplit of 4, we could separate the\nhouse number from the street name:
\n>>> [re.split(":? ", entry, 4) for entry in entries]\n[['Ross', 'McFluff', '834.345.1254', '155', 'Elm Street'],\n['Ronald', 'Heathmore', '892.345.3428', '436', 'Finley Avenue'],\n['Frank', 'Burger', '925.541.7625', '662', 'South Dogwood Way'],\n['Heather', 'Albrecht', '548.326.4584', '919', 'Park Place']]\n
sub() replaces every occurrence of a pattern with a string or the\nresult of a function. This example demonstrates using sub() with\na function to “munge” text, or randomize the order of all the characters\nin each word of a sentence except for the first and last characters:
>>> import random
>>> def repl(m):
...     inner_word = list(m.group(2))
...     random.shuffle(inner_word)
...     return m.group(1) + "".join(inner_word) + m.group(3)
>>> text = "Professor Abdolmalek, please report your absences promptly."
>>> re.sub(r"(\w)(\w+)(\w)", repl, text)
'Poefsrosr Aealmlobdk, pslaee reorpt your abnseces plmrptoy.'
>>> re.sub(r"(\w)(\w+)(\w)", repl, text)
'Pofsroser Aodlambelk, plasee reoprt yuor asnebces potlmrpy.'
findall() matches all occurrences of a pattern, not just the first\none as search() does. For example, if one was a writer and wanted to\nfind all of the adverbs in some text, he or she might use findall() in\nthe following manner:
\n>>> text = "He was carefully disguised but captured quickly by police."\n>>> re.findall(r"\\w+ly", text)\n['carefully', 'quickly']\n
If one wants more information about all matches of a pattern than the matched\ntext, finditer() is useful as it provides instances of\nMatchObject instead of strings. Continuing with the previous example,\nif one was a writer who wanted to find all of the adverbs and their positions\nin some text, he or she would use finditer() in the following manner:
>>> text = "He was carefully disguised but captured quickly by police."
>>> for m in re.finditer(r"\w+ly", text):
...     print '%02d-%02d: %s' % (m.start(), m.end(), m.group(0))
07-16: carefully
40-47: quickly
Raw string notation (r"text") keeps regular expressions sane. Without it,\nevery backslash ('\\') in a regular expression would have to be prefixed with\nanother one to escape it. For example, the two following lines of code are\nfunctionally identical:
\n>>> re.match(r"\\W(.)\\1\\W", " ff ")\n<_sre.SRE_Match object at ...>\n>>> re.match("\\\\W(.)\\\\1\\\\W", " ff ")\n<_sre.SRE_Match object at ...>\n
When one wants to match a literal backslash, it must be escaped in the regular\nexpression. With raw string notation, this means r"\\\\". Without raw string\nnotation, one must use "\\\\\\\\", making the following lines of code\nfunctionally identical:
\n>>> re.match(r"\\\\", r"\\\\")\n<_sre.SRE_Match object at ...>\n>>> re.match("\\\\\\\\", r"\\\\")\n<_sre.SRE_Match object at ...>\n
This module implements a file-like class, StringIO, that reads and\nwrites a string buffer (also known as memory files). See the description of\nfile objects for operations (section File Objects). (For\nstandard strings, see str and unicode.)
\nWhen a StringIO object is created, it can be initialized to an existing\nstring by passing the string to the constructor. If no string is given, the\nStringIO will start empty. In both cases, the initial file position\nstarts at zero.
\nThe StringIO object can accept either Unicode or 8-bit strings, but\nmixing the two may take some care. If both are used, 8-bit strings that cannot\nbe interpreted as 7-bit ASCII (that use the 8th bit) will cause a\nUnicodeError to be raised when getvalue() is called.
\nThe following methods of StringIO objects require special mention:
\nExample usage:
\nimport StringIO\n\noutput = StringIO.StringIO()\noutput.write('First line.\\n')\nprint >>output, 'Second line.'\n\n# Retrieve file contents -- this will be\n# 'First line.\\nSecond line.\\n'\ncontents = output.getvalue()\n\n# Close object and discard memory buffer --\n# .getvalue() will now raise an exception.\noutput.close()\n
The module cStringIO provides an interface similar to that of the\nStringIO module. Heavy use of StringIO.StringIO objects can be\nmade more efficient by using the function StringIO() from this module\ninstead.
\nReturn a StringIO-like stream for reading or writing.
\nSince this is a factory function which returns objects of built-in types,\nthere’s no way to build your own version using subclassing. It’s not\npossible to set attributes on it. Use the original StringIO module in\nthose cases.
\nUnlike the StringIO module, this module is not able to accept Unicode\nstrings that cannot be encoded as plain ASCII strings.
\nAnother difference from the StringIO module is that calling\nStringIO() with a string parameter creates a read-only object. Unlike an\nobject created without a string parameter, it does not have write methods.\nThese objects are not generally visible. They turn up in tracebacks as\nStringI and StringO.
\nThe following data objects are provided as well:
\nThere is a C API to the module as well; refer to the module source for more\ninformation.
\nExample usage:
\nimport cStringIO\n\noutput = cStringIO.StringIO()\noutput.write('First line.\\n')\nprint >>output, 'Second line.'\n\n# Retrieve file contents -- this will be\n# 'First line.\\nSecond line.\\n'\ncontents = output.getvalue()\n\n# Close object and discard memory buffer --\n# .getvalue() will now raise an exception.\noutput.close()\n
The following sections describe the standard types that are built into the\ninterpreter.
\nNote
\nHistorically (until release 2.2), Python’s built-in types have differed from\nuser-defined types because it was not possible to use the built-in types as the\nbasis for object-oriented inheritance. This limitation no longer\nexists.
\nThe principal built-in types are numerics, sequences, mappings, files, classes,\ninstances and exceptions.
\nSome operations are supported by several object types; in particular,\npractically all objects can be compared, tested for truth value, and converted\nto a string (with the repr() function or the slightly different\nstr() function). The latter function is implicitly used when an object is\nwritten by the print() function.
\nAny object can be tested for truth value, for use in an if or\nwhile condition or as operand of the Boolean operations below. The\nfollowing values are considered false:
None
\nFalse
\nzero of any numeric type, for example, 0, 0L, 0.0, 0j.
\nany empty sequence, for example, '', (), [].
\nany empty mapping, for example, {}.
\ninstances of user-defined classes, if the class defines a __nonzero__()\nor __len__() method, when that method returns the integer zero or\nbool value False. [1]
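For instance, a user-defined class becomes false as soon as its __len__() returns zero (a minimal sketch; the class Box is illustrative, and in Python 3 the __nonzero__() hook mentioned above is spelled __bool__()):

```python
class Box(object):
    """A tiny container whose truth value follows its length."""
    def __init__(self):
        self.items = []
    def __len__(self):
        return len(self.items)

box = Box()
assert not box          # empty: __len__() == 0, so the instance is false
box.items.append("x")
assert box              # non-empty: now the instance is true
```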
\nAll other values are considered true — so objects of many types are always\ntrue.
\nOperations and built-in functions that have a Boolean result always return 0\nor False for false and 1 or True for true, unless otherwise stated.\n(Important exception: the Boolean operations or and and always return\none of their operands.)
\nThese are the Boolean operations, ordered by ascending priority:
Operation | Result | Notes
---|---|---
x or y | if x is false, then y, else x | (1)
x and y | if x is false, then x, else y | (2)
not x | if x is false, then True, else False | (3)
Notes:

1. This is a short-circuit operator, so it only evaluates the second argument if the first one is false.

2. This is a short-circuit operator, so it only evaluates the second argument if the first one is true.

3. not has a lower priority than non-Boolean operators, so not a == b is interpreted as not (a == b), and a == not b is a syntax error.
\nComparison operations are supported by all objects. They all have the same\npriority (which is higher than that of the Boolean operations). Comparisons can\nbe chained arbitrarily; for example, x < y <= z is equivalent to x < y and\ny <= z, except that y is evaluated only once (but in both cases z is not\nevaluated at all when x < y is found to be false).
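The single-evaluation guarantee is easy to observe with a function that records its calls (the helper middle() is purely illustrative):

```python
calls = []

def middle():
    # Record each evaluation so we can count them.
    calls.append(1)
    return 5

# Chained form: middle() is evaluated exactly once.
assert 1 < middle() <= 9
assert len(calls) == 1

# Short-circuiting: once 10 < middle() is false, the <= 9 test never runs.
calls[:] = []
assert not (10 < middle() <= 9)
assert len(calls) == 1
```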
\nThis table summarizes the comparison operations:
Operation | Meaning | Notes
---|---|---
< | strictly less than |
<= | less than or equal |
> | strictly greater than |
>= | greater than or equal |
== | equal |
!= | not equal | (1)
is | object identity |
is not | negated object identity |
Notes:
\nObjects of different types, except different numeric types and different string\ntypes, never compare equal; such objects are ordered consistently but\narbitrarily (so that sorting a heterogeneous array yields a consistent result).\nFurthermore, some types (for example, file objects) support only a degenerate\nnotion of comparison where any two objects of that type are unequal. Again,\nsuch objects are ordered arbitrarily but consistently. The <, <=, >\nand >= operators will raise a TypeError exception when any operand is\na complex number.
Instances of a class normally compare as non-equal unless the class defines the __cmp__() method. Refer to Basic customization for information on the use of this method to effect object comparisons.
\nCPython implementation detail: Objects of different types except numbers are ordered by their type names;\nobjects of the same types that don’t support proper comparison are ordered by\ntheir address.
\nTwo more operations with the same syntactic priority, in and not in, are\nsupported only by sequence types (below).
\nThere are four distinct numeric types: plain integers, long\nintegers, floating point numbers, and complex numbers. In\naddition, Booleans are a subtype of plain integers. Plain integers (also just\ncalled integers) are implemented using long in C, which gives\nthem at least 32 bits of precision (sys.maxint is always set to the maximum\nplain integer value for the current platform, the minimum value is\n-sys.maxint - 1). Long integers have unlimited precision. Floating point\nnumbers are usually implemented using double in C; information about\nthe precision and internal representation of floating point numbers for the\nmachine on which your program is running is available in\nsys.float_info. Complex numbers have a real and imaginary part, which\nare each a floating point number. To extract these parts from a complex number\nz, use z.real and z.imag. (The standard library includes additional\nnumeric types, fractions that hold rationals, and decimal that\nhold floating-point numbers with user-definable precision.)
\nNumbers are created by numeric literals or as the result of built-in functions\nand operators. Unadorned integer literals (including binary, hex, and octal\nnumbers) yield plain integers unless the value they denote is too large to be\nrepresented as a plain integer, in which case they yield a long integer.\nInteger literals with an 'L' or 'l' suffix yield long integers ('L'\nis preferred because 1l looks too much like eleven!). Numeric literals\ncontaining a decimal point or an exponent sign yield floating point numbers.\nAppending 'j' or 'J' to a numeric literal yields a complex number with a\nzero real part. A complex numeric literal is the sum of a real and an imaginary\npart.
\nPython fully supports mixed arithmetic: when a binary arithmetic operator has\noperands of different numeric types, the operand with the “narrower” type is\nwidened to that of the other, where plain integer is narrower than long integer\nis narrower than floating point is narrower than complex. Comparisons between\nnumbers of mixed type use the same rule. [2] The constructors int(),\nlong(), float(), and complex() can be used to produce numbers\nof a specific type.
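The widening rules can be seen by inspecting result types (a sketch; under Python 2 the plain-integer/long distinction follows the same ordering):

```python
# int combined with float widens to float; float with complex widens to complex.
assert type(2 + 0.5) is float
assert type(0.5 + 1j) is complex

# The constructors produce a value of a specific type on demand.
assert float(2) == 2.0 and type(float(2)) is float
assert complex(3) == 3 + 0j
```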
\nAll built-in numeric types support the following operations. See\nThe power operator and later sections for the operators’ priorities.
Operation | Result | Notes
---|---|---
x + y | sum of x and y |
x - y | difference of x and y |
x * y | product of x and y |
x / y | quotient of x and y | (1)
x // y | (floored) quotient of x and y | (4)(5)
x % y | remainder of x / y | (4)
-x | x negated |
+x | x unchanged |
abs(x) | absolute value or magnitude of x | (3)
int(x) | x converted to integer | (2)
long(x) | x converted to long integer | (2)
float(x) | x converted to floating point | (6)
complex(re, im) | a complex number with real part re, imaginary part im. im defaults to zero. |
c.conjugate() | conjugate of the complex number c. (Identity on real numbers) |
divmod(x, y) | the pair (x // y, x % y) | (3)(4)
pow(x, y) | x to the power y | (3)(7)
x ** y | x to the power y | (7)
Notes:
1. For (plain or long) integer division, the result is an integer. The result is always rounded towards minus infinity: 1/2 is 0, (-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0. Note that the result is a long integer if either operand is a long integer, regardless of the numeric value.

2. Conversion from floats using int() or long() truncates toward zero like the related function, math.trunc(). Use the function math.floor() to round downward and math.ceil() to round upward.

3. See Built-in Functions for a full description.

4. Deprecated since version 2.3: The floor division operator, the modulo operator, and the divmod() function are no longer defined for complex numbers. Instead, convert to a floating point number using the abs() function if appropriate.

5. Also referred to as integer division. The resultant value is a whole integer, though the result's type is not necessarily int.

6. float also accepts the strings "nan" and "inf" with an optional prefix "+" or "-" for Not a Number (NaN) and positive or negative infinity. New in version 2.6.

7. Python defines pow(0, 0) and 0 ** 0 to be 1, as is common for programming languages.
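The rounding-toward-minus-infinity rule from the notes above is easiest to see with the // operator (used here so the example behaves identically under Python 2 and 3, where / on integers differs):

```python
# Floored quotients round toward minus infinity, not toward zero.
assert 1 // 2 == 0
assert (-1) // 2 == -1
assert 1 // (-2) == -1
assert (-1) // (-2) == 0

# The identity x == (x // y) * y + x % y always holds.
x, y = -7, 3
assert x == (x // y) * y + x % y   # -7 == (-3)*3 + 2
```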
\nAll numbers.Real types (int, long, and\nfloat) also include the following operations:
Operation | Result | Notes
---|---|---
math.trunc(x) | x truncated to Integral |
round(x[, n]) | x rounded to n digits, rounding half to even. If n is omitted, it defaults to 0. |
math.floor(x) | the greatest integral float <= x |
math.ceil(x) | the least integral float >= x |
Plain and long integer types support additional operations that make sense only\nfor bit-strings. Negative numbers are treated as their 2’s complement value\n(for long integers, this assumes a sufficiently large number of bits that no\noverflow occurs during the operation).
\nThe priorities of the binary bitwise operations are all lower than the numeric\noperations and higher than the comparisons; the unary operation ~ has the\nsame priority as the other unary numeric operations (+ and -).
\nThis table lists the bit-string operations sorted in ascending priority:
Operation | Result | Notes
---|---|---
x | y | bitwise or of x and y |
x ^ y | bitwise exclusive or of x and y |
x & y | bitwise and of x and y |
x << n | x shifted left by n bits | (1)(2)
x >> n | x shifted right by n bits | (1)(3)
~x | the bits of x inverted |
Notes:
\nThe integer types implement the numbers.Integral abstract base\nclass. In addition, they provide one more method:
\nReturn the number of bits necessary to represent an integer in binary,\nexcluding the sign and leading zeros:
\n>>> n = -37\n>>> bin(n)\n'-0b100101'\n>>> n.bit_length()\n6\n
More precisely, if x is nonzero, then x.bit_length() is the\nunique positive integer k such that 2**(k-1) <= abs(x) < 2**k.\nEquivalently, when abs(x) is small enough to have a correctly\nrounded logarithm, then k = 1 + int(log(abs(x), 2)).\nIf x is zero, then x.bit_length() returns 0.
\nEquivalent to:
\ndef bit_length(self):\n s = bin(self) # binary representation: bin(-37) --> '-0b100101'\n s = s.lstrip('-0b') # remove leading zeros and minus sign\n return len(s) # len('100101') --> 6\n
\nNew in version 2.7.
\nThe float type implements the numbers.Real abstract base\nclass. float also has the following additional methods.
\nReturn a pair of integers whose ratio is exactly equal to the\noriginal float and with a positive denominator. Raises\nOverflowError on infinities and a ValueError on\nNaNs.
\n\nNew in version 2.6.
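For example, 0.25 is exactly representable in binary so its ratio is the expected one, while 0.1 exposes the binary value actually stored:

```python
# Exactly representable values give the expected small ratio.
assert (0.25).as_integer_ratio() == (1, 4)

# 0.1 is not exactly representable; the ratio reveals the stored value.
num, den = (0.1).as_integer_ratio()
assert num / float(den) == 0.1
assert (num, den) != (1, 10)
```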
\nReturn True if the float instance is finite with integral\nvalue, and False otherwise:
\n>>> (-2.0).is_integer()\nTrue\n>>> (3.2).is_integer()\nFalse\n
\nNew in version 2.6.
\nTwo methods support conversion to\nand from hexadecimal strings. Since Python’s floats are stored\ninternally as binary numbers, converting a float to or from a\ndecimal string usually involves a small rounding error. In\ncontrast, hexadecimal strings allow exact representation and\nspecification of floating-point numbers. This can be useful when\ndebugging, and in numerical work.
\nReturn a representation of a floating-point number as a hexadecimal\nstring. For finite floating-point numbers, this representation\nwill always include a leading 0x and a trailing p and\nexponent.
\n\nNew in version 2.6.
\nClass method to return the float represented by a hexadecimal\nstring s. The string s may have leading and trailing\nwhitespace.
\n\nNew in version 2.6.
\nNote that float.hex() is an instance method, while\nfloat.fromhex() is a class method.
\nA hexadecimal string takes the form:
\n[sign] ['0x'] integer ['.' fraction] ['p' exponent]
where the optional sign may be either + or -, integer and fraction are strings of hexadecimal digits, and exponent is a decimal integer with an optional leading sign. Case is not significant, and there must be at least one hexadecimal digit in either the integer or the fraction. This syntax is similar to the syntax specified in section 6.4.4.2 of the C99 standard, and also to the syntax used in Java 1.5 onwards. In particular, the output of float.hex() is usable as a hexadecimal floating-point literal in C or Java code, and hexadecimal strings produced by C's %a format character or Java's Double.toHexString are accepted by float.fromhex().
\nNote that the exponent is written in decimal rather than hexadecimal,\nand that it gives the power of 2 by which to multiply the coefficient.\nFor example, the hexadecimal string 0x3.a7p10 represents the\nfloating-point number (3 + 10./16 + 7./16**2) * 2.0**10, or\n3740.0:
\n>>> float.fromhex('0x3.a7p10')\n3740.0\n
Applying the reverse conversion to 3740.0 gives a different\nhexadecimal string representing the same number:
\n>>> float.hex(3740.0)\n'0x1.d380000000000p+11'\n
\nNew in version 2.2.
\nPython supports a concept of iteration over containers. This is implemented\nusing two distinct methods; these are used to allow user-defined classes to\nsupport iteration. Sequences, described below in more detail, always support\nthe iteration methods.
\nOne method needs to be defined for container objects to provide iteration\nsupport:
\nThe iterator objects themselves are required to support the following two\nmethods, which together form the iterator protocol:
\nPython defines several iterator objects to support iteration over general and\nspecific sequence types, dictionaries, and other more specialized forms. The\nspecific types are not important beyond their implementation of the iterator\nprotocol.
\nThe intention of the protocol is that once an iterator’s next() method\nraises StopIteration, it will continue to do so on subsequent calls.\nImplementations that do not obey this property are deemed broken. (This\nconstraint was added in Python 2.3; in Python 2.2, various iterators are broken\naccording to this rule.)
\nPython’s generators provide a convenient way to implement the iterator\nprotocol. If a container object’s __iter__() method is implemented as a\ngenerator, it will automatically return an iterator object (technically, a\ngenerator object) supplying the __iter__() and next() methods. More\ninformation about generators can be found in the documentation for the\nyield expression.
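A minimal container whose __iter__() is written as a generator (a sketch; the class Countdown is illustrative):

```python
class Countdown(object):
    """Iterable that counts down from n to 1."""
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        # Because this method is a generator, calling it returns a
        # generator object that already implements the iterator protocol.
        i = self.n
        while i > 0:
            yield i
            i -= 1

assert list(Countdown(3)) == [3, 2, 1]

# The object can be iterated repeatedly: each call to __iter__()
# produces a fresh generator.
assert [x for x in Countdown(2)] == [2, 1]
```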
\nThere are seven sequence types: strings, Unicode strings, lists, tuples,\nbytearrays, buffers, and xrange objects.
\nFor other containers see the built in dict and set classes,\nand the collections module.
\nString literals are written in single or double quotes: 'xyzzy',\n"frobozz". See String literals for more about string literals.\nUnicode strings are much like strings, but are specified in the syntax\nusing a preceding 'u' character: u'abc', u"def". In addition\nto the functionality described here, there are also string-specific\nmethods described in the String Methods section. Lists are\nconstructed with square brackets, separating items with commas: [a, b, c].\nTuples are constructed by the comma operator (not within square\nbrackets), with or without enclosing parentheses, but an empty tuple\nmust have the enclosing parentheses, such as a, b, c or (). A\nsingle item tuple must have a trailing comma, such as (d,).
\nBytearray objects are created with the built-in function bytearray().
\nBuffer objects are not directly supported by Python syntax, but can be created\nby calling the built-in function buffer(). They don’t support\nconcatenation or repetition.
\nObjects of type xrange are similar to buffers in that there is no specific syntax to\ncreate them, but they are created using the xrange() function. They don’t\nsupport slicing, concatenation or repetition, and using in, not in,\nmin() or max() on them is inefficient.
\nMost sequence types support the following operations. The in and not in\noperations have the same priorities as the comparison operations. The + and\n* operations have the same priority as the corresponding numeric operations.\n[3] Additional methods are provided for Mutable Sequence Types.
\nThis table lists the sequence operations sorted in ascending priority\n(operations in the same box have the same priority). In the table, s and t\nare sequences of the same type; n, i and j are integers:
Operation | Result | Notes
---|---|---
x in s | True if an item of s is equal to x, else False | (1)
x not in s | False if an item of s is equal to x, else True | (1)
s + t | the concatenation of s and t | (6)
s * n, n * s | n shallow copies of s concatenated | (2)
s[i] | i'th item of s, origin 0 | (3)
s[i:j] | slice of s from i to j | (3)(4)
s[i:j:k] | slice of s from i to j with step k | (3)(5)
len(s) | length of s |
min(s) | smallest item of s |
max(s) | largest item of s |
s.index(i) | index of the first occurrence of i in s |
s.count(i) | total number of occurrences of i in s |
Sequence types also support comparisons. In particular, tuples and lists\nare compared lexicographically by comparing corresponding\nelements. This means that to compare equal, every element must compare\nequal and the two sequences must be of the same type and have the same\nlength. (For full details see Comparisons in the language\nreference.)
\nNotes:
\nWhen s is a string or Unicode string object the in and not in\noperations act like a substring test. In Python versions before 2.3, x had to\nbe a string of length 1. In Python 2.3 and beyond, x may be a string of any\nlength.
\nValues of n less than 0 are treated as 0 (which yields an empty\nsequence of the same type as s). Note also that the copies are shallow;\nnested structures are not copied. This often haunts new Python programmers;\nconsider:
\n>>> lists = [[]] * 3\n>>> lists\n[[], [], []]\n>>> lists[0].append(3)\n>>> lists\n[[3], [3], [3]]\n
What has happened is that [[]] is a one-element list containing an empty\nlist, so all three elements of [[]] * 3 are (pointers to) this single empty\nlist. Modifying any of the elements of lists modifies this single list.\nYou can create a list of different lists this way:
\n>>> lists = [[] for i in range(3)]\n>>> lists[0].append(3)\n>>> lists[1].append(5)\n>>> lists[2].append(7)\n>>> lists\n[[3], [5], [7]]\n
If i or j is negative, the index is relative to the end of the string:\nlen(s) + i or len(s) + j is substituted. But note that -0 is still\n0.
\nThe slice of s from i to j is defined as the sequence of items with index\nk such that i <= k < j. If i or j is greater than len(s), use\nlen(s). If i is omitted or None, use 0. If j is omitted or\nNone, use len(s). If i is greater than or equal to j, the slice is\nempty.
\nThe slice of s from i to j with step k is defined as the sequence of\nitems with index x = i + n*k such that 0 <= n < (j-i)/k. In other words,\nthe indices are i, i+k, i+2*k, i+3*k and so on, stopping when\nj is reached (but never including j). If i or j is greater than\nlen(s), use len(s). If i or j are omitted or None, they become\n“end” values (which end depends on the sign of k). Note, k cannot be zero.\nIf k is None, it is treated like 1.
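A few slices over a short string illustrate these rules (the variable names are arbitrary):

```python
s = 'abcdefgh'
every_other = s[::2]    # i, j omitted with k = 2: indices 0, 2, 4, 6
middle = s[2:6]         # items with index 2 <= k < 6
reversed_s = s[::-1]    # a negative step walks from the end
clamped = s[2:100]      # j greater than len(s) is treated as len(s)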
\nCPython implementation detail: If s and t are both strings, some Python implementations such as\nCPython can usually perform an in-place optimization for assignments of\nthe form s = s + t or s += t. When applicable, this optimization\nmakes quadratic run-time much less likely. This optimization is both\nversion and implementation dependent. For performance sensitive code, it\nis preferable to use the str.join() method which assures consistent\nlinear concatenation performance across versions and implementations.
\n\nChanged in version 2.4: Formerly, string concatenation never occurred in-place.
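Because the in-place optimization is not guaranteed, str.join() is the portable way to build a string from many pieces; a small sketch:

```python
parts = ['spam', 'eggs', 'ham']
# str.join concatenates in linear time on every implementation
joined = ', '.join(parts)
# the += form may be optimized by CPython, but that is not guaranteed
accumulated = ''
for p in parts:
    accumulated += p
```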
\nBelow are listed the string methods which both 8-bit strings and\nUnicode objects support. Some of them are also available on bytearray\nobjects.
In addition, Python’s strings support the sequence type methods described in the Sequence Types — str, unicode, list, tuple, bytearray, buffer, xrange section. To output formatted strings use template strings or the % operator described in the String Formatting Operations section.
Return a copy of the string with its first character capitalized and the\nrest lowercased.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn centered in a string of length width. Padding is done using the\nspecified fillchar (default is a space).
\n\nChanged in version 2.4: Support for the fillchar argument.
\nDecodes the string using the codec registered for encoding. encoding\ndefaults to the default string encoding. errors may be given to set a\ndifferent error handling scheme. The default is 'strict', meaning that\nencoding errors raise UnicodeError. Other possible values are\n'ignore', 'replace' and any other name registered via\ncodecs.register_error(), see section Codec Base Classes.
\n\nNew in version 2.2.
\n\nChanged in version 2.3: Support for other error handling schemes added.
\n\nChanged in version 2.7: Support for keyword arguments added.
\nReturn an encoded version of the string. Default encoding is the current\ndefault string encoding. errors may be given to set a different error\nhandling scheme. The default for errors is 'strict', meaning that\nencoding errors raise a UnicodeError. Other possible values are\n'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and\nany other name registered via codecs.register_error(), see section\nCodec Base Classes. For a list of possible encodings, see section\nStandard Encodings.
\n\nNew in version 2.0.
\n\nChanged in version 2.3: Support for 'xmlcharrefreplace' and 'backslashreplace' and other error\nhandling schemes added.
\n\nChanged in version 2.7: Support for keyword arguments added.
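For example, using u'' and b'' literals (valid in both Python 2.7 and 3.x):

```python
text = u'caf\xe9'
encoded = text.encode('utf-8')                 # bytes of the UTF-8 encoding
roundtrip = encoded.decode('utf-8')            # back to the Unicode string
ascii_lossy = text.encode('ascii', 'ignore')   # unencodable character dropped
ascii_marked = text.encode('ascii', 'replace') # unencodable character marked
```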
\nReturn True if the string ends with the specified suffix, otherwise return\nFalse. suffix can also be a tuple of suffixes to look for. With optional\nstart, test beginning at that position. With optional end, stop comparing\nat that position.
\n\nChanged in version 2.5: Accept tuples as suffix.
\nReturn the lowest index in the string where substring sub is found, such\nthat sub is contained in the slice s[start:end]. Optional arguments\nstart and end are interpreted as in slice notation. Return -1 if\nsub is not found.
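For example:

```python
s = 'hello world'
first_o = s.find('o')      # lowest index where 'o' occurs
in_slice = s.find('o', 5)  # search restricted to s[5:]
missing = s.find('z')      # not found: -1, no exception raised
```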
\n\nPerform a string formatting operation. The string on which this method is\ncalled can contain literal text or replacement fields delimited by braces\n{}. Each replacement field contains either the numeric index of a\npositional argument, or the name of a keyword argument. Returns a copy of\nthe string where each replacement field is replaced with the string value of\nthe corresponding argument.
\n>>> "The sum of 1 + 2 is {0}".format(1+2)\n'The sum of 1 + 2 is 3'\n
See Format String Syntax for a description of the various formatting options\nthat can be specified in format strings.
This method of string formatting is the new standard in Python 3.0, and should be preferred to the % formatting described in String Formatting Operations in new code.
\nNew in version 2.6.
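Replacement fields may also name keyword arguments; a short sketch of both styles:

```python
# positional replacement fields
by_position = '{0} plus {1} is {2}'.format(1, 2, 1 + 2)
# keyword replacement fields
by_keyword = '{lang} has {n} quote types'.format(lang='Python', n=2)
```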
\nReturn true if all characters in the string are alphanumeric and there is at\nleast one character, false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn true if all characters in the string are alphabetic and there is at least\none character, false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn true if all characters in the string are digits and there is at least one\ncharacter, false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn true if all cased characters [4] in the string are lowercase and there is at\nleast one cased character, false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn true if there are only whitespace characters in the string and there is\nat least one character, false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn true if the string is a titlecased string and there is at least one\ncharacter, for example uppercase characters may only follow uncased characters\nand lowercase characters only cased ones. Return false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn true if all cased characters [4] in the string are uppercase and there is at\nleast one cased character, false otherwise.
\nFor 8-bit strings, this method is locale-dependent.
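The predicates above can be checked quickly (note that the empty string is false for all of them):

```python
checks = (
    'abc123'.isalnum(),      # letters and digits only
    'abc'.isalpha(),         # letters only
    '123'.isdigit(),         # digits only
    'abc'.islower(),         # all cased characters lowercase
    '   '.isspace(),         # whitespace only
    'Hello World'.istitle(), # each word starts uppercase
    'ABC'.isupper(),         # all cased characters uppercase
)
empty = ''.isalpha()         # no characters at all: False
```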
\nReturn the string left justified in a string of length width. Padding is done\nusing the specified fillchar (default is a space). The original string is\nreturned if width is less than or equal to len(s).
\n\nChanged in version 2.4: Support for the fillchar argument.
\nReturn a copy of the string with all the cased characters [4] converted to\nlowercase.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn a copy of the string with leading characters removed. The chars\nargument is a string specifying the set of characters to be removed. If omitted\nor None, the chars argument defaults to removing whitespace. The chars\nargument is not a prefix; rather, all combinations of its values are stripped:
\n>>> ' spacious '.lstrip()\n'spacious '\n>>> 'www.example.com'.lstrip('cmowz.')\n'example.com'\n
\nChanged in version 2.2.2: Support for the chars argument.
\nSplit the string at the first occurrence of sep, and return a 3-tuple\ncontaining the part before the separator, the separator itself, and the part\nafter the separator. If the separator is not found, return a 3-tuple containing\nthe string itself, followed by two empty strings.
\n\nNew in version 2.5.
\nReturn the string right justified in a string of length width. Padding is done\nusing the specified fillchar (default is a space). The original string is\nreturned if width is less than or equal to len(s).
\n\nChanged in version 2.4: Support for the fillchar argument.
\nSplit the string at the last occurrence of sep, and return a 3-tuple\ncontaining the part before the separator, the separator itself, and the part\nafter the separator. If the separator is not found, return a 3-tuple containing\ntwo empty strings, followed by the string itself.
\n\nNew in version 2.5.
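partition() and rpartition() differ only in which occurrence of the separator they split at:

```python
head, sep, tail = 'key=value=extra'.partition('=')      # first '='
rhead, rsep, rtail = 'key=value=extra'.rpartition('=')  # last '='
missing = 'no separator here'.partition('=')            # separator absent
```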
\nReturn a list of the words in the string, using sep as the delimiter string.\nIf maxsplit is given, at most maxsplit splits are done, the rightmost\nones. If sep is not specified or None, any whitespace string is a\nseparator. Except for splitting from the right, rsplit() behaves like\nsplit() which is described in detail below.
\n\nNew in version 2.4.
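For example, with maxsplit of 1 only the rightmost separator is used:

```python
path = 'usr.local.bin'
from_right = path.rsplit('.', 1)  # splits at the last '.'
from_left = path.split('.', 1)    # compare: splits at the first '.'
```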
\nReturn a copy of the string with trailing characters removed. The chars\nargument is a string specifying the set of characters to be removed. If omitted\nor None, the chars argument defaults to removing whitespace. The chars\nargument is not a suffix; rather, all combinations of its values are stripped:
\n>>> ' spacious '.rstrip()\n' spacious'\n>>> 'mississippi'.rstrip('ipz')\n'mississ'\n
\nChanged in version 2.2.2: Support for the chars argument.
\nReturn a list of the words in the string, using sep as the delimiter\nstring. If maxsplit is given, at most maxsplit splits are done (thus,\nthe list will have at most maxsplit+1 elements). If maxsplit is not\nspecified, then there is no limit on the number of splits (all possible\nsplits are made).
\nIf sep is given, consecutive delimiters are not grouped together and are\ndeemed to delimit empty strings (for example, '1,,2'.split(',') returns\n['1', '', '2']). The sep argument may consist of multiple characters\n(for example, '1<>2<>3'.split('<>') returns ['1', '2', '3']).\nSplitting an empty string with a specified separator returns [''].
\nIf sep is not specified or is None, a different splitting algorithm is\napplied: runs of consecutive whitespace are regarded as a single separator,\nand the result will contain no empty strings at the start or end if the\nstring has leading or trailing whitespace. Consequently, splitting an empty\nstring or a string consisting of just whitespace with a None separator\nreturns [].
\nFor example, ' 1 2 3 '.split() returns ['1', '2', '3'], and\n' 1 2 3 '.split(None, 1) returns ['1', '2 3 '].
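The two splitting algorithms can be contrasted directly:

```python
with_sep = '1,,2'.split(',')        # explicit sep keeps empty strings
multichar = '1<>2<>3'.split('<>')   # sep may be several characters
empty_explicit = ''.split(',')      # empty string with explicit sep
whitespace = '  1  2  3  '.split()  # runs of whitespace collapse
empty_default = '   '.split()       # only whitespace: empty result
```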
\nReturn True if string starts with the prefix, otherwise return False.\nprefix can also be a tuple of prefixes to look for. With optional start,\ntest string beginning at that position. With optional end, stop comparing\nstring at that position.
\n\nChanged in version 2.5: Accept tuples as prefix.
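For example, startswith() and endswith() with a tuple and a start position:

```python
name = 'archive.tar.gz'
is_archive = name.endswith(('.tar.gz', '.zip'))  # any suffix in the tuple
from_pos = name.startswith('tar', 8)             # test begins at index 8
```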
\nReturn a copy of the string with the leading and trailing characters removed.\nThe chars argument is a string specifying the set of characters to be removed.\nIf omitted or None, the chars argument defaults to removing whitespace.\nThe chars argument is not a prefix or suffix; rather, all combinations of its\nvalues are stripped:
\n>>> ' spacious '.strip()\n'spacious'\n>>> 'www.example.com'.strip('cmowz.')\n'example'\n
\nChanged in version 2.2.2: Support for the chars argument.
\nReturn a copy of the string with uppercase characters converted to lowercase and\nvice versa.
\nFor 8-bit strings, this method is locale-dependent.
\nReturn a titlecased version of the string where words start with an uppercase\ncharacter and the remaining characters are lowercase.
\nThe algorithm uses a simple language-independent definition of a word as\ngroups of consecutive letters. The definition works in many contexts but\nit means that apostrophes in contractions and possessives form word\nboundaries, which may not be the desired result:
\n>>> "they're bill's friends from the UK".title()\n"They'Re Bill'S Friends From The Uk"\n
A workaround for apostrophes can be constructed using regular expressions:
>>> import re
>>> def titlecase(s):
...     return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
...                   lambda mo: mo.group(0)[0].upper() +
...                              mo.group(0)[1:].lower(),
...                   s)
...
>>> titlecase("they're bill's friends.")
"They're Bill's Friends."
For 8-bit strings, this method is locale-dependent.
\nReturn a copy of the string where all characters occurring in the optional\nargument deletechars are removed, and the remaining characters have been\nmapped through the given translation table, which must be a string of length\n256.
\nYou can use the maketrans() helper function in the string\nmodule to create a translation table. For string objects, set the table\nargument to None for translations that only delete characters:
\n>>> 'read this short text'.translate(None, 'aeiou')\n'rd ths shrt txt'\n
\nNew in version 2.6: Support for a None table argument.
\nFor Unicode objects, the translate() method does not accept the optional\ndeletechars argument. Instead, it returns a copy of the s where all\ncharacters have been mapped through the given translation table which must be a\nmapping of Unicode ordinals to Unicode ordinals, Unicode strings or None.\nUnmapped characters are left untouched. Characters mapped to None are\ndeleted. Note, a more flexible approach is to create a custom character mapping\ncodec using the codecs module (see encodings.cp1251 for an\nexample).
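The mapping form can be sketched as follows; mapping an ordinal to None deletes the character (this form applies to Unicode objects in Python 2, and to str in Python 3, where the two-argument deletechars form above does not exist):

```python
# delete the vowels by mapping their ordinals to None
table = {ord(c): None for c in u'aeiou'}
result = u'read this short text'.translate(table)
```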
\nReturn a copy of the string with all the cased characters [4] converted to\nuppercase. Note that str.upper().isupper() might be False if s\ncontains uncased characters or if the Unicode category of the resulting\ncharacter(s) is not “Lu” (Letter, uppercase), but e.g. “Lt” (Letter, titlecase).
\nFor 8-bit strings, this method is locale-dependent.
\nReturn the numeric string left filled with zeros in a string of length\nwidth. A sign prefix is handled correctly. The original string is\nreturned if width is less than or equal to len(s).
\n\nNew in version 2.2.2.
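For example:

```python
padded = '42'.zfill(5)       # zero filled to width 5
signed = '-42'.zfill(5)      # sign prefix handled correctly
unchanged = '12345'.zfill(3) # width <= len(s): original returned
```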
\nThe following methods are present only on unicode objects:
String and Unicode objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string or Unicode object), % conversion specifications in format are replaced with zero or more elements of values.
If format requires a single argument, values may be a single non-tuple\nobject. [5] Otherwise, values must be a tuple with exactly the number of\nitems specified by the format string, or a single mapping object (for example, a\ndictionary).
\nA conversion specifier contains two or more characters and has the following\ncomponents, which must occur in this order:
\nWhen the right argument is a dictionary (or other mapping type), then the\nformats in the string must include a parenthesised mapping key into that\ndictionary inserted immediately after the '%' character. The mapping key\nselects the value to be formatted from the mapping. For example:
>>> print '%(language)s has %(number)03d quote types.' % \
...       {"language": "Python", "number": 2}
Python has 002 quote types.
In this case no * specifiers may occur in a format (since they require a\nsequential parameter list).
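With a tuple of values, by contrast, a * specifier consumes one element of the tuple as the field width:

```python
by_mapping = '%(language)s has %(number)03d quote types.' % \
             {'language': 'Python', 'number': 2}
# '*' reads the width (5 here) from the argument tuple
by_tuple = '%*d' % (5, 42)
```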
\nThe conversion flag characters are:
Flag | Meaning
---|---
'#' | The value conversion will use the “alternate form” (where defined below).
'0' | The conversion will be zero padded for numeric values.
'-' | The converted value is left adjusted (overrides the '0' conversion if both are given).
' ' | (a space) A blank should be left before a positive number (or empty string) produced by a signed conversion.
'+' | A sign character ('+' or '-') will precede the conversion (overrides a “space” flag).
A length modifier (h, l, or L) may be present, but is ignored as it\nis not necessary for Python – so e.g. %ld is identical to %d.
\nThe conversion types are:
Conversion | Meaning | Notes
---|---|---
'd' | Signed integer decimal. |
'i' | Signed integer decimal. |
'o' | Signed octal value. | (1)
'u' | Obsolete type; it is identical to 'd'. | (7)
'x' | Signed hexadecimal (lowercase). | (2)
'X' | Signed hexadecimal (uppercase). | (2)
'e' | Floating point exponential format (lowercase). | (3)
'E' | Floating point exponential format (uppercase). | (3)
'f' | Floating point decimal format. | (3)
'F' | Floating point decimal format. | (3)
'g' | Floating point format. Uses lowercase exponential format if exponent is less than -4 or not less than precision, decimal format otherwise. | (4)
'G' | Floating point format. Uses uppercase exponential format if exponent is less than -4 or not less than precision, decimal format otherwise. | (4)
'c' | Single character (accepts integer or single character string). |
'r' | String (converts any Python object using repr()). | (5)
's' | String (converts any Python object using str()). | (6)
'%' | No argument is converted, results in a '%' character in the result. |
Notes:
\nThe alternate form causes a leading zero ('0') to be inserted between\nleft-hand padding and the formatting of the number if the leading character\nof the result is not already a zero.
\nThe alternate form causes a leading '0x' or '0X' (depending on whether\nthe 'x' or 'X' format was used) to be inserted between left-hand padding\nand the formatting of the number if the leading character of the result is not\nalready a zero.
\nThe alternate form causes the result to always contain a decimal point, even if\nno digits follow it.
\nThe precision determines the number of digits after the decimal point and\ndefaults to 6.
\nThe alternate form causes the result to always contain a decimal point, and\ntrailing zeroes are not removed as they would otherwise be.
\nThe precision determines the number of significant digits before and after the\ndecimal point and defaults to 6.
\nThe %r conversion was added in Python 2.0.
\nThe precision determines the maximal number of characters used.
\nIf the object or format provided is a unicode string, the resulting\nstring will also be unicode.
\nThe precision determines the maximal number of characters used.
\nSee PEP 237.
\nSince Python strings have an explicit length, %s conversions do not assume\nthat '\\0' is the end of the string.
\n\nChanged in version 2.7: %f conversions for numbers whose absolute value is over 1e50 are no\nlonger replaced by %g conversions.
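A few of the conversion types and flags above in action:

```python
hex_low = '%x' % 255      # lowercase hexadecimal
hex_alt = '%#X' % 255     # alternate form adds the '0X' prefix
fixed = '%.2f' % 3.14159  # precision: digits after the decimal point
general = '%g' % 0.00001  # small exponent: exponential form chosen
repr_conv = '%r' % 'hi'   # converts via repr()
```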
\nAdditional string operations are defined in standard modules string and\nre.
\nThe xrange type is an immutable sequence which is commonly used for\nlooping. The advantage of the xrange type is that an xrange\nobject will always take the same amount of memory, no matter the size of the\nrange it represents. There are no consistent performance advantages.
\nXRange objects have very little behavior: they only support indexing, iteration,\nand the len() function.
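A small sketch of that behavior (the try/except shim is only there so the example also runs on Python 3, where range is the lazy equivalent of xrange):

```python
try:
    xrange
except NameError:        # Python 3: range is the lazy equivalent
    xrange = range

r = xrange(1000000)      # constant memory, no list of a million ints
first = r[0]             # indexing is supported
length = len(r)          # so is len()
total = 0
for i in xrange(5):      # and iteration
    total += i
```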
\nList and bytearray objects support additional operations that allow\nin-place modification of the object. Other mutable sequence types (when added\nto the language) should also support these operations. Strings and tuples\nare immutable sequence types: such objects cannot be modified once created.\nThe following operations are defined on mutable sequence types (where x is\nan arbitrary object):
Operation | Result | Notes
---|---|---
s[i] = x | item i of s is replaced by x |
s[i:j] = t | slice of s from i to j is replaced by the contents of the iterable t |
del s[i:j] | same as s[i:j] = [] |
s[i:j:k] = t | the elements of s[i:j:k] are replaced by those of t | (1)
del s[i:j:k] | removes the elements of s[i:j:k] from the list |
s.append(x) | same as s[len(s):len(s)] = [x] | (2)
s.extend(x) | same as s[len(s):len(s)] = x | (3)
s.count(x) | return number of i's for which s[i] == x |
s.index(x[, i[, j]]) | return smallest k such that s[k] == x and i <= k < j | (4)
s.insert(i, x) | same as s[i:i] = [x] | (5)
s.pop([i]) | same as x = s[i]; del s[i]; return x | (6)
s.remove(x) | same as del s[s.index(x)] | (4)
s.reverse() | reverses the items of s in place | (7)
s.sort([cmp[, key[, reverse]]]) | sort the items of s in place | (7)(8)(9)(10)
Notes:
\nt must have the same length as the slice it is replacing.
\nThe C implementation of Python has historically accepted multiple parameters and\nimplicitly joined them into a tuple; this no longer works in Python 2.0. Use of\nthis misfeature has been deprecated since Python 1.4.
\nx can be any iterable object.
\nRaises ValueError when x is not found in s. When a negative index is\npassed as the second or third parameter to the index() method, the list\nlength is added, as for slice indices. If it is still negative, it is truncated\nto zero, as for slice indices.
\n\nChanged in version 2.3: Previously, index() didn’t have arguments for specifying start and stop\npositions.
\nWhen a negative index is passed as the first parameter to the insert()\nmethod, the list length is added, as for slice indices. If it is still\nnegative, it is truncated to zero, as for slice indices.
\n\nChanged in version 2.3: Previously, all negative indices were truncated to zero.
\nThe pop() method is only supported by the list and array types. The\noptional argument i defaults to -1, so that by default the last item is\nremoved and returned.
\nThe sort() and reverse() methods modify the list in place for\neconomy of space when sorting or reversing a large list. To remind you that\nthey operate by side effect, they don’t return the sorted or reversed list.
\nThe sort() method takes optional arguments for controlling the\ncomparisons.
\ncmp specifies a custom comparison function of two arguments (list items) which\nshould return a negative, zero or positive number depending on whether the first\nargument is considered smaller than, equal to, or larger than the second\nargument: cmp=lambda x,y: cmp(x.lower(), y.lower()). The default value\nis None.
\nkey specifies a function of one argument that is used to extract a comparison\nkey from each list element: key=str.lower. The default value is None.
\nreverse is a boolean value. If set to True, then the list elements are\nsorted as if each comparison were reversed.
\nIn general, the key and reverse conversion processes are much faster than\nspecifying an equivalent cmp function. This is because cmp is called\nmultiple times for each list element while key and reverse touch each\nelement only once. Use functools.cmp_to_key() to convert an\nold-style cmp function to a key function.
\n\nChanged in version 2.3: Support for None as an equivalent to omitting cmp was added.
\n\nChanged in version 2.4: Support for key and reverse was added.
\nStarting with Python 2.3, the sort() method is guaranteed to be stable. A\nsort is stable if it guarantees not to change the relative order of elements\nthat compare equal — this is helpful for sorting in multiple passes (for\nexample, sort by department, then by salary grade).
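A sketch of the multiple-pass idiom (the employee data is invented for illustration):

```python
employees = [('alice', 'sales', 2), ('bob', 'it', 1), ('carol', 'sales', 1)]
employees.sort(key=lambda e: e[2])  # pass 1: by salary grade
# pass 2: by department; stability preserves the grade order
# among employees in the same department
employees.sort(key=lambda e: e[1])
```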
\nCPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even\ninspect, the list is undefined. The C implementation of Python 2.3 and\nnewer makes the list appear empty for the duration, and raises\nValueError if it can detect that the list has been mutated during a\nsort.
\nA set object is an unordered collection of distinct hashable objects.\nCommon uses include membership testing, removing duplicates from a sequence, and\ncomputing mathematical operations such as intersection, union, difference, and\nsymmetric difference.\n(For other containers see the built in dict, list,\nand tuple classes, and the collections module.)
\n\nNew in version 2.4.
\nLike other collections, sets support x in set, len(set), and for x in\nset. Being an unordered collection, sets do not record element position or\norder of insertion. Accordingly, sets do not support indexing, slicing, or\nother sequence-like behavior.
\nThere are currently two built-in set types, set and frozenset.\nThe set type is mutable — the contents can be changed using methods\nlike add() and remove(). Since it is mutable, it has no hash value\nand cannot be used as either a dictionary key or as an element of another set.\nThe frozenset type is immutable and hashable — its contents\ncannot be altered after it is created; it can therefore be used as a dictionary\nkey or as an element of another set.
\nAs of Python 2.7, non-empty sets (not frozensets) can be created by placing a\ncomma-separated list of elements within braces, for example: {'jack',\n'sjoerd'}, in addition to the set constructor.
\nThe constructors for both classes work the same:
\nReturn a new set or frozenset object whose elements are taken from\niterable. The elements of a set must be hashable. To represent sets of\nsets, the inner sets must be frozenset objects. If iterable is\nnot specified, a new empty set is returned.
\nInstances of set and frozenset provide the following\noperations:
\nReturn True if the set has no elements in common with other. Sets are\ndisjoint if and only if their intersection is the empty set.
\n\nNew in version 2.6.
\nReturn a new set with elements from the set and all others.
\n\nChanged in version 2.6: Accepts multiple input iterables.
\nReturn a new set with elements common to the set and all others.
\n\nChanged in version 2.6: Accepts multiple input iterables.
\nReturn a new set with elements in the set that are not in the others.
\n\nChanged in version 2.6: Accepts multiple input iterables.
\nNote, the non-operator versions of union(), intersection(),\ndifference(), and symmetric_difference(), issubset(), and\nissuperset() methods will accept any iterable as an argument. In\ncontrast, their operator based counterparts require their arguments to be\nsets. This precludes error-prone constructions like set('abc') & 'cbs'\nin favor of the more readable set('abc').intersection('cbs').
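For example:

```python
a = set('abracadabra')         # duplicates removed
b = set('alacazam')
union = a | b                  # elements in either set
common = a & b                 # elements in both sets
only_a = a - b                 # elements in a but not b
# the method form accepts any iterable; the operator form requires sets
common_iterable = a.intersection('cbs')
```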
\nBoth set and frozenset support set to set comparisons. Two\nsets are equal if and only if every element of each set is contained in the\nother (each is a subset of the other). A set is less than another set if and\nonly if the first set is a proper subset of the second set (is a subset, but\nis not equal). A set is greater than another set if and only if the first set\nis a proper superset of the second set (is a superset, but is not equal).
\nInstances of set are compared to instances of frozenset\nbased on their members. For example, set('abc') == frozenset('abc')\nreturns True and so does set('abc') in set([frozenset('abc')]).
\nThe subset and equality comparisons do not generalize to a complete ordering\nfunction. For example, any two disjoint sets are not equal and are not\nsubsets of each other, so all of the following return False: a<b,\na==b, or a>b. Accordingly, sets do not implement the __cmp__()\nmethod.
\nSince sets only define partial ordering (subset relationships), the output of\nthe list.sort() method is undefined for lists of sets.
\nSet elements, like dictionary keys, must be hashable.
\nBinary operations that mix set instances with frozenset\nreturn the type of the first operand. For example: frozenset('ab') |\nset('bc') returns an instance of frozenset.
\nThe following table lists operations available for set that do not\napply to immutable instances of frozenset:
\nUpdate the set, adding elements from all others.
\n\nChanged in version 2.6: Accepts multiple input iterables.
\nUpdate the set, keeping only elements found in it and all others.
\n\nChanged in version 2.6: Accepts multiple input iterables.
\nUpdate the set, removing elements found in others.
\n\nChanged in version 2.6: Accepts multiple input iterables.
\nNote, the non-operator versions of the update(),\nintersection_update(), difference_update(), and\nsymmetric_difference_update() methods will accept any iterable as an\nargument.
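For example, each of the in-place update methods accepts arbitrary iterables:

```python
s = set('abc')
s.update('cd', 'ef')                  # add elements from all others
s.difference_update('af')             # remove elements found in others
s.intersection_update('bcd', 'bcde')  # keep elements found in all others
```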
\nNote, the elem argument to the __contains__(), remove(), and\ndiscard() methods may be a set. To support searching for an equivalent\nfrozenset, the elem set is temporarily mutated during the search and then\nrestored. During the search, the elem set should not be read or mutated\nsince it does not have a meaningful value.
\nSee also
\nA mapping object maps hashable values to arbitrary objects.\nMappings are mutable objects. There is currently only one standard mapping\ntype, the dictionary. (For other containers see the built in\nlist, set, and tuple classes, and the\ncollections module.)
\nA dictionary’s keys are almost arbitrary values. Values that are not\nhashable, that is, values containing lists, dictionaries or other\nmutable types (that are compared by value rather than by object identity) may\nnot be used as keys. Numeric types used for keys obey the normal rules for\nnumeric comparison: if two numbers compare equal (such as 1 and 1.0)\nthen they can be used interchangeably to index the same dictionary entry. (Note\nhowever, that since computers store floating-point numbers as approximations it\nis usually unwise to use them as dictionary keys.)
\nDictionaries can be created by placing a comma-separated list of key: value\npairs within braces, for example: {'jack': 4098, 'sjoerd': 4127} or {4098:\n'jack', 4127: 'sjoerd'}, or by the dict constructor.
\nReturn a new dictionary initialized from an optional positional argument or from\na set of keyword arguments. If no arguments are given, return a new empty\ndictionary. If the positional argument arg is a mapping object, return a\ndictionary mapping the same keys to the same values as does the mapping object.\nOtherwise the positional argument must be a sequence, a container that supports\niteration, or an iterator object. The elements of the argument must each also\nbe of one of those kinds, and each must in turn contain exactly two objects.\nThe first is used as a key in the new dictionary, and the second as the key’s\nvalue. If a given key is seen more than once, the last value associated with it\nis retained in the new dictionary.
\nIf keyword arguments are given, the keywords themselves with their associated\nvalues are added as items to the dictionary. If a key is specified both in the\npositional argument and as a keyword argument, the value associated with the\nkeyword is retained in the dictionary. For example, these all return a\ndictionary equal to {"one": 1, "two": 2}:
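For example, each of the following produces that dictionary (the keyword form is listed first):

```python
a = dict(one=1, two=2)                 # keyword arguments
b = dict({'one': 1, 'two': 2})         # a mapping object
c = dict([('one', 1), ('two', 2)])     # an iterable of key/value pairs
d = dict(zip(['one', 'two'], [1, 2]))  # ditto, built with zip()
e = {'one': 1, 'two': 2}               # a dictionary display
```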
\nThe first example only works for keys that are valid Python\nidentifiers; the others work with any valid keys.
\n\nNew in version 2.2.
\n\nChanged in version 2.3: Support for building a dictionary from keyword arguments added.
\nThese are the operations that dictionaries support (and therefore, custom\nmapping types should support too):
\nReturn the item of d with key key. Raises a KeyError if key\nis not in the map.
\n\nNew in version 2.5: If a subclass of dict defines a method __missing__(), if the key\nkey is not present, the d[key] operation calls that method with\nthe key key as argument. The d[key] operation then returns or\nraises whatever is returned or raised by the __missing__(key) call\nif the key is not present. No other operations or methods invoke\n__missing__(). If __missing__() is not defined,\nKeyError is raised. __missing__() must be a method; it\ncannot be an instance variable. For an example, see\ncollections.defaultdict.
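A minimal sketch of __missing__ (the subclass name is invented; compare collections.defaultdict):

```python
class CountingDict(dict):
    # called by d[key] only when key is absent; the key is not inserted
    def __missing__(self, key):
        return 0

tallies = CountingDict(spam=3)
present = tallies['spam']  # normal lookup, __missing__ not invoked
absent = tallies['eggs']   # absent key: __missing__('eggs') is called
```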
\nReturn True if d has a key key, else False.
\n\nNew in version 2.2.
\nEquivalent to not key in d.
\n\nNew in version 2.2.
\nCreate a new dictionary with keys from seq and values set to value.
\nfromkeys() is a class method that returns a new dictionary. value\ndefaults to None.
\n\nNew in version 2.3.
\nReturn a copy of the dictionary’s list of (key, value) pairs.
\nCPython implementation detail: Keys and values are listed in an arbitrary order which is non-random,\nvaries across Python implementations, and depends on the dictionary’s\nhistory of insertions and deletions.
\nIf items(), keys(), values(), iteritems(),\niterkeys(), and itervalues() are called with no intervening\nmodifications to the dictionary, the lists will directly correspond. This\nallows the creation of (value, key) pairs using zip(): pairs =\nzip(d.values(), d.keys()). The same relationship holds for the\niterkeys() and itervalues() methods: pairs =\nzip(d.itervalues(), d.iterkeys()) provides the same value for\npairs. Another way to create the same list is pairs = [(v, k) for\n(k, v) in d.iteritems()].
\nReturn an iterator over the dictionary’s (key, value) pairs. See the\nnote for dict.items().
\nUsing iteritems() while adding or deleting entries in the dictionary\nmay raise a RuntimeError or fail to iterate over all entries.
\n\nNew in version 2.2.
\nReturn an iterator over the dictionary’s keys. See the note for\ndict.items().
\nUsing iterkeys() while adding or deleting entries in the dictionary\nmay raise a RuntimeError or fail to iterate over all entries.
\n\nNew in version 2.2.
\nReturn an iterator over the dictionary’s values. See the note for\ndict.items().
\nUsing itervalues() while adding or deleting entries in the\ndictionary may raise a RuntimeError or fail to iterate over all\nentries.
\n\nNew in version 2.2.
\nIf key is in the dictionary, remove it and return its value, else return\ndefault. If default is not given and key is not in the dictionary,\na KeyError is raised.
\n\nNew in version 2.3.
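The three cases of pop() in a minimal sketch:

```python
d = {'a': 1, 'b': 2}
assert d.pop('a') == 1        # key present: removed and returned
assert d.pop('a', 0) == 0     # key absent, default given: default returned
try:
    d.pop('a')                # key absent, no default: KeyError
except KeyError:
    pass
assert d == {'b': 2}
```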
\nRemove and return an arbitrary (key, value) pair from the dictionary.
\npopitem() is useful to destructively iterate over a dictionary, as\noften used in set algorithms. If the dictionary is empty, calling\npopitem() raises a KeyError.
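Destructive iteration with popitem() can be sketched as:

```python
d = {'a': 1, 'b': 2, 'c': 3}
drained = {}
while d:                       # consume the dictionary pair by pair
    key, value = d.popitem()   # removes and returns an arbitrary pair
    drained[key] = value
assert d == {}                 # a further popitem() would raise KeyError
assert drained == {'a': 1, 'b': 2, 'c': 3}
```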
\nUpdate the dictionary with the key/value pairs from other, overwriting\nexisting keys. Return None.
\nupdate() accepts either another dictionary object or an iterable of\nkey/value pairs (as tuples or other iterables of length two). If keyword\narguments are specified, the dictionary is then updated with those\nkey/value pairs: d.update(red=1, blue=2).
\n\nChanged in version 2.4: Allowed the argument to be an iterable of key/value pairs and allowed\nkeyword arguments.
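The three argument forms of update() in one sketch:

```python
d = {'red': 0, 'blue': 0}
assert d.update({'red': 1}) is None     # another dictionary; returns None
d.update([('green', 2)])                # iterable of key/value pairs
d.update(blue=3)                        # keyword arguments
assert d == {'red': 1, 'blue': 3, 'green': 2}
```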
\nReturn a new view of the dictionary’s items ((key, value) pairs). See\nbelow for documentation of view objects.
\n\nNew in version 2.7.
\nReturn a new view of the dictionary’s keys. See below for documentation of\nview objects.
\n\nNew in version 2.7.
\nReturn a new view of the dictionary’s values. See below for documentation of\nview objects.
\n\nNew in version 2.7.
\nThe objects returned by dict.viewkeys(), dict.viewvalues() and\ndict.viewitems() are view objects. They provide a dynamic view on the\ndictionary’s entries, which means that when the dictionary changes, the view\nreflects these changes.
\nDictionary views can be iterated over to yield their respective data, and\nsupport membership tests:
\nReturn an iterator over the keys, values or items (represented as tuples of\n(key, value)) in the dictionary.
\nKeys and values are iterated over in an arbitrary order which is non-random,\nvaries across Python implementations, and depends on the dictionary’s history\nof insertions and deletions. If keys, values and items views are iterated\nover with no intervening modifications to the dictionary, the order of items\nwill directly correspond. This allows the creation of (value, key) pairs\nusing zip(): pairs = zip(d.values(), d.keys()). Another way to\ncreate the same list is pairs = [(v, k) for (k, v) in d.items()].
\nIterating views while adding or deleting entries in the dictionary may raise\na RuntimeError or fail to iterate over all entries.
\nKeys views are set-like since their entries are unique and hashable. If all\nvalues are hashable, so that (key, value) pairs are unique and hashable, then\nthe items view is also set-like. (Values views are not treated as set-like\nsince the entries are generally not unique.) Then these set operations are\navailable (“other” refers either to another view or a set):
\nAn example of dictionary view usage:
\n>>> dishes = {'eggs': 2, 'sausage': 1, 'bacon': 1, 'spam': 500}\n>>> keys = dishes.viewkeys()\n>>> values = dishes.viewvalues()\n\n>>> # iteration\n>>> n = 0\n>>> for val in values:\n... n += val\n>>> print(n)\n504\n\n>>> # keys and values are iterated over in the same order\n>>> list(keys)\n['eggs', 'bacon', 'sausage', 'spam']\n>>> list(values)\n[2, 1, 1, 500]\n\n>>> # view objects are dynamic and reflect dict changes\n>>> del dishes['eggs']\n>>> del dishes['sausage']\n>>> list(keys)\n['spam', 'bacon']\n\n>>> # set operations\n>>> keys & {'eggs', 'bacon', 'salad'}\n{'bacon'}\n
File objects are implemented using C’s stdio package and can be\ncreated with the built-in open() function. File\nobjects are also returned by some other built-in functions and methods,\nsuch as os.popen() and os.fdopen() and the makefile()\nmethod of socket objects. Temporary files can be created using the\ntempfile module, and high-level file operations such as copying,\nmoving, and deleting files and directories can be achieved with the\nshutil module.
\nWhen a file operation fails for an I/O-related reason, the exception\nIOError is raised. This includes situations where the operation is not\ndefined for some reason, like seek() on a tty device or writing a file\nopened for reading.
\nFiles have the following methods:
\nClose the file. A closed file cannot be read or written any more. Any operation\nwhich requires that the file be open will raise a ValueError after the\nfile has been closed. Calling close() more than once is allowed.
\nAs of Python 2.5, you can avoid having to call this method explicitly if you use\nthe with statement. For example, the following code will\nautomatically close f when the with block is exited:
\nfrom __future__ import with_statement # This isn't required in Python 2.6\n\nwith open("hello.txt") as f:\n for line in f:\n print line\n
In older versions of Python, you would have needed to do this to get the same\neffect:
\nf = open("hello.txt")\ntry:\n for line in f:\n print line\nfinally:\n f.close()\n
Note
\nNot all “file-like” types in Python support use as a context manager for the\nwith statement. If your code is intended to work with any file-like\nobject, you can use the function contextlib.closing() instead of using\nthe object directly.
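A minimal sketch of contextlib.closing(); the Conn class here is a hypothetical file-like object that has close() but does not implement the context management protocol itself.

```python
import contextlib

class Conn(object):
    """Hypothetical file-like object: has close(), but is not a context manager."""
    def __init__(self):
        self.closed = False
    def read(self):
        return 'payload'
    def close(self):
        self.closed = True

with contextlib.closing(Conn()) as conn:
    data = conn.read()

assert data == 'payload'
assert conn.closed            # close() was called when the block exited
```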
\nFlush the internal buffer, like stdio‘s fflush(). This may be a\nno-op on some file-like objects.
\nNote
\nflush() does not necessarily write the file’s data to disk. Use\nflush() followed by os.fsync() to ensure this behavior.
\nReturn the integer “file descriptor” that is used by the underlying\nimplementation to request I/O operations from the operating system. This can be\nuseful for other, lower level interfaces that use file descriptors, such as the\nfcntl module or os.read() and friends.
\nNote
\nFile-like objects which do not have a real file descriptor should not provide\nthis method!
\nReturn True if the file is connected to a tty(-like) device, else False.
\nNote
\nIf a file-like object is not associated with a real file, this method should\nnot be implemented.
\nA file object is its own iterator, for example iter(f) returns f (unless\nf is closed). When a file is used as an iterator, typically in a\nfor loop (for example, for line in f: print line), the\nnext() method is called repeatedly. This method returns the next input\nline, or raises StopIteration when EOF is hit when the file is open for\nreading (behavior is undefined when the file is open for writing). In order to\nmake a for loop the most efficient way of looping over the lines of a\nfile (a very common operation), the next() method uses a hidden read-ahead\nbuffer. As a consequence of using a read-ahead buffer, combining next()\nwith other file methods (like readline()) does not work right. However,\nusing seek() to reposition the file to an absolute position will flush the\nread-ahead buffer.
\n\nNew in version 2.3.
\nRead at most size bytes from the file (less if the read hits EOF before\nobtaining size bytes). If the size argument is negative or omitted, read\nall data until EOF is reached. The bytes are returned as a string object. An\nempty string is returned when EOF is encountered immediately. (For certain\nfiles, like ttys, it makes sense to continue reading after an EOF is hit.) Note\nthat this method may call the underlying C function fread() more than\nonce in an effort to acquire as close to size bytes as possible. Also note\nthat when in non-blocking mode, less data than was requested may be\nreturned, even if no size parameter was given.
\nNote
\nThis function is simply a wrapper for the underlying\nfread() C function, and will behave the same in corner cases,\nsuch as whether the EOF value is cached.
\nRead one entire line from the file. A trailing newline character is kept in\nthe string (but may be absent when a file ends with an incomplete line). [6]\nIf the size argument is present and non-negative, it is a maximum byte\ncount (including the trailing newline) and an incomplete line may be\nreturned. When size is not 0, an empty string is returned only when EOF\nis encountered immediately.
\nNote
\nUnlike stdio‘s fgets(), the returned string contains null characters\n('\\0') if they occurred in the input.
\nThis method returns the same thing as iter(f).
\n\nNew in version 2.1.
\n\nDeprecated since version 2.3: Use for line in file instead.
\nSet the file’s current position, like stdio‘s fseek(). The whence\nargument is optional and defaults to os.SEEK_SET or 0 (absolute file\npositioning); other values are os.SEEK_CUR or 1 (seek relative to the\ncurrent position) and os.SEEK_END or 2 (seek relative to the file’s\nend). There is no return value.
\nFor example, f.seek(2, os.SEEK_CUR) advances the position by two and\nf.seek(-3, os.SEEK_END) sets the position to the third to last.
\nNote that if the file is opened for appending\n(mode 'a' or 'a+'), any seek() operations will be undone at the\nnext write. If the file is only opened for writing in append mode (mode\n'a'), this method is essentially a no-op, but it remains useful for files\nopened in append mode with reading enabled (mode 'a+'). If the file is\nopened in text mode (without 'b'), only offsets returned by tell() are\nlegal. Use of other offsets causes undefined behavior.
\nNote that not all file objects are seekable.
\n\nChanged in version 2.6: Passing float values as offset has been deprecated.
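A sketch of seek() and tell() with the three whence values, using a temporary file opened in binary mode so that every offset is legal:

```python
import os
import tempfile

f = tempfile.TemporaryFile()  # opened in binary mode ('w+b') by default
f.write(b'0123456789')
f.seek(0, os.SEEK_SET)        # absolute positioning: back to the start
f.seek(2, os.SEEK_CUR)        # advance by two from the current position
assert f.tell() == 2
f.seek(-3, os.SEEK_END)       # third byte from the end
assert f.read() == b'789'
f.close()
```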
\nReturn the file’s current position, like stdio‘s ftell().
\nNote
\nOn Windows, tell() can return illegal values (after an fgets())\nwhen reading files with Unix-style line-endings. Use binary mode ('rb') to\ncircumvent this problem.
\nFiles support the iterator protocol. Each iteration returns the same result as\nfile.readline(), and iteration ends when the readline() method returns\nan empty string.
\nFile objects also offer a number of other interesting attributes. These are not\nrequired for file-like objects, but should be implemented if they make sense for\nthe particular object.
\nThe encoding that this file uses. When Unicode strings are written to a file,\nthey will be converted to byte strings using this encoding. In addition, when\nthe file is connected to a terminal, the attribute gives the encoding that the\nterminal is likely to use (that information might be incorrect if the user has\nmisconfigured the terminal). The attribute is read-only and may not be present\non all file-like objects. It may also be None, in which case the file uses\nthe system default encoding for converting Unicode strings.
\n\nNew in version 2.3.
\nThe Unicode error handler used along with the encoding.
\n\nNew in version 2.6.
\nBoolean that indicates whether a space character needs to be printed before\nanother value when using the print statement. Classes that are trying\nto simulate a file object should also have a writable softspace\nattribute, which should be initialized to zero. This will be automatic for most\nclasses implemented in Python (care may be needed for objects that override\nattribute access); types implemented in C will have to provide a writable\nsoftspace attribute.
\n\n\nNew in version 2.7.
\nmemoryview objects allow Python code to access the internal data\nof an object that supports the buffer protocol without copying. Memory\nis generally interpreted as simple bytes.
\nCreate a memoryview that references obj. obj must support the\nbuffer protocol. Built-in objects that support the buffer protocol include\nstr and bytearray (but not unicode).
\nA memoryview has the notion of an element, which is the\natomic memory unit handled by the originating object obj. For many\nsimple types such as str and bytearray, an element\nis a single byte, but other third-party types may expose larger elements.
\nlen(view) returns the total number of elements in the memoryview,\nview. The itemsize attribute will give you the\nnumber of bytes in a single element.
\nA memoryview supports slicing to expose its data. Taking a single\nindex will return a single element as a str object. Full\nslicing will result in a subview:
\n>>> v = memoryview('abcefg')\n>>> v[1]\n'b'\n>>> v[-1]\n'g'\n>>> v[1:4]\n<memory at 0x77ab28>\n>>> v[1:4].tobytes()\n'bce'\n
If the object the memoryview is over supports changing its data, the\nmemoryview supports slice assignment:
\n>>> data = bytearray('abcefg')\n>>> v = memoryview(data)\n>>> v.readonly\nFalse\n>>> v[0] = 'z'\n>>> data\nbytearray(b'zbcefg')\n>>> v[1:4] = '123'\n>>> data\nbytearray(b'z123fg')\n>>> v[2] = 'spam'\nTraceback (most recent call last):\n File "<stdin>", line 1, in <module>\nValueError: cannot modify size of memoryview object\n
Notice how the size of the memoryview object cannot be changed.
\nmemoryview has two methods:
\nReturn the data in the buffer as a bytestring (an object of class\nstr).
\n>>> m = memoryview("abc")\n>>> m.tobytes()\n'abc'\n
Return the data in the buffer as a list of integers.
\n>>> memoryview("abc").tolist()\n[97, 98, 99]\n
There are also several readonly attributes available:
\n\nNew in version 2.5.
\nPython’s with statement supports the concept of a runtime context\ndefined by a context manager. This is implemented using two separate methods\nthat allow user-defined classes to define a runtime context that is entered\nbefore the statement body is executed and exited when the statement ends.
\nThe context management protocol consists of a pair of methods that need\nto be provided for a context manager object to define a runtime context:
\nEnter the runtime context and return either this object or another object\nrelated to the runtime context. The value returned by this method is bound to\nthe identifier in the as clause of with statements using\nthis context manager.
\nAn example of a context manager that returns itself is a file object. File\nobjects return themselves from __enter__() to allow open() to be used as\nthe context expression in a with statement.
\nAn example of a context manager that returns a related object is the one\nreturned by decimal.localcontext(). These managers set the active\ndecimal context to a copy of the original decimal context and then return the\ncopy. This allows changes to be made to the current decimal context in the body\nof the with statement without affecting code outside the\nwith statement.
\nExit the runtime context and return a Boolean flag indicating if any exception\nthat occurred should be suppressed. If an exception occurred while executing the\nbody of the with statement, the arguments contain the exception type,\nvalue and traceback information. Otherwise, all three arguments are None.
\nReturning a true value from this method will cause the with statement\nto suppress the exception and continue execution with the statement immediately\nfollowing the with statement. Otherwise the exception continues\npropagating after this method has finished executing. Exceptions that occur\nduring execution of this method will replace any exception that occurred in the\nbody of the with statement.
\nThe exception passed in should never be re-raised explicitly - instead, this method should return a false value to indicate that the method completed successfully and does not want to suppress the raised exception. This allows context management code (such as contextlib.nested) to easily detect whether or not an __exit__() method has actually failed.
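A minimal sketch of the protocol: __enter__() returns the object bound by the as clause, and __exit__() returns a true value only when it wants to suppress the exception.

```python
class SuppressKeyError(object):
    """Sketch of a context manager that swallows KeyError only."""
    def __enter__(self):
        return self                         # bound to the `as` target
    def __exit__(self, exc_type, exc_value, traceback):
        self.exc_type = exc_type            # record what happened; do not re-raise
        return exc_type is KeyError         # True suppresses, False propagates

with SuppressKeyError() as ctx:
    {}['missing']                           # raises KeyError inside the body

assert ctx.exc_type is KeyError             # execution continued past the block
```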
\nPython defines several context managers to support easy thread synchronisation,\nprompt closure of files or other objects, and simpler manipulation of the active\ndecimal arithmetic context. The specific types are not treated specially beyond\ntheir implementation of the context management protocol. See the\ncontextlib module for some examples.
\nPython’s generators and the contextlib.contextmanager decorator\nprovide a convenient way to implement these protocols. If a generator function is\ndecorated with the contextlib.contextmanager decorator, it will return a\ncontext manager implementing the necessary __enter__() and\n__exit__() methods, rather than the iterator produced by an undecorated\ngenerator function.
\nNote that there is no specific slot for any of these methods in the type\nstructure for Python objects in the Python/C API. Extension types wanting to\ndefine these methods must provide them as a normal Python accessible method.\nCompared to the overhead of setting up the runtime context, the overhead of a\nsingle class dictionary lookup is negligible.
\nThe interpreter supports several other kinds of objects. Most of these support\nonly one or two operations.
\nThe only special operation on a module is attribute access: m.name, where\nm is a module and name accesses a name defined in m‘s symbol table.\nModule attributes can be assigned to. (Note that the import\nstatement is not, strictly speaking, an operation on a module object; import\nfoo does not require a module object named foo to exist, rather it requires\nan (external) definition for a module named foo somewhere.)
\nA special attribute of every module is __dict__. This is the dictionary\ncontaining the module’s symbol table. Modifying this dictionary will actually\nchange the module’s symbol table, but direct assignment to the __dict__\nattribute is not possible (you can write m.__dict__['a'] = 1, which defines\nm.a to be 1, but you can’t write m.__dict__ = {}). Modifying\n__dict__ directly is not recommended.
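The relationship between attribute assignment and __dict__ can be sketched with a fresh module object (created via types.ModuleType rather than an import):

```python
import types

m = types.ModuleType('example')   # a new, empty module object
m.__dict__['a'] = 1               # writing the symbol table defines m.a
assert m.a == 1
m.b = 2                           # ordinary attribute assignment
assert m.__dict__['b'] == 2       # visible in the symbol table too
```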
\nModules built into the interpreter are written like this: <module 'sys'\n(built-in)>. If loaded from a file, they are written as <module 'os' from\n'/usr/local/lib/pythonX.Y/os.pyc'>.
\nSee Objects, values and types and Class definitions for these.
\nFunction objects are created by function definitions. The only operation on a\nfunction object is to call it: func(argument-list).
\nThere are really two flavors of function objects: built-in functions and\nuser-defined functions. Both support the same operation (to call the function),\nbut the implementation is different, hence the different object types.
\nSee Function definitions for more information.
\nMethods are functions that are called using the attribute notation. There are\ntwo flavors: built-in methods (such as append() on lists) and class\ninstance methods. Built-in methods are described with the types that support\nthem.
\nThe implementation adds two special read-only attributes to class instance\nmethods: m.im_self is the object on which the method operates, and\nm.im_func is the function implementing the method. Calling m(arg-1,\narg-2, ..., arg-n) is completely equivalent to calling m.im_func(m.im_self,\narg-1, arg-2, ..., arg-n).
\nClass instance methods are either bound or unbound, referring to whether the\nmethod was accessed through an instance or a class, respectively. When a method\nis unbound, its im_self attribute will be None and if called, an\nexplicit self object must be passed as the first argument. In this case,\nself must be an instance of the unbound method’s class (or a subclass of\nthat class), otherwise a TypeError is raised.
\nLike function objects, method objects support getting arbitrary attributes. However, since method attributes are actually stored on the underlying function object (meth.im_func), setting method attributes on either bound or unbound methods is disallowed. Attempting to set a method attribute results in a TypeError being raised. In order to set a method attribute, you need to explicitly set it on the underlying function object:
\nclass C:\n def method(self):\n pass\n\nc = C()\nc.method.im_func.whoami = 'my name is c'\n
See The standard type hierarchy for more information.
\nCode objects are used by the implementation to represent “pseudo-compiled”\nexecutable Python code such as a function body. They differ from function\nobjects because they don’t contain a reference to their global execution\nenvironment. Code objects are returned by the built-in compile() function\nand can be extracted from function objects through their func_code\nattribute. See also the code module.
\nA code object can be executed or evaluated by passing it (instead of a source\nstring) to the exec statement or the built-in eval() function.
\nSee The standard type hierarchy for more information.
\nType objects represent the various object types. An object’s type is accessed\nby the built-in function type(). There are no special operations on\ntypes. The standard module types defines names for all standard built-in\ntypes.
\nTypes are written like this: <type 'int'>.
\nThis object is returned by functions that don’t explicitly return a value. It\nsupports no special operations. There is exactly one null object, named\nNone (a built-in name).
\nIt is written as None.
\nThis object is used by extended slice notation (see Slicings). It\nsupports no special operations. There is exactly one ellipsis object, named\nEllipsis (a built-in name).
\nIt is written as Ellipsis. When in a subscript, it can also be written as\n..., for example seq[...].
\nThis object is returned from comparisons and binary operations when they are\nasked to operate on types they don’t support. See Comparisons for more\ninformation.
\nIt is written as NotImplemented.
\nBoolean values are the two constant objects False and True. They are\nused to represent truth values (although other values can also be considered\nfalse or true). In numeric contexts (for example when used as the argument to\nan arithmetic operator), they behave like the integers 0 and 1, respectively.\nThe built-in function bool() can be used to convert any value to a\nBoolean, if the value can be interpreted as a truth value (see section\nTruth Value Testing above).
\nThey are written as False and True, respectively.
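A few illustrative identities:

```python
# Booleans behave like the integers 0 and 1 in numeric contexts.
assert True + True == 2
assert isinstance(True, int)      # bool is a subclass of int
assert bool('') is False          # empty values are false
assert bool([0]) is True          # a non-empty container is true
```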
\nSee The standard type hierarchy for this information. It describes stack frame objects,\ntraceback objects, and slice objects.
\nThe implementation adds a few special read-only attributes to several object\ntypes, where they are relevant. Some of these are not reported by the\ndir() built-in function.
\n\nDeprecated since version 2.2: Use the built-in function dir() to get a list of an object’s attributes.\nThis attribute is no longer available.
\n\nDeprecated since version 2.2: Use the built-in function dir() to get a list of an object’s attributes.\nThis attribute is no longer available.
\nThe following attributes are only supported by new-style classes.
\nEach new-style class keeps a list of weak references to its immediate\nsubclasses. This method returns a list of all those references still alive.\nExample:
\n>>> int.__subclasses__()\n[<type 'bool'>]\n
Footnotes
\n[1] | Additional information on these special methods may be found in the Python\nReference Manual (Basic customization). |
[2] | As a consequence, the list [1, 2] is considered equal to [1.0, 2.0], and\nsimilarly for tuples. |
[3] | They must have since the parser can’t tell the type of the operands. |
[4] | Cased characters are those with general category property being one of\n“Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase). |
[5] | To format only a tuple you should therefore provide a singleton tuple whose only\nelement is the tuple to be formatted. |
[6] | The advantage of leaving the newline on is that returning an empty string is\nthen an unambiguous EOF indication. It is also possible (in cases where it\nmight matter, for example, if you want to make an exact copy of a file while\nscanning its lines) to tell whether the last line of a file ended in a newline\nor not (yes this happens!). |
\nNew in version 2.1.
\nThis module provides classes and functions for comparing sequences. It\ncan be used for example, for comparing files, and can produce difference\ninformation in various formats, including HTML and context and unified\ndiffs. For comparing directories and files, see also, the filecmp module.
\nThis is a flexible class for comparing pairs of sequences of any type, so long\nas the sequence elements are hashable. The basic algorithm predates, and is a\nlittle fancier than, an algorithm published in the late 1980’s by Ratcliff and\nObershelp under the hyperbolic name “gestalt pattern matching.” The idea is to\nfind the longest contiguous matching subsequence that contains no “junk”\nelements (the Ratcliff and Obershelp algorithm doesn’t address junk). The same\nidea is then applied recursively to the pieces of the sequences to the left and\nto the right of the matching subsequence. This does not yield minimal edit\nsequences, but does tend to yield matches that “look right” to people.
\nTiming: The basic Ratcliff-Obershelp algorithm is cubic time in the worst\ncase and quadratic time in the expected case. SequenceMatcher is\nquadratic time for the worst case and has expected-case behavior dependent in a\ncomplicated way on how many elements the sequences have in common; best case\ntime is linear.
\nAutomatic junk heuristic: SequenceMatcher supports a heuristic that\nautomatically treats certain sequence items as junk. The heuristic counts how many\ntimes each individual item appears in the sequence. If an item’s duplicates (after\nthe first one) account for more than 1% of the sequence and the sequence is at least\n200 items long, this item is marked as “popular” and is treated as junk for\nthe purpose of sequence matching. This heuristic can be turned off by setting\nthe autojunk argument to False when creating the SequenceMatcher.
\n\nNew in version 2.7.1: The autojunk parameter.
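A minimal sketch of the class in use, via its ratio() similarity measure (a float in [0, 1], computed as 2*M/T where M is the number of matched elements and T the total length of both sequences):

```python
from difflib import SequenceMatcher

# isjunk=None disables manual junk filtering; the autojunk heuristic is
# irrelevant here since the sequences are far shorter than 200 items.
sm = SequenceMatcher(None, 'abcd', 'bcde')
assert abs(sm.ratio() - 0.75) < 1e-9   # M=3 ('bcd'), T=8, so 2*3/8 == 0.75
```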
\nThis is a class for comparing sequences of lines of text, and producing\nhuman-readable differences or deltas. Differ uses SequenceMatcher\nboth to compare sequences of lines, and to compare sequences of characters\nwithin similar (near-matching) lines.
\nEach line of a Differ delta begins with a two-letter code:
\nCode | Meaning
-----|--------
'- ' | line unique to sequence 1
'+ ' | line unique to sequence 2
'  ' | line common to both sequences
'? ' | line not present in either input sequence
Lines beginning with ‘?‘ attempt to guide the eye to intraline differences,\nand were not present in either input sequence. These lines can be confusing if\nthe sequences contain tab characters.
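The two-letter codes can be observed directly in a small comparison:

```python
from difflib import Differ

result = list(Differ().compare(['one\n', 'two\n'], ['ore\n', 'two\n']))
assert '- one\n' in result        # unique to sequence 1
assert '+ ore\n' in result        # unique to sequence 2
assert '  two\n' in result        # common to both sequences
assert any(line.startswith('? ') for line in result)  # intraline guide
```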
\nThis class can be used to create an HTML table (or a complete HTML file\ncontaining the table) showing a side by side, line by line comparison of text\nwith inter-line and intra-line change highlights. The table can be generated in\neither full or contextual difference mode.
\nThe constructor for this class is:
\nInitializes instance of HtmlDiff.
\ntabsize is an optional keyword argument to specify tab stop spacing and\ndefaults to 8.
\nwrapcolumn is an optional keyword to specify column number where lines are\nbroken and wrapped, defaults to None where lines are not wrapped.
\nlinejunk and charjunk are optional keyword arguments passed into ndiff()\n(used by HtmlDiff to generate the side by side HTML differences). See\nndiff() documentation for argument default values and descriptions.
\nThe following methods are public:
\nCompares fromlines and tolines (lists of strings) and returns a string which\nis a complete HTML file containing a table showing line by line differences with\ninter-line and intra-line changes highlighted.
\nfromdesc and todesc are optional keyword arguments to specify from/to file\ncolumn header strings (both default to an empty string).
\ncontext and numlines are both optional keyword arguments. Set context to\nTrue when contextual differences are to be shown, else the default is\nFalse to show the full files. numlines defaults to 5. When context\nis True numlines controls the number of context lines which surround the\ndifference highlights. When context is False numlines controls the\nnumber of lines which are shown before a difference highlight when using the\n“next” hyperlinks (setting to zero would cause the “next” hyperlinks to place\nthe next difference highlight at the top of the browser without any leading\ncontext).
\nCompares fromlines and tolines (lists of strings) and returns a string which\nis a complete HTML table showing line by line differences with inter-line and\nintra-line changes highlighted.
\nThe arguments for this method are the same as those for the make_file()\nmethod.
\nTools/scripts/diff.py is a command-line front-end to this class and\ncontains a good example of its use.
\n\nNew in version 2.4.
\nCompare a and b (lists of strings); return a delta (a generator\ngenerating the delta lines) in context diff format.
\nContext diffs are a compact way of showing just the lines that have changed plus\na few lines of context. The changes are shown in a before/after style. The\nnumber of context lines is set by n which defaults to three.
\nBy default, the diff control lines (those with *** or ---) are created\nwith a trailing newline. This is helpful so that inputs created from\nfile.readlines() result in diffs that are suitable for use with\nfile.writelines() since both the inputs and outputs have trailing\nnewlines.
\nFor inputs that do not have trailing newlines, set the lineterm argument to\n"" so that the output will be uniformly newline free.
\nThe context diff format normally has a header for filenames and modification\ntimes. Any or all of these may be specified using strings for fromfile,\ntofile, fromfiledate, and tofiledate. The modification times are normally\nexpressed in the ISO 8601 format. If not specified, the\nstrings default to blanks.
\n>>> s1 = ['bacon\\n', 'eggs\\n', 'ham\\n', 'guido\\n']\n>>> s2 = ['python\\n', 'eggy\\n', 'hamster\\n', 'guido\\n']\n>>> for line in context_diff(s1, s2, fromfile='before.py', tofile='after.py'):\n... sys.stdout.write(line) # doctest: +NORMALIZE_WHITESPACE\n*** before.py\n--- after.py\n***************\n*** 1,4 ****\n! bacon\n! eggs\n! ham\n guido\n--- 1,4 ----\n! python\n! eggy\n! hamster\n guido\n
See A command-line interface to difflib for a more detailed example.
\n\nNew in version 2.3.
\nReturn a list of the best “good enough” matches. word is a sequence for which\nclose matches are desired (typically a string), and possibilities is a list of\nsequences against which to match word (typically a list of strings).
\nOptional argument n (default 3) is the maximum number of close matches to\nreturn; n must be greater than 0.
\nOptional argument cutoff (default 0.6) is a float in the range [0, 1].\nPossibilities that don’t score at least that similar to word are ignored.
\nThe best (no more than n) matches among the possibilities are returned in a\nlist, sorted by similarity score, most similar first.
\n>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])\n['apple', 'ape']\n>>> import keyword\n>>> get_close_matches('wheel', keyword.kwlist)\n['while']\n>>> get_close_matches('apple', keyword.kwlist)\n[]\n>>> get_close_matches('accept', keyword.kwlist)\n['except']\n
Compare a and b (lists of strings); return a Differ-style\ndelta (a generator generating the delta lines).
\nOptional keyword parameters linejunk and charjunk are for filter functions\n(or None):
linejunk: A function that accepts a single string argument and returns true if the string is junk, or false if not. The default is None, starting with Python 2.3. Before then, the default was the module-level function IS_LINE_JUNK(), which filters out lines without visible characters, except for at most one pound character ('#'). As of Python 2.3, the underlying SequenceMatcher class does a dynamic analysis of which lines are so frequent as to constitute noise, and this usually works better than the pre-2.3 default.
charjunk: A function that accepts a character (a string of length 1), and returns true if the character is junk, or false if not. The default is the module-level function IS_CHARACTER_JUNK(), which filters out whitespace characters (a blank or tab; note: it is a bad idea to include newline in this!).
\nTools/scripts/ndiff.py is a command-line front-end to this function.
\n>>> diff = ndiff('one\\ntwo\\nthree\\n'.splitlines(1),\n... 'ore\\ntree\\nemu\\n'.splitlines(1))\n>>> print ''.join(diff),\n- one\n? ^\n+ ore\n? ^\n- two\n- three\n? -\n+ tree\n+ emu\n
Return one of the two sequences that generated a delta.
\nGiven a sequence produced by Differ.compare() or ndiff(), extract\nlines originating from file 1 or 2 (parameter which), stripping off line\nprefixes.
\nExample:
\n>>> diff = ndiff('one\\ntwo\\nthree\\n'.splitlines(1),\n... 'ore\\ntree\\nemu\\n'.splitlines(1))\n>>> diff = list(diff) # materialize the generated delta into a list\n>>> print ''.join(restore(diff, 1)),\none\ntwo\nthree\n>>> print ''.join(restore(diff, 2)),\nore\ntree\nemu\n
Compare a and b (lists of strings); return a delta (a generator\ngenerating the delta lines) in unified diff format.
Unified diffs are a compact way of showing just the lines that have changed plus a few lines of context. The changes are shown in an inline style (instead of separate before/after blocks). The number of context lines is set by n, which defaults to three.
\nBy default, the diff control lines (those with ---, +++, or @@) are\ncreated with a trailing newline. This is helpful so that inputs created from\nfile.readlines() result in diffs that are suitable for use with\nfile.writelines() since both the inputs and outputs have trailing\nnewlines.
\nFor inputs that do not have trailing newlines, set the lineterm argument to\n"" so that the output will be uniformly newline free.
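For example, a minimal sketch of the newline-free case (the filenames here are illustrative):

```python
import difflib

# Inputs without trailing newlines; with lineterm="" the control
# lines ('---', '+++', '@@') are emitted without newlines too, so
# every yielded string is uniformly newline-free.
a = ['one', 'two', 'three']
b = ['one', 'tree', 'emu']
diff = list(difflib.unified_diff(a, b, fromfile='a.txt',
                                 tofile='b.txt', lineterm=''))
```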
The unified diff format normally has a header for filenames and modification times. Any or all of these may be specified using strings for fromfile, tofile, fromfiledate, and tofiledate. The modification times are normally expressed in the ISO 8601 format. If not specified, the strings default to blanks.
\n>>> s1 = ['bacon\\n', 'eggs\\n', 'ham\\n', 'guido\\n']\n>>> s2 = ['python\\n', 'eggy\\n', 'hamster\\n', 'guido\\n']\n>>> for line in unified_diff(s1, s2, fromfile='before.py', tofile='after.py'):\n... sys.stdout.write(line) # doctest: +NORMALIZE_WHITESPACE\n--- before.py\n+++ after.py\n@@ -1,4 +1,4 @@\n-bacon\n-eggs\n-ham\n+python\n+eggy\n+hamster\n guido\n
See A command-line interface to difflib for a more detailed example.
\n\nNew in version 2.3.
\nSee also
\nThe SequenceMatcher class has this constructor:
\nOptional argument isjunk must be None (the default) or a one-argument\nfunction that takes a sequence element and returns true if and only if the\nelement is “junk” and should be ignored. Passing None for isjunk is\nequivalent to passing lambda x: 0; in other words, no elements are ignored.\nFor example, pass:
\nlambda x: x in " \\t"\n
if you’re comparing lines as sequences of characters, and don’t want to synch up\non blanks or hard tabs.
\nThe optional arguments a and b are sequences to be compared; both default to\nempty strings. The elements of both sequences must be hashable.
\nThe optional argument autojunk can be used to disable the automatic junk\nheuristic.
\n\nNew in version 2.7.1: The autojunk parameter.
\nSequenceMatcher objects have the following methods:
\nSequenceMatcher computes and caches detailed information about the\nsecond sequence, so if you want to compare one sequence against many\nsequences, use set_seq2() to set the commonly used sequence once and\ncall set_seq1() repeatedly, once for each of the other sequences.
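A minimal sketch of that one-against-many pattern (the target and candidate strings are illustrative):

```python
from difflib import SequenceMatcher

# Compare several candidates against one fixed string. The fixed
# string goes in as sequence 2 (the one that gets cached); each
# candidate is then swapped in cheaply with set_seq1().
target = 'apple'
candidates = ['ape', 'apply', 'peach']

s = SequenceMatcher(None)
s.set_seq2(target)
scores = []
for word in candidates:
    s.set_seq1(word)
    scores.append((word, s.ratio()))
```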
\nFind longest matching block in a[alo:ahi] and b[blo:bhi].
\nIf isjunk was omitted or None, find_longest_match() returns\n(i, j, k) such that a[i:i+k] is equal to b[j:j+k], where alo\n<= i <= i+k <= ahi and blo <= j <= j+k <= bhi. For all (i', j',\nk') meeting those conditions, the additional conditions k >= k', i\n<= i', and if i == i', j <= j' are also met. In other words, of\nall maximal matching blocks, return one that starts earliest in a, and\nof all those maximal matching blocks that start earliest in a, return\nthe one that starts earliest in b.
\n>>> s = SequenceMatcher(None, " abcd", "abcd abcd")\n>>> s.find_longest_match(0, 5, 0, 9)\nMatch(a=0, b=4, size=5)\n
If isjunk was provided, first the longest matching block is determined\nas above, but with the additional restriction that no junk element appears\nin the block. Then that block is extended as far as possible by matching\n(only) junk elements on both sides. So the resulting block never matches\non junk except as identical junk happens to be adjacent to an interesting\nmatch.
\nHere’s the same example as before, but considering blanks to be junk. That\nprevents ' abcd' from matching the ' abcd' at the tail end of the\nsecond sequence directly. Instead only the 'abcd' can match, and\nmatches the leftmost 'abcd' in the second sequence:
\n>>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")\n>>> s.find_longest_match(0, 5, 0, 9)\nMatch(a=1, b=0, size=4)\n
If no blocks match, this returns (alo, blo, 0).
\n\nChanged in version 2.6: This method returns a named tuple Match(a, b, size).
\nReturn list of triples describing matching subsequences. Each triple is of\nthe form (i, j, n), and means that a[i:i+n] == b[j:j+n]. The\ntriples are monotonically increasing in i and j.
\nThe last triple is a dummy, and has the value (len(a), len(b), 0). It\nis the only triple with n == 0. If (i, j, n) and (i', j', n')\nare adjacent triples in the list, and the second is not the last triple in\nthe list, then i+n != i' or j+n != j'; in other words, adjacent\ntriples always describe non-adjacent equal blocks.
\n\nChanged in version 2.5: The guarantee that adjacent triples always describe non-adjacent blocks\nwas implemented.
\n>>> s = SequenceMatcher(None, "abxcd", "abcd")\n>>> s.get_matching_blocks()\n[Match(a=0, b=0, size=2), Match(a=3, b=2, size=2), Match(a=5, b=4, size=0)]\n
Return list of 5-tuples describing how to turn a into b. Each tuple is\nof the form (tag, i1, i2, j1, j2). The first tuple has i1 == j1 ==\n0, and remaining tuples have i1 equal to the i2 from the preceding\ntuple, and, likewise, j1 equal to the previous j2.
\nThe tag values are strings, with these meanings:
Value | Meaning
---|---
'replace' | a[i1:i2] should be replaced by b[j1:j2].
'delete' | a[i1:i2] should be deleted. Note that j1 == j2 in this case.
'insert' | b[j1:j2] should be inserted at a[i1:i1]. Note that i1 == i2 in this case.
'equal' | a[i1:i2] == b[j1:j2] (the sub-sequences are equal).
For example:
>>> a = "qabxcd"
>>> b = "abycdf"
>>> s = SequenceMatcher(None, a, b)
>>> for tag, i1, i2, j1, j2 in s.get_opcodes():
...     print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
...            (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
 delete a[0:1] (q) b[0:0] ()
  equal a[1:3] (ab) b[0:2] (ab)
replace a[3:4] (x) b[2:3] (y)
  equal a[4:6] (cd) b[3:5] (cd)
 insert a[6:6] () b[5:6] (f)
Return a generator of groups with up to n lines of context.
\nStarting with the groups returned by get_opcodes(), this method\nsplits out smaller change clusters and eliminates intervening ranges which\nhave no changes.
\nThe groups are returned in the same format as get_opcodes().
\n\nNew in version 2.3.
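A short sketch of the clustering behaviour (the sequence contents are illustrative):

```python
from difflib import SequenceMatcher

# Two long sequences that differ in two widely separated places.
# get_grouped_opcodes() trims the long 'equal' stretches down to n
# elements of context and yields one group per change cluster.
a = [str(i) for i in range(1, 40)]
b = a[:]
b[8] = 'i'      # one change near the start
b[30] = 'x'     # and one far away
s = SequenceMatcher(None, a, b)
groups = list(s.get_grouped_opcodes(2))
```

With only two lines of context, the two changes are far enough apart that they come back as two separate groups rather than one group spanning the unchanged middle.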
\nReturn a measure of the sequences’ similarity as a float in the range [0,\n1].
\nWhere T is the total number of elements in both sequences, and M is the\nnumber of matches, this is 2.0*M / T. Note that this is 1.0 if the\nsequences are identical, and 0.0 if they have nothing in common.
\nThis is expensive to compute if get_matching_blocks() or\nget_opcodes() hasn’t already been called, in which case you may want\nto try quick_ratio() or real_quick_ratio() first to get an\nupper bound.
\nThe three methods that return the ratio of matching to total characters can give\ndifferent results due to differing levels of approximation, although\nquick_ratio() and real_quick_ratio() are always at least as large as\nratio():
\n>>> s = SequenceMatcher(None, "abcd", "bcde")\n>>> s.ratio()\n0.75\n>>> s.quick_ratio()\n0.75\n>>> s.real_quick_ratio()\n1.0\n
This example compares two strings, considering blanks to be “junk:”
\n>>> s = SequenceMatcher(lambda x: x == " ",\n... "private Thread currentThread;",\n... "private volatile Thread currentThread;")\n
ratio() returns a float in [0, 1], measuring the similarity of the\nsequences. As a rule of thumb, a ratio() value over 0.6 means the\nsequences are close matches:
\n>>> print round(s.ratio(), 3)\n0.866\n
If you’re only interested in where the sequences match,\nget_matching_blocks() is handy:
>>> for block in s.get_matching_blocks():
...     print "a[%d] and b[%d] match for %d elements" % block
a[0] and b[0] match for 8 elements
a[8] and b[17] match for 21 elements
a[29] and b[38] match for 0 elements
Note that the last tuple returned by get_matching_blocks() is always a\ndummy, (len(a), len(b), 0), and this is the only case in which the last\ntuple element (number of elements matched) is 0.
\nIf you want to know how to change the first sequence into the second, use\nget_opcodes():
>>> for opcode in s.get_opcodes():
...     print "%6s a[%d:%d] b[%d:%d]" % opcode
 equal a[0:8] b[0:8]
insert a[8:8] b[8:17]
 equal a[8:29] b[17:38]
See also
\nNote that Differ-generated deltas make no claim to be minimal\ndiffs. To the contrary, minimal diffs are often counter-intuitive, because they\nsynch up anywhere possible, sometimes accidental matches 100 pages apart.\nRestricting synch points to contiguous matches preserves some notion of\nlocality, at the occasional cost of producing a longer diff.
\nThe Differ class has this constructor:
\nOptional keyword parameters linejunk and charjunk are for filter functions\n(or None):
\nlinejunk: A function that accepts a single string argument, and returns true\nif the string is junk. The default is None, meaning that no line is\nconsidered junk.
\ncharjunk: A function that accepts a single character argument (a string of\nlength 1), and returns true if the character is junk. The default is None,\nmeaning that no character is considered junk.
\nDiffer objects are used (deltas generated) via a single method:
\nCompare two sequences of lines, and generate the delta (a sequence of lines).
\nEach sequence must contain individual single-line strings ending with newlines.\nSuch sequences can be obtained from the readlines() method of file-like\nobjects. The delta generated also consists of newline-terminated strings, ready\nto be printed as-is via the writelines() method of a file-like object.
\nThis example compares two texts. First we set up the texts, sequences of\nindividual single-line strings ending with newlines (such sequences can also be\nobtained from the readlines() method of file-like objects):
\n>>> text1 = ''' 1. Beautiful is better than ugly.\n... 2. Explicit is better than implicit.\n... 3. Simple is better than complex.\n... 4. Complex is better than complicated.\n... '''.splitlines(1)\n>>> len(text1)\n4\n>>> text1[0][-1]\n'\\n'\n>>> text2 = ''' 1. Beautiful is better than ugly.\n... 3. Simple is better than complex.\n... 4. Complicated is better than complex.\n... 5. Flat is better than nested.\n... '''.splitlines(1)\n
Next we instantiate a Differ object:
\n>>> d = Differ()\n
Note that when instantiating a Differ object we may pass functions to\nfilter out line and character “junk.” See the Differ() constructor for\ndetails.
\nFinally, we compare the two:
\n>>> result = list(d.compare(text1, text2))\n
result is a list of strings, so let’s pretty-print it:
\n>>> from pprint import pprint\n>>> pprint(result)\n[' 1. Beautiful is better than ugly.\\n',\n '- 2. Explicit is better than implicit.\\n',\n '- 3. Simple is better than complex.\\n',\n '+ 3. Simple is better than complex.\\n',\n '? ++\\n',\n '- 4. Complex is better than complicated.\\n',\n '? ^ ---- ^\\n',\n '+ 4. Complicated is better than complex.\\n',\n '? ++++ ^ ^\\n',\n '+ 5. Flat is better than nested.\\n']\n
As a single multi-line string it looks like this:
\n>>> import sys\n>>> sys.stdout.writelines(result)\n 1. Beautiful is better than ugly.\n- 2. Explicit is better than implicit.\n- 3. Simple is better than complex.\n+ 3. Simple is better than complex.\n? ++\n- 4. Complex is better than complicated.\n? ^ ---- ^\n+ 4. Complicated is better than complex.\n? ++++ ^ ^\n+ 5. Flat is better than nested.\n
This example shows how to use difflib to create a diff-like utility.\nIt is also contained in the Python source distribution, as\nTools/scripts/diff.py.
\n""" Command line interface to difflib.py providing diffs in four formats:\n\n* ndiff: lists every line and highlights interline changes.\n* context: highlights clusters of changes in a before/after format.\n* unified: highlights clusters of changes in an inline format.\n* html: generates side by side comparison with change highlights.\n\n"""\n\nimport sys, os, time, difflib, optparse\n\ndef main():\n # Configure the option parser\n usage = "usage: %prog [options] fromfile tofile"\n parser = optparse.OptionParser(usage)\n parser.add_option("-c", action="store_true", default=False,\n help='Produce a context format diff (default)')\n parser.add_option("-u", action="store_true", default=False,\n help='Produce a unified format diff')\n hlp = 'Produce HTML side by side diff (can use -c and -l in conjunction)'\n parser.add_option("-m", action="store_true", default=False, help=hlp)\n parser.add_option("-n", action="store_true", default=False,\n help='Produce a ndiff format diff')\n parser.add_option("-l", "--lines", type="int", default=3,\n help='Set number of context lines (default 3)')\n (options, args) = parser.parse_args()\n\n if len(args) == 0:\n parser.print_help()\n sys.exit(1)\n if len(args) != 2:\n parser.error("need to specify both a fromfile and tofile")\n\n n = options.lines\n fromfile, tofile = args # as specified in the usage string\n\n # we're passing these as arguments to the diff function\n fromdate = time.ctime(os.stat(fromfile).st_mtime)\n todate = time.ctime(os.stat(tofile).st_mtime)\n fromlines = open(fromfile, 'U').readlines()\n tolines = open(tofile, 'U').readlines()\n\n if options.u:\n diff = difflib.unified_diff(fromlines, tolines, fromfile, tofile,\n fromdate, todate, n=n)\n elif options.n:\n diff = difflib.ndiff(fromlines, tolines)\n elif options.m:\n diff = difflib.HtmlDiff().make_file(fromlines, tolines, fromfile,\n tofile, context=options.c,\n numlines=n)\n else:\n diff = difflib.context_diff(fromlines, tolines, fromfile, tofile,\n fromdate, 
todate, n=n)\n\n # we're using writelines because diff is a generator\n sys.stdout.writelines(diff)\n\nif __name__ == '__main__':\n main()\n
\nNew in version 2.3.
\nSource code: Lib/textwrap.py
\nThe textwrap module provides two convenience functions, wrap() and\nfill(), as well as TextWrapper, the class that does all the work,\nand a utility function dedent(). If you’re just wrapping or filling one\nor two text strings, the convenience functions should be good enough;\notherwise, you should use an instance of TextWrapper for efficiency.
\nWraps the single paragraph in text (a string) so every line is at most width\ncharacters long. Returns a list of output lines, without final newlines.
\nOptional keyword arguments correspond to the instance attributes of\nTextWrapper, documented below. width defaults to 70.
\nWraps the single paragraph in text, and returns a single string containing the\nwrapped paragraph. fill() is shorthand for
\n"\\n".join(wrap(text, ...))\n
In particular, fill() accepts exactly the same keyword arguments as\nwrap().
\nBoth wrap() and fill() work by creating a TextWrapper\ninstance and calling a single method on it. That instance is not reused, so for\napplications that wrap/fill many text strings, it will be more efficient for you\nto create your own TextWrapper object.
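For example, a small sketch of the two convenience functions (the sample text is illustrative):

```python
import textwrap

# wrap() returns the output lines; fill() returns the same lines
# already joined with newlines into a single string.
text = ("The textwrap module provides two convenience functions "
        "for wrapping and filling a single paragraph of text.")
lines = textwrap.wrap(text, width=30)
filled = textwrap.fill(text, width=30)
```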
\nText is preferably wrapped on whitespaces and right after the hyphens in\nhyphenated words; only then will long words be broken if necessary, unless\nTextWrapper.break_long_words is set to false.
\nAn additional utility function, dedent(), is provided to remove\nindentation from strings that have unwanted whitespace to the left of the text.
\nRemove any common leading whitespace from every line in text.
\nThis can be used to make triple-quoted strings line up with the left edge of the\ndisplay, while still presenting them in the source code in indented form.
\nNote that tabs and spaces are both treated as whitespace, but they are not\nequal: the lines " hello" and "\\thello" are considered to have no\ncommon leading whitespace. (This behaviour is new in Python 2.5; older versions\nof this module incorrectly expanded tabs before searching for common leading\nwhitespace.)
\nFor example:
\ndef test():\n # end first line with \\ to avoid the empty line!\n s = '''\\\n hello\n world\n '''\n print repr(s) # prints ' hello\\n world\\n '\n print repr(dedent(s)) # prints 'hello\\n world\\n'\n
The TextWrapper constructor accepts a number of optional keyword\narguments. Each argument corresponds to one instance attribute, so for example
\nwrapper = TextWrapper(initial_indent="* ")\n
is the same as
\nwrapper = TextWrapper()\nwrapper.initial_indent = "* "\n
You can re-use the same TextWrapper object many times, and you can\nchange any of its options through direct assignment to instance attributes\nbetween uses.
\nThe TextWrapper instance attributes (and keyword arguments to the\nconstructor) are as follows:
\n(default: True) If true, each whitespace character (as defined by\nstring.whitespace) remaining after tab expansion will be replaced by a\nsingle space.
\nNote
\nIf expand_tabs is false and replace_whitespace is true,\neach tab character will be replaced by a single space, which is not\nthe same as tab expansion.
\nNote
\nIf replace_whitespace is false, newlines may appear in the\nmiddle of a line and cause strange output. For this reason, text should\nbe split into paragraphs (using str.splitlines() or similar)\nwhich are wrapped separately.
\n(default: True) If true, whitespace that, after wrapping, happens to\nend up at the beginning or end of a line is dropped (leading whitespace in\nthe first line is always preserved, though).
\n\nNew in version 2.6: Whitespace was always dropped in earlier versions.
(default: False) If true, TextWrapper attempts to detect sentence endings and ensure that sentences are always separated by exactly two spaces. This is generally desired for text in a monospaced font. However, the sentence detection algorithm is imperfect: it assumes that a sentence ending consists of a lowercase letter followed by one of '.', '!', or '?', possibly followed by one of '"' or "'", followed by a space. One problem with this algorithm is that it is unable to detect the difference between "Dr." in
\n[...] Dr. Frankenstein's monster [...]
\nand “Spot.” in
\n[...] See Spot. See Spot run [...]
\nfix_sentence_endings is false by default.
\nSince the sentence detection algorithm relies on string.lowercase for\nthe definition of “lowercase letter,” and a convention of using two spaces\nafter a period to separate sentences on the same line, it is specific to\nEnglish-language texts.
(default: True) If true, wrapping will occur preferably on whitespace and right after hyphens in compound words, as is customary in English. If false, only whitespace will be considered as potentially good places for line breaks, but you need to set break_long_words to false if you want truly unbreakable words. Default behaviour in previous versions was to always allow breaking hyphenated words.
\n\nNew in version 2.6.
\nTextWrapper also provides two public methods, analogous to the\nmodule-level convenience functions:
\nThis module provides access to the Unicode Character Database which defines\ncharacter properties for all Unicode characters. The data in this database is\nbased on the UnicodeData.txt file version 5.2.0 which is publicly\navailable from ftp://ftp.unicode.org/.
\nThe module uses the same names and symbols as defined by the UnicodeData File\nFormat 5.2.0 (see http://www.unicode.org/reports/tr44/tr44-4.html).\nIt defines the following functions:
Returns the East Asian width assigned to the Unicode character unichr as a string.
\n\nNew in version 2.4.
\nReturn the normal form form for the Unicode string unistr. Valid values for\nform are ‘NFC’, ‘NFKC’, ‘NFD’, and ‘NFKD’.
The Unicode standard defines various normalization forms of a Unicode string, based on the definition of canonical equivalence and compatibility equivalence. In Unicode, several characters can be expressed in various ways. For example, the character U+00C7 (LATIN CAPITAL LETTER C WITH CEDILLA) can also be expressed as the sequence U+0043 (LATIN CAPITAL LETTER C) U+0327 (COMBINING CEDILLA).
\nFor each character, there are two normal forms: normal form C and normal form D.\nNormal form D (NFD) is also known as canonical decomposition, and translates\neach character into its decomposed form. Normal form C (NFC) first applies a\ncanonical decomposition, then composes pre-combined characters again.
\nIn addition to these two forms, there are two additional normal forms based on\ncompatibility equivalence. In Unicode, certain characters are supported which\nnormally would be unified with other characters. For example, U+2160 (ROMAN\nNUMERAL ONE) is really the same thing as U+0049 (LATIN CAPITAL LETTER I).\nHowever, it is supported in Unicode for compatibility with existing character\nsets (e.g. gb2312).
\nThe normal form KD (NFKD) will apply the compatibility decomposition, i.e.\nreplace all compatibility characters with their equivalents. The normal form KC\n(NFKC) first applies the compatibility decomposition, followed by the canonical\ncomposition.
\nEven if two unicode strings are normalized and look the same to\na human reader, if one has combining characters and the other\ndoesn’t, they may not compare equal.
\n\nNew in version 2.3.
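A small sketch of the comparison pitfall described above, using the C-with-cedilla example:

```python
import unicodedata

# U+00C7 and the two-character sequence U+0043 U+0327 render the
# same but compare unequal until both are normalized.
composed = u'\u00c7'       # LATIN CAPITAL LETTER C WITH CEDILLA
decomposed = u'C\u0327'    # LATIN CAPITAL LETTER C + COMBINING CEDILLA

nfc = unicodedata.normalize('NFC', decomposed)  # re-composes
nfd = unicodedata.normalize('NFD', composed)    # decomposes
```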
\nIn addition, the module exposes the following constant:
\nThe version of the Unicode database used in this module.
\n\nNew in version 2.3.
\nThis is an object that has the same methods as the entire module, but uses the\nUnicode database version 3.2 instead, for applications that require this\nspecific version of the Unicode database (such as IDNA).
\n\nNew in version 2.5.
\nExamples:
\n>>> import unicodedata\n>>> unicodedata.lookup('LEFT CURLY BRACKET')\nu'{'\n>>> unicodedata.name(u'/')\n'SOLIDUS'\n>>> unicodedata.decimal(u'9')\n9\n>>> unicodedata.decimal(u'a')\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: not a decimal\n>>> unicodedata.category(u'A') # 'L'etter, 'u'ppercase\n'Lu'\n>>> unicodedata.bidirectional(u'\\u0660') # 'A'rabic, 'N'umber\n'AN'\n
\nNew in version 2.3.
When identifying things (such as host names) on the internet, it is often necessary to compare such identifications for "equality". Exactly how this comparison is executed may depend on the application domain, e.g. whether it should be case-insensitive or not. It may also be necessary to restrict the possible identifications, to allow only identifications consisting of "printable" characters.
\nRFC 3454 defines a procedure for “preparing” Unicode strings in internet\nprotocols. Before passing strings onto the wire, they are processed with the\npreparation procedure, after which they have a certain normalized form. The RFC\ndefines a set of tables, which can be combined into profiles. Each profile must\ndefine which tables it uses, and what other optional parts of the stringprep\nprocedure are part of the profile. One example of a stringprep profile is\nnameprep, which is used for internationalized domain names.
The module stringprep only exposes the tables from RFC 3454. As these tables would be very large to represent as dictionaries or lists, the module uses the Unicode character database internally. The module source code itself was generated using the mkstringprep.py utility.
\nAs a result, these tables are exposed as functions, not as data structures.\nThere are two kinds of tables in the RFC: sets and mappings. For a set,\nstringprep provides the “characteristic function”, i.e. a function that\nreturns true if the parameter is part of the set. For mappings, it provides the\nmapping function: given the key, it returns the associated value. Below is a\nlist of all functions available in the module.
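For example, a brief sketch of the two kinds of tables (the particular characters are illustrative):

```python
import stringprep

# Sets are exposed as membership tests, mappings as functions.
# Table C.1.1 is the set containing the ASCII space character;
# table B.2 maps characters for case-folding with NFKC; table A.1
# is the set of *unassigned* code points.
space_in_c11 = stringprep.in_table_c11(u' ')
folded = stringprep.map_table_b2(u'A')
unassigned = stringprep.in_table_a1(u'a')
```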
\nThis module defines base classes for standard Python codecs (encoders and\ndecoders) and provides access to the internal Python codec registry which\nmanages the codec and error handling lookup process.
\nIt defines the following functions:
\nRegister a codec search function. Search functions are expected to take one\nargument, the encoding name in all lower case letters, and return a\nCodecInfo object having the following attributes:
\nThe various functions or classes take the following arguments:
\nencode and decode: These must be functions or methods which have the same\ninterface as the encode()/decode() methods of Codec instances (see\nCodec Interface). The functions/methods are expected to work in a stateless\nmode.
\nincrementalencoder and incrementaldecoder: These have to be factory\nfunctions providing the following interface:
\n\nfactory(errors='strict')\n
The factory functions must return objects providing the interfaces defined by\nthe base classes IncrementalEncoder and IncrementalDecoder,\nrespectively. Incremental codecs can maintain state.
\nstreamreader and streamwriter: These have to be factory functions providing\nthe following interface:
\n\nfactory(stream, errors='strict')\n
The factory functions must return objects providing the interfaces defined by\nthe base classes StreamWriter and StreamReader, respectively.\nStream codecs can maintain state.
\nPossible values for errors are
\nas well as any other error handling name defined via register_error().
\nIn case a search function cannot find a given encoding, it should return\nNone.
\nLooks up the codec info in the Python codec registry and returns a\nCodecInfo object as defined above.
\nEncodings are first looked up in the registry’s cache. If not found, the list of\nregistered search functions is scanned. If no CodecInfo object is\nfound, a LookupError is raised. Otherwise, the CodecInfo object\nis stored in the cache and returned to the caller.
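A minimal sketch of a lookup and the returned object's attributes:

```python
import codecs

# lookup() normalizes the encoding name and returns a CodecInfo
# bundling the stateless functions described above; each of them
# returns a (result, length consumed) pair.
info = codecs.lookup('UTF-8')
encoded, consumed = info.encode(u'abc')
decoded, used = info.decode(encoded)
```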
\nTo simplify access to the various codecs, the module provides these additional\nfunctions which use lookup() for the codec lookup:
\nLook up the codec for the given encoding and return its encoder function.
\nRaises a LookupError in case the encoding cannot be found.
\nLook up the codec for the given encoding and return its decoder function.
\nRaises a LookupError in case the encoding cannot be found.
\nLook up the codec for the given encoding and return its incremental encoder\nclass or factory function.
\nRaises a LookupError in case the encoding cannot be found or the codec\ndoesn’t support an incremental encoder.
\n\nNew in version 2.5.
\nLook up the codec for the given encoding and return its incremental decoder\nclass or factory function.
\nRaises a LookupError in case the encoding cannot be found or the codec\ndoesn’t support an incremental decoder.
\n\nNew in version 2.5.
\nLook up the codec for the given encoding and return its StreamReader class or\nfactory function.
\nRaises a LookupError in case the encoding cannot be found.
\nLook up the codec for the given encoding and return its StreamWriter class or\nfactory function.
\nRaises a LookupError in case the encoding cannot be found.
\nRegister the error handling function error_handler under the name name.\nerror_handler will be called during encoding and decoding in case of an error,\nwhen name is specified as the errors parameter.
\nFor encoding error_handler will be called with a UnicodeEncodeError\ninstance, which contains information about the location of the error. The error\nhandler must either raise this or a different exception or return a tuple with a\nreplacement for the unencodable part of the input and a position where encoding\nshould continue. The encoder will encode the replacement and continue encoding\nthe original input at the specified position. Negative position values will be\ntreated as being relative to the end of the input string. If the resulting\nposition is out of bound an IndexError will be raised.
Decoding and translating work similarly, except that UnicodeDecodeError or UnicodeTranslateError will be passed to the handler, and the replacement from the error handler will be put into the output directly.
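For example, a sketch of a custom encoding handler (the name 'hexreplace' and its behaviour are purely illustrative):

```python
import codecs

# A hypothetical handler: replace each unencodable character with
# '<U+XXXX>' and resume encoding after the failed run.
def hexreplace(exc):
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = u''.join(u'<U+%04X>' % ord(ch) for ch in bad)
    # (replacement, position at which to continue encoding)
    return replacement, exc.end

codecs.register_error('hexreplace', hexreplace)
result = u'abc\u20acdef'.encode('ascii', 'hexreplace')
```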
\nReturn the error handler previously registered under the name name.
\nRaises a LookupError in case the handler cannot be found.
To simplify working with encoded files or streams, the module also defines these utility functions:
\nOpen an encoded file using the given mode and return a wrapped version\nproviding transparent encoding/decoding. The default file mode is 'r'\nmeaning to open the file in read mode.
\nNote
\nThe wrapped version will only accept the object format defined by the codecs,\ni.e. Unicode objects for most built-in codecs. Output is also codec-dependent\nand will usually be Unicode as well.
\nNote
\nFiles are always opened in binary mode, even if no binary mode was\nspecified. This is done to avoid data loss due to encodings using 8-bit\nvalues. This means that no automatic conversion of '\\n' is done\non reading and writing.
\nencoding specifies the encoding which is to be used for the file.
\nerrors may be given to define the error handling. It defaults to 'strict'\nwhich causes a ValueError to be raised in case an encoding error occurs.
\nbuffering has the same meaning as for the built-in open() function. It\ndefaults to line buffered.
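For example, a sketch of a round trip through an encoded file (the path is illustrative):

```python
import codecs
import os
import tempfile

# codecs.open() encodes on write and decodes on read, always
# operating on the underlying file in binary mode.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

f = codecs.open(path, 'w', encoding='utf-8')
f.write(u'caf\u00e9\n')
f.close()

f = codecs.open(path, 'r', encoding='utf-8')
text = f.read()
f.close()
```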
\nReturn a wrapped version of file which provides transparent encoding\ntranslation.
\nStrings written to the wrapped file are interpreted according to the given\ninput encoding and then written to the original file as strings using the\noutput encoding. The intermediate encoding will usually be Unicode but depends\non the specified codecs.
\nIf output is not given, it defaults to input.
\nerrors may be given to define the error handling. It defaults to 'strict',\nwhich causes ValueError to be raised in case an encoding error occurs.
\nUses an incremental encoder to iteratively encode the input provided by\niterable. This function is a generator. errors (as well as any\nother keyword argument) is passed through to the incremental encoder.
\n\nNew in version 2.5.
\nUses an incremental decoder to iteratively decode the input provided by\niterable. This function is a generator. errors (as well as any\nother keyword argument) is passed through to the incremental decoder.
\n\nNew in version 2.5.
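A short sketch of the two generators working as a pair (the chunks are illustrative):

```python
import codecs

# iterencode() lazily encodes chunks pulled from an iterable;
# iterdecode() reverses the process on the resulting byte chunks.
chunks = [u'spam', u'\u00e9ggs']
pieces = list(codecs.iterencode(chunks, 'utf-8'))
decoded = u''.join(codecs.iterdecode(pieces, 'utf-8'))
```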
\nThe module also provides the following constants which are useful for reading\nand writing to platform dependent files:
\nThe codecs module defines a set of base classes which define the\ninterface and can also be used to easily write your own codecs for use in\nPython.
\nEach codec has to define four interfaces to make it usable as codec in Python:\nstateless encoder, stateless decoder, stream reader and stream writer. The\nstream reader and writers typically reuse the stateless encoder/decoder to\nimplement the file protocols.
\nThe Codec class defines the interface for stateless encoders/decoders.
\nTo simplify and standardize error handling, the encode() and\ndecode() methods may implement different error handling schemes by\nproviding the errors string argument. The following string values are defined\nand implemented by all standard Python codecs:
\nValue | \nMeaning | \n
---|---|
'strict' | \nRaise UnicodeError (or a subclass);\nthis is the default. | \n
'ignore' | \nIgnore the character and continue with the\nnext. | \n
'replace' | \nReplace with a suitable replacement\ncharacter; Python will use the official\nU+FFFD REPLACEMENT CHARACTER for the built-in\nUnicode codecs on decoding and ‘?’ on\nencoding. | \n
'xmlcharrefreplace' | \nReplace with the appropriate XML character\nreference (only for encoding). | \n
'backslashreplace' | \nReplace with backslashed escape sequences\n(only for encoding). | \n
The set of allowed values can be extended via register_error().
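The effect of each handler can be seen with the ASCII codec (a sketch; U+20AC, the EURO SIGN, is not representable in ASCII):

```python
u = u'abc\u20ac'

assert u.encode('ascii', 'ignore') == b'abc'
assert u.encode('ascii', 'replace') == b'abc?'
assert u.encode('ascii', 'xmlcharrefreplace') == b'abc&#8364;'
assert u.encode('ascii', 'backslashreplace') == b'abc\\u20ac'

# On decoding, 'replace' substitutes U+FFFD REPLACEMENT CHARACTER:
assert b'ab\xff'.decode('ascii', 'replace') == u'ab\ufffd'

try:
    u.encode('ascii')              # 'strict' is the default
except UnicodeEncodeError:
    pass                           # a subclass of UnicodeError
```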
\nThe Codec class defines these methods which also define the function\ninterfaces of the stateless encoder and decoder:
\nEncodes the object input and returns a tuple (output object, length consumed).\nWhile codecs are not restricted to use with Unicode, in a Unicode context,\nencoding converts a Unicode object to a plain string using a particular\ncharacter set encoding (e.g., cp1252 or iso-8859-1).
\nerrors defines the error handling to apply. It defaults to 'strict'\nhandling.
\nThe method may not store state in the Codec instance. Use\nStreamCodec for codecs which have to keep state in order to make\nencoding/decoding efficient.
\nThe encoder must be able to handle zero length input and return an empty object\nof the output object type in this situation.
\nDecodes the object input and returns a tuple (output object, length consumed).\nIn a Unicode context, decoding converts a plain string encoded using a\nparticular character set encoding to a Unicode object.
\ninput must be an object which provides the bf_getreadbuf buffer slot.\nPython strings, buffer objects and memory mapped files are examples of objects\nproviding this slot.
\nerrors defines the error handling to apply. It defaults to 'strict'\nhandling.
\nThe method may not store state in the Codec instance. Use\nStreamCodec for codecs which have to keep state in order to make\nencoding/decoding efficient.
\nThe decoder must be able to handle zero length input and return an empty object\nof the output object type in this situation.
\nThe IncrementalEncoder and IncrementalDecoder classes provide\nthe basic interface for incremental encoding and decoding. Encoding/decoding the\ninput isn’t done with one call to the stateless encoder/decoder function, but\nwith multiple calls to the encode()/decode() method of the\nincremental encoder/decoder. The incremental encoder/decoder keeps track of the\nencoding/decoding process during method calls.
\nThe joined output of calls to the encode()/decode() method is the\nsame as if all the single inputs were joined into one, and this input was\nencoded/decoded with the stateless encoder/decoder.
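For example, with the stateful UTF-16 codec the BOM is emitted only on the first encode() call, yet the joined output still equals one stateless encode (a sketch):

```python
import codecs

enc = codecs.getincrementalencoder('utf-16')()
part1 = enc.encode(u'ab')   # includes the BOM
part2 = enc.encode(u'cd')   # no BOM: state is kept between calls

assert part1 + part2 == u'abcd'.encode('utf-16')

dec = codecs.getincrementaldecoder('utf-16')()
assert dec.decode(part1) + dec.decode(part2, final=True) == u'abcd'
```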
\n\nNew in version 2.5.
\nThe IncrementalEncoder class is used for encoding an input in multiple\nsteps. It defines the following methods which every incremental encoder must\ndefine in order to be compatible with the Python codec registry.
\nConstructor for an IncrementalEncoder instance.
\nAll incremental encoders must provide this constructor interface. They are free\nto add additional keyword arguments, but only the ones defined here are used by\nthe Python codec registry.
\nThe IncrementalEncoder may implement different error handling schemes\nby providing the errors keyword argument. These parameters are predefined:
\nThe errors argument will be assigned to an attribute of the same name.\nAssigning to this attribute makes it possible to switch between different error\nhandling strategies during the lifetime of the IncrementalEncoder\nobject.
\nThe set of allowed values for the errors argument can be extended with\nregister_error().
\nThe IncrementalDecoder class is used for decoding an input in multiple\nsteps. It defines the following methods which every incremental decoder must\ndefine in order to be compatible with the Python codec registry.
\nConstructor for an IncrementalDecoder instance.
\nAll incremental decoders must provide this constructor interface. They are free\nto add additional keyword arguments, but only the ones defined here are used by\nthe Python codec registry.
\nThe IncrementalDecoder may implement different error handling schemes\nby providing the errors keyword argument. These parameters are predefined:
\nThe errors argument will be assigned to an attribute of the same name.\nAssigning to this attribute makes it possible to switch between different error\nhandling strategies during the lifetime of the IncrementalDecoder\nobject.
\nThe set of allowed values for the errors argument can be extended with\nregister_error().
\nThe StreamWriter and StreamReader classes provide generic\nworking interfaces which can be used to implement new encoding submodules very\neasily. See encodings.utf_8 for an example of how this is done.
\nThe StreamWriter class is a subclass of Codec and defines the\nfollowing methods which every stream writer must define in order to be\ncompatible with the Python codec registry.
\nConstructor for a StreamWriter instance.
\nAll stream writers must provide this constructor interface. They are free to add\nadditional keyword arguments, but only the ones defined here are used by the\nPython codec registry.
\nstream must be a file-like object open for writing binary data.
\nThe StreamWriter may implement different error handling schemes by\nproviding the errors keyword argument. These parameters are predefined:
\nThe errors argument will be assigned to an attribute of the same name.\nAssigning to this attribute makes it possible to switch between different error\nhandling strategies during the lifetime of the StreamWriter object.
\nThe set of allowed values for the errors argument can be extended with\nregister_error().
\nFlushes and resets the codec buffers used for keeping state.
\nCalling this method should ensure that the data on the output is put into\na clean state that allows appending of new fresh data without having to\nrescan the whole stream to recover state.
\nIn addition to the above methods, the StreamWriter must also inherit\nall other methods and attributes from the underlying stream.
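A stream writer for any registered encoding can be obtained with codecs.getwriter(); a sketch using io.BytesIO in place of a real file:

```python
import codecs
import io

backing = io.BytesIO()
writer = codecs.getwriter('utf-8')(backing)   # a StreamWriter subclass

writer.write(u'caf\xe9')
assert backing.getvalue() == b'caf\xc3\xa9'

# Attribute access falls through to the underlying stream:
assert writer.getvalue() == backing.getvalue()
```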
\nThe StreamReader class is a subclass of Codec and defines the\nfollowing methods which every stream reader must define in order to be\ncompatible with the Python codec registry.
\nConstructor for a StreamReader instance.
\nAll stream readers must provide this constructor interface. They are free to add\nadditional keyword arguments, but only the ones defined here are used by the\nPython codec registry.
\nstream must be a file-like object open for reading (binary) data.
\nThe StreamReader may implement different error handling schemes by\nproviding the errors keyword argument. These parameters are predefined:
\nThe errors argument will be assigned to an attribute of the same name.\nAssigning to this attribute makes it possible to switch between different error\nhandling strategies during the lifetime of the StreamReader object.
\nThe set of allowed values for the errors argument can be extended with\nregister_error().
\nDecodes data from the stream and returns the resulting object.
\nchars indicates the number of characters to read from the\nstream. read() will never return more than chars characters, but\nit might return fewer, if there are not enough characters available.
\nsize indicates the approximate maximum number of bytes to read from the\nstream for decoding purposes. The decoder can modify this setting as\nappropriate. The default value -1 indicates to read and decode as much as\npossible. size is intended to prevent having to decode huge files in\none step.
\nfirstline indicates that it would be sufficient to only return the first\nline, if there are decoding errors on later lines.
\nThe method should use a greedy read strategy meaning that it should read\nas much data as is allowed within the definition of the encoding and the\ngiven size, e.g. if optional encoding endings or state markers are\navailable on the stream, these should be read too.
\n\nChanged in version 2.4: chars argument added.
\n\nChanged in version 2.4.2: firstline argument added.
\nRead one line from the input stream and return the decoded data.
\nsize, if given, is passed as size argument to the stream’s\nreadline() method.
\nIf keepends is false line-endings will be stripped from the lines\nreturned.
\n\nChanged in version 2.4: keepends argument added.
\nRead all lines available on the input stream and return them as a list of\nlines.
\nLine-endings are implemented using the codec’s decoder method and are\nincluded in the list entries if keepends is true.
\nsizehint, if given, is passed as the size argument to the stream’s\nread() method.
\nResets the codec buffers used for keeping state.
\nNote that no stream repositioning should take place. This method is\nprimarily intended to be able to recover from decoding errors.
\nIn addition to the above methods, the StreamReader must also inherit\nall other methods and attributes from the underlying stream.
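Likewise, codecs.getreader() returns a StreamReader class for a given encoding; a sketch:

```python
import codecs
import io

raw = io.BytesIO(u'first caf\xe9\nsecond line'.encode('utf-8'))
reader = codecs.getreader('utf-8')(raw)   # a StreamReader subclass

assert reader.readline() == u'first caf\xe9\n'   # keepends defaults to true
assert reader.read() == u'second line'
```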
\nThe next two base classes are included for convenience. They are not needed by\nthe codec registry, but may prove useful in practice.
\nThe StreamReaderWriter allows wrapping streams which work in both read\nand write modes.
\nThe design is such that one can use the factory functions returned by the\nlookup() function to construct the instance.
\nStreamReaderWriter instances define the combined interfaces of\nStreamReader and StreamWriter classes. They inherit all other\nmethods and attributes from the underlying stream.
\nThe StreamRecoder class provides a frontend - backend view of encoding data,\nwhich is sometimes useful when dealing with different encoding environments.
\nThe design is such that one can use the factory functions returned by the\nlookup() function to construct the instance.
\nCreates a StreamRecoder instance which implements a two-way conversion:\nencode and decode work on the frontend (the input to read() and output\nof write()) while Reader and Writer work on the backend (reading and\nwriting to the stream).
\nYou can use these objects to do transparent direct recodings from e.g. Latin-1\nto UTF-8 and back.
\nstream must be a file-like object.
\nencode, decode must adhere to the Codec interface. Reader,\nWriter must be factory functions or classes providing objects of the\nStreamReader and StreamWriter interface respectively.
\nencode and decode are needed for the frontend translation, Reader and\nWriter for the backend translation. The intermediate format used is\ndetermined by the two sets of codecs, e.g. the Unicode codecs will use Unicode\nas the intermediate encoding.
\nError handling is done in the same way as defined for the stream readers and\nwriters.
\nStreamRecoder instances define the combined interfaces of\nStreamReader and StreamWriter classes. They inherit all other\nmethods and attributes from the underlying stream.
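A sketch of a StreamRecoder built by hand, recoding a Latin-1 backend stream to a UTF-8 frontend:

```python
import codecs
import io

# The backend stream holds Latin-1 bytes; the frontend speaks UTF-8.
backend = io.BytesIO(b'caf\xe9')
recoder = codecs.StreamRecoder(
    backend,
    codecs.getencoder('utf-8'),    # frontend encode (used by read())
    codecs.getdecoder('utf-8'),    # frontend decode (used by write())
    codecs.getreader('latin-1'),   # backend Reader
    codecs.getwriter('latin-1'))   # backend Writer

assert recoder.read() == b'caf\xc3\xa9'   # Latin-1 in, UTF-8 out
```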
\nUnicode strings are stored internally as sequences of codepoints (to be precise\nas Py_UNICODE arrays). Depending on the way Python is compiled (either\nvia --enable-unicode=ucs2 or --enable-unicode=ucs4, with the\nformer being the default) Py_UNICODE is either a 16-bit or 32-bit data\ntype. Once a Unicode object is used outside of CPU and memory, CPU endianness\nand how these arrays are stored as bytes become an issue. Transforming a\nunicode object into a sequence of bytes is called encoding and recreating the\nunicode object from the sequence of bytes is known as decoding. There are many\ndifferent methods for how this transformation can be done (these methods are\nalso called encodings). The simplest method is to map the codepoints 0-255 to\nthe bytes 0x0-0xff. This means that a unicode object that contains\ncodepoints above U+00FF can’t be encoded with this method (which is called\n'latin-1' or 'iso-8859-1'). unicode.encode() will raise a\nUnicodeEncodeError that looks like this: UnicodeEncodeError: 'latin-1'\ncodec can't encode character u'\\u1234' in position 3: ordinal not in\nrange(256).
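The failure mode described above is easy to reproduce (a sketch):

```python
try:
    u'abc\u1234'.encode('latin-1')
except UnicodeEncodeError as exc:
    assert exc.encoding == 'latin-1'
    assert exc.start == 3          # position of u'\u1234' in the input
else:
    raise AssertionError('expected UnicodeEncodeError')
```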
\nThere’s another group of encodings (the so called charmap encodings) that choose\na different subset of all unicode code points and how these codepoints are\nmapped to the bytes 0x0-0xff. To see how this is done simply open\ne.g. encodings/cp1252.py (which is an encoding that is used primarily on\nWindows). There’s a string constant with 256 characters that shows you which\ncharacter is mapped to which byte value.
\nAll of these encodings can only encode 256 of the 1114112 codepoints\ndefined in unicode. A simple and straightforward way that can store each Unicode\ncode point, is to store each codepoint as four consecutive bytes. There are two\npossibilities: store the bytes in big endian or in little endian order. These\ntwo encodings are called UTF-32-BE and UTF-32-LE respectively. Their\ndisadvantage is that if e.g. you use UTF-32-BE on a little endian machine you\nwill always have to swap bytes on encoding and decoding. UTF-32 avoids this\nproblem: bytes will always be in natural endianness. When these bytes are read\nby a CPU with a different endianness, then bytes have to be swapped though. To\nbe able to detect the endianness of a UTF-16 or UTF-32 byte sequence,\nthere’s the so called BOM (“Byte Order Mark”). This is the Unicode character\nU+FEFF. This character can be prepended to every UTF-16 or UTF-32\nbyte sequence. The byte swapped version of this character (0xFFFE) is an\nillegal character that may not appear in a Unicode text. So when the\nfirst character in an UTF-16 or UTF-32 byte sequence\nappears to be a U+FFFE the bytes have to be swapped on decoding.\nUnfortunately the character U+FEFF had a second purpose as\na ZERO WIDTH NO-BREAK SPACE: a character that has no width and doesn’t allow\na word to be split. It can e.g. be used to give hints to a ligature algorithm.\nWith Unicode 4.0 using U+FEFF as a ZERO WIDTH NO-BREAK SPACE has been\ndeprecated (with U+2060 (WORD JOINER) assuming this role). Nevertheless\nUnicode software still must be able to handle U+FEFF in both roles: as a BOM\nit’s a device to determine the storage layout of the encoded bytes, and vanishes\nonce the byte sequence has been decoded into a Unicode string; as a ZERO WIDTH\nNO-BREAK SPACE it’s a normal character that will be decoded like any other.
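The BOM handling can be observed directly (a sketch):

```python
import codecs

# The endianness-specific codecs never write a BOM:
assert u'a'.encode('utf-16-le') == b'a\x00'
assert u'a'.encode('utf-16-be') == b'\x00a'

# Plain utf-16 prepends one so a decoder can detect the byte order:
data = u'a'.encode('utf-16')
assert data[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
assert data.decode('utf-16') == u'a'   # the BOM vanishes on decoding
```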
\nThere’s another encoding that is able to encode the full range of Unicode\ncharacters: UTF-8. UTF-8 is an 8-bit encoding, which means there are no issues\nwith byte order in UTF-8. Each byte in a UTF-8 byte sequence consists of two\nparts: marker bits (the most significant bits) and payload bits. The marker bits\nare a sequence of zero to four 1 bits followed by a 0 bit. Unicode characters are\nencoded like this (with x being payload bits, which when concatenated give the\nUnicode character):
\nRange | \nEncoding | \n
---|---|
U-00000000 ... U-0000007F | \n0xxxxxxx | \n
U-00000080 ... U-000007FF | \n110xxxxx 10xxxxxx | \n
U-00000800 ... U-0000FFFF | \n1110xxxx 10xxxxxx 10xxxxxx | \n
U-00010000 ... U-0010FFFF | \n11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | \n
The least significant bit of the Unicode character is the rightmost x bit.
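The three-byte row of the table can be verified by hand, e.g. for U+20AC (EURO SIGN), which falls in the U-00000800 ... U-0000FFFF range:

```python
cp = 0x20AC  # EURO SIGN

manual = bytes([0b11100000 | (cp >> 12),          # 1110xxxx
                0b10000000 | ((cp >> 6) & 0x3F),  # 10xxxxxx
                0b10000000 | (cp & 0x3F)])        # 10xxxxxx

assert manual == u'\u20ac'.encode('utf-8') == b'\xe2\x82\xac'
```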
\nAs UTF-8 is an 8-bit encoding no BOM is required and any U+FEFF character in\nthe decoded Unicode string (even if it’s the first character) is treated as a\nZERO WIDTH NO-BREAK SPACE.
\nWithout external information it’s impossible to reliably determine which\nencoding was used for encoding a Unicode string. Each charmap encoding can\ndecode any random byte sequence. However that’s not possible with UTF-8, as\nUTF-8 byte sequences have a structure that doesn’t allow arbitrary byte\nsequences. To increase the reliability with which a UTF-8 encoding can be\ndetected, Microsoft invented a variant of UTF-8 (that Python 2.5 calls\n"utf-8-sig") for its Notepad program: Before any of the Unicode characters\nis written to the file, a UTF-8 encoded BOM (which looks like this as a byte\nsequence: 0xef, 0xbb, 0xbf) is written. As it’s rather improbable\nthat any charmap encoded file starts with these byte values (which would e.g.\nmap to
\nLATIN SMALL LETTER I WITH DIAERESIS,\nRIGHT-POINTING DOUBLE ANGLE QUOTATION MARK and\nINVERTED QUESTION MARK\n
in iso-8859-1), this increases the probability that a utf-8-sig encoding can be\ncorrectly guessed from the byte sequence. So here the BOM is not used to be able\nto determine the byte order used for generating the byte sequence, but as a\nsignature that helps in guessing the encoding. On encoding the utf-8-sig codec\nwill write 0xef, 0xbb, 0xbf as the first three bytes to the file. On\ndecoding utf-8-sig will skip those three bytes if they appear as the first\nthree bytes in the file. In UTF-8, the use of the BOM is discouraged and\nshould generally be avoided.
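A short sketch of the utf-8-sig behaviour:

```python
data = u'abc'.encode('utf-8-sig')
assert data == b'\xef\xbb\xbf' + b'abc'      # signature written first

# On decoding, utf-8-sig skips a leading BOM; plain utf-8 keeps it:
assert data.decode('utf-8-sig') == u'abc'
assert data.decode('utf-8') == u'\ufeffabc'
```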
\nPython comes with a number of codecs built-in, either implemented as C functions\nor with dictionaries as mapping tables. The following table lists the codecs by\nname, together with a few common aliases, and the languages for which the\nencoding is likely used. Neither the list of aliases nor the list of languages\nis meant to be exhaustive. Notice that spelling alternatives that only differ in\ncase or use a hyphen instead of an underscore are also valid aliases; therefore,\ne.g. 'utf-8' is a valid alias for the 'utf_8' codec.
\nMany of the character sets support the same languages. They vary in individual\ncharacters (e.g. whether the EURO SIGN is supported or not), and in the\nassignment of characters to code positions. For the European languages in\nparticular, the following variants typically exist:
\nCodec | \nAliases | \nLanguages | \n
---|---|---|
ascii | \n646, us-ascii | \nEnglish | \n
big5 | \nbig5-tw, csbig5 | \nTraditional Chinese | \n
big5hkscs | \nbig5-hkscs, hkscs | \nTraditional Chinese | \n
cp037 | \nIBM037, IBM039 | \nEnglish | \n
cp424 | \nEBCDIC-CP-HE, IBM424 | \nHebrew | \n
cp437 | \n437, IBM437 | \nEnglish | \n
cp500 | \nEBCDIC-CP-BE, EBCDIC-CP-CH,\nIBM500 | \nWestern Europe | \n
cp720 | \n\n | Arabic | \n
cp737 | \n\n | Greek | \n
cp775 | \nIBM775 | \nBaltic languages | \n
cp850 | \n850, IBM850 | \nWestern Europe | \n
cp852 | \n852, IBM852 | \nCentral and Eastern Europe | \n
cp855 | \n855, IBM855 | \nBulgarian, Byelorussian,\nMacedonian, Russian, Serbian | \n
cp856 | \n\n | Hebrew | \n
cp857 | \n857, IBM857 | \nTurkish | \n
cp858 | \n858, IBM858 | \nWestern Europe | \n
cp860 | \n860, IBM860 | \nPortuguese | \n
cp861 | \n861, CP-IS, IBM861 | \nIcelandic | \n
cp862 | \n862, IBM862 | \nHebrew | \n
cp863 | \n863, IBM863 | \nCanadian | \n
cp864 | \nIBM864 | \nArabic | \n
cp865 | \n865, IBM865 | \nDanish, Norwegian | \n
cp866 | \n866, IBM866 | \nRussian | \n
cp869 | \n869, CP-GR, IBM869 | \nGreek | \n
cp874 | \n\n | Thai | \n
cp875 | \n\n | Greek | \n
cp932 | \n932, ms932, mskanji, ms-kanji | \nJapanese | \n
cp949 | \n949, ms949, uhc | \nKorean | \n
cp950 | \n950, ms950 | \nTraditional Chinese | \n
cp1006 | \n\n | Urdu | \n
cp1026 | \nibm1026 | \nTurkish | \n
cp1140 | \nibm1140 | \nWestern Europe | \n
cp1250 | \nwindows-1250 | \nCentral and Eastern Europe | \n
cp1251 | \nwindows-1251 | \nBulgarian, Byelorussian,\nMacedonian, Russian, Serbian | \n
cp1252 | \nwindows-1252 | \nWestern Europe | \n
cp1253 | \nwindows-1253 | \nGreek | \n
cp1254 | \nwindows-1254 | \nTurkish | \n
cp1255 | \nwindows-1255 | \nHebrew | \n
cp1256 | \nwindows-1256 | \nArabic | \n
cp1257 | \nwindows-1257 | \nBaltic languages | \n
cp1258 | \nwindows-1258 | \nVietnamese | \n
euc_jp | \neucjp, ujis, u-jis | \nJapanese | \n
euc_jis_2004 | \njisx0213, eucjis2004 | \nJapanese | \n
euc_jisx0213 | \neucjisx0213 | \nJapanese | \n
euc_kr | \neuckr, korean, ksc5601,\nks_c-5601, ks_c-5601-1987,\nksx1001, ks_x-1001 | \nKorean | \n
gb2312 | \nchinese, csiso58gb231280, euc-\ncn, euccn, eucgb2312-cn,\ngb2312-1980, gb2312-80, iso-\nir-58 | \nSimplified Chinese | \n
gbk | \n936, cp936, ms936 | \nUnified Chinese | \n
gb18030 | \ngb18030-2000 | \nUnified Chinese | \n
hz | \nhzgb, hz-gb, hz-gb-2312 | \nSimplified Chinese | \n
iso2022_jp | \ncsiso2022jp, iso2022jp,\niso-2022-jp | \nJapanese | \n
iso2022_jp_1 | \niso2022jp-1, iso-2022-jp-1 | \nJapanese | \n
iso2022_jp_2 | \niso2022jp-2, iso-2022-jp-2 | \nJapanese, Korean, Simplified\nChinese, Western Europe, Greek | \n
iso2022_jp_2004 | \niso2022jp-2004,\niso-2022-jp-2004 | \nJapanese | \n
iso2022_jp_3 | \niso2022jp-3, iso-2022-jp-3 | \nJapanese | \n
iso2022_jp_ext | \niso2022jp-ext, iso-2022-jp-ext | \nJapanese | \n
iso2022_kr | \ncsiso2022kr, iso2022kr,\niso-2022-kr | \nKorean | \n
latin_1 | \niso-8859-1, iso8859-1, 8859,\ncp819, latin, latin1, L1 | \nWest Europe | \n
iso8859_2 | \niso-8859-2, latin2, L2 | \nCentral and Eastern Europe | \n
iso8859_3 | \niso-8859-3, latin3, L3 | \nEsperanto, Maltese | \n
iso8859_4 | \niso-8859-4, latin4, L4 | \nBaltic languages | \n
iso8859_5 | \niso-8859-5, cyrillic | \nBulgarian, Byelorussian,\nMacedonian, Russian, Serbian | \n
iso8859_6 | \niso-8859-6, arabic | \nArabic | \n
iso8859_7 | \niso-8859-7, greek, greek8 | \nGreek | \n
iso8859_8 | \niso-8859-8, hebrew | \nHebrew | \n
iso8859_9 | \niso-8859-9, latin5, L5 | \nTurkish | \n
iso8859_10 | \niso-8859-10, latin6, L6 | \nNordic languages | \n
iso8859_13 | \niso-8859-13, latin7, L7 | \nBaltic languages | \n
iso8859_14 | \niso-8859-14, latin8, L8 | \nCeltic languages | \n
iso8859_15 | \niso-8859-15, latin9, L9 | \nWestern Europe | \n
iso8859_16 | \niso-8859-16, latin10, L10 | \nSouth-Eastern Europe | \n
johab | \ncp1361, ms1361 | \nKorean | \n
koi8_r | \n\n | Russian | \n
koi8_u | \n\n | Ukrainian | \n
mac_cyrillic | \nmaccyrillic | \nBulgarian, Byelorussian,\nMacedonian, Russian, Serbian | \n
mac_greek | \nmacgreek | \nGreek | \n
mac_iceland | \nmaciceland | \nIcelandic | \n
mac_latin2 | \nmaclatin2, maccentraleurope | \nCentral and Eastern Europe | \n
mac_roman | \nmacroman | \nWestern Europe | \n
mac_turkish | \nmacturkish | \nTurkish | \n
ptcp154 | \ncsptcp154, pt154, cp154,\ncyrillic-asian | \nKazakh | \n
shift_jis | \ncsshiftjis, shiftjis, sjis,\ns_jis | \nJapanese | \n
shift_jis_2004 | \nshiftjis2004, sjis_2004,\nsjis2004 | \nJapanese | \n
shift_jisx0213 | \nshiftjisx0213, sjisx0213,\ns_jisx0213 | \nJapanese | \n
utf_32 | \nU32, utf32 | \nall languages | \n
utf_32_be | \nUTF-32BE | \nall languages | \n
utf_32_le | \nUTF-32LE | \nall languages | \n
utf_16 | \nU16, utf16 | \nall languages | \n
utf_16_be | \nUTF-16BE | \nall languages (BMP only) | \n
utf_16_le | \nUTF-16LE | \nall languages (BMP only) | \n
utf_7 | \nU7, unicode-1-1-utf-7 | \nall languages | \n
utf_8 | \nU8, UTF, utf8 | \nall languages | \n
utf_8_sig | \n\n | all languages | \n
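The alias rule stated above (case and hyphen/underscore variants name the same codec) can be checked directly:

```python
import codecs

assert codecs.lookup('UTF-8').name == codecs.lookup('utf_8').name
assert u'abc'.encode('latin_1') == u'abc'.encode('iso-8859-1') == u'abc'.encode('L1')
```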
A number of codecs are specific to Python, so their codec names have no meaning\noutside Python. Some of them don’t convert from Unicode strings to byte strings,\nbut instead use the property of the Python codecs machinery that any bijective\nfunction with one argument can be considered as an encoding.
\nFor the codecs listed below, the result in the “encoding” direction is always a\nbyte string. The result of the “decoding” direction is listed as operand type in\nthe table.
\nCodec | \nAliases | \nOperand type | \nPurpose | \n
---|---|---|---|
base64_codec | \nbase64, base-64 | \nbyte string | \nConvert operand to MIME\nbase64 | \n
bz2_codec | \nbz2 | \nbyte string | \nCompress the operand\nusing bz2 | \n
hex_codec | \nhex | \nbyte string | \nConvert operand to\nhexadecimal\nrepresentation, with two\ndigits per byte | \n
idna | \n\n | Unicode string | \nImplements RFC 3490,\nsee also\nencodings.idna | \n
mbcs | \ndbcs | \nUnicode string | \nWindows only: Encode\noperand according to the\nANSI codepage (CP_ACP) | \n
palmos | \n\n | Unicode string | \nEncoding of PalmOS 3.5 | \n
punycode | \n\n | Unicode string | \nImplements RFC 3492 | \n
quopri_codec | \nquopri, quoted-printable,\nquotedprintable | \nbyte string | \nConvert operand to MIME\nquoted printable | \n
raw_unicode_escape | \n\n | Unicode string | \nProduce a string that is\nsuitable as raw Unicode\nliteral in Python source\ncode | \n
rot_13 | \nrot13 | \nUnicode string | \nReturns the Caesar-cypher\nencryption of the operand | \n
string_escape | \n\n | byte string | \nProduce a string that is\nsuitable as string\nliteral in Python source\ncode | \n
undefined | \n\n | any | \nRaise an exception for\nall conversions. Can be\nused as the system\nencoding if no automatic\ncoercion between\nbyte and Unicode strings\nis desired. | \n
unicode_escape | \n\n | Unicode string | \nProduce a string that is\nsuitable as Unicode\nliteral in Python source\ncode | \n
unicode_internal | \n\n | Unicode string | \nReturn the internal\nrepresentation of the\noperand | \n
uu_codec | \nuu | \nbyte string | \nConvert the operand using\nuuencode | \n
zlib_codec | \nzip, zlib | \nbyte string | \nCompress the operand\nusing gzip | \n
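A sketch of two of these codecs, using the codecs.encode()/codecs.decode() convenience functions (in Python 3 these non-Unicode codecs are no longer reachable via str.encode()):

```python
import codecs

assert codecs.encode(b'abc', 'hex_codec') == b'616263'
assert codecs.decode(b'616263', 'hex_codec') == b'abc'

assert codecs.encode(u'Caesar', 'rot_13') == u'Pnrfne'
assert codecs.decode(u'Pnrfne', 'rot_13') == u'Caesar'
```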
\nNew in version 2.3: The idna and punycode encodings.
\n\nNew in version 2.3.
\nThis module implements RFC 3490 (Internationalized Domain Names in\nApplications) and RFC 3492 (Nameprep: A Stringprep Profile for\nInternationalized Domain Names (IDN)). It builds upon the punycode encoding\nand stringprep.
\nThese RFCs together define a protocol to support non-ASCII characters in domain\nnames. A domain name containing non-ASCII characters (such as\nwww.Alliancefrançaise.nu) is converted into an ASCII-compatible encoding\n(ACE, such as www.xn--alliancefranaise-npb.nu). The ACE form of the domain\nname is then used in all places where arbitrary characters are not allowed by\nthe protocol, such as DNS queries, HTTP Host fields, and so\non. This conversion is carried out in the application; if possible invisible to\nthe user: The application should transparently convert Unicode domain labels to\nIDNA on the wire, and convert back ACE labels to Unicode before presenting them\nto the user.
\nPython supports this conversion in several ways: the idna codec performs\nconversion between Unicode and ACE, separating an input string into labels\nbased on the separator characters defined in section 3.1 (1) of RFC 3490\nand converting each label to ACE as required, and conversely separating an input\nbyte string into labels based on the . separator and converting any ACE\nlabels found into unicode. Furthermore, the socket module\ntransparently converts Unicode host names to ACE, so that applications need not\nbe concerned about converting host names themselves when they pass them to the\nsocket module. On top of that, modules that have host names as function\nparameters, such as httplib and ftplib, accept Unicode host names\n(httplib then also transparently sends an IDNA hostname in the\nHost field if it sends that field at all).
\nWhen receiving host names from the wire (such as in reverse name lookup), no\nautomatic conversion to Unicode is performed: Applications wishing to present\nsuch host names to the user should decode them to Unicode.
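The conversion described above can be sketched with the example domain from the text:

```python
name = u'www.alliancefran\xe7aise.nu'

ace = name.encode('idna')
assert ace == b'www.xn--alliancefranaise-npb.nu'

# Decoding converts the ACE labels back to Unicode:
assert ace.decode('idna') == name
```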
\nThe module encodings.idna also implements the nameprep procedure, which\nperforms certain normalizations on host names, to achieve case-insensitivity of\ninternational domain names, and to unify similar characters. The nameprep\nfunctions can be used directly if desired.
\n\nNew in version 2.5.
\nThis module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded\nBOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this\nis only done once (on the first write to the byte stream). For decoding an\noptional UTF-8 encoded BOM at the start of the data will be skipped.
\n\nDeprecated since version 2.6: The fpformat module has been removed in Python 3.0.
\nThe fpformat module defines functions for dealing with floating point\nnumber representations in 100% pure Python.
\nNote
\nThis module is unnecessary: everything here can be done using the %\nstring interpolation operator.
The fpformat module defines the following functions and an exception:
\nFormat x as [-]ddd.ddd with digs digits after the point and at least one\ndigit before. If digs <= 0, the decimal point is suppressed.
\nx can be either a number or a string that looks like one. digs is an\ninteger.
\nReturn value is a string.
\nFormat x as [-]d.dddE[+-]ddd with digs digits after the point and\nexactly one digit before. If digs <= 0, one digit is kept and the point is\nsuppressed.
\nx can be either a real number, or a string that looks like one. digs is an\ninteger.
\nReturn value is a string.
\nExample:
\n>>> import fpformat\n>>> fpformat.fix(1.23, 1)\n'1.2'\n
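Since fpformat has been removed in Python 3, %-style string formatting gives the same results (a sketch of the equivalents, exponent-field width aside):

```python
# fpformat.fix(x, digs)  is roughly  '%.*f' % (digs, x)
assert '%.1f' % 1.23 == '1.2'

# fpformat.sci(x, digs)  is roughly  '%.*e' % (digs, x)
assert '%.2e' % 12345.678 == '1.23e+04'
```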
Source code: Lib/calendar.py
\nThis module allows you to output calendars like the Unix cal program,\nand provides additional useful functions related to the calendar. By default,\nthese calendars have Monday as the first day of the week, and Sunday as the last\n(the European convention). Use setfirstweekday() to set the first day of\nthe week to Sunday (6) or to any other weekday. Parameters that specify dates\nare given as integers. For related\nfunctionality, see also the datetime and time modules.
\nMost of these functions and classes rely on the datetime module which\nuses an idealized calendar, the current Gregorian calendar indefinitely extended\nin both directions. This matches the definition of the “proleptic Gregorian”\ncalendar in Dershowitz and Reingold’s book “Calendrical Calculations”, where\nit’s the base calendar for all computations.
\nCreates a Calendar object. firstweekday is an integer specifying the\nfirst day of the week. 0 is Monday (the default), 6 is Sunday.
\nA Calendar object provides several methods that can be used for\npreparing the calendar data for formatting. This class doesn’t do any formatting\nitself. This is the job of subclasses.
\n\nNew in version 2.5.
\nCalendar instances have the following methods:
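One of these methods, monthdayscalendar(), returns a month as a list of full weeks, padding with 0 for days that fall outside the month (a sketch):

```python
import calendar

cal = calendar.Calendar()                 # firstweekday=0, i.e. Monday
weeks = cal.monthdayscalendar(2024, 1)    # January 2024 starts on a Monday

assert weeks[0] == [1, 2, 3, 4, 5, 6, 7]
assert weeks[-1] == [29, 30, 31, 0, 0, 0, 0]   # padded last week
assert len(weeks) == 5
```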
\nThis class can be used to generate plain text calendars.
\n\nNew in version 2.5.
\nTextCalendar instances have the following methods:
\nThis class can be used to generate HTML calendars.
\n\nNew in version 2.5.
\nHTMLCalendar instances have the following methods:
\nThis subclass of TextCalendar can be passed a locale name in the\nconstructor and will return month and weekday names in the specified locale.\nIf this locale includes an encoding all strings containing month and weekday\nnames will be returned as unicode.
\n\nNew in version 2.5.
\nThis subclass of HTMLCalendar can be passed a locale name in the\nconstructor and will return month and weekday names in the specified\nlocale. If this locale includes an encoding all strings containing month and\nweekday names will be returned as unicode.
\n\nNew in version 2.5.
\nNote
\nThe formatweekday() and formatmonthname() methods of these two\nclasses temporarily change the current locale to the given locale. Because\nthe current locale is a process-wide setting, they are not thread-safe.
\nFor simple text calendars this module provides the following functions.
\nSets the weekday (0 is Monday, 6 is Sunday) to start each week. The\nvalues MONDAY, TUESDAY, WEDNESDAY, THURSDAY,\nFRIDAY, SATURDAY, and SUNDAY are provided for\nconvenience. For example, to set the first weekday to Sunday:
\nimport calendar\ncalendar.setfirstweekday(calendar.SUNDAY)\n
\nNew in version 2.0.
\nReturns the current setting for the weekday to start each week.
\n\nNew in version 2.0.
\nReturns the number of leap years in the range from y1 to y2 (exclusive),\nwhere y1 and y2 are years.
\n\nChanged in version 2.0: This function didn’t work for ranges spanning a century change in Python\n1.5.2.
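A quick check: the half-open range [2000, 2020) contains the leap years 2000, 2004, 2008, 2012 and 2016:

```python
import calendar

assert calendar.leapdays(2000, 2020) == 5
assert calendar.isleap(2000) and not calendar.isleap(1900)  # century rule
```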
\nReturns a month’s calendar in a multi-line string using the formatmonth()\nof the TextCalendar class.
\n\nNew in version 2.0.
\nReturns a 3-column calendar for an entire year as a multi-line string using the\nformatyear() of the TextCalendar class.
\n\nNew in version 2.0.
An unrelated but handy function that takes a time tuple such as returned by the gmtime() function in the time module, and returns the corresponding Unix timestamp value, assuming an epoch of 1970, and the POSIX encoding. In fact, time.gmtime() and timegm() are each other's inverse.
\n\nNew in version 2.0.
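The inverse relationship with time.gmtime() can be checked directly:

```python
import calendar
import time

# A UTC time tuple from gmtime() maps back to the timestamp it came from
ts = 1234567890
assert calendar.timegm(time.gmtime(ts)) == ts
print(calendar.timegm(time.gmtime(0)))   # 0, the epoch itself
```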
\nThe calendar module exports the following data attributes:
\n\nNew in version 2.3.
\nSource code: Lib/heapq.py
\nThis module provides an implementation of the heap queue algorithm, also known\nas the priority queue algorithm.
\nHeaps are binary trees for which every parent node has a value less than or\nequal to any of its children. This implementation uses arrays for which\nheap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting\nelements from zero. For the sake of comparison, non-existing elements are\nconsidered to be infinite. The interesting property of a heap is that its\nsmallest element is always the root, heap[0].
\nThe API below differs from textbook heap algorithms in two aspects: (a) We use\nzero-based indexing. This makes the relationship between the index for a node\nand the indexes for its children slightly less obvious, but is more suitable\nsince Python uses zero-based indexing. (b) Our pop method returns the smallest\nitem, not the largest (called a “min heap” in textbooks; a “max heap” is more\ncommon in texts because of its suitability for in-place sorting).
\nThese two make it possible to view the heap as a regular Python list without\nsurprises: heap[0] is the smallest item, and heap.sort() maintains the\nheap invariant!
\nTo create a heap, use a list initialized to [], or you can transform a\npopulated list into a heap via function heapify().
\nThe following functions are provided:
\nPush item on the heap, then pop and return the smallest item from the\nheap. The combined action runs more efficiently than heappush()\nfollowed by a separate call to heappop().
\n\nNew in version 2.6.
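A short sketch of heappushpop() -- note that if the pushed item is the smallest, it is returned immediately and the heap is untouched:

```python
from heapq import heapify, heappushpop

h = [2, 4, 6]
heapify(h)
print(heappushpop(h, 1))   # 1: the pushed item was already the smallest
print(heappushpop(h, 5))   # 2: pops the current minimum after pushing 5
print(sorted(h))           # [4, 5, 6]
```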
\nPop and return the smallest item from the heap, and also push the new item.\nThe heap size doesn’t change. If the heap is empty, IndexError is raised.
This one-step operation is more efficient than a heappop() followed by heappush() and can be more appropriate when using a fixed-size heap. The pop/push combination always returns an element from the heap and replaces it with item.
\nThe value returned may be larger than the item added. If that isn’t\ndesired, consider using heappushpop() instead. Its push/pop\ncombination returns the smaller of the two values, leaving the larger value\non the heap.
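A brief sketch contrasting heapreplace() with heappushpop():

```python
from heapq import heapify, heapreplace

h = [1, 3, 5]
heapify(h)
print(heapreplace(h, 4))   # 1: the old minimum, popped before 4 is pushed
# The returned value may exceed the item just pushed:
print(heapreplace(h, 2))   # 3, even though 2 was pushed in the same call
print(sorted(h))           # [2, 4, 5]
```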
\nThe module also offers three general purpose functions based on heaps.
\nMerge multiple sorted inputs into a single sorted output (for example, merge\ntimestamped entries from multiple log files). Returns an iterator\nover the sorted values.
\nSimilar to sorted(itertools.chain(*iterables)) but returns an iterable, does\nnot pull the data into memory all at once, and assumes that each of the input\nstreams is already sorted (smallest to largest).
\n\nNew in version 2.6.
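For example, merging three pre-sorted streams:

```python
from heapq import merge

# merge() yields values lazily, never materializing the inputs all at once
print(list(merge([1, 3, 5], [2, 4, 6], [0, 7])))
# [0, 1, 2, 3, 4, 5, 6, 7]
```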
Return a list with the n largest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in the iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key, reverse=True)[:n]
\n\nNew in version 2.4.
\n\nChanged in version 2.5: Added the optional key argument.
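For example, with and without a key function (the (score, label) records here are made up for illustration):

```python
from heapq import nlargest

print(nlargest(2, [5, 1, 8, 3, 9]))   # [9, 8]

# Hypothetical (score, label) records, ranked by their first field
scores = [(0.9, 'ham'), (0.2, 'spam'), (0.7, 'eggs')]
print(nlargest(2, scores, key=lambda s: s[0]))
# [(0.9, 'ham'), (0.7, 'eggs')]
```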
Return a list with the n smallest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in the iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key)[:n]
\n\nNew in version 2.4.
\n\nChanged in version 2.5: Added the optional key argument.
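The same pattern with nsmallest(), using key=str.lower to ignore case:

```python
from heapq import nsmallest

print(nsmallest(2, [5, 1, 8, 3, 9]))        # [1, 3]

words = ['Pear', 'apple', 'Banana']
print(nsmallest(2, words, key=str.lower))   # ['apple', 'Banana']
```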
\nThe latter two functions perform best for smaller values of n. For larger\nvalues, it is more efficient to use the sorted() function. Also, when\nn==1, it is more efficient to use the built-in min() and max()\nfunctions.
\nA heapsort can be implemented by\npushing all values onto a heap and then popping off the smallest values one at a\ntime:
\n>>> def heapsort(iterable):\n... 'Equivalent to sorted(iterable)'\n... h = []\n... for value in iterable:\n... heappush(h, value)\n... return [heappop(h) for i in range(len(h))]\n...\n>>> heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])\n[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n
Heap elements can be tuples. This is useful for assigning comparison values\n(such as task priorities) alongside the main record being tracked:
\n>>> h = []\n>>> heappush(h, (5, 'write code'))\n>>> heappush(h, (7, 'release product'))\n>>> heappush(h, (1, 'write spec'))\n>>> heappush(h, (3, 'create tests'))\n>>> heappop(h)\n(1, 'write spec')\n
A priority queue is a common use for a heap, and it presents several implementation challenges:
A solution to the first two challenges is to store entries as a 3-element list including the priority, an entry count, and the task. The entry count serves as a tie-breaker so that two tasks with the same priority are returned in the order they were added. And since no two entry counts are the same, the tuple comparison will never attempt to directly compare two tasks.
\nThe remaining challenges revolve around finding a pending task and making\nchanges to its priority or removing it entirely. Finding a task can be done\nwith a dictionary pointing to an entry in the queue.
\nRemoving the entry or changing its priority is more difficult because it would\nbreak the heap structure invariants. So, a possible solution is to mark the\nexisting entry as removed and add a new entry with the revised priority:
\npq = [] # list of entries arranged in a heap\nentry_finder = {} # mapping of tasks to entries\nREMOVED = '<removed-task>' # placeholder for a removed task\ncounter = itertools.count() # unique sequence count\n\ndef add_task(task, priority=0):\n 'Add a new task or update the priority of an existing task'\n if task in entry_finder:\n remove_task(task)\n count = next(counter)\n entry = [priority, count, task]\n entry_finder[task] = entry\n heappush(pq, entry)\n\ndef remove_task(task):\n 'Mark an existing task as REMOVED. Raise KeyError if not found.'\n entry = entry_finder.pop(task)\n entry[-1] = REMOVED\n\ndef pop_task():\n 'Remove and return the lowest priority task. Raise KeyError if empty.'\n while pq:\n priority, count, task = heappop(pq)\n if task is not REMOVED:\n del entry_finder[task]\n return task\n raise KeyError('pop from an empty priority queue')\n
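A condensed, self-contained variant of the pattern above shows a reprioritized task surfacing once, in its new position (the task names are illustrative):

```python
import itertools
from heapq import heappush, heappop

pq = []                        # heap of [priority, count, task] entries
entry_finder = {}              # task -> entry
REMOVED = '<removed-task>'
counter = itertools.count()

def add_task(task, priority=0):
    # Re-adding a known task turns its old entry into a tombstone
    if task in entry_finder:
        entry_finder.pop(task)[-1] = REMOVED
    entry = [priority, next(counter), task]
    entry_finder[task] = entry
    heappush(pq, entry)

def pop_task():
    while pq:
        priority, count, task = heappop(pq)
        if task is not REMOVED:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue')

add_task('write code', 5)
add_task('write spec', 1)
add_task('write spec', 7)      # reprioritize: the old entry becomes a tombstone
first, second = pop_task(), pop_task()
print(first, second)           # write code write spec
```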
Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for all\nk, counting elements from 0. For the sake of comparison, non-existing\nelements are considered to be infinite. The interesting property of a heap is\nthat a[0] is always its smallest element.
\nThe strange invariant above is meant to be an efficient memory representation\nfor a tournament. The numbers below are k, not a[k]:
\n 0\n\n 1 2\n\n 3 4 5 6\n\n 7 8 9 10 11 12 13 14\n\n15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
In the tree above, each cell k is topping 2*k+1 and 2*k+2. In a usual binary tournament we see in sports, each cell is the winner over the two cells it tops, and we can trace the winner down the tree to see all opponents s/he had. However, in many computer applications of such tournaments, we do not need to trace the history of a winner. To be more memory efficient, when a winner is promoted, we try to replace it by something else at a lower level, and the rule becomes that a cell and the two cells it tops contain three different items, but the top cell “wins” over the two topped cells.
If this heap invariant is protected at all times, index 0 is clearly the overall winner. The simplest algorithmic way to remove it and find the “next” winner is to move some loser (let’s say cell 30 in the diagram above) into the 0 position, and then percolate this new 0 down the tree, exchanging values, until the invariant is re-established. This is clearly logarithmic in the total number of items in the tree. By iterating over all items, you get an O(n log n) sort.
A nice feature of this sort is that you can efficiently insert new items while the sort is going on, provided that the inserted items are not “better” than the last 0’th element you extracted. This is especially useful in simulation contexts, where the tree holds all incoming events, and the “win” condition means the smallest scheduled time. When an event schedules other events for execution, they are scheduled into the future, so they can easily go into the heap. So, a heap is a good structure for implementing schedulers (this is what I used for my MIDI sequencer :-).
\nVarious structures for implementing schedulers have been extensively studied,\nand heaps are good for this, as they are reasonably speedy, the speed is almost\nconstant, and the worst case is not much different than the average case.\nHowever, there are other representations which are more efficient overall, yet\nthe worst cases might be terrible.
Heaps are also very useful in big disk sorts. You most probably all know that a big sort implies producing “runs” (pre-sorted sequences, whose size is usually related to the amount of CPU memory), followed by merging passes for these runs; the merging is often very cleverly organised [1]. It is very important that the initial sort produces the longest runs possible. Tournaments are a good way to achieve that. If, using all the memory available to hold a tournament, you replace and percolate items that happen to fit the current run, you’ll produce runs which are twice the size of the memory for random input, and much better for input that is fuzzily ordered.
\nMoreover, if you output the 0’th item on disk and get an input which may not fit\nin the current tournament (because the value “wins” over the last output value),\nit cannot fit in the heap, so the size of the heap decreases. The freed memory\ncould be cleverly reused immediately for progressively building a second heap,\nwhich grows at exactly the same rate the first heap is melting. When the first\nheap completely vanishes, you switch heaps and start a new run. Clever and\nquite effective!
\nIn a word, heaps are useful memory structures to know. I use them in a few\napplications, and I think it is good to keep a ‘heap’ module around. :-)
\nFootnotes
[1] The disk balancing algorithms which are current, nowadays, are more annoying than clever, and this is a consequence of the seeking capabilities of the disks. On devices which cannot seek, like big tape drives, the story was quite different, and one had to be very clever to ensure (far in advance) that each tape movement would be the most effective possible (that is, would best participate at “progressing” the merge). Some tapes were even able to read backwards, and this was also used to avoid the rewinding time. Believe me, real good tape sorts were quite spectacular to watch! From all times, sorting has always been a Great Art! :-)
\nNew in version 2.1.
\nSource code: Lib/bisect.py
\nThis module provides support for maintaining a list in sorted order without\nhaving to sort the list after each insertion. For long lists of items with\nexpensive comparison operations, this can be an improvement over the more common\napproach. The module is called bisect because it uses a basic bisection\nalgorithm to do its work. The source code may be most useful as a working\nexample of the algorithm (the boundary conditions are already right!).
\nThe following functions are provided:
\nLocate the insertion point for x in a to maintain sorted order.\nThe parameters lo and hi may be used to specify a subset of the list\nwhich should be considered; by default the entire list is used. If x is\nalready present in a, the insertion point will be before (to the left of)\nany existing entries. The return value is suitable for use as the first\nparameter to list.insert() assuming that a is already sorted.
\nThe returned insertion point i partitions the array a into two halves so\nthat all(val < x for val in a[lo:i]) for the left side and\nall(val >= x for val in a[i:hi]) for the right side.
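For example (the partition property is easy to verify by hand):

```python
from bisect import bisect_left

a = [1, 2, 2, 4]
i = bisect_left(a, 2)
print(i)          # 1: insertion point falls before the existing 2s
a.insert(i, 2)
print(a)          # [1, 2, 2, 2, 4] -- still sorted
```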
\nSimilar to bisect_left(), but returns an insertion point which comes\nafter (to the right of) any existing entries of x in a.
\nThe returned insertion point i partitions the array a into two halves so\nthat all(val <= x for val in a[lo:i]) for the left side and\nall(val > x for val in a[i:hi]) for the right side.
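Together, the two functions bracket any run of equal values:

```python
from bisect import bisect_left, bisect_right

a = [1, 2, 2, 4]
print(bisect_right(a, 2))                      # 3: after the existing 2s
# The difference of the two insertion points counts occurrences of x
print(bisect_right(a, 2) - bisect_left(a, 2))  # 2
```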
\nSee also
\nSortedCollection recipe that uses\nbisect to build a full-featured collection class with straight-forward search\nmethods and support for a key-function. The keys are precomputed to save\nunnecessary calls to the key function during searches.
\nThe above bisect() functions are useful for finding insertion points but\ncan be tricky or awkward to use for common searching tasks. The following five\nfunctions show how to transform them into the standard lookups for sorted\nlists:
\ndef index(a, x):\n 'Locate the leftmost value exactly equal to x'\n i = bisect_left(a, x)\n if i != len(a) and a[i] == x:\n return i\n raise ValueError\n\ndef find_lt(a, x):\n 'Find rightmost value less than x'\n i = bisect_left(a, x)\n if i:\n return a[i-1]\n raise ValueError\n\ndef find_le(a, x):\n 'Find rightmost value less than or equal to x'\n i = bisect_right(a, x)\n if i:\n return a[i-1]\n raise ValueError\n\ndef find_gt(a, x):\n 'Find leftmost value greater than x'\n i = bisect_right(a, x)\n if i != len(a):\n return a[i]\n raise ValueError\n\ndef find_ge(a, x):\n 'Find leftmost item greater than or equal to x'\n i = bisect_left(a, x)\n if i != len(a):\n return a[i]\n raise ValueError\n
The bisect() function can be useful for numeric table lookups. This\nexample uses bisect() to look up a letter grade for an exam score (say)\nbased on a set of ordered numeric breakpoints: 90 and up is an ‘A’, 80 to 89 is\na ‘B’, and so on:
\n>>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):\n... i = bisect(breakpoints, score)\n... return grades[i]\n...\n>>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]\n['F', 'A', 'C', 'C', 'B', 'A', 'A']\n
Unlike the sorted() function, it does not make sense for the bisect()\nfunctions to have key or reversed arguments because that would lead to an\ninefficient design (successive calls to bisect functions would not “remember”\nall of the previous key lookups).
\nInstead, it is better to search a list of precomputed keys to find the index\nof the record in question:
\n>>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]\n>>> data.sort(key=lambda r: r[1])\n>>> keys = [r[1] for r in data] # precomputed list of keys\n>>> data[bisect_left(keys, 0)]\n('black', 0)\n>>> data[bisect_left(keys, 1)]\n('blue', 1)\n>>> data[bisect_left(keys, 5)]\n('red', 5)\n>>> data[bisect_left(keys, 8)]\n('yellow', 8)\n
This module defines an object type which can compactly represent an array of\nbasic values: characters, integers, floating point numbers. Arrays are sequence\ntypes and behave very much like lists, except that the type of objects stored in\nthem is constrained. The type is specified at object creation time by using a\ntype code, which is a single character. The following type codes are\ndefined:
\nType code | \nC Type | \nPython Type | \nMinimum size in bytes | \n
---|---|---|---|
'c' | \nchar | \ncharacter | \n1 | \n
'b' | \nsigned char | \nint | \n1 | \n
'B' | \nunsigned char | \nint | \n1 | \n
'u' | \nPy_UNICODE | \nUnicode character | \n2 (see note) | \n
'h' | \nsigned short | \nint | \n2 | \n
'H' | \nunsigned short | \nint | \n2 | \n
'i' | \nsigned int | \nint | \n2 | \n
'I' | \nunsigned int | \nlong | \n2 | \n
'l' | \nsigned long | \nint | \n4 | \n
'L' | \nunsigned long | \nlong | \n4 | \n
'f' | \nfloat | \nfloat | \n4 | \n
'd' | \ndouble | \nfloat | \n8 | \n
Note
\nThe 'u' typecode corresponds to Python’s unicode character. On narrow\nUnicode builds this is 2-bytes, on wide builds this is 4-bytes.
\nThe actual representation of values is determined by the machine architecture\n(strictly speaking, by the C implementation). The actual size can be accessed\nthrough the itemsize attribute. The values stored for 'L' and\n'I' items will be represented as Python long integers when retrieved,\nbecause Python’s plain integer type cannot represent the full range of C’s\nunsigned (long) integers.
\nThe module defines the following type:
\nA new array whose items are restricted by typecode, and initialized\nfrom the optional initializer value, which must be a list, string, or iterable\nover elements of the appropriate type.
\n\nChanged in version 2.4: Formerly, only lists or strings were accepted.
\nIf given a list or string, the initializer is passed to the new array’s\nfromlist(), fromstring(), or fromunicode() method (see below)\nto add initial items to the array. Otherwise, the iterable initializer is\npassed to the extend() method.
\nArray objects support the ordinary sequence operations of indexing, slicing,\nconcatenation, and multiplication. When using slice assignment, the assigned\nvalue must be an array object with the same type code; in all other cases,\nTypeError is raised. Array objects also implement the buffer interface,\nand may be used wherever buffer objects are supported.
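A small sketch of these sequence operations, including the TypeError raised when a plain list is assigned to a slice:

```python
from array import array

a = array('i', [1, 2, 3])
a.append(4)
print(a[1:3])                   # array('i', [2, 3])
a[1:3] = array('i', [9])        # slice assignment needs a matching type code
print(list(a))                  # [1, 9, 4]
try:
    a[0:1] = [7]                # a plain list is rejected
except TypeError:
    print('TypeError: slice value must be an array')
```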
\nThe following data items and methods are also supported:
\nReturn a tuple (address, length) giving the current memory address and the\nlength in elements of the buffer used to hold array’s contents. The size of the\nmemory buffer in bytes can be computed as array.buffer_info()[1] *\narray.itemsize. This is occasionally useful when working with low-level (and\ninherently unsafe) I/O interfaces that require memory addresses, such as certain\nioctl() operations. The returned numbers are valid as long as the array\nexists and no length-changing operations are applied to it.
\nNote
\nWhen using array objects from code written in C or C++ (the only way to\neffectively make use of this information), it makes more sense to use the buffer\ninterface supported by array objects. This method is maintained for backward\ncompatibility and should be avoided in new code. The buffer interface is\ndocumented in Buffers and Memoryview Objects.
\nAppend items from iterable to the end of the array. If iterable is another\narray, it must have exactly the same type code; if not, TypeError will\nbe raised. If iterable is not an array, it must be iterable and its elements\nmust be the right type to be appended to the array.
\n\nChanged in version 2.4: Formerly, the argument could only be another array.
\n\nDeprecated since version 1.5.1: Use the fromfile() method.
Read n items (as machine values) from the file object f and append them to the end of the array. If fewer than n items are available, EOFError is raised, but the items that were available are still inserted into the array. f must be a real built-in file object; something else with a read() method won’t do.
\n\nDeprecated since version 1.5.1: Use the tofile() method.
\nWrite all items (as machine values) to the file object f.
\nWhen an array object is printed or converted to a string, it is represented as\narray(typecode, initializer). The initializer is omitted if the array is\nempty, otherwise it is a string if the typecode is 'c', otherwise it is a\nlist of numbers. The string is guaranteed to be able to be converted back to an\narray with the same type and value using eval(), so long as the\narray() function has been imported using from array import array.\nExamples:
\narray('l')\narray('c', 'hello world')\narray('u', u'hello \\u2641')\narray('l', [1, 2, 3, 4, 5])\narray('d', [1.0, 2.0, 3.14])\n
See also
\n\nNew in version 2.3.
\nThe datetime module supplies classes for manipulating dates and times in\nboth simple and complex ways. While date and time arithmetic is supported, the\nfocus of the implementation is on efficient attribute extraction for output\nformatting and manipulation. For related\nfunctionality, see also the time and calendar modules.
\nThere are two kinds of date and time objects: “naive” and “aware”. This\ndistinction refers to whether the object has any notion of time zone, daylight\nsaving time, or other kind of algorithmic or political time adjustment. Whether\na naive datetime object represents Coordinated Universal Time (UTC),\nlocal time, or time in some other timezone is purely up to the program, just\nlike it’s up to the program whether a particular number represents metres,\nmiles, or mass. Naive datetime objects are easy to understand and to\nwork with, at the cost of ignoring some aspects of reality.
\nFor applications requiring more, datetime and time objects\nhave an optional time zone information attribute, tzinfo, that can be\nset to an instance of a subclass of the abstract tzinfo class. These\ntzinfo objects capture information about the offset from UTC time, the\ntime zone name, and whether Daylight Saving Time is in effect. Note that no\nconcrete tzinfo classes are supplied by the datetime module.\nSupporting timezones at whatever level of detail is required is up to the\napplication. The rules for time adjustment across the world are more political\nthan rational, and there is no standard suitable for every application.
\nThe datetime module exports the following constants:
\nSee also
\n\nObjects of these types are immutable.
\nObjects of the date type are always naive.
\nAn object d of type time or datetime may be naive or aware.\nd is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does\nnot return None. If d.tzinfo is None, or if d.tzinfo is not\nNone but d.tzinfo.utcoffset(d) returns None, d is naive.
\nThe distinction between naive and aware doesn’t apply to timedelta\nobjects.
\nSubclass relationships:
\nobject\n timedelta\n tzinfo\n time\n date\n datetime
\nA timedelta object represents a duration, the difference between two\ndates or times.
\nAll arguments are optional and default to 0. Arguments may be ints, longs,\nor floats, and may be positive or negative.
\nOnly days, seconds and microseconds are stored internally. Arguments are\nconverted to those units:
\nand days, seconds and microseconds are then normalized so that the\nrepresentation is unique, with
\nIf any argument is a float and there are fractional microseconds, the fractional\nmicroseconds left over from all arguments are combined and their sum is rounded\nto the nearest microsecond. If no argument is a float, the conversion and\nnormalization processes are exact (no information is lost).
\nIf the normalized value of days lies outside the indicated range,\nOverflowError is raised.
\nNote that normalization of negative values may be surprising at first. For\nexample,
\n>>> from datetime import timedelta\n>>> d = timedelta(microseconds=-1)\n>>> (d.days, d.seconds, d.microseconds)\n(-1, 86399, 999999)\n
Class attributes are:
\n\n\nNote that, because of normalization, timedelta.max > -timedelta.min.\n-timedelta.max is not representable as a timedelta object.
\nInstance attributes (read-only):
\nAttribute | \nValue | \n
---|---|
days | \nBetween -999999999 and 999999999 inclusive | \n
seconds | \nBetween 0 and 86399 inclusive | \n
microseconds | \nBetween 0 and 999999 inclusive | \n
Supported operations:
\nOperation | \nResult | \n
---|---|
t1 = t2 + t3 | \nSum of t2 and t3. Afterwards t1-t2 ==\nt3 and t1-t3 == t2 are true. (1) | \n
t1 = t2 - t3 | \nDifference of t2 and t3. Afterwards t1\n== t2 - t3 and t2 == t1 + t3 are\ntrue. (1) | \n
t1 = t2 * i or t1 = i * t2 | \nDelta multiplied by an integer or long.\nAfterwards t1 // i == t2 is true,\nprovided i != 0. | \n
\n | In general, t1 * i == t1 * (i-1) + t1\nis true. (1) | \n
t1 = t2 // i | \nThe floor is computed and the remainder (if\nany) is thrown away. (3) | \n
+t1 | \nReturns a timedelta object with the\nsame value. (2) | \n
-t1 | \nequivalent to timedelta(-t1.days, -t1.seconds,\n-t1.microseconds), and to t1* -1. (1)(4) | \n
abs(t) | \nequivalent to +t when t.days >= 0, and\nto -t when t.days < 0. (2) | \n
str(t) | \nReturns a string in the form\n[D day[s], ][H]H:MM:SS[.UUUUUU], where D\nis negative for negative t. (5) | \n
repr(t) | \nReturns a string in the form\ndatetime.timedelta(D[, S[, U]]), where D\nis negative for negative t. (5) | \n
Notes:
\nThis is exact, but may overflow.
\nThis is exact, and cannot overflow.
\nDivision by 0 raises ZeroDivisionError.
\n-timedelta.max is not representable as a timedelta object.
\nString representations of timedelta objects are normalized\nsimilarly to their internal representation. This leads to somewhat\nunusual results for negative timedeltas. For example:
\n>>> timedelta(hours=-5)\ndatetime.timedelta(-1, 68400)\n>>> print(_)\n-1 day, 19:00:00\n
In addition to the operations listed above timedelta objects support\ncertain additions and subtractions with date and datetime\nobjects (see below).
\nComparisons of timedelta objects are supported with the\ntimedelta object representing the smaller duration considered to be the\nsmaller timedelta. In order to stop mixed-type comparisons from falling back to\nthe default comparison by object address, when a timedelta object is\ncompared to an object of a different type, TypeError is raised unless the\ncomparison is == or !=. The latter cases return False or\nTrue, respectively.
\ntimedelta objects are hashable (usable as dictionary keys), support\nefficient pickling, and in Boolean contexts, a timedelta object is\nconsidered to be true if and only if it isn’t equal to timedelta(0).
\nInstance methods:
\nReturn the total number of seconds contained in the duration.\nEquivalent to (td.microseconds + (td.seconds + td.days * 24 *\n3600) * 10**6) / 10**6 computed with true division enabled.
\nNote that for very large time intervals (greater than 270 years on\nmost platforms) this method will lose microsecond accuracy.
\n\nNew in version 2.7.
\nExample usage:
\n>>> from datetime import timedelta\n>>> year = timedelta(days=365)\n>>> another_year = timedelta(weeks=40, days=84, hours=23,\n... minutes=50, seconds=600) # adds up to 365 days\n>>> year.total_seconds()\n31536000.0\n>>> year == another_year\nTrue\n>>> ten_years = 10 * year\n>>> ten_years, ten_years.days // 365\n(datetime.timedelta(3650), 10)\n>>> nine_years = ten_years - year\n>>> nine_years, nine_years.days // 365\n(datetime.timedelta(3285), 9)\n>>> three_years = nine_years // 3;\n>>> three_years, three_years.days // 365\n(datetime.timedelta(1095), 3)\n>>> abs(three_years - ten_years) == 2 * three_years + year\nTrue\n
A date object represents a date (year, month and day) in an idealized\ncalendar, the current Gregorian calendar indefinitely extended in both\ndirections. January 1 of year 1 is called day number 1, January 2 of year 1 is\ncalled day number 2, and so on. This matches the definition of the “proleptic\nGregorian” calendar in Dershowitz and Reingold’s book Calendrical Calculations,\nwhere it’s the base calendar for all computations. See the book for algorithms\nfor converting between proleptic Gregorian ordinals and many other calendar\nsystems.
\nAll arguments are required. Arguments may be ints or longs, in the following\nranges:
\nIf an argument outside those ranges is given, ValueError is raised.
\nOther constructors, all class methods:
\nClass attributes:
\nInstance attributes (read-only):
\n\n\nSupported operations:
\nOperation | \nResult | \n
---|---|
date2 = date1 + timedelta | \ndate2 is timedelta.days days removed\nfrom date1. (1) | \n
date2 = date1 - timedelta | \nComputes date2 such that date2 +\ntimedelta == date1. (2) | \n
timedelta = date1 - date2 | \n(3) | \n
date1 < date2 | \ndate1 is considered less than date2 when\ndate1 precedes date2 in time. (4) | \n
Notes:
\nDates can be used as dictionary keys. In Boolean contexts, all date\nobjects are considered to be true.
\nInstance methods:
\nReturn a 3-tuple, (ISO year, ISO week number, ISO weekday).
\nThe ISO calendar is a widely used variant of the Gregorian calendar. See\nhttp://www.phys.uu.nl/~vgent/calendar/isocalendar.htm for a good\nexplanation.
The ISO year consists of 52 or 53 full weeks, where a week starts on a Monday and ends on a Sunday. The first week of an ISO year is the first (Gregorian) calendar week of a year containing a Thursday. This is called week number 1, and the ISO year of that Thursday is the same as its Gregorian year.
\nFor example, 2004 begins on a Thursday, so the first week of ISO year 2004\nbegins on Monday, 29 Dec 2003 and ends on Sunday, 4 Jan 2004, so that\ndate(2003, 12, 29).isocalendar() == (2004, 1, 1) and date(2004, 1,\n4).isocalendar() == (2004, 1, 7).
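These equalities can be checked directly (tuple() is used so the result compares as a plain 3-tuple):

```python
from datetime import date

# The first ISO week of 2004 runs Mon 29 Dec 2003 .. Sun 4 Jan 2004
assert tuple(date(2003, 12, 29).isocalendar()) == (2004, 1, 1)
assert tuple(date(2004, 1, 4).isocalendar()) == (2004, 1, 7)
# New Year's Day 2004 is the Thursday of that same ISO week
print(tuple(date(2004, 1, 1).isocalendar()))   # (2004, 1, 4)
```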
\nExample of counting days to an event:
\n>>> import time\n>>> from datetime import date\n>>> today = date.today()\n>>> today\ndatetime.date(2007, 12, 5)\n>>> today == date.fromtimestamp(time.time())\nTrue\n>>> my_birthday = date(today.year, 6, 24)\n>>> if my_birthday < today:\n... my_birthday = my_birthday.replace(year=today.year + 1)\n>>> my_birthday\ndatetime.date(2008, 6, 24)\n>>> time_to_birthday = abs(my_birthday - today)\n>>> time_to_birthday.days\n202\n
Example of working with date:
\n>>> from datetime import date\n>>> d = date.fromordinal(730920) # 730920th day after 1. 1. 0001\n>>> d\ndatetime.date(2002, 3, 11)\n>>> t = d.timetuple()\n>>> for i in t: \n... print i\n2002 # year\n3 # month\n11 # day\n0\n0\n0\n0 # weekday (0 = Monday)\n70 # 70th day in the year\n-1\n>>> ic = d.isocalendar()\n>>> for i in ic: \n... print i\n2002 # ISO year\n11 # ISO week number\n1 # ISO day number ( 1 = Monday )\n>>> d.isoformat()\n'2002-03-11'\n>>> d.strftime("%d/%m/%y")\n'11/03/02'\n>>> d.strftime("%A %d. %B %Y")\n'Monday 11. March 2002'\n
A datetime object is a single object containing all the information\nfrom a date object and a time object. Like a date\nobject, datetime assumes the current Gregorian calendar extended in\nboth directions; like a time object, datetime assumes there are exactly\n3600*24 seconds in every day.
\nConstructor:
\nThe year, month and day arguments are required. tzinfo may be None, or an\ninstance of a tzinfo subclass. The remaining arguments may be ints or\nlongs, in the following ranges:
\nIf an argument outside those ranges is given, ValueError is raised.
\nOther constructors, all class methods:
\nReturn the current local date and time. If optional argument tz is None\nor not specified, this is like today(), but, if possible, supplies more\nprecision than can be gotten from going through a time.time() timestamp\n(for example, this may be possible on platforms supplying the C\ngettimeofday() function).
\nElse tz must be an instance of a class tzinfo subclass, and the\ncurrent date and time are converted to tz‘s time zone. In this case the\nresult is equivalent to tz.fromutc(datetime.utcnow().replace(tzinfo=tz)).\nSee also today(), utcnow().
\nReturn the local date and time corresponding to the POSIX timestamp, such as is\nreturned by time.time(). If optional argument tz is None or not\nspecified, the timestamp is converted to the platform’s local date and time, and\nthe returned datetime object is naive.
\nElse tz must be an instance of a class tzinfo subclass, and the\ntimestamp is converted to tz‘s time zone. In this case the result is\nequivalent to\ntz.fromutc(datetime.utcfromtimestamp(timestamp).replace(tzinfo=tz)).
\nfromtimestamp() may raise ValueError, if the timestamp is out of\nthe range of values supported by the platform C localtime() or\ngmtime() functions. It’s common for this to be restricted to years in\n1970 through 2038. Note that on non-POSIX systems that include leap seconds in\ntheir notion of a timestamp, leap seconds are ignored by fromtimestamp(),\nand then it’s possible to have two timestamps differing by a second that yield\nidentical datetime objects. See also utcfromtimestamp().
\nReturn a datetime corresponding to date_string, parsed according to\nformat. This is equivalent to datetime(*(time.strptime(date_string,\nformat)[0:6])). ValueError is raised if the date_string and format\ncan’t be parsed by time.strptime() or if it returns a value which isn’t a\ntime tuple. See section strftime() and strptime() Behavior.
\n\nNew in version 2.5.
\nClass attributes:
\n\n\nInstance attributes (read-only):
\n\n\nSupported operations:
\nOperation | \nResult | \n
---|---|
datetime2 = datetime1 + timedelta | \n(1) | \n
datetime2 = datetime1 - timedelta | \n(2) | \n
timedelta = datetime1 - datetime2 | \n(3) | \n
datetime1 < datetime2 | \nCompares datetime to\ndatetime. (4) | \n
datetime2 is a duration of timedelta removed from datetime1, moving forward in\ntime if timedelta.days > 0, or backward if timedelta.days < 0. The\nresult has the same tzinfo attribute as the input datetime, and\ndatetime2 - datetime1 == timedelta after. OverflowError is raised if\ndatetime2.year would be smaller than MINYEAR or larger than\nMAXYEAR. Note that no time zone adjustments are done even if the\ninput is an aware object.
\nComputes the datetime2 such that datetime2 + timedelta == datetime1. As for\naddition, the result has the same tzinfo attribute as the input\ndatetime, and no time zone adjustments are done even if the input is aware.\nThis isn’t quite equivalent to datetime1 + (-timedelta), because -timedelta\nin isolation can overflow in cases where datetime1 - timedelta does not.
\nSubtraction of a datetime from a datetime is defined only if\nboth operands are naive, or if both are aware. If one is aware and the other is\nnaive, TypeError is raised.
\nIf both are naive, or both are aware and have the same tzinfo attribute,\nthe tzinfo attributes are ignored, and the result is a timedelta\nobject t such that datetime2 + t == datetime1. No time zone adjustments\nare done in this case.
\nIf both are aware and have different tzinfo attributes, a-b acts\nas if a and b were first converted to naive UTC datetimes first. The\nresult is (a.replace(tzinfo=None) - a.utcoffset()) - (b.replace(tzinfo=None)\n- b.utcoffset()) except that the implementation never overflows.
\ndatetime1 is considered less than datetime2 when datetime1 precedes\ndatetime2 in time.
\nIf one comparand is naive and the other is aware, TypeError is raised.\nIf both comparands are aware, and have the same tzinfo attribute, the\ncommon tzinfo attribute is ignored and the base datetimes are\ncompared. If both comparands are aware and have different tzinfo\nattributes, the comparands are first adjusted by subtracting their UTC\noffsets (obtained from self.utcoffset()).
\nNote
\nIn order to stop comparison from falling back to the default scheme of comparing\nobject addresses, datetime comparison normally raises TypeError if the\nother comparand isn’t also a datetime object. However,\nNotImplemented is returned instead if the other comparand has a\ntimetuple() attribute. This hook gives other kinds of date objects a\nchance at implementing mixed-type comparison. If not, when a datetime\nobject is compared to an object of a different type, TypeError is raised\nunless the comparison is == or !=. The latter cases return\nFalse or True, respectively.
\ndatetime objects can be used as dictionary keys. In Boolean contexts,\nall datetime objects are considered to be true.
\nInstance methods:
\n\n\nReturn a datetime object with new tzinfo attribute tz,\nadjusting the date and time data so the result is the same UTC time as\nself, but in tz‘s local time.
\ntz must be an instance of a tzinfo subclass, and its\nutcoffset() and dst() methods must not return None. self must\nbe aware (self.tzinfo must not be None, and self.utcoffset() must\nnot return None).
\nIf self.tzinfo is tz, self.astimezone(tz) is equal to self: no\nadjustment of date or time data is performed. Else the result is local\ntime in time zone tz, representing the same UTC time as self: after\nastz = dt.astimezone(tz), astz - astz.utcoffset() will usually have\nthe same date and time data as dt - dt.utcoffset(). The discussion\nof class tzinfo explains the cases at Daylight Saving Time transition\nboundaries where this cannot be achieved (an issue only if tz models both\nstandard and daylight time).
\nIf you merely want to attach a time zone object tz to a datetime dt without\nadjustment of date and time data, use dt.replace(tzinfo=tz). If you\nmerely want to remove the time zone object from an aware datetime dt without\nconversion of date and time data, use dt.replace(tzinfo=None).
\nNote that the default tzinfo.fromutc() method can be overridden in a\ntzinfo subclass to affect the result returned by astimezone().\nIgnoring error cases, astimezone() acts like:
\ndef astimezone(self, tz):\n if self.tzinfo is tz:\n return self\n # Convert self to UTC, and attach the new time zone object.\n utc = (self - self.utcoffset()).replace(tzinfo=tz)\n # Convert from UTC to tz's local time.\n return tz.fromutc(utc)\n
If datetime instance d is naive, this is the same as\nd.timetuple() except that tm_isdst is forced to 0 regardless of what\nd.dst() returns. DST is never in effect for a UTC time.
\nIf d is aware, d is normalized to UTC time, by subtracting\nd.utcoffset(), and a time.struct_time for the normalized time is\nreturned. tm_isdst is forced to 0. Note that the result’s\ntm_year member may be MINYEAR-1 or MAXYEAR+1, if\nd.year was MINYEAR or MAXYEAR and UTC adjustment spills over a year\nboundary.
\nReturn a string representing the date and time in ISO 8601 format,\nYYYY-MM-DDTHH:MM:SS.mmmmmm or, if microsecond is 0,\nYYYY-MM-DDTHH:MM:SS
\nIf utcoffset() does not return None, a 6-character string is\nappended, giving the UTC offset in (signed) hours and minutes:\nYYYY-MM-DDTHH:MM:SS.mmmmmm+HH:MM or, if microsecond is 0,\nYYYY-MM-DDTHH:MM:SS+HH:MM
\nThe optional argument sep (default 'T') is a one-character separator,\nplaced between the date and time portions of the result. For example,
\n>>> from datetime import tzinfo, timedelta, datetime\n>>> class TZ(tzinfo):\n... def utcoffset(self, dt): return timedelta(minutes=-399)\n...\n>>> datetime(2002, 12, 25, tzinfo=TZ()).isoformat(' ')\n'2002-12-25 00:00:00-06:39'\n
Examples of working with datetime objects:
\n>>> from datetime import datetime, date, time\n>>> # Using datetime.combine()\n>>> d = date(2005, 7, 14)\n>>> t = time(12, 30)\n>>> datetime.combine(d, t)\ndatetime.datetime(2005, 7, 14, 12, 30)\n>>> # Using datetime.now() or datetime.utcnow()\n>>> datetime.now() \ndatetime.datetime(2007, 12, 6, 16, 29, 43, 79043) # GMT +1\n>>> datetime.utcnow() \ndatetime.datetime(2007, 12, 6, 15, 29, 43, 79060)\n>>> # Using datetime.strptime()\n>>> dt = datetime.strptime("21/11/06 16:30", "%d/%m/%y %H:%M")\n>>> dt\ndatetime.datetime(2006, 11, 21, 16, 30)\n>>> # Using datetime.timetuple() to get tuple of all attributes\n>>> tt = dt.timetuple()\n>>> for it in tt: \n... print it\n...\n2006 # year\n11 # month\n21 # day\n16 # hour\n30 # minute\n0 # second\n1 # weekday (0 = Monday)\n325 # number of days since 1st January\n-1 # dst - method tzinfo.dst() returned None\n>>> # Date in ISO format\n>>> ic = dt.isocalendar()\n>>> for it in ic: \n... print it\n...\n2006 # ISO year\n47 # ISO week\n2 # ISO weekday\n>>> # Formatting datetime\n>>> dt.strftime("%A, %d. %B %Y %I:%M%p")\n'Tuesday, 21. November 2006 04:30PM'\n
Using datetime with tzinfo:
\n>>> from datetime import timedelta, datetime, tzinfo\n>>> class GMT1(tzinfo):\n... def __init__(self): # DST starts last Sunday in March\n... d = datetime(dt.year, 4, 1) # ends last Sunday in October\n... self.dston = d - timedelta(days=d.weekday() + 1)\n... d = datetime(dt.year, 11, 1)\n... self.dstoff = d - timedelta(days=d.weekday() + 1)\n... def utcoffset(self, dt):\n... return timedelta(hours=1) + self.dst(dt)\n... def dst(self, dt):\n... if self.dston <= dt.replace(tzinfo=None) < self.dstoff:\n... return timedelta(hours=1)\n... else:\n... return timedelta(0)\n... def tzname(self,dt):\n... return "GMT +1"\n...\n>>> class GMT2(tzinfo):\n... def __init__(self):\n... d = datetime(dt.year, 4, 1)\n... self.dston = d - timedelta(days=d.weekday() + 1)\n... d = datetime(dt.year, 11, 1)\n... self.dstoff = d - timedelta(days=d.weekday() + 1)\n... def utcoffset(self, dt):\n... return timedelta(hours=1) + self.dst(dt)\n... def dst(self, dt):\n... if self.dston <= dt.replace(tzinfo=None) < self.dstoff:\n... return timedelta(hours=2)\n... else:\n... return timedelta(0)\n... def tzname(self,dt):\n... return "GMT +2"\n...\n>>> gmt1 = GMT1()\n>>> # Daylight Saving Time\n>>> dt1 = datetime(2006, 11, 21, 16, 30, tzinfo=gmt1)\n>>> dt1.dst()\ndatetime.timedelta(0)\n>>> dt1.utcoffset()\ndatetime.timedelta(0, 3600)\n>>> dt2 = datetime(2006, 6, 14, 13, 0, tzinfo=gmt1)\n>>> dt2.dst()\ndatetime.timedelta(0, 3600)\n>>> dt2.utcoffset()\ndatetime.timedelta(0, 7200)\n>>> # Convert datetime to another time zone\n>>> dt3 = dt2.astimezone(GMT2())\n>>> dt3 # doctest: +ELLIPSIS\ndatetime.datetime(2006, 6, 14, 14, 0, tzinfo=<GMT2 object at 0x...>)\n>>> dt2 # doctest: +ELLIPSIS\ndatetime.datetime(2006, 6, 14, 13, 0, tzinfo=<GMT1 object at 0x...>)\n>>> dt2.utctimetuple() == dt3.utctimetuple()\nTrue\n
A time object represents a (local) time of day, independent of any particular\nday, and subject to adjustment via a tzinfo object.
\nAll arguments are optional. tzinfo may be None, or an instance of a\ntzinfo subclass. The remaining arguments may be ints or longs, in the\nfollowing ranges:
\nIf an argument outside those ranges is given, ValueError is raised. All\ndefault to 0 except tzinfo, which defaults to None.
\nClass attributes:
\n\n\n\n\nInstance attributes (read-only):
\nSupported operations:
\nInstance methods:
\nExample:
\n>>> from datetime import time, tzinfo\n>>> class GMT1(tzinfo):\n... def utcoffset(self, dt):\n... return timedelta(hours=1)\n... def dst(self, dt):\n... return timedelta(0)\n... def tzname(self,dt):\n... return "Europe/Prague"\n...\n>>> t = time(12, 10, 30, tzinfo=GMT1())\n>>> t # doctest: +ELLIPSIS\ndatetime.time(12, 10, 30, tzinfo=<GMT1 object at 0x...>)\n>>> gmt = GMT1()\n>>> t.isoformat()\n'12:10:30+01:00'\n>>> t.dst()\ndatetime.timedelta(0)\n>>> t.tzname()\n'Europe/Prague'\n>>> t.strftime("%H:%M:%S %Z")\n'12:10:30 Europe/Prague'\n
tzinfo is an abstract base class, meaning that this class should not be\ninstantiated directly. You need to derive a concrete subclass, and (at least)\nsupply implementations of the standard tzinfo methods needed by the\ndatetime methods you use. The datetime module does not supply\nany concrete subclasses of tzinfo.
\nAn instance of (a concrete subclass of) tzinfo can be passed to the\nconstructors for datetime and time objects. The latter objects\nview their attributes as being in local time, and the tzinfo object\nsupports methods revealing offset of local time from UTC, the name of the time\nzone, and DST offset, all relative to a date or time object passed to them.
\nSpecial requirement for pickling: A tzinfo subclass must have an\n__init__() method that can be called with no arguments, else it can be\npickled but possibly not unpickled again. This is a technical requirement that\nmay be relaxed in the future.
\nA concrete subclass of tzinfo may need to implement the following\nmethods. Exactly which methods are needed depends on the uses made of aware\ndatetime objects. If in doubt, simply implement all of them.
\nReturn offset of local time from UTC, in minutes east of UTC. If local time is\nwest of UTC, this should be negative. Note that this is intended to be the\ntotal offset from UTC; for example, if a tzinfo object represents both\ntime zone and DST adjustments, utcoffset() should return their sum. If\nthe UTC offset isn’t known, return None. Else the value returned must be a\ntimedelta object specifying a whole number of minutes in the range\n-1439 to 1439 inclusive (1440 = 24*60; the magnitude of the offset must be less\nthan one day). Most implementations of utcoffset() will probably look\nlike one of these two:
\nreturn CONSTANT # fixed-offset class\nreturn CONSTANT + self.dst(dt) # daylight-aware class\n
If utcoffset() does not return None, dst() should not return\nNone either.
\nThe default implementation of utcoffset() raises\nNotImplementedError.
\nReturn the daylight saving time (DST) adjustment, in minutes east of UTC, or\nNone if DST information isn’t known. Return timedelta(0) if DST is not\nin effect. If DST is in effect, return the offset as a timedelta object\n(see utcoffset() for details). Note that DST offset, if applicable, has\nalready been added to the UTC offset returned by utcoffset(), so there’s\nno need to consult dst() unless you’re interested in obtaining DST info\nseparately. For example, datetime.timetuple() calls its tzinfo\nattribute’s dst() method to determine how the tm_isdst flag\nshould be set, and tzinfo.fromutc() calls dst() to account for\nDST changes when crossing time zones.
\nAn instance tz of a tzinfo subclass that models both standard and\ndaylight times must be consistent in this sense:
\ntz.utcoffset(dt) - tz.dst(dt)
\nmust return the same result for every datetime dt with dt.tzinfo ==\ntz. For sane tzinfo subclasses, this expression yields the time\nzone’s “standard offset”, which should not depend on the date or the time, but\nonly on geographic location. The implementation of datetime.astimezone()\nrelies on this, but cannot detect violations; it’s the programmer’s\nresponsibility to ensure it. If a tzinfo subclass cannot guarantee\nthis, it may be able to override the default implementation of\ntzinfo.fromutc() to work correctly with astimezone() regardless.
\nMost implementations of dst() will probably look like one of these two:
\ndef dst(self, dt):\n # a fixed-offset class: doesn't account for DST\n return timedelta(0)\n
or
\ndef dst(self, dt):\n # Code to set dston and dstoff to the time zone's DST\n # transition times based on the input dt.year, and expressed\n # in standard local time. Then\n\n if dston <= dt.replace(tzinfo=None) < dstoff:\n return timedelta(hours=1)\n else:\n return timedelta(0)\n
The default implementation of dst() raises NotImplementedError.
\nReturn the time zone name corresponding to the datetime object dt, as\na string. Nothing about string names is defined by the datetime module,\nand there’s no requirement that it mean anything in particular. For example,\n“GMT”, “UTC”, “-500”, “-5:00”, “EDT”, “US/Eastern”, “America/New York” are all\nvalid replies. Return None if a string name isn’t known. Note that this is\na method rather than a fixed string primarily because some tzinfo\nsubclasses will wish to return different names depending on the specific value\nof dt passed, especially if the tzinfo class is accounting for\ndaylight time.
\nThe default implementation of tzname() raises NotImplementedError.
\nThese methods are called by a datetime or time object, in\nresponse to their methods of the same names. A datetime object passes\nitself as the argument, and a time object passes None as the\nargument. A tzinfo subclass’s methods should therefore be prepared to\naccept a dt argument of None, or of class datetime.
\nWhen None is passed, it’s up to the class designer to decide the best\nresponse. For example, returning None is appropriate if the class wishes to\nsay that time objects don’t participate in the tzinfo protocols. It\nmay be more useful for utcoffset(None) to return the standard UTC offset, as\nthere is no other convention for discovering the standard offset.
\nWhen a datetime object is passed in response to a datetime\nmethod, dt.tzinfo is the same object as self. tzinfo methods can\nrely on this, unless user code calls tzinfo methods directly. The\nintent is that the tzinfo methods interpret dt as being in local\ntime, and not need worry about objects in other timezones.
\nThere is one more tzinfo method that a subclass may wish to override:
\nThis is called from the default datetime.astimezone()\nimplementation. When called from that, dt.tzinfo is self, and dt‘s\ndate and time data are to be viewed as expressing a UTC time. The purpose\nof fromutc() is to adjust the date and time data, returning an\nequivalent datetime in self‘s local time.
\nMost tzinfo subclasses should be able to inherit the default\nfromutc() implementation without problems. It’s strong enough to handle\nfixed-offset time zones, and time zones accounting for both standard and\ndaylight time, and the latter even if the DST transition times differ in\ndifferent years. An example of a time zone the default fromutc()\nimplementation may not handle correctly in all cases is one where the standard\noffset (from UTC) depends on the specific date and time passed, which can happen\nfor political reasons. The default implementations of astimezone() and\nfromutc() may not produce the result you want if the result is one of the\nhours straddling the moment the standard offset changes.
\nSkipping code for error cases, the default fromutc() implementation acts\nlike:
\ndef fromutc(self, dt):\n # raise ValueError error if dt.tzinfo is not self\n dtoff = dt.utcoffset()\n dtdst = dt.dst()\n # raise ValueError if dtoff is None or dtdst is None\n delta = dtoff - dtdst # this is self's standard offset\n if delta:\n dt += delta # convert to standard local time\n dtdst = dt.dst()\n # raise ValueError if dtdst is None\n if dtdst:\n return dt + dtdst\n else:\n return dt\n
Example tzinfo classes:
\nfrom datetime import tzinfo, timedelta, datetime\n\nZERO = timedelta(0)\nHOUR = timedelta(hours=1)\n\n# A UTC class.\n\nclass UTC(tzinfo):\n """UTC"""\n\n def utcoffset(self, dt):\n return ZERO\n\n def tzname(self, dt):\n return "UTC"\n\n def dst(self, dt):\n return ZERO\n\nutc = UTC()\n\n# A class building tzinfo objects for fixed-offset time zones.\n# Note that FixedOffset(0, "UTC") is a different way to build a\n# UTC tzinfo object.\n\nclass FixedOffset(tzinfo):\n """Fixed offset in minutes east from UTC."""\n\n def __init__(self, offset, name):\n self.__offset = timedelta(minutes = offset)\n self.__name = name\n\n def utcoffset(self, dt):\n return self.__offset\n\n def tzname(self, dt):\n return self.__name\n\n def dst(self, dt):\n return ZERO\n\n# A class capturing the platform's idea of local time.\n\nimport time as _time\n\nSTDOFFSET = timedelta(seconds = -_time.timezone)\nif _time.daylight:\n DSTOFFSET = timedelta(seconds = -_time.altzone)\nelse:\n DSTOFFSET = STDOFFSET\n\nDSTDIFF = DSTOFFSET - STDOFFSET\n\nclass LocalTimezone(tzinfo):\n\n def utcoffset(self, dt):\n if self._isdst(dt):\n return DSTOFFSET\n else:\n return STDOFFSET\n\n def dst(self, dt):\n if self._isdst(dt):\n return DSTDIFF\n else:\n return ZERO\n\n def tzname(self, dt):\n return _time.tzname[self._isdst(dt)]\n\n def _isdst(self, dt):\n tt = (dt.year, dt.month, dt.day,\n dt.hour, dt.minute, dt.second,\n dt.weekday(), 0, 0)\n stamp = _time.mktime(tt)\n tt = _time.localtime(stamp)\n return tt.tm_isdst > 0\n\nLocal = LocalTimezone()\n\n\n# A complete implementation of current DST rules for major US time zones.\n\ndef first_sunday_on_or_after(dt):\n days_to_go = 6 - dt.weekday()\n if days_to_go:\n dt += timedelta(days_to_go)\n return dt\n\n\n# US DST Rules\n#\n# This is a simplified (i.e., wrong for a few cases) set of rules for US\n# DST start and end times. 
For a complete and up-to-date set of DST rules\n# and timezone definitions, visit the Olson Database (or try pytz):\n# http://www.twinsun.com/tz/tz-link.htm\n# http://sourceforge.net/projects/pytz/ (might not be up-to-date)\n#\n# In the US, since 2007, DST starts at 2am (standard time) on the second\n# Sunday in March, which is the first Sunday on or after Mar 8.\nDSTSTART_2007 = datetime(1, 3, 8, 2)\n# and ends at 2am (DST time; 1am standard time) on the first Sunday of Nov.\nDSTEND_2007 = datetime(1, 11, 1, 1)\n# From 1987 to 2006, DST used to start at 2am (standard time) on the first\n# Sunday in April and to end at 2am (DST time; 1am standard time) on the last\n# Sunday of October, which is the first Sunday on or after Oct 25.\nDSTSTART_1987_2006 = datetime(1, 4, 1, 2)\nDSTEND_1987_2006 = datetime(1, 10, 25, 1)\n# From 1967 to 1986, DST used to start at 2am (standard time) on the last\n# Sunday in April (the one on or after April 24) and to end at 2am (DST time;\n# 1am standard time) on the last Sunday of October, which is the first Sunday\n# on or after Oct 25.\nDSTSTART_1967_1986 = datetime(1, 4, 24, 2)\nDSTEND_1967_1986 = DSTEND_1987_2006\n\nclass USTimeZone(tzinfo):\n\n def __init__(self, hours, reprname, stdname, dstname):\n self.stdoffset = timedelta(hours=hours)\n self.reprname = reprname\n self.stdname = stdname\n self.dstname = dstname\n\n def __repr__(self):\n return self.reprname\n\n def tzname(self, dt):\n if self.dst(dt):\n return self.dstname\n else:\n return self.stdname\n\n def utcoffset(self, dt):\n return self.stdoffset + self.dst(dt)\n\n def dst(self, dt):\n if dt is None or dt.tzinfo is None:\n # An exception may be sensible here, in one or both cases.\n # It depends on how you want to treat them. The default\n # fromutc() implementation (called by the default astimezone()\n # implementation) passes a datetime with dt.tzinfo is self.\n return ZERO\n assert dt.tzinfo is self\n\n # Find start and end times for US DST. 
For years before 1967, return\n # ZERO for no DST.\n if 2006 < dt.year:\n dststart, dstend = DSTSTART_2007, DSTEND_2007\n elif 1986 < dt.year < 2007:\n dststart, dstend = DSTSTART_1987_2006, DSTEND_1987_2006\n elif 1966 < dt.year < 1987:\n dststart, dstend = DSTSTART_1967_1986, DSTEND_1967_1986\n else:\n return ZERO\n\n start = first_sunday_on_or_after(dststart.replace(year=dt.year))\n end = first_sunday_on_or_after(dstend.replace(year=dt.year))\n\n # Can't compare naive to aware objects, so strip the timezone from\n # dt first.\n if start <= dt.replace(tzinfo=None) < end:\n return HOUR\n else:\n return ZERO\n\nEastern = USTimeZone(-5, "Eastern", "EST", "EDT")\nCentral = USTimeZone(-6, "Central", "CST", "CDT")\nMountain = USTimeZone(-7, "Mountain", "MST", "MDT")\nPacific = USTimeZone(-8, "Pacific", "PST", "PDT")\n
Note that there are unavoidable subtleties twice per year in a tzinfo\nsubclass accounting for both standard and daylight time, at the DST transition\npoints. For concreteness, consider US Eastern (UTC -0500), where EDT begins the\nminute after 1:59 (EST) on the second Sunday in March, and ends the minute after\n1:59 (EDT) on the first Sunday in November:
\n UTC 3:MM 4:MM 5:MM 6:MM 7:MM 8:MM\n EST 22:MM 23:MM 0:MM 1:MM 2:MM 3:MM\n EDT 23:MM 0:MM 1:MM 2:MM 3:MM 4:MM\n\nstart 22:MM 23:MM 0:MM 1:MM 3:MM 4:MM\n\n end 23:MM 0:MM 1:MM 1:MM 2:MM 3:MM
\nWhen DST starts (the “start” line), the local wall clock leaps from 1:59 to\n3:00. A wall time of the form 2:MM doesn’t really make sense on that day, so\nastimezone(Eastern) won’t deliver a result with hour == 2 on the day DST\nbegins. In order for astimezone() to make this guarantee, the\ntzinfo.dst() method must consider times in the “missing hour” (2:MM for\nEastern) to be in daylight time.
\nWhen DST ends (the “end” line), there’s a potentially worse problem: there’s an\nhour that can’t be spelled unambiguously in local wall time: the last hour of\ndaylight time. In Eastern, that’s times of the form 5:MM UTC on the day\ndaylight time ends. The local wall clock leaps from 1:59 (daylight time) back\nto 1:00 (standard time) again. Local times of the form 1:MM are ambiguous.\nastimezone() mimics the local clock’s behavior by mapping two adjacent UTC\nhours into the same local hour then. In the Eastern example, UTC times of the\nform 5:MM and 6:MM both map to 1:MM when converted to Eastern. In order for\nastimezone() to make this guarantee, the tzinfo.dst() method must\nconsider times in the “repeated hour” to be in standard time. This is easily\narranged, as in the example, by expressing DST switch times in the time zone’s\nstandard local time.
\nApplications that can’t bear such ambiguities should avoid using hybrid\ntzinfo subclasses; there are no ambiguities when using UTC, or any\nother fixed-offset tzinfo subclass (such as a class representing only\nEST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)).
\ndate, datetime, and time objects all support a\nstrftime(format) method, to create a string representing the time under the\ncontrol of an explicit format string. Broadly speaking, d.strftime(fmt)\nacts like the time module’s time.strftime(fmt, d.timetuple())\nalthough not all objects support a timetuple() method.
\nConversely, the datetime.strptime() class method creates a\ndatetime object from a string representing a date and time and a\ncorresponding format string. datetime.strptime(date_string, format) is\nequivalent to datetime(*(time.strptime(date_string, format)[0:6])).
\nFor time objects, the format codes for year, month, and day should not\nbe used, as time objects have no such values. If they’re used anyway, 1900\nis substituted for the year, and 1 for the month and day.
\nFor date objects, the format codes for hours, minutes, seconds, and\nmicroseconds should not be used, as date objects have no such\nvalues. If they’re used anyway, 0 is substituted for them.
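A short sketch of these substitutions:

```python
from datetime import date, time

t = time(12, 30)
# time objects have no year/month/day; 1900-01-01 is substituted:
print(t.strftime("%Y-%m-%d"))   # 1900-01-01

d = date(2006, 11, 21)
# date objects have no clock fields; 0 is substituted for them:
print(d.strftime("%H:%M:%S"))   # 00:00:00
```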
\n\nNew in version 2.6: time and datetime objects support a %f format code\nwhich expands to the number of microseconds in the object, zero-padded on\nthe left to six places.
\nFor a naive object, the %z and %Z format codes are replaced by empty\nstrings.
\nFor an aware object: %z is replaced by a 5-character string of the form +HHMM or\n-HHMM, giving the UTC offset from utcoffset() in hours and minutes; %Z is\nreplaced by the value returned by tzname(), or by an empty string if\ntzname() returns None.
\nThe full set of format codes supported varies across platforms, because Python\ncalls the platform C library’s strftime() function, and platform\nvariations are common.
\nThe following is a list of all the format codes that the C standard (1989\nversion) requires, and these work on all platforms with a standard C\nimplementation. Note that the 1999 version of the C standard added additional\nformat codes.
\nThe exact range of years for which strftime() works also varies across\nplatforms. Regardless of platform, years before 1900 cannot be used.
\nDirective | \nMeaning | \nNotes | \n
---|---|---|
%a | \nLocale’s abbreviated weekday\nname. | \n\n |
%A | \nLocale’s full weekday name. | \n\n |
%b | \nLocale’s abbreviated month\nname. | \n\n |
%B | \nLocale’s full month name. | \n\n |
%c | \nLocale’s appropriate date and\ntime representation. | \n\n |
%d | \nDay of the month as a decimal\nnumber [01,31]. | \n\n |
%f | \nMicrosecond as a decimal\nnumber [0,999999], zero-padded\non the left | \n(1) | \n
%H | \nHour (24-hour clock) as a\ndecimal number [00,23]. | \n\n |
%I | \nHour (12-hour clock) as a\ndecimal number [01,12]. | \n\n |
%j | \nDay of the year as a decimal\nnumber [001,366]. | \n\n |
%m | \nMonth as a decimal number\n[01,12]. | \n\n |
%M | \nMinute as a decimal number\n[00,59]. | \n\n |
%p | \nLocale’s equivalent of either\nAM or PM. | \n(2) | \n
%S | \nSecond as a decimal number\n[00,61]. | \n(3) | \n
%U | \nWeek number of the year\n(Sunday as the first day of\nthe week) as a decimal number\n[00,53]. All days in a new\nyear preceding the first\nSunday are considered to be in\nweek 0. | \n(4) | \n
%w | \nWeekday as a decimal number\n[0(Sunday),6]. | \n\n |
%W | \nWeek number of the year\n(Monday as the first day of\nthe week) as a decimal number\n[00,53]. All days in a new\nyear preceding the first\nMonday are considered to be in\nweek 0. | \n(4) | \n
%x | \nLocale’s appropriate date\nrepresentation. | \n\n |
%X | \nLocale’s appropriate time\nrepresentation. | \n\n |
%y | \nYear without century as a\ndecimal number [00,99]. | \n\n |
%Y | \nYear with century as a decimal\nnumber. | \n\n |
%z | \nUTC offset in the form +HHMM\nor -HHMM (empty string if the\nobject is naive). | \n(5) | \n
%Z | \nTime zone name (empty string\nif the object is naive). | \n\n |
%% | \nA literal '%' character. | \n\n |
Notes:
\n\nNew in version 2.4.
\nSource code: Lib/collections.py and Lib/_abcoll.py
\nThis module implements specialized container datatypes providing alternatives to\nPython’s general purpose built-in containers, dict, list,\nset, and tuple.
\nnamedtuple() | \nfactory function for creating tuple subclasses with named fields | \n\nNew in version 2.6. \n | \n
deque | \nlist-like container with fast appends and pops on either end | \n\nNew in version 2.4. \n | \n
Counter | \ndict subclass for counting hashable objects | \n\nNew in version 2.7. \n | \n
OrderedDict | \ndict subclass that remembers the order entries were added | \n\nNew in version 2.7. \n | \n
defaultdict | \ndict subclass that calls a factory function to supply missing values | \n\nNew in version 2.5. \n | \n
In addition to the concrete container classes, the collections module provides\nabstract base classes that can be\nused to test whether a class provides a particular interface, for example,\nwhether it is hashable or a mapping.
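For instance (a sketch; in Python 3 these abstract base classes moved to collections.abc, so the import below tries both locations):

```python
try:
    from collections.abc import Hashable, Mapping  # Python 3 location
except ImportError:
    from collections import Hashable, Mapping      # Python 2 location

print(isinstance({}, Mapping))    # True: dicts provide the mapping interface
print(isinstance([], Hashable))   # False: lists are unhashable
print(issubclass(dict, Mapping))  # True: works on classes as well as instances
```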
\nA counter tool is provided to support convenient and rapid tallies.\nFor example:
\n>>> # Tally occurrences of words in a list\n>>> cnt = Counter()\n>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:\n... cnt[word] += 1\n>>> cnt\nCounter({'blue': 3, 'red': 2, 'green': 1})\n\n>>> # Find the ten most common words in Hamlet\n>>> import re\n>>> words = re.findall('\\w+', open('hamlet.txt').read().lower())\n>>> Counter(words).most_common(10)\n[('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631),\n ('you', 554), ('a', 546), ('my', 514), ('hamlet', 471), ('in', 451)]\n
A Counter is a dict subclass for counting hashable objects.\nIt is an unordered collection where elements are stored as dictionary keys\nand their counts are stored as dictionary values. Counts are allowed to be\nany integer value including zero or negative counts. The Counter\nclass is similar to bags or multisets in other languages.
\nElements are counted from an iterable or initialized from another\nmapping (or counter):
\n>>> c = Counter() # a new, empty counter\n>>> c = Counter('gallahad') # a new counter from an iterable\n>>> c = Counter({'red': 4, 'blue': 2}) # a new counter from a mapping\n>>> c = Counter(cats=4, dogs=8) # a new counter from keyword args\n
Counter objects have a dictionary interface except that they return a zero\ncount for missing items instead of raising a KeyError:
\n>>> c = Counter(['eggs', 'ham'])\n>>> c['bacon'] # count of a missing element is zero\n0\n
Setting a count to zero does not remove an element from a counter.\nUse del to remove it entirely:
\n>>> c['sausage'] = 0 # counter entry with a zero count\n>>> del c['sausage'] # del actually removes the entry\n
\nNew in version 2.7.
\nCounter objects support three methods beyond those available for all\ndictionaries:
\nReturn an iterator over elements repeating each as many times as its\ncount. Elements are returned in arbitrary order. If an element’s count\nis less than one, elements() will ignore it.
\n>>> c = Counter(a=4, b=2, c=0, d=-2)\n>>> list(c.elements())\n['a', 'a', 'a', 'a', 'b', 'b']\n
Return a list of the n most common elements and their counts from the\nmost common to the least. If n is not specified, most_common()\nreturns all elements in the counter. Elements with equal counts are\nordered arbitrarily:
\n>>> Counter('abracadabra').most_common(3)\n[('a', 5), ('r', 2), ('b', 2)]\n
Elements are subtracted from an iterable or from another mapping\n(or counter). Like dict.update() but subtracts counts instead\nof replacing them. Both inputs and outputs may be zero or negative.
\n>>> c = Counter(a=4, b=2, c=0, d=-2)\n>>> d = Counter(a=1, b=2, c=3, d=4)\n>>> c.subtract(d)\n>>> c\nCounter({'a': 3, 'b': 0, 'c': -3, 'd': -6})\n
The usual dictionary methods are available for Counter objects\nexcept for two which work differently for counters.
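A sketch of those two differences: update() adds counts rather than replacing values, and fromkeys() is deliberately unsupported (its usual semantics would be ambiguous for counters):

```python
from collections import Counter

c = Counter(a=1)
c.update(['a', 'b'])     # adds to existing counts instead of replacing them
print(c['a'], c['b'])    # 2 1
c.update(Counter(a=10))  # another counter (or mapping) works too
print(c['a'])            # 12

try:
    Counter.fromkeys('abc')
except NotImplementedError:
    print('fromkeys() is not implemented for Counter')
```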
\n\n\nCommon patterns for working with Counter objects:
\nsum(c.values()) # total of all counts\nc.clear() # reset all counts\nlist(c) # list unique elements\nset(c) # convert to a set\ndict(c) # convert to a regular dictionary\nc.items() # convert to a list of (elem, cnt) pairs\nCounter(dict(list_of_pairs)) # convert from a list of (elem, cnt) pairs\nc.most_common()[:-n:-1] # n least common elements\nc += Counter() # remove zero and negative counts\n
Several mathematical operations are provided for combining Counter\nobjects to produce multisets (counters that have counts greater than zero).\nAddition and subtraction combine counters by adding or subtracting the counts\nof corresponding elements. Intersection and union return the minimum and\nmaximum of corresponding counts. Each operation can accept inputs with signed\ncounts, but the output will exclude results with counts of zero or less.
\n>>> c = Counter(a=3, b=1)\n>>> d = Counter(a=1, b=2)\n>>> c + d # add two counters together: c[x] + d[x]\nCounter({'a': 4, 'b': 3})\n>>> c - d # subtract (keeping only positive counts)\nCounter({'a': 2})\n>>> c & d # intersection: min(c[x], d[x])\nCounter({'a': 1, 'b': 1})\n>>> c | d # union: max(c[x], d[x])\nCounter({'a': 3, 'b': 2})\n
Note
\nCounters were primarily designed to work with positive integers to represent\nrunning counts; however, care was taken to not unnecessarily preclude use\ncases needing other types or negative values. To help with those use cases,\nthis section documents the minimum range and type restrictions.
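For example, a small sketch of the documented mapping behavior that follows from those relaxed restrictions: looking up a missing element returns a zero count instead of raising KeyError, setting a count to zero does not remove the entry, and del removes it:

```python
from collections import Counter

c = Counter(['eggs', 'ham'])
zero = c['bacon']        # missing elements return a zero count, no KeyError
c['sausage'] = 0         # a zero count does not remove the entry...
still_there = 'sausage' in c
del c['sausage']         # ...but del does
removed = 'sausage' not in c
```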
\nSee also
\nCounter class\nadapted for Python 2.5 and an early Bag recipe for Python 2.4.
\nBag class\nin Smalltalk.
\nWikipedia entry for Multisets.
\nC++ multisets\ntutorial with examples.
\nFor mathematical operations on multisets and their use cases, see\nKnuth, Donald. The Art of Computer Programming Volume II,\nSection 4.6.3, Exercise 19.
\nTo enumerate all distinct multisets of a given size over a given set of\nelements, see itertools.combinations_with_replacement().
\nmap(Counter, combinations_with_replacement('ABC', 2)) --> AA AB AC BB BC CC
\n
Returns a new deque object initialized left-to-right (using append()) with\ndata from iterable. If iterable is not specified, the new deque is empty.
\nDeques are a generalization of stacks and queues (the name is pronounced “deck”\nand is short for “double-ended queue”). Deques support thread-safe, memory\nefficient appends and pops from either side of the deque with approximately the\nsame O(1) performance in either direction.
\nThough list objects support similar operations, they are optimized for\nfast fixed-length operations and incur O(n) memory movement costs for\npop(0) and insert(0, v) operations which change both the size and\nposition of the underlying data representation.
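This asymmetry is why deque, rather than list, is the natural base for a first-in-first-out queue; a minimal sketch (the Fifo class name is ours, not part of the module):

```python
from collections import deque

class Fifo:
    'First-in-first-out queue built on deque (O(1) at both ends)'
    def __init__(self):
        self.q = deque()
    def put(self, item):
        self.q.append(item)        # enqueue on the right
    def get(self):
        return self.q.popleft()    # dequeue from the left

f = Fifo()
for task in 'abc':
    f.put(task)
first = f.get()                    # 'a' comes out first
```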
\n\nNew in version 2.4.
\nIf maxlen is not specified or is None, deques may grow to an\narbitrary length. Otherwise, the deque is bounded to the specified maximum\nlength. Once a bounded length deque is full, when new items are added, a\ncorresponding number of items are discarded from the opposite end. Bounded\nlength deques provide functionality similar to the tail filter in\nUnix. They are also useful for tracking transactions and other pools of data\nwhere only the most recent activity is of interest.
\n\nChanged in version 2.6: Added maxlen parameter.
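The discarding behavior can be seen directly; appending five items to a deque bounded at three keeps only the last three:

```python
from collections import deque

d = deque(maxlen=3)
for i in range(5):
    d.append(i)           # once full, the oldest item is discarded on the left
last_three = list(d)      # only the most recent three items remain
```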
\nDeque objects support the following methods:
\nCount the number of deque elements equal to x.
\n\nNew in version 2.7.
\nRemove the first occurrence of value. If not found, raises a\nValueError.
\n\nNew in version 2.5.
\nReverse the elements of the deque in-place and then return None.
\n\nNew in version 2.7.
\nDeque objects also provide one read-only attribute:
\nMaximum size of a deque or None if unbounded.
\n\nNew in version 2.7.
\nIn addition to the above, deques support iteration, pickling, len(d),\nreversed(d), copy.copy(d), copy.deepcopy(d), membership testing with\nthe in operator, and subscript references such as d[-1]. Indexed\naccess is O(1) at both ends but slows to O(n) in the middle. For fast random\naccess, use lists instead.
\nExample:
\n>>> from collections import deque\n>>> d = deque('ghi') # make a new deque with three items\n>>> for elem in d: # iterate over the deque's elements\n... print elem.upper()\nG\nH\nI\n\n>>> d.append('j') # add a new entry to the right side\n>>> d.appendleft('f') # add a new entry to the left side\n>>> d # show the representation of the deque\ndeque(['f', 'g', 'h', 'i', 'j'])\n\n>>> d.pop() # return and remove the rightmost item\n'j'\n>>> d.popleft() # return and remove the leftmost item\n'f'\n>>> list(d) # list the contents of the deque\n['g', 'h', 'i']\n>>> d[0] # peek at leftmost item\n'g'\n>>> d[-1] # peek at rightmost item\n'i'\n\n>>> list(reversed(d)) # list the contents of a deque in reverse\n['i', 'h', 'g']\n>>> 'h' in d # search the deque\nTrue\n>>> d.extend('jkl') # add multiple elements at once\n>>> d\ndeque(['g', 'h', 'i', 'j', 'k', 'l'])\n>>> d.rotate(1) # right rotation\n>>> d\ndeque(['l', 'g', 'h', 'i', 'j', 'k'])\n>>> d.rotate(-1) # left rotation\n>>> d\ndeque(['g', 'h', 'i', 'j', 'k', 'l'])\n\n>>> deque(reversed(d)) # make a new deque in reverse order\ndeque(['l', 'k', 'j', 'i', 'h', 'g'])\n>>> d.clear() # empty the deque\n>>> d.pop() # cannot pop from an empty deque\nTraceback (most recent call last):\n File "<pyshell#6>", line 1, in -toplevel-\n d.pop()\nIndexError: pop from an empty deque\n\n>>> d.extendleft('abc') # extendleft() reverses the input order\n>>> d\ndeque(['c', 'b', 'a'])\n
This section shows various approaches to working with deques.
\nBounded length deques provide functionality similar to the tail filter\nin Unix:
\ndef tail(filename, n=10):\n 'Return the last n lines of a file'\n return deque(open(filename), n)\n
Another approach to using deques is to maintain a sequence of recently\nadded elements by appending to the right and popping to the left:
\ndef moving_average(iterable, n=3):\n # moving_average([40, 30, 50, 46, 39, 44]) --> 40.0 42.0 45.0 43.0\n # http://en.wikipedia.org/wiki/Moving_average\n it = iter(iterable)\n d = deque(itertools.islice(it, n-1))\n d.appendleft(0)\n s = sum(d)\n for elem in it:\n s += elem - d.popleft()\n d.append(elem)\n yield s / float(n)\n
The rotate() method provides a way to implement deque slicing and\ndeletion. For example, a pure Python implementation of del d[n] relies on\nthe rotate() method to position elements to be popped:
\ndef delete_nth(d, n):\n d.rotate(-n)\n d.popleft()\n d.rotate(n)\n
To implement deque slicing, use a similar approach applying\nrotate() to bring a target element to the left side of the deque. Remove\nold entries with popleft(), add new entries with extend(), and then\nreverse the rotation.\nWith minor variations on that approach, it is easy to implement Forth style\nstack manipulations such as dup, drop, swap, over, pick,\nrot, and roll.
\nReturns a new dictionary-like object. defaultdict is a subclass of the\nbuilt-in dict class. It overrides one method and adds one writable\ninstance variable. The remaining functionality is the same as for the\ndict class and is not documented here.
\nThe first argument provides the initial value for the default_factory\nattribute; it defaults to None. All remaining arguments are treated the same\nas if they were passed to the dict constructor, including keyword\narguments.
\n\nNew in version 2.5.
\ndefaultdict objects support the following method in addition to the\nstandard dict operations:
\nIf the default_factory attribute is None, this raises a\nKeyError exception with the key as argument.
\nIf default_factory is not None, it is called without arguments\nto provide a default value for the given key, this value is inserted in\nthe dictionary for the key, and returned.
\nIf calling default_factory raises an exception this exception is\npropagated unchanged.
\nThis method is called by the __getitem__() method of the\ndict class when the requested key is not found; whatever it\nreturns or raises is then returned or raised by __getitem__().
\ndefaultdict objects support the following instance variable:
\nUsing list as the default_factory, it is easy to group a\nsequence of key-value pairs into a dictionary of lists:
\n>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]\n>>> d = defaultdict(list)\n>>> for k, v in s:\n... d[k].append(v)\n...\n>>> d.items()\n[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]\n
When each key is encountered for the first time, it is not already in the\nmapping; so an entry is automatically created using the default_factory\nfunction which returns an empty list. The list.append()\noperation then attaches the value to the new list. When keys are encountered\nagain, the look-up proceeds normally (returning the list for that key) and the\nlist.append() operation adds another value to the list. This technique is\nsimpler and faster than an equivalent technique using dict.setdefault():
\n>>> d = {}\n>>> for k, v in s:\n... d.setdefault(k, []).append(v)\n...\n>>> d.items()\n[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]\n
Setting the default_factory to int makes the\ndefaultdict useful for counting (like a bag or multiset in other\nlanguages):
\n>>> s = 'mississippi'\n>>> d = defaultdict(int)\n>>> for k in s:\n... d[k] += 1\n...\n>>> d.items()\n[('i', 4), ('p', 2), ('s', 4), ('m', 1)]\n
When a letter is first encountered, it is missing from the mapping, so the\ndefault_factory function calls int() to supply a default count of\nzero. The increment operation then builds up the count for each letter.
\nThe function int() which always returns zero is just a special case of\nconstant functions. A faster and more flexible way to create constant functions\nis to use itertools.repeat() which can supply any constant value (not just\nzero):
\n>>> def constant_factory(value):\n... return itertools.repeat(value).next\n>>> d = defaultdict(constant_factory('<missing>'))\n>>> d.update(name='John', action='ran')\n>>> '%(name)s %(action)s to %(object)s' % d\n'John ran to <missing>'\n
Setting the default_factory to set makes the\ndefaultdict useful for building a dictionary of sets:
\n>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]\n>>> d = defaultdict(set)\n>>> for k, v in s:\n... d[k].add(v)\n...\n>>> d.items()\n[('blue', set([2, 4])), ('red', set([1, 3]))]\n
Named tuples assign meaning to each position in a tuple and allow for more readable,\nself-documenting code. They can be used wherever regular tuples are used, and\nthey add the ability to access fields by name instead of position index.
\nReturns a new tuple subclass named typename. The new subclass is used to\ncreate tuple-like objects that have fields accessible by attribute lookup as\nwell as being indexable and iterable. Instances of the subclass also have a\nhelpful docstring (with typename and field_names) and a helpful __repr__()\nmethod which lists the tuple contents in a name=value format.
\nThe field_names are a sequence of strings such as ['x', 'y'].\nAlternatively, field_names can be a single string with each fieldname\nseparated by whitespace and/or commas, for example 'x y' or 'x, y'.
\nAny valid Python identifier may be used for a fieldname except for names\nstarting with an underscore. Valid identifiers consist of letters, digits,\nand underscores but do not start with a digit or underscore and cannot be\na keyword such as class, for, return, global, pass, print,\nor raise.
\nIf rename is true, invalid fieldnames are automatically replaced\nwith positional names. For example, ['abc', 'def', 'ghi', 'abc'] is\nconverted to ['abc', '_1', 'ghi', '_3'], eliminating the keyword\ndef and the duplicate fieldname abc.
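That renaming can be checked directly:

```python
from collections import namedtuple

# 'def' is a keyword and 'abc' is duplicated; rename=True replaces both
# with positional names instead of raising ValueError
Row = namedtuple('Row', ['abc', 'def', 'ghi', 'abc'], rename=True)
```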
\nIf verbose is true, the class definition is printed just before being built.
\nNamed tuple instances do not have per-instance dictionaries, so they are\nlightweight and require no more memory than regular tuples.
\n\nNew in version 2.6.
\n\nChanged in version 2.7: added support for rename.
\nExample:
\n>>> Point = namedtuple('Point', ['x', 'y'], verbose=True)\nclass Point(tuple):\n 'Point(x, y)'\n\n __slots__ = ()\n\n _fields = ('x', 'y')\n\n def __new__(_cls, x, y):\n 'Create a new instance of Point(x, y)'\n return _tuple.__new__(_cls, (x, y))\n\n @classmethod\n def _make(cls, iterable, new=tuple.__new__, len=len):\n 'Make a new Point object from a sequence or iterable'\n result = new(cls, iterable)\n if len(result) != 2:\n raise TypeError('Expected 2 arguments, got %d' % len(result))\n return result\n\n def __repr__(self):\n 'Return a nicely formatted representation string'\n return 'Point(x=%r, y=%r)' % self\n\n def _asdict(self):\n 'Return a new OrderedDict which maps field names to their values'\n return OrderedDict(zip(self._fields, self))\n\n __dict__ = property(_asdict)\n\n def _replace(_self, **kwds):\n 'Return a new Point object replacing specified fields with new values'\n result = _self._make(map(kwds.pop, ('x', 'y'), _self))\n if kwds:\n raise ValueError('Got unexpected field names: %r' % kwds.keys())\n return result\n\n def __getnewargs__(self):\n 'Return self as a plain tuple. Used by copy and pickle.'\n return tuple(self)\n\n x = _property(_itemgetter(0), doc='Alias for field number 0')\n y = _property(_itemgetter(1), doc='Alias for field number 1')\n\n>>> p = Point(11, y=22) # instantiate with positional or keyword arguments\n>>> p[0] + p[1] # indexable like the plain tuple (11, 22)\n33\n>>> x, y = p # unpack like a regular tuple\n>>> x, y\n(11, 22)\n>>> p.x + p.y # fields also accessible by name\n33\n>>> p # readable __repr__ with a name=value style\nPoint(x=11, y=22)\n
Named tuples are especially useful for assigning field names to result tuples returned\nby the csv or sqlite3 modules:
\nEmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')\n\nimport csv\nfor emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):\n print emp.name, emp.title\n\nimport sqlite3\nconn = sqlite3.connect('/companydata')\ncursor = conn.cursor()\ncursor.execute('SELECT name, age, title, department, paygrade FROM employees')\nfor emp in map(EmployeeRecord._make, cursor.fetchall()):\n print emp.name, emp.title\n
In addition to the methods inherited from tuples, named tuples support\nthree additional methods and one attribute. To prevent conflicts with\nfield names, the method and attribute names start with an underscore.
\nClass method that makes a new instance from an existing sequence or iterable.
\n>>> t = [11, 22]\n>>> Point._make(t)\nPoint(x=11, y=22)\n
Return a new OrderedDict which maps field names to their corresponding\nvalues:
\n>>> p._asdict()\nOrderedDict([('x', 11), ('y', 22)])\n
\nChanged in version 2.7: Returns an OrderedDict instead of a regular dict.
\nReturn a new instance of the named tuple replacing specified fields with new\nvalues:
\n>>> p = Point(x=11, y=22)\n>>> p._replace(x=33)\nPoint(x=33, y=22)\n\n>>> for partnum, record in inventory.items():\n inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.time())\n
Tuple of strings listing the field names. Useful for introspection\nand for creating new named tuple types from existing named tuples.
\n>>> p._fields # view the field names\n('x', 'y')\n\n>>> Color = namedtuple('Color', 'red green blue')\n>>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)\n>>> Pixel(11, 22, 128, 255, 0)\nPixel(x=11, y=22, red=128, green=255, blue=0)\n
To retrieve a field whose name is stored in a string, use the getattr()\nfunction:
\n>>> getattr(p, 'x')\n11\n
To convert a dictionary to a named tuple, use the double-star-operator\n(as described in Unpacking Argument Lists):
\n>>> d = {'x': 11, 'y': 22}\n>>> Point(**d)\nPoint(x=11, y=22)\n
Since a named tuple is a regular Python class, it is easy to add or change\nfunctionality with a subclass. Here is how to add a calculated field and\na fixed-width print format:
\n\n\n\n\n>>> class Point(namedtuple('Point', 'x y')):\n __slots__ = ()\n @property\n def hypot(self):\n return (self.x ** 2 + self.y ** 2) ** 0.5\n def __str__(self):\n return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)\n\n\n>>> for p in Point(3, 4), Point(14, 5/7.):\n print p\nPoint: x= 3.000 y= 4.000 hypot= 5.000\nPoint: x=14.000 y= 0.714 hypot=14.018\n
The subclass shown above sets __slots__ to an empty tuple. This helps\nkeep memory requirements low by preventing the creation of instance dictionaries.
\nSubclassing is not useful for adding new, stored fields. Instead, simply\ncreate a new named tuple type from the _fields attribute:
\n>>> Point3D = namedtuple('Point3D', Point._fields + ('z',))\n
Default values can be implemented by using _replace() to\ncustomize a prototype instance:
\n>>> Account = namedtuple('Account', 'owner balance transaction_count')\n>>> default_account = Account('<owner name>', 0.0, 0)\n>>> johns_account = default_account._replace(owner='John')\n
Enumerated constants can be implemented with named tuples, but it is simpler\nand more efficient to use a simple class declaration:
\n>>> Status = namedtuple('Status', 'open pending closed')._make(range(3))\n>>> Status.open, Status.pending, Status.closed\n(0, 1, 2)\n>>> class Status:\n open, pending, closed = range(3)\n
See also
\nNamed tuple recipe\nadapted for Python 2.4.
\nOrdered dictionaries are just like regular dictionaries but they remember the\norder that items were inserted. When iterating over an ordered dictionary,\nthe items are returned in the order their keys were first added.
\nReturn an instance of a dict subclass, supporting the usual dict\nmethods. An OrderedDict is a dict that remembers the order that keys\nwere first inserted. If a new entry overwrites an existing entry, the\noriginal insertion position is left unchanged. Deleting an entry and\nreinserting it will move it to the end.
\n\nNew in version 2.7.
\nIn addition to the usual mapping methods, ordered dictionaries also support\nreverse iteration using reversed().
\nEquality tests between OrderedDict objects are order-sensitive\nand are implemented as list(od1.items())==list(od2.items()).\nEquality tests between OrderedDict objects and other\nMapping objects are order-insensitive like regular dictionaries.\nThis allows OrderedDict objects to be substituted anywhere a\nregular dictionary is used.
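A short check of both behaviors:

```python
from collections import OrderedDict

a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])
same_pairs_diff_order = (a == b)         # False: order matters between OrderedDicts
vs_plain_dict = (a == {'y': 2, 'x': 1})  # True: comparison with a plain dict ignores order
```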
\nThe OrderedDict constructor and update() method both accept\nkeyword arguments, but their order is lost because Python’s function call\nsemantics pass in keyword arguments using a regular unordered dictionary.
\nSee also
\nEquivalent OrderedDict recipe\nthat runs on Python 2.4 or later.
\nSince an ordered dictionary remembers its insertion order, it can be used\nin conjunction with sorting to make a sorted dictionary:
\n>>> # regular unsorted dictionary\n>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}\n\n>>> # dictionary sorted by key\n>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))\nOrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])\n\n>>> # dictionary sorted by value\n>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))\nOrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])\n\n>>> # dictionary sorted by length of the key string\n>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))\nOrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])\n
The new sorted dictionaries maintain their sort order when entries\nare deleted. But when new keys are added, the keys are appended\nto the end and the sort is not maintained.
\nIt is also straightforward to create an ordered dictionary variant\nthat remembers the order the keys were last inserted.\nIf a new entry overwrites an existing entry, the\nentry is moved to the end:
\nclass LastUpdatedOrderedDict(OrderedDict):\n 'Store items in the order the keys were last added'\n\n def __setitem__(self, key, value):\n if key in self:\n del self[key]\n OrderedDict.__setitem__(self, key, value)\n
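A quick check of the reinsertion behavior (the class is repeated here so the sketch runs on its own):

```python
from collections import OrderedDict

class LastUpdatedOrderedDict(OrderedDict):
    'Store items in the order the keys were last added'
    def __setitem__(self, key, value):
        if key in self:
            del self[key]
        OrderedDict.__setitem__(self, key, value)

d = LastUpdatedOrderedDict()
d['a'] = 1
d['b'] = 2
d['a'] = 3          # overwriting 'a' moves it to the end
keys = list(d)      # ['b', 'a']
```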
An ordered dictionary can be combined with the Counter class\nso that the counter remembers the order elements are first encountered:
\nclass OrderedCounter(Counter, OrderedDict):\n 'Counter that remembers the order elements are first encountered'\n\n def __repr__(self):\n return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))\n\n def __reduce__(self):\n return self.__class__, (OrderedDict(self),)\n
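A quick check that counts accumulate while first-encounter order is kept (the class is repeated so the sketch runs on its own):

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

oc = OrderedCounter('abracadabra')
first_seen = list(oc)       # keys in first-encounter order, not count order
```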
The collections module offers the following ABCs:
\nABC | \nInherits from | \nAbstract Methods | \nMixin Methods | \n
---|---|---|---|
Container | \n\n | __contains__ | \n\n |
Hashable | \n\n | __hash__ | \n\n |
Iterable | \n\n | __iter__ | \n\n |
Iterator | \nIterable | \nnext | \n__iter__ | \n
Sized | \n\n | __len__ | \n\n |
Callable | \n\n | __call__ | \n\n |
Sequence | \nSized,\nIterable,\nContainer | \n__getitem__ | \n__contains__, __iter__, __reversed__,\nindex, and count | \n
MutableSequence | \nSequence | \n__setitem__,\n__delitem__,\ninsert | \nInherited Sequence methods and\nappend, reverse, extend, pop,\nremove, and __iadd__ | \n
Set | \nSized,\nIterable,\nContainer | \n\n | __le__, __lt__, __eq__, __ne__,\n__gt__, __ge__, __and__, __or__,\n__sub__, __xor__, and isdisjoint | \n
MutableSet | \nSet | \nadd,\ndiscard | \nInherited Set methods and\nclear, pop, remove, __ior__,\n__iand__, __ixor__, and __isub__ | \n
Mapping | \nSized,\nIterable,\nContainer | \n__getitem__ | \n__contains__, keys, items, values,\nget, __eq__, and __ne__ | \n
MutableMapping | \nMapping | \n__setitem__,\n__delitem__ | \nInherited Mapping methods and\npop, popitem, clear, update,\nand setdefault | \n
MappingView | \nSized | \n\n | __len__ | \n
ItemsView | \nMappingView,\nSet | \n\n | __contains__,\n__iter__ | \n
KeysView | \nMappingView,\nSet | \n\n | __contains__,\n__iter__ | \n
ValuesView | \nMappingView | \n\n | __contains__, __iter__ | \n
These ABCs allow us to ask classes or instances if they provide\nparticular functionality, for example:
\nsize = None\nif isinstance(myvar, collections.Sized):\n size = len(myvar)\n
Several of the ABCs are also useful as mixins that make it easier to develop\nclasses supporting container APIs. For example, to write a class supporting\nthe full Set API, it only necessary to supply the three underlying\nabstract methods: __contains__(), __iter__(), and __len__().\nThe ABC supplies the remaining methods such as __and__() and\nisdisjoint()
\nclass ListBasedSet(collections.Set):\n ''' Alternate set implementation favoring space over speed\n and not requiring the set elements to be hashable. '''\n def __init__(self, iterable):\n self.elements = lst = []\n for value in iterable:\n if value not in lst:\n lst.append(value)\n def __iter__(self):\n return iter(self.elements)\n def __contains__(self, value):\n return value in self.elements\n def __len__(self):\n return len(self.elements)\n\ns1 = ListBasedSet('abcdef')\ns2 = ListBasedSet('defghi')\noverlap = s1 & s2 # The __and__() method is supported automatically\n
Notes on using Set and MutableSet as a mixin:
\nSee also
\nSource code: Lib/sched.py
\nThe sched module defines a class which implements a general purpose event\nscheduler:
\nExample:
\n>>> import sched, time\n>>> s = sched.scheduler(time.time, time.sleep)\n>>> def print_time(): print "From print_time", time.time()\n...\n>>> def print_some_times():\n... print time.time()\n... s.enter(5, 1, print_time, ())\n... s.enter(10, 1, print_time, ())\n... s.run()\n... print time.time()\n...\n>>> print_some_times()\n930343690.257\nFrom print_time 930343695.274\nFrom print_time 930343700.273\n930343700.276\n
In multi-threaded environments, the scheduler class has limitations\nwith respect to thread-safety, inability to insert a new task before\nthe one currently pending in a running scheduler, and holding up the main\nthread until the event queue is empty. Instead, the preferred approach\nis to use the threading.Timer class instead.
\nExample:
\n>>> import time\n>>> from threading import Timer\n>>> def print_time():\n... print "From print_time", time.time()\n...\n>>> def print_some_times():\n... print time.time()\n... Timer(5, print_time, ()).start()\n... Timer(10, print_time, ()).start()\n... time.sleep(11) # sleep while time-delay events execute\n... print time.time()\n...\n>>> print_some_times()\n930343690.257\nFrom print_time 930343695.274\nFrom print_time 930343700.273\n930343701.301\n
scheduler instances have the following methods and attributes:
\nSchedule a new event. The time argument should be a numeric type compatible\nwith the return value of the timefunc function passed to the constructor.\nEvents scheduled for the same time will be executed in the order of their\npriority.
\nExecuting the event means executing action(*argument). argument must be a\nsequence holding the parameters for action.
\nReturn value is an event which may be used for later cancellation of the event\n(see cancel()).
\nRun all scheduled events. This function will wait (using the delayfunc()\nfunction passed to the constructor) for the next event, then execute it and so\non until there are no more scheduled events.
\nEither action or delayfunc can raise an exception. In either case, the\nscheduler will maintain a consistent state and propagate the exception. If an\nexception is raised by action, the event will not be attempted in future calls\nto run().
\nIf a sequence of events takes longer to run than the time available before the\nnext event, the scheduler will simply fall behind. No events will be dropped;\nthe calling code is responsible for canceling events which are no longer\npertinent.
\nRead-only attribute returning a list of upcoming events in the order they\nwill be run. Each event is shown as a named tuple with the\nfollowing fields: time, priority, action, argument.
\n\nNew in version 2.6.
\n\nDeprecated since version 2.6: The mutex module has been removed in Python 3.0.
\nThe mutex module defines a class that allows mutual-exclusion via\nacquiring and releasing locks. It does not require (or imply)\nthreading or multi-tasking, though it could be useful for those\npurposes.
\nThe mutex module defines the following class:
\nCreate a new (unlocked) mutex.
\nA mutex has two pieces of state — a “locked” bit and a queue. When the mutex\nis not locked, the queue is empty. Otherwise, the queue contains zero or more\n(function, argument) pairs representing functions (or methods) waiting to\nacquire the lock. When the mutex is unlocked while the queue is not empty, the\nfirst queue entry is removed and its function(argument) pair called,\nimplying it now has the lock.
\nOf course, no multi-threading is implied – hence the funny interface for\nlock(), where a function is called once the lock is acquired.
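The behavior just described fits in a few lines of pure Python; this is a sketch of the idea (the TinyMutex name is ours), not the module's actual implementation:

```python
from collections import deque

class TinyMutex:
    'Sketch of the mutex idea: a locked bit plus a queue of waiters'
    def __init__(self):
        self.locked = False
        self.queue = deque()
    def lock(self, function, argument):
        'Run function(argument) now if free, else queue it until unlock()'
        if self.locked:
            self.queue.append((function, argument))
        else:
            self.locked = True
            function(argument)
    def unlock(self):
        'Hand the lock to the next waiter, if any; otherwise release it'
        if self.queue:
            function, argument = self.queue.popleft()
            function(argument)
        else:
            self.locked = False

calls = []
m = TinyMutex()
m.lock(calls.append, 'first')     # acquires immediately
m.lock(calls.append, 'second')    # queued until unlock()
m.unlock()                        # 'second' now runs, holding the lock
m.unlock()                        # queue empty: the mutex is free again
```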
\nmutex objects have following methods:
\n\nNew in version 2.3.
\n\nDeprecated since version 2.6: The built-in set/frozenset types replace this module.
\nThe sets module provides classes for constructing and manipulating\nunordered collections of unique elements. Common uses include membership\ntesting, removing duplicates from a sequence, and computing standard math\noperations on sets such as intersection, union, difference, and symmetric\ndifference.
\nLike other collections, sets support x in set, len(set), and for x in\nset. Being an unordered collection, sets do not record element position or\norder of insertion. Accordingly, sets do not support indexing, slicing, or\nother sequence-like behavior.
\nMost set applications use the Set class which provides every set method\nexcept for __hash__(). For advanced applications requiring a hash method,\nthe ImmutableSet class adds a __hash__() method but omits methods\nwhich alter the contents of the set. Both Set and ImmutableSet\nderive from BaseSet, an abstract class useful for determining whether\nsomething is a set: isinstance(obj, BaseSet).
\nThe set classes are implemented using dictionaries. Accordingly, the\nrequirements for set elements are the same as those for dictionary keys; namely,\nthat the element defines both __eq__() and __hash__(). As a result,\nsets cannot contain mutable elements such as lists or dictionaries. However,\nthey can contain immutable collections such as tuples or instances of\nImmutableSet. For convenience in implementing sets of sets, inner sets\nare automatically converted to immutable form, for example,\nSet([Set(['dog'])]) is transformed to Set([ImmutableSet(['dog'])]).
\nConstructs a new empty ImmutableSet object. If the optional iterable\nparameter is supplied, updates the set with elements obtained from iteration.\nAll of the elements in iterable should be immutable or be transformable to an\nimmutable using the protocol described in section Protocol for automatic conversion to immutable.
\nBecause ImmutableSet objects provide a __hash__() method, they\ncan be used as set elements or as dictionary keys. ImmutableSet\nobjects do not have methods for adding or removing elements, so all of the\nelements must be known when the constructor is called.
\nInstances of Set and ImmutableSet both provide the following\noperations:
\nOperation | \nEquivalent | \nResult | \n
---|---|---|
len(s) | \n\n | cardinality of set s | \n
x in s | \n\n | test x for membership in s | \n
x not in s | \n\n | test x for non-membership in\ns | \n
s.issubset(t) | \ns <= t | \ntest whether every element in\ns is in t | \n
s.issuperset(t) | \ns >= t | \ntest whether every element in\nt is in s | \n
s.union(t) | \ns | t | \nnew set with elements from both\ns and t | \n
s.intersection(t) | \ns & t | \nnew set with elements common to\ns and t | \n
s.difference(t) | \ns - t | \nnew set with elements in s\nbut not in t | \n
s.symmetric_difference(t) | \ns ^ t | \nnew set with elements in either\ns or t but not both | \n
s.copy() | \n\n | new set with a shallow copy of\ns | \n
Note, the non-operator versions of union(), intersection(),\ndifference(), and symmetric_difference() will accept any iterable as\nan argument. In contrast, their operator based counterparts require their\narguments to be sets. This precludes error-prone constructions like\nSet('abc') & 'cbs' in favor of the more readable\nSet('abc').intersection('cbs').
\n\nChanged in version 2.3.1: Formerly all arguments were required to be sets.
\nIn addition, both Set and ImmutableSet support set to set\ncomparisons. Two sets are equal if and only if every element of each set is\ncontained in the other (each is a subset of the other). A set is less than\nanother set if and only if the first set is a proper subset of the second set\n(is a subset, but is not equal). A set is greater than another set if and only\nif the first set is a proper superset of the second set (is a superset, but is\nnot equal).
\nThe subset and equality comparisons do not generalize to a complete ordering\nfunction. For example, any two disjoint sets are not equal and are not subsets\nof each other, so all of the following return False: a<b, a==b,\nor a>b. Accordingly, sets do not implement the __cmp__() method.
\nSince sets only define partial ordering (subset relationships), the output of\nthe list.sort() method is undefined for lists of sets.
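The same partial ordering holds for the built-in set type, so the point is easy to verify (built-ins are used here since the sets module itself is deprecated):

```python
a = set('ab')
b = set('cd')
# Disjoint sets: no subset relation holds in either direction,
# and the sets are not equal, so none of <, ==, > is true.
comparisons = (a < b, a == b, a > b)
```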
\nThe following table lists operations available in ImmutableSet but not\nfound in Set:
\nOperation | \nResult | \n
---|---|
hash(s) | \nreturns a hash value for s | \n
The following table lists operations available in Set but not found in\nImmutableSet:
\nOperation | \nEquivalent | \nResult | \n
---|---|---|
s.update(t) | \ns |= t | \nreturn set s with elements\nadded from t | \n
s.intersection_update(t) | \ns &= t | \nreturn set s keeping only\nelements also found in t | \n
s.difference_update(t) | \ns -= t | \nreturn set s after removing\nelements found in t | \n
s.symmetric_difference_update(t) | \ns ^= t | \nreturn set s with elements\nfrom s or t but not both | \n
s.add(x) | \n\n | add element x to set s | \n
s.remove(x) | \n\n | remove x from set s; raises\nKeyError if not present | \n
s.discard(x) | \n\n | removes x from set s if\npresent | \n
s.pop() | \n\n | remove and return an arbitrary\nelement from s; raises\nKeyError if empty | \n
s.clear() | \n\n | remove all elements from set\ns | \n
Note, the non-operator versions of update(), intersection_update(),\ndifference_update(), and symmetric_difference_update() will accept\nany iterable as an argument.
\n\nChanged in version 2.3.1: Formerly all arguments were required to be sets.
\nNote that the module also includes a union_update() method which is an\nalias for update(). The method is included for backwards compatibility.\nProgrammers should prefer the update() method because it is supported by\nthe built-in set() and frozenset() types.
\n>>> from sets import Set\n>>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])\n>>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])\n>>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack'])\n>>> employees = engineers | programmers | managers # union\n>>> engineering_management = engineers & managers # intersection\n>>> fulltime_management = managers - engineers - programmers # difference\n>>> engineers.add('Marvin') # add element\n>>> print engineers # doctest: +SKIP\nSet(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])\n>>> employees.issuperset(engineers) # superset test\nFalse\n>>> employees.update(engineers) # update from another set\n>>> employees.issuperset(engineers)\nTrue\n>>> for group in [engineers, programmers, managers, employees]: # doctest: +SKIP\n... group.discard('Susan') # unconditionally remove element\n... print group\n...\nSet(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])\nSet(['Janice', 'Jack', 'Sam'])\nSet(['Jane', 'Zack', 'Jack'])\nSet(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])\n
Sets can only contain immutable elements. For convenience, mutable Set\nobjects are automatically copied to an ImmutableSet before being added\nas a set element.
The mechanism is as follows: if an element is hashable, it is added directly; otherwise, the element is checked for an __as_immutable__() method, which returns an immutable equivalent.
\nSince Set objects have a __as_immutable__() method returning an\ninstance of ImmutableSet, it is possible to construct sets of sets.
A similar mechanism is needed by the __contains__() and remove() methods, which need to hash an element to check for membership in a set. Those methods check an element for hashability and, if it is not hashable, check for a __as_temporarily_immutable__() method which returns the element wrapped by a class that provides temporary methods for __hash__(), __eq__(), and __ne__().
\nThe alternate mechanism spares the need to build a separate copy of the original\nmutable object.
\nSet objects implement the __as_temporarily_immutable__() method\nwhich returns the Set object wrapped by a new class\n_TemporarilyImmutableSet.
\nThe two mechanisms for adding hashability are normally invisible to the user;\nhowever, a conflict can arise in a multi-threaded environment where one thread\nis updating a set while another has temporarily wrapped it in\n_TemporarilyImmutableSet. In other words, sets of mutable sets are not\nthread-safe.
\nThe built-in set and frozenset types were designed based on\nlessons learned from the sets module. The key differences are:
\nNote
\nThe Queue module has been renamed to queue in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nSource code: Lib/Queue.py
\nThe Queue module implements multi-producer, multi-consumer queues.\nIt is especially useful in threaded programming when information must be\nexchanged safely between multiple threads. The Queue class in this\nmodule implements all the required locking semantics. It depends on the\navailability of thread support in Python; see the threading\nmodule.
\nImplements three types of queue whose only difference is the order that\nthe entries are retrieved. In a FIFO queue, the first tasks added are\nthe first retrieved. In a LIFO queue, the most recently added entry is\nthe first retrieved (operating like a stack). With a priority queue,\nthe entries are kept sorted (using the heapq module) and the\nlowest valued entry is retrieved first.
\nThe Queue module defines the following classes and exceptions:
Constructor for a LIFO queue. maxsize is an integer that sets the upper bound on the number of items that can be placed in the queue. Insertion will block once this size has been reached, until queue items are consumed. If maxsize is less than or equal to zero, the queue size is infinite.
\n\nNew in version 2.6.
Constructor for a priority queue. maxsize is an integer that sets the upper bound on the number of items that can be placed in the queue. Insertion will block once this size has been reached, until queue items are consumed. If maxsize is less than or equal to zero, the queue size is infinite.
\nThe lowest valued entries are retrieved first (the lowest valued entry is the\none returned by sorted(list(entries))[0]). A typical pattern for entries\nis a tuple in the form: (priority_number, data).
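The retrieval order of the three queue types can be sketched as follows; the try/except import accounts for the Python 3 rename of the module, and the (priority_number, data) tuple pattern mentioned above is used for all three so the same items can be fed to each:

```python
# Sketch: the three queue types differ only in retrieval order.
try:
    from Queue import Queue, LifoQueue, PriorityQueue   # Python 2
except ImportError:
    from queue import Queue, LifoQueue, PriorityQueue   # Python 3

fifo, lifo, prio = Queue(), LifoQueue(), PriorityQueue()
for q in (fifo, lifo, prio):
    q.put((2, 'second'))
    q.put((1, 'first'))
    q.put((3, 'third'))

fifo_order = [fifo.get()[1] for _ in range(3)]  # insertion order
lifo_order = [lifo.get()[1] for _ in range(3)]  # reverse insertion order
prio_order = [prio.get()[1] for _ in range(3)]  # sorted by priority number
```

Here fifo_order is ['second', 'first', 'third'], lifo_order is ['third', 'first', 'second'], and prio_order is ['first', 'second', 'third'].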
\n\nNew in version 2.6.
\nSee also
\ncollections.deque is an alternative implementation of unbounded\nqueues with fast atomic append() and popleft() operations that\ndo not require locking.
\nQueue objects (Queue, LifoQueue, or PriorityQueue)\nprovide the public methods described below.
\nPut item into the queue. If optional args block is true and timeout is\nNone (the default), block if necessary until a free slot is available. If\ntimeout is a positive number, it blocks at most timeout seconds and raises\nthe Full exception if no free slot was available within that time.\nOtherwise (block is false), put an item on the queue if a free slot is\nimmediately available, else raise the Full exception (timeout is\nignored in that case).
\n\nNew in version 2.3: The timeout parameter.
\nRemove and return an item from the queue. If optional args block is true and\ntimeout is None (the default), block if necessary until an item is available.\nIf timeout is a positive number, it blocks at most timeout seconds and\nraises the Empty exception if no item was available within that time.\nOtherwise (block is false), return an item if one is immediately available,\nelse raise the Empty exception (timeout is ignored in that case).
\n\nNew in version 2.3: The timeout parameter.
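The non-blocking forms of put() and get() can be sketched as follows; instead of blocking, they raise Full or Empty immediately (the *_nowait() methods are shorthand for block=False):

```python
# Sketch: non-blocking put/get on a bounded queue.
try:
    from Queue import Queue, Full, Empty   # Python 2
except ImportError:
    from queue import Queue, Full, Empty   # Python 3

q = Queue(maxsize=1)
overflowed = drained = False

q.put('only item')
try:
    q.put('too many', block=False)   # queue is full; raises instead of blocking
except Full:
    overflowed = True

item = q.get()                       # 'only item'
try:
    q.get_nowait()                   # shorthand for get(block=False)
except Empty:
    drained = True
```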
\nTwo methods are offered to support tracking whether enqueued tasks have been\nfully processed by daemon consumer threads.
\nIndicate that a formerly enqueued task is complete. Used by queue consumer\nthreads. For each get() used to fetch a task, a subsequent call to\ntask_done() tells the queue that the processing on the task is complete.
\nIf a join() is currently blocking, it will resume when all items have been\nprocessed (meaning that a task_done() call was received for every item\nthat had been put() into the queue).
\nRaises a ValueError if called more times than there were items placed in\nthe queue.
\n\nNew in version 2.5.
\nBlocks until all items in the queue have been gotten and processed.
\nThe count of unfinished tasks goes up whenever an item is added to the queue.\nThe count goes down whenever a consumer thread calls task_done() to\nindicate that the item was retrieved and all work on it is complete. When the\ncount of unfinished tasks drops to zero, join() unblocks.
\n\nNew in version 2.5.
\nExample of how to wait for enqueued tasks to be completed:
\ndef worker():\n while True:\n item = q.get()\n do_work(item)\n q.task_done()\n\nq = Queue()\nfor i in range(num_worker_threads):\n t = Thread(target=worker)\n t.daemon = True\n t.start()\n\nfor item in source():\n q.put(item)\n\nq.join() # block until all tasks are done\n
\nDeprecated since version 2.6: The new module has been removed in Python 3.0. Use the types\nmodule’s classes instead.
\nThe new module allows an interface to the interpreter object creation\nfunctions. This is for use primarily in marshal-type functions, when a new\nobject needs to be created “magically” and not by using the regular creation\nfunctions. This module provides a low-level interface to the interpreter, so\ncare must be exercised when using this module. It is possible to supply\nnon-sensical arguments which crash the interpreter when the object is used.
\nThe new module defines the following functions:
\nSource code: Lib/types.py
\nThis module defines names for some object types that are used by the standard\nPython interpreter, but not for the types defined by various extension modules.\nAlso, it does not include some of the types that arise during processing such as\nthe listiterator type. It is safe to use from types import * — the\nmodule does not export any names besides the ones listed here. New names\nexported by future versions of this module will all end in Type.
\nTypical use is for functions that do different things depending on their\nargument types, like the following:
\nfrom types import *\ndef delete(mylist, item):\n if type(item) is IntType:\n del mylist[item]\n else:\n mylist.remove(item)\n
Starting in Python 2.2, built-in factory functions such as int() and\nstr() are also names for the corresponding types. This is now the\npreferred way to access the type instead of using the types module.\nAccordingly, the example above should be written as follows:
\ndef delete(mylist, item):\n if isinstance(item, int):\n del mylist[item]\n else:\n mylist.remove(item)\n
The module defines the following names:
\nThe type of type objects (such as returned by type()); alias of the\nbuilt-in type.
\nThe type of the bool values True and False; alias of the\nbuilt-in bool.
\n\nNew in version 2.3.
\nThe type of generator-iterator objects, produced by calling a\ngenerator function.
\n\nNew in version 2.2.
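A quick sketch of the distinction: GeneratorType matches the generator-iterator returned by calling a generator function, not the generator function itself:

```python
import types

def countdown(n):
    """A generator function: calling it returns a generator-iterator."""
    while n:
        yield n
        n -= 1

g = countdown(3)
is_gen = isinstance(g, types.GeneratorType)             # True
func_is_gen = isinstance(countdown, types.GeneratorType)  # False
```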
\nThe type of range objects returned by xrange(); alias of the built-in\nxrange.
\nThe type of objects defined in extension modules with PyGetSetDef, such\nas FrameType.f_locals or array.array.typecode. This type is used as\ndescriptor for object attributes; it has the same purpose as the\nproperty type, but for classes defined in extension modules.
\n\nNew in version 2.5.
\nThe type of objects defined in extension modules with PyMemberDef, such\nas datetime.timedelta.days. This type is used as descriptor for simple C\ndata members which use standard conversion functions; it has the same purpose\nas the property type, but for classes defined in extension modules.
\nCPython implementation detail: In other implementations of Python, this type may be identical to\nGetSetDescriptorType.
\n\nNew in version 2.5.
\nA sequence containing StringType and UnicodeType used to facilitate\neasier checking for any string object. Using this is more portable than using a\nsequence of the two string types constructed elsewhere since it only contains\nUnicodeType if it has been built in the running version of Python. For\nexample: isinstance(s, types.StringTypes).
\n\nNew in version 2.2.
\nSource code: Lib/UserDict.py
\nThe module defines a mixin, DictMixin, defining all dictionary methods\nfor classes that already have a minimum mapping interface. This greatly\nsimplifies writing classes that need to be substitutable for dictionaries (such\nas the shelve module).
\nThis module also defines a class, UserDict, that acts as a wrapper\naround dictionary objects. The need for this class has been largely supplanted\nby the ability to subclass directly from dict (a feature that became\navailable starting with Python version 2.2). Prior to the introduction of\ndict, the UserDict class was used to create dictionary-like\nsub-classes that obtained new behaviors by overriding existing methods or adding\nnew ones.
\nThe UserDict module defines the UserDict class and\nDictMixin:
Class that simulates a dictionary. The instance’s contents are kept in a regular dictionary, which is accessible via the data attribute of UserDict instances. If initialdata is provided, data is initialized with its contents; note that a reference to initialdata will not be kept, allowing it to be used for other purposes.
\nNote
\nFor backward compatibility, instances of UserDict are not iterable.
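Basic usage can be sketched as follows; the try/except import accounts for UserDict moving into the collections module in Python 3:

```python
# Sketch: UserDict behaves like a dict, with its contents exposed
# through the .data attribute.
try:
    from UserDict import UserDict      # Python 2
except ImportError:
    from collections import UserDict   # Python 3

d = UserDict({'red': 1})
d['green'] = 2                         # normal mapping operations...
backing = d.data                       # ...backed by a regular dictionary
```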
\nIn addition to supporting the methods and operations of mappings (see section\nMapping Types — dict), UserDict and IterableUserDict instances\nprovide the following attribute:
\n\n\nMixin defining all dictionary methods for classes that already have a minimum\ndictionary interface including __getitem__(), __setitem__(),\n__delitem__(), and keys().
\nThis mixin should be used as a superclass. Adding each of the above methods\nadds progressively more functionality. For instance, defining all but\n__delitem__() will preclude only pop() and popitem() from the\nfull interface.
\nIn addition to the four base methods, progressively more efficiency comes with\ndefining __contains__(), __iter__(), and iteritems().
\nSince the mixin has no knowledge of the subclass constructor, it does not define\n__init__() or copy().
\nStarting with Python version 2.6, it is recommended to use\ncollections.MutableMapping instead of DictMixin.
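A minimal sketch of that recommended approach: derive from MutableMapping, supply the five abstract methods, and the mixin fills in the rest of the mapping interface (get(), __contains__(), pop(), and so on). The LowerDict class and its key-normalizing behavior are illustrative, not from the source:

```python
# The abc submodule is used where available; older versions exposed
# MutableMapping directly in collections.
try:
    from collections.abc import MutableMapping   # Python 3.3+
except ImportError:
    from collections import MutableMapping       # Python 2.6 - 3.2

class LowerDict(MutableMapping):
    """A mapping that normalizes string keys to lower case."""
    def __init__(self):
        self._store = {}
    def __getitem__(self, key):
        return self._store[key.lower()]
    def __setitem__(self, key, value):
        self._store[key.lower()] = value
    def __delitem__(self, key):
        del self._store[key.lower()]
    def __iter__(self):
        return iter(self._store)
    def __len__(self):
        return len(self._store)

d = LowerDict()
d['Content-Type'] = 'text/html'
value = d['content-type']              # derived lookup works
fallback = d.get('missing', 'n/a')     # mixin-provided get() works too
```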
\nNote
\nThis module is available for backward compatibility only. If you are writing\ncode that does not need to work with versions of Python earlier than Python 2.2,\nplease consider subclassing directly from the built-in list type.
\nThis module defines a class that acts as a wrapper around list objects. It is a\nuseful base class for your own list-like classes, which can inherit from them\nand override existing methods or add new ones. In this way one can add new\nbehaviors to lists.
\nThe UserList module defines the UserList class:
\nClass that simulates a list. The instance’s contents are kept in a regular\nlist, which is accessible via the data attribute of UserList\ninstances. The instance’s contents are initially set to a copy of list,\ndefaulting to the empty list []. list can be any iterable, e.g. a\nreal Python list or a UserList object.
\nNote
\nThe UserList class has been moved to the collections\nmodule in Python 3.0. The 2to3 tool will automatically adapt\nimports when converting your sources to 3.0.
\nIn addition to supporting the methods and operations of mutable sequences (see\nsection Sequence Types — str, unicode, list, tuple, bytearray, buffer, xrange), UserList instances provide the following\nattribute:
Subclassing requirements: Subclasses of UserList are expected to offer a constructor which can be called with either no arguments or one argument. List operations which return a new sequence attempt to create an instance of the actual implementation class. To do so, it assumes that the constructor can be called with a single parameter, which is a sequence object used as a data source.
\nIf a derived class does not wish to comply with this requirement, all of the\nspecial methods supported by this class will need to be overridden; please\nconsult the sources for information about the methods which need to be provided\nin that case.
\n\nChanged in version 2.0: Python versions 1.5.2 and 1.6 also required that the constructor be callable\nwith no parameters, and offer a mutable data attribute. Earlier\nversions of Python did not attempt to create instances of the derived class.
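The subclassing requirement above can be sketched as follows; because Stack's constructor accepts a single sequence argument, list operations such as slicing return Stack instances rather than plain lists. The Stack class is illustrative, and the import fallback covers the Python 3 move of UserList into collections:

```python
try:
    from UserList import UserList      # Python 2
except ImportError:
    from collections import UserList   # Python 3

class Stack(UserList):
    """A list-like class; inherits UserList's single-argument constructor."""
    def push(self, item):
        self.data.append(item)

s = Stack([1, 2])
s.push(3)
t = s[1:]    # list operations return a new instance of the subclass
```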
\nNote
The UserString class from this module is available for backward compatibility only. If you are writing code that does not need to work with versions of Python earlier than Python 2.2, please consider subclassing directly from the built-in str type instead of using UserString (there is no built-in equivalent to MutableString).
\nThis module defines a class that acts as a wrapper around string objects. It is\na useful base class for your own string-like classes, which can inherit from\nthem and override existing methods or add new ones. In this way one can add new\nbehaviors to strings.
\nIt should be noted that these classes are highly inefficient compared to real\nstring or Unicode objects; this is especially the case for\nMutableString.
\nThe UserString module defines the following classes:
\nClass that simulates a string or a Unicode string object. The instance’s\ncontent is kept in a regular string or Unicode string object, which is\naccessible via the data attribute of UserString instances. The\ninstance’s contents are initially set to a copy of sequence. sequence can\nbe either a regular Python string or Unicode string, an instance of\nUserString (or a subclass) or an arbitrary sequence which can be\nconverted into a string using the built-in str() function.
\nNote
\nThe UserString class has been moved to the collections\nmodule in Python 3.0. The 2to3 tool will automatically adapt\nimports when converting your sources to 3.0.
This class is derived from the UserString class above and redefines strings to be mutable. Mutable strings can’t be used as dictionary keys, because dictionaries require immutable objects as keys. The main intention of this class is to serve as an educational example of inheritance and of the necessity to override the __hash__() method in order to trap attempts to use a mutable object as a dictionary key, which would otherwise be very error-prone and hard to track down.
\n\nDeprecated since version 2.6: The MutableString class has been removed in Python 3.0.
\nIn addition to supporting the methods and operations of string and Unicode\nobjects (see section String Methods), UserString instances\nprovide the following attribute:
\n\nNew in version 2.1.
\nSource code: Lib/weakref.py
\nThe weakref module allows the Python programmer to create weak\nreferences to objects.
\nIn the following, the term referent means the object which is referred to\nby a weak reference.
\nA weak reference to an object is not enough to keep the object alive: when the\nonly remaining references to a referent are weak references,\ngarbage collection is free to destroy the referent and reuse its memory\nfor something else. A primary use for weak references is to implement caches or\nmappings holding large objects, where it’s desired that a large object not be\nkept alive solely because it appears in a cache or mapping.
\nFor example, if you have a number of large binary image objects, you may wish to\nassociate a name with each. If you used a Python dictionary to map names to\nimages, or images to names, the image objects would remain alive just because\nthey appeared as values or keys in the dictionaries. The\nWeakKeyDictionary and WeakValueDictionary classes supplied by\nthe weakref module are an alternative, using weak references to construct\nmappings that don’t keep objects alive solely because they appear in the mapping\nobjects. If, for example, an image object is a value in a\nWeakValueDictionary, then when the last remaining references to that\nimage object are the weak references held by weak mappings, garbage collection\ncan reclaim the object, and its corresponding entries in weak mappings are\nsimply deleted.
\nWeakKeyDictionary and WeakValueDictionary use weak references\nin their implementation, setting up callback functions on the weak references\nthat notify the weak dictionaries when a key or value has been reclaimed by\ngarbage collection. Most programs should find that using one of these weak\ndictionary types is all they need – it’s not usually necessary to create your\nown weak references directly. The low-level machinery used by the weak\ndictionary implementations is exposed by the weakref module for the\nbenefit of advanced uses.
\nNote
\nWeak references to an object are cleared before the object’s __del__()\nis called, to ensure that the weak reference callback (if any) finds the\nobject still alive.
Not all objects can be weakly referenced; objects which can be weakly referenced include class instances, functions written in Python (but not in C), methods (both bound and unbound), sets, frozensets, file objects, generators, type objects, DBcursor objects from the bsddb module, sockets, arrays, deques, regular expression pattern objects, and code objects.
\n\nChanged in version 2.4: Added support for files, sockets, arrays, and patterns.
\n\nChanged in version 2.7: Added support for thread.lock, threading.Lock, and code objects.
\nSeveral built-in types such as list and dict do not directly\nsupport weak references but can add support through subclassing:
\nclass Dict(dict):\n pass\n\nobj = Dict(red=1, green=2, blue=3) # this object is weak referenceable\n
CPython implementation detail: Other built-in types such as tuple and long do not support\nweak references even when subclassed.
\nExtension types can easily be made to support weak references; see\nWeak Reference Support.
\nReturn a weak reference to object. The original object can be retrieved by\ncalling the reference object if the referent is still alive; if the referent is\nno longer alive, calling the reference object will cause None to be\nreturned. If callback is provided and not None, and the returned\nweakref object is still alive, the callback will be called when the object is\nabout to be finalized; the weak reference object will be passed as the only\nparameter to the callback; the referent will no longer be available.
\nIt is allowable for many weak references to be constructed for the same object.\nCallbacks registered for each weak reference will be called from the most\nrecently registered callback to the oldest registered callback.
\nExceptions raised by the callback will be noted on the standard error output,\nbut cannot be propagated; they are handled in exactly the same way as exceptions\nraised from an object’s __del__() method.
\nWeak references are hashable if the object is hashable. They will maintain\ntheir hash value even after the object was deleted. If hash() is called\nthe first time only after the object was deleted, the call will raise\nTypeError.
\nWeak references support tests for equality, but not ordering. If the referents\nare still alive, two references have the same equality relationship as their\nreferents (regardless of the callback). If either referent has been deleted,\nthe references are equal only if the reference objects are the same object.
\n\nChanged in version 2.4: This is now a subclassable type rather than a factory function; it derives from\nobject.
\nMapping class that references keys weakly. Entries in the dictionary will be\ndiscarded when there is no longer a strong reference to the key. This can be\nused to associate additional data with an object owned by other parts of an\napplication without adding attributes to those objects. This can be especially\nuseful with objects that override attribute accesses.
\nNote
\nCaution: Because a WeakKeyDictionary is built on top of a Python\ndictionary, it must not change size when iterating over it. This can be\ndifficult to ensure for a WeakKeyDictionary because actions\nperformed by the program during iteration may cause items in the\ndictionary to vanish “by magic” (as a side effect of garbage collection).
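The pattern of associating additional data with externally owned objects can be sketched as follows; the Window class is a stand-in for an object owned by another part of an application. The example relies on CPython's reference counting to reclaim the key promptly, so gc.collect() is called explicitly for other implementations:

```python
import gc
import weakref

class Window(object):
    """Stand-in for an object owned elsewhere in the application."""
    pass

extra = weakref.WeakKeyDictionary()
w = Window()
extra[w] = 'has toolbar'       # attach data without adding an attribute
annotation = extra[w]

del w                          # last strong reference to the key is gone...
gc.collect()
remaining = len(extra)         # ...and the entry vanished with it
```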
\nWeakKeyDictionary objects have the following additional methods. These\nexpose the internal references directly. The references are not guaranteed to\nbe “live” at the time they are used, so the result of calling the references\nneeds to be checked before being used. This can be used to avoid creating\nreferences that will cause the garbage collector to keep the keys around longer\nthan needed.
\nReturn an iterator that yields the weak references to the keys.
\n\nNew in version 2.5.
\nReturn a list of weak references to the keys.
\n\nNew in version 2.5.
\nMapping class that references values weakly. Entries in the dictionary will be\ndiscarded when no strong reference to the value exists any more.
\nNote
\nCaution: Because a WeakValueDictionary is built on top of a Python\ndictionary, it must not change size when iterating over it. This can be\ndifficult to ensure for a WeakValueDictionary because actions performed\nby the program during iteration may cause items in the dictionary to vanish “by\nmagic” (as a side effect of garbage collection).
WeakValueDictionary objects have the following additional methods. These methods have the same issues as the iterkeyrefs() and keyrefs() methods of WeakKeyDictionary objects.
\nReturn an iterator that yields the weak references to the values.
\n\nNew in version 2.5.
\nReturn a list of weak references to the values.
\n\nNew in version 2.5.
\nSet class that keeps weak references to its elements. An element will be\ndiscarded when no strong reference to it exists any more.
\n\nNew in version 2.7.
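A minimal sketch of WeakSet behavior; the Node class is illustrative, and gc.collect() is called explicitly since only CPython's reference counting reclaims the element immediately:

```python
import gc
import weakref

class Node(object):
    pass

alive = weakref.WeakSet()
n = Node()
alive.add(n)
was_member = n in alive    # True while a strong reference exists

del n                      # drop the last strong reference
gc.collect()
remaining = len(alive)     # the element has been discarded
```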
\nSee also
\nWeak reference objects have no attributes or methods, but do allow the referent\nto be obtained, if it still exists, by calling it:
\n>>> import weakref\n>>> class Object:\n... pass\n...\n>>> o = Object()\n>>> r = weakref.ref(o)\n>>> o2 = r()\n>>> o is o2\nTrue\n
If the referent no longer exists, calling the reference object returns\nNone:
\n>>> del o, o2\n>>> print r()\nNone\n
Testing that a weak reference object is still live should be done using the\nexpression ref() is not None. Normally, application code that needs to use\na reference object should follow this pattern:
\n# r is a weak reference object\no = r()\nif o is None:\n # referent has been garbage collected\n print "Object has been deallocated; can't frobnicate."\nelse:\n print "Object is still live!"\n o.do_something_useful()\n
Using a separate test for “liveness” creates race conditions in threaded\napplications; another thread can cause a weak reference to become invalidated\nbefore the weak reference is called; the idiom shown above is safe in threaded\napplications as well as single-threaded applications.
\nSpecialized versions of ref objects can be created through subclassing.\nThis is used in the implementation of the WeakValueDictionary to reduce\nthe memory overhead for each entry in the mapping. This may be most useful to\nassociate additional information with a reference, but could also be used to\ninsert additional processing on calls to retrieve the referent.
\nThis example shows how a subclass of ref can be used to store\nadditional information about an object and affect the value that’s returned when\nthe referent is accessed:
\nimport weakref\n\nclass ExtendedRef(weakref.ref):\n def __init__(self, ob, callback=None, **annotations):\n super(ExtendedRef, self).__init__(ob, callback)\n self.__counter = 0\n for k, v in annotations.iteritems():\n setattr(self, k, v)\n\n def __call__(self):\n """Return a pair containing the referent and the number of\n times the reference has been called.\n """\n ob = super(ExtendedRef, self).__call__()\n if ob is not None:\n self.__counter += 1\n ob = (ob, self.__counter)\n return ob\n
This simple example shows how an application can use object IDs to retrieve objects that it has seen before. The IDs of the objects can then be used in other data structures without forcing the objects to remain alive, but the objects can still be retrieved by ID if they are still alive.
\nimport weakref\n\n_id2obj_dict = weakref.WeakValueDictionary()\n\ndef remember(obj):\n oid = id(obj)\n _id2obj_dict[oid] = obj\n return oid\n\ndef id2obj(oid):\n return _id2obj_dict[oid]\n
This module provides generic (shallow and deep) copying operations.
\nInterface summary:
\nThe difference between shallow and deep copying is only relevant for compound\nobjects (objects that contain other objects, like lists or class instances):
\nTwo problems often exist with deep copy operations that don’t exist with shallow\ncopy operations:
\nThe deepcopy() function avoids these problems by:
\nThis module does not copy types like module, method, stack trace, stack frame,\nfile, socket, window, array, or any similar types. It does “copy” functions and\nclasses (shallow and deeply), by returning the original object unchanged; this\nis compatible with the way these are treated by the pickle module.
\nShallow copies of dictionaries can be made using dict.copy(), and\nof lists by assigning a slice of the entire list, for example,\ncopied_list = original_list[:].
\n\nChanged in version 2.5: Added copying functions.
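The shallow/deep distinction for compound objects can be sketched as follows: a shallow copy (copy.copy(), equivalent here to slicing) shares the nested lists with the original, while copy.deepcopy() copies them recursively:

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)       # same effect as original[:]
deep = copy.deepcopy(original)

original[0].append(99)              # mutate a nested object

shared = shallow[0]                 # [1, 2, 99]: the inner list is shared
independent = deep[0]               # [1, 2]: the inner list was copied
```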
\nClasses can use the same interfaces to control copying that they use to control\npickling. See the description of module pickle for information on these\nmethods. The copy module does not use the copy_reg registration\nmodule.
\nIn order for a class to define its own copy implementation, it can define\nspecial methods __copy__() and __deepcopy__(). The former is called\nto implement the shallow copy operation; no additional arguments are passed.\nThe latter is called to implement the deep copy operation; it is passed one\nargument, the memo dictionary. If the __deepcopy__() implementation needs\nto make a deep copy of a component, it should call the deepcopy() function\nwith the component as first argument and the memo dictionary as second argument.
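A sketch of a class defining both hooks; the Record class and its transient cache attribute are illustrative. Note how __deepcopy__() forwards the memo dictionary when deep-copying its component, as described above:

```python
import copy

class Record(object):
    def __init__(self, data):
        self.data = data
        self.cache = {}            # transient state, deliberately not copied

    def __copy__(self):
        # shallow copy: reuse the same data object
        return Record(self.data)

    def __deepcopy__(self, memo):
        # deep copy: copy components via deepcopy(), passing the memo dict
        return Record(copy.deepcopy(self.data, memo))

r = Record([1, [2, 3]])
r.cache['expensive'] = 'result'
d = copy.deepcopy(r)
```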
\nSee also
\nNote
\nThe repr module has been renamed to reprlib in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nSource code: Lib/repr.py
\nThe repr module provides a means for producing object representations\nwith limits on the size of the resulting strings. This is used in the Python\ndebugger and may be useful in other contexts as well.
\nThis module provides a class, an instance, and a function:
\nRepr instances provide several attributes which can be used to provide\nsize limits for the representations of different object types, and methods\nwhich format specific object types.
\nLimits on the number of entries represented for the named object type. The\ndefault is 4 for maxdict, 5 for maxarray, and 6 for\nthe others.
\n\nNew in version 2.4: maxset, maxfrozenset, and set.
\nThe use of dynamic dispatching by Repr.repr1() allows subclasses of\nRepr to add support for additional built-in object types or to modify\nthe handling of types already supported. This example shows how special support\nfor file objects could be added:
\nimport repr as reprlib\nimport sys\n\nclass MyRepr(reprlib.Repr):\n def repr_file(self, obj, level):\n if obj.name in ['<stdin>', '<stdout>', '<stderr>']:\n return obj.name\n else:\n return repr(obj)\n\naRepr = MyRepr()\nprint aRepr.repr(sys.stdin) # prints '<stdin>'\n
Source code: Lib/pprint.py
\nThe pprint module provides a capability to “pretty-print” arbitrary\nPython data structures in a form which can be used as input to the interpreter.\nIf the formatted structures include objects which are not fundamental Python\ntypes, the representation may not be loadable. This may be the case if objects\nsuch as files, sockets, classes, or instances are included, as well as many\nother built-in objects which are not representable as Python constants.
\nThe formatted representation keeps objects on a single line if it can, and\nbreaks them onto multiple lines if they don’t fit within the allowed width.\nConstruct PrettyPrinter objects explicitly if you need to adjust the\nwidth constraint.
\n\nChanged in version 2.5: Dictionaries are sorted by key before the display is computed; before 2.5, a\ndictionary was sorted only if its display required more than one line, although\nthat wasn’t documented.
\n\nChanged in version 2.6: Added support for set and frozenset.
\nThe pprint module defines one class:
\nConstruct a PrettyPrinter instance. This constructor understands\nseveral keyword parameters. An output stream may be set using the stream\nkeyword; the only method used on the stream object is the file protocol’s\nwrite() method. If not specified, the PrettyPrinter adopts\nsys.stdout. Three additional parameters may be used to control the\nformatted representation. The keywords are indent, depth, and width. The\namount of indentation added for each recursive level is specified by indent;\nthe default is one. Other values can cause output to look a little odd, but can\nmake nesting easier to spot. The number of levels which may be printed is\ncontrolled by depth; if the data structure being printed is too deep, the next\ncontained level is replaced by .... By default, there is no constraint on\nthe depth of the objects being formatted. The desired output width is\nconstrained using the width parameter; the default is 80 characters. If a\nstructure cannot be formatted within the constrained width, a best effort will\nbe made.
\n>>> import pprint\n>>> stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']\n>>> stuff.insert(0, stuff[:])\n>>> pp = pprint.PrettyPrinter(indent=4)\n>>> pp.pprint(stuff)\n[ ['spam', 'eggs', 'lumberjack', 'knights', 'ni'],\n 'spam',\n 'eggs',\n 'lumberjack',\n 'knights',\n 'ni']\n>>> tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',\n... ('parrot', ('fresh fruit',))))))))\n>>> pp = pprint.PrettyPrinter(depth=6)\n>>> pp.pprint(tup)\n('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead', (...)))))))\n
The PrettyPrinter class supports several derivative functions:
\nReturn the formatted representation of object as a string. indent, width\nand depth will be passed to the PrettyPrinter constructor as\nformatting parameters.
\n\nChanged in version 2.4: The parameters indent, width and depth were added.
\nPrints the formatted representation of object on stream, followed by a\nnewline. If stream is omitted, sys.stdout is used. This may be used in\nthe interactive interpreter instead of a print statement for\ninspecting values. indent, width and depth will be passed to the\nPrettyPrinter constructor as formatting parameters.
\n>>> import pprint\n>>> stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']\n>>> stuff.insert(0, stuff)\n>>> pprint.pprint(stuff)\n[<Recursion on list with id=...>,\n 'spam',\n 'eggs',\n 'lumberjack',\n 'knights',\n 'ni']\n
\nChanged in version 2.4: The parameters indent, width and depth were added.
\nDetermine if the formatted representation of object is “readable,” or can be\nused to reconstruct the value using eval(). This always returns False\nfor recursive objects.
\n>>> pprint.isreadable(stuff)\nFalse\n
One more support function is also defined:
\nReturn a string representation of object, protected against recursive data\nstructures. If the representation of object exposes a recursive entry, the\nrecursive reference will be represented as <Recursion on typename with\nid=number>. The representation is not otherwise formatted.
\n>>> pprint.saferepr(stuff)\n"[<Recursion on list with id=...>, 'spam', 'eggs', 'lumberjack', 'knights', 'ni']"\n
PrettyPrinter instances have the following methods:
\nThe following methods provide the implementations for the corresponding\nfunctions of the same names. Using these methods on an instance is slightly\nmore efficient since new PrettyPrinter objects don’t need to be\ncreated.
\nDetermine if the formatted representation of the object is “readable,” or can be\nused to reconstruct the value using eval(). Note that this returns\nFalse for recursive objects. If the depth parameter of the\nPrettyPrinter is set and the object is deeper than allowed, this\nreturns False.
\nThis method is provided as a hook to allow subclasses to modify the way objects\nare converted to strings. The default implementation uses the internals of the\nsaferepr() implementation.
\nReturns three values: the formatted version of object as a string, a flag\nindicating whether the result is readable, and a flag indicating whether\nrecursion was detected. The first argument is the object to be presented. The\nsecond is a dictionary which contains the id() of objects that are part of\nthe current presentation context (direct and indirect containers for object\nthat are affecting the presentation) as the keys; if an object needs to be\npresented which is already represented in context, the third return value\nshould be True. Recursive calls to the format() method should add\nadditional entries for containers to this dictionary. The third argument,\nmaxlevels, gives the requested limit to recursion; this will be 0 if there\nis no requested limit. This argument should be passed unmodified to recursive\ncalls. The fourth argument, level, gives the current level; recursive calls\nshould be passed a value less than that of the current call.
\n\nNew in version 2.3.
\nThis example demonstrates several uses of the pprint() function and its parameters.
\n>>> import pprint\n>>> tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',\n... ('parrot', ('fresh fruit',))))))))\n>>> stuff = ['a' * 10, tup, ['a' * 30, 'b' * 30], ['c' * 20, 'd' * 20]]\n>>> pprint.pprint(stuff)\n['aaaaaaaaaa',\n ('spam',\n ('eggs',\n ('lumberjack',\n ('knights', ('ni', ('dead', ('parrot', ('fresh fruit',)))))))),\n ['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'],\n ['cccccccccccccccccccc', 'dddddddddddddddddddd']]\n>>> pprint.pprint(stuff, depth=3)\n['aaaaaaaaaa',\n ('spam', ('eggs', (...))),\n ['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'],\n ['cccccccccccccccccccc', 'dddddddddddddddddddd']]\n>>> pprint.pprint(stuff, width=60)\n['aaaaaaaaaa',\n ('spam',\n ('eggs',\n ('lumberjack',\n ('knights',\n ('ni', ('dead', ('parrot', ('fresh fruit',)))))))),\n ['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa',\n 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'],\n ['cccccccccccccccccccc', 'dddddddddddddddddddd']]\n
\nNew in version 2.6.
\nThe numbers module (PEP 3141) defines a hierarchy of numeric\nabstract base classes which progressively define\nmore operations. None of the types defined in this module can be instantiated.
\nSubclasses of this type describe complex numbers and include the operations\nthat work on the built-in complex type. These are: conversions to\ncomplex and bool, real, imag, +,\n-, *, /, abs(), conjugate(), ==, and !=. All\nexcept - and != are abstract.
\nTo Complex, Real adds the operations that work on real\nnumbers.
\nIn short, those are: a conversion to float, math.trunc(),\nround(), math.floor(), math.ceil(), divmod(), //,\n
Real also provides defaults for complex(), real,\nimag, and conjugate().
\nImplementors should be careful to make equal numbers equal and hash\nthem to the same values. This may be subtle if there are two different\nextensions of the real numbers. For example, fractions.Fraction\nimplements hash() as follows:
\ndef __hash__(self):\n if self.denominator == 1:\n # Get integers right.\n return hash(self.numerator)\n # Expensive check, but definitely correct.\n if self == float(self):\n return hash(float(self))\n else:\n # Use tuple's hash to avoid a high collision rate on\n # simple fractions.\n return hash((self.numerator, self.denominator))\n
There are, of course, more possible ABCs for numbers, and this would\nbe a poor hierarchy if it precluded the possibility of adding\nthose. You can add MyFoo between Complex and\nReal with:
\nclass MyFoo(Complex): ...\nMyFoo.register(Real)\n
We want to implement the arithmetic operations so that mixed-mode\noperations either call an implementation whose author knew about the\ntypes of both arguments, or convert both to the nearest built in type\nand do the operation there. For subtypes of Integral, this\nmeans that __add__() and __radd__() should be defined as:
\nclass MyIntegral(Integral):\n\n def __add__(self, other):\n if isinstance(other, MyIntegral):\n return do_my_adding_stuff(self, other)\n elif isinstance(other, OtherTypeIKnowAbout):\n return do_my_other_adding_stuff(self, other)\n else:\n return NotImplemented\n\n def __radd__(self, other):\n if isinstance(other, MyIntegral):\n return do_my_adding_stuff(other, self)\n elif isinstance(other, OtherTypeIKnowAbout):\n return do_my_other_adding_stuff(other, self)\n elif isinstance(other, Integral):\n return int(other) + int(self)\n elif isinstance(other, Real):\n return float(other) + float(self)\n elif isinstance(other, Complex):\n return complex(other) + complex(self)\n else:\n return NotImplemented\n
There are 5 different cases for a mixed-type operation on subclasses\nof Complex. I’ll refer to all of the above code that doesn’t\nrefer to MyIntegral and OtherTypeIKnowAbout as\n“boilerplate”. a will be an instance of A, which is a subtype\nof Complex (a : A <: Complex), and b : B <:\nComplex. I’ll consider a + b:
\n\n\n\n
\n- If A defines an __add__() which accepts b, all is\nwell.
\n- If A falls back to the boilerplate code, and it were to\nreturn a value from __add__(), we’d miss the possibility\nthat B defines a more intelligent __radd__(), so the\nboilerplate should return NotImplemented from\n__add__(). (Or A may not implement __add__() at\nall.)
\n- Then B‘s __radd__() gets a chance. If it accepts\na, all is well.
\n- If it falls back to the boilerplate, there are no more possible\nmethods to try, so this is where the default implementation\nshould live.
\n- If B <: A, Python tries B.__radd__ before\nA.__add__. This is ok, because it was implemented with\nknowledge of A, so it can handle those instances before\ndelegating to Complex.
\n
If A <: Complex and B <: Real without sharing any other knowledge,
then the appropriate shared operation is the one involving the built-in
complex, and both __radd__()s land there, so a+b
== b+a.
\nBecause most of the operations on any given type will be very similar,\nit can be useful to define a helper function which generates the\nforward and reverse instances of any given operator. For example,\nfractions.Fraction uses:
\ndef _operator_fallbacks(monomorphic_operator, fallback_operator):\n def forward(a, b):\n if isinstance(b, (int, long, Fraction)):\n return monomorphic_operator(a, b)\n elif isinstance(b, float):\n return fallback_operator(float(a), b)\n elif isinstance(b, complex):\n return fallback_operator(complex(a), b)\n else:\n return NotImplemented\n forward.__name__ = '__' + fallback_operator.__name__ + '__'\n forward.__doc__ = monomorphic_operator.__doc__\n\n def reverse(b, a):\n if isinstance(a, Rational):\n # Includes ints.\n return monomorphic_operator(a, b)\n elif isinstance(a, numbers.Real):\n return fallback_operator(float(a), float(b))\n elif isinstance(a, numbers.Complex):\n return fallback_operator(complex(a), complex(b))\n else:\n return NotImplemented\n reverse.__name__ = '__r' + fallback_operator.__name__ + '__'\n reverse.__doc__ = monomorphic_operator.__doc__\n\n return forward, reverse\n\ndef _add(a, b):\n """a + b"""\n return Fraction(a.numerator * b.denominator +\n b.numerator * a.denominator,\n a.denominator * b.denominator)\n\n__add__, __radd__ = _operator_fallbacks(_add, operator.add)\n\n# ...\n
This module is always available. It provides access to the mathematical\nfunctions defined by the C standard.
\nThese functions cannot be used with complex numbers; use the functions of the\nsame name from the cmath module if you require support for complex\nnumbers. The distinction between functions which support complex numbers and\nthose which don’t is made since most users do not want to learn quite as much\nmathematics as required to understand complex numbers. Receiving an exception\ninstead of a complex result allows earlier detection of the unexpected complex\nnumber used as a parameter, so that the programmer can determine how and why it\nwas generated in the first place.
\nThe following functions are provided by this module. Except when explicitly\nnoted otherwise, all return values are floats.
\nReturn x with the sign of y. On a platform that supports\nsigned zeros, copysign(1.0, -0.0) returns -1.0.
\n\nNew in version 2.6.
\nReturn x factorial. Raises ValueError if x is not integral or\nis negative.
\n\nNew in version 2.6.
\nReturn an accurate floating point sum of values in the iterable. Avoids\nloss of precision by tracking multiple intermediate partial sums:
\n>>> sum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])\n0.9999999999999999\n>>> fsum([.1, .1, .1, .1, .1, .1, .1, .1, .1, .1])\n1.0\n
The algorithm’s accuracy depends on IEEE-754 arithmetic guarantees and the\ntypical case where the rounding mode is half-even. On some non-Windows\nbuilds, the underlying C library uses extended precision addition and may\noccasionally double-round an intermediate sum causing it to be off in its\nleast significant bit.
\nFor further discussion and two alternative approaches, see the ASPN cookbook\nrecipes for accurate floating point summation.
\n\nNew in version 2.6.
\nCheck if the float x is positive or negative infinity.
\n\nNew in version 2.6.
\nCheck if the float x is a NaN (not a number). For more information\non NaNs, see the IEEE 754 standards.
\n\nNew in version 2.6.
\nReturn the Real value x truncated to an Integral (usually\na long integer). Uses the __trunc__ method.
\n\nNew in version 2.6.
\nNote that frexp() and modf() have a different call/return pattern\nthan their C equivalents: they take a single argument and return a pair of\nvalues, rather than returning their second return value through an ‘output\nparameter’ (there is no such thing in Python).
\nFor the ceil(), floor(), and modf() functions, note that all\nfloating-point numbers of sufficiently large magnitude are exact integers.\nPython floats typically carry no more than 53 bits of precision (the same as the\nplatform C double type), in which case any float x with abs(x) >= 2**52\nnecessarily has no fractional bits.
\nReturn e**x - 1. For small floats x, the subtraction in\nexp(x) - 1 can result in a significant loss of precision; the\nexpm1() function provides a way to compute this quantity to\nfull precision:
\n>>> from math import exp, expm1\n>>> exp(1e-5) - 1 # gives result accurate to 11 places\n1.0000050000069649e-05\n>>> expm1(1e-5) # result accurate to full precision\n1.0000050000166668e-05\n
\nNew in version 2.7.
\nWith one argument, return the natural logarithm of x (to base e).
\nWith two arguments, return the logarithm of x to the given base,\ncalculated as log(x)/log(base).
\n\nChanged in version 2.3: base argument added.
\nReturn the natural logarithm of 1+x (base e). The\nresult is calculated in a way which is accurate for x near zero.
\n\nNew in version 2.6.
\nReturn x raised to the power y. Exceptional cases follow\nAnnex ‘F’ of the C99 standard as far as possible. In particular,\npow(1.0, x) and pow(x, 0.0) always return 1.0, even\nwhen x is a zero or a NaN. If both x and y are finite,\nx is negative, and y is not an integer then pow(x, y)\nis undefined, and raises ValueError.
\n\nChanged in version 2.6: The outcome of 1**nan and nan**0 was undefined.
\nReturn the inverse hyperbolic cosine of x.
\n\nNew in version 2.6.
\nReturn the inverse hyperbolic sine of x.
\n\nNew in version 2.6.
\nReturn the inverse hyperbolic tangent of x.
\n\nNew in version 2.6.
\nReturn the error function at x.
\n\nNew in version 2.7.
\nReturn the complementary error function at x.
\n\nNew in version 2.7.
\nReturn the Gamma function at x.
\n\nNew in version 2.7.
\nReturn the natural logarithm of the absolute value of the Gamma\nfunction at x.
\n\nNew in version 2.7.
\nCPython implementation detail: The math module consists mostly of thin wrappers around the platform C\nmath library functions. Behavior in exceptional cases follows Annex F of\nthe C99 standard where appropriate. The current implementation will raise\nValueError for invalid operations like sqrt(-1.0) or log(0.0)\n(where C99 Annex F recommends signaling invalid operation or divide-by-zero),\nand OverflowError for results that overflow (for example,\nexp(1000.0)). A NaN will not be returned from any of the functions\nabove unless one or more of the input arguments was a NaN; in that case,\nmost functions will return a NaN, but (again following C99 Annex F) there\nare some exceptions to this rule, for example pow(float('nan'), 0.0) or\nhypot(float('nan'), float('inf')).
\nNote that Python makes no effort to distinguish signaling NaNs from\nquiet NaNs, and behavior for signaling NaNs remains unspecified.\nTypical behavior is to treat all NaNs as though they were quiet.
\n\nChanged in version 2.6: Behavior in special cases now aims to follow C99 Annex F. In earlier\nversions of Python the behavior in special cases was loosely specified.
\nSee also
\n\nNew in version 2.6.
\nSource code: Lib/fractions.py
\nThe fractions module provides support for rational number arithmetic.
\nA Fraction instance can be constructed from a pair of integers, from\nanother rational number, or from a string.
\nThe first version requires that numerator and denominator are instances\nof numbers.Rational and returns a new Fraction instance\nwith value numerator/denominator. If denominator is 0, it\nraises a ZeroDivisionError. The second version requires that\nother_fraction is an instance of numbers.Rational and returns a\nFraction instance with the same value. The next two versions accept\neither a float or a decimal.Decimal instance, and return a\nFraction instance with exactly the same value. Note that due to the\nusual issues with binary floating-point (see Floating Point Arithmetic: Issues and Limitations), the\nargument to Fraction(1.1) is not exactly equal to 11/10, and so\nFraction(1.1) does not return Fraction(11, 10) as one might expect.\n(But see the documentation for the limit_denominator() method below.)\nThe last version of the constructor expects a string or unicode instance.\nThe usual form for this instance is:
\n[sign] numerator ['/' denominator]
\nwhere the optional sign may be either ‘+’ or ‘-‘ and\nnumerator and denominator (if present) are strings of\ndecimal digits. In addition, any string that represents a finite\nvalue and is accepted by the float constructor is also\naccepted by the Fraction constructor. In either form the\ninput string may also have leading and/or trailing whitespace.\nHere are some examples:
\n>>> from fractions import Fraction\n>>> Fraction(16, -10)\nFraction(-8, 5)\n>>> Fraction(123)\nFraction(123, 1)\n>>> Fraction()\nFraction(0, 1)\n>>> Fraction('3/7')\nFraction(3, 7)\n[40794 refs]\n>>> Fraction(' -3/7 ')\nFraction(-3, 7)\n>>> Fraction('1.414213 \\t\\n')\nFraction(1414213, 1000000)\n>>> Fraction('-.125')\nFraction(-1, 8)\n>>> Fraction('7e-6')\nFraction(7, 1000000)\n>>> Fraction(2.25)\nFraction(9, 4)\n>>> Fraction(1.1)\nFraction(2476979795053773, 2251799813685248)\n>>> from decimal import Decimal\n>>> Fraction(Decimal('1.1'))\nFraction(11, 10)\n
The Fraction class inherits from the abstract base class\nnumbers.Rational, and implements all of the methods and\noperations from that class. Fraction instances are hashable,\nand should be treated as immutable. In addition,\nFraction has the following methods:
\n\nChanged in version 2.7: The Fraction constructor now accepts float and\ndecimal.Decimal instances.
\nThis class method constructs a Fraction representing the exact\nvalue of flt, which must be a float. Beware that\nFraction.from_float(0.3) is not the same value as Fraction(3, 10)
\n\nThis class method constructs a Fraction representing the exact\nvalue of dec, which must be a decimal.Decimal.
\nNote
\nFrom Python 2.7 onwards, you can also construct a\nFraction instance directly from a decimal.Decimal\ninstance.
\nFinds and returns the closest Fraction to self that has\ndenominator at most max_denominator. This method is useful for finding\nrational approximations to a given floating-point number:
\n>>> from fractions import Fraction\n>>> Fraction('3.1415926535897932').limit_denominator(1000)\nFraction(355, 113)\n
or for recovering a rational number that’s represented as a float:
\n>>> from math import pi, cos\n>>> Fraction(cos(pi/3))\nFraction(4503599627370497, 9007199254740992)\n>>> Fraction(cos(pi/3)).limit_denominator()\nFraction(1, 2)\n>>> Fraction(1.1).limit_denominator()\nFraction(11, 10)\n
See also
\nThis module is always available. It provides access to mathematical functions\nfor complex numbers. The functions in this module accept integers,\nfloating-point numbers or complex numbers as arguments. They will also accept\nany Python object that has either a __complex__() or a __float__()\nmethod: these methods are used to convert the object to a complex or\nfloating-point number, respectively, and the function is then applied to the\nresult of the conversion.
\nNote
\nOn platforms with hardware and system-level support for signed\nzeros, functions involving branch cuts are continuous on both\nsides of the branch cut: the sign of the zero distinguishes one\nside of the branch cut from the other. On platforms that do not\nsupport signed zeros the continuity is as specified below.
\nA Python complex number z is stored internally using rectangular\nor Cartesian coordinates. It is completely determined by its real\npart z.real and its imaginary part z.imag. In other\nwords:
\nz == z.real + z.imag*1j\n
Polar coordinates give an alternative way to represent a complex\nnumber. In polar coordinates, a complex number z is defined by the\nmodulus r and the phase angle phi. The modulus r is the distance\nfrom z to the origin, while the phase phi is the counterclockwise\nangle, measured in radians, from the positive x-axis to the line\nsegment that joins the origin to z.
\nThe following functions can be used to convert from the native\nrectangular coordinates to polar coordinates and back.
\nReturn the phase of x (also known as the argument of x), as a\nfloat. phase(x) is equivalent to math.atan2(x.imag,\nx.real). The result lies in the range [-π, π], and the branch\ncut for this operation lies along the negative real axis,\ncontinuous from above. On systems with support for signed zeros\n(which includes most systems in current use), this means that the\nsign of the result is the same as the sign of x.imag, even when\nx.imag is zero:
\n>>> phase(complex(-1.0, 0.0))\n3.1415926535897931\n>>> phase(complex(-1.0, -0.0))\n-3.1415926535897931\n
\nNew in version 2.6.
\nNote
\nThe modulus (absolute value) of a complex number x can be\ncomputed using the built-in abs() function. There is no\nseparate cmath module function for this operation.
\nReturn the representation of x in polar coordinates. Returns a\npair (r, phi) where r is the modulus of x and phi is the\nphase of x. polar(x) is equivalent to (abs(x),\nphase(x)).
\n\nNew in version 2.6.
\nReturn the complex number x with polar coordinates r and phi.\nEquivalent to r * (math.cos(phi) + math.sin(phi)*1j).
\n\nNew in version 2.6.
\nReturns the logarithm of x to the given base. If the base is not\nspecified, returns the natural logarithm of x. There is one branch cut, from 0\nalong the negative real axis to -∞, continuous from above.
\n\nChanged in version 2.4: base argument added.
\nReturn the arc tangent of x. There are two branch cuts: One extends from\n1j along the imaginary axis to ∞j, continuous from the right. The\nother extends from -1j along the imaginary axis to -∞j, continuous\nfrom the left.
\n\nChanged in version 2.6: direction of continuity of upper cut reversed
\nReturn the hyperbolic arc sine of x. There are two branch cuts:\nOne extends from 1j along the imaginary axis to ∞j,\ncontinuous from the right. The other extends from -1j along\nthe imaginary axis to -∞j, continuous from the left.
\n\nChanged in version 2.6: branch cuts moved to match those recommended by the C99 standard
\nReturn the hyperbolic arc tangent of x. There are two branch cuts: One\nextends from 1 along the real axis to ∞, continuous from below. The\nother extends from -1 along the real axis to -∞, continuous from\nabove.
\n\nChanged in version 2.6: direction of continuity of right cut reversed
\nReturn True if the real or the imaginary part of x is positive\nor negative infinity.
\n\nNew in version 2.6.
\nReturn True if the real or imaginary part of x is not a number (NaN).
\n\nNew in version 2.6.
\nNote that the selection of functions is similar, but not identical, to that in\nmodule math. The reason for having two modules is that some users aren’t\ninterested in complex numbers, and perhaps don’t even know what they are. They\nwould rather have math.sqrt(-1) raise an exception than return a complex\nnumber. Also note that the functions defined in cmath always return a\ncomplex number, even if the answer can be expressed as a real number (in which\ncase the complex number has an imaginary part of zero).
\nA note on branch cuts: They are curves along which the given function fails to\nbe continuous. They are a necessary feature of many complex functions. It is\nassumed that if you need to compute with complex functions, you will understand\nabout branch cuts. Consult almost any (not too elementary) book on complex\nvariables for enlightenment. For information of the proper choice of branch\ncuts for numerical purposes, a good reference should be the following:
\nSee also
\nKahan, W: Branch cuts for complex elementary functions; or, Much ado about\nnothing’s sign bit. In Iserles, A., and Powell, M. (eds.), The state of the art\nin numerical analysis. Clarendon Press (1987) pp165-211.
\n\nNew in version 2.5.
\nSource code: Lib/functools.py
\nThe functools module is for higher-order functions: functions that act on\nor return other functions. In general, any callable object can be treated as a\nfunction for the purposes of this module.
\nThe functools module defines the following functions:
\nTransform an old-style comparison function to a key-function. Used with\ntools that accept key functions (such as sorted(), min(),\nmax(), heapq.nlargest(), heapq.nsmallest(),\nitertools.groupby()). This function is primarily used as a transition\ntool for programs being converted to Py3.x where comparison functions are no\nlonger supported.
\nA compare function is any callable that accept two arguments, compares them,\nand returns a negative number for less-than, zero for equality, or a positive\nnumber for greater-than. A key function is a callable that accepts one\nargument and returns another value that indicates the position in the desired\ncollation sequence.
\nExample:
\nsorted(iterable, key=cmp_to_key(locale.strcoll)) # locale-aware sort order\n
\nNew in version 2.7.
\nGiven a class defining one or more rich comparison ordering methods, this\nclass decorator supplies the rest. This simplifies the effort involved\nin specifying all of the possible rich comparison operations:
\nThe class must define one of __lt__(), __le__(),\n__gt__(), or __ge__().\nIn addition, the class should supply an __eq__() method.
\nFor example:
\n@total_ordering\nclass Student:\n def __eq__(self, other):\n return ((self.lastname.lower(), self.firstname.lower()) ==\n (other.lastname.lower(), other.firstname.lower()))\n def __lt__(self, other):\n return ((self.lastname.lower(), self.firstname.lower()) <\n (other.lastname.lower(), other.firstname.lower()))\n
\nNew in version 2.7.
\nThis is the same function as reduce(). It is made available in this module\nto allow writing code more forward-compatible with Python 3.
\n\nNew in version 2.6.
\nReturn a new partial object which when called will behave like func\ncalled with the positional arguments args and keyword arguments keywords. If\nmore arguments are supplied to the call, they are appended to args. If\nadditional keyword arguments are supplied, they extend and override keywords.\nRoughly equivalent to:
\ndef partial(func, *args, **keywords):\n def newfunc(*fargs, **fkeywords):\n newkeywords = keywords.copy()\n newkeywords.update(fkeywords)\n return func(*(args + fargs), **newkeywords)\n newfunc.func = func\n newfunc.args = args\n newfunc.keywords = keywords\n return newfunc\n
partial() is used for partial function application, which “freezes”
some portion of a function’s arguments and/or keywords, resulting in a new object
with a simplified signature. For example, partial() can be used to create
a callable that behaves like the int() function where the base argument
defaults to two:
\n>>> from functools import partial\n>>> basetwo = partial(int, base=2)\n>>> basetwo.__doc__ = 'Convert base 2 string to an int.'\n>>> basetwo('10010')\n18\n
Update a wrapper function to look like the wrapped function. The optional\narguments are tuples to specify which attributes of the original function are\nassigned directly to the matching attributes on the wrapper function and which\nattributes of the wrapper function are updated with the corresponding attributes\nfrom the original function. The default values for these arguments are the\nmodule level constants WRAPPER_ASSIGNMENTS (which assigns to the wrapper\nfunction’s __name__, __module__ and __doc__, the documentation string) and\nWRAPPER_UPDATES (which updates the wrapper function’s __dict__, i.e. the\ninstance dictionary).
\nThe main intended use for this function is in decorator functions which\nwrap the decorated function and return the wrapper. If the wrapper function is\nnot updated, the metadata of the returned function will reflect the wrapper\ndefinition rather than the original function definition, which is typically less\nthan helpful.
\nThis is a convenience function for invoking partial(update_wrapper,\nwrapped=wrapped, assigned=assigned, updated=updated) as a function decorator\nwhen defining a wrapper function. For example:
\n>>> from functools import wraps\n>>> def my_decorator(f):\n... @wraps(f)\n... def wrapper(*args, **kwds):\n... print 'Calling decorated function'\n... return f(*args, **kwds)\n... return wrapper\n...\n>>> @my_decorator\n... def example():\n... """Docstring"""\n... print 'Called example function'\n...\n>>> example()\nCalling decorated function\nCalled example function\n>>> example.__name__\n'example'\n>>> example.__doc__\n'Docstring'\n
Without the use of this decorator factory, the name of the example function\nwould have been 'wrapper', and the docstring of the original example()\nwould have been lost.
\npartial objects are callable objects created by partial(). They\nhave three read-only attributes:
\npartial objects are like function objects in that they are\ncallable, weak referencable, and can have attributes. There are some important\ndifferences. For instance, the __name__ and __doc__ attributes\nare not created automatically. Also, partial objects defined in\nclasses behave like static methods and do not transform into bound methods\nduring instance attribute look-up.
\n\nNew in version 2.3.
\nThis module implements a number of iterator building blocks inspired\nby constructs from APL, Haskell, and SML. Each has been recast in a form\nsuitable for Python.
\nThe module standardizes a core set of fast, memory efficient tools that are\nuseful by themselves or in combination. Together, they form an “iterator\nalgebra” making it possible to construct specialized tools succinctly and\nefficiently in pure Python.
\nFor instance, SML provides a tabulation tool: tabulate(f) which produces a\nsequence f(0), f(1), .... The same effect can be achieved in Python\nby combining imap() and count() to form imap(f, count()).
\nThese tools and their built-in counterparts also work well with the high-speed\nfunctions in the operator module. For example, the multiplication\noperator can be mapped across two vectors to form an efficient dot-product:\nsum(imap(operator.mul, vector1, vector2)).
\nInfinite Iterators:
\nIterator | \nArguments | \nResults | \nExample | \n
---|---|---|---|
count() | \nstart, [step] | \nstart, start+step, start+2*step, ... | \ncount(10) --> 10 11 12 13 14 ... | \n
cycle() | \np | \np0, p1, ... plast, p0, p1, ... | \ncycle('ABCD') --> A B C D A B C D ... | \n
repeat() | \nelem [,n] | \nelem, elem, elem, ... endlessly or up to n times | \nrepeat(10, 3) --> 10 10 10 | \n
Iterators terminating on the shortest input sequence:
\nIterator | \nArguments | \nResults | \nExample | \n
---|---|---|---|
chain() | \np, q, ... | \np0, p1, ... plast, q0, q1, ... | \nchain('ABC', 'DEF') --> A B C D E F | \n
compress() | \ndata, selectors | \n(d[0] if s[0]), (d[1] if s[1]), ... | \ncompress('ABCDEF', [1,0,1,0,1,1]) --> A C E F | \n
dropwhile() | \npred, seq | \nseq[n], seq[n+1], starting when pred fails | \ndropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1 | \n
groupby() | \niterable[, keyfunc] | \nsub-iterators grouped by value of keyfunc(v) | \n\n |
ifilter() | \npred, seq | \nelements of seq where pred(elem) is True | \nifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9 | \n
ifilterfalse() | \npred, seq | \nelements of seq where pred(elem) is False | \nifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8 | \n
islice() | \nseq, [start,] stop [, step] | \nelements from seq[start:stop:step] | \nislice('ABCDEFG', 2, None) --> C D E F G | \n
imap() | \nfunc, p, q, ... | \nfunc(p0, q0), func(p1, q1), ... | \nimap(pow, (2,3,10), (5,2,3)) --> 32 9 1000 | \n
starmap() | \nfunc, seq | \nfunc(*seq[0]), func(*seq[1]), ... | \nstarmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000 | \n
tee() | 
it, n | 
it1, it2, ..., itn splits one iterator into n | 
 |
takewhile() | \npred, seq | \nseq[0], seq[1], until pred fails | \ntakewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4 | \n
izip() | \np, q, ... | \n(p[0], q[0]), (p[1], q[1]), ... | \nizip('ABCD', 'xy') --> Ax By | \n
izip_longest() | \np, q, ... | \n(p[0], q[0]), (p[1], q[1]), ... | \nizip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D- | \n
Combinatoric generators:
\nIterator | \nArguments | \nResults | \n
---|---|---|
product() | \np, q, ... [repeat=1] | \ncartesian product, equivalent to a nested for-loop | \n
permutations() | \np[, r] | \nr-length tuples, all possible orderings, no repeated elements | \n
combinations() | \np, r | \nr-length tuples, in sorted order, no repeated elements | \n
combinations_with_replacement() | \np, r | \nr-length tuples, in sorted order, with repeated elements | \n
product('ABCD', repeat=2) | \n\n | AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD | \n
permutations('ABCD', 2) | \n\n | AB AC AD BA BC BD CA CB CD DA DB DC | \n
combinations('ABCD', 2) | \n\n | AB AC AD BC BD CD | \n
combinations_with_replacement('ABCD', 2) | \n\n | AA AB AC AD BB BC BD CC CD DD | \n
The following module functions all construct and return iterators. Some provide\nstreams of infinite length, so they should only be accessed by functions or\nloops that truncate the stream.
\nMake an iterator that returns elements from the first iterable until it is\nexhausted, then proceeds to the next iterable, until all of the iterables are\nexhausted. Used for treating consecutive sequences as a single sequence.\nEquivalent to:
def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
Alternate constructor for chain(). Gets chained inputs from a\nsingle iterable argument that is evaluated lazily. Equivalent to:
@classmethod
def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
\nNew in version 2.6.
\nReturn r length subsequences of elements from the input iterable.
\nCombinations are emitted in lexicographic sort order. So, if the\ninput iterable is sorted, the combination tuples will be produced\nin sorted order.
\nElements are treated as unique based on their position, not on their\nvalue. So if the input elements are unique, there will be no repeat\nvalues in each combination.
\nEquivalent to:
def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = range(r)
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
The code for combinations() can be also expressed as a subsequence\nof permutations() after filtering entries where the elements are not\nin sorted order (according to their position in the input pool):
def combinations(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
    for indices in permutations(range(n), r):
        if sorted(indices) == list(indices):
            yield tuple(pool[i] for i in indices)
The number of items returned is n! / r! / (n-r)! when 0 <= r <= n\nor zero when r > n.
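The count formula can be checked directly against the iterator; a quick sketch using `math.factorial` (not part of the module itself):

```python
from itertools import combinations
from math import factorial

def n_choose_r(n, r):
    # n! / r! / (n-r)! as stated above, zero when r > n
    return factorial(n) // factorial(r) // factorial(n - r) if r <= n else 0

assert len(list(combinations('ABCD', 2))) == n_choose_r(4, 2)   # 6 tuples
assert list(combinations('ABCD', 5)) == []                      # r > n yields nothing
```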
\n\nNew in version 2.6.
\nReturn r length subsequences of elements from the input iterable\nallowing individual elements to be repeated more than once.
\nCombinations are emitted in lexicographic sort order. So, if the\ninput iterable is sorted, the combination tuples will be produced\nin sorted order.
\nElements are treated as unique based on their position, not on their\nvalue. So if the input elements are unique, the generated combinations\nwill also be unique.
\nEquivalent to:
def combinations_with_replacement(iterable, r):
    # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)
The code for combinations_with_replacement() can be also expressed as\na subsequence of product() after filtering entries where the elements\nare not in sorted order (according to their position in the input pool):
def combinations_with_replacement(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
    for indices in product(range(n), repeat=r):
        if sorted(indices) == list(indices):
            yield tuple(pool[i] for i in indices)
The number of items returned is (n+r-1)! / r! / (n-1)! when n > 0.
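Again, the stated multiset count can be verified in a short sketch (not part of the module):

```python
from itertools import combinations_with_replacement
from math import factorial

def multiset_count(n, r):
    # (n+r-1)! / r! / (n-1)!, valid for n > 0 as stated above
    return factorial(n + r - 1) // factorial(r) // factorial(n - 1)

assert len(list(combinations_with_replacement('ABCD', 2))) == multiset_count(4, 2)  # 10
```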
\n\nNew in version 2.7.
\nMake an iterator that filters elements from data returning only those that\nhave a corresponding element in selectors that evaluates to True.\nStops when either the data or selectors iterables has been exhausted.\nEquivalent to:
def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    return (d for d, s in izip(data, selectors) if s)
\nNew in version 2.7.
\nMake an iterator that returns evenly spaced values starting with n. Often\nused as an argument to imap() to generate consecutive data points.\nAlso, used with izip() to add sequence numbers. Equivalent to:
def count(start=0, step=1):
    # count(10) --> 10 11 12 13 14 ...
    # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step
When counting with floating point numbers, better accuracy can sometimes be\nachieved by substituting multiplicative code such as: (start + step * i\nfor i in count()).
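The difference shows up once the error accumulates; a sketch comparing the two forms (the specific start and step values are illustrative):

```python
from itertools import count, islice

# Repeated addition inside count() accumulates binary floating-point error,
# while the multiplicative form performs a single rounding per term.
additive = list(islice(count(0, 0.1), 11))[10]               # ten accumulated additions
multiplicative = list(islice((0 + 0.1 * i for i in count()), 11))[10]

assert multiplicative == 1.0
assert additive != multiplicative   # 0.9999999999999999 vs 1.0
```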
\n\nChanged in version 2.7: added step argument and allowed non-integer arguments.
\nMake an iterator returning elements from the iterable and saving a copy of each.\nWhen the iterable is exhausted, return elements from the saved copy. Repeats\nindefinitely. Equivalent to:
def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
Note, this member of the toolkit may require significant auxiliary storage\n(depending on the length of the iterable).
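Since cycle() never terminates on its own, it is usually bounded with islice(); for example:

```python
from itertools import cycle, islice

# Take the first seven elements of an endless repetition of 'ABC'.
assert ''.join(islice(cycle('ABC'), 7)) == 'ABCABCA'
```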
\nMake an iterator that drops elements from the iterable as long as the predicate\nis true; afterwards, returns every element. Note, the iterator does not produce\nany output until the predicate first becomes false, so it may have a lengthy\nstart-up time. Equivalent to:
def dropwhile(predicate, iterable):
    # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            yield x
            break
    for x in iterable:
        yield x
Make an iterator that returns consecutive keys and groups from the iterable.\nThe key is a function computing a key value for each element. If not\nspecified or is None, key defaults to an identity function and returns\nthe element unchanged. Generally, the iterable needs to already be sorted on\nthe same key function.
\nThe operation of groupby() is similar to the uniq filter in Unix. It\ngenerates a break or new group every time the value of the key function changes\n(which is why it is usually necessary to have sorted the data using the same key\nfunction). That behavior differs from SQL’s GROUP BY which aggregates common\nelements regardless of their input order.
\nThe returned group is itself an iterator that shares the underlying iterable\nwith groupby(). Because the source is shared, when the groupby()\nobject is advanced, the previous group is no longer visible. So, if that data\nis needed later, it should be stored as a list:
groups = []
uniquekeys = []
data = sorted(data, key=keyfunc)
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)
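A self-contained sketch of the same pattern, grouping hypothetical words by first letter (the data and key function are illustrative):

```python
from itertools import groupby

words = ['apple', 'ant', 'bee', 'bat', 'cat']
data = sorted(words, key=lambda w: w[0])    # groupby needs sorted input

groups = {}
for k, g in groupby(data, key=lambda w: w[0]):
    groups[k] = list(g)     # materialize before the next group is fetched

assert groups == {'a': ['apple', 'ant'], 'b': ['bee', 'bat'], 'c': ['cat']}
```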
groupby() is equivalent to:
class groupby(object):
    # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
    # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
    def __init__(self, iterable, key=None):
        if key is None:
            key = lambda x: x
        self.keyfunc = key
        self.it = iter(iterable)
        self.tgtkey = self.currkey = self.currvalue = object()
    def __iter__(self):
        return self
    def next(self):
        while self.currkey == self.tgtkey:
            self.currvalue = next(self.it)    # Exit on StopIteration
            self.currkey = self.keyfunc(self.currvalue)
        self.tgtkey = self.currkey
        return (self.currkey, self._grouper(self.tgtkey))
    def _grouper(self, tgtkey):
        while self.currkey == tgtkey:
            yield self.currvalue
            self.currvalue = next(self.it)    # Exit on StopIteration
            self.currkey = self.keyfunc(self.currvalue)
\nNew in version 2.4.
\nMake an iterator that filters elements from iterable returning only those for\nwhich the predicate is True. If predicate is None, return the items\nthat are true. Equivalent to:
def ifilter(predicate, iterable):
    # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
    if predicate is None:
        predicate = bool
    for x in iterable:
        if predicate(x):
            yield x
Make an iterator that filters elements from iterable returning only those for\nwhich the predicate is False. If predicate is None, return the items\nthat are false. Equivalent to:
def ifilterfalse(predicate, iterable):
    # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x
Make an iterator that computes the function using arguments from each of the\niterables. If function is set to None, then imap() returns the\narguments as a tuple. Like map() but stops when the shortest iterable is\nexhausted instead of filling in None for shorter iterables. The reason for\nthe difference is that infinite iterator arguments are typically an error for\nmap() (because the output is fully evaluated) but represent a common and\nuseful way of supplying arguments to imap(). Equivalent to:
def imap(function, *iterables):
    # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
    iterables = map(iter, iterables)
    while True:
        args = [next(it) for it in iterables]
        if function is None:
            yield tuple(args)
        else:
            yield function(*args)
Make an iterator that returns selected elements from the iterable. If start is\nnon-zero, then elements from the iterable are skipped until start is reached.\nAfterward, elements are returned consecutively unless step is set higher than\none which results in items being skipped. If stop is None, then iteration\ncontinues until the iterator is exhausted, if at all; otherwise, it stops at the\nspecified position. Unlike regular slicing, islice() does not support\nnegative values for start, stop, or step. Can be used to extract related\nfields from data where the internal structure has been flattened (for example, a\nmulti-line report may list a name field on every third line). Equivalent to:
def islice(iterable, *args):
    # islice('ABCDEFG', 2) --> A B
    # islice('ABCDEFG', 2, 4) --> C D
    # islice('ABCDEFG', 2, None) --> C D E F G
    # islice('ABCDEFG', 0, None, 2) --> A C E G
    s = slice(*args)
    it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
    nexti = next(it)
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            nexti = next(it)
If start is None, then iteration starts at zero. If step is None,\nthen the step defaults to one.
\n\nChanged in version 2.5: accept None values for default start and step.
\nMake an iterator that aggregates elements from each of the iterables. Like\nzip() except that it returns an iterator instead of a list. Used for\nlock-step iteration over several iterables at a time. Equivalent to:
def izip(*iterables):
    # izip('ABCD', 'xy') --> Ax By
    iterators = map(iter, iterables)
    while iterators:
        yield tuple(map(next, iterators))
\nChanged in version 2.4: When no iterables are specified, returns a zero length iterator instead of\nraising a TypeError exception.
\nThe left-to-right evaluation order of the iterables is guaranteed. This\nmakes possible an idiom for clustering a data series into n-length groups\nusing izip(*[iter(s)]*n).
\nizip() should only be used with unequal length inputs when you don’t\ncare about trailing, unmatched values from the longer iterables. If those\nvalues are important, use izip_longest() instead.
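The clustering idiom also works with the built-in zip() (used here so the sketch is self-contained; with izip() the result is an iterator rather than a list):

```python
s = 'ABCDEFGH'
n = 2
# One iterator is shared n times, so successive calls consume it in lock step.
pairs = list(zip(*[iter(s)] * n))
assert pairs == [('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H')]
```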
\nMake an iterator that aggregates elements from each of the iterables. If the\niterables are of uneven length, missing values are filled-in with fillvalue.\nIteration continues until the longest iterable is exhausted. Equivalent to:
class ZipExhausted(Exception):
    pass

def izip_longest(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    counter = [len(args) - 1]
    def sentinel():
        if not counter[0]:
            raise ZipExhausted
        counter[0] -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            yield tuple(map(next, iterators))
    except ZipExhausted:
        pass
If one of the iterables is potentially infinite, then the\nizip_longest() function should be wrapped with something that limits\nthe number of calls (for example islice() or takewhile()). If\nnot specified, fillvalue defaults to None.
\n\nNew in version 2.6.
\nReturn successive r length permutations of elements in the iterable.
\nIf r is not specified or is None, then r defaults to the length\nof the iterable and all possible full-length permutations\nare generated.
\nPermutations are emitted in lexicographic sort order. So, if the\ninput iterable is sorted, the permutation tuples will be produced\nin sorted order.
\nElements are treated as unique based on their position, not on their\nvalue. So if the input elements are unique, there will be no repeat\nvalues in each permutation.
\nEquivalent to:
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
The code for permutations() can be also expressed as a subsequence of\nproduct(), filtered to exclude entries with repeated elements (those\nfrom the same position in the input pool):
def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)
The number of items returned is n! / (n-r)! when 0 <= r <= n\nor zero when r > n.
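As with combinations(), the count formula is easy to confirm in a sketch:

```python
from itertools import permutations
from math import factorial

# n! / (n-r)! with n=4, r=2 gives 12 ordered pairs.
assert len(list(permutations('ABCD', 2))) == factorial(4) // factorial(4 - 2)
assert list(permutations('ABC', 4)) == []   # r > n yields nothing
```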
\n\nNew in version 2.6.
\nCartesian product of input iterables.
\nEquivalent to nested for-loops in a generator expression. For example,\nproduct(A, B) returns the same as ((x,y) for x in A for y in B).
\nThe nested loops cycle like an odometer with the rightmost element advancing\non every iteration. This pattern creates a lexicographic ordering so that if\nthe input’s iterables are sorted, the product tuples are emitted in sorted\norder.
\nTo compute the product of an iterable with itself, specify the number of\nrepetitions with the optional repeat keyword argument. For example,\nproduct(A, repeat=4) means the same as product(A, A, A, A).
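Both properties can be seen directly:

```python
from itertools import product

# repeat=2 is the same as passing the iterable twice.
assert list(product('AB', repeat=2)) == list(product('AB', 'AB'))

# Odometer order: the first tuple is all "zeros", the rightmost slot spins fastest.
assert list(product(range(2), repeat=3))[:2] == [(0, 0, 0), (0, 0, 1)]
```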
\nThis function is equivalent to the following code, except that the\nactual implementation does not build up intermediate results in memory:
def product(*args, **kwds):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = map(tuple, args) * kwds.get('repeat', 1)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
\nNew in version 2.6.
\nMake an iterator that returns object over and over again. Runs indefinitely\nunless the times argument is specified. Used as argument to imap() for\ninvariant function parameters. Also used with izip() to create constant\nfields in a tuple record. Equivalent to:
def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in xrange(times):
            yield object
Make an iterator that computes the function using arguments obtained from\nthe iterable. Used instead of imap() when argument parameters are already\ngrouped in tuples from a single iterable (the data has been “pre-zipped”). The\ndifference between imap() and starmap() parallels the distinction\nbetween function(a,b) and function(*c). Equivalent to:
def starmap(function, iterable):
    # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
    for args in iterable:
        yield function(*args)
\nChanged in version 2.6: Previously, starmap() required the function arguments to be tuples.\nNow, any iterable is allowed.
\nMake an iterator that returns elements from the iterable as long as the\npredicate is true. Equivalent to:
def takewhile(predicate, iterable):
    # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
    for x in iterable:
        if predicate(x):
            yield x
        else:
            break
Return n independent iterators from a single iterable. Equivalent to:
def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                newval = next(it)       # fetch a new value and
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)
Once tee() has made a split, the original iterable should not be\nused anywhere else; otherwise, the iterable could get advanced without\nthe tee objects being informed.
\nThis itertool may require significant auxiliary storage (depending on how\nmuch temporary data needs to be stored). In general, if one iterator uses\nmost or all of the data before another iterator starts, it is faster to use\nlist() instead of tee().
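A typical use is the pairwise recipe from the recipes section; a runnable sketch (using the built-in zip() in place of izip() so it is self-contained):

```python
from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2,s3), ..."
    a, b = tee(iterable)
    next(b, None)       # advance one copy by a single element
    return zip(a, b)

assert list(pairwise([1, 2, 3, 4])) == [(1, 2), (2, 3), (3, 4)]
```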
\n\nNew in version 2.4.
\nThis section shows recipes for creating an extended toolset using the existing\nitertools as building blocks.
\nThe extended tools offer the same high performance as the underlying toolset.\nThe superior memory performance is kept by processing elements one at a time\nrather than bringing the whole iterable into memory all at once. Code volume is\nkept small by linking the tools together in a functional style which helps\neliminate temporary variables. High speed is retained by preferring\n“vectorized” building blocks over the use of for-loops and generators\nwhich incur interpreter overhead.
def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

def tabulate(function, start=0):
    "Return function(0), function(1), ..."
    return imap(function, count(start))

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

def nth(iterable, n, default=None):
    "Returns the nth item or a default value"
    return next(islice(iterable, n, None), default)

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(imap(pred, iterable))

def padnone(iterable):
    """Returns the sequence elements and then returns None indefinitely.

    Useful for emulating the behavior of the built-in map() function.
    """
    return chain(iterable, repeat(None))

def ncycles(iterable, n):
    "Returns the sequence elements n times"
    return chain.from_iterable(repeat(tuple(iterable), n))

def dotproduct(vec1, vec2):
    return sum(imap(operator.mul, vec1, vec2))

def flatten(listOfLists):
    "Flatten one level of nesting"
    return chain.from_iterable(listOfLists)

def repeatfunc(func, times=None, *args):
    """Repeat calls to func with specified arguments.

    Example:  repeatfunc(random.random)
    """
    if times is None:
        return starmap(func, repeat(args))
    return starmap(func, repeat(args, times))

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in ifilterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return imap(next, imap(itemgetter(1), groupby(iterable, key)))

def iter_except(func, exception, first=None):
    """ Call a function repeatedly until an exception is raised.

    Converts a call-until-exception interface to an iterator interface.
    Like __builtin__.iter(func, sentinel) but uses an exception instead
    of a sentinel to end the loop.

    Examples:
        bsddbiter = iter_except(db.next, bsddb.error, db.first)
        heapiter = iter_except(functools.partial(heappop, h), IndexError)
        dictiter = iter_except(d.popitem, KeyError)
        dequeiter = iter_except(d.popleft, IndexError)
        queueiter = iter_except(q.get_nowait, Queue.Empty)
        setiter = iter_except(s.pop, KeyError)

    """
    try:
        if first is not None:
            yield first()
        while 1:
            yield func()
    except exception:
        pass

def random_product(*args, **kwds):
    "Random selection from itertools.product(*args, **kwds)"
    pools = map(tuple, args) * kwds.get('repeat', 1)
    return tuple(random.choice(pool) for pool in pools)

def random_permutation(iterable, r=None):
    "Random selection from itertools.permutations(iterable, r)"
    pool = tuple(iterable)
    r = len(pool) if r is None else r
    return tuple(random.sample(pool, r))

def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(xrange(n), r))
    return tuple(pool[i] for i in indices)

def random_combination_with_replacement(iterable, r):
    "Random selection from itertools.combinations_with_replacement(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.randrange(n) for i in xrange(r))
    return tuple(pool[i] for i in indices)
Note, many of the above recipes can be optimized by replacing global lookups\nwith local variables defined as default values. For example, the\ndotproduct recipe can be written as:
def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
    return sum(imap(mul, vec1, vec2))
\nNew in version 2.4.
\nThe decimal module provides support for decimal floating point\narithmetic. It offers several advantages over the float datatype:
\nDecimal “is based on a floating-point model which was designed with people\nin mind, and necessarily has a paramount guiding principle – computers must\nprovide an arithmetic that works in the same way as the arithmetic that\npeople learn at school.” – excerpt from the decimal arithmetic specification.
Decimal numbers can be represented exactly. In contrast, numbers like
1.1 and 2.2 do not have exact representations in binary
floating point. End users typically would not expect 1.1 + 2.2 to display
as 3.3000000000000003 as it does with binary floating point.
\nThe exactness carries over into arithmetic. In decimal floating point, 0.1\n+ 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result\nis 5.5511151231257827e-017. While near to zero, the differences\nprevent reliable equality testing and differences can accumulate. For this\nreason, decimal is preferred in accounting applications which have strict\nequality invariants.
\nThe decimal module incorporates a notion of significant places so that 1.30\n+ 1.20 is 2.50. The trailing zero is kept to indicate significance.\nThis is the customary presentation for monetary applications. For\nmultiplication, the “schoolbook” approach uses all the figures in the\nmultiplicands. For instance, 1.3 * 1.2 gives 1.56 while 1.30 *\n1.20 gives 1.5600.
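These significance rules can be observed directly:

```python
from decimal import Decimal

# Addition keeps the trailing zero of its operands...
assert str(Decimal('1.30') + Decimal('1.20')) == '2.50'
# ...and multiplication uses all the figures in the multiplicands.
assert str(Decimal('1.3') * Decimal('1.2')) == '1.56'
assert str(Decimal('1.30') * Decimal('1.20')) == '1.5600'
```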
\nUnlike hardware based binary floating point, the decimal module has a user\nalterable precision (defaulting to 28 places) which can be as large as needed for\na given problem:
\n>>> from decimal import *\n>>> getcontext().prec = 6\n>>> Decimal(1) / Decimal(7)\nDecimal('0.142857')\n>>> getcontext().prec = 28\n>>> Decimal(1) / Decimal(7)\nDecimal('0.1428571428571428571428571429')\n
Both binary and decimal floating point are implemented in terms of published\nstandards. While the built-in float type exposes only a modest portion of its\ncapabilities, the decimal module exposes all required parts of the standard.\nWhen needed, the programmer has full control over rounding and signal handling.\nThis includes an option to enforce exact arithmetic by using exceptions\nto block any inexact operations.
\nThe decimal module was designed to support “without prejudice, both exact\nunrounded decimal arithmetic (sometimes called fixed-point arithmetic)\nand rounded floating-point arithmetic.” – excerpt from the decimal\narithmetic specification.
\nThe module design is centered around three concepts: the decimal number, the\ncontext for arithmetic, and signals.
\nA decimal number is immutable. It has a sign, coefficient digits, and an\nexponent. To preserve significance, the coefficient digits do not truncate\ntrailing zeros. Decimals also include special values such as\nInfinity, -Infinity, and NaN. The standard also\ndifferentiates -0 from +0.
\nThe context for arithmetic is an environment specifying precision, rounding\nrules, limits on exponents, flags indicating the results of operations, and trap\nenablers which determine whether signals are treated as exceptions. Rounding\noptions include ROUND_CEILING, ROUND_DOWN,\nROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_EVEN,\nROUND_HALF_UP, ROUND_UP, and ROUND_05UP.
\nSignals are groups of exceptional conditions arising during the course of\ncomputation. Depending on the needs of the application, signals may be ignored,\nconsidered as informational, or treated as exceptions. The signals in the\ndecimal module are: Clamped, InvalidOperation,\nDivisionByZero, Inexact, Rounded, Subnormal,\nOverflow, and Underflow.
\nFor each signal there is a flag and a trap enabler. When a signal is\nencountered, its flag is set to one, then, if the trap enabler is\nset to one, an exception is raised. Flags are sticky, so the user needs to\nreset them before monitoring a calculation.
\nSee also
\nThe usual start to using decimals is importing the module, viewing the current\ncontext with getcontext() and, if necessary, setting new values for\nprecision, rounding, or enabled traps:
\n>>> from decimal import *\n>>> getcontext()\nContext(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,\n capitals=1, flags=[], traps=[Overflow, DivisionByZero,\n InvalidOperation])\n\n>>> getcontext().prec = 7 # Set a new precision\n
Decimal instances can be constructed from integers, strings, floats, or tuples.\nConstruction from an integer or a float performs an exact conversion of the\nvalue of that integer or float. Decimal numbers include special values such as\nNaN which stands for “Not a number”, positive and negative\nInfinity, and -0.
\n>>> getcontext().prec = 28\n>>> Decimal(10)\nDecimal('10')\n>>> Decimal('3.14')\nDecimal('3.14')\n>>> Decimal(3.14)\nDecimal('3.140000000000000124344978758017532527446746826171875')\n>>> Decimal((0, (3, 1, 4), -2))\nDecimal('3.14')\n>>> Decimal(str(2.0 ** 0.5))\nDecimal('1.41421356237')\n>>> Decimal(2) ** Decimal('0.5')\nDecimal('1.414213562373095048801688724')\n>>> Decimal('NaN')\nDecimal('NaN')\n>>> Decimal('-Infinity')\nDecimal('-Infinity')\n
The significance of a new Decimal is determined solely by the number of digits\ninput. Context precision and rounding only come into play during arithmetic\noperations.
\n>>> getcontext().prec = 6\n>>> Decimal('3.0')\nDecimal('3.0')\n>>> Decimal('3.1415926535')\nDecimal('3.1415926535')\n>>> Decimal('3.1415926535') + Decimal('2.7182818285')\nDecimal('5.85987')\n>>> getcontext().rounding = ROUND_UP\n>>> Decimal('3.1415926535') + Decimal('2.7182818285')\nDecimal('5.85988')\n
Decimals interact well with much of the rest of Python. Here is a small decimal\nfloating point flying circus:
>>> data = map(Decimal, '1.34 1.87 3.45 2.35 1.00 0.03 9.25'.split())
>>> max(data)
Decimal('9.25')
>>> min(data)
Decimal('0.03')
>>> sorted(data)
[Decimal('0.03'), Decimal('1.00'), Decimal('1.34'), Decimal('1.87'),
 Decimal('2.35'), Decimal('3.45'), Decimal('9.25')]
>>> sum(data)
Decimal('19.29')
>>> a,b,c = data[:3]
>>> str(a)
'1.34'
>>> float(a)
1.34
>>> round(a, 1)   # round() first converts to binary floating point
1.3
>>> int(a)
1
>>> a * 5
Decimal('6.70')
>>> a * b
Decimal('2.5058')
>>> c % a
Decimal('0.77')
And some mathematical functions are also available to Decimal:
\n>>> getcontext().prec = 28\n>>> Decimal(2).sqrt()\nDecimal('1.414213562373095048801688724')\n>>> Decimal(1).exp()\nDecimal('2.718281828459045235360287471')\n>>> Decimal('10').ln()\nDecimal('2.302585092994045684017991455')\n>>> Decimal('10').log10()\nDecimal('1')\n
The quantize() method rounds a number to a fixed exponent. This method is\nuseful for monetary applications that often round results to a fixed number of\nplaces:
\n>>> Decimal('7.325').quantize(Decimal('.01'), rounding=ROUND_DOWN)\nDecimal('7.32')\n>>> Decimal('7.325').quantize(Decimal('1.'), rounding=ROUND_UP)\nDecimal('8')\n
As shown above, the getcontext() function accesses the current context and\nallows the settings to be changed. This approach meets the needs of most\napplications.
\nFor more advanced work, it may be useful to create alternate contexts using the\nContext() constructor. To make an alternate active, use the setcontext()\nfunction.
In accordance with the standard, the decimal module provides two ready-to-use
standard contexts, BasicContext and ExtendedContext. The
former is especially useful for debugging because many of the traps are
enabled:
\n>>> myothercontext = Context(prec=60, rounding=ROUND_HALF_DOWN)\n>>> setcontext(myothercontext)\n>>> Decimal(1) / Decimal(7)\nDecimal('0.142857142857142857142857142857142857142857142857142857142857')\n\n>>> ExtendedContext\nContext(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,\n capitals=1, flags=[], traps=[])\n>>> setcontext(ExtendedContext)\n>>> Decimal(1) / Decimal(7)\nDecimal('0.142857143')\n>>> Decimal(42) / Decimal(0)\nDecimal('Infinity')\n\n>>> setcontext(BasicContext)\n>>> Decimal(42) / Decimal(0)\nTraceback (most recent call last):\n File "<pyshell#143>", line 1, in -toplevel-\n Decimal(42) / Decimal(0)\nDivisionByZero: x / 0\n
Contexts also have signal flags for monitoring exceptional conditions\nencountered during computations. The flags remain set until explicitly cleared,\nso it is best to clear the flags before each set of monitored computations by\nusing the clear_flags() method.
\n>>> setcontext(ExtendedContext)\n>>> getcontext().clear_flags()\n>>> Decimal(355) / Decimal(113)\nDecimal('3.14159292')\n>>> getcontext()\nContext(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,\n capitals=1, flags=[Rounded, Inexact], traps=[])\n
The flags entry shows that the rational approximation to Pi was\nrounded (digits beyond the context precision were thrown away) and that the\nresult is inexact (some of the discarded digits were non-zero).
\nIndividual traps are set using the dictionary in the traps field of a\ncontext:
\n>>> setcontext(ExtendedContext)\n>>> Decimal(1) / Decimal(0)\nDecimal('Infinity')\n>>> getcontext().traps[DivisionByZero] = 1\n>>> Decimal(1) / Decimal(0)\nTraceback (most recent call last):\n File "<pyshell#112>", line 1, in -toplevel-\n Decimal(1) / Decimal(0)\nDivisionByZero: x / 0\n
Most programs adjust the current context only once, at the beginning of the\nprogram. And, in many applications, data is converted to Decimal with\na single cast inside a loop. With context set and decimals created, the bulk of\nthe program manipulates the data no differently than with other Python numeric\ntypes.
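A minimal sketch of that pattern (the precision value and input strings here are illustrative): the context is adjusted once up front, each datum is converted once, and the rest of the code treats the values like any other numbers.

```python
from decimal import Decimal, getcontext

getcontext().prec = 6            # adjust the current context once, up front

raw = ['1.10', '2.25', '3.33']   # illustrative input data
data = [Decimal(s) for s in raw] # single conversion per datum
total = sum(data, Decimal(0))    # ordinary arithmetic from here on
print(total)                     # 6.68
```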
\nConstruct a new Decimal object based from value.
\nvalue can be an integer, string, tuple, float, or another Decimal\nobject. If no value is given, returns Decimal('0'). If value is a\nstring, it should conform to the decimal numeric string syntax after leading\nand trailing whitespace characters are removed:
\nsign ::= '+' | '-'\ndigit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'\nindicator ::= 'e' | 'E'\ndigits ::= digit [digit]...\ndecimal-part ::= digits '.' [digits] | ['.'] digits\nexponent-part ::= indicator [sign] digits\ninfinity ::= 'Infinity' | 'Inf'\nnan ::= 'NaN' [digits] | 'sNaN' [digits]\nnumeric-value ::= decimal-part [exponent-part] | infinity\nnumeric-string ::= [sign] numeric-value | [sign] nan
\nIf value is a unicode string then other Unicode decimal digits\nare also permitted where digit appears above. These include\ndecimal digits from various other alphabets (for example,\nArabic-Indic and Devanāgarī digits) along with the fullwidth digits\nu'\\uff10' through u'\\uff19'.
\nIf value is a tuple, it should have three components, a sign\n(0 for positive or 1 for negative), a tuple of\ndigits, and an integer exponent. For example, Decimal((0, (1, 4, 1, 4), -3))\nreturns Decimal('1.414').
\nIf value is a float, the binary floating point value is losslessly\nconverted to its exact decimal equivalent. This conversion can often require\n53 or more digits of precision. For example, Decimal(float('1.1'))\nconverts to\nDecimal('1.100000000000000088817841970012523233890533447265625').
\nThe context precision does not affect how many digits are stored. That is\ndetermined exclusively by the number of digits in value. For example,\nDecimal('3.00000') records all five zeros even if the context precision is\nonly three.
\nThe purpose of the context argument is determining what to do if value is a\nmalformed string. If the context traps InvalidOperation, an exception\nis raised; otherwise, the constructor returns a new Decimal with the value of\nNaN.
\nOnce constructed, Decimal objects are immutable.
\n\nChanged in version 2.6: leading and trailing whitespace characters are permitted when\ncreating a Decimal instance from a string.
\n\nChanged in version 2.7: The argument to the constructor is now permitted to be a float instance.
\nDecimal floating point objects share many properties with the other built-in\nnumeric types such as float and int. All of the usual math\noperations and special methods apply. Likewise, decimal objects can be\ncopied, pickled, printed, used as dictionary keys, used as set elements,\ncompared, sorted, and coerced to another type (such as float or\nlong).
\nDecimal objects cannot generally be combined with floats in\narithmetic operations: an attempt to add a Decimal to a\nfloat, for example, will raise a TypeError.\nThere’s one exception to this rule: it’s possible to use Python’s\ncomparison operators to compare a float instance x\nwith a Decimal instance y. Without this exception,\ncomparisons between Decimal and float instances\nwould follow the general rules for comparing objects of different\ntypes described in the Expressions section of the reference\nmanual, leading to confusing results.
\n\nChanged in version 2.7: A comparison between a float instance x and a\nDecimal instance y now returns a result based on\nthe values of x and y. In earlier versions x < y\nreturned the same (arbitrary) result for any Decimal\ninstance x and any float instance y.
\nIn addition to the standard numeric properties, decimal floating point\nobjects also have a number of specialized methods:
\nReturn a named tuple representation of the number:\nDecimalTuple(sign, digits, exponent).
\n\nChanged in version 2.6: Use a named tuple.
\nReturn the canonical encoding of the argument. Currently, the encoding of\na Decimal instance is always canonical, so this operation returns\nits argument unchanged.
\n\nNew in version 2.6.
\nCompare the values of two Decimal instances. This operation behaves in\nthe same way as the usual comparison method __cmp__(), except that\ncompare() returns a Decimal instance rather than an integer, and if\neither operand is a NaN then the result is a NaN:
\na or b is a NaN ==> Decimal('NaN')\na < b ==> Decimal('-1')\na == b ==> Decimal('0')\na > b ==> Decimal('1')
\nThis operation is identical to the compare() method, except that all\nNaNs signal. That is, if neither operand is a signaling NaN then any\nquiet NaN operand is treated as though it were a signaling NaN.
\n\nNew in version 2.6.
\nCompare two operands using their abstract representation rather than their\nnumerical value. Similar to the compare() method, but the result\ngives a total ordering on Decimal instances. Two\nDecimal instances with the same numeric value but different\nrepresentations compare unequal in this ordering:
\n>>> Decimal('12.0').compare_total(Decimal('12'))\nDecimal('-1')\n
Quiet and signaling NaNs are also included in the total ordering. The\nresult of this function is Decimal('0') if both operands have the same\nrepresentation, Decimal('-1') if the first operand is lower in the\ntotal order than the second, and Decimal('1') if the first operand is\nhigher in the total order than the second operand. See the specification\nfor details of the total order.
\n\nNew in version 2.6.
\nCompare two operands using their abstract representation rather than their\nvalue as in compare_total(), but ignoring the sign of each operand.\nx.compare_total_mag(y) is equivalent to\nx.copy_abs().compare_total(y.copy_abs()).
\n\nNew in version 2.6.
\nJust returns self, this method is only to comply with the Decimal\nSpecification.
\n\nNew in version 2.6.
\nReturn the absolute value of the argument. This operation is unaffected\nby the context and is quiet: no flags are changed and no rounding is\nperformed.
\n\nNew in version 2.6.
\nReturn the negation of the argument. This operation is unaffected by the\ncontext and is quiet: no flags are changed and no rounding is performed.
\n\nNew in version 2.6.
\nReturn a copy of the first operand with the sign set to be the same as the\nsign of the second operand. For example:
\n>>> Decimal('2.3').copy_sign(Decimal('-1.5'))\nDecimal('-2.3')\n
This operation is unaffected by the context and is quiet: no flags are\nchanged and no rounding is performed.
\n\nNew in version 2.6.
\nReturn the value of the (natural) exponential function e**x at the\ngiven number. The result is correctly rounded using the\nROUND_HALF_EVEN rounding mode.
\n>>> Decimal(1).exp()\nDecimal('2.718281828459045235360287471')\n>>> Decimal(321).exp()\nDecimal('2.561702493119680037517373933E+139')\n
\nNew in version 2.6.
\nClassmethod that converts a float to a decimal number, exactly.
\nNote Decimal.from_float(0.1) is not the same as Decimal(‘0.1’).\nSince 0.1 is not exactly representable in binary floating point, the\nvalue is stored as the nearest representable value which is\n0x1.999999999999ap-4. That equivalent value in decimal is\n0.1000000000000000055511151231257827021181583404541015625.
\nNote
\nFrom Python 2.7 onwards, a Decimal instance\ncan also be constructed directly from a float.
\n>>> Decimal.from_float(0.1)\nDecimal('0.1000000000000000055511151231257827021181583404541015625')\n>>> Decimal.from_float(float('nan'))\nDecimal('NaN')\n>>> Decimal.from_float(float('inf'))\nDecimal('Infinity')\n>>> Decimal.from_float(float('-inf'))\nDecimal('-Infinity')\n
\nNew in version 2.7.
\nFused multiply-add. Return self*other+third with no rounding of the\nintermediate product self*other.
\n>>> Decimal(2).fma(3, 5)\nDecimal('11')\n
\nNew in version 2.6.
\nReturn True if the argument is canonical and False\notherwise. Currently, a Decimal instance is always canonical, so\nthis operation always returns True.
\n\nNew in version 2.6.
\nReturn True if the argument is a finite number, and\nFalse if the argument is an infinity or a NaN.
\n\nNew in version 2.6.
\nReturn True if the argument is either positive or negative\ninfinity and False otherwise.
\n\nNew in version 2.6.
\nReturn True if the argument is a (quiet or signaling) NaN and\nFalse otherwise.
\n\nNew in version 2.6.
\nReturn True if the argument is a normal finite non-zero\nnumber with an adjusted exponent greater than or equal to Emin.\nReturn False if the argument is zero, subnormal, infinite or a\nNaN. Note, the term normal is used here in a different sense with\nthe normalize() method which is used to create canonical values.
\n\nNew in version 2.6.
\nReturn True if the argument is a quiet NaN, and\nFalse otherwise.
\n\nNew in version 2.6.
\nReturn True if the argument has a negative sign and\nFalse otherwise. Note that zeros and NaNs can both carry signs.
\n\nNew in version 2.6.
\nReturn True if the argument is a signaling NaN and False\notherwise.
\n\nNew in version 2.6.
\nReturn True if the argument is subnormal, and False\notherwise. A number is subnormal is if it is nonzero, finite, and has an\nadjusted exponent less than Emin.
\n\nNew in version 2.6.
\nReturn True if the argument is a (positive or negative) zero and\nFalse otherwise.
\n\nNew in version 2.6.
\nReturn the natural (base e) logarithm of the operand. The result is\ncorrectly rounded using the ROUND_HALF_EVEN rounding mode.
\n\nNew in version 2.6.
\nReturn the base ten logarithm of the operand. The result is correctly\nrounded using the ROUND_HALF_EVEN rounding mode.
\n\nNew in version 2.6.
\nFor a nonzero number, return the adjusted exponent of its operand as a\nDecimal instance. If the operand is a zero then\nDecimal('-Infinity') is returned and the DivisionByZero flag\nis raised. If the operand is an infinity then Decimal('Infinity') is\nreturned.
\n\nNew in version 2.6.
\nlogical_and() is a logical operation which takes two logical\noperands (see Logical operands). The result is the\ndigit-wise and of the two operands.
\n\nNew in version 2.6.
\nlogical_invert() is a logical operation. The\nresult is the digit-wise inversion of the operand.
\n\nNew in version 2.6.
\nlogical_or() is a logical operation which takes two logical\noperands (see Logical operands). The result is the\ndigit-wise or of the two operands.
\n\nNew in version 2.6.
\nlogical_xor() is a logical operation which takes two logical\noperands (see Logical operands). The result is the\ndigit-wise exclusive or of the two operands.
\n\nNew in version 2.6.
\nSimilar to the max() method, but the comparison is done using the\nabsolute values of the operands.
\n\nNew in version 2.6.
\nSimilar to the min() method, but the comparison is done using the\nabsolute values of the operands.
\n\nNew in version 2.6.
\nReturn the largest number representable in the given context (or in the\ncurrent thread’s context if no context is given) that is smaller than the\ngiven operand.
\n\nNew in version 2.6.
\nReturn the smallest number representable in the given context (or in the\ncurrent thread’s context if no context is given) that is larger than the\ngiven operand.
\n\nNew in version 2.6.
\nIf the two operands are unequal, return the number closest to the first\noperand in the direction of the second operand. If both operands are\nnumerically equal, return a copy of the first operand with the sign set to\nbe the same as the sign of the second operand.
\n\nNew in version 2.6.
\nReturn a string describing the class of the operand. The returned value\nis one of the following ten strings.
\n\nNew in version 2.6.
\nReturn a value equal to the first operand after rounding and having the\nexponent of the second operand.
\n>>> Decimal('1.41421356').quantize(Decimal('1.000'))\nDecimal('1.414')\n
Unlike other operations, if the length of the coefficient after the\nquantize operation would be greater than precision, then an\nInvalidOperation is signaled. This guarantees that, unless there\nis an error condition, the quantized exponent is always equal to that of\nthe right-hand operand.
\nAlso unlike other operations, quantize never signals Underflow, even if\nthe result is subnormal and inexact.
\nIf the exponent of the second operand is larger than that of the first\nthen rounding may be necessary. In this case, the rounding mode is\ndetermined by the rounding argument if given, else by the given\ncontext argument; if neither argument is given the rounding mode of\nthe current thread’s context is used.
\nIf watchexp is set (default), then an error is returned whenever the\nresulting exponent is greater than Emax or less than\nEtiny.
\nReturn Decimal(10), the radix (base) in which the Decimal\nclass does all its arithmetic. Included for compatibility with the\nspecification.
\n\nNew in version 2.6.
\nCompute the modulo as either a positive or negative value depending on\nwhich is closest to zero. For instance, Decimal(10).remainder_near(6)\nreturns Decimal('-2') which is closer to zero than Decimal('4').
\nIf both are equally close, the one chosen will have the same sign as\nself.
\nReturn the result of rotating the digits of the first operand by an amount\nspecified by the second operand. The second operand must be an integer in\nthe range -precision through precision. The absolute value of the second\noperand gives the number of places to rotate. If the second operand is\npositive then rotation is to the left; otherwise rotation is to the right.\nThe coefficient of the first operand is padded on the left with zeros to\nlength precision if necessary. The sign and exponent of the first operand\nare unchanged.
\n\nNew in version 2.6.
\nReturn the first operand with exponent adjusted by the second.\nEquivalently, return the first operand multiplied by 10**other. The\nsecond operand must be an integer.
\n\nNew in version 2.6.
\nReturn the result of shifting the digits of the first operand by an amount\nspecified by the second operand. The second operand must be an integer in\nthe range -precision through precision. The absolute value of the second\noperand gives the number of places to shift. If the second operand is\npositive then the shift is to the left; otherwise the shift is to the\nright. Digits shifted into the coefficient are zeros. The sign and\nexponent of the first operand are unchanged.
\n\nNew in version 2.6.
\nConvert to an engineering-type string.
\nEngineering notation has an exponent which is a multiple of 3, so there\nare up to 3 digits left of the decimal place. For example, converts\nDecimal('123E+1') to Decimal('1.23E+3')
\nRound to the nearest integer, signaling Inexact or\nRounded as appropriate if rounding occurs. The rounding mode is\ndetermined by the rounding parameter if given, else by the given\ncontext. If neither parameter is given then the rounding mode of the\ncurrent context is used.
\n\nNew in version 2.6.
\nRound to the nearest integer without signaling Inexact or\nRounded. If given, applies rounding; otherwise, uses the\nrounding method in either the supplied context or the current context.
\n\nChanged in version 2.6: renamed from to_integral to to_integral_value. The old name\nremains valid for compatibility.
\nContexts are environments for arithmetic operations. They govern precision, set\nrules for rounding, determine which signals are treated as exceptions, and limit\nthe range for exponents.
\nEach thread has its own current context which is accessed or changed using the\ngetcontext() and setcontext() functions:
\nBeginning with Python 2.5, you can also use the with statement and\nthe localcontext() function to temporarily change the active context.
\nReturn a context manager that will set the current context for the active thread\nto a copy of c on entry to the with-statement and restore the previous context\nwhen exiting the with-statement. If no context is specified, a copy of the\ncurrent context is used.
\n\nNew in version 2.5.
\nFor example, the following code sets the current decimal precision to 42 places,\nperforms a calculation, and then automatically restores the previous context:
\nfrom decimal import localcontext\n\nwith localcontext() as ctx:\n ctx.prec = 42 # Perform a high precision calculation\n s = calculate_something()\ns = +s # Round the final result back to the default precision\n
New contexts can also be created using the Context constructor\ndescribed below. In addition, the module provides three pre-made contexts:
\nThis is a standard context defined by the General Decimal Arithmetic\nSpecification. Precision is set to nine. Rounding is set to\nROUND_HALF_UP. All flags are cleared. All traps are enabled (treated\nas exceptions) except Inexact, Rounded, and\nSubnormal.
\nBecause many of the traps are enabled, this context is useful for debugging.
\nThis is a standard context defined by the General Decimal Arithmetic\nSpecification. Precision is set to nine. Rounding is set to\nROUND_HALF_EVEN. All flags are cleared. No traps are enabled (so that\nexceptions are not raised during computations).
\nBecause the traps are disabled, this context is useful for applications that\nprefer to have result value of NaN or Infinity instead of\nraising exceptions. This allows an application to complete a run in the\npresence of conditions that would otherwise halt the program.
\nThis context is used by the Context constructor as a prototype for new\ncontexts. Changing a field (such a precision) has the effect of changing the\ndefault for new contexts created by the Context constructor.
\nThis context is most useful in multi-threaded environments. Changing one of the\nfields before threads are started has the effect of setting system-wide\ndefaults. Changing the fields after threads have started is not recommended as\nit would require thread synchronization to prevent race conditions.
\nIn single threaded environments, it is preferable to not use this context at\nall. Instead, simply create contexts explicitly as described below.
\nThe default values are precision=28, rounding=ROUND_HALF_EVEN, and enabled traps\nfor Overflow, InvalidOperation, and DivisionByZero.
\nIn addition to the three supplied contexts, new contexts can be created with the\nContext constructor.
\nCreates a new context. If a field is not specified or is None, the\ndefault values are copied from the DefaultContext. If the flags\nfield is not specified or is None, all flags are cleared.
\nThe prec field is a positive integer that sets the precision for arithmetic\noperations in the context.
\nThe rounding option is one of:
\nThe traps and flags fields list any signals to be set. Generally, new\ncontexts should only set traps and leave the flags clear.
\nThe Emin and Emax fields are integers specifying the outer limits allowable\nfor exponents.
\nThe capitals field is either 0 or 1 (the default). If set to\n1, exponents are printed with a capital E; otherwise, a\nlowercase e is used: Decimal('6.02e+23').
\n\nChanged in version 2.6: The ROUND_05UP rounding mode was added.
\nThe Context class defines several general purpose methods as well as\na large number of methods for doing arithmetic directly in a given context.\nIn addition, for each of the Decimal methods described above (with\nthe exception of the adjusted() and as_tuple() methods) there is\na corresponding Context method. For example, for a Context\ninstance C and Decimal instance x, C.exp(x) is\nequivalent to x.exp(context=C). Each Context method accepts a\nPython integer (an instance of int or long) anywhere that a\nDecimal instance is accepted.
\nCreates a new Decimal instance from num but using self as\ncontext. Unlike the Decimal constructor, the context precision,\nrounding method, flags, and traps are applied to the conversion.
\nThis is useful because constants are often given to a greater precision\nthan is needed by the application. Another benefit is that rounding\nimmediately eliminates unintended effects from digits beyond the current\nprecision. In the following example, using unrounded inputs means that\nadding zero to a sum can change the result:
\n>>> getcontext().prec = 3\n>>> Decimal('3.4445') + Decimal('1.0023')\nDecimal('4.45')\n>>> Decimal('3.4445') + Decimal(0) + Decimal('1.0023')\nDecimal('4.44')\n
This method implements the to-number operation of the IBM specification.\nIf the argument is a string, no leading or trailing whitespace is\npermitted.
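A sketch of the difference from the plain constructor (values illustrative):

```python
from decimal import Decimal, Context, ROUND_DOWN

ctx = Context(prec=3, rounding=ROUND_DOWN)
print(Decimal('3.4445'))             # 3.4445 -- constructor keeps all digits
print(ctx.create_decimal('3.4445'))  # 3.44   -- context precision and rounding applied
```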
\nCreates a new Decimal instance from a float f but rounding using self\nas the context. Unlike the Decimal.from_float() class method,\nthe context precision, rounding method, flags, and traps are applied to\nthe conversion.
\n>>> context = Context(prec=5, rounding=ROUND_DOWN)\n>>> context.create_decimal_from_float(math.pi)\nDecimal('3.1415')\n>>> context = Context(prec=5, traps=[Inexact])\n>>> context.create_decimal_from_float(math.pi)\nTraceback (most recent call last):\n ...\nInexact: None\n
\nNew in version 2.7.
\nThe usual approach to working with decimals is to create Decimal\ninstances and then apply arithmetic operations which take place within the\ncurrent context for the active thread. An alternative approach is to use\ncontext methods for calculating within a specific context. The methods are\nsimilar to those for the Decimal class and are only briefly\nrecounted here.
\nReturn x to the power of y, reduced modulo modulo if given.
\nWith two arguments, compute x**y. If x is negative then y\nmust be integral. The result will be inexact unless y is integral and\nthe result is finite and can be expressed exactly in ‘precision’ digits.\nThe result should always be correctly rounded, using the rounding mode of\nthe current thread’s context.
\nWith three arguments, compute (x**y)
\n\n\n
\n- all three arguments must be integral
\n- y must be nonnegative
\n- at least one of x or y must be nonzero
\n- modulo must be nonzero and have at most ‘precision’ digits
\n
The value resulting from Context.power(x, y, modulo) is\nequal to the value that would be obtained by computing (x**y)\n
\nChanged in version 2.6: y may now be nonintegral in x**y.\nStricter requirements for the three-argument version.
\nReturns the remainder from integer division.
\nThe sign of the result, if non-zero, is the same as that of the original\ndividend.
\nSignals represent conditions that arise during computation. Each corresponds to\none context flag and one context trap enabler.
\nThe context flag is set whenever the condition is encountered. After the\ncomputation, flags may be checked for informational purposes (for instance, to\ndetermine whether a computation was exact). After checking the flags, be sure to\nclear all flags before starting the next computation.
\nIf the context’s trap enabler is set for the signal, then the condition causes a\nPython exception to be raised. For example, if the DivisionByZero trap\nis set, then a DivisionByZero exception is raised upon encountering the\ncondition.
\nAltered an exponent to fit representation constraints.
\nTypically, clamping occurs when an exponent falls outside the context’s\nEmin and Emax limits. If possible, the exponent is reduced to\nfit by adding zeros to the coefficient.
\nSignals the division of a non-infinite number by zero.
\nCan occur with division, modulo division, or when raising a number to a negative\npower. If this signal is not trapped, returns Infinity or\n-Infinity with the sign determined by the inputs to the calculation.
\nIndicates that rounding occurred and the result is not exact.
\nSignals when non-zero digits were discarded during rounding. The rounded result\nis returned. The signal flag or trap is used to detect when results are\ninexact.
\nAn invalid operation was performed.
\nIndicates that an operation was requested that does not make sense. If not\ntrapped, returns NaN. Possible causes include:
\nInfinity - Infinity\n0 * Infinity\nInfinity / Infinity\nx 0\nInfinity x\nx._rescale( non-integer )\nsqrt(-x) and x > 0\n0 ** 0\nx ** (non-integer)\nx ** Infinity\n
Numerical overflow.
\nIndicates the exponent is larger than Emax after rounding has\noccurred. If not trapped, the result depends on the rounding mode, either\npulling inward to the largest representable finite number or rounding outward\nto Infinity. In either case, Inexact and Rounded\nare also signaled.
\nRounding occurred though possibly no information was lost.
\nSignaled whenever rounding discards digits; even if those digits are zero\n(such as rounding 5.00 to 5.0). If not trapped, returns\nthe result unchanged. This signal is used to detect loss of significant\ndigits.
\nExponent was lower than Emin prior to rounding.
\nOccurs when an operation result is subnormal (the exponent is too small). If\nnot trapped, returns the result unchanged.
\nNumerical underflow with result rounded to zero.
\nOccurs when a subnormal result is pushed to zero by rounding. Inexact\nand Subnormal are also signaled.
\nThe following table summarizes the hierarchy of signals:
\nexceptions.ArithmeticError(exceptions.StandardError)\n DecimalException\n Clamped\n DivisionByZero(DecimalException, exceptions.ZeroDivisionError)\n Inexact\n Overflow(Inexact, Rounded)\n Underflow(Inexact, Rounded, Subnormal)\n InvalidOperation\n Rounded\n Subnormal
\nThe use of decimal floating point eliminates decimal representation error\n(making it possible to represent 0.1 exactly); however, some operations\ncan still incur round-off error when non-zero digits exceed the fixed precision.
\nThe effects of round-off error can be amplified by the addition or subtraction\nof nearly offsetting quantities resulting in loss of significance. Knuth\nprovides two instructive examples where rounded floating point arithmetic with\ninsufficient precision causes the breakdown of the associative and distributive\nproperties of addition:
\n# Examples from Seminumerical Algorithms, Section 4.2.2.\n>>> from decimal import Decimal, getcontext\n>>> getcontext().prec = 8\n\n>>> u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111')\n>>> (u + v) + w\nDecimal('9.5111111')\n>>> u + (v + w)\nDecimal('10')\n\n>>> u, v, w = Decimal(20000), Decimal(-6), Decimal('6.0000003')\n>>> (u*v) + (u*w)\nDecimal('0.01')\n>>> u * (v+w)\nDecimal('0.0060000')
\nThe decimal module makes it possible to restore the identities by\nexpanding the precision sufficiently to avoid loss of significance:
\n>>> getcontext().prec = 20\n>>> u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111')\n>>> (u + v) + w\nDecimal('9.51111111')\n>>> u + (v + w)\nDecimal('9.51111111')\n>>>\n>>> u, v, w = Decimal(20000), Decimal(-6), Decimal('6.0000003')\n>>> (u*v) + (u*w)\nDecimal('0.0060000')\n>>> u * (v+w)\nDecimal('0.0060000')\n
The number system for the decimal module provides special values\nincluding NaN, sNaN, -Infinity, Infinity,\nand two zeros, +0 and -0.
\nInfinities can be constructed directly with: Decimal('Infinity'). Also,\nthey can arise from dividing by zero when the DivisionByZero signal is\nnot trapped. Likewise, when the Overflow signal is not trapped, infinity\ncan result from rounding beyond the limits of the largest representable number.
\nThe infinities are signed (affine) and can be used in arithmetic operations\nwhere they get treated as very large, indeterminate numbers. For instance,\nadding a constant to infinity gives another infinite result.
\nSome operations are indeterminate and return NaN, or if the\nInvalidOperation signal is trapped, raise an exception. For example,\n0/0 returns NaN which means “not a number”. This variety of\nNaN is quiet and, once created, will flow through other computations\nalways resulting in another NaN. This behavior can be useful for a\nseries of computations that occasionally have missing inputs — it allows the\ncalculation to proceed while flagging specific results as invalid.
\nA variant is sNaN which signals rather than remaining quiet after every\noperation. This is a useful return value when an invalid result needs to\ninterrupt a calculation for special handling.
\nThe behavior of Python’s comparison operators can be a little surprising where a\nNaN is involved. A test for equality where one of the operands is a\nquiet or signaling NaN always returns False (even when doing\nDecimal('NaN')==Decimal('NaN')), while a test for inequality always returns\nTrue. An attempt to compare two Decimals using any of the <,\n<=, > or >= operators will raise the InvalidOperation signal\nif either operand is a NaN, and return False if this signal is\nnot trapped. Note that the General Decimal Arithmetic specification does not\nspecify the behavior of direct comparisons; these rules for comparisons\ninvolving a NaN were taken from the IEEE 854 standard (see Table 3 in\nsection 5.7). To ensure strict standards-compliance, use the compare()\nand compare-signal() methods instead.
\nThe signed zeros can result from calculations that underflow. They keep the sign\nthat would have resulted if the calculation had been carried out to greater\nprecision. Since their magnitude is zero, both positive and negative zeros are\ntreated as equal and their sign is informational.
\nIn addition to the two signed zeros which are distinct yet equal, there are\nvarious representations of zero with differing precisions yet equivalent in\nvalue. This takes a bit of getting used to. For an eye accustomed to\nnormalized floating point representations, it is not immediately obvious that\nthe following calculation returns a value equal to zero:
\n>>> 1 / Decimal('Infinity')\nDecimal('0E-1000000026')\n
The getcontext() function accesses a different Context object for
each thread. Having separate thread contexts means that threads may make
changes (such as getcontext().prec = 10) without interfering with other threads.
\nLikewise, the setcontext() function automatically assigns its target to\nthe current thread.
\nIf setcontext() has not been called before getcontext(), then\ngetcontext() will automatically create a new context for use in the\ncurrent thread.
\nThe new context is copied from a prototype context called DefaultContext. To\ncontrol the defaults so that each thread will use the same values throughout the\napplication, directly modify the DefaultContext object. This should be done\nbefore any threads are started so that there won’t be a race condition between\nthreads calling getcontext(). For example:
\n# Set applicationwide defaults for all threads about to be launched\nDefaultContext.prec = 12\nDefaultContext.rounding = ROUND_DOWN\nDefaultContext.traps = ExtendedContext.traps.copy()\nDefaultContext.traps[InvalidOperation] = 1\nsetcontext(DefaultContext)\n\n# Afterwards, the threads can be started\nt1.start()\nt2.start()\nt3.start()\n . . .
\nHere are a few recipes that serve as utility functions and that demonstrate ways\nto work with the Decimal class:
\ndef moneyfmt(value, places=2, curr='', sep=',', dp='.',\n pos='', neg='-', trailneg=''):\n """Convert Decimal to a money formatted string.\n\n places: required number of places after the decimal point\n curr: optional currency symbol before the sign (may be blank)\n sep: optional grouping separator (comma, period, space, or blank)\n dp: decimal point indicator (comma or period)\n only specify as blank when places is zero\n pos: optional sign for positive numbers: '+', space or blank\n neg: optional sign for negative numbers: '-', '(', space or blank\n trailneg:optional trailing minus indicator: '-', ')', space or blank\n\n >>> d = Decimal('-1234567.8901')\n >>> moneyfmt(d, curr='$')\n '-$1,234,567.89'\n >>> moneyfmt(d, places=0, sep='.', dp='', neg='', trailneg='-')\n '1.234.568-'\n >>> moneyfmt(d, curr='$', neg='(', trailneg=')')\n '($1,234,567.89)'\n >>> moneyfmt(Decimal(123456789), sep=' ')\n '123 456 789.00'\n >>> moneyfmt(Decimal('-0.02'), neg='<', trailneg='>')\n '<0.02>'\n\n """\n q = Decimal(10) ** -places # 2 places --> '0.01'\n sign, digits, exp = value.quantize(q).as_tuple()\n result = []\n digits = map(str, digits)\n build, next = result.append, digits.pop\n if sign:\n build(trailneg)\n for i in range(places):\n build(next() if digits else '0')\n build(dp)\n if not digits:\n build('0')\n i = 0\n while digits:\n build(next())\n i += 1\n if i == 3 and digits:\n i = 0\n build(sep)\n build(curr)\n build(neg if sign else pos)\n return ''.join(reversed(result))\n\ndef pi():\n """Compute Pi to the current precision.\n\n >>> print pi()\n 3.141592653589793238462643383\n\n """\n getcontext().prec += 2 # extra digits for intermediate steps\n three = Decimal(3) # substitute "three=3.0" for regular floats\n lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24\n while s != lasts:\n lasts = s\n n, na = n+na, na+8\n d, da = d+da, da+32\n t = (t * n) / d\n s += t\n getcontext().prec -= 2\n return +s # unary plus applies the new precision\n\ndef exp(x):\n """Return e 
raised to the power of x. Result type matches input type.\n\n >>> print exp(Decimal(1))\n 2.718281828459045235360287471\n >>> print exp(Decimal(2))\n 7.389056098930650227230427461\n >>> print exp(2.0)\n 7.38905609893\n >>> print exp(2+0j)\n (7.38905609893+0j)\n\n """\n getcontext().prec += 2\n i, lasts, s, fact, num = 0, 0, 1, 1, 1\n while s != lasts:\n lasts = s\n i += 1\n fact *= i\n num *= x\n s += num / fact\n getcontext().prec -= 2\n return +s\n\ndef cos(x):\n """Return the cosine of x as measured in radians.\n\n >>> print cos(Decimal('0.5'))\n 0.8775825618903727161162815826\n >>> print cos(0.5)\n 0.87758256189\n >>> print cos(0.5+0j)\n (0.87758256189+0j)\n\n """\n getcontext().prec += 2\n i, lasts, s, fact, num, sign = 0, 0, 1, 1, 1, 1\n while s != lasts:\n lasts = s\n i += 2\n fact *= i * (i-1)\n num *= x * x\n sign *= -1\n s += num / fact * sign\n getcontext().prec -= 2\n return +s\n\ndef sin(x):\n """Return the sine of x as measured in radians.\n\n >>> print sin(Decimal('0.5'))\n 0.4794255386042030002732879352\n >>> print sin(0.5)\n 0.479425538604\n >>> print sin(0.5+0j)\n (0.479425538604+0j)\n\n """\n getcontext().prec += 2\n i, lasts, s, fact, num, sign = 1, 0, x, 1, x, 1\n while s != lasts:\n lasts = s\n i += 2\n fact *= i * (i-1)\n num *= x * x\n sign *= -1\n s += num / fact * sign\n getcontext().prec -= 2\n return +s\n
Q. It is cumbersome to type decimal.Decimal('1234.5'). Is there a way to\nminimize typing when using the interactive interpreter?
\nA. Some users abbreviate the constructor to just a single letter:
\n>>> D = decimal.Decimal\n>>> D('1.23') + D('3.45')\nDecimal('4.68')\n
Q. In a fixed-point application with two decimal places, some inputs have many\nplaces and need to be rounded. Others are not supposed to have excess digits\nand need to be validated. What methods should be used?
\nA. The quantize() method rounds to a fixed number of decimal places. If\nthe Inexact trap is set, it is also useful for validation:
\n\n\n\n\n>>> TWOPLACES = Decimal(10) ** -2 # same as Decimal('0.01')\n\n\n>>> # Round to two places\n>>> Decimal('3.214').quantize(TWOPLACES)\nDecimal('3.21')\n\n\n>>> # Validate that a number does not exceed two places\n>>> Decimal('3.21').quantize(TWOPLACES, context=Context(traps=[Inexact]))\nDecimal('3.21')\n\n\n>>> Decimal('3.214').quantize(TWOPLACES, context=Context(traps=[Inexact]))\nTraceback (most recent call last):\n ...\nInexact: None\n
Q. Once I have valid two place inputs, how do I maintain that invariant\nthroughout an application?
\nA. Some operations like addition, subtraction, and multiplication by an integer\nwill automatically preserve fixed point. Other operations, like division and\nnon-integer multiplication, will change the number of decimal places and need to\nbe followed up with a quantize() step:
\n>>> a = Decimal('102.72') # Initial fixed-point values\n>>> b = Decimal('3.17')\n>>> a + b # Addition preserves fixed-point\nDecimal('105.89')\n>>> a - b\nDecimal('99.55')\n>>> a * 42 # So does integer multiplication\nDecimal('4314.24')\n>>> (a * b).quantize(TWOPLACES) # Must quantize non-integer multiplication\nDecimal('325.62')\n>>> (b / a).quantize(TWOPLACES) # And quantize division\nDecimal('0.03')\n
In developing fixed-point applications, it is convenient to define functions\nto handle the quantize() step:
\n\n\n\n\n>>> def mul(x, y, fp=TWOPLACES):\n... return (x * y).quantize(fp)\n>>> def div(x, y, fp=TWOPLACES):\n... return (x / y).quantize(fp)\n\n\n>>> mul(a, b) # Automatically preserve fixed-point\nDecimal('325.62')\n>>> div(b, a)\nDecimal('0.03')\n
Q. There are many ways to express the same value. The numbers 200,\n200.000, 2E2, and .02E+4 all have the same value at\nvarious precisions. Is there a way to transform them to a single recognizable\ncanonical value?
\nA. The normalize() method maps all equivalent values to a single\nrepresentative:
\n>>> values = map(Decimal, '200 200.000 2E2 .02E+4'.split())\n>>> [v.normalize() for v in values]\n[Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2')]\n
Q. Some decimal values always print with exponential notation. Is there a way\nto get a non-exponential representation?
\nA. For some values, exponential notation is the only way to express the number\nof significant places in the coefficient. For example, expressing\n5.0E+3 as 5000 keeps the value constant but cannot show the\noriginal’s two-place significance.
\nIf an application does not care about tracking significance, it is easy to\nremove the exponent and trailing zeros, losing significance, but keeping the\nvalue unchanged:
\ndef remove_exponent(d):\n '''Remove exponent and trailing zeros.\n\n >>> remove_exponent(Decimal('5E+3'))\n Decimal('5000')\n\n '''\n return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()\n
Q. Is there a way to convert a regular float to a Decimal?
\nA. Yes, any binary floating point number can be exactly expressed as a\nDecimal though an exact conversion may take more precision than intuition would\nsuggest:
\n>>> Decimal(math.pi)\nDecimal('3.141592653589793115997963468544185161590576171875')\n
Q. Within a complex calculation, how can I make sure that I haven’t gotten a\nspurious result because of insufficient precision or rounding anomalies.
\nA. The decimal module makes it easy to test results. A best practice is to\nre-run calculations using greater precision and with various rounding modes.\nWidely differing results indicate insufficient precision, rounding mode issues,\nill-conditioned inputs, or a numerically unstable algorithm.
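The advice above can be sketched with a toy computation (the function name is made up for the demo): run it once at low precision and once at high precision using localcontext(), and compare the results.

```python
from decimal import Decimal, localcontext

def harmonic(n):
    # Hypothetical computation whose rounding error depends on precision.
    return sum(Decimal(1) / Decimal(k) for k in range(1, n + 1))

with localcontext() as ctx:
    ctx.prec = 8               # re-run with low precision ...
    low = harmonic(50)
with localcontext() as ctx:
    ctx.prec = 28              # ... and again with much higher precision
    high = harmonic(50)

# Close agreement suggests the lower precision was already adequate here.
assert abs(low - high) < Decimal("1e-4")
```

If the two runs disagreed significantly, that would point at insufficient precision or an unstable formulation.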
\nQ. I noticed that context precision is applied to the results of operations but\nnot to the inputs. Is there anything to watch out for when mixing values of\ndifferent precisions?
\nA. Yes. The principle is that all values are considered to be exact and so is\nthe arithmetic on those values. Only the results are rounded. The advantage\nfor inputs is that “what you type is what you get”. A disadvantage is that the\nresults can look odd if you forget that the inputs haven’t been rounded:
\n>>> getcontext().prec = 3\n>>> Decimal('3.104') + Decimal('2.104')\nDecimal('5.21')\n>>> Decimal('3.104') + Decimal('0.000') + Decimal('2.104')\nDecimal('5.20')\n
The solution is either to increase precision or to force rounding of inputs\nusing the unary plus operation:
\n>>> getcontext().prec = 3\n>>> +Decimal('1.23456789') # unary plus triggers rounding\nDecimal('1.23')\n
Alternatively, inputs can be rounded upon creation using the\nContext.create_decimal() method:
\n>>> Context(prec=5, rounding=ROUND_DOWN).create_decimal('1.2345678')\nDecimal('1.2345')\n
Source code: Lib/fileinput.py
\nThis module implements a helper class and functions to quickly write a\nloop over standard input or a list of files. If you just want to read or\nwrite one file see open().
\nThe typical use is:
\nimport fileinput\nfor line in fileinput.input():\n process(line)\n
This iterates over the lines of all files listed in sys.argv[1:], defaulting\nto sys.stdin if the list is empty. If a filename is '-', it is also\nreplaced by sys.stdin. To specify an alternative list of filenames, pass it\nas the first argument to input(). A single file name is also allowed.
\nAll files are opened in text mode by default, but you can override this by\nspecifying the mode parameter in the call to input() or\nFileInput(). If an I/O error occurs during opening or reading a file,\nIOError is raised.
\nIf sys.stdin is used more than once, the second and further use will return\nno lines, except perhaps for interactive use, or if it has been explicitly reset\n(e.g. using sys.stdin.seek(0)).
\nEmpty files are opened and immediately closed; the only time their presence in\nthe list of filenames is noticeable at all is when the last file opened is\nempty.
\nLines are returned with any newlines intact, which means that the last line in\na file may not have one.
\nYou can control how files are opened by providing an opening hook via the\nopenhook parameter to fileinput.input() or FileInput(). The\nhook must be a function that takes two arguments, filename and mode, and\nreturns an accordingly opened file-like object. Two useful hooks are already\nprovided by this module.
\nThe following function is the primary interface of this module:
\nCreate an instance of the FileInput class. The instance will be used\nas global state for the functions of this module, and is also returned to use\nduring iteration. The parameters to this function will be passed along to the\nconstructor of the FileInput class.
\n\nChanged in version 2.5: Added the mode and openhook parameters.
\nThe following functions use the global state created by fileinput.input();\nif there is no active state, RuntimeError is raised.
\nReturn the integer “file descriptor” for the current file. When no file is\nopened (before the first line and between files), returns -1.
\n\nNew in version 2.5.
\nThe class which implements the sequence behavior provided by the module is\navailable for subclassing as well:
\nClass FileInput is the implementation; its methods filename(),\nfileno(), lineno(), filelineno(), isfirstline(),\nisstdin(), nextfile() and close() correspond to the functions\nof the same name in the module. In addition it has a readline() method\nwhich returns the next input line, and a __getitem__() method which\nimplements the sequence behavior. The sequence must be accessed in strictly\nsequential order; random access and readline() cannot be mixed.
\nWith mode you can specify which file mode will be passed to open(). It\nmust be one of 'r', 'rU', 'U' and 'rb'.
\nThe openhook, when given, must be a function that takes two arguments,\nfilename and mode, and returns an accordingly opened file-like object. You\ncannot use inplace and openhook together.
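As an illustration, a minimal custom hook might force a particular text encoding, much as hook_encoded() does. This is a sketch in Python 3 syntax; the hook name and the temporary file are made up for the demo.

```python
import fileinput
import os
import tempfile

def hook_latin1(filename, mode):
    # Hypothetical hook: always decode the file as ISO-8859-1,
    # regardless of the locale's default encoding.
    return open(filename, mode, encoding="latin-1")

# Demonstrate on a throwaway file containing a Latin-1 byte (0xE9 = 'é').
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"caf\xe9\n")
tmp.close()

lines = list(fileinput.input(files=[tmp.name], openhook=hook_latin1))
os.unlink(tmp.name)
assert lines == ["caf\u00e9\n"]
```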
\n\nChanged in version 2.5: Added the mode and openhook parameters.
\nOptional in-place filtering: if the keyword argument inplace=1 is passed\nto fileinput.input() or to the FileInput constructor, the file is\nmoved to a backup file and standard output is directed to the input file (if a\nfile of the same name as the backup file already exists, it will be replaced\nsilently). This makes it possible to write a filter that rewrites its input\nfile in place. If the backup parameter is given (typically as\nbackup='.<some extension>'), it specifies the extension for the backup file,\nand the backup file remains around; by default, the extension is '.bak' and\nit is deleted when the output file is closed. In-place filtering is disabled\nwhen standard input is read.
\nNote
\nThe current implementation does not work for MS-DOS 8+3 filesystems.
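A minimal in-place filter can be sketched as follows (Python 3 print syntax; the temporary file stands in for a real input file):

```python
import fileinput
import os
import tempfile

# Create a throwaway input file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alpha\nbeta\n")

# While inplace=1 is active, standard output is redirected into the
# input file, so print() rewrites each line.
for line in fileinput.input(files=[path], inplace=1):
    print(line.rstrip("\n").upper())

with open(path) as f:
    result = f.read()
os.unlink(path)
assert result == "ALPHA\nBETA\n"
```

The backup copy (default extension '.bak') is removed automatically when the output file is closed, unless a backup extension was given explicitly.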
\nThe two following opening hooks are provided by this module:
\nTransparently opens files compressed with gzip and bzip2 (recognized by the\nextensions '.gz' and '.bz2') using the gzip and bz2\nmodules. If the filename extension is not '.gz' or '.bz2', the file is\nopened normally (i.e., using open() without any decompression).
\nUsage example: fi = fileinput.FileInput(openhook=fileinput.hook_compressed)
\n\nNew in version 2.5.
\nReturns a hook which opens each file with codecs.open(), using the given\nencoding to read the file.
\nUsage example: fi =\nfileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))
\nNote
\nWith this hook, FileInput might return Unicode strings depending on the\nspecified encoding.
\n\nNew in version 2.5.
\nSource code: Lib/random.py
\nThis module implements pseudo-random number generators for various\ndistributions.
\nFor integers, uniform selection from a range. For sequences, uniform selection\nof a random element, a function to generate a random permutation of a list\nin-place, and a function for random sampling without replacement.
\nOn the real line, there are functions to compute uniform, normal (Gaussian),\nlognormal, negative exponential, gamma, and beta distributions. For generating\ndistributions of angles, the von Mises distribution is available.
\nAlmost all module functions depend on the basic function random(), which\ngenerates a random float uniformly in the semi-open range [0.0, 1.0). Python\nuses the Mersenne Twister as the core generator. It produces 53-bit precision\nfloats and has a period of 2**19937-1. The underlying implementation in C is\nboth fast and threadsafe. The Mersenne Twister is one of the most extensively\ntested random number generators in existence. However, being completely\ndeterministic, it is not suitable for all purposes, and is completely unsuitable\nfor cryptographic purposes.
\nThe functions supplied by this module are actually bound methods of a hidden\ninstance of the random.Random class. You can instantiate your own\ninstances of Random to get generators that don’t share state. This is\nespecially useful for multi-threaded programs, creating a different instance of\nRandom for each thread, and using the jumpahead() method to make\nit likely that the generated sequences seen by each thread don’t overlap.
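For example, two Random instances seeded identically produce the same stream, while remaining independent of each other and of the module-level functions:

```python
import random

gen_a = random.Random(42)      # independent generator with its own state
gen_b = random.Random(42)      # same seed, so the same stream
gen_c = random.Random(7)       # a different seed for a different stream

a = [gen_a.random() for _ in range(3)]
b = [gen_b.random() for _ in range(3)]
assert a == b                            # identical streams
assert gen_a.random() == gen_b.random()  # and they stay in lock-step
```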
\nClass Random can also be subclassed if you want to use a different\nbasic generator of your own devising: in that case, override the random(),\nseed(), getstate(), setstate() and jumpahead() methods.\nOptionally, a new generator can supply a getrandbits() method — this\nallows randrange() to produce selections over an arbitrarily large range.
\n\nNew in version 2.4: the getrandbits() method.
\nAs an example of subclassing, the random module provides the\nWichmannHill class that implements an alternative generator in pure\nPython. The class provides a backward compatible way to reproduce results from\nearlier versions of Python, which used the Wichmann-Hill algorithm as the core\ngenerator. Note that this Wichmann-Hill generator can no longer be recommended:\nits period is too short by contemporary standards, and the sequence generated is\nknown to fail some stringent randomness tests. See the references below for a\nrecent variant that repairs these flaws.
\n\nChanged in version 2.3: MersenneTwister replaced Wichmann-Hill as the default generator.
\nThe random module also provides the SystemRandom class which\nuses the system function os.urandom() to generate random numbers\nfrom sources provided by the operating system.
\nBookkeeping functions:
\nInitialize the basic random number generator. Optional argument x can be any\nhashable object. If x is omitted or None, current system time is used;\ncurrent system time is also used to initialize the generator when the module is\nfirst imported. If randomness sources are provided by the operating system,\nthey are used instead of the system time (see the os.urandom() function\nfor details on availability).
\n\nChanged in version 2.4: formerly, operating system resources were not used.
\nReturn an object capturing the current internal state of the generator. This\nobject can be passed to setstate() to restore the state.
\n\nNew in version 2.1.
\n\nChanged in version 2.6: State values produced in Python 2.6 cannot be loaded into earlier versions.
\nstate should have been obtained from a previous call to getstate(), and\nsetstate() restores the internal state of the generator to what it was at\nthe time setstate() was called.
\n\nNew in version 2.1.
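Together, getstate() and setstate() let a calculation be replayed exactly, as this short sketch shows:

```python
import random

rng = random.Random(1234)
state = rng.getstate()                    # snapshot the internal state
first = [rng.random() for _ in range(5)]

rng.setstate(state)                       # rewind to the snapshot
replay = [rng.random() for _ in range(5)]
assert first == replay                    # identical sequence both times
```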
\nChange the internal state to one different from and likely far away from the\ncurrent state. n is a non-negative integer which is used to scramble the\ncurrent state vector. This is most useful in multi-threaded programs, in\nconjunction with multiple instances of the Random class:\nsetstate() or seed() can be used to force all instances into the\nsame internal state, and then jumpahead() can be used to force the\ninstances’ states far apart.
\n\nNew in version 2.1.
\n\nChanged in version 2.3: Instead of jumping to a specific state, n steps ahead, jumpahead(n)\njumps to another state likely to be separated by many steps.
\nReturns a Python long int with k random bits. This method is supplied\nwith the MersenneTwister generator and some other generators may also provide it\nas an optional part of the API. When available, getrandbits() enables\nrandrange() to handle arbitrarily large ranges.
\n\nNew in version 2.4.
\nFunctions for integers:
\nReturn a randomly selected element from range(start, stop, step). This is\nequivalent to choice(range(start, stop, step)), but doesn’t actually build a\nrange object.
\n\nNew in version 1.5.2.
\nFunctions for sequences:
\nShuffle the sequence x in place. The optional argument random is a\n0-argument function returning a random float in [0.0, 1.0); by default, this is\nthe function random().
\nNote that for even rather small len(x), the total number of permutations of\nx is larger than the period of most random number generators; this implies\nthat most permutations of a long sequence can never be generated.
\nReturn a k length list of unique elements chosen from the population sequence.\nUsed for random sampling without replacement.
\n\nNew in version 2.3.
\nReturns a new list containing elements from the population while leaving the\noriginal population unchanged. The resulting list is in selection order so that\nall sub-slices will also be valid random samples. This allows raffle winners\n(the sample) to be partitioned into grand prize and second place winners (the\nsubslices).
\nMembers of the population need not be hashable or unique. If the population\ncontains repeats, then each occurrence is a possible selection in the sample.
\nTo choose a sample from a range of integers, use an xrange() object as an\nargument. This is especially fast and space efficient for sampling from a large\npopulation: sample(xrange(10000000), 60).
\nThe following functions generate specific real-valued distributions. Function\nparameters are named after the corresponding variables in the distribution’s\nequation, as used in common mathematical practice; most of these equations can\nbe found in any statistics text.
\nReturn a random floating point number N such that a <= N <= b for\na <= b and b <= N <= a for b < a.
\nThe end-point value b may or may not be included in the range\ndepending on floating-point rounding in the equation a + (b-a) * random().
\nReturn a random floating point number N such that low <= N <= high and\nwith the specified mode between those bounds. The low and high bounds\ndefault to zero and one. The mode argument defaults to the midpoint\nbetween the bounds, giving a symmetric distribution.
\n\nNew in version 2.6.
\nGamma distribution. (Not the gamma function!) Conditions on the\nparameters are alpha > 0 and beta > 0.
\nThe probability distribution function is:
\n x ** (alpha - 1) * math.exp(-x / beta)\npdf(x) = --------------------------------------\n math.gamma(alpha) * beta ** alpha
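As a quick sanity check (a sketch, using a seeded generator so the run is repeatable): the sample mean of gammavariate(alpha, beta) should approach alpha * beta.

```python
import random

alpha, beta = 2.0, 3.0
rng = random.Random(42)
n = 20000
mean = sum(rng.gammavariate(alpha, beta) for _ in range(n)) / n

# The distribution's mean is alpha * beta = 6.0; the tolerance is generous
# relative to the standard error of 20000 samples.
assert abs(mean - alpha * beta) < 0.3
```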
\nAlternative Generators:
\nClass that uses the os.urandom() function for generating random numbers\nfrom sources provided by the operating system. Not available on all systems.\nDoes not rely on software state and sequences are not reproducible. Accordingly,\nthe seed() and jumpahead() methods have no effect and are ignored.\nThe getstate() and setstate() methods raise\nNotImplementedError if called.
\n\nNew in version 2.4.
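A short sketch of SystemRandom in use, including the documented behavior of its state methods:

```python
import random

sr = random.SystemRandom()     # entropy comes from os.urandom()
value = sr.randrange(10**6)
assert 0 <= value < 10**6

sr.seed(1)                     # silently ignored: the OS supplies the state
try:
    sr.getstate()              # state cannot be captured ...
except NotImplementedError:
    pass                       # ... so NotImplementedError is raised
```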
\nExamples of basic usage:
\n>>> random.random() # Random float x, 0.0 <= x < 1.0\n0.37444887175646646\n>>> random.uniform(1, 10) # Random float x, 1.0 <= x < 10.0\n1.1800146073117523\n>>> random.randint(1, 10) # Integer from 1 to 10, endpoints included\n7\n>>> random.randrange(0, 101, 2) # Even integer from 0 to 100\n26\n>>> random.choice('abcdefghij') # Choose a random element\n'c'\n\n>>> items = [1, 2, 3, 4, 5, 6, 7]\n>>> random.shuffle(items)\n>>> items\n[7, 3, 2, 5, 6, 4, 1]\n\n>>> random.sample([1, 2, 3, 4, 5], 3) # Choose 3 elements\n[4, 1, 5]\n
See also
\nM. Matsumoto and T. Nishimura, “Mersenne Twister: A 623-dimensionally\nequidistributed uniform pseudorandom number generator”, ACM Transactions on\nModeling and Computer Simulation Vol. 8, No. 1, January pp.3-30 1998.
\nWichmann, B. A. & Hill, I. D., “Algorithm AS 183: An efficient and portable\npseudo-random number generator”, Applied Statistics 31 (1982) 188-190.
\nComplementary-Multiply-with-Carry recipe for a compatible alternative\nrandom number generator with a long period and comparatively simple update\noperations.
\n\nDeprecated since version 2.6: The statvfs module has been deprecated for removal in Python 3.0.
\nThe statvfs module defines constants so interpreting the result of\nos.statvfs(), which returns a tuple, can be made without remembering\n“magic numbers.” Each of the constants defined in this module is the index of\nthe entry in the tuple returned by os.statvfs() that contains the\nspecified information.
\nThe operator module exports a set of efficient functions corresponding to\nthe intrinsic operators of Python. For example, operator.add(x, y) is\nequivalent to the expression x+y. The function names are those used for\nspecial class methods; variants without leading and trailing __ are also\nprovided for convenience.
\nThe functions fall into categories that perform object comparisons, logical\noperations, mathematical operations, sequence operations, and abstract type\ntests.
\nThe object comparison functions are useful for all objects, and are named after\nthe rich comparison operators they support:
\nPerform “rich comparisons” between a and b. Specifically, lt(a, b) is\nequivalent to a < b, le(a, b) is equivalent to a <= b, eq(a,\nb) is equivalent to a == b, ne(a, b) is equivalent to a != b,\ngt(a, b) is equivalent to a > b and ge(a, b) is equivalent to a\n>= b. Note that unlike the built-in cmp(), these functions can\nreturn any value, which may or may not be interpretable as a Boolean value.\nSee Comparisons for more information about rich comparisons.
\n\nNew in version 2.2.
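A brief sketch of the comparison functions, including one common idiom: partially applying them to build predicates.

```python
from functools import partial
from operator import eq, ge, gt, le, lt, ne

assert lt(2, 3) and le(3, 3) and eq("a", "a")
assert ne(1, 2) and gt(5, 4) and ge(4, 4)

# Handy for building predicates, e.g. "is non-negative":
nonneg = partial(le, 0)        # nonneg(x)  ==  le(0, x)  ==  0 <= x
assert nonneg(7) and not nonneg(-1)
```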
\nThe logical operations are also generally applicable to all objects, and support\ntruth tests, identity tests, and boolean operations:
\nReturn a is b. Tests object identity.
\n\nNew in version 2.3.
\nReturn a is not b. Tests object identity.
\n\nNew in version 2.3.
\nThe mathematical and bitwise operations are the most numerous:
\n\n\n\n\n\n\nReturn a // b.
\n\nNew in version 2.2.
\nReturn a converted to an integer. Equivalent to a.__index__().
\n\nNew in version 2.5.
\nReturn the bitwise inverse of the number obj. This is equivalent to ~obj.
\n\nNew in version 2.0: The names invert() and __invert__().
\nReturn a ** b, for a and b numbers.
\n\nNew in version 2.3.
\nReturn a / b when __future__.division is in effect. This is also\nknown as “true” division.
\n\nNew in version 2.2.
\nOperations which work with sequences (some of them with mappings too) include:
\n\n\nReturn the outcome of the test b in a. Note the reversed operands.
\n\nNew in version 2.0: The name __contains__().
\nDelete the slice of a from index b to index c-1.
\n\nDeprecated since version 2.6: This function is removed in Python 3.x. Use delitem() with a slice\nindex.
\nReturn the slice of a from index b to index c-1.
\n\nDeprecated since version 2.6: This function is removed in Python 3.x. Use getitem() with a slice\nindex.
\n\nDeprecated since version 2.7: Use __mul__() instead.
\nReturn a * b where a is a sequence and b is an integer.
\n\nDeprecated since version 2.0: Use contains() instead.
\nAlias for contains().
\nSet the slice of a from index b to index c-1 to the sequence v.
\n\nDeprecated since version 2.6: This function is removed in Python 3.x. Use setitem() with a slice\nindex.
\nExample use of operator functions:
\n>>> # Elementwise multiplication\n>>> map(mul, [0, 1, 2, 3], [10, 20, 30, 40])\n[0, 20, 60, 120]\n\n>>> # Dot product\n>>> sum(map(mul, [0, 1, 2, 3], [10, 20, 30, 40]))\n200\n
Many operations have an “in-place” version. The following functions provide a\nmore primitive access to in-place operators than the usual syntax does; for\nexample, the statement x += y is equivalent to\nx = operator.iadd(x, y). Another way to put it is to say that\nz = operator.iadd(x, y) is equivalent to the compound statement\nz = x; z += y.
\na = iadd(a, b) is equivalent to a += b.
\n\nNew in version 2.5.
\na = iand(a, b) is equivalent to a &= b.
\n\nNew in version 2.5.
\na = iconcat(a, b) is equivalent to a += b for a and b sequences.
\n\nNew in version 2.5.
\na = idiv(a, b) is equivalent to a /= b when __future__.division is\nnot in effect.
\n\nNew in version 2.5.
\na = ifloordiv(a, b) is equivalent to a //= b.
\n\nNew in version 2.5.
\na = ilshift(a, b) is equivalent to a <<= b.
\n\nNew in version 2.5.
\na = imod(a, b) is equivalent to a %= b.
\n\nNew in version 2.5.
\na = imul(a, b) is equivalent to a *= b.
\n\nNew in version 2.5.
\na = ior(a, b) is equivalent to a |= b.
\n\nNew in version 2.5.
\na = ipow(a, b) is equivalent to a **= b.
\n\nNew in version 2.5.
\n\nDeprecated since version 2.7: Use __imul__() instead.
\na = irepeat(a, b) is equivalent to a *= b where a is a sequence and\nb is an integer.
\n\nNew in version 2.5.
\na = irshift(a, b) is equivalent to a >>= b.
\n\nNew in version 2.5.
\na = isub(a, b) is equivalent to a -= b.
\n\nNew in version 2.5.
\na = itruediv(a, b) is equivalent to a /= b when __future__.division\nis in effect.
\n\nNew in version 2.5.
\na = ixor(a, b) is equivalent to a ^= b.
\n\nNew in version 2.5.
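Note that for immutable operands the "in-place" functions simply return a new object, exactly as the corresponding augmented-assignment statement would:

```python
from operator import iadd

x = [1, 2]
y = iadd(x, [3])               # lists implement __iadd__: mutated in place
assert y is x and x == [1, 2, 3]

s = (1, 2)
t = iadd(s, (3,))              # tuples are immutable: a new tuple comes back
assert t == (1, 2, 3) and t is not s
```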
\nThe operator module also defines a few predicates to test the type of\nobjects; however, these are not all reliable. It is preferable to test\nabstract base classes instead (see collections and\nnumbers for details).
\n\nDeprecated since version 2.0: Use isinstance(x, collections.Callable) instead.
\nReturns true if the object obj can be called like a function, otherwise it\nreturns false. True is returned for functions, bound and unbound methods, class\nobjects, and instance objects which support the __call__() method.
\n\nDeprecated since version 2.7: Use isinstance(x, collections.Mapping) instead.
\nReturns true if the object obj supports the mapping interface. This is true for\ndictionaries and all instance objects defining __getitem__().
\n\nDeprecated since version 2.7: Use isinstance(x, numbers.Number) instead.
\nReturns true if the object obj represents a number. This is true for all\nnumeric types implemented in C.
\n\nDeprecated since version 2.7: Use isinstance(x, collections.Sequence) instead.
\nReturns true if the object obj supports the sequence protocol. This returns true\nfor all objects which define sequence methods in C, and for all instance objects\ndefining __getitem__().
\nThe operator module also defines tools for generalized attribute and item\nlookups. These are useful for making fast field extractors as arguments for\nmap(), sorted(), itertools.groupby(), or other functions that\nexpect a function argument.
\nReturn a callable object that fetches attr from its operand. If more than one\nattribute is requested, returns a tuple of attributes. After\nf = attrgetter('name'), the call f(b) returns b.name. After\nf = attrgetter('name', 'date'), the call f(b) returns (b.name,\nb.date). Equivalent to:
\ndef attrgetter(*items):\n if len(items) == 1:\n attr = items[0]\n def g(obj):\n return resolve_attr(obj, attr)\n else:\n def g(obj):\n return tuple(resolve_attr(obj, attr) for attr in items)\n return g\n\ndef resolve_attr(obj, attr):\n for name in attr.split("."):\n obj = getattr(obj, name)\n return obj\n
The attribute names can also contain dots; after f = attrgetter('date.month'),\nthe call f(b) returns b.date.month.
\n\nNew in version 2.4.
\n\nChanged in version 2.5: Added support for multiple attributes.
\n\nChanged in version 2.6: Added support for dotted attributes.
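For instance, with made-up namedtuple records (the Point/Segment types below are hypothetical, for illustration only):

```python
from collections import namedtuple
from operator import attrgetter

Point = namedtuple("Point", "x y")
Segment = namedtuple("Segment", "start end")
seg = Segment(Point(0, 1), Point(3, 4))

assert attrgetter("end")(seg) == Point(3, 4)
assert attrgetter("start.x", "end.y")(seg) == (0, 4)   # dotted + multiple

# Typical use as a sort key:
pts = [Point(2, 0), Point(1, 5)]
assert sorted(pts, key=attrgetter("x"))[0] == Point(1, 5)
```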
\nReturn a callable object that fetches item from its operand using the\noperand’s __getitem__() method. If multiple items are specified,\nreturns a tuple of lookup values. Equivalent to:
\ndef itemgetter(*items):\n if len(items) == 1:\n item = items[0]\n def g(obj):\n return obj[item]\n else:\n def g(obj):\n return tuple(obj[item] for item in items)\n return g\n
The items can be any type accepted by the operand’s __getitem__()\nmethod. Dictionaries accept any hashable value. Lists, tuples, and\nstrings accept an index or a slice:
\n>>> itemgetter(1)('ABCDEFG')\n'B'\n>>> itemgetter(1,3,5)('ABCDEFG')\n('B', 'D', 'F')\n>>> itemgetter(slice(2,None))('ABCDEFG')\n'CDEFG'\n
\nNew in version 2.4.
\n\nChanged in version 2.5: Added support for multiple item extraction.
\nExample of using itemgetter() to retrieve specific fields from a\ntuple record:
\n>>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]\n>>> getcount = itemgetter(1)\n>>> map(getcount, inventory)\n[3, 2, 5, 1]\n>>> sorted(inventory, key=getcount)\n[('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)]\n
Return a callable object that calls the method name on its operand. If\nadditional arguments and/or keyword arguments are given, they will be given\nto the method as well. After f = methodcaller('name'), the call f(b)\nreturns b.name(). After f = methodcaller('name', 'foo', bar=1), the\ncall f(b) returns b.name('foo', bar=1). Equivalent to:
\ndef methodcaller(name, *args, **kwargs):\n def caller(obj):\n return getattr(obj, name)(*args, **kwargs)\n return caller\n
\nNew in version 2.6.
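A few usage examples of methodcaller(), including argument forwarding:

```python
from operator import methodcaller

upcase = methodcaller("upper")
assert upcase("spam") == "SPAM"

# Extra positional/keyword arguments are forwarded to the method:
swap_a = methodcaller("replace", "a", "o")
assert swap_a("banana") == "bonono"

# Convenient with map() and friends:
words = ["  a ", "b  "]
assert list(map(methodcaller("strip"), words)) == ["a", "b"]
```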
\nThis table shows how abstract operations correspond to operator symbols in the\nPython syntax and the functions in the operator module.
\nOperation | \nSyntax | \nFunction | \n
---|---|---|
Addition | \na + b | \nadd(a, b) | \n
Concatenation | \nseq1 + seq2 | \nconcat(seq1, seq2) | \n
Containment Test | \nobj in seq | \ncontains(seq, obj) | \n
Division | \na / b | \ndiv(a, b) (without\n__future__.division) | \n
Division | \na / b | \ntruediv(a, b) (with\n__future__.division) | \n
Division | \na // b | \nfloordiv(a, b) | \n
Bitwise And | \na & b | \nand_(a, b) | \n
Bitwise Exclusive Or | \na ^ b | \nxor(a, b) | \n
Bitwise Inversion | \n~ a | \ninvert(a) | \n
Bitwise Or | \na | b | \nor_(a, b) | \n
Exponentiation | \na ** b | \npow(a, b) | \n
Identity | \na is b | \nis_(a, b) | \n
Identity | \na is not b | \nis_not(a, b) | \n
Indexed Assignment | \nobj[k] = v | \nsetitem(obj, k, v) | \n
Indexed Deletion | \ndel obj[k] | \ndelitem(obj, k) | \n
Indexing | \nobj[k] | \ngetitem(obj, k) | \n
Left Shift | \na << b | \nlshift(a, b) | \n
Modulo | \na % b | \nmod(a, b) | \n
Multiplication | \na * b | \nmul(a, b) | \n
Negation (Arithmetic) | \n- a | \nneg(a) | \n
Negation (Logical) | \nnot a | \nnot_(a) | \n
Positive | \n+ a | \npos(a) | \n
Right Shift | \na >> b | \nrshift(a, b) | \n
Sequence Repetition | \nseq * i | \nrepeat(seq, i) | \n
Slice Assignment | \nseq[i:j] = values | \nsetitem(seq, slice(i, j), values) | \n
Slice Deletion | \ndel seq[i:j] | \ndelitem(seq, slice(i, j)) | \n
Slicing | \nseq[i:j] | \ngetitem(seq, slice(i, j)) | \n
String Formatting | \ns % obj | \nmod(s, obj) | \n
Subtraction | \na - b | \nsub(a, b) | \n
Truth Test | \nobj | \ntruth(obj) | \n
Ordering | \na < b | \nlt(a, b) | \n
Ordering | \na <= b | \nle(a, b) | \n
Equality | \na == b | \neq(a, b) | \n
Difference | \na != b | \nne(a, b) | \n
Ordering | \na >= b | \nge(a, b) | \n
Ordering | \na > b | \ngt(a, b) | \n
This module implements some useful functions on pathnames. To read or\nwrite files see open(), and for accessing the filesystem see the\nos module.
\nNote
\nOn Windows, many of these functions do not properly support UNC pathnames.\nsplitunc() and ismount() do handle them correctly.
\nNote
\nSince different operating systems have different path name conventions, there\nare several versions of this module in the standard library. The\nos.path module is always the path module suitable for the operating\nsystem Python is running on, and therefore usable for local paths. However,\nyou can also import and use the individual modules if you want to manipulate\na path that is always in one of the different formats. They all have the\nsame interface:
\nReturn a normalized absolutized version of the pathname path. On most\nplatforms, this is equivalent to normpath(join(os.getcwd(), path)).
\n\nNew in version 1.5.2.
\nReturn True if path refers to an existing path. Returns True for\nbroken symbolic links. Equivalent to exists() on platforms lacking\nos.lstat().
\n\nNew in version 2.4.
\nOn Unix and Windows, return the argument with an initial component of ~ or\n~user replaced by that user's home directory.
\nOn Unix, an initial ~ is replaced by the environment variable HOME\nif it is set; otherwise the current user’s home directory is looked up in the\npassword directory through the built-in module pwd. An initial ~user\nis looked up directly in the password directory.
\nOn Windows, HOME and USERPROFILE will be used if set,\notherwise a combination of HOMEPATH and HOMEDRIVE will be\nused. An initial ~user is handled by stripping the last directory component\nfrom the created user path derived above.
\nIf the expansion fails or if the path does not begin with a tilde, the path is\nreturned unchanged.
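The expansion rules above can be exercised directly. This is a minimal sketch; the home directory in the result depends on the caller's environment, so only structural properties are checked:

```python
import os.path

# A path that does not begin with a tilde is returned unchanged.
plain = os.path.expanduser('/etc/passwd')

# '~' is replaced by the current user's home directory (from HOME,
# the password database, or the Windows fallbacks described above).
expanded = os.path.expanduser('~/notes.txt')

assert plain == '/etc/passwd'
assert expanded.endswith('notes.txt')
```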
\nReturn the argument with environment variables expanded. Substrings of the form\n$name or ${name} are replaced by the value of environment variable\nname. Malformed variable names and references to non-existing variables are\nleft unchanged.
\nOn Windows, %name% expansions are supported in addition to $name and\n${name}.
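A self-contained demonstration of the portable substitution rules; the variable MYVAR is set here purely for the example:

```python
import os
import os.path

os.environ['MYVAR'] = 'data'  # set only for this demonstration

# Both $name and ${name} forms are substituted.
assert os.path.expandvars('$MYVAR/file') == 'data/file'
assert os.path.expandvars('${MYVAR}/file') == 'data/file'

# References to non-existing variables are left unchanged.
assert os.path.expandvars('$NO_SUCH_VAR_42') == '$NO_SUCH_VAR_42'
```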
Return the time of last access of path. The return value is a number giving\nthe number of seconds since the epoch (see the time module). Raise\nos.error if the file does not exist or is inaccessible.
\n\nNew in version 1.5.2.
\n\nChanged in version 2.3: If os.stat_float_times() returns True, the result is a floating point\nnumber.
\nReturn the time of last modification of path. The return value is a number\ngiving the number of seconds since the epoch (see the time module).\nRaise os.error if the file does not exist or is inaccessible.
\n\nNew in version 1.5.2.
\n\nChanged in version 2.3: If os.stat_float_times() returns True, the result is a floating point\nnumber.
\nReturn the system’s ctime which, on some systems (like Unix) is the time of the\nlast change, and, on others (like Windows), is the creation time for path.\nThe return value is a number giving the number of seconds since the epoch (see\nthe time module). Raise os.error if the file does not exist or\nis inaccessible.
\n\nNew in version 2.3.
\nReturn the size, in bytes, of path. Raise os.error if the file does\nnot exist or is inaccessible.
\n\nNew in version 1.5.2.
\nNormalize a pathname. This collapses redundant separators and up-level\nreferences so that A//B, A/B/, A/./B and A/foo/../B all become\nA/B.
\nIt does not normalize the case (use normcase() for that). On Windows, it\nconverts forward slashes to backward slashes. It should be understood that this\nmay change the meaning of the path if it contains symbolic links!
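The collapses listed above can be verified directly. posixpath is used here (as the module introduction notes, the individual path modules may be imported explicitly) so the results do not depend on the host platform:

```python
import posixpath

# All four spellings collapse to the same normalized path.
for messy in ('A//B', 'A/B/', 'A/./B', 'A/foo/../B'):
    assert posixpath.normpath(messy) == 'A/B'

# Note the symbolic-link caveat: 'A/foo/../B' is rewritten to 'A/B'
# purely textually, even if A/foo is a symlink pointing elsewhere.
```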
\nReturn the canonical path of the specified filename, eliminating any symbolic\nlinks encountered in the path (if they are supported by the operating system).
\n\nNew in version 2.2.
\nReturn a relative filepath to path either from the current directory or from\nan optional start point.
\nstart defaults to os.curdir.
\nAvailability: Windows, Unix.
\n\nNew in version 2.6.
\nReturn True if both pathname arguments refer to the same file or directory\n(as indicated by device number and i-node number). Raise an exception if an\nos.stat() call on either pathname fails.
\nAvailability: Unix.
\nReturn True if the file descriptors fp1 and fp2 refer to the same file.
\nAvailability: Unix.
\nReturn True if the stat tuples stat1 and stat2 refer to the same file.\nThese structures may have been returned by fstat(), lstat(), or\nstat(). This function implements the underlying comparison used by\nsamefile() and sameopenfile().
\nAvailability: Unix.
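On Unix, samestat() can be checked with two independent os.stat() calls; a small sketch (assuming the current directory is not the filesystem root, so that it differs from its parent):

```python
import os
import os.path

# Two independent stat() results for the same path describe the same file.
st1 = os.stat('.')
st2 = os.stat('.')
assert os.path.samestat(st1, st2)

# Results for a different directory compare unequal.
parent = os.stat(os.pardir)
assert not os.path.samestat(st1, parent)
```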
\nSplit the pathname path into a pair (drive, tail) where drive is either\na drive specification or the empty string. On systems which do not use drive\nspecifications, drive will always be the empty string. In all cases, drive\n+ tail will be the same as path.
\n\nNew in version 1.3.
\nSplit the pathname path into a pair (root, ext) such that root + ext ==\npath, and ext is empty or begins with a period and contains at most one\nperiod. Leading periods on the basename are ignored; splitext('.cshrc')\nreturns ('.cshrc', '').
\n\nChanged in version 2.6: Earlier versions could produce an empty root when the only period was the\nfirst character.
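The invariants described above are easy to check; posixpath is used so the results are platform-independent:

```python
import posixpath

# root + ext always reconstructs the original path, and ext holds at
# most one period.
assert posixpath.splitext('foo.tar.gz') == ('foo.tar', '.gz')
assert posixpath.splitext('archive') == ('archive', '')

# Leading periods on the basename are ignored.
assert posixpath.splitext('.cshrc') == ('.cshrc', '')
assert posixpath.splitext('/tmp/.hidden.txt') == ('/tmp/.hidden', '.txt')
```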
\nSplit the pathname path into a pair (unc, rest) so that unc is the UNC\nmount point (such as r'\\\\host\\mount'), if present, and rest the rest of\nthe path (such as r'\\path\\file.ext'). For paths containing drive letters,\nunc will always be the empty string.
\nAvailability: Windows.
\nCalls the function visit with arguments (arg, dirname, names) for each\ndirectory in the directory tree rooted at path (including path itself, if it\nis a directory). The argument dirname specifies the visited directory, the\nargument names lists the files in the directory (gotten from\nos.listdir(dirname)). The visit function may modify names to influence\nthe set of directories visited below dirname, e.g. to avoid visiting certain\nparts of the tree. (The object referred to by names must be modified in\nplace, using del or slice assignment.)
\nNote
\nSymbolic links to directories are not treated as subdirectories, and\nwalk() therefore will not visit them. To visit linked directories you must\nidentify them with os.path.islink(file) and os.path.isdir(file), and\ninvoke walk() as necessary.
\nNote
\nThis function is deprecated and has been removed in 3.0 in favor of\nos.walk().
\nTrue if arbitrary Unicode strings can be used as file names (within limitations\nimposed by the file system).
\n\nNew in version 2.3.
\nSource code: Lib/glob.py
\nThe glob module finds all the pathnames matching a specified pattern\naccording to the rules used by the Unix shell. No tilde expansion is done, but\n*, ?, and character ranges expressed with [] will be correctly\nmatched. This is done by using the os.listdir() and\nfnmatch.fnmatch() functions in concert, and not by actually invoking a\nsubshell. (For tilde and shell variable expansion, use\nos.path.expanduser() and os.path.expandvars().)
\nReturn an iterator which yields the same values as glob()\nwithout actually storing them all simultaneously.
\n\nNew in version 2.5.
\nFor example, consider a directory containing only the following files:\n1.gif, 2.txt, and card.gif. glob() will produce\nthe following results. Notice how any leading components of the path are\npreserved.
\n>>> import glob\n>>> glob.glob('./[0-9].*')\n['./1.gif', './2.txt']\n>>> glob.glob('*.gif')\n['1.gif', 'card.gif']\n>>> glob.glob('?.gif')\n['1.gif']\n
See also
\nSource code: Lib/stat.py
\nThe stat module defines constants and functions for interpreting the\nresults of os.stat(), os.fstat() and os.lstat() (if they\nexist). For complete details about the stat(), fstat() and\nlstat() calls, consult the documentation for your system.
\nThe stat module defines the following functions to test for specific file\ntypes:
\nTwo additional functions are defined for more general manipulation of the file’s\nmode:
\nNormally, you would use the os.path.is*() functions for testing the type\nof a file; the functions here are useful when you are doing multiple tests of\nthe same file and wish to avoid the overhead of the stat() system call\nfor each test. These are also useful when checking for information about a file\nthat isn’t handled by os.path, like the tests for block and character\ndevices.
\nExample:
\nimport os, sys\nfrom stat import *\n\ndef walktree(top, callback):\n '''recursively descend the directory tree rooted at top,\n calling the callback function for each regular file'''\n\n for f in os.listdir(top):\n pathname = os.path.join(top, f)\n mode = os.stat(pathname).st_mode\n if S_ISDIR(mode):\n # It's a directory, recurse into it\n walktree(pathname, callback)\n elif S_ISREG(mode):\n # It's a file, call the callback function\n callback(pathname)\n else:\n # Unknown file type, print a message\n print 'Skipping %s' % pathname\n\ndef visitfile(file):\n print 'visiting', file\n\nif __name__ == '__main__':\n walktree(sys.argv[1], visitfile)\n
All the variables below are simply symbolic indexes into the 10-tuple returned\nby os.stat(), os.fstat() or os.lstat().
\nThe interpretation of “file size” changes according to the file type. For plain\nfiles this is the size of the file in bytes. For FIFOs and sockets under most\nflavors of Unix (including Linux in particular), the “size” is the number of\nbytes waiting to be read at the time of the call to os.stat(),\nos.fstat(), or os.lstat(); this can sometimes be useful, especially\nfor polling one of these special files after a non-blocking open. The meaning\nof the size field for other character and block devices varies more, depending\non the implementation of the underlying system call.
\nThe variables below define the flags used in the ST_MODE field.
\nUse of the functions above is more portable than use of the first set of flags:
\nThe following flags can also be used in the mode argument of os.chmod():
\nThe following flags can be used in the flags argument of os.chflags():
\nSee the *BSD or Mac OS systems man page chflags(2) for more information.
\nSource code: Lib/tempfile.py
\nThis module generates temporary files and directories. It works on all\nsupported platforms.
\nIn version 2.3 of Python, this module was overhauled for enhanced security. It\nnow provides three new functions, NamedTemporaryFile(), mkstemp(),\nand mkdtemp(), which should eliminate all remaining need to use the\ninsecure mktemp() function. Temporary file names created by this module\nno longer contain the process ID; instead a string of six random characters is\nused.
\nAlso, all the user-callable functions now take additional arguments which\nallow direct control over the location and name of temporary files. It is\nno longer necessary to use the global tempdir and template variables.\nTo maintain backward compatibility, the argument order is somewhat odd; it\nis recommended to use keyword arguments for clarity.
\nThe module defines the following user-callable functions:
\nReturn a file-like object that can be used as a temporary storage area.\nThe file is created using mkstemp(). It will be destroyed as soon\nas it is closed (including an implicit close when the object is garbage\ncollected). Under Unix, the directory entry for the file is removed\nimmediately after the file is created. Other platforms do not support\nthis; your code should not rely on a temporary file created using this\nfunction having or not having a visible name in the file system.
\nThe mode parameter defaults to 'w+b' so that the file created can\nbe read and written without being closed. Binary mode is used so that it\nbehaves consistently on all platforms without regard for the data that is\nstored. bufsize defaults to -1, meaning that the operating system\ndefault is used.
\nThe dir, prefix and suffix parameters are passed to mkstemp().
\nThe returned object is a true file object on POSIX platforms. On other\nplatforms, it is a file-like object whose file attribute is the\nunderlying true file object. This file-like object can be used in a\nwith statement, just like a normal file.
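A minimal usage sketch of the default 'w+b' mode: data can be written, the file rewound, and the data read back without reopening:

```python
import tempfile

# Default mode 'w+b': write, rewind, and read back without reopening.
f = tempfile.TemporaryFile()
try:
    f.write(b'some binary data')
    f.seek(0)
    data = f.read()
finally:
    f.close()  # the file is destroyed on close

assert data == b'some binary data'
```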
\nThis function operates exactly as TemporaryFile() does, except that\nthe file is guaranteed to have a visible name in the file system (on\nUnix, the directory entry is not unlinked). That name can be retrieved\nfrom the name attribute of the file object. Whether the name can be\nused to open the file a second time, while the named temporary file is\nstill open, varies across platforms (it can be so used on Unix; it cannot\non Windows NT or later). If delete is true (the default), the file is\ndeleted as soon as it is closed.
\nThe returned object is always a file-like object whose file\nattribute is the underlying true file object. This file-like object can\nbe used in a with statement, just like a normal file.
\n\nNew in version 2.3.
\n\nNew in version 2.6: The delete parameter.
\nThis function operates exactly as TemporaryFile() does, except that\ndata is spooled in memory until the file size exceeds max_size, or\nuntil the file’s fileno() method is called, at which point the\ncontents are written to disk and operation proceeds as with\nTemporaryFile().
\nThe resulting file has one additional method, rollover(), which\ncauses the file to roll over to an on-disk file regardless of its size.
\nThe returned object is a file-like object whose _file attribute\nis either a StringIO object or a true file object, depending on\nwhether rollover() has been called. This file-like object can be\nused in a with statement, just like a normal file.
\n\nNew in version 2.6.
\nCreates a temporary file in the most secure manner possible. There are\nno race conditions in the file’s creation, assuming that the platform\nproperly implements the os.O_EXCL flag for os.open(). The\nfile is readable and writable only by the creating user ID. If the\nplatform uses permission bits to indicate whether a file is executable,\nthe file is executable by no one. The file descriptor is not inherited\nby child processes.
\nUnlike TemporaryFile(), the user of mkstemp() is responsible\nfor deleting the temporary file when done with it.
\nIf suffix is specified, the file name will end with that suffix,\notherwise there will be no suffix. mkstemp() does not put a dot\nbetween the file name and the suffix; if you need one, put it at the\nbeginning of suffix.
\nIf prefix is specified, the file name will begin with that prefix;\notherwise, a default prefix is used.
\nIf dir is specified, the file will be created in that directory;\notherwise, a default directory is used. The default directory is chosen\nfrom a platform-dependent list, but the user of the application can\ncontrol the directory location by setting the TMPDIR, TEMP or TMP\nenvironment variables. There is thus no guarantee that the generated\nfilename will have any nice properties, such as not requiring quoting\nwhen passed to external commands via os.popen().
\nIf text is specified, it indicates whether to open the file in binary\nmode (the default) or text mode. On some platforms, this makes no\ndifference.
\nmkstemp() returns a tuple containing an OS-level handle to an open\nfile (as would be returned by os.open()) and the absolute pathname\nof that file, in that order.
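A short sketch of the calling convention; note that, unlike TemporaryFile(), both closing the OS-level handle and removing the file are the caller's responsibility:

```python
import os
import tempfile

# mkstemp() returns an OS-level handle and the absolute pathname.
fd, path = tempfile.mkstemp(suffix='.txt', prefix='demo_')
assert os.path.isabs(path)
try:
    os.write(fd, b'hello')
finally:
    os.close(fd)   # close the OS-level handle

os.unlink(path)    # cleanup is up to the caller
assert not os.path.exists(path)
```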
\n\nNew in version 2.3.
\nCreates a temporary directory in the most secure manner possible. There\nare no race conditions in the directory’s creation. The directory is\nreadable, writable, and searchable only by the creating user ID.
\nThe user of mkdtemp() is responsible for deleting the temporary\ndirectory and its contents when done with it.
\nThe prefix, suffix, and dir arguments are the same as for\nmkstemp().
\nmkdtemp() returns the absolute pathname of the new directory.
\n\nNew in version 2.3.
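A corresponding sketch for mkdtemp(); again, removal is the caller's responsibility:

```python
import os
import tempfile

d = tempfile.mkdtemp(prefix='demo_')
assert os.path.isabs(d) and os.path.isdir(d)

# The caller must remove the directory (and its contents) when done.
os.rmdir(d)
assert not os.path.exists(d)
```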
\n\nDeprecated since version 2.3: Use mkstemp() instead.
\nReturn an absolute pathname of a file that did not exist at the time the\ncall is made. The prefix, suffix, and dir arguments are the same\nas for mkstemp().
\nWarning
\nUse of this function may introduce a security hole in your program. By\nthe time you get around to doing anything with the file name it returns,\nsomeone else may have beaten you to the punch. mktemp() usage can\nbe replaced easily with NamedTemporaryFile(), passing it the\ndelete=False parameter:
\n>>> f = NamedTemporaryFile(delete=False)\n>>> f\n<open file '<fdopen>', mode 'w+b' at 0x384698>\n>>> f.name\n'/var/folders/5q/5qTPn6xq2RaWqk+1Ytw3-U+++TI/-Tmp-/tmpG7V1Y0'\n>>> f.write("Hello World!\\n")\n>>> f.close()\n>>> os.unlink(f.name)\n>>> os.path.exists(f.name)\nFalse\n
The module uses two global variables that tell it how to construct a\ntemporary name. They are initialized at the first call to any of the\nfunctions above. The caller may change them, but this is discouraged; use\nthe appropriate function arguments, instead.
\nWhen set to a value other than None, this variable defines the\ndefault value for the dir argument to all the functions defined in this\nmodule.
\nIf tempdir is unset or None at any call to any of the above\nfunctions, Python searches a standard list of directories and sets\ntempdir to the first one which the calling user can create files in.\nThe list is:
\nReturn the directory currently selected to create temporary files in. If\ntempdir is not None, this simply returns its contents; otherwise,\nthe search described above is performed, and the result returned.
\n\nNew in version 2.3.
\n\nDeprecated since version 2.0: Use gettempprefix() instead.
\nWhen set to a value other than None, this variable defines the prefix of the\nfinal component of the filenames returned by mktemp(). A string of six\nrandom letters and digits is appended to the prefix to make the filename unique.\nThe default prefix is tmp.
\nOlder versions of this module used to require that template be set to\nNone after a call to os.fork(); this has not been necessary since\nversion 1.5.2.
\nReturn the filename prefix used to create temporary files. This does not\ncontain the directory component. Using this function is preferred over reading\nthe template variable directly.
\n\nNew in version 1.5.2.
\nSource code: Lib/fnmatch.py
\nThis module provides support for Unix shell-style wildcards, which are not the\nsame as regular expressions (which are documented in the re module). The\nspecial characters used in shell-style wildcards are:
\nPattern | \nMeaning | \n
---|---|
* | \nmatches everything | \n
? | \nmatches any single character | \n
[seq] | \nmatches any character in seq | \n
[!seq] | \nmatches any character not in seq | \n
Note that the filename separator ('/' on Unix) is not special to this\nmodule. See module glob for pathname expansion (glob uses\nfnmatch() to match pathname segments). Similarly, filenames starting with\na period are not special for this module, and are matched by the * and ?\npatterns.
\nTest whether the filename string matches the pattern string, returning\nTrue or False. If the operating system is case-insensitive,\nthen both parameters will be normalized to all lower- or upper-case before\nthe comparison is performed. fnmatchcase() can be used to perform a\ncase-sensitive comparison, regardless of whether that’s standard for the\noperating system.
\nThis example will print all file names in the current directory with the\nextension .txt:
\nimport fnmatch\nimport os\n\nfor file in os.listdir('.'):\n if fnmatch.fnmatch(file, '*.txt'):\n print file\n
Return the subset of the list of names that match pattern. It is the same as\n[n for n in names if fnmatch(n, pattern)], but implemented more efficiently.
\n\nNew in version 2.2.
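The equivalence stated above can be demonstrated directly:

```python
import fnmatch

names = ['1.gif', '2.txt', 'card.gif']

# filter() is equivalent to the list comprehension, only faster.
by_filter = fnmatch.filter(names, '*.gif')
by_comprehension = [n for n in names if fnmatch.fnmatch(n, '*.gif')]

assert by_filter == by_comprehension == ['1.gif', 'card.gif']
```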
\nReturn the shell-style pattern converted to a regular expression.
\nBe aware there is no way to quote meta-characters.
\nExample:
\n>>> import fnmatch, re\n>>>\n>>> regex = fnmatch.translate('*.txt')\n>>> regex\n'.*\\\\.txt$'\n>>> reobj = re.compile(regex)\n>>> reobj.match('foobar.txt')\n<_sre.SRE_Match object at 0x...>\n
See also
\nSource code: Lib/linecache.py
\nThe linecache module allows one to get any line from any file, while\nattempting to optimize internally, using a cache, the common case where many\nlines are read from a single file. This is used by the traceback module\nto retrieve source lines for inclusion in the formatted traceback.
\nThe linecache module defines the following functions:
\nGet line lineno from file named filename. This function will never raise an\nexception — it will return '' on errors (the terminating newline character\nwill be included for lines that are found).
\nIf a file named filename is not found, the function will look for it in the\nmodule search path, sys.path, after first checking for a PEP 302\n__loader__ in module_globals, in case the module was imported from a\nzipfile or other non-filesystem import source.
\n\nNew in version 2.5: The module_globals parameter was added.
\nExample:
\n>>> import linecache\n>>> linecache.getline('/etc/passwd', 4)\n'sys:x:3:3:sys:/dev:/bin/sh\\n'\n
Source code: Lib/filecmp.py
\nThe filecmp module defines functions to compare files and directories,\nwith various optional time/correctness trade-offs. For comparing files,\nsee also the difflib module.
\nThe filecmp module defines the following functions:
\nCompare the files named f1 and f2, returning True if they seem equal,\nFalse otherwise.
\nUnless shallow is given and is false, files with identical os.stat()\nsignatures are taken to be equal.
\nFiles that were compared using this function will not be compared again unless\ntheir os.stat() signature changes.
\nNote that no external programs are called from this function, giving it\nportability and efficiency.
\nCompare the files in the two directories dir1 and dir2 whose names are\ngiven by common.
\nReturns three lists of file names: match, mismatch,\nerrors. match contains the list of files that match, mismatch contains\nthe names of those that don’t, and errors lists the names of files which\ncould not be compared. Files are listed in errors if they don’t exist in\none of the directories, the user lacks permission to read them or if the\ncomparison could not be done for some other reason.
\nThe shallow parameter has the same meaning and default value as for\nfilecmp.cmp().
\nFor example, cmpfiles('a', 'b', ['c', 'd/e']) will compare a/c with\nb/c and a/d/e with b/d/e. 'c' and 'd/e' will each be in\none of the three returned lists.
\nExample:
\n>>> import filecmp\n>>> filecmp.cmp('undoc.rst', 'undoc.rst')\nTrue\n>>> filecmp.cmp('undoc.rst', 'index.rst')\nFalse\n
dircmp instances are built using this constructor:
\nConstruct a new directory comparison object, to compare the directories a and\nb. ignore is a list of names to ignore, and defaults to ['RCS', 'CVS',\n'tags']. hide is a list of names to hide, and defaults to [os.curdir,\nos.pardir].
\nThe dircmp class provides the following methods:
\nThe dircmp offers a number of interesting attributes that may be\nused to get various bits of information about the directory trees being\ncompared.
\nNote that via __getattr__() hooks, all attributes are computed lazily,\nso there is no speed penalty if only those attributes which are lightweight\nto compute are used.
\n\nDeprecated since version 2.6: The dircache module has been removed in Python 3.0.
\nThe dircache module defines a function for reading directory listing\nusing a cache, and cache invalidation using the mtime of the directory.\nAdditionally, it defines a function to annotate directories by appending a\nslash.
\nThe dircache module defines the following functions:
\nReturn a directory listing of path, as gotten from os.listdir(). Note\nthat unless path changes, further calls to listdir() will not re-read the\ndirectory structure.
\nNote that the list returned should be regarded as read-only. (Perhaps a future\nversion should change it to return a tuple?)
\n>>> import dircache\n>>> a = dircache.listdir('/')\n>>> a = a[:] # Copy the return value so we can change 'a'\n>>> a\n['bin', 'boot', 'cdrom', 'dev', 'etc', 'floppy', 'home', 'initrd', 'lib', 'lost+found', 'mnt', 'proc', 'root', 'sbin', 'tmp', 'usr', 'var', 'vmlinuz']\n>>> dircache.annotate('/', a)\n>>> a\n['bin/', 'boot/', 'cdrom/', 'dev/', 'etc/', 'floppy/', 'home/', 'initrd/', 'lib/', 'lost+found/', 'mnt/', 'proc/', 'root/', 'sbin/', 'tmp/', 'usr/', 'var/', 'vmlinuz']\n
Note
\nThe copy_reg module has been renamed to copyreg in Python 3.0.\nThe 2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nThe copy_reg module provides support for the pickle and\ncPickle modules. The copy module is likely to use this in the\nfuture as well. It provides configuration information about object constructors\nwhich are not classes. Such constructors may be factory functions or class\ninstances.
\nDeclares that function should be used as a “reduction” function for objects of\ntype type; type must not be a “classic” class object. (Classic classes are\nhandled differently; see the documentation for the pickle module for\ndetails.) function should return either a string or a tuple containing two or\nthree elements.
\nThe optional constructor parameter, if provided, is a callable object which\ncan be used to reconstruct the object when called with the tuple of arguments\nreturned by function at pickling time. TypeError will be raised if\nobject is a class or constructor is not callable.
\nSee the pickle module for more details on the interface expected of\nfunction and constructor.
\nSource code: Lib/shutil.py
\nThe shutil module offers a number of high-level operations on files and\ncollections of files. In particular, functions are provided which support file\ncopying and removal. For operations on individual files, see also the\nos module.
\nWarning
\nEven the higher-level file copying functions (copy(), copy2())\ncan’t copy all file metadata.
\nOn POSIX platforms, this means that file owner and group are lost as well\nas ACLs. On Mac OS, the resource fork and other metadata are not used.\nThis means that resources will be lost and file type and creator codes will\nnot be correct. On Windows, file owners, ACLs and alternate data streams\nare not copied.
\nThis factory function creates a function that can be used as a callable for\ncopytree()'s ignore argument, ignoring files and directories that\nmatch one of the glob-style patterns provided. See the example below.
\n\nNew in version 2.6.
\nRecursively copy an entire directory tree rooted at src. The destination\ndirectory, named by dst, must not already exist; it will be created as well\nas missing parent directories. Permissions and times of directories are\ncopied with copystat(), individual files are copied using\ncopy2().
\nIf symlinks is true, symbolic links in the source tree are represented as\nsymbolic links in the new tree, but the metadata of the original links is NOT\ncopied; if false or omitted, the contents and metadata of the linked files\nare copied to the new tree.
\nIf ignore is given, it must be a callable that will receive as its\narguments the directory being visited by copytree(), and a list of its\ncontents, as returned by os.listdir(). Since copytree() is\ncalled recursively, the ignore callable will be called once for each\ndirectory that is copied. The callable must return a sequence of directory\nand file names relative to the current directory (i.e. a subset of the items\nin its second argument); these names will then be ignored in the copy\nprocess. ignore_patterns() can be used to create such a callable that\nignores names based on glob-style patterns.
\nIf exception(s) occur, an Error is raised with a list of reasons.
\nThe source code for this should be considered an example rather than the\nultimate tool.
\n\nChanged in version 2.3: Error is raised if any exceptions occur during copying, rather than\nprinting a message.
\n\nChanged in version 2.5: Create intermediate directories needed to create dst, rather than raising an\nerror. Copy permissions and times of directories using copystat().
\n\nChanged in version 2.6: Added the ignore argument to be able to influence what is being copied.
\nDelete an entire directory tree; path must point to a directory (but not a\nsymbolic link to a directory). If ignore_errors is true, errors resulting\nfrom failed removals will be ignored; if false or omitted, such errors are\nhandled by calling a handler specified by onerror or, if that is omitted,\nthey raise an exception.
\nIf onerror is provided, it must be a callable that accepts three\nparameters: function, path, and excinfo. The first parameter,\nfunction, is the function which raised the exception; it will be\nos.path.islink(), os.listdir(), os.remove() or\nos.rmdir(). The second parameter, path, will be the path name passed\nto function. The third parameter, excinfo, will be the exception\ninformation returned by sys.exc_info(). Exceptions raised by onerror\nwill not be caught.
\n\nChanged in version 2.6: Explicitly check for path being a symbolic link and raise OSError\nin that case.
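A common pattern for the onerror callable is to clear the read-only bit and retry the failed call. The handler below is an illustration, not part of the module; on Unix a read-only file in a writable directory can be removed anyway, so the handler may simply never fire there:

```python
import os
import shutil
import stat
import tempfile

def remove_readonly(function, path, excinfo):
    """onerror handler: make the path writable and retry the failed call."""
    os.chmod(path, stat.S_IWRITE)
    function(path)

# Demonstration: a tree containing a read-only file.
top = tempfile.mkdtemp()
victim = os.path.join(top, 'readonly.txt')
open(victim, 'w').close()
os.chmod(victim, stat.S_IREAD)

shutil.rmtree(top, onerror=remove_readonly)
assert not os.path.exists(top)
```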
\nRecursively move a file or directory (src) to another location (dst).
\nIf the destination is a directory or a symlink to a directory, then src is\nmoved inside that directory.
\nThe destination directory must not already exist. If the destination already\nexists but is not a directory, it may be overwritten depending on\nos.rename() semantics.
\nIf the destination is on the current filesystem, then os.rename() is\nused. Otherwise, src is copied (using copy2()) to dst and then\nremoved.
\n\nNew in version 2.3.
\nThis exception collects exceptions that are raised during a multi-file\noperation. For copytree(), the exception argument is a list of 3-tuples\n(srcname, dstname, exception).
\n\nNew in version 2.3.
\nThis example is the implementation of the copytree() function, described\nabove, with the docstring omitted. It demonstrates many of the other functions\nprovided by this module.
\ndef copytree(src, dst, symlinks=False, ignore=None):\n names = os.listdir(src)\n if ignore is not None:\n ignored_names = ignore(src, names)\n else:\n ignored_names = set()\n\n os.makedirs(dst)\n errors = []\n for name in names:\n if name in ignored_names:\n continue\n srcname = os.path.join(src, name)\n dstname = os.path.join(dst, name)\n try:\n if symlinks and os.path.islink(srcname):\n linkto = os.readlink(srcname)\n os.symlink(linkto, dstname)\n elif os.path.isdir(srcname):\n copytree(srcname, dstname, symlinks, ignore)\n else:\n copy2(srcname, dstname)\n # XXX What about devices, sockets etc.?\n except (IOError, os.error), why:\n errors.append((srcname, dstname, str(why)))\n # catch the Error from the recursive copytree so that we can\n # continue with other files\n except Error, err:\n errors.extend(err.args[0])\n try:\n copystat(src, dst)\n except WindowsError:\n # can't copy file access times on Windows\n pass\n except OSError, why:\n errors.extend((src, dst, str(why)))\n if errors:\n raise Error(errors)\n
Another example that uses the ignore_patterns() helper:
\nfrom shutil import copytree, ignore_patterns\n\ncopytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))\n
This will copy everything except .pyc files and files or directories whose\nname starts with tmp.
\nAnother example that uses the ignore argument to add a logging call:
\nfrom shutil import copytree\nimport logging\n\ndef _logpath(path, names):\n logging.info('Working in %s' % path)\n return [] # nothing will be ignored\n\ncopytree(source, destination, ignore=_logpath)\n
Create an archive file (e.g. zip or tar) and return its name.
\nbase_name is the name of the file to create, including the path, minus\nany format-specific extension. format is the archive format: one of\n“zip”, “tar”, “bztar” or “gztar”.
\nroot_dir is a directory that will be the root directory of the\narchive; i.e., we typically chdir into root_dir before creating the\narchive.
\nbase_dir is the directory where we start archiving from;\ni.e., base_dir will be the common prefix of all files and\ndirectories in the archive.
\nroot_dir and base_dir both default to the current directory.
\nowner and group are used when creating a tar archive. By default,\nuses the current owner and group.
\nlogger is an instance of logging.Logger.
\n\nNew in version 2.7.
\nReturn a list of supported formats for archiving.\nEach element of the returned sequence is a tuple (name, description)
\nBy default shutil provides these formats:
\nYou can register new formats or provide your own archiver for any existing\nformats, by using register_archive_format().
\n\nNew in version 2.7.
\nRegister an archiver for the format name. function is a callable that\nwill be used to invoke the archiver.
\nIf given, extra_args is a sequence of (name, value) that will be\nused as extra keywords arguments when the archiver callable is used.
\ndescription is used by get_archive_formats() which returns the\nlist of archivers. Defaults to an empty string.
\n\nNew in version 2.7.
\nRemove the archive format name from the list of supported formats.
\n\nNew in version 2.7.
\nIn this example, we create a gzip’ed tar-file archive containing all files\nfound in the .ssh directory of the user:
>>> from shutil import make_archive
>>> import os
>>> archive_name = os.path.expanduser(os.path.join('~', 'myarchive'))
>>> root_dir = os.path.expanduser(os.path.join('~', '.ssh'))
>>> make_archive(archive_name, 'gztar', root_dir)
'/Users/tarek/myarchive.tar.gz'
The resulting archive contains:
$ tar -tzvf /Users/tarek/myarchive.tar.gz
drwx------ tarek/staff       0 2010-02-01 16:23:40 ./
-rw-r--r-- tarek/staff     609 2008-06-09 13:26:54 ./authorized_keys
-rwxr-xr-x tarek/staff      65 2008-06-09 13:26:54 ./config
-rwx------ tarek/staff     668 2008-06-09 13:26:54 ./id_dsa
-rwxr-xr-x tarek/staff     609 2008-06-09 13:26:54 ./id_dsa.pub
-rw------- tarek/staff    1675 2008-06-09 13:26:54 ./id_rsa
-rw-r--r-- tarek/staff     397 2008-06-09 13:26:54 ./id_rsa.pub
-rw-r--r-- tarek/staff   37192 2010-02-06 18:23:10 ./known_hosts
\nThis module is the Mac OS 9 (and earlier) implementation of the os.path\nmodule. It can be used to manipulate old-style Macintosh pathnames on Mac OS X\n(or any other platform).
The following functions are available in this module: normcase(), normpath(), isabs(), join(), split(), isdir(), isfile(), walk(), exists(). For the other functions available in os.path, dummy counterparts are available.
\nThis module contains functions that can read and write Python values in a binary\nformat. The format is specific to Python, but independent of machine\narchitecture issues (e.g., you can write a Python value to a file on a PC,\ntransport the file to a Sun, and read it back there). Details of the format are\nundocumented on purpose; it may change between Python versions (although it\nrarely does). [1]
\nThis is not a general “persistence” module. For general persistence and\ntransfer of Python objects through RPC calls, see the modules pickle and\nshelve. The marshal module exists mainly to support reading and\nwriting the “pseudo-compiled” code for Python modules of .pyc files.\nTherefore, the Python maintainers reserve the right to modify the marshal format\nin backward incompatible ways should the need arise. If you’re serializing and\nde-serializing Python objects, use the pickle module instead – the\nperformance is comparable, version independence is guaranteed, and pickle\nsupports a substantially wider range of objects than marshal.
\nWarning
\nThe marshal module is not intended to be secure against erroneous or\nmaliciously constructed data. Never unmarshal data received from an\nuntrusted or unauthenticated source.
\nNot all Python object types are supported; in general, only objects whose value\nis independent from a particular invocation of Python can be written and read by\nthis module. The following types are supported: booleans, integers, long\nintegers, floating point numbers, complex numbers, strings, Unicode objects,\ntuples, lists, sets, frozensets, dictionaries, and code objects, where it should\nbe understood that tuples, lists, sets, frozensets and dictionaries are only\nsupported as long as the values contained therein are themselves supported; and\nrecursive lists, sets and dictionaries should not be written (they will cause\ninfinite loops). The singletons None, Ellipsis and\nStopIteration can also be marshalled and unmarshalled.
\nWarning
\nOn machines where C’s long int type has more than 32 bits (such as the\nDEC Alpha), it is possible to create plain Python integers that are longer\nthan 32 bits. If such an integer is marshaled and read back in on a machine\nwhere C’s long int type has only 32 bits, a Python long integer object\nis returned instead. While of a different type, the numeric value is the\nsame. (This behavior is new in Python 2.2. In earlier versions, all but the\nleast-significant 32 bits of the value were lost, and a warning message was\nprinted.)
\nThere are functions that read/write files as well as functions operating on\nstrings.
\nThe module defines these functions:
\nWrite the value on the open file. The value must be a supported type. The\nfile must be an open file object such as sys.stdout or returned by\nopen() or os.popen(). It must be opened in binary mode ('wb'\nor 'w+b').
\nIf the value has (or contains an object that has) an unsupported type, a\nValueError exception is raised — but garbage data will also be written\nto the file. The object will not be properly read back by load().
\n\nNew in version 2.4: The version argument indicates the data format that dump should use\n(see below).
\nRead one value from the open file and return it. If no valid value is read\n(e.g. because the data has a different Python version’s incompatible marshal\nformat), raise EOFError, ValueError or TypeError. The\nfile must be an open file object opened in binary mode ('rb' or\n'r+b').
\n\nReturn the string that would be written to a file by dump(value, file). The\nvalue must be a supported type. Raise a ValueError exception if value\nhas (or contains an object that has) an unsupported type.
\n\nNew in version 2.4: The version argument indicates the data format that dumps should use\n(see below).
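The string-based functions can be exercised with a short round trip; the last part also shows the ValueError raised for an unsupported type (the class name here is just an illustration):

```python
import marshal

# Round-trip a supported container through the serialized form.
payload = marshal.dumps({'answer': 42, 'data': [1, 2, 3]})
assert marshal.loads(payload) == {'answer': 42, 'data': [1, 2, 3]}

# Instances of user-defined classes are not a supported type, so
# attempting to marshal one raises ValueError.
class Point(object):
    pass

try:
    marshal.dumps(Point())
except ValueError:
    pass   # expected: unmarshallable object
```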
\nIn addition, the following constants are defined:
\nIndicates the format that the module uses. Version 0 is the historical format,\nversion 1 (added in Python 2.4) shares interned strings and version 2 (added in\nPython 2.5) uses a binary format for floating point numbers. The current version\nis 2.
\n\nNew in version 2.4.
\nFootnotes
[1] The name of this module stems from a bit of terminology used by the designers of Modula-3 (amongst others), who use the term "marshalling" for shipping of data around in a self-contained form. Strictly speaking, "to marshal" means to convert some data from internal to external form (in an RPC buffer for instance), and "to unmarshal" for the reverse process.
Note
\nThe anydbm module has been renamed to dbm in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nanydbm is a generic interface to variants of the DBM database —\ndbhash (requires bsddb), gdbm, or dbm. If none of\nthese modules is installed, the slow-but-simple implementation in module\ndumbdbm will be used.
\nOpen the database file filename and return a corresponding object.
\nIf the database file already exists, the whichdb module is used to\ndetermine its type and the appropriate module is used; if it does not exist,\nthe first module listed above that can be imported is used.
\nThe optional flag argument must be one of these values:
\nValue | \nMeaning | \n
---|---|
'r' | \nOpen existing database for reading only\n(default) | \n
'w' | \nOpen existing database for reading and\nwriting | \n
'c' | \nOpen database for reading and writing,\ncreating it if it doesn’t exist | \n
'n' | \nAlways create a new, empty database, open\nfor reading and writing | \n
If not specified, the default value is 'r'.
\nThe optional mode argument is the Unix mode of the file, used only when the\ndatabase has to be created. It defaults to octal 0666 (and will be\nmodified by the prevailing umask).
\nThe object returned by open() supports most of the same functionality as\ndictionaries; keys and their corresponding values can be stored, retrieved, and\ndeleted, and the has_key() and keys() methods are available. Keys\nand values must always be strings.
\nThe following example records some hostnames and a corresponding title, and\nthen prints out the contents of the database:
import anydbm

# Open database, creating it if necessary.
db = anydbm.open('cache', 'c')

# Record some values
db['www.python.org'] = 'Python Website'
db['www.cnn.com'] = 'Cable News Network'

# Loop through contents.  Other dictionary methods
# such as .keys(), .values() also work.
for k, v in db.iteritems():
    print k, '\t', v

# Storing a non-string key or value will raise an exception (most
# likely a TypeError).
db['www.yahoo.com'] = 4

# Close when done.
db.close()
See also
\nSource code: Lib/shelve.py
\nA “shelf” is a persistent, dictionary-like object. The difference with “dbm”\ndatabases is that the values (not the keys!) in a shelf can be essentially\narbitrary Python objects — anything that the pickle module can handle.\nThis includes most class instances, recursive data types, and objects containing\nlots of shared sub-objects. The keys are ordinary strings.
\nOpen a persistent dictionary. The filename specified is the base filename for\nthe underlying database. As a side-effect, an extension may be added to the\nfilename and more than one file may be created. By default, the underlying\ndatabase file is opened for reading and writing. The optional flag parameter\nhas the same interpretation as the flag parameter of anydbm.open().
\nBy default, version 0 pickles are used to serialize values. The version of the\npickle protocol can be specified with the protocol parameter.
\n\nChanged in version 2.3: The protocol parameter was added.
\nBecause of Python semantics, a shelf cannot know when a mutable\npersistent-dictionary entry is modified. By default modified objects are\nwritten only when assigned to the shelf (see Example). If the\noptional writeback parameter is set to True, all entries accessed are also\ncached in memory, and written back on sync() and\nclose(); this can make it handier to mutate mutable entries in\nthe persistent dictionary, but, if many entries are accessed, it can consume\nvast amounts of memory for the cache, and it can make the close operation\nvery slow since all accessed entries are written back (there is no way to\ndetermine which accessed entries are mutable, nor which ones were actually\nmutated).
\nLike file objects, shelve objects should be closed explicitly to ensure\nthat the persistent data is flushed to disk.
\nSince the shelve module stores objects using pickle, the same\nsecurity precautions apply. Accordingly, you should avoid loading a shelf\nfrom an untrusted source.
\nShelf objects support all methods supported by dictionaries. This eases the\ntransition from dictionary based scripts to those requiring persistent storage.
\nTwo additional methods are supported:
\nSee also
\nPersistent dictionary recipe\nwith widely supported storage formats and having the speed of native\ndictionaries.
\n\n\n
A subclass of UserDict.DictMixin which stores pickled values in the\ndict object.
\nBy default, version 0 pickles are used to serialize values. The version of the\npickle protocol can be specified with the protocol parameter. See the\npickle documentation for a discussion of the pickle protocols.
\n\nChanged in version 2.3: The protocol parameter was added.
\nIf the writeback parameter is True, the object will hold a cache of all\nentries accessed and write them back to the dict at sync and close times.\nThis allows natural operations on mutable entries, but can consume much more\nmemory and make sync and close take a long time.
\nTo summarize the interface (key is a string, data is an arbitrary\nobject):
import shelve

d = shelve.open(filename)  # open -- file may get suffix added by low-level
                           # library

d[key] = data              # store data at key (overwrites old data if
                           # using an existing key)
data = d[key]              # retrieve a COPY of data at key (raise KeyError
                           # if no such key)
del d[key]                 # delete data stored at key (raises KeyError
                           # if no such key)
flag = d.has_key(key)      # true if the key exists
klist = d.keys()           # a list of all existing keys (slow!)

# as d was opened WITHOUT writeback=True, beware:
d['xx'] = range(4)         # this works as expected, but...
d['xx'].append(5)          # *this doesn't!* -- d['xx'] is STILL range(4)!

# having opened d without writeback=True, you need to code carefully:
temp = d['xx']             # extracts the copy
temp.append(5)             # mutates the copy
d['xx'] = temp             # stores the copy right back, to persist it

# or, d=shelve.open(filename, writeback=True) would let you just code
# d['xx'].append(5) and have it work as expected, BUT it would also
# consume more memory and make the d.close() operation slower.

d.close()                  # close it
See also
\nThe pickle module implements a fundamental, but powerful algorithm for\nserializing and de-serializing a Python object structure. “Pickling” is the\nprocess whereby a Python object hierarchy is converted into a byte stream, and\n“unpickling” is the inverse operation, whereby a byte stream is converted back\ninto an object hierarchy. Pickling (and unpickling) is alternatively known as\n“serialization”, “marshalling,” [1] or “flattening”, however, to avoid\nconfusion, the terms used here are “pickling” and “unpickling”.
\nThis documentation describes both the pickle module and the\ncPickle module.
\nWarning
\nThe pickle module is not intended to be secure against erroneous or\nmaliciously constructed data. Never unpickle data received from an untrusted\nor unauthenticated source.
\nThe pickle module has an optimized cousin called the cPickle\nmodule. As its name implies, cPickle is written in C, so it can be up to\n1000 times faster than pickle. However it does not support subclassing\nof the Pickler() and Unpickler() classes, because in cPickle\nthese are functions, not classes. Most applications have no need for this\nfunctionality, and can benefit from the improved performance of cPickle.\nOther than that, the interfaces of the two modules are nearly identical; the\ncommon interface is described in this manual and differences are pointed out\nwhere necessary. In the following discussions, we use the term “pickle” to\ncollectively describe the pickle and cPickle modules.
\nThe data streams the two modules produce are guaranteed to be interchangeable.
\nPython has a more primitive serialization module called marshal, but in\ngeneral pickle should always be the preferred way to serialize Python\nobjects. marshal exists primarily to support Python’s .pyc\nfiles.
\nThe pickle module differs from marshal in several significant ways:
\nThe pickle module keeps track of the objects it has already serialized,\nso that later references to the same object won’t be serialized again.\nmarshal doesn’t do this.
\nThis has implications both for recursive objects and object sharing. Recursive\nobjects are objects that contain references to themselves. These are not\nhandled by marshal, and in fact, attempting to marshal recursive objects will\ncrash your Python interpreter. Object sharing happens when there are multiple\nreferences to the same object in different places in the object hierarchy being\nserialized. pickle stores such objects only once, and ensures that all\nother references point to the master copy. Shared objects remain shared, which\ncan be very important for mutable objects.
\nmarshal cannot be used to serialize user-defined classes and their\ninstances. pickle can save and restore class instances transparently,\nhowever the class definition must be importable and live in the same module as\nwhen the object was stored.
\nThe marshal serialization format is not guaranteed to be portable\nacross Python versions. Because its primary job in life is to support\n.pyc files, the Python implementers reserve the right to change the\nserialization format in non-backwards compatible ways should the need arise.\nThe pickle serialization format is guaranteed to be backwards compatible\nacross Python releases.
\nNote that serialization is a more primitive notion than persistence; although\npickle reads and writes file objects, it does not handle the issue of\nnaming persistent objects, nor the (even more complicated) issue of concurrent\naccess to persistent objects. The pickle module can transform a complex\nobject into a byte stream and it can transform the byte stream into an object\nwith the same internal structure. Perhaps the most obvious thing to do with\nthese byte streams is to write them onto a file, but it is also conceivable to\nsend them across a network or store them in a database. The module\nshelve provides a simple interface to pickle and unpickle objects on\nDBM-style database files.
\nThe data format used by pickle is Python-specific. This has the\nadvantage that there are no restrictions imposed by external standards such as\nXDR (which can’t represent pointer sharing); however it means that non-Python\nprograms may not be able to reconstruct pickled Python objects.
\nBy default, the pickle data format uses a printable ASCII representation.\nThis is slightly more voluminous than a binary representation. The big\nadvantage of using printable ASCII (and of some other characteristics of\npickle‘s representation) is that for debugging or recovery purposes it is\npossible for a human to read the pickled file with a standard text editor.
\nThere are currently 3 different protocols which can be used for pickling.
\nRefer to PEP 307 for more information.
\nIf a protocol is not specified, protocol 0 is used. If protocol is specified\nas a negative value or HIGHEST_PROTOCOL, the highest protocol version\navailable will be used.
\n\nChanged in version 2.3: Introduced the protocol parameter.
\nA binary format, which is slightly more efficient, can be chosen by specifying a\nprotocol version >= 1.
\nTo serialize an object hierarchy, you first create a pickler, then you call the\npickler’s dump() method. To de-serialize a data stream, you first create\nan unpickler, then you call the unpickler’s load() method. The\npickle module provides the following constant:
\nThe highest protocol version available. This value can be passed as a\nprotocol value.
\n\nNew in version 2.3.
\nNote
\nBe sure to always open pickle files created with protocols >= 1 in binary mode.\nFor the old ASCII-based pickle protocol 0 you can use either text mode or binary\nmode as long as you stay consistent.
\nA pickle file written with protocol 0 in binary mode will contain lone linefeeds\nas line terminators and therefore will look “funny” when viewed in Notepad or\nother editors which do not support this format.
\nThe pickle module provides the following functions to make the pickling\nprocess more convenient:
\nWrite a pickled representation of obj to the open file object file. This is\nequivalent to Pickler(file, protocol).dump(obj).
\nIf the protocol parameter is omitted, protocol 0 is used. If protocol is\nspecified as a negative value or HIGHEST_PROTOCOL, the highest protocol\nversion will be used.
\n\nChanged in version 2.3: Introduced the protocol parameter.
\nfile must have a write() method that accepts a single string argument.\nIt can thus be a file object opened for writing, a StringIO object, or\nany other custom object that meets this interface.
\nRead a string from the open file object file and interpret it as a pickle data\nstream, reconstructing and returning the original object hierarchy. This is\nequivalent to Unpickler(file).load().
\nfile must have two methods, a read() method that takes an integer\nargument, and a readline() method that requires no arguments. Both\nmethods should return a string. Thus file can be a file object opened for\nreading, a StringIO object, or any other custom object that meets this\ninterface.
\nThis function automatically determines whether the data stream was written in\nbinary mode or not.
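Since file only needs the right methods, an in-memory buffer works just as well as a real file for dump() and load(). A sketch using io.BytesIO (in Python 2, a cStringIO.StringIO object plays the same role for the ASCII protocol 0):

```python
import io
import pickle

buf = io.BytesIO()              # anything with a write() method works
pickle.dump(['a', 'b', 'c'], buf)

buf.seek(0)                     # load() consumes it via read()/readline()
assert pickle.load(buf) == ['a', 'b', 'c']
```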
\nReturn the pickled representation of the object as a string, instead of writing\nit to a file.
\nIf the protocol parameter is omitted, protocol 0 is used. If protocol is\nspecified as a negative value or HIGHEST_PROTOCOL, the highest protocol\nversion will be used.
\n\nChanged in version 2.3: The protocol parameter was added.
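The dumps()/loads() pair avoids file objects entirely, which makes a round trip a one-liner (note that the default protocol differs between Python versions; pass it explicitly when the stream must be read elsewhere):

```python
import pickle

data = {'spam': (1, 2), 'eggs': [3.0, 'four']}
s = pickle.dumps(data)          # the pickled representation as a string
assert pickle.loads(s) == data  # reconstruct the original hierarchy
```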
\nThe pickle module also defines three exceptions:
\nThe pickle module also exports two callables [2], Pickler and\nUnpickler:
\nThis takes a file-like object to which it will write a pickle data stream.
\nIf the protocol parameter is omitted, protocol 0 is used. If protocol is\nspecified as a negative value or HIGHEST_PROTOCOL, the highest\nprotocol version will be used.
\n\nChanged in version 2.3: Introduced the protocol parameter.
\nfile must have a write() method that accepts a single string argument.\nIt can thus be an open file object, a StringIO object, or any other\ncustom object that meets this interface.
\nPickler objects define one (or two) public methods:
Clears the pickler's "memo". The memo is the data structure that remembers which objects the pickler has already seen, so that shared or recursive objects are pickled by reference and not by value. This method is useful when re-using picklers.
\nNote
\nPrior to Python 2.3, clear_memo() was only available on the picklers\ncreated by cPickle. In the pickle module, picklers have an\ninstance variable called memo which is a Python dictionary. So to clear\nthe memo for a pickle module pickler, you could do the following:
\nmypickler.memo.clear()\n
Code that does not need to support older versions of Python should simply use\nclear_memo().
It is possible to make multiple calls to the dump() method of the same Pickler instance. These must then be matched to the same number of calls to the load() method of the corresponding Unpickler instance. If the same object is pickled by multiple dump() calls, the load() calls will all yield references to the same object. [3]
\nUnpickler objects are defined as:
\nThis takes a file-like object from which it will read a pickle data stream.\nThis class automatically determines whether the data stream was written in\nbinary mode or not, so it does not need a flag as in the Pickler\nfactory.
\nfile must have two methods, a read() method that takes an integer\nargument, and a readline() method that requires no arguments. Both\nmethods should return a string. Thus file can be a file object opened for\nreading, a StringIO object, or any other custom object that meets this\ninterface.
\nUnpickler objects have one (or two) public methods:
\nRead a pickled object representation from the open file object given in\nthe constructor, and return the reconstituted object hierarchy specified\ntherein.
\nThis method automatically determines whether the data stream was written\nin binary mode or not.
\nThis is just like load() except that it doesn’t actually create any\nobjects. This is useful primarily for finding what’s called “persistent\nids” that may be referenced in a pickle data stream. See section\nThe pickle protocol below for more details.
\nNote: the noload() method is currently only available on\nUnpickler objects created with the cPickle module.\npickle module Unpicklers do not have the noload()\nmethod.
\nThe following types can be pickled:
Attempts to pickle unpicklable objects will raise the PicklingError exception; when this happens, an unspecified number of bytes may have already been written to the underlying file. Trying to pickle a highly recursive data structure may exceed the maximum recursion depth, in which case a RuntimeError will be raised. You can carefully raise this limit with sys.setrecursionlimit().
\nNote that functions (built-in and user-defined) are pickled by “fully qualified”\nname reference, not by value. This means that only the function name is\npickled, along with the name of the module the function is defined in. Neither the\nfunction’s code, nor any of its function attributes are pickled. Thus the\ndefining module must be importable in the unpickling environment, and the module\nmust contain the named object, otherwise an exception will be raised. [4]
\nSimilarly, classes are pickled by named reference, so the same restrictions in\nthe unpickling environment apply. Note that none of the class’s code or data is\npickled, so in the following example the class attribute attr is not\nrestored in the unpickling environment:
class Foo:
    attr = 'a class attr'

picklestring = pickle.dumps(Foo)
These restrictions are why picklable functions and classes must be defined in\nthe top level of a module.
\nSimilarly, when class instances are pickled, their class’s code and data are not\npickled along with them. Only the instance data are pickled. This is done on\npurpose, so you can fix bugs in a class or add methods to the class and still\nload objects that were created with an earlier version of the class. If you\nplan to have long-lived objects that will see many versions of a class, it may\nbe worthwhile to put a version number in the objects so that suitable\nconversions can be made by the class’s __setstate__() method.
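A sketch of such a version-number conversion in __setstate__(); the class, its attributes, and the version-1 layout are all hypothetical:

```python
import pickle

class Account(object):
    # Hypothetical class: version 2 stores the balance in integer cents,
    # while version 1 (an older release) stored a float number of dollars.
    version = 2

    def __init__(self, owner, balance_cents):
        self.version = Account.version
        self.owner = owner
        self.balance_cents = balance_cents

    def __setstate__(self, state):
        # Migrate old pickled states forward before installing them.
        if state.get('version', 1) < 2:
            state['balance_cents'] = int(round(state.pop('balance') * 100))
            state['version'] = 2
        self.__dict__.update(state)

# A pickle written by the current class round-trips unchanged ...
acct = pickle.loads(pickle.dumps(Account('alice', 1250)))
assert acct.balance_cents == 1250

# ... and a version-1 state is converted when it is installed.
old = Account.__new__(Account)
old.__setstate__({'version': 1, 'owner': 'bob', 'balance': 12.5})
assert old.balance_cents == 1250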
\nThis section describes the “pickling protocol” that defines the interface\nbetween the pickler/unpickler and the objects that are being serialized. This\nprotocol provides a standard way for you to define, customize, and control how\nyour objects are serialized and de-serialized. The description in this section\ndoesn’t cover specific customizations that you can employ to make the unpickling\nenvironment slightly safer from untrusted pickle data streams; see section\nSubclassing Unpicklers for more details.
\nNew-style types can provide a __getnewargs__() method that is used for\nprotocol 2. Implementing this method is needed if the type establishes some\ninternal invariants when the instance is created, or if the memory allocation\nis affected by the values passed to the __new__() method for the type\n(as it is for tuples and strings). Instances of a new-style class\nC are created using
\nobj = C.__new__(C, *args)\n
where args is the result of calling __getnewargs__() on the original\nobject; if there is no __getnewargs__(), an empty tuple is assumed.
\nUpon unpickling, if the class also defines the method __setstate__(),\nit is called with the unpickled state. [5] If there is no\n__setstate__() method, the pickled state must be a dictionary and its\nitems are assigned to the new instance’s dictionary. If a class defines both\n__getstate__() and __setstate__(), the state object needn’t be a\ndictionary and these methods can do what they want. [6]
\nNote
\nFor new-style classes, if __getstate__() returns a false\nvalue, the __setstate__() method will not be called.
\nNote
\nAt unpickling time, some methods like __getattr__(),\n__getattribute__(), or __setattr__() may be called upon the\ninstance. In case those methods rely on some internal invariant being\ntrue, the type should implement either __getinitargs__() or\n__getnewargs__() to establish such an invariant; otherwise, neither\n__new__() nor __init__() will be called.
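A common use of the __getstate__()/__setstate__() pair is dropping an unpicklable attribute (an open file here) and re-establishing it on load. The class below is purely illustrative:

```python
import os
import pickle
import tempfile

class Reader(object):
    # Illustrative class holding an open file, which cannot be pickled.
    def __init__(self, path):
        self.path = path
        self.file = open(path)

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['file']              # remove the unpicklable entry
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.file = open(self.path)    # re-establish the invariant

fd, path = tempfile.mkstemp()
os.close(fd)
original = Reader(path)
clone = pickle.loads(pickle.dumps(original))
assert clone.path == path and not clone.file.closed
original.file.close()
clone.file.close()
os.remove(path)
```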
\nWhen the Pickler encounters an object of a type it knows nothing\nabout — such as an extension type — it looks in two places for a hint of\nhow to pickle it. One alternative is for the object to implement a\n__reduce__() method. If provided, at pickling time __reduce__()\nwill be called with no arguments, and it must return either a string or a\ntuple.
\nIf a string is returned, it names a global variable whose contents are\npickled as normal. The string returned by __reduce__() should be the\nobject’s local name relative to its module; the pickle module searches the\nmodule namespace to determine the object’s module.
\nWhen a tuple is returned, it must be between two and five elements long.\nOptional elements can either be omitted, or None can be provided as their\nvalue. The contents of this tuple are pickled as normal and used to\nreconstruct the object at unpickling time. The semantics of each element\nare:
\nA callable object that will be called to create the initial version of the\nobject. The next element of the tuple will provide arguments for this\ncallable, and later elements provide additional state information that will\nsubsequently be used to fully reconstruct the pickled data.
\nIn the unpickling environment this object must be either a class, a\ncallable registered as a “safe constructor” (see below), or it must have an\nattribute __safe_for_unpickling__ with a true value. Otherwise, an\nUnpicklingError will be raised in the unpickling environment. Note\nthat as usual, the callable itself is pickled by name.
\nA tuple of arguments for the callable object.
\n\nChanged in version 2.5: Formerly, this argument could also be None.
\nOptionally, the object’s state, which will be passed to the object’s\n__setstate__() method as described in section Pickling and unpickling normal class instances. If\nthe object has no __setstate__() method, then, as above, the value\nmust be a dictionary and it will be added to the object’s __dict__.
\nOptionally, an iterator (and not a sequence) yielding successive list\nitems. These list items will be pickled, and appended to the object using\neither obj.append(item) or obj.extend(list_of_items). This is\nprimarily used for list subclasses, but may be used by other classes as\nlong as they have append() and extend() methods with the\nappropriate signature. (Whether append() or extend() is used\ndepends on which pickle protocol version is used as well as the number of\nitems to append, so both must be supported.)
\nOptionally, an iterator (not a sequence) yielding successive dictionary\nitems, which should be tuples of the form (key, value). These items\nwill be pickled and stored to the object using obj[key] = value. This\nis primarily used for dictionary subclasses, but may be used by other\nclasses as long as they implement __setitem__().
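The most common shape is the two-element form, sketched here with a made-up class:

```python
import pickle

class Pair(object):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __reduce__(self):
        # Two-element form: (callable, tuple of arguments).  The callable
        # (here the class itself) is pickled by name, the arguments by value.
        return (self.__class__, (self.a, self.b))

restored = pickle.loads(pickle.dumps(Pair(1, 2)))
assert (restored.a, restored.b) == (1, 2)
```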
\nIt is sometimes useful to know the protocol version when implementing\n__reduce__(). This can be done by implementing a method named\n__reduce_ex__() instead of __reduce__(). __reduce_ex__(),\nwhen it exists, is called in preference over __reduce__() (you may\nstill provide __reduce__() for backwards compatibility). The\n__reduce_ex__() method will be called with a single integer argument,\nthe protocol version.
\nThe object class implements both __reduce__() and\n__reduce_ex__(); however, if a subclass overrides __reduce__()\nbut not __reduce_ex__(), the __reduce_ex__() implementation\ndetects this and calls __reduce__().
\nAn alternative to implementing a __reduce__() method on the object to be\npickled, is to register the callable with the copy_reg module. This\nmodule provides a way for programs to register “reduction functions” and\nconstructors for user-defined types. Reduction functions have the same\nsemantics and interface as the __reduce__() method described above, except\nthat they are called with a single argument, the object to be pickled.
\nThe registered constructor is deemed a “safe constructor” for purposes of\nunpickling as described above.
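A sketch of registering a reduction function via copy_reg (the module was renamed copyreg in Python 3, hence the guarded import); the Point class and reduce_point helper are made up:

```python
import pickle
try:
    import copy_reg as copyreg   # Python 2 name
except ImportError:
    import copyreg               # Python 3 name

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

def reduce_point(p):
    # Same contract as __reduce__(), but called with the object to pickle.
    return (Point, (p.x, p.y))

copyreg.pickle(Point, reduce_point)

q = pickle.loads(pickle.dumps(Point(3, 4)))
assert (q.x, q.y) == (3, 4)
```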
\nFor the benefit of object persistence, the pickle module supports the\nnotion of a reference to an object outside the pickled data stream. Such\nobjects are referenced by a “persistent id”, which is just an arbitrary string\nof printable ASCII characters. The resolution of such names is not defined by\nthe pickle module; it will delegate this resolution to user defined\nfunctions on the pickler and unpickler. [7]
\nTo define external persistent id resolution, you need to set the\npersistent_id attribute of the pickler object and the\npersistent_load attribute of the unpickler object.
\nTo pickle objects that have an external persistent id, the pickler must have a\ncustom persistent_id() method that takes an object as an argument and\nreturns either None or the persistent id for that object. When None is\nreturned, the pickler simply pickles the object as normal. When a persistent id\nstring is returned, the pickler will pickle that string, along with a marker so\nthat the unpickler will recognize the string as a persistent id.
\nTo unpickle external objects, the unpickler must have a custom\npersistent_load() function that takes a persistent id string and returns\nthe referenced object.
\nHere’s a silly example that might shed more light:
import pickle
from cStringIO import StringIO

src = StringIO()
p = pickle.Pickler(src)

def persistent_id(obj):
    if hasattr(obj, 'x'):
        return 'the value %d' % obj.x
    else:
        return None

p.persistent_id = persistent_id

class Integer:
    def __init__(self, x):
        self.x = x
    def __str__(self):
        return 'My name is integer %d' % self.x

i = Integer(7)
print i
p.dump(i)

datastream = src.getvalue()
print repr(datastream)
dst = StringIO(datastream)

up = pickle.Unpickler(dst)

class FancyInteger(Integer):
    def __str__(self):
        return 'I am the integer %d' % self.x

def persistent_load(persid):
    if persid.startswith('the value '):
        value = int(persid.split()[2])
        return FancyInteger(value)
    else:
        raise pickle.UnpicklingError, 'Invalid persistent id'

up.persistent_load = persistent_load

j = up.load()
print j
In the cPickle module, the unpickler’s persistent_load attribute\ncan also be set to a Python list, in which case, when the unpickler reaches a\npersistent id, the persistent id string will simply be appended to this list.\nThis functionality exists so that a pickle data stream can be “sniffed” for\nobject references without actually instantiating all the objects in a pickle.\n[8] Setting persistent_load to a list is usually used in conjunction\nwith the noload() method on the Unpickler.
\nBy default, unpickling will import any class that it finds in the pickle data.\nYou can control exactly what gets unpickled and what gets called by customizing\nyour unpickler. Unfortunately, exactly how you do this is different depending\non whether you’re using pickle or cPickle. [9]
In the pickle module, you need to derive a subclass from Unpickler, overriding the load_global() method. load_global() should read two lines from the pickle data stream, where the first line will be the name of the module containing the class and the second line will be the name of the instance's class. It then looks up the class, possibly importing the module and digging out the attribute, and appends what it finds to the unpickler's stack. Later on, this class will be assigned to the __class__ attribute of an empty class, as a way of magically creating an instance without calling its class's __init__(). Your job (should you choose to accept it) is to have load_global() push onto the unpickler's stack a known safe version of any class you deem safe to unpickle. It is up to you to produce such a class. Or you could raise an error if you want to disallow all unpickling of instances. If this sounds like a hack, you're right. Refer to the source code to make this work.
\nThings are a little cleaner with cPickle, but not by much. To control\nwhat gets unpickled, you can set the unpickler’s find_global attribute\nto a function or None. If it is None then any attempts to unpickle\ninstances will raise an UnpicklingError. If it is a function, then it\nshould accept a module name and a class name, and return the corresponding class\nobject. It is responsible for looking up the class and performing any necessary\nimports, and it may raise an error to prevent instances of the class from being\nunpickled.
\nThe moral of the story is that you should be really careful about the source of\nthe strings your application unpickles.
\nFor the simplest code, use the dump() and load() functions. Note\nthat a self-referencing list is pickled and restored correctly.
import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

output = open('data.pkl', 'wb')

# Pickle dictionary using protocol 0.
pickle.dump(data1, output)

# Pickle the list using the highest protocol available.
pickle.dump(selfref_list, output, -1)

output.close()
The following example reads the resulting pickled data. When reading a\npickle-containing file, you should open the file in binary mode because you\ncan’t be sure if the ASCII or binary format was used.
\nimport pprint, pickle\n\npkl_file = open('data.pkl', 'rb')\n\ndata1 = pickle.load(pkl_file)\npprint.pprint(data1)\n\ndata2 = pickle.load(pkl_file)\npprint.pprint(data2)\n\npkl_file.close()\n
Here’s a larger example that shows how to modify pickling behavior for a class.\nThe TextReader class opens a text file, and returns the line number and\nline contents each time its readline() method is called. If a\nTextReader instance is pickled, all attributes except the file object\nmember are saved. When the instance is unpickled, the file is reopened, and\nreading resumes from the last location. The __setstate__() and\n__getstate__() methods are used to implement this behavior.
#!/usr/local/bin/python

class TextReader:
    """Print and number lines in a text file."""
    def __init__(self, file):
        self.file = file
        self.fh = open(file)
        self.lineno = 0

    def readline(self):
        self.lineno = self.lineno + 1
        line = self.fh.readline()
        if not line:
            return None
        if line.endswith("\n"):
            line = line[:-1]
        return "%d: %s" % (self.lineno, line)

    def __getstate__(self):
        odict = self.__dict__.copy()  # copy the dict since we change it
        del odict['fh']               # remove filehandle entry
        return odict

    def __setstate__(self, dict):
        fh = open(dict['file'])       # reopen file
        count = dict['lineno']        # read from file...
        while count:                  # until line count is restored
            fh.readline()
            count = count - 1
        self.__dict__.update(dict)    # update attributes
        self.fh = fh                  # save the file object
A sample usage might be something like this:
\n>>> import TextReader\n>>> obj = TextReader.TextReader("TextReader.py")\n>>> obj.readline()\n'1: #!/usr/local/bin/python'\n>>> obj.readline()\n'2: '\n>>> obj.readline()\n'3: class TextReader:'\n>>> import pickle\n>>> pickle.dump(obj, open('save.p', 'wb'))\n
If you want to see that pickle works across Python processes, start another Python session before continuing. What follows can happen from either the same process or a new process.
\n>>> import pickle\n>>> reader = pickle.load(open('save.p', 'rb'))\n>>> reader.readline()\n'4: """Print and number lines in a text file."""'\n
The cPickle module supports serialization and de-serialization of Python\nobjects, providing an interface and functionality nearly identical to the\npickle module. There are several differences, the most important being\nperformance and subclassability.
\nFirst, cPickle can be up to 1000 times faster than pickle because\nthe former is implemented in C. Second, in the cPickle module the\ncallables Pickler() and Unpickler() are functions, not classes.\nThis means that you cannot use them to derive custom pickling and unpickling\nsubclasses. Most applications have no need for this functionality and should\nbenefit from the greatly improved performance of the cPickle module.
The pickle data streams produced by pickle and cPickle are identical, so it is possible to use pickle and cPickle interchangeably with existing pickles. [10]
\nThere are additional minor differences in API between cPickle and\npickle, however for most applications, they are interchangeable. More\ndocumentation is provided in the pickle module documentation, which\nincludes a list of the documented differences.
\nFootnotes
[1] | Don’t confuse this with the marshal module. |
[2] | In the pickle module these callables are classes, which you could\nsubclass to customize the behavior. However, in the cPickle module these\ncallables are factory functions and so cannot be subclassed. One common reason\nto subclass is to control what objects can actually be unpickled. See section\nSubclassing Unpicklers for more details. |
[3] | Warning: this is intended for pickling multiple objects without intervening\nmodifications to the objects or their parts. If you modify an object and then\npickle it again using the same Pickler instance, the object is not\npickled again — a reference to it is pickled and the Unpickler will\nreturn the old value, not the modified one. There are two problems here: (1)\ndetecting changes, and (2) marshalling a minimal set of changes. Garbage\nCollection may also become a problem here. |
[4] | The exception raised will likely be an ImportError or an\nAttributeError but it could be something else. |
[5] | These methods can also be used to implement copying class instances. |
[6] | This protocol is also used by the shallow and deep copying operations defined in\nthe copy module. |
[7] | The actual mechanism for associating these user defined functions is slightly\ndifferent for pickle and cPickle. The description given here\nworks the same for both implementations. Users of the pickle module\ncould also use subclassing to effect the same results, overriding the\npersistent_id() and persistent_load() methods in the derived\nclasses. |
[8] | We’ll leave you with the image of Guido and Jim sitting around sniffing pickles\nin their living rooms. |
[9] | A word of caution: the mechanisms described here use internal attributes and\nmethods, which are subject to change in future versions of Python. We intend to\nsomeday provide a common interface for controlling this behavior, which will\nwork in either pickle or cPickle. |
[10] | Since the pickle data format is actually a tiny stack-oriented programming\nlanguage, and some freedom is taken in the encodings of certain objects, it is\npossible that the two modules produce different data streams for the same input\nobjects. However it is guaranteed that they will always be able to read each\nother’s data streams. |
Note
\nThe whichdb module’s only function has been put into the dbm\nmodule in Python 3.0. The 2to3 tool will automatically adapt imports\nwhen converting your sources to 3.0.
The single function in this module attempts to guess which of the several simple database modules available (dbm, gdbm, or dbhash) should be used to open a given file.
\nPlatforms: Unix
\nNote
\nThe dbm module has been renamed to dbm.ndbm in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nThe dbm module provides an interface to the Unix “(n)dbm” library. Dbm\nobjects behave like mappings (dictionaries), except that keys and values are\nalways strings. Printing a dbm object doesn’t print the keys and values, and the\nitems() and values() methods are not supported.
\nThis module can be used with the “classic” ndbm interface, the BSD DB\ncompatibility interface, or the GNU GDBM compatibility interface. On Unix, the\nconfigure script will attempt to locate the appropriate header file\nto simplify building this module.
\nThe module defines the following:
\nOpen a dbm database and return a dbm object. The filename argument is the\nname of the database file (without the .dir or .pag extensions;\nnote that the BSD DB implementation of the interface will append the extension\n.db and only create one file).
\nThe optional flag argument must be one of these values:
Value | Meaning
---|---
'r' | Open existing database for reading only (default)
'w' | Open existing database for reading and writing
'c' | Open database for reading and writing, creating it if it doesn’t exist
'n' | Always create a new, empty database, open for reading and writing
The optional mode argument is the Unix mode of the file, used only when the\ndatabase has to be created. It defaults to octal 0666 (and will be\nmodified by the prevailing umask).
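A minimal sketch of the flags above might look like the following (the file path and key/value strings are made up; in Python 3 this interface lives at dbm.ndbm, and the generic dbm package's open() has the same signature):

```python
import os
import tempfile

import dbm  # Python 2 module; in Python 3 the generic dbm package works too

path = os.path.join(tempfile.mkdtemp(), 'example')

db = dbm.open(path, 'c', 0o666)    # 'c': create the database if needed
db['www.python.org'] = 'Python'    # keys and values are always strings
db.close()

db = dbm.open(path, 'r')           # 'r': reopen for reading only
print(db['www.python.org'])
db.close()
```

Note that under Python 3 the stored values come back as bytes rather than str.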
\nPlatforms: Unix
\nNote
\nThe gdbm module has been renamed to dbm.gnu in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nThis module is quite similar to the dbm module, but uses gdbm instead\nto provide some additional functionality. Please note that the file formats\ncreated by gdbm and dbm are incompatible.
\nThe gdbm module provides an interface to the GNU DBM library. gdbm\nobjects behave like mappings (dictionaries), except that keys and values are\nalways strings. Printing a gdbm object doesn’t print the keys and values,\nand the items() and values() methods are not supported.
\nThe module defines the following constant and functions:
\nOpen a gdbm database and return a gdbm object. The filename argument\nis the name of the database file.
\nThe optional flag argument can be:
Value | Meaning
---|---
'r' | Open existing database for reading only (default)
'w' | Open existing database for reading and writing
'c' | Open database for reading and writing, creating it if it doesn’t exist
'n' | Always create a new, empty database, open for reading and writing
The following additional characters may be appended to the flag to control\nhow the database is opened:
Value | Meaning
---|---
'f' | Open the database in fast mode. Writes to the database will not be synchronized.
's' | Synchronized mode. This will cause changes to the database to be immediately written to the file.
'u' | Do not lock database.
Not all flags are valid for all versions of gdbm. The module constant\nopen_flags is a string of supported flag characters. The exception\nerror is raised if an invalid flag is specified.
\nThe optional mode argument is the Unix mode of the file, used only when the\ndatabase has to be created. It defaults to octal 0666.
\nIn addition to the dictionary-like methods, gdbm objects have the following\nmethods:
\nReturns the key that follows key in the traversal. The following code prints\nevery key in the database db, without having to create a list in memory that\ncontains them all:
k = db.firstkey()
while k is not None:
    print k
    k = db.nextkey(k)
Note
\nThe dumbdbm module has been renamed to dbm.dumb in Python 3.0.\nThe 2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nNote
\nThe dumbdbm module is intended as a last resort fallback for the\nanydbm module when no more robust module is available. The dumbdbm\nmodule is not written for speed and is not nearly as heavily used as the other\ndatabase modules.
\nThe dumbdbm module provides a persistent dictionary-like interface which\nis written entirely in Python. Unlike other modules such as gdbm and\nbsddb, no external library is required. As with other persistent\nmappings, the keys and values must always be strings.
\nThe module defines the following:
\nOpen a dumbdbm database and return a dumbdbm object. The filename argument is\nthe basename of the database file (without any specific extensions). When a\ndumbdbm database is created, files with .dat and .dir extensions\nare created.
\nThe optional flag argument is currently ignored; the database is always opened\nfor update, and will be created if it does not exist.
\nThe optional mode argument is the Unix mode of the file, used only when the\ndatabase has to be created. It defaults to octal 0666 (and will be modified\nby the prevailing umask).
\n\nChanged in version 2.2: The mode argument was ignored in earlier versions.
\nSee also
\nIn addition to the methods provided by the UserDict.DictMixin class,\ndumbdbm objects provide the following methods.
\n\n\n\nDeprecated since version 2.6: The dbhash module has been deprecated for removal in Python 3.0.
\nThe dbhash module provides a function to open databases using the BSD\ndb library. This module mirrors the interface of the other Python database\nmodules that provide access to DBM-style databases. The bsddb module is\nrequired to use dbhash.
\nThis module provides an exception and a function:
\nOpen a db database and return the database object. The path argument is\nthe name of the database file.
\nThe flag argument can be:
Value | Meaning
---|---
'r' | Open existing database for reading only (default)
'w' | Open existing database for reading and writing
'c' | Open database for reading and writing, creating it if it doesn’t exist
'n' | Always create a new, empty database, open for reading and writing
For platforms on which the BSD db library supports locking, an 'l'\ncan be appended to indicate that locking should be used.
\nThe optional mode parameter is used to indicate the Unix permission bits that\nshould be set if a new database must be created; this will be masked by the\ncurrent umask value for the process.
\nSee also
\n\nThe database objects returned by open() provide the methods common to all\nthe DBM-style databases and mapping objects. The following methods are\navailable in addition to the standard methods.
\nReturns the key next key/value pair in a database traversal. The following code\nprints every key in the database db, without having to create a list in\nmemory that contains them all:
print db.first()
for i in xrange(1, len(db)):
    print db.next()
Source code: Lib/gzip.py
\nThis module provides a simple interface to compress and decompress files just\nlike the GNU programs gzip and gunzip would.
\nThe data compression is provided by the zlib module.
\nThe gzip module provides the GzipFile class which is modeled\nafter Python’s File Object. The GzipFile class reads and writes\ngzip-format files, automatically compressing or decompressing the\ndata so that it looks like an ordinary file object.
\nNote that additional file formats which can be decompressed by the\ngzip and gunzip programs, such as those produced by\ncompress and pack, are not supported by this module.
\nFor other archive formats, see the bz2, zipfile, and\ntarfile modules.
\nThe module defines the following items:
\nConstructor for the GzipFile class, which simulates most of the methods\nof a file object, with the exception of the readinto() and\ntruncate() methods. At least one of fileobj and filename must be\ngiven a non-trivial value.
\nThe new class instance is based on fileobj, which can be a regular file, a\nStringIO object, or any other object which simulates a file. It\ndefaults to None, in which case filename is opened to provide a file\nobject.
When fileobj is not None, the filename argument is only used to be included in the gzip file header, which may include the original filename of the uncompressed file. It defaults to the filename of fileobj, if discernible; otherwise, it defaults to the empty string, and in this case the original filename is not included in the header.
The mode argument can be any of 'r', 'rb', 'a', 'ab', 'w', or 'wb', depending on whether the file will be read or written. The default is the mode of fileobj if discernible; otherwise, the default is 'rb'. If 'b' is not present in the given mode, it will be added to ensure the file is opened in binary mode for cross-platform portability.
\nThe compresslevel argument is an integer from 1 to 9 controlling the\nlevel of compression; 1 is fastest and produces the least compression, and\n9 is slowest and produces the most compression. The default is 9.
\nThe mtime argument is an optional numeric timestamp to be written to\nthe stream when compressing. All gzip compressed streams are\nrequired to contain a timestamp. If omitted or None, the current\ntime is used. This module ignores the timestamp when decompressing;\nhowever, some programs, such as gunzip, make use of it.\nThe format of the timestamp is the same as that of the return value of\ntime.time() and of the st_mtime attribute of the object returned\nby os.stat().
\nCalling a GzipFile object’s close() method does not close\nfileobj, since you might wish to append more material after the compressed\ndata. This also allows you to pass a StringIO object opened for\nwriting as fileobj, and retrieve the resulting memory buffer using the\nStringIO object’s getvalue() method.
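The two paragraphs above can be combined in a small sketch: compressing into an in-memory buffer with a fixed mtime, then reading the timestamp back out of the raw stream (io.BytesIO stands in for StringIO so the same code runs on Python 3; the timestamp value is arbitrary):

```python
import gzip
import io
import struct

buf = io.BytesIO()
# Write with a fixed timestamp so the header is reproducible.
f = gzip.GzipFile(fileobj=buf, mode='wb', mtime=1234567890)
f.write(b'hello')
f.close()             # closes the GzipFile, but not buf

raw = buf.getvalue()  # the complete gzip stream
# Bytes 4-7 of a gzip header hold the MTIME field, little-endian.
stamp = struct.unpack('<I', raw[4:8])[0]
print(stamp)          # 1234567890
```

Because close() leaves the underlying buffer open, getvalue() can still retrieve the compressed bytes afterwards.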
\nGzipFile supports iteration and the with statement.
\n\nChanged in version 2.7: Support for the with statement was added.
\n\nChanged in version 2.7: Support for zero-padded files was added.
\nExample of how to read a compressed file:
\nimport gzip\nf = gzip.open('/home/joe/file.txt.gz', 'rb')\nfile_content = f.read()\nf.close()\n
Example of how to create a compressed GZIP file:
\nimport gzip\ncontent = "Lots of content here"\nf = gzip.open('/home/joe/file.txt.gz', 'wb')\nf.write(content)\nf.close()\n
Example of how to GZIP compress an existing file:
\nimport gzip\nf_in = open('/home/joe/file.txt', 'rb')\nf_out = gzip.open('/home/joe/file.txt.gz', 'wb')\nf_out.writelines(f_in)\nf_out.close()\nf_in.close()\n
See also
\nFor applications that require data compression, the functions in this module\nallow compression and decompression, using the zlib library. The zlib library\nhas its own home page at http://www.zlib.net. There are known\nincompatibilities between the Python module and versions of the zlib library\nearlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using\n1.1.4 or later.
\nzlib’s functions have many options and often need to be used in a particular\norder. This documentation doesn’t attempt to cover all of the permutations;\nconsult the zlib manual at http://www.zlib.net/manual.html for authoritative\ninformation.
\nFor reading and writing .gz files see the gzip module. For\nother archive formats, see the bz2, zipfile, and\ntarfile modules.
\nThe available exception and functions in this module are:
Computes an Adler-32 checksum of data. (An Adler-32 checksum is almost as reliable as a CRC32 but can be computed much more quickly.) If value is present, it is used as the starting value of the checksum; otherwise, a fixed default value is used. This allows computing a running checksum over the concatenation of several inputs. The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.
\nThis function always returns an integer object.
\nNote
To generate the same numeric value across all Python versions and platforms use adler32(data) & 0xffffffff. If you are only using the checksum in packed binary format this is not necessary as the return value is the correct 32-bit binary representation regardless of sign.
\n\nChanged in version 2.6: The return value is in the range [-2**31, 2**31-1]\nregardless of platform. In older versions the value is\nsigned on some platforms and unsigned on others.
\n\nChanged in version 3.0: The return value is unsigned and in the range [0, 2**32-1]\nregardless of platform.
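The running-checksum behaviour and the masking advice above can be seen in a short sketch (the input string and split point are arbitrary):

```python
import zlib

data = b'The quick brown fox jumps over the lazy dog'

# One-shot checksum, masked to a version-independent unsigned value.
full = zlib.adler32(data) & 0xffffffff

# The same checksum computed over two chunks, feeding the running value
# of the first call into the second.
running = zlib.adler32(data[:20])
running = zlib.adler32(data[20:], running)

print((running & 0xffffffff) == full)  # True
```

The same pattern applies verbatim to crc32() below.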
\nComputes a CRC (Cyclic Redundancy Check) checksum of data. If value is\npresent, it is used as the starting value of the checksum; otherwise, a fixed\ndefault value is used. This allows computing a running checksum over the\nconcatenation of several inputs. The algorithm is not cryptographically\nstrong, and should not be used for authentication or digital signatures. Since\nthe algorithm is designed for use as a checksum algorithm, it is not suitable\nfor use as a general hash algorithm.
\nThis function always returns an integer object.
\nNote
To generate the same numeric value across all Python versions and platforms use crc32(data) & 0xffffffff. If you are only using the checksum in packed binary format this is not necessary as the return value is the correct 32-bit binary representation regardless of sign.
\n\nChanged in version 2.6: The return value is in the range [-2**31, 2**31-1]\nregardless of platform. In older versions the value would be\nsigned on some platforms and unsigned on others.
\n\nChanged in version 3.0: The return value is unsigned and in the range [0, 2**32-1]\nregardless of platform.
\nDecompresses the data in string, returning a string containing the\nuncompressed data. The wbits parameter controls the size of the window\nbuffer, and is discussed further below.\nIf bufsize is given, it is used as the initial size of the output\nbuffer. Raises the error exception if any error occurs.
\nThe absolute value of wbits is the base two logarithm of the size of the\nhistory buffer (the “window size”) used when compressing data. Its absolute\nvalue should be between 8 and 15 for the most recent versions of the zlib\nlibrary, larger values resulting in better compression at the expense of greater\nmemory usage. When decompressing a stream, wbits must not be smaller\nthan the size originally used to compress the stream; using a too-small\nvalue will result in an exception. The default value is therefore the\nhighest value, 15. When wbits is negative, the standard\ngzip header is suppressed.
\nbufsize is the initial size of the buffer used to hold decompressed data. If\nmore space is required, the buffer size will be increased as needed, so you\ndon’t have to get this value exactly right; tuning it will only save a few calls\nto malloc(). The default size is 16384.
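As a sketch of the wbits behaviour described above, the following produces a headerless ("raw") stream with a compression object and then decompresses it by passing the matching negative wbits (the sample data is made up):

```python
import zlib

data = b'zlib window size demo ' * 50

# Compress with a 15-bit window and a raw stream (negative wbits
# suppresses the header).
co = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = co.compress(data) + co.flush()

# Decompression must also use negative wbits, with a window at least
# as large as the one used for compression.
restored = zlib.decompress(raw, -15)
print(restored == data)  # True
```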
\nCompression objects support the following methods:
\nReturns a copy of the compression object. This can be used to efficiently\ncompress a set of data that share a common initial prefix.
\n\nNew in version 2.5.
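The shared-prefix use case for copy() might look like this sketch (the prefix and record contents are made up):

```python
import zlib

prefix = b'common log header: ' * 20

c = zlib.compressobj()
head = c.compress(prefix)   # output produced so far for the shared prefix
snap = c.copy()             # snapshot of the compressor state at this point

# Finish two different streams from the same prefix state; the prefix
# is only compressed once.
out_a = head + c.compress(b'record A') + c.flush()
out_b = head + snap.compress(b'record B') + snap.flush()

assert zlib.decompress(out_a) == prefix + b'record A'
assert zlib.decompress(out_b) == prefix + b'record B'
```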
\nDecompression objects support the following methods, and two attributes:
\nA string which contains any bytes past the end of the compressed data. That is,\nthis remains "" until the last byte that contains compression data is\navailable. If the whole string turned out to contain compressed data, this is\n"", the empty string.
The only way to determine where a string of compressed data ends is by actually decompressing it. This means that when compressed data is contained in part of a larger file, you can only find the end of it by reading data and feeding it followed by some non-empty string into a decompression object’s decompress() method until the unused_data attribute is no longer the empty string.
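For instance, appending trailing bytes after a complete compressed stream leaves them in unused_data (the trailer content is arbitrary):

```python
import zlib

# A complete compressed stream followed by unrelated trailing bytes.
stream = zlib.compress(b'hello world') + b'TRAILING BYTES'

d = zlib.decompressobj()
text = d.decompress(stream)

print(text)           # b'hello world'
print(d.unused_data)  # b'TRAILING BYTES'
```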
\nDecompress string, returning a string containing the uncompressed data\ncorresponding to at least part of the data in string. This data should be\nconcatenated to the output produced by any preceding calls to the\ndecompress() method. Some of the input data may be preserved in internal\nbuffers for later processing.
\nIf the optional parameter max_length is supplied then the return value will be\nno longer than max_length. This may mean that not all of the compressed input\ncan be processed; and unconsumed data will be stored in the attribute\nunconsumed_tail. This string must be passed to a subsequent call to\ndecompress() if decompression is to continue. If max_length is not\nsupplied then the whole input is decompressed, and unconsumed_tail is an\nempty string.
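A sketch of bounded decompression using max_length and unconsumed_tail (the chunk size and input are arbitrary):

```python
import zlib

packed = zlib.compress(b'x' * 10000)

d = zlib.decompressobj()
out = d.decompress(packed, 256)   # return at most 256 bytes per call
while d.unconsumed_tail:
    # Feed the unconsumed input back in until it is exhausted.
    out += d.decompress(d.unconsumed_tail, 256)

assert out == b'x' * 10000
```

This bounds peak memory use no matter how large the decompressed result is.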
\nAll pending input is processed, and a string containing the remaining\nuncompressed output is returned. After calling flush(), the\ndecompress() method cannot be called again; the only realistic action is\nto delete the object.
\nThe optional parameter length sets the initial size of the output buffer.
\nReturns a copy of the decompression object. This can be used to save the state\nof the decompressor midway through the data stream in order to speed up random\nseeks into the stream at a future point.
\n\nNew in version 2.5.
\nSee also
\n\nDeprecated since version 2.6: The bsddb module has been deprecated for removal in Python 3.0.
\nThe bsddb module provides an interface to the Berkeley DB library. Users\ncan create hash, btree or record based library files using the appropriate open\ncall. Bsddb objects behave generally like dictionaries. Keys and values must be\nstrings, however, so to use other objects as keys or to store other kinds of\nobjects the user must serialize them somehow, typically using\nmarshal.dumps() or pickle.dumps().
The bsddb module requires a Berkeley DB library version from 4.0 through 4.7.
\nSee also
\nA more modern DB, DBEnv and DBSequence object interface is available in the\nbsddb.db module which closely matches the Berkeley DB C API documented at\nthe above URLs. Additional features provided by the bsddb.db API include\nfine tuning, transactions, logging, and multiprocess concurrent database access.
\nThe following is a description of the legacy bsddb interface compatible\nwith the old Python bsddb module. Starting in Python 2.5 this interface should\nbe safe for multithreaded access. The bsddb.db API is recommended for\nthreading users as it provides better control.
\nThe bsddb module defines the following functions that create objects that\naccess the appropriate type of Berkeley DB file. The first two arguments of\neach function are the same. For ease of portability, only the first two\narguments should be used in most instances.
\nNote
Beginning in 2.3 some Unix versions of Python may have a bsddb185 module. This is present only to allow backwards compatibility with systems which ship with the old Berkeley DB 1.85 database library. The bsddb185 module should never be used directly in new code. The module has been removed in Python 3.0. If you find you still need it, look on PyPI.
\nSee also
\nOnce instantiated, hash, btree and record objects support the same methods as\ndictionaries. In addition, they support the methods listed below.
\n\nChanged in version 2.3.1: Added dictionary methods.
\nExample:
>>> import bsddb
>>> db = bsddb.btopen('/tmp/spam.db', 'c')
>>> for i in range(10): db['%d' % i] = '%d' % (i*i)
...
>>> db['3']
'9'
>>> db.keys()
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
>>> db.first()
('0', '0')
>>> db.next()
('1', '1')
>>> db.last()
('9', '81')
>>> db.set_location('2')
('2', '4')
>>> db.previous()
('1', '1')
>>> for k, v in db.iteritems():
...     print k, v
0 0
1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
>>> '8' in db
True
>>> db.sync()
0
\nNew in version 2.3.
\nThis module provides a comprehensive interface for the bz2 compression library.\nIt implements a complete file interface, one-shot (de)compression functions, and\ntypes for sequential (de)compression.
\nFor other archive formats, see the gzip, zipfile, and\ntarfile modules.
\nHere is a summary of the features offered by the bz2 module:
\nHandling of compressed files is offered by the BZ2File class.
\nOpen a bz2 file. Mode can be either 'r' or 'w', for reading (default)\nor writing. When opened for writing, the file will be created if it doesn’t\nexist, and truncated otherwise. If buffering is given, 0 means\nunbuffered, and larger numbers specify the buffer size; the default is\n0. If compresslevel is given, it must be a number between 1 and\n9; the default is 9. Add a 'U' to mode to open the file for input\nwith universal newline support. Any line ending in the input file will be\nseen as a '\\n' in Python. Also, a file so opened gains the attribute\nnewlines; the value for this attribute is one of None (no newline\nread yet), '\\r', '\\n', '\\r\\n' or a tuple containing all the\nnewline types seen. Universal newlines are available only when\nreading. Instances support iteration in the same way as normal file\ninstances.
\nBZ2File supports the with statement.
\n\nChanged in version 2.7: Support for the with statement was added.
\nFor backward compatibility. BZ2File objects now include the\nperformance optimizations previously implemented in the xreadlines\nmodule.
\n\nDeprecated since version 2.3: This exists only for compatibility with the method by this name on\nfile objects, which is deprecated. Use for line in file\ninstead.
\nMove to new file position. Argument offset is a byte count. Optional\nargument whence defaults to os.SEEK_SET or 0 (offset from start\nof file; offset should be >= 0); other values are os.SEEK_CUR or\n1 (move relative to current position; offset can be positive or\nnegative), and os.SEEK_END or 2 (move relative to end of file;\noffset is usually negative, although many platforms allow seeking beyond\nthe end of a file).
\nNote that seeking of bz2 files is emulated, and depending on the\nparameters the operation may be extremely slow.
\nSequential compression and decompression is done using the classes\nBZ2Compressor and BZ2Decompressor.
\nCreate a new compressor object. This object may be used to compress data\nsequentially. If you want to compress data in one shot, use the\ncompress() function instead. The compresslevel parameter, if given,\nmust be a number between 1 and 9; the default is 9.
\nCreate a new decompressor object. This object may be used to decompress data\nsequentially. If you want to decompress data in one shot, use the\ndecompress() function instead.
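Feeding data incrementally through the two classes might look like the following sketch (the chunk contents are made up):

```python
import bz2

c = bz2.BZ2Compressor()
out = c.compress(b'first chunk ')    # may return b'' while buffering
out += c.compress(b'second chunk')
out += c.flush()                     # finish the stream

d = bz2.BZ2Decompressor()
restored = d.decompress(out)
assert restored == b'first chunk second chunk'
```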
\nOne-shot compression and decompression is provided through the compress()\nand decompress() functions.
\n\nNew in version 1.6.
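The one-shot functions are a single call each; a quick round-trip sketch (the payload is arbitrary):

```python
import bz2

data = b'highly repetitive payload ' * 100
packed = bz2.compress(data, 9)   # compresslevel 9 is also the default

assert bz2.decompress(packed) == data
assert len(packed) < len(data)   # repetitive input compresses well
```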
\nSource code: Lib/zipfile.py
\nThe ZIP file format is a common archive and compression standard. This module\nprovides tools to create, read, write, append, and list a ZIP file. Any\nadvanced use of this module will require an understanding of the format, as\ndefined in PKZIP Application Note.
\nThis module does not currently handle multi-disk ZIP files.\nIt can handle ZIP files that use the ZIP64 extensions\n(that is ZIP files that are more than 4 GByte in size). It supports\ndecryption of encrypted files in ZIP archives, but it currently cannot\ncreate an encrypted file. Decryption is extremely slow as it is\nimplemented in native Python rather than C.
\nFor other archive formats, see the bz2, gzip, and\ntarfile modules.
\nThe module defines the following items:
\nReturns True if filename is a valid ZIP file based on its magic number,\notherwise returns False. filename may be a file or file-like object too.
\n\nChanged in version 2.7: Support for file and file-like objects.
\nSee also
\nOpen a ZIP file, where file can be either a path to a file (a string) or a\nfile-like object. The mode parameter should be 'r' to read an existing\nfile, 'w' to truncate and write a new file, or 'a' to append to an\nexisting file. If mode is 'a' and file refers to an existing ZIP\nfile, then additional files are added to it. If file does not refer to a\nZIP file, then a new ZIP archive is appended to the file. This is meant for\nadding a ZIP archive to another file (such as python.exe).
\n\nChanged in version 2.6: If mode is 'a' and the file does not exist at all, it is created.
\ncompression is the ZIP compression method to use when writing the archive,\nand should be ZIP_STORED or ZIP_DEFLATED; unrecognized\nvalues will cause RuntimeError to be raised. If ZIP_DEFLATED\nis specified but the zlib module is not available, RuntimeError\nis also raised. The default is ZIP_STORED. If allowZip64 is\nTrue zipfile will create ZIP files that use the ZIP64 extensions when\nthe zipfile is larger than 2 GB. If it is false (the default) zipfile\nwill raise an exception when the ZIP file would require ZIP64 extensions.\nZIP64 extensions are disabled by default because the default zip\nand unzip commands on Unix (the InfoZIP utilities) don’t support\nthese extensions.
\n\nChanged in version 2.7.1: If the file is created with mode 'a' or 'w' and then\nclose()d without adding any files to the archive, the appropriate\nZIP structures for an empty archive will be written to the file.
\nZipFile is also a context manager and therefore supports the\nwith statement. In the example, myzip is closed after the\nwith statement’s suite is finished—even if an exception occurs:
\nwith ZipFile('spam.zip', 'w') as myzip:\n myzip.write('eggs.txt')\n
\nNew in version 2.7: Added the ability to use ZipFile as a context manager.
\nExtract a member from the archive as a file-like object (ZipExtFile). name is\nthe name of the file in the archive, or a ZipInfo object. The mode\nparameter, if included, must be one of the following: 'r' (the default),\n'U', or 'rU'. Choosing 'U' or 'rU' will enable universal newline\nsupport in the read-only object. pwd is the password used for encrypted files.\nCalling open() on a closed ZipFile will raise a RuntimeError.
\nNote
\nThe file-like object is read-only and provides the following methods:\nread(), readline(), readlines(), __iter__(),\nnext().
\nNote
\nIf the ZipFile was created by passing in a file-like object as the first\nargument to the constructor, then the object returned by open() shares the\nZipFile’s file pointer. Under these circumstances, the object returned by\nopen() should not be used after any additional operations are performed\non the ZipFile object. If the ZipFile was created by passing in a string (the\nfilename) as the first argument to the constructor, then open() will\ncreate a new file object that will be held by the ZipExtFile, allowing it to\noperate independently of the ZipFile.
\nNote
\nThe open(), read() and extract() methods can take a filename\nor a ZipInfo object. You will appreciate this when trying to read a\nZIP file that contains members with duplicate names.
\n\nNew in version 2.6.
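A minimal sketch of reading a member through open(), using an in-memory archive (the member name eggs.txt is made up for the example):

```python
import io
import zipfile

# Build a small in-memory archive, then read one member back through open().
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('eggs.txt', b'spam\nham\n')

with zipfile.ZipFile(buf) as zf:
    member = zf.open('eggs.txt')          # a read-only ZipExtFile
    assert member.readline() == b'spam\n'
    assert member.readlines() == [b'ham\n']
```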
\nExtract a member from the archive to the current working directory; member\nmust be its full name or a ZipInfo object. Its file information is\nextracted as accurately as possible. path specifies a different directory\nto extract to. pwd is the password used for encrypted files.
\n\nNew in version 2.6.
\nExtract all members from the archive to the current working directory. path\nspecifies a different directory to extract to. members is optional and must\nbe a subset of the list returned by namelist(). pwd is the password\nused for encrypted files.
\nWarning
\nNever extract archives from untrusted sources without prior inspection.\nIt is possible that files are created outside of path, e.g. members\nthat have absolute filenames starting with "/" or filenames with two\ndots "..".
\n\nNew in version 2.6.
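One way to act on the warning above is to vet every member name before calling extractall(). safe_extractall() below is a hypothetical helper, not part of this module:

```python
import os
import zipfile

def safe_extractall(archive, dest):
    """Hypothetical helper (not part of zipfile): reject member names that
    would resolve outside dest before extracting anything."""
    dest = os.path.abspath(dest)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            target = os.path.abspath(os.path.join(dest, name))
            if not target.startswith(dest + os.sep):
                raise RuntimeError("suspicious member name: %r" % name)
        zf.extractall(dest)
```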
\nSet pwd as default password to extract encrypted files.
\n\nNew in version 2.6.
\nReturn the bytes of the file name in the archive. name is the name of the\nfile in the archive, or a ZipInfo object. The archive must be open for\nread or append. pwd is the password used for encrypted files and, if specified,\nit will override the default password set with setpassword(). Calling\nread() on a closed ZipFile will raise a RuntimeError.
\n\nChanged in version 2.6: pwd was added, and name can now be a ZipInfo object.
\nWrite the file named filename to the archive, giving it the archive name\narcname (by default, this will be the same as filename, but without a drive\nletter and with leading path separators removed). If given, compress_type\noverrides the value given for the compression parameter to the constructor for\nthe new entry. The archive must be open with mode 'w' or 'a' – calling\nwrite() on a ZipFile created with mode 'r' will raise a\nRuntimeError. Calling write() on a closed ZipFile will raise a\nRuntimeError.
\nNote
\nThere is no official file name encoding for ZIP files. If you have unicode file\nnames, you must convert them to byte strings in your desired encoding before\npassing them to write(). WinZip interprets all file names as encoded in\nCP437, also known as DOS Latin.
\nNote
\nArchive names should be relative to the archive root, that is, they should not\nstart with a path separator.
\nNote
\nIf arcname (or filename, if arcname is not given) contains a null\nbyte, the name of the file in the archive will be truncated at the null byte.
\nWrite the string bytes to the archive; zinfo_or_arcname is either the file\nname it will be given in the archive, or a ZipInfo instance. If it’s\nan instance, at least the filename, date, and time must be given. If it’s a\nname, the date and time is set to the current date and time. The archive must be\nopened with mode 'w' or 'a' – calling writestr() on a ZipFile\ncreated with mode 'r' will raise a RuntimeError. Calling\nwritestr() on a closed ZipFile will raise a RuntimeError.
\nIf given, compress_type overrides the value given for the compression\nparameter to the constructor for the new entry, or in the zinfo_or_arcname\n(if that is a ZipInfo instance).
\nNote
\nWhen passing a ZipInfo instance as the zinfo_or_arcname parameter,\nthe compression method used will be that specified in the compress_type\nmember of the given ZipInfo instance. By default, the\nZipInfo constructor sets this member to ZIP_STORED.
\n\nChanged in version 2.7: The compression_type argument.
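A short sketch contrasting the two forms of zinfo_or_arcname (the member names are made up):

```python
import io
import time
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    # A bare name: date and time default to the current date and time,
    # compression to the ZipFile's setting.
    zf.writestr('plain.txt', b'stored with archive defaults')

    # A ZipInfo instance: name, date_time and compress_type are all explicit.
    info = zipfile.ZipInfo('notes/readme.txt', date_time=time.localtime()[:6])
    info.compress_type = zipfile.ZIP_DEFLATED
    zf.writestr(info, b'deflated via an explicit ZipInfo')

with zipfile.ZipFile(buf) as zf:
    assert zf.read('notes/readme.txt') == b'deflated via an explicit ZipInfo'
    assert zf.getinfo('notes/readme.txt').compress_type == zipfile.ZIP_DEFLATED
```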
\nThe following data attributes are also available:
\nThe PyZipFile constructor takes the same parameters as the\nZipFile constructor. Instances have one method in addition to those of\nZipFile objects.
\nSearch for files *.py and add the corresponding file to the archive.\nThe corresponding file is a *.pyo file if available, else a\n*.pyc file, compiling if necessary. If the pathname is a file, the\nfilename must end with .py, and just the (corresponding\n*.py[co]) file is added at the top level (no path information). If the\npathname is a file that does not end with .py, a RuntimeError\nwill be raised. If it is a directory, and the directory is not a package\ndirectory, then all the files *.py[co] are added at the top level. If\nthe directory is a package directory, then all *.py[co] are added under\nthe package name as a file path, and if any subdirectories are package\ndirectories, all of these are added recursively. basename is intended for\ninternal use only. The writepy() method makes archives with file names\nlike this:
\nstring.pyc # Top level name\ntest/__init__.pyc # Package directory\ntest/test_support.pyc # Module test.test_support\ntest/bogus/__init__.pyc # Subpackage directory\ntest/bogus/myfile.pyc # Submodule test.bogus.myfile\n
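A runnable sketch of the single-file case (the module name hello.py is made up; the exact byte-compiled suffix depends on the interpreter):

```python
import os
import tempfile
import zipfile

# Create a throwaway module file, then add its byte-compiled form at the
# top level of a new archive.
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, 'hello.py')
with open(module_path, 'w') as f:
    f.write('GREETING = "hi"\n')

archive_path = os.path.join(workdir, 'modules.zip')
pz = zipfile.PyZipFile(archive_path, 'w')
pz.writepy(module_path)          # compiles if necessary, adds hello.py[co]
names = pz.namelist()
pz.close()

# One entry, named after the module with a byte-compiled suffix.
assert len(names) == 1
assert names[0].startswith('hello.py')
```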
Instances of the ZipInfo class are returned by the getinfo() and\ninfolist() methods of ZipFile objects. Each object stores\ninformation about a single member of the ZIP archive.
\nInstances have the following attributes:
\nThe time and date of the last modification to the archive member. This is a\ntuple of six values:
\nIndex | \nValue | \n
---|---|
0 | \nYear (>= 1980) | \n
1 | \nMonth (one-based) | \n
2 | \nDay of month (one-based) | \n
3 | \nHours (zero-based) | \n
4 | \nMinutes (zero-based) | \n
5 | \nSeconds (zero-based) | \n
Note
\nThe ZIP file format does not support timestamps before 1980.
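For example, setting and reading back the six-value tuple (note that the ZIP format stores seconds at two-second resolution, so even seconds round-trip exactly):

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    info = zipfile.ZipInfo('log.txt', date_time=(2011, 7, 4, 13, 30, 0))
    zf.writestr(info, b'payload')

with zipfile.ZipFile(buf) as zf:
    stamp = zf.getinfo('log.txt').date_time

# (year, month, day, hour, minute, second); month and day are one-based,
# hours, minutes and seconds are zero-based.
assert stamp == (2011, 7, 4, 13, 30, 0)
```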
\nNote
\nThe robotparser module has been renamed urllib.robotparser in\nPython 3.0.\nThe 2to3 tool will automatically adapt imports when converting\nyour sources to 3.0.
\nThis module provides a single class, RobotFileParser, which answers\nquestions about whether or not a particular user agent can fetch a URL on the\nWeb site that published the robots.txt file. For more details on the\nstructure of robots.txt files, see http://www.robotstxt.org/orig.html.
\nThis class provides a set of methods to read, parse and answer questions\nabout a single robots.txt file.
\nThe following example demonstrates basic use of the RobotFileParser class.
\n>>> import robotparser\n>>> rp = robotparser.RobotFileParser()\n>>> rp.set_url("http://www.musi-cal.com/robots.txt")\n>>> rp.read()\n>>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco")\nFalse\n>>> rp.can_fetch("*", "http://www.musi-cal.com/")\nTrue\n
\nNew in version 2.3.
\nSource code: Lib/tarfile.py
\nThe tarfile module makes it possible to read and write tar\narchives, including those using gzip or bz2 compression.\n(.zip files can be read and written using the zipfile module.)
\nSome facts and figures:
\nread/write support for the POSIX.1-1988 (ustar) format.
\nread/write support for the GNU tar format including longname and longlink\nextensions, read-only support for the sparse extension.
\nread/write support for the POSIX.1-2001 (pax) format.
\n\nNew in version 2.6.
\nhandles directories, regular files, hardlinks, symbolic links, fifos,\ncharacter devices and block devices and is able to acquire and restore file\ninformation like timestamp, access permissions and owner.
\nReturn a TarFile object for the pathname name. For detailed\ninformation on TarFile objects and the keyword arguments that are\nallowed, see TarFile Objects.
\nmode has to be a string of the form 'filemode[:compression]', it defaults\nto 'r'. Here is a full list of mode combinations:
\nmode | \naction | \n
---|---|
'r' or 'r:*' | \nOpen for reading with transparent\ncompression (recommended). | \n
'r:' | \nOpen for reading exclusively without\ncompression. | \n
'r:gz' | \nOpen for reading with gzip compression. | \n
'r:bz2' | \nOpen for reading with bzip2 compression. | \n
'a' or 'a:' | \nOpen for appending with no compression. The\nfile is created if it does not exist. | \n
'w' or 'w:' | \nOpen for uncompressed writing. | \n
'w:gz' | \nOpen for gzip compressed writing. | \n
'w:bz2' | \nOpen for bzip2 compressed writing. | \n
Note that 'a:gz' or 'a:bz2' is not possible. If mode is not suitable\nto open a certain (compressed) file for reading, ReadError is raised. Use\nmode 'r' to avoid this. If a compression method is not supported,\nCompressionError is raised.
\nIf fileobj is specified, it is used as an alternative to a file object opened\nfor name. It is supposed to be at position 0.
\nFor special purposes, there is a second format for mode:\n'filemode|[compression]'. tarfile.open() will return a TarFile\nobject that processes its data as a stream of blocks. No random seeking will\nbe done on the file. If given, fileobj may be any object that has a\nread() or write() method (depending on the mode). bufsize\nspecifies the blocksize and defaults to 20 * 512 bytes. Use this variant\nin combination with e.g. sys.stdin, a socket file object or a tape\ndevice. However, such a TarFile object is limited in that it does\nnot allow random access, see Examples. The currently\npossible modes:
\nMode | \nAction | \n
---|---|
'r|*' | \nOpen a stream of tar blocks for reading\nwith transparent compression. | \n
'r|' | \nOpen a stream of uncompressed tar blocks\nfor reading. | \n
'r|gz' | \nOpen a gzip compressed stream for\nreading. | \n
'r|bz2' | \nOpen a bzip2 compressed stream for\nreading. | \n
'w|' | \nOpen an uncompressed stream for writing. | \n
'w|gz' | \nOpen a gzip compressed stream for\nwriting. | \n
'w|bz2' | \nOpen a bzip2 compressed stream for\nwriting. | \n
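A sketch of the streaming variant over an in-memory file object (any object with the required read() or write() method would do, e.g. a socket file object):

```python
import io
import tarfile

# Write a gzip compressed tar stream to a plain file-like object; tarfile
# only calls write() on it and never seeks.
raw = io.BytesIO()
out = tarfile.open(fileobj=raw, mode='w|gz')
payload = b'streamed bytes'
info = tarfile.TarInfo('payload.bin')
info.size = len(payload)
out.addfile(info, io.BytesIO(payload))
out.close()

# Read it back as a stream; members must be consumed in order.
raw.seek(0)
inp = tarfile.open(fileobj=raw, mode='r|gz')
member = inp.next()
assert member.name == 'payload.bin'
assert inp.extractfile(member).read() == payload
inp.close()
```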
Class for limited access to tar archives with a zipfile-like interface.\nPlease consult the documentation of the zipfile module for more details.\ncompression must be one of the following constants:
\n\nDeprecated since version 2.6: The TarFileCompat class has been deprecated for removal in Python 3.0.
\nIs raised by TarInfo.frombuf() if the buffer it gets is invalid.
\n\nNew in version 2.6.
\nEach of the following constants defines a tar archive format that the\ntarfile module is able to create. See section Supported tar formats for\ndetails.
\nThe following variables are available on module level:
\nSee also
\nThe TarFile object provides an interface to a tar archive. A tar\narchive is a sequence of blocks. An archive member (a stored file) is made up of\na header block followed by data blocks. It is possible to store a file in a tar\narchive several times. Each archive member is represented by a TarInfo\nobject, see TarInfo Objects for details.
\nA TarFile object can be used as a context manager in a with\nstatement. It will automatically be closed when the block is completed. Please\nnote that in the event of an exception an archive opened for writing will not\nbe finalized; only the internally used file object will be closed. See the\nExamples section for a use case.
\n\nNew in version 2.7: Added support for the context manager protocol.
\nAll following arguments are optional and can be accessed as instance attributes\nas well.
\nname is the pathname of the archive. It can be omitted if fileobj is given.\nIn this case, the file object’s name attribute is used if it exists.
\nmode is either 'r' to read from an existing archive, 'a' to append\ndata to an existing file or 'w' to create a new file overwriting an existing\none.
\nIf fileobj is given, it is used for reading or writing data. If it can be\ndetermined, mode is overridden by fileobj‘s mode. fileobj will be used\nfrom position 0.
\nNote
\nfileobj is not closed, when TarFile is closed.
\nformat controls the archive format. It must be one of the constants\nUSTAR_FORMAT, GNU_FORMAT or PAX_FORMAT that are\ndefined at module level.
\n\nNew in version 2.6.
\nThe tarinfo argument can be used to replace the default TarInfo class\nwith a different one.
\n\nNew in version 2.6.
\nIf dereference is False, add symbolic and hard links to the archive. If it\nis True, add the content of the target files to the archive. This has no\neffect on systems that do not support symbolic links.
\nIf ignore_zeros is False, treat an empty block as the end of the archive.\nIf it is True, skip empty (and invalid) blocks and try to get as many members\nas possible. This is only useful for reading concatenated or damaged archives.
\ndebug can be set from 0 (no debug messages) up to 3 (all debug\nmessages). The messages are written to sys.stderr.
\nIf errorlevel is 0, all errors are ignored when using TarFile.extract().\nNevertheless, they appear as error messages in the debug output, when debugging\nis enabled. If 1, all fatal errors are raised as OSError or\nIOError exceptions. If 2, all non-fatal errors are raised as\nTarError exceptions as well.
\nThe encoding and errors arguments control the way strings are converted to\nunicode objects and vice versa. The default settings will work for most users.\nSee section Unicode issues for in-depth information.
\n\nNew in version 2.6.
\nThe pax_headers argument is an optional dictionary of unicode strings which\nwill be added as a pax global header if format is PAX_FORMAT.
\n\nNew in version 2.6.
\nReturn a TarInfo object for member name. If name can not be found\nin the archive, KeyError is raised.
\nNote
\nIf a member occurs more than once in the archive, its last occurrence is assumed\nto be the most up-to-date version.
\nExtract all members from the archive to the current working directory or\ndirectory path. If optional members is given, it must be a subset of the\nlist returned by getmembers(). Directory information like owner,\nmodification time and permissions are set after all members have been extracted.\nThis is done to work around two problems: A directory’s modification time is\nreset each time a file is created in it. And, if a directory’s permissions do\nnot allow writing, extracting files to it will fail.
\nWarning
\nNever extract archives from untrusted sources without prior inspection.\nIt is possible that files are created outside of path, e.g. members\nthat have absolute filenames starting with "/" or filenames with two\ndots "..".
\n\nNew in version 2.5.
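As with the zipfile warning, one way to act on this is to inspect member names before delegating to extractall(). safe_extractall() below is a hypothetical helper, not part of this module:

```python
import os
import tarfile

def safe_extractall(tar, path="."):
    """Hypothetical helper (not part of tarfile): refuse members whose
    resolved target would land outside path, then delegate to extractall()."""
    base = os.path.abspath(path)
    for member in tar.getmembers():
        target = os.path.abspath(os.path.join(base, member.name))
        if not target.startswith(base + os.sep):
            raise tarfile.ExtractError("suspicious member name: %r" % member.name)
    tar.extractall(path)
```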
\nExtract a member from the archive to the current working directory, using its\nfull name. Its file information is extracted as accurately as possible. member\nmay be a filename or a TarInfo object. You can specify a different\ndirectory using path.
\nNote
\nThe extract() method does not take care of several extraction issues.\nIn most cases you should consider using the extractall() method.
\nWarning
\nSee the warning for extractall().
\nExtract a member from the archive as a file object. member may be a filename\nor a TarInfo object. If member is a regular file, a file-like object\nis returned. If member is a link, a file-like object is constructed from the\nlink’s target. If member is none of the above, None is returned.
\nNote
\nThe file-like object is read-only. It provides the methods\nread(), readline(), readlines(), seek(), tell(),\nand close(), and also supports iteration over its lines.
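A minimal sketch of extractfile() on a regular member (the member name is made up; note that seek() is available here because the archive itself is seekable):

```python
import io
import tarfile

# Build a one-member archive in memory.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode='w')
data = b'line one\nline two\n'
info = tarfile.TarInfo('notes.txt')
info.size = len(data)
tar.addfile(info, io.BytesIO(data))
tar.close()

buf.seek(0)
tar = tarfile.open(fileobj=buf)
f = tar.extractfile('notes.txt')   # a read-only file-like object
assert f.readline() == b'line one\n'
f.seek(0)
assert f.readlines() == [b'line one\n', b'line two\n']
tar.close()
```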
\nAdd the file name to the archive. name may be any type of file (directory,\nfifo, symbolic link, etc.). If given, arcname specifies an alternative name\nfor the file in the archive. Directories are added recursively by default. This\ncan be avoided by setting recursive to False. If exclude is given\nit must be a function that takes one filename argument and returns a boolean\nvalue. Depending on this value the respective file is either excluded\n(True) or added (False). If filter is specified it must\nbe a function that takes a TarInfo object argument and returns the\nchanged TarInfo object. If it instead returns None the TarInfo\nobject will be excluded from the archive. See Examples for an\nexample.
\n\nChanged in version 2.6: Added the exclude parameter.
\n\nChanged in version 2.7: Added the filter parameter.
\n\nDeprecated since version 2.7: The exclude parameter is deprecated, please use the filter parameter\ninstead. For maximum portability, filter should be used as a keyword\nargument rather than as a positional argument so that code won’t be\naffected when exclude is ultimately removed.
\nAdd the TarInfo object tarinfo to the archive. If fileobj is given,\ntarinfo.size bytes are read from it and added to the archive. You can\ncreate TarInfo objects using gettarinfo().
\nNote
\nOn Windows platforms, fileobj should always be opened with mode 'rb' to\navoid irritation about the file size.
\nSetting this to True is equivalent to setting the format\nattribute to USTAR_FORMAT, False is equivalent to\nGNU_FORMAT.
\n\nChanged in version 2.4: posix defaults to False.
\n\nDeprecated since version 2.6: Use the format attribute instead.
\nA dictionary containing key-value pairs of pax global headers.
\n\nNew in version 2.6.
\nA TarInfo object represents one member in a TarFile. Aside\nfrom storing all required attributes of a file (like file type, size, time,\npermissions, owner etc.), it provides some useful methods to determine its type.\nIt does not contain the file’s data itself.
\nTarInfo objects are returned by TarFile‘s methods\ngetmember(), getmembers() and gettarinfo().
\n\n\nCreate and return a TarInfo object from string buffer buf.
\n\nNew in version 2.6: Raises HeaderError if the buffer is invalid.
\nRead the next member from the TarFile object tarfile and return it as\na TarInfo object.
\n\nNew in version 2.6.
\nCreate a string buffer from a TarInfo object. For information on the\narguments see the constructor of the TarFile class.
\n\nChanged in version 2.6: The arguments were added.
\nA TarInfo object has the following public data attributes:
\nA dictionary containing key-value pairs of an associated pax extended header.
\n\nNew in version 2.6.
\nA TarInfo object also provides some convenient query methods:
\nThese are isreg() (and its alias isfile()), isdir(), issym(), islnk(),\nischr(), isblk(), isfifo() and isdev(), each returning True if the\nmember is of the corresponding type.
\nHow to extract an entire tar archive to the current working directory:
\nimport tarfile\ntar = tarfile.open("sample.tar.gz")\ntar.extractall()\ntar.close()\n
How to extract a subset of a tar archive with TarFile.extractall() using\na generator function instead of a list:
\nimport os\nimport tarfile\n\ndef py_files(members):\n for tarinfo in members:\n if os.path.splitext(tarinfo.name)[1] == ".py":\n yield tarinfo\n\ntar = tarfile.open("sample.tar.gz")\ntar.extractall(members=py_files(tar))\ntar.close()\n
How to create an uncompressed tar archive from a list of filenames:
\nimport tarfile\ntar = tarfile.open("sample.tar", "w")\nfor name in ["foo", "bar", "quux"]:\n tar.add(name)\ntar.close()\n
The same example using the with statement:
\nimport tarfile\nwith tarfile.open("sample.tar", "w") as tar:\n for name in ["foo", "bar", "quux"]:\n tar.add(name)\n
How to read a gzip compressed tar archive and display some member information:
\nimport tarfile\ntar = tarfile.open("sample.tar.gz", "r:gz")\nfor tarinfo in tar:\n print tarinfo.name, "is", tarinfo.size, "bytes in size and is",\n if tarinfo.isreg():\n print "a regular file."\n elif tarinfo.isdir():\n print "a directory."\n else:\n print "something else."\ntar.close()\n
How to create an archive and reset the user information using the filter\nparameter in TarFile.add():
\nimport tarfile\ndef reset(tarinfo):\n tarinfo.uid = tarinfo.gid = 0\n tarinfo.uname = tarinfo.gname = "root"\n return tarinfo\ntar = tarfile.open("sample.tar.gz", "w:gz")\ntar.add("foo", filter=reset)\ntar.close()\n
There are three tar formats that can be created with the tarfile module:
\nThe POSIX.1-1988 ustar format (USTAR_FORMAT). It supports filenames\nup to a length of at best 256 characters and linknames up to 100 characters. The\nmaximum file size is 8 gigabytes. This is an old and limited but widely\nsupported format.
\nThe GNU tar format (GNU_FORMAT). It supports long filenames and\nlinknames, files bigger than 8 gigabytes and sparse files. It is the de facto\nstandard on GNU/Linux systems. tarfile fully supports the GNU tar\nextensions for long names, sparse file support is read-only.
\nThe POSIX.1-2001 pax format (PAX_FORMAT). It is the most flexible\nformat with virtually no limits. It supports long filenames and linknames, large\nfiles and stores pathnames in a portable way. However, not all tar\nimplementations today are able to handle pax archives properly.
\nThe pax format is an extension to the existing ustar format. It uses extra\nheaders for information that cannot be stored otherwise. There are two flavours\nof pax headers: Extended headers only affect the subsequent file header, global\nheaders are valid for the complete archive and affect all following files. All\nthe data in a pax header is encoded in UTF-8 for portability reasons.
\nThere are some more variants of the tar format which can be read, but not\ncreated:
\nThe tar format was originally conceived to make backups on tape drives with the\nmain focus on preserving file system information. Nowadays tar archives are\ncommonly used for file distribution and exchanging archives over networks. One\nproblem of the original format (that all other formats are merely variants of)\nis that there is no concept of supporting different character encodings. For\nexample, an ordinary tar archive created on a UTF-8 system cannot be read\ncorrectly on a Latin-1 system if it contains non-ASCII characters. Names (i.e.\nfilenames, linknames, user/group names) containing these characters will appear\ndamaged. Unfortunately, there is no way to autodetect the encoding of an\narchive.
\nThe pax format was designed to solve this problem. It stores non-ASCII names\nusing the universal character encoding UTF-8. When a pax archive is read,\nthese UTF-8 names are converted to the encoding of the local file system.
\nThe details of unicode conversion are controlled by the encoding and errors\nkeyword arguments of the TarFile class.
\nThe default value for encoding is the local character encoding. It is deduced\nfrom sys.getfilesystemencoding() and sys.getdefaultencoding(). In\nread mode, encoding is used exclusively to convert unicode names from a pax\narchive to strings in the local character encoding. In write mode, the use of\nencoding depends on the chosen archive format. In case of PAX_FORMAT,\ninput names that contain non-ASCII characters need to be decoded before being\nstored as UTF-8 strings. The other formats do not make use of encoding\nunless unicode objects are used as input names. These are converted to 8-bit\ncharacter strings before they are added to the archive.
\nThe errors argument defines how characters are treated that cannot be\nconverted to or from encoding. Possible values are listed in section\nCodec Base Classes. In read mode, there is an additional scheme\n'utf-8' which means that bad characters are replaced by their UTF-8\nrepresentation. This is the default scheme. In write mode the default value for\nerrors is 'strict' to ensure that name information is not altered\nunnoticed.
\nNote
\nThe ConfigParser module has been renamed to configparser in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\nThis module defines the class ConfigParser. The ConfigParser\nclass implements a basic configuration file parser language which provides a\nstructure similar to what you would find on Microsoft Windows INI files. You\ncan use this to write Python programs which can be customized by end users\neasily.
\nNote
\nThis library does not interpret or write the value-type prefixes used in\nthe Windows Registry extended version of INI syntax.
\nSee also
\n\nThe configuration file consists of sections, led by a [section] header and\nfollowed by name: value entries, with continuations in the style of\nRFC 822 (see section 3.1.1, “LONG HEADER FIELDS”); name=value is also\naccepted. Note that leading whitespace is removed from values. The optional\nvalues can contain format strings which refer to other values in the same\nsection, or values in a special DEFAULT section. Additional defaults can be\nprovided on initialization and retrieval. Lines beginning with '#' or\n';' are ignored and may be used to provide comments.
\nConfiguration files may include comments, prefixed by specific characters (#\nand ;). Comments may appear on their own in an otherwise empty line, or may\nbe entered in lines holding values or section names. In the latter case, they\nneed to be preceded by a whitespace character to be recognized as a comment.\n(For backwards compatibility, only ; starts an inline comment, while #\ndoes not.)
\nOn top of the core functionality, SafeConfigParser supports\ninterpolation. This means values can contain format strings which refer to\nother values in the same section, or values in a special DEFAULT section.\nAdditional defaults can be provided on initialization.
\nFor example:
\n[My Section]\nfoodir: %(dir)s/whatever\ndir=frob\nlong: this value continues\n in the next line
\nwould resolve the %(dir)s to the value of dir (frob in this case).\nAll reference expansions are done on demand.
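A runnable sketch of this expansion; the try/except import covers the Python 3.0 rename mentioned above, and read_string() only exists on newer versions, hence the fallback to readfp():

```python
import io

try:
    from ConfigParser import SafeConfigParser                  # Python 2 name
except ImportError:
    from configparser import ConfigParser as SafeConfigParser  # renamed in 3.0

sample = u"[My Section]\nfoodir: %(dir)s/whatever\ndir=frob\nlong: this value continues\n in the next line\n"

parser = SafeConfigParser()
if hasattr(parser, "read_string"):
    parser.read_string(sample)           # newer interface
else:
    parser.readfp(io.StringIO(sample))

# %(dir)s is expanded on demand to the value of dir in the same section.
assert parser.get("My Section", "foodir") == "frob/whatever"
# Continuation lines are joined with a newline, leading whitespace stripped.
assert parser.get("My Section", "long") == "this value continues\nin the next line"
```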
\nDefault values can be specified by passing them into the ConfigParser\nconstructor as a dictionary. Additional defaults may be passed into the\nget() method which will override all others.
\nSections are normally stored in a built-in dictionary. An alternative dictionary\ntype can be passed to the ConfigParser constructor. For example, if a\ndictionary type is passed that sorts its keys, the sections will be sorted on\nwrite-back, as will be the keys within each section.
\nThe basic configuration object. When defaults is given, it is initialized\ninto the dictionary of intrinsic defaults. When dict_type is given, it will\nbe used to create the dictionary objects for the list of sections, for the\noptions within a section, and for the default values. When allow_no_value\nis true (default: False), options without values are accepted; the value\npresented for these is None.
\nThis class does not support the magical interpolation behavior.
\nAll option names are passed through the optionxform() method. Its\ndefault implementation converts option names to lower case.
\n\nNew in version 2.3.
\n\nChanged in version 2.6: dict_type was added.
\n\nChanged in version 2.7: The default dict_type is collections.OrderedDict.\nallow_no_value was added.
\nDerived class of RawConfigParser that implements the magical\ninterpolation feature and adds optional arguments to the get() and\nitems() methods. The values in defaults must be appropriate for the\n%()s string interpolation. Note that __name__ is an intrinsic default;\nits value is the section name, and will override any value provided in\ndefaults.
\nAll option names used in interpolation will be passed through the\noptionxform() method just like any other option name reference. Using\nthe default implementation of optionxform(), the values foo %(bar)s\nand foo %(BAR)s are equivalent.
\n\nNew in version 2.3.
\n\nChanged in version 2.6: dict_type was added.
\n\nChanged in version 2.7: The default dict_type is collections.OrderedDict.\nallow_no_value was added.
\nDerived class of ConfigParser that implements a more-sane variant of\nthe magical interpolation feature. This implementation is more predictable as\nwell. New applications should prefer this version if they don’t need to be\ncompatible with older versions of Python.
\n\nNew in version 2.3.
\n\nChanged in version 2.6: dict_type was added.
\n\nChanged in version 2.7: The default dict_type is collections.OrderedDict.\nallow_no_value was added.
\nException raised when an option referenced from a value does not exist. Subclass\nof InterpolationError.
\n\nNew in version 2.3.
\nException raised when the source text into which substitutions are made does not\nconform to the required syntax. Subclass of InterpolationError.
\n\nNew in version 2.3.
\nSee also
\nRawConfigParser instances have the following methods:
\nIf the given section exists, and contains the given option, return\nTrue; otherwise return False.
\n\nNew in version 1.6.
\nAttempt to read and parse a list of filenames, returning a list of filenames\nwhich were successfully parsed. If filenames is a string or Unicode string,\nit is treated as a single filename. If a file named in filenames cannot be\nopened, that file will be ignored. This is designed so that you can specify a\nlist of potential configuration file locations (for example, the current\ndirectory, the user’s home directory, and some system-wide directory), and all\nexisting configuration files in the list will be read. If none of the named\nfiles exist, the ConfigParser instance will contain an empty dataset.\nAn application which requires initial values to be loaded from a file should\nload the required file or files using readfp() before calling read()\nfor any optional files:
\nimport ConfigParser, os\n\nconfig = ConfigParser.ConfigParser()\nconfig.readfp(open('defaults.cfg'))\nconfig.read(['site.cfg', os.path.expanduser('~/.myapp.cfg')])\n
\nChanged in version 2.4: Returns list of successfully parsed filenames.
\nIf the given section exists, set the given option to the specified value;\notherwise raise NoSectionError. While it is possible to use\nRawConfigParser (or ConfigParser with raw parameters set to\ntrue) for internal storage of non-string values, full functionality (including\ninterpolation and output to files) can only be achieved using string values.
\n\nNew in version 1.6.
\nWrite a representation of the configuration to the specified file object. This\nrepresentation can be parsed by a future read() call.
\n\nNew in version 1.6.
\nRemove the specified option from the specified section. If the section does\nnot exist, raise NoSectionError. If the option existed to be removed,\nreturn True; otherwise return False.
\n\nNew in version 1.6.
\nTransforms the option name option as found in an input file or as passed in\nby client code to the form that should be used in the internal structures.\nThe default implementation returns a lower-case version of option;\nsubclasses may override this or client code can set an attribute of this name\non instances to affect this behavior.
\nYou don’t necessarily need to subclass a ConfigParser to use this method, you\ncan also re-set it on an instance, to a function that takes a string\nargument. Setting it to str, for example, would make option names case\nsensitive:
\ncfgparser = ConfigParser()\n...\ncfgparser.optionxform = str\n
Note that when reading configuration files, whitespace around the option names is stripped before optionxform() is called.
\nThe ConfigParser class extends some methods of the\nRawConfigParser interface, adding some optional arguments.
\nGet an option value for the named section. If vars is provided, it\nmust be a dictionary. The option is looked up in vars (if provided),\nsection, and in defaults in that order.
\nAll the '%' interpolations are expanded in the return values, unless the\nraw argument is true. Values for interpolation keys are looked up in the\nsame manner as the option.
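A minimal sketch of the lookup order and the raw flag; the section and option names here are invented, and the try/except import keeps the sketch runnable on both Python 2 and 3:

```python
try:
    import configparser                   # Python 3 spelling
except ImportError:
    import ConfigParser as configparser   # Python 2 spelling

# 'greeting' lives in the defaults; 'name' is supplied via vars
config = configparser.ConfigParser({'greeting': 'Hello'})
config.add_section('messages')
config.set('messages', 'welcome', '%(greeting)s, %(name)s!')

# vars is consulted first, then the section, then the defaults
print(config.get('messages', 'welcome', vars={'name': 'World'}))
# -> Hello, World!

# raw=True returns the value with '%' interpolations left unexpanded
print(config.get('messages', 'welcome', raw=True))
# -> %(greeting)s, %(name)s!
```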
\nThe SafeConfigParser class implements the same extended interface as\nConfigParser, with the following addition:
\nIf the given section exists, set the given option to the specified value;\notherwise raise NoSectionError. value must be a string (str\nor unicode); if not, TypeError is raised.
\n\nNew in version 2.4.
\nAn example of writing to a configuration file:
\nimport ConfigParser\n\nconfig = ConfigParser.RawConfigParser()\n\n# When adding sections or items, add them in the reverse order of\n# how you want them to be displayed in the actual file.\n# In addition, please note that using RawConfigParser's and the raw\n# mode of ConfigParser's respective set functions, you can assign\n# non-string values to keys internally, but will receive an error\n# when attempting to write to a file or when you get it in non-raw\n# mode. SafeConfigParser does not allow such assignments to take place.\nconfig.add_section('Section1')\nconfig.set('Section1', 'int', '15')\nconfig.set('Section1', 'bool', 'true')\nconfig.set('Section1', 'float', '3.1415')\nconfig.set('Section1', 'baz', 'fun')\nconfig.set('Section1', 'bar', 'Python')\nconfig.set('Section1', 'foo', '%(bar)s is %(baz)s!')\n\n# Writing our configuration file to 'example.cfg'\nwith open('example.cfg', 'wb') as configfile:\n config.write(configfile)\n
An example of reading the configuration file again:
\nimport ConfigParser\n\nconfig = ConfigParser.RawConfigParser()\nconfig.read('example.cfg')\n\n# getfloat() raises an exception if the value is not a float\n# getint() and getboolean() also do this for their respective types\nfloat = config.getfloat('Section1', 'float')\nint = config.getint('Section1', 'int')\nprint float + int\n\n# Notice that the next output does not interpolate '%(bar)s' or '%(baz)s'.\n# This is because we are using a RawConfigParser().\nif config.getboolean('Section1', 'bool'):\n print config.get('Section1', 'foo')\n
To get interpolation, you will need to use a ConfigParser or\nSafeConfigParser:
\nimport ConfigParser\n\nconfig = ConfigParser.ConfigParser()\nconfig.read('example.cfg')\n\n# Set the third, optional argument of get to 1 if you wish to use raw mode.\nprint config.get('Section1', 'foo', 0) # -> "Python is fun!"\nprint config.get('Section1', 'foo', 1) # -> "%(bar)s is %(baz)s!"\n\n# The optional fourth argument is a dict with members that will take\n# precedence in interpolation.\nprint config.get('Section1', 'foo', 0, {'bar': 'Documentation',\n 'baz': 'evil'})\n
Defaults are available in all three types of ConfigParsers. They are used in\ninterpolation if an option used is not defined elsewhere.
import ConfigParser

# New instance with 'bar' and 'baz' defaulting to 'Life' and 'hard' respectively
config = ConfigParser.SafeConfigParser({'bar': 'Life', 'baz': 'hard'})
config.read('example.cfg')

print config.get('Section1', 'foo') # -> "Python is fun!"
config.remove_option('Section1', 'bar')
config.remove_option('Section1', 'baz')
print config.get('Section1', 'foo') # -> "Life is hard!"
The function opt_move below can be used to move options between sections:
\ndef opt_move(config, section1, section2, option):\n try:\n config.set(section2, option, config.get(section1, option, 1))\n except ConfigParser.NoSectionError:\n # Create non-existent section\n config.add_section(section2)\n opt_move(config, section1, section2, option)\n else:\n config.remove_option(section1, option)\n
Some configuration files are known to include settings without values, but which\notherwise conform to the syntax supported by ConfigParser. The\nallow_no_value parameter to the constructor can be used to indicate that such\nvalues should be accepted:
\n>>> import ConfigParser\n>>> import io\n\n>>> sample_config = """\n... [mysqld]\n... user = mysql\n... pid-file = /var/run/mysqld/mysqld.pid\n... skip-external-locking\n... old_passwords = 1\n... skip-bdb\n... skip-innodb\n... """\n>>> config = ConfigParser.RawConfigParser(allow_no_value=True)\n>>> config.readfp(io.BytesIO(sample_config))\n\n>>> # Settings with values are treated as before:\n>>> config.get("mysqld", "user")\n'mysql'\n\n>>> # Settings without values provide None:\n>>> config.get("mysqld", "skip-bdb")\n\n>>> # Settings which aren't specified still raise an error:\n>>> config.get("mysqld", "does-not-exist")\nTraceback (most recent call last):\n ...\nConfigParser.NoOptionError: No option 'does-not-exist' in section: 'mysqld'\n
\nNew in version 2.3.
\nThe so-called CSV (Comma Separated Values) format is the most common import and\nexport format for spreadsheets and databases. There is no “CSV standard”, so\nthe format is operationally defined by the many applications which read and\nwrite it. The lack of a standard means that subtle differences often exist in\nthe data produced and consumed by different applications. These differences can\nmake it annoying to process CSV files from multiple sources. Still, while the\ndelimiters and quoting characters vary, the overall format is similar enough\nthat it is possible to write a single module which can efficiently manipulate\nsuch data, hiding the details of reading and writing the data from the\nprogrammer.
\nThe csv module implements classes to read and write tabular data in CSV\nformat. It allows programmers to say, “write this data in the format preferred\nby Excel,” or “read data from this file which was generated by Excel,” without\nknowing the precise details of the CSV format used by Excel. Programmers can\nalso describe the CSV formats understood by other applications or define their\nown special-purpose CSV formats.
\nThe csv module’s reader and writer objects read and\nwrite sequences. Programmers can also read and write data in dictionary form\nusing the DictReader and DictWriter classes.
\nNote
\nThis version of the csv module doesn’t support Unicode input. Also,\nthere are currently some issues regarding ASCII NUL characters. Accordingly,\nall input should be UTF-8 or printable ASCII to be safe; see the examples in\nsection Examples. These restrictions will be removed in the future.
\nSee also
\nThe csv module defines the following functions:
\nReturn a reader object which will iterate over lines in the given csvfile.\ncsvfile can be any object which supports the iterator protocol and returns a\nstring each time its next() method is called — file objects and list\nobjects are both suitable. If csvfile is a file object, it must be opened\nwith the ‘b’ flag on platforms where that makes a difference. An optional\ndialect parameter can be given which is used to define a set of parameters\nspecific to a particular CSV dialect. It may be an instance of a subclass of\nthe Dialect class or one of the strings returned by the\nlist_dialects() function. The other optional fmtparam keyword arguments\ncan be given to override individual formatting parameters in the current\ndialect. For full details about the dialect and formatting parameters, see\nsection Dialects and Formatting Parameters.
\nEach row read from the csv file is returned as a list of strings. No\nautomatic data type conversion is performed.
\nA short usage example:
\n>>> import csv\n>>> spamReader = csv.reader(open('eggs.csv', 'rb'), delimiter=' ', quotechar='|')\n>>> for row in spamReader:\n... print ', '.join(row)\nSpam, Spam, Spam, Spam, Spam, Baked Beans\nSpam, Lovely Spam, Wonderful Spam\n
\nChanged in version 2.5: The parser is now stricter with respect to multi-line quoted fields. Previously,\nif a line ended within a quoted field without a terminating newline character, a\nnewline would be inserted into the returned field. This behavior caused problems\nwhen reading files which contained carriage return characters within fields.\nThe behavior was changed to return the field without inserting newlines. As a\nconsequence, if newlines embedded within fields are important, the input should\nbe split into lines in a manner which preserves the newline characters.
\nReturn a writer object responsible for converting the user’s data into delimited\nstrings on the given file-like object. csvfile can be any object with a\nwrite() method. If csvfile is a file object, it must be opened with the\n‘b’ flag on platforms where that makes a difference. An optional dialect\nparameter can be given which is used to define a set of parameters specific to a\nparticular CSV dialect. It may be an instance of a subclass of the\nDialect class or one of the strings returned by the\nlist_dialects() function. The other optional fmtparam keyword arguments\ncan be given to override individual formatting parameters in the current\ndialect. For full details about the dialect and formatting parameters, see\nsection Dialects and Formatting Parameters. To make it\nas easy as possible to interface with modules which implement the DB API, the\nvalue None is written as the empty string. While this isn’t a\nreversible transformation, it makes it easier to dump SQL NULL data values to\nCSV files without preprocessing the data returned from a cursor.fetch* call.\nAll other non-string data are stringified with str() before being written.
\nA short usage example:
\n>>> import csv\n>>> spamWriter = csv.writer(open('eggs.csv', 'wb'), delimiter=' ',\n... quotechar='|', quoting=csv.QUOTE_MINIMAL)\n>>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])\n>>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])\n
Return the dialect associated with name. An Error is raised if name\nis not a registered dialect name.
\n\nChanged in version 2.5: This function now returns an immutable Dialect. Previously an\ninstance of the requested dialect was returned. Users could modify the\nunderlying class, changing the behavior of active readers and writers.
\nReturns the current maximum field size allowed by the parser. If new_limit is\ngiven, this becomes the new limit.
\n\nNew in version 2.5.
\nThe csv module defines the following classes:
\nCreate an object which operates like a regular writer but maps dictionaries onto\noutput rows. The fieldnames parameter identifies the order in which values in\nthe dictionary passed to the writerow() method are written to the\ncsvfile. The optional restval parameter specifies the value to be written\nif the dictionary is missing a key in fieldnames. If the dictionary passed to\nthe writerow() method contains a key not found in fieldnames, the\noptional extrasaction parameter indicates what action to take. If it is set\nto 'raise' a ValueError is raised. If it is set to 'ignore',\nextra values in the dictionary are ignored. Any other optional or keyword\narguments are passed to the underlying writer instance.
\nNote that unlike the DictReader class, the fieldnames parameter of\nthe DictWriter is not optional. Since Python’s dict objects\nare not ordered, there is not enough information available to deduce the order\nin which the row should be written to the csvfile.
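A minimal sketch of DictWriter behavior with restval and the default extrasaction='raise'; the field names are made up, and an in-memory buffer stands in for a real file:

```python
import csv
try:
    from cStringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO          # Python 3

buf = StringIO()
# 'name' and 'dept' are hypothetical field names for this sketch
writer = csv.DictWriter(buf, fieldnames=['name', 'dept'], restval='n/a')
writer.writerow({'name': 'alice', 'dept': 'eng'})
writer.writerow({'name': 'bob'})               # missing 'dept' -> restval

try:
    # a key outside fieldnames triggers extrasaction='raise'
    writer.writerow({'name': 'eve', 'extra': 1})
except ValueError:
    pass

print(buf.getvalue())
# alice,eng
# bob,n/a
```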
\nThe Sniffer class is used to deduce the format of a CSV file.
\nThe Sniffer class provides two methods:
An example of Sniffer use:
\ncsvfile = open("example.csv", "rb")\ndialect = csv.Sniffer().sniff(csvfile.read(1024))\ncsvfile.seek(0)\nreader = csv.reader(csvfile, dialect)\n# ... process CSV file contents here ...\n
The csv module defines the following constants:
\n\n\nInstructs writer objects to quote all non-numeric fields.
\nInstructs the reader to convert all non-quoted fields to type float.
\nInstructs writer objects to never quote fields. When the current\ndelimiter occurs in output data it is preceded by the current escapechar\ncharacter. If escapechar is not set, the writer will raise Error if\nany characters that require escaping are encountered.
\nInstructs reader to perform no special processing of quote characters.
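As a rough sketch of how one of these constants, QUOTE_NONNUMERIC, behaves on both the writer and the reader side (using an in-memory buffer rather than a file):

```python
import csv
try:
    from cStringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO          # Python 3

# On output: quote everything that is not a number
buf = StringIO()
csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC).writerow(['spam', 3.14, 42])
print(buf.getvalue())       # "spam",3.14,42

# On input: unquoted fields are converted to float
row = next(csv.reader(['"spam",3.14,42'], quoting=csv.QUOTE_NONNUMERIC))
print(row)                  # the quoted field stays a string
```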
\nThe csv module defines the following exception:
\nTo make it easier to specify the format of input and output records, specific\nformatting parameters are grouped together into dialects. A dialect is a\nsubclass of the Dialect class having a set of specific methods and a\nsingle validate() method. When creating reader or\nwriter objects, the programmer can specify a string or a subclass of\nthe Dialect class as the dialect parameter. In addition to, or instead\nof, the dialect parameter, the programmer can also specify individual\nformatting parameters, which have the same names as the attributes defined below\nfor the Dialect class.
\nDialects support the following attributes:
Controls how instances of quotechar appearing inside a field should themselves be quoted. When True, the character is doubled. When False, the escapechar is used as a prefix to the quotechar. It defaults to True.
\nOn output, if doublequote is False and no escapechar is set,\nError is raised if a quotechar is found in a field.
\nThe string used to terminate lines produced by the writer. It defaults\nto '\\r\\n'.
\nNote
\nThe reader is hard-coded to recognise either '\\r' or '\\n' as\nend-of-line, and ignores lineterminator. This behavior may change in the\nfuture.
\nReader objects (DictReader instances and objects returned by the\nreader() function) have the following public methods:
\nReader objects have the following public attributes:
\nThe number of lines read from the source iterator. This is not the same as the\nnumber of records returned, as records can span multiple lines.
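The distinction between records and source lines can be sketched with an in-memory reader; the input strings are made up, with a quoted field spanning two lines:

```python
import csv

# one record built from two input lines
reader = csv.reader(['a,"first', 'second",b'])
rows = list(reader)
print(len(rows))        # 1 record
print(reader.line_num)  # 2 lines consumed from the source iterator
```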
\n\nNew in version 2.5.
\nDictReader objects have the following public attribute:
\nIf not passed as a parameter when creating the object, this attribute is\ninitialized upon first access or when the first record is read from the\nfile.
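A minimal sketch of the inferred fieldnames attribute (the column names here are hypothetical):

```python
import csv

# fieldnames was not passed, so it is taken from the first row
reader = csv.DictReader(['symbol,price', 'RHAT,35.14'])
print(reader.fieldnames)   # ['symbol', 'price']

row = next(reader)
print(row['symbol'])       # RHAT
```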
\n\nChanged in version 2.6.
Writer objects (DictWriter instances and objects returned by the writer() function) have the following public methods. A row must be a sequence of strings or numbers for Writer objects, and a dictionary mapping fieldnames to strings or numbers (the numbers are passed through str() first) for DictWriter objects. Note that complex numbers are written out surrounded by parentheses. This may cause problems for other programs which read CSV files (assuming they support complex numbers at all).
\nWriter objects have the following public attribute:
\nDictWriter objects have the following public method:
\nWrite a row with the field names (as specified in the constructor).
\n\nNew in version 2.7.
\nThe simplest example of reading a CSV file:
\nimport csv\nwith open('some.csv', 'rb') as f:\n reader = csv.reader(f)\n for row in reader:\n print row\n
Reading a file with an alternate format:
\nimport csv\nwith open('passwd', 'rb') as f:\n reader = csv.reader(f, delimiter=':', quoting=csv.QUOTE_NONE)\n for row in reader:\n print row\n
The corresponding simplest possible writing example is:
\nimport csv\nwith open('some.csv', 'wb') as f:\n writer = csv.writer(f)\n writer.writerows(someiterable)\n
Registering a new dialect:
\nimport csv\ncsv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE)\nwith open('passwd', 'rb') as f:\n reader = csv.reader(f, 'unixpwd')\n
A slightly more advanced use of the reader — catching and reporting errors:
import csv, sys
filename = 'some.csv'
with open(filename, 'rb') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print row
    except csv.Error, e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
And while the module doesn’t directly support parsing strings, it can easily be\ndone:
\nimport csv\nfor row in csv.reader(['one,two,three']):\n print row\n
The csv module doesn’t directly support reading and writing Unicode, but\nit is 8-bit-clean save for some problems with ASCII NUL characters. So you can\nwrite functions or classes that handle the encoding and decoding for you as long\nas you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended.
\nunicode_csv_reader() below is a generator that wraps csv.reader\nto handle Unicode CSV data (a list of Unicode strings). utf_8_encoder()\nis a generator that encodes the Unicode strings as UTF-8, one string (or row) at\na time. The encoded strings are parsed by the CSV reader, and\nunicode_csv_reader() decodes the UTF-8-encoded cells back into Unicode:
\nimport csv\n\ndef unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):\n # csv.py doesn't do Unicode; encode temporarily as UTF-8:\n csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),\n dialect=dialect, **kwargs)\n for row in csv_reader:\n # decode UTF-8 back to Unicode, cell by cell:\n yield [unicode(cell, 'utf-8') for cell in row]\n\ndef utf_8_encoder(unicode_csv_data):\n for line in unicode_csv_data:\n yield line.encode('utf-8')\n
For all other encodings the following UnicodeReader and\nUnicodeWriter classes can be used. They take an additional encoding\nparameter in their constructor and make sure that the data passes the real\nreader or writer encoded as UTF-8:
\nimport csv, codecs, cStringIO\n\nclass UTF8Recoder:\n """\n Iterator that reads an encoded stream and reencodes the input to UTF-8\n """\n def __init__(self, f, encoding):\n self.reader = codecs.getreader(encoding)(f)\n\n def __iter__(self):\n return self\n\n def next(self):\n return self.reader.next().encode("utf-8")\n\nclass UnicodeReader:\n """\n A CSV reader which will iterate over lines in the CSV file "f",\n which is encoded in the given encoding.\n """\n\n def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):\n f = UTF8Recoder(f, encoding)\n self.reader = csv.reader(f, dialect=dialect, **kwds)\n\n def next(self):\n row = self.reader.next()\n return [unicode(s, "utf-8") for s in row]\n\n def __iter__(self):\n return self\n\nclass UnicodeWriter:\n """\n A CSV writer which will write rows to CSV file "f",\n which is encoded in the given encoding.\n """\n\n def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):\n # Redirect output to a queue\n self.queue = cStringIO.StringIO()\n self.writer = csv.writer(self.queue, dialect=dialect, **kwds)\n self.stream = f\n self.encoder = codecs.getincrementalencoder(encoding)()\n\n def writerow(self, row):\n self.writer.writerow([s.encode("utf-8") for s in row])\n # Fetch UTF-8 output from the queue ...\n data = self.queue.getvalue()\n data = data.decode("utf-8")\n # ... and reencode it into the target encoding\n data = self.encoder.encode(data)\n # write to the target stream\n self.stream.write(data)\n # empty queue\n self.queue.truncate(0)\n\n def writerows(self, rows):\n for row in rows:\n self.writerow(row)\n
\nNew in version 2.5.
\nSQLite is a C library that provides a lightweight disk-based database that\ndoesn’t require a separate server process and allows accessing the database\nusing a nonstandard variant of the SQL query language. Some applications can use\nSQLite for internal data storage. It’s also possible to prototype an\napplication using SQLite and then port the code to a larger database such as\nPostgreSQL or Oracle.
\nsqlite3 was written by Gerhard Häring and provides a SQL interface compliant\nwith the DB-API 2.0 specification described by PEP 249.
\nTo use the module, you must first create a Connection object that\nrepresents the database. Here the data will be stored in the\n/tmp/example file:
\nconn = sqlite3.connect('/tmp/example')\n
You can also supply the special name :memory: to create a database in RAM.
\nOnce you have a Connection, you can create a Cursor object\nand call its execute() method to perform SQL commands:
\nc = conn.cursor()\n\n# Create table\nc.execute('''create table stocks\n(date text, trans text, symbol text,\n qty real, price real)''')\n\n# Insert a row of data\nc.execute("""insert into stocks\n values ('2006-01-05','BUY','RHAT',100,35.14)""")\n\n# Save (commit) the changes\nconn.commit()\n\n# We can also close the cursor if we are done with it\nc.close()\n
Usually your SQL operations will need to use values from Python variables. You\nshouldn’t assemble your query using Python’s string operations because doing so\nis insecure; it makes your program vulnerable to an SQL injection attack.
\nInstead, use the DB-API’s parameter substitution. Put ? as a placeholder\nwherever you want to use a value, and then provide a tuple of values as the\nsecond argument to the cursor’s execute() method. (Other database\nmodules may use a different placeholder, such as %s or :1.) For\nexample:
# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)

# Larger example
for t in [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ]:
    c.execute('insert into stocks values (?,?,?,?,?)', t)
To retrieve data after executing a SELECT statement, you can either treat the\ncursor as an iterator, call the cursor’s fetchone() method to\nretrieve a single matching row, or call fetchall() to get a list of the\nmatching rows.
\nThis example uses the iterator form:
\n>>> c = conn.cursor()\n>>> c.execute('select * from stocks order by price')\n>>> for row in c:\n... print row\n...\n(u'2006-01-05', u'BUY', u'RHAT', 100, 35.14)\n(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)\n(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)\n(u'2006-04-05', u'BUY', u'MSFT', 1000, 72.0)\n>>>\n
See also
\nThis constant is meant to be used with the detect_types parameter of the\nconnect() function.
Setting it makes the sqlite3 module parse the declared type for each column it returns. It will parse out the first word of the declared type, i.e. for “integer primary key” it will parse out “integer”, and for “number(10)” it will parse out “number”. Then, for that column, it will look into the converters dictionary and use the converter function registered for that type there.
\nThis constant is meant to be used with the detect_types parameter of the\nconnect() function.
Setting this makes the SQLite interface parse the column name for each column it returns. It will look for a string of the form [mytype] in the column name, and will then decide that ‘mytype’ is the type of the column. It will try to find an entry for ‘mytype’ in the converters dictionary and then use the converter function found there to return the value. The column name found in Cursor.description is only the first word of the column name, i.e. if you use something like 'as "x [datetime]"' in your SQL, then everything up to the first blank is parsed out: the column name would simply be “x”.
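A minimal sketch of declared-type detection; the “point” type is made up for this example, with a converter registered for it:

```python
import sqlite3

def convert_point(s):
    # converters receive the raw column value as a bytestring
    x, y = s.split(b';')
    return (float(x), float(y))

# "point" is a hypothetical declared type for this sketch
sqlite3.register_converter("point", convert_point)

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test(p point)")
cur.execute("insert into test(p) values (?)", ("4.0;-3.2",))
cur.execute("select p from test")
point = cur.fetchone()[0]
print(point)    # (4.0, -3.2)
```

With PARSE_COLNAMES instead, the same converter would be selected by writing the query as `select p as "p [point]" from test`.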
\nOpens a connection to the SQLite database file database. You can use\n":memory:" to open a database connection to a database that resides in RAM\ninstead of on disk.
\nWhen a database is accessed by multiple connections, and one of the processes\nmodifies the database, the SQLite database is locked until that transaction is\ncommitted. The timeout parameter specifies how long the connection should wait\nfor the lock to go away until raising an exception. The default for the timeout\nparameter is 5.0 (five seconds).
\nFor the isolation_level parameter, please see the\nConnection.isolation_level property of Connection objects.
SQLite natively supports only the types TEXT, INTEGER, FLOAT, BLOB and NULL. If you want to use other types you must add support for them yourself. The detect_types parameter and custom converters registered with the module-level register_converter() function allow you to do that easily.
detect_types defaults to 0 (i.e. off, no type detection); you can set it to any combination of PARSE_DECLTYPES and PARSE_COLNAMES to turn type detection on.
\nBy default, the sqlite3 module uses its Connection class for the\nconnect call. You can, however, subclass the Connection class and make\nconnect() use your class instead by providing your class for the factory\nparameter.
\nConsult the section SQLite and Python types of this manual for details.
\nThe sqlite3 module internally uses a statement cache to avoid SQL parsing\noverhead. If you want to explicitly set the number of statements that are cached\nfor the connection, you can set the cached_statements parameter. The currently\nimplemented default is to cache 100 statements.
\nReturns True if the string sql contains one or more complete SQL\nstatements terminated by semicolons. It does not verify that the SQL is\nsyntactically correct, only that there are no unclosed string literals and the\nstatement is terminated by a semicolon.
\nThis can be used to build a shell for SQLite, as in the following example:
\n# A minimal SQLite shell for experiments\n\nimport sqlite3\n\ncon = sqlite3.connect(":memory:")\ncon.isolation_level = None\ncur = con.cursor()\n\nbuffer = ""\n\nprint "Enter your SQL commands to execute in sqlite3."\nprint "Enter a blank line to exit."\n\nwhile True:\n line = raw_input()\n if line == "":\n break\n buffer += line\n if sqlite3.complete_statement(buffer):\n try:\n buffer = buffer.strip()\n cur.execute(buffer)\n\n if buffer.lstrip().upper().startswith("SELECT"):\n print cur.fetchall()\n except sqlite3.Error, e:\n print "An error occurred:", e.args[0]\n buffer = ""\n\ncon.close()\n
Creates a user-defined function that you can later use from within SQL\nstatements under the function name name. num_params is the number of\nparameters the function accepts, and func is a Python callable that is called\nas the SQL function.
\nThe function can return any of the types supported by SQLite: unicode, str, int,\nlong, float, buffer and None.
\nExample:
import sqlite3
import hashlib   # the standalone md5 module is deprecated

def md5sum(t):
    return hashlib.md5(t).hexdigest()

con = sqlite3.connect(":memory:")
con.create_function("md5", 1, md5sum)
cur = con.cursor()
cur.execute("select md5(?)", ("foo",))
print cur.fetchone()[0]
Creates a user-defined aggregate function.
The aggregate class must implement a step method, which accepts num_params parameters, and a finalize method which returns the final result of the aggregate.
\nThe finalize method can return any of the types supported by SQLite:\nunicode, str, int, long, float, buffer and None.
\nExample:
\nimport sqlite3\n\nclass MySum:\n def __init__(self):\n self.count = 0\n\n def step(self, value):\n self.count += value\n\n def finalize(self):\n return self.count\n\ncon = sqlite3.connect(":memory:")\ncon.create_aggregate("mysum", 1, MySum)\ncur = con.cursor()\ncur.execute("create table test(i)")\ncur.execute("insert into test(i) values (1)")\ncur.execute("insert into test(i) values (2)")\ncur.execute("select mysum(i) from test")\nprint cur.fetchone()[0]\n
Creates a collation with the specified name and callable. The callable will\nbe passed two string arguments. It should return -1 if the first is ordered\nlower than the second, 0 if they are ordered equal and 1 if the first is ordered\nhigher than the second. Note that this controls sorting (ORDER BY in SQL) so\nyour comparisons don’t affect other SQL operations.
\nNote that the callable will get its parameters as Python bytestrings, which will\nnormally be encoded in UTF-8.
\nThe following example shows a custom collation that sorts “the wrong way”:
\nimport sqlite3\n\ndef collate_reverse(string1, string2):\n return -cmp(string1, string2)\n\ncon = sqlite3.connect(":memory:")\ncon.create_collation("reverse", collate_reverse)\n\ncur = con.cursor()\ncur.execute("create table test(x)")\ncur.executemany("insert into test(x) values (?)", [("a",), ("b",)])\ncur.execute("select x from test order by x collate reverse")\nfor row in cur:\n print row\ncon.close()\n
To remove a collation, call create_collation with None as callable:
\ncon.create_collation("reverse", None)\n
This routine registers a callback. The callback is invoked for each attempt to\naccess a column of a table in the database. The callback should return\nSQLITE_OK if access is allowed, SQLITE_DENY if the entire SQL\nstatement should be aborted with an error and SQLITE_IGNORE if the\ncolumn should be treated as a NULL value. These constants are available in the\nsqlite3 module.
\nThe first argument to the callback signifies what kind of operation is to be\nauthorized. The second and third argument will be arguments or None\ndepending on the first argument. The 4th argument is the name of the database\n(“main”, “temp”, etc.) if applicable. The 5th argument is the name of the\ninner-most trigger or view that is responsible for the access attempt or\nNone if this access attempt is directly from input SQL code.
\nPlease consult the SQLite documentation about the possible values for the first\nargument and the meaning of the second and third argument depending on the first\none. All necessary constants are available in the sqlite3 module.
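A minimal sketch of an authorizer callback; the table and column names are made up, and returning SQLITE_IGNORE makes the protected column read back as NULL:

```python
import sqlite3

def authorizer(action, arg1, arg2, dbname, source):
    # hide the (hypothetical) "salary" column from all reads
    if action == sqlite3.SQLITE_READ and arg2 == "salary":
        return sqlite3.SQLITE_IGNORE
    return sqlite3.SQLITE_OK

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table staff(name, salary)")
cur.execute("insert into staff values ('alice', 90000)")

con.set_authorizer(authorizer)
cur.execute("select name, salary from staff")
row = cur.fetchone()
print(row)    # the salary column comes back as None
```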
\n\nNew in version 2.6.
\nThis routine registers a callback. The callback is invoked for every n\ninstructions of the SQLite virtual machine. This is useful if you want to\nget called from SQLite during long-running operations, for example to update\na GUI.
\nIf you want to clear any previously installed progress handler, call the\nmethod with None for handler.
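A minimal sketch, with a handler that merely counts invocations; a real application might pump a GUI event loop here instead:

```python
import sqlite3

ticks = []

def progress():
    ticks.append(1)   # stand-in for "update the GUI"
    return 0          # returning non-zero aborts the running query

con = sqlite3.connect(":memory:")
con.set_progress_handler(progress, 50)   # call every 50 VM instructions
cur = con.cursor()
cur.execute("create table t(x)")
cur.executemany("insert into t(x) values (?)", [(i,) for i in range(1000)])
cur.execute("select count(*) from t")
count = cur.fetchone()[0]
print(count)    # 1000
```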
\n\nNew in version 2.7.
\nThis routine allows/disallows the SQLite engine to load SQLite extensions\nfrom shared libraries. SQLite extensions can define new functions,\naggregates or whole new virtual table implementations. One well-known\nextension is the fulltext-search extension distributed with SQLite.
import sqlite3

con = sqlite3.connect(":memory:")

# enable extension loading
con.enable_load_extension(True)

# Load the fulltext search extension
con.execute("select load_extension('./fts3.so')")

# alternatively you can load the extension using an API call:
# con.load_extension("./fts3.so")

# disable extension loading again
con.enable_load_extension(False)

# example from SQLite wiki
con.execute("create virtual table recipe using fts3(name, ingredients)")
con.executescript("""
    insert into recipe (name, ingredients) values ('broccoli stew', 'broccoli peppers cheese tomatoes');
    insert into recipe (name, ingredients) values ('pumpkin stew', 'pumpkin onions garlic celery');
    insert into recipe (name, ingredients) values ('broccoli pie', 'broccoli cheese onions flour');
    insert into recipe (name, ingredients) values ('pumpkin pie', 'pumpkin sugar flour butter');
    """)
for row in con.execute("select rowid, name, ingredients from recipe where name match 'pie'"):
    print row
Loadable extensions are disabled by default. See [1]
\n\nNew in version 2.7.
\nThis routine loads a SQLite extension from a shared library. You have to\nenable extension loading with enable_load_extension() before you can\nuse this routine.
\nLoadable extensions are disabled by default. See [1]
\nYou can change this attribute to a callable that accepts the cursor and the\noriginal row as a tuple and will return the real result row. This way, you can\nimplement more advanced ways of returning results, such as returning an object\nthat can also access columns by name.
\nExample:
\nimport sqlite3\n\ndef dict_factory(cursor, row):\n d = {}\n for idx, col in enumerate(cursor.description):\n d[col[0]] = row[idx]\n return d\n\ncon = sqlite3.connect(":memory:")\ncon.row_factory = dict_factory\ncur = con.cursor()\ncur.execute("select 1 as a")\nprint cur.fetchone()["a"]\n
If returning a tuple doesn’t suffice and you want name-based access to\ncolumns, you should consider setting row_factory to the\nhighly-optimized sqlite3.Row type. Row provides both\nindex-based and case-insensitive name-based access to columns with almost no\nmemory overhead. It will probably be better than your own custom\ndictionary-based approach or even a db_row based solution.
\nUsing this attribute you can control what objects are returned for the TEXT\ndata type. By default, this attribute is set to unicode and the\nsqlite3 module will return Unicode objects for TEXT. If you want to\nreturn bytestrings instead, you can set it to str.
\nFor efficiency reasons, there’s also a way to return Unicode objects only for\nnon-ASCII data, and bytestrings otherwise. To activate it, set this attribute to\nsqlite3.OptimizedUnicode.
\nYou can also set it to any other callable that accepts a single bytestring\nparameter and returns the resulting object.
\nSee the following example code for illustration:
\nimport sqlite3\n\ncon = sqlite3.connect(":memory:")\ncur = con.cursor()\n\n# Create the table\ncon.execute("create table person(lastname, firstname)")\n\nAUSTRIA = u"\\xd6sterreich"\n\n# by default, rows are returned as Unicode\ncur.execute("select ?", (AUSTRIA,))\nrow = cur.fetchone()\nassert row[0] == AUSTRIA\n\n# but we can make sqlite3 always return bytestrings ...\ncon.text_factory = str\ncur.execute("select ?", (AUSTRIA,))\nrow = cur.fetchone()\nassert type(row[0]) == str\n# the bytestrings will be encoded in UTF-8, unless you stored garbage in the\n# database ...\nassert row[0] == AUSTRIA.encode("utf-8")\n\n# we can also implement a custom text_factory ...\n# here we implement one that will ignore Unicode characters that cannot be\n# decoded from UTF-8\ncon.text_factory = lambda x: unicode(x, "utf-8", "ignore")\ncur.execute("select ?", ("this is latin1 and would normally create errors" +\n u"\\xe4\\xf6\\xfc".encode("latin1"),))\nrow = cur.fetchone()\nassert type(row[0]) == unicode\n\n# sqlite3 offers a built-in optimized text_factory that will return bytestring\n# objects, if the data is in ASCII only, and otherwise return unicode objects\ncon.text_factory = sqlite3.OptimizedUnicode\ncur.execute("select ?", (AUSTRIA,))\nrow = cur.fetchone()\nassert type(row[0]) == unicode\n\ncur.execute("select ?", ("Germany",))\nrow = cur.fetchone()\nassert type(row[0]) == str\n
Returns an iterator to dump the database in an SQL text format. Useful when\nsaving an in-memory database for later restoration. This function provides\nthe same capabilities as the .dump command in the sqlite3\nshell.
\n\nNew in version 2.6.
\nExample:
# Convert file existing_db.db to SQL dump file dump.sql
import sqlite3

con = sqlite3.connect('existing_db.db')
with open('dump.sql', 'w') as f:
    for line in con.iterdump():
        f.write('%s\n' % line)
Executes an SQL statement. The SQL statement may be parametrized (i. e.\nplaceholders instead of SQL literals). The sqlite3 module supports two\nkinds of placeholders: question marks (qmark style) and named placeholders\n(named style).
\nThis example shows how to use parameters with qmark style:
\nimport sqlite3\n\ncon = sqlite3.connect("mydb")\n\ncur = con.cursor()\n\nwho = "Yeltsin"\nage = 72\n\ncur.execute("select name_last, age from people where name_last=? and age=?", (who, age))\nprint cur.fetchone()\n
This example shows how to use the named style:
\nimport sqlite3\n\ncon = sqlite3.connect("mydb")\n\ncur = con.cursor()\n\nwho = "Yeltsin"\nage = 72\n\ncur.execute("select name_last, age from people where name_last=:who and age=:age",\n {"who": who, "age": age})\nprint cur.fetchone()\n
execute() will only execute a single SQL statement. If you try to execute\nmore than one statement with it, it will raise a Warning. Use\nexecutescript() if you want to execute multiple SQL statements with one\ncall.
\nExecutes an SQL command against all parameter sequences or mappings found in\nthe sequence sql. The sqlite3 module also allows using an\niterator yielding parameters instead of a sequence.
\nimport sqlite3\n\nclass IterChars:\n def __init__(self):\n self.count = ord('a')\n\n def __iter__(self):\n return self\n\n def next(self):\n if self.count > ord('z'):\n raise StopIteration\n self.count += 1\n return (chr(self.count - 1),) # this is a 1-tuple\n\ncon = sqlite3.connect(":memory:")\ncur = con.cursor()\ncur.execute("create table characters(c)")\n\ntheIter = IterChars()\ncur.executemany("insert into characters(c) values (?)", theIter)\n\ncur.execute("select c from characters")\nprint cur.fetchall()\n
Here’s a shorter example using a generator:
\nimport sqlite3\n\ndef char_generator():\n import string\n for c in string.letters[:26]:\n yield (c,)\n\ncon = sqlite3.connect(":memory:")\ncur = con.cursor()\ncur.execute("create table characters(c)")\n\ncur.executemany("insert into characters(c) values (?)", char_generator())\n\ncur.execute("select c from characters")\nprint cur.fetchall()\n
This is a nonstandard convenience method for executing multiple SQL statements\nat once. It issues a COMMIT statement first, then executes the SQL script it\ngets as a parameter.
\nsql_script can be a bytestring or a Unicode string.
\nExample:
\nimport sqlite3\n\ncon = sqlite3.connect(":memory:")\ncur = con.cursor()\ncur.executescript("""\n create table person(\n firstname,\n lastname,\n age\n );\n\n create table book(\n title,\n author,\n published\n );\n\n insert into book(title, author, published)\n values (\n 'Dirk Gently''s Holistic Detective Agency',\n 'Douglas Adams',\n 1987\n );\n """)\n
Fetches the next set of rows of a query result, returning a list. An empty\nlist is returned when no more rows are available.
\nThe number of rows to fetch per call is specified by the size parameter.\nIf it is not given, the cursor’s arraysize determines the number of rows\nto be fetched. The method should try to fetch as many rows as indicated by\nthe size parameter. If this is not possible due to the specified number of\nrows not being available, fewer rows may be returned.
\nNote there are performance considerations involved with the size parameter.\nFor optimal performance, it is usually best to use the arraysize attribute.\nIf the size parameter is used, then it is best for it to retain the same\nvalue from one fetchmany() call to the next.
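The interplay between the size parameter and arraysize can be sketched as follows (the table contents are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table t(x)")
cur.executemany("insert into t values (?)", [(i,) for i in range(5)])

cur.execute("select x from t order by x")
first = cur.fetchmany(2)          # explicit size
assert first == [(0,), (1,)]

cur.arraysize = 2                 # size defaults to cur.arraysize
assert cur.fetchmany() == [(2,), (3,)]

# fewer rows than requested may be returned near the end of the result set
assert cur.fetchmany(10) == [(4,)]
assert cur.fetchmany() == []      # exhausted: an empty list
```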
\nAlthough the Cursor class of the sqlite3 module implements this\nattribute, the database engine’s own support for the determination of “rows\naffected”/”rows selected” is quirky.
\nFor DELETE statements, SQLite reports rowcount as 0 if you make a\nDELETE FROM table without any condition.
\nFor executemany() statements, the number of modifications are summed up\ninto rowcount.
\nAs required by the Python DB API Spec, the rowcount attribute “is -1 in\ncase no executeXX() has been performed on the cursor or the rowcount of the\nlast operation is not determinable by the interface”.
This includes SELECT statements because we cannot determine the number of
rows a query produced until all rows have been fetched.
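The stable parts of this behavior can be demonstrated directly (checked on a current CPython; the SQLite-version-dependent quirks mentioned above may differ on your platform):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
assert cur.rowcount == -1        # nothing executed yet

cur.execute("create table t(x)")
cur.executemany("insert into t values (?)", [(1,), (2,), (3,)])
assert cur.rowcount == 3         # executemany() sums the modifications

cur.execute("select x from t")
assert cur.rowcount == -1        # not determinable for SELECT
```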
\nA Row instance serves as a highly optimized\nrow_factory for Connection objects.\nIt tries to mimic a tuple in most of its features.
\nIt supports mapping access by column name and index, iteration,\nrepresentation, equality testing and len().
\nIf two Row objects have exactly the same columns and their\nmembers are equal, they compare equal.
\n\nChanged in version 2.6: Added iteration and equality (hashability).
\nThis method returns a tuple of column names. Immediately after a query,\nit is the first member of each tuple in Cursor.description.
\n\nNew in version 2.6.
\nLet’s assume we initialize a table as in the example given above:
\nconn = sqlite3.connect(":memory:")\nc = conn.cursor()\nc.execute('''create table stocks\n(date text, trans text, symbol text,\n qty real, price real)''')\nc.execute("""insert into stocks\n values ('2006-01-05','BUY','RHAT',100,35.14)""")\nconn.commit()\nc.close()\n
Now we plug Row in:
\n>>> conn.row_factory = sqlite3.Row\n>>> c = conn.cursor()\n>>> c.execute('select * from stocks')\n<sqlite3.Cursor object at 0x7f4e7dd8fa80>\n>>> r = c.fetchone()\n>>> type(r)\n<type 'sqlite3.Row'>\n>>> r\n(u'2006-01-05', u'BUY', u'RHAT', 100.0, 35.14)\n>>> len(r)\n5\n>>> r[2]\nu'RHAT'\n>>> r.keys()\n['date', 'trans', 'symbol', 'qty', 'price']\n>>> r['qty']\n100.0\n>>> for member in r: print member\n...\n2006-01-05\nBUY\nRHAT\n100.0\n35.14\n
SQLite natively supports the following types: NULL, INTEGER,\nREAL, TEXT, BLOB.
\nThe following Python types can thus be sent to SQLite without any problem:
Python type | SQLite type
---|---
None | NULL
int | INTEGER
long | INTEGER
float | REAL
str (UTF8-encoded) | TEXT
unicode | TEXT
buffer | BLOB
This is how SQLite types are converted to Python types by default:
SQLite type | Python type
---|---
NULL | None
INTEGER | int or long, depending on size
REAL | float
TEXT | depends on text_factory, unicode by default
BLOB | buffer
The type system of the sqlite3 module is extensible in two ways: you can\nstore additional Python types in a SQLite database via object adaptation, and\nyou can let the sqlite3 module convert SQLite types to different Python\ntypes via converters.
\nAs described before, SQLite supports only a limited set of types natively. To\nuse other Python types with SQLite, you must adapt them to one of the\nsqlite3 module’s supported types for SQLite: one of NoneType, int, long, float,\nstr, unicode, buffer.
\nThe sqlite3 module uses Python object adaptation, as described in\nPEP 246 for this. The protocol to use is PrepareProtocol.
\nThere are two ways to enable the sqlite3 module to adapt a custom Python\ntype to one of the supported ones.
\nThis is a good approach if you write the class yourself. Let’s suppose you have\na class like this:
\nclass Point(object):\n def __init__(self, x, y):\n self.x, self.y = x, y\n
Now you want to store the point in a single SQLite column. First you’ll have to\nchoose one of the supported types first to be used for representing the point.\nLet’s just use str and separate the coordinates using a semicolon. Then you need\nto give your class a method __conform__(self, protocol) which must return\nthe converted value. The parameter protocol will be PrepareProtocol.
import sqlite3

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __conform__(self, protocol):
        if protocol is sqlite3.PrepareProtocol:
            return "%f;%f" % (self.x, self.y)

con = sqlite3.connect(":memory:")
cur = con.cursor()

p = Point(4.0, -3.2)
cur.execute("select ?", (p,))
print cur.fetchone()[0]
The other possibility is to create a function that converts the type to the\nstring representation and register the function with register_adapter().
\nNote
\nThe type/class to adapt must be a new-style class, i. e. it must have\nobject as one of its bases.
import sqlite3

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

def adapt_point(point):
    return "%f;%f" % (point.x, point.y)

sqlite3.register_adapter(Point, adapt_point)

con = sqlite3.connect(":memory:")
cur = con.cursor()

p = Point(4.0, -3.2)
cur.execute("select ?", (p,))
print cur.fetchone()[0]
The sqlite3 module has two default adapters for Python’s built-in\ndatetime.date and datetime.datetime types. Now let’s suppose\nwe want to store datetime.datetime objects not in ISO representation,\nbut as a Unix timestamp.
\nimport sqlite3\nimport datetime, time\n\ndef adapt_datetime(ts):\n return time.mktime(ts.timetuple())\n\nsqlite3.register_adapter(datetime.datetime, adapt_datetime)\n\ncon = sqlite3.connect(":memory:")\ncur = con.cursor()\n\nnow = datetime.datetime.now()\ncur.execute("select ?", (now,))\nprint cur.fetchone()[0]\n
Writing an adapter lets you send custom Python types to SQLite. But to make it\nreally useful we need to make the Python to SQLite to Python roundtrip work.
\nEnter converters.
\nLet’s go back to the Point class. We stored the x and y coordinates\nseparated via semicolons as strings in SQLite.
\nFirst, we’ll define a converter function that accepts the string as a parameter\nand constructs a Point object from it.
\nNote
\nConverter functions always get called with a string, no matter under which\ndata type you sent the value to SQLite.
\ndef convert_point(s):\n x, y = map(float, s.split(";"))\n return Point(x, y)\n
Now you need to make the sqlite3 module know that what you select from\nthe database is actually a point. There are two ways of doing this:
\nBoth ways are described in section Module functions and constants, in the entries\nfor the constants PARSE_DECLTYPES and PARSE_COLNAMES.
\nThe following example illustrates both approaches.
import sqlite3

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "(%f;%f)" % (self.x, self.y)

def adapt_point(point):
    return "%f;%f" % (point.x, point.y)

def convert_point(s):
    x, y = map(float, s.split(";"))
    return Point(x, y)

# Register the adapter
sqlite3.register_adapter(Point, adapt_point)

# Register the converter
sqlite3.register_converter("point", convert_point)

p = Point(4.0, -3.2)

#########################
# 1) Using declared types
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test(p point)")

cur.execute("insert into test(p) values (?)", (p,))
cur.execute("select p from test")
print "with declared types:", cur.fetchone()[0]
cur.close()
con.close()

#######################
# 2) Using column names
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
cur = con.cursor()
cur.execute("create table test(p)")

cur.execute("insert into test(p) values (?)", (p,))
cur.execute('select p as "p [point]" from test')
print "with column names:", cur.fetchone()[0]
cur.close()
con.close()
There are default adapters for the date and datetime types in the datetime\nmodule. They will be sent as ISO dates/ISO timestamps to SQLite.
\nThe default converters are registered under the name “date” for\ndatetime.date and under the name “timestamp” for\ndatetime.datetime.
\nThis way, you can use date/timestamps from Python without any additional\nfiddling in most cases. The format of the adapters is also compatible with the\nexperimental SQLite date/time functions.
\nThe following example demonstrates this.
\nimport sqlite3\nimport datetime\n\ncon = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES)\ncur = con.cursor()\ncur.execute("create table test(d date, ts timestamp)")\n\ntoday = datetime.date.today()\nnow = datetime.datetime.now()\n\ncur.execute("insert into test(d, ts) values (?, ?)", (today, now))\ncur.execute("select d, ts from test")\nrow = cur.fetchone()\nprint today, "=>", row[0], type(row[0])\nprint now, "=>", row[1], type(row[1])\n\ncur.execute('select current_date as "d [date]", current_timestamp as "ts [timestamp]"')\nrow = cur.fetchone()\nprint "current_date", row[0], type(row[0])\nprint "current_timestamp", row[1], type(row[1])\n
By default, the sqlite3 module opens transactions implicitly before a\nData Modification Language (DML) statement (i.e.\nINSERT/UPDATE/DELETE/REPLACE), and commits transactions\nimplicitly before a non-DML, non-query statement (i. e.\nanything other than SELECT or the aforementioned).
\nSo if you are within a transaction and issue a command like CREATE TABLE\n..., VACUUM, PRAGMA, the sqlite3 module will commit implicitly\nbefore executing that command. There are two reasons for doing that. The first\nis that some of these commands don’t work within transactions. The other reason\nis that sqlite3 needs to keep track of the transaction state (if a transaction\nis active or not).
\nYou can control which kind of BEGIN statements sqlite3 implicitly executes\n(or none at all) via the isolation_level parameter to the connect()\ncall, or via the isolation_level property of connections.
\nIf you want autocommit mode, then set isolation_level to None.
\nOtherwise leave it at its default, which will result in a plain “BEGIN”\nstatement, or set it to one of SQLite’s supported isolation levels: “DEFERRED”,\n“IMMEDIATE” or “EXCLUSIVE”.
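The difference between the default implicit-transaction mode and autocommit can be sketched like this (checked on a current CPython; the table name is invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")         # default: a plain BEGIN before DML
con.execute("create table t(x)")
con.execute("insert into t values (1)")   # opens a transaction implicitly
con.rollback()                            # the insert is discarded
assert con.execute("select count(*) from t").fetchone()[0] == 0

con.isolation_level = None                # autocommit mode
con.execute("insert into t values (1)")   # takes effect immediately
con.rollback()                            # nothing left to roll back
assert con.execute("select count(*) from t").fetchone()[0] == 1
```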
\nUsing the nonstandard execute(), executemany() and\nexecutescript() methods of the Connection object, your code can\nbe written more concisely because you don’t have to create the (often\nsuperfluous) Cursor objects explicitly. Instead, the Cursor\nobjects are created implicitly and these shortcut methods return the cursor\nobjects. This way, you can execute a SELECT statement and iterate over it\ndirectly using only a single call on the Connection object.
\nimport sqlite3\n\npersons = [\n ("Hugo", "Boss"),\n ("Calvin", "Klein")\n ]\n\ncon = sqlite3.connect(":memory:")\n\n# Create the table\ncon.execute("create table person(firstname, lastname)")\n\n# Fill the table\ncon.executemany("insert into person(firstname, lastname) values (?, ?)", persons)\n\n# Print the table contents\nfor row in con.execute("select firstname, lastname from person"):\n print row\n\n# Using a dummy WHERE clause to not let SQLite take the shortcut table deletes.\nprint "I just deleted", con.execute("delete from person where 1=1").rowcount, "rows"\n
One useful feature of the sqlite3 module is the built-in\nsqlite3.Row class designed to be used as a row factory.
\nRows wrapped with this class can be accessed both by index (like tuples) and\ncase-insensitively by name:
\nimport sqlite3\n\ncon = sqlite3.connect("mydb")\ncon.row_factory = sqlite3.Row\n\ncur = con.cursor()\ncur.execute("select name_last, age from people")\nfor row in cur:\n assert row[0] == row["name_last"]\n assert row["name_last"] == row["nAmE_lAsT"]\n assert row[1] == row["age"]\n assert row[1] == row["AgE"]\n
\nNew in version 2.6.
\nConnection objects can be used as context managers\nthat automatically commit or rollback transactions. In the event of an\nexception, the transaction is rolled back; otherwise, the transaction is\ncommitted:
\nimport sqlite3\n\ncon = sqlite3.connect(":memory:")\ncon.execute("create table person (id integer primary key, firstname varchar unique)")\n\n# Successful, con.commit() is called automatically afterwards\nwith con:\n con.execute("insert into person(firstname) values (?)", ("Joe",))\n\n# con.rollback() is called after the with block finishes with an exception, the\n# exception is still raised and must be caught\ntry:\n with con:\n con.execute("insert into person(firstname) values (?)", ("Joe",))\nexcept sqlite3.IntegrityError:\n print "couldn't add Joe twice"\n
Older SQLite versions had issues with sharing connections between threads.\nThat’s why the Python module disallows sharing connections and cursors between\nthreads. If you still try to do so, you will get an exception at runtime.
\nThe only exception is calling the interrupt() method, which\nonly makes sense to call from a different thread.
\nFootnotes
\n[1] | (1, 2) The sqlite3 module is not built with loadable extension support by\ndefault, because some platforms (notably Mac OS X) have SQLite libraries\nwhich are compiled without this feature. To get loadable extension support,\nyou must modify setup.py and remove the line that sets\nSQLITE_OMIT_LOAD_EXTENSION. |
\nNew in version 1.5.2.
\nSource code: Lib/netrc.py
\nThe netrc class parses and encapsulates the netrc file format used by\nthe Unix ftp program and other FTP clients.
\nA netrc instance has the following methods:
\nInstances of netrc have public instance variables:
\nNote
\nPasswords are limited to a subset of the ASCII character set. Versions of\nthis module prior to 2.3 were extremely limited. Starting with 2.3, all\nASCII punctuation is allowed in passwords. However, note that whitespace and\nnon-printable characters are not allowed in passwords. This is a limitation\nof the way the .netrc file is parsed and may be removed in the future.
\n\nNew in version 2.5.
\nSource code: Lib/hashlib.py
\nThis module implements a common interface to many different secure hash and\nmessage digest algorithms. Included are the FIPS secure hash algorithms SHA1,\nSHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA’s MD5\nalgorithm (defined in Internet RFC 1321). The terms secure hash and message\ndigest are interchangeable. Older algorithms were called message digests. The\nmodern term is secure hash.
\nNote
\nIf you want the adler32 or crc32 hash functions they are available in\nthe zlib module.
\nWarning
\nSome algorithms have known hash collision weaknesses, see the FAQ at the end.
\nThere is one constructor method named for each type of hash. All return\na hash object with the same simple interface. For example: use sha1() to\ncreate a SHA1 hash object. You can now feed this object with arbitrary strings\nusing the update() method. At any point you can ask it for the\ndigest of the concatenation of the strings fed to it so far using the\ndigest() or hexdigest() methods.
\nConstructors for hash algorithms that are always present in this module are\nmd5(), sha1(), sha224(), sha256(), sha384(), and\nsha512(). Additional algorithms may also be available depending upon the\nOpenSSL library that Python uses on your platform.
\nFor example, to obtain the digest of the string 'Nobody inspects the spammish\nrepetition':
\n>>> import hashlib\n>>> m = hashlib.md5()\n>>> m.update("Nobody inspects")\n>>> m.update(" the spammish repetition")\n>>> m.digest()\n'\\xbbd\\x9c\\x83\\xdd\\x1e\\xa5\\xc9\\xd9\\xde\\xc9\\xa1\\x8d\\xf0\\xff\\xe9'\n>>> m.digest_size\n16\n>>> m.block_size\n64\n
More condensed:
\n>>> hashlib.sha224("Nobody inspects the spammish repetition").hexdigest()\n'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'\n
A generic new() constructor that takes the string name of the desired\nalgorithm as its first parameter also exists to allow access to the above listed\nhashes as well as any other algorithms that your OpenSSL library may offer. The\nnamed constructors are much faster than new() and should be preferred.
\nUsing new() with an algorithm provided by OpenSSL:
\n>>> h = hashlib.new('ripemd160')\n>>> h.update("Nobody inspects the spammish repetition")\n>>> h.hexdigest()\n'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc'\n
This module provides the following constant attribute:
\nA tuple providing the names of the hash algorithms guaranteed to be\nsupported by this module.
\n\nNew in version 2.7.
\nThe following values are provided as constant attributes of the hash objects\nreturned by the constructors:
\nA hash object has the following methods:
\nUpdate the hash object with the string arg. Repeated calls are equivalent to\na single call with the concatenation of all the arguments: m.update(a);\nm.update(b) is equivalent to m.update(a+b).
\n\nChanged in version 2.7.
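The concatenation equivalence of repeated update() calls can be verified directly (bytes literals are used here so the sketch works on recent Python versions as well):

```python
import hashlib

m1 = hashlib.sha1()
m1.update(b"Nobody inspects")
m1.update(b" the spammish repetition")

# a single update with the concatenation yields the same digest
m2 = hashlib.sha1(b"Nobody inspects the spammish repetition")

assert m1.digest() == m2.digest()
assert m1.hexdigest() == m2.hexdigest()
```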
\nSee also
\n\nNew in version 2.2.
\nSource code: Lib/hmac.py
\nThis module implements the HMAC algorithm as described by RFC 2104.
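A minimal sketch of computing and verifying an HMAC tag (the key and message are invented for illustration; hmac.compare_digest is only available on recent versions):

```python
import hmac
import hashlib

key = b"secret-key"            # hypothetical shared secret
msg = b"attack at dawn"

mac = hmac.new(key, msg, hashlib.sha256)
tag = mac.hexdigest()

# verify by recomputing the tag for the same key and message
expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
assert hmac.compare_digest(tag, expected)
```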
\nAn HMAC object has the following methods:
\nSee also
\n\nChanged in version 2.6: This module was previously only available in the Mac-specific library, it is\nnow available for all platforms.
\nSource code: Lib/plistlib.py
\nThis module provides an interface for reading and writing the “property list”\nXML files used mainly by Mac OS X.
\nThe property list (.plist) file format is a simple XML pickle supporting\nbasic object types, like dictionaries, lists, numbers and strings. Usually the\ntop level object is a dictionary.
\nValues can be strings, integers, floats, booleans, tuples, lists, dictionaries\n(but only with string keys), Data or datetime.datetime\nobjects. String values (including dictionary keys) may be unicode strings –\nthey will be written out as UTF-8.
\nThe <data> plist type is supported through the Data class. This is\na thin wrapper around a Python string. Use Data if your strings\ncontain control characters.
\nSee also
\nThis module defines the following functions:
\nRead a plist file. pathOrFile may either be a file name or a (readable)\nfile object. Return the unpacked root object (which usually is a\ndictionary).
\nThe XML data is parsed using the Expat parser from xml.parsers.expat\n– see its documentation for possible exceptions on ill-formed XML.\nUnknown elements will simply be ignored by the plist parser.
\nWrite rootObject to a plist file. pathOrFile may either be a file name\nor a (writable) file object.
\nA TypeError will be raised if the object is of an unsupported type or\na container that contains objects of unsupported types.
\nRead a plist from the resource with type restype from the resource fork of\npath. Availability: Mac OS X.
\nNote
\nIn Python 3.x, this function has been removed.
\nWrite rootObject as a resource with type restype to the resource fork of\npath. Availability: Mac OS X.
\nNote
\nIn Python 3.x, this function has been removed.
\nThe following class is available:
\nReturn a “data” wrapper object around the string data. This is used in\nfunctions converting from/to plists to represent the <data> type\navailable in plists.
\nIt has one attribute, data, that can be used to retrieve the Python\nstring stored in it.
\nGenerating a plist:
\npl = dict(\n aString="Doodah",\n aList=["A", "B", 12, 32.1, [1, 2, 3]],\n aFloat = 0.1,\n anInt = 728,\n aDict=dict(\n anotherString="<hello & hi there!>",\n aUnicodeValue=u'M\\xe4ssig, Ma\\xdf',\n aTrueValue=True,\n aFalseValue=False,\n ),\n someData = Data("<binary gunk>"),\n someMoreData = Data("<lots of binary gunk>" * 10),\n aDate = datetime.datetime.fromtimestamp(time.mktime(time.gmtime())),\n)\n# unicode keys are possible, but a little awkward to use:\npl[u'\\xc5benraa'] = "That was a unicode key."\nwritePlist(pl, fileName)\n
Parsing a plist:
\npl = readPlist(pathOrFile)\nprint pl["aKey"]\n
Source code: Lib/xdrlib.py
\nThe xdrlib module supports the External Data Representation Standard as\ndescribed in RFC 1014, written by Sun Microsystems, Inc. June 1987. It\nsupports most of the data types described in the RFC.
\nThe xdrlib module defines two classes, one for packing variables into XDR\nrepresentation, and another for unpacking from XDR representation. There are\nalso two exception classes.
\nSee also
\nPacker instances have the following methods:
\nIn general, you can pack any of the most common XDR data types by calling the\nappropriate pack_type() method. Each method takes a single argument, the\nvalue to pack. The following simple data type packing methods are supported:\npack_uint(), pack_int(), pack_enum(), pack_bool(),\npack_uhyper(), and pack_hyper().
\nThe following methods support packing strings, bytes, and opaque data:
\nThe following methods support packing arrays and lists:
\nPacks a list of homogeneous items. This method is useful for lists with an\nindeterminate size; i.e. the size is not available until the entire list has\nbeen walked. For each item in the list, an unsigned integer 1 is packed\nfirst, followed by the data value from the list. pack_item is the function\nthat is called to pack the individual item. At the end of the list, an unsigned\ninteger 0 is packed.
\nFor example, to pack a list of integers, the code might appear like this:
\nimport xdrlib\np = xdrlib.Packer()\np.pack_list([1, 2, 3], p.pack_int)\n
The Unpacker class offers the following methods:
\nIn addition, every data type that can be packed with a Packer, can be\nunpacked with an Unpacker. Unpacking methods are of the form\nunpack_type(), and take no arguments. They return the unpacked object.
\nIn addition, the following methods unpack strings, bytes, and opaque data:
\nThe following methods support unpacking arrays and lists:
\nExceptions in this module are coded as class instances:
\nHere is an example of how you would catch one of these exceptions:
\nimport xdrlib\np = xdrlib.Packer()\ntry:\n p.pack_double(8.01)\nexcept xdrlib.ConversionError, instance:\n print 'packing the double failed:', instance.msg\n
\nDeprecated since version 2.5: Use the hashlib module instead.
\nThis module implements the interface to RSA’s MD5 message digest algorithm (see\nalso Internet RFC 1321). Its use is quite straightforward: use new()\nto create an md5 object. You can now feed this object with arbitrary strings\nusing the update() method, and at any point you can ask it for the\ndigest (a strong kind of 128-bit checksum, a.k.a. “fingerprint”) of the\nconcatenation of the strings fed to it so far using the digest() method.
\nFor example, to obtain the digest of the string 'Nobody inspects the spammish\nrepetition':
\n>>> import md5\n>>> m = md5.new()\n>>> m.update("Nobody inspects")\n>>> m.update(" the spammish repetition")\n>>> m.digest()\n'\\xbbd\\x9c\\x83\\xdd\\x1e\\xa5\\xc9\\xd9\\xde\\xc9\\xa1\\x8d\\xf0\\xff\\xe9'\n
More condensed:
\n>>> md5.new("Nobody inspects the spammish repetition").digest()\n'\\xbbd\\x9c\\x83\\xdd\\x1e\\xa5\\xc9\\xd9\\xde\\xc9\\xa1\\x8d\\xf0\\xff\\xe9'\n
The following values are provided as constants in the module and as attributes\nof the md5 objects returned by new():
\nThe md5 module provides the following functions:
\nAn md5 object has the following methods:
\nSee also
\n\nDeprecated since version 2.5: Use the hashlib module instead.
\nThis module implements the interface to NIST’s secure hash algorithm, known as\nSHA-1. SHA-1 is an improved version of the original SHA hash algorithm. It is\nused in the same way as the md5 module: use new() to create an sha\nobject, then feed this object with arbitrary strings using the update()\nmethod, and at any point you can ask it for the digest of the\nconcatenation of the strings fed to it so far. SHA-1 digests are 160 bits\ninstead of MD5’s 128 bits.
\nThe following values are provided as constants in the module and as attributes\nof the sha objects returned by new():
\nAn sha object has the same methods as md5 objects:
\nSee also
\nThis module provides various time-related functions. For related\nfunctionality, see also the datetime and calendar modules.
\nAlthough this module is always available,\nnot all functions are available on all platforms. Most of the functions\ndefined in this module call platform C library functions with the same name. It\nmay sometimes be helpful to consult the platform documentation, because the\nsemantics of these functions varies among platforms.
\nAn explanation of some terminology and conventions is in order.
\nDST is Daylight Saving Time, an adjustment of the timezone by (usually) one\nhour during part of the year. DST rules are magic (determined by local law) and\ncan change from year to year. The C library has a table containing the local\nrules (often it is read from a system file for flexibility) and is the only\nsource of True Wisdom in this respect.
\nThe precision of the various real-time functions may be less than suggested by\nthe units in which their value or argument is expressed. E.g. on most Unix\nsystems, the clock “ticks” only 50 or 100 times a second.
\nOn the other hand, the precision of time() and sleep() is better\nthan their Unix equivalents: times are expressed as floating point numbers,\ntime() returns the most accurate time available (using Unix\ngettimeofday() where available), and sleep() will accept a time\nwith a nonzero fraction (Unix select() is used to implement this, where\navailable).
\nThe time value as returned by gmtime(), localtime(), and\nstrptime(), and accepted by asctime(), mktime() and\nstrftime(), may be considered as a sequence of 9 integers. The return\nvalues of gmtime(), localtime(), and strptime() also offer\nattribute names for individual fields.
\nSee struct_time for a description of these objects.
\n\nChanged in version 2.2: The time value sequence was changed from a tuple to a struct_time, with\nthe addition of attribute names for the fields.
\nUse the following functions to convert between time representations:
From | To | Use
---|---|---
seconds since the epoch | struct_time in UTC | gmtime()
seconds since the epoch | struct_time in local time | localtime()
struct_time in UTC | seconds since the epoch | calendar.timegm()
struct_time in local time | seconds since the epoch | mktime()
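A round-trip between the two representations can be sketched as follows; calendar.timegm() is the UTC inverse of gmtime(), and mktime() is the local-time inverse of localtime():

```python
import calendar
import time

now = int(time.time())          # seconds since the epoch

utc = time.gmtime(now)          # -> struct_time in UTC
local = time.localtime(now)     # -> struct_time in local time

# Inverse conversions, back to seconds since the epoch:
back_from_utc = calendar.timegm(utc)        # inverse of gmtime()
back_from_local = int(time.mktime(local))   # inverse of localtime()
```

Note that mktime() can be off by an hour around a DST transition when the local time is ambiguous.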
The module defines the following functions and data items:
\nConvert a tuple or struct_time representing a time as returned by\ngmtime() or localtime() to a 24-character string of the following\nform: 'Sun Jun 20 23:21:05 1993'. If t is not provided, the current time\nas returned by localtime() is used. Locale information is not used by\nasctime().
\nNote
\nUnlike the C function of the same name, there is no trailing newline.
\n\nChanged in version 2.1: Allowed t to be omitted.
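A short sketch: asctime() formats the fields directly (including the supplied weekday), so a hand-built 9-tuple works as well as a struct_time:

```python
import time

# struct_time fields: (year, mon, mday, hour, min, sec, wday, yday, isdst)
t = (1993, 6, 20, 23, 21, 5, 6, 171, -1)   # wday 6 == Sunday
s = time.asctime(t)                        # 'Sun Jun 20 23:21:05 1993'
```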
\nOn Unix, return the current processor time as a floating point number expressed\nin seconds. The precision, and in fact the very definition of the meaning of\n“processor time”, depends on that of the C function of the same name, but in any\ncase, this is the function to use for benchmarking Python or timing algorithms.
\nOn Windows, this function returns wall-clock seconds elapsed since the first\ncall to this function, as a floating point number, based on the Win32 function\nQueryPerformanceCounter(). The resolution is typically better than one\nmicrosecond.
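A benchmarking sketch along these lines is shown below. Note that the fallback to time.perf_counter() is an adaptation for interpreters where clock() is unavailable (it was removed in Python 3.8) and is not part of the module description above:

```python
import time

# Prefer time.clock() where it exists; otherwise fall back to
# perf_counter() so the sketch stays runnable on modern interpreters.
clock = getattr(time, 'clock', None) or time.perf_counter

start = clock()
total = sum(i * i for i in range(100000))   # the code being timed
elapsed = clock() - start                   # processor (or high-resolution) seconds
```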
\nConvert a time expressed in seconds since the epoch to a string representing\nlocal time. If secs is not provided or None, the current time as\nreturned by time() is used. ctime(secs) is equivalent to\nasctime(localtime(secs)). Locale information is not used by ctime().
\n\nChanged in version 2.1: Allowed secs to be omitted.
\n\nChanged in version 2.4: If secs is None, the current time is used.
\nConvert a time expressed in seconds since the epoch to a struct_time in\nUTC in which the dst flag is always zero. If secs is not provided or\nNone, the current time as returned by time() is used. Fractions\nof a second are ignored. See above for a description of the\nstruct_time object. See calendar.timegm() for the inverse of this\nfunction.
\n\nChanged in version 2.1: Allowed secs to be omitted.
\n\nChanged in version 2.4: If secs is None, the current time is used.
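The epoch itself makes a convenient fixed point for a quick sketch; on POSIX systems it is 1970-01-01 00:00:00 UTC:

```python
import time

t = time.gmtime(0)                 # the epoch, in UTC
epoch = (t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour)
# epoch == (1970, 1, 1, 0); tm_isdst is always 0 for gmtime()
```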
\nLike gmtime() but converts to local time. If secs is not provided or\nNone, the current time as returned by time() is used. The dst\nflag is set to 1 when DST applies to the given time.
\n\nChanged in version 2.1: Allowed secs to be omitted.
\n\nChanged in version 2.4: If secs is None, the current time is used.
\nConvert a tuple or struct_time representing a time as returned by\ngmtime() or localtime() to a string as specified by the format\nargument. If t is not provided, the current time as returned by\nlocaltime() is used. format must be a string. ValueError is\nraised if any field in t is outside of the allowed range.
\n\nChanged in version 2.1: Allowed t to be omitted.
\n\nChanged in version 2.4: ValueError raised if a field in t is out of range.
Changed in version 2.5: 0 is now a legal argument for any position in the time tuple; if it is normally illegal the value is forced to a correct one.
\nThe following directives can be embedded in the format string. They are shown\nwithout the optional field width and precision specification, and are replaced\nby the indicated characters in the strftime() result:
Directive | Meaning | Notes
---|---|---
%a | Locale’s abbreviated weekday name. |
%A | Locale’s full weekday name. |
%b | Locale’s abbreviated month name. |
%B | Locale’s full month name. |
%c | Locale’s appropriate date and time representation. |
%d | Day of the month as a decimal number [01,31]. |
%H | Hour (24-hour clock) as a decimal number [00,23]. |
%I | Hour (12-hour clock) as a decimal number [01,12]. |
%j | Day of the year as a decimal number [001,366]. |
%m | Month as a decimal number [01,12]. |
%M | Minute as a decimal number [00,59]. |
%p | Locale’s equivalent of either AM or PM. | (1)
%S | Second as a decimal number [00,61]. | (2)
%U | Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0. | (3)
%w | Weekday as a decimal number [0(Sunday),6]. |
%W | Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0. | (3)
%x | Locale’s appropriate date representation. |
%X | Locale’s appropriate time representation. |
%y | Year without century as a decimal number [00,99]. |
%Y | Year with century as a decimal number. |
%Z | Time zone name (no characters if no time zone exists). |
%% | A literal '%' character. |
Notes:

1. When used with the strptime() function, the %p directive only affects the output hour field if the %I directive is used to parse the hour.
2. The range really is 0 to 61; this accounts for leap seconds and the (very rare) double leap seconds.
3. When used with the strptime() function, %U and %W are only used in calculations when the day of the week and the year are specified.
\nHere is an example, a format for dates compatible with that specified in the\nRFC 2822 Internet email standard. [1]
\n>>> from time import gmtime, strftime\n>>> strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime())\n'Thu, 28 Jun 2001 14:17:15 +0000'\n
Additional directives may be supported on certain platforms, but only the ones\nlisted here have a meaning standardized by ANSI C.
\nOn some platforms, an optional field width and precision specification can\nimmediately follow the initial '%' of a directive in the following order;\nthis is also not portable. The field width is normally 2 except for %j where\nit is 3.
\nParse a string representing a time according to a format. The return value is\na struct_time as returned by gmtime() or localtime().
\nThe format parameter uses the same directives as those used by\nstrftime(); it defaults to "%a %b %d %H:%M:%S %Y" which matches the\nformatting returned by ctime(). If string cannot be parsed according to\nformat, or if it has excess data after parsing, ValueError is raised.\nThe default values used to fill in any missing data when more accurate values\ncannot be inferred are (1900, 1, 1, 0, 0, 0, 0, 1, -1).
\nFor example:
\n>>> import time\n>>> time.strptime("30 Nov 00", "%d %b %y") # doctest: +NORMALIZE_WHITESPACE\ntime.struct_time(tm_year=2000, tm_mon=11, tm_mday=30, tm_hour=0, tm_min=0,\n tm_sec=0, tm_wday=3, tm_yday=335, tm_isdst=-1)\n
Support for the %Z directive is based on the values contained in tzname\nand whether daylight is true. Because of this, it is platform-specific\nexcept for recognizing UTC and GMT which are always known (and are considered to\nbe non-daylight savings timezones).
\nOnly the directives specified in the documentation are supported. Because\nstrftime() is implemented per platform it can sometimes offer more\ndirectives than those listed. But strptime() is independent of any platform\nand thus does not necessarily support all directives available that are not\ndocumented as supported.
\nThe type of the time value sequence returned by gmtime(),\nlocaltime(), and strptime(). It is an object with a named\ntuple interface: values can be accessed by index and by attribute name. The\nfollowing values are present:
Index | Attribute | Values
---|---|---
0 | tm_year | (for example, 1993)
1 | tm_mon | range [1, 12]
2 | tm_mday | range [1, 31]
3 | tm_hour | range [0, 23]
4 | tm_min | range [0, 59]
5 | tm_sec | range [0, 61]; see (2) in strftime() description
6 | tm_wday | range [0, 6], Monday is 0
7 | tm_yday | range [1, 366]
8 | tm_isdst | 0, 1 or -1; see below
\nNew in version 2.2.
Note that unlike the C structure, the month value is a range of [1, 12], not [0, 11]. A year value will be handled as described under Year 2000 (Y2K) issues above. Passing -1 as the daylight savings flag to mktime() will usually result in the correct daylight savings state being filled in.

A TypeError is raised when a tuple with an incorrect length, or one with elements of the wrong type, is passed to a function expecting a struct_time.
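The named tuple interface means index access and attribute access are interchangeable, as in this quick sketch:

```python
import time

t = time.gmtime(0)               # the epoch, in UTC
# Index access and attribute access refer to the same fields.
same = (t[0] == t.tm_year)       # True; both are 1970
weekday = t.tm_wday              # 3: 1970-01-01 was a Thursday (Monday is 0)
```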
\nResets the time conversion rules used by the library routines. The environment\nvariable TZ specifies how this is done.
\n\nNew in version 2.3.
\nAvailability: Unix.
\nNote
\nAlthough in many cases, changing the TZ environment variable may\naffect the output of functions like localtime() without calling\ntzset(), this behavior should not be relied on.
\nThe TZ environment variable should contain no whitespace.
\nThe standard format of the TZ environment variable is (whitespace\nadded for clarity):
\nstd offset [dst [offset [,start[/time], end[/time]]]]
\nWhere the components are:
Indicates when to change to and back from DST. The format of the start and end dates is one of the following:
\ntime has the same format as offset except that no leading sign\n(‘-‘ or ‘+’) is allowed. The default, if time is not given, is 02:00:00.
\n>>> os.environ['TZ'] = 'EST+05EDT,M4.1.0,M10.5.0'\n>>> time.tzset()\n>>> time.strftime('%X %x %Z')\n'02:07:36 05/08/03 EDT'\n>>> os.environ['TZ'] = 'AEST-10AEDT-11,M10.5.0,M3.5.0'\n>>> time.tzset()\n>>> time.strftime('%X %x %Z')\n'16:08:12 05/08/03 AEST'\n
On many Unix systems (including *BSD, Linux, Solaris, and Darwin), it is more convenient to use the system’s zoneinfo (tzfile(5)) database to specify the timezone rules. To do this, set the TZ environment variable to the path of the required timezone datafile, relative to the root of the system’s ‘zoneinfo’ timezone database, usually located at /usr/share/zoneinfo. For example, 'US/Eastern', 'Australia/Melbourne', 'Egypt' or 'Europe/Amsterdam'.
\n>>> os.environ['TZ'] = 'US/Eastern'\n>>> time.tzset()\n>>> time.tzname\n('EST', 'EDT')\n>>> os.environ['TZ'] = 'Egypt'\n>>> time.tzset()\n>>> time.tzname\n('EET', 'EEST')\n
\nFootnotes
\n[1] | The use of %Z is now deprecated, but the %z escape that expands to the\npreferred hour/minute offset is not supported by all ANSI C libraries. Also, a\nstrict reading of the original 1982 RFC 822 standard calls for a two-digit\nyear (%y rather than %Y), but practice moved to 4-digit years long before the\nyear 2000. After that, RFC 822 became obsolete and the 4-digit year has\nbeen first recommended by RFC 1123 and then mandated by RFC 2822. |
\nNew in version 2.6.
\nThe io module provides the Python interfaces to stream handling.\nUnder Python 2.x, this is proposed as an alternative to the built-in\nfile object, but in Python 3.x it is the default interface to\naccess files and streams.
\nNote
\nSince this module has been designed primarily for Python 3.x, you have to\nbe aware that all uses of “bytes” in this document refer to the\nstr type (of which bytes is an alias), and all uses\nof “text” refer to the unicode type. Furthermore, those two\ntypes are not interchangeable in the io APIs.
\nAt the top of the I/O hierarchy is the abstract base class IOBase. It\ndefines the basic interface to a stream. Note, however, that there is no\nseparation between reading and writing to streams; implementations are allowed\nto raise an IOError if they do not support a given operation.
\nExtending IOBase is RawIOBase which deals simply with the\nreading and writing of raw bytes to a stream. FileIO subclasses\nRawIOBase to provide an interface to files in the machine’s\nfile system.
\nBufferedIOBase deals with buffering on a raw byte stream\n(RawIOBase). Its subclasses, BufferedWriter,\nBufferedReader, and BufferedRWPair buffer streams that are\nreadable, writable, and both readable and writable.\nBufferedRandom provides a buffered interface to random access\nstreams. BytesIO is a simple stream of in-memory bytes.
\nAnother IOBase subclass, TextIOBase, deals with\nstreams whose bytes represent text, and handles encoding and decoding\nfrom and to unicode strings. TextIOWrapper, which extends\nit, is a buffered text interface to a buffered raw stream\n(BufferedIOBase). Finally, StringIO is an in-memory\nstream for unicode text.
\nArgument names are not part of the specification, and only the arguments of\nopen() are intended to be used as keyword arguments.
\nOpen file and return a corresponding stream. If the file cannot be opened,\nan IOError is raised.
\nfile is either a string giving the pathname (absolute or\nrelative to the current working directory) of the file to be opened or\nan integer file descriptor of the file to be wrapped. (If a file descriptor\nis given, it is closed when the returned I/O object is closed, unless\nclosefd is set to False.)
\nmode is an optional string that specifies the mode in which the file is\nopened. It defaults to 'r' which means open for reading in text mode.\nOther common values are 'w' for writing (truncating the file if it\nalready exists), and 'a' for appending (which on some Unix systems,\nmeans that all writes append to the end of the file regardless of the\ncurrent seek position). In text mode, if encoding is not specified the\nencoding used is platform dependent. (For reading and writing raw bytes use\nbinary mode and leave encoding unspecified.) The available modes are:
Character | Meaning
---|---
'r' | open for reading (default)
'w' | open for writing, truncating the file first
'a' | open for writing, appending to the end of the file if it exists
'b' | binary mode
't' | text mode (default)
'+' | open a disk file for updating (reading and writing)
'U' | universal newline mode (for backwards compatibility; should not be used in new code)
The default mode is 'rt' (open for reading text). For binary random\naccess, the mode 'w+b' opens and truncates the file to 0 bytes, while\n'r+b' opens the file without truncation.
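A short sketch of text mode versus binary mode; the file name here is a throwaway temporary path chosen for the example:

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Text mode: unicode in, encoded bytes on disk.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'caf\xe9')

# Binary mode: raw bytes, no decoding.
with io.open(path, 'rb') as f:
    raw = f.read()
# raw == b'caf\xc3\xa9', the UTF-8 encoding of the text
```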
\nPython distinguishes between files opened in binary and text modes, even when\nthe underlying operating system doesn’t. Files opened in binary mode\n(including 'b' in the mode argument) return contents as bytes\nobjects without any decoding. In text mode (the default, or when 't' is\nincluded in the mode argument), the contents of the file are returned as\nunicode strings, the bytes having been first decoded using a\nplatform-dependent encoding or using the specified encoding if given.
\nbuffering is an optional integer used to set the buffering policy.\nPass 0 to switch buffering off (only allowed in binary mode), 1 to select\nline buffering (only usable in text mode), and an integer > 1 to indicate\nthe size of a fixed-size chunk buffer. When no buffering argument is\ngiven, the default buffering policy works as follows:
\nencoding is the name of the encoding used to decode or encode the file.\nThis should only be used in text mode. The default encoding is platform\ndependent (whatever locale.getpreferredencoding() returns), but any\nencoding supported by Python can be used. See the codecs module for\nthe list of supported encodings.
\nerrors is an optional string that specifies how encoding and decoding\nerrors are to be handled–this cannot be used in binary mode. Pass\n'strict' to raise a ValueError exception if there is an encoding\nerror (the default of None has the same effect), or pass 'ignore' to\nignore errors. (Note that ignoring encoding errors can lead to data loss.)\n'replace' causes a replacement marker (such as '?') to be inserted\nwhere there is malformed data. When writing, 'xmlcharrefreplace'\n(replace with the appropriate XML character reference) or\n'backslashreplace' (replace with backslashed escape sequences) can be\nused. Any other error handling name that has been registered with\ncodecs.register_error() is also valid.
newline controls how universal newlines works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows: on input, if newline is None, universal newlines mode is enabled, so lines may end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller; if it is '', universal newlines mode is enabled but line endings are returned untranslated; if it has any of the other legal values, input lines are terminated only by the given string, and the line ending is returned untranslated. On output, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep; if newline is '', no translation takes place; if newline is any of the other legal values, any '\n' characters written are translated to the given string.
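The output-side translation can be sketched with an in-memory stream, wrapping a BytesIO in a TextIOWrapper with an explicit newline:

```python
import io

buf = io.BytesIO()
# With newline='\r\n', every '\n' written by the caller is translated
# to '\r\n' in the encoded output.
text = io.TextIOWrapper(buf, encoding='ascii', newline='\r\n')
text.write(u'one\ntwo\n')
text.flush()
encoded = buf.getvalue()   # b'one\r\ntwo\r\n'
```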
\nIf closefd is False and a file descriptor rather than a filename was\ngiven, the underlying file descriptor will be kept open when the file is\nclosed. If a filename is given closefd has no effect and must be True\n(the default).
\nThe type of file object returned by the open() function depends on the\nmode. When open() is used to open a file in a text mode ('w',\n'r', 'wt', 'rt', etc.), it returns a subclass of\nTextIOBase (specifically TextIOWrapper). When used to open\na file in a binary mode with buffering, the returned class is a subclass of\nBufferedIOBase. The exact class varies: in read binary mode, it\nreturns a BufferedReader; in write binary and append binary modes,\nit returns a BufferedWriter, and in read/write mode, it returns a\nBufferedRandom. When buffering is disabled, the raw stream, a\nsubclass of RawIOBase, FileIO, is returned.
It is also possible to use a unicode or bytes string as a file for both reading and writing. For unicode strings StringIO can be used like a file opened in text mode, and for bytes a BytesIO can be used like a file opened in binary mode.
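A minimal sketch of the two in-memory variants:

```python
import io

b = io.BytesIO(b'binary data')     # behaves like a binary-mode file
first = b.read(6)                  # b'binary'

s = io.StringIO(u'text data')      # behaves like a text-mode file
word = s.read(4)                   # u'text'
```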
\nError raised when blocking would occur on a non-blocking stream. It inherits\nIOError.
\nIn addition to those of IOError, BlockingIOError has one\nattribute:
\nThe abstract base class for all I/O classes, acting on streams of bytes.\nThere is no public constructor.
\nThis class provides empty abstract implementations for many methods\nthat derived classes can override selectively; the default\nimplementations represent a file that cannot be read, written or\nseeked.
Even though IOBase does not declare read(), readinto(), or write() because their signatures will vary, implementations and clients should consider those methods part of the interface. Also, implementations may raise an IOError when operations they do not support are called.
\nThe basic type used for binary data read from or written to a file is\nbytes (also known as str). bytearrays are\naccepted too, and in some cases (such as readinto) required.\nText I/O classes work with unicode data.
\nNote that calling any method (even inquiries) on a closed stream is\nundefined. Implementations may raise IOError in this case.
\nIOBase (and its subclasses) support the iterator protocol, meaning that an\nIOBase object can be iterated over yielding the lines in a stream.\nLines are defined slightly differently depending on whether the stream is\na binary stream (yielding bytes), or a text stream (yielding\nunicode strings). See readline() below.
\nIOBase is also a context manager and therefore supports the\nwith statement. In this example, file is closed after the\nwith statement’s suite is finished—even if an exception occurs:
\nwith io.open('spam.txt', 'w') as file:\n file.write(u'Spam and eggs!')\n
IOBase provides these data attributes and methods:
\nFlush and close this stream. This method has no effect if the file is\nalready closed. Once the file is closed, any operation on the file\n(e.g. reading or writing) will raise a ValueError.
\nAs a convenience, it is allowed to call this method more than once;\nonly the first call, however, will have an effect.
\nRead and return one line from the stream. If limit is specified, at\nmost limit bytes will be read.
\nThe line terminator is always b'\\n' for binary files; for text files,\nthe newlines argument to open() can be used to select the line\nterminator(s) recognized.
Change the stream position to the given byte offset. offset is interpreted relative to the position indicated by whence. Values for whence are:

SEEK_SET or 0: start of the stream (the default); offset should be zero or positive.
SEEK_CUR or 1: current stream position; offset may be negative.
SEEK_END or 2: end of the stream; offset is usually negative.
\nReturn the new absolute position.
\n\nNew in version 2.7: The SEEK_* constants
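The three whence modes can be sketched on an in-memory stream:

```python
import io

f = io.BytesIO(b'abcdef')
f.seek(2)                  # absolute: position 2
f.seek(2, io.SEEK_CUR)     # relative to current: position 4
f.seek(-1, io.SEEK_END)    # relative to the end: position 5
pos = f.tell()             # 5
rest = f.read()            # b'f'
```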
\nBase class for raw binary I/O. It inherits IOBase. There is no\npublic constructor.
\nRaw binary I/O typically provides low-level access to an underlying OS\ndevice or API, and does not try to encapsulate it in high-level primitives\n(this is left to Buffered I/O and Text I/O, described later in this page).
\nIn addition to the attributes and methods from IOBase,\nRawIOBase provides the following methods:
\nRead up to n bytes from the object and return them. As a convenience,\nif n is unspecified or -1, readall() is called. Otherwise,\nonly one system call is ever made. Fewer than n bytes may be\nreturned if the operating system call returns fewer than n bytes.
\nIf 0 bytes are returned, and n was not 0, this indicates end of file.\nIf the object is in non-blocking mode and no bytes are available,\nNone is returned.
\nBase class for binary streams that support some kind of buffering.\nIt inherits IOBase. There is no public constructor.
The main difference from RawIOBase is that methods read(), readinto() and write() will try (respectively) to read as much input as requested or to consume all given output, at the expense of making perhaps more than one system call.
\nIn addition, those methods can raise BlockingIOError if the\nunderlying raw stream is in non-blocking mode and cannot take or give\nenough data; unlike their RawIOBase counterparts, they will\nnever return None.
\nBesides, the read() method does not have a default\nimplementation that defers to readinto().
\nA typical BufferedIOBase implementation should not inherit from a\nRawIOBase implementation, but wrap one, like\nBufferedWriter and BufferedReader do.
\nBufferedIOBase provides or overrides these methods and attribute in\naddition to those from IOBase:
\nSeparate the underlying raw stream from the buffer and return it.
\nAfter the raw stream has been detached, the buffer is in an unusable\nstate.
\nSome buffers, like BytesIO, do not have the concept of a single\nraw stream to return from this method. They raise\nUnsupportedOperation.
\n\nNew in version 2.7.
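A quick sketch: detach() hands back the very object the buffer was wrapping (here a BytesIO standing in for a raw stream):

```python
import io

raw = io.BytesIO(b'payload')
buffered = io.BufferedReader(raw)
recovered = buffered.detach()   # the buffer is unusable afterwards
```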
\nRead and return up to n bytes. If the argument is omitted, None, or\nnegative, data is read and returned until EOF is reached. An empty bytes\nobject is returned if the stream is already at EOF.
\nIf the argument is positive, and the underlying raw stream is not\ninteractive, multiple raw reads may be issued to satisfy the byte count\n(unless EOF is reached first). But for interactive raw streams, at most\none raw read will be issued, and a short result does not imply that EOF is\nimminent.
A BlockingIOError is raised if the underlying raw stream is in non-blocking mode, and has no data available at the moment.
\nRead up to len(b) bytes into bytearray b and return the number of bytes\nread.
\nLike read(), multiple reads may be issued to the underlying raw\nstream, unless the latter is ‘interactive’.
A BlockingIOError is raised if the underlying raw stream is in non-blocking mode, and has no data available at the moment.
Write the given bytes or bytearray object, b, and return the number of bytes written (never less than len(b), since if the write fails an IOError will be raised). Depending on the actual implementation, these bytes may be readily written to the underlying stream, or held in a buffer for performance and latency reasons.
\nWhen in non-blocking mode, a BlockingIOError is raised if the\ndata needed to be written to the raw stream but it couldn’t accept\nall the data without blocking.
\nFileIO represents an OS-level file containing bytes data.\nIt implements the RawIOBase interface (and therefore the\nIOBase interface, too).
The name can be one of two things: a string giving the name (and the path, if the file isn’t in the current working directory) of the file to be opened, or an integer file descriptor of the file to be wrapped.
\nThe mode can be 'r', 'w' or 'a' for reading (default), writing,\nor appending. The file will be created if it doesn’t exist when opened for\nwriting or appending; it will be truncated when opened for writing. Add a\n'+' to the mode to allow simultaneous reading and writing.
\nThe read() (when called with a positive argument), readinto()\nand write() methods on this class will only make one system call.
\nIn addition to the attributes and methods from IOBase and\nRawIOBase, FileIO provides the following data\nattributes and methods:
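A brief sketch of raw, unbuffered file access through FileIO; the path is a throwaway temporary location chosen for the example:

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'raw.bin')

f = io.FileIO(path, 'w')          # raw, OS-level file, opened for writing
f.write(b'\x00\x01\x02')
f.close()

g = io.FileIO(path)               # mode 'r' is the default
data = g.read()                   # b'\x00\x01\x02'
g.close()
```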
\nBuffered I/O streams provide a higher-level interface to an I/O device\nthan raw I/O does.
\nA stream implementation using an in-memory bytes buffer. It inherits\nBufferedIOBase.
The argument initial_bytes is an optional initial bytes object.
\nBytesIO provides or overrides these methods in addition to those\nfrom BufferedIOBase and IOBase:
\nA buffer providing higher-level access to a readable, sequential\nRawIOBase object. It inherits BufferedIOBase.\nWhen reading data from this object, a larger amount of data may be\nrequested from the underlying raw stream, and kept in an internal buffer.\nThe buffered data can then be returned directly on subsequent reads.
\nThe constructor creates a BufferedReader for the given readable\nraw stream and buffer_size. If buffer_size is omitted,\nDEFAULT_BUFFER_SIZE is used.
\nBufferedReader provides or overrides these methods in addition to\nthose from BufferedIOBase and IOBase:
A buffer providing higher-level access to a writeable, sequential RawIOBase object. It inherits BufferedIOBase. When writing to this object, data is normally held in an internal buffer. The buffer will be written out to the underlying RawIOBase object under various conditions, including:
\nThe constructor creates a BufferedWriter for the given writeable\nraw stream. If the buffer_size is not given, it defaults to\nDEFAULT_BUFFER_SIZE.
\nA third argument, max_buffer_size, is supported, but unused and deprecated.
\nBufferedWriter provides or overrides these methods in addition to\nthose from BufferedIOBase and IOBase:
\nA buffered interface to random access streams. It inherits\nBufferedReader and BufferedWriter, and further supports\nseek() and tell() functionality.
\nThe constructor creates a reader and writer for a seekable raw stream, given\nin the first argument. If the buffer_size is omitted it defaults to\nDEFAULT_BUFFER_SIZE.
\nA third argument, max_buffer_size, is supported, but unused and deprecated.
\nBufferedRandom is capable of anything BufferedReader or\nBufferedWriter can do.
\nA buffered I/O object combining two unidirectional RawIOBase\nobjects – one readable, the other writeable – into a single bidirectional\nendpoint. It inherits BufferedIOBase.
\nreader and writer are RawIOBase objects that are readable and\nwriteable respectively. If the buffer_size is omitted it defaults to\nDEFAULT_BUFFER_SIZE.
\nA fourth argument, max_buffer_size, is supported, but unused and\ndeprecated.
\nBufferedRWPair implements all of BufferedIOBase‘s methods\nexcept for detach(), which raises\nUnsupportedOperation.
\nWarning
\nBufferedRWPair does not attempt to synchronize accesses to\nits underlying raw streams. You should not pass it the same object\nas reader and writer; use BufferedRandom instead.
Base class for text streams. This class provides a unicode character and line based interface to stream I/O. There is no readinto() method because Python’s unicode strings are immutable. It inherits IOBase. There is no public constructor.
\nTextIOBase provides or overrides these data attributes and\nmethods in addition to those from IOBase:
\nSeparate the underlying binary buffer from the TextIOBase and\nreturn it.
\nAfter the underlying buffer has been detached, the TextIOBase is\nin an unusable state.
\nSome TextIOBase implementations, like StringIO, may not\nhave the concept of an underlying buffer and calling this method will\nraise UnsupportedOperation.
\n\nNew in version 2.7.
\nA buffered text stream over a BufferedIOBase binary stream.\nIt inherits TextIOBase.
\nencoding gives the name of the encoding that the stream will be decoded or\nencoded with. It defaults to locale.getpreferredencoding().
\nerrors is an optional string that specifies how encoding and decoding\nerrors are to be handled. Pass 'strict' to raise a ValueError\nexception if there is an encoding error (the default of None has the same\neffect), or pass 'ignore' to ignore errors. (Note that ignoring encoding\nerrors can lead to data loss.) 'replace' causes a replacement marker\n(such as '?') to be inserted where there is malformed data. When\nwriting, 'xmlcharrefreplace' (replace with the appropriate XML character\nreference) or 'backslashreplace' (replace with backslashed escape\nsequences) can be used. Any other error handling name that has been\nregistered with codecs.register_error() is also valid.
newline can be None, '', '\n', '\r', or '\r\n'. It controls the handling of line endings. If it is None, universal newlines is enabled. With this enabled, on input, the line endings '\n', '\r', or '\r\n' are translated to '\n' before being returned to the caller. Conversely, on output, '\n' is translated to the system default line separator, os.linesep. If newline is any of the other legal values, input lines are terminated only by the given string, and the line ending is returned untranslated. On output, '\n' is converted to the given newline.
\nIf line_buffering is True, flush() is implied when a call to\nwrite contains a newline character.
\nTextIOWrapper provides one attribute in addition to those of\nTextIOBase and its parents:
\nAn in-memory stream for unicode text. It inherits TextIOWrapper.
\nThe initial value of the buffer (an empty unicode string by default) can\nbe set by providing initial_value. The newline argument works like\nthat of TextIOWrapper. The default is to do no newline\ntranslation.
\nStringIO provides this method in addition to those from\nTextIOWrapper and its parents:
\nExample usage:
\nimport io\n\noutput = io.StringIO()\noutput.write(u'First line.\\n')\noutput.write(u'Second line.\\n')\n\n# Retrieve file contents -- this will be\n# u'First line.\\nSecond line.\\n'\ncontents = output.getvalue()\n\n# Close object and discard memory buffer --\n# .getvalue() will now raise an exception.\noutput.close()\n
Here we will discuss several advanced topics pertaining to the concrete\nI/O implementations described above.
By reading and writing only large chunks of data even when the user asks for a single byte, buffered I/O is designed to hide any inefficiency in calling and executing the operating system’s unbuffered I/O routines. The gain will vary very much depending on the OS and the kind of I/O which is performed (for example, on some contemporary OSes such as Linux, unbuffered disk I/O can be as fast as buffered I/O). The bottom line, however, is that buffered I/O will offer you predictable performance regardless of the platform and the backing device. Therefore, it is almost always preferable to use buffered I/O rather than unbuffered I/O.
\nText I/O over a binary storage (such as a file) is significantly slower than\nbinary I/O over the same storage, because it implies conversions from\nunicode to binary data using a character codec. This can become noticeable\nif you handle huge amounts of text data (for example very large log files).\nAlso, TextIOWrapper.tell() and TextIOWrapper.seek() are both\nquite slow due to the reconstruction algorithm used.
\nStringIO, however, is a native in-memory unicode container and will\nexhibit similar speed to BytesIO.
\nFileIO objects are thread-safe to the extent that the operating\nsystem calls (such as read(2) under Unix) they are wrapping are thread-safe\ntoo.
\nBinary buffered objects (instances of BufferedReader,\nBufferedWriter, BufferedRandom and BufferedRWPair)\nprotect their internal structures using a lock; it is therefore safe to call\nthem from multiple threads at once.
\nTextIOWrapper objects are not thread-safe.
\nBinary buffered objects (instances of BufferedReader,\nBufferedWriter, BufferedRandom and BufferedRWPair)\nare not reentrant. While reentrant calls will not happen in normal situations,\nthey can arise if you are doing I/O in a signal handler. If it is\nattempted to enter a buffered object again while already being accessed\nfrom the same thread, then a RuntimeError is raised.
\nThe above implicitly extends to text files, since the open()\nfunction will wrap a buffered object inside a TextIOWrapper. This\nincludes standard streams and therefore affects the built-in function\nprint() as well.
\nThis module provides a portable way of using operating system dependent\nfunctionality. If you just want to read or write a file see open(), if\nyou want to manipulate paths, see the os.path module, and if you want to\nread all the lines in all the files on the command line see the fileinput\nmodule. For creating temporary files and directories see the tempfile\nmodule, and for high-level file and directory handling see the shutil\nmodule.
\nNotes on the availability of these functions:
\nNote
\nAll functions in this module raise OSError in the case of invalid or\ninaccessible file names and paths, or other arguments that have the correct\ntype, but are not accepted by the operating system.
\nThe name of the operating system dependent module imported. The following\nnames have currently been registered: 'posix', 'nt',\n'os2', 'ce', 'java', 'riscos'.
\nSee also
\nsys.platform has a finer granularity. os.uname() gives\nsystem-dependent version information.
\nThe platform module provides detailed checks for the\nsystem’s identity.
\nThese functions and data items provide information and operate on the current\nprocess and user.
\nA mapping object representing the string environment. For example,\nenviron['HOME'] is the pathname of your home directory (on some platforms),\nand is equivalent to getenv("HOME") in C.
\nThis mapping is captured the first time the os module is imported,\ntypically during Python startup as part of processing site.py. Changes\nto the environment made after this time are not reflected in os.environ,\nexcept for changes made by modifying os.environ directly.
\nIf the platform supports the putenv() function, this mapping may be used\nto modify the environment as well as query the environment. putenv() will\nbe called automatically when the mapping is modified.
\nNote
\nCalling putenv() directly does not change os.environ, so it’s better\nto modify os.environ.
\nNote
\nOn some platforms, including FreeBSD and Mac OS X, setting environ may\ncause memory leaks. Refer to the system documentation for\nputenv().
\nIf putenv() is not provided, a modified copy of this mapping may be\npassed to the appropriate process-creation functions to cause child processes\nto use a modified environment.
\nIf the platform supports the unsetenv() function, you can delete items in\nthis mapping to unset environment variables. unsetenv() will be called\nautomatically when an item is deleted from os.environ, and when\none of the pop() or clear() methods is called.
\n\nChanged in version 2.6: Also unset environment variables when calling os.environ.clear()\nand os.environ.pop().
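A minimal sketch of the mapping behavior described above (the variable name MY_APP_MODE is purely illustrative): assigning to an item also calls putenv() where supported, and deleting an item also calls unsetenv().

```python
import os

# Setting an item updates the mapping and, where supported, calls putenv().
os.environ["MY_APP_MODE"] = "debug"   # hypothetical variable name
assert os.environ.get("MY_APP_MODE") == "debug"

# Deleting an item removes it from the mapping and, where supported,
# calls unsetenv() so child processes will not inherit it.
del os.environ["MY_APP_MODE"]
assert os.environ.get("MY_APP_MODE") is None
```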
\nReturn the filename corresponding to the controlling terminal of the process.
\nAvailability: Unix.
\nReturn the effective group id of the current process. This corresponds to the\n“set id” bit on the file being executed in the current process.
\nAvailability: Unix.
\nReturn the current process’s effective user id.
\nAvailability: Unix.
\nReturn the real group id of the current process.
\nAvailability: Unix.
\nReturn list of supplemental group ids associated with the current process.
\nAvailability: Unix.
\nCall the system initgroups() to initialize the group access list with all of\nthe groups of which the specified username is a member, plus the specified\ngroup id.
\nAvailability: Unix.
\n\nNew in version 2.7.
\nReturn the name of the user logged in on the controlling terminal of the\nprocess. For most purposes, it is more useful to use the environment variable\nLOGNAME to find out who the user is, or\npwd.getpwuid(os.getuid())[0] to get the login name of the currently\neffective user id.
\nAvailability: Unix.
\nReturn the process group id of the process with process id pid. If pid is 0,\nthe process group id of the current process is returned.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nReturn the id of the current process group.
\nAvailability: Unix.
\nReturn the current process id.
\nAvailability: Unix, Windows.
\nReturn the parent’s process id.
\nAvailability: Unix.
\nReturn a tuple (ruid, euid, suid) denoting the current process’s\nreal, effective, and saved user ids.
\nAvailability: Unix.
\n\nNew in version 2.7.
\nReturn a tuple (rgid, egid, sgid) denoting the current process’s\nreal, effective, and saved group ids.
\nAvailability: Unix.
\n\nNew in version 2.7.
\nReturn the current process’s user id.
\nAvailability: Unix.
\nReturn the value of the environment variable varname if it exists, or value\nif it doesn’t. value defaults to None.
\nAvailability: most flavors of Unix, Windows.
\nSet the environment variable named varname to the string value. Such\nchanges to the environment affect subprocesses started with os.system(),\npopen() or fork() and execv().
\nAvailability: most flavors of Unix, Windows.
\nNote
\nOn some platforms, including FreeBSD and Mac OS X, setting environ may\ncause memory leaks. Refer to the system documentation for putenv.
\nWhen putenv() is supported, assignments to items in os.environ are\nautomatically translated into corresponding calls to putenv(); however,\ncalls to putenv() don’t update os.environ, so it is actually\npreferable to assign to items of os.environ.
\nSet the current process’s effective group id.
\nAvailability: Unix.
\nSet the current process’s effective user id.
\nAvailability: Unix.
\nSet the current process’s group id.
\nAvailability: Unix.
\nSet the list of supplemental group ids associated with the current process to\ngroups. groups must be a sequence, and each element must be an integer\nidentifying a group. This operation is typically available only to the superuser.
\nAvailability: Unix.
\n\nNew in version 2.2.
\nCall the system call setpgrp() or setpgrp(0, 0) depending on which version is implemented (if any). See the Unix manual for the semantics.
\nAvailability: Unix.
\nCall the system call setpgid() to set the process group id of the\nprocess with id pid to the process group with id pgrp. See the Unix manual\nfor the semantics.
\nAvailability: Unix.
\nSet the current process’s real and effective group ids.
\nAvailability: Unix.
\nSet the current process’s real, effective, and saved group ids.
\nAvailability: Unix.
\n\nNew in version 2.7.
\nSet the current process’s real, effective, and saved user ids.
\nAvailability: Unix.
\n\nNew in version 2.7.
\nSet the current process’s real and effective user ids.
\nAvailability: Unix.
\nCall the system call getsid(). See the Unix manual for the semantics.
\nAvailability: Unix.
\n\nNew in version 2.4.
\nCall the system call setsid(). See the Unix manual for the semantics.
\nAvailability: Unix.
\nSet the current process’s user id.
\nAvailability: Unix.
\nReturn the error message corresponding to the error code in code.\nOn platforms where strerror() returns NULL when given an unknown\nerror number, ValueError is raised.
\nAvailability: Unix, Windows.
\nSet the current numeric umask and return the previous umask.
\nAvailability: Unix, Windows.
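Because umask() returns the previous mask, the usual idiom for reading the current value without permanently changing it is to set a temporary mask and immediately restore the old one:

```python
import os

# os.umask() both sets a new mask and returns the previous one.
old = os.umask(0o022)     # install a temporary mask, save the old value
current = os.umask(old)   # restore the old mask; returns the mask we just set
assert current == 0o022
```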
\nReturn a 5-tuple containing information identifying the current operating\nsystem. The tuple contains 5 strings: (sysname, nodename, release, version,\nmachine). Some systems truncate the nodename to 8 characters or to the\nleading component; a better way to get the hostname is\nsocket.gethostname() or even\nsocket.gethostbyaddr(socket.gethostname()).
\nAvailability: recent flavors of Unix.
\nUnset (delete) the environment variable named varname. Such changes to the\nenvironment affect subprocesses started with os.system(), popen() or\nfork() and execv().
\nWhen unsetenv() is supported, deletion of items in os.environ is\nautomatically translated into a corresponding call to unsetenv(); however,\ncalls to unsetenv() don’t update os.environ, so it is actually\npreferable to delete items of os.environ.
\nAvailability: most flavors of Unix, Windows.
\nThese functions create new file objects. (See also open().)
\nReturn an open file object connected to the file descriptor fd. The mode\nand bufsize arguments have the same meaning as the corresponding arguments to\nthe built-in open() function.
\nAvailability: Unix, Windows.
\n\nChanged in version 2.3: When specified, the mode argument must now start with one of the letters\n'r', 'w', or 'a', otherwise a ValueError is raised.
\n\nChanged in version 2.5: On Unix, when the mode argument starts with 'a', the O_APPEND flag is\nset on the file descriptor (which the fdopen() implementation already\ndoes on most platforms).
\nOpen a pipe to or from command. The return value is an open file object\nconnected to the pipe, which can be read or written depending on whether mode\nis 'r' (default) or 'w'. The bufsize argument has the same meaning as\nthe corresponding argument to the built-in open() function. The exit\nstatus of the command (encoded in the format specified for wait()) is\navailable as the return value of the close() method of the file object,\nexcept that when the exit status is zero (termination without errors), None\nis returned.
\nAvailability: Unix, Windows.
\n\nDeprecated since version 2.6: This function is obsolete. Use the subprocess module. Check\nespecially the Replacing Older Functions with the subprocess Module section.
\n\nChanged in version 2.0: This function worked unreliably under Windows in earlier versions of Python.\nThis was due to the use of the _popen() function from the libraries\nprovided with Windows. Newer versions of Python do not use the broken\nimplementation from the Windows libraries.
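Since os.popen() is deprecated, a hedged sketch of the subprocess replacement (roughly equivalent to os.popen("echo hello").read(); assumes a Unix-like system with an echo command on the PATH):

```python
import subprocess

# Read a command's standard output, as os.popen(cmd, 'r') would.
p = subprocess.Popen(["echo", "hello"], stdout=subprocess.PIPE)
out, _ = p.communicate()
assert out.strip() == b"hello"
# Unlike popen(), the exit status is available directly, not via close().
assert p.returncode == 0
```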
\nReturn a new file object opened in update mode (w+b). The file has no\ndirectory entries associated with it and will be automatically deleted once\nthere are no file descriptors for the file.
\nAvailability: Unix, Windows.
\nThere are a number of different popen*() functions that provide slightly\ndifferent ways to create subprocesses.
\n\nDeprecated since version 2.6: All of the popen*() functions are obsolete. Use the subprocess\nmodule.
\nFor each of the popen*() variants, if bufsize is specified, it\nspecifies the buffer size for the I/O pipes. mode, if provided, should be the\nstring 'b' or 't'; on Windows this is needed to determine whether the\nfile objects should be opened in binary or text mode. The default value for\nmode is 't'.
\nAlso, for each of these variants, on Unix, cmd may be a sequence, in which\ncase arguments will be passed directly to the program without shell intervention\n(as with os.spawnv()). If cmd is a string it will be passed to the shell\n(as with os.system()).
\nThese methods do not make it possible to retrieve the exit status from the child processes. The only way to control the input and output streams and also retrieve the return codes is to use the subprocess module.
\nFor a discussion of possible deadlock conditions related to the use of these\nfunctions, see Flow Control Issues.
\nExecute cmd as a sub-process and return the file objects (child_stdin,\nchild_stdout).
\n\nDeprecated since version 2.6: This function is obsolete. Use the subprocess module. Check\nespecially the Replacing Older Functions with the subprocess Module section.
\nAvailability: Unix, Windows.
\n\nNew in version 2.0.
\nExecute cmd as a sub-process and return the file objects (child_stdin,\nchild_stdout, child_stderr).
\n\nDeprecated since version 2.6: This function is obsolete. Use the subprocess module. Check\nespecially the Replacing Older Functions with the subprocess Module section.
\nAvailability: Unix, Windows.
\n\nNew in version 2.0.
\nExecute cmd as a sub-process and return the file objects (child_stdin,\nchild_stdout_and_stderr).
\n\nDeprecated since version 2.6: This function is obsolete. Use the subprocess module. Check\nespecially the Replacing Older Functions with the subprocess Module section.
\nAvailability: Unix, Windows.
\n\nNew in version 2.0.
\n(Note that child_stdin, child_stdout, and child_stderr are named from the\npoint of view of the child process, so child_stdin is the child’s standard\ninput.)
\nThis functionality is also available in the popen2 module using functions\nof the same names, but the return values of those functions have a different\norder.
\nThese functions operate on I/O streams referenced using file descriptors.
\nFile descriptors are small integers corresponding to a file that has been opened\nby the current process. For example, standard input is usually file descriptor\n0, standard output is 1, and standard error is 2. Further files opened by a\nprocess will then be assigned 3, 4, 5, and so forth. The name “file descriptor”\nis slightly deceptive; on Unix platforms, sockets and pipes are also referenced\nby file descriptors.
\nThe fileno() method can be used to obtain the file descriptor\nassociated with a file object when required. Note that using the file\ndescriptor directly will bypass the file object methods, ignoring aspects such\nas internal buffering of data.
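A short round-trip with the descriptor-level functions described below: os.open() returns a bare integer, and os.read()/os.write() exchange raw bytes with no Python-level buffering. (tempfile.mkstemp() is used here only to get a safe scratch file.)

```python
import os
import tempfile

# Write a few bytes through a raw file descriptor, then read them back.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello")
    os.close(fd)
    fd = os.open(path, os.O_RDONLY)   # reopen read-only
    data = os.read(fd, 100)           # at most 100 bytes
    os.close(fd)
finally:
    os.unlink(path)
assert data == b"hello"
```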
\nClose file descriptor fd.
\nAvailability: Unix, Windows.
\n\nClose all file descriptors from fd_low (inclusive) to fd_high (exclusive),\nignoring errors. Equivalent to:
\nfor fd in xrange(fd_low, fd_high):\n try:\n os.close(fd)\n except OSError:\n pass\n
Availability: Unix, Windows.
\n\nNew in version 2.6.
\nReturn a duplicate of file descriptor fd.
\nAvailability: Unix, Windows.
\nDuplicate file descriptor fd to fd2, closing the latter first if necessary.
\nAvailability: Unix, Windows.
\nChange the mode of the file given by fd to the numeric mode. See the docs\nfor chmod() for possible values of mode.
\nAvailability: Unix.
\n\nNew in version 2.6.
\nChange the owner and group id of the file given by fd to the numeric uid\nand gid. To leave one of the ids unchanged, set it to -1.
\nAvailability: Unix.
\n\nNew in version 2.6.
\nForce write of the file with file descriptor fd to disk. Does not force an update of metadata.
\nAvailability: Unix.
\nNote
\nThis function is not available on MacOS.
\nReturn system configuration information relevant to an open file. name\nspecifies the configuration value to retrieve; it may be a string which is the\nname of a defined system value; these names are specified in a number of\nstandards (POSIX.1, Unix 95, Unix 98, and others). Some platforms define\nadditional names as well. The names known to the host operating system are\ngiven in the pathconf_names dictionary. For configuration variables not\nincluded in that mapping, passing an integer for name is also accepted.
\nIf name is a string and is not known, ValueError is raised. If a\nspecific value for name is not supported by the host system, even if it is\nincluded in pathconf_names, an OSError is raised with\nerrno.EINVAL for the error number.
\nAvailability: Unix.
\nReturn status for file descriptor fd, like stat().
\nAvailability: Unix, Windows.
\nReturn information about the filesystem containing the file associated with file\ndescriptor fd, like statvfs().
\nAvailability: Unix.
\nForce write of the file with file descriptor fd to disk. On Unix, this calls the native fsync() function; on Windows, the MS _commit() function.
\nIf you’re starting with a Python file object f, first do f.flush(), and\nthen do os.fsync(f.fileno()), to ensure that all internal buffers associated\nwith f are written to disk.
\nAvailability: Unix, and Windows starting in 2.2.3.
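The flush-then-fsync pattern from the paragraph above, sketched on a scratch file:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "w")
f.write("important data")
f.flush()             # push Python's internal buffers to the OS
os.fsync(f.fileno())  # ask the OS to commit its own buffers to disk
f.close()

with open(path) as g:
    assert g.read() == "important data"
os.unlink(path)
```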
\nTruncate the file corresponding to file descriptor fd, so that it is at most\nlength bytes in size.
\nAvailability: Unix.
\nReturn True if the file descriptor fd is open and connected to a\ntty(-like) device, else False.
\nAvailability: Unix.
\nSet the current position of file descriptor fd to position pos, modified\nby how: SEEK_SET or 0 to set the position relative to the\nbeginning of the file; SEEK_CUR or 1 to set it relative to the\ncurrent position; os.SEEK_END or 2 to set it relative to the end of\nthe file.
\nAvailability: Unix, Windows.
\nParameters to the lseek() function. Their values are 0, 1, and 2,\nrespectively.
\nAvailability: Windows, Unix.
\n\nNew in version 2.5.
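A sketch of the three seek modes on a scratch descriptor (tempfile.mkstemp() opens the file read/write):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"abcdef")

os.lseek(fd, 0, os.SEEK_SET)    # absolute: rewind to the start
assert os.read(fd, 3) == b"abc"

os.lseek(fd, -2, os.SEEK_END)   # relative to the end: two bytes back
assert os.read(fd, 2) == b"ef"

os.close(fd)
os.unlink(path)
```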
\nOpen the file file and set various flags according to flags and possibly its\nmode according to mode. The default mode is 0777 (octal), and the\ncurrent umask value is first masked out. Return the file descriptor for the\nnewly opened file.
\nFor a description of the flag and mode values, see the C run-time documentation;\nflag constants (like O_RDONLY and O_WRONLY) are defined in\nthis module too (see open() flag constants). In particular, on Windows adding\nO_BINARY is needed to open files in binary mode.
\nAvailability: Unix, Windows.
\n\nOpen a new pseudo-terminal pair. Return a pair of file descriptors (master,\nslave) for the pty and the tty, respectively. For a (slightly) more portable\napproach, use the pty module.
\nAvailability: some flavors of Unix.
\nCreate a pipe. Return a pair of file descriptors (r, w) usable for reading\nand writing, respectively.
\nAvailability: Unix, Windows.
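A minimal single-process sketch: write into the w end, close it, and the r end yields the data followed by an empty string at end-of-file.

```python
import os

r, w = os.pipe()               # r is for reading, w is for writing
os.write(w, b"ping")
os.close(w)                    # closing the write end signals EOF to readers
assert os.read(r, 4) == b"ping"
assert os.read(r, 4) == b""    # EOF is reported as an empty result
os.close(r)
```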
\nRead at most n bytes from file descriptor fd. Return a string containing the\nbytes read. If the end of the file referred to by fd has been reached, an\nempty string is returned.
\nAvailability: Unix, Windows.
\n\nReturn the process group associated with the terminal given by fd (an open\nfile descriptor as returned by os.open()).
\nAvailability: Unix.
\nSet the process group associated with the terminal given by fd (an open file\ndescriptor as returned by os.open()) to pg.
\nAvailability: Unix.
\nReturn a string which specifies the terminal device associated with\nfile descriptor fd. If fd is not associated with a terminal device, an\nexception is raised.
\nAvailability: Unix.
\nWrite the string str to file descriptor fd. Return the number of bytes\nactually written.
\nAvailability: Unix, Windows.
\nNote
\nThis function is intended for low-level I/O and must be applied to a file\ndescriptor as returned by os.open() or pipe(). To write a “file\nobject” returned by the built-in function open() or by popen() or\nfdopen(), or sys.stdout or sys.stderr, use its\nwrite() method.
\nThe following constants are options for the flags parameter to the\nopen() function. They can be combined using the bitwise OR operator\n|. Some of them are not available on all platforms. For descriptions of\ntheir availability and use, consult the open(2) manual page on Unix\nor the MSDN on Windows.
\nUse the real uid/gid to test for access to path. Note that most operations\nwill use the effective uid/gid, therefore this routine can be used in a\nsuid/sgid environment to test if the invoking user has the specified access to\npath. mode should be F_OK to test the existence of path, or it\ncan be the inclusive OR of one or more of R_OK, W_OK, and\nX_OK to test permissions. Return True if access is allowed,\nFalse if not. See the Unix man page access(2) for more\ninformation.
\nAvailability: Unix, Windows.
\nNote
\nUsing access() to check if a user is authorized to e.g. open a file\nbefore actually doing so using open() creates a security hole,\nbecause the user might exploit the short time interval between checking\nand opening the file to manipulate it. It’s preferable to use EAFP\ntechniques. For example:
\nif os.access("myfile", os.R_OK):\n with open("myfile") as fp:\n return fp.read()\nreturn "some default data"\n
is better written as:
\ntry:\n fp = open("myfile")\nexcept IOError as e:\n if e.errno == errno.EACCES:\n return "some default data"\n # Not a permission error.\n raise\nelse:\n with fp:\n return fp.read()\n
Note
\nI/O operations may fail even when access() indicates that they would\nsucceed, particularly for operations on network filesystems which may have\npermissions semantics beyond the usual POSIX permission-bit model.
\nChange the current working directory to path.
\nAvailability: Unix, Windows.
\nChange the current working directory to the directory represented by the file\ndescriptor fd. The descriptor must refer to an opened directory, not an open\nfile.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nReturn a string representing the current working directory.
\nAvailability: Unix, Windows.
\nReturn a Unicode object representing the current working directory.
\nAvailability: Unix, Windows.
\n\nNew in version 2.3.
\nSet the flags of path to the numeric flags. flags may take a combination\n(bitwise OR) of the following values (as defined in the stat module):
\nAvailability: Unix.
\n\nNew in version 2.6.
\nChange the root directory of the current process to path. Availability:\nUnix.
\n\nNew in version 2.2.
\nChange the mode of path to the numeric mode. mode may take one of the\nfollowing values (as defined in the stat module) or bitwise ORed\ncombinations of them:
\nAvailability: Unix, Windows.
\nNote
\nAlthough Windows supports chmod(), you can only set the file’s read-only\nflag with it (via the stat.S_IWRITE and stat.S_IREAD\nconstants or a corresponding integer value). All other bits are\nignored.
\nChange the owner and group id of path to the numeric uid and gid. To leave\none of the ids unchanged, set it to -1.
\nAvailability: Unix.
\nSet the flags of path to the numeric flags, like chflags(), but do not\nfollow symbolic links.
\nAvailability: Unix.
\n\nNew in version 2.6.
\nChange the mode of path to the numeric mode. If path is a symlink, this\naffects the symlink rather than the target. See the docs for chmod()\nfor possible values of mode.
\nAvailability: Unix.
\n\nNew in version 2.6.
\nChange the owner and group id of path to the numeric uid and gid. This\nfunction will not follow symbolic links.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nCreate a hard link pointing to source named link_name.
\nAvailability: Unix.
\nReturn a list containing the names of the entries in the directory given by\npath. The list is in arbitrary order. It does not include the special\nentries '.' and '..' even if they are present in the\ndirectory.
\nAvailability: Unix, Windows.
\n\nChanged in version 2.3: On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be\na list of Unicode objects. Undecodable filenames will still be returned as\nstring objects.
\nCreate a FIFO (a named pipe) named path with numeric mode mode. The default\nmode is 0666 (octal). The current umask value is first masked out from\nthe mode.
\nAvailability: Unix.
\nFIFOs are pipes that can be accessed like regular files. FIFOs exist until they\nare deleted (for example with os.unlink()). Generally, FIFOs are used as\nrendezvous between “client” and “server” type processes: the server opens the\nFIFO for reading, and the client opens it for writing. Note that mkfifo()\ndoesn’t open the FIFO — it just creates the rendezvous point.
\nCreate a filesystem node (file, device special file or named pipe) named filename. mode specifies both the permissions to use and the type of node to be created, being combined (bitwise OR) with one of stat.S_IFREG, stat.S_IFCHR, stat.S_IFBLK, or stat.S_IFIFO (those constants are available in stat). For stat.S_IFCHR and stat.S_IFBLK, device defines the newly created device special file (probably using os.makedev()); otherwise it is ignored.
\n\nNew in version 2.3.
\nExtract the device major number from a raw device number (usually the\nst_dev or st_rdev field from stat).
\n\nNew in version 2.3.
\nExtract the device minor number from a raw device number (usually the\nst_dev or st_rdev field from stat).
\n\nNew in version 2.3.
\nCompose a raw device number from the major and minor device numbers.
\n\nNew in version 2.3.
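The three device-number functions round-trip, as a quick sketch shows (8, 1 are merely illustrative major/minor numbers):

```python
import os

dev = os.makedev(8, 1)     # compose a raw device number from major/minor
assert os.major(dev) == 8  # extract the major number back out
assert os.minor(dev) == 1  # extract the minor number back out
```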
\nCreate a directory named path with numeric mode mode. The default mode is\n0777 (octal). On some systems, mode is ignored. Where it is used, the\ncurrent umask value is first masked out. If the directory already exists,\nOSError is raised.
\nIt is also possible to create temporary directories; see the\ntempfile module’s tempfile.mkdtemp() function.
\nAvailability: Unix, Windows.
\nRecursive directory creation function. Like mkdir(), but makes all\nintermediate-level directories needed to contain the leaf directory. Raises an\nerror exception if the leaf directory already exists or cannot be\ncreated. The default mode is 0777 (octal). On some systems, mode is\nignored. Where it is used, the current umask value is first masked out.
\nNote
\nmakedirs() will become confused if the path elements to create include\nos.pardir.
\n\nNew in version 1.5.2.
\n\nChanged in version 2.3: This function now handles UNC paths correctly.
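The difference from mkdir() in a sketch: one makedirs() call creates the whole chain of missing parents.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b", "c")

os.makedirs(target)              # creates a, a/b and a/b/c in one call;
assert os.path.isdir(target)     # os.mkdir(target) would have failed here

shutil.rmtree(base)              # clean up the scratch tree
```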
\nReturn system configuration information relevant to a named file. name\nspecifies the configuration value to retrieve; it may be a string which is the\nname of a defined system value; these names are specified in a number of\nstandards (POSIX.1, Unix 95, Unix 98, and others). Some platforms define\nadditional names as well. The names known to the host operating system are\ngiven in the pathconf_names dictionary. For configuration variables not\nincluded in that mapping, passing an integer for name is also accepted.
\nIf name is a string and is not known, ValueError is raised. If a\nspecific value for name is not supported by the host system, even if it is\nincluded in pathconf_names, an OSError is raised with\nerrno.EINVAL for the error number.
\nAvailability: Unix.
\nReturn a string representing the path to which the symbolic link points. The\nresult may be either an absolute or relative pathname; if it is relative, it may\nbe converted to an absolute pathname using os.path.join(os.path.dirname(path),\nresult).
\n\nChanged in version 2.6: If the path is a Unicode object the result will also be a Unicode object.
\nAvailability: Unix.
\nRemove (delete) the file path. If path is a directory, OSError is\nraised; see rmdir() below to remove a directory. This is identical to\nthe unlink() function documented below. On Windows, attempting to\nremove a file that is in use causes an exception to be raised; on Unix, the\ndirectory entry is removed but the storage allocated to the file is not made\navailable until the original file is no longer in use.
\nAvailability: Unix, Windows.
\nRemove directories recursively. Works like rmdir() except that, if the\nleaf directory is successfully removed, removedirs() tries to\nsuccessively remove every parent directory mentioned in path until an error\nis raised (which is ignored, because it generally means that a parent directory\nis not empty). For example, os.removedirs('foo/bar/baz') will first remove\nthe directory 'foo/bar/baz', and then remove 'foo/bar' and 'foo' if\nthey are empty. Raises OSError if the leaf directory could not be\nsuccessfully removed.
\n\nNew in version 1.5.2.
\nRename the file or directory src to dst. If dst is a directory,\nOSError will be raised. On Unix, if dst exists and is a file, it will\nbe replaced silently if the user has permission. The operation may fail on some\nUnix flavors if src and dst are on different filesystems. If successful,\nthe renaming will be an atomic operation (this is a POSIX requirement). On\nWindows, if dst already exists, OSError will be raised even if it is a\nfile; there may be no way to implement an atomic rename when dst names an\nexisting file.
\nAvailability: Unix, Windows.
\nRecursive directory or file renaming function. Works like rename(), except that creation of any intermediate directories needed to make the new pathname valid is attempted first. After the rename, directories corresponding to the rightmost path segments of the old name will be pruned away using removedirs().
\n\nNew in version 1.5.2.
\nNote
\nThis function can fail, leaving the new directory structure in place, if you lack the permissions needed to remove the leaf directory or file.
\nRemove (delete) the directory path. Only works when the directory is\nempty, otherwise, OSError is raised. In order to remove whole\ndirectory trees, shutil.rmtree() can be used.
\nAvailability: Unix, Windows.
\nPerform the equivalent of a stat() system call on the given path.\n(This function follows symlinks; to stat a symlink use lstat().)
\nThe return value is an object whose attributes correspond to the members\nof the stat structure, namely:
\n\nChanged in version 2.3: If stat_float_times() returns True, the time values are floats, measuring\nseconds. Fractions of a second may be reported if the system supports that. On\nMac OS, the times are always floats. See stat_float_times() for further\ndiscussion.
\nOn some Unix systems (such as Linux), the following attributes may also be\navailable:
\nOn other Unix systems (such as FreeBSD), the following attributes may be\navailable (but may be only filled out if root tries to use them):
\nOn Mac OS systems, the following attributes may also be available:
\nOn RISCOS systems, the following attributes are also available:
\nNote
\nThe exact meaning and resolution of the st_atime,\nst_mtime, and st_ctime attributes depend on the operating\nsystem and the file system. For example, on Windows systems using the FAT\nor FAT32 file systems, st_mtime has 2-second resolution, and\nst_atime has only 1-day resolution. See your operating system\ndocumentation for details.
\nFor backward compatibility, the return value of stat() is also accessible\nas a tuple of at least 10 integers giving the most important (and portable)\nmembers of the stat structure, in the order st_mode,\nst_ino, st_dev, st_nlink, st_uid,\nst_gid, st_size, st_atime, st_mtime,\nst_ctime. More items may be added at the end by some implementations.
\nThe standard module stat defines functions and constants that are useful\nfor extracting information from a stat structure. (On Windows, some\nitems are filled with dummy values.)
\nExample:
\n>>> import os\n>>> statinfo = os.stat('somefile.txt')\n>>> statinfo\n(33188, 422511, 769, 1, 1032, 100, 926, 1105022698, 1105022732, 1105022732)\n>>> statinfo.st_size\n926\n
Availability: Unix, Windows.
\n\nChanged in version 2.2: Added access to values as attributes of the returned object.
\n\nChanged in version 2.5: Added st_gen and st_birthtime.
\nDetermine whether stat_result represents time stamps as float objects.\nIf newvalue is True, future calls to stat() return floats, if it is\nFalse, future calls return ints. If newvalue is omitted, return the\ncurrent setting.
\nFor compatibility with older Python versions, accessing stat_result as\na tuple always returns integers.
\n\nChanged in version 2.5: Python now returns float values by default. Applications which do not work\ncorrectly with floating point time stamps can use this function to restore the\nold behaviour.
\nThe resolution of the timestamps (that is the smallest possible fraction)\ndepends on the system. Some systems only support second resolution; on these\nsystems, the fraction will always be zero.
\nIt is recommended that this setting is only changed at program startup time in\nthe __main__ module; libraries should never change this setting. If an\napplication uses a library that works incorrectly if floating point time stamps\nare processed, this application should turn the feature off until the library\nhas been corrected.
\nPerform a statvfs() system call on the given path. The return value is\nan object whose attributes describe the filesystem on the given path, and\ncorrespond to the members of the statvfs structure, namely:\nf_bsize, f_frsize, f_blocks, f_bfree,\nf_bavail, f_files, f_ffree, f_favail,\nf_flag, f_namemax.
\nFor backward compatibility, the return value is also accessible as a tuple whose\nvalues correspond to the attributes, in the order given above. The standard\nmodule statvfs defines constants that are useful for extracting\ninformation from a statvfs structure when accessing it as a sequence;\nthis remains useful when writing code that needs to work with versions of Python\nthat don’t support accessing the fields as attributes.
\nAvailability: Unix.
\n\nChanged in version 2.2: Added access to values as attributes of the returned object.
\nCreate a symbolic link pointing to source named link_name.
\nAvailability: Unix.
\nReturn a unique path name that is reasonable for creating a temporary file.\nThis will be an absolute path that names a potential directory entry in the\ndirectory dir or a common location for temporary files if dir is omitted or\nNone. If given and not None, prefix is used to provide a short prefix\nto the filename. Applications are responsible for properly creating and\nmanaging files created using paths returned by tempnam(); no automatic\ncleanup is provided. On Unix, the environment variable TMPDIR\noverrides dir, while on Windows TMP is used. The specific\nbehavior of this function depends on the C library implementation; some aspects\nare underspecified in system documentation.
\nWarning
\nUse of tempnam() is vulnerable to symlink attacks; consider using\ntmpfile() (section File Object Creation) instead.
\nAvailability: Unix, Windows.
\nReturn a unique path name that is reasonable for creating a temporary file.\nThis will be an absolute path that names a potential directory entry in a common\nlocation for temporary files. Applications are responsible for properly\ncreating and managing files created using paths returned by tmpnam(); no\nautomatic cleanup is provided.
\nWarning
\nUse of tmpnam() is vulnerable to symlink attacks; consider using\ntmpfile() (section File Object Creation) instead.
\nAvailability: Unix, Windows. This function probably shouldn’t be used on\nWindows, though: Microsoft’s implementation of tmpnam() always creates a\nname in the root directory of the current drive, and that’s generally a poor\nlocation for a temp file (depending on privileges, you may not even be able to\nopen a file using this name).
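Given the symlink-attack warnings above, a safer sketch uses the separate tempfile module, whose mkstemp() creates and opens the file in a single step rather than returning a bare name:

```python
import os
import tempfile

# Sketch: mkstemp() atomically creates and opens the file, closing the
# race window between generating a name and creating the file.
fd, path = tempfile.mkstemp(prefix='scratch', suffix='.tmp')
try:
    os.write(fd, b'temporary data')
finally:
    os.close(fd)
    os.remove(path)     # the caller is still responsible for cleanup
```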
\nRemove (delete) the file path. This is the same function as\nremove(); the unlink() name is its traditional Unix\nname.
\nAvailability: Unix, Windows.
\nSet the access and modified times of the file specified by path. If times\nis None, then the file’s access and modified times are set to the current\ntime. (The effect is similar to running the Unix program touch on\nthe path.) Otherwise, times must be a 2-tuple of numbers, of the form\n(atime, mtime) which is used to set the access and modified times,\nrespectively. Whether a directory can be given for path depends on whether\nthe operating system implements directories as files (for example, Windows\ndoes not). Note that the exact times you set here may not be returned by a\nsubsequent stat() call, depending on the resolution with which your\noperating system records access and modification times; see stat().
\n\nChanged in version 2.0: Added support for None for times.
\nAvailability: Unix, Windows.
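A small sketch of the 2-tuple form, using a throwaway file and arbitrary example timestamps:

```python
import os
import tempfile

# Sketch: set explicit access and modification times, then read them back.
fd, path = tempfile.mkstemp()
os.close(fd)
os.utime(path, (1000000000, 1000000100))   # (atime, mtime), seconds since epoch
st = os.stat(path)
os.remove(path)
```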
\nGenerate the file names in a directory tree by walking the tree\neither top-down or bottom-up. For each directory in the tree rooted at directory\ntop (including top itself), it yields a 3-tuple (dirpath, dirnames,\nfilenames).
\ndirpath is a string, the path to the directory. dirnames is a list of the\nnames of the subdirectories in dirpath (excluding '.' and '..').\nfilenames is a list of the names of the non-directory files in dirpath.\nNote that the names in the lists contain no path components. To get a full path\n(which begins with top) to a file or directory in dirpath, do\nos.path.join(dirpath, name).
\nIf optional argument topdown is True or not specified, the triple for a\ndirectory is generated before the triples for any of its subdirectories\n(directories are generated top-down). If topdown is False, the triple for a\ndirectory is generated after the triples for all of its subdirectories\n(directories are generated bottom-up).
\nWhen topdown is True, the caller can modify the dirnames list in-place\n(perhaps using del or slice assignment), and walk() will only\nrecurse into the subdirectories whose names remain in dirnames; this can be\nused to prune the search, impose a specific order of visiting, or even to inform\nwalk() about directories the caller creates or renames before it resumes\nwalk() again. Modifying dirnames when topdown is False is\nineffective, because in bottom-up mode the directories in dirnames are\ngenerated before dirpath itself is generated.
\nBy default, errors from the listdir() call are ignored. If optional\nargument onerror is specified, it should be a function; it will be called with\none argument, an OSError instance. It can report the error to continue\nwith the walk, or raise the exception to abort the walk. Note that the filename\nis available as the filename attribute of the exception object.
\nBy default, walk() will not walk down into symbolic links that resolve to\ndirectories. Set followlinks to True to visit directories pointed to by\nsymlinks, on systems that support them.
\n\nNew in version 2.6: The followlinks parameter.
\nNote
\nBe aware that setting followlinks to True can lead to infinite recursion if a\nlink points to a parent directory of itself. walk() does not keep track of\nthe directories it visited already.
\nNote
\nIf you pass a relative pathname, don’t change the current working directory\nbetween resumptions of walk(). walk() never changes the current\ndirectory, and assumes that its caller doesn’t either.
\nThis example displays the number of bytes taken by non-directory files in each\ndirectory under the starting directory, except that it doesn’t look under any\nCVS subdirectory:
\nimport os\nfrom os.path import join, getsize\nfor root, dirs, files in os.walk('python/Lib/email'):\n    print root, "consumes",\n    print sum(getsize(join(root, name)) for name in files),\n    print "bytes in", len(files), "non-directory files"\n    if 'CVS' in dirs:\n        dirs.remove('CVS')  # don't visit CVS directories\n
In the next example, walking the tree bottom-up is essential: rmdir()\ndoesn’t allow deleting a directory before the directory is empty:
\n# Delete everything reachable from the directory named in "top",\n# assuming there are no symbolic links.\n# CAUTION: This is dangerous! For example, if top == '/', it\n# could delete all your disk files.\nimport os\nfor root, dirs, files in os.walk(top, topdown=False):\n    for name in files:\n        os.remove(os.path.join(root, name))\n    for name in dirs:\n        os.rmdir(os.path.join(root, name))\n
\nNew in version 2.3.
\nThese functions may be used to create and manage processes.
\nThe various exec*() functions take a list of arguments for the new\nprogram loaded into the process. In each case, the first of these arguments is\npassed to the new program as its own name rather than as an argument a user may\nhave typed on a command line. For the C programmer, this is the argv[0]\npassed to a program’s main(). For example, os.execv('/bin/echo',\n['foo', 'bar']) will only print bar on standard output; foo will seem\nto be ignored.
\nGenerate a SIGABRT signal to the current process. On Unix, the default\nbehavior is to produce a core dump; on Windows, the process immediately returns\nan exit code of 3. Be aware that calling this function will not call the\nPython signal handler registered for SIGABRT with\nsignal.signal().
\nAvailability: Unix, Windows.
\nThese functions all execute a new program, replacing the current process; they\ndo not return. On Unix, the new executable is loaded into the current process,\nand will have the same process id as the caller. Errors will be reported as\nOSError exceptions.
\nThe current process is replaced immediately. Open file objects and\ndescriptors are not flushed, so if there may be data buffered\non these open files, you should flush them using\nsys.stdout.flush() or os.fsync() before calling an\nexec*() function.
\nThe “l” and “v” variants of the exec*() functions differ in how\ncommand-line arguments are passed. The “l” variants are perhaps the easiest\nto work with if the number of parameters is fixed when the code is written; the\nindividual parameters simply become additional parameters to the execl*()\nfunctions. The “v” variants are good when the number of parameters is\nvariable, with the arguments being passed in a list or tuple as the args\nparameter. In either case, the arguments to the child process should start with\nthe name of the command being run, but this is not enforced.
\nThe variants which include a “p” near the end (execlp(),\nexeclpe(), execvp(), and execvpe()) will use the\nPATH environment variable to locate the program file. When the\nenvironment is being replaced (using one of the exec*e() variants,\ndiscussed in the next paragraph), the new environment is used as the source of\nthe PATH variable. The other variants, execl(), execle(),\nexecv(), and execve(), will not use the PATH variable to\nlocate the executable; path must contain an appropriate absolute or relative\npath.
\nFor execle(), execlpe(), execve(), and execvpe() (note\nthat these all end in “e”), the env parameter must be a mapping which is\nused to define the environment variables for the new process (these are used\ninstead of the current process’ environment); the functions execl(),\nexeclp(), execv(), and execvp() all cause the new process to\ninherit the environment of the current process.
\nAvailability: Unix, Windows.
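Because the exec*() functions replace the calling process, the usual Unix pattern pairs them with fork(). A minimal sketch (assuming /bin/echo exists) that also shows the argv[0] behavior described earlier:

```python
import os
import sys

sys.stdout.flush()          # flush buffered output before forking, as noted above
pid = os.fork()
if pid == 0:
    # Child: the first list element becomes argv[0], the program's own
    # name, so only 'hello' appears on standard output.
    os.execv('/bin/echo', ['echo', 'hello'])
else:
    _, status = os.waitpid(pid, 0)   # parent waits for the child
```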
\nExit the process with status n, without calling cleanup handlers, flushing\nstdio buffers, etc.
\nAvailability: Unix, Windows.
\n\nThe following exit codes are defined and can be used with _exit(),\nalthough they are not required. These are typically used for system programs\nwritten in Python, such as a mail server’s external command delivery program.
\nNote
\nSome of these may not be available on all Unix platforms, since there is some\nvariation. These constants are defined where they are defined by the underlying\nplatform.
\nExit code that means no error occurred.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means the command was used incorrectly, such as when the wrong\nnumber of arguments are given.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means the input data was incorrect.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means an input file did not exist or was not readable.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means a specified user did not exist.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means a specified host did not exist.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means that a required service is unavailable.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means an internal software error was detected.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means an operating system error was detected, such as the\ninability to fork or create a pipe.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means some system file did not exist, could not be opened, or had\nsome other kind of error.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means a user specified output file could not be created.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means that an error occurred while doing I/O on some file.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means a temporary failure occurred. This indicates something\nthat may not really be an error, such as a network connection that couldn’t be\nmade during a retryable operation.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means that a protocol exchange was illegal, invalid, or not\nunderstood.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means that there were insufficient permissions to perform the\noperation (but not intended for file system problems).
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means that some kind of configuration error occurred.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nExit code that means something like “an entry was not found”.
\nAvailability: Unix.
\n\nNew in version 2.3.
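A brief Unix-only sketch of using one of these codes: a forked child exits with os.EX_USAGE via _exit(), and the parent recovers the code with waitpid():

```python
import os

# Sketch: exit a child with a symbolic exit code and read it back.
pid = os.fork()
if pid == 0:
    os._exit(os.EX_USAGE)            # exit immediately; no cleanup handlers run
_, status = os.waitpid(pid, 0)
code = os.WEXITSTATUS(status)        # the child's exit code
```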
\nFork a child process. Return 0 in the child and the child’s process id in the\nparent. If an error occurs OSError is raised.
\nNote that some platforms including FreeBSD <= 6.3, Cygwin and OS/2 EMX have\nknown issues when using fork() from a thread.
\nAvailability: Unix.
\nFork a child process, using a new pseudo-terminal as the child’s controlling\nterminal. Return a pair of (pid, fd), where pid is 0 in the child, the\nnew child’s process id in the parent, and fd is the file descriptor of the\nmaster end of the pseudo-terminal. For a more portable approach, use the\npty module. If an error occurs OSError is raised.
\nAvailability: some flavors of Unix.
\nSend signal sig to the process pid. Constants for the specific signals\navailable on the host platform are defined in the signal module.
\nWindows: The signal.CTRL_C_EVENT and\nsignal.CTRL_BREAK_EVENT signals are special signals which can\nonly be sent to console processes which share a common console window,\ne.g., some subprocesses. Any other value for sig will cause the process\nto be unconditionally killed by the TerminateProcess API, and the exit code\nwill be set to sig. The Windows version of kill() additionally takes\nprocess handles to be killed.
\n\nNew in version 2.7: Windows support
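A sketch of sending a signal to another process (Unix semantics; the subprocess module is used here only to create a target process, and reports death-by-signal as a negative return code):

```python
import os
import signal
import subprocess
import sys

# Sketch: start a sleeping child, terminate it with os.kill, and check
# that it died from SIGTERM.
child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'])
os.kill(child.pid, signal.SIGTERM)
rc = child.wait()        # negative value: killed by signal -rc
```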
\nSend the signal sig to the process group pgid.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nAdd increment to the process’s “niceness”. Return the new niceness.
\nAvailability: Unix.
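Since the return value is the new niceness, an increment of 0 is a convenient way to read the current value without changing it (a Unix-only sketch):

```python
import os

# Sketch: an increment of 0 leaves the niceness unchanged and reports it.
current = os.nice(0)
```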
\nLock program segments into memory. The value of op (defined in\n<sys/lock.h>) determines which segments are locked.
\nAvailability: Unix.
\nExecute the program path in a new process.
\n(Note that the subprocess module provides more powerful facilities for\nspawning new processes and retrieving their results; using that module is\npreferable to using these functions. Check especially the\nReplacing Older Functions with the subprocess Module section.)
\nIf mode is P_NOWAIT, this function returns the process id of the new\nprocess; if mode is P_WAIT, returns the process’s exit code if it\nexits normally, or -signal, where signal is the signal that killed the\nprocess. On Windows, the process id will actually be the process handle, so it\ncan be used with the waitpid() function.
\nThe “l” and “v” variants of the spawn*() functions differ in how\ncommand-line arguments are passed. The “l” variants are perhaps the easiest\nto work with if the number of parameters is fixed when the code is written; the\nindividual parameters simply become additional parameters to the\nspawnl*() functions. The “v” variants are good when the number of\nparameters is variable, with the arguments being passed in a list or tuple as\nthe args parameter. In either case, the arguments to the child process must\nstart with the name of the command being run.
\nThe variants which include a second “p” near the end (spawnlp(),\nspawnlpe(), spawnvp(), and spawnvpe()) will use the\nPATH environment variable to locate the program file. When the\nenvironment is being replaced (using one of the spawn*e() variants,\ndiscussed in the next paragraph), the new environment is used as the source of\nthe PATH variable. The other variants, spawnl(),\nspawnle(), spawnv(), and spawnve(), will not use the\nPATH variable to locate the executable; path must contain an\nappropriate absolute or relative path.
\nFor spawnle(), spawnlpe(), spawnve(), and spawnvpe()\n(note that these all end in “e”), the env parameter must be a mapping\nwhich is used to define the environment variables for the new process (they are\nused instead of the current process’ environment); the functions\nspawnl(), spawnlp(), spawnv(), and spawnvp() all cause\nthe new process to inherit the environment of the current process. Note that\nkeys and values in the env dictionary must be strings; invalid keys or\nvalues will cause the function to fail, with a return value of 127.
\nAs an example, the following calls to spawnlp() and spawnvpe() are\nequivalent:
\nimport os\nos.spawnlp(os.P_WAIT, 'cp', 'cp', 'index.html', '/dev/null')\n\nL = ['cp', 'index.html', '/dev/null']\nos.spawnvpe(os.P_WAIT, 'cp', L, os.environ)\n
Availability: Unix, Windows. spawnlp(), spawnlpe(), spawnvp()\nand spawnvpe() are not available on Windows. spawnle() and\nspawnve() are not thread-safe on Windows; we advise you to use the\nsubprocess module instead.
\n\nNew in version 1.6.
\nPossible values for the mode parameter to the spawn*() family of\nfunctions. If either of these values is given, the spawn*() functions\nwill return as soon as the new process has been created, with the process id as\nthe return value.
\nAvailability: Unix, Windows.
\n\nNew in version 1.6.
\nPossible value for the mode parameter to the spawn*() family of\nfunctions. If this is given as mode, the spawn*() functions will not\nreturn until the new process has run to completion, and will return the exit\ncode of the process if the run is successful, or -signal if a signal kills\nthe process.
\nAvailability: Unix, Windows.
\n\nNew in version 1.6.
\nPossible values for the mode parameter to the spawn*() family of\nfunctions. These are less portable than those listed above. P_DETACH\nis similar to P_NOWAIT, but the new process is detached from the\nconsole of the calling process. If P_OVERLAY is used, the current\nprocess will be replaced; the spawn*() function will not return.
\nAvailability: Windows.
\n\nNew in version 1.6.
\nStart a file with its associated application.
\nWhen operation is not specified or 'open', this acts like double-clicking\nthe file in Windows Explorer, or giving the file name as an argument to the\nstart command from the interactive command shell: the file is opened\nwith whatever application (if any) is associated with its extension.
\nWhen another operation is given, it must be a “command verb” that specifies\nwhat should be done with the file. Common verbs documented by Microsoft are\n'print' and 'edit' (to be used on files) as well as 'explore' and\n'find' (to be used on directories).
\nstartfile() returns as soon as the associated application is launched.\nThere is no option to wait for the application to close, and no way to retrieve\nthe application’s exit status. The path parameter is relative to the current\ndirectory. If you want to use an absolute path, make sure the first character\nis not a slash ('/'); the underlying Win32 ShellExecute() function\ndoesn’t work if it is. Use the os.path.normpath() function to ensure that\nthe path is properly encoded for Win32.
\nAvailability: Windows.
\n\nNew in version 2.0.
\n\nNew in version 2.5: The operation parameter.
\nExecute the command (a string) in a subshell. This is implemented by calling\nthe Standard C function system(), and has the same limitations.\nChanges to sys.stdin, etc. are not reflected in the environment of the\nexecuted command.
\nOn Unix, the return value is the exit status of the process encoded in the\nformat specified for wait(). Note that POSIX does not specify the meaning\nof the return value of the C system() function, so the return value of\nthe Python function is system-dependent.
\nOn Windows, the return value is that returned by the system shell after running\ncommand, given by the Windows environment variable COMSPEC: on\ncommand.com systems (Windows 95, 98 and ME) this is always 0; on\ncmd.exe systems (Windows NT, 2000 and XP) this is the exit status of\nthe command run; on systems using a non-native shell, consult your shell\ndocumentation.
\nThe subprocess module provides more powerful facilities for spawning new\nprocesses and retrieving their results; using that module is preferable to using\nthis function. See the\nReplacing Older Functions with the subprocess Module section in the subprocess documentation\nfor some helpful recipes.
\nAvailability: Unix, Windows.
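A Unix-only sketch of decoding the wait()-encoded return value with the status macros described later in this section (assuming a POSIX shell):

```python
import os

# Sketch: on Unix the return value is wait()-encoded, so the shell's
# exit code must be extracted with os.WEXITSTATUS.
status = os.system('exit 3')
code = os.WEXITSTATUS(status)
```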
\nReturn a 5-tuple of floating point numbers indicating accumulated (processor\nor other) times, in seconds. The items are: user time, system time,\nchildren’s user time, children’s system time, and elapsed real time since a\nfixed point in the past, in that order. See the Unix manual page\ntimes(2) or the corresponding Windows Platform API documentation.\nOn Windows, only the first two items are filled, the others are zero.
\nAvailability: Unix, Windows.
\nWait for completion of a child process, and return a tuple containing its pid\nand exit status indication: a 16-bit number, whose low byte is the signal number\nthat killed the process, and whose high byte is the exit status (if the signal\nnumber is zero); the high bit of the low byte is set if a core file was\nproduced.
\nAvailability: Unix.
\nThe details of this function differ on Unix and Windows.
\nOn Unix: Wait for completion of a child process given by process id pid, and\nreturn a tuple containing its process id and exit status indication (encoded as\nfor wait()). The semantics of the call are affected by the value of the\ninteger options, which should be 0 for normal operation.
\nIf pid is greater than 0, waitpid() requests status information for\nthat specific process. If pid is 0, the request is for the status of any\nchild in the process group of the current process. If pid is -1, the\nrequest pertains to any child of the current process. If pid is less than\n-1, status is requested for any process in the process group -pid (the\nabsolute value of pid).
\nAn OSError is raised with the value of errno when the syscall\nreturns -1.
\nOn Windows: Wait for completion of a process given by process handle pid, and\nreturn a tuple containing pid, and its exit status shifted left by 8 bits\n(shifting makes cross-platform use of the function easier). A pid less than or\nequal to 0 has no special meaning on Windows, and raises an exception. The\nvalue of integer options has no effect. pid can refer to any process whose\nid is known, not necessarily a child process. The spawn() functions called\nwith P_NOWAIT return suitable process handles.
\nSimilar to waitpid(), except no process id argument is given and a\n3-element tuple containing the child’s process id, exit status indication, and\nresource usage information is returned. Refer to resource.getrusage() for details on resource usage information. The option\nargument is the same as that provided to waitpid() and wait4().
\nAvailability: Unix.
\n\nNew in version 2.5.
\nSimilar to waitpid(), except a 3-element tuple, containing the child’s\nprocess id, exit status indication, and resource usage information is returned.\nRefer to resource.getrusage() for details on resource usage\ninformation. The arguments to wait4() are the same as those provided to\nwaitpid().
\nAvailability: Unix.
\n\nNew in version 2.5.
\nThe option for waitpid() to return immediately if no child process status\nis available. The function returns (0, 0) in this case.
\nAvailability: Unix.
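A Unix-only sketch: while a forked child is still sleeping, waitpid() with WNOHANG returns (0, 0) instead of blocking:

```python
import os
import time

# Sketch: WNOHANG polls instead of blocking.
pid = os.fork()
if pid == 0:
    time.sleep(2)
    os._exit(0)
first = os.waitpid(pid, os.WNOHANG)   # child still running: (0, 0)
done = os.waitpid(pid, 0)             # blocks until the child exits
```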
\nThis option causes child processes to be reported if they have been continued\nfrom a job control stop since their status was last reported.
\nAvailability: Some Unix systems.
\n\nNew in version 2.3.
\nThis option causes child processes to be reported if they have been stopped but\ntheir current state has not been reported since they were stopped.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nThe following functions take a process status code as returned by\nsystem(), wait(), or waitpid() as a parameter. They may be\nused to determine the disposition of a process.
\nReturn True if a core dump was generated for the process, otherwise\nreturn False.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nReturn True if the process has been continued from a job control stop,\notherwise return False.
\nAvailability: Unix.
\n\nNew in version 2.3.
\nReturn True if the process has been stopped, otherwise return\nFalse.
\nAvailability: Unix.
\nReturn True if the process exited due to a signal, otherwise return\nFalse.
\nAvailability: Unix.
\nReturn True if the process exited using the exit(2) system call,\notherwise return False.
\nAvailability: Unix.
\nIf WIFEXITED(status) is true, return the integer parameter to the\nexit(2) system call. Otherwise, the return value is meaningless.
\nAvailability: Unix.
\nReturn the signal which caused the process to stop.
\nAvailability: Unix.
\nReturn the signal which caused the process to exit.
\nAvailability: Unix.
\nReturn string-valued system configuration values. name specifies the\nconfiguration value to retrieve; it may be a string which is the name of a\ndefined system value; these names are specified in a number of standards (POSIX,\nUnix 95, Unix 98, and others). Some platforms define additional names as well.\nThe names known to the host operating system are given as the keys of the\nconfstr_names dictionary. For configuration variables not included in that\nmapping, passing an integer for name is also accepted.
\nIf the configuration value specified by name isn’t defined, None is\nreturned.
\nIf name is a string and is not known, ValueError is raised. If a\nspecific value for name is not supported by the host system, even if it is\nincluded in confstr_names, an OSError is raised with\nerrno.EINVAL for the error number.
\nAvailability: Unix.
\nDictionary mapping names accepted by confstr() to the integer values\ndefined for those names by the host operating system. This can be used to\ndetermine the set of names known to the system.
\nAvailability: Unix.
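A Unix-only sketch using the POSIX-defined name 'CS_PATH', which yields a PATH value that locates the standard utilities:

```python
import os

# Sketch: query a string-valued configuration variable by name.
known = sorted(os.confstr_names)       # names this system understands
standard_path = os.confstr('CS_PATH')  # e.g. a default utility search path
```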
\nReturn the number of processes in the system run queue averaged over the last\n1, 5, and 15 minutes, or raise OSError if the load average is\nunobtainable.
\nAvailability: Unix.
\n\nNew in version 2.3.
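A short Unix-only sketch; the three floats correspond to the 1-, 5-, and 15-minute averages:

```python
import os

# Sketch: unpack the three load averages.
one, five, fifteen = os.getloadavg()
```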
\nReturn integer-valued system configuration values. If the configuration value\nspecified by name isn’t defined, -1 is returned. The comments regarding\nthe name parameter for confstr() apply here as well; the dictionary that\nprovides information on the known names is given by sysconf_names.
\nAvailability: Unix.
\nDictionary mapping names accepted by sysconf() to the integer values\ndefined for those names by the host operating system. This can be used to\ndetermine the set of names known to the system.
\nAvailability: Unix.
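A Unix-only sketch using two names that are widely (though not universally) defined:

```python
import os

# Sketch: query integer-valued configuration variables by name.
page_size = os.sysconf('SC_PAGE_SIZE')   # bytes per memory page
open_max = os.sysconf('SC_OPEN_MAX')     # per-process file descriptor limit
```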
\nThe following data values are used to support path manipulation operations. These\nare defined for all platforms.
\nHigher-level operations on pathnames are defined in the os.path module.
\nThe character which separates the base filename from the extension; for example,\nthe '.' in os.py. Also available via os.path.
\n\nNew in version 2.2.
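A tiny sketch of building a filename with it rather than hard-coding the separator:

```python
import os

# Sketch: join a base name and extension portably.
name = 'archive' + os.extsep + 'tar'   # 'archive.tar' on common platforms
```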
\nReturn a string of n random bytes suitable for cryptographic use.
\nThis function returns random bytes from an OS-specific randomness source. The\nreturned data should be unpredictable enough for cryptographic applications,\nthough its exact quality depends on the OS implementation. On a UNIX-like\nsystem this will query /dev/urandom, and on Windows it will use CryptGenRandom.\nIf a randomness source is not found, NotImplementedError will be raised.
\n\nNew in version 2.4.
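A short sketch, e.g. for generating a session token:

```python
import os

# Sketch: 16 bytes from the OS randomness source.
token = os.urandom(16)
```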
\nSource code: Lib/getopt.py
\nNote
\nThe getopt module is a parser for command line options whose API is\ndesigned to be familiar to users of the C getopt() function. Users who\nare unfamiliar with the C getopt() function or who would like to write\nless code and get better help and error messages should consider using the\nargparse module instead.
\nThis module helps scripts to parse the command line arguments in sys.argv.\nIt supports the same conventions as the Unix getopt() function (including\nthe special meanings of arguments of the form ‘-‘ and ‘--‘). Long\noptions similar to those supported by GNU software may be used as well via an\noptional third argument.
\nThis module provides two functions and an\nexception:
\nParses command line options and parameter list. args is the argument list to\nbe parsed, without the leading reference to the running program. Typically, this\nmeans sys.argv[1:]. options is the string of option letters that the\nscript wants to recognize, with options that require an argument followed by a\ncolon (':'; i.e., the same format that Unix getopt() uses).
\nNote
\nUnlike GNU getopt(), after a non-option argument, all further\narguments are considered also non-options. This is similar to the way\nnon-GNU Unix systems work.
\nlong_options, if specified, must be a list of strings with the names of the\nlong options which should be supported. The leading '--'\ncharacters should not be included in the option name. Long options which\nrequire an argument should be followed by an equal sign ('='). Optional\narguments are not supported. To accept only long options, options should\nbe an empty string. Long options on the command line can be recognized so\nlong as they provide a prefix of the option name that matches exactly one of\nthe accepted options. For example, if long_options is ['foo', 'frob'],\nthe option --fo will match as --foo, but --f\nwill not match uniquely, so GetoptError will be raised.
\nThe return value consists of two elements: the first is a list of (option,\nvalue) pairs; the second is the list of program arguments left after the\noption list was stripped (this is a trailing slice of args). Each\noption-and-value pair returned has the option as its first element, prefixed\nwith a hyphen for short options (e.g., '-x') or two hyphens for long\noptions (e.g., '--long-option'), and the option argument as its\nsecond element, or an empty string if the option has no argument. The\noptions occur in the list in the same order in which they were found, thus\nallowing multiple occurrences. Long and short options may be mixed.
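A small sketch of the unique-prefix matching described above, using the same hypothetical option list ['foo', 'frob']:

```python
import getopt

# Sketch: '--fo' uniquely abbreviates '--foo', so the full name is returned.
opts, rest = getopt.getopt(['--fo', 'a1'], '', ['foo', 'frob'])
```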
\nThis function works like getopt(), except that GNU style scanning mode is\nused by default. This means that option and non-option arguments may be\nintermixed. The getopt() function stops processing options as soon as a\nnon-option argument is encountered.
\nIf the first character of the option string is '+', or if the environment\nvariable POSIXLY_CORRECT is set, then option processing stops as\nsoon as a non-option argument is encountered.
\n\nNew in version 2.3.
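A sketch contrasting the two scanning modes on the same argument list (the POSIXLY_CORRECT variable is cleared first so GNU-style scanning applies):

```python
import getopt
import os

os.environ.pop('POSIXLY_CORRECT', None)   # ensure GNU-style scanning
gnu_opts, gnu_rest = getopt.gnu_getopt(['a1', '-x', 'a2'], 'x')
# Plain getopt stops at the first non-option argument:
plain_opts, plain_rest = getopt.getopt(['a1', '-x', 'a2'], 'x')
```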
\nThis is raised when an unrecognized option is found in the argument list or when\nan option requiring an argument is given none. The argument to the exception is\na string indicating the cause of the error. For long options, an argument given\nto an option which does not require one will also cause this exception to be\nraised. The attributes msg and opt give the error message and\nrelated option; if there is no specific option to which the exception relates,\nopt is an empty string.
\n\nChanged in version 1.6: Introduced GetoptError as a synonym for error.
\nAn example using only Unix style options:
\n>>> import getopt\n>>> args = '-a -b -cfoo -d bar a1 a2'.split()\n>>> args\n['-a', '-b', '-cfoo', '-d', 'bar', 'a1', 'a2']\n>>> optlist, args = getopt.getopt(args, 'abc:d:')\n>>> optlist\n[('-a', ''), ('-b', ''), ('-c', 'foo'), ('-d', 'bar')]\n>>> args\n['a1', 'a2']\n
Using long option names is equally easy:
\n>>> s = '--condition=foo --testing --output-file abc.def -x a1 a2'\n>>> args = s.split()\n>>> args\n['--condition=foo', '--testing', '--output-file', 'abc.def', '-x', 'a1', 'a2']\n>>> optlist, args = getopt.getopt(args, 'x', [\n... 'condition=', 'output-file=', 'testing'])\n>>> optlist\n[('--condition', 'foo'), ('--testing', ''), ('--output-file', 'abc.def'), ('-x', '')]\n>>> args\n['a1', 'a2']\n
In a script, typical usage is something like this:
\nimport getopt, sys\n\ndef main():\n    try:\n        opts, args = getopt.getopt(sys.argv[1:], "ho:v", ["help", "output="])\n    except getopt.GetoptError, err:\n        # print help information and exit:\n        print str(err)  # will print something like "option -a not recognized"\n        usage()\n        sys.exit(2)\n    output = None\n    verbose = False\n    for o, a in opts:\n        if o == "-v":\n            verbose = True\n        elif o in ("-h", "--help"):\n            usage()\n            sys.exit()\n        elif o in ("-o", "--output"):\n            output = a\n        else:\n            assert False, "unhandled option"\n    # ...\n\nif __name__ == "__main__":\n    main()\n
Note that an equivalent command line interface could be produced with less code\nand more informative help and error messages by using the argparse module:
\nimport argparse\n\nif __name__ == '__main__':\n parser = argparse.ArgumentParser()\n parser.add_argument('-o', '--output')\n parser.add_argument('-v', dest='verbose', action='store_true')\n args = parser.parse_args()\n # ... do something with args.output ...\n # ... do something with args.verbose ...\n
\n\nNew in version 2.3.
\n\nDeprecated since version 2.7: The optparse module is deprecated and will not be developed further;\ndevelopment will continue with the argparse module.
\nSource code: Lib/optparse.py
\noptparse is a more convenient, flexible, and powerful library for parsing\ncommand-line options than the old getopt module. optparse uses a\nmore declarative style of command-line parsing: you create an instance of\nOptionParser, populate it with options, and parse the command\nline. optparse allows users to specify options in the conventional\nGNU/POSIX syntax, and additionally generates usage and help messages for you.
\nHere’s an example of using optparse in a simple script:
\nfrom optparse import OptionParser\n[...]\nparser = OptionParser()\nparser.add_option("-f", "--file", dest="filename",\n help="write report to FILE", metavar="FILE")\nparser.add_option("-q", "--quiet",\n action="store_false", dest="verbose", default=True,\n help="don't print status messages to stdout")\n\n(options, args) = parser.parse_args()\n
With these few lines of code, users of your script can now do the “usual thing”\non the command-line, for example:
\n<yourscript> --file=outfile -q
\nAs it parses the command line, optparse sets attributes of the\noptions object returned by parse_args() based on user-supplied\ncommand-line values. When parse_args() returns from parsing this command\nline, options.filename will be "outfile" and options.verbose will be\nFalse. optparse supports both long and short options, allows short\noptions to be merged together, and allows options to be associated with their\narguments in a variety of ways. Thus, the following command lines are all\nequivalent to the above example:
\n<yourscript> -f outfile --quiet\n<yourscript> --quiet --file outfile\n<yourscript> -q -foutfile\n<yourscript> -qfoutfile
\nAdditionally, users can run one of
\n<yourscript> -h\n<yourscript> --help
\nand optparse will print out a brief summary of your script’s options:
\nUsage: <yourscript> [options]\n\nOptions:\n -h, --help show this help message and exit\n -f FILE, --file=FILE write report to FILE\n -q, --quiet don't print status messages to stdout\n
where the value of yourscript is determined at runtime (normally from\nsys.argv[0]).
\noptparse was explicitly designed to encourage the creation of programs\nwith straightforward, conventional command-line interfaces. To that end, it\nsupports only the most common command-line syntax and semantics conventionally\nused under Unix. If you are unfamiliar with these conventions, read this\nsection to acquaint yourself with them.
\na string entered on the command-line, and passed by the shell to execl()\nor execv(). In Python, arguments are elements of sys.argv[1:]\n(sys.argv[0] is the name of the program being executed). Unix shells\nalso use the term “word”.
\nIt is occasionally desirable to substitute an argument list other than\nsys.argv[1:], so you should read “argument” as “an element of\nsys.argv[1:], or of some other list provided as a substitute for\nsys.argv[1:]”.
\nan argument used to supply extra information to guide or customize the\nexecution of a program. There are many different syntaxes for options; the\ntraditional Unix syntax is a hyphen (“-“) followed by a single letter,\ne.g. -x or -F. Also, traditional Unix syntax allows multiple\noptions to be merged into a single argument, e.g. -x -F is equivalent\nto -xF. The GNU project introduced -- followed by a series of\nhyphen-separated words, e.g. --file or --dry-run. These are the\nonly two option syntaxes provided by optparse.
\nSome other option syntaxes that the world has seen include:
\nThese option syntaxes are not supported by optparse, and they never\nwill be. This is deliberate: the first three are non-standard on any\nenvironment, and the last only makes sense if you’re exclusively targeting\nVMS, MS-DOS, and/or Windows.
\nan argument that follows an option, is closely associated with that option,\nand is consumed from the argument list when that option is. With\noptparse, option arguments may either be in a separate argument from\ntheir option:
\n-f foo\n--file foo\n
or included in the same argument:
\n-ffoo\n--file=foo\n
Typically, a given option either takes an argument or it doesn’t. Lots of\npeople want an “optional option arguments” feature, meaning that some options\nwill take an argument if they see it, and won’t if they don’t. This is\nsomewhat controversial, because it makes parsing ambiguous: if -a takes\nan optional argument and -b is another option entirely, how do we\ninterpret -ab? Because of this ambiguity, optparse does not\nsupport this feature.
\nFor example, consider this hypothetical command-line:
\nprog -v --report /tmp/report.txt foo bar
\n-v and --report are both options. Assuming that --report\ntakes one argument, /tmp/report.txt is an option argument. foo and\nbar are positional arguments.
\nOptions are used to provide extra information to tune or customize the execution\nof a program. In case it wasn’t clear, options are usually optional. A\nprogram should be able to run just fine with no options whatsoever. (Pick a\nrandom program from the Unix or GNU toolsets. Can it run without any options at\nall and still make sense? The main exceptions are find, tar, and\ndd—all of which are mutant oddballs that have been rightly criticized\nfor their non-standard syntax and confusing interfaces.)
\nLots of people want their programs to have “required options”. Think about it.\nIf it’s required, then it’s not optional! If there is a piece of information\nthat your program absolutely requires in order to run successfully, that’s what\npositional arguments are for.
\nAs an example of good command-line interface design, consider the humble cp\nutility, for copying files. It doesn’t make much sense to try to copy files\nwithout supplying a destination and at least one source. Hence, cp fails if\nyou run it with no arguments. However, it has a flexible, useful syntax that\ndoes not require any options at all:
\ncp SOURCE DEST\ncp SOURCE ... DEST-DIR
\nYou can get pretty far with just that. Most cp implementations provide a\nbunch of options to tweak exactly how the files are copied: you can preserve\nmode and modification time, avoid following symlinks, ask before clobbering\nexisting files, etc. But none of this distracts from the core mission of\ncp, which is to copy either one file to another, or several files to another\ndirectory.
\nPositional arguments are for those pieces of information that your program\nabsolutely, positively requires to run.
\nA good user interface should have as few absolute requirements as possible. If\nyour program requires 17 distinct pieces of information in order to run\nsuccessfully, it doesn’t much matter how you get that information from the\nuser—most people will give up and walk away before they successfully run the\nprogram. This applies whether the user interface is a command-line, a\nconfiguration file, or a GUI: if you make that many demands on your users, most\nof them will simply give up.
\nIn short, try to minimize the amount of information that users are absolutely\nrequired to supply—use sensible defaults whenever possible. Of course, you\nalso want to make your programs reasonably flexible. That’s what options are\nfor. Again, it doesn’t matter if they are entries in a config file, widgets in\nthe “Preferences” dialog of a GUI, or command-line options—the more options\nyou implement, the more flexible your program is, and the more complicated its\nimplementation becomes. Too much flexibility has drawbacks as well, of course;\ntoo many options can overwhelm users and make your code much harder to maintain.
\nWhile optparse is quite flexible and powerful, it’s also straightforward\nto use in most cases. This section covers the code patterns that are common to\nany optparse-based program.
\nFirst, you need to import the OptionParser class; then, early in the main\nprogram, create an OptionParser instance:
\nfrom optparse import OptionParser\n[...]\nparser = OptionParser()\n
Then you can start defining options. The basic syntax is:
\nparser.add_option(opt_str, ...,\n attr=value, ...)\n
Each option has one or more option strings, such as -f or --file,\nand several option attributes that tell optparse what to expect and what\nto do when it encounters that option on the command line.
\nTypically, each option will have one short option string and one long option\nstring, e.g.:
\nparser.add_option("-f", "--file", ...)\n
You’re free to define as many short option strings and as many long option\nstrings as you like (including zero), as long as there is at least one option\nstring overall.
\nThe option strings passed to add_option() are effectively labels for the\noption defined by that call. For brevity, we will frequently refer to\nencountering an option on the command line; in reality, optparse\nencounters option strings and looks up options from them.
\nOnce all of your options are defined, instruct optparse to parse your\nprogram’s command line:
\n(options, args) = parser.parse_args()\n
(If you like, you can pass a custom argument list to parse_args(), but\nthat’s rarely necessary: by default it uses sys.argv[1:].)
\nparse_args() returns two values:
\nThis tutorial section only covers the four most important option attributes:\naction, type, dest\n(destination), and help. Of these, action is the\nmost fundamental.
\nActions tell optparse what to do when it encounters an option on the\ncommand line. There is a fixed set of actions hard-coded into optparse;\nadding new actions is an advanced topic covered in section\nExtending optparse. Most actions tell optparse to store\na value in some variable—for example, take a string from the command line and\nstore it in an attribute of options.
\nIf you don’t specify an option action, optparse defaults to store.
\nThe most common option action is store, which tells optparse to take\nthe next argument (or the remainder of the current argument), ensure that it is\nof the correct type, and store it to your chosen destination.
\nFor example:
\nparser.add_option("-f", "--file",\n action="store", type="string", dest="filename")\n
Now let’s make up a fake command line and ask optparse to parse it:
\nargs = ["-f", "foo.txt"]\n(options, args) = parser.parse_args(args)\n
When optparse sees the option string -f, it consumes the next\nargument, foo.txt, and stores it in options.filename. So, after this\ncall to parse_args(), options.filename is "foo.txt".
\nSome other option types supported by optparse are int and float.\nHere’s an option that expects an integer argument:
\nparser.add_option("-n", type="int", dest="num")\n
Note that this option has no long option string, which is perfectly acceptable.\nAlso, there’s no explicit action, since the default is store.
\nLet’s parse another fake command-line. This time, we’ll jam the option argument\nright up against the option: since -n42 (one argument) is equivalent to\n-n 42 (two arguments), the code
\n(options, args) = parser.parse_args(["-n42"])\nprint options.num\n
will print 42.
\nIf you don’t specify a type, optparse assumes string. Combined with\nthe fact that the default action is store, that means our first example can\nbe a lot shorter:
\nparser.add_option("-f", "--file", dest="filename")\n
If you don’t supply a destination, optparse figures out a sensible\ndefault from the option strings: if the first long option string is\n--foo-bar, then the default destination is foo_bar. If there are no\nlong option strings, optparse looks at the first short option string: the\ndefault destination for -f is f.
\noptparse also includes built-in long and complex types. Adding\ntypes is covered in section Extending optparse.
\nFlag options—set a variable to true or false when a particular option is seen\n—are quite common. optparse supports them with two separate actions,\nstore_true and store_false. For example, you might have a verbose\nflag that is turned on with -v and off with -q:
\nparser.add_option("-v", action="store_true", dest="verbose")\nparser.add_option("-q", action="store_false", dest="verbose")\n
Here we have two different options with the same destination, which is perfectly\nOK. (It just means you have to be a bit careful when setting default values—\nsee below.)
\nWhen optparse encounters -v on the command line, it sets\noptions.verbose to True; when it encounters -q,\noptions.verbose is set to False.
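Since options are processed left to right, the last flag seen wins when both appear on one command line. A minimal demonstration:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-v", action="store_true", dest="verbose")
parser.add_option("-q", action="store_false", dest="verbose")

# Flags are processed left to right, so -q overrides the earlier -v.
options, args = parser.parse_args(["-v", "-q"])
# options.verbose is now False; with no flags at all it would be None
```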
\nSome other actions supported by optparse are:
\nThese are covered in section Reference Guide\nand section Option Callbacks.
\nAll of the above examples involve setting some variable (the “destination”) when\ncertain command-line options are seen. What happens if those options are never\nseen? Since we didn’t supply any defaults, they are all set to None. This\nis usually fine, but sometimes you want more control. optparse lets you\nsupply a default value for each destination, which is assigned before the\ncommand line is parsed.
\nFirst, consider the verbose/quiet example. If we want optparse to set\nverbose to True unless -q is seen, then we can do this:
\nparser.add_option("-v", action="store_true", dest="verbose", default=True)\nparser.add_option("-q", action="store_false", dest="verbose")\n
Since default values apply to the destination rather than to any particular\noption, and these two options happen to have the same destination, this is\nexactly equivalent:
\nparser.add_option("-v", action="store_true", dest="verbose")\nparser.add_option("-q", action="store_false", dest="verbose", default=True)\n
Consider this:
\nparser.add_option("-v", action="store_true", dest="verbose", default=False)\nparser.add_option("-q", action="store_false", dest="verbose", default=True)\n
Again, the default value for verbose will be True: the last default\nvalue supplied for any particular destination is the one that counts.
\nA clearer way to specify default values is the set_defaults() method of\nOptionParser, which you can call at any time before calling parse_args():
\nparser.set_defaults(verbose=True)\nparser.add_option(...)\n(options, args) = parser.parse_args()\n
As before, the last value specified for a given option destination is the one\nthat counts. For clarity, try to use one method or the other of setting default\nvalues, not both.
\noptparse‘s ability to generate help and usage text automatically is\nuseful for creating user-friendly command-line interfaces. All you have to do\nis supply a help value for each option, and optionally a short\nusage message for your whole program. Here’s an OptionParser populated with\nuser-friendly (documented) options:
\nusage = "usage: %prog [options] arg1 arg2"\nparser = OptionParser(usage=usage)\nparser.add_option("-v", "--verbose",\n action="store_true", dest="verbose", default=True,\n help="make lots of noise [default]")\nparser.add_option("-q", "--quiet",\n action="store_false", dest="verbose",\n help="be vewwy quiet (I'm hunting wabbits)")\nparser.add_option("-f", "--filename",\n metavar="FILE", help="write output to FILE")\nparser.add_option("-m", "--mode",\n default="intermediate",\n help="interaction mode: novice, intermediate, "\n "or expert [default: %default]")\n
If optparse encounters either -h or --help on the\ncommand-line, or if you just call parser.print_help(), it prints the\nfollowing to standard output:
\nUsage: <yourscript> [options] arg1 arg2\n\nOptions:\n -h, --help show this help message and exit\n -v, --verbose make lots of noise [default]\n -q, --quiet be vewwy quiet (I'm hunting wabbits)\n -f FILE, --filename=FILE\n write output to FILE\n -m MODE, --mode=MODE interaction mode: novice, intermediate, or\n expert [default: intermediate]\n
(If the help output is triggered by a help option, optparse exits after\nprinting the help text.)
\nThere’s a lot going on here to help optparse generate the best possible\nhelp message:
\nthe script defines its own usage message:
\nusage = "usage: %prog [options] arg1 arg2"\n
optparse expands %prog in the usage string to the name of the\ncurrent program, i.e. os.path.basename(sys.argv[0]). The expanded string\nis then printed before the detailed option help.
\nIf you don’t supply a usage string, optparse uses a bland but sensible\ndefault: "Usage: %prog [options]", which is fine if your script doesn’t\ntake any positional arguments.
\nevery option defines a help string, and doesn’t worry about line-wrapping—\noptparse takes care of wrapping lines and making the help output look\ngood.
\noptions that take a value indicate this fact in their automatically-generated\nhelp message, e.g. for the “mode” option:
\n-m MODE, --mode=MODE
\nHere, “MODE” is called the meta-variable: it stands for the argument that the\nuser is expected to supply to -m/--mode. By default,\noptparse converts the destination variable name to uppercase and uses\nthat for the meta-variable. Sometimes, that’s not what you want—for\nexample, the --filename option explicitly sets metavar="FILE",\nresulting in this automatically-generated option description:
\n-f FILE, --filename=FILE
\nThis is important for more than just saving space, though: the manually\nwritten help text uses the meta-variable FILE to clue the user in that\nthere’s a connection between the semi-formal syntax -f FILE and the informal\nsemantic description “write output to FILE”. This is a simple but effective\nway to make your help text a lot clearer and more useful for end users.
\n\nNew in version 2.4: Options that have a default value can include %default in the help\nstring—optparse will replace it with str() of the option’s\ndefault value. If an option has no default value (or the default value is\nNone), %default expands to none.
\nWhen dealing with many options, it is convenient to group these options for\nbetter help output. An OptionParser can contain several option groups,\neach of which can contain several options.
\nAn option group is obtained using the class OptionGroup:
\nwhere
\nOptionGroup inherits from OptionContainer (like\nOptionParser) and so the add_option() method can be used to add\nan option to the group.
\nOnce all the options are declared, the group is added to the previously defined\nparser using the OptionParser method add_option_group().
\nContinuing with the parser defined in the previous section, adding an\nOptionGroup to a parser is easy:
\ngroup = OptionGroup(parser, "Dangerous Options",\n "Caution: use these options at your own risk. "\n "It is believed that some of them bite.")\ngroup.add_option("-g", action="store_true", help="Group option.")\nparser.add_option_group(group)\n
This would result in the following help output:
\nUsage: <yourscript> [options] arg1 arg2\n\nOptions:\n -h, --help show this help message and exit\n -v, --verbose make lots of noise [default]\n -q, --quiet be vewwy quiet (I'm hunting wabbits)\n -f FILE, --filename=FILE\n write output to FILE\n -m MODE, --mode=MODE interaction mode: novice, intermediate, or\n expert [default: intermediate]\n\n Dangerous Options:\n Caution: use these options at your own risk. It is believed that some\n of them bite.\n\n -g Group option.\n
A bit more complete example might involve using more than one group: still\nextending the previous example:
\ngroup = OptionGroup(parser, "Dangerous Options",\n "Caution: use these options at your own risk. "\n "It is believed that some of them bite.")\ngroup.add_option("-g", action="store_true", help="Group option.")\nparser.add_option_group(group)\n\ngroup = OptionGroup(parser, "Debug Options")\ngroup.add_option("-d", "--debug", action="store_true",\n help="Print debug information")\ngroup.add_option("-s", "--sql", action="store_true",\n help="Print all SQL statements executed")\ngroup.add_option("-e", action="store_true", help="Print every action done")\nparser.add_option_group(group)\n
that results in the following output:
\nUsage: <yourscript> [options] arg1 arg2\n\nOptions:\n -h, --help show this help message and exit\n -v, --verbose make lots of noise [default]\n -q, --quiet be vewwy quiet (I'm hunting wabbits)\n -f FILE, --filename=FILE\n write output to FILE\n -m MODE, --mode=MODE interaction mode: novice, intermediate, or expert\n [default: intermediate]\n\n Dangerous Options:\n Caution: use these options at your own risk. It is believed that some\n of them bite.\n\n -g Group option.\n\n Debug Options:\n -d, --debug Print debug information\n -s, --sql Print all SQL statements executed\n -e Print every action done\n
Another interesting method, in particular when working programmatically with\noption groups is:
\nSimilar to the brief usage string, optparse can also print a version\nstring for your program. You have to supply the string as the version\nargument to OptionParser:
\nparser = OptionParser(usage="%prog [-f] [-q]", version="%prog 1.0")\n
%prog is expanded just like it is in usage. Apart from that,\nversion can contain anything you like. When you supply it, optparse\nautomatically adds a --version option to your parser. If it encounters\nthis option on the command line, it expands your version string (by\nreplacing %prog), prints it to stdout, and exits.
\nFor example, if your script is called /usr/bin/foo:
\n$ /usr/bin/foo --version\nfoo 1.0
\nThe following two methods can be used to print and get the version string:
\nThere are two broad classes of errors that optparse has to worry about:\nprogrammer errors and user errors. Programmer errors are usually erroneous\ncalls to OptionParser.add_option(), e.g. invalid option strings, unknown\noption attributes, missing option attributes, etc. These are dealt with in the\nusual way: raise an exception (either optparse.OptionError or\nTypeError) and let the program crash.
\nHandling user errors is much more important, since they are guaranteed to happen\nno matter how stable your code is. optparse can automatically detect\nsome user errors, such as bad option arguments (passing -n 4x where\n-n takes an integer argument), missing arguments (-n at the end\nof the command line, where -n takes an argument of any type). Also,\nyou can call OptionParser.error() to signal an application-defined error\ncondition:
\n(options, args) = parser.parse_args()\n[...]\nif options.a and options.b:\n parser.error("options -a and -b are mutually exclusive")\n
In either case, optparse handles the error the same way: it prints the\nprogram’s usage message and an error message to standard error and exits with\nerror status 2.
\nConsider the first example above, where the user passes 4x to an option\nthat takes an integer:
\n$ /usr/bin/foo -n 4x\nUsage: foo [options]\n\nfoo: error: option -n: invalid integer value: '4x'
\nOr, where the user fails to pass a value at all:
\n$ /usr/bin/foo -n\nUsage: foo [options]\n\nfoo: error: -n option requires an argument
\noptparse-generated error messages take care always to mention the\noption involved in the error; be sure to do the same when calling\nOptionParser.error() from your application code.
\nIf optparse‘s default error-handling behaviour does not suit your needs,\nyou’ll need to subclass OptionParser and override its exit()\nand/or error() methods.
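For example, a parser embedded in a larger application might prefer to raise an exception rather than exit. This is only a sketch of the subclassing approach described above; the exception type is an arbitrary choice:

```python
import optparse

class NonExitingOptionParser(optparse.OptionParser):
    """Sketch: report user errors as exceptions instead of exiting."""
    def error(self, msg):
        # The stock implementation prints the usage message to stderr
        # and calls sys.exit(2); raising lets the host program recover.
        raise ValueError(msg)

parser = NonExitingOptionParser()
parser.add_option("-n", type="int", dest="num")

try:
    parser.parse_args(["-n", "4x"])   # bad integer argument
except ValueError as err:
    message = str(err)
# message mentions the offending option and value, e.g.
# "option -n: invalid integer value: '4x'"
```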
\nHere’s what optparse-based scripts usually look like:
\nfrom optparse import OptionParser\n[...]\ndef main():\n usage = "usage: %prog [options] arg"\n parser = OptionParser(usage)\n parser.add_option("-f", "--file", dest="filename",\n help="read data from FILENAME")\n parser.add_option("-v", "--verbose",\n action="store_true", dest="verbose")\n parser.add_option("-q", "--quiet",\n action="store_false", dest="verbose")\n [...]\n (options, args) = parser.parse_args()\n if len(args) != 1:\n parser.error("incorrect number of arguments")\n if options.verbose:\n print "reading %s..." % options.filename\n [...]\n\nif __name__ == "__main__":\n main()\n
The first step in using optparse is to create an OptionParser instance.
\nThe OptionParser constructor has no required arguments, but a number of\noptional keyword arguments. You should always pass them as keyword\narguments, i.e. do not rely on the order in which the arguments are declared.
\nThere are several ways to populate the parser with options. The preferred way\nis by using OptionParser.add_option(), as shown in section\nTutorial. add_option() can be called in one of two ways:
\nThe other alternative is to pass a list of pre-constructed Option instances to\nthe OptionParser constructor, as in:
\noption_list = [\n make_option("-f", "--filename",\n action="store", type="string", dest="filename"),\n make_option("-q", "--quiet",\n action="store_false", dest="verbose"),\n ]\nparser = OptionParser(option_list=option_list)\n
(make_option() is a factory function for creating Option instances;\ncurrently it is an alias for the Option constructor. A future version of\noptparse may split Option into several classes, and make_option()\nwill pick the right class to instantiate. Do not instantiate Option directly.)
\nEach Option instance represents a set of synonymous command-line option strings,\ne.g. -f and --file. You can specify any number of short or\nlong option strings, but you must specify at least one overall option string.
\nThe canonical way to create an Option instance is with the\nadd_option() method of OptionParser.
\nTo define an option with only a short option string:
\nparser.add_option("-f", attr=value, ...)\n
And to define an option with only a long option string:
\nparser.add_option("--foo", attr=value, ...)\n
The keyword arguments define attributes of the new Option object. The most\nimportant option attribute is action, and it largely\ndetermines which other attributes are relevant or required. If you pass\nirrelevant option attributes, or fail to pass required ones, optparse\nraises an OptionError exception explaining your mistake.
\nAn option’s action determines what optparse does when it encounters\nthis option on the command-line. The standard option actions hard-coded into\noptparse are:
\n(If you don’t supply an action, the default is "store". For this action,\nyou may also supply type and dest option\nattributes; see Standard option actions.)
\nAs you can see, most actions involve storing or updating a value somewhere.\noptparse always creates a special object for this, conventionally called\noptions (it happens to be an instance of optparse.Values). Option\narguments (and various other values) are stored as attributes of this object,\naccording to the dest (destination) option attribute.
\nFor example, when you call
\nparser.parse_args()\n
one of the first things optparse does is create the options object:
\noptions = Values()\n
If one of the options in this parser is defined with
\nparser.add_option("-f", "--file", action="store", type="string", dest="filename")\n
and the command-line being parsed includes any of the following:
\n-ffoo\n-f foo\n--file=foo\n--file foo
\nthen optparse, on seeing this option, will do the equivalent of
\noptions.filename = "foo"\n
The type and dest option attributes are almost\nas important as action, but action is the only\none that makes sense for all options.
\nThe following option attributes may be passed as keyword arguments to\nOptionParser.add_option(). If you pass an option attribute that is not\nrelevant to a particular option, or fail to pass a required option attribute,\noptparse raises OptionError.
\n(default: "store")
\nDetermines optparse‘s behaviour when this option is seen on the\ncommand line; the available options are documented here.
\n(default: "string")
\nThe argument type expected by this option (e.g., "string" or "int");\nthe available option types are documented here.
\n(default: derived from option strings)
\nIf the option’s action implies writing or modifying a value somewhere, this\ntells optparse where to write it: dest names an\nattribute of the options object that optparse builds as it parses\nthe command line.
\n(default: 1)
\nHow many arguments of type type should be consumed when this\noption is seen. If > 1, optparse will store a tuple of values to\ndest.
\nThe various option actions all have slightly different requirements and effects.\nMost actions have several relevant option attributes which you may specify to\nguide optparse‘s behaviour; a few have required attributes, which you\nmust specify for any option using that action.
\n"store" [relevant: type, dest,\nnargs, choices]
\nThe option must be followed by an argument, which is converted to a value\naccording to type and stored in dest. If\nnargs > 1, multiple arguments will be consumed from the\ncommand line; all will be converted according to type and\nstored to dest as a tuple. See the\nStandard option types section.
\nIf choices is supplied (a list or tuple of strings), the type\ndefaults to "choice".
\nIf type is not supplied, it defaults to "string".
\nIf dest is not supplied, optparse derives a destination\nfrom the first long option string (e.g., --foo-bar implies\nfoo_bar). If there are no long option strings, optparse derives a\ndestination from the first short option string (e.g., -f implies f).
\nExample:
\nparser.add_option("-f")\nparser.add_option("-p", type="float", nargs=3, dest="point")\n
As it parses the command line
\n-f foo.txt -p 1 -3.5 4 -fbar.txt
\noptparse will set
\noptions.f = "foo.txt"\noptions.point = (1.0, -3.5, 4.0)\noptions.f = "bar.txt"\n
"store_const" [required: const; relevant:\ndest]
\nThe value const is stored in dest.
\nExample:
\nparser.add_option("-q", "--quiet",\n action="store_const", const=0, dest="verbose")\nparser.add_option("-v", "--verbose",\n action="store_const", const=1, dest="verbose")\nparser.add_option("--noisy",\n action="store_const", const=2, dest="verbose")\n
If --noisy is seen, optparse will set
\noptions.verbose = 2\n
"store_true" [relevant: dest]
\nA special case of "store_const" that stores a true value to\ndest.
\n"store_false" [relevant: dest]
\nLike "store_true", but stores a false value.
\nExample:
\nparser.add_option("--clobber", action="store_true", dest="clobber")\nparser.add_option("--no-clobber", action="store_false", dest="clobber")\n
"append" [relevant: type, dest,\nnargs, choices]
\nThe option must be followed by an argument, which is appended to the list in\ndest. If no default value for dest is\nsupplied, an empty list is automatically created when optparse first\nencounters this option on the command-line. If nargs > 1,\nmultiple arguments are consumed, and a tuple of length nargs\nis appended to dest.
\nThe defaults for type and dest are the same as\nfor the "store" action.
\nExample:
\nparser.add_option("-t", "--tracks", action="append", type="int")\n
If -t3 is seen on the command-line, optparse does the equivalent\nof:
\noptions.tracks = []\noptions.tracks.append(int("3"))\n
If, a little later on, --tracks=4 is seen, it does:
\noptions.tracks.append(int("4"))\n
"append_const" [required: const; relevant:\ndest]
\nLike "store_const", but the value const is appended to\ndest; as with "append", dest defaults to\nNone, and an empty list is automatically created the first time the option\nis encountered.
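The text above gives no example for "append_const"; a short sketch, with flags invented purely for illustration:

```python
from optparse import OptionParser

# Each flag appends a fixed constant to the same list destination.
parser = OptionParser()
parser.add_option("--strip", action="append_const", const="-s",
                  dest="cc_flags")
parser.add_option("--static", action="append_const", const="-static",
                  dest="cc_flags")

options, args = parser.parse_args(["--strip", "--static"])
# options.cc_flags is ["-s", "-static"]
```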
\n"count" [relevant: dest]
\nIncrement the integer stored at dest. If no default value is\nsupplied, dest is set to zero before being incremented the\nfirst time.
\nExample:
\nparser.add_option("-v", action="count", dest="verbosity")\n
The first time -v is seen on the command line, optparse does the\nequivalent of:
\noptions.verbosity = 0\noptions.verbosity += 1\n
Every subsequent occurrence of -v results in
\noptions.verbosity += 1\n
"callback" [required: callback; relevant:\ntype, nargs, callback_args,\ncallback_kwargs]
\nCall the function specified by callback, which is called as
\nfunc(option, opt_str, value, parser, *args, **kwargs)\n
See section Option Callbacks for more detail.
\n"help"
\nPrints a complete help message for all the options in the current option\nparser. The help message is constructed from the usage string passed to\nOptionParser’s constructor and the help string passed to every\noption.
\nIf no help string is supplied for an option, it will still be\nlisted in the help message. To omit an option entirely, use the special value\noptparse.SUPPRESS_HELP.
\noptparse automatically adds a help option to all\nOptionParsers, so you do not normally need to create one.
\nExample:
from optparse import OptionParser, SUPPRESS_HELP

# usually, a help option is added automatically, but that can
# be suppressed using the add_help_option argument
parser = OptionParser(add_help_option=False)

parser.add_option("-h", "--help", action="help")
parser.add_option("-v", action="store_true", dest="verbose",
                  help="Be moderately verbose")
parser.add_option("--file", dest="filename",
                  help="Input file to read data from")
parser.add_option("--secret", help=SUPPRESS_HELP)
If optparse sees either -h or --help on the command line,\nit will print something like the following help message to stdout (assuming\nsys.argv[0] is "foo.py"):
Usage: foo.py [options]

Options:
  -h, --help        Show this help message and exit
  -v                Be moderately verbose
  --file=FILENAME   Input file to read data from
After printing the help message, optparse terminates your process with\nsys.exit(0).
\n"version"
\nPrints the version number supplied to the OptionParser to stdout and exits.\nThe version number is actually formatted and printed by the\nprint_version() method of OptionParser. Generally only relevant if the\nversion argument is supplied to the OptionParser constructor. As with\nhelp options, you will rarely create version options,\nsince optparse automatically adds them when needed.
\noptparse has six built-in option types: "string", "int",\n"long", "choice", "float" and "complex". If you need to add new\noption types, see section Extending optparse.
\nArguments to string options are not checked or converted in any way: the text on\nthe command line is stored in the destination (or passed to the callback) as-is.
Integer arguments (type "int" or "long") are parsed as follows:

- if the number starts with 0x or 0X, it is parsed as a hexadecimal number
- if the number starts with 0b or 0B, it is parsed as a binary number
- if the number starts with 0, it is parsed as an octal number
- otherwise, the number is parsed as a decimal number
\nThe conversion is done by calling either int() or long() with the\nappropriate base (2, 8, 10, or 16). If this fails, so will optparse,\nalthough with a more useful error message.
\n"float" and "complex" option arguments are converted directly with\nfloat() and complex(), with similar error-handling.
\n"choice" options are a subtype of "string" options. The\nchoices option attribute (a sequence of strings) defines the\nset of allowed option arguments. optparse.check_choice() compares\nuser-supplied option arguments against this master list and raises\nOptionValueError if an invalid string is given.
\nThe whole point of creating and populating an OptionParser is to call its\nparse_args() method:
\n(options, args) = parser.parse_args(args=None, values=None)\n
where the input parameters are

- args: the list of arguments to process (default: sys.argv[1:])
- values: an optparse.Values object to store option arguments in (default: a new instance of Values)

and the return values are

- options: the same object that was passed in as values, or the optparse.Values instance created by optparse
- args: the leftover positional arguments after all options have been processed
\nThe most common usage is to supply neither keyword argument. If you supply\nvalues, it will be modified with repeated setattr() calls (roughly one\nfor every option argument stored to an option destination) and returned by\nparse_args().
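A short sketch of supplying your own values object (the option and file names here are invented):

```python
from optparse import OptionParser, Values

parser = OptionParser()
parser.add_option("-f", "--file", dest="filename")

# parse_args() stores results into the supplied Values object with
# setattr() and returns that same object; option defaults are not
# applied to it.
vals = Values({"filename": "default.txt"})
options, args = parser.parse_args(["-f", "in.txt", "extra"], values=vals)
print(options is vals, options.filename, args)  # True in.txt ['extra']
```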
\nIf parse_args() encounters any errors in the argument list, it calls the\nOptionParser’s error() method with an appropriate end-user error message.\nThis ultimately terminates your process with an exit status of 2 (the\ntraditional Unix exit status for command-line errors).
\nThe default behavior of the option parser can be customized slightly, and you\ncan also poke around your option parser and see what’s there. OptionParser\nprovides several methods to help you out:
\nSet parsing to stop on the first non-option. For example, if -a and\n-b are both simple options that take no arguments, optparse\nnormally accepts this syntax:
\nprog -a arg1 -b arg2
\nand treats it as equivalent to
\nprog -a -b arg1 arg2
\nTo disable this feature, call disable_interspersed_args(). This\nrestores traditional Unix syntax, where option parsing stops with the first\nnon-option argument.
\nUse this if you have a command processor which runs another command which has\noptions of its own and you want to make sure these options don’t get\nconfused. For example, each command might have a different set of options.
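A sketch of the difference (the option and argument names here are invented):

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-a", action="store_true", dest="a")
parser.disable_interspersed_args()

# Parsing stops at the first positional argument, so a sub-command's
# own option ("-b" here) is left untouched for it to handle.
options, args = parser.parse_args(["-a", "subcommand", "-b", "arg1"])
print(options.a, args)  # True ['subcommand', '-b', 'arg1']
```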
\nIf you’re not careful, it’s easy to define options with conflicting option\nstrings:
parser.add_option("-n", "--dry-run", ...)
[...]
parser.add_option("-n", "--noisy", ...)
(This is particularly true if you’ve defined your own OptionParser subclass with\nsome standard options.)
\nEvery time you add an option, optparse checks for conflicts with existing\noptions. If it finds any, it invokes the current conflict-handling mechanism.\nYou can set the conflict-handling mechanism either in the constructor:
\nparser = OptionParser(..., conflict_handler=handler)\n
or with a separate call:
\nparser.set_conflict_handler(handler)\n
The available conflict handlers are:
- "error" (default): assume option conflicts are a programming error and raise OptionConflictError
- "resolve": resolve option conflicts intelligently (see below)
As an example, let’s define an OptionParser that resolves conflicts\nintelligently and add conflicting options to it:
parser = OptionParser(conflict_handler="resolve")
parser.add_option("-n", "--dry-run", ..., help="do no harm")
parser.add_option("-n", "--noisy", ..., help="be noisy")
At this point, optparse detects that a previously-added option is already\nusing the -n option string. Since conflict_handler is "resolve",\nit resolves the situation by removing -n from the earlier option’s list of\noption strings. Now --dry-run is the only way for the user to activate\nthat option. If the user asks for help, the help message will reflect that:
Options:
  --dry-run     do no harm
  [...]
  -n, --noisy   be noisy
\nIt’s possible to whittle away the option strings for a previously-added option\nuntil there are none left, and the user has no way of invoking that option from\nthe command-line. In that case, optparse removes that option completely,\nso it doesn’t show up in help text or anywhere else. Carrying on with our\nexisting OptionParser:
\nparser.add_option("--dry-run", ..., help="new dry-run option")\n
At this point, the original -n/--dry-run option is no longer\naccessible, so optparse removes it, leaving this help text:
Options:
  [...]
  -n, --noisy   be noisy
  --dry-run     new dry-run option
\nOptionParser instances have several cyclic references. This should not be a\nproblem for Python’s garbage collector, but you may wish to break the cyclic\nreferences explicitly by calling destroy() on your\nOptionParser once you are done with it. This is particularly useful in\nlong-running applications where large object graphs are reachable from your\nOptionParser.
\nOptionParser supports several other public methods:
\nSet default values for several option destinations at once. Using\nset_defaults() is the preferred way to set default values for options,\nsince multiple options can share the same destination. For example, if\nseveral “mode” options all set the same destination, any one of them can set\nthe default, and the last one wins:
parser.add_option("--advanced", action="store_const",
                  dest="mode", const="advanced",
                  default="novice")    # overridden below
parser.add_option("--novice", action="store_const",
                  dest="mode", const="novice",
                  default="advanced")  # overrides above setting
To avoid this confusion, use set_defaults():
parser.set_defaults(mode="advanced")
parser.add_option("--advanced", action="store_const",
                  dest="mode", const="advanced")
parser.add_option("--novice", action="store_const",
                  dest="mode", const="novice")
When optparse‘s built-in actions and types aren’t quite enough for your\nneeds, you have two choices: extend optparse or define a callback option.\nExtending optparse is more general, but overkill for a lot of simple\ncases. Quite often a simple callback is all you need.
There are two steps to defining a callback option:

- define the option itself using the "callback" action
- write the callback; this is a function (or method) that takes at least four arguments, as described below
\nAs always, the easiest way to define a callback option is by using the\nOptionParser.add_option() method. Apart from action, the\nonly option attribute you must specify is callback, the function to call:
\nparser.add_option("-c", action="callback", callback=my_callback)\n
callback is a function (or other callable object), so you must have already\ndefined my_callback() when you create this callback option. In this simple\ncase, optparse doesn’t even know if -c takes any arguments,\nwhich usually means that the option takes no arguments—the mere presence of\n-c on the command-line is all it needs to know. In some\ncircumstances, though, you might want your callback to consume an arbitrary\nnumber of command-line arguments. This is where writing callbacks gets tricky;\nit’s covered later in this section.
\noptparse always passes four particular arguments to your callback, and it\nwill only pass additional arguments if you specify them via\ncallback_args and callback_kwargs. Thus, the\nminimal callback function signature is:
\ndef my_callback(option, opt, value, parser):
\nThe four arguments to a callback are described below.
There are several other option attributes that you can supply when you define a callback option:

- type has its usual meaning: as with the "store" or "append" actions, it instructs optparse to consume one argument and convert it to type. Rather than storing the converted value anywhere, though, optparse passes it to your callback function.
- nargs also has its usual meaning: if it is supplied and > 1, optparse will consume nargs arguments, each of which must be convertible to type. It then passes a tuple of converted values to your callback.
- callback_args, a tuple of extra positional arguments to pass to the callback
- callback_kwargs, a dictionary of extra keyword arguments to pass to the callback
\nAll callbacks are called as follows:
\nfunc(option, opt_str, value, parser, *args, **kwargs)\n
where

- option is the Option instance that's calling the callback
- opt_str is the option string seen on the command-line that's triggering the callback
- value is the argument to this option seen on the command-line; optparse will only expect an argument if type is set, and value will be of the type implied by the option's type
- parser is the OptionParser instance driving the whole thing, mainly useful because you can access some other interesting data through its instance attributes:
  - parser.largs, the current list of leftover arguments
  - parser.rargs, the current list of remaining arguments
  - parser.values, the object where option values are stored
- args is a tuple of arbitrary positional arguments supplied via the callback_args option attribute
- kwargs is a dictionary of arbitrary keyword arguments supplied via callback_kwargs
\nThe callback function should raise OptionValueError if there are any\nproblems with the option or its argument(s). optparse catches this and\nterminates the program, printing the error message you supply to stderr. Your\nmessage should be clear, concise, accurate, and mention the option at fault.\nOtherwise, the user will have a hard time figuring out what he did wrong.
\nHere’s an example of a callback option that takes no arguments, and simply\nrecords that the option was seen:
def record_foo_seen(option, opt_str, value, parser):
    parser.values.saw_foo = True

parser.add_option("--foo", action="callback", callback=record_foo_seen)
Of course, you could do that with the "store_true" action.
\nHere’s a slightly more interesting example: record the fact that -a is\nseen, but blow up if it comes after -b in the command-line.
def check_order(option, opt_str, value, parser):
    if parser.values.b:
        raise OptionValueError("can't use -a after -b")
    parser.values.a = 1
[...]
parser.add_option("-a", action="callback", callback=check_order)
parser.add_option("-b", action="store_true", dest="b")
If you want to re-use this callback for several similar options (set a flag, but\nblow up if -b has already been seen), it needs a bit of work: the error\nmessage and the flag that it sets must be generalized.
def check_order(option, opt_str, value, parser):
    if parser.values.b:
        raise OptionValueError("can't use %s after -b" % opt_str)
    setattr(parser.values, option.dest, 1)
[...]
parser.add_option("-a", action="callback", callback=check_order, dest='a')
parser.add_option("-b", action="store_true", dest="b")
parser.add_option("-c", action="callback", callback=check_order, dest='c')
Of course, you could put any condition in there—you’re not limited to checking\nthe values of already-defined options. For example, if you have options that\nshould not be called when the moon is full, all you have to do is this:
def check_moon(option, opt_str, value, parser):
    if is_moon_full():
        raise OptionValueError("%s option invalid when moon is full" % opt_str)
    setattr(parser.values, option.dest, 1)
[...]
parser.add_option("--foo",
                  action="callback", callback=check_moon, dest="foo")
(The definition of is_moon_full() is left as an exercise for the reader.)
\nThings get slightly more interesting when you define callback options that take\na fixed number of arguments. Specifying that a callback option takes arguments\nis similar to defining a "store" or "append" option: if you define\ntype, then the option takes one argument that must be\nconvertible to that type; if you further define nargs, then the\noption takes nargs arguments.
\nHere’s an example that just emulates the standard "store" action:
def store_value(option, opt_str, value, parser):
    setattr(parser.values, option.dest, value)
[...]
parser.add_option("--foo",
                  action="callback", callback=store_value,
                  type="int", nargs=3, dest="foo")
Note that optparse takes care of consuming 3 arguments and converting\nthem to integers for you; all you have to do is store them. (Or whatever;\nobviously you don’t need a callback for this example.)
Things get hairy when you want an option to take a variable number of arguments. For this case, you must write a callback, as optparse doesn't provide any built-in capabilities for it. And you have to deal with certain intricacies of conventional Unix command-line parsing that optparse normally handles for you. In particular, callbacks should implement the conventional rules for bare -- and - arguments:

- either -- or - can be option arguments
- bare -- (if not the argument to some option): halt command-line processing and discard the --
- bare - (if not the argument to some option): halt command-line processing but keep the - (append it to parser.largs)
\nIf you want an option that takes a variable number of arguments, there are\nseveral subtle, tricky issues to worry about. The exact implementation you\nchoose will be based on which trade-offs you’re willing to make for your\napplication (which is why optparse doesn’t support this sort of thing\ndirectly).
\nNevertheless, here’s a stab at a callback for an option with variable\narguments:
def vararg_callback(option, opt_str, value, parser):
    assert value is None
    value = []

    def floatable(str):
        try:
            float(str)
            return True
        except ValueError:
            return False

    for arg in parser.rargs:
        # stop on --foo like options
        if arg[:2] == "--" and len(arg) > 2:
            break
        # stop on -a, but not on -3 or -3.0
        if arg[:1] == "-" and len(arg) > 1 and not floatable(arg):
            break
        value.append(arg)

    del parser.rargs[:len(value)]
    setattr(parser.values, option.dest, value)

[...]
parser.add_option("-c", "--callback", dest="vararg_attr",
                  action="callback", callback=vararg_callback)
\nSince the two major controlling factors in how optparse interprets\ncommand-line options are the action and type of each option, the most likely\ndirection of extension is to add new actions and new types.
\nTo add new types, you need to define your own subclass of optparse‘s\nOption class. This class has a couple of attributes that define\noptparse‘s types: TYPES and TYPE_CHECKER.
\nA dictionary mapping type names to type-checking functions. A type-checking\nfunction has the following signature:
\ndef check_mytype(option, opt, value)
\nwhere option is an Option instance, opt is an option string\n(e.g., -f), and value is the string from the command line that must\nbe checked and converted to your desired type. check_mytype() should\nreturn an object of the hypothetical type mytype. The value returned by\na type-checking function will wind up in the OptionValues instance returned\nby OptionParser.parse_args(), or be passed to a callback as the\nvalue parameter.
\nYour type-checking function should raise OptionValueError if it\nencounters any problems. OptionValueError takes a single string\nargument, which is passed as-is to OptionParser‘s error()\nmethod, which in turn prepends the program name and the string "error:"\nand prints everything to stderr before terminating the process.
\nHere’s a silly example that demonstrates adding a "complex" option type to\nparse Python-style complex numbers on the command line. (This is even sillier\nthan it used to be, because optparse 1.3 added built-in support for\ncomplex numbers, but never mind.)
\nFirst, the necessary imports:
from copy import copy
from optparse import Option, OptionValueError
You need to define your type-checker first, since it’s referred to later (in the\nTYPE_CHECKER class attribute of your Option subclass):
def check_complex(option, opt, value):
    try:
        return complex(value)
    except ValueError:
        raise OptionValueError(
            "option %s: invalid complex value: %r" % (opt, value))
Finally, the Option subclass:
class MyOption(Option):
    TYPES = Option.TYPES + ("complex",)
    TYPE_CHECKER = copy(Option.TYPE_CHECKER)
    TYPE_CHECKER["complex"] = check_complex
(If we didn’t make a copy() of Option.TYPE_CHECKER, we would end\nup modifying the TYPE_CHECKER attribute of optparse‘s\nOption class. This being Python, nothing stops you from doing that except good\nmanners and common sense.)
\nThat’s it! Now you can write a script that uses the new option type just like\nany other optparse-based script, except you have to instruct your\nOptionParser to use MyOption instead of Option:
parser = OptionParser(option_class=MyOption)
parser.add_option("-c", type="complex")
Alternately, you can build your own option list and pass it to OptionParser; if\nyou don’t use add_option() in the above way, you don’t need to tell\nOptionParser which option class to use:
option_list = [MyOption("-c", action="store", type="complex", dest="c")]
parser = OptionParser(option_list=option_list)
Adding new actions is a bit trickier, because you have to understand that\noptparse has a couple of classifications for actions:
\nThese are overlapping sets: some default “store” actions are "store",\n"store_const", "append", and "count", while the default “typed”\nactions are "store", "append", and "callback".
When you add an action, you need to categorize it by listing it in at least one of the following class attributes of Option (all are lists of strings):

- ACTIONS: all actions must be listed in ACTIONS
- STORE_ACTIONS: actions that result in a value being stored need to be listed here
- TYPED_ACTIONS: actions that take a value from the command line need to be listed here
- ALWAYS_TYPED_ACTIONS: actions that always take a type (i.e. whose options always take a value) are listed here; the only effect of this is that optparse assigns the default type "string" to options with no explicit type whose action is listed here
\nIn order to actually implement your new action, you must override Option’s\ntake_action() method and add a case that recognizes your action.
\nFor example, let’s add an "extend" action. This is similar to the standard\n"append" action, but instead of taking a single value from the command-line\nand appending it to an existing list, "extend" will take multiple values in\na single comma-delimited string, and extend an existing list with them. That\nis, if --names is an "extend" option of type "string", the command\nline
\n--names=foo,bar --names blah --names ding,dong
\nwould result in a list
\n["foo", "bar", "blah", "ding", "dong"]\n
Again we define a subclass of Option:
class MyOption(Option):

    ACTIONS = Option.ACTIONS + ("extend",)
    STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
    TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
    ALWAYS_TYPED_ACTIONS = Option.ALWAYS_TYPED_ACTIONS + ("extend",)

    def take_action(self, action, dest, opt, value, values, parser):
        if action == "extend":
            lvalue = value.split(",")
            values.ensure_value(dest, []).extend(lvalue)
        else:
            Option.take_action(
                self, action, dest, opt, value, values, parser)
Features of note:
\n"extend" both expects a value on the command-line and stores that value\nsomewhere, so it goes in both STORE_ACTIONS and\nTYPED_ACTIONS.
\nto ensure that optparse assigns the default type of "string" to\n"extend" actions, we put the "extend" action in\nALWAYS_TYPED_ACTIONS as well.
\nMyOption.take_action() implements just this one new action, and passes\ncontrol back to Option.take_action() for the standard optparse\nactions.
values is an instance of the optparse.Values class, which provides the very useful ensure_value() method. ensure_value() is essentially getattr() with a safety valve; it is called as
\nvalues.ensure_value(attr, value)\n
If the attr attribute of values doesn’t exist or is None, then\nensure_value() first sets it to value, and then returns ‘value. This is\nvery handy for actions like "extend", "append", and "count", all\nof which accumulate data in a variable and expect that variable to be of a\ncertain type (a list for the first two, an integer for the latter). Using\nensure_value() means that scripts using your action don’t have to worry\nabout setting a default value for the option destinations in question; they\ncan just leave the default as None and ensure_value() will take care of\ngetting it right when it’s needed.
\nThe getpass module provides two functions:
\nPrompt the user for a password without echoing. The user is prompted using the\nstring prompt, which defaults to 'Password: '. On Unix, the prompt is\nwritten to the file-like object stream. stream defaults to the\ncontrolling terminal (/dev/tty) or if that is unavailable to sys.stderr\n(this argument is ignored on Windows).
If echo-free input is unavailable, getpass() falls back to printing a warning message to stream, reading from sys.stdin, and issuing a GetPassWarning.
\nAvailability: Macintosh, Unix, Windows.
\n\nChanged in version 2.5: The stream parameter was added.
\n\nChanged in version 2.6: On Unix it defaults to using /dev/tty before falling back\nto sys.stdin and sys.stderr.
\nNote
If you call getpass from within IDLE, the input may be done in the terminal you launched IDLE from rather than the IDLE window itself.
\nReturn the “login name” of the user. Availability: Unix, Windows.
\nThis function checks the environment variables LOGNAME,\nUSER, LNAME and USERNAME, in order, and returns\nthe value of the first one which is set to a non-empty string. If none are set,\nthe login name from the password database is returned on systems which support\nthe pwd module, otherwise, an exception is raised.
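For example, since LOGNAME is consulted first, setting it determines the result (the user name here is invented; this relies on the environment-variable lookup described above and so applies to systems where that lookup succeeds):

```python
import getpass
import os

# LOGNAME is the first variable checked, so it wins over USER etc.
os.environ["LOGNAME"] = "alice"
print(getpass.getuser())  # alice
```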
\nThe following useful handlers are provided in the package. Note that three of\nthe handlers (StreamHandler, FileHandler and\nNullHandler) are actually defined in the logging module itself,\nbut have been documented here along with the other handlers.
\nThe StreamHandler class, located in the core logging package,\nsends logging output to streams such as sys.stdout, sys.stderr or any\nfile-like object (or, more precisely, any object which supports write()\nand flush() methods).
\nReturns a new instance of the StreamHandler class. If stream is\nspecified, the instance will use it for logging output; otherwise, sys.stderr\nwill be used.
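Because any object with write() and flush() methods is acceptable, an in-memory stream makes the handler's output easy to inspect (the logger name is invented):

```python
import io
import logging

# Send log output to a StringIO instead of sys.stderr.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger = logging.getLogger("streamdemo")  # hypothetical logger name
logger.addHandler(handler)

logger.warning("disk space low")
print(stream.getvalue())  # disk space low
```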
\nThe FileHandler class, located in the core logging package,\nsends logging output to a disk file. It inherits the output functionality from\nStreamHandler.
\nReturns a new instance of the FileHandler class. The specified file is\nopened and used as the stream for logging. If mode is not specified,\n'a' is used. If encoding is not None, it is used to open the file\nwith that encoding. If delay is true, then file opening is deferred until the\nfirst call to emit(). By default, the file grows indefinitely.
\n\nChanged in version 2.6: delay was added.
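A sketch of the delay behaviour (the path and logger name are invented):

```python
import logging
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.log")

# With delay=True the file is not opened (or even created) until the
# first record is emitted.
handler = logging.FileHandler(path, mode="a", delay=True)
print(os.path.exists(path))  # False

logger = logging.getLogger("filedemo")  # hypothetical logger name
logger.addHandler(handler)
logger.error("something went wrong")
handler.close()
print(os.path.exists(path))  # True
```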
\n\nNew in version 2.7.
\nThe NullHandler class, located in the core logging package,\ndoes not do any formatting or output. It is essentially a ‘no-op’ handler\nfor use by library developers.
\nReturns a new instance of the NullHandler class.
\nSee Configuring Logging for a Library for more information on how to use\nNullHandler.
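The usual library pattern looks like this (the library name is invented):

```python
import logging

# Attach a NullHandler to the library's top-level logger so that its
# logging calls are silently discarded unless the application
# configures handlers itself.
logging.getLogger("mylibrary").addHandler(logging.NullHandler())

# Library code can now log without triggering "no handlers" warnings:
logging.getLogger("mylibrary").info("cache initialized")
```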
\n\nNew in version 2.6.
\nThe WatchedFileHandler class, located in the logging.handlers\nmodule, is a FileHandler which watches the file it is logging to. If\nthe file changes, it is closed and reopened using the file name.
\nA file change can happen because of usage of programs such as newsyslog and\nlogrotate which perform log file rotation. This handler, intended for use\nunder Unix/Linux, watches the file to see if it has changed since the last emit.\n(A file is deemed to have changed if its device or inode have changed.) If the\nfile has changed, the old file stream is closed, and the file opened to get a\nnew stream.
\nThis handler is not appropriate for use under Windows, because under Windows\nopen log files cannot be moved or renamed - logging opens the files with\nexclusive locks - and so there is no need for such a handler. Furthermore,\nST_INO is not supported under Windows; stat() always returns zero for\nthis value.
\nReturns a new instance of the WatchedFileHandler class. The specified\nfile is opened and used as the stream for logging. If mode is not specified,\n'a' is used. If encoding is not None, it is used to open the file\nwith that encoding. If delay is true, then file opening is deferred until the\nfirst call to emit(). By default, the file grows indefinitely.
\nThe RotatingFileHandler class, located in the logging.handlers\nmodule, supports rotation of disk log files.
\nReturns a new instance of the RotatingFileHandler class. The specified\nfile is opened and used as the stream for logging. If mode is not specified,\n'a' is used. If encoding is not None, it is used to open the file\nwith that encoding. If delay is true, then file opening is deferred until the\nfirst call to emit(). By default, the file grows indefinitely.
\nYou can use the maxBytes and backupCount values to allow the file to\nrollover at a predetermined size. When the size is about to be exceeded,\nthe file is closed and a new file is silently opened for output. Rollover occurs\nwhenever the current log file is nearly maxBytes in length; if maxBytes is\nzero, rollover never occurs. If backupCount is non-zero, the system will save\nold log files by appending the extensions ‘.1’, ‘.2’ etc., to the filename. For\nexample, with a backupCount of 5 and a base file name of app.log, you\nwould get app.log, app.log.1, app.log.2, up to\napp.log.5. The file being written to is always app.log. When\nthis file is filled, it is closed and renamed to app.log.1, and if files\napp.log.1, app.log.2, etc. exist, then they are renamed to\napp.log.2, app.log.3 etc. respectively.
\n\nChanged in version 2.6: delay was added.
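A sketch of rollover with a deliberately tiny maxBytes (file and logger names are invented):

```python
import logging
import logging.handlers
import os
import tempfile

logdir = tempfile.mkdtemp()
path = os.path.join(logdir, "app.log")

# maxBytes=50 forces a rollover every few records; backupCount=2 keeps
# only app.log.1 and app.log.2, discarding older files.
handler = logging.handlers.RotatingFileHandler(
    path, maxBytes=50, backupCount=2)
logger = logging.getLogger("rotatedemo")  # hypothetical logger name
logger.addHandler(handler)

for i in range(10):
    logger.warning("message number %d", i)
handler.close()

print(sorted(os.listdir(logdir)))  # ['app.log', 'app.log.1', 'app.log.2']
```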
\nThe TimedRotatingFileHandler class, located in the\nlogging.handlers module, supports rotation of disk log files at certain\ntimed intervals.
\nReturns a new instance of the TimedRotatingFileHandler class. The\nspecified file is opened and used as the stream for logging. On rotating it also\nsets the filename suffix. Rotating happens based on the product of when and\ninterval.
You can use when to specify the type of interval. The list of possible values is below. Note that they are not case sensitive.

Value | Type of interval
---|---
'S' | Seconds
'M' | Minutes
'H' | Hours
'D' | Days
'W' | Week day (0=Monday)
'midnight' | Roll over at midnight
The system will save old log files by appending extensions to the filename.\nThe extensions are date-and-time based, using the strftime format\n%Y-%m-%d_%H-%M-%S or a leading portion thereof, depending on the\nrollover interval.
\nWhen computing the next rollover time for the first time (when the handler\nis created), the last modification time of an existing log file, or else\nthe current time, is used to compute when the next rotation will occur.
\nIf the utc argument is true, times in UTC will be used; otherwise\nlocal time is used.
\nIf backupCount is nonzero, at most backupCount files\nwill be kept, and if more would be created when rollover occurs, the oldest\none is deleted. The deletion logic uses the interval to determine which\nfiles to delete, so changing the interval may leave old files lying around.
\nIf delay is true, then file opening is deferred until the first call to\nemit().
\n\nChanged in version 2.6: delay and utc were added.
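A minimal construction sketch (path and logger name are invented; no rotation actually occurs in this snippet):

```python
import logging
import logging.handlers
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "timed.log")

# Rotate at midnight and keep a week of date-suffixed backups.
handler = logging.handlers.TimedRotatingFileHandler(
    path, when="midnight", backupCount=7)
logger = logging.getLogger("timeddemo")  # hypothetical logger name
logger.addHandler(handler)
logger.warning("service started")
handler.close()
print(os.path.exists(path))  # True
```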
\nThe SocketHandler class, located in the logging.handlers module,\nsends logging output to a network socket. The base class uses a TCP socket.
\nReturns a new instance of the SocketHandler class intended to\ncommunicate with a remote machine whose address is given by host and port.
\nPickles the record’s attribute dictionary in binary format with a length\nprefix, and returns it ready for transmission across the socket.
\nNote that pickles aren’t completely secure. If you are concerned about\nsecurity, you may want to override this method to implement a more secure\nmechanism. For example, you can sign pickles using HMAC and then verify\nthem on the receiving end, or alternatively you can disable unpickling of\nglobal objects on the receiving end.
Tries to create a socket; on failure, uses an exponential back-off algorithm. On initial failure, the handler will drop the message it was trying to send. When subsequent messages are handled by the same instance, it will not try connecting until some time has passed. The default parameters are such that the initial delay is one second, and if after that delay the connection still can't be made, the handler will double the delay each time up to a maximum of 30 seconds.
This behaviour is controlled by the following handler attributes:

- retryStart (initial delay, defaulting to 1.0 seconds)
- retryFactor (multiplier, defaulting to 2.0)
- retryMax (maximum delay, defaulting to 30.0 seconds)
\nThis means that if the remote listener starts up after the handler has\nbeen used, you could lose messages (since the handler won’t even attempt\na connection until the delay has elapsed, but just silently drop messages\nduring the delay period).
\nThe DatagramHandler class, located in the logging.handlers\nmodule, inherits from SocketHandler to support sending logging messages\nover UDP sockets.
\nReturns a new instance of the DatagramHandler class intended to\ncommunicate with a remote machine whose address is given by host and port.
\nThe SysLogHandler class, located in the logging.handlers module,\nsupports sending logging messages to a remote or local Unix syslog.
\nReturns a new instance of the SysLogHandler class intended to\ncommunicate with a remote Unix machine whose address is given by address in\nthe form of a (host, port) tuple. If address is not specified,\n('localhost', 514) is used. The address is used to open a socket. An\nalternative to providing a (host, port) tuple is providing an address as a\nstring, for example ‘/dev/log’. In this case, a Unix domain socket is used to\nsend the message to the syslog. If facility is not specified,\nLOG_USER is used. The type of socket opened depends on the\nsocktype argument, which defaults to socket.SOCK_DGRAM and thus\nopens a UDP socket. To open a TCP socket (for use with the newer syslog\ndaemons such as rsyslog), specify a value of socket.SOCK_STREAM.
Note that if your server is not listening on UDP port 514, SysLogHandler may appear not to work. In that case, check what address you should be using for a domain socket - it's system dependent. For example, on Linux it's usually '/dev/log' but on OS X it's '/var/run/syslog'. You'll need to check your platform and use the appropriate address (you may need to do this check at runtime if your application needs to run on several platforms). On Windows, you pretty much have to use the UDP option.
\n\nChanged in version 2.7: socktype was added.
\nEncodes the facility and priority into an integer. You can pass in strings\nor integers - if strings are passed, internal mapping dictionaries are\nused to convert them to integers.
\nThe symbolic LOG_ values are defined in SysLogHandler and\nmirror the values defined in the sys/syslog.h header file.
\nPriorities
Name (string) | Symbolic value
---|---
alert | LOG_ALERT
crit or critical | LOG_CRIT
debug | LOG_DEBUG
emerg or panic | LOG_EMERG
err or error | LOG_ERR
info | LOG_INFO
notice | LOG_NOTICE
warn or warning | LOG_WARNING
Facilities
Name (string) | Symbolic value
---|---
auth | LOG_AUTH
authpriv | LOG_AUTHPRIV
cron | LOG_CRON
daemon | LOG_DAEMON
ftp | LOG_FTP
kern | LOG_KERN
lpr | LOG_LPR
mail | LOG_MAIL
news | LOG_NEWS
syslog | LOG_SYSLOG
user | LOG_USER
uucp | LOG_UUCP
local0 | LOG_LOCAL0
local1 | LOG_LOCAL1
local2 | LOG_LOCAL2
local3 | LOG_LOCAL3
local4 | LOG_LOCAL4
local5 | LOG_LOCAL5
local6 | LOG_LOCAL6
local7 | LOG_LOCAL7
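The mappings above can be exercised directly through encodePriority(); a UDP handler is constructed here but nothing is sent to any server:

```python
from logging.handlers import SysLogHandler

# Constructing a UDP (SOCK_DGRAM) handler does not contact any server.
handler = SysLogHandler(address=("localhost", 514))

# encodePriority() packs the two values as (facility << 3) | priority,
# accepting either the names from the tables above or LOG_* integers.
code = handler.encodePriority("user", "warning")
print(code == (SysLogHandler.LOG_USER << 3) | SysLogHandler.LOG_WARNING)  # True
handler.close()
```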
The NTEventLogHandler class, located in the logging.handlers\nmodule, supports sending logging messages to a local Windows NT, Windows 2000 or\nWindows XP event log. Before you can use it, you need Mark Hammond’s Win32\nextensions for Python installed.
\nReturns a new instance of the NTEventLogHandler class. The appname is\nused to define the application name as it appears in the event log. An\nappropriate registry entry is created using this name. The dllname should give\nthe fully qualified pathname of a .dll or .exe which contains message\ndefinitions to hold in the log. If dllname is not specified,\n'win32service.pyd' is used; this is installed with the Win32 extensions and\ncontains some basic placeholder message definitions. Note that use of these\nplaceholders will make your event logs big, as the entire message source is\nheld in the log. If you want slimmer logs, pass in the name of your own .dll\nor .exe which contains the message definitions you want to use in the event\nlog. The logtype is one of 'Application', 'System' or 'Security', and\ndefaults to 'Application'.
\nThe SMTPHandler class, located in the logging.handlers module,\nsupports sending logging messages to an email address via SMTP.
\nReturns a new instance of the SMTPHandler class. The instance is\ninitialized with the from and to addresses and subject line of the email.\nThe toaddrs should be a list of strings. To specify a non-standard SMTP\nport, use the (host, port) tuple format for the mailhost argument. If you\nuse a string, the standard SMTP port is used. If your SMTP server requires\nauthentication, you can specify a (username, password) tuple for the\ncredentials argument.
\nTo specify the use of a secure protocol (TLS), pass in a tuple to the\nsecure argument. This will only be used when authentication credentials are\nsupplied. The tuple should be either an empty tuple, or a single-value tuple\nwith the name of a keyfile, or a 2-value tuple with the names of the keyfile\nand certificate file. (This tuple is passed to the\nsmtplib.SMTP.starttls() method.)
\n\nChanged in version 2.6: credentials was added.
\n\nChanged in version 2.7: secure was added.
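A configuration sketch of the above (the host, port, addresses and credentials are all hypothetical; nothing is sent until a record is actually emitted):

```python
from logging.handlers import SMTPHandler

handler = SMTPHandler(
    mailhost=('smtp.example.com', 2525),   # (host, port) tuple for a non-standard port
    fromaddr='app@example.com',
    toaddrs=['ops@example.com'],           # a list of strings
    subject='Application error',
    credentials=('user', 'secret'),        # (username, password) for SMTP auth
    secure=())                             # empty tuple: TLS with no keyfile/certfile
```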
\nThe MemoryHandler class, located in the logging.handlers module,\nsupports buffering of logging records in memory, periodically flushing them to a\ntarget handler. Flushing occurs whenever the buffer is full, or when an\nevent of a certain severity or greater is seen.
\nMemoryHandler is a subclass of the more general\nBufferingHandler, which is an abstract class. This buffers logging\nrecords in memory. Each time a record is added to the buffer, a check is made\nby calling shouldFlush() to see if the buffer should be flushed. If it\nshould, then flush() is expected to flush the buffer.
\nInitializes the handler with a buffer of the specified capacity.
\nReturns a new instance of the MemoryHandler class. The instance is\ninitialized with a buffer size of capacity. If flushLevel is not specified,\nERROR is used. If no target is specified, the target will need to be\nset using setTarget() before this handler does anything useful.
Sets the target handler for this handler.
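A sketch of the buffering behavior, flushing to a StreamHandler target when a record of ERROR severity arrives (the logger name and format string below are arbitrary):

```python
import logging
import logging.handlers
from io import StringIO

stream = StringIO()
target = logging.StreamHandler(stream)
target.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))

# Buffer up to 10 records; flush when the buffer fills or when a
# record of severity ERROR or greater is seen.
memory = logging.handlers.MemoryHandler(capacity=10,
                                        flushLevel=logging.ERROR,
                                        target=target)

logger = logging.getLogger('memory-demo')
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(memory)

logger.debug('first')     # buffered, not yet written to the stream
logger.error('boom')      # severity >= flushLevel: both records flushed
```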
\n\nThe HTTPHandler class, located in the logging.handlers module,\nsupports sending logging messages to a Web server, using either GET or\nPOST semantics.
\nReturns a new instance of the HTTPHandler class. The host can be\nof the form host:port, should you need to use a specific port number.\nIf no method is specified, GET is used.
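A construction sketch (the host and path are hypothetical; no connection is attempted until a record is emitted):

```python
from logging.handlers import HTTPHandler

handler = HTTPHandler('logserver.example.com:8080',  # host, with optional port
                      '/log',                        # URL path on the server
                      method='POST')
```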
\n\nNew in version 2.7.
\nSource code: Lib/argparse.py
\nThe argparse module makes it easy to write user-friendly command-line\ninterfaces. The program defines what arguments it requires, and argparse\nwill figure out how to parse those out of sys.argv. The argparse\nmodule also automatically generates help and usage messages and issues errors\nwhen users give the program invalid arguments.
\nThe following code is a Python program that takes a list of integers and\nproduces either the sum or the max:
\nimport argparse\n\nparser = argparse.ArgumentParser(description='Process some integers.')\nparser.add_argument('integers', metavar='N', type=int, nargs='+',\n help='an integer for the accumulator')\nparser.add_argument('--sum', dest='accumulate', action='store_const',\n const=sum, default=max,\n help='sum the integers (default: find the max)')\n\nargs = parser.parse_args()\nprint args.accumulate(args.integers)\n
Assuming the Python code above is saved into a file called prog.py, it can\nbe run at the command line and provides useful help messages:
\n$ prog.py -h\nusage: prog.py [-h] [--sum] N [N ...]\n\nProcess some integers.\n\npositional arguments:\n N an integer for the accumulator\n\noptional arguments:\n -h, --help show this help message and exit\n --sum sum the integers (default: find the max)
\nWhen run with the appropriate arguments, it prints either the sum or the max of\nthe command-line integers:
\n$ prog.py 1 2 3 4\n4\n\n$ prog.py 1 2 3 4 --sum\n10
\nIf invalid arguments are passed in, it will issue an error:
\n$ prog.py a b c\nusage: prog.py [-h] [--sum] N [N ...]\nprog.py: error: argument N: invalid int value: 'a'
\nThe following sections walk you through this example.
\nThe first step in using argparse is creating an\nArgumentParser object:
\n>>> parser = argparse.ArgumentParser(description='Process some integers.')\n
The ArgumentParser object will hold all the information necessary to\nparse the command line into Python data types.
\nFilling an ArgumentParser with information about program arguments is\ndone by making calls to the add_argument() method.\nGenerally, these calls tell the ArgumentParser how to take the strings\non the command line and turn them into objects. This information is stored and\nused when parse_args() is called. For example:
\n>>> parser.add_argument('integers', metavar='N', type=int, nargs='+',\n... help='an integer for the accumulator')\n>>> parser.add_argument('--sum', dest='accumulate', action='store_const',\n... const=sum, default=max,\n... help='sum the integers (default: find the max)')\n
Later, calling parse_args() will return an object with\ntwo attributes, integers and accumulate. The integers attribute\nwill be a list of one or more ints, and the accumulate attribute will be\neither the sum() function, if --sum was specified at the command line,\nor the max() function if it was not.
\nArgumentParser parses arguments through the\nparse_args() method. This will inspect the command line,\nconvert each argument to the appropriate type and then invoke the appropriate action.\nIn most cases, this means a simple Namespace object will be built up from\nattributes parsed out of the command line:
\n>>> parser.parse_args(['--sum', '7', '-1', '42'])\nNamespace(accumulate=<built-in function sum>, integers=[7, -1, 42])\n
In a script, parse_args() will typically be called with no\narguments, and the ArgumentParser will automatically determine the\ncommand-line arguments from sys.argv.
\nCreate a new ArgumentParser object. Each parameter has its own more\ndetailed description below, but in short they are: prog (the name of the\nprogram), usage (the string describing the program usage), description\n(text displayed before the argument help), epilog (text displayed after the\nargument help), parents (a list of ArgumentParser objects whose arguments\nshould also be included), formatter_class (a class for customizing the help\noutput), prefix_chars (the set of characters that prefix optional arguments),\nfromfile_prefix_chars (the set of characters that prefix files from which\nadditional arguments should be read), argument_default (the global default\nvalue for arguments), conflict_handler (the strategy for resolving\nconflicting optionals) and add_help (whether to add a -h/--help option to\nthe parser).
\nThe following sections describe how each of these are used.
\nMost calls to the ArgumentParser constructor will use the\ndescription= keyword argument. This argument gives a brief description of\nwhat the program does and how it works. In help messages, the description is\ndisplayed between the command-line usage string and the help messages for the\nvarious arguments:
\n>>> parser = argparse.ArgumentParser(description='A foo that bars')\n>>> parser.print_help()\nusage: argparse.py [-h]\n\nA foo that bars\n\noptional arguments:\n -h, --help show this help message and exit\n
By default, the description will be line-wrapped so that it fits within the\ngiven space. To change this behavior, see the formatter_class argument.
\nSome programs like to display additional description of the program after the\ndescription of the arguments. Such text can be specified using the epilog=\nargument to ArgumentParser:
\n>>> parser = argparse.ArgumentParser(\n... description='A foo that bars',\n... epilog="And that's how you'd foo a bar")\n>>> parser.print_help()\nusage: argparse.py [-h]\n\nA foo that bars\n\noptional arguments:\n -h, --help show this help message and exit\n\nAnd that's how you'd foo a bar\n
As with the description argument, the epilog= text is by default\nline-wrapped, but this behavior can be adjusted with the formatter_class\nargument to ArgumentParser.
\nBy default, ArgumentParser objects add an option which simply displays\nthe parser’s help message. For example, consider a file named\nmyprogram.py containing the following code:
\nimport argparse\nparser = argparse.ArgumentParser()\nparser.add_argument('--foo', help='foo help')\nargs = parser.parse_args()\n
If -h or --help is supplied at the command line, the ArgumentParser\nhelp will be printed:
\n$ python myprogram.py --help\nusage: myprogram.py [-h] [--foo FOO]\n\noptional arguments:\n -h, --help show this help message and exit\n --foo FOO foo help
\nOccasionally, it may be useful to disable the addition of this help option.\nThis can be achieved by passing False as the add_help= argument to\nArgumentParser:
\n>>> parser = argparse.ArgumentParser(prog='PROG', add_help=False)\n>>> parser.add_argument('--foo', help='foo help')\n>>> parser.print_help()\nusage: PROG [--foo FOO]\n\noptional arguments:\n --foo FOO foo help\n
The help option is typically -h/--help. The exception to this is\nif the prefix_chars= is specified and does not include -, in\nwhich case -h and --help are not valid options. In\nthis case, the first character in prefix_chars is used to prefix\nthe help options:
\n>>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='+/')\n>>> parser.print_help()\nusage: PROG [+h]\n\noptional arguments:\n +h, ++help show this help message and exit\n
Most command-line options will use - as the prefix, e.g. -f/--foo.\nParsers that need to support different or additional prefix\ncharacters, e.g. for options\nlike +f or /foo, may specify them using the prefix_chars= argument\nto the ArgumentParser constructor:
\n>>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='-+')\n>>> parser.add_argument('+f')\n>>> parser.add_argument('++bar')\n>>> parser.parse_args('+f X ++bar Y'.split())\nNamespace(bar='Y', f='X')\n
The prefix_chars= argument defaults to '-'. Supplying a set of\ncharacters that does not include - will cause -f/--foo options to be\ndisallowed.
\nSometimes, for example when dealing with particularly long argument lists, it\nmay make sense to keep the list of arguments in a file rather than typing it out\nat the command line. If the fromfile_prefix_chars= argument is given to the\nArgumentParser constructor, then arguments that start with any of the\nspecified characters will be treated as files, and will be replaced by the\narguments they contain. For example:
\n>>> with open('args.txt', 'w') as fp:\n... fp.write('-f\\nbar')\n>>> parser = argparse.ArgumentParser(fromfile_prefix_chars='@')\n>>> parser.add_argument('-f')\n>>> parser.parse_args(['-f', 'foo', '@args.txt'])\nNamespace(f='bar')\n
Arguments read from a file must by default be one per line (but see also\nconvert_arg_line_to_args()) and are treated as if they\nwere in the same place as the original file referencing argument on the command\nline. So in the example above, the expression ['-f', 'foo', '@args.txt']\nis considered equivalent to the expression ['-f', 'foo', '-f', 'bar'].
\nThe fromfile_prefix_chars= argument defaults to None, meaning that\narguments will never be treated as file references.
\nGenerally, argument defaults are specified either by passing a default to\nadd_argument() or by calling the\nset_defaults() methods with a specific set of name-value\npairs. Sometimes however, it may be useful to specify a single parser-wide\ndefault for arguments. This can be accomplished by passing the\nargument_default= keyword argument to ArgumentParser. For example,\nto globally suppress attribute creation on parse_args()\ncalls, we supply argument_default=SUPPRESS:
\n>>> parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)\n>>> parser.add_argument('--foo')\n>>> parser.add_argument('bar', nargs='?')\n>>> parser.parse_args(['--foo', '1', 'BAR'])\nNamespace(bar='BAR', foo='1')\n>>> parser.parse_args([])\nNamespace()\n
Sometimes, several parsers share a common set of arguments. Rather than\nrepeating the definitions of these arguments, a single parser containing all the\nshared arguments can be passed to the parents= argument of\nArgumentParser. The parents= argument takes a list of ArgumentParser\nobjects, collects all the positional and optional actions from them, and adds\nthese actions to the ArgumentParser object being constructed:
\n>>> parent_parser = argparse.ArgumentParser(add_help=False)\n>>> parent_parser.add_argument('--parent', type=int)\n\n>>> foo_parser = argparse.ArgumentParser(parents=[parent_parser])\n>>> foo_parser.add_argument('foo')\n>>> foo_parser.parse_args(['--parent', '2', 'XXX'])\nNamespace(foo='XXX', parent=2)\n\n>>> bar_parser = argparse.ArgumentParser(parents=[parent_parser])\n>>> bar_parser.add_argument('--bar')\n>>> bar_parser.parse_args(['--bar', 'YYY'])\nNamespace(bar='YYY', parent=None)\n
Note that most parent parsers will specify add_help=False. Otherwise, the\nArgumentParser will see two -h/--help options (one in the parent\nand one in the child) and raise an error.
\nNote
\nYou must fully initialize the parsers before passing them via parents=.\nIf you change the parent parsers after the child parser, those changes will\nnot be reflected in the child.
\nArgumentParser objects allow the help formatting to be customized by\nspecifying an alternate formatting class. Currently, there are three such\nclasses: RawDescriptionHelpFormatter, RawTextHelpFormatter and\nArgumentDefaultsHelpFormatter.
\nThe first two allow more control over how textual descriptions are displayed,\nwhile the last automatically adds information about argument default values.
\nBy default, ArgumentParser objects line-wrap the description and\nepilog texts in command-line help messages:
\n>>> parser = argparse.ArgumentParser(\n... prog='PROG',\n... description='''this description\n... was indented weird\n... but that is okay''',\n... epilog='''\n... likewise for this epilog whose whitespace will\n... be cleaned up and whose words will be wrapped\n... across a couple lines''')\n>>> parser.print_help()\nusage: PROG [-h]\n\nthis description was indented weird but that is okay\n\noptional arguments:\n -h, --help show this help message and exit\n\nlikewise for this epilog whose whitespace will be cleaned up and whose words\nwill be wrapped across a couple lines\n
Passing RawDescriptionHelpFormatter as formatter_class=\nindicates that description and epilog are already correctly formatted and\nshould not be line-wrapped:
\n>>> parser = argparse.ArgumentParser(\n... prog='PROG',\n... formatter_class=argparse.RawDescriptionHelpFormatter,\n... description=textwrap.dedent('''\\\n... Please do not mess up this text!\n... --------------------------------\n... I have indented it\n... exactly the way\n... I want it\n... '''))\n>>> parser.print_help()\nusage: PROG [-h]\n\nPlease do not mess up this text!\n--------------------------------\n I have indented it\n exactly the way\n I want it\n\noptional arguments:\n -h, --help show this help message and exit\n
RawTextHelpFormatter maintains whitespace for all sorts of help text,\nincluding argument descriptions.
\nThe other formatter class available, ArgumentDefaultsHelpFormatter,\nwill add information about the default value of each of the arguments:
\n>>> parser = argparse.ArgumentParser(\n... prog='PROG',\n... formatter_class=argparse.ArgumentDefaultsHelpFormatter)\n>>> parser.add_argument('--foo', type=int, default=42, help='FOO!')\n>>> parser.add_argument('bar', nargs='*', default=[1, 2, 3], help='BAR!')\n>>> parser.print_help()\nusage: PROG [-h] [--foo FOO] [bar [bar ...]]\n\npositional arguments:\n bar BAR! (default: [1, 2, 3])\n\noptional arguments:\n -h, --help show this help message and exit\n --foo FOO FOO! (default: 42)\n
ArgumentParser objects do not allow two actions with the same option\nstring. By default, ArgumentParser objects raise an exception if an\nattempt is made to create an argument with an option string that is already in\nuse:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-f', '--foo', help='old foo help')\n>>> parser.add_argument('--foo', help='new foo help')\nTraceback (most recent call last):\n ..\nArgumentError: argument --foo: conflicting option string(s): --foo\n
Sometimes (e.g. when using parents) it may be useful to simply override any\nolder arguments with the same option string. To get this behavior, the value\n'resolve' can be supplied to the conflict_handler= argument of\nArgumentParser:
\n>>> parser = argparse.ArgumentParser(prog='PROG', conflict_handler='resolve')\n>>> parser.add_argument('-f', '--foo', help='old foo help')\n>>> parser.add_argument('--foo', help='new foo help')\n>>> parser.print_help()\nusage: PROG [-h] [-f FOO] [--foo FOO]\n\noptional arguments:\n -h, --help show this help message and exit\n -f FOO old foo help\n --foo FOO new foo help\n
Note that ArgumentParser objects only remove an action if all of its\noption strings are overridden. So, in the example above, the old -f/--foo\naction is retained as the -f action, because only the --foo option\nstring was overridden.
\nBy default, ArgumentParser objects use sys.argv[0] to determine\nhow to display the name of the program in help messages. This default is almost\nalways desirable because it will make the help messages match how the program was\ninvoked on the command line. For example, consider a file named\nmyprogram.py with the following code:
\nimport argparse\nparser = argparse.ArgumentParser()\nparser.add_argument('--foo', help='foo help')\nargs = parser.parse_args()\n
The help for this program will display myprogram.py as the program name\n(regardless of where the program was invoked from):
\n$ python myprogram.py --help\nusage: myprogram.py [-h] [--foo FOO]\n\noptional arguments:\n -h, --help show this help message and exit\n --foo FOO foo help\n$ cd ..\n$ python subdir\\myprogram.py --help\nusage: myprogram.py [-h] [--foo FOO]\n\noptional arguments:\n -h, --help show this help message and exit\n --foo FOO foo help
\nTo change this default behavior, another value can be supplied using the\nprog= argument to ArgumentParser:
\n>>> parser = argparse.ArgumentParser(prog='myprogram')\n>>> parser.print_help()\nusage: myprogram [-h]\n\noptional arguments:\n -h, --help show this help message and exit\n
Note that the program name, whether determined from sys.argv[0] or from the\nprog= argument, is available to help messages using the %(prog)s format\nspecifier.
\n>>> parser = argparse.ArgumentParser(prog='myprogram')\n>>> parser.add_argument('--foo', help='foo of the %(prog)s program')\n>>> parser.print_help()\nusage: myprogram [-h] [--foo FOO]\n\noptional arguments:\n -h, --help show this help message and exit\n --foo FOO foo of the myprogram program\n
By default, ArgumentParser calculates the usage message from the\narguments it contains:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('--foo', nargs='?', help='foo help')\n>>> parser.add_argument('bar', nargs='+', help='bar help')\n>>> parser.print_help()\nusage: PROG [-h] [--foo [FOO]] bar [bar ...]\n\npositional arguments:\n bar bar help\n\noptional arguments:\n -h, --help show this help message and exit\n --foo [FOO] foo help\n
The default message can be overridden with the usage= keyword argument:
\n>>> parser = argparse.ArgumentParser(prog='PROG', usage='%(prog)s [options]')\n>>> parser.add_argument('--foo', nargs='?', help='foo help')\n>>> parser.add_argument('bar', nargs='+', help='bar help')\n>>> parser.print_help()\nusage: PROG [options]\n\npositional arguments:\n bar bar help\n\noptional arguments:\n -h, --help show this help message and exit\n --foo [FOO] foo help\n
The %(prog)s format specifier is available to fill in the program name in\nyour usage messages.
\nDefine how a single command-line argument should be parsed. Each parameter\nhas its own more detailed description below, but in short they are: name or\nflags (either a name or a list of option strings), action (the basic type\nof action to be taken when this argument is encountered), nargs (the number\nof command-line arguments that should be consumed), const (a constant value\nrequired by some action and nargs selections), default (the value produced\nif the argument is absent from the command line), type (the type to which\nthe command-line argument should be converted), choices (a container of the\nallowable values for the argument), required (whether or not the option may\nbe omitted, optionals only), help (a brief description of what the argument\ndoes), metavar (a name for the argument in usage messages) and dest (the\nname of the attribute added to the object returned by parse_args()).
\nThe following sections describe how each of these are used.
\nThe add_argument() method must know whether an optional\nargument, like -f or --foo, or a positional argument, like a list of\nfilenames, is expected. The first arguments passed to\nadd_argument() must therefore be either a series of\nflags, or a simple argument name. For example, an optional argument could\nbe created like:
\n>>> parser.add_argument('-f', '--foo')\n
while a positional argument could be created like:
\n>>> parser.add_argument('bar')\n
When parse_args() is called, optional arguments will be\nidentified by the - prefix, and the remaining arguments will be assumed to\nbe positional:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-f', '--foo')\n>>> parser.add_argument('bar')\n>>> parser.parse_args(['BAR'])\nNamespace(bar='BAR', foo=None)\n>>> parser.parse_args(['BAR', '--foo', 'FOO'])\nNamespace(bar='BAR', foo='FOO')\n>>> parser.parse_args(['--foo', 'FOO'])\nusage: PROG [-h] [-f FOO] bar\nPROG: error: too few arguments\n
ArgumentParser objects associate command-line arguments with actions. These\nactions can do just about anything with the command-line arguments associated with\nthem, though most actions simply add an attribute to the object returned by\nparse_args(). The action keyword argument specifies\nhow the command-line arguments should be handled. The supported actions are:
\n'store' - This just stores the argument’s value. This is the default\naction. For example:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo')\n>>> parser.parse_args('--foo 1'.split())\nNamespace(foo='1')\n
'store_const' - This stores the value specified by the const keyword\nargument. (Note that the const keyword argument defaults to the rather\nunhelpful None.) The 'store_const' action is most commonly used with\noptional arguments that specify some sort of flag. For example:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', action='store_const', const=42)\n>>> parser.parse_args('--foo'.split())\nNamespace(foo=42)\n
'store_true' and 'store_false' - These are special cases of\n'store_const' used for storing the values True and False\nrespectively. In addition, they create default values of False and True\nrespectively. For example:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', action='store_true')\n>>> parser.add_argument('--bar', action='store_false')\n>>> parser.add_argument('--baz', action='store_false')\n>>> parser.parse_args('--foo --bar'.split())\nNamespace(bar=False, baz=True, foo=True)\n
'append' - This stores a list, and appends each argument value to the\nlist. This is useful to allow an option to be specified multiple times.\nExample usage:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', action='append')\n>>> parser.parse_args('--foo 1 --foo 2'.split())\nNamespace(foo=['1', '2'])\n
'append_const' - This stores a list, and appends the value specified by\nthe const keyword argument to the list. (Note that the const keyword\nargument defaults to None.) The 'append_const' action is typically\nuseful when multiple arguments need to store constants to the same list. For\nexample:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--str', dest='types', action='append_const', const=str)\n>>> parser.add_argument('--int', dest='types', action='append_const', const=int)\n>>> parser.parse_args('--str --int'.split())\nNamespace(types=[<type 'str'>, <type 'int'>])\n
'count' - This counts the number of times a keyword argument occurs. For\nexample, this is useful for increasing verbosity levels:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--verbose', '-v', action='count')\n>>> parser.parse_args('-vvv'.split())\nNamespace(verbose=3)\n
'help' - This prints a complete help message for all the options in the\ncurrent parser and then exits. By default a help action is automatically\nadded to the parser. See ArgumentParser for details of how the\noutput is created.
\n'version' - This expects a version= keyword argument in the\nadd_argument() call, and prints version information\nand exits when invoked.
\n>>> import argparse\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('--version', action='version', version='%(prog)s 2.0')\n>>> parser.parse_args(['--version'])\nPROG 2.0\n
You can also specify an arbitrary action by passing an object that implements\nthe Action API. The easiest way to do this is to extend\nargparse.Action, supplying an appropriate __call__ method. The\n__call__ method should accept four parameters: parser (the\nArgumentParser object which contains this action), namespace (the\nNamespace object that will be returned by parse_args()), values\n(the associated command-line arguments, with any type conversions applied) and\noption_string (the option string that was used to invoke this action, or\nNone for positional arguments).
\nAn example of a custom action:
\n>>> class FooAction(argparse.Action):\n... def __call__(self, parser, namespace, values, option_string=None):\n... print '%r %r %r' % (namespace, values, option_string)\n... setattr(namespace, self.dest, values)\n...\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', action=FooAction)\n>>> parser.add_argument('bar', action=FooAction)\n>>> args = parser.parse_args('1 --foo 2'.split())\nNamespace(bar=None, foo=None) '1' None\nNamespace(bar='1', foo=None) '2' '--foo'\n>>> args\nNamespace(bar='1', foo='2')\n
ArgumentParser objects usually associate a single command-line argument with a\nsingle action to be taken. The nargs keyword argument associates a\ndifferent number of command-line arguments with a single action. The supported\nvalues are:
\nN (an integer). N arguments from the command line will be gathered together into a\nlist. For example:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', nargs=2)\n>>> parser.add_argument('bar', nargs=1)\n>>> parser.parse_args('c --foo a b'.split())\nNamespace(bar=['c'], foo=['a', 'b'])\n
Note that nargs=1 produces a list of one item. This is different from\nthe default, in which the item is produced by itself.
\n'?'. One argument will be consumed from the command line if possible, and\nproduced as a single item. If no command-line argument is present, the value from\ndefault will be produced. Note that for optional arguments, there is an\nadditional case - the option string is present but not followed by a\ncommand-line argument. In this case the value from const will be produced. Some\nexamples to illustrate this:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', nargs='?', const='c', default='d')\n>>> parser.add_argument('bar', nargs='?', default='d')\n>>> parser.parse_args('XX --foo YY'.split())\nNamespace(bar='XX', foo='YY')\n>>> parser.parse_args('XX --foo'.split())\nNamespace(bar='XX', foo='c')\n>>> parser.parse_args(''.split())\nNamespace(bar='d', foo='d')\n
One of the more common uses of nargs='?' is to allow optional input and\noutput files:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('infile', nargs='?', type=argparse.FileType('r'),\n... default=sys.stdin)\n>>> parser.add_argument('outfile', nargs='?', type=argparse.FileType('w'),\n... default=sys.stdout)\n>>> parser.parse_args(['input.txt', 'output.txt'])\nNamespace(infile=<open file 'input.txt', mode 'r' at 0x...>,\n outfile=<open file 'output.txt', mode 'w' at 0x...>)\n>>> parser.parse_args([])\nNamespace(infile=<open file '<stdin>', mode 'r' at 0x...>,\n outfile=<open file '<stdout>', mode 'w' at 0x...>)\n
'*'. All command-line arguments present are gathered into a list. Note that\nit generally doesn’t make much sense to have more than one positional argument\nwith nargs='*', but multiple optional arguments with nargs='*' is\npossible. For example:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', nargs='*')\n>>> parser.add_argument('--bar', nargs='*')\n>>> parser.add_argument('baz', nargs='*')\n>>> parser.parse_args('a b --foo x y --bar 1 2'.split())\nNamespace(bar=['1', '2'], baz=['a', 'b'], foo=['x', 'y'])\n
'+'. Just like '*', all command-line args present are gathered into a\nlist. Additionally, an error message will be generated if there wasn’t at\nleast one command-line argument present. For example:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('foo', nargs='+')\n>>> parser.parse_args('a b'.split())\nNamespace(foo=['a', 'b'])\n>>> parser.parse_args(''.split())\nusage: PROG [-h] foo [foo ...]\nPROG: error: too few arguments\n
If the nargs keyword argument is not provided, the number of arguments consumed\nis determined by the action. Generally this means a single command-line argument\nwill be consumed and a single item (not a list) will be produced.
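The contrast with nargs=1 can be seen directly:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('a')            # no nargs: one argument, stored as the item itself
parser.add_argument('b', nargs=1)   # nargs=1: one argument, stored in a list
ns = parser.parse_args(['x', 'y'])
```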
\nThe const argument of add_argument() is used to hold\nconstant values that are not read from the command line but are required for\nthe various ArgumentParser actions. The two most common uses of it are:\nwith action='store_const' or action='append_const', where const is the\nvalue added to the parsed namespace; and with optional arguments and\nnargs='?', where const is the value produced when the option string is\npresent but not followed by a command-line argument.
\nThe const keyword argument defaults to None.
\nAll optional arguments and some positional arguments may be omitted at the\ncommand line. The default keyword argument of\nadd_argument(), whose value defaults to None,\nspecifies what value should be used if the command-line argument is not present.\nFor optional arguments, the default value is used when the option string\nwas not present at the command line:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', default=42)\n>>> parser.parse_args('--foo 2'.split())\nNamespace(foo='2')\n>>> parser.parse_args(''.split())\nNamespace(foo=42)\n
For positional arguments with nargs equal to ? or *, the default value\nis used when no command-line argument was present:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('foo', nargs='?', default=42)\n>>> parser.parse_args('a'.split())\nNamespace(foo='a')\n>>> parser.parse_args(''.split())\nNamespace(foo=42)\n
Providing default=argparse.SUPPRESS causes no attribute to be added if the\ncommand-line argument was not present:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', default=argparse.SUPPRESS)\n>>> parser.parse_args([])\nNamespace()\n>>> parser.parse_args(['--foo', '1'])\nNamespace(foo='1')\n
By default, ArgumentParser objects read command-line arguments in as simple\nstrings. However, quite often the command-line string should instead be\ninterpreted as another type, like a float or int. The\ntype keyword argument of add_argument() allows any\nnecessary type-checking and type conversions to be performed. Common built-in\ntypes and functions can be used directly as the value of the type argument:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('foo', type=int)\n>>> parser.add_argument('bar', type=file)\n>>> parser.parse_args('2 temp.txt'.split())\nNamespace(bar=<open file 'temp.txt', mode 'r' at 0x...>, foo=2)\n
To ease the use of various types of files, the argparse module provides the\nfactory FileType which takes the mode= and bufsize= arguments of the\nfile object. For example, FileType('w') can be used to create a\nwritable file:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('bar', type=argparse.FileType('w'))\n>>> parser.parse_args(['out.txt'])\nNamespace(bar=<open file 'out.txt', mode 'w' at 0x...>)\n
type= can take any callable that takes a single string argument and returns\nthe converted value:
\n>>> def perfect_square(string):\n... value = int(string)\n... sqrt = math.sqrt(value)\n... if sqrt != int(sqrt):\n... msg = "%r is not a perfect square" % string\n... raise argparse.ArgumentTypeError(msg)\n... return value\n...\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('foo', type=perfect_square)\n>>> parser.parse_args('9'.split())\nNamespace(foo=9)\n>>> parser.parse_args('7'.split())\nusage: PROG [-h] foo\nPROG: error: argument foo: '7' is not a perfect square\n
The choices keyword argument may be more convenient for type checkers that\nsimply check against a range of values:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('foo', type=int, choices=xrange(5, 10))\n>>> parser.parse_args('7'.split())\nNamespace(foo=7)\n>>> parser.parse_args('11'.split())\nusage: PROG [-h] {5,6,7,8,9}\nPROG: error: argument foo: invalid choice: 11 (choose from 5, 6, 7, 8, 9)\n
See the choices section for more details.
\nSome command-line arguments should be selected from a restricted set of values.\nThese can be handled by passing a container object as the choices keyword\nargument to add_argument(). When the command line is\nparsed, argument values will be checked, and an error message will be displayed if\nthe argument was not one of the acceptable values:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('foo', choices='abc')\n>>> parser.parse_args('c'.split())\nNamespace(foo='c')\n>>> parser.parse_args('X'.split())\nusage: PROG [-h] {a,b,c}\nPROG: error: argument foo: invalid choice: 'X' (choose from 'a', 'b', 'c')\n
Note that inclusion in the choices container is checked after any type\nconversions have been performed, so the type of the objects in the choices\ncontainer should match the type specified:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('foo', type=complex, choices=[1, 1j])\n>>> parser.parse_args('1j'.split())\nNamespace(foo=1j)\n>>> parser.parse_args('-- -4'.split())\nusage: PROG [-h] {1,1j}\nPROG: error: argument foo: invalid choice: (-4+0j) (choose from 1, 1j)\n
Any object that supports the in operator can be passed as the choices\nvalue, so dict objects, set objects, custom containers,\netc. are all supported.
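Because membership is the only requirement, a dict works as well: `in` tests its keys, so the keys become the allowed values. A minimal sketch (the `sizes` mapping is illustrative):

```python
import argparse

# Any object supporting the "in" operator works as the choices value.
# A dict's "in" operator tests its keys, so the keys are the allowed values.
sizes = {'small': 1, 'medium': 2, 'large': 3}

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('size', choices=sizes)
args = parser.parse_args(['medium'])
print(args.size)   # the validated string 'medium', not the dict value
```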
\nIn general, the argparse module assumes that flags like -f and --bar\nindicate optional arguments, which can always be omitted at the command line.\nTo make an option required, True can be specified for the required=\nkeyword argument to add_argument():
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', required=True)\n>>> parser.parse_args(['--foo', 'BAR'])\nNamespace(foo='BAR')\n>>> parser.parse_args([])\nusage: argparse.py [-h] [--foo FOO]\nargparse.py: error: option --foo is required\n
As the example shows, if an option is marked as required,\nparse_args() will report an error if that option is not\npresent at the command line.
\nNote
\nRequired options are generally considered bad form because users expect\noptions to be optional, and thus they should be avoided when possible.
\nThe help value is a string containing a brief description of the argument.\nWhen a user requests help (usually by using -h or --help at the\ncommand line), these help descriptions will be displayed with each\nargument:
\n>>> parser = argparse.ArgumentParser(prog='frobble')\n>>> parser.add_argument('--foo', action='store_true',\n... help='foo the bars before frobbling')\n>>> parser.add_argument('bar', nargs='+',\n... help='one of the bars to be frobbled')\n>>> parser.parse_args('-h'.split())\nusage: frobble [-h] [--foo] bar [bar ...]\n\npositional arguments:\n bar one of the bars to be frobbled\n\noptional arguments:\n -h, --help show this help message and exit\n --foo foo the bars before frobbling\n
The help strings can include various format specifiers to avoid repetition\nof things like the program name or the argument default. The available\nspecifiers include the program name, %(prog)s and most keyword arguments to\nadd_argument(), e.g. %(default)s, %(type)s, etc.:
\n>>> parser = argparse.ArgumentParser(prog='frobble')\n>>> parser.add_argument('bar', nargs='?', type=int, default=42,\n... help='the bar to %(prog)s (default: %(default)s)')\n>>> parser.print_help()\nusage: frobble [-h] [bar]\n\npositional arguments:\n bar the bar to frobble (default: 42)\n\noptional arguments:\n -h, --help show this help message and exit\n
argparse supports silencing the help entry for certain options, by\nsetting the help value to argparse.SUPPRESS:
\n>>> parser = argparse.ArgumentParser(prog='frobble')\n>>> parser.add_argument('--foo', help=argparse.SUPPRESS)\n>>> parser.print_help()\nusage: frobble [-h]\n\noptional arguments:\n -h, --help show this help message and exit\n
When ArgumentParser generates help messages, it needs some way to refer\nto each expected argument. By default, ArgumentParser objects use the dest\nvalue as the “name” of each object. By default, for positional argument\nactions, the dest value is used directly, and for optional argument actions,\nthe dest value is uppercased. So, a single positional argument with\ndest='bar' will be referred to as bar. A single\noptional argument --foo that should be followed by a single command-line argument\nwill be referred to as FOO. An example:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo')\n>>> parser.add_argument('bar')\n>>> parser.parse_args('X --foo Y'.split())\nNamespace(bar='X', foo='Y')\n>>> parser.print_help()\nusage: [-h] [--foo FOO] bar\n\npositional arguments:\n bar\n\noptional arguments:\n -h, --help show this help message and exit\n --foo FOO\n
An alternative name can be specified with metavar:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', metavar='YYY')\n>>> parser.add_argument('bar', metavar='XXX')\n>>> parser.parse_args('X --foo Y'.split())\nNamespace(bar='X', foo='Y')\n>>> parser.print_help()\nusage: [-h] [--foo YYY] XXX\n\npositional arguments:\n XXX\n\noptional arguments:\n -h, --help show this help message and exit\n --foo YYY\n
Note that metavar only changes the displayed name - the name of the\nattribute on the parse_args() object is still determined\nby the dest value.
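A quick way to confirm that metavar is display-only (the names here are illustrative):

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo', metavar='YYY')  # shown as YYY in help output

args = parser.parse_args(['--foo', '1'])
# The attribute is still named after dest (derived from '--foo'):
print(hasattr(args, 'foo'))   # True
print(hasattr(args, 'YYY'))   # False
```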
\nDifferent values of nargs may cause the metavar to be used multiple times.\nProviding a tuple to metavar specifies a different display for each of the\narguments:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-x', nargs=2)\n>>> parser.add_argument('--foo', nargs=2, metavar=('bar', 'baz'))\n>>> parser.print_help()\nusage: PROG [-h] [-x X X] [--foo bar baz]\n\noptional arguments:\n -h, --help show this help message and exit\n -x X X\n --foo bar baz\n
Most ArgumentParser actions add some value as an attribute of the\nobject returned by parse_args(). The name of this\nattribute is determined by the dest keyword argument of\nadd_argument(). For positional argument actions,\ndest is normally supplied as the first argument to\nadd_argument():
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('bar')\n>>> parser.parse_args('XXX'.split())\nNamespace(bar='XXX')\n
For optional argument actions, the value of dest is normally inferred from\nthe option strings. ArgumentParser generates the value of dest by\ntaking the first long option string and stripping away the initial --\nstring. If no long option strings were supplied, dest will be derived from\nthe first short option string by stripping the initial - character. Any\ninternal - characters will be converted to _ characters to make sure\nthe string is a valid attribute name. The examples below illustrate this\nbehavior:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('-f', '--foo-bar', '--foo')\n>>> parser.add_argument('-x', '-y')\n>>> parser.parse_args('-f 1 -x 2'.split())\nNamespace(foo_bar='1', x='2')\n>>> parser.parse_args('--foo 1 -y 2'.split())\nNamespace(foo_bar='1', x='2')\n
dest allows a custom attribute name to be provided:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', dest='bar')\n>>> parser.parse_args('--foo XXX'.split())\nNamespace(bar='XXX')\n
Convert argument strings to objects and assign them as attributes of the\nnamespace. Return the populated namespace.
\nPrevious calls to add_argument() determine exactly what objects are\ncreated and how they are assigned. See the documentation for\nadd_argument() for details.
\nBy default, the argument strings are taken from sys.argv, and a new empty\nNamespace object is created for the attributes.
\nThe parse_args() method supports several ways of\nspecifying the value of an option (if it takes one). In the simplest case, the\noption and its value are passed as two separate arguments:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-x')\n>>> parser.add_argument('--foo')\n>>> parser.parse_args('-x X'.split())\nNamespace(foo=None, x='X')\n>>> parser.parse_args('--foo FOO'.split())\nNamespace(foo='FOO', x=None)\n
For long options (options with names longer than a single character), the option\nand value can also be passed as a single command-line argument, using = to\nseparate them:
\n>>> parser.parse_args('--foo=FOO'.split())\nNamespace(foo='FOO', x=None)\n
For short options (options only one character long), the option and its value\ncan be concatenated:
\n>>> parser.parse_args('-xX'.split())\nNamespace(foo=None, x='X')\n
Several short options can be joined together, using only a single - prefix,\nas long as only the last option (or none of them) requires a value:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-x', action='store_true')\n>>> parser.add_argument('-y', action='store_true')\n>>> parser.add_argument('-z')\n>>> parser.parse_args('-xyzZ'.split())\nNamespace(x=True, y=True, z='Z')\n
While parsing the command line, parse_args() checks for a\nvariety of errors, including ambiguous options, invalid types, invalid options,\nwrong number of positional arguments, etc. When it encounters such an error,\nit exits and prints the error along with a usage message:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('--foo', type=int)\n>>> parser.add_argument('bar', nargs='?')\n\n>>> # invalid type\n>>> parser.parse_args(['--foo', 'spam'])\nusage: PROG [-h] [--foo FOO] [bar]\nPROG: error: argument --foo: invalid int value: 'spam'\n\n>>> # invalid option\n>>> parser.parse_args(['--bar'])\nusage: PROG [-h] [--foo FOO] [bar]\nPROG: error: no such option: --bar\n\n>>> # wrong number of arguments\n>>> parser.parse_args(['spam', 'badger'])\nusage: PROG [-h] [--foo FOO] [bar]\nPROG: error: extra arguments found: badger\n
The parse_args() method attempts to give errors whenever\nthe user has clearly made a mistake, but some situations are inherently\nambiguous. For example, the command-line argument -1 could either be an\nattempt to specify an option or an attempt to provide a positional argument.\nThe parse_args() method is cautious here: positional\narguments may only begin with - if they look like negative numbers and\nthere are no options in the parser that look like negative numbers:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-x')\n>>> parser.add_argument('foo', nargs='?')\n\n>>> # no negative number options, so -1 is a positional argument\n>>> parser.parse_args(['-x', '-1'])\nNamespace(foo=None, x='-1')\n\n>>> # no negative number options, so -1 and -5 are positional arguments\n>>> parser.parse_args(['-x', '-1', '-5'])\nNamespace(foo='-5', x='-1')\n\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-1', dest='one')\n>>> parser.add_argument('foo', nargs='?')\n\n>>> # negative number options present, so -1 is an option\n>>> parser.parse_args(['-1', 'X'])\nNamespace(foo=None, one='X')\n\n>>> # negative number options present, so -2 is an option\n>>> parser.parse_args(['-2'])\nusage: PROG [-h] [-1 ONE] [foo]\nPROG: error: no such option: -2\n\n>>> # negative number options present, so both -1s are options\n>>> parser.parse_args(['-1', '-1'])\nusage: PROG [-h] [-1 ONE] [foo]\nPROG: error: argument -1: expected one argument\n
If you have positional arguments that must begin with - and don’t look\nlike negative numbers, you can insert the pseudo-argument '--' which tells\nparse_args() that everything after that is a positional\nargument:
\n>>> parser.parse_args(['--', '-f'])\nNamespace(foo='-f', one=None)\n
The parse_args() method allows long options to be\nabbreviated if the abbreviation is unambiguous:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('-bacon')\n>>> parser.add_argument('-badger')\n>>> parser.parse_args('-bac MMM'.split())\nNamespace(bacon='MMM', badger=None)\n>>> parser.parse_args('-bad WOOD'.split())\nNamespace(bacon=None, badger='WOOD')\n>>> parser.parse_args('-ba BA'.split())\nusage: PROG [-h] [-bacon BACON] [-badger BADGER]\nPROG: error: ambiguous option: -ba could match -badger, -bacon\n
An error is produced for arguments that could match more than one option.
\nSometimes it may be useful to have an ArgumentParser parse arguments other than those\nof sys.argv. This can be accomplished by passing a list of strings to\nparse_args(). This is useful for testing at the\ninteractive prompt:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument(\n... 'integers', metavar='int', type=int, choices=xrange(10),\n... nargs='+', help='an integer in the range 0..9')\n>>> parser.add_argument(\n... '--sum', dest='accumulate', action='store_const', const=sum,\n... default=max, help='sum the integers (default: find the max)')\n>>> parser.parse_args(['1', '2', '3', '4'])\nNamespace(accumulate=<built-in function max>, integers=[1, 2, 3, 4])\n>>> parser.parse_args('1 2 3 4 --sum'.split())\nNamespace(accumulate=<built-in function sum>, integers=[1, 2, 3, 4])\n
This class is deliberately simple, just an object subclass with a\nreadable string representation. If you prefer to have dict-like view of the\nattributes, you can use the standard Python idiom, vars():
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo')\n>>> args = parser.parse_args(['--foo', 'BAR'])\n>>> vars(args)\n{'foo': 'BAR'}\n
It may also be useful to have an ArgumentParser assign attributes to an\nalready existing object, rather than a new Namespace object. This can\nbe achieved by specifying the namespace= keyword argument:
\n>>> class C(object):\n... pass\n...\n>>> c = C()\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo')\n>>> parser.parse_args(args=['--foo', 'BAR'], namespace=c)\n>>> c.foo\n'BAR'\n
Many programs split up their functionality into a number of sub-commands,\nfor example, the svn program can invoke sub-commands like svn\ncheckout, svn update, and svn commit. Splitting up functionality\nthis way can be a particularly good idea when a program performs several\ndifferent functions which require different kinds of command-line arguments.\nArgumentParser supports the creation of such sub-commands with the\nadd_subparsers() method. The add_subparsers() method is normally\ncalled with no arguments and returns an special action object. This object\nhas a single method, add_parser(), which takes a\ncommand name and any ArgumentParser constructor arguments, and\nreturns an ArgumentParser object that can be modified as usual.
\nSome example usage:
\n>>> # create the top-level parser\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> parser.add_argument('--foo', action='store_true', help='foo help')\n>>> subparsers = parser.add_subparsers(help='sub-command help')\n>>>\n>>> # create the parser for the "a" command\n>>> parser_a = subparsers.add_parser('a', help='a help')\n>>> parser_a.add_argument('bar', type=int, help='bar help')\n>>>\n>>> # create the parser for the "b" command\n>>> parser_b = subparsers.add_parser('b', help='b help')\n>>> parser_b.add_argument('--baz', choices='XYZ', help='baz help')\n>>>\n>>> # parse some argument lists\n>>> parser.parse_args(['a', '12'])\nNamespace(bar=12, foo=False)\n>>> parser.parse_args(['--foo', 'b', '--baz', 'Z'])\nNamespace(baz='Z', foo=True)\n
Note that the object returned by parse_args() will only contain\nattributes for the main parser and the subparser that was selected by the\ncommand line (and not any other subparsers). So in the example above, when\nthe a command is specified, only the foo and bar attributes are\npresent, and when the b command is specified, only the foo and\nbaz attributes are present.
\nSimilarly, when a help message is requested from a subparser, only the help\nfor that particular parser will be printed. The help message will not\ninclude parent parser or sibling parser messages. (A help message for each\nsubparser command, however, can be given by supplying the help= argument\nto add_parser() as above.)
\n>>> parser.parse_args(['--help'])\nusage: PROG [-h] [--foo] {a,b} ...\n\npositional arguments:\n {a,b} sub-command help\na a help\nb b help\n\noptional arguments:\n -h, --help show this help message and exit\n --foo foo help\n\n>>> parser.parse_args(['a', '--help'])\nusage: PROG a [-h] bar\n\npositional arguments:\n bar bar help\n\noptional arguments:\n -h, --help show this help message and exit\n\n>>> parser.parse_args(['b', '--help'])\nusage: PROG b [-h] [--baz {X,Y,Z}]\n\noptional arguments:\n -h, --help show this help message and exit\n --baz {X,Y,Z} baz help\n
The add_subparsers() method also supports title and description\nkeyword arguments. When either is present, the subparser’s commands will\nappear in their own group in the help output. For example:
\n>>> parser = argparse.ArgumentParser()\n>>> subparsers = parser.add_subparsers(title='subcommands',\n... description='valid subcommands',\n... help='additional help')\n>>> subparsers.add_parser('foo')\n>>> subparsers.add_parser('bar')\n>>> parser.parse_args(['-h'])\nusage: [-h] {foo,bar} ...\n\noptional arguments:\n -h, --help show this help message and exit\n\nsubcommands:\n valid subcommands\n\n {foo,bar} additional help\n
One particularly effective way of handling sub-commands is to combine the use\nof the add_subparsers() method with calls to set_defaults() so\nthat each subparser knows which Python function it should execute. For\nexample:
\n>>> # sub-command functions\n>>> def foo(args):\n... print args.x * args.y\n...\n>>> def bar(args):\n... print '((%s))' % args.z\n...\n>>> # create the top-level parser\n>>> parser = argparse.ArgumentParser()\n>>> subparsers = parser.add_subparsers()\n>>>\n>>> # create the parser for the "foo" command\n>>> parser_foo = subparsers.add_parser('foo')\n>>> parser_foo.add_argument('-x', type=int, default=1)\n>>> parser_foo.add_argument('y', type=float)\n>>> parser_foo.set_defaults(func=foo)\n>>>\n>>> # create the parser for the "bar" command\n>>> parser_bar = subparsers.add_parser('bar')\n>>> parser_bar.add_argument('z')\n>>> parser_bar.set_defaults(func=bar)\n>>>\n>>> # parse the args and call whatever function was selected\n>>> args = parser.parse_args('foo 1 -x 2'.split())\n>>> args.func(args)\n2.0\n>>>\n>>> # parse the args and call whatever function was selected\n>>> args = parser.parse_args('bar XYZYX'.split())\n>>> args.func(args)\n((XYZYX))\n
This way, you can let parse_args() do the job of calling the\nappropriate function after argument parsing is complete. Associating\nfunctions with actions like this is typically the easiest way to handle the\ndifferent actions for each of your subparsers. However, if it is necessary\nto check the name of the subparser that was invoked, the dest keyword\nargument to the add_subparsers() call will work:
\n>>> parser = argparse.ArgumentParser()\n>>> subparsers = parser.add_subparsers(dest='subparser_name')\n>>> subparser1 = subparsers.add_parser('1')\n>>> subparser1.add_argument('-x')\n>>> subparser2 = subparsers.add_parser('2')\n>>> subparser2.add_argument('y')\n>>> parser.parse_args(['2', 'frobble'])\nNamespace(subparser_name='2', y='frobble')\n
The FileType factory creates objects that can be passed to the type\nargument of ArgumentParser.add_argument(). Arguments that have\nFileType objects as their type will open command-line arguments as files\nwith the requested modes and buffer sizes:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--output', type=argparse.FileType('wb', 0))\n>>> parser.parse_args(['--output', 'out'])\nNamespace(output=<open file 'out', mode 'wb' at 0x...>)\n
FileType objects understand the pseudo-argument '-' and automatically\nconvert this into sys.stdin for readable FileType objects and\nsys.stdout for writable FileType objects:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('infile', type=argparse.FileType('r'))\n>>> parser.parse_args(['-'])\nNamespace(infile=<open file '<stdin>', mode 'r' at 0x...>)\n
By default, ArgumentParser groups command-line arguments into\n“positional arguments” and “optional arguments” when displaying help\nmessages. When there is a better conceptual grouping of arguments than this\ndefault one, appropriate groups can be created using the\nadd_argument_group() method:
\n>>> parser = argparse.ArgumentParser(prog='PROG', add_help=False)\n>>> group = parser.add_argument_group('group')\n>>> group.add_argument('--foo', help='foo help')\n>>> group.add_argument('bar', help='bar help')\n>>> parser.print_help()\nusage: PROG [--foo FOO] bar\n\ngroup:\n bar bar help\n --foo FOO foo help\n
The add_argument_group() method returns an argument group object which\nhas an add_argument() method just like a regular\nArgumentParser. When an argument is added to the group, the parser\ntreats it just like a normal argument, but displays the argument in a\nseparate group for help messages. The add_argument_group() method\naccepts title and description arguments which can be used to\ncustomize this display:
\n>>> parser = argparse.ArgumentParser(prog='PROG', add_help=False)\n>>> group1 = parser.add_argument_group('group1', 'group1 description')\n>>> group1.add_argument('foo', help='foo help')\n>>> group2 = parser.add_argument_group('group2', 'group2 description')\n>>> group2.add_argument('--bar', help='bar help')\n>>> parser.print_help()\nusage: PROG [--bar BAR] foo\n\ngroup1:\n group1 description\n\n foo foo help\n\ngroup2:\n group2 description\n\n --bar BAR bar help\n
Note that any arguments not in your user-defined groups will end up back in the\nusual “positional arguments” and “optional arguments” sections.
\nCreate a mutually exclusive group. argparse will make sure that only\none of the arguments in the mutually exclusive group was present on the\ncommand line:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> group = parser.add_mutually_exclusive_group()\n>>> group.add_argument('--foo', action='store_true')\n>>> group.add_argument('--bar', action='store_false')\n>>> parser.parse_args(['--foo'])\nNamespace(bar=True, foo=True)\n>>> parser.parse_args(['--bar'])\nNamespace(bar=False, foo=False)\n>>> parser.parse_args(['--foo', '--bar'])\nusage: PROG [-h] [--foo | --bar]\nPROG: error: argument --bar: not allowed with argument --foo\n
The add_mutually_exclusive_group() method also accepts a required\nargument, to indicate that at least one of the mutually exclusive arguments\nis required:
\n>>> parser = argparse.ArgumentParser(prog='PROG')\n>>> group = parser.add_mutually_exclusive_group(required=True)\n>>> group.add_argument('--foo', action='store_true')\n>>> group.add_argument('--bar', action='store_false')\n>>> parser.parse_args([])\nusage: PROG [-h] (--foo | --bar)\nPROG: error: one of the arguments --foo --bar is required\n
Note that currently mutually exclusive argument groups do not support the\ntitle and description arguments of\nadd_argument_group().
\nMost of the time, the attributes of the object returned by parse_args()\nwill be fully determined by inspecting the command-line arguments and the argument\nactions. set_defaults() allows some additional\nattributes that are determined without any inspection of the command line to\nbe added:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('foo', type=int)\n>>> parser.set_defaults(bar=42, baz='badger')\n>>> parser.parse_args(['736'])\nNamespace(bar=42, baz='badger', foo=736)\n
Note that parser-level defaults always override argument-level defaults:
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', default='bar')\n>>> parser.set_defaults(foo='spam')\n>>> parser.parse_args([])\nNamespace(foo='spam')\n
Parser-level defaults can be particularly useful when working with multiple\nparsers. See the add_subparsers() method for an\nexample of this type.
\nGet the default value for a namespace attribute, as set by either\nadd_argument() or by\nset_defaults():
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', default='badger')\n>>> parser.get_default('foo')\n'badger'\n
In most typical applications, parse_args() will take\ncare of formatting and printing any usage or error messages. However, several\nformatting methods are available:
\nThere are also variants of these methods that simply return a string instead of\nprinting it:
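For instance, format_usage() and format_help() return the same text that print_usage() and print_help() would write (a minimal sketch):

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo', help='foo help')

usage_text = parser.format_usage()  # what print_usage() would print
help_text = parser.format_help()    # what print_help() would print
print(usage_text.strip())           # usage: PROG [-h] [--foo FOO]
```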
\nSometimes a script may only parse a few of the command-line arguments, passing\nthe remaining arguments on to another script or program. In these cases, the\nparse_known_args() method can be useful. It works much like\nparse_args() except that it does not produce an error when\nextra arguments are present. Instead, it returns a two item tuple containing\nthe populated namespace and the list of remaining argument strings.
\n>>> parser = argparse.ArgumentParser()\n>>> parser.add_argument('--foo', action='store_true')\n>>> parser.add_argument('bar')\n>>> parser.parse_known_args(['--foo', '--badger', 'BAR', 'spam'])\n(Namespace(bar='BAR', foo=True), ['--badger', 'spam'])\n
Arguments that are read from a file (see the fromfile_prefix_chars\nkeyword argument to the ArgumentParser constructor) are read one\nargument per line. convert_arg_line_to_args() can be overridden for\nfancier reading.
\nThis method takes a single argument arg_line which is a string read from\nthe argument file. It returns a list of arguments parsed from this string.\nThe method is called once per line read from the argument file, in order.
\nA useful override of this method is one that treats each space-separated word\nas an argument:
\ndef convert_arg_line_to_args(self, arg_line):\n for arg in arg_line.split():\n if not arg.strip():\n continue\n yield arg\n
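The file-reading mechanism itself can be exercised end to end with a temporary argument file (the file contents here are illustrative):

```python
import argparse
import os
import tempfile

# With fromfile_prefix_chars='@', an argument like @args.txt is replaced
# by the arguments read from that file -- by default, one per line.
with tempfile.NamedTemporaryFile('w', delete=False) as f:
    f.write('--foo\nspam\n')
    path = f.name

parser = argparse.ArgumentParser(fromfile_prefix_chars='@')
parser.add_argument('--foo')
args = parser.parse_args(['@' + path])
os.remove(path)
print(args.foo)   # spam
```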
Originally, the argparse module had attempted to maintain compatibility\nwith optparse. However, optparse was difficult to extend\ntransparently, particularly with the changes required to support the new\nnargs= specifiers and better usage messages. When most everything in\noptparse had either been copy-pasted over or monkey-patched, it no\nlonger seemed practical to try to maintain the backwards compatibility.
\nA partial upgrade path from optparse to argparse:
\n\nNew in version 1.6.
\nThe curses.ascii module supplies name constants for ASCII characters and\nfunctions to test membership in various ASCII character classes. The constants\nsupplied are names for control characters as follows:
\nName | \nMeaning | \n
---|---|
NUL | \n\n |
SOH | \nStart of heading, console interrupt | \n
STX | \nStart of text | \n
ETX | \nEnd of text | \n
EOT | \nEnd of transmission | \n
ENQ | \nEnquiry, goes with ACK flow control | \n
ACK | \nAcknowledgement | \n
BEL | \nBell | \n
BS | \nBackspace | \n
TAB | \nTab | \n
HT | \nAlias for TAB: “Horizontal tab” | \n
LF | \nLine feed | \n
NL | \nAlias for LF: “New line” | \n
VT | \nVertical tab | \n
FF | \nForm feed | \n
CR | \nCarriage return | \n
SO | \nShift-out, begin alternate character set | \n
SI | \nShift-in, resume default character set | \n
DLE | \nData-link escape | \n
DC1 | \nXON, for flow control | \n
DC2 | \nDevice control 2, block-mode flow control | \n
DC3 | \nXOFF, for flow control | \n
DC4 | \nDevice control 4 | \n
NAK | \nNegative acknowledgement | \n
SYN | \nSynchronous idle | \n
ETB | \nEnd transmission block | \n
CAN | \nCancel | \n
EM | \nEnd of medium | \n
SUB | \nSubstitute | \n
ESC | \nEscape | \n
FS | \nFile separator | \n
GS | \nGroup separator | \n
RS | \nRecord separator, block-mode terminator | \n
US | \nUnit separator | \n
SP | \nSpace | \n
DEL | \nDelete | \n
Note that many of these have little practical significance in modern usage. The\nmnemonics derive from teleprinter conventions that predate digital computers.
\nThe module supplies the following functions, patterned on those in the standard\nC library:
\nThese functions accept either integers or strings; when the argument is a\nstring, it is first converted using the built-in function ord().
\nNote that all these functions check ordinal bit values derived from the first\ncharacter of the string you pass in; they do not actually know anything about\nthe host machine’s character encoding. For functions that know about the\ncharacter encoding (and handle internationalization properly) see the\nstring module.
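A few of the classification functions in action (this assumes a platform where the curses package is importable, as on most Unix builds):

```python
import curses.ascii

# Each predicate accepts either a one-character string or an integer.
print(curses.ascii.isctrl('\n'))   # True: LF is a control character
print(curses.ascii.isctrl(9))      # True: 9 is HT (tab)
print(curses.ascii.isprint(' '))   # True: space is printable
print(curses.ascii.isalnum('a'))   # True
```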
\nThe following two functions take either a single-character string or integer\nbyte value; they return a value of the same type.
\nThe following function takes either a single-character string or integer value;\nit returns a string.
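For example, ctrl() and alt() return the same type they were given, while unctrl() always returns a string (a small sketch, assuming curses is importable):

```python
import curses.ascii

# ctrl() clears all but the low five bits; alt() sets the high (meta) bit.
print(curses.ascii.ctrl('a') == '\x01')    # True: control-A
print(curses.ascii.alt('a') == chr(0xE1))  # True: 'a' with bit 7 set
print(curses.ascii.ctrl(ord('a')))         # 1 (int in, int out)
print(curses.ascii.unctrl('\x01'))         # ^A
```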
\nThis section describes the API for configuring the logging module.
\nThe following functions configure the logging module. They are located in the\nlogging.config module. Their use is optional — you can configure the\nlogging module using these functions or by making calls to the main API (defined\nin logging itself) and defining handlers which are declared either in\nlogging or logging.handlers.
\n\n\nTakes the logging configuration from a dictionary. The contents of\nthis dictionary are described in Configuration dictionary schema\nbelow.
\nIf an error is encountered during configuration, this function will\nraise a ValueError, TypeError, AttributeError\nor ImportError with a suitably descriptive message. The\nfollowing is a (possibly incomplete) list of conditions which will\nraise an error:
\n\n
\n- A level which is not a string or which is a string not\ncorresponding to an actual logging level.
\n- A propagate value which is not a boolean.
\n- An id which does not have a corresponding destination.
\n- A non-existent handler id found during an incremental call.
\n- An invalid logger name.
\n- Inability to resolve to an internal or external object.
\nParsing is performed by the DictConfigurator class, whose\nconstructor is passed the dictionary used for configuration, and\nhas a configure() method. The logging.config module\nhas a callable attribute dictConfigClass\nwhich is initially set to DictConfigurator.\nYou can replace the value of dictConfigClass with a\nsuitable implementation of your own.
\ndictConfig() calls dictConfigClass passing\nthe specified dictionary, and then calls the configure() method on\nthe returned object to put the configuration into effect:
\n\ndef dictConfig(config):\n    dictConfigClass(config).configure()\n\nFor example, a subclass of DictConfigurator could call\nDictConfigurator.__init__() in its own __init__(), then\nset up custom prefixes which would be usable in the subsequent\nconfigure() call. dictConfigClass would be bound to\nthis new subclass, and then dictConfig() could be called exactly as\nin the default, uncustomized state.
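That customization hook can be sketched as follows (the subclass and its behavior are illustrative; a real subclass would do its extra setup before delegating):

```python
import logging.config

# A do-nothing subclass standing in for one that sets up custom prefixes
# before delegating to the stock configuration logic.
class CustomConfigurator(logging.config.DictConfigurator):
    def configure(self):
        # ... hypothetical custom-prefix setup would go here ...
        super(CustomConfigurator, self).configure()

logging.config.dictConfigClass = CustomConfigurator
logging.config.dictConfig({'version': 1})   # now routed through the subclass

# Restore the default configurator afterwards.
logging.config.dictConfigClass = logging.config.DictConfigurator
```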
\n
\nNew in version 2.7.
\nReads the logging configuration from a configparser-format file\nnamed fname. This function can be called several times from an\napplication, allowing an end user to select from various pre-canned\nconfigurations (if the developer provides a mechanism to present the choices\nand load the chosen configuration).
\nChanged in version 2.6: The disable_existing_loggers keyword argument was added. Previously,\nexisting loggers were always disabled.
\nStarts up a socket server on the specified port, and listens for new\nconfigurations. If no port is specified, the module’s default\nDEFAULT_LOGGING_CONFIG_PORT is used. Logging configurations will be\nsent as a file suitable for processing by fileConfig(). Returns a\nThread instance on which you can call start() to start the\nserver, and which you can join() when appropriate. To stop the server,\ncall stopListening().
\nTo send a configuration to the socket, read in the configuration file and\nsend it to the socket as a string of bytes preceded by a four-byte length\nstring packed in binary using struct.pack('>L', n).
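A sketch of the framing only (the configuration content is an illustrative fragment; a real sender would write the frame to a TCP socket connected to the listener's port):

```python
import struct

# Frame configuration bytes for logging.config.listen(): a four-byte
# big-endian length, followed by the fileConfig-format payload itself.
config_bytes = b'[loggers]\nkeys=root\n'   # illustrative fragment
frame = struct.pack('>L', len(config_bytes)) + config_bytes

# The receiver reads the first four bytes back into the payload length:
(length,) = struct.unpack('>L', frame[:4])
print(length == len(config_bytes))   # True
# A real sender would then do: sock.sendall(frame)
```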
\nDescribing a logging configuration requires listing the various\nobjects to create and the connections between them; for example, you\nmay create a handler named ‘console’ and then say that the logger\nnamed ‘startup’ will send its messages to the ‘console’ handler.\nThese objects aren’t limited to those provided by the logging\nmodule because you might write your own formatter or handler class.\nThe parameters to these classes may also need to include external\nobjects such as sys.stderr. The syntax for describing these\nobjects and connections is defined in Object connections\nbelow.
\nThe dictionary passed to dictConfig() must contain the following\nkeys:
\nAll other keys are optional, but if present they will be interpreted\nas described below. In all cases below where a ‘configuring dict’ is\nmentioned, it will be checked for the special '()' key to see if a\ncustom instantiation is required. If so, the mechanism described in\nUser-defined objects below is used to create an instance;\notherwise, the context is used to determine what to instantiate.
\nformatters - the corresponding value will be a dict in which each\nkey is a formatter id and each value is a dict describing how to\nconfigure the corresponding Formatter instance.
\nThe configuring dict is searched for keys format and datefmt\n(with defaults of None) and these are used to construct a\nlogging.Formatter instance.
\nfilters - the corresponding value will be a dict in which each key\nis a filter id and each value is a dict describing how to configure\nthe corresponding Filter instance.
\nThe configuring dict is searched for the key name (defaulting to the\nempty string) and this is used to construct a logging.Filter\ninstance.
\nhandlers - the corresponding value will be a dict in which each\nkey is a handler id and each value is a dict describing how to\nconfigure the corresponding Handler instance.
\nThe configuring dict is searched for the following keys:
\nAll other keys are passed through as keyword arguments to the\nhandler’s constructor. For example, given the snippet:
\nhandlers:\n console:\n class : logging.StreamHandler\n formatter: brief\n level : INFO\n filters: [allow_foo]\n stream : ext://sys.stdout\n file:\n class : logging.handlers.RotatingFileHandler\n formatter: precise\n filename: logconfig.log\n maxBytes: 1024\n backupCount: 3
\nthe handler with id console is instantiated as a\nlogging.StreamHandler, using sys.stdout as the underlying\nstream. The handler with id file is instantiated as a\nlogging.handlers.RotatingFileHandler with the keyword arguments\nfilename='logconfig.log', maxBytes=1024, backupCount=3.
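In Python source form, a comparable configuration can be passed straight to dictConfig(); this is a minimal sketch, with the logger name and format string chosen for illustration:

```python
import logging
import logging.config

CONFIG = {
    'version': 1,
    'formatters': {
        'brief': {'format': '%(levelname)s: %(message)s'},
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'brief',
            'level': 'INFO',
            # 'ext://' marks an external object to be imported, not a string.
            'stream': 'ext://sys.stdout',
        },
    },
    'loggers': {
        'app': {'level': 'DEBUG', 'handlers': ['console']},
    },
}

logging.config.dictConfig(CONFIG)
logging.getLogger('app').info('handler and formatter are now wired up')
```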
\nloggers - the corresponding value will be a dict in which each key\nis a logger name and each value is a dict describing how to\nconfigure the corresponding Logger instance.
\nThe configuring dict is searched for the following keys:
\nThe specified loggers will be configured according to the level,\npropagation, filters and handlers specified.
\nroot - this will be the configuration for the root logger.\nProcessing of the configuration will be as for any logger, except\nthat the propagate setting will not be applicable.
\nincremental - whether the configuration is to be interpreted as\nincremental to the existing configuration. This value defaults to\nFalse, which means that the specified configuration replaces the\nexisting configuration with the same semantics as used by the\nexisting fileConfig() API.
\nIf the specified value is True, the configuration is processed\nas described in the section on Incremental Configuration.
\ndisable_existing_loggers - whether any existing loggers are to be\ndisabled. This setting mirrors the parameter of the same name in\nfileConfig(). If absent, this parameter defaults to True.\nThis value is ignored if incremental is True.
\nIt is difficult to provide complete flexibility for incremental\nconfiguration. For example, because objects such as filters\nand formatters are anonymous, once a configuration is set up, it is\nnot possible to refer to such anonymous objects when augmenting a\nconfiguration.
\nFurthermore, there is not a compelling case for arbitrarily altering\nthe object graph of loggers, handlers, filters, formatters at\nrun-time, once a configuration is set up; the verbosity of loggers and\nhandlers can be controlled just by setting levels (and, in the case of\nloggers, propagation flags). Changing the object graph arbitrarily in\na safe way is problematic in a multi-threaded environment; while not\nimpossible, the benefits are not worth the complexity it adds to the\nimplementation.
\nThus, when the incremental key of a configuration dict is present\nand is True, the system will completely ignore any formatters and\nfilters entries, and process only the level\nsettings in the handlers entries, and the level and\npropagate settings in the loggers and root entries.
Using this value in the configuration dict allows configurations to be sent over the wire as pickled dicts to a socket listener. Thus, the logging verbosity of a long-running application can be altered over time with no need to stop and restart the application.
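The pattern can be sketched as follows (the logger and handler ids are illustrative): a full configuration builds the object graph, and a later incremental one only adjusts levels:

```python
import logging
import logging.config

# Full configuration: builds the object graph.
logging.config.dictConfig({
    'version': 1,
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'level': 'INFO'},
    },
    'loggers': {
        'app': {'level': 'INFO', 'handlers': ['console']},
    },
})

# Incremental configuration: only level (and propagate) settings are
# honoured; any formatters or filters entries would be ignored entirely.
logging.config.dictConfig({
    'version': 1,
    'incremental': True,
    'handlers': {'console': {'level': 'DEBUG'}},
    'loggers': {'app': {'level': 'DEBUG'}},
})
```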
\nThe schema describes a set of logging objects - loggers,\nhandlers, formatters, filters - which are connected to each other in\nan object graph. Thus, the schema needs to represent connections\nbetween the objects. For example, say that, once configured, a\nparticular logger has attached to it a particular handler. For the\npurposes of this discussion, we can say that the logger represents the\nsource, and the handler the destination, of a connection between the\ntwo. Of course in the configured objects this is represented by the\nlogger holding a reference to the handler. In the configuration dict,\nthis is done by giving each destination object an id which identifies\nit unambiguously, and then using the id in the source object’s\nconfiguration to indicate that a connection exists between the source\nand the destination object with that id.
\nSo, for example, consider the following YAML snippet:
\nformatters:\n brief:\n # configuration for formatter with id 'brief' goes here\n precise:\n # configuration for formatter with id 'precise' goes here\nhandlers:\n h1: #This is an id\n # configuration of handler with id 'h1' goes here\n formatter: brief\n h2: #This is another id\n # configuration of handler with id 'h2' goes here\n formatter: precise\nloggers:\n foo.bar.baz:\n # other configuration for logger 'foo.bar.baz'\n handlers: [h1, h2]
\n(Note: YAML used here because it’s a little more readable than the\nequivalent Python source form for the dictionary.)
\nThe ids for loggers are the logger names which would be used\nprogrammatically to obtain a reference to those loggers, e.g.\nfoo.bar.baz. The ids for Formatters and Filters can be any string\nvalue (such as brief, precise above) and they are transient,\nin that they are only meaningful for processing the configuration\ndictionary and used to determine connections between objects, and are\nnot persisted anywhere when the configuration call is complete.
\nThe above snippet indicates that logger named foo.bar.baz should\nhave two handlers attached to it, which are described by the handler\nids h1 and h2. The formatter for h1 is that described by id\nbrief, and the formatter for h2 is that described by id\nprecise.
\nThe schema supports user-defined objects for handlers, filters and\nformatters. (Loggers do not need to have different types for\ndifferent instances, so there is no support in this configuration\nschema for user-defined logger classes.)
\nObjects to be configured are described by dictionaries\nwhich detail their configuration. In some places, the logging system\nwill be able to infer from the context how an object is to be\ninstantiated, but when a user-defined object is to be instantiated,\nthe system will not know how to do this. In order to provide complete\nflexibility for user-defined object instantiation, the user needs\nto provide a ‘factory’ - a callable which is called with a\nconfiguration dictionary and which returns the instantiated object.\nThis is signalled by an absolute import path to the factory being\nmade available under the special key '()'. Here’s a concrete\nexample:
\nformatters:\n brief:\n format: '%(message)s'\n default:\n format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s'\n datefmt: '%Y-%m-%d %H:%M:%S'\n custom:\n (): my.package.customFormatterFactory\n bar: baz\n spam: 99.9\n answer: 42
\nThe above YAML snippet defines three formatters. The first, with id\nbrief, is a standard logging.Formatter instance with the\nspecified format string. The second, with id default, has a\nlonger format and also defines the time format explicitly, and will\nresult in a logging.Formatter initialized with those two format\nstrings. Shown in Python source form, the brief and default\nformatters have configuration sub-dictionaries:
\n{\n 'format' : '%(message)s'\n}\n
and:
\n{\n 'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s',\n 'datefmt' : '%Y-%m-%d %H:%M:%S'\n}\n
respectively, and as these dictionaries do not contain the special key\n'()', the instantiation is inferred from the context: as a result,\nstandard logging.Formatter instances are created. The\nconfiguration sub-dictionary for the third formatter, with id\ncustom, is:
\n{\n '()' : 'my.package.customFormatterFactory',\n 'bar' : 'baz',\n 'spam' : 99.9,\n 'answer' : 42\n}\n
and this contains the special key '()', which means that\nuser-defined instantiation is wanted. In this case, the specified\nfactory callable will be used. If it is an actual callable it will be\nused directly - otherwise, if you specify a string (as in the example)\nthe actual callable will be located using normal import mechanisms.\nThe callable will be called with the remaining items in the\nconfiguration sub-dictionary as keyword arguments. In the above\nexample, the formatter with id custom will be assumed to be\nreturned by the call:
\nmy.package.customFormatterFactory(bar='baz', spam=99.9, answer=42)\n
The key '()' has been used as the special key because it is not a\nvalid keyword parameter name, and so will not clash with the names of\nthe keyword arguments used in the call. The '()' also serves as a\nmnemonic that the corresponding value is a callable.
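As a sketch in Python source form (the formatter class and factory below are hypothetical stand-ins for my.package.customFormatterFactory; note that an actual callable may be supplied for '()' instead of a dotted import path):

```python
import logging
import logging.config

class TaggedFormatter(logging.Formatter):
    """Hypothetical formatter that prefixes every message with a tag."""
    def __init__(self, tag='demo', fmt='%(message)s'):
        logging.Formatter.__init__(self, fmt)
        self.tag = tag

    def format(self, record):
        return '[%s] %s' % (self.tag, logging.Formatter.format(self, record))

def tagged_formatter_factory(**kwargs):
    """Factory named by the special '()' key; it is called with the
    remaining items of the configuring sub-dictionary as keyword args."""
    return TaggedFormatter(**kwargs)

logging.config.dictConfig({
    'version': 1,
    'formatters': {
        # '()' signals user-defined instantiation; here a callable is
        # passed directly rather than as an import path string.
        'custom': {'()': tagged_formatter_factory, 'tag': 'app'},
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'formatter': 'custom'},
    },
    'root': {'handlers': ['console']},
})
```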
There are times where a configuration needs to refer to objects external to the configuration, for example sys.stderr. If the configuration dict is constructed using Python code, this is straightforward, but a problem arises when the configuration is provided via a text file (e.g. JSON, YAML). In a text file, there is no standard way to distinguish sys.stderr from the literal string 'sys.stderr'. To facilitate this distinction, the configuration system looks for certain special prefixes in string values and treats them specially. For example, if the literal string 'ext://sys.stderr' is provided as a value in the configuration, then the ext:// will be stripped off and the remainder of the value processed using normal import mechanisms.
\nThe handling of such prefixes is done in a way analogous to protocol\nhandling: there is a generic mechanism to look for prefixes which\nmatch the regular expression ^(?P<prefix>[a-z]+)://(?P<suffix>.*)$\nwhereby, if the prefix is recognised, the suffix is processed\nin a prefix-dependent manner and the result of the processing replaces\nthe string value. If the prefix is not recognised, then the string\nvalue will be left as-is.
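A sketch of that dispatch, using the regular expression from the text (split_prefix is an illustrative helper, not part of the package):

```python
import re

# The generic pattern used to recognise values such as
# 'ext://sys.stderr' or 'cfg://handlers.file'.
CONVERT_PATTERN = re.compile(r'^(?P<prefix>[a-z]+)://(?P<suffix>.*)$')

def split_prefix(value):
    """Return (prefix, suffix) for a prefixed value, or (None, value)
    unchanged if no prefix is present."""
    m = CONVERT_PATTERN.match(value)
    if m:
        return m.group('prefix'), m.group('suffix')
    return None, value
```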
\nAs well as external objects, there is sometimes also a need to refer\nto objects in the configuration. This will be done implicitly by the\nconfiguration system for things that it knows about. For example, the\nstring value 'DEBUG' for a level in a logger or handler will\nautomatically be converted to the value logging.DEBUG, and the\nhandlers, filters and formatter entries will take an\nobject id and resolve to the appropriate destination object.
\nHowever, a more generic mechanism is needed for user-defined\nobjects which are not known to the logging module. For\nexample, consider logging.handlers.MemoryHandler, which takes\na target argument which is another handler to delegate to. Since\nthe system already knows about this class, then in the configuration,\nthe given target just needs to be the object id of the relevant\ntarget handler, and the system will resolve to the handler from the\nid. If, however, a user defines a my.package.MyHandler which has\nan alternate handler, the configuration system would not know that\nthe alternate referred to a handler. To cater for this, a generic\nresolution system allows the user to specify:
\nhandlers:\n file:\n # configuration of file handler goes here\n\n custom:\n (): my.package.MyHandler\n alternate: cfg://handlers.file
\nThe literal string 'cfg://handlers.file' will be resolved in an\nanalogous way to strings with the ext:// prefix, but looking\nin the configuration itself rather than the import namespace. The\nmechanism allows access by dot or by index, in a similar way to\nthat provided by str.format. Thus, given the following snippet:
\nhandlers:\n email:\n class: logging.handlers.SMTPHandler\n mailhost: localhost\n fromaddr: my_app@domain.tld\n toaddrs:\n - support_team@domain.tld\n - dev_team@domain.tld\n subject: Houston, we have a problem.
in the configuration, the string 'cfg://handlers' would resolve to the dict with key handlers, the string 'cfg://handlers.email' would resolve to the dict with key email in the handlers dict, and so on. The string 'cfg://handlers.email.toaddrs[1]' would resolve to 'dev_team@domain.tld' and the string 'cfg://handlers.email.toaddrs[0]' would resolve to the value 'support_team@domain.tld'. The subject value could be accessed using either 'cfg://handlers.email.subject' or, equivalently, 'cfg://handlers.email[subject]'. The latter form only needs to be used if the key contains spaces or non-alphanumeric characters. If an index value consists only of decimal digits, access will be attempted using the corresponding integer value, falling back to the string value if needed.
\nGiven a string cfg://handlers.myhandler.mykey.123, this will\nresolve to config_dict['handlers']['myhandler']['mykey']['123'].\nIf the string is specified as cfg://handlers.myhandler.mykey[123],\nthe system will attempt to retrieve the value from\nconfig_dict['handlers']['myhandler']['mykey'][123], and fall back\nto config_dict['handlers']['myhandler']['mykey']['123'] if that\nfails.
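The digit/index fallback can be sketched as follows (resolve_index is an illustrative helper, not part of the logging package):

```python
def resolve_index(container, key):
    """Mimic the access fallback: for an all-digit key, attempt integer
    access first, then fall back to the string key."""
    if key.isdigit():
        try:
            return container[int(key)]
        except (KeyError, IndexError, TypeError):
            pass
    return container[key]
```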
\nImport resolution, by default, uses the builtin __import__() function\nto do its importing. You may want to replace this with your own importing\nmechanism: if so, you can replace the importer attribute of the\nDictConfigurator or its superclass, the\nBaseConfigurator class. However, you need to be\ncareful because of the way functions are accessed from classes via\ndescriptors. If you are using a Python callable to do your imports, and you\nwant to define it at class level rather than instance level, you need to wrap\nit with staticmethod(). For example:
\nfrom importlib import import_module\nfrom logging.config import BaseConfigurator\n\nBaseConfigurator.importer = staticmethod(import_module)\n
You don’t need to wrap with staticmethod() if you’re setting the import\ncallable on a configurator instance.
\nThe configuration file format understood by fileConfig() is based on\nconfigparser functionality. The file must contain sections called\n[loggers], [handlers] and [formatters] which identify by name the\nentities of each type which are defined in the file. For each such entity, there\nis a separate section which identifies how that entity is configured. Thus, for\na logger named log01 in the [loggers] section, the relevant\nconfiguration details are held in a section [logger_log01]. Similarly, a\nhandler called hand01 in the [handlers] section will have its\nconfiguration held in a section called [handler_hand01], while a formatter\ncalled form01 in the [formatters] section will have its configuration\nspecified in a section called [formatter_form01]. The root logger\nconfiguration must be specified in a section called [logger_root].
\nExamples of these sections in the file are given below.
\n[loggers]\nkeys=root,log02,log03,log04,log05,log06,log07\n\n[handlers]\nkeys=hand01,hand02,hand03,hand04,hand05,hand06,hand07,hand08,hand09\n\n[formatters]\nkeys=form01,form02,form03,form04,form05,form06,form07,form08,form09\n
The root logger must specify a level and a list of handlers. An example of a\nroot logger section is given below.
\n[logger_root]\nlevel=NOTSET\nhandlers=hand01\n
The level entry can be one of DEBUG, INFO, WARNING, ERROR, CRITICAL or\nNOTSET. For the root logger only, NOTSET means that all messages will be\nlogged. Level values are eval()uated in the context of the logging\npackage’s namespace.
The handlers entry is a comma-separated list of handler names. These names must appear in the [handlers] section and have corresponding sections in the configuration file.
\nFor loggers other than the root logger, some additional information is required.\nThis is illustrated by the following example.
\n[logger_parser]\nlevel=DEBUG\nhandlers=hand01\npropagate=1\nqualname=compiler.parser\n
The level and handlers entries are interpreted as for the root logger,\nexcept that if a non-root logger’s level is specified as NOTSET, the system\nconsults loggers higher up the hierarchy to determine the effective level of the\nlogger. The propagate entry is set to 1 to indicate that messages must\npropagate to handlers higher up the logger hierarchy from this logger, or 0 to\nindicate that messages are not propagated to handlers up the hierarchy. The\nqualname entry is the hierarchical channel name of the logger, that is to\nsay the name used by the application to get the logger.
\nSections which specify handler configuration are exemplified by the following.
\n[handler_hand01]\nclass=StreamHandler\nlevel=NOTSET\nformatter=form01\nargs=(sys.stdout,)
\nThe class entry indicates the handler’s class (as determined by eval()\nin the logging package’s namespace). The level is interpreted as for\nloggers, and NOTSET is taken to mean ‘log everything’.
\n\nChanged in version 2.6: Added support for resolving the handler’s class as a dotted module and\nclass name.
\nThe formatter entry indicates the key name of the formatter for this\nhandler. If blank, a default formatter (logging._defaultFormatter) is used.\nIf a name is specified, it must appear in the [formatters] section and have\na corresponding section in the configuration file.
\nThe args entry, when eval()uated in the context of the logging\npackage’s namespace, is the list of arguments to the constructor for the handler\nclass. Refer to the constructors for the relevant handlers, or to the examples\nbelow, to see how typical entries are constructed.
\n[handler_hand02]\nclass=FileHandler\nlevel=DEBUG\nformatter=form02\nargs=('python.log', 'w')\n\n[handler_hand03]\nclass=handlers.SocketHandler\nlevel=INFO\nformatter=form03\nargs=('localhost', handlers.DEFAULT_TCP_LOGGING_PORT)\n\n[handler_hand04]\nclass=handlers.DatagramHandler\nlevel=WARN\nformatter=form04\nargs=('localhost', handlers.DEFAULT_UDP_LOGGING_PORT)\n\n[handler_hand05]\nclass=handlers.SysLogHandler\nlevel=ERROR\nformatter=form05\nargs=(('localhost', handlers.SYSLOG_UDP_PORT), handlers.SysLogHandler.LOG_USER)\n\n[handler_hand06]\nclass=handlers.NTEventLogHandler\nlevel=CRITICAL\nformatter=form06\nargs=('Python Application', '', 'Application')\n\n[handler_hand07]\nclass=handlers.SMTPHandler\nlevel=WARN\nformatter=form07\nargs=('localhost', 'from@abc', ['user1@abc', 'user2@xyz'], 'Logger Subject')\n\n[handler_hand08]\nclass=handlers.MemoryHandler\nlevel=NOTSET\nformatter=form08\ntarget=\nargs=(10, ERROR)\n\n[handler_hand09]\nclass=handlers.HTTPHandler\nlevel=NOTSET\nformatter=form09\nargs=('localhost:9022', '/log', 'GET')
\nSections which specify formatter configuration are typified by the following.
\n[formatter_form01]\nformat=F1 %(asctime)s %(levelname)s %(message)s\ndatefmt=\nclass=logging.Formatter
\nThe format entry is the overall format string, and the datefmt entry is\nthe strftime()-compatible date/time format string. If empty, the\npackage substitutes ISO8601 format date/times, which is almost equivalent to\nspecifying the date format string '%Y-%m-%d %H:%M:%S'. The ISO8601 format\nalso specifies milliseconds, which are appended to the result of using the above\nformat string, with a comma separator. An example time in ISO8601 format is\n2003-01-23 00:29:50,411.
\nThe class entry is optional. It indicates the name of the formatter’s class\n(as a dotted module and class name.) This option is useful for instantiating a\nFormatter subclass. Subclasses of Formatter can present\nexception tracebacks in an expanded or condensed format.
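Putting the sections together, a minimal complete configuration file can be written out and loaded with fileConfig(); the file contents below are an illustrative sketch covering each section type:

```python
import logging
import logging.config
import os
import tempfile

CONFIG_TEXT = """\
[loggers]
keys=root,parser

[handlers]
keys=hand01

[formatters]
keys=form01

[logger_root]
level=NOTSET
handlers=hand01

[logger_parser]
level=DEBUG
handlers=hand01
propagate=1
qualname=compiler.parser

[handler_hand01]
class=StreamHandler
level=NOTSET
formatter=form01
args=(sys.stdout,)

[formatter_form01]
format=%(asctime)s %(levelname)s %(message)s
datefmt=
"""

# Write the configuration to a temporary file, then load it.
fd, path = tempfile.mkstemp(suffix='.ini')
with os.fdopen(fd, 'w') as f:
    f.write(CONFIG_TEXT)
logging.config.fileConfig(path)
os.remove(path)
```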
\nPanels are windows with the added feature of depth, so they can be stacked on\ntop of each other, and only the visible portions of each window will be\ndisplayed. Panels can be added, moved up or down in the stack, and removed.
\nThe module curses.panel defines the following functions:
\nPanel objects, as returned by new_panel() above, are windows with a\nstacking order. There’s always a window associated with a panel which determines\nthe content, while the panel methods are responsible for the window’s depth in\nthe panel stack.
\nPanel objects have the following methods:
\n\nNew in version 2.3.
\nSource code: Lib/platform.py
\nNote
\nSpecific platforms listed alphabetically, with Linux included in the Unix\nsection.
\nQueries the given executable (defaults to the Python interpreter binary) for\nvarious architecture information.
\nReturns a tuple (bits, linkage) which contain information about the bit\narchitecture and the linkage format used for the executable. Both values are\nreturned as strings.
Values that cannot be determined are returned as given by the parameter presets. If bits is given as '', sizeof(pointer) (or sizeof(long) on Python versions < 1.5.2) is used as an indicator for the supported pointer size.
The function relies on the system's file command to do the actual work. This command is available on most if not all Unix platforms and on some non-Unix platforms, and then only if the executable points to the Python interpreter. Reasonable defaults are used when the above needs are not met.
\nNote
\nOn Mac OS X (and perhaps other platforms), executable files may be\nuniversal files containing multiple architectures.
\nTo get at the “64-bitness” of the current interpreter, it is more\nreliable to query the sys.maxsize attribute:
\nis_64bits = sys.maxsize > 2**32\n
Returns a single string identifying the underlying platform with as much useful\ninformation as possible.
\nThe output is intended to be human readable rather than machine parseable. It\nmay look different on different platforms and this is intended.
\nIf aliased is true, the function will use aliases for various platforms that\nreport system names which differ from their common names, for example SunOS will\nbe reported as Solaris. The system_alias() function is used to implement\nthis.
\nSetting terse to true causes the function to return only the absolute minimum\ninformation needed to identify the platform.
\nReturns the (real) processor name, e.g. 'amdk6'.
\nAn empty string is returned if the value cannot be determined. Note that many\nplatforms do not provide this information or simply return the same value as for\nmachine(). NetBSD does this.
\nReturns a string identifying the Python implementation SCM branch.
\n\nNew in version 2.6.
\nReturns a string identifying the Python implementation. Possible return values\nare: ‘CPython’, ‘IronPython’, ‘Jython’, ‘PyPy’.
\n\nNew in version 2.6.
\nReturns a string identifying the Python implementation SCM revision.
\n\nNew in version 2.6.
\nReturns the Python version as string 'major.minor.patchlevel'
\nNote that unlike the Python sys.version, the returned value will always\ninclude the patchlevel (it defaults to 0).
\nReturns the Python version as tuple (major, minor, patchlevel) of strings.
\nNote that unlike the Python sys.version, the returned value will always\ninclude the patchlevel (it defaults to '0').
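A quick sketch of the relationship between the two functions:

```python
import platform

# Unlike sys.version, python_version() always includes the patchlevel,
# and python_version_tuple() returns the same three components as strings.
version = platform.python_version()
parts = platform.python_version_tuple()
```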
\nFairly portable uname interface. Returns a tuple of strings (system, node,\nrelease, version, machine, processor) identifying the underlying platform.
\nNote that unlike the os.uname() function this also returns possible\nprocessor information as additional tuple entry.
\nEntries which cannot be determined are set to ''.
\nVersion interface for Jython.
\nReturns a tuple (release, vendor, vminfo, osinfo) with vminfo being a\ntuple (vm_name, vm_release, vm_vendor) and osinfo being a tuple\n(os_name, os_version, os_arch). Values which cannot be determined are set to\nthe defaults given as parameters (which all default to '').
\nGet additional version information from the Windows Registry and return a tuple\n(version, csd, ptype) referring to version number, CSD level and OS type\n(multi/single processor).
\nAs a hint: ptype is 'Uniprocessor Free' on single processor NT machines\nand 'Multiprocessor Free' on multi processor machines. The ‘Free’ refers\nto the OS version being free of debugging code. It could also state ‘Checked’\nwhich means the OS version uses debugging code, i.e. code that checks arguments,\nranges, etc.
\nNote
\nThis function works best with Mark Hammond’s\nwin32all package installed, but also on Python 2.3 and\nlater (support for this was added in Python 2.6). It obviously\nonly runs on Win32 compatible platforms.
\nGet Mac OS version information and return it as tuple (release, versioninfo,\nmachine) with versioninfo being a tuple (version, dev_stage,\nnon_release_version).
\nEntries which cannot be determined are set to ''. All tuple entries are\nstrings.
This is an old version of the functionality now provided by linux_distribution(). For new code, please use linux_distribution().
\nThe only difference between the two is that dist() always\nreturns the short name of the distribution taken from the\nsupported_dists parameter.
\n\nDeprecated since version 2.6.
\nTries to determine the name of the Linux OS distribution name.
\nsupported_dists may be given to define the set of Linux distributions to\nlook for. It defaults to a list of currently supported Linux distributions\nidentified by their release file name.
\nIf full_distribution_name is true (default), the full distribution read\nfrom the OS is returned. Otherwise the short name taken from\nsupported_dists is used.
\nReturns a tuple (distname,version,id) which defaults to the args given as\nparameters. id is the item in parentheses after the version number. It\nis usually the version codename.
\n\nNew in version 2.6.
\nTries to determine the libc version against which the file executable (defaults\nto the Python interpreter) is linked. Returns a tuple of strings (lib,\nversion) which default to the given parameters in case the lookup fails.
Note that this function has intimate knowledge of how different libc versions add symbols to the executable and is probably only usable for executables compiled using gcc.
\nThe file is read and scanned in chunks of chunksize bytes.
\nPlatforms: Unix
\n\nChanged in version 1.6: Added support for the ncurses library and converted to a package.
\nThe curses module provides an interface to the curses library, the\nde-facto standard for portable advanced terminal handling.
\nWhile curses is most widely used in the Unix environment, versions are available\nfor DOS, OS/2, and possibly other systems as well. This extension module is\ndesigned to match the API of ncurses, an open-source curses library hosted on\nLinux and the BSD variants of Unix.
\nNote
\nSince version 5.4, the ncurses library decides how to interpret non-ASCII data\nusing the nl_langinfo function. That means that you have to call\nlocale.setlocale() in the application and encode Unicode strings\nusing one of the system’s available encodings. This example uses the\nsystem’s default encoding:
\nimport locale\nlocale.setlocale(locale.LC_ALL, '')\ncode = locale.getpreferredencoding()\n
Then use code as the encoding for str.encode() calls.
\nSee also
\nThe Demo/curses/ directory in the Python source distribution contains\nsome example programs using the curses bindings provided by this module.
\nThe module curses defines the following exception:
\nNote
\nWhenever x or y arguments to a function or a method are optional, they\ndefault to the current cursor location. Whenever attr is optional, it defaults\nto A_NORMAL.
\nThe module curses defines the following functions:
Update the physical screen. The curses library keeps two data structures, one representing the current physical screen contents and a virtual screen representing the desired next state. doupdate() updates the physical screen to match the virtual screen.
\nThe virtual screen may be updated by a noutrefresh() call after write\noperations such as addstr() have been performed on a window. The normal\nrefresh() call is simply noutrefresh() followed by doupdate();\nif you have to update multiple windows, you can speed performance and perhaps\nreduce screen flicker by issuing noutrefresh() calls on all windows,\nfollowed by a single doupdate().
\nInitialize the library. Return a WindowObject which represents the\nwhole screen.
\nNote
\nIf there is an error opening the terminal, the underlying curses library may\ncause the interpreter to exit.
\nCreate and return a pointer to a new pad data structure with the given number\nof lines and columns. A pad is returned as a window object.
\nA pad is like a window, except that it is not restricted by the screen size, and\nis not necessarily associated with a particular part of the screen. Pads can be\nused when a large window is needed, and only a part of the window will be on the\nscreen at one time. Automatic refreshes of pads (such as from scrolling or\nechoing of input) do not occur. The refresh() and noutrefresh()\nmethods of a pad require 6 arguments to specify the part of the pad to be\ndisplayed and the location on the screen to be used for the display. The\narguments are pminrow, pmincol, sminrow, smincol, smaxrow, smaxcol; the p\narguments refer to the upper left corner of the pad region to be displayed and\nthe s arguments define a clipping box on the screen within which the pad region\nis to be displayed.
\nReturn a new window, whose left-upper corner is at (begin_y, begin_x), and\nwhose height/width is nlines/ncols.
\nBy default, the window will extend from the specified position to the lower\nright corner of the screen.
\nMust be called if the programmer wants to use colors, and before any other color\nmanipulation routine is called. It is good practice to call this routine right\nafter initscr().
\nstart_color() initializes eight basic colors (black, red, green, yellow,\nblue, magenta, cyan, and white), and two global variables in the curses\nmodule, COLORS and COLOR_PAIRS, containing the maximum number\nof colors and color-pairs the terminal can support. It also restores the colors\non the terminal to the values they had when the terminal was just turned on.
\nSpecify that the file descriptor fd be used for typeahead checking. If fd\nis -1, then no typeahead checking is done.
\nThe curses library does “line-breakout optimization” by looking for typeahead\nperiodically while updating the screen. If input is found, and it is coming\nfrom a tty, the current update is postponed until refresh or doupdate is called\nagain, allowing faster response to commands typed in advance. This function\nallows specifying a different file descriptor for typeahead checking.
\nPush ch so the next getch() will return it.
\nNote
\nOnly one ch can be pushed before getch() is called.
\nWindow objects, as returned by initscr() and newwin() above, have\nthe following methods:
\nNote
A character means a C character (an ASCII code), rather than a Python character (a string of length 1). (This note is true whenever the documentation mentions a character.) The built-in ord() is handy for converting strings to character codes.
Paint character ch at (y, x) with attributes attr, overwriting any character previously painted at that location. By default, the character position and attributes are the current settings for the window object.
\nSet the background property of the window to the character ch, with\nattributes attr. The change is then applied to every character position in\nthat window:
\nDraw a border around the edges of the window. Each parameter specifies the\ncharacter to use for a specific part of the border; see the table below for more\ndetails. The characters can be specified as integers or as one-character\nstrings.
\nNote
\nA 0 value for any parameter will cause the default character to be used for\nthat parameter. Keyword parameters can not be used. The defaults are listed\nin this table:
\nParameter | \nDescription | \nDefault value | \n
---|---|---|
ls | \nLeft side | \nACS_VLINE | \n
rs | \nRight side | \nACS_VLINE | \n
ts | \nTop | \nACS_HLINE | \n
bs | \nBottom | \nACS_HLINE | \n
tl | \nUpper-left corner | \nACS_ULCORNER | \n
tr | \nUpper-right corner | \nACS_URCORNER | \n
bl | \nBottom-left corner | \nACS_LLCORNER | \n
br | \nBottom-right corner | \nACS_LRCORNER | \n
If yes is 1, the cursor is left where it is on update, instead of being moved to the\n“cursor position.” This reduces cursor movement where possible. If possible the cursor\nwill be made invisible.
\nIf yes is 0, cursor will always be at “cursor position” after an update.
\nIf yes is 1, escape sequences will not be timed out.
\nIf yes is 0, after a few milliseconds, an escape sequence will not be\ninterpreted, and will be left in the input stream as is.
\nOverlay the window on top of destwin. The windows need not be the same size,\nonly the overlapping region is copied. This copy is non-destructive, which means\nthat the current background character does not overwrite the old contents of\ndestwin.
\nTo get fine-grained control over the copied region, the second form of\noverlay() can be used. sminrow and smincol are the upper-left\ncoordinates of the source window, and the other variables mark a rectangle in\nthe destination window.
\nOverwrite the window on top of destwin. The windows need not be the same size,\nin which case only the overlapping region is copied. This copy is destructive,\nwhich means that the current background character overwrites the old contents of\ndestwin.
\nTo get fine-grained control over the copied region, the second form of\noverwrite() can be used. sminrow and smincol are the upper-left\ncoordinates of the source window, the other variables mark a rectangle in the\ndestination window.
\nUpdate the display immediately (sync actual screen with previous\ndrawing/deleting methods).
\nThe 6 optional arguments can only be specified when the window is a pad created\nwith newpad(). The additional parameters are needed to indicate what part\nof the pad and screen are involved. pminrow and pmincol specify the upper\nleft-hand corner of the rectangle to be displayed in the pad. sminrow,\nsmincol, smaxrow, and smaxcol specify the edges of the rectangle to be\ndisplayed on the screen. The lower right-hand corner of the rectangle to be\ndisplayed in the pad is calculated from the screen coordinates, since the\nrectangles must be the same size. Both rectangles must be entirely contained\nwithin their respective structures. Negative values of pminrow, pmincol,\nsminrow, or smincol are treated as if they were zero.
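A minimal sketch of the six-argument form (the pad size and screen rectangle are made up for illustration); as with all curses code, it only runs when a real terminal is attached:

```python
import sys
import curses

def show_pad(stdscr):
    pad = curses.newpad(100, 100)        # a pad larger than the screen
    for y in range(100):
        for x in range(99):              # skip the bottom-right cell
            pad.addch(y, x, ord('a') + (x + y) % 26)
    # Display pad rows/cols starting at (5, 5) in the screen rectangle
    # with corners (5, 5) and (20, 75); both rectangles are the same size.
    pad.refresh(5, 5, 5, 5, 20, 75)
    stdscr.getch()

if __name__ == "__main__":
    if sys.stdout.isatty():
        curses.wrapper(show_pad)
```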
\nReturn a sub-window, whose upper-left corner is at (begin_y, begin_x), and\nwhose width/height is ncols/nlines.
\nBy default, the sub-window will extend from the specified position to the lower\nright corner of the window.
\nThe curses module defines the following data members:
\nSeveral constants are available to specify character cell attributes:
\nAttribute | \nMeaning | \n
---|---|
A_ALTCHARSET | \nAlternate character set mode. | \n
A_BLINK | \nBlink mode. | \n
A_BOLD | \nBold mode. | \n
A_DIM | \nDim mode. | \n
A_NORMAL | \nNormal attribute. | \n
A_REVERSE | \nReverse background and\nforeground colors. | \n
A_STANDOUT | \nStandout mode. | \n
A_UNDERLINE | \nUnderline mode. | \n
Keys are referred to by integer constants with names starting with KEY_.\nThe exact keycaps available are system dependent.
\nKey constant | \nKey | \n
---|---|
KEY_MIN | \nMinimum key value | \n
KEY_BREAK | \nBreak key (unreliable) | \n
KEY_DOWN | \nDown-arrow | \n
KEY_UP | \nUp-arrow | \n
KEY_LEFT | \nLeft-arrow | \n
KEY_RIGHT | \nRight-arrow | \n
KEY_HOME | \nHome key (upward+left arrow) | \n
KEY_BACKSPACE | \nBackspace (unreliable) | \n
KEY_F0 | \nFunction keys. Up to 64 function keys are\nsupported. | \n
KEY_Fn | \nValue of function key n | \n
KEY_DL | \nDelete line | \n
KEY_IL | \nInsert line | \n
KEY_DC | \nDelete character | \n
KEY_IC | \nInsert char or enter insert mode | \n
KEY_EIC | \nExit insert char mode | \n
KEY_CLEAR | \nClear screen | \n
KEY_EOS | \nClear to end of screen | \n
KEY_EOL | \nClear to end of line | \n
KEY_SF | \nScroll 1 line forward | \n
KEY_SR | \nScroll 1 line backward (reverse) | \n
KEY_NPAGE | \nNext page | \n
KEY_PPAGE | \nPrevious page | \n
KEY_STAB | \nSet tab | \n
KEY_CTAB | \nClear tab | \n
KEY_CATAB | \nClear all tabs | \n
KEY_ENTER | \nEnter or send (unreliable) | \n
KEY_SRESET | \nSoft (partial) reset (unreliable) | \n
KEY_RESET | \nReset or hard reset (unreliable) | \n
KEY_PRINT | \nPrint | \n
KEY_LL | \nHome down or bottom (lower left) | \n
KEY_A1 | \nUpper left of keypad | \n
KEY_A3 | \nUpper right of keypad | \n
KEY_B2 | \nCenter of keypad | \n
KEY_C1 | \nLower left of keypad | \n
KEY_C3 | \nLower right of keypad | \n
KEY_BTAB | \nBack tab | \n
KEY_BEG | \nBeg (beginning) | \n
KEY_CANCEL | \nCancel | \n
KEY_CLOSE | \nClose | \n
KEY_COMMAND | \nCmd (command) | \n
KEY_COPY | \nCopy | \n
KEY_CREATE | \nCreate | \n
KEY_END | \nEnd | \n
KEY_EXIT | \nExit | \n
KEY_FIND | \nFind | \n
KEY_HELP | \nHelp | \n
KEY_MARK | \nMark | \n
KEY_MESSAGE | \nMessage | \n
KEY_MOVE | \nMove | \n
KEY_NEXT | \nNext | \n
KEY_OPEN | \nOpen | \n
KEY_OPTIONS | \nOptions | \n
KEY_PREVIOUS | \nPrev (previous) | \n
KEY_REDO | \nRedo | \n
KEY_REFERENCE | \nRef (reference) | \n
KEY_REFRESH | \nRefresh | \n
KEY_REPLACE | \nReplace | \n
KEY_RESTART | \nRestart | \n
KEY_RESUME | \nResume | \n
KEY_SAVE | \nSave | \n
KEY_SBEG | \nShifted Beg (beginning) | \n
KEY_SCANCEL | \nShifted Cancel | \n
KEY_SCOMMAND | \nShifted Command | \n
KEY_SCOPY | \nShifted Copy | \n
KEY_SCREATE | \nShifted Create | \n
KEY_SDC | \nShifted Delete char | \n
KEY_SDL | \nShifted Delete line | \n
KEY_SELECT | \nSelect | \n
KEY_SEND | \nShifted End | \n
KEY_SEOL | \nShifted Clear line | \n
KEY_SEXIT | \nShifted Exit | \n
KEY_SFIND | \nShifted Find | \n
KEY_SHELP | \nShifted Help | \n
KEY_SHOME | \nShifted Home | \n
KEY_SIC | \nShifted Input | \n
KEY_SLEFT | \nShifted Left arrow | \n
KEY_SMESSAGE | \nShifted Message | \n
KEY_SMOVE | \nShifted Move | \n
KEY_SNEXT | \nShifted Next | \n
KEY_SOPTIONS | \nShifted Options | \n
KEY_SPREVIOUS | \nShifted Prev | \n
KEY_SPRINT | \nShifted Print | \n
KEY_SREDO | \nShifted Redo | \n
KEY_SREPLACE | \nShifted Replace | \n
KEY_SRIGHT | \nShifted Right arrow | \n
KEY_SRSUME | \nShifted Resume | \n
KEY_SSAVE | \nShifted Save | \n
KEY_SSUSPEND | \nShifted Suspend | \n
KEY_SUNDO | \nShifted Undo | \n
KEY_SUSPEND | \nSuspend | \n
KEY_UNDO | \nUndo | \n
KEY_MOUSE | \nMouse event has occurred | \n
KEY_RESIZE | \nTerminal resize event | \n
KEY_MAX | \nMaximum key value | \n
On VT100s and their software emulations, such as X terminal emulators, there are\nnormally at least four function keys (KEY_F1, KEY_F2,\nKEY_F3, KEY_F4) available, and the arrow keys mapped to\nKEY_UP, KEY_DOWN, KEY_LEFT and KEY_RIGHT in\nthe obvious way. If your machine has a PC keyboard, it is safe to expect arrow\nkeys and twelve function keys (older PC keyboards may have only ten function\nkeys); also, the following keypad mappings are standard:
\nKeycap | \nConstant | \n
---|---|
Insert | \nKEY_IC | \n
Delete | \nKEY_DC | \n
Home | \nKEY_HOME | \n
End | \nKEY_END | \n
Page Up | \nKEY_PPAGE | \n
Page Down | \nKEY_NPAGE | \n
The following table lists characters from the alternate character set. These are\ninherited from the VT100 terminal, and will generally be available on software\nemulations such as X terminals. When there is no graphic available, curses\nfalls back on a crude printable ASCII approximation.
\nNote
\nThese are available only after initscr() has been called.
\nACS code | \nMeaning | \n
---|---|
ACS_BBSS | \nalternate name for upper right corner | \n
ACS_BLOCK | \nsolid square block | \n
ACS_BOARD | \nboard of squares | \n
ACS_BSBS | \nalternate name for horizontal line | \n
ACS_BSSB | \nalternate name for upper left corner | \n
ACS_BSSS | \nalternate name for top tee | \n
ACS_BTEE | \nbottom tee | \n
ACS_BULLET | \nbullet | \n
ACS_CKBOARD | \nchecker board (stipple) | \n
ACS_DARROW | \narrow pointing down | \n
ACS_DEGREE | \ndegree symbol | \n
ACS_DIAMOND | \ndiamond | \n
ACS_GEQUAL | \ngreater-than-or-equal-to | \n
ACS_HLINE | \nhorizontal line | \n
ACS_LANTERN | \nlantern symbol | \n
ACS_LARROW | \nleft arrow | \n
ACS_LEQUAL | \nless-than-or-equal-to | \n
ACS_LLCORNER | \nlower left-hand corner | \n
ACS_LRCORNER | \nlower right-hand corner | \n
ACS_LTEE | \nleft tee | \n
ACS_NEQUAL | \nnot-equal sign | \n
ACS_PI | \nletter pi | \n
ACS_PLMINUS | \nplus-or-minus sign | \n
ACS_PLUS | \nbig plus sign | \n
ACS_RARROW | \nright arrow | \n
ACS_RTEE | \nright tee | \n
ACS_S1 | \nscan line 1 | \n
ACS_S3 | \nscan line 3 | \n
ACS_S7 | \nscan line 7 | \n
ACS_S9 | \nscan line 9 | \n
ACS_SBBS | \nalternate name for lower right corner | \n
ACS_SBSB | \nalternate name for vertical line | \n
ACS_SBSS | \nalternate name for right tee | \n
ACS_SSBB | \nalternate name for lower left corner | \n
ACS_SSBS | \nalternate name for bottom tee | \n
ACS_SSSB | \nalternate name for left tee | \n
ACS_SSSS | \nalternate name for crossover or big plus | \n
ACS_STERLING | \npound sterling | \n
ACS_TTEE | \ntop tee | \n
ACS_UARROW | \nup arrow | \n
ACS_ULCORNER | \nupper left corner | \n
ACS_URCORNER | \nupper right corner | \n
ACS_VLINE | \nvertical line | \n
The following table lists the predefined colors:
\nConstant | \nColor | \n
---|---|
COLOR_BLACK | \nBlack | \n
COLOR_BLUE | \nBlue | \n
COLOR_CYAN | \nCyan (light greenish blue) | \n
COLOR_GREEN | \nGreen | \n
COLOR_MAGENTA | \nMagenta (purplish red) | \n
COLOR_RED | \nRed | \n
COLOR_WHITE | \nWhite | \n
COLOR_YELLOW | \nYellow | \n
\nNew in version 1.6.
\nThe curses.textpad module provides a Textbox class that handles\nelementary text editing in a curses window, supporting a set of keybindings\nresembling those of Emacs (thus, also of Netscape Navigator, BBedit 6.x,\nFrameMaker, and many other programs). The module also provides a\nrectangle-drawing function useful for framing text boxes or for other purposes.
\nThe module curses.textpad defines the following function:
\nYou can instantiate a Textbox object as follows:
\nReturn a textbox widget object. The win argument should be a curses\nWindowObject in which the textbox is to be contained. The edit cursor\nof the textbox is initially located at the upper left hand corner of the\ncontaining window, with coordinates (0, 0). The instance’s\nstripspaces flag is initially on.
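A hedged sketch of creating and using a Textbox inside a framed sub-window (window sizes and the prompt text are illustrative); the interactive part only runs on a real terminal:

```python
import sys
import curses
from curses.textpad import Textbox, rectangle

def edit_demo(stdscr):
    stdscr.addstr(0, 0, "Enter text: (hit Ctrl-G to finish)")
    editwin = curses.newwin(5, 30, 2, 1)     # window the textbox lives in
    rectangle(stdscr, 1, 0, 1 + 5 + 1, 1 + 30 + 1)  # frame around it
    stdscr.refresh()
    box = Textbox(editwin)
    box.edit()                               # edit until Ctrl-G terminates
    return box.gather()                      # collect the window contents

if __name__ == "__main__":
    if sys.stdout.isatty():
        print(edit_demo and curses.wrapper(edit_demo))
```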
\nTextbox objects have the following methods:
\nProcess a single command keystroke. Here are the supported special\nkeystrokes:
\nKeystroke | \nAction | \n
---|---|
Control-A | \nGo to left edge of window. | \n
Control-B | \nCursor left, wrapping to previous line if\nappropriate. | \n
Control-D | \nDelete character under cursor. | \n
Control-E | \nGo to right edge (stripspaces off) or end\nof line (stripspaces on). | \n
Control-F | \nCursor right, wrapping to next line when\nappropriate. | \n
Control-G | \nTerminate, returning the window contents. | \n
Control-H | \nDelete character backward. | \n
Control-J | \nTerminate if the window is 1 line,\notherwise insert newline. | \n
Control-K | \nIf line is blank, delete it, otherwise\nclear to end of line. | \n
Control-L | \nRefresh screen. | \n
Control-N | \nCursor down; move down one line. | \n
Control-O | \nInsert a blank line at cursor location. | \n
Control-P | \nCursor up; move up one line. | \n
Move operations do nothing if the cursor is at an edge where the movement\nis not possible. The following synonyms are supported where possible:
\nConstant | \nKeystroke | \n
---|---|
KEY_LEFT | \nControl-B | \n
KEY_RIGHT | \nControl-F | \n
KEY_UP | \nControl-P | \n
KEY_DOWN | \nControl-N | \n
KEY_BACKSPACE | \nControl-h | \n
All other keystrokes are treated as a command to insert the given\ncharacter and move right (with line wrapping).
\nThis module makes available standard errno system symbols. The value of each\nsymbol is the corresponding integer value. The names and descriptions are\nborrowed from linux/include/errno.h, which should be pretty\nall-inclusive.
\nTo translate a numeric error code to an error message, use os.strerror().
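A small sketch of both lookups (the symbolic name via errno.errorcode, the message via os.strerror()); the exact message text is platform-dependent:

```python
import errno
import os

# Map a numeric code to its symbolic name...
name = errno.errorcode[errno.ENOENT]     # 'ENOENT'
# ...and to a human-readable message.
msg = os.strerror(errno.ENOENT)          # e.g. 'No such file or directory'
```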
\nOf the following list, symbols that are not used on the current platform are not\ndefined by the module. The specific list of defined symbols is available as\nerrno.errorcode.keys(). Symbols available can include:
\nThis module provides access to the select() and poll() functions\navailable in most operating systems, epoll() available on Linux 2.5+ and\nkqueue() available on most BSD.\nNote that on Windows, it only works for sockets; on other operating systems,\nit also works for other file types (in particular, on Unix, it works on pipes).\nIt cannot be used on regular files to determine whether a file has grown since\nit was last read.
\nThe module defines the following:
\n(Only supported on Linux 2.5.44 and newer.) Returns an edge polling object,\nwhich can be used as Edge or Level Triggered interface for I/O events; see\nsection Edge and Level Trigger Polling (epoll) Objects below for the methods supported by epolling\nobjects.
\n\nNew in version 2.6.
\n(Only supported on BSD.) Returns a kernel queue object; see section\nKqueue Objects below for the methods supported by kqueue objects.
\n\nNew in version 2.6.
\n(Only supported on BSD.) Returns a kernel event object; see section\nKevent Objects below for the methods supported by kevent objects.
\n\nNew in version 2.6.
\nThis is a straightforward interface to the Unix select() system call.\nThe first three arguments are sequences of ‘waitable objects’: either\nintegers representing file descriptors or objects with a parameterless method\nnamed fileno() returning such an integer:
\nEmpty sequences are allowed, but acceptance of three empty sequences is\nplatform-dependent. (It is known to work on Unix but not on Windows.) The\noptional timeout argument specifies a time-out as a floating point number\nin seconds. When the timeout argument is omitted the function blocks until\nat least one file descriptor is ready. A time-out value of zero specifies a\npoll and never blocks.
\nThe return value is a triple of lists of objects that are ready: subsets of the\nfirst three arguments. When the time-out is reached without a file descriptor\nbecoming ready, three empty lists are returned.
\nAmong the acceptable object types in the sequences are Python file objects (e.g.\nsys.stdin, or objects returned by open() or os.popen()), socket\nobjects returned by socket.socket(). You may also define a wrapper\nclass yourself, as long as it has an appropriate fileno() method (that\nreally returns a file descriptor, not just a random integer).
\nNote
\nFile objects on Windows are not acceptable, but sockets are. On Windows,\nthe underlying select() function is provided by the WinSock\nlibrary, and does not handle file descriptors that don’t originate from\nWinSock.
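A minimal Unix sketch of select() using a pipe (the one-second timeout is arbitrary); the pipe's read end becomes readable as soon as the write end is written to:

```python
import os
import select

# A pipe: r_fd becomes readable once something is written to w_fd.
r_fd, w_fd = os.pipe()
os.write(w_fd, b"ping")

# Wait up to 1 second; the result is subsets of the three input lists.
readable, writable, exceptional = select.select([r_fd], [], [], 1.0)

os.close(r_fd)
os.close(w_fd)
```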
\nFiles reported as ready for writing by select(), poll() or\nsimilar interfaces in this module are guaranteed to not block on a write\nof up to PIPE_BUF bytes.\nThis value is guaranteed by POSIX to be at least 512. Availability: Unix.
\n\nNew in version 2.7.
\n\n\nhttp://linux.die.net/man/4/epoll
\neventmask
\nConstant | \nMeaning | \n
---|---|
EPOLLIN | \nAvailable for read | \n
EPOLLOUT | \nAvailable for write | \n
EPOLLPRI | \nUrgent data for read | \n
EPOLLERR | \nError condition happened on the assoc. fd | \n
EPOLLHUP | \nHang up happened on the assoc. fd | \n
EPOLLET | \nSet Edge Trigger behavior; the default is\nLevel Trigger behavior | \n
EPOLLONESHOT | \nSet one-shot behavior. After one event is\npulled out, the fd is internally disabled | \n
EPOLLRDNORM | \nEquivalent to EPOLLIN | \n
EPOLLRDBAND | \nPriority data band can be read | \n
EPOLLWRNORM | \nEquivalent to EPOLLOUT | \n
EPOLLWRBAND | \nPriority data may be written | \n
EPOLLMSG | \nIgnored | \n
Register a fd descriptor with the epoll object.
\nNote
\nRegistering a file descriptor that’s already registered raises an\nIOError – in contrast to the register() method of Polling Objects.
\nThe poll() system call, supported on most Unix systems, provides better\nscalability for network servers that service many, many clients at the same\ntime. poll() scales better because the system call only requires listing\nthe file descriptors of interest, while select() builds a bitmap, turns\non bits for the fds of interest, and then afterward the whole bitmap has to be\nlinearly scanned again. select() is O(highest file descriptor), while\npoll() is O(number of file descriptors).
\nRegister a file descriptor with the polling object. Future calls to the\npoll() method will then check whether the file descriptor has any pending\nI/O events. fd can be either an integer, or an object with a fileno()\nmethod that returns an integer. File objects implement fileno(), so they\ncan also be used as the argument.
\neventmask is an optional bitmask describing the type of events you want to\ncheck for, and can be a combination of the constants POLLIN,\nPOLLPRI, and POLLOUT, described in the table below. If not\nspecified, the default value used will check for all 3 types of events.
\nConstant | \nMeaning | \n
---|---|
POLLIN | \nThere is data to read | \n
POLLPRI | \nThere is urgent data to read | \n
POLLOUT | \nReady for output: writing will not block | \n
POLLERR | \nError condition of some sort | \n
POLLHUP | \nHung up | \n
POLLNVAL | \nInvalid request: descriptor not open | \n
Registering a file descriptor that’s already registered is not an error, and has\nthe same effect as registering the descriptor exactly once.
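A Unix sketch of the register/poll cycle on a pipe (the 1000 ms timeout is arbitrary); note that poll() takes its timeout in milliseconds, unlike select():

```python
import os
import select

# A pipe gives us a file descriptor to watch for readability.
r_fd, w_fd = os.pipe()

p = select.poll()
p.register(r_fd, select.POLLIN)   # only interested in "data to read"

os.write(w_fd, b"x")
events = p.poll(1000)             # timeout in milliseconds

# events is a list of (fd, eventmask) pairs
fd, mask = events[0]

p.unregister(r_fd)
os.close(r_fd)
os.close(w_fd)
```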
\nModifies an already registered fd. This has the same effect as\nregister(fd, eventmask). Attempting to modify a file descriptor\nthat was never registered causes an IOError exception with errno\nENOENT to be raised.
\n\nNew in version 2.6.
\nRemove a file descriptor being tracked by a polling object. Just like the\nregister() method, fd can be an integer or an object with a\nfileno() method that returns an integer.
\nAttempting to remove a file descriptor that was never registered causes a\nKeyError exception to be raised.
\nLow level interface to kevent
\nhttp://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2
\nName of the kernel filter.
\nConstant | \nMeaning | \n
---|---|
KQ_FILTER_READ | \nTakes a descriptor and returns whenever\nthere is data available to read | \n
KQ_FILTER_WRITE | \nTakes a descriptor and returns whenever\nthere is data available to write | \n
KQ_FILTER_AIO | \nAIO requests | \n
KQ_FILTER_VNODE | \nReturns when one or more of the requested\nevents watched in fflag occurs | \n
KQ_FILTER_PROC | \nWatch for events on a process id | \n
KQ_FILTER_NETDEV | \nWatch for events on a network device\n[not available on Mac OS X] | \n
KQ_FILTER_SIGNAL | \nReturns whenever the watched signal is\ndelivered to the process | \n
KQ_FILTER_TIMER | \nEstablishes an arbitrary timer | \n
Filter action.
\nConstant | \nMeaning | \n
---|---|
KQ_EV_ADD | \nAdds or modifies an event | \n
KQ_EV_DELETE | \nRemoves an event from the queue | \n
KQ_EV_ENABLE | \nPermits control() to return the event | \n
KQ_EV_DISABLE | \nDisables the event | \n
KQ_EV_ONESHOT | \nRemoves event after first occurrence | \n
KQ_EV_CLEAR | \nReset the state after an event is retrieved | \n
KQ_EV_SYSFLAGS | \ninternal event | \n
KQ_EV_FLAG1 | \ninternal event | \n
KQ_EV_EOF | \nFilter specific EOF condition | \n
KQ_EV_ERROR | \nSee return values | \n
Filter specific flags.
\nKQ_FILTER_READ and KQ_FILTER_WRITE filter flags:
\nConstant | \nMeaning | \n
---|---|
KQ_NOTE_LOWAT | \nlow water mark of a socket buffer | \n
KQ_FILTER_VNODE filter flags:
\nConstant | \nMeaning | \n
---|---|
KQ_NOTE_DELETE | \nunlink() was called | \n
KQ_NOTE_WRITE | \na write occurred | \n
KQ_NOTE_EXTEND | \nthe file was extended | \n
KQ_NOTE_ATTRIB | \nan attribute was changed | \n
KQ_NOTE_LINK | \nthe link count has changed | \n
KQ_NOTE_RENAME | \nthe file was renamed | \n
KQ_NOTE_REVOKE | \naccess to the file was revoked | \n
KQ_FILTER_PROC filter flags:
\nConstant | \nMeaning | \n
---|---|
KQ_NOTE_EXIT | \nthe process has exited | \n
KQ_NOTE_FORK | \nthe process has called fork() | \n
KQ_NOTE_EXEC | \nthe process has executed a new process | \n
KQ_NOTE_PCTRLMASK | \ninternal filter flag | \n
KQ_NOTE_PDATAMASK | \ninternal filter flag | \n
KQ_NOTE_TRACK | \nfollow a process across fork() | \n
KQ_NOTE_CHILD | \nreturned on the child process for\nNOTE_TRACK | \n
KQ_NOTE_TRACKERR | \nunable to attach to a child | \n
KQ_FILTER_NETDEV filter flags (not available on Mac OS X):
\nConstant | \nMeaning | \n
---|---|
KQ_NOTE_LINKUP | \nlink is up | \n
KQ_NOTE_LINKDOWN | \nlink is down | \n
KQ_NOTE_LINKINV | \nlink state is invalid | \n
Source code: Lib/dummy_threading.py
\nThis module provides a duplicate interface to the threading module. It\nis meant to be imported when the thread module is not provided on a\nplatform.
\nSuggested usage is:
\ntry:\n import threading as _threading\nexcept ImportError:\n import dummy_threading as _threading\n
Be careful not to use this module where deadlock might occur from a thread\nbeing created that blocks waiting for another thread to be created. This often\noccurs with blocking I/O.
\nNote
\nThe thread module has been renamed to _thread in Python 3.0.\nThe 2to3 tool will automatically adapt imports when converting your\nsources to 3.0; however, you should consider using the high-level\nthreading module instead.
\nThis module provides low-level primitives for working with multiple threads\n(also called light-weight processes or tasks) — multiple threads of\ncontrol sharing their global data space. For synchronization, simple locks\n(also called mutexes or binary semaphores) are provided.\nThe threading module provides an easier to use and higher-level\nthreading API built on top of this module.
\nThe module is optional. It is supported on Windows, Linux, SGI IRIX, Solaris\n2.x, as well as on systems that have a POSIX thread (a.k.a. “pthread”)\nimplementation. For systems lacking the thread module, the\ndummy_thread module is available. It duplicates this module’s interface\nand can be used as a drop-in replacement.
\nIt defines the following constant and functions:
\nRaise a KeyboardInterrupt exception in the main thread. A subthread can\nuse this function to interrupt the main thread.
\n\nNew in version 2.3.
\nReturn the thread stack size used when creating new threads. The optional\nsize argument specifies the stack size to be used for subsequently created\nthreads, and must be 0 (use platform or configured default) or a positive\ninteger value of at least 32,768 (32kB). If changing the thread stack size is\nunsupported, the error exception is raised. If the specified stack size is\ninvalid, a ValueError is raised and the stack size is unmodified. 32kB\nis currently the minimum supported stack size value to guarantee sufficient\nstack space for the interpreter itself. Note that some platforms may have\nparticular restrictions on values for the stack size, such as requiring a\nminimum stack size > 32kB or requiring allocation in multiples of the system\nmemory page size - platform documentation should be referred to for more\ninformation (4kB pages are common; using multiples of 4096 for the stack size is\nthe suggested approach in the absence of more specific information).\nAvailability: Windows, systems with POSIX threads.
\n\nNew in version 2.5.
\nLock objects have the following methods:
\nIn addition to these methods, lock objects can also be used via the\nwith statement, e.g.:
\nimport thread\n\na_lock = thread.allocate_lock()\n\nwith a_lock:\n print "a_lock is locked while this executes"\n
Caveats:
Threads interact strangely with interrupts: the KeyboardInterrupt\nexception will be received by an arbitrary thread. (When the signal\nmodule is available, interrupts always go to the main thread.)
\nCalling sys.exit() or raising the SystemExit exception is\nequivalent to calling thread.exit().
\nNot all built-in functions that may block waiting for I/O allow other threads\nto run. (The most popular ones (time.sleep(), file.read(),\nselect.select()) work as expected.)
\nIt is not possible to interrupt the acquire() method on a lock — the\nKeyboardInterrupt exception will happen after the lock has been acquired.
\nWhen the main thread exits, it is system defined whether the other threads\nsurvive. On SGI IRIX using the native thread implementation, they survive. On\nmost other systems, they are killed without executing try ...\nfinally clauses or executing object destructors.
\nWhen the main thread exits, it does not do any of its usual cleanup (except\nthat try ... finally clauses are honored), and the\nstandard I/O files are not flushed.
\n\nNew in version 2.3.
\nThis module defines functions and classes which implement a flexible event\nlogging system for applications and libraries.
\nThe key benefit of having the logging API provided by a standard library module\nis that all Python modules can participate in logging, so your application log\ncan include your own messages integrated with messages from third-party\nmodules.
\nThe module provides a lot of functionality and flexibility. If you are\nunfamiliar with logging, the best way to get to grips with it is to see the\ntutorials (see the links on the right).
\nThe basic classes defined by the module, together with their functions, are\nlisted below.
\nLoggers have the following attributes and methods. Note that Loggers are never\ninstantiated directly, but always through the module-level function\nlogging.getLogger(name).
\nIf this evaluates to true, logging messages are passed by this logger and by\nits child loggers to the handlers of higher level (ancestor) loggers.\nMessages are passed directly to the ancestor loggers’ handlers - neither the\nlevel nor filters of the ancestor loggers in question are considered.
\nIf this evaluates to false, logging messages are not passed to the handlers\nof ancestor loggers.
\nThe constructor sets this attribute to True.
\nSets the threshold for this logger to lvl. Logging messages which are less\nsevere than lvl will be ignored. When a logger is created, the level is set to\nNOTSET (which causes all messages to be processed when the logger is\nthe root logger, or delegation to the parent when the logger is a non-root\nlogger). Note that the root logger is created with level WARNING.
\nThe term ‘delegation to the parent’ means that if a logger has a level of\nNOTSET, its chain of ancestor loggers is traversed until either an ancestor with\na level other than NOTSET is found, or the root is reached.
\nIf an ancestor is found with a level other than NOTSET, then that ancestor’s\nlevel is treated as the effective level of the logger where the ancestor search\nbegan, and is used to determine how a logging event is handled.
\nIf the root is reached, and it has a level of NOTSET, then all messages will be\nprocessed. Otherwise, the root’s level will be used as the effective level.
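The delegation rule can be seen directly (the logger names 'app' and 'app.db' are made up for illustration): a child whose own level is NOTSET reports its nearest configured ancestor's level as its effective level.

```python
import logging

parent = logging.getLogger('app')
child = logging.getLogger('app.db')

parent.setLevel(logging.ERROR)
# child's own level stays NOTSET, so the ancestor chain is searched
# and 'app' supplies the effective level.
assert child.level == logging.NOTSET
assert child.getEffectiveLevel() == logging.ERROR
```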
\nReturns a logger which is a descendant of this logger, as determined by the suffix.\nThus, logging.getLogger('abc').getChild('def.ghi') would return the same\nlogger as would be returned by logging.getLogger('abc.def.ghi'). This is a\nconvenience method, useful when the parent logger is named using e.g. __name__\nrather than a literal string.
\n\nNew in version 2.7.
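The equivalence described above is easy to check: both spellings yield the very same logger object.

```python
import logging

# getChild('def.ghi') is equivalent to naming the dotted path directly:
a = logging.getLogger('abc').getChild('def.ghi')
b = logging.getLogger('abc.def.ghi')
assert a is b   # the very same logger object
```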
\nLogs a message with level DEBUG on this logger. The msg is the\nmessage format string, and the args are the arguments which are merged into\nmsg using the string formatting operator. (Note that this means that you can\nuse keywords in the format string, together with a single dictionary argument.)
\nThere are two keyword arguments in kwargs which are inspected: exc_info\nwhich, if it does not evaluate as false, causes exception information to be\nadded to the logging message. If an exception tuple (in the format returned by\nsys.exc_info()) is provided, it is used; otherwise, sys.exc_info()\nis called to get the exception information.
\nThe second keyword argument is extra which can be used to pass a\ndictionary which is used to populate the __dict__ of the LogRecord created for\nthe logging event with user-defined attributes. These custom attributes can then\nbe used as you like. For example, they could be incorporated into logged\nmessages:
\nFORMAT = '%(asctime)-15s %(clientip)s %(user)-8s %(message)s'\nlogging.basicConfig(format=FORMAT)\nd = { 'clientip' : '192.168.0.1', 'user' : 'fbloggs' }\nlogger = logging.getLogger('tcpserver')\nlogger.warning('Protocol problem: %s', 'connection reset', extra=d)\n
would print something like
\n2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset
\nThe keys in the dictionary passed in extra should not clash with the keys used\nby the logging system. (See the Formatter documentation for more\ninformation on which keys are used by the logging system.)
\nIf you choose to use these attributes in logged messages, you need to exercise\nsome care. In the above example, for instance, the Formatter has been\nset up with a format string which expects ‘clientip’ and ‘user’ in the attribute\ndictionary of the LogRecord. If these are missing, the message will not be\nlogged because a string formatting exception will occur. So in this case, you\nalways need to pass the extra dictionary with these keys.
\nWhile this might be annoying, this feature is intended for use in specialized\ncircumstances, such as multi-threaded servers where the same code executes in\nmany contexts, and interesting conditions which arise are dependent on this\ncontext (such as remote client IP address and authenticated user name, in the\nabove example). In such circumstances, it is likely that specialized\nFormatters would be used with particular Handlers.
\nFinds the caller’s source filename and line number. Returns the filename, line\nnumber and function name as a 3-element tuple.
\n\nChanged in version 2.4: The function name was added. In earlier versions, the filename and line\nnumber were returned as a 2-element tuple.
\nHandlers have the following attributes and methods. Note that Handler\nis never instantiated directly; this class acts as a base for more useful\nsubclasses. However, the __init__() method in subclasses needs to call\nHandler.__init__().
\nFor a list of handlers included as standard, see logging.handlers.
\nFormatter objects have the following attributes and methods. They are\nresponsible for converting a LogRecord to (usually) a string which can\nbe interpreted by either a human or an external system. The base\nFormatter allows a formatting string to be specified. If none is\nsupplied, the default value of '%(message)s' is used.
\nA Formatter can be initialized with a format string which makes use of knowledge\nof the LogRecord attributes - such as the default value mentioned above\nmaking use of the fact that the user’s message and arguments are pre-formatted\ninto a LogRecord‘s message attribute. This format string contains\nstandard Python %-style mapping keys. See section String Formatting Operations\nfor more information on string formatting.
\nThe useful mapping keys in a LogRecord are given in the section on\nLogRecord attributes.
\nReturns a new instance of the Formatter class. The instance is\ninitialized with a format string for the message as a whole, as well as a\nformat string for the date/time portion of a message. If no fmt is\nspecified, '%(message)s' is used. If no datefmt is specified, the\nISO8601 date format is used.
\nThis method should be called from format() by a formatter which\nwants to make use of a formatted time. This method can be overridden in\nformatters to provide for any specific requirement, but the basic behavior\nis as follows: if datefmt (a string) is specified, it is used with\ntime.strftime() to format the creation time of the\nrecord. Otherwise, the ISO8601 format is used. The resulting string is\nreturned.
\nThis function uses a user-configurable function to convert the creation\ntime to a tuple. By default, time.localtime() is used; to change\nthis for a particular formatter instance, set the converter attribute\nto a function with the same signature as time.localtime() or\ntime.gmtime(). To change it for all formatters, for example if you\nwant all logging times to be shown in GMT, set the converter\nattribute in the Formatter class.
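For example, to render all logging times in GMT/UTC rather than local time, the converter attribute can be changed either per instance or on the class (a sketch of the documented idiom):

```python
import logging
import time

# Per-instance: this formatter alone renders times in UTC.
utc_formatter = logging.Formatter('%(asctime)s %(message)s')
utc_formatter.converter = time.gmtime

# Class-wide: every formatter created afterwards renders times in UTC.
logging.Formatter.converter = time.gmtime
```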
\nFilters can be used by Handlers and Loggers for more sophisticated\nfiltering than is provided by levels. The base filter class only allows events\nwhich are below a certain point in the logger hierarchy. For example, a filter\ninitialized with ‘A.B’ will allow events logged by loggers ‘A.B’, ‘A.B.C’,\n‘A.B.C.D’, ‘A.B.D’ etc. but not ‘A.BB’, ‘B.A.B’ etc. If initialized with the\nempty string, all events are passed.
\nReturns an instance of the Filter class. If name is specified, it\nnames a logger which, together with its children, will have its events allowed\nthrough the filter. If name is the empty string, allows every event.
\nNote that filters attached to handlers are consulted whenever an event is\nemitted by the handler, whereas filters attached to loggers are consulted\nwhenever an event is logged to the handler (using debug(), info(),\netc.) This means that events which have been generated by descendant loggers\nwill not be filtered by a logger’s filter setting, unless the filter has also\nbeen applied to those descendant loggers.
\nYou don’t actually need to subclass Filter: you can pass any instance\nwhich has a filter method with the same semantics.
\nAlthough filters are used primarily to filter records based on more\nsophisticated criteria than levels, they get to see every record which is\nprocessed by the handler or logger they’re attached to: this can be useful if\nyou want to do things like counting how many records were processed by a\nparticular logger or handler, or adding, changing or removing attributes in\nthe LogRecord being processed. Obviously changing the LogRecord needs to be\ndone with some care, but it does allow the injection of contextual information\ninto logs (see Using Filters to impart contextual information).
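A minimal sketch of such a duck-typed filter, which counts records and injects a custom attribute (the class and attribute names are made up for illustration):

```python
import logging

class CountingFilter(object):
    """Duck-typed filter: any object with a filter(record) method works."""
    def __init__(self):
        self.count = 0

    def filter(self, record):
        self.count += 1
        record.counted = self.count  # inject a custom attribute
        return True                  # True lets the record through

counter = CountingFilter()
logger = logging.getLogger('demo.counting')
logger.addFilter(counter)
logger.warning('first')
logger.warning('second')
# counter.count is now 2
```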
\nLogRecord instances are created automatically by the Logger\nevery time something is logged, and can be created manually via\nmakeLogRecord() (for example, from a pickled event received over the\nwire).
\nContains all the information pertinent to the event being logged.
The primary information is passed in msg and args, which are combined using msg % args to create the message attribute of the record.
\nChanged in version 2.5: func was added.
\nThe LogRecord has a number of attributes, most of which are derived from the\nparameters to the constructor. (Note that the names do not always correspond\nexactly between the LogRecord constructor parameters and the LogRecord\nattributes.) These attributes can be used to merge data from the record into\nthe format string. The following table lists (in alphabetical order) the\nattribute names, their meanings and the corresponding placeholder in a %-style\nformat string.
| Attribute name | Format | Description |
|---|---|---|
| args | You shouldn't need to format this yourself. | The tuple of arguments merged into msg to produce message. |
| asctime | %(asctime)s | Human-readable time when the LogRecord was created. By default this is of the form '2003-07-08 16:49:45,896' (the numbers after the comma are the millisecond portion of the time). |
| created | %(created)f | Time when the LogRecord was created (as returned by time.time()). |
| exc_info | You shouldn't need to format this yourself. | Exception tuple (à la sys.exc_info) or, if no exception has occurred, None. |
| filename | %(filename)s | Filename portion of pathname. |
| funcName | %(funcName)s | Name of function containing the logging call. |
| levelname | %(levelname)s | Text logging level for the message ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'). |
| levelno | %(levelno)s | Numeric logging level for the message (DEBUG, INFO, WARNING, ERROR, CRITICAL). |
| lineno | %(lineno)d | Source line number where the logging call was issued (if available). |
| module | %(module)s | Module (name portion of filename). |
| msecs | %(msecs)d | Millisecond portion of the time when the LogRecord was created. |
| message | %(message)s | The logged message, computed as msg % args. |
| msg | You shouldn't need to format this yourself. | The format string passed in the original logging call. Merged with args to produce message, or an arbitrary object (see Using arbitrary objects as messages). |
| name | %(name)s | Name of the logger used to log the call. |
| pathname | %(pathname)s | Full pathname of the source file where the logging call was issued (if available). |
| process | %(process)d | Process ID (if available). |
| processName | %(processName)s | Process name (if available). |
| relativeCreated | %(relativeCreated)d | Time in milliseconds when the LogRecord was created, relative to the time the logging module was loaded. |
| thread | %(thread)d | Thread ID (if available). |
| threadName | %(threadName)s | Thread name (if available). |
\nChanged in version 2.5: funcName was added.
LoggerAdapter instances are used to conveniently pass contextual information into logging calls. For a usage example, see the section on adding contextual information to your logging output.
\n\nNew in version 2.6.
\nReturns an instance of LoggerAdapter initialized with an\nunderlying Logger instance and a dict-like object.
\nIn addition to the above, LoggerAdapter supports the following\nmethods of Logger, i.e. debug(), info(), warning(),\nerror(), exception(), critical(), log(),\nisEnabledFor(), getEffectiveLevel(), setLevel(),\nhasHandlers(). These methods have the same signatures as their\ncounterparts in Logger, so you can use the two types of instances\ninterchangeably.
\n\nChanged in version 2.7: The isEnabledFor() method was added to LoggerAdapter. This\nmethod delegates to the underlying logger.
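A minimal sketch of using an adapter, assuming an illustrative 'connid' key supplied via the dict-like object; the default process() implementation merges it into each record as extra:

```python
import logging

logging.basicConfig(format='%(connid)s - %(message)s')
base = logging.getLogger('demo.adapter')

# The dict-like object supplies context merged into every record.
adapter = logging.LoggerAdapter(base, {'connid': 'abc123'})
adapter.warning('connection reset')   # logged as "abc123 - connection reset"
```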
The logging module is intended to be thread-safe without any special work needing to be done by its clients. It achieves this through the use of threading locks; there is one lock to serialize access to the module's shared data, and each handler also creates a lock to serialize access to its underlying I/O.
\nIf you are implementing asynchronous signal handlers using the signal\nmodule, you may not be able to use logging from within such handlers. This is\nbecause lock implementations in the threading module are not always\nre-entrant, and so cannot be invoked from such signal handlers.
In addition to the classes described above, there are a number of module-level functions.
\nReturn a logger with the specified name or, if no name is specified, return a\nlogger which is the root logger of the hierarchy. If specified, the name is\ntypically a dot-separated hierarchical name like “a”, “a.b” or “a.b.c.d”.\nChoice of these names is entirely up to the developer who is using logging.
\nAll calls to this function with a given name return the same logger instance.\nThis means that logger instances never need to be passed between different parts\nof an application.
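This identity guarantee can be demonstrated directly (logger names here are illustrative):

```python
import logging

a = logging.getLogger('myapp.db')
b = logging.getLogger('myapp.db')
assert a is b                 # same name always yields the same Logger object

root = logging.getLogger()    # no name: the root logger
assert root is logging.getLogger()
```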
\nReturn either the standard Logger class, or the last class passed to\nsetLoggerClass(). This function may be called from within a new class\ndefinition, to ensure that installing a customised Logger class will\nnot undo customisations already applied by other code. For example:
\nclass MyLogger(logging.getLoggerClass()):\n # ... override behaviour here
\nLogs a message with level DEBUG on the root logger. The msg is the\nmessage format string, and the args are the arguments which are merged into\nmsg using the string formatting operator. (Note that this means that you can\nuse keywords in the format string, together with a single dictionary argument.)
\nThere are two keyword arguments in kwargs which are inspected: exc_info\nwhich, if it does not evaluate as false, causes exception information to be\nadded to the logging message. If an exception tuple (in the format returned by\nsys.exc_info()) is provided, it is used; otherwise, sys.exc_info()\nis called to get the exception information.
\nThe other optional keyword argument is extra which can be used to pass a\ndictionary which is used to populate the __dict__ of the LogRecord created for\nthe logging event with user-defined attributes. These custom attributes can then\nbe used as you like. For example, they could be incorporated into logged\nmessages. For example:
\nFORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"\nlogging.basicConfig(format=FORMAT)\nd = {'clientip': '192.168.0.1', 'user': 'fbloggs'}\nlogging.warning("Protocol problem: %s", "connection reset", extra=d)\n
would print something like:
\n2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset
\nThe keys in the dictionary passed in extra should not clash with the keys used\nby the logging system. (See the Formatter documentation for more\ninformation on which keys are used by the logging system.)
\nIf you choose to use these attributes in logged messages, you need to exercise\nsome care. In the above example, for instance, the Formatter has been\nset up with a format string which expects ‘clientip’ and ‘user’ in the attribute\ndictionary of the LogRecord. If these are missing, the message will not be\nlogged because a string formatting exception will occur. So in this case, you\nalways need to pass the extra dictionary with these keys.
\nWhile this might be annoying, this feature is intended for use in specialized\ncircumstances, such as multi-threaded servers where the same code executes in\nmany contexts, and interesting conditions which arise are dependent on this\ncontext (such as remote client IP address and authenticated user name, in the\nabove example). In such circumstances, it is likely that specialized\nFormatters would be used with particular Handlers.
\n\nChanged in version 2.5: extra was added.
\nLogs a message with level level on the root logger. The other arguments are\ninterpreted as for debug().
\nPLEASE NOTE: The above module-level functions which delegate to the root\nlogger should not be used in threads, in versions of Python earlier than\n2.7.1 and 3.2, unless at least one handler has been added to the root\nlogger before the threads are started. These convenience functions call\nbasicConfig() to ensure that at least one handler is available; in\nearlier versions of Python, this can (under rare circumstances) lead to\nhandlers being added multiple times to the root logger, which can in turn\nlead to multiple messages for the same event.
Associates level lvl with text levelName in an internal dictionary, which is used to map numeric levels to a textual representation, for example when a Formatter formats a message. This function can also be used to define your own levels. The only constraints are that all levels used must be registered using this function, and that levels should be positive integers that increase with increasing severity.
\nNOTE: If you are thinking of defining your own levels, please see the section\non Custom Levels.
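As a sketch, registering a hypothetical NOTICE level between INFO and WARNING looks like this:

```python
import logging

NOTICE = 25   # hypothetical level between INFO (20) and WARNING (30)
logging.addLevelName(NOTICE, 'NOTICE')

logging.basicConfig(format='%(levelname)s: %(message)s', level=NOTICE)
logging.log(NOTICE, 'cache rebuilt')   # emitted as "NOTICE: cache rebuilt"
```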
\nDoes basic configuration for the logging system by creating a\nStreamHandler with a default Formatter and adding it to the\nroot logger. The functions debug(), info(), warning(),\nerror() and critical() will call basicConfig() automatically\nif no handlers are defined for the root logger.
\nThis function does nothing if the root logger already has handlers\nconfigured for it.
\n\nChanged in version 2.4: Formerly, basicConfig() did not take any keyword arguments.
\nPLEASE NOTE: This function should be called from the main thread\nbefore other threads are started. In versions of Python prior to\n2.7.1 and 3.2, if this function is called from multiple threads,\nit is possible (in rare circumstances) that a handler will be added\nto the root logger more than once, leading to unexpected results\nsuch as messages being duplicated in the log.
\nThe following keyword arguments are supported.
| Format | Description |
|---|---|
| filename | Specifies that a FileHandler be created, using the specified filename, rather than a StreamHandler. |
| filemode | Specifies the mode to open the file, if filename is specified (if filemode is unspecified, it defaults to 'a'). |
| format | Use the specified format string for the handler. |
| datefmt | Use the specified date/time format. |
| level | Set the root logger level to the specified level. |
| stream | Use the specified stream to initialize the StreamHandler. Note that this argument is incompatible with 'filename' - if both are present, 'stream' is ignored. |
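A sketch combining several of these keyword arguments (the filename is illustrative):

```python
import logging

# Log DEBUG and above to a file instead of the console.
logging.basicConfig(
    filename='app.log',
    filemode='w',    # truncate on each run; the default is 'a' (append)
    format='%(asctime)s %(levelname)s %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
    level=logging.DEBUG)

logging.debug('this message goes to app.log')
```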
The captureWarnings() function can be used to integrate logging\nwith the warnings module.
\nThis function is used to turn the capture of warnings by logging on and\noff.
\nIf capture is True, warnings issued by the warnings module will\nbe redirected to the logging system. Specifically, a warning will be\nformatted using warnings.formatwarning() and the resulting string\nlogged to a logger named ‘py.warnings’ with a severity of WARNING.
\nIf capture is False, the redirection of warnings to the logging system\nwill stop, and warnings will be redirected to their original destinations\n(i.e. those in effect before captureWarnings(True) was called).
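A minimal sketch of turning capture on and off around a warning:

```python
import logging
import warnings

logging.basicConfig(format='%(name)s %(levelname)s %(message)s')
logging.captureWarnings(True)            # route warnings to logging
warnings.warn('this API is deprecated')  # handled by the 'py.warnings' logger
logging.captureWarnings(False)           # restore normal warning delivery
```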
\nSee also
\nNote
The dummy_thread module has been renamed to _dummy_thread in Python 3.0. The 2to3 tool will automatically adapt imports when converting your sources to 3.0; however, you should consider using the high-level dummy_threading module instead.
\nSource code: Lib/dummy_thread.py
\nThis module provides a duplicate interface to the thread module. It is\nmeant to be imported when the thread module is not provided on a\nplatform.
\nSuggested usage is:
\ntry:\n import thread as _thread\nexcept ImportError:\n import dummy_thread as _thread\n
Be careful to not use this module where deadlock might occur from a thread\nbeing created that blocks waiting for another thread to be created. This often\noccurs with blocking I/O.
\n\nNew in version 2.5.
\nctypes is a foreign function library for Python. It provides C compatible\ndata types, and allows calling functions in DLLs or shared libraries. It can be\nused to wrap these libraries in pure Python.
\nNote: The code samples in this tutorial use doctest to make sure that\nthey actually work. Since some code samples behave differently under Linux,\nWindows, or Mac OS X, they contain doctest directives in comments.
Note: Some code samples reference the ctypes c_int type. This type is an alias for the c_long type on 32-bit systems. So, you should not be confused if c_long is printed where you would expect c_int: they are actually the same type.
\nctypes exports the cdll, and on Windows windll and oledll\nobjects, for loading dynamic link libraries.
\nYou load libraries by accessing them as attributes of these objects. cdll\nloads libraries which export functions using the standard cdecl calling\nconvention, while windll libraries call functions using the stdcall\ncalling convention. oledll also uses the stdcall calling convention, and\nassumes the functions return a Windows HRESULT error code. The error\ncode is used to automatically raise a WindowsError exception when the\nfunction call fails.
\nHere are some examples for Windows. Note that msvcrt is the MS standard C\nlibrary containing most standard C functions, and uses the cdecl calling\nconvention:
\n>>> from ctypes import *\n>>> print windll.kernel32 # doctest: +WINDOWS\n<WinDLL 'kernel32', handle ... at ...>\n>>> print cdll.msvcrt # doctest: +WINDOWS\n<CDLL 'msvcrt', handle ... at ...>\n>>> libc = cdll.msvcrt # doctest: +WINDOWS\n>>>\n
Windows appends the usual .dll file suffix automatically.
On Linux, it is required to specify the filename including the extension to load a library, so attribute access cannot be used to load libraries. Either the LoadLibrary() method of the dll loaders should be used, or you should load the library by creating an instance of CDLL by calling the constructor:
\n>>> cdll.LoadLibrary("libc.so.6") # doctest: +LINUX\n<CDLL 'libc.so.6', handle ... at ...>\n>>> libc = CDLL("libc.so.6") # doctest: +LINUX\n>>> libc # doctest: +LINUX\n<CDLL 'libc.so.6', handle ... at ...>\n>>>\n
Functions are accessed as attributes of dll objects:
\n>>> from ctypes import *\n>>> libc.printf\n<_FuncPtr object at 0x...>\n>>> print windll.kernel32.GetModuleHandleA # doctest: +WINDOWS\n<_FuncPtr object at 0x...>\n>>> print windll.kernel32.MyOwnFunction # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\n File "ctypes.py", line 239, in __getattr__\n func = _StdcallFuncPtr(name, self)\nAttributeError: function 'MyOwnFunction' not found\n>>>\n
Note that win32 system dlls like kernel32 and user32 often export ANSI as well as UNICODE versions of a function. The UNICODE version is exported with a W appended to the name, while the ANSI version is exported with an A appended to the name. The win32 GetModuleHandle function, which returns a module handle for a given module name, has the following C prototype, and a macro is used to expose one of them as GetModuleHandle depending on whether UNICODE is defined or not:
\n/* ANSI version */\nHMODULE GetModuleHandleA(LPCSTR lpModuleName);\n/* UNICODE version */\nHMODULE GetModuleHandleW(LPCWSTR lpModuleName);
windll does not try to select one of them by magic; you must access the version you need by specifying GetModuleHandleA or GetModuleHandleW explicitly, and then call it with strings or unicode strings respectively.
\nSometimes, dlls export functions with names which aren’t valid Python\nidentifiers, like "??2@YAPAXI@Z". In this case you have to use\ngetattr() to retrieve the function:
\n>>> getattr(cdll.msvcrt, "??2@YAPAXI@Z") # doctest: +WINDOWS\n<_FuncPtr object at 0x...>\n>>>\n
On Windows, some dlls export functions not by name but by ordinal. These\nfunctions can be accessed by indexing the dll object with the ordinal number:
\n>>> cdll.kernel32[1] # doctest: +WINDOWS\n<_FuncPtr object at 0x...>\n>>> cdll.kernel32[0] # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\n File "ctypes.py", line 310, in __getitem__\n func = _StdcallFuncPtr(name, self)\nAttributeError: function ordinal 0 not found\n>>>\n
You can call these functions like any other Python callable. This example uses\nthe time() function, which returns system time in seconds since the Unix\nepoch, and the GetModuleHandleA() function, which returns a win32 module\nhandle.
\nThis example calls both functions with a NULL pointer (None should be used\nas the NULL pointer):
\n>>> print libc.time(None) # doctest: +SKIP\n1150640792\n>>> print hex(windll.kernel32.GetModuleHandleA(None)) # doctest: +WINDOWS\n0x1d000000\n>>>\n
ctypes tries to protect you from calling functions with the wrong number\nof arguments or the wrong calling convention. Unfortunately this only works on\nWindows. It does this by examining the stack after the function returns, so\nalthough an error is raised the function has been called:
\n>>> windll.kernel32.GetModuleHandleA() # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: Procedure probably called with not enough arguments (4 bytes missing)\n>>> windll.kernel32.GetModuleHandleA(0, 0) # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: Procedure probably called with too many arguments (4 bytes in excess)\n>>>\n
The same exception is raised when you call an stdcall function with the\ncdecl calling convention, or vice versa:
\n>>> cdll.kernel32.GetModuleHandleA(None) # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: Procedure probably called with not enough arguments (4 bytes missing)\n>>>\n\n>>> windll.msvcrt.printf("spam") # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: Procedure probably called with too many arguments (4 bytes in excess)\n>>>\n
To find out the correct calling convention you have to look into the C header\nfile or the documentation for the function you want to call.
\nOn Windows, ctypes uses win32 structured exception handling to prevent\ncrashes from general protection faults when functions are called with invalid\nargument values:
\n>>> windll.kernel32.GetModuleHandleA(32) # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nWindowsError: exception: access violation reading 0x00000020\n>>>\n
There are, however, enough ways to crash Python with ctypes, so you\nshould be careful anyway.
None, integers, longs, byte strings and unicode strings are the only native Python objects that can directly be used as parameters in these function calls. None is passed as a C NULL pointer; byte strings and unicode strings are passed as a pointer to the memory block that contains their data (char * or wchar_t *). Python integers and Python longs are passed as the platform's default C int type; their value is masked to fit into the C type.
\nBefore we move on calling functions with other parameter types, we have to learn\nmore about ctypes data types.
ctypes defines a number of primitive C compatible data types:
| ctypes type | C type | Python type |
|---|---|---|
| c_bool | _Bool | bool (1) |
| c_char | char | 1-character string |
| c_wchar | wchar_t | 1-character unicode string |
| c_byte | char | int/long |
| c_ubyte | unsigned char | int/long |
| c_short | short | int/long |
| c_ushort | unsigned short | int/long |
| c_int | int | int/long |
| c_uint | unsigned int | int/long |
| c_long | long | int/long |
| c_ulong | unsigned long | int/long |
| c_longlong | __int64 or long long | int/long |
| c_ulonglong | unsigned __int64 or unsigned long long | int/long |
| c_float | float | float |
| c_double | double | float |
| c_longdouble | long double | float |
| c_char_p | char * (NUL terminated) | string or None |
| c_wchar_p | wchar_t * (NUL terminated) | unicode or None |
| c_void_p | void * | int/long or None |
All these types can be created by calling them with an optional initializer of\nthe correct type and value:
\n>>> c_int()\nc_long(0)\n>>> c_char_p("Hello, World")\nc_char_p('Hello, World')\n>>> c_ushort(-3)\nc_ushort(65533)\n>>>\n
Since these types are mutable, their value can also be changed afterwards:
\n>>> i = c_int(42)\n>>> print i\nc_long(42)\n>>> print i.value\n42\n>>> i.value = -99\n>>> print i.value\n-99\n>>>\n
Assigning a new value to instances of the pointer types c_char_p,\nc_wchar_p, and c_void_p changes the memory location they\npoint to, not the contents of the memory block (of course not, because Python\nstrings are immutable):
\n>>> s = "Hello, World"\n>>> c_s = c_char_p(s)\n>>> print c_s\nc_char_p('Hello, World')\n>>> c_s.value = "Hi, there"\n>>> print c_s\nc_char_p('Hi, there')\n>>> print s # first string is unchanged\nHello, World\n>>>\n
You should be careful, however, not to pass them to functions expecting pointers\nto mutable memory. If you need mutable memory blocks, ctypes has a\ncreate_string_buffer() function which creates these in various ways. The\ncurrent memory block contents can be accessed (or changed) with the raw\nproperty; if you want to access it as NUL terminated string, use the value\nproperty:
\n>>> from ctypes import *\n>>> p = create_string_buffer(3) # create a 3 byte buffer, initialized to NUL bytes\n>>> print sizeof(p), repr(p.raw)\n3 '\\x00\\x00\\x00'\n>>> p = create_string_buffer("Hello") # create a buffer containing a NUL terminated string\n>>> print sizeof(p), repr(p.raw)\n6 'Hello\\x00'\n>>> print repr(p.value)\n'Hello'\n>>> p = create_string_buffer("Hello", 10) # create a 10 byte buffer\n>>> print sizeof(p), repr(p.raw)\n10 'Hello\\x00\\x00\\x00\\x00\\x00'\n>>> p.value = "Hi"\n>>> print sizeof(p), repr(p.raw)\n10 'Hi\\x00lo\\x00\\x00\\x00\\x00\\x00'\n>>>\n
The create_string_buffer() function replaces the c_buffer() function\n(which is still available as an alias), as well as the c_string() function\nfrom earlier ctypes releases. To create a mutable memory block containing\nunicode characters of the C type wchar_t use the\ncreate_unicode_buffer() function.
\nNote that printf prints to the real standard output channel, not to\nsys.stdout, so these examples will only work at the console prompt, not\nfrom within IDLE or PythonWin:
\n>>> printf = libc.printf\n>>> printf("Hello, %s\\n", "World!")\nHello, World!\n14\n>>> printf("Hello, %S\\n", u"World!")\nHello, World!\n14\n>>> printf("%d bottles of beer\\n", 42)\n42 bottles of beer\n19\n>>> printf("%f bottles of beer\\n", 42.5)\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nArgumentError: argument 2: exceptions.TypeError: Don't know how to convert parameter 2\n>>>\n
As has been mentioned before, all Python types except integers, strings, and\nunicode strings have to be wrapped in their corresponding ctypes type, so\nthat they can be converted to the required C data type:
\n>>> printf("An int %d, a double %f\\n", 1234, c_double(3.14))\nAn int 1234, a double 3.140000\n31\n>>>\n
You can also customize ctypes argument conversion to allow instances of\nyour own classes be used as function arguments. ctypes looks for an\n_as_parameter_ attribute and uses this as the function argument. Of\ncourse, it must be one of integer, string, or unicode:
\n>>> class Bottles(object):\n... def __init__(self, number):\n... self._as_parameter_ = number\n...\n>>> bottles = Bottles(42)\n>>> printf("%d bottles of beer\\n", bottles)\n42 bottles of beer\n19\n>>>\n
If you don’t want to store the instance’s data in the _as_parameter_\ninstance variable, you could define a property() which makes the data\navailable.
\nIt is possible to specify the required argument types of functions exported from\nDLLs by setting the argtypes attribute.
argtypes must be a sequence of C data types (the printf function is probably not a good example here, because it takes a variable number of parameters of different types depending on the format string; on the other hand it is quite handy for experimenting with this feature):
\n>>> printf.argtypes = [c_char_p, c_char_p, c_int, c_double]\n>>> printf("String '%s', Int %d, Double %f\\n", "Hi", 10, 2.2)\nString 'Hi', Int 10, Double 2.200000\n37\n>>>\n
Specifying a format protects against incompatible argument types (just as a\nprototype for a C function), and tries to convert the arguments to valid types:
\n>>> printf("%d %d %d", 1, 2, 3)\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nArgumentError: argument 2: exceptions.TypeError: wrong type\n>>> printf("%s %d %f\\n", "X", 2, 3)\nX 2 3.000000\n13\n>>>\n
If you have defined your own classes which you pass to function calls, you have\nto implement a from_param() class method for them to be able to use them\nin the argtypes sequence. The from_param() class method receives\nthe Python object passed to the function call, it should do a typecheck or\nwhatever is needed to make sure this object is acceptable, and then return the\nobject itself, its _as_parameter_ attribute, or whatever you want to\npass as the C function argument in this case. Again, the result should be an\ninteger, string, unicode, a ctypes instance, or an object with an\n_as_parameter_ attribute.
\nBy default functions are assumed to return the C int type. Other\nreturn types can be specified by setting the restype attribute of the\nfunction object.
\nHere is a more advanced example, it uses the strchr function, which expects\na string pointer and a char, and returns a pointer to a string:
\n>>> strchr = libc.strchr\n>>> strchr("abcdef", ord("d")) # doctest: +SKIP\n8059983\n>>> strchr.restype = c_char_p # c_char_p is a pointer to a string\n>>> strchr("abcdef", ord("d"))\n'def'\n>>> print strchr("abcdef", ord("x"))\nNone\n>>>\n
If you want to avoid the ord("x") calls above, you can set the\nargtypes attribute, and the second argument will be converted from a\nsingle character Python string into a C char:
\n>>> strchr.restype = c_char_p\n>>> strchr.argtypes = [c_char_p, c_char]\n>>> strchr("abcdef", "d")\n'def'\n>>> strchr("abcdef", "def")\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nArgumentError: argument 2: exceptions.TypeError: one character string expected\n>>> print strchr("abcdef", "x")\nNone\n>>> strchr("abcdef", "d")\n'def'\n>>>\n
You can also use a callable Python object (a function or a class for example) as\nthe restype attribute, if the foreign function returns an integer. The\ncallable will be called with the integer the C function returns, and the\nresult of this call will be used as the result of your function call. This is\nuseful to check for error return values and automatically raise an exception:
\n>>> GetModuleHandle = windll.kernel32.GetModuleHandleA # doctest: +WINDOWS\n>>> def ValidHandle(value):\n... if value == 0:\n... raise WinError()\n... return value\n...\n>>>\n>>> GetModuleHandle.restype = ValidHandle # doctest: +WINDOWS\n>>> GetModuleHandle(None) # doctest: +WINDOWS\n486539264\n>>> GetModuleHandle("something silly") # doctest: +WINDOWS\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\n File "<stdin>", line 3, in ValidHandle\nWindowsError: [Errno 126] The specified module could not be found.\n>>>\n
WinError is a function which will call Windows FormatMessage() api to\nget the string representation of an error code, and returns an exception.\nWinError takes an optional error code parameter, if no one is used, it calls\nGetLastError() to retrieve it.
\nPlease note that a much more powerful error checking mechanism is available\nthrough the errcheck attribute; see the reference manual for details.
Sometimes a C API function expects a pointer to a data type as a parameter, probably to write into the corresponding location, or if the data is too large to be passed by value. This is also known as passing parameters by reference.
\nctypes exports the byref() function which is used to pass\nparameters by reference. The same effect can be achieved with the\npointer() function, although pointer() does a lot more work since it\nconstructs a real pointer object, so it is faster to use byref() if you\ndon’t need the pointer object in Python itself:
\n>>> i = c_int()\n>>> f = c_float()\n>>> s = create_string_buffer('\\000' * 32)\n>>> print i.value, f.value, repr(s.value)\n0 0.0 ''\n>>> libc.sscanf("1 3.14 Hello", "%d %f %s",\n... byref(i), byref(f), s)\n3\n>>> print i.value, f.value, repr(s.value)\n1 3.1400001049 'Hello'\n>>>\n
Structures and unions must derive from the Structure and Union\nbase classes which are defined in the ctypes module. Each subclass must\ndefine a _fields_ attribute. _fields_ must be a list of\n2-tuples, containing a field name and a field type.
\nThe field type must be a ctypes type like c_int, or any other\nderived ctypes type: structure, union, array, pointer.
\nHere is a simple example of a POINT structure, which contains two integers named\nx and y, and also shows how to initialize a structure in the constructor:
\n>>> from ctypes import *\n>>> class POINT(Structure):\n... _fields_ = [("x", c_int),\n... ("y", c_int)]\n...\n>>> point = POINT(10, 20)\n>>> print point.x, point.y\n10 20\n>>> point = POINT(y=5)\n>>> print point.x, point.y\n0 5\n>>> POINT(1, 2, 3)\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: too many initializers\n>>>\n
You can, however, build much more complicated structures. A structure can itself contain other structures by using a structure as a field type.
\nHere is a RECT structure which contains two POINTs named upperleft and\nlowerright:
\n>>> class RECT(Structure):\n... _fields_ = [("upperleft", POINT),\n... ("lowerright", POINT)]\n...\n>>> rc = RECT(point)\n>>> print rc.upperleft.x, rc.upperleft.y\n0 5\n>>> print rc.lowerright.x, rc.lowerright.y\n0 0\n>>>\n
Nested structures can also be initialized in the constructor in several ways:
\n>>> r = RECT(POINT(1, 2), POINT(3, 4))\n>>> r = RECT((1, 2), (3, 4))\n
Field descriptors can be retrieved from the class; they are useful for debugging because they show the type, offset, and size of each field:
\n>>> print POINT.x\n<Field type=c_long, ofs=0, size=4>\n>>> print POINT.y\n<Field type=c_long, ofs=4, size=4>\n>>>\n
By default, Structure and Union fields are aligned in the same way the C compiler does it. It is possible to override this behavior by specifying a _pack_ class attribute in the subclass definition. This must be set to a positive integer and specifies the maximum alignment for the fields. This is what #pragma pack(n) also does in MSVC.
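As a small sketch of the effect (the field names here are invented for illustration), compare the size of a structure with default alignment against the same structure with _pack_ = 1; on most common platforms the default layout inserts pad bytes before the c_int:

```python
from ctypes import Structure, c_char, c_int, sizeof

class Unpacked(Structure):
    # default alignment: the c_int is aligned, usually adding pad bytes
    _fields_ = [("tag", c_char), ("value", c_int)]

class Packed(Structure):
    _pack_ = 1            # like #pragma pack(1): no padding at all
    _fields_ = [("tag", c_char), ("value", c_int)]

print(sizeof(Packed))     # 5 = 1 + 4, no pad bytes
print(sizeof(Unpacked))   # typically 8 on common platforms
```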
\nctypes uses the native byte order for Structures and Unions. To build\nstructures with non-native byte order, you can use one of the\nBigEndianStructure, LittleEndianStructure,\nBigEndianUnion, and LittleEndianUnion base classes. These\nclasses cannot contain pointer fields.
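A quick sketch of what the byte-order base classes do: a BigEndianStructure stores the most significant byte first, regardless of the machine's native order, so the raw bytes are predictable on any platform:

```python
from ctypes import BigEndianStructure, c_uint32, addressof, sizeof, string_at

class BE(BigEndianStructure):
    _fields_ = [("n", c_uint32)]

be = BE(1)
# read the raw bytes of the structure's memory block
raw = string_at(addressof(be), sizeof(be))
print(repr(raw))   # b'\x00\x00\x00\x01': most significant byte first
```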
It is possible to create structures and unions containing bit fields. Bit fields are only possible for integer fields; the bit width is specified as the third item in the _fields_ tuples:
\n>>> class Int(Structure):\n... _fields_ = [("first_16", c_int, 16),\n... ("second_16", c_int, 16)]\n...\n>>> print Int.first_16\n<Field type=c_long, ofs=0:0, bits=16>\n>>> print Int.second_16\n<Field type=c_long, ofs=0:16, bits=16>\n>>>\n
Arrays are sequences, containing a fixed number of instances of the same type.
\nThe recommended way to create array types is by multiplying a data type with a\npositive integer:
\nTenPointsArrayType = POINT * 10\n
Here is an example of a somewhat artificial data type, a structure containing 4 POINTs among other things:
\n>>> from ctypes import *\n>>> class POINT(Structure):\n... _fields_ = ("x", c_int), ("y", c_int)\n...\n>>> class MyStruct(Structure):\n... _fields_ = [("a", c_int),\n... ("b", c_float),\n... ("point_array", POINT * 4)]\n>>>\n>>> print len(MyStruct().point_array)\n4\n>>>\n
Instances are created in the usual way, by calling the class:
\narr = TenPointsArrayType()\nfor pt in arr:\n print pt.x, pt.y\n
The above code prints a series of 0 0 lines, because the array contents are initialized to zeros.
\nInitializers of the correct type can also be specified:
\n>>> from ctypes import *\n>>> TenIntegers = c_int * 10\n>>> ii = TenIntegers(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)\n>>> print ii\n<c_long_Array_10 object at 0x...>\n>>> for i in ii: print i,\n...\n1 2 3 4 5 6 7 8 9 10\n>>>\n
Pointer instances are created by calling the pointer() function on a\nctypes type:
\n>>> from ctypes import *\n>>> i = c_int(42)\n>>> pi = pointer(i)\n>>>\n
Pointer instances have a contents attribute which returns the object to\nwhich the pointer points, the i object above:
\n>>> pi.contents\nc_long(42)\n>>>\n
Note that ctypes does not have OOR (original object return); it constructs a new, equivalent object each time you retrieve an attribute:
\n>>> pi.contents is i\nFalse\n>>> pi.contents is pi.contents\nFalse\n>>>\n
Assigning another c_int instance to the pointer's contents attribute causes the pointer to point to the memory location where that instance is stored:
\n>>> i = c_int(99)\n>>> pi.contents = i\n>>> pi.contents\nc_long(99)\n>>>\n
Pointer instances can also be indexed with integers:
\n>>> pi[0]\n99\n>>>\n
Assigning to an integer index changes the pointed-to value:
\n>>> print i\nc_long(99)\n>>> pi[0] = 22\n>>> print i\nc_long(22)\n>>>\n
It is also possible to use indexes different from 0, but you must know what\nyou’re doing, just as in C: You can access or change arbitrary memory locations.\nGenerally you only use this feature if you receive a pointer from a C function,\nand you know that the pointer actually points to an array instead of a single\nitem.
Behind the scenes, the pointer() function does more than simply create pointer instances; it has to create pointer types first. This is done with the POINTER() function, which accepts any ctypes type, and returns a new type:
\n>>> PI = POINTER(c_int)\n>>> PI\n<class 'ctypes.LP_c_long'>\n>>> PI(42)\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nTypeError: expected c_long instead of int\n>>> PI(c_int(42))\n<ctypes.LP_c_long object at 0x...>\n>>>\n
Calling the pointer type without an argument creates a NULL pointer.\nNULL pointers have a False boolean value:
\n>>> null_ptr = POINTER(c_int)()\n>>> print bool(null_ptr)\nFalse\n>>>\n
ctypes checks for NULL when dereferencing pointers (but dereferencing\ninvalid non-NULL pointers would crash Python):
\n>>> null_ptr[0]\nTraceback (most recent call last):\n ....\nValueError: NULL pointer access\n>>>\n\n>>> null_ptr[0] = 1234\nTraceback (most recent call last):\n ....\nValueError: NULL pointer access\n>>>\n
Usually, ctypes does strict type checking. This means that if you have POINTER(c_int) in the argtypes list of a function or as the type of a member field in a structure definition, only instances of exactly the same type are accepted. There are some exceptions to this rule, where ctypes accepts other objects. For example, you can pass compatible array instances instead of pointer types. So, for POINTER(c_int), ctypes accepts an array of c_int:
\n>>> class Bar(Structure):\n... _fields_ = [("count", c_int), ("values", POINTER(c_int))]\n...\n>>> bar = Bar()\n>>> bar.values = (c_int * 3)(1, 2, 3)\n>>> bar.count = 3\n>>> for i in range(bar.count):\n... print bar.values[i]\n...\n1\n2\n3\n>>>\n
To set a POINTER type field to NULL, you can assign None:
\n>>> bar.values = None\n>>>\n
Sometimes you have instances of incompatible types. In C, you can cast one type\ninto another type. ctypes provides a cast() function which can be\nused in the same way. The Bar structure defined above accepts\nPOINTER(c_int) pointers or c_int arrays for its values field,\nbut not instances of other types:
\n>>> bar.values = (c_byte * 4)()\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nTypeError: incompatible types, c_byte_Array_4 instance instead of LP_c_long instance\n>>>\n
For these cases, the cast() function is handy.
\nThe cast() function can be used to cast a ctypes instance into a pointer\nto a different ctypes data type. cast() takes two parameters, a ctypes\nobject that is or can be converted to a pointer of some kind, and a ctypes\npointer type. It returns an instance of the second argument, which references\nthe same memory block as the first argument:
\n>>> a = (c_byte * 4)()\n>>> cast(a, POINTER(c_int))\n<ctypes.LP_c_long object at ...>\n>>>\n
So, cast() can be used to assign to the values field of the Bar structure:
\n>>> bar = Bar()\n>>> bar.values = cast((c_byte * 4)(), POINTER(c_int))\n>>> print bar.values[0]\n0\n>>>\n
Incomplete Types are structures, unions or arrays whose members are not yet\nspecified. In C, they are specified by forward declarations, which are defined\nlater:
\nstruct cell; /* forward declaration */\n\nstruct cell {\n char *name;\n struct cell *next;\n};
\nThe straightforward translation into ctypes code would be this, but it does not\nwork:
\n>>> class cell(Structure):\n... _fields_ = [("name", c_char_p),\n... ("next", POINTER(cell))]\n...\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\n File "<stdin>", line 2, in cell\nNameError: name 'cell' is not defined\n>>>\n
because the new class cell is not available in the class statement itself.\nIn ctypes, we can define the cell class and set the _fields_\nattribute later, after the class statement:
\n>>> from ctypes import *\n>>> class cell(Structure):\n... pass\n...\n>>> cell._fields_ = [("name", c_char_p),\n... ("next", POINTER(cell))]\n>>>\n
Let's try it. We create two instances of cell, let them point to each other, and finally follow the pointer chain a few times:
\n>>> c1 = cell()\n>>> c1.name = "foo"\n>>> c2 = cell()\n>>> c2.name = "bar"\n>>> c1.next = pointer(c2)\n>>> c2.next = pointer(c1)\n>>> p = c1\n>>> for i in range(8):\n... print p.name,\n... p = p.next[0]\n...\nfoo bar foo bar foo bar foo bar\n>>>\n
ctypes makes it possible to create C callable function pointers from Python callables. These are sometimes called callback functions.
First, you must create a class for the callback function. The class knows the calling convention, the return type, and the number and types of arguments this function will receive.
\nThe CFUNCTYPE factory function creates types for callback functions using the\nnormal cdecl calling convention, and, on Windows, the WINFUNCTYPE factory\nfunction creates types for callback functions using the stdcall calling\nconvention.
\nBoth of these factory functions are called with the result type as first\nargument, and the callback functions expected argument types as the remaining\narguments.
Here is an example that uses the standard C library's qsort() function, which sorts items with the help of a callback function. qsort() will be used to sort an array of integers:
\n>>> IntArray5 = c_int * 5\n>>> ia = IntArray5(5, 1, 7, 33, 99)\n>>> qsort = libc.qsort\n>>> qsort.restype = None\n>>>\n
qsort() must be called with a pointer to the data to sort, the number of items in the data array, the size of one item, and a pointer to the comparison function, the callback. The callback will then be called with two pointers to items, and it must return a negative integer if the first item is smaller than the second, zero if they are equal, and a positive integer otherwise.
\nSo our callback function receives pointers to integers, and must return an\ninteger. First we create the type for the callback function:
\n>>> CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))\n>>>\n
For the first implementation of the callback function, we simply print the\narguments we get, and return 0 (incremental development ;-):
\n>>> def py_cmp_func(a, b):\n... print "py_cmp_func", a, b\n... return 0\n...\n>>>\n
Create the C callable callback:
\n>>> cmp_func = CMPFUNC(py_cmp_func)\n>>>\n
And we’re ready to go:
\n>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\npy_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>\n>>>\n
We know how to access the contents of a pointer, so let's redefine our callback:
\n>>> def py_cmp_func(a, b):\n... print "py_cmp_func", a[0], b[0]\n... return 0\n...\n>>> cmp_func = CMPFUNC(py_cmp_func)\n>>>\n
Here is what we get on Windows:
\n>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS\npy_cmp_func 7 1\npy_cmp_func 33 1\npy_cmp_func 99 1\npy_cmp_func 5 1\npy_cmp_func 7 5\npy_cmp_func 33 5\npy_cmp_func 99 5\npy_cmp_func 7 99\npy_cmp_func 33 99\npy_cmp_func 7 33\n>>>\n
Interestingly, on Linux the sort function seems to work more efficiently; it does fewer comparisons:
\n>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +LINUX\npy_cmp_func 5 1\npy_cmp_func 33 99\npy_cmp_func 7 33\npy_cmp_func 5 7\npy_cmp_func 1 7\n>>>\n
Ah, we’re nearly done! The last step is to actually compare the two items and\nreturn a useful result:
\n>>> def py_cmp_func(a, b):\n... print "py_cmp_func", a[0], b[0]\n... return a[0] - b[0]\n...\n>>>\n
Final run on Windows:
\n>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +WINDOWS\npy_cmp_func 33 7\npy_cmp_func 99 33\npy_cmp_func 5 99\npy_cmp_func 1 99\npy_cmp_func 33 7\npy_cmp_func 1 33\npy_cmp_func 5 33\npy_cmp_func 5 7\npy_cmp_func 1 7\npy_cmp_func 5 1\n>>>\n
and on Linux:
\n>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +LINUX\npy_cmp_func 5 1\npy_cmp_func 33 99\npy_cmp_func 7 33\npy_cmp_func 1 7\npy_cmp_func 5 7\n>>>\n
It is quite interesting to see that the Windows qsort() function needs more comparisons than the Linux version!
\nAs we can easily check, our array is sorted now:
\n>>> for i in ia: print i,\n...\n1 5 7 33 99\n>>>\n
Important note for callback functions:
\nMake sure you keep references to CFUNCTYPE objects as long as they are used from\nC code. ctypes doesn’t, and if you don’t, they may be garbage collected,\ncrashing your program when a callback is made.
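Note that a CFUNCTYPE instance can also be called directly from Python, which is handy for testing a callback before handing it to C code; a minimal sketch:

```python
from ctypes import CFUNCTYPE, c_int

# type of a callback taking two ints and returning an int
BINOP = CFUNCTYPE(c_int, c_int, c_int)

def py_add(a, b):
    return a + b

add = BINOP(py_add)   # a C callable function pointer
print(add(2, 3))      # 5: callable from Python as well
```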
\nSome shared libraries not only export functions, they also export variables. An\nexample in the Python library itself is the Py_OptimizeFlag, an integer set\nto 0, 1, or 2, depending on the -O or -OO flag given on\nstartup.
ctypes can access values like this with the in_dll() class method of the type. pythonapi is a predefined symbol giving access to the Python C API:
\n>>> opt_flag = c_int.in_dll(pythonapi, "Py_OptimizeFlag")\n>>> print opt_flag\nc_long(0)\n>>>\n
If the interpreter had been started with -O, the sample would have printed c_long(1), or c_long(2) if -OO had been specified.
\nAn extended example which also demonstrates the use of pointers accesses the\nPyImport_FrozenModules pointer exported by Python.
\nQuoting the Python docs: This pointer is initialized to point to an array of\n“struct _frozen” records, terminated by one whose members are all NULL or zero.\nWhen a frozen module is imported, it is searched in this table. Third-party code\ncould play tricks with this to provide a dynamically created collection of\nfrozen modules.
\nSo manipulating this pointer could even prove useful. To restrict the example\nsize, we show only how this table can be read with ctypes:
\n>>> from ctypes import *\n>>>\n>>> class struct_frozen(Structure):\n... _fields_ = [("name", c_char_p),\n... ("code", POINTER(c_ubyte)),\n... ("size", c_int)]\n...\n>>>\n
We have defined the struct _frozen data type, so we can get the pointer to\nthe table:
\n>>> FrozenTable = POINTER(struct_frozen)\n>>> table = FrozenTable.in_dll(pythonapi, "PyImport_FrozenModules")\n>>>\n
Since table is a pointer to the array of struct_frozen records, we can iterate over it, but we have to make sure that our loop terminates, because pointers have no size. Sooner or later it would probably crash with an access violation, so it's better to break out of the loop when we hit the NULL entry:
\n>>> for item in table:\n... print item.name, item.size\n... if item.name is None:\n... break\n...\n__hello__ 104\n__phello__ -104\n__phello__.spam 104\nNone 0\n>>>\n
The fact that standard Python has a frozen module and a frozen package\n(indicated by the negative size member) is not well known, it is only used for\ntesting. Try it out with import __hello__ for example.
There are some edges in ctypes where you may expect something other than what actually happens.
\nConsider the following example:
\n>>> from ctypes import *\n>>> class POINT(Structure):\n... _fields_ = ("x", c_int), ("y", c_int)\n...\n>>> class RECT(Structure):\n... _fields_ = ("a", POINT), ("b", POINT)\n...\n>>> p1 = POINT(1, 2)\n>>> p2 = POINT(3, 4)\n>>> rc = RECT(p1, p2)\n>>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y\n1 2 3 4\n>>> # now swap the two points\n>>> rc.a, rc.b = rc.b, rc.a\n>>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y\n3 4 3 4\n>>>\n
Hm. We certainly expected the last statement to print 3 4 1 2. What\nhappened? Here are the steps of the rc.a, rc.b = rc.b, rc.a line above:
\n>>> temp0, temp1 = rc.b, rc.a\n>>> rc.a = temp0\n>>> rc.b = temp1\n>>>\n
Note that temp0 and temp1 are objects still using the internal buffer of the rc object above. So executing rc.a = temp0 copies the buffer contents of temp0 into rc's buffer. This, in turn, changes the contents of temp1. So the last assignment, rc.b = temp1, doesn't have the expected effect.
\nKeep in mind that retrieving sub-objects from Structure, Unions, and Arrays\ndoesn’t copy the sub-object, instead it retrieves a wrapper object accessing\nthe root-object’s underlying buffer.
Another example that may behave differently from what one would expect is this:
\n>>> s = c_char_p()\n>>> s.value = "abc def ghi"\n>>> s.value\n'abc def ghi'\n>>> s.value is s.value\nFalse\n>>>\n
Why is it printing False? ctypes instances are objects containing a memory block plus some descriptors accessing the contents of the memory. Storing a Python object in the memory block does not store the object itself; instead, the contents of the object are stored. Accessing the contents again constructs a new Python object each time!
\nctypes provides some support for variable-sized arrays and structures.
The resize() function can be used to resize the memory buffer of an existing ctypes object. The function takes the object as its first argument, and the requested size in bytes as the second argument. The memory block cannot be made smaller than the natural memory block specified by the object's type; a ValueError is raised if this is tried:
\n>>> short_array = (c_short * 4)()\n>>> print sizeof(short_array)\n8\n>>> resize(short_array, 4)\nTraceback (most recent call last):\n ...\nValueError: minimum size is 8\n>>> resize(short_array, 32)\n>>> sizeof(short_array)\n32\n>>> sizeof(type(short_array))\n8\n>>>\n
This is nice and fine, but how would one access the additional elements\ncontained in this array? Since the type still only knows about 4 elements, we\nget errors accessing other elements:
\n>>> short_array[:]\n[0, 0, 0, 0]\n>>> short_array[7]\nTraceback (most recent call last):\n ...\nIndexError: invalid index\n>>>\n
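One way to reach the additional elements is to reinterpret the resized buffer as a larger array type with cast(); a sketch:

```python
from ctypes import POINTER, c_short, cast, resize, sizeof

short_array = (c_short * 4)()
resize(short_array, 32)                       # grow the buffer to 32 bytes
n = sizeof(short_array) // sizeof(c_short)    # number of shorts that now fit
full = cast(short_array, POINTER(c_short * n))[0]
full[7] = 42
print(full[7])    # 42
print(len(full))  # 16, assuming sizeof(c_short) == 2
```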
Another way to use variable-sized data types with ctypes is to use the\ndynamic nature of Python, and (re-)define the data type after the required size\nis already known, on a case by case basis.
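A sketch of this second approach, using an invented Packet type whose trailing array is sized only once the element count is known:

```python
from ctypes import Structure, c_int

def make_packet_type(n):
    # (re-)define the structure once the required size is known
    class Packet(Structure):
        _fields_ = [("count", c_int), ("data", c_int * n)]
    return Packet

Packet5 = make_packet_type(5)
p = Packet5(5, (c_int * 5)(1, 2, 3, 4, 5))
print(p.data[4])   # 5
```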
\nAs explained in the previous section, foreign functions can be accessed as\nattributes of loaded shared libraries. The function objects created in this way\nby default accept any number of arguments, accept any ctypes data instances as\narguments, and return the default result type specified by the library loader.\nThey are instances of a private class:
\nBase class for C callable foreign functions.
\nInstances of foreign functions are also C compatible data types; they\nrepresent C function pointers.
\nThis behavior can be customized by assigning to special attributes of the\nforeign function object.
\nAssign a ctypes type to specify the result type of the foreign function.\nUse None for void, a function not returning anything.
It is possible to assign a callable Python object that is not a ctypes type; in this case the function is assumed to return a C int, and the callable will be called with this integer, allowing further processing or error checking. Using this is deprecated; for more flexible post processing or error checking, use a ctypes data type as restype and assign a callable to the errcheck attribute.
\nAssign a tuple of ctypes types to specify the argument types that the\nfunction accepts. Functions using the stdcall calling convention can\nonly be called with the same number of arguments as the length of this\ntuple; functions using the C calling convention accept additional,\nunspecified arguments as well.
When a foreign function is called, each actual argument is passed to the from_param() class method of the items in the argtypes tuple; this method allows adapting the actual argument to an object that the foreign function accepts. For example, a c_char_p item in the argtypes tuple will convert a unicode string passed as argument into a byte string using ctypes conversion rules.
New: It is now possible to put items in argtypes which are not ctypes types, but each item must have a from_param() method which returns a value usable as an argument (integer, string, ctypes instance). This makes it possible to define adapters that adapt custom objects as function parameters.
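A sketch of restype and argtypes in use, assuming a POSIX system where CDLL(None) exposes the C library's symbols:

```python
from ctypes import CDLL, c_char_p, c_size_t

libc = CDLL(None)              # POSIX only: handle exposing libc symbols
strlen = libc.strlen
strlen.argtypes = [c_char_p]   # declares: size_t strlen(const char *s)
strlen.restype = c_size_t

print(strlen(b"hello"))        # 5
```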
\nAssign a Python function or another callable to this attribute. The\ncallable will be called with three or more arguments:
\nresult is what the foreign function returns, as specified by the\nrestype attribute.
func is the foreign function object itself; this allows reusing the same callable object to check or post process the results of several functions.
arguments is a tuple containing the parameters originally passed to the function call; this allows specializing the behavior on the arguments used.
\nThe object that this function returns will be returned from the\nforeign function call, but it can also check the result value\nand raise an exception if the foreign function call failed.
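A sketch of errcheck in practice, again assuming a POSIX libc; the hypothetical wrapper turns fopen()'s NULL return into an exception:

```python
import os
from ctypes import CDLL, c_char_p, c_void_p, get_errno

libc = CDLL(None, use_errno=True)   # POSIX only; keep ctypes' errno copy
fopen = libc.fopen
fopen.argtypes = [c_char_p, c_char_p]
fopen.restype = c_void_p

def errcheck(result, func, args):
    if not result:                  # NULL result means the call failed
        errno = get_errno()
        raise OSError(errno, os.strerror(errno))
    return result

fopen.errcheck = errcheck
```

With this in place, fopen(b"/no/such/file", b"r") raises OSError instead of silently returning None.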
\nForeign functions can also be created by instantiating function prototypes.\nFunction prototypes are similar to function prototypes in C; they describe a\nfunction (return type, argument types, calling convention) without defining an\nimplementation. The factory functions must be called with the desired result\ntype and the argument types of the function.
\nThe returned function prototype creates functions that use the standard C\ncalling convention. The function will release the GIL during the call. If\nuse_errno is set to True, the ctypes private copy of the system\nerrno variable is exchanged with the real errno value before\nand after the call; use_last_error does the same for the Windows error\ncode.
\n\nChanged in version 2.6: The optional use_errno and use_last_error parameters were added.
\nFunction prototypes created by these factory functions can be instantiated in\ndifferent ways, depending on the type and number of the parameters in the call:
\n\n\n\n
\n\n- \nprototype(address)
\n- Returns a foreign function at the specified address which must be an integer.
\n
\n\n- \nprototype(callable)
\n- Create a C callable function (a callback function) from a Python callable.
\n
\n\n- \nprototype(func_spec[, paramflags])
- Returns a foreign function exported by a shared library. func_spec must be a 2-tuple (name_or_ordinal, library). The first item is the name of the exported function as a string, or the ordinal of the exported function as a small integer. The second item is the shared library instance.
\n
\n\n- \nprototype(vtbl_index, name[, paramflags[, iid]])
Returns a foreign function that will call a COM method. vtbl_index is the index into the virtual function table, a small non-negative integer. name is the name of the COM method. iid is an optional pointer to the interface identifier which is used in extended error reporting.
\nCOM methods use a special calling convention: They require a pointer to the COM\ninterface as first argument, in addition to those parameters that are specified\nin the argtypes tuple.
\nThe optional paramflags parameter creates foreign function wrappers with much\nmore functionality than the features described above.
\nparamflags must be a tuple of the same length as argtypes.
Each item in this tuple contains further information about a parameter; it must be a tuple containing one, two, or three items.
\nThe first item is an integer containing a combination of direction\nflags for the parameter:
\n\n\n\n
\n- 1
\n- Specifies an input parameter to the function.
\n- 2
\n- Output parameter. The foreign function fills in a value.
\n- 4
\n- Input parameter which defaults to the integer zero.
\nThe optional second item is the parameter name as string. If this is specified,\nthe foreign function can be called with named parameters.
\nThe optional third item is the default value for this parameter.
\n
This example demonstrates how to wrap the Windows MessageBoxA function so\nthat it supports default parameters and named arguments. The C declaration from\nthe windows header file is this:
\nWINUSERAPI int WINAPI\nMessageBoxA(\n HWND hWnd ,\n LPCSTR lpText,\n LPCSTR lpCaption,\n UINT uType);
\nHere is the wrapping with ctypes:
\n>>> from ctypes import c_int, WINFUNCTYPE, windll\n>>> from ctypes.wintypes import HWND, LPCSTR, UINT\n>>> prototype = WINFUNCTYPE(c_int, HWND, LPCSTR, LPCSTR, UINT)\n>>> paramflags = (1, "hwnd", 0), (1, "text", "Hi"), (1, "caption", None), (1, "flags", 0)\n>>> MessageBox = prototype(("MessageBoxA", windll.user32), paramflags)\n>>>\n
The MessageBox foreign function can now be called in these ways:
\n>>> MessageBox()\n>>> MessageBox(text="Spam, spam, spam")\n>>> MessageBox(flags=2, text="foo bar")\n>>>\n
A second example demonstrates output parameters. The win32 GetWindowRect function retrieves the dimensions of a specified window by copying them into a RECT structure that the caller has to supply. Here is the C declaration:
\nWINUSERAPI BOOL WINAPI\nGetWindowRect(\n HWND hWnd,\n LPRECT lpRect);
\nHere is the wrapping with ctypes:
\n>>> from ctypes import POINTER, WINFUNCTYPE, windll, WinError\n>>> from ctypes.wintypes import BOOL, HWND, RECT\n>>> prototype = WINFUNCTYPE(BOOL, HWND, POINTER(RECT))\n>>> paramflags = (1, "hwnd"), (2, "lprect")\n>>> GetWindowRect = prototype(("GetWindowRect", windll.user32), paramflags)\n>>>\n
Functions with output parameters will automatically return the output parameter value if there is a single one, or a tuple containing the output parameter values when there is more than one, so the GetWindowRect function now returns a RECT instance when called.
Output parameters can be combined with the errcheck protocol to do further output processing and error checking. The win32 GetWindowRect API function returns a BOOL to signal success or failure, so this function can do the error checking and raise an exception when the API call fails:
\n>>> def errcheck(result, func, args):\n... if not result:\n... raise WinError()\n... return args\n...\n>>> GetWindowRect.errcheck = errcheck\n>>>\n
If the errcheck function returns the argument tuple it receives\nunchanged, ctypes continues the normal processing it does on the output\nparameters. If you want to return a tuple of window coordinates instead of a\nRECT instance, you can retrieve the fields in the function and return them\ninstead, the normal processing will no longer take place:
\n>>> def errcheck(result, func, args):\n... if not result:\n... raise WinError()\n... rc = args[1]\n... return rc.left, rc.top, rc.bottom, rc.right\n...\n>>> GetWindowRect.errcheck = errcheck\n>>>\n
Returns a light-weight pointer to obj, which must be an instance of a\nctypes type. offset defaults to zero, and must be an integer that will be\nadded to the internal pointer value.
\nbyref(obj, offset) corresponds to this C code:
\n(((char *)&obj) + offset)
The returned object can only be used as a foreign function call parameter. It behaves similarly to pointer(obj), but the construction is a lot faster.
\n\nNew in version 2.6: The offset optional argument was added.
\nThis function creates a mutable character buffer. The returned object is a\nctypes array of c_char.
\ninit_or_size must be an integer which specifies the size of the array, or a\nstring which will be used to initialize the array items.
If a string is specified as the first argument, the buffer is made one item larger than the length of the string so that the last element in the array is a NUL termination character. An integer can be passed as the second argument, which allows specifying the size of the array if the length of the string should not be used.
\nIf the first parameter is a unicode string, it is converted into an 8-bit string\naccording to ctypes conversion rules.
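A short sketch (byte-string literals are shown as Python 3 requires them; in Python 2 a plain string works the same way):

```python
from ctypes import create_string_buffer, sizeof

buf = create_string_buffer(b"Hello")   # copy of the string plus a trailing NUL
print(sizeof(buf))                     # 6: five characters + NUL terminator
print(repr(buf.value))                 # the contents up to the first NUL

buf = create_string_buffer(3)          # 3 zero-initialized bytes
print(sizeof(buf))                     # 3
```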
\nThis function creates a mutable unicode character buffer. The returned object is\na ctypes array of c_wchar.
\ninit_or_size must be an integer which specifies the size of the array, or a\nunicode string which will be used to initialize the array items.
If a unicode string is specified as the first argument, the buffer is made one item larger than the length of the string so that the last element in the array is a NUL termination character. An integer can be passed as the second argument, which allows specifying the size of the array if the length of the string should not be used.
If the first parameter is an 8-bit string, it is converted into a unicode string according to ctypes conversion rules.
\nTry to find a library and return a pathname. name is the library name\nwithout any prefix like lib, suffix like .so, .dylib or version\nnumber (this is the form used for the posix linker option -l). If\nno library can be found, returns None.
\nThe exact functionality is system dependent.
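For example (the exact result depends on the system; on Linux this typically prints something like libc.so.6):

```python
from ctypes.util import find_library

print(find_library("c"))    # e.g. 'libc.so.6' on Linux; None if not found
print(find_library("no-such-library-xyz"))   # None
```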
\n\nChanged in version 2.6: Windows only: find_library("m") or find_library("c") return the\nresult of a call to find_msvcrt().
Windows only: return the filename of the VC runtime library used by Python, and by the extension modules. If the name of the library cannot be determined, None is returned.
If you need to free memory that was, for example, allocated by an extension module with a call to free(void *), it is important that you use the free function in the same library that allocated the memory.
\n\nNew in version 2.6.
\nReturns the current value of the ctypes-private copy of the system\nerrno variable in the calling thread.
\n\nNew in version 2.6.
\nWindows only: returns the current value of the ctypes-private copy of the system\nLastError variable in the calling thread.
\n\nNew in version 2.6.
\nThis function creates a new pointer instance, pointing to obj. The returned\nobject is of the type POINTER(type(obj)).
\nNote: If you just want to pass a pointer to an object to a foreign function\ncall, you should use byref(obj) which is much faster.
\nThis function sets the rules that ctypes objects use when converting between\n8-bit strings and unicode strings. encoding must be a string specifying an\nencoding, like 'utf-8' or 'mbcs', errors must be a string\nspecifying the error handling on encoding/decoding errors. Examples of\npossible values are "strict", "replace", or "ignore".
\nset_conversion_mode() returns a 2-tuple containing the previous\nconversion rules. On windows, the initial conversion rules are ('mbcs',\n'ignore'), on other systems ('ascii', 'strict').
\nSet the current value of the ctypes-private copy of the system errno\nvariable in the calling thread to value and return the previous value.
\n\nNew in version 2.6.
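A quick sketch of the errno accessors together; set_errno() returns the previous value, so it can be restored afterwards:

```python
from ctypes import get_errno, set_errno

old = set_errno(42)   # returns the previous private-copy value
print(get_errno())    # 42
set_errno(old)        # restore the original value
```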
\nWindows only: set the current value of the ctypes-private copy of the system\nLastError variable in the calling thread to value and return the\nprevious value.
\n\nNew in version 2.6.
This non-public class is the common base class of all ctypes data types. Among other things, all ctypes type instances contain a memory block that holds C compatible data; the address of the memory block is returned by the addressof() helper function. Another instance variable is exposed as _objects; this contains other Python objects that need to be kept alive in case the memory block contains pointers.
Common methods of ctypes data types; these are all class methods (to be exact, they are methods of the metaclass):
\nThis method returns a ctypes instance that shares the buffer of the\nsource object. The source object must support the writeable buffer\ninterface. The optional offset parameter specifies an offset into the\nsource buffer in bytes; the default is zero. If the source buffer is not\nlarge enough a ValueError is raised.
\n\nNew in version 2.6.
\nThis method creates a ctypes instance, copying the buffer from the\nsource object buffer which must be readable. The optional offset\nparameter specifies an offset into the source buffer in bytes; the default\nis zero. If the source buffer is not large enough a ValueError is\nraised.
\n\nNew in version 2.6.
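A minimal sketch of from_buffer_copy(), copying a byte string into a character array and showing the ValueError raised for an undersized source buffer:

```python
import ctypes

CharArray = ctypes.c_char * 5
a = CharArray.from_buffer_copy(b"hello")   # copies the buffer's bytes
assert a.raw == b"hello"

# A source buffer smaller than the type raises ValueError.
try:
    CharArray.from_buffer_copy(b"hi")
except ValueError:
    pass
```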
\nThis method adapts obj to a ctypes type. It is called with the actual\nobject used in a foreign function call when the type is present in the\nforeign function’s argtypes tuple; it must return an object that\ncan be used as a function call parameter.
\nAll ctypes data types have a default implementation of this classmethod\nthat normally returns obj if that is an instance of the type. Some\ntypes accept other objects as well.
\nCommon instance variables of ctypes data types:
\nThis non-public class is the base class of all fundamental ctypes data\ntypes. It is mentioned here because it contains the common attributes of the\nfundamental ctypes data types. _SimpleCData is a subclass of\n_CData, so it inherits its methods and attributes.
\n\nChanged in version 2.6: ctypes data types that are not and do not contain pointers can now be\npickled.
\nInstances have a single attribute:
\nThis attribute contains the actual value of the instance. For integer and\npointer types, it is an integer, for character types, it is a single\ncharacter string, for character pointer types it is a Python string or\nunicode string.
\nWhen the value attribute is retrieved from a ctypes instance, usually\na new object is returned each time. ctypes does not implement\noriginal object return; a new object is constructed on each access. The same is\ntrue for all other ctypes object instances.
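For example, assuming the standard fundamental types:

```python
from ctypes import c_char_p, c_int

i = c_int(42)
assert i.value == 42
i.value = -1                 # the value attribute is writable
assert i.value == -1

s = c_char_p(b"hello")
assert s.value == b"hello"   # character pointer types yield a plain string
```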
\nFundamental data types, when returned as foreign function call results, or, for\nexample, by retrieving structure field members or array items, are transparently\nconverted to native Python types. In other words, if a foreign function has a\nrestype of c_char_p, you will always receive a Python string,\nnot a c_char_p instance.
\nSubclasses of fundamental data types do not inherit this behavior. So, if a\nforeign function's restype is a subclass of c_void_p, you will\nreceive an instance of this subclass from the function call. Of course, you can\nget the value of the pointer by accessing the value attribute.
\nThese are the fundamental ctypes data types:
\nRepresents the C long double datatype. The constructor accepts an\noptional float initializer. On platforms where sizeof(long double) ==\nsizeof(double) it is an alias to c_double.
\n\nNew in version 2.6.
\nRepresents the C ssize_t datatype.
\n\nNew in version 2.7.
\nRepresents the C bool datatype (more accurately, _Bool from\nC99). Its value can be True or False, and the constructor accepts any object\nthat has a truth value.
\n\nNew in version 2.6.
\nThe ctypes.wintypes module provides a number of other Windows-specific\ndata types, for example HWND, WPARAM, or DWORD. Some\nuseful structures like MSG or RECT are also defined.
\nStructures with non-native byte order cannot contain pointer type fields, or any\nother data types containing pointer type fields.
\nAbstract base class for structures in native byte order.
\nConcrete structure and union types must be created by subclassing one of these\ntypes, and at least define a _fields_ class variable. ctypes will\ncreate descriptors which allow reading and writing the fields by direct\nattribute accesses.
\nA sequence defining the structure fields. The items must be 2-tuples or\n3-tuples. The first item is the name of the field, the second item\nspecifies the type of the field; it can be any ctypes data type.
\nFor integer type fields like c_int, a third optional item can be\ngiven. It must be a small positive integer defining the bit width of the\nfield.
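An illustrative sketch of bit fields (the Flags structure name and field layout are invented for this example):

```python
from ctypes import Structure, c_int

class Flags(Structure):
    _fields_ = [("a", c_int, 4),      # 4-bit field
                ("b", c_int, 4),      # 4-bit field
                ("rest", c_int, 24)]  # remaining bits

f = Flags()
f.a = 7          # fits in 4 bits
f.b = 3
assert (f.a, f.b, f.rest) == (7, 3, 0)
```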
\nField names must be unique within one structure or union. This is not\nchecked; when names are repeated, only one of the fields can be accessed.
\nIt is possible to define the _fields_ class variable after the\nclass statement that defines the Structure subclass; this makes it possible to\ncreate data types that directly or indirectly reference themselves:
\nclass List(Structure):\n pass\nList._fields_ = [("pnext", POINTER(List)),\n ...\n ]\n
The _fields_ class variable must, however, be defined before the\ntype is first used (an instance is created, sizeof() is called on it,\nand so on). Later assignments to the _fields_ class variable will\nraise an AttributeError.
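Building on that pattern, a hypothetical self-referencing Node type (names invented for the example) can be linked up like this:

```python
from ctypes import POINTER, Structure, c_int, pointer

class Node(Structure):
    pass
# _fields_ is assigned after the class statement so the type can
# reference itself through POINTER(Node).
Node._fields_ = [("value", c_int),
                 ("pnext", POINTER(Node))]

tail = Node(2)
head = Node(1, pointer(tail))
assert head.value == 1
assert head.pnext.contents.value == 2   # follow the pointer to the next node
```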
\nStructure and union subclass constructors accept both positional and named\narguments. Positional arguments are used to initialize the fields in the\nsame order as they appear in the _fields_ definition, named\narguments are used to initialize the fields with the corresponding name.
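For example, with a hypothetical POINT structure:

```python
from ctypes import Structure, c_int

class POINT(Structure):
    _fields_ = [("x", c_int), ("y", c_int)]

# Positional arguments fill the fields in _fields_ order;
# keyword arguments fill the fields with the matching names.
p = POINT(1, 2)
q = POINT(y=5)          # x keeps its default of 0
assert (p.x, p.y) == (1, 2)
assert (q.x, q.y) == (0, 5)
```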
\nIt is possible to define sub-subclasses of structure types; they inherit\nthe fields of the base class plus the _fields_ defined in the\nsub-subclass, if any.
\nAn optional sequence that lists the names of unnamed (anonymous) fields.\n_anonymous_ must already be defined when _fields_ is\nassigned, otherwise it will have no effect.
\nThe fields listed in this variable must be structure or union type fields.\nctypes will create descriptors in the structure type that allow the\nnested fields to be accessed directly, without needing to go through the\nstructure or union field.
\nHere is an example type (Windows):
\nclass _U(Union):\n _fields_ = [("lptdesc", POINTER(TYPEDESC)),\n ("lpadesc", POINTER(ARRAYDESC)),\n ("hreftype", HREFTYPE)]\n\nclass TYPEDESC(Structure):\n _anonymous_ = ("u",)\n _fields_ = [("u", _U),\n ("vt", VARTYPE)]\n
The TYPEDESC structure describes a COM data type, the vt field\nspecifies which one of the union fields is valid. Since the u field\nis defined as anonymous field, it is now possible to access the members\ndirectly off the TYPEDESC instance. td.lptdesc and td.u.lptdesc\nare equivalent, but the former is faster since it does not need to create\na temporary union instance:
\ntd = TYPEDESC()\ntd.vt = VT_PTR\ntd.lptdesc = POINTER(some_type)\ntd.u.lptdesc = POINTER(some_type)\n
It is possible to define sub-subclasses of structures; they inherit the\nfields of the base class. If the subclass definition has a separate\n_fields_ variable, the fields specified in it are appended to the\nfields of the base class.
\nStructure and union constructors accept both positional and keyword\narguments. Positional arguments are used to initialize member fields in the\nsame order as they appear in _fields_. Keyword arguments in the\nconstructor are interpreted as attribute assignments, so they will initialize\n_fields_ with the same name, or create new attributes for names not\npresent in _fields_.
\nPlatforms: Unix
\nThe readline module defines a number of functions to facilitate\ncompletion and reading/writing of history files from the Python interpreter.\nThis module can be used directly or via the rlcompleter module. Settings\nmade using this module affect the behaviour of both the interpreter’s\ninteractive prompt and the prompts offered by the raw_input() and\ninput() built-in functions.
\nNote
\nOn MacOS X the readline module can be implemented using\nthe libedit library instead of GNU readline.
\nThe configuration file for libedit is different from that\nof GNU readline. If you programmatically load configuration strings\nyou can check for the text “libedit” in readline.__doc__\nto differentiate between GNU readline and libedit.
\nThe readline module defines the following functions:
\nClear the current history. (Note: this function is not available if the\ninstalled version of GNU readline doesn’t support it.)
\n\nNew in version 2.4.
\nReturn the number of lines currently in the history. (This is different from\nget_history_length(), which returns the maximum number of lines that will\nbe written to a history file.)
\n\nNew in version 2.3.
\nReturn the current contents of history item at index.
\n\nNew in version 2.3.
\nRemove history item specified by its position from the history.
\n\nNew in version 2.4.
\nReplace history item specified by its position with the given line.
\n\nNew in version 2.4.
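A short sketch tying the history functions together. Note the indexing conventions: get_history_item() is 1-based, while remove_history_item() and replace_history_item() take 0-based positions; behaviour may differ under libedit.

```python
import readline

readline.clear_history()
readline.add_history("first")
readline.add_history("second")
assert readline.get_current_history_length() == 2
assert readline.get_history_item(1) == "first"   # 1-based index

readline.replace_history_item(1, "2nd")          # 0-based position
assert readline.get_history_item(2) == "2nd"
readline.remove_history_item(0)                  # 0-based position
assert readline.get_current_history_length() == 1
```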
\nChange what’s displayed on the screen to reflect the current contents of the\nline buffer.
\n\nNew in version 2.3.
\nGet the completer function, or None if no completer function has been set.
\n\nNew in version 2.3.
\nGet the type of completion being attempted.
\n\nNew in version 2.6.
\nSet or remove the completion display function. If function is\nspecified, it will be used as the new completion display function;\nif omitted or None, any completion display function already\ninstalled is removed. The completion display function is called as\nfunction(substitution, [matches], longest_match_length) once\neach time matches need to be displayed.
\n\nNew in version 2.6.
\nSee also
\nThe following example demonstrates how to use the readline module’s\nhistory reading and writing functions to automatically load and save a history\nfile named .pyhist from the user’s home directory. The code below would\nnormally be executed automatically during interactive sessions from the user’s\nPYTHONSTARTUP file.
\nimport os\nimport readline\nhistfile = os.path.join(os.path.expanduser("~"), ".pyhist")\ntry:\n readline.read_history_file(histfile)\nexcept IOError:\n pass\nimport atexit\natexit.register(readline.write_history_file, histfile)\ndel os, histfile\n
The following example extends the code.InteractiveConsole class to\nsupport history save/restore.
\nimport code\nimport readline\nimport atexit\nimport os\n\nclass HistoryConsole(code.InteractiveConsole):\n def __init__(self, locals=None, filename="<console>",\n histfile=os.path.expanduser("~/.console-history")):\n code.InteractiveConsole.__init__(self, locals, filename)\n self.init_history(histfile)\n\n def init_history(self, histfile):\n readline.parse_and_bind("tab: complete")\n if hasattr(readline, "read_history_file"):\n try:\n readline.read_history_file(histfile)\n except IOError:\n pass\n atexit.register(self.save_history, histfile)\n\n def save_history(self, histfile):\n readline.write_history_file(histfile)\n
Memory-mapped file objects behave like both strings and file objects.\nUnlike normal string objects, however, these are mutable. You can use mmap\nobjects in most places where strings are expected; for example, you can use\nthe re module to search through a memory-mapped file. Since they’re\nmutable, you can change a single character by doing obj[index] = 'a', or\nchange a substring by assigning to a slice: obj[i1:i2] = '...'. You can\nalso read and write data starting at the current file position, and\nseek() through the file to different positions.
\nA memory-mapped file is created by the mmap constructor, which is\ndifferent on Unix and on Windows. In either case you must provide a file\ndescriptor for a file opened for update. If you wish to map an existing Python\nfile object, use its fileno() method to obtain the correct value for the\nfileno parameter. Otherwise, you can open the file using the\nos.open() function, which returns a file descriptor directly (the file\nstill needs to be closed when done).
\nNote
\nIf you want to create a memory-mapping for a writable, buffered file, you\nshould flush() the file first. This is necessary to ensure\nthat local modifications to the buffers are actually available to the\nmapping.
\nFor both the Unix and Windows versions of the constructor, access may be\nspecified as an optional keyword parameter. access accepts one of three\nvalues: ACCESS_READ, ACCESS_WRITE, or ACCESS_COPY\nto specify read-only, write-through or copy-on-write memory respectively.\naccess can be used on both Unix and Windows. If access is not specified,\nWindows mmap returns a write-through mapping. The initial memory values for\nall three access types are taken from the specified file. Assignment to an\nACCESS_READ memory map raises a TypeError exception.\nAssignment to an ACCESS_WRITE memory map affects both memory and the\nunderlying file. Assignment to an ACCESS_COPY memory map affects\nmemory but does not update the underlying file.
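A sketch of the copy-on-write behaviour of ACCESS_COPY, using a temporary file created with tempfile for the example:

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")

m = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
m[:5] = b"abcde"                    # changes the mapped memory...
assert m[:5] == b"abcde"
m.close()

os.lseek(fd, 0, os.SEEK_SET)
assert os.read(fd, 5) == b"01234"   # ...but not the underlying file
os.close(fd)
os.remove(path)
```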
\n\nChanged in version 2.5: To map anonymous memory, -1 should be passed as the fileno along with the\nlength.
\n\nChanged in version 2.6: mmap.mmap has formerly been a factory function creating mmap objects. Now\nmmap.mmap is the class itself.
\n(Windows version) Maps length bytes from the file specified by the\nfile handle fileno, and creates a mmap object. If length is larger\nthan the current size of the file, the file is extended to contain length\nbytes. If length is 0, the maximum length of the map is the current\nsize of the file, except that if the file is empty Windows raises an\nexception (you cannot create an empty mapping on Windows).
\ntagname, if specified and not None, is a string giving a tag name for\nthe mapping. Windows allows you to have many different mappings against\nthe same file. If you specify the name of an existing tag, that tag is\nopened, otherwise a new tag of this name is created. If this parameter is\nomitted or None, the mapping is created without a name. Avoiding the\nuse of the tag parameter will assist in keeping your code portable between\nUnix and Windows.
\noffset may be specified as a non-negative integer offset. mmap references\nwill be relative to the offset from the beginning of the file. offset\ndefaults to 0. offset must be a multiple of the ALLOCATIONGRANULARITY.
\n(Unix version) Maps length bytes from the file specified by the file\ndescriptor fileno, and returns a mmap object. If length is 0, the\nmaximum length of the map will be the current size of the file when\nmmap is called.
\nflags specifies the nature of the mapping. MAP_PRIVATE creates a\nprivate copy-on-write mapping, so changes to the contents of the mmap\nobject will be private to this process, and MAP_SHARED creates a\nmapping that’s shared with all other processes mapping the same areas of\nthe file. The default value is MAP_SHARED.
\nprot, if specified, gives the desired memory protection; the two most\nuseful values are PROT_READ and PROT_WRITE, to specify\nthat the pages may be read or written. prot defaults to\nPROT_READ | PROT_WRITE.
\naccess may be specified in lieu of flags and prot as an optional\nkeyword parameter. It is an error to specify both flags, prot and\naccess. See the description of access above for information on how to\nuse this parameter.
\noffset may be specified as a non-negative integer offset. mmap references\nwill be relative to the offset from the beginning of the file. offset\ndefaults to 0. offset must be a multiple of the PAGESIZE or\nALLOCATIONGRANULARITY.
\nTo ensure validity of the created memory mapping, the file specified\nby the descriptor fileno is internally and automatically synchronized\nwith the physical backing store on Mac OS X and OpenVMS.
\nThis example shows a simple way of using mmap:
\nimport mmap\n\n# write a simple example file\nwith open("hello.txt", "wb") as f:\n f.write("Hello Python!\\n")\n\nwith open("hello.txt", "r+b") as f:\n # memory-map the file, size 0 means whole file\n map = mmap.mmap(f.fileno(), 0)\n # read content via standard file methods\n print map.readline() # prints "Hello Python!"\n # read content via slice notation\n print map[:5] # prints "Hello"\n # update content using slice notation;\n # note that new content must have same size\n map[6:] = " world!\\n"\n # ... and read again using standard file methods\n map.seek(0)\n print map.readline() # prints "Hello world!"\n # close the map\n map.close()\n
The next example demonstrates how to create an anonymous map and exchange\ndata between the parent and child processes:
\nimport mmap\nimport os\n\nmap = mmap.mmap(-1, 13)\nmap.write("Hello world!")\n\npid = os.fork()\n\nif pid == 0: # In a child process\n map.seek(0)\n print map.readline()\n\n map.close()\n
Memory-mapped file objects support the following methods:
\nFlushes changes made to the in-memory copy of a file back to disk. Without\nuse of this call there is no guarantee that changes are written back before\nthe object is destroyed. If offset and size are specified, only\nchanges to the given range of bytes will be flushed to disk; otherwise, the\nwhole extent of the mapping is flushed.
\n(Windows version) A nonzero value returned indicates success; zero\nindicates failure.
\n(Unix version) A zero value is returned to indicate success. An\nexception is raised when the call failed.
\nSource code: Lib/threading.py
\nThis module constructs higher-level threading interfaces on top of the lower\nlevel thread module.\nSee also the mutex and Queue modules.
\nThe dummy_threading module is provided for situations where\nthreading cannot be used because thread is missing.
\nNote
\nStarting with Python 2.6, this module provides PEP 8 compliant aliases and\nproperties to replace the camelCase names that were inspired by Java’s\nthreading API. This updated API is compatible with that of the\nmultiprocessing module. However, no schedule has been set for the\ndeprecation of the camelCase names and they remain fully supported in\nboth Python 2.x and 3.x.
\nNote
\nStarting with Python 2.5, several Thread methods raise RuntimeError\ninstead of AssertionError if called erroneously.
\nCPython implementation detail: Due to the Global Interpreter Lock, in CPython only one thread\ncan execute Python code at once (even though certain performance-oriented\nlibraries might overcome this limitation).\nIf you want your application to make better use of the computational\nresources of multi-core machines, you are advised to use\nmultiprocessing. However, threading is still an appropriate model\nif you want to run multiple I/O-bound tasks simultaneously.
\nThis module defines the following functions and objects:
\nA factory function that returns a new condition variable object. A condition\nvariable allows one or more threads to wait until they are notified by another\nthread.
\nSee Condition Objects.
\nA factory function that returns a new event object. An event manages a flag\nthat can be set to true with the set() method and reset to false\nwith the clear() method. The wait() method blocks until the flag\nis true.
\nSee Event Objects.
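A minimal sketch of this flag protocol, with one thread blocking in wait() until another calls set():

```python
import threading

evt = threading.Event()
results = []

def waiter():
    evt.wait()               # blocks until the flag becomes true
    results.append("done")

t = threading.Thread(target=waiter)
t.start()
assert not evt.is_set()      # the flag starts out false
evt.set()                    # wakes the waiting thread
t.join()
assert results == ["done"]
```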
\nA class that represents thread-local data. Thread-local data are data whose\nvalues are thread specific. To manage thread-local data, just create an\ninstance of local (or a subclass) and store attributes on it:
\nmydata = threading.local()\nmydata.x = 1\n
The instance’s values will be different for separate threads.
\nFor more details and extensive examples, see the documentation string of the\n_threading_local module.
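A short sketch showing that attributes stored on a local instance are invisible to other threads:

```python
import threading

mydata = threading.local()
mydata.x = 1                 # set in the main thread

seen = []

def worker():
    # Each thread gets its own namespace on the local instance,
    # so the main thread's attribute is not visible here.
    seen.append(hasattr(mydata, "x"))
    mydata.x = 2

t = threading.Thread(target=worker)
t.start()
t.join()
assert seen == [False]
assert mydata.x == 1         # the main thread's value is untouched
```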
\n\nNew in version 2.4.
\nA factory function that returns a new primitive lock object. Once a thread has\nacquired it, subsequent attempts to acquire it block, until it is released; any\nthread may release it.
\nSee Lock Objects.
\nA factory function that returns a new reentrant lock object. A reentrant lock\nmust be released by the thread that acquired it. Once a thread has acquired a\nreentrant lock, the same thread may acquire it again without blocking; the\nthread must release it once for each time it has acquired it.
\nSee RLock Objects.
\nA factory function that returns a new semaphore object. A semaphore manages a\ncounter representing the number of release() calls minus the number of\nacquire() calls, plus an initial value. The acquire() method blocks\nif necessary until it can return without making the counter negative. If not\ngiven, value defaults to 1.
\nSee Semaphore Objects.
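The counter behaviour can be sketched with non-blocking acquires; a semaphore with an initial value of 2 admits at most two holders:

```python
import threading

sem = threading.Semaphore(2)

assert sem.acquire(False)        # counter 2 -> 1
assert sem.acquire(False)        # counter 1 -> 0
assert not sem.acquire(False)    # would go negative, so it fails
sem.release()                    # counter 0 -> 1
assert sem.acquire(False)        # succeeds again
sem.release()
sem.release()                    # back to the initial value of 2
```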
\nA class that represents a thread of control. This class can be safely\nsubclassed in a limited fashion.
\nSee Thread Objects.
\nA thread that executes a function after a specified interval has passed.
\nSee Timer Objects.
\nSet a trace function for all threads started from the threading module.\nThe func will be passed to sys.settrace() for each thread, before its\nrun() method is called.
\n\nNew in version 2.3.
\nSet a profile function for all threads started from the threading module.\nThe func will be passed to sys.setprofile() for each thread, before its\nrun() method is called.
\n\nNew in version 2.3.
\nReturn the thread stack size used when creating new threads. The optional\nsize argument specifies the stack size to be used for subsequently created\nthreads, and must be 0 (use platform or configured default) or a positive\ninteger value of at least 32,768 (32kB). If changing the thread stack size is\nunsupported, a ThreadError is raised. If the specified stack size is\ninvalid, a ValueError is raised and the stack size is unmodified. 32kB\nis currently the minimum supported stack size value to guarantee sufficient\nstack space for the interpreter itself. Note that some platforms may have\nparticular restrictions on values for the stack size, such as requiring a\nminimum stack size > 32kB or requiring allocation in multiples of the system\nmemory page size - platform documentation should be referred to for more\ninformation (4kB pages are common; using multiples of 4096 for the stack size is\nthe suggested approach in the absence of more specific information).\nAvailability: Windows, systems with POSIX threads.
\n\nNew in version 2.5.
\nDetailed interfaces for the objects are documented below.
\nThe design of this module is loosely based on Java’s threading model. However,\nwhere Java makes locks and condition variables basic behavior of every object,\nthey are separate objects in Python. Python’s Thread class supports a\nsubset of the behavior of Java’s Thread class; currently, there are no\npriorities, no thread groups, and threads cannot be destroyed, stopped,\nsuspended, resumed, or interrupted. The static methods of Java’s Thread class,\nwhen implemented, are mapped to module-level functions.
\nAll of the methods described below are executed atomically.
\nThis class represents an activity that is run in a separate thread of control.\nThere are two ways to specify the activity: by passing a callable object to the\nconstructor, or by overriding the run() method in a subclass. No other\nmethods (except for the constructor) should be overridden in a subclass. In\nother words, only override the __init__() and run() methods of\nthis class.
\nOnce a thread object is created, its activity must be started by calling the\nthread’s start() method. This invokes the run() method in a\nseparate thread of control.
\nOnce the thread’s activity is started, the thread is considered ‘alive’. It\nstops being alive when its run() method terminates – either normally, or\nby raising an unhandled exception. The is_alive() method tests whether the\nthread is alive.
\nOther threads can call a thread’s join() method. This blocks the calling\nthread until the thread whose join() method is called is terminated.
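Both ways of specifying the activity, together with start() and join(), can be sketched side by side:

```python
import threading

results = []

# Way 1: pass a callable object to the constructor.
t1 = threading.Thread(target=results.append, args=("from target",))

# Way 2: override run() in a subclass.
class MyThread(threading.Thread):
    def run(self):
        results.append("from run")

t2 = MyThread()
t1.start()               # start() invokes run() in a new thread
t2.start()
t1.join()                # block until each thread terminates
t2.join()
assert sorted(results) == ["from run", "from target"]
```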
\nA thread has a name. The name can be passed to the constructor, and read or\nchanged through the name attribute.
\nA thread can be flagged as a “daemon thread”. The significance of this flag is\nthat the entire Python program exits when only daemon threads are left. The\ninitial value is inherited from the creating thread. The flag can be set\nthrough the daemon property.
\nThere is a “main thread” object; this corresponds to the initial thread of\ncontrol in the Python program. It is not a daemon thread.
\nThere is the possibility that “dummy thread objects” are created. These are\nthread objects corresponding to “alien threads”, which are threads of control\nstarted outside the threading module, such as directly from C code. Dummy\nthread objects have limited functionality; they are always considered alive and\ndaemonic, and cannot be join()ed. They are never deleted, since it is\nimpossible to detect the termination of alien threads.
\nThis constructor should always be called with keyword arguments. Arguments\nare:
\ngroup should be None; reserved for future extension when a\nThreadGroup class is implemented.
\ntarget is the callable object to be invoked by the run() method.\nDefaults to None, meaning nothing is called.
\nname is the thread name. By default, a unique name is constructed of the\nform “Thread-N” where N is a small decimal number.
\nargs is the argument tuple for the target invocation. Defaults to ().
\nkwargs is a dictionary of keyword arguments for the target invocation.\nDefaults to {}.
\nIf the subclass overrides the constructor, it must make sure to invoke the\nbase class constructor (Thread.__init__()) before doing anything else to\nthe thread.
\nStart the thread’s activity.
\nIt must be called at most once per thread object. It arranges for the\nobject’s run() method to be invoked in a separate thread of control.
\nThis method will raise a RuntimeError if called more than once\non the same thread object.
\nMethod representing the thread’s activity.
\nYou may override this method in a subclass. The standard run()\nmethod invokes the callable object passed to the object’s constructor as\nthe target argument, if any, with sequential and keyword arguments taken\nfrom the args and kwargs arguments, respectively.
\nWait until the thread terminates. This blocks the calling thread until the\nthread whose join() method is called terminates – either normally\nor through an unhandled exception – or until the optional timeout occurs.
\nWhen the timeout argument is present and not None, it should be a\nfloating point number specifying a timeout for the operation in seconds\n(or fractions thereof). As join() always returns None, you must\ncall is_alive() after join() to decide whether a timeout\nhappened – if the thread is still alive, the join() call timed out.
\nWhen the timeout argument is not present or None, the operation will\nblock until the thread terminates.
\nA thread can be join()ed many times.
\njoin() raises a RuntimeError if an attempt is made to join\nthe current thread as that would cause a deadlock. It is also an error to\njoin() a thread before it has been started and attempts to do so\nraises the same exception.
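A sketch of detecting a join() timeout, and of the error raised when joining an unstarted thread:

```python
import threading
import time

t = threading.Thread(target=time.sleep, args=(1.0,))
t.daemon = True           # so the sleeping thread cannot block interpreter exit
t.start()
t.join(0.05)              # join() always returns None...
timed_out = t.is_alive()  # ...so check is_alive() to detect the timeout
assert timed_out

raised = False
try:
    threading.Thread().join()   # joining a thread before it is started
except RuntimeError:
    raised = True
assert raised
```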
\nThe ‘thread identifier’ of this thread or None if the thread has not\nbeen started. This is a nonzero integer. See the\nthread.get_ident() function. Thread identifiers may be recycled\nwhen a thread exits and another thread is created. The identifier is\navailable even after the thread has exited.
\n\nNew in version 2.6.
\nReturn whether the thread is alive.
\nThis method returns True from just before the run() method starts\nuntil just after the run() method terminates. The module function\nenumerate() returns a list of all alive threads.
\nA boolean value indicating whether this thread is a daemon thread (True)\nor not (False). This must be set before start() is called,\notherwise RuntimeError is raised. Its initial value is inherited\nfrom the creating thread; the main thread is not a daemon thread and\ntherefore all threads created in the main thread default to daemon\n= False.
\nThe entire Python program exits when no alive non-daemon threads are left.
\nA primitive lock is a synchronization primitive that is not owned by a\nparticular thread when locked. In Python, it is currently the lowest level\nsynchronization primitive available, implemented directly by the thread\nextension module.
\nA primitive lock is in one of two states, “locked” or “unlocked”. It is created\nin the unlocked state. It has two basic methods, acquire() and\nrelease(). When the state is unlocked, acquire() changes the state\nto locked and returns immediately. When the state is locked, acquire()\nblocks until a call to release() in another thread changes it to unlocked,\nthen the acquire() call resets it to locked and returns. The\nrelease() method should only be called in the locked state; it changes the\nstate to unlocked and returns immediately. If an attempt is made to release an\nunlocked lock, a RuntimeError will be raised.
\nWhen more than one thread is blocked in acquire() waiting for the state to\nturn to unlocked, only one thread proceeds when a release() call resets\nthe state to unlocked; which one of the waiting threads proceeds is not defined,\nand may vary across implementations.
\nAll methods are executed atomically.
\nAcquire a lock, blocking or non-blocking.
\nWhen invoked without arguments, block until the lock is unlocked, then set it to\nlocked, and return true.
\nWhen invoked with the blocking argument set to true, do the same thing as when\ncalled without arguments, and return true.
\nWhen invoked with the blocking argument set to false, do not block. If a call\nwithout an argument would block, return false immediately; otherwise, do the\nsame thing as when called without arguments, and return true.
\nRelease a lock.
\nWhen the lock is locked, reset it to unlocked, and return. If any other threads\nare blocked waiting for the lock to become unlocked, allow exactly one of them\nto proceed.
\nDo not call this method when the lock is unlocked.
\nThere is no return value.
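The two-state protocol can be sketched with non-blocking acquires:

```python
import threading

lock = threading.Lock()

assert lock.acquire()            # unlocked -> locked, returns true
assert not lock.acquire(False)   # non-blocking acquire of a held lock fails
lock.release()                   # locked -> unlocked
assert lock.acquire(False)       # succeeds again
lock.release()
```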
\nA reentrant lock is a synchronization primitive that may be acquired multiple\ntimes by the same thread. Internally, it uses the concepts of “owning thread”\nand “recursion level” in addition to the locked/unlocked state used by primitive\nlocks. In the locked state, some thread owns the lock; in the unlocked state,\nno thread owns it.
\nTo lock the lock, a thread calls its acquire() method; this returns once\nthe thread owns the lock. To unlock the lock, a thread calls its\nrelease() method. acquire()/release() call pairs may be\nnested; only the final release() (the release() of the outermost\npair) resets the lock to unlocked and allows another thread blocked in\nacquire() to proceed.
\nAcquire a lock, blocking or non-blocking.
\nWhen invoked without arguments: if this thread already owns the lock, increment\nthe recursion level by one, and return immediately. Otherwise, if another\nthread owns the lock, block until the lock is unlocked. Once the lock is\nunlocked (not owned by any thread), then grab ownership, set the recursion level\nto one, and return. If more than one thread is blocked waiting until the lock\nis unlocked, only one at a time will be able to grab ownership of the lock.\nThere is no return value in this case.
\nWhen invoked with the blocking argument set to true, do the same thing as when\ncalled without arguments, and return true.
\nWhen invoked with the blocking argument set to false, do not block. If a call\nwithout an argument would block, return false immediately; otherwise, do the\nsame thing as when called without arguments, and return true.
\nRelease a lock, decrementing the recursion level. If after the decrement it is\nzero, reset the lock to unlocked (not owned by any thread), and if any other\nthreads are blocked waiting for the lock to become unlocked, allow exactly one\nof them to proceed. If after the decrement the recursion level is still\nnonzero, the lock remains locked and owned by the calling thread.
\nOnly call this method when the calling thread owns the lock. A\nRuntimeError is raised if this method is called when the lock is\nunlocked.
\nThere is no return value.
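A sketch of the recursion level at work:

```python
import threading

rlock = threading.RLock()

assert rlock.acquire()     # owning thread, recursion level 1
assert rlock.acquire()     # same thread may re-acquire: level 2
rlock.release()            # level drops to 1; still locked
rlock.release()            # level drops to 0; now unlocked

caught = False
try:
    rlock.release()        # releasing an unowned lock is an error
except RuntimeError:
    caught = True
assert caught
```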
\nA condition variable is always associated with some kind of lock; this can be\npassed in or one will be created by default. (Passing one in is useful when\nseveral condition variables must share the same lock.)
\nA condition variable has acquire() and release() methods that call\nthe corresponding methods of the associated lock. It also has a wait()\nmethod, and notify() and notifyAll() methods. These three must only\nbe called when the calling thread has acquired the lock, otherwise a\nRuntimeError is raised.
\nThe wait() method releases the lock, and then blocks until it is awakened\nby a notify() or notifyAll() call for the same condition variable in\nanother thread. Once awakened, it re-acquires the lock and returns. It is also\npossible to specify a timeout.
\nThe notify() method wakes up one of the threads waiting for the condition\nvariable, if any are waiting. The notifyAll() method wakes up all threads\nwaiting for the condition variable.
\nNote: the notify() and notifyAll() methods don’t release the lock;\nthis means that the thread or threads awakened will not return from their\nwait() call immediately, but only when the thread that called\nnotify() or notifyAll() finally relinquishes ownership of the lock.
\nTip: the typical programming style using condition variables uses the lock to\nsynchronize access to some shared state; threads that are interested in a\nparticular change of state call wait() repeatedly until they see the\ndesired state, while threads that modify the state call notify() or\nnotifyAll() when they change the state in such a way that it could\npossibly be a desired state for one of the waiters. For example, the following\ncode is a generic producer-consumer situation with unlimited buffer capacity:
\n# Consume one item\ncv.acquire()\nwhile not an_item_is_available():\n cv.wait()\nget_an_available_item()\ncv.release()\n\n# Produce one item\ncv.acquire()\nmake_an_item_available()\ncv.notify()\ncv.release()\n
To choose between notify() and notifyAll(), consider whether one\nstate change can be interesting for only one or several waiting threads. E.g.\nin a typical producer-consumer situation, adding one item to the buffer only\nneeds to wake up one consumer thread.
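The pattern above can be made runnable with a deque as the shared buffer (the single-consumer setup here is only illustrative):

```python
import collections
import threading

items = collections.deque()        # shared state guarded by the condition's lock
cv = threading.Condition()
results = []

def consumer():
    cv.acquire()
    while not items:               # wait repeatedly until the desired state is seen
        cv.wait()
    results.append(items.popleft())
    cv.release()

t = threading.Thread(target=consumer)
t.start()

cv.acquire()
items.append("spam")               # change the shared state...
cv.notify()                        # ...and wake one waiting consumer
cv.release()
t.join()
# results is now ["spam"]
```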
\nIf the lock argument is given and not None, it must be a Lock\nor RLock object, and it is used as the underlying lock. Otherwise,\na new RLock object is created and used as the underlying lock.
\nWait until notified or until a timeout occurs. If the calling thread has not\nacquired the lock when this method is called, a RuntimeError is raised.
\nThis method releases the underlying lock, and then blocks until it is\nawakened by a notify() or notifyAll() call for the same\ncondition variable in another thread, or until the optional timeout\noccurs. Once awakened or timed out, it re-acquires the lock and returns.
\nWhen the timeout argument is present and not None, it should be a\nfloating point number specifying a timeout for the operation in seconds\n(or fractions thereof).
\nWhen the underlying lock is an RLock, it is not released using\nits release() method, since this may not actually unlock the lock\nwhen it was acquired multiple times recursively. Instead, an internal\ninterface of the RLock class is used, which really unlocks it\neven when it has been recursively acquired several times. Another internal\ninterface is then used to restore the recursion level when the lock is\nreacquired.
\nBy default, wake up one thread waiting on this condition, if any. If the\ncalling thread has not acquired the lock when this method is called, a\nRuntimeError is raised.
\nThis method wakes up at most n of the threads waiting for the condition\nvariable; it is a no-op if no threads are waiting.
\nThe current implementation wakes up exactly n threads, if at least n\nthreads are waiting. However, it’s not safe to rely on this behavior.\nA future, optimized implementation may occasionally wake up more than\nn threads.
\nNote: an awakened thread does not actually return from its wait()\ncall until it can reacquire the lock. Since notify() does not\nrelease the lock, its caller should.
\nThis is one of the oldest synchronization primitives in the history of computer\nscience, invented by the early Dutch computer scientist Edsger W. Dijkstra (he\nused P() and V() instead of acquire() and release()).
\nA semaphore manages an internal counter which is decremented by each\nacquire() call and incremented by each release() call. The counter\ncan never go below zero; when acquire() finds that it is zero, it blocks,\nwaiting until some other thread calls release().
\nThe optional argument gives the initial value for the internal counter; it\ndefaults to 1. If the value given is less than 0, ValueError is\nraised.
\nAcquire a semaphore.
\nWhen invoked without arguments: if the internal counter is larger than\nzero on entry, decrement it by one and return immediately. If it is zero\non entry, block, waiting until some other thread has called\nrelease() to make it larger than zero. This is done with proper\ninterlocking so that if multiple acquire() calls are blocked,\nrelease() will wake exactly one of them up. The implementation may\npick one at random, so the order in which blocked threads are awakened\nshould not be relied on. There is no return value in this case.
\nWhen invoked with blocking set to true, do the same thing as when called\nwithout arguments, and return true.
\nWhen invoked with blocking set to false, do not block. If a call\nwithout an argument would block, return false immediately; otherwise, do\nthe same thing as when called without arguments, and return true.
\nSemaphores are often used to guard resources with limited capacity, for example,\na database server. In any situation where the size of the resource is fixed,\nyou should use a bounded semaphore. Before spawning any worker threads, your\nmain thread would initialize the semaphore:
\nmaxconnections = 5\n...\npool_sema = BoundedSemaphore(value=maxconnections)\n
Once spawned, worker threads call the semaphore’s acquire and release methods\nwhen they need to connect to the server:
\npool_sema.acquire()\nconn = connectdb()\n... use connection ...\nconn.close()\npool_sema.release()\n
The use of a bounded semaphore reduces the chance that a programming error which\ncauses the semaphore to be released more than it’s acquired will go undetected.
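A BoundedSemaphore turns such an over-release into an exception; a small sketch:

```python
import threading

pool_sema = threading.BoundedSemaphore(value=2)
over_released = False

pool_sema.acquire()
pool_sema.release()          # fine: matches the acquire

try:
    pool_sema.release()      # one release too many
except ValueError:
    over_released = True     # the bounded semaphore caught the bug
```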
\nThis is one of the simplest mechanisms for communication between threads: one\nthread signals an event and other threads wait for it.
\nAn event object manages an internal flag that can be set to true with the\nset() method and reset to false with the clear() method. The\nwait() method blocks until the flag is true.
\nThe internal flag is initially false.
\nReturn true if and only if the internal flag is true.
\n\nChanged in version 2.6: The is_set() syntax is new.
\nBlock until the internal flag is true. If the internal flag is true on\nentry, return immediately. Otherwise, block until another thread calls\nset() to set the flag to true, or until the optional timeout\noccurs.
\nWhen the timeout argument is present and not None, it should be a\nfloating point number specifying a timeout for the operation in seconds\n(or fractions thereof).
\nThis method returns the internal flag on exit, so it will always return\nTrue except if a timeout is given and the operation times out.
\n\nChanged in version 2.7: Previously, the method always returned None.
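A minimal sketch of one thread signalling another through an Event:

```python
import threading

ev = threading.Event()       # internal flag starts false
results = []

def waiter():
    flag = ev.wait(5.0)      # blocks until set() or the 5-second timeout
    results.append(flag)     # the internal flag on exit (True here)

t = threading.Thread(target=waiter)
t.start()
ev.set()                     # set the flag; wakes the waiting thread
t.join()
# results == [True]
```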
\nThis class represents an action that should be run only after a certain amount\nof time has passed — a timer. Timer is a subclass of Thread\nand as such also functions as an example of creating custom threads.
\nTimers are started, as with threads, by calling their start() method. The\ntimer can be stopped (before its action has begun) by calling the cancel()\nmethod. The interval the timer will wait before executing its action may not be\nexactly the same as the interval specified by the user.
\nFor example:
\ndef hello():\n print "hello, world"\n\nt = Timer(30.0, hello)\nt.start() # after 30 seconds, "hello, world" will be printed\n
Create a timer that will run function with arguments args and keyword\narguments kwargs, after interval seconds have passed.
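cancel() is only effective while the timer is still waiting; a sketch:

```python
import threading

calls = []
t = threading.Timer(30.0, calls.append, args=["fired"])
t.start()
t.cancel()      # stop the timer before its action has begun
t.join()        # the timer thread exits without calling the function
# calls == []
```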
\nAll of the objects provided by this module that have acquire() and\nrelease() methods can be used as context managers for a with\nstatement. The acquire() method will be called when the block is entered,\nand release() will be called when the block is exited.
\nCurrently, Lock, RLock, Condition,\nSemaphore, and BoundedSemaphore objects may be used as\nwith statement context managers. For example:
\nimport threading\n\nsome_rlock = threading.RLock()\n\nwith some_rlock:\n print "some_rlock is locked while this executes"\n
While the import machinery is thread-safe, there are two key restrictions on\nthreaded imports due to inherent limitations in the way that thread-safety is\nprovided:
\nSource code: Lib/rlcompleter.py
\nThe rlcompleter module defines a completion function suitable for the\nreadline module by completing valid Python identifiers and keywords.
\nWhen this module is imported on a Unix platform with the readline module\navailable, an instance of the Completer class is automatically created\nand its complete() method is set as the readline completer.
\nExample:
\n>>> import rlcompleter\n>>> import readline\n>>> readline.parse_and_bind("tab: complete")\n>>> readline. <TAB PRESSED>\nreadline.__doc__ readline.get_line_buffer( readline.read_init_file(\nreadline.__file__ readline.insert_text( readline.set_completer(\nreadline.__name__ readline.parse_and_bind(\n>>> readline.\n
The rlcompleter module is designed for use with Python’s interactive\nmode. A user can add the following lines to his or her initialization file\n(identified by the PYTHONSTARTUP environment variable) to get\nautomatic Tab completion:
\ntry:\n import readline\nexcept ImportError:\n print "Module readline not available."\nelse:\n import rlcompleter\n readline.parse_and_bind("tab: complete")\n
On platforms without readline, the Completer class defined by\nthis module can still be used for custom purposes.
\nCompleter objects have the following method:
\nReturn the stateth completion for text.
\nIf called for text that doesn’t include a period character ('.'), it will\ncomplete from names currently defined in __main__, __builtin__ and\nkeywords (as defined by the keyword module).
\nIf called for a dotted name, it will try to evaluate anything without obvious\nside-effects (functions will not be evaluated, but it can generate calls to\n__getattr__()) up to the last part, and find matches for the rest via the\ndir() function. Any exception raised during the evaluation of the\nexpression is caught, silenced and None is returned.
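The Completer can also be driven by hand, which is essentially what the readline integration does: call complete() with the same text and an increasing state until it returns None.

```python
import rlcompleter

completer = rlcompleter.Completer()

matches = []
state = 0
while True:
    match = completer.complete("ma", state)   # the stateth completion for "ma"
    if match is None:                         # no more completions
        break
    matches.append(match)
    state += 1
# every match starts with the given prefix, e.g. the builtin map
```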
\nThis module provides mechanisms to use signal handlers in Python. Some general\nrules for working with signals and their handlers:
\nThe variables defined in the signal module are:
\nThe signal corresponding to the CTRL+C keystroke event. This signal can\nonly be used with os.kill().
\nAvailability: Windows.
\n\nNew in version 2.7.
\nThe signal corresponding to the CTRL+BREAK keystroke event. This signal can\nonly be used with os.kill().
\nAvailability: Windows.
\n\nNew in version 2.7.
\nThe signal module defines one exception:
\nThe signal module defines the following functions:
Sets the given interval timer (one of signal.ITIMER_REAL, signal.ITIMER_VIRTUAL or signal.ITIMER_PROF) specified by which to fire after seconds (floats are accepted, unlike with alarm()) and thereafter every interval seconds. The interval timer specified by which can be cleared by setting seconds to zero.
\nWhen an interval timer fires, a signal is sent to the process.\nThe signal sent is dependent on the timer being used;\nsignal.ITIMER_REAL will deliver SIGALRM,\nsignal.ITIMER_VIRTUAL sends SIGVTALRM,\nand signal.ITIMER_PROF will deliver SIGPROF.
\nThe old values are returned as a tuple: (delay, interval).
\nAttempting to pass an invalid interval timer will cause an\nItimerError. Availability: Unix.
\n\nNew in version 2.6.
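A one-shot ITIMER_REAL timer delivering SIGALRM, as a sketch (Unix only):

```python
import signal

fired = []

def handler(signum, frame):
    fired.append(signum)

signal.signal(signal.SIGALRM, handler)

# Fire once after 50 ms; an interval of 0 (the default) means one-shot.
old = signal.setitimer(signal.ITIMER_REAL, 0.05)
# old is the previous (delay, interval) pair

signal.pause()                            # sleep until the signal arrives
signal.setitimer(signal.ITIMER_REAL, 0)   # clear the timer
# fired == [signal.SIGALRM]
```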
\nReturns current value of a given interval timer specified by which.\nAvailability: Unix.
\n\nNew in version 2.6.
Set the wakeup fd to fd. When a signal is received, a '\0' byte is written to the fd. This can be used by a library to wake up a poll or select call, allowing the signal to be fully processed.
\nThe old wakeup fd is returned. fd must be non-blocking. It is up to the\nlibrary to remove any bytes before calling poll or select again.
\nWhen threads are enabled, this function can only be called from the main thread;\nattempting to call it from other threads will cause a ValueError\nexception to be raised.
\n\nNew in version 2.6.
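A sketch of the wakeup-fd mechanism using a pipe (Unix only; note that in Python 3.5 and later the byte written is the signal number rather than '\0'):

```python
import fcntl
import os
import signal

r, w = os.pipe()

# The wakeup fd must be non-blocking.
flags = fcntl.fcntl(w, fcntl.F_GETFL)
fcntl.fcntl(w, fcntl.F_SETFL, flags | os.O_NONBLOCK)

signal.signal(signal.SIGUSR1, lambda signum, frame: None)
old_fd = signal.set_wakeup_fd(w)      # returns the previous wakeup fd (-1)

os.kill(os.getpid(), signal.SIGUSR1)  # a byte is written to w on delivery
data = os.read(r, 1)                  # a poll/select loop would see this byte

signal.set_wakeup_fd(old_fd)          # restore the previous wakeup fd
```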
\nChange system call restart behaviour: if flag is False, system\ncalls will be restarted when interrupted by signal signalnum, otherwise\nsystem calls will be interrupted. Returns nothing. Availability: Unix (see\nthe man page siginterrupt(3) for further information).
\nNote that installing a signal handler with signal() will reset the\nrestart behaviour to interruptible by implicitly calling\nsiginterrupt() with a true flag value for the given signal.
\n\nNew in version 2.6.
\nSet the handler for signal signalnum to the function handler. handler can\nbe a callable Python object taking two arguments (see below), or one of the\nspecial values signal.SIG_IGN or signal.SIG_DFL. The previous\nsignal handler will be returned (see the description of getsignal()\nabove). (See the Unix man page signal(2).)
\nWhen threads are enabled, this function can only be called from the main thread;\nattempting to call it from other threads will cause a ValueError\nexception to be raised.
\nThe handler is called with two arguments: the signal number and the current\nstack frame (None or a frame object; for a description of frame objects,\nsee the description in the type hierarchy or see the\nattribute descriptions in the inspect module).
\nOn Windows, signal() can only be called with SIGABRT,\nSIGFPE, SIGILL, SIGINT, SIGSEGV, or\nSIGTERM. A ValueError will be raised in any other case.
\nHere is a minimal example program. It uses the alarm() function to limit\nthe time spent waiting to open a file; this is useful if the file is for a\nserial device that may not be turned on, which would normally cause the\nos.open() to hang indefinitely. The solution is to set a 5-second alarm\nbefore opening the file; if the operation takes too long, the alarm signal will\nbe sent, and the handler raises an exception.
\nimport signal, os\n\ndef handler(signum, frame):\n print 'Signal handler called with signal', signum\n raise IOError("Couldn't open device!")\n\n# Set the signal handler and a 5-second alarm\nsignal.signal(signal.SIGALRM, handler)\nsignal.alarm(5)\n\n# This open() may hang indefinitely\nfd = os.open('/dev/ttyS0', os.O_RDWR)\n\nsignal.alarm(0) # Disable the alarm\n
\nDeprecated since version 2.6: This module is obsolete. Use the subprocess module. Check\nespecially the Replacing Older Functions with the subprocess Module section.
\nThis module allows you to spawn processes and connect to their\ninput/output/error pipes and obtain their return codes under Unix and Windows.
\nThe subprocess module provides more powerful facilities for spawning new\nprocesses and retrieving their results. Using the subprocess module is\npreferable to using the popen2 module.
\nThe primary interface offered by this module is a trio of factory functions.\nFor each of these, if bufsize is specified, it specifies the buffer size for\nthe I/O pipes. mode, if provided, should be the string 'b' or 't'; on\nWindows this is needed to determine whether the file objects should be opened in\nbinary or text mode. The default value for mode is 't'.
\nOn Unix, cmd may be a sequence, in which case arguments will be passed\ndirectly to the program without shell intervention (as with os.spawnv()).\nIf cmd is a string it will be passed to the shell (as with os.system()).
\nThe only way to retrieve the return codes for the child processes is by using\nthe poll() or wait() methods on the Popen3 and\nPopen4 classes; these are only available on Unix. This information is\nnot available when using the popen2(), popen3(), and popen4()\nfunctions, or the equivalent functions in the os module. (Note that the\ntuples returned by the os module’s functions are in a different order\nfrom the ones returned by the popen2 module.)
\nExecutes cmd as a sub-process. Returns the file objects\n(child_stdout_and_stderr, child_stdin).
\n\nNew in version 2.0.
\nOn Unix, a class defining the objects returned by the factory functions is also\navailable. These are not used for the Windows implementation, and are not\navailable on that platform.
\nThis class represents a child process. Normally, Popen3 instances are\ncreated using the popen2() and popen3() factory functions described\nabove.
\nIf not using one of the helper functions to create Popen3 objects, the\nparameter cmd is the shell command to execute in a sub-process. The\ncapturestderr flag, if true, specifies that the object should capture standard\nerror output of the child process. The default is false. If the bufsize\nparameter is specified, it specifies the size of the I/O buffers to/from the\nchild process.
\nSimilar to Popen3, but always captures standard error into the same\nfile object as standard output. These are typically created using\npopen4().
\n\nNew in version 2.0.
\nInstances of the Popen3 and Popen4 classes have the following\nmethods:
\nThe following attributes are also available:
\nAny time you are working with any form of inter-process communication, control\nflow needs to be carefully thought out. This remains the case with the file\nobjects provided by this module (or the os module equivalents).
\nWhen reading output from a child process that writes a lot of data to standard\nerror while the parent is reading from the child’s standard output, a deadlock\ncan occur. A similar situation can occur with other combinations of reads and\nwrites. The essential factors are that more than _PC_PIPE_BUF bytes\nare being written by one process in a blocking fashion, while the other process\nis reading from the first process, also in a blocking fashion.
\nThere are several ways to deal with this situation.
\nThe simplest application change, in many cases, will be to follow this model in\nthe parent process:
\nimport popen2\n\nr, w, e = popen2.popen3('python slave.py')\ne.readlines()\nr.readlines()\nr.close()\ne.close()\nw.close()\n
with code like this in the child:
\nimport os\nimport sys\n\n# note that each of these print statements\n# writes a single long string\n\nprint >>sys.stderr, 400 * 'this is a test\\n'\nos.close(sys.stderr.fileno())\nprint >>sys.stdout, 400 * 'this is another test\\n'\n
In particular, note that sys.stderr must be closed after writing all data,\nor readlines() won’t return. Also note that os.close() must be\nused, as sys.stderr.close() won’t close stderr (otherwise assigning to\nsys.stderr will silently close it, so no further errors can be printed).
\nApplications which need to support a more general approach should integrate I/O\nover pipes with their select() loops, or use separate threads to read each\nof the individual files provided by whichever popen*() function or\nPopen* class was used.
\nSee also
\n\nNew in version 2.4.
\nThe subprocess module allows you to spawn new processes, connect to their\ninput/output/error pipes, and obtain their return codes. This module intends to\nreplace several other, older modules and functions, such as:
\nos.system\nos.spawn*\nos.popen*\npopen2.*\ncommands.*
\nInformation about how the subprocess module can be used to replace these\nmodules and functions can be found in the following sections.
\nSee also
\nPEP 324 – PEP proposing the subprocess module
\nThe recommended approach to invoking subprocesses is to use the following\nconvenience functions for all use cases they can handle. For more advanced\nuse cases, the underlying Popen interface can be used directly.
\nRun the command described by args. Wait for command to complete, then\nreturn the returncode attribute.
The arguments shown above are merely the most common ones, described below in Frequently Used Arguments (hence the slightly odd notation in the abbreviated signature). The full function signature is the same as that of the Popen constructor - this function passes all supplied arguments directly through to that interface.
\nExamples:
\n>>> subprocess.call(["ls", "-l"])\n0\n\n>>> subprocess.call("exit 1", shell=True)\n1\n
Warning
\nInvoking the system shell with shell=True can be a security hazard\nif combined with untrusted input. See the warning under\nFrequently Used Arguments for details.
\nNote
\nDo not use stdout=PIPE or stderr=PIPE with this function. As\nthe pipes are not being read in the current process, the child\nprocess may block if it generates enough output to a pipe to fill up\nthe OS pipe buffer.
\nRun command with arguments. Wait for command to complete. If the return\ncode was zero then return, otherwise raise CalledProcessError. The\nCalledProcessError object will have the return code in the\nreturncode attribute.
The arguments shown above are merely the most common ones, described below in Frequently Used Arguments (hence the slightly odd notation in the abbreviated signature). The full function signature is the same as that of the Popen constructor - this function passes all supplied arguments directly through to that interface.
\nExamples:
\n>>> subprocess.check_call(["ls", "-l"])\n0\n\n>>> subprocess.check_call("exit 1", shell=True)\nTraceback (most recent call last):\n ...\nsubprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1\n
\nNew in version 2.5.
\nWarning
\nInvoking the system shell with shell=True can be a security hazard\nif combined with untrusted input. See the warning under\nFrequently Used Arguments for details.
\nNote
\nDo not use stdout=PIPE or stderr=PIPE with this function. As\nthe pipes are not being read in the current process, the child\nprocess may block if it generates enough output to a pipe to fill up\nthe OS pipe buffer.
\nRun command with arguments and return its output as a byte string.
\nIf the return code was non-zero it raises a CalledProcessError. The\nCalledProcessError object will have the return code in the\nreturncode attribute and any output in the output\nattribute.
\nThe arguments shown above are merely the most common ones, described below\nin Frequently Used Arguments (hence the slightly odd notation in\nthe abbreviated signature). The full function signature is largely the\nsame as that of the Popen constructor, except that stdout is\nnot permitted as it is used internally. All other supplied arguments are\npassed directly through to the Popen constructor.
\nExamples:
\n>>> subprocess.check_output(["echo", "Hello World!"])\n'Hello World!\\n'\n\n>>> subprocess.check_output("exit 1", shell=True)\nTraceback (most recent call last):\n ...\nsubprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1\n
To also capture standard error in the result, use\nstderr=subprocess.STDOUT:
\n>>> subprocess.check_output(\n... "ls non_existent_file; exit 0",\n... stderr=subprocess.STDOUT,\n... shell=True)\n'ls: non_existent_file: No such file or directory\\n'\n
\nNew in version 2.7.
\nWarning
\nInvoking the system shell with shell=True can be a security hazard\nif combined with untrusted input. See the warning under\nFrequently Used Arguments for details.
\nNote
\nDo not use stderr=PIPE with this function. As the pipe is not being\nread in the current process, the child process may block if it\ngenerates enough output to the pipe to fill up the OS pipe buffer.
\nTo support a wide variety of use cases, the Popen constructor (and\nthe convenience functions) accept a large number of optional arguments. For\nmost typical use cases, many of these arguments can be safely left at their\ndefault values. The arguments that are most commonly needed are:
\n\n\nargs is required for all calls and should be a string, or a sequence of\nprogram arguments. Providing a sequence of arguments is generally\npreferred, as it allows the module to take care of any required escaping\nand quoting of arguments (e.g. to permit spaces in file names). If passing\na single string, either shell must be True (see below) or else\nthe string must simply name the program to be executed without specifying\nany arguments.
\nstdin, stdout and stderr specify the executed program’s standard input,\nstandard output and standard error file handles, respectively. Valid values\nare PIPE, an existing file descriptor (a positive integer), an\nexisting file object, and None. PIPE indicates that a new pipe\nto the child should be created. With the default settings of None, no\nredirection will occur; the child’s file handles will be inherited from the\nparent. Additionally, stderr can be STDOUT, which indicates that\nthe stderr data from the child process should be captured into the same file\nhandle as for stdout.
When stdout or stderr are pipes and universal_newlines is True, all line endings will be converted to '\n' as described for the universal newlines 'U' mode argument to open().
\nIf shell is True, the specified command will be executed through\nthe shell. This can be useful if you are using Python primarily for the\nenhanced control flow it offers over most system shells and still want\naccess to other shell features such as filename wildcards, shell pipes and\nenvironment variable expansion.
\n\n\nWarning
\nExecuting shell commands that incorporate unsanitized input from an\nuntrusted source makes a program vulnerable to shell injection,\na serious security flaw which can result in arbitrary command execution.\nFor this reason, the use of shell=True is strongly discouraged in cases\nwhere the command string is constructed from external input:
>>> from subprocess import call
>>> filename = input("What file would you like to display?\n")
What file would you like to display?
non_existent; rm -rf / #
>>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...

shell=False disables all shell based features, but does not suffer from this vulnerability; see the Note in the Popen constructor documentation for helpful hints in getting shell=False to work.
These options, along with all of the other options, are described in more\ndetail in the Popen constructor documentation.
\nThe underlying process creation and management in this module is handled by\nthe Popen class. It offers a lot of flexibility so that developers\nare able to handle the less common cases not covered by the convenience\nfunctions.
\nArguments are:
\nargs should be a string, or a sequence of program arguments. The program\nto execute is normally the first item in the args sequence or the string if\na string is given, but can be explicitly set by using the executable\nargument. When executable is given, the first item in the args sequence\nis still treated by most programs as the command name, which can then be\ndifferent from the actual executable name. On Unix, it becomes the display\nname for the executing program in utilities such as ps.
\nOn Unix, with shell=False (default): In this case, the Popen class uses\nos.execvp() to execute the child program. args should normally be a\nsequence. If a string is specified for args, it will be used as the name\nor path of the program to execute; this will only work if the program is\nbeing given no arguments.
\nNote
\nshlex.split() can be useful when determining the correct\ntokenization for args, especially in complex cases:
\n>>> import shlex, subprocess\n>>> command_line = raw_input()\n/bin/vikings -input eggs.txt -output "spam spam.txt" -cmd "echo '$MONEY'"\n>>> args = shlex.split(command_line)\n>>> print args\n['/bin/vikings', '-input', 'eggs.txt', '-output', 'spam spam.txt', '-cmd', "echo '$MONEY'"]\n>>> p = subprocess.Popen(args) # Success!\n
Note in particular that options (such as -input) and arguments (such\nas eggs.txt) that are separated by whitespace in the shell go in separate\nlist elements, while arguments that need quoting or backslash escaping when\nused in the shell (such as filenames containing spaces or the echo command\nshown above) are single list elements.
\nOn Unix, with shell=True: If args is a string, it specifies the command\nstring to execute through the shell. This means that the string must be\nformatted exactly as it would be when typed at the shell prompt. This\nincludes, for example, quoting or backslash escaping filenames with spaces in\nthem. If args is a sequence, the first item specifies the command string, and\nany additional items will be treated as additional arguments to the shell\nitself. That is to say, Popen does the equivalent of:
\nPopen(['/bin/sh', '-c', args[0], args[1], ...])\n
On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string in a manner described in Converting an argument sequence to a string on Windows.
\nbufsize, if given, has the same meaning as the corresponding argument to the\nbuilt-in open() function: 0 means unbuffered, 1 means line\nbuffered, any other positive value means use a buffer of (approximately) that\nsize. A negative bufsize means to use the system default, which usually means\nfully buffered. The default value for bufsize is 0 (unbuffered).
\nNote
\nIf you experience performance issues, it is recommended that you try to\nenable buffering by setting bufsize to either -1 or a large enough\npositive value (such as 4096).
The executable argument specifies the program to execute. It is very seldom needed: usually, the program to execute is defined by the args argument. If shell=True, the executable argument specifies which shell to use. On Unix, the default shell is /bin/sh. On Windows, the default shell is specified by the COMSPEC environment variable. The only reason you would need to specify shell=True on Windows is where the command you wish to execute is actually built in to the shell, e.g. dir or copy. You don't need shell=True to run a batch file, nor to run a console-based executable.
\nstdin, stdout and stderr specify the executed program’s standard input,\nstandard output and standard error file handles, respectively. Valid values\nare PIPE, an existing file descriptor (a positive integer), an\nexisting file object, and None. PIPE indicates that a new pipe\nto the child should be created. With the default settings of None, no\nredirection will occur; the child’s file handles will be inherited from the\nparent. Additionally, stderr can be STDOUT, which indicates that\nthe stderr data from the child process should be captured into the same file\nhandle as for stdout.
\nIf preexec_fn is set to a callable object, this object will be called in the\nchild process just before the child is executed. (Unix only)
\nIf close_fds is true, all file descriptors except 0, 1 and\n2 will be closed before the child process is executed. (Unix only).\nOr, on Windows, if close_fds is true then no handles will be inherited by the\nchild process. Note that on Windows, you cannot set close_fds to true and\nalso redirect the standard handles by setting stdin, stdout or stderr.
\nIf shell is True, the specified command will be executed through the\nshell.
\nWarning
\nEnabling this option can be a security hazard if combined with untrusted\ninput. See the warning under Frequently Used Arguments\nfor details.
\nIf cwd is not None, the child’s current directory will be changed to cwd\nbefore it is executed. Note that this directory is not considered when\nsearching the executable, so you can’t specify the program’s path relative to\ncwd.
\nIf env is not None, it must be a mapping that defines the environment\nvariables for the new process; these are used instead of inheriting the current\nprocess’ environment, which is the default behavior.
\nNote
\nIf specified, env must provide any variables required\nfor the program to execute. On Windows, in order to run a\nside-by-side assembly the specified env must include a valid\nSystemRoot.
\nIf universal_newlines is True, the file objects stdout and stderr are\nopened as text files, but lines may be terminated by any of '\\n', the Unix\nend-of-line convention, '\\r', the old Macintosh convention or '\\r\\n', the\nWindows convention. All of these external representations are seen as '\\n'\nby the Python program.
\nNote
\nThis feature is only available if Python is built with universal newline\nsupport (the default). Also, the newlines attribute of the file objects\nstdout, stdin and stderr are not updated by the\ncommunicate() method.
\nIf given, startupinfo will be a STARTUPINFO object, which is\npassed to the underlying CreateProcess function.\ncreationflags, if given, can be CREATE_NEW_CONSOLE or\nCREATE_NEW_PROCESS_GROUP. (Windows only)
\nExceptions raised in the child process, before the new program has started to\nexecute, will be re-raised in the parent. Additionally, the exception object\nwill have one extra attribute called child_traceback, which is a string\ncontaining traceback information from the child’s point of view.
\nThe most common exception raised is OSError. This occurs, for example,\nwhen trying to execute a non-existent file. Applications should prepare for\nOSError exceptions.
\nA ValueError will be raised if Popen is called with invalid\narguments.
\ncheck_call() and check_output() will raise\nCalledProcessError if the called process returns a non-zero return\ncode.
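A sketch of the failure path (the exit status 3 and the output string are arbitrary choices for illustration):

```python
import sys
from subprocess import check_call, check_output, CalledProcessError

# A child that exits non-zero makes check_call() raise
# CalledProcessError carrying the return code.
try:
    check_call([sys.executable, "-c", "import sys; sys.exit(3)"])
except CalledProcessError as e:
    rc = e.returncode          # 3

# check_output() additionally saves whatever the child printed
# before failing, in the exception's output attribute.
try:
    check_output([sys.executable, "-c",
                  "import sys; sys.stdout.write('partial'); sys.exit(1)"])
except CalledProcessError as e:
    saved = e.output           # 'partial' (a byte string on Python 3)
```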
\nUnlike some other popen functions, this implementation will never call a\nsystem shell implicitly. This means that all characters, including shell\nmetacharacters, can safely be passed to child processes. Obviously, if the\nshell is invoked explicitly, then it is the application’s responsibility to\nensure that all whitespace and metacharacters are quoted appropriately.
\nInstances of the Popen class have the following methods:
\nWait for child process to terminate. Set and return returncode\nattribute.
\nWarning
\nThis will deadlock when using stdout=PIPE and/or\nstderr=PIPE and the child process generates enough output to\na pipe such that it blocks waiting for the OS pipe buffer to\naccept more data. Use communicate() to avoid that.
\nInteract with process: Send data to stdin. Read data from stdout and stderr,\nuntil end-of-file is reached. Wait for process to terminate. The optional\ninput argument should be a string to be sent to the child process, or\nNone, if no data should be sent to the child.
\ncommunicate() returns a tuple (stdoutdata, stderrdata).
\nNote that if you want to send data to the process’s stdin, you need to create\nthe Popen object with stdin=PIPE. Similarly, to get anything other than\nNone in the result tuple, you need to give stdout=PIPE and/or\nstderr=PIPE too.
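A minimal sketch of this round trip, using a Python one-liner as the child so the example is self-contained:

```python
import sys
from subprocess import Popen, PIPE

# stdin, stdout and stderr must all be PIPE for communicate() to
# feed the input and capture both output streams.
p = Popen([sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read().upper())"],
          stdin=PIPE, stdout=PIPE, stderr=PIPE)
out, err = p.communicate(b"hello")   # byte strings on Python 3
# out == b'HELLO', err == b''
```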
\nNote
\nThe data read is buffered in memory, so do not use this method if the data\nsize is large or unlimited.
\nSends the signal signal to the child.
\nNote
\nOn Windows, SIGTERM is an alias for terminate(). CTRL_C_EVENT and\nCTRL_BREAK_EVENT can be sent to processes started with a creationflags\nparameter which includes CREATE_NEW_PROCESS_GROUP.
\n\nNew in version 2.6.
\nStop the child. On Posix OSs the method sends SIGTERM to the\nchild. On Windows the Win32 API function TerminateProcess() is called\nto stop the child.
\n\nNew in version 2.6.
\nKills the child. On Posix OSs the function sends SIGKILL to the child.\nOn Windows kill() is an alias for terminate().
\n\nNew in version 2.6.
\nThe following attributes are also available:
\nWarning
\nUse communicate() rather than .stdin.write,\n.stdout.read or .stderr.read to avoid\ndeadlocks due to any of the other OS pipe buffers filling up and blocking the\nchild process.
\nThe process ID of the child process.
\nNote that if you set the shell argument to True, this is the process ID\nof the spawned shell.
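The following sketch has the child report its own process ID, which matches pid when no shell is interposed:

```python
import sys
from subprocess import Popen, PIPE

# Without shell=True the child is the program itself, so the PID it
# reports is exactly Popen.pid.
p = Popen([sys.executable, "-c",
           "import os, sys; sys.stdout.write(str(os.getpid()))"],
          stdout=PIPE)
out, _ = p.communicate()
# int(out) == p.pid
```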
\nThe child return code, set by poll() and wait() (and indirectly\nby communicate()). A None value indicates that the process\nhasn’t terminated yet.
\nA negative value -N indicates that the child was terminated by signal\nN (Unix only).
\nThe STARTUPINFO class and following constants are only available\non Windows.
\nPartial support of the Windows\nSTARTUPINFO\nstructure is used for Popen creation.
\nA bit field that determines whether certain STARTUPINFO\nattributes are used when the process creates a window.
\nsi = subprocess.STARTUPINFO()\nsi.dwFlags = subprocess.STARTF_USESTDHANDLES | subprocess.STARTF_USESHOWWINDOW\n
If dwFlags specifies STARTF_USESHOWWINDOW, this attribute\ncan be any of the values that can be specified in the nCmdShow\nparameter for the\nShowWindow\nfunction, except for SW_SHOWDEFAULT. Otherwise, this attribute is\nignored.
\nSW_HIDE is provided for this attribute. It is used when\nPopen is called with shell=True.
\nThe subprocess module exposes the following constants.
\nThe new process has a new console, instead of inheriting its parent’s\nconsole (the default).
\nThis flag is always set when Popen is created with shell=True.
\nA Popen creationflags parameter to specify that a new process\ngroup will be created. This flag is necessary for using os.kill()\non the subprocess.
\nThis flag is ignored if CREATE_NEW_CONSOLE is specified.
\nIn this section, “a becomes b” means that b can be used as a replacement for a.
\nNote
\nAll “a” functions in this section fail (more or less) silently if the\nexecuted program cannot be found; the “b” replacements raise OSError\ninstead.
\nIn addition, the replacements using check_output() will fail with a\nCalledProcessError if the requested operation produces a non-zero\nreturn code. The output is still available as the output attribute of\nthe raised exception.
\nIn the following examples, we assume that the relevant functions have already\nbeen imported from the subprocess module.
\noutput=`mycmd myarg`\n# becomes\noutput = check_output([\"mycmd\", \"myarg\"])
\noutput=`dmesg | grep hda`\n# becomes\np1 = Popen([\"dmesg\"], stdout=PIPE)\np2 = Popen([\"grep\", \"hda\"], stdin=p1.stdout, stdout=PIPE)\np1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.\noutput = p2.communicate()[0]
\nThe p1.stdout.close() call after starting the p2 is important in order for p1\nto receive a SIGPIPE if p2 exits before p1.
\nAlternatively, for trusted input, the shell’s own pipeline support may still\nbe used directly:
\n\noutput=`dmesg | grep hda`\n# becomes\noutput=check_output(\"dmesg | grep hda\", shell=True)\n
sts = os.system("mycmd" + " myarg")\n# becomes\nsts = call("mycmd" + " myarg", shell=True)\n
Notes:
\nA more realistic example would look like this:
\ntry:\n retcode = call("mycmd" + " myarg", shell=True)\n if retcode < 0:\n print >>sys.stderr, "Child was terminated by signal", -retcode\n else:\n print >>sys.stderr, "Child returned", retcode\nexcept OSError, e:\n print >>sys.stderr, "Execution failed:", e\n
P_NOWAIT example:
\npid = os.spawnlp(os.P_NOWAIT, \"/bin/mycmd\", \"mycmd\", \"myarg\")\n==>\npid = Popen([\"/bin/mycmd\", \"myarg\"]).pid
\nP_WAIT example:
\nretcode = os.spawnlp(os.P_WAIT, \"/bin/mycmd\", \"mycmd\", \"myarg\")\n==>\nretcode = call([\"/bin/mycmd\", \"myarg\"])
\nVector example:
\nos.spawnvp(os.P_NOWAIT, path, args)\n==>\nPopen([path] + args[1:])
\nEnvironment example:
\nos.spawnlpe(os.P_NOWAIT, \"/bin/mycmd\", \"mycmd\", \"myarg\", env)\n==>\nPopen([\"/bin/mycmd\", \"myarg\"], env={\"PATH\": \"/usr/bin\"})
\npipe = os.popen(\"cmd\", 'r', bufsize)\n==>\npipe = Popen(\"cmd\", shell=True, bufsize=bufsize, stdout=PIPE).stdout
\npipe = os.popen(\"cmd\", 'w', bufsize)\n==>\npipe = Popen(\"cmd\", shell=True, bufsize=bufsize, stdin=PIPE).stdin
\n(child_stdin, child_stdout) = os.popen2(\"cmd\", mode, bufsize)\n==>\np = Popen(\"cmd\", shell=True, bufsize=bufsize,\n stdin=PIPE, stdout=PIPE, close_fds=True)\n(child_stdin, child_stdout) = (p.stdin, p.stdout)
\n(child_stdin,\n child_stdout,\n child_stderr) = os.popen3(\"cmd\", mode, bufsize)\n==>\np = Popen(\"cmd\", shell=True, bufsize=bufsize,\n stdin=PIPE, stdout=PIPE, stderr=PIPE, close_fds=True)\n(child_stdin,\n child_stdout,\n child_stderr) = (p.stdin, p.stdout, p.stderr)
\n(child_stdin, child_stdout_and_stderr) = os.popen4(\"cmd\", mode,\n bufsize)\n==>\np = Popen(\"cmd\", shell=True, bufsize=bufsize,\n stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)\n(child_stdin, child_stdout_and_stderr) = (p.stdin, p.stdout)
\nOn Unix, os.popen2, os.popen3 and os.popen4 also accept a sequence as\nthe command to execute, in which case arguments will be passed\ndirectly to the program without shell intervention. This usage can be\nreplaced as follows:
\n(child_stdin, child_stdout) = os.popen2([\"/bin/ls\", \"-l\"], mode,\n bufsize)\n==>\np = Popen([\"/bin/ls\", \"-l\"], bufsize=bufsize, stdin=PIPE, stdout=PIPE)\n(child_stdin, child_stdout) = (p.stdin, p.stdout)
\nReturn code handling translates as follows:
\npipe = os.popen(\"cmd\", 'w')\n...\nrc = pipe.close()\nif rc is not None and rc >> 8:\n print \"There were some errors\"\n==>\nprocess = Popen(\"cmd\", 'w', shell=True, stdin=PIPE)\n...\nprocess.stdin.close()\nif process.wait() != 0:\n print \"There were some errors\"
\n(child_stdout, child_stdin) = popen2.popen2(\"somestring\", bufsize, mode)\n==>\np = Popen([\"somestring\"], shell=True, bufsize=bufsize,\n stdin=PIPE, stdout=PIPE, close_fds=True)\n(child_stdout, child_stdin) = (p.stdout, p.stdin)
\nOn Unix, popen2 also accepts a sequence as the command to execute, in\nwhich case arguments will be passed directly to the program without\nshell intervention. This usage can be replaced as follows:
\n(child_stdout, child_stdin) = popen2.popen2([\"mycmd\", \"myarg\"], bufsize,\n mode)\n==>\np = Popen([\"mycmd\", \"myarg\"], bufsize=bufsize,\n stdin=PIPE, stdout=PIPE, close_fds=True)\n(child_stdout, child_stdin) = (p.stdout, p.stdin)
\npopen2.Popen3 and popen2.Popen4 basically work as\nsubprocess.Popen, except that:
\n\nOn Windows, an args sequence is converted to a string that can be parsed\nusing the following rules (which correspond to the rules used by the MS C\nruntime):
\n\nNew in version 2.6.
\nmultiprocessing is a package that supports spawning processes using an\nAPI similar to the threading module. The multiprocessing package\noffers both local and remote concurrency, effectively side-stepping the\nGlobal Interpreter Lock by using subprocesses instead of threads. Due\nto this, the multiprocessing module allows the programmer to fully\nleverage multiple processors on a given machine. It runs on both Unix and\nWindows.
\nWarning
\nSome of this package’s functionality requires a functioning shared semaphore\nimplementation on the host operating system. Without one, the\nmultiprocessing.synchronize module will be disabled, and attempts to\nimport it will result in an ImportError. See\nissue 3770 for additional information.
\nNote
\nFunctionality within this package requires that the __main__ module be\nimportable by the children. This is covered in Programming guidelines\nhowever it is worth pointing out here. This means that some examples, such\nas the multiprocessing.Pool examples will not work in the\ninteractive interpreter. For example:
\n>>> from multiprocessing import Pool\n>>> p = Pool(5)\n>>> def f(x):\n... return x*x\n...\n>>> p.map(f, [1,2,3])\nProcess PoolWorker-1:\nProcess PoolWorker-2:\nProcess PoolWorker-3:\nTraceback (most recent call last):\nAttributeError: 'module' object has no attribute 'f'\nAttributeError: 'module' object has no attribute 'f'\nAttributeError: 'module' object has no attribute 'f'\n
(If you try this it will actually output three full tracebacks\ninterleaved in a semi-random fashion, and then you may have to\nstop the master process somehow.)
\nIn multiprocessing, processes are spawned by creating a Process\nobject and then calling its start() method. Process\nfollows the API of threading.Thread. A trivial example of a\nmultiprocess program is
\nfrom multiprocessing import Process\n\ndef f(name):\n print 'hello', name\n\nif __name__ == '__main__':\n p = Process(target=f, args=('bob',))\n p.start()\n p.join()\n
To show the individual process IDs involved, here is an expanded example:
\nfrom multiprocessing import Process\nimport os\n\ndef info(title):\n print title\n print 'module name:', __name__\n print 'parent process:', os.getppid()\n print 'process id:', os.getpid()\n\ndef f(name):\n info('function f')\n print 'hello', name\n\nif __name__ == '__main__':\n info('main line')\n p = Process(target=f, args=('bob',))\n p.start()\n p.join()\n
For an explanation of why (on Windows) the if __name__ == '__main__' part is\nnecessary, see Programming guidelines.
\nmultiprocessing supports two types of communication channel between\nprocesses:
\nQueues
\n\n\nThe Queue class is a near clone of Queue.Queue. For\nexample:
\n\n\nfrom multiprocessing import Process, Queue\n\ndef f(q):\n q.put([42, None, 'hello'])\n\nif __name__ == '__main__':\n q = Queue()\n p = Process(target=f, args=(q,))\n p.start()\n print q.get() # prints "[42, None, 'hello']"\n p.join()\nQueues are thread and process safe.
\n
Pipes
\n\n\nThe Pipe() function returns a pair of connection objects connected by a\npipe which by default is duplex (two-way). For example:
\n\n\nfrom multiprocessing import Process, Pipe\n\ndef f(conn):\n conn.send([42, None, 'hello'])\n conn.close()\n\nif __name__ == '__main__':\n parent_conn, child_conn = Pipe()\n p = Process(target=f, args=(child_conn,))\n p.start()\n print parent_conn.recv() # prints "[42, None, 'hello']"\n p.join()\nThe two connection objects returned by Pipe() represent the two ends of\nthe pipe. Each connection object has send() and\nrecv() methods (among others). Note that data in a pipe\nmay become corrupted if two processes (or threads) try to read from or write\nto the same end of the pipe at the same time. Of course there is no risk\nof corruption from processes using different ends of the pipe at the same\ntime.
\n
multiprocessing contains equivalents of all the synchronization\nprimitives from threading. For instance one can use a lock to ensure\nthat only one process prints to standard output at a time:
\nfrom multiprocessing import Process, Lock\n\ndef f(l, i):\n l.acquire()\n print 'hello world', i\n l.release()\n\nif __name__ == '__main__':\n lock = Lock()\n\n for num in range(10):\n Process(target=f, args=(lock, num)).start()\n
Without using the lock output from the different processes is liable to get all\nmixed up.
\nAs mentioned above, when doing concurrent programming it is usually best to\navoid using shared state as far as possible. This is particularly true when\nusing multiple processes.
\nHowever, if you really do need to use some shared data then\nmultiprocessing provides a couple of ways of doing so.
\nShared memory
\n\n\nData can be stored in a shared memory map using Value or\nArray. For example, the following code
\n\n\nfrom multiprocessing import Process, Value, Array\n\ndef f(n, a):\n n.value = 3.1415927\n for i in range(len(a)):\n a[i] = -a[i]\n\nif __name__ == '__main__':\n num = Value('d', 0.0)\n arr = Array('i', range(10))\n\n p = Process(target=f, args=(num, arr))\n p.start()\n p.join()\n\n print num.value\n print arr[:]\nwill print
\n\n\n3.1415927\n[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]\nThe 'd' and 'i' arguments used when creating num and arr are\ntypecodes of the kind used by the array module: 'd' indicates a\ndouble precision float and 'i' indicates a signed integer. These shared\nobjects will be process and thread-safe.
\nFor more flexibility in using shared memory one can use the\nmultiprocessing.sharedctypes module which supports the creation of\narbitrary ctypes objects allocated from shared memory.
\n
Server process
\n\n\nA manager object returned by Manager() controls a server process which\nholds Python objects and allows other processes to manipulate them using\nproxies.
\nA manager returned by Manager() will support types list,\ndict, Namespace, Lock, RLock,\nSemaphore, BoundedSemaphore, Condition,\nEvent, Queue, Value and Array. For\nexample,
\n\n\nfrom multiprocessing import Process, Manager\n\ndef f(d, l):\n d[1] = '1'\n d['2'] = 2\n d[0.25] = None\n l.reverse()\n\nif __name__ == '__main__':\n manager = Manager()\n\n d = manager.dict()\n l = manager.list(range(10))\n\n p = Process(target=f, args=(d, l))\n p.start()\n p.join()\n\n print d\n print l\nwill print
\n\n\n{0.25: None, 1: '1', '2': 2}\n[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]\nServer process managers are more flexible than using shared memory objects\nbecause they can be made to support arbitrary object types. Also, a single\nmanager can be shared by processes on different computers over a network.\nThey are, however, slower than using shared memory.
\n
The Pool class represents a pool of worker\nprocesses. It has methods which allows tasks to be offloaded to the worker\nprocesses in a few different ways.
\nFor example:
\nfrom multiprocessing import Pool\n\ndef f(x):\n return x*x\n\nif __name__ == '__main__':\n pool = Pool(processes=4) # start 4 worker processes\n result = pool.apply_async(f, [10]) # evaluate "f(10)" asynchronously\n print result.get(timeout=1) # prints "100" unless your computer is *very* slow\n print pool.map(f, range(10)) # prints "[0, 1, 4,..., 81]"\n
The multiprocessing package mostly replicates the API of the\nthreading module.
\nProcess objects represent activity that is run in a separate process. The\nProcess class has equivalents of all the methods of\nthreading.Thread.
\nThe constructor should always be called with keyword arguments. group should always be None; it exists solely for compatibility with threading.Thread. target is the callable object to be invoked by the run() method. It defaults to None, meaning nothing is called. name is the process name. By default, a unique name is constructed of the form 'Process-N1:N2:...:Nk' where N1,N2,...,Nk is a sequence of integers whose length is determined by the generation of the process. args is the argument tuple for the target invocation. kwargs is a dictionary of keyword arguments for the target invocation. By default, no arguments are passed to target.
\nIf a subclass overrides the constructor, it must make sure it invokes the\nbase class constructor (Process.__init__()) before doing anything else\nto the process.
\nMethod representing the process’s activity.
\nYou may override this method in a subclass. The standard run()\nmethod invokes the callable object passed to the object’s constructor as\nthe target argument, if any, with sequential and keyword arguments taken\nfrom the args and kwargs arguments, respectively.
\nStart the process’s activity.
\nThis must be called at most once per process object. It arranges for the\nobject’s run() method to be invoked in a separate process.
\nBlock the calling thread until the process whose join() method is\ncalled terminates or until the optional timeout occurs.
\nIf timeout is None then there is no timeout.
\nA process can be joined many times.
\nA process cannot join itself because this would cause a deadlock. It is\nan error to attempt to join a process before it has been started.
\nThe process’s name.
\nThe name is a string used for identification purposes only. It has no\nsemantics. Multiple processes may be given the same name. The initial\nname is set by the constructor.
\nReturn whether the process is alive.
\nRoughly, a process object is alive from the moment the start()\nmethod returns until the child process terminates.
\nThe process’s daemon flag, a Boolean value. This must be set before\nstart() is called.
\nThe initial value is inherited from the creating process.
\nWhen a process exits, it attempts to terminate all of its daemonic child\nprocesses.
\nNote that a daemonic process is not allowed to create child processes.\nOtherwise a daemonic process would leave its children orphaned if it gets\nterminated when its parent process exits. Additionally, these are not\nUnix daemons or services, they are normal processes that will be\nterminated (and not joined) if non-daemonic processes have exited.
\nIn addition to the threading.Thread API, Process objects also support the following attributes and methods:
\nThe process’s authentication key (a byte string).
\nWhen multiprocessing is initialized the main process is assigned a random string using os.urandom().
\nWhen a Process object is created, it will inherit the\nauthentication key of its parent process, although this may be changed by\nsetting authkey to another byte string.
\nSee Authentication keys.
\nTerminate the process. On Unix this is done using the SIGTERM signal;\non Windows TerminateProcess() is used. Note that exit handlers and\nfinally clauses, etc., will not be executed.
\nNote that descendant processes of the process will not be terminated –\nthey will simply become orphaned.
\nWarning
\nIf this method is used when the associated process is using a pipe or\nqueue then the pipe or queue is liable to become corrupted and may\nbecome unusable by other process. Similarly, if the process has\nacquired a lock or semaphore etc. then terminating it is liable to\ncause other processes to deadlock.
\nNote that the start(), join(), is_alive() and terminate() methods and the exitcode attribute should only be used by the process that created the process object.
\nExample usage of some of the methods of Process:
\n>>> import multiprocessing, time, signal\n>>> p = multiprocessing.Process(target=time.sleep, args=(1000,))\n>>> print p, p.is_alive()\n<Process(Process-1, initial)> False\n>>> p.start()\n>>> print p, p.is_alive()\n<Process(Process-1, started)> True\n>>> p.terminate()\n>>> time.sleep(0.1)\n>>> print p, p.is_alive()\n<Process(Process-1, stopped[SIGTERM])> False\n>>> p.exitcode == -signal.SIGTERM\nTrue\n
Exception raised by Connection.recv_bytes_into() when the supplied\nbuffer object is too small for the message read.
\nIf e is an instance of BufferTooShort then e.args[0] will give\nthe message as a byte string.
\nWhen using multiple processes, one generally uses message passing for\ncommunication between processes and avoids having to use any synchronization\nprimitives like locks.
\nFor passing messages one can use Pipe() (for a connection between two\nprocesses) or a queue (which allows multiple producers and consumers).
\nThe Queue and JoinableQueue types are multi-producer,\nmulti-consumer FIFO queues modelled on the Queue.Queue class in the\nstandard library. They differ in that Queue lacks the\ntask_done() and join() methods introduced\ninto Python 2.5’s Queue.Queue class.
\nIf you use JoinableQueue then you must call\nJoinableQueue.task_done() for each task removed from the queue or else the\nsemaphore used to count the number of unfinished tasks may eventually overflow,\nraising an exception.
\nNote that one can also create a shared queue by using a manager object – see\nManagers.
\nNote
\nmultiprocessing uses the usual Queue.Empty and\nQueue.Full exceptions to signal a timeout. They are not available in\nthe multiprocessing namespace so you need to import them from\nQueue.
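For example, a timed-out get() can be caught as follows (the module is named Queue on Python 2, as in this documentation, and queue on Python 3):

```python
from multiprocessing import Queue

try:
    from Queue import Empty        # Python 2 module name
except ImportError:
    from queue import Empty        # Python 3 module name

q = Queue()
try:
    q.get(timeout=0.1)             # queue is empty, so this times out
    timed_out = False
except Empty:
    timed_out = True               # Empty signals the timeout
```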
\nWarning
\nIf a process is killed using Process.terminate() or os.kill()\nwhile it is trying to use a Queue, then the data in the queue is\nlikely to become corrupted. This may cause any other process to get an\nexception when it tries to use the queue later on.
\nWarning
\nAs mentioned above, if a child process has put items on a queue (and it has\nnot used JoinableQueue.cancel_join_thread()), then that process will\nnot terminate until all buffered items have been flushed to the pipe.
\nThis means that if you try joining that process you may get a deadlock unless\nyou are sure that all items which have been put on the queue have been\nconsumed. Similarly, if the child process is non-daemonic then the parent\nprocess may hang on exit when it tries to join all its non-daemonic children.
\nNote that a queue created using a manager does not have this issue. See\nProgramming guidelines.
\nFor an example of the usage of queues for interprocess communication see\nExamples.
\nReturns a pair (conn1, conn2) of Connection objects representing\nthe ends of a pipe.
\nIf duplex is True (the default) then the pipe is bidirectional. If\nduplex is False then the pipe is unidirectional: conn1 can only be\nused for receiving messages and conn2 can only be used for sending\nmessages.
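A sketch within a single process (no child is spawned, to keep the example minimal):

```python
from multiprocessing import Pipe

# With duplex=False the first connection is receive-only and the
# second is send-only.
recv_end, send_end = Pipe(duplex=False)
send_end.send({"answer": 42})
msg = recv_end.recv()              # {'answer': 42}
```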
\nReturns a process shared queue implemented using a pipe and a few\nlocks/semaphores. When a process first puts an item on the queue a feeder\nthread is started which transfers objects from a buffer into the pipe.
\nThe usual Queue.Empty and Queue.Full exceptions from the\nstandard library’s Queue module are raised to signal timeouts.
\nQueue implements all the methods of Queue.Queue except for\ntask_done() and join().
\nReturn the approximate size of the queue. Because of\nmultithreading/multiprocessing semantics, this number is not reliable.
\nNote that this may raise NotImplementedError on Unix platforms like\nMac OS X where sem_getvalue() is not implemented.
\nmultiprocessing.Queue has a few additional methods not found in\nQueue.Queue. These methods are usually unnecessary for most\ncode:
\nJoin the background thread. This can only be used after close() has\nbeen called. It blocks until the background thread exits, ensuring that\nall data in the buffer has been flushed to the pipe.
\nBy default if a process is not the creator of the queue then on exit it\nwill attempt to join the queue’s background thread. The process can call\ncancel_join_thread() to make join_thread() do nothing.
\nJoinableQueue, a Queue subclass, is a queue which\nadditionally has task_done() and join() methods.
\nIndicate that a formerly enqueued task is complete. Used by queue consumer\nthreads. For each get() used to fetch a task, a subsequent\ncall to task_done() tells the queue that the processing on the task\nis complete.
\nIf a join() is currently blocking, it will resume when all\nitems have been processed (meaning that a task_done() call was\nreceived for every item that had been put() into the queue).
\nRaises a ValueError if called more times than there were items\nplaced in the queue.
\nBlock until all items in the queue have been gotten and processed.
\nThe count of unfinished tasks goes up whenever an item is added to the\nqueue. The count goes down whenever a consumer thread calls\ntask_done() to indicate that the item was retrieved and all work on\nit is complete. When the count of unfinished tasks drops to zero,\njoin() unblocks.
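The bookkeeping can be sketched in a single process (no consumer processes are spawned, to keep the example minimal):

```python
from multiprocessing import JoinableQueue

q = JoinableQueue()
for item in (1, 2, 3):
    q.put(item)

total = 0
for _ in range(3):
    total += q.get()
    q.task_done()          # one call per item fetched

q.join()                   # returns at once: no unfinished tasks remain

try:
    q.task_done()          # one call too many
except ValueError:
    over_called = True
```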
\nReturn list of all live children of the current process.
\nCalling this has the side effect of “joining” any processes which have already finished.
\nReturn the Process object corresponding to the current process.
\nAn analogue of threading.current_thread().
\nAdd support for when a program which uses multiprocessing has been\nfrozen to produce a Windows executable. (Has been tested with py2exe,\nPyInstaller and cx_Freeze.)
\nOne needs to call this function straight after the if __name__ ==\n'__main__' line of the main module. For example:
\nfrom multiprocessing import Process, freeze_support\n\ndef f():\n print 'hello world!'\n\nif __name__ == '__main__':\n freeze_support()\n Process(target=f).start()\n
If the freeze_support() line is omitted then trying to run the frozen\nexecutable will raise RuntimeError.
\nIf the module is being run normally by the Python interpreter then\nfreeze_support() has no effect.
\nSets the path of the Python interpreter to use when starting a child process. (By default sys.executable is used). Embedders will probably need to do something like
\nset_executable(os.path.join(sys.exec_prefix, 'pythonw.exe'))\n
before they can create child processes. (Windows only)
\nNote
\nmultiprocessing contains no analogues of\nthreading.active_count(), threading.enumerate(),\nthreading.settrace(), threading.setprofile(),\nthreading.Timer, or threading.local.
\nConnection objects allow the sending and receiving of picklable objects or\nstrings. They can be thought of as message oriented connected sockets.
\nConnection objects are usually created using Pipe() – see also\nListeners and Clients.
\nSend an object to the other end of the connection which should be read\nusing recv().
\nThe object must be picklable. Very large pickles (approximately 32 MB+,\nthough it depends on the OS) may raise a ValueError exception.
\nClose the connection.
\nThis is called automatically when the connection is garbage collected.
\nReturn whether there is any data available to be read.
\nIf timeout is not specified then it will return immediately. If\ntimeout is a number then this specifies the maximum time in seconds to\nblock. If timeout is None then an infinite timeout is used.
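For example (within one process, for brevity):

```python
from multiprocessing import Pipe

a, b = Pipe()
empty = b.poll(0)          # False: nothing has been sent yet
a.send("ping")
ready = b.poll()           # True: data is now waiting
got = b.recv()             # 'ping'
```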
\nSend byte data from an object supporting the buffer interface as a\ncomplete message.
\nIf offset is given then data is read from that position in buffer. If size is given then that many bytes will be read from buffer. Very large buffers (approximately 32 MB+, though it depends on the OS) may raise a ValueError exception.
\nReturn a complete message of byte data sent from the other end of the\nconnection as a string. Blocks until there is something to receive.\nRaises EOFError if there is nothing left\nto receive and the other end has closed.
\nIf maxlength is specified and the message is longer than maxlength\nthen IOError is raised and the connection will no longer be\nreadable.
\nRead into buffer a complete message of byte data sent from the other end\nof the connection and return the number of bytes in the message. Blocks\nuntil there is something to receive. Raises\nEOFError if there is nothing left to receive and the other end was\nclosed.
\nbuffer must be an object satisfying the writable buffer interface. If\noffset is given then the message will be written into the buffer from\nthat position. Offset must be a non-negative integer less than the\nlength of buffer (in bytes).
\nIf the buffer is too short then a BufferTooShort exception is\nraised and the complete message is available as e.args[0] where e\nis the exception instance.
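A sketch of that failure mode, within one process:

```python
import array
from multiprocessing import Pipe, BufferTooShort

a, b = Pipe()
a.send_bytes(b"hello world!")        # a 12-byte message
buf = array.array("b", [0] * 4)      # room for only 4 bytes
try:
    b.recv_bytes_into(buf)
except BufferTooShort as e:
    recovered = e.args[0]            # the complete message
```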
\nFor example:
\n>>> from multiprocessing import Pipe\n>>> a, b = Pipe()\n>>> a.send([1, 'hello', None])\n>>> b.recv()\n[1, 'hello', None]\n>>> b.send_bytes('thank you')\n>>> a.recv_bytes()\n'thank you'\n>>> import array\n>>> arr1 = array.array('i', range(5))\n>>> arr2 = array.array('i', [0] * 10)\n>>> a.send_bytes(arr1)\n>>> count = b.recv_bytes_into(arr2)\n>>> assert count == len(arr1) * arr1.itemsize\n>>> arr2\narray('i', [0, 1, 2, 3, 4, 0, 0, 0, 0, 0])\n
Warning
\nThe Connection.recv() method automatically unpickles the data it\nreceives, which can be a security risk unless you can trust the process\nwhich sent the message.
\nTherefore, unless the connection object was produced using Pipe() you\nshould only use the recv() and send()\nmethods after performing some sort of authentication. See\nAuthentication keys.
\nWarning
\nIf a process is killed while it is trying to read or write to a pipe then\nthe data in the pipe is likely to become corrupted, because it may become\nimpossible to be sure where the message boundaries lie.
\nGenerally synchronization primitives are not as necessary in a multiprocess program as they are in a multithreaded program. See the documentation for the threading module.
\nNote that one can also create synchronization primitives by using a manager\nobject – see Managers.
\nA bounded semaphore object: a clone of threading.BoundedSemaphore.
\n(On Mac OS X, this is indistinguishable from Semaphore because\nsem_getvalue() is not implemented on that platform).
\nA condition variable: a clone of threading.Condition.
\nIf lock is specified then it should be a Lock or RLock\nobject from multiprocessing.
\nA clone of threading.Event. The wait() method returns the state of the internal semaphore on exit, so it will always return True unless a timeout is given and the operation times out.
\n\nChanged in version 2.7: Previously, the method always returned None.
\nNote
\nThe acquire() method of BoundedSemaphore, Lock,\nRLock and Semaphore has a timeout parameter not supported\nby the equivalents in threading. The signature is\nacquire(block=True, timeout=None) with keyword parameters being\nacceptable. If block is True and timeout is not None then it\nspecifies a timeout in seconds. If block is False then timeout is\nignored.
\nOn Mac OS X, sem_timedwait is unsupported, so calling acquire() with\na timeout will emulate that function’s behavior using a sleeping loop.
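A sketch of the timeout parameter in action, within a single process (the second acquire cannot succeed because the lock is not reentrant):

```python
from multiprocessing import Lock

lock = Lock()
lock.acquire()                           # the lock is now held
# Unlike threading, a timeout can bound the second attempt.
got_it = lock.acquire(block=True, timeout=0.1)
# got_it is False: the attempt gave up after about 0.1 seconds
lock.release()
```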
\nNote
\nIf the SIGINT signal generated by Ctrl-C arrives while the main thread is\nblocked by a call to BoundedSemaphore.acquire(), Lock.acquire(),\nRLock.acquire(), Semaphore.acquire(), Condition.acquire()\nor Condition.wait() then the call will be immediately interrupted and\nKeyboardInterrupt will be raised.
\nThis differs from the behaviour of threading where SIGINT will be\nignored while the equivalent blocking calls are in progress.
\nManagers provide a way to create data which can be shared between different\nprocesses. A manager object controls a server process which manages shared\nobjects. Other processes can access the shared objects by using proxies.
\nManager processes will be shutdown as soon as they are garbage collected or\ntheir parent process exits. The manager classes are defined in the\nmultiprocessing.managers module:
\nCreate a BaseManager object.
\nOnce created one should call start() or get_server().serve_forever() to ensure\nthat the manager object refers to a started manager process.
\naddress is the address on which the manager process listens for new\nconnections. If address is None then an arbitrary one is chosen.
\nauthkey is the authentication key which will be used to check the validity\nof incoming connections to the server process. If authkey is None then\ncurrent_process().authkey. Otherwise authkey is used and it\nmust be a string.
\nReturns a Server object which represents the actual server under\nthe control of the Manager. The Server object supports the\nserve_forever() method:
\n>>> from multiprocessing.managers import BaseManager\n>>> manager = BaseManager(address=('', 50000), authkey='abc')\n>>> server = manager.get_server()\n>>> server.serve_forever()\n
Server additionally has an address attribute.
\nConnect a local manager object to a remote manager process:
\n>>> from multiprocessing.managers import BaseManager\n>>> m = BaseManager(address=('127.0.0.1', 5000), authkey='abc')\n>>> m.connect()\n
Stop the process used by the manager. This is only available if\nstart() has been used to start the server process.
\nThis can be called multiple times.
\nA classmethod which can be used for registering a type or callable with\nthe manager class.
\ntypeid is a “type identifier” which is used to identify a particular\ntype of shared object. This must be a string.
\ncallable is a callable used for creating objects for this type\nidentifier. If a manager instance will be created using the\nfrom_address() classmethod or if the create_method argument is\nFalse then this can be left as None.
\nproxytype is a subclass of BaseProxy which is used to create\nproxies for shared objects with this typeid. If None then a proxy\nclass is created automatically.
\nexposed is used to specify a sequence of method names which proxies for\nthis typeid should be allowed to access using\nBaseProxy._callMethod(). (If exposed is None then\nproxytype._exposed_ is used instead if it exists.) In the case\nwhere no exposed list is specified, all “public methods” of the shared\nobject will be accessible. (Here a “public method” means any attribute\nwhich has a __call__() method and whose name does not begin with\n'_'.)
\nmethod_to_typeid is a mapping used to specify the return type of those\nexposed methods which should return a proxy. It maps method names to\ntypeid strings. (If method_to_typeid is None then\nproxytype._method_to_typeid_ is used instead if it exists.) If a\nmethod’s name is not a key of this mapping or if the mapping is None\nthen the object returned by the method will be copied by value.
\ncreate_method determines whether a method should be created with name\ntypeid which can be used to tell the server process to create a new\nshared object and return a proxy for it. By default it is True.
\nBaseManager instances also have one read-only property:
\nA subclass of BaseManager which can be used for the synchronization\nof processes. Objects of this type are returned by\nmultiprocessing.Manager().
\nIt also supports creation of shared lists and dictionaries.
\nCreate a shared threading.Condition object and return a proxy for\nit.
\nIf lock is supplied then it should be a proxy for a\nthreading.Lock or threading.RLock object.
\nNote
\nModifications to mutable values or items in dict and list proxies will not\nbe propagated through the manager, because the proxy has no way of knowing\nwhen its values or items are modified. To modify such an item, you can\nre-assign the modified object to the container proxy:
\n# create a list proxy and append a mutable object (a dictionary)\nlproxy = manager.list()\nlproxy.append({})\n# now mutate the dictionary\nd = lproxy[0]\nd['a'] = 1\nd['b'] = 2\n# at this point, the changes to d are not yet synced, but by\n# reassigning the dictionary, the proxy is notified of the change\nlproxy[0] = d\n
A namespace object has no public methods, but does have writable attributes.\nIts representation shows the values of its attributes.
\nHowever, when using a proxy for a namespace object, an attribute beginning with\n'_' will be an attribute of the proxy and not an attribute of the referent:
\n>>> manager = multiprocessing.Manager()\n>>> Global = manager.Namespace()\n>>> Global.x = 10\n>>> Global.y = 'hello'\n>>> Global._z = 12.3 # this is an attribute of the proxy\n>>> print Global\nNamespace(x=10, y='hello')\n
To create one’s own manager, one creates a subclass of BaseManager and\nuses the register() classmethod to register new types or\ncallables with the manager class. For example:
\nfrom multiprocessing.managers import BaseManager\n\nclass MathsClass(object):\n def add(self, x, y):\n return x + y\n def mul(self, x, y):\n return x * y\n\nclass MyManager(BaseManager):\n pass\n\nMyManager.register('Maths', MathsClass)\n\nif __name__ == '__main__':\n manager = MyManager()\n manager.start()\n maths = manager.Maths()\n print maths.add(4, 3) # prints 7\n print maths.mul(7, 8) # prints 56\n
It is possible to run a manager server on one machine and have clients use it\nfrom other machines (assuming that the firewalls involved allow it).
\nRunning the following commands creates a server for a single shared queue which\nremote clients can access:
\n>>> from multiprocessing.managers import BaseManager\n>>> import Queue\n>>> queue = Queue.Queue()\n>>> class QueueManager(BaseManager): pass\n>>> QueueManager.register('get_queue', callable=lambda:queue)\n>>> m = QueueManager(address=('', 50000), authkey='abracadabra')\n>>> s = m.get_server()\n>>> s.serve_forever()\n
One client can access the server as follows:
\n>>> from multiprocessing.managers import BaseManager\n>>> class QueueManager(BaseManager): pass\n>>> QueueManager.register('get_queue')\n>>> m = QueueManager(address=('foo.bar.org', 50000), authkey='abracadabra')\n>>> m.connect()\n>>> queue = m.get_queue()\n>>> queue.put('hello')\n
Another client can also use it:
\n>>> from multiprocessing.managers import BaseManager\n>>> class QueueManager(BaseManager): pass\n>>> QueueManager.register('get_queue')\n>>> m = QueueManager(address=('foo.bar.org', 50000), authkey='abracadabra')\n>>> m.connect()\n>>> queue = m.get_queue()\n>>> queue.get()\n'hello'\n
Local processes can also access that queue, using the code from above on the\nclient to access it remotely:
\n>>> from multiprocessing import Process, Queue\n>>> from multiprocessing.managers import BaseManager\n>>> class Worker(Process):\n... def __init__(self, q):\n... self.q = q\n... super(Worker, self).__init__()\n... def run(self):\n... self.q.put('local hello')\n...\n>>> queue = Queue()\n>>> w = Worker(queue)\n>>> w.start()\n>>> class QueueManager(BaseManager): pass\n...\n>>> QueueManager.register('get_queue', callable=lambda: queue)\n>>> m = QueueManager(address=('', 50000), authkey='abracadabra')\n>>> s = m.get_server()\n>>> s.serve_forever()\n
A proxy is an object which refers to a shared object which lives (presumably)\nin a different process. The shared object is said to be the referent of the\nproxy. Multiple proxy objects may have the same referent.
\nA proxy object has methods which invoke corresponding methods of its referent\n(although not every method of the referent will necessarily be available through\nthe proxy). A proxy can usually be used in most of the same ways that its\nreferent can:
\n>>> from multiprocessing import Manager\n>>> manager = Manager()\n>>> l = manager.list([i*i for i in range(10)])\n>>> print l\n[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n>>> print repr(l)\n<ListProxy object, typeid 'list' at 0x...>\n>>> l[4]\n16\n>>> l[2:5]\n[4, 9, 16]\n
Notice that applying str() to a proxy will return the representation of\nthe referent, whereas applying repr() will return the representation of\nthe proxy.
\nAn important feature of proxy objects is that they are picklable so they can be\npassed between processes. Note, however, that if a proxy is sent to the\ncorresponding manager’s process then unpickling it will produce the referent\nitself. This means, for example, that one shared object can contain a second:
\n>>> a = manager.list()\n>>> b = manager.list()\n>>> a.append(b) # referent of a now contains referent of b\n>>> print a, b\n[[]] []\n>>> b.append('hello')\n>>> print a, b\n[['hello']] ['hello']\n
Note
\nThe proxy types in multiprocessing do nothing to support comparisons\nby value. So, for instance, we have:
\n>>> manager.list([1,2,3]) == [1,2,3]\nFalse\n
One should just use a copy of the referent instead when making comparisons.
\nProxy objects are instances of subclasses of BaseProxy.
\nCall and return the result of a method of the proxy’s referent.
\nIf proxy is a proxy whose referent is obj then the expression
\nproxy._callmethod(methodname, args, kwds)\n
will evaluate the expression
\ngetattr(obj, methodname)(*args, **kwds)\n
in the manager’s process.
\nThe returned value will be a copy of the result of the call or a proxy to\na new shared object – see documentation for the method_to_typeid\nargument of BaseManager.register().
\nIf an exception is raised by the call, then is re-raised by\n_callmethod(). If some other exception is raised in the manager’s\nprocess then this is converted into a RemoteError exception and is\nraised by _callmethod().
\nNote in particular that an exception will be raised if methodname has\nnot been exposed
\nAn example of the usage of _callmethod():
\n>>> l = manager.list(range(10))\n>>> l._callmethod('__len__')\n10\n>>> l._callmethod('__getslice__', (2, 7)) # equiv to `l[2:7]`\n[2, 3, 4, 5, 6]\n>>> l._callmethod('__getitem__', (20,)) # equiv to `l[20]`\nTraceback (most recent call last):\n...\nIndexError: list index out of range\n
Return a copy of the referent.
\nIf the referent is unpicklable then this will raise an exception.
\nA proxy object uses a weakref callback so that when it gets garbage collected it\nderegisters itself from the manager which owns its referent.
\nA shared object gets deleted from the manager process when there are no longer\nany proxies referring to it.
\nOne can create a pool of processes which will carry out tasks submitted to it\nwith the Pool class.
\nA process pool object which controls a pool of worker processes to which jobs\ncan be submitted. It supports asynchronous results with timeouts and\ncallbacks and has a parallel map implementation.
\nprocesses is the number of worker processes to use. If processes is\nNone then the number returned by cpu_count() is used. If\ninitializer is not None then each worker process will call\ninitializer(*initargs) when it starts.
\n\nNew in version 2.7: maxtasksperchild is the number of tasks a worker process can complete\nbefore it will exit and be replaced with a fresh worker process, to enable\nunused resources to be freed. The default maxtasksperchild is None, which\nmeans worker processes will live as long as the pool.
\nNote
\nWorker processes within a Pool typically live for the complete\nduration of the Pool’s work queue. A frequent pattern found in other\nsystems (such as Apache, mod_wsgi, etc) to free resources held by\nworkers is to allow a worker within a pool to complete only a set\namount of work before being exiting, being cleaned up and a new\nprocess spawned to replace the old one. The maxtasksperchild\nargument to the Pool exposes this ability to the end user.
\nA variant of the apply() method which returns a result object.
\nIf callback is specified then it should be a callable which accepts a\nsingle argument. When the result becomes ready callback is applied to\nit (unless the call failed). callback should complete immediately since\notherwise the thread which handles the results will get blocked.
\nA parallel equivalent of the map() built-in function (it supports only\none iterable argument though). It blocks until the result is ready.
\nThis method chops the iterable into a number of chunks which it submits to\nthe process pool as separate tasks. The (approximate) size of these\nchunks can be specified by setting chunksize to a positive integer.
\nA variant of the map() method which returns a result object.
\nIf callback is specified then it should be a callable which accepts a\nsingle argument. When the result becomes ready callback is applied to\nit (unless the call failed). callback should complete immediately since\notherwise the thread which handles the results will get blocked.
\nAn equivalent of itertools.imap().
\nThe chunksize argument is the same as the one used by the map()\nmethod. For very long iterables using a large value for chunksize can\nmake the job complete much faster than using the default value of\n1.
\nAlso if chunksize is 1 then the next() method of the iterator\nreturned by the imap() method has an optional timeout parameter:\nnext(timeout) will raise multiprocessing.TimeoutError if the\nresult cannot be returned within timeout seconds.
\nThe class of the result returned by Pool.apply_async() and\nPool.map_async().
\nThe following example demonstrates the use of a pool:
\nfrom multiprocessing import Pool\n\ndef f(x):\n return x*x\n\nif __name__ == '__main__':\n pool = Pool(processes=4) # start 4 worker processes\n\n result = pool.apply_async(f, (10,)) # evaluate "f(10)" asynchronously\n print result.get(timeout=1) # prints "100" unless your computer is *very* slow\n\n print pool.map(f, range(10)) # prints "[0, 1, 4,..., 81]"\n\n it = pool.imap(f, range(10))\n print it.next() # prints "0"\n print it.next() # prints "1"\n print it.next(timeout=1) # prints "4" unless your computer is *very* slow\n\n import time\n result = pool.apply_async(time.sleep, (10,))\n print result.get(timeout=1) # raises TimeoutError\n
Usually message passing between processes is done using queues or by using\nConnection objects returned by Pipe().
\nHowever, the multiprocessing.connection module allows some extra\nflexibility. It basically gives a high level message oriented API for dealing\nwith sockets or Windows named pipes, and also has support for digest\nauthentication using the hmac module.
\nSend a randomly generated message to the other end of the connection and wait\nfor a reply.
\nIf the reply matches the digest of the message using authkey as the key\nthen a welcome message is sent to the other end of the connection. Otherwise\nAuthenticationError is raised.
\nReceive a message, calculate the digest of the message using authkey as the\nkey, and then send the digest back.
\nIf a welcome message is not received, then AuthenticationError is\nraised.
\nAttempt to set up a connection to the listener which is using address\naddress, returning a Connection.
\nThe type of the connection is determined by family argument, but this can\ngenerally be omitted since it can usually be inferred from the format of\naddress. (See Address Formats)
\nIf authenticate is True or authkey is a string then digest\nauthentication is used. The key used for authentication will be either\nauthkey or current_process().authkey) if authkey is None.\nIf authentication fails then AuthenticationError is raised. See\nAuthentication keys.
\nA wrapper for a bound socket or Windows named pipe which is ‘listening’ for\nconnections.
\naddress is the address to be used by the bound socket or named pipe of the\nlistener object.
\nNote
\nIf an address of ‘0.0.0.0’ is used, the address will not be a connectable\nend point on Windows. If you require a connectable end-point,\nyou should use ‘127.0.0.1’.
\nfamily is the type of socket (or named pipe) to use. This can be one of\nthe strings 'AF_INET' (for a TCP socket), 'AF_UNIX' (for a Unix\ndomain socket) or 'AF_PIPE' (for a Windows named pipe). Of these only\nthe first is guaranteed to be available. If family is None then the\nfamily is inferred from the format of address. If address is also\nNone then a default is chosen. This default is the family which is\nassumed to be the fastest available. See\nAddress Formats. Note that if family is\n'AF_UNIX' and address is None then the socket will be created in a\nprivate temporary directory created using tempfile.mkstemp().
\nIf the listener object uses a socket then backlog (1 by default) is passed\nto the listen() method of the socket once it has been bound.
\nIf authenticate is True (False by default) or authkey is not\nNone then digest authentication is used.
\nIf authkey is a string then it will be used as the authentication key;\notherwise it must be None.
\nIf authkey is None and authenticate is True then\ncurrent_process().authkey is used as the authentication key. If\nauthkey is None and authenticate is False then no\nauthentication is done. If authentication fails then\nAuthenticationError is raised. See Authentication keys.
\nListener objects have the following read-only properties:
\nThe module defines two exceptions:
\nExamples
\nThe following server code creates a listener which uses 'secret password' as\nan authentication key. It then waits for a connection and sends some data to\nthe client:
\nfrom multiprocessing.connection import Listener\nfrom array import array\n\naddress = ('localhost', 6000) # family is deduced to be 'AF_INET'\nlistener = Listener(address, authkey='secret password')\n\nconn = listener.accept()\nprint 'connection accepted from', listener.last_accepted\n\nconn.send([2.25, None, 'junk', float])\n\nconn.send_bytes('hello')\n\nconn.send_bytes(array('i', [42, 1729]))\n\nconn.close()\nlistener.close()\n
The following code connects to the server and receives some data from the\nserver:
\nfrom multiprocessing.connection import Client\nfrom array import array\n\naddress = ('localhost', 6000)\nconn = Client(address, authkey='secret password')\n\nprint conn.recv() # => [2.25, None, 'junk', float]\n\nprint conn.recv_bytes() # => 'hello'\n\narr = array('i', [0, 0, 0, 0, 0])\nprint conn.recv_bytes_into(arr) # => 8\nprint arr # => array('i', [42, 1729, 0, 0, 0])\n\nconn.close()\n
An 'AF_INET' address is a tuple of the form (hostname, port) where\nhostname is a string and port is an integer.
\nAn 'AF_UNIX' address is a string representing a filename on the\nfilesystem.
\nr'\\\\.\\pipe\\PipeName'. To use Client() to connect to a named\npipe on a remote computer called ServerName one should use an address of the\nform r'\\\\ServerName\\pipe\\PipeName' instead.
\nNote that any string beginning with two backslashes is assumed by default to be\nan 'AF_PIPE' address rather than an 'AF_UNIX' address.
\nWhen one uses Connection.recv(), the data received is automatically\nunpickled. Unfortunately unpickling data from an untrusted source is a security\nrisk. Therefore Listener and Client() use the hmac module\nto provide digest authentication.
\nAn authentication key is a string which can be thought of as a password: once a\nconnection is established both ends will demand proof that the other knows the\nauthentication key. (Demonstrating that both ends are using the same key does\nnot involve sending the key over the connection.)
\nIf authentication is requested but do authentication key is specified then the\nreturn value of current_process().authkey is used (see\nProcess). This value will automatically inherited by\nany Process object that the current process creates.\nThis means that (by default) all processes of a multi-process program will share\na single authentication key which can be used when setting up connections\nbetween themselves.
\nSuitable authentication keys can also be generated by using os.urandom().
\nSome support for logging is available. Note, however, that the logging\npackage does not use process shared locks so it is possible (depending on the\nhandler type) for messages from different processes to get mixed up.
\nReturns the logger used by multiprocessing. If necessary, a new one\nwill be created.
\nWhen first created the logger has level logging.NOTSET and no\ndefault handler. Messages sent to this logger will not by default propagate\nto the root logger.
\nNote that on Windows child processes will only inherit the level of the\nparent process’s logger – any other customization of the logger will not be\ninherited.
\nBelow is an example session with logging turned on:
\n>>> import multiprocessing, logging\n>>> logger = multiprocessing.log_to_stderr()\n>>> logger.setLevel(logging.INFO)\n>>> logger.warning('doomed')\n[WARNING/MainProcess] doomed\n>>> m = multiprocessing.Manager()\n[INFO/SyncManager-...] child process calling self.run()\n[INFO/SyncManager-...] created temp directory /.../pymp-...\n[INFO/SyncManager-...] manager serving at '/.../listener-...'\n>>> del m\n[INFO/MainProcess] sending shutdown message to manager\n[INFO/SyncManager-...] manager exiting with exitcode 0\n
In addition to having these two logging functions, the multiprocessing module
also exposes two additional logging level attributes. These are SUBWARNING
and SUBDEBUG. The table below illustrates where these fit in the
normal level hierarchy.
\nLevel | \nNumeric value | \n
---|---|
SUBWARNING | \n25 | \n
SUBDEBUG | \n5 | \n
For a full table of logging levels, see the logging module.
\nThese additional logging levels are used primarily for certain debug messages\nwithin the multiprocessing module. Below is the same example as above, except\nwith SUBDEBUG enabled:
\n>>> import multiprocessing, logging\n>>> logger = multiprocessing.log_to_stderr()\n>>> logger.setLevel(multiprocessing.SUBDEBUG)\n>>> logger.warning('doomed')\n[WARNING/MainProcess] doomed\n>>> m = multiprocessing.Manager()\n[INFO/SyncManager-...] child process calling self.run()\n[INFO/SyncManager-...] created temp directory /.../pymp-...\n[INFO/SyncManager-...] manager serving at '/.../pymp-djGBXN/listener-...'\n>>> del m\n[SUBDEBUG/MainProcess] finalizer calling ...\n[INFO/MainProcess] sending shutdown message to manager\n[DEBUG/SyncManager-...] manager received shutdown message\n[SUBDEBUG/SyncManager-...] calling <Finalize object, callback=unlink, ...\n[SUBDEBUG/SyncManager-...] finalizer calling <built-in function unlink> ...\n[SUBDEBUG/SyncManager-...] calling <Finalize object, dead>\n[SUBDEBUG/SyncManager-...] finalizer calling <function rmtree at 0x5aa730> ...\n[INFO/SyncManager-...] manager exiting with exitcode 0\n
There are certain guidelines and idioms which should be adhered to when using\nmultiprocessing.
\nAvoid shared state
\n\n\nAs far as possible one should try to avoid shifting large amounts of data\nbetween processes.
\nIt is probably best to stick to using queues or pipes for communication\nbetween processes rather than using the lower level synchronization\nprimitives from the threading module.
\n
Picklability
\n\nEnsure that the arguments to the methods of proxies are picklable.\n
Thread safety of proxies
\n\n\nDo not use a proxy object from more than one thread unless you protect it\nwith a lock.
\n(There is never a problem with different processes using the same proxy.)
\n
Joining zombie processes
\n\nOn Unix when a process finishes but has not been joined it becomes a zombie.\nThere should never be very many because each time a new process starts (or\nactive_children() is called) all completed processes which have not\nyet been joined will be joined. Also calling a finished process’s\nProcess.is_alive() will join the process. Even so it is probably good\npractice to explicitly join all the processes that you start.\n
Better to inherit than pickle/unpickle
\n\nOn Windows many types from multiprocessing need to be picklable so\nthat child processes can use them. However, one should generally avoid\nsending shared objects to other processes using pipes or queues. Instead\nyou should arrange the program so that a process which needs access to a\nshared resource created elsewhere can inherit it from an ancestor process.\n
Avoid terminating processes
\n\n\nUsing the Process.terminate() method to stop a process is liable to\ncause any shared resources (such as locks, semaphores, pipes and queues)\ncurrently being used by the process to become broken or unavailable to other\nprocesses.
\nTherefore it is probably best to only consider using\nProcess.terminate() on processes which never use any shared resources.
\n
Joining processes that use queues
\n\n\nBear in mind that a process that has put items in a queue will wait before\nterminating until all the buffered items are fed by the “feeder” thread to\nthe underlying pipe. (The child process can call the\nQueue.cancel_join_thread() method of the queue to avoid this behaviour.)
\nThis means that whenever you use a queue you need to make sure that all\nitems which have been put on the queue will eventually be removed before the\nprocess is joined. Otherwise you cannot be sure that processes which have\nput items on the queue will terminate. Remember also that non-daemonic\nprocesses will be automatically be joined.
\nAn example which will deadlock is the following:
\n\n\nfrom multiprocessing import Process, Queue\n\ndef f(q):\n q.put('X' * 1000000)\n\nif __name__ == '__main__':\n queue = Queue()\n p = Process(target=f, args=(queue,))\n p.start()\n p.join() # this deadlocks\n obj = queue.get()\nA fix here would be to swap the last two lines round (or simply remove the\np.join() line).
\n
Explicitly pass resources to child processes
\n\n\nOn Unix a child process can make use of a shared resource created in a\nparent process using a global resource. However, it is better to pass the\nobject as an argument to the constructor for the child process.
\nApart from making the code (potentially) compatible with Windows this also\nensures that as long as the child process is still alive the object will not\nbe garbage collected in the parent process. This might be important if some\nresource is freed when the object is garbage collected in the parent\nprocess.
\nSo for instance
\n\n\nfrom multiprocessing import Process, Lock\n\ndef f():\n ... do something using "lock" ...\n\nif __name__ == '__main__':\n lock = Lock()\n for i in range(10):\n Process(target=f).start()\nshould be rewritten as
\n\n\nfrom multiprocessing import Process, Lock\n\ndef f(l):\n ... do something using "l" ...\n\nif __name__ == '__main__':\n lock = Lock()\n for i in range(10):\n Process(target=f, args=(lock,)).start()\n
Beware of replacing sys.stdin with a “file like object”
\n\n\nmultiprocessing originally unconditionally called:
\n\n\nos.close(sys.stdin.fileno())\nin the multiprocessing.Process._bootstrap() method — this resulted\nin issues with processes-in-processes. This has been changed to:
\n\n\nsys.stdin.close()\nsys.stdin = open(os.devnull)\nWhich solves the fundamental issue of processes colliding with each other\nresulting in a bad file descriptor error, but introduces a potential danger\nto applications which replace sys.stdin() with a “file-like object”\nwith output buffering. This danger is that if multiple processes call\nclose() on this file-like object, it could result in the same\ndata being flushed to the object multiple times, resulting in corruption.
\nIf you write a file-like object and implement your own caching, you can\nmake it fork-safe by storing the pid whenever you append to the cache,\nand discarding the cache when the pid changes. For example:
\n\n\n@property\ndef cache(self):\n pid = os.getpid()\n if pid != self._pid:\n self._pid = pid\n self._cache = []\n return self._cache\nFor more information, see issue 5155, issue 5313 and issue 5331
\n
Since Windows lacks os.fork() it has a few extra restrictions:
\nMore picklability
\n\n\nEnsure that all arguments to Process.__init__() are picklable. This\nmeans, in particular, that bound or unbound methods cannot be used directly\nas the target argument on Windows — just define a function and use\nthat instead.
\nAlso, if you subclass Process then make sure that instances will be\npicklable when the Process.start() method is called.
\n
Global variables
\n\n\nBear in mind that if code run in a child process tries to access a global\nvariable, then the value it sees (if any) may not be the same as the value\nin the parent process at the time that Process.start() was called.
\nHowever, global variables which are just module level constants cause no\nproblems.
\n
Safe importing of main module
\n\n\nMake sure that the main module can be safely imported by a new Python\ninterpreter without causing unintended side effects (such a starting a new\nprocess).
\nFor example, under Windows running the following module would fail with a\nRuntimeError:
\n\n\nfrom multiprocessing import Process\n\ndef foo():\n print 'hello'\n\np = Process(target=foo)\np.start()\nInstead one should protect the “entry point” of the program by using if\n__name__ == '__main__': as follows:
\n\n\nfrom multiprocessing import Process, freeze_support\n\ndef foo():\n print 'hello'\n\nif __name__ == '__main__':\n freeze_support()\n p = Process(target=foo)\n p.start()\n(The freeze_support() line can be omitted if the program will be run\nnormally instead of frozen.)
\nThis allows the newly spawned Python interpreter to safely import the module\nand then run the module’s foo() function.
\nSimilar restrictions apply if a pool or manager is created in the main\nmodule.
\n
Demonstration of how to create and use customized managers and proxies:
\n#\n# This module shows how to use arbitrary callables with a subclass of\n# `BaseManager`.\n#\n# Copyright (c) 2006-2008, R Oudkerk\n# All rights reserved.\n#\n\nfrom multiprocessing import freeze_support\nfrom multiprocessing.managers import BaseManager, BaseProxy\nimport operator\n\n##\n\nclass Foo(object):\n def f(self):\n print 'you called Foo.f()'\n def g(self):\n print 'you called Foo.g()'\n def _h(self):\n print 'you called Foo._h()'\n\n# A simple generator function\ndef baz():\n for i in xrange(10):\n yield i*i\n\n# Proxy type for generator objects\nclass GeneratorProxy(BaseProxy):\n _exposed_ = ('next', '__next__')\n def __iter__(self):\n return self\n def next(self):\n return self._callmethod('next')\n def __next__(self):\n return self._callmethod('__next__')\n\n# Function to return the operator module\ndef get_operator_module():\n return operator\n\n##\n\nclass MyManager(BaseManager):\n pass\n\n# register the Foo class; make `f()` and `g()` accessible via proxy\nMyManager.register('Foo1', Foo)\n\n# register the Foo class; make `g()` and `_h()` accessible via proxy\nMyManager.register('Foo2', Foo, exposed=('g', '_h'))\n\n# register the generator function baz; use `GeneratorProxy` to make proxies\nMyManager.register('baz', baz, proxytype=GeneratorProxy)\n\n# register get_operator_module(); make public functions accessible via proxy\nMyManager.register('operator', get_operator_module)\n\n##\n\ndef test():\n manager = MyManager()\n manager.start()\n\n print '-' * 20\n\n f1 = manager.Foo1()\n f1.f()\n f1.g()\n assert not hasattr(f1, '_h')\n assert sorted(f1._exposed_) == sorted(['f', 'g'])\n\n print '-' * 20\n\n f2 = manager.Foo2()\n f2.g()\n f2._h()\n assert not hasattr(f2, 'f')\n assert sorted(f2._exposed_) == sorted(['g', '_h'])\n\n print '-' * 20\n\n it = manager.baz()\n for i in it:\n print '<%d>' i,\n print\n\n print '-' * 20\n\n op = manager.operator()\n print 'op.add(23, 45) =', op.add(23, 45)\n print 'op.pow(2, 94) =', op.pow(2, 94)\n print 
'op.getslice(range(10), 2, 6) =', op.getslice(range(10), 2, 6)\n print 'op.repeat(range(5), 3) =', op.repeat(range(5), 3)\n print 'op._exposed_ =', op._exposed_\n\n##\n\nif __name__ == '__main__':\n freeze_support()\n test()\n
Using Pool:
#
# A test of `multiprocessing.Pool` class
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import multiprocessing
import time
import random
import sys

#
# Functions used by test code
#

def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % (
        multiprocessing.current_process().name,
        func.__name__, args, result
        )

def calculatestar(args):
    return calculate(*args)

def mul(a, b):
    time.sleep(0.5*random.random())
    return a * b

def plus(a, b):
    time.sleep(0.5*random.random())
    return a + b

def f(x):
    return 1.0 / (x-5.0)

def pow3(x):
    return x**3

def noop(x):
    pass

#
# Test code
#

def test():
    print 'cpu_count() = %d\n' % multiprocessing.cpu_count()

    #
    # Create pool
    #

    PROCESSES = 4
    print 'Creating pool with %d processes\n' % PROCESSES
    pool = multiprocessing.Pool(PROCESSES)
    print 'pool = %s' % pool
    print

    #
    # Tests
    #

    TASKS = [(mul, (i, 7)) for i in range(10)] + \
            [(plus, (i, 8)) for i in range(10)]

    results = [pool.apply_async(calculate, t) for t in TASKS]
    imap_it = pool.imap(calculatestar, TASKS)
    imap_unordered_it = pool.imap_unordered(calculatestar, TASKS)

    print 'Ordered results using pool.apply_async():'
    for r in results:
        print '\t', r.get()
    print

    print 'Ordered results using pool.imap():'
    for x in imap_it:
        print '\t', x
    print

    print 'Unordered results using pool.imap_unordered():'
    for x in imap_unordered_it:
        print '\t', x
    print

    print 'Ordered results using pool.map() --- will block till complete:'
    for x in pool.map(calculatestar, TASKS):
        print '\t', x
    print

    #
    # Simple benchmarks
    #

    N = 100000
    print 'def pow3(x): return x**3'

    t = time.time()
    A = map(pow3, xrange(N))
    print '\tmap(pow3, xrange(%d)):\n\t\t%s seconds' % \
          (N, time.time() - t)

    t = time.time()
    B = pool.map(pow3, xrange(N))
    print '\tpool.map(pow3, xrange(%d)):\n\t\t%s seconds' % \
          (N, time.time() - t)

    t = time.time()
    C = list(pool.imap(pow3, xrange(N), chunksize=N//8))
    print '\tlist(pool.imap(pow3, xrange(%d), chunksize=%d)):\n\t\t%s' \
          ' seconds' % (N, N//8, time.time() - t)

    assert A == B == C, (len(A), len(B), len(C))
    print

    L = [None] * 1000000
    print 'def noop(x): pass'
    print 'L = [None] * 1000000'

    t = time.time()
    A = map(noop, L)
    print '\tmap(noop, L):\n\t\t%s seconds' % \
          (time.time() - t)

    t = time.time()
    B = pool.map(noop, L)
    print '\tpool.map(noop, L):\n\t\t%s seconds' % \
          (time.time() - t)

    t = time.time()
    C = list(pool.imap(noop, L, chunksize=len(L)//8))
    print '\tlist(pool.imap(noop, L, chunksize=%d)):\n\t\t%s seconds' % \
          (len(L)//8, time.time() - t)

    assert A == B == C, (len(A), len(B), len(C))
    print

    del A, B, C, L

    #
    # Test error handling
    #

    print 'Testing error handling:'

    try:
        print pool.apply(f, (5,))
    except ZeroDivisionError:
        print '\tGot ZeroDivisionError as expected from pool.apply()'
    else:
        raise AssertionError('expected ZeroDivisionError')

    try:
        print pool.map(f, range(10))
    except ZeroDivisionError:
        print '\tGot ZeroDivisionError as expected from pool.map()'
    else:
        raise AssertionError('expected ZeroDivisionError')

    try:
        print list(pool.imap(f, range(10)))
    except ZeroDivisionError:
        print '\tGot ZeroDivisionError as expected from list(pool.imap())'
    else:
        raise AssertionError('expected ZeroDivisionError')

    it = pool.imap(f, range(10))
    for i in range(10):
        try:
            x = it.next()
        except ZeroDivisionError:
            if i == 5:
                pass
        except StopIteration:
            break
        else:
            if i == 5:
                raise AssertionError('expected ZeroDivisionError')

    assert i == 9
    print '\tGot ZeroDivisionError as expected from IMapIterator.next()'
    print

    #
    # Testing timeouts
    #

    print 'Testing ApplyResult.get() with timeout:',
    res = pool.apply_async(calculate, TASKS[0])
    while 1:
        sys.stdout.flush()
        try:
            sys.stdout.write('\n\t%s' % res.get(0.02))
            break
        except multiprocessing.TimeoutError:
            sys.stdout.write('.')
    print
    print

    print 'Testing IMapIterator.next() with timeout:',
    it = pool.imap(calculatestar, TASKS)
    while 1:
        sys.stdout.flush()
        try:
            sys.stdout.write('\n\t%s' % it.next(0.02))
        except StopIteration:
            break
        except multiprocessing.TimeoutError:
            sys.stdout.write('.')
    print
    print

    #
    # Testing callback
    #

    print 'Testing callback:'

    A = []
    B = [56, 0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

    r = pool.apply_async(mul, (7, 8), callback=A.append)
    r.wait()

    r = pool.map_async(pow3, range(10), callback=A.extend)
    r.wait()

    if A == B:
        print '\tcallbacks succeeded\n'
    else:
        print '\t*** callbacks failed\n\t\t%s != %s\n' % (A, B)

    #
    # Check there are no outstanding tasks
    #

    assert not pool._cache, 'cache = %r' % pool._cache

    #
    # Check close() methods
    #

    print 'Testing close():'

    for worker in pool._pool:
        assert worker.is_alive()

    result = pool.apply_async(time.sleep, [0.5])
    pool.close()
    pool.join()

    assert result.get() is None

    for worker in pool._pool:
        assert not worker.is_alive()

    print '\tclose() succeeded\n'

    #
    # Check terminate() method
    #

    print 'Testing terminate():'

    pool = multiprocessing.Pool(2)
    DELTA = 0.1
    ignore = pool.apply(pow3, [2])
    results = [pool.apply_async(time.sleep, [DELTA]) for i in range(100)]
    pool.terminate()
    pool.join()

    for worker in pool._pool:
        assert not worker.is_alive()

    print '\tterminate() succeeded\n'

    #
    # Check garbage collection
    #

    print 'Testing garbage collection:'

    pool = multiprocessing.Pool(2)
    DELTA = 0.1
    processes = pool._pool
    ignore = pool.apply(pow3, [2])
    results = [pool.apply_async(time.sleep, [DELTA]) for i in range(100)]

    results = pool = None

    time.sleep(DELTA * 2)

    for worker in processes:
        assert not worker.is_alive()

    print '\tgarbage collection succeeded\n'


if __name__ == '__main__':
    multiprocessing.freeze_support()

    assert len(sys.argv) in (1, 2)

    if len(sys.argv) == 1 or sys.argv[1] == 'processes':
        print ' Using processes '.center(79, '-')
    elif sys.argv[1] == 'threads':
        print ' Using threads '.center(79, '-')
        import multiprocessing.dummy as multiprocessing
    else:
        print 'Usage:\n\t%s [processes | threads]' % sys.argv[0]
        raise SystemExit(2)

    test()
Synchronization types like locks, conditions and queues:
#
# A test file for the `multiprocessing` package
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import time, sys, random
from Queue import Empty

import multiprocessing               # may get overwritten


#### TEST_VALUE

def value_func(running, mutex):
    random.seed()
    time.sleep(random.random()*4)

    mutex.acquire()
    print '\n\t\t\t' + str(multiprocessing.current_process()) + ' has finished'
    running.value -= 1
    mutex.release()

def test_value():
    TASKS = 10
    running = multiprocessing.Value('i', TASKS)
    mutex = multiprocessing.Lock()

    for i in range(TASKS):
        p = multiprocessing.Process(target=value_func, args=(running, mutex))
        p.start()

    while running.value > 0:
        time.sleep(0.08)
        mutex.acquire()
        print running.value,
        sys.stdout.flush()
        mutex.release()

    print
    print 'No more running processes'


#### TEST_QUEUE

def queue_func(queue):
    for i in range(30):
        time.sleep(0.5 * random.random())
        queue.put(i*i)
    queue.put('STOP')

def test_queue():
    q = multiprocessing.Queue()

    p = multiprocessing.Process(target=queue_func, args=(q,))
    p.start()

    o = None
    while o != 'STOP':
        try:
            o = q.get(timeout=0.3)
            print o,
            sys.stdout.flush()
        except Empty:
            print 'TIMEOUT'

    print


#### TEST_CONDITION

def condition_func(cond):
    cond.acquire()
    print '\t' + str(cond)
    time.sleep(2)
    print '\tchild is notifying'
    print '\t' + str(cond)
    cond.notify()
    cond.release()

def test_condition():
    cond = multiprocessing.Condition()

    p = multiprocessing.Process(target=condition_func, args=(cond,))
    print cond

    cond.acquire()
    print cond
    cond.acquire()
    print cond

    p.start()

    print 'main is waiting'
    cond.wait()
    print 'main has woken up'

    print cond
    cond.release()
    print cond
    cond.release()

    p.join()
    print cond


#### TEST_SEMAPHORE

def semaphore_func(sema, mutex, running):
    sema.acquire()

    mutex.acquire()
    running.value += 1
    print running.value, 'tasks are running'
    mutex.release()

    random.seed()
    time.sleep(random.random()*2)

    mutex.acquire()
    running.value -= 1
    print '%s has finished' % multiprocessing.current_process()
    mutex.release()

    sema.release()

def test_semaphore():
    sema = multiprocessing.Semaphore(3)
    mutex = multiprocessing.RLock()
    running = multiprocessing.Value('i', 0)

    processes = [
        multiprocessing.Process(target=semaphore_func,
                                args=(sema, mutex, running))
        for i in range(10)
        ]

    for p in processes:
        p.start()

    for p in processes:
        p.join()


#### TEST_JOIN_TIMEOUT

def join_timeout_func():
    print '\tchild sleeping'
    time.sleep(5.5)
    print '\n\tchild terminating'

def test_join_timeout():
    p = multiprocessing.Process(target=join_timeout_func)
    p.start()

    print 'waiting for process to finish'

    while 1:
        p.join(timeout=1)
        if not p.is_alive():
            break
        print '.',
        sys.stdout.flush()


#### TEST_EVENT

def event_func(event):
    print '\t%r is waiting' % multiprocessing.current_process()
    event.wait()
    print '\t%r has woken up' % multiprocessing.current_process()

def test_event():
    event = multiprocessing.Event()

    processes = [multiprocessing.Process(target=event_func, args=(event,))
                 for i in range(5)]

    for p in processes:
        p.start()

    print 'main is sleeping'
    time.sleep(2)

    print 'main is setting event'
    event.set()

    for p in processes:
        p.join()


#### TEST_SHAREDVALUES

def sharedvalues_func(values, arrays, shared_values, shared_arrays):
    for i in range(len(values)):
        v = values[i][1]
        sv = shared_values[i].value
        assert v == sv

    for i in range(len(values)):
        a = arrays[i][1]
        sa = list(shared_arrays[i][:])
        assert a == sa

    print 'Tests passed'

def test_sharedvalues():
    values = [
        ('i', 10),
        ('h', -2),
        ('d', 1.25)
        ]
    arrays = [
        ('i', range(100)),
        ('d', [0.25 * i for i in range(100)]),
        ('H', range(1000))
        ]

    shared_values = [multiprocessing.Value(id, v) for id, v in values]
    shared_arrays = [multiprocessing.Array(id, a) for id, a in arrays]

    p = multiprocessing.Process(
        target=sharedvalues_func,
        args=(values, arrays, shared_values, shared_arrays)
        )
    p.start()
    p.join()

    assert p.exitcode == 0


####

def test(namespace=multiprocessing):
    global multiprocessing

    multiprocessing = namespace

    for func in [ test_value, test_queue, test_condition,
                  test_semaphore, test_join_timeout, test_event,
                  test_sharedvalues ]:

        print '\n\t######## %s\n' % func.__name__
        func()

    ignore = multiprocessing.active_children()      # cleanup any old processes
    if hasattr(multiprocessing, '_debug_info'):
        info = multiprocessing._debug_info()
        if info:
            print info
            raise ValueError('there should be no positive refcounts left')


if __name__ == '__main__':
    multiprocessing.freeze_support()

    assert len(sys.argv) in (1, 2)

    if len(sys.argv) == 1 or sys.argv[1] == 'processes':
        print ' Using processes '.center(79, '-')
        namespace = multiprocessing
    elif sys.argv[1] == 'manager':
        print ' Using processes and a manager '.center(79, '-')
        namespace = multiprocessing.Manager()
        namespace.Process = multiprocessing.Process
        namespace.current_process = multiprocessing.current_process
        namespace.active_children = multiprocessing.active_children
    elif sys.argv[1] == 'threads':
        print ' Using threads '.center(79, '-')
        import multiprocessing.dummy as namespace
    else:
        print 'Usage:\n\t%s [processes | manager | threads]' % sys.argv[0]
        raise SystemExit(2)

    test(namespace)
An example showing how to use queues to feed tasks to a collection of worker\nprocesses and collect the results:
#
# Simple example which uses a pool of workers to carry out some tasks.
#
# Notice that the results will probably not come out of the output
# queue in the same order as the corresponding tasks were put on the
# input queue.  If it is important to get the results back in the
# original order then consider using `Pool.map()` or `Pool.imap()`
# (which will save on the amount of code needed anyway).
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import time
import random

from multiprocessing import Process, Queue, current_process, freeze_support

#
# Function run by worker processes
#

def worker(input, output):
    for func, args in iter(input.get, 'STOP'):
        result = calculate(func, args)
        output.put(result)

#
# Function used to calculate result
#

def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % \
        (current_process().name, func.__name__, args, result)

#
# Functions referenced by tasks
#

def mul(a, b):
    time.sleep(0.5*random.random())
    return a * b

def plus(a, b):
    time.sleep(0.5*random.random())
    return a + b

#
#
#

def test():
    NUMBER_OF_PROCESSES = 4
    TASKS1 = [(mul, (i, 7)) for i in range(20)]
    TASKS2 = [(plus, (i, 8)) for i in range(10)]

    # Create queues
    task_queue = Queue()
    done_queue = Queue()

    # Submit tasks
    for task in TASKS1:
        task_queue.put(task)

    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        Process(target=worker, args=(task_queue, done_queue)).start()

    # Get and print results
    print 'Unordered results:'
    for i in range(len(TASKS1)):
        print '\t', done_queue.get()

    # Add more tasks using `put()`
    for task in TASKS2:
        task_queue.put(task)

    # Get and print some more results
    for i in range(len(TASKS2)):
        print '\t', done_queue.get()

    # Tell child processes to stop
    for i in range(NUMBER_OF_PROCESSES):
        task_queue.put('STOP')


if __name__ == '__main__':
    freeze_support()
    test()
An example of how a pool of worker processes can each run a
BaseHTTPServer.HTTPServer instance while sharing a single listening
socket.
#
# Example where a pool of http servers share a single listening socket
#
# On Windows this module depends on the ability to pickle a socket
# object so that the worker processes can inherit a copy of the server
# object.  (We import `multiprocessing.reduction` to enable this pickling.)
#
# Not sure if we should synchronize access to `socket.accept()` method by
# using a process-shared lock -- does not seem to be necessary.
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import os
import sys

from multiprocessing import Process, current_process, freeze_support
from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler

if sys.platform == 'win32':
    import multiprocessing.reduction    # make sockets picklable/inheritable


def note(format, *args):
    sys.stderr.write('[%s]\t%s\n' % (current_process().name, format % args))


class RequestHandler(SimpleHTTPRequestHandler):
    # we override log_message() to show which process is handling the request
    def log_message(self, format, *args):
        note(format, *args)

def serve_forever(server):
    note('starting server')
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass


def runpool(address, number_of_processes):
    # create a single server object -- children will each inherit a copy
    server = HTTPServer(address, RequestHandler)

    # create child processes to act as workers
    for i in range(number_of_processes-1):
        Process(target=serve_forever, args=(server,)).start()

    # main process also acts as a worker
    serve_forever(server)


def test():
    DIR = os.path.join(os.path.dirname(__file__), '..')
    ADDRESS = ('localhost', 8000)
    NUMBER_OF_PROCESSES = 4

    print 'Serving at http://%s:%d using %d worker processes' % \
        (ADDRESS[0], ADDRESS[1], NUMBER_OF_PROCESSES)
    print 'To exit press Ctrl-' + ['C', 'Break'][sys.platform=='win32']

    os.chdir(DIR)
    runpool(ADDRESS, NUMBER_OF_PROCESSES)


if __name__ == '__main__':
    freeze_support()
    test()
Some simple benchmarks comparing multiprocessing with threading:
#
# Simple benchmarks for the multiprocessing package
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import time, sys, multiprocessing, threading, Queue, gc

if sys.platform == 'win32':
    _timer = time.clock
else:
    _timer = time.time

delta = 1


#### TEST_QUEUESPEED

def queuespeed_func(q, c, iterations):
    a = '0' * 256
    c.acquire()
    c.notify()
    c.release()

    for i in xrange(iterations):
        q.put(a)

    q.put('STOP')

def test_queuespeed(Process, q, c):
    elapsed = 0
    iterations = 1

    while elapsed < delta:
        iterations *= 2

        p = Process(target=queuespeed_func, args=(q, c, iterations))
        c.acquire()
        p.start()
        c.wait()
        c.release()

        result = None
        t = _timer()

        while result != 'STOP':
            result = q.get()

        elapsed = _timer() - t

        p.join()

    print iterations, 'objects passed through the queue in', elapsed, 'seconds'
    print 'average number/sec:', iterations/elapsed


#### TEST_PIPESPEED

def pipe_func(c, cond, iterations):
    a = '0' * 256
    cond.acquire()
    cond.notify()
    cond.release()

    for i in xrange(iterations):
        c.send(a)

    c.send('STOP')

def test_pipespeed():
    c, d = multiprocessing.Pipe()
    cond = multiprocessing.Condition()
    elapsed = 0
    iterations = 1

    while elapsed < delta:
        iterations *= 2

        p = multiprocessing.Process(target=pipe_func,
                                    args=(d, cond, iterations))
        cond.acquire()
        p.start()
        cond.wait()
        cond.release()

        result = None
        t = _timer()

        while result != 'STOP':
            result = c.recv()

        elapsed = _timer() - t
        p.join()

    print iterations, 'objects passed through connection in', elapsed, 'seconds'
    print 'average number/sec:', iterations/elapsed


#### TEST_SEQSPEED

def test_seqspeed(seq):
    elapsed = 0
    iterations = 1

    while elapsed < delta:
        iterations *= 2

        t = _timer()

        for i in xrange(iterations):
            a = seq[5]

        elapsed = _timer()-t

    print iterations, 'iterations in', elapsed, 'seconds'
    print 'average number/sec:', iterations/elapsed


#### TEST_LOCK

def test_lockspeed(l):
    elapsed = 0
    iterations = 1

    while elapsed < delta:
        iterations *= 2

        t = _timer()

        for i in xrange(iterations):
            l.acquire()
            l.release()

        elapsed = _timer()-t

    print iterations, 'iterations in', elapsed, 'seconds'
    print 'average number/sec:', iterations/elapsed


#### TEST_CONDITION

def conditionspeed_func(c, N):
    c.acquire()
    c.notify()

    for i in xrange(N):
        c.wait()
        c.notify()

    c.release()

def test_conditionspeed(Process, c):
    elapsed = 0
    iterations = 1

    while elapsed < delta:
        iterations *= 2

        c.acquire()
        p = Process(target=conditionspeed_func, args=(c, iterations))
        p.start()

        c.wait()

        t = _timer()

        for i in xrange(iterations):
            c.notify()
            c.wait()

        elapsed = _timer()-t

        c.release()
        p.join()

    print iterations * 2, 'waits in', elapsed, 'seconds'
    print 'average number/sec:', iterations * 2 / elapsed

####

def test():
    manager = multiprocessing.Manager()

    gc.disable()

    print '\n\t######## testing Queue.Queue\n'
    test_queuespeed(threading.Thread, Queue.Queue(),
                    threading.Condition())
    print '\n\t######## testing multiprocessing.Queue\n'
    test_queuespeed(multiprocessing.Process, multiprocessing.Queue(),
                    multiprocessing.Condition())
    print '\n\t######## testing Queue managed by server process\n'
    test_queuespeed(multiprocessing.Process, manager.Queue(),
                    manager.Condition())
    print '\n\t######## testing multiprocessing.Pipe\n'
    test_pipespeed()

    print

    print '\n\t######## testing list\n'
    test_seqspeed(range(10))
    print '\n\t######## testing list managed by server process\n'
    test_seqspeed(manager.list(range(10)))
    print '\n\t######## testing Array("i", ..., lock=False)\n'
    test_seqspeed(multiprocessing.Array('i', range(10), lock=False))
    print '\n\t######## testing Array("i", ..., lock=True)\n'
    test_seqspeed(multiprocessing.Array('i', range(10), lock=True))

    print

    print '\n\t######## testing threading.Lock\n'
    test_lockspeed(threading.Lock())
    print '\n\t######## testing threading.RLock\n'
    test_lockspeed(threading.RLock())
    print '\n\t######## testing multiprocessing.Lock\n'
    test_lockspeed(multiprocessing.Lock())
    print '\n\t######## testing multiprocessing.RLock\n'
    test_lockspeed(multiprocessing.RLock())
    print '\n\t######## testing lock managed by server process\n'
    test_lockspeed(manager.Lock())
    print '\n\t######## testing rlock managed by server process\n'
    test_lockspeed(manager.RLock())

    print

    print '\n\t######## testing threading.Condition\n'
    test_conditionspeed(threading.Thread, threading.Condition())
    print '\n\t######## testing multiprocessing.Condition\n'
    test_conditionspeed(multiprocessing.Process, multiprocessing.Condition())
    print '\n\t######## testing condition managed by a server process\n'
    test_conditionspeed(multiprocessing.Process, manager.Condition())

    gc.enable()

if __name__ == '__main__':
    multiprocessing.freeze_support()
    test()
\nNew in version 2.6.
\nSource code: Lib/ssl.py
\nThis module provides access to Transport Layer Security (often known as “Secure\nSockets Layer”) encryption and peer authentication facilities for network\nsockets, both client-side and server-side. This module uses the OpenSSL\nlibrary. It is available on all modern Unix systems, Windows, Mac OS X, and\nprobably additional platforms, as long as OpenSSL is installed on that platform.
\nNote
\nSome behavior may be platform dependent, since calls are made to the\noperating system socket APIs. The installed version of OpenSSL may also\ncause variations in behavior.
\nThis section documents the objects and functions in the ssl module; for more\ngeneral information about TLS, SSL, and certificates, the reader is referred to\nthe documents in the “See Also” section at the bottom.
\nThis module provides a class, ssl.SSLSocket, which is derived from the\nsocket.socket type, and provides a socket-like wrapper that also\nencrypts and decrypts the data going over the socket with SSL. It supports\nadditional read() and write() methods, along with a method,\ngetpeercert(), to retrieve the certificate of the other side of the\nconnection, and a method, cipher(), to retrieve the cipher being used for\nthe secure connection.
\nTakes an instance sock of socket.socket, and returns an instance\nof ssl.SSLSocket, a subtype of socket.socket, which wraps\nthe underlying socket in an SSL context. For client-side sockets, the\ncontext construction is lazy; if the underlying socket isn’t connected yet,\nthe context construction will be performed after connect() is called on\nthe socket. For server-side sockets, if the socket has no remote peer, it is\nassumed to be a listening socket, and the server-side SSL wrapping is\nautomatically performed on client connections accepted via the accept()\nmethod. wrap_socket() may raise SSLError.
\nThe keyfile and certfile parameters specify optional files which\ncontain a certificate to be used to identify the local side of the\nconnection. See the discussion of Certificates for more\ninformation on how the certificate is stored in the certfile.
\nOften the private key is stored in the same file as the certificate; in this\ncase, only the certfile parameter need be passed. If the private key is\nstored in a separate file, both parameters must be used. If the private key\nis stored in the certfile, it should come before the first certificate in\nthe certificate chain:
-----BEGIN RSA PRIVATE KEY-----
... (private key in base64 encoding) ...
-----END RSA PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
... (certificate in base64 PEM encoding) ...
-----END CERTIFICATE-----
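The "key before first certificate" rule above is easy to check mechanically. The following sketch (not part of the ssl module; `key_precedes_cert` is a hypothetical helper written for illustration) verifies that a combined PEM file is laid out in the required order:

```python
def key_precedes_cert(pem_text):
    """Return True if the private key block appears before the
    first certificate block in a combined PEM file."""
    key_pos = pem_text.find('-----BEGIN RSA PRIVATE KEY-----')
    cert_pos = pem_text.find('-----BEGIN CERTIFICATE-----')
    return key_pos != -1 and cert_pos != -1 and key_pos < cert_pos

# A well-formed combined file: key first, then the certificate chain.
combined = (
    '-----BEGIN RSA PRIVATE KEY-----\n'
    '... (private key data) ...\n'
    '-----END RSA PRIVATE KEY-----\n'
    '-----BEGIN CERTIFICATE-----\n'
    '... (certificate data) ...\n'
    '-----END CERTIFICATE-----\n'
)
assert key_precedes_cert(combined)
```

A file failing this check would make OpenSSL reject the certfile when the key is read.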
\nThe parameter server_side is a boolean which identifies whether\nserver-side or client-side behavior is desired from this socket.
\nThe parameter cert_reqs specifies whether a certificate is required from\nthe other side of the connection, and whether it will be validated if\nprovided. It must be one of the three values CERT_NONE\n(certificates ignored), CERT_OPTIONAL (not required, but validated\nif provided), or CERT_REQUIRED (required and validated). If the\nvalue of this parameter is not CERT_NONE, then the ca_certs\nparameter must point to a file of CA certificates.
\nThe ca_certs file contains a set of concatenated “certification\nauthority” certificates, which are used to validate certificates passed from\nthe other end of the connection. See the discussion of\nCertificates for more information about how to arrange the\ncertificates in this file.
\nThe parameter ssl_version specifies which version of the SSL protocol to\nuse. Typically, the server chooses a particular protocol version, and the\nclient must adapt to the server’s choice. Most of the versions are not\ninteroperable with the other versions. If not specified, the default is\nPROTOCOL_SSLv23; it provides the most compatibility with other\nversions.
\nHere’s a table showing which versions in a client (down the side) can connect\nto which versions in a server (along the top):
client / server    SSLv2    SSLv3    SSLv23    TLSv1
SSLv2              yes      no       yes       no
SSLv3              no       yes      yes       no
SSLv23             yes      no       yes       no
TLSv1              no       no       yes       yes
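The compatibility matrix above can also be expressed as data, which is handy if a program needs to pick a protocol at run time. This is an illustrative encoding of the table, not an ssl module API; the `can_connect` helper and `_COMPAT` dict are hypothetical names:

```python
# Rows are the client's protocol, columns the server's, as in the table.
_COMPAT = {
    'SSLv2':  {'SSLv2': True,  'SSLv3': False, 'SSLv23': True,  'TLSv1': False},
    'SSLv3':  {'SSLv2': False, 'SSLv3': True,  'SSLv23': True,  'TLSv1': False},
    'SSLv23': {'SSLv2': True,  'SSLv3': False, 'SSLv23': True,  'TLSv1': False},
    'TLSv1':  {'SSLv2': False, 'SSLv3': False, 'SSLv23': True,  'TLSv1': True},
}

def can_connect(client, server):
    """Look up whether a client using `client` can talk to a
    server using `server`, per the table above."""
    return _COMPAT[client][server]
```

As the following note explains, the real outcome also depends on the OpenSSL version, so treat this as a summary of the common case only.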
Note
\nWhich connections succeed will vary depending on the version of\nOpenSSL. For instance, in some older versions of OpenSSL (such\nas 0.9.7l on OS X 10.4), an SSLv2 client could not connect to an\nSSLv23 server. Another example: beginning with OpenSSL 1.0.0,\nan SSLv23 client will not actually attempt SSLv2 connections\nunless you explicitly enable SSLv2 ciphers; for example, you\nmight specify "ALL" or "SSLv2" as the ciphers parameter\nto enable them.
\nThe ciphers parameter sets the available ciphers for this SSL object.\nIt should be a string in the OpenSSL cipher list format.
\nThe parameter do_handshake_on_connect specifies whether to do the SSL\nhandshake automatically after doing a socket.connect(), or whether the\napplication program will call it explicitly, by invoking the\nSSLSocket.do_handshake() method. Calling\nSSLSocket.do_handshake() explicitly gives the program control over the\nblocking behavior of the socket I/O involved in the handshake.
\nThe parameter suppress_ragged_eofs specifies how the\nSSLSocket.read() method should signal unexpected EOF from the other end\nof the connection. If specified as True (the default), it returns a\nnormal EOF in response to unexpected EOF errors raised from the underlying\nsocket; if False, it will raise the exceptions back to the caller.
\n\nChanged in version 2.7: New optional argument ciphers.
\nIf you are running an entropy-gathering daemon (EGD) somewhere, and path\nis the pathname of a socket connection open to it, this will read 256 bytes\nof randomness from the socket, and add it to the SSL pseudo-random number\ngenerator to increase the security of generated secret keys. This is\ntypically only necessary on systems without better sources of randomness.
\nSee http://egd.sourceforge.net/ or http://prngd.sourceforge.net/ for sources\nof entropy-gathering daemons.
\nReturns a floating-point value containing a normal seconds-after-the-epoch\ntime value, given the time-string representing the “notBefore” or “notAfter”\ndate from a certificate.
\nHere’s an example:
>>> import ssl
>>> ssl.cert_time_to_seconds("May 9 00:00:00 2007 GMT")
1178694000.0
>>> import time
>>> time.ctime(ssl.cert_time_to_seconds("May 9 00:00:00 2007 GMT"))
'Wed May 9 00:00:00 2007'
Selects SSL version 2 as the channel encryption protocol.
\nThis protocol is not available if OpenSSL is compiled with OPENSSL_NO_SSL2\nflag.
\nWarning
\nSSL version 2 is insecure. Its use is highly discouraged.
\nThe version string of the OpenSSL library loaded by the interpreter:
>>> ssl.OPENSSL_VERSION
'OpenSSL 0.9.8k 25 Mar 2009'
\nNew in version 2.7.
\nA tuple of five integers representing version information about the\nOpenSSL library:
>>> ssl.OPENSSL_VERSION_INFO
(0, 9, 8, 11, 15)
\nNew in version 2.7.
\nThe raw version number of the OpenSSL library, as a single integer:
>>> ssl.OPENSSL_VERSION_NUMBER
9470143L
>>> hex(ssl.OPENSSL_VERSION_NUMBER)
'0x9080bfL'
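The single integer packs the same fields as OPENSSL_VERSION_INFO using OpenSSL's 0xMNNFFPPS nibble layout (major, minor, fix, patch, status). The following sketch decodes it; `decode_openssl_version` is an illustrative helper, not part of the ssl module:

```python
def decode_openssl_version(n):
    """Split an OpenSSL version number (0xMNNFFPPS layout) into the
    (major, minor, fix, patch, status) tuple that
    ssl.OPENSSL_VERSION_INFO reports."""
    status = n & 0xf           # S: release status nibble
    patch = (n >> 4) & 0xff    # PP: patch level ('k' == 11 for 0.9.8k)
    fix = (n >> 12) & 0xff     # FF: fix number
    minor = (n >> 20) & 0xff   # NN: minor version
    major = (n >> 28) & 0xff   # M: major version
    return (major, minor, fix, patch, status)

# 9470143 == 0x9080bf, the value shown above for OpenSSL 0.9.8k
assert decode_openssl_version(0x9080bf) == (0, 9, 8, 11, 15)
```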
\nNew in version 2.7.
\nIf there is no certificate for the peer on the other end of the connection,\nreturns None.
\nIf the parameter binary_form is False, and a certificate was\nreceived from the peer, this method returns a dict instance. If the\ncertificate was not validated, the dict is empty. If the certificate was\nvalidated, it returns a dict with the keys subject (the principal for\nwhich the certificate was issued), and notAfter (the time after which the\ncertificate should not be trusted). The certificate was already validated,\nso the notBefore and issuer fields are not returned. If a\ncertificate contains an instance of the Subject Alternative Name extension\n(see RFC 3280), there will also be a subjectAltName key in the\ndictionary.
\nThe “subject” field is a tuple containing the sequence of relative\ndistinguished names (RDNs) given in the certificate’s data structure for the\nprincipal, and each RDN is a sequence of name-value pairs:
{'notAfter': 'Feb 16 16:54:50 2013 GMT',
 'subject': ((('countryName', u'US'),),
             (('stateOrProvinceName', u'Delaware'),),
             (('localityName', u'Wilmington'),),
             (('organizationName', u'Python Software Foundation'),),
             (('organizationalUnitName', u'SSL'),),
             (('commonName', u'somemachine.python.org'),))}
If the binary_form parameter is True, and a certificate was\nprovided, this method returns the DER-encoded form of the entire certificate\nas a sequence of bytes, or None if the peer did not provide a\ncertificate. This return value is independent of validation; if validation\nwas required (CERT_OPTIONAL or CERT_REQUIRED), it will have\nbeen validated, but if CERT_NONE was used to establish the\nconnection, the certificate, if present, will not have been validated.
\nPerform a TLS/SSL handshake. If this is used with a non-blocking socket, it\nmay raise SSLError with an arg[0] of SSL_ERROR_WANT_READ\nor SSL_ERROR_WANT_WRITE, in which case it must be called again until\nit completes successfully. For example, to simulate the behavior of a\nblocking socket, one might write:
while True:
    try:
        s.do_handshake()
        break
    except ssl.SSLError, err:
        if err.args[0] == ssl.SSL_ERROR_WANT_READ:
            select.select([s], [], [])
        elif err.args[0] == ssl.SSL_ERROR_WANT_WRITE:
            select.select([], [s], [])
        else:
            raise
Certificates in general are part of a public-key / private-key system. In this\nsystem, each principal, (which may be a machine, or a person, or an\norganization) is assigned a unique two-part encryption key. One part of the key\nis public, and is called the public key; the other part is kept secret, and is\ncalled the private key. The two parts are related, in that if you encrypt a\nmessage with one of the parts, you can decrypt it with the other part, and\nonly with the other part.
\nA certificate contains information about two principals. It contains the name\nof a subject, and the subject’s public key. It also contains a statement by a\nsecond principal, the issuer, that the subject is who he claims to be, and\nthat this is indeed the subject’s public key. The issuer’s statement is signed\nwith the issuer’s private key, which only the issuer knows. However, anyone can\nverify the issuer’s statement by finding the issuer’s public key, decrypting the\nstatement with it, and comparing it to the other information in the certificate.\nThe certificate also contains information about the time period over which it is\nvalid. This is expressed as two fields, called “notBefore” and “notAfter”.
\nIn the Python use of certificates, a client or server can use a certificate to\nprove who they are. The other side of a network connection can also be required\nto produce a certificate, and that certificate can be validated to the\nsatisfaction of the client or server that requires such validation. The\nconnection attempt can be set to raise an exception if the validation fails.\nValidation is done automatically, by the underlying OpenSSL framework; the\napplication need not concern itself with its mechanics. But the application\ndoes usually need to provide sets of certificates to allow this process to take\nplace.
\nPython uses files to contain certificates. They should be formatted as “PEM”\n(see RFC 1422), which is a base-64 encoded form wrapped with a header line\nand a footer line:
-----BEGIN CERTIFICATE-----
... (certificate in base64 PEM encoding) ...
-----END CERTIFICATE-----
\nThe Python files which contain certificates can contain a sequence of\ncertificates, sometimes called a certificate chain. This chain should start\nwith the specific certificate for the principal who “is” the client or server,\nand then the certificate for the issuer of that certificate, and then the\ncertificate for the issuer of that certificate, and so on up the chain till\nyou get to a certificate which is self-signed, that is, a certificate which\nhas the same subject and issuer, sometimes called a root certificate. The\ncertificates should just be concatenated together in the certificate file. For\nexample, suppose we had a three certificate chain, from our server certificate\nto the certificate of the certification authority that signed our server\ncertificate, to the root certificate of the agency which issued the\ncertification authority’s certificate:
\n-----BEGIN CERTIFICATE-----\n... (certificate for your server)...\n-----END CERTIFICATE-----\n-----BEGIN CERTIFICATE-----\n... (the certificate for the CA)...\n-----END CERTIFICATE-----\n-----BEGIN CERTIFICATE-----\n... (the root certificate for the CA's issuer)...\n-----END CERTIFICATE-----
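Because the chain is plain concatenated text, it can be split back into individual certificates with simple string handling. A hypothetical helper (not part of the standard library) might look like this:

```python
# Hypothetical helper: split text that concatenates several PEM
# certificates into a list of individual certificate blocks.
def split_pem_chain(pem_text):
    certs, current = [], None
    for line in pem_text.splitlines():
        if line.strip() == '-----BEGIN CERTIFICATE-----':
            current = [line]                      # start a new block
        elif current is not None:
            current.append(line)
            if line.strip() == '-----END CERTIFICATE-----':
                certs.append('\n'.join(current))  # block complete
                current = None
    return certs
```

Text outside the BEGIN/END markers is ignored, which matches the way OpenSSL itself reads PEM bundles.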
\nIf you are going to require validation of the other side of the connection’s\ncertificate, you need to provide a “CA certs” file, filled with the certificate\nchains for each issuer you are willing to trust. Again, this file just contains\nthese chains concatenated together. For validation, Python will use the first\nchain it finds in the file which matches.
\nSome “standard” root certificates are available from various certification\nauthorities: CACert.org, Thawte, Verisign, Positive SSL\n(used by python.org), Equifax and GeoTrust.
\nIn general, if you are using SSL3 or TLS1, you don’t need to put the full chain\nin your “CA certs” file; you only need the root certificates, and the remote\npeer is supposed to furnish the other certificates necessary to chain from its\ncertificate to a root certificate. See RFC 4158 for more discussion of the\nway in which certification chains can be built.
\nIf you are going to create a server that provides SSL-encrypted connection\nservices, you will need to acquire a certificate for that service. There are\nmany ways of acquiring appropriate certificates, such as buying one from a\ncertification authority. Another common practice is to generate a self-signed\ncertificate. The simplest way to do this is with the OpenSSL package, using\nsomething like the following:
\n
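A typical invocation is sketched below; the -subj option is added here so the command runs without interactive prompts, and the identity fields in it are placeholders you would replace for your own server:

```shell
# Generate a self-signed certificate plus private key in one PEM file.
# -nodes leaves the key unencrypted; -subj supplies the identity fields
# non-interactively (omit it to be prompted instead).
openssl req -new -x509 -days 365 -nodes \
    -out cert.pem -keyout cert.pem \
    -subj "/C=US/ST=State/L=City/O=Org/CN=myserver.example.com"
```

The resulting cert.pem can be passed as both the certfile and keyfile arguments to wrap_socket(), since it contains both the certificate and the key.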
\nThe disadvantage of a self-signed certificate is that it is its own root\ncertificate, and no one else will have it in their cache of known (and trusted)\nroot certificates.
\nTo test for the presence of SSL support in a Python installation, user code\nshould use the following idiom:
\ntry:\n import ssl\nexcept ImportError:\n pass\nelse:\n ... # do something that requires SSL support\n
This example connects to an SSL server, prints the server’s address and\ncertificate, sends some bytes, and reads part of the response:
\nimport socket, ssl, pprint\n\ns = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n\n# require a certificate from the server\nssl_sock = ssl.wrap_socket(s,\n ca_certs="/etc/ca_certs_file",\n cert_reqs=ssl.CERT_REQUIRED)\n\nssl_sock.connect(('www.verisign.com', 443))\n\nprint repr(ssl_sock.getpeername())\nprint ssl_sock.cipher()\nprint pprint.pformat(ssl_sock.getpeercert())\n\n# Send a simple HTTP request -- use httplib in actual code.\nssl_sock.write("""GET / HTTP/1.0\\r\nHost: www.verisign.com\\r\\n\\r\\n""")\n\n# Read a chunk of data. Will not necessarily\n# read all the data returned by the server.\ndata = ssl_sock.read()\n\n# note that closing the SSLSocket will also close the underlying socket\nssl_sock.close()\n
As of September 6, 2007, the certificate printed by this program looked like\nthis:
\n{'notAfter': 'May 8 23:59:59 2009 GMT',\n 'subject': ((('serialNumber', u'2497886'),),\n (('1.3.6.1.4.1.311.60.2.1.3', u'US'),),\n (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),),\n (('countryName', u'US'),),\n (('postalCode', u'94043'),),\n (('stateOrProvinceName', u'California'),),\n (('localityName', u'Mountain View'),),\n (('streetAddress', u'487 East Middlefield Road'),),\n (('organizationName', u'VeriSign, Inc.'),),\n (('organizationalUnitName',\n u'Production Security Services'),),\n (('organizationalUnitName',\n u'Terms of use at www.verisign.com/rpa (c)06'),),\n (('commonName', u'www.verisign.com'),))}\n
which is a fairly poorly-formed subject field.
\nFor server operation, typically you’d need to have a server certificate and a\nprivate key, each in a file. You’d open a socket, bind it to a port, call\nlisten() on it, then start waiting for clients to connect:
\nimport socket, ssl\n\nbindsocket = socket.socket()\nbindsocket.bind(('myaddr.mydomain.com', 10023))\nbindsocket.listen(5)\n
When one did, you’d call accept() on the socket to get the new socket from\nthe other end, and use wrap_socket() to create a server-side SSL context\nfor it:
\nwhile True:\n newsocket, fromaddr = bindsocket.accept()\n connstream = ssl.wrap_socket(newsocket,\n server_side=True,\n certfile="mycertfile",\n keyfile="mykeyfile",\n ssl_version=ssl.PROTOCOL_TLSv1)\n try:\n deal_with_client(connstream)\n finally:\n connstream.shutdown(socket.SHUT_RDWR)\n connstream.close()\n
Then you’d read data from the connstream and do something with it till you\nare finished with the client (or the client is finished with you):
\ndef deal_with_client(connstream):\n data = connstream.read()\n # null data means the client is finished with us\n while data:\n if not do_something(connstream, data):\n # we'll assume do_something returns False\n # when we're finished with client\n break\n data = connstream.read()\n # finished with client\n
And go back to listening for new client connections.
\nSource code: Lib/asyncore.py
\nThis module provides the basic infrastructure for writing asynchronous socket\nservice clients and servers.
\nThere are only two ways to have a program on a single processor do “more than\none thing at a time.” Multi-threaded programming is the simplest and most\npopular way to do it, but there is another very different technique that lets\nyou have nearly all the advantages of multi-threading, without actually using\nmultiple threads. It’s really only practical if your program is largely I/O\nbound. If your program is processor bound, then pre-emptive scheduled threads\nare probably what you really need. Network servers are rarely processor\nbound, however.
\nIf your operating system supports the select() system call in its I/O\nlibrary (and nearly all do), then you can use it to juggle multiple\ncommunication channels at once; doing other work while your I/O is taking\nplace in the “background.” Although this strategy can seem strange and\ncomplex, especially at first, it is in many ways easier to understand and\ncontrol than multi-threaded programming. The asyncore module solves\nmany of the difficult problems for you, making the task of building\nsophisticated high-performance network servers and clients a snap. For\n“conversational” applications and protocols the companion asynchat\nmodule is invaluable.
\nThe basic idea behind both modules is to create one or more network\nchannels, instances of class asyncore.dispatcher and\nasynchat.async_chat. Creating the channels adds them to a global\nmap, used by the loop() function if you do not provide it with your own\nmap.
\nOnce the initial channels are created, calling the loop() function\nactivates channel service, which continues until the last channel (including\nany that have been added to the map during asynchronous service) is closed.
\nEnter a polling loop that terminates after count passes or all open\nchannels have been closed. All arguments are optional. The count\nparameter defaults to None, resulting in the loop terminating only when all\nchannels have been closed. The timeout argument sets the timeout\nparameter for the appropriate select() or poll() call, measured\nin seconds; the default is 30 seconds. The use_poll parameter, if true,\nindicates that poll() should be used in preference to select()\n(the default is False).
\nThe map parameter is a dictionary whose items are the channels to watch.\nAs channels are closed they are deleted from their map. If map is\nomitted, a global map is used. Channels (instances of\nasyncore.dispatcher, asynchat.async_chat and subclasses\nthereof) can freely be mixed in the map.
\nThe dispatcher class is a thin wrapper around a low-level socket\nobject. To make it more useful, it has a few methods for event-handling\nwhich are called from the asynchronous loop. Otherwise, it can be treated\nas a normal non-blocking socket object.
\nThe firing of low-level events at certain times or in certain connection\nstates tells the asynchronous loop that certain higher-level events have\ntaken place. For example, if we have asked for a socket to connect to\nanother host, we know that the connection has been made when the socket\nbecomes writable for the first time (at this point you know that you may\nwrite to it with the expectation of success). The implied higher-level\nevents are:
\nEvent | Description\n---|---\nhandle_connect() | Implied by the first read or write event\nhandle_close() | Implied by a read event with no data available\nhandle_accept() | Implied by a read event on a listening socket
During asynchronous processing, each mapped channel’s readable() and\nwritable() methods are used to determine whether the channel’s socket\nshould be added to the list of channels select()ed or\npoll()ed for read and write events.
\nThus, the set of channel events is larger than the basic socket events. The\nfull set of methods that can be overridden in your subclass follows:
\nCalled when the asynchronous loop detects that a writable socket can be\nwritten. Often this method will implement the necessary buffering for\nperformance. For example:
\ndef handle_write(self):\n sent = self.send(self.buffer)\n self.buffer = self.buffer[sent:]\n
In addition, each channel delegates or extends many of the socket methods.\nMost of these are nearly identical to their socket partners.
\nHere is a very basic HTTP client that uses the dispatcher class to\nimplement its socket handling:
\nimport asyncore, socket\n\nclass HTTPClient(asyncore.dispatcher):\n\n def __init__(self, host, path):\n asyncore.dispatcher.__init__(self)\n self.create_socket(socket.AF_INET, socket.SOCK_STREAM)\n self.connect( (host, 80) )\n self.buffer = 'GET %s HTTP/1.0\\r\\n\\r\\n' % path\n\n def handle_connect(self):\n pass\n\n def handle_close(self):\n self.close()\n\n def handle_read(self):\n print self.recv(8192)\n\n def writable(self):\n return (len(self.buffer) > 0)\n\n def handle_write(self):\n sent = self.send(self.buffer)\n self.buffer = self.buffer[sent:]\n\n\nclient = HTTPClient('www.python.org', '/')\nasyncore.loop()\n
Here is a basic echo server that uses the dispatcher class to accept\nconnections and dispatches the incoming connections to a handler:
\nimport asyncore\nimport socket\n\nclass EchoHandler(asyncore.dispatcher_with_send):\n\n def handle_read(self):\n data = self.recv(8192)\n if data:\n self.send(data)\n\nclass EchoServer(asyncore.dispatcher):\n\n def __init__(self, host, port):\n asyncore.dispatcher.__init__(self)\n self.create_socket(socket.AF_INET, socket.SOCK_STREAM)\n self.set_reuse_addr()\n self.bind((host, port))\n self.listen(5)\n\n def handle_accept(self):\n pair = self.accept()\n if pair is None:\n pass\n else:\n sock, addr = pair\n print 'Incoming connection from %s' % repr(addr)\n handler = EchoHandler(sock)\n\nserver = EchoServer('localhost', 8080)\nasyncore.loop()\n
This module provides access to the BSD socket interface. It is available on\nall modern Unix systems, Windows, Mac OS X, BeOS, OS/2, and probably additional\nplatforms.
\nNote
\nSome behavior may be platform dependent, since calls are made to the operating\nsystem socket APIs.
\nFor an introduction to socket programming (in C), see the following papers: An\nIntroductory 4.3BSD Interprocess Communication Tutorial, by Stuart Sechrest and\nAn Advanced 4.3BSD Interprocess Communication Tutorial, by Samuel J. Leffler et\nal, both in the UNIX Programmer’s Manual, Supplementary Documents 1 (sections\nPS1:7 and PS1:8). The platform-specific reference material for the various\nsocket-related system calls are also a valuable source of information on the\ndetails of socket semantics. For Unix, refer to the manual pages; for Windows,\nsee the WinSock (or Winsock 2) specification. For IPv6-ready APIs, readers may\nwant to refer to RFC 3493 titled Basic Socket Interface Extensions for IPv6.
\nThe Python interface is a straightforward transliteration of the Unix system\ncall and library interface for sockets to Python’s object-oriented style: the\nsocket() function returns a socket object whose methods implement\nthe various socket system calls. Parameter types are somewhat higher-level than\nin the C interface: as with read() and write() operations on Python\nfiles, buffer allocation on receive operations is automatic, and buffer length\nis implicit on send operations.
\nSocket addresses are represented as follows: A single string is used for the\nAF_UNIX address family. A pair (host, port) is used for the\nAF_INET address family, where host is a string representing either a\nhostname in Internet domain notation like 'daring.cwi.nl' or an IPv4 address\nlike '100.50.200.5', and port is an integral port number. For\nAF_INET6 address family, a four-tuple (host, port, flowinfo,\nscopeid) is used, where flowinfo and scopeid represents sin6_flowinfo\nand sin6_scope_id member in struct sockaddr_in6 in C. For\nsocket module methods, flowinfo and scopeid can be omitted just for\nbackward compatibility. Note, however, omission of scopeid can cause problems\nin manipulating scoped IPv6 addresses. Other address families are currently not\nsupported. The address format required by a particular socket object is\nautomatically selected based on the address family specified when the socket\nobject was created.
\nFor IPv4 addresses, two special forms are accepted instead of a host address:\nthe empty string represents INADDR_ANY, and the string\n'<broadcast>' represents INADDR_BROADCAST. This behavior is not\navailable for IPv6 for backward-compatibility reasons; therefore, you may want\nto avoid these if you intend to support IPv6 with your Python programs.
\nIf you use a hostname in the host portion of an IPv4/v6 socket address, the\nprogram may show nondeterministic behavior, as Python uses the first address\nreturned from the DNS resolution. The socket address will be resolved\ndifferently into an actual IPv4/v6 address, depending on the results from DNS\nresolution and/or the host configuration. For deterministic behavior use a\nnumeric address in the host portion.
\n\nNew in version 2.5: AF_NETLINK sockets are represented as pairs pid, groups.
\n\nNew in version 2.6: Linux-only support for TIPC is also available using the AF_TIPC\naddress family. TIPC is an open, non-IP based networked protocol designed\nfor use in clustered computer environments. Addresses are represented by a\ntuple, and the fields depend on the address type. The general tuple form is\n(addr_type, v1, v2, v3 [, scope]), where:
addr_type is one of TIPC_ADDR_NAMESEQ, TIPC_ADDR_NAME, or\nTIPC_ADDR_ID.
\nscope is one of TIPC_ZONE_SCOPE, TIPC_CLUSTER_SCOPE, and\nTIPC_NODE_SCOPE.
\nIf addr_type is TIPC_ADDR_NAME, then v1 is the server type, v2 is\nthe port identifier, and v3 should be 0.
\nIf addr_type is TIPC_ADDR_NAMESEQ, then v1 is the server type, v2\nis the lower port number, and v3 is the upper port number.
\nIf addr_type is TIPC_ADDR_ID, then v1 is the node, v2 is the\nreference, and v3 should be set to 0.
\nAll errors raise exceptions. The normal exceptions for invalid argument types\nand out-of-memory conditions can be raised; errors related to socket or address\nsemantics raise the error socket.error.
\nNon-blocking mode is supported through setblocking(). A\ngeneralization of this based on timeouts is supported through\nsettimeout().
\nThe module socket exports the following constants and functions:
\nThis exception is raised for socket-related errors. The accompanying value is\neither a string telling what went wrong or a pair (errno, string)\nrepresenting an error returned by a system call, similar to the value\naccompanying os.error. See the module errno, which contains names\nfor the error codes defined by the underlying operating system.
\n\nChanged in version 2.6: socket.error is now a child class of IOError.
\nThis exception is raised for address-related errors, i.e. for functions that use\nh_errno in the C API, including gethostbyname_ex() and\ngethostbyaddr().
\nThe accompanying value is a pair (h_errno, string) representing an error\nreturned by a library call. string represents the description of h_errno, as\nreturned by the hstrerror() C function.
\nThis exception is raised when a timeout occurs on a socket which has had\ntimeouts enabled via a prior call to settimeout(). The accompanying value\nis a string whose value is currently always “timed out”.
\n\nNew in version 2.3.
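The timeout exception can be demonstrated without any network traffic by using a connected socket pair (socket.socketpair() is Unix-only, an assumption of this sketch):

```python
import socket

# One end of the pair gets a short timeout; since nothing is ever sent,
# recv() raises socket.timeout after roughly 0.1 seconds.
a, b = socket.socketpair()
a.settimeout(0.1)
try:
    a.recv(16)
    timed_out = False
except socket.timeout:
    timed_out = True
finally:
    a.close()
    b.close()
```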
\nConstants for Windows’ WSAIoctl(). The constants are used as arguments to the\nioctl() method of socket objects.
\n\nNew in version 2.6.
\nTIPC related constants, matching the ones exported by the C socket API. See\nthe TIPC documentation for more information.
\n\nNew in version 2.6.
\nThis constant contains a boolean value which indicates if IPv6 is supported on\nthis platform.
\n\nNew in version 2.3.
\nConnect to a TCP service listening on the Internet address (a 2-tuple\n(host, port)), and return the socket object. This is a higher-level\nfunction than socket.connect(): if host is a non-numeric hostname,\nit will try to resolve it for both AF_INET and AF_INET6,\nand then try to connect to all possible addresses in turn until a\nconnection succeeds. This makes it easy to write clients that are\ncompatible to both IPv4 and IPv6.
\nPassing the optional timeout parameter will set the timeout on the\nsocket instance before attempting to connect. If no timeout is\nsupplied, the global default timeout setting returned by\ngetdefaulttimeout() is used.
\nIf supplied, source_address must be a 2-tuple (host, port) for the\nsocket to bind to as its source address before connecting. If host or port\nare ‘’ or 0 respectively the OS default behavior will be used.
\n\nNew in version 2.6.
\n\nChanged in version 2.7: source_address was added.
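A self-contained sketch of create_connection(): it spins up a throwaway local listener so there is something to reach; the loopback address and the OS-assigned port are assumptions of this example, not requirements of the function.

```python
import socket

# A local listener; binding to port 0 lets the OS pick a free port.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('127.0.0.1', 0))
srv.listen(1)
port = srv.getsockname()[1]

# create_connection() resolves the address and returns a connected socket.
conn = socket.create_connection(('127.0.0.1', port), timeout=5.0)
peer = conn.getpeername()        # (host, port) of the listener
conn.close()
srv.close()
```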
\nTranslate the host/port argument into a sequence of 5-tuples that contain\nall the necessary arguments for creating a socket connected to that service.\nhost is a domain name, a string representation of an IPv4/v6 address\nor None. port is a string service name such as 'http', a numeric\nport number or None. By passing None as the value of host\nand port, you can pass NULL to the underlying C API.
\nThe family, socktype and proto arguments can be optionally specified\nin order to narrow the list of addresses returned. Passing zero as a\nvalue for each of these arguments selects the full range of results.\nThe flags argument can be one or several of the AI_* constants,\nand will influence how results are computed and returned.\nFor example, AI_NUMERICHOST will disable domain name resolution\nand will raise an error if host is a domain name.
\nThe function returns a list of 5-tuples with the following structure:
\n(family, socktype, proto, canonname, sockaddr)
\nIn these tuples, family, socktype, proto are all integers and are\nmeant to be passed to the socket() function. canonname will be\na string representing the canonical name of the host if\nAI_CANONNAME is part of the flags argument; else canonname\nwill be empty. sockaddr is a tuple describing a socket address, whose\nformat depends on the returned family (a (address, port) 2-tuple for\nAF_INET, a (address, port, flow info, scope id) 4-tuple for\nAF_INET6), and is meant to be passed to the socket.connect()\nmethod.
\nThe following example fetches address information for a hypothetical TCP\nconnection to www.python.org on port 80 (results may differ on your\nsystem if IPv6 isn’t enabled):
\n>>> socket.getaddrinfo("www.python.org", 80, 0, 0, socket.SOL_TCP)\n[(2, 1, 6, '', ('82.94.164.162', 80)),\n (10, 1, 6, '', ('2001:888:2000:d::a2', 80, 0, 0))]\n
\nNew in version 2.2.
\nReturn a fully qualified domain name for name. If name is omitted or empty,\nit is interpreted as the local host. To find the fully qualified name, the\nhostname returned by gethostbyaddr() is checked, followed by aliases for the\nhost, if available. The first name which includes a period is selected. In\ncase no fully qualified domain name is available, the hostname as returned by\ngethostname() is returned.
\n\nNew in version 2.0.
\nReturn a string containing the hostname of the machine where the Python\ninterpreter is currently executing.
\nIf you want to know the current machine’s IP address, you may want to use\ngethostbyname(gethostname()). This operation assumes that there is a\nvalid address-to-host mapping for the host, and the assumption does not\nalways hold.
\nNote: gethostname() doesn’t always return the fully qualified domain\nname; use getfqdn() (see above).
\nTranslate a socket address sockaddr into a 2-tuple (host, port). Depending\non the settings of flags, the result can contain a fully-qualified domain name\nor numeric address representation in host. Similarly, port can contain a\nstring port name or a numeric port number.
\n\nNew in version 2.2.
\nBuild a pair of connected socket objects using the given address family, socket\ntype, and protocol number. Address family, socket type, and protocol number are\nas for the socket() function above. The default family is AF_UNIX\nif defined on the platform; otherwise, the default is AF_INET.\nAvailability: Unix.
\n\nNew in version 2.4.
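A minimal sketch of socketpair() in use: the two returned sockets are already connected, so bytes written on one end are readable on the other (Unix only, per the availability note above).

```python
import socket

parent, child = socket.socketpair()
parent.sendall(b'ping')      # write on one end...
reply = child.recv(16)       # ...read it back on the other
parent.close()
child.close()
```

This is a common way to set up a communication channel between a parent process and a forked child.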
\nConvert an IPv4 address from dotted-quad string format (for example,\n‘123.45.67.89’) to 32-bit packed binary format, as a string four characters in\nlength. This is useful when conversing with a program that uses the standard C\nlibrary and needs objects of type struct in_addr, which is the C type\nfor the 32-bit packed binary this function returns.
\ninet_aton() also accepts strings with less than three dots; see the\nUnix manual page inet(3) for details.
\nIf the IPv4 address string passed to this function is invalid,\nsocket.error will be raised. Note that exactly what is valid depends on\nthe underlying C implementation of inet_aton().
\ninet_aton() does not support IPv6, and inet_pton() should be used\ninstead for IPv4/v6 dual stack support.
\nConvert a 32-bit packed IPv4 address (a string four characters in length) to its\nstandard dotted-quad string representation (for example, ‘123.45.67.89’). This\nis useful when conversing with a program that uses the standard C library and\nneeds objects of type struct in_addr, which is the C type for the\n32-bit packed binary data this function takes as an argument.
\nIf the string passed to this function is not exactly 4 bytes in length,\nsocket.error will be raised. inet_ntoa() does not support IPv6;\ninet_ntop() should be used instead for IPv4/v6 dual stack support.
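A round-trip sketch of inet_aton() and inet_ntoa(), also showing how the struct module can decode the packed form as a network-order (big-endian) unsigned integer:

```python
import socket
import struct

packed = socket.inet_aton('123.45.67.89')    # 4-byte packed form
# '!I' = big-endian (network order) unsigned 32-bit int
(as_int,) = struct.unpack('!I', packed)
text = socket.inet_ntoa(packed)              # back to dotted-quad
```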
\nConvert an IP address from its family-specific string format to a packed, binary\nformat. inet_pton() is useful when a library or network protocol calls for\nan object of type struct in_addr (similar to inet_aton()) or\nstruct in6_addr.
\nSupported values for address_family are currently AF_INET and\nAF_INET6. If the IP address string ip_string is invalid,\nsocket.error will be raised. Note that exactly what is valid depends on\nboth the value of address_family and the underlying implementation of\ninet_pton().
\nAvailability: Unix (maybe not all platforms).
\n\nNew in version 2.3.
\nConvert a packed IP address (a string of some number of characters) to its\nstandard, family-specific string representation (for example, '7.10.0.5' or\n'5aef:2b::8'). inet_ntop() is useful when a library or network protocol\nreturns an object of type struct in_addr (similar to inet_ntoa())\nor struct in6_addr.
\nSupported values for address_family are currently AF_INET and\nAF_INET6. If the string packed_ip is not the correct length for the\nspecified address family, ValueError will be raised. A\nsocket.error is raised for errors from the call to inet_ntop().
\nAvailability: Unix (maybe not all platforms).
\n\nNew in version 2.3.
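A round-trip sketch for the IPv6-capable pair inet_pton()/inet_ntop(); per the availability notes above, these may be missing on some platforms:

```python
import socket

# IPv6 addresses pack into 16 bytes; inet_ntop() returns the
# canonical compressed textual form.
packed = socket.inet_pton(socket.AF_INET6, '2001:db8::1')
text = socket.inet_ntop(socket.AF_INET6, packed)
```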
\nReturn the default timeout in seconds (float) for new socket objects. A value\nof None indicates that new socket objects have no timeout. When the socket\nmodule is first imported, the default is None.
\n\nNew in version 2.3.
\nSet the default timeout in seconds (float) for new socket objects. A value of\nNone indicates that new socket objects have no timeout. When the socket\nmodule is first imported, the default is None.
\n\nNew in version 2.3.
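A short sketch showing that sockets created after setdefaulttimeout() inherit the module-wide default:

```python
import socket

socket.setdefaulttimeout(5.0)    # new sockets inherit this timeout
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
inherited = s.gettimeout()       # 5.0, from the module default
s.close()
socket.setdefaulttimeout(None)   # restore the initial default
```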
\nSocket objects have the following methods. Except for makefile() these\ncorrespond to Unix system calls applicable to sockets.
\nBind the socket to address. The socket must not already be bound. (The format\nof address depends on the address family — see above.)
\nNote
\nThis method has historically accepted a pair of parameters for AF_INET\naddresses instead of only a tuple. This was never intentional and is no longer\navailable in Python 2.0 and later.
\nClose the socket. All future operations on the socket object will fail. The\nremote end will receive no more data (after queued data is flushed). Sockets are\nautomatically closed when they are garbage-collected.
\nNote
\nclose() releases the resource associated with a connection but\ndoes not necessarily close the connection immediately. If you want\nto close the connection in a timely fashion, call shutdown()\nbefore close().
\nConnect to a remote socket at address. (The format of address depends on the\naddress family — see above.)
\nNote
\nThis method has historically accepted a pair of parameters for AF_INET\naddresses instead of only a tuple. This was never intentional and is no longer\navailable in Python 2.0 and later.
\nLike connect(address), but return an error indicator instead of raising an\nexception for errors returned by the C-level connect() call (other\nproblems, such as “host not found,” can still raise exceptions). The error\nindicator is 0 if the operation succeeded, otherwise the value of the\nerrno variable. This is useful to support, for example, asynchronous\nconnects.
\nNote
\nThis method has historically accepted a pair of parameters for AF_INET\naddresses instead of only a tuple. This was never intentional and is no longer\navailable in Python 2.0 and later.
\nReturn the socket’s file descriptor (a small integer). This is useful with\nselect.select().
\nUnder Windows the small integer returned by this method cannot be used where a\nfile descriptor can be used (such as os.fdopen()). Unix does not have\nthis limitation.
\nPlatform: Windows
The ioctl() method is a limited interface to the WSAIoctl system\ninterface. Please refer to the Win32 documentation for more\ninformation.
\nOn other platforms, the generic fcntl.fcntl() and fcntl.ioctl()\nfunctions may be used; they accept a socket object as their first argument.
\n\nNew in version 2.6.
\nReturn a file object associated with the socket. (File objects are\ndescribed in File Objects.) The file object\nreferences a dup()ped version of the socket file descriptor, so the\nfile object and socket object may be closed or garbage-collected independently.\nThe socket must be in blocking mode (it can not have a timeout). The optional\nmode and bufsize arguments are interpreted the same way as by the built-in\nfile() function.
\nNote
\nOn Windows, the file-like object created by makefile() cannot be\nused where a file object with a file descriptor is expected, such as the\nstream arguments of subprocess.Popen().
\nReceive data from the socket. The return value is a string representing the\ndata received. The maximum amount of data to be received at once is specified\nby bufsize. See the Unix manual page recv(2) for the meaning of\nthe optional argument flags; it defaults to zero.
\nNote
\nFor best match with hardware and network realities, the value of bufsize\nshould be a relatively small power of 2, for example, 4096.
\nReceive data from the socket, writing it into buffer instead of creating a\nnew string. The return value is a pair (nbytes, address) where nbytes is\nthe number of bytes received and address is the address of the socket sending\nthe data. See the Unix manual page recv(2) for the meaning of the\noptional argument flags; it defaults to zero. (The format of address\ndepends on the address family — see above.)
\n\nNew in version 2.5.
\nReceive up to nbytes bytes from the socket, storing the data into a buffer\nrather than creating a new string. If nbytes is not specified (or 0),\nreceive up to the size available in the given buffer. Returns the number of\nbytes received. See the Unix manual page recv(2) for the meaning\nof the optional argument flags; it defaults to zero.
\n\nNew in version 2.5.
\nSet a timeout on blocking socket operations. The value argument can be a\nnonnegative float expressing seconds, or None. If a float is given,\nsubsequent socket operations will raise a timeout exception if the\ntimeout period value has elapsed before the operation has completed. Setting\na timeout of None disables timeouts on socket operations.\ns.settimeout(0.0) is equivalent to s.setblocking(0);\ns.settimeout(None) is equivalent to s.setblocking(1).
\n\nNew in version 2.3.
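The equivalence between setblocking() and settimeout() described above can be observed directly through gettimeout():

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setblocking(0)                 # same as s.settimeout(0.0)
nonblocking = s.gettimeout()     # 0.0
s.setblocking(1)                 # same as s.settimeout(None)
blocking = s.gettimeout()        # None
s.close()
```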
\nReturn the timeout in seconds (float) associated with socket operations, or\nNone if no timeout is set. This reflects the last call to\nsetblocking() or settimeout().
\n\nNew in version 2.3.
\nSome notes on socket blocking and timeouts: A socket object can be in one of\nthree modes: blocking, non-blocking, or timeout. Sockets are always created in\nblocking mode. In blocking mode, operations block until complete or\nthe system returns an error (such as connection timed out). In\nnon-blocking mode, operations fail (with an error that is unfortunately\nsystem-dependent) if they cannot be completed immediately. In timeout mode,\noperations fail if they cannot be completed within the timeout specified for the\nsocket or if the system returns an error. The setblocking()\nmethod is simply a shorthand for certain settimeout() calls.
\nTimeout mode internally sets the socket in non-blocking mode. The blocking and\ntimeout modes are shared between file descriptors and socket objects that refer\nto the same network endpoint. A consequence of this is that file objects\nreturned by the makefile() method must only be used when the\nsocket is in blocking mode; in timeout or non-blocking mode file operations\nthat cannot be completed immediately will fail.
\nNote that the connect() operation is subject to the timeout\nsetting, and in general it is recommended to call settimeout()\nbefore calling connect() or pass a timeout parameter to\ncreate_connection(). The system network stack may return a connection\ntimeout error of its own regardless of any Python socket timeout setting.
\nSet the value of the given socket option (see the Unix manual page\nsetsockopt(2)). The needed symbolic constants are defined in the\nsocket module (SO_* etc.). The value can be an integer or a\nstring representing a buffer. In the latter case it is up to the caller to\nensure that the string contains the proper bits (see the optional built-in\nmodule struct for a way to encode C structures as strings).
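A sketch of the string-buffer form of setsockopt(), using the struct module to encode a C structure as the option value; SO_LINGER is used here because its value is a two-int struct linger, and the native 'ii' layout is assumed to match the platform's definition:

```python
import socket
import struct

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# struct linger { int l_onoff; int l_linger; }; packed with 'ii'
linger = struct.pack('ii', 1, 0)             # linger on, 0 seconds
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
# Read the option back and unpack it the same way.
onoff, secs = struct.unpack('ii',
    s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
s.close()
```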
\nNote that there are no methods read() or write(); use\nrecv() and send() without flags argument instead.
\nSocket objects also have these (read-only) attributes that correspond to the\nvalues given to the socket constructor.
\nThe socket family.
\n\nNew in version 2.5.
\nThe socket type.
\n\nNew in version 2.5.
\nThe socket protocol.
\n\nNew in version 2.5.
\nHere are four minimal example programs using the TCP/IP protocol: a server that\nechoes all data that it receives back (servicing only one client), and a client\nusing it. Note that a server must perform the sequence socket(),\nbind(), listen(), accept() (possibly\nrepeating the accept() to service more than one client), while a\nclient only needs the sequence socket(), connect(). Also\nnote that the server does not send()/recv() on the\nsocket it is listening on but on the new socket returned by\naccept().
\nThe first two examples support IPv4 only.
\n# Echo server program\nimport socket\n\nHOST = '' # Symbolic name meaning all available interfaces\nPORT = 50007 # Arbitrary non-privileged port\ns = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\ns.bind((HOST, PORT))\ns.listen(1)\nconn, addr = s.accept()\nprint 'Connected by', addr\nwhile 1:\n data = conn.recv(1024)\n if not data: break\n conn.send(data)\nconn.close()\n
# Echo client program\nimport socket\n\nHOST = 'daring.cwi.nl' # The remote host\nPORT = 50007 # The same port as used by the server\ns = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\ns.connect((HOST, PORT))\ns.send('Hello, world')\ndata = s.recv(1024)\ns.close()\nprint 'Received', repr(data)\n
The next two examples are identical to the above two, but support both IPv4 and IPv6. The server side will listen to the first address family available (it should listen to both instead). On most IPv6-ready systems, IPv6 will take precedence and the server may not accept IPv4 traffic. The client side will try to connect to all the addresses returned by the name resolution, and will send traffic to the first one it connects to successfully.
\n# Echo server program\nimport socket\nimport sys\n\nHOST = None # Symbolic name meaning all available interfaces\nPORT = 50007 # Arbitrary non-privileged port\ns = None\nfor res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC,\n socket.SOCK_STREAM, 0, socket.AI_PASSIVE):\n af, socktype, proto, canonname, sa = res\n try:\n s = socket.socket(af, socktype, proto)\n except socket.error, msg:\n s = None\n continue\n try:\n s.bind(sa)\n s.listen(1)\n except socket.error, msg:\n s.close()\n s = None\n continue\n break\nif s is None:\n print 'could not open socket'\n sys.exit(1)\nconn, addr = s.accept()\nprint 'Connected by', addr\nwhile 1:\n data = conn.recv(1024)\n if not data: break\n conn.send(data)\nconn.close()\n
# Echo client program\nimport socket\nimport sys\n\nHOST = 'daring.cwi.nl' # The remote host\nPORT = 50007 # The same port as used by the server\ns = None\nfor res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC, socket.SOCK_STREAM):\n af, socktype, proto, canonname, sa = res\n try:\n s = socket.socket(af, socktype, proto)\n except socket.error, msg:\n s = None\n continue\n try:\n s.connect(sa)\n except socket.error, msg:\n s.close()\n s = None\n continue\n break\nif s is None:\n print 'could not open socket'\n sys.exit(1)\ns.send('Hello, world')\ndata = s.recv(1024)\ns.close()\nprint 'Received', repr(data)\n
The last example shows how to write a very simple network sniffer with raw\nsockets on Windows. The example requires administrator privileges to modify\nthe interface:
import socket

# the public network interface
HOST = socket.gethostbyname(socket.gethostname())

# create a raw socket and bind it to the public interface
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_IP)
s.bind((HOST, 0))

# Include IP headers
s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)

# receive all packets
s.ioctl(socket.SIO_RCVALL, socket.RCVALL_ON)

# receive a packet
print s.recvfrom(65535)

# disable promiscuous mode
s.ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF)
Running an example several times with too small a delay between executions can lead to this error:
\nsocket.error: [Errno 98] Address already in use
\nThis is because the previous execution has left the socket in a TIME_WAIT\nstate, and can’t be immediately reused.
There is a socket flag you can set to prevent this, socket.SO_REUSEADDR:
\ns = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\ns.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)\ns.bind((HOST, PORT))\n
The SO_REUSEADDR flag tells the kernel to reuse a local socket in TIME_WAIT state, without waiting for its natural timeout to expire.
\nSource code: Lib/asynchat.py
\nThis module builds on the asyncore infrastructure, simplifying\nasynchronous clients and servers and making it easier to handle protocols\nwhose elements are terminated by arbitrary strings, or are of variable length.\nasynchat defines the abstract class async_chat that you\nsubclass, providing implementations of the collect_incoming_data() and\nfound_terminator() methods. It uses the same asynchronous loop as\nasyncore, and the two types of channel, asyncore.dispatcher\nand asynchat.async_chat, can freely be mixed in the channel map.\nTypically an asyncore.dispatcher server channel generates new\nasynchat.async_chat channel objects as it receives incoming\nconnection requests.
\nThis class is an abstract subclass of asyncore.dispatcher. To make\npractical use of the code you must subclass async_chat, providing\nmeaningful collect_incoming_data() and found_terminator()\nmethods.\nThe asyncore.dispatcher methods can be used, although not all make\nsense in a message/response context.
\nLike asyncore.dispatcher, async_chat defines a set of\nevents that are generated by an analysis of socket conditions after a\nselect() call. Once the polling loop has been started the\nasync_chat object’s methods are called by the event-processing\nframework with no action on the part of the programmer.
Two class attributes can be modified to improve performance, or possibly even to conserve memory.
\nUnlike asyncore.dispatcher, async_chat allows you to\ndefine a first-in-first-out queue (fifo) of producers. A producer need\nhave only one method, more(), which should return data to be\ntransmitted on the channel.\nThe producer indicates exhaustion (i.e. that it contains no more data) by\nhaving its more() method return the empty string. At this point the\nasync_chat object removes the producer from the fifo and starts\nusing the next producer, if any. When the producer fifo is empty the\nhandle_write() method does nothing. You use the channel object’s\nset_terminator() method to describe how to recognize the end of, or\nan important breakpoint in, an incoming transmission from the remote\nendpoint.
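As a minimal sketch, a producer need be nothing more than an object with a more() method; the string_producer name below is hypothetical, not part of asynchat:

```python
class string_producer:
    """Serve a string to the channel in fixed-size chunks."""

    def __init__(self, data, chunk_size=512):
        self.data = data
        self.chunk_size = chunk_size

    def more(self):
        # Returning the empty string signals exhaustion to async_chat,
        # which then moves on to the next producer in the fifo.
        chunk = self.data[:self.chunk_size]
        self.data = self.data[self.chunk_size:]
        return chunk
```

An async_chat subclass could queue such an object with push_with_producer(string_producer(response)).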
\nTo build a functioning async_chat subclass your input methods\ncollect_incoming_data() and found_terminator() must handle the\ndata that the channel receives asynchronously. The methods are described\nbelow.
\nSets the terminating condition to be recognized on the channel. term\nmay be any of three types of value, corresponding to three different ways\nto handle incoming protocol data.
term | Description
---|---
string | Will call found_terminator() when the string is found in the input stream
integer | Will call found_terminator() when the indicated number of characters have been received
None | The channel continues to collect data forever
Note that any data following the terminator will be available for reading\nby the channel after found_terminator() is called.
\nA fifo holding data which has been pushed by the application but\nnot yet popped for writing to the channel. A fifo is a list used\nto hold data and/or producers until they are required. If the list\nargument is provided then it should contain producers or data items to be\nwritten to the channel.
\nThe following partial example shows how HTTP requests can be read with\nasync_chat. A web server might create an\nhttp_request_handler object for each incoming client connection.\nNotice that initially the channel terminator is set to match the blank line at\nthe end of the HTTP headers, and a flag indicates that the headers are being\nread.
\nOnce the headers have been read, if the request is of type POST (indicating\nthat further data are present in the input stream) then the\nContent-Length: header is used to set a numeric terminator to read the\nright amount of data from the channel.
\nThe handle_request() method is called once all relevant input has been\nmarshalled, after setting the channel terminator to None to ensure that\nany extraneous data sent by the web client are ignored.
\nclass http_request_handler(asynchat.async_chat):\n\n def __init__(self, sock, addr, sessions, log):\n asynchat.async_chat.__init__(self, sock=sock)\n self.addr = addr\n self.sessions = sessions\n self.ibuffer = []\n self.obuffer = ""\n self.set_terminator("\\r\\n\\r\\n")\n self.reading_headers = True\n self.handling = False\n self.cgi_data = None\n self.log = log\n\n def collect_incoming_data(self, data):\n """Buffer the data"""\n self.ibuffer.append(data)\n\n def found_terminator(self):\n if self.reading_headers:\n self.reading_headers = False\n self.parse_headers("".join(self.ibuffer))\n self.ibuffer = []\n if self.op.upper() == "POST":\n clen = self.headers.getheader("content-length")\n self.set_terminator(int(clen))\n else:\n self.handling = True\n self.set_terminator(None)\n self.handle_request()\n elif not self.handling:\n self.set_terminator(None) # browsers sometimes over-send\n self.cgi_data = parse(self.headers, "".join(self.ibuffer))\n self.handling = True\n self.ibuffer = []\n self.handle_request()\n
Source code: Lib/mailcap.py
\nMailcap files are used to configure how MIME-aware applications such as mail\nreaders and Web browsers react to files with different MIME types. (The name\n“mailcap” is derived from the phrase “mail capability”.) For example, a mailcap\nfile might contain a line like video/mpeg; xmpeg %s. Then, if the user\nencounters an email message or Web document with the MIME type\nvideo/mpeg, %s will be replaced by a filename (usually one\nbelonging to a temporary file) and the xmpeg program can be\nautomatically started to view the file.
\nThe mailcap format is documented in RFC 1524, “A User Agent Configuration\nMechanism For Multimedia Mail Format Information,” but is not an Internet\nstandard. However, mailcap files are supported on most Unix systems.
\nReturn a 2-tuple; the first element is a string containing the command line to\nbe executed (which can be passed to os.system()), and the second element\nis the mailcap entry for a given MIME type. If no matching MIME type can be\nfound, (None, None) is returned.
\nkey is the name of the field desired, which represents the type of activity to\nbe performed; the default value is ‘view’, since in the most common case you\nsimply want to view the body of the MIME-typed data. Other possible values\nmight be ‘compose’ and ‘edit’, if you wanted to create a new body of the given\nMIME type or alter the existing body data. See RFC 1524 for a complete list\nof these fields.
\nfilename is the filename to be substituted for %s in the command line; the\ndefault value is '/dev/null' which is almost certainly not what you want, so\nusually you’ll override it by specifying a filename.
\nplist can be a list containing named parameters; the default value is simply\nan empty list. Each entry in the list must be a string containing the parameter\nname, an equals sign ('='), and the parameter’s value. Mailcap entries can\ncontain named parameters like %{foo}, which will be replaced by the value\nof the parameter named ‘foo’. For example, if the command line showpartial\n%{id} %{number} %{total} was in a mailcap file, and plist was set to\n['id=1', 'number=2', 'total=3'], the resulting command line would be\n'showpartial 1 2 3'.
\nIn a mailcap file, the “test” field can optionally be specified to test some\nexternal condition (such as the machine architecture, or the window system in\nuse) to determine whether or not the mailcap line applies. findmatch()\nwill automatically check such conditions and skip the entry if the check fails.
\nReturns a dictionary mapping MIME types to a list of mailcap file entries. This\ndictionary must be passed to the findmatch() function. An entry is stored\nas a list of dictionaries, but it shouldn’t be necessary to know the details of\nthis representation.
\nThe information is derived from all of the mailcap files found on the system.\nSettings in the user’s mailcap file $HOME/.mailcap will override\nsettings in the system mailcap files /etc/mailcap,\n/usr/etc/mailcap, and /usr/local/etc/mailcap.
\nAn example usage:
\n>>> import mailcap\n>>> d=mailcap.getcaps()\n>>> mailcap.findmatch(d, 'video/mpeg', filename='/tmp/tmp1223')\n('xmpeg /tmp/tmp1223', {'view': 'xmpeg %s'})\n
\nNew in version 2.2.
\nThe email package is a library for managing email messages, including\nMIME and other RFC 2822-based message documents. It subsumes most of the\nfunctionality in several older standard modules such as rfc822,\nmimetools, multifile, and other non-standard packages such as\nmimecntl. It is specifically not designed to do any sending of email\nmessages to SMTP (RFC 2821), NNTP, or other servers; those are functions of\nmodules such as smtplib and nntplib. The email package\nattempts to be as RFC-compliant as possible, supporting in addition to\nRFC 2822, such MIME-related RFCs as RFC 2045, RFC 2046, RFC 2047,\nand RFC 2231.
\nThe primary distinguishing feature of the email package is that it splits\nthe parsing and generating of email messages from the internal object model\nrepresentation of email. Applications using the email package deal\nprimarily with objects; you can add sub-objects to messages, remove sub-objects\nfrom messages, completely re-arrange the contents, etc. There is a separate\nparser and a separate generator which handles the transformation from flat text\nto the object model, and then back to flat text again. There are also handy\nsubclasses for some common MIME object types, and a few miscellaneous utilities\nthat help with such common tasks as extracting and parsing message field values,\ncreating RFC-compliant dates, etc.
\nThe following sections describe the functionality of the email package.\nThe ordering follows a progression that should be common in applications: an\nemail message is read as flat text from a file or other source, the text is\nparsed to produce the object structure of the email message, this structure is\nmanipulated, and finally, the object tree is rendered back into flat text.
\nIt is perfectly feasible to create the object structure out of whole cloth —\ni.e. completely from scratch. From there, a similar progression can be taken as\nabove.
\nAlso included are detailed specifications of all the classes and modules that\nthe email package provides, the exception classes you might encounter\nwhile using the email package, some auxiliary utilities, and a few\nexamples. For users of the older mimelib package, or previous versions\nof the email package, a section on differences and porting is provided.
\nContents of the email package documentation:
\nThis table describes the release history of the email package, corresponding to\nthe version of Python that the package was released with. For purposes of this\ndocument, when you see a note about change or added versions, these refer to the\nPython version the change was made in, not the email package version. This\ntable also describes the Python compatibility of each version of the package.
email version | distributed with | compatible with
---|---|---
1.x | Python 2.2.0 to Python 2.2.1 | no longer supported
2.5 | Python 2.2.2+ and Python 2.3 | Python 2.1 to 2.5
3.0 | Python 2.4 | Python 2.3 to 2.5
4.0 | Python 2.5 | Python 2.3 to 2.5
Here are the major differences between email version 4 and version 3:
\nAll modules have been renamed according to PEP 8 standards. For example,\nthe version 3 module email.Message was renamed to email.message in\nversion 4.
\nA new subpackage email.mime was added and all the version 3\nemail.MIME* modules were renamed and situated into the email.mime\nsubpackage. For example, the version 3 module email.MIMEText was renamed\nto email.mime.text.
\nNote that the version 3 names will continue to work until Python 2.6.
\nThe email.mime.application module was added, which contains the\nMIMEApplication class.
\nMethods that were deprecated in version 3 have been removed. These include\nGenerator.__call__(), Message.get_type(),\nMessage.get_main_type(), Message.get_subtype().
\nFixes have been added for RFC 2231 support which can change some of the\nreturn types for Message.get_param() and friends. Under some\ncircumstances, values which used to return a 3-tuple now return simple strings\n(specifically, if all extended parameter segments were unencoded, there is no\nlanguage and charset designation expected, so the return type is now a simple\nstring). Also, %-decoding used to be done for both encoded and unencoded\nsegments; this decoding is now done only for encoded segments.
\nHere are the major differences between email version 3 and version 2:
\nHere are the differences between email version 2 and version 1:
\nThe email.Header and email.Charset modules have been added.
\nThe pickle format for Message instances has changed. Since this was\nnever (and still isn’t) formally defined, this isn’t considered a backward\nincompatibility. However if your application pickles and unpickles\nMessage instances, be aware that in email version 2,\nMessage instances now have private variables _charset and\n_default_type.
\nSeveral methods in the Message class have been deprecated, or their\nsignatures changed. Also, many new methods have been added. See the\ndocumentation for the Message class for details. The changes should be\ncompletely backward compatible.
The object structure has changed in the face of message/rfc822 content types. In email version 1, such a type was represented by a scalar payload, i.e. the container message’s is_multipart() returned false and get_payload() returned a single Message instance rather than a list.
\nThis structure was inconsistent with the rest of the package, so the object\nrepresentation for message/rfc822 content types was changed. In\nemail version 2, the container does return True from\nis_multipart(), and get_payload() returns a list containing a single\nMessage item.
\nNote that this is one place that backward compatibility could not be completely\nmaintained. However, if you’re already testing the return type of\nget_payload(), you should be fine. You just need to make sure your code\ndoesn’t do a set_payload() with a Message instance on a container\nwith a content type of message/rfc822.
\nThe Parser constructor’s strict argument was added, and its\nparse() and parsestr() methods grew a headersonly argument. The\nstrict flag was also added to functions email.message_from_file() and\nemail.message_from_string().
\nGenerator.__call__() is deprecated; use Generator.flatten()\ninstead. The Generator class has also grown the clone() method.
\nThe DecodedGenerator class in the email.Generator module was\nadded.
\nThe intermediate base classes MIMENonMultipart and\nMIMEMultipart have been added, and interposed in the class hierarchy\nfor most of the other MIME-related derived classes.
\nThe _encoder argument to the MIMEText constructor has been\ndeprecated. Encoding now happens implicitly based on the _charset argument.
\nThe following functions in the email.Utils module have been deprecated:\ndump_address_pairs(), decode(), and encode(). The following\nfunctions have been added to the module: make_msgid(),\ndecode_rfc2231(), encode_rfc2231(), and decode_params().
\nThe non-public function email.Iterators._structure() was added.
\nThe email package was originally prototyped as a separate library called\nmimelib. Changes have been made so that method names\nare more consistent, and some methods or modules have either been added or\nremoved. The semantics of some of the methods have also changed. For the most\npart, any functionality available in mimelib is still available in the\nemail package, albeit often in a different way. Backward compatibility\nbetween the mimelib package and the email package was not a\npriority.
\nHere is a brief description of the differences between the mimelib and\nthe email packages, along with hints on how to port your applications.
\nOf course, the most visible difference between the two packages is that the\npackage name has been changed to email. In addition, the top-level\npackage has the following differences:
\nThe Message class has the following differences:
\nThe Parser class has no differences in its public interface. It does\nhave some additional smarts to recognize message/delivery-status\ntype messages, which it represents as a Message instance containing\nseparate Message subparts for each header block in the delivery status\nnotification [1].
\nThe Generator class has no differences in its public interface. There\nis a new class in the email.generator module though, called\nDecodedGenerator which provides most of the functionality previously\navailable in the Message.getpayloadastext() method.
\nThe following modules and classes have been changed:
\nThe MIMEBase class constructor arguments _major and _minor have\nchanged to _maintype and _subtype respectively.
\nThe Image class/module has been renamed to MIMEImage. The _minor\nargument has been renamed to _subtype.
\nThe Text class/module has been renamed to MIMEText. The _minor\nargument has been renamed to _subtype.
\nThe MessageRFC822 class/module has been renamed to MIMEMessage. Note\nthat an earlier version of mimelib called this class/module RFC822,\nbut that clashed with the Python standard library module rfc822 on some\ncase-insensitive file systems.
\nAlso, the MIMEMessage class now represents any kind of MIME message\nwith main type message. It takes an optional argument _subtype\nwhich is used to set the MIME subtype. _subtype defaults to\nrfc822.
\nmimelib provided some utility functions in its address and\ndate modules. All of these functions have been moved to the\nemail.utils module.
\nThe MsgReader class/module has been removed. Its functionality is most\nclosely supported in the body_line_iterator() function in the\nemail.iterators module.
\nFootnotes
[1] Delivery Status Notifications (DSN) are defined in RFC 1894.
\nNew in version 2.6.
\nJSON (JavaScript Object Notation) is a subset of JavaScript\nsyntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
\njson exposes an API familiar to users of the standard library\nmarshal and pickle modules.
\nEncoding basic Python object hierarchies:
\n>>> import json\n>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])\n'["foo", {"bar": ["baz", null, 1.0, 2]}]'\n>>> print json.dumps("\\"foo\\bar")\n"\\"foo\\bar"\n>>> print json.dumps(u'\\u1234')\n"\\u1234"\n>>> print json.dumps('\\\\')\n"\\\\"\n>>> print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)\n{"a": 0, "b": 0, "c": 0}\n>>> from StringIO import StringIO\n>>> io = StringIO()\n>>> json.dump(['streaming API'], io)\n>>> io.getvalue()\n'["streaming API"]'\n
Compact encoding:
\n>>> import json\n>>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))\n'[1,2,3,{"4":5,"6":7}]'\n
Pretty printing:
\n>>> import json\n>>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)\n{\n "4": 5,\n "6": 7\n}\n
Decoding JSON:
\n>>> import json\n>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')\n[u'foo', {u'bar': [u'baz', None, 1.0, 2]}]\n>>> json.loads('"\\\\"foo\\\\bar"')\nu'"foo\\x08ar'\n>>> from StringIO import StringIO\n>>> io = StringIO('["streaming API"]')\n>>> json.load(io)\n[u'streaming API']\n
Specializing JSON object decoding:
\n>>> import json\n>>> def as_complex(dct):\n... if '__complex__' in dct:\n... return complex(dct['real'], dct['imag'])\n... return dct\n...\n>>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',\n... object_hook=as_complex)\n(1+2j)\n>>> import decimal\n>>> json.loads('1.1', parse_float=decimal.Decimal)\nDecimal('1.1')\n
Extending JSONEncoder:
>>> import json
>>> class ComplexEncoder(json.JSONEncoder):
...     def default(self, obj):
...         if isinstance(obj, complex):
...             return [obj.real, obj.imag]
...         return json.JSONEncoder.default(self, obj)
...
>>> json.dumps(2 + 1j, cls=ComplexEncoder)
'[2.0, 1.0]'
>>> ComplexEncoder().encode(2 + 1j)
'[2.0, 1.0]'
>>> list(ComplexEncoder().iterencode(2 + 1j))
['[', '2.0', ', ', '1.0', ']']
Using json.tool from the shell to validate and pretty-print:
\n$ echo '{"json":"obj"}' | python -mjson.tool\n{\n "json": "obj"\n}\n$ echo '{ 1.2:3.4}' | python -mjson.tool\nExpecting property name: line 1 column 2 (char 2)\n
Note
\nThe JSON produced by this module’s default settings is a subset of\nYAML, so it may be used as a serializer for that as well.
\nSerialize obj as a JSON formatted stream to fp (a .write()-supporting\nfile-like object).
\nIf skipkeys is True (default: False), then dict keys that are not\nof a basic type (str, unicode, int, long,\nfloat, bool, None) will be skipped instead of raising a\nTypeError.
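A minimal illustration of skipkeys; the tuple key below is just a stand-in for any key that is not of a basic type:

```python
import json

data = {('tuple', 'key'): 1, 'ok': 2}

# Without skipkeys, the non-basic key raises TypeError.
try:
    json.dumps(data)
except TypeError:
    pass

# With skipkeys=True the offending key is silently dropped.
result = json.dumps(data, skipkeys=True)
```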
\nIf ensure_ascii is False (default: True), then some chunks written\nto fp may be unicode instances, subject to normal Python\nstr to unicode coercion rules. Unless fp.write()\nexplicitly understands unicode (as in codecs.getwriter()) this\nis likely to cause an error.
\nIf check_circular is False (default: True), then the circular\nreference check for container types will be skipped and a circular reference\nwill result in an OverflowError (or worse).
\nIf allow_nan is False (default: True), then it will be a\nValueError to serialize out of range float values (nan,\ninf, -inf) in strict compliance of the JSON specification, instead of\nusing the JavaScript equivalents (NaN, Infinity, -Infinity).
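The two behaviors side by side, as a quick sketch:

```python
import json

# Default: JavaScript-style tokens, which are not valid strict JSON.
lenient = json.dumps([float('nan'), float('inf'), float('-inf')])
# lenient is '[NaN, Infinity, -Infinity]'

# allow_nan=False enforces the JSON specification instead.
try:
    json.dumps(float('nan'), allow_nan=False)
except ValueError:
    strict_failed = True
```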
\nIf indent is a non-negative integer, then JSON array elements and object\nmembers will be pretty-printed with that indent level. An indent level of 0,\nor negative, will only insert newlines. None (the default) selects the\nmost compact representation.
\nIf separators is an (item_separator, dict_separator) tuple, then it\nwill be used instead of the default (', ', ': ') separators. (',',\n':') is the most compact JSON representation.
\nencoding is the character encoding for str instances, default is UTF-8.
\ndefault(obj) is a function that should return a serializable version of\nobj or raise TypeError. The default simply raises TypeError.
\nTo use a custom JSONEncoder subclass (e.g. one that overrides the\ndefault() method to serialize additional types), specify it with the\ncls kwarg; otherwise JSONEncoder is used.
\n\nSerialize obj to a JSON formatted str.
\nIf ensure_ascii is False, then the return value will be a\nunicode instance. The other arguments have the same meaning as in\ndump().
\nDeserialize fp (a .read()-supporting file-like object containing a JSON\ndocument) to a Python object.
\nIf the contents of fp are encoded with an ASCII based encoding other than\nUTF-8 (e.g. latin-1), then an appropriate encoding name must be specified.\nEncodings that are not ASCII based (such as UCS-2) are not allowed, and\nshould be wrapped with codecs.getreader(encoding)(fp), or simply decoded\nto a unicode object and passed to loads().
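A sketch of the codecs.getreader() wrapping described above, using a Latin-1 encoded stream; the in-memory byte buffer stands in for a file opened in binary mode:

```python
import codecs
import io
import json

# JSON text encoded as Latin-1 rather than UTF-8.
raw = io.BytesIO(u'["caf\u00e9"]'.encode('latin-1'))

# codecs.getreader() decodes the byte stream on the fly, so
# json.load() only ever sees decoded text.
obj = json.load(codecs.getreader('latin-1')(raw))
```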
\nobject_hook is an optional function that will be called with the result of\nany object literal decoded (a dict). The return value of\nobject_hook will be used instead of the dict. This feature can be used\nto implement custom decoders (e.g. JSON-RPC class hinting).
\nobject_pairs_hook is an optional function that will be called with the\nresult of any object literal decoded with an ordered list of pairs. The\nreturn value of object_pairs_hook will be used instead of the\ndict. This feature can be used to implement custom decoders that\nrely on the order that the key and value pairs are decoded (for example,\ncollections.OrderedDict() will remember the order of insertion). If\nobject_hook is also defined, the object_pairs_hook takes priority.
\n\nChanged in version 2.7: Added support for object_pairs_hook.
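For example, an order-preserving decode using collections.OrderedDict as the hook:

```python
import json
from collections import OrderedDict

# The hook receives the key/value pairs in the order they appear in
# the document, so an OrderedDict preserves that order.
data = json.loads('{"one": 1, "two": 2, "three": 3}',
                  object_pairs_hook=OrderedDict)
```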
\nparse_float, if specified, will be called with the string of every JSON\nfloat to be decoded. By default, this is equivalent to float(num_str).\nThis can be used to use another datatype or parser for JSON floats\n(e.g. decimal.Decimal).
\nparse_int, if specified, will be called with the string of every JSON int\nto be decoded. By default, this is equivalent to int(num_str). This can\nbe used to use another datatype or parser for JSON integers\n(e.g. float).
\nparse_constant, if specified, will be called with one of the following\nstrings: '-Infinity', 'Infinity', 'NaN', 'null', 'true',\n'false'. This can be used to raise an exception if invalid JSON numbers\nare encountered.
\nTo use a custom JSONDecoder subclass, specify it with the cls\nkwarg; otherwise JSONDecoder is used. Additional keyword arguments\nwill be passed to the constructor of the class.
\nDeserialize s (a str or unicode instance containing a JSON\ndocument) to a Python object.
\nIf s is a str instance and is encoded with an ASCII based encoding\nother than UTF-8 (e.g. latin-1), then an appropriate encoding name must be\nspecified. Encodings that are not ASCII based (such as UCS-2) are not\nallowed and should be decoded to unicode first.
\nThe other arguments have the same meaning as in load().
\nSimple JSON decoder.
\nPerforms the following translations in decoding by default:
JSON | Python
---|---
object | dict
array | list
string | unicode
number (int) | int, long
number (real) | float
true | True
false | False
null | None
It also understands NaN, Infinity, and -Infinity as their\ncorresponding float values, which is outside the JSON spec.
\nencoding determines the encoding used to interpret any str objects\ndecoded by this instance (UTF-8 by default). It has no effect when decoding\nunicode objects.
\nNote that currently only encodings that are a superset of ASCII work, strings\nof other encodings should be passed in as unicode.
\nobject_hook, if specified, will be called with the result of every JSON\nobject decoded and its return value will be used in place of the given\ndict. This can be used to provide custom deserializations (e.g. to\nsupport JSON-RPC class hinting).
\nobject_pairs_hook, if specified will be called with the result of every\nJSON object decoded with an ordered list of pairs. The return value of\nobject_pairs_hook will be used instead of the dict. This\nfeature can be used to implement custom decoders that rely on the order\nthat the key and value pairs are decoded (for example,\ncollections.OrderedDict() will remember the order of insertion). If\nobject_hook is also defined, the object_pairs_hook takes priority.
\n\nChanged in version 2.7: Added support for object_pairs_hook.
\nparse_float, if specified, will be called with the string of every JSON\nfloat to be decoded. By default, this is equivalent to float(num_str).\nThis can be used to use another datatype or parser for JSON floats\n(e.g. decimal.Decimal).
\nparse_int, if specified, will be called with the string of every JSON int\nto be decoded. By default, this is equivalent to int(num_str). This can\nbe used to use another datatype or parser for JSON integers\n(e.g. float).
\nparse_constant, if specified, will be called with one of the following\nstrings: '-Infinity', 'Infinity', 'NaN', 'null', 'true',\n'false'. This can be used to raise an exception if invalid JSON numbers\nare encountered.
\nIf strict is False (True is the default), then control characters\nwill be allowed inside strings. Control characters in this context are\nthose with character codes in the 0-31 range, including '\\t' (tab),\n'\\n', '\\r' and '\\0'.
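A short sketch of the difference, using a raw (unescaped) tab character inside a JSON string; the strict keyword is forwarded to the decoder by loads():

```python
import json

text = '"col1\tcol2"'

# The default strict mode rejects the embedded control character...
try:
    json.loads(text)
except ValueError:
    rejected = True

# ...while strict=False accepts it as-is.
value = json.loads(text, strict=False)
```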
Extensible JSON encoder for Python data structures.
\nSupports the following objects and types by default:
Python | JSON
---|---
dict | object
list, tuple | array
str, unicode | string
int, long, float | number
True | true
False | false
None | null
To extend this to recognize other objects, subclass and implement a\ndefault() method with another method that returns a serializable object\nfor o if possible, otherwise it should call the superclass implementation\n(to raise TypeError).
\nIf skipkeys is False (the default), then it is a TypeError to\nattempt encoding of keys that are not str, int, long, float or None. If\nskipkeys is True, such items are simply skipped.
\nIf ensure_ascii is True (the default), the output is guaranteed to be\nstr objects with all incoming unicode characters escaped. If\nensure_ascii is False, the output will be a unicode object.
\nIf check_circular is True (the default), then lists, dicts, and custom\nencoded objects will be checked for circular references during encoding to\nprevent an infinite recursion (which would cause an OverflowError).\nOtherwise, no such check takes place.
\nIf allow_nan is True (the default), then NaN, Infinity, and\n-Infinity will be encoded as such. This behavior is not JSON\nspecification compliant, but is consistent with most JavaScript based\nencoders and decoders. Otherwise, it will be a ValueError to encode\nsuch floats.
\nIf sort_keys is True (default False), then the output of dictionaries\nwill be sorted by key; this is useful for regression tests to ensure that\nJSON serializations can be compared on a day-to-day basis.
\nIf indent is a non-negative integer (it is None by default), then JSON\narray elements and object members will be pretty-printed with that indent\nlevel. An indent level of 0 will only insert newlines. None is the most\ncompact representation.
\nIf specified, separators should be an (item_separator, key_separator)\ntuple. The default is (', ', ': '). To get the most compact JSON\nrepresentation, you should specify (',', ':') to eliminate whitespace.
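A brief illustration of how sort_keys and separators interact (the sample data is arbitrary):

```python
import json

data = {"b": 1, "a": [1, 2]}

# sort_keys=True orders object members; the default separators add spaces.
spaced = json.JSONEncoder(sort_keys=True).encode(data)
print(spaced)   # {"a": [1, 2], "b": 1}

# Compact separators eliminate all optional whitespace.
compact = json.JSONEncoder(sort_keys=True, separators=(',', ':')).encode(data)
print(compact)  # {"a":[1,2],"b":1}
```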
\nIf specified, default is a function that gets called for objects that can’t\notherwise be serialized. It should return a JSON encodable version of the\nobject or raise a TypeError.
\nIf encoding is not None, then all input strings will be transformed\ninto unicode using that encoding prior to JSON-encoding. The default is\nUTF-8.
\nImplement this method in a subclass such that it returns a serializable\nobject for o, or calls the base implementation (to raise a\nTypeError).
\nFor example, to support arbitrary iterators, you could implement default\nlike this:
\ndef default(self, o):\n try:\n iterable = iter(o)\n except TypeError:\n pass\n else:\n return list(iterable)\n return JSONEncoder.default(self, o)\n
Return a JSON string representation of a Python data structure, o. For\nexample:
\n>>> JSONEncoder().encode({"foo": ["bar", "baz"]})\n'{"foo": ["bar", "baz"]}'\n
Encode the given object, o, and yield each string representation as\navailable. For example:
\nfor chunk in JSONEncoder().iterencode(bigobject):\n mysocket.write(chunk)\n
\nDeprecated since version 2.3: The email package should be used in preference to the mimetools\nmodule. This module is present only to maintain backward compatibility, and\nit has been removed in 3.x.
This module defines a subclass of the rfc822 module’s Message class and a number of utility functions that are useful for the manipulation of MIME multipart or encoded messages.
\nIt defines the following items:
\nSee also
\nThe Message class defines the following methods in addition to the\nrfc822.Message methods:
Deprecated since version 2.6: The mhlib module has been removed in Python 3.0. Use the mailbox module instead.
\nThe mhlib module provides a Python interface to MH folders and their\ncontents.
\nThe module contains three basic classes, MH, which represents a\nparticular collection of folders, Folder, which represents a single\nfolder, and Message, which represents a single message.
\n\n\n\n\nMH instances have the following methods:
\nFolder instances represent open folders and have the following methods:
\nThe Message class adds one method to those of\nmimetools.Message:
\nSource code: Lib/mimetypes.py
\nThe mimetypes module converts between a filename or URL and the MIME type\nassociated with the filename extension. Conversions are provided from filename\nto MIME type and from MIME type to filename extension; encodings are not\nsupported for the latter conversion.
\nThe module provides one class and a number of convenience functions. The\nfunctions are the normal interface to this module, but some applications may be\ninterested in the class as well.
\nThe functions described below provide the primary interface for this module. If\nthe module has not been initialized, they will call init() if they rely on\nthe information init() sets up.
\nGuess the type of a file based on its filename or URL, given by url. The\nreturn value is a tuple (type, encoding) where type is None if the\ntype can’t be guessed (missing or unknown suffix) or a string of the form\n'type/subtype', usable for a MIME content-type header.
\nencoding is None for no encoding or the name of the program used to encode\n(e.g. compress or gzip). The encoding is suitable for use\nas a Content-Encoding header, not as a\nContent-Transfer-Encoding header. The mappings are table driven.\nEncoding suffixes are case sensitive; type suffixes are first tried case\nsensitively, then case insensitively.
\nThe optional strict argument is a flag specifying whether the list of known MIME types\nis limited to only the official types registered with IANA.\nWhen strict is True (the default), only the IANA types are supported; when\nstrict is False, some additional non-standard but commonly used MIME types\nare also recognized.
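For example (the file names here are hypothetical; the results come from the module's built-in type map):

```python
import mimetypes

# A suffix with a known encoding and a known type.
print(mimetypes.guess_type('archive.tar.gz'))  # ('application/x-tar', 'gzip')
# A plain type with no encoding.
print(mimetypes.guess_type('page.html'))       # ('text/html', None)
# An unrecognized suffix yields (None, None).
print(mimetypes.guess_type('unknown.xyzzy'))   # (None, None)
```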
\nGuess the extensions for a file based on its MIME type, given by type. The\nreturn value is a list of strings giving all possible filename extensions,\nincluding the leading dot ('.'). The extensions are not guaranteed to have\nbeen associated with any particular data stream, but would be mapped to the MIME\ntype type by guess_type().
\nThe optional strict argument has the same meaning as with the guess_type() function.
\nGuess the extension for a file based on its MIME type, given by type. The\nreturn value is a string giving a filename extension, including the leading dot\n('.'). The extension is not guaranteed to have been associated with any\nparticular data stream, but would be mapped to the MIME type type by\nguess_type(). If no extension can be guessed for type, None is\nreturned.
\nThe optional strict argument has the same meaning as with the guess_type() function.
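A short sketch of both functions (the 'application/x-unknown-demo' type is invented here to show the None result):

```python
import mimetypes

# All known extensions for a type; contents and order vary with the
# platform's installed type maps, but '.html' is always among them.
exts = mimetypes.guess_all_extensions('text/html')
print('.html' in exts)  # True

# guess_extension() returns a single extension, or None if none is known.
print(mimetypes.guess_extension('application/x-unknown-demo'))  # None
```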
\nSome additional functions and data items are available for controlling the\nbehavior of the module.
\nInitialize the internal data structures. If given, files must be a sequence\nof file names which should be used to augment the default type map. If omitted,\nthe file names to use are taken from knownfiles; on Windows, the\ncurrent registry settings are loaded. Each file named in files or\nknownfiles takes precedence over those named before it. Calling\ninit() repeatedly is allowed.
\n\nChanged in version 2.7: Previously, Windows registry settings were ignored.
\nAdd a mapping from the MIME type type to the extension ext. When the\nextension is already known, the new type will replace the old one. When the type\nis already known the extension will be added to the list of known extensions.
When strict is True (the default), the mapping will be added to the official MIME types, otherwise to the non-standard ones.
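A minimal sketch of registering a mapping (the 'application/x-example' type and '.exmpl' suffix are invented for illustration):

```python
import mimetypes

# Register a made-up type/extension pair in the default database.
mimetypes.add_type('application/x-example', '.exmpl')

print(mimetypes.guess_type('data.exmpl'))                  # ('application/x-example', None)
print(mimetypes.guess_extension('application/x-example'))  # '.exmpl'
```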
\nList of type map file names commonly installed. These files are typically named\nmime.types and are installed in different locations by different\npackages.
\nAn example usage of the module:
\n>>> import mimetypes\n>>> mimetypes.init()\n>>> mimetypes.knownfiles\n['/etc/mime.types', '/etc/httpd/mime.types', ... ]\n>>> mimetypes.suffix_map['.tgz']\n'.tar.gz'\n>>> mimetypes.encodings_map['.gz']\n'gzip'\n>>> mimetypes.types_map['.tgz']\n'application/x-tar-gz'\n
The MimeTypes class may be useful for applications that want more than one MIME-type database; it provides an interface similar to the one of the mimetypes module.
\nThis class represents a MIME-types database. By default, it provides access to\nthe same database as the rest of this module. The initial database is a copy of\nthat provided by the module, and may be extended by loading additional\nmime.types-style files into the database using the read() or\nreadfp() methods. The mapping dictionaries may also be cleared before\nloading additional data if the default data is not desired.
\nThe optional filenames parameter can be used to cause additional files to be\nloaded “on top” of the default database.
\nLoad MIME information from a file named filename. This uses readfp() to\nparse the file.
\nIf strict is True, information will be added to list of standard types,\nelse to the list of non-standard types.
\nLoad MIME type information from an open file fp. The file must have the format of\nthe standard mime.types files.
\nIf strict is True, information will be added to the list of standard\ntypes, else to the list of non-standard types.
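A minimal sketch of loading a private database from an in-memory file (the 'application/x-demo' entry is invented):

```python
import io
import mimetypes

db = mimetypes.MimeTypes()
# A minimal mime.types-style entry: "type/subtype  extension [extension ...]"
db.readfp(io.StringIO('application/x-demo  demo dmo\n'))

print(db.guess_type('report.demo'))  # ('application/x-demo', None)
print(db.guess_type('report.dmo'))   # ('application/x-demo', None)
```

Because the entries go into this MimeTypes instance only, the module-level database is left untouched.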
\nLoad MIME type information from the Windows registry. Availability: Windows.
\nIf strict is True, information will be added to the list of standard\ntypes, else to the list of non-standard types.
\n\nNew in version 2.7.
\n\nDeprecated since version 2.3: The email package should be used in preference to the MimeWriter\nmodule. This module is present only to maintain backward compatibility.
\nThis module defines the class MimeWriter. The MimeWriter\nclass implements a basic formatter for creating MIME multi-part files. It\ndoesn’t seek around the output file nor does it use large amounts of buffer\nspace. You must write the parts out in the order that they should occur in the\nfinal file. MimeWriter does buffer the headers you add, allowing you\nto rearrange their order.
\nMimeWriter instances have the following methods:
\n\nDeprecated since version 2.3: The email package should be used in preference to the mimify\nmodule. This module is present only to maintain backward compatibility.
\nThe mimify module defines two functions to convert mail messages to and\nfrom MIME format. The mail message can be either a simple message or a\nso-called multipart message. Each part is treated separately. Mimifying (a part\nof) a message entails encoding the message as quoted-printable if it contains\nany characters that cannot be represented using 7-bit ASCII. Unmimifying (a\npart of) a message entails undoing the quoted-printable encoding. Mimify and\nunmimify are especially useful when a message has to be edited before being\nsent. Typical use would be:
\nunmimify message\nedit message\nmimify message\nsend message
The module defines the following user-callable functions and user-settable variables:
\nThis module can also be used from the command line. Usage is as follows:
\nmimify.py -e [-l length] [infile [outfile]]\nmimify.py -d [-b] [infile [outfile]]
\nto encode (mimify) and decode (unmimify) respectively. infile defaults to\nstandard input, outfile defaults to standard output. The same file can be\nspecified for input and output.
If the -l option is given when encoding, any part that contains lines longer than the specified length will be encoded.
\nIf the -b option is given when decoding, any base64 parts will be decoded as\nwell.
\nSee also
\nThis module defines two classes, Mailbox and Message, for\naccessing and manipulating on-disk mailboxes and the messages they contain.\nMailbox offers a dictionary-like mapping from keys to messages.\nMessage extends the email.Message module’s Message\nclass with format-specific state and behavior. Supported mailbox formats are\nMaildir, mbox, MH, Babyl, and MMDF.
\nSee also
\nA mailbox, which may be inspected and modified.
\nThe Mailbox class defines an interface and is not intended to be\ninstantiated. Instead, format-specific subclasses should inherit from\nMailbox and your code should instantiate a particular subclass.
\nThe Mailbox interface is dictionary-like, with small keys\ncorresponding to messages. Keys are issued by the Mailbox instance\nwith which they will be used and are only meaningful to that Mailbox\ninstance. A key continues to identify a message even if the corresponding\nmessage is modified, such as by replacing it with another message.
\nMessages may be added to a Mailbox instance using the set-like\nmethod add() and removed using a del statement or the set-like\nmethods remove() and discard().
\nMailbox interface semantics differ from dictionary semantics in some\nnoteworthy ways. Each time a message is requested, a new representation\n(typically a Message instance) is generated based upon the current\nstate of the mailbox. Similarly, when a message is added to a\nMailbox instance, the provided message representation’s contents are\ncopied. In neither case is a reference to the message representation kept by\nthe Mailbox instance.
\nThe default Mailbox iterator iterates over message representations,\nnot keys as the default dictionary iterator does. Moreover, modification of a\nmailbox during iteration is safe and well-defined. Messages added to the\nmailbox after an iterator is created will not be seen by the\niterator. Messages removed from the mailbox before the iterator yields them\nwill be silently skipped, though using a key from an iterator may result in a\nKeyError exception if the corresponding message is subsequently\nremoved.
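These semantics can be sketched with the mbox subclass described later in this section (the temporary path is arbitrary):

```python
import mailbox
import os
import tempfile

# Create an empty mbox mailbox in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'sample.mbox')
mb = mailbox.mbox(path, create=True)
key = mb.add('Subject: hello\n\nbody\n')   # add() returns the new key

# The default iterator yields message representations, not keys.
subjects = [msg['Subject'] for msg in mb]
print(subjects)          # ['hello']

# Keys are only meaningful to this Mailbox instance.
keys = list(mb.keys())
print(key in keys)       # True
mb.close()
```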
\nWarning
\nBe very cautious when modifying mailboxes that might be simultaneously\nchanged by some other process. The safest mailbox format to use for such\ntasks is Maildir; try to avoid using single-file formats such as mbox for\nconcurrent writing. If you’re modifying a mailbox, you must lock it by\ncalling the lock() and unlock() methods before reading any\nmessages in the file or making any changes by adding or deleting a\nmessage. Failing to lock the mailbox runs the risk of losing messages or\ncorrupting the entire mailbox.
\nMailbox instances have the following methods:
\nAdd message to the mailbox and return the key that has been assigned to\nit.
\nParameter message may be a Message instance, an\nemail.Message.Message instance, a string, or a file-like object\n(which should be open in text mode). If message is an instance of the\nappropriate format-specific Message subclass (e.g., if it’s an\nmboxMessage instance and this is an mbox instance), its\nformat-specific information is used. Otherwise, reasonable defaults for\nformat-specific information are used.
\nDelete the message corresponding to key from the mailbox.
\nIf no such message exists, a KeyError exception is raised if the\nmethod was called as remove() or __delitem__() but no\nexception is raised if the method was called as discard(). The\nbehavior of discard() may be preferred if the underlying mailbox\nformat supports concurrent modification by other processes.
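A minimal sketch of the difference, using an mbox mailbox in a temporary directory:

```python
import mailbox
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'box.mbox')
mb = mailbox.mbox(path, create=True)
key = mb.add('Subject: temp\n\n')

mb.remove(key)     # raises KeyError for an unknown key
mb.discard(key)    # key is already gone, but discard() stays silent
count = len(mb)
print(count)       # 0
mb.close()
```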
\nReplace the message corresponding to key with message. Raise a\nKeyError exception if no message already corresponds to key.
\nAs with add(), parameter message may be a Message\ninstance, an email.Message.Message instance, a string, or a\nfile-like object (which should be open in text mode). If message is an\ninstance of the appropriate format-specific Message subclass\n(e.g., if it’s an mboxMessage instance and this is an\nmbox instance), its format-specific information is\nused. Otherwise, the format-specific information of the message that\ncurrently corresponds to key is left unchanged.
\nReturn an iterator over representations of all messages if called as\nitervalues() or __iter__() or return a list of such\nrepresentations if called as values(). The messages are represented\nas instances of the appropriate format-specific Message subclass\nunless a custom message factory was specified when the Mailbox\ninstance was initialized.
\nNote
\nThe behavior of __iter__() is unlike that of dictionaries, which\niterate over keys.
\nReturn a file-like representation of the message corresponding to key,\nor raise a KeyError exception if no such message exists. The\nfile-like object behaves as if open in binary mode. This file should be\nclosed once it is no longer needed.
\nNote
\nUnlike other representations of messages, file-like representations are\nnot necessarily independent of the Mailbox instance that\ncreated them or of the underlying mailbox. More specific documentation\nis provided by each subclass.
\nParameter arg should be a key-to-message mapping or an iterable of\n(key, message) pairs. Updates the mailbox so that, for each given\nkey and message, the message corresponding to key is set to\nmessage as if by using __setitem__(). As with __setitem__(),\neach key must already correspond to a message in the mailbox or else a\nKeyError exception will be raised, so in general it is incorrect\nfor arg to be a Mailbox instance.
\nNote
\nUnlike with dictionaries, keyword arguments are not supported.
\nA subclass of Mailbox for mailboxes in Maildir format. Parameter\nfactory is a callable object that accepts a file-like message representation\n(which behaves as if opened in binary mode) and returns a custom representation.\nIf factory is None, MaildirMessage is used as the default message\nrepresentation. If create is True, the mailbox is created if it does not\nexist.
\nIt is for historical reasons that factory defaults to rfc822.Message\nand that dirname is named as such rather than path. For a Maildir\ninstance that behaves like instances of other Mailbox subclasses, set\nfactory to None.
\nMaildir is a directory-based mailbox format invented for the qmail mail\ntransfer agent and now widely supported by other programs. Messages in a\nMaildir mailbox are stored in separate files within a common directory\nstructure. This design allows Maildir mailboxes to be accessed and modified\nby multiple unrelated programs without data corruption, so file locking is\nunnecessary.
\nMaildir mailboxes contain three subdirectories, namely: tmp,\nnew, and cur. Messages are created momentarily in the\ntmp subdirectory and then moved to the new subdirectory to\nfinalize delivery. A mail user agent may subsequently move the message to the\ncur subdirectory and store information about the state of the message\nin a special “info” section appended to its file name.
\nFolders of the style introduced by the Courier mail transfer agent are also\nsupported. Any subdirectory of the main mailbox is considered a folder if\n'.' is the first character in its name. Folder names are represented by\nMaildir without the leading '.'. Each folder is itself a Maildir\nmailbox but should not contain other folders. Instead, a logical nesting is\nindicated using '.' to delimit levels, e.g., “Archived.2005.07”.
\nNote
The Maildir specification requires the use of a colon (':') in certain message file names. However, some operating systems do not permit this character in file names. If you wish to use a Maildir-like format on such an operating system, you should specify another character to use instead. The exclamation point ('!') is a popular choice. For example:
\nimport mailbox\nmailbox.Maildir.colon = '!'\n
The colon attribute may also be set on a per-instance basis.
\nMaildir instances have all of the methods of Mailbox in\naddition to the following:
\nSome Mailbox methods implemented by Maildir deserve special\nremarks:
\nWarning
\nThese methods generate unique file names based upon the current process\nID. When using multiple threads, undetected name clashes may occur and\ncause corruption of the mailbox unless threads are coordinated to avoid\nusing these methods to manipulate the same mailbox simultaneously.
\nSee also
\nA subclass of Mailbox for mailboxes in mbox format. Parameter factory\nis a callable object that accepts a file-like message representation (which\nbehaves as if opened in binary mode) and returns a custom representation. If\nfactory is None, mboxMessage is used as the default message\nrepresentation. If create is True, the mailbox is created if it does not\nexist.
\nThe mbox format is the classic format for storing mail on Unix systems. All\nmessages in an mbox mailbox are stored in a single file with the beginning of\neach message indicated by a line whose first five characters are “From “.
\nSeveral variations of the mbox format exist to address perceived shortcomings in\nthe original. In the interest of compatibility, mbox implements the\noriginal format, which is sometimes referred to as mboxo. This means that\nthe Content-Length header, if present, is ignored and that any\noccurrences of “From ” at the beginning of a line in a message body are\ntransformed to “>From ” when storing the message, although occurrences of “>From\n” are not transformed to “From ” when reading the message.
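The quoting transformation can be observed directly; a sketch assuming a writable temporary directory:

```python
import mailbox
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'quoting.mbox')
mb = mailbox.mbox(path, create=True)
# The body deliberately contains a line beginning with "From ".
key = mb.add('Subject: demo\n\nFrom here, this body line began with "From ".\n')
mb.flush()

raw = mb.get_bytes(key)
# On storage, the body line is mangled to ">From here, ..." so that it
# cannot be mistaken for the start of the next message.
print(b'>From here' in raw)
mb.close()
```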
\nSome Mailbox methods implemented by mbox deserve special\nremarks:
\n\n\n\n\nSee also
\nA subclass of Mailbox for mailboxes in MH format. Parameter factory\nis a callable object that accepts a file-like message representation (which\nbehaves as if opened in binary mode) and returns a custom representation. If\nfactory is None, MHMessage is used as the default message\nrepresentation. If create is True, the mailbox is created if it does not\nexist.
\nMH is a directory-based mailbox format invented for the MH Message Handling\nSystem, a mail user agent. Each message in an MH mailbox resides in its own\nfile. An MH mailbox may contain other MH mailboxes (called folders) in\naddition to messages. Folders may be nested indefinitely. MH mailboxes also\nsupport sequences, which are named lists used to logically group\nmessages without moving them to sub-folders. Sequences are defined in a file\ncalled .mh_sequences in each folder.
\nThe MH class manipulates MH mailboxes, but it does not attempt to\nemulate all of mh‘s behaviors. In particular, it does not modify\nand is not affected by the context or .mh_profile files that\nare used by mh to store its state and configuration.
\nMH instances have all of the methods of Mailbox in addition\nto the following:
\nRename messages in the mailbox as necessary to eliminate gaps in\nnumbering. Entries in the sequences list are updated correspondingly.
\nNote
\nAlready-issued keys are invalidated by this operation and should not be\nsubsequently used.
\nSome Mailbox methods implemented by MH deserve special\nremarks:
\nSee also
\nA subclass of Mailbox for mailboxes in Babyl format. Parameter\nfactory is a callable object that accepts a file-like message representation\n(which behaves as if opened in binary mode) and returns a custom representation.\nIf factory is None, BabylMessage is used as the default message\nrepresentation. If create is True, the mailbox is created if it does not\nexist.
\nBabyl is a single-file mailbox format used by the Rmail mail user agent\nincluded with Emacs. The beginning of a message is indicated by a line\ncontaining the two characters Control-Underscore ('\\037') and Control-L\n('\\014'). The end of a message is indicated by the start of the next\nmessage or, in the case of the last message, a line containing a\nControl-Underscore ('\\037') character.
\nMessages in a Babyl mailbox have two sets of headers, original headers and\nso-called visible headers. Visible headers are typically a subset of the\noriginal headers that have been reformatted or abridged to be more\nattractive. Each message in a Babyl mailbox also has an accompanying list of\nlabels, or short strings that record extra information about the\nmessage, and a list of all user-defined labels found in the mailbox is kept\nin the Babyl options section.
\nBabyl instances have all of the methods of Mailbox in\naddition to the following:
\nReturn a list of the names of all user-defined labels used in the mailbox.
\nNote
\nThe actual messages are inspected to determine which labels exist in\nthe mailbox rather than consulting the list of labels in the Babyl\noptions section, but the Babyl section is updated whenever the mailbox\nis modified.
\nSome Mailbox methods implemented by Babyl deserve special\nremarks:
\nSee also
\nA subclass of Mailbox for mailboxes in MMDF format. Parameter factory\nis a callable object that accepts a file-like message representation (which\nbehaves as if opened in binary mode) and returns a custom representation. If\nfactory is None, MMDFMessage is used as the default message\nrepresentation. If create is True, the mailbox is created if it does not\nexist.
\nMMDF is a single-file mailbox format invented for the Multichannel Memorandum\nDistribution Facility, a mail transfer agent. Each message is in the same\nform as an mbox message but is bracketed before and after by lines containing\nfour Control-A ('\\001') characters. As with the mbox format, the\nbeginning of each message is indicated by a line whose first five characters\nare “From “, but additional occurrences of “From ” are not transformed to\n“>From ” when storing messages because the extra message separator lines\nprevent mistaking such occurrences for the starts of subsequent messages.
\nSome Mailbox methods implemented by MMDF deserve special\nremarks:
\n\n\n\n\nSee also
\nA subclass of the email.Message module’s Message. Subclasses of\nmailbox.Message add mailbox-format-specific state and behavior.
\nIf message is omitted, the new instance is created in a default, empty state.\nIf message is an email.Message.Message instance, its contents are\ncopied; furthermore, any format-specific information is converted insofar as\npossible if message is a Message instance. If message is a string\nor a file, it should contain an RFC 2822-compliant message, which is read\nand parsed.
\nThe format-specific state and behaviors offered by subclasses vary, but in\ngeneral it is only the properties that are not specific to a particular\nmailbox that are supported (although presumably the properties are specific\nto a particular mailbox format). For example, file offsets for single-file\nmailbox formats and file names for directory-based mailbox formats are not\nretained, because they are only applicable to the original mailbox. But state\nsuch as whether a message has been read by the user or marked as important is\nretained, because it applies to the message itself.
\nThere is no requirement that Message instances be used to represent\nmessages retrieved using Mailbox instances. In some situations, the\ntime and memory required to generate Message representations might\nnot be acceptable. For such situations, Mailbox instances also\noffer string and file-like representations, and a custom message factory may\nbe specified when a Mailbox instance is initialized.
\nA message with Maildir-specific behaviors. Parameter message has the same\nmeaning as with the Message constructor.
\nTypically, a mail user agent application moves all of the messages in the\nnew subdirectory to the cur subdirectory after the first time\nthe user opens and closes the mailbox, recording that the messages are old\nwhether or not they’ve actually been read. Each message in cur has an\n“info” section added to its file name to store information about its state.\n(Some mail readers may also add an “info” section to messages in\nnew.) The “info” section may take one of two forms: it may contain\n“2,” followed by a list of standardized flags (e.g., “2,FR”) or it may\ncontain “1,” followed by so-called experimental information. Standard flags\nfor Maildir messages are as follows:
| Flag | Meaning | Explanation |
|---|---|---|
| D | Draft | Under composition |
| F | Flagged | Marked as important |
| P | Passed | Forwarded, resent, or bounced |
| R | Replied | Replied to |
| S | Seen | Read |
| T | Trashed | Marked for subsequent deletion |
MaildirMessage instances offer the following methods:
\nReturn either “new” (if the message should be stored in the new\nsubdirectory) or “cur” (if the message should be stored in the cur\nsubdirectory).
\nNote
A message is typically moved from new to cur after its mailbox has been accessed, whether or not the message has been read. A message msg has been read if "S" in msg.get_flags() is True.
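A short illustration of get_subdir() and the flag methods on a freshly constructed message:

```python
import mailbox

msg = mailbox.MaildirMessage()
print(msg.get_subdir())   # 'new' -- a freshly created message is undelivered
print(msg.get_flags())    # ''   -- no flags yet

msg.set_subdir('cur')
msg.set_flags('FS')       # Flagged and Seen
print('S' in msg.get_flags())  # True: the message now counts as read
```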
\nWhen a MaildirMessage instance is created based upon an\nmboxMessage or MMDFMessage instance, the Status\nand X-Status headers are omitted and the following conversions\ntake place:
| Resulting state | mboxMessage or MMDFMessage state |
|---|---|
| “cur” subdirectory | O flag |
| F flag | F flag |
| R flag | A flag |
| S flag | R flag |
| T flag | D flag |
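The first of these conversions can be sketched as follows: an mboxMessage carrying the R (Read) and O (Old) flags becomes a Maildir message stored in “cur” with the S (Seen) flag.

```python
import mailbox

src = mailbox.mboxMessage()
src.set_flags('RO')                # Read and Old, in mbox terms

dst = mailbox.MaildirMessage(src)  # conversion applies the mapping above
print(dst.get_subdir())            # 'cur'
print('S' in dst.get_flags())      # True
```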
When a MaildirMessage instance is created based upon an\nMHMessage instance, the following conversions take place:
| Resulting state | MHMessage state |
|---|---|
| “cur” subdirectory | “unseen” sequence |
| “cur” subdirectory and S flag | no “unseen” sequence |
| F flag | “flagged” sequence |
| R flag | “replied” sequence |
When a MaildirMessage instance is created based upon a\nBabylMessage instance, the following conversions take place:
| Resulting state | BabylMessage state |
|---|---|
| “cur” subdirectory | “unseen” label |
| “cur” subdirectory and S flag | no “unseen” label |
| P flag | “forwarded” or “resent” label |
| R flag | “answered” label |
| T flag | “deleted” label |
A message with mbox-specific behaviors. Parameter message has the same meaning\nas with the Message constructor.
\nMessages in an mbox mailbox are stored together in a single file. The\nsender’s envelope address and the time of delivery are typically stored in a\nline beginning with “From ” that is used to indicate the start of a message,\nthough there is considerable variation in the exact format of this data among\nmbox implementations. Flags that indicate the state of the message, such as\nwhether it has been read or marked as important, are typically stored in\nStatus and X-Status headers.
\nConventional flags for mbox messages are as follows:
| Flag | Meaning | Explanation |
|---|---|---|
| R | Read | Read |
| O | Old | Previously detected by MUA |
| D | Deleted | Marked for subsequent deletion |
| F | Flagged | Marked as important |
| A | Answered | Replied to |
The “R” and “O” flags are stored in the Status header, and the\n“D”, “F”, and “A” flags are stored in the X-Status header. The\nflags and headers typically appear in the order mentioned.
\nmboxMessage instances offer the following methods:
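A sketch of how the flag methods maintain the Status and X-Status headers:

```python
import mailbox

msg = mailbox.mboxMessage()
msg.add_flag('A')    # Answered -> stored in X-Status
msg.add_flag('RO')   # Read and Old -> stored in Status

print(msg.get_flags())   # 'ROA' -- conventional R, O, D, F, A order
print(msg['Status'])     # 'RO'
print(msg['X-Status'])   # 'A'
```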
\nWhen an mboxMessage instance is created based upon a\nMaildirMessage instance, a “From ” line is generated based upon the\nMaildirMessage instance’s delivery date, and the following conversions\ntake place:
| Resulting state | MaildirMessage state |
|---|---|
| R flag | S flag |
| O flag | “cur” subdirectory |
| D flag | T flag |
| F flag | F flag |
| A flag | R flag |
When an mboxMessage instance is created based upon an\nMHMessage instance, the following conversions take place:
| Resulting state | MHMessage state |
|---|---|
| R flag and O flag | no “unseen” sequence |
| O flag | “unseen” sequence |
| F flag | “flagged” sequence |
| A flag | “replied” sequence |
When an mboxMessage instance is created based upon a\nBabylMessage instance, the following conversions take place:
| Resulting state | BabylMessage state |
|---|---|
| R flag and O flag | no “unseen” label |
| O flag | “unseen” label |
| D flag | “deleted” label |
| A flag | “answered” label |
When an mboxMessage instance is created based upon an MMDFMessage instance, the “From ” line is copied and all flags directly correspond:
| Resulting state | MMDFMessage state |
|---|---|
| R flag | R flag |
| O flag | O flag |
| D flag | D flag |
| F flag | F flag |
| A flag | A flag |
A message with MH-specific behaviors. Parameter message has the same meaning\nas with the Message constructor.
\nMH messages do not support marks or flags in the traditional sense, but they\ndo support sequences, which are logical groupings of arbitrary messages. Some\nmail reading programs (although not the standard mh and\nnmh) use sequences in much the same way flags are used with other\nformats, as follows:
| Sequence | Explanation |
|---|---|
| unseen | Not read, but previously detected by MUA |
| replied | Replied to |
| flagged | Marked as important |
MHMessage instances offer the following methods:
\nWhen an MHMessage instance is created based upon a\nMaildirMessage instance, the following conversions take place:
| Resulting state | MaildirMessage state |
|---|---|
| “unseen” sequence | no S flag |
| “replied” sequence | R flag |
| “flagged” sequence | F flag |
When an MHMessage instance is created based upon an\nmboxMessage or MMDFMessage instance, the Status\nand X-Status headers are omitted and the following conversions\ntake place:
| Resulting state | mboxMessage or MMDFMessage state |
|---|---|
| “unseen” sequence | no R flag |
| “replied” sequence | A flag |
| “flagged” sequence | F flag |
When an MHMessage instance is created based upon a\nBabylMessage instance, the following conversions take place:
| Resulting state | BabylMessage state |
| --- | --- |
| “unseen” sequence | “unseen” label |
| “replied” sequence | “answered” label |
A message with Babyl-specific behaviors. Parameter message has the same\nmeaning as with the Message constructor.
\nCertain message labels, called attributes, are defined by convention\nto have special meanings. The attributes are as follows:
| Label | Explanation |
| --- | --- |
| unseen | Not read, but previously detected by MUA |
| deleted | Marked for subsequent deletion |
| filed | Copied to another file or mailbox |
| answered | Replied to |
| forwarded | Forwarded |
| edited | Modified by the user |
| resent | Resent |
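Labels can be inspected and modified through BabylMessage’s get_labels, add_label, and remove_label methods; a minimal sketch:

```python
import mailbox

msg = mailbox.BabylMessage()
msg.add_label('unseen')
msg.add_label('answered')
assert msg.get_labels() == ['unseen', 'answered']

msg.remove_label('unseen')
assert msg.get_labels() == ['answered']
```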
By default, Rmail displays only visible headers. The BabylMessage\nclass, though, uses the original headers because they are more\ncomplete. Visible headers may be accessed explicitly if desired.
\nBabylMessage instances offer the following methods:
\nWhen a BabylMessage instance is created based upon a\nMaildirMessage instance, the following conversions take place:
| Resulting state | MaildirMessage state |
| --- | --- |
| “unseen” label | no S flag |
| “deleted” label | T flag |
| “answered” label | R flag |
| “forwarded” label | P flag |
When a BabylMessage instance is created based upon an\nmboxMessage or MMDFMessage instance, the Status\nand X-Status headers are omitted and the following conversions\ntake place:
| Resulting state | mboxMessage or MMDFMessage state |
| --- | --- |
| “unseen” label | no R flag |
| “deleted” label | D flag |
| “answered” label | A flag |
When a BabylMessage instance is created based upon an\nMHMessage instance, the following conversions take place:
| Resulting state | MHMessage state |
| --- | --- |
| “unseen” label | “unseen” sequence |
| “answered” label | “replied” sequence |
A message with MMDF-specific behaviors. Parameter message has the same meaning\nas with the Message constructor.
As with messages in an mbox mailbox, MMDF messages are stored with the sender’s address and the delivery date in an initial line beginning with “From ”. Likewise, flags that indicate the state of the message are typically stored in Status and X-Status headers.

Conventional flags for MMDF messages are identical to those of mbox messages and are as follows:
| Flag | Meaning | Explanation |
| --- | --- | --- |
| R | Read | Read |
| O | Old | Previously detected by MUA |
| D | Deleted | Marked for subsequent deletion |
| F | Flagged | Marked as important |
| A | Answered | Replied to |
The “R” and “O” flags are stored in the Status header, and the\n“D”, “F”, and “A” flags are stored in the X-Status header. The\nflags and headers typically appear in the order mentioned.
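The flag/header correspondence can be seen by setting flags and reading the resulting headers; a minimal, version-neutral sketch:

```python
import mailbox

msg = mailbox.MMDFMessage()
msg.add_flag('RA')             # read, and replied to
assert msg['Status'] == 'R'    # "R" and "O" go in the Status header
assert msg['X-Status'] == 'A'  # "D", "F", and "A" go in X-Status
assert set(msg.get_flags()) == set('RA')
```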
\nMMDFMessage instances offer the following methods, which are\nidentical to those offered by mboxMessage:
\nWhen an MMDFMessage instance is created based upon a\nMaildirMessage instance, a “From ” line is generated based upon the\nMaildirMessage instance’s delivery date, and the following conversions\ntake place:
| Resulting state | MaildirMessage state |
| --- | --- |
| R flag | S flag |
| O flag | “cur” subdirectory |
| D flag | T flag |
| F flag | F flag |
| A flag | R flag |
When an MMDFMessage instance is created based upon an\nMHMessage instance, the following conversions take place:
| Resulting state | MHMessage state |
| --- | --- |
| R flag and O flag | no “unseen” sequence |
| O flag | “unseen” sequence |
| F flag | “flagged” sequence |
| A flag | “replied” sequence |
When an MMDFMessage instance is created based upon a\nBabylMessage instance, the following conversions take place:
| Resulting state | BabylMessage state |
| --- | --- |
| R flag and O flag | no “unseen” label |
| O flag | “unseen” label |
| D flag | “deleted” label |
| A flag | “answered” label |
When an MMDFMessage instance is created based upon an\nmboxMessage instance, the “From ” line is copied and all flags directly\ncorrespond:
| Resulting state | mboxMessage state |
| --- | --- |
| R flag | R flag |
| O flag | O flag |
| D flag | D flag |
| F flag | F flag |
| A flag | A flag |
The following exception classes are defined in the mailbox module:
\n\nDeprecated since version 2.6.
Older versions of the mailbox module do not support modification of mailboxes, such as adding or removing messages, and do not provide classes to represent format-specific message properties. For backward compatibility, the older mailbox classes are still available, but the newer classes should be used in preference to them. The old classes will be removed in Python 3.0.
\nOlder mailbox objects support only iteration and provide a single public method:
\nMost of the older mailbox classes have names that differ from the current\nmailbox class names, except for Maildir. For this reason, the new\nMaildir class defines a next() method and its constructor differs\nslightly from those of the other new mailbox classes.
\nThe older mailbox classes whose names are not the same as their newer\ncounterparts are as follows:
\nAccess to a classic Unix-style mailbox, where all messages are contained in a\nsingle file and separated by From (a.k.a. From_) lines. The file object\nfp points to the mailbox file. The optional factory parameter is a callable\nthat should create new message objects. factory is called with one argument,\nfp by the next() method of the mailbox object. The default is the\nrfc822.Message class (see the rfc822 module – and the note\nbelow).
\nNote
\nFor reasons of this module’s internal implementation, you will probably want to\nopen the fp object in binary mode. This is especially important on Windows.
For maximum portability, messages in a Unix-style mailbox are separated by any line that begins exactly with the string 'From ' (note the trailing space) if preceded by exactly two newlines. Because of the wide range of variations seen in practice, nothing else on the From_ line should be considered. However, the current implementation doesn’t check for the leading two newlines. This is fine for most applications.
The UnixMailbox class implements a more strict version of From_ line checking, using a regular expression that usually correctly matches From_ delimiters. It considers delimiter lines to be of the form From name time. For maximum portability, use the PortableUnixMailbox class instead. This class is identical to UnixMailbox except that individual messages are separated by only From lines.
\nIf you wish to use the older mailbox classes with the email module rather\nthan the deprecated rfc822 module, you can do so as follows:
\nimport email\nimport email.Errors\nimport mailbox\n\ndef msgfactory(fp):\n try:\n return email.message_from_file(fp)\n except email.Errors.MessageParseError:\n # Don't return None since that will\n # stop the mailbox iterator\n return ''\n\nmbox = mailbox.UnixMailbox(fp, msgfactory)\n
Alternatively, if you know your mailbox contains only well-formed MIME messages,\nyou can simplify this to:
\nimport email\nimport mailbox\n\nmbox = mailbox.UnixMailbox(fp, email.message_from_file)\n
A simple example of printing the subjects of all messages in a mailbox that seem\ninteresting:
\nimport mailbox\nfor message in mailbox.mbox('~/mbox'):\n subject = message['subject'] # Could possibly be None.\n if subject and 'python' in subject.lower():\n print subject\n
To copy all mail from a Babyl mailbox to an MH mailbox, converting all of the\nformat-specific information that can be converted:
\nimport mailbox\ndestination = mailbox.MH('~/Mail')\ndestination.lock()\nfor message in mailbox.Babyl('~/RMAIL'):\n destination.add(mailbox.MHMessage(message))\ndestination.flush()\ndestination.unlock()\n
This example sorts mail from several mailing lists into different mailboxes,\nbeing careful to avoid mail corruption due to concurrent modification by other\nprograms, mail loss due to interruption of the program, or premature termination\ndue to malformed messages in the mailbox:
import mailbox
import email.Errors

list_names = ('python-list', 'python-dev', 'python-bugs')

boxes = dict((name, mailbox.mbox('~/email/%s' % name)) for name in list_names)
inbox = mailbox.Maildir('~/Maildir', factory=None)

for key in inbox.iterkeys():
    try:
        message = inbox[key]
    except email.Errors.MessageParseError:
        continue                # The message is malformed. Just leave it.

    for name in list_names:
        list_id = message['list-id']
        if list_id and name in list_id:
            # Get mailbox to use
            box = boxes[name]

            # Write copy to disk before removing original.
            # If there's a crash, you might duplicate a message, but
            # that's better than losing a message completely.
            box.lock()
            box.add(message)
            box.flush()
            box.unlock()

            # Remove original message
            inbox.lock()
            inbox.discard(key)
            inbox.flush()
            inbox.unlock()
            break               # Found destination, so stop looking.

for box in boxes.itervalues():
    box.close()
This module provides data encoding and decoding as specified in RFC 3548.\nThis standard defines the Base16, Base32, and Base64 algorithms for encoding and\ndecoding arbitrary binary strings into text strings that can be safely sent by\nemail, used as parts of URLs, or included as part of an HTTP POST request. The\nencoding algorithm is not the same as the uuencode program.
\nThere are two interfaces provided by this module. The modern interface supports\nencoding and decoding string objects using all three alphabets. The legacy\ninterface provides for encoding and decoding to and from file-like objects as\nwell as strings, but only using the Base64 standard alphabet.
\nThe modern interface, which was introduced in Python 2.4, provides:
Encode a string using Base64.
\ns is the string to encode. Optional altchars must be a string of at least\nlength 2 (additional characters are ignored) which specifies an alternative\nalphabet for the + and / characters. This allows an application to e.g.\ngenerate URL or filesystem safe Base64 strings. The default is None, for\nwhich the standard Base64 alphabet is used.
\nThe encoded string is returned.
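For example, passing altchars of '-_' produces URL- and filesystem-safe output (the same substitution performed by urlsafe_b64encode). A quick illustration using input bytes that encode to '+' and '/' in the standard alphabet:

```python
import base64

# 0xfb 0xff 0xfe encodes to '+//+' with the standard alphabet
assert base64.b64encode(b'\xfb\xff\xfe') == b'+//+'

# With altchars '-_', the '+' and '/' characters are replaced
assert base64.b64encode(b'\xfb\xff\xfe', b'-_') == b'-__-'
```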
\nDecode a Base64 encoded string.
\ns is the string to decode. Optional altchars must be a string of at least\nlength 2 (additional characters are ignored) which specifies the alternative\nalphabet used instead of the + and / characters.
The decoded string is returned. A TypeError is raised if s is incorrectly padded or if there are non-alphabet characters present in the string.
\nDecode a Base32 encoded string.
\ns is the string to decode. Optional casefold is a flag specifying whether a\nlowercase alphabet is acceptable as input. For security purposes, the default\nis False.
\nRFC 3548 allows for optional mapping of the digit 0 (zero) to the letter O\n(oh), and for optional mapping of the digit 1 (one) to either the letter I (eye)\nor letter L (el). The optional argument map01 when not None, specifies\nwhich letter the digit 1 should be mapped to (when map01 is not None, the\ndigit 0 is always mapped to the letter O). For security purposes the default is\nNone, so that 0 and 1 are not allowed in the input.
The decoded string is returned. A TypeError is raised if s is incorrectly padded or if there are non-alphabet characters present in the string.
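A short sketch of the casefold and map01 options:

```python
import base64

encoded = base64.b32encode(b'hi')
assert encoded == b'NBUQ===='

# casefold=True accepts lowercase input
assert base64.b32decode(b'nbuq====', casefold=True) == b'hi'

# map01=b'L' maps digit 0 to letter O and digit 1 to letter L before decoding
assert base64.b32decode(b'NBUQ====', map01=b'L') == b'hi'
```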
\nEncode a string using Base16.
\ns is the string to encode. The encoded string is returned.
\nDecode a Base16 encoded string.
\ns is the string to decode. Optional casefold is a flag specifying whether a\nlowercase alphabet is acceptable as input. For security purposes, the default\nis False.
The decoded string is returned. A TypeError is raised if s is incorrectly padded or if there are non-alphabet characters present in the string.
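Base16 is simply hexadecimal with an uppercase alphabet; for example:

```python
import base64

assert base64.b16encode(b'\xfa\x01') == b'FA01'
assert base64.b16decode(b'FA01') == b'\xfa\x01'

# Lowercase input is rejected unless casefold=True
assert base64.b16decode(b'fa01', casefold=True) == b'\xfa\x01'
```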
\nThe legacy interface:
\nAn example usage of the module:
\n>>> import base64\n>>> encoded = base64.b64encode('data to be encoded')\n>>> encoded\n'ZGF0YSB0byBiZSBlbmNvZGVk'\n>>> data = base64.b64decode(encoded)\n>>> data\n'data to be encoded'\n
See also
\n\nDeprecated since version 2.3: The email package should be used in preference to the rfc822\nmodule. This module is present only to maintain backward compatibility, and\nhas been removed in 3.0.
\nThis module defines a class, Message, which represents an “email\nmessage” as defined by the Internet standard RFC 2822. [1] Such messages\nconsist of a collection of message headers, and a message body. This module\nalso defines a helper class AddressList for parsing RFC 2822\naddresses. Please refer to the RFC for information on the specific syntax of\nRFC 2822 messages.
\nThe mailbox module provides classes to read mailboxes produced by\nvarious end-user mail programs.
\nA Message instance is instantiated with an input object as parameter.\nMessage relies only on the input object having a readline() method; in\nparticular, ordinary file objects qualify. Instantiation reads headers from the\ninput object up to a delimiter line (normally a blank line) and stores them in\nthe instance. The message body, following the headers, is not consumed.
\nThis class can work with any input object that supports a readline()\nmethod. If the input object has seek and tell capability, the\nrewindbody() method will work; also, illegal lines will be pushed back\nonto the input stream. If the input object lacks seek but has an unread()\nmethod that can push back a line of input, Message will use that to\npush back illegal lines. Thus this class can be used to parse messages coming\nfrom a buffered stream.
\nThe optional seekable argument is provided as a workaround for certain stdio\nlibraries in which tell() discards buffered data before discovering that\nthe lseek() system call doesn’t work. For maximum portability, you\nshould set the seekable argument to zero to prevent that initial tell()\nwhen passing in an unseekable object such as a file object created from a socket\nobject.
\nInput lines as read from the file may either be terminated by CR-LF or by a\nsingle linefeed; a terminating CR-LF is replaced by a single linefeed before the\nline is stored.
\nAll header matching is done independent of upper or lower case; e.g.\nm['From'], m['from'] and m['FROM'] all yield the same result.
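The same case-insensitive mapping behavior survives in the modern email package, which can be used to illustrate it (rfc822 itself is removed in Python 3):

```python
from email.message import Message

m = Message()
m['From'] = 'jack@cwi.nl'
assert m['from'] == 'jack@cwi.nl'
assert m['FROM'] == 'jack@cwi.nl'
```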
\nSee also
\nA Message instance has the following methods:
\nReturn a pair (full name, email address) parsed from the string returned by\ngetheader(name). If no header matching name exists, return (None,\nNone); otherwise both the full name and the address are (possibly empty)\nstrings.
Example: If m’s first From header contains the string 'jack@cwi.nl (Jack Jansen)', then m.getaddr('From') will yield the pair ('Jack Jansen', 'jack@cwi.nl'). If the header contained 'Jack Jansen <jack@cwi.nl>' instead, it would yield the exact same result.
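The equivalent parsing in the modern email package is email.utils.parseaddr, which handles both header styles the same way:

```python
from email.utils import parseaddr

assert parseaddr('Jack Jansen <jack@cwi.nl>') == ('Jack Jansen', 'jack@cwi.nl')
assert parseaddr('jack@cwi.nl (Jack Jansen)') == ('Jack Jansen', 'jack@cwi.nl')
```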
\nThis is similar to getaddr(list), but parses a header containing a list of\nemail addresses (e.g. a To header) and returns a list of (full\nname, email address) pairs (even if there was only one address in the header).\nIf there is no header matching name, return an empty list.
\nIf multiple headers exist that match the named header (e.g. if there are several\nCc headers), all are parsed for addresses. Any continuation lines\nthe named headers contain are also parsed.
\nRetrieve a header using getheader() and parse it into a 9-tuple compatible\nwith time.mktime(); note that fields 6, 7, and 8 are not usable. If\nthere is no header matching name, or it is unparsable, return None.
\nDate parsing appears to be a black art, and not all mailers adhere to the\nstandard. While it has been tested and found correct on a large collection of\nemail from many sources, it is still possible that this function may\noccasionally yield an incorrect result.
Message instances also support a limited mapping interface. In particular: m[name] is like m.getheader(name) but raises KeyError if there is no matching header; and len(m), m.get(name[, default]), name in m, m.keys(), m.values(), m.items(), and m.setdefault(name[, default]) act as expected, with the one difference that setdefault() uses an empty string as the default value. Message instances also support the writable mapping interface m[name] = value and del m[name]. Message objects do not support the clear(), copy(), popitem(), or update() methods of the mapping interface. (Support for get() and setdefault() was only added in Python 2.2.)
\nFinally, Message instances have some public instance variables:
\nAn AddressList instance has the following methods:
\nFinally, AddressList instances have one public instance variable:
\nFootnotes
\n[1] | This module originally conformed to RFC 822, hence the name. Since then,\nRFC 2822 has been released as an update to RFC 822. This module should be\nconsidered RFC 2822-conformant, especially in cases where the syntax or\nsemantics have changed since RFC 822. |
\nDeprecated since version 2.5: The email package should be used in preference to the multifile\nmodule. This module is present only to maintain backward compatibility.
\nThe MultiFile object enables you to treat sections of a text file as\nfile-like input objects, with '' being returned by readline() when a\ngiven delimiter pattern is encountered. The defaults of this class are designed\nto make it useful for parsing MIME multipart messages, but by subclassing it and\noverriding methods it can be easily adapted for more general use.
\nCreate a multi-file. You must instantiate this class with an input object\nargument for the MultiFile instance to get lines from, such as a file\nobject returned by open().
\nMultiFile only ever looks at the input object’s readline(),\nseek() and tell() methods, and the latter two are only needed if you\nwant random access to the individual MIME parts. To use MultiFile on a\nnon-seekable stream object, set the optional seekable argument to false; this\nwill prevent using the input object’s seek() and tell() methods.
It will be useful to know that in MultiFile’s view of the world, text is composed of three kinds of lines: data, section-dividers, and end-markers. MultiFile is designed to support parsing of messages that may have multiple nested message parts, each with its own pattern for section-divider and end-marker lines.
\nSee also
\nA MultiFile instance has the following methods:
\nReturn true if str is data and false if it might be a section boundary. As\nwritten, it tests for a prefix other than '--' at start of line (which\nall MIME boundaries have) but it is declared so it can be overridden in derived\nclasses.
Note that this test is intended as a fast guard for the real boundary tests; if it always returns false it will merely slow processing, not cause it to fail.
Push a boundary string. When a decorated version of this boundary is found as an input line, it will be interpreted as a section-divider or end-marker (depending on the decoration, see RFC 2045). All subsequent reads will return the empty string to indicate end-of-file, until a call to pop() removes the boundary or a call to next() reenables it.
\nIt is possible to push more than one boundary. Encountering the\nmost-recently-pushed boundary will return EOF; encountering any other\nboundary will raise an error.
\nFinally, MultiFile instances have two public instance variables:
\nimport mimetools\nimport multifile\nimport StringIO\n\ndef extract_mime_part_matching(stream, mimetype):\n """Return the first element in a multipart MIME message on stream\n matching mimetype."""\n\n msg = mimetools.Message(stream)\n msgtype = msg.gettype()\n params = msg.getplist()\n\n data = StringIO.StringIO()\n if msgtype[:10] == "multipart/":\n\n file = multifile.MultiFile(stream)\n file.push(msg.getparam("boundary"))\n while file.next():\n submsg = mimetools.Message(file)\n try:\n data = StringIO.StringIO()\n mimetools.decode(file, data, submsg.getencoding())\n except ValueError:\n continue\n if submsg.gettype() == mimetype:\n break\n file.pop()\n return data.getvalue()\n
Source code: Lib/uu.py
\nThis module encodes and decodes files in uuencode format, allowing arbitrary\nbinary data to be transferred over ASCII-only connections. Wherever a file\nargument is expected, the methods accept a file-like object. For backwards\ncompatibility, a string containing a pathname is also accepted, and the\ncorresponding file will be opened for reading and writing; the pathname '-'\nis understood to mean the standard input or output. However, this interface is\ndeprecated; it’s better for the caller to open the file itself, and be sure\nthat, when required, the mode is 'rb' or 'wb' on Windows.
\nThis code was contributed by Lance Ellinghouse, and modified by Jack Jansen.
\nThe uu module defines the following functions:
\nThis call decodes uuencoded file in_file placing the result on file\nout_file. If out_file is a pathname, mode is used to set the permission\nbits if the file must be created. Defaults for out_file and mode are taken\nfrom the uuencode header. However, if the file specified in the header already\nexists, a uu.Error is raised.
\ndecode() may print a warning to standard error if the input was produced\nby an incorrect uuencoder and Python could recover from that error. Setting\nquiet to a true value silences this warning.
\nSee also
\nThe binascii module contains a number of methods to convert between\nbinary and various ASCII-encoded binary representations. Normally, you will not\nuse these functions directly but use wrapper modules like uu,\nbase64, or binhex instead. The binascii module contains\nlow-level functions written in C for greater speed that are used by the\nhigher-level modules.
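A few of the low-level conversions, shown as a short version-neutral sketch:

```python
import binascii

# One line of uuencoded data and back
line = binascii.b2a_uu(b'hello')
assert binascii.a2b_uu(line) == b'hello'

# Binary <-> hexadecimal representation
assert binascii.hexlify(b'\xde\xad\xbe\xef') == b'deadbeef'
assert binascii.unhexlify(b'deadbeef') == b'\xde\xad\xbe\xef'
```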
\nThe binascii module defines the following functions:
\nCompute CRC-32, the 32-bit checksum of data, starting with an initial crc. This\nis consistent with the ZIP file checksum. Since the algorithm is designed for\nuse as a checksum algorithm, it is not suitable for use as a general hash\nalgorithm. Use as follows:
import binascii

print binascii.crc32("hello world")
# Or, in two pieces:
crc = binascii.crc32("hello")
crc = binascii.crc32(" world", crc) & 0xffffffff
print 'crc32 = 0x%08x' % crc
Note
\nTo generate the same numeric value across all Python versions and\nplatforms use crc32(data) & 0xffffffff. If you are only using\nthe checksum in packed binary format this is not necessary as the\nreturn value is the correct 32bit binary representation\nregardless of sign.
\n\nChanged in version 2.6: The return value is in the range [-2**31, 2**31-1]\nregardless of platform. In the past the value would be signed on\nsome platforms and unsigned on others. Use & 0xffffffff on the\nvalue if you want it to match 3.0 behavior.
\n\nChanged in version 3.0: The return value is unsigned and in the range [0, 2**32-1]\nregardless of platform.
\nNote
\nThe HTMLParser module has been renamed to html.parser in Python\n3. The 2to3 tool will automatically adapt imports when converting\nyour sources to Python 3.
\n\nNew in version 2.2.
\nSource code: Lib/HTMLParser.py
\nThis module defines a class HTMLParser which serves as the basis for\nparsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.\nUnlike the parser in htmllib, this parser is not based on the SGML parser\nin sgmllib.
\nThe HTMLParser class is instantiated without arguments.
\nAn HTMLParser instance is fed HTML data and calls handler functions when tags\nbegin and end. The HTMLParser class is meant to be overridden by the\nuser to provide a desired behavior.
\nUnlike the parser in htmllib, this parser does not check that end tags\nmatch start tags or call the end-tag handler for elements which are closed\nimplicitly by closing an outer element.
\nAn exception is defined as well:
\nHTMLParser instances have the following methods:
\nThis method is called to handle the start of a tag. It is intended to be\noverridden by a derived class; the base class implementation does nothing.
The tag argument is the name of the tag converted to lower case. The attrs argument is a list of (name, value) pairs containing the attributes found inside the tag’s <> brackets. The name is translated to lower case, quotes in the value have been removed, and character and entity references have been replaced. For instance, for the tag <A HREF="http://www.cwi.nl/">, this method would be called as handle_starttag('a', [('href', 'http://www.cwi.nl/')]).
\n\nChanged in version 2.6: All entity references from htmlentitydefs are now replaced in the attribute\nvalues.
\nMethod called when a processing instruction is encountered. The data\nparameter will contain the entire processing instruction. For example, for the\nprocessing instruction <?proc color='red'>, this method would be called as\nhandle_pi("proc color='red'"). It is intended to be overridden by a derived\nclass; the base class implementation does nothing.
\nNote
\nThe HTMLParser class uses the SGML syntactic rules for processing\ninstructions. An XHTML processing instruction using the trailing '?' will\ncause the '?' to be included in data.
\nAs a basic example, below is a simple HTML parser that uses the\nHTMLParser class to print out start tags, end tags and data\nas they are encountered:
\nfrom HTMLParser import HTMLParser\n\nclass MyHTMLParser(HTMLParser):\n def handle_starttag(self, tag, attrs):\n print "Encountered a start tag:", tag\n def handle_endtag(self, tag):\n print "Encountered an end tag:", tag\n def handle_data(self, data):\n print "Encountered some data:", data\n\n\nparser = MyHTMLParser()\nparser.feed('<html><head><title>Test</title></head>'\n '<body><h1>Parse me!</h1></body></html>')\n
This module encodes and decodes files in binhex4 format, a format allowing\nrepresentation of Macintosh files in ASCII. On the Macintosh, both forks of a\nfile and the finder information are encoded (or decoded), on other platforms\nonly the data fork is handled.
\nNote
\nIn Python 3.x, special Macintosh support has been removed.
\nThe binhex module defines the following functions:
\nThe following exception is also defined:
\nSee also
\nThere is an alternative, more powerful interface to the coder and decoder, see\nthe source for details.
If you encode or decode text files on non-Macintosh platforms they will still use the old Macintosh newline convention (carriage-return as end of line).
\nAs of this writing, hexbin() appears to not work in all cases.
\nSource code: Lib/quopri.py
\nThis module performs quoted-printable transport encoding and decoding, as\ndefined in RFC 1521: “MIME (Multipurpose Internet Mail Extensions) Part One:\nMechanisms for Specifying and Describing the Format of Internet Message Bodies”.\nThe quoted-printable encoding is designed for data where there are relatively\nfew nonprintable characters; the base64 encoding scheme available via the\nbase64 module is more compact if there are many such characters, as when\nsending a graphics file.
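A minimal round trip; quopri.encodestring and decodestring operate on whole strings, and a literal '=' must itself be escaped as =3D:

```python
import quopri

encoded = quopri.encodestring(b'caf\xe9 = coffee')
assert b'=E9' in encoded   # the non-ASCII byte is quoted
assert b'=3D' in encoded   # the literal '=' is escaped
assert quopri.decodestring(encoded) == b'caf\xe9 = coffee'
```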
\n\nDeprecated since version 2.6: The sgmllib module has been removed in Python 3.0.
\nThis module defines a class SGMLParser which serves as the basis for\nparsing text files formatted in SGML (Standard Generalized Mark-up Language).\nIn fact, it does not provide a full SGML parser — it only parses SGML insofar\nas it is used by HTML, and the module only exists as a base for the\nhtmllib module. Another HTML parser which supports XHTML and offers a\nsomewhat different interface is available in the HTMLParser module.
\nThe SGMLParser class is instantiated without arguments. The parser is\nhardcoded to recognize the following constructs:
\nA single exception is defined as well:
\nException raised by the SGMLParser class when it encounters an error\nwhile parsing.
\n\nNew in version 2.1.
\nSGMLParser instances have the following methods:
\nThis method is called to handle start tags for which either a start_tag()\nor do_tag() method has been defined. The tag argument is the name of\nthe tag converted to lower case, and the method argument is the bound method\nwhich should be used to support semantic interpretation of the start tag. The\nattributes argument is a list of (name, value) pairs containing the\nattributes found inside the tag’s <> brackets.
\nThe name has been translated to lower case. Double quotes and backslashes in\nthe value have been interpreted, as well as known character references and\nknown entity references terminated by a semicolon (normally, entity references\ncan be terminated by any non-alphanumerical character, but this would break the\nvery common case of <A HREF="url?spam=1&eggs=2"> when eggs is a valid\nentity name).
\nFor instance, for the tag <A HREF="http://www.cwi.nl/">, this method would\nbe called as unknown_starttag('a', [('href', 'http://www.cwi.nl/')]). The\nbase implementation simply calls method with attributes as the only\nargument.
\n\nNew in version 2.5: Handling of entity and character references within attribute values.
\nThis method is called to process a character reference of the form &#ref;.\nThe base implementation uses convert_charref() to convert the reference to\na string. If that method returns a string, it is passed to handle_data(),\notherwise unknown_charref(ref) is called to handle the error.
\n\nChanged in version 2.5: Use convert_charref() instead of hard-coding the conversion.
\nConvert a character reference to a string, or None. ref is the reference\npassed in as a string. In the base implementation, ref must be a decimal\nnumber in the range 0-255. It converts the code point found using the\nconvert_codepoint() method. If ref is invalid or out of range, this\nmethod returns None. This method is called by the default\nhandle_charref() implementation and by the attribute value parser.
\n\nNew in version 2.5.
\nConvert a codepoint to a str value. Encodings can be handled here if\nappropriate, though the rest of sgmllib is oblivious on this matter.
\n\nNew in version 2.5.
This method is called to process a general entity reference of the form &ref; where ref is a general entity reference. It converts ref by passing it to convert_entityref(). If a translation is returned, it calls the method handle_data() with the translation; otherwise, it calls the method unknown_entityref(ref). The default entitydefs defines translations for &amp;, &apos;, &gt;, &lt;, and &quot;.
\n\nChanged in version 2.5: Use convert_entityref() instead of hard-coding the conversion.
\nConvert a named entity reference to a str value, or None. The\nresulting value will not be parsed. ref will be only the name of the entity.\nThe default implementation looks for ref in the instance (or class) variable\nentitydefs which should be a mapping from entity names to corresponding\ntranslations. If no translation is available for ref, this method returns\nNone. This method is called by the default handle_entityref()\nimplementation and by the attribute value parser.
\n\nNew in version 2.5.
\nApart from overriding or extending the methods listed above, derived classes may\nalso define methods of the following form to define processing of specific tags.\nTag names in the input stream are case independent; the tag occurring in\nmethod names must be in lower case:
\nNote that the parser maintains a stack of open elements for which no end tag has\nbeen found yet. Only tags processed by start_tag() are pushed on this\nstack. Definition of an end_tag() method is optional for these tags. For\ntags processed by do_tag() or by unknown_tag(), no end_tag()\nmethod must be defined; if defined, it will not be used. If both\nstart_tag() and do_tag() methods exist for a tag, the\nstart_tag() method takes precedence.
\n\nDeprecated since version 2.6: The htmllib module has been removed in Python 3.0.
\nThis module defines a class which can serve as a base for parsing text files\nformatted in the HyperText Mark-up Language (HTML). The class is not directly\nconcerned with I/O — it must be provided with input in string form via a\nmethod, and makes calls to methods of a “formatter” object in order to produce\noutput. The HTMLParser class is designed to be used as a base class\nfor other classes in order to add functionality, and allows most of its methods\nto be extended or overridden. In turn, this class is derived from and extends\nthe SGMLParser class defined in module sgmllib. The\nHTMLParser implementation supports the HTML 2.0 language as described\nin RFC 1866. Two implementations of formatter objects are provided in the\nformatter module; refer to the documentation for that module for\ninformation on the formatter interface.
\nThe following is a summary of the interface defined by\nsgmllib.SGMLParser:
\nThe interface to feed data to an instance is through the feed() method,\nwhich takes a string argument. This can be called with as little or as much\ntext at a time as desired; p.feed(a); p.feed(b) has the same effect as\np.feed(a+b). When the data contains complete HTML markup constructs, these\nare processed immediately; incomplete constructs are saved in a buffer. To\nforce processing of all unprocessed data, call the close() method.
\nFor example, to parse the entire contents of a file, use:
\nparser.feed(open('myfile.html').read())\nparser.close()\n
The interface to define semantics for HTML tags is very simple: derive a class\nand define methods called start_tag(), end_tag(), or do_tag().\nThe parser will call these at appropriate moments: start_tag() or\ndo_tag() is called when an opening tag of the form <tag ...> is\nencountered; end_tag() is called when a closing tag of the form <tag>\nis encountered. If an opening tag requires a corresponding closing tag, like\n<H1> ... </H1>, the class should define the start_tag() method; if\na tag requires no closing tag, like <P>, the class should define the\ndo_tag() method.
\nThe module defines a parser class and an exception:
\nException raised by the HTMLParser class when it encounters an error\nwhile parsing.
\n\nNew in version 2.4.
\nSee also
\nIn addition to tag methods, the HTMLParser class provides some\nadditional methods and instance variables for use within tag methods.
\nNote
\nThe htmlentitydefs module has been renamed to html.entities in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\nSource code: Lib/htmlentitydefs.py
\nThis module defines three dictionaries, name2codepoint, codepoint2name,\nand entitydefs. entitydefs is used by the htmllib module to\nprovide the entitydefs attribute of the HTMLParser class. The\ndefinition provided here contains all the entities defined by XHTML 1.0 that\ncan be handled using simple textual substitution in the Latin-1 character set\n(ISO-8859-1).
\nA dictionary that maps HTML entity names to the Unicode codepoints.
\n\nNew in version 2.3.
\nA dictionary that maps Unicode codepoints to HTML entity names.
\n\nNew in version 2.3.
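A quick sketch of the three dictionaries in use; the import falls back to the Python 3 name (html.entities) mentioned in the note above:

```python
try:
    from htmlentitydefs import name2codepoint, codepoint2name, entitydefs  # Python 2
except ImportError:
    from html.entities import name2codepoint, codepoint2name, entitydefs  # Python 3 rename

# Entity name to Unicode codepoint, and the inverse mapping.
assert name2codepoint['amp'] == 0x26
assert codepoint2name[0x26] == 'amp'

# entitydefs maps entity names to their replacement text.
assert 'amp' in entitydefs
```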
\n\nNew in version 2.0.
\nSource code: Lib/xml/dom/pulldom.py
\nxml.dom.pulldom allows building only selected portions of a Document\nObject Model representation of a document from SAX events.
\n\nNew in version 2.0.
The Document Object Model, or “DOM,” is a cross-language API from the World Wide
Web Consortium (W3C) for accessing and modifying XML documents. A DOM
implementation presents an XML document as a tree structure, or allows client
code to build such a structure from scratch. It then gives access to the
structure through a set of objects which provide well-known interfaces.
\nThe DOM is extremely useful for random-access applications. SAX only allows you\na view of one bit of the document at a time. If you are looking at one SAX\nelement, you have no access to another. If you are looking at a text node, you\nhave no access to a containing element. When you write a SAX application, you\nneed to keep track of your program’s position in the document somewhere in your\nown code. SAX does not do it for you. Also, if you need to look ahead in the\nXML document, you are just out of luck.
\nSome applications are simply impossible in an event driven model with no access\nto a tree. Of course you could build some sort of tree yourself in SAX events,\nbut the DOM allows you to avoid writing that code. The DOM is a standard tree\nrepresentation for XML data.
\nThe Document Object Model is being defined by the W3C in stages, or “levels” in\ntheir terminology. The Python mapping of the API is substantially based on the\nDOM Level 2 recommendation.
\nDOM applications typically start by parsing some XML into a DOM. How this is\naccomplished is not covered at all by DOM Level 1, and Level 2 provides only\nlimited improvements: There is a DOMImplementation object class which\nprovides access to Document creation methods, but no way to access an\nXML reader/parser/Document builder in an implementation-independent way. There\nis also no well-defined way to access these methods without an existing\nDocument object. In Python, each DOM implementation will provide a\nfunction getDOMImplementation(). DOM Level 3 adds a Load/Store\nspecification, which defines an interface to the reader, but this is not yet\navailable in the Python standard library.
\nOnce you have a DOM document object, you can access the parts of your XML\ndocument through its properties and methods. These properties are defined in\nthe DOM specification; this portion of the reference manual describes the\ninterpretation of the specification in Python.
\nThe specification provided by the W3C defines the DOM API for Java, ECMAScript,\nand OMG IDL. The Python mapping defined here is based in large part on the IDL\nversion of the specification, but strict compliance is not required (though\nimplementations are free to support the strict mapping from IDL). See section\nConformance for a detailed discussion of mapping requirements.
\nSee also
\nThe xml.dom contains the following functions:
\nReturn a suitable DOM implementation. The name is either well-known, the\nmodule name of a DOM implementation, or None. If it is not None, imports\nthe corresponding module and returns a DOMImplementation object if the\nimport succeeds. If no name is given, and if the environment variable\nPYTHON_DOM is set, this variable is used to find the implementation.
\nIf name is not given, this examines the available implementations to find one\nwith the required feature set. If no implementation can be found, raise an\nImportError. The features list must be a sequence of (feature,\nversion) pairs which are passed to the hasFeature() method on available\nDOMImplementation objects.
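A minimal sketch of both calling conventions, using the standard library's default implementation (minidom, unless PYTHON_DOM points elsewhere):

```python
from xml.dom import getDOMImplementation

# With no arguments, return the default DOM implementation.
impl = getDOMImplementation()
assert impl.hasFeature('core', '1.0')

# The features argument is a sequence of (feature, version) pairs; an
# implementation is returned only if hasFeature() succeeds for each pair.
impl2 = getDOMImplementation(features=[('core', '1.0')])
assert impl2 is not None
```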
\nSome convenience constants are also provided:
\nThe value used to indicate that no namespace is associated with a node in the\nDOM. This is typically found as the namespaceURI of a node, or used as\nthe namespaceURI parameter to a namespaces-specific method.
\n\nNew in version 2.2.
\nThe namespace URI associated with the reserved prefix xml, as defined by\nNamespaces in XML (section 4).
\n\nNew in version 2.2.
\nThe namespace URI for namespace declarations, as defined by Document Object\nModel (DOM) Level 2 Core Specification (section 1.1.8).
\n\nNew in version 2.2.
\nThe URI of the XHTML namespace as defined by XHTML 1.0: The Extensible\nHyperText Markup Language (section 3.1.1).
\n\nNew in version 2.2.
\nIn addition, xml.dom contains a base Node class and the DOM\nexception classes. The Node class provided by this module does not\nimplement any of the methods or attributes defined by the DOM specification;\nconcrete DOM implementations must provide those. The Node class\nprovided as part of this module does provide the constants used for the\nnodeType attribute on concrete Node objects; they are located\nwithin the class rather than at the module level to conform with the DOM\nspecifications.
\nThe definitive documentation for the DOM is the DOM specification from the W3C.
\nNote that DOM attributes may also be manipulated as nodes instead of as simple\nstrings. It is fairly rare that you must do this, however, so this usage is not\nyet documented.
Interface | Section | Purpose
---|---|---
DOMImplementation | DOMImplementation Objects | Interface to the underlying implementation.
Node | Node Objects | Base interface for most objects in a document.
NodeList | NodeList Objects | Interface for a sequence of nodes.
DocumentType | DocumentType Objects | Information about the declarations needed to process a document.
Document | Document Objects | Object which represents an entire document.
Element | Element Objects | Element nodes in the document hierarchy.
Attr | Attr Objects | Attribute value nodes on element nodes.
Comment | Comment Objects | Representation of comments in the source document.
Text | Text and CDATASection Objects | Nodes containing textual content from the document.
ProcessingInstruction | ProcessingInstruction Objects | Processing instruction representation.
An additional section describes the exceptions defined for working with the DOM\nin Python.
\nThe DOMImplementation interface provides a way for applications to\ndetermine the availability of particular features in the DOM they are using.\nDOM Level 2 added the ability to create new Document and\nDocumentType objects using the DOMImplementation as well.
\nAll of the components of an XML document are subclasses of Node.
\nReturns true if other refers to the same node as this node. This is especially\nuseful for DOM implementations which use any sort of proxy architecture (because\nmore than one object can refer to the same node).
\nNote
\nThis is based on a proposed DOM Level 3 API which is still in the “working\ndraft” stage, but this particular interface appears uncontroversial. Changes\nfrom the W3C will not necessarily affect this method in the Python DOM interface\n(though any new W3C API for this would also be supported).
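A small illustration with the minidom implementation: two references obtained by different routes still denote the same underlying node.

```python
from xml.dom.minidom import parseString

dom = parseString('<root><child/></root>')
root = dom.documentElement

# The document element is also the document's first child here
# (there is no DOCTYPE or processing instruction before it).
assert root.isSameNode(dom.childNodes[0])

# A different node compares unequal.
assert not root.isSameNode(root.firstChild)
```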
\nJoin adjacent text nodes so that all stretches of text are stored as single\nText instances. This simplifies processing text from a DOM tree for\nmany applications.
\n\nNew in version 2.1.
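For example, two text nodes appended separately collapse into one after normalize() (sketched here with minidom):

```python
from xml.dom.minidom import getDOMImplementation

doc = getDOMImplementation().createDocument(None, 'doc', None)
top = doc.documentElement
top.appendChild(doc.createTextNode('Hello, '))
top.appendChild(doc.createTextNode('world!'))
assert len(top.childNodes) == 2

doc.normalize()
# Adjacent text nodes are now merged into a single Text instance.
assert len(top.childNodes) == 1
assert top.firstChild.data == 'Hello, world!'
```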
\nA NodeList represents a sequence of nodes. These objects are used in\ntwo ways in the DOM Core recommendation: the Element objects provides\none as its list of child nodes, and the getElementsByTagName() and\ngetElementsByTagNameNS() methods of Node return objects with this\ninterface to represent query results.
\nThe DOM Level 2 recommendation defines one method and one attribute for these\nobjects:
\nIn addition, the Python DOM interface requires that some additional support is\nprovided to allow NodeList objects to be used as Python sequences. All\nNodeList implementations must include support for __len__() and\n__getitem__(); this allows iteration over the NodeList in\nfor statements and proper support for the len() built-in\nfunction.
\nIf a DOM implementation supports modification of the document, the\nNodeList implementation must also support the __setitem__() and\n__delitem__() methods.
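The sequence protocol in practice, using a childNodes list from minidom:

```python
from xml.dom.minidom import parseString

dom = parseString('<r><a/><b/><c/></r>')
children = dom.documentElement.childNodes

# len() and indexing work as for any Python sequence ...
assert len(children) == 3
assert children[0].tagName == 'a'

# ... as does iteration in a for statement.
names = [node.tagName for node in children]
assert names == ['a', 'b', 'c']
```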
\nInformation about the notations and entities declared by a document (including\nthe external subset if the parser uses it and can provide the information) is\navailable from a DocumentType object. The DocumentType for a\ndocument is available from the Document object’s doctype\nattribute; if there is no DOCTYPE declaration for the document, the\ndocument’s doctype attribute will be set to None instead of an\ninstance of this interface.
\nDocumentType is a specialization of Node, and adds the\nfollowing attributes:
\nA Document represents an entire XML document, including its constituent\nelements, attributes, processing instructions, comments etc. Remember that it\ninherits properties from Node.
\nElement is a subclass of Node, so inherits all the attributes\nof that class.
\nAttr inherits from Node, so inherits all its attributes.
\nNamedNodeMap does not inherit from Node.
\nThere are also experimental methods that give this class more mapping behavior.\nYou can use them or you can use the standardized getAttribute*() family\nof methods on the Element objects.
\nComment represents a comment in the XML document. It is a subclass of\nNode, but cannot have child nodes.
\nThe Text interface represents text in the XML document. If the parser\nand DOM implementation support the DOM’s XML extension, portions of the text\nenclosed in CDATA marked sections are stored in CDATASection objects.\nThese two interfaces are identical, but provide different values for the\nnodeType attribute.
\nThese interfaces extend the Node interface. They cannot have child\nnodes.
\nNote
\nThe use of a CDATASection node does not indicate that the node\nrepresents a complete CDATA marked section, only that the content of the node\nwas part of a CDATA section. A single CDATA section may be represented by more\nthan one node in the document tree. There is no way to determine whether two\nadjacent CDATASection nodes represent different CDATA marked sections.
\nRepresents a processing instruction in the XML document; this inherits from the\nNode interface and cannot have child nodes.
\n\nNew in version 2.1.
\nThe DOM Level 2 recommendation defines a single exception, DOMException,\nand a number of constants that allow applications to determine what sort of\nerror occurred. DOMException instances carry a code attribute\nthat provides the appropriate value for the specific exception.
\nThe Python DOM interface provides the constants, but also expands the set of\nexceptions so that a specific exception exists for each of the exception codes\ndefined by the DOM. The implementations must raise the appropriate specific\nexception, each of which carries the appropriate value for the code\nattribute.
\nThe exception codes defined in the DOM recommendation map to the exceptions\ndescribed above according to this table:
\nConstant | \nException | \n
---|---|
DOMSTRING_SIZE_ERR | DomstringSizeErr
HIERARCHY_REQUEST_ERR | HierarchyRequestErr
INDEX_SIZE_ERR | IndexSizeErr
INUSE_ATTRIBUTE_ERR | InuseAttributeErr
INVALID_ACCESS_ERR | InvalidAccessErr
INVALID_CHARACTER_ERR | InvalidCharacterErr
INVALID_MODIFICATION_ERR | InvalidModificationErr
INVALID_STATE_ERR | InvalidStateErr
NAMESPACE_ERR | NamespaceErr
NOT_FOUND_ERR | NotFoundErr
NOT_SUPPORTED_ERR | NotSupportedErr
NO_DATA_ALLOWED_ERR | NoDataAllowedErr
NO_MODIFICATION_ALLOWED_ERR | NoModificationAllowedErr
SYNTAX_ERR | SyntaxErr
WRONG_DOCUMENT_ERR | WrongDocumentErr
This section describes the conformance requirements and relationships between\nthe Python DOM API, the W3C DOM recommendations, and the OMG IDL mapping for\nPython.
\nThe primitive IDL types used in the DOM specification are mapped to Python types\naccording to the following table.
\nIDL Type | \nPython Type | \n
---|---|
boolean | IntegerType (with a value of 0 or 1)
int | IntegerType
long int | IntegerType
unsigned int | IntegerType
Additionally, the DOMString defined in the recommendation is mapped to\na Python string or Unicode string. Applications should be able to handle\nUnicode whenever a string is returned from the DOM.
\nThe IDL null value is mapped to None, which may be accepted or\nprovided by the implementation whenever null is allowed by the API.
\nThe mapping from OMG IDL to Python defines accessor functions for IDL\nattribute declarations in much the way the Java mapping does.\nMapping the IDL declarations
\nreadonly attribute string someValue;\n attribute string anotherValue;
\nyields three accessor functions: a “get” method for someValue\n(_get_someValue()), and “get” and “set” methods for anotherValue\n(_get_anotherValue() and _set_anotherValue()). The mapping, in\nparticular, does not require that the IDL attributes are accessible as normal\nPython attributes: object.someValue is not required to work, and may\nraise an AttributeError.
\nThe Python DOM API, however, does require that normal attribute access work.\nThis means that the typical surrogates generated by Python IDL compilers are not\nlikely to work, and wrapper objects may be needed on the client if the DOM\nobjects are accessed via CORBA. While this does require some additional\nconsideration for CORBA DOM clients, the implementers with experience using DOM\nover CORBA from Python do not consider this a problem. Attributes that are\ndeclared readonly may not restrict write access in all DOM\nimplementations.
\nIn the Python DOM API, accessor functions are not required. If provided, they\nshould take the form defined by the Python IDL mapping, but these methods are\nconsidered unnecessary since the attributes are accessible directly from Python.\n“Set” accessors should never be provided for readonly attributes.
\nThe IDL definitions do not fully embody the requirements of the W3C DOM API,\nsuch as the notion of certain objects, such as the return value of\ngetElementsByTagName(), being “live”. The Python DOM API does not require\nimplementations to enforce such requirements.
\n\nNew in version 2.0.
\nThe xml.sax package provides a number of modules which implement the\nSimple API for XML (SAX) interface for Python. The package itself provides the\nSAX exceptions and the convenience functions which will be most used by users of\nthe SAX API.
\nThe convenience functions are:
\nA typical SAX application uses three kinds of objects: readers, handlers and\ninput sources. “Reader” in this context is another term for parser, i.e. some\npiece of code that reads the bytes or characters from the input source, and\nproduces a sequence of events. The events then get distributed to the handler\nobjects, i.e. the reader invokes a method on the handler. A SAX application\nmust therefore obtain a reader object, create or open the input sources, create\nthe handlers, and connect these objects all together. As the final step of\npreparation, the reader is called to parse the input. During parsing, methods on\nthe handler objects are called based on structural and syntactic events from the\ninput data.
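The steps above can be sketched with the convenience function xml.sax.parseString, which creates the reader and input source internally and wires in the handler before parsing. The ElementCounter class name is illustrative, not part of the API:

```python
import xml.sax

class ElementCounter(xml.sax.ContentHandler):
    """Handler that records each structural event the reader reports."""
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(('start', name))

    def endElement(self, name):
        self.events.append(('end', name))

handler = ElementCounter()
xml.sax.parseString(b'<doc><item/></doc>', handler)
assert handler.events == [('start', 'doc'), ('start', 'item'),
                          ('end', 'item'), ('end', 'doc')]
```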
\nFor these objects, only the interfaces are relevant; they are normally not\ninstantiated by the application itself. Since Python does not have an explicit\nnotion of interface, they are formally introduced as classes, but applications\nmay use implementations which do not inherit from the provided classes. The\nInputSource, Locator, Attributes,\nAttributesNS, and XMLReader interfaces are defined in the\nmodule xml.sax.xmlreader. The handler interfaces are defined in\nxml.sax.handler. For convenience, InputSource (which is often\ninstantiated directly) and the handler classes are also available from\nxml.sax. These interfaces are described below.
\nIn addition to these classes, xml.sax provides the following exception\nclasses.
\nEncapsulate an XML error or warning. This class can contain basic error or\nwarning information from either the XML parser or the application: it can be\nsubclassed to provide additional functionality or to add localization. Note\nthat although the handlers defined in the ErrorHandler interface\nreceive instances of this exception, it is not required to actually raise the\nexception — it is also useful as a container for information.
\nWhen instantiated, msg should be a human-readable description of the error.\nThe optional exception parameter, if given, should be None or an exception\nthat was caught by the parsing code and is being passed along as information.
\nThis is the base class for the other SAX exception classes.
\nSee also
\nThe SAXException exception class supports the following methods:
\n\nNew in version 2.0.
\nSource code: Lib/xml/dom/minidom.py
\nxml.dom.minidom is a light-weight implementation of the Document Object\nModel interface. It is intended to be simpler than the full DOM and also\nsignificantly smaller.
\nDOM applications typically start by parsing some XML into a DOM. With\nxml.dom.minidom, this is done through the parse functions:
\nfrom xml.dom.minidom import parse, parseString\n\ndom1 = parse('c:\\\\temp\\\\mydata.xml') # parse an XML file by name\n\ndatasource = open('c:\\\\temp\\\\mydata.xml')\ndom2 = parse(datasource) # parse an open file\n\ndom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')\n
The parse() function can take either a filename or an open file object.
\nIf you have XML in a string, you can use the parseString() function\ninstead:
\nBoth functions return a Document object representing the content of the\ndocument.
What the parse() and parseString() functions do is connect an XML
parser with a “DOM builder” that can accept parse events from any SAX parser and
convert them into a DOM tree. The names of the functions are perhaps misleading,
but are easy to grasp when learning the interfaces. The parsing of the document
will be completed before these functions return; it’s simply that these
functions do not provide a parser implementation themselves.
\nYou can also create a Document by calling a method on a “DOM\nImplementation” object. You can get this object either by calling the\ngetDOMImplementation() function in the xml.dom package or the\nxml.dom.minidom module. Using the implementation from the\nxml.dom.minidom module will always return a Document instance\nfrom the minidom implementation, while the version from xml.dom may\nprovide an alternate implementation (this is likely if you have the PyXML\npackage installed). Once you have a\nDocument, you can add child nodes to it to populate the DOM:
\nfrom xml.dom.minidom import getDOMImplementation\n\nimpl = getDOMImplementation()\n\nnewdoc = impl.createDocument(None, "some_tag", None)\ntop_element = newdoc.documentElement\ntext = newdoc.createTextNode('Some textual content.')\ntop_element.appendChild(text)\n
Once you have a DOM document object, you can access the parts of your XML\ndocument through its properties and methods. These properties are defined in\nthe DOM specification. The main property of the document object is the\ndocumentElement property. It gives you the main element in the XML\ndocument: the one that holds all others. Here is an example program:
\ndom3 = parseString("<myxml>Some data</myxml>")\nassert dom3.documentElement.tagName == "myxml"\n
When you are finished with a DOM tree, you may optionally call the
unlink() method to encourage early cleanup of the now-unneeded
objects. unlink() is a xml.dom.minidom-specific
extension to the DOM API that renders the node and its descendants
essentially useless. Otherwise, Python’s garbage collector will
eventually take care of the objects in the tree.
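A typical usage pattern, with unlink() called once the tree is no longer needed:

```python
from xml.dom.minidom import parseString

dom = parseString('<doc><p>text</p></doc>')
# ... work with the tree ...
assert dom.documentElement.tagName == 'doc'

# Break the internal parent/child references so the objects can be
# reclaimed promptly rather than waiting for the garbage collector.
dom.unlink()
```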
\nSee also
\nThe definition of the DOM API for Python is given as part of the xml.dom\nmodule documentation. This section lists the differences between the API and\nxml.dom.minidom.
\nWrite XML to the writer object. The writer should have a write() method\nwhich matches that of the file object interface. The indent parameter is the\nindentation of the current node. The addindent parameter is the incremental\nindentation to use for subnodes of the current one. The newl parameter\nspecifies the string to use to terminate newlines.
\nFor the Document node, an additional keyword argument encoding can\nbe used to specify the encoding field of the XML header.
\n\nChanged in version 2.1: The optional keyword parameters indent, addindent, and newl were added to\nsupport pretty output.
\n\nChanged in version 2.3: For the Document node, an additional keyword argument\nencoding can be used to specify the encoding field of the XML header.
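For example, any object with a write() method will do as the writer (an io.StringIO buffer here):

```python
import io
from xml.dom.minidom import parseString

dom = parseString('<doc><item/></doc>')
out = io.StringIO()

# indent: prefix for the current node; addindent: extra indentation
# added per nesting level; newl: string terminating each line.
dom.writexml(out, indent='', addindent='  ', newl='\n')
text = out.getvalue()
assert '<item/>' in text
```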
\nReturn the XML that the DOM represents as a string.
With no argument, the XML header does not specify an encoding, and the result is
a Unicode string if the default encoding cannot represent all characters in the
document. Encoding this string in an encoding other than UTF-8 is likely
incorrect, since UTF-8 is the default encoding of XML.
\nWith an explicit encoding [1] argument, the result is a byte string in the\nspecified encoding. It is recommended that this argument is always specified. To\navoid UnicodeError exceptions in case of unrepresentable text data, the\nencoding argument should be specified as “utf-8”.
\n\nChanged in version 2.3: the encoding argument was introduced; see writexml().
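The difference between the two forms in a short sketch (note that with an encoding argument the result is a byte string):

```python
from xml.dom.minidom import parseString

dom = parseString('<doc>data</doc>')

# Without an encoding: the header carries no encoding field and the
# result is a (Unicode) string.
plain = dom.toxml()
assert plain == '<?xml version="1.0" ?><doc>data</doc>'

# With encoding="utf-8": the result is an encoded byte string whose
# header names the encoding.
encoded = dom.toxml(encoding='utf-8')
assert encoded.startswith(b'<?xml version="1.0" encoding="utf-8"?>')
```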
\nReturn a pretty-printed version of the document. indent specifies the\nindentation string and defaults to a tabulator; newl specifies the string\nemitted at the end of each line and defaults to \\n.
\n\nNew in version 2.1.
\n\nChanged in version 2.3: the encoding argument was introduced; see writexml().
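For example, pretty-printing a small document with a two-space indent:

```python
from xml.dom.minidom import parseString

dom = parseString('<doc><item/></doc>')
pretty = dom.toprettyxml(indent='  ', newl='\n')

# Each nesting level is indented by one copy of the indent string.
assert pretty.startswith('<?xml version="1.0" ?>\n')
assert '\n  <item/>\n' in pretty
```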
\nThe following standard DOM methods have special considerations with\nxml.dom.minidom:
\nThis example program is a fairly realistic example of a simple program. In this\nparticular case, we do not take much advantage of the flexibility of the DOM.
import xml.dom.minidom

document = """\
<slideshow>
<title>Demo slideshow</title>
<slide><title>Slide title</title>
<point>This is a demo</point>
<point>Of a program for processing slides</point>
</slide>

<slide><title>Another demo slide</title>
<point>It is important</point>
<point>To have more than</point>
<point>one slide</point>
</slide>
</slideshow>
"""

dom = xml.dom.minidom.parseString(document)

def getText(nodelist):
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)

def handleSlideshow(slideshow):
    print "<html>"
    handleSlideshowTitle(slideshow.getElementsByTagName("title")[0])
    slides = slideshow.getElementsByTagName("slide")
    handleToc(slides)
    handleSlides(slides)
    print "</html>"

def handleSlides(slides):
    for slide in slides:
        handleSlide(slide)

def handleSlide(slide):
    handleSlideTitle(slide.getElementsByTagName("title")[0])
    handlePoints(slide.getElementsByTagName("point"))

def handleSlideshowTitle(title):
    print "<title>%s</title>" % getText(title.childNodes)

def handleSlideTitle(title):
    print "<h2>%s</h2>" % getText(title.childNodes)

def handlePoints(points):
    print "<ul>"
    for point in points:
        handlePoint(point)
    print "</ul>"

def handlePoint(point):
    print "<li>%s</li>" % getText(point.childNodes)

def handleToc(slides):
    for slide in slides:
        title = slide.getElementsByTagName("title")[0]
        print "<p>%s</p>" % getText(title.childNodes)

handleSlideshow(dom)
The xml.dom.minidom module is essentially a DOM 1.0-compatible DOM with\nsome DOM 2 features (primarily namespace features).
Usage of the DOM interface in Python is straightforward. The following mapping
rules apply:
\nThe following interfaces have no implementation in xml.dom.minidom:
\nMost of these reflect information in the XML document that is not of general\nutility to most DOM users.
\nFootnotes
[1] The encoding string included in XML output should conform to the
appropriate standards. For example, “UTF-8” is valid, but “UTF8” is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets .
\nNew in version 2.0.
\nThe xml.parsers.expat module is a Python interface to the Expat\nnon-validating XML parser. The module provides a single extension type,\nxmlparser, that represents the current state of an XML parser. After\nan xmlparser object has been created, various attributes of the object\ncan be set to handler functions. When an XML document is then fed to the\nparser, the handler functions are called for the character data and markup in\nthe XML document.
\nThis module uses the pyexpat module to provide access to the Expat\nparser. Direct use of the pyexpat module is deprecated.
\nThis module provides one exception and one type object:
\nThe xml.parsers.expat module contains two functions:
\nCreates and returns a new xmlparser object. encoding, if specified,\nmust be a string naming the encoding used by the XML data. Expat doesn’t\nsupport as many encodings as Python does, and its repertoire of encodings can’t\nbe extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII. If\nencoding [1] is given it will override the implicit or explicit encoding of the\ndocument.
\nExpat can optionally do XML namespace processing for you, enabled by providing a\nvalue for namespace_separator. The value must be a one-character string; a\nValueError will be raised if the string has an illegal length (None\nis considered the same as omission). When namespace processing is enabled,\nelement type names and attribute names that belong to a namespace will be\nexpanded. The element name passed to the element handlers\nStartElementHandler and EndElementHandler will be the\nconcatenation of the namespace URI, the namespace separator character, and the\nlocal part of the name. If the namespace separator is a zero byte (chr(0))\nthen the namespace URI and the local part will be concatenated without any\nseparator.
\nFor example, if namespace_separator is set to a space character (' ') and\nthe following document is parsed:
\n<?xml version=\"1.0\"?>\n<root xmlns = \"http://default-namespace.org/\"\n xmlns:py = \"http://www.python.org/ns/\">\n <py:elem1 />\n <elem2 xmlns=\"\" />\n</root>
\nStartElementHandler will receive the following strings for each\nelement:
\nhttp://default-namespace.org/ root\nhttp://www.python.org/ns/ elem1\nelem2
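The same example as a runnable sketch, collecting the expanded names the handler receives:

```python
from xml.parsers.expat import ParserCreate

document = b'''<?xml version="1.0"?>
<root xmlns="http://default-namespace.org/"
      xmlns:py="http://www.python.org/ns/">
  <py:elem1 />
  <elem2 xmlns="" />
</root>'''

seen = []
parser = ParserCreate(namespace_separator=' ')
parser.StartElementHandler = lambda name, attrs: seen.append(name)
parser.Parse(document, True)

# Names in a namespace arrive as "<namespace URI><separator><local part>".
assert seen == ['http://default-namespace.org/ root',
                'http://www.python.org/ns/ elem1',
                'elem2']
```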
\nSee also
\nxmlparser objects have the following methods:
\nReturns the input data that generated the current event as a string. The data is\nin the encoding of the entity which contains the text. When called while an\nevent handler is not active, the return value is None.
\n\nNew in version 2.1.
\nCalling this with a true value for flag (the default) will cause Expat to call\nthe ExternalEntityRefHandler with None for all arguments to\nallow an alternate DTD to be loaded. If the document does not contain a\ndocument type declaration, the ExternalEntityRefHandler will still be\ncalled, but the StartDoctypeDeclHandler and\nEndDoctypeDeclHandler will not be called.
\nPassing a false value for flag will cancel a previous call that passed a true\nvalue, but otherwise has no effect.
\nThis method can only be called before the Parse() or ParseFile()\nmethods are called; calling it after either of those have been called causes\nExpatError to be raised with the code attribute set to\nerrors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING.
\n\nNew in version 2.3.
\nxmlparser objects have the following attributes:
\nThe size of the buffer used when buffer_text is true.\nA new buffer size can be set by assigning a new integer value\nto this attribute.\nWhen the size is changed, the buffer will be flushed.
\n\nNew in version 2.3.
\n\nChanged in version 2.6: The buffer size can now be changed.
\nSetting this to true causes the xmlparser object to buffer textual\ncontent returned by Expat to avoid multiple calls to the\nCharacterDataHandler() callback whenever possible. This can improve\nperformance substantially since Expat normally breaks character data into chunks\nat every line ending. This attribute is false by default, and may be changed at\nany time.
\n\nNew in version 2.3.
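The effect of buffer_text in a short sketch: the concatenated character data is identical either way, but buffering coalesces the chunks delivered to the callback:

```python
from xml.parsers.expat import ParserCreate

def parse_chunks(buffer_text):
    """Parse a small document and return the character-data chunks."""
    chunks = []
    parser = ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(b'<doc>line one\nline two</doc>', True)
    return chunks

unbuffered = parse_chunks(False)
buffered = parse_chunks(True)
assert ''.join(unbuffered) == ''.join(buffered) == 'line one\nline two'
assert len(buffered) <= len(unbuffered)
```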
\nIf buffer_text is enabled, the number of bytes stored in the buffer.\nThese bytes represent UTF-8 encoded text. This attribute has no meaningful\ninterpretation when buffer_text is false.
\n\nNew in version 2.3.
\nSetting this attribute to a non-zero integer causes the attributes to be\nreported as a list rather than a dictionary. The attributes are presented in\nthe order found in the document text. For each attribute, two list entries are\npresented: the attribute name and the attribute value. (Older versions of this\nmodule also used this format.) By default, this attribute is false; it may be\nchanged at any time.
\n\nNew in version 2.1.
\nIf this attribute is set to a non-zero integer, the handler functions will be\npassed Unicode strings. If returns_unicode is False, 8-bit\nstrings containing UTF-8 encoded data will be passed to the handlers. This is\nTrue by default when Python is built with Unicode support.
\n\nChanged in version 1.6: Can be changed at any time to affect the result type.
\nIf set to a non-zero integer, the parser will report only those attributes which\nwere specified in the document instance and not those which were derived from\nattribute declarations. Applications which set this need to be especially\ncareful to use what additional information is available from the declarations as\nneeded to comply with the standards for the behavior of XML processors. By\ndefault, this attribute is false; it may be changed at any time.
\n\nNew in version 2.1.
\nThe following attributes contain values relating to the most recent error\nencountered by an xmlparser object, and will only have correct values\nonce a call to Parse() or ParseFile() has raised an\nxml.parsers.expat.ExpatError exception.
\nThe following attributes contain values relating to the current parse location\nin an xmlparser object. During a callback reporting a parse event they\nindicate the location of the first of the sequence of characters that generated\nthe event. When called outside of a callback, the position indicated will be\njust past the last parse event (regardless of whether there was an associated\ncallback).
\n\nNew in version 2.4.
\nHere is the list of handlers that can be set. To set a handler on an\nxmlparser object o, use o.handlername = func. handlername must\nbe taken from the following list, and func must be a callable object accepting\nthe correct number of arguments. The arguments are all strings, unless\notherwise stated.
\nCalled when the XML declaration is parsed. The XML declaration is the\n(optional) declaration of the applicable version of the XML recommendation, the\nencoding of the document text, and an optional “standalone” declaration.\nversion and encoding will be strings of the type dictated by the\nreturns_unicode attribute, and standalone will be 1 if the\ndocument is declared standalone, 0 if it is declared not to be standalone,\nor -1 if the standalone clause was omitted. This is only available with\nExpat version 1.95.0 or newer.
\n\nNew in version 2.1.
\nCalled for all entity declarations. For parameter and internal entities,\nvalue will be a string giving the declared contents of the entity; this will\nbe None for external entities. The notationName parameter will be\nNone for parsed entities, and the name of the notation for unparsed\nentities. is_parameter_entity will be true if the entity is a parameter entity\nor false for general entities (most applications only need to be concerned with\ngeneral entities). This is only available starting with version 1.95.0 of the\nExpat library.
\n\nNew in version 2.1.
\nCalled for references to external entities. base is the current base, as set\nby a previous call to SetBase(). The public and system identifiers,\nsystemId and publicId, are strings if given; if the public identifier is not\ngiven, publicId will be None. The context value is opaque and should\nonly be used as described below.
\nFor external entities to be parsed, this handler must be implemented. It is\nresponsible for creating the sub-parser using\nExternalEntityParserCreate(context), initializing it with the appropriate\ncallbacks, and parsing the entity. This handler should return an integer; if it\nreturns 0, the parser will raise an\nXML_ERROR_EXTERNAL_ENTITY_HANDLING error, otherwise parsing will\ncontinue.
\nIf this handler is not provided, external entities are reported by the\nDefaultHandler callback, if provided.
\nExpatError exceptions have a number of interesting attributes:
\nExpat’s internal error number for the specific error. This will match one of\nthe constants defined in the errors object from this module.
\n\nNew in version 2.1.
\nLine number on which the error was detected. The first line is numbered 1.
\n\nNew in version 2.1.
\nCharacter offset into the line where the error occurred. The first column is\nnumbered 0.
\n\nNew in version 2.1.
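For illustration, parsing a deliberately malformed document (mismatched tags, invented here) shows where these three attributes come from:

```python
import xml.parsers.expat

caught = None
p = xml.parsers.expat.ParserCreate()
try:
    p.Parse("<root><child></root>", True)   # mismatched tag
except xml.parsers.expat.ExpatError as err:
    caught = err

# caught.code matches a constant in xml.parsers.expat.errors;
# caught.lineno is 1-based, caught.offset is 0-based.
print(caught.lineno)
```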
\nThe following program defines three handlers that just print out their\narguments.
import xml.parsers.expat

# 3 handler functions
def start_element(name, attrs):
    print 'Start element:', name, attrs

def end_element(name):
    print 'End element:', name

def char_data(data):
    print 'Character data:', repr(data)

p = xml.parsers.expat.ParserCreate()

p.StartElementHandler = start_element
p.EndElementHandler = end_element
p.CharacterDataHandler = char_data

p.Parse("""<?xml version="1.0"?>
<parent id="top"><child1 name="paul">Text goes here</child1>
<child2 name="fred">More text</child2>
</parent>""", 1)
The output from this program is:
\nStart element: parent {'id': 'top'}\nStart element: child1 {'name': 'paul'}\nCharacter data: 'Text goes here'\nEnd element: child1\nCharacter data: '\\n'\nStart element: child2 {'name': 'fred'}\nCharacter data: 'More text'\nEnd element: child2\nCharacter data: '\\n'\nEnd element: parent
\nContent models are described using nested tuples. Each tuple contains four\nvalues: the type, the quantifier, the name, and a tuple of children. Children\nare simply additional content model descriptions.
\nThe values of the first two fields are constants defined in the model object\nof the xml.parsers.expat module. These constants can be collected in two\ngroups: the model type group and the quantifier group.
\nThe constants in the model type group are:
\nThe constants in the quantifier group are:
\nThe following constants are provided in the errors object of the\nxml.parsers.expat module. These constants are useful in interpreting\nsome of the attributes of the ExpatError exception objects raised when an\nerror has occurred.
\nThe errors object has the following attributes:
\nFootnotes
[1] The encoding string included in XML output should conform to the\nappropriate standards. For example, “UTF-8” is valid, but “UTF8” is\nnot. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl\nand http://www.iana.org/assignments/character-sets .
\nNew in version 2.0.
\nThe module xml.sax.saxutils contains a number of classes and functions\nthat are commonly useful when creating SAX applications, either in direct use,\nor as base classes.
\nEscape '&', '<', and '>' in a string of data.
\nYou can escape other strings of data by passing a dictionary as the optional\nentities parameter. The keys and values must all be strings; each key will be\nreplaced with its corresponding value. The characters '&', '<' and\n'>' are always escaped, even if entities is provided.
\nUnescape '&', '<', and '>' in a string of data.
\nYou can unescape other strings of data by passing a dictionary as the optional\nentities parameter. The keys and values must all be strings; each key will be\nreplaced with its corresponding value. '&', '<', and '>'\nare always unescaped, even if entities is provided.
\n\nNew in version 2.3.
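A brief sketch of both functions together; the extra entities mapping below (quoting double quotes) is just one example of such a dictionary.

```python
from xml.sax.saxutils import escape, unescape

extra = {'"': '&quot;'}               # illustrative entities mapping
quoted = escape('<a & "b"', extra)
print(quoted)                         # &lt;a &amp; &quot;b&quot;

# unescape() takes the inverse mapping.
print(unescape(quoted, {'&quot;': '"'}))
```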
\nSimilar to escape(), but also prepares data to be used as an\nattribute value. The return value is a quoted version of data with any\nadditional required replacements. quoteattr() will select a quote\ncharacter based on the content of data, attempting to avoid encoding any\nquote characters in the string. If both single- and double-quote characters\nare already in data, the double-quote characters will be encoded and data\nwill be wrapped in double-quotes. The resulting string can be used directly\nas an attribute value:
>>> print "<element attr=%s>" % quoteattr("ab ' cd \" ef")
<element attr="ab ' cd &quot; ef">
This function is useful when generating attribute values for HTML or any SGML\nusing the reference concrete syntax.
\n\nNew in version 2.2.
\n\nNew in version 2.0.
\nThe SAX API defines four kinds of handlers: content handlers, DTD handlers,\nerror handlers, and entity resolvers. Applications normally only need to\nimplement those interfaces whose events they are interested in; they can\nimplement the interfaces in a single object or in multiple objects. Handler\nimplementations should inherit from the base classes provided in the module\nxml.sax.handler, so that all methods get default implementations.
\nHandle DTD events.
\nThis interface specifies only those DTD events required for basic parsing\n(unparsed entities and attributes).
\nIn addition to these classes, xml.sax.handler provides symbolic constants\nfor the feature and property names.
\nUsers are expected to subclass ContentHandler to support their\napplication. The following methods are called by the parser on the appropriate\nevents in the input document:
\nCalled by the parser to give the application a locator for locating the origin\nof document events.
\nSAX parsers are strongly encouraged (though not absolutely required) to supply a\nlocator: if it does so, it must supply the locator to the application by\ninvoking this method before invoking any of the other methods in the\nDocumentHandler interface.
\nThe locator allows the application to determine the end position of any\ndocument-related event, even if the parser is not reporting an error. Typically,\nthe application will use this information for reporting its own errors (such as\ncharacter content that does not match an application’s business rules). The\ninformation returned by the locator is probably not sufficient for use with a\nsearch engine.
\nNote that the locator will return correct information only during the invocation\nof the events in this interface. The application should not attempt to use it at\nany other time.
\nReceive notification of the beginning of a document.
\nThe SAX parser will invoke this method only once, before any other methods in\nthis interface or in DTDHandler (except for setDocumentLocator()).
\nReceive notification of the end of a document.
\nThe SAX parser will invoke this method only once, and it will be the last method\ninvoked during the parse. The parser shall not invoke this method until it has\neither abandoned parsing (because of an unrecoverable error) or reached the end\nof input.
\nBegin the scope of a prefix-URI Namespace mapping.
\nThe information from this event is not necessary for normal Namespace\nprocessing: the SAX XML reader will automatically replace prefixes for element\nand attribute names when the feature_namespaces feature is enabled (the\ndefault).
\nThere are cases, however, when applications need to use prefixes in character\ndata or in attribute values, where they cannot safely be expanded automatically;\nthe startPrefixMapping() and endPrefixMapping() events supply the\ninformation to the application to expand prefixes in those contexts itself, if\nnecessary.
\nNote that startPrefixMapping() and endPrefixMapping() events are not\nguaranteed to be properly nested relative to each other: all\nstartPrefixMapping() events will occur before the corresponding\nstartElement() event, and all endPrefixMapping() events will occur\nafter the corresponding endElement() event, but their order is not\nguaranteed.
\nEnd the scope of a prefix-URI mapping.
\nSee startPrefixMapping() for details. This event will always occur after\nthe corresponding endElement() event, but the order of\nendPrefixMapping() events is not otherwise guaranteed.
\nSignals the start of an element in non-namespace mode.
\nThe name parameter contains the raw XML 1.0 name of the element type as a\nstring and the attrs parameter holds an object of the Attributes\ninterface (see The Attributes Interface) containing the attributes of\nthe element. The object passed as attrs may be re-used by the parser; holding\non to a reference to it is not a reliable way to keep a copy of the attributes.\nTo keep a copy of the attributes, use the copy() method of the attrs\nobject.
\nSignals the end of an element in non-namespace mode.
\nThe name parameter contains the name of the element type, just as with the\nstartElement() event.
\nSignals the start of an element in namespace mode.
\nThe name parameter contains the name of the element type as a (uri,\nlocalname) tuple, the qname parameter contains the raw XML 1.0 name used in\nthe source document, and the attrs parameter holds an instance of the\nAttributesNS interface (see The AttributesNS Interface)\ncontaining the attributes of the element. If no namespace is associated with\nthe element, the uri component of name will be None. The object passed\nas attrs may be re-used by the parser; holding on to a reference to it is not\na reliable way to keep a copy of the attributes. To keep a copy of the\nattributes, use the copy() method of the attrs object.
\nParsers may set the qname parameter to None, unless the\nfeature_namespace_prefixes feature is activated.
\nSignals the end of an element in namespace mode.
\nThe name parameter contains the name of the element type, just as with the\nstartElementNS() method, likewise the qname parameter.
\nReceive notification of character data.
\nThe Parser will call this method to report each chunk of character data. SAX\nparsers may return all contiguous character data in a single chunk, or they may\nsplit it into several chunks; however, all of the characters in any single event\nmust come from the same external entity so that the Locator provides useful\ninformation.
\ncontent may be a Unicode string or a byte string; the expat reader module\nalways produces Unicode strings.
\nNote
\nThe earlier SAX 1 interface provided by the Python XML Special Interest Group\nused a more Java-like interface for this method. Since most parsers used from\nPython did not take advantage of the older interface, the simpler signature was\nchosen to replace it. To convert old code to the new interface, use content\ninstead of slicing content with the old offset and length parameters.
\nReceive notification of ignorable whitespace in element content.
\nValidating Parsers must use this method to report each chunk of ignorable\nwhitespace (see the W3C XML 1.0 recommendation, section 2.10): non-validating\nparsers may also use this method if they are capable of parsing and using\ncontent models.
\nSAX parsers may return all contiguous whitespace in a single chunk, or they may\nsplit it into several chunks; however, all of the characters in any single event\nmust come from the same external entity, so that the Locator provides useful\ninformation.
\nReceive notification of a processing instruction.
\nThe Parser will invoke this method once for each processing instruction found:\nnote that processing instructions may occur before or after the main document\nelement.
\nA SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a\ntext declaration (XML 1.0, section 4.3.1) using this method.
\nReceive notification of a skipped entity.
\nThe Parser will invoke this method once for each entity skipped. Non-validating\nprocessors may skip entities if they have not seen the declarations (because,\nfor example, the entity was declared in an external DTD subset). All processors\nmay skip external entities, depending on the values of the\nfeature_external_ges and the feature_external_pes properties.
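The callbacks above can be exercised with a small ContentHandler subclass; the class name and sample document here are invented for the sketch.

```python
import xml.sax

class TagCollector(xml.sax.ContentHandler):
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.tags = []
        self.text = []
    def startElement(self, name, attrs):
        # record the tag name and a snapshot of its attributes
        self.tags.append((name, dict(attrs)))
    def characters(self, content):
        self.text.append(content)

handler = TagCollector()
xml.sax.parseString(b'<root id="1"><child/>hi</root>', handler)

print(handler.tags)
```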
\nDTDHandler instances provide the following methods:
\nObjects with this interface are used to receive error and warning information\nfrom the XMLReader. If you create an object that implements this\ninterface, then register the object with your XMLReader, the parser\nwill call the methods in your object to report all warnings and errors. There\nare three levels of errors available: warnings, (possibly) recoverable errors,\nand unrecoverable errors. All methods take a SAXParseException as the\nonly parameter. Errors and warnings may be converted to an exception by raising\nthe passed-in exception object.
\nSource code: Lib/webbrowser.py
\nThe webbrowser module provides a high-level interface to allow displaying\nWeb-based documents to users. Under most circumstances, simply calling the\nopen() function from this module will do the right thing.
\nUnder Unix, graphical browsers are preferred under X11, but text-mode browsers\nwill be used if graphical browsers are not available or an X11 display isn’t\navailable. If text-mode browsers are used, the calling process will block until\nthe user exits the browser.
\nIf the environment variable BROWSER exists, it is interpreted to\noverride the platform default list of browsers, as an os.pathsep-separated\nlist of browsers to try in order. When the value of a list part contains the\nstring %s, then it is interpreted as a literal browser command line to be\nused with the argument URL substituted for %s; if the part does not contain\n%s, it is simply interpreted as the name of the browser to launch. [1]
\nFor non-Unix platforms, or when a remote browser is available on Unix, the\ncontrolling process will not wait for the user to finish with the browser, but\nallow the remote browser to maintain its own windows on the display. If remote\nbrowsers are not available on Unix, the controlling process will launch a new\nbrowser and wait.
\nThe script webbrowser can be used as a command-line interface for the\nmodule. It accepts a URL as the argument. It accepts the following optional\nparameters: -n opens the URL in a new browser window, if possible;\n-t opens the URL in a new browser page (“tab”). The options are,\nnaturally, mutually exclusive.
\nThe following exception is defined:
\nThe following functions are defined:
\nDisplay url using the default browser. If new is 0, the url is opened\nin the same browser window if possible. If new is 1, a new browser window\nis opened if possible. If new is 2, a new browser page (“tab”) is opened\nif possible. If autoraise is True, the window is raised if possible\n(note that under many window managers this will occur regardless of the\nsetting of this variable).
\nNote that on some platforms, trying to open a filename using this function\nmay work and start the operating system’s associated program. However, this\nis neither supported nor portable.
\n\nChanged in version 2.5: new can now be 2.
\nOpen url in a new page (“tab”) of the default browser, if possible, otherwise\nequivalent to open_new().
\n\nNew in version 2.5.
\nRegister the browser type name. Once a browser type is registered, the\nget() function can return a controller for that browser type. If\ninstance is not provided, or is None, constructor will be called without\nparameters to create an instance when needed. If instance is provided,\nconstructor will never be called, and may be None.
\nThis entry point is only useful if you plan to either set the BROWSER\nvariable or call get() with a nonempty argument matching the name of a\nhandler you declare.
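A minimal sketch of registering and retrieving a controller; 'my-browser' is a hypothetical browser command used only for illustration, so nothing is actually launched.

```python
import webbrowser

# 'my-browser' is a hypothetical browser command name.
webbrowser.register('my-browser', None,
                    webbrowser.GenericBrowser('my-browser'))

# get() now returns the controller instance registered above.
controller = webbrowser.get('my-browser')
print(type(controller).__name__)      # GenericBrowser
```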
\nA number of browser types are predefined. This table gives the type names that\nmay be passed to the get() function and the corresponding instantiations\nfor the controller classes, all defined in this module.
Type Name           Class Name                      Notes
------------------  ------------------------------  -----
'mozilla'           Mozilla('mozilla')
'firefox'           Mozilla('mozilla')
'netscape'          Mozilla('netscape')
'galeon'            Galeon('galeon')
'epiphany'          Galeon('epiphany')
'skipstone'         BackgroundBrowser('skipstone')
'kfmclient'         Konqueror()                     (1)
'konqueror'         Konqueror()                     (1)
'kfm'               Konqueror()                     (1)
'mosaic'            BackgroundBrowser('mosaic')
'opera'             Opera()
'grail'             Grail()
'links'             GenericBrowser('links')
'elinks'            Elinks('elinks')
'lynx'              GenericBrowser('lynx')
'w3m'               GenericBrowser('w3m')
'windows-default'   WindowsDefault                  (2)
'internet-config'   InternetConfig                  (3)
'macosx'            MacOSX('default')               (4)
Notes:
\nHere are some simple examples:
import webbrowser

url = 'http://www.python.org/'

# Open URL in a new tab, if a browser window is already open.
webbrowser.open_new_tab(url + 'doc/')

# Open URL in new window, raising the window if possible.
webbrowser.open_new(url)
Browser controllers provide these methods which parallel three of the\nmodule-level convenience functions:
\nOpen url in a new page (“tab”) of the browser handled by this controller, if\npossible, otherwise equivalent to open_new().
\n\nNew in version 2.5.
\nFootnotes
[1] Executables named here without a full path will be searched in the\ndirectories given in the PATH environment variable.
\nNew in version 2.2.
\nThe cgitb module provides a special exception handler for Python scripts.\n(Its name is a bit misleading. It was originally designed to display extensive\ntraceback information in HTML for CGI scripts. It was later generalized to also\ndisplay this information in plain text.) After this module is activated, if an\nuncaught exception occurs, a detailed, formatted report will be displayed. The\nreport includes a traceback showing excerpts of the source code for each level,\nas well as the values of the arguments and local variables to currently running\nfunctions, to help you debug the problem. Optionally, you can save this\ninformation to a file instead of sending it to the browser.
\nTo enable this feature, simply add this to the top of your CGI script:
\nimport cgitb\ncgitb.enable()\n
The options to the enable() function control whether the report is\ndisplayed in the browser and whether the report is logged to a file for later\nanalysis.
\nThis function causes the cgitb module to take over the interpreter’s\ndefault handling for exceptions by setting the value of sys.excepthook.
\nThe optional argument display defaults to 1 and can be set to 0 to\nsuppress sending the traceback to the browser. If the argument logdir is\npresent, the traceback reports are written to files. The value of logdir\nshould be a directory where these files will be placed. The optional argument\ncontext is the number of lines of context to display around the current line\nof source code in the traceback; this defaults to 5. If the optional\nargument format is "html", the output is formatted as HTML. Any other\nvalue forces plain text output. The default value is "html".
\n\nNew in version 2.0.
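A minimal sketch of enabling the handler with logging; the scratch log directory below is an illustrative choice, not a requirement.

```python
import cgitb
import sys
import tempfile

# Log plain-text reports to a scratch directory (illustrative choice)
# rather than sending HTML to the browser.
logdir = tempfile.mkdtemp()
cgitb.enable(display=0, logdir=logdir, context=3, format="text")

# enable() works by replacing sys.excepthook.
print(sys.excepthook is sys.__excepthook__)   # False
```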
\nSAX parsers implement the XMLReader interface. They are implemented in\na Python module, which must provide a function create_parser(). This\nfunction is invoked by xml.sax.make_parser() with no arguments to create\na new parser object.
\nIn some cases, it is desirable not to parse an input source all at once, but to\nfeed chunks of the document as they become available. Note that the reader will\nnormally not read the entire file, but read it in chunks as well; still,\nparse() won’t return until the entire document is processed. So these\ninterfaces should be used if the blocking behaviour of parse() is not\ndesirable.
\nWhen the parser is instantiated, it is ready to begin accepting data from the\nfeed method immediately. After parsing has been finished with a call to\nclose, the reset method must be called to make the parser ready to accept new\ndata, either from feed or using the parse method.
\nNote that these methods must not be called during parsing, that is, after\nparse has been called and before it returns.
\nBy default, the class also implements the parse method of the XMLReader\ninterface using the feed, close and reset methods of the IncrementalParser\ninterface as a convenience to SAX 2.0 driver writers.
\nEncapsulation of the information needed by the XMLReader to read\nentities.
\nThis class may include information about the public identifier, system\nidentifier, byte stream (possibly with character encoding information) and/or\nthe character stream of an entity.
\nApplications will create objects of this class for use in the\nXMLReader.parse() method and for returning from\nEntityResolver.resolveEntity.
\nAn InputSource belongs to the application, the XMLReader is\nnot allowed to modify InputSource objects passed to it from the\napplication, although it may make copies and modify those.
\nThe XMLReader interface supports the following methods:
\nAllow an application to set the locale for errors and warnings.
\nSAX parsers are not required to provide localization for errors and warnings; if\nthey cannot support the requested locale, however, they must raise a SAX\nexception. Applications may request a locale change in the middle of a parse.
\nInstances of IncrementalParser offer the following additional methods:
\nInstances of Locator provide these methods:
\nSets the character encoding of this InputSource.
\nThe encoding must be a string acceptable for an XML encoding declaration (see\nsection 4.3.3 of the XML recommendation).
\nThe encoding attribute of the InputSource is ignored if the\nInputSource also contains a character stream.
\nSet the byte stream (a Python file-like object which does not perform\nbyte-to-character conversion) for this input source.
\nThe SAX parser will ignore this if there is also a character stream specified,\nbut it will use a byte stream in preference to opening a URI connection itself.
\nIf the application knows the character encoding of the byte stream, it should\nset it with the setEncoding method.
\nGet the byte stream for this input source.
\nThe getEncoding method will return the character encoding for this byte stream,\nor None if unknown.
\nSet the character stream for this input source. (The stream must be a Python 1.6\nUnicode-wrapped file-like object that performs conversion to Unicode strings.)
\nIf there is a character stream specified, the SAX parser will ignore any byte\nstream and will not attempt to open a URI connection to the system identifier.
\nAttributes objects implement a portion of the mapping protocol,\nincluding the methods copy(), get(), has_key(), items(),\nkeys(), and values(). The following methods are also provided:
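Handlers normally receive these objects from the parser; the sketch below builds one by hand via xml.sax.xmlreader.AttributesImpl only to demonstrate the mapping-style access.

```python
from xml.sax.xmlreader import AttributesImpl

# Built by hand for illustration; parsers construct these for you.
attrs = AttributesImpl({'id': 'top', 'class': 'intro'})

print(attrs.getLength())          # 2
print(attrs.get('id'))            # top
print(attrs.get('missing', '-'))  # -
print(sorted(attrs.keys()))       # ['class', 'id']
```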
\nThis interface is a subtype of the Attributes interface (see section\nThe Attributes Interface). All methods supported by that interface are also\navailable on AttributesNS objects.
\nThe following methods are also available:
\n\nNew in version 2.5.
\nSource code: Lib/xml/etree/ElementTree.py
\nThe Element type is a flexible container object, designed to store\nhierarchical data structures in memory. The type can be described as a cross\nbetween a list and a dictionary.
\nEach element has a number of properties associated with it:
\nTo create an element instance, use the Element constructor or the\nSubElement() factory function.
\nThe ElementTree class can be used to wrap an element structure, and\nconvert it from and to XML.
\nA C implementation of this API is available as xml.etree.cElementTree.
\nSee http://effbot.org/zone/element-index.htm for tutorials and links to other\ndocs. Fredrik Lundh’s page is also the location of the development version of\nthe xml.etree.ElementTree.
\n\nChanged in version 2.7: The ElementTree API is updated to 1.3. For more information, see\nIntroducing ElementTree 1.3.
\nWrites an element tree or element structure to sys.stdout. This function\nshould be used for debugging only.
\nThe exact output format is implementation dependent. In this version, it’s\nwritten as an ordinary XML file.
\nelem is an element tree or an individual element.
\nParses an XML document from a sequence of string fragments. sequence is a\nlist or other sequence containing XML data fragments. parser is an\noptional parser instance. If not given, the standard XMLParser\nparser is used. Returns an Element instance.
\n\nNew in version 2.7.
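A one-line sketch with invented fragments; the result is the same as parsing the concatenated string.

```python
from xml.etree.ElementTree import fromstringlist

# Equivalent to parsing '<root><child>text</child></root>'.
root = fromstringlist(['<root>', '<child>text</child>', '</root>'])
print(root[0].text)   # text
```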
\nParses an XML section into an element tree incrementally, and reports what’s\ngoing on to the user. source is a filename or file object containing XML\ndata. events is a list of events to report back. If omitted, only “end”\nevents are reported. parser is an optional parser instance. If not\ngiven, the standard XMLParser parser is used. Returns an\niterator providing (event, elem) pairs.
\nNote
\niterparse() only guarantees that it has seen the “>”\ncharacter of a starting tag when it emits a “start” event, so the\nattributes are defined, but the contents of the text and tail attributes\nare undefined at that point. The same applies to the element children;\nthey may or may not be present.
\nIf you need a fully populated element, look for “end” events instead.
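A small sketch of the event stream, using an in-memory file in place of a real document:

```python
from io import StringIO
from xml.etree.ElementTree import iterparse

# An in-memory file stands in for a real XML file here.
source = StringIO('<root><item>a</item><item>b</item></root>')
events = [(event, elem.tag)
          for event, elem in iterparse(source, events=('start', 'end'))]
print(events)
```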
\nRegisters a namespace prefix. The registry is global, and any existing\nmapping for either the given prefix or the namespace URI will be removed.\nprefix is a namespace prefix. uri is a namespace uri. Tags and\nattributes in this namespace will be serialized with the given prefix, if at\nall possible.
\n\nNew in version 2.7.
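A short sketch of the effect on serialization; the prefix and URI below are invented for the example.

```python
from xml.etree.ElementTree import Element, register_namespace, tostring

# Invented prefix and namespace URI.
register_namespace('ex', 'http://example.org/ns')
elem = Element('{http://example.org/ns}item')

# Serialized with the registered prefix instead of an auto-generated one.
print(tostring(elem))
```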
\nSubelement factory. This function creates an element instance, and appends\nit to an existing element.
\nThe element name, attribute names, and attribute values can be either\nbytestrings or Unicode strings. parent is the parent element. tag is\nthe subelement name. attrib is an optional dictionary, containing element\nattributes. extra contains additional attributes, given as keyword\narguments. Returns an element instance.
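A minimal sketch of building a tree with the factory; the tag and attribute names are invented.

```python
from xml.etree.ElementTree import Element, SubElement, tostring

root = Element('parent', id='top')
child = SubElement(root, 'child', name='paul')   # appended to root
child.text = 'Text goes here'

# Serializes to: <parent id="top"><child name="paul">Text goes here</child></parent>
print(tostring(root))
```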
\nGenerates a string representation of an XML element, including all\nsubelements. element is an Element instance. encoding [1] is\nthe output encoding (default is US-ASCII). method is either "xml",\n"html" or "text" (default is "xml"). Returns a list of encoded\nstrings containing the XML data. It does not guarantee any specific\nsequence, except that "".join(tostringlist(element)) ==\ntostring(element).
\n\nNew in version 2.7.
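The joining guarantee can be checked directly; the sample element is invented.

```python
from xml.etree.ElementTree import XML, tostring, tostringlist

elem = XML('<root><child/></root>')
parts = tostringlist(elem)

# The only guarantee is that the parts join back to tostring()'s output.
print(b''.join(parts) == tostring(elem))   # True
```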
\nElement class. This class defines the Element interface, and provides a\nreference implementation of this interface.
\nThe element name, attribute names, and attribute values can be either\nbytestrings or Unicode strings. tag is the element name. attrib is\nan optional dictionary, containing element attributes. extra contains\nadditional attributes, given as keyword arguments.
\nThe following dictionary-like methods work on the element attributes.
\nGets the element attribute named key.
\nReturns the attribute value, or default if the attribute was not found.
\nThe following methods work on the element’s children (subelements).
\nAppends subelements from a sequence object with zero or more elements.\nRaises AssertionError if a subelement is not a valid object.
\n\nNew in version 2.7.
\n\nDeprecated since version 2.7: Use list(elem) or iteration.
\n\nDeprecated since version 2.7: Use method Element.iter() instead.
\nCreates a tree iterator with the current element as the root.\nThe iterator iterates over this element and all elements below it, in\ndocument (depth first) order. If tag is not None or '*', only\nelements whose tag equals tag are returned from the iterator. If the\ntree structure is modified during iteration, the result is undefined.
\n\nNew in version 2.7.
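A quick sketch of the document-order traversal, with and without a tag filter (document invented):

```python
from xml.etree.ElementTree import XML

root = XML('<a><b><c/></b><b/></a>')

print([e.tag for e in root.iter()])      # ['a', 'b', 'c', 'b']
print([e.tag for e in root.iter('b')])   # ['b', 'b']
```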
\nFinds all matching subelements, by tag name or path. Returns an iterable\nyielding all matching elements in document order.
\n\nNew in version 2.7.
\nCreates a text iterator. The iterator loops over this element and all\nsubelements, in document order, and returns all inner text.
\n\nNew in version 2.7.
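A one-line sketch on an invented mixed-content element:

```python
from xml.etree.ElementTree import XML

root = XML('<p>one <b>two</b> three</p>')
print(''.join(root.itertext()))   # one two three
```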
\nElement objects also support the following sequence type methods\nfor working with subelements: __delitem__(), __getitem__(),\n__setitem__(), __len__().
\nCaution: Elements with no subelements will test as False. This behavior\nwill change in future versions. Use an explicit len(elem) or elem is\nNone test instead.
element = root.find('foo')

if not element:  # careful!
    print "element not found, or element has no subelements"

if element is None:
    print "element not found"
ElementTree wrapper class. This class represents an entire element\nhierarchy, and adds some extra support for serialization to and from\nstandard XML.
\nelement is the root element. The tree is initialized with the contents\nof the XML file if given.
\n\nDeprecated since version 2.7: Use method ElementTree.iter() instead.
\nFinds all matching subelements, by tag name or path. Same as\ngetroot().iterfind(match). Returns an iterable yielding all matching\nelements in document order.
\n\nNew in version 2.7.
\nThis is the XML file that is going to be manipulated:
\n<html>\n <head>\n <title>Example page</title>\n </head>\n <body>\n <p>Moved to <a href=\"http://example.org/\">example.org</a>\n or <a href=\"http://example.com/\">example.com</a>.</p>\n </body>\n</html>
\nExample of changing the attribute “target” of every link in first paragraph:
>>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree()
>>> tree.parse("index.xhtml")
<Element 'html' at 0xb77e6fac>
>>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
>>> p
<Element 'p' at 0xb77ec26c>
>>> links = list(p.iter("a"))   # Returns list of all links
>>> links
[<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
>>> for i in links:             # Iterates through all found links
...     i.attrib["target"] = "blank"
>>> tree.write("output.xhtml")
Generic element structure builder. This builder converts a sequence of\nstart, data, and end method calls to a well-formed element structure. You\ncan use this class to build an element structure using a custom XML parser,\nor a parser for some other XML-like format. If given, element_factory is\ncalled to create new Element instances.
\nIn addition, a custom TreeBuilder object can provide the\nfollowing method:
\nHandles a doctype declaration. name is the doctype name. pubid is\nthe public identifier. system is the system identifier. This method\ndoes not exist on the default TreeBuilder class.
\n\nNew in version 2.7.
\nElement structure builder for XML source data, based on the expat\nparser. html is a flag for predefined HTML entities; it is not supported by\nthe current implementation. target is the target object; if omitted, the\nbuilder uses an instance of the standard TreeBuilder class. encoding [1]\nis optional; if given, the value overrides the encoding specified in the\nXML file.
\n\nDeprecated since version 2.7: Define the TreeBuilder.doctype() method on a custom TreeBuilder\ntarget.
\nXMLParser.feed() calls target's start() method\nfor each opening tag, its end() method for each closing tag,\nand its data() method for character data. XMLParser.close()\ncalls target's close() method.\nXMLParser can be used for more than building a tree structure.\nThis example counts the maximum depth of an XML file:
\n>>> from xml.etree.ElementTree import XMLParser\n>>> class MaxDepth: # The target object of the parser\n... maxDepth = 0\n... depth = 0\n... def start(self, tag, attrib): # Called for each opening tag.\n... self.depth += 1\n... if self.depth > self.maxDepth:\n... self.maxDepth = self.depth\n... def end(self, tag): # Called for each closing tag.\n... self.depth -= 1\n... def data(self, data):\n... pass # We do not need to do anything with data.\n... def close(self): # Called when all data has been parsed.\n... return self.maxDepth\n...\n>>> target = MaxDepth()\n>>> parser = XMLParser(target=target)\n>>> exampleXml = """\n... <a>\n... <b>\n... </b>\n... <b>\n... <c>\n... <d>\n... </d>\n... </c>\n... </b>\n... </a>"""\n>>> parser.feed(exampleXml)\n>>> parser.close()\n4\n
Footnotes
\n[1] The encoding string included in XML output should conform to the\nappropriate standards. For example, “UTF-8” is valid, but “UTF8” is\nnot. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl\nand http://www.iana.org/assignments/character-sets.
Source code: Lib/cgi.py
\nSupport module for Common Gateway Interface (CGI) scripts.
\nThis module defines a number of utilities for use by CGI scripts written in\nPython.
\nA CGI script is invoked by an HTTP server, usually to process user input\nsubmitted through an HTML <FORM> or <ISINDEX> element.
\nMost often, CGI scripts live in the server’s special cgi-bin directory.\nThe HTTP server places all sorts of information about the request (such as the\nclient’s hostname, the requested URL, the query string, and lots of other\ngoodies) in the script’s shell environment, executes the script, and sends the\nscript’s output back to the client.
\nThe script’s input is connected to the client too, and sometimes the form data\nis read this way; at other times the form data is passed via the “query string”\npart of the URL. This module is intended to take care of the different cases\nand provide a simpler interface to the Python script. It also provides a number\nof utilities that help in debugging scripts, and the latest addition is support\nfor file uploads from a form (if your browser supports it).
\nThe output of a CGI script should consist of two sections, separated by a blank\nline. The first section contains a number of headers, telling the client what\nkind of data is following. Python code to generate a minimal header section\nlooks like this:
\nprint "Content-Type: text/html" # HTML is following\nprint # blank line, end of headers\n
The second section is usually HTML, which allows the client software to display\nnicely formatted text with header, in-line images, etc. Here’s Python code that\nprints a simple piece of HTML:
\nprint "<TITLE>CGI script output</TITLE>"\nprint "<H1>This is my first CGI script</H1>"\nprint "Hello, world!"\n
Begin by writing import cgi. Do not use from cgi import * — the\nmodule defines all sorts of names for its own use or for backward compatibility\nthat you don’t want in your namespace.
\nWhen you write a new script, consider adding these lines:
\nimport cgitb\ncgitb.enable()\n
This activates a special exception handler that will display detailed reports in\nthe Web browser if any errors occur. If you’d rather not show the guts of your\nprogram to users of your script, you can have the reports saved to files\ninstead, with code like this:
\nimport cgitb\ncgitb.enable(display=0, logdir="/tmp")\n
It’s very helpful to use this feature during script development. The reports\nproduced by cgitb provide information that can save you a lot of time in\ntracking down bugs. You can always remove the cgitb line later when you\nhave tested your script and are confident that it works correctly.
\nTo get at submitted form data, it's best to use the FieldStorage class;\nthe other classes defined in this module are provided mostly for backward\ncompatibility. Instantiate FieldStorage exactly once, without arguments. This\nreads the form contents from standard input or the environment (depending on\nthe value of various environment variables set according to the CGI standard).\nSince it may consume standard input, it should be instantiated only once.
\nThe FieldStorage instance can be indexed like a Python dictionary.\nIt allows membership testing with the in operator, and also supports\nthe standard dictionary method keys() and the built-in function\nlen(). Form fields containing empty strings are ignored and do not appear\nin the dictionary; to keep such values, provide a true value for the optional\nkeep_blank_values keyword parameter when creating the FieldStorage\ninstance.
\nFor instance, the following code (which assumes that the\nContent-Type header and blank line have already been printed)\nchecks that the fields name and addr are both set to a non-empty\nstring:
\nform = cgi.FieldStorage()\nif "name" not in form or "addr" not in form:\n print "<H1>Error</H1>"\n print "Please fill in the name and addr fields."\n return\nprint "<p>name:", form["name"].value\nprint "<p>addr:", form["addr"].value\n...further form processing here...\n
Here the fields, accessed through form[key], are themselves instances of\nFieldStorage (or MiniFieldStorage, depending on the form\nencoding). The value attribute of the instance yields the string value\nof the field. The getvalue() method returns this string value directly;\nit also accepts an optional second argument as a default to return if the\nrequested key is not present.
\nIf the submitted form data contains more than one field with the same name, the\nobject retrieved by form[key] is not a FieldStorage or\nMiniFieldStorage instance but a list of such instances. Similarly, in\nthis situation, form.getvalue(key) would return a list of strings. If you\nexpect this possibility (when your HTML form contains multiple fields with the\nsame name), use the getlist() function, which always returns a list of\nvalues (so that you do not need to special-case the single item case). For\nexample, this code concatenates any number of username fields, separated by\ncommas:
\nvalue = form.getlist("username")\nusernames = ",".join(value)\n
If a field represents an uploaded file, accessing the value via the\nvalue attribute or the getvalue() method reads the entire file in\nmemory as a string. This may not be what you want. You can test for an uploaded\nfile by testing either the filename attribute or the file\nattribute. You can then read the data at leisure from the file\nattribute:
\nfileitem = form["userfile"]\nif fileitem.file:\n # It's an uploaded file; count lines\n linecount = 0\n while 1:\n line = fileitem.file.readline()\n if not line: break\n linecount = linecount + 1\n
If an error is encountered when obtaining the contents of an uploaded file\n(for example, when the user interrupts the form submission by clicking on\na Back or Cancel button) the done attribute of the object for the\nfield will be set to the value -1.
\nThe file upload draft standard entertains the possibility of uploading multiple\nfiles from one field (using a recursive multipart/* encoding).\nWhen this occurs, the item will be a dictionary-like FieldStorage item.\nThis can be determined by testing its type attribute, which should be\nmultipart/form-data (or perhaps another MIME type matching\nmultipart/*). In this case, it can be iterated over recursively\njust like the top-level form object.
\nWhen a form is submitted in the “old” format (as the query string or as a single\ndata part of type application/x-www-form-urlencoded), the items will\nactually be instances of the class MiniFieldStorage. In this case, the\nlist, file, and filename attributes are always None.
\nA form submitted via POST that also has a query string will contain both\nFieldStorage and MiniFieldStorage items.
\n\nNew in version 2.2.
\nThe previous section explains how to read CGI form data using the\nFieldStorage class. This section describes a higher level interface\nwhich was added to this class to allow one to do it in a more readable and\nintuitive way. The interface doesn’t make the techniques described in previous\nsections obsolete — they are still useful to process file uploads efficiently,\nfor example.
\nThe interface consists of two simple methods. Using them, you can process\nform data in a generic way, without needing to worry whether one value or\nseveral were posted under one name.
\nIn the previous section, you learned to write the following code whenever you\nexpected a user to post more than one value under one name:
item = form.getvalue("item")
if isinstance(item, list):
    # The user is requesting more than one item.
    pass
else:
    # The user is requesting only one item.
    pass
\nThis situation is common for example when a form contains a group of multiple\ncheckboxes with the same name:
<input type="checkbox" name="item" value="1" />
<input type="checkbox" name="item" value="2" />
\nIn most situations, however, a form contains only one control with a\nparticular name, and then you expect and need only one value associated with\nthat name. So you write a script containing, for example, this code:
\nuser = form.getvalue("user").upper()\n
The problem with this code is that you should never expect a client to\nprovide valid input to your scripts. For example, if a curious user appends\nanother user=foo pair to the query string, then the script crashes,\nbecause in this situation the getvalue("user") method call returns a list\ninstead of a string. Calling the upper() method on a list is not valid\n(since lists do not have a method of this name) and results in an\nAttributeError exception.
\nTherefore, the appropriate way to read form data values has been to always\nuse code that checks whether the obtained value is a single value or a list\nof values. That is annoying and leads to less readable scripts.
\nA more convenient approach is to use the methods getfirst() and\ngetlist() provided by this higher level interface.
\nUsing these methods you can write nice compact code:
\nimport cgi\nform = cgi.FieldStorage()\nuser = form.getfirst("user", "").upper() # This way it's safe.\nfor item in form.getlist("item"):\n do_something(item)\n
\nDeprecated since version 2.6.
\nSvFormContentDict stores single value form content as dictionary; it\nassumes each field name occurs in the form only once.
\nFormContentDict stores multiple value form content as a dictionary (the\nform items are lists of values). Useful if your form contains multiple fields\nwith the same name.
\nOther classes (FormContent, InterpFormContentDict) are present\nfor backwards compatibility with really old applications only.
\nThese are useful if you want more control, or if you want to employ some of the\nalgorithms implemented in this module in other circumstances.
\nParse input of type multipart/form-data (for file uploads).\nArguments are fp for the input file and pdict for a dictionary containing\nother parameters in the Content-Type header.
\nReturns a dictionary just like urlparse.parse_qs(): keys are the field names, each\nvalue is a list of values for that field. This is easy to use but not much good\nif you are expecting megabytes to be uploaded — in that case, use the\nFieldStorage class instead, which is much more flexible.
\nNote that this does not parse nested multipart parts — use\nFieldStorage for that.
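For reference, the dict-of-lists shape described above can be sketched with a simplified parser. This toy version skips %-decoding and blank-value handling; use urlparse.parse_qs() or FieldStorage in real code:

```python
def parse_qs_sketch(qs):
    # Map each field name to the list of values submitted under it,
    # so repeated names (e.g. "item=1&item=2") are preserved.
    result = {}
    for pair in qs.split('&'):
        if '=' in pair:
            key, value = pair.split('=', 1)
            result.setdefault(key, []).append(value)
    return result

print(parse_qs_sketch("name=Joe&item=1&item=2"))
# {'name': ['Joe'], 'item': ['1', '2']}
```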
\nConvert the characters '&', '<' and '>' in string s to HTML-safe\nsequences. Use this if you need to display text that might contain such\ncharacters in HTML. If the optional flag quote is true, the quotation mark\ncharacter (") is also translated; this helps for inclusion in an HTML\nattribute value delimited by double quotes, as in <a href="...">. Note\nthat single quotes are never translated.
\nIf the value to be quoted might include single- or double-quote characters,\nor both, consider using the quoteattr() function in the\nxml.sax.saxutils module instead.
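The translation can be sketched as follows. This is a minimal reimplementation of the documented behavior for illustration only (the function name is ours); call cgi.escape() in real code:

```python
def escape_sketch(s, quote=False):
    # '&' must be replaced first, or the later replacements
    # ('&lt;', '&gt;') would themselves get re-escaped.
    s = s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")  # single quotes are never translated
    return s

print(escape_sketch('<a href="x&y">', quote=True))
# &lt;a href=&quot;x&amp;y&quot;&gt;
```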
\nThere’s one important rule: if you invoke an external program (via the\nos.system() or os.popen() functions, or others with similar\nfunctionality), make very sure you don’t pass arbitrary strings received from\nthe client to the shell. This is a well-known security hole whereby clever\nhackers anywhere on the Web can exploit a gullible CGI script to invoke\narbitrary shell commands. Even parts of the URL or field names cannot be\ntrusted, since the request doesn’t have to come from your form!
\nTo be on the safe side, if you must pass a string gotten from a form to a shell\ncommand, you should make sure the string contains only alphanumeric characters,\ndashes, underscores, and periods.
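That rule can be enforced with a simple whitelist check before the string ever reaches a shell. A sketch, with an illustrative function name of our own:

```python
import re

def is_shell_safe(s):
    # Accept only alphanumerics, dashes, underscores, and periods,
    # per the guideline above; reject everything else (including empty).
    return re.match(r'^[A-Za-z0-9._-]+$', s) is not None

print(is_shell_safe("joe.blow-1"))     # True
print(is_shell_safe("joe; rm -rf /"))  # False
```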
\nRead the documentation for your HTTP server and check with your local system\nadministrator to find the directory where CGI scripts should be installed;\nusually this is in a directory cgi-bin in the server tree.
\nMake sure that your script is readable and executable by “others”; the Unix file\nmode should be 0755 octal (use chmod 0755 filename). Make sure that the\nfirst line of the script contains #! starting in column 1 followed by the\npathname of the Python interpreter, for instance:
\n#!/usr/local/bin/python\n
Make sure the Python interpreter exists and is executable by “others”.
\nMake sure that any files your script needs to read or write are readable or\nwritable, respectively, by “others” — their mode should be 0644 for\nreadable and 0666 for writable. This is because, for security reasons, the\nHTTP server executes your script as user “nobody”, without any special\nprivileges. It can only read (write, execute) files that everybody can read\n(write, execute). The current directory at execution time is also different (it\nis usually the server’s cgi-bin directory) and the set of environment variables\nis also different from what you get when you log in. In particular, don’t count\non the shell’s search path for executables (PATH) or the Python module\nsearch path (PYTHONPATH) to be set to anything interesting.
\nIf you need to load modules from a directory which is not on Python’s default\nmodule search path, you can change the path in your script, before importing\nother modules. For example:
\nimport sys\nsys.path.insert(0, "/usr/home/joe/lib/python")\nsys.path.insert(0, "/usr/local/lib/python")\n
(This way, the directory inserted last will be searched first!)
\nInstructions for non-Unix systems will vary; check your HTTP server’s\ndocumentation (it will usually have a section on CGI scripts).
\nUnfortunately, a CGI script will generally not run when you try it from the\ncommand line, and a script that works perfectly from the command line may fail\nmysteriously when run from the server. There’s one reason why you should still\ntest your script from the command line: if it contains a syntax error, the\nPython interpreter won’t execute it at all, and the HTTP server will most likely\nsend a cryptic error to the client.
\nAssuming your script has no syntax errors, yet it does not work, you have no\nchoice but to read the next section.
\nFirst of all, check for trivial installation errors — reading the section\nabove on installing your CGI script carefully can save you a lot of time. If\nyou wonder whether you have understood the installation procedure correctly, try\ninstalling a copy of this module file (cgi.py) as a CGI script. When\ninvoked as a script, the file will dump its environment and the contents of the\nform in HTML form. Give it the right mode etc, and send it a request. If it’s\ninstalled in the standard cgi-bin directory, it should be possible to\nsend it a request by entering a URL into your browser of the form:
\nhttp://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
\nIf this gives an error of type 404, the server cannot find the script – perhaps\nyou need to install it in a different directory. If it gives another error,\nthere’s an installation problem that you should fix before trying to go any\nfurther. If you get a nicely formatted listing of the environment and form\ncontent (in this example, the fields should be listed as “addr” with value “At\nHome” and “name” with value “Joe Blow”), the cgi.py script has been\ninstalled correctly. If you follow the same procedure for your own script, you\nshould now be able to debug it.
\nThe next step could be to call the cgi module’s test() function\nfrom your script: replace its main code with the single statement
\ncgi.test()\n
This should produce the same results as those gotten from installing the\ncgi.py file itself.
\nWhen an ordinary Python script raises an unhandled exception (for whatever\nreason: a typo in a module name, a file that can’t be opened, etc.), the\nPython interpreter prints a nice traceback and exits. While the Python\ninterpreter will still do this when your CGI script raises an exception, most\nlikely the traceback will end up in one of the HTTP server’s log files, or be\ndiscarded altogether.
\nFortunately, once you have managed to get your script to execute some code,\nyou can easily send tracebacks to the Web browser using the cgitb module.\nIf you haven’t done so already, just add the lines:
\nimport cgitb\ncgitb.enable()\n
to the top of your script. Then try running it again; when a problem occurs,\nyou should see a detailed report that will likely make apparent the cause of the\ncrash.
\nIf you suspect that there may be a problem in importing the cgitb module,\nyou can use an even more robust approach (which only uses built-in modules):
\nimport sys\nsys.stderr = sys.stdout\nprint "Content-Type: text/plain"\nprint\n...your code here...\n
This relies on the Python interpreter to print the traceback. The content type\nof the output is set to plain text, which disables all HTML processing. If your\nscript works, the raw HTML will be displayed by your client. If it raises an\nexception, most likely after the first two lines have been printed, a traceback\nwill be displayed. Because no HTML interpretation is going on, the traceback\nwill be readable.
\nFootnotes
\n[1] Note that some recent versions of the HTML specification do state what order the\nfield values should be supplied in, but knowing whether a request was\nreceived from a conforming browser, or even from a browser at all, is tedious\nand error-prone.
Note
\nThe urllib module has been split into parts and renamed in\nPython 3.0 to urllib.request, urllib.parse,\nand urllib.error. The 2to3 tool will automatically adapt\nimports when converting your sources to 3.0.\nAlso note that the urllib.urlopen() function has been removed in\nPython 3.0 in favor of urllib2.urlopen().
\nThis module provides a high-level interface for fetching data across the World\nWide Web. In particular, the urlopen() function is similar to the\nbuilt-in function open(), but accepts Universal Resource Locators (URLs)\ninstead of filenames. Some restrictions apply — it can only open URLs for\nreading, and no seek operations are available.
\nWarning
\nWhen opening HTTPS URLs, urlopen() does not attempt to validate the\nserver certificate. Use at your own risk!
\nOpen a network object denoted by a URL for reading. If the URL does not have a\nscheme identifier, or if it has file: as its scheme identifier, this\nopens a local file (without universal newlines); otherwise it opens a socket to\na server somewhere on the network. If the connection cannot be made the\nIOError exception is raised. If all went well, a file-like object is\nreturned. This supports the following methods: read(), readline(),\nreadlines(), fileno(), close(), info(), getcode() and\ngeturl(). It also has proper support for the iterator protocol. One\ncaveat: the read() method, if the size argument is omitted or negative,\nmay not read until the end of the data stream; there is no good way to determine\nthat the entire stream from a socket has been read in the general case.
\nExcept for the info(), getcode() and geturl() methods,\nthese methods have the same interface as for file objects — see section\nFile Objects in this manual. (It is not a built-in file object,\nhowever, so it can’t be used at those few places where a true built-in file\nobject is required.)
\nThe info() method returns an instance of the class\nmimetools.Message containing meta-information associated with the\nURL. When the method is HTTP, these headers are those returned by the server\nat the head of the retrieved HTML page (including Content-Length and\nContent-Type). When the method is FTP, a Content-Length header will be\npresent if (as is now usual) the server passed back a file length in response\nto the FTP retrieval request. A Content-Type header will be present if the\nMIME type can be guessed. When the method is local-file, returned headers\nwill include a Date representing the file’s last-modified time, a\nContent-Length giving file size, and a Content-Type containing a guess at the\nfile’s type. See also the description of the mimetools module.
\nThe geturl() method returns the real URL of the page. In some cases, the\nHTTP server redirects a client to another URL. The urlopen() function\nhandles this transparently, but in some cases the caller needs to know which URL\nthe client was redirected to. The geturl() method can be used to get at\nthis redirected URL.
\nThe getcode() method returns the HTTP status code that was sent with the\nresponse, or None if the URL is not an HTTP URL.
\nIf the url uses the http: scheme identifier, the optional data\nargument may be given to specify a POST request (normally the request type\nis GET). The data argument must be in standard\napplication/x-www-form-urlencoded format; see the urlencode()\nfunction below.
\nThe urlopen() function works transparently with proxies which do not\nrequire authentication. In a Unix or Windows environment, set the\nhttp_proxy, or ftp_proxy environment variables to a URL that\nidentifies the proxy server before starting the Python interpreter. For example\n(the '%' is the command prompt):
% http_proxy="http://www.someproxy.com:3128"
% export http_proxy
% python
...
\nThe no_proxy environment variable can be used to specify hosts which\nshouldn’t be reached via proxy; if set, it should be a comma-separated list\nof hostname suffixes, optionally with :port appended, for example\ncern.ch,ncsa.uiuc.edu,some.host:8080.
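The suffix matching described here can be sketched like this (illustrative only; urllib's own proxy-bypass handling is more involved):

```python
def bypass_proxy(host, no_proxy):
    # no_proxy is a comma-separated list of hostname suffixes,
    # optionally with :port appended; host may include a port too.
    for suffix in no_proxy.split(','):
        suffix = suffix.strip()
        if suffix and host.endswith(suffix):
            return True
    return False

print(bypass_proxy("www.cern.ch", "cern.ch,ncsa.uiuc.edu"))  # True
print(bypass_proxy("example.com", "cern.ch,ncsa.uiuc.edu"))  # False
```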
\nIn a Windows environment, if no proxy environment variables are set, proxy\nsettings are obtained from the registry’s Internet Settings section.
\nIn a Mac OS X environment, urlopen() will retrieve proxy information\nfrom the OS X System Configuration Framework, which can be managed with\nNetwork System Preferences panel.
\nAlternatively, the optional proxies argument may be used to explicitly specify\nproxies. It must be a dictionary mapping scheme names to proxy URLs, where an\nempty dictionary causes no proxies to be used, and None (the default value)\ncauses environmental proxy settings to be used as discussed above. For\nexample:
\n# Use http://www.someproxy.com:3128 for http proxying\nproxies = {'http': 'http://www.someproxy.com:3128'}\nfilehandle = urllib.urlopen(some_url, proxies=proxies)\n# Don't use any proxies\nfilehandle = urllib.urlopen(some_url, proxies={})\n# Use proxies from environment - both versions are equivalent\nfilehandle = urllib.urlopen(some_url, proxies=None)\nfilehandle = urllib.urlopen(some_url)\n
Proxies which require authentication for use are not currently supported; this\nis considered an implementation limitation.
\n\nChanged in version 2.3: Added the proxies support.
\n\nChanged in version 2.6: Added getcode() to returned object and support for the\nno_proxy environment variable.
\n\nDeprecated since version 2.6: The urlopen() function has been removed in Python 3.0 in favor\nof urllib2.urlopen().
\nCopy a network object denoted by a URL to a local file, if necessary. If the URL\npoints to a local file, or a valid cached copy of the object exists, the object\nis not copied. Return a tuple (filename, headers) where filename is the\nlocal file name under which the object can be found, and headers is whatever\nthe info() method of the object returned by urlopen() returned (for\na remote object, possibly cached). Exceptions are the same as for\nurlopen().
\nThe second argument, if present, specifies the file location to copy to (if\nabsent, the location will be a tempfile with a generated name). The third\nargument, if present, is a hook function that will be called once on\nestablishment of the network connection and once after each block read\nthereafter. The hook will be passed three arguments; a count of blocks\ntransferred so far, a block size in bytes, and the total size of the file. The\nthird argument may be -1 on older FTP servers which do not return a file\nsize in response to a retrieval request.
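A minimal progress hook matching this three-argument signature might look like the following sketch (the function names are our own):

```python
def format_progress(block_count, block_size, total_size):
    # total_size may be -1 on older FTP servers that report no file size.
    transferred = block_count * block_size
    if total_size > 0:
        pct = min(100, 100 * transferred // total_size)
        return "%d%% (%d of %d bytes)" % (pct, transferred, total_size)
    return "%d bytes transferred so far" % transferred

def report_progress(block_count, block_size, total_size):
    # Called once on connection, then after each block read.
    print(format_progress(block_count, block_size, total_size))
```

It would then be passed as the third argument, e.g. urllib.urlretrieve(url, filename, report_progress).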
\nIf the url uses the http: scheme identifier, the optional data\nargument may be given to specify a POST request (normally the request type\nis GET). The data argument must be in standard\napplication/x-www-form-urlencoded format; see the urlencode()\nfunction below.
\n\nChanged in version 2.5: urlretrieve() will raise ContentTooShortError when it detects that\nthe amount of data available was less than the expected amount (which is the\nsize reported by a Content-Length header). This can occur, for example, when\nthe download is interrupted.
The Content-Length is treated as a lower bound: if there’s more data to read,\nurlretrieve() reads more data, but if less data is available, it raises\nthe exception.
\nYou can still retrieve the downloaded data in this case, it is stored in the\ncontent attribute of the exception instance.
\nIf no Content-Length header was supplied, urlretrieve() cannot check\nthe size of the data it has downloaded, and just returns it. In this case you\njust have to assume that the download was successful.
\n\nThe public functions urlopen() and urlretrieve() create an instance\nof the FancyURLopener class and use it to perform their requested\nactions. To override this functionality, programmers can create a subclass of\nURLopener or FancyURLopener, then assign an instance of that\nclass to the urllib._urlopener variable before calling the desired function.\nFor example, applications may want to specify a different\nUser-Agent header than URLopener defines. This can be\naccomplished with the following code:
\nimport urllib\n\nclass AppURLopener(urllib.FancyURLopener):\n version = "App/1.7"\n\nurllib._urlopener = AppURLopener()\n
Replace special characters in string using the %xx escape. Letters,\ndigits, and the characters '_.-' are never quoted. By default, this\nfunction is intended for quoting the path section of the URL. The optional\nsafe parameter specifies additional characters that should not be quoted\n— its default value is '/'.
\nExample: quote('/~connolly/') yields '/%7econnolly/'.
\nReplace %xx escapes by their single-character equivalent.
\nExample: unquote('/%7Econnolly/') yields '/~connolly/'.
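The %xx escaping can be sketched with a toy reimplementation of the documented rules (it emits uppercase hex, whereas the example above shows lowercase; use urllib.quote() and urllib.unquote() in real code):

```python
import re
import string

# Letters, digits, and '_.-' are never quoted.
_ALWAYS_SAFE = set(string.ascii_letters + string.digits + '_.-')

def quote_sketch(s, safe='/'):
    # safe lists additional characters to leave unquoted; '/' by default,
    # since this function targets the path section of a URL.
    keep = _ALWAYS_SAFE | set(safe)
    return ''.join(c if c in keep else '%%%02X' % ord(c) for c in s)

def unquote_sketch(s):
    # Replace each %xx escape by its single-character equivalent.
    return re.sub(r'%([0-9A-Fa-f]{2})',
                  lambda m: chr(int(m.group(1), 16)), s)

print(quote_sketch('/~connolly/'))      # /%7Econnolly/
print(unquote_sketch('/%7Econnolly/'))  # /~connolly/
```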
\nBase class for opening and reading URLs. Unless you need to support opening\nobjects using schemes other than http:, ftp:, or file:,\nyou probably want to use FancyURLopener.
\nBy default, the URLopener class sends a User-Agent header\nof urllib/VVV, where VVV is the urllib version number.\nApplications can define their own User-Agent header by subclassing\nURLopener or FancyURLopener and setting the class attribute\nversion to an appropriate string value in the subclass definition.
\nThe optional proxies parameter should be a dictionary mapping scheme names to\nproxy URLs, where an empty dictionary turns proxies off completely. Its default\nvalue is None, in which case environmental proxy settings will be used if\npresent, as discussed in the definition of urlopen(), above.
\nAdditional keyword parameters, collected in x509, may be used for\nauthentication of the client when using the https: scheme. The keywords\nkey_file and cert_file are supported to provide an SSL key and certificate;\nboth are needed to support client authentication.
\nURLopener objects will raise an IOError exception if the server\nreturns an error code.
open(fullurl[, data])
Open fullurl using the appropriate protocol. This method sets up cache and proxy information, then calls the appropriate open method with its input arguments. If the scheme is not recognized, open_unknown() is called. The data argument has the same meaning as the data argument of urlopen().
open_unknown(fullurl[, data])
Overridable interface to open unknown URL types.
retrieve(url[, filename[, reporthook[, data]]])
Retrieves the contents of url and places it in filename. The return value is a tuple consisting of a local filename and either a mimetools.Message object containing the response headers (for remote URLs) or None (for local URLs). The caller must then open and read the contents of filename. If filename is not given and the URL refers to a local file, the input filename is returned. If the URL is non-local and filename is not given, the filename is the output of tempfile.mktemp() with a suffix that matches the suffix of the last path component of the input URL. If reporthook is given, it must be a function accepting three numeric parameters. It will be called after each chunk of data is read from the network. reporthook is ignored for local URLs.
If the url uses the http: scheme identifier, the optional data argument may be given to specify a POST request (normally the request type is GET). The data argument must be in standard application/x-www-form-urlencoded format; see the urlencode() function below.
version
Variable that specifies the user agent of the opener object. To get urllib to tell servers that it is a particular user agent, set this in a subclass as a class variable or in the constructor before calling the base constructor.
FancyURLopener subclasses URLopener providing default handling\nfor the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x\nresponse codes listed above, the Location header is used to fetch\nthe actual URL. For 401 response codes (authentication required), basic HTTP\nauthentication is performed. For the 30x response codes, recursion is bounded\nby the value of the maxtries attribute, which defaults to 10.
\nFor all other response codes, the method http_error_default() is called\nwhich you can override in subclasses to handle the error appropriately.
\nNote
\nAccording to the letter of RFC 2616, 301 and 302 responses to POST requests\nmust not be automatically redirected without confirmation by the user. In\nreality, browsers do allow automatic redirection of these responses, changing\nthe POST to a GET, and urllib reproduces this behaviour.
\nThe parameters to the constructor are the same as those for URLopener.
\nNote
When performing basic authentication, a FancyURLopener instance calls its prompt_user_passwd() method. The default implementation asks the user for the required information on the controlling terminal. A subclass may override this method to support more appropriate behavior if needed.
The FancyURLopener class offers one additional method that should be\noverloaded to provide the appropriate behavior:
\nReturn information needed to authenticate the user at the given host in the\nspecified security realm. The return value should be a tuple, (user,\npassword), which can be used for basic authentication.
\nThe implementation prompts for this information on the terminal; an application\nshould override this method to use an appropriate interaction model in the local\nenvironment.
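A minimal sketch of such an override; the subclass name and the hard-coded credentials are purely illustrative, and a real application would consult a keyring, prompt in a GUI, or read a configuration file. The try/except covers Python 3, where the class lives (deprecated) in urllib.request:

```python
try:
    from urllib import FancyURLopener          # Python 2
except ImportError:
    from urllib.request import FancyURLopener  # Python 3 location (deprecated)

class ConfigOpener(FancyURLopener):
    # Custom User-Agent, per the `version` attribute described above
    version = 'ExampleClient/1.0'

    def prompt_user_passwd(self, host, realm):
        # Illustrative only: return stored credentials instead of
        # prompting on the controlling terminal.
        return ('alice', 'secret')

opener = ConfigOpener()
```

The opener can then be used in place of a plain FancyURLopener wherever basic authentication may be required.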
\nThis exception is raised when the urlretrieve() function detects that the\namount of the downloaded data is less than the expected amount (given by the\nContent-Length header). The content attribute stores the downloaded\n(and supposedly truncated) data.
\n\nNew in version 2.5.
\n\n\n
Currently, only the following protocols are supported: HTTP (versions 0.9 and 1.0), FTP, and local files.
\nThe caching feature of urlretrieve() has been disabled until I find the\ntime to hack proper processing of Expiration time headers.
\nThere should be a function to query whether a particular URL is in the cache.
\nFor backward compatibility, if a URL appears to point to a local file but the\nfile can’t be opened, the URL is re-interpreted using the FTP protocol. This\ncan sometimes cause confusing error messages.
\nThe urlopen() and urlretrieve() functions can cause arbitrarily\nlong delays while waiting for a network connection to be set up. This means\nthat it is difficult to build an interactive Web client using these functions\nwithout using threads.
\nThe data returned by urlopen() or urlretrieve() is the raw data\nreturned by the server. This may be binary data (such as an image), plain text\nor (for example) HTML. The HTTP protocol provides type information in the reply\nheader, which can be inspected by looking at the Content-Type\nheader. If the returned data is HTML, you can use the module htmllib to\nparse it.
\nThe code handling the FTP protocol cannot differentiate between a file and a\ndirectory. This can lead to unexpected behavior when attempting to read a URL\nthat points to a file that is not accessible. If the URL ends in a /, it is\nassumed to refer to a directory and will be handled accordingly. But if an\nattempt to read a file leads to a 550 error (meaning the URL cannot be found or\nis not accessible, often for permission reasons), then the path is treated as a\ndirectory in order to handle the case when a directory is specified by a URL but\nthe trailing / has been left off. This can cause misleading results when\nyou try to fetch a file whose read permissions make it inaccessible; the FTP\ncode will try to read it, fail with a 550 error, and then perform a directory\nlisting for the unreadable file. If fine-grained control is needed, consider\nusing the ftplib module, subclassing FancyURLopener, or changing\n_urlopener to meet your needs.
\nThis module does not support the use of proxies which require authentication.\nThis may be implemented in the future.
\nAlthough the urllib module contains (undocumented) routines to parse\nand unparse URL strings, the recommended interface for URL manipulation is in\nmodule urlparse.
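As a quick illustration of the urlparse interface (the try/except covers Python 3, where the module was renamed urllib.parse):

```python
try:
    from urlparse import urlparse, urlunparse      # Python 2
except ImportError:
    from urllib.parse import urlparse, urlunparse  # renamed in Python 3

url = 'http://www.example.com/cgi-bin/query?spam=1&eggs=2'
parts = urlparse(url)
print(parts.scheme)    # 'http'
print(parts.netloc)    # 'www.example.com'
print(parts.path)      # '/cgi-bin/query'
print(parts.query)     # 'spam=1&eggs=2'

# urlunparse reassembles the components; parsing round-trips
print(urlunparse(parts) == url)  # True
```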
\nHere is an example session that uses the GET method to retrieve a URL\ncontaining parameters:
>>> import urllib
>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
>>> print f.read()
The following example uses the POST method instead:
\n>>> import urllib\n>>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})\n>>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query", params)\n>>> print f.read()\n
The following example uses an explicitly specified HTTP proxy, overriding\nenvironment settings:
\n>>> import urllib\n>>> proxies = {'http': 'http://proxy.example.com:8080/'}\n>>> opener = urllib.FancyURLopener(proxies)\n>>> f = opener.open("http://www.python.org")\n>>> f.read()\n
The following example uses no proxies at all, overriding environment settings:
\n>>> import urllib\n>>> opener = urllib.FancyURLopener({})\n>>> f = opener.open("http://www.python.org/")\n>>> f.read()\n
\nNew in version 2.5.
\nThe Web Server Gateway Interface (WSGI) is a standard interface between web\nserver software and web applications written in Python. Having a standard\ninterface makes it easy to use an application that supports WSGI with a number\nof different web servers.
\nOnly authors of web servers and programming frameworks need to know every detail\nand corner case of the WSGI design. You don’t need to understand every detail\nof WSGI just to install a WSGI application or to write a web application using\nan existing framework.
\nwsgiref is a reference implementation of the WSGI specification that can\nbe used to add WSGI support to a web server or framework. It provides utilities\nfor manipulating WSGI environment variables and response headers, base classes\nfor implementing WSGI servers, a demo HTTP server that serves WSGI applications,\nand a validation tool that checks WSGI servers and applications for conformance\nto the WSGI specification (PEP 333).
\nSee http://www.wsgi.org for more information about WSGI, and links to tutorials\nand other resources.
\nThis module provides a variety of utility functions for working with WSGI\nenvironments. A WSGI environment is a dictionary containing HTTP request\nvariables as described in PEP 333. All of the functions taking an environ\nparameter expect a WSGI-compliant dictionary to be supplied; please see\nPEP 333 for a detailed specification.
Return a guess for whether wsgi.url_scheme should be “http” or “https”, by checking for an HTTPS environment variable in the environ dictionary. The return value is a string.

This function is useful when creating a gateway that wraps CGI or a CGI-like protocol such as FastCGI. Typically, servers providing such protocols will include an HTTPS variable with a value of “1”, “yes”, or “on” when a request is received via SSL. So, this function returns “https” if such a value is found, and “http” otherwise.
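A quick illustration with hand-built environ dictionaries (only the HTTPS variable matters to this function):

```python
from wsgiref.util import guess_scheme

print(guess_scheme({'HTTPS': 'on'}))   # 'https'
print(guess_scheme({'HTTPS': '1'}))    # 'https'
print(guess_scheme({'HTTPS': 'no'}))   # 'http'
print(guess_scheme({}))                # 'http' when no HTTPS variable is set
```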
\nShift a single name from PATH_INFO to SCRIPT_NAME and return the name.\nThe environ dictionary is modified in-place; use a copy if you need to keep\nthe original PATH_INFO or SCRIPT_NAME intact.
\nIf there are no remaining path segments in PATH_INFO, None is returned.
\nTypically, this routine is used to process each portion of a request URI path,\nfor example to treat the path as a series of dictionary keys. This routine\nmodifies the passed-in environment to make it suitable for invoking another WSGI\napplication that is located at the target URI. For example, if there is a WSGI\napplication at /foo, and the request URI path is /foo/bar/baz, and the\nWSGI application at /foo calls shift_path_info(), it will receive the\nstring “bar”, and the environment will be updated to be suitable for passing to\na WSGI application at /foo/bar. That is, SCRIPT_NAME will change from\n/foo to /foo/bar, and PATH_INFO will change from /bar/baz to\n/baz.
\nWhen PATH_INFO is just a “/”, this routine returns an empty string and\nappends a trailing slash to SCRIPT_NAME, even though empty path segments are\nnormally ignored, and SCRIPT_NAME doesn’t normally end in a slash. This is\nintentional behavior, to ensure that an application can tell the difference\nbetween URIs ending in /x from ones ending in /x/ when using this\nroutine to do object traversal.
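The behaviour described above can be checked directly; the environ dictionaries here are hand-built rather than coming from a server:

```python
from wsgiref.util import shift_path_info

# Mirrors the /foo/bar/baz walk-through above
environ = {'SCRIPT_NAME': '/foo', 'PATH_INFO': '/bar/baz'}
print(shift_path_info(environ))   # 'bar'
print(environ['SCRIPT_NAME'])     # '/foo/bar'
print(environ['PATH_INFO'])       # '/baz'

# The special case: PATH_INFO of just '/'
environ = {'SCRIPT_NAME': '/foo', 'PATH_INFO': '/'}
print(repr(shift_path_info(environ)))  # '' (empty string, not None)
print(environ['SCRIPT_NAME'])          # '/foo/' (trailing slash appended)
```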
\nUpdate environ with trivial defaults for testing purposes.
\nThis routine adds various parameters required for WSGI, including HTTP_HOST,\nSERVER_NAME, SERVER_PORT, REQUEST_METHOD, SCRIPT_NAME,\nPATH_INFO, and all of the PEP 333-defined wsgi.* variables. It\nonly supplies default values, and does not replace any existing settings for\nthese variables.
\nThis routine is intended to make it easier for unit tests of WSGI servers and\napplications to set up dummy environments. It should NOT be used by actual WSGI\nservers or applications, since the data is fake!
\nExample usage:
from wsgiref.util import setup_testing_defaults
from wsgiref.simple_server import make_server

# A relatively simple WSGI application. It's going to print out the
# environment dictionary after being updated by setup_testing_defaults
def simple_app(environ, start_response):
    setup_testing_defaults(environ)

    status = '200 OK'
    headers = [('Content-type', 'text/plain')]

    start_response(status, headers)

    ret = ["%s: %s\n" % (key, value)
           for key, value in environ.iteritems()]
    return ret

httpd = make_server('', 8000, simple_app)
print "Serving on port 8000..."
httpd.serve_forever()
In addition to the environment functions above, the wsgiref.util module\nalso provides these miscellaneous utilities:
\nA wrapper to convert a file-like object to an iterator. The resulting objects\nsupport both __getitem__() and __iter__() iteration styles, for\ncompatibility with Python 2.1 and Jython. As the object is iterated over, the\noptional blksize parameter will be repeatedly passed to the filelike\nobject’s read() method to obtain strings to yield. When read()\nreturns an empty string, iteration is ended and is not resumable.
\nIf filelike has a close() method, the returned object will also have a\nclose() method, and it will invoke the filelike object’s close()\nmethod when called.
\nExample usage:
from StringIO import StringIO
from wsgiref.util import FileWrapper

# We're using a StringIO buffer as the file-like object
filelike = StringIO("This is an example file-like object"*10)
wrapper = FileWrapper(filelike, blksize=5)

for chunk in wrapper:
    print chunk
This module provides a single class, Headers, for convenient\nmanipulation of WSGI response headers using a mapping-like interface.
\nCreate a mapping-like object wrapping headers, which must be a list of header\nname/value tuples as described in PEP 333. Any changes made to the new\nHeaders object will directly update the headers list it was created\nwith.
\nHeaders objects support typical mapping operations including\n__getitem__(), get(), __setitem__(), setdefault(),\n__delitem__(), __contains__() and has_key(). For each of\nthese methods, the key is the header name (treated case-insensitively), and the\nvalue is the first value associated with that header name. Setting a header\ndeletes any existing values for that header, then adds a new value at the end of\nthe wrapped header list. Headers’ existing order is generally maintained, with\nnew headers added to the end of the wrapped list.
\nUnlike a dictionary, Headers objects do not raise an error when you try\nto get or delete a key that isn’t in the wrapped header list. Getting a\nnonexistent header just returns None, and deleting a nonexistent header does\nnothing.
\nHeaders objects also support keys(), values(), and\nitems() methods. The lists returned by keys() and items() can\ninclude the same key more than once if there is a multi-valued header. The\nlen() of a Headers object is the same as the length of its\nitems(), which is the same as the length of the wrapped header list. In\nfact, the items() method just returns a copy of the wrapped header list.
\nCalling str() on a Headers object returns a formatted string\nsuitable for transmission as HTTP response headers. Each header is placed on a\nline with its value, separated by a colon and a space. Each line is terminated\nby a carriage return and line feed, and the string is terminated with a blank\nline.
\nIn addition to their mapping interface and formatting features, Headers\nobjects also have the following methods for querying and adding multi-valued\nheaders, and for adding headers with MIME parameters:
\nReturn a list of all the values for the named header.
The values are returned in the order they appeared in the original header list or were added to this instance, and may contain duplicates. Any fields deleted and re-inserted are always appended to the header list. If no fields exist with the given name, an empty list is returned.
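Putting the mapping interface, get_all(), and str() together; the header names and values here are arbitrary examples:

```python
from wsgiref.headers import Headers

header_list = [('Content-Type', 'text/plain')]
h = Headers(header_list)

print(h['content-type'])   # lookup is case-insensitive: 'text/plain'
print(h['X-Missing'])      # None, not a KeyError

# add_header appends; get_all then returns every value for the name
h.add_header('Set-Cookie', 'a=1')
h.add_header('Set-Cookie', 'b=2')
print(h.get_all('Set-Cookie'))   # ['a=1', 'b=2']

# The wrapped list is updated in place
print(len(header_list))          # 3

# str() produces a transmission-ready block, CRLF-terminated,
# ending with a blank line
print(repr(str(h)))
```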
\nAdd a (possibly multi-valued) header, with optional MIME parameters specified\nvia keyword arguments.
\nname is the header field to add. Keyword arguments can be used to set MIME\nparameters for the header field. Each parameter must be a string or None.\nUnderscores in parameter names are converted to dashes, since dashes are illegal\nin Python identifiers, but many MIME parameter names include dashes. If the\nparameter value is a string, it is added to the header value parameters in the\nform name="value". If it is None, only the parameter name is added.\n(This is used for MIME parameters without a value.) Example usage:
\nh.add_header('content-disposition', 'attachment', filename='bud.gif')\n
The above will add a header that looks like this:
\nContent-Disposition: attachment; filename=\"bud.gif\"
\nThis module implements a simple HTTP server (based on BaseHTTPServer)\nthat serves WSGI applications. Each server instance serves a single WSGI\napplication on a given host and port. If you want to serve multiple\napplications on a single host and port, you should create a WSGI application\nthat parses PATH_INFO to select which application to invoke for each\nrequest. (E.g., using the shift_path_info() function from\nwsgiref.util.)
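A minimal sketch of such a dispatching application, using only wsgiref.util; the dispatch() and not_found() helpers are our own invention, not part of wsgiref:

```python
from wsgiref.util import shift_path_info

def not_found(environ, start_response):
    # Fallback application for unknown paths
    start_response('404 Not Found', [('Content-type', 'text/plain')])
    return ['Not Found']

def dispatch(apps):
    """Return a WSGI app that routes on the first PATH_INFO segment."""
    def router(environ, start_response):
        # shift_path_info moves the leading segment into SCRIPT_NAME,
        # leaving environ ready for the selected sub-application
        name = shift_path_info(environ)
        return apps.get(name, not_found)(environ, start_response)
    return router
```

The resulting router can be passed to make_server() like any other WSGI application.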
\nCreate a new WSGI server listening on host and port, accepting connections\nfor app. The return value is an instance of the supplied server_class, and\nwill process requests using the specified handler_class. app must be a WSGI\napplication object, as defined by PEP 333.
\nExample usage:
\nfrom wsgiref.simple_server import make_server, demo_app\n\nhttpd = make_server('', 8000, demo_app)\nprint "Serving HTTP on port 8000..."\n\n# Respond to requests until process is killed\nhttpd.serve_forever()\n\n# Alternative: serve one request, then exit\nhttpd.handle_request()\n
Create a WSGIServer instance. server_address should be a\n(host,port) tuple, and RequestHandlerClass should be the subclass of\nBaseHTTPServer.BaseHTTPRequestHandler that will be used to process\nrequests.
\nYou do not normally need to call this constructor, as the make_server()\nfunction can handle all the details for you.
\nWSGIServer is a subclass of BaseHTTPServer.HTTPServer, so all\nof its methods (such as serve_forever() and handle_request()) are\navailable. WSGIServer also provides these WSGI-specific methods:
\nNormally, however, you do not need to use these additional methods, as\nset_app() is normally called by make_server(), and the\nget_app() exists mainly for the benefit of request handler instances.
\nCreate an HTTP handler for the given request (i.e. a socket), client_address\n(a (host,port) tuple), and server (WSGIServer instance).
\nYou do not need to create instances of this class directly; they are\nautomatically created as needed by WSGIServer objects. You can,\nhowever, subclass this class and supply it as a handler_class to the\nmake_server() function. Some possibly relevant methods for overriding in\nsubclasses:
\nWhen creating new WSGI application objects, frameworks, servers, or middleware,\nit can be useful to validate the new code’s conformance using\nwsgiref.validate. This module provides a function that creates WSGI\napplication objects that validate communications between a WSGI server or\ngateway and a WSGI application object, to check both sides for protocol\nconformance.
\nNote that this utility does not guarantee complete PEP 333 compliance; an\nabsence of errors from this module does not necessarily mean that errors do not\nexist. However, if this module does produce an error, then it is virtually\ncertain that either the server or application is not 100% compliant.
\nThis module is based on the paste.lint module from Ian Bicking’s “Python\nPaste” library.
\nWrap application and return a new WSGI application object. The returned\napplication will forward all requests to the original application, and will\ncheck that both the application and the server invoking it are conforming to\nthe WSGI specification and to RFC 2616.
\nAny detected nonconformance results in an AssertionError being raised;\nnote, however, that how these errors are handled is server-dependent. For\nexample, wsgiref.simple_server and other servers based on\nwsgiref.handlers (that don’t override the error handling methods to do\nsomething else) will simply output a message that an error has occurred, and\ndump the traceback to sys.stderr or some other error stream.
\nThis wrapper may also generate output using the warnings module to\nindicate behaviors that are questionable but which may not actually be\nprohibited by PEP 333. Unless they are suppressed using Python command-line\noptions or the warnings API, any such warnings will be written to\nsys.stderr (not wsgi.errors, unless they happen to be the same\nobject).
\nExample usage:
\nfrom wsgiref.validate import validator\nfrom wsgiref.simple_server import make_server\n\n# Our callable object which is intentionally not compliant to the\n# standard, so the validator is going to break\ndef simple_app(environ, start_response):\n status = '200 OK' # HTTP Status\n headers = [('Content-type', 'text/plain')] # HTTP Headers\n start_response(status, headers)\n\n # This is going to break because we need to return a list, and\n # the validator is going to inform us\n return "Hello World"\n\n# This is the application wrapped in a validator\nvalidator_app = validator(simple_app)\n\nhttpd = make_server('', 8000, validator_app)\nprint "Listening on port 8000...."\nhttpd.serve_forever()\n
This module provides base handler classes for implementing WSGI servers and\ngateways. These base classes handle most of the work of communicating with a\nWSGI application, as long as they are given a CGI-like environment, along with\ninput, output, and error streams.
\nCGI-based invocation via sys.stdin, sys.stdout, sys.stderr and\nos.environ. This is useful when you have a WSGI application and want to run\nit as a CGI script. Simply invoke CGIHandler().run(app), where app is\nthe WSGI application object you wish to invoke.
\nThis class is a subclass of BaseCGIHandler that sets wsgi.run_once\nto true, wsgi.multithread to false, and wsgi.multiprocess to true, and\nalways uses sys and os to obtain the necessary CGI streams and\nenvironment.
\nSimilar to CGIHandler, but instead of using the sys and\nos modules, the CGI environment and I/O streams are specified explicitly.\nThe multithread and multiprocess values are used to set the\nwsgi.multithread and wsgi.multiprocess flags for any applications run by\nthe handler instance.
\nThis class is a subclass of SimpleHandler intended for use with\nsoftware other than HTTP “origin servers”. If you are writing a gateway\nprotocol implementation (such as CGI, FastCGI, SCGI, etc.) that uses a\nStatus: header to send an HTTP status, you probably want to subclass this\ninstead of SimpleHandler.
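Because the streams are explicit, a BaseCGIHandler can be exercised entirely in memory, which also shows the Status: header behaviour described above. This is a test sketch rather than a normal deployment, and the byte-string literals assume Python 3 stream semantics:

```python
from io import BytesIO, StringIO
from wsgiref.handlers import BaseCGIHandler
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']

# Build a dummy CGI environment instead of reading os.environ
environ = {}
setup_testing_defaults(environ)

stdout = BytesIO()
handler = BaseCGIHandler(BytesIO(b''), stdout, StringIO(), environ,
                         multithread=False, multiprocess=True)
handler.run(app)

# Not an origin server, so the response begins with a Status: header
print(stdout.getvalue())
```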
Similar to BaseCGIHandler, but designed for use with HTTP origin servers. If you are writing an HTTP server implementation, you will probably want to subclass this instead of BaseCGIHandler.
\nThis class is a subclass of BaseHandler. It overrides the\n__init__(), get_stdin(), get_stderr(), add_cgi_vars(),\n_write(), and _flush() methods to support explicitly setting the\nenvironment and streams via the constructor. The supplied environment and\nstreams are stored in the stdin, stdout, stderr, and\nenviron attributes.
\nThis is an abstract base class for running WSGI applications. Each instance\nwill handle a single HTTP request, although in principle you could create a\nsubclass that was reusable for multiple requests.
\nBaseHandler instances have only one method intended for external use:
\nAll of the other BaseHandler methods are invoked by this method in the\nprocess of running the application, and thus exist primarily to allow\ncustomizing the process.
\nThe following methods MUST be overridden in a subclass:
\nHere are some other methods and attributes you may wish to override. This list\nis only a summary, however, and does not include every method that can be\noverridden. You should consult the docstrings and source code for additional\ninformation before attempting to create a customized BaseHandler\nsubclass.
\nAttributes and methods for customizing the WSGI environment:
\nMethods and attributes for customizing exception handling:
\nThis method is a WSGI application to generate an error page for the user. It is\nonly invoked if an error occurs before headers are sent to the client.
\nThis method can access the current error information using sys.exc_info(),\nand should pass that information to start_response when calling it (as\ndescribed in the “Error Handling” section of PEP 333).
\nThe default implementation just uses the error_status,\nerror_headers, and error_body attributes to generate an output\npage. Subclasses can override this to produce more dynamic error output.
\nNote, however, that it’s not recommended from a security perspective to spit out\ndiagnostics to any old user; ideally, you should have to do something special to\nenable diagnostic output, which is why the default implementation doesn’t\ninclude any.
Methods and attributes for PEP 333’s “Optional Platform-Specific File Handling” feature:
\nMiscellaneous methods and attributes:
\nThis attribute should be set to a true value if the handler’s _write() and\n_flush() are being used to communicate directly to the client, rather than\nvia a CGI-like gateway protocol that wants the HTTP status in a special\nStatus: header.
\nThis attribute’s default value is true in BaseHandler, but false in\nBaseCGIHandler and CGIHandler.
\nThis is a working “Hello World” WSGI application:
from wsgiref.simple_server import make_server

# Every WSGI application must have an application object - a callable
# object that accepts two arguments. For that purpose, we're going to
# use a function (note that you're not limited to a function, you can
# use a class for example). The first argument passed to the function
# is a dictionary containing CGI-style environment variables and the
# second variable is the callable object (see PEP 333).
def hello_world_app(environ, start_response):
    status = '200 OK' # HTTP Status
    headers = [('Content-type', 'text/plain')] # HTTP Headers
    start_response(status, headers)

    # The returned object is going to be printed
    return ["Hello World"]

httpd = make_server('', 8000, hello_world_app)
print "Serving on port 8000..."

# Serve until process is killed
httpd.serve_forever()
Source code: Lib/ftplib.py
\nThis module defines the class FTP and a few related items. The\nFTP class implements the client side of the FTP protocol. You can use\nthis to write Python programs that perform a variety of automated FTP jobs, such\nas mirroring other ftp servers. It is also used by the module urllib to\nhandle URLs that use FTP. For more information on FTP (File Transfer Protocol),\nsee Internet RFC 959.
\nHere’s a sample session using the ftplib module:
\n>>> from ftplib import FTP\n>>> ftp = FTP('ftp.cwi.nl') # connect to host, default port\n>>> ftp.login() # user anonymous, passwd anonymous@\n>>> ftp.retrlines('LIST') # list directory contents\ntotal 24418\ndrwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 .\ndr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 ..\n-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX\n .\n .\n .\n>>> ftp.retrbinary('RETR README', open('README', 'wb').write)\n'226 Transfer complete.'\n>>> ftp.quit()\n
The module defines the following items:
Return a new instance of the FTP class. When host is given, the method call connect(host) is made. When user is given, additionally the method call login(user, passwd, acct) is made (where passwd and acct default to the empty string when not given). The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if it is not specified, the global default timeout setting will be used).
\n\nChanged in version 2.6: timeout was added.
An FTP subclass which adds TLS support to FTP as described in RFC 4217. Connect as usual to port 21, implicitly securing the FTP control connection before authenticating. Securing the data connection requires the user to explicitly ask for it by calling the prot_p() method. keyfile and certfile are optional – they can contain a PEM formatted private key and certificate chain file name for the SSL connection.
\n\nNew in version 2.7.
\nHere’s a sample session using the FTP_TLS class:
\n>>> from ftplib import FTP_TLS\n>>> ftps = FTP_TLS('ftp.python.org')\n>>> ftps.login() # login anonymously before securing control channel\n>>> ftps.prot_p() # switch to secure data connection\n>>> ftps.retrlines('LIST') # list directory content securely\ntotal 9\ndrwxr-xr-x 8 root wheel 1024 Jan 3 1994 .\ndrwxr-xr-x 8 root wheel 1024 Jan 3 1994 ..\ndrwxr-xr-x 2 root wheel 1024 Jan 3 1994 bin\ndrwxr-xr-x 2 root wheel 1024 Jan 3 1994 etc\nd-wxrwxr-x 2 ftp wheel 1024 Sep 5 13:43 incoming\ndrwxr-xr-x 2 root wheel 1024 Nov 17 1993 lib\ndrwxr-xr-x 6 1094 wheel 1024 Sep 13 19:07 pub\ndrwxr-xr-x 3 root wheel 1024 Jan 3 1994 usr\n-rw-r--r-- 1 root root 312 Aug 1 1994 welcome.msg\n'226 Transfer complete.'\n>>> ftps.quit()\n>>>\n
See also
\nThe file Tools/scripts/ftpmirror.py in the Python source distribution is\na script that can mirror FTP sites, or portions thereof, using the ftplib\nmodule. It can be used as an extended example that applies this module.
\nSeveral methods are available in two flavors: one for handling text files and\nanother for binary files. These are named for the command which is used\nfollowed by lines for the text version or binary for the binary version.
\nFTP instances have the following methods:
\nConnect to the given host and port. The default port number is 21, as\nspecified by the FTP protocol specification. It is rarely needed to specify a\ndifferent port number. This function should be called only once for each\ninstance; it should not be called at all if a host was given when the instance\nwas created. All other methods can only be used after a connection has been\nmade.
\nThe optional timeout parameter specifies a timeout in seconds for the\nconnection attempt. If no timeout is passed, the global default timeout\nsetting will be used.
\n\nChanged in version 2.6: timeout was added.
\nStore a file in binary transfer mode. command should be an appropriate\nSTOR command: "STOR filename". file is an open file object which is\nread until EOF using its read() method in blocks of size blocksize to\nprovide the data to be stored. The blocksize argument defaults to 8192.\ncallback is an optional single parameter callable that is called\non each block of data after it is sent. rest means the same thing as in\nthe transfercmd() method.
\n\nChanged in version 2.1: default for blocksize added.
\n\nChanged in version 2.6: callback parameter added.
\n\nChanged in version 2.7: rest parameter added.
\nStore a file in ASCII transfer mode. command should be an appropriate\nSTOR command (see storbinary()). Lines are read until EOF from the\nopen file object file using its readline() method to provide the data to\nbe stored. callback is an optional single parameter callable\nthat is called on each line after it is sent.
\n\nChanged in version 2.6: callback parameter added.
Initiate a transfer over the data connection. If the transfer is active, send an EPRT or PORT command and the transfer command specified by cmd, and accept the connection. If the server is passive, send an EPSV or PASV command, connect to it, and start the transfer command. Either way, return the socket for the connection.
\nIf optional rest is given, a REST command is sent to the server, passing\nrest as an argument. rest is usually a byte offset into the requested file,\ntelling the server to restart sending the file’s bytes at the requested offset,\nskipping over the initial bytes. Note however that RFC 959 requires only that\nrest be a string containing characters in the printable range from ASCII code\n33 to ASCII code 126. The transfercmd() method, therefore, converts\nrest to a string, but no check is performed on the string’s contents. If the\nserver does not recognize the REST command, an error_reply exception\nwill be raised. If this happens, simply call transfercmd() without a\nrest argument.
The FTP_TLS class inherits from FTP, defining these additional objects:
\nNote
\nThe httplib module has been renamed to http.client in Python\n3.0. The 2to3 tool will automatically adapt imports when converting\nyour sources to 3.0.
\nSource code: Lib/httplib.py
\nThis module defines classes which implement the client side of the HTTP and\nHTTPS protocols. It is normally not used directly — the module urllib\nuses it to handle URLs that use HTTP and HTTPS.
\nNote
\nHTTPS support is only available if the socket module was compiled with\nSSL support.
\nNote
\nThe public interface for this module changed substantially in Python 2.0. The\nHTTP class is retained only for backward compatibility with 1.5.2. It\nshould not be used in new code. Refer to the online docstrings for usage.
\nThe module provides the following classes:
An HTTPConnection instance represents one transaction with an HTTP server. It should be instantiated by passing it a host and an optional port number. If no port number is passed, the port is extracted from the host string if it has the form host:port, else the default HTTP port (80) is used. If the optional parameter strict is true (it defaults to a false value), BadStatusLine is raised if the status line can’t be parsed as a valid HTTP/1.0 or 1.1 status line. If the optional timeout parameter is given, blocking operations (like connection attempts) will time out after that many seconds (if it is not given, the global default timeout setting is used). The optional source_address parameter may be a (host, port) tuple to use as the source address from which the HTTP connection is made.
\nFor example, the following calls all create instances that connect to the server\nat the same host and port:
\n>>> h1 = httplib.HTTPConnection('www.cwi.nl')\n>>> h2 = httplib.HTTPConnection('www.cwi.nl:80')\n>>> h3 = httplib.HTTPConnection('www.cwi.nl', 80)\n>>> h3 = httplib.HTTPConnection('www.cwi.nl', 80, timeout=10)\n
\nNew in version 2.0.
\n\nChanged in version 2.6: timeout was added.
\n\nChanged in version 2.7: source_address was added.
\nA subclass of HTTPConnection that uses SSL for communication with\nsecure servers. Default port is 443. key_file is the name of a PEM\nformatted file that contains your private key. cert_file is a PEM formatted\ncertificate chain file.
\nWarning
\nThis does not do any verification of the server’s certificate.
\n\nNew in version 2.0.
\n\nChanged in version 2.6: timeout was added.
\n\nChanged in version 2.7: source_address was added.
\nClass whose instances are returned upon successful connection. Not instantiated\ndirectly by user.
\n\nNew in version 2.0.
\nThe following exceptions are raised as appropriate:
\nThe base class of the other exceptions in this module. It is a subclass of\nException.
\n\nNew in version 2.0.
\nA subclass of HTTPException.
\n\nNew in version 2.0.
\nA subclass of HTTPException, raised if a port is given and is either\nnon-numeric or empty.
\n\nNew in version 2.3.
\nA subclass of HTTPException.
\n\nNew in version 2.0.
\nA subclass of HTTPException.
\n\nNew in version 2.0.
\nA subclass of HTTPException.
\n\nNew in version 2.0.
\nA subclass of HTTPException.
\n\nNew in version 2.0.
\nA subclass of HTTPException.
\n\nNew in version 2.0.
\nA subclass of ImproperConnectionState.
\n\nNew in version 2.0.
\nA subclass of ImproperConnectionState.
\n\nNew in version 2.0.
\nA subclass of ImproperConnectionState.
\n\nNew in version 2.0.
\nA subclass of HTTPException. Raised if a server responds with an HTTP\nstatus code that we don’t understand.
\n\nNew in version 2.0.
\nThe constants defined in this module are:
\nand also the following constants for integer status codes:
\nConstant | \nValue | \nDefinition | \n
---|---|---|
CONTINUE | \n100 | \nHTTP/1.1, RFC 2616, Section\n10.1.1 | \n
SWITCHING_PROTOCOLS | \n101 | \nHTTP/1.1, RFC 2616, Section\n10.1.2 | \n
PROCESSING | \n102 | \nWEBDAV, RFC 2518, Section 10.1 | \n
OK | \n200 | \nHTTP/1.1, RFC 2616, Section\n10.2.1 | \n
CREATED | \n201 | \nHTTP/1.1, RFC 2616, Section\n10.2.2 | \n
ACCEPTED | \n202 | \nHTTP/1.1, RFC 2616, Section\n10.2.3 | \n
NON_AUTHORITATIVE_INFORMATION | \n203 | \nHTTP/1.1, RFC 2616, Section\n10.2.4 | \n
NO_CONTENT | \n204 | \nHTTP/1.1, RFC 2616, Section\n10.2.5 | \n
RESET_CONTENT | \n205 | \nHTTP/1.1, RFC 2616, Section\n10.2.6 | \n
PARTIAL_CONTENT | \n206 | \nHTTP/1.1, RFC 2616, Section\n10.2.7 | \n
MULTI_STATUS | \n207 | \nWEBDAV RFC 2518, Section 10.2 | \n
IM_USED | \n226 | \nDelta encoding in HTTP,\nRFC 3229, Section 10.4.1 | \n
MULTIPLE_CHOICES | \n300 | \nHTTP/1.1, RFC 2616, Section\n10.3.1 | \n
MOVED_PERMANENTLY | \n301 | \nHTTP/1.1, RFC 2616, Section\n10.3.2 | \n
FOUND | \n302 | \nHTTP/1.1, RFC 2616, Section\n10.3.3 | \n
SEE_OTHER | \n303 | \nHTTP/1.1, RFC 2616, Section\n10.3.4 | \n
NOT_MODIFIED | \n304 | \nHTTP/1.1, RFC 2616, Section\n10.3.5 | \n
USE_PROXY | \n305 | \nHTTP/1.1, RFC 2616, Section\n10.3.6 | \n
TEMPORARY_REDIRECT | \n307 | \nHTTP/1.1, RFC 2616, Section\n10.3.8 | \n
BAD_REQUEST | \n400 | \nHTTP/1.1, RFC 2616, Section\n10.4.1 | \n
UNAUTHORIZED | \n401 | \nHTTP/1.1, RFC 2616, Section\n10.4.2 | \n
PAYMENT_REQUIRED | \n402 | \nHTTP/1.1, RFC 2616, Section\n10.4.3 | \n
FORBIDDEN | \n403 | \nHTTP/1.1, RFC 2616, Section\n10.4.4 | \n
NOT_FOUND | \n404 | \nHTTP/1.1, RFC 2616, Section\n10.4.5 | \n
METHOD_NOT_ALLOWED | \n405 | \nHTTP/1.1, RFC 2616, Section\n10.4.6 | \n
NOT_ACCEPTABLE | \n406 | \nHTTP/1.1, RFC 2616, Section\n10.4.7 | \n
PROXY_AUTHENTICATION_REQUIRED | \n407 | \nHTTP/1.1, RFC 2616, Section\n10.4.8 | \n
REQUEST_TIMEOUT | \n408 | \nHTTP/1.1, RFC 2616, Section\n10.4.9 | \n
CONFLICT | \n409 | \nHTTP/1.1, RFC 2616, Section\n10.4.10 | \n
GONE | \n410 | \nHTTP/1.1, RFC 2616, Section\n10.4.11 | \n
LENGTH_REQUIRED | \n411 | \nHTTP/1.1, RFC 2616, Section\n10.4.12 | \n
PRECONDITION_FAILED | \n412 | \nHTTP/1.1, RFC 2616, Section\n10.4.13 | \n
REQUEST_ENTITY_TOO_LARGE | \n413 | \nHTTP/1.1, RFC 2616, Section\n10.4.14 | \n
REQUEST_URI_TOO_LONG | \n414 | \nHTTP/1.1, RFC 2616, Section\n10.4.15 | \n
UNSUPPORTED_MEDIA_TYPE | \n415 | \nHTTP/1.1, RFC 2616, Section\n10.4.16 | \n
REQUESTED_RANGE_NOT_SATISFIABLE | \n416 | \nHTTP/1.1, RFC 2616, Section\n10.4.17 | \n
EXPECTATION_FAILED | \n417 | \nHTTP/1.1, RFC 2616, Section\n10.4.18 | \n
UNPROCESSABLE_ENTITY | \n422 | \nWEBDAV, RFC 2518, Section 10.3 | \n
LOCKED | \n423 | \nWEBDAV RFC 2518, Section 10.4 | \n
FAILED_DEPENDENCY | \n424 | \nWEBDAV, RFC 2518, Section 10.5 | \n
UPGRADE_REQUIRED | \n426 | \nHTTP Upgrade to TLS,\nRFC 2817, Section 6 | \n
INTERNAL_SERVER_ERROR | \n500 | \nHTTP/1.1, RFC 2616, Section\n10.5.1 | \n
NOT_IMPLEMENTED | \n501 | \nHTTP/1.1, RFC 2616, Section\n10.5.2 | \n
BAD_GATEWAY | \n502 | \nHTTP/1.1 RFC 2616, Section\n10.5.3 | \n
SERVICE_UNAVAILABLE | \n503 | \nHTTP/1.1, RFC 2616, Section\n10.5.4 | \n
GATEWAY_TIMEOUT | \n504 | \nHTTP/1.1 RFC 2616, Section\n10.5.5 | \n
HTTP_VERSION_NOT_SUPPORTED | \n505 | \nHTTP/1.1, RFC 2616, Section\n10.5.6 | \n
INSUFFICIENT_STORAGE | \n507 | \nWEBDAV, RFC 2518, Section 10.6 | \n
NOT_EXTENDED | \n510 | \nAn HTTP Extension Framework,\nRFC 2774, Section 7 | \n
This dictionary maps the HTTP 1.1 status codes to the W3C names.
\nExample: httplib.responses[httplib.NOT_FOUND] is 'Not Found'.
\n\nNew in version 2.5.
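The shape of this mapping can be pictured as an ordinary dictionary; the excerpt below reproduces only a few of its entries (values taken from the table above):

```python
# A small excerpt illustrating the shape of httplib.responses: a dict
# mapping integer status codes to their standard reason phrases.
responses = {
    200: 'OK',
    301: 'Moved Permanently',
    404: 'Not Found',
    500: 'Internal Server Error',
}

NOT_FOUND = 404
print(responses[NOT_FOUND])  # Not Found
```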
\nHTTPConnection instances have the following methods:
\nThis will send a request to the server using the HTTP request method method\nand the selector url. If the body argument is present, it should be a\nstring of data to send after the headers are finished. Alternatively, it may\nbe an open file object, in which case the contents of the file is sent; this\nfile object should support fileno() and read() methods. The header\nContent-Length is automatically set to the correct value. The headers\nargument should be a mapping of extra HTTP headers to send with the request.
\n\nChanged in version 2.6: body can be a file object.
\nShould be called after a request is sent to get the response from the server.\nReturns an HTTPResponse instance.
\nNote
\nNote that you must have read the whole response before you can send a new\nrequest to the server.
\nSet the host and the port for HTTP CONNECT tunnelling. This is normally\nused when it is required to make an HTTPS connection through a proxy\nserver.
\nThe headers argument should be a mapping of extra HTTP headers to send\nwith the CONNECT request.
\n\nNew in version 2.7.
\nAs an alternative to using the request() method described above, you can\nalso send your request step by step, by using the four functions below.
\nThis should be the first call after the connection to the server has been made.\nIt sends a line to the server consisting of the request string, the selector\nstring, and the HTTP version (HTTP/1.1). To disable automatic sending of\nHost: or Accept-Encoding: headers (for example to accept additional\ncontent encodings), specify skip_host or skip_accept_encoding with non-False\nvalues.
\n\nChanged in version 2.4: skip_accept_encoding argument added.
\nSend a blank line to the server, signalling the end of the headers. The\noptional message_body argument can be used to pass a message body\nassociated with the request. The message body will be sent in the same\npacket as the message headers if it is a string, otherwise it is sent in a\nseparate packet.
\n\nChanged in version 2.7: message_body was added.
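The request line, header lines, blank line, and optional body that these step-by-step calls put on the wire can be approximated by hand. The helper below is an illustrative sketch, not httplib's internal code:

```python
# Sketch of the bytes the step-by-step interface sends:
# request line, header lines, a blank line, then the optional body.
def build_request(method, url, headers, body=''):
    lines = ['%s %s HTTP/1.1' % (method, url)]
    for name, value in headers:
        lines.append('%s: %s' % (name, value))
    lines.append('')            # a blank line ends the headers
    return '\r\n'.join(lines) + '\r\n' + body

req = build_request('GET', '/index.html', [('Host', 'www.python.org')])
print(repr(req))
# 'GET /index.html HTTP/1.1\r\nHost: www.python.org\r\n\r\n'
```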
\nHTTPResponse instances have the following methods and attributes:
\nReturn a list of (header, value) tuples.
\n\nNew in version 2.4.
\nHere is an example session that uses the GET method:
\n>>> import httplib\n>>> conn = httplib.HTTPConnection("www.python.org")\n>>> conn.request("GET", "/index.html")\n>>> r1 = conn.getresponse()\n>>> print r1.status, r1.reason\n200 OK\n>>> data1 = r1.read()\n>>> conn.request("GET", "/parrot.spam")\n>>> r2 = conn.getresponse()\n>>> print r2.status, r2.reason\n404 Not Found\n>>> data2 = r2.read()\n>>> conn.close()\n
Here is an example session that uses the HEAD method. Note that the\nHEAD method never returns any data.
\n>>> import httplib\n>>> conn = httplib.HTTPConnection("www.python.org")\n>>> conn.request("HEAD","/index.html")\n>>> res = conn.getresponse()\n>>> print res.status, res.reason\n200 OK\n>>> data = res.read()\n>>> print len(data)\n0\n>>> data == ''\nTrue\n
Here is an example session that shows how to POST requests:
\n>>> import httplib, urllib\n>>> params = urllib.urlencode({'@number': 12524, '@type': 'issue', '@action': 'show'})\n>>> headers = {"Content-type": "application/x-www-form-urlencoded",\n... "Accept": "text/plain"}\n>>> conn = httplib.HTTPConnection("bugs.python.org")\n>>> conn.request("POST", "", params, headers)\n>>> response = conn.getresponse()\n>>> print response.status, response.reason\n302 Found\n>>> data = response.read()\n>>> data\n'Redirecting to <a href="http://bugs.python.org/issue12524">http://bugs.python.org/issue12524</a>'\n>>> conn.close()\n
Note
\nThe urllib2 module has been split across several modules in\nPython 3.0 named urllib.request and urllib.error.\nThe 2to3 tool will automatically adapt imports when converting\nyour sources to 3.0.
\nThe urllib2 module defines functions and classes which help in opening\nURLs (mostly HTTP) in a complex world — basic and digest authentication,\nredirections, cookies and more.
\nThe urllib2 module defines the following functions:
\nOpen the URL url, which can be either a string or a Request object.
\nWarning
\nHTTPS requests do not do any verification of the server’s certificate.
\ndata may be a string specifying additional data to send to the server, or\nNone if no such data is needed. Currently HTTP requests are the only ones\nthat use data; the HTTP request will be a POST instead of a GET when the\ndata parameter is provided. data should be a buffer in the standard\napplication/x-www-form-urlencoded format. The\nurllib.urlencode() function takes a mapping or sequence of 2-tuples and\nreturns a string in this format. The urllib2 module sends HTTP/1.1 requests\nwith the Connection: close header included.
\nThe optional timeout parameter specifies a timeout in seconds for blocking\noperations like the connection attempt (if not specified, the global default\ntimeout setting will be used). This actually only works for HTTP, HTTPS and\nFTP connections.
\nThis function returns a file-like object with two additional methods:
\nRaises URLError on errors.
\nNote that None may be returned if no handler handles the request (though the\ndefault installed global OpenerDirector uses UnknownHandler to\nensure this never happens).
\nIn addition, the default installed ProxyHandler makes sure the requests\nare handled through a proxy when one is set.
\n\nChanged in version 2.6: timeout was added.
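The application/x-www-form-urlencoded format expected for data can be sketched in plain Python. This is a simplified re-implementation for illustration; urllib.urlencode() is the real helper to use:

```python
# Minimal sketch of the application/x-www-form-urlencoded format:
# key=value pairs joined by '&', spaces written as '+', and other
# unsafe characters percent-encoded.
def quote_plus(s):
    safe = ('abcdefghijklmnopqrstuvwxyz'
            'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.')
    out = []
    for ch in s:
        if ch == ' ':
            out.append('+')
        elif ch in safe:
            out.append(ch)
        else:
            out.append('%%%02X' % ord(ch))
    return ''.join(out)

def urlencode(pairs):
    return '&'.join('%s=%s' % (quote_plus(k), quote_plus(str(v)))
                    for k, v in pairs)

print(urlencode([('spam', 1), ('eggs', 'a b')]))  # spam=1&eggs=a+b
```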
\nReturn an OpenerDirector instance, which chains the handlers in the\norder given. handlers can be either instances of BaseHandler, or\nsubclasses of BaseHandler (in which case it must be possible to call\nthe constructor without any parameters). Instances of the following classes\nwill be in front of the handlers, unless the handlers contain them,\ninstances of them or subclasses of them: ProxyHandler,\nUnknownHandler, HTTPHandler, HTTPDefaultErrorHandler,\nHTTPRedirectHandler, FTPHandler, FileHandler,\nHTTPErrorProcessor.
\nIf the Python installation has SSL support (i.e., if the ssl module can be imported),\nHTTPSHandler will also be added.
\nBeginning in Python 2.3, a BaseHandler subclass may also change its\nhandler_order attribute to modify its position in the handlers\nlist.
\nThe following exceptions are raised as appropriate:
\nThe handlers raise this exception (or derived exceptions) when they run into a\nproblem. It is a subclass of IOError.
\nThough being an exception (a subclass of URLError), an HTTPError\ncan also function as a non-exceptional file-like return value (the same thing\nthat urlopen() returns). This is useful when handling exotic HTTP\nerrors, such as requests for authentication.
\nThe following classes are provided:
\nThis class is an abstraction of a URL request.
\nurl should be a string containing a valid URL.
\ndata may be a string specifying additional data to send to the server, or\nNone if no such data is needed. Currently HTTP requests are the only ones\nthat use data; the HTTP request will be a POST instead of a GET when the\ndata parameter is provided. data should be a buffer in the standard\napplication/x-www-form-urlencoded format. The\nurllib.urlencode() function takes a mapping or sequence of 2-tuples and\nreturns a string in this format.
\nheaders should be a dictionary, and will be treated as if add_header()\nwas called with each key and value as arguments. This is often used to “spoof”\nthe User-Agent header, which is used by a browser to identify itself –\nsome HTTP servers only allow requests coming from common browsers as opposed\nto scripts. For example, Mozilla Firefox may identify itself as "Mozilla/5.0\n(X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", while urllib2‘s\ndefault user agent string is "Python-urllib/2.6" (on Python 2.6).
\nThe final two arguments are only of interest for correct handling of third-party\nHTTP cookies:
\norigin_req_host should be the request-host of the origin transaction, as\ndefined by RFC 2965. It defaults to cookielib.request_host(self). This\nis the host name or IP address of the original request that was initiated by the\nuser. For example, if the request is for an image in an HTML document, this\nshould be the request-host of the request for the page containing the image.
\nunverifiable should indicate whether the request is unverifiable, as defined\nby RFC 2965. It defaults to False. An unverifiable request is one whose URL\nthe user did not have the option to approve. For example, if the request is for\nan image in an HTML document, and the user had no option to approve the\nautomatic fetching of the image, this should be true.
\nCause requests to go through a proxy. If proxies is given, it must be a\ndictionary mapping protocol names to URLs of proxies. The default is to read\nthe list of proxies from the environment variables\n<protocol>_proxy. If no proxy environment variables are set, in a\nWindows environment, proxy settings are obtained from the registry’s\nInternet Settings section and in a Mac OS X environment, proxy information\nis retrieved from the OS X System Configuration Framework.
\nTo disable autodetected proxies, pass in an empty dictionary.
\nThe following methods describe all of Request‘s public interface, and\nso all must be overridden in subclasses.
\nAdd a header that will not be added to a redirected request.
\n\nNew in version 2.4.
\nReturn whether the instance has the named header (checks both regular and\nunredirected).
\n\nNew in version 2.4.
\nOpenerDirector instances have the following methods:
\nhandler should be an instance of BaseHandler. The following\nmethods are searched, and added to the possible chains (note that HTTP errors\nare a special case).
\nOpen the given url (which can be a request object or a string), optionally\npassing the given data. Arguments, return values and exceptions raised are\nthe same as those of urlopen() (which simply calls the open()\nmethod on the currently installed global OpenerDirector). The\noptional timeout parameter specifies a timeout in seconds for blocking\noperations like the connection attempt (if not specified, the global default\ntimeout setting will be used). The timeout feature actually works only for\nHTTP, HTTPS and FTP connections.
\n\nChanged in version 2.6: timeout was added.
\nHandle an error of the given protocol. This will call the registered error\nhandlers for the given protocol with the given arguments (which are protocol\nspecific). The HTTP protocol is a special case which uses the HTTP response\ncode to determine the specific error handler; refer to the http_error_*()\nmethods of the handler classes.
\nReturn values and exceptions raised are the same as those of urlopen().
\nOpenerDirector objects open URLs in three stages:
\nThe order in which these methods are called within each stage is determined by\nsorting the handler instances.
\nEvery handler with a method named like protocol_request has that\nmethod called to pre-process the request.
\nHandlers with a method named like protocol_open are called to handle\nthe request. This stage ends when a handler either returns a non-None\nvalue (ie. a response), or raises an exception (usually URLError).\nExceptions are allowed to propagate.
\nIn fact, the above algorithm is first tried for methods named\ndefault_open(). If all such methods return None, the\nalgorithm is repeated for methods named like protocol_open. If all\nsuch methods return None, the algorithm is repeated for methods\nnamed unknown_open().
\nNote that the implementation of these methods may involve calls of the parent\nOpenerDirector instance’s open() and\nerror() methods.
\nEvery handler with a method named like protocol_response has that\nmethod called to post-process the response.
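The three stages can be sketched as a plain dispatch loop. This is a simplified model of the algorithm described above (the real implementation also tries default_open() and unknown_open(), and routes errors specially); the function name is hypothetical:

```python
# Sketch of OpenerDirector dispatch: handlers are sorted, pre-processing
# methods run first, then open methods are tried until one returns a
# non-None response, then post-processing methods run.
def open_url(handlers, protocol, request):
    handlers = sorted(handlers,
                      key=lambda h: getattr(h, 'handler_order', 500))
    # Stage 1: every protocol_request method pre-processes the request.
    for h in handlers:
        method = getattr(h, protocol + '_request', None)
        if method:
            request = method(request)
    # Stage 2: protocol_open methods are tried until one handles it.
    response = None
    for h in handlers:
        method = getattr(h, protocol + '_open', None)
        if method:
            response = method(request)
            if response is not None:
                break
    # Stage 3: every protocol_response method post-processes the response.
    for h in handlers:
        method = getattr(h, protocol + '_response', None)
        if method:
            response = method(request, response)
    return response
```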
\nBaseHandler objects provide a couple of methods that are directly\nuseful, and others that are meant to be used by derived classes. These are\nintended for direct use:
\nThe following attributes and methods should only be used by classes derived from\nBaseHandler.
\nNote
\nThe convention has been adopted that subclasses defining\nprotocol_request() or protocol_response() methods are named\n*Processor; all others are named *Handler.
\nThis method is not defined in BaseHandler, but subclasses should\ndefine it if they want to catch all URLs.
\nThis method, if implemented, will be called by the parent\nOpenerDirector. It should return a file-like object as described in\nthe return value of the open() of OpenerDirector, or None.\nIt should raise URLError, unless a truly exceptional thing happens (for\nexample, MemoryError should not be mapped to URLError).
\nThis method will be called before any protocol-specific open method.
\n(“protocol” is to be replaced by the protocol name.)
\nThis method is not defined in BaseHandler, but subclasses should\ndefine it if they want to handle URLs with the given protocol.
\nThis method, if defined, will be called by the parent OpenerDirector.\nReturn values should be the same as for default_open().
\nThis method is not defined in BaseHandler, but subclasses should\ndefine it if they want to catch all URLs with no specific registered handler to\nopen it.
\nThis method, if implemented, will be called by the parent\nOpenerDirector. Return values should be the same as for\ndefault_open().
\nThis method is not defined in BaseHandler, but subclasses should\noverride it if they intend to provide a catch-all for otherwise unhandled HTTP\nerrors. It will be called automatically by the OpenerDirector getting\nthe error, and should not normally be called in other circumstances.
\nreq will be a Request object, fp will be a file-like object with\nthe HTTP error body, code will be the three-digit code of the error, msg\nwill be the user-visible explanation of the code and hdrs will be a mapping\nobject with the headers of the error.
\nReturn values and exceptions raised should be the same as those of\nurlopen().
\nnnn should be a three-digit HTTP error code. This method is also not defined\nin BaseHandler, but will be called, if it exists, on an instance of a\nsubclass, when an HTTP error with code nnn occurs.
\nSubclasses should override this method to handle specific HTTP errors.
\nArguments, return values and exceptions raised should be the same as for\nhttp_error_default().
\n(“protocol” is to be replaced by the protocol name.)
\nThis method is not defined in BaseHandler, but subclasses should\ndefine it if they want to pre-process requests of the given protocol.
\nThis method, if defined, will be called by the parent OpenerDirector.\nreq will be a Request object. The return value should be a\nRequest object.
\n(“protocol” is to be replaced by the protocol name.)
\nThis method is not defined in BaseHandler, but subclasses should\ndefine it if they want to post-process responses of the given protocol.
\nThis method, if defined, will be called by the parent OpenerDirector.\nreq will be a Request object. response will be an object\nimplementing the same interface as the return value of urlopen(). The\nreturn value should implement the same interface as the return value of\nurlopen().
\nNote
\nSome HTTP redirections require action from this module’s client code. If this\nis the case, HTTPError is raised. See RFC 2616 for details of the\nprecise meanings of the various redirection codes.
\nReturn a Request or None in response to a redirect. This is called\nby the default implementations of the http_error_30*() methods when a\nredirection is received from the server. If a redirection should take place,\nreturn a new Request to allow http_error_30*() to perform the\nredirect to newurl. Otherwise, raise HTTPError if no other handler\nshould try to handle this URL, or return None if you can’t but another\nhandler might.
\nNote
\nThe default implementation of this method does not strictly follow RFC 2616,\nwhich says that 301 and 302 responses to POST requests must not be\nautomatically redirected without confirmation by the user. In reality, browsers\ndo allow automatic redirection of these responses, changing the POST to a\nGET, and the default implementation reproduces this behavior.
\n\nNew in version 2.4.
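The browser-style policy in the note can be summarized as a small decision function. This is a hedged sketch of the behaviour described, not urllib2's actual redirect_request() code:

```python
# Sketch of the redirect policy: 301/302/303 responses to a POST are
# re-issued as GET (browser behaviour, not the strict letter of RFC 2616);
# GET and HEAD are redirected unchanged; other methods are refused.
def redirected_method(code, method):
    if code in (301, 302, 303) and method == 'POST':
        return 'GET'
    if method in ('GET', 'HEAD', 'POST'):
        return method
    return None  # do not redirect other methods automatically

print(redirected_method(302, 'POST'))  # GET
print(redirected_method(307, 'GET'))   # GET
```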
\nHTTPCookieProcessor instances have one attribute:
\n(“protocol” is to be replaced by the protocol name.)
\nThe ProxyHandler will have a method protocol_open for every\nprotocol which has a proxy in the proxies dictionary given in the\nconstructor. The method will modify requests to go through the proxy, by\ncalling request.set_proxy(), and call the next handler in the chain to\nactually execute the protocol.
\nThese methods are available on HTTPPasswordMgr and\nHTTPPasswordMgrWithDefaultRealm objects.
\nGet user/password for given realm and URI, if any. This method will return\n(None, None) if there is no matching user/password.
\nFor HTTPPasswordMgrWithDefaultRealm objects, the realm None will be\nsearched if the given realm has no matching user/password.
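The default-realm fallback can be sketched with a dictionary keyed on (realm, uri). This is an illustrative model of the lookup rule, not the real class (the real manager also matches URI prefixes); the URL and credentials below are made up:

```python
# Sketch of the realm fallback: look up (realm, uri) first, then fall
# back to the None realm, as HTTPPasswordMgrWithDefaultRealm does.
class PasswordMgr:
    def __init__(self):
        self._store = {}

    def add_password(self, realm, uri, user, passwd):
        self._store[(realm, uri)] = (user, passwd)

    def find_user_password(self, realm, uri):
        for key in ((realm, uri), (None, uri)):
            if key in self._store:
                return self._store[key]
        return (None, None)

mgr = PasswordMgr()
mgr.add_password(None, 'https://example.com/updates', 'klem', 'secret')
# Any realm falls back to the None (default) realm:
print(mgr.find_user_password('PDQ Application',
                             'https://example.com/updates'))
# ('klem', 'secret')
```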
\nHandle an authentication request by getting a user/password pair, and re-trying\nthe request. authreq should be the name of the header where the information\nabout the realm is included in the request, host specifies the URL and path to\nauthenticate for, req should be the (failed) Request object, and\nheaders should be the error headers.
\nhost is either an authority (e.g. "python.org") or a URL containing an\nauthority component (e.g. "http://python.org/"). In either case, the\nauthority must not contain a userinfo component (so, "python.org" and\n"python.org:80" are fine, "joe:password@python.org" is not).
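The authority rule can be checked mechanically. The helpers below are a sketch for illustration (their names are hypothetical): extract the authority, whether host is bare or a full URL, then reject any userinfo component:

```python
def authority_of(host):
    """Return the authority component, whether host is bare or a URL."""
    if '://' in host:
        host = host.split('://', 1)[1].split('/', 1)[0]
    return host

def has_userinfo(host):
    """True if the authority carries a forbidden userinfo part."""
    return '@' in authority_of(host)

print(authority_of('http://python.org/'))          # python.org
print(has_userinfo('python.org:80'))               # False
print(has_userinfo('joe:password@python.org'))     # True
```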
\nCacheFTPHandler objects are FTPHandler objects with the\nfollowing additional methods:
\n\nNew in version 2.4.
\nProcess HTTP error responses.
\nFor responses with a 200 status code, the response object is returned\nimmediately.
\nFor non-200 error codes, this simply passes the job on to the\nprotocol_error_code handler methods, via\nOpenerDirector.error(). Eventually,\nurllib2.HTTPDefaultErrorHandler will raise an HTTPError if no\nother handler handles the error.
\nProcess HTTPS error responses.
\nThe behavior is the same as http_response().
\nThis example gets the python.org main page and displays the first 100 bytes of\nit:
\n>>> import urllib2\n>>> f = urllib2.urlopen('http://www.python.org/')\n>>> print f.read(100)\n<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">\n<?xml-stylesheet href="./css/ht2html\n
Here we are sending a data-stream to the stdin of a CGI and reading the data it\nreturns to us. Note that this example will only work when the Python\ninstallation supports SSL.
\n>>> import urllib2\n>>> req = urllib2.Request(url='https://localhost/cgi-bin/test.cgi',\n... data='This data is passed to stdin of the CGI')\n>>> f = urllib2.urlopen(req)\n>>> print f.read()\nGot Data: "This data is passed to stdin of the CGI"\n
The code for the sample CGI used in the above example is:
\n#!/usr/bin/env python\nimport sys\ndata = sys.stdin.read()\nprint 'Content-type: text/plain\\n\\nGot Data: "%s"' % data\n
Use of Basic HTTP Authentication:
\nimport urllib2\n# Create an OpenerDirector with support for Basic HTTP Authentication...\nauth_handler = urllib2.HTTPBasicAuthHandler()\nauth_handler.add_password(realm='PDQ Application',\n uri='https://mahler:8092/site-updates.py',\n user='klem',\n passwd='kadidd!ehopper')\nopener = urllib2.build_opener(auth_handler)\n# ...and install it globally so it can be used with urlopen.\nurllib2.install_opener(opener)\nurllib2.urlopen('http://www.example.com/login.html')\n
build_opener() provides many handlers by default, including a\nProxyHandler. By default, ProxyHandler uses the environment\nvariables named <scheme>_proxy, where <scheme> is the URL scheme\ninvolved. For example, the http_proxy environment variable is read to\nobtain the HTTP proxy’s URL.
\nThis example replaces the default ProxyHandler with one that uses\nprogrammatically-supplied proxy URLs, and adds proxy authorization support with\nProxyBasicAuthHandler.
\nproxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})\nproxy_auth_handler = urllib2.ProxyBasicAuthHandler()\nproxy_auth_handler.add_password('realm', 'host', 'username', 'password')\n\nopener = urllib2.build_opener(proxy_handler, proxy_auth_handler)\n# This time, rather than install the OpenerDirector, we use it directly:\nopener.open('http://www.example.com/login.html')\n
Adding HTTP headers:
\nUse the headers argument to the Request constructor, or:
\nimport urllib2\nreq = urllib2.Request('http://www.example.com/')\nreq.add_header('Referer', 'http://www.python.org/')\nr = urllib2.urlopen(req)\n
OpenerDirector automatically adds a User-Agent header to\nevery Request. To change this:
\nimport urllib2\nopener = urllib2.build_opener()\nopener.addheaders = [('User-agent', 'Mozilla/5.0')]\nopener.open('http://www.example.com/')\n
Also, remember that a few standard headers (Content-Length,\nContent-Type and Host) are added when the\nRequest is passed to urlopen() (or OpenerDirector.open()).
\nSource code: Lib/poplib.py
\nThis module defines a class, POP3, which encapsulates a connection to a\nPOP3 server and implements the protocol as defined in RFC 1725. The\nPOP3 class supports both the minimal and optional command sets.\nAdditionally, this module provides a class POP3_SSL, which provides\nsupport for connecting to POP3 servers that use SSL as an underlying protocol\nlayer.
\nNote that POP3, though widely supported, is obsolescent. The implementation\nquality of POP3 servers varies widely, and too many are quite poor. If your\nmailserver supports IMAP, you would be better off using the\nimaplib.IMAP4 class, as IMAP servers tend to be better implemented.
\nA single class is provided by the poplib module:
\nThis class implements the actual POP3 protocol. The connection is created when\nthe instance is initialized. If port is omitted, the standard POP3 port (110)\nis used. The optional timeout parameter specifies a timeout in seconds for the\nconnection attempt (if not specified, the global default timeout setting will\nbe used).
\n\nChanged in version 2.6: timeout was added.
\nThis is a subclass of POP3 that connects to the server over an SSL\nencrypted socket. If port is not specified, 995, the standard POP3-over-SSL\nport is used. keyfile and certfile are also optional - they can contain a\nPEM formatted private key and certificate chain file for the SSL connection.
\n\nNew in version 2.4.
\nOne exception is defined as an attribute of the poplib module:
\nSee also
\nAll POP3 commands are represented by methods of the same name, in lower-case;\nmost return the response text sent by the server.
\nA POP3 instance has the following methods:
\nRetrieves the message header plus howmuch lines of the message after the\nheader of message number which. Result is in form (response, ['line', ...],\noctets).
\nThe POP3 TOP command this method uses, unlike the RETR command, doesn’t set the\nmessage’s seen flag; unfortunately, TOP is poorly specified in the RFCs and is\nfrequently broken in off-brand servers. Test this method by hand against the\nPOP3 servers you will use before trusting it.
\nInstances of POP3_SSL have no additional methods. The interface of this\nsubclass is identical to its parent.
\nHere is a minimal example (without error checking) that opens a mailbox and\nretrieves and prints all messages:
\nimport getpass, poplib\n\nM = poplib.POP3('localhost')\nM.user(getpass.getuser())\nM.pass_(getpass.getpass())\nnumMessages = len(M.list()[1])\nfor i in range(numMessages):\n for j in M.retr(i+1)[1]:\n print j\n
At the end of the module, there is a test section that contains a more extensive\nexample of usage.
\nSource code: Lib/smtpd.py
\nThis module offers several classes to implement SMTP servers. One is a generic\ndo-nothing implementation, which can be overridden, while the other two offer\nspecific mail-sending strategies.
\nCreate a new SMTPServer object, which binds to local address\nlocaladdr. It will treat remoteaddr as an upstream SMTP relayer. It\ninherits from asyncore.dispatcher, and so will insert itself into\nasyncore‘s event loop on instantiation.
\nSource code: Lib/telnetlib.py
\nThe telnetlib module provides a Telnet class that implements the\nTelnet protocol. See RFC 854 for details about the protocol. In addition, it\nprovides symbolic constants for the protocol characters (see below), and for the\ntelnet options. The symbolic names of the telnet options follow the definitions\nin arpa/telnet.h, with the leading TELOPT_ removed. For symbolic names\nof options which are traditionally not included in arpa/telnet.h, see the\nmodule source itself.
\nThe symbolic constants for the telnet commands are: IAC, DONT, DO, WONT, WILL,\nSE (Subnegotiation End), NOP (No Operation), DM (Data Mark), BRK (Break), IP\n(Interrupt process), AO (Abort output), AYT (Are You There), EC (Erase\nCharacter), EL (Erase Line), GA (Go Ahead), SB (Subnegotiation Begin).
\nTelnet represents a connection to a Telnet server. The instance is\ninitially not connected by default; the open() method must be used to\nestablish a connection. Alternatively, the host name and optional port\nnumber can be passed to the constructor too, in which case the connection to\nthe server will be established before the constructor returns. The optional\ntimeout parameter specifies a timeout in seconds for blocking operations\nlike the connection attempt (if not specified, the global default timeout\nsetting will be used).
\nDo not reopen an already connected instance.
\nThis class has many read_*() methods. Note that some of them raise\nEOFError when the end of the connection is read, because they can return\nan empty string for other reasons. See the individual descriptions below.
\n\nChanged in version 2.6: timeout was added.
\nSee also
\nTelnet instances have the following methods:
\nRead until a given string, expected, is encountered or until timeout seconds\nhave passed.
\nWhen no match is found, return whatever is available instead, possibly the empty\nstring. Raise EOFError if the connection is closed and no cooked data is\navailable.
\nRead everything that can be without blocking in I/O (eager).
\nRaise EOFError if connection closed and no cooked data available. Return\n'' if no cooked data available otherwise. Do not block unless in the midst\nof an IAC sequence.
\nRead readily available data.
\nRaise EOFError if connection closed and no cooked data available. Return\n'' if no cooked data available otherwise. Do not block unless in the midst\nof an IAC sequence.
\nProcess and return data already in the queues (lazy).
\nRaise EOFError if connection closed and no data available. Return ''\nif no cooked data available otherwise. Do not block unless in the midst of an\nIAC sequence.
\nReturn any data available in the cooked queue (very lazy).
\nRaise EOFError if connection closed and no data available. Return ''\nif no cooked data available otherwise. This method never blocks.
\nReturn the data collected between a SB/SE pair (suboption begin/end). The\ncallback should access this data when it is invoked with a SE command.\nThis method never blocks.
\n\nNew in version 2.3.
\nConnect to a host. The optional second argument is the port number, which\ndefaults to the standard Telnet port (23). The optional timeout parameter\nspecifies a timeout in seconds for blocking operations like the connection\nattempt (if not specified, the global default timeout setting will be used).
\nDo not try to reopen an already connected instance.
\n\nChanged in version 2.6: timeout was added.
\nRead until one of a list of regular expressions matches.
\nThe first argument is a list of regular expressions, either compiled\n(re.RegexObject instances) or uncompiled (strings). The optional second\nargument is a timeout, in seconds; the default is to block indefinitely.
\nReturn a tuple of three items: the index in the list of the first regular\nexpression that matches; the match object returned; and the text read up till\nand including the match.
\nIf end of file is found and no text was read, raise EOFError. Otherwise,\nwhen nothing matches, return (-1, None, text) where text is the text\nreceived so far (may be the empty string if a timeout happened).
\nIf a regular expression ends with a greedy match (such as .*) or if more\nthan one expression can match the same input, the results are\nnon-deterministic, and may depend on the I/O timing.
\nA simple example illustrating typical use:
import getpass
import sys
import telnetlib

HOST = "localhost"
user = raw_input("Enter your remote account: ")
password = getpass.getpass()

tn = telnetlib.Telnet(HOST)

tn.read_until("login: ")
tn.write(user + "\n")
if password:
    tn.read_until("Password: ")
    tn.write(password + "\n")

tn.write("ls\n")
tn.write("exit\n")

print tn.read_all()
Source code: Lib/imaplib.py
\nThis module defines three classes, IMAP4, IMAP4_SSL and\nIMAP4_stream, which encapsulate a connection to an IMAP4 server and\nimplement a large subset of the IMAP4rev1 client protocol as defined in\nRFC 2060. It is backward compatible with IMAP4 (RFC 1730) servers, but\nnote that the STATUS command is not supported in IMAP4.
\nThree classes are provided by the imaplib module, IMAP4 is the\nbase class:
\nThree exceptions are defined as attributes of the IMAP4 class:
\nThere’s also a subclass for secure connections:
\nThe second subclass allows for connections created by a child process:
\nThis is a subclass derived from IMAP4 that connects to the\nstdin/stdout file descriptors created by passing command to\nos.popen2().
\n\nNew in version 2.3.
\nThe following utility functions are defined:
\nNote that IMAP4 message numbers change as the mailbox changes; in particular,\nafter an EXPUNGE command performs deletions the remaining messages are\nrenumbered. So it is highly advisable to use UIDs instead, with the UID command.
\nAt the end of the module, there is a test section that contains a more extensive\nexample of usage.
\nSee also
\nDocuments describing the protocol, and sources and binaries for servers\nimplementing it, can all be found at the University of Washington’s IMAP\nInformation Center (http://www.washington.edu/imap/).
\nAll IMAP4rev1 commands are represented by methods of the same name, either\nupper-case or lower-case.
\nAll arguments to commands are converted to strings, except for AUTHENTICATE,\nand the last argument to APPEND which is passed as an IMAP4 literal. If\nnecessary (the string contains IMAP4 protocol-sensitive characters and isn’t\nenclosed with either parentheses or double quotes) each string is quoted.\nHowever, the password argument to the LOGIN command is always quoted. If\nyou want to avoid having an argument string quoted (eg: the flags argument to\nSTORE) then enclose the string in parentheses (eg: r'(\\Deleted)').
\nEach command returns a tuple: (type, [data, ...]) where type is usually\n'OK' or 'NO', and data is either the text from the command response,\nor mandated results from the command. Each data is either a string, or a\ntuple. If a tuple, then the first part is the header of the response, and the\nsecond part contains the data (ie: ‘literal’ value).
\nThe message_set option to the commands below is a string specifying one or more messages to be acted upon. It may be a simple message number ('1'), a range of message numbers ('2:4'), or a group of non-contiguous ranges separated by commas ('1:3,6:9'). A range can contain an asterisk to indicate an infinite upper bound ('3:*').
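The message_set syntax can be illustrated with a small helper. Note that this function is hypothetical and not part of imaplib; it merely expands a message_set string into concrete message numbers, taking the mailbox's highest message number to resolve '*':

```python
def expand_message_set(message_set, max_msg):
    """Expand an IMAP4 message_set string into a sorted list of message
    numbers.  max_msg supplies the value of '*' (the highest message
    number currently in the mailbox).

    Hypothetical helper for illustration only -- NOT part of imaplib.
    """
    numbers = set()
    for part in message_set.split(','):
        if ':' in part:
            lo, hi = part.split(':')
            lo = max_msg if lo == '*' else int(lo)
            hi = max_msg if hi == '*' else int(hi)
            if lo > hi:            # IMAP treats '4:2' the same as '2:4'
                lo, hi = hi, lo
            numbers.update(range(lo, hi + 1))
        else:
            numbers.add(max_msg if part == '*' else int(part))
    return sorted(numbers)

print(expand_message_set('1:3,6:9', 20))   # [1, 2, 3, 6, 7, 8, 9]
print(expand_message_set('3:*', 5))        # [3, 4, 5]
```

A real server performs this expansion itself; the helper only shows what a given message_set string denotes.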
\nAn IMAP4 instance has the following methods:
\nAuthenticate command — requires response processing.
\nmechanism specifies which authentication mechanism is to be used - it should\nappear in the instance variable capabilities in the form AUTH=mechanism.
\nauthobject must be a callable object:
\ndata = authobject(response)\n
It will be called to process server continuation responses. It should return data that will be encoded and sent to the server. It should return None if the client abort response * should be sent instead.
\nDelete the ACLs (remove any rights) set for who on mailbox.
\n\nNew in version 2.4.
\nRetrieve the specified ANNOTATIONs for mailbox. The method is\nnon-standard, but is supported by the Cyrus server.
\n\nNew in version 2.5.
\nGet the quota root’s resource usage and limits. This method is part of the IMAP4 QUOTA extension defined in rfc2087.
\n\nNew in version 2.3.
\nGet the list of quota roots for the named mailbox. This method is part\nof the IMAP4 QUOTA extension defined in rfc2087.
\n\nNew in version 2.3.
\nForce use of CRAM-MD5 authentication when identifying the client to protect\nthe password. Will only work if the server CAPABILITY response includes the\nphrase AUTH=CRAM-MD5.
\n\nNew in version 2.3.
\nShow my ACLs for a mailbox (i.e. the rights that I have on mailbox).
\n\nNew in version 2.4.
\nReturns IMAP namespaces as defined in RFC2342.
\n\nNew in version 2.3.
\nAssume authentication as user. Allows an authorised administrator to proxy\ninto any user’s mailbox.
\n\nNew in version 2.3.
\nSearch mailbox for matching messages. charset may be None, in which case\nno CHARSET will be specified in the request to the server. The IMAP\nprotocol requires that at least one criterion be specified; an exception will be\nraised when the server returns an error.
\nExample:
\n# M is a connected IMAP4 instance...\ntyp, msgnums = M.search(None, 'FROM', '"LDJ"')\n\n# or:\ntyp, msgnums = M.search(None, '(FROM "LDJ")')\n
Set ANNOTATIONs for mailbox. The method is non-standard, but is\nsupported by the Cyrus server.
\n\nNew in version 2.5.
\nSet the quota root’s resource limits. This method is part of the IMAP4 QUOTA extension defined in rfc2087.
\n\nNew in version 2.3.
\nThe sort command is a variant of search with sorting semantics for the\nresults. Returned data contains a space separated list of matching message\nnumbers.
\nSort has two arguments before the search_criterion argument(s); a\nparenthesized list of sort_criteria, and the searching charset. Note that\nunlike search, the searching charset argument is mandatory. There is also\na uid sort command which corresponds to sort the way that uid search\ncorresponds to search. The sort command first searches the mailbox for\nmessages that match the given searching criteria using the charset argument for\nthe interpretation of strings in the searching criteria. It then returns the\nnumbers of matching messages.
\nThis is an IMAP4rev1 extension command.
\nAlters flag dispositions for messages in mailbox. command is specified by\nsection 6.4.6 of RFC 2060 as being one of “FLAGS”, “+FLAGS”, or “-FLAGS”,\noptionally with a suffix of “.SILENT”.
\nFor example, to set the delete flag on all messages:
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    M.store(num, '+FLAGS', '\\Deleted')
M.expunge()
The thread command is a variant of search with threading semantics for\nthe results. Returned data contains a space separated list of thread members.
\nThread members consist of zero or more message numbers, delimited by spaces, indicating successive parent and child.
\nThread has two arguments before the search_criterion argument(s); a\nthreading_algorithm, and the searching charset. Note that unlike\nsearch, the searching charset argument is mandatory. There is also a\nuid thread command which corresponds to thread the way that uid\nsearch corresponds to search. The thread command first searches the\nmailbox for messages that match the given searching criteria using the charset\nargument for the interpretation of strings in the searching criteria. It then\nreturns the matching messages threaded according to the specified threading\nalgorithm.
\nThis is an IMAP4rev1 extension command.
\n\nNew in version 2.4.
\nInstances of IMAP4_SSL have just one additional method:
\nThe following attributes are defined on instances of IMAP4:
\nHere is a minimal example (without error checking) that opens a mailbox and\nretrieves and prints all messages:
import getpass, imaplib

M = imaplib.IMAP4()
M.login(getpass.getuser(), getpass.getpass())
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()
Source code: Lib/nntplib.py
\nThis module defines the class NNTP which implements the client side of\nthe NNTP protocol. It can be used to implement a news reader or poster, or\nautomated news processors. For more information on NNTP (Network News Transfer\nProtocol), see Internet RFC 977.
\nHere are two small examples of how it can be used. To list some statistics\nabout a newsgroup and print the subjects of the last 10 articles:
\n>>> s = NNTP('news.gmane.org')\n>>> resp, count, first, last, name = s.group('gmane.comp.python.committers')\n>>> print 'Group', name, 'has', count, 'articles, range', first, 'to', last\nGroup gmane.comp.python.committers has 1071 articles, range 1 to 1071\n>>> resp, subs = s.xhdr('subject', first + '-' + last)\n>>> for id, sub in subs[-10:]: print id, sub\n...\n1062 Re: Mercurial Status?\n1063 Re: [python-committers] (Windows) buildbots on 3.x\n1064 Re: Mercurial Status?\n1065 Re: Mercurial Status?\n1066 Python 2.6.6 status\n1067 Commit Privileges for Ask Solem\n1068 Re: Commit Privileges for Ask Solem\n1069 Re: Commit Privileges for Ask Solem\n1070 Re: Commit Privileges for Ask Solem\n1071 2.6.6 rc 2\n>>> s.quit()\n'205 Bye!'\n
To post an article from a file (this assumes that the article has valid\nheaders, and that you have right to post on the particular newsgroup):
\n>>> s = NNTP('news.gmane.org')\n>>> f = open('/tmp/article')\n>>> s.post(f)\n'240 Article posted successfully.'\n>>> s.quit()\n'205 Bye!'\n
The module itself defines the following items:
\nReturn a new instance of the NNTP class, representing a connection to the NNTP server running on host host, listening at port port. The default port is 119. If the optional user and password are provided, or if suitable credentials are present in ~/.netrc and the optional flag usenetrc is true (the default), the AUTHINFO USER and AUTHINFO PASS commands are used to identify and authenticate the user to the server. If the optional flag readermode is true, then a mode reader command is sent before authentication is performed. Reader mode is sometimes necessary if you are connecting to an NNTP server on the local machine and intend to call reader-specific commands, such as group. If you get unexpected NNTPPermanentErrors, you might need to set readermode. readermode defaults to None. usenetrc defaults to True.
\n\nChanged in version 2.4: usenetrc argument added.
\nNNTP instances have the following methods. The response that is returned as\nthe first item in the return tuple of almost all methods is the server’s\nresponse: a string beginning with a three-digit code. If the server’s response\nindicates an error, the method raises one of the above exceptions.
\nSend a LIST NEWSGROUPS command, where grouppattern is a wildmat string as\nspecified in RFC2980 (it’s essentially the same as DOS or UNIX shell wildcard\nstrings). Return a pair (response, list), where list is a list of tuples\ncontaining (name, title).
\n\nNew in version 2.4.
\nGet a description for a single group group. If more than one group matches\n(if ‘group’ is a real wildmat string), return the first match. If no group\nmatches, return an empty string.
\nThis elides the response code from the server. If the response code is needed,\nuse descriptions().
\n\nNew in version 2.4.
\nProcess an XGTITLE command, returning a pair (response, list), where\nlist is a list of tuples containing (name, title). If the file parameter\nis supplied, then the output of the XGTITLE command is stored in a file.\nIf file is a string, then the method will open a file object with that name,\nwrite to it then close it. If file is a file object, then it will start\ncalling write() on it to store the lines of the command output. If file\nis supplied, then the returned list is an empty list. This is an optional NNTP\nextension, and may not be supported by all servers.
\nRFC2980 says “It is suggested that this extension be deprecated”. Use\ndescriptions() or description() instead.
\n\nNew in version 2.5.
\nThis module provides immutable UUID objects (the UUID class)\nand the functions uuid1(), uuid3(), uuid4(), uuid5() for\ngenerating version 1, 3, 4, and 5 UUIDs as specified in RFC 4122.
\nIf all you want is a unique ID, you should probably call uuid1() or\nuuid4(). Note that uuid1() may compromise privacy since it creates\na UUID containing the computer’s network address. uuid4() creates a\nrandom UUID.
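A minimal sketch of the difference between the two recommended constructors:

```python
import uuid

# uuid4() is random: two calls virtually never produce the same value
a = uuid.uuid4()
b = uuid.uuid4()
print(a != b)        # True (with overwhelming probability)
print(a.version)     # 4

# uuid1() is time- and host-based; the network address ends up in 'node'
c = uuid.uuid1()
print(c.version)     # 1
```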
\nCreate a UUID from either a string of 32 hexadecimal digits, a string of 16\nbytes as the bytes argument, a string of 16 bytes in little-endian order as\nthe bytes_le argument, a tuple of six integers (32-bit time_low, 16-bit\ntime_mid, 16-bit time_hi_version, 8-bit clock_seq_hi_variant, 8-bit\nclock_seq_low, 48-bit node) as the fields argument, or a single 128-bit\ninteger as the int argument. When a string of hex digits is given, curly\nbraces, hyphens, and a URN prefix are all optional. For example, these\nexpressions all yield the same UUID:
\nUUID('{12345678-1234-5678-1234-567812345678}')\nUUID('12345678123456781234567812345678')\nUUID('urn:uuid:12345678-1234-5678-1234-567812345678')\nUUID(bytes='\\x12\\x34\\x56\\x78'*4)\nUUID(bytes_le='\\x78\\x56\\x34\\x12\\x34\\x12\\x78\\x56' +\n '\\x12\\x34\\x56\\x78\\x12\\x34\\x56\\x78')\nUUID(fields=(0x12345678, 0x1234, 0x5678, 0x12, 0x34, 0x567812345678))\nUUID(int=0x12345678123456781234567812345678)\n
Exactly one of hex, bytes, bytes_le, fields, or int must be given.\nThe version argument is optional; if given, the resulting UUID will have its\nvariant and version number set according to RFC 4122, overriding bits in the\ngiven hex, bytes, bytes_le, fields, or int.
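Since the alternative constructor arguments all describe the same 128-bit value, the resulting objects compare equal, as this short check shows:

```python
import uuid

# Three of the equivalent spellings from the example above
u_hex    = uuid.UUID('{12345678-1234-5678-1234-567812345678}')
u_int    = uuid.UUID(int=0x12345678123456781234567812345678)
u_fields = uuid.UUID(fields=(0x12345678, 0x1234, 0x5678,
                             0x12, 0x34, 0x567812345678))

print(u_hex == u_int == u_fields)   # True: one UUID, three spellings
print(str(u_hex))                   # '12345678-1234-5678-1234-567812345678'
```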
\nUUID instances have these read-only attributes:
\nA tuple of the six integer fields of the UUID, which are also available as six\nindividual attributes and two derived attributes:
\nField | \nMeaning | \n
---|---|
time_low | \nthe first 32 bits of the UUID | \n
time_mid | \nthe next 16 bits of the UUID | \n
time_hi_version | \nthe next 16 bits of the UUID | \n
clock_seq_hi_variant | \nthe next 8 bits of the UUID | \n
clock_seq_low | \nthe next 8 bits of the UUID | \n
node | \nthe last 48 bits of the UUID | \n
time | \nthe 60-bit timestamp | \n
clock_seq | \nthe 14-bit sequence number | \n
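A quick sketch of how these attributes relate to one another, using the example UUID from above:

```python
import uuid

u = uuid.UUID('12345678-1234-5678-1234-567812345678')

# The six primary fields are available individually and as a tuple
print(u.fields == (u.time_low, u.time_mid, u.time_hi_version,
                   u.clock_seq_hi_variant, u.clock_seq_low, u.node))  # True
print(u.time_low == 0x12345678)     # True: the first 32 bits
print(u.node == 0x567812345678)     # True: the last 48 bits

# The derived 60-bit timestamp is assembled from three of those fields
print(u.time == ((u.time_hi_version & 0x0fff) << 48 |
                 (u.time_mid << 32) | u.time_low))    # True
```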
The uuid module defines the following functions:
\nThe uuid module defines the following namespace identifiers for use with\nuuid3() or uuid5().
\nThe uuid module defines the following constants for the possible values\nof the variant attribute:
\nSee also
\nHere are some examples of typical usage of the uuid module:
\n>>> import uuid\n\n>>> # make a UUID based on the host ID and current time\n>>> uuid.uuid1()\nUUID('a8098c1a-f86e-11da-bd1a-00112444be1e')\n\n>>> # make a UUID using an MD5 hash of a namespace UUID and a name\n>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')\nUUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')\n\n>>> # make a random UUID\n>>> uuid.uuid4()\nUUID('16fd2706-8baf-433b-82eb-8c7fada847da')\n\n>>> # make a UUID using a SHA-1 hash of a namespace UUID and a name\n>>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')\nUUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')\n\n>>> # make a UUID from a string of hex digits (braces and hyphens ignored)\n>>> x = uuid.UUID('{00010203-0405-0607-0809-0a0b0c0d0e0f}')\n\n>>> # convert a UUID to a string of hex digits in standard form\n>>> str(x)\n'00010203-0405-0607-0809-0a0b0c0d0e0f'\n\n>>> # get the raw 16 bytes of the UUID\n>>> x.bytes\n'\\x00\\x01\\x02\\x03\\x04\\x05\\x06\\x07\\x08\\t\\n\\x0b\\x0c\\r\\x0e\\x0f'\n\n>>> # make a UUID from a 16-byte string\n>>> uuid.UUID(bytes=x.bytes)\nUUID('00010203-0405-0607-0809-0a0b0c0d0e0f')\n
Note
\nThe urlparse module is renamed to urllib.parse in Python 3.0.\nThe 2to3 tool will automatically adapt imports when converting\nyour sources to 3.0.
\nSource code: Lib/urlparse.py
\nThis module defines a standard interface to break Uniform Resource Locator (URL)\nstrings up in components (addressing scheme, network location, path etc.), to\ncombine the components back into a URL string, and to convert a “relative URL”\nto an absolute URL given a “base URL.”
\nThe module has been designed to match the Internet RFC on Relative Uniform\nResource Locators (and discovered a bug in an earlier draft!). It supports the\nfollowing URL schemes: file, ftp, gopher, hdl, http,\nhttps, imap, mailto, mms, news, nntp, prospero,\nrsync, rtsp, rtspu, sftp, shttp, sip, sips,\nsnews, svn, svn+ssh, telnet, wais.
\n\nNew in version 2.5: Support for the sftp and sips schemes.
\nThe urlparse module defines the following functions:
\nParse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up into smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the path component, which is retained if present. For example:
\n>>> from urlparse import urlparse\n>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')\n>>> o # doctest: +NORMALIZE_WHITESPACE\nParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',\n params='', query='', fragment='')\n>>> o.scheme\n'http'\n>>> o.port\n80\n>>> o.geturl()\n'http://www.cwi.nl:80/%7Eguido/Python.html'\n
Following the syntax specifications in RFC 1808, urlparse recognizes\na netloc only if it is properly introduced by ‘//’. Otherwise the\ninput is presumed to be a relative URL and thus to start with\na path component.
\n>>> from urlparse import urlparse\n>>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')\nParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',\n params='', query='', fragment='')\n>>> urlparse('www.cwi.nl:80/%7Eguido/Python.html')\nParseResult(scheme='', netloc='', path='www.cwi.nl:80/%7Eguido/Python.html',\n params='', query='', fragment='')\n>>> urlparse('help/Python.html')\nParseResult(scheme='', netloc='', path='help/Python.html', params='',\n query='', fragment='')\n
If the scheme argument is specified, it gives the default addressing\nscheme, to be used only if the URL does not specify one. The default value for\nthis argument is the empty string.
\nIf the allow_fragments argument is false, fragment identifiers are not\nallowed, even if the URL’s addressing scheme normally does support them. The\ndefault value for this argument is True.
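The effect of both arguments can be sketched as follows (the try/except import keeps the sketch runnable on Python 3, where the module is renamed to urllib.parse):

```python
try:
    from urlparse import urlparse          # Python 2
except ImportError:
    from urllib.parse import urlparse      # Python 3 name

# The scheme argument is a default, used only when the URL has none
print(urlparse('//www.cwi.nl/%7Eguido/Python.html', scheme='http').scheme)  # 'http'
print(urlparse('ftp://www.cwi.nl/', scheme='http').scheme)                  # 'ftp'

# With allow_fragments=False, '#...' is not split off into .fragment
r = urlparse('http://docs.python.org/lib/module-urlparse.html#functions',
             allow_fragments=False)
print(r.fragment)    # '' -- the '#functions' part stays in the path
```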
\nThe return value is actually an instance of a subclass of tuple. This\nclass has the following additional read-only convenience attributes:
\nAttribute | \nIndex | \nValue | \nValue if not present | \n
---|---|---|---|
scheme | \n0 | \nURL scheme specifier | \nempty string | \n
netloc | \n1 | \nNetwork location part | \nempty string | \n
path | \n2 | \nHierarchical path | \nempty string | \n
params | \n3 | \nParameters for last path\nelement | \nempty string | \n
query | \n4 | \nQuery component | \nempty string | \n
fragment | \n5 | \nFragment identifier | \nempty string | \n
username | \n\n | User name | \nNone | \n
password | \n\n | Password | \nNone | \n
hostname | \n\n | Host name (lower case) | \nNone | \n
port | \n\n | Port number as integer,\nif present | \nNone | \n
See section Results of urlparse() and urlsplit() for more information on the result\nobject.
\n\nChanged in version 2.5: Added attributes to return value.
\n\nChanged in version 2.7: Added IPv6 URL parsing capabilities.
\nParse a query string given as a string argument (data of type\napplication/x-www-form-urlencoded). Data are returned as a\ndictionary. The dictionary keys are the unique query variable names and the\nvalues are lists of values for each name.
\nThe optional argument keep_blank_values is a flag indicating whether blank\nvalues in percent-encoded queries should be treated as blank strings. A true value\nindicates that blanks should be retained as blank strings. The default false\nvalue indicates that blank values are to be ignored and treated as if they were\nnot included.
\nThe optional argument strict_parsing is a flag indicating what to do with\nparsing errors. If false (the default), errors are silently ignored. If true,\nerrors raise a ValueError exception.
\nUse the urllib.urlencode() function to convert such dictionaries into\nquery strings.
\n\nNew in version 2.6: Copied from the cgi module.
\nParse a query string given as a string argument (data of type\napplication/x-www-form-urlencoded). Data are returned as a list of\nname, value pairs.
\nThe optional argument keep_blank_values is a flag indicating whether blank\nvalues in percent-encoded queries should be treated as blank strings. A true value\nindicates that blanks should be retained as blank strings. The default false\nvalue indicates that blank values are to be ignored and treated as if they were\nnot included.
\nThe optional argument strict_parsing is a flag indicating what to do with\nparsing errors. If false (the default), errors are silently ignored. If true,\nerrors raise a ValueError exception.
\nUse the urllib.urlencode() function to convert such lists of pairs into\nquery strings.
\n\nNew in version 2.6: Copied from the cgi module.
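The two functions differ only in the shape of their result, as this sketch shows (with a Python 3 fallback import, since the module is renamed there):

```python
try:
    from urlparse import parse_qs, parse_qsl       # Python 2.6+
except ImportError:
    from urllib.parse import parse_qs, parse_qsl   # Python 3 name

qs = 'name=Guido&tag=py&tag=bdfl&empty='

# parse_qs: dictionary mapping each name to a list of values
print(parse_qs(qs))     # {'name': ['Guido'], 'tag': ['py', 'bdfl']}

# parse_qsl: (name, value) pairs, preserving order and repetitions
print(parse_qsl(qs))    # [('name', 'Guido'), ('tag', 'py'), ('tag', 'bdfl')]

# Blank values are dropped unless keep_blank_values is true
print(parse_qs(qs, keep_blank_values=True)['empty'])   # ['']
```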
\nThis is similar to urlparse(), but does not split the params from the URL.\nThis should generally be used instead of urlparse() if the more recent URL\nsyntax allowing parameters to be applied to each segment of the path portion\nof the URL (see RFC 2396) is wanted. A separate function is needed to\nseparate the path segments and parameters. This function returns a 5-tuple:\n(addressing scheme, network location, path, query, fragment identifier).
\nThe return value is actually an instance of a subclass of tuple. This\nclass has the following additional read-only convenience attributes:
\nAttribute | \nIndex | \nValue | \nValue if not present | \n
---|---|---|---|
scheme | \n0 | \nURL scheme specifier | \nempty string | \n
netloc | \n1 | \nNetwork location part | \nempty string | \n
path | \n2 | \nHierarchical path | \nempty string | \n
query | \n3 | \nQuery component | \nempty string | \n
fragment | \n4 | \nFragment identifier | \nempty string | \n
username | \n\n | User name | \nNone | \n
password | \n\n | Password | \nNone | \n
hostname | \n\n | Host name (lower case) | \nNone | \n
port | \n\n | Port number as integer,\nif present | \nNone | \n
See section Results of urlparse() and urlsplit() for more information on the result\nobject.
\n\nNew in version 2.2.
\n\nChanged in version 2.5: Added attributes to return value.
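The difference from urlparse() is easiest to see side by side (with a Python 3 fallback import):

```python
try:
    from urlparse import urlparse, urlsplit          # Python 2
except ImportError:
    from urllib.parse import urlparse, urlsplit      # Python 3 name

url = 'http://host/path;param?query=1#frag'

# urlparse() splits the ';param' suffix off the last path segment
p = urlparse(url)
print(p.path)       # '/path'
print(p.params)     # 'param'

# urlsplit() leaves it in the path and returns a 5-tuple, not a 6-tuple
s = urlsplit(url)
print(s.path)       # '/path;param'
print(len(s))       # 5
```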
\nCombine the elements of a tuple as returned by urlsplit() into a complete\nURL as a string. The parts argument can be any five-item iterable. This may\nresult in a slightly different, but equivalent URL, if the URL that was parsed\noriginally had unnecessary delimiters (for example, a ? with an empty query; the\nRFC states that these are equivalent).
\n\nNew in version 2.2.
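A round trip through urlsplit() and urlunsplit() demonstrates the "equivalent but not identical" case mentioned above (with a Python 3 fallback import):

```python
try:
    from urlparse import urlsplit, urlunsplit        # Python 2
except ImportError:
    from urllib.parse import urlsplit, urlunsplit    # Python 3 name

# A '?' with an empty query is an unnecessary delimiter, so the round
# trip yields an equivalent but slightly different URL.
original = 'http://www.cwi.nl/%7Eguido/Python.html?'
rebuilt = urlunsplit(urlsplit(original))
print(rebuilt)      # 'http://www.cwi.nl/%7Eguido/Python.html'

# After one pass the result is stable
print(urlunsplit(urlsplit(rebuilt)) == rebuilt)   # True
```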
\nConstruct a full (“absolute”) URL by combining a “base URL” (base) with\nanother URL (url). Informally, this uses components of the base URL, in\nparticular the addressing scheme, the network location and (part of) the path,\nto provide missing components in the relative URL. For example:
\n>>> from urlparse import urljoin\n>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')\n'http://www.cwi.nl/%7Eguido/FAQ.html'\n
The allow_fragments argument has the same meaning and default as for\nurlparse().
\nNote
\nIf url is an absolute URL (that is, starting with // or scheme://),\nthe url‘s host name and/or scheme will be present in the result. For example:
\n>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',\n... '//www.python.org/%7Eguido')\n'http://www.python.org/%7Eguido'\n
If you do not want that behavior, preprocess the url with urlsplit() and\nurlunsplit(), removing possible scheme and netloc parts.
\nSee also
\nThe result objects from the urlparse() and urlsplit() functions are\nsubclasses of the tuple type. These subclasses add the attributes\ndescribed in those functions, as well as provide an additional method:
\nReturn the re-combined version of the original URL as a string. This may differ\nfrom the original URL in that the scheme will always be normalized to lower case\nand empty components may be dropped. Specifically, empty parameters, queries,\nand fragment identifiers will be removed.
\nThe result of this method is a fixpoint if passed back through the original\nparsing function:
\n\n\n\n\n>>> import urlparse\n>>> url = 'HTTP://www.Python.org/doc/#'\n\n\n>>> r1 = urlparse.urlsplit(url)\n>>> r1.geturl()\n'http://www.Python.org/doc/'\n\n\n>>> r2 = urlparse.urlsplit(r1.geturl())\n>>> r2.geturl()\n'http://www.Python.org/doc/'\n
\nNew in version 2.5.
\nThe following classes provide the implementations of the parse results:
\nNote
\nThe SimpleHTTPServer module has been merged into http.server in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\nThe SimpleHTTPServer module defines a single class,\nSimpleHTTPRequestHandler, which is interface-compatible with\nBaseHTTPServer.BaseHTTPRequestHandler.
\nThe SimpleHTTPServer module defines the following class:
\nThis class serves files from the current directory and below, directly\nmapping the directory structure to HTTP requests.
\nA lot of the work, such as parsing the request, is done by the base class\nBaseHTTPServer.BaseHTTPRequestHandler. This class implements the\ndo_GET() and do_HEAD() functions.
\nThe following are defined as class-level attributes of\nSimpleHTTPRequestHandler:
\nThis will be "SimpleHTTP/" + __version__, where __version__ is\ndefined at the module level.
\nThe SimpleHTTPRequestHandler class defines the following methods:
\nThe request is mapped to a local file by interpreting the request as a\npath relative to the current working directory.
\nIf the request was mapped to a directory, the directory is checked for a\nfile named index.html or index.htm (in that order). If found, the\nfile’s contents are returned; otherwise a directory listing is generated\nby calling the list_directory() method. This method uses\nos.listdir() to scan the directory, and returns a 404 error\nresponse if the listdir() fails.
\nIf the request was mapped to a file, it is opened and the contents are\nreturned. Any IOError exception in opening the requested file is\nmapped to a 404, 'File not found' error. Otherwise, the content\ntype is guessed by calling the guess_type() method, which in turn\nuses the extensions_map variable.
\nA 'Content-type:' header with the guessed content type is output,\nfollowed by a 'Content-Length:' header with the file’s size and a\n'Last-Modified:' header with the file’s modification time.
\nThen follows a blank line signifying the end of the headers, and then the\ncontents of the file are output. If the file’s MIME type starts with\ntext/ the file is opened in text mode; otherwise binary mode is used.
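The request-to-file mapping described above can be exercised programmatically by serving a temporary directory on an ephemeral port and fetching a file back. This is a sketch, not production code; the module names follow Python 3 (http.server, socketserver, urllib.request), with the Python 2 names as a fallback:

```python
import os
import tempfile
import threading

try:
    from http.server import SimpleHTTPRequestHandler   # Python 3 name
    from socketserver import TCPServer
    from urllib.request import urlopen
except ImportError:                                    # Python 2 fallback
    from SimpleHTTPServer import SimpleHTTPRequestHandler
    from SocketServer import TCPServer
    from urllib2 import urlopen

# Serve a throwaway directory containing one known file
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, 'hello.txt'), 'w') as f:
    f.write('hello from SimpleHTTPServer\n')
os.chdir(tmpdir)          # the handler serves the current working directory

httpd = TCPServer(('127.0.0.1', 0), SimpleHTTPRequestHandler)  # port 0: OS picks a free port
port = httpd.server_address[1]
t = threading.Thread(target=httpd.serve_forever)
t.daemon = True
t.start()

# do_GET maps the request path onto the served directory
body = urlopen('http://127.0.0.1:%d/hello.txt' % port).read()
print(body)

httpd.shutdown()
httpd.server_close()
```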
\nThe test() function in the SimpleHTTPServer module is an\nexample which creates a server using the SimpleHTTPRequestHandler\nas the Handler.
\n\nNew in version 2.5: The 'Last-Modified' header.
\nThe SimpleHTTPServer module can be used in the following manner in order\nto set up a very basic web server serving files relative to the current\ndirectory.
\nimport SimpleHTTPServer\nimport SocketServer\n\nPORT = 8000\n\nHandler = SimpleHTTPServer.SimpleHTTPRequestHandler\n\nhttpd = SocketServer.TCPServer(("", PORT), Handler)\n\nprint "serving at port", PORT\nhttpd.serve_forever()\n
The SimpleHTTPServer module can also be invoked directly using the\n-m switch of the interpreter with a port number argument.\nSimilar to the previous example, this serves the files relative to the\ncurrent directory.
\npython -m SimpleHTTPServer 8000
\nSee also
\nSource code: Lib/smtplib.py
\nThe smtplib module defines an SMTP client session object that can be used\nto send mail to any Internet machine with an SMTP or ESMTP listener daemon. For\ndetails of SMTP and ESMTP operation, consult RFC 821 (Simple Mail Transfer\nProtocol) and RFC 1869 (SMTP Service Extensions).
\nAn SMTP instance encapsulates an SMTP connection. It has methods that support a full repertoire of SMTP and ESMTP operations. If the optional host and port parameters are given, the SMTP connect() method is called with those parameters during initialization. An SMTPConnectError is raised if the specified host doesn’t respond correctly. The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used).
\nFor normal use, you should only require the initialization/connect,\nsendmail(), and quit() methods. An example is included below.
\n\nChanged in version 2.6: timeout was added.
\nAn SMTP_SSL instance behaves exactly the same as instances of SMTP. SMTP_SSL should be used for situations where SSL is required from the beginning of the connection and using starttls() is not appropriate. If host is not specified, the local host is used. If port is omitted, the standard SMTP-over-SSL port (465) is used. keyfile and certfile are also optional, and can contain a PEM formatted private key and certificate chain file for the SSL connection. The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used).
\n\nNew in version 2.6.
\nThe LMTP protocol, which is very similar to ESMTP, is heavily based on the\nstandard SMTP client. It’s common to use Unix sockets for LMTP, so our connect()\nmethod must support that as well as a regular host:port server. To specify a\nUnix socket, you must use an absolute path for host, starting with a ‘/’.
\nAuthentication is supported, using the regular SMTP mechanism. When using a Unix socket, LMTP servers generally don’t support or require any authentication, but your mileage might vary.
\n\nNew in version 2.6.
\nA nice selection of exceptions is defined as well:
\nSee also
\nAn SMTP instance has the following methods:
\nSend a command cmd to the server. The optional argument argstring is simply\nconcatenated to the command, separated by a space.
\nThis returns a 2-tuple composed of a numeric response code and the actual\nresponse line (multiline responses are joined into one long line.)
\nIn normal operation it should not be necessary to call this method explicitly.\nIt is used to implement other methods and may be useful for testing private\nextensions.
\nIf the connection to the server is lost while waiting for the reply,\nSMTPServerDisconnected will be raised.
\nIdentify yourself to the SMTP server using HELO. The hostname argument\ndefaults to the fully qualified domain name of the local host.\nThe message returned by the server is stored as the helo_resp attribute\nof the object.
\nIn normal operation it should not be necessary to call this method explicitly.\nIt will be implicitly called by the sendmail() when necessary.
\nIdentify yourself to an ESMTP server using EHLO. The hostname argument defaults to the fully qualified domain name of the local host. Examine the response for ESMTP options and store them for use by has_extn(). Also sets several informational attributes: the message returned by the server is stored as the ehlo_resp attribute, does_esmtp is set to true or false depending on whether the server supports ESMTP, and esmtp_features will be a dictionary containing the names of the SMTP service extensions this server supports, and their parameters (if any).
\nUnless you wish to use has_extn() before sending mail, it should not be\nnecessary to call this method explicitly. It will be implicitly called by\nsendmail() when necessary.
\nThis method calls ehlo() and/or helo() if there has been no previous EHLO or HELO command this session. It tries ESMTP EHLO first.
\n\nNew in version 2.6.
\nCheck the validity of an address on this server using SMTP VRFY. Returns a\ntuple consisting of code 250 and a full RFC 822 address (including human\nname) if the user address is valid. Otherwise returns an SMTP error code of 400\nor greater and an error string.
\nNote
\nMany sites disable SMTP VRFY in order to foil spammers.
\nLog in on an SMTP server that requires authentication. The arguments are the\nusername and the password to authenticate with. If there has been no previous\nEHLO or HELO command this session, this method tries ESMTP EHLO\nfirst. This method will return normally if the authentication was successful, or\nmay raise the following exceptions:
\nPut the SMTP connection in TLS (Transport Layer Security) mode. All SMTP\ncommands that follow will be encrypted. You should then call ehlo()\nagain.
\nIf keyfile and certfile are provided, these are passed to the socket\nmodule’s ssl() function.
\nIf there has been no previous EHLO or HELO command this session,\nthis method tries ESMTP EHLO first.
\n\nChanged in version 2.6.
\n\nChanged in version 2.6.
\nSend mail. The required arguments are an RFC 822 from-address string, a list\nof RFC 822 to-address strings (a bare string will be treated as a list with 1\naddress), and a message string. The caller may pass a list of ESMTP options\n(such as 8bitmime) to be used in MAIL FROM commands as mail_options.\nESMTP options (such as DSN commands) that should be used with all RCPT\ncommands can be passed as rcpt_options. (If you need to use different ESMTP\noptions to different recipients you have to use the low-level methods such as\nmail(), rcpt() and data() to send the message.)
\nNote
\nThe from_addr and to_addrs parameters are used to construct the message\nenvelope used by the transport agents. The SMTP does not modify the\nmessage headers in any way.
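The distinction can be seen by sending to an envelope recipient that appears in no header, as with a blind carbon copy. A minimal sketch with hypothetical addresses, using the email package to build the message (the sendmail() call is shown commented out since it needs a running server):

```python
from email.mime.text import MIMEText

# Hypothetical addresses, for illustration only.
msg = MIMEText("Meeting at noon.")
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Reminder"
text = msg.as_string()

# The envelope is passed separately to sendmail(); carol gets a copy
# even though she appears in no header (i.e. a "Bcc" recipient).
envelope_from = "alice@example.com"
envelope_to = ["bob@example.com", "carol@example.com"]
# server.sendmail(envelope_from, envelope_to, text)
```

Because smtplib never reads the headers, the "To" header and the envelope recipient list are free to differ.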
\nIf there has been no previous EHLO or HELO command this session, this\nmethod tries ESMTP EHLO first. If the server does ESMTP, message size and\neach of the specified options will be passed to it (if the option is in the\nfeature set the server advertises). If EHLO fails, HELO will be tried\nand ESMTP options suppressed.
\nThis method will return normally if the mail is accepted for at least one recipient; otherwise it will raise an exception. That is, if this method does not raise an exception, then someone should get your mail. It then returns a dictionary with one entry for each recipient that was refused; each entry contains a tuple of the SMTP error code and the accompanying error message sent by the server.
\nThis method may raise the following exceptions:
\nUnless otherwise noted, the connection will be open even after an exception is\nraised.
\nTerminate the SMTP session and close the connection. Return the result of\nthe SMTP QUIT command.
\n\nChanged in version 2.6: Return a value.
\nLow-level methods corresponding to the standard SMTP/ESMTP commands HELP,\nRSET, NOOP, MAIL, RCPT, and DATA are also supported.\nNormally these do not need to be called directly, so they are not documented\nhere. For details, consult the module code.
\nThis example prompts the user for addresses needed in the message envelope (‘To’\nand ‘From’ addresses), and the message to be delivered. Note that the headers\nto be included with the message must be included in the message as entered; this\nexample doesn’t do any processing of the RFC 822 headers. In particular, the\n‘To’ and ‘From’ addresses must be included in the message headers explicitly.
\nimport smtplib\n\ndef prompt(prompt):\n return raw_input(prompt).strip()\n\nfromaddr = prompt("From: ")\ntoaddrs = prompt("To: ").split()\nprint "Enter message, end with ^D (Unix) or ^Z (Windows):"\n\n# Add the From: and To: headers at the start!\nmsg = ("From: %s\\r\\nTo: %s\\r\\n\\r\\n"\n % (fromaddr, ", ".join(toaddrs)))\nwhile 1:\n try:\n line = raw_input()\n except EOFError:\n break\n if not line:\n break\n msg = msg + line\n\nprint "Message length is " + repr(len(msg))\n\nserver = smtplib.SMTP('localhost')\nserver.set_debuglevel(1)\nserver.sendmail(fromaddr, toaddrs, msg)\nserver.quit()\n
Note
\nIn general, you will want to use the email package’s features to\nconstruct an email message, which you can then convert to a string and send\nvia sendmail(); see email: Examples.
\nNote
\nThe BaseHTTPServer module has been merged into http.server in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\nSource code: Lib/BaseHTTPServer.py
\nThis module defines two classes for implementing HTTP servers (Web servers).\nUsually, this module isn’t used directly, but is used as a basis for building\nfunctioning Web servers. See the SimpleHTTPServer and\nCGIHTTPServer modules.
\nThe first class, HTTPServer, is a SocketServer.TCPServer\nsubclass, and therefore implements the SocketServer.BaseServer\ninterface. It creates and listens at the HTTP socket, dispatching the requests\nto a handler. Code to create and run the server looks like this:
\nimport BaseHTTPServer\n\ndef run(server_class=BaseHTTPServer.HTTPServer,\n handler_class=BaseHTTPServer.BaseHTTPRequestHandler):\n server_address = ('', 8000)\n httpd = server_class(server_address, handler_class)\n httpd.serve_forever()\n
This class is used to handle the HTTP requests that arrive at the server. By\nitself, it cannot respond to any actual HTTP requests; it must be subclassed\nto handle each request method (e.g. GET or\nPOST). BaseHTTPRequestHandler provides a number of class and\ninstance variables, and methods for use by subclasses.
\nThe handler will parse the request and the headers, then call a method\nspecific to the request type. The method name is constructed from the\nrequest. For example, for the request method SPAM, the do_SPAM()\nmethod will be called with no arguments. All of the relevant information is\nstored in instance variables of the handler. Subclasses should not need to\noverride or extend the __init__() method.
\nBaseHTTPRequestHandler has the following instance variables:
\nBaseHTTPRequestHandler has the following class variables:
\nSpecifies the Content-Type HTTP header of error responses sent to the\nclient. The default value is 'text/html'.
\n\nNew in version 2.6: Previously, the content type was always 'text/html'.
\nSpecifies a rfc822.Message-like class to parse HTTP headers.\nTypically, this is not overridden, and it defaults to\nmimetools.Message.
\nA BaseHTTPRequestHandler instance has the following methods:
\nReturns the date and time given by timestamp (which must be in the\nformat returned by time.time()), formatted for a message header. If\ntimestamp is omitted, it uses the current date and time.
\nThe result looks like 'Sun, 06 Nov 1994 08:49:37 GMT'.
\n\nNew in version 2.5: The timestamp parameter.
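The layout is the fixed RFC 1123 date format used in HTTP headers. As an illustrative sketch (not the module's own implementation), the same string can be produced with time.strftime() over a UTC time tuple; the timestamp below is sample data chosen to match the example above:

```python
import time

# Sample timestamp corresponding to 1994-11-06 08:49:37 UTC.
stamp = 784111777
formatted = time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime(stamp))
# In the C locale this yields 'Sun, 06 Nov 1994 08:49:37 GMT'.
```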
\nTo create a server that doesn’t run forever, but until some condition is\nfulfilled:
\nimport BaseHTTPServer\n\ndef run_while_true(server_class=BaseHTTPServer.HTTPServer,\n handler_class=BaseHTTPServer.BaseHTTPRequestHandler):\n """\n This assumes that keep_running() is a function of no arguments which\n is tested initially and after each request. If its return value\n is true, the server continues.\n """\n server_address = ('', 8000)\n httpd = server_class(server_address, handler_class)\n while keep_running():\n httpd.handle_request()\n
See also
\nNote
\nThe CGIHTTPServer module has been merged into http.server in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\nThe CGIHTTPServer module defines a request-handler class that is interface-compatible with BaseHTTPServer.BaseHTTPRequestHandler and inherits behavior from SimpleHTTPServer.SimpleHTTPRequestHandler, but can also run CGI scripts.
\nNote
\nThis module can run CGI scripts on Unix and Windows systems.
\nNote
\nCGI scripts run by the CGIHTTPRequestHandler class cannot execute\nredirects (HTTP code 302), because code 200 (script output follows) is sent\nprior to execution of the CGI script. This pre-empts the status code.
\nThe CGIHTTPServer module defines the following class:
\nThis class is used to serve either files or output of CGI scripts from the\ncurrent directory and below. Note that mapping HTTP hierarchic structure to\nlocal directory structure is exactly as in\nSimpleHTTPServer.SimpleHTTPRequestHandler.
\nThe class will, however, run the CGI script instead of serving it as a file, if it guesses it to be a CGI script. Only directory-based CGI scripts are used; the other common server configuration, treating special extensions as denoting CGI scripts, is not supported.
\nThe do_GET() and do_HEAD() functions are modified to run CGI scripts\nand serve the output, instead of serving files, if the request leads to\nsomewhere below the cgi_directories path.
\nThe CGIHTTPRequestHandler defines the following data member:
\nThe CGIHTTPRequestHandler defines the following methods:
\nNote that CGI scripts will be run with the UID of user nobody, for security reasons. Problems with the CGI script will be translated to error 403.
\nFor example usage, see the implementation of the test() function.
\nSee also
\nNote
\nThe SocketServer module has been renamed to socketserver in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\nSource code: Lib/SocketServer.py
\nThe SocketServer module simplifies the task of writing network servers.
\nThere are four basic server classes: TCPServer uses the Internet TCP protocol, which provides for continuous streams of data between the client and server. UDPServer uses datagrams, which are discrete packets of information that may arrive out of order or be lost while in transit. The more infrequently used UnixStreamServer and UnixDatagramServer classes are similar, but use Unix domain sockets; they're not available on non-Unix platforms. For more details on network programming, consult a book such as W. Richard Stevens' UNIX Network Programming or Ralph Davis's Win32 Network Programming.
\nThese four classes process requests synchronously; each request must be\ncompleted before the next request can be started. This isn’t suitable if each\nrequest takes a long time to complete, because it requires a lot of computation,\nor because it returns a lot of data which the client is slow to process. The\nsolution is to create a separate process or thread to handle each request; the\nForkingMixIn and ThreadingMixIn mix-in classes can be used to\nsupport asynchronous behaviour.
\nCreating a server requires several steps. First, you must create a request\nhandler class by subclassing the BaseRequestHandler class and\noverriding its handle() method; this method will process incoming\nrequests. Second, you must instantiate one of the server classes, passing it\nthe server’s address and the request handler class. Finally, call the\nhandle_request() or serve_forever() method of the server object to\nprocess one or many requests.
\nWhen inheriting from ThreadingMixIn for threaded connection behavior,\nyou should explicitly declare how you want your threads to behave on an abrupt\nshutdown. The ThreadingMixIn class defines an attribute\ndaemon_threads, which indicates whether or not the server should wait for\nthread termination. You should set the flag explicitly if you would like threads\nto behave autonomously; the default is False, meaning that Python will\nnot exit until all threads created by ThreadingMixIn have exited.
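A minimal sketch of such a declaration (the class name is illustrative; under Python 3 the module is named socketserver, as noted above):

```python
try:
    import SocketServer
except ImportError:  # the module is renamed socketserver in Python 3
    import socketserver as SocketServer

class DaemonThreadingTCPServer(SocketServer.ThreadingMixIn,
                               SocketServer.TCPServer):
    # Do not wait for in-flight handler threads when the main thread
    # exits; the default (False) blocks interpreter exit until all
    # handler threads have finished.
    daemon_threads = True
```

With daemon_threads set, interrupting the main thread ends the process without waiting for connections still being handled.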
\nServer classes have the same external methods and attributes, no matter what\nnetwork protocol they use.
\nThere are five classes in an inheritance diagram, four of which represent\nsynchronous servers of four types:
\n+------------+\n| BaseServer |\n+------------+\n |\n v\n+-----------+ +------------------+\n| TCPServer |------->| UnixStreamServer |\n+-----------+ +------------------+\n |\n v\n+-----------+ +--------------------+\n| UDPServer |------->| UnixDatagramServer |\n+-----------+ +--------------------+
\nNote that UnixDatagramServer derives from UDPServer, not from\nUnixStreamServer — the only difference between an IP and a Unix\nstream server is the address family, which is simply repeated in both Unix\nserver classes.
\nForking and threading versions of each type of server can be created using the\nForkingMixIn and ThreadingMixIn mix-in classes. For instance,\na threading UDP server class is created as follows:
\nclass ThreadingUDPServer(ThreadingMixIn, UDPServer): pass\n
The mix-in class must come first, since it overrides a method defined in UDPServer. Setting the various attributes also changes the behavior of the underlying server mechanism.
\nTo implement a service, you must derive a class from BaseRequestHandler\nand redefine its handle() method. You can then run various versions of\nthe service by combining one of the server classes with your request handler\nclass. The request handler class must be different for datagram or stream\nservices. This can be hidden by using the handler subclasses\nStreamRequestHandler or DatagramRequestHandler.
\nOf course, you still have to use your head! For instance, it makes no sense to\nuse a forking server if the service contains state in memory that can be\nmodified by different requests, since the modifications in the child process\nwould never reach the initial state kept in the parent process and passed to\neach child. In this case, you can use a threading server, but you will probably\nhave to use locks to protect the integrity of the shared data.
\nOn the other hand, if you are building an HTTP server where all data is stored\nexternally (for instance, in the file system), a synchronous class will\nessentially render the service “deaf” while one request is being handled –\nwhich may be for a very long time if a client is slow to receive all the data it\nhas requested. Here a threading or forking server is appropriate.
\nIn some cases, it may be appropriate to process part of a request synchronously,\nbut to finish processing in a forked child depending on the request data. This\ncan be implemented by using a synchronous server and doing an explicit fork in\nthe request handler class handle() method.
\nAnother approach to handling multiple simultaneous requests in an environment\nthat supports neither threads nor fork() (or where these are too expensive\nor inappropriate for the service) is to maintain an explicit table of partially\nfinished requests and to use select() to decide which request to work on\nnext (or whether to handle a new incoming request). This is particularly\nimportant for stream services where each client can potentially be connected for\na long time (if threads or subprocesses cannot be used). See asyncore for\nanother way to manage this.
\nTell the serve_forever() loop to stop and wait until it does.
\n\nNew in version 2.6.
\nThe server classes support the following class variables:
\nThere are various server methods that can be overridden by subclasses of base\nserver classes like TCPServer; these methods aren’t useful to external\nusers of the server object.
\nThe request handler class must define a new handle() method, and can\noverride any of the following methods. A new instance is created for each\nrequest.
\nThis function must do all the work required to service a request. The\ndefault implementation does nothing. Several instance attributes are\navailable to it; the request is available as self.request; the client\naddress as self.client_address; and the server instance as\nself.server, in case it needs access to per-server information.
\nThe type of self.request is different for datagram or stream\nservices. For stream services, self.request is a socket object; for\ndatagram services, self.request is a pair of string and socket.\nHowever, this can be hidden by using the request handler subclasses\nStreamRequestHandler or DatagramRequestHandler, which\noverride the setup() and finish() methods, and provide\nself.rfile and self.wfile attributes. self.rfile and\nself.wfile can be read or written, respectively, to get the request\ndata or return data to the client.
\nThis is the server side:
\nimport SocketServer\n\nclass MyTCPHandler(SocketServer.BaseRequestHandler):\n """\n The RequestHandler class for our server.\n\n It is instantiated once per connection to the server, and must\n override the handle() method to implement communication to the\n client.\n """\n\n def handle(self):\n # self.request is the TCP socket connected to the client\n self.data = self.request.recv(1024).strip()\n print "{} wrote:".format(self.client_address[0])\n print self.data\n # just send back the same data, but upper-cased\n self.request.send(self.data.upper())\n\nif __name__ == "__main__":\n HOST, PORT = "localhost", 9999\n\n # Create the server, binding to localhost on port 9999\n server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)\n\n # Activate the server; this will keep running until you\n # interrupt the program with Ctrl-C\n server.serve_forever()\n
An alternative request handler class that makes use of streams (file-like\nobjects that simplify communication by providing the standard file interface):
\nclass MyTCPHandler(SocketServer.StreamRequestHandler):\n\n def handle(self):\n # self.rfile is a file-like object created by the handler;\n # we can now use e.g. readline() instead of raw recv() calls\n self.data = self.rfile.readline().strip()\n print "{} wrote:".format(self.client_address[0])\n print self.data\n # Likewise, self.wfile is a file-like object used to write back\n # to the client\n self.wfile.write(self.data.upper())\n
The difference is that the readline() call in the second handler will call\nrecv() multiple times until it encounters a newline character, while the\nsingle recv() call in the first handler will just return what has been sent\nfrom the client in one send() call.
\nThis is the client side:
\nimport socket\nimport sys\n\nHOST, PORT = "localhost", 9999\ndata = " ".join(sys.argv[1:])\n\n# Create a socket (SOCK_STREAM means a TCP socket)\nsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n\ntry:\n # Connect to server and send data\n sock.connect((HOST, PORT))\n sock.send(data + "\\n")\n\n # Receive data from the server and shut down\n received = sock.recv(1024)\nfinally:\n sock.close()\n\nprint "Sent: {}".format(data)\nprint "Received: {}".format(received)\n
The output of the example should look something like this:
\nServer:
\n$ python TCPServer.py\n127.0.0.1 wrote:\nhello world with TCP\n127.0.0.1 wrote:\npython is nice
\nClient:
\n$ python TCPClient.py hello world with TCP\nSent: hello world with TCP\nReceived: HELLO WORLD WITH TCP\n$ python TCPClient.py python is nice\nSent: python is nice\nReceived: PYTHON IS NICE
\nThis is the server side:
\nimport SocketServer\n\nclass MyUDPHandler(SocketServer.BaseRequestHandler):\n """\n This class works similar to the TCP handler class, except that\n self.request consists of a pair of data and client socket, and since\n there is no connection the client address must be given explicitly\n when sending data back via sendto().\n """\n\n def handle(self):\n data = self.request[0].strip()\n socket = self.request[1]\n print "{} wrote:".format(self.client_address[0])\n print data\n socket.sendto(data.upper(), self.client_address)\n\nif __name__ == "__main__":\n HOST, PORT = "localhost", 9999\n server = SocketServer.UDPServer((HOST, PORT), MyUDPHandler)\n server.serve_forever()\n
This is the client side:
\nimport socket\nimport sys\n\nHOST, PORT = "localhost", 9999\ndata = " ".join(sys.argv[1:])\n\n# SOCK_DGRAM is the socket type to use for UDP sockets\nsock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)\n\n# As you can see, there is no connect() call; UDP has no connections.\n# Instead, data is directly sent to the recipient via sendto().\nsock.sendto(data + "\\n", (HOST, PORT))\nreceived = sock.recv(1024)\n\nprint "Sent: {}".format(data)\nprint "Received: {}".format(received)\n
The output of the example should look exactly like that of the TCP server example.
\nTo build asynchronous handlers, use the ThreadingMixIn and\nForkingMixIn classes.
\nAn example for the ThreadingMixIn class:
\nimport socket\nimport threading\nimport SocketServer\n\nclass ThreadedTCPRequestHandler(SocketServer.BaseRequestHandler):\n\n def handle(self):\n data = self.request.recv(1024)\n cur_thread = threading.current_thread()\n response = "{}: {}".format(cur_thread.name, data)\n self.request.send(response)\n\nclass ThreadedTCPServer(SocketServer.ThreadingMixIn, SocketServer.TCPServer):\n pass\n\ndef client(ip, port, message):\n sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n sock.connect((ip, port))\n try:\n sock.send(message)\n response = sock.recv(1024)\n print "Received: {}".format(response)\n finally:\n sock.close()\n\nif __name__ == "__main__":\n # Port 0 means to select an arbitrary unused port\n HOST, PORT = "localhost", 0\n\n server = ThreadedTCPServer((HOST, PORT), ThreadedTCPRequestHandler)\n ip, port = server.server_address\n\n # Start a thread with the server -- that thread will then start one\n # more thread for each request\n server_thread = threading.Thread(target=server.serve_forever)\n # Exit the server thread when the main thread terminates\n server_thread.daemon = True\n server_thread.start()\n print "Server loop running in thread:", server_thread.name\n\n client(ip, port, "Hello World 1")\n client(ip, port, "Hello World 2")\n client(ip, port, "Hello World 3")\n\n server.shutdown()\n
The output of the example should look something like this:
\n$ python ThreadedTCPServer.py\nServer loop running in thread: Thread-1\nReceived: Thread-2: Hello World 1\nReceived: Thread-3: Hello World 2\nReceived: Thread-4: Hello World 3
\nThe ForkingMixIn class is used in the same way, except that the server\nwill spawn a new process for each request.
\nNote
\nThe DocXMLRPCServer module has been merged into xmlrpc.server\nin Python 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\n\nNew in version 2.3.
\nThe DocXMLRPCServer module extends the classes found in\nSimpleXMLRPCServer to serve HTML documentation in response to HTTP GET\nrequests. Servers can either be free standing, using DocXMLRPCServer,\nor embedded in a CGI environment, using DocCGIXMLRPCRequestHandler.
\nThe DocXMLRPCServer class is derived from\nSimpleXMLRPCServer.SimpleXMLRPCServer and provides a means of creating\nself-documenting, stand alone XML-RPC servers. HTTP POST requests are handled as\nXML-RPC method calls. HTTP GET requests are handled by generating pydoc-style\nHTML documentation. This allows a server to provide its own web-based\ndocumentation.
\nThe DocCGIXMLRPCRequestHandler class is derived from\nSimpleXMLRPCServer.CGIXMLRPCRequestHandler and provides a means of\ncreating self-documenting, XML-RPC CGI scripts. HTTP POST requests are handled\nas XML-RPC method calls. HTTP GET requests are handled by generating pydoc-style\nHTML documentation. This allows a server to provide its own web-based\ndocumentation.
\nNote
\nThe SimpleXMLRPCServer module has been merged into\nxmlrpc.server in Python 3.0. The 2to3 tool will automatically\nadapt imports when converting your sources to 3.0.
\n\nNew in version 2.2.
\nSource code: Lib/SimpleXMLRPCServer.py
\nThe SimpleXMLRPCServer module provides a basic server framework for\nXML-RPC servers written in Python. Servers can either be free standing, using\nSimpleXMLRPCServer, or embedded in a CGI environment, using\nCGIXMLRPCRequestHandler.
\nCreate a new server instance. This class provides methods for registration of\nfunctions that can be called by the XML-RPC protocol. The requestHandler\nparameter should be a factory for request handler instances; it defaults to\nSimpleXMLRPCRequestHandler. The addr and requestHandler parameters\nare passed to the SocketServer.TCPServer constructor. If logRequests\nis true (the default), requests will be logged; setting this parameter to false\nwill turn off logging. The allow_none and encoding parameters are passed\non to xmlrpclib and control the XML-RPC responses that will be returned\nfrom the server. The bind_and_activate parameter controls whether\nserver_bind() and server_activate() are called immediately by the\nconstructor; it defaults to true. Setting it to false allows code to manipulate\nthe allow_reuse_address class variable before the address is bound.
\n\nChanged in version 2.5: The allow_none and encoding parameters were added.
\n\nChanged in version 2.6: The bind_and_activate parameter was added.
\nCreate a new instance to handle XML-RPC requests in a CGI environment. The\nallow_none and encoding parameters are passed on to xmlrpclib and\ncontrol the XML-RPC responses that will be returned from the server.
\n\nNew in version 2.3.
\n\nChanged in version 2.5: The allow_none and encoding parameters were added.
\nThe SimpleXMLRPCServer class is based on\nSocketServer.TCPServer and provides a means of creating simple, stand\nalone XML-RPC servers.
\nRegister an object which is used to expose method names which have not been\nregistered using register_function(). If instance contains a\n_dispatch() method, it is called with the requested method name and the\nparameters from the request. Its API is def _dispatch(self, method, params)\n(note that params does not represent a variable argument list). If it calls\nan underlying function to perform its task, that function is called as\nfunc(*params), expanding the parameter list. The return value from\n_dispatch() is returned to the client as the result. If instance does\nnot have a _dispatch() method, it is searched for an attribute matching\nthe name of the requested method.
\nIf the optional allow_dotted_names argument is true and the instance does not\nhave a _dispatch() method, then if the requested method name contains\nperiods, each component of the method name is searched for individually, with\nthe effect that a simple hierarchical search is performed. The value found from\nthis search is then called with the parameters from the request, and the return\nvalue is passed back to the client.
\nWarning
\nEnabling the allow_dotted_names option allows intruders to access your\nmodule’s global variables and may allow intruders to execute arbitrary code on\nyour machine. Only use this option on a secure, closed network.
\n\nChanged in version 2.3.5, 2.4.1: allow_dotted_names was added to plug a security hole; prior versions are insecure.
\nRegisters the XML-RPC introspection functions system.listMethods,\nsystem.methodHelp and system.methodSignature.
\n\nNew in version 2.3.
\nAn attribute value that must be a tuple listing valid path portions of the URL\nfor receiving XML-RPC requests. Requests posted to other paths will result in a\n404 “no such page” HTTP error. If this tuple is empty, all paths will be\nconsidered valid. The default value is ('/', '/RPC2').
\n\nNew in version 2.5.
\nIf this attribute is not None, responses larger than this value\nwill be encoded using the gzip transfer encoding, if permitted by\nthe client. The default is 1400 which corresponds roughly\nto a single TCP packet.
\n\nNew in version 2.7.
\nServer code:
\nfrom SimpleXMLRPCServer import SimpleXMLRPCServer\nfrom SimpleXMLRPCServer import SimpleXMLRPCRequestHandler\n\n# Restrict to a particular path.\nclass RequestHandler(SimpleXMLRPCRequestHandler):\n rpc_paths = ('/RPC2',)\n\n# Create server\nserver = SimpleXMLRPCServer(("localhost", 8000),\n requestHandler=RequestHandler)\nserver.register_introspection_functions()\n\n# Register pow() function; this will use the value of\n# pow.__name__ as the name, which is just 'pow'.\nserver.register_function(pow)\n\n# Register a function under a different name\ndef adder_function(x,y):\n return x + y\nserver.register_function(adder_function, 'add')\n\n# Register an instance; all the methods of the instance are\n# published as XML-RPC methods (in this case, just 'div').\nclass MyFuncs:\n def div(self, x, y):\n return x // y\n\nserver.register_instance(MyFuncs())\n\n# Run the server's main loop\nserver.serve_forever()\n
The following client code will call the methods made available by the preceding\nserver:
\nimport xmlrpclib\n\ns = xmlrpclib.ServerProxy('http://localhost:8000')\nprint s.pow(2,3) # Returns 2**3 = 8\nprint s.add(2,3) # Returns 5\nprint s.div(5,2) # Returns 5//2 = 2\n\n# Print list of available methods\nprint s.system.listMethods()\n
The CGIXMLRPCRequestHandler class can be used to handle XML-RPC\nrequests sent to Python CGI scripts.
\nExample:
\nclass MyFuncs:\n def div(self, x, y): return x // y\n\n\nhandler = CGIXMLRPCRequestHandler()\nhandler.register_function(pow)\nhandler.register_function(lambda x,y: x+y, 'add')\nhandler.register_introspection_functions()\nhandler.register_instance(MyFuncs())\nhandler.handle_request()\n
Note
\nThe xmlrpclib module has been renamed to xmlrpc.client in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\n\nNew in version 2.2.
\nSource code: Lib/xmlrpclib.py
\nXML-RPC is a Remote Procedure Call method that uses XML passed via HTTP as a\ntransport. With it, a client can call methods with parameters on a remote\nserver (the server is named by a URI) and get back structured data. This module\nsupports writing XML-RPC client code; it handles all the details of translating\nbetween conformable Python objects and XML on the wire.
\nA ServerProxy instance is an object that manages communication with a remote XML-RPC server. The required first argument is a URI (Uniform Resource Identifier), and will normally be the URL of the server. The optional second argument is a transport factory instance; by default it is an internal SafeTransport instance for https: URLs and an internal HTTP Transport instance otherwise. The optional third argument is an encoding, by default UTF-8. The optional fourth argument is a debugging flag. If allow_none is true, the Python constant None will be translated into XML; the default behaviour is for None to raise a TypeError. This is a commonly-used extension to the XML-RPC specification, but isn't supported by all clients and servers; see http://ontosys.com/xml-rpc/extensions.php for a description. The use_datetime flag can be used to cause date/time values to be presented as datetime.datetime objects; this is false by default. datetime.datetime objects may be passed to calls.
\nBoth the HTTP and HTTPS transports support the URL syntax extension for HTTP\nBasic Authentication: http://user:pass@host:port/path. The user:pass\nportion will be base64-encoded as an HTTP ‘Authorization’ header, and sent to\nthe remote server as part of the connection process when invoking an XML-RPC\nmethod. You only need to use this if the remote server requires a Basic\nAuthentication user and password.
\nThe returned instance is a proxy object with methods that can be used to invoke\ncorresponding RPC calls on the remote server. If the remote server supports the\nintrospection API, the proxy can also be used to query the remote server for the\nmethods it supports (service discovery) and fetch other server-associated\nmetadata.
\nServerProxy instance methods take Python basic types and objects as\narguments and return Python basic types and classes. Types that are conformable\n(e.g. that can be marshalled through XML), include the following (and except\nwhere noted, they are unmarshalled as the same Python type):
\nName | \nMeaning | \n
---|---|
boolean | \nThe True and False\nconstants | \n
integers | \nPass in directly | \n
floating-point numbers | \nPass in directly | \n
strings | \nPass in directly | \n
arrays | \nAny Python sequence type containing\nconformable elements. Arrays are returned\nas lists | \n
structures | \nA Python dictionary. Keys must be strings,\nvalues may be any conformable type. Objects\nof user-defined classes can be passed in;\nonly their __dict__ attribute is\ntransmitted. | \n
dates | \nin seconds since the epoch (pass in an\ninstance of the DateTime class) or\na datetime.datetime instance. | \n
binary data | \npass in an instance of the Binary\nwrapper class | \n
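A marshalling round trip for these types can be tried without any server by combining dumps() and loads() (both described further below); the method name here is invented for illustration. Under Python 3 the module is named xmlrpc.client, which this sketch allows for:

```python
try:
    import xmlrpclib                    # Python 2 name
except ImportError:
    import xmlrpc.client as xmlrpclib  # renamed in Python 3

# Marshal a struct (dict) and an array (list) of conformable values
# to XML-RPC and back -- no server is needed for a round trip.
payload = {"name": "spam", "count": 3, "weights": [1.5, 2.0]}
xml = xmlrpclib.dumps((payload,), methodname="example.echo")
params, methodname = xmlrpclib.loads(xml)
assert methodname == "example.echo"
assert params[0]["weights"] == [1.5, 2.0]
```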
This is the full set of data types supported by XML-RPC. Method calls may also\nraise a special Fault instance, used to signal XML-RPC server errors, or\nProtocolError used to signal an error in the HTTP/HTTPS transport layer.\nBoth Fault and ProtocolError derive from a base class called\nError. Note that even though starting with Python 2.2 you can subclass\nbuilt-in types, the xmlrpclib module currently does not marshal instances of such\nsubclasses.
\nWhen passing strings, characters special to XML such as <, >, and &\nwill be automatically escaped. However, it’s the caller’s responsibility to\nensure that the string is free of characters that aren’t allowed in XML, such as\nthe control characters with ASCII values between 0 and 31 (except, of course,\ntab, newline and carriage return); failing to do this will result in an XML-RPC\nrequest that isn’t well-formed XML. If you have to pass arbitrary strings via\nXML-RPC, use the Binary wrapper class described below.
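Both behaviours can be observed with dumps() alone, no server required; the control-character filter below is one possible caller-side precaution, not part of the module:

```python
try:
    import xmlrpclib                    # Python 2 name
except ImportError:
    import xmlrpc.client as xmlrpclib  # renamed in Python 3

# Characters special to XML are escaped automatically by dumps().
xml = xmlrpclib.dumps(("a < b & c",))
assert "&lt;" in xml and "&amp;" in xml

# ...but control characters are NOT: it is up to the caller to drop them
# (or to use the Binary wrapper) before marshalling.
raw = "bad\x01data"
cleaned = "".join(ch for ch in raw if ord(ch) >= 32 or ch in "\t\n\r")
assert cleaned == "baddata"
```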
\nServer is retained as an alias for ServerProxy for backwards\ncompatibility. New code should use ServerProxy.
\n\nChanged in version 2.5: The use_datetime flag was added.
\n\nChanged in version 2.6: Instances of new-style classes can be passed in if they have an\n__dict__ attribute and don’t have a base class that is marshalled in a\nspecial way.
\nA ServerProxy instance has a method corresponding to each remote\nprocedure call accepted by the XML-RPC server. Calling the method performs an\nRPC, dispatched by both name and argument signature (e.g. the same method name\ncan be overloaded with multiple argument signatures). The RPC finishes by\nreturning a value, which may be either returned data in a conformant type or a\nFault or ProtocolError object indicating an error.
\nServers that support the XML introspection API support some common methods\ngrouped under the reserved system attribute:
\nThis method takes one parameter, the name of a method implemented by the XML-RPC\nserver. It returns an array of possible signatures for this method. A signature\nis an array of types. The first of these types is the return type of the method,\nthe rest are parameters.
\nBecause multiple signatures (i.e. overloading) are permitted, this method returns\na list of signatures rather than a singleton.
\nSignatures themselves are restricted to the top level parameters expected by a\nmethod. For instance if a method expects one array of structs as a parameter,\nand it returns a string, its signature is simply “string, array”. If it expects\nthree integers and returns a string, its signature is “string, int, int, int”.
\nIf no signature is defined for the method, a non-array value is returned. In\nPython this means that the type of the returned value will be something other\nthan list.
\nThis class may be initialized from any Python value; the instance returned\ndepends only on its truth value. It supports various Python operators through\n__cmp__(), __repr__(), __int__(), and __nonzero__()\nmethods, all implemented in the obvious ways.
\nIt also has the following method, supported mainly for internal use by the\nunmarshalling code:
\nA working example follows. The server code:
\nimport xmlrpclib\nfrom SimpleXMLRPCServer import SimpleXMLRPCServer\n\ndef is_even(n):\n    return n % 2 == 0\n\nserver = SimpleXMLRPCServer(("localhost", 8000))\nprint "Listening on port 8000..."\nserver.register_function(is_even, "is_even")\nserver.serve_forever()\n
The client code for the preceding server:
\nimport xmlrpclib\n\nproxy = xmlrpclib.ServerProxy("http://localhost:8000/")\nprint "3 is even: %s" % str(proxy.is_even(3))\nprint "100 is even: %s" % str(proxy.is_even(100))\n
This class may be initialized with seconds since the epoch, a time\ntuple, an ISO 8601 time/date string, or a datetime.datetime\ninstance. It has the following methods, supported mainly for internal\nuse by the marshalling/unmarshalling code:
\nIt also supports certain of Python’s built-in operators through __cmp__()\nand __repr__() methods.
\nA working example follows. The server code:
\nimport datetime\nfrom SimpleXMLRPCServer import SimpleXMLRPCServer\nimport xmlrpclib\n\ndef today():\n    today = datetime.datetime.today()\n    return xmlrpclib.DateTime(today)\n\nserver = SimpleXMLRPCServer(("localhost", 8000))\nprint "Listening on port 8000..."\nserver.register_function(today, "today")\nserver.serve_forever()\n
The client code for the preceding server:
\nimport xmlrpclib\nimport datetime\n\nproxy = xmlrpclib.ServerProxy("http://localhost:8000/")\n\ntoday = proxy.today()\n# convert the ISO8601 string to a datetime object\nconverted = datetime.datetime.strptime(today.value, "%Y%m%dT%H:%M:%S")\nprint "Today: %s" % converted.strftime("%d.%m.%Y, %H:%M")\n
This class may be initialized from string data (which may include NULs). The\nprimary access to the content of a Binary object is provided by an\nattribute:
\nBinary objects have the following methods, supported mainly for\ninternal use by the marshalling/unmarshalling code:
\nWrite the XML-RPC base 64 encoding of this binary item to the out stream object.
\nThe encoded data will have newlines every 76 characters as per\nRFC 2045 section 6.8,\nwhich was the de facto standard base64 specification when the\nXML-RPC spec was written.
\nIt also supports certain of Python’s built-in operators through a\n__cmp__() method.
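A short sketch of the round trip, assuming no server: bytes that may not appear literally in XML survive dumps() and loads() when wrapped in Binary (under Python 3 the module is xmlrpc.client):

```python
try:
    import xmlrpclib                    # Python 2 name
except ImportError:
    import xmlrpc.client as xmlrpclib  # renamed in Python 3

# NULs and other control bytes are illegal in XML text, but a Binary
# wrapper base64-encodes them, so they survive a marshalling round trip.
blob = xmlrpclib.Binary(b"\x00\x01\xffraw bytes")
params, _ = xmlrpclib.loads(xmlrpclib.dumps((blob,)))
assert params[0].data == b"\x00\x01\xffraw bytes"
```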
\nExample usage of the binary objects. We’re going to transfer an image over\nXMLRPC:
\nfrom SimpleXMLRPCServer import SimpleXMLRPCServer\nimport xmlrpclib\n\ndef python_logo():\n    with open("python_logo.jpg", "rb") as handle:\n        return xmlrpclib.Binary(handle.read())\n\nserver = SimpleXMLRPCServer(("localhost", 8000))\nprint "Listening on port 8000..."\nserver.register_function(python_logo, 'python_logo')\n\nserver.serve_forever()\n
The client gets the image and saves it to a file:
\nimport xmlrpclib\n\nproxy = xmlrpclib.ServerProxy("http://localhost:8000/")\nwith open("fetched_python_logo.jpg", "wb") as handle:\n    handle.write(proxy.python_logo().data)\n
A Fault object encapsulates the content of an XML-RPC fault tag. Fault\nobjects have the following attributes:
\nIn the following example we’re going to intentionally cause a Fault by\nreturning a complex type object. The server code:
\nfrom SimpleXMLRPCServer import SimpleXMLRPCServer\n\n# A marshalling error is going to occur because we're returning a\n# complex number\ndef add(x, y):\n    return x + y + 0j\n\nserver = SimpleXMLRPCServer(("localhost", 8000))\nprint "Listening on port 8000..."\nserver.register_function(add, 'add')\n\nserver.serve_forever()\n
The client code for the preceding server:
\nimport xmlrpclib\n\nproxy = xmlrpclib.ServerProxy("http://localhost:8000/")\ntry:\n    proxy.add(2, 5)\nexcept xmlrpclib.Fault, err:\n    print "A fault occurred"\n    print "Fault code: %d" % err.faultCode\n    print "Fault string: %s" % err.faultString\n
A ProtocolError object describes a protocol error in the underlying\ntransport layer (such as a 404 ‘not found’ error if the server named by the URI\ndoes not exist). It has the following attributes:
\nIn the following example we’re going to intentionally cause a ProtocolError\nby providing an URI that doesn’t point to an XMLRPC server:
\nimport xmlrpclib\n\n# create a ServerProxy with an URI that doesn't respond to XMLRPC requests\nproxy = xmlrpclib.ServerProxy("http://www.google.com/")\n\ntry:\n    proxy.some_method()\nexcept xmlrpclib.ProtocolError, err:\n    print "A protocol error occurred"\n    print "URL: %s" % err.url\n    print "HTTP/HTTPS headers: %s" % err.headers\n    print "Error code: %d" % err.errcode\n    print "Error message: %s" % err.errmsg\n
\nNew in version 2.4.
\nThe MultiCall object provides a way to encapsulate multiple calls to a\nremote server into a single request [1].
\nA usage example of this class follows. The server code
\nfrom SimpleXMLRPCServer import SimpleXMLRPCServer\n\ndef add(x, y):\n    return x + y\n\ndef subtract(x, y):\n    return x - y\n\ndef multiply(x, y):\n    return x * y\n\ndef divide(x, y):\n    return x / y\n\n# A simple server with simple arithmetic functions\nserver = SimpleXMLRPCServer(("localhost", 8000))\nprint "Listening on port 8000..."\nserver.register_multicall_functions()\nserver.register_function(add, 'add')\nserver.register_function(subtract, 'subtract')\nserver.register_function(multiply, 'multiply')\nserver.register_function(divide, 'divide')\nserver.serve_forever()\n
The client code for the preceding server:
\nimport xmlrpclib\n\nproxy = xmlrpclib.ServerProxy("http://localhost:8000/")\nmulticall = xmlrpclib.MultiCall(proxy)\nmulticall.add(7, 3)\nmulticall.subtract(7, 3)\nmulticall.multiply(7, 3)\nmulticall.divide(7, 3)\nresult = multicall()\n\nprint "7+3=%d, 7-3=%d, 7*3=%d, 7/3=%d" % tuple(result)\n
Convert an XML-RPC request or response into Python objects, a (params,\nmethodname) pair. params is a tuple of arguments; methodname is a string, or\nNone if no method name is present in the packet. If the XML-RPC packet\nrepresents a fault condition, this function will raise a Fault exception.\nThe use_datetime flag can be used to cause date/time values to be presented as\ndatetime.datetime objects; this is false by default.
\n\nChanged in version 2.5: The use_datetime flag was added.
\n# simple test program (from the XML-RPC specification)\nfrom xmlrpclib import ServerProxy, Error\n\n# server = ServerProxy("http://localhost:8000") # local server\nserver = ServerProxy("http://betty.userland.com")\n\nprint server\n\ntry:\n    print server.examples.getStateName(41)\nexcept Error, v:\n    print "ERROR", v\n
To access an XML-RPC server through a proxy, you need to define a custom\ntransport. The following example shows how:
\nimport xmlrpclib, httplib\n\nclass ProxiedTransport(xmlrpclib.Transport):\n    def set_proxy(self, proxy):\n        self.proxy = proxy\n    def make_connection(self, host):\n        self.realhost = host\n        h = httplib.HTTP(self.proxy)\n        return h\n    def send_request(self, connection, handler, request_body):\n        connection.putrequest("POST", 'http://%s%s' % (self.realhost, handler))\n    def send_host(self, connection, host):\n        connection.putheader('Host', self.realhost)\n\np = ProxiedTransport()\np.set_proxy('proxy-server:8080')\nserver = xmlrpclib.Server('http://time.xmlrpc.com/RPC2', transport=p)\nprint server.currentTime.getCurrentTime()\n
See SimpleXMLRPCServer Example.
\nFootnotes
\n[1] | This approach has been first presented in a discussion on xmlrpc.com. |
The audioop module contains some useful operations on sound fragments.\nIt operates on sound fragments consisting of signed integer samples 8, 16 or 32\nbits wide, stored in Python strings. This is the same format as used by the\nal and sunaudiodev modules. All scalar items are integers, unless\nspecified otherwise.
\nThis module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings.
\nA few of the more complicated operations only take 16-bit samples, otherwise the\nsample size (in bytes) is always a parameter of the operation.
\nThe module defines the following variables and functions:
\nConvert sound fragments in a-LAW encoding to linearly encoded sound fragments.\na-LAW encoding always uses 8-bit samples, so width refers only to the sample\nwidth of the output fragment here.
\n\nNew in version 2.5.
\nReturn a factor F such that rms(add(fragment, mul(reference, -F))) is\nminimal, i.e., return the factor with which you should multiply reference to\nmake it match as well as possible to fragment. The fragments should both\ncontain 2-byte samples.
\nThe time taken by this routine is proportional to len(fragment).
\nSearch fragment for a slice of length length samples (not bytes!) with\nmaximum energy, i.e., return i for which rms(fragment[i*2:(i+length)*2])\nis maximal. The fragments should both contain 2-byte samples.
\nThe routine takes time proportional to len(fragment).
\nConvert samples to 4 bit Intel/DVI ADPCM encoding. ADPCM coding is an adaptive\ncoding scheme, whereby each 4 bit number is the difference between one sample\nand the next, divided by a (varying) step. The Intel/DVI ADPCM algorithm has\nbeen selected for use by the IMA, so it may well become a standard.
\nstate is a tuple containing the state of the coder. The coder returns a tuple\n(adpcmfrag, newstate), and the newstate should be passed to the next call\nof lin2adpcm(). In the initial call, None can be passed as the state.\nadpcmfrag is the ADPCM coded fragment packed with two 4-bit values per byte.
\nConvert samples in the audio fragment to a-LAW encoding and return this as a\nPython string. a-LAW is an audio encoding format whereby you get a dynamic\nrange of about 13 bits using only 8 bit samples. It is used by the Sun audio\nhardware, among others.
\n\nNew in version 2.5.
\nConvert samples between 1-, 2- and 4-byte formats.
\nNote
\nIn some audio formats, such as .WAV files, 16 and 32 bit samples are\nsigned, but 8 bit samples are unsigned. So when converting to 8 bit wide\nsamples for these formats, you need to also add 128 to the result:
\nnew_frames = audioop.lin2lin(frames, old_width, 1)\nnew_frames = audioop.bias(new_frames, 1, 128)\n
The same, in reverse, has to be applied when converting from 8 to 16 or 32\nbit width samples.
\nConvert the frame rate of the input fragment.
\nstate is a tuple containing the state of the converter. The converter returns\na tuple (newfragment, newstate), and newstate should be passed to the next\ncall of ratecv(). The initial call should pass None as the state.
\nThe weightA and weightB arguments are parameters for a simple digital filter\nand default to 1 and 0 respectively.
\nReturn the root-mean-square of the fragment, i.e. sqrt(sum(S_i^2)/n).
\nThis is a measure of the power in an audio signal.
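The formula can be spelled out in pure Python for 2-byte samples. This is a sketch of the computation, not a byte-for-byte reimplementation of audioop.rms(); little-endian sample order is assumed here for determinism:

```python
import math
import struct

def rms16(fragment):
    # sqrt(sum(S_i**2) / n) over signed 16-bit samples -- the formula
    # audioop.rms() computes for width 2 (little-endian assumed here).
    n = len(fragment) // 2
    samples = struct.unpack("<%dh" % n, fragment)
    return int(math.sqrt(sum(s * s for s in samples) / float(n)))

# Four samples: squares sum to 500000, mean 125000, sqrt ~= 353.55
frag = struct.pack("<4h", 300, -400, 300, -400)
assert rms16(frag) == 353
```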
\nNote that operations such as mul() or max() make no distinction\nbetween mono and stereo fragments, i.e. all samples are treated equal. If this\nis a problem the stereo fragment should be split into two mono fragments first\nand recombined later. Here is an example of how to do that:
\ndef mul_stereo(sample, width, lfactor, rfactor):\n    lsample = audioop.tomono(sample, width, 1, 0)\n    rsample = audioop.tomono(sample, width, 0, 1)\n    lsample = audioop.mul(lsample, width, lfactor)\n    rsample = audioop.mul(rsample, width, rfactor)\n    lsample = audioop.tostereo(lsample, width, 1, 0)\n    rsample = audioop.tostereo(rsample, width, 0, 1)\n    return audioop.add(lsample, rsample, width)\n
If you use the ADPCM coder to build network packets and you want your protocol\nto be stateless (i.e. to be able to tolerate packet loss) you should not only\ntransmit the data but also the state. Note that you should send the initial\nstate (the one you passed to lin2adpcm()) along to the decoder, not the\nfinal state (as returned by the coder). If you want to use\nstruct.struct() to store the state in binary you can code the first\nelement (the predicted value) in 16 bits and the second (the delta index) in 8.
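A sketch of that state packing with the struct module; the concrete values and the big-endian byte order are illustrative choices, not part of the audioop API:

```python
import struct

# Pack a hypothetical ADPCM coder state as the text suggests: the
# predicted value in 16 bits (signed) and the delta index in 8 bits.
predicted, index = -1234, 17
packed = struct.pack(">hB", predicted, index)
assert len(packed) == 3
assert struct.unpack(">hB", packed) == (-1234, 17)
```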
\nThe ADPCM coders have never been tried against other ADPCM coders, only against\nthemselves. It could well be that I misinterpreted the standards in which case\nthey will not be interoperable with the respective standards.
\nThe find*() routines might look a bit funny at first sight. They are\nprimarily meant to do echo cancellation. A reasonably fast way to do this is to\npick the most energetic piece of the output sample, locate that in the input\nsample and subtract the whole output sample from the input sample:
\ndef echocancel(outputdata, inputdata):\n    pos = audioop.findmax(outputdata, 800)    # one tenth second\n    out_test = outputdata[pos*2:]\n    in_test = inputdata[pos*2:]\n    ipos, factor = audioop.findfit(in_test, out_test)\n    # Optional (for better cancellation):\n    # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],\n    #                             out_test)\n    prefill = '\\0'*(pos+ipos)*2\n    postfill = '\\0'*(len(inputdata)-len(prefill)-len(outputdata))\n    outputdata = prefill + audioop.mul(outputdata, 2, -factor) + postfill\n    return audioop.add(inputdata, outputdata, 2)\n
\nDeprecated since version 2.6: The imageop module has been removed in Python 3.0.
\nThe imageop module contains some useful operations on images. It operates\non images consisting of 8 or 32 bit pixels stored in Python strings. This is\nthe same format as used by gl.lrectwrite() and the imgfile module.
\nThe module defines the following variables and functions:
\nNote
\nThe Cookie module has been renamed to http.cookies in Python\n3.0. The 2to3 tool will automatically adapt imports when converting\nyour sources to 3.0.
\nSource code: Lib/Cookie.py
\nThe Cookie module defines classes for abstracting the concept of\ncookies, an HTTP state management mechanism. It supports both simple string-only\ncookies, and provides an abstraction for having any serializable data-type as\ncookie value.
\nThe module formerly strictly applied the parsing rules described in the\nRFC 2109 and RFC 2068 specifications. It has since been discovered that\nMSIE 3.0x doesn’t follow the character rules outlined in those specs. As a\nresult, the parsing rules used are a bit less strict.
\nNote
\nOn encountering an invalid cookie, CookieError is raised, so if your\ncookie data comes from a browser you should always prepare for invalid data\nand catch CookieError on parsing.
\nThis class is a dictionary-like object whose keys are strings and whose values\nare Morsel instances. Note that upon setting a key to a value, the\nvalue is first converted to a Morsel containing the key and the value.
\nIf input is given, it is passed to the load() method.
\nThis class derives from BaseCookie and overrides value_decode()\nand value_encode() to be pickle.loads() and\npickle.dumps() respectively.
\n\nDeprecated since version 2.3: Reading pickled values from untrusted cookie data is a huge security hole, as\npickle strings can be crafted to cause arbitrary code to execute on your server.\nIt is supported for backwards compatibility only, and may eventually go away.
\nThis class derives from BaseCookie. It overrides value_decode()\nto be pickle.loads() if it is a valid pickle, and otherwise the value\nitself. It overrides value_encode() to be pickle.dumps() unless it\nis a string, in which case it returns the value itself.
\n\nDeprecated since version 2.3: The same security warning from SerialCookie applies here.
\nA further security note is warranted. For backwards compatibility, the\nCookie module exports a class named Cookie which is just an\nalias for SmartCookie. This is probably a mistake and will likely be\nremoved in a future version. You should not use the Cookie class in\nyour applications, for the same reason why you should not use the\nSerialCookie class.
\n\nReturn an encoded value. val can be any type, but the return value must be a\nstring. This method does nothing in BaseCookie — it exists so it can\nbe overridden.
\nIn general, it should be the case that value_encode() and\nvalue_decode() are inverses on the range of value_decode.
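A hypothetical BaseCookie subclass illustrating such an inverse pair; IntCookie and its integer encoding are invented for this sketch, and the rename to http.cookies in Python 3 is allowed for:

```python
try:
    import Cookie as cookies            # Python 2 name
except ImportError:
    import http.cookies as cookies      # renamed in Python 3

class IntCookie(cookies.BaseCookie):
    # Store integer values: value_decode() and value_encode() are
    # inverses of each other, each returning (real_value, coded_value).
    def value_decode(self, val):
        return int(val), val
    def value_encode(self, val):
        return val, str(int(val))

C = IntCookie()
C["count"] = 7              # encoded on the way in...
assert C["count"].value == 7

D = IntCookie()
D.load("count=7")           # ...and decoded on the way back out
assert D["count"].value == 7
```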
\nReturn a string representation suitable to be sent as HTTP headers. attrs and\nheader are sent to each Morsel‘s output() method. sep is used\nto join the headers together, and is by default the combination '\\r\\n'\n(CRLF).
\n\nChanged in version 2.5: The default separator has been changed from '\\n' to match the cookie\nspecification.
\nAbstract a key/value pair, which has some RFC 2109 attributes.
\nMorsels are dictionary-like objects, whose set of keys is constant — the valid\nRFC 2109 attributes, which are
\nThe attribute httponly specifies that the cookie is only transferred\nin HTTP requests, and is not accessible through JavaScript. This is intended\nto mitigate some forms of cross-site scripting.
\nThe keys are case-insensitive.
\n\nNew in version 2.6: The httponly attribute was added.
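For example (the cookie name and value below are made up; under Python 3 the module is http.cookies):

```python
try:
    import Cookie as cookies            # Python 2 name
except ImportError:
    import http.cookies as cookies      # renamed in Python 3

# Setting the httponly cookie-attribute adds "HttpOnly" to the header,
# telling browsers to hide the cookie from JavaScript.
C = cookies.SimpleCookie()
C["session"] = "abc123"                 # example name/value
C["session"]["httponly"] = True
header = C.output()
assert "session=abc123" in header
assert "HttpOnly" in header
```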
\nThe following example demonstrates how to use the Cookie module.
\n>>> import Cookie\n>>> C = Cookie.SimpleCookie()\n>>> C["fig"] = "newton"\n>>> C["sugar"] = "wafer"\n>>> print C # generate HTTP headers\nSet-Cookie: fig=newton\nSet-Cookie: sugar=wafer\n>>> print C.output() # same thing\nSet-Cookie: fig=newton\nSet-Cookie: sugar=wafer\n>>> C = Cookie.SimpleCookie()\n>>> C["rocky"] = "road"\n>>> C["rocky"]["path"] = "/cookie"\n>>> print C.output(header="Cookie:")\nCookie: rocky=road; Path=/cookie\n>>> print C.output(attrs=[], header="Cookie:")\nCookie: rocky=road\n>>> C = Cookie.SimpleCookie()\n>>> C.load("chips=ahoy; vienna=finger") # load from a string (HTTP header)\n>>> print C\nSet-Cookie: chips=ahoy\nSet-Cookie: vienna=finger\n>>> C = Cookie.SimpleCookie()\n>>> C.load('keebler="E=everybody; L=\\\\"Loves\\\\"; fudge=\\\\012;";')\n>>> print C\nSet-Cookie: keebler="E=everybody; L=\\"Loves\\"; fudge=\\012;"\n>>> C = Cookie.SimpleCookie()\n>>> C["oreo"] = "doublestuff"\n>>> C["oreo"]["path"] = "/"\n>>> print C\nSet-Cookie: oreo=doublestuff; Path=/\n>>> C["twix"] = "none for you"\n>>> C["twix"].value\n'none for you'\n>>> C = Cookie.SimpleCookie()\n>>> C["number"] = 7 # equivalent to C["number"] = str(7)\n>>> C["string"] = "seven"\n>>> C["number"].value\n'7'\n>>> C["string"].value\n'seven'\n>>> print C\nSet-Cookie: number=7\nSet-Cookie: string=seven\n>>> # SerialCookie and SmartCookie are deprecated\n>>> # using it can cause security loopholes in your code.\n>>> C = Cookie.SerialCookie()\n>>> C["number"] = 7\n>>> C["string"] = "seven"\n>>> C["number"].value\n7\n>>> C["string"].value\n'seven'\n>>> print C\nSet-Cookie: number="I7\\012."\nSet-Cookie: string="S'seven'\\012p1\\012."\n>>> C = Cookie.SmartCookie()\n>>> C["number"] = 7\n>>> C["string"] = "seven"\n>>> C["number"].value\n7\n>>> C["string"].value\n'seven'\n>>> print C\nSet-Cookie: number="I7\\012."\nSet-Cookie: string=seven\n
Source code: Lib/sunau.py
\nThe sunau module provides a convenient interface to the Sun AU sound\nformat. Note that this module is interface-compatible with the modules\naifc and wave.
\nAn audio file consists of a header followed by the data. The fields of the\nheader are:
\nField | \nContents | \n
---|---|
magic word | \nThe four bytes .snd. | \n
header size | \nSize of the header, including info, in bytes. | \n
data size | \nPhysical size of the data, in bytes. | \n
encoding | \nIndicates how the audio samples are encoded. | \n
sample rate | \nThe sampling rate. | \n
# of channels | \nThe number of channels in the samples. | \n
info | \nASCII string giving a description of the\naudio file (padded with null bytes). | \n
Apart from the info field, all header fields are 4 bytes in size. They are all\n32-bit unsigned integers encoded in big-endian byte order.
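The fixed-size part of the header can be built and decoded with the struct module; the concrete values below (u-LAW encoding, 8000 Hz, mono, no data) are illustrative:

```python
import struct

# A minimal Sun AU header by hand: the ".snd" magic word followed by
# five 32-bit unsigned big-endian fields (header size, data size,
# encoding, sample rate, channels).  Encoding 1 is 8-bit u-LAW.
header = struct.pack(">4s5I", b".snd", 24, 0, 1, 8000, 1)
magic, hdr_size, data_size, encoding, rate, channels = struct.unpack(">4s5I", header)
assert magic == b".snd"
assert (encoding, rate, channels) == (1, 8000, 1)
```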
\nThe sunau module defines the following functions:
\nIf file is a string, open the file by that name, otherwise treat it as a\nseekable file-like object. mode can be any of
\nNote that it does not allow read/write files.
\nA mode of 'r' returns an AU_read object, while a mode of 'w'\nor 'wb' returns an AU_write object.
\nThe sunau module defines the following exception:
\nThe sunau module defines the following data items:
\nAU_read objects, as returned by open() above, have the following methods:
\nThe following two methods define a term “position” which is compatible between\nthem, and is otherwise implementation dependent.
\nThe following two functions are defined for compatibility with the aifc\nmodule, and don’t do anything interesting.
\nAU_write objects, as returned by open() above, have the following methods:
\nMake sure nframes is correct, and close the file.
\nThis method is called upon deletion.
\nNote that it is invalid to set any parameters after calling writeframes()\nor writeframesraw().
\nSource code: Lib/aifc.py
\nThis module provides support for reading and writing AIFF and AIFF-C files.\nAIFF is Audio Interchange File Format, a format for storing digital audio\nsamples in a file. AIFF-C is a newer version of the format that includes the\nability to compress the audio data.
\nNote
\nSome operations may only work under IRIX; these will raise ImportError\nwhen attempting to import the cl module, which is only available on\nIRIX.
\nAudio files have a number of parameters that describe the audio data. The\nsampling rate or frame rate is the number of times per second the sound is\nsampled. The number of channels indicates if the audio is mono, stereo, or\nquadro. Each frame consists of one sample per channel. The sample size is the\nsize in bytes of each sample. Thus a frame consists of\nnchannels * samplesize bytes, and a second’s worth of audio consists of\nnchannels * samplesize * framerate bytes.
\nFor example, CD quality audio has a sample size of two bytes (16 bits), uses two\nchannels (stereo) and has a frame rate of 44,100 frames/second. This gives a\nframe size of 4 bytes (2*2), and a second’s worth occupies 2*2*44100 bytes\n(176,400 bytes).
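The arithmetic, spelled out:

```python
# The CD-quality numbers from the text: 16-bit samples, stereo, 44100 Hz.
nchannels, samplesize, framerate = 2, 2, 44100
framesize = nchannels * samplesize
assert framesize == 4                    # bytes per frame
assert framesize * framerate == 176400   # bytes per second of audio
```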
\nModule aifc defines the following function:
\nObjects returned by open() when a file is opened for reading have the\nfollowing methods:
\nObjects returned by open() when a file is opened for writing have all the\nabove methods, except for readframes() and setpos(). In addition\nthe following methods exist. The get*() methods can only be called after\nthe corresponding set*() methods have been called. Before the first\nwriteframes() or writeframesraw(), all parameters except for the\nnumber of frames must be filled in.
\nSpecify the compression type. If not specified, the audio data will not be\ncompressed. In AIFF files, compression is not possible. The name parameter\nshould be a human-readable description of the compression type, the type\nparameter should be a four-character string. Currently the following\ncompression types are supported: NONE, ULAW, ALAW, G722.
\nThis module provides an interface for reading files that use EA IFF 85 chunks.\n[1] This format is used in at least the Audio Interchange File Format\n(AIFF/AIFF-C) and the Real Media File Format (RMFF). The WAVE audio file format\nis closely related and can also be read using this module.
\nA chunk has the following structure:
\nOffset | \nLength | \nContents | \n
---|---|---|
0 | \n4 | \nChunk ID | \n
4 | \n4 | \nSize of chunk in big-endian\nbyte order, not including the\nheader | \n
8 | \nn | \nData bytes, where n is the\nsize given in the preceding\nfield | \n
8 + n | \n0 or 1 | \nPad byte needed if n is odd\nand chunk alignment is used | \n
The ID is a 4-byte string which identifies the type of chunk.
\nThe size field (a 32-bit value, encoded using big-endian byte order) gives the\nsize of the chunk data, not including the 8-byte header.
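The layout can be reproduced with the struct module; the chunk ID and data below are invented for illustration:

```python
import struct

# Assemble a hypothetical EA IFF 85 chunk by hand: 4-byte ID,
# big-endian 32-bit size (header excluded), the data bytes, and a
# pad byte because the data length is odd.
data = b"hello"
chunk_bytes = struct.pack(">4sI", b"DEMO", len(data)) + data + b"\x00"
chunk_id, size = struct.unpack(">4sI", chunk_bytes[:8])
assert chunk_id == b"DEMO" and size == 5
assert chunk_bytes[8:8 + size] == b"hello"
assert len(chunk_bytes) % 2 == 0         # pad byte restores 2-byte alignment
```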
\nUsually an IFF-type file consists of one or more chunks. The proposed usage of\nthe Chunk class defined here is to instantiate an instance at the start\nof each chunk and read from the instance until it reaches the end, after which a\nnew instance can be instantiated. At the end of the file, creating a new\ninstance will fail with an EOFError exception.
\nClass which represents a chunk. The file argument is expected to be a\nfile-like object. An instance of this class is specifically allowed. The\nonly method that is needed is read(). If the methods seek() and\ntell() are present and don’t raise an exception, they are also used.\nIf these methods are present and raise an exception, they are expected to not\nhave altered the object. If the optional argument align is true, chunks\nare assumed to be aligned on 2-byte boundaries. If align is false, no\nalignment is assumed. The default value is true. If the optional argument\nbigendian is false, the chunk size is assumed to be in little-endian order.\nThis is needed for WAVE audio files. The default value is true. If the\noptional argument inclheader is true, the size given in the chunk header\nincludes the size of the header. The default value is false.
\nA Chunk object supports the following methods:
\nThe remaining methods will raise IOError if called after the\nclose() method has been called.
\nFootnotes
\n[1] | “EA IFF 85” Standard for Interchange Format Files, Jerry Morrison, Electronic\nArts, January 1985. |
Source code: Lib/wave.py
\nThe wave module provides a convenient interface to the WAV sound format.\nIt does not support compression/decompression, but it does support mono/stereo.
\nThe wave module defines the following function and exception:
\nIf file is a string, open the file by that name, otherwise treat it as a\nseekable file-like object. mode can be any of
\nNote that it does not allow read/write WAV files.
\nA mode of 'r' or 'rb' returns a Wave_read object, while a\nmode of 'w' or 'wb' returns a Wave_write object. If\nmode is omitted and a file-like object is passed as file, file.mode\nis used as the default value for mode (the 'b' flag is still added if\nnecessary).
\nIf you pass in a file-like object, the wave object will not close it when its\nclose() method is called; it is the caller’s responsibility to close\nthe file object.
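For example, a WAV file can be written to and read back from an in-memory io.BytesIO object; the parameter values here (16-bit mono at 8000 Hz, 100 frames of silence) are arbitrary:

```python
import io
import wave

buf = io.BytesIO()
w = wave.open(buf, "wb")
w.setnchannels(1)          # mono
w.setsampwidth(2)          # 16-bit samples
w.setframerate(8000)
w.writeframes(b"\x00\x00" * 100)   # 100 frames of silence
w.close()                  # patches the header; buf itself stays open

buf.seek(0)
r = wave.open(buf, "rb")
info = (r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
r.close()
assert info == (1, 2, 8000, 100)
```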
\nWave_read objects, as returned by open(), have the following methods:
\nThe following two methods are defined for compatibility with the aifc\nmodule, and don’t do anything interesting.
\nThe following two methods define a term “position” which is compatible between\nthem, and is otherwise implementation dependent.
\nWave_write objects, as returned by open(), have the following methods:
\nNote that it is invalid to set any parameters after calling writeframes()\nor writeframesraw(), and any attempt to do so will raise\nwave.Error.
\nSource code: Lib/colorsys.py
\nThe colorsys module defines bidirectional conversions of color values\nbetween colors expressed in the RGB (Red Green Blue) color space used in\ncomputer monitors and three other coordinate systems: YIQ, HLS (Hue Lightness\nSaturation) and HSV (Hue Saturation Value). Coordinates in all of these color\nspaces are floating point values. In the YIQ space, the Y coordinate is between\n0 and 1, but the I and Q coordinates can be positive or negative. In all other\nspaces, the coordinates are all between 0 and 1.
\nSee also
\nMore information about color spaces can be found at\nhttp://www.poynton.com/ColorFAQ.html and\nhttp://www.cambridgeincolour.com/tutorials/color-spaces.htm.
\nThe colorsys module defines the following functions:
\nExample:
\n>>> import colorsys\n>>> colorsys.rgb_to_hsv(.3, .4, .2)\n(0.25, 0.5, 0.4)\n>>> colorsys.hsv_to_rgb(0.25, 0.5, 0.4)\n(0.3, 0.4, 0.2)\n
Source code: Lib/imghdr.py
\nThe imghdr module determines the type of image contained in a file or\nbyte stream.
\nThe imghdr module defines the following function:
\nThe following image types are recognized, as listed below with the return value\nfrom what():
\nValue | \nImage format | \n
---|---|
'rgb' | \nSGI ImgLib Files | \n
'gif' | \nGIF 87a and 89a Files | \n
'pbm' | \nPortable Bitmap Files | \n
'pgm' | \nPortable Graymap Files | \n
'ppm' | \nPortable Pixmap Files | \n
'tiff' | \nTIFF Files | \n
'rast' | \nSun Raster Files | \n
'xbm' | \nX Bitmap Files | \n
'jpeg' | \nJPEG data in JFIF or Exif formats | \n
'bmp' | \nBMP files | \n
'png' | \nPortable Network Graphics | \n
\nNew in version 2.5: Exif detection.
\nYou can extend the list of file types imghdr can recognize by appending\nto this variable:
\nA list of functions performing the individual tests. Each function takes two\narguments: the byte-stream and an open file-like object. When what() is\ncalled with a byte-stream, the file-like object will be None.
\nThe test function should return a string describing the image type if the test\nsucceeded, or None if it failed.
\nExample:
\n>>> import imghdr\n>>> imghdr.what('/tmp/bass.gif')\n'gif'\n
Note
\nThe cookielib module has been renamed to http.cookiejar in\nPython 3.0. The 2to3 tool will automatically adapt imports when\nconverting your sources to 3.0.
\n\nNew in version 2.4.
\nSource code: Lib/cookielib.py
\nThe cookielib module defines classes for automatic handling of HTTP\ncookies. It is useful for accessing web sites that require small pieces of data\n– cookies – to be set on the client machine by an HTTP response from a\nweb server, and then returned to the server in later HTTP requests.
\nBoth the regular Netscape cookie protocol and the protocol defined by\nRFC 2965 are handled. RFC 2965 handling is switched off by default.\nRFC 2109 cookies are parsed as Netscape cookies and subsequently treated\neither as Netscape or RFC 2965 cookies according to the ‘policy’ in effect.\nNote that the great majority of cookies on the Internet are Netscape cookies.\ncookielib attempts to follow the de-facto Netscape cookie protocol (which\ndiffers substantially from that set out in the original Netscape specification),\nincluding taking note of the max-age and port cookie-attributes\nintroduced with RFC 2965.
\nNote
\nThe various named parameters found in Set-Cookie and\nSet-Cookie2 headers (eg. domain and expires) are\nconventionally referred to as attributes. To distinguish them from\nPython attributes, the documentation for this module uses the term\ncookie-attribute instead.
\nThe module defines the following exception:
\nInstances of FileCookieJar raise this exception on failure to load\ncookies from a file.
\n\nThe following classes are provided:
\npolicy is an object implementing the CookiePolicy interface.
\nThe CookieJar class stores HTTP cookies. It extracts cookies from HTTP\nrequests, and returns them in HTTP responses. CookieJar instances\nautomatically expire contained cookies when necessary. Subclasses are also\nresponsible for storing and retrieving cookies from a file or database.
\npolicy is an object implementing the CookiePolicy interface. For the\nother arguments, see the documentation for the corresponding attributes.
\nA CookieJar which can load cookies from, and perhaps save cookies to, a\nfile on disk. Cookies are NOT loaded from the named file until either the\nload() or revert() method is called. Subclasses of this class are\ndocumented in section FileCookieJar subclasses and co-operation with web browsers.
\nConstructor arguments should be passed as keyword arguments only.\nblocked_domains is a sequence of domain names that we never accept cookies\nfrom, nor return cookies to. allowed_domains if not None, this is a\nsequence of the only domains for which we accept and return cookies. For all\nother arguments, see the documentation for CookiePolicy and\nDefaultCookiePolicy objects.
\nDefaultCookiePolicy implements the standard accept / reject rules for\nNetscape and RFC 2965 cookies. By default, RFC 2109 cookies (ie. cookies\nreceived in a Set-Cookie header with a version cookie-attribute of\n1) are treated according to the RFC 2965 rules. However, if RFC 2965 handling\nis turned off or rfc2109_as_netscape is True, RFC 2109 cookies are\n‘downgraded’ by the CookieJar instance to Netscape cookies, by\nsetting the version attribute of the Cookie instance to 0.\nDefaultCookiePolicy also provides some parameters to allow some\nfine-tuning of policy.
\nSee also
\nRFC 2964 - Use of HTTP State Management
\nCookieJar objects support the iterator protocol for iterating over\ncontained Cookie objects.
\nCookieJar has the following methods:
\nAdd correct Cookie header to request.
\nIf policy allows (ie. the rfc2965 and hide_cookie2 attributes of\nthe CookieJar‘s CookiePolicy instance are true and false\nrespectively), the Cookie2 header is also added when appropriate.
The request object (usually a urllib2.Request instance) must support the methods get_full_url(), get_host(), get_type(), unverifiable(), get_origin_req_host(), has_header(), get_header(), header_items(), and add_unredirected_header(), as documented by urllib2.
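That interface can be sketched as a minimal duck-typed class (illustrative only; in practice you would simply pass a urllib2.Request, which already provides all of these methods):

```python
class MinimalRequest:
    # Sketch of the duck-typed request interface described above.
    def __init__(self, url, host):
        self._url = url
        self._host = host
        self._headers = {}
    def get_full_url(self): return self._url
    def get_host(self): return self._host
    def get_type(self): return self._url.split(':', 1)[0]
    def unverifiable(self): return False
    def get_origin_req_host(self): return self._host
    def has_header(self, name): return name in self._headers
    def get_header(self, name, default=None):
        return self._headers.get(name, default)
    def header_items(self): return list(self._headers.items())
    def add_unredirected_header(self, name, value):
        self._headers[name] = value
```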
\nExtract cookies from HTTP response and store them in the CookieJar,\nwhere allowed by policy.
\nThe CookieJar will look for allowable Set-Cookie and\nSet-Cookie2 headers in the response argument, and store cookies\nas appropriate (subject to the CookiePolicy.set_ok() method’s approval).
\nThe response object (usually the result of a call to urllib2.urlopen(),\nor similar) should support an info() method, which returns an object with\na getallmatchingheaders() method (usually a mimetools.Message\ninstance).
\nThe request object (usually a urllib2.Request instance) must support\nthe methods get_full_url(), get_host(), unverifiable(), and\nget_origin_req_host(), as documented by urllib2. The request is\nused to set default values for cookie-attributes as well as for checking that\nthe cookie is allowed to be set.
\nReturn sequence of Cookie objects extracted from response object.
\nSee the documentation for extract_cookies() for the interfaces required of\nthe response and request arguments.
\nClear some cookies.
\nIf invoked without arguments, clear all cookies. If given a single argument,\nonly cookies belonging to that domain will be removed. If given two arguments,\ncookies belonging to the specified domain and URL path are removed. If\ngiven three arguments, then the cookie with the specified domain, path and\nname is removed.
\nRaises KeyError if no matching cookie exists.
\nDiscard all session cookies.
\nDiscards all contained cookies that have a true discard attribute\n(usually because they had either no max-age or expires cookie-attribute,\nor an explicit discard cookie-attribute). For interactive browsers, the end\nof a session usually corresponds to closing the browser window.
\nNote that the save() method won’t save session cookies anyway, unless you\nask otherwise by passing a true ignore_discard argument.
\nFileCookieJar implements the following additional methods:
\nSave cookies to a file.
\nThis base class raises NotImplementedError. Subclasses may leave this\nmethod unimplemented.
\nfilename is the name of file in which to save cookies. If filename is not\nspecified, self.filename is used (whose default is the value passed to\nthe constructor, if any); if self.filename is None,\nValueError is raised.
If ignore_discard is true, save even cookies set to be discarded; if ignore_expires is true, save even cookies that have expired.
\nThe file is overwritten if it already exists, thus wiping all the cookies it\ncontains. Saved cookies can be restored later using the load() or\nrevert() methods.
\nLoad cookies from a file.
\nOld cookies are kept unless overwritten by newly loaded ones.
\nArguments are as for save().
\nThe named file must be in the format understood by the class, or\nLoadError will be raised. Also, IOError may be raised, for\nexample if the file does not exist.
\n\nClear all cookies and reload cookies from a saved file.
\nrevert() can raise the same exceptions as load(). If there is a\nfailure, the object’s state will not be altered.
\nFileCookieJar instances have the following public attributes:
The following CookieJar subclasses are provided for reading and writing.
\nA FileCookieJar that can load from and save cookies to disk in the\nMozilla cookies.txt file format (which is also used by the Lynx and Netscape\nbrowsers).
\nNote
\nVersion 3 of the Firefox web browser no longer writes cookies in the\ncookies.txt file format.
\nNote
\nThis loses information about RFC 2965 cookies, and also about newer or\nnon-standard cookie-attributes such as port.
\nWarning
\nBack up your cookies before saving if you have cookies whose loss / corruption\nwould be inconvenient (there are some subtleties which may lead to slight\nchanges in the file over a load / save round-trip).
\nAlso note that cookies saved while Mozilla is running will get clobbered by\nMozilla.
\nObjects implementing the CookiePolicy interface have the following\nmethods:
\nReturn boolean value indicating whether cookie should be accepted from server.
\ncookie is a cookielib.Cookie instance. request is an object\nimplementing the interface defined by the documentation for\nCookieJar.extract_cookies().
\nReturn boolean value indicating whether cookie should be returned to server.
\ncookie is a cookielib.Cookie instance. request is an object\nimplementing the interface defined by the documentation for\nCookieJar.add_cookie_header().
\nReturn false if cookies should not be returned, given cookie domain.
\nThis method is an optimization. It removes the need for checking every cookie\nwith a particular domain (which might involve reading many files). Returning\ntrue from domain_return_ok() and path_return_ok() leaves all the\nwork to return_ok().
\nIf domain_return_ok() returns true for the cookie domain,\npath_return_ok() is called for the cookie path. Otherwise,\npath_return_ok() and return_ok() are never called for that cookie\ndomain. If path_return_ok() returns true, return_ok() is called\nwith the Cookie object itself for a full check. Otherwise,\nreturn_ok() is never called for that cookie path.
\nNote that domain_return_ok() is called for every cookie domain, not just\nfor the request domain. For example, the function might be called with both\n".example.com" and "www.example.com" if the request domain is\n"www.example.com". The same goes for path_return_ok().
\nThe request argument is as documented for return_ok().
\nReturn false if cookies should not be returned, given cookie path.
\nSee the documentation for domain_return_ok().
\nIn addition to implementing the methods above, implementations of the\nCookiePolicy interface must also supply the following attributes,\nindicating which protocols should be used, and how. All of these attributes may\nbe assigned to.
\nThe most useful way to define a CookiePolicy class is by subclassing\nfrom DefaultCookiePolicy and overriding some or all of the methods\nabove. CookiePolicy itself may be used as a ‘null policy’ to allow\nsetting and receiving any and all cookies (this is unlikely to be useful).
\nImplements the standard rules for accepting and returning cookies.
\nBoth RFC 2965 and Netscape cookies are covered. RFC 2965 handling is switched\noff by default.
\nThe easiest way to provide your own policy is to override this class and call\nits methods in your overridden implementations before adding your own additional\nchecks:
\nimport cookielib\nclass MyCookiePolicy(cookielib.DefaultCookiePolicy):\n def set_ok(self, cookie, request):\n if not cookielib.DefaultCookiePolicy.set_ok(self, cookie, request):\n return False\n if i_dont_want_to_store_this_cookie(cookie):\n return False\n return True\n
In addition to the features required to implement the CookiePolicy\ninterface, this class allows you to block and allow domains from setting and\nreceiving cookies. There are also some strictness switches that allow you to\ntighten up the rather loose Netscape protocol rules a little bit (at the cost of\nblocking some benign cookies).
A domain blacklist and a whitelist are provided (both off by default). Only domains not in the blacklist and present in the whitelist (if the whitelist is active) participate in cookie setting and returning. Use the blocked_domains constructor argument, and blocked_domains() and set_blocked_domains() methods (and the corresponding argument and methods for allowed_domains). If you set a whitelist, you can turn it off again by setting it to None.
\nDomains in block or allow lists that do not start with a dot must equal the\ncookie domain to be matched. For example, "example.com" matches a blacklist\nentry of "example.com", but "www.example.com" does not. Domains that do\nstart with a dot are matched by more specific domains too. For example, both\n"www.example.com" and "www.coyote.example.com" match ".example.com"\n(but "example.com" itself does not). IP addresses are an exception, and\nmust match exactly. For example, if blocked_domains contains "192.168.1.2"\nand ".168.1.2", 192.168.1.2 is blocked, but 193.168.1.2 is not.
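The leading-dot rule can be sketched as follows (an illustrative re-implementation of the description above, ignoring the IP-address exception; this is not cookielib's actual matching code):

```python
def in_block_list(blocked_domains, cookie_domain):
    # Sketch of the matching rule described above.
    for entry in blocked_domains:
        if entry.startswith('.'):
            # A leading-dot entry matches more specific domains only,
            # never the bare domain itself.
            if cookie_domain.endswith(entry):
                return True
        elif cookie_domain == entry:
            return True
    return False
```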
\nDefaultCookiePolicy implements the following additional methods:
\nDefaultCookiePolicy instances have the following attributes, which are\nall initialised from the constructor arguments of the same name, and which may\nall be assigned to.
\nIf true, request that the CookieJar instance downgrade RFC 2109 cookies\n(ie. cookies received in a Set-Cookie header with a version\ncookie-attribute of 1) to Netscape cookies by setting the version attribute of\nthe Cookie instance to 0. The default value is None, in which\ncase RFC 2109 cookies are downgraded if and only if RFC 2965 handling is turned\noff. Therefore, RFC 2109 cookies are downgraded by default.
\n\nNew in version 2.5.
\nGeneral strictness switches:
\nRFC 2965 protocol strictness switches:
\nNetscape protocol strictness switches:
\nstrict_ns_domain is a collection of flags. Its value is constructed by\nor-ing together (for example, DomainStrictNoDots|DomainStrictNonDomain means\nboth flags are set).
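Flag composition works like any bit-mask. A minimal sketch (the numeric values here are hypothetical stand-ins; the real constants are attributes of DefaultCookiePolicy):

```python
# Hypothetical flag values, for illustration only:
DomainStrictNoDots = 1
DomainStrictNonDomain = 2

# Combine flags by or-ing them together:
strict_ns_domain = DomainStrictNoDots | DomainStrictNonDomain

# Test whether a particular flag is set with a bitwise and:
no_dots_set = bool(strict_ns_domain & DomainStrictNoDots)
```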
\nThe following attributes are provided for convenience, and are the most useful\ncombinations of the above flags:
\nCookie instances have Python attributes roughly corresponding to the\nstandard cookie-attributes specified in the various cookie standards. The\ncorrespondence is not one-to-one, because there are complicated rules for\nassigning default values, because the max-age and expires\ncookie-attributes contain equivalent information, and because RFC 2109 cookies\nmay be ‘downgraded’ by cookielib from version 1 to version 0 (Netscape)\ncookies.
\nAssignment to these attributes should not be necessary other than in rare\ncircumstances in a CookiePolicy method. The class does not enforce\ninternal consistency, so you should know what you’re doing if you do that.
\nTrue if this cookie was received as an RFC 2109 cookie (ie. the cookie\narrived in a Set-Cookie header, and the value of the Version\ncookie-attribute in that header was 1). This attribute is provided because\ncookielib may ‘downgrade’ RFC 2109 cookies to Netscape cookies, in\nwhich case version is 0.
\n\nNew in version 2.5.
\nCookies may have additional non-standard cookie-attributes. These may be\naccessed using the following methods:
\nThe Cookie class also defines the following method:
\nThe first example shows the most common usage of cookielib:
\nimport cookielib, urllib2\ncj = cookielib.CookieJar()\nopener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))\nr = opener.open("http://example.com/")\n
This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx\ncookies (assumes Unix/Netscape convention for location of the cookies file):
\nimport os, cookielib, urllib2\ncj = cookielib.MozillaCookieJar()\ncj.load(os.path.join(os.path.expanduser("~"), ".netscape", "cookies.txt"))\nopener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))\nr = opener.open("http://example.com/")\n
The next example illustrates the use of DefaultCookiePolicy. Turn on\nRFC 2965 cookies, be more strict about domains when setting and returning\nNetscape cookies, and block some domains from setting cookies or having them\nreturned:
\nimport urllib2\nfrom cookielib import CookieJar, DefaultCookiePolicy\npolicy = DefaultCookiePolicy(\n rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,\n blocked_domains=["ads.net", ".ads.net"])\ncj = CookieJar(policy)\nopener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))\nr = opener.open("http://example.com/")\n
Source code: Lib/sndhdr.py
The sndhdr module provides utility functions which attempt to determine the type of sound data which is in a file. When these functions are able to determine what type of sound data is stored in a file, they return a tuple (type, sampling_rate, channels, frames, bits_per_sample). The value for type indicates the data type and will be one of the strings 'aifc', 'aiff', 'au', 'hcom', 'sndr', 'sndt', 'voc', 'wav', '8svx', 'sb', 'ub', or 'ul'. The sampling_rate will be either the actual value or 0 if unknown or difficult to decode. Similarly, channels will be either the number of channels or 0 if it cannot be determined or if the value is difficult to decode. The value for frames will be either the number of frames or -1. The last item in the tuple, bits_per_sample, will either be the sample size in bits or 'A' for A-LAW or 'U' for u-LAW.
\nSource code: Lib/gettext.py
\nThe gettext module provides internationalization (I18N) and localization\n(L10N) services for your Python modules and applications. It supports both the\nGNU gettext message catalog API and a higher level, class-based API that may\nbe more appropriate for Python files. The interface described below allows you\nto write your module and application messages in one natural language, and\nprovide a catalog of translated messages for running under different natural\nlanguages.
\nSome hints on localizing your Python modules and applications are also given.
\nThe gettext module defines the following API, which is very similar to\nthe GNU gettext API. If you use this API you will affect the\ntranslation of your entire application globally. Often this is what you want if\nyour application is monolingual, with the choice of language dependent on the\nlocale of your user. If you are localizing a Python module, or if your\napplication needs to switch languages on the fly, you probably want to use the\nclass-based API instead.
Bind the domain to the locale directory localedir. More concretely, gettext will look for binary .mo files for the given domain using the path (on Unix): localedir/language/LC_MESSAGES/domain.mo, where language is searched for in the environment variables LANGUAGE, LC_ALL, LC_MESSAGES, and LANG respectively.
\nIf localedir is omitted or None, then the current binding for domain is\nreturned. [1]
\nBind the domain to codeset, changing the encoding of strings returned by the\ngettext() family of functions. If codeset is omitted, then the current\nbinding is returned.
\n\nNew in version 2.4.
\nEquivalent to gettext(), but the translation is returned in the preferred\nsystem encoding, if no other encoding was explicitly set with\nbind_textdomain_codeset().
\n\nNew in version 2.4.
\nEquivalent to dgettext(), but the translation is returned in the preferred\nsystem encoding, if no other encoding was explicitly set with\nbind_textdomain_codeset().
\n\nNew in version 2.4.
\nLike gettext(), but consider plural forms. If a translation is found,\napply the plural formula to n, and return the resulting message (some\nlanguages have more than two plural forms). If no translation is found, return\nsingular if n is 1; return plural otherwise.
\nThe Plural formula is taken from the catalog header. It is a C or Python\nexpression that has a free variable n; the expression evaluates to the index\nof the plural in the catalog. See the GNU gettext documentation for the precise\nsyntax to be used in .po files and the formulas for a variety of\nlanguages.
\n\nNew in version 2.3.
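For example, a catalog for a Germanic language typically carries a header line such as Plural-Forms: nplurals=2; plural=(n != 1); (this header is a common convention, not taken from any particular catalog). Rewritten as a Python function, that expression maps n to the catalog index:

```python
def plural_index(n):
    # Evaluates the plural formula (n != 1):
    # index 0 selects the singular string, index 1 the plural string.
    return int(n != 1)
```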
\nEquivalent to ngettext(), but the translation is returned in the preferred\nsystem encoding, if no other encoding was explicitly set with\nbind_textdomain_codeset().
\n\nNew in version 2.4.
\nLike ngettext(), but look the message up in the specified domain.
\n\nNew in version 2.3.
\nEquivalent to dngettext(), but the translation is returned in the\npreferred system encoding, if no other encoding was explicitly set with\nbind_textdomain_codeset().
\n\nNew in version 2.4.
\nNote that GNU gettext also defines a dcgettext() method, but\nthis was deemed not useful and so it is currently unimplemented.
\nHere’s an example of typical usage for this API:
\nimport gettext\ngettext.bindtextdomain('myapplication', '/path/to/my/language/directory')\ngettext.textdomain('myapplication')\n_ = gettext.gettext\n# ...\nprint _('This is a translatable string.')\n
The class-based API of the gettext module gives you more flexibility and\ngreater convenience than the GNU gettext API. It is the recommended\nway of localizing your Python applications and modules. gettext defines\na “translations” class which implements the parsing of GNU .mo format\nfiles, and has methods for returning either standard 8-bit strings or Unicode\nstrings. Instances of this “translations” class can also install themselves in\nthe built-in namespace as the function _().
This function implements the standard .mo file search algorithm. It takes a domain, identical to what textdomain() takes. Optional localedir is as in bindtextdomain(). Optional languages is a list of strings, where each string is a language code.
\nIf localedir is not given, then the default system locale directory is used.\n[2] If languages is not given, then the following environment variables are\nsearched: LANGUAGE, LC_ALL, LC_MESSAGES, and\nLANG. The first one returning a non-empty value is used for the\nlanguages variable. The environment variables should contain a colon separated\nlist of languages, which will be split on the colon to produce the expected list\nof language code strings.
\nfind() then expands and normalizes the languages, and then iterates\nthrough them, searching for an existing file built of these components:
\nlocaledir/language/LC_MESSAGES/domain.mo
\nThe first such file name that exists is returned by find(). If no such\nfile is found, then None is returned. If all is given, it returns a list\nof all file names, in the order in which they appear in the languages list or\nthe environment variables.
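The path construction step can be sketched like this (illustrative only, not gettext.find() itself; posixpath is used to reproduce the Unix layout shown above regardless of platform):

```python
import posixpath

def candidate_paths(domain, localedir, languages):
    # One candidate .mo path per (normalized) language, in order.
    return [posixpath.join(localedir, lang, 'LC_MESSAGES', domain + '.mo')
            for lang in languages]
```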
\nReturn a Translations instance based on the domain, localedir, and\nlanguages, which are first passed to find() to get a list of the\nassociated .mo file paths. Instances with identical .mo file\nnames are cached. The actual class instantiated is either class_ if provided,\notherwise GNUTranslations. The class’s constructor must take a single\nfile object argument. If provided, codeset will change the charset used to\nencode translated strings.
\nIf multiple files are found, later files are used as fallbacks for earlier ones.\nTo allow setting the fallback, copy.copy() is used to clone each\ntranslation object from the cache; the actual instance data is still shared with\nthe cache.
\nIf no .mo file is found, this function raises IOError if\nfallback is false (which is the default), and returns a\nNullTranslations instance if fallback is true.
\n\nChanged in version 2.4: Added the codeset parameter.
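The fallback behaviour can be exercised without any catalog files at all: with fallback true and no matching .mo file, a NullTranslations instance is returned, which simply echoes the message id (the domain name and directory below are placeholders chosen so that no catalog is found):

```python
import gettext

# No .mo file exists for this domain/directory, so with fallback=True
# we get a NullTranslations instance instead of an IOError.
t = gettext.translation('nosuchdomain', localedir='/nonexistent',
                        fallback=True)
message = t.gettext('This string is returned as-is.')
```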
\nThis installs the function _() in Python’s builtins namespace, based on\ndomain, localedir, and codeset which are passed to the function\ntranslation(). The unicode flag is passed to the resulting translation\nobject’s install() method.
\nFor the names parameter, please see the description of the translation\nobject’s install() method.
\nAs seen below, you usually mark the strings in your application that are\ncandidates for translation, by wrapping them in a call to the _()\nfunction, like this:
\nprint _('This string will be translated.')\n
For convenience, you want the _() function to be installed in Python’s\nbuiltins namespace, so it is easily accessible in all modules of your\napplication.
\n\nChanged in version 2.4: Added the codeset parameter.
\n\nChanged in version 2.5: Added the names parameter.
\nTranslation classes are what actually implement the translation of original\nsource file message strings to translated message strings. The base class used\nby all translation classes is NullTranslations; this provides the basic\ninterface you can use to write your own specialized translation classes. Here\nare the methods of NullTranslations:
\nTakes an optional file object fp, which is ignored by the base class.\nInitializes “protected” instance variables _info and _charset which are set\nby derived classes, as well as _fallback, which is set through\nadd_fallback(). It then calls self._parse(fp) if fp is not\nNone.
\nIf a fallback has been set, forward lgettext() to the\nfallback. Otherwise, return the translated message. Overridden in derived\nclasses.
\n\nNew in version 2.4.
\nIf a fallback has been set, forward ngettext() to the\nfallback. Otherwise, return the translated message. Overridden in derived\nclasses.
\n\nNew in version 2.3.
\nIf a fallback has been set, forward lngettext() to the\nfallback. Otherwise, return the translated message. Overridden in derived\nclasses.
\n\nNew in version 2.4.
\nIf a fallback has been set, forward ungettext() to the fallback.\nOtherwise, return the translated message as a Unicode string. Overridden\nin derived classes.
\n\nNew in version 2.3.
\nReturn the “protected” _output_charset variable, which defines the\nencoding used to return translated messages.
\n\nNew in version 2.4.
\nChange the “protected” _output_charset variable, which defines the\nencoding used to return translated messages.
\n\nNew in version 2.4.
\nIf the unicode flag is false, this method installs self.gettext()\ninto the built-in namespace, binding it to _. If unicode is true,\nit binds self.ugettext() instead. By default, unicode is false.
\nIf the names parameter is given, it must be a sequence containing the\nnames of functions you want to install in the builtins namespace in\naddition to _(). Supported names are 'gettext' (bound to\nself.gettext() or self.ugettext() according to the unicode\nflag), 'ngettext' (bound to self.ngettext() or\nself.ungettext() according to the unicode flag), 'lgettext'\nand 'lngettext'.
\nNote that this is only one way, albeit the most convenient way, to make\nthe _() function available to your application. Because it affects\nthe entire application globally, and specifically the built-in namespace,\nlocalized modules should never install _(). Instead, they should use\nthis code to make _() available to their module:
\nimport gettext\nt = gettext.translation('mymodule', ...)\n_ = t.gettext\n
This puts _() only in the module’s global namespace and so only\naffects calls within this module.
\n\nChanged in version 2.5: Added the names parameter.
\nThe gettext module provides one additional class derived from\nNullTranslations: GNUTranslations. This class overrides\n_parse() to enable reading GNU gettext format .mo files\nin both big-endian and little-endian format. It also coerces both message ids\nand message strings to Unicode.
GNUTranslations parses optional meta-data out of the translation catalog. It is a convention with GNU gettext to include meta-data as the translation for the empty string. This meta-data is in RFC 822-style key: value pairs, and should contain the Project-Id-Version key. If the key Content-Type is found, then the charset property is used to initialize the “protected” _charset instance variable, defaulting to None if not found. If the charset encoding is specified, then all message ids and message strings read from the catalog are converted to Unicode using this encoding. The ugettext() method always returns a Unicode string, while gettext() returns an encoded 8-bit string. For the message id arguments of both methods, either Unicode strings or 8-bit strings containing only US-ASCII characters are acceptable. Note that the Unicode versions of the methods (i.e. ugettext() and ungettext()) are the recommended interface to use for internationalized Python programs.
\nThe entire set of key/value pairs are placed into a dictionary and set as the\n“protected” _info instance variable.
\nIf the .mo file’s magic number is invalid, or if other problems occur\nwhile reading the file, instantiating a GNUTranslations class can raise\nIOError.
\nThe following methods are overridden from the base class implementation:
\nEquivalent to gettext(), but the translation is returned in the preferred\nsystem encoding, if no other encoding was explicitly set with\nset_output_charset().
\n\nNew in version 2.4.
\nDo a plural-forms lookup of a message id. singular is used as the message id\nfor purposes of lookup in the catalog, while n is used to determine which\nplural form to use. The returned message string is an 8-bit string encoded with\nthe catalog’s charset encoding, if known.
\nIf the message id is not found in the catalog, and a fallback is specified, the\nrequest is forwarded to the fallback’s ngettext() method. Otherwise, when\nn is 1 singular is returned, and plural is returned in all other cases.
\n\nNew in version 2.3.
\nEquivalent to gettext(), but the translation is returned in the preferred\nsystem encoding, if no other encoding was explicitly set with\nset_output_charset().
\n\nNew in version 2.4.
\nDo a plural-forms lookup of a message id. singular is used as the message id\nfor purposes of lookup in the catalog, while n is used to determine which\nplural form to use. The returned message string is a Unicode string.
\nIf the message id is not found in the catalog, and a fallback is specified, the\nrequest is forwarded to the fallback’s ungettext() method. Otherwise,\nwhen n is 1 singular is returned, and plural is returned in all other\ncases.
\nHere is an example:
import os
n = len(os.listdir('.'))
cat = GNUTranslations(somefile)
message = cat.ungettext(
    'There is %(num)d file in this directory',
    'There are %(num)d files in this directory',
    n) % {'num': n}
\nNew in version 2.3.
\nThe Solaris operating system defines its own binary .mo file format, but\nsince no documentation can be found on this format, it is not supported at this\ntime.
\nGNOME uses a version of the gettext module by James Henstridge, but this\nversion has a slightly different API. Its documented usage was:
\nimport gettext\ncat = gettext.Catalog(domain, localedir)\n_ = cat.gettext\nprint _('hello world')\n
For compatibility with this older module, the function Catalog() is an\nalias for the translation() function described above.
\nOne difference between this module and Henstridge’s: his catalog objects\nsupported access through a mapping API, but this appears to be unused and so is\nnot currently supported.
\nInternationalization (I18N) refers to the operation by which a program is made\naware of multiple languages. Localization (L10N) refers to the adaptation of\nyour program, once internationalized, to the local language and cultural habits.\nIn order to provide multilingual messages for your Python programs, you need to\ntake the following steps:
\nIn order to prepare your code for I18N, you need to look at all the strings in\nyour files. Any string that needs to be translated should be marked by wrapping\nit in _('...') — that is, a call to the function _(). For example:
\nfilename = 'mylog.txt'\nmessage = _('writing a log message')\nfp = open(filename, 'w')\nfp.write(message)\nfp.close()\n
In this example, the string 'writing a log message' is marked as a candidate\nfor translation, while the strings 'mylog.txt' and 'w' are not.
\nThe Python distribution comes with two tools which help you generate the message\ncatalogs once you’ve prepared your source code. These may or may not be\navailable from a binary distribution, but they can be found in a source\ndistribution, in the Tools/i18n directory.
\nThe pygettext [3] program scans all your Python source code looking\nfor the strings you previously marked as translatable. It is similar to the GNU\ngettext program except that it understands all the intricacies of\nPython source code, but knows nothing about C or C++ source code. You don’t\nneed GNU gettext unless you’re also going to be translating C code (such as\nC extension modules).
\npygettext generates textual Uniforum-style human readable message\ncatalog .pot files, essentially structured human readable files which\ncontain every marked string in the source code, along with a placeholder for the\ntranslation strings. pygettext is a command line script that supports\na similar command line interface as xgettext; for details on its use,\nrun:
\npygettext.py --help\n
Copies of these .pot files are then handed over to the individual human\ntranslators who write language-specific versions for every supported natural\nlanguage. They send you back the filled in language-specific versions as a\n.po file. Using the msgfmt.py [4] program (in the\nTools/i18n directory), you take the .po files from your\ntranslators and generate the machine-readable .mo binary catalog files.\nThe .mo files are what the gettext module uses for the actual\ntranslation processing during run-time.
\nHow you use the gettext module in your code depends on whether you are\ninternationalizing a single module or your entire application. The next two\nsections will discuss each case.
\nIf you are localizing your module, you must take care not to make global\nchanges, e.g. to the built-in namespace. You should not use the GNU gettext\nAPI but instead the class-based API.
\nLet’s say your module is called “spam” and the module’s various natural language\ntranslation .mo files reside in /usr/share/locale in GNU\ngettext format. Here’s what you would put at the top of your\nmodule:
\nimport gettext\nt = gettext.translation('spam', '/usr/share/locale')\n_ = t.lgettext\n
If your translators were providing you with Unicode strings in their .po\nfiles, you’d instead do:
\nimport gettext\nt = gettext.translation('spam', '/usr/share/locale')\n_ = t.ugettext\n
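If no matching .mo catalog can be found, translation() raises IOError. Passing fallback=True instead returns a NullTranslations instance whose gettext() method returns each message unchanged, which is convenient while developing before any catalogs exist. A minimal sketch (the 'spam' domain is illustrative and has no catalog installed here):

```python
import gettext

# fallback=True returns a NullTranslations object instead of raising
# IOError when no .mo file for the domain can be found, so _() simply
# passes messages through unchanged.
t = gettext.translation('spam', '/usr/share/locale', fallback=True)
_ = t.gettext
message = _('no catalog installed')
```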
If you are localizing your application, you can install the _() function\nglobally into the built-in namespace, usually in the main driver file of your\napplication. This will let all your application-specific files just use\n_('...') without having to explicitly install it in each file.
\nIn the simple case then, you need only add the following bit of code to the main\ndriver file of your application:
\nimport gettext\ngettext.install('myapplication')\n
If you need to set the locale directory or the unicode flag, you can pass\nthese into the install() function:
\nimport gettext\ngettext.install('myapplication', '/usr/share/locale', unicode=1)\n
If your program needs to support many languages at the same time, you may want\nto create multiple translation instances and then switch between them\nexplicitly, like so:
\nimport gettext\n\nlang1 = gettext.translation('myapplication', languages=['en'])\nlang2 = gettext.translation('myapplication', languages=['fr'])\nlang3 = gettext.translation('myapplication', languages=['de'])\n\n# start by using language1\nlang1.install()\n\n# ... time goes by, user selects language 2\nlang2.install()\n\n# ... more time goes by, user selects language 3\nlang3.install()\n
In most coding situations, strings are translated where they are coded.\nOccasionally however, you need to mark strings for translation, but defer actual\ntranslation until later. A classic example is:
\nanimals = ['mollusk',\n 'albatross',\n 'rat',\n 'penguin',\n 'python', ]\n# ...\nfor a in animals:\n print a\n
Here, you want to mark the strings in the animals list as being\ntranslatable, but you don’t actually want to translate them until they are\nprinted.
\nHere is one way you can handle this situation:
\ndef _(message): return message\n\nanimals = [_('mollusk'),\n _('albatross'),\n _('rat'),\n _('penguin'),\n _('python'), ]\n\ndel _\n\n# ...\nfor a in animals:\n print _(a)\n
This works because the dummy definition of _() simply returns the string\nunchanged. And this dummy definition will temporarily override any definition\nof _() in the built-in namespace (until the del command). Take\ncare, though, if you have a previous definition of _() in the local\nnamespace.
\nNote that the second use of _() will not identify “a” as being\ntranslatable to the pygettext program, since it is not a string.
\nAnother way to handle this is with the following example:
\ndef N_(message): return message\n\nanimals = [N_('mollusk'),\n N_('albatross'),\n N_('rat'),\n N_('penguin'),\n N_('python'), ]\n\n# ...\nfor a in animals:\n print _(a)\n
In this case, you are marking translatable strings with the function N_(),\n[5] which won’t conflict with any definition of _(). However, you will\nneed to teach your message extraction program to look for translatable strings\nmarked with N_(). pygettext and xpot both support\nthis through the use of command line switches.
\nIn Python 2.4 the lgettext() family of functions was introduced. The\nintention of these functions is to provide an alternative which is more\ncompliant with the current implementation of GNU gettext. Unlike\ngettext(), which returns strings encoded with the same codeset used in the\ntranslation file, lgettext() will return strings encoded with the\npreferred system encoding, as returned by locale.getpreferredencoding().\nAlso notice that Python 2.4 introduces new functions to explicitly choose the\ncodeset used in translated strings. If a codeset is explicitly set, even\nlgettext() will return translated strings in the requested codeset, as\nwould be expected in the GNU gettext implementation.

\nThe following people contributed code, feedback, design suggestions, previous\nimplementations, and valuable experience to the creation of this module:
\nFootnotes
\n[1] | The default locale directory is system dependent; for example, on RedHat Linux\nit is /usr/share/locale, but on Solaris it is /usr/lib/locale.\nThe gettext module does not try to support these system dependent\ndefaults; instead its default is sys.prefix/share/locale. For this\nreason, it is always best to call bindtextdomain() with an explicit\nabsolute path at the start of your application. |
[2] | See the footnote for bindtextdomain() above. |
[3] | François Pinard has written a program called xpot which does a\nsimilar job. It is available as part of his po-utils package. |
[4] | msgfmt.py is binary compatible with GNU msgfmt except that\nit provides a simpler, all-Python implementation. With this and\npygettext.py, you generally won’t need to install the GNU\ngettext package to internationalize your Python applications. |
[5] | The choice of N_() here is totally arbitrary; it could have just as easily\nbeen MarkThisStringForTranslation(). |
Platforms: Linux, FreeBSD
\n\nNew in version 2.3.
\nThis module allows you to access the OSS (Open Sound System) audio interface.\nOSS is available for a wide range of open-source and commercial Unices, and is\nthe standard audio interface for Linux and recent versions of FreeBSD.
\nSee also
\nThe module defines a large number of constants supplied by the OSS device\ndriver; see <sys/soundcard.h> on either Linux or FreeBSD for a listing.
\nossaudiodev defines the following variables and functions:
\nThis exception is raised on certain errors. The argument is a string describing\nwhat went wrong.
\n(If ossaudiodev receives an error from a system call such as\nopen(), write(), or ioctl(), it raises IOError.\nErrors detected directly by ossaudiodev result in OSSAudioError.)
\n(For backwards compatibility, the exception class is also available as\nossaudiodev.error.)
\nOpen an audio device and return an OSS audio device object. This object\nsupports many file-like methods, such as read(), write(), and\nfileno() (although there are subtle differences between conventional Unix\nread/write semantics and those of OSS audio devices). It also supports a number\nof audio-specific methods; see below for the complete list of methods.
\ndevice is the audio device filename to use. If it is not specified, this\nmodule first looks in the environment variable AUDIODEV for a device\nto use. If not found, it falls back to /dev/dsp.
\nmode is one of 'r' for read-only (record) access, 'w' for\nwrite-only (playback) access and 'rw' for both. Since many sound cards\nonly allow one process to have the recorder or player open at a time, it is a\ngood idea to open the device only for the activity needed. Further, some\nsound cards are half-duplex: they can be opened for reading or writing, but\nnot both at once.
\nNote the unusual calling syntax: the first argument is optional, and the\nsecond is required. This is a historical artifact for compatibility with the\nolder linuxaudiodev module which ossaudiodev supersedes.
\nBefore you can write to or read from an audio device, you must call three\nmethods in the correct order:
\nAlternately, you can use the setparameters() method to set all three audio\nparameters at once. This is more convenient, but may not be as flexible in all\ncases.
\nThe audio device objects returned by open() define the following methods\nand (read-only) attributes:
\nThe following methods each map to exactly one ioctl() system call. The\ncorrespondence is obvious: for example, setfmt() corresponds to the\nSNDCTL_DSP_SETFMT ioctl, and sync() to SNDCTL_DSP_SYNC (this can\nbe useful when consulting the OSS documentation). If the underlying\nioctl() fails, they all raise IOError.
\nReturn a bitmask of the audio output formats supported by the soundcard. Some\nof the formats supported by OSS are:
\nFormat | \nDescription | \n
---|---|
AFMT_MU_LAW | \na logarithmic encoding (used by Sun .au\nfiles and /dev/audio) | \n
AFMT_A_LAW | \na logarithmic encoding | \n
AFMT_IMA_ADPCM | \na 4:1 compressed format defined by the\nInteractive Multimedia Association | \n
AFMT_U8 | \nUnsigned, 8-bit audio | \n
AFMT_S16_LE | \nSigned, 16-bit audio, little-endian byte\norder (as used by Intel processors) | \n
AFMT_S16_BE | \nSigned, 16-bit audio, big-endian byte order\n(as used by 68k, PowerPC, Sparc) | \n
AFMT_S8 | \nSigned, 8 bit audio | \n
AFMT_U16_LE | \nUnsigned, 16-bit little-endian audio | \n
AFMT_U16_BE | \nUnsigned, 16-bit big-endian audio | \n
Consult the OSS documentation for a full list of audio formats, and note that\nmost devices support only a subset of these formats. Some older devices only\nsupport AFMT_U8; the most common format used today is\nAFMT_S16_LE.
\nTry to set the audio sampling rate to samplerate samples per second. Returns\nthe rate actually set. Most sound devices don’t support arbitrary sampling\nrates. Common rates are:
\nRate | \nDescription | \n
---|---|
8000 | \ndefault rate for /dev/audio | \n
11025 | \nspeech recording | \n
22050 | \n\n |
44100 | \nCD quality audio (at 16 bits/sample and 2\nchannels) | \n
96000 | \nDVD quality audio (at 24 bits/sample) | \n
The following convenience methods combine several ioctls, or one ioctl and some\nsimple calculations.
\nSet the key audio sampling parameters—sample format, number of channels, and\nsampling rate—in one method call. format, nchannels, and samplerate\nshould be as specified in the setfmt(), channels(), and\nspeed() methods. If strict is true, setparameters() checks to\nsee if each parameter was actually set to the requested value, and raises\nOSSAudioError if not. Returns a tuple (format, nchannels,\nsamplerate) indicating the parameter values that were actually set by the\ndevice driver (i.e., the same as the return values of setfmt(),\nchannels(), and speed()).
\nFor example,
\n(fmt, channels, rate) = dsp.setparameters(fmt, channels, rate)\n
is equivalent to
\nfmt = dsp.setfmt(fmt)\nchannels = dsp.channels(channels)\nrate = dsp.speed(rate)\n
Audio device objects also support several read-only attributes:
\nThe mixer object provides two file-like methods:
\nThe remaining methods are specific to audio mixing:
\nThis method returns a bitmask specifying the available mixer controls (“Control”\nbeing a specific mixable “channel”, such as SOUND_MIXER_PCM or\nSOUND_MIXER_SYNTH). This bitmask indicates a subset of all available\nmixer controls—the SOUND_MIXER_* constants defined at module level.\nTo determine if, for example, the current mixer object supports a PCM mixer, use\nthe following Python code:
\nmixer = ossaudiodev.openmixer()\nif mixer.controls() & (1 << ossaudiodev.SOUND_MIXER_PCM):\n    # PCM is supported\n    ... code ...\n
For most purposes, the SOUND_MIXER_VOLUME (master volume) and\nSOUND_MIXER_PCM controls should suffice—but code that uses the mixer\nshould be flexible when it comes to choosing mixer controls. On the Gravis\nUltrasound, for example, SOUND_MIXER_VOLUME does not exist.
\nReturns a bitmask indicating stereo mixer controls. If a bit is set, the\ncorresponding control is stereo; if it is unset, the control is either\nmonophonic or not supported by the mixer (use in combination with\ncontrols() to determine which).
\nSee the code example for the controls() function for an example of getting\ndata from a bitmask.
\nReturns the volume of a given mixer control. The returned volume is a 2-tuple\n(left_volume,right_volume). Volumes are specified as numbers from 0\n(silent) to 100 (full volume). If the control is monophonic, a 2-tuple is still\nreturned, but both volumes are the same.
\nRaises OSSAudioError if an invalid control is specified, or\nIOError if an unsupported control is specified.
\nSets the volume for a given mixer control to (left,right). left and\nright must be ints and between 0 (silent) and 100 (full volume). On\nsuccess, the new volume is returned as a 2-tuple. Note that this may not be\nexactly the same as the volume specified, because of the limited resolution of\nsome soundcards' mixers.
\nRaises OSSAudioError if an invalid mixer control was specified, or if the\nspecified volumes were out-of-range.
\nCall this function to specify a recording source. Returns a bitmask indicating\nthe new recording source (or sources) if successful; raises IOError if an\ninvalid source was specified. To set the current recording source to the\nmicrophone input:
\nmixer.setrecsrc(1 << ossaudiodev.SOUND_MIXER_MIC)\n
Source code: Lib/cmd.py
\nThe Cmd class provides a simple framework for writing line-oriented\ncommand interpreters. These are often useful for test harnesses, administrative\ntools, and prototypes that will later be wrapped in a more sophisticated\ninterface.
\nA Cmd instance or subclass instance is a line-oriented interpreter\nframework. There is no good reason to instantiate Cmd itself; rather,\nit’s useful as a superclass of an interpreter class you define yourself in order\nto inherit Cmd’s methods and encapsulate action methods.
\nThe optional argument completekey is the readline name of a completion\nkey; it defaults to Tab. If completekey is not None and\nreadline is available, command completion is done automatically.
\nThe optional arguments stdin and stdout specify the input and output file\nobjects that the Cmd instance or subclass instance will use for input and\noutput. If not specified, they will default to sys.stdin and\nsys.stdout.
\nIf you want a given stdin to be used, make sure to set the instance’s\nuse_rawinput attribute to False, otherwise stdin will be\nignored.
\n\nChanged in version 2.3: The stdin and stdout parameters were added.
\nA Cmd instance has the following methods:
\nRepeatedly issue a prompt, accept input, parse an initial prefix off the\nreceived input, and dispatch to action methods, passing them the remainder of\nthe line as argument.
\nThe optional argument is a banner or intro string to be issued before the first\nprompt (this overrides the intro class attribute).
\nIf the readline module is loaded, input will automatically inherit\nbash-like history-list editing (e.g. Control-P scrolls back\nto the last command, Control-N forward to the next one, Control-F\nmoves the cursor to the right non-destructively, Control-B moves the\ncursor to the left non-destructively, etc.).
\nAn end-of-file on input is passed back as the string 'EOF'.
\nAn interpreter instance will recognize a command name foo if and only if it\nhas a method do_foo(). As a special case, a line beginning with the\ncharacter '?' is dispatched to the method do_help(). As another\nspecial case, a line beginning with the character '!' is dispatched to the\nmethod do_shell() (if such a method is defined).
\nThis method will return when the postcmd() method returns a true value.\nThe stop argument to postcmd() is the return value from the command’s\ncorresponding do_*() method.
\nIf completion is enabled, completing commands will be done automatically, and\ncompletion of command arguments is done by calling complete_foo() with\narguments text, line, begidx, and endidx. text is the string prefix\nwe are attempting to match: all returned matches must begin with it. line is\nthe current input line with leading whitespace removed, begidx and endidx\nare the beginning and ending indexes of the prefix text, which could be used to\nprovide different completion depending upon which position the argument is in.
\nAll subclasses of Cmd inherit a predefined do_help(). This\nmethod, called with an argument 'bar', invokes the corresponding method\nhelp_bar(), and if that is not present, prints the docstring of\ndo_bar(), if available. With no argument, do_help() lists all\navailable help topics (that is, all commands with corresponding\nhelp_*() methods or commands that have docstrings), and also lists any\nundocumented commands.
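The dispatch mechanism described above can be sketched with a minimal subclass (the 'greet' command name is illustrative); do_greet() handles the command, its docstring serves as the 'help greet' text, and returning a true value from a do_*() method stops cmdloop():

```python
import cmd

class GreetShell(cmd.Cmd):
    """Minimal interpreter sketch; the 'greet' command is illustrative."""
    prompt = '(greet) '

    def do_greet(self, arg):
        """greet [name] -- greet NAME, or the world by default."""
        self.stdout.write('Hello, %s\n' % (arg or 'world'))

    def do_EOF(self, arg):
        # A true return value from a do_*() method terminates cmdloop().
        return True
```

A single command can also be dispatched without entering the loop, e.g. GreetShell().onecmd('greet Python').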
\nInstances of Cmd subclasses have some public instance variables:
\nThe locale module opens access to the POSIX locale database and\nfunctionality. The POSIX locale mechanism allows programmers to deal with\ncertain cultural issues in an application, without requiring the programmer to\nknow all the specifics of each country where the software is executed.
\nThe locale module is implemented on top of the _locale module,\nwhich in turn uses an ANSI C locale implementation if available.
\nThe locale module defines the following exception and functions:
\nIf locale is given and not None, setlocale() modifies the locale\nsetting for the category. The available categories are listed in the data\ndescription below. locale may be a string, or an iterable of two strings\n(language code and encoding). If it’s an iterable, it’s converted to a locale\nname using the locale aliasing engine. An empty string specifies the user’s\ndefault settings. If the modification of the locale fails, the exception\nError is raised. If successful, the new locale setting is returned.
\nIf locale is omitted or None, the current setting for category is\nreturned.
\nsetlocale() is not thread-safe on most systems. Applications typically\nstart with a call of
\nimport locale\nlocale.setlocale(locale.LC_ALL, '')\n
This sets the locale for all categories to the user’s default setting (typically\nspecified in the LANG environment variable). If the locale is not\nchanged thereafter, using multithreading should not cause problems.
\n\nChanged in version 2.0: Added support for iterable values of the locale parameter.
\nReturns the database of the local conventions as a dictionary. This dictionary\nhas the following strings as keys:
\nCategory | \nKey | \nMeaning | \n
---|---|---|
LC_NUMERIC | \n'decimal_point' | \nDecimal point character. | \n
\n | 'grouping' | \nSequence of numbers specifying\nat which relative positions the\n'thousands_sep' is\nexpected. If the sequence is\nterminated with\nCHAR_MAX, no further\ngrouping is performed. If the\nsequence terminates with a\n0, the last group size is\nrepeatedly used. | \n
\n | 'thousands_sep' | \nCharacter used between groups. | \n
LC_MONETARY | \n'int_curr_symbol' | \nInternational currency symbol. | \n
\n | 'currency_symbol' | \nLocal currency symbol. | \n
\n | 'p_cs_precedes/n_cs_precedes' | \nWhether the currency symbol\nprecedes the value (for\npositive resp. negative\nvalues). | \n
\n | 'p_sep_by_space/n_sep_by_space' | \nWhether the currency symbol is\nseparated from the value by a\nspace (for positive resp.\nnegative values). | \n
\n | 'mon_decimal_point' | \nDecimal point used for\nmonetary values. | \n
\n | 'frac_digits' | \nNumber of fractional digits\nused in local formatting of\nmonetary values. | \n
\n | 'int_frac_digits' | \nNumber of fractional digits\nused in international\nformatting of monetary values. | \n
\n | 'mon_thousands_sep' | \nGroup separator used for\nmonetary values. | \n
\n | 'mon_grouping' | \nEquivalent to 'grouping',\nused for monetary values. | \n
\n | 'positive_sign' | \nSymbol used to annotate a\npositive monetary value. | \n
\n | 'negative_sign' | \nSymbol used to annotate a\nnegative monetary value. | \n
\n | 'p_sign_posn/n_sign_posn' | \nThe position of the sign (for\npositive resp. negative\nvalues), see below. | \n
All numeric values can be set to CHAR_MAX to indicate that there is no\nvalue specified in this locale.
\nThe possible values for 'p_sign_posn' and 'n_sign_posn' are given below.
\nValue | \nExplanation | \n
---|---|
0 | \nCurrency and value are surrounded by\nparentheses. | \n
1 | \nThe sign should precede the value and\ncurrency symbol. | \n
2 | \nThe sign should follow the value and\ncurrency symbol. | \n
3 | \nThe sign should immediately precede the\nvalue. | \n
4 | \nThe sign should immediately follow the\nvalue. | \n
CHAR_MAX | \nNothing is specified in this locale. | \n
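A short sketch of reading a few of these keys, using the portable 'C' locale so the values are predictable:

```python
import locale

# In the 'C' locale the decimal point is '.', the thousands separator
# is empty, and most monetary entries are unspecified (CHAR_MAX or '').
locale.setlocale(locale.LC_ALL, 'C')
conv = locale.localeconv()
decimal_point = conv['decimal_point']   # '.'
thousands_sep = conv['thousands_sep']   # '' in the C locale
```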
Return some locale-specific information as a string. This function is not\navailable on all systems, and the set of possible options might also vary\nacross platforms. The possible argument values are numbers, for which\nsymbolic constants are available in the locale module.
\nThe nl_langinfo() function accepts one of the following keys. Most\ndescriptions are taken from the corresponding description in the GNU C\nlibrary.
\nGet the name of the n-th day of the week.
\nNote
\nThis follows the US convention of DAY_1 being Sunday, not the\ninternational convention (ISO 8601) that Monday is the first day of the\nweek.
\nGet a regular expression that can be used with the regex function to\nrecognize a positive response to a yes/no question.
\nNote
\nThe expression is in the syntax suitable for the regex() function\nfrom the C library, which might differ from the syntax used in re.
\nGet a string that represents the era used in the current locale.
\nMost locales do not define this value. An example of a locale which does\ndefine this value is the Japanese one. In Japan, the traditional\nrepresentation of dates includes the name of the era corresponding to the\nthen-emperor’s reign.
\nNormally it should not be necessary to use this value directly. Specifying\nthe E modifier in a format string causes the strftime()\nfunction to use this information. The format of the returned string is not\nspecified, and therefore you should not assume knowledge of it on different\nsystems.
\nTries to determine the default locale settings and returns them as a tuple of\nthe form (language code, encoding).
\nAccording to POSIX, a program which has not called setlocale(LC_ALL, '')\nruns using the portable 'C' locale. Calling setlocale(LC_ALL, '') lets\nit use the default locale as defined by the LANG variable. Since we\ndo not want to interfere with the current locale setting, we emulate the\nbehavior described above without actually changing the locale.
\nTo maintain compatibility with other platforms, not only the LANG\nvariable is tested, but a list of variables given as envvars parameter. The\nfirst found to be defined will be used. envvars defaults to the search path\nused in GNU gettext; it must always contain the variable name LANG. The GNU\ngettext search path contains 'LANGUAGE', 'LC_ALL', 'LC_CTYPE', and\n'LANG', in that order.
\nExcept for the code 'C', the language code corresponds to RFC 1766.\nlanguage code and encoding may be None if their values cannot be\ndetermined.
\n\nNew in version 2.0.
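A minimal sketch of calling it; the function inspects the environment without changing the current locale, and either element of the result may be None:

```python
import locale

# Consults LANGUAGE, LC_ALL, LC_CTYPE and LANG (in that order) without
# calling setlocale(); returns a (language_code, encoding) pair.
lang, enc = locale.getdefaultlocale()
```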
\nReturns the current setting for the given locale category as a sequence\ncontaining language code, encoding. category may be one of the LC_*\nvalues except LC_ALL. It defaults to LC_CTYPE.
\nExcept for the code 'C', the language code corresponds to RFC 1766.\nlanguage code and encoding may be None if their values cannot be\ndetermined.
\n\nNew in version 2.0.
\nReturn the encoding used for text data, according to user preferences. User\npreferences are expressed differently on different systems, and might not be\navailable programmatically on some systems, so this function only returns a\nguess.
\nOn some systems, it is necessary to invoke setlocale() to obtain the user\npreferences, so this function is not thread-safe. If invoking setlocale is not\nnecessary or desired, do_setlocale should be set to False.
\n\nNew in version 2.3.
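A one-line sketch; passing do_setlocale=False avoids the setlocale() side effect mentioned above, at the cost of a possibly less accurate guess on some platforms:

```python
import locale

# Query the preferred text encoding without touching the global locale.
enc = locale.getpreferredencoding(False)
```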
\nReturns a normalized locale code for the given locale name. The returned locale\ncode is formatted for use with setlocale(). If normalization fails, the\noriginal name is returned unchanged.
\nIf the given encoding is not known, the function defaults to the default\nencoding for the locale code just like setlocale().
\n\nNew in version 2.0.
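A small sketch of both behaviors described above, expansion of a known code and pass-through of an unknown one:

```python
import locale

# A known alias is expanded to the full 'language_COUNTRY.encoding'
# form accepted by setlocale(), e.g. 'en_US.ISO8859-1'.
code = locale.normalize('en_US')

# An unrecognized name is returned unchanged.
unknown = locale.normalize('nonsense-name')
```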
\nSets the locale for category to the default setting.
\nThe default setting is determined by calling getdefaultlocale().\ncategory defaults to LC_ALL.
\n\nNew in version 2.0.
\nTransforms a string to one that can be used for the built-in function\ncmp(), and still returns locale-aware results. This function can be used\nwhen the same string is compared repeatedly, e.g. when collating a sequence of\nstrings.
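For example, when collating a sequence of strings it is cheaper to transform each string once than to perform a locale-aware comparison for every pair; a sketch using the predictable 'C' locale:

```python
import locale

locale.setlocale(locale.LC_ALL, 'C')
words = ['cherry', 'apple', 'banana']
# Each string is transformed once; the transformed values compare
# correctly with the ordinary < operator.
ordered = sorted(words, key=locale.strxfrm)
```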
\nFormats a number val according to the current LC_NUMERIC setting.\nThe format follows the conventions of the % operator.
If monetary is true, the conversion uses monetary thousands separator and\ngrouping strings.
\nPlease note that this function will only work for exactly one %char specifier.\nFor whole format strings, use format_string().
\n\nChanged in version 2.5: Added the monetary parameter.
\nProcesses formatting specifiers as in format % val, but takes the current\nlocale settings into account.
\nNew in version 2.5.
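A sketch using the 'C' locale; there the grouping sequence is empty, so grouping=True has no visible effect, whereas under a locale such as 'en_US' the same call would yield '1,234,567':

```python
import locale

locale.setlocale(locale.LC_ALL, 'C')
# Locale-aware %-formatting; grouping inserts the locale's thousands
# separator, which is empty in the C locale.
s = locale.format_string("%d", 1234567, grouping=True)
```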
\nFormats a number val according to the current LC_MONETARY settings.
\nThe returned string includes the currency symbol if symbol is true, which is\nthe default. If grouping is true (which is not the default), grouping is done\nwith the value. If international is true (which is not the default), the\ninternational currency symbol is used.
\nNote that this function will not work with the ‘C’ locale, so you have to set a\nlocale via setlocale() first.
\n\nNew in version 2.5.
\nLocale category for the character type functions. Depending on the settings of\nthis category, the functions of module string dealing with case change\ntheir behaviour.
\nExample:
\n>>> import locale\n>>> loc = locale.getlocale() # get current locale\n# use German locale; name might vary with platform\n>>> locale.setlocale(locale.LC_ALL, 'de_DE')\n>>> locale.strcoll('f\\xe4n', 'foo') # compare a string containing an umlaut\n>>> locale.setlocale(locale.LC_ALL, '') # use user's preferred locale\n>>> locale.setlocale(locale.LC_ALL, 'C') # use default (C) locale\n>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale\n
The C standard defines the locale as a program-wide property that may be\nrelatively expensive to change. On top of that, some implementations are broken\nin such a way that frequent locale changes may cause core dumps. This makes the\nlocale somewhat painful to use correctly.
\nInitially, when a program is started, the locale is the C locale, no matter\nwhat the user’s preferred locale is. The program must explicitly say that it\nwants the user’s preferred locale settings by calling setlocale(LC_ALL, '').
\nIt is generally a bad idea to call setlocale() in some library routine,\nsince as a side effect it affects the entire program. Saving and restoring it\nis almost as bad: it is expensive and affects other threads that happen to run\nbefore the settings have been restored.
\nIf, when coding a module for general use, you need a locale independent version\nof an operation that is affected by the locale (such as string.lower(), or\ncertain formats used with time.strftime()), you will have to find a way to\ndo it without using the standard library routine. Even better is convincing\nyourself that using locale settings is okay. Only as a last resort should you\ndocument that your module is not compatible with non-C locale settings.
\nThe case conversion functions in the string module are affected by the\nlocale settings. When a call to the setlocale() function changes the\nLC_CTYPE settings, the variables string.lowercase,\nstring.uppercase and string.letters are recalculated. Note that code\nthat uses these variables through ‘from ... import ...’,\ne.g. from string import letters, is not affected by subsequent\nsetlocale() calls.
\nThe only way to perform numeric operations according to the locale is to use the\nspecial functions defined by this module: atof(), atoi(),\nformat(), str().
\nExtension modules should never call setlocale(), except to find out what\nthe current locale is. But since the return value can only be used portably to\nrestore it, that is not very useful (except perhaps to find out whether or not\nthe locale is C).
\nWhen Python code uses the locale module to change the locale, this also\naffects the embedding application. If the embedding application doesn’t want\nthis to happen, it should remove the _locale extension module (which does\nall the work) from the table of built-in modules in the config.c file,\nand make sure that the _locale module is not accessible as a shared\nlibrary.
\nThe locale module exposes the C library’s gettext interface on systems that\nprovide this interface. It consists of the functions gettext(),\ndgettext(), dcgettext(), textdomain(), bindtextdomain(),\nand bind_textdomain_codeset(). These are similar to the same functions in\nthe gettext module, but use the C library’s binary format for message\ncatalogs, and the C library’s search algorithms for locating message catalogs.
\nPython applications should normally find no need to invoke these functions, and\nshould use gettext instead. A known exception to this rule are\napplications that link with additional C libraries which internally invoke\ngettext() or dcgettext(). For these applications, it may be\nnecessary to bind the text domain, so that the libraries can properly locate\ntheir message catalogs.
\n\nNew in version 1.5.2.
\nSource code: Lib/shlex.py
\nThe shlex class makes it easy to write lexical analyzers for simple\nsyntaxes resembling that of the Unix shell. This will often be useful for\nwriting minilanguages (for example, in run control files for Python\napplications) or for parsing quoted strings.
\nPrior to Python 2.7.3, this module did not support Unicode input.
\nThe shlex module defines the following functions:
\nSplit the string s using shell-like syntax. If comments is False\n(the default), the parsing of comments in the given string will be disabled\n(setting the commenters attribute of the shlex instance to\nthe empty string). This function operates in POSIX mode by default, but uses\nnon-POSIX mode if the posix argument is false.
\n\nNew in version 2.3.
\n\nChanged in version 2.6: Added the posix parameter.
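A short sketch of the default (POSIX-mode) behavior: whitespace separates tokens, while quoted substrings stay together:

```python
import shlex

# Split a shell-like command line into its words.
tokens = shlex.split('cp "my file.txt" backup/')
# → ['cp', 'my file.txt', 'backup/']
```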
\n\nThe shlex module defines the following class:
\nSee also
\nA shlex instance has the following methods:
\nWhen shlex detects a source request (see source below) this\nmethod is given the following token as argument, and expected to return a tuple\nconsisting of a filename and an open file-like object.
\nNormally, this method first strips any quotes off the argument. If the result\nis an absolute pathname, or there was no previous source request in effect, or\nthe previous source was a stream (such as sys.stdin), the result is left\nalone. Otherwise, if the result is a relative pathname, the directory part of\nthe name of the file immediately before it on the source inclusion stack is\nprepended (this behavior is like the way the C preprocessor handles #include\n"file.h").
\nThe result of the manipulations is treated as a filename, and returned as the\nfirst component of the tuple, with open() called on it to yield the second\ncomponent. (Note: this is the reverse of the order of arguments in instance\ninitialization!)
\nThis hook is exposed so that you can use it to implement directory search paths,\naddition of file extensions, and other namespace hacks. There is no\ncorresponding ‘close’ hook, but a shlex instance will call the close()\nmethod of the sourced input stream when it returns EOF.
\nFor more explicit control of source stacking, use the push_source() and\npop_source() methods.
\nPush an input source stream onto the input stack. If the filename argument is\nspecified it will later be available for use in error messages. This is the\nsame method used internally by the sourcehook() method.
\n\nNew in version 2.1.
\nPop the last-pushed input source from the input stack. This is the same method\nused internally when the lexer reaches EOF on a stacked input stream.
\n\nNew in version 2.1.
\nThis method generates an error message leader in the format of a Unix C compiler\nerror label; the format is '"%s", line %d: ', where the %s is replaced\nwith the name of the current source file and the %d with the current input\nline number (the optional arguments can be used to override these).
\nThis convenience is provided to encourage shlex users to generate error\nmessages in the standard, parseable format understood by Emacs and other Unix\ntools.
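For instance, a parser can prefix its own messages with the leader; a minimal sketch (the filename and input below are made-up examples):

```python
import shlex

# Giving the constructor an infile name makes error_leader() report it.
lexer = shlex.shlex('first\nsecond oops', infile='example.conf')
while lexer.get_token() != 'oops':
    pass

# Prints: "example.conf", line 2: unexpected token
print(lexer.error_leader() + 'unexpected token')
```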
\nInstances of shlex subclasses have some public instance variables which\neither control lexical analysis or can be used for debugging:
Characters that will be considered as escape characters. This will only be used in POSIX mode, and includes just '\' by default.

\n\nNew in version 2.3.
Characters in quotes within which the escape characters defined in escape will be interpreted. This is only used in POSIX mode, and includes just '"' by default.
\n\nNew in version 2.3.
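Both attributes can be seen in action in POSIX mode (the input string is an arbitrary example): inside double quotes, a backslash escapes the quote character itself, because '"' is in escapedquotes by default.

```python
import shlex

lexer = shlex.shlex(r'"a \" b" next', posix=True)
print(lexer.escape)          # the default escape characters: \
print(lexer.escapedquotes)   # the default escaped quotes: "

# The escaped quote does not terminate the quoted token.
print(lexer.get_token())     # a " b
print(lexer.get_token())     # next
```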
If True, tokens will only be split on whitespace. This is useful, for example, for parsing command lines with shlex, getting tokens in a similar way to shell arguments.
\n\nNew in version 2.3.
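Setting whitespace_split on a POSIX-mode lexer reproduces the splitting a shell applies to its arguments; a minimal sketch (the command line below is an arbitrary example):

```python
import shlex

line = 'cp "my file.txt" /tmp/dest'

# Default POSIX lexer: characters outside wordchars start new tokens.
plain = shlex.shlex(line, posix=True)
print(list(plain))      # ['cp', 'my file.txt', '/', 'tmp', '/', 'dest']

# With whitespace_split, tokens are split only on whitespace,
# the way a shell splits a command line.
args = shlex.shlex(line, posix=True)
args.whitespace_split = True
print(list(args))       # ['cp', 'my file.txt', '/tmp/dest']
```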
Token used to determine end of file. This will be set to the empty string ('') in non-POSIX mode, and to None in POSIX mode.
\n\nNew in version 2.3.
When operating in non-POSIX mode, shlex will try to obey the following rules.
When operating in POSIX mode, shlex will try to obey the following parsing rules.
\nPlatforms: Tk
\nThe ScrolledText module provides a class of the same name which\nimplements a basic text widget which has a vertical scroll bar configured to do\nthe “right thing.” Using the ScrolledText class is a lot easier than\nsetting up a text widget and scroll bar directly. The constructor is the same\nas that of the Tkinter.Text class.
\nNote
\nScrolledText has been renamed to tkinter.scrolledtext in Python\n3.0. The 2to3 tool will automatically adapt imports when converting\nyour sources to 3.0.
\nThe text widget and scrollbar are packed together in a Frame, and the\nmethods of the Grid and Pack geometry managers are acquired\nfrom the Frame object. This allows the ScrolledText widget to\nbe used directly to achieve most normal geometry management behavior.
\nShould more specific control be necessary, the following attributes are\navailable:
\nThe Tkinter module (“Tk interface”) is the standard Python interface to\nthe Tk GUI toolkit. Both Tk and Tkinter are available on most Unix\nplatforms, as well as on Windows systems. (Tk itself is not part of Python; it\nis maintained at ActiveState.)
\nNote
\nTkinter has been renamed to tkinter in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nSee also
\nMost of the time, the Tkinter module is all you really need, but a number\nof additional modules are available as well. The Tk interface is located in a\nbinary module named _tkinter. This module contains the low-level\ninterface to Tk, and should never be used directly by application programmers.\nIt is usually a shared library (or DLL), but might in some cases be statically\nlinked with the Python interpreter.
\nIn addition to the Tk interface module, Tkinter includes a number of\nPython modules. The two most important modules are the Tkinter module\nitself, and a module called Tkconstants. The former automatically imports\nthe latter, so to use Tkinter, all you need to do is to import one module:
\nimport Tkinter\n
Or, more often:
\nfrom Tkinter import *\n
The Tk class is instantiated without arguments. This creates a toplevel\nwidget of Tk which usually is the main window of an application. Each instance\nhas its own associated Tcl interpreter.
\n\nChanged in version 2.4: The useTk parameter was added.
The Tcl() function is a factory function which creates an object much like that created by the Tk class, except that it does not initialize the Tk subsystem. This is most often useful when driving the Tcl interpreter in an environment where one doesn't want to create extraneous toplevel windows, or where one cannot (such as Unix/Linux systems without an X server). An object created by Tcl() can have a Toplevel window created (and the Tk subsystem initialized) by calling its loadtk() method.
\n\nNew in version 2.4.
\nOther modules that provide Tk support include:
\nThese have been renamed as well in Python 3.0; they were all made submodules of\nthe new tkinter package.
This section is not designed to be an exhaustive tutorial on either Tk or Tkinter. Rather, it is intended as a stopgap, providing some introductory orientation on the system.
\nCredits:
\nThis section is designed in two parts: the first half (roughly) covers\nbackground material, while the second half can be taken to the keyboard as a\nhandy reference.
When trying to answer questions of the form "how do I do blah", it is often best to find out how to do "blah" in straight Tk, and then convert this back into the corresponding Tkinter call. Python programmers can often guess at the correct Python command by looking at the Tk documentation. This means that in order to use Tkinter, you will have to know a little bit about Tk. This document can't fulfill that role, so the best we can do is point you to the best documentation that exists. Here are some hints:
\nSee also
\nfrom Tkinter import *\n\nclass Application(Frame):\n def say_hi(self):\n print "hi there, everyone!"\n\n def createWidgets(self):\n self.QUIT = Button(self)\n self.QUIT["text"] = "QUIT"\n self.QUIT["fg"] = "red"\n self.QUIT["command"] = self.quit\n\n self.QUIT.pack({"side": "left"})\n\n self.hi_there = Button(self)\n self.hi_there["text"] = "Hello",\n self.hi_there["command"] = self.say_hi\n\n self.hi_there.pack({"side": "left"})\n\n def __init__(self, master=None):\n Frame.__init__(self, master)\n self.pack()\n self.createWidgets()\n\nroot = Tk()\napp = Application(master=root)\napp.mainloop()\nroot.destroy()\n
The class hierarchy looks complicated, but in actual practice, application\nprogrammers almost always refer to the classes at the very bottom of the\nhierarchy.
\nNotes:
\nTo make use of this reference material, there will be times when you will need\nto know how to read short passages of Tk and how to identify the various parts\nof a Tk command. (See section Mapping Basic Tk into Tkinter for the\nTkinter equivalents of what’s below.)
\nTk scripts are Tcl programs. Like all Tcl programs, Tk scripts are just lists\nof tokens separated by spaces. A Tk widget is just its class, the options\nthat help configure it, and the actions that make it do useful things.
\nTo make a widget in Tk, the command is always of the form:
\nclassCommand newPathname options
\nFor example:
\nbutton .fred -fg red -text \"hi there\"\n ^ ^ \\_____________________/\n | | |\n class new options\ncommand widget (-opt val -opt val ...)
\nOnce created, the pathname to the widget becomes a new command. This new\nwidget command is the programmer’s handle for getting the new widget to\nperform some action. In C, you’d express this as someAction(fred,\nsomeOptions), in C++, you would express this as fred.someAction(someOptions),\nand in Tk, you say:
\n.fred someAction someOptions
\nNote that the object name, .fred, starts with a dot.
\nAs you’d expect, the legal values for someAction will depend on the widget’s\nclass: .fred disable works if fred is a button (fred gets greyed out), but\ndoes not work if fred is a label (disabling of labels is not supported in Tk).
The legal values of someOptions are action-dependent. Some actions, like disable, require no arguments; others, like a text-entry box's delete command, need arguments to specify what range of text to delete.
\nClass commands in Tk correspond to class constructors in Tkinter.
\nbutton .fred =====> fred = Button()
\nThe master of an object is implicit in the new name given to it at creation\ntime. In Tkinter, masters are specified explicitly.
\nbutton .panel.fred =====> fred = Button(panel)
The configuration options in Tk are given in lists of hyphened tags followed by values. In Tkinter, options are specified as keyword arguments in the instance constructor, as keyword arguments to configure calls, or as instance indices, in dictionary style, for established instances. See section Setting Options.
\nbutton .fred -fg red =====> fred = Button(panel, fg = \"red\")\n.fred configure -fg red =====> fred[\"fg\"] = red\n OR ==> fred.config(fg = \"red\")
\nIn Tk, to perform an action on a widget, use the widget name as a command, and\nfollow it with an action name, possibly with arguments (options). In Tkinter,\nyou call methods on the class instance to invoke actions on the widget. The\nactions (methods) that a given widget can perform are listed in the Tkinter.py\nmodule.
\n.fred invoke =====> fred.invoke()
\nTo give a widget to the packer (geometry manager), you call pack with optional\narguments. In Tkinter, the Pack class holds all this functionality, and the\nvarious forms of the pack command are implemented as methods. All widgets in\nTkinter are subclassed from the Packer, and so inherit all the packing\nmethods. See the Tix module documentation for additional information on\nthe Form geometry manager.
\npack .fred -side left =====> fred.pack(side = \"left\")
\nOptions control things like the color and border width of a widget. Options can\nbe set in three ways:
\nfred = Button(self, fg = "red", bg = "blue")\n
fred["fg"] = "red"\nfred["bg"] = "blue"\n
fred.config(fg = "red", bg = "blue")\n
For a complete explanation of a given option and its behavior, see the Tk man\npages for the widget in question.
\nNote that the man pages list “STANDARD OPTIONS” and “WIDGET SPECIFIC OPTIONS”\nfor each widget. The former is a list of options that are common to many\nwidgets, the latter are the options that are idiosyncratic to that particular\nwidget. The Standard Options are documented on the options(3) man\npage.
\nNo distinction between standard and widget-specific options is made in this\ndocument. Some options don’t apply to some kinds of widgets. Whether a given\nwidget responds to a particular option depends on the class of the widget;\nbuttons have a command option, labels do not.
\nThe options supported by a given widget are listed in that widget’s man page, or\ncan be queried at runtime by calling the config() method without\narguments, or by calling the keys() method on that widget. The return\nvalue of these calls is a dictionary whose key is the name of the option as a\nstring (for example, 'relief') and whose values are 5-tuples.
Some options, like bg, are synonyms for common options with long names (bg is shorthand for "background"). Passing the config() method the name of a shorthand option will return a 2-tuple, not a 5-tuple. The 2-tuple passed back will contain the name of the synonym and the "real" option (such as ('bg', 'background')).
Index | Meaning                          | Example
------|----------------------------------|---------
0     | option name                      | 'relief'
1     | option name for database lookup  | 'relief'
2     | option class for database lookup | 'Relief'
3     | default value                    | 'raised'
4     | current value                    | 'groove'
Example:
\n>>> print fred.config()\n{'relief' : ('relief', 'relief', 'Relief', 'raised', 'groove')}\n
Of course, the dictionary printed will include all the options available and\ntheir values. This is meant only as an example.
The packer is one of Tk's geometry-management mechanisms. Geometry managers are used to specify the relative positioning of widgets within their container - their mutual master. In contrast to the more cumbersome placer (which is used less commonly, and which we do not cover here), the packer takes qualitative relationship specifications - above, to the left of, filling, etc. - and works everything out to determine the exact placement coordinates for you.
\nThe size of any master widget is determined by the size of the “slave widgets”\ninside. The packer is used to control where slave widgets appear inside the\nmaster into which they are packed. You can pack widgets into frames, and frames\ninto other frames, in order to achieve the kind of layout you desire.\nAdditionally, the arrangement is dynamically adjusted to accommodate incremental\nchanges to the configuration, once it is packed.
\nNote that widgets do not appear until they have had their geometry specified\nwith a geometry manager. It’s a common early mistake to leave out the geometry\nspecification, and then be surprised when the widget is created but nothing\nappears. A widget will appear only after it has had, for example, the packer’s\npack() method applied to it.
\nThe pack() method can be called with keyword-option/value pairs that control\nwhere the widget is to appear within its container, and how it is to behave when\nthe main application window is resized. Here are some examples:
\nfred.pack() # defaults to side = "top"\nfred.pack(side = "left")\nfred.pack(expand = 1)\n
For more extensive information on the packer and the options that it can take,\nsee the man pages and page 183 of John Ousterhout’s book.
\nThe current-value setting of some widgets (like text entry widgets) can be\nconnected directly to application variables by using special options. These\noptions are variable, textvariable, onvalue, offvalue, and\nvalue. This connection works both ways: if the variable changes for any\nreason, the widget it’s connected to will be updated to reflect the new value.
\nUnfortunately, in the current implementation of Tkinter it is not\npossible to hand over an arbitrary Python variable to a widget through a\nvariable or textvariable option. The only kinds of variables for which\nthis works are variables that are subclassed from a class called Variable,\ndefined in the Tkinter module.
\nThere are many useful subclasses of Variable already defined:\nStringVar, IntVar, DoubleVar, and\nBooleanVar. To read the current value of such a variable, call the\nget() method on it, and to change its value you call the set()\nmethod. If you follow this protocol, the widget will always track the value of\nthe variable, with no further intervention on your part.
\nFor example:
\nclass App(Frame):\n def __init__(self, master=None):\n Frame.__init__(self, master)\n self.pack()\n\n self.entrythingy = Entry()\n self.entrythingy.pack()\n\n # here is the application variable\n self.contents = StringVar()\n # set it to some value\n self.contents.set("this is a variable")\n # tell the entry widget to watch this variable\n self.entrythingy["textvariable"] = self.contents\n\n # and here we get a callback when the user hits return.\n # we will have the program print out the value of the\n # application variable when the user hits return\n self.entrythingy.bind('<Key-Return>',\n self.print_contents)\n\n def print_contents(self, event):\n print "hi. contents of entry is now ---->", \\\n self.contents.get()\n
In Tk, there is a utility command, wm, for interacting with the window\nmanager. Options to the wm command allow you to control things like titles,\nplacement, icon bitmaps, and the like. In Tkinter, these commands have\nbeen implemented as methods on the Wm class. Toplevel widgets are\nsubclassed from the Wm class, and so can call the Wm methods\ndirectly.
\nTo get at the toplevel window that contains a given widget, you can often just\nrefer to the widget’s master. Of course if the widget has been packed inside of\na frame, the master won’t represent a toplevel window. To get at the toplevel\nwindow that contains an arbitrary widget, you can call the _root() method.\nThis method begins with an underscore to denote the fact that this function is\npart of the implementation, and not an interface to Tk functionality.
\nHere are some examples of typical usage:
\nfrom Tkinter import *\nclass App(Frame):\n def __init__(self, master=None):\n Frame.__init__(self, master)\n self.pack()\n\n\n# create the application\nmyapp = App()\n\n#\n# here are method calls to the window manager class\n#\nmyapp.master.title("My Do-Nothing Application")\nmyapp.master.maxsize(1000, 400)\n\n# start the program\nmyapp.mainloop()\n
This is any Python function that takes no arguments. For example:
\ndef print_it():\n print "hi there"\nfred["command"] = print_it\n
The bind method from the widget command allows you to watch for certain events\nand to have a callback function trigger when that event type occurs. The form\nof the bind method is:
\ndef bind(self, sequence, func, add=''):
\nwhere:
\nFor example:
\ndef turnRed(self, event):\n event.widget["activeforeground"] = "red"\n\nself.button.bind("<Enter>", self.turnRed)\n
Notice how the widget field of the event is being accessed in the\nturnRed() callback. This field contains the widget that caught the X\nevent. The following table lists the other event fields you can access, and how\nthey are denoted in Tk, which can be useful when referring to the Tk man pages.
\nTk Tkinter Event Field Tk Tkinter Event Field\n-- ------------------- -- -------------------\n%f focus %A char\n%h height %E send_event\n%k keycode %K keysym\n%s state %N keysym_num\n%t time %T type\n%w width %W widget\n%x x %X x_root\n%y y %Y y_root
A number of widgets require "index" parameters to be passed. These are used to point at a specific place in a Text widget, or to particular characters in an Entry widget, or to particular menu items in a Menu widget.
\nEntry widgets have options that refer to character positions in the text being\ndisplayed. You can use these Tkinter functions to access these special\npoints in text widgets:
\nSome options and methods for menus manipulate specific menu entries. Anytime a\nmenu index is needed for an option or a parameter, you may pass in:
\nBitmap/Pixelmap images can be created through the subclasses of\nTkinter.Image:
\nEither type of image is created through either the file or the data\noption (other options are available as well).
\nThe image object can then be used wherever an image option is supported by\nsome widget (e.g. labels, buttons, menus). In these cases, Tk will not keep a\nreference to the image. When the last Python reference to the image object is\ndeleted, the image data is deleted as well, and Tk will display an empty box\nwherever the image was used.
\nThe Tix (Tk Interface Extension) module provides an additional rich set\nof widgets. Although the standard Tk library has many useful widgets, they are\nfar from complete. The Tix library provides most of the commonly needed\nwidgets that are missing from standard Tk: HList, ComboBox,\nControl (a.k.a. SpinBox) and an assortment of scrollable widgets.\nTix also includes many more widgets that are generally useful in a wide\nrange of applications: NoteBook, FileEntry,\nPanedWindow, etc; there are more than 40 of them.
\nWith all these new widgets, you can introduce new interaction techniques into\napplications, creating more useful and more intuitive user interfaces. You can\ndesign your application by choosing the most appropriate widgets to match the\nspecial needs of your application and users.
\nNote
\nTix has been renamed to tkinter.tix in Python 3.0. The\n2to3 tool will automatically adapt imports when converting your\nsources to 3.0.
\nSee also
\nToplevel widget of Tix which represents mostly the main window of an\napplication. It has an associated Tcl interpreter.
Classes in the Tix module subclass the classes in the Tkinter module. The former imports the latter, so to use Tix with Tkinter, all you need to do is to import one module. In general, you can just import Tix, and replace the toplevel call to Tkinter.Tk with Tix.Tk:
\nimport Tix\nfrom Tkconstants import *\nroot = Tix.Tk()\n
To use Tix, you must have the Tix widgets installed, usually\nalongside your installation of the Tk widgets. To test your installation, try\nthe following:
\nimport Tix\nroot = Tix.Tk()\nroot.tk.eval('package require Tix')\n
If this fails, you have a Tk installation problem which must be resolved before\nproceeding. Use the environment variable TIX_LIBRARY to point to the\ninstalled Tix library directory, and make sure you have the dynamic\nobject library (tix8183.dll or libtix8183.so) in the same\ndirectory that contains your Tk dynamic object library (tk8183.dll or\nlibtk8183.so). The directory with the dynamic object library should also\nhave a file called pkgIndex.tcl (case sensitive), which contains the\nline:
\npackage ifneeded Tix 8.1 [list load \"[file join $dir tix8183.dll]\" Tix]
\nTix\nintroduces over 40 widget classes to the Tkinter repertoire. There is a\ndemo of all the Tix widgets in the Demo/tix directory of the\nstandard distribution.
\nThe Tix module adds:
The tix commands provide access to miscellaneous elements of Tix's internal state and the Tix application context. Most of the information manipulated by these methods pertains to the application as a whole, or to a screen or display, rather than to a particular window.
\nTo view the current settings, the common usage is:
\nimport Tix\nroot = Tix.Tk()\nprint root.tix_configure()\n
Resets the scheme and fontset of the Tix application to newScheme and\nnewFontSet, respectively. This affects only those widgets created after this\ncall. Therefore, it is best to call the resetoptions method before the creation\nof any widgets in a Tix application.
\nThe optional parameter newScmPrio can be given to reset the priority level of\nthe Tk options set by the Tix schemes.
Because of the way Tk handles the X option database, after Tix has been imported and initialized, it is not possible to reset the color schemes and font sets using the tix_config() method. Instead, the tix_resetoptions() method must be used.
\nIDLE is the Python IDE built with the tkinter GUI toolkit.
\nIDLE has the following features:
\nThe coloring is applied in a background “thread,” so you may occasionally see\nuncolorized text. To change the color scheme, edit the [Colors] section in\nconfig.txt.
Upon startup with the -s option, IDLE will execute the file referenced by the environment variables IDLESTARTUP or PYTHONSTARTUP. IDLE first checks for IDLESTARTUP; if IDLESTARTUP is present, the file referenced is run. If IDLESTARTUP is not present, IDLE checks for PYTHONSTARTUP. Files referenced by these environment variables are convenient places to store functions that are used frequently from the IDLE shell, or for executing import statements to import common modules.
\nIn addition, Tk also loads a startup file if it is present. Note that the\nTk file is loaded unconditionally. This additional file is .Idle.py and is\nlooked for in the user’s home directory. Statements in this file will be\nexecuted in the Tk namespace, so this file is not useful for importing functions\nto be used from Idle’s Python shell.
\nidle.py [-c command] [-d] [-e] [-s] [-t title] [arg] ...\n\n-c command run this command\n-d enable debugger\n-e edit mode; arguments are files to be edited\n-s run $IDLESTARTUP or $PYTHONSTARTUP first\n-t title set title of shell window
\nIf there are arguments:
\nMajor cross-platform (Windows, Mac OS X, Unix-like) GUI toolkits are\navailable for Python:
\nSee also
\nPyGTK, PyQt, and wxPython, all have a modern look and feel and more\nwidgets than Tkinter. In addition, there are many other GUI toolkits for\nPython, both cross-platform, and platform-specific. See the GUI Programming page in the Python Wiki for a\nmuch more complete list, and also for links to documents where the\ndifferent GUI toolkits are compared.
The ttk module provides access to the Tk themed widget set, which was introduced in Tk 8.5. If Python is not compiled against Tk 8.5, code may still use this module as long as Tile is installed. However, some features provided by the new Tk, such as anti-aliased font rendering under X11 and window transparency (which requires a compositing window manager on X11), will be missing.
\nThe basic idea of ttk is to separate, to the extent possible, the code\nimplementing a widget’s behavior from the code implementing its appearance.
\nSee also
\nTo start using Ttk, import its module:
\nimport ttk\n
But code like this:
\nfrom Tkinter import *\n
may optionally want to use this:
\nfrom Tkinter import *\nfrom ttk import *\n
And then several ttk widgets (Button, Checkbutton,\nEntry, Frame, Label, LabelFrame,\nMenubutton, PanedWindow, Radiobutton, Scale\nand Scrollbar) will automatically substitute for the Tk widgets.
\nThis has the direct benefit of using the new widgets, giving better look & feel\nacross platforms, but be aware that they are not totally compatible. The main\ndifference is that widget options such as “fg”, “bg” and others related to\nwidget styling are no longer present in Ttk widgets. Use ttk.Style to\nachieve the same (or better) styling.
\nSee also
\nTtk comes with 17 widgets, 11 of which already exist in Tkinter:\nButton, Checkbutton, Entry, Frame,\nLabel, LabelFrame, Menubutton,\nPanedWindow, Radiobutton, Scale and\nScrollbar. The 6 new widget classes are: Combobox,\nNotebook, Progressbar, Separator,\nSizegrip and Treeview. All of these classes are\nsubclasses of Widget.
As said previously, you will notice changes in look-and-feel as well as in the styling code. To demonstrate the latter, a very simple example is shown below.
\nTk code:
\nl1 = Tkinter.Label(text="Test", fg="black", bg="white")\nl2 = Tkinter.Label(text="Test", fg="black", bg="white")\n
Corresponding Ttk code:
\nstyle = ttk.Style()\nstyle.configure("BW.TLabel", foreground="black", background="white")\n\nl1 = ttk.Label(text="Test", style="BW.TLabel")\nl2 = ttk.Label(text="Test", style="BW.TLabel")\n
For more information about TtkStyling read the Style class\ndocumentation.
\nttk.Widget defines standard options and methods supported by Tk\nthemed widgets and is not supposed to be directly instantiated.
\nAll the ttk widgets accept the following options:
class
    Specifies the window class. The class is used when querying the option
    database for the window's other options, to determine the default
    bindtags for the window, and to select the widget's default layout and
    style. This is a read-only option which may only be specified when the
    window is created.
cursor
    Specifies the mouse cursor to be used for the widget. If set to the
    empty string (the default), the cursor is inherited from the parent
    widget.
takefocus
    Determines whether the window accepts the focus during keyboard
    traversal. 0, 1 or an empty string is returned. If 0, the window should
    be skipped entirely during keyboard traversal. If 1, the window should
    receive the input focus as long as it is viewable. An empty string
    means that the traversal scripts make the decision about whether or not
    to focus on the window.
style
    May be used to specify a custom widget style.
The following options are supported by widgets that are controlled by a\nscrollbar.
xscrollcommand
    Used to communicate with horizontal scrollbars. When the view in the
    widget's window changes, the widget will generate a Tcl command based
    on the scrollcommand. Usually this option consists of the
    Scrollbar.set() method of some scrollbar; this causes the scrollbar to
    be updated whenever the view in the window changes.
yscrollcommand
    Used to communicate with vertical scrollbars. For more information,
    see above.
The following options are supported by labels, buttons and other button-like\nwidgets.
text
    Specifies a text string to be displayed inside the widget.
textvariable
    Specifies a name whose value will be used in place of the text option
    resource.
underline
    If set, specifies the index (0-based) of a character to underline in
    the text string. The underlined character is used for mnemonic
    activation.
image
    Specifies an image to display. This is a list of 1 or more elements.
    The first element is the default image name. The rest of the list is a
    sequence of statespec/value pairs as defined by Style.map(), specifying
    different images to use when the widget is in a particular state or a
    combination of states. All images in the list should have the same
    size.
compound
    Specifies how to display the image relative to the text, in the case
    both text and image options are present. Valid values are:

    - text: display text only
    - image: display image only
    - top, bottom, left, right: display image above, below, left of, or
      right of the text, respectively
    - none: the default; display the image if present, otherwise the text
width
    If greater than zero, specifies how much space, in character widths, to
    allocate for the text label; if less than zero, specifies a minimum
    width. If zero or unspecified, the natural width of the text label is
    used.
state
    May be set to "normal" or "disabled" to control the "disabled" state
    bit. This is a write-only option: setting it changes the widget state,
    but the Widget.state() method does not affect this option.
The widget state is a bitmap of independent state flags.
active
    The mouse cursor is over the widget and pressing a mouse button will
    cause some action to occur.
disabled
    Widget is disabled under program control.
focus
    Widget has keyboard focus.
pressed
    Widget is being pressed.
selected
    "On", "true", or "current" for things like Checkbuttons and
    radiobuttons.
background
    Windows and Mac have a notion of an "active" or foreground window. The
    background state is set for widgets in a background window, and cleared
    for those in the foreground window.
readonly
    Widget should not allow user modification.
alternate
    A widget-specific alternate display format.
invalid
    The widget's value is invalid.
A state specification is a sequence of state names, optionally prefixed with\nan exclamation point indicating that the bit is off.
\nBesides the methods described below, the ttk.Widget class supports the\nTkinter.Widget.cget() and Tkinter.Widget.configure() methods.
\nReturns the name of the element at position x y, or the empty string\nif the point does not lie within any element.
\nx and y are pixel coordinates relative to the widget.
\nstatespec will usually be a list or a tuple.
\nThe ttk.Combobox widget combines a text field with a pop-down list of\nvalues. This widget is a subclass of Entry.
Besides the methods inherited from Widget (Widget.cget(), Widget.configure(), Widget.identify(), Widget.instate() and Widget.state()) and those inherited from Entry (Entry.bbox(), Entry.delete(), Entry.icursor(), Entry.index(), Entry.insert(), Entry.selection(), Entry.xview()), this class has some other methods, described at ttk.Combobox.
\nThis widget accepts the following options:
exportselection
    Boolean value. If set, the widget selection is linked to the Window
    Manager selection (which can be returned by invoking
    Misc.selection_get(), for example).
justify
    Specifies how the text is aligned within the widget. One of "left",
    "center", or "right".
height
    Specifies the height of the pop-down listbox, in rows.
postcommand
    A script (possibly registered with Misc.register()) that is called
    immediately before displaying the values. It may specify which values
    to display.
state
    One of "normal", "readonly", or "disabled". In the "readonly" state,
    the value may not be edited directly, and the user can only select one
    of the values from the dropdown list. In the "normal" state, the text
    field is directly editable. In the "disabled" state, no interaction is
    possible.
textvariable
    Specifies a name whose value is linked to the widget value. Whenever
    the value associated with that name changes, the widget value is
    updated, and vice versa. See Tkinter.StringVar.
values
    Specifies the list of values to display in the drop-down listbox.
width
    Specifies an integer value indicating the desired width of the entry
    window, in average-size characters of the widget's font.
The combobox widget generates a <<ComboboxSelected>> virtual event\nwhen the user selects an element from the list of values.
\nThe Ttk Notebook widget manages a collection of windows and displays a single\none at a time. Each child window is associated with a tab, which the user\nmay select to change the currently-displayed window.
\nThis widget accepts the following specific options:
height
    If present and greater than zero, specifies the desired height of the pane area (not including internal padding or tabs). Otherwise, the maximum height of all panes is used.

padding
    Specifies the amount of extra space to add around the outside of the notebook. The padding is a list of up to four length specifications: left top right bottom. If fewer than four elements are specified, bottom defaults to top, right defaults to left, and top defaults to left.

width
    If present and greater than zero, specifies the desired width of the pane area (not including internal padding). Otherwise, the maximum width of all panes is used.
There are also specific options for tabs:
state
    Either “normal”, “disabled” or “hidden”. If “disabled”, then the tab is not selectable. If “hidden”, then the tab is not shown.

sticky
    Specifies how the child window is positioned within the pane area. Value is a string containing zero or more of the characters “n”, “s”, “e” or “w”. Each letter refers to a side (north, south, east or west) that the child window will stick to, as per the grid() geometry manager.

padding
    Specifies the amount of extra space to add between the notebook and this pane. Syntax is the same as for the padding option of the widget itself.

text
    Specifies a text to be displayed in the tab.

image
    Specifies an image to display in the tab. See the option image described in Widget.

compound
    Specifies how to display the image relative to the text, in the case both text and image options are present. See Label Options for legal values.

underline
    Specifies the index (0-based) of a character to underline in the text string. The underlined character is used for mnemonic activation if Notebook.enable_traversal() is called.
The tab_id present in several methods of ttk.Notebook may take any\nof the following forms:
\nThis widget generates a <<NotebookTabChanged>> virtual event after a new\ntab is selected.
\nAdds a new tab to the notebook.
\nIf window is currently managed by the notebook but hidden, it is\nrestored to its previous position.
\nSee Tab Options for the list of available options.
\nHides the tab specified by tab_id.
\nThe tab will not be displayed, but the associated window remains\nmanaged by the notebook and its configuration remembered. Hidden tabs\nmay be restored with the add() command.
\nInserts a pane at the specified position.
\npos is either the string “end”, an integer index, or the name of a\nmanaged child. If child is already managed by the notebook, moves it to\nthe specified position.
\nSee Tab Options for the list of available options.
\nSelects the specified tab_id.
\nThe associated child window will be displayed, and the\npreviously-selected window (if different) is unmapped. If tab_id is\nomitted, returns the widget name of the currently selected pane.
Query or modify the options of the specified tab_id.
\nIf kw is not given, returns a dictionary of the tab option values. If\noption is specified, returns the value of that option. Otherwise,\nsets the options to the corresponding values.
\nEnable keyboard traversal for a toplevel window containing this notebook.
This will extend the bindings for the toplevel window containing the notebook as follows:

- Control-Tab: selects the tab following the currently selected one.
- Shift-Control-Tab: selects the tab preceding the currently selected one.
- Alt-K: where K is the mnemonic (underlined) character of any tab, will select that tab.

Multiple notebooks in a single toplevel may be enabled for traversal, including nested notebooks. However, notebook traversal only works properly if all panes have the notebook they are in as master.
\nThe ttk.Progressbar widget shows the status of a long-running\noperation. It can operate in two modes: determinate mode shows the amount\ncompleted relative to the total amount of work to be done, and indeterminate\nmode provides an animated display to let the user know that something is\nhappening.
\nThis widget accepts the following specific options:
orient
    One of “horizontal” or “vertical”. Specifies the orientation of the progress bar.

length
    Specifies the length of the long axis of the progress bar (width if horizontal, height if vertical).

mode
    One of “determinate” or “indeterminate”.

maximum
    A number specifying the maximum value. Defaults to 100.

value
    The current value of the progress bar. In “determinate” mode, this represents the amount of work completed. In “indeterminate” mode, it is interpreted modulo maximum; that is, the progress bar completes one “cycle” when its value increases by maximum.

variable
    A name which is linked to the option value. If specified, the value of the progress bar is automatically set to the value of this name whenever the latter is modified.

phase
    Read-only option. The widget periodically increments the value of this option whenever its value is greater than 0 and, in determinate mode, less than maximum. This option may be used by the current theme to provide additional animation effects.
Increments the progress bar’s value by amount.
\namount defaults to 1.0 if omitted.
\nThe ttk.Separator widget displays a horizontal or vertical separator\nbar.
\nIt has no other methods besides the ones inherited from ttk.Widget.
\nThis widget accepts the following specific option:
orient
    One of “horizontal” or “vertical”. Specifies the orientation of the separator.
The ttk.Sizegrip widget (also known as a grow box) allows the user to\nresize the containing toplevel window by pressing and dragging the grip.
\nThis widget has neither specific options nor specific methods, besides the\nones inherited from ttk.Widget.
\nThe ttk.Treeview widget displays a hierarchical collection of items.\nEach item has a textual label, an optional image, and an optional list of data\nvalues. The data values are displayed in successive columns after the tree\nlabel.
\nThe order in which data values are displayed may be controlled by setting\nthe widget option displaycolumns. The tree widget can also display column\nheadings. Columns may be accessed by number or symbolic names listed in the\nwidget option columns. See Column Identifiers.
Each item is identified by a unique name. The widget will generate item IDs if they are not supplied by the caller. There is a distinguished root item, named {}. The root item itself is not displayed; its children appear at the top level of the hierarchy.
\nEach item also has a list of tags, which can be used to associate event bindings\nwith individual items and control the appearance of the item.
\nThe Treeview widget supports horizontal and vertical scrolling, according to\nthe options described in Scrollable Widget Options and the methods\nTreeview.xview() and Treeview.yview().
\nThis widget accepts the following specific options:
columns
    A list of column identifiers, specifying the number of columns and their names.

displaycolumns
    A list of column identifiers (either symbolic or integer indices) specifying which data columns are displayed and the order in which they appear, or the string “#all”.

height
    Specifies the number of rows which should be visible. Note: the requested width is determined from the sum of the column widths.

padding
    Specifies the internal padding for the widget. The padding is a list of up to four length specifications.

selectmode
    Controls how the built-in class bindings manage the selection. One of “extended”, “browse” or “none”. If set to “extended” (the default), multiple items may be selected. If “browse”, only a single item will be selected at a time. If “none”, the selection will not be changed.

    Note that the application code and tag bindings can set the selection however they wish, regardless of the value of this option.

show
    A list containing zero or more of the following values, specifying which elements of the tree to display:

    - tree: display tree labels in column #0.
    - headings: display the heading row.

    The default is “tree headings”, i.e., show all elements.

    Note: Column #0 always refers to the tree column, even if show=”tree” is not specified.
The following item options may be specified for items in the insert and item\nwidget commands.
text
    The textual label to display for the item.

image
    A Tk Image, displayed to the left of the label.

values
    The list of values associated with the item.

    Each item should have the same number of values as the widget option columns. If there are fewer values than columns, the remaining values are assumed empty. If there are more values than columns, the extra values are ignored.

open
    True/False value indicating whether the item’s children should be displayed or hidden.

tags
    A list of tags associated with this item.
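The pad/truncate rule for the values option can be pictured with a small sketch (normalize_values is a hypothetical helper written for illustration, not part of ttk):

```python
def normalize_values(values, ncolumns):
    # Hypothetical helper illustrating how Treeview matches an item's
    # values to the widget's columns: extra values are ignored, and
    # missing values are assumed empty.
    values = list(values)[:ncolumns]           # extra values are ignored
    values += [""] * (ncolumns - len(values))  # missing values assumed empty
    return values
```

For example, with three columns an item inserted with values=["a"] behaves as if it carried ["a", "", ""].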
The following options may be specified on tags:
foreground
    Specifies the text foreground color.

background
    Specifies the cell or item background color.

font
    Specifies the font to use when drawing text.

image
    Specifies the item image, in case the item’s image option is empty.
Column identifiers take any of the following forms:
\nNotes:
\nA data column number is an index into an item’s option values list; a display\ncolumn number is the column number in the tree where the values are displayed.\nTree labels are displayed in column #0. If option displaycolumns is not set,\nthen data column n is displayed in column #n+1. Again, column #0 always\nrefers to the tree column.
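The default numbering rule can be sketched as a tiny helper (display_column is a hypothetical name, for illustration only): with displaycolumns unset, data column n appears in display column #n+1, while #0 is always the tree column.

```python
def display_column(n):
    # Hypothetical helper: map data column n to its default display
    # column identifier when the "displaycolumns" option is not set.
    # Column #0 is always the tree column, so data columns start at #1.
    return "#%d" % (n + 1)
```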
\nThe Treeview widget generates the following virtual events.
<<TreeviewSelect>>
    Generated whenever the selection changes.

<<TreeviewOpen>>
    Generated just before setting the focus item to open=True.

<<TreeviewClose>>
    Generated just after setting the focus item to open=False.
The Treeview.focus() and Treeview.selection() methods can be used\nto determine the affected item or items.
\nReturns the bounding box (relative to the treeview widget’s window) of\nthe specified item in the form (x, y, width, height).
\nIf column is specified, returns the bounding box of that cell. If the\nitem is not visible (i.e., if it is a descendant of a closed item or is\nscrolled offscreen), returns an empty string.
\nReturns the list of children belonging to item.
If item is not specified, returns the children of the root item.
Replaces item’s children with newchildren.

Children present in item that are not present in newchildren are detached from the tree. No item in newchildren may be an ancestor of item. Note that not specifying newchildren results in detaching all of item’s children.
\nQuery or modify the options for the specified column.
\nIf kw is not given, returns a dict of the column option values. If\noption is specified then the value for that option is returned.\nOtherwise, sets the options to the corresponding values.
The valid options/values are:

id
    Returns the column name. This is a read-only option.

anchor
    Specifies how the text in this column should be aligned with respect to the cell.

minwidth
    The minimum width of the column in pixels. The treeview widget will not make the column any smaller than specified by this option when the widget is resized or the user drags a column.

stretch
    Specifies whether the column’s width should be adjusted when the widget is resized.

width
    The width of the column in pixels.

To configure the tree column, call this with column = “#0”.
\nDelete all specified items and all their descendants.
\nThe root item may not be deleted.
\nUnlinks all of the specified items from the tree.
\nThe items and all of their descendants are still present, and may be\nreinserted at another point in the tree, but will not be displayed.
\nThe root item may not be detached.
\nQuery or modify the heading options for the specified column.
\nIf kw is not given, returns a dict of the heading option values. If\noption is specified then the value for that option is returned.\nOtherwise, sets the options to the corresponding values.
The valid options/values are:

text
    The text to display in the column heading.

image
    Specifies an image to display to the right of the column heading.

anchor
    Specifies how the heading text should be aligned. One of the standard Tk anchor values.

command
    A callback to be invoked when the heading label is pressed.

To configure the tree column heading, call this with column = “#0”.
\nReturns the data column identifier of the cell at position x.
\nThe tree column has ID #0.
\nReturns one of:
heading
    Tree heading area.

separator
    Space between two column headings.

tree
    The tree area.

cell
    A data cell.
Availability: Tk 8.6.
\nReturns the element at position x, y.
\nAvailability: Tk 8.6.
\nCreates a new item and returns the item identifier of the newly created\nitem.
\nparent is the item ID of the parent item, or the empty string to create\na new top-level item. index is an integer, or the value “end”,\nspecifying where in the list of parent’s children to insert the new item.\nIf index is less than or equal to zero, the new node is inserted at\nthe beginning; if index is greater than or equal to the current number\nof children, it is inserted at the end. If iid is specified, it is used\nas the item identifier; iid must not already exist in the tree.\nOtherwise, a new unique identifier is generated.
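The index rule described above can be sketched as a clamp (clamp_index is a hypothetical helper, not part of ttk):

```python
def clamp_index(index, nchildren):
    # Hypothetical sketch of the index rule for Treeview.insert():
    # "end" and indices >= the number of children append at the end,
    # while indices <= 0 insert at the beginning.
    if index == "end" or index >= nchildren:
        return nchildren
    if index <= 0:
        return 0
    return index
```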
See Item Options for the list of available options.
\nQuery or modify the options for the specified item.
\nIf no options are given, a dict with options/values for the item is\nreturned.\nIf option is specified then the value for that option is returned.\nOtherwise, sets the options to the corresponding values as given by kw.
Moves item to position index in parent’s list of children.
\nIt is illegal to move an item under one of its descendants. If index is\nless than or equal to zero, item is moved to the beginning; if greater\nthan or equal to the number of children, it is moved to the end. If item\nwas detached it is reattached.
\nEnsure that item is visible.
Sets the open option of all of item’s ancestors to True, and scrolls the widget if necessary so that item is within the visible portion of the tree.
\nQuery or modify the options for the specified tagname.
\nIf kw is not given, returns a dict of the option settings for\ntagname. If option is specified, returns the value for that option\nfor the specified tagname. Otherwise, sets the options to the\ncorresponding values for the given tagname.
\nIf item is specified, returns 1 or 0 depending on whether the specified\nitem has the given tagname. Otherwise, returns a list of all items\nthat have the specified tag.
\nAvailability: Tk 8.6
\nEach widget in ttk is assigned a style, which specifies the set of\nelements making up the widget and how they are arranged, along with dynamic and\ndefault settings for element options. By default the style name is the same as\nthe widget’s class name, but it may be overridden by the widget’s style\noption. If the class name of a widget is unknown, use the method\nMisc.winfo_class() (somewidget.winfo_class()).
\nThis class is used to manipulate the style database.
\nQuery or set the default value of the specified option(s) in style.
\nEach key in kw is an option and each value is a string identifying\nthe value for that option.
\nFor example, to change every default button to be a flat button with some\npadding and a different background color do:
\nimport ttk\nimport Tkinter\n\nroot = Tkinter.Tk()\n\nttk.Style().configure("TButton", padding=6, relief="flat",\n background="#ccc")\n\nbtn = ttk.Button(text="Sample")\nbtn.pack()\n\nroot.mainloop()\n
Query or set dynamic values of the specified option(s) in style.

Each key in kw is an option and each value should be a list or tuple of statespecs. A statespec is a sequence of one or more states followed by a value.
\nAn example:
import Tkinter
import ttk

root = Tkinter.Tk()

style = ttk.Style()
style.map("C.TButton",
    foreground=[('pressed', 'red'), ('active', 'blue')],
    background=[('pressed', '!disabled', 'black'), ('active', 'white')]
    )

colored_btn = ttk.Button(text="Test", style="C.TButton")
colored_btn.pack()

root.mainloop()
Note that the order of the (states, value) sequences for an\noption matters. In the previous example, if you change the\norder to [('active', 'blue'), ('pressed', 'red')] in the\nforeground option, for example, you would get a blue foreground\nwhen the widget is in the active or pressed states.
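This first-match rule can be pictured without a GUI. The helper below (lookup_statespec, a hypothetical name; ttk performs this matching internally) scans the statespecs in order and returns the first value whose states all hold:

```python
def lookup_statespec(spec, active_states):
    # Hypothetical sketch of ttk's statespec matching: scan the
    # (state..., value) entries in order and return the value of the
    # first entry whose states all hold.  A state written "!state"
    # matches when that state is absent.
    for entry in spec:
        states, value = entry[:-1], entry[-1]
        if all(s[1:] not in active_states if s.startswith('!')
               else s in active_states for s in states):
            return value
    return None

spec = [('pressed', 'red'), ('active', 'blue')]
lookup_statespec(spec, set(['pressed', 'active']))        # -> 'red'
lookup_statespec(spec[::-1], set(['pressed', 'active']))  # -> 'blue'
```

With the order swapped, a widget that is both pressed and active matches ('active', 'blue') first, which is exactly the pitfall described above.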
\nReturns the value specified for option in style.
\nIf state is specified, it is expected to be a sequence of one or more\nstates. If the default argument is set, it is used as a fallback value\nin case no specification for option is found.
\nTo check what font a Button uses by default, do:
\nimport ttk\n\nprint ttk.Style().lookup("TButton", "font")\n
Define the widget layout for given style. If layoutspec is omitted,\nreturn the layout specification for given style.
\nlayoutspec, if specified, is expected to be a list or some other\nsequence type (excluding strings), where each item should be a tuple and\nthe first item is the layout name and the second item should have the\nformat described in Layouts.
\nTo understand the format, see the following example (it is not\nintended to do anything useful):
\nimport ttk\nimport Tkinter\n\nroot = Tkinter.Tk()\n\nstyle = ttk.Style()\nstyle.layout("TMenubutton", [\n ("Menubutton.background", None),\n ("Menubutton.button", {"children":\n [("Menubutton.focus", {"children":\n [("Menubutton.padding", {"children":\n [("Menubutton.label", {"side": "left", "expand": 1})]\n })]\n })]\n }),\n])\n\nmbtn = ttk.Menubutton(text='Text')\nmbtn.pack()\nroot.mainloop()\n
Create a new element in the current theme, of the given etype which is\nexpected to be either “image”, “from” or “vsapi”. The latter is only\navailable in Tk 8.6a for Windows XP and Vista and is not described here.
\nIf “image” is used, args should contain the default image name followed\nby statespec/value pairs (this is the imagespec), and kw may have the\nfollowing options:
border=padding
    padding is a list of up to four integers, specifying the left, top, right, and bottom borders, respectively.

height=height
    Specifies a minimum height for the element. If less than zero, the base image’s height is used as a default.

padding=padding
    Specifies the element’s interior padding. Defaults to border’s value if not specified.

sticky=spec
    Specifies how the image is placed within the final parcel. spec contains zero or more characters “n”, “s”, “w”, or “e”.

width=width
    Specifies a minimum width for the element. If less than zero, the base image’s width is used as a default.
If “from” is used as the value of etype,\nelement_create() will clone an existing\nelement. args is expected to contain a themename, from which\nthe element will be cloned, and optionally an element to clone from.\nIf this element to clone from is not specified, an empty element will\nbe used. kw is discarded.
\nCreate a new theme.
\nIt is an error if themename already exists. If parent is specified,\nthe new theme will inherit styles, elements and layouts from the parent\ntheme. If settings are present they are expected to have the same\nsyntax used for theme_settings().
Temporarily sets the current theme to themename, applies the specified settings, and then restores the previous theme.
\nEach key in settings is a style and each value may contain the keys\n‘configure’, ‘map’, ‘layout’ and ‘element create’ and they are expected\nto have the same format as specified by the methods\nStyle.configure(), Style.map(), Style.layout() and\nStyle.element_create() respectively.
\nAs an example, let’s change the Combobox for the default theme a bit:
import ttk
import Tkinter

root = Tkinter.Tk()

style = ttk.Style()
style.theme_settings("default", {
    "TCombobox": {
        "configure": {"padding": 5},
        "map": {
            "background": [("active", "green2"),
                           ("!disabled", "green4")],
            "fieldbackground": [("!disabled", "green3")],
            "foreground": [("focus", "OliveDrab1"),
                           ("!disabled", "OliveDrab2")]
        }
    }
})

combo = ttk.Combobox()
combo.pack()

root.mainloop()
A layout can be just None, if it takes no options, or a dict of\noptions specifying how to arrange the element. The layout mechanism\nuses a simplified version of the pack geometry manager: given an\ninitial cavity, each element is allocated a parcel. Valid\noptions/values are:
side: whichside
    Specifies which side of the cavity to place the element; one of top, right, bottom or left. If omitted, the element occupies the entire cavity.

sticky: nswe
    Specifies where the element is placed inside its allocated parcel.

unit: 0 or 1
    If set to 1, causes the element and all of its descendants to be treated as a single element for the purposes of Widget.identify() et al. It’s used for things like scrollbar thumbs with grips.

children: [sublayout... ]
    Specifies a list of elements to place inside the element. Each element is a tuple (or other sequence type) where the first item is the layout name, and the other is a Layout.
\nNew in version 2.1.
\nSource code: Lib/pydoc.py
\nThe pydoc module automatically generates documentation from Python\nmodules. The documentation can be presented as pages of text on the console,\nserved to a Web browser, or saved to HTML files.
\nThe built-in function help() invokes the online help system in the\ninteractive interpreter, which uses pydoc to generate its documentation\nas text on the console. The same text documentation can also be viewed from\noutside the Python interpreter by running pydoc as a script at the\noperating system’s command prompt. For example, running
\npydoc sys
\nat a shell prompt will display documentation on the sys module, in a\nstyle similar to the manual pages shown by the Unix man command. The\nargument to pydoc can be the name of a function, module, or package,\nor a dotted reference to a class, method, or function within a module or module\nin a package. If the argument to pydoc looks like a path (that is,\nit contains the path separator for your operating system, such as a slash in\nUnix), and refers to an existing Python source file, then documentation is\nproduced for that file.
\nNote
\nIn order to find objects and their documentation, pydoc imports the\nmodule(s) to be documented. Therefore, any code on module level will be\nexecuted on that occasion. Use an if __name__ == '__main__': guard to\nonly execute code when a file is invoked as a script and not just imported.
\nSpecifying a -w flag before the argument will cause HTML documentation\nto be written out to a file in the current directory, instead of displaying text\non the console.
\nSpecifying a -k flag before the argument will search the synopsis\nlines of all available modules for the keyword given as the argument, again in a\nmanner similar to the Unix man command. The synopsis line of a\nmodule is the first line of its documentation string.
You can also use pydoc to start an HTTP server on the local machine that will serve documentation to visiting Web browsers. pydoc -p 1234 will start an HTTP server on port 1234, allowing you to browse the documentation at http://localhost:1234/ in your preferred Web browser. pydoc -g will start the server and additionally bring up a small Tkinter-based graphical interface to help you search for documentation pages.
\nWhen pydoc generates documentation, it uses the current environment\nand path to locate modules. Thus, invoking pydoc spam\ndocuments precisely the version of the module you would get if you started the\nPython interpreter and typed import spam.
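The same renderer that backs the command line can also be called from Python. For example (pydoc.render_doc and pydoc.plain are functions of the pydoc module; render_doc produces the text that pydoc sys would print, and plain strips the overstrike control characters used for terminal bolding):

```python
import pydoc

# Render the text documentation for the sys module, as "pydoc sys"
# would, then strip the terminal overstrike (bold) control characters.
text = pydoc.plain(pydoc.render_doc("sys"))
```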
\nModule docs for core modules are assumed to reside in\nhttp://docs.python.org/library/. This can be overridden by setting the\nPYTHONDOCS environment variable to a different URL or to a local\ndirectory containing the Library Reference Manual pages.
\nTurtle graphics is a popular way for introducing programming to kids. It was\npart of the original Logo programming language developed by Wally Feurzig and\nSeymour Papert in 1966.
\nImagine a robotic turtle starting at (0, 0) in the x-y plane. After an import turtle, give it the\ncommand turtle.forward(15), and it moves (on-screen!) 15 pixels in the\ndirection it is facing, drawing a line as it moves. Give it the command\nturtle.right(25), and it rotates in-place 25 degrees clockwise.
\nBy combining together these and similar commands, intricate shapes and pictures\ncan easily be drawn.
\nThe turtle module is an extended reimplementation of the same-named\nmodule from the Python standard distribution up to version Python 2.5.
It tries to keep the merits of the old turtle module and to be (nearly) 100% compatible with it. Above all, this means that the learning programmer can use all the commands, classes and methods interactively when using the module from within IDLE run with the -n switch.
\nThe turtle module provides turtle graphics primitives, in both object-oriented\nand procedure-oriented ways. Because it uses Tkinter for the underlying\ngraphics, it needs a version of Python installed with Tk support.
\nThe object-oriented interface uses essentially two+two classes:
\nThe TurtleScreen class defines graphics windows as a playground for\nthe drawing turtles. Its constructor needs a Tkinter.Canvas or a\nScrolledCanvas as argument. It should be used when turtle is\nused as part of some application.
\nThe function Screen() returns a singleton object of a\nTurtleScreen subclass. This function should be used when\nturtle is used as a standalone tool for doing graphics.\nAs a singleton object, inheriting from its class is not possible.
\nAll methods of TurtleScreen/Screen also exist as functions, i.e. as part of\nthe procedure-oriented interface.
\nRawTurtle (alias: RawPen) defines Turtle objects which draw\non a TurtleScreen. Its constructor needs a Canvas, ScrolledCanvas\nor TurtleScreen as argument, so the RawTurtle objects know where to draw.
Derived from RawTurtle is the subclass Turtle (alias: Pen), which draws on “the” Screen instance, which is automatically created if not already present.
\nAll methods of RawTurtle/Turtle also exist as functions, i.e. part of the\nprocedure-oriented interface.
\nThe procedural interface provides functions which are derived from the methods\nof the classes Screen and Turtle. They have the same names as\nthe corresponding methods. A screen object is automatically created whenever a\nfunction derived from a Screen method is called. An (unnamed) turtle object is\nautomatically created whenever any of the functions derived from a Turtle method\nis called.
To use multiple turtles on a screen, one has to use the object-oriented interface.
\nNote
\nIn the following documentation the argument list for functions is given.\nMethods, of course, have the additional first argument self which is\nomitted here.
\nMost of the examples in this section refer to a Turtle instance called\nturtle.
Parameter: distance – a number (integer or float)
Move the turtle forward by the specified distance, in the direction the\nturtle is headed.
\n>>> turtle.position()\n(0.00,0.00)\n>>> turtle.forward(25)\n>>> turtle.position()\n(25.00,0.00)\n>>> turtle.forward(-75)\n>>> turtle.position()\n(-50.00,0.00)\n
Parameter: | distance – a number | \n
---|
Move the turtle backward by distance, opposite to the direction the\nturtle is headed. Do not change the turtle’s heading.
\n>>> turtle.position()\n(0.00,0.00)\n>>> turtle.backward(30)\n>>> turtle.position()\n(-30.00,0.00)\n
Parameter: | angle – a number (integer or float) | \n
---|
Turn turtle right by angle units. (Units are by default degrees, but\ncan be set via the degrees() and radians() functions.) Angle\norientation depends on the turtle mode, see mode().
\n>>> turtle.heading()\n22.0\n>>> turtle.right(45)\n>>> turtle.heading()\n337.0\n
Parameter: | angle – a number (integer or float) | \n
---|
Turn turtle left by angle units. (Units are by default degrees, but\ncan be set via the degrees() and radians() functions.) Angle\norientation depends on the turtle mode, see mode().
\n>>> turtle.heading()\n22.0\n>>> turtle.left(45)\n>>> turtle.heading()\n67.0\n
Parameters: |
| \n
---|
If y is None, x must be a pair of coordinates or a Vec2D\n(e.g. as returned by pos()).
\nMove turtle to an absolute position. If the pen is down, draw line. Do\nnot change the turtle’s orientation.
\n>>> tp = turtle.pos()\n>>> tp\n(0.00,0.00)\n>>> turtle.setpos(60,30)\n>>> turtle.pos()\n(60.00,30.00)\n>>> turtle.setpos((20,80))\n>>> turtle.pos()\n(20.00,80.00)\n>>> turtle.setpos(tp)\n>>> turtle.pos()\n(0.00,0.00)\n
Parameter: | x – a number (integer or float) | \n
---|
Set the turtle’s first coordinate to x, leave second coordinate\nunchanged.
\n>>> turtle.position()\n(0.00,240.00)\n>>> turtle.setx(10)\n>>> turtle.position()\n(10.00,240.00)\n
Parameter: | y – a number (integer or float) | \n
---|
Set the turtle’s second coordinate to y, leave first coordinate unchanged.
\n>>> turtle.position()\n(0.00,40.00)\n>>> turtle.sety(-10)\n>>> turtle.position()\n(0.00,-10.00)\n
Parameter: | to_angle – a number (integer or float) | \n
---|
Set the orientation of the turtle to to_angle. Here are some common\ndirections in degrees:
standard mode      logo mode
0 - east           0 - north
90 - north         90 - east
180 - west         180 - south
270 - south        270 - west
>>> turtle.setheading(90)\n>>> turtle.heading()\n90.0\n
Move turtle to the origin – coordinates (0,0) – and set its heading to\nits start-orientation (which depends on the mode, see mode()).
\n>>> turtle.heading()\n90.0\n>>> turtle.position()\n(0.00,-10.00)\n>>> turtle.home()\n>>> turtle.position()\n(0.00,0.00)\n>>> turtle.heading()\n0.0\n
Parameters: |
| \n
---|
Draw a circle with given radius. The center is radius units left of\nthe turtle; extent – an angle – determines which part of the circle\nis drawn. If extent is not given, draw the entire circle. If extent\nis not a full circle, one endpoint of the arc is the current pen\nposition. Draw the arc in counterclockwise direction if radius is\npositive, otherwise in clockwise direction. Finally the direction of the\nturtle is changed by the amount of extent.
\nAs the circle is approximated by an inscribed regular polygon, steps\ndetermines the number of steps to use. If not given, it will be\ncalculated automatically. May be used to draw regular polygons.
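The inscribed-polygon approximation can be sketched in plain Python. The helper below (arc_points, a hypothetical name; the real turtle module computes the step count automatically when steps is omitted) produces the polygon vertices for a turtle at (x, y) with the given heading, where the center lies radius units to the turtle's left:

```python
import math

def arc_points(x, y, heading, radius, extent, steps):
    # Hypothetical sketch of turtle.circle() geometry: the center is
    # `radius` units to the turtle's left, and the arc of `extent`
    # degrees is approximated by `steps` equal chords of an inscribed
    # regular polygon.
    cx = x + radius * math.cos(math.radians(heading + 90))
    cy = y + radius * math.sin(math.radians(heading + 90))
    start = math.degrees(math.atan2(y - cy, x - cx))
    pts = []
    for i in range(steps + 1):
        a = math.radians(start + extent * float(i) / steps)
        pts.append((cx + radius * math.cos(a), cy + radius * math.sin(a)))
    return pts
```

For a full circle (extent=360) the last vertex coincides with the starting position, matching the example above where the turtle returns to (0.00,0.00).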
\n>>> turtle.home()\n>>> turtle.position()\n(0.00,0.00)\n>>> turtle.heading()\n0.0\n>>> turtle.circle(50)\n>>> turtle.position()\n(-0.00,0.00)\n>>> turtle.heading()\n0.0\n>>> turtle.circle(120, 180) # draw a semicircle\n>>> turtle.position()\n(0.00,240.00)\n>>> turtle.heading()\n180.0\n
Parameters: |
| \n
---|
Draw a circular dot with diameter size, using color. If size is\nnot given, the maximum of pensize+4 and 2*pensize is used.
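The default-size rule quoted above is easy to state in code (default_dot_size is a hypothetical name used here for illustration):

```python
def default_dot_size(pensize):
    # When no size is passed to dot(), the documented default is the
    # maximum of pensize+4 and 2*pensize.
    return max(pensize + 4, 2 * pensize)
```

So a thin pen gets a dot noticeably larger than the pen (pensize 1 gives size 5), while for pensize >= 4 the dot is simply twice the pen width.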
\n>>> turtle.home()\n>>> turtle.dot()\n>>> turtle.fd(50); turtle.dot(20, "blue"); turtle.fd(50)\n>>> turtle.position()\n(100.00,-0.00)\n>>> turtle.heading()\n0.0\n
Stamp a copy of the turtle shape onto the canvas at the current turtle\nposition. Return a stamp_id for that stamp, which can be used to delete\nit by calling clearstamp(stamp_id).
\n>>> turtle.color("blue")\n>>> turtle.stamp()\n11\n>>> turtle.fd(50)\n
Parameter: stampid – an integer, must be return value of previous stamp() call
Delete stamp with given stampid.
\n>>> turtle.position()\n(150.00,-0.00)\n>>> turtle.color("blue")\n>>> astamp = turtle.stamp()\n>>> turtle.fd(50)\n>>> turtle.position()\n(200.00,-0.00)\n>>> turtle.clearstamp(astamp)\n>>> turtle.position()\n(200.00,-0.00)\n
Parameter: n – an integer (or None)
Delete all or the first/last n of the turtle’s stamps. If n is None, delete all stamps; if n > 0, delete the first n stamps; if n < 0, delete the last n stamps.
\n>>> for i in range(8):\n... turtle.stamp(); turtle.fd(30)\n13\n14\n15\n16\n17\n18\n19\n20\n>>> turtle.clearstamps(2)\n>>> turtle.clearstamps(-2)\n>>> turtle.clearstamps()\n
Undo (repeatedly) the last turtle action(s). Number of available\nundo actions is determined by the size of the undobuffer.
\n>>> for i in range(4):\n... turtle.fd(50); turtle.lt(80)\n...\n>>> for i in range(8):\n... turtle.undo()\n
Parameter: speed – an integer in the range 0..10 or a speedstring (see below)
Set the turtle’s speed to an integer value in the range 0..10. If no\nargument is given, return current speed.
If the input is a number greater than 10 or smaller than 0.5, speed is set to 0. Speedstrings are mapped to speedvalues as follows:

- "fastest": 0
- "fast": 10
- "normal": 6
- "slow": 3
- "slowest": 1
\nSpeeds from 1 to 10 enforce increasingly faster animation of line drawing\nand turtle turning.
Attention: speed = 0 means that no animation takes place. forward/back make the turtle jump, and likewise left/right make the turtle turn instantly.
\n>>> turtle.speed()\n3\n>>> turtle.speed('normal')\n>>> turtle.speed()\n6\n>>> turtle.speed(9)\n>>> turtle.speed()\n9\n
Return the turtle’s current location (x,y) (as a Vec2D vector).
\n>>> turtle.pos()\n(440.00,-0.00)\n
Parameters:
- x – a number or a pair/vector of numbers or a turtle instance
- y – a number if x is a number, else None
Return the angle of the line from the turtle’s position to the position specified by (x,y), the vector, or the other turtle. This depends on the turtle’s start orientation, which in turn depends on the mode ("standard"/"world" or "logo").
\n>>> turtle.goto(10, 10)\n>>> turtle.towards(0,0)\n225.0\n
Return the turtle’s x coordinate.
\n>>> turtle.home()\n>>> turtle.left(50)\n>>> turtle.forward(100)\n>>> turtle.pos()\n(64.28,76.60)\n>>> print turtle.xcor()\n64.2787609687\n
Return the turtle’s y coordinate.
\n>>> turtle.home()\n>>> turtle.left(60)\n>>> turtle.forward(100)\n>>> print turtle.pos()\n(50.00,86.60)\n>>> print turtle.ycor()\n86.6025403784\n
Return the turtle’s current heading (value depends on the turtle mode, see\nmode()).
\n>>> turtle.home()\n>>> turtle.left(67)\n>>> turtle.heading()\n67.0\n
Parameters:
- x – a number or a pair/vector of numbers or a turtle instance
- y – a number if x is a number, else None
Return the distance from the turtle to (x,y), the given vector, or the given\nother turtle, in turtle step units.
\n>>> turtle.home()\n>>> turtle.distance(30,40)\n50.0\n>>> turtle.distance((30,40))\n50.0\n>>> joe = Turtle()\n>>> joe.forward(77)\n>>> turtle.distance(joe)\n77.0\n
Parameter: fullcircle – a number
Set angle measurement units, i.e. set number of “degrees” for a full circle.\nDefault value is 360 degrees.
\n>>> turtle.home()\n>>> turtle.left(90)\n>>> turtle.heading()\n90.0\n\nChange angle measurement unit to grad (also known as gon,\ngrade, or gradian and equals 1/100-th of the right angle.)\n>>> turtle.degrees(400.0)\n>>> turtle.heading()\n100.0\n>>> turtle.degrees(360)\n>>> turtle.heading()\n90.0\n
Set the angle measurement units to radians. Equivalent to\ndegrees(2*math.pi).
\n>>> turtle.home()\n>>> turtle.left(90)\n>>> turtle.heading()\n90.0\n>>> turtle.radians()\n>>> turtle.heading()\n1.5707963267948966\n
Parameter: width – a positive number
Set the line thickness to width or return it. If resizemode is set to\n“auto” and turtleshape is a polygon, that polygon is drawn with the same line\nthickness. If no argument is given, the current pensize is returned.
\n>>> turtle.pensize()\n1\n>>> turtle.pensize(10) # from here on lines of width 10 are drawn\n
Parameters:
- pen – a dictionary with some or all of the below listed keys
- pendict – one or more keyword-arguments with the below listed keys as keywords
Return or set the pen’s attributes in a “pen-dictionary” with the following key/value pairs:

- "shown": True/False
- "pendown": True/False
- "pencolor": color-string or color-tuple
- "fillcolor": color-string or color-tuple
- "pensize": positive number
- "speed": number in range 0..10
- "resizemode": "auto" or "user" or "noresize"
- "stretchfactor": (positive number, positive number)
- "outline": positive number
- "tilt": number
\nThis dictionary can be used as argument for a subsequent call to pen()\nto restore the former pen-state. Moreover one or more of these attributes\ncan be provided as keyword-arguments. This can be used to set several pen\nattributes in one statement.
\n>>> turtle.pen(fillcolor="black", pencolor="red", pensize=10)\n>>> sorted(turtle.pen().items())\n[('fillcolor', 'black'), ('outline', 1), ('pencolor', 'red'),\n ('pendown', True), ('pensize', 10), ('resizemode', 'noresize'),\n ('shown', True), ('speed', 9), ('stretchfactor', (1, 1)), ('tilt', 0)]\n>>> penstate=turtle.pen()\n>>> turtle.color("yellow", "")\n>>> turtle.penup()\n>>> sorted(turtle.pen().items())\n[('fillcolor', ''), ('outline', 1), ('pencolor', 'yellow'),\n ('pendown', False), ('pensize', 10), ('resizemode', 'noresize'),\n ('shown', True), ('speed', 9), ('stretchfactor', (1, 1)), ('tilt', 0)]\n>>> turtle.pen(penstate, fillcolor="green")\n>>> sorted(turtle.pen().items())\n[('fillcolor', 'green'), ('outline', 1), ('pencolor', 'red'),\n ('pendown', True), ('pensize', 10), ('resizemode', 'noresize'),\n ('shown', True), ('speed', 9), ('stretchfactor', (1, 1)), ('tilt', 0)]\n
Return True if pen is down, False if it’s up.
\n>>> turtle.penup()\n>>> turtle.isdown()\nFalse\n>>> turtle.pendown()\n>>> turtle.isdown()\nTrue\n
Return or set the pencolor.
\nFour input formats are allowed:
- pencolor(): return the current pencolor as a color specification string, possibly in tuple format (see example). May be used as input to another color/pencolor/fillcolor call.
- pencolor(colorstring): set pencolor to colorstring, which is a Tk color specification string, such as "red", "yellow", or "#33cc8c".
- pencolor((r, g, b)): set pencolor to the RGB color represented by the tuple of r, g, and b. Each of r, g, and b must be in the range 0..colormode, where colormode is either 1.0 or 255.
- pencolor(r, g, b): set pencolor to the RGB color represented by r, g, and b. Each of r, g, and b must be in the range 0..colormode.
If turtleshape is a polygon, the outline of that polygon is drawn with the\nnewly set pencolor.
\n>>> colormode()\n1.0\n>>> turtle.pencolor()\n'red'\n>>> turtle.pencolor("brown")\n>>> turtle.pencolor()\n'brown'\n>>> tup = (0.2, 0.8, 0.55)\n>>> turtle.pencolor(tup)\n>>> turtle.pencolor()\n(0.2, 0.8, 0.5490196078431373)\n>>> colormode(255)\n>>> turtle.pencolor()\n(51, 204, 140)\n>>> turtle.pencolor('#32c18f')\n>>> turtle.pencolor()\n(50, 193, 143)\n
Return or set the fillcolor.
\nFour input formats are allowed:
- fillcolor(): return the current fillcolor as a color specification string, possibly in tuple format (see example). May be used as input to another color/pencolor/fillcolor call.
- fillcolor(colorstring): set fillcolor to colorstring, which is a Tk color specification string, such as "red", "yellow", or "#33cc8c".
- fillcolor((r, g, b)): set fillcolor to the RGB color represented by the tuple of r, g, and b. Each of r, g, and b must be in the range 0..colormode, where colormode is either 1.0 or 255.
- fillcolor(r, g, b): set fillcolor to the RGB color represented by r, g, and b. Each of r, g, and b must be in the range 0..colormode.
If turtleshape is a polygon, the interior of that polygon is drawn\nwith the newly set fillcolor.
\n>>> turtle.fillcolor("violet")\n>>> turtle.fillcolor()\n'violet'\n>>> col = turtle.pencolor()\n>>> col\n(50, 193, 143)\n>>> turtle.fillcolor(col)\n>>> turtle.fillcolor()\n(50, 193, 143)\n>>> turtle.fillcolor('#ffffff')\n>>> turtle.fillcolor()\n(255, 255, 255)\n
Return or set pencolor and fillcolor.
Several input formats are allowed. They use 0 to 3 arguments as follows:

- color(): return the current pencolor and the current fillcolor as a pair of color specification strings or tuples as returned by pencolor() and fillcolor().
- color(colorstring), color((r,g,b)), color(r,g,b): inputs as in pencolor(); set both fillcolor and pencolor to the given value.
- color(colorstring1, colorstring2), color((r1,g1,b1), (r2,g2,b2)): equivalent to pencolor(colorstring1) and fillcolor(colorstring2), and analogously if the other input format is used.
If turtleshape is a polygon, outline and interior of that polygon is drawn\nwith the newly set colors.
\n>>> turtle.color("red", "green")\n>>> turtle.color()\n('red', 'green')\n>>> color("#285078", "#a0c8f0")\n>>> color()\n((40, 80, 120), (160, 200, 240))\n
See also: Screen method colormode().
Parameter: flag – True/False (or 1/0 respectively)
Call fill(True) before drawing the shape you want to fill, and\nfill(False) when done. When used without argument: return fillstate\n(True if filling, False else).
\n>>> turtle.fill(True)\n>>> for _ in range(3):\n... turtle.forward(100)\n... turtle.left(120)\n...\n>>> turtle.fill(False)\n
Fill the shape drawn after the last call to begin_fill(). Equivalent\nto fill(False).
\n>>> turtle.color("black", "red")\n>>> turtle.begin_fill()\n>>> turtle.circle(80)\n>>> turtle.end_fill()\n
Delete the turtle’s drawings from the screen, re-center the turtle and set\nvariables to the default values.
\n>>> turtle.goto(0,-22)\n>>> turtle.left(100)\n>>> turtle.position()\n(0.00,-22.00)\n>>> turtle.heading()\n100.0\n>>> turtle.reset()\n>>> turtle.position()\n(0.00,0.00)\n>>> turtle.heading()\n0.0\n
Parameters:
- arg – object to be written to the TurtleScreen
- move – True/False
- align – one of the strings "left", "center" or "right"
- font – a triple (fontname, fontsize, fonttype)
Write text (the string representation of arg) at the current turtle position according to align ("left", "center" or "right") and with the given font. If move is True, the pen is moved to the bottom-right corner of the text. By default, move is False.
\n>>> turtle.write("Home = ", True, align="center")\n>>> turtle.write((0,0), True)\n
Make the turtle invisible. It’s a good idea to do this while you’re in the\nmiddle of doing some complex drawing, because hiding the turtle speeds up the\ndrawing observably.
\n>>> turtle.hideturtle()\n
Return True if the Turtle is shown, False if it’s hidden.
\n>>> turtle.hideturtle()\n>>> turtle.isvisible()\nFalse\n>>> turtle.showturtle()\n>>> turtle.isvisible()\nTrue\n
Parameter: name – a string which is a valid shapename
Set turtle shape to shape with given name or, if name is not given, return\nname of current shape. Shape with name must exist in the TurtleScreen’s\nshape dictionary. Initially there are the following polygon shapes: “arrow”,\n“turtle”, “circle”, “square”, “triangle”, “classic”. To learn about how to\ndeal with shapes see Screen method register_shape().
\n>>> turtle.shape()\n'classic'\n>>> turtle.shape("turtle")\n>>> turtle.shape()\n'turtle'\n
Parameter: rmode – one of the strings "auto", "user", "noresize"
Set resizemode to one of the values: "auto", "user", "noresize". If rmode is not given, return current resizemode. Different resizemodes have the following effects:

- "auto": adapts the appearance of the turtle corresponding to the value of pensize.
- "user": adapts the appearance of the turtle according to the values of stretchfactor and outline, which are set by shapesize().
- "noresize": no adaption of the turtle’s appearance takes place.
\nresizemode(“user”) is called by shapesize() when used with arguments.
\n>>> turtle.resizemode()\n'noresize'\n>>> turtle.resizemode("auto")\n>>> turtle.resizemode()\n'auto'\n
Parameters:
- stretch_wid – positive number
- stretch_len – positive number
- outline – positive number
Return or set the pen’s attributes x/y-stretchfactors and/or outline. Set resizemode to “user”. If and only if resizemode is set to “user”, the turtle will be displayed stretched according to its stretchfactors: stretch_wid is stretchfactor perpendicular to its orientation, stretch_len is stretchfactor in direction of its orientation, outline determines the width of the shape’s outline.
\n>>> turtle.shapesize()\n(1, 1, 1)\n>>> turtle.resizemode("user")\n>>> turtle.shapesize(5, 5, 12)\n>>> turtle.shapesize()\n(5, 5, 12)\n>>> turtle.shapesize(outline=8)\n>>> turtle.shapesize()\n(5, 5, 8)\n
Parameter: angle – a number
Rotate the turtleshape by angle from its current tilt-angle, but do not\nchange the turtle’s heading (direction of movement).
\n>>> turtle.reset()\n>>> turtle.shape("circle")\n>>> turtle.shapesize(5,2)\n>>> turtle.tilt(30)\n>>> turtle.fd(50)\n>>> turtle.tilt(30)\n>>> turtle.fd(50)\n
Parameter: angle – a number
Rotate the turtleshape to point in the direction specified by angle,\nregardless of its current tilt-angle. Do not change the turtle’s heading\n(direction of movement).
\n>>> turtle.reset()\n>>> turtle.shape("circle")\n>>> turtle.shapesize(5,2)\n>>> turtle.settiltangle(45)\n>>> turtle.fd(50)\n>>> turtle.settiltangle(-45)\n>>> turtle.fd(50)\n
Return the current tilt-angle, i.e. the angle between the orientation of the\nturtleshape and the heading of the turtle (its direction of movement).
\n>>> turtle.reset()\n>>> turtle.shape("circle")\n>>> turtle.shapesize(5,2)\n>>> turtle.tilt(45)\n>>> turtle.tiltangle()\n45.0\n
Parameters:
- fun – a function with two arguments which will be called with the coordinates of the clicked point on the canvas
- num – number of the mouse-button, defaults to 1 (left mouse button)
- add – True or False; if True, a new binding will be added, otherwise it will replace a former binding
Bind fun to mouse-click events on this turtle. If fun is None,\nexisting bindings are removed. Example for the anonymous turtle, i.e. the\nprocedural way:
\n>>> def turn(x, y):\n... left(180)\n...\n>>> onclick(turn) # Now clicking into the turtle will turn it.\n>>> onclick(None) # event-binding will be removed\n
Parameters:
- fun – a function with two arguments which will be called with the coordinates of the clicked point on the canvas
- num – number of the mouse-button, defaults to 1 (left mouse button)
- add – True or False; if True, a new binding will be added, otherwise it will replace a former binding
Bind fun to mouse-button-release events on this turtle. If fun is\nNone, existing bindings are removed.
\n>>> class MyTurtle(Turtle):\n... def glow(self,x,y):\n... self.fillcolor("red")\n... def unglow(self,x,y):\n... self.fillcolor("")\n...\n>>> turtle = MyTurtle()\n>>> turtle.onclick(turtle.glow) # clicking on turtle turns fillcolor red,\n>>> turtle.onrelease(turtle.unglow) # releasing turns it to transparent.\n
Parameters:
- fun – a function with two arguments which will be called with the coordinates of the clicked point on the canvas
- num – number of the mouse-button, defaults to 1 (left mouse button)
- add – True or False; if True, a new binding will be added, otherwise it will replace a former binding
Bind fun to mouse-move events on this turtle. If fun is None,\nexisting bindings are removed.
\nRemark: Every sequence of mouse-move-events on a turtle is preceded by a\nmouse-click event on that turtle.
\n>>> turtle.ondrag(turtle.goto)\n
Subsequently, clicking and dragging the Turtle will move it across the screen, thereby producing hand drawings (if the pen is down).
\nReturn the last recorded polygon.
\n>>> turtle.home()\n>>> turtle.begin_poly()\n>>> turtle.fd(100)\n>>> turtle.left(20)\n>>> turtle.fd(30)\n>>> turtle.left(60)\n>>> turtle.fd(50)\n>>> turtle.end_poly()\n>>> p = turtle.get_poly()\n>>> register_shape("myFavouriteShape", p)\n
Create and return a clone of the turtle with same position, heading and\nturtle properties.
\n>>> mick = Turtle()\n>>> joe = mick.clone()\n
Return the Turtle object itself. Only reasonable use: as a function to\nreturn the “anonymous turtle”:
\n>>> pet = getturtle()\n>>> pet.fd(50)\n>>> pet\n<turtle.Turtle object at 0x...>\n
Return the TurtleScreen object the turtle is drawing on.\nTurtleScreen methods can then be called for that object.
\n>>> ts = turtle.getscreen()\n>>> ts\n<turtle._Screen object at 0x...>\n>>> ts.bgcolor("pink")\n
Parameter: size – an integer or None
Set or disable undobuffer. If size is an integer an empty undobuffer of\ngiven size is installed. size gives the maximum number of turtle actions\nthat can be undone by the undo() method/function. If size is\nNone, the undobuffer is disabled.
\n>>> turtle.setundobuffer(42)\n
Return number of entries in the undobuffer.
\n>>> while undobufferentries():\n... undo()\n
A replica of the corresponding TurtleScreen method.
\n\nDeprecated since version 2.6.
\nTo use compound turtle shapes, which consist of several polygons of different\ncolor, you must use the helper class Shape explicitly as described\nbelow:
\nCreate an empty Shape object of type “compound”.
\nAdd as many components to this object as desired, using the\naddcomponent() method.
\nFor example:
\n>>> s = Shape("compound")\n>>> poly1 = ((0,0),(10,-5),(0,10),(-10,-5))\n>>> s.addcomponent(poly1, "red", "blue")\n>>> poly2 = ((0,0),(10,-5),(-10,-5))\n>>> s.addcomponent(poly2, "blue", "red")\n
Now add the Shape to the Screen’s shapelist and use it:
\n>>> register_shape("myshape", s)\n>>> shape("myshape")\n
Note
The Shape class is used internally by the register_shape() method in different ways. The application programmer has to deal with the Shape class only when using compound shapes as shown above.
\nMost of the examples in this section refer to a TurtleScreen instance called\nscreen.
Parameter: args – a color string or three numbers in the range 0..colormode or a 3-tuple of such numbers
Set or return background color of the TurtleScreen.
\n>>> screen.bgcolor("orange")\n>>> screen.bgcolor()\n'orange'\n>>> screen.bgcolor("#800080")\n>>> screen.bgcolor()\n(128, 0, 128)\n
Parameter: picname – a string, name of a gif-file or "nopic", or None
Set background image or return name of current backgroundimage. If picname\nis a filename, set the corresponding image as background. If picname is\n"nopic", delete background image, if present. If picname is None,\nreturn the filename of the current backgroundimage.
\n>>> screen.bgpic()\n'nopic'\n>>> screen.bgpic("landscape.gif")\n>>> screen.bgpic()\n"landscape.gif"\n
Delete all drawings and all turtles from the TurtleScreen. Reset the now\nempty TurtleScreen to its initial state: white background, no background\nimage, no event bindings and tracing on.
\nNote
\nThis TurtleScreen method is available as a global function only under the\nname clearscreen. The global function clear is another one\nderived from the Turtle method clear.
\nReset all Turtles on the Screen to their initial state.
\nNote
\nThis TurtleScreen method is available as a global function only under the\nname resetscreen. The global function reset is another one\nderived from the Turtle method reset.
\nParameters: |
| \n
---|
If no arguments are given, return current (canvaswidth, canvasheight). Else\nresize the canvas the turtles are drawing on. Do not alter the drawing\nwindow. To observe hidden parts of the canvas, use the scrollbars. With this\nmethod, one can make visible those parts of a drawing which were outside the\ncanvas before.
\n>>> screen.screensize()\n(400, 300)\n>>> screen.screensize(2000,1500)\n>>> screen.screensize()\n(2000, 1500)\n
e.g. to search for an erroneously escaped turtle ;-)
\nParameters: |
| \n
---|
Set up user-defined coordinate system and switch to mode “world” if\nnecessary. This performs a screen.reset(). If mode “world” is already\nactive, all drawings are redrawn according to the new coordinates.
\nATTENTION: in user-defined coordinate systems angles may appear\ndistorted.
\n>>> screen.reset()\n>>> screen.setworldcoordinates(-50,-7.5,50,7.5)\n>>> for _ in range(72):\n... left(10)\n...\n>>> for _ in range(8):\n... left(45); fd(2) # a regular octagon\n
Parameter: delay – positive integer
Set or return the drawing delay in milliseconds. (This is approximately\nthe time interval between two consecutive canvas updates.) The longer the\ndrawing delay, the slower the animation.
\n>>> screen.delay()\n10\n>>> screen.delay(5)\n>>> screen.delay()\n5\n
Parameters:
- n – nonnegative integer
- delay – nonnegative integer
Turn turtle animation on/off and set delay for update drawings. If n is\ngiven, only each n-th regular screen update is really performed. (Can be\nused to accelerate the drawing of complex graphics.) Second argument sets\ndelay value (see delay()).
\n>>> screen.tracer(8, 25)\n>>> dist = 2\n>>> for i in range(200):\n... fd(dist)\n... rt(90)\n... dist += 2\n
See also the RawTurtle/Turtle method speed().
\nParameters: |
| \n
---|
Bind fun to key-release event of key. If fun is None, event bindings\nare removed. Remark: in order to be able to register key-events, TurtleScreen\nmust have the focus. (See method listen().)
\n>>> def f():\n... fd(50)\n... lt(60)\n...\n>>> screen.onkey(f, "Up")\n>>> screen.listen()\n
Parameters:
- fun – a function with two arguments which will be called with the coordinates of the clicked point on the canvas
- num – number of the mouse-button, defaults to 1 (left mouse button)
- add – True or False; if True, a new binding will be added, otherwise it will replace a former binding
Bind fun to mouse-click events on this screen. If fun is None,\nexisting bindings are removed.
\nExample for a TurtleScreen instance named screen and a Turtle instance\nnamed turtle:
\n>>> screen.onclick(turtle.goto) # Subsequently clicking into the TurtleScreen will\n>>> # make the turtle move to the clicked point.\n>>> screen.onclick(None) # remove event binding again\n
Note
\nThis TurtleScreen method is available as a global function only under the\nname onscreenclick. The global function onclick is another one\nderived from the Turtle method onclick.
\nParameters: |
| \n
---|
Install a timer that calls fun after t milliseconds.
\n>>> running = True\n>>> def f():\n... if running:\n... fd(50)\n... lt(60)\n... screen.ontimer(f, 250)\n>>> f() ### makes the turtle march around\n>>> running = False\n
Parameter: mode – one of the strings "standard", "logo" or "world"
Set turtle mode (“standard”, “logo” or “world”) and perform reset. If mode\nis not given, current mode is returned.
\nMode “standard” is compatible with old turtle. Mode “logo” is\ncompatible with most Logo turtle graphics. Mode “world” uses user-defined\n“world coordinates”. Attention: in this mode angles appear distorted if\nx/y unit-ratio doesn’t equal 1.
Mode | Initial turtle heading | positive angles
---|---|---
"standard" | to the right (east) | counterclockwise
"logo" | upward (north) | clockwise
>>> mode("logo") # resets turtle heading to north\n>>> mode()\n'logo'\n
Parameter: cmode – one of the values 1.0 or 255
Return the colormode or set it to 1.0 or 255. Subsequently r, g, b\nvalues of color triples have to be in the range 0..cmode.
\n>>> screen.colormode(1)\n>>> turtle.pencolor(240, 160, 80)\nTraceback (most recent call last):\n ...\nTurtleGraphicsError: bad color sequence: (240, 160, 80)\n>>> screen.colormode()\n1.0\n>>> screen.colormode(255)\n>>> screen.colormode()\n255\n>>> turtle.pencolor(240,160,80)\n
Return the Canvas of this TurtleScreen. Useful for insiders who know what to\ndo with a Tkinter Canvas.
\n>>> cv = screen.getcanvas()\n>>> cv\n<turtle.ScrolledCanvas instance at 0x...>\n
Return a list of names of all currently available turtle shapes.
\n>>> screen.getshapes()\n['arrow', 'blank', 'circle', ..., 'turtle']\n
There are three different ways to call this function:
\nname is the name of a gif-file and shape is None: Install the\ncorresponding image shape.
\n>>> screen.register_shape("turtle.gif")\n
Note
\nImage shapes do not rotate when turning the turtle, so they do not\ndisplay the heading of the turtle!
\nname is an arbitrary string and shape is a tuple of pairs of\ncoordinates: Install the corresponding polygon shape.
\n>>> screen.register_shape("triangle", ((5,-3), (0,5), (-5,-3)))\n
name is an arbitrary string and shape is a (compound) Shape\nobject: Install the corresponding compound shape.
Add a turtle shape to TurtleScreen’s shapelist. Only shapes registered this way can be used by issuing the command shape(shapename).
\nReturn the list of turtles on the screen.
\n>>> for turtle in screen.turtles():\n... turtle.color("red")\n
Return the height of the turtle window.
\n>>> screen.window_height()\n480\n
Return the width of the turtle window.
\n>>> screen.window_width()\n640\n
Bind bye() method to mouse clicks on the Screen.
\nIf the value “using_IDLE” in the configuration dictionary is False\n(default value), also enter mainloop. Remark: If IDLE with the -n switch\n(no subprocess) is used, this value should be set to True in\nturtle.cfg. In this case IDLE’s own mainloop is active also for the\nclient script.
\nSet the size and position of the main window. Default values of arguments\nare stored in the configuration dictionary and can be changed via a\nturtle.cfg file.
\nParameters: |
| \n
---|
>>> screen.setup (width=200, height=200, startx=0, starty=0)\n>>> # sets window to 200x200 pixels, in upper left of screen\n>>> screen.setup(width=.75, height=0.5, startx=None, starty=None)\n>>> # sets window to 75% of screen by 50% of screen and centers\n
Parameter: titlestring – a string that is shown in the titlebar of the turtle graphics window
Set title of turtle window to titlestring.
\n>>> screen.title("Welcome to the turtle zoo!")\n
Parameter: canvas – a Tkinter.Canvas, a ScrolledCanvas or a TurtleScreen
Create a turtle. The turtle has all methods described above as “methods of\nTurtle/RawTurtle”.
\nParameter: | cv – a Tkinter.Canvas | \n
---|
Provides screen oriented methods like setbg() etc. that are described\nabove.
\nParameter: | master – some Tkinter widget to contain the ScrolledCanvas, i.e.\na Tkinter-canvas with scrollbars added | \n
---|
Used by class Screen, which thus automatically provides a ScrolledCanvas as\nplayground for the turtles.
\nParameter: | type_ – one of the strings “polygon”, “image”, “compound” | \n
---|
Data structure modeling shapes. The pair (type_, data) must follow this\nspecification:
type_ | data
---|---
"polygon" | a polygon-tuple, i.e. a tuple of pairs of coordinates
"image" | an image (in this form only used internally!)
"compound" | None (a compound shape has to be constructed using the addcomponent() method)
Parameters:
- poly – a polygon, i.e. a tuple of pairs of numbers
- fill – a color the poly will be filled with
- outline – a color for the poly’s outline (if given)
Example:
\n>>> poly = ((0,0),(10,-5),(0,10),(-10,-5))\n>>> s = Shape("compound")\n>>> s.addcomponent(poly, "red", "blue")\n>>> # ... add more components and then use register_shape()\n
A two-dimensional vector class, used as a helper class for implementing\nturtle graphics. May be useful for turtle graphics programs too. Derived\nfrom tuple, so a vector is a tuple!
Provides (for a, b vectors, k number):

- a + b (vector addition)
- a - b (vector subtraction)
- a * b (inner product)
- k * a and a * k (multiplication with scalar)
- abs(a) (absolute value of a)
- a.rotate(angle) (rotation)
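Since the operation list is abstract, a runnable sketch may help. The following is a minimal re-implementation of a tuple-derived 2D vector supporting the operations listed above; it is an illustrative sketch, not the turtle module’s actual source.

```python
import math

class Vec2D(tuple):
    """A minimal sketch of a tuple-derived 2D vector, mirroring the
    operations turtle.Vec2D provides (not the module's actual code)."""

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    def __add__(self, other):          # a + b: vector addition
        return Vec2D(self[0] + other[0], self[1] + other[1])

    def __sub__(self, other):          # a - b: vector subtraction
        return Vec2D(self[0] - other[0], self[1] - other[1])

    def __mul__(self, other):          # a * b: inner product; a * k: scaling
        if isinstance(other, Vec2D):
            return self[0] * other[0] + self[1] * other[1]
        return Vec2D(self[0] * other, self[1] * other)

    def __rmul__(self, k):             # k * a: scaling
        return Vec2D(self[0] * k, self[1] * k)

    def __abs__(self):                 # abs(a): length of the vector
        return math.hypot(self[0], self[1])

    def rotate(self, angle):           # rotate counterclockwise by angle (degrees)
        rad = math.radians(angle)
        c, s = math.cos(rad), math.sin(rad)
        return Vec2D(self[0] * c - self[1] * s, self[0] * s + self[1] * c)
```

Because the class derives from tuple, a vector unpacks and compares like an ordinary pair of numbers.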
\nThe public methods of the Screen and Turtle classes are documented extensively\nvia docstrings. So these can be used as online-help via the Python help\nfacilities:
\nWhen using IDLE, tooltips show the signatures and first lines of the\ndocstrings of typed in function-/method calls.
\nCalling help() on methods or functions displays the docstrings:
\n>>> help(Screen.bgcolor)\nHelp on method bgcolor in module turtle:\n\nbgcolor(self, *args) unbound turtle.Screen method\n Set or return backgroundcolor of the TurtleScreen.\n\n Arguments (if given): a color string or three numbers\n in the range 0..colormode or a 3-tuple of such numbers.\n\n\n >>> screen.bgcolor("orange")\n >>> screen.bgcolor()\n "orange"\n >>> screen.bgcolor(0.5,0,0.5)\n >>> screen.bgcolor()\n "#800080"\n\n>>> help(Turtle.penup)\nHelp on method penup in module turtle:\n\npenup(self) unbound turtle.Turtle method\n Pull the pen up -- no drawing when moving.\n\n Aliases: penup | pu | up\n\n No argument\n\n >>> turtle.penup()\n
The docstrings of the functions which are derived from methods have a modified\nform:
\n>>> help(bgcolor)\nHelp on function bgcolor in module turtle:\n\nbgcolor(*args)\n Set or return backgroundcolor of the TurtleScreen.\n\n Arguments (if given): a color string or three numbers\n in the range 0..colormode or a 3-tuple of such numbers.\n\n Example::\n\n >>> bgcolor("orange")\n >>> bgcolor()\n "orange"\n >>> bgcolor(0.5,0,0.5)\n >>> bgcolor()\n "#800080"\n\n>>> help(penup)\nHelp on function penup in module turtle:\n\npenup()\n Pull the pen up -- no drawing when moving.\n\n Aliases: penup | pu | up\n\n No argument\n\n Example:\n >>> penup()\n
These modified docstrings are created automatically together with the function\ndefinitions that are derived from the methods at import time.
\nThere is a utility to create a dictionary the keys of which are the method names\nand the values of which are the docstrings of the public methods of the classes\nScreen and Turtle.
\nParameter: | filename – a string, used as filename | \n
---|
Create and write docstring-dictionary to a Python script with the given\nfilename. This function has to be called explicitly (it is not used by the\nturtle graphics classes). The docstring dictionary will be written to the\nPython script filename.py. It is intended to serve as a template\nfor translation of the docstrings into different languages.
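The shape of such a dictionary can be sketched without running turtle itself. The helper below collects the docstrings of a class’s public methods into a dict keyed by method name; it is a simplified stand-in for the dictionary write_docstringdict() writes out, and the DemoTurtle class is a hypothetical example, not part of the module.

```python
def docstring_dict(klass):
    """Map 'ClassName.methodname' to the method's docstring for every
    public method of klass -- a simplified sketch of the dictionary
    that write_docstringdict() writes to a Python script."""
    result = {}
    for name in dir(klass):
        if name.startswith("_"):
            continue  # skip private and special attributes
        attr = getattr(klass, name)
        if callable(attr) and attr.__doc__:
            result["%s.%s" % (klass.__name__, name)] = attr.__doc__
    return result


class DemoTurtle(object):
    """A hypothetical stand-in class with docstrings to harvest."""
    def forward(self, distance):
        """Move the turtle forward by the specified distance."""
    def left(self, angle):
        """Turn turtle left by angle units."""
```

Translating the values of such a dictionary is all that is needed to localize the online help.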
\nIf you (or your students) want to use turtle with online help in your\nnative language, you have to translate the docstrings and save the resulting\nfile as e.g. turtle_docstringdict_german.py.
\nIf you have an appropriate entry in your turtle.cfg file this dictionary\nwill be read in at import time and will replace the original English docstrings.
\nAt the time of this writing there are docstring dictionaries in German and in\nItalian. (Requests please to glingl@aon.at.)
\nThe built-in default configuration mimics the appearance and behaviour of the\nold turtle module in order to retain best possible compatibility with it.
\nIf you want to use a different configuration which better reflects the features\nof this module or which better fits to your needs, e.g. for use in a classroom,\nyou can prepare a configuration file turtle.cfg which will be read at import\ntime and modify the configuration according to its settings.
The built-in configuration would correspond to the following turtle.cfg:
\nwidth = 0.5\nheight = 0.75\nleftright = None\ntopbottom = None\ncanvwidth = 400\ncanvheight = 300\nmode = standard\ncolormode = 1.0\ndelay = 10\nundobuffersize = 1000\nshape = classic\npencolor = black\nfillcolor = black\nresizemode = noresize\nvisible = True\nlanguage = english\nexampleturtle = turtle\nexamplescreen = screen\ntitle = Python Turtle Graphics\nusing_IDLE = False
\nShort explanation of selected entries:
\nThere can be a turtle.cfg file in the directory where turtle is\nstored and an additional one in the current working directory. The latter will\noverride the settings of the first one.
\nThe Demo/turtle directory contains a turtle.cfg file. You can\nstudy it as an example and see its effects when running the demos (preferably\nnot from within the demo-viewer).
\nThere is a set of demo scripts in the turtledemo directory located in the\nDemo/turtle directory in the source distribution.
\nIt contains:
\nThe demoscripts are:
Name | Description | Features
---|---|---
bytedesign | complex classical turtlegraphics pattern | tracer(), delay, update()
chaos | graphs Verhulst dynamics; shows that a computer’s computations can sometimes generate results contrary to common-sense expectations | world coordinates
clock | analog clock showing time of your computer | turtles as clock’s hands, ontimer
colormixer | experiment with r, g, b | ondrag()
fractalcurves | Hilbert & Koch curves | recursion
lindenmayer | ethnomathematics (indian kolams) | L-System
minimal_hanoi | Towers of Hanoi | rectangular Turtles as Hanoi discs (shape, shapesize)
paint | super minimalistic drawing program | onclick()
peace | elementary | turtle: appearance and animation
penrose | aperiodic tiling with kites and darts | stamp()
planet_and_moon | simulation of gravitational system | compound shapes, Vec2D
tree | a (graphical) breadth-first tree (using generators) | clone()
wikipedia | a pattern from the wikipedia article on turtle graphics | clone(), undo()
yingyang | another elementary example | circle()
Have fun!
\n2to3 is a Python program that reads Python 2.x source code and applies a series\nof fixers to transform it into valid Python 3.x code. The standard library\ncontains a rich set of fixers that will handle almost all code. 2to3 supporting\nlibrary lib2to3 is, however, a flexible and generic library, so it is\npossible to write your own fixers for 2to3. lib2to3 could also be\nadapted to custom applications in which Python code needs to be edited\nautomatically.
\n2to3 will usually be installed with the Python interpreter as a script. It is\nalso located in the Tools/scripts directory of the Python root.
2to3’s basic arguments are a list of files or directories to transform. The directories are recursively traversed for Python sources.
\nHere is a sample Python 2.x source file, example.py:
\ndef greet(name):\n print "Hello, {0}!".format(name)\nprint "What's your name?"\nname = raw_input()\ngreet(name)\n
It can be converted to Python 3.x code via 2to3 on the command line:
\n$ 2to3 example.py
\nA diff against the original source file is printed. 2to3 can also write the\nneeded modifications right back to the source file. (A backup of the original\nfile is made unless -n is also given.) Writing the changes back is\nenabled with the -w flag:
\n$ 2to3 -w example.py
\nAfter transformation, example.py looks like this:
\ndef greet(name):\n print("Hello, {0}!".format(name))\nprint("What's your name?")\nname = input()\ngreet(name)\n
Comments and exact indentation are preserved throughout the translation process.
By default, 2to3 runs a set of predefined fixers. The -l flag lists all available fixers. An explicit set of fixers to run can be given with -f. Likewise, the -x flag explicitly disables a fixer. The following example runs only the imports and has_key fixers:
\n$ 2to3 -f imports -f has_key example.py
\nThis command runs every fixer except the apply fixer:
\n$ 2to3 -x apply example.py
\nSome fixers are explicit, meaning they aren’t run by default and must be\nlisted on the command line to be run. Here, in addition to the default fixers,\nthe idioms fixer is run:
\n$ 2to3 -f all -f idioms example.py
\nNotice how passing all enables all default fixers.
\nSometimes 2to3 will find a place in your source code that needs to be changed,\nbut 2to3 cannot fix automatically. In this case, 2to3 will print a warning\nbeneath the diff for a file. You should address the warning in order to have\ncompliant 3.x code.
2to3 can also refactor doctests. To enable this mode, use the -d flag. Note that only doctests will be refactored. This also doesn’t require the module to be valid Python. For example, doctest-like examples in a reST document could also be refactored with this option.
\nThe -v option enables output of more information on the translation\nprocess.
\nSince some print statements can be parsed as function calls or statements, 2to3\ncannot always read files containing the print function. When 2to3 detects the\npresence of the from __future__ import print_function compiler directive, it\nmodifies its internal grammar to interpret print() as a function. This\nchange can also be enabled manually with the -p flag. Use\n-p to run fixers on code that already has had its print statements\nconverted.
\nEach step of transforming code is encapsulated in a fixer. The command 2to3\n-l lists them. As documented above, each can be turned on\nand off individually. They are described here in more detail.
\nThis optional fixer performs several transformations that make Python code\nmore idiomatic. Type comparisons like type(x) is SomeClass and\ntype(x) == SomeClass are converted to isinstance(x, SomeClass).\nwhile 1 becomes while True. This fixer also tries to make use of\nsorted() in appropriate places. For example, this block
\nL = list(some_iterable)\nL.sort()\n
is changed to
\nL = sorted(some_iterable)\n
Note
\nThe lib2to3 API should be considered unstable and may change\ndrastically in the future.
\nNote
\nThe test package is meant for internal use by Python only. It is\ndocumented for the benefit of the core developers of Python. Any use of\nthis package outside of Python’s standard library is discouraged as code\nmentioned here can change or be removed without notice between releases of\nPython.
\nThe test package contains all regression tests for Python as well as the\nmodules test.test_support and test.regrtest.\ntest.test_support is used to enhance your tests while\ntest.regrtest drives the testing suite.
\nEach module in the test package whose name starts with test_ is a\ntesting suite for a specific module or feature. All new tests should be written\nusing the unittest or doctest module. Some older tests are\nwritten using a “traditional” testing style that compares output printed to\nsys.stdout; this style of test is considered deprecated.
\nSee also
\n\nIt is preferred that tests that use the unittest module follow a few\nguidelines. One is to name the test module by starting it with test_ and end\nit with the name of the module being tested. The test methods in the test module\nshould start with test_ and end with a description of what the method is\ntesting. This is needed so that the methods are recognized by the test driver as\ntest methods. Also, no documentation string for the method should be included. A\ncomment (such as # Tests function returns only True or False) should be used\nto provide documentation for test methods. This is done because documentation\nstrings get printed out if they exist and thus what test is being run is not\nstated.
\nA basic boilerplate is often used:
\nimport unittest\nfrom test import test_support\n\nclass MyTestCase1(unittest.TestCase):\n\n # Only use setUp() and tearDown() if necessary\n\n def setUp(self):\n ... code to execute in preparation for tests ...\n\n def tearDown(self):\n ... code to execute to clean up after tests ...\n\n def test_feature_one(self):\n # Test feature one.\n ... testing code ...\n\n def test_feature_two(self):\n # Test feature two.\n ... testing code ...\n\n ... more test methods ...\n\nclass MyTestCase2(unittest.TestCase):\n ... same structure as MyTestCase1 ...\n\n... more test classes ...\n\ndef test_main():\n test_support.run_unittest(MyTestCase1,\n MyTestCase2,\n ... list other tests ...\n )\n\nif __name__ == '__main__':\n test_main()\n
This boilerplate code allows the testing suite to be run by test.regrtest\nas well as on its own as a script.
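Under the hood, test_support.run_unittest() is essentially a thin driver over the standard unittest machinery. A minimal self-contained sketch of the same flow, written against plain unittest (since test.test_support itself is internal):

```python
import io
import unittest

class MyTestCase1(unittest.TestCase):
    def test_feature_one(self):
        # Tests that str.upper() upcases ASCII letters.
        self.assertEqual("spam".upper(), "SPAM")

# Load and run the TestCase programmatically, roughly what
# run_unittest() does for each class it is passed.
suite = unittest.TestLoader().loadTestsFromTestCase(MyTestCase1)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
# result.wasSuccessful() is True when every test passed
```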
\nThe goal for regression testing is to try to break code. This leads to a few\nguidelines to be followed:
\nThe testing suite should exercise all classes, functions, and constants. This\nincludes not just the external API that is to be presented to the outside\nworld but also “private” code.
\nWhitebox testing (examining the code being tested when the tests are being\nwritten) is preferred. Blackbox testing (testing only the published user\ninterface) is not complete enough to make sure all boundary and edge cases\nare tested.
\nMake sure all possible values are tested including invalid ones. This makes\nsure that not only all valid values are acceptable but also that improper\nvalues are handled correctly.
\nExhaust as many code paths as possible. Test where branching occurs and thus\ntailor input to make sure as many different paths through the code are taken.
\nAdd an explicit test for any bugs discovered for the tested code. This will\nmake sure that the error does not crop up again if the code is changed in the\nfuture.
\nMake sure to clean up after your tests (such as close and remove all temporary\nfiles).
\nIf a test is dependent on a specific condition of the operating system then\nverify the condition already exists before attempting the test.
\nImport as few modules as possible and do it as soon as possible. This\nminimizes external dependencies of tests and also minimizes possible anomalous\nbehavior from side-effects of importing a module.
\nTry to maximize code reuse. On occasion, tests will vary by something as small\nas what type of input is used. Minimize code duplication by subclassing a\nbasic test class with a class that specifies the input:
\nclass TestFuncAcceptsSequences(unittest.TestCase):\n\n func = mySuperWhammyFunction\n\n def test_func(self):\n self.func(self.arg)\n\nclass AcceptLists(TestFuncAcceptsSequences):\n arg = [1, 2, 3]\n\nclass AcceptStrings(TestFuncAcceptsSequences):\n arg = 'abc'\n\nclass AcceptTuples(TestFuncAcceptsSequences):\n arg = (1, 2, 3)\n
See also
\nThe test.regrtest module can be run as a script to drive Python’s regression\ntest suite, thanks to the -m option: python -m test.regrtest.\nRunning the script by itself automatically starts running all regression\ntests in the test package. It does this by finding all modules in the\npackage whose name starts with test_, importing them, and executing the\nfunction test_main() if present. The names of tests to execute may also\nbe passed to the script. Specifying a single regression test (python\n-m test.regrtest test_spam) will minimize output and only print whether\nthe test passed or failed and thus minimize output.
\nRunning test.regrtest directly allows what resources are available for\ntests to use to be set. You do this by using the -u command-line\noption. Run python -m test.regrtest -uall to turn on all\nresources; specifying all as an option for -u enables all\npossible resources. If all but one resource is desired (a more common case), a\ncomma-separated list of resources that are not desired may be listed after\nall. The command python -m test.regrtest -uall,-audio,-largefile\nwill run test.regrtest with all resources except the audio and\nlargefile resources. For a list of all resources and more command-line\noptions, run python -m test.regrtest -h.
\nSome other ways to execute the regression tests depend on what platform the\ntests are being executed on. On Unix, you can run make test at the\ntop-level directory where Python was built. On Windows, executing\nrt.bat from your PCBuild directory will run all regression\ntests.
\nNote
\nThe test.test_support module has been renamed to test.support\nin Python 3.x.
\nThe test.test_support module provides support for Python’s regression\ntests.
\nThis module defines the following exceptions:
\nThe test.test_support module defines the following constants:
\nThe test.test_support module defines the following functions:
\nExecute unittest.TestCase subclasses passed to the function. The\nfunction scans the classes for methods starting with the prefix test_\nand executes the tests individually.
\nIt is also legal to pass strings as parameters; these should be keys in\nsys.modules. Each associated module will be scanned by\nunittest.TestLoader.loadTestsFromModule(). This is usually seen in the\nfollowing test_main() function:
\ndef test_main():\n test_support.run_unittest(__name__)\n
This will run all tests defined in the named module.
\nA convenience wrapper for warnings.catch_warnings() that makes it\neasier to test that a warning was correctly raised. It is approximately\nequivalent to calling warnings.catch_warnings(record=True) with\nwarnings.simplefilter() set to always and with the option to\nautomatically validate the results that are recorded.
\ncheck_warnings accepts 2-tuples of the form ("message regexp",\nWarningCategory) as positional arguments. If one or more filters are\nprovided, or if the optional keyword argument quiet is False,\nit checks to make sure the warnings are as expected: each specified filter\nmust match at least one of the warnings raised by the enclosed code or the\ntest fails, and if any warnings are raised that do not match any of the\nspecified filters the test fails. To disable the first of these checks,\nset quiet to True.
\nIf no arguments are specified, it defaults to:
\ncheck_warnings(("", Warning), quiet=True)\n
In this case all warnings are caught and no errors are raised.
\nOn entry to the context manager, a WarningRecorder instance is\nreturned. The underlying warnings list from\ncatch_warnings() is available via the recorder object’s\nwarnings attribute. As a convenience, the attributes of the object\nrepresenting the most recent warning can also be accessed directly through\nthe recorder object (see example below). If no warning has been raised,\nthen any of the attributes that would otherwise be expected on an object\nrepresenting a warning will return None.
\nThe recorder object also has a reset() method, which clears the\nwarnings list.
\nThe context manager is designed to be used like this:
\nwith check_warnings(("assertion is always true", SyntaxWarning),\n ("", UserWarning)):\n exec('assert(False, "Hey!")')\n warnings.warn(UserWarning("Hide me!"))\n
In this case if either warning was not raised, or some other warning was\nraised, check_warnings() would raise an error.
\nWhen a test needs to look more deeply into the warnings, rather than\njust checking whether or not they occurred, code like this can be used:
\nwith check_warnings(quiet=True) as w:\n warnings.warn("foo")\n assert str(w.args[0]) == "foo"\n warnings.warn("bar")\n assert str(w.args[0]) == "bar"\n assert str(w.warnings[0].args[0]) == "foo"\n assert str(w.warnings[1].args[0]) == "bar"\n w.reset()\n assert len(w.warnings) == 0\n
Here all warnings will be caught, and the test code tests the captured\nwarnings directly.
\n\nNew in version 2.6.
\n\nChanged in version 2.7: New optional arguments filters and quiet.
\nSimilar to check_warnings(), but for Python 3 compatibility warnings.\nIf sys.py3kwarning == 1, it checks if the warning is effectively raised.\nIf sys.py3kwarning == 0, it checks that no warning is raised. It\naccepts 2-tuples of the form ("message regexp", WarningCategory) as\npositional arguments. When the optional keyword argument quiet is\nTrue, it does not fail if a filter catches nothing. Without\narguments, it defaults to:
\ncheck_py3k_warnings(("", DeprecationWarning), quiet=False)\n
\nNew in version 2.7.
\nThis is a context manager that runs the with statement body using\na StringIO.StringIO object as sys.stdout. That object can be\nretrieved using the as clause of the with statement.
\nExample use:
\nwith captured_stdout() as s:\n print "hello"\nassert s.getvalue() == "hello"\n
\nNew in version 2.6.
\nThis function imports and returns the named module. Unlike a normal\nimport, this function raises unittest.SkipTest if the module\ncannot be imported.
\nModule and package deprecation messages are suppressed during this import\nif deprecated is True.
\n\nNew in version 2.7.
\nThis function imports and returns a fresh copy of the named Python module\nby removing the named module from sys.modules before doing the import.\nNote that unlike reload(), the original module is not affected by\nthis operation.
\nfresh is an iterable of additional module names that are also removed\nfrom the sys.modules cache before doing the import.
\nblocked is an iterable of module names that are replaced with 0\nin the module cache during the import to ensure that attempts to import\nthem raise ImportError.
\nThe named module and any modules named in the fresh and blocked\nparameters are saved before starting the import and then reinserted into\nsys.modules when the fresh import is complete.
\nModule and package deprecation messages are suppressed during this import\nif deprecated is True.
\nThis function will raise unittest.SkipTest is the named module\ncannot be imported.
\nExample use:
\n# Get copies of the warnings module for testing without\n# affecting the version being used by the rest of the test suite\n# One copy uses the C implementation, the other is forced to use\n# the pure Python fallback implementation\npy_warnings = import_fresh_module('warnings', blocked=['_warnings'])\nc_warnings = import_fresh_module('warnings', fresh=['_warnings'])\n
\nNew in version 2.7.
\nThe test.test_support module defines the following classes:
\nInstances are a context manager that raises ResourceDenied if the\nspecified exception type is raised. Any keyword arguments are treated as\nattribute/value pairs to be compared against any exception raised within the\nwith statement. Only if all pairs match properly against\nattributes on the exception is ResourceDenied raised.
\n\nNew in version 2.6.
\nClass used to temporarily set or unset environment variables. Instances can\nbe used as a context manager and have a complete dictionary interface for\nquerying/modifying the underlying os.environ. After exit from the\ncontext manager all changes to environment variables done through this\ninstance will be rolled back.
\n\nNew in version 2.6.
\n\nChanged in version 2.7: Added dictionary interface.
\nClass used to record warnings for unit tests. See documentation of\ncheck_warnings() above for more details.
\n\nNew in version 2.6.
\nThe doctest module searches for pieces of text that look like interactive\nPython sessions, and then executes those sessions to verify that they work\nexactly as shown. There are several common ways to use doctest:
\nHere’s a complete but small example module:
\n"""\nThis is the "example" module.\n\nThe example module supplies one function, factorial(). For example,\n\n>>> factorial(5)\n120\n"""\n\ndef factorial(n):\n """Return the factorial of n, an exact integer >= 0.\n\n If the result is small enough to fit in an int, return an int.\n Else return a long.\n\n >>> [factorial(n) for n in range(6)]\n [1, 1, 2, 6, 24, 120]\n >>> [factorial(long(n)) for n in range(6)]\n [1, 1, 2, 6, 24, 120]\n >>> factorial(30)\n 265252859812191058636308480000000L\n >>> factorial(30L)\n 265252859812191058636308480000000L\n >>> factorial(-1)\n Traceback (most recent call last):\n ...\n ValueError: n must be >= 0\n\n Factorials of floats are OK, but the float must be an exact integer:\n >>> factorial(30.1)\n Traceback (most recent call last):\n ...\n ValueError: n must be exact integer\n >>> factorial(30.0)\n 265252859812191058636308480000000L\n\n It must also not be ridiculously large:\n >>> factorial(1e100)\n Traceback (most recent call last):\n ...\n OverflowError: n too large\n """\n\n import math\n if not n >= 0:\n raise ValueError("n must be >= 0")\n if math.floor(n) != n:\n raise ValueError("n must be exact integer")\n if n+1 == n: # catch a value like 1e300\n raise OverflowError("n too large")\n result = 1\n factor = 2\n while factor <= n:\n result *= factor\n factor += 1\n return result\n\n\nif __name__ == "__main__":\n import doctest\n doctest.testmod()\n
If you run example.py directly from the command line, doctest\nworks its magic:
\n$ python example.py\n$
\nThere’s no output! That’s normal, and it means all the examples worked. Pass\n-v to the script, and doctest prints a detailed log of what\nit’s trying, and prints a summary at the end:
\n$ python example.py -v\nTrying:\n factorial(5)\nExpecting:\n 120\nok\nTrying:\n [factorial(n) for n in range(6)]\nExpecting:\n [1, 1, 2, 6, 24, 120]\nok\nTrying:\n [factorial(long(n)) for n in range(6)]\nExpecting:\n [1, 1, 2, 6, 24, 120]\nok
\nAnd so on, eventually ending with:
\nTrying:\n factorial(1e100)\nExpecting:\n Traceback (most recent call last):\n ...\n OverflowError: n too large\nok\n2 items passed all tests:\n 1 tests in __main__\n 8 tests in __main__.factorial\n9 tests in 2 items.\n9 passed and 0 failed.\nTest passed.\n$
\nThat’s all you need to know to start making productive use of doctest!\nJump in. The following sections provide full details. Note that there are many\nexamples of doctests in the standard Python test suite and libraries.\nEspecially useful examples can be found in the standard test file\nLib/test/test_doctest.py.
\nThe simplest way to start using doctest (but not necessarily the way you’ll\ncontinue to do it) is to end each module M with:
\nif __name__ == "__main__":\n import doctest\n doctest.testmod()\n
doctest then examines docstrings in module M.
\nRunning the module as a script causes the examples in the docstrings to get\nexecuted and verified:
\npython M.py
\nThis won’t display anything unless an example fails, in which case the failing\nexample(s) and the cause(s) of the failure(s) are printed to stdout, and the\nfinal line of output is ***Test Failed*** N failures., where N is the\nnumber of examples that failed.
\nRun it with the -v switch instead:
\npython M.py -v
\nand a detailed report of all examples tried is printed to standard output, along\nwith assorted summaries at the end.
\nYou can force verbose mode by passing verbose=True to testmod(), or\nprohibit it by passing verbose=False. In either of those cases,\nsys.argv is not examined by testmod() (so passing -v or not\nhas no effect).
\nSince Python 2.6, there is also a command line shortcut for running\ntestmod(). You can instruct the Python interpreter to run the doctest\nmodule directly from the standard library and pass the module name(s) on the\ncommand line:
\npython -m doctest -v example.py
\nThis will import example.py as a standalone module and run\ntestmod() on it. Note that this may not work correctly if the file is\npart of a package and imports other submodules from that package.
\n\nAnother simple application of doctest is testing interactive examples in a text\nfile. This can be done with the testfile() function:
\nimport doctest\ndoctest.testfile("example.txt")\n
That short script executes and verifies any interactive Python examples\ncontained in the file example.txt. The file content is treated as if it\nwere a single giant docstring; the file doesn’t need to contain a Python\nprogram! For example, perhaps example.txt contains this:
\nThe ``example`` module\n======================\n\nUsing ``factorial``\n-------------------\n\nThis is an example text file in reStructuredText format. First import\n``factorial`` from the ``example`` module:\n\n >>> from example import factorial\n\nNow use it:\n\n >>> factorial(6)\n 120
\nRunning doctest.testfile("example.txt") then finds the error in this\ndocumentation:
\nFile \"./example.txt\", line 14, in example.txt\nFailed example:\n factorial(6)\nExpected:\n 120\nGot:\n 720
\nAs with testmod(), testfile() won’t display anything unless an\nexample fails. If an example does fail, then the failing example(s) and the\ncause(s) of the failure(s) are printed to stdout, using the same format as\ntestmod().
\nBy default, testfile() looks for files in the calling module’s directory.\nSee section Basic API for a description of the optional arguments\nthat can be used to tell it to look for files in other locations.
\nLike testmod(), testfile()‘s verbosity can be set with the\n-v command-line switch or with the optional keyword argument\nverbose.
\nSince Python 2.6, there is also a command line shortcut for running\ntestfile(). You can instruct the Python interpreter to run the doctest\nmodule directly from the standard library and pass the file name(s) on the\ncommand line:
\npython -m doctest -v example.txt
\nBecause the file name does not end with .py, doctest infers that\nit must be run with testfile(), not testmod().
\nFor more information on testfile(), see section Basic API.
\nThis section examines in detail how doctest works: which docstrings it looks at,\nhow it finds interactive examples, what execution context it uses, how it\nhandles exceptions, and how option flags can be used to control its behavior.\nThis is the information that you need to know to write doctest examples; for\ninformation about actually running doctest on these examples, see the following\nsections.
\nThe module docstring, and all function, class and method docstrings are\nsearched. Objects imported into the module are not searched.
\nIn addition, if M.__test__ exists and “is true”, it must be a dict, and each\nentry maps a (string) name to a function object, class object, or string.\nFunction and class object docstrings found from M.__test__ are searched, and\nstrings are treated as if they were docstrings. In output, a key K in\nM.__test__ appears with name
\n<name of M>.__test__.K
\nAny classes found are recursively searched similarly, to test docstrings in\ntheir contained methods and nested classes.
\n\nChanged in version 2.4: A “private name” concept is deprecated and no longer documented.
\nIn most cases a copy-and-paste of an interactive console session works fine,\nbut doctest isn’t trying to do an exact emulation of any specific Python shell.
\n>>> # comments are ignored\n>>> x = 12\n>>> x\n12\n>>> if x == 13:\n... print "yes"\n... else:\n... print "no"\n... print "NO"\n... print "NO!!!"\n...\nno\nNO\nNO!!!\n>>>\n
Any expected output must immediately follow the final '>>> ' or '... '\nline containing the code, and the expected output (if any) extends to the next\n'>>> ' or all-whitespace line.
\nThe fine print:
\nExpected output cannot contain an all-whitespace line, since such a line is\ntaken to signal the end of expected output. If expected output does contain a\nblank line, put <BLANKLINE> in your doctest example each place a blank line\nis expected.
\n\nNew in version 2.4: <BLANKLINE> was added; there was no way to use expected output containing\nempty lines in previous versions.
\nAll hard tab characters are expanded to spaces, using 8-column tab stops.\nTabs in output generated by the tested code are not modified. Because any\nhard tabs in the sample output are expanded, this means that if the code\noutput includes hard tabs, the only way the doctest can pass is if the\nNORMALIZE_WHITESPACE option or directive is in effect.\nAlternatively, the test can be rewritten to capture the output and compare it\nto an expected value as part of the test. This handling of tabs in the\nsource was arrived at through trial and error, and has proven to be the least\nerror prone way of handling them. It is possible to use a different\nalgorithm for handling tabs by writing a custom DocTestParser class.
\n\nChanged in version 2.4: Expanding tabs to spaces is new; previous versions tried to preserve hard tabs,\nwith confusing results.
\nOutput to stdout is captured, but not output to stderr (exception tracebacks\nare captured via a different means).
\nIf you continue a line via backslashing in an interactive session, or for any\nother reason use a backslash, you should use a raw docstring, which will\npreserve your backslashes exactly as you type them:
\n>>> def f(x):\n... r'''Backslashes in a raw docstring: m\\n'''\n>>> print f.__doc__\nBackslashes in a raw docstring: m\\n\n
Otherwise, the backslash will be interpreted as part of the string. For example, the "\n" above would be interpreted as a newline character. Alternatively, you can double each backslash in the doctest version (and not use a raw string):
\n>>> def f(x):\n... '''Backslashes in a raw docstring: m\\\\n'''\n>>> print f.__doc__\nBackslashes in a raw docstring: m\\n\n
The starting column doesn’t matter:
\n>>> assert "Easy!"\n >>> import math\n >>> math.floor(1.9)\n 1.0\n
and as many leading whitespace characters are stripped from the expected output\nas appeared in the initial '>>> ' line that started the example.
\nBy default, each time doctest finds a docstring to test, it uses a\nshallow copy of M‘s globals, so that running tests doesn’t change the\nmodule’s real globals, and so that one test in M can’t leave behind\ncrumbs that accidentally allow another test to work. This means examples can\nfreely use any names defined at top-level in M, and names defined earlier\nin the docstring being run. Examples cannot see names defined in other\ndocstrings.
\nYou can force use of your own dict as the execution context by passing\nglobs=your_dict to testmod() or testfile() instead.
\nNo problem, provided that the traceback is the only output produced by the\nexample: just paste in the traceback. [1] Since tracebacks contain details\nthat are likely to change rapidly (for example, exact file paths and line\nnumbers), this is one case where doctest works hard to be flexible in what it\naccepts.
\nSimple example:
\n>>> [1, 2, 3].remove(42)\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: list.remove(x): x not in list\n
That doctest succeeds if ValueError is raised, with the list.remove(x):\nx not in list detail as shown.
\nThe expected output for an exception must start with a traceback header, which\nmay be either of the following two lines, indented the same as the first line of\nthe example:
\nTraceback (most recent call last):\nTraceback (innermost last):
\nThe traceback header is followed by an optional traceback stack, whose contents\nare ignored by doctest. The traceback stack is typically omitted, or copied\nverbatim from an interactive session.
\nThe traceback stack is followed by the most interesting part: the line(s)\ncontaining the exception type and detail. This is usually the last line of a\ntraceback, but can extend across multiple lines if the exception has a\nmulti-line detail:
\n>>> raise ValueError('multi\\n line\\ndetail')\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nValueError: multi\n line\ndetail\n
The last three lines (starting with ValueError) are compared against the\nexception’s type and detail, and the rest are ignored.
\n\nChanged in version 2.4: Previous versions were unable to handle multi-line exception details.
\nBest practice is to omit the traceback stack, unless it adds significant\ndocumentation value to the example. So the last example is probably better as:
\n>>> raise ValueError('multi\\n line\\ndetail')\nTraceback (most recent call last):\n ...\nValueError: multi\n line\ndetail\n
Note that tracebacks are treated very specially. In particular, in the\nrewritten example, the use of ... is independent of doctest’s\nELLIPSIS option. The ellipsis in that example could be left out, or\ncould just as well be three (or three hundred) commas or digits, or an indented\ntranscript of a Monty Python skit.
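A runnable sketch of the recommended style, again driven through DocTestFinder/DocTestRunner with module=False so it works outside an importable module:

```python
import doctest

def pop_empty():
    """Popping an empty list raises IndexError.

    >>> pop_empty()
    Traceback (most recent call last):
        ...
    IndexError: pop from empty list
    """
    return [].pop()

runner = doctest.DocTestRunner(verbose=False)
for test in doctest.DocTestFinder().find(pop_empty, module=False,
                                         globs={"pop_empty": pop_empty}):
    runner.run(test)

# The example passes: the type and detail match, the stack is elided.
assert runner.failures == 0 and runner.tries == 1
```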
\nSome details you should read once, but won’t need to remember:
\nDoctest can’t guess whether your expected output came from an exception\ntraceback or from ordinary printing. So, e.g., an example that expects\nValueError: 42 is prime will pass whether ValueError is actually\nraised or if the example merely prints that traceback text. In practice,\nordinary output rarely begins with a traceback header line, so this doesn’t\ncreate real problems.
\nEach line of the traceback stack (if present) must be indented further than\nthe first line of the example, or start with a non-alphanumeric character.\nThe first line following the traceback header indented the same and starting\nwith an alphanumeric is taken to be the start of the exception detail. Of\ncourse this does the right thing for genuine tracebacks.
\nWhen the IGNORE_EXCEPTION_DETAIL doctest option is specified,\neverything following the leftmost colon and any module information in the\nexception name is ignored.
\nThe interactive shell omits the traceback header line for some\nSyntaxErrors. But doctest uses the traceback header line to\ndistinguish exceptions from non-exceptions. So in the rare case where you need\nto test a SyntaxError that omits the traceback header, you will need to\nmanually add the traceback header line to your test example.
\nFor some SyntaxErrors, Python displays the character position of the\nsyntax error, using a ^ marker:
\n>>> 1 1\n File "<stdin>", line 1\n 1 1\n ^\nSyntaxError: invalid syntax\n
Since the lines showing the position of the error come before the exception type\nand detail, they are not checked by doctest. For example, the following test\nwould pass, even though it puts the ^ marker in the wrong location:
\n>>> 1 1\n File "<stdin>", line 1\n 1 1\n ^\nSyntaxError: invalid syntax\n
A number of option flags control various aspects of doctest’s behavior.\nSymbolic names for the flags are supplied as module constants, which can be\nor’ed together and passed to various functions. The names can also be used in\ndoctest directives (see below).
\nThe first group of options define test semantics, controlling aspects of how\ndoctest decides whether actual output matches an example’s expected output:
\nWhen specified, an example that expects an exception passes if an exception of\nthe expected type is raised, even if the exception detail does not match. For\nexample, an example expecting ValueError: 42 will pass if the actual\nexception raised is ValueError: 3*14, but will fail, e.g., if\nTypeError is raised.
\nIt will also ignore the module name used in Python 3 doctest reports. Hence\nboth these variations will work regardless of whether the test is run under\nPython 2.7 or Python 3.2 (or later versions):
\n\n\n\n\n>>> raise CustomError('message') #doctest: +IGNORE_EXCEPTION_DETAIL\nTraceback (most recent call last):\nCustomError: message\n\n\n>>> raise CustomError('message') #doctest: +IGNORE_EXCEPTION_DETAIL\nTraceback (most recent call last):\nmy_module.CustomError: message\n
Note that ELLIPSIS can also be used to ignore the\ndetails of the exception message, but such a test may still fail based\non whether or not the module details are printed as part of the\nexception name. Using IGNORE_EXCEPTION_DETAIL and the details\nfrom Python 2.3 is also the only clear way to write a doctest that doesn’t\ncare about the exception detail yet continues to pass under Python 2.3 or\nearlier (those releases do not support doctest directives and ignore them\nas irrelevant comments). For example,
\n>>> (1, 2)[3] = 'moo' #doctest: +IGNORE_EXCEPTION_DETAIL\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\nTypeError: object doesn't support item assignment\n
passes under Python 2.3 and later Python versions, even though the detail\nchanged in Python 2.4 to say “does not” instead of “doesn’t”.
\n\nChanged in version 2.7: IGNORE_EXCEPTION_DETAIL now also ignores any information\nrelating to the module containing the exception under test
\nWhen specified, do not run the example at all. This can be useful in contexts\nwhere doctest examples serve as both documentation and test cases, and an\nexample should be included for documentation purposes, but should not be\nchecked. E.g., the example’s output might be random; or the example might\ndepend on resources which would be unavailable to the test driver.
\nThe SKIP flag can also be used for temporarily “commenting out” examples.
\n\nNew in version 2.5.
\nThe second group of options controls how test failures are reported:
\n“Doctest directives” may be used to modify the option flags for individual\nexamples. Doctest directives are expressed as a special Python comment\nfollowing an example’s source code:
\n\ndirective ::= "#" "doctest:" directive_options\ndirective_options ::= directive_option ("," directive_option)\\*\ndirective_option ::= on_or_off directive_option_name\non_or_off ::= "+" \\| "-"\ndirective_option_name ::= "DONT_ACCEPT_BLANKLINE" \\| "NORMALIZE_WHITESPACE" \\| ...\n\n
Whitespace is not allowed between the + or - and the directive option\nname. The directive option name can be any of the option flag names explained\nabove.
\nAn example’s doctest directives modify doctest’s behavior for that single\nexample. Use + to enable the named behavior, or - to disable it.
\nFor example, this test passes:
\n>>> print range(20) #doctest: +NORMALIZE_WHITESPACE\n[0, 1, 2, 3, 4, 5, 6, 7, 8, 9,\n10, 11, 12, 13, 14, 15, 16, 17, 18, 19]\n
Without the directive it would fail, both because the actual output doesn’t have\ntwo blanks before the single-digit list elements, and because the actual output\nis on a single line. This test also passes, and also requires a directive to do\nso:
\n>>> print range(20) # doctest:+ELLIPSIS\n[0, 1, ..., 18, 19]\n
Multiple directives can be used on a single physical line, separated by commas:
\n>>> print range(20) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE\n[0, 1, ..., 18, 19]\n
If multiple directive comments are used for a single example, then they are\ncombined:
\n>>> print range(20) # doctest: +ELLIPSIS\n... # doctest: +NORMALIZE_WHITESPACE\n[0, 1, ..., 18, 19]\n
As the previous example shows, you can add ... lines to your example\ncontaining only directives. This can be useful when an example is too long for\na directive to comfortably fit on the same line:
\n>>> print range(5) + range(10,20) + range(30,40) + range(50,60)\n... # doctest: +ELLIPSIS\n[0, ..., 4, 10, ..., 19, 30, ..., 39, 50, ..., 59]\n
Note that since all options are disabled by default, and directives apply only\nto the example they appear in, enabling options (via + in a directive) is\nusually the only meaningful choice. However, option flags can also be passed to\nfunctions that run doctests, establishing different defaults. In such cases,\ndisabling an option via - in a directive can be useful.
\n\nNew in version 2.4: Doctest directives and the associated constants\nDONT_ACCEPT_BLANKLINE, NORMALIZE_WHITESPACE,\nELLIPSIS, IGNORE_EXCEPTION_DETAIL, REPORT_UDIFF,\nREPORT_CDIFF, REPORT_NDIFF,\nREPORT_ONLY_FIRST_FAILURE, COMPARISON_FLAGS and\nREPORTING_FLAGS were added.
\nThere’s also a way to register new option flag names, although this isn’t useful\nunless you intend to extend doctest internals via subclassing:
\nCreate a new option flag with a given name, and return the new flag’s integer\nvalue. register_optionflag() can be used when subclassing\nOutputChecker or DocTestRunner to create new options that are\nsupported by your subclasses. register_optionflag() should always be\ncalled using the following idiom:
\nMY_FLAG = register_optionflag('MY_FLAG')\n
\nNew in version 2.4.
\ndoctest is serious about requiring exact matches in expected output. If\neven a single character doesn’t match, the test fails. This will probably\nsurprise you a few times, as you learn exactly what Python does and doesn’t\nguarantee about output. For example, when printing a dict, Python doesn’t\nguarantee that the key-value pairs will be printed in any particular order, so a\ntest like
\n>>> foo()\n{"Hermione": "hippogryph", "Harry": "broomstick"}\n
is vulnerable! One workaround is to do
\n>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}\nTrue\n
instead. Another is to do
\n>>> d = foo().items()\n>>> d.sort()\n>>> d\n[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]\n
There are others, but you get the idea.
\nAnother bad idea is to print things that embed an object address, like
\n>>> id(1.0) # certain to fail some of the time\n7948648\n>>> class C: pass\n>>> C() # the default repr() for instances embeds an address\n<__main__.C instance at 0x00AC18F0>\n
The ELLIPSIS directive gives a nice approach for the last example:
\n>>> C() #doctest: +ELLIPSIS\n<__main__.C instance at 0x...>\n
Floating-point numbers are also subject to small output variations across\nplatforms, because Python defers to the platform C library for float formatting,\nand C libraries vary widely in quality here.
\n>>> 1./7 # risky\n0.14285714285714285\n>>> print 1./7 # safer\n0.142857142857\n>>> print round(1./7, 6) # much safer\n0.142857\n
Numbers of the form I/2.**J are safe across all platforms, and I often\ncontrive doctest examples to produce numbers of that form:
\n>>> 3./4 # utterly safe\n0.75\n
Simple fractions are also easier for people to understand, and that makes for\nbetter documentation.
\nThe functions testmod() and testfile() provide a simple interface to\ndoctest that should be sufficient for most basic uses. For a less formal\nintroduction to these two functions, see sections Simple Usage: Checking Examples in Docstrings\nand Simple Usage: Checking Examples in a Text File.
\nAll arguments except filename are optional, and should be specified in keyword\nform.
\nTest examples in the file named filename. Return (failure_count,\ntest_count).
\nOptional argument module_relative specifies how the filename should be\ninterpreted:
\nOptional argument name gives the name of the test; by default, or if None,\nos.path.basename(filename) is used.
\nOptional argument package is a Python package or the name of a Python package\nwhose directory should be used as the base directory for a module-relative\nfilename. If no package is specified, then the calling module’s directory is\nused as the base directory for module-relative filenames. It is an error to\nspecify package if module_relative is False.
\nOptional argument globs gives a dict to be used as the globals when executing\nexamples. A new shallow copy of this dict is created for the doctest, so its\nexamples start with a clean slate. By default, or if None, a new empty dict\nis used.
\nOptional argument extraglobs gives a dict merged into the globals used to\nexecute examples. This works like dict.update(): if globs and\nextraglobs have a common key, the associated value in extraglobs appears in\nthe combined dict. By default, or if None, no extra globals are used. This\nis an advanced feature that allows parameterization of doctests. For example, a\ndoctest can be written for a base class, using a generic name for the class,\nthen reused to test any number of subclasses by passing an extraglobs dict\nmapping the generic name to the subclass to be tested.
\nOptional argument verbose prints lots of stuff if true, and prints only\nfailures if false; by default, or if None, it’s true if and only if '-v'\nis in sys.argv.
\nOptional argument report prints a summary at the end when true, else prints\nnothing at the end. In verbose mode, the summary is detailed, else the summary\nis very brief (in fact, empty if all tests passed).
\nOptional argument optionflags or’s together option flags. See section\nOption Flags and Directives.
\nOptional argument raise_on_error defaults to false. If true, an exception is\nraised upon the first failure or unexpected exception in an example. This\nallows failures to be post-mortem debugged. Default behavior is to continue\nrunning examples.
\nOptional argument parser specifies a DocTestParser (or subclass) that\nshould be used to extract tests from the files. It defaults to a normal parser\n(i.e., DocTestParser()).
\nOptional argument encoding specifies an encoding that should be used to\nconvert the file to unicode.
\n\nNew in version 2.4.
\n\nChanged in version 2.5: The parameter encoding was added.
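The testfile() interface above can be exercised end to end with a throwaway file. The sketch below is a minimal example, not part of the original documentation: it uses an ordinary filesystem path (hence module_relative=False) and print-function syntax so it also runs on modern Pythons.

```python
import doctest
import os
import tempfile

# Write a small text file containing one doctest example.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("Example session:\n\n>>> 2 + 3\n5\n")

# module_relative=False means `path` is an ordinary filesystem path.
results = doctest.testfile(path, module_relative=False, verbose=False)
os.remove(path)
# results is the (failure_count, test_count) pair: 0 failures, 1 test run.
```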
\nAll arguments are optional, and all except for m should be specified in\nkeyword form.
\nTest examples in docstrings in functions and classes reachable from module m\n(or module __main__ if m is not supplied or is None), starting with\nm.__doc__.
\nAlso test examples reachable from dict m.__test__, if it exists and is not\nNone. m.__test__ maps names (strings) to functions, classes and\nstrings; function and class docstrings are searched for examples; strings are\nsearched directly, as if they were docstrings.
\nOnly docstrings attached to objects belonging to module m are searched.
\nReturn (failure_count, test_count).
\nOptional argument name gives the name of the module; by default, or if\nNone, m.__name__ is used.
\nOptional argument exclude_empty defaults to false. If true, objects for which\nno doctests are found are excluded from consideration. The default is a backward\ncompatibility hack, so that code still using doctest.master.summarize() in\nconjunction with testmod() continues to get output for objects with no\ntests. The exclude_empty argument to the newer DocTestFinder\nconstructor defaults to true.
\nOptional arguments extraglobs, verbose, report, optionflags,\nraise_on_error, and globs are the same as for function testfile()\nabove, except that globs defaults to m.__dict__.
\n\nChanged in version 2.3: The parameter optionflags was added.
\n\nChanged in version 2.4: The parameters extraglobs, raise_on_error and exclude_empty were added.
\n\nChanged in version 2.5: The optional argument isprivate, deprecated in 2.4, was removed.
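As a concrete sketch of testmod() and the __test__ mechanism, the following builds a throwaway module object (the name demo is arbitrary, not a real module) and passes it explicitly as m; strings in __test__ are searched directly, as if they were docstrings.

```python
import doctest
import types

# A throwaway module whose docstring and __test__ dict both carry examples.
m = types.ModuleType("demo")
m.__doc__ = """
>>> 1 + 1
2
"""
# String values in __test__ are searched for examples directly.
m.__test__ = {"powers": ">>> 2 ** 5\n32\n"}

results = doctest.testmod(m, verbose=False)
# results is TestResults(failed=0, attempted=2)
```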
\nThere’s also a function to run the doctests associated with a single object.\nThis function is provided for backward compatibility. There are no plans to\ndeprecate it, but it’s rarely useful:
\nTest examples associated with object f; for example, f may be a module,\nfunction, or class object.
\nA shallow copy of dictionary argument globs is used for the execution context.
\nOptional argument name is used in failure messages, and defaults to\n"NoName".
\nIf optional argument verbose is true, output is generated even if there are no\nfailures. By default, output is generated only in case of an example failure.
\nOptional argument compileflags gives the set of flags that should be used by\nthe Python compiler when running the examples. By default, or if None,\nflags are deduced corresponding to the set of future features found in globs.
\nOptional argument optionflags works as for function testfile() above.
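A minimal sketch of run_docstring_examples(): on success it prints nothing, which the snippet below makes visible by capturing stdout (f is a hypothetical function invented for the example).

```python
import contextlib
import doctest
import io

def f(x):
    """
    >>> f(4)
    8
    """
    return 2 * x

# Capture stdout: silence means every example passed.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    doctest.run_docstring_examples(f, {"f": f}, name="f")
print(repr(buf.getvalue()))  # ''
```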
\nAs your collection of doctest’ed modules grows, you’ll want a way to run all\ntheir doctests systematically. Prior to Python 2.4, doctest had a barely\ndocumented Tester class that supplied a rudimentary way to combine\ndoctests from multiple modules. Tester was feeble, and in practice most\nserious Python testing frameworks build on the unittest module, which\nsupplies many flexible ways to combine tests from multiple sources. So, in\nPython 2.4, doctest’s Tester class is deprecated, and\ndoctest provides two functions that can be used to create unittest\ntest suites from modules and text files containing doctests. To integrate with\nunittest test discovery, include a load_tests() function in your\ntest module:
\nimport unittest\nimport doctest\nimport my_module_with_doctests\n\ndef load_tests(loader, tests, ignore):\n tests.addTests(doctest.DocTestSuite(my_module_with_doctests))\n return tests\n
There are two main functions for creating unittest.TestSuite instances\nfrom text files and modules with doctests:
\nConvert doctest tests from one or more text files to a\nunittest.TestSuite.
\nThe returned unittest.TestSuite is to be run by the unittest framework\nand runs the interactive examples in each file. If an example in any file\nfails, then the synthesized unit test fails, and a failureException\nexception is raised showing the name of the file containing the test and a\n(sometimes approximate) line number.
\nPass one or more paths (as strings) to text files to be examined.
\nOptions may be provided as keyword arguments:
\nOptional argument module_relative specifies how the filenames in paths\nshould be interpreted:
\nOptional argument package is a Python package or the name of a Python\npackage whose directory should be used as the base directory for\nmodule-relative filenames in paths. If no package is specified, then the\ncalling module’s directory is used as the base directory for module-relative\nfilenames. It is an error to specify package if module_relative is\nFalse.
\nOptional argument setUp specifies a set-up function for the test suite.\nThis is called before running the tests in each file. The setUp function\nwill be passed a DocTest object. The setUp function can access the\ntest globals as the globs attribute of the test passed.
\nOptional argument tearDown specifies a tear-down function for the test\nsuite. This is called after running the tests in each file. The tearDown\nfunction will be passed a DocTest object. The tearDown function can\naccess the test globals as the globs attribute of the test passed.
\nOptional argument globs is a dictionary containing the initial global\nvariables for the tests. A new copy of this dictionary is created for each\ntest. By default, globs is a new empty dictionary.
\nOptional argument optionflags specifies the default doctest options for the\ntests, created by or-ing together individual option flags. See section\nOption Flags and Directives. See function set_unittest_reportflags() below\nfor a better way to set reporting options.
\nOptional argument parser specifies a DocTestParser (or subclass)\nthat should be used to extract tests from the files. It defaults to a normal\nparser (i.e., DocTestParser()).
\nOptional argument encoding specifies an encoding that should be used to\nconvert the file to unicode.
\n\nNew in version 2.4.
\n\nChanged in version 2.5: The global __file__ was added to the globals provided to doctests\nloaded from a text file using DocFileSuite().
\n\nChanged in version 2.5: The parameter encoding was added.
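Putting DocFileSuite() together with a unittest runner might look like the following sketch (again a temporary file and module_relative=False, with a silenced TextTestRunner; each file becomes one synthesized test case):

```python
import doctest
import io
import os
import tempfile
import unittest

# Write a doctest text file, then wrap it in a unittest suite.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write(">>> 3 * 7\n21\n")

suite = doctest.DocFileSuite(path, module_relative=False)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
os.remove(path)
# result.testsRun == 1 and result.wasSuccessful()
```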
\nConvert doctest tests for a module to a unittest.TestSuite.
\nThe returned unittest.TestSuite is to be run by the unittest framework\nand runs each doctest in the module. If any of the doctests fail, then the\nsynthesized unit test fails, and a failureException exception is raised\nshowing the name of the file containing the test and a (sometimes approximate)\nline number.
\nOptional argument module provides the module to be tested. It can be a module\nobject or a (possibly dotted) module name. If not specified, the module calling\nthis function is used.
\nOptional argument globs is a dictionary containing the initial global\nvariables for the tests. A new copy of this dictionary is created for each\ntest. By default, globs is a new empty dictionary.
\nOptional argument extraglobs specifies an extra set of global variables, which\nis merged into globs. By default, no extra globals are used.
\nOptional argument test_finder is the DocTestFinder object (or a\ndrop-in replacement) that is used to extract doctests from the module.
\nOptional arguments setUp, tearDown, and optionflags are the same as for\nfunction DocFileSuite() above.
\n\nNew in version 2.3.
\n\nChanged in version 2.4: The parameters globs, extraglobs, test_finder, setUp, tearDown, and\noptionflags were added; this function now uses the same search technique as\ntestmod().
\nUnder the covers, DocTestSuite() creates a unittest.TestSuite out\nof doctest.DocTestCase instances, and DocTestCase is a\nsubclass of unittest.TestCase. DocTestCase isn’t documented\nhere (it’s an internal detail), but studying its code can answer questions about\nthe exact details of unittest integration.
\nSimilarly, DocFileSuite() creates a unittest.TestSuite out of\ndoctest.DocFileCase instances, and DocFileCase is a subclass\nof DocTestCase.
\nSo both ways of creating a unittest.TestSuite run instances of\nDocTestCase. This is important for a subtle reason: when you run\ndoctest functions yourself, you can control the doctest options in\nuse directly, by passing option flags to doctest functions. However, if\nyou’re writing a unittest framework, unittest ultimately controls\nwhen and how tests get run. The framework author typically wants to control\ndoctest reporting options (perhaps, e.g., specified by command line\noptions), but there’s no way to pass options through unittest to\ndoctest test runners.
\nFor this reason, doctest also supports a notion of doctest\nreporting flags specific to unittest support, via this function:
\nSet the doctest reporting flags to use.
\nArgument flags or’s together option flags. See section\nOption Flags and Directives. Only “reporting flags” can be used.
\nThis is a module-global setting, and affects all future doctests run by module\nunittest: the runTest() method of DocTestCase looks at\nthe option flags specified for the test case when the DocTestCase\ninstance was constructed. If no reporting flags were specified (which is the\ntypical and expected case), doctest’s unittest reporting flags are\nor’ed into the option flags, and the option flags so augmented are passed to the\nDocTestRunner instance created to run the doctest. If any reporting\nflags were specified when the DocTestCase instance was constructed,\ndoctest’s unittest reporting flags are ignored.
\nThe value of the unittest reporting flags in effect before the function\nwas called is returned by the function.
\n\nNew in version 2.4.
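A short sketch of set_unittest_reportflags(): only reporting flags are accepted, and the previous setting is returned so it can be restored afterwards.

```python
import doctest

# Ask unittest-driven doctests for ndiff output, first failure only.
old = doctest.set_unittest_reportflags(
    doctest.REPORT_NDIFF | doctest.REPORT_ONLY_FIRST_FAILURE)

# The return value is the previous setting; restore it when done.
current = doctest.set_unittest_reportflags(old)
```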
\nThe basic API is a simple wrapper that’s intended to make doctest easy to use.\nIt is fairly flexible, and should meet most users’ needs; however, if you\nrequire more fine-grained control over testing, or wish to extend doctest’s\ncapabilities, then you should use the advanced API.
\nThe advanced API revolves around two container classes, which are used to store\nthe interactive examples extracted from doctest cases:
\nAdditional processing classes are defined to find, parse, run, and check\ndoctest examples:
\nThe relationships among these processing classes are summarized in the following\ndiagram:
\n list of:\n+------+ +---------+\n|module| --DocTestFinder-> | DocTest | --DocTestRunner-> results\n+------+ | ^ +---------+ | ^ (printed)\n | | | Example | | |\n v | | ... | v |\n DocTestParser | Example | OutputChecker\n +---------+
\nA collection of doctest examples that should be run in a single namespace. The\nconstructor arguments are used to initialize the attributes of the same names.
\n\nNew in version 2.4.
\nDocTest defines the following attributes. They are initialized by\nthe constructor, and should not be modified directly.
\nA single interactive example, consisting of a Python statement and its expected\noutput. The constructor arguments are used to initialize the attributes of the\nsame names.
\n\nNew in version 2.4.
\nExample defines the following attributes. They are initialized by\nthe constructor, and should not be modified directly.
\nA processing class used to extract the DocTests that are relevant to\na given object, from its docstring and the docstrings of its contained objects.\nDocTests can currently be extracted from the following object types:\nmodules, functions, classes, methods, staticmethods, classmethods, and\nproperties.
\nThe optional argument verbose can be used to display the objects searched by\nthe finder. It defaults to False (no output).
\nThe optional argument parser specifies the DocTestParser object (or a\ndrop-in replacement) that is used to extract doctests from docstrings.
\nIf the optional argument recurse is false, then DocTestFinder.find()\nwill only examine the given object, and not any contained objects.
\nIf the optional argument exclude_empty is false, then\nDocTestFinder.find() will include tests for objects with empty docstrings.
\n\nNew in version 2.4.
\nDocTestFinder defines the following method:
\nReturn a list of the DocTests that are defined by obj’s\ndocstring, or by any of its contained objects’ docstrings.
\nThe optional argument name specifies the object’s name; this name will be\nused to construct names for the returned DocTests. If name is\nnot specified, then obj.__name__ is used.
\nThe optional parameter module is the module that contains the given object.\nIf the module is not specified or is None, then the test finder will attempt\nto automatically determine the correct module. The object’s module is used:
\nIf module is False, no attempt to find the module will be made. This is\nobscure, of use mostly in testing doctest itself: if module is False, or\nis None but cannot be found automatically, then all objects are considered\nto belong to the (non-existent) module, so all contained objects will\n(recursively) be searched for doctests.
\nThe globals for each DocTest is formed by combining globs and\nextraglobs (bindings in extraglobs override bindings in globs). A new\nshallow copy of the globals dictionary is created for each DocTest.\nIf globs is not specified, then it defaults to the module’s __dict__, if\nspecified, or {} otherwise. If extraglobs is not specified, then it\ndefaults to {}.
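A minimal sketch of DocTestFinder.find() on a hypothetical class: it returns one DocTest per docstring that contains examples, including those of contained objects.

```python
import doctest

class Calc:
    """
    >>> Calc().double(5)
    10
    """
    def double(self, x):
        """
        >>> Calc().double(7)
        14
        """
        return 2 * x

finder = doctest.DocTestFinder()
tests = finder.find(Calc, name="Calc")
names = sorted(t.name for t in tests)
print(names)  # ['Calc', 'Calc.double']
```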
\nA processing class used to extract interactive examples from a string, and use\nthem to create a DocTest object.
\n\nNew in version 2.4.
\nDocTestParser defines the following methods:
\nExtract all doctest examples from the given string, and collect them into a\nDocTest object.
\nglobs, name, filename, and lineno are attributes for the new\nDocTest object. See the documentation for DocTest for more\ninformation.
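For example, parsing a small string by hand (the names demo and &lt;demo&gt; are arbitrary placeholders):

```python
import doctest

source = """
Words of explanation precede the example.

>>> 6 * 7
42
"""
parser = doctest.DocTestParser()
test = parser.get_doctest(source, globs={}, name="demo",
                          filename="<demo>", lineno=0)
print(test.name, len(test.examples))   # demo 1
print(repr(test.examples[0].source))   # '6 * 7\n'
print(repr(test.examples[0].want))     # '42\n'
```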
\nA processing class used to execute and verify the interactive examples in a\nDocTest.
\nThe comparison between expected outputs and actual outputs is done by an\nOutputChecker. This comparison may be customized with a number of\noption flags; see section Option Flags and Directives for more information. If the\noption flags are insufficient, then the comparison may also be customized by\npassing a subclass of OutputChecker to the constructor.
\nThe test runner’s display output can be controlled in two ways. First, an output\nfunction can be passed to DocTestRunner.run(); this function will be called\nwith strings that should be displayed. It defaults to sys.stdout.write. If\ncapturing the output is not sufficient, then the display output can also be\ncustomized by subclassing DocTestRunner, and overriding the methods\nreport_start(), report_success(),\nreport_unexpected_exception(), and report_failure().
\nThe optional keyword argument checker specifies the OutputChecker\nobject (or drop-in replacement) that should be used to compare the expected\noutputs to the actual outputs of doctest examples.
\nThe optional keyword argument verbose controls the DocTestRunner’s\nverbosity. If verbose is True, then information is printed about each\nexample, as it is run. If verbose is False, then only failures are\nprinted. If verbose is unspecified, or None, then verbose output is used\niff the command-line switch -v is used.
\nThe optional keyword argument optionflags can be used to control how the test\nrunner compares expected output to actual output, and how it displays failures.\nFor more information, see section Option Flags and Directives.
\n\nNew in version 2.4.
\nDocTestRunner defines the following methods:
\nReport that the test runner is about to process the given example. This method\nis provided to allow subclasses of DocTestRunner to customize their\noutput; it should not be called directly.
\nexample is the example about to be processed. test is the test\ncontaining example. out is the output function that was passed to\nDocTestRunner.run().
\nReport that the given example ran successfully. This method is provided to\nallow subclasses of DocTestRunner to customize their output; it\nshould not be called directly.
\nexample is the example about to be processed. got is the actual output\nfrom the example. test is the test containing example. out is the\noutput function that was passed to DocTestRunner.run().
\nReport that the given example failed. This method is provided to allow\nsubclasses of DocTestRunner to customize their output; it should not\nbe called directly.
\nexample is the example about to be processed. got is the actual output\nfrom the example. test is the test containing example. out is the\noutput function that was passed to DocTestRunner.run().
\nReport that the given example raised an unexpected exception. This method is\nprovided to allow subclasses of DocTestRunner to customize their\noutput; it should not be called directly.
\nexample is the example about to be processed. exc_info is a tuple\ncontaining information about the unexpected exception (as returned by\nsys.exc_info()). test is the test containing example. out is the\noutput function that was passed to DocTestRunner.run().
\nRun the examples in test (a DocTest object), and display the\nresults using the writer function out.
\nThe examples are run in the namespace test.globs. If clear_globs is\ntrue (the default), then this namespace will be cleared after the test runs,\nto help with garbage collection. If you would like to examine the namespace\nafter the test completes, then use clear_globs=False.
\ncompileflags gives the set of flags that should be used by the Python\ncompiler when running the examples. If not specified, then it will default to\nthe set of future-import flags that apply to globs.
\nThe output of each example is checked using the DocTestRunner’s\noutput checker, and the results are formatted by the\nDocTestRunner.report_*() methods.
\nPrint a summary of all the test cases that have been run by this DocTestRunner,\nand return a named tuple TestResults(failed, attempted).
\nThe optional verbose argument controls how detailed the summary is. If the\nverbosity is not specified, then the DocTestRunner’s verbosity is\nused.
\n\nChanged in version 2.6: Use a named tuple.
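The finder and runner classes compose as in the following sketch, which finds and runs the examples of one hypothetical function and then summarizes:

```python
import doctest

def triple(x):
    """
    >>> triple(3)
    9
    """
    return 3 * x

finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)

report = []  # collect any text the runner would have printed
for test in finder.find(triple, name="triple"):
    runner.run(test, out=report.append)

results = runner.summarize(verbose=False)
# results is the named tuple TestResults(failed=0, attempted=1)
```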
\nA class used to check whether the actual output from a doctest example\nmatches the expected output. OutputChecker defines two methods:\ncheck_output(), which compares a given pair of outputs, and returns true\nif they match; and output_difference(), which returns a string describing\nthe differences between two outputs.
\n\nNew in version 2.4.
\nOutputChecker defines the following methods:
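A brief sketch of check_output(); the third argument is the option-flag bitmask described in section Option Flags and Directives:

```python
import doctest

checker = doctest.OutputChecker()

exact = checker.check_output("4\n", "4\n", 0)
mismatch = checker.check_output("4\n", "5\n", 0)
# With ELLIPSIS, '...' in the expected output matches any substring.
ellipsis = checker.check_output("<C instance at 0x...>\n",
                                "<C instance at 0x00AC18F0>\n",
                                doctest.ELLIPSIS)
print(exact, mismatch, ellipsis)  # True False True
```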
\nDoctest provides several mechanisms for debugging doctest examples:
\nSeveral functions convert doctests to executable Python programs, which can be\nrun under the Python debugger, pdb.
\nThe DebugRunner class is a subclass of DocTestRunner that\nraises an exception for the first failing example, containing information about\nthat example. This information can be used to perform post-mortem debugging on\nthe example.
\nThe unittest cases generated by DocTestSuite() support the\ndebug() method defined by unittest.TestCase.
\nYou can add a call to pdb.set_trace() in a doctest example, and you’ll\ndrop into the Python debugger when that line is executed. Then you can inspect\ncurrent values of variables, and so on. For example, suppose a.py\ncontains just this module docstring:
\n"""\n>>> def f(x):\n... g(x*2)\n>>> def g(x):\n... print x+3\n... import pdb; pdb.set_trace()\n>>> f(3)\n9\n"""\n
Then an interactive Python session may look like this:
\n>>> import a, doctest\n>>> doctest.testmod(a)\n--Return--\n> <doctest a[1]>(3)g()->None\n-> import pdb; pdb.set_trace()\n(Pdb) list\n 1 def g(x):\n 2 print x+3\n 3 -> import pdb; pdb.set_trace()\n[EOF]\n(Pdb) print x\n6\n(Pdb) step\n--Return--\n> <doctest a[0]>(2)f()->None\n-> g(x*2)\n(Pdb) list\n 1 def f(x):\n 2 -> g(x*2)\n[EOF]\n(Pdb) print x\n3\n(Pdb) step\n--Return--\n> <doctest a[2]>(1)?()->None\n-> f(3)\n(Pdb) cont\n(0, 3)\n>>>\n
\nChanged in version 2.4: The ability to use pdb.set_trace() usefully inside doctests was added.
\nFunctions that convert doctests to Python code, and possibly run the synthesized\ncode under the debugger:
\nConvert text with examples to a script.
\nArgument s is a string containing doctest examples. The string is converted\nto a Python script, where doctest examples in s are converted to regular code,\nand everything else is converted to Python comments. The generated script is\nreturned as a string. For example,
\nimport doctest\nprint doctest.script_from_examples(r"""\n Set x and y to 1 and 2.\n >>> x, y = 1, 2\n\n Print their sum:\n >>> print x+y\n 3\n""")\n
displays:
\n# Set x and y to 1 and 2.\nx, y = 1, 2\n#\n# Print their sum:\nprint x+y\n# Expected:\n## 3\n
This function is used internally by other functions (see below), but can also be\nuseful when you want to transform an interactive Python session into a Python\nscript.
\n\nNew in version 2.4.
\nConvert the doctest for an object to a script.
\nArgument module is a module object, or dotted name of a module, containing the\nobject whose doctests are of interest. Argument name is the name (within the\nmodule) of the object with the doctests of interest. The result is a string,\ncontaining the object’s docstring converted to a Python script, as described for\nscript_from_examples() above. For example, if module a.py\ncontains a top-level function f(), then
\nimport a, doctest\nprint doctest.testsource(a, "a.f")\n
prints a script version of function f()’s docstring, with doctests\nconverted to code, and the rest placed in comments.
\n\nNew in version 2.3.
\nDebug the doctests for an object.
\nThe module and name arguments are the same as for function\ntestsource() above. The synthesized Python script for the named object’s\ndocstring is written to a temporary file, and then that file is run under the\ncontrol of the Python debugger, pdb.
\nA shallow copy of module.__dict__ is used for both local and global\nexecution context.
\nOptional argument pm controls whether post-mortem debugging is used. If pm\nhas a true value, the script file is run directly, and the debugger gets\ninvolved only if the script terminates via raising an unhandled exception. If\nit does, then post-mortem debugging is invoked, via pdb.post_mortem(),\npassing the traceback object from the unhandled exception. If pm is not\nspecified, or is false, the script is run under the debugger from the start, via\npassing an appropriate execfile() call to pdb.run().
\n\nNew in version 2.3.
\n\nChanged in version 2.4: The pm argument was added.
\nDebug the doctests in a string.
\nThis is like function debug() above, except that a string containing\ndoctest examples is specified directly, via the src argument.
\nOptional argument pm has the same meaning as in function debug() above.
\nOptional argument globs gives a dictionary to use as both local and global\nexecution context. If not specified, or None, an empty dictionary is used.\nIf specified, a shallow copy of the dictionary is used.
\n\nNew in version 2.4.
\nThe DebugRunner class, and the special exceptions it may raise, are of\nmost interest to testing framework authors, and will only be sketched here. See\nthe source code, and especially DebugRunner’s docstring (which is a\ndoctest!) for more details:
\nA subclass of DocTestRunner that raises an exception as soon as a\nfailure is encountered. If an unexpected exception occurs, an\nUnexpectedException exception is raised, containing the test, the\nexample, and the original exception. If the output doesn’t match, then a\nDocTestFailure exception is raised, containing the test, the example, and\nthe actual output.
\nFor information about the constructor parameters and methods, see the\ndocumentation for DocTestRunner in section Advanced API.
\nThere are two exceptions that may be raised by DebugRunner instances:
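A sketch of the failure path: running a deliberately wrong example under DebugRunner raises DocTestFailure, whose attributes carry the test, the example, and the actual output (the names bad and &lt;bad&gt; are arbitrary).

```python
import doctest

parser = doctest.DocTestParser()
bad = parser.get_doctest(">>> 2 + 2\n5\n", {}, "bad", "<bad>", 0)

runner = doctest.DebugRunner(verbose=False)
failure = None
try:
    runner.run(bad)
except doctest.DocTestFailure as e:
    failure = e

print(failure.test.name, repr(failure.got))  # bad '4\n'
```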
\nDocTestFailure defines the following attributes:
\n\n\n\n\nUnexpectedException defines the following attributes:
\n\n\n\n\nAs mentioned in the introduction, doctest has grown to have three primary\nuses:
\nThese uses have different requirements, and it is important to distinguish them.\nIn particular, filling your docstrings with obscure test cases makes for bad\ndocumentation.
\nWhen writing a docstring, choose docstring examples with care. There’s an art to\nthis that needs to be learned—it may not be natural at first. Examples should\nadd genuine value to the documentation. A good example can often be worth many\nwords. If done with care, the examples will be invaluable for your users, and\nwill pay back the time it takes to collect them many times over as the years go\nby and things change. I’m still amazed at how often one of my doctest\nexamples stops working after a “harmless” change.
\nDoctest also makes an excellent tool for regression testing, especially if you\ndon’t skimp on explanatory text. By interleaving prose and examples, it becomes\nmuch easier to keep track of what’s actually being tested, and why. When a test\nfails, good prose can make it much easier to figure out what the problem is, and\nhow it should be fixed. It’s true that you could write extensive comments in\ncode-based testing, but few programmers do. Many have found that using doctest\napproaches instead leads to much clearer tests. Perhaps this is simply because\ndoctest makes writing prose a little easier than writing code, while writing\ncomments in code is a little harder. I think it goes deeper than just that:\nthe natural attitude when writing a doctest-based test is that you want to\nexplain the fine points of your software, and illustrate them with examples.\nThis in turn naturally leads to test files that start with the simplest\nfeatures, and logically progress to complications and edge cases. A coherent\nnarrative is the result, instead of a collection of isolated functions that test\nisolated bits of functionality seemingly at random. It’s a different attitude,\nand produces different results, blurring the distinction between testing and\nexplaining.
\nRegression testing is best confined to dedicated objects or files. There are\nseveral options for organizing tests:
\nFootnotes
\n[1] Examples containing both expected output and an exception are not supported.\nTrying to guess where one ends and the other begins is too error-prone, and that\nalso makes for a confusing test.
\nNew in version 2.1.
\n(If you are already familiar with the basic concepts of testing, you might want\nto skip to the list of assert methods.)
\nThe Python unit testing framework, sometimes referred to as “PyUnit,” is a\nPython language version of JUnit, by Kent Beck and Erich Gamma. JUnit is, in\nturn, a Java version of Kent’s Smalltalk testing framework. Each is the de\nfacto standard unit testing framework for its respective language.
\nunittest supports test automation, sharing of setup and shutdown code for\ntests, aggregation of tests into collections, and independence of the tests from\nthe reporting framework. The unittest module provides classes that make\nit easy to support these qualities for a set of tests.
\nTo achieve this, unittest supports some important concepts:
\nThe test case and test fixture concepts are supported through the\nTestCase and FunctionTestCase classes; the former should be\nused when creating new tests, and the latter can be used when integrating\nexisting test code with a unittest-driven framework. When building test\nfixtures using TestCase, the setUp() and\ntearDown() methods can be overridden to provide initialization\nand cleanup for the fixture. With FunctionTestCase, existing functions\ncan be passed to the constructor for these purposes. When the test is run, the\nfixture initialization is run first; if it succeeds, the cleanup method is run\nafter the test has been executed, regardless of the outcome of the test. Each\ninstance of the TestCase will only be used to run a single test method,\nso a new fixture is created for each test.
\nTest suites are implemented by the TestSuite class. This class allows\nindividual tests and test suites to be aggregated; when the suite is executed,\nall tests added directly to the suite and in “child” test suites are run.
\nA test runner is an object that provides a single method,\nrun(), which accepts a TestCase or TestSuite\nobject as a parameter, and returns a result object. The class\nTestResult is provided for use as the result object. unittest\nprovides the TextTestRunner as an example test runner which reports\ntest results on the standard error stream by default. Alternate runners can be\nimplemented for other environments (such as graphical environments) without any\nneed to derive from a specific class.
\nSee also
\nThe unittest module provides a rich set of tools for constructing and\nrunning tests. This section demonstrates that a small subset of the tools\nsuffice to meet the needs of most users.
\nHere is a short script to test three functions from the random module:
\nimport random\nimport unittest\n\nclass TestSequenceFunctions(unittest.TestCase):\n\n def setUp(self):\n self.seq = range(10)\n\n def test_shuffle(self):\n # make sure the shuffled sequence does not lose any elements\n random.shuffle(self.seq)\n self.seq.sort()\n self.assertEqual(self.seq, range(10))\n\n # should raise an exception for an immutable sequence\n self.assertRaises(TypeError, random.shuffle, (1,2,3))\n\n def test_choice(self):\n element = random.choice(self.seq)\n self.assertTrue(element in self.seq)\n\n def test_sample(self):\n with self.assertRaises(ValueError):\n random.sample(self.seq, 20)\n for element in random.sample(self.seq, 5):\n self.assertTrue(element in self.seq)\n\nif __name__ == '__main__':\n unittest.main()\n
A testcase is created by subclassing unittest.TestCase. The three\nindividual tests are defined with methods whose names start with the letters\ntest. This naming convention informs the test runner about which methods\nrepresent tests.
\nThe crux of each test is a call to assertEqual() to check for an\nexpected result; assertTrue() to verify a condition; or\nassertRaises() to verify that an expected exception gets raised.\nThese methods are used instead of the assert statement so the test\nrunner can accumulate all test results and produce a report.
\nWhen a setUp() method is defined, the test runner will run that\nmethod prior to each test. Likewise, if a tearDown() method is\ndefined, the test runner will invoke that method after each test. In the\nexample, setUp() was used to create a fresh sequence for each\ntest.
\nThe final block shows a simple way to run the tests. unittest.main()\nprovides a command-line interface to the test script. When run from the command\nline, the above script produces an output that looks like this:
\n...\n----------------------------------------------------------------------\nRan 3 tests in 0.000s\n\nOK
\nInstead of unittest.main(), there are other ways to run the tests with a\nfiner level of control, less terse output, and no requirement to be run from the\ncommand line. For example, the last two lines may be replaced with:
\nsuite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions)\nunittest.TextTestRunner(verbosity=2).run(suite)\n
Running the revised script from the interpreter or another script produces the\nfollowing output:
\ntest_choice (__main__.TestSequenceFunctions) ... ok\ntest_sample (__main__.TestSequenceFunctions) ... ok\ntest_shuffle (__main__.TestSequenceFunctions) ... ok\n\n----------------------------------------------------------------------\nRan 3 tests in 0.110s\n\nOK
\nThe above examples show the most commonly used unittest features which\nare sufficient to meet many everyday testing needs. The remainder of the\ndocumentation explores the full feature set from first principles.
\nThe unittest module can be used from the command line to run tests from\nmodules, classes or even individual test methods:
\npython -m unittest test_module1 test_module2\npython -m unittest test_module.TestClass\npython -m unittest test_module.TestClass.test_method
\nYou can pass in a list with any combination of module names, and fully\nqualified class or method names.
\nYou can run tests with more detail (higher verbosity) by passing in the -v flag:
\npython -m unittest -v test_module
\nFor a list of all the command-line options:
\npython -m unittest -h
\n\nChanged in version 2.7: In earlier versions it was only possible to run individual test methods and\nnot modules or classes.
\nunittest supports these command-line options:
\nControl-C during the test run waits for the current test to end and then\nreports all the results so far. A second control-C raises the normal\nKeyboardInterrupt exception.
\nSee Signal Handling for the functions that provide this functionality.
\n\nNew in version 2.7: The command-line options -b, -c and -f were added.
\nThe command line can also be used for test discovery, for running all of the\ntests in a project or just a subset.
\n\nNew in version 2.7.
\nUnittest supports simple test discovery. In order to be compatible with test\ndiscovery, all of the test files must be modules or\npackages importable from the top-level directory of\nthe project (this means that their filenames must be valid\nidentifiers).
\nTest discovery is implemented in TestLoader.discover(), but can also be\nused from the command line. The basic command-line usage is:
\ncd project_directory\npython -m unittest discover
\nThe discover sub-command has the following options:
\nThe -s, -p, and -t options can be passed in\nas positional arguments in that order. The following two command lines\nare equivalent:
\npython -m unittest discover -s project_directory -p '*_test.py'\npython -m unittest discover project_directory '*_test.py'
\nAs well as being a path it is possible to pass a package name, for example\nmyproject.subpackage.test, as the start directory. The package name you\nsupply will then be imported and its location on the filesystem will be used\nas the start directory.
\nCaution
\nTest discovery loads tests by importing them. Once test discovery has\nfound all the test files from the start directory you specify it turns the\npaths into package names to import. For example foo/bar/baz.py will be\nimported as foo.bar.baz.
\nIf you have a package installed globally and attempt test discovery on\na different copy of the package then the import could happen from the\nwrong place. If this happens test discovery will warn you and exit.
\nIf you supply the start directory as a package name rather than a\npath to a directory then discover assumes that whichever location it\nimports from is the location you intended, so you will not get the\nwarning.
\nTest modules and packages can customize test loading and discovery through\nthe load_tests protocol.
\nThe basic building blocks of unit testing are test cases — single\nscenarios that must be set up and checked for correctness. In unittest,\ntest cases are represented by instances of unittest‘s TestCase\nclass. To make your own test cases you must write subclasses of\nTestCase, or use FunctionTestCase.
\nAn instance of a TestCase-derived class is an object that can\ncompletely run a single test method, together with optional set-up and tidy-up\ncode.
\nThe testing code of a TestCase instance should be entirely self\ncontained, such that it can be run either in isolation or in arbitrary\ncombination with any number of other test cases.
\nThe simplest TestCase subclass will simply override the\nrunTest() method in order to perform specific testing code:
\nimport unittest\n\nclass DefaultWidgetSizeTestCase(unittest.TestCase):\n def runTest(self):\n widget = Widget('The widget')\n self.assertEqual(widget.size(), (50, 50), 'incorrect default size')\n
Note that in order to test something, we use one of the assert*()\nmethods provided by the TestCase base class. If the test fails, an\nexception will be raised, and unittest will identify the test case as a\nfailure. Any other exceptions will be treated as errors. This\nhelps you identify where the problem is: failures are caused by incorrect\nresults - a 5 where you expected a 6. Errors are caused by incorrect\ncode - e.g., a TypeError caused by an incorrect function call.
\nThe way to run a test case will be described later. For now, note that to\nconstruct an instance of such a test case, we call its constructor without\narguments:
\ntestCase = DefaultWidgetSizeTestCase()\n
Now, such test cases can be numerous, and their set-up can be repetitive. In\nthe above case, constructing a Widget in each of 100 Widget test case\nsubclasses would mean unsightly duplication.
\nLuckily, we can factor out such set-up code by implementing a method called\nsetUp(), which the testing framework will automatically call for\nus when we run the test:
\nimport unittest\n\nclass SimpleWidgetTestCase(unittest.TestCase):\n def setUp(self):\n self.widget = Widget('The widget')\n\nclass DefaultWidgetSizeTestCase(SimpleWidgetTestCase):\n def runTest(self):\n self.assertEqual(self.widget.size(), (50,50),\n 'incorrect default size')\n\nclass WidgetResizeTestCase(SimpleWidgetTestCase):\n def runTest(self):\n self.widget.resize(100,150)\n self.assertEqual(self.widget.size(), (100,150),\n 'wrong size after resize')\n
If the setUp() method raises an exception while the test is\nrunning, the framework will consider the test to have suffered an error, and the\nrunTest() method will not be executed.
\nSimilarly, we can provide a tearDown() method that tidies up\nafter the runTest() method has been run:
\nimport unittest\n\nclass SimpleWidgetTestCase(unittest.TestCase):\n def setUp(self):\n self.widget = Widget('The widget')\n\n def tearDown(self):\n self.widget.dispose()\n self.widget = None\n
If setUp() succeeded, the tearDown() method will\nbe run whether runTest() succeeded or not.
\nSuch a working environment for the testing code is called a fixture.
\nOften, many small test cases will use the same fixture. In this case, we would\nend up subclassing SimpleWidgetTestCase into many small one-method\nclasses such as DefaultWidgetSizeTestCase. This is time-consuming and\ndiscouraging, so in the same vein as JUnit, unittest provides a simpler\nmechanism:
\nimport unittest\n\nclass WidgetTestCase(unittest.TestCase):\n def setUp(self):\n self.widget = Widget('The widget')\n\n def tearDown(self):\n self.widget.dispose()\n self.widget = None\n\n def test_default_size(self):\n self.assertEqual(self.widget.size(), (50,50),\n 'incorrect default size')\n\n def test_resize(self):\n self.widget.resize(100,150)\n self.assertEqual(self.widget.size(), (100,150),\n 'wrong size after resize')\n
Here we have not provided a runTest() method, but have instead\nprovided two different test methods. Class instances will now each run one of\nthe test_*() methods, with self.widget created and destroyed\nseparately for each instance. When creating an instance we must specify the\ntest method it is to run. We do this by passing the method name in the\nconstructor:
\ndefaultSizeTestCase = WidgetTestCase('test_default_size')\nresizeTestCase = WidgetTestCase('test_resize')\n
Test case instances are grouped together according to the features they test.\nunittest provides a mechanism for this: the test suite,\nrepresented by unittest‘s TestSuite class:
\nwidgetTestSuite = unittest.TestSuite()\nwidgetTestSuite.addTest(WidgetTestCase('test_default_size'))\nwidgetTestSuite.addTest(WidgetTestCase('test_resize'))\n
For the ease of running tests, as we will see later, it is a good idea to\nprovide in each test module a callable object that returns a pre-built test\nsuite:
\ndef suite():\n suite = unittest.TestSuite()\n suite.addTest(WidgetTestCase('test_default_size'))\n suite.addTest(WidgetTestCase('test_resize'))\n return suite\n
or even:
\ndef suite():\n tests = ['test_default_size', 'test_resize']\n\n return unittest.TestSuite(map(WidgetTestCase, tests))\n
Since it is a common pattern to create a TestCase subclass with many\nsimilarly named test functions, unittest provides a TestLoader\nclass that can be used to automate the process of creating a test suite and\npopulating it with individual tests. For example,
\nsuite = unittest.TestLoader().loadTestsFromTestCase(WidgetTestCase)\n
will create a test suite that will run WidgetTestCase.test_default_size() and\nWidgetTestCase.test_resize(). TestLoader uses the 'test' method\nname prefix to identify test methods automatically.
\nNote that the order in which the various test cases will be run is\ndetermined by sorting the test function names with respect to the\nbuilt-in ordering for strings.
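For example, the loader reports test names in string sort order regardless of the order in which the methods were defined:

```python
import unittest

class OrderTest(unittest.TestCase):
    def test_beta(self):
        pass

    def test_alpha(self):
        pass

loader = unittest.TestLoader()
# Names come back sorted with the built-in string ordering,
# not in definition order:
names = loader.getTestCaseNames(OrderTest)
# names == ['test_alpha', 'test_beta']
```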
\nOften it is desirable to group suites of test cases together, so as to run tests\nfor the whole system at once. This is easy, since TestSuite instances\ncan be added to a TestSuite just as TestCase instances can be\nadded to a TestSuite:
\nsuite1 = module1.TheTestSuite()\nsuite2 = module2.TheTestSuite()\nalltests = unittest.TestSuite([suite1, suite2])\n
You can place the definitions of test cases and test suites in the same modules\nas the code they are to test (such as widget.py), but there are several\nadvantages to placing the test code in a separate module, such as\ntest_widget.py:
\nSome users will find that they have existing test code that they would like to\nrun from unittest, without converting every old test function to a\nTestCase subclass.
\nFor this reason, unittest provides a FunctionTestCase class.\nThis subclass of TestCase can be used to wrap an existing test\nfunction. Set-up and tear-down functions can also be provided.
\nGiven the following test function:
\ndef testSomething():\n something = makeSomething()\n assert something.name is not None\n # ...\n
one can create an equivalent test case instance as follows:
\ntestcase = unittest.FunctionTestCase(testSomething)\n
If there are additional set-up and tear-down methods that should be called as\npart of the test case’s operation, they can also be provided like so:
\ntestcase = unittest.FunctionTestCase(testSomething,\n setUp=makeSomethingDB,\n tearDown=deleteSomethingDB)\n
To make migrating existing test suites easier, unittest supports tests\nraising AssertionError to indicate test failure. However, it is\nrecommended that you use the explicit TestCase.fail*() and\nTestCase.assert*() methods instead, as future versions of unittest\nmay treat AssertionError differently.
\nNote
\nEven though FunctionTestCase can be used to quickly convert an\nexisting test base over to a unittest-based system, this approach is\nnot recommended. Taking the time to set up proper TestCase\nsubclasses will make future test refactorings infinitely easier.
\nIn some cases, the existing tests may have been written using the doctest\nmodule. If so, doctest provides a DocTestSuite class that can\nautomatically build unittest.TestSuite instances from the existing\ndoctest-based tests.
\n\nNew in version 2.7.
\nUnittest supports skipping individual test methods and even whole classes of\ntests. In addition, it supports marking a test as an “expected failure,” a test\nthat is broken and will fail, but shouldn’t be counted as a failure on a\nTestResult.
\nSkipping a test is simply a matter of using the skip() decorator\nor one of its conditional variants.
\nBasic skipping looks like this:
\nclass MyTestCase(unittest.TestCase):\n\n @unittest.skip("demonstrating skipping")\n def test_nothing(self):\n self.fail("shouldn't happen")\n\n @unittest.skipIf(mylib.__version__ < (1, 3),\n "not supported in this library version")\n def test_format(self):\n # Tests that work for only a certain version of the library.\n pass\n\n @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")\n def test_windows_support(self):\n # windows specific testing code\n pass\n
This is the output of running the example above in verbose mode:
\ntest_format (__main__.MyTestCase) ... skipped 'not supported in this library version'\ntest_nothing (__main__.MyTestCase) ... skipped 'demonstrating skipping'\ntest_windows_support (__main__.MyTestCase) ... skipped 'requires Windows'\n\n----------------------------------------------------------------------\nRan 3 tests in 0.005s\n\nOK (skipped=3)
\nClasses can be skipped just like methods:
\n@skip("showing class skipping")\nclass MySkippedTestCase(unittest.TestCase):\n def test_not_run(self):\n pass\n
TestCase.setUp() can also skip the test. This is useful when a resource\nthat needs to be set up is not available.
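For example, a setUp() that skips when a resource is missing (the availability flag here is a hypothetical stand-in for a real check):

```python
import unittest

class ResourceTest(unittest.TestCase):
    def setUp(self):
        # resource_available stands in for a real availability check
        resource_available = False
        if not resource_available:
            self.skipTest('external resource not available')

    def test_uses_resource(self):
        self.fail('never reached: setUp skipped the test')

result = unittest.TestResult()
ResourceTest('test_uses_resource').run(result)
# The test is recorded as skipped rather than failed:
# len(result.skipped) == 1, result.failures == []
```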
\nExpected failures use the expectedFailure() decorator.
\nclass ExpectedFailureTestCase(unittest.TestCase):\n @unittest.expectedFailure\n def test_fail(self):\n self.assertEqual(1, 0, "broken")\n
It’s easy to roll your own skipping decorators by making a decorator that calls\nskip() on the test when it wants it to be skipped. This decorator skips\nthe test unless the passed object has a certain attribute:
\ndef skipUnlessHasattr(obj, attr):\n if hasattr(obj, attr):\n return lambda func: func\n return unittest.skip("{0!r} doesn't have {1!r}".format(obj, attr))\n
The following decorators implement test skipping and expected failures:
\nSkipped tests will not have setUp() or tearDown() run around them.\nSkipped classes will not have setUpClass() or tearDownClass() run.
\nThis section describes in depth the API of unittest.
\nInstances of the TestCase class represent the smallest testable units\nin the unittest universe. This class is intended to be used as a base\nclass, with specific tests being implemented by concrete subclasses. This class\nimplements the interface needed by the test runner to allow it to drive the\ntest, and methods that the test code can use to check for and report various\nkinds of failure.
\nEach instance of TestCase will run a single test method: the method\nnamed methodName. If you remember, we had an earlier example that went\nsomething like this:
\ndef suite():\n suite = unittest.TestSuite()\n suite.addTest(WidgetTestCase('test_default_size'))\n suite.addTest(WidgetTestCase('test_resize'))\n return suite\n
Here, we create two instances of WidgetTestCase, each of which runs a\nsingle test.
\nmethodName defaults to runTest().
\nTestCase instances provide three groups of methods: one group used\nto run the test, another used by the test implementation to check conditions\nand report failures, and some inquiry methods allowing information about the\ntest itself to be gathered.
\nMethods in the first group (running the test) are:
\nA class method called before tests in an individual class run.\nsetUpClass is called with the class as the only argument\nand must be decorated as a classmethod():
\n@classmethod\ndef setUpClass(cls):\n ...\n
See Class and Module Fixtures for more details.
\n\nNew in version 2.7.
\nA class method called after tests in an individual class have run.\ntearDownClass is called with the class as the only argument\nand must be decorated as a classmethod():
\n@classmethod\ndef tearDownClass(cls):\n ...\n
See Class and Module Fixtures for more details.
\n\nNew in version 2.7.
\nRun the test, collecting the result into the test result object passed as\nresult. If result is omitted or None, a temporary result\nobject is created (by calling the defaultTestResult() method) and\nused. The result object is not returned to run()‘s caller.
\nThe same effect may be had by simply calling the TestCase\ninstance.
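For example, these two invocations are equivalent (DoubleTest is a made-up case):

```python
import unittest

class DoubleTest(unittest.TestCase):
    def test_double(self):
        self.assertEqual(2 * 21, 42)

result = unittest.TestResult()
DoubleTest('test_double').run(result)   # explicit run()
DoubleTest('test_double')(result)       # calling the instance delegates to run()
# result.testsRun == 2
```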
\nCalling this during a test method or setUp() skips the current\ntest. See Skipping tests and expected failures for more information.
\n\nNew in version 2.7.
\nThe TestCase class provides a number of methods to check for and\nreport failures, such as:
\nMethod | \nChecks that | \nNew in | \n
---|---|---|
assertEqual(a, b) | \na == b | \n\n |
assertNotEqual(a, b) | \na != b | \n\n |
assertTrue(x) | \nbool(x) is True | \n\n |
assertFalse(x) | \nbool(x) is False | \n\n |
assertIs(a, b) | \na is b | \n2.7 | \n
assertIsNot(a, b) | \na is not b | \n2.7 | \n
assertIsNone(x) | \nx is None | \n2.7 | \n
assertIsNotNone(x) | \nx is not None | \n2.7 | \n
assertIn(a, b) | \na in b | \n2.7 | \n
assertNotIn(a, b) | \na not in b | \n2.7 | \n
assertIsInstance(a, b) | \nisinstance(a, b) | \n2.7 | \n
assertNotIsInstance(a, b) | \nnot isinstance(a, b) | \n2.7 | \n
All the assert methods (except assertRaises(),\nassertRaisesRegexp())\naccept a msg argument that, if specified, is used as the error message on\nfailure (see also longMessage).
\nTest that first and second are equal. If the values do not compare\nequal, the test will fail.
\nIn addition, if first and second are the exact same type and one of\nlist, tuple, dict, set, frozenset or unicode or any type that a subclass\nregisters with addTypeEqualityFunc() the type specific equality\nfunction will be called in order to generate a more useful default\nerror message (see also the list of type-specific methods).
\n\nChanged in version 2.7: Added the automatic calling of type specific equality function.
\nTest that expr is true (or false).
\nNote that this is equivalent to bool(expr) is True and not to expr\nis True (use assertIs(expr, True) for the latter). This method\nshould also be avoided when more specific methods are available (e.g.\nassertEqual(a, b) instead of assertTrue(a == b)), because they\nprovide a better error message in case of failure.
\nTest that first and second evaluate (or don’t evaluate) to the same object.
\n\nNew in version 2.7.
\nTest that expr is (or is not) None.
\n\nNew in version 2.7.
\nTest that first is (or is not) in second.
\n\nNew in version 2.7.
\nTest that obj is (or is not) an instance of cls (which can be a\nclass or a tuple of classes, as supported by isinstance()).\nTo check for the exact type, use assertIs(type(obj), cls).
\n\nNew in version 2.7.
\nIt is also possible to check that exceptions and warnings are raised using\nthe following methods:
\nMethod | \nChecks that | \nNew in | \n
---|---|---|
assertRaises(exc, fun, *args, **kwds) | \nfun(*args, **kwds) raises exc | \n\n |
assertRaisesRegexp(exc, re, fun, *args, **kwds) | \nfun(*args, **kwds) raises exc\nand the message matches re | \n2.7 | \n
Test that an exception is raised when callable is called with any\npositional or keyword arguments that are also passed to\nassertRaises(). The test passes if exception is raised, is an\nerror if another exception is raised, or fails if no exception is raised.\nTo catch any of a group of exceptions, a tuple containing the exception\nclasses may be passed as exception.
\nIf only the exception argument is given, returns a context manager so\nthat the code under test can be written inline rather than as a function:
\nwith self.assertRaises(SomeException):\n do_something()\n
The context manager will store the caught exception object in its\nexception attribute. This can be useful if the intention\nis to perform additional checks on the exception raised:
\nwith self.assertRaises(SomeException) as cm:\n do_something()\n\nthe_exception = cm.exception\nself.assertEqual(the_exception.error_code, 3)\n
\nChanged in version 2.7: Added the ability to use assertRaises() as a context manager.
\nLike assertRaises() but also tests that regexp matches\non the string representation of the raised exception. regexp may be\na regular expression object or a string containing a regular expression\nsuitable for use by re.search(). Examples:
\nself.assertRaisesRegexp(ValueError, 'invalid literal for.*XYZ$',\n int, 'XYZ')\n
or:
\nwith self.assertRaisesRegexp(ValueError, 'literal'):\n int('XYZ')\n
\nNew in version 2.7.
\nThere are also other methods used to perform more specific checks, such as:
\nMethod | \nChecks that | \nNew in | \n
---|---|---|
assertAlmostEqual(a, b) | \nround(a-b, 7) == 0 | \n\n |
assertNotAlmostEqual(a, b) | \nround(a-b, 7) != 0 | \n\n |
assertGreater(a, b) | \na > b | \n2.7 | \n
assertGreaterEqual(a, b) | \na >= b | \n2.7 | \n
assertLess(a, b) | \na < b | \n2.7 | \n
assertLessEqual(a, b) | \na <= b | \n2.7 | \n
assertRegexpMatches(s, re) | \nregex.search(s) | \n2.7 | \n
assertNotRegexpMatches(s, re) | \nnot regex.search(s) | \n2.7 | \n
assertItemsEqual(a, b) | \nsorted(a) == sorted(b) and\nworks with unhashable objs | \n2.7 | \n
assertDictContainsSubset(a, b) | \nall the key/value pairs\nin a exist in b | \n2.7 | \n
Test that first and second are approximately (or not approximately)\nequal by computing the difference, rounding to the given number of\ndecimal places (default 7), and comparing to zero. Note that these\nmethods round the values to the given number of decimal places (i.e.\nlike the round() function) and not significant digits.
\nIf delta is supplied instead of places then the difference\nbetween first and second must be less (or more) than delta.
\nSupplying both delta and places raises a TypeError.
\n\nChanged in version 2.7: assertAlmostEqual() automatically considers almost equal objects\nthat compare equal. assertNotAlmostEqual() automatically fails\nif the objects compare equal. Added the delta keyword argument.
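A small sketch contrasting places (the default) with the delta keyword; the numeric values are chosen only so that both assertions pass:

```python
import unittest

class ApproxTest(unittest.TestCase):
    def test_places(self):
        # difference is about 1e-9; round(1e-9, 7) == 0, so this passes
        self.assertAlmostEqual(1.000000001, 1.0)

    def test_delta(self):
        # with delta, the absolute difference (0.04) must be below 0.05
        self.assertAlmostEqual(1.0, 1.04, delta=0.05)

result = unittest.TestResult()
ApproxTest('test_places').run(result)
ApproxTest('test_delta').run(result)
# Both comparisons succeed, so result.wasSuccessful() is True
```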
\nTest that first is respectively >, >=, < or <= than second depending\non the method name. If not, the test will fail:
\n>>> self.assertGreaterEqual(3, 4)\nAssertionError: "3" unexpectedly not greater than or equal to "4"\n
\nNew in version 2.7.
\nTest that a regexp search matches text. In case\nof failure, the error message will include the pattern and the text (or\nthe pattern and the part of text that unexpectedly matched). regexp\nmay be a regular expression object or a string containing a regular\nexpression suitable for use by re.search().
\n\nNew in version 2.7.
\nVerifies that a regexp search does not match text. Fails with an error\nmessage including the pattern and the part of text that matches. regexp\nmay be a regular expression object or a string containing a regular\nexpression suitable for use by re.search().
\n\nNew in version 2.7.
\nTest that sequence expected contains the same elements as actual,\nregardless of their order. When they don’t, an error message listing the\ndifferences between the sequences will be generated.
\nDuplicate elements are not ignored when comparing actual and\nexpected. It verifies if each element has the same count in both\nsequences. It is the equivalent of assertEqual(sorted(expected),\nsorted(actual)) but it works with sequences of unhashable objects as\nwell.
\n\nNew in version 2.7.
\nTests whether the key/value pairs in dictionary actual are a\nsuperset of those in expected. If not, an error message listing\nthe missing keys and mismatched values is generated.
\n\nNew in version 2.7.
\n\nDeprecated since version 3.2.
\nThe assertEqual() method dispatches the equality check for objects of\nthe same type to different type-specific methods. These methods are already\nimplemented for most of the built-in types, but it’s also possible to\nregister new methods using addTypeEqualityFunc():
\nRegisters a type-specific method called by assertEqual() to check\nif two objects of exactly the same typeobj (not subclasses) compare\nequal. function must take two positional arguments and a third msg=None\nkeyword argument just as assertEqual() does. It must raise\nself.failureException(msg) when inequality\nbetween the first two parameters is detected – possibly providing useful\ninformation and explaining the inequalities in detail in the error\nmessage.
\n\nNew in version 2.7.
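A sketch of registering such a function for a hypothetical Point class; after registration, assertEqual() dispatches to the custom comparer whenever both arguments are exactly of type Point:

```python
import unittest

class Point(object):
    """Illustrative class; not part of unittest."""
    def __init__(self, x, y):
        self.x, self.y = x, y

class PointTest(unittest.TestCase):
    def setUp(self):
        # assertEqual() will now call assertPointEqual whenever both
        # arguments are exactly of type Point.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if (first.x, first.y) != (second.x, second.y):
            raise self.failureException(msg or 'Points differ')

    def test_equal_points(self):
        self.assertEqual(Point(1, 2), Point(1, 2))

result = unittest.TestResult()
PointTest('test_equal_points').run(result)
# The custom comparer found the points equal, so the test passes.
```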
\nThe list of type-specific methods automatically used by\nassertEqual() are summarized in the following table. Note\nthat it’s usually not necessary to invoke these methods directly.
\nMethod | \nUsed to compare | \nNew in | \n
---|---|---|
assertMultiLineEqual(a, b) | \nstrings | \n2.7 | \n
assertSequenceEqual(a, b) | \nsequences | \n2.7 | \n
assertListEqual(a, b) | \nlists | \n2.7 | \n
assertTupleEqual(a, b) | \ntuples | \n2.7 | \n
assertSetEqual(a, b) | \nsets or frozensets | \n2.7 | \n
assertDictEqual(a, b) | \ndicts | \n2.7 | \n
Test that the multiline string first is equal to the string second.\nWhen not equal a diff of the two strings highlighting the differences\nwill be included in the error message. This method is used by default\nwhen comparing strings with assertEqual().
\n\nNew in version 2.7.
\nTests that two sequences are equal. If a seq_type is supplied, both\nseq1 and seq2 must be instances of seq_type or a failure will\nbe raised. If the sequences are different an error message is\nconstructed that shows the difference between the two.
\nThis method is not called directly by assertEqual(), but\nit’s used to implement assertListEqual() and\nassertTupleEqual().
\n\nNew in version 2.7.
\nTests that two lists or tuples are equal. If not an error message is\nconstructed that shows only the differences between the two. An error\nis also raised if either of the parameters are of the wrong type.\nThese methods are used by default when comparing lists or tuples with\nassertEqual().
\n\nNew in version 2.7.
\nTests that two sets are equal. If not, an error message is constructed\nthat lists the differences between the sets. This method is used by\ndefault when comparing sets or frozensets with assertEqual().
\nFails if either of set1 or set2 does not have a set.difference()\nmethod.
\n\nNew in version 2.7.
\nTest that two dictionaries are equal. If not, an error message is\nconstructed that shows the differences in the dictionaries. This\nmethod will be used by default to compare dictionaries in\ncalls to assertEqual().
\n\nNew in version 2.7.
\nFinally the TestCase provides the following methods and attributes:
\nIf set to True then any explicit failure message you pass in to the\nassert methods will be appended to the end of the\nnormal failure message. The normal messages contain useful information\nabout the objects involved, for example the message from assertEqual\nshows you the repr of the two unequal objects. Setting this attribute\nto True allows you to have a custom error message in addition to the\nnormal one.
\nThis attribute defaults to False, meaning that a custom message passed\nto an assert method will silence the normal message.
\nThe class setting can be overridden in individual tests by assigning an\ninstance attribute to True or False before calling the assert methods.
\n\nNew in version 2.7.
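The effect can be sketched by catching the AssertionError from a failing assertion directly (the Demo class and its placeholder method are hypothetical names for this example):

```python
import unittest

class Demo(unittest.TestCase):
    # With longMessage set, the custom message is appended to the
    # standard one instead of replacing it.
    longMessage = True

    def test_placeholder(self):
        pass

case = Demo('test_placeholder')
try:
    case.assertEqual(1, 2, 'custom context')
except AssertionError as exc:
    message = str(exc)

# message now contains both the standard "1 != 2" text and the custom part
print(message)
```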
\nThis attribute controls the maximum length of diffs output by assert\nmethods that report diffs on failure. It defaults to 80*8 characters.\nAssert methods affected by this attribute are\nassertSequenceEqual() (including all the sequence comparison\nmethods that delegate to it), assertDictEqual() and\nassertMultiLineEqual().
\nSetting maxDiff to None means that there is no maximum length of\ndiffs.
\n\nNew in version 2.7.
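A rough illustration of the truncation behavior (the DiffDemo class and the 64-character limit are made up for the example):

```python
import unittest

class DiffDemo(unittest.TestCase):
    # Truncate any diff longer than 64 characters.
    maxDiff = 64

    def test_placeholder(self):
        pass

case = DiffDemo('test_placeholder')
try:
    case.assertEqual(['a'] * 50, ['b'] * 50)
except AssertionError as exc:
    message = str(exc)

# The oversized diff is replaced by a hint mentioning maxDiff.
print(message)
```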
\nTesting frameworks can use the following methods to collect information on\nthe test:
\nReturn an instance of the test result class that should be used for this\ntest case class (if no other result instance is provided to the\nrun() method).
\nFor TestCase instances, this will always be an instance of\nTestResult; subclasses of TestCase should override this\nas necessary.
\nAdd a function to be called after tearDown() to cleanup resources\nused during the test. Functions will be called in reverse order to the\norder they are added (LIFO). They are called with any arguments and\nkeyword arguments passed into addCleanup() when they are\nadded.
\nIf setUp() fails, meaning that tearDown() is not called,\nthen any cleanup functions added will still be called.
\n\nNew in version 2.7.
\nThis method is called unconditionally after tearDown(), or\nafter setUp() if setUp() raises an exception.
\nIt is responsible for calling all the cleanup functions added by\naddCleanup(). If you need cleanup functions to be called\nprior to tearDown() then you can call doCleanups()\nyourself.
\ndoCleanups() pops methods off the stack of cleanup\nfunctions one at a time, so it can be called at any time.
\n\nNew in version 2.7.
\nFor historical reasons, some of the TestCase methods had one or more\naliases that are now deprecated. The following table lists the correct names\nalong with their deprecated aliases:
Method Name                  Deprecated alias(es)
-------------------------    ---------------------------------
assertEqual()                failUnlessEqual, assertEquals
assertNotEqual()             failIfEqual
assertTrue()                 failUnless, assert_
assertFalse()                failIf
assertRaises()               failUnlessRaises
assertAlmostEqual()          failUnlessAlmostEqual
assertNotAlmostEqual()       failIfAlmostEqual

Deprecated since version 2.7: the aliases listed in the second column.
This class represents an aggregation of individual test cases and test suites. The class presents the interface needed by the test runner to allow it to be run as any other test case. Running a TestSuite instance is the same as iterating over the suite, running each test individually.
\nIf tests is given, it must be an iterable of individual test cases or other\ntest suites that will be used to build the suite initially. Additional methods\nare provided to add test cases and suites to the collection later on.
\nTestSuite objects behave much like TestCase objects, except\nthey do not actually implement a test. Instead, they are used to aggregate\ntests into groups of tests that should be run together. Some additional\nmethods are available to add tests to TestSuite instances:
\n\n\nAdd all the tests from an iterable of TestCase and TestSuite\ninstances to this test suite.
\nThis is equivalent to iterating over tests, calling addTest() for\neach element.
\nTestSuite shares the following methods with TestCase:
Tests grouped by a TestSuite are always accessed by iteration. Subclasses can lazily provide tests by overriding __iter__(). Note that this method may be called several times on a single suite (for example when counting tests or comparing for equality), so the tests returned must be the same for repeated iterations.
\n\nChanged in version 2.7: In earlier versions the TestSuite accessed tests directly rather\nthan through iteration, so overriding __iter__() wasn’t sufficient\nfor providing tests.
\nIn the typical usage of a TestSuite object, the run() method\nis invoked by a TestRunner rather than by the end-user test harness.
\nThe TestLoader class is used to create test suites from classes and\nmodules. Normally, there is no need to create an instance of this class; the\nunittest module provides an instance that can be shared as\nunittest.defaultTestLoader. Using a subclass or instance, however, allows\ncustomization of some configurable properties.
\nTestLoader objects have the following methods:
Return a suite of all test cases contained in the given module. This method searches module for classes derived from TestCase and creates an instance of the class for each test method defined for the class.
\nNote
\nWhile using a hierarchy of TestCase-derived classes can be\nconvenient in sharing fixtures and helper functions, defining test\nmethods on base classes that are not intended to be instantiated\ndirectly does not play well with this method. Doing so, however, can\nbe useful when the fixtures are different and defined in subclasses.
\nIf a module provides a load_tests function it will be called to\nload the tests. This allows modules to customize test loading.\nThis is the load_tests protocol.
\n\nChanged in version 2.7: Support for load_tests added.
Return a suite of all test cases given a string specifier.
\nThe specifier name is a “dotted name” that may resolve either to a\nmodule, a test case class, a test method within a test case class, a\nTestSuite instance, or a callable object which returns a\nTestCase or TestSuite instance. These checks are\napplied in the order listed here; that is, a method on a possible test\ncase class will be picked up as “a test method within a test case class”,\nrather than “a callable object”.
\nFor example, if you have a module SampleTests containing a\nTestCase-derived class SampleTestCase with three test\nmethods (test_one(), test_two(), and test_three()), the\nspecifier 'SampleTests.SampleTestCase' would cause this method to\nreturn a suite which will run all three test methods. Using the specifier\n'SampleTests.SampleTestCase.test_two' would cause it to return a test\nsuite which will run only the test_two() test method. The specifier\ncan refer to modules and packages which have not been imported; they will\nbe imported as a side-effect.
\nThe method optionally resolves name relative to the given module.
Find and return all test modules from the specified start directory, recursing into subdirectories to find them. Only test files that match pattern will be loaded (using shell-style pattern matching). Only module names that are importable (i.e. are valid Python identifiers) will be loaded.
\nAll test modules must be importable from the top level of the project. If\nthe start directory is not the top level directory then the top level\ndirectory must be specified separately.
\nIf importing a module fails, for example due to a syntax error, then this\nwill be recorded as a single error and discovery will continue.
\nIf a test package name (directory with __init__.py) matches the\npattern then the package will be checked for a load_tests\nfunction. If this exists then it will be called with loader, tests,\npattern.
\nIf load_tests exists then discovery does not recurse into the package,\nload_tests is responsible for loading all tests in the package.
\nThe pattern is deliberately not stored as a loader attribute so that\npackages can continue discovery themselves. top_level_dir is stored so\nload_tests does not need to pass this argument in to\nloader.discover().
\nstart_dir can be a dotted module name as well as a directory.
\n\nNew in version 2.7.
\nThe following attributes of a TestLoader can be configured either by\nsubclassing or assignment on an instance:
\nString giving the prefix of method names which will be interpreted as test\nmethods. The default value is 'test'.
\nThis affects getTestCaseNames() and all the loadTestsFrom*()\nmethods.
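For instance, a loader configured with a hypothetical 'check' prefix picks up only the matching methods:

```python
import unittest

class ChecksCase(unittest.TestCase):
    def check_addition(self):      # found with prefix 'check'
        self.assertEqual(1 + 1, 2)

    def test_default(self):        # found only with the default prefix
        pass

loader = unittest.TestLoader()
loader.testMethodPrefix = 'check'
print(loader.getTestCaseNames(ChecksCase))
```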
\nThis class is used to compile information about which tests have succeeded\nand which have failed.
\nA TestResult object stores the results of a set of tests. The\nTestCase and TestSuite classes ensure that results are\nproperly recorded; test authors do not need to worry about recording the\noutcome of tests.
\nTesting frameworks built on top of unittest may want access to the\nTestResult object generated by running a set of tests for reporting\npurposes; a TestResult instance is returned by the\nTestRunner.run() method for this purpose.
\nTestResult instances have the following attributes that will be of\ninterest when inspecting the results of running a set of tests:
\nA list containing 2-tuples of TestCase instances and strings\nholding formatted tracebacks. Each tuple represents a test which raised an\nunexpected exception.
\n\nChanged in version 2.2: Contains formatted tracebacks instead of sys.exc_info() results.
\nA list containing 2-tuples of TestCase instances and strings\nholding formatted tracebacks. Each tuple represents a test where a failure\nwas explicitly signalled using the TestCase.fail*() or\nTestCase.assert*() methods.
\n\nChanged in version 2.2: Contains formatted tracebacks instead of sys.exc_info() results.
\nA list containing 2-tuples of TestCase instances and strings\nholding the reason for skipping the test.
\n\nNew in version 2.7.
\nIf set to true, sys.stdout and sys.stderr will be buffered in between\nstartTest() and stopTest() being called. Collected output will\nonly be echoed onto the real sys.stdout and sys.stderr if the test\nfails or errors. Any output is also attached to the failure / error message.
\n\nNew in version 2.7.
If set to true, stop() will be called on the first failure or error, halting the test run.
\n\nNew in version 2.7.
\nThis method can be called to signal that the set of tests being run should\nbe aborted by setting the shouldStop attribute to True.\nTestRunner objects should respect this flag and return without\nrunning any additional tests.
\nFor example, this feature is used by the TextTestRunner class to\nstop the test framework when the user signals an interrupt from the\nkeyboard. Interactive tools which provide TestRunner\nimplementations can use this in a similar manner.
\nThe following methods of the TestResult class are used to maintain\nthe internal data structures, and may be extended in subclasses to support\nadditional reporting requirements. This is particularly useful in building\ntools which support interactive reporting while tests are being run.
\nCalled once before any tests are executed.
\n\nNew in version 2.7.
\nCalled once after all tests are executed.
\n\nNew in version 2.7.
Called when the test case test raises an unexpected exception. err is a tuple of the form returned by sys.exc_info(): (type, value, traceback).
\nThe default implementation appends a tuple (test, formatted_err) to\nthe instance’s errors attribute, where formatted_err is a\nformatted traceback derived from err.
\nCalled when the test case test signals a failure. err is a tuple of\nthe form returned by sys.exc_info(): (type, value, traceback).
\nThe default implementation appends a tuple (test, formatted_err) to\nthe instance’s failures attribute, where formatted_err is a\nformatted traceback derived from err.
\nCalled when the test case test succeeds.
\nThe default implementation does nothing.
\nCalled when the test case test is skipped. reason is the reason the\ntest gave for skipping.
\nThe default implementation appends a tuple (test, reason) to the\ninstance’s skipped attribute.
\nCalled when the test case test fails, but was marked with the\nexpectedFailure() decorator.
\nThe default implementation appends a tuple (test, formatted_err) to\nthe instance’s expectedFailures attribute, where formatted_err\nis a formatted traceback derived from err.
\nCalled when the test case test was marked with the\nexpectedFailure() decorator, but succeeded.
\nThe default implementation appends the test to the instance’s\nunexpectedSuccesses attribute.
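A minimal sketch of extending one of these hooks (CountingResult and PassingCase are hypothetical names):

```python
import unittest

class CountingResult(unittest.TestResult):
    # Extend addSuccess() to record the ids of passing tests, while
    # still calling the base implementation.
    def __init__(self):
        unittest.TestResult.__init__(self)
        self.succeeded = []

    def addSuccess(self, test):
        unittest.TestResult.addSuccess(self, test)
        self.succeeded.append(test.id())

class PassingCase(unittest.TestCase):
    def test_ok(self):
        self.assertTrue(True)

result = CountingResult()
PassingCase('test_ok').run(result)
print(result.succeeded)
```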
\nA concrete implementation of TestResult used by the\nTextTestRunner.
\n\nNew in version 2.7: This class was previously named _TextTestResult. The old name still\nexists as an alias but is deprecated.
\nA basic test runner implementation which prints results on standard error. It\nhas a few configurable parameters, but is essentially very simple. Graphical\napplications which run test suites should provide alternate implementations.
\nThis method returns the instance of TestResult used by run().\nIt is not intended to be called directly, but can be overridden in\nsubclasses to provide a custom TestResult.
\n_makeResult() instantiates the class or callable passed in the\nTextTestRunner constructor as the resultclass argument. It\ndefaults to TextTestResult if no resultclass is provided.\nThe result class is instantiated with the following arguments:
\nstream, descriptions, verbosity\n
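A sketch of supplying a custom result class (QuietResult is a hypothetical do-nothing subclass):

```python
import io
import unittest

class QuietResult(unittest.TextTestResult):
    # Placeholder subclass; a real one would override reporting hooks.
    pass

class SimpleCase(unittest.TestCase):
    def test_ok(self):
        pass

runner = unittest.TextTestRunner(stream=io.StringIO(),
                                 resultclass=QuietResult)
result = runner.run(unittest.TestLoader().loadTestsFromTestCase(SimpleCase))

# run() returns an instance of the supplied result class.
print(type(result).__name__)
```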
A command-line program that runs a set of tests; this is primarily for making\ntest modules conveniently executable. The simplest use for this function is to\ninclude the following line at the end of a test script:
\nif __name__ == '__main__':\n unittest.main()\n
You can run tests with more detailed information by passing in the verbosity\nargument:
\nif __name__ == '__main__':\n unittest.main(verbosity=2)\n
The testRunner argument can either be a test runner class or an already\ncreated instance of it. By default main calls sys.exit() with\nan exit code indicating success or failure of the tests run.
\nmain supports being used from the interactive interpreter by passing in the\nargument exit=False. This displays the result on standard output without\ncalling sys.exit():
\n>>> from unittest import main\n>>> main(module='test_module', exit=False)\n
The failfast, catchbreak and buffer parameters have the same\neffect as the same-name command-line options.
\nCalling main actually returns an instance of the TestProgram class.\nThis stores the result of the tests run as the result attribute.
\n\nChanged in version 2.7: The exit, verbosity, failfast, catchbreak and buffer\nparameters were added.
\n\nNew in version 2.7.
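A sketch of the exit=False usage against a synthetic module object (the names are invented; a real call would normally pass a module name string):

```python
import io
import types
import unittest

class MainDemo(unittest.TestCase):
    def test_ok(self):
        self.assertTrue(True)

# Stand-in for a real test module.
fake_module = types.ModuleType('fake_test_module')
fake_module.MainDemo = MainDemo

# exit=False makes main() return the TestProgram instead of calling
# sys.exit(); argv is pinned so the interpreter's own arguments are ignored.
program = unittest.main(module=fake_module, exit=False, argv=['example'],
                        testRunner=unittest.TextTestRunner(stream=io.StringIO()))
print(program.result.wasSuccessful())
```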
\nModules or packages can customize how tests are loaded from them during normal\ntest runs or test discovery by implementing a function called load_tests.
\nIf a test module defines load_tests it will be called by\nTestLoader.loadTestsFromModule() with the following arguments:
\nload_tests(loader, standard_tests, None)\n
It should return a TestSuite.
\nloader is the instance of TestLoader doing the loading.\nstandard_tests are the tests that would be loaded by default from the\nmodule. It is common for test modules to only want to add or remove tests\nfrom the standard set of tests.\nThe third argument is used when loading packages as part of test discovery.
\nA typical load_tests function that loads tests from a specific set of\nTestCase classes may look like:
\ntest_cases = (TestCase1, TestCase2, TestCase3)\n\ndef load_tests(loader, tests, pattern):\n suite = TestSuite()\n for test_class in test_cases:\n tests = loader.loadTestsFromTestCase(test_class)\n suite.addTests(tests)\n return suite\n
If discovery is started, either from the command line or by calling\nTestLoader.discover(), with a pattern that matches a package\nname then the package __init__.py will be checked for load_tests.
\nNote
\nThe default pattern is ‘test*.py’. This matches all Python files\nthat start with ‘test’ but won’t match any test directories.
\nA pattern like ‘test*’ will match test packages as well as\nmodules.
\nIf the package __init__.py defines load_tests then it will be\ncalled and discovery not continued into the package. load_tests\nis called with the following arguments:
\nload_tests(loader, standard_tests, pattern)\n
This should return a TestSuite representing all the tests\nfrom the package. (standard_tests will only contain tests\ncollected from __init__.py.)
\nBecause the pattern is passed into load_tests the package is free to\ncontinue (and potentially modify) test discovery. A ‘do nothing’\nload_tests function for a test package would look like:
\ndef load_tests(loader, standard_tests, pattern):\n # top level directory cached on loader instance\n this_dir = os.path.dirname(__file__)\n package_tests = loader.discover(start_dir=this_dir, pattern=pattern)\n standard_tests.addTests(package_tests)\n return standard_tests\n
Class and module level fixtures are implemented in TestSuite. When\nthe test suite encounters a test from a new class then tearDownClass()\nfrom the previous class (if there is one) is called, followed by\nsetUpClass() from the new class.
\nSimilarly if a test is from a different module from the previous test then\ntearDownModule from the previous module is run, followed by\nsetUpModule from the new module.
\nAfter all the tests have run the final tearDownClass and\ntearDownModule are run.
\nNote that shared fixtures do not play well with [potential] features like test\nparallelization and they break test isolation. They should be used with care.
\nThe default ordering of tests created by the unittest test loaders is to group\nall tests from the same modules and classes together. This will lead to\nsetUpClass / setUpModule (etc) being called exactly once per class and\nmodule. If you randomize the order, so that tests from different modules and\nclasses are adjacent to each other, then these shared fixture functions may be\ncalled multiple times in a single test run.
\nShared fixtures are not intended to work with suites with non-standard\nordering. A BaseTestSuite still exists for frameworks that don’t want to\nsupport shared fixtures.
\nIf there are any exceptions raised during one of the shared fixture functions\nthe test is reported as an error. Because there is no corresponding test\ninstance an _ErrorHolder object (that has the same interface as a\nTestCase) is created to represent the error. If you are just using\nthe standard unittest test runner then this detail doesn’t matter, but if you\nare a framework author it may be relevant.
\nThese must be implemented as class methods:
\nimport unittest\n\nclass Test(unittest.TestCase):\n @classmethod\n def setUpClass(cls):\n cls._connection = createExpensiveConnectionObject()\n\n @classmethod\n def tearDownClass(cls):\n cls._connection.destroy()\n
If you want the setUpClass and tearDownClass on base classes called\nthen you must call up to them yourself. The implementations in\nTestCase are empty.
\nIf an exception is raised during a setUpClass then the tests in the class\nare not run and the tearDownClass is not run. Skipped classes will not\nhave setUpClass or tearDownClass run. If the exception is a\nSkipTest exception then the class will be reported as having been skipped\ninstead of as an error.
\nThese should be implemented as functions:
\ndef setUpModule():\n createConnection()\n\ndef tearDownModule():\n closeConnection()\n
If an exception is raised in a setUpModule then none of the tests in the\nmodule will be run and the tearDownModule will not be run. If the exception is a\nSkipTest exception then the module will be reported as having been skipped\ninstead of as an error.
The -c/--catch command-line option to unittest, along with the catchbreak parameter to unittest.main(), provide more friendly handling of control-C during a test run. With catch break behavior enabled, control-C will allow the currently running test to complete, and the test run will then end and report all the results so far. A second control-C will raise a KeyboardInterrupt in the usual way.
\nThe control-c handling signal handler attempts to remain compatible with code or\ntests that install their own signal.SIGINT handler. If the unittest\nhandler is called but isn’t the installed signal.SIGINT handler,\ni.e. it has been replaced by the system under test and delegated to, then it\ncalls the default handler. This will normally be the expected behavior by code\nthat replaces an installed handler and delegates to it. For individual tests\nthat need unittest control-c handling disabled the removeHandler()\ndecorator can be used.
\nThere are a few utility functions for framework authors to enable control-c\nhandling functionality within test frameworks.
\nInstall the control-c handler. When a signal.SIGINT is received\n(usually in response to the user pressing control-c) all registered results\nhave stop() called.
\n\nNew in version 2.7.
\nRegister a TestResult object for control-c handling. Registering a\nresult stores a weak reference to it, so it doesn’t prevent the result from\nbeing garbage collected.
\nRegistering a TestResult object has no side-effects if control-c\nhandling is not enabled, so test frameworks can unconditionally register\nall results they create independently of whether or not handling is enabled.
\n\nNew in version 2.7.
\nRemove a registered result. Once a result has been removed then\nstop() will no longer be called on that result object in\nresponse to a control-c.
\n\nNew in version 2.7.
\nWhen called without arguments this function removes the control-c handler\nif it has been installed. This function can also be used as a test decorator\nto temporarily remove the handler whilst the test is being executed:
\n@unittest.removeHandler\ndef test_signal_handling(self):\n ...\n
\nNew in version 2.7.
\nSource code: Lib/bdb.py
\nThe bdb module handles basic debugger functions, like setting breakpoints\nor managing execution via the debugger.
\nThe following exception is defined:
\n\n\nThe bdb module also defines two classes:
\nThis class implements temporary breakpoints, ignore counts, disabling and\n(re-)enabling, and conditionals.
\nBreakpoints are indexed by number through a list called bpbynumber\nand by (file, line) pairs through bplist. The former points to a\nsingle instance of class Breakpoint. The latter points to a list of\nsuch instances since there may be more than one breakpoint per line.
\nWhen creating a breakpoint, its associated filename should be in canonical\nform. If a funcname is defined, a breakpoint hit will be counted when the\nfirst line of that function is executed. A conditional breakpoint always\ncounts a hit.
\nBreakpoint instances have the following methods:
\nPrint all the information about the breakpoint:
\nThe Bdb class acts as a generic Python debugger base class.
\nThis class takes care of the details of the trace facility; a derived class\nshould implement user interaction. The standard debugger class\n(pdb.Pdb) is an example.
\nThe skip argument, if given, must be an iterable of glob-style\nmodule name patterns. The debugger will not step into frames that\noriginate in a module that matches one of these patterns. Whether a\nframe is considered to originate in a certain module is determined\nby the __name__ in the frame globals.
\n\nNew in version 2.7: The skip argument.
\nThe following methods of Bdb normally don’t need to be overridden.
\nThis function is installed as the trace function of debugged frames. Its\nreturn value is the new trace function (in most cases, that is, itself).
The default implementation decides how to dispatch a frame, depending on the type of event (passed as a string) that is about to be executed. event can be one of the following: "line", "call", "return", "exception", "c_call", "c_return", or "c_exception".
\nFor the Python events, specialized functions (see below) are called. For\nthe C events, no action is taken.
\nThe arg parameter depends on the previous event.
\nSee the documentation for sys.settrace() for more information on the\ntrace function. For more information on code and frame objects, refer to\nThe standard type hierarchy.
\nNormally derived classes don’t override the following methods, but they may\nif they want to redefine the definition of stopping and breakpoints.
\nDerived classes should override these methods to gain control over debugger\noperation.
\nHandle how a breakpoint must be removed when it is a temporary one.
\nThis method must be implemented by derived classes.
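A minimal sketch of such a derived class (the Tracer name is invented) that records each line the debugger stops at while running a statement:

```python
import bdb

class Tracer(bdb.Bdb):
    # Override user_line() to record every line number at which the
    # debugger would stop, instead of prompting the user.
    def __init__(self):
        bdb.Bdb.__init__(self)
        self.lines = []

    def user_line(self, frame):
        self.lines.append(frame.f_lineno)

tracer = Tracer()
# run() compiles the string with filename "<string>" and traces it;
# user_line() fires once per executed line.
tracer.run("a = 1\nb = a + 1")
print(tracer.lines)
```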
\nDerived classes and clients can call the following methods to affect the\nstepping state.
\nDerived classes and clients can call the following methods to manipulate\nbreakpoints. These methods return a string containing an error message if\nsomething went wrong, or None if all is well.
\nDerived classes and clients can call the following methods to get a data\nstructure representing a stack trace.
\nReturn a string with information about a stack entry, identified by a\n(frame, lineno) tuple:
\nThe following two methods can be called by clients to use a debugger to debug\na statement, given as a string.
\nFinally, the module defines the following functions:
\nCheck whether we should break here, depending on the way the breakpoint b\nwas set.
If it was set via line number, it checks if b.line is the same as the one in the frame also passed as argument. If the breakpoint was set via function name, we have to check we are in the right frame (the right function) and whether we are at its first executable line.
\nThe module pdb defines an interactive source code debugger for Python\nprograms. It supports setting (conditional) breakpoints and single stepping at\nthe source line level, inspection of stack frames, source code listing, and\nevaluation of arbitrary Python code in the context of any stack frame. It also\nsupports post-mortem debugging and can be called under program control.
\nThe debugger is extensible — it is actually defined as the class Pdb.\nThis is currently undocumented but easily understood by reading the source. The\nextension interface uses the modules bdb and cmd.
\nThe debugger’s prompt is (Pdb). Typical usage to run a program under control\nof the debugger is:
\n>>> import pdb\n>>> import mymodule\n>>> pdb.run('mymodule.test()')\n> <string>(0)?()\n(Pdb) continue\n> <string>(1)?()\n(Pdb) continue\nNameError: 'spam'\n> <string>(1)?()\n(Pdb)\n
pdb.py can also be invoked as a script to debug other scripts. For\nexample:
\npython -m pdb myscript.py
\nWhen invoked as a script, pdb will automatically enter post-mortem debugging if\nthe program being debugged exits abnormally. After post-mortem debugging (or\nafter normal exit of the program), pdb will restart the program. Automatic\nrestarting preserves pdb’s state (such as breakpoints) and in most cases is more\nuseful than quitting the debugger upon program’s exit.
\n\nNew in version 2.4: Restarting post-mortem behavior added.
\nThe typical usage to break into the debugger from a running program is to\ninsert
\nimport pdb; pdb.set_trace()\n
at the location you want to break into the debugger. You can then step through\nthe code following this statement, and continue running without the debugger using\nthe c command.
\nThe typical usage to inspect a crashed program is:
\n>>> import pdb\n>>> import mymodule\n>>> mymodule.test()\nTraceback (most recent call last):\n File "<stdin>", line 1, in ?\n File "./mymodule.py", line 4, in test\n test2()\n File "./mymodule.py", line 3, in test2\n print spam\nNameError: spam\n>>> pdb.pm()\n> ./mymodule.py(3)test2()\n-> print spam\n(Pdb)\n
The module defines the following functions; each enters the debugger in a\nslightly different way:
\nThe run* functions and set_trace() are aliases for instantiating the\nPdb class and calling the method of the same name. If you want to\naccess further features, you have to do this yourself:
\nPdb is the debugger class.
\nThe completekey, stdin and stdout arguments are passed to the\nunderlying cmd.Cmd class; see the description there.
\nThe skip argument, if given, must be an iterable of glob-style module name\npatterns. The debugger will not step into frames that originate in a module\nthat matches one of these patterns. [1]
\nExample call to enable tracing with skip:
\nimport pdb; pdb.Pdb(skip=['django.*']).set_trace()\n
\nNew in version 2.7: The skip argument.
\n\n\nThe debugger recognizes the following commands. Most commands can be\nabbreviated to one or two letters; e.g. h(elp) means that either h or\nhelp can be used to enter the help command (but not he or hel, nor\nH or Help or HELP). Arguments to commands must be separated by\nwhitespace (spaces or tabs). Optional arguments are enclosed in square brackets\n([]) in the command syntax; the square brackets must not be typed.\nAlternatives in the command syntax are separated by a vertical bar (|).
\nEntering a blank line repeats the last command entered. Exception: if the last\ncommand was a list command, the next 11 lines are listed.
\nCommands that the debugger doesn’t recognize are assumed to be Python statements\nand are executed in the context of the program being debugged. Python\nstatements can also be prefixed with an exclamation point (!). This is a\npowerful way to inspect the program being debugged; it is even possible to\nchange a variable or call a function. When an exception occurs in such a\nstatement, the exception name is printed but the debugger’s state is not\nchanged.
\nMultiple commands may be entered on a single line, separated by ;;. (A\nsingle ; is not used as it is the separator for multiple commands in a line\nthat is passed to the Python parser.) No intelligence is applied to separating\nthe commands; the input is split at the first ;; pair, even if it is in the\nmiddle of a quoted string.
The debugger supports aliases. Aliases can have parameters, which gives a certain level of adaptability to the context under examination.
\nIf a file .pdbrc exists in the user’s home directory or in the current\ndirectory, it is read in and executed as if it had been typed at the debugger\nprompt. This is particularly useful for aliases. If both files exist, the one\nin the home directory is read first and aliases defined there can be overridden\nby the local file.
\nWith a lineno argument, set a break there in the current file. With a\nfunction argument, set a break at the first executable statement within that\nfunction. The line number may be prefixed with a filename and a colon, to\nspecify a breakpoint in another file (probably one that hasn’t been loaded yet).\nThe file is searched on sys.path. Note that each breakpoint is assigned a\nnumber to which all the other breakpoint commands refer.
\nIf a second argument is present, it is an expression which must evaluate to true\nbefore the breakpoint is honored.
\nWithout argument, list all breaks, including for each breakpoint, the number of\ntimes that breakpoint has been hit, the current ignore count, and the associated\ncondition if any.
\nSpecify a list of commands for breakpoint number bpnumber. The commands\nthemselves appear on the following lines. Type a line containing just ‘end’ to\nterminate the commands. An example:
\n(Pdb) commands 1\n(com) print some_variable\n(com) end\n(Pdb)
\nTo remove all commands from a breakpoint, type commands and follow it\nimmediately with end; that is, give no commands.
\nWith no bpnumber argument, commands refers to the last breakpoint set.
\nYou can use breakpoint commands to start your program up again. Simply use the\ncontinue command, or step, or any other command that resumes execution.
Specifying any command resuming execution (currently continue, step, next, return, jump, quit and their abbreviations) terminates the command list (as if that command were immediately followed by end). This is because any time you resume execution (even with a simple next or step), you may encounter another breakpoint, which could have its own command list, leading to ambiguities about which list to execute.
\nIf you use the ‘silent’ command in the command list, the usual message about\nstopping at a breakpoint is not printed. This may be desirable for breakpoints\nthat are to print a specific message and then continue. If none of the other\ncommands print anything, you see no sign that the breakpoint was reached.
\n\nNew in version 2.5.
Continue execution until a line with a number greater than the current one is reached, or until control returns from the current frame.
\n\nNew in version 2.6.
\nSet the next line that will be executed. Only available in the bottom-most\nframe. This lets you jump back and execute code again, or jump forward to skip\ncode that you don’t want to run.
\nIt should be noted that not all jumps are allowed — for instance it is not\npossible to jump into the middle of a for loop or out of a\nfinally clause.
\nEvaluate the expression in the current context and print its value.
\nNote
\nprint can also be used, but is not a debugger command — this executes the\nPython print statement.
\nCreates an alias called name that executes command. The command must not\nbe enclosed in quotes. Replaceable parameters can be indicated by %1,\n%2, and so on, while %* is replaced by all the parameters. If no\ncommand is given, the current alias for name is shown. If no arguments are\ngiven, all aliases are listed.
\nAliases may be nested and can contain anything that can be legally typed at the\npdb prompt. Note that internal pdb commands can be overridden by aliases.\nSuch a command is then hidden until the alias is removed. Aliasing is\nrecursively applied to the first word of the command line; all other words in\nthe line are left alone.
\nAs an example, here are two useful aliases (especially when placed in the\n.pdbrc file):
\n#Print instance variables (usage \"pi classInst\")\nalias pi for k in %1.__dict__.keys(): print \"%1.\",k,\"=\",%1.__dict__[k]\n#Print instance variables in self\nalias ps pi self\n
Execute the (one-line) statement in the context of the current stack frame.\nThe exclamation point can be omitted unless the first word of the statement\nresembles a debugger command. To set a global variable, you can prefix the\nassignment command with a global command on the same line, e.g.:
\n(Pdb) global list_options; list_options = ['-l']\n(Pdb)\n
Restart the debugged Python program. If an argument is supplied, it is split\nwith “shlex” and the result is used as the new sys.argv. History, breakpoints,\nactions and debugger options are preserved. “restart” is an alias for “run”.
\n\nNew in version 2.6.
\nFootnotes
[1] Whether a frame is considered to originate in a certain module is determined by the __name__ in the frame globals.
Source code: Lib/profile.py and Lib/pstats.py
\nA profiler is a program that describes the run time performance\nof a program, providing a variety of statistics. This documentation\ndescribes the profiler functionality provided in the modules\ncProfile, profile and pstats. This profiler\nprovides deterministic profiling of Python programs. It also\nprovides a series of report generation tools to allow users to rapidly\nexamine the results of a profile operation.
\nThe Python standard library provides three different profilers:
\ncProfile is recommended for most users; it’s a C extension\nwith reasonable overhead\nthat makes it suitable for profiling long-running programs.\nBased on lsprof,\ncontributed by Brett Rosen and Ted Czotter.
\n\nNew in version 2.5.
\nprofile, a pure Python module whose interface is imitated by\ncProfile. Adds significant overhead to profiled programs.\nIf you’re trying to extend\nthe profiler in some way, the task might be easier with this module.
\n\nChanged in version 2.4: Now also reports the time spent in calls to built-in functions and methods.
\nhotshot was an experimental C module that focused on minimizing\nthe overhead of profiling, at the expense of longer data\npost-processing times. It is no longer maintained and may be\ndropped in a future version of Python.
\n\nChanged in version 2.5: The results should be more meaningful than in the past: the timing core\ncontained a critical bug.
\nThe profile and cProfile modules export the same interface, so\nthey are mostly interchangeable; cProfile has a much lower overhead but\nis newer and might not be available on all systems.\ncProfile is really a compatibility layer on top of the internal\n_lsprof module. The hotshot module is reserved for specialized\nusage.
\nThis section is provided for users that “don’t want to read the manual.” It\nprovides a very brief overview, and allows a user to rapidly perform profiling\non an existing application.
\nTo profile an application with a main entry point of foo(), you would add\nthe following to your module:
\nimport cProfile\ncProfile.run('foo()')\n
(Use profile instead of cProfile if the latter is not available on\nyour system.)
\nThe above action would cause foo() to be run, and a series of informative\nlines (the profile) to be printed. The above approach is most useful when\nworking with the interpreter. If you would like to save the results of a\nprofile into a file for later examination, you can supply a file name as the\nsecond argument to the run() function:
\nimport cProfile\ncProfile.run('foo()', 'fooprof')\n
The file cProfile.py can also be invoked as a script to profile another\nscript. For example:
\npython -m cProfile myscript.py
\ncProfile.py accepts two optional arguments on the command line:
\ncProfile.py [-o output_file] [-s sort_order]
-s applies only when the profile results go to standard output (that is, when -o is not supplied). Look in the Stats documentation for valid sort values.
\nWhen you wish to review the profile, you should use the methods in the\npstats module. Typically you would load the statistics data as follows:
\nimport pstats\np = pstats.Stats('fooprof')\n
The class Stats (the above code just created an instance of this class)\nhas a variety of methods for manipulating and printing the data that was just\nread into p. When you ran cProfile.run() above, what was printed was\nthe result of three method calls:
\np.strip_dirs().sort_stats(-1).print_stats()\n
The first method removed the extraneous path from all the module names. The\nsecond method sorted all the entries according to the standard module/line/name\nstring that is printed. The third method printed out all the statistics. You\nmight try the following sort calls:
\np.sort_stats('name')\np.print_stats()\n
The first call will actually sort the list by function name, and the second call\nwill print out the statistics. The following are some interesting calls to\nexperiment with:
\np.sort_stats('cumulative').print_stats(10)\n
This sorts the profile by cumulative time in a function, and then only prints\nthe ten most significant lines. If you want to understand what algorithms are\ntaking time, the above line is what you would use.
\nIf you were looking to see what functions were looping a lot, and taking a lot\nof time, you would do:
\np.sort_stats('time').print_stats(10)\n
to sort according to time spent within each function, and then print the\nstatistics for the top ten functions.
\nYou might also try:
\np.sort_stats('file').print_stats('__init__')\n
This will sort all the statistics by file name, and then print out statistics\nfor only the class init methods (since they are spelled with __init__ in\nthem). As one final example, you could try:
\np.sort_stats('time', 'cum').print_stats(.5, 'init')\n
This line sorts statistics with a primary key of time, and a secondary key of\ncumulative time, and then prints out some of the statistics. To be specific, the\nlist is first culled down to 50% (re: .5) of its original size, then only\nlines containing init are maintained, and that sub-sub-list is printed.
\nIf you wondered what functions called the above functions, you could now (p\nis still sorted according to the last criteria) do:
\np.print_callers(.5, 'init')\n
and you would get a list of callers for each of the listed functions.
\nIf you want more functionality, you’re going to have to read the manual, or\nguess what the following functions do:
\np.print_callees()\np.add('fooprof')\n
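The snippets above assume a previously saved fooprof file. A self-contained sketch of the same workflow follows; the function work() and the temporary file name stand in for foo() and fooprof and are purely illustrative:

```python
import cProfile
import os
import pstats
import tempfile

def work():
    # Hypothetical workload standing in for foo().
    return sum(i * i for i in range(1000))

# Profile work() and save the raw data to a temporary file.
proffile = os.path.join(tempfile.mkdtemp(), "fooprof")
pr = cProfile.Profile()
pr.runcall(work)
pr.dump_stats(proffile)

# Load the data and print the ten most expensive entries by cumulative time.
p = pstats.Stats(proffile)
p.strip_dirs().sort_stats("cumulative").print_stats(10)
```

Using Profile.runcall() rather than cProfile.run() sidesteps the need for the profiled statement to be resolvable in the __main__ namespace.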
Invoked as a script, the pstats module is a statistics browser for\nreading and examining profile dumps. It has a simple line-oriented interface\n(implemented using cmd) and interactive help.
\nDeterministic profiling is meant to reflect the fact that all function\ncall, function return, and exception events are monitored, and precise\ntimings are made for the intervals between these events (during which time the\nuser’s code is executing). In contrast, statistical profiling (which is\nnot done by this module) randomly samples the effective instruction pointer, and\ndeduces where time is being spent. The latter technique traditionally involves\nless overhead (as the code does not need to be instrumented), but provides only\nrelative indications of where time is being spent.
\nIn Python, since there is an interpreter active during execution, the presence\nof instrumented code is not required to do deterministic profiling. Python\nautomatically provides a hook (optional callback) for each event. In\naddition, the interpreted nature of Python tends to add so much overhead to\nexecution, that deterministic profiling tends to only add small processing\noverhead in typical applications. The result is that deterministic profiling is\nnot that expensive, yet provides extensive run time statistics about the\nexecution of a Python program.
\nCall count statistics can be used to identify bugs in code (surprising counts),\nand to identify possible inline-expansion points (high call counts). Internal\ntime statistics can be used to identify “hot loops” that should be carefully\noptimized. Cumulative time statistics should be used to identify high level\nerrors in the selection of algorithms. Note that the unusual handling of\ncumulative times in this profiler allows statistics for recursive\nimplementations of algorithms to be directly compared to iterative\nimplementations.
\nThe primary entry point for the profiler is the global function\nprofile.run() (resp. cProfile.run()). It is typically used to create\nany profile information. The reports are formatted and printed using methods of\nthe class pstats.Stats. The following is a description of all of these\nstandard entry points and functions. For a more in-depth view of some of the\ncode, consider reading the later section on Profiler Extensions, which includes\ndiscussion of how to derive “better” profilers from the classes presented, or\nreading the source code for these modules.
\nThis function takes a single argument that can be passed to the\nexec statement, and an optional file name. In all cases this\nroutine attempts to exec its first argument, and gather profiling\nstatistics from the execution. If no file name is present, then this function\nautomatically prints a simple profiling report, sorted by the standard name\nstring (file/line/function-name) that is presented in each line. The\nfollowing is a typical output from such a call:
\n 2706 function calls (2004 primitive calls) in 4.504 CPU seconds\n\nOrdered by: standard name\n\nncalls tottime percall cumtime percall filename:lineno(function)\n 2 0.006 0.003 0.953 0.477 pobject.py:75(save_objects)\n 43/3 0.533 0.012 0.749 0.250 pobject.py:99(evaluate)\n ...
\nThe first line indicates that 2706 calls were monitored. Of those calls, 2004\nwere primitive. We define primitive to mean that the call was not\ninduced via recursion. The next line: Ordered by: standard name, indicates\nthat the text string in the far right column was used to sort the output. The\ncolumn headings include:
\nWhen there are two numbers in the first column (for example, 43/3), then the\nlatter is the number of primitive calls, and the former is the actual number of\ncalls. Note that when the function does not recurse, these two values are the\nsame, and only the single figure is printed.
\nAnalysis of the profiler data is done using the Stats class.
\nNote
\nThe Stats class is defined in the pstats module.
\nThis class constructor creates an instance of a “statistics object” from a\nfilename (or set of filenames). Stats objects are manipulated by\nmethods, in order to print useful reports. You may specify an alternate output\nstream by giving the keyword argument, stream.
\nThe file selected by the above constructor must have been created by the\ncorresponding version of profile or cProfile. To be specific,\nthere is no file compatibility guaranteed with future versions of this\nprofiler, and there is no compatibility with files produced by other profilers.\nIf several files are provided, all the statistics for identical functions will\nbe coalesced, so that an overall view of several processes can be considered in\na single report. If additional files need to be combined with data in an\nexisting Stats object, the add() method can be used.
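A sketch of coalescing two dumps with add(); the file paths and the work() function are invented for illustration:

```python
import cProfile
import os
import pstats
import tempfile

def work():
    return sum(range(10000))

tmpdir = tempfile.mkdtemp()
first = os.path.join(tmpdir, "run1.prof")
second = os.path.join(tmpdir, "run2.prof")

# Produce two separate profile dumps of the same function.
for path in (first, second):
    pr = cProfile.Profile()
    pr.runcall(work)
    pr.dump_stats(path)

# Load one dump and merge the other into it; statistics for
# identical functions are coalesced into a single entry.
combined = pstats.Stats(first)
combined.add(second)
```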
\n\nChanged in version 2.5: The stream parameter was added.
\nStats objects have the following methods:
\nSave the data loaded into the Stats object to a file named filename.\nThe file is created if it does not exist, and is overwritten if it already\nexists. This is equivalent to the method of the same name on the\nprofile.Profile and cProfile.Profile classes.
\n\nNew in version 2.3.
\nThis method modifies the Stats object by sorting it according to the\nsupplied criteria. The argument is typically a string identifying the basis of\na sort (example: 'time' or 'name').
\nWhen more than one key is provided, then additional keys are used as secondary\ncriteria when there is equality in all keys selected before them. For example,\nsort_stats('name', 'file') will sort all the entries according to their\nfunction name, and resolve all ties (identical function names) by sorting by\nfile name.
\nAbbreviations can be used for any key names, as long as the abbreviation is\nunambiguous. The following are the keys currently defined:
Valid Arg       Meaning
'calls'         call count
'cumulative'    cumulative time
'file'          file name
'module'        file name
'pcalls'        primitive call count
'line'          line number
'name'          function name
'nfl'           name/file/line
'stdname'       standard name
'time'          internal time
Note that all sorts on statistics are in descending order (placing most time\nconsuming items first), where as name, file, and line number searches are in\nascending order (alphabetical). The subtle distinction between 'nfl' and\n'stdname' is that the standard name is a sort of the name as printed, which\nmeans that the embedded line numbers get compared in an odd way. For example,\nlines 3, 20, and 40 would (if the file names were the same) appear in the string\norder 20, 3 and 40. In contrast, 'nfl' does a numeric compare of the line\nnumbers. In fact, sort_stats('nfl') is the same as sort_stats('name',\n'file', 'line').
\nFor backward-compatibility reasons, the numeric arguments -1, 0, 1,\nand 2 are permitted. They are interpreted as 'stdname', 'calls',\n'time', and 'cumulative' respectively. If this old style format\n(numeric) is used, only one sort key (the numeric key) will be used, and\nadditional arguments will be silently ignored.
\nThis method for the Stats class prints out a report as described in the\nprofile.run() definition.
\nThe order of the printing is based on the last sort_stats() operation done\non the object (subject to caveats in add() and strip_dirs()).
\nThe arguments provided (if any) can be used to limit the list down to the\nsignificant entries. Initially, the list is taken to be the complete set of\nprofiled functions. Each restriction is either an integer (to select a count of\nlines), or a decimal fraction between 0.0 and 1.0 inclusive (to select a\npercentage of lines), or a regular expression (to pattern match the standard\nname that is printed; as of Python 1.5b1, this uses the Perl-style regular\nexpression syntax defined by the re module). If several restrictions are\nprovided, then they are applied sequentially. For example:
\nprint_stats(.1, 'foo:')\n
would first limit the printing to first 10% of list, and then only print\nfunctions that were part of filename .*foo:. In contrast, the\ncommand:
\nprint_stats('foo:', .1)\n
would limit the list to all functions having file names .*foo:, and\nthen proceed to only print the first 10% of them.
This method for the Stats class prints a list of all functions that called each function in the profiled database. The ordering is identical to that provided by print_stats(), and the definition of the restricting argument is also identical. Each caller is reported on its own line. The format differs slightly depending on the profiler that produced the stats.
\nOne limitation has to do with accuracy of timing information. There is a\nfundamental problem with deterministic profilers involving accuracy. The most\nobvious restriction is that the underlying “clock” is only ticking at a rate\n(typically) of about .001 seconds. Hence no measurements will be more accurate\nthan the underlying clock. If enough measurements are taken, then the “error”\nwill tend to average out. Unfortunately, removing this first error induces a\nsecond source of error.
\nThe second problem is that it “takes a while” from when an event is dispatched\nuntil the profiler’s call to get the time actually gets the state of the\nclock. Similarly, there is a certain lag when exiting the profiler event\nhandler from the time that the clock’s value was obtained (and then squirreled\naway), until the user’s code is once again executing. As a result, functions\nthat are called many times, or call many functions, will typically accumulate\nthis error. The error that accumulates in this fashion is typically less than\nthe accuracy of the clock (less than one clock tick), but it can accumulate\nand become very significant.
The problem is more important with profile than with the lower-overhead cProfile. For this reason, profile provides a means of calibrating itself for a given platform so that this error can be probabilistically (on the average) removed. After the profiler is calibrated, it will be more accurate (in a least square sense), but it will sometimes produce negative numbers (when call counts are exceptionally low, and the gods of probability work against you :-)). Do not be alarmed by negative numbers in the profile. They should only appear if you have calibrated your profiler, and the results are actually better than without calibration.
\nThe profiler of the profile module subtracts a constant from each event\nhandling time to compensate for the overhead of calling the time function, and\nsocking away the results. By default, the constant is 0. The following\nprocedure can be used to obtain a better constant for a given platform (see\ndiscussion in section Limitations above).
\nimport profile\npr = profile.Profile()\nfor i in range(5):\n print pr.calibrate(10000)\n
The method executes the number of Python calls given by the argument, directly\nand again under the profiler, measuring the time for both. It then computes the\nhidden overhead per profiler event, and returns that as a float. For example,\non an 800 MHz Pentium running Windows 2000, and using Python’s time.clock() as\nthe timer, the magical number is about 12.5e-6.
\nThe object of this exercise is to get a fairly consistent result. If your\ncomputer is very fast, or your timer function has poor resolution, you might\nhave to pass 100000, or even 1000000, to get consistent results.
When you have a consistent answer, there are three ways you can use it: [2]
\nimport profile\n\n# 1. Apply computed bias to all Profile instances created hereafter.\nprofile.Profile.bias = your_computed_bias\n\n# 2. Apply computed bias to a specific Profile instance.\npr = profile.Profile()\npr.bias = your_computed_bias\n\n# 3. Specify computed bias in instance constructor.\npr = profile.Profile(bias=your_computed_bias)\n
If you have a choice, you are better off choosing a smaller constant, and then\nyour results will “less often” show up as negative in profile statistics.
\nThe Profile class of both modules, profile and cProfile,\nwere written so that derived classes could be developed to extend the profiler.\nThe details are not described here, as doing this successfully requires an\nexpert understanding of how the Profile class works internally. Study\nthe source code of the module carefully if you want to pursue this.
\nIf all you want to do is change how current time is determined (for example, to\nforce use of wall-clock time or elapsed process time), pass the timing function\nyou want to the Profile class constructor:
\npr = profile.Profile(your_time_func)\n
The resulting profiler will then call your_time_func().
\nyour_time_func() should return a single number, or a list of numbers whose\nsum is the current time (like what os.times() returns). If the function\nreturns a single time number, or the list of returned numbers has length 2, then\nyou will get an especially fast version of the dispatch routine.
\nBe warned that you should calibrate the profiler class for the timer function\nthat you choose. For most machines, a timer that returns a lone integer value\nwill provide the best results in terms of low overhead during profiling.\n(os.times() is pretty bad, as it returns a tuple of floating point\nvalues). If you want to substitute a better timer in the cleanest fashion,\nderive a class and hardwire a replacement dispatch method that best handles your\ntimer call, along with the appropriate calibration constant.
\nyour_time_func() should return a single number. If it returns plain\nintegers, you can also invoke the class constructor with a second argument\nspecifying the real duration of one unit of time. For example, if\nyour_integer_time_func() returns times measured in thousands of seconds,\nyou would construct the Profile instance as follows:
\npr = profile.Profile(your_integer_time_func, 0.001)\n
As the cProfile.Profile class cannot be calibrated, custom timer\nfunctions should be used with care and should be as fast as possible. For the\nbest results with a custom timer, it might be necessary to hard-code it in the C\nsource of the internal _lsprof module.
\nFootnotes
[1] Updated and converted to LaTeX by Guido van Rossum. Further updated by Armin Rigo to integrate the documentation for the new cProfile module of Python 2.5.
[2] Prior to Python 2.2, it was necessary to edit the profiler source code to embed the bias as a literal number. You still can, but that method is no longer described, because it is no longer needed.
\nNew in version 2.2.
\nThis module provides a nicer interface to the _hotshot C module. Hotshot\nis a replacement for the existing profile module. As it’s written mostly\nin C, it should result in a much smaller performance impact than the existing\nprofile module.
\nNote
\nThe hotshot module focuses on minimizing the overhead while profiling, at\nthe expense of long data post-processing times. For common usage it is\nrecommended to use cProfile instead. hotshot is not maintained and\nmight be removed from the standard library in the future.
\n\nChanged in version 2.5: The results should be more meaningful than in the past: the timing core\ncontained a critical bug.
\nNote
\nThe hotshot profiler does not yet work well with threads. It is useful to\nuse an unthreaded script to run the profiler over the code you’re interested in\nmeasuring if at all possible.
\nProfile objects have the following methods:
\n\nNew in version 2.2.
\nThis module loads hotshot profiling data into the standard pstats Stats\nobjects.
\nNote that this example runs the Python “benchmark” pystones. It can take some\ntime to run, and will produce large output files.
\n>>> import hotshot, hotshot.stats, test.pystone\n>>> prof = hotshot.Profile("stones.prof")\n>>> benchtime, stones = prof.runcall(test.pystone.pystones)\n>>> prof.close()\n>>> stats = hotshot.stats.load("stones.prof")\n>>> stats.strip_dirs()\n>>> stats.sort_stats('time', 'calls')\n>>> stats.print_stats(20)\n 850004 function calls in 10.090 CPU seconds\n\n Ordered by: internal time, call count\n\n ncalls tottime percall cumtime percall filename:lineno(function)\n 1 3.295 3.295 10.090 10.090 pystone.py:79(Proc0)\n 150000 1.315 0.000 1.315 0.000 pystone.py:203(Proc7)\n 50000 1.313 0.000 1.463 0.000 pystone.py:229(Func2)\n .\n .\n .\n
\nNew in version 2.3.
\nSource code: Lib/timeit.py
\nThis module provides a simple way to time small bits of Python code. It has both\ncommand line as well as callable interfaces. It avoids a number of common traps\nfor measuring execution times. See also Tim Peters’ introduction to the\n“Algorithms” chapter in the Python Cookbook, published by O’Reilly.
\nThe module defines the following public class:
\nClass for timing execution speed of small code snippets.
\nThe constructor takes a statement to be timed, an additional statement used for\nsetup, and a timer function. Both statements default to 'pass'; the timer\nfunction is platform-dependent (see the module doc string). stmt and setup\nmay also contain multiple statements separated by ; or newlines, as long as\nthey don’t contain multi-line string literals.
\nTo measure the execution time of the first statement, use the timeit()\nmethod. The repeat() method is a convenience to call timeit()\nmultiple times and return a list of results.
\n\nChanged in version 2.6: The stmt and setup parameters can now also take objects that are callable\nwithout arguments. This will embed calls to them in a timer function that will\nthen be executed by timeit(). Note that the timing overhead is a little\nlarger in this case because of the extra function calls.
\nHelper to print a traceback from the timed code.
\nTypical use:
\nt = Timer(...) # outside the try/except\ntry:\n t.timeit(...) # or t.repeat(...)\nexcept:\n t.print_exc()\n
The advantage over the standard traceback is that source lines in the compiled\ntemplate will be displayed. The optional file argument directs where the\ntraceback is sent; it defaults to sys.stderr.
\nCall timeit() a few times.
\nThis is a convenience function that calls the timeit() repeatedly,\nreturning a list of results. The first argument specifies how many times to\ncall timeit(). The second argument specifies the number argument for\ntimeit().
\nNote
\nIt’s tempting to calculate mean and standard deviation from the result vector\nand report these. However, this is not very useful. In a typical case, the\nlowest value gives a lower bound for how fast your machine can run the given\ncode snippet; higher values in the result vector are typically not caused by\nvariability in Python’s speed, but by other processes interfering with your\ntiming accuracy. So the min() of the result is probably the only number\nyou should be interested in. After that, you should look at the entire vector\nand apply common sense rather than statistics.
\nTime number executions of the main statement. This executes the setup\nstatement once, and then returns the time it takes to execute the main statement\na number of times, measured in seconds as a float. The argument is the number\nof times through the loop, defaulting to one million. The main statement, the\nsetup statement and the timer function to be used are passed to the constructor.
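A minimal sketch; the statement being timed is arbitrary, and a small number is used so the call returns quickly:

```python
import timeit

# Time 1000 executions of a small statement; the result is the
# total elapsed time in seconds, as a float.
t = timeit.Timer("[i * 2 for i in range(100)]")
elapsed = t.timeit(number=1000)
print("1000 loops took %.6f seconds" % elapsed)
```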
\nNote
\nBy default, timeit() temporarily turns off garbage collection\nduring the timing. The advantage of this approach is that it makes\nindependent timings more comparable. This disadvantage is that GC may be\nan important component of the performance of the function being measured.\nIf so, GC can be re-enabled as the first statement in the setup string.\nFor example:
\ntimeit.Timer('for i in xrange(10): oct(i)', 'gc.enable()').timeit()\n
Starting with version 2.6, the module also defines two convenience functions:
\nCreate a Timer instance with the given statement, setup code and timer\nfunction and run its repeat() method with the given repeat count and\nnumber executions.
\n\nNew in version 2.6.
\nCreate a Timer instance with the given statement, setup code and timer\nfunction and run its timeit() method with number executions.
\n\nNew in version 2.6.
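A brief sketch of both helpers on an arbitrary statement; note that repeat() returns a list, of which min() is usually the figure of interest (see the note above):

```python
import timeit

# Single timing run of 1000 executions; returns total seconds as a float.
total = timeit.timeit("sum(range(100))", number=1000)

# Three independent runs of 1000 executions each; returns a list of floats.
runs = timeit.repeat("sum(range(100))", number=1000, repeat=3)
best = min(runs)
```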
\nWhen called as a program from the command line, the following form is used:
\npython -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...]
\nWhere the following options are understood:
\nA multi-line statement may be given by specifying each line as a separate\nstatement argument; indented lines are possible by enclosing an argument in\nquotes and using leading spaces. Multiple -s options are treated\nsimilarly.
\nIf -n is not given, a suitable number of loops is calculated by trying\nsuccessive powers of 10 until the total time is at least 0.2 seconds.
The default timer function is platform dependent. On Windows, time.clock() has microsecond granularity but time.time()'s granularity is 1/60th of a second; on Unix, time.clock() has 1/100th of a second granularity and time.time() is much more precise. On either platform, the default timer functions measure wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing. The best thing to do when accurate timing is necessary is to repeat the timing a few times and use the best time. The -r option is good for this; the default of 3 repetitions is probably enough in most cases. On Unix, you can use time.clock() to measure CPU time.
\nNote
\nThere is a certain baseline overhead associated with executing a pass statement.\nThe code here doesn’t try to hide it, but you should be aware of it. The\nbaseline overhead can be measured by invoking the program without arguments.
\nThe baseline overhead differs between Python versions! Also, to fairly compare\nolder Python versions to Python 2.3, you may want to use Python’s -O\noption for the older versions to avoid timing SET_LINENO instructions.
\nHere are two example sessions (one using the command line, one using the module\ninterface) that compare the cost of using hasattr() vs.\ntry/except to test for missing and present object\nattributes.
\n$ python -m timeit 'try:' ' str.__nonzero__' 'except AttributeError:' ' pass'\n100000 loops, best of 3: 15.7 usec per loop\n$ python -m timeit 'if hasattr(str, \"__nonzero__\"): pass'\n100000 loops, best of 3: 4.26 usec per loop\n$ python -m timeit 'try:' ' int.__nonzero__' 'except AttributeError:' ' pass'\n1000000 loops, best of 3: 1.43 usec per loop\n$ python -m timeit 'if hasattr(int, \"__nonzero__\"): pass'\n100000 loops, best of 3: 2.23 usec per loop
>>> import timeit
>>> s = """\
... try:
...     str.__nonzero__
... except AttributeError:
...     pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
17.09 usec/pass
>>> s = """\
... if hasattr(str, '__nonzero__'): pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
4.85 usec/pass
>>> s = """\
... try:
...     int.__nonzero__
... except AttributeError:
...     pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
1.97 usec/pass
>>> s = """\
... if hasattr(int, '__nonzero__'): pass
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
3.15 usec/pass
To give the timeit module access to functions you define, you can pass a\nsetup parameter which contains an import statement:
\ndef test():\n """Stupid test function"""\n L = []\n for i in range(100):\n L.append(i)\n\nif __name__ == '__main__':\n from timeit import Timer\n t = Timer("test()", "from __main__ import test")\n print t.timeit()\n
This module provides direct access to all ‘built-in’ identifiers of Python; for\nexample, __builtin__.open is the full name for the built-in function\nopen(). See Built-in Functions and Built-in Constants for\ndocumentation.
\nThis module is not normally accessed explicitly by most applications, but can be\nuseful in modules that provide objects with the same name as a built-in value,\nbut in which the built-in of that name is also needed. For example, in a module\nthat wants to implement an open() function that wraps the built-in\nopen(), this module can be used directly:
\nimport __builtin__\n\ndef open(path):\n f = __builtin__.open(path, 'r')\n return UpperCaser(f)\n\nclass UpperCaser:\n '''Wrapper around a file that converts output to upper-case.'''\n\n def __init__(self, f):\n self._f = f\n\n def read(self, count=-1):\n return self._f.read(count).upper()\n\n # ...\n
CPython implementation detail: Most modules have the name __builtins__ (note the 's') made available\nas part of their globals. The value of __builtins__ is normally either\nthis module or the value of this module’s __dict__ attribute. Since\nthis is an implementation detail, it may not be used by alternate\nimplementations of Python.
\n\nNew in version 2.6.
\nThis module provides functions that exist in 2.x, but have different behavior in\nPython 3, so they cannot be put into the 2.x builtins namespace.
\nInstead, if you want to write code compatible with Python 3 builtins, import\nthem from this module, like this:
\nfrom future_builtins import map, filter\n\n... code using Python 3-style map and filter ...\n
The 2to3 tool that ports Python 2 code to Python 3 will recognize\nthis usage and leave the new builtins alone.
\nNote
\nThe Python 3 print() function is already in the builtins, but cannot be\naccessed from Python 2 code unless you use the appropriate future statement:
\nfrom __future__ import print_function\n
Available builtins are:
\n\nNew in version 2.7.
\nSource code: Lib/sysconfig.py
\nThe sysconfig module provides access to Python’s configuration\ninformation like the list of installation paths and the configuration variables\nrelevant for the current platform.
\nA Python distribution contains a Makefile and a pyconfig.h\nheader file that are necessary to build both the Python binary itself and\nthird-party C extensions compiled using distutils.
\nsysconfig puts all variables found in these files in a dictionary that\ncan be accessed using get_config_vars() or get_config_var().
\nNotice that on Windows, it’s a much smaller set.
\nWith no arguments, return a dictionary of all configuration variables\nrelevant for the current platform.
\nWith arguments, return a list of values that result from looking up each\nargument in the configuration variable dictionary.
\nFor each argument, if the value is not found, return None.
\nReturn the value of a single variable name. Equivalent to\nget_config_vars().get(name).
\nIf name is not found, return None.
\nExample of usage:
\n>>> import sysconfig\n>>> sysconfig.get_config_var('Py_ENABLE_SHARED')\n0\n>>> sysconfig.get_config_var('LIBDIR')\n'/usr/local/lib'\n>>> sysconfig.get_config_vars('AR', 'CXX')\n['ar', 'g++']\n
Python uses an installation scheme that differs depending on the platform and on\nthe installation options. These schemes are stored in sysconfig under\nunique identifiers based on the value returned by os.name.
\nEvery new component that is installed using distutils or a\nDistutils-based system will follow the same scheme to copy its files into the\nright places.
\nPython currently supports seven schemes:
\nEach scheme is itself composed of a series of paths and each path has a unique\nidentifier. Python currently uses eight paths:
\nsysconfig provides some functions to determine these paths.
\nReturn an installation path corresponding to the path name, from the\ninstall scheme named scheme.
\nname has to be a value from the list returned by get_path_names().
\nsysconfig stores installation paths corresponding to each path name,\nfor each platform, with variables to be expanded. For instance the stdlib\npath for the nt scheme is: {base}/Lib.
\nget_path() will use the variables returned by get_config_vars()\nto expand the path. All variables have default values for each platform so\none may call this function and get the default value.
\nIf scheme is provided, it must be a value from the list returned by\nget_path_names(). Otherwise, the default scheme for the current\nplatform is used.
\nIf vars is provided, it must be a dictionary of variables that will update\nthe dictionary returned by get_config_vars().
\nIf expand is set to False, the path will not be expanded using the\nvariables.
\nIf name is not found, return None.
\nReturn a dictionary containing all installation paths corresponding to an\ninstallation scheme. See get_path() for more information.
\nIf scheme is not provided, will use the default scheme for the current\nplatform.
\nIf vars is provided, it must be a dictionary of variables that will\nupdate the dictionary used to expand the paths.
\nIf expand is set to False, the paths will not be expanded.
\nIf scheme is not an existing scheme, get_paths() will raise a\nKeyError.
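For illustration, a short sketch querying the default scheme; the concrete paths printed depend on the platform and installation, so none are shown:

```python
import sysconfig

# All path identifiers sysconfig knows about ('stdlib', 'purelib', ...).
names = sysconfig.get_path_names()

# One path from the default scheme for this platform.
stdlib_dir = sysconfig.get_path('stdlib')

# Every path of the default scheme at once, as a dictionary.
paths = sysconfig.get_paths()

print(names)
print(stdlib_dir)
```

For a single path, get_path(name) and get_paths()[name] return the same value.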
\nReturn a string that identifies the current platform.
\nThis is used mainly to distinguish platform-specific build directories and\nplatform-specific built distributions. Typically includes the OS name and\nversion and the architecture (as supplied by os.uname()), although the\nexact information included depends on the OS; e.g. for IRIX the architecture\nisn’t particularly important (IRIX only runs on SGI hardware), but for Linux\nthe kernel version isn’t particularly important.
\nExamples of returned values:
\nWindows will return one of:
\nMac OS X can return:
\nFor other non-POSIX platforms, currently just returns sys.platform.
\nParse a config.h-style file.
\nfp is a file-like object pointing to the config.h-like file.
\nA dictionary containing name/value pairs is returned. If an optional\ndictionary is passed in as the second argument, it is used instead of a new\ndictionary, and updated with the values read in the file.
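A minimal sketch, feeding a hypothetical in-memory pyconfig.h-style header through io.StringIO in place of a real file:

```python
import io
import sysconfig

# Hypothetical pyconfig.h-style content, for illustration only.
header = io.StringIO(
    '#define HAVE_FORK 1\n'
    '#define VERSION "2.7"\n'
)
config_vars = sysconfig.parse_config_h(header)
print(config_vars['HAVE_FORK'])  # 1 (numeric values are converted to int)
```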
\nThis module provides access to some variables used or maintained by the\ninterpreter and to functions that interact strongly with the interpreter. It is\nalways available.
\nThe list of command line arguments passed to a Python script. argv[0] is the\nscript name (it is operating system dependent whether this is a full pathname or\nnot). If the command was executed using the -c command line option to\nthe interpreter, argv[0] is set to the string '-c'. If no script name\nwas passed to the Python interpreter, argv[0] is the empty string.
\nTo loop over the standard input, or the list of files given on the\ncommand line, see the fileinput module.
\nAn indicator of the native byte order. This will have the value 'big' on\nbig-endian (most-significant byte first) platforms, and 'little' on\nlittle-endian (least-significant byte first) platforms.
\n\nNew in version 2.0.
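The native order is also what the struct module's native format characters use; a small sketch that makes the byte order directly visible:

```python
import struct
import sys

# Packing 1 as a native-order 16-bit unsigned integer exposes the
# machine's byte order in the resulting bytes.
packed = struct.pack('=H', 1)
if sys.byteorder == 'little':
    assert packed == b'\x01\x00'   # least-significant byte first
else:
    assert packed == b'\x00\x01'   # most-significant byte first
print(sys.byteorder)
```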
\nClear the internal type cache. The type cache is used to speed up attribute\nand method lookups. Use the function only to drop unnecessary references\nduring reference leak debugging.
\nThis function should be used for internal and specialized purposes only.
\n\nNew in version 2.6.
\nReturn a dictionary mapping each thread’s identifier to the topmost stack frame\ncurrently active in that thread at the time the function is called. Note that\nfunctions in the traceback module can build the call stack given such a\nframe.
\nThis is most useful for debugging deadlock: this function does not require the\ndeadlocked threads’ cooperation, and such threads’ call stacks are frozen for as\nlong as they remain deadlocked. The frame returned for a non-deadlocked thread\nmay bear no relationship to that thread’s current activity by the time calling\ncode examines the frame.
\nThis function should be used for internal and specialized purposes only.
\n\nNew in version 2.5.
\nIf value is not None, this function prints it to sys.stdout, and saves\nit in __builtin__._.
\nsys.displayhook is called on the result of evaluating an expression\nentered in an interactive Python session. The display of these values can be\ncustomized by assigning another one-argument function to sys.displayhook.
\nIf this is true, Python won’t try to write .pyc or .pyo files on the\nimport of source modules. This value is initially set to True or\nFalse depending on the -B command line option and the\nPYTHONDONTWRITEBYTECODE environment variable, but you can set it\nyourself to control bytecode file generation.
\n\nNew in version 2.6.
\nThis function prints out a given traceback and exception to sys.stderr.
\nWhen an exception is raised and uncaught, the interpreter calls\nsys.excepthook with three arguments, the exception class, exception\ninstance, and a traceback object. In an interactive session this happens just\nbefore control is returned to the prompt; in a Python program this happens just\nbefore the program exits. The handling of such top-level exceptions can be\ncustomized by assigning another three-argument function to sys.excepthook.
\nThis function returns a tuple of three values that give information about the\nexception that is currently being handled. The information returned is specific\nboth to the current thread and to the current stack frame. If the current stack\nframe is not handling an exception, the information is taken from the calling\nstack frame, or its caller, and so on until a stack frame is found that is\nhandling an exception. Here, “handling an exception” is defined as “executing\nor having executed an except clause.” For any stack frame, only information\nabout the most recently handled exception is accessible.
\nIf no exception is being handled anywhere on the stack, a tuple containing three\nNone values is returned. Otherwise, the values returned are (type, value,\ntraceback). Their meaning is: type gets the exception type of the exception\nbeing handled (a class object); value gets the exception parameter (its\nassociated value or the second argument to raise, which is\nalways a class instance if the exception type is a class object); traceback\ngets a traceback object (see the Reference Manual) which encapsulates the call\nstack at the point where the exception originally occurred.
\nIf exc_clear() is called, this function will return three None values\nuntil either another exception is raised in the current thread or the execution\nstack returns to a frame where another exception is being handled.
\nWarning
\nAssigning the traceback return value to a local variable in a function that is\nhandling an exception will cause a circular reference. This will prevent\nanything referenced by a local variable in the same function or by the traceback\nfrom being garbage collected. Since most functions don’t need access to the\ntraceback, the best solution is to use something like exctype, value =\nsys.exc_info()[:2] to extract only the exception type and value. If you do\nneed the traceback, make sure to delete it after use (best done with a\ntry ... finally statement) or to call exc_info() in\na function that does not itself handle an exception.
\nNote
\nBeginning with Python 2.2, such cycles are automatically reclaimed when garbage\ncollection is enabled and they become unreachable, but it remains more efficient\nto avoid creating cycles.
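A short sketch of the recommended exctype, value = sys.exc_info()[:2] pattern:

```python
import sys

def classify_error():
    try:
        1 / 0
    except ZeroDivisionError:
        # Take only type and value; dropping the traceback avoids the
        # circular-reference problem described above.
        exctype, value = sys.exc_info()[:2]
        return exctype, value

exctype, value = classify_error()
print(exctype.__name__)  # ZeroDivisionError
```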
\nThis function clears all information relating to the current or last exception\nthat occurred in the current thread. After calling this function,\nexc_info() will return three None values until another exception is\nraised in the current thread or the execution stack returns to a frame where\nanother exception is being handled.
\nThis function is needed in only a few obscure situations. These include\nlogging and error handling systems that report information on the last or\ncurrent exception. This function can also be used to try to free resources and\ntrigger object finalization, though no guarantee is made as to what objects will\nbe freed, if any.
\n\nNew in version 2.3.
\n\nDeprecated since version 1.5: Use exc_info() instead.
\nSince they are global variables, they are not specific to the current thread, so\ntheir use is not safe in a multi-threaded program. When no exception is being\nhandled, exc_type is set to None and the other two are undefined.
\nExit from Python. This is implemented by raising the SystemExit\nexception, so cleanup actions specified by finally clauses of try\nstatements are honored, and it is possible to intercept the exit attempt at\nan outer level.
\nThe optional argument arg can be an integer giving the exit status\n(defaulting to zero), or another type of object. If it is an integer, zero\nis considered “successful termination” and any nonzero value is considered\n“abnormal termination” by shells and the like. Most systems require it to be\nin the range 0-127, and produce undefined results otherwise. Some systems\nhave a convention for assigning specific meanings to specific exit codes, but\nthese are generally underdeveloped; Unix programs generally use 2 for command\nline syntax errors and 1 for all other kinds of errors. If another type of\nobject is passed, None is equivalent to passing zero, and any other\nobject is printed to stderr and results in an exit code of 1. In\nparticular, sys.exit("some error message") is a quick way to exit a\nprogram when an error occurs.
\nSince exit() ultimately “only” raises an exception, it will only exit\nthe process when called from the main thread, and the exception is not\nintercepted.
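Because exit() only raises an exception, the exit attempt can be intercepted like any other exception; a brief sketch:

```python
import sys

# sys.exit() "only" raises SystemExit, so an outer level can intercept
# the exit attempt; the exception's 'code' attribute holds the argument.
try:
    sys.exit(2)
except SystemExit as exc:
    status = exc.code

print(status)  # 2
```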
\nThis value is not actually defined by the module, but can be set by the user (or\nby a program) to specify a clean-up action at program exit. When set, it should\nbe a parameterless function. This function will be called when the interpreter\nexits. Only one function may be installed in this way; to allow multiple\nfunctions which will be called at termination, use the atexit module.
\nNote
\nThe exit function is not called when the program is killed by a signal, when a\nPython fatal internal error is detected, or when os._exit() is called.
\n\nDeprecated since version 2.4: Use atexit instead.
\nThe struct sequence flags exposes the status of command line flags. The\nattributes are read only.
\nattribute | \nflag | \n
---|---|
debug | \n-d | \n
py3k_warning | \n-3 | \n
division_warning | \n-Q | \n
division_new | \n-Qnew | \n
inspect | \n-i | \n
interactive | \n-i | \n
optimize | \n-O or -OO | \n
dont_write_bytecode | \n-B | \n
no_user_site | \n-s | \n
no_site | \n-S | \n
ignore_environment | \n-E | \n
tabcheck | \n-t or -tt | \n
verbose | \n-v | \n
unicode | \n-U | \n
bytes_warning | \n-b | \n
\nNew in version 2.6.
\nA structseq holding information about the float type. It contains low level\ninformation about the precision and internal representation. The values\ncorrespond to the various floating-point constants defined in the standard\nheader file float.h for the ‘C’ programming language; see section\n5.2.4.2.2 of the 1999 ISO/IEC C standard [C99], ‘Characteristics of\nfloating types’, for details.
attribute | float.h macro | explanation
---|---|---
epsilon | DBL_EPSILON | difference between 1 and the least value greater than 1 that is representable as a float
dig | DBL_DIG | maximum number of decimal digits that can be faithfully represented in a float; see below
mant_dig | DBL_MANT_DIG | float precision: the number of base-radix digits in the significand of a float
max | DBL_MAX | maximum representable finite float
max_exp | DBL_MAX_EXP | maximum integer e such that radix**(e-1) is a representable finite float
max_10_exp | DBL_MAX_10_EXP | maximum integer e such that 10**e is in the range of representable finite floats
min | DBL_MIN | minimum positive normalized float
min_exp | DBL_MIN_EXP | minimum integer e such that radix**(e-1) is a normalized float
min_10_exp | DBL_MIN_10_EXP | minimum integer e such that 10**e is a normalized float
radix | FLT_RADIX | radix of exponent representation
rounds | FLT_ROUNDS | integer constant representing the rounding mode used for arithmetic operations. This reflects the value of the system FLT_ROUNDS macro at interpreter startup time. See section 5.2.4.2.2 of the C99 standard for an explanation of the possible values and their meanings.
The attribute sys.float_info.dig needs further explanation. If\ns is any string representing a decimal number with at most\nsys.float_info.dig significant digits, then converting s to a\nfloat and back again will recover a string representing the same decimal\nvalue:
\n>>> import sys\n>>> sys.float_info.dig\n15\n>>> s = '3.14159265358979' # decimal string with 15 significant digits\n>>> format(float(s), '.15g') # convert to float and back -> same value\n'3.14159265358979'\n
But for strings with more than sys.float_info.dig significant digits,\nthis isn’t always true:
\n>>> s = '9876543211234567' # 16 significant digits is too many!\n>>> format(float(s), '.16g') # conversion changes value\n'9876543211234568'\n
\nNew in version 2.6.
\nA string indicating how the repr() function behaves for\nfloats. If the string has value 'short' then for a finite\nfloat x, repr(x) aims to produce a short string with the\nproperty that float(repr(x)) == x. This is the usual behaviour\nin Python 2.7 and later. Otherwise, float_repr_style has value\n'legacy' and repr(x) behaves in the same way as it did in\nversions of Python prior to 2.7.
\n\nNew in version 2.7.
\nReturn the interpreter’s “check interval”; see setcheckinterval().
\n\nNew in version 2.3.
\nReturn the name of the current default string encoding used by the Unicode\nimplementation.
\n\nNew in version 2.0.
\nReturn the current value of the flags that are used for dlopen() calls.\nThe flag constants are defined in the dl and DLFCN modules.\nAvailability: Unix.
\n\nNew in version 2.2.
\nReturn the name of the encoding used to convert Unicode filenames into system\nfile names, or None if the system default encoding is used. The result value\ndepends on the operating system:
\n\nNew in version 2.3.
\nReturn the size of an object in bytes. The object can be any type of\nobject. All built-in objects will return correct results, but this\ndoes not have to hold true for third-party extensions as it is implementation\nspecific.
\nIf given, default will be returned if the object does not provide means to\nretrieve the size. Otherwise a TypeError will be raised.
\ngetsizeof() calls the object’s __sizeof__ method and adds an\nadditional garbage collector overhead if the object is managed by the garbage\ncollector.
\n\nNew in version 2.6.
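A brief sketch of the interface; the concrete byte counts are implementation- and platform-specific, so only relative behaviour is shown:

```python
import sys

# Sizes vary between platforms and Python builds.
empty = sys.getsizeof([])
three = sys.getsizeof([1, 2, 3])
print(empty, three)

# The 'default' argument is returned instead of raising TypeError when an
# object provides no way to determine its size (built-in objects always
# do, so this mostly matters for third-party extension objects).
print(sys.getsizeof(object(), 0))
```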
\nReturn a frame object from the call stack. If optional integer depth is\ngiven, return the frame object that many calls below the top of the stack. If\nthat is deeper than the call stack, ValueError is raised. The default\nfor depth is zero, returning the frame at the top of the call stack.
\nCPython implementation detail: This function should be used for internal and specialized purposes only.\nIt is not guaranteed to exist in all implementations of Python.
\nGet the profiler function as set by setprofile().
\n\nNew in version 2.6.
\nGet the trace function as set by settrace().
\nCPython implementation detail: The gettrace() function is intended only for implementing debuggers,\nprofilers, coverage tools and the like. Its behavior is part of the\nimplementation platform, rather than part of the language definition, and\nthus may not be available in all Python implementations.
\n\nNew in version 2.6.
\nReturn a named tuple describing the Windows version\ncurrently running. The named elements are major, minor,\nbuild, platform, service_pack, service_pack_minor,\nservice_pack_major, suite_mask, and product_type.\nservice_pack contains a string while all other values are\nintegers. The components can also be accessed by name, so\nsys.getwindowsversion()[0] is equivalent to\nsys.getwindowsversion().major. For compatibility with prior\nversions, only the first 5 elements are retrievable by indexing.
\nplatform may be one of the following values:
Constant | Platform
---|---
0 (VER_PLATFORM_WIN32s) | Win32s on Windows 3.1
1 (VER_PLATFORM_WIN32_WINDOWS) | Windows 95/98/ME
2 (VER_PLATFORM_WIN32_NT) | Windows NT/2000/XP/x64
3 (VER_PLATFORM_WIN32_CE) | Windows CE
product_type may be one of the following values:
Constant | Meaning
---|---
1 (VER_NT_WORKSTATION) | The system is a workstation.
2 (VER_NT_DOMAIN_CONTROLLER) | The system is a domain controller.
3 (VER_NT_SERVER) | The system is a server, but not a domain controller.
This function wraps the Win32 GetVersionEx() function; see the\nMicrosoft documentation on OSVERSIONINFOEX() for more information\nabout these fields.
\nAvailability: Windows.
\n\nNew in version 2.3.
\n\nChanged in version 2.7: Changed to a named tuple and added service_pack_minor,\nservice_pack_major, suite_mask, and product_type.
\nThe version number encoded as a single integer. This is guaranteed to increase\nwith each version, including proper support for non-production releases. For\nexample, to test that the Python interpreter is at least version 1.5.2, use:
\nif sys.hexversion >= 0x010502F0:\n # use some advanced feature\n ...\nelse:\n # use an alternative implementation or warn the user\n ...\n
This is called hexversion since it only really looks meaningful when viewed\nas the result of passing it to the built-in hex() function. The\nversion_info value may be used for a more human-friendly encoding of the\nsame information.
\nThe hexversion is a 32-bit number with the following layout:
Bits (big endian order) | Meaning
---|---
1-8 | PY_MAJOR_VERSION (the 2 in 2.1.0a3)
9-16 | PY_MINOR_VERSION (the 1 in 2.1.0a3)
17-24 | PY_MICRO_VERSION (the 0 in 2.1.0a3)
25-28 | PY_RELEASE_LEVEL (0xA for alpha, 0xB for beta, 0xC for release candidate and 0xF for final)
29-32 | PY_RELEASE_SERIAL (the 3 in 2.1.0a3, zero for final releases)
Thus 2.1.0a3 is hexversion 0x020100a3.
\n\nNew in version 1.5.2.
\nA struct sequence that holds information about Python’s\ninternal representation of integers. The attributes are read only.
Attribute | Explanation
---|---
bits_per_digit | number of bits held in each digit. Python integers are stored internally in base 2**long_info.bits_per_digit
sizeof_digit | size in bytes of the C type used to represent a digit
\nNew in version 2.7.
\nThese three variables are not always defined; they are set when an exception is\nnot handled and the interpreter prints an error message and a stack traceback.\nTheir intended use is to allow an interactive user to import a debugger module\nand engage in post-mortem debugging without having to re-execute the command\nthat caused the error. (Typical use is import pdb; pdb.pm() to enter the\npost-mortem debugger; see chapter pdb — The Python Debugger for\nmore information.)
\nThe meaning of the variables is the same as that of the return values from\nexc_info() above. (Since there is only one interactive thread,\nthread-safety is not a concern for these variables, unlike for exc_type\netc.)
\nA list of finder objects that have their find_module()\nmethods called to see if one of the objects can find the module to be\nimported. The find_module() method is called at least with the\nabsolute name of the module being imported. If the module to be imported is\ncontained in a package then the parent package’s __path__ attribute\nis passed in as a second argument. The method returns None if\nthe module cannot be found, else returns a loader.
\nsys.meta_path is searched before any implicit default finders or\nsys.path.
\nSee PEP 302 for the original specification.
\nThis is a dictionary that maps module names to modules which have already been\nloaded. This can be manipulated to force reloading of modules and other tricks.\nNote that removing a module from this dictionary is not the same as calling\nreload() on the corresponding module object.
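A short sketch of the cache behaviour, using the standard json module as the example (any importable module would do):

```python
import sys
import json

# Every imported module is cached in sys.modules under its name.
assert sys.modules['json'] is json

# Removing the entry forces the next import to re-execute the module's
# code from scratch (reload(), by contrast, updates the existing object).
del sys.modules['json']
import json as json2     # a fresh module object is created
assert json2 is not json
```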
\nA list of strings that specifies the search path for modules. Initialized from\nthe environment variable PYTHONPATH, plus an installation-dependent\ndefault.
\nAs initialized upon program startup, the first item of this list, path[0],\nis the directory containing the script that was used to invoke the Python\ninterpreter. If the script directory is not available (e.g. if the interpreter\nis invoked interactively or if the script is read from standard input),\npath[0] is the empty string, which directs Python to search modules in the\ncurrent directory first. Notice that the script directory is inserted before\nthe entries inserted as a result of PYTHONPATH.
\nA program is free to modify this list for its own purposes.
\n\nChanged in version 2.3: Unicode strings are no longer ignored.
\n\nA list of callables that take a path argument to try to create a\nfinder for the path. If a finder can be created, it is to be\nreturned by the callable, else raise ImportError.
\nOriginally specified in PEP 302.
\nA dictionary acting as a cache for finder objects. The keys are\npaths that have been passed to sys.path_hooks and the values are\nthe finders that are found. If a path is a valid file system path but no\nexplicit finder is found on sys.path_hooks then None is\nstored to represent that the implicit default finder should be used. If the path\nis not an existing path then imp.NullImporter is set.
\nOriginally specified in PEP 302.
\nThis string contains a platform identifier that can be used to append\nplatform-specific components to sys.path, for instance.
\nFor most Unix systems, this is the lowercased OS name as returned by uname\n-s with the first part of the version as returned by uname -r appended,\ne.g. 'sunos5', at the time when Python was built. Unless you want to\ntest for a specific system version, it is therefore recommended to use the\nfollowing idiom:
\nif sys.platform.startswith('freebsd'):\n # FreeBSD-specific code here...\nelif sys.platform.startswith('linux'):\n # Linux-specific code here...
\n\nChanged in version 2.7.3: Since lots of code check for sys.platform == 'linux2', and there is\nno essential change between Linux 2.x and 3.x, sys.platform is always\nset to 'linux2', even on Linux 3.x. In Python 3.3 and later, the\nvalue will always be set to 'linux', so it is recommended to always\nuse the startswith idiom presented above.
\nFor other systems, the values are:
System | platform value
---|---
Linux (2.x and 3.x) | 'linux2'
Windows | 'win32'
Windows/Cygwin | 'cygwin'
Mac OS X | 'darwin'
OS/2 | 'os2'
OS/2 EMX | 'os2emx'
RiscOS | 'riscos'
AtheOS | 'atheos'
See also
\nos.name has a coarser granularity. os.uname() gives\nsystem-dependent version information.
\nThe platform module provides detailed checks for the\nsystem’s identity.
\nStrings specifying the primary and secondary prompt of the interpreter. These\nare only defined if the interpreter is in interactive mode. Their initial\nvalues in this case are '>>> ' and '... '. If a non-string object is\nassigned to either variable, its str() is re-evaluated each time the\ninterpreter prepares to read a new interactive command; this can be used to\nimplement a dynamic prompt.
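A minimal sketch of a dynamic prompt; LineCounter is a hypothetical example class, and in a real interactive session one would assign an instance to sys.ps1:

```python
import sys

class LineCounter(object):
    """Hypothetical example: str() is taken afresh for each prompt."""
    def __init__(self):
        self.count = 0
    def __str__(self):
        self.count += 1
        return '[%d]>>> ' % self.count

prompt = LineCounter()
# In an interactive session: sys.ps1 = prompt
print(str(prompt))  # [1]>>>
print(str(prompt))  # [2]>>>
```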
\nBool containing the status of the Python 3.0 warning flag. It’s True\nwhen Python is started with the -3 option. (This should be considered\nread-only; setting it to a different value doesn’t have an effect on\nPython 3.0 warnings.)
\n\nNew in version 2.6.
\nSet the current default string encoding used by the Unicode implementation. If\nname does not match any available encoding, LookupError is raised.\nThis function is only intended to be used by the site module\nimplementation and, where needed, by sitecustomize. Once used by the\nsite module, it is removed from the sys module’s namespace.
\n\nNew in version 2.0.
\nSet the flags used by the interpreter for dlopen() calls, such as when\nthe interpreter loads extension modules. Among other things, this will enable a\nlazy resolving of symbols when importing a module, if called as\nsys.setdlopenflags(0). To share symbols across extension modules, call as\nsys.setdlopenflags(dl.RTLD_NOW | dl.RTLD_GLOBAL). Symbolic names for the\nflag modules can be either found in the dl module, or in the DLFCN\nmodule. If DLFCN is not available, it can be generated from\n/usr/include/dlfcn.h using the h2py script. Availability:\nUnix.
\n\nNew in version 2.2.
\nSet the system’s profile function, which allows you to implement a Python source\ncode profiler in Python. See chapter The Python Profilers for more information on the\nPython profiler. The system’s profile function is called similarly to the\nsystem’s trace function (see settrace()), but it isn’t called for each\nexecuted line of code (only on call and return, but the return event is reported\neven when an exception has been set). The function is thread-specific, but\nthere is no way for the profiler to know about context switches between threads,\nso it does not make sense to use this in the presence of multiple threads. Also,\nits return value is not used, so it can simply return None.
\nSet the maximum depth of the Python interpreter stack to limit. This limit\nprevents infinite recursion from causing an overflow of the C stack and crashing\nPython.
\nThe highest possible limit is platform-dependent. A user may need to set the\nlimit higher when she has a program that requires deep recursion and a platform\nthat supports a higher limit. This should be done with care, because a too-high\nlimit can lead to a crash.
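A small sketch that provokes and catches the recursion error; RuntimeError also covers the RecursionError subclass used by Python 3, and the limit of 200 is an arbitrary illustration value:

```python
import sys

def depth(n):
    # Recurse until the interpreter's limit stops us.
    try:
        return depth(n + 1)
    except RuntimeError:
        return n

old = sys.getrecursionlimit()
sys.setrecursionlimit(200)      # arbitrary illustration value
reached = depth(0)              # somewhat below 200: existing frames count too
sys.setrecursionlimit(old)
print(reached)
```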
\nSet the system’s trace function, which allows you to implement a Python\nsource code debugger in Python. The function is thread-specific; for a\ndebugger to support multiple threads, it must be registered using\nsettrace() for each thread being debugged.
\nTrace functions should have three arguments: frame, event, and\narg. frame is the current stack frame. event is a string: 'call',\n'line', 'return', 'exception', 'c_call', 'c_return', or\n'c_exception'. arg depends on the event type.
\nThe trace function is invoked (with event set to 'call') whenever a new\nlocal scope is entered; it should return a reference to a local trace\nfunction to be used in that scope, or None if the scope shouldn’t be traced.
\nThe local trace function should return a reference to itself (or to another\nfunction for further tracing in that scope), or None to turn off tracing\nin that scope.
\nThe events have the following meaning:
\nNote that as an exception is propagated down the chain of callers, an\n'exception' event is generated at each level.
\nFor more information on code and frame objects, refer to The standard type hierarchy.
\nCPython implementation detail: The settrace() function is intended only for implementing debuggers,\nprofilers, coverage tools and the like. Its behavior is part of the\nimplementation platform, rather than part of the language definition, and\nthus may not be available in all Python implementations.
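A minimal sketch of the protocol described above: a global tracer that handles 'call' events and returns itself as the local trace function, recording only calls and returns:

```python
import sys

events = []

def tracer(frame, event, arg):
    # Global trace function: called with 'call' when a new scope is
    # entered; returning itself makes it the local trace function too,
    # so it also receives 'line' and 'return' events for that scope.
    if event in ('call', 'return'):
        events.append((event, frame.f_code.co_name))
    return tracer

def add(a, b):
    return a + b

sys.settrace(tracer)
add(1, 2)
sys.settrace(None)
print(events)  # [('call', 'add'), ('return', 'add')]
```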
\nActivate dumping of VM measurements using the Pentium timestamp counter, if\non_flag is true. Deactivate these dumps if on_flag is off. The function is\navailable only if Python was compiled with --with-tsc. To understand\nthe output of this dump, read Python/ceval.c in the Python sources.
\n\nNew in version 2.4.
\nCPython implementation detail: This function is intimately bound to CPython implementation details and\nthus not likely to be implemented elsewhere.
\nFile objects corresponding to the interpreter’s standard input, output and error\nstreams. stdin is used for all interpreter input except for scripts but\nincluding calls to input() and raw_input(). stdout is used for\nthe output of print and expression statements and for the\nprompts of input() and raw_input(). The interpreter’s own prompts\nand (almost all of) its error messages go to stderr. stdout and\nstderr needn’t be built-in file objects: any object is acceptable as long\nas it has a write() method that takes a string argument. (Changing these\nobjects doesn’t affect the standard I/O streams of processes executed by\nos.popen(), os.system() or the exec*() family of functions in\nthe os module.)
\nThese objects contain the original values of stdin, stderr and\nstdout at the start of the program. They are used during finalization,\nand can be useful for printing to the actual standard stream even if the\ncorresponding sys.std* object has been redirected.
\nIt can also be used to restore the actual files to known working file objects\nin case they have been overwritten with a broken object. However, the\npreferred way to do this is to explicitly save the previous stream before\nreplacing it, and restore the saved object.
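The preferred save-and-restore pattern described above can be sketched as follows. The Capture class is invented for illustration; as the text notes, any object with a write() method taking a string argument will do:

```python
import sys

class Capture(object):
    """Any object with a write() method taking a string can stand in for sys.stdout."""
    def __init__(self):
        self.chunks = []
    def write(self, text):
        self.chunks.append(text)

def captured_output():
    saved = sys.stdout            # explicitly save the previous stream...
    sys.stdout = Capture()
    try:
        sys.stdout.write("hello\n")
        return "".join(sys.stdout.chunks)
    finally:
        sys.stdout = saved        # ...and restore the saved object afterwards
```

Restoring in a finally clause guarantees the real stream comes back even if the traced code raises.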
\nA triple (repo, branch, version) representing the Subversion information of the\nPython interpreter. repo is the name of the repository, 'CPython'.\nbranch is a string of one of the forms 'trunk', 'branches/name' or\n'tags/name'. version is the output of svnversion, if the interpreter\nwas built from a Subversion checkout; it contains the revision number (range)\nand possibly a trailing ‘M’ if there were local modifications. If the tree was\nexported (or svnversion was not available), it is the revision of\nInclude/patchlevel.h if the branch is a tag. Otherwise, it is None.
\n\nNew in version 2.5.
\nNote
\nPython is now developed using\nMercurial. In recent Python 2.7 bugfix releases, subversion\ntherefore contains placeholder information. It is removed in Python\n3.3.
\nThe C API version for this interpreter. Programmers may find this useful when\ndebugging version conflicts between Python and extension modules.
\n\nNew in version 2.3.
\nA tuple containing the five components of the version number: major, minor,\nmicro, releaselevel, and serial. All values except releaselevel are\nintegers; the release level is 'alpha', 'beta', 'candidate', or\n'final'. The version_info value corresponding to the Python version 2.0\nis (2, 0, 0, 'final', 0). The components can also be accessed by name,\nso sys.version_info[0] is equivalent to sys.version_info.major\nand so on.
\n\nNew in version 2.0.
\n\nChanged in version 2.7: Added named component attributes
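For example, the equivalence between indexed and named access can be checked directly (the named attributes require 2.7 or later):

```python
import sys

# Indexed access works on every version; named attributes were added in 2.7.
major, minor = sys.version_info[0], sys.version_info[1]
assert major == sys.version_info.major
assert minor == sys.version_info.minor

# releaselevel is always one of these four strings.
assert sys.version_info[3] in ('alpha', 'beta', 'candidate', 'final')
```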
This module represents the (otherwise anonymous) scope in which the\ninterpreter’s main program executes — commands read either from standard\ninput, from a script file, or from an interactive prompt. It is this\nenvironment in which the idiomatic “conditional script” stanza causes a script\nto run:
\nif __name__ == "__main__":\n main()\n
Source code: Lib/trace.py
\nThe trace module allows you to trace program execution, generate\nannotated statement coverage listings, print caller/callee relationships and\nlist functions executed during a program run. It can be used in another program\nor from the command line.
\nThe trace module can be invoked from the command line. It can be as\nsimple as
\npython -m trace --count -C . somefile.py ...
\nThe above will execute somefile.py and generate annotated listings of\nall Python modules imported during the execution into the current directory.
At least one of the following options must be specified when invoking trace. The --listfuncs option is mutually exclusive with the --trace and --counts options. When --listfuncs is provided, neither --counts nor --trace are accepted, and vice versa.
\nThese options may be repeated multiple times.
\nCreate an object to trace execution of a single statement or expression. All\nparameters are optional. count enables counting of line numbers. trace\nenables line execution tracing. countfuncs enables listing of the\nfunctions called during the run. countcallers enables call relationship\ntracking. ignoremods is a list of modules or packages to ignore.\nignoredirs is a list of directories whose modules or packages should be\nignored. infile is the name of the file from which to read stored count\ninformation. outfile is the name of the file in which to write updated\ncount information. timing enables a timestamp relative to when tracing was\nstarted to be displayed.
run(cmd)
Execute the command and gather statistics from the execution with the current tracing parameters. cmd must be a string or code object, suitable for passing into exec().

runctx(cmd[, globals=None[, locals=None]])
Execute the command and gather statistics from the execution with the current tracing parameters, in the defined global and local environments. If not defined, globals and locals default to empty dictionaries.

runfunc(func, *args, **kwds)
Call func with the given arguments under control of the Trace object with the current tracing parameters.

results()
Return a CoverageResults object that contains the cumulative results of all previous calls to run, runctx and runfunc for the given Trace instance. Does not reset the accumulated trace results.
A container for coverage results, created by Trace.results(). Should\nnot be created directly by the user.
update(other)
Merge in data from another CoverageResults object.

write_results([show_missing=True[, summary=False[, coverdir=None]]])
Write coverage results. Set show_missing to show lines that had no hits. Set summary to include in the output the coverage summary per module. coverdir specifies the directory into which the coverage result files will be output. If None, the results for each source file are placed in its directory.
A simple example demonstrating the use of the programmatic interface:
\nimport sys\nimport trace\n\n# create a Trace object, telling it what to ignore, and whether to\n# do tracing or line-counting or both.\ntracer = trace.Trace(\n ignoredirs=[sys.prefix, sys.exec_prefix],\n trace=0,\n count=1)\n\n# run the new command using the given tracer\ntracer.run('main()')\n\n# make a report, placing output in /tmp\nr = tracer.results()\nr.write_results(show_missing=True, coverdir="/tmp")\n
This module provides a standard interface to extract, format and print stack\ntraces of Python programs. It exactly mimics the behavior of the Python\ninterpreter when it prints a stack trace. This is useful when you want to print\nstack traces under program control, such as in a “wrapper” around the\ninterpreter.
\nThe module uses traceback objects — this is the object type that is stored in\nthe variables sys.exc_traceback (deprecated) and sys.last_traceback and\nreturned as the third item from sys.exc_info().
\nThe module defines the following functions:
\nThis is like print_exc(limit) but returns a string instead of printing to a\nfile.
\n\nNew in version 2.4.
\nThis simple example implements a basic read-eval-print loop, similar to (but\nless useful than) the standard Python interactive interpreter loop. For a more\ncomplete implementation of the interpreter loop, refer to the code\nmodule.
\nimport sys, traceback\n\ndef run_user_code(envdir):\n source = raw_input(">>> ")\n try:\n exec source in envdir\n except:\n print "Exception in user code:"\n print '-'*60\n traceback.print_exc(file=sys.stdout)\n print '-'*60\n\nenvdir = {}\nwhile 1:\n run_user_code(envdir)\n
The following example demonstrates the different ways to print and format the\nexception and traceback:
\nimport sys, traceback\n\ndef lumberjack():\n bright_side_of_death()\n\ndef bright_side_of_death():\n return tuple()[0]\n\ntry:\n lumberjack()\nexcept IndexError:\n exc_type, exc_value, exc_traceback = sys.exc_info()\n print "*** print_tb:"\n traceback.print_tb(exc_traceback, limit=1, file=sys.stdout)\n print "*** print_exception:"\n traceback.print_exception(exc_type, exc_value, exc_traceback,\n limit=2, file=sys.stdout)\n print "*** print_exc:"\n traceback.print_exc()\n print "*** format_exc, first and last line:"\n formatted_lines = traceback.format_exc().splitlines()\n print formatted_lines[0]\n print formatted_lines[-1]\n print "*** format_exception:"\n print repr(traceback.format_exception(exc_type, exc_value,\n exc_traceback))\n print "*** extract_tb:"\n print repr(traceback.extract_tb(exc_traceback))\n print "*** format_tb:"\n print repr(traceback.format_tb(exc_traceback))\n print "*** tb_lineno:", exc_traceback.tb_lineno\n
The output for the example would look similar to this:
\n*** print_tb:\n File \"<doctest...>\", line 10, in <module>\n lumberjack()\n*** print_exception:\nTraceback (most recent call last):\n File \"<doctest...>\", line 10, in <module>\n lumberjack()\n File \"<doctest...>\", line 4, in lumberjack\n bright_side_of_death()\nIndexError: tuple index out of range\n*** print_exc:\nTraceback (most recent call last):\n File \"<doctest...>\", line 10, in <module>\n lumberjack()\n File \"<doctest...>\", line 4, in lumberjack\n bright_side_of_death()\nIndexError: tuple index out of range\n*** format_exc, first and last line:\nTraceback (most recent call last):\nIndexError: tuple index out of range\n*** format_exception:\n['Traceback (most recent call last):\\n',\n ' File \"<doctest...>\", line 10, in <module>\\n lumberjack()\\n',\n ' File \"<doctest...>\", line 4, in lumberjack\\n bright_side_of_death()\\n',\n ' File \"<doctest...>\", line 7, in bright_side_of_death\\n return tuple()[0]\\n',\n 'IndexError: tuple index out of range\\n']\n*** extract_tb:\n[('<doctest...>', 10, '<module>', 'lumberjack()'),\n ('<doctest...>', 4, 'lumberjack', 'bright_side_of_death()'),\n ('<doctest...>', 7, 'bright_side_of_death', 'return tuple()[0]')]\n*** format_tb:\n[' File \"<doctest...>\", line 10, in <module>\\n lumberjack()\\n',\n ' File \"<doctest...>\", line 4, in lumberjack\\n bright_side_of_death()\\n',\n ' File \"<doctest...>\", line 7, in bright_side_of_death\\n return tuple()[0]\\n']\n*** tb_lineno: 10
\nThe following example shows the different ways to print and format the stack:
\n>>> import traceback\n>>> def another_function():\n... lumberstack()\n...\n>>> def lumberstack():\n... traceback.print_stack()\n... print repr(traceback.extract_stack())\n... print repr(traceback.format_stack())\n...\n>>> another_function()\n File "<doctest>", line 10, in <module>\n another_function()\n File "<doctest>", line 3, in another_function\n lumberstack()\n File "<doctest>", line 6, in lumberstack\n traceback.print_stack()\n[('<doctest>', 10, '<module>', 'another_function()'),\n ('<doctest>', 3, 'another_function', 'lumberstack()'),\n ('<doctest>', 7, 'lumberstack', 'print repr(traceback.extract_stack())')]\n[' File "<doctest>", line 10, in <module>\\n another_function()\\n',\n ' File "<doctest>", line 3, in another_function\\n lumberstack()\\n',\n ' File "<doctest>", line 8, in lumberstack\\n print repr(traceback.format_stack())\\n']\n
This last example demonstrates the final few formatting functions:
\n>>> import traceback\n>>> traceback.format_list([('spam.py', 3, '<module>', 'spam.eggs()'),\n... ('eggs.py', 42, 'eggs', 'return "bacon"')])\n[' File "spam.py", line 3, in <module>\\n spam.eggs()\\n',\n ' File "eggs.py", line 42, in eggs\\n return "bacon"\\n']\n>>> an_error = IndexError('tuple index out of range')\n>>> traceback.format_exception_only(type(an_error), an_error)\n['IndexError: tuple index out of range\\n']\n
\nNew in version 2.0.
\nSource code: Lib/atexit.py
\nThe atexit module defines a single function to register cleanup\nfunctions. Functions thus registered are automatically executed upon normal\ninterpreter termination. The order in which the functions are called is not\ndefined; if you have cleanup operations that depend on each other, you should\nwrap them in a function and register that one. This keeps atexit simple.
\nNote: the functions registered via this module are not called when the program\nis killed by a signal not handled by Python, when a Python fatal internal error\nis detected, or when os._exit() is called.
\nThis is an alternate interface to the functionality provided by the\nsys.exitfunc variable.
\nNote: This module is unlikely to work correctly when used with other code that\nsets sys.exitfunc. In particular, other core Python modules are free to use\natexit without the programmer’s knowledge. Authors who use\nsys.exitfunc should convert their code to use atexit instead. The\nsimplest way to convert code that sets sys.exitfunc is to import\natexit and register the function that had been bound to sys.exitfunc.
\nRegister func as a function to be executed at termination. Any optional\narguments that are to be passed to func must be passed as arguments to\nregister().
\nAt normal program termination (for instance, if sys.exit() is called or\nthe main module’s execution completes), all functions registered are called in\nlast in, first out order. The assumption is that lower level modules will\nnormally be imported before higher level modules and thus must be cleaned up\nlater.
If an exception is raised during execution of the exit handlers, a traceback is printed (unless SystemExit is raised) and the exception information is saved. After all exit handlers have had a chance to run, the last exception to be raised is re-raised.
\n\nChanged in version 2.6: This function now returns func which makes it possible to use it as a\ndecorator without binding the original name to None.
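The last-in, first-out ordering can be observed by running registered handlers in a child interpreter; this sketch shells out to the current interpreter with subprocess (the script text and label strings are invented for illustration):

```python
import subprocess
import sys

# A tiny script that registers two handlers; at interpreter exit they
# run in last-in, first-out order.
SCRIPT = (
    "import atexit, sys\n"
    "def make(label):\n"
    "    def handler():\n"
    "        sys.stdout.write(label + '\\n')\n"
    "    return handler\n"
    "atexit.register(make('registered first'))\n"
    "atexit.register(make('registered second'))\n"
)

def exit_order():
    out = subprocess.check_output([sys.executable, "-c", SCRIPT])
    return out.decode().splitlines()
```

The handler registered last runs first, so exit_order() returns the labels in reverse registration order.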
\nSee also
\n\nThe following simple example demonstrates how a module can initialize a counter\nfrom a file when it is imported and save the counter’s updated value\nautomatically when the program terminates without relying on the application\nmaking an explicit call into this module at termination.
try:
    _count = int(open("/tmp/counter").read())
except IOError:
    _count = 0

def incrcounter(n):
    global _count
    _count = _count + n

def savecounter():
    open("/tmp/counter", "w").write("%d" % _count)

import atexit
atexit.register(savecounter)
Positional and keyword arguments may also be passed to register() to be\npassed along to the registered function when it is called:
def goodbye(name, adjective):
    print 'Goodbye, %s, it was %s to meet you.' % (name, adjective)

import atexit
atexit.register(goodbye, 'Donny', 'nice')

# or:
atexit.register(goodbye, adjective='nice', name='Donny')
Usage as a decorator:
\nimport atexit\n\n@atexit.register\ndef goodbye():\n print "You are now leaving the Python sector."\n
This obviously only works with functions that don’t take arguments.
\n\nNew in version 2.1.
\nSource code: Lib/warnings.py
\nWarning messages are typically issued in situations where it is useful to alert\nthe user of some condition in a program, where that condition (normally) doesn’t\nwarrant raising an exception and terminating the program. For example, one\nmight want to issue a warning when a program uses an obsolete module.
\nPython programmers issue warnings by calling the warn() function defined\nin this module. (C programmers use PyErr_WarnEx(); see\nException Handling for details).
\nWarning messages are normally written to sys.stderr, but their disposition\ncan be changed flexibly, from ignoring all warnings to turning them into\nexceptions. The disposition of warnings can vary based on the warning category\n(see below), the text of the warning message, and the source location where it\nis issued. Repetitions of a particular warning for the same source location are\ntypically suppressed.
\nThere are two stages in warning control: first, each time a warning is issued, a\ndetermination is made whether a message should be issued or not; next, if a\nmessage is to be issued, it is formatted and printed using a user-settable hook.
\nThe determination whether to issue a warning message is controlled by the\nwarning filter, which is a sequence of matching rules and actions. Rules can be\nadded to the filter by calling filterwarnings() and reset to its default\nstate by calling resetwarnings().
\nThe printing of warning messages is done by calling showwarning(), which\nmay be overridden; the default implementation of this function formats the\nmessage by calling formatwarning(), which is also available for use by\ncustom implementations.
\nSee also
\nlogging.captureWarnings() allows you to handle all warnings with\nthe standard logging infrastructure.
\nThere are a number of built-in exceptions that represent warning categories.\nThis categorization is useful to be able to filter out groups of warnings. The\nfollowing warnings category classes are currently defined:
| Class | Description |
|---|---|
| Warning | This is the base class of all warning category classes. It is a subclass of Exception. |
| UserWarning | The default category for warn(). |
| DeprecationWarning | Base category for warnings about deprecated features (ignored by default). |
| SyntaxWarning | Base category for warnings about dubious syntactic features. |
| RuntimeWarning | Base category for warnings about dubious runtime features. |
| FutureWarning | Base category for warnings about constructs that will change semantically in the future. |
| PendingDeprecationWarning | Base category for warnings about features that will be deprecated in the future (ignored by default). |
| ImportWarning | Base category for warnings triggered during the process of importing a module (ignored by default). |
| UnicodeWarning | Base category for warnings related to Unicode. |
While these are technically built-in exceptions, they are documented here,\nbecause conceptually they belong to the warnings mechanism.
\nUser code can define additional warning categories by subclassing one of the\nstandard warning categories. A warning category must always be a subclass of\nthe Warning class.
\n\nChanged in version 2.7: DeprecationWarning is ignored by default.
\nThe warnings filter controls whether warnings are ignored, displayed, or turned\ninto errors (raising an exception).
Conceptually, the warnings filter maintains an ordered list of filter specifications; any specific warning is matched against each filter specification in the list in turn until a match is found; the match determines the disposition of the warning. Each entry is a tuple of the form (action, message, category, module, lineno), where:
\naction is one of the following strings:
| Value | Disposition |
|---|---|
| "error" | turn matching warnings into exceptions |
| "ignore" | never print matching warnings |
| "always" | always print matching warnings |
| "default" | print the first occurrence of matching warnings for each location where the warning is issued |
| "module" | print the first occurrence of matching warnings for each module where the warning is issued |
| "once" | print only the first occurrence of matching warnings, regardless of location |
message is a string containing a regular expression that the warning message\nmust match (the match is compiled to always be case-insensitive).
\ncategory is a class (a subclass of Warning) of which the warning\ncategory must be a subclass in order to match.
\nmodule is a string containing a regular expression that the module name must\nmatch (the match is compiled to be case-sensitive).
\nlineno is an integer that the line number where the warning occurred must\nmatch, or 0 to match all line numbers.
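These five fields map directly onto the arguments of filterwarnings(). A minimal sketch, wrapped in catch_warnings so the global filter is restored afterwards; the rule order matters because filterwarnings() prepends by default, so the broad "ignore" rule is appended to keep the specific "error" rule first:

```python
import warnings

def classify():
    results = []
    with warnings.catch_warnings():
        warnings.resetwarnings()
        # (action, message, category, module, lineno): the message and
        # module fields are regular expressions; lineno=0 matches any line.
        warnings.filterwarnings("error", message="obsolete",
                                category=UserWarning, module="", lineno=0)
        # append=True places this rule after the one above, so the more
        # specific "error" rule is consulted first.
        warnings.filterwarnings("ignore", category=UserWarning, append=True)
        try:
            warnings.warn("obsolete call")      # matched by the "error" rule
        except UserWarning:
            results.append("raised")
        warnings.warn("something else")         # falls through to "ignore"
        results.append("ignored")
    return results
```

Note that the message regex is matched against the start of the warning text, case-insensitively, exactly as described above.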
\nSince the Warning class is derived from the built-in Exception\nclass, to turn a warning into an error we simply raise category(message).
\nThe warnings filter is initialized by -W options passed to the Python\ninterpreter command line. The interpreter saves the arguments for all\n-W options without interpretation in sys.warnoptions; the\nwarnings module parses these when it is first imported (invalid options\nare ignored, after printing a message to sys.stderr).
\nBy default, Python installs several warning filters, which can be overridden by\nthe command-line options passed to -W and calls to\nfilterwarnings().
\nIf you are using code that you know will raise a warning, such as a deprecated\nfunction, but do not want to see the warning, then it is possible to suppress\nthe warning using the catch_warnings context manager:
\nimport warnings\n\ndef fxn():\n warnings.warn("deprecated", DeprecationWarning)\n\nwith warnings.catch_warnings():\n warnings.simplefilter("ignore")\n fxn()\n
While within the context manager all warnings will simply be ignored. This\nallows you to use known-deprecated code without having to see the warning while\nnot suppressing the warning for other code that might not be aware of its use\nof deprecated code. Note: this can only be guaranteed in a single-threaded\napplication. If two or more threads use the catch_warnings context\nmanager at the same time, the behavior is undefined.
\nTo test warnings raised by code, use the catch_warnings context\nmanager. With it you can temporarily mutate the warnings filter to facilitate\nyour testing. For instance, do the following to capture all raised warnings to\ncheck:
\nimport warnings\n\ndef fxn():\n warnings.warn("deprecated", DeprecationWarning)\n\nwith warnings.catch_warnings(record=True) as w:\n # Cause all warnings to always be triggered.\n warnings.simplefilter("always")\n # Trigger a warning.\n fxn()\n # Verify some things\n assert len(w) == 1\n assert issubclass(w[-1].category, DeprecationWarning)\n assert "deprecated" in str(w[-1].message)\n
One can also cause all warnings to be exceptions by using error instead of\nalways. One thing to be aware of is that if a warning has already been\nraised because of a once/default rule, then no matter what filters are\nset the warning will not be seen again unless the warnings registry related to\nthe warning has been cleared.
\nOnce the context manager exits, the warnings filter is restored to its state\nwhen the context was entered. This prevents tests from changing the warnings\nfilter in unexpected ways between tests and leading to indeterminate test\nresults. The showwarning() function in the module is also restored to\nits original value. Note: this can only be guaranteed in a single-threaded\napplication. If two or more threads use the catch_warnings context\nmanager at the same time, the behavior is undefined.
\nWhen testing multiple operations that raise the same kind of warning, it\nis important to test them in a manner that confirms each operation is raising\na new warning (e.g. set warnings to be raised as exceptions and check the\noperations raise exceptions, check that the length of the warning list\ncontinues to increase after each operation, or else delete the previous\nentries from the warnings list before each new operation).
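The "length continues to increase" check suggested above can be written as a short sketch:

```python
import warnings

def op():
    # Stands in for any operation under test that issues a warning.
    warnings.warn("deprecated", DeprecationWarning)

def count_growth():
    with warnings.catch_warnings(record=True) as w:
        # "always" defeats the once/default registry, so every call to
        # op() produces a fresh entry in the recorded list.
        warnings.simplefilter("always")
        op()
        first = len(w)
        op()
        second = len(w)
    return first, second
```

If the second count does not exceed the first, the second operation silently failed to warn.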
\nWarnings that are only of interest to the developer are ignored by default. As\nsuch you should make sure to test your code with typically ignored warnings\nmade visible. You can do this from the command-line by passing -Wd\nto the interpreter (this is shorthand for -W default). This enables\ndefault handling for all warnings, including those that are ignored by default.\nTo change what action is taken for encountered warnings you simply change what\nargument is passed to -W, e.g. -W error. See the\n-W flag for more details on what is possible.
\nTo programmatically do the same as -Wd, use:
\nwarnings.simplefilter('default')\n
Make sure to execute this code as soon as possible. This prevents the\nregistering of what warnings have been raised from unexpectedly influencing how\nfuture warnings are treated.
\nHaving certain warnings ignored by default is done to prevent a user from\nseeing warnings that are only of interest to the developer. As you do not\nnecessarily have control over what interpreter a user uses to run their code,\nit is possible that a new version of Python will be released between your\nrelease cycles. The new interpreter release could trigger new warnings in your\ncode that were not there in an older interpreter, e.g.\nDeprecationWarning for a module that you are using. While you as a\ndeveloper want to be notified that your code is using a deprecated module, to a\nuser this information is essentially noise and provides no benefit to them.
Issue a warning, or maybe ignore it or raise an exception. The category argument, if given, must be a warning category class (see above); it defaults to UserWarning. Alternatively, message can be a Warning instance, in which case category will be ignored and message.__class__ will be used. In this case the message text will be str(message). This function raises an exception if the particular warning issued is changed into an error by the warnings filter (see above). The stacklevel argument can be used by wrapper functions written in Python, like this:
\ndef deprecation(message):\n warnings.warn(message, DeprecationWarning, stacklevel=2)\n
This makes the warning refer to deprecation()‘s caller, rather than to the\nsource of deprecation() itself (since the latter would defeat the purpose\nof the warning message).
\nThis is a low-level interface to the functionality of warn(), passing in\nexplicitly the message, category, filename and line number, and optionally the\nmodule name and the registry (which should be the __warningregistry__\ndictionary of the module). The module name defaults to the filename with\n.py stripped; if no registry is passed, the warning is never suppressed.\nmessage must be a string and category a subclass of Warning or\nmessage may be a Warning instance, in which case category will be\nignored.
\nmodule_globals, if supplied, should be the global namespace in use by the code\nfor which the warning is issued. (This argument is used to support displaying\nsource for modules found in zipfiles or other non-filesystem import\nsources).
\n\nChanged in version 2.5: Added the module_globals parameter.
Issue a warning related to Python 3.x deprecation. Warnings are only shown when Python is started with the -3 option. Like warn(), message must be a string and category a subclass of Warning. warnpy3k() uses DeprecationWarning as the default warning class.
\n\nNew in version 2.6.
\nWrite a warning to a file. The default implementation calls\nformatwarning(message, category, filename, lineno, line) and writes the\nresulting string to file, which defaults to sys.stderr. You may replace\nthis function with an alternative implementation by assigning to\nwarnings.showwarning.\nline is a line of source code to be included in the warning\nmessage; if line is not supplied, showwarning() will\ntry to read the line specified by filename and lineno.
\n\nChanged in version 2.7: The line argument is required to be supported.
\nFormat a warning the standard way. This returns a string which may contain\nembedded newlines and ends in a newline. line is a line of source code to\nbe included in the warning message; if line is not supplied,\nformatwarning() will try to read the line specified by filename and\nlineno.
\n\nChanged in version 2.6: Added the line argument.
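Replacing showwarning() with a function of the same signature routes every displayed warning through the replacement, and formatwarning() remains available to produce the standard text inside it. A sketch (the hook and collect_warning_text names are invented for this example):

```python
import warnings

def collect_warning_text():
    collected = []
    def hook(message, category, filename, lineno, file=None, line=None):
        # Reuse the standard formatting inside the custom hook.
        collected.append(warnings.formatwarning(
            message, category, filename, lineno, line))
    with warnings.catch_warnings():     # restores showwarning on exit
        warnings.simplefilter("always")
        warnings.showwarning = hook
        warnings.warn("look out", UserWarning)
    return collected
```

Doing the replacement inside catch_warnings ensures the original showwarning() is restored automatically.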
\nA context manager that copies and, upon exit, restores the warnings filter\nand the showwarning() function.\nIf the record argument is False (the default) the context manager\nreturns None on entry. If record is True, a list is\nreturned that is progressively populated with objects as seen by a custom\nshowwarning() function (which also suppresses output to sys.stdout).\nEach object in the list has attributes with the same names as the arguments to\nshowwarning().
\nThe module argument takes a module that will be used instead of the\nmodule returned when you import warnings whose filter will be\nprotected. This argument exists primarily for testing the warnings\nmodule itself.
\nNote
\nThe catch_warnings manager works by replacing and\nthen later restoring the module’s\nshowwarning() function and internal list of filter\nspecifications. This means the context manager is modifying\nglobal state and therefore is not thread-safe.
\nNote
\nIn Python 3.0, the arguments to the constructor for\ncatch_warnings are keyword-only arguments.
\n\nNew in version 2.6.
\n\nNew in version 2.6.
\nSource code: Lib/abc.py
\nThis module provides the infrastructure for defining abstract base\nclasses (ABCs) in Python, as outlined in PEP 3119; see the PEP for why this\nwas added to Python. (See also PEP 3141 and the numbers module\nregarding a type hierarchy for numbers based on ABCs.)
\nThe collections module has some concrete classes that derive from\nABCs; these can, of course, be further derived. In addition the\ncollections module has some ABCs that can be used to test whether\na class or instance provides a particular interface, for example, is it\nhashable or a mapping.
\nThis module provides the following class:
\nMetaclass for defining Abstract Base Classes (ABCs).
\nUse this metaclass to create an ABC. An ABC can be subclassed directly, and\nthen acts as a mix-in class. You can also register unrelated concrete\nclasses (even built-in classes) and unrelated ABCs as “virtual subclasses” –\nthese and their descendants will be considered subclasses of the registering\nABC by the built-in issubclass() function, but the registering ABC\nwon’t show up in their MRO (Method Resolution Order) nor will method\nimplementations defined by the registering ABC be callable (not even via\nsuper()). [1]
\nClasses created with a metaclass of ABCMeta have the following method:
\nRegister subclass as a “virtual subclass” of this ABC. For\nexample:
\nfrom abc import ABCMeta\n\nclass MyABC:\n __metaclass__ = ABCMeta\n\nMyABC.register(tuple)\n\nassert issubclass(tuple, MyABC)\nassert isinstance((), MyABC)\n
You can also override this method in an abstract base class:
\n(Must be defined as a class method.)
\nCheck whether subclass is considered a subclass of this ABC. This means\nthat you can customize the behavior of issubclass further without the\nneed to call register() on every class you want to consider a\nsubclass of the ABC. (This class method is called from the\n__subclasscheck__() method of the ABC.)
\nThis method should return True, False or NotImplemented. If\nit returns True, the subclass is considered a subclass of this ABC.\nIf it returns False, the subclass is not considered a subclass of\nthis ABC, even if it would normally be one. If it returns\nNotImplemented, the subclass check is continued with the usual\nmechanism.
\nFor a demonstration of these concepts, look at this example ABC definition:
\nclass Foo(object):\n def __getitem__(self, index):\n ...\n def __len__(self):\n ...\n def get_iterator(self):\n return iter(self)\n\nclass MyIterable:\n __metaclass__ = ABCMeta\n\n @abstractmethod\n def __iter__(self):\n while False:\n yield None\n\n def get_iterator(self):\n return self.__iter__()\n\n @classmethod\n def __subclasshook__(cls, C):\n if cls is MyIterable:\n if any("__iter__" in B.__dict__ for B in C.__mro__):\n return True\n return NotImplemented\n\nMyIterable.register(Foo)\n
The ABC MyIterable defines the standard iterable method,\n__iter__(), as an abstract method. The implementation given here can\nstill be called from subclasses. The get_iterator() method is also\npart of the MyIterable abstract base class, but it does not have to be\noverridden in non-abstract derived classes.
\nThe __subclasshook__() class method defined here says that any class\nthat has an __iter__() method in its __dict__ (or in that of\none of its base classes, accessed via the __mro__ list) is\nconsidered a MyIterable too.
\nFinally, the last line makes Foo a virtual subclass of MyIterable,\neven though it does not define an __iter__() method (it uses the\nold-style iterable protocol, defined in terms of __len__() and\n__getitem__()). Note that this will not make get_iterator\navailable as a method of Foo, so it is provided separately.
\nIt also provides the following decorators:
\nA decorator indicating abstract methods.
\nUsing this decorator requires that the class’s metaclass is ABCMeta or\nis derived from it.\nA class that has a metaclass derived from ABCMeta\ncannot be instantiated unless all of its abstract methods and\nproperties are overridden.\nThe abstract methods can be called using any of the normal ‘super’ call\nmechanisms.
\nDynamically adding abstract methods to a class, or attempting to modify the\nabstraction status of a method or class once it is created, are not\nsupported. The abstractmethod() only affects subclasses derived using\nregular inheritance; “virtual subclasses” registered with the ABC’s\nregister() method are not affected.
\nUsage:
class C:
    __metaclass__ = ABCMeta

    @abstractmethod
    def my_abstract_method(self, ...):
        ...
Note
\nUnlike Java abstract methods, these abstract\nmethods may have an implementation. This implementation can be\ncalled via the super() mechanism from the class that\noverrides it. This could be useful as an end-point for a\nsuper-call in a framework that uses cooperative\nmultiple-inheritance.
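A minimal sketch of that end-point pattern (Python 3 class syntax; the class names are illustrative):

```python
from abc import ABCMeta, abstractmethod

class Base(metaclass=ABCMeta):
    @abstractmethod
    def greet(self):
        return "base"          # an abstract method may still have a body

class Child(Base):
    def greet(self):
        # the override can reach the abstract implementation via super()
        return super(Child, self).greet() + "/child"

print(Child().greet())         # base/child
try:
    Base()                     # abstract method not overridden
except TypeError:
    print("cannot instantiate Base")
```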
\nA subclass of the built-in property(), indicating an abstract property.
\nUsing this function requires that the class’s metaclass is ABCMeta or\nis derived from it.\nA class that has a metaclass derived from ABCMeta cannot be\ninstantiated unless all of its abstract methods and properties are overridden.\nThe abstract properties can be called using any of the normal\n‘super’ call mechanisms.
\nUsage:
class C:
    __metaclass__ = ABCMeta

    @abstractproperty
    def my_abstract_property(self):
        ...
This defines a read-only property; you can also define a read-write abstract\nproperty using the ‘long’ form of property declaration:
class C:
    __metaclass__ = ABCMeta

    def getx(self): ...
    def setx(self, value): ...
    x = abstractproperty(getx, setx)
Footnotes
[1] C++ programmers should note that Python’s virtual base class concept is not the same as C++’s.
Source code: Lib/__future__.py
\n__future__ is a real module, and serves three purposes:
\nEach statement in __future__.py is of the form:
FeatureName = _Feature(OptionalRelease, MandatoryRelease,
                       CompilerFlag)
where, normally, OptionalRelease is less than MandatoryRelease, and both are\n5-tuples of the same form as sys.version_info:
(PY_MAJOR_VERSION,  # the 2 in 2.1.0a3; an int
 PY_MINOR_VERSION,  # the 1; an int
 PY_MICRO_VERSION,  # the 0; an int
 PY_RELEASE_LEVEL,  # "alpha", "beta", "candidate" or "final"; string
 PY_RELEASE_SERIAL  # the 3; an int
)
OptionalRelease records the first release in which the feature was accepted.
\nIn the case of a MandatoryRelease that has not yet occurred,\nMandatoryRelease predicts the release in which the feature will become part of\nthe language.
\nElse MandatoryRelease records when the feature became part of the language; in\nreleases at or after that, modules no longer need a future statement to use the\nfeature in question, but may continue to use such imports.
\nMandatoryRelease may also be None, meaning that a planned feature got\ndropped.
\nInstances of class _Feature have two corresponding methods,\ngetOptionalRelease() and getMandatoryRelease().
\nCompilerFlag is the (bitfield) flag that should be passed in the fourth\nargument to the built-in function compile() to enable the feature in\ndynamically compiled code. This flag is stored in the compiler_flag\nattribute on _Feature instances.
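For example, the division feature can be inspected and its flag passed to compile() (a small sketch; the release tuples match the table below):

```python
import __future__

feat = __future__.division
print(feat.getOptionalRelease())    # first release accepting the feature
print(feat.getMandatoryRelease())   # release where it became the default

# compiler_flag enables the feature in dynamically compiled code:
code = compile("7 / 2", "<string>", "eval", feat.compiler_flag)
print(eval(code))                   # 3.5 (true division)
```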
\nNo feature description will ever be deleted from __future__. Since its\nintroduction in Python 2.1 the following features have found their way into the\nlanguage using this mechanism:
feature          | optional in | mandatory in | effect
-----------------|-------------|--------------|--------------------------------------------------
nested_scopes    | 2.1.0b1     | 2.2          | PEP 227: Statically Nested Scopes
generators       | 2.2.0a1     | 2.3          | PEP 255: Simple Generators
division         | 2.2.0a2     | 3.0          | PEP 238: Changing the Division Operator
absolute_import  | 2.5.0a1     | 2.7          | PEP 328: Imports: Multi-Line and Absolute/Relative
with_statement   | 2.5.0a1     | 2.6          | PEP 343: The “with” Statement
print_function   | 2.6.0a2     | 3.0          | PEP 3105: Make print a function
unicode_literals | 2.6.0a2     | 3.0          | PEP 3112: Bytes literals in Python 3000
See also
\n\nNew in version 2.5.
\nSource code: Lib/contextlib.py
\nThis module provides utilities for common tasks involving the with\nstatement. For more information see also Context Manager Types and\nWith Statement Context Managers.
\nFunctions provided:
\nThis function is a decorator that can be used to define a factory\nfunction for with statement context managers, without needing to\ncreate a class or separate __enter__() and __exit__() methods.
\nA simple example (this is not recommended as a real way of generating HTML!):
from contextlib import contextmanager

@contextmanager
def tag(name):
    print "<%s>" % name
    yield
    print "</%s>" % name

>>> with tag("h1"):
...    print "foo"
...
<h1>
foo
</h1>
\nThe function being decorated must return a generator-iterator when\ncalled. This iterator must yield exactly one value, which will be bound to\nthe targets in the with statement’s as clause, if any.
\nAt the point where the generator yields, the block nested in the with\nstatement is executed. The generator is then resumed after the block is exited.\nIf an unhandled exception occurs in the block, it is reraised inside the\ngenerator at the point where the yield occurred. Thus, you can use a\ntry...except...finally statement to trap\nthe error (if any), or ensure that some cleanup takes place. If an exception is\ntrapped merely in order to log it or to perform some action (rather than to\nsuppress it entirely), the generator must reraise that exception. Otherwise the\ngenerator context manager will indicate to the with statement that\nthe exception has been handled, and execution will resume with the statement\nimmediately following the with statement.
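The trap-log-reraise pattern described above can be sketched as follows (a hypothetical manager; the names are illustrative):

```python
from contextlib import contextmanager

@contextmanager
def managed(log):
    log.append("enter")
    try:
        yield "resource"               # bound by the as-clause, if any
    except Exception as exc:
        log.append("error: %s" % exc)  # trap merely to log it ...
        raise                          # ... then reraise, so the with
                                       # statement still sees the exception
    finally:
        log.append("exit")             # cleanup runs on every path

log = []
try:
    with managed(log) as r:
        raise ValueError("boom")
except ValueError:
    pass
print(log)   # ['enter', 'error: boom', 'exit']
```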
\nCombine multiple context managers into a single nested context manager.
\nThis function has been deprecated in favour of the multiple manager form\nof the with statement.
\nThe one advantage of this function over the multiple manager form of the\nwith statement is that argument unpacking allows it to be\nused with a variable number of context managers as follows:
from contextlib import nested

with nested(*managers):
    do_something()
Note that if the __exit__() method of one of the nested context managers\nindicates an exception should be suppressed, no exception information will be\npassed to any remaining outer context managers. Similarly, if the\n__exit__() method of one of the nested managers raises an exception, any\nprevious exception state will be lost; the new exception will be passed to the\n__exit__() methods of any remaining outer context managers. In general,\n__exit__() methods should avoid raising exceptions, and in particular they\nshould not re-raise a passed-in exception.
\nThis function has two major quirks that have led to it being deprecated. Firstly,\nas the context managers are all constructed before the function is invoked, the\n__new__() and __init__() methods of the inner context managers are\nnot actually covered by the scope of the outer context managers. That means, for\nexample, that using nested() to open two files is a programming error as the\nfirst file will not be closed promptly if an exception is thrown when opening\nthe second file.
\nSecondly, if the __enter__() method of one of the inner context managers\nraises an exception that is caught and suppressed by the __exit__() method\nof one of the outer context managers, this construct will raise\nRuntimeError rather than skipping the body of the with\nstatement.
\nDevelopers that need to support nesting of a variable number of context managers\ncan either use the warnings module to suppress the DeprecationWarning\nraised by this function or else use this function as a model for an application\nspecific implementation.
\n\nDeprecated since version 2.7: The with-statement now supports this functionality directly (without the\nconfusing error prone quirks).
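In Python 3.3 and later (and in the third-party contextlib2 backport), contextlib.ExitStack covers the variable-number-of-managers case without these quirks; a sketch:

```python
from contextlib import ExitStack
import os, tempfile

# Create a few temporary files to stand in for a variable set of resources.
paths = []
for _ in range(3):
    fd, path = tempfile.mkstemp()
    os.close(fd)
    paths.append(path)

with ExitStack() as stack:
    # Each file is registered as it is opened, so a failure opening a
    # later file still closes the earlier ones promptly (unlike nested()).
    files = [stack.enter_context(open(p)) for p in paths]
    print(len(files))                  # 3

print(all(f.closed for f in files))    # True: all closed on block exit

for p in paths:
    os.remove(p)
```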
\nReturn a context manager that closes thing upon completion of the block. This\nis basically equivalent to:
from contextlib import contextmanager

@contextmanager
def closing(thing):
    try:
        yield thing
    finally:
        thing.close()
And lets you write code like this:
from contextlib import closing
import urllib

with closing(urllib.urlopen('http://www.python.org')) as page:
    for line in page:
        print line
without needing to explicitly close page. Even if an error occurs,\npage.close() will be called when the with block is exited.
\nThis module provides an interface to the optional garbage collector. It\nprovides the ability to disable the collector, tune the collection frequency,\nand set debugging options. It also provides access to unreachable objects that\nthe collector found but cannot free. Since the collector supplements the\nreference counting already used in Python, you can disable the collector if you\nare sure your program does not create reference cycles. Automatic collection\ncan be disabled by calling gc.disable(). To debug a leaking program call\ngc.set_debug(gc.DEBUG_LEAK). Notice that this includes\ngc.DEBUG_SAVEALL, causing garbage-collected objects to be saved in\ngc.garbage for inspection.
\nThe gc module provides the following functions:
\nWith no arguments, run a full collection. The optional argument generation\nmay be an integer specifying which generation to collect (from 0 to 2). A\nValueError is raised if the generation number is invalid. The number of\nunreachable objects found is returned.
\n\nChanged in version 2.5: The optional generation argument was added.
\n\nChanged in version 2.6: The free lists maintained for a number of built-in types are cleared\nwhenever a full collection or collection of the highest generation (2)\nis run. Not all items in some free lists may be freed due to the\nparticular implementation, in particular int and float.
\nReturns a list of all objects tracked by the collector, excluding the list\nreturned.
\n\nNew in version 2.2.
\nSet the garbage collection thresholds (the collection frequency). Setting\nthreshold0 to zero disables collection.
\nThe GC classifies objects into three generations depending on how many\ncollection sweeps they have survived. New objects are placed in the youngest\ngeneration (generation 0). If an object survives a collection it is moved\ninto the next older generation. Since generation 2 is the oldest\ngeneration, objects in that generation remain there after a collection. In\norder to decide when to run, the collector keeps track of the number object\nallocations and deallocations since the last collection. When the number of\nallocations minus the number of deallocations exceeds threshold0, collection\nstarts. Initially only generation 0 is examined. If generation 0 has\nbeen examined more than threshold1 times since generation 1 has been\nexamined, then generation 1 is examined as well. Similarly, threshold2\ncontrols the number of collections of generation 1 before collecting\ngeneration 2.
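These knobs can be inspected and adjusted at runtime; a small sketch:

```python
import gc

thresholds = gc.get_threshold()   # the (threshold0, threshold1, threshold2)
print(thresholds)                 # currently in effect

print(gc.get_count())             # per-generation counts since last collection

found = gc.collect()              # full collection of all generations;
print(found >= 0)                 # returns the number of unreachable objects

gc.set_threshold(*thresholds)     # put the original thresholds back
```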
\nReturn the current collection counts as a tuple of (count0, count1,\ncount2).
\n\nNew in version 2.5.
\nReturn the list of objects that directly refer to any of objs. This function\nwill only locate those containers which support garbage collection; extension\ntypes which do refer to other objects but do not support garbage collection will\nnot be found.
\nNote that objects which have already been dereferenced, but which live in cycles\nand have not yet been collected by the garbage collector can be listed among the\nresulting referrers. To get only currently live objects, call collect()\nbefore calling get_referrers().
\nCare must be taken when using objects returned by get_referrers() because\nsome of them could still be under construction and hence in a temporarily\ninvalid state. Avoid using get_referrers() for any purpose other than\ndebugging.
\n\nNew in version 2.2.
\nReturn a list of objects directly referred to by any of the arguments. The\nreferents returned are those objects visited by the arguments’ C-level\ntp_traverse methods (if any), and may not be all objects actually\ndirectly reachable. tp_traverse methods are supported only by objects\nthat support garbage collection, and are only required to visit objects that may\nbe involved in a cycle. So, for example, if an integer is directly reachable\nfrom an argument, that integer object may or may not appear in the result list.
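The two directions can be seen with a trivial pair of lists:

```python
import gc

a = ["payload"]
b = [a]                                            # b directly refers to a

print(any(x is a for x in gc.get_referents(b)))    # True: a is a referent of b

referrers = gc.get_referrers(a)
print(any(r is b for r in referrers))              # True; other referrers
                                                   # (e.g. the enclosing frame)
                                                   # may appear as well
del referrers                                      # drop the extra reference
```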
\n\nNew in version 2.3.
\nReturns True if the object is currently tracked by the garbage collector,\nFalse otherwise. As a general rule, instances of atomic types aren’t\ntracked and instances of non-atomic types (containers, user-defined\nobjects...) are. However, some type-specific optimizations can be present\nin order to suppress the garbage collector footprint of simple instances\n(e.g. dicts containing only atomic keys and values):
>>> gc.is_tracked(0)
False
>>> gc.is_tracked("a")
False
>>> gc.is_tracked([])
True
>>> gc.is_tracked({})
False
>>> gc.is_tracked({"a": 1})
False
>>> gc.is_tracked({"a": []})
True
\nNew in version 2.7.
\nThe following variable is provided for read-only access (you can mutate its\nvalue but should not rebind it):
\nA list of objects which the collector found to be unreachable but could not be\nfreed (uncollectable objects). By default, this list contains only objects with\n__del__() methods. [1] Objects that have __del__() methods and are\npart of a reference cycle cause the entire reference cycle to be uncollectable,\nincluding objects not necessarily in the cycle but reachable only from it.\nPython doesn’t collect such cycles automatically because, in general, it isn’t\npossible for Python to guess a safe order in which to run the __del__()\nmethods. If you know a safe order, you can force the issue by examining the\ngarbage list, and explicitly breaking cycles due to your objects within the\nlist. Note that these objects are kept alive even so by virtue of being in the\ngarbage list, so they should be removed from garbage too. For example,\nafter breaking cycles, do del gc.garbage[:] to empty the list. It’s\ngenerally better to avoid the issue by not creating cycles containing objects\nwith __del__() methods, and garbage can be examined in that case to\nverify that no such cycles are being created.
\nIf DEBUG_SAVEALL is set, then all unreachable objects will be added to\nthis list rather than freed.
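A small sketch of inspecting saved garbage with DEBUG_SAVEALL (note that the rules for which cycles end up in gc.garbage by default differ across Python versions):

```python
import gc

gc.set_debug(gc.DEBUG_SAVEALL)   # save unreachable objects in gc.garbage

class Node(object):
    pass

n = Node()
n.self = n                       # a reference cycle
del n
gc.collect()

saved = len(gc.garbage)
print(saved > 0)                 # True: the cycle was saved, not freed

del gc.garbage[:]                # empty the list, as recommended above
gc.set_debug(0)
```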
\nThe following constants are provided for use with set_debug():
\nFootnotes
[1] Prior to Python 2.2, the list contained all instance objects in unreachable cycles, not only those with __del__() methods.
\nDeprecated since version 2.6: The user module has been removed in Python 3.0.
\nAs a policy, Python doesn’t run user-specified code on startup of Python\nprograms. (Only interactive sessions execute the script specified in the\nPYTHONSTARTUP environment variable if it exists).
\nHowever, some programs or sites may find it convenient to allow users to have a\nstandard customization file, which gets run when a program requests it. This\nmodule implements such a mechanism. A program that wishes to use the mechanism\nmust execute the statement
\nimport user\n
The user module looks for a file .pythonrc.py in the user’s home\ndirectory and if it can be opened, executes it (using execfile()) in its\nown (the module user‘s) global namespace. Errors during this phase are\nnot caught; that’s up to the program that imports the user module, if it\nwishes. The home directory is assumed to be named by the HOME\nenvironment variable; if this is not set, the current directory is used.
\nThe user’s .pythonrc.py could conceivably test for sys.version if it\nwishes to do different things depending on the Python version.
\nA warning to users: be very conservative in what you place in your\n.pythonrc.py file. Since you don’t know which programs will use it,\nchanging the behavior of standard modules or functions is generally not a good\nidea.
\nA suggestion for programmers who wish to use this mechanism: a simple way to let\nusers specify options for your package is to have them define variables in their\n.pythonrc.py file that you test in your module. For example, a module\nspam that has a verbosity level can look for a variable\nuser.spam_verbose, as follows:
import user

verbose = bool(getattr(user, "spam_verbose", 0))
(The three-argument form of getattr() is used in case the user has not\ndefined spam_verbose in their .pythonrc.py file.)
\nPrograms with extensive customization needs are better off reading a\nprogram-specific customization file.
\nPrograms with security or privacy concerns should not import this module; a\nuser can easily break into a program by placing arbitrary code in the\n.pythonrc.py file.
\nModules for general use should not import this module; it may interfere with\nthe operation of the importing program.
\nSee also
\nSource code: Lib/site.py
\nThis module is automatically imported during initialization. The automatic\nimport can be suppressed using the interpreter’s -S option.
\nImporting this module will append site-specific paths to the module search path\nand add a few builtins.
It starts by constructing up to four directories from a head and a tail part. For the head part, it uses sys.prefix and sys.exec_prefix; empty heads are skipped. For the tail part, it uses the empty string and then lib/site-packages (on Windows) or lib/pythonX.Y/site-packages and then lib/site-python (on Unix and Macintosh). For each of the distinct head-tail combinations, it sees if it refers to an existing directory, and if so, adds it to sys.path and also inspects the newly added path for configuration files.
\nA path configuration file is a file whose name has the form name.pth\nand exists in one of the four directories mentioned above; its contents are\nadditional items (one per line) to be added to sys.path. Non-existing items\nare never added to sys.path, and no check is made that the item refers to a\ndirectory rather than a file. No item is added to sys.path more than\nonce. Blank lines and lines beginning with # are skipped. Lines starting\nwith import (followed by space or tab) are executed.
\n\nChanged in version 2.6: A space or tab is now required after the import keyword.
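The rules above can be rendered roughly as follows (a simplified, hypothetical sketch of what site.addpackage does; process_pth_lines is not a real API):

```python
import os
import sys

def process_pth_lines(sitedir, lines, known_paths):
    """Apply .pth semantics: add existing, unseen items to sys.path."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue                         # blank lines and comments skipped
        if line.startswith(("import ", "import\t")):
            exec(line)                       # import lines are executed
            continue
        item = os.path.join(sitedir, line)
        if os.path.exists(item) and item not in known_paths:
            sys.path.append(item)            # non-existing items never added,
            known_paths.add(item)            # and no item added twice
```

Note the sketch, like the real code, does not check that an item is a directory rather than a file.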
\nFor example, suppose sys.prefix and sys.exec_prefix are set to\n/usr/local. The Python X.Y library is then installed in\n/usr/local/lib/pythonX.Y. Suppose this has\na subdirectory /usr/local/lib/pythonX.Y/site-packages with three\nsubsubdirectories, foo, bar and spam, and two path\nconfiguration files, foo.pth and bar.pth. Assume\nfoo.pth contains the following:
# foo package configuration

foo
bar
bletch
and bar.pth contains:
# bar package configuration

bar
Then the following version-specific directories are added to\nsys.path, in this order:
/usr/local/lib/pythonX.Y/site-packages/bar
/usr/local/lib/pythonX.Y/site-packages/foo
Note that bletch is omitted because it doesn’t exist; the bar\ndirectory precedes the foo directory because bar.pth comes\nalphabetically before foo.pth; and spam is omitted because it is\nnot mentioned in either path configuration file.
\nAfter these path manipulations, an attempt is made to import a module named\nsitecustomize, which can perform arbitrary site-specific customizations.\nIt is typically created by a system administrator in the site-packages\ndirectory. If this import fails with an ImportError exception, it is\nsilently ignored.
\nAfter this, an attempt is made to import a module named usercustomize,\nwhich can perform arbitrary user-specific customizations, if\nENABLE_USER_SITE is true. This file is intended to be created in the\nuser site-packages directory (see below), which is part of sys.path unless\ndisabled by -s. An ImportError will be silently ignored.
\nNote that for some non-Unix systems, sys.prefix and sys.exec_prefix are\nempty, and the path manipulations are skipped; however the import of\nsitecustomize and usercustomize is still attempted.
\nA list of prefixes for site-packages directories.
\n\nNew in version 2.6.
\nFlag showing the status of the user site-packages directory. True means\nthat it is enabled and was added to sys.path. False means that it\nwas disabled by user request (with -s or\nPYTHONNOUSERSITE). None means it was disabled for security\nreasons (mismatch between user or group id and effective id) or by an\nadministrator.
\n\nNew in version 2.6.
\nPath to the user site-packages for the running Python. Can be None if\ngetusersitepackages() hasn’t been called yet. Default value is\n~/.local/lib/pythonX.Y/site-packages for UNIX and non-framework Mac\nOS X builds, ~/Library/Python/X.Y/lib/python/site-packages for Mac\nframework builds, and
\nNew in version 2.6.
\nPath to the base directory for the user site-packages. Can be None if\ngetuserbase() hasn’t been called yet. Default value is\n~/.local for UNIX and Mac OS X non-framework builds,\n~/Library/Python/X.Y for Mac framework builds, and\n
\nNew in version 2.6.
\nReturn a list containing all global site-packages directories (and possibly\nsite-python).
\n\nNew in version 2.7.
\nReturn the path of the user base directory, USER_BASE. If it is not\ninitialized yet, this function will also set it, respecting\nPYTHONUSERBASE.
\n\nNew in version 2.7.
\nReturn the path of the user-specific site-packages directory,\nUSER_SITE. If it is not initialized yet, this function will also set\nit, respecting PYTHONNOUSERSITE and USER_BASE.
\n\nNew in version 2.7.
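Both helpers can be called directly; the exact paths printed depend on the platform and on PYTHONUSERBASE:

```python
import site

print(site.getuserbase())            # e.g. ~/.local on Unix
print(site.getusersitepackages())    # e.g. ~/.local/lib/pythonX.Y/site-packages
```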
\nThe site module also provides a way to get the user directories from the\ncommand line:
$ python -m site --user-site
/home/user/.local/lib/python2.7/site-packages
If it is called without arguments, it will print the contents of\nsys.path on the standard output, followed by the value of\nUSER_BASE and whether the directory exists, then the same thing for\nUSER_SITE, and finally the value of ENABLE_USER_SITE.
\nIf both options are given, user base and user site will be printed (always in\nthis order), separated by os.pathsep.
If any option is given, the script will exit with one of these values: 0 if the user site-packages directory is enabled, 1 if it was disabled by the user, 2 if it is disabled for security reasons or by an administrator, and a value greater than 2 if there is an error.
\nSee also
\nPEP 370 – Per user site-packages directory
\nThe distutils package provides support for building and installing\nadditional modules into a Python installation. The new modules may be either\n100%-pure Python, or may be extension modules written in C, or may be\ncollections of Python packages which include modules coded in both Python and C.
\nThis package is discussed in two separate chapters:
\nSee also
\nPlatforms: Unix
\nNote
\nThe fpectl module is not built by default, and its usage is discouraged\nand may be dangerous except in the hands of experts. See also the section\nLimitations and other considerations on limitations for more details.
\nMost computers carry out floating point operations in conformance with the\nso-called IEEE-754 standard. On any real computer, some floating point\noperations produce results that cannot be expressed as a normal floating point\nvalue. For example, try
>>> import math
>>> math.exp(1000)
inf
>>> math.exp(1000) / math.exp(1000)
nan
(The example above will work on many platforms. DEC Alpha may be one exception.)\n“Inf” is a special, non-numeric value in IEEE-754 that stands for “infinity”,\nand “nan” means “not a number.” Note that, other than the non-numeric results,\nnothing special happened when you asked Python to carry out those calculations.\nThat is in fact the default behaviour prescribed in the IEEE-754 standard, and\nif it works for you, stop reading now.
\nIn some circumstances, it would be better to raise an exception and stop\nprocessing at the point where the faulty operation was attempted. The\nfpectl module is for use in that situation. It provides control over\nfloating point units from several hardware manufacturers, allowing the user to\nturn on the generation of SIGFPE whenever any of the IEEE-754\nexceptions Division by Zero, Overflow, or Invalid Operation occurs. In tandem\nwith a pair of wrapper macros that are inserted into the C code comprising your\npython system, SIGFPE is trapped and converted into the Python\nFloatingPointError exception.
\nThe fpectl module defines the following functions and may raise the given\nexception:
\nThe following example demonstrates how to start up and test operation of the\nfpectl module.
>>> import fpectl
>>> import fpetest
>>> fpectl.turnon_sigfpe()
>>> fpetest.test()
overflow  PASS
FloatingPointError: Overflow

div by 0  PASS
FloatingPointError: Division by zero
  [ more output from test elided ]
>>> import math
>>> math.exp(1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
FloatingPointError: in math_1
Setting up a given processor to trap IEEE-754 floating point errors currently\nrequires custom code on a per-architecture basis. You may have to modify\nfpectl to control your particular hardware.
\nConversion of an IEEE-754 exception to a Python exception requires that the\nwrapper macros PyFPE_START_PROTECT and PyFPE_END_PROTECT be inserted\ninto your code in an appropriate fashion. Python itself has been modified to\nsupport the fpectl module, but many other codes of interest to numerical\nanalysts have not.
\nThe fpectl module is not thread-safe.
\nSee also
\nSome files in the source distribution may be interesting in learning more about\nhow this module operates. The include file Include/pyfpe.h discusses the\nimplementation of this module at some length. Modules/fpetestmodule.c\ngives several examples of use. Many additional examples can be found in\nObjects/floatobject.c.
\n\nDeprecated since version 2.6: The Bastion module has been removed in Python 3.0.
\n\nChanged in version 2.3: Disabled module.
\nNote
\nThe documentation has been left in place to help in reading old code that uses\nthe module.
\nAccording to the dictionary, a bastion is “a fortified area or position”, or\n“something that is considered a stronghold.” It’s a suitable name for this\nmodule, which provides a way to forbid access to certain attributes of an\nobject. It must always be used with the rexec module, in order to allow\nrestricted-mode programs access to certain safe attributes of an object, while\ndenying access to other, unsafe attributes.
\nProtect the object object, returning a bastion for the object. Any attempt to\naccess one of the object’s attributes will have to be approved by the filter\nfunction; if the access is denied an AttributeError exception will be\nraised.
\nIf present, filter must be a function that accepts a string containing an\nattribute name, and returns true if access to that attribute will be permitted;\nif filter returns false, the access is denied. The default filter denies\naccess to any function beginning with an underscore ('_'). The bastion’s\nstring representation will be <Bastion for name> if a value for name is\nprovided; otherwise, repr(object) will be used.
\nclass, if present, should be a subclass of BastionClass; see the\ncode in bastion.py for the details. Overriding the default\nBastionClass will rarely be required.
\nThe codeop module provides utilities upon which the Python\nread-eval-print loop can be emulated, as is done in the code module. As\na result, you probably don’t want to use the module directly; if you want to\ninclude such a loop in your program you probably want to use the code\nmodule instead.
\nThere are two parts to this job:
\nThe codeop module provides a way of doing each of these things, and a way\nof doing them both.
\nTo do just the former:
Tries to compile source, which should be a string of Python code, and returns a code object if source is valid Python code. In that case, the filename attribute of the code object will be filename, which defaults to '<input>'. Returns None if source is not valid Python code, but is a prefix of valid Python code.
\nIf there is a problem with source, an exception will be raised.\nSyntaxError is raised if there is invalid Python syntax, and\nOverflowError or ValueError if there is an invalid literal.
\nThe symbol argument determines whether source is compiled as a statement\n('single', the default) or as an expression ('eval'). Any\nother value will cause ValueError to be raised.
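The three possible outcomes look like this:

```python
from codeop import compile_command

print(compile_command("x = 1") is not None)   # complete: a code object
print(compile_command("if x:"))               # incomplete prefix: None
try:
    compile_command("x = ")                   # invalid Python
except SyntaxError:
    print("SyntaxError")
```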
\nNote
\nIt is possible (but not likely) that the parser stops parsing with a\nsuccessful outcome before reaching the end of the source; in this case,\ntrailing symbols may be ignored instead of causing an error. For example,\na backslash followed by two newlines may be followed by arbitrary garbage.\nThis will be fixed once the API for the parser is better.
\nA note on version compatibility: the Compile and\nCommandCompiler are new in Python 2.2. If you want to enable the\nfuture-tracking features of 2.2 but also retain compatibility with 2.1 and\nearlier versions of Python you can either write
try:
    from codeop import CommandCompiler
    compile_command = CommandCompiler()
    del CommandCompiler
except ImportError:
    from codeop import compile_command
which is a low-impact change, but introduces possibly unwanted global state into\nyour program, or you can write:
try:
    from codeop import CommandCompiler
except ImportError:
    def CommandCompiler():
        from codeop import compile_command
        return compile_command
and then call CommandCompiler every time you need a fresh compiler object.
\nThe code module provides facilities to implement read-eval-print loops in\nPython. Two classes and convenience functions are included which can be used to\nbuild applications which provide an interactive interpreter prompt.
\nThis function is useful for programs that want to emulate Python’s interpreter\nmain loop (a.k.a. the read-eval-print loop). The tricky part is to determine\nwhen the user has entered an incomplete command that can be completed by\nentering more text (as opposed to a complete command or a syntax error). This\nfunction almost always makes the same decision as the real interpreter main\nloop.
\nsource is the source string; filename is the optional filename from which\nsource was read, defaulting to '<input>'; and symbol is the optional\ngrammar start symbol, which should be either 'single' (the default) or\n'eval'.
\nReturns a code object (the same as compile(source, filename, symbol)) if the\ncommand is complete and valid; None if the command is incomplete; raises\nSyntaxError if the command is complete and contains a syntax error, or\nraises OverflowError or ValueError if the command contains an\ninvalid literal.
Compile and run some source in the interpreter. Arguments are the same as for compile_command(); the default for filename is '<input>', and for symbol is 'single'. One of several things can happen:
\nThe return value can be used to decide whether to use sys.ps1 or sys.ps2\nto prompt the next line.
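A small sketch of that decision (False means the command was complete, so the next prompt is sys.ps1; True means more input is expected, so sys.ps2):

```python
import code

interp = code.InteractiveInterpreter()

complete = interp.runsource("x = 40 + 2")   # complete: compiled and executed
print(complete)                             # False -> prompt with sys.ps1
partial = interp.runsource("def f():")      # incomplete: buffer more input
print(partial)                              # True -> prompt with sys.ps2
print(interp.locals["x"])                   # 42
```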
\nExecute a code object. When an exception occurs, showtraceback() is called\nto display a traceback. All exceptions are caught except SystemExit,\nwhich is allowed to propagate.
\nA note about KeyboardInterrupt: this exception may occur elsewhere in\nthis code, and may not always be caught. The caller should be prepared to deal\nwith it.
\nThe InteractiveConsole class is a subclass of\nInteractiveInterpreter, and so offers all the methods of the\ninterpreter objects as well as the following additions.
\n\nNew in version 2.1.
\nSource code: Lib/inspect.py
\nThe inspect module provides several useful functions to help get\ninformation about live objects such as modules, classes, methods, functions,\ntracebacks, frame objects, and code objects. For example, it can help you\nexamine the contents of a class, retrieve the source code of a method, extract\nand format the argument list for a function, or get all the information you need\nto display a detailed traceback.
\nThere are four main kinds of services provided by this module: type checking,\ngetting source code, inspecting classes and functions, and examining the\ninterpreter stack.
\nThe getmembers() function retrieves the members of an object such as a\nclass or module. The sixteen functions whose names begin with “is” are mainly\nprovided as convenient choices for the second argument to getmembers().\nThey also help you determine when you can expect to find the following special\nattributes:
Type | Attribute | Description | Notes
---|---|---|---
module | __doc__ | documentation string | 
 | __file__ | filename (missing for built-in modules) | 
class | __doc__ | documentation string | 
 | __module__ | name of module in which this class was defined | 
method | __doc__ | documentation string | 
 | __name__ | name with which this method was defined | 
 | im_class | class object that asked for this method | (1)
 | im_func or __func__ | function object containing implementation of method | 
 | im_self or __self__ | instance to which this method is bound, or None | 
function | __doc__ | documentation string | 
 | __name__ | name with which this function was defined | 
 | func_code | code object containing compiled function bytecode | 
 | func_defaults | tuple of any default values for arguments | 
 | func_doc | (same as __doc__) | 
 | func_globals | global namespace in which this function was defined | 
 | func_name | (same as __name__) | 
generator | __iter__ | defined to support iteration over container | 
 | close | raises a new GeneratorExit exception inside the generator to terminate the iteration | 
 | gi_code | code object | 
 | gi_frame | frame object or possibly None once the generator has been exhausted | 
 | gi_running | set to 1 when generator is executing, 0 otherwise | 
 | next | return the next item from the container | 
 | send | resumes the generator and "sends" a value that becomes the result of the current yield-expression | 
 | throw | used to raise an exception inside the generator | 
traceback | tb_frame | frame object at this level | 
 | tb_lasti | index of last attempted instruction in bytecode | 
 | tb_lineno | current line number in Python source code | 
 | tb_next | next inner traceback object (called by this level) | 
frame | f_back | next outer frame object (this frame's caller) | 
 | f_builtins | builtins namespace seen by this frame | 
 | f_code | code object being executed in this frame | 
 | f_exc_traceback | traceback if raised in this frame, or None | 
 | f_exc_type | exception type if raised in this frame, or None | 
 | f_exc_value | exception value if raised in this frame, or None | 
 | f_globals | global namespace seen by this frame | 
 | f_lasti | index of last attempted instruction in bytecode | 
 | f_lineno | current line number in Python source code | 
 | f_locals | local namespace seen by this frame | 
 | f_restricted | 0 or 1 if frame is in restricted execution mode | 
 | f_trace | tracing function for this frame, or None | 
code | co_argcount | number of arguments (not including * or ** args) | 
 | co_code | string of raw compiled bytecode | 
 | co_consts | tuple of constants used in the bytecode | 
 | co_filename | name of file in which this code object was created | 
 | co_firstlineno | number of first line in Python source code | 
 | co_flags | bitmap: 1=optimized, 2=newlocals, 4=*arg, 8=**arg | 
 | co_lnotab | encoded mapping of line numbers to bytecode indices | 
 | co_name | name with which this code object was defined | 
 | co_names | tuple of names of local variables | 
 | co_nlocals | number of local variables | 
 | co_stacksize | virtual machine stack space required | 
 | co_varnames | tuple of names of arguments and local variables | 
builtin | __doc__ | documentation string | 
 | __name__ | original name of this function or method | 
 | __self__ | instance to which a method is bound, or None | 
Note:

(1) Changed in version 2.2: im_class used to refer to the class that defined the method.
\nReturn all the members of an object in a list of (name, value) pairs sorted by\nname. If the optional predicate argument is supplied, only members for which\nthe predicate returns a true value are included.
\nNote
\ngetmembers() does not return metaclass attributes when the argument\nis a class (this behavior is inherited from the dir() function).
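For instance, pairing getmembers() with one of the "is" predicates (a sketch; the module inspected here happens to be inspect itself):

```python
import inspect

# (name, value) pairs for every function in the inspect module,
# sorted by name
functions = inspect.getmembers(inspect, inspect.isfunction)
names = [name for name, value in functions]
```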
\nReturn a tuple of values that describe how Python will interpret the file\nidentified by path if it is a module, or None if it would not be\nidentified as a module. The return tuple is (name, suffix, mode,\nmodule_type), where name is the name of the module without the name of\nany enclosing package, suffix is the trailing part of the file name (which\nmay not be a dot-delimited extension), mode is the open() mode that\nwould be used ('r' or 'rb'), and module_type is an integer giving\nthe type of the module. module_type will have a value which can be\ncompared to the constants defined in the imp module; see the\ndocumentation for that module for more information on module types.
\n\nChanged in version 2.6: Returns a named tuple ModuleInfo(name, suffix, mode,\nmodule_type).
\nReturn true if the object is a Python generator function.
\n\nNew in version 2.6.
\nReturn true if the object is a generator.
\n\nNew in version 2.6.
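The distinction between a generator function and the generator it returns can be sketched as:

```python
import inspect

def countdown(n):
    """Toy generator function yielding n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

# countdown itself satisfies isgeneratorfunction();
# the object it returns satisfies isgenerator()
gen = countdown(3)
```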
\nReturn true if the object is an abstract base class.
\n\nNew in version 2.6.
\nReturn true if the object is a method descriptor, but not if\nismethod(), isclass(), isfunction() or isbuiltin()\nare true.
\nThis is new as of Python 2.2, and, for example, is true of\nint.__add__. An object passing this test has a __get__ attribute\nbut not a __set__ attribute, but beyond that the set of attributes\nvaries. __name__ is usually sensible, and __doc__ often is.
\nMethods implemented via descriptors that also pass one of the other tests\nreturn false from the ismethoddescriptor() test, simply because the\nother tests promise more – you can, e.g., count on having the\nim_func attribute (etc) when an object passes ismethod().
\nReturn true if the object is a data descriptor.
Data descriptors have both a __get__ and a __set__ attribute. Examples are properties (defined in Python), getsets, and members. The latter two are defined in C and there are more specific tests available for those types, which are robust across Python implementations. Typically, data descriptors will also have __name__ and __doc__ attributes (properties, getsets, and members have both of these attributes), but this is not guaranteed.
\n\nNew in version 2.3.
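A minimal sketch of these two predicates, using the int.__add__ example from above and a bare property:

```python
import inspect

# Method descriptor: its type has __get__ but no __set__
md = inspect.ismethoddescriptor(int.__add__)

# Data descriptor: its type has both __get__ and __set__ (a property qualifies)
dd = inspect.isdatadescriptor(property())
```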
\nReturn true if the object is a getset descriptor.
\nCPython implementation detail: getsets are attributes defined in extension modules via\nPyGetSetDef structures. For Python implementations without such\ntypes, this method will always return False.
\n\nNew in version 2.5.
\nReturn true if the object is a member descriptor.
\nCPython implementation detail: Member descriptors are attributes defined in extension modules via\nPyMemberDef structures. For Python implementations without such\ntypes, this method will always return False.
\n\nNew in version 2.5.
\nClean up indentation from docstrings that are indented to line up with blocks\nof code. Any whitespace that can be uniformly removed from the second line\nonwards is removed. Also, all tabs are expanded to spaces.
\n\nNew in version 2.6.
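For example (a sketch with a made-up docstring):

```python
import inspect

raw = """Summary line.
        Indented body line.
        Another line."""

# Uniform leading whitespace is stripped from the second line onwards
tidy = inspect.cleandoc(raw)
```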
\nGet the names and default values of a Python function’s arguments. A tuple of\nfour things is returned: (args, varargs, keywords, defaults). args is a\nlist of the argument names (it may contain nested lists). varargs and\nkeywords are the names of the * and ** arguments or\nNone. defaults is a tuple of default argument values or None if there\nare no default arguments; if this tuple has n elements, they correspond to\nthe last n elements listed in args.
\n\nChanged in version 2.6: Returns a named tuple ArgSpec(args, varargs, keywords,\ndefaults).
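A sketch of the returned tuple. (Note: getargspec() was removed in Python 3.11; its successor getfullargspec() shares the leading fields used here, so this sketch falls back to it where necessary.)

```python
import inspect

def f(a, b=1, *rest, **options):
    pass

try:
    spec = inspect.getargspec(f)       # Python 2.7 (and Python 3 before 3.11)
except AttributeError:
    spec = inspect.getfullargspec(f)   # the modern Python 3 equivalent

# spec.args is ['a', 'b'], spec.varargs is 'rest', spec.defaults is (1,)
```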
\nGet information about arguments passed into a particular frame. A tuple of\nfour things is returned: (args, varargs, keywords, locals). args is a\nlist of the argument names (it may contain nested lists). varargs and\nkeywords are the names of the * and ** arguments or None.\nlocals is the locals dictionary of the given frame.
\n\nChanged in version 2.6: Returns a named tuple ArgInfo(args, varargs, keywords,\nlocals).
\nBind the args and kwds to the argument names of the Python function or\nmethod func, as if it was called with them. For bound methods, bind also the\nfirst argument (typically named self) to the associated instance. A dict\nis returned, mapping the argument names (including the names of the * and\n** arguments, if any) to their values from args and kwds. In case of\ninvoking func incorrectly, i.e. whenever func(*args, **kwds) would raise\nan exception because of incompatible signature, an exception of the same type\nand the same or similar message is raised. For example:
\n>>> from inspect import getcallargs\n>>> def f(a, b=1, *pos, **named):\n... pass\n>>> getcallargs(f, 1, 2, 3)\n{'a': 1, 'named': {}, 'b': 2, 'pos': (3,)}\n>>> getcallargs(f, a=2, x=4)\n{'a': 2, 'named': {'x': 4}, 'b': 1, 'pos': ()}\n>>> getcallargs(f)\nTraceback (most recent call last):\n...\nTypeError: f() takes at least 1 argument (0 given)\n
\nNew in version 2.7.
\nWhen the following functions return “frame records,” each record is a tuple of\nsix items: the frame object, the filename, the line number of the current line,\nthe function name, a list of lines of context from the source code, and the\nindex of the current line within that list.
\nNote
\nKeeping references to frame objects, as found in the first element of the frame\nrecords these functions return, can cause your program to create reference\ncycles. Once a reference cycle has been created, the lifespan of all objects\nwhich can be accessed from the objects which form the cycle can become much\nlonger even if Python’s optional cycle detector is enabled. If such cycles must\nbe created, it is important to ensure they are explicitly broken to avoid the\ndelayed destruction of objects and increased memory consumption which occurs.
\nThough the cycle detector will catch these, destruction of the frames (and local\nvariables) can be made deterministic by removing the cycle in a\nfinally clause. This is also important if the cycle detector was\ndisabled when Python was compiled or using gc.disable(). For example:
\ndef handle_stackframe_without_leak():\n frame = inspect.currentframe()\n try:\n # do something with the frame\n finally:\n del frame\n
The optional context argument supported by most of these functions specifies\nthe number of lines of context to return, which are centered around the current\nline.
\nGet information about a frame or traceback object. A 5-tuple is returned, the\nlast five elements of the frame’s frame record.
\n\nChanged in version 2.6: Returns a named tuple Traceback(filename, lineno, function,\ncode_context, index).
\nReturn the frame object for the caller’s stack frame.
\nCPython implementation detail: This function relies on Python stack frame support in the interpreter,\nwhich isn’t guaranteed to exist in all implementations of Python. If\nrunning in an implementation without Python stack frame support this\nfunction returns None.
\n\nNew in version 2.7.
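A minimal sketch that also applies the cleanup pattern recommended in the note above:

```python
import inspect

def whoami():
    frame = inspect.currentframe()
    try:
        # May be None on implementations without stack frame support
        return frame.f_code.co_name if frame is not None else None
    finally:
        del frame  # break the reference cycle promptly
```

On CPython, calling whoami() returns the name of its own code object, 'whoami'.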
This module is a minor subset of what is available in the more full-featured package of the same name from Python 3.1 that provides a complete implementation of import. What is here has been provided to help ease the transition from 2.7 to 3.1.
\n\nNew in version 2.3.
\nThis module adds the ability to import Python modules (*.py,\n*.py[co]) and packages from ZIP-format archives. It is usually not\nneeded to use the zipimport module explicitly; it is automatically used\nby the built-in import mechanism for sys.path items that are paths\nto ZIP archives.
\nTypically, sys.path is a list of directory names as strings. This module\nalso allows an item of sys.path to be a string naming a ZIP file archive.\nThe ZIP archive can contain a subdirectory structure to support package imports,\nand a path within the archive can be specified to only import from a\nsubdirectory. For example, the path /tmp/example.zip/lib/ would only\nimport from the lib/ subdirectory within the archive.
\nAny files may be present in the ZIP archive, but only files .py and\n.py[co] are available for import. ZIP import of dynamic modules\n(.pyd, .so) is disallowed. Note that if an archive only contains\n.py files, Python will not attempt to modify the archive by adding the\ncorresponding .pyc or .pyo file, meaning that if a ZIP archive\ndoesn’t contain .pyc files, importing may be rather slow.
\nUsing the built-in reload() function will fail if called on a module\nloaded from a ZIP archive; it is unlikely that reload() would be needed,\nsince this would imply that the ZIP has been altered during runtime.
\nZIP archives with an archive comment are currently not supported.
\nSee also
\nThis module defines an exception:
\nzipimporter is the class for importing ZIP files.
\nCreate a new zipimporter instance. archivepath must be a path to a ZIP\nfile, or to a specific path within a ZIP file. For example, an archivepath\nof foo/bar.zip/lib will look for modules in the lib directory\ninside the ZIP file foo/bar.zip (provided that it exists).
\nZipImportError is raised if archivepath doesn’t point to a valid ZIP\narchive.
\n\nNew in version 2.7.
\nThe archive and prefix attributes, when combined with a\nslash, equal the original archivepath argument given to the\nzipimporter constructor.
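A sketch of constructing a zipimporter directly (the archive and module names here are made up; the archive is built on the fly):

```python
import os
import tempfile
import zipfile
import zipimport

# Build a throwaway archive containing one module
archive = os.path.join(tempfile.mkdtemp(), 'example.zip')
zf = zipfile.ZipFile(archive, 'w')
zf.writestr('hello.py', 'GREETING = "hi"\n')
zf.close()

importer = zipimport.zipimporter(archive)
# importer.archive == archive and importer.prefix == '' (no subdirectory given)
source = importer.get_source('hello')   # the module's source text
```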
\nHere is an example that imports a module from a ZIP archive - note that the\nzipimport module is not explicitly used.
\n$ unzip -l /tmp/example.zip\nArchive: /tmp/example.zip\n Length Date Time Name\n -------- ---- ---- ----\n 8467 11-26-02 22:30 jwzthreading.py\n -------- -------\n 8467 1 file\n$ ./python\nPython 2.3 (#1, Aug 1 2003, 19:54:32)\n>>> import sys\n>>> sys.path.insert(0, '/tmp/example.zip') # Add .zip file to front of path\n>>> import jwzthreading\n>>> jwzthreading.__file__\n'/tmp/example.zip/jwzthreading.py'
\nThis module provides an interface to the mechanisms used to implement the\nimport statement. It defines the following constants and functions:
\nReturn the magic string value used to recognize byte-compiled code files\n(.pyc files). (This value may be different for each Python version.)
\nTry to find the module name. If path is omitted or None, the list of\ndirectory names given by sys.path is searched, but first a few special\nplaces are searched: the function tries to find a built-in module with the\ngiven name (C_BUILTIN), then a frozen module (PY_FROZEN),\nand on some systems some other places are looked in as well (on Windows, it\nlooks in the registry which may point to a specific file).
\nOtherwise, path must be a list of directory names; each directory is\nsearched for files with any of the suffixes returned by get_suffixes()\nabove. Invalid names in the list are silently ignored (but all list items\nmust be strings).
\nIf search is successful, the return value is a 3-element tuple (file,\npathname, description):
\nfile is an open file object positioned at the beginning, pathname is the\npathname of the file found, and description is a 3-element tuple as\ncontained in the list returned by get_suffixes() describing the kind of\nmodule found.
\nIf the module does not live in a file, the returned file is None,\npathname is the empty string, and the description tuple contains empty\nstrings for its suffix and mode; the module type is indicated as given in\nparentheses above. If the search is unsuccessful, ImportError is\nraised. Other exceptions indicate problems with the arguments or\nenvironment.
\nIf the module is a package, file is None, pathname is the package\npath and the last item in the description tuple is PKG_DIRECTORY.
\nThis function does not handle hierarchical module names (names containing\ndots). In order to find P.*M*, that is, submodule M of package P, use\nfind_module() and load_module() to find and load package P, and\nthen use find_module() with the path argument set to P.__path__.\nWhen P itself has a dotted name, apply this recipe recursively.
\nLoad a module that was previously found by find_module() (or by an\notherwise conducted search yielding compatible results). This function does\nmore than importing the module: if the module was already imported, it is\nequivalent to a reload()! The name argument indicates the full\nmodule name (including the package name, if this is a submodule of a\npackage). The file argument is an open file, and pathname is the\ncorresponding file name; these can be None and '', respectively, when\nthe module is a package or not being loaded from a file. The description\nargument is a tuple, as would be returned by get_suffixes(), describing\nwhat kind of module must be loaded.
\nIf the load is successful, the return value is the module object; otherwise,\nan exception (usually ImportError) is raised.
\nImportant: the caller is responsible for closing the file argument, if\nit was not None, even when an exception is raised. This is best done\nusing a try ... finally statement.
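The find/load/close pattern can be sketched as follows (guarded, because the imp module itself was removed in Python 3.12):

```python
try:
    import imp
except ImportError:          # Python 3.12+ no longer ships imp
    imp = None

if imp is not None:
    # 'token' is a plain source module in the standard library
    fp, pathname, description = imp.find_module('token')
    try:
        mod = imp.load_module('token', fp, pathname, description)
    finally:
        # The caller must close the file, even if loading raised
        if fp:
            fp.close()
```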
\nReturn True if the import lock is currently held, else False. On\nplatforms without threads, always return False.
\nOn platforms with threads, a thread executing an import holds an internal lock\nuntil the import is complete. This lock blocks other threads from doing an\nimport until the original import completes, which in turn prevents other threads\nfrom seeing incomplete module objects constructed by the original thread while\nin the process of completing its import (and the imports, if any, triggered by\nthat).
\nAcquire the interpreter’s import lock for the current thread. This lock should\nbe used by import hooks to ensure thread-safety when importing modules.
\nOnce a thread has acquired the import lock, the same thread may acquire it\nagain without blocking; the thread must release it once for each time it has\nacquired it.
\nOn platforms without threads, this function does nothing.
\n\nNew in version 2.3.
\nRelease the interpreter’s import lock. On platforms without threads, this\nfunction does nothing.
\n\nNew in version 2.3.
\nThe following constants with integer values, defined in this module, are used to\nindicate the search result of find_module().
\nThe following constant and functions are obsolete; their functionality is\navailable through find_module() or load_module(). They are kept\naround for backward compatibility:
\nLoad and initialize a module implemented as a byte-compiled code file and return\nits module object. If the module was already initialized, it will be\ninitialized again. The name argument is used to create or access a module\nobject. The pathname argument points to the byte-compiled code file. The\nfile argument is the byte-compiled code file, open for reading in binary mode,\nfrom the beginning. It must currently be a real file object, not a user-defined\nclass emulating a file.
\nThe NullImporter type is a PEP 302 import hook that handles\nnon-directory path strings by failing to find any modules. Calling this type\nwith an existing directory or empty string raises ImportError.\nOtherwise, a NullImporter instance is returned.
\nPython adds instances of this type to sys.path_importer_cache for any path\nentries that are not directories and are not handled by any other path hooks on\nsys.path_hooks. Instances have only one method:
\n\nNew in version 2.5.
\nThe following function emulates what was the standard import statement up to\nPython 1.4 (no hierarchical module names). (This implementation wouldn’t work\nin that version, since find_module() has been extended and\nload_module() has been added in 1.4.)
\nimport imp\nimport sys\n\ndef __import__(name, globals=None, locals=None, fromlist=None):\n # Fast path: see if the module has already been imported.\n try:\n return sys.modules[name]\n except KeyError:\n pass\n\n # If any of the following calls raises an exception,\n # there's a problem we can't handle -- let the caller handle it.\n\n fp, pathname, description = imp.find_module(name)\n\n try:\n return imp.load_module(name, fp, pathname, description)\n finally:\n # Since we may exit via an exception, close fp explicitly.\n if fp:\n fp.close()\n
A more complete example that implements hierarchical module names and includes a\nreload() function can be found in the module knee. The knee\nmodule can be found in Demo/imputil/ in the Python source distribution.
\n\nDeprecated since version 2.6: The rexec module has been removed in Python 3.0.
\n\nChanged in version 2.3: Disabled module.
\nWarning
\nThe documentation has been left in place to help in reading old code that uses\nthe module.
\nThis module contains the RExec class, which supports r_eval(),\nr_execfile(), r_exec(), and r_import() methods, which are\nrestricted versions of the standard Python functions eval(),\nexecfile() and the exec and import statements. Code\nexecuted in this restricted environment will only have access to modules and\nfunctions that are deemed safe; you can subclass RExec to add or remove\ncapabilities as desired.
\nWarning
\nWhile the rexec module is designed to perform as described below, it does\nhave a few known vulnerabilities which could be exploited by carefully written\ncode. Thus it should not be relied upon in situations requiring “production\nready” security. In such situations, execution via sub-processes or very\ncareful “cleansing” of both code and data to be processed may be necessary.\nAlternatively, help in patching known rexec vulnerabilities would be\nwelcomed.
\nNote
\nThe RExec class can prevent code from performing unsafe operations like\nreading or writing disk files, or using TCP/IP sockets. However, it does not\nprotect against code using extremely large amounts of memory or processor time.
\nReturns an instance of the RExec class.
\nhooks is an instance of the RHooks class or a subclass of it. If it\nis omitted or None, the default RHooks class is instantiated.\nWhenever the rexec module searches for a module (even a built-in one) or\nreads a module’s code, it doesn’t actually go out to the file system itself.\nRather, it calls methods of an RHooks instance that was passed to or\ncreated by its constructor. (Actually, the RExec object doesn’t make\nthese calls — they are made by a module loader object that’s part of the\nRExec object. This allows another level of flexibility, which can be\nuseful when changing the mechanics of import within the restricted\nenvironment.)
\nBy providing an alternate RHooks object, we can control the file system\naccesses made to import a module, without changing the actual algorithm that\ncontrols the order in which those accesses are made. For instance, we could\nsubstitute an RHooks object that passes all filesystem requests to a\nfile server elsewhere, via some RPC mechanism such as ILU. Grail’s applet\nloader uses this to support importing applets from a URL for a directory.
\nIf verbose is true, additional debugging output may be sent to standard\noutput.
\nIt is important to be aware that code running in a restricted environment can\nstill call the sys.exit() function. To disallow restricted code from\nexiting the interpreter, always protect calls that cause restricted code to run\nwith a try/except statement that catches the\nSystemExit exception. Removing the sys.exit() function from the\nrestricted environment is not sufficient — the restricted code could still use\nraise SystemExit. Removing SystemExit is not a reasonable option;\nsome library code makes use of this and would break were it not available.
\nSee also
\nRExec instances support the following methods:
\nMethods whose names begin with s_ are similar to the functions beginning\nwith r_, but the code will be granted access to restricted versions of the\nstandard I/O streams sys.stdin, sys.stderr, and sys.stdout.
\nRExec objects must also support various methods which will be\nimplicitly called by code executing in the restricted environment. Overriding\nthese methods in a subclass is used to change the policies enforced by a\nrestricted environment.
\nAnd their equivalents with access to restricted standard I/O streams:
\nThe RExec class has the following class attributes, which are used by\nthe __init__() method. Changing them on an existing instance won’t have\nany effect; instead, create a subclass of RExec and assign them new\nvalues in the class definition. Instances of the new class will then use those\nnew values. All these attributes are tuples of strings.
\nLet us say that we want a slightly more relaxed policy than the standard\nRExec class. For example, if we’re willing to allow files in\n/tmp to be written, we can subclass the RExec class:
import rexec, string   # needed by the snippet below

class TmpWriterRExec(rexec.RExec):
    def r_open(self, file, mode='r', buf=-1):
        if mode in ('r', 'rb'):
            pass
        elif mode in ('w', 'wb', 'a', 'ab'):
            # check filename : must begin with /tmp/
            if file[:5] != '/tmp/':
                raise IOError("can't write outside /tmp")
            elif (string.find(file, '/../') >= 0 or
                  file[:3] == '../' or file[-3:] == '/..'):
                raise IOError("'..' in filename forbidden")
        else:
            raise IOError("Illegal open() mode")
        return open(file, mode, buf)
Notice that the above code will occasionally forbid a perfectly valid filename;\nfor example, code in the restricted environment won’t be able to open a file\ncalled /tmp/foo/../bar. To fix this, the r_open() method would\nhave to simplify the filename to /tmp/bar, which would require splitting\napart the filename and performing various operations on it. In cases where\nsecurity is at stake, it may be preferable to write simple code which is\nsometimes overly restrictive, instead of more general code that is also more\ncomplex and may harbor a subtle security hole.
\n\nDeprecated since version 2.6: The imputil module has been removed in Python 3.0.
This module provides a handy mechanism for custom import hooks. Compared to the older ihooks module, imputil takes a dramatically simpler and more straightforward approach to custom import functions.
\nManage the import process.
\nBase class for replacing standard import functions.
\nFind and retrieve the code for the given module.
\nparent specifies a parent module to define a context for importing.\nIt may be None, indicating no particular context for the search.
\nmodname specifies a single module (not dotted) within the parent.
\nfqname specifies the fully-qualified module name. This is a\n(potentially) dotted name from the “root” of the module namespace\ndown to the modname.
\nIf there is no parent, then modname==fqname.
\nThis method should return None, or a 3-tuple.
\n- If the module was not found, then None should be returned.
\n- The first item of the 2- or 3-tuple should be the integer 0 or 1,\nspecifying whether the module that was found is a package or not.
\n- The second item is the code object for the module (it will be\nexecuted within the new module’s namespace). This item can also\nbe a fully-loaded module object (e.g. loaded from a shared lib).
\n- The third item is a dictionary of name/value pairs that will be\ninserted into new module before the code object is executed. This\nis provided in case the module’s code expects certain values (such\nas where the module was found). When the second item is a module\nobject, then these names/values will be inserted after the module\nhas been loaded/initialized.
\n
Emulate the import mechanism for built-in and frozen modules. This is a\nsub-class of the Importer class.
\nUndocumented.
\nThis is a re-implementation of hierarchical module import.
\nThis code is intended to be read, not executed. However, it does work\n– all you need to do to enable it is “import knee”.
\n(The name is a pun on the clunkier predecessor of this module, “ni”.)
import sys, imp, __builtin__

# Replacement for __import__()
def import_hook(name, globals=None, locals=None, fromlist=None):
    parent = determine_parent(globals)
    q, tail = find_head_package(parent, name)
    m = load_tail(q, tail)
    if not fromlist:
        return q
    if hasattr(m, "__path__"):
        ensure_fromlist(m, fromlist)
    return m

def determine_parent(globals):
    if not globals or not globals.has_key("__name__"):
        return None
    pname = globals['__name__']
    if globals.has_key("__path__"):
        parent = sys.modules[pname]
        assert globals is parent.__dict__
        return parent
    if '.' in pname:
        i = pname.rfind('.')
        pname = pname[:i]
        parent = sys.modules[pname]
        assert parent.__name__ == pname
        return parent
    return None

def find_head_package(parent, name):
    if '.' in name:
        i = name.find('.')
        head = name[:i]
        tail = name[i+1:]
    else:
        head = name
        tail = ""
    if parent:
        qname = "%s.%s" % (parent.__name__, head)
    else:
        qname = head
    q = import_module(head, qname, parent)
    if q: return q, tail
    if parent:
        qname = head
        parent = None
        q = import_module(head, qname, parent)
        if q: return q, tail
    raise ImportError("No module named " + qname)

def load_tail(q, tail):
    m = q
    while tail:
        i = tail.find('.')
        if i < 0: i = len(tail)
        head, tail = tail[:i], tail[i+1:]
        mname = "%s.%s" % (m.__name__, head)
        m = import_module(head, mname, m)
        if not m:
            raise ImportError("No module named " + mname)
    return m

def ensure_fromlist(m, fromlist, recursive=0):
    for sub in fromlist:
        if sub == "*":
            if not recursive:
                try:
                    all = m.__all__
                except AttributeError:
                    pass
                else:
                    ensure_fromlist(m, all, 1)
            continue
        if sub != "*" and not hasattr(m, sub):
            subname = "%s.%s" % (m.__name__, sub)
            submod = import_module(sub, subname, m)
            if not submod:
                raise ImportError("No module named " + subname)

def import_module(partname, fqname, parent):
    try:
        return sys.modules[fqname]
    except KeyError:
        pass
    try:
        fp, pathname, stuff = imp.find_module(partname,
                                              parent and parent.__path__)
    except ImportError:
        return None
    try:
        m = imp.load_module(fqname, fp, pathname, stuff)
    finally:
        if fp: fp.close()
    if parent:
        setattr(parent, partname, m)
    return m


# Replacement for reload()
def reload_hook(module):
    name = module.__name__
    if '.' not in name:
        return import_module(name, name, None)
    i = name.rfind('.')
    pname = name[:i]
    parent = sys.modules[pname]
    return import_module(name[i+1:], name, parent)


# Save the original hooks
original_import = __builtin__.__import__
original_reload = __builtin__.reload

# Now install our hooks
__builtin__.__import__ = import_hook
__builtin__.reload = reload_hook
Also see the importers module (which can be found\nin Demo/imputil/ in the Python source distribution) for additional\nexamples.
\n\nNew in version 2.3.
\nSource code: Lib/modulefinder.py
\nThis module provides a ModuleFinder class that can be used to determine\nthe set of modules imported by a script. modulefinder.py can also be run as\na script, giving the filename of a Python script as its argument, after which a\nreport of the imported modules will be printed.
\nThis class provides run_script() and report() methods to determine\nthe set of modules imported by a script. path can be a list of directories to\nsearch for modules; if not specified, sys.path is used. debug sets the\ndebugging level; higher values make the class print debugging messages about\nwhat it’s doing. excludes is a list of module names to exclude from the\nanalysis. replace_paths is a list of (oldpath, newpath) tuples that will\nbe replaced in module paths.
\nThe script that is going to get analyzed later on (bacon.py):
\nimport re, itertools\n\ntry:\n import baconhameggs\nexcept ImportError:\n pass\n\ntry:\n import guido.python.ham\nexcept ImportError:\n pass\n
The script that will output the report of bacon.py:
from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('bacon.py')

print 'Loaded modules:'
for name, mod in finder.modules.iteritems():
    print '%s: ' % name,
    print ','.join(mod.globalnames.keys()[:3])

print '-'*50
print 'Modules not imported:'
print '\n'.join(finder.badmodules.iterkeys())
Sample output (may vary depending on the architecture):
Loaded modules:
_types:
copy_reg: _inverted_registry,_slotnames,__all__
sre_compile: isstring,_sre,_optimize_unicode
_sre:
sre_constants: REPEAT_ONE,makedict,AT_END_LINE
sys:
re: __module__,finditer,_expand
itertools:
__main__: re,itertools,baconhameggs
sre_parse: __getslice__,_PATTERNENDERS,SRE_FLAG_UNICODE
array:
types: __module__,IntType,TypeType
--------------------------------------------------
Modules not imported:
guido.python.ham
baconhameggs
\n\nNew in version 2.3.
\nSource code: Lib/pkgutil.py
\nThis module provides utilities for the import system, in particular package\nsupport.
\nExtend the search path for the modules which comprise a package. Intended\nuse is to place the following code in a package’s __init__.py:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
This will add to the package’s __path__ all subdirectories of directories\non sys.path named after the package. This is useful if one wants to\ndistribute different parts of a single logical package as multiple\ndirectories.
\nIt also looks for *.pkg files beginning where * matches the\nname argument. This feature is similar to *.pth files (see the\nsite module for more information), except that it doesn’t special-case\nlines starting with import. A *.pkg file is trusted at face\nvalue: apart from checking for duplicates, all entries found in a\n*.pkg file are added to the path, regardless of whether they exist\non the filesystem. (This is a feature.)
\nIf the input path is not a list (as is the case for frozen packages) it is\nreturned unchanged. The input path is not modified; an extended copy is\nreturned. Items are only appended to the copy at the end.
\nIt is assumed that sys.path is a sequence. Items of sys.path\nthat are not (Unicode or 8-bit) strings referring to existing directories are\nignored. Unicode items on sys.path that cause errors when used as\nfilenames may cause this function to raise an exception (in line with\nos.path.isdir() behavior).
\nPEP 302 Importer that wraps Python’s “classic” import algorithm.
\nIf dirname is a string, a PEP 302 importer is created that searches that\ndirectory. If dirname is None, a PEP 302 importer is created that\nsearches the current sys.path, plus any modules that are frozen or\nbuilt-in.
\nNote that ImpImporter does not currently support being used by\nplacement on sys.meta_path.
\nFind a PEP 302 “loader” object for fullname.
\nIf fullname contains dots, path must be the containing package’s\n__path__. Returns None if the module cannot be found or imported.\nThis function uses iter_importers(), and is thus subject to the same\nlimitations regarding platform-specific special import locations such as the\nWindows registry.
\nRetrieve a PEP 302 importer for the given path_item.
\nThe returned importer is cached in sys.path_importer_cache if it was\nnewly created by a path hook.
\nIf there is no importer, a wrapper around the basic import machinery is\nreturned. This wrapper is never inserted into the importer cache (None\nis inserted instead).
\nThe cache (or part of it) can be cleared manually if a rescan of\nsys.path_hooks is necessary.
\nGet a PEP 302 “loader” object for module_or_name.
\nIf the module or package is accessible via the normal import mechanism, a\nwrapper around the relevant part of that machinery is returned. Returns\nNone if the module cannot be found or imported. If the named module is\nnot already imported, its containing package (if any) is imported, in order\nto establish the package __path__.
\nThis function uses iter_importers(), and is thus subject to the same\nlimitations regarding platform-specific special import locations such as the\nWindows registry.
\nYield PEP 302 importers for the given module name.
\nIf fullname contains a ‘.’, the importers will be for the package containing\nfullname, otherwise they will be importers for sys.meta_path,\nsys.path, and Python’s “classic” import machinery, in that order. If\nthe named module is in a package, that package is imported as a side effect\nof invoking this function.
\nNon-PEP 302 mechanisms (e.g. the Windows registry) used by the standard\nimport machinery to find files in alternative locations are partially\nsupported, but are searched after sys.path. Normally, these\nlocations are searched before sys.path, preventing sys.path\nentries from shadowing them.
\nFor this to cause a visible difference in behaviour, there must be a module\nor package name that is accessible via both sys.path and one of the\nnon-PEP 302 file system mechanisms. In this case, the emulation will find\nthe former version, while the builtin import mechanism will find the latter.
\nItems of the following types can be affected by this discrepancy:\nimp.C_EXTENSION, imp.PY_SOURCE, imp.PY_COMPILED,\nimp.PKG_DIRECTORY.
\nYields (module_loader, name, ispkg) for all submodules on path, or, if\npath is None, all top-level modules on sys.path.
\npath should be either None or a list of paths to look for modules in.
\nprefix is a string to output on the front of every module name on output.
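As an illustrative sketch (not part of the original documentation), the following uses iter_modules() to list the top-level modules visible on sys.path; passing an explicit path list and a prefix restricts and qualifies the scan:

```python
import pkgutil

# Collect (name, ispkg) pairs for every top-level module on sys.path.
top_level = [(name, ispkg)
             for importer, name, ispkg in pkgutil.iter_modules()]
print(len(top_level) > 0)

# Restricting the scan to a directory list and prefixing the names:
#   pkgutil.iter_modules(['/some/dir'], prefix='myns.')
```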
\nYields (module_loader, name, ispkg) for all modules recursively on\npath, or, if path is None, all accessible modules.
\npath should be either None or a list of paths to look for modules in.
\nprefix is a string to output on the front of every module name on output.
\nNote that this function must import all packages (not all modules!) on\nthe given path, in order to access the __path__ attribute to find\nsubmodules.
\nonerror is a function which gets called with one argument (the name of the\npackage which was being imported) if any exception occurs while trying to\nimport a package. If no onerror function is supplied, ImportErrors\nare caught and ignored, while all other exceptions are propagated,\nterminating the search.
\nExamples:
# list all modules python can access
walk_packages()

# list all submodules of ctypes
walk_packages(ctypes.__path__, ctypes.__name__ + '.')
Get a resource from a package.
\nThis is a wrapper for the PEP 302 loader get_data() API. The\npackage argument should be the name of a package, in standard module format\n(foo.bar). The resource argument should be in the form of a relative\nfilename, using / as the path separator. The parent directory name\n.. is not allowed, nor is a rooted name (starting with a /).
\nThe function returns a binary string that is the contents of the specified\nresource.
\nFor packages located in the filesystem, which have already been imported,\nthis is the rough equivalent of:
d = os.path.dirname(sys.modules[package].__file__)
data = open(os.path.join(d, resource), 'rb').read()
If the package cannot be located or loaded, or it uses a PEP 302 loader\nwhich does not support get_data(), then None is returned.
\n\nNew in version 2.6.
\n\nNew in version 2.5.
\nSource code: Lib/runpy.py
\nThe runpy module is used to locate and run Python modules without\nimporting them first. Its main use is to implement the -m command\nline switch that allows scripts to be located using the Python module\nnamespace rather than the filesystem.
\nThe runpy module provides two functions:
\nExecute the code of the specified module and return the resulting module\nglobals dictionary. The module’s code is first located using the standard\nimport mechanism (refer to PEP 302 for details) and then executed in a\nfresh module namespace.
\nIf the supplied module name refers to a package rather than a normal\nmodule, then that package is imported and the __main__ submodule within\nthat package is then executed and the resulting module globals dictionary\nreturned.
\nThe optional dictionary argument init_globals may be used to pre-populate\nthe module’s globals dictionary before the code is executed. The supplied\ndictionary will not be modified. If any of the special global variables\nbelow are defined in the supplied dictionary, those definitions are\noverridden by run_module().
\nThe special global variables __name__, __file__, __loader__\nand __package__ are set in the globals dictionary before the module\ncode is executed (Note that this is a minimal set of variables - other\nvariables may be set implicitly as an interpreter implementation detail).
\n__name__ is set to run_name if this optional argument is not\nNone, to mod_name + '.__main__' if the named module is a\npackage and to the mod_name argument otherwise.
\n__file__ is set to the name provided by the module loader. If the\nloader does not make filename information available, this variable is set\nto None.
\n__loader__ is set to the PEP 302 module loader used to retrieve the\ncode for the module (This loader may be a wrapper around the standard\nimport mechanism).
\n__package__ is set to mod_name if the named module is a package and\nto mod_name.rpartition('.')[0] otherwise.
\nIf the argument alter_sys is supplied and evaluates to True,\nthen sys.argv[0] is updated with the value of __file__ and\nsys.modules[__name__] is updated with a temporary module object for the\nmodule being executed. Both sys.argv[0] and sys.modules[__name__]\nare restored to their original values before the function returns.
\nNote that this manipulation of sys is not thread-safe. Other threads\nmay see the partially initialised module, as well as the altered list of\narguments. It is recommended that the sys module be left alone when\ninvoking this function from threaded code.
\n\nChanged in version 2.7: Added ability to execute packages by looking for a __main__\nsubmodule
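As a minimal sketch of run_module() in action (the choice of the side-effect-free stdlib string module as a target is ours, not the documentation's), the returned dictionary is the module's globals after execution:

```python
import runpy

# Execute the stdlib 'string' module in a fresh namespace and inspect
# the globals dictionary that run_module() returns.
ns = runpy.run_module('string')

print(ns['__name__'])            # 'string'
print('ascii_lowercase' in ns)   # True
```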
\nExecute the code at the named filesystem location and return the resulting\nmodule globals dictionary. As with a script name supplied to the CPython\ncommand line, the supplied path may refer to a Python source file, a\ncompiled bytecode file or a valid sys.path entry containing a __main__\nmodule (e.g. a zipfile containing a top-level __main__.py file).
\nFor a simple script, the specified code is simply executed in a fresh\nmodule namespace. For a valid sys.path entry (typically a zipfile or\ndirectory), the entry is first added to the beginning of sys.path. The\nfunction then looks for and executes a __main__ module using the\nupdated path. Note that there is no special protection against invoking\nan existing __main__ entry located elsewhere on sys.path if\nthere is no such module at the specified location.
\nThe optional dictionary argument init_globals may be used to pre-populate\nthe module’s globals dictionary before the code is executed. The supplied\ndictionary will not be modified. If any of the special global variables\nbelow are defined in the supplied dictionary, those definitions are\noverridden by run_path().
\nThe special global variables __name__, __file__, __loader__\nand __package__ are set in the globals dictionary before the module\ncode is executed (Note that this is a minimal set of variables - other\nvariables may be set implicitly as an interpreter implementation detail).
\n__name__ is set to run_name if this optional argument is not\nNone and to '<run_path>' otherwise.
\n__file__ is set to the name provided by the module loader. If the\nloader does not make filename information available, this variable is set\nto None. For a simple script, this will be set to file_path.
\n__loader__ is set to the PEP 302 module loader used to retrieve the\ncode for the module (This loader may be a wrapper around the standard\nimport mechanism). For a simple script, this will be set to None.
\n__package__ is set to __name__.rpartition('.')[0].
\nA number of alterations are also made to the sys module. Firstly,\nsys.path may be altered as described above. sys.argv[0] is updated\nwith the value of file_path and sys.modules[__name__] is updated\nwith a temporary module object for the module being executed. All\nmodifications to items in sys are reverted before the function\nreturns.
\nNote that, unlike run_module(), the alterations made to sys\nare not optional in this function as these adjustments are essential to\nallowing the execution of sys.path entries. As the thread-safety\nlimitations still apply, use of this function in threaded code should be\neither serialised with the import lock or delegated to a separate process.
\n\nNew in version 2.7.
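A quick sketch of run_path() with a simple script (the throwaway temporary file is our own scaffolding for the example): the script's globals come back as a dictionary, and __name__ defaults to '<run_path>' as described above:

```python
import os
import runpy
import tempfile

# Write a throwaway script, then execute it by filesystem path.
fd, path = tempfile.mkstemp(suffix='.py')
with os.fdopen(fd, 'w') as f:
    f.write("result = 6 * 7\n")

ns = runpy.run_path(path)
print(ns['result'])     # 42
print(ns['__name__'])   # '<run_path>'

os.remove(path)
```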
\nSee also
\nCommand line and environment - CPython command line details
\nThe parser module provides an interface to Python’s internal parser and\nbyte-code compiler. The primary purpose for this interface is to allow Python\ncode to edit the parse tree of a Python expression and create executable code\nfrom this. This is better than trying to parse and modify an arbitrary Python\ncode fragment as a string because parsing is performed in a manner identical to\nthe code forming the application. It is also faster.
\nNote
\nFrom Python 2.5 onward, it’s much more convenient to cut in at the Abstract\nSyntax Tree (AST) generation and compilation stage, using the ast\nmodule.
\nThe parser module exports the names documented here also with “st”\nreplaced by “ast”; this is a legacy from the time when there was no other\nAST and has nothing to do with the AST found in Python 2.5. This is also the\nreason for the functions’ keyword arguments being called ast, not st.\nThe “ast” functions will be removed in Python 3.0.
\nThere are a few things to note about this module which are important to making\nuse of the data structures created. This is not a tutorial on editing the parse\ntrees for Python code, but some examples of using the parser module are\npresented.
\nMost importantly, a good understanding of the Python grammar processed by the\ninternal parser is required. For full information on the language syntax, refer\nto The Python Language Reference. The parser\nitself is created from a grammar specification defined in the file\nGrammar/Grammar in the standard Python distribution. The parse trees\nstored in the ST objects created by this module are the actual output from the\ninternal parser when created by the expr() or suite() functions,\ndescribed below. The ST objects created by sequence2st() faithfully\nsimulate those structures. Be aware that the values of the sequences which are\nconsidered “correct” will vary from one version of Python to another as the\nformal grammar for the language is revised. However, transporting code from one\nPython version to another as source text will always allow correct parse trees\nto be created in the target version, with the only restriction being that\nmigrating to an older version of the interpreter will not support more recent\nlanguage constructs. The parse trees are not typically compatible from one\nversion to another, whereas source code has always been forward-compatible.
\nEach element of the sequences returned by st2list() or st2tuple()\nhas a simple form. Sequences representing non-terminal elements in the grammar\nalways have a length greater than one. The first element is an integer which\nidentifies a production in the grammar. These integers are given symbolic names\nin the C header file Include/graminit.h and the Python module\nsymbol. Each additional element of the sequence represents a component\nof the production as recognized in the input string: these are always sequences\nwhich have the same form as the parent. An important aspect of this structure\nwhich should be noted is that keywords used to identify the parent node type,\nsuch as the keyword if in an if_stmt, are included in the\nnode tree without any special treatment. For example, the if keyword\nis represented by the tuple (1, 'if'), where 1 is the numeric value\nassociated with all NAME tokens, including variable and function names\ndefined by the user. In an alternate form returned when line number information\nis requested, the same token might be represented as (1, 'if', 12), where\nthe 12 represents the line number at which the terminal symbol was found.
\nTerminal elements are represented in much the same way, but without any child\nelements and the addition of the source text which was identified. The example\nof the if keyword above is representative. The various types of\nterminal symbols are defined in the C header file Include/token.h and\nthe Python module token.
\nThe ST objects are not required to support the functionality of this module,\nbut are provided for three purposes: to allow an application to amortize the\ncost of processing complex parse trees, to provide a parse tree representation\nwhich conserves memory space when compared to the Python list or tuple\nrepresentation, and to ease the creation of additional modules in C which\nmanipulate parse trees. A simple “wrapper” class may be created in Python to\nhide the use of ST objects.
\nThe parser module defines functions for a few distinct purposes. The\nmost important purposes are to create ST objects and to convert ST objects to\nother representations such as parse trees and compiled code objects, but there\nare also functions which serve to query the type of parse tree represented by an\nST object.
\nSee also
\n\nST objects may be created from source code or from a parse tree. When creating\nan ST object from source, different functions are used to create the 'eval'\nand 'exec' forms.
\nThis function accepts a parse tree represented as a sequence and builds an\ninternal representation if possible. If it can validate that the tree conforms\nto the Python grammar and all nodes are valid node types in the host version of\nPython, an ST object is created from the internal representation and returned\nto the caller. If there is a problem creating the internal representation, or\nif the tree cannot be validated, a ParserError exception is raised. An\nST object created this way should not be assumed to compile correctly; normal\nexceptions raised by compilation may still be initiated when the ST object is\npassed to compilest(). This may indicate problems not related to syntax\n(such as a MemoryError exception), but may also be due to constructs such\nas the result of parsing del f(0), which escapes the Python parser but is\nchecked by the bytecode compiler.
\nSequences representing terminal tokens may be represented as either two-element\nlists of the form (1, 'name') or as three-element lists of the form (1,\n'name', 56). If the third element is present, it is assumed to be a valid\nline number. The line number may be specified for any subset of the terminal\nsymbols in the input tree.
\nST objects, regardless of the input used to create them, may be converted to\nparse trees represented as list- or tuple- trees, or may be compiled into\nexecutable code objects. Parse trees may be extracted with or without line\nnumbering information.
\nThis function accepts an ST object from the caller in ast and returns a\nPython list representing the equivalent parse tree. The resulting list\nrepresentation can be used for inspection or the creation of a new parse tree in\nlist form. This function does not fail so long as memory is available to build\nthe list representation. If the parse tree will only be used for inspection,\nst2tuple() should be used instead to reduce memory consumption and\nfragmentation. When the list representation is required, this function is\nsignificantly faster than retrieving a tuple representation and converting that\nto nested lists.
\nIf line_info is true, line number information will be included for all\nterminal tokens as a third element of the list representing the token. Note\nthat the line number provided specifies the line on which the token ends.\nThis information is omitted if the flag is false or omitted.
\nThis function accepts an ST object from the caller in ast and returns a\nPython tuple representing the equivalent parse tree. Other than returning a\ntuple instead of a list, this function is identical to st2list().
\nIf line_info is true, line number information will be included for all\nterminal tokens as a third element of the list representing the token. This\ninformation is omitted if the flag is false or omitted.
\nThe Python byte compiler can be invoked on an ST object to produce code objects\nwhich can be used as part of an exec statement or a call to the\nbuilt-in eval() function. This function provides the interface to the\ncompiler, passing the internal parse tree from ast to it, using the\nsource file name specified by the filename parameter. The default value\nsupplied for filename indicates that the source was an ST object.
\nCompiling an ST object may result in exceptions related to compilation; an\nexample would be a SyntaxError caused by the parse tree for del f(0):\nthis statement is considered legal within the formal grammar for Python but is\nnot a legal language construct. The SyntaxError raised for this\ncondition is actually generated by the Python byte-compiler normally, which is\nwhy it can be raised at this point by the parser module. Most causes of\ncompilation failure can be diagnosed programmatically by inspection of the parse\ntree.
\nTwo functions are provided which allow an application to determine if an ST was\ncreated as an expression or a suite. Neither of these functions can be used to\ndetermine if an ST was created from source code via expr() or\nsuite() or from a parse tree via sequence2st().
\nWhen ast represents an 'eval' form, this function returns true, otherwise\nit returns false. This is useful, since code objects normally cannot be queried\nfor this information using existing built-in functions. Note that the code\nobjects created by compilest() cannot be queried like this either, and\nare identical to those created by the built-in compile() function.
\nThe parser module defines a single exception, but may also pass other built-in\nexceptions from other portions of the Python runtime environment. See each\nfunction for information about the exceptions it can raise.
\nNote that the functions compilest(), expr(), and suite() may\nraise exceptions which are normally raised by the parsing and compilation\nprocess. These include the built in exceptions MemoryError,\nOverflowError, SyntaxError, and SystemError. In these\ncases, these exceptions carry all the meaning normally associated with them.\nRefer to the descriptions of each function for detailed information.
\nOrdered and equality comparisons are supported between ST objects. Pickling of\nST objects (using the pickle module) is also supported.
\nST objects have the following methods:
\nWhile many useful operations may take place between parsing and bytecode\ngeneration, the simplest operation is to do nothing. For this purpose, using\nthe parser module to produce an intermediate data structure is equivalent\nto the code
>>> code = compile('a + 5', 'file.py', 'eval')
>>> a = 5
>>> eval(code)
10
The equivalent operation using the parser module is somewhat longer, and\nallows the intermediate internal parse tree to be retained as an ST object:
>>> import parser
>>> st = parser.expr('a + 5')
>>> code = st.compile('file.py')
>>> a = 5
>>> eval(code)
10
An application which needs both ST and code objects can package this code into\nreadily available functions:
import parser

def load_suite(source_string):
    st = parser.suite(source_string)
    return st, st.compile()

def load_expression(source_string):
    st = parser.expr(source_string)
    return st, st.compile()
Source code: Lib/symbol.py
\nThis module provides constants which represent the numeric values of internal\nnodes of the parse tree. Unlike most Python constants, these use lower-case\nnames. Refer to the file Grammar/Grammar in the Python distribution for\nthe definitions of the names in the context of the language grammar. The\nspecific numeric values which the names map to may change between Python\nversions.
\nThis module also provides one additional data object:
\n\nNew in version 2.5: The low-level _ast module containing only the node classes.
\n\nNew in version 2.6: The high-level ast module containing all helpers.
\nSource code: Lib/ast.py
\nThe ast module helps Python applications to process trees of the Python\nabstract syntax grammar. The abstract syntax itself might change with each\nPython release; this module helps to find out programmatically what the current\ngrammar looks like.
\nAn abstract syntax tree can be generated by passing ast.PyCF_ONLY_AST as\na flag to the compile() built-in function, or using the parse()\nhelper provided in this module. The result will be a tree of objects whose\nclasses all inherit from ast.AST. An abstract syntax tree can be\ncompiled into a Python code object using the built-in compile() function.
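For instance, a minimal round trip from source to AST to code object looks like this (a sketch of the two entry points just described, parse() and the built-in compile()):

```python
import ast

# Parse source text into a tree of ast.AST subclasses.
tree = ast.parse("x = 1 + 2")
print(type(tree).__name__)   # 'Module'

# Compile the tree back into an executable code object.
code = compile(tree, "<ast>", "exec")
namespace = {}
exec(code, namespace)
print(namespace['x'])        # 3
```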
\nThis is the base of all AST node classes. The actual node classes are\nderived from the Parser/Python.asdl file, which is reproduced\nbelow. They are defined in the _ast C\nmodule and re-exported in ast.
\nThere is one class defined for each left-hand side symbol in the abstract\ngrammar (for example, ast.stmt or ast.expr). In addition,\nthere is one class defined for each constructor on the right-hand side; these\nclasses inherit from the classes for the left-hand side trees. For example,\nast.BinOp inherits from ast.expr. For production rules\nwith alternatives (aka “sums”), the left-hand side class is abstract: only\ninstances of specific constructor nodes are ever created.
\nEach concrete class has an attribute _fields which gives the names\nof all child nodes.
\nEach instance of a concrete class has one attribute for each child node,\nof the type as defined in the grammar. For example, ast.BinOp\ninstances have an attribute left of type ast.expr.
\nIf these attributes are marked as optional in the grammar (using a\nquestion mark), the value might be None. If the attributes can have\nzero-or-more values (marked with an asterisk), the values are represented\nas Python lists. All possible attributes must be present and have valid\nvalues when compiling an AST with compile().
\nThe constructor of a class ast.T parses its arguments as follows:
\nFor example, to create and populate an ast.UnaryOp node, you could\nuse
node = ast.UnaryOp()
node.op = ast.USub()
node.operand = ast.Num()
node.operand.n = 5
node.operand.lineno = 0
node.operand.col_offset = 0
node.lineno = 0
node.col_offset = 0
or the more compact
node = ast.UnaryOp(ast.USub(), ast.Num(5, lineno=0, col_offset=0),
                   lineno=0, col_offset=0)
\nNew in version 2.6: The constructor as explained above was added. In Python 2.5 nodes had\nto be created by calling the class constructor without arguments and\nsetting the attributes afterwards.
\nThe module defines a string constant __version__ which is the decimal\nSubversion revision number of the file shown below.
\nThe abstract grammar is currently defined as follows:
-- ASDL's five builtin types are identifier, int, string, object, bool

module Python version "$Revision$"
{
    mod = Module(stmt* body)
        | Interactive(stmt* body)
        | Expression(expr body)

        -- not really an actual node but useful in Jython's typesystem.
        | Suite(stmt* body)

    stmt = FunctionDef(identifier name, arguments args,
                       stmt* body, expr* decorator_list)
         | ClassDef(identifier name, expr* bases, stmt* body, expr* decorator_list)
         | Return(expr? value)

         | Delete(expr* targets)
         | Assign(expr* targets, expr value)
         | AugAssign(expr target, operator op, expr value)

         -- not sure if bool is allowed, can always use int
         | Print(expr? dest, expr* values, bool nl)

         -- use 'orelse' because else is a keyword in target languages
         | For(expr target, expr iter, stmt* body, stmt* orelse)
         | While(expr test, stmt* body, stmt* orelse)
         | If(expr test, stmt* body, stmt* orelse)
         | With(expr context_expr, expr? optional_vars, stmt* body)

         -- 'type' is a bad name
         | Raise(expr? type, expr? inst, expr? tback)
         | TryExcept(stmt* body, excepthandler* handlers, stmt* orelse)
         | TryFinally(stmt* body, stmt* finalbody)
         | Assert(expr test, expr? msg)

         | Import(alias* names)
         | ImportFrom(identifier? module, alias* names, int? level)

         -- Doesn't capture requirement that locals must be
         -- defined if globals is
         -- still supports use as a function!
         | Exec(expr body, expr? globals, expr? locals)

         | Global(identifier* names)
         | Expr(expr value)
         | Pass | Break | Continue

         -- XXX Jython will be different
         -- col_offset is the byte offset in the utf8 string the parser uses
         attributes (int lineno, int col_offset)

    -- BoolOp() can use left & right?
    expr = BoolOp(boolop op, expr* values)
         | BinOp(expr left, operator op, expr right)
         | UnaryOp(unaryop op, expr operand)
         | Lambda(arguments args, expr body)
         | IfExp(expr test, expr body, expr orelse)
         | Dict(expr* keys, expr* values)
         | Set(expr* elts)
         | ListComp(expr elt, comprehension* generators)
         | SetComp(expr elt, comprehension* generators)
         | DictComp(expr key, expr value, comprehension* generators)
         | GeneratorExp(expr elt, comprehension* generators)
         -- the grammar constrains where yield expressions can occur
         | Yield(expr? value)
         -- need sequences for compare to distinguish between
         -- x < 4 < 3 and (x < 4) < 3
         | Compare(expr left, cmpop* ops, expr* comparators)
         | Call(expr func, expr* args, keyword* keywords,
                expr? starargs, expr? kwargs)
         | Repr(expr value)
         | Num(object n) -- a number as a PyObject.
         | Str(string s) -- need to specify raw, unicode, etc?
         -- other literals? bools?

         -- the following expression can appear in assignment context
         | Attribute(expr value, identifier attr, expr_context ctx)
         | Subscript(expr value, slice slice, expr_context ctx)
         | Name(identifier id, expr_context ctx)
         | List(expr* elts, expr_context ctx)
         | Tuple(expr* elts, expr_context ctx)

         -- col_offset is the byte offset in the utf8 string the parser uses
         attributes (int lineno, int col_offset)

    expr_context = Load | Store | Del | AugLoad | AugStore | Param

    slice = Ellipsis | Slice(expr? lower, expr? upper, expr? step)
          | ExtSlice(slice* dims)
          | Index(expr value)

    boolop = And | Or

    operator = Add | Sub | Mult | Div | Mod | Pow | LShift
             | RShift | BitOr | BitXor | BitAnd | FloorDiv

    unaryop = Invert | Not | UAdd | USub

    cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

    comprehension = (expr target, expr iter, expr* ifs)

    -- not sure what to call the first argument for raise and except
    excepthandler = ExceptHandler(expr? type, expr? name, stmt* body)
                    attributes (int lineno, int col_offset)

    arguments = (expr* args, identifier? vararg,
                 identifier? kwarg, expr* defaults)

    -- keyword arguments supplied to call
    keyword = (identifier arg, expr value)

    -- import name with optional 'as' alias.
    alias = (identifier name, identifier? asname)
}
\n\nNew in version 2.6.
\nApart from the node classes, the ast module defines these utility functions\nand classes for traversing abstract syntax trees:
\nSafely evaluate an expression node or a string containing a Python\nexpression. The string or node provided may only consist of the following\nPython literal structures: strings, numbers, tuples, lists, dicts, booleans,\nand None.
\nThis can be used for safely evaluating strings containing Python expressions\nfrom untrusted sources without the need to parse the values oneself.
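For example, literal_eval() accepts nested literal structures but rejects anything that would require executing code (the malicious string below is our own illustration):

```python
import ast

# Literal syntax is evaluated to the corresponding Python value.
print(ast.literal_eval("[1, 2, {'a': (3, 4)}]"))   # [1, 2, {'a': (3, 4)}]

# A function call is not a literal, so it is rejected.
try:
    ast.literal_eval("__import__('os').system('ls')")
except ValueError:
    print("rejected")
```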
\nA node visitor base class that walks the abstract syntax tree and calls a\nvisitor function for every node found. This function may return a value\nwhich is forwarded by the visit() method.
\nThis class is meant to be subclassed, with the subclass adding visitor\nmethods.
\nThis visitor calls visit() on all children of the node.
\nNote that child nodes of nodes that have a custom visitor method won’t be\nvisited unless the visitor calls generic_visit() or visits them\nitself.
\nDon’t use the NodeVisitor if you want to apply changes to nodes\nduring traversal. For this a special visitor exists\n(NodeTransformer) that allows modifications.
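A minimal sketch of such a subclass (the class name and target snippet are illustrative, not part of the module): it overrides visit_Name() and calls generic_visit() so that descent into children continues.

```python
import ast

class NameCollector(ast.NodeVisitor):
    """Record every Name node encountered, in visit order."""

    def __init__(self):
        self.names = []

    def visit_Name(self, node):
        self.names.append(node.id)
        # A custom visitor method suppresses the automatic descent, so
        # call generic_visit() to keep walking any child nodes.
        self.generic_visit(node)

tree = ast.parse("x = a + b * a")
collector = NameCollector()
collector.visit(tree)
print(collector.names)   # ['x', 'a', 'b', 'a']
```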
\nA NodeVisitor subclass that walks the abstract syntax tree and\nallows modification of nodes.
\nThe NodeTransformer will walk the AST and use the return value of\nthe visitor methods to replace or remove the old node. If the return value\nof the visitor method is None, the node will be removed from its\nlocation, otherwise it is replaced with the return value. The return value\nmay be the original node in which case no replacement takes place.
\nHere is an example transformer that rewrites all occurrences of name lookups\n(foo) to data['foo']:
\nfrom ast import (NodeTransformer, Subscript, Name, Index, Str, Load,\n                 copy_location)\n\nclass RewriteName(NodeTransformer):\n\n    def visit_Name(self, node):\n        return copy_location(Subscript(\n            value=Name(id='data', ctx=Load()),\n            slice=Index(value=Str(s=node.id)),\n            ctx=node.ctx\n        ), node)\n
Keep in mind that if the node you’re operating on has child nodes you must\neither transform the child nodes yourself or call the generic_visit()\nmethod for the node first.
\nFor nodes that were part of a collection of statements (that applies to all\nstatement nodes), the visitor may also return a list of nodes rather than\njust a single node.
\nUsually you use the transformer like this:
\nnode = YourTransformer().visit(node)\n
Symbol tables are generated by the compiler from the AST just before bytecode is\ngenerated. The symbol table is responsible for calculating the scope of every\nidentifier in the code. symtable provides an interface to examine these\ntables.
\nA namespace table for a block. The constructor is not public.
\nA namespace for a function or method. This class inherits\nSymbolTable.
\nA namespace of a class. This class inherits SymbolTable.
\nAn entry in a SymbolTable corresponding to an identifier in the\nsource. The constructor is not public.
\nReturn True if the name binding introduces a new namespace.
\nIf the name is used as the target of a function or class statement, this\nwill be true.
\nFor example:
\n>>> import symtable\n>>> table = symtable.symtable("def some_func(): pass", "string", "exec")\n>>> table.lookup("some_func").is_namespace()\nTrue\n
Note that a single name can be bound to multiple objects. If the result\nis True, the name may also be bound to other objects, like an int or\nlist, that do not introduce a new namespace.
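A slightly larger sketch tying these classes together (the function bodies are illustrative): the top-level table is queried for a function name, and the Symbol's get_namespace() yields the table describing that function's block.

```python
import symtable

code = (
    "def outer(x):\n"
    "    y = x + 1\n"
    "    def inner():\n"
    "        return y\n"
    "    return inner\n"
)
top = symtable.symtable(code, "<example>", "exec")

# lookup() returns a Symbol; get_namespace() returns the table for the
# function block that the name binds.
sym = top.lookup("outer")
print(sym.is_namespace())                     # True
func_table = sym.get_namespace()
print(func_table.get_name())                  # 'outer'
print(func_table.lookup("x").is_parameter())  # True
print(sorted(func_table.get_identifiers()))   # ['inner', 'x', 'y']
```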
\nSource code: Lib/keyword.py
\nThis module allows a Python program to determine if a string is a keyword.
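For example, iskeyword() tests a single string, and kwlist holds the complete list of keywords for the running interpreter:

```python
import keyword

# iskeyword() tests one candidate string; kwlist is the full sequence of
# keywords as defined for the interpreter being run.
print(keyword.iskeyword("for"))    # True
print(keyword.iskeyword("spam"))   # False
print("pass" in keyword.kwlist)    # True
```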
\nSource code: Lib/token.py
\nThis module provides constants which represent the numeric values of leaf nodes\nof the parse tree (terminal tokens). Refer to the file Grammar/Grammar\nin the Python distribution for the definitions of the names in the context of\nthe language grammar. The specific numeric values which the names map to may\nchange between Python versions.
\nThe module also provides a mapping from numeric codes to names and some\nfunctions. The functions mirror definitions in the Python C header files.
\nThe token constants are:
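The mapping and classification helpers can be exercised directly; tok_name translates a numeric code back into its symbolic name:

```python
import token

# tok_name maps each numeric token code back to its symbolic name, which
# is handy when printing tokens; ISTERMINAL() and ISEOF() classify codes.
print(token.tok_name[token.NAME])      # 'NAME'
print(token.ISTERMINAL(token.NUMBER))  # True
print(token.ISEOF(token.ENDMARKER))    # True
```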
\nSource code: Lib/tabnanny.py
\nFor the time being this module is intended to be called as a script. However it\nis possible to import it into an IDE and use the function check()\ndescribed below.
\nNote
\nThe API provided by this module is likely to change in future releases; such\nchanges may not be backward compatible.
\nSee also
\nSource code: Lib/pyclbr.py
\nThe pyclbr module can be used to determine some limited information\nabout the classes, methods and top-level functions defined in a module. The\ninformation provided is sufficient to implement a traditional three-pane\nclass browser. The information is extracted from the source code rather\nthan by importing the module, so this module is safe to use with untrusted\ncode. This restriction makes it impossible to use this module with modules\nnot implemented in Python, including all standard and optional extension\nmodules.
\nThe Class objects used as values in the dictionary returned by\nreadmodule() and readmodule_ex() provide the following data\nattributes:
\nThe Function objects used as values in the dictionary returned by\nreadmodule_ex() provide the following attributes:
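A small usage sketch (textwrap is just a convenient pure-Python stdlib target): readmodule_ex() returns a dictionary mapping names to Class and Function descriptors without importing the module.

```python
import pyclbr

# Scan a pure-Python module's source for classes and top-level functions;
# nothing in the scanned module is executed.
info = pyclbr.readmodule_ex("textwrap")
desc = info["TextWrapper"]
print(type(desc).__name__)   # 'Class'
print(desc.lineno > 0)       # True
print("dedent" in info)      # top-level functions are included too
```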
\nSource code: Lib/py_compile.py
\nThe py_compile module provides a function to generate a byte-code file\nfrom a source file, and another function used when the module source file is\ninvoked as a script.
\nThough not often needed, this function can be useful when installing modules for\nshared use, especially if some of the users may not have permission to write the\nbyte-code cache files in the directory containing the source code.
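A minimal sketch of explicit byte-compilation (the module contents and temporary path are illustrative); the cfile argument names the byte-code output file directly:

```python
import os
import py_compile
import tempfile

# Write a throwaway module, then byte-compile it to an explicit cfile path.
d = tempfile.mkdtemp()
src = os.path.join(d, "hello.py")
with open(src, "w") as f:
    f.write("MESSAGE = 'hi'\n")

py_compile.compile(src, cfile=src + "c")
print(os.path.exists(src + "c"))   # True
```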
\nCompile several source files. The files named in args (or on the command\nline, if args is not specified) are compiled and the resulting bytecode is\ncached in the normal manner. This function does not search a directory\nstructure to locate source files; it only compiles files named explicitly.\nIf '-' is the only parameter in args, the list of files is taken from\nstandard input.
\n\nChanged in version 2.7: Added support for '-'.
\nWhen this module is run as a script, the main() is used to compile all the\nfiles named on the command line. The exit status is nonzero if one of the files\ncould not be compiled.
\n\nChanged in version 2.6: Added the nonzero exit status when module is run as a script.
\nSee also
\nSource code: Lib/tokenize.py
\nThe tokenize module provides a lexical scanner for Python source code,\nimplemented in Python. The scanner in this module returns comments as tokens as\nwell, making it useful for implementing “pretty-printers,” including colorizers\nfor on-screen displays.
\nThe primary entry point is a generator:
\nThe generate_tokens() generator requires one argument, readline,\nwhich must be a callable object which provides the same interface as the\nreadline() method of built-in file objects (see section\nFile Objects). Each call to the function should return one line\nof input as a string.
\nThe generator produces 5-tuples with these members: the token type; the token\nstring; a 2-tuple (srow, scol) of ints specifying the row and column\nwhere the token begins in the source; a 2-tuple (erow, ecol) of ints\nspecifying the row and column where the token ends in the source; and the\nline on which the token was found. The line passed (the last tuple item) is\nthe logical line; continuation lines are included.
\n\nNew in version 2.2.
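A short sketch of the generator in use (on Python 2 the readline callable would come from the StringIO module rather than io): each yielded 5-tuple carries the type, string, start and end positions, and source line.

```python
import tokenize
from io import StringIO   # on Python 2: from StringIO import StringIO

# Tokenize one logical line and print each 5-tuple's leading members.
for toknum, tokval, start, end, line in tokenize.generate_tokens(
        StringIO("x = 1\n").readline):
    print(tokenize.tok_name[toknum], repr(tokval), start, end)
```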
\nAn older entry point is retained for backward compatibility:
\nThe tokenize() function accepts two parameters: one representing the input\nstream, and one providing an output mechanism for tokenize().
\nThe first parameter, readline, must be a callable object which provides the\nsame interface as the readline() method of built-in file objects (see\nsection File Objects). Each call to the function should return one\nline of input as a string. Alternately, readline may be a callable object that\nsignals completion by raising StopIteration.
\n\nChanged in version 2.5: Added StopIteration support.
\nThe second parameter, tokeneater, must also be a callable object. It is\ncalled once for each token, with five arguments, corresponding to the tuples\ngenerated by generate_tokens().
\nAll constants from the token module are also exported from\ntokenize, as are two additional token type values that might be passed to\nthe tokeneater function by tokenize():
\nAnother function is provided to reverse the tokenization process. This is useful\nfor creating tools that tokenize a script, modify the token stream, and write\nback the modified script.
\nConverts tokens back into Python source code. The iterable must return\nsequences with at least two elements, the token type and the token string. Any\nadditional sequence elements are ignored.
\nThe reconstructed script is returned as a single string. The result is\nguaranteed to tokenize back to match the input so that the conversion is\nlossless and round-trips are assured. The guarantee applies only to the token\ntype and token string as the spacing between tokens (column positions) may\nchange.
\n\nNew in version 2.5.
\nExample of a script re-writer that transforms float literals into Decimal\nobjects:
\nfrom StringIO import StringIO\nfrom tokenize import generate_tokens, untokenize, NUMBER, NAME, OP, STRING\n\ndef decistmt(s):\n    """Substitute Decimals for floats in a string of statements.\n\n    >>> from decimal import Decimal\n    >>> s = 'print +21.3e-5*-.1234/81.7'\n    >>> decistmt(s)\n    "print +Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7')"\n\n    >>> exec(s)\n    -3.21716034272e-007\n    >>> exec(decistmt(s))\n    -3.217160342717258261933904529E-7\n\n    """\n    result = []\n    g = generate_tokens(StringIO(s).readline)   # tokenize the string\n    for toknum, tokval, _, _, _ in g:\n        if toknum == NUMBER and '.' in tokval:  # replace NUMBER tokens\n            result.extend([\n                (NAME, 'Decimal'),\n                (OP, '('),\n                (STRING, repr(tokval)),\n                (OP, ')')\n            ])\n        else:\n            result.append((toknum, tokval))\n    return untokenize(result)\n
This module provides some utility functions to support installing Python\nlibraries. These functions compile Python source files in a directory tree.\nThis module can be used to create the cached byte-code files at library\ninstallation time, which makes them available for use even by users who don’t\nhave write permission to the library directories.
\nThis module can work as a script (using python -m compileall) to\ncompile Python sources.
\n\nChanged in version 2.7: Added the -i option.
\nRecursively descend the directory tree named by dir, compiling all .py\nfiles along the way.
\nThe maxlevels parameter is used to limit the depth of the recursion; it\ndefaults to 10.
\nIf ddir is given, it is prepended to the path to each file being compiled\nfor use in compilation time tracebacks, and is also compiled in to the\nbyte-code file, where it will be used in tracebacks and other messages in\ncases where the source file does not exist at the time the byte-code file is\nexecuted.
\nIf force is true, modules are re-compiled even if the timestamps are up to\ndate.
\nIf rx is given, its search method is called on the complete path to each\nfile considered for compilation, and if it returns a true value, the file\nis skipped.
\nIf quiet is true, nothing is printed to the standard output unless errors\noccur.
\nCompile the file with path fullname.
\nIf ddir is given, it is prepended to the path to the file being compiled\nfor use in compilation time tracebacks, and is also compiled in to the\nbyte-code file, where it will be used in tracebacks and other messages in\ncases where the source file does not exist at the time the byte-code file is\nexecuted.
\nIf rx is given, its search method is passed the full path name to the\nfile being compiled, and if it returns a true value, the file is not\ncompiled and True is returned.
\nIf quiet is true, nothing is printed to the standard output unless errors\noccur.
\n\nNew in version 2.7.
\nTo force a recompile of all the .py files in the Lib/\nsubdirectory and all its subdirectories:
\nimport compileall\n\ncompileall.compile_dir('Lib/', force=True)\n\n# Perform same compilation, excluding files in .svn directories.\nimport re\ncompileall.compile_dir('Lib/', rx=re.compile('/[.]svn'), force=True)\n
See also
\n\nNew in version 2.3.
\nSource code: Lib/pickletools.py
\nThis module contains various constants relating to the intimate details of the\npickle module, some lengthy comments about the implementation, and a few\nuseful functions for analyzing pickled data. The contents of this module are\nuseful for Python core developers who are working on the pickle and\ncPickle implementations; ordinary users of the pickle module\nprobably won’t find the pickletools module relevant.
\nReturns a new equivalent pickle string after eliminating unused PUT\nopcodes. The optimized pickle is shorter, takes less transmission time,\nrequires less storage space, and unpickles more efficiently.
\n\nNew in version 2.6.
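A quick sketch of optimize() at work (the sample data is illustrative): the optimized pickle is no longer than the original and still unpickles to an equal object.

```python
import pickle
import pickletools

data = {"a": [1, 2, 3], "b": ("x", "y")}
raw = pickle.dumps(data)
slim = pickletools.optimize(raw)   # strip PUT opcodes that are never used

print(len(slim) <= len(raw))       # True
print(pickle.loads(slim) == data)  # True
```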
\nSource code: Lib/dis.py
\nThe dis module supports the analysis of CPython bytecode by\ndisassembling it. The CPython bytecode which this module takes as an\ninput is defined in the file Include/opcode.h and used by the compiler\nand the interpreter.
\nCPython implementation detail: Bytecode is an implementation detail of the CPython interpreter! No\nguarantees are made that bytecode will not be added, removed, or changed\nbetween versions of Python. Use of this module should not be considered to\nwork across Python VMs or Python releases.
\nExample: Given the function myfunc():
\ndef myfunc(alist):\n    return len(alist)\n
the following command can be used to get the disassembly of myfunc():
\n>>> dis.dis(myfunc)\n 2 0 LOAD_GLOBAL 0 (len)\n 3 LOAD_FAST 0 (alist)\n 6 CALL_FUNCTION 1\n 9 RETURN_VALUE\n
(The “2” is a line number).
\nThe dis module defines the following functions and constants:
\nDisassembles a code object, indicating the last instruction if lasti was\nprovided. The output is divided into the following columns:
\nThe parameter interpretation recognizes local and global variable names,\nconstant values, branch targets, and compare operators.
\nThe Python compiler currently generates the following bytecode instructions.
\nUnary operations take the top of the stack, apply the operation, and push the\nresult back on the stack.
\nBinary operations remove the top of the stack (TOS) and the second top-most\nstack item (TOS1) from the stack. They perform the operation, and put the\nresult back on the stack.
\nIn-place operations are like binary operations, in that they remove TOS and\nTOS1, and push the result back on the stack, but the operation is done in-place\nwhen TOS1 supports it, and the resulting TOS may be (but does not have to be)\nthe original TOS1.
\nThe slice opcodes take up to three parameters.
\nSlice assignment needs yet another parameter. Like any statement, slice\nassignments put nothing on the stack.
\nMiscellaneous opcodes.
\nCleans up the stack when a with statement block exits. On top of\nthe stack are 1–3 values indicating how/why the finally clause was entered:
\nUnder them is EXIT, the context manager’s __exit__() bound method.
\nIn the last case, EXIT(TOP, SECOND, THIRD) is called, otherwise\nEXIT(None, None, None).
\nEXIT is removed from the stack, leaving the values above it in the same\norder. In addition, if the stack represents an exception, and the function\ncall returns a ‘true’ value, this information is “zapped”, to prevent\nEND_FINALLY from re-raising the exception. (But non-local gotos should\nstill be resumed.)
\nAll of the following opcodes expect arguments. An argument is two bytes, with\nthe more significant byte last.
\nPushes a slice object on the stack. argc must be 2 or 3. If it is 2,\nslice(TOS1, TOS) is pushed; if it is 3, slice(TOS2, TOS1, TOS) is\npushed. See the slice() built-in function for more information.
\n