8.5 Doc Strings

When writing production code, or any code that is going to be shared, it is a good idea to use docstrings. There is a lot of flexibility in how you format them; in this section we will draw on the official Python Enhancement Proposals (PEPs) and the NumPy style guide. From PEP 257 - Docstring Conventions:

"A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition."

The role of the docstring is to explain what the module / function / class / method does and how to use it. In this section, and in these notes in general, we will focus on docstrings for functions, as defining modules and classes is beyond the scope of the notes. Docstrings are recognised by the Python compiler and can be accessed using the __doc__ attribute. For example:

print(print.__doc__)
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
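
The same attribute is available on functions you define yourself. The short sketch below uses a hypothetical function, `area`, which is not part of these notes. It also shows `inspect.getdoc` from the standard library, which returns the docstring with the source indentation removed.

```python
import inspect


def area(width, height):
    """Return the area of a rectangle.

    The inputs are assumed to be in the same units.
    """
    return width * height


# The raw attribute keeps the indentation used in the source file.
print(area.__doc__)

# inspect.getdoc strips the common leading indentation.
print(inspect.getdoc(area))
```

In an interactive session, `help(area)` displays the same text together with the function signature.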

To create a one-line docstring for a (simple) function, place a string literal on the first line of the function body:

def add(a, b):
    """Return the sum of two numbers."""
    return a + b

Some conventions for these (from PEP 257) are:

  • Use triple quotes (""") - this makes it easy to expand to multi-lined if needed.

  • The opening and closing quotes are on the same line.

  • No blank lines before or after the docstring.

  • The docstring is a phrase ending in a period that states the function's effect as a command: """Do X and return Y."""

Creating a multi-line docstring is similar, with the triple quotes allowing for new lines:

def quad_roots(a, b, c, real_only=True):
    """Find the roots of a quadratic equation.

    The quadratic equation can be represented as:
    a x^2 + b x + c = 0

    Roots are calculated using the equation:
    x = (-b +- sqrt(b^2 - 4ac)) / (2a)

    If real_only is True and the roots are complex,
    returns None. Otherwise returns a tuple containing
    each root.

    Arguments:
    a:  coefficient of the second order term
    b:  coefficient of the first order term
    c:  coefficient of the zero order term

    Optional keyword arguments:
    real_only:   set to True if you only want real roots returned (default True)
    """

    det = b * b - 4 * a * c

    if real_only and det < 0:
        return None
    
    return 0.5 * (-b + det ** 0.5) / a, 0.5 * (-b - det ** 0.5) / a

Note that for the function above a one-line docstring is probably more appropriate.

The PEP 257 conventions can be summarised as:

  • The format:

    """Summary line.
    
    Elaborate description.
    """
    
  • Should document:

    • A summary of the function’s behavior

    • Arguments

    • Return value(s)

    • Side effects

    • Exceptions raised

    • Restrictions on when the function can be called

    • Optional arguments (these should be indicated as optional)
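
As a rough illustration of the checklist above, the sketch below documents a hypothetical function, `mean_of`, in the plain PEP 257 style. The function and its argument names are invented for this example; the point is only to show where arguments, the return value, side effects and exceptions can be mentioned.

```python
def mean_of(values, skip_none=False):
    """Return the arithmetic mean of a sequence of numbers.

    Arguments:
    values:  a non-empty sequence of numbers

    Optional keyword arguments:
    skip_none:  if True, ignore entries equal to None (default False)

    Returns a float. Has no side effects; the input is not modified.

    Raises ValueError if there are no numbers to average.
    """
    if skip_none:
        # Build a new list so the caller's sequence is left untouched.
        values = [v for v in values if v is not None]
    if len(values) == 0:
        raise ValueError("mean_of() needs at least one number")
    return sum(values) / len(values)


print(mean_of([1, 2, 3, None], skip_none=True))  # 2.0
```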

NumPy Docstring Style Guide

The PEP guide is a top-level overview. For a more detailed guide on formatting your docstrings, I recommend the NumPy Style Guide, which is summarised below. An advantage of NumPy style docstrings is that they can be converted to web-based documentation by Sphinx (with an extension such as numpydoc or sphinx.ext.napoleon).
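
To give a feel for what the Sphinx route involves, here is a minimal sketch of a project's `conf.py`. The project name is a placeholder, and the choice of `sphinx.ext.napoleon` (which ships with Sphinx) rather than the separate numpydoc package is an assumption made for this example.

```python
# conf.py (minimal sketch; "my_project" is a placeholder name)
project = "my_project"

extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings out of importable code
    "sphinx.ext.napoleon",  # parses NumPy (and Google) style sections
]

# Assumption: only NumPy style docstrings are used in this project.
napoleon_numpy_docstring = True
napoleon_google_docstring = False
```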

NumPy styled docstrings can be broken into the sections:

  1. Short summary

  2. Deprecation warning

  3. Extended summary

  4. Parameters

  5. Returns

  6. Yields

  7. Receives

  8. Other Parameters

  9. Raises

  10. Warns

  11. Warning

  12. See Also

  13. Notes

  14. References

  15. Examples

Many of these sections are beyond the scope of this course. We will only cover the Short Summary, Extended Summary, Parameters and Returns sections below.

Short Summary

A one-line summary that does not use variable names or function names. This is much like the one-line docstring discussed in PEP 257.

Extended Summary

A few sentences giving an extended description. This should clarify functionality, not implementation details or background theory (these go in the Notes section).

Parameters

Description of function arguments, keywords and their types.

Parameters
----------
x : type
    Description of parameter `x`.
y
    Description of parameter `y` (with type not specified).

Examples of parameter types (be as precise as possible):

Parameters
----------
filename : str
copy : bool
dtype : data-type
iterable : iterable object
shape : int or tuple of int
files : list of str

For optional keyword arguments:

x : int, optional

Note that default values are part of the function signature, but can be detailed in the description (see the style guide for details).
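
The conventions above can be sketched on a small function. `scale_values` and its arguments are hypothetical names chosen for this example, and the default is mentioned in the description, which is one of the options the style guide allows. The Returns section is left out here because it is covered in the next subsection.

```python
def scale_values(values, factor=2.0):
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to be scaled. The list itself is not modified.
    factor : float, optional
        Multiplier applied to each value. Default is 2.0.
    """
    return [factor * v for v in values]


print(scale_values([1.0, 2.5]))       # [2.0, 5.0]
print(scale_values([1.0, 2.5], 0.5))  # [0.5, 1.25]
```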

Returns

This is an explanation of returned values and types. It is similar to the Parameters section, except:

  • Name is optional

  • Type is always required

An example with no name:

Returns
-------
int
    Description of anonymous integer return value.

An example with names (looks like Parameters):

Returns
-------
err_code : int
    Non-zero value indicates error code, or zero on success.
err_msg : str or None
    Human readable error message, or None on success.
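
As an illustration of named return values attached to working code, the sketch below uses a hypothetical function, `parse_int`, that returns a value together with an error message. The names are invented for this example.

```python
def parse_int(text):
    """Convert a string to an integer and report any failure.

    Parameters
    ----------
    text : str
        Text expected to contain a whole number.

    Returns
    -------
    value : int or None
        The parsed integer, or None if parsing failed.
    err_msg : str or None
        Human readable error message, or None on success.
    """
    try:
        return int(text), None
    except ValueError as exc:
        return None, str(exc)


value, err_msg = parse_int("42")
print(value, err_msg)  # 42 None
```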

Template

Putting the sections covered above together, we have the template:

def function(x, y):
    """
    Short Summary
    
    Extended Summary
    
    Parameters
    ----------
    x : type
        Description of parameter `x`.
    y
        Description of parameter `y` (with type not specified).
    
    Returns
    -------
    z : type
        Description of return value
    """

Rewritten in this format, the earlier multi-line docstring example becomes:

def quad_roots(a, b, c, real_only=True):
    r"""Find the roots of a quadratic equation.

    If real_only is True and the roots are complex,
    returns None. Otherwise returns a tuple containing
    each root.

    Parameters
    ----------
    a : float
        coefficient of the second order term
    b : float
        coefficient of the first order term
    c : float
        coefficient of the zero order term
    real_only : bool, default True
        set to True if you only want real roots returned

    Returns
    -------
    float
        first root of the quadratic equation
    float
        second root of the quadratic equation

    Notes
    -----
    The quadratic equation can be represented as:

    .. math:: a x^2 + b x + c = 0

    Roots are calculated using the equation:

    .. math:: x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
    """

    det = b * b - 4 * a * c

    if real_only and det < 0:
        return None
    
    return 0.5 * (-b + det ** 0.5) / a, 0.5 * (-b - det ** 0.5) / a

I have included the “Notes” section here, along with LaTeX-formatted maths. For more on how to structure the notes, see the NumPy style guide.
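
One practical point when putting LaTeX in a docstring: in an ordinary string literal, Python treats sequences such as \f as escape characters, so \frac does not survive intact. Writing the docstring as a raw string (prefix r) avoids this. The two functions below are hypothetical and exist only to show the difference.

```python
def plain_doc():
    """Uses \frac in an ordinary string."""


def raw_doc():
    r"""Uses \frac in a raw string."""


# In the ordinary string, "\f" was turned into a form-feed character.
print("\f" in plain_doc.__doc__)      # True
print("\\frac" in plain_doc.__doc__)  # False

# In the raw string, the backslash is kept and the LaTeX is intact.
print("\\frac" in raw_doc.__doc__)    # True
```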