.. _developers-tips:

===========================
Developers' Tips and Tricks
===========================

Productivity and sanity-preserving tips
=======================================

In this section we gather some useful advice and tools that may increase your
quality-of-life when reviewing pull requests, running unit tests, and so forth.
Some of these tricks consist of userscripts that require a browser extension
such as `TamperMonkey`_ or `GreaseMonkey`_; to set up userscripts you must have
one of these extensions installed, enabled and running.  We provide userscripts
as GitHub gists; to install them, click on the "Raw" button on the gist page.

.. _TamperMonkey: https://tampermonkey.net
.. _GreaseMonkey: http://www.greasespot.net

Viewing the rendered HTML documentation for a pull request
----------------------------------------------------------

We use CircleCI to build the HTML documentation for every pull request. To
access that documentation, we provide a redirect as described in the
:ref:`documentation section of the contributor guide
<contribute_documentation>`. Instead of typing the address by hand, we provide a
`userscript <https://gist.github.com/lesteve/470170f288884ec052bcf4bc4ffe958a>`_
that adds a button to every PR. After installing the userscript, navigate to any
GitHub PR; a new button labeled "See CircleCI doc for this PR" should appear in
the top-right area.

Folding and unfolding outdated diffs on pull requests
-----------------------------------------------------

GitHub hides discussions on PRs when the corresponding lines of code have been
changed in the meantime. This `userscript
<https://gist.github.com/lesteve/b4ef29bccd42b354a834>`_ provides a button to
unfold all such hidden discussions at once, so you can catch up.

Checking out pull requests as remote-tracking branches
------------------------------------------------------

In your local fork, add to your ``.git/config``, under the ``[remote
"upstream"]`` heading, the line::

  fetch = +refs/pull/*/head:refs/remotes/upstream/pr/*

After fetching from upstream (``git fetch upstream``), you may use
``git checkout pr/PR_NUMBER`` to navigate to the code of the pull request with
the given number. (`Read more in this gist.
<https://gist.github.com/piscisaureus/3342247>`_)
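
The same setup can be sketched from the command line instead of editing
``.git/config`` by hand. This is only a sketch: it assumes the remote is named
``upstream`` as above and is run inside your local clone, and the PR number
``1234`` is a hypothetical placeholder::

  # Add the extra refspec to the existing "upstream" remote.
  git config --add remote.upstream.fetch '+refs/pull/*/head:refs/remotes/upstream/pr/*'

  # Fetch the pull request heads as remote-tracking branches.
  git fetch upstream

  # Hypothetical PR number; replace it with the one you want to inspect.
  PR_NUMBER=1234
  git checkout "pr/$PR_NUMBER"

Note that the first fetch may take a while, since it retrieves the head of
every pull request in the repository.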

Display code coverage in pull requests
--------------------------------------

To overlay the code coverage reports generated by the CodeCov continuous
integration, consider `this browser extension
<https://github.com/codecov/browser-extension>`_. The coverage of each line
will be displayed as a color background behind the line number.

Useful pytest aliases and flags
-------------------------------

We recommend using pytest to run unit tests. When a unit test fails, the
following tricks can make debugging easier:

  1. The command line argument ``pytest -l`` instructs pytest to print the local
     variables when a failure occurs.

  2. The argument ``pytest --pdb`` drops into the Python debugger on failure. To
     instead drop into the rich IPython debugger ``ipdb``, you may set up a
     shell alias to::

         pytest --pdbcls=IPython.terminal.debugger:TerminalPdb --capture no
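
As a minimal sketch, such an alias could be defined as follows. The alias name
``pytest-ipdb`` is a hypothetical choice, and the line is assumed to live in a
shell startup file such as ``~/.bashrc`` so that it persists across sessions::

  # Hypothetical alias name; pick whatever you find easy to type.
  alias pytest-ipdb='pytest --pdbcls=IPython.terminal.debugger:TerminalPdb --capture no'

Since ``--pdbcls`` only selects the debugger class, you would presumably still
pass ``--pdb`` when invoking the alias, for example
``pytest-ipdb --pdb path/to/test_file.py`` (a hypothetical path), to enter the
debugger on failure.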

Debugging memory errors in Cython with valgrind
===============================================

While Python/NumPy's built-in memory management is relatively robust, it can
lead to performance penalties for some routines. For this reason, much of
the high-performance code in scikit-learn is written in Cython. This
performance gain comes with a tradeoff, however: it is very easy for memory
bugs to crop up in Cython code, especially in situations where that code
relies heavily on pointer arithmetic.

Memory errors can manifest themselves in a number of ways. The easiest ones to
debug are often segmentation faults and related glibc errors. Uninitialized
variables can lead to unexpected behavior that is difficult to track down.
A very useful tool when debugging these sorts of errors is
valgrind_.


Valgrind is a command-line tool that can trace memory errors in a variety of
code. Follow these steps:

  1. Install `valgrind`_ on your system.

  2. Download the Python valgrind suppression file: `valgrind-python.supp`_.

  3. Follow the directions in the `README.valgrind`_ file to customize your
     Python suppressions. If you don't, you will get spurious output related
     to the Python interpreter instead of your own code.

  4. Run valgrind as follows::

       $> valgrind -v --suppressions=valgrind-python.supp python my_test_script.py

.. _valgrind: http://valgrind.org
.. _`README.valgrind`: http://svn.python.org/projects/python/trunk/Misc/README.valgrind
.. _`valgrind-python.supp`: http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp


The result will be a list of all the memory-related errors, which reference
lines in the C code generated by Cython from your ``.pyx`` file. If you examine
the referenced lines in the ``.c`` file, you will see comments that indicate the
corresponding location in your ``.pyx`` source file. Hopefully the output will
give you clues as to the source of your memory error.
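
Locating that comment by hand in a large generated file can be tedious. The
following shell helper is a minimal sketch, not part of scikit-learn or
valgrind: the name ``pyx_location`` and the file name ``my_module.c`` are
hypothetical, and it assumes the generated C file marks source positions with
comments of the form ``/* "my_module.pyx":42``. Given a C file and a line
number reported by valgrind, it prints the closest preceding ``.pyx``
position::

  # Usage: pyx_location GENERATED_C_FILE C_LINE_NUMBER
  # Prints the last '"<file>.pyx":<line>' marker found at or before the
  # given line of the generated C file (assumed comment format).
  pyx_location() {
      head -n "$2" "$1" | grep -o '"[^"]*\.pyx":[0-9]*' | tail -n 1
  }

For instance, ``pyx_location my_module.c 1234`` would print the marker that
precedes line 1234 of ``my_module.c``, if there is one.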

For more information on valgrind and the array of options it has, see the
tutorials and documentation on the `valgrind web site <http://valgrind.org>`_.
