PEP: 474 Title: Creating forge.python.org Version: $Revision$
Last-Modified: $Date$ Author: Alyssa Coghlan <ncoghlan@gmail.com>
Status: Withdrawn Type: Process Content-Type: text/x-rst Created:
19-Jul-2014 Post-History: 19-Jul-2014, 08-Jan-2015, 01-Feb-2015

Abstract

This PEP proposes setting up a new PSF provided resource,
forge.python.org, as a location for maintaining various supporting
repositories (such as the repository for Python Enhancement Proposals)
in a way that is more accessible to new contributors, and easier to
manage for core developers.

This PEP does not propose any changes to the core development workflow
for CPython itself (see PEP 462 in relation to that).

PEP Withdrawal

This PEP has been withdrawn by the author in favour of the GitLab based
proposal in PEP 507.

If anyone else would like to take over championing this PEP, contact the
core-workflow mailing list

Proposal

This PEP proposes that an instance of the self-hosted Kallithea code
repository management system be deployed as "forge.python.org".

Individual repositories (such as the developer guide or the PEPs
repository) may then be migrated from the existing hg.python.org
infrastructure to the new forge.python.org infrastructure on a
case-by-case basis. Each migration will need to decide whether to retain
a read-only mirror on hg.python.org, or whether to just migrate
wholesale to the new location.

In addition to supporting read-only mirrors on hg.python.org,
forge.python.org will also aim to support hosting mirrors on popular
proprietary hosting sites like GitHub and BitBucket. The aim will be to
allow users familiar with these sites to submit and discuss pull
requests using their preferred workflow, with forge.python.org
automatically bringing those contributions over to the master
repository.

Given the availability and popularity of commercially backed "free for
open source projects" repository hosting services, this would not be a
general purpose hosting site for arbitrary Python projects. The initial
focus will be specifically on CPython and other repositories currently
hosted on hg.python.org. In the future, this could potentially be
expanded to consolidating other PSF managed repositories that are
currently externally hosted to gain access to a pull request based
workflow, such as the repository for the python.org Django application.
As with the initial migrations, any such future migrations would be
considered on a case-by-case basis, taking into account the preferences
of the primary users of each repository.

Rationale

Currently, hg.python.org hosts more than just the core CPython
repository, it also hosts other repositories such as those for the
CPython developer guide and for Python Enhancement Proposals, along with
various "sandbox" repositories for core developer experimentation.

While the simple "pull request" style workflow made popular by code
hosting sites like GitHub and BitBucket isn't adequate for the complex
branching model needed for parallel maintenance and development of the
various CPython releases, it's a good fit for several of the ancillary
projects that surround CPython that we don't wish to move to a
proprietary hosting site.

The key requirements proposed for a PSF provided software forge are:

-   MUST support simple "pull request" style workflows
-   MUST support online editing for simple changes
-   MUST be backed by an active development organisation (community or
    commercial)
-   MUST support self-hosting of the master repository on PSF
    infrastructure without ongoing fees

Additional recommended requirements that are satisfied by this proposal,
but may be negotiable if a sufficiently compelling alternative is
presented:

-   SHOULD be a fully open source application written in Python
-   SHOULD support Mercurial (for consistency with existing tooling)
-   SHOULD support Git (to provide that option to users that prefer it)
-   SHOULD allow users of git and Mercurial clients to transparently
    collaborate on the same repository
-   SHOULD allow users of GitHub and BitBucket to submit proposed
    changes using the standard pull request workflows offered by those
    tools
-   SHOULD be open to customisation to meet the needs of CPython core
    development, including providing a potential path forward for the
    proposed migration to a core reviewer model in PEP 462

The preference for self-hosting without ongoing fees rules out the
free-as-in-beer providers like GitHub and BitBucket, in addition to the
various proprietary source code management offerings.

The preference for Mercurial support not only rules out GitHub, but also
other Git-only solutions like GitLab and Gitorious.

The hard requirement for online editing support rules out the Apache
Allura/HgForge combination.

The preference for a fully open source solution rules out RhodeCode.

Of the various options considered by the author of this proposal, that
leaves Kallithea SCM as the proposed foundation for a forge.python.org
service.

Kallithea is a full GPLv3 application (derived from the clearly and
unambiguously GPLv3 licensed components of RhodeCode) that is being
developed under the auspices of the Software Freedom Conservancy. The
Conservancy has affirmed that the Kallithea codebase is completely and
validly licensed under GPLv3. In addition to their role in building the
initial Kallithea community, the Conservancy is also the legal home of
both the Mercurial and Git projects. Other SFC member projects that may
be familiar to Python users include Twisted, Gevent, BuildBot and PyPy.

Intended Benefits

The primary benefit of deploying Kallithea as forge.python.org is that
supporting repositories such as the developer guide and the PEP repo
could potentially be managed using pull requests and online editing.
This would be much simpler than the current workflow which requires PEP
editors and other core developers to act as intermediaries to apply
updates suggested by other users.

The richer administrative functionality would also make it substantially
easier to grant users access to particular repositories for
collaboration purposes, without having to grant them general access to
the entire installation. This helps lower barriers to entry, as trust
can more readily be granted and earned incrementally, rather than being
an all-or-nothing decision around granting core developer access.

Sustaining Engineering Considerations

Even with its current workflow, CPython itself remains one of the
largest open source projects in the world (in the top 2% of projects
tracked on OpenHub). Unfortunately, we have been significantly less
effective at encouraging contributions to the projects that make up
CPython's workflow infrastructure, including ensuring that our
installations track upstream, and that wherever feasible, our own
customisations are contributed back to the original project.

As such, a core component of this proposal is to actively engage with
the upstream Kallithea community to lower the barriers to working with
and on the Kallithea SCM, as well as with the PSF Infrastructure team to
ensure the forge.python.org service integrates cleanly with the PSF's
infrastructure automation.

This approach aims to provide a number of key benefits:

-   allowing those of us contributing to maintenance of this service to
    be as productive as possible in the time we have available
-   offering a compelling professional development opportunity to those
    volunteers that choose to participate in maintenance of this service
-   making the Kallithea project itself more attractive to other
    potential users by making it as easy as possible to adopt, deploy
    and manage
-   as a result of the above benefits, attracting sufficient
    contributors both in the upstream Kallithea community, and within
    the CPython infrastructure community, to allow the forge.python.org
    service to evolve effectively to meet changing developer
    expectations

Some initial steps have already been taken to address these sustaining
engineering concerns:

-   Tymoteusz Jankowski has been working with Donald Stufft to work out
    what would be involved in deploying Kallithea using the PSF's Salt
    based infrastructure automation.
-   Graham Dumpleton and I have been working on making it easy to deploy
    demonstration Kallithea instances to the free tier of Red Hat's open
    source hosting service, OpenShift Online. (See the comments on that
    post, or the quickstart issue tracker for links to Graham's follow
    on work)

The next major step to be undertaken is to come up with a local
development workflow that allows contributors on Windows, Mac OS X and
Linux to run the Kallithea tests locally, without interfering with the
operation of their own system. The currently planned approach for this
is to focus on Vagrant, which is a popular automated virtual machine
management system specifically aimed at developers running local VMs for
testing purposes. The Vagrant based development guidelines for OpenShift
Origin provide an extended example of the kind of workflow this approach
enables. It's also worth noting that Vagrant is one of the options for
working with a local build of the main python.org website.

If these workflow proposals end up working well for Kallithea, they may
also be worth proposing for use by the upstream projects backing other
PSF and CPython infrastructure services, including Roundup, BuildBot,
and the main python.org web site.

Personal Motivation

As of July 2015, I now work for Red Hat as a software development
workflow designer and process architect, focusing on the upstream
developer experience in Fedora. Two of the key pieces of that experience
will be familiar to many web service developers: Docker for local
container management, and Vagrant for cross-platform local development
VM management. Spending time applying these technologies in multiple
upstream contexts helps provide additional insight into what works well
and what still needs further improvement to provide a good software
development experience that is well integrated on Fedora, but also
readily available on other Linux distributions, Windows, Mac OS X.

In relation to code review workflows in particular, the primary code
review workflow management tools I've used in my career are Gerrit (for
multi-step code review with fine-grained access control), GitHub and
BitBucket (for basic pull request based workflows), and Rietveld (for
CPython's optional pre-commit reviews).

Kallithea is interesting as a base project to build, as it's currently a
combined repo hosting and code review management platform, but doesn't
directly integrate the two by offering online merges. This creates the
opportunity to blend the low barrier to entry benefits of the
GitHub/BitBucket pull request model with the mentoring and task hand-off
benefits of Gerrit in defining an online code merging model for
Kallithea in collaboration with the upstream Kallithea developers.

Technical Concerns and Challenges

Introducing a new service into the CPython infrastructure presents a
number of interesting technical concerns and challenges. This section
covers several of the most significant ones.

Service hosting

The default position of this PEP is that the new forge.python.org
service will be integrated into the existing PSF Salt infrastructure and
hosted on the PSF's Rackspace cloud infrastructure.

However, other hosting options will also be considered, in particular,
possible deployment as a Kubernetes hosted web service on either Google
Container Engine or the next generation of Red Hat's OpenShift Online
service, by using either GCEPersistentDisk or the open source GlusterFS
distributed filesystem to hold the source code repositories.

Ongoing infrastructure maintenance

Ongoing infrastructure maintenance is an area of concern within the PSF,
as we currently lack a system administrator mentorship program
equivalent to the Fedora Infrastructure Apprentice or GNOME
Infrastructure Apprentice programs.

Instead, systems tend to be maintained largely by developers as a
part-time activity on top of their development related contributions,
rather than seeking to recruit folks that are more interested in
operations (i.e. keeping existing systems running well) than they are in
development (i.e. making changes to the services to provide new features
or a better user experience, or to address existing issues).

While I'd personally like to see the PSF operating such a program at
some point in the future, I don't consider setting one up to be a
feasible near term goal. However, I do consider it feasible to continue
laying the groundwork for such a program by extending the PSF's existing
usage of modern infrastructure technologies like OpenStack and Salt to
cover more services, as well as starting to explore the potential
benefits of containers and container platforms when it comes to
maintaining and enhancing PSF provided services.

I also plan to look into the question of whether or not an open source
cloud management platform like ManageIQ may help us bring our emerging
"cloud sprawl" problem across Rackspace, Google, Amazon and other
services more under control.

User account management

Ideally we'd like to be able to offer a single account that spans all
python.org services, including Kallithea, Roundup/Rietveld, PyPI and the
back end for the new python.org site, but actually implementing that
would be a distinct infrastructure project, independent of this PEP.
(It's also worth noting that the fine-grained control of ACLs offered by
such a capability is a prerequisite for setting up an effective system
administrator mentorship program)

For the initial rollout of forge.python.org, we will likely create yet
another identity silo within the PSF infrastructure. A potentially
superior alternative would be to add support for python-social-auth to
Kallithea, but actually doing so would not be a requirement for the
initial rollout of the service (the main technical concern there is that
Kallithea is a Pylons application that has not yet been ported to
Pyramid, so integration will require either adding a Pylons backend to
python-social-auth, or else embarking on the Pyramid migration in
Kallithea).

Breaking existing SSH access and links for Mercurial repositories

This PEP proposes leaving the existing hg.python.org installation alone,
and setting up Kallithea on a new host. This approach minimises the risk
of interfering with the development of CPython itself (and any other
projects that don't migrate to the new software forge), but does make
any migrations of existing repos more disruptive (since existing
checkouts will break).

Integration with Roundup

Kallithea provides configurable issue tracker integration. This will
need to be set up appropriately to integrate with the Roundup issue
tracker at bugs.python.org before the initial rollout of the
forge.python.org service.

Accepting pull requests on GitHub and BitBucket

The initial rollout of forge.python.org would support publication of
read-only mirrors, both on hg.python.org and other services, as that is
a relatively straightforward operation that can be implemented in a
commit hook.

While a highly desirable feature, accepting pull requests on external
services, and mirroring them as submissions to the master repositories
on forge.python.org is a more complex problem, and would likely not be
included as part of the initial rollout of the forge.python.org service.

Transparent Git and Mercurial interoperability

Kallithea's native support for both Git and Mercurial offers an
opportunity to make it relatively straightforward for developers to use
the client of their choice to interact with repositories hosted on
forge.python.org.

This transparent interoperability does not exist yet, but running our
own multi-VCS repository hosting service provides the opportunity to
make this capability a reality, rather than passively waiting for a
proprietary provider to deign to provide a feature that likely isn't in
their commercial interest. There's a significant misalignment of
incentives between open source communities and commercial providers in
this particular area, as even though offering VCS client choice can
significantly reduce community friction by eliminating the need for
projects to make autocratic decisions that force particular tooling
choices on potential contributors, top down enforcement of tool
selection (regardless of developer preference) is currently still the
norm in the corporate and other organisational environments that produce
GitHub and Atlassian's paying customers.

Prior to acceptance, in the absence of transparent interoperability,
this PEP should propose specific recommendations for inclusion in the
CPython developer's guide section for git users for creating pull
requests against forge.python.org hosted Mercurial repositories.

Pilot Objectives and Timeline

[TODO: Update this section for Brett's revised timeline, which aims to
have a CPython demo repository online by October 31st, to get a better
indication of future capabilities once CPython itself migrates over to
the new system, rather than just the support repos]

This proposal is part of Brett Cannon's current evaluation of
improvement proposals for various aspects of the CPython development
workflow. Key dates in that timeline are:

-   Feb 1: Draft proposal published (for Kallithea, this PEP)
-   Apr 8: Discussion of final proposals at Python Language Summit
-   May 1: Brett's decision on which proposal to accept
-   Sep 13: Python 3.5 released, adopting new workflows for Python 3.6

If this proposal is selected for further development, it is proposed to
start with the rollout of the following pilot deployment:

-   a reference implementation operational at
    kallithea-pilot.python.org, containing at least the developer guide
    and PEP repositories. This will be a "throwaway" instance, allowing
    core developers and other contributors to experiment freely without
    worrying about the long term consequences for the repository
    history.
-   read-only live mirrors of the Kallithea hosted repositories on
    GitHub and BitBucket. As with the pilot service itself, these would
    be temporary repos, to be discarded after the pilot period ends.
-   clear documentation on using those mirrors to create pull requests
    against Kallithea hosted Mercurial repositories (for the pilot, this
    will likely not include using the native pull request workflows of
    those hosted services)
-   automatic linking of issue references in code review comments and
    commit messages to the corresponding issues on bugs.python.org
-   draft updates to PEP 1 explaining the Kallithea-based PEP editing
    and submission workflow

The following items would be needed for a production migration, but
there doesn't appear to be an obvious way to trial an updated
implementation as part of the pilot:

-   adjusting the PEP publication process and the developer guide
    publication process to be based on the relocated Mercurial repos

The following items would be objectives of the overall workflow
improvement process, but are considered "desirable, but not essential"
for the initial adoption of the new service in September (if this
proposal is the one selected and the proposed pilot deployment is
successful):

-   allowing the use of python-social-auth to authenticate against the
    PSF hosted Kallithea instance
-   allowing the use of the GitHub and BitBucket pull request workflows
    to submit pull requests to the main Kallithea repo
-   allowing easy triggering of forced BuildBot runs based on Kallithea
    hosted repos and pull requests (prior to the implementation of PEP
    462, this would be intended for use with sandbox repos rather than
    the main CPython repo)

Future Implications for CPython Core Development

The workflow requirements for the main CPython development repository
are significantly more complex than those for the repositories being
discussed in this PEP. These concerns are covered in more detail in PEP
462.

Given Guido's recommendation to replace Rietveld with a more actively
maintained code review system, my current plan is to rewrite that PEP to
use Kallithea as the proposed glue layer, with enhanced Kallithea pull
requests eventually replacing the current practice of uploading patche
files directly to the issue tracker.

I've also started working with Pierre Yves-David on a custom Mercurial
extension that automates some aspects of the CPython core development
workflow.

Copyright

This document has been placed in the public domain.