File: migration_github.md

package info (click to toggle)
grass 8.4.2-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 277,040 kB
  • sloc: ansic: 460,798; python: 227,732; cpp: 42,026; sh: 11,262; makefile: 7,007; xml: 3,637; sql: 968; lex: 520; javascript: 484; yacc: 450; asm: 387; perl: 157; sed: 25; objc: 6; ruby: 4
file content (173 lines) | stat: -rw-r--r-- 7,124 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
# RFC 6: Migration from SVN to GitHub

Authors of the first draft: Markus Neteler, Martin Landa

Status: Motion passed Apr 22, 2019

## Introduction

GRASS GIS is an open source geoinformation system which is developed by a globally
distributed team of developers. Besides the source code developers also message
translators, people who write documentation, those who report bugs and wishes and
more are involved.

The centralized source code management system Subversion (SVN) has served the
GRASS GIS project very well since 2007. The project has established routines and
infrastructure (code repository, ticketing system, developer wiki) connected to
SVN. However, with an increasing number of Open Source developers using git (and
here especially the success of GitHub), time has come to migrate the source code
base from SVN kindly hosted by OSGeo to GitHub.com, a widely adopted development
platform.

## Background information of git migration

* For background and aims, see
  * <https://trac.osgeo.org/grass/wiki/GitMigration>
* Results of a survey performed in early 2019:
  * <https://docs.google.com/forms/d/1BoTFyZRNebqVX98A3rh5GpUS2gKFfmuim78gbradDjc/viewanalytics>
* Technical discussions
  * svn -> git test migration ongoing, see [#3722](https://trac.osgeo.org/grass/ticket/3722)
  * trac -> github test issue migration ongoing, see [#3789](https://trac.osgeo.org/grass/ticket/3789)

## New GitHub repositories

Since migration is a huge effort, massive work on converting the existing source
code (organized in branches) and the related trac tickets has been done. The main
scope of weeks of efforts was to preserve as much information as possible by
converting trac/svn references to full URLs pointing to the old system kept
available in read-only mode.

The following new GitHub repositories have been
[created](https://github.com/grass-svn2git). Note that the "cut-off" date of the
main **grass** repository does not correspond to the first upload to CSV which
was then migrated to SVN. The repositories **grass** and **grass-legacy** overlap
in time since they contain different branches:

* repository **grass**
  * Source code from 2008 (as the starting commit r31142 was selected, i.e.
    "Welcome to GRASS 7.0.svn") to present day (SVN-trunk -> git-master)
  * i.e., all 7.x and later release branches + master
* repository **grass-legacy**
  * Source code from 1987 (pre-public internet times; manually reconstructed) -
    2018 (r72361 - last commit to releasebranch_6_4)
  * i.e., a separate repository for older GRASS GIS releases (3.2, 4.x, 5.x, 6.x)
* repository **grass-addons**
  * repository for addons
  * code re-organized from directory-like layout (grass6, grass7) into
    branches-like layout (master == grass7, grass6, ...)
* repository **grass-promo**
  * repository for promotional material

The final destination of these repositories will be under <https://github.com/OSGeo/>

## Authorship and SVN user name mapping to GitHub

Given GRASS GIS’ history of 35+ years we had to invest major effort in identifying
and mapping user names throughout the decades (CVS was used from 1999 to 2007; SVN
has been used since 2007, see [history](https://grasswiki.osgeo.org/wiki/Bug_tracking)).

The following circumstances could be identified:

* user present in CVS but not in SVN
* user present in SVN but not in CVS
* user present in both with identical name
* user present in both with different name as some were changed in the CVS to
  SVN migration in 2007, leading to colliding user names
* some users already having a GitHub account (with mostly different name again)
  * see here for the [author mapping table](https://trac.osgeo.org/grass/browser/grass-addons/tools/svn2git/)

**Important**: nothing is lost as contributions can be
[claimed](https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user)
in GitHub.

## Activating the GitHub issue tracker

As the cut-off date for the trac migration we selected 2007-12-09 (r25479) as it
was the first SVN commit (after the years in CVS).

The tickets migrated from trac to GitHub contain updated links in the ticket texts:

* links to other tickets (closed now pointing to full trac URL, open pointing to
  a new github issues). Note that there were many styles of referring in the
  commit log message which had to be parsed accordingly.
* links to trac wiki (now pointing to full trac URL)
* links source code in SVN (now pointing to full trac URL)
* images and attachments (now pointing to full trac URL)

Labels are preserved by transferring:

* "operating system" trac label into the GitHub issue text itself (following the
  new issue reporting template)
* converting milestones/tickets/comments/labels
* converting trac usernames to known GitHub usernames (those missing at time can
  [claim](https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user)
  commits)
* setting assignees if possible; otherwise set new "grass-svn2git" an assignee

**New labels in the GitHub issue tracker:**

The trac component of the bug reports have been cleaned up following other OSGeo
projects like GDAL and QGIS, leading to the following categories:

* **Issue category**:
  * bug
  * enhancement
* **Priority**:
  * blocker
  * critical
  * feedback needed
* **Issue solution** (other than fixing and closing it normally):
  * duplicate
  * invalid
  * wontfix
  * worksforme
* **Components**:
  * docs
  * GUI
  * libs
  * modules
  * packaging
  * python
  * translations
  * unittests
  * Windows specific

Note that "normal" bugs reported will not carry a label in order to not overload
the visual impact and readability.

## Rules and best practices for using Git

Before the new Git repository is open for writing, we need to have rules and best
practices for dealing with the following:

* Policy for automatic merge commits due to un-synchronous nature of Git. Do we
  want to avoid those by `git pull --rebase`?
* How to do backports?
* A branch for every feature or bug fix in the main repo or is this done in the
  fork?
* _(add more)_

## Turning SVN/trac into readonly mode

As soon as the above listed repositories are stable and functional,
SVN/trac (<https://trac.osgeo.org/grass/>) at OSGeo will be set into readonly
mode. They will serve as a reference for existing links and also for the
aforementioned converted commit messages and issues in the issue tracker.

## Open issues

* Will be also Trac wiki migrated into GitHub?
  * This can be decided at a later stage. Managed in [#3789](https://trac.osgeo.org/grass/ticket/3789)

## Mirror or Exit strategy

GitHub is a closed platform. In case it would be shutdown, closed or GitHub would
start asking unreasonable fees, a backup strategy is needed.

The proposed solution is

* GitLab has an import module from GitHub.
  * implement a continuous mirror on GitLab.com including the issues and wiki.
    * OSGeo-gitea code mirror: <https://git.osgeo.org/gitea/grass_gis/grass>
  * See <https://docs.gitlab.com/ee/workflow/repository_mirroring.html>
    * The mirroring is only for the repository, not the issues or wiki.