Saturday, August 9, 2014

How I got a 17.5-year-old (!) GitHub repository

With yet another e-mail by Alain Hauser about tweaks in my R package sfsmisc,
I felt the urge again to finally move my very oldest R package to an "RCS", a "revision control system", aka "version control" system.

Now the sfsmisc package has a particular history. Its DESCRIPTION file, also visible on the CRAN web page above, describes the package as

Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, quite a few related to graphics; many ported from S-plus times.

and so is talking about S+ code originating in the 1990s, really.  Now, some may know that R's history also started then, and I got involved with it not very long after its creation by Ross Ihaka and Robert Gentleman, after they had announced it on the S-news mailing list in 1994 (we think.. nobody has found the original e-mail as far as I know).

Anyway, the "SfS goodies" (SfS := Seminar für Statistik) at the time was a directory of S (or "S-plus" or "S+") functions we had found useful at the SfS. Consequently, these were among the first things I ported to R, among them, by the way, the str() function, which I added to R myself very early on: it was already part of R 0.61.0, released 1997-12-21.
Okay, by now, you must have grasped that I am really nostalgic about this... As a consequence, I did want to keep as much history as possible, and I did succeed in the end, but look at the steps I've used to move things to git and github:
  1. I had put a few of the package's many files under RCS control earlier, namely 9 files in the ./R/ subdirectory plus the prime-numbers demo, i.e. for these 10 files, there was already a corresponding foo,v file.
  2. For all other source files, notably those inside ./R/, I had kept numbered Emacs backup files, i.e. for a file foo, files named foo.~1~, foo.~2~ etc. I transformed each of these sets into a *,v RCS (archive) file with a new version of my shell script G2RCS, which I appropriately called G2RCSn (final 'n'). In the end, this gave a total of 146 *,v files. How did I achieve that efficiently?
  3. In Emacs dired (DIRectory EDitor, the supersmart file browser since the late 1980s) I marked all the source files (R/*.R, man/*.Rd, ...) which did not have a corresponding *,v (so from the above, I deduce there were 146-10=136 of those) and then used "!" to call G2RCSn on those.
  4. Now, I had downloaded the utility rcs-fast-export.rb, a ruby script, and made it executable in my ~/bin/.
  5. Now, inside my ~/R/Pkgs/sfsmisc/ (and after backing up the whole source tree into a tarball) I did exactly what
        rcs-fast-export.rb --help
     tells me:
        git init && rcs-fast-export.rb . | git fast-import && git reset
  6. Using Emacs 'magit' -- the git interface Doug Bates taught me about -- which I have long assigned to   C-c v   I see that I have ended up with some modified files (marked with an "M "), namely those that had an $Id..$ "tag" inside. Magit also showed me my many untracked files, so I quickly created a .gitignore file to make things easier to browse. Ok, now I do use magit to change/commit a few things, notably getting rid of the $Id$ labels.
  7. After playing around a bit more and convincing myself things work, I remove all the *,v (and *-ss) files, as they are evidently no longer needed.
  8. Now the next - basically last - step:  Put the already working local git repository on github  (so Alain could make a pull request, next time.. :-):
  •  create a repository on github (by webpage clicking)
  • in the shell, inside ./sfsmisc/ , do what I found before:
        git push --mirror https://github.com/mmaechler/sfsmisc.git
    where I do need to give my github credentials --> and whoosh! I am done.. whew!
    Well writing the blog really took longer than doing it! I need to get more convenient blogging infrastructure...
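
For readers who want to try something like step 2 themselves, here is a minimal sketch of the one tricky part: getting the numbered Emacs backups of a file into the right order. This is not the G2RCSn script (which is not shown in this post); list_backups is a hypothetical helper name, and the sketch assumes file names without tabs or newlines. The point is that a plain alphabetical listing would put foo.~10~ before foo.~2~, so the version number has to be extracted and sorted numerically.

```shell
#!/bin/sh
# Hypothetical helper (not part of G2RCS/G2RCSn): print the numbered
# Emacs backups of file "$1" (foo.~1~, foo.~2~, ...) oldest first.
list_backups() {
    f=$1
    for b in "$f".~*~; do
        [ -e "$b" ] || continue          # glob matched nothing
        n=${b#"$f".~}                    # strip the "foo.~" prefix
        n=${n%"~"}                       # strip the trailing "~"
        case $n in
            ''|*[!0-9]*) continue ;;     # skip non-numeric backups
        esac
        printf '%s\t%s\n' "$n" "$b"      # "<version><TAB><path>"
    done | sort -n | cut -f2-            # numeric sort, then keep the path
}
```

Calling `list_backups R/foo.R` prints one backup path per line in ascending version order; each of these could then be copied over the working file in turn and checked in with RCS's ci, with the current file going in last.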

Finally, I happily point my browser to github and eventually to
        https://github.com/mmaechler/sfsmisc/graphs/
with the main stats (and a nice graphic!) saying

        Feb 23, 1997 - Aug 9, 2014

 and also
  • Contributions to master, excluding merge commits
  • mmaechler commits  /  16,037 ++ /  6,148 --   
the latter mentioning 16,037 additions and 6,148 (git) deletions till today. Well, it seems I have been productive there.
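
Similar totals can presumably be approximated from the local repository as well. The following is a small sketch, assuming the per-file counts that `git log --numstat` prints (added lines, deleted lines, file name, separated by tabs); sum_numstat is a hypothetical helper name, and the result need not match GitHub's numbers exactly, since GitHub's counting rules are not described here.

```shell
#!/bin/sh
# Hypothetical helper: sum the "added<TAB>deleted<TAB>file" lines that
# `git log --numstat` writes. Binary files appear as "-<TAB>-<TAB>file"
# and are skipped, as are empty separator lines.
sum_numstat() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { add += $1; del += $2 }
         END { printf "%d ++ / %d --\n", add, del }'
}
```

Inside the repository, it would be used as `git log --no-merges --numstat --pretty=tformat: | sum_numstat`, where `--no-merges` mirrors the "excluding merge commits" wording above.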

Martin's Musings about R and often very applied mathematics and statistics

Just need a place where to keep my notes about R and its connections with my interests in applied mathematics and statistics.....
but I'm not yet starting as I want to spend a little time on a possible somewhat systematic way to organize things here as a "tree" (aka "hierarchy")...
...
well it's now August and I have not found time to go further in systematic exploration of "intelligent" and  "useful for R programming" blogging.

So I go ahead and close this intro to do a first "real post".