April 20, 2015

Simon Michael

ssh, Darcs Hub vulnerability

April 20, 2015 11:10 PM UTC

I recently learned of a serious undocumented vulnerability in the ssh package. This is a minimal ssh server implementation used by darcsden to support darcs push/pull. If you use the ssh package, or you have darcsden’s darcsden-ssh server running, you should upgrade to/rebuild with the imminent ssh-0.3 release right away. Or if you know of someone like that, please let them know.

darcsden is of course the basis for Darcs Hub. Here’s the announcement I sent to users there a few days ago, with more details.

Hello darcs hub users,

This is Simon Michael, operator of hub.darcs.net, with the first all-darcs-hub-users announcement. You’re receiving this because you have an email address configured in your darcs hub user settings.

Thank you for using darcs hub, and for any feedback/bug reports/patches you may have sent. Usage is growing steadily, and I plan to blog more about it soon at joyful.com.

This email is to announce a recently patched security vulnerability in darcs hub’s SSH server.

Timeline:

3/21: a software developer reports that the haskell “ssh” library used by darcs hub does not check for a valid signature on the public key during authentication. This means it was possible to authenticate as any other ssh user if you knew their public key.

3/21-: I discuss the issue with a small number of core darcs developers and the ssh author.

3/25: A preliminary fix is deployed. We believe this closed the vulnerability.

4/6: A more comprehensive and tested fix is deployed.

4/15: This announcement is sent to current darcs hub users with valid email addresses (714 of 765 users).

4/20: Public disclosure via blog, haskell mail lists and the issue tracker (darcsden #130).

Impact and current status:

We believe the vulnerability is now fixed. But we are not cryptographers - I’m sure the new ssh maintainer would welcome any help from some of those.

We have no reason to believe anyone discovered or exploited the vulnerability. Also, it seems unlikely there’s anything hosted on darcs hub that would attract this kind of attention. darcs hub logs are not good enough to be certain, however. It’s possible I’ll find a way to be more certain by looking at file timestamps or something.

The weakness was present in darcs hub’s ssh server since it went live (and in darcsden.com before that). As mentioned, it was possible to authenticate via ssh as another user if you provided their public ssh key. With ssh access, it’s possible to create, delete, modify or replace any repository in that darcs hub account (but not possible to change user settings in the web app, or to access the system hosting darcs hub).

The worst-case scenario we’ve imagined is that a motivated attacker could have authenticated as you and replaced your repo with one that looks just like it, but with patches altered or added, any time since you created the repo on darcs hub (or on darcsden.com, if you moved it from there).

So if you’re paranoid/careful you may want to check the integrity of your repos, eg by reviewing the repo history (“changes” button on the website, “darcs log [-s] [-v]” at the console). If you have more questions about this, you can contact me (simon@joyful.com) and if necessary Ganesh Sittampalam (ganesh@earth.li) privately.

Future plans:

• Public announcement on 4/20

• I’ll add a security section to the darcs hub FAQ

• Ganesh has stepped up to be maintainer of the ssh package, and will make a new release soon

• I’ll do a darcsden release not too long after that

• We’ll need to figure out Darcs hub’s sustainability plan. As it grows and more of you rely on it, so does the need for a revenue stream to allow decent maintenance and oversight. This could be from funding, donations, charging for private repos or something else.

Also:

Some logistical things to be aware of:

• this announcement has been sent via MailChimp, and as yet there’s no automatic integration between MailChimp and your settings on hub.darcs.net.

• remember that darcs hub’s issue tracker is here, and that it does not yet send email notifications - to see replies to an issue, you must visit the issue page.

• darcs hub’s password recovery emails may not always reach you - if you’re experiencing this, please contribute to #123.

Needless to say, I regret the vulnerability and am pleased to have it closed. Of course we are not alone, eg github had their own incident. Thank you very much to all who have been helping with this, especially the original reporter for letting us all know, and Ganesh for providing swift and high quality fixes.
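To make the nature of the flaw concrete, here is a minimal sketch of the difference between accepting a known public key and also checking a signature made with the matching private key. It is not the ssh package's code or API: every name in it is hypothetical, and the "signature" is an HMAC over a shared toy secret, a stand-in for real public-key cryptography so that the example runs with the standard library only.

```python
import hashlib
import hmac

# Toy stand-in for real public-key signatures (an assumption for this sketch):
# each "public key" maps to a secret that only its legitimate owner can use.
TOY_SECRETS = {b"alice-public-key": b"alice-secret"}


def toy_sign(public_key, message):
    """Produce a toy signature over message for the owner of public_key."""
    return hmac.new(TOY_SECRETS[public_key], message, hashlib.sha256).digest()


def toy_verify(public_key, message, signature):
    """Check a toy signature; a real server would verify an RSA/DSA signature."""
    secret = TOY_SECRETS.get(public_key)
    if secret is None:
        return False
    expected = hmac.new(secret, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


def authenticate_flawed(user, offered_key, authorized_keys):
    """The weakness: knowing a user's public key is enough to log in."""
    return offered_key in authorized_keys.get(user, set())


def authenticate_checked(user, offered_key, session_data, signature,
                         authorized_keys, verify=toy_verify):
    """The fix in spirit: the key must be registered AND the client must
    present a valid signature over data tied to this session."""
    if offered_key not in authorized_keys.get(user, set()):
        return False
    return verify(offered_key, session_data, signature)
```

With the flawed check, anyone who copies a public key gets in; with the second one, a forged signature or one made for a different session is rejected.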

April 19, 2015

Darcs News

darcs 2.10.0 release

April 19, 2015 10:01 PM UTC

Hi all,

The darcs team is pleased to announce the release of darcs 2.10.0.

The easiest way to install darcs 2.10.0 from source is by first installing the Haskell Platform (http://www.haskell.org/platform). If you have installed the Haskell Platform or cabal-install, you can install this release by doing:

$ cabal update
$ cabal install darcs-2.10.0

Alternatively, you can download the tarball from http://darcs.net/releases/darcs-2.10.0.tar.gz and build it by hand as explained in the README file.

The 2.10 branch is also available as a darcs repository from http://darcs.net/releases/branch-2.10

Feedback

If you have an issue with darcs 2.10.0, you can report it via the web on http://bugs.darcs.net/ . You can also report bugs by email to bugs at darcs.net, or come to #darcs on irc.freenode.net.

What's new since darcs 2.8.5

New features

• darcs rebase: enable deep amending of history (Ganesh Sittampalam)
• darcs pull --reorder: keep local-only patches on top of mainstream patches (Ale Gadea, Ganesh Sittampalam)
• darcs dist --zip: generate a zip archive from a repository (Guillaume Hoffmann)
• patch bundle contexts are minimized by default. Enables bundles to be applied to more repositories. (Guillaume Hoffmann)
• darcs convert export/import for conversion to/from VCSes supporting the fast-export protocol (Petr Rockai, Owen Stephens, Guillaume Hoffmann, Lele Gaifax, Ben Franksen)
• darcs test --backoff: exponential backoff test strategy, faster than bisect on big repositories (Michael Hendricks)
• work normally on sshfs-mounted repositories (Nathaniel Filardo)
• automatic detection of file/directory moves, and of token replaces (Jose Neder)
• patience diff algorithm by default (Jose Neder)
• interactive mode for whatsnew (Dan Frumin)
• tag --ask-deps to create tags that may not include some patches (Ganesh Sittampalam)
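As an illustration of the exponential backoff idea behind darcs test --backoff, here is a minimal sketch. It assumes the strategy steps back 1, 2, 4, ... patches from the failing head until the test passes, then bisects inside the last interval; the function name, the `passes` callback and the offsets are assumptions made for this sketch, not darcs's actual implementation.

```python
def backoff_search(passes, n_patches):
    """Return the smallest k such that the repository with its last k patches
    unapplied passes the test, or None if no such state exists.

    passes(k) runs the test on the state with the last k patches unapplied.
    The sketch assumes that once a state passes, all older states pass too.
    """
    if passes(0):
        return 0  # the head already passes, nothing to track down
    failing = 0   # largest offset known to fail
    step = 1
    while True:
        candidate = min(failing + step, n_patches)
        if passes(candidate):
            passing = candidate
            break
        if candidate == n_patches:
            return None  # even the empty repository fails the test
        failing = candidate
        step *= 2  # exponential backoff: double the distance each time
    # Bisect between the last failure and the first success.
    while passing - failing > 1:
        mid = (failing + passing) // 2
        if passes(mid):
            passing = mid
        else:
            failing = mid
    return passing
```

When the breakage is recent, the doubling phase finds a passing state after a handful of test runs, whereas a plain bisect over the whole history starts by testing a state half the repository away.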

User Interface

• add a last question after all patches have been selected to confirm the whole selection (Florent Becker)
• command names:
  • clone is the new name of get and put
  • log is the new name of changes
  • amend is the new name of amend-record
• show output of log into a pager by default (Guillaume Hoffmann)
• the output of log is more similar to git's:
  • show patch hash in UI (hash of the patch's metadata)
  • put author and date on separate lines (Guillaume Hoffmann)
• enable to match on patch hash prefix with -h and --hash (Guillaume Hoffmann, Gian Piero Carrubba)
• better messages:
  • better error messages for http and ssh errors (Ernesto Rodriguez)
  • init, add, remove, move and replace print confirmation messages (Guillaume Hoffmann)
• rollback only happens in the working copy (Florent Becker, Guillaume Hoffmann)
• darcs send no longer tries to send a mail by default (Eric Kow)
• when no patch name given, directly invoke text editor (Jose Neder, Guillaume Hoffmann)
• use nano as default text editor instead of vi (Guillaume Hoffmann)
• keep log files for patch name and mail content in _darcs (Ale Gadea)
• optimize and convert are now supercommands (Guillaume Hoffmann)
• improve darcs help environment and darcs help markdown (Radoslav Dorcik, Guillaume Hoffmann)
• warn about duplicate tags when creating a new one (Ale Gadea)
• allow darcs mv into known, but deleted in working, file (Owen Stephens)
• improve --not-in-remote, allowing multiple repos and use default (Owen Stephens)

Performance

• faster darcs diff (Petr Rockai)
• faster log and annotate thanks to patch index data structure (BSRK Aditya, Benedikt Schmidt, Eric Kow, Guillaume Hoffmann, Ganesh Sittampalam)
• faster push via ssh by using compression (Ben Franksen)
• cloning to an ssh destination (formerly darcs put) is more efficient (Guillaume Hoffmann)
• faster internal representation of patch hashes (Guillaume Hoffmann)
• when cloning from http, use packs in a more predictable way (Guillaume Hoffmann)
• store global cache in bucketed format (Marcio Diaz)

Developer-related

• require and support GHC 7.4 to 7.10 (Ganesh Sittampalam)
• replace type witness CPP macros with plain Haskell (Eric Kow)
• hashed-storage is bundled into darcs (Ganesh Sittampalam)
• replace C SHA256 bindings with external libraries (Ganesh Sittampalam)
• move the bits of the datetime package we need into Darcs.Util.DateTime (Ganesh Sittampalam)
• build Darcs once rather than thrice. (Eric Kow)
• run tests through cabal test (Ryan Desfosses)
• run fewer darcs-1 related tests in testsuite (Ganesh Sittampalam)
• Use custom replHook to fix cabal repl (Owen Stephens)
• darcs.cabal: make Haskell2010 the default-language for all stanzas (Ben Franksen)
• always compile with mmap support (Ganesh Sittampalam)
• new options subsystem (Ben Franksen)
• various cleanups, code restructurations and refactoring, haddocks (Will Langstroth, Owen Stephens, Florent Becker, Guillaume Hoffmann, Michael Hendricks, Eric Kow, Dan Frumin, Ganesh Sittampalam)

Issues resolved in Darcs 2.10

• issue346: implement "patience diff" from bzr (Jose Neder)
• issue642: Automatic detection of file renames (Jose Neder)
• issue822: generalized the IO Type for better error messages and exception handling (Ernesto Rodriguez)
• issue851: interactive mode for whatsnew (Dan Frumin)
• issue904: Fix record on Linux/FUSE/sshfs (fall back to sloppy locks automatically) (Nathaniel Filardo)
• issue1066: clone to ssh URL by locally cloning then copying by scp (Guillaume Hoffmann)
• issue1268: enable to write darcs init x (Radoslav Dorcik)
• issue1416: put log files in tempdir instead of in working dir (Ale Gadea)
• issue1514: send --minimize-context flag for send (Guillaume Hoffmann)
• issue1624: bucketed cache (Marcio Diaz)
• issue1828: file listing and working --dry-run for mark-conflicts (Guillaume Hoffmann)
• issue1987: Garbage collection for inventories and patches (Marcio Diaz)

Incremental fast-export is now provided to ease maintenance of git mirrors:

Issues resolved (8)

• issue2244 Ale Gadea
• issue2314 Benjamin Franksen
• issue2361 Ale Gadea
• issue2364 Sergei Trofimovich
• issue2364 Sergei Trofimovich
• issue2388 Owen Stephens
• issue2394 Guillaume Hoffmann
• issue2396 Guillaume Hoffmann

Patches applied (39)

See darcs wiki entry for details.

June 12, 2014

Ale Gadea

Third Week (02-06 june)

June 12, 2014 04:58 PM UTC

Well, well... Now, with the solution already implemented, here are a couple of timing tests that show the improvement.

For the repository of issue2361:

Before patch1169: "let it run for 2 hours and it did not finish"

After patch1169:
real 0m5.929s
user 0m5.683s
sys 0m0.260s

For the repository generated by forever.sh, which in summary has ~12600 patches and an unrevert bundle, reordering means moving ~1100 patches forward past ~11500 patches.

Before patch1169 (Interrupted!):
real 73m9.894s
user 71m28.256s
sys 1m11.439s

After patch1169:
real 2m23.405s
user 2m17.347s
sys 0m6.030s

The repository generated by bigRepo.sh has ~600 patches, with only one tag and a very small unrevert bundle.

Before patch1169:
real 0m34.049s
user 0m33.386s
sys 0m0.665s

After patch1169:
real 0m1.053s
user 0m0.960s
sys 0m0.152s

One last repository, generated by bigUnrevert.sh, has 13 patches and a really big unrevert bundle (~10MB).

Before patch1169:
real 0m1.304s
user 0m0.499s
sys 0m0.090s

After patch1169:
real 0m0.075s
user 0m0.016s
sys 0m0.011s

The repository with more examples is here: ExamplesRepos.

June 05, 2014

Ale Gadea

Second Week (26-30 may)

June 05, 2014 06:47 PM UTC

Luckily, this week Guillaume and I found a "solution" for issue 2361. But before entering into details, let's review how the command darcs optimize --reorder goes about reordering the patches.

So, suppose we have the following repositories, where reading from left to right goes from the first patch to the last. We write $p_{i,j}$ for the $i$-th patch that belongs to the $j$-th repository, and when we want to specify that a patch $p_{i,j}$ is a tag we write $t_{i,j}$.

$r_1 = p_{1,1} p_{2,1} \ldots p_{n,1} p_{n+1,1} \ldots p_{m,1}$

$r_2 = p_{1,1} p_{2,1} \ldots p_{n,1} p_{1,2} \ldots p_{k,2} t_{1,2} p_{k+1,2} \ldots p_{l,2}$

where the shared prefix $p_{1,1} p_{2,1} \ldots p_{n,1}$ (the red part) is the state at which $r_2$ was cloned from $r_1$, and the rest is how each repository evolved afterwards. Now, suppose we merge $r_2$ into $r_1$ by making a bundle of the patches of $r_2$ and applying it in $r_1$. After the merge we have

$r_1 = p_{1,1} p_{2,1} \ldots p_{n,1} p_{n+1,1} \ldots p_{m,1} p_{1,2} \ldots p_{k,2} t_{1,2} p_{k+1,2} \ldots p_{l,2}$

and we are in the situation where the tag $t_{1,2}$ is dirty, because the patches $p_{n+1,1} \ldots p_{m,1}$ (the green part) sit in the middle.

Now we are in a position to see how darcs reorders the patches. The first task is to select the first tag when looking at $r_1$ in reverse. Suppose $t_{1,2}$ is that tag (i.e., $p_{k+1,2} \ldots p_{l,2}$ are not tags). The set of patches (the repository) is then split into

$ps_{t_{1,2}} = p_{1,1} p_{2,1} \ldots p_{n,1} p_{1,2} \ldots p_{k,2} t_{1,2}$

and the rest of the patch set,

$rest = p_{n+1,1} \ldots p_{m,1} p_{k+1,2} \ldots p_{l,2}$

This is done by splitOnTag, which I don't totally understand yet, so for the moment... it simply does the above :) The part that interests us now is $rest$: we want to delete all the patches of $rest$ that exist in $r_1$ and then add them again, so that they show up on the right. This job is done by tentativelyReplacePatches, which first calls tentativelyRemovePatches and then calls tentativelyAddPatches.

So, tentativelyRemovePatches of $r_1$ and $rest$ gives

$r_{1}' = p_{1,1} p_{2,1} \ldots p_{n,1} p_{1,2} \ldots p_{k,2} t_{1,2}$

and tentativelyAddPatches of $r_{1}'$ and $rest$ gives

$r_{1}'' = p_{1,1} p_{2,1} \ldots p_{n,1} p_{1,2} \ldots p_{k,2} t_{1,2} p_{n+1,1} \ldots p_{m,1} p_{k+1,2} \ldots p_{l,2}$

leaving $t_{1,2}$ clean.

Well, all of this was for understanding the "solution" for the issue. We are almost there, but first let's look at the function tentativelyRemovePatches. It attempts to remove patches with one special care: when one runs darcs revert, a special file called unrevert is generated in _darcs/patches, which darcs unrevert uses in case one makes a mistake with darcs revert. One important difference is that, unlike all the other files in _darcs/patches, unrevert is not a patch but a bundle that contains a patch and a context. This context makes it possible to know whether the patch is applicable. So when one removes a patch (running for example obliterate, unrecord or amend), that patch has to be removed from the bundle-revert (the bundle in the file _darcs/patches/unrevert). It's not always possible to adjust the unrevert bundle, in which case the operation continues only if the user agrees to delete the unrevert bundle.

But now a question emerges: is it necessary to adjust the bundle-revert in the case of reorder? The answer is no, because we don't delete any patch of $r_1$, so we can still apply the bundle-revert in $r_{1}''$. So, finally! We find out that for reorder we need a special case of removing, which doesn't try to update the unrevert bundle. And this ends up being the "solution" for the issue, since the reorder blocks in that function. But! Although this solves the issue, something weird is still happening, which is the reason for the double quotes around solution :)

This is more or less the step forward for now. The tasks ahead are documenting the code in various parts and making the special case for the function tentativelyRemovePatches.

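The reorder steps described in this post can be sketched in a few lines. This is only an illustration under strong assumptions: patches are plain labels, `deps` (a hypothetical dictionary) says which patches each tag covers, and moving a patch is just list manipulation, whereas darcs has to commute real patches. The function names are made up for the sketch and only mimic the roles of splitOnTag, tentativelyRemovePatches and tentativelyAddPatches.

```python
def split_on_last_tag(patches, deps):
    """Split a repository into (patches covered by the last tag + the tag, rest).

    patches: list of patch labels in repository order.
    deps: dict mapping each tag label to the set of patches it covers.
    """
    for patch in reversed(patches):
        if patch in deps:
            tag = patch
            break
    else:
        return [], list(patches)  # no tag at all: everything is "rest"
    covered = [p for p in patches if p in deps[tag]] + [tag]
    rest = [p for p in patches if p not in deps[tag] and p != tag]
    return covered, rest


def reorder(patches, deps):
    """Move every patch not covered by the last tag to the right of it."""
    _, rest = split_on_last_tag(patches, deps)
    remaining = [p for p in patches if p not in rest]  # the "remove" step
    return remaining + rest                            # the "add" step
```

For instance, with n = 2, m = 4, k = 2 and l = 3, the merged repository p11 p21 p31 p41 p12 p22 t12 p32 becomes p11 p21 p12 p22 t12 p31 p41 p32, which has the shape of $r_{1}''$ above.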
On the way I will probably understand more about some of the functions that I mentioned before, so I will probably add more info and rectify whatever is needed.

June 03, 2014

Ale Gadea

Google Summer of Code 2014 - Darcs

June 03, 2014 06:46 PM UTC

Hi hi all!

I have been accepted in the GSoC 2014 :) As part of the work I'll be writing about my progress. The original plan is to have a summary per week (or at least I hope so, jeje). I have already been reading some of the darcs code and fixing some issues:

• Issue 2263 ~ Patch 1126
• Issue 1416 ~ Patch 1135
• Issue 2244 ~ Patch 1147 (needs-screening) (not any more $\ddot\smile$)

The details about the project are in History Reordering Performance and Features. Some issues related to the project are:

• Issue 2361
• Issue 2044

Cheers!

First Week (19-23 may)

June 03, 2014 06:42 PM UTC

Sadly, a slow first week: I lost Monday to problems with my notebook, for which I had to reinstall ghc, cabal, all the libraries, etc. But in the end this helped :)

The list of tasks for the week included:

1. Compile and run darcs with profiling flags
2. Write scripts to generate dirty-tagged big repositories
3. Check memory usage with hp2any for the command optimize --reorder for the generated repositories and repo-issue2361
4. Check performance difference with and without patch-index
5. Document reorder implementation on wiki
6. Actually debug/optimize reorder of issue2361 (Stretch goal)

1. Compile and run darcs with profiling flags

This seems pretty easy at first, but it turned out somewhat annoying because one has to install all the libraries with the profiling option. So here is a mini step-by-step of my installation of darcs with profiling flags (I'm using ubuntu 14.04, ghc-7.6.3 and cabal-install-1.20.0.2):

- Install the ghc-prof package, in my case with sudo apt-get install ghc-prof
- Install the dependencies of darcs with enable-library-profiling, doing:
  - $ cabal install LIB --enable-library-profiling (for each library :) )
  - or setting library-profiling: True in ~/.cabal/config
- Finally install darcs with enable-library-profiling and enable-executable-profiling

2. Write scripts to generate dirty-tagged big repositories

There is not much to say about this: I wrote some libraries to make the scripts that generate the repositories more straightforward. I also wrote some examples, but I am still in search of interesting ones. During the week I will probably add more examples, hopefully interesting.

3, 4 and 5 all together and mixed

Now, when I finally started to generate the example repositories and play with hp2ps to check different things, I started to think about other things and ended up studying the implementation of the command optimize --reorder. In particular, I started to write a version which prints some info during the ordering of patches, but for now it is a very dirty implementation.

April 27, 2014

Marcio Diaz

GSoC Progress Report #1: Complete Repository Garbage Collection

April 27, 2014 05:06 AM UTC

In my first week I worked on completing the garbage collection for repositories.

Darcs stores all the information it needs under the _darcs directory. In this part of the project we are only interested in the files stored in three directories:

• _darcs/patches/: stores the patches.
• _darcs/pristine.hashed/: stores the last saved state of working copy.
• _darcs/inventories/: stores the inventories (lists of patches).

While working on a project under version control, these directories grow in size. Every time we record a new patch:

• A new inventory file is stored in _darcs/inventories/ containing the augmented list of patches. Now, the old inventory file (without the new patch) is no longer needed (this is true in most cases).
• A new patch file is stored in _darcs/patches/. If we later unrecord this patch, the patch file is no longer needed.
• The same happens with _darcs/pristine.hashed/.

So, why do we keep these files if we no longer need them? Well, that’s because darcs wants to be fast and does not delete these files over time. Also it’s because if the repository is public and someone is cloning it, you don’t want to have some files disappearing in the process.

Until now, the "darcs optimize" command only knew how to clean up the _darcs/pristine.hashed directory, and the only way to clean the other two directories was to do a "darcs get". With the changes introduced, "darcs optimize" now also cleans these directories.

Algorithms:

The implemented algorithm was pretty straightforward, in pseudo-code:

- inventory = _darcs/hashed_inventory
- while (inventory)
    - useful_inventories += inventory
    - inventory = next_inventory(inventory)
- remove files not in useful_inventories.

- inventory = _darcs/hashed_inventory
- while (inventory)
    - useful_patches += get_patches(inventory)
    - inventory = next_inventory(inventory)
- remove files not in useful_patches.

We can see that we traverse the inventory list twice, once for the inventories and once for the patches. Although this is not optimal, I think it is more modular, since now we have a function that gets the list of patches.
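The pseudo-code can be turned into a small runnable sketch. Here the chain of inventories is modelled as an in-memory dictionary (an assumption made for illustration; darcs reads the inventories from files under _darcs), and all function names are hypothetical.

```python
def useful_inventories(head, inventories):
    """Follow the inventory chain from head and collect every inventory on it.

    inventories: dict mapping an inventory hash to
                 {"parent": hash or None, "patches": [patch hashes]}.
    """
    useful = set()
    inventory = head
    while inventory is not None:
        useful.add(inventory)
        inventory = inventories[inventory]["parent"]
    return useful


def useful_patches(head, inventories):
    """Second traversal: collect the patches listed by the reachable inventories."""
    useful = set()
    inventory = head
    while inventory is not None:
        useful.update(inventories[inventory]["patches"])
        inventory = inventories[inventory]["parent"]
    return useful


def garbage(files_on_disk, useful):
    """Files that can be removed: everything on disk that is not useful."""
    return set(files_on_disk) - useful
```

As in the pseudo-code, the chain is walked twice, which keeps the function that lists the patches independent of the one that lists the inventories.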

Commands affected:

- darcs optimize

Use cases:

It is useful when you need to free space on your hard disk.
For example:
- Record a new patch.
- Unrecord the new patch.
- Run optimize for garbage collecting the unused files corresponding to the unrecorded patch. Details in: http://pastebin.com/vYHiYV0F
You can find more use cases in the regression test script:

Issues solved:

Patches created:

http://bugs.darcs.net/patch1134.

April 26, 2014

Marcio Diaz

GSoC project accepted

April 26, 2014 09:36 PM UTC

I was accepted for the Google Summer of Code 2014. I'll be working for Haskell.org and my project will focus on improvements to the Darcs version control system.

The project consists of several parts:

1. Complete garbage collection for repositories.
2. Bucketed global cache.
3. Garbage collection of global cache.
4. Investigate and implement darcs undo command.
5. Investigate and implement darcs undelete command.
Here is a detailed description of my project proposal: http://darcs.net/GSoC/2014-Hashed-Files-And-Cache.

I'll try to give weekly updates of how my work is going, and let you know about the problems and solutions that I find in my way.

Thanks Haskell.org, thanks Darcs and last but not least thanks Google for giving us this awesome opportunity.

November 03, 2013

Simon Michael

darcsum 1.3

November 03, 2013 07:38 PM UTC

• Fix a hang when reverting, when darcs responds with “Will not ask whether to revert this already decided patch…”.

• Fixed an error in at least my local darcsum, which caused it to break when darcsum-debug was enabled.

• Fixed the four warnings my emacs gave when byte-compiling it. These fixes could use some testing.

• Reviewed the status and backlog. Last release was 2010, the ELPA package dates from 2012, there’s a bunch of unreleased fixes, the site script needs updating for hakyll 4, the project still needs a maintainer.

And since I came this far, I’ll tag and announce darcsum 1.3. Hurrah!

This release includes many fixes from Dave Love and one from Simon Marlow. Here are the release notes.

Site and ELPA package updates will follow asap. All help is welcome.

September 26, 2013

Simon Michael

darcsden/darcs hub GSOC complete

September 26, 2013 11:48 AM UTC

Aditya BSRK’s darcsden-improvement GSOC has concluded, and I’ve recently merged almost all of the pending work and deployed it on darcs hub.

You can always see the recently landed changes here, but let me describe the latest features a little more:

File history - when you browse a file, there’s a new “file changes” button which shows just the changes affecting that file.

File annotate - there’s also a new “annotate” button, providing the standard view showing which commit last touched each line of the file. (also known as the blame/praise feature). It needs some CSS polish but I’m glad that the basic side-by-side layout is there.

More reliable highlighting while editing - the file editor was failing to highlight many common programming languages - this should be working better now. (Note highlighting while viewing and highlighting while editing are independent and probably use different colour schemes, this is a known open wishlist item.)

Repository compare - when viewing a repo’s branches, there’s a new “compare” button which lets you compare (and merge from) any two public repos on darcs hub, showing the unique patches on each side.

Cosmetic fixes - various minor layout and rendering issues were fixed. One point of discussion was whether to use the two-sided layout on the repo branches page as well. Since there wasn’t time to make that really usable I vetoed it in favour of the less confusing one-sided layout. I think showing both sides works well on the compare page though.

Patch bundle support - the last big feature of the GSOC was patch bundles. This is an alternative to the fork repo/request merge workflow, intended to be more lightweight and easy for casual contributors. There are two parts. First, darcs hub issue trackers can now store darcs patch bundle files (one per issue I think). This means patches can be uploaded to an issue, much like the current Darcs issue/patch tracker. But you can also browse and merge patches directly from a bundle, just as you can from another repo.

The second part (not yet deployed) is support for a previously unused feature built in to the darcs send command, which can post patches directly to a url instead of emailing them. The idea (championed by Aditya and Ganesh) is to make it very easy for someone to darcs send patches upstream to the project’s issue tracker, without having to fork a repo, or even create an account on darcs hub. As you can imagine, some safeguards are important to avoid becoming a spam vector or long-term maintenance headache, but the required change(s) are small and I hope we’ll have this piece working soon. It should be interesting to have both workflows available and see which works where.

I won’t recap the older new features, except to say that pack support is in need of more testing. If you ever find darcs get to be slow, perhaps you’d like to help test and troubleshoot packs, since they can potentially make this much faster. Also there are a number of low-hanging UI improvements we can make, and more (relatively easy) bugs keep landing in the darcs hub/darcsden issue tracker. It’s a great time to hack on darcs hub/darcsden and every day make it a little more fun and efficient to work with.

I really appreciate Aditya’s work, and that of his mentor, Ganesh Sittampalam. We did a lot of code review which was not always easy across a large time zone gap, but I think the results were good. Congratulations Aditya on completing the GSOC and delivering many useful features, which we can put to good use immediately. Thanks!

September 20, 2013

Jose Luis Neder

Automatic detection of replaces for Darcs - Part 1

September 20, 2013 03:25 PM UTC

In the last post I showed some examples and use cases of the "--look-for-replaces" flag for the whatsnew, record, and amend-record commands in Darcs. When used, this flag provides automatic detection of (possible) replaces, even when the modified files show more differences than just the replaces, and it even shows possible "forced" replaces.
The simplest case is when you make a replace in your editor of choice, don't make any other change to the file and then, after checking that all is ok, remember that you could have used a replace patch.

file before:
line1 foo
line2 foo
line3 foo
file after:
line1 bar
line2 bar
line3 bar
> darcs revert -a file
Reverting changes in "file":

Finished reverting.
> darcs replace foo bar file
> darcs record -m "replace foo bar"
replace ./file [A-Za-z_0-9] foo bar
Shall I record this change? (1/1) [ynW...], or ? for more options: y
Do you want to record these changes? [Yglqk...], or ? for more options: y
Finished recording patch 'replace foo bar'
With the new flag, you could instead do:
> darcs record --look-for-replaces -m "replace foo bar"
replace ./file [A-Za-z_0-9] foo bar
Shall I record this change? (1/1) [ynW...], or ? for more options: y
Do you want to record these changes? [Yglqk...], or ? for more options: y
Finished recording patch 'replace foo bar'
But it doesn't have to be a full replace. For instance, if you leave a couple of occurrences unreplaced, then when you try to detect the changes, instead of:
file before:
line1 foo
line2 foo
line3 foo
line4 foo
file after:
line1 bar
line2 bar
line3 bar
line4 foo
> darcs whatsnew
hunk ./file 1
-line1 foo
-line2 foo
-line3 foo
+line1 bar
+line2 bar
+line3 bar
With the new flag you could record this:
> darcs whatsnew --look-for-replaces
replace ./file [A-Za-z_0-9] foo bar
hunk ./file 4
-line4 bar
+line4 foo
Say you replace a word for another word that was already in the file. Normally this would mean that you should use "darcs replace --force". The look-for-replaces flag always "forces" the replaces, so if you try this, the changes to make the replace reversible will be shown before the replace patch:
file before:
line1 foo
line2 foo
line3 foo
line4 bar
file after:
line1 bar
line2 bar
line3 bar
line4 bar
With the new flag you will see the same patches as if you had run "darcs replace --force foo bar file":
> darcs whatsnew --look-for-replaces
hunk ./file 4
-line4 bar
+line4 foo
replace ./file [A-Za-z_0-9] foo bar
Given certain limitations you could have any number of replaces detected, like this:
file before:
foo foo2 foo3
fee fee2 fee3
file after:
bar bar2 bar3
bor bor2 bor3
All the replaces are shown below:
> darcs whatsnew --look-for-replaces
replace ./file [A-Za-z_0-9] fee bor
replace ./file [A-Za-z_0-9] fee2 bor2
replace ./file [A-Za-z_0-9] fee3 bor3
replace ./file [A-Za-z_0-9] foo bar
replace ./file [A-Za-z_0-9] foo2 bar2
replace ./file [A-Za-z_0-9] foo3 bar3
If you want to know more about the limitations of this functionality, check Automatic detection of replaces for Darcs - Part 2.

Automatic detection of replaces for Darcs - Part 2

September 20, 2013 09:08 AM UTC

Over the last few weeks I was implementing the "--look-for-replaces" flag for the whatsnew, record, and amend-record commands in Darcs. When used, this flag provides automatic detection of (possible) replaces even when the modified files show more differences than just the replaces, provided they meet the following prerequisites:
1. For a given "word" and a given file, not all instances need to be replaced, but there must be only one possible replace suggestion, i.e.:

this is ok:
file before:
foo
foo
foo
file after:
foo
bar
bar
this is not detected:
file before:
foo
foo
foo
file after:
foo
bar
bar2
2. The replace must happen in lines that have the same number of words in the recorded and the working state, otherwise it will not be detected.
this is ok:
file before:
foo
foo
foo
file after:
foo roo
bar fee
bar
this is not detected (I don't know which replace to detect anyway):
file before:
figaro foo
figaro foo
figaro foo
file after:
figaro foo
figaro bar bee
figaro foo bar
3. There must be at least one hunk with the same number of lines on the - and + sides that contains the replace.
this is not detected:
file before:
line1 foo
line2 foo
line3 foo
file after:
line1 bar
line2or3 bar
This replace would not be detected, even though it is a "perfect" replace, because the hunk does not have the same number of lines on both sides, and it is not trivial to tell which line was "modified" and which one was "deleted".

For more details about the implementation, see the look-for-replaces wiki page.
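The three prerequisites can be illustrated with a small sketch. It is a simplification and not the code in darcs: the whole file is treated as a single hunk whose lines are compared pairwise, words are split on whitespace, and the function name is hypothetical.

```python
def detect_replaces(before, after):
    """Suggest word replaces between two versions of a file (lists of lines).

    Returns a dict mapping each old word to its single suggested new word.
    """
    # Prerequisite 3 (simplified): both sides need the same number of lines.
    if len(before) != len(after):
        return {}
    suggestions = {}
    for old_line, new_line in zip(before, after):
        old_words, new_words = old_line.split(), new_line.split()
        # Prerequisite 2: only compare lines with the same number of words.
        if len(old_words) != len(new_words):
            continue
        for old, new in zip(old_words, new_words):
            if old != new:
                suggestions.setdefault(old, set()).add(new)
    # Prerequisite 1: keep a word only if exactly one replacement is suggested.
    return {old: next(iter(news))
            for old, news in suggestions.items() if len(news) == 1}
```

Run on the examples of this post, the sketch suggests foo -> bar for the two "ok" cases and nothing for the three "not detected" ones.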

Automatic detection of file renames for Darcs - Part 2

September 20, 2013 09:07 AM UTC

Over the last few weeks I have been refining the implementation of automatic detection of file renames, adding support for Windows and for more complicated renames.

Now, if you like, you can consult the inode information saved in the index at any time with "darcs show index":
⮁ darcs init
⮁ mkdir testdir
⮁ touch testfile
⮁ darcs record -al -m "test files"
Finished recording patch 'test files'
⮁ ls -i1d . testdir testfile
2285722 .
2326707 testdir
2238437 testfile

⮁ darcs show index
07ec6ccf873cf215ac0789a420f154ba9218b7ca5c4fce432584edab49766a7c 2285722 ./
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 2326707 testdir/
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 2238437 testfile
Now, with the new dependency algorithm, you can make more complicated renames, such as exchanging file names and moving folders. The algorithm doesn't handle exchanges of file names inside folders that have themselves been renamed by exchanging names; anything else is handled fine.
For example:
⮁ ls -1pC
_darcs/  dir/  dir2/  dir3/  foo  foo2  foo3  foo4  foo5
⮁ mv foo dir3
⮁ mv foo2 dir
⮁ mv foo3 dir2
⮁ mv foo4 foo4.tmp
⮁ mv foo5 foo4
⮁ mv foo4.tmp foo5
⮁ mv dir3 dir
⮁ mv dir dir2/dir2
⮁ mv dir2 dir
⮁ darcs whatsnew --look-for-moves
move ./dir ./dir2/dir2
move ./dir2 ./dir
move ./dir3 ./dir/dir2/dir3
move ./foo ./dir/dir2/dir3/foo3
move ./foo2 ./dir/dir2/foo2
move ./foo3 ./dir/foo3
move ./foo4 ./foo4.tmp~
move ./foo5 ./foo4
move ./foo4.tmp~ ./foo5
The moves shown by "darcs whatsnew --look-for-moves" are not exactly the ones made but yield the same final result.
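
The foo4/foo5 exchange above shows why temporary names are needed. A minimal Python sketch of that idea follows; it is not the Darcs algorithm, the name order_moves and the ".tmpN" suffix are hypothetical, and it only handles flat file names (no directories), assuming no two files are renamed to the same target.

```python
def order_moves(renames):
    """Turn simultaneous renames {old: new} into a safe sequence of moves."""
    pending = dict(renames)
    moves = []
    counter = 0
    while pending:
        # A move is safe when its target is not the source of a pending move.
        ready = [old for old, new in pending.items() if new not in pending]
        if ready:
            for old in ready:
                moves.append((old, pending.pop(old)))
            continue
        # Every remaining target is still occupied, so we are in a cycle:
        # park one file under a temporary name to break it.
        old = next(iter(pending))
        tmp = f"{old}.tmp{counter}"
        counter += 1
        moves.append((old, tmp))
        pending[tmp] = pending.pop(old)
    return moves


for src, dst in order_moves({"foo4": "foo5", "foo5": "foo4"}):
    print("move", src, dst)
```

For the exchange of foo4 and foo5 this yields three moves through a temporary name, in the same spirit as the whatsnew output above.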

August 14, 2013

Jose Luis Neder

Automatic detection of file renames for Darcs

August 14, 2013 04:29 AM UTC

In the last few weeks I have been implementing automatic detection of file renames, adding a "look-for-moves" flag to the amend-record, record, and whatsnew commands.

In darcs there are three states:

• The recorded state is the one marked by the last record made.
• The working state is the actual state of the files in the repository, with all the latest changes.
• The pending state is the one that marks changes like file adds, moves, replaces, etc., before they are recorded. It is a temporary state between recorded and working that lets darcs know which filenames to track, as well as uncommon changes like replaces.

If a file rename is not marked in the pending state, darcs loses track of the file and can't know where it is, and then darcs whatsnew and darcs record will report the file as deleted.
To detect such a file rename I chose to use the inode info in the filesystem to check for equality between different filenames in the recorded and working states of the repo. For those who don't know, the inode is an index number assigned by the file system to identify a specific file's data. The file name is linked to the data by this number, and directories use it as well. You can look up this number with "ls -i".
⮁ mkdir testdir
⮁ touch testfile
⮁ ls -i1
10567718 testdir
10485776 testfile
A hard link to the test file (created with "ln", for example) shares the same number as the test file. This is because a file name is essentially a hard link to the file data, and when you make a new hard link you are sharing the same file data, and so the same inode number.
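
The shared inode number can be checked with a few lines of Python. This is only an illustration on a POSIX-like filesystem that supports hard links; the file name "testhardlink" and the function name are made up for the example.

```python
import os
import tempfile


def hardlink_inodes():
    """Create a file and a hard link to it; return both inode numbers."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "testfile")
        link = os.path.join(tmp, "testhardlink")  # hypothetical name
        open(original, "w").close()
        os.link(original, link)  # a second name for the same file data
        return os.stat(original).st_ino, os.stat(link).st_ino


ino_file, ino_link = hardlink_inodes()
print(ino_file, ino_link, ino_file == ino_link)
```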
To keep an old inode-to-filename mapping, the files' inodes must be recorded somewhere, so I added the inode info to the hashed-storage index in _darcs/index. The index saves the latest info about the recorded state plus the pending state, more or less, so it is a perfect fit for this info.
Then, by comparing the RecordedAndPending Tree (from the index) with the Working Tree, I get the file changes as a list of pairs mapping between the two states. With this list I resolve dependencies between the different moves, creating temporary names if necessary, and generate an FL list of move patches to merge with the patches for the changes between pending and working.
These patches are shown by whatsnew, or can be selected with record/amend-record to be recorded in the repo.
There is a little more to making this happen, but that's the core idea of the implementation.
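
The core comparison can be sketched in Python as follows. It is a simplified stand-in rather than the actual Darcs code: a dictionary built by snapshot plays the role of the inode info stored in the index, all function names are hypothetical, and hard links (several names for one inode) are not handled.

```python
import os
import tempfile


def snapshot(root):
    """Map inode number -> path relative to root for every entry under root."""
    inodes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            inodes[os.lstat(path).st_ino] = os.path.relpath(path, root)
    return inodes


def detect_moves(recorded, working):
    """Pair up paths that keep their inode but changed name."""
    return sorted(
        (old, working[ino])
        for ino, old in recorded.items()
        if ino in working and working[ino] != old
    )


def demo():
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "dir"))
        open(os.path.join(root, "foo"), "w").close()
        recorded = snapshot(root)  # stands in for the inode data in the index
        os.rename(os.path.join(root, "foo"), os.path.join(root, "dir", "foo2"))
        return detect_moves(recorded, snapshot(root))


print(demo())
```

Because only inode numbers are compared, the file contents never enter the picture. Note that in this naive version, renaming a directory also reports every entry below it as moved.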
The algorithm doesn't care whether the files are modified, because it doesn't look at their content, so it's very robust in that sense.
With this implementation you can do any move directly with "mv". Detection is very lightweight and fast, so making "--look-for-moves" a default flag is likely a good decision. You could do things like this:
⮁ darcs init
Repository initialized.
touch foo