Pixelastic

You can cut our wings but we will always remember what it was like to fly.

Posts tagged with "mercurial"

Working on Wednesday #9 : Mercurial

I feel like I'm falling further and further behind on what I originally intended to do. I still haven't gone any further with Rails and have instead wandered off into different learning side projects.

I've been reading Mercurial: The Definitive Guide for the past few days to get a proper grasp of the tool.

I've been using Mercurial for the past two years, but through a GUI and without using any "advanced" features. I never branched a project, and always worked alone.

Commands

Now that I'm working on a Linux machine every day, I can use hg through the command line.

hg commit -Am "commit message" is the same as hg addremove; hg commit -m "commit message".

hg rollback will remove the last local commit. Useful if you forgot files in the commit, or if you made a typo in your commit message.

hg revert can revert a file or set of files to the state they were in at the last commit. This can also cancel an hg add or hg remove.

hg backout can "forget" a commit in the history. It will not really forget the commit (i.e. it will not let you alter the history). Instead, it will create a new commit in which the specified changeset is undone (through a merge). It can easily back out the tip, but may involve more merge work if you want to back out an old changeset.

Automation

Also, I've learned about two great Mercurial tools.

hg bisect lets you isolate the commit in your history where you introduced a specific bug. You write a piece of code that, given a changeset, returns true or false depending on whether the bug is present, and hg bisect will cleverly search the history to find the revision that introduced the bug.
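Here is a rough sketch of what an automated bisect run could look like. The predicate is purely illustrative: the file name app.txt, the string BROKEN and the script name check_bug.sh are all made up, and in practice the predicate would be whatever test reproduces your bug.

```shell
# Hypothetical predicate: succeeds (exit 0) when the working copy is good,
# fails when the bug is present. Here the "bug" is assumed to be the string
# BROKEN appearing in app.txt (both names are invented for the example).
is_good() {
    ! grep -q "BROKEN" "${1:-.}/app.txt"
}

# Sketch of the bisect session; only defined here, not run.
# $1 = a revision known to be good, $2 = a revision known to be bad.
run_bisect() {
    hg bisect --reset            # forget any previous bisect state
    hg bisect --good "$1"
    hg bisect --bad "$2"
    # hg checks out each candidate revision and runs the command:
    # exit status 0 marks it good, most non-zero statuses mark it bad.
    # check_bug.sh is assumed to wrap the is_good predicate above.
    hg bisect --command "sh check_bug.sh"
}
```

Calling something like run_bisect 10 tip from the repo root would then let Mercurial do the whole search on its own.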

Hooks were also very interesting. You can script automatic commands to run on specific hg commands like commit, pull or push. Or even before those commands, to refuse the command if something does not work as expected.

The classical examples were running a build process after a commit, and refusing a commit if no bug id was specified or if the tests didn't pass. Another use case would be to push changes to a remote server on commit.
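As an illustration of the "refuse a commit without a bug id" idea, here is a small sketch. The message format ("bug 123"), the hook names after the dot and the script name check_bug_id.sh are my own assumptions, not something Mercurial imposes.

```shell
# Hypothetical check: succeed only if the commit message mentions a bug id
# such as "bug 123" or "Bug #123" (the expected format is an assumption).
has_bug_id() {
    printf '%s\n' "$1" | grep -Eiq 'bug *#?[0-9]+'
}

# Meant to be called from a pretxncommit hook, where the new changeset is
# already visible as "tip". A non-zero exit status rolls the commit back.
check_tip_message() {
    has_bug_id "$(hg log -r tip --template '{desc}')"
}

# Possible wiring in an hgrc (names after the dot are arbitrary labels):
#   [hooks]
#   pretxncommit.bugid = sh check_bug_id.sh
#   commit.build = make
```

The commit hook runs after the commit has succeeded, so it is the natural place for a build step, while pretxncommit can still veto the commit.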

Git

Why am I learning Mercurial while all the cool guys are using git?

Well, I've read a lot of articles comparing hg to git. What I've read most often is that git is an awesome toolbox that lets you do whatever you want with your version control, through its 100+ tools.

On the other hand, Mercurial is far easier to learn and has built-in commands for the day-to-day work. As I was already quite familiar with Mercurial, I'm sticking with it, but I know that I'll eventually learn git too.

Differences between Mercurial and Subversion

People at my current job are using Subversion, and our project will be tracked using it. I had never used Subversion before; the only version control system I had ever used was Mercurial. Moving from one to the other meant changing a lot of my reflexes.
Here is a short list of differences, mostly as a reminder for myself.

Directories

Mercurial uses a single .hg directory at the root of your project to store all your project history, while Subversion adds a .svn directory in each directory, to track the changes to that directory only.

I prefer the Mercurial approach, as it keeps all changes centralized in one place. You can simply remove the .hg directory to turn your versioned project into a stand-alone copy.

The Subversion approach, on the other hand, litters your app with countless hidden directories, making copying and pasting a real pain.

Centralised vs Distributed

Subversion is centralised while Mercurial is distributed.
As far as I understand the difference, it means that Subversion uses one main repository to store the versioned project and can deploy (export) a specific revision at any time.

That exported revision does not hold any history information; it is only a copy of your project at a given time.

On the other hand, each Mercurial repository holds both the current working version and all the history. You do not have to deploy anything anywhere, just update your current repo with data from one of the revisions.

Tortoise

I am using both TortoiseSVN and TortoiseHg. When you commit with TortoiseSVN it displays the list of files that were updated since the last commit. If you added new files, they won't show unless the "Show unversioned files" option is checked.

In TortoiseHg, all new files are automatically shown in the commit window, along with a diff. It allows me to easily see what changes were made, and helps me write my commit message.
I much prefer the TortoiseHg approach.

Tracking directories

Subversion can track empty directories, just by adding them. Mercurial can't. You have to add an empty file in each one to allow tracking.
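A small sketch of how the placeholder files could be created in one go. The file name .hgkeep is an arbitrary choice of mine, not a Mercurial convention, and the -empty test is a GNU/BSD find extension.

```shell
# Drop a placeholder file into every empty directory under the given root
# (default: current directory) so that Mercurial has something to track.
# The .hg directory itself is skipped.
keep_empty_dirs() {
    find "${1:-.}" -name .hg -prune -o -type d -empty \
        -exec sh -c 'touch "$1/.hgkeep"' _ {} \;
}
```

After running keep_empty_dirs from the repo root, an hg add will pick up the new placeholder files like any other file.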

Also, when doing a commit from a specific directory in Mercurial, it will commit the whole repo, while Subversion will only commit the current directory. I can see the benefits of both and I'm not quite sure which is better.

Subversion allows me to commit one specific feature by committing only one directory, but the TortoiseHg integration helps me commit more easily no matter where I'm browsing.

Ignoring files

I have some files in my project I don't want to track (like auto-generated cache files). In Mercurial, all I need to do is edit the .hgignore file with regexps. The syntax can be a little strange sometimes and it took me a while to understand it correctly, but it definitely works.
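To give an idea of the syntax, here is a sketch that writes a sample .hgignore. The ignored paths are invented examples; note that regexp patterns are not anchored to the repo root unless they start with ^.

```shell
# Write a sample .hgignore into the given directory (default: current one).
# All the paths and extensions below are made-up examples.
write_hgignore() {
    cat > "${1:-.}/.hgignore" <<'EOF'
syntax: regexp
# anything under tmp/cache/ at the repository root
^tmp/cache/
# editor backup files anywhere in the tree
~$
# compiled Python files
\.pyc$
EOF
}
```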

On Subversion, I can add files to the ignore list so they don't show as "unversioned", but I can also add an svn:ignore property to a specific directory to set patterns for files that I don't want to track. The end result is the same, but the way Mercurial handles it with one file is more appealing to me.

Conclusion

Having all the history in one dir and all the ignore rules in one file seems a better approach to me. I guess in some respects Mercurial is still more centralised than Subversion.

My new backup strategy for 2011

My computer had been getting slower and slower over the past few days. And I realized my automatic backup hadn't been backing anything up for the past month.

And I realized I had different versions of the same files on my 2 laptops...

Well, it seems I have to do some cleaning up.

Synchronizing paperwork

I started by cleaning up my Dropbox folder. I removed shared folders with past clients, and created a "Paperwork" folder where I put all my invoices, contracts and general paperwork.

I also added my private KeePass file as well as other info I may need to access anywhere, anytime.

KeePass allows me to store all my login/password credentials in a secure way (protected by a master password). It is really useful to have this file on all my computers (and mobile phone).

Dropbox is excellent for storing simple files that you need everywhere. Being able to access invoices and contracts even from my mobile phone proved quite valuable when meeting clients.

Hard backup of personal files

I've also changed my scheduled backups of personal files. I bought Acronis True Image last year, and reconfigured it today.

I have a hard drive whose sole purpose is to store backups. I scheduled a backup for the first of each month that saves my system state, my application configuration, and my personal files (photos, saved games, writings, etc).

I manually started all these backups to have a clean start. I also forced the backup to start a whole new file every 6 months (as opposed to extending the incremental backup forever).

Backing up my music and movies

I did not spend too much time figuring out how to save my hundreds of GB of music and movies. I rarely watch the same movie twice, so losing them won't affect me too much.

I occasionally re-watch series, though, but as most of my friends have the same tastes as I, I could very easily get them back from them, or download them (again).

Regarding music, well, I have quite a big collection, but most of it is already "backed up" on my portable mp3 player.

Automatic synchronizing with BitBucket

In my day job, I now always version my files using Mercurial. BitBucket offers unlimited storage, and unlimited public repositories. Private repos are limited to 5 users. As I'm mostly alone on projects that should stay private, this seems the best deal I could find.

Mercurial being a version control system, I get all the benefits of a backup here: being able to revert to previous versions, update it whenever I want and access it from anywhere.

I wrote a custom Hg hook on commit to automatically push my repos to BitBucket at least once a day (I'll post the code in a future post).
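This is not that hook, just a rough sketch of how such a thing could work: on commit, push only if no automatic push has happened in the last 24 hours. The stamp file location inside .hg, the script name and the use of the repo's default push path are all assumptions, and date +%s is not strictly POSIX.

```shell
# Decide whether a push is due: succeeds when the stamp file is missing
# or holds a timestamp at least 24 hours old.
# $1 = stamp file, $2 = current time in seconds since the epoch.
push_is_due() {
    [ -f "$1" ] || return 0
    last=$(cat "$1")
    [ $(( $2 - last )) -ge 86400 ]
}

# Meant to be run by a commit hook (shell hooks run from the repo root).
# Records the time only when the push actually succeeded.
push_daily() {
    stamp=".hg/last-auto-push"
    now=$(date +%s)
    if push_is_due "$stamp" "$now"; then
        hg push && echo "$now" > "$stamp"
    fi
}

# Possible wiring in .hg/hgrc (script name is hypothetical):
#   [hooks]
#   commit.autopush = sh ~/bin/hg-daily-push.sh
```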

MySQL Backup

I used to back up MySQL databases on my work computer using a Windows app. This slowed down my computer on every boot, and the backup only ran when I was working, not when I was on vacation.

Today, I wanted something a lot more flexible, so I set a cronjob on my main host coupled with a slightly edited autoMySQLBackup script.

This will automatically run every day at midnight and make a local save (with daily, weekly and monthly rotation) of all my clients' databases. Logs are saved on disk and gzipped, and will also be sent to a special backup@pixelastic.com mail address (stored on GMail).

This way I am sure to have my mysql backups on two different hosts, with daily and automatic saves, that I can access from anywhere if anything goes wrong.

Conclusion

It took me almost two full days to get the right tools, configure them and write my custom scripts but now, it is seamlessly integrated with my daily workflow. This is a weight off my shoulders, I know I can safely work as usual and my files are saved and easily accessible.

Using nested subrepos with Mercurial and TortoiseHg

Nowadays, when I'm developing a new website, I almost always end up using bits and pieces of previous websites I've built. All my websites are based on the same framework (cakePHP), which I have extended with my own CMS (Caracole, more on that later).

Caracole is made of several little plugins, each one of them focusing on a simple task (like handling 404 errors, adding a recycle bin, draftable elements, SEO-friendly url, and so on).

I've also uploaded each one of these plugins to BitBucket, allowing me to easily commit changes and clone new versions from one project to another.

But very often, when working on a specific project with a specific plugin, I realize that I can improve the plugin (either by adding a new feature or by fixing a bug I've discovered). In that case, I want my changes to be added to both the plugin (on BitBucket) and the project I'm working on at the moment.

To do that, I had to struggle my way through Mercurial, because nested repositories (called subrepos) are not a trivial setup.

Setting up a subrepo with Mercurial

Let me show you the classical and easy way to achieve that:

First, let's say you have your main repo. You go in the directory where you want to add your subrepo and you either create it using hg init or hg clone.

You then go back to your main repo root and edit the .hgsub file (if you don't have this file yet, just create it). Add the following line to the .hgsub :

path/to/your/subrepo = path/to/your/subrepo

Now, on every subsequent commit Hg will be aware that your repo is holding a new subrepo. If you omit this line, Hg will not allow you to commit, complaining about a repo inside another repo.

You can now safely commit your main repo, or your subrepo independently.
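The steps above could be scripted roughly like this. The function names are mine; the only thing added to the description above is that .hgsub itself has to be tracked (hg add) before the commit.

```shell
# Append a "local path = source" entry to .hgsub in the current directory,
# unless an identical line is already there.
# $1 = path of the subrepo inside the main repo
# $2 = where it comes from (defaults to the same path, as in the line above)
register_subrepo() {
    entry="$1 = ${2:-$1}"
    grep -qxF -- "$entry" .hgsub 2>/dev/null || echo "$entry" >> .hgsub
}

# Sketch of the whole sequence, to be run from the main repo root.
# Only defined here, not executed.
add_subrepo() {
    hg init "$1"                 # or: hg clone <source> "$1"
    register_subrepo "$1"
    hg add .hgsub                # .hgsub must be tracked like any other file
    hg commit -m "Add subrepo $1"
}
```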

Now, let's see the edge case.

Changing a classical sub directory into a subrepo

The classical example above is what you can find in the Mercurial help pages. It wasn't that helpful for me because my setup was a little different and it was causing Hg a lot of trouble.

I was not creating a new subrepo, nor cloning a new one. I had a subdirectory of my main app that I wanted to change into a subrepo. My subdirectory was named 'myplugin' and I had a repo of that name hosted on BitBucket.

So I tried to delete my existing 'myplugin' directory, clone 'myplugin' from BitBucket, edit the .hgsub and commit, but Hg aborted the operation, complaining about the repo-in-repo file structure.

After a lot of testing, and cries for help, I finally managed to get it to work. The workflow is almost the same, with one little new step.

Deleting the 'myplugin' folder wasn't enough. I had to tell Mercurial to completely remove these files from its index. Using TortoiseHg, I was able to do that by right-clicking on the folder, and then choosing 'TortoiseHg > Remove Files'. Then I had to commit those changes, officially telling Mercurial to forget these files, and putting it in a state where those files aren't there at all.

Only then was I able to clone my repo from BitBucket, edit the .hgsub file and commit my main repo.
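On the command line, I believe the equivalent sequence would look roughly like the sketch below, with hg remove standing in for the TortoiseHg 'Remove Files' entry. The function name and arguments are hypothetical, and you should try it on a throwaway clone first since it deletes the directory.

```shell
# Sketch: turn an already-tracked directory into a subrepo.
# $1 = directory name (e.g. myplugin), $2 = URL of the existing plugin repo.
# Run from the main repo root. Uncommitted work in "$1" is lost.
convert_to_subrepo() {
    if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
        return 2                 # refuse to run without both arguments
    fi
    hg remove "$1" || return 1   # stop tracking the files
    hg commit -m "Remove $1 before re-adding it as a subrepo" || return 1
    rm -rf "$1"                  # clear any untracked leftovers
    hg clone "$2" "$1" || return 1
    echo "$1 = $1" >> .hgsub     # same "path = path" form as shown earlier
    hg add .hgsub
    hg commit -m "Add $1 as a subrepo"
}
```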

cakePHP deployment with Mercurial on Dreamhost

I now use Mercurial on my daily work flow and have set up some methods on my dev machine to ease the pain of installing mercurial and make it work on any new webserver.

Here are some snippets that automate all that stuff. You may have to change a thing or two to accommodate your own setup.

First, I create a custom .bashrc file that I will put on the webserver, and add the following method to it:

hgInstall() {
# Keep downloaded sources and installed files under ~/.packages
mkdir -p ~/.packages/src
cd ~/.packages/src
wget http://www.selenic.com/mercurial/release/mercurial-1.2.tar.gz
tar xvzf mercurial-1.2.tar.gz
cd mercurial-1.2
python setup.py install --home="$HOME/.packages"

# Set the default user and disable extensions that do not exist on Dreamhost
echo -e "[ui]\nusername = Pixelastic <tim@mailastic.com>" >> ~/.hgrc
echo -e "[extensions]\nhgext/hbisect=!\nhgext.imerge=!" >> ~/.hgrc

# hg reads ~/.hgrc on every run, so there is nothing to reload
cd ~/
hg version
}

Let me explain. I first create a directory to store the packages I will download (in this example I will only download one package, but as I don't like to have files all around my server, I just keep them in this place). I will then download Mercurial 1.2 in this new directory, unzip it and install it.

Next step is configuring the default user and working around a problem with Dreamhost trying to load non-existing extensions (hgext/hbisect and hgext.imerge). Mercurial reads .hgrc each time it runs, so the change takes effect immediately; I just go back to the home directory and display hg version.

That's almost done. I also have to edit the .bash_profile and add the following lines:

export PYTHONPATH=~/.packages/lib/python
export PATH=~/.packages/bin:$PATH

Ok, so this method will download, install and configure Hg on the Dreamhost server. That's all very well, but I had to manually set up the .bashrc, so let's see if we can automate that as well.

Now, I'm editing my .zsh_aliases on my local machine (or your .bash_aliases if you're using bash) to add the following method:

dreamhost() {
scp ~/Documents/Config/Dreamhost/.bashrc ~/Documents/Config/Dreamhost/.bash_profile $1:~/
ssh $1 '. ~/.bashrc'
scp ~/.ssh/id_rsa.pub ~/Documents/Config/Dreamhost/.ssh/xpsfixe.pub $1:~/
ssh $1 'addKeys'
scp ~/Documents/Config/Dreamhost/cakeClearCache.sh $1:~/
ssh $1 'chmod +x ~/cakeClearCache.sh'
ssh $1 'hgInstall'
ssh $1
}

Ok, so this one is a little more complex. You have to call this method with one parameter, the user@domain credentials used to connect to your Dreamhost server. What it will do is upload (using scp) files from your local machine to the server and then run some commands on the server using ssh.

First it will upload the local versions of .bashrc and .bash_profile that are sitting on your dev machine and "reload" the .bashrc, allowing you to use the previously defined hgInstall directly in the shell.

Then, it will upload your ssh key(s) to the server and add them to the list of allowed keys (more on that later, just skip the addKeys line for now).

The next step is uploading (and giving the correct chmod to) a special script that will clear the cakePHP cache (more on that later too).

And the final step is calling the previously explained hgInstall method. So the only thing you have to do is put this method in your .zsh_aliases (and the corresponding keys, .bashrc and scripts in their corresponding places), then run dreamhost user@domain and Hg will be installed on your server.

So now let me get back a little on the two details I skipped. The first is the key stuff. What I'm doing is uploading your ssh key(s) to the server and then calling addKeys. It will authorize those keys to connect using ssh without having to type login/pass on each request. Here is the addKeys code (you have to put it in your .bashrc file and modify the filename to your own)

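addKeys() {
# -p avoids an error if .ssh already exists
mkdir -p .ssh
# Append the uploaded public keys to the list of authorized keys
cat id_rsa.pub >> .ssh/authorized_keys
cat xpsfixe.pub >> .ssh/authorized_keys
rm id_rsa.pub
rm xpsfixe.pub
# sshd ignores the keys if these permissions are too open
chmod go-w ~
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
}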

It will basically create the .ssh dir and authorized_keys file with your keys info. It will then delete the files and set the correct chmod.

And the second part was about that cache clearing thing. When you update your app using Hg, you do not want to update the cache files created by Cake: they contain file path references that are likely to differ between your test and prod environments and would surely break your whole app. So, you set an ignore rule for them in the .hgignore, like the following:

syntax:glob
app/tmp/cache/cake_*
app/tmp/cache/views/*.php
app/tmp/cache/models/cake_*
app/tmp/cache/persistent/cake_*

It works fine almost all the time, but it sometimes leads to errors as the cache is not regenerated between each hg update. Sometimes you have to alter a model schema or the way a value is stored in cache, and if you don't clear your cache, it can lead to unexpected results as the data will be wrongly parsed and used.

So what I did to avoid that was to create a script that will clear the cache for you. Here is the code (you have to be inside the project dir for this to work):

cd app/tmp/cache
rm -f cake_*
rm -f views/*\.php
rm -f models/cake_*
rm -f persistent/cake_*
cd ../../../

It will remove all the cache files generated by Cake that could interfere after an update. You just have to wrap those lines in a method in your .bashrc (mine is called cakeClearCache) and execute it after each update or when you have caching issues.

Well, I think you guessed that I did not stop here. Manually calling the method after each update can be a little tedious. So I put the previous code in a file named cakeClearCache.sh (you may have spotted that I uploaded this file in the dreamhost() method earlier). I also added the following lines to my /project/.hg/hgrc on my server (if you don't have this file, just create it, it's a per-project hg configuration file):

[hooks]
update = ~/cakeClearCache.sh

It means that every time an hg update is done, the specified script is fired. That's really fine for us, it means that the cache will be cleared on each update. Sounds good.

One last thing to do was creating the hgrc file automatically. That's why I created the following method (add it to the .bashrc file on the server). It is just a wrapper that will create the hgrc file after doing an hg init:

hgInitStart() {
hg init
echo -e "[hooks]\nupdate = ~/cakeClearCache.sh" >> ./.hg/hgrc
}

So instead of doing hg init, just do hgInitStart. You can then start cloning your project here.

And one last thing, I also created a method that will set the correct chmod on app/tmp and app/webroot/files:

cakeCorrectChmod() {
chmod 777 ./app/tmp -R
chmod 777 ./app/webroot/files -R
}

And I created a wrapper around it, to call just after cloning the project, that will update it and set the correct chmods:

hgInitEnd() {
hg update tip
cakeCorrectChmod
}

That's all. I bet anyone slightly more experienced in shell scripting could do better than that, but as I have struggled a little to get this right I thought I could share it.