Software Configuration

for an open source world

GIT - Fast Version Control System

configuration management system
Posted by Rob Castellow


The productive partnership of Junio Hamano and Linus Torvalds created the distributed version control system that the world knows as GIT. Originally invented by Linus Torvalds, it remains under the continuing development and maintenance of Junio Hamano. The GIT source control software works with many operating systems, including Linux, Darwin, BSD, Solaris and Windows. It functions without the use of a centralized server.



Software configuration management programs track the changes made to programs and manage the flow of updates. Several other software configuration management programs are available; however, GIT reportedly runs faster, especially on systems such as Linux, which was also created by Linus Torvalds. Torvalds, when discussing his product, admits that GIT does not try to do a great variety of things; instead, it handles its specific task both fast and well. GIT is reportedly able to handle a selection from more than 17,000 files in less than a second.



In comparison to alternative source management systems, GIT does not track individual files, but works on the collection as a whole. In order to track an individual file, you first select the collection that contains the file you require, and then ignore the other files in the group. GIT records only specific file data rather than every possible attribute. Therefore, should you require tracking of additional information, metadata, and/or home directories, the use of GIT will be limiting. File names are treated as byte sequences, so no character encoding conversion is applied to them.



The name GIT may be read as 'global information tracker'; however, its creator Torvalds is known to have joked that the word 'git' is also English slang for a stupid or dull person. He applies this to his product, explaining that the GIT source control software does not cover many avenues.



GIT is source control software that is available for free. Many software projects use GIT for revision control. Examples include One Laptop Per Child, the Ruby on Rails web framework, the Android mobile platform and the Linux kernel.



There have been several versions since the original was first released, and the features therefore vary depending on which version you have. For example, GIT 1.5 and later offer CRLF and LF line-ending conversion for different platforms. The latest version additionally lets you specify content filters that are applied when files are checked in or out.



The reason GIT originally worked better with Linux than with other operating systems was that the Linux kernel contains such a large number of files in comparison. Managing them was the main reason that the GIT source control software was originally created. Other projects often do not have software configuration management requirements that involve such an immense number of files. They tend to rely on a larger central core, and GIT does not work so well when used as a centralized system.



Inspired by the earlier programs Monotone and BitKeeper, GIT first surfaced to meet the requirement for a low-level engine on which front ends could be written. Today, GIT is a complete package for those seeking a revision control system. It was designed as a set of programs together with shell scripts that wrap them, and the need to run GIT on operating systems such as Microsoft Windows led to many of these scripts being rewritten.



With its support for rapid branching and merging, GIT is suitable for non-linear development projects. Each developer's repository records changes, which are copied from one repository to another as additional development branches. Repositories can be published over FTP, rsync, HTTP, ssh or the native GIT protocol. Existing CVS clients and IDE plugins can also access GIT repositories, thanks to the CVS server emulation.



Tests carried out by Mozilla have shown that GIT is at least twice as fast as alternative revision control systems. It is therefore highly recommended source control software for managing large projects efficiently.



History is stored with cryptographic authentication. The name of each revision depends on the complete development history leading up to it, so once a revision has been published, older versions cannot be altered without the change being noticed. Aborted operations leave unused objects behind; these usually take up only a small amount of space compared with the continually growing history, and that space can be reclaimed with 'git gc --prune', although doing so is regarded as slow.
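
The way a revision name can depend on everything that came before it can be illustrated with a toy hash chain. The following Python sketch is a simplification and not GIT's actual commit format: the function names and the way the fields are joined are assumptions made purely for illustration.

```python
import hashlib


def revision_id(parent_id: str, content: str) -> str:
    """Name a revision by hashing its content together with its parent's name.

    Hypothetical layout for illustration only; real GIT commit objects
    hold more fields (tree name, author, timestamp, log message).
    """
    data = f"parent {parent_id}\n{content}".encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def history_ids(contents: list[str]) -> list[str]:
    """Return the chain of revision names for a linear history."""
    ids = []
    parent = ""  # the first revision has no parent
    for content in contents:
        parent = revision_id(parent, content)
        ids.append(parent)
    return ids


original = history_ids(["first draft", "fix typo", "add feature"])
tampered = history_ids(["first draft (altered)", "fix typo", "add feature"])

# Altering the oldest revision changes the name of every later revision,
# even though the later contents are identical.
for old, new in zip(original, tampered):
    print(old[:10], new[:10], "same" if old == new else "different")
```

Because each name feeds into the next one, anyone holding the latest name can detect a change anywhere earlier in the chain.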



Earlier source code control systems worked on individual files, on the assumption that space could be saved by delta encoding similar versions. GIT, however, does not track relationships between file revisions at any level below the source code tree, which has several consequences. One is that with GIT it is easier to examine changes to a complete file tree than to an individual file. In some cases this is extremely beneficial, such as when you need to follow a source tree subdirectory together with an associated global header file.



Another result of GIT handling complete trees more easily than single files is that renames are recognized implicitly. Unlike CVS, which uses the file name to connect the history of a file's revisions, GIT does not use the file name to identify a file directly. Instead, it identifies renames while the snapshot history is being browsed rather than when a snapshot is created.



A third matter that occasionally concerns people is the GIT storage model. Each newly created object is stored as an individual file. To use space efficiently, these individual files are periodically combined into a collective format known as a pack. Because new history is always written as single files, repacking is required from time to time. The GIT system does this automatically on occasion; however, the 'git gc' command lets you run it manually whenever you choose. This is useful if you have recently worked on a large collection and want it stored efficiently as soon as possible, ahead of the automatic packing.
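
To make the "one object, one file" idea concrete, here is a minimal Python sketch of how a single unpacked object can be encoded and located on disk. It assumes the classic SHA-1 object format, in which the object is a short header plus the content, compressed with zlib and stored under a path derived from its name. The function names are hypothetical, and the sketch does not cover the pack format at all.

```python
import hashlib
import zlib


def encode_loose_object(obj_type: str, content: bytes) -> tuple[str, bytes]:
    """Return (object name, compressed bytes) for one unpacked object."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    raw = header + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def loose_object_path(object_name: str) -> str:
    """Path of the object's file, relative to the repository's GIT directory."""
    return f"objects/{object_name[:2]}/{object_name[2:]}"


def decode_loose_object(compressed: bytes) -> tuple[str, bytes]:
    """Reverse encode_loose_object and check the recorded size."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


name, stored = encode_loose_object("blob", b"hello world\n")
print(loose_object_path(name))
print(decode_loose_object(stored))
```

Every new object costs one small file in this scheme, which is why combining them into packs from time to time saves space.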



The merging strategies offered by GIT can be chosen manually or left to the default. They include resolve, recursive and octopus. Resolve is the conventional three-way merge algorithm. Recursive is the default when merging a single branch. It comes into play when the two branches have not just one common ancestor but several: it combines the common ancestors into a single reference point, resulting in fewer conflicts and mis-merges. Recursive also assists with the handling of renamed files. Octopus is the default strategy for merges of more than two heads.
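
The decision rule behind a three-way merge can be sketched in a few lines of Python. This is a deliberately simplified illustration that works on whole files rather than on lines within a file, and the `Tree` type and function name are assumptions, not part of GIT.

```python
# Hypothetical, simplified model: a tree maps each path to a content identifier.
Tree = dict[str, str]


def three_way_merge(base: Tree, ours: Tree, theirs: Tree) -> tuple[Tree, list[str]]:
    """Merge two trees using their common ancestor as the reference point.

    Returns the merged tree and the list of paths that conflict.
    """
    merged: Tree = {}
    conflicts: list[str] = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree (this includes both deleting the file)
        elif o == b:
            result = t  # only their side changed the file
        elif t == b:
            result = o  # only our side changed the file
        else:
            conflicts.append(path)  # both sides changed it in different ways
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.c": "v1", "b.c": "v1", "c.c": "v1"}
ours = {"a.c": "v2", "b.c": "v1", "c.c": "v2"}
theirs = {"a.c": "v1", "b.c": "v3", "c.c": "v4"}
print(three_way_merge(base, ours, theirs))
```

In this example 'a.c' takes our change, 'b.c' takes theirs, and 'c.c' is reported as a conflict because both sides changed it differently.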



There are two data structures in the GIT software. The first is an index, a cache that accumulates information about the working directory and the next revision to be committed. The second is an object database containing four types of item, namely blob, tree, commit and tag objects. A blob is the content of a file, without a name, timestamp or other metadata. A tree object is a list of file names, each with some type bits and the name of the blob or tree that holds the contents of that file, symbolic link or directory; it describes a snapshot of the source tree. A commit object links tree objects together into a history. Along with the name of a tree object, it includes a log message and timestamp, as well as the names of its parent commits. A tag object is a container that holds a reference to another object together with extra data.
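
How blobs and trees get their names can be shown with a short Python sketch. It assumes the classic SHA-1 object format, where an object's name is the hash of a small header followed by the object body. The helper names are hypothetical, and the tree encoding is simplified: entries are sorted by plain file name, which is sufficient for a flat directory of files.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    """Name an object by hashing '<type> <size>', a NUL byte, then the body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: list[tuple[str, str, str]]) -> bytes:
    """Encode tree entries given as (mode, file name, object name in hex).

    Each entry is '<mode> <file name>', a NUL byte, then the raw 20-byte
    name of the blob or tree it points to.
    """
    body = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1].encode("utf-8")):
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\x00"
        body += bytes.fromhex(hex_id)
    return body


# A blob holds only content; the file name lives in the tree that refers to it.
blob_id = object_id("blob", b"hello world\n")
tree_id = object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))
print("blob", blob_id)
print("tree", tree_id)
```

Two files with identical content share one blob, and renaming a file changes only the tree that lists it, not the blob.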



There are many questions that may come to mind when considering the use of GIT. If you wish to work on a branch other than the master branch, you need to fetch that branch and then create a local branch that tracks it. To access the tree at a specific tag, you can use 'git checkout tagname'. If your version of GIT predates the 1.5 release, however, you will need to create a temporary branch based on the tag. To be able to check it out again at a later date, you will then need to turn that temporary branch into a permanent one.



It is possible to tell GIT to ignore specific files. It is also possible to untrack a file using GIT. This keeps the file in your working directory but leaves it out of subsequent revisions. If you accidentally remove or damage a file, you can revert it either to the version in the index or to the version in the current commit.



If you encounter a bug in the middle of other essential work, you may wish to use 'git stash'. This saves your modifications temporarily, so that you can deal with the bug and then carry on where you left off. Alternatively, you can save the changes in a patch to apply later, or commit them to a temporary branch to merge later.



If you wish to use GIT to publish a repository, it is advised that you run the 'git update-server-info' command in that repository prior to mirroring and publication. The 'git show' command lets you view a file or directory as it was in a former revision; to do this, give the revision followed by a colon and the file name. With the 'git diff' command, you can compare a pair of files, or a file with its version in a commit. If you are attempting to set up a GIT server, you can use the gitosis tool.
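
The revision-colon-file form used with 'git show' can be sketched as a small Python helper. The function names, the example revision 'HEAD~2' and the file name 'README' are illustrative assumptions; the helper that actually runs the command expects GIT to be installed and a repository to exist at the given location.

```python
import subprocess


def git_show_args(revision: str, path: str) -> list[str]:
    """Build the argument list for viewing `path` as it was in `revision`.

    The path is given relative to the top of the repository.
    """
    return ["git", "show", f"{revision}:{path}"]


def show_file(revision: str, path: str, repo_dir: str = ".") -> str:
    """Run the command inside `repo_dir` and return the old content as text."""
    result = subprocess.run(
        git_show_args(revision, path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


# Only builds and prints the command line; nothing is executed here.
print(" ".join(git_show_args("HEAD~2", "README")))
```

Keeping the argument list in its own function makes it easy to inspect the exact command before running it.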



Many other tools and commands make using the GIT fast version control system easier than you might expect. Nearly every source control system has a few commonly reported problems, most of which can be avoided or overcome by simple procedures. Messages such as 'cannot merge' and 'protocol error', along with fatal error prompts, fall into this category. Simple instructions for overcoming these difficulties can be found in an online FAQ.



You may wish to import from a revision control system other than GIT, or to exchange data with colleagues who use an alternative system. Most alternative systems allow this. Importing from CVS is straightforward if you have version 2.1 or later, via the git output mode, and other processes enable import from earlier versions. Importing from Mercurial or Perforce is likewise possible. Export for use with svn, arch, baz and tla is also available. Some other software configuration management programs also provide their own means for GIT to import from them.



In most cases where material is shared, it is advisable to keep separate private and public repositories. If you work this way, you can avoid the continual password requests by using an SSH key together with an agent.



You may wish to have work on an individual repository completed by a team rather than a single person. This can be arranged, although the team's changes may not be applied automatically. Alternatively, you may prefer to exchange patches by mail, which can be done in both directions: use the 'git format-patch' command for outgoing patches and 'git am' for incoming ones.



Apart from the much reported speed that the GIT source control software offers, there are several other advantages to this system. The fact that it is free software entices many to try it, and its usability on projects both large and small makes it a popular choice for any assignment. Many other systems focus on small to medium sized ventures, whereas GIT can handle a project of any size. The decentralized design of GIT is another factor that many weigh in its favor when deciding which version control system to use.



In order to start using GIT, or to change over from a previous or alternative release management system, you can either install the current release as a complete package from your Linux distribution, or take the most recent stable snapshot and compile it manually. The latter is done by downloading a tarball that contains the GIT source code snapshot.



