Software-Configuration.com
Published by PAC Enterprises LLC
Software Configuration
for an open source world
GIT- Fast Version Control System
configuration
management system
Posted by Rob Castellow
The productive partnership of Junio Hamano and Linus Torvalds created a
distributed version control system that the world knows as GIT.
Originally invented by Linus Torvalds, it remains under the continuing
development and maintenance of Junio Hamano. The GIT source control
software works with many operating systems, including Linux, Darwin,
BSD, Solaris and Windows. It functions without the use of a
centralized server.
Software configuration management programs track the changes made to
programs and manage the flow of updates. Several other software
configuration management programs are available; however, GIT
reportedly runs faster, especially on systems such as Linux, which was
also created by Linus Torvalds. Torvalds, when discussing his product,
admits that GIT does not try to do a great variety of things; instead
it handles its specific task both fast and well. GIT is reported to
handle a selection from more than 17000 files in less than a second.
In comparison to alternative source management systems, GIT does not
track individual files, but works on the collection as a whole. To
track an individual file, you first select the collection that contains
the file you require, and then ignore the other files in the group. GIT
tracks specific file data rather than covering every possibility.
Therefore, should you need to track additional information such as file
metadata, or to manage home directories, GIT will be limiting. File
names are treated as sequences of bytes, and are therefore not
converted between character encodings.
The name GIT may be read as 'global information tracker'; however,
creator Torvalds is known to have joked that the word 'git' is also
English slang for a stupid or dull person. He applies this to his
product, explaining that the GIT source control software does not try
to cover many avenues.
GIT is source control software that is available for free. Some
software projects use GIT for revision control. Examples include One
Laptop Per Child, the Ruby on Rails web framework, the Android mobile
platform and the Linux kernel.
There have been several versions since the original first appeared,
and the available features therefore depend on which version you have.
For example, GIT 1.5 and later offer CRLF and LF line-ending conversion
for different platforms. More recent versions additionally let you
specify content filters that are applied when files are checked in or
checked out.
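The line-ending conversion mentioned above can be pictured with a short
Python sketch. It is only an illustration of the idea, not GIT's own
code: the function names to_repository and to_working_tree are
hypothetical, and the sketch assumes that content is stored with LF
endings and converted to the platform's convention on checkout.

```python
def to_repository(data: bytes) -> bytes:
    """Normalize CRLF line endings to LF, as on check-in (assumed policy)."""
    return data.replace(b"\r\n", b"\n")


def to_working_tree(data: bytes, eol: bytes = b"\r\n") -> bytes:
    """Convert line endings to the platform's convention, as on checkout."""
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", eol)


# A file edited on Windows, stored with LF, then checked out again.
windows_text = b"first line\r\nsecond line\r\n"
stored = to_repository(windows_text)
restored = to_working_tree(stored)
print(stored)
print(restored)
```

Passing eol=b"\n" to to_working_tree models a checkout on a platform
that already uses LF, in which case the stored bytes come back
unchanged.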
The reason GIT originally worked better with Linux than with other
operating systems was that the Linux kernel contains such a large
number of files. Handling them was the main reason that the GIT source
control software was originally created. Other operating systems do
not have the same software configuration management requirements,
because they do not deal with such an immense number of files. They
tend to have a larger core, and GIT does not work as well when it is
used in a centralized fashion.
Inspired by the earlier programs Monotone and BitKeeper, GIT first came
about from the need for a low-level engine on which front ends could be
written. Today, GIT is a complete package for those seeking a revision
control system. It was designed as a set of programs, with shell
scripts providing wrappers around them; the need to run GIT on
operating systems such as Microsoft Windows led to the rewriting of
many of these scripts.
Supporting rapid branching and merging, GIT is suitable for non-linear
development projects. Changes are held in more than one repository and
are copied from one repository to another, where they arrive as
additional development branches. Repositories can be published via FTP,
rsync, HTTP, ssh or the native GIT protocol. Those with CVS clients or
IDE plugins can also use GIT repositories, thanks to the CVS server
emulation.
Tests carried out by Mozilla have reportedly shown that GIT is at least
twice as fast as alternative revision control systems. It is therefore
highly recommended source control software for managing large projects
efficiently.
History is stored with cryptographic authentication. The name of each
revision depends on the complete history leading up to it, so once a
revision has been published, earlier versions cannot be altered without
the change being noticed. Aborted operations leave unused data behind;
this is usually only a small amount of space compared with the
continually growing history, and it can be reclaimed with
git-gc --prune, although doing so is regarded as slow.
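The way a revision name can depend on everything that came before it
can be sketched in a few lines of Python. This is a simplified model
and not GIT's real object format: the function names commit_id and
build_history are hypothetical, and the fields being hashed are chosen
only for illustration.

```python
import hashlib


def commit_id(parent_id: str, snapshot: bytes, message: str) -> str:
    """Name a revision by hashing its parent's name, content and message."""
    h = hashlib.sha1()
    h.update(parent_id.encode())
    h.update(hashlib.sha1(snapshot).hexdigest().encode())
    h.update(message.encode())
    return h.hexdigest()


def build_history(snapshots):
    """Return the list of revision names for a linear history."""
    ids = []
    parent = ""  # the first revision has no parent
    for snapshot, message in snapshots:
        parent = commit_id(parent, snapshot, message)
        ids.append(parent)
    return ids


original = build_history([(b"v1", "first"), (b"v2", "second"), (b"v3", "third")])
tampered = build_history([(b"v1 altered", "first"), (b"v2", "second"), (b"v3", "third")])

# Altering the oldest snapshot changes the name of every later revision.
for before, after in zip(original, tampered):
    print(before[:12], after[:12], before == after)
```

Because each name feeds into the next, anyone holding the latest
original name can tell that the altered history is not the one they
were given.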
Earlier source code control systems worked on individual files, on the
assumption that space could be saved by delta encoding similar
versions. GIT, however, does not record relationships between file
revisions at any level below the source code tree, and this has several
consequences. One is that it is somewhat easier to examine the change
history of the whole project than that of an individual file, because
GIT must walk the global history and check whether each change touched
that file. In some cases this is extremely beneficial, because a single
history for any set of files can be produced just as efficiently, such
as when you need to observe a source tree subdirectory together with an
associated global header file.
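As a rough illustration of looking up history over whole-tree
snapshots, the Python sketch below walks a list of snapshots and
reports which revisions changed any path in a chosen set. The function
name history_for_paths, the revision names and the file paths are all
hypothetical, and real GIT compares tree objects rather than Python
dictionaries.

```python
def history_for_paths(commits, paths):
    """Return the names of commits that changed any of the given paths.

    commits is a list of (name, snapshot) pairs, oldest first, where each
    snapshot maps a path to that file's content.
    """
    touched = []
    previous = {}
    for name, snapshot in commits:
        # A path counts as changed if its content differs from the
        # previous snapshot, including being added or removed.
        if any(snapshot.get(p) != previous.get(p) for p in paths):
            touched.append(name)
        previous = snapshot
    return touched


commits = [
    ("c1", {"src/a.c": "v1", "include/global.h": "v1", "README": "v1"}),
    ("c2", {"src/a.c": "v1", "include/global.h": "v1", "README": "v2"}),
    ("c3", {"src/a.c": "v1", "include/global.h": "v2", "README": "v2"}),
    ("c4", {"src/a.c": "v2", "include/global.h": "v2", "README": "v2"}),
]
selected = history_for_paths(commits, ["src/a.c", "include/global.h"])
readme_only = history_for_paths(commits, ["README"])
print(selected)
print(readme_only)
```

The same walk serves one file or many, which is why a source file and
a global header can share a single history at no extra cost.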
Another result of GIT working on complete trees rather than single
files is the implicit handling of renames. Unlike CVS, which uses the
file name to connect the history of a file's revisions, GIT does not
record renames explicitly. Instead, renames are identified while
browsing the history of snapshots rather than when a snapshot is
created.
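One way to see how a rename can be recognized after the fact is to
compare two snapshots by content. The Python sketch below covers only
the simplest case, where the content is identical under the old and new
names; it is an illustration under that assumption, not GIT's actual
algorithm, and the function names and file names are hypothetical.

```python
import hashlib


def blob_id(content: bytes) -> str:
    # Simplified content identifier; GIT's real object names also cover a header.
    return hashlib.sha1(content).hexdigest()


def detect_exact_renames(old: dict, new: dict) -> list:
    """Pair each removed path with an added path that has identical content."""
    old_ids = {path: blob_id(data) for path, data in old.items()}
    new_ids = {path: blob_id(data) for path, data in new.items()}
    removed = {p: i for p, i in old_ids.items() if p not in new_ids}
    added = {p: i for p, i in new_ids.items() if p not in old_ids}
    renames = []
    for old_path, old_id in sorted(removed.items()):
        for new_path, new_id in sorted(added.items()):
            if old_id == new_id:
                renames.append((old_path, new_path))
                del added[new_path]  # each added path is matched at most once
                break
    return renames


before = {
    "util.c": b"int helper(void);\n",
    "main.c": b"int main(void) { return 0; }\n",
}
after = {
    "helpers.c": b"int helper(void);\n",
    "main.c": b"int main(void) { return 0; }\n",
}
renames = detect_exact_renames(before, after)
print(renames)
```

Nothing about the rename was stored in either snapshot; it is inferred
only when the two snapshots are compared.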
A third matter that occasionally concerns people is the GIT storage
model. Each newly created object is stored in an individual file. To
use space efficiently, these individual files are combined into a
collective format known as a pack. Because new history is always
written as single files, packing is required periodically. GIT does
this automatically on occasion. However, the git-gc command lets you
run the process manually whenever you choose. This is useful if you
have recently created a large amount of history and want it stored
efficiently as soon as possible, without waiting for the automatic
packing.
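The saving from packing can be illustrated with a toy model in Python.
This is a sketch under loud assumptions: an in-memory dictionary stands
in for the individual files, the pack is simply a compressed
concatenation, and the names write_loose, pack and unpack are
hypothetical. GIT's real pack format is different and more
sophisticated.

```python
import hashlib
import zlib


def object_id(content: bytes) -> str:
    # Simplified: real GIT hashes a typed header together with the content.
    return hashlib.sha1(content).hexdigest()


def write_loose(store: dict, content: bytes) -> str:
    """Store one object on its own, compressed individually."""
    oid = object_id(content)
    store[oid] = zlib.compress(content)
    return oid


def pack(store: dict) -> bytes:
    """Combine every stored object into one compressed, length-prefixed blob."""
    parts = []
    for oid in sorted(store):
        content = zlib.decompress(store[oid])
        parts.append(len(content).to_bytes(4, "big") + content)
    return zlib.compress(b"".join(parts))


def unpack(packed: bytes) -> dict:
    """Recover the objects from a pack, keyed by their identifiers."""
    raw = zlib.decompress(packed)
    objects, pos = {}, 0
    while pos < len(raw):
        size = int.from_bytes(raw[pos:pos + 4], "big")
        content = raw[pos + 4:pos + 4 + size]
        objects[object_id(content)] = content
        pos += 4 + size
    return objects


# Thirty revisions of a file that differ only in their last line.
base = b"".join(b"line %d of the file\n" % i for i in range(20))
loose = {}
for revision in range(30):
    write_loose(loose, base + b"revision %d\n" % revision)

packed = pack(loose)
loose_size = sum(len(data) for data in loose.values())
pack_size = len(packed)
print(loose_size, pack_size)
```

Similar revisions compress far better together than apart, which is
the reason periodic packing is worthwhile.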
The merging strategy GIT uses can be chosen manually or left to the
default. The strategies include resolve, recursive and octopus. Resolve
is the conventional 3-way merge algorithm. Recursive is the default
when merging a single branch. It handles the case where there is not
just a single common ancestor, but several. It first merges the common
ancestors into one reference tree, which results in fewer conflicts and
mis-merges. Recursive also helps with the handling of renamed files.
Octopus is the default strategy for merges with more than two heads.
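The idea behind a 3-way merge can be shown with a small Python sketch
that works on whole files: each side is compared with the common
ancestor, and a change made on only one side is accepted. This is a
simplification for illustration, since a real merge also combines
changes within a single file; the function name three_way_merge and the
file names are hypothetical.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge two snapshots against their common ancestor, file by file.

    Each snapshot maps a path to its content; a missing path means the
    file does not exist in that snapshot.
    """
    merged = {}
    conflicts = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only their side changed the file
        elif t == b:
            result = o          # only our side changed the file
        else:
            conflicts.append(path)  # both sides changed it differently
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "ours"}
theirs = {"a.txt": "1", "b.txt": "1", "c.txt": "theirs", "d.txt": "new"}
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)
print(conflicts)
```

Without the common ancestor there would be no way to tell which side
had changed a.txt, which is why the reference point matters so much to
the recursive strategy.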
There are two data structures in the GIT software. The first is the
index, a cache that accumulates information about the working tree. The
second is an object database containing four types of object, namely
blob, tree, commit and tag objects. A blob is the content of a file,
without a name, timestamp or other metadata. A tree object is a list of
file names, each with type bits and the name of a blob, symbolic link
or directory's contents; it represents a snapshot of the source tree. A
commit object links tree objects together into a history. Along with
the name of a tree object, it includes a log message and timestamp, as
well as the names of its parent commits. A tag object is a container
that holds a reference to another object along with extra data.
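To make the object types more concrete, the Python sketch below names a
blob, a tree and a commit by hashing a short header followed by the
object's body with SHA-1, which is how GIT has traditionally named its
objects. The helper names, the author line and the file name are
hypothetical, and the tree helper is simplified to a flat list of
files.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash '<kind> <size>' plus a NUL byte plus the body with SHA-1."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """Build a tree body from (mode, name, hex_id) entries, sorted by name.

    Simplified: handles a flat list of files with ASCII names only.
    """
    body = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return body


def commit_body(tree_id, parent_ids, author, message) -> bytes:
    """Build a commit body naming its tree, its parents and a log message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += [f"author {author}", f"committer {author}", "", message]
    return "\n".join(lines).encode() + b"\n"


# A blob holds content only: no file name, no timestamp.
blob_id = git_object_id("blob", b"hello world\n")
# blob_id is 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

# The tree gives the blob a name and type bits.
tree_id = git_object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))

# The commit ties the tree to a log message and, here, no parents.
author = "A U Thor <author@example.com> 1200000000 +0000"  # hypothetical
commit_id = git_object_id("commit", commit_body(tree_id, [], author, "first"))
print(blob_id, tree_id, commit_id)
```

Since the tree contains the blob's name and the commit contains the
tree's name, changing a single byte of file content changes all three
identifiers.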
Many questions may come to mind when considering the use of GIT. If you
wish to work on a branch other than the master branch, you need to
fetch that branch and then create a local branch that tracks it. To
access the tree at a specific tag, you can use 'git checkout tagname'.
If your GIT version predates the 1.5 release, however, you will need to
create a temporary branch based on the tag. To make sure that you can
check it out again at a future date, you will then need to make a
further change to turn the temporary branch into a permanent one.
It is possible to have the GIT software configuration management system
ignore specific files. It is also possible to untrack a file using GIT.
This keeps the file in place, but stops it from being recorded in
subsequent revisions. If you accidentally remove or damage a file, you
can restore either the version in the index or the version in the
current commit.
If you encounter a bug in the middle of other essential work, you may
wish to use 'git stash'. This saves your modifications temporarily
while you deal with the bug, after which you can continue where you
left off. Alternatives are to save the changes in a patch to apply
later, or to put them on a temporary branch to merge later.
If you wish to use GIT to publish a repository, it is advised that you
run the 'git update-server-info' command in that repository before
mirroring and publishing it. The 'git show' command lets you view a
file or directory as it was in a former revision. To do this, give the
revision followed by a colon and the file name. With the 'git diff'
command, you can compare a file between a pair of commits. If you are
attempting to set up a git server, you can use the gitosis tool.
Many other tools and commands make using the GIT fast version control
system easier than you might expect. Nearly every source control
software option has a few commonly reported problems, most of which can
be avoided or overcome by simple procedures. Messages such as
'cannot merge', 'protocol error' and fatal error prompts fall into this
category. Simple instructions for overcoming these difficulties can be
found in a FAQ online.
You may wish to import from a previous revision control system other
than GIT, or to exchange data with colleagues who use an alternative
system. Most alternative systems allow this. Importing from CVS can be
done easily via the git output mode if you have version 2.1 onward, and
other processes enable import from earlier versions. Importing from
Mercurial or Perforce is likewise possible. Export for use with svn,
arch, baz and tla is also available. Some other software configuration
management programs can also be imported into GIT.
In most cases where sharing of material is involved, separate
repositories for private and public work are advisable. If you set
things up this way, it is suggested that you avoid the continual
password requests by using an SSH key together with an agent.
You may wish to have work on an individual repository done by a team
rather than a single person. This can be arranged, but their changes
may not be applied automatically. You may prefer to exchange patches by
mail. This can be done in both directions. For outgoing patches, you
can use the 'git format-patch' command, whereas for incoming patches
you use 'git am'.
Besides the much reported speed that the GIT source control software
offers, there are several other advantages. The fact that it is free
software entices many to try it, and its usability on projects both
large and small makes it a popular choice for any assignment. Many
other systems focus on small to medium sized ventures, whereas GIT can
handle a project of any size. The decentralized nature of GIT is
another factor that many take into account when deciding which version
control system to use.
In order to start using GIT, or to change from a previous or
alternative release management system, you can either obtain the
current release from your Linux distribution, which provides it as a
complete package, or take the most recent stable snapshot and compile
it manually. The latter can be done by downloading a tarball that
contains a snapshot of the GIT source code.