[Magellan-users] Initial clarifications
From: Tryggvi Björgvinsson
Subject: [Magellan-users] Initial clarifications
Date: Tue, 18 Mar 2008 17:50:16 -1000
User-agent: Thunderbird 2.0.0.12 (X11/20080227)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi everybody,
(Warning! Long email, take your time)
I just wanted to inform the list members that the first two versions of
the source code have been uploaded to the source code repository. So
magellan has now officially become free and open source software, meant
for global collaboration. Magellan has been released under the GNU
General Public License version 3 or later.
There are of course many modifications needed before it can be
considered stable, working software, but it has basic functionality at
the moment. The tasks that must be completed before it can be used
without much tweaking have been identified and are listed in a
dedicated TODO file.
Before continuing, some clarifications are perhaps necessary. For those
who are not familiar with common free and open source software
terminology, here are some explanations.
Free Software
- -------------
Free software is software which gives its users the following four freedoms:
0. The freedom to use the software for any purpose
1. The freedom to study the software and adapt it to one's needs
2. The freedom to distribute the software, to help one's neighbours
3. The freedom to improve the software and distribute the modifications
so that the whole community benefits
For software to become free software one must release it under a
specific license. The best known, used by around 70% of free software
projects, is the GNU General Public License, or simply the GPL.
Open Source Software
- --------------------
Open source software builds upon free software. The difference is that
open source software does not focus on the four user freedoms but on
the pragmatic value of the software, by emphasising an effective
development method. This method builds upon the above freedoms but
expands on the idea that users should be able to modify the software
and distribute the modifications. One could say that open source
software defines the public as the user and releases the source code on
a public site, accessible to everybody, giving users the possibility to
help with the development.
Since there is a slight difference between the two types of software
(one ideological and the other pragmatic), different licenses can apply
to each. Most often, however, a free software license is also an open
source software license; the GPL, for example, is both. Software
released under a free and open source software license is generally
referred to as free and open source software, abbreviated FOSS.
Source repository and version control
- -------------------------------------
These terms are not specific to free and open source software; they
apply to software development in general. However, source repositories
and version control are central to FOSS development. A source
repository is a place where the source code of the software is kept and
can be accessed by developers. The source code of free and open source
software is kept in public repositories accessible to everybody.
Version control goes hand in hand with source repositories and the
management of source code. Version control systems keep track of all
updates (often called "patches") to the source code. That way, it is
easy to revert to a working version of the software when a patch
accidentally renders the software (or a part of it) unusable. Version
control has many other benefits which are not important at this
moment, but it is good to know the two main types of version control
systems: central version control and distributed version control.
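To make the idea of reverting to a working version concrete, here is a
toy sketch in Python. It is only an illustration: the History class and
its method names are made up for this example, and real version control
systems record far more (authors, dates, differences between files).

```python
# Toy model of a version history (hypothetical, for illustration only).
class History:
    def __init__(self, initial_text):
        # Every committed version is kept, oldest first.
        self.versions = [initial_text]

    def commit(self, new_text):
        """Record a new version and return its number."""
        self.versions.append(new_text)
        return len(self.versions) - 1

    def checkout(self, number):
        """Return the text exactly as it was at the given version."""
        return self.versions[number]


history = History("area = width * height")
good = history.commit("area = width * height  # in square metres")
bad = history.commit("area = width + height  # accidental breakage")

# The broken patch stays on record, but the last working version
# can still be retrieved at any time.
restored = history.checkout(good)
print(restored)
```

Because nothing is ever overwritten, a bad patch never destroys the
earlier working state.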
Central version control manages updates to a software repository via
a central location. Each developer must "check out" the code from a
central repository to obtain the most recent version of the software.
All updates are then submitted (or committed) to that same central
repository. Distributed version control is another school of thought.
The developers still check out the latest version but, instead of only
having a copy of the code whose changes are committed back to a central
repository, the checked-out version is a repository of its own. So
different versions of the software are distributed across many
repositories, which can then be merged back into the main repository.
Magellan
- --------
Having explained the underlying concepts, we can (finally) start
discussing magellan.
Magellan is free and open source software released, as stated above,
under the GNU General Public License version 3 or later. By this we
intend for magellan to be a collaborative software development project
where users are invited to submit patches to the source code and help
with the development of magellan.
It is important to understand that it takes a lot of work and time
before a project gains a significant core of users who are willing to
help with development; for a project like magellan, it might take
years. However, development of magellan will from the beginning put
effort into making it easy for everybody, especially scientists, to
start improving magellan. This means that every decision must be
carefully thought through, from technical decisions such as the
programming language to non-technical aspects such as documentation. The
rest of this e-mail will go through the major decisions made for magellan.
Programming language
- --------------------
The programming language chosen for magellan is Python. It can be argued
that this is not the best language to use since it is not normally
taught (especially to scientists) and is not as fast as some other
languages. However, the arguments for Python carry more weight for a
project like magellan. There are many reasons why Python was chosen,
but the major influences on the decision are:
* It is an easy to learn and use programming language, so everybody
should be able to quickly learn how to program in Python.
* It pushes the developer to program "beautifully", that is, to
structure the code in a human-readable way and avoid obfuscation.
Indentation is part of the syntax, so code is laid out consistently and
users will be quick to read the code and start working on it.
* It is dynamically typed, so one does not have to declare the types of
variables (integers, floats, strings, etc.).
* It is often said that Python comes with batteries included. This means
that the standard library (i.e. functions shipped with Python and ready
to be called by programs) is quite extensive. Having such an extensive
library makes the coding easier and the code more understandable.
* It might not be the fastest programming language around but it is fast
enough. If for some reason one wants to optimize a specific feature for
performance, Python provides a mechanism for extensions written in C or
C++, giving the possibility of rewriting certain parts for performance.
* It is multiplatform, meaning that it can run on almost any operating
system (one must of course take some care in the programming phase).
This means that Python programs like magellan will be able to run on
GNU/Linux, Windows, Mac OS X, and many other operating systems.
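As a small illustration of two of the points above (dynamic typing and
"batteries included"), consider the following sketch. It assumes a
current Python 3 interpreter, and the variable names are made up for
this example; it is not code from magellan.

```python
import math
import statistics

# Dynamic typing: no type declarations are needed, and the same
# name may hold an integer first and a float afterwards.
measurement = 3
measurement = measurement / 2        # now a float: 1.5
measurements = [measurement, 2.5, 5.0]

# "Batteries included": the standard library already provides
# common building blocks such as a mean and a square root.
average = statistics.mean(measurements)
root = math.sqrt(16)

print(average, root)
```

Nothing beyond a standard Python installation is needed to run this.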
Project management
- ------------------
The management of the project is currently run through Savannah which is
"a central point for development, distribution and maintenance of Free
Software that runs on free operating systems." The project site on
Savannah is:
http://savannah.nongnu.org/projects/magellan
There you can find a description of magellan, the mailing list
(currently only magellan-users, to which this mail is posted), the bug
tracker (for managing defects in magellan, because all software
contains defects), repositories (for both the website and the source
code), and other things.
The source code repository uses a distributed version control system
called "git". The reason for choosing distributed version control is to
allow different research institutes to create their own in-house
repositories where they can adapt magellan to specific in-house
research which might not follow the direction of magellan. Their
modifications could then relatively easily be merged into the official
magellan repository if there is general interest among magellan users.
The website (documentation) repository is centrally controlled using a
version control system called "CVS". There is not as strong a reason for
different versions of documentation to be in a distributed version
control system.
Program structure
- -----------------
Currently the magellan directory, which one can check out from the
source code repository, is structured in the following way.
The main directory, magellan, includes the following:
AUTHORS
~ File containing the names and emails of all of magellan's authors
HACKING
~ File describing how to work on magellan. In FOSS communities the term
~ hacking refers to the joyful act of creating wonderful programs, while
~ cracking is used to describe the act of breaking into computers
INSTALL
~ File explaining how to install magellan
LICENSE
~ File containing the GNU General Public License used for magellan
MAINTAINERS
~ File containing the names of the active maintainers of magellan
MANIFEST
~ File used when packaging different versions of magellan
MANIFEST.in
~ File used when packaging different versions of magellan
README
~ File containing basic descriptions
setup.py
~ File used for building, packaging and installing magellan
src
~ Directory containing the source code of magellan
TODO
~ File containing the list of tasks to do
The magellan/src directory contains one file and one folder:
magellan
~ The main program file, the one used to execute magellan
Magellan
~ The directory containing the module files used by magellan.
~ This name will be changed in upcoming versions since it will cause
~ problems on operating systems with a case-insensitive file system,
~ e.g. Microsoft Windows. The name will have to be descriptive, so perhaps
~ python-magellan will do?
The magellan/src/Magellan module directory contains a few files and one
directory. All of the files in the directory ending with .py have a
corresponding file with the .pyc suffix. The .pyc files are just
pre-compiled versions of the .py files, which Python creates so that
the modules load faster. So one shouldn't worry about the .pyc files.
The files and directory of interest in magellan/src/Magellan are:
calc.py
~ The module which performs all core calculations for magellan. This is the
~ heavyweight module where most of the work will take place. Uses NumPy for
~ the computations. NumPy is a Python module which offers array computing
~ similar to MATLAB and can be used for heavy computations
data
~ Directory containing data used by magellan. Currently it only contains
~ one file called 'candekent.dat' which contains the Cande & Kent (1995)
~ reversed time-scale.
data.py
~ The module which gathers data from parameter files or configuration
~ files.
__init__.py
~ A file which Python requires to be in the Magellan directory, so that it
~ will be seen as a module package.
plot.py
~ The module which plots the data graphically and presents it to the user.
~ Uses matplotlib for plotting. Matplotlib is a Python module which gives
~ Python developers syntax similar to MATLAB for plotting different graphs.
This is all there is to the magellan directory structure.
Flow of magellan
- ----------------
The flow of magellan is very simple. Calling the main program file,
magellan, with specific parameters causes it to read data using the
Magellan/data.py module. The output of the data gathering is sent to
Magellan/calc.py, which performs some computations and returns
plottable data structures. These data structures are then sent to
Magellan/plot.py, which plots them in a readable form. So basically the
flow is:
magellan ->
~ Magellan/data.py
~ <-
~ ->
~ Magellan/calc.py
~ <-
~ ->
~ Magellan/plot.py
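The flow above can be sketched in a few lines of Python. This is not
the actual magellan code: the function names read_data, calculate and
plot are hypothetical stand-ins for the three modules, the doubling is
a placeholder computation, and the "plot" only produces text so that
the sketch runs without matplotlib.

```python
# Hypothetical stand-ins for Magellan/data.py, Magellan/calc.py and
# Magellan/plot.py; all names here are illustrative only.

def read_data(parameters):
    """Data step: gather input values (here simply taken from a dict)."""
    return parameters["values"]


def calculate(values):
    """Calculation step: turn raw values into plottable (x, y) pairs."""
    return [(index, value * 2) for index, value in enumerate(values)]


def plot(points):
    """Plot step: a text stand-in for the graphical output."""
    return ["x=%d y=%d" % (x, y) for x, y in points]


def main(parameters):
    """Main program: pass the output of each step on to the next."""
    values = read_data(parameters)
    points = calculate(values)
    return plot(points)


for line in main({"values": [1, 2, 3]}):
    print(line)
```

The main program only hands results from one step to the next, so each
step can be changed or tested on its own.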
Development of magellan
- -----------------------
Every computation in Magellan/calc.py must be backed up by theoretical
computations, described in documentation which will be available through
the Magellan website. The theoretical computations should be followed by
a computer algorithm (described using an easy to understand
semi-programming language called pseudo-code). In the source code each
computation will point to the corresponding file or chapter in the
documentation which explains the theoretical computations.
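One possible way for a computation to point to its theory is through
its docstring, as in the sketch below. The function, the chapter name
and the pseudo-code are all invented for illustration; magellan's real
computations and documentation layout may look quite different.

```python
def mean_value(values):
    """Return the arithmetic mean of a sequence of numbers.

    Theory: see the chapter "Mean value" in the documentation
    (a hypothetical chapter name, used here for illustration).

    Pseudo-code:
        total <- sum of all values
        return total divided by the number of values
    """
    total = sum(values)
    return total / len(values)


print(mean_value([2.0, 4.0, 6.0]))
```

A reader of the source then knows where to find the theory without
searching the documentation.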
Providing users with the theory, algorithm explanation, and a working
implementation should make it easier for users to start contributing.
Scientists who are perhaps not good programmers or do not know Python
can still contribute by submitting theory, which eager algorithm
designers or programmers can then follow and implement. Therefore, lack
of programming skills should not be a valid excuse for not contributing
to magellan.
Furthermore, each and every file in magellan must be commented and easy
to read. Commenting the code means that hard to understand code is
explained so that new users will never have a hard time understanding
what is being done. Variable names must be descriptive, so variable
names like a or b should be avoided. In the current version of magellan
there are variable names like Jx and P, taken from the corresponding
theoretical computations, but they must be replaced to make the code
easier for others to read.
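The effect of descriptive names can be shown with a generic example.
The formula below (distance = speed * time) is unrelated to magellan
and was chosen only because it is widely known; it does not represent
what Jx or P stand for.

```python
# Hard to read: the reader must look up what v, t and d stand for.
def f(v, t):
    d = v * t
    return d


# Easier to read: the names carry the meaning, and a comment
# records the symbols used in the (hypothetical) theory text.
def distance_travelled(speed, duration):
    # In the theory: d = v * t
    distance = speed * duration
    return distance


print(distance_travelled(speed=5.0, duration=2.0))
```

Both functions compute the same value; only the second can be
understood without consulting the theory first.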
Well this long email should be a good start to describe the intentions
and structure of the magellan project. The above text will of course be
put into the magellan documentation where it will be more accessible
than on a mailing list, but until then this email will have to do.
/Tryggvi
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFH4I14TfUwC3N5Fj0RAtWXAJ4qkbKr4SmY9U16nKQK+cbdMt4zGACdFX/b
lzlxvobD0GTrYKrweMjMJpQ=
=eNN0
-----END PGP SIGNATURE-----