- New wrapper module `Haplostats`. This wraps a portion of the `haplo.stats` R package for haplotype estimation. [Implementation is in alpha phase - still working on this.]
- `swig` (Simplified Wrapper and Interface Generator): this is packaged on all platforms, and is either installed during the `build_wheels.yml` GitHub Action by the platform's package manager, `RPM` (for Linux) or `brew` (MacOS), or is part of the default runner image (Windows).
- `gsl` (GNU Scientific Library): a library used by some of the C extensions that needs to be available during compilation, specifically:
  - Linux: the `gsl-devel` CentOS package is installed at build time and the dynamic library is distributed with the generated wheel.
  - MacOS: the `gsl` package is installed via Homebrew; the dynamic library is also distributed with the wheel.
  - Windows: on X64 (AMD64) the `gsl-msvc14-x64` package is installed via the NuGet package repository; this includes a static version that is compiled into the final extension. Since the NuGet repository doesn't have an ARM64 version of `gsl`, we build a `.nupkg` from source using the `nuget_gsl_arm64_package.yml` workflow. Triggering this workflow will build the package and also copy the `.nupkg` into the repo in the `vendor-binaries` folder so it is available at build time. (A sketch of how an extension might link against GSL appears after this list.)
- `numpy` (NumPy)
- `lxml` (Python bindings to libxml2/libxslt)
- `pytest` (Python test framework)
- Note that within the `.i` SWIG interface files, you need to include both the function prototypes (inside the `%{ ... %}` header section, for the C compiler) and the corresponding declarations for SWIG to wrap, so the function signatures are duplicated; see this StackOverflow post.
- To install MacPorts via the command line you can run the following (substituting the current link):

  curl -L 'https://github.com/macports/macports-base/releases/download/v2.4.1/MacPorts-2.4.1-10.12-Sierra.pkg' > MacPorts-2.4.1-10.12-Sierra.pkg
  sudo installer -pkg MacPorts-2.4.1-10.12-Sierra.pkg -target /
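To make the GSL requirement above concrete, here is a minimal, hypothetical setuptools sketch (not PyPop's actual build configuration; all module and file names are placeholders) showing how a SWIG-generated C extension could be linked against GSL once the headers and libraries are installed as described above:

```python
# Hypothetical sketch only: illustrates linking a SWIG-based C extension
# against GSL; the names below are placeholders, not PyPop's real modules.
from setuptools import setup, Extension

example_module = Extension(
    "_example",                        # placeholder extension name
    sources=["example.i",              # SWIG interface file (generates the wrapper)
             "example.c"],             # underlying C implementation
    libraries=["gsl", "gslcblas"],     # GSL and its CBLAS backend must be on the linker path
)

setup(name="example-gsl-extension", ext_modules=[example_module])
```

On Linux and MacOS the shared GSL library found this way is then bundled into the wheel (typically via tools like `auditwheel` or `delocate`), while the Windows NuGet package provides a static library that is compiled directly into the extension, as noted above.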
- name: Download GSL artifact on Windows
# no pre-compiled Windows ARM64 version of GNU Scientific Library (GSL)
# so we install our own build
# only runs on Windows when cross-compiling for ARM64
if: false
#if: runner.os == 'Windows' && contains(matrix.only, 'win_arm64')
env:
GH_TOKEN: ${{ github.token }}
shell: bash # ensure shell is bash for compatibility
run: |
set -e
echo "Attempting to download artifact for GSL ARM64 package..."
# get the latest workflow run ID for 'Create GSL ARM64 GitHub Package'
RUN_ID=$(gh run list --workflow "Create GSL ARM64 GitHub Package" --json databaseId -q ".[0].databaseId" || echo "")
GSL_PACKAGE_NAME=gsl-msvc14-arm64
GSL_PACKAGE_FILE=${GSL_PACKAGE_NAME}.2.3.0.2779.nupkg
NUGET_PACKAGE_DIR=nuget-packages
# attempt to download artifact if RUN_ID exists
if [ -n "$RUN_ID" ]; then
mkdir ${NUGET_PACKAGE_DIR}
gh run download "$RUN_ID" -n gsl-nuget-package --dir ${NUGET_PACKAGE_DIR} || echo "Artifact download failed."
else
echo "No previous workflow run found for GSL ARM64 package."
fi
# check if artifact exists
if [ ! -f ${NUGET_PACKAGE_DIR}/${GSL_PACKAGE_FILE} ]; then
echo "Artifact not found. Triggering new build..."
gh workflow run nuget_gsl_arm64_package.yml --ref windows_arm64 -f reason="Artifact expired or missing"
echo "New build triggered. Please wait for completion before retrying."
exit 1 # exit with failure to indicate the need to rerun the workflow
else
ls ${NUGET_PACKAGE_DIR}/*.nupkg
echo "Artifact found. Proceeding with installation."
fi
# install the nupkg
nuget install ${GSL_PACKAGE_NAME} -Source $(pwd)/${NUGET_PACKAGE_DIR}
This can occur when re-tagging after a change (e.g. after a Zenodo deposition). These steps have been disabled because they are done internally by the `zenodraft` action.
This first part is needed just after `find -empty -type d -delete` in the `run` part of the "Push changes back to repo files" step of the `publish_zenodo` job:
# delete the local tag and remove it from the remote
git tag -d $GITHUB_REF_NAME
git push --follow-tags origin :$GITHUB_REF
# recreate the tag at the current commit and push it back
git tag $GITHUB_REF_NAME
git push origin $GITHUB_REF
# confirm the tag is now present on the remote
git ls-remote origin $GITHUB_REF
This second part would be the last step in the same `publish_zenodo` job:
#
# FIXME: disabled the rest of these steps (already done by zenodraft)
#
- name: Get existing release
if: false
id: get_existing_release
uses: cardinalby/git-get-release-action@v1
env:
GITHUB_TOKEN: ${{ github.token }}
with:
releaseId: ${{ github.event.release.id }}
- name: Delete old release
if: false
uses: liudonghua123/delete-release-action@v1
env:
GITHUB_TOKEN: ${{ github.token }}
with:
release_id: ${{ github.event.release.id }}
- name: Recreate release with new tag
if: false
id: recreate_release
uses: joutvhu/create-release@v1
with:
tag_name: ${{ github.event.release.tag_name }}
name: ${{ steps.get_existing_release.outputs.name }}
body: ${{ steps.get_existing_release.outputs.body }}
# FIXME: have to set these both to false, because not copied
# from the original release properly
# draft: ${{ steps.get_existing_release.outputs.draft }}
# prerelease: ${{ steps.get_existing_release.outputs.draft }}
draft: false
prerelease: false
target_commitish: ${{ steps.get_existing_release.outputs.target_commitish }}
on_release_exists: update
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# GITHUB_TOKEN: ${{ secrets.PYPOP_RELEASE_TOKEN }}
(These should eventually be migrated back into the source code, so that, if and when we generate API docs, they will appear there. They would need updating as part of that migration to make sure they are still accurate.)
- `Main` is the primary interface to the PyPop modules. Given a ConfigParser instance, which can be (1) created from a filename passed as a command-line argument, or (2) populated from values supplied by the GUI (currently selected from an `.ini` file, but it could ultimately be set directly from the GUI, or from values passed from a form to a web server, or the like), it then runs the specified modules (outlined below); see the usage sketch after this list.
- `GUIApp` is the graphical front-end to PyPop which uses the [wxPython](http://www.wxpython.org) GUI toolkit. wxPython is a set of Python bindings to [wxWindows](http://www.wxwindows.org), an open-source, cross-platform GUI widget toolkit which has a native look under GNU/Linux (GTK), Windows (MFC) and MacOS X (Aqua). [As of 2023, this was removed.]
- `ParseFile` is a base class which has most of the common functionality for reading files.
- `ParseGenotypeFile` is a subclass of `ParseFile` that deals with files consisting specifically of data for individuals genotyped at one or more loci.
- `ParseAlleleCount` is another subclass of `ParseFile` that deals with files consisting of allele counts across a whole population.
- `HardyWeinberg` is a class that calculates Hardy-Weinberg statistics given genotype data for a single locus.
- `HardyWeinbergGuoThompson` is a subclass of `HardyWeinberg` that uses the Guo & Thompson algorithm for calculating statistics.
- `HardyWeinbergGuoThompsonArlequin` is a subclass of `HardyWeinberg` that uses the Arlequin implementation of the Guo & Thompson algorithm for calculating statistics.
- `Haplo` is an abstract base class for estimating haplotypes given genotype data.
- `HaploArlequin` is a subclass of `Haplo` that uses Arlequin for estimation of haplotypes (obsolete).
- `Emhaplofreq` is a subclass of `Haplo` that uses `emhaplofreq` (Rich Single's program) for the estimation of haplotypes and linkage disequilibrium values.
- `ArlequinWrapper` is the underlying class that "wraps" the functionality of the [Arlequin](http://lgb.unige.ch/arlequin/) program (obsolete: this class, in turn, supplies `HaploArlequin` with required information).
- `Homozygosity` calculates homozygosity statistics for a given locus: it calculates the observed homozygosity and returns the approximate expected homozygosity statistics taken from previous simulation runs.
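As a rough illustration of the `Main` entry point described above, here is a hypothetical usage sketch (the import path, constructor signature, and configuration section/option names are placeholders, not PyPop's exact API):

```python
# Hypothetical sketch: build a ConfigParser, either from an .ini file named on
# the command line or from values supplied programmatically (e.g. by a GUI),
# then hand it to Main, which runs the configured analysis modules.
from configparser import ConfigParser

config = ConfigParser()

# (1) configuration read from a file named on the command line
config.read("config.ini")

# (2) or populate/override values directly, e.g. from a GUI or web form
config["ExampleSection"] = {"exampleOption": "value"}   # placeholder section/option names

# from PyPop.Main import Main                       # exact import path may differ
# app = Main(config=config, fileName="data.pop")    # constructor signature is illustrative only
```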
Both file formats are assumed to have a population header block, consisting of a line of column headers (population metadata) followed by a line with the actual data, then the column headers for the samples (sample metadata), followed by the sample data itself (either individuals in the genotype case, or alleles in the allele-count case).
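For illustration only, a genotype file following this layout might look something like the sketch below (the column names, loci, and values are made up, not a prescribed set of fields); an allele-count file has the same two header blocks, but its sample section lists alleles and their counts rather than individuals.

```
labcode    method    ethnic     collect.site
Lab-X      PCR-SSOP  ExampleA   SiteA
id         A_1   A_2   B_1   B_2
sample01   0101  0201  0702  4402
sample02   0101  0101  1501  0801
```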
These are either obsoleted by newer versions of dependencies or platforms, or no longer work and would need to be updated. They are kept around in case old platforms need support, or in case there is interest in reviving the feature(s) in question.
(obsoleted by newer Ubuntu releases)
There is a bug in swig versions 3.0.6 to 3.0.10 that prevents the swig shipped with `xenial` (version 3.0.8) from working. You will need to install the latest version from source.
- Get the swig dependency:

  sudo apt install libpcre3-dev

- Visit swig.org to get the download link
- Do the installation:

  tar zxvf ~/swig-3.0.12.tar.gz
  cd swig-3.0.12
  ./configure
  make
  sudo make install
(WARNING: these instructions are obsolete as of the Python 3 port)
To make pypop more portable (given that some of its dependencies are currently obsolete), it is possible to build a Singularity container which contains a minimal Fedora 25 installation (minus the kernel), pypop, pypop's dependencies, and some extra tools (`yum`, `rpm`, `less`, and `vim`) in case you need to do work inside the container.
Singularity containers bind-mount many external directories by default (for example, `/home` and `/tmp`), with the container image kept read-only. When run inside the container, pypop will work on your files, even though they live outside the container.
Singularity 2.3 or later is required in order to bootstrap this container. The container must also be bootstrapped and run on a Linux system running the x86_64 architecture, because that is the OS and architecture the container uses.
To build pypop as a Singularity container, once you have Singularity installed, perform these three steps:
cd path/to/pypop/source
singularity create -s 2048 image.img
sudo singularity bootstrap image.img Singularity
The above commands will give you a 2 GiB executable file named `image.img`. That is the container.
The first command ensures that you are in the pypop source directory. This is required because part of the bootstrap process copies the source into the container.
The second command creates a 2 GiB (2048 MiB) container image. This should be large enough, but you can increase or decrease it as you wish. Note that if you make it too small, the bootstrap might not have enough room to complete!
The final command performs the bootstrap. The bootstrap needs to be run as root, so you either need to use sudo
(as shown in the example above) or you need to run the command in a root shell. The bootstrap does a number of things:
- Mount the container image read/write.
- Download and install the Fedora 25 GPG key.
- Create a temporary Yum repo file, pointing to the Fedora 24 package archive.
- Install the `basesystem` package; GCC, SWIG, and GSL; Python (both the executable and development packages); and the Python modules for Numeric, libxml2, and libxslt.
- Copy the entire pypop source directory into the container.
- Build pypop (again, inside the container).
Once you have the container image, running it is as simple as executing `image.img`. For example:
akkornel@blargh-yakkety-typical:~/pypop$ ./image.img -V
pypop 0.8.0
Copyright (C) 2003-2005 Regents of the University of California
This is free software. There is NO warranty; not even for
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
akkornel@blargh-yakkety-typical:~/pypop$ ./image.img -h
Usage: pypop [OPTION]... [INPUTFILE]...
Process and run population genetics statistics on one or more INPUTFILEs.
Expects to find a configuration file called 'config.ini' in the
current directory or in /usr/share/pypop/config.ini.
-l, --use-libxslt filter XML via XSLT using libxslt (default)
-s, --use-4suite filter XML via XSLT using 4Suite
-x, --xsl=FILE use XSLT translation file FILE
-h, --help show this message
-c, --config=FILE select alternative config file
-d, --debug enable debugging output (overrides config file setting)
-i, --interactive run in interactive mode, prompting user for file names
-g, --gui run GUI (currently disabled)
-o, --outputdir=DIR put output in directory DIR
-f, --filelist=FILE file containing list of files (one per line) to process
(mutually exclusive with supplying INPUTFILEs)
--generate-tsv generate TSV output files (aka run 'popmeta')
-V, --version print version of PyPop
INPUTFILE input text file
Once built, the container image can be transferred to any other system which is running Linux x86_64, and which has the same version of Singularity (or newer).