ADMIXMAP: a program to model admixture using marker genotype data
ADMIXMAP
Change Log
3.6.1 (03/05/07)
p-values for most score tests are not written to the cumulative output file when the observed information is <10%. Previously this was only done for the final tables.
All remaining bugs in the reading of options files have been fixed.
Fixed an error in the writing of positions in the LocusTable.
The counts of haploid and diploid individuals reported on screen and in the logfile have been fixed.
Removed some inefficiencies in calculation of energy.
3.6 (20/04/07)
Non-case-sensitive options
Fixed a bug causing a crash when program called with no options
Fixed a bug causing program to crash if no genotypesfile is specified
Fixed the dimensions of the R object output by some of the score tests
Removed trailing commas from ends of R objects
Unit of distance included in header of LocusTable
Data files are checked for the presence of header lines. This means headers must have at least one alphabetic character.
Fixed a bug in checking allelefreqfiles/priorallelefreqfiles/historicallelefreqfiles, where an error was incorrectly reported when the locus was a composite locus.
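The header check introduced in 3.6 can be pictured with a minimal Python sketch. The function name looks_like_header is hypothetical and this is not the program's own code; it only illustrates the stated rule that a header line must contain at least one alphabetic character.

```python
def looks_like_header(line: str) -> bool:
    """Return True if the line contains at least one alphabetic character.

    Illustrative only: mirrors the rule stated in the change log,
    not ADMIXMAP's actual implementation.
    """
    return any(ch.isalpha() for ch in line)


first_line = "ID rs123 rs456"   # made-up header line
data_line = '1 "1,2" "2,2"'     # made-up data line with no letters
print(looks_like_header(first_line))  # True
print(looks_like_header(data_line))   # False
```

Under a rule of this kind, a header consisting only of digits would be mistaken for data, which is why headers need at least one letter.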
3.5.3 (02/02/07)
Fixed a bug in ancestry score tests.
Format of test output has been changed: only log p-values are output cumulatively and, in the final table, p-values are omitted where information content is less than 10%.
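The 10% rule for the final table can be sketched as follows. This is a hedged illustration, not the program's code: the function name format_p_value, the "NA" placeholder and the three-significant-figure formatting are all assumptions, since the change log says only that the p-value is omitted.

```python
def format_p_value(p_value: float, percent_info: float,
                   threshold: float = 10.0) -> str:
    """Format a score-test p-value for a final table.

    Assumption: "NA" stands in for an omitted p-value; the change log
    does not say what, if anything, is written in its place.
    """
    if percent_info < threshold:
        return "NA"              # too little information to report
    return f"{p_value:.3g}"      # three significant figures (arbitrary choice)


print(format_p_value(0.0123, percent_info=5.0))
print(format_p_value(0.0123, percent_info=50.0))
```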
3.5.2 (12/01/07)
Fixed a bug in the dispersion test that had been causing false results since v3.3.
Added two useful command-line switches: -v causes the program to display version number and copyright information and then exit; -h prints a help message.
3.5.1 (10/01/07)
Added an option to control samplers for population admixture and sum-of-intensities.
3.5 (12/06)
Command-line arguments are now deprecated and may be unsupported in future versions. The sample perl script supplied will write program arguments to file and supply the filename to the program. As a side-effect, this eliminates a bug that sometimes caused the program to crash on exit with command-line args.
All options are checked on startup. This means multiple error messages are possible so all invalid options can be fixed by the user in one go.
A check for sensible values of samples, burnin and every has been added.
The annealing schedule has been modified. With thermo=1, the final run at 'coolness' of 1 is twice as long.
Messages generated by the R script now appear in the logfile for convenience.
The Hamiltonian sampler for allele/haplotype frequencies is used when annealing or if there is a noninformative prior on the frequencies. Otherwise, the conjugate sampler is used.
For monitoring, acceptance rates for the Hamiltonian sampler are output to file.
Bugs in the sampling of allele frequency prior parameters in the correlated frequencies model have been fixed.
The option haplotypeassociationscorefile is ignored when there are no haplotypes. This previously could cause the R script to crash.
A bug that caused the program to crash at startup with a fixed frequency model when some of the frequencies were 0 has been fixed.
The dispersion test, stratification test, affecteds-only test and residual allelic association test now work with X-chromosome data.
Posterior mode-finding now takes place at the end of burnin.
A Cox regression model is now supported. See the manual for details.
Haploid autosomal data are now supported. Genotypes for haploid individuals/gametes should be coded as single integers. This allows for analysis of phased genotype data, where available.
The "energy" or -logLikelihood is written to screen along with population-level parameters, if displaylevel=3. For annealed runs, the mean energy over each run is displayed. This is useful for assessing convergence. Note that the energy is also written to file.
Distances in the locus file may be specified in Morgans, centimorgans or megabases. For centimorgans, the column header should be "distanceincm"; for megabases, the header should be "distanceinmb".
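As an illustration of the distance units introduced in 3.5, the sketch below maps a locus-file column header to a unit. The header names distanceincm and distanceinmb come from the note above; the function names, the case-insensitive comparison and the fallback to Morgans for any other header are assumptions made for this example only.

```python
def distance_unit(column_header: str) -> str:
    """Guess the distance unit from a locus-file column header.

    Assumptions: comparison is case-insensitive and any header other
    than the two named in the change log means Morgans.
    """
    name = column_header.strip().strip('"').lower()
    if name == "distanceincm":
        return "cM"
    if name == "distanceinmb":
        return "Mb"
    return "Morgans"


def centimorgans_to_morgans(distance_cm: float) -> float:
    """Convert a genetic distance: 1 Morgan = 100 centimorgans."""
    return distance_cm / 100.0


print(distance_unit("distanceincm"))   # cM
print(centimorgans_to_morgans(50.0))   # 0.5
```

Megabases are a physical distance, so there is no fixed conversion to Morgans, and the sketch deliberately does not attempt one.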
3.4 (11/07/06)
Bugs in the allele frequency sampler have been fixed. The default sampler is now a conjugate sampler again.
Options that take vector values may now contain spaces, e.g. sumintensitiesprior = "5.0, 11.0, 10.0".
Treatment of X-chromosome data has been revised. The admixture proportions are set to the autosomal values and the sum-of-intensities parameter is set to half the autosomal value.
Haploid genotypes should be specified as a single integer (but a pair of integers is still valid for now).
The options xonlyanalysis and truncationpoint are no longer valid.
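The vector-valued option syntax allowed from 3.4 can be read with a few lines of Python. This is a sketch of one way to parse such a line, not the parser used by the program; parse_vector_option is a hypothetical name.

```python
def parse_vector_option(line: str):
    """Split 'name = "v1, v2, ..."' into (name, list of floats).

    Spaces around the '=' and after the commas are tolerated.
    """
    name, _, raw = line.partition("=")
    values = [float(token) for token in raw.strip().strip('"').split(",")]
    return name.strip(), values


option_line = 'sumintensitiesprior = "5.0, 11.0, 10.0"'
print(parse_vector_option(option_line))
# ('sumintensitiesprior', [5.0, 11.0, 10.0])
```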
3.3 (14/06/06)
The allele/haplotype frequencies are sampled using a Hamiltonian sampler instead of direct sampling. This is slower but results in better mixing.
The number of outcome variables is no longer restricted to two and outcome variables are allowed with indadmixhierindicator=0.
The expected value of the outcome variable in a regression model is output instead of the residuals.
The prior precision of the regression parameters can be set with the regressionpriorprecision option.
A testgenotypesfile option has been added to allow offline score tests for genotypes at loci that have not been included in the model.
The population admixture proportions can be specified to be equal with the popadmixproportionsequal option.
A checkdata option has been added to skip checking of data. Useful for large data files.
An error in the header of the misspecallelefreqscoretestfile has been fixed.
Blank lines are allowed in an options file.
The stratification test uses realized haplotype pairs instead of observed genotypes.
3.2.1 (28/03/06)
Sampling bugs in the correlated allele frequencies model have been fixed.
3.2 (20/03/06)
Output bugs in allelic association scoretests have been fixed. Only logs of p-values are output cumulatively, for monitoring.
A new version compiled for x86-64 machines is now available.
3.1 (10/03/06)
Precision of p-values in score tests has been restored.
Some invisible changes to improve efficiency.
3.0 (03/03/06)
The resultsdir directory is now created if it does not exist. If it does, the contents are deleted to avoid mixing of results from different runs.
There are now two separate options, sumintensitiesprior and globalsumintensitiesprior, for specifying a prior on the sum of intensities parameter.
The popadmixpriormean and popadmixpriorvar options have been removed. The prior on the population admixture parameters is fixed as Di(1, ...,1)
The marglikelihood option has been renamed as chib. The method for estimating the posterior modes has been improved and can be invoked even with chib=0 with new option indadmixmodefile.
The anneal option has been replaced with thermo. The number of temperatures at which to sample can be controlled with option numannealedruns. Annealing is used by default during burnin even with thermo=0, in order to improve mixing. To override this, specify numannealedruns=0.
initalpha0 and initalpha1 options have been renamed as admixtureprior and admixtureprior1.
The update of linear regression parameters has been improved.
A score test for residual allelic association between pairs of unlinked loci is available with option residualallelicassocscorefile.
Final score test tables are now output by the main program, rather than the R script.
A bug in the score test for ancestry association has been fixed.
Error reporting has been improved. Most error messages in the event of a crash are written to the logfile.
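The resultsdir behaviour described in the 3.0 notes (create the directory if missing, otherwise empty it) might look like the following in Python. This is a sketch of the described behaviour under the assumption that subdirectories are removed too; prepare_results_dir is a hypothetical name and the code is not taken from the program.

```python
import os
import shutil


def prepare_results_dir(path: str) -> None:
    """Create the results directory, or empty it if it already exists."""
    if os.path.isdir(path):
        for entry in os.listdir(path):
            full = os.path.join(path, entry)
            if os.path.isdir(full) and not os.path.islink(full):
                shutil.rmtree(full)   # remove a subdirectory and its contents
            else:
                os.remove(full)       # remove a file or symbolic link
    else:
        os.makedirs(path)
```

Because existing contents are deleted, results worth keeping should be copied elsewhere before a new run points at the same directory.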
2.3.1 (14/12/05)
An output error in the allelic association test has been corrected to give output for alleles in a multiallelic locus.
coutindicator option is now deprecated. Instead use displaylevel to control output from silent, quiet, normal to verbose.
Some of the lower-level algorithms are faster at the modest expense of a greater memory requirement.
2.3.0 (11/11/05)
Sampling of individual admixture has been improved by use of a random walk MH sampling.
A correlated allele frequencies model has been implemented (see manual for details).
The adaptive rejection sampler used to sample population admixture and allele frequency parameters has been improved.
Some of the default priors and initial parameter values have changed slightly.
Prior on sumintensities can be specified with the 'sumintensitiesprior' option.
A bug causing the program to crash when run with bad score test options has been removed.
A problem with quotes on population labels in output files has been fixed.
Likelihood ratios for the affectedsonly test now produced.
More diagnostic output written to screen and logfile, including acceptance rates of Metropolis samplers.
Deviance and Deviance Information Criterion (DIC) routinely calculated and loglikelihood written to file.
More than two outcome variables may be placed in the outcomevar file (program still uses only one or two).
analysistypeindicator option is no longer required (still valid for backward compatibility but ignored). Specify a regression model with an outcomevarfile. The program will automatically detect outcome types, affectedsonly analyses and single individual analyses.
marglikelihood is now valid with more than one individual; it computes the marginal likelihood for the first individual listed in the genotypesfile. This is equivalent to the old analysistypeindicator=-3 option. The logprior, logposterior and log marginal likelihood are written to the ergodicaveragefile so that they are plotted by the R script.
With globalrho=0 option, the mean of the sumintensities is written to paramfile and ergodicaveragefile instead of the beta parameter of the prior.
Output in indadmixturefile has been rearranged. IndAdmixPosteriorMeans now shows means for each gamete separately in the case of a random mating model.
The timer now shows the correct run time.
Minor improvements in startup and start messages.
With coutindicator=0 startup information is still displayed and an iteration counter appears on screen. Parameter output is no longer written to the logfile as it is available in the paramfiles.
A problem with the calling of the R script from the examples perl scripts has been fixed.
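For readers unfamiliar with the Deviance Information Criterion mentioned in the 2.3.0 notes, the sketch below uses the commonly cited definition: DIC is the posterior mean deviance plus the effective number of parameters pD, where pD is the mean deviance minus the deviance evaluated at the posterior mean of the parameters. The change log does not state how the program evaluates these quantities, so the function and its inputs are illustrative assumptions.

```python
from statistics import fmean


def dic(deviance_samples, deviance_at_posterior_mean):
    """Deviance Information Criterion from MCMC output.

    deviance_samples: deviance (-2 * log-likelihood) at each sampled iteration.
    deviance_at_posterior_mean: deviance at the posterior mean of the parameters.
    """
    mean_deviance = fmean(deviance_samples)
    p_d = mean_deviance - deviance_at_posterior_mean   # effective number of parameters
    return mean_deviance + p_d


print(dic([10.0, 12.0, 14.0], 11.0))  # 13.0
```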
2.2.0 (31/08/05)
The samplers for the global sumintensities, population admixture and dispersion parameters have been improved.
The allelic association test is slightly faster.
The stratification test uses only loci with < 10% of missing genotypes and of those, selects those with the greatest expected heterozygosity. More detail has been added to the stratificationtestfile.
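The locus selection rule for the stratification test can be sketched as below. Expected heterozygosity is computed as one minus the sum of squared allele frequencies. The data layout, the function names and the n_select parameter are hypothetical; the change log does not say how many loci are retained.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def select_loci(loci, max_missing=0.10, n_select=2):
    """Keep loci with < max_missing missing genotypes, then take the
    n_select loci with the greatest expected heterozygosity.

    n_select is a placeholder; the real number is not given in the notes.
    """
    eligible = [locus for locus in loci if locus["missing"] < max_missing]
    eligible.sort(key=lambda locus: expected_heterozygosity(locus["freqs"]),
                  reverse=True)
    return [locus["name"] for locus in eligible[:n_select]]


# Made-up example data, not taken from any real data set.
loci = [
    {"name": "A", "missing": 0.02, "freqs": [0.5, 0.5]},
    {"name": "B", "missing": 0.20, "freqs": [0.5, 0.5]},  # excluded: too many missing
    {"name": "C", "missing": 0.05, "freqs": [0.9, 0.1]},
    {"name": "D", "missing": 0.00, "freqs": [0.7, 0.3]},
]
print(select_loci(loci))  # ['A', 'D']
```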
2.1.1
Some small bug fixes.
2.1.0 (19/08/05)
A bug in the updating of allelefrequencies has been fixed.
The default priors for the global sumintensities, population admixture and regression parameters have been changed.
Some minor errors in the R script have been fixed.
2.0.0 (28/07/05)
The main sampling algorithms have been improved and made faster.
Better checks of user options.
Better checks of input data.
Startup message and logfile made easier to read.
Priors for all parameters are written to logfile and screen if coutindicator = 1
Population-level parameters are no longer output when they are fixed. In particular none of them are output with no hierarchical model on individual admixture.
Dispersion parameter is now initialised at its prior mean instead of the sum of the initial values of the admixture parameters.
The default value of the sumintensities prior shape parameter changed from 5 to 6.
sumintensities option replaced with two options, sumintensitiesalpha and sumintensitiesbeta for specifying the (gamma) prior on the sumintensities parameter.
New options popadmixpriormean and popadmixpriorvar added for specifying (gamma) prior on population admixture parameters.
Header lines added to strattestfile and dispersiontestfile. Dispersion test results are written only at the end instead of every 10*every iterations.
args.txt, a list of supplied user options written by the program, is now usable as a valid input options file.
Various bug fixes, including initialisation when using an allelefreqfile and problems with the ancestry score test and allelic association score test.
1.6.0 (24/03/05)
analysistypeindicator = -3 option removed and replaced by two separate options:
indadmixhiermodel = 0 specifies no hierarchical model for individual admixture. Default is 1.
marglikelihood = 1 specifies that the marginal likelihood of the model is to be calculated. Only valid when analysistypeindicator = -1 or -2 and globalrho = 0. Default is 0.
Dispersion parameters in a dispersion model initialised at their prior means.
1.5.3
An inefficiency in updating of individual-level parameters was removed.
1.5.2
Invisible changes to treatment of allele frequencies and updating of individual-level parameters.
1.5.1
A bug causing the program to crash when ancestryassociationscorefile was specified without affectedsonlyscorefile was removed.
1.5.0
Score test for linkage with locus ancestry improved by means of a Rao-Blackwellized algorithm.