Scope & Usage

Scope

SMAP grm converts a haplotype call table created by SMAP haplotype-sites or SMAP haplotype-window into pairwise genetic similarity/distance and/or locus information matrices.
SMAP grm works on any multi-allelic haplotype call table obtained from GBS, HiPlex or Shotgun sequencing data using the SMAP package.
SMAP grm outputs genetic relationship matrices (grm) as customised, high-quality figures or in standard file formats for downstream data analyses.
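As a rough illustration of what a pairwise similarity derived from haplotype calls looks like, the sketch below computes a Jaccard coefficient between two samples from the sets of (locus, haplotype) pairs that each sample carries. It is a generic presence/absence Jaccard, not SMAP grm's own implementation: the function names, the sample names and the toy calls are assumptions made for illustration, and filtering on completeness, missing data and non-shared loci is ignored here.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy calls: sample -> {(locus, haplotype): dosage}
calls: Dict[str, Dict[Tuple[str, str], int]] = {
    "sampleA": {("locus1", "1000"): 2, ("locus2", "1001"): 1, ("locus2", "1011"): 1},
    "sampleB": {("locus1", "1000"): 1, ("locus1", "1001"): 1, ("locus2", "1011"): 2},
}


def present(sample_calls: Dict[Tuple[str, str], int]) -> Set[Tuple[str, str]]:
    """Return the (locus, haplotype) pairs with a dosage above zero."""
    return {key for key, dosage in sample_calls.items() if dosage > 0}


def jaccard_similarity(a: Set[Tuple[str, str]], b: Set[Tuple[str, str]]) -> float:
    """Shared haplotypes divided by all haplotypes seen in either sample."""
    union = a | b
    if not union:
        return float("nan")  # no haplotypes at all: similarity is undefined
    return len(a & b) / len(union)


set_a = present(calls["sampleA"])
set_b = present(calls["sampleB"])
similarity = jaccard_similarity(set_a, set_b)  # 2 shared out of 4 in total
distance = 1 - similarity  # one common way to turn a similarity into a distance
print(similarity, distance)
```

Repeating this for every pair of samples fills a symmetric sample-by-sample matrix, which is the general shape of the similarity and distance matrices mentioned above.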

Integration in the SMAP workflow

../_images/SMAP_global_scheme_home_grm.png

Example input files

Reference  Locus    Haplotypes  ind1  ind2  ind3  ind4  ind5  ind6  ind7  ind8  ind9  ind10 ind11 ind12 ind13 ind14 ind15 ind16 ind17 ind18 ind19
chrom1     locus1   1000        2     0     0     1     2     2     2     1     1     2     0     0     0     0     0     0     0     2
chrom1     locus1   1001        0     0     2     0     0     0     0     1     1     0     2     0     0     2     1     0     2     0
chrom1     locus1   1011        0     2     0     0     0     0     0     0     0     0     0     1     0     0     1     2     0     0
chrom1     locus1   1111        0     0     0     1     0     0     0     0     0     0     0     1     2     0     0     0     0     0
chrom1     locus2   1000        2     2     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus2   1001        0     0     2     2     1     0     0     0     0     0     0     1     0     2     0     0     2     1     0
chrom1     locus2   1011        0     0     0     0     1     2     2     2     1     0     2     1     2     0     2     2     0     1     2
chrom1     locus2   1111        0     0     0     0     0     0     0     0     1     1     0     0     0     0     0     0     0     0     0
chrom1     locus2   1101        0     0     0     0     0     0     0     0     0     1     0     0     0     0     0     0     0     0     0
chrom1     locus3   1000        2     2     2     2     0     0     1     1     1     2     0     0     2     0     0     0     2     0     2
chrom1     locus3   1001        0     0     0     0     2     2     0     0     0     0     2     2     0     2     2     0     0     2     0
chrom1     locus3   1011        0     0     0     0     0     0     1     1     1     0     0     0     0     0     0     2     0     0     0
chrom1     locus4   1000        0     0     2     2     2     2     2     2     2     2     2     0     2     2     2     2     2     2     1
chrom1     locus4   1001        2     2     0     0     0     0     0     0     0     0     0     2     0     0     0     0     0     0     1
chrom1     locus5   1000        2     0     0     2     2     2     0     1     2     2     0     0     0     1     0     0     0     0     0
chrom1     locus5   1001        0     2     1     0     0     0     0     0     0     0     0     0     0     0     2     0     2     2     2
chrom1     locus5   1011        0     0     1     0     0     0     0     0     0     0     0     0     0     1     0     2     0     0     0
chrom1     locus5   1111        0     0     0     0     0     0     2     0     0     0     0     2     2     0     0     0     0     0     0
chrom1     locus5   1101        0     0     0     0     0     0     0     1     0     0     2     0     0     0     0     0     0     0     0
chrom1     locus6   1000        2     2     2     2     2     0     0     0     2     2     0     0     2     1     2     0     0     0
chrom1     locus6   1001        0     0     0     0     0     1     1     1     0     0     2     0     0     1     0     2     0     0
chrom1     locus6   1011        0     0     0     0     0     1     1     1     0     0     0     2     0     0     0     0     2     2
chrom1     locus7   1000        2     2     0     0     2     2     2     2     2     2     2     2     2     2     2     2     2     2     2
chrom1     locus7   1001        0     0     2     2     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus8   1000        0     0     0     0     1     0     0     0     0     0     2     0     0     0     0     0     0     0     2
chrom1     locus8   1001        0     0     0     0     1     2     0     0     0     0     0     2     2     0     1     0     2     0     0
chrom1     locus8   1011        2     2     2     2     0     0     0     0     0     0     0     0     0     2     0     0     0     2     0
chrom1     locus8   1111        0     0     0     0     0     0     2     2     2     0     0     0     0     0     1     0     0     0     0
chrom1     locus8   1101        0     0     0     0     0     0     0     0     0     2     0     0     0     0     0     2     0     0     0
chrom1     locus9   1000        2     0     2     0     0     0     2     2     2     2     2     2     2     2     2     2     2     2     2
chrom1     locus9   1001        0     0     0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus9   1011        0     0     0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus9   1111        0     0     0     0     2     2     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus9   1101        0     2     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus10  1000        2     2     0     0     1     0     0     1     0     2     0     0     2     0     2     0     2     1     2
chrom1     locus10  1001        0     0     2     2     0     0     0     0     0     0     2     2     0     2     0     0     0     1     0
chrom1     locus10  1011        0     0     0     0     1     2     2     1     2     0     0     0     0     0     0     2     0     0     0
chrom1     locus11  1000        2     2     2     2     2     2     1     2     2     2     0     0     0     0     0     0     0     0
chrom1     locus11  1001        0     0     0     0     0     0     1     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus11  1011        0     0     0     0     0     0     0     0     0     0     2     2     2     1     2     1     0     2
chrom1     locus11  1111        0     0     0     0     0     0     0     0     0     0     0     0     0     1     0     1     2     0
chrom1     locus12  1000        0     0     0     0     0     0     0     0     1     0     0     0     1     0     0     0     0     0     0
chrom1     locus12  1001        0     0     0     0     0     0     0     0     1     0     0     2     0     0     0     1     0     0     0
chrom1     locus12  1011        2     1     0     2     0     0     2     2     0     2     2     0     1     2     2     1     2     2     2
chrom1     locus12  1111        0     0     2     0     2     2     0     0     0     0     0     0     0     0     0     0     0     0     0
chrom1     locus12  1101        0     1     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

Note: the rows for locus1, locus6 and locus11 hold 18 values instead of 19. One sample has an empty cell (no call) at each of these loci, and which column it belongs to is not indicated, so the 18 values are listed in their original order and do not necessarily line up with the column headers.
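A haplotype call table with the columns shown above (Reference, Locus, Haplotypes, then one column per sample) can be read with Python's standard csv module. The sketch below is illustrative only: it assumes a tab-delimited layout with a header row, skips empty cells as missing data, and reads a small inline excerpt (ind1 to ind3 at locus5, copied from the example) instead of a real file. The helper name `read_haplotype_table` is hypothetical and not part of SMAP.

```python
import csv
import io
from collections import defaultdict

# Inline excerpt of the example table: ind1-ind3 at locus5, tab-delimited.
EXCERPT = (
    "Reference\tLocus\tHaplotypes\tind1\tind2\tind3\n"
    "chrom1\tlocus5\t1000\t2\t0\t0\n"
    "chrom1\tlocus5\t1001\t0\t2\t1\n"
    "chrom1\tlocus5\t1011\t0\t0\t1\n"
    "chrom1\tlocus5\t1111\t0\t0\t0\n"
    "chrom1\tlocus5\t1101\t0\t0\t0\n"
)


def read_haplotype_table(handle):
    """Return {sample: {locus: {haplotype: value}}}; empty cells are skipped."""
    reader = csv.DictReader(handle, delimiter="\t")
    samples = reader.fieldnames[3:]  # everything after Reference, Locus, Haplotypes
    table = {sample: defaultdict(dict) for sample in samples}
    for row in reader:
        for sample in samples:
            value = row[sample]
            if value is None or value.strip() == "":
                continue  # treated as missing data (an assumption of this sketch)
            table[sample][row["Locus"]][row["Haplotypes"]] = float(value)
    return table


table = read_haplotype_table(io.StringIO(EXCERPT))

# Haplotypes each sample actually carries (value above zero) at locus5.
carried = {
    sample: sorted(h for h, value in loci["locus5"].items() if value > 0)
    for sample, loci in table.items()
}
print(carried)
```

For a real table, the `io.StringIO(EXCERPT)` handle would be replaced by an opened file, for example `open("Haplotype_table.tsv", newline="")`.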

Commands & options

Mandatory options for SMAP grm

-t, --table ########### (str) ### Name of the haplotype call table obtained with SMAP haplotype-sites or SMAP haplotype-window in the input directory [no default].

Basic command to run SMAP grm with default parameters:

smap grm --table /PATH/TO/Haplotype_table.tsv

Command line options

Detailed command line options are listed below.

-i, --input_directory #### (str) ### Path to the directory containing the haplotype call table, the --samples text file, and/or the --loci text file [current directory].
-n, --samples ########## (str) ### Name of a tab-delimited text file in the input directory defining the order of the (new) sample IDs in the matrix: first column = old IDs, second column (optional) = new IDs [no list provided, the order of sample IDs in the grm equals their order in the haplotype call table].
-l, --loci ############ (str) ### Name of a tab-delimited text file in the input directory containing a one-column list of locus IDs formatted as in the haplotype call table [no list provided].
-p, --processes ######### (int) ### Number of parallel processes [4].
--plot_format ################# Use this option to choose the plot file format; choices are pdf, png, svg, jpg, jpeg, tif and tiff [png].
-h, --help ################### Show the full list of options. Disregards all other parameters.
-v, --version ################# Show the version. Disregards all other parameters.

Options may be given in any order.
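The --samples and --loci files described above are plain tab-delimited text files, so they can be written with a few lines of Python. The sketch below is only an illustration: the file names samples.txt and loci.txt, the new sample IDs and both helper functions are made up for this example, while the old sample IDs and locus IDs are borrowed from the example table.

```python
import csv
import tempfile
from pathlib import Path


def write_samples_file(path, old_ids, new_ids=None):
    """Write a tab-delimited samples file: old ID, optionally followed by new ID.

    The row order is the order in which the samples should appear in the matrix.
    """
    if new_ids is not None and len(new_ids) != len(old_ids):
        raise ValueError("new_ids must have one entry per old ID")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for index, old_id in enumerate(old_ids):
            row = [old_id] if new_ids is None else [old_id, new_ids[index]]
            writer.writerow(row)


def write_loci_file(path, locus_ids):
    """Write a one-column loci file with one locus ID per line."""
    with open(path, "w", newline="") as handle:
        for locus_id in locus_ids:
            handle.write(f"{locus_id}\n")


# A temporary directory stands in for the SMAP grm input directory.
input_dir = Path(tempfile.mkdtemp())
write_samples_file(input_dir / "samples.txt", ["ind3", "ind1"], ["parent_A", "parent_B"])
write_loci_file(input_dir / "loci.txt", ["locus1", "locus5"])

samples_text = (input_dir / "samples.txt").read_text()
loci_text = (input_dir / "loci.txt").read_text()
print(samples_text, loci_text, sep="")
```

The resulting files would then be passed by name with --samples samples.txt and --loci loci.txt, with --input_directory pointing at the directory that contains them.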

Example commands

Usage:

smap grm -t, --table TABLE [-i INPUT_DIRECTORY] [-n SAMPLES] [-l LOCI]
    [-lc LOCUS_COMPLETENESS] [-sc SAMPLE_COMPLETENESS] [--include_non_shared_loci]
    [--similarity_coefficient {Jaccard, Sorensen-Dice, Ochiai}]
    [--distance] [--distance_method {Inversed, Euclidean}]
    [-lic LOCUS_INFORMATION_CRITERION {Shared, Unique}] [--partial] [--proportion_informative_loci]
    [-b BOOTSTRAP] [-p PROCESSES] [-o OUTPUT_DIRECTORY] [-s SUFFIX]
    [--print_sample_information {Matrix, Plot, All}]
    [--print_locus_information {None, Matrix, Plot, List, All}]
    [--matrix_format {Phylip, Nexus}]
    [--plot_format {pdf, png, svg, jpg, jpeg, tif, tiff}]
    [--mask {None, Upper, Lower}] [--annotate_matrix_plots] [--no_matrix_plot_labels]
    [--plot_line_curves] [--list_line_curves LIST_LINE_CURVES] [--locus_interval LOCUS_INTERVAL]
    [-f {Times New Roman, Arial, Calibri, Consolas, Verdana, Helvetica, Comic Sans MS}]
    [--title_fontsize TITLE_FONTSIZE] [--label_fontsize LABEL_FONTSIZE]
    [--tick_fontsize TICK_FONTSIZE] [--legend_fontsize LEGEND_FONTSIZE]
    [--legend_position X, Y] [-r PLOT_RESOLUTION]
    [--colour_map {viridis, plasma, inferno, magma, cividis, Greys, Purples, Blues, Greens, Oranges, Reds, YlOrBr, YlOrRd, OrRd, PuRd, RdPu, BuPu, GnBu, PuBu, YlGnBu, PuBuGn, BuGn, YlGn, binary, gist_yarg, gist_gray, gray, bone, pink, spring, summer, autumn, winter, cool, Wistia, hot, afmhot, gist_heat, copper, PiYG, PRGn, BrBG, PuOr, RdGy, RdBu, RdYlBu, RdYlGn, Spectral, coolwarm, bwr, seismic, twilight, twilight_shifted, hsv}]
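When SMAP grm is driven from a script, the command can be assembled as an argument list from the long option names that appear in the usage and option list above. The sketch below only builds and prints the command; it does not run it. The helper `build_grm_command`, the file name and the chosen option values are illustrative, and the sketch does not check whether a given combination of options makes sense for your data.

```python
import shlex


def build_grm_command(table, **options):
    """Build an argument list for `smap grm`.

    Keyword names are passed through unchanged as --name. A value of True adds
    a bare flag; False or None leaves the option out.
    """
    command = ["smap", "grm", "--table", table]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            command.append(flag)
        elif value is False or value is None:
            continue
        else:
            command.extend([flag, str(value)])
    return command


command = build_grm_command(
    "Haplotype_table.tsv",
    input_directory="/PATH/TO/INPUT",
    similarity_coefficient="Jaccard",
    print_sample_information="All",
    plot_format="pdf",
    annotate_matrix_plots=True,
    processes=4,
)
print(shlex.join(command))

# To execute the command (requires an installed SMAP):
# import subprocess; subprocess.run(command, check=True)
```

Passing a list rather than a single string avoids shell quoting problems with values that contain spaces, such as some of the font names.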

Output

All output files are saved to the user-defined output directory (default = current directory). If the output directory does not exist, the script creates it. The option --print_sample_information creates tab-delimited text files (Matrix, default), plots heatmaps (Plot), or does both (All).

To illustrate the different kinds of output that can be created, a haplotype call matrix was simulated that includes various scenarios of shared and unique haplotype calls across a small sample set.
In the next tabs, the SMAP grm output is created by comparing 10 individuals at all 12 loci (left-hand panel), using the settings for “complete unique loci”: --locus_information_criterion Unique creates a matrix showing the number of loci with unique haplotypes in each comparison (e.g. locus5 in ind7 uniquely has haplotype d), and only loci with unique haplotypes are counted (complete, default).
The other use-case scenarios (here shaded) are discussed in detail in the How It Works section.
../_images/grm_haplotype_call_matrix_individuals.png