Scope & Usage

Scope

SMAP snp-seq designs HiPlex primers for amplicons that encompass selected SNP sites, while taking neighboring SNPs into consideration. It is a simple application for designing primer panels for targeted amplicon resequencing that takes known polymorphisms into account, and it can be directed to pre-selected locations such as GBS loci or candidate genes.

Input

SMAP snp-seq requires only a reference sequence FASTA file (--reference) and one VCF file with the polymorphisms that need to be screened (--vcf). Optionally, one may provide a BED file with selected regions (--regions), or a VCF file with SNPs that specifically need to be targeted (--target_vcf). Finally, one may create a customized reference for a particular sample set by providing a VCF file (--reference_vcf) with SNPs that need to be adjusted in the reference sequence prior to primer design.
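
Because the VCF file and the reference FASTA file must describe the same sequences, it can help to check that every sequence name used in the VCF also occurs in the FASTA before designing primers. The following is a minimal sketch of such a check; it is not part of SMAP snp-seq, and the function names are hypothetical. It assumes an uncompressed, tab-delimited VCF and a FASTA file in which the sequence name is the first word after ">".

```python
def fasta_sequence_names(lines):
    """Collect sequence names (first word after '>') from FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def vcf_contigs(lines):
    """Collect the CHROM values used by the data lines of a VCF."""
    contigs = set()
    for line in lines:
        # Skip blank lines and header lines ('##' meta lines and '#CHROM').
        if not line.strip() or line.startswith("#"):
            continue
        contigs.add(line.split("\t", 1)[0])
    return contigs


def contigs_missing_from_reference(fasta_lines, vcf_lines):
    """Return the sorted VCF sequence names that are absent from the FASTA."""
    return sorted(vcf_contigs(vcf_lines) - fasta_sequence_names(fasta_lines))
```

Both functions accept any iterable of lines, so open file handles can be passed directly, for example `contigs_missing_from_reference(open("genome.fasta"), open("variants.vcf"))`. An empty result means that all variant positions refer to sequences present in the reference.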

Output

SMAP snp-seq provides custom filters and a list of primers to order.
SMAP snp-seq creates a BED file with SMAPs to delineate HiPlex loci for downstream analyses (e.g. SMAP haplotype-sites).
SMAP snp-seq creates a GFF file with borders to delineate HiPlex windows for downstream analyses (e.g. SMAP haplotype-window).
SMAP snp-seq plots distributions of amplicon features, such as length.

Integration in the SMAP workflow

[Figure: global scheme of the SMAP workflow (SMAP_global_scheme_home_snp-seq1.png)]

SMAP snp-seq is run on a reference sequence FASTA file and one or two VCF files, after variant calling and before SMAP haplotype-sites or SMAP haplotype-window. SMAP snp-seq designs primer panels for HiPlex amplicon sequencing.


Guidelines for variant calling

See Veeckman et al. (2019) for a comparison of different SNP calling methods.


Commands & options

Mandatory options for SMAP snp-seq

SMAP snp-seq only needs a reference sequence and known SNP positions.

--vcf ###### The VCF file with SNPs [no default].
--reference ## The FASTA file with the reference genome sequence or candidate gene sequences [no default].

Command line options

See below for command line options and specific filter options.

Input data options:

-i, --input_directory ## (str) ## Input directory [current directory].
-r, --regions ######## (str) ## Name of the BED file in the input directory containing the genomic coordinates of regions wherein primers must be designed [no BED file provided].
--target_vcf ############### Name of the VCF file in the input directory containing target SNPs [no VCF file with target SNPs provided].
--reference_vcf ############# Name of the VCF file in the input directory containing non-polymorphic differences between the reference genome sequence and the samples for primer design [no VCF file with reference genome differences provided].
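
The BED file passed with --regions can be prepared from a list of candidate regions. The sketch below converts 1-based, inclusive coordinates (as commonly listed in gene annotations) into BED lines, which by convention use a 0-based start and an exclusive end. The helper name is hypothetical, and the assumption that a plain three-column BED file is sufficient for SMAP snp-seq is not confirmed by the text above.

```python
def regions_to_bed(regions):
    """Convert (sequence_name, start, end) tuples to three-column BED text.

    Input coordinates are 1-based and inclusive; BED output is 0-based
    with an exclusive end, so only the start coordinate shifts by one.
    """
    lines = []
    for seq_name, start, end in regions:
        if start < 1 or end < start:
            raise ValueError(f"invalid region: {seq_name}:{start}-{end}")
        lines.append(f"{seq_name}\t{start - 1}\t{end}\n")
    return "".join(lines)
```

For example, `regions_to_bed([("chr1", 101, 500)])` yields a single line with the fields `chr1`, `100`, and `500`, which can be written to a file in the input directory and passed with --regions.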

Example commands

Basic command to run SMAP snp-seq:

python3 SMAP_snp-seq.py -i /path/to/dir/ --vcf variants.vcf --reference genome.fasta
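
When the optional input files are used, the command grows by one option per file. As a minimal sketch, the command line can be assembled in Python from the options listed above (--input_directory, --vcf, --reference, --regions, --target_vcf, --reference_vcf). The function name is hypothetical, and file names such as regions.bed or targets.vcf are placeholders rather than files shipped with SMAP.

```python
def build_snp_seq_command(vcf, reference, input_directory=None, regions=None,
                          target_vcf=None, reference_vcf=None,
                          script="SMAP_snp-seq.py"):
    """Return the SMAP snp-seq command as an argument list for subprocess.run."""
    command = ["python3", script]
    if input_directory is not None:
        command += ["--input_directory", input_directory]
    # The two mandatory options.
    command += ["--vcf", vcf, "--reference", reference]
    # Optional input files are added only when a value is given.
    optional = [
        ("--regions", regions),
        ("--target_vcf", target_vcf),
        ("--reference_vcf", reference_vcf),
    ]
    for flag, value in optional:
        if value is not None:
            command += [flag, value]
    return command
```

The returned list can be run with `subprocess.run(command, check=True)`. Calling `build_snp_seq_command("variants.vcf", "genome.fasta", input_directory="/path/to/dir/")` reproduces the basic command above, with the long form --input_directory in place of -i.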

Output

By default, SMAP snp-seq does not provide graphical output.