How to design primers for Hi-Plex
1. Download the Hiplex-primer software.
You can download the software from GitHub.
2. Check your Python version.
Hiplex-primer is designed to work with Python 2.6 or 2.7. It will not work with Python 3.
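If you are unsure which interpreter will be used, a quick check like the one below can help. This is only a sketch: the function name is_supported is hypothetical and is not part of Hiplex-primer.

```python
import sys


def is_supported(version_info):
    """Return True for Python 2.6 or 2.7, the versions named above."""
    major, minor = version_info[0], version_info[1]
    return major == 2 and minor in (6, 7)


# Report on the interpreter that is running this check.
if is_supported(sys.version_info):
    print("Python %d.%d matches the supported versions" % sys.version_info[:2])
else:
    print("Python %d.%d is not supported; use Python 2.6 or 2.7" % sys.version_info[:2])
```

Run the check with the same interpreter you intend to use for Hiplex-primer.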
3. Make sure Biopython is installed for your version of Python.
Follow the Biopython installation instructions.
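To confirm that Biopython is visible to a given interpreter, you can try importing its top-level Bio package. The helper below is a minimal sketch with a hypothetical name (biopython_version); it reports whether the import succeeds, not whether the installed version suits Hiplex-primer.

```python
def biopython_version():
    """Return the installed Biopython version string, or None if it cannot be imported."""
    try:
        import Bio
    except ImportError:
        return None
    # Fall back to "unknown" if the package does not expose a version string.
    return getattr(Bio, "__version__", "unknown")


version = biopython_version()
if version is None:
    print("Biopython is not installed for this Python interpreter")
else:
    print("Found Biopython " + version)
```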
4. Download a copy of all the chromosome reference files in FASTA format.
Hiplex-primer has been tested with the UCSC hg19 human genome reference. It requires that each chromosome be stored in a separate file (chr1.fa, chr2.fa, and so on). Hiplex-primer should work equally well with other species; you just need to download the appropriate reference sequence. The directory containing your reference files is specified with the --refdir command-line parameter.
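Before a run, it can be worth checking that the reference directory holds one file per chromosome you plan to target. The sketch below assumes the chrN.fa naming described above; the function name missing_reference_files is hypothetical.

```python
import os


def missing_reference_files(refdir, chromosomes):
    """Return the chromosome names that have no <name>.fa file in refdir."""
    missing = []
    for chrom in chromosomes:
        if not os.path.isfile(os.path.join(refdir, chrom + ".fa")):
            missing.append(chrom)
    return missing


# Example: check the two chromosomes used in the target list of this guide.
print(missing_reference_files("primer_finder_data", ["chr16", "chr17"]))
```

An empty list means every requested chromosome has a matching file.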
5. Create a file specifying the target regions.
This input is a tab-separated file, similar to BED format, that lists all the target regions. The first column gives the chromosome name, the second the start coordinate of the target region, and the third the end coordinate. The fourth column gives the gene/region name. This is important, as the software uses it to determine the primer names.
The contents of such a file are shown below:
chr16 68771313 68771371 CDH1,NM_004360
chr16 68772194 68772319 CDH1,NM_004360
chr16 68835567 68835801 CDH1,NM_004360
chr16 68842321 68842475 CDH1,NM_004360
chr16 68842590 68842756 CDH1,NM_004360
chr16 68844094 68844249 CDH1,NM_004360
chr17 41197689 41197824 BRCA1,NM_007300
chr17 41199654 41199725 BRCA1,NM_007300
chr17 41201132 41201216 BRCA1,NM_007300
chr17 41203074 41203139 BRCA1,NM_007300
chr17 41209063 41209157 BRCA1,NM_007300
chr17 41215344 41215395 BRCA1,NM_007300
chr17 41215885 41215973 BRCA1,NM_007300
The name of the file is specified using the --genes command line parameter for Hiplex-primer.
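A malformed target file is easier to catch before a run than after it. The following sketch parses the four columns described above and applies two simple sanity checks. It is not part of Hiplex-primer; the function name parse_targets and the checks are assumptions made for illustration.

```python
def parse_targets(lines):
    """Parse tab-separated target lines into (chrom, start, end, name) tuples."""
    targets = []
    for number, line in enumerate(lines, 1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError("line %d: expected 4 tab-separated columns, got %d"
                             % (number, len(fields)))
        chrom, start, end, name = fields[0], int(fields[1]), int(fields[2]), fields[3]
        if start > end:
            raise ValueError("line %d: start %d is after end %d" % (number, start, end))
        targets.append((chrom, start, end, name))
    return targets


example = [
    "chr16\t68771313\t68771371\tCDH1,NM_004360\n",
    "chr17\t41197689\t41197824\tBRCA1,NM_007300\n",
]
for target in parse_targets(example):
    print(target)
```

To check a real file, pass an open file handle, for example parse_targets(open("genes_coords.tsv")).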
6. You can run Hiplex-primer from the directory where you downloaded it.
Here is an example command line (the values are indicative only and should be tailored to your needs):
./Primer_design.py --refdir <dir containing references in fasta format> \
    --genes <coordinates input file> \
    --blocksize 100 \
    --maxprimersize 30 \
    --primervar 10 \
    --splicebuffer 8 \
    --melt 64 \
    --log <log_file_name.log> \
    --idtfile <output_file_name.csv> \
    --maxhairpinsize 30 \
    --blocksizevar 0 \
    --scale 25nmole \
    --purification "Standard Desalting"
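If you launch Hiplex-primer from a script, assembling the arguments as a list avoids shell-quoting problems with multi-word values. The sketch below uses only the options shown above. The helper name build_command and the file names are placeholders, and passing the two-word purification value as a single argument is an assumption.

```python
def build_command(refdir, genes, log, idtfile, script="./Primer_design.py"):
    """Return the example Hiplex-primer command as an argument list."""
    return [
        script,
        "--refdir", refdir,
        "--genes", genes,
        "--blocksize", "100",
        "--maxprimersize", "30",
        "--primervar", "10",
        "--splicebuffer", "8",
        "--melt", "64",
        "--log", log,
        "--idtfile", idtfile,
        "--maxhairpinsize", "30",
        "--blocksizevar", "0",
        "--scale", "25nmole",
        # Assumption: the two-word value is passed as one argument.
        "--purification", "Standard Desalting",
    ]


command = build_command("primer_finder_data", "genes_coords.tsv",
                        "genes_coords.log", "gene.idt.csv")
print(" ".join(command))
# To run it: import subprocess; subprocess.check_call(command)
```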
7. Hiplex-primer produces a log file.
The name of the log file is specified by the --log command-line argument. Note that the log records the command line that was used.
Below is the start of an example log file:
command line: ./Primer_design_custom.py --refdir primer_finder_data --genes exon_list.tsv --blocksize 100 --maxprimersize 30 --primervar 10 --splicebuffer 8 --melt 64 --log genes_coords.log --idtfile gene.idt.csv --maxhairpinsize 30 --blocksizevar 0 --scale 25nmole --purification Standard Desalting
********************************************************************************
chrom:chr16
exon:1
exon start:68771313
exon buff start:68771305
exon end:68771371
exon buff end:68771379
exon size:59
exon buff size:75
block_size:100
num_blocks:1
window_size:100
slack:25
region_start:68771250
region_end:68771434
Reading region on chr16, start: 68771250, end: 68771434 from file primer_finder_data/chr16.fa
region starts with: AGCCC, region ends with: GCGCC
================================================================================
Scoring window starting at 68771280 (1/26)
--------------------------------------------------------------------------------
block:0
block start:68771280
block end:68771379
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Primer dir:forward
Primer start:68771250
Primer end:68771279
Score: 2116, 5'> AGCCCGCTCCAGCCCGGCCCGACCCGACCG <3'
Score: 1936, 5'> GCCCGCTCCAGCCCGGCCCGACCCGACCG <3'
Score: 1600, 5'> CCCGCTCCAGCCCGGCCCGACCCGACCG <3'
Score: 1296, 5'> CCGCTCCAGCCCGGCCCGACCCGACCG <3'
Score: 1024, 5'> CGCTCCAGCCCGGCCCGACCCGACCG <3'
Score: 784, 5'> GCTCCAGCCCGGCCCGACCCGACCG <3'
Score: 576, 5'> CTCCAGCCCGGCCCGACCCGACCG <3'
Score: 400, 5'> TCCAGCCCGGCCCGACCCGACCG <3'
Score: 324, 5'> CCAGCCCGGCCCGACCCGACCG <3'
Score: 196, 5'> CAGCCCGGCCCGACCCGACCG <3'
Best primer is: CAGCCCGGCCCGACCCGACCG
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Primer dir:reverse
Primer start:68771409
Primer end:68771380
Score: 1444, 5'> GCGGCCCGAATGCGTCCCTCGCAAGTCAGG <3'
Score: 1156, 5'> CGGCCCGAATGCGTCCCTCGCAAGTCAGG <3'
Score: 900, 5'> GGCCCGAATGCGTCCCTCGCAAGTCAGG <3'
Score: 676, 5'> GCCCGAATGCGTCCCTCGCAAGTCAGG <3'
Score: 484, 5'> CCCGAATGCGTCCCTCGCAAGTCAGG <3'
Score: 324, 5'> CCGAATGCGTCCCTCGCAAGTCAGG <3'
Score: 196, 5'> CGAATGCGTCCCTCGCAAGTCAGG <3'
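The exon and buffered-exon values at the top of this log are consistent with simple inclusive-coordinate arithmetic using --splicebuffer 8. The sketch below reproduces them. It is inferred from the log values shown rather than taken from the Hiplex-primer source, and the function name buffered_region is hypothetical.

```python
def buffered_region(start, end, splicebuffer):
    """Extend a target region by splicebuffer bases on each side (inclusive coordinates)."""
    buff_start = start - splicebuffer
    buff_end = end + splicebuffer
    size = end - start + 1                  # "exon size" in the log
    buff_size = buff_end - buff_start + 1   # "exon buff size" in the log
    return buff_start, buff_end, size, buff_size


# First CDH1 target region from the example target file.
print(buffered_region(68771313, 68771371, 8))
# prints (68771305, 68771379, 59, 75)
```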
8. Here is a complete example command line:
./Primer_design.py \
    --refdir primer_finder_data \
    --genes genes_coords.tsv \
    --blocksize 100 \
    --maxprimersize 30 \
    --primervar 10 \
    --splicebuffer 8 \
    --melt 64 \
    --log genes_coords.log \
    --idtfile gene.idt.csv \
    --maxhairpinsize 30 \
    --blocksizevar 0 \
    --scale 25nmole \
    --purification "Standard Desalting"
In this example, the reference FASTA files are saved in a directory called primer_finder_data, and the target region list is a file called genes_coords.tsv. The primer-intervening sequence is set to 100 bases (--blocksize) and is not allowed to vary in size (--blocksizevar 0). The maximum primer size is 30 bases (--maxprimersize), and primer length may vary by 10 bases (--primervar). A buffer of 8 additional bases is added on each side of the target regions listed in genes_coords.tsv to account for splice sites (--splicebuffer). The optimum melting temperature is set to 64 °C (--melt), and the log file is called genes_coords.log. The oligos are to be ordered at the 25 nmole scale, with standard purification.
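With --maxprimersize 30 and --primervar 10, the example log in step 7 lists ten candidates for the forward primer, from 30 bases down to 21, each one base shorter at the 5' end than the last. The sketch below reproduces that enumeration. It is inferred from the log rather than from the Hiplex-primer source, the function name candidate_primers is hypothetical, and the scoring is not reproduced.

```python
def candidate_primers(longest, primervar):
    """Return primervar candidates, trimming one base at a time from the 5' end."""
    return [longest[i:] for i in range(primervar)]


# 30-base forward candidate taken from the example log.
forward = "AGCCCGCTCCAGCCCGGCCCGACCCGACCG"
for candidate in candidate_primers(forward, 10):
    print("%d %s" % (len(candidate), candidate))
```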
The output primers will be saved in a file called gene.idt.csv. Here is a small example of what the file will look like (you can open this file in a spreadsheet application, such as Excel):
CDH1,NM_004360_1_F1,AGCCCGGCCCGACCCGACCGC,25nmole,Standard Desalting
CDH1,NM_004360_1_R1,AATGCGTCCCTCGCAAGTCAG,25nmole,Standard Desalting
CDH1,NM_004360_2_F1,GTTTCGGTGAGCAGGAGGGAA,25nmole,Standard Desalting
CDH1,NM_004360_2_R1,GGGCACCGTGAACGTGTAGCT,25nmole,Standard Desalting
CDH1,NM_004360_2_F2,CACCCTGGCTTTGACGCCGAG,25nmole,Standard Desalting
CDH1,NM_004360_2_R2,ATTTCTCGGCCCCTTTCCAAC,25nmole,Standard Desalting
CDH1,NM_004360_3_F1,CGCTCTTTGGAGAAGGAATGC,25nmole,Standard Desalting
CDH1,NM_004360_3_R1,CCACTTTGAATCGGGTGTCGA,25nmole,Standard Desalting
CDH1,NM_004360_3_F2,CAAAGGACAGCCTATTTTTCCC,25nmole,Standard Desalting
CDH1,NM_004360_3_R2,GGAAAACTTTCTGTAGGTGGAG,25nmole,Standard Desalting
CDH1,NM_004360_3_F3,TTTCTTGGTCTACGCCTGGGA,25nmole,Standard Desalting
CDH1,NM_004360_3_R3,AGCGCACTAAAACAACAGCGAA,25nmole,Standard Desalting
BRCA1,NM_007300_1_R2,GATTAGAGCCTAGTCCAGGAG,25nmole,Standard Desalting
BRCA1,NM_007300_2_F1,CCATGCAAAAGGACCCCATATA,25nmole,Standard Desalting
BRCA1,NM_007300_2_R1,TGACACTTTGAATGCTCTTTCCT,25nmole,Standard Desalting
BRCA1,NM_007300_3_F1,TGTGGGCAGAGAAGACTTCTG,25nmole,Standard Desalting
BRCA1,NM_007300_3_R1,TTCATCATTCACCCTTGGCACA,25nmole,Standard Desalting
BRCA1,NM_007300_3_F2,GACAGGGCACCCAATACTTAC,25nmole,Standard Desalting
BRCA1,NM_007300_3_R2,TAAGTATGCAGATTACTGCAGTG,25nmole,Standard Desalting
BRCA1,NM_007300_4_F1,TATGTAAGACAAAGGCTGGTGC,25nmole,Standard Desalting
BRCA1,NM_007300_4_R1,ATTCCCCTGTCCCTCTCTCTT,25nmole,Standard Desalting
BRCA1,NM_007300_5_F1,GTATCTAGCACTGTGTATGTATG,25nmole,Standard Desalting
BRCA1,NM_007300_5_R1,AAGAGAATCCCAGGACAGAAAG,25nmole,Standard Desalting
BRCA1,NM_007300_5_F2,ACTTGAGGGAGGGAGCTTTAC,25nmole,Standard Desalting
BRCA1,NM_007300_5_R2,TTTCTCTTATCCTGATGGGTTGT,25nmole,Standard Desalting
BRCA1,NM_007300_6_F1,AGGAAGCAAATACATTTTTAACTATA,25nmole,Standard Desalting
BRCA1,NM_007300_6_R1,GCTGTATGTAACCTGTCTTTTCT,25nmole,Standard Desalting
BRCA1,NM_007300_7_F1,TTACAATTAAAGACCTTTTGGTAAC,25nmole,Standard Desalting
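If you process this file with a script rather than a spreadsheet, note that the primer name itself contains a comma, so each row has five comma-separated pieces. The sketch below splits from the right to keep the name intact. It assumes rows are written exactly as displayed above, without quoting, and the function name parse_idt_row is hypothetical.

```python
def parse_idt_row(row):
    """Split one output row into (name, sequence, scale, purification)."""
    # The name contains a comma (e.g. "CDH1,NM_004360_1_F1"), so split
    # from the right to keep it in one piece.
    name, sequence, scale, purification = row.rstrip("\r\n").rsplit(",", 3)
    return name, sequence, scale, purification


row = "CDH1,NM_004360_1_F1,AGCCCGGCCCGACCCGACCGC,25nmole,Standard Desalting"
print(parse_idt_row(row))
```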
Before ordering the primers, remember to add the 5' heel clamps to each of the forward and reverse primers (Fwd: ctctctatgggcagtcggtgatt; Rev: ctgcgtgtctccgactcag), as described in our publication: A high-plex PCR approach for massively parallel sequencing. BioTechniques, Vol. 55, No. 2, August 2013, pp. 69–74.
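Adding the heels by hand to many primers is error-prone, so a small script may help. The sketch below rests on two assumptions: that the heel is placed directly at the 5' end of the primer as written 5' to 3', and that primer direction can be read from the _F<n> or _R<n> suffix seen in the example output. The function name add_heel is hypothetical; consult the publication for the exact construct.

```python
import re

# Heel sequences quoted in the text above.
FWD_HEEL = "ctctctatgggcagtcggtgatt"
REV_HEEL = "ctgcgtgtctccgactcag"


def add_heel(name, sequence):
    """Prepend the matching 5' heel, chosen from the _F<n>/_R<n> suffix of the name."""
    match = re.search(r"_([FR])\d+$", name)
    if match is None:
        raise ValueError("cannot tell primer direction from name: " + name)
    heel = FWD_HEEL if match.group(1) == "F" else REV_HEEL
    return heel + sequence


print(add_heel("CDH1,NM_004360_1_F1", "AGCCCGGCCCGACCCGACCGC"))
```

The heel stays in lower case here, so it remains easy to tell apart from the target-specific part of each oligo.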