diff --git a/README.md b/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5d96d8cca785a0e575e0697d20158207f6149304
--- /dev/null
+++ b/README.md
@@ -0,0 +1,412 @@
+SyntenyLink
+===========
+
+Overview
+========
+
+The SyntenyLink package has six major components: the SyntenyLink
+algorithm allows users to handle reconstruct subgenomes of polyploid
+species more conveniently and to separate the set of genes belong to
+each subgenome in the organism with the aid of reference proteomes of
+polyploid species and related ancestor.
+
+All programs are executed using command line options on Linux systems or
+Mac OS. Usage or help information are well built into the programs. To
+show them on the screen, users just need to run the program without
+giving any options:
+
+    $python3 ./SyntenyLink.py
+
+All code is copiable, distributable, modifiable, and usable without any
+restrictions. Contact: Thulani Hewavithana, <qnm481@usask.ca>; Chu Shin
+Koh, <kevin.koh@gifs.ca>
+
+The following is the list of executable programs
+------------------------------------------------
+
+**Main programs (in the Scripts folder)**
+
+-   SyntenyLink\_bf.pl
+-   SyntenyLink\_st.pl
+-   SyntenyLink\_mbp.py
+-   SyntenyLink\_wg.py
+-   SyntenyLink\_sb.py
+-   SyntenyLink\_mn.py
+-   main\_script.py
+-   SyntenyLink\_acc.py
+-   gap\_threshold\_selection.py
+-   minimum\_block\_length\_selection.py
+
+Main programs
+=============
+
+SyntenyLink\_bf
+===============
+
+This program, detects homologs between two species with blastp and
+involves a filtering step focusing on bit score and e-value to remove
+noise following two criteria: an e-value threshold of less than 1e-20
+and a ratio between the bit score and the highest bit score greater than
+0.6.
+
+\- Usage Reads in a data file: abc.blast. The abc.blast file holds blast
+hits after running blastp between baseline species and polyploid species
+of interest:
+
+    BraA01g000010.3C    AT1G43860.1 74.194  124 14  3   80  186 231 353 9.61e-56    188
+
+Here is a typical parameter setting for generating the abc.blast file:
+
+    $ makeblastdb -in ref_pep.fa -dbtype prot -out ref_pep
+
+    $ blastall -i query_pep.fasta -p blastp -d ref_pep -m 8 -e 1e-5 -F F -v 5 -b 5 -o abc.blast -a 4
+
+Note: for blastp you need to use protein fasta files of baseline model
+and polyploid species of interest. After getting the output blast result
+update gene start loci and end loci adding row start values from the bed
+file of each species before using SyntenyLink\_bf.py.
+
+It is advised that to make SyntenyLink\_bf generate more reasonable
+results, the number of BLASTP hits for a gene should be restricted to
+around top 5. When you have abc.blast ready, put them in the same
+folder. Then you can simply use:
+
+    $ python3 ./SyntenyLink_bf.pl dir/abc.blast
+
+\- Output The execution of SyntenyLink\_bf outputs one blast file
+abc\_blast\_filtered.txt, containing filtered blast hits as follows:
+
+    BraA01g000010.3C    AT1G43860.1 74.194  124 14  3   80  186 231 353 9.61e-56    188
+    BraA01g000010.3C    AT3G04630.1 66.087  115 21  4   194 297 165 272 7.06e-33    125
+    BraA01g000010.3C    AT3G04630.3 66.087  115 21  4   194 297 164 271 7.88e-33    125
+
+SyntenyLink\_st
+---------------
+
+This program uses the output after performing synteny analysis using
+DAGchainer to build a chain of syntenic genes and compute the score of
+each chain. The modified version of filtered results from blastp
+(abc\_blast\_filtered\_modified.txt) and DAGchainer
+(abc\_synteny.aligncoords) are used to generate a syntelog table in
+which the gene chains are incorporated into their corresponding
+chromosomes with redundant chains (likely due to gene duplications or
+tandem or segmental duplications in the polyploid genome) placed in the
+"overlap" column in the syntelog table.
+
+Reads in data file for DAGchainer: abc\_blast\_filtered\_modified.txt.
+The abc\_blast\_filtered\_modified.txt file holds the modified
+abc\_blast\_filtered.txt file matching the input file format of
+DAGchainer:
+
+    A1  BraA01g000010.3C    2944    3050    Chr1    AT1G43860.1 16622247    16622597    9.61E-56    188
+    A1  BraA01g000010.3C    3058    3161    Chr3    AT3G04630.1 1259234 1259503 7.06E-33    125
+
+Note: After getting the abc\_blast\_filtered.txt file update query
+start, query end, subject start and subject end values incoorperating
+gene locus data from bed files of baseline species and polyploid species
+of interest. Then add chromosome which each gene belongs to in the bed
+file of each species before using DAGchainer.
+
+Here is a typical parameter setting for generating the
+abc\_synteny.aligncoords file:
+
+    $ ./run_DAG_chainer.pl -i dir/abc_blast_filtered_modified.txt -s -I
+
+The abc\_synteny.aligncoords file holds pairwise synteny blocks after
+running DAGchainer :
+
+    ## alignment A1 vs. Chr1 Alignment #1  score = 5177.6 (num aligned pairs: 121):
+    A1  BraA01g026830.3C    17161339    17161504    Chr1    AT1G56580.1 21198405    21198568    2.180000e-109   50
+    A1  BraA01g026890.3C    17191267    17191318    Chr1    AT1G57550.1 21312544    21312593    3.270000e-24    40
+    A1  BraA01g026900.3C    17196325    17196618    Chr1    AT1G57610.3 21337612    21337818    8.840000e-106   84
+
+\- Usage Reads in two data files: abc\_blast\_filtered\_modified.txt &
+ref\_genelist.txt. The ref\_genelist.txt file holds gff file like
+details of baseline species :
+
+    AT1G01010   AT1G01010.1 429 Chr1_1  Chr1    1   3631    5899    AT1G01010.1  NAC domain containing protein 1 
+    AT1G01020   AT1G01020.1 245 Chr1_2  Chr1    2   5928    8737    AT1G01020.1  Arv1-like protein 
+    AT1G01030   AT1G01030.1 358 Chr1_4  Chr1    4   11649   13714   AT1G01030.1  AP2/B3-like transcriptional factor family protein
+
+To run SyntenyLink\_st.pl you can simply use:
+
+    $ perl SyntenyLink_st.pl -d abc_synteny.aligncoords -g ref_genelist.txt
+
+\- Output The execution of SyntenyLink\_st outputs four output files:
+abc\_synteny.success.colinear, abc\_synteny.failed.colinear,
+abc\_synteny.chains.passed & abc\_synteny.all.chains.
+
+The abc\_synteny.success.colinear file holds the main output file with
+the generated syntelog table in which the gene chains are incorporated
+into their corresponding chromosomes with redundant chains (likely due
+to gene duplications or tandem or segmental duplications in the
+polyploid genome) placed in the "overlap" column in the syntelog table:
+
+    BraA01g000010.3C    AT1G43860.1 74.194  124 14  3   80  186 231 353 9.61e-56    188
+    BraA01g000010.3C    AT3G04630.1 66.087  115 21  4   194 297 165 272 7.06e-33    125
+    BraA01g000010.3C    AT3G04630.3 66.087  115 21  4   194 297 164 271 7.88e-33    125
+
+The abc\_synteny.failed.colinear file holds the removed chains following
+the condition if the number of remaining colinear pairs are less than 6:
+
+    A6_Chr1_4   Chr1    AT1G14070   BraA06g010220.3C
+    A6_Chr1_4   Chr1    AT1G14080   BraA06g010230.3C
+    A6_Chr1_4   Chr1    AT1G14100   x
+    A6_Chr1_4   Chr1    AT1G14110   x
+    A6_Chr1_4   Chr1    AT1G14120   x
+    A6_Chr1_4   Chr1    AT1G14130   BraA06g010280.3C
+    A6_Chr1_4   Chr1    AT1G14140   x
+    A6_Chr1_4   Chr1    AT1G14150   x
+    A6_Chr1_4   Chr1    AT1G14160   x
+    A6_Chr1_4   Chr1    AT1G14170   x
+    A6_Chr1_4   Chr1    AT1G14180   x
+    A6_Chr1_4   Chr1    AT1G14185   x
+
+The abc\_synteny.chains.passed file holds the ids of set of passed
+chains with collinear pairs greater than 6:
+
+    A7_Chr1_1
+    A8.r_Chr1_1
+    A9.r_Chr1_1
+    A2_Chr1_1
+    A6_Chr1_1
+    A7.r_Chr1_1
+
+The abc\_synteny.all.chains file holds the all chains identified from
+DAGchainer:
+
+    A1_Chr1_1   A1  BraA01g026830.3C    17161339    17161504    Chr1    AT1G56580.1 21198405    21198568    2.180000e-109   50
+    A1_Chr1_1   A1  BraA01g026890.3C    17191267    17191318    Chr1    AT1G57550.1 21312544    21312593    3.270000e-24    40
+    A1_Chr1_1   A1  BraA01g026900.3C    17196325    17196618    Chr1    AT1G57610.3 21337612    21337818    8.840000e-106   84
+
+SyntenyLink\_mbp
+----------------
+
+This program identifies the main breakpoints of translocations in the
+abc\_synteny.success.colinear file. It is is designed to detect
+fractionation gaps larger than a gap threshold. Synteny blocks capped by
+two breakpoints are grouped as a "super-synteny block" where gaps
+smaller than the gap threshold could exist within the super block. The
+input to this algorithm includes two parameters gap threshold, a minimum
+block length, and a data file syntelog table generated from Step 2,
+where collinear blocks are placed into the chromosomes of the organism.
+Next, the gene density of super-synteny blocks in each chromosome is
+calculated. This algorithm disregards densities below threshold density
+value. We selected a threshold density value of 0.1 in here. The
+chromosomes with gene density greater than 0.1 are ranked by the density
+of the blocks within the chromosome and assigned into candidate 𝑚
+subgenomes, where 𝑚 denotes the ploidy level or the number of subgenomes
+to be reconstructed.
+
+\- Usage No input parameters or read in data files.
+
+Note: You need to select the most suitable gap threshold and minimum
+block length for your data by running the gap\_threshold\_selection.py
+and minimum\_block\_length\_selection.py scripts.
+
+Here is a typical parameter setting for obtaining optimal gap threshold
+and block length values for you data:
+
+    $ python3 gap_threshold_selection.py -i abc_synteny.success.colinear
+
+Then get the optimal gap threshold value from the console and run the
+following command:
+
+    $ python3 minimum_block_length_selection.py -i abc_synteny.success.colinear -g <output gap threshold value>
+
+\- Output Optimal gap threshold and block length values are printed to
+the console.
+
+\- Output The execution of SyntenyLink\_mbp outputs two output files:
+Super\_synteny\_block\_output.xlsx and
+Super\_synteny\_bl\_sub\_placement\_density.xlsx.
+
+The Super\_synteny\_block\_output.xlsx file holds the generated
+super-synteny blocks before removing the noise taking density threshold
+into account:
+
+    Block no.   Block_start Block_end   Row start # Row end #   # genes in block    N1  N1.r    N2  N2.r    N3  N3.r    N4  N4.r    N5  N5.r    N6  N6.r    N7  N7.r    N8  N8.r
+    1   AT1G01010   AT1G01560   0   57  58  0.0 0.0 0.0 0.3684210526315789  0.0 0.6666666666666666  0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
+    2   AT1G01570   AT1G02205   58  123 66  0.0 0.0 0.43283582089552236 0.0 0.0 0.4626865671641791  0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
+    3   AT1G02210   AT1G14900   124 1491    1368    0.005113221329437546    0.0 0.3951789627465303  0.005843681519357195    0.3915266617969321  0.0 0.0577063550036523  0.1891891891891892  0.0 0.0 0.07596785975164354 0.26004382761139516 0.0 0.0 0.0 0.0
+
+The Super\_synteny\_bl\_sub\_placement\_density.xlsx file holds the
+subgenome placements of super-synteny blocks based on gene density,
+after removing the noise taking density threshold into account:
+
+    Block no.   Block_start Block_end   Row start # Row end #   # genes in block    N1  N1.r    N2  N2.r    N3  N3.r    N4  N4.r    N5  N5.r    N6  N6.r    N7  N7.r    N8  N8.r    N9  N9.r    Non_zero    subgenome1  subgenome2  subgenome3
+    1   AT1G01010   AT1G11860   0   1161    1162    0   0   0   0   0   0   0   0   0.544358312 0   0   0   0   0   0.354005168 0.308354866 0   0   3   N5  N8  N8.r
+    2   AT1G11870   AT1G13420   1162    1328    167 0   0   0   0   0   0   0   0   0   0.523809524 0   0   0   0   0.386904762 0.43452381  0   0   3   N5.r    N8.r    N8
+    3   AT1G13430   AT1G19470   1329    1957    629 0   0   0   0   0   0   0   0   0.598412698 0   0   0   0   0   0.33968254  0.422222222 0   0   3   N5  N8.r    N8
+
+SyntenyLink\_wg
+---------------
+
+This program uses a weighted direct graph to dynamically link blocks
+produced from Step 3 that are most likely to be in a subgenome using a
+combined information of fractionation and substitution patterns as well
+as continuity of gene chains. This algorithm takes two input parameters
+a1 and a2.
+
+\- Usage Reads in two data files: abc\_synteny.all.chains and
+abc\_blastn.blast. Two input parameters:a1 and a2.
+
+Note: You need to generate abc\_blastn.blast file by running nucleotide
+blast using cds fasta files of baseline species and polyploid species of
+interest.
+
+The abc\_blastn.blast file holds blast hits after running blastn between
+baseline species and polyploid species of interest:
+
+    BraA01g000010.3C    AT1G43860.1 78.706  371 25  6   239 558 692 1059    7.94e-91    335
+
+Here is a typical parameter setting for generating the abc\_blastn.blast
+file:
+
+    $ makeblastdb -in ref_cds.fa -dbtype prot -out ref_cds
+
+    $ blastall -i query_cds.fasta -p blastn -d ref_cds -m 8 -e 1e-5 -F F -v 5 -b 5 -o abc_blastn.blast -a 4
+
+Note: You need to select the most suitable a1 and a2 parameter values
+for your data, which accounts for adjusting weights of the graph.
+
+\- Output The execution of SyntenyLink\_wg outputs number of output
+files matching the number of subgenomes in the species of interest:
+Super\_synteny\_graph\_nodes\_sub{k}.xlsx. There will be m number of
+files. k represents the corresponding subgenome number.
+
+The Super\_synteny\_graph\_nodes\_sub{k}.xlsx file holds the nodes
+belong to each subgenome after traversing the graph:
+
+    Row start # Row end #   subgenome1  subgenome2  subgenome3
+    0   123 N10.r   N9  N8.r
+    124 698 N10 N8.r    N9.r
+    699 1029    N6  N9.r    N8.r
+
+SyntenyLink\_sb
+---------------
+
+This program retrieves genes "hidden" in small blocks that are missed in
+previous steps. Speciï¿¿cally, we consider the chromosome blocks with
+densities lower than d\<8= that are removed in previous steps in the
+generation of small blocks because these blocks may be more likely to
+exhibit ï¿¿ipping of synteny blocks between the forward and reverse
+strands resulting from segmental duplication. Following the generation
+of small blocks, small synteny blocks are incorporated into the
+corresponding super-synteny blocks, taking into account the subgenome
+placement of the super-synteny blocks as a reference.
+
+\- Usage No read in files Three input parameters: ws1, ws2 and ws3.
+
+Note: You need to select the most suitable window size parameters for
+subgenome1, subgenome2 and subgenome3.
+
+\- Output The execution of SyntenyLink\_sb outputs number of output
+files: abc\_synteny\_chromosome\_names.success.colinear.xlsx,
+subgenome\_placement\_blocks.all.xlsx and
+subgenome\_placement\_blocks.all\_sub{k}.xlsx; There will be m number of
+files. k represents the corresponding subgenome number.
+
+The abc\_synteny\_chromosome\_names.success.colinear.xlsx file holds the
+replaced gene id\'s with there corresponding chromosome names which they
+bellong to:
+
+    Row start # Row end #   subgenome1  subgenome2  subgenome3
+    0   123 N10.r   N9  N8.r
+    124 698 N10 N8.r    N9.r
+    699 1029    N6  N9.r    N8.r
+
+The subgenome\_placement\_blocks.all.xlsx file holds the final placement
+of genes in subgenomes in step 5:
+
+    Row start # Row end #   subgenome1  subgenome2  subgenome3
+    0   123 N10.r   N9  N8.r
+    124 698 N10 N8.r    N9.r
+    699 1029    N6  N9.r    N8.r
+
+The subgenome\_placement\_blocks.all\_sub{k}.xlsx file holds the optimal placement of genes in subgenomes for each subgenome. Ex: subgenome\_placement\_blocks.all\_sub1.xlsx::
+
+:   Row start \# Row end \# subgenome1 subgenome2 subgenome3 0 75 N10.r
+    N9 N8.r 76 78 N10.r N9.r N8.r 79 123 N10.r N9 N8.r 124 698 N10 N9.r
+    N8.r
+
+SyntenyLink\_mn
+---------------
+
+Aimed to optimize the subgenome placements for each block (a row in the
+subgenome matrix) based on neighbourhood considerations in a subgenome,
+maximize the number of continuous blocks/rows from the same chromosome.
+The algorithm utilizes two window sizes, up and down, which determine
+the number of blocks above and below a given block, to define a
+neighbourhood.
+
+\- Usage No read in files Two input parameters: wup wdwn.
+
+Note: You need to select the most suitable window up and window down
+parameters.
+
+\- Output The execution of SyntenyLink\_sb outputs number of output
+files: Final\_subgenome\_placement\_result.xlsx, Final\_result.xlsx.
+
+The Final\_subgenome\_placement\_result.xlsx file holds the final
+placement of chromosome blocks in subgenomes:
+
+    Row start # Row end #   subgenome1  subgenome2  subgenome3
+    0   75  N10.r   N9  N8.r
+    76  78  N10.r   N9.r    N8.r
+    79  123 N10.r   N9  N8.r
+    124 698 N10 N9.r    N8.r
+    699 1035    N6  N9.r    N8.r
+    1036    1036    N6  N9  N8.r
+
+The Final\_result.xlsx file holds the final placement of genes inside
+blocks in subgenomes:
+
+    subgenome1  subgenome2  subgenome3
+    BraA10g000880.3C    x   x
+    BraA10g000870.3C    x   x
+    BraA10g000860.3C    x   x
+    BraA10g000850.3C    x   x
+    BraA10g000830.3C    BraA09g065940.3C    x
+    BraA10g000820.3C    x   x
+    x   BraA09g065960.3C    x
+    BraA10g000790.3C    x   x
+    BraA10g000780.3C    BraA09g065970.3C    x
+    BraA10g000770.3C    BraA09g065980.3C    x
+
+main\_script
+------------
+
+This holds the main script that runs all the above scripts in order. It
+takes in the following parameters: -i input\_file -g gap\_threshold -m
+minimum\_block\_length -n number\_of\_subgenomes -gt ground\_truth\_file
+-c chains\_file -bl blastn\_file -a1 adjusment1 -a2 adjusment2 -ws1
+window\_size\_subgenome1 -ws2 window\_size\_subgenome2 -ws3
+window\_size\_subgenome3 -wup window\_up -wdwn window\_down
+
+Note: Before running main\_script.py, you need to run the
+SyntenyLink\_bf.pl and SyntenyLink\_st.pl scripts
+
+\- Output The execution of main\_script outputs all the output files of
+the above scripts in the same directory as the input file.
+
+SyntenyLink\_acc
+----------------
+
+This holds the script that calculates the accuracy of the final
+placement of genes in subgenomes when there is a ground truth file.
+
+\- Usage One read in file: abc\_groundtruth.xlsx
+
+\- Output The execution of SyntenyLink\_sb prints the subgenome
+placemnet accuracy of each subgenome
+
+> Exact match percentage for subgenome1: 84.15% Exact match number for
+> subgenome1: 11490 Missing genes for subgenome1: 2164 Exact match
+> percentage for subgenome2: 61.61% Exact match number for subgenome2:
+> 6062 Missing genes for subgenome2: 3778 Exact match percentage for
+> subgenome3: 65.30% Exact match number for subgenome3: 5635 Missing
+> genes for subgenome3: 2995
+
+:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -
+Example