Changes and Clarifications
==========================

20060618
Belatedly, added in Oxford LD-based genetic map positions. The file is now
rather large, as it includes the Affy 100K SNPs. End of chromosome 1 fixed
(extrapolation for the last two markers was very poor).

20051010
Added D6S226, D6S224, D6S225, D6S1848 and D6S220, at the request of Jo
Steele. D6S224 seems to be a cryptic duplicate (nested primers) of D6S1717,
but is given a separate entry.

20050607
BLASTed GATA156F11 and D19S1179.

20050517
Finished including Build 35.1 physical positions, and reinterpolated the
linkage map. A few markers moved compared to previous positions: large
changes reflected the absence of information about D2S2738, D9S1844, D9S19,
UT792 (primers map to 3p, linkage map to 9) and AAT251, where the physical
position is inferred solely from the Marshfield map position. There were
126 local flips of order, usually between adjacent markers. The D17S793
duplication is due to the existence of 165xd4 and 165xd4A; the latter is a
different marker. D2S1240 moved back to chromosome 2, and D14S780 moved back
to chromosome 14, after visits to 21 and 22 respectively, based on Build
34.3 sequence data.

20050421
Started collating Build 35.1 physical positions: no great surprises, save
for chromosome 17 (where positions for D17S126-D17S928 all shifted 2-3 Mbp),
X (and Y). Ordering not greatly affected.

20050420
Nievergelt et al placed D5S1454 (ATA4F06) on chromosome 4 based on a BLAST
search. This marker is not placed on the most recent builds, and our linkage
data confirm it at its Marshfield position on chromosome 5 (between D5S433
and D5S2501). So inserted at 107100000 bp on chromosome 5, 116.7 Rutgers cM
from pter.

20050210
Harry Beeby points out some duplications still left after merging in of the
Rutgers map:

  Common Name  Alias1    Alias2
  -----------  ------    ------
  ATA10H11     D3S2409   D3S2384
  ATA28C05     ATA28C05  DXS6795
  ATA31E12     ATA31E12  DXS7126
  D12S1091     D12S1091  D12S1075
  D12S1341     D12S1341  D12S1330
  D19S178      .         .
  D19S393      D19S543   .
  D2S1337      D2S406    .
  D2S2735      .         .
  DXS6792      D5S2796   .         (on chr 5 at 65234477 bp)
  GATA109      D1S1585   D1S1728
  GATA112F02   D6S1270   D6S1045
  GATA119B03   D7S2200   D7S3047
  GATA186D06   DXS9907   .
  GATA31E08    D3S2390   .         (chromos different 3, X; dropped 3)
  GATA43F06    D2S1370   D3S2442   (SEE BELOW)
  GATA62F03    D9S2169   D4S2624
  GATA66B04    D19S714   D19S588
  GATA8B01     D20S201   .
  GGAT3F08     DXS9900   DXS6814   (pseudoautosomal map)

I have fixed these except where there are four _good_ aliases, where I left
the duplicates adjacent on the map.

20050204
How many mistakes are left to be made? Added extrapolation of the genetic
map outside included markers (20050124), which was not good on the
acrocentric chromosomes. Reverted to the original assumption that marker
coverage is complete.

20050124
Moved GATA164A09 to its correct location at 118 Mbp on X. Replaced some
pseudoautosomal markers' physical positions with their Build 35 positions
(so as to coordinate with recently BLASTed new markers).

20050124
Thomas Wienker has suggested that sex-averaged distances for X are hard to
interpret outside the pseudoautosomal regions. The I.cM are now female cM
for X. A separate file, pseudoautosomal_map.dat, contains a map of
pseudoautosomal markers. Added a few Xpter markers, notably TELA, DXYS201
and DXYS218.

20041118
SCA10 is at 44511678 bp on Build 35.1, while D22S532 is at 44443698 bp and
D22S1160 at 44749701 bp.

Merged in the Rutgers map, which includes a number of SNPs (TSC mapping
set). This entails the addition of 3 fields for the sex-averaged and
sex-specific linkage map positions from that pooled data analysis (Kong et
al 2004). The interpolated map position is now in Rutgers cM. There are a
few inconsistent positions (1% discrepancy):

  name1     name2       name3     chr  Our.pos   Rutgers.pos
  D16S3027  AFMa154wc9  D16S2622  16   3864420   4051157
  D18S1356  ATA33B11    .         18   42793019  45581487
  D9S1844   AFMb321zf1  .         9    42528253  43656049
  D3S4557   .           .         3    473473    456628
  D6S2441   ATA50H07    .         6    27737689  26007311

And the PI microsatellite has been placed on chr 14 with the gene.
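
The comparison behind the 20041118 table of inconsistent positions can be
sketched as follows. This is only an illustration, not the procedure actually
used for the map: it assumes that "1% discrepancy" means a relative
difference of more than 1% between the two physical positions, which the
entry does not state. The function names and the threshold parameter are
hypothetical; the five rows are copied from the table.

```python
# Rows copied from the 20041118 table: (name1, chr, Our.pos, Rutgers.pos), in bp.
ROWS = [
    ("D16S3027", "16", 3864420, 4051157),
    ("D18S1356", "18", 42793019, 45581487),
    ("D9S1844", "9", 42528253, 43656049),
    ("D3S4557", "3", 473473, 456628),
    ("D6S2441", "6", 27737689, 26007311),
]


def relative_discrepancy(ours, theirs):
    """Absolute difference as a fraction of the larger of the two positions."""
    return abs(ours - theirs) / max(ours, theirs)


def flag_discrepant(rows, threshold=0.01):
    """Return (name, chr, relative difference) for rows above the threshold.

    The 1% default is an assumed reading of the changelog entry.
    """
    flagged = []
    for name, chrom, ours, theirs in rows:
        rel = relative_discrepancy(ours, theirs)
        if rel > threshold:
            flagged.append((name, chrom, rel))
    return flagged


flagged = flag_discrepant(ROWS)
for name, chrom, rel in flagged:
    print(f"{name}\tchr {chrom}\t{100 * rel:.1f}% apart")
```

Under that assumed reading, all five listed markers exceed the threshold,
whereas a pair of positions differing by less than 1% would not be reported.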

20041111
A few duplications persist (Scott Gordon has pointed these out). Some of
these may represent known segmental duplications (duplicons):

D17S793 is given at 15.379 Mbp and 18.653 Mbp on Ensembl and NCBI, and is
not included on the Kong et al map. On 17p12, the Smith-Magenis
syndrome/dup(17)(p11.2-p11.2) duplication is 18.524-18.677 Mbp and
20.345-20.492 Mbp.

D7S804 was resolved using the Build 35.1 positioning, and noting that D7S805
is an alias.

There are remaining cases where I have left a duplication via an alias in
the "name3" column. In these cases, I believe the correct position is that
given where that name appears as "name1": D22S427, D11S1883, GATA43F06 (see
below), D4S393, D4S409.

20041101
Compared map to that of Kong et al 2004 (14759 markers), which lists many
marker duplications and unmappable markers. Added in additional chromosome
11p markers from the Sequana Therapeutics fine map of that region. Notable
changes (added aliases were not previously present in master map):

  D6S1270 duplicates D6S1045 (have same pos in master map anyway!)
  D9S2152 added as alias for D9S1116
  D10S1419 added as alias for D10S2470
  D13S784 added as alias for D13S1807
  D13S785 alias for D13S1811
  D13S789 alias for D13S1812

D10S1218, despite being present on the original Marshfield and DeCODE maps,
is identified as "unlinked to any chromosome".

There are 73 markers present on the master map that Kong et al label as
"physical position does not match linkage position":

  D1S1193  D1S396   D1S519   D1S2829  D1S3469  D1S179   D2S2982  D2S323
  D2S319   D2S262   D2S1251  D2S2241  D2S1776  D2S1245  D3S1211  D3S3719
  D3S4544  D4S1609  D2S2738  D4S1619  D4S1517  D4S2292  D4S2290  D4S1523
  D5S593   AC016604-5        D5S2034  D6S1689  D6S941   D6S262   D6S495
  D6S1693  D7S628   D7S460   D7S678   D7S1507  D7S2448  D7S1503  D8S1825
  D9S280   D9S779   D10S197  D10S589  D10S1141 D11S992  D11S1337 D11S1390
  D11S1284 D12S94   VWF      D12S58   D12S1074 D12S63   D14S582  D14S543
  D17S663  D17S1810 D17S799  D17S968  D18S1140 D18S975  D18S474  D18S68
  D18S1374 HRC.A    D21S210  D21S1913 D21S1408 D21S1245 DXS6807  DXS1036
  DXS997   DXS1049

The mean absolute difference between our interpolated linkage map position
and Kong et al is 1.1 cM. Therefore, they have been left in the map file.
The Kong et al positions are very close to the published Marshfield
positions, rather than the DeCODE positions.

20040804
Added 39 markers from Manuel Ferreira. These include DXS6792 on chr 5
(genotype data confirm this to be autosomal). A number turn out to be
"cryptic" aliases, ie the product of one completely contains that of the
other marker.

20040623
Have compared this map to that of Nievergelt et al 2004. They have BLASTed
the markers for which we interpolated physical positions based on the
Marshfield map. They could not find 429 of these. They note more markers on
chromosomes different to the original assignment, eg UT2548 (D11S1916) seems
to be on chr 2 at 151549427; UT5086 (D19S724) on chr 1 at 1117405; and so
on. However, I am distrustful of some of these, given that both DeCODE and
Marshfield, for example, agree that UT924 is D14S539, rather than D10S519
(as the physical maps have it). This gives us physical positions of an
additional 1000 markers.

  name1     name2       chr  chr2  build343
  D19S724   UT5086      1    19    11174025
  PLA2      SGC35515    1    12    184196629
  D20S159   UT1307      2    20    136667226
  D11S1916  UT2548      2    11    151549427
  GC        .           2    4     219331392
  D2S2394   AFMa101xg5  3    2     485608
  D10S1161  UT5819      3    10    153767481
  D9S765    UT1531      4    9     56824033
  D5S1454   ATA4F06     4    5     60766934
  UT2361    .           5    9     26276045
  D17S1288  ATA1H07     5    17    80540347
  D15S536   UT6886      5    15    110235845
  D12S814   UT5140      5    12    149902513
  D20S1142  GATA124A11  6    20    6399627
  D7S2249   D2S1249     7    2     31648936
  D21S2051  GATA116E08  8    21    36216937
  D1S460    AFM123yc5   8    1     127193369
  D8S1019   UT885       8    3     139929949
  D14S539   UT924       10   14    29105585
  FB7F11    WI-14125    10   18    67309958
  D11S1914  UT1607      11   12    28923393
  D18S966   ATA37G10    11   8     43770130
  D15S529   UT935       15   12    51424156
  D14S779   UT1888      16   14    3403730
  UT556     .           16   21    35082119
  UT1598    .           16   17    51240316
  UT18      .           17   2     120515
  PI        .           18   14    5534609
  D8S1013   UT5182      18   8     59779184
  D7S1525   UT7368      19   7     51729360
  D14S132   UT563       20   14    25872150
  D21S1249  UT1025      21   16    6991420
  D2S1240   UT5146      21   2     16907604
  D8S2318   GATA115F05  21   8     26770586
  D14S780   UT6047      22   14    14519674
  D2S1280   UT5116III   22   2     14611866
  UT597     .           22   14    14613571
  D11S1905  UT832       23   11    10553732
  D12S832   UT6574      23   12    45684466
  D9S757    UT764       23   9     133552054
  D2S1276   UT7691      23   2     138195102

20040618
Thomas Wienker pointed out that there are 100-odd markers from the
Marshfield and DeCODE maps that have not been included (many on chr 20):
these have been added. Several aliases were also fixed, notably IFNAR for
D21S2039, Mfd92 for F8VWF, 1GF1 for IGF1. A few physical map positions have
been updated (D1S443, GATA145F08, D1S2677, GATA153G01, D1S2879, D7S804,
D17S793, D21S2039).

20040511
AAC023 is at 67951716 bp. AGAT128 is at 62464943 bp. GTTT002 is at
155405903 bp. CATA002 is at 23707416 bp. GATA66B04 is an alias for GATA27C12
(D19S588): removed entry and added alias (Scott G).

20040429
D6S502 Build 34.3 position corrected and name entered as alias for
D6S500/GATA7B06 (Harry).

20040428
GGAAT1B07, Y-27H39, ATA10F11, GATA62F03 given Build 34.3 positions (Scott).
SRaP added as alias for TPO. GATA2A12 added at 882323 bp on X.

20040421
TAT024 is on chr 3 at 52269515, TAGA049 on chr 4 at 174665378, and alias of
D6S502 as GATA7G07 removed.
Build 34.3 positions of CTAT014, TTTTA002 and GATA143C02 added.

20040416
D16S2616 represented twice: Marshfield position at 11.46 cM used.

GATA119B03 has two positions in the database: Build 34.3 and HSC_TCAG (The
Center for Applied Genomics alternate build, usually only used now for chr
7): chose the former, as Ensembl no longer uses TCAG.

GATA129B03 represented twice: Marshfield position at 35.51 cM used.

GATA158H04 represented twice, also as g10693: latter removed.

Mfd238 is in fact D19S254, as given by the public databases (BLASTing the
primer gives 62358762 bp as location), and not D19S559 as given in the
Marshfield spreadsheet (SET13).
GATA29B01 is D19S589 as per public database rather than D19S254 (BLAST).
GATA23B01 is D19S586 as per public database rather than D19S589 (BLAST).
UT7544 is D19S559 as per public database rather than D19S246 (BLAST).
Mfd232 is D19S246 as per public database rather than D19S245 (BLAST).
And so on for chr 19 SET 11-13.

According to Marshfield, GATA7G07 *is* D6S502 on chromosome 6, but an
Ensembl search will instead give D8S1179 on chromosome 8.

TAT024 is on chr 4 in the Marshfield Set 51 spreadsheet and on chr 3 in the
Marshfield Set 52 spreadsheet. TAGA049 is on chr 15 in the Marshfield Set 51
spreadsheet and on chr 4 in the Marshfield Set 52 spreadsheet.

20040408
D3S2395 is localised to chr 12 at 8048872 bp, the same location as D12S397.

GATA43F06 is a chromosome 2 marker according to Marshfield, placed at 227.6
Build 32 Mbp and 231.3 Marshfield cM. According to most other sources, it is
on chromosome 3 at 161 Mbp. BLASTing the Marshfield primers maps it to 2.

20040407
Several compound names, eg MFD424-TTTA003, renamed to TTTA003 with Mfd424 as
alternate. D2S441 is an alias for D2S1779 (added). Several hundred
additional aliases added (including those used by Marshfield sets), with
reordering of fields in many records to have the D name as canonical name.

20040406
Added 11p13 markers from ESE2/3 list.
Note that D11S1392 and D11S2008 are identical (nested). Then refitted all
I.cM from the current I.bp, using a smaller alpha. This, as Andy B pointed
out, is a nicer fit at the telomeres.

20040405
Removed further duplicate records, thanks to Scott. Detected ~10 likely
aliases that are not known to databases. D1S2132, D14S539 interpolated
position changed manually based on an estimate using a knn algorithm
excluding the Marshfield position. Removed the D20S159 (UT1307) Build 34.3
position, since this is placed on chromosome 2 in the sequence map: RH and
Marshfield linkage information place it on 20q13.

20040401
Folded in newer Marshfield standard mapping set markers
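
The 20040405 entry mentions a knn estimate of a marker's interpolated
position made without its own Marshfield position. The sketch below shows
one simple form such an estimate could take: a plain k-nearest-neighbour
average of the genetic positions of the markers closest in bp. It is an
illustration under stated assumptions, not the code used for this map; the
function name, the choice of k, the unweighted mean and the marker values
(M1 to M4) are all hypothetical.

```python
def knn_estimate_cm(query_bp, markers, k=3, exclude=()):
    """Estimate a genetic position (cM) from the k markers nearest in bp.

    markers: list of (name, bp, cM) tuples on one chromosome.
    exclude: marker names to leave out, e.g. the marker being checked.
    """
    candidates = [
        (abs(bp - query_bp), cm)
        for name, bp, cm in markers
        if name not in exclude
    ]
    if len(candidates) < k:
        raise ValueError("not enough markers for the requested k")
    nearest = sorted(candidates)[:k]  # smallest physical distance first
    return sum(cm for _, cm in nearest) / k


# Hypothetical markers: (name, physical position in bp, genetic position in cM).
MARKERS = [
    ("M1", 1_000_000, 1.0),
    ("M2", 2_000_000, 2.0),
    ("M3", 3_000_000, 3.5),
    ("M4", 10_000_000, 12.0),
]

# Estimate at 2.4 Mbp from the two nearest markers (M2 and M3).
estimate = knn_estimate_cm(2_400_000, MARKERS, k=2)

# Same query with M2 left out, so M3 and M1 are used instead.
estimate_without_m2 = knn_estimate_cm(2_400_000, MARKERS, k=2, exclude=("M2",))
print(estimate, estimate_without_m2)
```

Leaving a marker out of its own estimate, as the exclude argument does here,
is what allows the estimate to serve as a check on that marker's recorded
position.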