Key facts
- Type of research degree: PhD
- Application deadline: Ongoing deadline
- Project start date: Sunday 1 October 2023
- Country eligibility: International (open to all nationalities, including the UK)
- Funding: Non-funded
- Supervisors: Dr Sofya Titarenko
- Additional supervisors: Dr Laurent Noe
- Schools: School of Mathematics
- Research groups/institutes: Statistics
DNA sequencing and the subsequent analysis of biosequences have become core procedures in many areas of biological research, such as medical diagnosis, the development of targeted medicines, virology and forensic science. Examples of biosequence search include detecting mutations in human DNA in cancer studies and searching for matching patterns in microbial genomes. <br /> With an ever-increasing volume of data to process, the development of efficient search algorithms is of paramount importance.<br /> Seeding is a technique used extensively in biosequence alignment to speed up the search. Several seeding algorithms have been proposed over the years. The first seeds to be suggested were short contiguous matching patterns used within the seed-and-extend paradigm. The idea was later developed into spaced seeds (binary/ternary), which allow mismatches. <br /> Another technique uses k-mers, sets of short matching patterns of length k, which are often used in alignment-free methods. <br /> However, finding optimal seeds/k-mers, which maximise efficiency, is still an open topic. Efficiency is usually assessed as a compromise between an algorithm’s time complexity and its sensitivity (the ratio of correctly aligned sequences). Different studies suggest different methods for increasing efficiency. Often the selection is based on a particular probabilistic model and pre-defined properties of the seeds.<br /> In this project, the student will investigate the problem of generating optimal seeds/k-mers to improve the efficiency of biosequence search algorithms. The focus will be on developing a mathematical framework to investigate and demonstrate the efficiency of the proposed seed structures.<br />
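To make the terms above concrete, here is a minimal Python sketch of k-mer extraction, binary spaced-seed matching, and seed sensitivity. It is an illustration only, not the project's method. The function names (`kmers`, `seed_hits`, `hit_probability`) are hypothetical, and the sketch assumes ungapped alignments, a binary seed written as a string in which `1` means "must match" and `0` means "don't care", and an independent (Bernoulli) match model with match probability `p`. The sensitivity is computed by brute-force enumeration, so it is only practical for short alignment lengths.

```python
from itertools import product


def kmers(seq, k):
    """Return all overlapping substrings of length k (the k-mers of seq)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def seed_hits(a, b, seed):
    """Return offsets where an ungapped comparison of a and b matches
    at every '1' position of the binary seed ('0' = don't care)."""
    n = min(len(a), len(b))
    span = len(seed)
    care = [j for j, c in enumerate(seed) if c == "1"]
    return [
        i
        for i in range(n - span + 1)
        if all(a[i + j] == b[i + j] for j in care)
    ]


def hit_probability(seed, length, p):
    """Exact probability that the seed hits a random ungapped alignment of
    the given length, where each position matches independently with
    probability p. Enumerates all 2**length match/mismatch patterns."""
    span = len(seed)
    care = [j for j, c in enumerate(seed) if c == "1"]
    total = 0.0
    for pattern in product((0, 1), repeat=length):
        hit = any(
            all(pattern[i + j] for j in care)
            for i in range(length - span + 1)
        )
        if hit:
            matches = sum(pattern)
            total += p ** matches * (1 - p) ** (length - matches)
    return total


if __name__ == "__main__":
    print(kmers("ACGTA", 3))                          # ['ACG', 'CGT', 'GTA']
    # The two sequences differ only at position 3.
    print(seed_hits("ACGTACGT", "ACGAACGT", "1111"))  # [4]
    print(seed_hits("ACGTACGT", "ACGAACGT", "1101"))  # [1, 4]
    print(hit_probability("11", 3, 0.5))              # 0.375
```

In this toy comparison, the contiguous seed `1111` and the spaced seed `1101` differ only in the single don't-care position, which lets the spaced seed report an additional hit across the mismatch. Comparing `hit_probability` for seeds of equal weight (the same number of `1`s) is one simple way to explore the trade-off between seed structure and sensitivity described above.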
<p style="margin-bottom:11px">It has been shown in [1] that spaced seeds may be more efficient than contiguous seeds. Other seeding strategies, such as adaptive seeds and minimisers, have been suggested to speed up the search at the price of decreased sensitivity.</p> <p style="margin-bottom:11px">Seed generators, which produce seeds of a certain structure to satisfy given parameters, have been proposed in [2] and [3]. The properties of optimal seeds have been investigated in [4], where a list of optimal seeds for predefined structural properties has been suggested.</p> <p style="margin-bottom:11px">An example of investigating an optimal parameter for k-mers in error correction tools can be found in [5].</p> <p style="margin-bottom:11px">[1] Keich, U., Li, M., Ma, B., & Tromp, J. On spaced seeds for similarity search. <em>Discrete Applied Mathematics</em>. <a href="https://doi.org/10.1016/S0166-218X(03)00382-2" style="color:blue; text-decoration:underline">https://doi.org/10.1016/S0166-218X(03)00382-2</a></p> <p style="margin-bottom:11px">[2] Brejová, B., Brown, D. G., & Vinar, T. (2004). Optimal spaced seeds for homologous coding regions. <em>Journal of bioinformatics and computational biology</em>, <em>1</em>(4), 595–610. <a href="https://doi.org/10.1142/s0219720004000326" style="color:blue; text-decoration:underline">https://doi.org/10.1142/s0219720004000326</a></p> <p style="margin-bottom:11px">[3] Kucherov, G., Noé, L., & Roytberg, M. (2006). A unifying framework for seed sensitivity and its application to subset seeds. <em>Journal of bioinformatics and computational biology</em>, <em>4</em>(2), 553–569. <a href="https://doi.org/10.1142/s0219720006001977" style="color:blue; text-decoration:underline">https://doi.org/10.1142/s0219720006001977</a></p> <p style="margin-bottom:11px">[4] Titarenko, V., & Titarenko, S. PerFSeeB: Designing Long High-weight Single Spaced Seeds for Full Sensitivity Alignment with a Given Number of Mismatches. Preprint (Version 1), Research Square, 15 November 2021. <a href="https://doi.org/10.21203/rs.3.rs-1051543/v1" style="color:blue; text-decoration:underline">https://doi.org/10.21203/rs.3.rs-1051543/v1</a></p> <p style="margin-bottom:11px">[5] Sharma, A., Jain, P., Mahgoub, A. <em>et al.</em> Lerna: transformer architectures for configuring error correction tools for short- and long-read genome sequencing. <em>BMC Bioinformatics</em> <strong>23</strong>, 25 (2022). <a href="https://doi.org/10.1186/s12859-021-04547-0" style="color:blue; text-decoration:underline">https://doi.org/10.1186/s12859-021-04547-0</a></p>
<p>Formal applications for research degree study should be made online through the <a href="https://www.leeds.ac.uk/research-applying/doc/applying-research-degrees">University's website</a>. Please state clearly in the Planned Course of Study section that you are applying for <em><strong>PHD Statistics FT</strong></em>, and in the research information section that the research degree you wish to be considered for is <em><strong>Fast algorithms for efficient biosequence search</strong></em>, with <strong>Dr Sofya Titarenko</strong> as your proposed supervisor.</p> <p>If English is not your first language, you must provide evidence that you meet the University's minimum English language requirements (below).</p> <p><em>As an international research-intensive university, we welcome students from all walks of life and from across the world. We foster an inclusive environment where all can flourish and prosper, and we are proud of our strong commitment to student education. Across all Faculties we are dedicated to diversifying our community and we welcome the unique contributions that individuals can bring, and particularly encourage applications from, but not limited to, Black, Asian, people who belong to a minority ethnic community, people who identify as LGBT+ and people with disabilities. Applicants will always be selected based on merit and ability.</em></p> <p class="MsoNoSpacing">Applications will be considered on an ongoing basis. Potential applicants are strongly encouraged to contact the supervisors for an informal discussion before making a formal application. 
We also advise that you apply at the earliest opportunity, as the application and selection process may close early if we receive a sufficient number of applications or a suitable candidate is appointed.</p> <p>Please note that you must provide the following documents at the point you submit your application:</p> <ul> <li>Full transcripts of all degree study or, if you are in your final year of study, full transcripts to date</li> <li>Personal statement outlining your interest in the project</li> <li>CV</li> <li>Funding information, including any alternative sources of funding that you are applying for or whether you are able to pay your own fees and maintenance</li> </ul>
Applicants to research degree programmes should normally have at least a first class or an upper second class British Bachelors Honours degree (or equivalent) in an appropriate discipline. The criteria for entry for some research degrees may be higher; for example, several faculties also require a Masters degree. Applicants are advised to check with the relevant School prior to making an application. Applicants who are uncertain about the requirements for a particular research degree are advised to contact the School or Graduate School prior to making an application.
The minimum English language entry requirement for postgraduate research study is an IELTS score of 6.0 overall with at least 5.5 in each component (reading, writing, listening and speaking) or equivalent. The test must be dated within two years of the start date of the course in order to be valid. Some schools and faculties have a higher requirement.
<p style="margin-bottom:12px"><strong>Self-funded or externally sponsored students are welcome to apply.</strong></p> <p><strong>UK</strong> – The <a href="https://phd.leeds.ac.uk/funding/209-leeds-doctoral-scholarships-2022">Leeds Doctoral Scholarships</a> and <a href="https://phd.leeds.ac.uk/funding/234-leeds-opportunity-research-scholarship-2022">Leeds Opportunity Research Scholarship</a> are available to UK applicants (open from October 2023). The <a href="https://phd.leeds.ac.uk/funding/60-alumni-bursary">Alumni Bursary</a> is available to graduates of the University of Leeds.</p> <p><strong>Non-UK</strong> – The <a href="https://phd.leeds.ac.uk/funding/48-china-scholarship-council-university-of-leeds-scholarships-2021">China Scholarship Council - University of Leeds Scholarship</a> is available to nationals of China (open from October 2023). The <a href="https://phd.leeds.ac.uk/funding/73-leeds-marshall-scholarship">Leeds Marshall Scholarship</a> is available to support US citizens. The <a href="https://phd.leeds.ac.uk/funding/60-alumni-bursary">Alumni Bursary</a> is available to graduates of the University of Leeds.</p> <p><strong>Important:</strong> Any costs associated with your arrival at the University of Leeds to start your PhD, including flights, immigration health surcharge/medical insurance and visa costs, are <strong>not</strong> covered under these studentships.</p> <p>Please refer to the <a href="https://www.ukcisa.org.uk/">UKCISA</a> website for information regarding fee status for non-UK nationals.</p>
<p>For general enquiries about applications, contact our admissions team: <a href="mailto:maps.pgr.admissions@leeds.ac.uk">maps.pgr.admissions@leeds.ac.uk</a></p> <p>For questions about the research project, contact Sofya Titarenko: <a href="mailto:S.Titarenko@leeds.ac.uk">S.Titarenko@leeds.ac.uk</a></p>