Multivariate Peaks-Over-Threshold (POT) Modelling of Nonstationary Air Pollution Concentration Data

PGR-P-638

Key facts

Type of research degree: PhD
Application deadline: Ongoing deadline
Country eligibility: International (open to all nationalities, including the UK)
Funding: Competition funded
Supervisors: Dr Georgios Aivaliotis and Dr Leonid Bogachev
Additional supervisors: Dr He Wang (School of Computing)
Schools: School of Computing, School of Mathematics
Research groups/institutes: Modern applied statistics, Statistical methodology and probability, Statistics
<h2 class="heading hide-accessible">Summary</h2>

The project aims to extend the peaks-over-threshold (POT) method of statistical modelling of extreme values to incorporate multiple observables (e.g. different air pollutants) when the data is nonstationary due to a changing environment. Computation will be based on MCMC simulations to obtain posterior estimates of the model parameters. Pre-processing of the input data is likely to require dimension reduction, for which modern machine learning techniques are expected to be crucial.

<h2 class="heading hide-accessible">Full description</h2>

<p>The peaks-over-threshold (POT) method is the preferred modern approach to analysing extreme values in a time series, because it makes better use of the available information than the classic block-maxima method (which uses only one maximum value in each block, e.g. each year). Moreover, in many applications the impact of extremes is realised through a few moderately large values rather than through a single highest maximum.</p> <p>Threshold exceedances approximately follow a generalised Pareto distribution (GPD) with two parameters (scale and shape), which are constant if the data is stationary (i.e. the observed process is in statistical equilibrium). However, in many practical situations, including air pollution, the parameters of the system are likely to change significantly with time. Following Davison &amp; Smith (1990), threshold exceedances in non-stationary data should be modelled by treating the GPD parameters as functions of (time-dependent) covariates (e.g. weather and traffic conditions for air pollutants). However, the Davison-Smith regression model is not threshold stable, which means that the model parameters have to be re-estimated for every new threshold (which may need to vary with time). Recently, Gyarmati-Szabo, Bogachev and Chen (2017) proposed a novel model for non-stationary POT which is threshold stable. This has strong potential to improve dramatically the computational efficiency of the POT model, making it a versatile and powerful tool for dynamic estimation and prediction of extremes. In particular, this approach may serve as the basis for a semi- or fully automated computational tool designed for efficient on-line estimation and accurate prediction of future extreme events. 
Owing to the property of threshold stability, such methods will work efficiently with variable threshold selection.</p> <p>The present project will aim to develop a more general methodology for the joint modelling of several observables such as different air pollutants (e.g. NO<sub>2</sub>, NO, O<sub>3</sub>), which are highly correlated due to complex photochemical reactions in the atmosphere in the presence of sunlight. The principal innovation to be achieved is the design of a suitable multivariate POT model for non-stationary data that preserves the property of threshold stability. Data analysis based on such a model will involve Markov chain Monte Carlo (MCMC) simulations to obtain posterior distributions of the model parameters. Due to the increased computational load, pre-processing of the input data may require dimension reduction, for which modern machine learning techniques are expected to be crucial.</p> <p><strong>References</strong></p> <ol> <li>Beirlant, J., Goegebeur, Y., Teugels, J. and Segers, J. <em>Statistics of Extremes: Theory and Applications</em>. Wiley, 2004, <a href="https://doi.org/10.1002/0470012382">https://doi.org/10.1002/0470012382</a></li> <li>Davison, A.C. and Smith, R.L. Models for exceedances over high thresholds. <em>Journal of the Royal Statistical Society, Ser. B</em> <strong>52</strong> (1990), 393&ndash;442, <a href="http://www.jstor.org/stable/2345667">http://www.jstor.org/stable/2345667</a></li> <li>Gyarmati-Szabo, J., Bogachev, L.V. and Chen, H. Nonstationary POT modelling of air pollution concentrations: Statistical analysis of the traffic and meteorological impact. <em>Environmetrics</em> <strong>28</strong> (2017), no. 5, Paper e2449, 15 pp, <a href="https://doi.org/10.1002/env.2449">https://doi.org/10.1002/env.2449</a></li> </ol> <h4>Potential for high impact outcome</h4> <p>Improving air quality is one of the key objectives of current governmental policies and academic research in environmental science. 
The project has strong potential to involve collaboration with external organisations, such as Leeds City Council, DEFRA and the Environment Agency. The project is expected to deliver significant results which may be instrumental for dynamic estimation and prediction of future extreme events in air pollution.</p> <h4>Training</h4> <p>This project will be supervised jointly by the Department of Statistics and the School of Computing at Leeds. It also has strong potential to involve collaboration with external organisations such as the Met Office. Supervision will involve weekly meetings between the supervisors and the student. Full training in the related disciplines and skills will be provided through taught courses and hands-on tuition. In particular, the student will have access to a broad spectrum of training workshops offered by the Faculty, covering an extensive range of training in theory development, numerical modelling and data analysis.</p> <h4>Student profile</h4> <p>The successful PhD candidate should have a solid background in mathematics and statistics, with a strong interest in and a flair for statistical modelling of extreme values. An appreciation of the complexity of modelling air pollution concentrations would be an advantage, as would a sound grounding in multivariate statistical analysis and Bayesian statistics. Key skills required for the project include competent use of R and experience with programming and statistical computing in general, including MCMC simulations.</p>
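<p>To make the POT and MCMC ingredients mentioned above more concrete, here is a minimal, purely illustrative sketch. It is not the project's methodology: it covers only the stationary, univariate case, uses simulated data rather than real pollutant concentrations, and assumes flat priors on the log-scale and the shape. Python is used only to keep the example self-contained (the project itself names R), and all function names (<code>gpd_sample</code>, <code>gpd_loglik</code>, <code>metropolis_gpd</code>, <code>excesses_over</code>, <code>scale_at_threshold</code>) are hypothetical. The sketch fits a GPD to excesses over an empirical 90% quantile threshold by random-walk Metropolis, and also encodes the standard stationary GPD relation under a change of threshold (shape unchanged, scale shifted by the shape times the threshold increase).</p>

```python
import math
import random


def gpd_sample(n, sigma, xi, rng):
    """Draw n GPD(sigma, xi) variates by inverse-transform sampling (xi != 0)."""
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]


def excesses_over(data, u):
    """Return the excesses x - u of all observations above the threshold u."""
    return [x - u for x in data if x > u]


def gpd_loglik(excesses, sigma, xi):
    """GPD log-likelihood of the excesses; -inf outside the parameter support."""
    if sigma <= 0.0:
        return -math.inf
    total = -len(excesses) * math.log(sigma)
    for y in excesses:
        z = 1.0 + xi * y / sigma
        if z <= 0.0:
            return -math.inf
        if abs(xi) < 1e-9:
            total -= y / sigma  # exponential limit as xi -> 0
        else:
            total -= (1.0 / xi + 1.0) * math.log(z)
    return total


def metropolis_gpd(excesses, n_iter, step, rng):
    """Random-walk Metropolis on (log sigma, xi).

    Assumption: flat priors on log sigma and xi, so the acceptance ratio
    reduces to the likelihood ratio for this symmetric proposal.
    """
    log_sigma = math.log(sum(excesses) / len(excesses))  # crude starting value
    xi = 0.1
    current = gpd_loglik(excesses, math.exp(log_sigma), xi)
    chain, accepted = [], 0
    for _ in range(n_iter):
        prop_ls = log_sigma + rng.gauss(0.0, step)
        prop_xi = xi + rng.gauss(0.0, step)
        proposed = gpd_loglik(excesses, math.exp(prop_ls), prop_xi)
        if math.log(1.0 - rng.random()) < proposed - current:
            log_sigma, xi, current = prop_ls, prop_xi, proposed
            accepted += 1
        chain.append((math.exp(log_sigma), xi))
    return chain, accepted / n_iter


def scale_at_threshold(sigma_u, xi, u, u_new):
    """Stationary GPD scale at a higher threshold u_new; the shape xi is unchanged."""
    return sigma_u + xi * (u_new - u)


# Simulated stand-in for a concentration series (not real data).
rng = random.Random(0)
data = gpd_sample(5000, 1.0, 0.2, rng)

# Threshold: empirical 90% quantile of the simulated series.
u = sorted(data)[int(0.9 * len(data))]
excesses = excesses_over(data, u)

chain, acc_rate = metropolis_gpd(excesses, 3000, 0.1, rng)
kept = chain[1000:]  # discard the first 1000 draws as burn-in
sigma_mean = sum(s for s, _ in kept) / len(kept)
xi_mean = sum(x for _, x in kept) / len(kept)

print("threshold:", u, "number of excesses:", len(excesses))
print("acceptance rate:", acc_rate)
print("posterior means (scale, shape):", sigma_mean, xi_mean)
```

<p>In a real analysis one would run several chains, check convergence and choose priors with care. In the non-stationary setting described above, the constant scale and shape would be replaced by functions of time-dependent covariates.</p>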

<h2 class="heading">How to apply</h2>

<p>Formal applications for research degree study should be made online through the <a href="https://www.leeds.ac.uk/research-applying/doc/applying-research-degrees">University&#39;s website</a>. Please state clearly in the Planned Course of Study that you are applying for <em><strong>PHD Statistics FT</strong></em>, and in the research information section that the research degree you wish to be considered for is <em><strong>Multivariate Peaks-Over-Threshold (POT) Modelling of Nonstationary Air Pollution Concentration Data</strong></em>, naming <span class="underline_text"><a href="https://eps.leeds.ac.uk/maths/staff/4008/dr-leonid-bogachev">Dr Leonid Bogachev</a></span> as your proposed supervisor.</p> <p>If English is not your first language, you must provide evidence that you meet the University&#39;s minimum English language requirements (below).</p>

<h2 class="heading heading--sm">Entry requirements</h2>

Applicants to research degree programmes should normally have at least a first class or an upper second class British Bachelors Honours degree (or equivalent) in an appropriate discipline.

<h2 class="heading heading--sm">English language requirements</h2>

The minimum English language entry requirement for postgraduate research study is an IELTS score of 6.0 overall with at least 5.5 in each component (reading, writing, listening and speaking), or equivalent. To be valid, the test must be dated within two years of the start date of the course.

<h2 class="heading">Funding on offer</h2>

<p><strong>Self-funded or externally sponsored students are welcome to apply.</strong></p> <p><strong>UK students</strong> &ndash; The <a href="https://phd.leeds.ac.uk/funding/209-leeds-doctoral-scholarships-2022">Leeds Doctoral Scholarships</a>, <a href="https://phd.leeds.ac.uk/funding/198-akroyd-and-brown-scholarship-2022">Akroyd &amp; Brown</a>, <a href="https://phd.leeds.ac.uk/funding/199-frank-parkinson-scholarship-2022">Frank Parkinson</a> and <a href="https://phd.leeds.ac.uk/funding/204-boothman-reynolds-and-smithells-scholarship-2022">Boothman, Reynolds &amp; Smithells</a> Scholarships are available to UK applicants. The <a href="https://phd.leeds.ac.uk/funding/60-alumni-bursary">Alumni Bursary</a> is available to graduates of the University of Leeds.</p> <p><strong>Non-UK students</strong> &ndash; The <a href="https://phd.leeds.ac.uk/funding/48-china-scholarship-council-university-of-leeds-scholarships-2021">China Scholarship Council - University of Leeds Scholarship</a> is available to nationals of China. The <a href="https://phd.leeds.ac.uk/funding/73-leeds-marshall-scholarship">Leeds Marshall Scholarship</a> is available to US citizens. The <a href="https://phd.leeds.ac.uk/funding/60-alumni-bursary">Alumni Bursary</a> is available to graduates of the University of Leeds.</p>

<h2 class="heading">Contact details</h2>

<p>For general enquiries about applications, contact our admissions team: <a href="mailto:maps.pgr.admissions@leeds.ac.uk">maps.pgr.admissions@leeds.ac.uk</a>, +44 (0)113 343 5057.</p> <p>For questions about the research project, please contact Dr Leonid Bogachev: <a href="mailto:L.V.Bogachev@leeds.ac.uk">L.V.Bogachev@leeds.ac.uk</a>, +44 (0)113 343 4972.</p>


<h3 class="heading heading--sm">Linked research areas</h3>