# Hirsch's Citation Index and Limit Shape of Random Partitions

PGR-P-243

## Coronavirus information for applicants and offer holders

We hope that by the time you're ready to start your studies with us the situation with COVID-19 will have eased. However, please be aware that we will continue to review our courses and other elements of the student experience in response to COVID-19, and we may need to adapt our provision to ensure students remain safe. For the most up-to-date information on COVID-19, regularly visit our website, which we will continue to update as the situation changes: www.leeds.ac.uk/covid19faqs

## Key facts

- **Type of research degree:** PhD
- **Country eligibility:** International (open to all nationalities, including the UK)
- **Funding:** Competition funded
- **Supervisors:** Dr Leonid Bogachev and Dr Jochen Voss
- **Schools:** School of Mathematics
- **Research groups/institutes:** Statistics


<p>Integer partitions appear in numerous areas of mathematics and its applications, from number theory, algebra and topology to quantum physics, statistics, population genetics and IT. This classic research topic dates back to Euler, Cauchy, Cayley, Lagrange, Hardy and Ramanujan. The modern statistical approach is to treat partitions as a random ensemble endowed with a suitable probability measure.</p>
<p>The uniform (equiprobable) case is well understood, but more interesting models (e.g., with certain weights on the components) are mathematically more challenging. Hirsch [3] introduced his <em>h</em>-index to measure the quality of a researcher's output, defined as the largest integer <em>h</em> such that the person has <em>h</em> papers with at least <em>h</em> citations each.</p>
<p>The <em>h</em>-index has become quite popular (see, e.g., 'Google Scholar' or 'Web of Science'). Recently, Yong [6] proposed a statistical approach to estimating the <em>h</em>-index using a natural link with the theory of integer partitions [1]. Namely, if a researcher's citation counts per paper are viewed as the parts of an integer partition of the total number of citations, and the partition is identified with its Young diagram (each part drawn as a row of blocks), then the <em>h</em>-index is the side length of the largest <em>h</em> &times; <em>h</em> square that fits inside the diagram. If partitions of a given integer <em>N</em> are treated as random, with uniform distribution (i.e., all such partitions are assumed to be equally likely), then their Young diagrams have a "limit shape" (under a suitable scaling), first identified by Vershik [5].</p>
<p>Yong's idea is to use the limit shape to deduce certain statistical properties of the <em>h</em>-index. In particular, it follows that the "typical" value of Hirsch's index for someone with a large number <em>N</em> of citations should be close to 0.54 &radic;<em>N</em>. However, the assumption of a uniform distribution on partitions is of course rather arbitrary and needs to be tested statistically.
This issue is important since the limit shape may strongly depend on the distribution of partitions [2], which would also affect the asymptotics of Hirsch's index.</p>
<p>Thus, the idea of this project is to explore an extension of Yong's approach beyond the uniform case. To this end, one might try to apply Markov chain Monte Carlo (MCMC) techniques [4], whereby the uniform distribution may serve as an "uninformed prior". These and similar ideas have the potential to be extended beyond the citation topic, and may offer an interesting blend of theoretical and more applied issues, with a possible gateway to further applications of discrete probability and statistics in the social sciences.</p>
<p><strong>References</strong></p>
<ol>
<li>Andrews, G.E. and Eriksson, K. <em>Integer Partitions.</em> Cambridge Univ. Press, Cambridge, 2004.</li>
<li>Bogachev, L.V. Unified derivation of the limit shape for multiplicative ensembles of random integer partitions with equiweighted parts. <em>Random Struct. Algorithms,</em> <strong>47</strong> (2015), 227&ndash;266. (<a href="https://doi.org/10.1002/rsa.20540">doi:10.1002/rsa.20540</a>)</li>
<li>Hirsch, J.E. An index to quantify an individual's scientific research output. <em>Proc. Natl. Acad. Sci. USA,</em> <strong>102</strong> (2005), 16569&ndash;16572. (<a href="https://doi.org/10.1073/pnas.0507655102">doi:10.1073/pnas.0507655102</a>)</li>
<li><em>Markov Chain Monte Carlo in Practice</em> (W.R. Gilks, S. Richardson and D.J. Spiegelhalter, eds.). Chapman &amp; Hall/CRC, London, 1996.</li>
<li>Vershik, A.M. Asymptotic combinatorics and algebraic analysis. In: <em>Proc. Intern. Congress Math. 1994, vol. 2.</em> Birkh&auml;user, Basel, 1995, pp. 1384&ndash;1394. (<a href="https://doi.org/10.1007/978-3-0348-9078-6_133"><span id="doi-url">doi:10.1007/978-3-0348-9078-6_133</span></a>)</li>
<li>Yong, A. Critique of Hirsch's citation index: a combinatorial Fermi problem. <em>Notices Amer. Math. Soc.,</em> <strong>61</strong> (2014), 1040&ndash;1050. (<a href="https://doi.org/10.1090/noti1164">doi:10.1090/noti1164</a>)</li>
</ol>
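<p>For readers who would like to experiment, the following is a minimal Python sketch of the objects described above; it is an illustration only and not part of the project itself. It computes the <em>h</em>-index of a list of citation counts, evaluates the constant behind the "0.54" rule of thumb under the assumption that it comes from the diagonal of the known limit shape for uniform partitions, exp(&minus;&pi;x/&radic;6) + exp(&minus;&pi;y/&radic;6) = 1 with both axes scaled by &radic;<em>N</em>, and samples uniform random partitions by a standard counting recursion. A direct sampler is used here in place of MCMC purely for simplicity. The names <code>h_index</code>, <code>partition_table</code>, <code>random_partition</code> and <code>LIMIT_SHAPE_CONSTANT</code> are hypothetical.</p>

```python
import math
import random


def h_index(citations):
    """Largest h such that at least h papers have >= h citations each.

    Equivalently, the side of the largest square fitting inside the
    Young diagram whose rows are the citation counts.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Assumption: the "0.54" constant is where the diagonal x = y meets the
# limit shape exp(-pi*x/sqrt(6)) + exp(-pi*y/sqrt(6)) = 1.
LIMIT_SHAPE_CONSTANT = math.sqrt(6) * math.log(2) / math.pi


def partition_table(n):
    """table[m][k] = number of partitions of m into parts of size <= k."""
    table = [[0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        table[0][k] = 1  # the empty partition
    for m in range(1, n + 1):
        for k in range(1, n + 1):
            with_part_k = table[m - k][k] if m >= k else 0
            table[m][k] = table[m][k - 1] + with_part_k
    return table


def random_partition(n, table, rng):
    """Draw a partition of n uniformly at random (parts in non-increasing order)."""
    parts, m, k = [], n, n
    while m > 0:
        with_part_k = table[m - k][k] if m >= k else 0
        # Use a part equal to k with probability with_part_k / table[m][k].
        if rng.randrange(table[m][k]) < with_part_k:
            parts.append(k)
            m -= k
        else:
            k -= 1
    return parts


if __name__ == "__main__":
    n_citations, n_samples = 400, 200
    rng = random.Random(2024)
    table = partition_table(n_citations)
    mean_h = sum(
        h_index(random_partition(n_citations, table, rng))
        for _ in range(n_samples)
    ) / n_samples
    print("simulated mean h-index:", mean_h)
    print("limit-shape prediction:", LIMIT_SHAPE_CONSTANT * math.sqrt(n_citations))
```

<p>Running the script prints a simulated average next to the limit-shape prediction for the chosen <em>N</em>; replacing <code>random_partition</code> with a sampler for a non-uniform measure would be one way to see how the prediction shifts.</p>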

<p>Formal applications for research degree study should be made online through the <a href="https://www.leeds.ac.uk/research-applying/doc/applying-research-degrees">University's website</a>. Please state clearly in the Planned Course of Study section that you are applying for <em><strong>PHD Statistics FT</strong></em>, and in the research information section that the research degree you wish to be considered for is <em><strong>Hirsch's Citation Index and Limit Shape of Random Partitions</strong></em>, naming <a href="https://eps.leeds.ac.uk/maths/staff/4008/dr-leonid-bogachev">Dr Leonid Bogachev</a> as your proposed supervisor.</p>
<p>Successful candidates should have a good degree in mathematics and/or statistics. Programming skills to carry out MCMC simulations would be useful but not essential, as the appropriate training will be provided.</p>
<p>You will be based within a strong research group in <a href="https://eps.leeds.ac.uk/maths-statistics/doc/probability-financial-mathematics">Probability and Financial Mathematics</a>.</p>
<p>If English is not your first language, you must provide evidence that you meet the University's minimum English language requirements (below).</p>

Applicants to research degree programmes should normally have at least a first class or an upper second class British Bachelors Honours degree (or equivalent) in an appropriate discipline. The entry criteria for some research degrees may be higher; for example, several faculties also require a Masters degree. Applicants who are uncertain about the requirements for a particular research degree are advised to contact the relevant School or Graduate School before making an application.