Job description
Summer Internship - Dataset Machine Learning for Digitisation of Collections
We propose a project scoped towards creating Kew's first image dataset for developing and benchmarking Artificial Intelligence (AI) applications. Natural history collections, like Kew's herbarium collection, contain a wealth of primary data vital to solving the current biodiversity crisis. AI methods, like deep learning, can help researchers use this data to conduct large-scale studies and generate novel insights, for example, by counting flower buds to model the impact of climate change or mining potentially undescribed species.
Publicly available datasets are the basis of machine learning research. While there are popular repositories for sharing interesting new datasets, benchmarking of new AI methods is concentrated around a few standard datasets. However, good performance against these standard benchmarks does not always indicate good performance for domain-specific applications. Although some large herbarium image collections are available through competitions, the growing interest in applying AI to natural history collections has created a need for more relevant datasets.
The mass digitisation of Kew's plant and fungal specimens provides the perfect opportunity to create expert-verified high-quality benchmarking datasets that relate to real-world problems in plant science.
- Employee Benefits - RBG Kew.pdf (PDF, 1924.01kb)
- Summer intern Job Profile.pdf (PDF, 137.05kb)
- P3 summer project details.pdf (PDF, 72.32kb)
More details
Our overarching aim is to assemble a dataset of 20,000 images from Kew's digitised collections to benchmark AI species identification. These images will cover the 100 families in the Kew Tropical and soon-to-be-published Temperate Plant Families Identification Handbooks. We propose a study to pilot the collation of this dataset for a single taxonomic order, using the funds to hire and train an intern to label digitised type specimen images with key identification characters. By running this project, we will gain the knowledge and experience required to complete the full benchmark dataset, thereby building the reputation of Kew's data within the machine learning community and laying the foundations for successful applications of AI within Kew.
Please see the attached Summer internship job profile together with the project-specific requirements.
The Royal Botanic Gardens, Kew (RBG Kew) is a leading plant science institute, UNESCO World Heritage Site, and major visitor attraction. Our mission is to understand and protect plants and fungi for the well-being of people and the future of all life on Earth.
We are working to end the unprecedented extinction crisis and to help create a world where nature is protected, valued by all and managed sustainably. We will achieve these goals by drawing on our leading scientific research, unrivalled collections of plants and fungi, global network of partners, inspirational gardens at Kew and Wakehurst, and our 260 years of history.
Join us on our journey as protectors of the world’s plants and fungi.
The salary will be £19,582 per annum (pro rata).
Our fantastic benefits package includes opportunities for continuous learning, a generous annual leave entitlement, flexible working to help you maintain a healthy work-life balance, an Employee Assistance Programme and other wellbeing support such as a cycle-to-work scheme and discounted gym membership. We also offer a competitive pension, an employee discount scheme and free entry into a wide range of national museums and galleries, as well as access to our own beautiful gardens at Kew and Wakehurst.
If you are interested in this position, please submit your application through the online portal by clicking “Apply for this job”.
We are committed to equality of opportunity and welcome applications from all sections of the community. We guarantee to interview all disabled applicants who meet the essential criteria for the post.
No agencies please.