Corpora

The Limerick Corpus of Irish English

The Limerick Corpus of Irish English (LCIE) was developed by the University of Limerick in conjunction with Mary Immaculate College, Limerick. This one-million-word spoken corpus of Irish English discourse includes conversations recorded in a wide variety of mostly informal settings throughout Ireland. Its design matrix, based on McCarthy (1998), covers a range of speaker relationships (from intimate to professional) across a range of interactional contexts and speech genres. It builds a sociolinguistic classification scheme around contextual variables such as speaker relationship, context and task at the time of speaking.

The Limerick-Belfast Corpus of Academic Spoken English

The Limerick-Belfast Corpus of Academic Spoken English comprises 500,000 words of academic spoken data, collected in different university pedagogic settings in Belfast and Limerick. The copyright for this corpus is now held by Cambridge University Press.

CLAS corpus project

In collaboration with Cambridge University Press, Cambridge ESOL (Cambridge University English examinations syndicate) and Shannon College of Hotel Management, one million words of data were recorded to form the Cambridge Limerick and Shannon (CLAS) corpus. The data feed into the wider Cambridge University-led English Profile project, which aims to profile competencies in English empirically, in line with the Common European Framework of Reference for Languages (CEFR).

NUCASE (Ongoing)

The Newcastle Corpus of Academic Spoken English (NUCASE) is a 1.5-million-word corpus of spoken data recorded in a variety of academic settings at the University of Newcastle. It is a collaborative project between IVACS partners Newcastle University and Mary Immaculate College. The venture has been co-funded by Cambridge University Press, which holds copyright over the data.