CORAAL: Corpus of Regional African American Language

As part of the CORAAL project, we have generated various derivatives of the official corpus materials. Here you can download the output of a forced alignment of the official CORAAL transcripts: phone-level aligned TextGrids and the phonological model. The alignment was done using the Montreal Forced Aligner (MFA; McAuliffe et al. 2017).

For alignment, we used MFA Train & Align. In addition to phone-level aligned TextGrids, the MFA Train & Align option also creates a language model that can be used when aligning new AAL data. Initial alignment and the creation of the current CORAAL MFA Language Model were completed in November 2018 using CORAAL version 2018.10.06. In June 2019, we re-aligned CORAAL using the language model trained in November 2018. The June 2019 version is the one available below. Note that this version of CORAAL includes all speakers and files currently available from DCA, DCB, and PRV. For ROC, two speakers who were added in version 2020.05 are not included in the alignment.
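For readers who have not worked with phone-level TextGrids before, the following is a minimal sketch of how the aligned intervals might be read in Python. It assumes the files are in Praat's long text format and that the phone tier is named "phones"; both are assumptions, and the actual tier names in the CORAAL files may differ (for example, they may include a speaker name). The sample text, tier name, and labels are invented for illustration, and the function phone_intervals is a hypothetical helper, not part of MFA or Praat.

```python
import re

# Hypothetical sample in Praat's long TextGrid format.
# The tier name and phone labels are invented for illustration only.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "phones"
        xmin = 0
        xmax = 0.5
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.1
            text = ""
        intervals [2]:
            xmin = 0.1
            xmax = 0.3
            text = "AY1"
        intervals [3]:
            xmin = 0.3
            xmax = 0.5
            text = "T"
'''

# One interval entry: start time, end time, and a quoted label.
# Labels containing escaped (doubled) quotes are not handled in this sketch.
INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.]+)\s*'
    r'xmax = ([\d.]+)\s*'
    r'text = "([^"]*)"'
)


def phone_intervals(textgrid_text, tier_name="phones"):
    """Return (start, end, label) tuples for non-empty intervals in one tier."""
    # Split the file into per-tier chunks at each "item [n]:" header.
    for chunk in re.split(r'item \[\d+\]:', textgrid_text):
        name = re.search(r'name = "([^"]*)"', chunk)
        if name and name.group(1) == tier_name:
            return [
                (float(start), float(end), label)
                for start, end, label in INTERVAL.findall(chunk)
                if label  # skip silence / empty intervals
            ]
    return []  # no tier with that name


if __name__ == "__main__":
    for start, end, label in phone_intervals(SAMPLE):
        print(label, start, end)
```

To use this on a downloaded file, read its contents into a string and pass the tier name that actually appears in that file. For anything beyond a quick inspection, a dedicated TextGrid library is likely a better choice than this regular-expression approach.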

All CORAAL files are here:

Like the primary CORAAL materials, these files are completely free for research use. CORAAL is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike (4.0) International license. More information is available in the User Guide, and we suggest you read that document for full information about the corpus.

How to cite CORAAL MFA-Aligned

Farrington, Charlie and Tyler Kendall. 2019. The Corpus of Regional African American Language: MFA-Aligned. Version 2019.06. Eugene, OR: The Online Resources for African American Language Project.

