Status: LIVE
PARADISEC (the Pacific and Regional Archive for Digital Sources in Endangered Cultures) has been operating for 20 years and currently holds material in 1,350 languages across Australia and the Pacific. The archive contains over 210 TB of content, including more than 16,000 hours of audio recordings, 1,600 hours of video and 8,000 transcriptions. It serves both as an archive of research recordings and as an integral part of the research workflow, in which primary data is made citable, preserved, and publicised (with licence agreements) for access.
The Modern PARADISEC demonstrator, developed with previous funding from the ARDC, shows how RO-Crate can be used to describe collections and items and how those items can be stored within an OCFL system. The demonstrator includes an Elasticsearch service and a web server, but its key feature is that it keeps working with only the filesystem and a web server.
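To illustrate why a filesystem alone is enough, the following is a minimal Python sketch of reading an item's description straight from disk. It relies only on two conventions from the public specifications: an OCFL object keeps an `inventory.json` that maps logical file names to stored content paths, and an RO-Crate keeps its description in `ro-crate-metadata.json`. The sample inventory, the placeholder digest, the identifier `EXAMPLE-001` and the function names are hypothetical and are not taken from the demonstrator's code.

```python
import json
from pathlib import Path

# Trimmed, hypothetical OCFL inventory (not a complete valid inventory).
SAMPLE_INVENTORY = {
    "id": "EXAMPLE-001",
    "digestAlgorithm": "sha512",
    "head": "v1",
    "manifest": {"abc123": ["v1/content/ro-crate-metadata.json"]},
    "versions": {"v1": {"state": {"abc123": ["ro-crate-metadata.json"]}}},
}

# Hypothetical RO-Crate metadata document describing one item.
SAMPLE_CRATE = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {"@id": "./"},
        },
        {"@id": "./", "@type": "Dataset", "name": "Example item (hypothetical)"},
    ],
}


def resolve_content_path(inventory: dict, logical_path: str) -> str:
    """Map a logical file name in the head version to its stored content path."""
    state = inventory["versions"][inventory["head"]]["state"]
    for digest, logical_paths in state.items():
        if logical_path in logical_paths:
            # The manifest maps each digest to content paths that are
            # relative to the OCFL object root.
            return inventory["manifest"][digest][0]
    raise KeyError(logical_path)


def root_dataset(crate: dict) -> dict:
    """Return the root dataset entity of an RO-Crate metadata document."""
    entities = {entity["@id"]: entity for entity in crate["@graph"]}
    descriptor = entities["ro-crate-metadata.json"]
    return entities[descriptor["about"]["@id"]]


def load_item(object_root: Path) -> dict:
    """Read an item's description using nothing but files on disk."""
    inventory = json.loads((object_root / "inventory.json").read_text())
    content_path = resolve_content_path(inventory, "ro-crate-metadata.json")
    crate = json.loads((object_root / content_path).read_text())
    return root_dataset(crate)
```

Because every lookup is a file read, a search index such as Elasticsearch can be treated as a cache that is rebuilt from the stored objects rather than as the source of truth.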
PARADISEC’s access and storage systems have been developed over 20 years, and some parts are in need of renewal. PARADISEC is also the model for other archives: we are currently advising a consortium in Japan and working actively with colleagues at the University of French Polynesia to build an archive in Papeete.
Tools adapted in the PARADISEC system
- ELAN - Media transcription with XML output. A microservice was developed to play media alongside transcripts, which allows citation of specific points in the media.
- FieldWorks Language Explorer - Text output is structured text with interlinear annotations; dictionaries are structured and can be output as formatted documents or phone apps.
- LaMeta – A tool for creating metadata in a form that can be imported into an archive.
- CSV - metadata entry sheet using a simple row/column layout
- ELAN file viewer – provides researchers with a dashboard to see how much of a file has been transcribed
- Media players - linking transcripts to media (granular citation of media for research purposes)
- OLAC data viewer – presents all aggregated metadata in a map view
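
The ELAN-related tools in the list above rest on the fact that ELAN's XML output ties each annotation to start and end times in milliseconds. The Python sketch below, using only the standard library, shows one way such a file could be read to produce a citable media fragment and an estimate of how much of a recording has been transcribed. The sample document is a heavily trimmed, hypothetical ELAN file, the media URL is a placeholder, and the function names are illustrative; this is not the code of the PARADISEC viewer or microservice.

```python
import xml.etree.ElementTree as ET

# Heavily trimmed, hypothetical ELAN document; real files carry more metadata.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="4000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="6000"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcript">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>first utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>second utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_annotations(eaf_xml: str) -> list:
    """Return (tier_id, start_ms, end_ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_xml)
    # Time slots without a TIME_VALUE are unaligned and are left out.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    annotations = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue
            text = ann.findtext("ANNOTATION_VALUE", default="")
            annotations.append((tier.get("TIER_ID"), start, end, text))
    return annotations


def media_fragment(media_url: str, start_ms: int, end_ms: int) -> str:
    """Build a temporal media fragment URI (times in seconds) for citation."""
    return f"{media_url}#t={start_ms / 1000:.3f},{end_ms / 1000:.3f}"


def transcribed_fraction(annotations: list, media_duration_ms: int) -> float:
    """Fraction of the media covered by annotations, counting overlaps once."""
    covered = 0
    covered_until = 0
    for start, end in sorted((s, e) for _, s, e, _ in annotations):
        start = max(start, covered_until)
        if end > start:
            covered += end - start
            covered_until = end
    return covered / media_duration_ms
```

The media duration is passed in as a parameter because this sketch assumes it comes from the media file itself rather than from the transcript. A usage example would be `media_fragment("https://example.org/media.mp3", 0, 2500)` for the first annotation in the sample.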