Commit c020b06

griots_r14_2 fix & Readme update
1 parent 82012c8 commit c020b06

8 files changed, with 289 additions and 15 deletions.

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,2 +1,3 @@
 *.wav
 *.WAV
+*.py

README.md

Lines changed: 87 additions & 6 deletions
@@ -1,15 +1,96 @@
 # Jeli ASR & Corpus

-## Overview
+## What is Jeli-ASR
+Jeli-ASR is a multidimensional package developed to promote the use of the Bambara language. It grew out of an initiative to develop the Bambara language and its cultural values. The package consists of an ASR model under ongoing development, a mini corpus of griots' narrations in [audio](https://zenodo.org/record/6997806) with transcriptions in EAF, the [ELAN format](https://archive.mpi.nl/tla/elan/download), and a tool that can export the transcriptions as raw text or JSON.

-## ASR
+## ASR - Model
 [TODO]

-## Corpora
+## Corpus
+The Griots corpus is a speech corpus containing audio and its accompanying transcriptions. The intent, the approach, and a detailed description of the dataset are given in the [Data-Card](). The tables below list the recordings and their general metadata:

-### EAFs
-### [AUDIO CORPUS](https://zenodo.org/record/6997806)
+### Griots Narrations
+
+| Recording ID | Theme | Dialect | Utterance Count | Spkr. Gender |
+|:------------:|:-----:|:-------:|:---------------:|:------------:|
+| griots_r1 | L'histoire d'une fille | Bamako | 980 | M |
+| griots_r2 | L'histoire d'un grand marabo | Ségou | 1030 | M |
+| griots_r3 | Les forgérons | Bamako | 805 | M |
+| griots_r4 | Les Noms Authentiques | Bamako | 764 | M |
+| griots_r5 | Les Coulibaly | Bamako | 981 | M |
+| griots_r6 | Les Diarra | Ségou | 1122 | M |
+| griots_r7 | L'histoire du roi Razaly | Bamako | 1407 | M |
+| griots_r8 | L'histoire des fils d'Abraham | Bamako | 1126 | F |
+| griots_r9 | Les ''Niamala'' hommes de caste | Bamako | 821 | M |
+| griots_r10 | L'éducaion d'hier et d'aujourd'hui | Bamako | 1078 | F |
+| griots_r11 | Garba Mama | Bamako | 970 | M |
+| griots_r12 | La Bataille de Kaana | Bamako | 997 | M |
+| griots_r13 | Diokala | Bamako | 964 | M |
+| griots_r14 | Nos ancetres | Malinké Siby | 1136 | M |
+| griots_r15 | L'histoire d'El Hadj Oumar Tall | Bamako | 844 | M |
+| griots_r16 | Les Massassi du Karta 'Bɔ' | Bamako | 941 | M |
+| griots_r17 | Histoire de Samory | Malinké kangaba | 773 | M |
+| griots_r18 | Le griot | Malinké de kangaba | 809 | M |
+| griots_r19 | La vie d'avant en milieu Bamanan | Bamako | 611 | F |
+| griots_r20 | Les Maabo | Ségou | 1102 | M |
+| griots_r21 | L'histoire de Djonkoloni | Bamako | 859 | M |
+| griots_r22 | Various | Malinké de Siby | 926 | F |
+| griots_r23 | L'histoire de Bɔ | Ségou | 1319 | M |
+| griots_r24 | L'éducaion d'hier et d'aujourd'hui | Bamako | 942 | F |
+| griots_r25 | L'hisoire de la jeune fille Niamakolo | Bamako | 828 | F |
+| griots_r26 | Hier et aujourd'hui | Bamako | 1128 | M |
+| griots_r27 | Les Mianka | Bamako | 1166 | M |
+| griots_r28 | Le mariage d'hier et d'aujourd'hui | Bamako | 810 | F |
+| griots_r29 | L' histoire de Dabo | Bamako | 774 | M |
+| griots_r30 | Les valeurs du Mali | Bamako | 968 | M |
+|**TOTAL**||| ***28971*** ||
+||
+
+### Street Interviews
+Alongside the griots' narrations, a smaller sample of individuals was interviewed about the importance of Bambara in technology.
+
+| Recording ID | Utt. Count | Spkr. Gender | Status |
+|:------------:|:-------:|:------------:|:------:|
+| intrvw_r1 | 55 | F | V |
+| intrvw_r2 | X | X | X |
+| intrvw_r3 | 24 | M | V |
+| intrvw_r4 | 25 | M | V |
+| intrvw_r5 | 31 | M | V |
+| intrvw_r6 | 20 | M | V |
+| intrvw_r7 | X | X | X |
+| intrvw_r8 | X | X | X |
+| intrvw_r9 | X | X | X |
+| intrvw_r10 | X | X | X |
+| intrvw_r11 | X | X | X |
+| intrvw_r12 | X | X | X |
+| intrvw_r13 | 25 | M | V |
+| intrvw_r14 | X | X | X |
+| intrvw_r15 | X | X | X |
+| intrvw_r16 | X | X | X |
+| intrvw_r17 | X | X | X |
+| intrvw_r18 | X | X | X |
+| intrvw_r19 | X | X | X |
+| intrvw_r20 | 17 | M | V |
+| intrvw_r21 | 137 | M | V |
+| intrvw_r22 | 142 | F | V |
+| **TOTAL** | ***476*** | - | - |
+||
+
+### jelipkg toolkit
+<code>jelipkg</code> is a sub-package that serves as an entry point to the corpus. It is a Python package that lets you browse and download the corpus; the textual data can be downloaded either as raw text or as JSON.
+
+#### Installation
+#### Quickstart
+#### Documentation
+
+**IMPORTANT**: Because of the size of the dataset, it is recommended to download one recording or interview at a time if your network is unreliable.
+
+## Contact & People
+**Principal Investigator**: Michael Leventhal, `mleventhal <at> robotsmali.org`
+**Manager**: Sebastien Diarra, `sdiarra <at> robotsmali.org`
+**Inquiries & Collaboration**: `research <at> robotsmali.org`
+
+## Reference

-## Contact
 ## License
 This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
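
The README in this commit says the toolkit can yield transcriptions as raw text or JSON, but its Installation and Quickstart sections are still empty. As a rough illustration of what such an export involves, here is a minimal sketch that reads an EAF document with the standard library only. The function name `eaf_to_records`, the record field names, and the sample document are assumptions made for this example; they are not part of `jelipkg`.

```python
import json
import xml.etree.ElementTree as ET


def eaf_to_records(eaf_xml):
    """Return one dict per time-aligned annotation in an EAF document."""
    root = ET.fromstring(eaf_xml)
    # Map each time slot ID to its value in milliseconds.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    records = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            records.append({
                "tier": tier.get("TIER_ID"),
                "start_ms": slots.get(ann.get("TIME_SLOT_REF1")),
                "end_ms": slots.get(ann.get("TIME_SLOT_REF2")),
                "text": (ann.findtext("ANNOTATION_VALUE") or "").strip(),
            })
    return records


# Hypothetical miniature EAF document, used only to exercise the sketch.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2131"/>
  </TIME_ORDER>
  <TIER TIER_ID="default">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>placeholder utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

records = eaf_to_records(SAMPLE_EAF)
raw_text = "\n".join(r["text"] for r in records)                # raw text export
as_json = json.dumps(records, ensure_ascii=False, indent=2)     # JSON export
```

Passing `ensure_ascii=False` keeps Bambara characters such as `ɔ` readable in the JSON output instead of escaping them.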

jeli/cli/jelipkg.py~

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+"""
+"""
+
+import os
+from jeli.core import jeli
+
+def main():
+    Al = jeli.JeliASR()
+    Al.valid_recordings()

jeli/config

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Token=C4LeMu6E9KqVFACLaSKoBYd9RClArDw1ieFOYNGP35L3zWNcVThUXSRI74aP

jeli/core/jeli.py~

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+"""
+"""
+
+import os
+import glob
+import pympi
+from jeli.core import config
+
+REC_ROOT = f"{config.ROOT}/transcriptions"
+REC_LIST = os.listdir(REC_ROOT)
+
+class JeliASR(object):
+    """ """
+
+    def __init__(self) -> None:
+        """ """
+        pass
+
+    @staticmethod
+    def __valid_recordings():
+        recs = [i for i in REC_LIST if len(glob.glob(f"{REC_ROOT}/{i}/*.eaf")) > 0]
+        return recs
+
+    def eaf_reader(self):
+        """ """
+        pass
+
+    def read_eaf_files(self, path=REC_ROOT):
+        """
+        """
+
+        pass
+
+    def json_export(self):
+        pass
+
+    def export_eaf_file(self, file_id):
+        pass
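
One detail in the two stubs above is worth illustrating: the CLI calls `Al.valid_recordings()`, while the class defines `__valid_recordings`. Python name-mangles identifiers with two leading underscores inside a class body, so that method is stored as `_JeliASR__valid_recordings` and the call as written would raise `AttributeError`. The sketch below uses a stand-in class (`JeliASRSketch`, a hypothetical name) with a public method and a root directory passed in explicitly. Both choices are assumptions about the intent, not the project's code.

```python
import glob
import os
import tempfile


class JeliASRSketch:
    """Stand-in for the JeliASR class in the diff; not the project's code."""

    def __init__(self, rec_root):
        # The root is passed in rather than listed at import time.
        self.rec_root = rec_root

    @staticmethod
    def __mangled():
        # Stored on the class as _JeliASRSketch__mangled.
        return "hidden"

    def valid_recordings(self):
        """Return the sorted recording folders that hold at least one .eaf file."""
        return sorted(
            name
            for name in os.listdir(self.rec_root)
            if glob.glob(os.path.join(self.rec_root, name, "*.eaf"))
        )


# Build a throwaway directory tree to exercise the sketch.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "griots_r14"))
    os.makedirs(os.path.join(root, "griots_r99_empty"))
    open(os.path.join(root, "griots_r14", "griots_r14_2.eaf"), "w").close()
    found = JeliASRSketch(root).valid_recordings()

# Outside the class body the plain name is not visible, only the mangled one.
has_plain_name = hasattr(JeliASRSketch, "__mangled")
has_mangled_name = hasattr(JeliASRSketch, "_JeliASRSketch__mangled")
```

Taking the root as a constructor argument also avoids the `os.listdir` call at import time, which in the diff would fail as soon as the module is imported on a machine without the `transcriptions` folder.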

meta.json

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
+{
+    "griots_r13": {
+        "theme": "Diokala",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r24": {
+        "theme": "L'e\u0301ducaion d'hier et d'aujourd'hui",
+        "dialect": "Bamako",
+        "gender": "F"
+    },
+    "griots_r17": {
+        "theme": "Histoire de Samory",
+        "dialect": "Malinke\u0301 kangaba",
+        "gender": "M"
+    },
+    "griots_r9": {
+        "theme": "Les ''Niamakala'' hommes de caste",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r28": {
+        "theme": "Le mariage d'hier et d'aujourd'hui",
+        "dialect": "Bamako",
+        "gender": "F"
+    },
+    "griots_r22": {
+        "theme": "Various",
+        "dialect": "Malinke\u0301 de Siby",
+        "gender": "F"
+    },
+    "griots_r20": {
+        "theme": "Les Maabo",
+        "dialect": "Se\u0301gou",
+        "gender": "M"
+    },
+    "griots_r3": {
+        "theme": "Les forge\u0301rons",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r1": {
+        "theme": "L'histoire d'une fille",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r5": {
+        "theme": "Les Coulibaly",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r12": {
+        "theme": "La bataille de Kaana",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r4": {
+        "theme": "Les noms authentiques",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r25": {
+        "theme": "L'hisoire de la jeune fille Niamakolon",
+        "dialect": "Bamako",
+        "gender": "F"
+    },
+    "griots_r8": {
+        "theme": "L'histoire des fils d'Abraham",
+        "dialect": "Bamako",
+        "gender": "F"
+    },
+    "griots_r11": {
+        "theme": "Garba Mama",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r18": {
+        "theme": "Le griot",
+        "dialect": "Malinke\u0301 de kangaba",
+        "gender": "M"
+    },
+    "griots_r16": {
+        "theme": "Les Massassi du Karta ''B\u0254''",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r10": {
+        "theme": "L'e\u0301ducaion d'hier et d'aujourd'hui",
+        "dialect": "Bamako",
+        "gender": "F"
+    },
+    "griots_r6": {
+        "theme": "Les Diarra",
+        "dialect": "Se\u0301gou",
+        "gender": "M"
+    },
+    "griots_r21": {
+        "theme": "L'histoire de Djonkoloni",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r23": {
+        "theme": "L'histoire de B\u0254",
+        "dialect": "Se\u0301gou",
+        "gender": "M"
+    },
+    "griots_r19": {
+        "theme": "La vie d'avant en milieu Bamanan ",
+        "dialect": "Bamako",
+        "gender": "F"
+    },
+    "griots_r14": {
+        "theme": "Nos ancetres",
+        "dialect": "Malinke\u0301 Siby",
+        "gender": "M"
+    },
+    "griots_r27": {
+        "theme": "Les Mianka",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r30": {
+        "theme": "Les valeurs du Mali",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r29": {
+        "theme": "L' histoire de Dabo",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r7": {
+        "theme": "L'histoire du roi Razaly",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r26": {
+        "theme": "Hier et aujourd'hui",
+        "dialect": "Bamako",
+        "gender": "M"
+    },
+    "griots_r2": {
+        "theme": "L'histoire d'un grand marabo",
+        "dialect": "Se\u0301gou",
+        "gender": "M"
+    },
+    "griots_r15": {
+        "theme": "L'histoire d'El Hadj Oumar Tall",
+        "dialect": "Bamako",
+        "gender": "M"
+    }
+}
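
The metadata above stores accented letters in decomposed form (`e` followed by the combining accent `\u0301`), so a value such as `"Se\u0301gou"` does not compare equal to the precomposed `Ségou` that most keyboards produce. The sketch below normalizes every field to NFC on load and counts recordings per dialect. The excerpt copies three entries from the file, and the helper name `load_meta` is an assumption made for this example.

```python
import json
import unicodedata
from collections import Counter

# Three entries copied from meta.json; the raw string keeps the \u escapes
# intact so that the JSON parser, not Python, decodes them.
META_EXCERPT = r"""
{
    "griots_r20": {"theme": "Les Maabo", "dialect": "Se\u0301gou", "gender": "M"},
    "griots_r6": {"theme": "Les Diarra", "dialect": "Se\u0301gou", "gender": "M"},
    "griots_r13": {"theme": "Diokala", "dialect": "Bamako", "gender": "M"}
}
"""


def load_meta(text):
    """Parse the metadata and normalize every string field to NFC."""
    raw = json.loads(text)
    return {
        rec_id: {key: unicodedata.normalize("NFC", value) for key, value in fields.items()}
        for rec_id, fields in raw.items()
    }


meta = load_meta(META_EXCERPT)
by_dialect = Counter(fields["dialect"] for fields in meta.values())
```

After normalization, a lookup such as `by_dialect["Ségou"]` works with the precomposed spelling used in the README table.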

tools/TODO.md

Lines changed: 0 additions & 8 deletions
This file was deleted.

transcriptions/griots_r14/griots_r14_2.eaf

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2131"/>
 <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="2131"/>
 <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="3221"/>
-<TIME_SLOT TIME_,,SLOT_ID="ts5" TIME_VALUE="3221"/>
+<TIME_SLOT TIME_SLOT_ID="ts5" TIME_VALUE="3221"/>
 <TIME_SLOT TIME_SLOT_ID="ts6" TIME_VALUE="3221"/>
 <TIME_SLOT TIME_SLOT_ID="ts7" TIME_VALUE="3221"/>
 <TIME_SLOT TIME_SLOT_ID="ts8" TIME_VALUE="3221"/>
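
The fix above removes two stray commas from an attribute name, which had made the file ill-formed XML. A check like the following would catch that class of error before a transcription is committed. It is a minimal sketch using only the standard library; the helper name `xml_error` and the two one-line samples (wrapped in a `TIME_ORDER` element so they parse on their own) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def xml_error(text):
    """Return None if text parses as XML, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# The removed and added lines from the diff, each wrapped in a root element.
BROKEN = '<TIME_ORDER><TIME_SLOT TIME_,,SLOT_ID="ts5" TIME_VALUE="3221"/></TIME_ORDER>'
FIXED = '<TIME_ORDER><TIME_SLOT TIME_SLOT_ID="ts5" TIME_VALUE="3221"/></TIME_ORDER>'

broken_error = xml_error(BROKEN)  # a message describing the invalid token
fixed_error = xml_error(FIXED)    # None, the corrected line parses
```

Running the same helper over the contents of every `.eaf` file under `transcriptions/` would flag any remaining files with similar damage.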
