Dataset Card for VCTK
Dataset Summary
The CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads about 400 sentences, selected from a newspaper, the Rainbow Passage, and an elicitation paragraph used for the Speech Accent Archive.
Supported Tasks and Leaderboards
[More Information Needed]
Languages
[More Information Needed]
Dataset Structure
Data Instances
A data point comprises the path to the audio file (file), the decoded audio (audio), its transcription (text), and metadata about the speaker.
{
    'speaker_id': 'p225',
    'text_id': '001',
    'text': 'Please call Stella.',
    'age': '23',
    'gender': 'F',
    'accent': 'English',
    'region': 'Southern England',
    'file': '/datasets/downloads/extracted/8ed7dad05dfffdb552a3699777442af8e8ed11e656feb277f35bf9aea448f49e/wav48_silence_trimmed/p225/p225_001_mic1.flac',
    'audio':
    {
        'path': '/datasets/downloads/extracted/8ed7dad05dfffdb552a3699777442af8e8ed11e656feb277f35bf9aea448f49e/wav48_silence_trimmed/p225/p225_001_mic1.flac',
        'array': array([0.00485229, 0.00689697, 0.00619507, ..., 0.00811768, 0.00836182, 0.00854492], dtype=float32),
        'sampling_rate': 48000
    },
    'comment': ''
}
Each audio file is a single-channel FLAC with a sample rate of 48000 Hz.
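As a rough illustration of how the example above fits together, the sketch below pulls the speaker ID and text ID back out of a file name and converts a sample count into a duration. The `<speaker>_<text>_<mic>.flac` naming pattern is inferred only from the single path shown above, so it is an assumption and other files may not follow it. The helper names `parse_vctk_filename` and `duration_seconds` are hypothetical and not part of any library.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern, inferred from the one example path in this card:
#   <speaker_id>_<text_id>_<mic>.flac, e.g. p225_001_mic1.flac
_NAME_RE = re.compile(
    r"^(?P<speaker_id>[^_]+)_(?P<text_id>\d+)_(?P<mic>mic\d+)\.flac$"
)


def parse_vctk_filename(path):
    """Split a file name into speaker ID, text ID and microphone tag."""
    name = PurePosixPath(path).name
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return match.groupdict()


def duration_seconds(num_samples, sampling_rate=48000):
    """Clip length in seconds for a mono signal with the given sample count."""
    return num_samples / sampling_rate


parts = parse_vctk_filename("wav48_silence_trimmed/p225/p225_001_mic1.flac")
print(parts["speaker_id"], parts["text_id"], parts["mic"])
print(duration_seconds(96000))  # 96000 samples at 48000 Hz
```

With a real row, `len(row['audio']['array'])` would supply the sample count and `row['audio']['sampling_rate']` the rate.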
Data Fields
Each row consists of the following fields:
- speaker_id: Speaker ID
- audio: Audio recording
- file: Path to audio file
- text: Text transcription of corresponding audio
- text_id: Text ID
- age: Speaker's age
- gender: Speaker's gender
- accent: Speaker's accent
- region: Speaker's region, if annotation exists
- comment: Miscellaneous comments, if any
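In the example instance, age is stored as a string ('23') and an absent comment appears as an empty string. The following is a minimal sketch of one way to tidy a row before analysis, assuming every row follows that same convention. `normalize_row` is a hypothetical helper, and the row is copied from the example instance with the audio fields left out.

```python
def normalize_row(row):
    """Return a copy with age as an int and empty optional fields as None."""
    clean = dict(row)
    # Assumption: age is always a numeric string, as in the example instance.
    clean["age"] = int(row["age"])
    # region and comment are optional, so map empty strings to None.
    for key in ("region", "comment"):
        if not clean.get(key):
            clean[key] = None
    return clean


example = {
    "speaker_id": "p225",
    "text_id": "001",
    "text": "Please call Stella.",
    "age": "23",
    "gender": "F",
    "accent": "English",
    "region": "Southern England",
    "comment": "",
}

row = normalize_row(example)
print(row["age"], row["region"], row["comment"])
```

The input dictionary is left unchanged, so the same function can be applied lazily while iterating over rows.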
Data Splits
The dataset has no predefined splits.
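Because no splits are provided, users have to create their own. One possible approach, sketched here as an assumption rather than a recommendation from the dataset authors, is to assign each speaker to a split by hashing speaker_id, so that all utterances from one speaker land in the same split. The function name `assign_split` and the default `test_fraction` are hypothetical.

```python
import hashlib


def assign_split(speaker_id, test_fraction=0.1):
    """Deterministically map a speaker ID to 'train' or 'test'."""
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


# The same speaker always receives the same split, across runs and machines.
print(assign_split("p225") == assign_split("p225"))
```

Hashing avoids having to store a speaker list, at the cost of only approximating the requested fraction when the number of speakers is small.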
Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
Annotations
Annotation process
[More Information Needed]
Who are the annotators?
[More Information Needed]
Personal and Sensitive Information
The dataset consists of recordings from people who have donated their voices online. You agree not to attempt to determine the identity of speakers in this dataset.
Considerations for Using the Data
Social Impact of Dataset
[More Information Needed]
Discussion of Biases
[More Information Needed]
Other Known Limitations
[More Information Needed]
Additional Information
Dataset Curators
[More Information Needed]
Licensing Information
Public Domain, Creative Commons Attribution 4.0 International Public License (CC-BY-4.0)
Citation Information
@inproceedings{Veaux2017CSTRVC,
title = {CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit},
author = {Christophe Veaux and Junichi Yamagishi and Kirsten MacDonald},
year = 2017
}
Contributions
Thanks to @jaketae for adding this dataset.