sshleifer/distilbart-cnn-12-6
Summarization
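For context, this is a distilled BART summarization checkpoint. A minimal usage sketch with the transformers pipeline API (the sample article text is invented for illustration):

```python
from transformers import pipeline

# Load the distilled BART summarizer named in the header.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

# Invented sample input; any news-style paragraph works.
article = (
    "The tower is 324 metres tall, about the same height as an 81-storey "
    "building, and was the tallest man-made structure in the world for 41 years."
)

# max_length/min_length bound the generated summary length in tokens.
print(summarizer(article, max_length=60, min_length=10)[0]["summary_text"])
```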
Error code: StreamingRowsError
Exception: FileNotFoundError
Message: https://raw.githubusercontent.com/EdinburghNLP/XSum/master/XSum-Dataset/XSum-TRAINING-DEV-TEST-SPLIT-90-5-5.json
Traceback:
Traceback (most recent call last):
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 417, in _info
await _file_info(
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 837, in _file_info
r.raise_for_status()
File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 1005, in raise_for_status
raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 503, message='first byte timeout', url=URL('https://raw.githubusercontent.com/EdinburghNLP/XSum/master/XSum-Dataset/XSum-TRAINING-DEV-TEST-SPLIT-90-5-5.json')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/src/services/worker/src/worker/utils.py", line 263, in get_rows_or_raise
return get_rows(
File "/src/services/worker/src/worker/utils.py", line 204, in decorator
return func(*args, **kwargs)
File "/src/services/worker/src/worker/utils.py", line 241, in get_rows
rows_plus_one = list(itertools.islice(ds, rows_max_number + 1))
File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 1353, in __iter__
for key, example in ex_iterable:
File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 207, in __iter__
yield from self.generate_examples_fn(**self.kwargs)
File "/tmp/modules-cache/datasets_modules/datasets/xsum/082863bf4754ee058a5b6f6525d0cb2b18eadb62c7b370b095d1364050a52b71/xsum.py", line 133, in _generate_examples
with open(split_path, "r", encoding="utf-8") as f:
File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/streaming.py", line 74, in wrapper
return function(*args, download_config=download_config, **kwargs)
File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/download/streaming_download_manager.py", line 496, in xopen
file_obj = fsspec.open(file, mode=mode, *args, **kwargs).open()
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py", line 134, in open
return self.__enter__()
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py", line 102, in __enter__
f = self.fs.open(self.path, mode=mode)
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/spec.py", line 1199, in open
f = self._open(
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 356, in _open
size = size or self.info(path, **kwargs)["size"]
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 115, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 100, in sync
raise return_result
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in _runner
result[0] = await coro
File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 430, in _info
raise FileNotFoundError(url) from exc
FileNotFoundError: https://raw.githubusercontent.com/EdinburghNLP/XSum/master/XSum-Dataset/XSum-TRAINING-DEV-TEST-SPLIT-90-5-5.json
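The worker's read path is visible in the traceback: it streams the dataset and slices off the first rows (`itertools.islice(ds, rows_max_number + 1)` in `get_rows`). A minimal sketch reproducing the same access pattern with the `datasets` library; it hits the same FileNotFoundError whenever the split file on raw.githubusercontent.com is unreachable:

```python
import itertools
from datasets import load_dataset

# Streaming defers file access: the XSum loading script only opens the
# split JSON on raw.githubusercontent.com once iteration begins.
ds = load_dataset("xsum", split="train", streaming=True)

try:
    # Mirrors the viewer worker's islice over the iterable dataset.
    rows = list(itertools.islice(ds, 5))
    print(rows[0]["summary"])
except FileNotFoundError as err:
    # fsspec converts the upstream 503 into FileNotFoundError, as above.
    print(f"Split file unreachable: {err}")
```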
Extreme Summarization (XSum) Dataset.
There are three features: document (the article text), summary (a single-sentence summary of the article), and id (a BBC article identifier).
An example of 'validation' looks as follows.
{
"document": "some-body",
"id": "29750031",
"summary": "some-sentence"
}
The data fields are the same among all splits.
document: a string feature.
summary: a string feature.
id: a string feature.

| name | train | validation | test |
|---|---|---|---|
| default | 204045 | 11332 | 11334 |
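The features and split sizes can be verified programmatically. A minimal sketch, assuming a full (non-streaming) download succeeds despite the viewer error above:

```python
from datasets import load_dataset

# Downloads and prepares all three splits.
dsd = load_dataset("xsum")

# All splits share the same three string features.
print(dsd["train"].features)  # document, summary, id

# Row counts should match the table: 204045 / 11332 / 11334.
for split in ("train", "validation", "test"):
    print(split, dsd[split].num_rows)
```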
@article{Narayan2018DontGM,
title={Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization},
author={Shashi Narayan and Shay B. Cohen and Mirella Lapata},
journal={ArXiv},
year={2018},
volume={abs/1808.08745}
}
Thanks to @thomwolf, @lewtun, @mariamabarham, @jbragg, @lhoestq, @patrickvonplaten for adding this dataset.