Alljoined1
Alljoined1 is an EEG dataset comprising neural responses from eight healthy adults viewing 10,000 natural images presented via rapid serial visual presentation (RSVP). It contains approximately 46,080 epochs recorded with a 64-channel BioSemi ActiveTwo system at 512 Hz and is designed to enable EEG-to-image decoding research. Raw data are preserved in BioSemi Data Format (24-bit resolution) with accompanying preprocessed and epoched derivatives. The epoch total excludes sub-03 ses-01, whose epoched file is unavailable.

Alljoined1: EEG Responses to Natural Images
Overview
Alljoined1 is an EEG dataset of neural responses to rapid serial visual presentation (RSVP) of natural images, designed for EEG-to-image decoding research. Eight healthy right-handed adults (6 male, 2 female; mean age 22 +/- 0.64 years, normal or corrected-to-normal vision) each viewed 10,000 natural images across two recording sessions on separate days.
The original data were recorded in BioSemi Data Format (BDF) via a 64-channel BioSemi ActiveTwo system with 24-bit A/D conversion, digitized at 512 Hz. This BIDS-formatted version preserves the BDF format to maintain full 24-bit data fidelity.
Reference: Xu, J., Lee, S. K., & Jiang, W. (2024). Alljoined -- A dataset for EEG-to-Image decoding. <https://doi.org/10.48550/arXiv.2404.05553>
Recording Setup
- Equipment: BioSemi ActiveTwo, 64 Ag/AgCl sintered electrodes
- Montage: International 10-20 system
- Sampling rate: 512 Hz
- Reference: CMS/DRL (BioSemi default); average reference applied in preprocessing
- Electrode offset: kept below 40 mV
- Power line: 60 Hz notch filter applied during preprocessing
Task Paradigm
Participants viewed natural images in a rapid serial visual presentation (RSVP) paradigm with an oddball detection task. Each trial consisted of an image presented for 300 ms, followed by 300 ms of black screen, plus 0-50 ms of random jitter. Participants pressed the space bar when two consecutive trials contained the same image (oddball detection). Oddball trials (24 per block) were excluded from analysis.
Stimulus Set
10,000 natural images per participant drawn from the Natural Scenes Dataset (NSD), which itself is sourced from MS-COCO:
- 1,000 shared images: the first 960 images from the NSD "shared1000" subset, shown to all participants (each image repeated 4 times per participant)
- 9,000 unique images: different for each participant
Each image was shown 4 times per participant across blocks and sessions (presented twice per block, with blocks repeated within sessions).
Subjects and Sessions
8 subjects, 1-2 sessions each (13 sessions total):
| Subject | Sessions | Notes |
|---------|----------|-------|
| sub-01 | ses-01, ses-02 | |
| sub-02 | ses-01 | |
| sub-03 | ses-01, ses-02 | Epoched file missing for ses-01 |
| sub-04 | ses-01, ses-02 | |
| sub-05 | ses-01, ses-02 | |
| sub-06 | ses-01, ses-02 | |
| sub-07 | ses-01 | |
| sub-08 | ses-01 | |
Total: approximately 46,080 epochs across all participants (approximately 3,839 events per session after oddball exclusion).
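The total can be cross-checked against figures quoted elsewhere in this README: 960 shared images, 4 repetitions each, and 13 sessions of which one lacks an epoched file. The sketch below assumes every session follows the same 960 x 4 design, which is an assumption rather than something the source states outright. It gives a nominal count that does not model the epochs dropped during artifact rejection.

```python
# Cross-check of the epoch total, using figures quoted in this README.
# Assumption: every session follows the same 960 images x 4 repetitions design.
IMAGES_PER_SESSION = 960
REPETITIONS = 4
SESSIONS_PER_SUBJECT = {
    "sub-01": 2, "sub-02": 1, "sub-03": 2, "sub-04": 2,
    "sub-05": 2, "sub-06": 2, "sub-07": 1, "sub-08": 1,
}
MISSING_EPOCHED = 1  # sub-03 ses-01 has no epoched file

trials_per_session = IMAGES_PER_SESSION * REPETITIONS
n_sessions = sum(SESSIONS_PER_SUBJECT.values())
total_epochs = trials_per_session * (n_sessions - MISSING_EPOCHED)

print(trials_per_session, n_sessions, total_epochs)
```

Under these assumptions the nominal per-session count is 3,840, close to the approximately 3,839 events reported above.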
Data Format
Raw continuous EEG recordings are stored as BDF files (BioSemi Data Format, 24-bit resolution). The original data were distributed as MNE-Python FIF files; they were converted to BDF to preserve the native 24-bit precision of the BioSemi ActiveTwo system. Round-trip validation confirmed data integrity to within 1.55e-8 V (about 15.5 nV, well below one microvolt), and event onsets match exactly (zero timing error).
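A round-trip check of this kind compares the samples before and after conversion and reports the largest absolute difference. The sketch below is illustrative only and is not the conversion script's actual check. `original` and `reloaded` are hypothetical stand-ins for one channel's samples in volts, and the sample values are made up.

```python
# Minimal round-trip check on two equal-length sample sequences (volts).
# `original` and `reloaded` are hypothetical stand-ins for one channel's
# samples before and after the FIF -> BDF -> reload cycle.
def max_abs_diff(original, reloaded):
    if len(original) != len(reloaded):
        raise ValueError("sample counts differ")
    return max((abs(a - b) for a, b in zip(original, reloaded)), default=0.0)

TOLERANCE_V = 1.55e-8  # bound reported in this README

original = [1.0e-5, -2.5e-6, 3.2e-6]                    # made-up values
reloaded = [1.0e-5 + 1.0e-8, -2.5e-6, 3.2e-6 - 5.0e-9]  # made-up values

diff = max_abs_diff(original, reloaded)
print(f"max |diff| = {diff * 1e9:.1f} nV, within tolerance: {diff <= TOLERANCE_V}")
```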
Per-session files:
| Path | Description |
|------|-------------|
| sub-XX/ses-YY/eeg/sub-XX_ses-YY_task-images_eeg.bdf | Raw EEG |
| sub-XX/ses-YY/eeg/sub-XX_ses-YY_task-images_events.tsv | Event markers |
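As a minimal sketch, the per-session raw file path can be built from subject and session numbers and opened with MNE-Python. The helper names and the `Alljoined1` root directory are hypothetical. `mne.io.read_raw_bdf` is MNE's BDF reader, and it is imported inside the function so the path helper works without MNE installed.

```python
from pathlib import Path

def raw_eeg_path(root, subject, session, task="images"):
    """Build the BIDS path of one raw BDF recording, e.g. subject=1, session=2."""
    sub = f"sub-{subject:02d}"
    ses = f"ses-{session:02d}"
    return Path(root) / sub / ses / "eeg" / f"{sub}_{ses}_task-{task}_eeg.bdf"

def load_raw(root, subject, session):
    """Open the recording without loading samples; requires MNE-Python."""
    import mne  # imported here so the path helper works without MNE
    return mne.io.read_raw_bdf(raw_eeg_path(root, subject, session), preload=False)

# "Alljoined1" is a placeholder for wherever the dataset was downloaded.
print(raw_eeg_path("Alljoined1", 1, 2).as_posix())
```

Passing `preload=False` defers reading the roughly 300 MB of samples until they are needed.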
Shared sidecar files (root level, BIDS inheritance principle):
| File | Description |
|------|-------------|
| task-images_eeg.json | Recording parameters |
| task-images_channels.tsv | Channel descriptions (64 EEG channels) |
| task-images_electrodes.tsv | Electrode positions (standard 10-20, CapTrak) |
| task-images_coordsystem.json | Coordinate system specification |
Event values in the events.tsv files represent image indices (1-960+) corresponding to NSD image identifiers. The trial_type column uses the format image/{index}.
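One way to recover image indices from the `image/{index}` format is sketched below with the standard library only. It assumes the file has `onset` and `trial_type` columns. The sample rows are made up for illustration, and real files may carry additional columns.

```python
import csv
import io

def image_indices(tsv_file):
    """Return (onset_seconds, image_index) pairs from an events.tsv file object."""
    pairs = []
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        prefix, _, index = row["trial_type"].partition("/")
        if prefix == "image" and index.isdigit():
            pairs.append((float(row["onset"]), int(index)))
    return pairs

# Made-up sample rows, not taken from the dataset.
sample = "onset\tduration\ttrial_type\n1.250\t0.3\timage/17\n1.875\t0.3\timage/960\n"
print(image_indices(io.StringIO(sample)))
```

For a real session, pass an open file handle instead of the `io.StringIO` sample.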
Derivatives
The derivatives/epoched/ directory contains preprocessed and epoched data provided by the original authors, stored in MNE-Python FIF format (.fif).
Preprocessing pipeline applied by the original authors:
- Band-pass filter: 0.5-125 Hz
- Notch filter: 60 Hz (power line)
- Independent Component Analysis (ICA): FastICA, retaining 95% of variance
- Epoch extraction: -50 ms to 600 ms relative to stimulus onset
- Artifact rejection: AutoReject algorithm (mean 130.75 epochs dropped per subject, SD 260.44)
- Baseline correction
- Average re-referencing
These epoched files are derivative products, not raw recordings, and are stored separately per BIDS conventions. Note: the epoched file for sub-03 ses-01 was not available in the source distribution.
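The epoched files can be read with MNE-Python's `mne.read_epochs`. The sketch below also computes a nominal number of samples per epoch from the window and sampling rate given above, assuming both endpoints are kept and no resampling or decimation was applied. That is an assumption, so the actual files may differ. The helper names are hypothetical.

```python
def nominal_epoch_samples(tmin=-0.05, tmax=0.6, sfreq=512.0):
    """Samples per epoch if both endpoints are kept and no resampling was done."""
    return round(tmax * sfreq) - round(tmin * sfreq) + 1

def load_epochs(path):
    """Read one *_epo.fif file; requires MNE-Python to be installed."""
    import mne  # imported here so the arithmetic helper works without MNE
    return mne.read_epochs(path, preload=True)

print(nominal_epoch_samples())
```

The nominal value can be compared with the last dimension of `load_epochs(path).get_data().shape` to confirm how a given file was epoched.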
Code
The code/ directory contains the original Alljoined1 analysis code, cloned from <https://github.com/Alljoined/alljoined-dataset1>.
BIDS Conversion
Converted to BIDS by Yahya Shirazi (Swartz Center for Computational Neuroscience, UC San Diego) using MNE-Python and custom scripts.
- Source data: OSF repository <https://osf.io/kqgs8/>
- Conversion validated with round-trip integrity checks (data, channels, sampling frequency, event count, event values, and event timing)
License and Terms of Use
This dataset is distributed under CC-BY-NC-ND-4.0 (Creative Commons Attribution-NonCommercial-NoDerivatives 4.0). The Alljoined team imposes additional terms on their datasets. By using this dataset you agree to all conditions below.
- Researcher shall use the Dataset only for non-commercial research and educational purposes, in accordance with Alljoined's Terms of Use.
- No Warranties: Alljoined makes no representations or warranties regarding the Dataset, including but not limited to warranties of non-infringement or fitness for a particular purpose.
- Full Responsibility: Researcher accepts full responsibility for his or her use of the Dataset and shall defend and indemnify Alljoined, including their employees, officers and agents, against any and all claims arising from Researcher's use of the Dataset.
- Privacy Compliance: Researcher shall comply with Alljoined's Privacy Policy and ensure that any use of the Dataset respects the privacy rights of individuals whose data may be included.
- Sharing Rights: Researcher may provide research associates and colleagues with access to the Dataset provided that they first agree to be bound by these terms and conditions.
- Termination Rights: Alljoined reserves the right to terminate Researcher's access to the Dataset at any time.
- Commercial Entity Binding: If Researcher is employed by a for-profit, commercial entity, Researcher's employer shall also be bound by these terms and conditions, and Researcher hereby represents that he or she is fully authorized to enter into this agreement on behalf of such employer.
- Governing Law: The law of the State of California shall apply to all disputes under this agreement.
> Note: The original Alljoined1 dataset on OSF (<https://osf.io/kqgs8/>) does not specify an explicit license. The terms above are from the Alljoined-1.6M HuggingFace distribution and the Alljoined website; they are included here as the best available guidance. Contact the Alljoined team (team@alljoined.com) for clarification on redistribution rights.
- Full terms: <https://www.alljoined.com/terms-of-use>
- Privacy policy: <https://www.alljoined.com/privacy-policy>
References
Xu, J., Lee, S. K., & Jiang, W. (2024). Alljoined -- A dataset for EEG-to-Image decoding. https://doi.org/10.48550/arXiv.2404.05553