DeepCosmoNet

Deep Learning for cosmic web analysis
DeepCosmoNet advances the analysis of the Cosmic Web by combining advanced machine learning models with cutting-edge HPC technologies to enable automated segmentation of cosmological structures, provide a unified and scalable data-processing pipeline, and deliver optimised neural architectures for large-scale cosmological simulations.
DeepCosmoNet has been selected and funded under the ICSC – Centro Nazionale di Ricerca in HPC, Big Data e Quantum Computing, as part of the PNRR, and supported by the European Union – NextGenerationEU.
Access Resources Learn More

Welcome to the DeepCosmoNet Website

This website introduces the DeepCosmoNet project — what it is, what it does, and why it matters. You’ll also find access to the project’s main outputs, from raw N-body data, AI-generated sample catalogues, and AI models to source code and publications. Explore our resources to dive into cutting-edge tools for N-body segmentation.

Everything is open and accessible. See the Outputs section below to learn more, or open the DeepCosmoNet Resource Hub to access the project resources. If you’d like to download datasets or code, we’ll just ask you to fill out a short form with your affiliation and intended use.

Discover the Project

Learn in detail about DeepCosmoNet, its scientific goals, and the motivation behind using Artificial Intelligence to segment N-body simulations. Understand how the project combines cosmological models and algorithms with advanced machine learning techniques to create new opportunities for research and discovery.

Learn More

Access Resources

Browse a growing collection of datasets, source code, and publications as they are released. These resources are designed to support both the astrophysics and AI communities, offering practical tools for analysis, experimentation, and reproducible science.

Explore

Check Usage Terms

Find clear and transparent information about licensing conditions, recommended citation formats, and guidelines for the responsible reuse of materials. These terms help ensure that DeepCosmoNet's outputs are properly credited and can be applied ethically in future research.

View Terms

About the DeepCosmoNet Project

DeepCosmoNet is a research initiative dedicated to the automatic segmentation of the Cosmic Web in large-scale N-body simulations, combining cosmological simulation data with modern deep learning pipelines. Cosmological simulations are fundamental for understanding the formation of large-scale structures such as halos, sub-halos, and voids; yet their analysis is challenged by the computational intensity of traditional segmentation algorithms. By automating this analysis, DeepCosmoNet contributes to reproducible science and accelerates discovery in computational cosmology and astro-informatics.

To address these challenges, DeepCosmoNet develops a unified segmentation pipeline that integrates specialised AI models for different cosmic structures.

The project delivers openly accessible tools: curated datasets derived from simulations, validated deep learning models, the integrated segmentation pipeline with two branches, and comprehensive catalogues of identified structures.

🧠 Advanced 3D Segmentation

We developed specialised deep learning models for the automatic segmentation of 3D point-cloud data from N-body simulations. The pipeline uses distinct approaches for different cosmic structures, employing Graph Neural Networks for dense sub-halos and 3D convolutional architectures for large-scale cosmic voids.
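As an illustration of the kind of input a 3D convolutional branch consumes, the sketch below voxelises a mock particle point cloud into a density-contrast grid. The particle positions, box size, and grid resolution here are illustrative assumptions, not the project's actual configuration:

```python
import numpy as np

# Mock particle positions in an illustrative 100-unit box
rng = np.random.default_rng(42)
positions = rng.random((50_000, 3)) * 100.0

# Deposit particles onto a 64^3 grid (nearest-grid-point assignment)
grid, _ = np.histogramdd(positions, bins=(64, 64, 64),
                         range=[(0.0, 100.0)] * 3)

# Convert counts to the dimensionless density contrast: rho/rho_mean - 1
density_contrast = grid / grid.mean() - 1.0
print(density_contrast.shape)  # (64, 64, 64)
```

A grid like this, rather than the raw point cloud, is the natural input format for 3D convolutions.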

⚙️ High-Performance Pipeline

The project's core is a unified and scalable segmentation pipeline designed for efficiency. We have implemented significant optimisations, such as using KD-trees to accelerate nearest-neighbour searches, reducing the computational complexity of neighbour queries from quadratic to roughly O(N log N). The entire framework is built for parallelism, unlocking the full potential of HPC systems.
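A minimal sketch of the KD-tree idea, using SciPy's `cKDTree` on mock data (the point count and neighbour count are illustrative; the project's actual implementation may differ):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((10_000, 3))  # mock 3D particle positions

# Build the tree once in O(N log N); each k-nearest-neighbour query then
# costs roughly O(log N) instead of scanning all N points
tree = cKDTree(points)
distances, indices = tree.query(points[:5], k=8)
print(distances.shape, indices.shape)  # (5, 8) (5, 8)
```

Since the query points belong to the dataset, each point's nearest neighbour is itself at distance zero; the remaining columns give its 7 closest companions.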

✅ Scientifically Validated Outputs

The pipeline's performance is rigorously benchmarked against existing ground-truth cosmological catalogues to validate its scientific accuracy. The final result is the creation of comprehensive catalogues documenting the identified structures. These validated results are made accessible to the international scientific community.

Project outputs

  • A Pipeline for Open Science: The core output of our project is a robust and scalable segmentation pipeline designed to accelerate cosmological research. By automating the detection of cosmic structures, this tool removes a significant computational bottleneck, allowing scientists to focus on analysis and discovery. In line with FAIR principles (Findable, Accessible, Interoperable, and Reusable), the pipeline is engineered to produce comprehensive and easily shareable catalogues. This commitment to open science ensures that our results are not only reproducible but also provide a valuable, accessible resource for the entire international scientific community.
  • Validated Structure Catalogues: A key output of the project is the generation of comprehensive catalogues of cosmic structures, specifically focusing on sub-halos and voids identified by our pipeline. These catalogues provide detailed information on the properties of each detected object, validated through rigorous comparison with established cosmological datasets. They are designed to be an invaluable resource for the research community, enabling cross-algorithm analysis and facilitating deeper studies into the large-scale structure of the universe.

Validation & benchmarking

The performance of our models is validated through direct comparison with established cosmological catalogues, such as those generated by SUBFIND. Performance reporting focuses on key detection metrics, including precision and recall, and different clustering metrics such as completeness, purity, and the Adjusted Rand Index, along with error analysis on the properties of the identified structures. This enables a transparent and reproducible evaluation of our segmentation pipeline across different cosmic structure types.
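To make the clustering metrics concrete, here is a hypothetical, self-contained implementation of purity; completeness is the same computation with the roles of the true and predicted labels swapped. The label arrays below are toy data, not project results:

```python
import numpy as np

def purity(true_labels: np.ndarray, pred_labels: np.ndarray) -> float:
    """For each predicted cluster, count the members belonging to its
    dominant ground-truth cluster; purity is that count summed over
    clusters, divided by the total number of points."""
    matched = 0
    for cluster in np.unique(pred_labels):
        members = true_labels[pred_labels == cluster]
        matched += np.bincount(members).max()
    return matched / len(true_labels)

true = np.array([0, 0, 0, 1, 1, 1])  # toy ground-truth memberships
pred = np.array([0, 0, 1, 1, 1, 1])  # toy pipeline output
print(round(purity(true, pred), 3))  # 0.833
```

A purity of 1.0 means every predicted cluster contains points from only one ground-truth structure.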

Methodology at a glance

A unified pipeline links (i) preprocessing of raw N-body data, (ii) inference with specialised deep learning models for segmentation, and (iii) generation of validated cosmic structure catalogues. The workflow is designed to process massive 3D point-cloud datasets and scales to large experiments on HPC resources, so models can be trained, stress-tested, and interpreted in a consistent manner.

Ownership, collaboration & funding

DeepCosmoNet is conceived and developed under the leadership of Koexai S.r.l., which coordinates all project activities and ensures their delivery. The project benefits from the valuable scientific input of Prof. Carmelita Carbone, who contributes expertise in cosmological structures and acts as scientific referee, nominated by INAF (Spoke 3 leader).


  • European Union — NextGenerationEU
  • PNRR — Piano Nazionale di Ripresa e Resilienza
  • ICSC — National Centre for HPC, Big Data and Quantum Computing

Outputs

Explore DeepCosmoNet’s outputs: datasets, source code, publications and diagrams. For downloads and full details, open the Resource Hub.

N-body Simulation

A sample dataset of particles derived from large-scale N-body simulations.

  • Type: Dataset
  • Version: 1.0.0
  • Released: 31/08/2025
  • Files: 1 (full)
  • Full size: 91 MB (.csv)
  • Licence: Non-commercial research
Last updated: 31/08/2025

Sub-halo Catalogue

This catalogue contains sub-halos identified by our custom pipeline for the sample of N-body-derived particles.

  • Type: Dataset
  • Version: 1.0.0
  • Released: 31/08/2025
  • Files: 1 (full)
  • Full size: 314 KB (.csv)
  • Licence: Non-commercial research
Last updated: 31/08/2025

Void Catalogue

This catalogue contains voids identified by our custom pipeline for the sample of N-body-derived particles.

  • Type: Dataset
  • Version: 1.0.0
  • Released: 31/08/2025
  • Files: 1 (full)
  • Full size: 144 KB (.csv)
  • Licence: Non-commercial research
Sample preview
First 20 rows. Columns and units match the full dataset.
Download sample CSV
Checksum (SHA-256): TBC
Documentation
Last updated: 31/08/2025

Requesting the full catalogues

Large files & licensing: full releases are provided for non-commercial research upon a short request.

  1. Tell us your name, affiliation, email, and intended use.
  2. We’ll reply with access instructions and a download link.
info@koexai.com

Contact & Acknowledgements

Project Contact

For resource access requests, collaboration inquiries, or technical questions:

info@koexai.com

Ownership, Scientific Referees & Collaboration

Koexai S.r.l. — Project lead and coordinating body.

Carmelita Carbone — Staff Researcher at INAF-IASF Milan & Scientific Referee

Fondazione Clément Fillietroz — Osservatorio Nazionale Regione autonoma Valle d'Aosta (OAVdA)