Introduction to NCI’s Data Catalogue and Indexing Schemes

Online Event

Event description

We’re hosting a tutorial to introduce the NCI data catalogue and its two indexing schemes: Intake-ESM and Intake-Spark.

A data catalogue helps users discover and access datasets through structured metadata, while indexing improves performance by enabling fast, targeted searches. Built on the Python Intake package, these tools support scalable, memory-efficient access to large datasets. At NCI, Intake-Spark uses Parquet-based indexes for high-performance querying with Spark, while Intake-ESM uses lightweight CSV-based indexes ideal for climate data workflows. 
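To illustrate the idea of a lightweight CSV-based index, here is a minimal sketch using only the Python standard library. It is not the Intake-ESM API or NCI's actual catalogue schema: the column names, file paths, and the `search` helper are all hypothetical, chosen only to show how a search can run against metadata rows without opening any data files.

```python
import csv
import io

# Hypothetical index: one row per data file, metadata columns plus a path.
# Column names and paths are illustrative only, not NCI's real schema.
INDEX_CSV = """variable,frequency,experiment,path
tas,mon,historical,/example/data/tas_mon_historical.nc
tas,day,historical,/example/data/tas_day_historical.nc
pr,mon,historical,/example/data/pr_mon_historical.nc
tas,mon,ssp585,/example/data/tas_mon_ssp585.nc
"""


def search(index_text, **criteria):
    """Return the paths of index rows whose metadata match every criterion."""
    reader = csv.DictReader(io.StringIO(index_text))
    return [
        row["path"]
        for row in reader
        if all(row[key] == value for key, value in criteria.items())
    ]


# Only the small index is scanned; the matching files would be opened later.
matches = search(INDEX_CSV, variable="tas", frequency="mon")
for path in matches:
    print(path)
```

In a real catalogue the search result would then be passed to a loader that opens only the matching files, which is what keeps access memory-efficient on large collections.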

This session will include hands-on Jupyter Notebook examples showing how to use the catalogue in data analysis and machine learning workflows. You’ll learn how to search, load, and filter datasets efficiently from the /g/data collections. 

The tutorial is ideal for researchers working with large-scale data or looking to streamline their pipelines.

If you have any questions regarding this training, please contact training.nci@anu.edu.au.

Prerequisites

    1. Experience with Python.
    2. Experience with bash or similar Unix shells.
    3. A valid NCI account.
    4. Experience using the NCI ARE service is recommended. Relevant documentation is available in the ARE User Guide.

Learning Outcomes

After this training session, you will be able to:

  • Describe NCI data services
  • Understand the NCI data catalogue and its indexing schemes
  • Search, load, and filter datasets efficiently from the /g/data collections
  • Use the data catalogue in data analysis and machine learning workflows


Topics Covered

  • Welcome and Introduction to NCI’s Intake-Spark and Intake-ESM Indexing Schemes
  • Overview of NCI’s Data Catalogue Services
  • Working with the Intake-ESM Indexing Scheme
  • Applying the Intake-ESM Scheme in AI/ML Workflows
  • Using the Intake-Spark Indexing Scheme

This event has passed