Available courses

The History of Data

FREE Intro | ~2 hours

From punched cards to lakehouses and AI-ready data platforms: understand the evolution of data and why the future is starting now.

What you'll gain (free entry point):

  • ✅ Key milestones in data technology (1940–2025)
  • ✅ Why relational databases, Hadoop, cloud, and lakehouses emerged
  • ✅ The shift from batch → real-time → AI-driven data
  • ✅ Lessons from the past you can apply today
  • ✅ Context for modern architectures (Lakehouse, Semantic, etc.)

Ideal as a first step for anyone serious about data engineering or architecture.

Outline (~2 hours total):

  • The early days: punched cards & mainframes
  • The relational revolution & SQL
  • Big Data & the Hadoop era
  • Cloud, data lakes & warehouses
  • The rise of Lakehouse, AI & semantic data
  • Summary & what it means for you

Start completely free – no credit card, no catch. Discover the history behind your future work.

Lakehouse Architectures for Data Engineers

Advanced | 8–12 hours

Build modern, scalable data platforms that combine the flexibility of a data lake with the reliability of a data warehouse in a single system. Hands-on with the patterns and tools used today by leading organizations.

What you'll learn (hands-on + assessments):

  • ✅ Medallion architecture (bronze/silver/gold) in practice
  • ✅ Delta Lake, Apache Iceberg & Apache Hudi – comparison & implementation
  • ✅ Unity Catalog for catalog management & governance
  • ✅ Real-time & batch processing on a single platform
  • ✅ Performance tuning, schema evolution & time travel
  • ✅ Migration strategies from classic warehouses/lakes
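
To give a feel for the bronze/silver/gold idea from the list above, here is a deliberately tiny, illustrative sketch in plain Python. Real lakehouse pipelines use Spark with Delta Lake or Iceberg tables; the field names and sample records here are made up for illustration.

```python
# Medallion-architecture sketch: raw events flow bronze -> silver -> gold.
# Illustrative only; a production pipeline would use Spark + Delta/Iceberg.

raw_events = [
    {"user": "a", "amount": "10.5", "ts": "2024-01-01"},
    {"user": "b", "amount": "bad",  "ts": "2024-01-01"},  # malformed record
    {"user": "a", "amount": "4.5",  "ts": "2024-01-02"},
]

# Bronze: land the data as-is, adding only ingestion metadata.
bronze = [{**e, "_ingested": True} for e in raw_events]

# Silver: validate and type the data, dropping records that fail parsing.
silver = []
for e in bronze:
    try:
        silver.append({"user": e["user"], "amount": float(e["amount"]), "ts": e["ts"]})
    except ValueError:
        pass  # a real pipeline would quarantine this record instead

# Gold: business-level aggregate (total spend per user).
gold = {}
for e in silver:
    gold[e["user"]] = gold.get(e["user"], 0.0) + e["amount"]

print(gold)  # {'a': 15.0} -- the malformed 'b' record was filtered out at silver
```

The point of the layering: each stage has one job (land, clean, aggregate), so bad data is caught at silver instead of silently corrupting downstream reports.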

Perfect for data engineers and architects who refuse to build outdated platforms.

Outline (8–12 hours total):

  • Intro to Lakehouse concept & benefits
  • Medallion architecture in detail
  • Delta Lake & Iceberg deep-dive
  • Unity Catalog & governance
  • Performance, security & best practices
  • Real-world assessments & case studies
  • Final review & next steps

Ready to future-proof your data platform? Start now with the Assessment Edition.

Semantic Ontologies & Metadata at Scale

Advanced | ~16–20 hours

Build AI-ready knowledge graphs and semantic layers that make your metadata scalable, validatable, and future-proof.

What you'll learn (hands-on):

  • ✅ RDF, OWL & SPARQL – from basics to advanced queries
  • ✅ Protégé + owlready2 for ontology design & Python integration
  • ✅ SHACL for data validation & DCAT/OpenLineage metadata standards
  • ✅ Semantic Governance: centralized vs federated models
  • ✅ HybridRAG, Knowledge Mesh & semantic lineage in AI pipelines
  • ✅ Enterprise-grade data contracts & ontologies
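
As a taste of the RDF/SPARQL material in the list above, here is a miniature, illustrative triple store in plain Python. Real work uses rdflib, Protégé, and a SPARQL engine; the `ex:` namespace and resource names here are invented for the example.

```python
# Tiny triple-store sketch of the RDF ideas covered in the course.
# Illustrative only; production systems use rdflib / SPARQL endpoints.

triples = {
    ("ex:Course", "rdf:type",   "ex:LearningResource"),
    ("ex:Course", "ex:teaches", "ex:SPARQL"),
    ("ex:SPARQL", "rdf:type",   "ex:QueryLanguage"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "SELECT ?o WHERE { ex:Course ex:teaches ?o }" in miniature:
print(match(s="ex:Course", p="ex:teaches"))
```

Everything else in the course (OWL reasoning, SHACL validation, federated governance) builds on this same subject–predicate–object model, just at enterprise scale.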

Perfect for data engineers, architects & AI specialists who want to tame complex metadata and make their platforms AI-ready.

Outline (~16–20 hours total):

  • Intro & RDF/OWL/SPARQL basics
  • Metadata standards (OpenLineage, DCAT, SHACL)
  • Ontology building & querying (Protégé, owlready2)
  • Advanced governance & HybridRAG
  • Semantic layers for AI pipelines & knowledge mesh
  • Data contracts + SHACL in practice
  • Final review & next steps

Ready to make your data semantic? Start now and unlock real-world applications.