Welcome to Sparkitecture!

Last updated 5 years ago

Created by: Colby T. Ford, Ph.D.

PySpark Edition | A work in progress... | Created using GitBook.com

About

Sparkitecture is a collection of “cookbook-style” scripts for simplifying data engineering and machine learning in Apache Spark.

This is an open source project (GPL v3.0) for the Spark community. If you have ideas or contributions you'd like to add, submit a Feature Request, or write your code/tutorial/page and create a Pull Request in the GitHub repo.

How to Cite

BibTeX

@misc{sparkitecture,
  author = {Colby T. Ford},
  title = {Sparkitecture - {A} collection of "cookbook-style" scripts for simplifying data engineering and machine learning in {Apache Spark}.},
  month = oct,
  year = 2019,
  doi = {10.5281/zenodo.3468502},
  url = {https://doi.org/10.5281/zenodo.3468502}
}

Text Citation

Colby T. Ford. (2019, October). Sparkitecture - A collection of "cookbook-style" scripts for simplifying data engineering and machine learning in Apache Spark (Version v1.0.0). Zenodo. http://doi.org/10.5281/zenodo.3468502