1. Workspaces
• After getting the data, we need an environment where we can explore the data and our ML models.
• For instance, we'd like to explore:
  – Data completeness → stability (data today is similar to data from, say, a year ago), availability, freedom from bias (e.g. positive feedback loops)
  – Pretrained models → transfer learning (embedding layers)
  – Explainability → Shapley values, LIME, DeepLIFT
  – Model types → layered, ensemble, AutoML
  – Feature importance
• In addition to exploration, we'd also like to:
  – Leverage our team resources:
    * Team packages
    * Direct collaboration
  – Manage the environment:
    * Individualized exploration (i.e. an individualized environment)
    * Production-ready for serving predictions
  – Have HDFS and Spark access
  – Have asynchronous support:
    * Training, hyperparameter tuning, evaluation
  – Have data access governance:
    * Protected data
• The tools out there (at the time of writing, late 2022) are mostly:
  – JupyterHub:
    * Amazon SageMaker Studio
    * Google Colab
    * Azure ML Workspace
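The stability criterion mentioned above (data today resembling data from a year ago) is the kind of check one might run first in such a workspace. The following is a minimal sketch, not a prescribed method: it assumes the Population Stability Index (PSI) as the drift measure, which the notes do not name, and it uses synthetic Gaussian samples in place of real feature values. The function name, bin count, and `eps` floor are all illustrative choices.

```python
import math
import random


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two numeric samples, binned on the reference sample's range.

    Illustrative helper: equal-width bins are an assumption, quantile bins
    are another common choice.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant reference sample

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


# Synthetic stand-ins for one feature sampled a year ago and today.
rng = random.Random(0)
last_year = [rng.gauss(0.0, 1.0) for _ in range(5000)]
today_same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
today_shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(last_year, today_same)
psi_shifted = population_stability_index(last_year, today_shifted)
print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

A PSI near zero suggests the feature's distribution has not moved much, while a clearly larger value flags a shift worth investigating before training on the combined data. In practice the check would be repeated per feature, and the alert threshold is something each team would have to pick for itself.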