Optimus – profile, clean, process and perform machine learning

Optimus is a framework for profiling, cleaning and processing data, and for doing machine learning, in a distributed fashion using Apache Spark (PySpark).

With this library you can prepare, explore and visualize Big Data, and create machine learning models from it.

Optimus is free and open source software.

Features include:

  • Simple and robust – prepare, explore and visualize your data in a few lines of code.
  • Easy, fast, parallelized and scalable data cleansing, exploration and machine learning model creation.
  • Local or in the cloud.
  • Easy-to-use API. Optimus extends the Spark DataFrame functionality by adding .rows and .cols attributes.
  • Connect to external APIs to enrich your data.
  • String clustering – cluster similar strings and replace them with a single value.
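
To make the string clustering feature more concrete, here is a minimal sketch of one common approach, key-collision ("fingerprint") clustering, written in plain Python. It is not Optimus's API and not necessarily the method Optimus uses: the function names `fingerprint` and `cluster_strings` and the sample data are hypothetical, chosen only for illustration.

```python
import re
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Normalize a string into a key that similar spellings share (hypothetical helper)."""
    # Strip accents, lowercase, and drop punctuation.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    text = re.sub(r"[^\w\s]", "", text)
    # Sort the unique tokens so that word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster_strings(values):
    """Map each input string to the most frequent spelling in its cluster."""
    clusters = defaultdict(Counter)
    for value in values:
        clusters[fingerprint(value)][value] += 1

    replacements = {}
    for counts in clusters.values():
        # Use the most common spelling as the single replacement value.
        canonical = counts.most_common(1)[0][0]
        for value in counts:
            replacements[value] = canonical
    return replacements


# Illustrative data only.
cities = ["New York", "new york", "New York", "York, New", "Boston", "boston."]
mapping = cluster_strings(cities)
cleaned = [mapping[c] for c in cities]
```

In this sketch, the four New York variants share one fingerprint and are all replaced by the most frequent spelling. A distributed implementation would apply the same idea per column across a Spark DataFrame rather than over a Python list.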

Website: hi-optimus.com
Support: GitHub Code Repository
Developer: Argenis Leon, Favio Vazquez and contributors
License: Apache License 2.0

Optimus is written in Python.
