The Linux Portal Site
		
	Dask is a flexible parallel computing library for analytics. It takes a Python job and distributes it across multiple cores or machines.
		
	Optimus is a framework for profiling, cleaning, and processing data, and for running machine learning, in a distributed fashion with Apache Spark (PySpark).
		
	yt is an open-source Python package for analyzing and visualizing volumetric data. It focuses on driving physically meaningful inquiry.
		
	AWS Data Wrangler extends the pandas library to AWS, connecting DataFrames with AWS data-related services.
		
	Orange is a component-based framework for machine learning and data mining. It includes a range of data visualization and exploration tools.
		
	The Pentaho BI Project provides application software for enterprise reporting, analysis, dashboards, data mining, workflow, and ETL.
		
	Data science is an emerging, multidisciplinary field that combines scientific methods, processes, algorithms, and technology to extract knowledge and insights from structured and unstructured data.