scikit-learn is an open source Python module for machine learning built on top of SciPy. It offers efficient implementations of a large number of common algorithms, and has a clean, uniform, and streamlined API with good online documentation.
The software provides various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
scikit-learn is largely written in Python, with some core algorithms written in Cython to optimise performance.
scikit-learn is one of the most useful modules for machine learning in Python.
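As a minimal sketch of what the uniform API means in practice, the example below trains two unrelated classifiers with the same `fit` and `score` calls. It assumes a reasonably recent scikit-learn release on Python 3; the choice of the iris dataset, the 25% test split, and the `random_state` values are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a small bundled dataset as NumPy arrays (features X, labels y).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two very different algorithms, driven through the same interface.
models = (SVC(), RandomForestClassifier(n_estimators=100, random_state=0))

scores = {}
for model in models:
    model.fit(X_train, y_train)  # every estimator exposes fit()
    scores[type(model).__name__] = model.score(X_test, y_test)  # mean accuracy

for name, accuracy in scores.items():
    print("{}: {:.3f}".format(name, accuracy))
```

Swapping in another classifier should only require changing the entries in `models`; the rest of the loop stays the same.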
The software has the following dependencies:
- Python (>= 2.7 or >= 3.4)
- NumPy (>= 1.8.2)
- SciPy (>= 0.13.3)
scikit-learn also uses CBLAS, the C interface to the Basic Linear Algebra Subprograms library. matplotlib is needed to run the supplied examples.
Features include:
- Simple and efficient tools for data mining and data analysis.
- Supervised learning algorithms – great coverage for this type of algorithm, including Generalised Linear Models, Linear and Quadratic Discriminant Analysis, Support Vector Machines, Decision Trees, Bayesian methods, Gaussian Processes, neural network models, and more.
- Cross-validation – various methods to check the accuracy of supervised models on unseen data.
- Unsupervised learning algorithms – again, a wide range of algorithms is available, including Gaussian mixture models, manifold learning, clustering, factor analysis, covariance estimation, density estimation, and more.
- Dataset transformations – provides a library of transformers, which clean, reduce, expand, or generate feature representations.
- Feature extraction – useful for extracting features from images and text.
- Various sample datasets – useful for learning how to use scikit-learn. They include the Boston house prices, iris, diabetes, digits, Linnerud, wine, and breast cancer datasets.
- Built on NumPy, SciPy, and matplotlib.
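Several of the features listed above can be combined in a few lines. The following is a rough sketch, assuming a recent scikit-learn release: it chains a transformer (`StandardScaler`) with a supervised model (`LogisticRegression`) and checks the result with 5-fold cross-validation on the bundled wine dataset. The choice of model, scaler, fold count, and `max_iter` value are illustrative assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One of the sample datasets shipped with scikit-learn.
X, y = load_wine(return_X_y=True)

# Dataset transformation followed by a supervised learner.
# The scaler is re-fitted on each training fold, so no information
# from the held-out fold leaks into the preprocessing step.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Cross-validation: accuracy on five held-out folds.
cv_scores = cross_val_score(pipeline, X, y, cv=5)

print("Per-fold accuracy:", cv_scores)
print("Mean accuracy: {:.3f}".format(cv_scores.mean()))
```

Wrapping the scaler and the model in one pipeline keeps preprocessing inside the cross-validation loop, which gives a more honest estimate of accuracy on unseen data than scaling the whole dataset up front.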