2022-01-24
International Business Machines Corporation (IBM) is an American multinational technology corporation headquartered in Armonk, New York. It sells computer hardware, middleware and software, and employs over 370,000 people.
IBM acquired Red Hat in 2019, but the company’s open source history goes back much further. IBM was one of the earliest champions of open source, backing influential communities like Linux, Apache, and Eclipse, and advocating open licenses, open governance, and open standards.
IBM also collaborates with Linux organisations. For example, IBM works with Ubuntu in areas like containers, virtualization, Infrastructure-as-a-Service, big data analytics and DevOps to provide reference architectures, support solutions and cloud offerings, both for enterprise data centres and cloud service providers.
The company is involved in many open source projects. For example, they helped to create the Apache Software Foundation, and were also a founder member of the OpenJS Foundation, responsible for the development of the Node.js platform, Appium, Dojo, jQuery and many other products.
There are also many IBM software products published under a proprietary license. This series looks at free and open source alternatives to IBM’s products.
IBM SPSS Modeler is a data mining and text analytics software application. The program is used to build predictive models and conduct other analytic tasks.
While SPSS Modeler is available for Linux, it’s proprietary software. What are the best free and open source alternatives?
1. RapidMiner
RapidMiner (formerly known as YALE) is a flexible Java environment for knowledge discovery in databases, machine learning, and data mining. It allows experiments to be made up of a large number of arbitrarily nestable operators, described in XML files which are created with RapidMiner’s graphical user interface.
It features an easy-to-use visual environment, a plugin mechanism, and high-dimensional plotting, as well as an extension mechanism that makes it possible to integrate new operators and adapt the system to your personal requirements. The modular operator concept of RapidMiner allows the design of complex nested operator chains for a huge number of learning problems in a very fast and efficient way (rapid prototyping).
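To make the idea of nestable operators described in XML more concrete, here is a minimal sketch in Python. The XML layout, the `kind` attribute and the operator names are invented for illustration and are not RapidMiner’s actual file schema; the point is only that an operator can contain other operators to any depth, and that a program can walk that tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical process description: NOT RapidMiner's real file schema,
# only an illustration of operators nested inside other operators.
PROCESS_XML = """
<operator name="Root" kind="chain">
  <operator name="LoadData" kind="source"/>
  <operator name="CrossValidation" kind="chain">
    <operator name="TrainModel" kind="learner"/>
    <operator name="ApplyModel" kind="applier"/>
  </operator>
</operator>
"""


def flatten(node, depth=0):
    """Return (depth, name) pairs for an operator and all nested operators."""
    rows = [(depth, node.get("name"))]
    for child in node.findall("operator"):
        rows.extend(flatten(child, depth + 1))
    return rows


def outline(xml_text):
    """Render the operator tree as an indented text outline."""
    root = ET.fromstring(xml_text.strip())
    return "\n".join("  " * depth + name for depth, name in flatten(root))


if __name__ == "__main__":
    print(outline(PROCESS_XML))
```

Running the script prints the operator chain as an indented outline, with the two hypothetical operators inside `CrossValidation` shown one level deeper than their parent.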
There are three editions of RapidMiner available. Only the free edition is released under an open source license, and it’s limited to 10,000 rows and 1 logical processor.
2. Orange
Orange is a component-based framework for machine learning and data mining. It includes a range of data visualization, exploration, preprocessing and modeling techniques. It can be used through an attractive graphical user interface or, alternatively, as a module for the Python programming language.
For exploratory data analysis, it provides a visual programming framework with an emphasis on interactions and creative combinations of visual components.
3. KNIME
KNIME is a coherent and comprehensive open source visual platform for data integration, processing, analysis, reporting and exploration.
It enables users to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models.
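The notion of executing only part of a data flow can be sketched in a few lines of Python. This is a toy model, not KNIME’s API: the `Workflow` class, the `build_flow` helper and the node names are all hypothetical. Each node names the nodes it depends on, and executing a node runs only that node and whatever upstream nodes it still needs, caching the results so they can be inspected afterwards.

```python
class Workflow:
    """Toy data flow: named nodes, each with upstream inputs and a function."""

    def __init__(self):
        self.nodes = {}    # name -> (function, list of upstream node names)
        self.results = {}  # name -> cached output of nodes already executed

    def add(self, name, func, inputs=()):
        self.nodes[name] = (func, list(inputs))

    def execute(self, name):
        """Run one node plus any upstream nodes that have not run yet."""
        if name not in self.results:
            func, inputs = self.nodes[name]
            args = [self.execute(upstream) for upstream in inputs]
            self.results[name] = func(*args)
        return self.results[name]


def build_flow():
    """Assemble a small hypothetical flow with two independent branches."""
    flow = Workflow()
    flow.add("read", lambda: [3, 1, 4, 1, 5, 9, 2, 6])
    flow.add("filter", lambda rows: [r for r in rows if r > 2], ["read"])
    flow.add("mean", lambda rows: sum(rows) / len(rows), ["filter"])
    flow.add("maximum", lambda rows: max(rows), ["read"])
    return flow


if __name__ == "__main__":
    flow = build_flow()
    print(flow.execute("filter"))  # runs "read" and "filter" only
    print(sorted(flow.results))    # nodes executed so far
    print(flow.execute("mean"))    # reuses the cached "filter" output
```

In this sketch, asking for `filter` leaves the `mean` and `maximum` nodes untouched until they are requested, which mirrors the idea of selectively executing some analysis steps and inspecting intermediate results later.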
