.. image:: images/change-logo.png
   :width: 120pt
   :target: http://www.change.org
.. raw:: html
At change.org we automate the use of scikit-learn's RandomForestClassifier
in our production systems to drive email targeting that reaches millions
of users across the world each week. In the lab, scikit-learn's ease-of-use,
performance, and overall variety of implemented algorithms have proved invaluable
in giving us a single reliable source to turn to for our machine-learning needs.
.. raw:: html
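As a rough illustration of the kind of workflow described above, the sketch below fits a ``RandomForestClassifier`` and ranks held-out rows by estimated probability. The synthetic data, the "responded" label, and all variable names are assumptions made for this example. They do not describe change.org's actual pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-user features and a binary "responded" label
# (hypothetical data, generated only for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Estimated probability of the positive class for each held-out user.
scores = clf.predict_proba(X_test)[:, 1]

# Indices of held-out users, highest estimated probability first.
ranked = scores.argsort()[::-1]
```

Ranking by ``predict_proba`` rather than by the hard ``predict`` output is one way to pick a top fraction of users to contact. The threshold or fraction would depend on the application.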
`PHIMECA Engineering <http://www.phimeca.com/?lang=en>`_
----------------------------------------------------------
.. raw:: html
.. image:: images/phimeca.png
   :width: 120pt
   :target: http://www.phimeca.com/?lang=en
.. raw:: html
At PHIMECA Engineering, we use scikit-learn estimators as surrogates for
expensive-to-evaluate numerical models (mostly but not exclusively
finite-element mechanical models) for speeding up the intensive post-processing
operations involved in our simulation-based decision-making framework.
Scikit-learn's fit/predict API together with its efficient cross-validation
tools considerably eases the task of selecting the best-fit estimator. We are
also using scikit-learn for illustrating concepts in our training sessions.
Trainees are always impressed by the ease-of-use of scikit-learn despite the
apparent theoretical complexity of machine learning.
.. raw:: html
Vincent Dubourg, PHIMECA Engineering, PhD Engineer
.. raw:: html
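The surrogate-selection idea in this testimonial can be sketched as follows: several estimators are compared with cross-validation on a small set of simulation runs, and the best-scoring one is refit and used for fast predictions. The ``expensive_model`` function, the candidate estimators, and the sample sizes are hypothetical choices for this example, not PHIMECA's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score


def expensive_model(X):
    # Hypothetical stand-in for a costly numerical simulation.
    return np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2


rng = np.random.RandomState(0)
X = rng.uniform(-1.0, 1.0, size=(80, 2))  # small design of experiments
y = expensive_model(X)

# Candidate surrogates sharing the same fit/predict interface.
candidates = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gaussian_process": GaussianProcessRegressor(normalize_y=True),
}

# Mean 5-fold cross-validated R^2 for each candidate.
scores = {
    name: cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    for name, est in candidates.items()
}

# Refit the best-scoring candidate on all available runs.
best_name = max(scores, key=scores.get)
surrogate = candidates[best_name].fit(X, y)

# The surrogate now stands in for the expensive model on new inputs.
X_new = rng.uniform(-1.0, 1.0, size=(1000, 2))
y_pred = surrogate.predict(X_new)
```

Because every estimator exposes the same ``fit``/``predict`` methods, adding another candidate only requires a new entry in the dictionary.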
`HowAboutWe <http://www.howaboutwe.com/>`_
----------------------------------------------------------
.. raw:: html
.. image:: images/howaboutwe.png
   :width: 120pt
   :target: http://www.howaboutwe.com/
.. raw:: html
At HowAboutWe, scikit-learn lets us implement a wide array of machine learning
techniques in analysis and in production, despite having a small team. We use
scikit-learn’s classification algorithms to predict user behavior, enabling us
to (for example) estimate the value of leads from a given traffic source early
in the lead’s tenure on our site. Also, our users' profiles consist primarily
of unstructured data (answers to open-ended questions), so we use
scikit-learn’s feature extraction and dimensionality reduction tools to
translate these unstructured data into inputs for our matchmaking system.
.. raw:: html
Daniel Weitzenfeld, Senior Data Scientist at HowAboutWe
.. raw:: html
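One common way to combine feature extraction with dimensionality reduction on free text is a TF-IDF vectorizer followed by truncated SVD. The sketch below assumes that combination. The testimonial does not say which scikit-learn tools HowAboutWe actually uses, and the sample answers are invented for illustration.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical free-text profile answers (invented for this example).
answers = [
    "I love hiking and cooking on weekends",
    "Weekend hikes, then trying new restaurants",
    "Mostly reading science fiction and playing chess",
    "Chess tournaments and board game nights",
    "Cooking classes and farmers markets",
    "Science fiction movies and long novels",
]

# Sparse TF-IDF term weights, reduced to a few dense components.
text_to_features = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    TruncatedSVD(n_components=3, random_state=0),
)

# One dense row of numeric features per answer.
features = text_to_features.fit_transform(answers)
```

The resulting dense array can be passed to any downstream estimator or similarity computation.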
`PeerIndex <http://www.peerindex.com/>`_
----------------------------------------
.. raw:: html
.. image:: images/peerindex.png
   :width: 120pt
   :target: http://www.peerindex.com/
.. raw:: html
At PeerIndex we use scientific methodology to build the Influence Graph - a
unique dataset that allows us to identify who’s really influential and in which
context. To do this, we have to tackle a range of machine learning and
predictive modeling problems. Scikit-learn has emerged as our primary tool for
developing prototypes and making quick progress. From predicting missing data
and classifying tweets to clustering communities of social media users,
scikit-learn proved useful in a variety of applications. Its very intuitive
interface and excellent compatibility with other Python tools make it an
indispensable tool in our daily research efforts.
.. raw:: html
Ferenc Huszar - Senior Data Scientist at PeerIndex
.. raw:: html
`DataRobot <http://www.datarobot.com>`_
----------------------------------------
.. raw:: html
.. image:: images/datarobot.png
   :width: 120pt
   :target: http://www.datarobot.com
.. raw:: html
DataRobot is building next generation predictive analytics software to make data scientists more productive, and scikit-learn is an integral part of our system. The variety of machine learning techniques in combination with the solid implementations that scikit-learn offers makes it a one-stop-shopping library for machine learning in Python. Moreover, its consistent API, well-tested code and permissive licensing allow us to use it in a production environment. Scikit-learn has literally saved us years of work we would have had to do ourselves to bring our product to market.
.. raw:: html
Jeremy Achin, CEO & Co-founder DataRobot Inc.
.. raw:: html
`OkCupid <https://www.okcupid.com>`_
--------------------------------------
.. raw:: html
.. image:: images/okcupid.png
   :width: 120pt
   :target: https://www.okcupid.com
.. raw:: html
We're using scikit-learn at OkCupid to evaluate and improve our matchmaking
system. The range of features it has, especially preprocessing utilities, means
we can use it for a wide variety of projects, and it's performant enough to
handle the volume of data that we need to sort through. The documentation is
really thorough, as well, which makes the library quite easy to use.
.. raw:: html
David Koh - Senior Data Scientist at OkCupid
.. raw:: html
`Lovely <https://www.livelovely.com>`_
-----------------------------------------
.. raw:: html
.. image:: images/lovely.png
   :width: 120pt
   :target: https://www.livelovely.com
.. raw:: html
At Lovely, we strive to deliver the best apartment marketplace, with respect to
our users and our listings. From understanding user behavior and improving data
quality to detecting fraud, scikit-learn is a regular tool for gathering
insights, predictive modeling and improving our product. The easy-to-read
documentation and intuitive architecture of the API make machine learning both
explorable and accessible to a wide range of Python developers. I'm constantly
recommending that more developers and scientists try scikit-learn.
.. raw:: html
Simon Frid - Data Scientist, Lead at Lovely
.. raw:: html
`Data Publica <http://www.data-publica.com/>`_
----------------------------------------------
.. raw:: html
.. image:: images/datapublica.png
   :width: 120pt
   :target: http://www.data-publica.com/
.. raw:: html
Data Publica builds C-Radar, a new predictive sales tool for commercial and marketing teams.
We extensively use scikit-learn to build segmentations of customers through clustering, and to predict future customers based on the success or failure of past partnerships.
We also categorize companies using their website communication thanks to scikit-learn and its machine learning algorithm implementations.
Ultimately, machine learning makes it possible to detect weak signals that traditional tools cannot see.
All these complex tasks are performed in an easy and straightforward way thanks to the great quality of the scikit-learn framework.
.. raw:: html
Guillaume Lebourgeois & Samuel Charron - Data Scientists at Data Publica
.. raw:: html
`Machinalis <http://www.machinalis.com>`_
-----------------------------------------
.. raw:: html
.. image:: images/machinalis.png
   :width: 120pt
   :target: http://www.machinalis.com
.. raw:: html
Scikit-learn is the cornerstone of all the machine learning projects carried out at
Machinalis. It has a consistent API, a wide selection of algorithms and lots
of auxiliary tools to deal with the boilerplate.
We have used it in production environments on a variety of projects
including click-through rate prediction, `information extraction `_,
and even counting sheep!
In fact, we use it so much that we've started to freeze our common use cases
into Python packages, some of them open-sourced, like
`FeatureForge `_ .
Scikit-learn in one word: Awesome.
.. raw:: html
Rafael Carrascosa, Lead developer
`solido <http://www.solidodesign.com>`_
-----------------------------------------
.. raw:: html
.. image:: images/solido_logo.png
   :width: 120pt
   :target: http://www.solidodesign.com
.. raw:: html
Scikit-learn is helping to drive Moore’s Law, via Solido. Solido creates
computer-aided design tools used by the majority of top-20 semiconductor
companies and fabs, to design the bleeding-edge chips inside smartphones,
automobiles, and more. Scikit-learn helps to power Solido’s algorithms for
rare-event estimation, worst-case verification, optimization, and more. At
Solido, we are particularly fond of scikit-learn’s libraries for Gaussian
Process models, large-scale regularized linear regression, and classification.
Scikit-learn has increased our productivity, because for many ML problems we no
longer need to “roll our own” code. `This PyData 2014 talk `_ has details.
.. raw:: html
Trent McConaghy, founder, Solido Design Automation Inc.
.. raw:: html
`INFONEA <http://www.infonea.com/en>`_
-----------------------------------------
.. raw:: html
.. image:: images/infonea.jpg
   :width: 120pt
   :target: http://www.infonea.com/en
.. raw:: html
We employ scikit-learn for rapid prototyping and custom-made Data Science
solutions within our in-memory based Business Intelligence Software
INFONEA®. As a well-documented and comprehensive collection of
state-of-the-art algorithms and pipelining methods, scikit-learn enables
us to provide flexible and scalable scientific analysis solutions. Thus,
scikit-learn is immensely valuable in realizing a powerful integration of
Data Science technology within self-service business analytics.
.. raw:: html
Thorsten Kranz, Data Scientist, Coma Soft AG.
.. raw:: html
`Dataiku <http://www.dataiku.com>`_
-----------------------------------------
.. raw:: html
.. image:: images/dataiku_logo.png
   :width: 120pt
   :target: http://www.dataiku.com
.. raw:: html
Our software, Data Science Studio (DSS), enables users to create data services
that combine `ETL `_ with
Machine Learning. Our Machine Learning module integrates
many scikit-learn algorithms. The scikit-learn library integrates perfectly
with DSS because it offers algorithms for virtually all business cases. Our goal
is to offer a transparent and flexible tool that makes it easier to optimize
the time-consuming aspects of building a data service, preparing data, and training
machine learning algorithms on all types of data.
.. raw:: html
Florian Douetteau, CEO, Dataiku
.. raw:: html