Open Source Automated Data-Wrangling


Automunge automates the practice of data wrangling, preparing structured data sets for the direct application of machine learning algorithms in the framework of a user’s choice. Taking as input an arbitrary “tidy” structured set (one feature per column and one observation per row), the tool fully automates the numerical encoding and normalization of dataframe feature sets, allowing for the training of predictive algorithms as well as the subsequent consistent processing of data to generate predictions from a trained model. Automunge is not yet a replacement for feature engineering; however, it is suitable as a replacement for the final steps of data preparation prior to the application of machine learning.
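
To make the idea of "consistent processing" concrete, here is a minimal sketch in plain Python of the general pattern: encoding parameters are learned once from the training set and then reused unchanged on later data. This is not the Automunge API. The function names `fit_encoder` and `apply_encoder`, the sample columns, and the choice of z-score normalization and one-hot encoding are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def fit_encoder(rows):
    """Learn per-column encoding parameters from training rows (a list of dicts).

    Hypothetical helper for illustration only, not part of Automunge.
    """
    params = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric column: store mean and standard deviation for z-scoring.
            sd = pstdev(values)
            params[col] = ("numeric", mean(values), sd if sd > 0 else 1.0)
        else:
            # Categorical column: store the categories seen in training.
            params[col] = ("categorical", sorted(set(values)))
    return params


def apply_encoder(rows, params):
    """Encode rows using parameters learned earlier, so every set is treated alike."""
    encoded = []
    for r in rows:
        out = {}
        for col, spec in params.items():
            if spec[0] == "numeric":
                _, mu, sd = spec
                out[col] = (r[col] - mu) / sd
            else:
                # One-hot encode; a category unseen in training gets all zeros.
                for cat in spec[1]:
                    out[f"{col}_{cat}"] = 1 if r[col] == cat else 0
        encoded.append(out)
    return encoded


train = [
    {"age": 20, "color": "red"},
    {"age": 30, "color": "blue"},
    {"age": 40, "color": "red"},
]
test = [{"age": 30, "color": "green"}]

params = fit_encoder(train)                  # learned from training data only
train_encoded = apply_encoder(train, params)
test_encoded = apply_encoder(test, params)   # same parameters reused on new data
```

Because the test row is encoded with the training set's mean, standard deviation, and category list, a model trained on `train_encoded` sees inputs in the same form at prediction time.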


Our Blog

We have been documenting the development of this tool via regular essays published on Medium.


Automunge solves the problem of data set encoding which in mainstream practice often requires a manual address. Further, the tool automates the prediction of infill values for missing points in a set using machine learning models trained on the rest of a set in a fully generalized and automated fashion.

— Nicholas Teague
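
As a rough illustration of infill by prediction, the sketch below fills a missing value using a model fit on the rows where that value is present. It is an assumption-laden toy: the source does not say which models Automunge trains, so a one-feature least-squares line stands in for them, and the names `fit_line`, `ml_infill`, and the sample columns are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def ml_infill(rows, target, feature):
    """Return copies of rows with missing `target` values predicted from `feature`.

    Hypothetical helper: the model is trained only on rows where `target` is known.
    """
    known = [r for r in rows if r[target] is not None]
    a, b = fit_line([r[feature] for r in known], [r[target] for r in known])
    return [
        dict(r, **{target: a + b * r[feature]}) if r[target] is None else dict(r)
        for r in rows
    ]


rows = [
    {"sqft": 1000, "price": 200.0},
    {"sqft": 1500, "price": 300.0},
    {"sqft": 2000, "price": None},   # missing point to be filled
    {"sqft": 2500, "price": 500.0},
]
filled = ml_infill(rows, target="price", feature="sqft")
```

The original rows are left untouched; only the returned copies carry the predicted values.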


Full documentation and download instructions are available on GitHub at GitHub.com/AutoMunge

# to install for Python:
pip install Automunge

Follow us on Twitter at @automunge!