The curious thing about machine learning is how much human effort goes into practicing it. Before we apply sophisticated statistical mathematics and powerful computers to our data, people spend days, weeks, or more transforming and cleansing the data. These data preparation steps are almost always performed "by hand" and usually with little governance.
A similar trend applies to the output. Automated tracking, deployment, or even versioning of the statistical models and their various predictions is unusual.
The good news is that all of these are problems that "ordinary" software development encountered decades ago. We, as an industry, know how and why to improve these details. Automating transformation, testing, and deployment results in a repeatable and reliable process, reduced turnaround time, and better outcomes.
In hindsight, all of this will seem obvious. My posts on this blog will describe how to automate the error-prone drudgery out of business intelligence, so that you have time to think.
I’m one person who is passionate about computers and the way that we people interact with them. I’m here to break down the ivory tower, to show you the insides of the sausage factory. I’m only one person; I don’t scale, and I can only improve one place at a time. So I’ll share my thoughts and techniques, and together we can reduce the waste and pain in business intelligence and improve outcomes everywhere. This matters because the errors in our models increasingly impact the lives of our family, friends, and communities, and those errors are largely human errors, stemming from the sloppy processes of our immature industry.