Data Science Process

Here is our process, with rough estimates of how long each phase will last. The estimates are only meant to give a sense of the duration of each phase; once we have more details (after the business domain workshop), we can plan more precisely.

Business insight as a service is an optional phase, intended only for clients who have data and know they could gain more insight from it, but do not yet have a clear vision of how to leverage it.

  1. Business insight as a service (5 - 10 days)
    1. Why: the purpose of this phase is to take the data, derive meaning from it, and produce a report with graphs and possible ideas for implementation
    2. How: Advisory (Data Scientists, Business Analysts, Data Wrangler)
    3. Input: business case explanation, data specification (a document explaining what data is stored and what it means), the data itself, and the type of problem (either "what can be done with our data in terms of this and that?" or "I have this business problem; can you solve it with the data we have, or by enriching it with other data?")
    4. Output: a report with insights, the business impact of the proposed solutions, implementation ideas based on the business case and data, and examples of the inputs and outputs of each idea
  2. Phase 1 - business domain workshop (1 - 3 days)
    1. Why: we need to better understand what we are building and the specifics of the business domain. We also want to build a detailed plan for how we will implement the whole solution. This is a workshop on the insights coming from the previous phase.
    2. How: Advisory On-Site (Data Scientists, Business Analysts, Data Wrangler)
    3. Input: key insights from business insight as a service, or insights the client had before contacting us (in some cases this can be the first phase, provided the client already knows exactly what they want)
    4. Output: an implementation plan with milestones, a timeline, and notes about important business parameters, plus a detailed explanation of which business problem will be solved and how
  3. Phase 2 - Feature engineering (up to 2 weeks)
    1. Why: after the business domain workshop, everything should be clear, and we can extract the important features that will be used in the machine learning algorithms
    2. How: Advisory (Data Scientists)
    3. Input: notes about important business parameters, technical feasibility, and a mapping of other technical requirements
    4. Output: a well-defined set of features for the machine learning algorithms to use
  4. Phase 3 - Demo / prototyping (about a month)
    1. Why: create machine learning algorithms based on the input data set (e.g. historical data) in order to verify the idea and test it before the actual implementation
    2. How: Implementation (Team Approach: Data Scientists, Data Engineer)
    3. Input: Defined set of features
    4. Output: a clickable web demo that demonstrates the main idea as well as machine learning algorithms’ outputs/results
  5. Phase 4 - MVP (a few months, depending on the scope)
    1. Why: after the algorithms are verified and the output results are satisfactory, we can proceed with an MVP (minimum viable product) implementation. This is a team approach that blends data engineers, DevOps and data scientists.
    2. How: Implementation (Team Approach: Data Scientists, Data Engineer, Data Ops)
    3. Input: verified demo with selected algorithms
    4. Output: a functional MVP (first product implementation)
  6. Phase 5 - Development of other features and improvement of the algorithms (ongoing)
    1. Why: Every algorithm can be better, and every application can have more features.
    2. How: Maintenance (Team Approach: Data Scientists, Data Engineer, Data Ops)
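
To give a sense of how the rough estimates above add up, here is a minimal Python sketch that sums the bounded phases up to and including the demo. It is an illustration only, not a planning tool. The names `Phase`, `PHASES` and `estimate_range` are hypothetical. The conversions are our assumptions: a 5-day working week, "up to 2 weeks" read as 0 to 10 working days, and "about a month" read as roughly 20 working days. The MVP and Phase 5 are left open-ended because the process gives no firm numbers for them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Phase:
    """One phase of the process with a rough duration in working days."""
    name: str
    min_days: Optional[int]  # None means the duration is open-ended
    max_days: Optional[int]
    optional: bool = False


# Durations taken from the phase list; conversions to working days are assumptions.
PHASES: List[Phase] = [
    Phase("Business insight as a service", 5, 10, optional=True),
    Phase("Phase 1 - business domain workshop", 1, 3),
    Phase("Phase 2 - feature engineering", 0, 10),   # "up to 2 weeks", assumed 0-10 days
    Phase("Phase 3 - demo / prototyping", 20, 20),   # "about a month", assumed ~20 days
    Phase("Phase 4 - MVP", None, None),              # "a few months", scope dependent
    Phase("Phase 5 - further development", None, None),  # ongoing
]


def estimate_range(phases: List[Phase], include_optional: bool = True) -> Tuple[int, int]:
    """Sum the minimum and maximum working days over all bounded phases.

    Open-ended phases (min_days or max_days is None) are skipped, so the
    result only covers the part of the process that has a rough estimate.
    """
    low = 0
    high = 0
    for phase in phases:
        if phase.optional and not include_optional:
            continue
        if phase.min_days is None or phase.max_days is None:
            continue
        low += phase.min_days
        high += phase.max_days
    return low, high


if __name__ == "__main__":
    low, high = estimate_range(PHASES)
    print(f"Up to the demo, with business insight: {low}-{high} working days")
    low, high = estimate_range(PHASES, include_optional=False)
    print(f"Up to the demo, without business insight: {low}-{high} working days")
```

These totals inherit all the uncertainty of the individual estimates, so they should be revisited once the business domain workshop has produced an implementation plan.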