What is the role of a chief data scientist?

In this article we are going to discuss the roles and responsibilities of a chief data scientist. Chief data scientists are typically highly experienced professionals who combine the skills of a programmer, big data engineer, UI developer and mathematician, with years of experience in core software development.

The Chief Data Scientist, or CDS for short, is a job title in the IT field. The role is responsible for designing, developing, delivering and maintaining large-scale machine learning solutions that involve managing huge volumes of data. The roles and responsibilities of a chief data scientist vary according to the organization's needs, but the goal of delivering large-scale machine learning solutions remains the same. Depending on the organization, the role may or may not involve actual coding, testing and delivery. In a small organization, the chief data scientist designs, codes, tests and delivers the solution with the help of a small team. In a big organization with a large team, the chief data scientist is mainly involved in designing the system and in the managerial side of the work.

In this article we are going to explore the core responsibilities of a chief data scientist. Even if you work in a big organization and are not involved in hands-on work, you must have prior hands-on experience in all the technologies used in data science. The chief data scientist is a senior role in the industry, and data science projects involve huge investment, so you will be responsible for the outcome of the project. You have to manage the team and work in such a way that the team delivers the product at its maximum capacity.

The chief data scientist is the final decision maker on the algorithms and methodology used to deliver a project. You should have a solid understanding of all the algorithms, methodologies, data ingestion, data storage, data processing logic and visualization technologies used in data science projects. Prior experience in Big Data and programming technologies is a must to become a highly productive chief data scientist.

The chief data scientist role involves understanding the client's business requirement, converting it into a mathematical problem, solving that problem with programming and machine learning, and finally presenting the result to the client in dashboard format. This role therefore requires interaction with the client and stakeholders. Software project management and team management skills are a must for this role.
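
As a rough illustration of turning a business requirement into a mathematical problem, the sketch below treats a hypothetical question ("how many units will we sell if we raise ad spend?") as a least-squares line fit. The data, the variable names and the choice of a simple linear model are all assumptions made for this example; a real project would use the client's data and whatever model suits it.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: monthly ad spend (in thousands) and units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(ad_spend, units_sold)

# The "result for the client": a forecast for a planned spend of 6.0.
forecast = slope * 6.0 + intercept
print(f"Forecast units at spend 6.0: {forecast:.1f}")
```

The printed forecast is the kind of number that would end up on a dashboard; the fitting step in the middle is where the business question becomes a mathematical one.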

You must have experience in designing, coding, testing and delivering various machine learning models to production. These skills are required because, if a model is not performing as expected, you need to work with your team to tune it and get the best results.
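
Tuning can take many forms. A minimal sketch of one of them is shown here: choosing a classifier's decision threshold by checking accuracy on held-out validation data. The scores, labels and candidate thresholds are made up for illustration, and the function names are hypothetical; in practice the same loop is usually run over several hyperparameters at once.

```python
def accuracy(threshold, scores, labels):
    """Fraction of examples classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def tune_threshold(candidates, scores, labels):
    """Return the candidate threshold with the highest validation accuracy."""
    return max(candidates, key=lambda t: accuracy(t, scores, labels))


# Hypothetical validation set: model scores and the true labels.
val_scores = [0.10, 0.35, 0.40, 0.65, 0.80, 0.90]
val_labels = [0, 0, 1, 0, 1, 1]

candidates = [0.3, 0.5, 0.7]
best = tune_threshold(candidates, val_scores, val_labels)
print(f"Best threshold: {best}")
```

The important design point is that the threshold is chosen on validation data rather than on the data the model was trained on, so the reported accuracy is less likely to be optimistic.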

In a small organization you have to be hands-on, while in a big organization such as a global bank the role shifts to about 95% admin-related work. In a big organization you have to lead a team of reporters, forecasters, data scientists and data modelers, so team management, project management and client handling skills are required along with data science skills. Here you may have to work with the team, or do the hands-on work yourself, to quickly create a POC (proof of concept) for the client.

In bigger organizations the role includes administrative tasks such as ensuring data integrity and compliance. You should make sure that the data science processes followed for the projects are robust and genuinely useful for the business. You will be handling a big team and reporting to higher management. Here you will be working as VP of the engineering team or, we can say, VP of data science.

Job description for chief data scientists

Companies usually look for highly talented and experienced data scientists with 17+ years of experience in the IT field, including 5 or more years in data science. For this role you must have experience in applied AI and ML, deep learning, ANN and CNN platforms, and large-scale AI system integration. Prior experience at an MNC or a research organization is a must for this position.

Job responsibilities of chief data scientists

  • The chief data scientist must lead and mentor a team of data scientists and data engineers.
     
  • The chief data scientist must drive the company's machine learning / deep learning initiatives in all areas.
     
  • You should be able to design and deploy machine learning algorithms for consumer and commercial products.
     
  • This role requires one to collaborate with data and subject matter experts to seek, understand, validate, interpret, and correctly use new data elements.
     
  • You will collaborate with engineering teams to develop prototypes and software products as per client specifications. Machine learning and deep learning solutions must be developed to solve the actual business problem.
     
  • Key tasks of a project and delivery manager, such as managing stakeholders' expectations, working with business users to gather requirements, resolving business rule requirements, designing the ML solution, creating a POC for the proposed solution and performing joint conceptual data model reviews.
     
  • Define the Enterprise Data Strategy, which involves interaction with multiple systems, disparate sources, databases and global teams, and addresses varying data needs based on volume, variety and velocity.
     
  • The enterprise strategy should cover Data Management and Architecture, Enterprise Information Management (EIM), Master Data Management (MDM), Meta Data Management, Quality and Data Governance strategies, methodologies, guidelines and standards.
     
  • Design and develop the enterprise-level conceptual, logical, and physical data models.
     
  • The chief data scientist role requires both hands-on technical expertise and strategic problem-solving skills to achieve the goals of the enterprise.
     
  • The chief data scientist should be able to define the overall strategic and tactical Big Data roadmap for the design, development and implementation of the enterprise data warehouse (EDW) and its associated data stores, so that they can be used for ML/DL activities.
     
  • Should be well versed in the range of existing Big Data technologies and data modelling techniques. One should be able to design the complete data model, ingestion pipeline, data pre-processing pipeline, data cleansing strategy and, finally, the system for large-scale data analysis.
     
  • Project and team management skills are also very important for the chief data scientist role. One should be able to manage a team of lead architects, data modelers or data scientists while supporting multiple projects.
     
  • The chief data scientist should be able to work with various project development methodologies, including agile and waterfall.
     
  • The candidate should be able to build new data sets, enhance existing data sets and design data structures when required. Prior experience in managing the distribution, replication and archiving of data throughout the enterprise is a must-have skill.

Required skills for chief data scientists

The following skills are required for this job role:

  • Prior experience in statistical, mathematical and predictive modeling to solve business problems.
     
  • Experience with reporting tools and software packages for communicating findings visually to stakeholders.
     
  • Good experience in, and theoretical knowledge of, Artificial Neural Networks (ANN), AI chatbots, CNN and NLP. Experience in different programming languages can be an added advantage for a chief data scientist.
     
  • Chief data scientists must have experience in natural language processing, machine learning, deep learning, artificial intelligence, conceptual modeling, statistical analysis, predictive modeling and hypothesis testing.
     
  • Should be able to apply deep expertise in ANN / deep learning / machine learning / NLP to solve specific, complex business problems.
     
  • Experience with an AWS data lake (Amazon Elastic Compute Cloud (EC2), AWS Data Pipeline, S3, DynamoDB NoSQL, Relational Database Service (RDS), Elastic MapReduce (EMR), Amazon Redshift, Kinesis, Amazon Machine Learning and AWS Lambda) is also required for the chief data scientist post.
     
  • An understanding of concepts such as MPP databases, NoSQL storage (e.g., MongoDB), graph databases, data warehouse design, BI reporting and dashboard development is also required.
     

Required educational qualification for chief data scientists

  • B.Tech / M.Tech / M.Tech (5-year integrated) in CSE / Maths & Computing / Data Science / Machine Learning from a top IIT, and/or a Ph.D. (AI, Machine Learning, Deep Learning, Data Science) from a top-rated technical university. This is not a strict limit, however: a science graduate who has all of these skills and experience can also get this job at multinational companies.
     
  • MS or PhD in Computer Science, Electrical Engineering, Statistics, or equivalent fields.
     
  • Experience across machine learning, big data and programming technologies.
     
  • Strong verbal and written communication skills in English.

Required programming skills

  • The candidate should be well versed in design patterns and software engineering principles.
     
  • Responsible for developing applications using artificial intelligence / machine learning technology and for application analysis. The candidate should be able to understand the latest industrial and academic developments in AI/ML.
     
  • Must have experience in big data, mobility and cloud. Hands-on experience with Hadoop on Hortonworks is preferred, but Cloudera, MapR or plain Apache Hadoop with Pig, Hive, MapReduce, Flume, Kafka, Spark etc. is also acceptable.
     
  • Should have prior experience in designing competitive AI/ML services and creating prototypes for demonstration.
     

Machine learning skills

One should have experience of many projects using various machine learning, deep learning and artificial intelligence technologies.

  • Well versed in machine learning, statistics and regression, and in programming languages and frameworks such as Python, Scala, R, MATLAB, Java, Spark and TensorFlow.
     
  • The candidate should have hands-on experience in data science and machine learning techniques such as linear regression, logistic regression, random forest, support vector machines, ANOVA/ANCOVA, optimization techniques, time series modelling, segmentation, decision trees, clustering, recommendation engines and forecasting.
     
  • Candidates must have a solid understanding of artificial intelligence technologies, including knowledge representation & reasoning (KRR), natural language processing (NLP), speech recognition, unsupervised machine learning, and/or reinforcement learning.
     
  • Experience in statistical modeling / data mining algorithms such as:
    o Multivariate regression, logistic regression, clustering algorithms, support vector machines, decision trees, etc.
    o Machine learning or graph mining
    o Design of experiments (DOE), forecasting, segmentation, uncertainty analysis, etc.
    o Data mining, e.g. text mining and classification methods (SVM, NN, etc.)
    o Vector space model for unstructured text
    o Sentiment analysis, association mining, semantic analysis
     
  • The required key skill sets include text mining, R, machine learning, statistical modeling, logistic regression, data mining, Python, analytics, data visualization, segmentation and data science.

In this article we have discussed the roles and responsibilities of a chief data scientist. You can learn many technologies on our website. Here are the links: