Warehousing and Analytics

Overview

The Warehouse and Analytics Service Offering provides agencies with professional development services targeted at enhancing the value of data. Included in the service offering below are:

  • Extract, transform and load (ETL) design and development
  • Warehouse design and development
  • Data analytics tool sets
  • Report development services
  • Data quality services
  • Consultation services around hosting, deployment, data governance and data strategy

All development and consultation services are time and materials rate based that are dependent on the type of resources required for both project and maintenance services. Platforms and infrastructure are rate-based services. See the on the bill section of this service offering for examples.

IT Projects often involve more work from business processes than IT processes, and these parts of the project are often much more intensive and difficult to manage. Significant investiture in time from both OIT and partner agencies is necessary to ensure that the warehouse is providing the right data in the right structure. This cannot be done in a vacuum by either party. Before embarking on the implementation of a solution, it is important for IT and agency partners to work in consultation to identify the appropriate solution. There are several reasons why a data warehouse may be appropriate, including:

  • Report performance
  • Reducing load on operational databases
  • Increased opacity to combining data sources
  • Opportunity to improve data quality without impacting operational databases

However, there are circumstances where a data warehouse may not be the most appropriate solution, even when the need to integrate different data sets exists:

  • Emphasis on ad hoc reporting/data discovery
  • Data mining/analytics
  • Inability to identify appropriate measures

Definitions

Product Owner The Agency Partner representative with the authority to determine the business objectives of an application and the priority of the product features that are developed.
Product Manager The OIT representative who works with the Agency Partner whose responsibilities include that the business objectives of an application are met.
Data Custodian The agency designee that determines what is going into the Data Warehouse and the access rules of that data.
Data Warehouse A large store of data accumulated from a wide range of sources within a company and used to guide management decisions typically structured for rapid reporting.
Data Lake A large store of raw, unstructured data for analytics and various other purposes, without requiring predefined schemas.
Analytics Toolsets Software and programs used by data analysts to collect, store, analyze, and interpret data to derive meaningful insights and make informed decisions.
ETL Extract, Transform and Load – the process of transporting and transforming data from a source system to the target data warehouse or data lake
Data Quality The degree to which data is accurate, complete, consistent, reliable, and relevant for its intended use, ensuring it meets the requirements and expectations of its user
Data Governance A collection of policies, processes, roles, and metrics that ensure the effective and efficient use of information, ensuring data is secure, private, accurate, and usable throughout its lifecycle
Least Privilege An information security concept which maintains that a user or entity should only have access to the specific data, resources and applications needed to complete a required task.

SLA

The most common point of failure in a warehouse is the extract, transform, and load process. ETL issues are covered by the standard production published service level agreement can be viewed at the following link: Standard_SLA CTS Production Services.

Because the needs of warehousing and the accompanying analytics platform are quite varied, each constituent piece of the solution is managed by its own SLA. For instance, if a warehouse is created in an Oracle environment and Tableau is used for the analytics service, the SLA for Oracle databases and Tableau are relevant. However, if SQL Server and Power BI are utilized, their SLAs are relevant instead.

Customer Expectations

A shared understanding of roles is required for any successful project. The table below identifies the responsibilities of participants in a typical EDS project. While each item is distinct from another, they are each dependent on each other, and all require collaboration from each side of the house, IT and the requesting agency. The interplay of these functions illustrates the need for a close working relationship between IT and agencies throughout the entire process.

Services Requested What Enterprise Data Services Provides What other Application Teams Provide What Agency Provides
Extract from Source System to Target Data Warehouse / Data Lake
  • Development of data integrations for provided source data
  • Data quality checks as required.
  • Staging data in warehouse
  • Transformation based on rules
  • Monitoring of ETL performance, success, and failure
  • ETL issue resolution and coordination with Source App team
  • Documentation of ETL processes
  • Documentation of source to target data map
  • Source code management
  • Follow Change Management Policy and Application Deployment Certification Policy
  • Extraction of source data either through a service or consistent file layout on a regular schedule.
  • Trouble shooting of extract file or services when failed.
  • Documentation of extract processes
  • Subject matter expertise of the data contained in the source system.
  • Source code management of source system programming to present or extract source data.
  • Deployment certification of changes to extraction or presentation of source systems

Requirements including:

  • Identification of data required.
  • Frequency and schedule of extracts
  • Quality required.
  • Any transformation rules.
  • Perform user acceptance testing.
  • Acceptable Use Information & Rules
  • Data Sharing Agreements
  • Correct data in the system of record and not the warehouse.
Warehouse / Data Lake design
  • Design and data model of the warehouse / data lake based on best practices for data warehouses. This may be a simple data lake for basic data reporting or a complex Kimball model for more advanced warehouse analytics.
  • Apply security rules to the data (least privilege access).
  • Maintenance on the warehouse
  • Documentation on the warehouse
  • Source code management
  • Change management & Deployment certification
N/A
  • Requirements based on the types of questions they will ask of the data. What facts and dimensions are required of the data model.
  • Define security rules.
  • Perform user acceptance testing
Data Analysis Tools
(Tableau, PowerBI, or Oracle Analytics)
  • Provision a data analytics front end.
  • Development of physical, logical and presentation models within the analysis tool.
  • Apply security rules to presentation of the data
  • Provide access to training sites and tools.
  • Maintenance on the analytics tool
  • Change Management & Deployment certification.
  • Documentation
N/A
  • Training individual Agency staff on the use of the analysis tool
  • Development of reports, dashboards, and visualizations
  • Articulation of requirements including Role Based Access Controls that describe what data needs to be presented to whom.
  • Perform user acceptance testing
Reporting
  • Develop reports based on requirements.
  • Develop a data model that can produce repeatable results based on the fact and dimensions developed from requirements.
  • Follow normal deployment certification practices.
  • Documentation of requested work in a work ticketing system.
  • Source code management.
  • Provision read only access to the warehouse or data lake
N/A
  • Develop read only queries for one-time ad hoc reporting.
  • Perform user acceptance testing Define meaningful KPIs including business rules supporting their development (see design)
  • Develop reports using analysis tools and narrative that explains data to annual report readers

Data Quality

Outlier Detection

  • Data profiling and data transformation / correction based on established rules.
  • Report on outliers using requirements determined by the agency.
  • Provide results back for remediation
  • Develop large scale data correction programming for correcting source system data based on requirements.

Follow normal deployment certification practices.

  • Correct data in source system using normal interfaces.
  • Document statistical methods in requirements language
  • Document outliers and provide analysis of their relevance
Data Analysis
  • Analysis of data for the sole purpose of staging and modeling it into an appropriate for use data model.
N/A
  • Review Data
  • Spot Trends
  • Develop analysis that explains data in a meaningful way to data consumers
Data Governance
  • Follow the guidelines of the Chief Data Officer
N/A
  • Follow the guidelines of the Chief Data Officer
  • Establish and document data sharing and access rules
Hosting platforms & Analytic Tools
  • Provide consultation services on hosting and analytic tools available from MaineIT
N/A
  • Consult with MaineIT on available tools.
Testing, Deployment & Change Management

Follow Change Management Policy and
Application Deployment Certification Policy
Run the following tests as deemed appropriate by the application director and the CIO.

  • Accessibility Test
  • Data Conversion and Migration Test
  • Interfaces Test
  • Security Test
  • Performance Test
  • Restoration Test
  • Regression Test
  • End-to End Test

Follow Change Management Policy and
Application Deployment Certification Policy

For any changes to source systems as deemed appropriate by the application director and the CIO.

Follow Change Management Policy and
Application Deployment Certification Policy

  • Perform User Acceptance Testing prior to deployment.
Being a member of an
Agency’s data Analytics Team

Provide coordination of development activities including:

  • Backlog grooming
  • Budget development
  • Supervision of technical staffIssue resolution / escalation
  • Technology Roadmap communication
  • Technology planning
N/A
  • Prioritize agency needs for technology services.
  • Articulate and advocate for appropriate budget for technology needs.
  • Articulate future technology for planning purposes
Project
  • Assist in the development of a business case with the PMO.
  • Engage with the PMO in identifying potential projects.
  • Provide & Manage resources to projects to conduct previously outlined services
Participation as subject matter experts of the source data
  • Engage with the PMO in the execution of projects.
  • Supply subject matter expertise.
  • Supply user acceptance testing

To get help, or to order this service

For a new warehouse project contact OITEnterpriseDataServices@maine.gov to get started.

A ticket in the Enterprise Ticketing System is required for all non-production work requests. All production work requests require an authorized RFC. If applicable, a billing code is required for those items that are not part of the base published rate.

If the published Service Level Agreement is not met, issues can be escalated to the next priority level by contacting any of the following individuals:

  • The Enterprise Data Services Director, EDS lead of your project, or your friendly Account Managers.

Priority Levels for Monitoring

  • The standard production published service level agreement can be viewed at the following link: Standard SLA Maine IT Production Services.
  • Standard business hour coverage is 7:00AM – 5:00PM Monday through Friday, excluding holidays, please contact MaineIT Operations.
  • If service is required for non-production systems outside of the standard business hours, prior arrangements will be required with the director of this service area and associated fees will apply

On the Bill

For staff time the service category use is either Personnel Services or Personnel Services – Non State Resource. Infrastructure could be SQL Database Services, Oracle Database Services, Storage or another service such as Tableau.