Data Analytics Service

The Data Analytics Service provides Buyers with the capability to analyse large volumes of business data, transform the data into intelligence and insight, and deliver this intelligence and insight to the Buyer's business processes and business users. The comprehensive Data Analytics Module (hereinafter referred to as DAM) should allow the Buyer to customize notifications for indicators of interest so that they trigger activities/actions.

The Buyers also expect the Service Provider to extract data from all local as well as geographically distributed data sources, where it may exist in different formats, and to transform it and prepare it for analytics, reporting, data mining and visualization. The solution should provide for ad-hoc queries and self-service analytics, including slicing, dicing and drill-down capabilities across different dimensions. The Service Provider is expected to build an analytics framework composed of libraries, utilities, tools and distributed file systems for storing big data sets, splitting them across commodity machines configured in a cluster for parallel processing, high availability and reliability.
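
As an illustration of the cluster-based preparation and slice/dice/drill-down analysis described above, a minimal sketch using PySpark follows; the storage paths, column names and the choice of Spark itself are assumptions made only to keep the example concrete, not a prescribed implementation.

    # Illustrative PySpark sketch: read data in different formats, prepare it,
    # and run a slice/dice/drill-down style aggregation on a cluster.
    # All paths and column names (region, product, sales_amount) are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dam-prep-and-drilldown").getOrCreate()

    # Ingest from distributed storage in different formats (CSV and JSON here).
    csv_df = spark.read.option("header", True).csv("hdfs:///raw/sales_csv/")
    json_df = spark.read.json("hdfs:///raw/sales_json/")

    # Basic preparation: align the columns and union the two sources.
    cols = ["region", "product", "sale_date", "sales_amount"]
    prepared = csv_df.select(*cols).unionByName(json_df.select(*cols))

    # Drill-down style aggregation: cube() produces totals at every combination
    # of the chosen dimensions (overall, per region, per product, per region+product).
    summary = prepared.cube("region", "product").agg(
        F.sum("sales_amount").alias("total_sales"),
        F.count("*").alias("txn_count"),
    )
    summary.show()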

The analytics framework should allow the Buyers to create run-time environments in both open-source (R, Python, etc.) and proprietary technologies for reporting, real-time stream analysis, machine learning, image processing, web crawling, text processing, etc. The analytics framework should be deployable in a traditional on-premise datacenter as well as in the cloud, should support ad-hoc queries pertaining to the module for quick access to real-time information, and should allow users to supply parameters so they can view the data from different perspectives.
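
The parameterised, ad-hoc query capability mentioned above could look something like the sketch below; the dataset, column names and the use of Spark are illustrative assumptions only.

    # Illustrative sketch of a parameterised ad-hoc query: the user supplies a
    # dimension and a date range and views the same data from a different
    # perspective. Dataset and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dam-adhoc-query").getOrCreate()
    sales = spark.read.parquet("hdfs:///warehouse/sales/")   # assumed dataset

    def adhoc_view(dimension: str, start_date: str, end_date: str):
        """Aggregate the measure over a user-chosen dimension and period."""
        return (sales
                .filter(F.col("sale_date").between(start_date, end_date))
                .groupBy(dimension)
                .agg(F.sum("sales_amount").alias("total_sales"))
                .orderBy(F.desc("total_sales")))

    # The same data viewed from two different perspectives:
    adhoc_view("region", "2023-01-01", "2023-03-31").show()
    adhoc_view("product", "2023-01-01", "2023-03-31").show()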

This system must help to increase efficiency, productivity and efficacy by providing intelligent insights into the data for different types of stakeholders.

Data Analytics Services (DAS)

The Service Provider needs to provide a Big Data Analytics platform, exposing Business Intelligence and analytics as a service, which delivers the following functionalities to the Buyers:
Performing Search
Preparation of various dashboard analytics to identify different types of hotspots
Provision to enrich the central data repository and expose the data securely to various stakeholders.
Generation of various types of statistical reports.
Broad Scope of Work

Activities and corresponding Deliverables:

Central Big-Data Repository
1. Building the required clusters with the security components
2. Ensuring external connectivity and testing data ingestion into the system
3. Ingestion of data sets, if required
4. Incremental data ingestion automation from the instances to the cluster
5. Entity recognition of the data being ingested from the content

Repository Security
1. Authenticate and secure the clusters
2. End-to-end log analysis of the access logs of the applications
3. User management restricting external user registrations

Self-Serviceable BI
1. Self-serviceable BI solution for power users to create new reports and dashboards and to expose the dashboards
2. User authentication
3. Exporting dashboards as PDF
4. Drill-down dashboards
5. Search-empowered analytics
6. Geo-spatial analysis
7. Export of reports
8. Interoperable dashboard analysis

Implementation
1. Data ingestion
2. Incremental data automation on timestamp or auto-increment fields (an illustrative sketch follows this table)
3. Data standardization
4. Master data resolution
5. Cleansing of data to regenerate missing or wrong values
6. Record flagging
7. Geo-location integration, if required
8. Acts and sections cleansing and standardization
9. Hive reports designing
10. API exposure to other stakeholders, facilitating interoperability
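
To illustrate the incremental ingestion on timestamp or auto-increment fields listed above, a minimal sketch follows; the source database, table, column names and the use of Spark's JDBC reader are assumptions for illustration only.

    # Illustrative sketch: incremental ingestion driven by a watermark column
    # (a timestamp or auto-increment id). Connection details, table and column
    # names are hypothetical placeholders.
    import json
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("incremental-ingest").getOrCreate()

    WATERMARK_FILE = "/var/dam/last_watermark.json"   # assumed location

    def load_watermark(default="1970-01-01 00:00:00"):
        try:
            with open(WATERMARK_FILE) as fh:
                return json.load(fh)["last_value"]
        except FileNotFoundError:
            return default

    def save_watermark(value):
        with open(WATERMARK_FILE, "w") as fh:
            json.dump({"last_value": value}, fh)

    last_value = load_watermark()

    # Pull only the rows newer than the stored watermark from the source instance.
    incremental = (spark.read.format("jdbc")
                   .option("url", "jdbc:postgresql://source-host/appdb")   # assumed source
                   .option("dbtable",
                           f"(SELECT * FROM transactions "
                           f"WHERE updated_at > '{last_value}') AS delta")
                   .option("user", "reader").option("password", "secret")
                   .load())

    # Append the delta to the cluster's repository and advance the watermark.
    incremental.write.mode("append").parquet("hdfs:///repository/transactions/")
    new_max = incremental.agg(F.max("updated_at")).first()[0]
    if new_max is not None:
        save_watermark(str(new_max))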

Detailed Activities

Activities and their Details:

Data ingestion scope
1. Providing a facility to push messages and listen to events, facilitating a real-time data analytics solution (an illustrative sketch follows this item)
2. Providing the flexibility of consuming the data through an API
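
As a rough illustration of pushing messages and listening to events for real-time analytics, the sketch below uses Apache Kafka via the kafka-python client; the broker address, topic name and message fields are assumptions, and any message bus with publish/subscribe semantics could fill the same role.

    # Illustrative sketch: push messages onto a topic and listen to the same
    # events for real-time processing. Broker, topic and fields are hypothetical.
    import json
    from kafka import KafkaProducer, KafkaConsumer

    BROKER = "broker-host:9092"        # assumed broker address
    TOPIC = "dam-events"               # assumed topic name

    # Producer side: source systems push events as they occur.
    producer = KafkaProducer(
        bootstrap_servers=BROKER,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    producer.send(TOPIC, {"record_id": 101, "event": "record_created"})
    producer.flush()

    # Consumer side: the analytics solution listens to the events in real time.
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )
    for message in consumer:
        event = message.value
        # ... apply streaming analytics / alerting on each event here ...
        print(event)
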
Data Cleansing and Standardization
1. Data ingested into the system shall be de-normalized by resolving the entities through the big-data solution
2. Data shall be cleansed, followed by de-duplication
3. Missing fields/patterns inside the data shall be re-generated/re-calculated, followed by the identification of bad records (see the illustrative sketch below)
4. All intermediate data shall be archived and a separate archival repository shall be maintained
5. Notifications/alerts shall be sent for all data quality rule violations, viz. bad records
6. Geo-location integration
7. Transliteration API integration; the API should be provided by the department.
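
A minimal sketch of de-duplication, regeneration of missing values and bad-record flagging is given below; the column names, fill rules and the use of PySpark are illustrative assumptions rather than the mandated approach.

    # Illustrative sketch: de-duplicate, regenerate missing values and flag
    # bad records. Column names and rules are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dam-cleansing").getOrCreate()
    raw = spark.read.parquet("hdfs:///repository/transactions/")   # assumed input

    # 1. De-duplication on the business key.
    deduped = raw.dropDuplicates(["record_id"])

    # 2. Regenerate missing values where a sensible rule exists
    #    (here: a missing district is looked up from a reference table).
    districts = spark.read.parquet("hdfs:///reference/pincode_district/")  # assumed
    filled = (deduped.join(districts, on="pincode", how="left")
                     .withColumn("district",
                                 F.coalesce(F.col("district"), F.col("ref_district"))))

    # 3. Flag bad records that still violate the data-quality rules,
    #    so notifications/alerts can be raised on them downstream.
    flagged = filled.withColumn(
        "is_bad_record",
        F.col("record_id").isNull() | F.col("district").isNull(),
    )

    flagged.filter("is_bad_record").write.mode("overwrite") \
           .parquet("hdfs:///repository/quarantine/transactions/")
    flagged.filter(~F.col("is_bad_record")).write.mode("overwrite") \
           .parquet("hdfs:///repository/clean/transactions/")
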
Data warehouse
1. A complete data warehouse shall be maintained for the ingested and cleansed data
2. The data warehouse shall be defined with optimal file formats, followed by indexing and dynamic partitioning (see the illustrative sketch below)
3. A set of reports shall be derived from the warehouse
4. The reports shall be made accessible to authorized users on demand
5. High data availability and regular data backups shall be maintained
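
The file-format and dynamic-partitioning point above could be realised along the lines of the sketch below; Parquet, the partition columns and the storage paths are assumptions chosen only to make the example concrete.

    # Illustrative sketch: persist the cleansed data in a columnar file format
    # with dynamic partitioning so that reports only scan the partitions they
    # need. Paths and partition columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dam-warehouse").getOrCreate()

    # Only rewrite the partitions present in the incoming data, not the whole table.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    clean = spark.read.parquet("hdfs:///repository/clean/transactions/")   # assumed input

    warehouse = (clean
                 .withColumn("year", F.year("record_date"))
                 .withColumn("month", F.month("record_date")))

    (warehouse.write
              .mode("overwrite")
              .partitionBy("year", "month")          # dynamic partitioning
              .parquet("hdfs:///warehouse/transactions/"))

    # A derived report read straight from the warehouse, pruning by partition.
    monthly = (spark.read.parquet("hdfs:///warehouse/transactions/")
                    .filter("year = 2023")
                    .groupBy("month")
                    .agg(F.count("*").alias("records")))
    monthly.show()
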
High-performance data discovery on large data-sets; designing data matching algorithms across various pillars
1. Designing the search over the complete data, with a provision to expose the datasets through a full-text search API to other stakeholders:
   1. Identifying the searchable fields
   2. Identifying the aggregate fields
   3. Designing the solution for data search and dynamic aggregations
   4. Designing the search to handle misspelled records
   5. Designing of the views
   6. Integrating the application and the solution
   7. Highlighting the searched results matching the search parameters
   8. Sorting the search results by relevance and other attributes, as decided by the department
   9. Fine-tuning of the results
   10. Testing the application
   11. Deploying the solution
2. The search shall dynamically populate the filters, facilitating the end user in choosing the potential fields (an illustrative search sketch follows this list)
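
One way the full-text search, misspelling-tolerant matching, result highlighting and dynamic aggregations could fit together is sketched below using the Elasticsearch Python client; the index name, field names and the choice of Elasticsearch itself are assumptions for illustration.

    # Illustrative sketch: full-text search with fuzzy (misspelling-tolerant)
    # matching, result highlighting, relevance sorting and a facet aggregation
    # that can drive dynamic filters. Index and field names are hypothetical.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://search-host:9200")   # assumed cluster endpoint

    response = es.search(
        index="dam-records",                        # assumed index
        query={
            "match": {
                "description": {
                    "query": "finanical fraud",     # misspelled on purpose
                    "fuzziness": "AUTO",            # tolerate misspelled records
                }
            }
        },
        highlight={"fields": {"description": {}}},  # highlight matched terms
        aggs={
            "by_district": {"terms": {"field": "district.keyword"}}  # dynamic filter facet
        },
        size=10,
    )

    for hit in response["hits"]["hits"]:            # already sorted by relevance
        print(hit["_score"], hit["highlight"]["description"])
    for bucket in response["aggregations"]["by_district"]["buckets"]:
        print(bucket["key"], bucket["doc_count"])
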
Data Analysis – Visualization & Reporting
1. Analysis of the datasets
2. Identifying the possible reports
3. Designing the required visualizations
4. Designing the interactive dashboard
5. Authenticating access to the dashboard
6. The solution should be a self-serviceable BI with role-based authentication
7. The designed solution shall be able to schedule the reports and send alerts/notifications
8. The provided solution should have drill-down capability to the record level during analysis
9. Exporting of dashboards as PDF
10. Search-empowered analytics
11. Geospatial analytics
12. Time-series analytics
13. Interoperable dashboards for deep insights (an illustrative sketch of a scheduled time-series report exported as PDF follows this list)
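
To make the scheduled, exportable reporting point concrete, here is a minimal sketch that aggregates a time series and writes the chart to a PDF; the input file, column names and the pandas/matplotlib stack are assumptions, not the required tooling. A scheduler such as cron or Airflow (also an assumption) could run this on a fixed cadence and attach the PDF to a notification.

    # Illustrative sketch: monthly time-series aggregation rendered to a PDF,
    # as a stand-in for "export dashboard as PDF" / time-series analytics.
    # File names and columns are hypothetical.
    import pandas as pd
    import matplotlib
    matplotlib.use("Agg")                       # headless rendering for a scheduled job
    import matplotlib.pyplot as plt
    from matplotlib.backends.backend_pdf import PdfPages

    records = pd.read_parquet("warehouse_extract.parquet")   # assumed extract
    records["record_date"] = pd.to_datetime(records["record_date"])

    # Time-series analytics: number of records per month.
    monthly = (records.set_index("record_date")
                      .resample("MS")
                      .size()
                      .rename("record_count"))

    # Export the "dashboard" view as a PDF report.
    with PdfPages("monthly_report.pdf") as pdf:
        fig, ax = plt.subplots(figsize=(8, 4))
        monthly.plot(ax=ax, marker="o")
        ax.set_title("Records per month")
        ax.set_xlabel("Month")
        ax.set_ylabel("Record count")
        pdf.savefig(fig)
        plt.close(fig)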