CPA Business and Data Analytics pdf notes and Revision

KASNEB CPA BUSINESS AND DATA ANALYTICS NOTES

KASNEB CPA ADVANCED LEVEL

PRACTICAL PAPER NO. CA 35P:

BUSINESS AND DATA ANALYTICS (COMPUTER-BASED EXAMINATION).

NOTIONAL HOURS: 240
Recommended tool: Excel, R UNIT DESCRIPTION

This course is aimed at enabling the candidate to use information technology to support decision making through business analytics. The candidate is expected to demonstrate digital competency in preparation and analysis of financial statements, forecasting and related areas in data analytics.
PREREQUISITE
To attempt this paper, a candidate shall be required to have passed all other examination papers within the CPA qualification.
Candidates will be required to have core knowledge of quantitative techniques, financial accounting and reporting and financial management. Candidates are also expected to have knowledge in their specialisation areas of management accounting, audit, tax and public financial management.
The paper will be attempted over three hours in a controlled, computerized environment (examination centres with computer laboratories).
1.0 LEARNING OUTCOMES
A candidate who passes this paper should be able to:

Discuss fundamental aspects of big data and data analytics from the CRISP (cross- industry standard process for data mining) framework, data visualisation and emerging issues.
Apply data analytics in preparation of financial statements, financial statements analysis and forecasting, carrying out sensitivity/scenario analysis and presenting financial data and metrics using dashboards.
Apply data analytics in financial management principles that include time value of money analysis, evaluate capital projects, carry out sensitivity/scenario analysis and present information using dash boards.
Apply data analytics in management accounting to estimate product costs, breakeven analysis, budget preparation, sensitivity/scenario analysis and flexible budgets.
Apply data analytics in auditing techniques including key financial trends, fraud detection, tests of control, model reviews and validation issues.
Apply data analytics in estimating tax payable and in public sector financial management.

CONTENT

1.0 Introduction to Excel

– Utilising the keyboard shortcuts in Excel
– Conducting data analysis using data tables, pivot tables and other common functions
– Using advanced formulas and functions to enhance the functionality of financial models

2.0 Introduction to data analytics

2.1 The CRISP (cross-industry standard process for data mining) framework for data analytics

Data concepts – conceptual, logical, physical data models
Stages in data lifecycle: identifying data sources, modeling data requirements, obtaining data, recording data, using data for making business decision, removing data

2.2 Big data and data analytics

Definition of big data
The 5Vs of big data
Types of data analytics: descriptive analytics, prescriptive analytics and predictive analytics

2.3 Tools for data analytics

Data cleaning tools (Alteryx, SSIS, Datastage, others)
Data Management (Storage/DBA): SQL, Oracle, Cloud Computing (AWS,AZURE), others
Reporting/Visualization : Excel, PowerBI, Tableau, Microstrategy, others

2.4 Data visualization in Excel

Definition of data visualization
Benefits of data visualization
Types of visualization; comparison, composition and relationships
Qualities of good data visualization

3.0 Core application of data analytics

3.1 Financial accounting and reporting

Prepare financial statements; statement of profit or loss, statement of financial position and statement of cash flow for companies and groups
Analyse financial statements using ratios, common size statements, trend and cross-sectional analysis, graphs and charts
Prepare forecast financial statements under specified assumptions
Carry out sensitivity analysis and scenario analysis on the forecast financial statements
Data visualization and dash boards for reporting

3.2 Financial Management

Time value of money analysis for different types of cash flows
Loan amortization schedules
Project evaluation techniques using net present value – (NPV), internal rate of return (IRR)
Carry out sensitivity analysis and scenario analysis in project evaluation
Data visualisation and dashboards

4.0 Application of data analytics in specialised areas

4.1 Management accounting

Estimate cost of products (goods and services) using high-low and regression analysis method
Estimate price, revenue and profit margins
Carry out break-even analysis
Budget preparation and analysis (including variances)
Carry out sensitivity analysis and scenario analysis and prepare flexible budgets

4.2 Auditing

Analysis of trends in key financial statements components
Carry out 3-way order matching
Fraud detection
Test controls (specifically segregation of duties) identifying combinations of users involved in processing transactions
Carry out audit sampling from large data set
Model review and validation issues

4.3 Taxation and public financial management

Compute tax payable for individuals and companies
Prepare wear and tear deduction schedules
Analyse public sector financial statements using analytical tools
Budget preparation and analysis (including variances)
Analysis of both public debt and revenue in both county and national government
Data visualisation and reporting in the public sector

5.0 Emerging issues in data analytics

Skepticism and challenges in data analytics
Ethical issues in data analytics
Data Security / Data Protection
Performance (Limitations within analytic tools)

1.0 INTRODUCTION TO EXCEL Utilising the keyboard shortcuts in Excel

Keyboard shortcuts in Excel can save you a lot of time and help you work more efficiently. Here are some of the most commonly used Excel keyboard shortcuts:

Selecting Cells:

To select the entire column: Press Ctrl +Space
To select the entire row: Press Shift + Space
To select the entire worksheet: Press Ctrl + A
To select multiple non-adjacent cells: Press Ctrl + click on the cells

Copying and Pasting:

To copy a cell or a range of cells: Press Ctrl + C
To paste copied cell(s): Press Ctrl + V
To paste copied cell(s) with formatting: Press Ctrl + Alt + V

Undo and Redo:

To undo the last action: Press Ctrl + Z
To redo the last undone action: Press Ctrl + Y

Inserting and Deleting:

To insert a new row or column: Press Ctrl + Shift ++
To delete a row or column: Press Ctrl + –
To delete the contents of a cell: Press Delete

Formatting:

To apply bold formatting to selected text: Press Ctrl + B
To apply italic formatting to selected text: Press Ctrl + I
To apply underline formatting to selected text: Press Ctrl + U
To display the Format Cells dialog box: Press Ctrl + 1

Navigation:

To move to the next worksheet: Press Ctrl +Page Down
To move to the previous worksheet: Press Ctrl+ Page Up
To move to the first cell of the worksheet: Press Ctrl +
To move to the last cell of the worksheet: Press Ctrl +

These are just a few of the many keyboard shortcuts available in Excel. By using these shortcuts, you can work more efficiently and save time.

Conducting data analysis using data tables, pivot tables and other common functions

Excel is a powerful tool for data analysis, and there are several features that can help you to analyse your data more efficiently. Here are some commonly used features for data analysis:

1. Data Tables: Data tables are a useful tool for performing what-if analysis. You can use data tables to calculate multiple results based on different input values. To create a data table, go to the “Data” tab and select “What-If Analysis” and then “Data Table”. You can then enter the input values and the formula to calculate the result.
2. Pivot Tables: Pivot tables are a powerful tool for summarizing and analysing large data sets. To create a pivot table, select the data you want to analyse and go to the “Insert” tab and select “PivotTable”. You can then drag and drop fields into the “Rows”, “Columns”, and “Values” areas to create a summary of your data.
3. Common Functions: There are several common functions that can be used to analyse data in Excel. Some of the most commonly used functions include:
• SUM: Calculates the sum of a range of cells.
• AVERAGE: Calculates the average of a range of cells.
• COUNT: Counts the number of cells in a range that contain numbers.
• MAX: Finds the maximum value in a range of cells.
• MIN: Finds the minimum value in a range of cells.
• IF: Tests a condition and returns one value if the condition is true and another value if the condition is false.
4. Charts: Charts can be a useful tool for visualizing your data. Excel offers several different types of charts, including column charts, line charts, pie charts, and more. To create a chart, select the data you want to visualize and go to the “Insert” tab and select the type of chart you want to create.

By using these features, you can analyse your data more efficiently and gain valuable insights into your business or organization.

Using advanced formulas and functions to enhance the functionality of financial models

Excel is a powerful tool for financial modelling, and using advanced formulas and functions can enhance the functionality of financial models. Here are some ways to do so:

SUMIFS, AVERAGEIFS, and COUNTIFS: These functions allow you to sum, average, or count data based on multiple criteria. For example, you can use SUMIFS to sum all the sales for a particular product in a specific region.
INDEX and MATCH: INDEX returns the value of a cell in a specified range based on its row and column numbers, while MATCH returns the position of a value in a specified range. Together, they allow you to look up a value in a table based on certain criteria.
VLOOKUP and HLOOKUP: These functions allow you to look up a value in a table based on a matching value in the first column or row, respectively. VLOOKUP is commonly used to retrieve data from a master data set into another data set.
IF, AND, OR: These functions allow you to create conditional formulas. For example, you can use IF to assign a value to a cell based on whether a certain condition is true or false.
PMT: This function calculates the payment for a loan based on the interest rate, loan amount, and number of payments.
NPV and IRR: These functions are used to calculate the net present value and internal rate of return of a series of cash flows. These are commonly used in financial analysis to evaluate the profitability of an investment.
CHOOSE: This function returns a value from a list of values based on a specified index number. It can be used to create dynamic drop-down lists that change based on user input.

By utilizing these advanced formulas and functions, financial models can become more accurate and efficient, allowing for better analysis and decision-making.

INTRODUCTION TO DATA ANALYTICS

THE CRISP FRAMEWORK FOR DATA MINING
The CRISP (Cross-Industry Standard Process for Data Mining) framework is a widely used and accepted standard for data analytics and data mining. It provides a structured approach for carrying out a data analytics project from start to finish, helping to ensure that the project is completed on time, within budget, and with the desired results. The framework consists of six phases:

Business understanding: In this phase, the objectives of the data analytics project are defined and the business problem that needs to be solved is identified. The requirements of the project are also determined, including the resources needed, the scope of the project, and the expected outcomes.
Data understanding: This phase involves collecting and exploring the data that will be used in the analysis. This includes data collection, data cleaning, and data integration. The goal of this phase is to ensure that the data is clean, complete, and consistent.
Data preparation: In this phase, the data is prepared for analysis. This includes selecting the relevant variables, transforming the data into a suitable format for analysis, and selecting the appropriate statistical methods and algorithms.
Modelling: This phase involves developing and validating predictive models using the data prepared in the previous phase. The goal of this phase is to create a model that accurately predicts the outcome of interest.
Evaluation: In this phase, the performance of the model is evaluated using various metrics and techniques. This includes testing the model on a validation dataset, assessing the model’s accuracy and reliability, and determining whether the model meets the business requirements.
Deployment: In the final phase, the model is deployed and integrated into the business process. This includes implementing the model in a production environment, monitoring its performance, and providing ongoing support and maintenance.

The CRISP framework provides a clear and structured approach to data analytics projects, ensuring that all aspects of the project are considered and that the results are actionable and relevant to the business.

Data concepts – conceptual, logical, physical data models
Conceptual, logical, and physical data models are different levels of abstraction used in data modelling to represent data in a structured and organized way.

A. Conceptual Data Model: It is the highest level of abstraction in data modelling, and it represents the concepts and relationships between different entities in a business environment. It focuses on the business requirements and defines the scope of the system. It is independent of any technology or physical implementation and provides a clear understanding of the business concepts.
B. Logical Data Model: The logical data model is the next level of abstraction, and it represents the business requirements in terms of entities, attributes, relationships, and constraints. It describes the data at a level that is independent of any specific database management system or data storage technology. It defines the structure of the data and provides a roadmap for how the data should be organized, managed, and accessed.
C. Physical Data Model: The physical data model is the lowest level of abstraction and describes how the logical data model will be implemented in a specific database management system. It defines the database schema, including tables, columns, data types, indexes, and relationships, that will be used to store and manage the data. It is specific to a particular technology and takes into account the physical characteristics of the system, such as storage capacity, performance, and security.

In summary, the conceptual data model provides a high-level view of the business requirements, the logical data model provides a detailed view of the data structure, and the physical data model provides the details of how the data is physically stored and accessed in a specific system.

DATA LIFECYCLE
The data lifecycle consists of several stages that data goes through from its initial creation to its eventual retirement or deletion. The stages of the data lifecycle are as follows:

Identifying data sources: This involves identifying the sources of data, such as databases, files, sensors, and other sources, that are relevant to the organization’s needs.
Modelling data requirements: In this stage, the organization defines its data requirements, including the types of data needed, the format of the data, and the relationships between different data elements.
Obtaining data: This stage involves acquiring the data from the identified sources, which may involve collecting new data or obtaining existing data from external sources.
Recording data: Once the data has been obtained, it needs to be recorded and stored in a way that is secure, accurate, and easily accessible. This stage involves organizing the data and entering it into the appropriate systems or databases.
Using data for making business decisions: This stage involves analysing the data to gain insights that can be used to make informed business decisions. This may involve using data visualization tools, statistical analysis, or machine learning algorithms.
Removing data: As data becomes outdated, irrelevant, or no longer needed, it may need to be removed from the system. This stage involves retiring or deleting the data in a way that complies with data privacy regulations and organizational policies

In summary, the data lifecycle involves identifying data sources, modelling data requirements, obtaining data, recording data, using data for making business decisions, and removing data when it is no longer needed. By following a structured approach to managing data, organizations can ensure that they have access to the information they need to make informed decisions and comply with regulations.

Big Data and Data Analytics
Definition of Big Data
Big data refers to extremely large and complex data sets that are beyond the capabilities of traditional data processing tools to capture, store, manage, and analyse. The data sets can come from a variety of sources, such as social media, sensors, mobile devices, and transactions.

The 5Vs of big data are:

Volume: The sheer size of big data sets, which can range from terabytes to petabytes and beyond.
Velocity: The speed at which data is generated, processed, and analysed. Big data often requires real-time or near real-time processing.
Variety: The different types of data, such as structured, unstructured, and semi- structured data. Big data can include text, audio, video, images, and other data types.
Veracity: The quality and accuracy of the data, including the potential for errors, inconsistencies, and bias. Big data may require extensive data cleaning and validation.
Value: The insights, knowledge, and business value that can be extracted from big data. Big data analytics can help organizations make informed decisions, identify trends, and predict outcomes.

Types of Data Analytics
There are three main types of data analytics:

Descriptive analytics: Descriptive analytics involves analysing past data to gain insights and understand what has happened. It is used to summarize historical data and identify patterns, trends, and correlations. Descriptive analytics includes techniques such as data visualization, data mining, and reporting.
Predictive analytics: Predictive analytics involves using statistical models and machine learning algorithms to analyse historical data and make predictions about future events or trends. It is used to identify patterns and relationships in data that can be used to make predictions. Predictive analytics includes techniques such as regression analysis, time series analysis, and predictive modelling.
Prescriptive analytics: Prescriptive analytics involves using data and models to determine the best course of action or decision. It is used to optimize decision-making providing recommendations or options for action. Prescriptive analytics includes techniques such as simulation, optimization, and decision trees.

Each type of analytics has its own strengths and limitations, and organizations may use a combination of all three to gain a comprehensive understanding of their data and make informed decisions.

Tools for data analytics

Data Cleaning
Data cleaning, also known as data cleansing, is the process of identifying and correcting or removing errors, inconsistencies, and inaccuracies in data to ensure that it is accurate, complete, and reliable for analysis. It is a critical step in the data preparation process that helps improve the quality of data and reduce the risk of making incorrect decisions based on flawed data.

The data cleaning process typically involves the following steps:

Data profiling: This step involves analysing the data to identify patterns, trends, and potential errors. Data profiling helps understand the data and the potential issues that need to be addressed.
Data validation: This step involves verifying the accuracy and completeness of the data. It includes checking for missing data, duplicate data, and invalid data values.
Data standardization: This step involves converting data into a standardized format to ensure consistency and comparability. It includes standardizing date formats, numerical values, and categorical data.
Data enrichment: This step involves adding additional data to improve the quality and accuracy of the existing data. It includes merging data from different sources and enriching the data with additional information.
Data transformation: This step involves transforming the data to a format that is suitable for analysis. It includes converting data types, aggregating data, and calculating new variables.

Overall, data cleaning is a critical step in the data preparation process that ensures the accuracy and reliability of data for analysis. It requires careful attention to detail and the use of appropriate tools and techniques to ensure that the data is of high quality.

Data Cleaning tools
There are several data cleaning tools available in the market, including:

Alteryx: Alteryx is a data preparation and analytics platform that allows users to automate data cleaning processes. It includes a range of tools for data profiling, data quality, data blending, and data transformation.
SQL Server Integration Services (SSIS): SSIS is a data integration and ETL (Extract, Transform, Load) tool provided Microsoft. It includes a range of data cleaning and transformation features, including data profiling, data quality, data scrubbing, and data enrichment.
DataStage: DataStage is a data integration and ETL tool provided IBM. It includes a range of data cleaning and transformation features, including data profiling, data quality, data scrubbing, and data enrichment.
Open Refine: OpenRefine is a free, open-source tool for cleaning and transforming data. It includes features for data cleansing, data normalization, and data transformation.
Trifacta: Trifacta is a cloud-based data preparation and analytics platform that includes a range of tools for data cleaning, transformation, and enrichment. It includes features for data profiling, data quality, data scrubbing, and data enrichment.

These are just a few examples of data cleaning tools available in the market. Each tool has its own set of features and capabilities, and organizations should evaluate their specific needs before selecting a tool.

Data Management

Data management refers to the process of organizing, storing, protecting, and maintaining data throughout its lifecycle. It involves the implementation of policies, procedures, and technologies to ensure that data is accurate, complete, and accessible for authorized users when and where it is needed.

The key components of data management include:

Data governance: Data governance involves the development of policies and procedures for data management. It includes defining data ownership, data quality standards, and data security measures.
Data architecture: Data architecture involves the design and implementation of a framework for organizing and storing data. It includes the selection of data storage technologies, data modelling, and data integration.
Data quality management: Data quality management involves the implementation of processes and technologies to ensure that data is accurate, complete, and consistent. It includes data profiling, data cleansing, and data validation.
Data security: Data security involves the implementation of measures to protect data from unauthorized access, modification, or destruction. It includes access control, encryption, and disaster recovery.
E. Data privacy: Data privacy involves the protection of personal and sensitive data from unauthorized access or disclosure. It includes compliance with data protection laws and regulations.

Overall, effective data management is critical for ensuring the accuracy, integrity, and security of data. It helps organizations make informed decisions, improve operational efficiency, and meet regulatory requirements.

Data Management Tools

Data Management involves storing, protecting and maintaining data throughout its lifecycle. Here are some of the popular storage and DBA (Database Administration) technologies:

SQL: SQL (Structured Query Language) is a standard programming language for managing relational databases. It is used to create, modify, and query databases. Popular SQL-based database management systems include MySQL, PostgreSQL, and Microsoft SQL Server.
Oracle: Oracle is a database management system that uses the SQL language for managing relational databases. It includes features for data security, data replication, and data backup and recovery.
Cloud computing: Cloud computing provides on-demand access to a shared pool of computing resources over the internet. Popular cloud computing platforms for data storage and management include Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. These platforms offer a range of database services, such as Amazon RDS, Azure SQL Database, and Google Cloud SQL.
MongoDB: MongoDB is a document-oriented NoSQL database that stores data in JSON- like documents. It is designed for handling unstructured and semi-structured data, making it ideal for use cases such as content management, social networking, and IoT.
Cassandra: Cassandra is a distributed NoSQL database that is designed for handling large volumes of data with high availability and performance. It is commonly used for use cases such as real-time analytics, IoT, and social media.

These are just a few examples of storage and DBA technologies available in the market. Each technology has its own strengths and limitations, and organizations should evaluate their specific needs before selecting a technology.

Reporting/Visualization: Excel, PowerBI, Tableau, Microstrategy, others
Reporting and visualization tools are crucial for presenting data in a way that is easy to understand and allows stakeholders to make informed decisions. Here are some popular reporting and visualization tools:

Excel: Excel is a widely used spreadsheet software that allows users to create charts, graphs, and pivot tables to analyse and visualize data. It is a powerful tool for data analysis and can be used to create basic reports and visualizations.
Power BI: Power BI is a cloud-based business analytics service Microsoft that provides interactive visualizations and business intelligence capabilities. It can connect to multiple data sources and can be integrated with other Microsoft tools such as Excel and SharePoint.
Tableau: Tableau is a powerful data visualization tool that allows users to create interactive dashboards and reports. It has a user-friendly interface that enables users to drag and drop data and create visualizations without the need for coding.
MicroStrategy: MicroStrategy is a comprehensive business intelligence platform that provides advanced analytics, data discovery, and mobile reporting capabilities. It allows users to create custom reports and visualizations using data from multiple sources.
QlikView/Qlik Sense: QlikView and Qlik Sense are data visualization and discovery tools that allow users to create interactive dashboards and reports. They have a patented in- memory data indexing technology that enables users to explore data without the need for predefined queries.

These are just a few examples of popular reporting and visualization tools available in the market. Each tool has its own strengths and limitations, and organizations should evaluate their specific needs before selecting a tool.

2.4 Data visualization in Excel
Definition of data visualization:
Data visualization is the process of presenting complex data sets in a visual and easily understandable format. It involves using various graphical representations such as charts, graphs, maps, and other visual aids to highlight patterns, trends, and relationships within the data.

Benefits of data visualization:
Data visualization has several benefits, including:

Enhanced understanding: Visual representation of data makes it easier for individuals to comprehend and interpret complex data sets.
Better decision-making: Data visualization enables individuals to identify patterns and trends quickly, making it easier to make informed decisions.
Improved communication: Data visualization provides a common language for communicating data insights to others, making it easier to share findings and collaborate with team members.
Increased engagement: Data visualization enhances engagement making data more interactive, visually appealing, and easy to understand.

Types of visualization; comparison, composition, and relationships:
There are three main types of data visualization:

A. Comparison: Comparison charts are used to compare values across different categories, such as bar charts, column charts, and pie charts.
B. Composition: Composition charts show how different parts make up the whole, such as a stacked bar chart or a pie chart.
C. Relationships: Relationship charts highlight the relationship between different variables, such as scatter plots, bubble charts, and heat maps.

Qualities of good data visualization:
Effective data visualization should have the following qualities:

Clarity: The visualization should be easy to understand and convey the intended message clearly.
Accuracy: The visualization should accurately represent the data and avoid any misleading information.
Simplicity: The visualization should be simple and concise, without any unnecessary elements.
Relevance: The visualization should be relevant to the audience and their needs.
Interactivity: Interactive elements, such as hover-over text or zooming, can enhance engagement and understanding.
Aesthetics: The visualization should be visually appealing and use appropriate colour schemes, font sizes, and styles.

Emerging issues in data analytics

Scepticism and challenges in data analytics
Scepticism and challenges in data analytics are common due to the following reasons:

Data quality issues: The quality of data used for analytics can be a challenge, as it may be incomplete, inaccurate, or biased, leading to incorrect conclusions and decisions.
Data privacy concerns: The collection and use of personal data for analytics can raise privacy concerns, and it is essential to ensure that data is collected and analysed ethically and securely.
Data complexity: Big data is complex, with large volumes of data from various sources, which can be difficult to manage and analyse.
Statistical inference: The results obtained from data analysis are based on statistical inference, which can be misleading if the assumptions and models used are incorrect or misinterpreted.
Human bias: Human bias can influence the interpretation and analysis of data, leading to incorrect conclusions.
Interpretation and communication: The interpretation and communication of data analysis results can be a challenge, as it requires clear and concise communication to ensure that the results are correctly understood the audience.
Rapid changes in technology: Rapid changes in technology can make it challenging to keep up with the latest analytical tools and techniques.

To address these challenges, it is important to practice scepticism when analysing data, questioning assumptions, and being mindful of potential biases. Additionally, it is crucial to have a clear understanding of the data, use appropriate statistical models and methods, and ensure that data privacy and security are prioritized. Effective communication of results and ongoing training and education are also critical to ensuring the proper use and interpretation of data analysis.

Ethical issues in data analytics
Ethical issues in data analytics are becoming increasingly important as organizations gather and analyse large amounts of data. The following are some of the key ethical issues in data analytics:

Privacy: The collection and use of personal data for analytics can raise privacy concerns, particularly when individuals are not aware of how their data is being used or when their data is used without their consent.
Bias: Data analytics can introduce bias when data is collected and analysed in a way that does not represent the full population or when algorithms are used to make decisions based on biased data.
Discrimination: Data analytics can be used to discriminate against individuals or groups based on factors such as race, gender, or age.
Transparency: The use of data analytics can make it difficult to understand how decisions are being made, particularly when algorithms are used to make decisions without human intervention.
Responsibility: The use of data analytics raises questions about who is responsible for decisions made based on the analysis and what accountability mechanisms are in place to ensure that decisions are fair and just.
Informed Consent: The use of personal data in data analytics raises concerns about informed consent, particularly when individuals are not aware of how their data is being used or when their data is used without their consent.

To address these ethical issues, it is important for organizations to establish ethical guidelines for the collection, analysis, and use of data, to ensure that data is collected and analysed ethically and transparently, and to implement mechanisms for accountability and responsibility. Additionally, it is important to prioritize the privacy of personal data, to ensure that decisions are fair and just, and to obtain informed consent from individuals before collecting and using their data.

Data Security / Data Protection
Data security, also known as data protection, refers to the measures and practices put in place to protect data from unauthorized access, use, disclosure, modification, or destruction. Data security is essential to maintaining the confidentiality, integrity, and availability of data, and to prevent data breaches and cyber-attacks.

Here are some of the key aspects of data security:

Access control: Access control is the process of controlling who has access to data, ensuring that only authorized individuals can access and modify sensitive data.
Encryption: Encryption involves the use of mathematical algorithms to convert data into a format that is unreadable without a decryption key, providing an additional layer of protection against unauthorized access.
Backup and recovery: Regular backups of data are essential to ensure that data can be recovered in case of a data loss event such as a cyber-attack, natural disaster, or hardware failure.
Monitoring and auditing: Monitoring and auditing involve the use of software tools and processes to track and log access to data, identify suspicious activity, and prevent data breaches.
Employee training: Employee training is essential to ensure that all individuals who have access to sensitive data are aware of data security policies and procedures, and can take appropriate measures to protect data.
Compliance with regulations: Data security regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA) require organizations to implement specific data security measures to protect sensitive data.

Overall, data security is a critical aspect of data management, and organizations must implement appropriate measures to ensure that data is protected from unauthorized access and use, and to prevent data breaches and cyber-attacks.

Performance (Limitations within analytic tools)
Analytic tools are software applications used to analyse data and gain insights into business operations, customer behaviour, and market trends. While analytic tools can be powerful and provide valuable insights, they also have some limitations that can impact performance. Here are some of the limitations within analytic tools:

Scalability: Some analytic tools may not be scalable enough to handle large volumes of data, which can lead to slower performance or system crashes.
Integration: Analytic tools may not be able to integrate seamlessly with other systems, which can make it challenging to combine data from different sources.
Data Quality: The accuracy and completeness of data can impact the performance of analytic tools. Poor data quality can lead to inaccurate results, which can impact decision-making.
Data Formats: Analytic tools may not be compatible with all data formats, making it difficult to process and analyse certain types of data.
Complexity: Some analytic tools may be complex to use, requiring extensive training or specialized knowledge to effectively use them.
Cost: Analytic tools can be expensive to purchase and maintain, particularly for small or medium-sized businesses.

To address these limitations, organizations can consider:

Investing in scalable analytic tools that can handle large volumes of data and integrate with other systems.
Ensuring data quality implementing data quality checks and data governance practices.
Using a data warehouse or data lake to store and manage data in a standardized format.
Training employees on how to use analytic tools effectively and efficiently.
Researching and comparing different analytic tools to find one that fits the organization’s needs and budget.

By understanding the limitations of analytic tools and implementing best practices, organizations can maximize the performance and value of these tools in their data analytics efforts.